Prediction of potential genes in microbial genomes Time: Fri May 13 01:39:31 2011 Seq name: gi|316925039|gb|ADCP01000001.1| Bilophila wadsworthia 3_1_6 cont1.1, whole genome shotgun sequence Length of sequence - 10265 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 3045 3102 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains - Prom 3116 - 3175 4.6 + Prom 3187 - 3246 4.3 2 2 Op 1 2/0.000 + CDS 3305 - 4540 1216 ## COG0477 Permeases of the major facilitator superfamily 3 2 Op 2 . + CDS 4557 - 6203 2313 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold + Term 6220 - 6273 8.0 4 3 Tu 1 . + CDS 6317 - 7687 1542 ## COG0786 Na+/glutamate symporter + Term 7737 - 7762 -0.5 5 4 Tu 1 . + CDS 7819 - 9237 2031 ## LI0461 hypothetical protein + Term 9345 - 9388 2.8 Predicted protein(s) >gi|316925039|gb|ADCP01000001.1| GENE 1 3 - 3045 3102 1014 aa, chain - ## HITS:1 COG:STM2859 KEGG:ns NR:ns ## COG: STM2859 COG3604 # Protein_GI_number: 16766165 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Salmonella typhimurium LT2 # 688 997 380 686 692 255 44.0 3e-67 MQESTSLLILWLLSSLKQERSTLCLIQMLECPAAQAEIALHRVTADGLAEQRDETWNLTQ AGADHLREALPALLFRDTGSPLEELISACESKMPETACEICTKRMRVLRRRADTAGSAGY LELLLNQLAQWQNRRLTAPEARSFVHYAFAAHDASLYYMKNAARSSAFMKKAAEYAASAG DKRTLVLIRFAQASATFFTGTIHFERAHETFHSAMELLKTLDDKELYTRITPFLILMHFI RGNFQQALECYEMLQKSTHKPVLPLFESQLVLQAASSAAYSGHFAHALGMIKSAISTAEL EGNIMSAQIFRQHLGVLYAYMRREDDALETLQTVVSRSTMAANPKLMLRAFAAIALCYSR MGRAEMSYNLLSDTLRQAERHPSFSLSYNYLWILELAVFYAEHGFPPLPGIDLDTLFREA EISPSPMMRGHALRLKAILLRKKSPQDALDLLDKSLEILKDANMPIEYGFTERRRSELLN TLGREAEAEAAEETSLMLLSRIGLGRKRSPWPSPEQKEASFDVCCREVGSIHEWETLEEY CYQLAACLRKVFRAERTALFSMDGETLLCRGSCNLSGSEIASGMFVPSRDLIRERIREEA 
PFSAQQNGLDLLCLPFRLPDGNPWLLYADSSAHSSRMLHCAQEDLSRVGLLCASELKNVL RLLEARNMRAEVRQIHAVTKMAQEERLELWGKSLSFRMCLERAKVVSATDAAVLLLGETG VGKEVMANYVHRHSGCKGPFVAVHPASVSEHLFENEFFGHEKGAFTGATGQKIGFFELAD NGTLFIDEVGDIPMNMQIKLLRVLQEHRFMRVGGTKEIHSSFRLIAATNRDLPRAVREGT FREDLYYRISVIPVTIPPLRERLEDLEELIMAFTDHFSRRYGKKLPYPDEETLRRLSEYQ WPGNIRELRNAVERAVILYTGGPFDLAVGGQHEARDHCQRAETSLYADTPTLLELQKRYI QYILNKTGGRISGENGAEKLLGMKRSTLYLKLRQYGIKPGERERERERERERER >gi|316925039|gb|ADCP01000001.1| GENE 2 3305 - 4540 1216 411 aa, chain + ## HITS:1 COG:AF0367 KEGG:ns NR:ns ## COG: AF0367 COG0477 # Protein_GI_number: 11497979 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Archaeoglobus fulgidus # 5 392 5 391 397 81 23.0 3e-15 MFYGWKLATIGSCGNLLLQGSVFYTMNAFIEPLVELHGWSRAGIGLSMTFASAMMTLSMP IGIMLTNKISIRLLMTLGAFIGGLSYVGLGYVDDIRLFAALFAIVWVCGQLCGGVVATML MNRWFAAYRGRAFGLVNMGTSLSGAFLPFVALVLVDTLGVSWAFSILGGLAFLLFPLCWR IIRNTPEDIGLTVDGVAANAFAAARKEEKVIPMNWRELVASRQMWIIGVSFGIGLMAVGG VLSQLKPRFVDTGMSSYVAMGFMCLTALLGAVGKYAWGWVCDRTTPLFATKLLFLCNALS LACIFLPHTLFNVILFVVGYGVCMGGIWTVFPSVVAYLYGKKQFPHVYKYISLFVAVKSL GYAAVGLSHSLTGNYNMAYMGMVILLLAAFAATSTIREHEAAEGNVFSTCN >gi|316925039|gb|ADCP01000001.1| GENE 3 4557 - 6203 2313 548 aa, chain + ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 7 544 18 548 553 367 39.0 1e-101 MASATHVYRNGTILTMDSGGSQAQALAVRGETILAVGSDAEIMALADPHTVVTDLRGRTM LPGFIDGHSHFVSAGLMAATQLDLSSPPVGGVKNIAEIKGLIRAKAAETPKGEWILGFGY DDTGLEDKRHPLASDIDEVAPEHPVLLRHVSGHLSACNGLALAKANYTKDTPDPVGGVIR RDEHGNPNGVLEEPPAREPVFRHIPAPTEADWMEGIKAACAAYTAKGVTTAQDGFTATGD WGALKRAHELGLLRNRVQILPGVSRMDINTFNTHVSGTQLTADGKISLGAAKLLADGSLQ 
CYTGYLSNPYHKVIYDLPDGPMWRGYPMEPEQQFIEKVVGLHRQGWQLAIHGNGDDAIQM ILNAYEEAQKRYPRADARHIVIHCQTVREDQLDRIKRLGVVPSFFVVHTYFWGDRHYEMF LGQDRAERISPLRSALKRGIPFSNHNDTFVTPIDPLLSVWSAVNRRTSGGRILGENQTIP VMDALRSVTSWAAYQACEERIKGSLEPGKLADFVILEANPLTVAKETIKDIGISATIVGD EVVYGSLD >gi|316925039|gb|ADCP01000001.1| GENE 4 6317 - 7687 1542 456 aa, chain + ## HITS:1 COG:Cgl2722 KEGG:ns NR:ns ## COG: Cgl2722 COG0786 # Protein_GI_number: 19553972 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Corynebacterium glutamicum # 1 430 1 413 449 110 27.0 6e-24 MESTFYPYLGALGWTGLFLLIGTIIRAKVKFFQTFLFPASLIGGIIGFFVLNAGWLGIPS STGWKTITPVTFSTITFHLFAFGFVGIGLLQAKSGTSGKVVARGALWIALIFGLLFSVQA MVGKGTFDVWKLLFGGDFFTGNGYLLGAGFTQGPGQTQAYASIWETTYKISNSLNVGLAF AAVGFLVAGLVGVPLAFYGIKKGWVSIEGGKLPQCFLRGLMDKGDNPTCARSTTHPANID SVAFHLAIMATLYALAYAFGVWWLCTMPKGINGLGIGMIFAWGMFFAMIARKLMAKFDLI HLLDGETTRRLTGATVDFMICAVFMGIQVRQLQEVAMPFLIAVVLGTIATLFICLWFGRR SPEHGFERGLTLFGYCTGTAASGLLLLRIVDPEFDTPVAVEVGLMNVFATILFKPISWSM PFVPVEGFPMLWIFVAVTVVTPIVMYFLKMIRKPAF >gi|316925039|gb|ADCP01000001.1| GENE 5 7819 - 9237 2031 472 aa, chain + ## HITS:1 COG:no KEGG:LI0461 NR:ns ## KEGG: LI0461 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 2 472 3 498 498 364 41.0 5e-99 MKRIVTLILAAGLILGATSAAQAVDFKVSGLWQHRVSFADRNFEKHNGDDKLRAASRLRT QIDVIASESLKGVMFFEIGHQNWGKSAQGAALGTDGKEVKVRYSYVDWVVPQTDVKVRMG LQKYTLPNFTGIGSPILDADGAGITISNQFTENVGTNLFWLRAANDNDPEMTKHDAHDAL DFIGLTVPLTFDGVKVTPWGMGGIIGQDSFKGSGPDMTGIAMPGMLPLGGDAAIAASSDK DHGSIWYGGVAADVTYFEPFRFALDAAYGSVDMGTSKYKDKNFDVKRSGWYAAFLAEYKL ESCTPGLLFWYSSGDDANPYNGSERLPSIDPDVYVTSYGFDGTNYGGAAQTLGYGISGTW AVMARVKDISFMEDLSHVLRVVYYQGTNNKEMVRSKMISNPQDSVASMMYLTTGDKAVEV NFDTEYKLYKNLSLFVELGYIRLDLDKDLWKGVGYEAKENNFKGTFSIGYKF Prediction of potential genes in microbial genomes Time: Fri May 13 01:39:41 2011 Seq name: gi|316925036|gb|ADCP01000002.1| Bilophila wadsworthia 3_1_6 cont1.2, whole genome shotgun sequence Length of sequence - 1637 bp 
Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 129 - 188 2.8 1 1 Op 1 . + CDS 285 - 767 215 ## 2 1 Op 2 . + CDS 837 - 1635 634 ## Ddes_2337 hypothetical protein Predicted protein(s) >gi|316925036|gb|ADCP01000002.1| GENE 1 285 - 767 215 160 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCGSVAQGEHSLESCTWPEKEYDVPLFCAFAEEAPLDEGDIPLEIIQYDPTDYFIDGNGQ FTFLIGKGVLDVLLRHGREGVASVSGIDVFIDTSLPIAAPGDLKACLGLWVDRNKLFGFT KDNGWFLSYSLNGLPKGAVEMRHKEGAIMVMDKKSILSPA >gi|316925036|gb|ADCP01000002.1| GENE 2 837 - 1635 634 266 aa, chain + ## HITS:1 COG:no KEGG:Ddes_2337 NR:ns ## KEGG: Ddes_2337 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 128 1 125 1035 101 42.0 3e-20 MADLTLRHPANGANQVIPSEKFDHIAFDFPSDSVVLSKEGNDLLLSFEDGSRITLTDFYT TFSKDSIPDFIVDGTSVSGSEFFAALNEPDLMPAAGPAVAASNADGGRFHEYTDASLMDG VERLGGLDLGLNRAAEPDRELEAYGNRGVEEEETVVEEVIVPERPLFNDAPSGGSSAVTT DEGNIPGMGSQHETPATQPFGAATDGSFKMELHGADATVSIGGTELKVENGKLYHNGVEV TADTAVSVPGGAHGTLTVTGMDADGT Prediction of potential genes in microbial genomes Time: Fri May 13 01:40:06 2011 Seq name: gi|316925027|gb|ADCP01000003.1| Bilophila wadsworthia 3_1_6 cont1.3, whole genome shotgun sequence Length of sequence - 11933 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 6, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 170 - 3487 2746 ## COG2931 RTX toxins and related Ca2+-binding proteins 2 1 Op 2 1/0.000 + CDS 3534 - 4820 646 ## COG0438 Glycosyltransferase 3 1 Op 3 . + CDS 4795 - 5721 789 ## COG2199 FOG: GGDEF domain + Term 5770 - 5805 -0.4 4 2 Tu 1 . + CDS 5884 - 7278 1525 ## COG1538 Outer membrane protein + Term 7342 - 7390 5.5 + Prom 7629 - 7688 4.1 5 3 Tu 1 . 
+ CDS 7850 - 8269 365 ## gi|302863789|gb|EFL86720.1| putative glycosyltransferase + Term 8312 - 8351 4.3 - Term 8483 - 8510 0.1 6 4 Tu 1 . - CDS 8592 - 9920 1950 ## COG1301 Na+/H+-dicarboxylate symporters 7 5 Tu 1 . - CDS 10122 - 10508 135 ## - Prom 10674 - 10733 4.2 - Term 10862 - 10912 -0.6 8 6 Tu 1 . - CDS 11064 - 11639 521 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase - Prom 11800 - 11859 6.7 Predicted protein(s) >gi|316925027|gb|ADCP01000003.1| GENE 1 170 - 3487 2746 1105 aa, chain + ## HITS:1 COG:alr7304_2 KEGG:ns NR:ns ## COG: alr7304_2 COG2931 # Protein_GI_number: 17233320 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: RTX toxins and related Ca2+-binding proteins # Organism: Nostoc sp. PCC 7120 # 815 1011 638 836 1125 91 38.0 9e-18 MSVPGGAHGTLTVTGMDADGTVHYTYTLTAPVDATGNASNRPGEGDAGRGEAVHADAFDV TITTTGGTATGQITVDALDDAPVLTVQGDRVEHAADSEGGSITDTFMVHFGADGPGDAAF TFDGHALEKNVEGNWQYTDPDGLYTITVVQTGTDANEFRYSYTLEYDSTKVKEGFGGDLK VVATDGDLDTATDTVHIAVTNTAPEAADNIYDIDKAVAGESIISASASAVLGDSIITVGG TVGDRIHTGWMTDKQDDAFSVLEDGVAGDLFGKFFSDLGGNNLTLDTLKDAAHIYTLKIS SGTSADQVQAAVKYASEHNLLLYIEGDLNSSLLGNTPLNCVTIVNGNLSINSEGFGANSF LYVTGDVHAGEDFTVSGGLAVGGDLTGSASIEVEHTADVFTPDNVVISSTVPSGEVPSTS ITITFEDLLHNDMDRDDASVSKDGLHITEITIGGKTYTSHDASTDISYNETTKISIDWQK GTISVTNTGKNSESIQFGYGVEDRHGATDSADITVNVTATTGAGSIGDDLLQGATTTENV AMSYNISFVLDKSGSMGSSYSTAKEAVANYIEKLWDDIQNTDAIINIQVVKFSSSVGWGD NNTFTLDKSTTYKELQAFLSAHVTNNDKASGNTNYEDALLKAESWFNSQEENGFANRLYF ISDGEPNRPYGKPVERAEAVYDRIVGDSAHPVDVHAIGILGNGANDLDVLNKFDNTDGAD QIRNAGELYDAIASSTVTKPVSDTIFANKGDDVVFGDTAQFSVDGATVSLAEYVKAQLGF NPSTADVIDYVREHPEEIGSALVPNANEGKPDMPDALIGGEGNDVMYGQGGNDLLIGDGS NTSGADDTLHRLAQELGTLTGGSHGVTPASLSDAILNLGHDSAKLHELADWSEKHLENSS DGDDWLFGGEGNDVLFGLGGNDHLYGGSGDDVLFGGSGNDHLYGGSGNDILFGGSGDDYL DGGEGRDILFGGSGNDIIKYDSTDFLVDGGDGIDFLITDNKDLTLDDLLRNTDPNNGPIV QNVEVLISGDHALSLTDTAGLKQYGIELGLDGDKETLTLTDAWTQQGDAFVNADAGLTIQ VHGLTPETVTDDQAMLHKFILENAQ >gi|316925027|gb|ADCP01000003.1| GENE 2 
3534 - 4820 646 428 aa, chain + ## HITS:1 COG:sll1466 KEGG:ns NR:ns ## COG: sll1466 COG0438 # Protein_GI_number: 16330066 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Synechocystis # 6 402 1 405 413 130 26.0 7e-30 MPEGPMRILFLNYTFPGPLGTMAASLAALPGNEVLFASEYGRRDFSLPGVKRVLLKKPKD RKKAVASISTPALDAGERDWTMAYLRGRATASSLMGVLAEGFEPDMVILSGSMGNGLFVR NLFPEAFLVGYADWYFRDEAETRYPLSTMPDIPLAPRNIRNTLQAKNFFDCNCHFTTTEW QREQYPEGMRGFISVLRKCVDTEFFAPAPASRFSFGDCELTERQEIVTLSVRDAARLREG GFWCELPSLLSARPDCHVVLISTGKEVRLDSLRESMAAFPSGMRGRIHLLDFLPPGAYRD LLCVSSLHLCKDISFMLPSELLEAMSCGCVVLAPDTGAVREVLSDGQNGFLYPSGRRMNW VMLITLLLERRSELLSIRRNARKTVFGKHNKNTLLPRHIAFLMDRYSHWRAQQIQLAEGN DAFESTSA >gi|316925027|gb|ADCP01000003.1| GENE 3 4795 - 5721 789 308 aa, chain + ## HITS:1 COG:aq_563_2 KEGG:ns NR:ns ## COG: aq_563_2 COG2199 # Protein_GI_number: 15606018 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 118 302 33 211 218 134 41.0 2e-31 MHSNLPLPKLLEIISRIVNDSGCTVRIVDEIPVQDREIRELCLQIYELRRAAQSIAHGEL GVHFENCGCMATYLKDIMDRLKAVERGIKELAEGDTPELPRLGALGTYFELIGNTLQENR RLIEKYRNLSLTDMLTGLPNRRGFLALAEKSFARASRKRQSISFIMADIDHFKKVNDRHG HEAGDEVLRVVAQRFLDCLRVEDICCRYGGEEFLVMLYDTDVIRAALVAERLRKSIAGKP ISTRGEEIRVAISLGVTEVVAADYAGTRKCHAEIRRAVQVSDAFLYKAKKAGRNRVAANL PEEDEGVA >gi|316925027|gb|ADCP01000003.1| GENE 4 5884 - 7278 1525 464 aa, chain + ## HITS:1 COG:VC1621 KEGG:ns NR:ns ## COG: VC1621 COG1538 # Protein_GI_number: 15641628 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Vibrio cholerae # 24 449 13 434 445 154 26.0 3e-37 MYGRAFAMILMTAVLCLGGMTNAPAAPTSNAAVTSAPVRESAVTVRETVAETIAHHRGLK VIQENLDVTRYELRRAKAGWGPSVDATGRYGASRLSDDTTRSYGSDKGMYAASGVGITLT QPLWDGFATRSRVRTGEATVESMEYRVFDNATSFGLDGLIAHVDYLRRREILRLAQENVA RHKEILASQRERLNLGASTTADLTQTEGRLSRAMSTLTDAEASLREAEASYIRLTGKPVP 
PSLAEVYVPEGMFADPDAVMKAAEEGNPKLKAYLADIRAARGEKELAQSAYHPKINLEVG PNYSDRGGRGSNWTSSFDVMATMRWNLFNSGADKAETEAAESRIRQARQTAYNFFDDLNL EIADTWTRYMAAIEQRKYYQEAVVYNTQTRDAYLAQFNLGQRSLLDVLDAESELFNSATQ EATARGNEIVGAYRLYALAGQFLPVLNIDDKSLYVSQRESEPSK >gi|316925027|gb|ADCP01000003.1| GENE 5 7850 - 8269 365 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302863789|gb|EFL86720.1| ## NR: gi|302863789|gb|EFL86720.1| putative glycosyltransferase [Desulfovibrio sp. 3_1_syn3] # 8 138 13 143 144 116 59.0 4e-25 MSETPFPVSAPPPDAASLAALLQSAAQALLRSGKPSDEEHPAPAKVERIQEEIRHYRFPP LAVRCEKELLEAEDAEEMRELLFRAFSSLETFADLLELMNPEAKVEGVVMHTLARELRRI GRLVFKIQDAYSTVELVEA >gi|316925027|gb|ADCP01000003.1| GENE 6 8592 - 9920 1950 442 aa, chain - ## HITS:1 COG:PA1183 KEGG:ns NR:ns ## COG: PA1183 COG1301 # Protein_GI_number: 15596380 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Pseudomonas aeruginosa # 5 417 3 415 436 487 64.0 1e-137 MEAPKKPLYKSLYFQVICAILLGIALGHFYPQTAQAMKPLGDGFIKLIKMIIAPIIFCTV VLGVAGMEDMKKLGRIGGKALIYFEAVTTLALIIGLVVVNTLQPGGGMHVDPASLDTTAI AKYTASAQHNGFIDFLMNIIPGTIVGAFANGDILQVLLISLLCGAAFSALGKRVAGIVGL IRQASSALFGIINIIMKLAPLGAFGAMAFTVGKYGLGTLGSLGYLMGSFYLTCLLFVFVV LNIIARLTGFSLVKLLRYIAHELWLVLGTSSSESALPTLMAKLEHMGARKDIVGIVVPSG YSFNLDGTSIYLTMAAIFIAQALGIELSLKEELTLLAVLLLASKGAAGVTGSGFVTLAAT LATIPTIPVAGIALILGVDRFMSEARALTNLVGNSVATIVVAKWEGALDMQRLHAELDQP TVIEEEPFLPDDGLLQPVPADK >gi|316925027|gb|ADCP01000003.1| GENE 7 10122 - 10508 135 128 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRRLFLPLHLFILFATLLAAGCWCPRADAHEPEKGAFRHAPTQYETVIQDVYQPLLIQT EVRVPHVRGHELSQKDNDAPVHDAGQSAAPPMLPPLLSAGLPAVVPLPLRPSWRGLVVFP LPPPAMIG >gi|316925027|gb|ADCP01000003.1| GENE 8 11064 - 11639 521 191 aa, chain - ## HITS:1 COG:CAC1383 KEGG:ns NR:ns ## COG: CAC1383 COG2087 # Protein_GI_number: 15894662 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 6 190 4 185 185 107 
35.0 1e-23 MDKPDLTLYLGGTRSGKSARAEARVFQRADGPVLYVATAEARPDDPSMTERIRRHRARRP ENWRTLECPLQLGEHIGTALAEFRNTAGTPTVLLDCITLWVSNILFSLPDPEDLSSFEGA VRLETEALLDIMRSSGCQWVVVSGETGLGGIEPTRLGRNYCDGLGLANQLIAAQAREAFL VVAGRLLKLEE Prediction of potential genes in microbial genomes Time: Fri May 13 01:40:27 2011 Seq name: gi|316925021|gb|ADCP01000004.1| Bilophila wadsworthia 3_1_6 cont1.4, whole genome shotgun sequence Length of sequence - 4504 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 27 - 926 731 ## Amuc_1679 xylose isomerase domain protein TIM barrel 2 1 Op 2 11/0.000 - CDS 950 - 1696 825 ## COG0368 Cobalamin-5-phosphate synthase 3 1 Op 3 . - CDS 1696 - 2727 1175 ## COG2038 NaMN:DMB phosphoribosyltransferase 4 1 Op 4 2/0.000 - CDS 2754 - 3743 1165 ## COG1270 Cobalamin biosynthesis protein CobD/CbiB 5 1 Op 5 . - CDS 3740 - 4504 892 ## COG1492 Cobyric acid synthase Predicted protein(s) >gi|316925021|gb|ADCP01000004.1| GENE 1 27 - 926 731 299 aa, chain - ## HITS:1 COG:no KEGG:Amuc_1679 NR:ns ## KEGG: Amuc_1679 # Name: not_defined # Def: xylose isomerase domain protein TIM barrel # Organism: A.muciniphila # Pathway: not_defined # 6 295 9 267 271 209 40.0 1e-52 MRKYVIPNIRLGATSFLLHETYVPAVRFAAERCDDVALLLVEPGERNEYLATPGEIAELG RIIAGERATLHVHLPTDADFDTWEGARSMIGKIRRVAELTGPLDPHSFVLHVDFPSLHGT GGEPSAEQQEWTAEALREIAACLPAPERLAVENLETYAPSFWDRWLAGTPYSRCLDVGHI WKDGGDPAPVLAAWLPRVRVIHLHGLEPRGSEADAAKPQGQQKAPEKLAERNLSRLFGPR PRDHKSLRLMPPEFVDDVMHPLWKTGFSGVLNLEVFSVEDFTASHAVLMQSWERYAASS >gi|316925021|gb|ADCP01000004.1| GENE 2 950 - 1696 825 248 aa, chain - ## HITS:1 COG:cobS KEGG:ns NR:ns ## COG: cobS COG0368 # Protein_GI_number: 16129933 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin-5-phosphate synthase # Organism: Escherichia coli K12 # 4 244 7 244 247 95 33.0 1e-19 MLRAALGFFTRLPIGSAPLPPTFRGVAVWLPVVGLIIGALVSLALGLAAQLLPASLCGVI 
GCLVWVAVTGGLHLDGVADCGDGLIVEAPPARRLEIMKDSRLGTFGGAALFFILALKAAA LGALAGTFQADGSGPDAFAPLLLSCCLAGTLARCMVFAAMRVPSARPGGLGEALHEGVTS RHELIALAIGLGICAVNGVRGLYALAIALLAAYLLLSAAKKRLGGVTGDVFGCLVELTEC VVLIVCCV >gi|316925021|gb|ADCP01000004.1| GENE 3 1696 - 2727 1175 343 aa, chain - ## HITS:1 COG:CAC1372 KEGG:ns NR:ns ## COG: CAC1372 COG2038 # Protein_GI_number: 15894651 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 18 338 22 341 352 274 45.0 2e-73 MNTLTIPAFSAEAAEAAQRRLDNLAKVPRSLGRIEELAVRLAGITGKECPAFPEKSVVLF AADHDIALQGVSATGQEVTEMQVRNFVKGGGTINAFCRNAGARLSVVDVGVKNDLDDVEG LVRRKVMHAARDFSEGPAMTREEALACVQVGIDMAREEAARGVTLLAAGEMGIGNTSPSS AIAAVLTGASVDEVTGIGSGIPSERVRHKAGLIRRGIAINRPNPSDAVDVLAKVGGPELA AMSGLMLGGASLRIPVVVDGFIAGAAAAIAIGIRPGVRDMLIGSHSSVEPGHQILMDYLG IPTYFDFGLRLGEGTGAALLYPIIDAAVRILTEMNTLQGIGMK >gi|316925021|gb|ADCP01000004.1| GENE 4 2754 - 3743 1165 329 aa, chain - ## HITS:1 COG:STM2034 KEGG:ns NR:ns ## COG: STM2034 COG1270 # Protein_GI_number: 16765364 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobD/CbiB # Organism: Salmonella typhimurium LT2 # 12 310 9 298 319 157 36.0 3e-38 MNLAFLALLPAALLIDRLFGELPSRVHPVCMMGALASRIETLFRRGSNNALMSLSGTLAC LLVVLPCAALAGGLAWAAQVYADPRAAWFVSAIIIWICVAPRSLDEHALRVAVPLAKGDL EGARKAVSMMVGRNPDRLDAHGVARACVESVGENLTDGVLSTLFWAGIGLFFFGYPGAAC LAVLHRSANVLDALWGKKNEKYIRFGTFAARLDDALNFVPARLSLPCIAFASRIIPNLRH NDILPVGWKYRAAHESPNSAWSEAAFAAALGLKLGGPAVYGDLCVDHPWLGDGTPDAKPA HIVLAVRLMWHSVIIFTLFEVFLIGMLSL >gi|316925021|gb|ADCP01000004.1| GENE 5 3740 - 4504 892 254 aa, chain - ## HITS:1 COG:CAC1374 KEGG:ns NR:ns ## COG: CAC1374 COG1492 # Protein_GI_number: 15894653 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Clostridium acetobutylicum # 5 254 242 488 491 177 37.0 2e-44 DCGEKKAGSLDIAVVMLRHVSNYTDFAPLAAEPDVRLRPVRRAEEWGDPDVVMLPGSKSV VPDLDDLRRSGLADNILGHAERGKWIFGICGGLQILGRAILDPHGIESAAPEVPGLGLMD 
LRSTFAADKTLVRVARAETPLGVPSGGYEIHHGLTDHGPSALPLFLRADRAYPSEAERIC GYVSGRRWATYLHGVFDDDAFRRAWLDHVRADIGLAPQGRQLAAYDLEKALDRLADIVRE HSDMETIYQSMGLK Prediction of potential genes in microbial genomes Time: Fri May 13 01:40:36 2011 Seq name: gi|316925018|gb|ADCP01000005.1| Bilophila wadsworthia 3_1_6 cont1.5, whole genome shotgun sequence Length of sequence - 4392 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 2376 2081 ## COG1492 Cobyric acid synthase - Prom 2562 - 2621 1.9 2 1 Op 2 . - CDS 2730 - 4391 1163 ## COG1797 Cobyrinic acid a,c-diamide synthase Predicted protein(s) >gi|316925018|gb|ADCP01000005.1| GENE 1 3 - 2376 2081 791 aa, chain - ## HITS:1 COG:lin1171 KEGG:ns NR:ns ## COG: lin1171 COG1492 # Protein_GI_number: 16800240 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Listeria innocua # 441 789 5 350 511 341 48.0 3e-93 MKPEAHGGDLLRMAATAGRDPASLLDFSVNVRPEGPPEFIRAALFRAMTALAAYPSPHAE EAMLAAARHHGMDASRFVFGSGSNELIHALARVLRKRGVPSVRVVEPAFSEYAIACRLAG IKAIPVWGGIIEKNQCVPTTDTGKDEAVPTRDLLDALTDAPEGSAVFLANPGNPSGLFRT PEECLRLMSSRSDLLWIIDEAFVEYAGTETEASVLQRLPKNGIVLRSLTKFHAVPGVRLG YLAADAELAQAIRDELPAWSVNAFALAAAQAVFADTSDFAAQTRAENAERRADLAAALSS LPGIEVYPSAANYVLFRWPGAPRNLLGILLKRFGIAVRDCSNYHGLKDGSWFRAAVRFPE DHRRLAEALSAIRETTHGVSSSPLPETPASPESGNKDSINIKVLGRGGMGAWGKGGESPS PEGFLLPSPGISRRPRHTPALMLQGTSSNAGKSILAAAYCRIFRQDGYSVAPFKAQNMSL NSGVTAAGDEMGRAQIVQAQAALVDPDARMNPILLKPHSDTGSQVVVLGQPIGHMGVLDY FKKKKELWETVTEAYDSLAADHDVMVLEGAGSPGEINLKEHDVVNMRMAEHARASVLLVG DIDRGGVYASFLGTWMTFTDAERRLLTGYIVNRFRGDASLLGPAHEYMLDHTGIPVLGTI PYIRDLNIPEEDMAGFSWGHTDCGEKKAGSLDIAVVMLRHVSNYTDFAPLAAEPDVRLRP VRRAEEWGDPDVVMLPGSKSVVPDLDDLRRSGLADNILGHAERGKWIFGICGGLQILGRA ILDPHGIESAA >gi|316925018|gb|ADCP01000005.1| GENE 2 2730 - 4391 1163 553 aa, chain - ## HITS:1 COG:sll1501 KEGG:ns NR:ns ## COG: sll1501 COG1797 # Protein_GI_number: 16329614 # 
Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Synechocystis # 6 347 105 439 482 231 39.0 4e-60 DPAGGTADCARALGLPIVLVFNGRGMAGSVAALVAGFQLHAVRMGVRLVGAIANNVGSPR HADILRQALERANLPPLLGALPRREEWRLPERQLGLLPSEEAGTTAAWLDALAEMAEQHL DIDRLLALTTSKRPEAPAPLPSENVRPRRMGIAKDKAFCFYYEENERVLRSQGWEPVPFS PLADTALPIGIEALYLGGGYPEVFARELSRNAAMRENIRDFAARGGEIYAECGGYMYLCT TLEASEEAGGTRDDRRIWPMCGVIDATARMGGRIRSLGYREASMLSGAPFGLRHETFRGH EFHWSDIELHRDYPPLYEVRTRSGVEKAGVAFGNVRAGYVHLYWGNEGHAEEGQAPKSGK TSSEGFSSPRPTSPSGQVILLNGPSSAGKTTLSQALQTRLHAVYGLCCMTLSIDQLLRAS TGGYGSVLAGLAQTGLPLIETFHASVAAAAKAGAWVIVDHVIGEDPGWIEDLLERLAGVS ILSVQVTCDAKELQRRESLRTDRTPDWKHAERQARSIHVPLPGQMTVDTTRTDPGTCAAC ILEALFMESNIPI Prediction of potential genes in microbial genomes Time: Fri May 13 01:40:40 2011 Seq name: gi|316925013|gb|ADCP01000006.1| Bilophila wadsworthia 3_1_6 cont1.6, whole genome shotgun sequence Length of sequence - 6170 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 597 486 ## COG1797 Cobyrinic acid a,c-diamide synthase 2 2 Op 1 5/0.000 - CDS 736 - 3339 2235 ## COG2875 Precorrin-4 methylase 3 2 Op 2 6/0.000 - CDS 3336 - 4829 1138 ## COG2242 Precorrin-6B methylase 2 4 2 Op 3 . 
- CDS 4835 - 6127 786 ## COG1903 Cobalamin biosynthesis protein CbiD Predicted protein(s) >gi|316925013|gb|ADCP01000006.1| GENE 1 3 - 597 486 198 aa, chain - ## HITS:1 COG:MA0106 KEGG:ns NR:ns ## COG: MA0106 COG1797 # Protein_GI_number: 20089005 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Methanosarcina acetivorans str.C2A # 11 198 26 208 458 156 45.0 2e-38 MQHTTFHAFCIAAPRSGEGKTTASIALMRALARRGLRVQGFKCGPDYIDPTFHAQATGLP ACNLDTWMMGRDGVRALWDSRAHDADAAVCEGVMGLFDSRDPGDPAGGTADCARALGLPI VLVFNGRGMAGSVAALVAGFQLHAVRMGVRLVGAIANNVGSPRHADILRQALERANLPPL LGALPRREEWRLPERQLG >gi|316925013|gb|ADCP01000006.1| GENE 2 736 - 3339 2235 867 aa, chain - ## HITS:1 COG:lin1160 KEGG:ns NR:ns ## COG: lin1160 COG2875 # Protein_GI_number: 16800229 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-4 methylase # Organism: Listeria innocua # 6 252 4 249 249 244 50.0 5e-64 MNTPLVEFVGAGPGAEDLITLRGLCALEQADLVVYAGSLVNPGHLKRCKPDCECRDSAAM NLGEQVIAMSEAALAGKRVVRLHTGDPAMYGAINEQIRALAEKGIASRIVPGVSSVFGAA AALGCELTCPDVSQSVVLTRTPGRTPMPKGENAAAFARTGATLVFFLSTGKIDDLMTALM GEGGLSPDTPAAVVYRATWPDERILRGTVSDIARKVEEAGFGRQALIFVGRALDAQGGAS RLYGADFSHGYRNHLANEAFDGRCALYAFTDKGVVRAKEIAAGLGLPAVIHSTRPTGAPD VVHTPGETFDATLSANWRQFDAHIFISATGIAVRKIAPLLRDKTSDPAVLSCSESGSHVV GLTSGHLGGANRLARRVARVTGGQAVISTATDVNGLPAFDEAAAQEHARILNTDAIRSLN AALLSGKPIAFCGPRTVFDRHFAGTDQVIYTDSPEAVTARHAVLWDAEGTVPEEVQHLDI TGKSFVLGVGCRRGVEPLELRLIAEGHLSDLGLRPEHIAAIASCDVKADEPAILELGEKW QVPVEFHAAARLDAVPVPTPSAKVREKVGTASVSEAACLLSAGYGSIPQPTLYAPKAAFG DVTLALARLPHLSAPKAAHGQVVVVGLGSGVPGQITPEVDAALRHCDTVAGYSNYVDFIR DRIIGKPIIQNGMMGEVARCRATLEAAAAGQEVCMVCSGDPGILAMAGLLFELRAREPEF ADIPIRVLPGITAANIAAASLGAPLQNGFSLVSLSDLLVPSDEVRQNLRAVAQSALPVTL YNPAGRKRRHLLTEALDIFREQRGGDVLCAYVRHAGRPQEAKWIGRLDDFPADEVDMSTL VIIGGPRTITDAGAMYEARGYVEKYME >gi|316925013|gb|ADCP01000006.1| GENE 3 3336 - 4829 1138 497 aa, chain - ## HITS:1 COG:PA2907_2 KEGG:ns NR:ns ## COG: PA2907_2 COG2242 # Protein_GI_number: 15598103 # Func_class: H 
Coenzyme transport and metabolism # Function: Precorrin-6B methylase 2 # Organism: Pseudomonas aeruginosa # 255 475 1 166 193 83 33.0 1e-15 MDGNGMDTEAGRIDVVSCGIGFPKDAETLGLIGRADVVYGSRALLAACPVEARLTRIIGA KAREDAADALTLCRVGRHVVVLASGDALYHGFGGTLSGMAHPDDTIVYHPGITAFQALFH RLGLPWQDARLFSAHSGEALPARAIAEAPLSVAYAGSRYPAHAISQAVLKLHPASAGRAA VIAERLGSPEERLFSGPLADLARTECGPTSILLLLPTPWHCVPHTVARIAHDARPDSGIH EGSPISHSIPAPILTLGFPEEDYERENNLITASDVRAVILSRLRLPAWGTLWDIGAGSGS VGLEAAGLRPNLNVVGIERKPERGAIIERNRARLGVPNYTLHIGDALELIHASSIDNLLS SPYGGTPAAQPPHQSEALGKRGEGEMRSLLQKGFLSPSPGGSAPSLPPPDRVFIGGGGRD LPELLAACMERMPPGGIMAASAVTLESFHTLSAWSPDRRTGLCSLNIAHEQPIAGTSHHL KQQNTIYLFTFQKEITS >gi|316925013|gb|ADCP01000006.1| GENE 4 4835 - 6127 786 430 aa, chain - ## HITS:1 COG:MJ0022 KEGG:ns NR:ns ## COG: MJ0022 COG1903 # Protein_GI_number: 15668193 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiD # Organism: Methanococcus jannaschii # 6 312 6 300 362 139 32.0 1e-32 MKQTEQKTLRWGYSTGACAAAVATAAWLRLTRPETSLPTIPMLFLDGRERILPLLEPDEG HMAAIRKDGGDDPDCTHGAILFGNMRRCSASDARKEDYTLSVGGGTVILRGAAGVGLCTR PGLDCEQGKWAINTGPRRMIAENLHHAGLDAGCWLLEIGVENGEELAKHTLNPLLGVVGG ISILGTTGLVRPYSHEAYIETVRICVKSHHIAHGTTMVFCTGGRTKSGAERRLPSLPETA FTCIGDFIAESLAAACEYGMREIVVACMAGKLCKYAAGFENTHAHKVSQDMDLLRAEVRK HLPGEEALHDALAHSVSVREALLSIPEADRPGILRRLARTALGQFARRCGENIVLRLLVF DFEGQFLFEEKRGEQKEPEKNGKNFSGQSDPSHASASSPEAPTARSGEHAELSEYNGTIG LTYFLDGKKD Prediction of potential genes in microbial genomes Time: Fri May 13 01:40:54 2011 Seq name: gi|316924996|gb|ADCP01000007.1| Bilophila wadsworthia 3_1_6 cont1.7, whole genome shotgun sequence Length of sequence - 20885 bp Number of predicted genes - 17, with homology - 15 Number of transcription units - 10, operones - 5 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 224 - 838 517 ## COG2082 Precorrin isomerase 2 2 Tu 1 . 
- CDS 1114 - 2196 1066 ## COG3366 Uncharacterized protein conserved in archaea 3 3 Op 1 2/0.000 - CDS 2361 - 3062 697 ## COG2243 Precorrin-2 methylase - Prom 3258 - 3317 3.8 - Term 3314 - 3382 11.8 4 3 Op 2 . - CDS 3406 - 4278 1135 ## COG4822 Cobalamin biosynthesis protein CbiK, Co2+ chelatase 5 4 Tu 1 . + CDS 4551 - 5039 -391 ## + Term 5062 - 5098 4.5 6 5 Tu 1 . - CDS 5225 - 6268 1175 ## COG0407 Uroporphyrinogen-III decarboxylase - Term 6429 - 6466 1.0 7 6 Op 1 . - CDS 6521 - 7039 522 ## COG5598 Trimethylamine:corrinoid methyltransferase 8 6 Op 2 . - CDS 7061 - 8041 1071 ## COG5598 Trimethylamine:corrinoid methyltransferase 9 7 Op 1 . - CDS 8289 - 10091 2206 ## COG3894 Uncharacterized metal-binding protein 10 7 Op 2 . - CDS 10088 - 10798 1018 ## Ccel_3200 hypothetical protein 11 8 Op 1 . - CDS 11078 - 11725 941 ## COG5012 Predicted cobalamin binding protein 12 8 Op 2 1/0.000 - CDS 11753 - 12637 993 ## COG0523 Putative GTPases (G3E family) - Term 12773 - 12831 -0.3 13 8 Op 3 . - CDS 12841 - 14187 2031 ## COG0407 Uroporphyrinogen-III decarboxylase + Prom 14470 - 14529 12.8 14 9 Tu 1 . + CDS 14768 - 16069 1281 ## COG2199 FOG: GGDEF domain + Term 16188 - 16231 11.7 - Term 16180 - 16215 7.4 15 10 Op 1 . - CDS 16383 - 19532 1939 ## Desal_0435 hypothetical protein 16 10 Op 2 . - CDS 19529 - 20401 244 ## gi|158512121|gb|ABW69086.1| EvpN 17 10 Op 3 . 
- CDS 20405 - 20875 409 ## Predicted protein(s) >gi|316924996|gb|ADCP01000007.1| GENE 1 224 - 838 517 204 aa, chain - ## HITS:1 COG:FN0970 KEGG:ns NR:ns ## COG: FN0970 COG2082 # Protein_GI_number: 19704305 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin isomerase # Organism: Fusobacterium nucleatum # 2 201 12 208 219 145 40.0 4e-35 MEIERESFRVIESECDLHKRMPAPEWRVARRLIHTTADMSIADTLVFRHDAIGSGLCALR AGAPIFCDSKMIRAGLSIERLKRLNPGYGPESLHCHISDQDVIDRAKLEGHTRALCSAEK ARPMLDGAIVLIGNAPLALARIARYALEEGVRPALVVGMPVGFVNVVESKELLAQCDVPQ IVLEGRRGGSALAVTTLHAVMESA >gi|316924996|gb|ADCP01000007.1| GENE 2 1114 - 2196 1066 360 aa, chain - ## HITS:1 COG:MA1324 KEGG:ns NR:ns ## COG: MA1324 COG3366 # Protein_GI_number: 20090185 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in archaea # Organism: Methanosarcina acetivorans str.C2A # 63 352 22 307 311 67 24.0 6e-11 MHRPAPASESMDADVGEDAVTSSGPTKMDTAAPRMKRHAPLTPAEKIFAKMKTFGRLFAI VGIAAFLGGLMEARRWHMALARVMGKLARMARLPEIVGLAMPTALCSNAAANSILVSSHA DGHIRTSALIAGGMANSYLAYVSHSIRVMYPVLGAIGLPGALYFAAQFSGGFMVILGVLL WNRWYVSGHGDTPVSGIPPVDEAPLGWPSAVGKAAVRSLTLLFRMVCITVPLMLCIEWLL KNGAFNFWEQYVPDQVNRFFPAELVSVVAAQMGGLVQSSAVAANLRAEGLIDNAQILLAM LVGSAVGNPFRTLRRNLPSALGIFPVPVAFAIVIGMQLSRFVVTLVAIAGVIAFMHYVLY >gi|316924996|gb|ADCP01000007.1| GENE 3 2361 - 3062 697 233 aa, chain - ## HITS:1 COG:MTH1348 KEGG:ns NR:ns ## COG: MTH1348 COG2243 # Protein_GI_number: 15679347 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-2 methylase # Organism: Methanothermobacter thermautotrophicus # 4 187 3 186 232 106 38.0 3e-23 MKPGKLYGVGIGPGDPRYLTLRAADVLRSVDVIFTVISQNASSSVSRSVVESLEPRGEIH LQIFSMSRDKAVRAAQVQANADAIITELKAGRDCAFATLGDAMTYSTFGYVLEIIREAIP GVELEIVPGITSFATLTAKAGTVLVENGEQLRVIPSFKSEMAETLDFPKGSTTILLKTYR SRKALVERLRREENVEILYGEHLAMEGQTLLTDLDAVADRPEEYLSLIMVKKQ >gi|316924996|gb|ADCP01000007.1| GENE 4 3406 - 4278 1135 290 aa, chain - ## HITS:1 COG:lin1165 KEGG:ns NR:ns ## COG: lin1165 COG4822 # Protein_GI_number: 16800234 # 
Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiK, Co2+ chelatase # Organism: Listeria innocua # 22 287 2 260 261 126 32.0 5e-29 MKKCFVMVLLCCLLVAGTAFAKTGTLLVAFGTSMDSARPAIDDIEKAYKKAAGNEPVLLA FTSDIIRNKLAKEGKPVLSVNAAMNELAAQGVTDLKIQSLHIAPAEEYNQLERMVVKNIT KNPGVFKTVKVGYPLLVSEKDLDAVVKVVLASLPKDRKPGDAVVLMGHGNDRGPGDLTLA ATAAAFHKADPHVWLATVEGSNSFDNVLPKLKASGAKRVWLQPFMIVAGDHANNDLAGPE EDSWASRIKAAGMTPMPNLKGLGQLKGIQDIFLSHTQNAVVDLANTKKVD >gi|316924996|gb|ADCP01000007.1| GENE 5 4551 - 5039 -391 162 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAEGWRAISRLFSGAPRRLHTVPPAYIRLFYPTCTAASRGLRALGISESDFFLKPFRNTS APCIWLPLKNLLRTAMIAPQKAWRRVPPDRASFPHPLAISYIRRAPLSSFSERHGKKISI GKWGGKGPSPGRSKAGCRLRGIMGDPESGAGRGKGGAGEISV >gi|316924996|gb|ADCP01000007.1| GENE 6 5225 - 6268 1175 347 aa, chain - ## HITS:1 COG:MA4558_2 KEGG:ns NR:ns ## COG: MA4558_2 COG0407 # Protein_GI_number: 20093342 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III decarboxylase # Organism: Methanosarcina acetivorans str.C2A # 2 342 6 358 369 135 27.0 2e-31 MTMTSRERFAAMLNHKAPDRMPVYPLVNSISRQSLGITYEEWTKDINKCAEAIIRTTDEL ELDCLCTLVDLSVEAADFGQELLYFKDKAACPDPAQRVVPTEGDYDKIPRIDPAKGKRMS EHVELCRKLVKARGKDKPVVAFVFGPLGIVSMLRGQQEMFMDLYTAPDEVKKGVEIVSDV LCDWIDQLCATGIDAVMFDTLYASRSIMSAEMWDEFEGVYMTRIAERVRSHGCAVMIHNC GQGAYFDEQYERMKPVLFSFAHPPQGCADMKEAVEKYADKMILMGTIDPGWFMTATPETL KARVEEELAVFGPSKRYVLSTGCEYPACLDFAFPRQMVDMAKAYTYK >gi|316924996|gb|ADCP01000007.1| GENE 7 6521 - 7039 522 172 aa, chain - ## HITS:1 COG:MA0528 KEGG:ns NR:ns ## COG: MA0528 COG5598 # Protein_GI_number: 20089417 # Func_class: H Coenzyme transport and metabolism # Function: Trimethylamine:corrinoid methyltransferase # Organism: Methanosarcina acetivorans str.C2A # 2 149 343 489 495 90 35.0 1e-18 MQAGAERALNRLMTALAGASVLFGQGMLETGLTFDIPTLLVDDEIIDYVLRMLAGFKVDA TTLSTDLIKEVGPFGTYLAEMNTFEHLGDLSTYNLMNRRNYDMWAASGKPDLYGQARERA KEILATHKQKTPLSPEQVKAIRDVLVDAEGELGVADFWKGKEEKRFIDNDLY >gi|316924996|gb|ADCP01000007.1| GENE 8 7061 
- 8041 1071 326 aa, chain - ## HITS:1 COG:MA0528 KEGG:ns NR:ns ## COG: MA0528 COG5598 # Protein_GI_number: 20089417 # Func_class: H Coenzyme transport and metabolism # Function: Trimethylamine:corrinoid methyltransferase # Organism: Methanosarcina acetivorans str.C2A # 14 325 15 332 495 149 29.0 9e-36 MKRWAHSSLMGGVGVGLNFLREKDCEKIHEASLEVLHDRGAYFDSETAREVLRDHGCWED ADGCTHFPRTLVESALEAVPAEFVHRGRTPDDDIHMAQDQVYASNFGEGIFTHDLETGER RSTVKQDAVDILRVVDSLDNIHIYNRAIGPQDVPSESASMHNAEVAFCYTSKPMHLVSGS PFQTKKMIKMAEIAAGGKEELKRRPRTAFNHTTISPLRISHEACENAMIVAEAGLPNHIL VMVQQGATSPISYAGSVAVHNADFLAFNTLLQCVNRGNPTLYGASACVMDMKKGLSLVAA PEVFVLNAAMARMSKYYNIPSYIAGG >gi|316924996|gb|ADCP01000007.1| GENE 9 8289 - 10091 2206 600 aa, chain - ## HITS:1 COG:AF0010 KEGG:ns NR:ns ## COG: AF0010 COG3894 # Protein_GI_number: 11497631 # Func_class: R General function prediction only # Function: Uncharacterized metal-binding protein # Organism: Archaeoglobus fulgidus # 58 591 62 592 597 245 30.0 2e-64 MNVSITVRHCGETTTLRAEAGTPLAPLLREAGLLTLPCGVGKCGKCLILAATEPCAEERA LLGDAALASGLRLACYTRAAEGLDIALPQAGALRVLTRFAQSDYPFRPIVERRPFAMPEP SLDDQRSDLQRLIDACGAKGHALGLGQLAALPAFIRNARSNARSNGGCGDGHGNVCGFGL MHGETLVGYTASDEAYALIVDIGTTTVAALLVDTARRRVVAARGEHNAQSPYGADVISRI RHETEWEEHRNGPNPLQQAIAKQVSAMLAGLLEQAGIGDVDFLSLTGNTTMMHLLCGLPG EHIGKAPFIPATLEPMRLPAADLGIASQAPAFLLPGISAYIGADIVASLLAADAHRSQPP FLLVDLGTNAETVLCASGTLYACSAAAGPCFEGATLSCGMAGQDGAIDTVSPDPERGLSF TTIGDAPARGLCGSGVLDALALLLDAGIVDETGRLEADASPLGARITDDALTFTDSVRFT QKDIREVQLAKAAVRAGIDILLREAGMETADVARLYLAGGFGSAMRPESAARIGLIPEEL SDRVTVLGNAAGSGALRYATEEGATESALGIIRRTRYIELSAHAGFTDAYVERMLFPERE >gi|316924996|gb|ADCP01000007.1| GENE 10 10088 - 10798 1018 236 aa, chain - ## HITS:1 COG:no KEGG:Ccel_3200 NR:ns ## KEGG: Ccel_3200 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 217 4 230 253 139 30.0 1e-31 MHLIACEVFRPELERLTRAMRNAPEVTYLEQGLHDTPDELRRRVQQAVDALEAKGETVIF LAYGLCGRGLTGVTGHTATLILPRVHDCIPVLLGATQEQANESSLGGGTYWLSPGWLRYS 
QTSFIQNREKRFKEYEERFGTDSAAYLIELEGSWLRNYTNACLILWEGWEDERELVRTAK AVADDAGLGYRELPGDPNFIQALLDGGKDGRFMRIAPGFTPDLDGNGTITAVRVTS >gi|316924996|gb|ADCP01000007.1| GENE 11 11078 - 11725 941 215 aa, chain - ## HITS:1 COG:mlr1231 KEGG:ns NR:ns ## COG: mlr1231 COG5012 # Protein_GI_number: 13471298 # Func_class: R General function prediction only # Function: Predicted cobalamin binding protein # Organism: Mesorhizobium loti # 4 206 22 224 238 163 43.0 3e-40 MSAILQDISEKLQKGKMKEVAELVKQAVADGIQPLTILNEGLMPGMDIIGERFKKNDIFI PEVLIAARAMAAGTDVLRPLLVGDQMEEKGTVVIGTIKDDLHDIGKNLVRMMMESKGLNV IDLGNSVPPAKYIEAAAKHNADIIACSALLTTTMVHIKDVVKGVAESPLAGKVKIMIGGA PVNDAFRESIGADYYTPDASSAAEKALEICAAARA >gi|316924996|gb|ADCP01000007.1| GENE 12 11753 - 12637 993 294 aa, chain - ## HITS:1 COG:BMEII0179 KEGG:ns NR:ns ## COG: BMEII0179 COG0523 # Protein_GI_number: 17988523 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Brucella melitensis # 2 212 4 238 400 90 31.0 3e-18 MRKELIILSGFLGSGKTTLLRSLLLRHSDRKIAVLLNDFGDIPVDGETLRRSGVEGGVVV EIGGGSVFCSCLREPFIKALANLAARDEDLIIVEASGMSDPSAVDKMLRLSGLDGLFEHT STICLFDPVKSLKLARVLEVIPRQLASASVVVLTKADITTKEERDAARAYIRSQEPDLPI VESHNGNADFASLPERALRLFPVGFNTPETRPDCFALEAVRTDAKTLLDALTADGNVLRV KGYIRAADGVRFVSDTGQGFETMESRDAPVPLMIICMQGTAGAVRAALHAAGIA >gi|316924996|gb|ADCP01000007.1| GENE 13 12841 - 14187 2031 448 aa, chain - ## HITS:1 COG:MA0146 KEGG:ns NR:ns ## COG: MA0146 COG0407 # Protein_GI_number: 20089044 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III decarboxylase # Organism: Methanosarcina acetivorans str.C2A # 4 321 7 327 339 108 27.0 1e-23 MHGKALINEALKGNPVERTPWVPYTGSQIANLKGYTAQEMFRDADKLYECCIEAESQYSP DGMTPMFDLQVEAEILGCDLAWYDNTPPTVCSHPLEGELVIPTRRIQKTDGRIPLILDVM RRFKAAKPDIAMYGLVCGPFTLASHLRGTNIFMDMYDDEDGVKALVAYCEEVVREVADYY IEAGCDIIAAVDPLVSQISPDMFETFLSEPYTKFFASMREKGMPSSFFVCGDATKNIEPM CLTRPDCIAIDENVDIVEAKKLTDAHGITISGNLQLTITMLLGTQQDNQKAAIELMDKMG THRFILAPGCDVPFDAPAANLIGVGQAVHNPEAVRKALESYVAKDNLPEIEMPDYVNLDY 
VLVEVVTIDSKTCAACGYMVATANNAAKIYGDKVKVVERSIMFPENLAFVSKVGLTNLPS LLVNGVIKHISLIPTVEKLREEIEEAMK >gi|316924996|gb|ADCP01000007.1| GENE 14 14768 - 16069 1281 433 aa, chain + ## HITS:1 COG:AGc3960 KEGG:ns NR:ns ## COG: AGc3960 COG2199 # Protein_GI_number: 15889459 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 265 428 232 392 400 111 42.0 3e-24 MEDFSKSLLLRDYDAVFVFNRENGQYHTLVLSKKLFNVVPSEGTYEEGMEVLAKYVLPEE REFFRRHTERDRILKLVDSVEKAVVQYRIRRDDGICLCRNLKFLPGDEKHILAALRDETD EVMEQIRDANILALKNSCINFIVSNLCENFMTVDVRTGMSTTVIGAGNGEMLPQQTFKEQ ILWFAENVVVPEERENYIQYFALDNLVARIRENGGGTVSMFCNVIYEDGRHELLIRSTLV KDTLDLRGEYVLLFAQDMTSIRKIEEANRQLMLTSRHDKLTGLLNRAAAEKLISEHLDLV GASSSYCFLLLDIDYFKSVNDRFGHLAGDSVLQYMGSSMRKSFRSGDVLCRWGGDEFVIF LRGVRSREIVRERLDALRARLLDCRAGEEPLSVTLSIGGAFGNGPSSLADLYSKADKALY QVKQQGRNGTVLE >gi|316924996|gb|ADCP01000007.1| GENE 15 16383 - 19532 1939 1049 aa, chain - ## HITS:1 COG:no KEGG:Desal_0435 NR:ns ## KEGG: Desal_0435 # Name: not_defined # Def: hypothetical protein # Organism: D.salexigens # Pathway: Bacterial secretion system [PATH:dsa03070] # 503 1037 619 1154 1167 141 23.0 2e-31 MSPWKVLGWLLVALLCLLLTAGAYGLALLWGWSLPSVWWLAAIAGAGAGVALIIFLFRTM RDDTHEPAAPLPVELPSERLPRWLLLDGGGLFPHLPGILESAGRPLLSDGAAENAGSVGL WSTAHAHWIAVDFSAPHLSGERKGDGGEGNTPDKAPSLPLWDQLLDDKAVRRWLSPPAGV VLCIGAESLSGHGDDTAAFMRQRLLTVRNRLGNAPVCLLVTGLEHAPGLSRLARHFTAEI SIGRNALHAPLGWFMPVRRPWAPMPRWAADGVREGMNGLTAALDSLVRCSESGIPAPGGP AFLLEGALRQLASPLASFAAALGDGLGGIFWAASPPAPSAFQQATRLSPVNAPPTALRSS ASAPFSVPGGAPLAASRASQPSATVPGGSRVTPSSGIAAHPLFVRELLLETLPASALPAT GRRARQRAVCWTAAILFAVTAGVALYRGGERSQTLLPDMIRFEDHLPRAASIAELDGLIA EFHKLERGCSGTVRLYGAPKKRLERLRAALRERLTASLPESGDTDRWINDLMIWAAARPG VETLRFRAPDGTVLASVDGAATASGRKAAARFLDNVSAFASQSNILPEQDTPQHTGASDS ITERVEALRLRYRASTFAAWYAAGQCLLDAVQAPGIEPRTLLPKRHVPLTPRDLLGPNDP CSAFLAAAERELRSAEGPLPVWVASLRDMRSVRLLTRLPGNGTPLSETAELLGEPSEGAR QTLGNLETLFRARTAWTDYRSALAALSSETGTSDGLVRLARSLYGGELNGALRAADDAWQ 
GLAAALEARNPDLRNDPLPLSLMRAPLLFAAGTATAEAARNLQQRWSTEVVGPVEGLQDE ALQQALIGEGGLLWTFVADAAQPFLRPAASGYAPASALGMRFPLSPAFLSLLSETPQHIA VYPASYPVRVGFSPVTVNPKVRAYPRGLSLRMDCGGEPLRADAYNYQGTALFDWSPEQCG NLTLAILFDGFTAEKVYDSPLGFARFVDQAAAGIMEFTPSDFPTVQQQLENLGITRLRTR FRIEGGEAVRERLHALPSALIRSILHIEK >gi|316924996|gb|ADCP01000007.1| GENE 16 19529 - 20401 244 290 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|158512121|gb|ABW69086.1| ## NR: gi|158512121|gb|ABW69086.1| EvpN [Edwardsiella tarda] # 11 135 15 133 216 64 37.0 7e-09 MLRDCFGPCFRLMLRLPDSPPTGRTDVRLEAERLLEEAGRLALRGREAATVETARRMVEA WMDETALGIAWEGRAAWLGNPLQRRWRTGRRGGDWFFDEIRHLSPHRAEDAELAEVALRC LGFGFRGHFYNDPTGLGLEHRALAERFGCTAVPLPFPPAPYLPRENRLFRHVGPLLTGLF AVLLLIFWGIWQHRLNREYTGSPQVPAVSAVHPASDSPRTDGARSAVSLSPRPSTIIQKG REAVGSKKQPIPDVYGELRNRLAGRTPAPASHDPGHPGANNRNTTPGRSS >gi|316924996|gb|ADCP01000007.1| GENE 17 20405 - 20875 409 156 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPLYRHDDPAESFRSLDSLLREILRGLLPETAAVVRFRADGEFLRAELPEGLGRTQALLV LQTEETVSSLLNEGRLVADAPDTLPATLAQAIAGVPLREIPAPPGLPQRDGIHYLEPDAR SERWQAALRSGALGVALFPLCGPALPGHLRLLYLRN Prediction of potential genes in microbial genomes Time: Fri May 13 01:42:13 2011 Seq name: gi|316924983|gb|ADCP01000008.1| Bilophila wadsworthia 3_1_6 cont1.8, whole genome shotgun sequence Length of sequence - 17055 bp Number of predicted genes - 14, with homology - 11 Number of transcription units - 6, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 121 - 717 381 ## Gmet_3310 putative lipoprotein - Prom 785 - 844 2.2 + Prom 731 - 790 4.3 2 2 Op 1 . + CDS 812 - 1846 731 ## COG3515 Uncharacterized protein conserved in bacteria 3 2 Op 2 4/0.000 + CDS 1892 - 4060 1990 ## COG3517 Uncharacterized protein conserved in bacteria + Term 4072 - 4129 9.0 4 2 Op 3 . + CDS 4457 - 4939 554 ## COG3157 Hemolysin-coregulated protein (uncharacterized) 5 2 Op 4 . 
+ CDS 4939 - 5343 260 ## 6 2 Op 5 9/0.000 + CDS 5343 - 7142 1075 ## COG3519 Uncharacterized protein conserved in bacteria 7 2 Op 6 . + CDS 7106 - 8023 557 ## COG3520 Uncharacterized protein conserved in bacteria 8 2 Op 7 . + CDS 8028 - 8567 23 ## 9 3 Op 1 2/0.000 + CDS 8682 - 10793 1592 ## COG3501 Uncharacterized protein conserved in bacteria 10 3 Op 2 . + CDS 10872 - 11171 77 ## COG4104 Uncharacterized conserved protein + Term 11345 - 11379 5.0 11 4 Tu 1 . - CDS 11536 - 12237 510 ## Deide_3p01230 putative DsbA FrnE like protein 12 5 Tu 1 . - CDS 12345 - 15311 2877 ## COG0642 Signal transduction histidine kinase - Prom 15358 - 15417 5.4 + Prom 15804 - 15863 2.6 13 6 Op 1 . + CDS 15898 - 16884 1173 ## COG0502 Biotin synthase and related enzymes 14 6 Op 2 . + CDS 16881 - 17055 93 ## Predicted protein(s) >gi|316924983|gb|ADCP01000008.1| GENE 1 121 - 717 381 198 aa, chain - ## HITS:1 COG:no KEGG:Gmet_3310 NR:ns ## KEGG: Gmet_3310 # Name: not_defined # Def: putative lipoprotein # Organism: G.metallireducens # Pathway: not_defined # 5 183 1 174 184 111 34.0 2e-23 MIRFLKFHIAVLLGVIALCSGCSAPEPPPPTPPENDVPWVYEPDAVVLRISADERLNEHE GEPSSLMLCVYELATREGVDKRLASPEGFAELLACGRFDDSVVTSRRLFSDPGQAQNLSL DREEGVRWIAVIAGYFHGTPAHSARVVAVPVRKIVEGWVPFFKDTRHEPAQTIIPIRLGP AELMGGEPPQTQAGRQSS >gi|316924983|gb|ADCP01000008.1| GENE 2 812 - 1846 731 344 aa, chain + ## HITS:1 COG:PA2360 KEGG:ns NR:ns ## COG: PA2360 COG3515 # Protein_GI_number: 15597556 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 24 341 26 364 366 149 35.0 8e-36 MSRAKSDFVDWASCLHPFGEDPYGPALPHDPVCAAIREARREDDPGLPQGVWERELKRAD WDKALSLSLGVLRERSKDMQVAGWACEAALMRYGFASLPGGLRMIAGLCSGFGDGLHPQP EGSDQEARLARLAWLDSTLAERAASLPITEASPDVPAATYADWMAMEHRERIQNGNGKNG VAGDARRMLNLAGTQTSPAFHASLHGQLREALLALDELARAADALCGGQAPSFHALRDRL EHIDARVLAWHPSAGTDPLVSPPSSPASLSSPPPPSFSSREQAYASLSAIADYLMRTEPH SPAPWLVKRAVSWGGMTLAELLDELLGQGENLASIKKLLGMTKE >gi|316924983|gb|ADCP01000008.1| GENE 3 1892 - 4060 
1990 722 aa, chain + ## HITS:1 COG:PA2366 KEGG:ns NR:ns ## COG: PA2366 COG3517 # Protein_GI_number: 15597562 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 292 720 63 491 494 511 62.0 1e-144 MGESIQKRLGRVRPPRVQITYDVEVGDAIVKKELPFVMGIMADLAGDRTVREDLGLPPLA EYKLRTFAAVDRDNFDDIMKTVAPALKLSGLDRFITEDASAAWREGGVEPEKAAFSCALR FEKLDDFRPECLVKNVKTLAAFFERRNLLQDLAAKLDGNDALQASLQKMLFPTGDSVSEL DALRKAYKEALASVDAARDAVSKAGEDQEKQKAAEEDVQRAETEASEAKQKLDEKRKAKR PDTEAIAAAMVRNSGDPDEDKRQREVADARLAACLAEHEDNPFTLPASGSMLGMLTERVA CKDKLLACQLDAILHAEAFQELEAVWRGLHYLVFNTETSDRLKLRLFNASFKELRTDLER AVEFDQSLLFKRVYEEEYGTFGGEPYSCLLHVHEYGLSAVDLGVLQNMAEVAAAAHTPLL SAASPQLFGLGSFTDLPLPRDLHKIFQSADYIEWRSFREKDDSRYVTLCLPHLLMRLPYG NDTDPVETFVYEEDVAGPSPDRYLWGNAAFALAACITAAFAKYGWTAAIRGYEGGGAVEN LNVYKYRQTNGETVALCPTEVSITDRREKELSDLGFIALIYRKNSDKAAFFSGQTVHKPP VYTSDTANANARISARLPYLLNASRFAHYVKANMRDKVGSFMTADNVSKYLNNWISQYVL LSDDAGDDIKARYPLREARVDVSEDPGNPGAYKCVIFLRPHFQLEELTASIRLVAELPPP AV >gi|316924983|gb|ADCP01000008.1| GENE 4 4457 - 4939 554 160 aa, chain + ## HITS:1 COG:STM3131 KEGG:ns NR:ns ## COG: STM3131 COG3157 # Protein_GI_number: 16766431 # Func_class: S Function unknown # Function: Hemolysin-coregulated protein (uncharacterized) # Organism: Salmonella typhimurium LT2 # 6 159 5 159 161 82 29.0 2e-16 MSTNTFLKLGDIEGESTISGFEKQMELINFTNGFHQPTSPIRSSTGPTTGQAVHSMMNVT KYLDSSSPNICKALWSAKVLDSVVLTCCRMDNDAAIKFLEITMENVVVASYNLRGGGDLP YEEIALNYATIKYTYIRQKEEGGSDGNIAASHDLKTNKVS >gi|316924983|gb|ADCP01000008.1| GENE 5 4939 - 5343 260 134 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKEERRIPLLSRLAGKASGRGLSRSEYLSAVAEDVEELLNTRAPCGNGDAADLVMGYGLA DWSSAPLDGREVARAVSAALERFEPRLKKIRVSPLGFEDGRLHLRVDAASEEEPVTFLLQ CDGGVFSAGAGAEG >gi|316924983|gb|ADCP01000008.1| GENE 6 5343 - 7142 1075 599 aa, chain + ## HITS:1 COG:PA0088 KEGG:ns NR:ns ## COG: PA0088 COG3519 # Protein_GI_number: 15595286 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: 
Pseudomonas aeruginosa # 11 597 9 617 619 207 30.0 4e-53 MHAFEDIAGLYREELRSLREDAVRFAVRNPGMARQLGLASDTQPDPLVNLLLESFAWLAA RVRADAESRLPLVSHSLLGLLYPQFVCPIPSVGVAEFIPNPAGVAGLPDGFGIPAHTPIH NRDERGGVLRWRTARPLTLHPAKVVRASIIPPASWMPAAAQGVPGVLCLRIEGIAGLSLA KFGLPSLSLHLGGDRFATFPMHEWVHGACKDVFLADDAGAAKPRALSLGREALCHAGLEP GGRLFPESAYALPGYQLLQEYFAYPEHLLFWEVGGLERRRELGDLPGFDLLFPVDAPPPK TVSSRLFRTCAVPVVNLFERMAEPIRVDHERSGHLLVPDLRHRDSTEVHSVIAVDALEGG ERSSVMPYFSLSGEGDAPPSLMWFLRRKPSLDGIGTDVFLHFKEAAFRPETARPWVASAR VLCTNRHRAAERWRQGELTRDTDMPTAAIRFCGPPSQQHDPPMFGGELWKLVSHLSLHQL SLSGPDADNVLRELLRLYAPPRDAFAQRQIAGVNGLCAKPLIRPLRSAPGCAARGTRLTL TLDEDAFTGGSAFLFAEVLDVFFGLYAGINTFTQLVLKTCNRKGTWHRWPLRSGALPLI >gi|316924983|gb|ADCP01000008.1| GENE 7 7106 - 8023 557 305 aa, chain + ## HITS:1 COG:PA2370 KEGG:ns NR:ns ## COG: PA2370 COG3520 # Protein_GI_number: 15597566 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 69 298 78 293 338 96 36.0 5e-20 MASAIRRASADLDGDALSRFRSAPWRFDVWQAGAILEAWKGRMDWGVPASDSIPAGVVRG VAFPEGGAPRLDVNLLGLAGMDGPLPPHVALLVRERMRAGDPAFQEFLDLFHNRILHLWR DMASSLCPELRADGLPEDHPFAAFLEAVSGLPNTGERARDVPMPGPVAGRGDVPAALFRS CAAVWARQPRSAAGLEKLVKAAFGVEARVEPFHGAWVDLPEEDLTRLGGRDAGNARLGST ALLGSRVFDASAGIILRLGPLTEAQRRMFLPPPAGTCRADLLALVRWYLGDVGCEVILNN FTRSG >gi|316924983|gb|ADCP01000008.1| GENE 8 8028 - 8567 23 179 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPDIGDVYYLSNAIFFGFDGNPLGITGPPDHYFVVYDIVGEICFAKELTSNSTCWHELSV AYKPITAPPDFFSTKSYYNNIHEDFCFCRSESSAFPLVTELMPAWEADPTLGPKAFTHSP RPYRTAGAKSIWSANMGPVIAKPAPSPGPVSLPTAPLTPTPTPPTAKDRFVAALSKGKT >gi|316924983|gb|ADCP01000008.1| GENE 9 8682 - 10793 1592 703 aa, chain + ## HITS:1 COG:PA2373 KEGG:ns NR:ns ## COG: PA2373 COG3501 # Protein_GI_number: 15597569 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 9 703 25 668 668 390 37.0 1e-108 MESLDMSTTLTGYEALSRPFRYVLEYYPAEQPELAALIGKPYTVRLPMPDGSTRPLNGIV 
FRAEQGPSTVRSARYTLELRPWFLRLDQERTCAVFQNMSVPEIVTKVCTDAGFGHIRNAL SSTYPAKEYTVRYGETSFAFLSRLMAEAGIVYFFEHSGSAHTLVMADSPDVFADCPQGAT LEWLPDAQDKEGGPPTEPSAHVFELSLSRQVAAASCSVDDYNPLTPETSLAASAGEGFPC LGYPLAGHIDQQSGEGLAAIRLDACESEGFKLRGRSSCPGLSAGCAFAVKGHPDGQVNAR WVATEVRFTASFPGDGTAETGRFLADVSAVPASARYRPLPFFARSRMSGPLTGVVTGKEG EEVWTDQHGRCKVRFHWQGASDETSSCWVRVAQPWTGNGYGALFLPRIGQEVVIGFVGGD PDRPLVTGMVYNSGNPPPWALPEHAACSGLLTRSFPDGQAGNELRFDDTKDAELVYLHAQ KTFSCDVEDARTVTIIGEGGDALTLEKSSRITTLKEGNDALTLEKGNRSVELKEGDDAFT IEKGSRSATLKEGDDALSLEKGNRAVTLKEGNDLLVLEKGGRTVELKDGDDGLKVKGKRH VETGGDEERKHGGNVVINVKGDYTLKVSGNLTIEAGGTLALKSAKAQFSANQGMEISSSA NLSVSAQAELTQKATMVDIKANAKGTLSAGAMLEVKGGLVKIN >gi|316924983|gb|ADCP01000008.1| GENE 10 10872 - 11171 77 99 aa, chain + ## HITS:1 COG:all3319 KEGG:ns NR:ns ## COG: all3319 COG4104 # Protein_GI_number: 17230811 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 3 99 4 100 100 105 72.0 2e-23 MPPAARVGDMHVCPMQTPGLPPIPHVGGPILPPGCPHVLIGNMPAACMGDSAVCVGPPDV IIKGSSSVLIGGRPAARMGDQTAHGGQIVMGCPTVIIGG >gi|316924983|gb|ADCP01000008.1| GENE 11 11536 - 12237 510 233 aa, chain - ## HITS:1 COG:no KEGG:Deide_3p01230 NR:ns ## KEGG: Deide_3p01230 # Name: not_defined # Def: putative DsbA FrnE like protein # Organism: D.deserti # Pathway: not_defined # 3 208 5 206 211 92 36.0 1e-17 MSHFLYISDVFCPWCYGFAPVMQRISSEHPGYPVRVLSGDLVENPTTTGEMMRNHPTMRE FFVRLAKTTGQAVGQDFLRRLEPGQGDLRMFSPDMAVPLAALKKLAPGHELEQMEAFQSA FYGEGRDVLAPEVQMRIANVDGDVFMRALEDPGVQAAAEAEREEALDILGDFVVYPTLFL ETDDGERHVLARGYADYPVVAAKLASVLGGGAGGSETRPANACGLDGHCDISA >gi|316924983|gb|ADCP01000008.1| GENE 12 12345 - 15311 2877 988 aa, chain - ## HITS:1 COG:VC1349_3 KEGG:ns NR:ns ## COG: VC1349_3 COG0642 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 607 845 2 240 260 244 51.0 5e-64 MPKRLTLFAPVLLAVVLSIGIMAWFGYHNAKDILEQELVAHQKMIAQSAVDAIHTYFTGI 
TENVETLAASGPIRRLLKNPDSAEEMDSARRLTRLFIRHQSDMPNLNLLNKNGQIILSST TAMPGITRSVRTDFSDIDRPSFGSTTNGDGKVMAYVSAPVRSDGERGGERIGEVVVIIDL ENFLQSWKRTASIAPSDGYFHIINAQGKVLASWNSEAPPDNMLSPGKQNIFHQPEGQLIR FASRDRHTSLGIWFPIPSTDWRILMAVNENKILGPANILRDGTLTISGATGLLLLLLVWV LLSSLTRQIREKNLQLDAISSHLLGGLLITRLDRSFTILYANDGYLNMVGYTRSQLRKEK RNAAVSLCASEERNATMQAIFSQLSQNGMLSLEHRLQRRDGTRLWALIRGRQVNDPELGQ IGVWVIIDVTEQKETQLAFERQSCELETLVQQVSASENRLKFLVDSASIPLWSTDLETGL ISYNEHVCRMLGWPEGVLHMTLPDFLLRCHPDDVEHVRNVFESFSAEDAGGTLEYEYRVR DGNGNWRWILVKGQKAINPLTHKAERVGIVIDNHARKQEALDTLQRAAELEAQVQTRTAE LAARNAELLKSQQEALEATRAKSEFLATMSHEIRTPMNGIIGLTYLALQRECPAAIADYL AKIDVTAKTLLRIINDILDFSKVESGKMDFEEIPFTLSEVLDNTLQMFTHAVREKGLSLR FDVGEGVPQSLVGDPTRLGQILLNLVGNAMKFTHEGSVTLFVRCVETDQQTVTLGFAVRD TGIGIPADKLPSLFQPFIQADMSVSRRYGGTGLGLAIARVLVQHMGGTIHVRSVVGKGST FSFTLRFRRPAAVLKGAESTAPGNTPAPLDGLRVLLAEDNDINQIVATEILKMKGIAVDV ASNGLEAVDMARSGAYDVILMDIQMPELDGIEATRRIRAFLPDIPIIAMTAHTMKGDAEK CLEAGMQDHIAKPIDPETLFAALQRARG >gi|316924983|gb|ADCP01000008.1| GENE 13 15898 - 16884 1173 328 aa, chain + ## HITS:1 COG:slr1364 KEGG:ns NR:ns ## COG: slr1364 COG0502 # Protein_GI_number: 16330170 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Synechocystis # 1 323 22 349 362 231 40.0 1e-60 MDALLVRIAERITSGADPLLSPSEALELARLPESATLDILSVAGLARAARKPAAAFTCGI INAKSGRCPENCAFCAQSAHHQTDSPVYPLYDTDTLLRRAEELAANNVDRFGIVTSGTAP SDRDFDALCESALRIGREVKIGLCASLGLLTPERAARLREAGFTSYHHNLETSASHFPSI CTTHAYEQDLETVRVARSAGFRVCSCGIFGLGETWEQRIELSQTLTDLGVDSLPLNFLNP IPGTPLGDSPLLPPSEALRVVALMRLLHPEQDVLICGGRSKTFGQWQSWVFAAGANGVMT GNYLTTKGCAFCDDAEMYEVLGLREERS >gi|316924983|gb|ADCP01000008.1| GENE 14 16881 - 17055 93 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKGLFITGTDTDAGKTTVTAALLRALKVAGVPAAAVKPVQTGCVMRGGEKENRGEEGT Prediction of potential genes in microbial genomes Time: Fri May 13 01:43:20 2011 Seq name: gi|316924957|gb|ADCP01000009.1| Bilophila wadsworthia 3_1_6 cont1.9, whole genome shotgun sequence Length of sequence - 35747 bp Number of 
predicted genes - 27, with homology - 25 Number of transcription units - 16, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 190 - 2337 1977 ## COG0161 Adenosylmethionine-8-amino-7-oxononanoate aminotransferase + Term 2415 - 2458 3.1 2 2 Tu 1 . - CDS 2451 - 2840 619 ## COG3339 Uncharacterized conserved protein - Term 2912 - 2964 9.0 3 3 Op 1 . - CDS 3030 - 4415 2235 ## COG0471 Di- and tricarboxylate transporters 4 3 Op 2 . - CDS 4494 - 6203 2327 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 5 3 Op 3 . - CDS 6206 - 7825 1654 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 7946 - 8005 1.6 - Term 7944 - 7972 1.0 6 4 Tu 1 . - CDS 8052 - 8741 785 ## Desal_0411 transcriptional regulator, Crp/Fnr family - Prom 8934 - 8993 1.8 - Term 9164 - 9217 2.1 7 5 Op 1 11/0.000 - CDS 9463 - 10749 677 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 8 5 Op 2 11/0.000 - CDS 10749 - 11255 688 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component - Term 11288 - 11322 0.8 9 5 Op 3 . - CDS 11323 - 12348 1444 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component - Prom 12465 - 12524 1.9 10 6 Tu 1 . - CDS 12534 - 13553 287 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 - Prom 13686 - 13745 9.2 + Prom 14105 - 14164 3.2 11 7 Op 1 1/0.000 + CDS 14190 - 15227 1226 ## COG1609 Transcriptional regulators 12 7 Op 2 . + CDS 15243 - 16583 1527 ## COG0205 6-phosphofructokinase 13 7 Op 3 12/0.000 + CDS 16602 - 17420 1144 ## COG3959 Transketolase, N-terminal subunit 14 7 Op 4 . + CDS 17439 - 18377 1409 ## COG3958 Transketolase, C-terminal subunit + Term 18415 - 18474 5.6 15 8 Op 1 2/0.000 + CDS 18493 - 19380 1019 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases + Term 19391 - 19435 1.8 16 8 Op 2 . 
+ CDS 19447 - 20529 1084 ## COG0451 Nucleoside-diphosphate-sugar epimerases + Term 20552 - 20608 8.5 17 9 Tu 1 . + CDS 20632 - 21441 610 ## gi|225570193|ref|ZP_03779218.1| hypothetical protein CLOHYLEM_06289 + Term 21475 - 21523 -0.9 18 10 Tu 1 . + CDS 21686 - 22609 967 ## COG0191 Fructose/tagatose bisphosphate aldolase + Term 22643 - 22690 8.2 + Prom 22937 - 22996 3.8 19 11 Op 1 . + CDS 23062 - 23904 949 ## COG1741 Pirin-related protein + Term 24083 - 24116 3.5 20 11 Op 2 . + CDS 24180 - 25592 1525 ## COG2133 Glucose/sorbosone dehydrogenases + Term 25620 - 25677 14.1 - Term 25939 - 25988 1.0 21 12 Op 1 27/0.000 - CDS 26065 - 29262 3721 ## COG0841 Cation/multidrug efflux pump 22 12 Op 2 13/0.000 - CDS 29314 - 30468 1141 ## COG0845 Membrane-fusion protein 23 12 Op 3 . - CDS 30470 - 32254 1423 ## COG1538 Outer membrane protein - Prom 32361 - 32420 1.7 - Term 32447 - 32485 4.5 24 13 Tu 1 . - CDS 32536 - 33432 976 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 25 14 Tu 1 . + CDS 33724 - 34077 132 ## 26 15 Tu 1 . - CDS 33907 - 35619 1565 ## COG2199 FOG: GGDEF domain 27 16 Tu 1 . 
+ CDS 35468 - 35747 114 ## Predicted protein(s) >gi|316924957|gb|ADCP01000009.1| GENE 1 190 - 2337 1977 715 aa, chain + ## HITS:1 COG:NMB0732 KEGG:ns NR:ns ## COG: NMB0732 COG0161 # Protein_GI_number: 15676630 # Func_class: H Coenzyme transport and metabolism # Function: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase # Organism: Neisseria meningitidis MC58 # 279 711 1 429 433 504 57.0 1e-142 MGGVGASLGVSSSSSSSSSSSSSSSSSASSALVAPDVACYEAAGGGGIVLETYEPACSPH LAARLAGRPLTVSGLREKLERKAPDGTFLLIEGAGGIYVPLNDRETMLDFMREIDFPVLL VVGNKLGCINHALLSLDVLQSHGLRVCGMILNRPIPETVCDPGSGPDPDVGNERRLLRDN AETLARMGRERGVPVLAEIPYQTGFANNAENVWSSLAALVGPAVEALSPLFSDCLSNGIC PDNASSYIVSPSSSHERGQTSPASSCSNRESSRVASPHIPSSPLSPSSLLAFDREHLWHP YTSALNPLPVREAVGTSGVRIRLRDGRELVDGMSSWWCAIHGYGHPVLMDALRRQSAKMA HVMFGGLTHEPAVELGKRLLPLLPDNLKHIFYADSGSVSVEVALKMAVQYWASKGRPGKT RFLTPKGGYHGDTQGAMSVCDPVNGMHTLFTGILPKHLFVERPNCRFDSPFDPDSLRPMR DAIERHAEGLAAVILEPIVQGAGGMWFYHPDYLRGLRKLCDEHELLLIVDEIATGFGRTG KMFASEWAGLKPDILCLGKALTGGTMTLAATVASEAVARGIEEAGPEQGGGVFMHGPTFM GNPLACAVGCASLDLLNASPWAERVEGIRRGLEEGLSPCREFRGVRDVRTLGAIGVVEVD KPVNMAVLQDFFVERGVWVRPFNRNVYLMPPYCIGPEDLRRLTDAIVDAVRGAYS >gi|316924957|gb|ADCP01000009.1| GENE 2 2451 - 2840 619 129 aa, chain - ## HITS:1 COG:AGc4850 KEGG:ns NR:ns ## COG: AGc4850 COG3339 # Protein_GI_number: 15889931 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 27 120 35 126 133 71 43.0 5e-13 MSDKDIIDMKEENGQWSAYVSEFSPEKLWEKIKGHAKKVGCEGIRSALRLFYALDNPSMP LKTKMVIYGALGYFISPIDVIPDFIPVVGFTDDIGVLAAAVVMAASYIDAEAKAKADAKL AGWFGEGTC >gi|316924957|gb|ADCP01000009.1| GENE 3 3030 - 4415 2235 461 aa, chain - ## HITS:1 COG:SA0645 KEGG:ns NR:ns ## COG: SA0645 COG0471 # Protein_GI_number: 15926367 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Staphylococcus aureus N315 # 96 461 145 517 517 73 23.0 7e-13 MNRALTLKWFAVLGLASLFYLLAPRDVSPQMPLYLGMTSAAVIIWTFDLLPAVAVAAALT 
FLYLLTNMAPPEVVLGPWTTVLIWTTFGAAIFGEAMEKTGLAKRVALRCMLLTGGTFTGM LIGFAIGGIILGLLLASGFARTVIFCAIAVGIIRALDIDPKSRLSSIIVIACFFAAAAPT MQFLHTSESFIWSFQVLMKAIGGNVDWWEYLNHACFINTIYVALSFLTIYFMRGKVRLPE GGKLKLILEERLAEMGPMTVSEKRLLVLAILGILGFMVEPWTGVNAVYVLCLVALFCYMP GMKIMEPSSFNNLNIVFLVFVAGCMAIGFVAGEVGANKWAVSSIIPFLKELPPTLGVLCS YLAGVVTNLLLTPFAATVAFTPAFGELGVQMGVNPLPLFYAFEYGLDQYFFPYEGVMFLY IFTTGCVTLKHVIPTMIARCIVAGAFIAVIAIPYWTFIGIM >gi|316924957|gb|ADCP01000009.1| GENE 4 4494 - 6203 2327 569 aa, chain - ## HITS:1 COG:MTH1502 KEGG:ns NR:ns ## COG: MTH1502 COG1053 # Protein_GI_number: 15679499 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Methanothermobacter thermautotrophicus # 1 528 5 537 558 155 28.0 3e-37 MSAKPFVFDTDVLIIGGGFSGSWAALTARQHVENVLIVDKGPRDWGGLGGMSGGDMIVKQ PEFAAKDLVEELVYYYDGLCEQDVLEEILNQSYERFKDYEKMGHEFARDDSGRLMSIPQR GLELMRYYFYHPYGKGGAHTTQILNAELQRLNVQRIGRIEITDLVKDGDAVSGAVGFHAQ SGTPCLFRAKAVILAAHNGGWKGSYLLNTCAGEGAALAYGAGASLRNMEFIENWNVPKLF AWEGQTGMLPYGARFLNGEGEDFMRRYSPKLGAKADPHYNVRGMAFEVRAGRGPIYFDTS TMSPEGVEIMTPTGGWMKLNDTKLKELGIDFFKSKTEWMPQVLTSFGGTVADKDGWSGVP GLFVAGRARSVTPGVYMGGWDTCKTTTTGYIAGNSAGKYVSGLTSAPRFDDAHASGTLDA TLSLLGKPGVYPKDVVRLMQELMAPMDVCILKTGKGLTRSLERLEDAKRNILPYMTAPEP HYLVKLVEARSMALITEMYLKASLMRKESRSGHYREDFPKRHGDPAWIVIRDGHDGMDLR ADPVPLDRYPVKPHRYYMDNFAFPKTSAI >gi|316924957|gb|ADCP01000009.1| GENE 5 6206 - 7825 1654 539 aa, chain - ## HITS:1 COG:TM1217_2 KEGG:ns NR:ns ## COG: TM1217_2 COG0493 # Protein_GI_number: 15643973 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 37 430 60 469 472 223 37.0 7e-58 MINVIDQDKCIGCGTCFKSCSLDVFRLDTDQSKVSPCMAACPAGTDIRGYNALVQQGKLE EAAQKLRQANPFPGITGRICFHPCEKECARNAVDGAVNINAVEQFLGDLDLPEEQPAVRH TAKVAVVGSGPAGLSCACFMARMGYPVTVFEAAPRPGGMLRYGIPAYRLPDAVVEAHVAR 
LEALGVSFRCNTRIGKGGDLTLGDLRRRGYKAFLLAPGTSLSRRIAIEGAELPGVLWGVE FLRAVRGDHAPTFSGHVVVVGGGDVAVDAAISAKRLGASSVSMVSLESEAELPAYPHNIA DARDEGIDFQCGWGPVRVEGTDAVSGLAVKACLSVKDASGAFRPSFDEAQTRTIPADAVI FAIGQAADLDPFAEDVAITPRRTIETAAVTFETSRPDVFAAGDAASGPASAIQAIAGGRE AAFSMDRYLRGGQILANRAAVRETVDKDRLPGEGVLLSPRNERAVTDAPGFGERRRGLDL HAVLAEGMRCMTCGAKARIAYNDDCMTCFTCELRCPAEAINVHPFKEPLPRTLEIHCED >gi|316924957|gb|ADCP01000009.1| GENE 6 8052 - 8741 785 229 aa, chain - ## HITS:1 COG:no KEGG:Desal_0411 NR:ns ## KEGG: Desal_0411 # Name: not_defined # Def: transcriptional regulator, Crp/Fnr family # Organism: D.salexigens # Pathway: not_defined # 16 228 16 226 235 106 30.0 8e-22 MKPLKLDADSSRTGKLEDMNAPWSRLLHLATRYTYPKRYPLVLAGDAMFDFYYLAKGRLR IMHGAENGRERAMVYIGSGNVFNEATALAGFDDPDCRFYCMEDVELYRFPGTLLHDPRFV AEYPELIINLMVSMSTKVLVMHANLSETGGGTAVKQVSLFLCGLSRLHGDKLDFNPQMTQ EELAIFLGMHRATLVRALRVLRGCGALLQLTKNRLRIGDLALLQKIAGE >gi|316924957|gb|ADCP01000009.1| GENE 7 9463 - 10749 677 428 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 1 428 1 432 435 265 34 3e-70 MSLEVLIWIALAVLLVMGTPIFVSLGMASITVLLLSDIPLSLVMIDLIKVSDMFPLLAVV GFVFAGALMERGGMAAQIVEVASMLVGHIRGGLGIVTILGCMFFASMIGSGPGTVAAMGS IMIPSMIKRGYSPEYAAGVSATGGTLGILIPPSNPMLVYGIIANVSISALFTAGFVPGAL VGTCLMFTAYTIARRHGYKGTMHSYTMAEFGRALLKNIWSLMAPVVILGGIYAGICTPVE ASVVAVWYALFVGFFITKKLSLKEVFNAMKLANISTGTILIVVGVSTIFGRFLTMYQIPQ QLAAQMMLYTTNPILILLLIAALLFFLGMFMETLATIVVLAPVFLPIIKSVGIDPVFFGV FWVITNEIALLTPPLGVNLFIAMNLSNLPLERVAKGACPYIFLLILVVFFLIFFPDVVTS LPKALGMY >gi|316924957|gb|ADCP01000009.1| GENE 8 10749 - 11255 688 168 aa, chain - ## HITS:1 COG:HI1030 KEGG:ns NR:ns ## COG: HI1030 COG3090 # Protein_GI_number: 16272964 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Haemophilus influenzae # 1 149 1 147 161 64 30.0 1e-10 MREQLASIWKNLEEYICCALLGVIMTLLMMQVCYRYFGGKSIAWSEEVSRFLFLYLVYFA 
ASLAASKNLHIRVTAQLKLLPKLGQMLLLLTTDIIWLIFCIIVVNVGTDFIISMSDRPMV SGALMLDMRYVYAAVPIAFTLQAVRLVERWWRIFTKRDPLVVPSEEVF >gi|316924957|gb|ADCP01000009.1| GENE 9 11323 - 12348 1444 341 aa, chain - ## HITS:1 COG:HI1028 KEGG:ns NR:ns ## COG: HI1028 COG1638 # Protein_GI_number: 16272962 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Haemophilus influenzae # 15 324 13 317 328 158 30.0 2e-38 MNIRQSFNRALLGAVCGALLLVSPASAAEKKIVMRYSHSSSAMVKEPHHAAALDFKNYVE KATNGKVDVQIYPGSQLGGEERSFQDIQQGVIQIASLAVNNVTVFSPSMGVFDLPYMFTN YEDCYKLIDQNWDEINKRMIAESGNMAVGWLVQGFRVLSNSKRPINTIEDLKGLKIRVPN NPIMIATFRAWGGEPAPMAWDETFNALQQKVVDGQENPYPVFASNKFEEVQKYITEIHYK VWIGPIVVNAAWFKKQPADVQKAILEGGRLATENNRKMIAEMETDLVKVLKDKGVEILAK PEDEPVWQEKAMGVWPQFYDKIGDISLLDTMMKSLGRTRPQ >gi|316924957|gb|ADCP01000009.1| GENE 10 12534 - 13553 287 339 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. PR1] # 43 317 38 310 328 115 29 5e-25 MQLSNTWKATVGLMIGAILCLPAQAAAEKKIIMRYSHSSAAMVEEPHHTAALDFKKYVEE KTKGRVEVQIYPASQLGGEERAFQDVQQGVIQIASLASNNAAVFAPSLYVIDLPYLFRTN QEGWAILDKYWDELNAKTIKESGNRIIGWLDLGYRHVCNSKRPIRTIEDLKGLKIRVPNN PVMINTFRAWGCEPTPLAWDETFNALQQKVVDGQENSYVVFASNKFEEVQKYMTELRYKL QIIVMVVNDTWLKKQPADIQQAILEGGRLATKHNREMIMAMEARLKPAMKKAGVEILDKP EDENVWEEKAMALWPDLYPKIGNLDLLDKMMADMHRKRP >gi|316924957|gb|ADCP01000009.1| GENE 11 14190 - 15227 1226 345 aa, chain + ## HITS:1 COG:YPO0108 KEGG:ns NR:ns ## COG: YPO0108 COG1609 # Protein_GI_number: 16120455 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 3 343 2 341 342 181 32.0 2e-45 MNENGFNNTRATIKDVAVKANVSLSTVSRALREPDKTPAETVSKVRAAAEALNYVYNATA GSLSKRRSDTIGILLPSPTYAAFGVNLMAIQKTCSERNYSCKVALSQFSPEEERLAMRRF HEQRIGGLILVGLDMSNVEYMKTLEAAGIPCIILWEVPDESVNYIGIDNARSIYTSVKYL IDLGHKRIGFLVGPMTCARRTIDRMEGYKKVLADNGIPFDPELIRSQPPSYLLGKESMRA FLKLQDPPTAVLCVNDYLAIGAIRAITEAGLSVPEDISVCGFDDIDIAAYFNPPLTSIKT 
PCYEMGQMAANIMISAIESQTPLKVQYLLDTELIVRRTCGRVKEK >gi|316924957|gb|ADCP01000009.1| GENE 12 15243 - 16583 1527 446 aa, chain + ## HITS:1 COG:TP0108 KEGG:ns NR:ns ## COG: TP0108 COG0205 # Protein_GI_number: 15639102 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Treponema pallidum # 13 436 9 457 461 393 46.0 1e-109 MNTCEATPSAAQEIRIQSLGTPEIDSPLSRNLATAGERRVVFTVDEELVDEGAAPRPMSF ELAGPRDRIYFDPSKTKCAIVTCGGLCPGINDVIRAIVMTAYNAYRVPSVLGIRYGLQGF IPSYRYDVRELAPRDVEGIHEFGGTILGTSRGPQSSSEIATALERLNISALFIIGGDGTM KAAASIQQEVARRGKHISIVGIPKTIDNDINFIPHSFGFETAVDKAADAIRCAHIEAASV FNGIGIVKLMGRESGFIAANASLSMREVNFVLVPEVPFSLYGERGLFPALEKCLAAHSHA VLAVAEGAGQDLLSEHETRCDASGNKALGDITGFLRAKIAEYFSAKKIPYYLKYIDPSYM IRSVPANANDRLYCGFLGQHAVHAAMSGKTGMVVANIMDKFVHLPLELVTRKRRTMSVRS DLWQSVLETTGQGDVMGTSPEPEQHL >gi|316924957|gb|ADCP01000009.1| GENE 13 16602 - 17420 1144 272 aa, chain + ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 9 271 7 269 270 265 51.0 7e-71 MNERKLRQLTELSKEARQLLTKMIGQIGVGHVGGSLSLIEALVYLYYEEMRVRPEEPGWP DRDRIVLSKGHAGPALYALLAMKGYFDKETLLTLNQPGTRLPSHCDMKLTTGIDMTAGSL GQGLSAAVGMALAARMERKDYRVYCIVGDGEQQEGQIWEAAMYAGSQELDNLVVLVDDNG MQIDDYTDAINAVRPLDKRWEAFGWATLCIDGHNFSDLEAALTHARTIKKRPTAIIMATV KGKGLSVAEGKLSSHNMPLSPEDVAAALQELQ >gi|316924957|gb|ADCP01000009.1| GENE 14 17439 - 18377 1409 312 aa, chain + ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 2 307 1 306 309 273 43.0 4e-73 MLESQEMRKVYCDTMIALAKEDPRIVDVEADLTGAHGMKPFKAAYPDRSFNVGIAEANMV GVAAGLSACGKIPFVHSFATFASRRCFDQIAISVCYAGLNVKIVGSDPGVGAELNGGTHM ALEDMGIMRTLPGMTVFEPTDSVQLRKALPAIVEHEGPVYIRLFRRQAENVFDDGYEFDL 
GKADLLRDGSDVTLIASGVCVANALQAAETLAQEGVSARVLNIHTIKPIDADAVIKAASE TGALVTAENHNVIGGLGSAVAEVLAEQRPTPLERVGVKDHFGQVGKAPYLMGVFGITAAD IAKAARKAIARK >gi|316924957|gb|ADCP01000009.1| GENE 15 18493 - 19380 1019 295 aa, chain + ## HITS:1 COG:MA0614 KEGG:ns NR:ns ## COG: MA0614 COG2084 # Protein_GI_number: 20089503 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Methanosarcina acetivorans str.C2A # 3 284 11 288 300 125 31.0 7e-29 MDTVGIVGLGVMGSNCAKKLMESGVPVAGYDPYPPAVKRASEAGVAVCGSPAELAGKARV ILMFVPGPADTEKVVLGEGGIASGAPEGTVVVNMSTVDPGVNIRMGEALAPKGIDFVDAP VMGSPSGVGSWAFALGGSDEALAKIKDVLLVLSGSEEKLFHIGPLGHGNKLKLLNNMMLG AIDACAAETMALAEHMGLSQKTLIDVAVAANARVLSSAYKEIGTRIAEGRYDEPTFTVDM LIKDNRLCLDMAREYGAPLILGAAVDYVHRMVSVQGYGAKDHAVSWKAVAKNWKA >gi|316924957|gb|ADCP01000009.1| GENE 16 19447 - 20529 1084 360 aa, chain + ## HITS:1 COG:RSp0960 KEGG:ns NR:ns ## COG: RSp0960 COG0451 # Protein_GI_number: 17549181 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Ralstonia solanacearum # 56 353 7 309 318 122 30.0 1e-27 MACFRRLCPDVLGVVASEWFFPKAHVICIGYKAKTIQVLALSLLLAVGHPRGGFMRILVT GGAGFIGSHLVRALLRQGDEVVVFDRVPVPHLLQDIMDSITYVQGDSASDLDLYRAVATN GIEGIFHLGALMAGVCEQNPPLAFQVNFRSTQVLLDAAVACGVKRFFFMSSISLYSPTSV EPVPEDAPKDPATIYGQTKLAGEHLLRWYADNHGIDSRGIRPTWVWGPNRTNGLTTQYTT GLVDSIARGGEVHVDNPDERGDWIYIHDTIKAMLLVWNAEKPAQRFYTVCGSVHTLREVA ELTNRLCPETKVTYAEHSANTSPYACSFDDSAIRKELGYAPEWSIEDSVKDYIRVVMGRD >gi|316924957|gb|ADCP01000009.1| GENE 17 20632 - 21441 610 269 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225570193|ref|ZP_03779218.1| ## NR: gi|225570193|ref|ZP_03779218.1| hypothetical protein CLOHYLEM_06289 [Clostridium hylemonae DSM 15053] # 3 268 5 272 279 136 34.0 1e-30 MAVLFGVCNWVHPLLQSPSCCKELKEAGIDAIQLDLGSAEEDFPLSSPGIQRQWKEEAEL YGIRLDAVAVLAVLKHGMTAEPGSERRNTAEKAIAAAVDCAADMGIPRIVLPSFKASAIR 
NKDDLRETARCLRLACALAKHRGIAVATENLLDVTMMKSLLNIVAYDNLCSCLDLSHFVL RGREDVFEALPEHLAVCCGVHLKDGMYGEPGVRPLGEGSCRAGELLAALKREGYDGPLFL ENRYTEPVFGPDALYALRRDAEYVRQAFL >gi|316924957|gb|ADCP01000009.1| GENE 18 21686 - 22609 967 307 aa, chain + ## HITS:1 COG:TP0662 KEGG:ns NR:ns ## COG: TP0662 COG0191 # Protein_GI_number: 15639649 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Treponema pallidum # 3 307 9 332 332 317 52.0 2e-86 MPLTGPRDMFAKALAGGFAVGAFNINNMETVQGILQAAGDERSPVILQVSNGARQYAGRG YLRKMAEAASEETDVPFVLHLDHGSGFDICKACIDDGFTSVMIDGSHLPFEENVVVTRRV VEYAHDRGVWVEAELGRLAGVEETVSSRESVYTDPEEAETFVARTGCDSLAIAIGTSHGP CKFSGEARLDFERLDAIAARLPGVPLVLHGASSVPQEAVAAVNAHGGNVAGASGVPEDLL RRAAASAICKINIDTDLRLAVTAAIRTLLAERPETFDPRAYLTAGREAVREMVRHKIRAV LGSSGQA >gi|316924957|gb|ADCP01000009.1| GENE 19 23062 - 23904 949 280 aa, chain + ## HITS:1 COG:MA2278 KEGG:ns NR:ns ## COG: MA2278 COG1741 # Protein_GI_number: 20091116 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Methanosarcina acetivorans str.C2A # 4 279 28 324 325 261 45.0 8e-70 MSYRIVKKIVTGRQAIDGAGVHLVRVLGSGTIRDFDPFLMLDAFDSLNPADYVRGFPLHP HRGIETFTYLVKGEIDHKDSLGNSGRIKDGCCQWMTAGGGILHQEMPKASPRMLGLQLWI NLPRDKKMVHPKYRDITPDDIPVIEEKGVTIRVVTGKYGNIGGATQGEYVDVQFLDVAMG PGGAWEMETIPGNTVFSYLMEGSCAYEPEGVIQPARRAVLFSEGDKIHLHAGPEGARMVI VSGKPLREPVAWGGPIVMNTDEELRNAFMELEEGTFIRHG >gi|316924957|gb|ADCP01000009.1| GENE 20 24180 - 25592 1525 470 aa, chain + ## HITS:1 COG:PAE2689 KEGG:ns NR:ns ## COG: PAE2689 COG2133 # Protein_GI_number: 18313519 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose/sorbosone dehydrogenases # Organism: Pyrobaculum aerophilum # 14 286 11 257 371 140 35.0 5e-33 MHHSRVMAGAMTLLLLSTPASWGAAPQKAKAVPTAEPFEGVVIAEGLAAPWDMVWGPDNH IWVSEREGSRIISVDPKTGEKKLVGSIPDVKVGPQHEGVLGIALDPDLGKSGSKNNVYIA HTYMDGGREHARIVRFHYDPQTRKIADPKVILSNMPAGDDHNGGRLRFGPDGKLYYSIGE QGHNQGANVCKPIDAQRIPTKAELDADDHSAYAGKVLRLNPDGGIPDDNPVINGVRSHVY 
TYGHRNPQGLVFVGDLLFEAEHGPSSDDEINLLVAGGNYGWPHVAGFRDDQGYVYGNWSA APDCEKLAKSYFPTDIPASVPQQKETEWKAEDFREPIKTFYTVPNSYNFRDGRCGDMAYL CWPTIAPSSVAYYPANGPIPGWGNSLLVTSLKNGALYRVPLNADGKTAQGDVAKYFHTPN RYRSALVSPDGKTIYVATDVRGNALGNDGKPVNDMKNPGSIIAFIYDAKQ >gi|316924957|gb|ADCP01000009.1| GENE 21 26065 - 29262 3721 1065 aa, chain - ## HITS:1 COG:mlr3285 KEGG:ns NR:ns ## COG: mlr3285 COG0841 # Protein_GI_number: 13472859 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Mesorhizobium loti # 2 1062 3 1054 1067 746 39.0 0 MFTRLFVQNPIFTNVIAIIAVILGIVSLTRLPIALYPDITPPTIAVSASYPGADAETVAK TVGAPIELQVNGVENMIYMSSTSADDGTYSLTVTFALGTDVNIAMVLLQNLVNNALPLLP QVVQQQGVTVTKHSPSILQVIALDSPGGSYDNLFLANYANLQIENVLSRVKGVGQVQAFG AGFYAMRIWLDLDKLNYLGVSVDEVNQAISQQNVQVAAGNVGGRPNSDKQAMQLTVLMQG QLDTPTAFGDIIVKTLPGNQIIRLRDVGEIEMGAQSYGAQSQIRGDNAALIAIYQLPDAN ALETAANIRSAMEKLSKDFPPDMRWTMPFDTSHFVSIAVTQVYLTFVKSIVLVMLVVLVF LQNFRAAVGPAVVIPVTLMGAFVFLNIAGFSVNMISLFALVLAIGLIVDDSIIVVEATMS HLERGLTPQEAALAGMRDLFLSIVAVMFVFASVFLPAATLPGITGQLYSQFALVIGGAAL LSAVFAISLTPTECSLLLRQQKAVRPASFDAPQGERPDFAEDKADVTTGNVFCRVFNRGY NALHGFAMRIVGITLRSPAIALLTYGVLAAFAVWGLITLPQGFLPEEDQGYVLVSAQLPD AAASPRTSELAGRMDEVFANTPGVETWVTISGFSMMEGAALPNAVTAFVIFESWAKRGTK LNQYAIMDELYRGFGQIPEGEFMVIPPPPIMGIGNAGGFDMMIEDRASRGPQALEAAVRA YEEAAAKDPRLEHVMSLYSASTPKLYLNVNRTQAMTEGIALPQLFTAIQTAFGGQYINTF SKYNQNYQVRVQAAERYRSTADNLLSLRVPNGKGEEVPLAGFASVQQVSGPSIVTRYNLY SSASMQGSAAPGVSTGAGMKAMEEIAGRVLPQGFGYEWTGMSFQEEKSHGQAAIAVGLAI LLVYLILAALYANWILPLGVLLVIPLALFGTVAAIMLRGMDNNIYVQIGMLLLIAMSSKN ALILVLYARRYGEQGMTAMQAAHKAASVRFRPILMSSLAYAVGALPMLWATGAGAINQQY LGTAVFGGMVATALLTVLYAPFFYAVFMGLSSKLGGKRTPEPKQA >gi|316924957|gb|ADCP01000009.1| GENE 22 29314 - 30468 1141 384 aa, chain - ## HITS:1 COG:mlr3284 KEGG:ns NR:ns ## COG: mlr3284 COG0845 # Protein_GI_number: 13472858 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Mesorhizobium loti # 54 379 25 358 376 166 31.0 1e-40 
MHSCDSQTSLRQNPARVPAFLLPGLPGIVVAFCLVLAGCNDTPHKAETDDPLPVAAVKPE RRNVPLLVESTGMTQAVRTADIVARVEGTLQKIAYEDGALVQEGDTLFVIDPTMYKAKRQ QAEATLAAQKAVRNRAAVEYARNQKLYAERAASQATVVSWREQLSNAEAQVAAAQAALTQ AGIELGYTVIKAPFTGRVSRHKVDVGNVISPQTGTLASIVSQDPVYASFTAPADSILPLL NSFDDAKSIPLELAVGDGPYSIKGKLDYISPQVDASSGTLALRGVFPNTDGKLLPGLFVR VQVPTERENEVMVIPRQAVLENQGKSFVYVIAQDSRARMVPVATGPAVGSDRIAVISGIT PESLVIYDGMAHIRQGMAVTNTAQ >gi|316924957|gb|ADCP01000009.1| GENE 23 30470 - 32254 1423 594 aa, chain - ## HITS:1 COG:PA2837 KEGG:ns NR:ns ## COG: PA2837 COG1538 # Protein_GI_number: 15598033 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 41 552 6 476 479 194 29.0 4e-49 MPNNVFSTALARNPYPFDSRSSRVRSHWLRAGIAKPLIVALCLIGVPGCSTWSVGPDFER PDMQNMPEKWDAGDPQHPLVGNVSLDTWWQSFNDPALNQLVTRARLSSPTIEAALLKIVQ YRAQYAIAIGSFFPQTQELTGTYSSEHPSRRSADAPQPGEPSQPGTIQQLNLGFQATWEI DIWGKYRRNAEAAKAALQGAHAGYELALVTLSADVAQQYFSYRMTEKQLEIARRNSLAQK ESVRLTEAMFRLGASSGRDYDQAVAMQKNTEADIPQLEAQLITNRNALCVLLGIPPGPIP ELAPGWVPQPPQTASGAALASSVSTAGKTLPQGATPYAAERTAQSFIPLDIAIGVPADLL RRRPDVRQAEQEAAAQCARIGINKAALFPSFSISGFLGFQGSDVGAFTLGDTFSHNAFTA SASPSLVLPFLNYGRLYNAVRAQDAVFQQSLVTYKQTVLQALQEAENAMSSFIRSRQRLT LLQQAAEAAGRSTKLALEQYNAGSTDFTTVVSAQTTQYQQENSVAAAAGNTALQAVALFR ALGGGWQPQPLPLPKATIDAMKKRTDWGGMLRDMESVPVAGQSEIPQAETHRGY >gi|316924957|gb|ADCP01000009.1| GENE 24 32536 - 33432 976 298 aa, chain - ## HITS:1 COG:BMEII1008 KEGG:ns NR:ns ## COG: BMEII1008 COG0697 # Protein_GI_number: 17989353 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Brucella melitensis # 9 298 8 297 297 62 23.0 1e-09 MSGFGVFLAIATAFSWALGCVVHTTASRVVGIPAVMLIRQPLASVVLGIFCLVMGDLWLP SLHLFWLAAASGVFGIIIGDACQYAGALTIGLRPAQVCQSLSSCFTALFGSIYLNEYIGL 
QGWLGMFVATCGVILVVLSEQRDAHNPPVSSKQRNKGVIISLLAAVFVALGIVFSKQALQ EGIGALPLAFYRNAISTAGIWAVGLAFGAIRPAFGNLKANPGILKLFPLGCLFGPAGGIW LSCVALDYLPAAVAAMLIGLQPIALLIITGIMERRIPAPNSIIGSIIACTGAAILLLR >gi|316924957|gb|ADCP01000009.1| GENE 25 33724 - 34077 132 117 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTPQQPVWASLVVLFDRDRRLLFEHPKAVGKAFSEEYVERELYPRAFRHTEGKSNAKKR GSLPKGIAVLAVLFGGIHGGVGALKEVDDGAPVVGVDGNADAAREREVLAGADIGLV >gi|316924957|gb|ADCP01000009.1| GENE 26 33907 - 35619 1565 570 aa, chain - ## HITS:1 COG:CC0091_2 KEGG:ns NR:ns ## COG: CC0091_2 COG2199 # Protein_GI_number: 16124346 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Caulobacter vibrioides # 392 568 67 245 254 121 39.0 4e-27 MSNSPISSSRIDSKDESWLLLLSDPSAEKAKMRSKLVFFSCCTLMTMIPAIAALVLYPLG QVFAALISLILAGAGLAGGFCLYVNRMLRPTHHIATAIERFRRGDSAVRCPEKAPGLLGL LGERVNLFLSQIQEGRKELMRVAPHIDIIAHNLPGGLLCCAEDFELRFVSDGMLRLLRCS REEITGRYGNNWKNIVHPDDFAFTVAHLNEQRKIQPEYRLSYRIQRADGSSCWVMESSRL TVNADGVPEYVCIIIDDTEQKIIENRRIAVERQYRRALQSVCEMLIQVDLTNDLVLSEYV RWGASSRFPRGENYSQCCVEFMQECVHPDDRGTFSAAFCRKNLHLQNMDEDAPVTYLEYR IKEPSGNYHWVAGTLVPLEDEATHTITAISYVVDIDERRRREDAVIDKSRRDGLTGLLNR NAMFELIASTLEGLGHKGRHSLFMIDLDNFKAINDNFGHVFGDAVLREVADGIRSVFRPT DALARIGGDEFLVFLENVDKPDVLARKAGEVVSTLHKSYIGPSQNFPLSCSVGISVYPDD GRTVIDLFKSADAAMYASKQHGKDRYSFRE >gi|316924957|gb|ADCP01000009.1| GENE 27 35468 - 35747 114 93 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGIMVIKVQQLKKTSFDRILAFSAEGSDNRRSQDSSLLSILLEEMGLLDMVYILKRVWG KGKWLSPEGGPCPDQFWRSPELSLSLSLSLSLS Prediction of potential genes in microbial genomes Time: Fri May 13 01:44:39 2011 Seq name: gi|316924893|gb|ADCP01000010.1| Bilophila wadsworthia 3_1_6 cont1.10, whole genome shotgun sequence Length of sequence - 80081 bp Number of predicted genes - 69, with homology - 58 Number of transcription units - 39, operones - 14 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 3115 3053 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain - Prom 3148 - 3207 4.4 2 2 Tu 1 . + CDS 3531 - 3944 274 ## gi|302862244|gb|EFL85178.1| toxin-antitoxin system, antitoxin component, Xre family + Term 4028 - 4070 1.3 - Term 4100 - 4142 0.0 3 3 Tu 1 . - CDS 4242 - 5726 1527 ## COG2199 FOG: GGDEF domain - Prom 5939 - 5998 5.6 - Term 5975 - 6000 -0.5 4 4 Op 1 40/0.000 - CDS 6063 - 7466 862 ## COG0642 Signal transduction histidine kinase 5 4 Op 2 . - CDS 7463 - 8173 780 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 8247 - 8306 1.7 - Term 8271 - 8302 -0.8 6 5 Tu 1 . - CDS 8431 - 8706 320 ## - Term 8900 - 8947 7.2 7 6 Tu 1 . - CDS 8970 - 10229 1437 ## COG1145 Ferredoxin 8 7 Op 1 . + CDS 10221 - 10499 57 ## 9 7 Op 2 . + CDS 10561 - 10899 200 ## + Term 10934 - 10967 -0.6 10 7 Op 3 . + CDS 11056 - 11313 355 ## Dde_1149 hypothetical protein + Term 11511 - 11553 10.2 - Term 11491 - 11549 19.6 11 8 Op 1 8/0.000 - CDS 11742 - 12710 1236 ## COG2358 TRAP-type uncharacterized transport system, periplasmic component 12 8 Op 2 . - CDS 12921 - 15047 2482 ## COG4666 TRAP-type uncharacterized transport system, fused permease components - Term 15241 - 15283 9.1 13 9 Tu 1 . - CDS 15307 - 15408 103 ## - Prom 15518 - 15577 2.9 + Prom 15056 - 15115 2.7 14 10 Tu 1 . + CDS 15355 - 15525 67 ## - Term 16215 - 16257 8.6 15 11 Op 1 . - CDS 16277 - 19798 4592 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 19819 - 19878 2.9 16 11 Op 2 . 
- CDS 19883 - 21835 1549 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 21874 - 21933 3.6 - Term 21919 - 21962 12.0 17 12 Op 1 9/0.000 - CDS 21984 - 22994 315 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 18 12 Op 2 11/0.000 - CDS 23043 - 24314 757 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 19 12 Op 3 . - CDS 24339 - 24851 179 ## PROTEIN SUPPORTED gi|239995925|ref|ZP_04716449.1| ribosomal protein S3 20 12 Op 4 . - CDS 24912 - 25580 637 ## COG1878 Predicted metal-dependent hydrolase - Term 25614 - 25667 18.1 21 13 Tu 1 . - CDS 25676 - 26716 1519 ## COG2017 Galactose mutarotase and related enzymes - Prom 26931 - 26990 7.2 22 14 Tu 1 . + CDS 26522 - 26938 190 ## + Term 27100 - 27139 2.0 23 15 Op 1 2/0.000 + CDS 27691 - 27966 227 ## COG2721 Altronate dehydratase 24 15 Op 2 . + CDS 27978 - 29135 1574 ## COG2721 Altronate dehydratase 25 15 Op 3 . + CDS 29149 - 30048 904 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases 26 15 Op 4 . + CDS 30062 - 30916 1356 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases 27 16 Op 1 . + CDS 31023 - 31778 1101 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity + Term 31790 - 31824 6.5 28 16 Op 2 . + CDS 31874 - 33334 1775 ## COG2368 Aromatic ring hydroxylase 29 16 Op 3 10/0.000 + CDS 33402 - 34163 820 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 30 16 Op 4 7/0.000 + CDS 34201 - 35316 1058 ## COG1960 Acyl-CoA dehydrogenases 31 16 Op 5 20/0.000 + CDS 35367 - 36209 1200 ## COG1250 3-hydroxyacyl-CoA dehydrogenase 32 16 Op 6 . + CDS 36295 - 37473 1202 ## COG0183 Acetyl-CoA acetyltransferase + Prom 37478 - 37537 1.9 33 16 Op 7 . + CDS 37564 - 37779 146 ## 34 17 Tu 1 . + CDS 38015 - 38176 105 ## 35 18 Tu 1 . - CDS 38304 - 39707 1545 ## COG2199 FOG: GGDEF domain - Prom 39791 - 39850 3.4 36 19 Tu 1 . 
- CDS 39909 - 46994 7933 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain - Prom 47111 - 47170 4.9 + Prom 47597 - 47656 9.0 37 20 Op 1 . + CDS 47821 - 49338 2114 ## COG3119 Arylsulfatase A and related enzymes + Term 49452 - 49499 -0.9 38 20 Op 2 . + CDS 49533 - 50936 1830 ## COG0471 Di- and tricarboxylate transporters 39 20 Op 3 . + CDS 50959 - 51657 764 ## Desal_0411 transcriptional regulator, Crp/Fnr family 40 21 Tu 1 . + CDS 51816 - 52124 475 ## Dde_1202 thioredoxin, putative + Term 52134 - 52202 6.8 + Prom 52170 - 52229 3.9 41 22 Tu 1 . + CDS 52472 - 53395 371 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 + Term 53592 - 53636 13.1 42 23 Tu 1 . - CDS 53695 - 54294 700 ## Dde_1959 hypothetical protein - Term 54331 - 54366 0.7 43 24 Op 1 4/0.000 - CDS 54376 - 55509 1429 ## COG1775 Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB 44 24 Op 2 . - CDS 55545 - 56309 718 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 45 25 Tu 1 . + CDS 56730 - 56963 86 ## - Term 56805 - 56836 3.4 46 26 Tu 1 . - CDS 56947 - 57303 70 ## - Prom 57440 - 57499 3.7 47 27 Tu 1 . + CDS 57282 - 57965 410 ## COG2323 Predicted membrane protein 48 28 Op 1 44/0.000 - CDS 58156 - 59052 644 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 49 28 Op 2 5/0.000 - CDS 59049 - 60065 989 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component - Term 60077 - 60125 4.6 50 28 Op 3 5/0.000 - CDS 60135 - 61706 2300 ## COG0747 ABC-type dipeptide transport system, periplasmic component 51 28 Op 4 49/0.000 - CDS 61764 - 62669 931 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 52 28 Op 5 . - CDS 62669 - 63658 1123 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Prom 63730 - 63789 5.2 + Prom 63955 - 64014 4.7 53 29 Tu 1 . 
+ CDS 64234 - 64803 391 ## COG0655 Multimeric flavodoxin WrbA + Term 64851 - 64902 17.0 54 30 Tu 1 . - CDS 64991 - 65473 174 ## RSP_2020 DHC, diheme cytochrome c 55 31 Tu 1 . - CDS 65635 - 66333 989 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump - Prom 66370 - 66429 6.5 56 32 Op 1 . + CDS 66655 - 67413 586 ## Hore_08770 transcriptional regulator, GntR family 57 32 Op 2 . + CDS 67479 - 68456 1020 ## COG3181 Uncharacterized protein conserved in bacteria + Term 68478 - 68518 8.8 58 33 Op 1 . + CDS 68710 - 69411 603 ## COG0546 Predicted phosphatases 59 33 Op 2 . + CDS 69408 - 70169 678 ## COG1349 Transcriptional regulators of sugar metabolism + Term 70248 - 70313 12.3 + Prom 70183 - 70242 9.2 60 34 Op 1 . + CDS 70463 - 71485 1434 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component 61 34 Op 2 . + CDS 71541 - 72047 765 ## RSKD131_4191 tripartite ATP-independent periplasmic transporter, DctQ component 62 34 Op 3 1/0.000 + CDS 72049 - 73335 696 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 63 34 Op 4 . + CDS 73355 - 75277 2600 ## COG0524 Sugar kinases, ribokinase family + Term 75331 - 75381 0.9 64 35 Op 1 3/0.000 + CDS 75524 - 76558 1189 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 65 35 Op 2 . + CDS 76566 - 77363 761 ## COG1082 Sugar phosphate isomerases/epimerases + Term 77376 - 77419 -0.6 66 36 Tu 1 . + CDS 77467 - 78225 211 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 67 37 Tu 1 . + CDS 78343 - 79074 674 ## COG0274 Deoxyribose-phosphate aldolase + Term 79134 - 79162 1.0 + Prom 79179 - 79238 5.7 68 38 Tu 1 . + CDS 79387 - 79563 167 ## - Term 79494 - 79519 -0.5 69 39 Tu 1 . 
- CDS 79686 - 80057 176 ## bglu_1g32480 integrase, catalytic region Predicted protein(s) >gi|316924893|gb|ADCP01000010.1| GENE 1 1 - 3115 3053 1038 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 565 992 328 757 776 223 32.0 1e-57 MTRQVQNPLTCASPSDEVSEVAASSQRRDTALQRTASSLNLYKIIGDSSSEIVWEQASAP NADCIFSGDPRFACANKCLTVEEAYETVHPDDQPLLRSDIDSLHSGEQDSFSRRIRALGC DGSWIPVDVCARSLRDNNRTPLYLIGGFKGISNLPPVSSAIASSRDPLISDIIWERFVSG GGFAISDGVERFLGHPRHHFRSVEDLVCLLHPSDLAIVHERMKAVSSLTGCDCLSVRMRH NDGTWRQVEARVVVLRGRAGEDRIIGALKDITPKAVRDLPSLQPAQISSLTGLYTLEAGQ AIIEKNLLTGLNLQALILVDFRNLDEIEQRHGNRWKNLFLQHAGAILRNLARENDVPIHC RAGSFILFADQYESFREVTDLTRNLQRTLGAVCAFDGESDSVEFSIGIAVAPYDALTFSG LVDRAAQACSQRSGVNFYDKQAADRFRLDSELLEEERRREKEQMREALQTIMDNVDAYLF VVHPFRHEVLFANRQIRSIRPECVPGSVCYRSFFRFDKPCVDCAILKGRDCLIEKDGQQI YLQLRHKKIRWLNDEMVYLVSGTDITERMLHARKLEHMAYHDSLLDIPNRQAALRNLQHM LREGKRCAVVLFDISDFKLFNETFGHAKGDLLLKDVSLSIARFVPEGSLYRSGGDDFLVL LSGADGPQAERLAQSVRNSFLRVLTIEDLEYTCNIDAGISISPQHGTNPSTLITHAELAL AEARKEGRGIYQFNKELDQILSRKKLLQILIRSALANGDFEVHFQPVFEISTGLFRKAEA LLRLRDAAGSFISPAEFIPVAEETGLIVDVGYLVLDTVCRQLLSMASVAGLPFQIAVNIS AIQLLQTNFVSRVVEIIQYHGINPQQLEFEITESVLINSFEQVKGVMQQLRDKGIRFALD DFGTGYSSLSYLTHLPISTLKLDRSFIRNLETSESNRQVCKAVIDIARHFNMAIVAEGVE NAEQSKIISELGATSIQGFYYARPLPGGQLVPWLSEHEDHDIPTPEQDRPRERERERERE RERERERERERERERERE >gi|316924893|gb|ADCP01000010.1| GENE 2 3531 - 3944 274 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302862244|gb|EFL85178.1| ## NR: gi|302862244|gb|EFL85178.1| toxin-antitoxin system, antitoxin component, Xre family [Desulfovibrio sp. 
3_1_syn3] # 3 137 2 150 159 85 35.0 9e-16 MESLQERLKAVRGAMTQSEFATRLRTPLTTLGRYERGANMPDLAFIINVCTIFDVSPEWL LFGRGVMREQSKEYPSSGCGCRQCPLVLKELDEERTERRELSVENRQLYAENRKLLQENG RLREELARLRPFSLGAG >gi|316924893|gb|ADCP01000010.1| GENE 3 4242 - 5726 1527 494 aa, chain - ## HITS:1 COG:VC0658_2 KEGG:ns NR:ns ## COG: VC0658_2 COG2199 # Protein_GI_number: 15640678 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Vibrio cholerae # 331 484 61 216 219 104 39.0 3e-22 MLFSAIFVVSLIGTLFTTYLFMKSTAVESTRFLADDLTSQLNTVYSLLEGMSEQPLIKDT SISVQERAASMKTYADAFGFWMIGVVDPDGTISSTLRPKTGKVQRGYIPRIMASGKREIS DPFPAGATGDMTYSQFMPIKKDGKVVSICFVTTPLSLLSERVTRRDHTGNGYYLMLSSTG SLIAHPNPTMSLTHIRDLVAGERFLYGSSRETFLRDIKSDQPGTFISIFRNTLYFTAFMN IPETGWTLIHRVKILPTVKYMLLAFFTQTLLYTLLFVVLYRYGRTAISQELRPVDNILRQ VEELNRSVNAADSITEEDVFNIINISRKGLYDALTGLPTRNLFRQRVNDTIEKMPGRLFG VFILDMDNLKAINDHLGHEAGDKAIRSFASTLKEFTERHFGLACRYGGDEFVLFIPLDTP SDASALARELLETQHGSIVGNDVVYAYGTSIGVSFYPLSGTFDEALHMADTALYSSKRSG KGTFTCLPSHAECL >gi|316924893|gb|ADCP01000010.1| GENE 4 6063 - 7466 862 467 aa, chain - ## HITS:1 COG:SMb20218 KEGG:ns NR:ns ## COG: SMb20218 COG0642 # Protein_GI_number: 16263959 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Sinorhizobium meliloti # 173 463 148 437 454 167 37.0 4e-41 MNLDRLLPKTSFGRFFLINVISLALFWLLIQPMANYWRNQALFRMVASNESRLFLTHIRL LDKLPTIAERNGLFNPNDDVFSIRVSHLPPDMPRDGSNCSQSLRRRLERAFKQENIRYTD LLTRIVVSHAPITVMGSEEKYSDLETFLTHHYMQAEIALRRPDGIWIYAKHQIEIMPQNR LWLNTAALAIEFLLVMSGVALALNWLLRPLRKLVAATERFGRTQEITPLRTDSGPVEVRE AASAFNRMFSSIQRSFEERERFLTSFSHDLRTPLTRLRLRLEQVAQDDLREKLCADVDDL TDTLNRTITFLRSARSIEDVRRPIAVMPLLEALVEDRQGIGEQVTLNGNTAAVLLSWNRL RSAFENVVDNALRYGDCADIEVHHERDSEGREWLYIDFRDNGPGIPEEKLEFVLEPYVRL ETSRNRKTGGHGLGLSIVRNLVEASGGRVLLMNRPEGGLLVRMLFPL >gi|316924893|gb|ADCP01000010.1| GENE 5 7463 - 8173 780 236 aa, chain - ## HITS:1 COG:BMEII0791 KEGG:ns NR:ns ## COG: BMEII0791 COG0745 # Protein_GI_number: 17989136 # Func_class: T Signal transduction 
mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Brucella melitensis # 1 233 1 236 238 187 45.0 1e-47 MPDQTRLLVVDDDADIRELLSAYLARYDMTVETAADGAGLFEKLDHAHFDLIILDIMLPG EDGLSLCRRLRGTSAIPVIFLTSLDSSTDRVVGLELGADDYVVKPFDPRELLARIRTVLR RASDKGQAAAQSDIRRFAGWTLNVRSRELSQGDERIALSDAMYRLLLVFLDEPFAVLSRE YLLRQTQGRDADVFDRSVDIQVSRLRGVLKDTGTSKIIKTVRGGGYMLSVDVEHDA >gi|316924893|gb|ADCP01000010.1| GENE 6 8431 - 8706 320 91 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLKHTSLIVAGALVLGMGISTASFAEMPSLDSQSYTWPGSTINTSSDTDKIAPDPKTGKR PGILDWEGMGADVSRHPDKPGHDPKNCGICQ >gi|316924893|gb|ADCP01000010.1| GENE 7 8970 - 10229 1437 419 aa, chain - ## HITS:1 COG:AF1263 KEGG:ns NR:ns ## COG: AF1263 COG1145 # Protein_GI_number: 11498862 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Archaeoglobus fulgidus # 1 346 1 343 369 152 27.0 2e-36 MAHHYSNSLYDRLADRLNQFPQGAPLSDSLFAILKILFSEKEAGLVAQLPIKPFTVDRAA AIWKLDKATAQNMLEDLAGRAMLLDMEQNGQQMFCLPPPMAGFFEFSMMRVRGDVNQKAL AELYYQYLNVEEDFVKALFVENGTPLGRIFVQEAALPQDGKLEVLDYERASHIIKTASHI GVSMCYCRHKMQHLGRACDAPMDICMTFNTSARSLIKYGHARQVDAAECLDLLAVAQGHN LVQFGENAREGVNFICNCCGCCCEALLAIKRFAIARTIHSNFIARAGNDCRGCGKCEKVC PVNAIHMEDGPAGKRFAFVDPERCIGCGVCVRSCAFGQLTLEARPERAITPLNTVNRVVA MAAERGMIKELLEDNDAMGNHRIMAAVIGAIMKLPGAPRALAVAQLKSRYLENLIGKMG >gi|316924893|gb|ADCP01000010.1| GENE 8 10221 - 10499 57 92 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSHIAPRIMSCVKTPIPCSRKIDFGSRVLLGGFSVFQHQFWAVTQCCVLGRNIRICPLRR LLSQKRLARLRPGMSRLMPQSSRTYPFLSTLS >gi|316924893|gb|ADCP01000010.1| GENE 9 10561 - 10899 200 112 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLLTGALCAVQPNKEFTVFVKWLPALAVLGVLYAYPAHAAEPLEDASEALENLDDLVKSV DSVGHSFGRIVQPRQYGFRWNGGEIGRDSNPSRMPRNSRSQDDRRDDDRDDD >gi|316924893|gb|ADCP01000010.1| GENE 10 11056 - 11313 355 85 aa, chain + ## HITS:1 COG:no KEGG:Dde_1149 NR:ns ## KEGG: Dde_1149 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: 
not_defined # 8 84 8 84 106 89 57.0 4e-17 MAIAPLMGVCSSALQYDITDDGRVTGIRFAGGCPGNLEAVARLAEGLPAEAVISRLEGIQ CGSKPTSCPDQLARALKRELAARGA >gi|316924893|gb|ADCP01000010.1| GENE 11 11742 - 12710 1236 322 aa, chain - ## HITS:1 COG:PA5545 KEGG:ns NR:ns ## COG: PA5545 COG2358 # Protein_GI_number: 15600738 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, periplasmic component # Organism: Pseudomonas aeruginosa # 31 321 29 317 319 185 36.0 9e-47 MKKWSMFLAAAACACVLASPLQAAEHKKVNISFPTAATTGALYPLGAGIANLWNTKLGYV NARVQASNGGIQNLNLLKAGNAQVSFAVSSITYEALNGERGFKDRAYKDVRVLAGLYYNP NQVVARADSGVASLADFKGKSFAPGAAGGTTEVESRIHFTETGLKYPDDIKAHFVGFTES IDLMRNKQLDGVWIMAGMPTAAVTEMCSTASGKLVGMDDELIAKVQAKYPWYSKFTIPAG TYDGQTEPVQTTAVKMLLLTDASMPDDVVYDLAKTFWENLDSLGKAHAVMKTVTKEMAVS DLSGIPLHPGAEKYYREIGLLK >gi|316924893|gb|ADCP01000010.1| GENE 12 12921 - 15047 2482 708 aa, chain - ## HITS:1 COG:BH2945 KEGG:ns NR:ns ## COG: BH2945 COG4666 # Protein_GI_number: 15615507 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, fused permease components # Organism: Bacillus halodurans # 25 707 17 645 656 408 40.0 1e-113 MNGNDAKEQAANTAHVHAVEQAAAQALVEEKESESRFRHYGKGPWAAAISALAVVFSLFQ MYASTFSAFDAMTLRSWHIIFLLVLSFLMYPAWKGERRSRTRPTLFDALCIAAGLFSFGY LILNYTEITLRGGYFLPVDYFVASVGVIICFEMARRVVGSLAALAGVFFLYNFAGEWIPG AFGHTGFSWDRVVEHMFWGSQGLLGVGVGVSATYIFLFVLFGAFLKYSGFSDFINDLALT LVGRTAGGPAKVSVVASALMGMINGSALANVATTGAITIPLMKKTGYKPEFAAAVEAVAS TGGQFAPPIMGAVGFIMAEFLGVPYTKVMLAAAIPAFLYYLTLLMAVHFEARKLGLKGLS PEHIPAAGKVLRERGHLFIPLIVLLWLMFDGYTPLFAAAASIFATVGATWLPSLIGLLRT KTARTFAFVLLLAVLGGLALSGLLSLGAAILTALCVLASAWCSARYRMTPRVVVQALDEG ARGAVGVGAACVIIGIIIGTVSLTGLGLTFGYEVLKYVGEGQLYLGGLFVMIMSTILGMG VPGVAAYVIVAAVAVPVLTGVGVMPMAAHMFCLFYACLSNITPPVAMSSYVAAGIAHSDQ TRTSLIAVKLGLTGFILPFFFLNNPLLLYSSANPALATLWAFATACLGVSALAAGLQGWL FGPCNSVMRGLLLVAAFLAIDPGLKTDLAALALCAVAGVWSWKERGKG >gi|316924893|gb|ADCP01000010.1| GENE 13 15307 - 15408 103 33 aa, chain - ## HITS:0 COG:no KEGG:no 
NR:no MIISATLHTLFSADAEGHFGFYYFMRKSGSLRA >gi|316924893|gb|ADCP01000010.1| GENE 14 15355 - 15525 67 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPFGIGGKEGVQRGGNDHEWLLYLRVRYARAVWVVNQFFGGEGHMPEDEPVCFTKM >gi|316924893|gb|ADCP01000010.1| GENE 15 16277 - 19798 4592 1173 aa, chain - ## HITS:1 COG:FN1170_1 KEGG:ns NR:ns ## COG: FN1170_1 COG0674 # Protein_GI_number: 19704505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Fusobacterium nucleatum # 4 405 3 405 410 559 65.0 1e-158 MSNRRTQTMDGNTAAAYISYPFTEVAAIFPITPSSPMAELTDAWAAQGKKNLFGAPVRVV EMQSEGGAAGALHGSLQGGALTTTYTASQGLLLMIPNMYKIAGELLPAVFHVSARSLASN SLSIFGDHQDVMATRQTGFAMLASSSVQDVMNLGAVAHLSAIEARLPFIHFFDGFRTSHE IQKIELLEQEELADMLDWEAVEAFRHRALNPDHPQLRGMTHNPDVFFQLREAVNPFVDAL PGIVRKYMDKINAITGKDYKFFNYYGHPEADRLIVVMGSGASVVEETIDELMRRGEKVGM VNVRLYRPFLPEKLLEAVPASVKRIAVLDRTKEPGACDPLCLDVRNAYASKADAPAIYAG RYGLASKDFTPAQVLAVFDNLAVEQPRDNFTVGIIDDVTHTSLEVKDFHVVAEGQTACKF WGFGSDGTVGANKLAVKIIGDHTDMYSQAYFAYDSRKSGGVTISHLRFGSQPIRSSYLID YADFIACHNQSYVDKYDLLRGIKEGGTFLLNCMWKDGDLEKHLPASLRRTIARKKIRFYT LNAVDIAMSLGLGGRINMLMQAAFFKLAGIIPLNDAVAYLNEAIVKTYGRKGHAVVDMNQ EAVRQGIAMLHEVAVPAAWADATEDAPAACDERPDFVRNIVDVMNRQEGDELPVSAFLGY EDGTWPVGITAFEKRGAAIMVPVWDAEKCVQCNKCAFACPHAAIRPRLLSAEEAVAAPAS IEHRPCRIQKGFEFHMAISVLDCTGCGVCVGQCPAKEKALTMIKLENVMPEGARKWDYAE KNITYKKPTQGKLNVANSQYLRPLNEFSGACAGCGETPYAKLVTQLFGDRMMLSNSAGCS TVWAAGAPSVSYTTDENGHGPAWGFSLFEDNAEYGFGMFLGVAQIRATLALKMKALLEGG DISAALRSIMEEWLEGMDLGEGTRERAAALTAALDEALAEKDDAELKALREKSDFFVKRS HWIFGGDGWAYDIGYGGLDHVLASGEDINVLVFDTEVYSNTGGQSSKATPTGAVAQFAAN GKMSRKKDLGLMAMSYGNVYVAQIAMGASQEQTLKAISEAEAYPGPSLIIAYSPCLNHGI KGGLGNSQMQAKRAVEAGYWSLYHYDPRRIGQGENPFVLDSKEPSNGFREFLLSEVRYSS LKRQFPERAEMLYEKAEADAKARRSNYTRLANK >gi|316924893|gb|ADCP01000010.1| GENE 16 19883 - 21835 1549 650 aa, chain - ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal 
transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 49 640 46 626 627 339 34.0 1e-92 MPKHPFICAIVSDNSAAEQIELCQHSISSPVRIITTESSKCIVATKRVIEEGAAVIVSRG SVWSTIKKAFPLVPTLQTPITCCDAIEFLTKAKQYDTNIGVVSFPSHIAKMVTVAPHLGV SLTVHQVNNPDDIEKGCYEMRDKGMKVLVGGGHAVQWAQKLGLHGVLHTVSADGVIQVLN EADRILNAIISERSKDARVRTMLNALKDGAISLNEQGKILEYNLPAQKMFANGESTMKSI RTFLQDTGIIEAVQQQLTWSGESKKYEKKQYLCNIIPAYSNDIYCGASVIIQDASHIQSL EHKMRRELHAKGHVARYTLKDVVGHSAEMRSLVEHAELYANSPSSIFIYGESGTGKEIFA QGIHMASPFRNGPFVGINCTALPESLLESELFGYAEGAFTGAKKGGKVGLFEMAHNGTLF LDEIGEIPTSVQAKLLRVLEEKIVMRIGQERYIPINVRIVCATNKDLAELVRCGKFREDL FYRLNVLRLNLPPLRKHPEDIEELADSFLSWLPQSLSLPRPQISPEALAVLKLYHFPGNI RELRNVIERLVVIAKGREIRESDISRVLLMNEAPCGNSHKKAVTVPPSPSAPADANRRTK QSMLRAQINSTILQVLEETQGNKAETARRLGISPATLWRKLKEIAAQEKK >gi|316924893|gb|ADCP01000010.1| GENE 17 21984 - 22994 315 336 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. 
PR1] # 29 318 29 315 328 125 27 5e-28 MKLFKAMLCSVAALSLLLPAAHAQAAPTVVKIALTQPTGHPTTDLIRGQFKKAIEEKTNG RFRIDVFDNYSFGNFEAVVQGLQFGMLQFAQESPSNLSIYDPKLMIFDLPYLMPDYDAVN ILLDGKVGKELSRSLEKIGIKGMGWIALGTRFYWMNKPFHTMAEAKSMKIRATASKVHIS MTKAFGMNPTPMAWSETFTALQQGTVAGVDVDIHSAVSANLHEVAPHLVLSGHFYTPHLF MASAKFMDSLSPEDRKIFEETFAEVQKIQRSAIHTGEQKLIEKLRAQGVDVIELSPEARA EFMNAGKAIWKDFDKQVPPSLIQQVLDGYKAAGKNY >gi|316924893|gb|ADCP01000010.1| GENE 18 23043 - 24314 757 423 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 11 419 11 422 435 296 38 3e-79 MDPVLLTTFGTLIVLLIFTVPIGVSIGTAVFMGMMVGGLPPVFLAQKMYSALDSFPLLAV PFFIVAGDIMQKGTIANSLLAVSRCLVGHIKGGMAHISILTSLFYGALSGSAAATVAAVG GIMIPAMEKEGYPKEFATAVNSTSGCLGIMIPPSIPLILFGSTAGVSISDLFVATIVPGI LMGCALMLVSYVICVRRKYGKTVARAKFAEMMKALYEAKWAIMVPVIVLGGIYGGITTPT EAGAIAVVYALFVEVFITRSMTRKLFFEIIKSSVRINAAIFLVVASATALGQIMMYYNLS AAVLDTIMGISTSKVVIMALILVVLLILGTFLEAAAIIMIVTPLLLPVINAIGVDPVHFG IIMLTSFAVGGQTPPVGMTLFVGASIAKISIERLSIASIPYILTLVVMLVIIAFVPQLSL CLI >gi|316924893|gb|ADCP01000010.1| GENE 19 24339 - 24851 179 170 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239995925|ref|ZP_04716449.1| ribosomal protein S3 [Alteromonas macleodii ATCC 27126] # 30 160 16 147 161 73 31 3e-12 MTEGIPTGGSMKPDSPFRKAINAIEVWVSVFAMLLLILVVSIQVFCRYVLQDSLAWTEEL ARYLVIWSVLFGCSFAMRTDSHLELSILSNFSGPRMKKYVKCFSCIVCLAFSIIMVYAGM ESVINIHWSEQLTPAMQLPAWIIWMAMPAGFGLMGIQAILRCVDEIAKNV >gi|316924893|gb|ADCP01000010.1| GENE 20 24912 - 25580 637 222 aa, chain - ## HITS:1 COG:PAE0036 KEGG:ns NR:ns ## COG: PAE0036 COG1878 # Protein_GI_number: 18311668 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Pyrobaculum aerophilum # 13 213 15 205 223 91 31.0 1e-18 MATRNIIDLSQPITPASPAPCDPPVEFRTVASHGPDSLYQMMWFGMTDHTGTHIDAPLHF VPGGKPIDEADLSALYSMPGVFLDFTAASCFHKIDAGDVERELKQYGSIPEDAFIMLHTG TGWYSNTQEYFNSPFITPAAAERIIRLRPVAVGVNGPTVDDRRGEGRPIHRAFLSHGVHI IEGIVRSELLAGKRFHCTALPLKLAGFTGSPIRFVAVLEEPA 
>gi|316924893|gb|ADCP01000010.1| GENE 21 25676 - 26716 1519 346 aa, chain - ## HITS:1 COG:TM0282 KEGG:ns NR:ns ## COG: TM0282 COG2017 # Protein_GI_number: 15643051 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Thermotoga maritima # 1 344 5 352 356 219 36.0 5e-57 MQRVERSVFGELNGKPIHLFRLSNANGMVAELLEFGGRIKALCIPDGQGGLDNMSVGYDT FEPYHARFNPWLGALLGRTASRVTGARFQIDGKTYNLSASGPAGSNMHGGFVGFDSRVWT GEADSDATGATLKLTYLSPDGEEGYPGNASVTVTYRLDDDNRFHMNWEAATDAPTIIDMS SHVYLNLNGFKNRDVMNEYLKVNASYYLEKDATGTPTGNFIPVEGSMFDLRTPKLLEELS QGAVYNPIMALDGEGGTLREVAEVSIPEQGRSYTLHTTARALQVYNAYNMASFYTSRGLE SPFAPAPGLAIEPQNYPNAINIPSMPSPILRPGETYSERQFFAFKW >gi|316924893|gb|ADCP01000010.1| GENE 22 26522 - 26938 190 138 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVGLEGVVPYAHVVQAALAVGDAQGLDAAAEFEELGHHAVGVGQTEQVDRFTVEFTKNAS FNSLHDNLQSSYVRDEERSGSTAGTPLVFFFLQILCQRAQARCMAAEEGPCARYREKAGL LKWFGYALGKDFRRRNGL >gi|316924893|gb|ADCP01000010.1| GENE 23 27691 - 27966 227 91 aa, chain + ## HITS:1 COG:ECs3973 KEGG:ns NR:ns ## COG: ECs3973 COG2721 # Protein_GI_number: 15833227 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Escherichia coli O157:H7 # 5 81 4 79 495 74 46.0 6e-14 MEHVIMMHAKDNIAIVMAPSLAVGSEFTVGGKMVCAKDELPFAHKVAIKAIPKGGQIVKY GEPVGIATADIAPGEHVHVHNVVSGRHTVKE >gi|316924893|gb|ADCP01000010.1| GENE 24 27978 - 29135 1574 385 aa, chain + ## HITS:1 COG:STM0650 KEGG:ns NR:ns ## COG: STM0650 COG2721 # Protein_GI_number: 16764027 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Salmonella typhimurium LT2 # 6 385 9 390 390 263 39.0 4e-70 MLPEIMGYRRPDGRFGIRNHIAVIATMDNANPTVRRICSLVKGCVPFCPGFGRGEMNEDL ALHNLLSINTALHPNVYGVIVVALEDQTAEYISGEIAKGGKPVVGFSIEDQGGTVEVAYR AAQAAIQMLHDAGRLRPEPMPWSEFVLGVECGGSDGSSGIVSNPVTGRLADRVIDAGGTV IMSETLELLGGEEMLGRRAVTPEVAAKVKGIIQRTLDYAAEHNLDIMGANPAPDNIAGGL TTIEEKALGAIKKGGTRPLVEVLEEGERPTKKGFVIMDAPAPGVENLTAIVSGGCHAVIF 
STGKGNCIGHPVAPVIKISGNPMTVKAMRDNIDVDVSGVINQGLTLDQATDIVQDKLVDV CWGAMTSSDMLGEIETAASRLLRTL >gi|316924893|gb|ADCP01000010.1| GENE 25 29149 - 30048 904 299 aa, chain + ## HITS:1 COG:MA0614 KEGG:ns NR:ns ## COG: MA0614 COG2084 # Protein_GI_number: 20089503 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Methanosarcina acetivorans str.C2A # 1 289 10 297 300 121 29.0 2e-27 MKVCFLGLGSMGSGVCRNLLNKGYALQVWNRSPEKAAKVAGWGAEAFATPQEAARGAGVV MSCLSSVAALKAVAGGEDGVFSCMGEGMAFYDMGTWDIESVLALEAEAERHGVRYVYMPM GKGPEAAEAGESPLFFGGSKDVYERDKGFLSDIGEAFYLGDVKAACAFKLVTNLIGLSNN VILTEGVFLAKRMGLSEEAFLNGAKSTGAWSYQMRNSGHKVFAEAFLPMRGTLDNAWKDM KFGVEMAEKAGVSCPAFAMLRDRYKAASEAGYGEEDYISVYRLLMDAENDGATAAEPCN >gi|316924893|gb|ADCP01000010.1| GENE 26 30062 - 30916 1356 284 aa, chain + ## HITS:1 COG:BH2634 KEGG:ns NR:ns ## COG: BH2634 COG2084 # Protein_GI_number: 15615197 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Bacillus halodurans # 2 284 4 285 299 129 28.0 5e-30 MKVGFIGLGKMGAGICSNIVKNGIDVVVYDINPANMQKFVDMGASSAATVKELAAQVDVC ITCVPMPKDAKGVLLGPDGVFENLPEGGVHFDLSTLDIETVVMLEAEAAKLGKKYLVIPM GKGPAFAADGTCPLFGGGDKATYDKYEECLLKKMGKPAYIGDVKAGCALKLIQNLMGMSI NAVCSESIKLAKLANIPKEQFIEQISNSGAFSFQFKNTASGTFDEDFEHPVFTVNLAFKD VRLGIEMCESFGQRVPMMSRAREVFAATAEKYGEENFTATFKML >gi|316924893|gb|ADCP01000010.1| GENE 27 31023 - 31778 1101 251 aa, chain + ## HITS:1 COG:STM1511 KEGG:ns NR:ns ## COG: STM1511 COG4221 # Protein_GI_number: 16764856 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Salmonella typhimurium LT2 # 5 251 2 248 248 262 53.0 5e-70 MSKKIIFITGVTSGVGRETARRFIKEGWKVIGTGRRQSRLDELAAELGGDFLPLCFDVAK RDVVVEMFANLPEGFKDIDVLLNNAGSSIGQNPAQAGNMDDWDEMIATNINGLLYCTRCV LPGMIERGSGHIVNIGSVAGNRAFKCGNVYGATKAFVNHFSRMLRCDLLGTPVRVTNIKP 
GHLHTEFLAVRFKGDLKTSEAAYENLDPLTPENIAESVWWSVNLPAHMDVCDMELMPMCQ SDGGWSFCHKS >gi|316924893|gb|ADCP01000010.1| GENE 28 31874 - 33334 1775 486 aa, chain + ## HITS:1 COG:AF0333 KEGG:ns NR:ns ## COG: AF0333 COG2368 # Protein_GI_number: 11497945 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Aromatic ring hydroxylase # Organism: Archaeoglobus fulgidus # 1 483 4 482 500 543 54.0 1e-154 MMTAEEYIQSLRDLRPRRVYAFGEKIEDCVDHPLIRPSINCCAMTYKLAEMPEYRELMVT SSSFTGKPVNRFCHLHQSTADLVNKVKMQRLCGSMTGACFQRCVGMDALNAVFSTTYEID ARHGTEYHRRFVEFVKEWQEKDWTVDGAMTDPKGNRGLAPHAQKDPDMFLRVVERREDGI VVRGAKMHQTGALNSHQILVMPTQTMRAEDRDYAVAFSVPADDPSILYIYGRQASDTRKL EASKVDVGNANYGGQEVIIIFEDTFVPYKNVYMLGEIDFTGMLVERFAGYHRQSYGGCKV GNGDVLIGASQTAAECNGCAKASHIKDKIIEMIHLNETLFSCGIACSCEGSPTRAGNYQI DMLLANVCKQNVTRLPYEIARLAQDIAGGLMVTMPSDADFTSDEVGEWCRKLMVGDARYS VEDRQRILRLIEAMTIGSVAVGYLTESMHGAGSPQAQRIMIGRQANMEAKKALARRLCGI DPDGLF >gi|316924893|gb|ADCP01000010.1| GENE 29 33402 - 34163 820 253 aa, chain + ## HITS:1 COG:CAC2712 KEGG:ns NR:ns ## COG: CAC2712 COG1024 # Protein_GI_number: 15895969 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Clostridium acetobutylicum # 4 253 5 254 261 251 51.0 1e-66 MYENILFEKKDGVAYLTLNRPDQLNALNGAVLKELDDALAAIDGDAEIRVLVVRGAGEKA LAAGADITEMQSMEAEAGMAFGAFGQKVFAKIEALRQPSIAMIPGFALGGGCELALACDI RIASEKARFGQPEVGLGITAGFGGTQRLPRLVGRGAASLLLFSGNMIDAAEALRIGLVDK IVPHDALAAEVSALAEGIARQARCAVQQTKRCIRHGLESGLESGLSYEAQAFGLCFSTKE QKDRMRAFVNRRR >gi|316924893|gb|ADCP01000010.1| GENE 30 34201 - 35316 1058 371 aa, chain + ## HITS:1 COG:FN1424_1 KEGG:ns NR:ns ## COG: FN1424_1 COG1960 # Protein_GI_number: 19704756 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Fusobacterium nucleatum # 5 367 12 375 377 327 47.0 2e-89 MEATLWQRLDEIITRTVRPMAERCDEESLFCGESVRALADAGFMGLPYPPKLGGRGSSYE SYVRCVERLSKVCPSVGVTYATHVGLACHPLHAFGSPMQKESFLRPMLRGDKLGAFALTE 
PEAGSDVGAIATTAELDGGFYVINGHKIFITNAGRADIYILFARTGGQGTEGLSAFVVCS DDAGLLISAPQRKMGIRGAQTCELFLRNLCIPADRLIGRPGDGFRIAMETLDGGRLGIAA QAVGIADAAFDEACVRLKERRQFGRTLEHFQGLRWKAADMWAKVEAARQLTLYAARLRDG GKRFRAEASAAKLVASEAAVWCASEAVQICGGSGYLQGSVAERLYRDAKITQIYEGTSEV QRMVIASGVLG >gi|316924893|gb|ADCP01000010.1| GENE 31 35367 - 36209 1200 280 aa, chain + ## HITS:1 COG:CAC2708 KEGG:ns NR:ns ## COG: CAC2708 COG1250 # Protein_GI_number: 15895965 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Clostridium acetobutylicum # 1 280 1 280 282 324 61.0 1e-88 METLFVIGAGTMGGGIAQVAALNGFTVYLYDIKQEFVDKGLKGIVGAWDKLVARGKMAAE DRDAALPRIIGTLDMRDAAKADVVIEAAVEDLDVKRSIFSKLSGIVSPGCILATNTSSLS VTAVASAVGRPDKVVGLHFFNPVPRMALIEIIRGAATSDETYEAAKALSERLGKTAVTVN EYPGFVVNRILIPMINEAVFMLQEGVASAEGIDTAMKLGANHPMGPLALADLIGLDVCLA IMELLRDEMGEGKYRPAPLLRKMVRAGKLGRKSGEGFFRY >gi|316924893|gb|ADCP01000010.1| GENE 32 36295 - 37473 1202 392 aa, chain + ## HITS:1 COG:CAC2873 KEGG:ns NR:ns ## COG: CAC2873 COG0183 # Protein_GI_number: 15896127 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Clostridium acetobutylicum # 1 391 1 391 392 462 62.0 1e-130 MREVVIVSAARTALGSFGGSLASVPAVELGAVAIRAALSRGGVDAGMVEEVVMGNVLQAG LGQNPARQAAVKAGIPVEIPAWTVNKVCGSGLKAVAEAALAIRAGEARCVVAGGMENMSQ ASYVLPSARWGGRMGDVKLVDEMIRDGLWDAFNDYHMGITAENVAARYHLSREELDAHAV LSQQRAAAALEAGVFDEEIVPVCVEQKKKSFDFARDEFPRPGTTMDVLGKLRPAFKKDGV VTAGNASGVNDGAAALVLMDAGLAAERGLSPLARIVGWASAGVEPDVMGLGPIPATRKVL EKAGWTVADLDQIEANEAFAAQFLAVGRELGFSLDNVNVNGGAIALGHPIGASGARILVT LLHTLKRKGQSRGLATLCIGGGQGIAMLVERG >gi|316924893|gb|ADCP01000010.1| GENE 33 37564 - 37779 146 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRALDGSIVLMDNEAGADIRVAVCGCPAQCVSVPPGTHMLHAPYMDGERMEPCDMAAYLL DMASQRNGRDE >gi|316924893|gb|ADCP01000010.1| GENE 34 38015 - 38176 105 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWKNGAGVMMESATVSAVVFPSIGASMILRRVIGKEHPAPASAVLNKHFRHPA >gi|316924893|gb|ADCP01000010.1| GENE 35 38304 - 39707 1545 467 
aa, chain - ## HITS:1 COG:PA2870 KEGG:ns NR:ns ## COG: PA2870 COG2199 # Protein_GI_number: 15598066 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 318 466 357 522 525 99 38.0 1e-20 MKKKLFRTHAFITGIVLIGFICITIINYCTYSVVIRDDIRNISRLTSLNIFSSISNELTK PIFVSLTMANDSFLKSWLRGENDSPEQIAELQDYLNGLKRKYDYSSVFLVSAKTNIYYHY NGINKVVSRDDSHDVWYYNFVASGIPHRLEVDTDEMARNELTVFVDCRIEDDHGELMGVV GVGVKMRELQQILASFELDFDLTAFLVNREGVVQVHTSDAMIANATIDDLLPVSEYRKKI FTKSRVESSWYKYNGQETCLVTRYIDDLDWYLIVEKDTAIIKESLLSQLWKDILVVACII ILILLLVSGTVSRYDKFMTLFATTDKLTGLMNRERFNAILSRKLWERPLVFMFDVDHFKH INDTRGHLWGNEVLARVGHKAEEIVGKAGCVARWGGDEFIGFIDAPGDAEETLRSLLESV RDLDNHVTISIGATRINPEDDMDAILMRVDDGMYRSKAEGRDRITYV >gi|316924893|gb|ADCP01000010.1| GENE 36 39909 - 46994 7933 2361 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 1894 2328 312 751 776 219 32.0 6e-56 MNRLQEFLDNPHIITGTATVAADPQLTLIAANKALYTLVGETAESCRQQGNSLLPWIEQS SAVRLIGLLAGHPPLFDWEGVLVRKDATSVRIRLSGTRMEGVLHEGQYPVYLSAFTDLAC VEEMLRSAEYERRKYALIADISEDLPFEYDFETDTIAYTQKYHKVFGHEPVIPRFRERLG RGEAIDPVSEGFREPFLALDAEKASTEAAPERFLPTHSGRKRWFALYSTNILDTLGKPVK SVGALRDIDRQKREQLRLLDKSRTDSMTGLYNKVTTEEEIRIALRDARPGSSGVLFMIDI DNFKDVNDSMGHLAGDSIIMEIARQLRRTFRQDDIIGRVGGDEFHVYMRDVSEIAGIRTR AQSLCSSIRNLFKNSNIDNAVSVSVGIAVTERPIAYEDLFRQADVALYHAKGNGKNRYEF FGQSSGGDADGVQTSAPLAVNTVRNSIMVDIIDILFSMYDMHEGIDKALHFIGNALRVDK ILIFEYSLDGKAVSIAHEWCSDPKWSTKEQFQNVPVDKIELPKTRDSSGIYYCSDFSEVP PEEKTFILDDTISSLLQCDIVRDGHVVGHIGFEERGNRRIWTQQEVDALILMSKIIGEYI RQRRSASLLRESYESTRNILNSLPNTAVYVIDANHRIVYFNDTVARAYPNVKLGVTCFEV FWGKSDICSFCPVTKHGGGEAFTTLHEHPPFKGICEISVSGILWENKEPAHVVLISERLL TPEERKAKAKRDSFARALCESYHYVVDVDLGTGHFELLANRNDLATDTSGDYAAHFELFF GHILPQYRDAFREHFSLEGLRAAFAAGVNDRNIHLEYQFAEEAGPRWRTRIAFAYTQDDG 
SCHVLQCIRDINEQKCAEFTHRRDEENLQVALQNSYAKIYRMNLAANQLTCLFCNTKLLA PIDVSGEFDKDIVTVMKTRVHPRDRVVFRNFFTAETILRKLDAGEELSVEYRKLGLDGTY HWMLALIVPLPTGNRGETVLLVRDVTEKKEEENNYLLALQNNYSEIFSLDVRSRTVTPLF YNSEQVPIVKEEANFSAFVKRRAVDRVAPESLESVLDFYERRLFEELERGGRPECEYRKR SSENGPYRWIAASAQPVPGNEGHALILLRDVTKKKEEENNYLLALQSNYTEIFRLDLEAG LIAPLYYNSEQVTISPTLMPIEEFVLDRGKNRVHPENLKSVRTFYDVPNIMARLDLGEAP QLEYRKRQDAGKPYRWVNATIRAIPGAYHHALLLLSDITERLNEEADFYKALQHSYSEIY EVMLDTDSMRIVHRDEDSTLAKPELTHSYSRDTRTIANRYIHPDDRESFLELFSPENVSR KVADDPRLSTEYRVLAVDGTYHWISILVLPMPGSSGRLLLLWQDINERKRMEETAARLER RQSTVFRQSGDCIIEINLRTWQFHRNASAPSLPSEPRSGDYRTFHAETIAMVHPADRERI NRTTTPNALLEACRAHSRTLVDQYRVLFGENEQLWLENRVFFLEEGDDMTAFFIIRDITE QKRVEEERALEEERYNIALRNTYTEIYEIDLSADTPHLVYAADTPMIPVDHDKNGNIHTI AATLIHPEDRERFLTAFIGSNIRKEFSEGRMEVPAEYRRLGSDGKWYWVSAFIVPLCGHD SCRTDKGILLVRDISEQREEEQRRRISEQYDHALRNIYDELYELNITQDSYRIVYHVKGK YVTPPEQGRLSECIDLVSRNMLFPEDRTRFLEFFNLDALRQNFAAGREYLIGEFRKLWHD QEYHWASITMFPVAQPDGGDEIYLAFIMDIGDKKQAEEVAQQNILLERQRLDDERYRTIV EQTDTLVFEWNLETDTRYISPEIVARFAGNYDHRDLMHVWREDLVIHPDDLPLLTAFLKD SRIQRYTEMTARFRKRDSVYIWCKAALTCLHDDKGNPKRYIGTLNDVDSATRSVLALKYR AEFDLLTDLYNMHTFYSQAAQAVHAYPERRYSIVRMDIDRFKVINDLYGLKEGDKLLIAI ADLLREKMAGTHSVYGRLGGDVFCLCVDYSRERILALIKELTDRLADYPLPYKVVPSFGI CEVDNIDTPINVLCDWANLALKTIKGNYLNSYAFYDGKLRERILEEKKIENQMHDALLQG QFVLYLQPKVHIPTSRIIGSEGLVRWIHPTEGLMPPDRFIPLFEKNGFIIRLDEYIWEQA CITLRRWIDHGLTPTPISVNMSRMHIHDPRLREKLLDLMRRYELPPHLLELELTESAFLE NESGLFESMKALQAFGFQFSMDDFGSGYSSLNMLKSMPVDFIKIDRGFLNEVVTTERGKT VIRFSISLAREMSIKVIAEGVETEEQAAFLLQAGCAYAQGYFYSRPLPIPQFEALAFGTE HPFPVAPSIKALAEKLEKGST >gi|316924893|gb|ADCP01000010.1| GENE 37 47821 - 49338 2114 505 aa, chain + ## HITS:1 COG:PA2333 KEGG:ns NR:ns ## COG: PA2333 COG3119 # Protein_GI_number: 15597529 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 1 396 1 412 538 111 28.0 3e-24 MKEDRKIKNVVLIMLDTLQFNYLGCYGNKVVKTPNIDRLARQGFLFENAYSEGLPTVPVR 
RALMTGRFTLPYGGWQPLAPDDTTVADIMWGKNMQTALVYDTPPMRLPKYGYSRGFDFVS FNPGQELDHTSFADVPLDPALKPEDYTSPTMVFNEKGEVIDDDSTQLLDEIGCFLRQQQF RKPEDSYISRVMTDAQNWLSNRRDKTRPFLLWVDSFDPHEPWDPESVWKGEPCPYDPEYV GNPLVLAPWTPIEGRITERECEHIRALYMEKITQVDKWVGELLDSIRAQGLWDETLIVLM SDHGQPMGEGEHGHGIMRKCRPWPYEELVHVPLIMHVPGIEGGKRIKSFVQNVDVTATIM DALGELGTPQKASDFGFPTYDTSEMQGISLLPVMRGETDTVRDCAIAGYYGMSWSLITED YSYVHWLVSEDVKDSVDCIAGSGTEMKEEMWTCTAGAKVQVPDHDELYDRKADPFQLNNI ADQNPDKAREMLQQLKLVIGEIRTS >gi|316924893|gb|ADCP01000010.1| GENE 38 49533 - 50936 1830 467 aa, chain + ## HITS:1 COG:MTH788 KEGG:ns NR:ns ## COG: MTH788 COG0471 # Protein_GI_number: 15678812 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Methanothermobacter thermautotrophicus # 47 458 25 443 443 81 23.0 3e-15 MTTAVATNKLSGKWIGWLTSLLGGLFVAWCFYDPSTPKVNLYWFGTIMAIGMFAFDLLPN FVTAVLLLMYYILTGIAAPDVAFVGWTTPIPWLCMCGMLIGVLMEKTRLANRIALFTISR VGTTPIRMYIAFLLAGFIVSAIIPDVITVDILFMAIATGMCQSLNLSVTSRSATTIVLAA FFGATISSAAYLPNNTGIIGLLMVKDMGVPFTWLGFYSENLPYQIGHALLAYTILHLFGG RELGEHISRCRACAEAELATLGKMSRDEKKTLVLALLALVAFISEPWHGIPGYFAFCSMV LLGFTPIFNLMEASDLKKVQFPILFFIAGCMAIGIVAGTLGIPAWLAGKLVPYLQQIDSN AGSSMFAYWVGVVANLVLTPVAAATSLSVPMAEIATSLNLGIKPILYSFLYGLDQFFLPY ELAPALIMFATGYVRVRYLIGIMLTRMVLASLIVAFVATFIWPMMNL >gi|316924893|gb|ADCP01000010.1| GENE 39 50959 - 51657 764 232 aa, chain + ## HITS:1 COG:no KEGG:Desal_0411 NR:ns ## KEGG: Desal_0411 # Name: not_defined # Def: transcriptional regulator, Crp/Fnr family # Organism: D.salexigens # Pathway: not_defined # 8 222 8 222 235 179 43.0 7e-44 MNRFGFCKQEGPVFRIRDLNSPWRSVLHLGQRQMVGKSYQWEDDPETSTFSFLEKGRVRL LNLSETGKERIVLYIEPGCLFREIVLLHVSPSHPASLVALEPCEVYNFPRTLLDDAEFVR RHPALMSNLVHSLGAKAGAFYSQITESVELDPQTQVCRYLHRLADERRTRVVNPGVSQSE LALVLGLHRSTVCRIIRTLRDRGVLGHFTRCSLEILDREALAELCNPSEGGK >gi|316924893|gb|ADCP01000010.1| GENE 40 51816 - 52124 475 102 aa, chain + ## HITS:1 COG:no KEGG:Dde_1202 NR:ns ## KEGG: Dde_1202 # Name: not_defined # Def: thioredoxin, putative # Organism: 
D.desulfuricans # Pathway: not_defined # 1 102 1 102 102 114 60.0 8e-25 MIEIITEADYKDRLAATQNGVLLFFKKLCPHCKNMEKVLEKFGAAKPGIALYGIDIEENA AAAAALGAERPPTIFVIKGGEVKASKVGLMNPREMAAFFEKA >gi|316924893|gb|ADCP01000010.1| GENE 41 52472 - 53395 371 307 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 7 296 3 297 306 147 33 2e-34 MEERTCDIIVLGGGPAGMTAAIYAKRANLDVVLLETNVTGGLVNSTYTVENFPSYPEIHG MELMEKMREHVDHMGVEVEEVFEVTGLELGEDEKAVHGDGIVYKAPAIILATGRKPIALD VPTECDQMHFCAICDGAPYKGKRVLVVGGGNSAFDEGLYLLNLGVAELTVVEIMDRFFAA EATQDALLGDPRVKGFKTTKVVDVEVTDGKLSAAILENAQTGERTALPVDGIFVFLGQSP NNEWFKDAIALDDKGYILAGEDMSTNIPGVFGAGDINHKPYRQITTAVSDGTIAALAAER WLRSRKK >gi|316924893|gb|ADCP01000010.1| GENE 42 53695 - 54294 700 199 aa, chain - ## HITS:1 COG:no KEGG:Dde_1959 NR:ns ## KEGG: Dde_1959 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 4 199 5 208 208 203 56.0 3e-51 MTKELNCCGLACPEPVIRCRRMLEEEKPASLRVLVDNTAASENVGRFLGRTGYSVEVRQE GADLWCVSATACGCRVTEPEPAAATLTPGKTLVLITTETFGRGDDELGEKLMGNFLSTLP ELGESLWRVILLNGGVKLAATPGKALDSLKALENAGTDVLVCGTCLDFYGLLEAKQAGQT TNMLDVVTSLALADKVIRP >gi|316924893|gb|ADCP01000010.1| GENE 43 54376 - 55509 1429 377 aa, chain - ## HITS:1 COG:yjiM KEGG:ns NR:ns ## COG: yjiM COG1775 # Protein_GI_number: 16132156 # Func_class: E Amino acid transport and metabolism # Function: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB # Organism: Escherichia coli K12 # 4 376 16 390 390 302 42.0 6e-82 MAYASLERFRDITGRNVLALQQAKEAGKKVVGQYCIYSPLEIALAAGAIPVSLCGTKNDS IPVAETMLPRSLCPLIKSSFGFALQDSCPYLAASDIVVADTTCDGKKKMYELLAAYKPVV LLQLPQIQDGDAKDYWRAQYAKLAARLEKDFGVSITTDDLRNAVRLTNRLRRALKDVLDL AKRKPSPLTGMDLLDICFRASFMPDYGQAIALLQDIAAEIGGASDGAVSPDAPRVLLTGV PTGLGSHKVIQLLEECGASIVCIDNCTCYKKVRLMMDETADPLTELAGRYLDTPCSVMSP NPNRYTALRELTEAFQADAVVDLTWQGCQTYDVESWSVKKFVREELDLPFLQIVTDYSEA DTEQLKVRIEAFLEMLN >gi|316924893|gb|ADCP01000010.1| GENE 44 55545 - 56309 718 254 
aa, chain - ## HITS:1 COG:AF1959 KEGG:ns NR:ns ## COG: AF1959 COG1924 # Protein_GI_number: 11499541 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Archaeoglobus fulgidus # 1 251 1 251 251 189 44.0 4e-48 MPVAGVDVGSVAAKAVIFDPQSRSLLGKAVLPTGWNAREAGEKALAAACADAGGIAASRI VGTGYGRISLPFADKVVTEITCHARGAVHLFPGTGVVLDIGGQDSKVISTAPDGSVQDFL MNDKCAAGTGRFLQVLSGILGMELAELGEAAGRGKPAAISSMCAVFAETEIIGLLARGTP PADIAAGVFRSIARRMRGLAARIPLRGECTFTGGLATSPSFSRLLSEELGVPVNVPEYPQ LVGAIGAALIAARI >gi|316924893|gb|ADCP01000010.1| GENE 45 56730 - 56963 86 77 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLVVYERSERIPEPSTRQRNVYRAQDTKKPVCNHTGKAMGECPSGKKGGRGCRYQHTANC DGDIQPHRRRRQVMPEI >gi|316924893|gb|ADCP01000010.1| GENE 46 56947 - 57303 70 118 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKEAKLSWHYPLNSPETFSLKPSNGLGVFPLLSLACFSWSKDTPRKPCCPLSYSGGGDKQ PGQGGEMTHSKLEKAIPSVIYENQTLNALYIPMASDYCFMKILSKGTPCLVSAIRFQA >gi|316924893|gb|ADCP01000010.1| GENE 47 57282 - 57965 410 227 aa, chain + ## HITS:1 COG:FN0036 KEGG:ns NR:ns ## COG: FN0036 COG2323 # Protein_GI_number: 19703388 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 6 204 3 201 210 130 37.0 2e-30 MIASLLSYGNIAIKLAVGLIGFLWVLRTTNRCQLAQMTPVDLIGNFVMGGIIGGVIYNQD ITTVQFIIVLAIWQLLVLGVNALRIHTESGQKMIVGRPTPVVIKGKYLQDKFELLGLDIA DFATLSRIQGVHSLYDIWNAQIEPNGQATIQKKDARKTSNILITNGTIDTGALEMMEKDE GWLKNELHKRGYESYKDIFFAEWNELIDDERGKAGELYIVERTEKRD >gi|316924893|gb|ADCP01000010.1| GENE 48 58156 - 59052 644 298 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 6 271 11 278 329 252 49 4e-66 MKNPVLLSVRNYKQHFPINKSLTVHAVNGISFDLRKGEIFGLVGESGCGKSTVARAIMGI YTPTEGEIYFKDFLISEKRSYREHKKDIQRNMQVIFQDSAAALNPRMKVADIIAEPLVIN RVYPDKVRLRAEVDKLLTLVGLDTSYGNKFPAEISGGQRQRVAIARSIAVNPELIVADEP VASLDVSIQAQIVSLFQHLQRQHGFTFLFIAHDLSMVRYLCDRVGVMYEGRLVELAPAKE 
LFGNPLHPYTRALLSAIPVPDPVYERSKKIIPYTHETADLSGEWREILPEHFLLARSE >gi|316924893|gb|ADCP01000010.1| GENE 49 59049 - 60065 989 338 aa, chain - ## HITS:1 COG:lin2297 KEGG:ns NR:ns ## COG: lin2297 COG0444 # Protein_GI_number: 16801361 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Listeria innocua # 1 334 1 336 358 400 58.0 1e-111 MDTVLELENVSVHFDTPDGTVQAVRDVSLSLKSGETLAIVGESGCGKSVLCKSVMKLLPG NARISGKMLIDGKDIAPYTEKQMRKLRGSLFSMLFQDPMTTLNPTIPIGKQITEAVLKHR KMSREDAQKLAVEMLHRVGIDDPAQRMLLQPHYFSGGMRQRCVLAIALALNPHILFADEP TTALDVTVQAQMFDLLRDIQRKTGIGIVLVSHDLGVVARIADRVAVMYAGKIVEIGTAEE IFYDPRHPYTWGLLGALPSQAVENGELRGIPGMPPTLLDPPPGDAFAERNQYAMKIDYER MPPMFPVSETHSAATWLLDERAPAVTPPVTIKRRGMVH >gi|316924893|gb|ADCP01000010.1| GENE 50 60135 - 61706 2300 523 aa, chain - ## HITS:1 COG:Cj1584c KEGG:ns NR:ns ## COG: Cj1584c COG0747 # Protein_GI_number: 15792889 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Campylobacter jejuni # 38 520 22 510 511 253 33.0 8e-67 MKRNVLTAPVLKTVRWGLMCLALCLTSMPANAADEMANTLVYAGENEDTINPLLNNHQEL PTIIFSGLMKYDAKGQPVPELAESFTYDKGAHTYAFKLRDGVKWHDGKPLTVEDILFTYE ALTKDKTLASSITSNYEDIKSITAPDAQTVIFTLSKPNAAMLDNFTIGILPKHLFAGKDI NTAPANQHPVGTGRYKFVEWDTAGGMIILEKNKDYYGKVPNIDRIVYKTVAVESTKALML QSGEADLAWLNAKYADTFRGKDGYKNIDFATADYRSAAMDFHTDFWKRNGDSIGVLNYAL NKDAIAKSVLNGQGFPAFSPIQMNAYGGNKAADIYPYDLKKFAAEMEKLGWKKGKDGIYE RNGQKFSFTIQVRDYEEERVDIAKVIANELKKAGVDMQIVLVTKFDWKAGYNGFLAGFAA EFDPDGVYKDFVTGASDNTMAYSNPKVDELLKKGRATEDPAQRKAIYGQFEEAYAERPGH LLVAYLNGNYVSIAGLKGLDTARVLGHHAVGVMWNIEDWTLNR >gi|316924893|gb|ADCP01000010.1| GENE 51 61764 - 62669 931 301 aa, chain - ## HITS:1 COG:BS_dppC KEGG:ns NR:ns ## COG: BS_dppC COG1173 # Protein_GI_number: 16078359 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type 
dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 29 287 48 307 320 171 38.0 2e-42 MDASSNDAFTIVGAGYRAHQPAIRKKGFWARLKGKPTLSVAVFSVIALGCLFSNLIINHD PAELFLMNVNEAPNAEFWFGTDSLGRDIYSIIWYGGRASIFIGLMSAAVITLIGVSYGCL SGVASGVVDSAMMRAVELLQSVPVLLSLLLILSLMGKPNELTIALVIGVTGWFALSRIVR SEVRQIRNSEYILASRCMGARFPFIMRKHLIPNFVSAIMFVVISSISTSMAMEATLSFLG LGLPVDVLSWGSMLALADRALLLNSWWVILIPGVFLVVTLLSITSIGHYFRKRSNQGPSN L >gi|316924893|gb|ADCP01000010.1| GENE 52 62669 - 63658 1123 329 aa, chain - ## HITS:1 COG:BH3638 KEGG:ns NR:ns ## COG: BH3638 COG0601 # Protein_GI_number: 15616200 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 10 322 5 320 322 194 33.0 1e-49 MNSRQLMASLARKTLVFIAMMFLLSLIVFYVSRLTPGDPLQSFYGDAMQSMTTTELDAAR ERLGLNGPIWMQYAKWIGNVMQGDFGISLKYKRPVLDVVSPLIGNTLMLGGIAYALVFML AIALALFCARFEDTFMDRFVCRIGTVAFYVPAFWMGVVLVLVFSVNLKLLPSSGAYGIGK SGDMADRLRHLVLPLIVMVASHLWYYAYMIRNKLLDEIRRDYVLLARAKGLGRTRVLLSH CLRNVLPTIVSIMAISIPHVLSGTYVAESVFNYPGIGLLAVSSAKYHDYNLLMLMVLITG AMVMLSSLVAQAVNEVIDPRMKSSDGVVC >gi|316924893|gb|ADCP01000010.1| GENE 53 64234 - 64803 391 189 aa, chain + ## HITS:1 COG:MA3658 KEGG:ns NR:ns ## COG: MA3658 COG0655 # Protein_GI_number: 20092458 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 169 1 179 190 137 38.0 1e-32 MSILVVWASPNQDGLTSAAKDNILKGLSEAGGEAEALHLNKRDIRCCRACGNGWGLCRSE GRCVINDDFSEDYQRLVSAEAIVWITPVYWHDLAENLKAFLDRLRRCETRRNHFLKGKKS LLVACAGGTGRGAIQCLDRLEDTLGHMDIVAVDRLPIIRFNREYMLPALLGAGRAFAAHI GQTASEEGV >gi|316924893|gb|ADCP01000010.1| GENE 54 64991 - 65473 174 160 aa, chain - ## HITS:1 COG:no KEGG:RSP_2020 NR:ns ## KEGG: RSP_2020 # Name: not_defined # Def: DHC, diheme cytochrome c # Organism: R.sphaeroides # Pathway: not_defined # 46 158 46 158 161 99 47.0 3e-20 
MKKRYLWGFTGLLLCGIVGLAEAHDGYRKHSPNRLLIPPQTYADNCGSCHTAYSALLPSG SWKRLMGDLPNHFDAAVELDAADKEQIGAWLQGNAADTGTTRRGSKIAKKLHGDTPVRIT ESRWFIHKHDDVSQGSIAKAQGLKNCAACHADAQRGAFSR >gi|316924893|gb|ADCP01000010.1| GENE 55 65635 - 66333 989 232 aa, chain - ## HITS:1 COG:STM3115 KEGG:ns NR:ns ## COG: STM3115 COG1811 # Protein_GI_number: 16766416 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Salmonella typhimurium LT2 # 1 230 2 233 235 155 40.0 5e-38 MIGPITDSCALFVGGLAGTLFSGFIPKRVKETLPLIFGVITLCMGSTLVNKASALPVVVM ALIAGSVIGELVFAESFLAKCIRALFRLAKSKHCGNETYIVNIITLVSAFCFGSMGIFGA INEGITGSTDILLTKAVLDLFSGIVFGTLFGLSVSLIAIPQFAILVLLYASAAFLAPYMT PGMLSDFTACGGVIFVATGLRMCGIKIFPVINMLPSMLIIMPLSALWARYVG >gi|316924893|gb|ADCP01000010.1| GENE 56 66655 - 67413 586 252 aa, chain + ## HITS:1 COG:no KEGG:Hore_08770 NR:ns ## KEGG: Hore_08770 # Name: not_defined # Def: transcriptional regulator, GntR family # Organism: H.orenii # Pathway: not_defined # 10 251 4 243 244 72 27.0 1e-11 MEKSSDVRRVREQREPLYVEVKEAILELARTQCGPDRRLPSEEELCRLLGVSRATVREAL SILCREGFVSKRHGIGNLVNRSVLDTPMRFDLERGLRRMLEDAGYQASTIREMPVSPGGK DILLDDPTLPVRLTIPRPWKIQRTAHLVEGAQAIVTCNVFTSNGSWSGKGFAQELAYTDV INLLSGGTLSHTVMAFLPWCAGKGIASAFGLAPGTPIILWHELNYGIQDTLLCESVVAFN PDFVTLRALHRW >gi|316924893|gb|ADCP01000010.1| GENE 57 67479 - 68456 1020 325 aa, chain + ## HITS:1 COG:FN2103 KEGG:ns NR:ns ## COG: FN2103 COG3181 # Protein_GI_number: 19705393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 28 318 16 308 308 93 26.0 5e-19 MFSRKIQSLCIALFWLCAAVPQMAFAEYPEHPVTVYHPFEPTPYDALSKVYNAEMSKILG VPFVFEYGMMGKAAKATLNGKPDGYTVYFAAMGPMVLKPNMVGPSASPKDFRAVGRATLL PILFVANKNAPFKTFEEFKAYAKAHPGKARIGITNAPSSLQIGMTHLIKDLAKLDVQLVE QPDGPVRGTIDCLIGDTDALISHPPDIMRYIRRGDFVPLATFAPERLAMLPDVPTLRELG YDFSQTSWRVLVVNKDTPDAIVEKLAKASERALNSPAMKKSAEENYEILAWLPPSEADAF LQSEFDFYRDLSMSMGLHYTQRPKK >gi|316924893|gb|ADCP01000010.1| GENE 58 68710 - 
69411 603 233 aa, chain + ## HITS:1 COG:aq_1342 KEGG:ns NR:ns ## COG: aq_1342 COG0546 # Protein_GI_number: 15606543 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Aquifex aeolicus # 13 209 5 196 213 94 32.0 2e-19 MTGCGTERYADWFFDLDGTLVDSAPDILRLLGGVLREEGLAVPELDKGRIGPLLEDIIRG ICPDLAPADLERIVRIYRARYRACPFDESPAFPGIPRLFERLSGRGCRLFVATNKPEDVT RRLLDARGLLPFLAGFACSDSLPGRRLSKAGMLALLMERHGVEPGTAIMVGDSALDMRGG QEAGMATAAALYGYGRRGALLETGPDFVIEDAAWTRVVRLSGGAAGLHEGRIA >gi|316924893|gb|ADCP01000010.1| GENE 59 69408 - 70169 678 253 aa, chain + ## HITS:1 COG:ECs4266 KEGG:ns NR:ns ## COG: ECs4266 COG1349 # Protein_GI_number: 15833520 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 253 1 252 252 192 38.0 4e-49 MKQENRHQAILYAVMRQGYVSIESLAEELGVSSQTIRRDISELDRTGMLERRPGGASCRT TILNSTYDARQVEDFKDKERIARTIAEYIPDNSSIFLTLGTTVEVIAYALLERSGLMVIT NNVVAALTLNKKTDFEVVLASGYMRKSSNGLVGESTIEFVNGFQCDYVITSTGGISEADG HLLDYHTADVSVAQCMMRNANKVLLAASRSKIGRQAVVRVAPLKTVDVLFTNSPVPSRLE EVAQAAGVELIAC >gi|316924893|gb|ADCP01000010.1| GENE 60 70463 - 71485 1434 340 aa, chain + ## HITS:1 COG:SMb20036 KEGG:ns NR:ns ## COG: SMb20036 COG1638 # Protein_GI_number: 16263787 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Sinorhizobium meliloti # 6 335 8 333 338 155 32.0 2e-37 MRKGLFTLLSLVCAGLIWAAPAPANAADKIVIRFAHSNTPSDMDPYHALVTKMNDKLVKK LGADRIEIQEFPAGQIGSEERSFQDVQQGILQMTVLAVNNASIFAPSLGAFDLPYIFKSV KEFDDCVDQNWDLINKRMEAESGTIAIAWHSQGFRFITNSKQPVAKLADLAPVKIRTPNN PIMIGVYRAWGTDPVPMAMDETFNALQQKVVDGQDNPLVSIATNRFYEVQKYITEPHYKL WTGPVVVNVDWLRSLPADVQQAIIEAGHETTVEMRALLDKQESASREFLKEKGMVFCGVP TDEDEWMKRAVSIWPQFYPQIKNLEIVEAFMKTLGRQLPR >gi|316924893|gb|ADCP01000010.1| GENE 61 71541 - 72047 765 168 aa, chain + ## HITS:1 COG:no KEGG:RSKD131_4191 NR:ns ## KEGG: RSKD131_4191 # Name: not_defined # 
Def: tripartite ATP-independent periplasmic transporter, DctQ component # Organism: R.sphaeroides_KD131 # Pathway: not_defined # 24 167 4 147 147 93 36.0 2e-18 MKNLLAWLDVHFEDALCGVLLVGIMIMLMAQVVVRFAFGHGLTFSEELCRFSFLYLVYFA ASLVACKGAHIRVTAHTQKLPAPVRVGLLMLADLLWLGFNATVIYQGALLIDSMSTRPMV SGSMLLDLRYVYFAVPAAFTLQSFRILQRWYRHFRGGVSILGQQQEEV >gi|316924893|gb|ADCP01000010.1| GENE 62 72049 - 73335 696 428 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. PR1] # 6 426 8 430 431 272 34 4e-72 MSADAVLWIVMAFCLIVGCPIFVSLGVASTLALILTNIPMRIVALDMLKVMDMFPLLAVV GFIFAGALMEKGGMARQIVDVASLFVGRIRGGLGITTILGCLFFAAMIGSGPGTVAAMGS IMIPTMVRRGYSPEYAAGVCATGGTLGILIPPSNPMIIYGIIANTSIAALFTAGFVPGFV LGLMMMLMAYFLARRAGFTGTVDEDKGAHFWPMFRKNFFSLMAPVIILGSIYAGICTPVE ASVVAVFYALFVGTCITRELKIIQLWDAIKLTNVSAGSIIIVLGVSTLFGRILTMQRIPH QLANAMITLTDNPYVILVLIGLLLLFLGMFMETLATIVILAPIFLPLITKVGIDPVFFGI FWVITNEVALLSPPLGVNLFIAQNLSGISFERVAKGAFPYMMLIICFIIMLILWQDLPLW LPRLTGQY >gi|316924893|gb|ADCP01000010.1| GENE 63 73355 - 75277 2600 640 aa, chain + ## HITS:1 COG:AGl714_1 KEGG:ns NR:ns ## COG: AGl714_1 COG0524 # Protein_GI_number: 15890473 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 330 2 327 330 376 57.0 1e-104 MSEKDFDLICIGRSCVDLYSGEFGVPLERAMTFTKSVGGSPMNIAIGTSRLGLRVGAITG VGQEDNGRYLKWQLSCEGVDVSAVKTDPKRLTAMVLLSIRGDNDFPLIQYRENCADMGLT AEDIDPDYLARAGAVLVTGTHLSREGVRSATMKVLETAKRLGLKCILDIDFRPNLWGLQG HDAGSSRWAEASEQVTSEYKKVLPYFDLIVGTEEEFFIAGGKTEAMEALREVRRLSKALL VFKLGDKGCAALPGDIPDSFVDEVVYPGFPVKVFNSIGAGDGFMSGFLRGWLRNEDLASC CRYANAAGAFAVSRLGCSSAYPSWTELQYFVSHGSKHKWLREDAMLEQIHWATNRRNKWK NLAVFAFDHREPFSALAAETGRDAKAITAFKNLAFRAVAEASSELEGQNDVGILVDDTYG QSVLFESNRYPFWVGRSIEKTGVNPLMFEGKADVGSTLQAWPENHVVKCLFRPGAKDAPE VVEENERQLCRLFSGARSTGHDLLLEIIVDQPQTREEQDACLLRWMRRCYELGVYPDYWK ILPVADQGTWQAIEALIGEYDPYCRGILFLGLNVAEAELVKRFAALPESPRALGFAIGRS IFLEPARKWFAGQMSDEDAVATMRDCFLRLARAWNGRNRG 
>gi|316924893|gb|ADCP01000010.1| GENE 64 75524 - 76558 1189 344 aa, chain + ## HITS:1 COG:RSp0948 KEGG:ns NR:ns ## COG: RSp0948 COG1063 # Protein_GI_number: 17549169 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Ralstonia solanacearum # 1 339 1 341 344 257 43.0 2e-68 MKACVLYGPKDLRIEERPVSEPQPGFVRLRMGGVGICGSDLHYYHIGRVGNAVVREPMIL GHELSGVVDAIGPGVSGLVPGQKVIINPTDECGTCGYCRSGHQNLCPDLRYYGSAARTPH VQGIMVEYPMVQARQCVAVSDAMPLDQAACVEPLAIALHAVARAGNLLGKRVFVSGAGPV GCLIAAVARLNGAASVAISDMEAFPLSVAVRLGADAAFKAADPALADLANGCEACFEASG SVGGMNTCLRSAAPGGTVVHVGFLGEEEVAYPVNTMVIRKEVTVCGSLRAYQEFPLAARL VESGRLDLSPLITASYPLEKAEEAILAASDKTRSMKVVLTGPAL >gi|316924893|gb|ADCP01000010.1| GENE 65 76566 - 77363 761 265 aa, chain + ## HITS:1 COG:TM0416 KEGG:ns NR:ns ## COG: TM0416 COG1082 # Protein_GI_number: 15643182 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Thermotoga maritima # 1 245 1 246 270 106 29.0 5e-23 MNISISVAATPCAMPQIMFAGDLEARCALLAGLGYDGIDMFFPDPQGTDARTAKAALDRN GLRATMLAAQGDLMADGLYLNDAGRLSELLERSKYHLEQCAVLGAMPNIGFIRGWHRNDP GSLPRMADGLAAYCQLAASFGVDVLLEPICRYEIDSIHTTDQAIDLCERAGKPANLGLLL DLFHMNIEEASVCGAICRAGSLVRHVHFVDNTRAVPGMGCMALPDVVACLKAVGYEGFLG IEAIPGSDPEGEARSGLAFTRALGV >gi|316924893|gb|ADCP01000010.1| GENE 66 77467 - 78225 211 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 246 4 238 242 85 27 6e-16 MFSLNGKTALITGSGRGIGLAIAEAMAHQGANIFLSDINSSVVERAAEELAEEYPNVTVR GLTFDVTDKAQIESAMQTIRDAGNGLQILVNNAGINLREPVADMDDALWQKMLDTNLTSV FRVSRAAFPMLKEKGGKVINLCSLMSEIARPTVSPYASTKGAVRQFTRALATEWAEHNIQ VNGIAPGFIATDMNIPLMEDKDLNDYIMRHTPAKRWGKPSEVASVAAFLASPAADFVNGQ VIFIDGGFIISL >gi|316924893|gb|ADCP01000010.1| GENE 67 78343 - 79074 674 243 aa, chain + ## HITS:1 COG:BH1352 KEGG:ns NR:ns ## COG: BH1352 COG0274 # Protein_GI_number: 15613915 # 
Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Bacillus halodurans # 10 197 2 187 224 123 41.0 3e-28 MRINGTVWSARDIAARIDHTVLAPDATRKDMENACVLARAYGFKAVFTNPYWTPLVAELL DGSGIAAGISAAFPLGSLSTDAKVAEVMDAVARVDGKPCAVDMVTNIALLKEGHFDDYTR DIAAVVNAVEGRGIIVKAILETSLLNAEEIRTACRCAAEAGVDFVKTSTGRAGAPALSHI RIMREALPAHVGIKFSGFGTLNAPELALWAFFLGADVLGSPCGDVVVDVLSSGYAEVGML AGS >gi|316924893|gb|ADCP01000010.1| GENE 68 79387 - 79563 167 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGKTVMVLVIPEPKVRKATAKPVRKHKDKTAYSRKAKHKTPRMGVFFYPKGKNIDQLF >gi|316924893|gb|ADCP01000010.1| GENE 69 79686 - 80057 176 123 aa, chain - ## HITS:1 COG:no KEGG:bglu_1g32480 NR:ns ## KEGG: bglu_1g32480 # Name: not_defined # Def: integrase, catalytic region # Organism: B.glumae # Pathway: not_defined # 3 123 226 346 348 220 81.0 1e-56 MEMGLIRMLTDRGTEYCGKVEAHDYELYLGVNGIEHTKTKARHPQTNGICERFHKTILNE FYQVAFRRKLYQSLEELQADLDTWIDSYNTQRTHQGKMCCGRTPMQTLLDGKSLWAEKVG QLN Prediction of potential genes in microbial genomes Time: Fri May 13 01:46:28 2011 Seq name: gi|316924871|gb|ADCP01000011.1| Bilophila wadsworthia 3_1_6 cont1.11, whole genome shotgun sequence Length of sequence - 14607 bp Number of predicted genes - 28, with homology - 12 Number of transcription units - 14, operones - 6 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 303 235 ## Nmul_A1248 integrase catalytic subunit - Prom 346 - 405 2.4 2 2 Tu 1 . - CDS 408 - 1007 121 ## LIA001 integrase - Term 1367 - 1398 -0.7 3 3 Op 1 . - CDS 1497 - 1799 157 ## 4 3 Op 2 . - CDS 1809 - 2039 224 ## - Prom 2078 - 2137 2.9 + Prom 2011 - 2070 1.8 5 4 Tu 1 . + CDS 2169 - 2807 650 ## gi|237747956|ref|ZP_04578436.1| predicted protein + Term 2817 - 2853 1.3 6 5 Op 1 . - CDS 2889 - 3905 1137 ## LHK_01792 hypothetical protein 7 5 Op 2 . - CDS 3909 - 4712 572 ## LHK_01791 RecB 8 5 Op 3 . - CDS 4709 - 4849 130 ## 9 5 Op 4 . - CDS 4846 - 5139 334 ## 10 5 Op 5 . 
- CDS 5228 - 5470 342 ## - Prom 5502 - 5561 5.1 11 6 Tu 1 . - CDS 5633 - 5827 83 ## 12 7 Op 1 . + CDS 6002 - 6262 172 ## 13 7 Op 2 . + CDS 6278 - 6505 91 ## 14 8 Tu 1 . - CDS 7086 - 7490 219 ## - Prom 7555 - 7614 2.9 + Prom 7519 - 7578 4.1 15 9 Tu 1 . + CDS 7625 - 7822 85 ## 16 10 Op 1 . + CDS 8076 - 8528 433 ## 17 10 Op 2 . + CDS 8525 - 8827 147 ## 18 10 Op 3 . + CDS 8821 - 9306 245 ## CLJ_B1785 putative phage N-6-adenine-methyltransferase 19 10 Op 4 . + CDS 9303 - 9767 175 ## DVU1741 hypothetical protein 20 10 Op 5 . + CDS 9685 - 10314 122 ## DvMF_1672 hypothetical protein 21 10 Op 6 . + CDS 10193 - 10618 274 ## 22 11 Op 1 . + CDS 10856 - 11050 136 ## 23 11 Op 2 . + CDS 11043 - 11246 193 ## 24 11 Op 3 . + CDS 11246 - 11620 346 ## gi|302343033|ref|YP_003807562.1| hypothetical protein Deba_1600 - Term 11573 - 11614 10.2 25 12 Op 1 . - CDS 11651 - 12259 213 ## Aaci_1133 restriction endonuclease, type II, Eco29kI 26 12 Op 2 . - CDS 12259 - 12489 75 ## COG0270 Site-specific DNA methylase 27 13 Tu 1 . + CDS 12549 - 12671 83 ## + Prom 14024 - 14083 3.3 28 14 Tu 1 . 
+ CDS 14150 - 14536 416 ## RB2501_01256 hypothetical protein Predicted protein(s) >gi|316924871|gb|ADCP01000011.1| GENE 1 3 - 303 235 100 aa, chain - ## HITS:1 COG:no KEGG:Nmul_A1248 NR:ns ## KEGG: Nmul_A1248 # Name: not_defined # Def: integrase catalytic subunit # Organism: N.multiformis # Pathway: not_defined # 1 100 1 100 347 155 72.0 3e-37 MESFNQNVIKHKTGLLNLAAELGNISKACKMMGFSRDTFYRYQAARDAGGVEALFEVSRR KPNLKNRVEEAIEVAVTAFAVDFPAYGQTRASNELRKQGI >gi|316924871|gb|ADCP01000011.1| GENE 2 408 - 1007 121 199 aa, chain - ## HITS:1 COG:no KEGG:LIA001 NR:ns ## KEGG: LIA001 # Name: not_defined # Def: integrase # Organism: L.intracellularis # Pathway: not_defined # 14 194 171 351 359 184 44.0 2e-45 MSFQGFPSFPHIEYEHFIPPTQQELALVFVHAPEHVRRVVILGSQMGMRVGPSELFGLMW SDVDLENKVIHLRAAQKNKNEPVRDIPIRQSLVEELKAWQEVDRAKRVAPVIHYGGKPVK CIHGAWHRTLLRAGIQRRIRPYDLRHAFVTEAIAAGVDIGTVGKLVGHANLTMILKHYQH VLSSQKRAAVEAIPEPQLG >gi|316924871|gb|ADCP01000011.1| GENE 3 1497 - 1799 157 100 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEPELLTTKQAASVMNIGEERARAILLSRGVQPVSLPWGKERKTLRWSRRAVMTVIDTLH AEAQAKAGVPKRRRSLKTAGCVIGKSAEELFAEFNMGAVQ >gi|316924871|gb|ADCP01000011.1| GENE 4 1809 - 2039 224 76 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MITPEELDYIRTAAIGDMLGDSRAFDGMGPSAVIFRLCVEIKKLRKERNENSVLIRFIIG RLEAIAQRGKASRKAV >gi|316924871|gb|ADCP01000011.1| GENE 5 2169 - 2807 650 212 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237747956|ref|ZP_04578436.1| ## NR: gi|237747956|ref|ZP_04578436.1| predicted protein [Oxalobacter formigenes OXCC13] # 6 154 1 149 159 93 48.0 7e-18 MQKVLVIRRVGGLEVPAQAQGGSPGVIRRVGGLEVEQQRRDILTEVIRRVGGLEVTPDTA RPRAKVIRRVGGLEVIGLIPSRSLFVIRRVGGLEVQRHGTATGPLVIRRVGGLEDREGFT ESLIRVIRRVGGLEDLRRTFPELLDVIRRVGGLEAPEQRGEQFSPVIRRVGGLEVKSTDD FRGQDVIRRVGGLEVHGTENRMDQFVIRRVGG >gi|316924871|gb|ADCP01000011.1| GENE 6 2889 - 3905 1137 338 aa, chain - ## HITS:1 COG:no KEGG:LHK_01792 NR:ns ## KEGG: LHK_01792 # Name: not_defined # Def: hypothetical protein # Organism: L.hongkongensis # Pathway: not_defined # 6 215 3 214 311 
261 59.0 2e-68 MSQQPQTTTLSELKKPVPPLDPSIKAGFDTVGGFDLIQRTAKLFAASNIVPQQFQGNLPN CVIAVDMALRMGANPLMVCQNLYIVHGRPAWSAQFLIATLNQCGRFTSIRYEFQGEEGKD EWGCRAVATELATGEKLTGPLITIGLAKKEGWYGKNGSKWQSMPELMLRYRAASWFVRAY APEIAMGLKTAEEVQDTYALEPAEDGTYRVSVQEMKEEAQDRDTPSKRSRPTNAEMEARR KEAADAWLATGNPIEDVEKLVNAYARNWTTAQCEKAKQLAAEAMRNGAQQDAPEVPAQPE AQPAPAANMITCPKTETQVSDWTCSDCEQRAGCPAWAE >gi|316924871|gb|ADCP01000011.1| GENE 7 3909 - 4712 572 267 aa, chain - ## HITS:1 COG:no KEGG:LHK_01791 NR:ns ## KEGG: LHK_01791 # Name: recB-2 # Def: RecB # Organism: L.hongkongensis # Pathway: not_defined # 7 266 5 270 271 303 59.0 5e-81 MNIKEPILIRASSLAGLFECPARWEAQNIRGLRTPSSGSARLGTAVHTSTALFDTSRMNG TGITPDEAAGAAVDAIHKPDEEVVWDDLQPTEAERIALSLHRLYCSTIAPRQTYAAIEAT CERLDIADLGIALTGTVDRVRRTEDGYGIADIKTGKSAVGADGTCKTQGHAAQLAVYELL AEHSTDICIEAPAQIIGLQVAKTERGQRVATGTISGGRDLLIGDEEFPGLLEMAATLIHS GNFYGNPRSNGCGEKYCPIFNTCKWRR >gi|316924871|gb|ADCP01000011.1| GENE 8 4709 - 4849 130 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKWIEQYPVAWRVALIVLCFLLVGYFEWEDQELFNNMTPLSVEAER >gi|316924871|gb|ADCP01000011.1| GENE 9 4846 - 5139 334 97 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKDLNRMTAQELDSELSRLARLRTATCRITPIFEPQGLGFVNDIDGNELAHCAHGSVRDY IVTFDPATVRGLLDVAIDAVSARLDAVSEAERKRDAA >gi|316924871|gb|ADCP01000011.1| GENE 10 5228 - 5470 342 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTTTAPYLVIGTSPFNGMDTVDECDSLEEAREFANEHGGKVIKASEYRPKGLLEDWLNNE PEDFGVRPGIDFPATLHRAG >gi|316924871|gb|ADCP01000011.1| GENE 11 5633 - 5827 83 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTNTSLSSEASLSYSLNNASSGLLLEFRIMKILFVMSVTFSEVDIGPSFGMNLQDIVLPL CPSS >gi|316924871|gb|ADCP01000011.1| GENE 12 6002 - 6262 172 86 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEENMAIELNDFVEKVLRDLAGFVPVGERVEFEVSPAYVASFLKSDGKTLYGEEATHAYL GENSNGDKLIFSVPQKYEKTIKQPLG >gi|316924871|gb|ADCP01000011.1| GENE 13 6278 - 6505 91 75 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEIKELIQALEAANDEEKERLTVAIAPFLDRQIIKTLGAASRATASHGGMSCGAPVNSCP IDTLLMLKEKINLYR >gi|316924871|gb|ADCP01000011.1| GENE 14 7086 - 7490 
219 134 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNETFIENVKKALRLYVDREHNGVTSQAEKALGLAEGSGLLSKWLKPRGDKGERVPSLLQ IAPLFPQLGIRVFFPWEELPTESVDTRKLESEIHRLNIELAKKDGAIEALNARLDRYETV AEKENTMTAPASAK >gi|316924871|gb|ADCP01000011.1| GENE 15 7625 - 7822 85 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEASKQGLPYSTVRKHWLGERELGIKSVIRYEQILGIPRSELMPELFTPAVPSISTEPEE VSNAE >gi|316924871|gb|ADCP01000011.1| GENE 16 8076 - 8528 433 150 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNKLLQAVAAKTYQLCRTYPGGIRAMFAGMRERHGLSESSVYADLNPNNGQSTLRVWELV RVMEATGEHGPLRELANWFGYSLASAADEEPDAPTLEAEALQDYPPLVAFHEGCKLYRQG KITLAQLDELKDRAFKEVRETFCRTVKGDE >gi|316924871|gb|ADCP01000011.1| GENE 17 8525 - 8827 147 100 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNLTFSAHALDRCFERRISLQGVLDALRQGTCVKGSGMKDGERFVMAHGRLKVVAEFEGP ACVVISAWRDQKGSKRAARERRQRLRRIRLAFKEGRVITC >gi|316924871|gb|ADCP01000011.1| GENE 18 8821 - 9306 245 161 aa, chain + ## HITS:1 COG:no KEGG:CLJ_B1785 NR:ns ## KEGG: CLJ_B1785 # Name: not_defined # Def: putative phage N-6-adenine-methyltransferase # Organism: C.botulinum_Ba4 # Pathway: not_defined # 1 158 1 140 140 137 45.0 1e-31 MLNRALFSSVKDDWPTPWEFFRNLDLEFDFTLDVCAVPWSAKVCRYCVPPHALRVWGETT FRRLFPDALVDGLAHSWAGERCYMNPPYGREIGPWVEKARREGERGALVVGLLPARTDTA WFHEHVYRAATEIRFLKGRLKFEGAAASAPFPSMIAVWGKV >gi|316924871|gb|ADCP01000011.1| GENE 19 9303 - 9767 175 154 aa, chain + ## HITS:1 COG:no KEGG:DVU1741 NR:ns ## KEGG: DVU1741 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 149 1 153 323 157 54.0 8e-38 MNTDIRLSVGFWQHPKTKKTARRLGLEGIRSLQVLWLWSTQYRPDGNLSGMDWEDIELAA DWQGEERKFFDTCLGMWVDETSDGYVLHDWQEHNPWQSEALARSEKAQKAAQARWGKANN INKQCLTDAQAMPEHDSSNAPSPSPIPEEKDKCV >gi|316924871|gb|ADCP01000011.1| GENE 20 9685 - 10314 122 209 aa, chain + ## HITS:1 COG:no KEGG:DvMF_1672 NR:ns ## KEGG: DvMF_1672 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 50 129 169 248 285 85 45.0 2e-15 
MLKQCPSMIRAMPLLLLLYQKKKTSVFNAGARVERPAENPEPEAVPSRQPSLAFDEFFEA FPEQHRGGRSEAAGEWVALESSRALPGLPRILDALGQWEDSEAWKRQGGRYIPSAANFLK REYWLRKPPEREIQSAGPSGGRPMTARQAEAKERGDWAKHILAFDEAVRNGDVEDFGFGT EQGVCALPATDAGAGRVRAAGQGMGRRVG >gi|316924871|gb|ADCP01000011.1| GENE 21 10193 - 10618 274 141 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MATLKILALELSKAFVLYRQPMPEREEFELLVRAWADVLADVSDAEFIEGIRRVEAKLSF FPVPADVMRQVEESRKRTPAVNREALPENALTLDERCELGTDWCAKILANLRGKMDVRRQ GRPDTSLDEQLANLRALGVEQ >gi|316924871|gb|ADCP01000011.1| GENE 22 10856 - 11050 136 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLRGEIGHTKKPDLDNMAKQLKDAMSRTGFWGDDRQVVSLRCSKCYAAVPHWEVAVYPLE ARDA >gi|316924871|gb|ADCP01000011.1| GENE 23 11043 - 11246 193 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPERNLLLGWAKICAYAKVSRLLMIRYGYPVYDCDRAVHHGYGVCAYTDELDAHKAQLER LGKKRGA >gi|316924871|gb|ADCP01000011.1| GENE 24 11246 - 11620 346 124 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302343033|ref|YP_003807562.1| ## NR: gi|302343033|ref|YP_003807562.1| hypothetical protein Deba_1600 [Desulfarculus baarsii DSM 2075] # 9 114 7 113 126 62 37.0 9e-09 MGELWVSHDELVDAIGDDGADLLCRAVGGVSTYIPRKPVAGSPISAVLGMKRMERLCAAF GGLRVTLPNRRKGEPFKDRIVRMLSSGKSPGNIALELGVTERYVRILARQIKEQPKPQVQ LRLW >gi|316924871|gb|ADCP01000011.1| GENE 25 11651 - 12259 213 202 aa, chain - ## HITS:1 COG:no KEGG:Aaci_1133 NR:ns ## KEGG: Aaci_1133 # Name: not_defined # Def: restriction endonuclease, type II, Eco29kI # Organism: A.acidocaldarius # Pathway: not_defined # 1 199 1 199 207 266 62.0 5e-70 MDVKIYNPLDKVNLGGSVADAMLSGPIFPLGGLESFNGAGIYAIYYTGDFEAYTPLSAKN KDGEFNMPIYVGKAVPPGARKGNFGLDSEPGPSLYKRLQEHAESIATVENLRIEDFFCRF LVVDDIWIPLGESLLIAKFSPVWNKLIDGFGNHDPGKGRYEGARPKWDTLHPGRNWANKC ATRAESVEQIIQEIQAYFRGIV >gi|316924871|gb|ADCP01000011.1| GENE 26 12259 - 12489 75 76 aa, chain - ## HITS:1 COG:MTH495 KEGG:ns NR:ns ## COG: MTH495 COG0270 # Protein_GI_number: 15678523 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Methanothermobacter 
thermautotrophicus # 8 62 357 407 413 61 54.0 5e-10 MLVTPSGEVRYFTVRESARLQTFPDTYIFHGSWTETMRQLGNAVPVKLARIMAASIAEKL VEAETKRLIAKMRRCA >gi|316924871|gb|ADCP01000011.1| GENE 27 12549 - 12671 83 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGYPYARRKRALRLERYGYRIMPSLSYFEDREALVMRPER >gi|316924871|gb|ADCP01000011.1| GENE 28 14150 - 14536 416 128 aa, chain + ## HITS:1 COG:no KEGG:RB2501_01256 NR:ns ## KEGG: RB2501_01256 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 6 119 1 113 117 127 54.0 9e-29 MAVLSLRHFSPVEFRCKCGCGAGMEKMDADLLQMLDEARDLAGIPFPLSSAYRCPKHNKA VGGVPTSAHTRGYAVDIRCVDSHSRFVMLQALLEAGFRRIELAPTWIHVDNDPDKPRDVA FYQHGGKY Prediction of potential genes in microbial genomes Time: Fri May 13 01:50:14 2011 Seq name: gi|316924833|gb|ADCP01000012.1| Bilophila wadsworthia 3_1_6 cont1.12, whole genome shotgun sequence Length of sequence - 29512 bp Number of predicted genes - 39, with homology - 20 Number of transcription units - 13, operones - 9 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 55 - 336 194 ## RB2501_01256 hypothetical protein 2 1 Op 2 . + CDS 336 - 626 389 ## 3 2 Op 1 . + CDS 734 - 946 201 ## 4 2 Op 2 . + CDS 943 - 1221 334 ## + Term 1246 - 1283 -0.9 + Prom 1251 - 1310 6.5 5 3 Tu 1 . + CDS 1349 - 1945 -28 ## + Term 1971 - 2012 5.2 6 4 Op 1 . + CDS 2173 - 2406 198 ## 7 4 Op 2 1/0.000 + CDS 2403 - 3767 1101 ## COG5410 Uncharacterized protein conserved in bacteria 8 4 Op 3 2/0.000 + CDS 3764 - 5221 1440 ## COG3567 Uncharacterized protein conserved in bacteria 9 4 Op 4 . + CDS 5221 - 6096 781 ## COG2369 Uncharacterized protein, homolog of phage Mu protein gp30 10 4 Op 5 . + CDS 6093 - 7307 1318 ## 11 4 Op 6 . + CDS 7308 - 7880 665 ## 12 4 Op 7 . + CDS 7897 - 8862 1337 ## COG4834 Uncharacterized protein conserved in bacteria 13 4 Op 8 . + CDS 8873 - 9211 428 ## 14 4 Op 9 . + CDS 9222 - 9620 515 ## 15 4 Op 10 . + CDS 9622 - 10101 520 ## PAU_03111 hypothetical protein 16 4 Op 11 . 
+ CDS 10113 - 10475 439 ## 17 4 Op 12 . + CDS 10472 - 11029 595 ## 18 4 Op 13 . + CDS 11020 - 12513 1931 ## Bcep1808_4542 hypothetical protein 19 4 Op 14 . + CDS 12526 - 12969 674 ## 20 4 Op 15 . + CDS 12997 - 13419 506 ## + Term 13490 - 13520 0.5 21 5 Op 1 . + CDS 13625 - 15340 761 ## PD0969 hypothetical protein 22 5 Op 2 . + CDS 15353 - 15919 363 ## 23 5 Op 3 . + CDS 15979 - 16536 123 ## DEFDS_0878 hypothetical protein + Term 16542 - 16578 5.6 - Term 16528 - 16566 2.2 24 6 Tu 1 . - CDS 16571 - 16783 107 ## - Prom 16915 - 16974 3.9 + Prom 17035 - 17094 5.9 25 7 Op 1 . + CDS 17233 - 18105 369 ## COG3617 Prophage antirepressor 26 7 Op 2 . + CDS 18115 - 18369 212 ## + Term 18371 - 18418 1.6 - Term 18363 - 18402 1.1 27 8 Tu 1 . - CDS 18408 - 19298 109 ## Shew185_4132 hypothetical protein - Prom 19339 - 19398 10.1 + Prom 19351 - 19410 6.7 28 9 Op 1 . + CDS 19475 - 19855 415 ## 29 9 Op 2 . + CDS 19852 - 20775 763 ## PD0964 hypothetical protein 30 9 Op 3 . + CDS 20775 - 21557 719 ## plu3029 hypothetical protein - Term 21544 - 21582 7.2 31 10 Op 1 . - CDS 21586 - 21969 538 ## COG1598 Uncharacterized conserved protein 32 10 Op 2 . - CDS 22009 - 22191 146 ## - Prom 22244 - 22303 4.1 + Prom 22153 - 22212 3.8 33 11 Op 1 . + CDS 22333 - 23622 654 ## gi|266621163|ref|ZP_06114098.1| spore surface glycoprotein BclB 34 11 Op 2 . + CDS 23627 - 24865 1042 ## COG3299 Uncharacterized homolog of phage Mu protein gp47 35 11 Op 3 . + CDS 24868 - 25596 453 ## 36 11 Op 4 . + CDS 25605 - 26657 192 ## gi|302859634|gb|EFL82713.1| side tail fiber-like protein + Term 26663 - 26695 4.5 - Term 26651 - 26681 3.3 37 12 Tu 1 . - CDS 26687 - 27436 234 ## Ddes_0771 hypothetical protein - Prom 27457 - 27516 5.5 + Prom 27933 - 27992 10.5 38 13 Op 1 . + CDS 28220 - 28744 442 ## gi|212703059|ref|ZP_03311187.1| hypothetical protein DESPIG_01098 39 13 Op 2 . 
+ CDS 28986 - 29511 177 ## COG0500 SAM-dependent methyltransferases Predicted protein(s) >gi|316924833|gb|ADCP01000012.1| GENE 1 55 - 336 194 93 aa, chain + ## HITS:1 COG:no KEGG:RB2501_01256 NR:ns ## KEGG: RB2501_01256 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 6 84 35 113 117 99 58.0 5e-20 MRPVTLAGIPFPLSSAYRCPKHNKAVGGVPTSAHTRGYAVDIRCVDSHSRFVMLQALLEA GFRRIELAPTWIHVDNDPDKPRDVAFYQHGGKY >gi|316924833|gb|ADCP01000012.1| GENE 2 336 - 626 389 96 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEATVIDFILSTLMSFSAQYPDAARLVTALSVVMTVCGLCAVATVWMPVPKEPTGLYAIF YRWAHALVAHFGQNKGAVADGKSETVKAEVKAVTGK >gi|316924833|gb|ADCP01000012.1| GENE 3 734 - 946 201 70 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTLAASGCSTVAEPTAATSPAPLTPGAVVTGEWSYTYRGETFTESGEWVHLPAGEAGNLL LWIKGVEAGR >gi|316924833|gb|ADCP01000012.1| GENE 4 943 - 1221 334 92 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNDDVQLVREIGEVKAELSGLKAEVAGTNQRLDDIVITQLKDHGKRLAALDVRIAALEAA ENKRAGGLSVLAGVAAAAGSLGALIMKLIGPM >gi|316924833|gb|ADCP01000012.1| GENE 5 1349 - 1945 -28 198 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGIFYHQFTAFGAAAQPMHAAAVQLATACAQHFAMIAAWHIGIGNSSCRAAVAVNSPFYP AIIGQPFGAMYGNSQLAIQAIALGHGGAVLGGHAERAALSAGAIPGLPPVPLYTVVPGIG NLQAVLYVDLAPCLNCQFWLNGLGGGVPNPYNGIINGLGATTNLHVWYRWPYTPAGVAAM TAFHGNPLPAQVGIIAGW >gi|316924833|gb|ADCP01000012.1| GENE 6 2173 - 2406 198 77 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGDGGRSGAEILAAALFKKAAKGSERAFELIRDTVGEKPSDRIDHTSSDGSMSPYRLTPA EVAQELIRQSEELEGEE >gi|316924833|gb|ADCP01000012.1| GENE 7 2403 - 3767 1101 454 aa, chain + ## HITS:1 COG:XF1569 KEGG:ns NR:ns ## COG: XF1569 COG5410 # Protein_GI_number: 15838170 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 42 272 55 314 316 113 31.0 9e-25 MIFPTLEQYARAKGAATRVPRVILPYHRKMYAAITSWAAGTLPGGARNLAITIPPRHGKT LAAHDTVEWLFGMVPESRWLYTAYSADLAVSQTMRIRDALVSDWYRRMFPDTQVRGNRQN FVTTAAGGELYGVGMTGTLTGFGAGRKRREFGGAIVIDDPLSAEESRSATRRAHVNEWYT 
QTLKSRRNHDGTPILLIMQRLHTEDLVGHVLATKPGLWRVLKLSAMDEATGEMLWPETFS RESAELMREVDPMTFYAQYQQEPMIPGGAMIKKEWWQWFDFDGRYRFDGMLFATADTAYK AKSTADASVIRVWHGTRNALDCVDCAYGRWEFPELLHAAQFVYERWKERGLRQFFIEDKA TGTPLEQTLRRQGVPAYGWRPADFGFPDDKVSRVQESAWVVAGGRVRLPRGAEHAQVLVD EAARFMPDMSHAHDDHVDTLTMAVSIWRYAGGQA >gi|316924833|gb|ADCP01000012.1| GENE 8 3764 - 5221 1440 485 aa, chain + ## HITS:1 COG:XF1571 KEGG:ns NR:ns ## COG: XF1571 COG3567 # Protein_GI_number: 15838172 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 60 481 60 464 467 132 28.0 1e-30 MSRFPIRSSRRVVVRRQPIRNMMLDGGGASGNGDRGALQHAAQRTSNPYYSNNFLYRWQE YTRWYMTSWEARKIIDIPVDDALRLPFEITGVDTALSTDLRSAYEAFDLDRQNRRALIQE RLYGGCCQTIVIKGEEDERLSDRLSLERIRRGDLEAFNVVDVSRITRPDYDQNPFSAGYD RAERYIIQGVEADVSRLVVFDGSPLINRAAMNILQNFRYNPAGFGESKLAPLYDLLVRVV GTQQAAYHLVNMASVLLVRTSNLMALQATDSPALAKLEEICKQISLYRGAVIDNPNADVQ QHAASFGSVPELVMSFAQLLSAASDIPATRFLGQAPGGLNATGESDLQNYYNMIDAFQRL RIKPVVLKQLSVIGPHLMGFERWRAASKSLDIVFPPLWNESSQVKAENARTYAELFRVLY ADGVIKRDVVVKELIQRGVFQTGEQVKDFLAEEADSPDAADLMQPVDPSGPLAELEKLAA SGRAV >gi|316924833|gb|ADCP01000012.1| GENE 9 5221 - 6096 781 291 aa, chain + ## HITS:1 COG:XF1574 KEGG:ns NR:ns ## COG: XF1574 COG2369 # Protein_GI_number: 15838175 # Func_class: S Function unknown # Function: Uncharacterized protein, homolog of phage Mu protein gp30 # Organism: Xylella fastidiosa 9a5c # 78 264 103 273 281 111 37.0 1e-24 MPSMIVLNAKRTRKLPGVRSPKPVEARARRAIDSLMAPMLADVAARLAGLDSMTRIADVI LALREAQEVWERAFSWKAEGLAGRWALDVDVIAKEKLGRAVGKALGIDMAMILDAPDVAV SVRLASMEAAQLIRTIPTDYIGRISRAVYQSMRQQPFPEGRTLAEEIRLIGGFAEERARV IARDQTSKMNTNINQIRQTALGIQEYIWRTSEDSRVVGAPGGKYPQGNAMHGNHYVRNGK IFRWDEPPDDGHPGWPLQCRCVALGIVDRSQLREVAQVGKWSGPQTGWRLV >gi|316924833|gb|ADCP01000012.1| GENE 10 6093 - 7307 1318 404 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKAFRNRQRIANWREDENGLLTVTVCVLKEGVYPYGADECEGLPDTLAGRGTVMEFIPAA EFTPEAMKTLEGKPATIMTDEEDAHEWRTPDNAMKDGLTVGAVAGTPWVEGDELRCDLLI 
SDRDAIEAVKRGELAEVSAGYEGEITYGDGTFSGETYQARQSDFRFNHILLLPSGAARLG PDTRIINKRTTDMAEYTVKQHFKNGARTFRFTNEADKEEAERMANEAAEEAKVSSAEEVE NAMKRCNELKEEIDVKNAELEEAKQVIVDYKEQIDKLLDPESQEQLAAELLDQAAAEEEI MESEVDESERDELENRVKNCKLRAGRRKAIVAHVMNKRGVAVDDNWTDEGIGAAFAVLAA SAKAKIGNRKPRKAPGSEPVKVKNANIGDPFARMMAFKNRKGDK >gi|316924833|gb|ADCP01000012.1| GENE 11 7308 - 7880 665 190 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSVYTGSNNGFMQSTYVDQQGTALPGDLAYASDVDLIDACVVSMPAGSEGDLLPVGVGVV GAYSADASRPGMTSVKVSPVGADTTAAQLYGVTVRNQQCRTDGNNVSGWGDGDVCNVMRT ARVGGRIWVTAGNAATANTAAHLVVKDTTSHGLPVGSFVGTEITGDTVALTNVQWVTAAS AGSLGLLEII >gi|316924833|gb|ADCP01000012.1| GENE 12 7897 - 8862 1337 321 aa, chain + ## HITS:1 COG:mlr8009 KEGG:ns NR:ns ## COG: mlr8009 COG4834 # Protein_GI_number: 13476628 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mesorhizobium loti # 4 321 22 339 339 70 24.0 4e-12 MPITYGSHENVTASDIAFSIHTAVDAAFYDVLYPEHEWYNVLSEDQIMRDINPGATQYAY ISRDRQGAASFIGNGPNANIPMVAQSAGAVNVPVAYAAVGAVITNEDAREYTFGFNGNLS QDLGETMRVACDNLTEQSFFYGNADMNFQPFLNYAGVTATTVPQGESGKTEWKDKTPLEI FNDINNALTKMWQDSRTIFKPTIIYVPLAQYALLSQPAVIGGVGMLTSIKEYTIANATVA GIAGKPLEILPLRYLAGAGASGADRMIVWDRDRRNQCMPMPMPYTLQAPVPAPLAAEFYA EMKIGSFHVRQAGSIAYYDGI >gi|316924833|gb|ADCP01000012.1| GENE 13 8873 - 9211 428 112 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSSVITNRMDRPWTFVAPGWAFSFTLKPLEMRVLSDEEAKAVSANPSVARLVDSGLLAMD TPKKDAETVNTVAQTQKRVKDAEKSEVLTAPSGNKAATVKTEIKGTSTVLVK >gi|316924833|gb|ADCP01000012.1| GENE 14 9222 - 9620 515 132 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSVTVEAFRAAFPEFADVPDAAVAGALETASLLWNPGRLGKFWERIVMLSVAHRLAVRFN IGRALNAAGMKGAEAGVVNSQSASTSSLSVTNANNGMVTGTDPFQADYARTSYGLELLSL LELIMPRGYVVK >gi|316924833|gb|ADCP01000012.1| GENE 15 9622 - 10101 520 159 aa, chain + ## HITS:1 COG:no KEGG:PAU_03111 NR:ns ## KEGG: PAU_03111 # Name: gene0057 # Def: hypothetical protein # Organism: P.asymbiotica # Pathway: not_defined # 12 157 11 154 156 62 31.0 6e-09 
MISIKLNRKNPGGLKGFSKRLEALAGKEVAVGFPRGGSGLGNPHYKNGASILLVAVANNY GLGVPRRAFMELAAEKIRKWFPEYMRTAMPDAEAGNTDIHDVLETAGSVGTDLAKEAIAD GDWEPNAPETIKRKKGSDKPLIDTDAMRQAATWQVRDRS >gi|316924833|gb|ADCP01000012.1| GENE 16 10113 - 10475 439 120 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDFSSVLDSFSRPVIVTDTTGAHVNGVWVEDGPGTTRTVSAIVLAMSFEELQFYAEGDSS AAGITLTTDEELFFTDINAEGLERRQSYVEYGGYRFRVAGTGFMQKNTLHNIYACVRYFE >gi|316924833|gb|ADCP01000012.1| GENE 17 10472 - 11029 595 185 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSGPLVKTLTVNAVNTLLADYLTSVFQWEAGRVVVETQAGPRPPSGVYATLWWKGQELLQ QGMGDFTLPESPEDEGVQSLGNEALCTVQVSVRGPDAYSLASEARYGLEAAERFFDLWRV LGFAGCGPVTDLSGPLGGRIQQRAFFDITFYALFGRAYPLEWFDASQWAINNESLQLPKE EAPCL >gi|316924833|gb|ADCP01000012.1| GENE 18 11020 - 12513 1931 497 aa, chain + ## HITS:1 COG:no KEGG:Bcep1808_4542 NR:ns ## KEGG: Bcep1808_4542 # Name: not_defined # Def: hypothetical protein # Organism: B.vietnamiensis # Pathway: not_defined # 12 495 6 489 491 161 29.0 6e-38 MPVSPVVCPKEPLSRDLDVSVSISRPITEIATDMTMICFVTPDVEFSPGNGRVQFFSTMK ALSAAVPANSAAYWAGNAFFSRDDRPKTLAVGRVFTEPTAAELTGGQVALSGLANVTDGA FDIEVNDTLVSVSGLSFDGAPTVAEVAEVLNMAMASKGVTVAASGNALRLVTTQAGDGAS LGYAAAPSSGTDVSALLGLTSAKAASNIPGYTPGDLVSEIGLIQTAARCSGNAVYGWAID AQYRDTDEQKAVADWAEGQSPAIFGACTNAPNAYDTANTTNIGYYAMNSGYRRTFTFYHD NPQVYPEMSYLALALSVNYALNNSTLTMKFKQLPGISTVPLTETQLAALESRRINTYVSI GNTSSVIREGVQAASDWFTDSLVNLDNYKEELQVEVYNVFLRNKKVPYTQAGQNLLVSAA AKINRRYTDNGTFAPRDVESDNTETGYDTLPATSITPASVAGATMSERAARIAPPIAITA YEAGAFHSVAIAVSVYN >gi|316924833|gb|ADCP01000012.1| GENE 19 12526 - 12969 674 147 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKVYNQKNMSLTIDGVNIQDFHEGATFVYTWDGGEVDKTQGTDGAGINIATNQGATLQF TLRETSRSIKFLSDLRLRQENGGAGVTVVARTGADILLTMTEGYISRPGQLSTGDKKQGS MQFTITSAEDETANLSSSSEQLLSSLF >gi|316924833|gb|ADCP01000012.1| GENE 20 12997 - 13419 506 140 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNVSGLGSFSLDGVTYRFEALNPLEAMRFGNRVFKVFGPALLSLASAKKDGGEMAGEGVL ASLAPALSEMDEDKVSLLVEEALRRCYTPQNEALRDEVVFNRWFMEHPDQLYGAGLLAVW HLVKDFFPKTLATLKVPSLT 
>gi|316924833|gb|ADCP01000012.1| GENE 21 13625 - 15340 761 571 aa, chain + ## HITS:1 COG:no KEGG:PD0969 NR:ns ## KEGG: PD0969 # Name: not_defined # Def: hypothetical protein # Organism: X.fastidiosa_T # Pathway: not_defined # 1 323 1 331 629 75 25.0 7e-12 MIVEELVTLLGVELSPGAKEKLQAFDKGLDAVVSRVKQASVVLTAAAGGMALYFSNAVNG AADLQMLSDTTGVSTTKLQEWAYAANAMGVSASAVQSDLAKMEKQARWTGRTLESYANQF KGMDAATANIWGDAYGLSPETVLLLRQGADGIAKLKAEAHDVGAIIPPETVKRASEFKAQ VVQITTMIRGMATTIALAALPQAERLVSTFKGWISENREWIRLGIGKIIEGMGTAFERVW KAGKRLLDWIKDALGPVGDFFQKINKGVDWSKLLTGALVLLLAVFAPLIAKVVLIGAAFA VASAIVEEFLSFLEGKDSVIGRFLKDFEERFPALFELFKKMGTFLKNDLVDKLESISDLM GKILDIANKLVEASSKIAEGAAEYLGIGPSKEEKEQREQAYANAGKAKKEGDGFFSDDPV WRADYQQKKENSSLDKPSPPSSMSTQKEGEGGKAVQPVSLQPVEKSTTPEFHQKDEKKQL ISPVLLENINALADQLAVYSRQRTGMQTSLEGQSVIRMPQKNGPTISDNRSFTVYQTITP SDPQQAADMSAKSFKDIAQIVTPGMNAPVVY >gi|316924833|gb|ADCP01000012.1| GENE 22 15353 - 15919 363 188 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MADSSTSSTVQSGEPAAIVRKGVAIAGIQVSVKKSEAHTYTSQATELAMESGATVTDHVI LKPLTLAVTVAMTNAGDGREAARDTFESFVEMRKKREPVEVITEHAIYTNMVITSLTPTH SAPYKGALEIGITFQQVNFVELQSVGRSPSALKGSAKKTGSAPVQSGKVEAKEADKGIIT QMYDAWRK >gi|316924833|gb|ADCP01000012.1| GENE 23 15979 - 16536 123 185 aa, chain + ## HITS:1 COG:no KEGG:DEFDS_0878 NR:ns ## KEGG: DEFDS_0878 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_SSM1 # Pathway: not_defined # 9 132 21 144 197 85 36.0 9e-16 MCILLSGVAHAGPFGTSEGDGLDRYPGAESLGDNTYVVYSLPKSHPDFDTFLLTIPEKYG LVKIQALSKTIDNDSAGEKTRALFSKIQKQLESKYGEPEYVFDDIKPDSIWKKTTEWSIA LAKDERALRARWKVQNDKDHIIIIGLSAHAKQKGMSTENAVHLFYFYEKMNEYMKEKAAE ETGSL >gi|316924833|gb|ADCP01000012.1| GENE 24 16571 - 16783 107 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWGCFWGGFLAFLAERVFPTHVGVFLITQLCNFFDTCLPHARGGVSELAKLLGYKDESSP HMWGCFCRSF >gi|316924833|gb|ADCP01000012.1| GENE 25 17233 - 18105 369 290 aa, chain + ## HITS:1 COG:YPO2093 KEGG:ns NR:ns ## COG: YPO2093 COG3617 # Protein_GI_number: 16122332 # Func_class: K Transcription # Function: Prophage 
antirepressor # Organism: Yersinia pestis # 1 140 1 137 187 106 39.0 5e-23 MTTFLCFNDFTFSPVTRDNQPWFKSSEIARALGYKREDFLSKLYRKNADEFTPDMTQVVE NRAERRNGVPGNLSDGRVRIFSLRGCHLLAMFARTPVAKAFRRWVLDVIEQYGDRVPAEQ PVTLSTPSTPADRKPLRSLVNAWAKLANVHQSTLWPQVRAHFQLERIDDLPVEWLPDALA WVQGKIDELSRVPEVKVLSCEARLAQLDAQLDALRKHVEKERSDIYSGLLTILDPRIYGF EVWDRCHEALWPMLDSCLVTPWGNKMHDPVRFVRFIIQMHGNHTGAINAR >gi|316924833|gb|ADCP01000012.1| GENE 26 18115 - 18369 212 84 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLTTLDAARGMQRKHSKLIRDIDRVRSILPPDFAATAFTPDAQTSAAGKRQRFFHLTRDA LPFLFMGQATKHEILWMMDVIKAM >gi|316924833|gb|ADCP01000012.1| GENE 27 18408 - 19298 109 296 aa, chain - ## HITS:1 COG:no KEGG:Shew185_4132 NR:ns ## KEGG: Shew185_4132 # Name: not_defined # Def: hypothetical protein # Organism: S.baltica_OS185 # Pathway: not_defined # 7 285 3 276 281 135 29.0 2e-30 MKNIDTLYKYYSNESWKNVFGEWTIRFTPPKDFNDPFECMPYAENIYPEKELYNMVECQF ESIINDEYKNGPEIIKIAMTLDKFKSYAYNMYESNIKEQILSVCRDKVTPLFLKKMSEFA HRDIGILCLTKNQNNLLMWSHYGNSHKGFNVGFNINNKLFNKRKNNEDQLRHIREVHYTH KRNIKYLTNIDDISNYILLKGKIWEYEEELRMPVYIDEKEDFEWIKQDVCKGICRLPKDA VKSILFGSAMPEDEIKERCREIRSQADCDHILLQRATLHPSDFDIVINPVHESFYM >gi|316924833|gb|ADCP01000012.1| GENE 28 19475 - 19855 415 126 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAYLLPLTSDGERTFSVVLGSNTYFFRSYYVRGQESAWLLDISDAGGSMLASGVKLVPGS PNCLAGYGDAFNGENIVVTLSRGKPGDEEAPGDTLNVLWFPEGEESPFTLGDPMETLGEA IRLAGE >gi|316924833|gb|ADCP01000012.1| GENE 29 19852 - 20775 763 307 aa, chain + ## HITS:1 COG:no KEGG:PD0964 NR:ns ## KEGG: PD0964 # Name: not_defined # Def: hypothetical protein # Organism: X.fastidiosa_T # Pathway: not_defined # 46 293 21 258 276 89 27.0 2e-16 MMAAEKKSSTPNRPFLRRIVVTLGPLEEWRGKSRGEIVQFKSDGTLEGLRVTGTFQKTLM GMPQPSQISIYNLSRDTRNAIKGSLTKITVEAGWNNTDLRKVFQGSIMSSSSERNGPDIV TKLVALPGYGSLVRGVSSVTFGAGTPVSVAAQKLASDLPGMTVQGGNFQGVAGNIGPRGW SYAGATKDGLTRLGEEHGFSWSVQDGEVTAIGDRFMLGSYVELNGENGGLISIAPTLTGP MQIQSGVKIKALYVPGISAGHSIKVTSTLNPRLSGTYRIHTMSINIDAYSEAWTMDIESF RFPPGKK >gi|316924833|gb|ADCP01000012.1| GENE 30 20775 - 21557 719 
260 aa, chain + ## HITS:1 COG:no KEGG:plu3029 NR:ns ## KEGG: plu3029 # Name: not_defined # Def: hypothetical protein # Organism: P.luminescens # Pathway: not_defined # 23 258 25 250 251 93 31.0 6e-18 MADYSVTSESENLRLQMRRMMDGLHVAMPAKVLAFQPGPPVRVTVQPTTQMKITLGEEVS YRSLPQLSGVPVVLPFAQTAGFLLTVPIQSGDTGLLVIPDRGLDNFLQAGDVAAPPFYGD PTLIQPRGHSLTDAIFIPGLSSDAAQIADYSTEAIELRDRERKSYISLGPDGITMTDGTA VMKMSGGKLETTAPSGIAMQTDAQCTVRSQNMDLSGEGNTFSGDCRSTNGTFTDKDGVVL GTHTHEGVQTGDGTTGGPVK >gi|316924833|gb|ADCP01000012.1| GENE 31 21586 - 21969 538 127 aa, chain - ## HITS:1 COG:mlr2150 KEGG:ns NR:ns ## COG: mlr2150 COG1598 # Protein_GI_number: 13471996 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 2 127 3 128 133 95 42.0 2e-20 MRYPVIVHKDGGSDYGVTVPDFPGVFSGGGTLDEALANVQDAIETFYEGEEVERLPDPSP LESVLASEDAEGGAVVLVEVNFDFLEKKAVPVNITVPLYLRNRIDRAAKARGMTRSAFLV RAAQAYM >gi|316924833|gb|ADCP01000012.1| GENE 32 22009 - 22191 146 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTARELIKKLEEAGFVNKGGTNHDKMVHPDGRVTVIHRHKGDIPLGTLKAIARQTKIKLP >gi|316924833|gb|ADCP01000012.1| GENE 33 22333 - 23622 654 429 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|266621163|ref|ZP_06114098.1| ## NR: gi|266621163|ref|ZP_06114098.1| spore surface glycoprotein BclB [Clostridium hathewayi DSM 13479] # 175 319 1 193 330 113 44.0 3e-23 MAWDLEFKNGDLTGGIVTGDDEVIQRIRTRLFRELGEWFLNTASGLPWYQDGKGILGSPL RDKGAVDLFIRKQALGTEGVSRILKLNSLFAAGQREYSIYMQVLLDSGKLVEETVTVHES VFTAPGMTMSDLTKLMAQNILFSDGETLQQKLENGAFVGPEGKPGEDGTAATLSVGKVTT GAPGTEAAVQNTGTDKDAVLWFTIPRGEQGIQGEQGMQGIPGIDGVAATVAVGAVQTGEA GSPAEVTNSGTENEAILNFVIPQGVRGEQGVQGKAAQISIGTVITGEPGTQASVTNSGTD EDAVFDFVIPRGNTGKTGAKGDKGDKGDTGATGQQGPKGDKGDSGLSYSVKGAVVSGTTY AANDIVFDPESNTAYLVLKSARWPFSVVAVSEGAVTSSTQAIILSRAASASALKDGKLDS SLLNGALWR >gi|316924833|gb|ADCP01000012.1| GENE 34 23627 - 24865 1042 412 aa, chain + ## HITS:1 COG:XF1704 KEGG:ns NR:ns ## COG: XF1704 COG3299 # Protein_GI_number: 15838305 # Func_class: S Function unknown # Function: 
Uncharacterized homolog of phage Mu protein gp47 # Organism: Xylella fastidiosa 9a5c # 44 410 40 387 387 112 29.0 1e-24 MASSSYGMTLAGFIPKRLADIQNDMNASIALIVDPHTGEYPFQNVTDDAVLQQVVGVFAS ALEEAWEAAYEASVQFDPQKNTGAGQSGTVQLNAITRKAGTKTILTFDLTGTPGVLVPAG ALIASASGETAYALQENVIFPAVVEGQRTSHTTARGVCTEYGAFDPGPGTVNTIQTPVAG WFNASNTATESIGTAQETDEELRKRQQRSTQLTSYRQIDAIYASVLAVEGVTYCRAYQNA TAYPVDDRGIPFKEVAVVAEGGDPAAITDALFLRFPVGVIGHGSISITKYDQQGVGYPIS FSRPTPIPVYVNVIVEITNRSEFPDNGIQLIKEAIVAYAQYGDTSNTEGFPPGEDVIRTR LYTPINSIAGHEIILCEIGTAQGSLSEENIPIAWNQVATFDVGDITVTVRGQ >gi|316924833|gb|ADCP01000012.1| GENE 35 24868 - 25596 453 242 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPVPSRLEVSFGQYTKSLIDEALAKLPSQFLTSCMLRQLVAVFVGEVQELYDAVLGMQRG RTLYAAEAANLDALGRIVGEERAPYQYSDSHWFAFDRMGQAWDSTPWWCRDAPLVEYIPA EDPVYRTNILTRIIANHTLVASVPELDGLIRLVAGNSVSFEKTGPMQVRLIAPSTISTTN LVKLTTAVTTRRVDDEYAVPYPATLHVSGYIVFVPVSFFSFDKSNAQRWDSGRWASKGNL FL >gi|316924833|gb|ADCP01000012.1| GENE 36 25605 - 26657 192 350 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302859634|gb|EFL82713.1| ## NR: gi|302859634|gb|EFL82713.1| side tail fiber-like protein [Burkholderiales bacterium 1_1_47] # 58 350 60 348 348 96 33.0 2e-18 MALSGIQRTFAGVLNSIWGKNAQTTIPTQPVSGIAYRDTDAEFESGQKYDSLGESSRWNQ LLYLLSGLSEECSRFGILPWSSAQPYSKGALCIGPDGSLWQAQVAIAVNPTNPVTPGTNP AIWDVPLFKGKKAGGDTPPGSVIPFYSVNLGGPDNRNPIFWGESEPDTGWLICDGGSDGR GGSVPDLRNRFVMCSSYSGDAGQTGGAASVTPTVSVQNATQGGSIGNTAAGGGIAASGTG VGVQGTTLDGNTLPSHNHSMISCIGASNQGVQNFLRSESTSAAWVSATNYTGNSWAHAHG IYDPGHSHGFSAVAHSHTFTGTAHTHAATANAISTVPPYYKMAYCVKLPE >gi|316924833|gb|ADCP01000012.1| GENE 37 26687 - 27436 234 249 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0771 NR:ns ## KEGG: Ddes_0771 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 139 111 251 329 79 35.0 1e-13 MPFLSNIADRGLVLHVDGQKKLTSSNSDYELLNTKIAHPSTVYYNTLTSYPDSIPLQFKN KGAELAFFHGLYDISGMRKAYPLMGFDKLALCEELVAAGYQLNNATLQHIDDRDLFAYAA KQVPRDKPFFHVIITMNMHDPAKNNINPLFKGDADEGFFTNARITDDGIRAYVESLPAGT 
DVLIMGDHTPYHGPRSVHVPLILYRTGEPVPMKSPDISLTRCEASQYLRRVFDLPPVSPD IPVVEEICR >gi|316924833|gb|ADCP01000012.1| GENE 38 28220 - 28744 442 174 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212703059|ref|ZP_03311187.1| ## NR: gi|212703059|ref|ZP_03311187.1| hypothetical protein DESPIG_01098 [Desulfovibrio piger ATCC 29098] # 5 171 4 160 165 129 49.0 1e-28 MKQDVIVVPSDRLIIIDGIPLQFDFPAPKNMHALQWHEGSGHIEWTDDINHPLTPDDYAA DVAPFVALWEAEKARLDEEAAAAGAARIAEYNSEPARAARIRAERDRRLDATTWLVERHK EQTAGNIETSITGEDYAALLTYRQALRDLPQQEGFPWEGPDDPACPWPIEPEKV >gi|316924833|gb|ADCP01000012.1| GENE 39 28986 - 29511 177 175 aa, chain + ## HITS:1 COG:YPO2050 KEGG:ns NR:ns ## COG: YPO2050 COG0500 # Protein_GI_number: 16122289 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Yersinia pestis # 1 149 27 185 267 59 24.0 2e-09 MIGAEQDKRQDWTFQGFAPQFDAHVREQLPWYGMATESVALIARHYIPRRGLVYDLGAST GNVGRALAPTLTERNARLVAIDEVPDMVKVYNGPGRAVMADVARYPYKPFDVAVCFLVLM FLPVPERRRLLATLRGKVKPGGAIIVVDKEESPGGYPATVLSRLTWDCKLRQGAE Prediction of potential genes in microbial genomes Time: Fri May 13 01:54:43 2011 Seq name: gi|316924788|gb|ADCP01000013.1| Bilophila wadsworthia 3_1_6 cont1.13, whole genome shotgun sequence Length of sequence - 57572 bp Number of predicted genes - 47, with homology - 39 Number of transcription units - 30, operones - 9 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 414 - 465 14.0 1 1 Op 1 . - CDS 489 - 578 90 ## - Prom 637 - 696 2.0 - Term 756 - 797 7.2 2 1 Op 2 . - CDS 801 - 1667 637 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 1710 - 1769 1.8 3 2 Tu 1 . + CDS 2336 - 3940 1989 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold 4 3 Op 1 . + CDS 4049 - 5284 1141 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 5 3 Op 2 . 
+ CDS 5344 - 6375 1509 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component + Term 6453 - 6484 1.8 6 4 Op 1 11/0.000 + CDS 6523 - 7557 255 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 7 4 Op 2 11/0.000 + CDS 7609 - 8115 167 ## PROTEIN SUPPORTED gi|239995925|ref|ZP_04716449.1| ribosomal protein S3 8 4 Op 3 . + CDS 8119 - 9387 658 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 + Term 9396 - 9458 20.0 9 5 Tu 1 . + CDS 9475 - 10521 1379 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components + Term 10545 - 10587 7.2 + Prom 10558 - 10617 3.8 10 6 Tu 1 . + CDS 10715 - 11470 583 ## COG1145 Ferredoxin + Term 11581 - 11616 1.4 11 7 Op 1 . - CDS 11731 - 12648 356 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 12 7 Op 2 . - CDS 12645 - 12959 406 ## Dde_1202 thioredoxin, putative - Term 12987 - 13027 2.8 13 8 Op 1 . - CDS 13112 - 14695 2011 ## COG3119 Arylsulfatase A and related enzymes 14 8 Op 2 1/0.143 - CDS 14743 - 15684 1327 ## COG3181 Uncharacterized protein conserved in bacteria 15 8 Op 3 . - CDS 15710 - 17206 2020 ## COG3333 Uncharacterized protein conserved in bacteria 16 8 Op 4 . - CDS 17223 - 17681 679 ## 17 8 Op 5 . - CDS 17712 - 17984 329 ## - Prom 18079 - 18138 7.9 + Prom 18061 - 18120 8.4 18 9 Tu 1 . + CDS 18259 - 19833 1284 ## Maqu_0412 restriction modification system DNA specificity subunit + Term 19856 - 19903 14.3 19 10 Tu 1 . - CDS 20257 - 21264 1083 ## COG3938 Proline racemase - Prom 21327 - 21386 7.1 20 11 Tu 1 . - CDS 21803 - 24478 1632 ## COG0642 Signal transduction histidine kinase 21 12 Tu 1 . + CDS 24856 - 25080 82 ## - Term 24788 - 24826 -0.5 22 13 Tu 1 . - CDS 24985 - 25986 1247 ## COG1609 Transcriptional regulators - Prom 26138 - 26197 7.4 23 14 Tu 1 . + CDS 26452 - 27780 1835 ## COG0477 Permeases of the major facilitator superfamily + Term 27815 - 27851 -0.2 24 15 Tu 1 . 
+ CDS 28095 - 28919 735 ## COG1409 Predicted phosphohydrolases + Term 28948 - 28989 8.4 - Term 28937 - 28976 3.4 25 16 Tu 1 . - CDS 29094 - 30440 635 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 30506 - 30565 7.0 - Term 30634 - 30673 8.1 26 17 Op 1 . - CDS 30705 - 31763 1514 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 27 17 Op 2 . - CDS 31774 - 32727 1129 ## COG0010 Arginase/agmatinase/formimionoglutamate hydrolase, arginase family - Prom 32802 - 32861 5.7 + Prom 32747 - 32806 9.2 28 18 Tu 1 . + CDS 33010 - 34377 1275 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Term 34578 - 34625 -0.2 - Term 34564 - 34613 4.0 29 19 Tu 1 . - CDS 34729 - 34983 448 ## - Term 35396 - 35444 6.1 30 20 Op 1 . - CDS 35508 - 36356 1048 ## COG0489 ATPases involved in chromosome partitioning 31 20 Op 2 . - CDS 36447 - 37409 1136 ## COG0731 Fe-S oxidoreductases 32 20 Op 3 . - CDS 37473 - 37781 326 ## 33 20 Op 4 . - CDS 37778 - 38149 290 ## COG1342 Predicted DNA-binding proteins 34 21 Tu 1 . + CDS 38525 - 39415 978 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase + Term 39436 - 39473 5.0 35 22 Tu 1 . - CDS 39584 - 40492 1449 ## COG0679 Predicted permeases - Prom 40576 - 40635 4.3 36 23 Tu 1 . - CDS 40748 - 42154 739 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 - Prom 42240 - 42299 3.2 - Term 42298 - 42354 2.1 37 24 Tu 1 . - CDS 42378 - 42788 500 ## Ddes_1698 OsmC family protein - Prom 42923 - 42982 1.9 + Prom 42744 - 42803 2.4 38 25 Tu 1 . + CDS 42990 - 44150 1091 ## COG2199 FOG: GGDEF domain + Term 44190 - 44241 18.2 - Term 44181 - 44226 13.1 39 26 Op 1 . - CDS 44274 - 44474 308 ## - Prom 44543 - 44602 2.3 - Term 44567 - 44599 1.1 40 26 Op 2 . - CDS 44765 - 45217 -170 ## 41 27 Tu 1 . - CDS 45353 - 47755 2198 ## COG0642 Signal transduction histidine kinase + Prom 48300 - 48359 2.6 42 28 Tu 1 . 
+ CDS 48407 - 49171 384 ## gi|302861788|gb|EFL84723.1| sigma-54 dependent transcriptional regulator/sensory box protein + Term 49234 - 49266 -0.9 + Prom 49203 - 49262 5.2 43 29 Tu 1 . + CDS 49290 - 53039 3735 ## COG5013 Nitrate reductase alpha subunit 44 30 Op 1 12/0.000 + CDS 53368 - 54837 1531 ## COG1140 Nitrate reductase beta subunit 45 30 Op 2 12/0.000 + CDS 54834 - 55391 660 ## COG2180 Nitrate reductase delta subunit 46 30 Op 3 1/0.143 + CDS 55664 - 56326 554 ## COG2181 Nitrate reductase gamma subunit 47 30 Op 4 . + CDS 56355 - 57383 1030 ## COG1275 Tellurite resistance protein and related permeases + Term 57392 - 57435 6.2 Predicted protein(s) >gi|316924788|gb|ADCP01000013.1| GENE 1 489 - 578 90 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSMFPIMIIAGAFMLISVVVGIYGVRKYN >gi|316924788|gb|ADCP01000013.1| GENE 2 801 - 1667 637 288 aa, chain - ## HITS:1 COG:PA0163 KEGG:ns NR:ns ## COG: PA0163 COG2207 # Protein_GI_number: 15595361 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 24 266 24 265 265 158 37.0 1e-38 MALRQHYWNPRTNEACFVDETQLIYAFNERQDAGTWSDLHVHPDWGELLFVATGSIVLCS ETGNFLGQGYRATWVPPGVQHEWYLPESSWNRSLFFHASLFEDSPRFRQCHGLEMSPLLR ELLFAVDDLKPDFTTEEGKRFALVLMDRLKASKEVGGPLLMPSEHRLVELCAAALAAPDA PICMADWSRHLGMSEKTLARLFIRQTGQTFGRWLQIMRLQHAMTEIEQGQSVTAVALDCG YNSVSAFISAFKKHFGSTPGAIAKRRHERETDEVARPRPFRLPFPPDY >gi|316924788|gb|ADCP01000013.1| GENE 3 2336 - 3940 1989 534 aa, chain + ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 6 532 16 548 553 269 32.0 1e-71 MLVSKKTVYVGADIVTMSEKAPKAEAVCIADGRIECVGSREEVLGYAKAGEYDVVDFGGG TLYPGFIDTHSHMSSFSRCLNQVYCGASLGSIAAVQQALRDKAAASDDEWVIGYGYDDSG ISDNRHMNRHDLDAVCADRPVMVVHISVHMGYVNTIGLERLGFTADTKVPGGEIVLDGNG 
LPTGLLLENAIIEASGRLPVPTEAQVRESLVRAIAEYNKKGFTTFQDGGLGINGEAEVFL RPYMELAREGRLNARAYLQLLPSEMDKLIPLGAWGIGNDHMKIGGVKYFTDGSIQGFTGA LLEDYYTRPGYKGALLWSQEEIDEIIIKYHCLGFQVAVHTNGDAASESVIQAFEKAVERC PRTDLRHMLIHAQLVSDSQLERMKACGIIPSLFARHIEVWGDRHAAIFLGPERTARMDPA GSCVRLGMPFSLHVDTPVLPVTALGSMHAAVNRISDGGVLFGGDQRITPRQALEAYTTYA SLCCGGEHDRGRIEPGRFADFVLLDSDIEAVDPSGIRDIKVLKTICGGRVVYEA >gi|316924788|gb|ADCP01000013.1| GENE 4 4049 - 5284 1141 411 aa, chain + ## HITS:1 COG:lin0541 KEGG:ns NR:ns ## COG: lin0541 COG0624 # Protein_GI_number: 16799616 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Listeria innocua # 4 407 6 409 414 226 33.0 7e-59 MQAQRLFDTLEELSTCNATPGNGVTRFSWSEADSRARRVLERELKAIGLEPWTDGMGNLH ARIEGSTGAPAVLTGSHLDTVRNGGRYDGTYGVVAALEALRSFHDEGYKPERAIEFIAFA EEEGSNFGSTCLGSKGIAGQIDVEGLKRLSNAEGAAYDALRAFGLDPDALPGEQIDPANV KAFLEVHIEQNAMLEDAGRELGIVTAISGMRLHRIVYKGHSDHAASPMQGRRDPAAGFAE LAFRMEQLWKDGDLPEDFSCTIGELACSPNVGIVVPESVTFTIDIRHVDVPVLEEGWQRI ESLVRSVAESRGLDLELVRLSASGGVRTSSEVGEVFRQAAERRGVQPLFLKSGPAHDAAS LGSRVPVGLLFVPSIKGLSHCPQEATAPEHLALGAWVLEDALRSLAGGGRP >gi|316924788|gb|ADCP01000013.1| GENE 5 5344 - 6375 1509 343 aa, chain + ## HITS:1 COG:SMb20036 KEGG:ns NR:ns ## COG: SMb20036 COG1638 # Protein_GI_number: 16263787 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Sinorhizobium meliloti # 48 272 50 273 338 133 33.0 6e-31 MKRFLVAFLVCTAMLLAQTQAQAASKITVIVASSVDTSEQNYMNVTYKKLVELAQKYSDN AFDFQLFPSMQLGDEQETVRALQLGTLQMSMLATNNYAPFAPSCGWVNMPYLFDTLDEFR ALLDAMWEQNNEWAVKEGGCRILSILDIGYRQLTTSAKHPVRKLADAKGIKIRTPQNALM VSTFKALGLEPTAVSFSETFNALQQGVVDGQEGCFNNVVTLKFYEAQKYATQINYSVHSA NIIASERWLQSLPEKERNALIRAGKEAMEYERTKVAGMLEADDKLMVEHGMELLGVVEDL PEWIRLGRTAWPKCYEVIGDGNAEKGKAVMDIVLAKKEALAKK >gi|316924788|gb|ADCP01000013.1| GENE 6 6523 - 7557 255 344 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. PR1] # 39 294 40 291 328 102 28 4e-21 MKKLLVVCLLCASVIGLAFGQAKAASKTTVIVASSVDANPQNYTNVFYNKFIELAQQYSD NAFDFQFFPSMQLGDEQETVRGIQLGTIHIANLATNNYAPFAPSCGWVNMPYMFDSLEEF RALVDAMWDQNNEWAIKEGGCRLLCILDIGYRQLTTSAKHPVRKLSDAKGIKIRTPQNAL MVSTFKALGLEPTAVSFAETFNALQQGVVDGQEGCFNNVVTLKFYEAQKYATMINYSVHS CNFIVGERWLQSLPEKERNALIRAGKEAMLYERTKVADMLATDDKLMQEHGMELLGVVED LPEWVRLGRTAWPKCYEVIGDGDAAKGKAVIDVVLAKKEALAKK >gi|316924788|gb|ADCP01000013.1| GENE 7 7609 - 8115 167 168 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239995925|ref|ZP_04716449.1| ribosomal protein S3 [Alteromonas macleodii ATCC 27126] # 1 159 1 156 161 68 26 6e-11 MKVLRWLDEWAEEFCVSIMLSLLILLLGTEVFSRFLLSKSFSWIEELCRYLFVWASYLGV AIAVKRKEQLRVLMLMDTLEKHFPRLVKVCYVVSELSFAVFCVLVFYYSINMLENMTRFK QVSASLEINVMYAYLIIPISMAVTTFRVLQGLYRDFRNHTLHFQGRGD >gi|316924788|gb|ADCP01000013.1| GENE 8 8119 - 9387 658 422 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. 
PR1] # 1 420 7 428 431 258 32 7e-68 MLLVLSFVFFMLLGLPLFGVIGLASIISMIGMPFPMEFIPQNLYTGMDQFPLIATIGFVM AGFLMEPAGITGEIIEVAKKAVGNIRGGLAMVTILACMIFAALSGSGPATTAAIGSIMIP GMIGAGYRKDDAAAVSATGGTLGILIPPSNPMIIYGVVANTSIAGLFMAGFIPGFVMTTS MCITAYCIARKRGYKGTGEKFSVYGLLKAIWDGKWALATPVIILGGIYCGVFTPTEAAEV GVLWTLFVGLFVYRKLTWKNISSALIRTSAFAGSATVLVGVSMAFSRLLTLYHIPQSVGA FLGSISTDPTITLLLIAFFIFLCGFVADTLAMVVVLAPVFLPITNALGIHPIQLGVLFVV CCETGFLTPPFGANLFITMKITDVKLEEVALKAFPYLCTIWLLIILIAACPQLVMFLPNL LM >gi|316924788|gb|ADCP01000013.1| GENE 9 9475 - 10521 1379 348 aa, chain + ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 7 251 3 244 332 70 25.0 5e-12 MIFGKMKSLLAVLALALLLAPAAQAAPAKLSSAWMGEHETFVAWYAKQQGWDKEAGLDLT MMPFDSGKNIVESLAAYDWAVAGCGAVPALTTPLSDYLYIIAVANDESANNAIYVRKDSP ILGAKGTNPSYPNVYGSAVTVQKADILYPKSTSAHYLLATWLRILGLSEKDVKLQEMEPT PALSAFTGGVGDAVALWAPLTYEAEAKGFKSVANSKDCGITQLVLLVANRRFADQHPEQV QAFLKMYMRGIEALRAKPAKELAVDYVRFYKEWTGRELTPEMAVADIQSHPVFTLDEQLA MLAPGGSVQKALNEIVDFSISHGSFTPEQIDKMKGKTQVAARFLEAIK >gi|316924788|gb|ADCP01000013.1| GENE 10 10715 - 11470 583 251 aa, chain + ## HITS:1 COG:CAC0885_1 KEGG:ns NR:ns ## COG: CAC0885_1 COG1145 # Protein_GI_number: 15894172 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 5 114 3 98 115 89 44.0 7e-18 MKEYRNIIEIDRERCNGCGQCVLDCAEGAIAIIDGKATVVSDSYCDGLGACLSGCPQNAL AIVRREAVPFDEEAAMRHVARQKEERRQQLFQPLGKHEPHGTLGCGCSGATVQEFAPRKR QESKAAGPVGERRQTWPVKLRLVPPSAPFLKGADILLAADCSAPASARFCEIASGKVVLI ACPKFEDNQEMRARLVALFTEARPASVTVLRMEVPCCRGIAAVCHEAAEECGISVRELVM GRNGELCQMKD >gi|316924788|gb|ADCP01000013.1| GENE 11 11731 - 12648 356 305 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 6 304 3 306 306 141 32 7e-33 
MNGHSDILIIGGGIGGMTAALYAARANLYVRIVEKEVCGGLVNWTHTVENVPSYKSIHGM DLMTACREHVGSLGVAIEEVNEVEEVRLDGQPKEIRTSEGDSFTADVVIIATGRKPIPLP VETAFEKVHYCSVCDGTAYKDKDVLVVGGGNSGFDESLYLAGLGVRSVHIVEMFPACAAA QSTQDRALATGIIRANVNTTITALDPLPDGRCRASLRDAASGAASEETVDGVFCFIGQKP NTALLEGLIDMEKGYIRTDEDMRTSLPGVFAVGDVRAKRYRQITTAMGDGTVAALEAERF IRSLR >gi|316924788|gb|ADCP01000013.1| GENE 12 12645 - 12959 406 104 aa, chain - ## HITS:1 COG:no KEGG:Dde_1202 NR:ns ## KEGG: Dde_1202 # Name: not_defined # Def: thioredoxin, putative # Organism: D.desulfuricans # Pathway: not_defined # 21 103 20 102 102 100 56.0 1e-20 MSEATLLDSVTFDEWIKSHDGIVLFHKKLCPHCKVMRTVLGKATAERPDIQLASVDSEEQ PDLMARCSVERVPTLIVCRNGAVAARKSGIMNPRELLAFYESAS >gi|316924788|gb|ADCP01000013.1| GENE 13 13112 - 14695 2011 527 aa, chain - ## HITS:1 COG:ECs0942 KEGG:ns NR:ns ## COG: ECs0942 COG3119 # Protein_GI_number: 15830196 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli O157:H7 # 8 394 4 367 493 117 26.0 4e-26 MREKKNVIVVMLDTLQFNYLGCYGNTRVKTPNIDRFATEGVVFENAYTEGLPTVPCRRAM LTGRYTLPVKGWSQLDVSDTTIADLCWGQPIDSYLIHDSGPFRLPKFGYSRGFDKVFFLH GHETDHQYYAQDELGGGLKAEDYYEDHVMEKADEILGENVMRPLMNQVECHLKERQYWKS DGDQHAAQIMKEAVRNLERVDRNKSFFMWIDCFDPHEPWDAPSVYDPDLKCPYDPDYEGK DMFLPIQGHVDGLYTDRELEHIRALYAEKVTMVDKWFGHLMNNIRTLGMEDDTLVILVSD HGSPMGKGEHGHGIMRKCRPWPYEELAHIPMIMRGPGLPRNRRVRGFVQSCDVAPTVVDW LGIGVHPSMQGHSLLPLAKGDVEKVREFAIAGYFRYSWSIITEDWSFIHWLKDDEKSVAD ARFGIYGKDLGESTAHILQMKKANAVEDRDTAFYNRAYEEHKKAATLDGEDQWTCTAAAS SEVPARDELYCRKDDPFQLNNVSAKHPEVAKELFEQLKLFMAELRAS >gi|316924788|gb|ADCP01000013.1| GENE 14 14743 - 15684 1327 313 aa, chain - ## HITS:1 COG:BH2007 KEGG:ns NR:ns ## COG: BH2007 COG3181 # Protein_GI_number: 15614570 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 24 309 41 332 339 147 32.0 3e-35 MFRKLLLALAAVCLMTGSAVAADAFPNRPITIVVPFAAGGETDLVARMLADGMSKELKQV 
MVVQNIIGASGVAGISAVTGAKPDGYTLGVTPSAPLAMHPHMRKVPYTLDSFDFVGRILK APYIVMVAKTAPWNTLDDMLRDMKTNPNTFFFASSGVGSVPYFAMMDLFKKAGAQVRHVP FSGDADAFQAIAGNRVQVYTSTAGSLDQYDVKGLALMDATRDPLLPNLPTVKESGVEAYY SQWMPLLAPKGLPADVKATLSDAMRKAMQSPEFVERLHKLNLVPGYLDSKDCRTFVEEES VRNAEVIKGLMAK >gi|316924788|gb|ADCP01000013.1| GENE 15 15710 - 17206 2020 498 aa, chain - ## HITS:1 COG:BH2009 KEGG:ns NR:ns ## COG: BH2009 COG3333 # Protein_GI_number: 15614572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 456 1 453 504 284 35.0 3e-76 MFSDIMHGIMITMSPDCLLALCIGAMFGTVVGALPGLGTAVAITICIPFTLQMGNASAIA LLLGVYATSIYGGSISAVLLNTPGTPQSACTGFEGYAMAKAGKAEKALGWVTISSVLGGL FSCVALVIAAPQLADLSVKYGGPLEICGLICLGLACITSLSEDNQIKGLLMGVAGLFLAT IGVDPLSGEMRFTFGSPQLVSGIDLMPIVVGVFPLAELFYRAYEVHANVEPTAIDCKRIS FPTWQEWKGRGLILLRSSLIGVGLGILPGTGATAATFVSYSSAKRLSKNGENFGKGEPDG LVAAESSNNAVAGGAMVPTLALGIPGEPVMALMLATLTLHGITPGVRLMADNPDIVYSTF LSLILANLLIIPTAIITVRGFGKLIKFPTAILLGIIVICSLLGVYLPRSNMFDVWMALLI GVIAFLMRFADFPVAPFLIGYVLSAQLEYRLGQAVIYKGDMPLTEYLFSAPVALVLFAVS ACLLLAPMLRPLWAGRRK >gi|316924788|gb|ADCP01000013.1| GENE 16 17223 - 17681 679 152 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLLSGRTALGLLFLGIGALLLRETYAVDVGAFAVPGEMGTMTYPRILLFGWIGLSILYLL NPGKPFNAHDLRASLPGVSKVVASIAVYIILFATVGLPLGTFLFLLLFFSLMRYRDIRRM ISFALLGAGLTWLVFEKILGVVVPQPFWVGLF >gi|316924788|gb|ADCP01000013.1| GENE 17 17712 - 17984 329 90 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSFPLSIVIWDYDPERIAEIDRNLHLALHELNLRGTVSSLSEPPLMSREGLVGREPVLEI GDAYWSLRPGETISKEACLKLLSRIVQEKG >gi|316924788|gb|ADCP01000013.1| GENE 18 18259 - 19833 1284 524 aa, chain + ## HITS:1 COG:no KEGG:Maqu_0412 NR:ns ## KEGG: Maqu_0412 # Name: not_defined # Def: restriction modification system DNA specificity subunit # Organism: M.aquaeolei # Pathway: not_defined # 180 503 256 567 588 97 29.0 9e-19 MLFDNVTLKKGMDFAAQLYAAHRQLSPIDLRDILLGCFSGLPIAGKNQCDHIRKNIVSPS LDGASPVLMLIAALSEMDRLTGWRQLLPIELADLMSHMLPGDEDAPVVCHGSRTLQFAVW 
FASKGREVCFTGESALVPHIAAIAGGQVKGLGPSSPYPANAGHIAIHDSDSRNAELGTFL RQLDHQSYQGAVVMTNWSFLSSASPRTLHLKQRLIENGSLYSVTQLPGGIIPRTLPALLQ LGPSEPGRTVRLVNAKDWCVPSRQGLEISYLDPILAQACDRPIGRLPRWTPKPPAENVDY EALILRGCDLRLKSSPLVQTSAPSLREESLGGCAALVRGQMLAKAGDGELSNIYREVTLA DIDESGFVTDASRLVPDAAPLPRSRRIARLREGDLLLTCKGSLQSLGKVGIVTQCGDNWL PSQTFYLIRTECIDPIWLFHYLRSPRALNYLRSNISGTSIPQIRVADIAALPIPIPNEEM LASVHAVHRQALKLLQKIDKLRDELDGLLDAPAVMLDSLMAPES >gi|316924788|gb|ADCP01000013.1| GENE 19 20257 - 21264 1083 335 aa, chain - ## HITS:1 COG:SMb20268 KEGG:ns NR:ns ## COG: SMb20268 COG3938 # Protein_GI_number: 16264006 # Func_class: E Amino acid transport and metabolism # Function: Proline racemase # Organism: Sinorhizobium meliloti # 4 332 3 330 333 250 42.0 2e-66 MHLSHLVETIDTHTAGNPTRNLIAGVPRIEGKTMQDKMAYAEEHLDWLRTAVMMEPRGHS NMSGTIWVEPCHPEADMGILFIDAGGHMPMCGHSTIGCVTAMLESGRVPITGEVTEVNID TPAGLVRTRATVENGSVTSVAFRNVPSFLFCSGTVEVPGIGAVPFDVAYGGNTYAIADAA YFPGLELRSGHRSAIEKAAQAFGDAVRAAVKFQHPLQPFINVITHVMFYTKPDDPTATYR NTVVFLPDSLDRSPCGTGTSARVASLFAKGELGLNEAFVHESVIGTQFRARIIEPAMVGP YKGGIPEVSGSAYVTGLLKLVIDPADPLRHGFRLA >gi|316924788|gb|ADCP01000013.1| GENE 20 21803 - 24478 1632 891 aa, chain - ## HITS:1 COG:ZarcB_1 KEGG:ns NR:ns ## COG: ZarcB_1 COG0642 # Protein_GI_number: 15803750 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 EDL933 # 288 809 58 562 562 224 30.0 5e-58 MQLKTKLLVYITVPIALAVCISSYMTYRTHSAAMRHIIEKDIKATSFFVAENINLTISDI KNSFRVAAKSEIIKESLENTGQREHERVVSFFEAIKHVMPTITDVFLLDETGNIRASLNS HDYGNNYGDRTYFRQAIAGETAIVGPLVSRVTQKECVYIAVPVGNERNKGVLVASVELDS ISVLCFNHDITSSRIDIFLLDNTAHILMAKESTKDGKHPDSIKLDDHTLSDGTPQGYATY AFNGKTYTGFYKKIKNLNWYVLIAMDDTQINKTVLSSTKNSFLLTLLAILIGLLIGSILI YNVMKALYKIIEYARRISNGQLEATLDVYGQGELGVLANTLRSMARIVKQDQDRLNRLVE ERTDQLRLSQERLLKESALLKTILNTVPDLIFYKDMNGIYKGCNKAFGAFIGKSEQEIIG KDDVELFQLSGNAAQKFIEDDLQVMRGKLDTLIREEEVLYPDGKRIYLETIKTLYYSEDN APFGMVGISRNIQLRKETEKAYAEAIQRAHEASKAKSEFVARISHEIRTPLNAIIGMNYL 
LKQSCTDPVQESYLRKMELSAKNLLSIINDVLDFSKIESGKLEIEKNTLSIEKLVRDVVS INEPKAKAANESFETHIDPGIPDVLIGDTLRISQIMLNLVSNAVKFSHHGTIKIEVLCER VHGENVTLLFSVRDQGIGMSQQQLDKLFIPFTQADGSTTRKYGGTGLGLSICKMLVELMH GEIWAVTEEGVGSTFFFRLQLEKKDAATPAAQHPETKQSAPVEALPASSLKGKTVLLAED NEINQEIAIAILNSFGLEVELAENGLEAVQKIQTNQYALVLMDIQMPIMDGYQATRLIRK DNRFNALPIIAMTANAMNTDRAESLKAGMNEHIGKPFDPEALKRLLGKWCR >gi|316924788|gb|ADCP01000013.1| GENE 21 24856 - 25080 82 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYNLQESPGFRSFPGQAAAMRHLKTVGRIRRAAQSPGFPFVSSLPRFFLSTRGGCRIPDN QMMLEFDAGDVAVG >gi|316924788|gb|ADCP01000013.1| GENE 22 24985 - 25986 1247 333 aa, chain - ## HITS:1 COG:BH2313 KEGG:ns NR:ns ## COG: BH2313 COG1609 # Protein_GI_number: 15614876 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 12 325 28 329 337 106 26.0 7e-23 MILSGRTDVSFSADTVRKVRETAEALGYAPTAKKRPSLFDRKTVLIVCPNVLNPYYSTIV QAIQQAAADKDCDTLVYTTYRDAENEIRILNAVAGSDLAGVVFTMMPQSTELVERVNRLV PVVVIGDRNTSLNVDTVEMNNYSAGAIIAHYMIGLGHKHIAYISTTLNEANSARVRRLEG VRDTYRDECPEGSVIVRSREVTPKEERDNISIEHAVGFELTRKCLGERRITAFVAVNDMV AYGVLDAIRAEGRRVPEDYSVCGFDNIFPSQFLPVGLTTVEHYIADKGRNAFEILHSKMS GESSDRNITRVEFKHHLIVRDSTAAPRGEEKTG >gi|316924788|gb|ADCP01000013.1| GENE 23 26452 - 27780 1835 442 aa, chain + ## HITS:1 COG:ECs2778 KEGG:ns NR:ns ## COG: ECs2778 COG0477 # Protein_GI_number: 15832032 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 10 429 21 438 438 287 37.0 3e-77 MSAPSKQEIRKVVLASLLGATIEWYDFFLYGVVAGIVFNKLFFPTTDPFIGTILAYSTFA IGYLARPLGGFVFGHFGDKLGRKSMLVLTMLIMGLATIGIGLVPTYAQIGIAAPIILQTL RLCQGLGLGGEWGGAVLMTYEYASKRERAFYASIPQMGLATGLCLSSGVVALLSWGLTNE QFMAWGWRCAFLLSVVLVGVALYIRMHILETPDFRKAQERGKTEHKSLPIVRVCKEHPGN IALGVGARWIDGVFFNVLAVFVITYLVQYVDVPRTEALTVVMIAALVMCPFILFAGRLAD 
RFGRGKVYGLASLLCGLSIFPSFWLMEGSGGNLFLIGLAIVIPLSVFYAGVFGPEAALFS DLFPPDVRYTGISIVYQFPGFLVAGIVPGLCTLFMKWNDGDPLYVCLFTLLAGLTSAASA FIIQRKHNAAVRRSVDGEVEHV >gi|316924788|gb|ADCP01000013.1| GENE 24 28095 - 28919 735 274 aa, chain + ## HITS:1 COG:RSc1795 KEGG:ns NR:ns ## COG: RSc1795 COG1409 # Protein_GI_number: 17546514 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Ralstonia solanacearum # 1 265 1 273 282 159 39.0 7e-39 MRILHLSDFHLRGDGGLSFRVVDTLECLRVVADHLKTLVQKPDALVITGDLADSGDAHAY SLLHEALAPLSLPVYAVPGNHDRRDRMRELLGNWCPADASVAPYLCYAVEDGPVRLVMLD TMQPGSHSGHCPEAVAGWLERTLAARPGVPTLLFMHHPPFVTGMGAMDEPFENVGRLADI LCRSPWVRLCCGHMHRPIVTQWAGCIALTSPAVSMQIDLDLSPEGGDTFRMETPGYLLHH WDGSVLNSHICQIPCSPTFSGPHPFVGSVNPVEG >gi|316924788|gb|ADCP01000013.1| GENE 25 29094 - 30440 635 448 aa, chain - ## HITS:1 COG:BH3671_2 KEGG:ns NR:ns ## COG: BH3671_2 COG0791 # Protein_GI_number: 15616233 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus halodurans # 86 201 3 115 116 94 41.0 6e-19 MKARHSSICTVALLVAAIALSGCSSKKLQPAFEPVYDDPYSFEAYLENRKKADRSSMSQE ERFAYDNRTASLSTMRESGIPLPNLHLIEQAKTALGTPYVPGGTDTQGFDCSGFVQWAYR NVGVTLPRTAREQSVMGRPIKSGSMMAGDIVAFNHPQRGYHTGIYLGGGSFIHSPGRGKS VSIAALSDPYFSSTFIGARRVQSKESDAEAIKKLMALDSRKPTLTHKAALASNQKRSATR KQTPAAVAEHRTQADTRRKASAPPAKQEPAQNKKNGGPAVAARKQTTPSQATAKAATKTI PPKKETPGKQAAQAPTPPKGAVPAKNTPVSEPAKAKAAPAKAGKEGKAAQPVAGKKAVPA KADGKTTATAKAPVKTKAEQPKAATQTAQTTATAPAKAAQTQKPAQNSKASAKAPTNSAS VKARATQAKPEKAAPKQPAKTAQQKSGK >gi|316924788|gb|ADCP01000013.1| GENE 26 30705 - 31763 1514 352 aa, chain - ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 3 255 1 251 332 75 27.0 2e-13 
MRLNKFLSGLAAVLLAFSLCNAASAAPLTKIRTAWMDSYETFAMWYAKEKGWDKEAGLDI DILFFDSGMAILNALPAGEWQYSCLGGVPAMMGNLRYGTSVIGVGTDESACVAVLARPDN PVFKTKGLNPKHPGVFGSPEDVRGKTILCTTVSSAHFALTAWLDAVGVKPTEVTIKNMDQ PQALAAFENGIGDFVALWAPHMYAGTDKGWKVAGSAALCGKGTPLVLVADTKYADAHPEI TAKFLSVYLRGVEHLRTTSVDDLIPEYQRFFFDWAGKTYSKELARLDLESHPAWDIKGQL ALFDTSKGMSTVQQWQADTAQFFASIGSITPDELKKVENASYVTDKFLKLVK >gi|316924788|gb|ADCP01000013.1| GENE 27 31774 - 32727 1129 317 aa, chain - ## HITS:1 COG:SSO2732 KEGG:ns NR:ns ## COG: SSO2732 COG0010 # Protein_GI_number: 15899447 # Func_class: E Amino acid transport and metabolism # Function: Arginase/agmatinase/formimionoglutamate hydrolase, arginase family # Organism: Sulfolobus solfataricus # 5 311 4 304 305 230 39.0 3e-60 MGVPVNSLASPRFCGVRTFQRLPYSADCSGNDFAVLGVPFDTATSYRPGCRFGPAAIRDA SSILKSYHSVLDVDIFEHCQGVDAGDVDIIPGYLDESFERIEAAVAGVLDAGAVPVIMGG DHSITLPELRAVAKRHGPVALLHFDAHSDTGDDFFGKPYNHGTTFHWAIEEGLILPEQST QAGIRGPLYSRSSLDYAKGKGMRVISGWELHDIGVDETIRIMRERIRPETPVFVTFDIDF LDAAYAPGTGTPEIGGFTTHEALKLILGVCPGQRLVGMDLVEVLPDTDSAGITSFAAAGI MHAFLACLAANKKHNPS >gi|316924788|gb|ADCP01000013.1| GENE 28 33010 - 34377 1275 455 aa, chain + ## HITS:1 COG:BH0992 KEGG:ns NR:ns ## COG: BH0992 COG3829 # Protein_GI_number: 15613555 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 17 447 20 452 454 342 45.0 9e-94 MSPFDAALPSTEDLFSVIEAIHDNVAIVDHNGVMRWVSSCFERNYGISRDRVIGRTTYEL EAEKVFSPSVAALVLKTRRVVTLTEATRSGQYNIVTGVPIYDDDGDVSFVVSYSVDMRYS RRLHEEYQKISALVSPNAPQTSGGLPFSGTMRAIAATIEKLAKVNTTVLITGESGVGKNV VARLVHHLSDRTNGPLVEINCAGIPAPLLESELFGYEAGAFTGASHKGKEGRIALADKGT LFLDEIGELPLRLQSKLLQVIQEKQIVKLGGVKPVSIDFRLVAATNQDLEALVEKKRFRS DLFFRLNVLPLWIPPLRERREEIVPLCLSILGEMNAKYGTEKVFSDEVKRHLAAYSWPGN IRELRNVIERLVIVSSGGVIKADELPEHMFRHQENQESPIGKSLPDALETLERRMILEAY KACGTTTGVAKALGISQPTAARKIAKYKNEASGER >gi|316924788|gb|ADCP01000013.1| GENE 29 34729 - 34983 448 84 aa, chain - ## HITS:0 COG:no KEGG:no 
NR:no MSTPTRDTAVSVSESTFLRDNMVCHLDTVTYVSAEHGEKETVVEVCEPYVAPEPKDCTQS SQEREPESQVFISPEFGNAIEALV >gi|316924788|gb|ADCP01000013.1| GENE 30 35508 - 36356 1048 282 aa, chain - ## HITS:1 COG:MJ0283 KEGG:ns NR:ns ## COG: MJ0283 COG0489 # Protein_GI_number: 15668458 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Methanococcus jannaschii # 4 282 8 287 290 241 44.0 1e-63 MASCSEQQGQTTIPMTAAMQRKVMSENLKDVRHKLFVMSGKGGVGKSSVTVNLAAALAAQ GFNVGILDVDLHGPSVPHLLGSHGFVRVDNEDGKLVPVSCGERLSLISIESFLEDKDAAI IWRGPKKVGAIQQFVADVKWGALDYLLIDSPPGTGDEHMTILDAIPDAKCVVVTTPQEIS LADVRKALDFLKVVKADVLGLVENMSGLFCPHCGEEIDLFKKGGGEALAKQEGLNFLGAI PLDPATVVAADRGHPIVSMPADTPAKAAFLKLAETVREATQA >gi|316924788|gb|ADCP01000013.1| GENE 31 36447 - 37409 1136 320 aa, chain - ## HITS:1 COG:CAC1621 KEGG:ns NR:ns ## COG: CAC1621 COG0731 # Protein_GI_number: 15894899 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 5 271 4 263 308 215 39.0 1e-55 MNTGFKHVYGPVPSRRLGRSLGVDALTFKSCSFDCVYCQLGRTTNHTIERKEYIPTADIL DEVRRKLENGDKPEYISFAGSGEPTLHSGLGDIIRGIKAMTDVPVVVFTNGSLLWMPEVR ADLAAADVVIPSLDGGDAALLDKVNRPAASLDFDRIVEGLIAFRGAFNGQIWLEVMLLGG ISDDDASVDAIARLAERIRPDKVQINSVCRPPAETYALPVPAARLLEIRKHFAGVPGTVE IIAEHLDDMAHTAFRTQIKEEDILALIERRPCTAAGAAAGLNIHVTEAIKHLEMLVADGK AVTQPKDGQVFYSACELRTE >gi|316924788|gb|ADCP01000013.1| GENE 32 37473 - 37781 326 102 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNEIICSITSAIEEGRSWGHTDKEIADYIYAKHIRPLETKLAEAPEPRAAFLEGFMLTTK SRNGENPFAGDYAEAWKAIASHAGVSQRAFEAIEGGPDESKA >gi|316924788|gb|ADCP01000013.1| GENE 33 37778 - 38149 290 123 aa, chain - ## HITS:1 COG:CAC3166 KEGG:ns NR:ns ## COG: CAC3166 COG1342 # Protein_GI_number: 15896414 # Func_class: R General function prediction only # Function: Predicted DNA-binding proteins # Organism: Clostridium acetobutylicum # 1 107 1 124 143 68 37.0 4e-12 MPRPRKWRRVCRLPQNTLFGPLAGIDDGTIVMTVEEYETIRLIDLEGLNQEACAERMHVA 
RTTAQAMYRNARHKIADCLVNSRRLIIEGGDYLVCNAREMTCGCPHCRRLIQPADGETPT ITP >gi|316924788|gb|ADCP01000013.1| GENE 34 38525 - 39415 978 296 aa, chain + ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 1 290 1 287 290 280 45.0 3e-75 MRNPIPRVAAIHDLAGFARTSLTAAIPILSSMGIQATPLPTAILSTQTVGYNDFTLLDLT DDMVRILDHWERLGLRFDGVYSGFMASTAQMDSAARCIRNCLAPGGLAVVDPVLGDDGKL IPTMTPEMVAKMRWLITCADLITPNFTEVCLLLDEPYSPEADLPTLKGWLRRLAENGPKT VVATSVPLADGQASRKPECTSVLAYERDEDRFWRVDCSYIPAYYPGTGDVFASVLTGSLL QGDSLAIALDRAVQFVTLGIRATFGQGLPNREGILLERILDTLRAPLGSYMCRLVD >gi|316924788|gb|ADCP01000013.1| GENE 35 39584 - 40492 1449 302 aa, chain - ## HITS:1 COG:L181807 KEGG:ns NR:ns ## COG: L181807 COG0679 # Protein_GI_number: 15673902 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Lactococcus lactis # 67 302 3 238 238 114 33.0 2e-25 MEEALFKVLSFITFIMLGYLLRRLSILKAETFRVLSGIVFYITLPCIIITSINGVPVTSE MLWLVALGALCNVLMVATGYGMTHGKARQERAFSMLNLSGYNIGTFAMPFVAGFLPPTGF LATCLFDAGNAIMCTGGTYALANSVTDASQHLSIKSFLRNVFSSIPFCIYLIMIVMAFLH LALPHPVIIFTKIAGDANAFLCMLMIGVNINLQMDSKSFRCLIKHITVRYALSAVLAFLF YRYLPFSLEIRQAMAVIALAPSSSLALIFTMKLKGDVIMAGNICSLSILLSTIAMTILLV YL >gi|316924788|gb|ADCP01000013.1| GENE 36 40748 - 42154 739 468 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 3 466 6 450 458 289 36 3e-77 MKLTIIGAGPGGYSAAFAAAKAGVEVTLVERAKLGGTCLHTGCIPTKTLRSSADVLEMSG RLAEFGITGECALKADMPAIVNRKRKVTATLQTGLEKTCAQLKVRVVYGKAELVSAKLVR VTTAEGTEEVESDNVIIATGSSPLELPALPVDHARVLSSDDALELQAVPPSLIIVGGGVI GCELAFIYRAFGSKVTVVEGQNRVLPLPSVDEEISRLLQREMKKKGIAVELARTVTATTP TGTGVSVEIGASPFVEVANPPAPRTLEADAVCVTVGRVPHTDGLGLDAAGVKVDARGWIE ADDFLETSVPGVYAIGDVIGPRRIMLAHMAVAEAHTAVHNILHPEDRKAQRYDVVPSAIF TAPEIGDVGLSEAQAKQQGFAVKTSVFQFRELGKAQAMGELAGLFKLVVEEGSGKLLGAH 
IAGAHASDLIAEATLAIQRGCTARDLFETIHAHPTLSEGIYEAAGSLV >gi|316924788|gb|ADCP01000013.1| GENE 37 42378 - 42788 500 136 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1698 NR:ns ## KEGG: Ddes_1698 # Name: not_defined # Def: OsmC family protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 135 1 135 136 202 69.0 3e-51 MATITATYLGDLRVESVHVASGARLITDAPVDNNGKGEGFSPTDLCATALASCAMTIIGI YGKMHDVDVTGTSIEVTKTMSANPRRIGKLEVVFTMPDREYTDKQKTMIERAAHTCPVHL SLHPDVEQVFTFHWKR >gi|316924788|gb|ADCP01000013.1| GENE 38 42990 - 44150 1091 386 aa, chain + ## HITS:1 COG:DRA0342_3 KEGG:ns NR:ns ## COG: DRA0342_3 COG2199 # Protein_GI_number: 15808001 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Deinococcus radiodurans # 230 385 15 165 176 87 37.0 6e-17 MALLGKTDWYSAEKKIILTIALAILIASVSVFAILYLMMSHALQEDIRARARVVNAYAES HLNLEGFVHINTLKDMRRDTYQDMQSMLDNIRQIANVRYLYTAKFNDRGQPMYLVDGLPP NSSDFRSPGDLIEQDIVPMLNRCLSGELIESDGVLNTEWGAIFLTCMPAYTIGEAEPIGA VVMEFNADVIYKSKLRAMLYSGALALVIVGGCTFITMLCLRRLATPFYKKLAYTDMLTGI GNRTAFELELKNLEKRLPHPFTIVAYDLNYMKRINDTYGHAAGDAYLRRMAHLLMREEPV RRGLSFRIGGDEFVTLFEGEEEETLLRELEVFHMAGAQAEVNGEPVTFAYGVASYDPALD KGSLHNTLSRADALMYRFKKAARGES >gi|316924788|gb|ADCP01000013.1| GENE 39 44274 - 44474 308 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKDKKEPIVLDLDIFIHLMTSAYREGFKAGAGKNIQRLENDCASAVSQFRFGMADMLNEP ESTVQH >gi|316924788|gb|ADCP01000013.1| GENE 40 44765 - 45217 -170 150 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSLSAGASFFSFDSIAEPIPNGADRAGTSVPCWNPRTRCRFMKKALPVQTRGPLRVCLQY AVFPGIPCLSATAFRAAVALSLPLASPVRPYSGACRNGKRENRSSTFPEKRGRQGLLAGK LSRIFLMKRKKGMRPKLGIQHFFARKLFLS >gi|316924788|gb|ADCP01000013.1| GENE 41 45353 - 47755 2198 800 aa, chain - ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 277 518 33 273 280 179 41.0 3e-44 MEYDTLKQPDRITHEYNMLMSSMHVSVGKHLLDPYFTVIWANDFFYEKTGYGREEYEATF 
HNHVSEYYSAFPDVYETMGKFIKTALKHGEPSYEFVCPMPVKGGSRIWIKVVGTFTKETV DGIPVIYSVFTDITDLVQAQTEKSITYDNLPGFIAKFQIRAGCAQERFTFLDANDRFIDF FGVRAAGDAPYSLVNWDSARNQQALNEHYPAMREGKPVHFTVQAKTLQNDDAWLQLNGDC IDVIQGDPVYLFIYIDITDITEQRVLQKKLEERSELLRNALEMAERANRAKSDFLSRMSH DIRTPMNAIMGMLSIAKESWDDPDRVRDCLDKVENSARFLLALINDILDMSKIESGKAIL KKKAFDFAAFTRNLAAMFYGQAEKKQVRFQVLLGRSLAEAYVGDELKLNQILINLLGNAI KFTEPGGSVLLSVQEDKRSNKASELVFTVTDTGIGMEKSFLKKIFEPFEQDSQQRSNQGG SGLGLSIAEKYARMMNGGIEVDSTPGKGSTFVARIWLENDEAQAPVSLEERFRHLKALVV DADAEELEYTCSLLFRFGVAAEGFASGDMALERLDAAVRQGESDIIVLVDWGNARPAADA LIRQIRKSVEKNGPVIAVAAYDWSAIKVKALEAGADVFLQKPLFASTLYDLLLSLTQDRP LACDANPEEGFDGERVLLVEDNALNLEIAVTLLTTRKLQVDTARNGQEAVEKFRGSEAGY YLAILMDIQMPIMDGLEATRRIRSLAHDDAERIPIIAMSADAFEDDVEKSLEAGMNAHIS KPVDIPALFTTLRQIKDKRA >gi|316924788|gb|ADCP01000013.1| GENE 42 48407 - 49171 384 254 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302861788|gb|EFL84723.1| ## NR: gi|302861788|gb|EFL84723.1| sigma-54 dependent transcriptional regulator/sensory box protein [Desulfovibrio sp. 
3_1_syn3] # 1 245 7 251 255 124 31.0 5e-27 MKFDPESALHLLLEQALKNVTLDGYASFLATFFVQFMPVASLAVREFDTNHITQLAHYAV NKTVFSPPRVVTIPDALVYELYLDPDCSGDKYKASILSSKDGTSRTRVFTMLHMAEETSI FFPLLVDDRFLHCFYISISAIGKNKYTEEHLQICESLRIPLANALTVLLQRRSESPLSAF PPLAAGQTFNGAEAPSMMLEEVTTAYIHHVLAYTRGRVSGPRGAARILGIPASTLVSKLR KLGVNPKAYFKAGD >gi|316924788|gb|ADCP01000013.1| GENE 43 49290 - 53039 3735 1249 aa, chain + ## HITS:1 COG:BS_narG KEGG:ns NR:ns ## COG: BS_narG COG5013 # Protein_GI_number: 16080781 # Func_class: C Energy production and conversion # Function: Nitrate reductase alpha subunit # Organism: Bacillus subtilis # 27 1244 21 1221 1228 1293 51.0 0 MRFLNSRTLRYFGQRARELAHNFENAHHPYEERQGGRSWEDYYRRRWQHDKVVRSTHGVN CTGSCSFDVFVKDGIIVWEAQKTDYPTPHPDFPDYEPRGCPRGVSASWYVYSPLRVKYPY IRGKLLEMWKAAKQASNNDPVAAWEAIQSDPAKRKAYQQARGKGGFVRFSWDEASEIIAA SLISTIKKHGPDRIFGFTPLPAMSMTSFASGARFLSMLGASMVSFYDWYCDLPPASPQIW GEQTDVPESADWYNAGYIISWGSNLPQTRTPDAHFYVEARYRGTKIAAISPDYADFTKFA DHWLPVRAGTDGALAMAMDHVVLKEFYLDRRVPYFEDYAKRFTDLPFLLFLDEEERDGET VLAPGRCVRASDLGLGGNNPEWKFVIHDRTRKGPAVPNGSIGSRYGEEGTWNLEMRDCYD RADLDPVLSYAELGDETEWKLAAFPVFFEGQPSLRKGAVPVRRLAVTGADGKQQERLVTT VFDILAASLAIDRGHGGDVASGYEDARAYATPAWQEAITGVPAEDMIRVAREFADNAERT GGRSMIIMGAGVNHWYNNDVTYRAMISLTTLCGCQGVSGGGWAHYVGQEKVRPLAGWTTV TVGSDWMGPPRLHNGTSFYYFALDSWRHELLSMDKLTPPDRKGSLPDHPADCNALAARLG WLPFYPQFKENSLETCEKATKAGAASNEEIVAHTLERLKSGDLELSVDAPDDPANVPRVM VFWRANPLGSNVKGHEYFLKYLLGTESSFLGEEARQPETIRTTPEPDSPEGVGGGKLDLM VTSEIRMSTTCVYSDIVLPAAHWYEYHDLSSTDMHPFIHPFNPATDPAWEARTNWDQFKA IAQKFSELAGKHLGVRKDMVATALLHDTPGEIGQPFGEVRDWRRGDAEPVPGKTMFNLKV VERPYPDIYKMYSALGPNVAKPGGVGAKGVSWSCAPEYEQLKARLGVVSEPGVSEGMPRI DNAKDACEIMLALSPESNGDVGVRSWAGLEKQTGFKLNDLSRPVQDQHLTFEGITARPTK GFTSPNWSGIEVHGRTYAPFELNVQRLVPFHTLTGRQHFYMDHEWMRGLGEALPVYRPPL SLAAIGEISGPRIPRTDKDLVLNFLSPHSKWSIHSSYSDNHIMRELSRGGGEIWLNNDDA ASAGIADNDWLECFNANGVFMGRAVVSHRIPHGKTYIHHAQERTVNVPLSPLSGTRGGTH NSLTRPLVKPTQMIGGYGQLSYFFNYYGPTGCQRDEFVVVRKVQGDVRF >gi|316924788|gb|ADCP01000013.1| GENE 44 53368 - 54837 1531 489 aa, chain + ## HITS:1 COG:BS_narH 
KEGG:ns NR:ns ## COG: BS_narH COG1140 # Protein_GI_number: 16080780 # Func_class: C Energy production and conversion # Function: Nitrate reductase beta subunit # Organism: Bacillus subtilis # 1 481 1 478 487 627 57.0 1e-179 MNIKVQMGMVLNLDKCLACHTCSIPCKNAWTTAPGTEYMWFNNVETKPGVGYPKEWENQD RYKGGWEIRDGKLHLRAGGKTDKLVNIFANPDLPALDDYYEPWKYDYERLTDSPASRHQP VARPYSAVTGKAFNPQWDSNWEDDLGGAPVTGLSDRNFAGLEAKAYLDFKNVFMMHLPRL CEHCLNPACVASCPSGAMYKRDEDGIVLVDQSRCRGWRYCVSGCPYKKVYFNWKTHRSEK CLFCYPRIEAGEPTLCAHSCVGRIRYVGVMLYDADRVREAASHPQPQGLYQSQLEVFLNP NDPAVCREAERKGISWHVMEAARRSPIRKLVVDWKLALPLHPEFRTLPMVWYVPPTSPLV SGGVEDAKGLDRMRIPVRYLANLLAAGDEEPVRSALSRLLAMRRHMRDGQFQSTASAPFP SLRAHPDTSGVLADAGLTPEQAAEMYHLLAIARYEDRFVVPTSGRENRDDVWAMKGSEGL AQGEMEERP >gi|316924788|gb|ADCP01000013.1| GENE 45 54834 - 55391 660 185 aa, chain + ## HITS:1 COG:Cgl1161 KEGG:ns NR:ns ## COG: Cgl1161 COG2180 # Protein_GI_number: 19552411 # Func_class: C Energy production and conversion # Function: Nitrate reductase delta subunit # Organism: Corynebacterium glutamicum # 1 126 18 144 228 67 33.0 1e-11 MNDSDRQLLLAVSGLLHYPEDDFFVQARSFEELCDLLSVVRACPLREFIRSARSKGLSRL REEHVAIFDMDSKCSLSLAWHRYGDDPRLGRALAGLNELYRDAGFEPVQGLLPDYLPLVL EFFAAAPDWARVVLLDGFGKELIGIYAALEARHPDSSYVSLFRPIVAVLGLVPDQRLPAK RLEQV >gi|316924788|gb|ADCP01000013.1| GENE 46 55664 - 56326 554 220 aa, chain + ## HITS:1 COG:BS_narI KEGG:ns NR:ns ## COG: BS_narI COG2181 # Protein_GI_number: 16080778 # Func_class: C Energy production and conversion # Function: Nitrate reductase gamma subunit # Organism: Bacillus subtilis # 9 216 11 222 223 147 38.0 1e-35 MHFFFAYYPYICITVLLAGLAFRYVTDPGDWNARSSEIFEKTWLRRGSIIFHYAIILSFF GHIVGLLTPDSVMHALRISMEAHTVVAVGMGMILAPLVVVGLGILLWRRITNEHVWATTV PMDVLVVLLILINGCTGFYQAYVAHFPVFVTVGPWLRSLVVFRPDVVLMETVPLFLQIHV VSAFTLFALIPFSRLVHIFSVPVTYALRPFQIYRRRYAGL >gi|316924788|gb|ADCP01000013.1| GENE 47 56355 - 57383 1030 342 aa, chain + ## HITS:1 COG:MT1776 KEGG:ns NR:ns ## COG: MT1776 COG1275 # Protein_GI_number: 15841198 # Func_class: P Inorganic ion transport and 
metabolism # Function: Tellurite resistance protein and related permeases # Organism: Mycobacterium tuberculosis CDC1551 # 179 327 3 150 197 84 34.0 2e-16 MLEAQAPANFAMVMSTGIVSVALHLLDYPIGARLLFWLNAALCVGLALLYIIRLFVFPKR FVQDFRSHGAGPGFLTVVAGICILGNQFVLLGKDPAMGEALFWIGSVLWIVILWGVFYFV FSDEPKPPLEKGINGAWLVATVSTQAIVILGCILIDHMPWDKEIAFFAFTALFLLGFMLY LFVITMIFYRFAFKELEPAQLSPTYWINAGAVAITTLAGAELLSHPGASPLLMEFFPFIK GLTLMAWATATFWIPMLFLLGFWRHSVKQYPAAYTPEYWGMVFPLGMYTACTAMLVKSLN LEFLLPLPEVFVYVALGAWTVVFCGMVITRLVMLLRGDCPTA Prediction of potential genes in microbial genomes Time: Fri May 13 01:56:01 2011 Seq name: gi|316924748|gb|ADCP01000014.1| Bilophila wadsworthia 3_1_6 cont1.14, whole genome shotgun sequence Length of sequence - 34741 bp Number of predicted genes - 43, with homology - 30 Number of transcription units - 18, operones - 13 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 461 - 1084 382 ## gi|237747956|ref|ZP_04578436.1| predicted protein 2 1 Op 2 . - CDS 1164 - 2117 435 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair - Term 2491 - 2537 2.0 3 2 Op 1 . - CDS 2704 - 3765 807 ## ACIAD2482 hypothetical protein 4 2 Op 2 . - CDS 3778 - 4611 707 ## ACIAD2481 hypothetical protein 5 2 Op 3 . - CDS 4608 - 5852 637 ## ACIAD2480 hypothetical protein 6 2 Op 4 . - CDS 5867 - 9106 1709 ## COG1203 Predicted helicases 7 3 Tu 1 . - CDS 9573 - 9722 85 ## - Prom 9773 - 9832 10.0 + Prom 9801 - 9860 6.7 8 4 Tu 1 . + CDS 9965 - 10303 359 ## 9 5 Op 1 . + CDS 10557 - 11159 460 ## gi|212703283|ref|ZP_03311411.1| hypothetical protein DESPIG_01325 10 5 Op 2 . + CDS 11162 - 11248 58 ## + Term 11258 - 11283 -0.5 11 5 Op 3 . + CDS 11294 - 11491 115 ## + Term 11500 - 11531 3.1 12 5 Op 4 . + CDS 11539 - 11943 324 ## gi|294789834|ref|ZP_06755062.1| ASCH domain protein + Term 12016 - 12048 -1.0 13 6 Op 1 6/0.000 + CDS 12075 - 12329 275 ## COG2161 Antitoxin of toxin-antitoxin stability system 14 6 Op 2 . 
+ CDS 12332 - 12592 299 ## COG4115 Uncharacterized protein conserved in bacteria + Term 12615 - 12649 -0.7 - Term 12586 - 12630 -0.9 15 7 Tu 1 . - CDS 12645 - 13394 743 ## COG2356 Endonuclease I - Prom 13439 - 13498 3.6 - Term 13588 - 13629 8.0 16 8 Tu 1 . - CDS 13634 - 13819 162 ## 17 9 Op 1 . - CDS 13939 - 14325 546 ## 18 9 Op 2 . - CDS 14405 - 15043 674 ## DVU1494 hypothetical protein 19 10 Op 1 . - CDS 15240 - 15539 368 ## 20 10 Op 2 . - CDS 15605 - 16117 659 ## Msip34_1624 hypothetical protein 21 11 Op 1 . - CDS 16261 - 16581 254 ## 22 11 Op 2 . - CDS 16590 - 17642 1628 ## SAR116_2254 hypothetical protein 23 11 Op 3 . - CDS 17668 - 18153 684 ## MCR_0943 hypothetical protein + Prom 18491 - 18550 1.7 24 12 Op 1 . + CDS 18664 - 18918 98 ## Oant_0622 hypothetical protein 25 12 Op 2 . + CDS 18915 - 19253 453 ## COG4226 Uncharacterized protein encoded in hypervariable junctions of pilus gene clusters 26 12 Op 3 . + CDS 19265 - 19552 334 ## COG3093 Plasmid maintenance system antidote protein 27 12 Op 4 . + CDS 19570 - 19866 254 ## AM1_B0188 HicB family protein + Term 19891 - 19925 -0.5 28 13 Op 1 . - CDS 19883 - 21724 1321 ## COG4585 Signal transduction histidine kinase 29 13 Op 2 2/0.000 - CDS 21721 - 23307 626 ## COG0747 ABC-type dipeptide transport system, periplasmic component 30 13 Op 3 . - CDS 23311 - 23967 604 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain - Prom 24177 - 24236 1.6 + Prom 23984 - 24043 2.0 31 14 Op 1 2/0.000 + CDS 24137 - 25810 1078 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold 32 14 Op 2 . + CDS 25897 - 27069 1107 ## COG0477 Permeases of the major facilitator superfamily 33 14 Op 3 . + CDS 27084 - 28505 818 ## LI0461 hypothetical protein + Term 28596 - 28655 14.3 - Term 28594 - 28632 6.4 34 15 Op 1 . - CDS 28720 - 29892 529 ## Acid_4109 hypothetical protein 35 15 Op 2 . - CDS 29902 - 31320 1272 ## HSM_0897 hypothetical protein 36 15 Op 3 . 
- CDS 31322 - 31669 182 ## - Term 31845 - 31886 3.2 37 16 Tu 1 . - CDS 31948 - 32376 418 ## gi|294789837|ref|ZP_06755065.1| conserved hypothetical protein - Prom 32555 - 32614 2.3 - Term 32508 - 32549 -0.9 38 17 Op 1 . - CDS 32638 - 33324 525 ## LAR_0782 hypothetical protein 39 17 Op 2 . - CDS 33331 - 33489 156 ## 40 17 Op 3 . - CDS 33548 - 33823 288 ## 41 17 Op 4 . - CDS 33820 - 34032 114 ## 42 18 Op 1 . - CDS 34140 - 34430 376 ## 43 18 Op 2 . - CDS 34430 - 34741 162 ## RB2501_01256 hypothetical protein Predicted protein(s) >gi|316924748|gb|ADCP01000014.1| GENE 1 461 - 1084 382 207 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237747956|ref|ZP_04578436.1| ## NR: gi|237747956|ref|ZP_04578436.1| predicted protein [Oxalobacter formigenes OXCC13] # 47 198 1 152 159 79 46.0 2e-13 MFSKTLVIRRVGGLEDQEGAHESAASVIRRVGGLEAAQHGHAGQGEVIRRVGGLEALLII CPRAIKVIRRVGGLEANGPPKKPPVYVIRRVGGLEVRHPSGRHPTGVIRRVGGLEVRDDV PIHVDGVIRRVGGLEVFAGAIVGMVGVIRRVGGLEDHARGRRGHFVVIRRVGGLEGRERG LDRLRNVIRRVGGLEVTVGAFLRLPLT >gi|316924748|gb|ADCP01000014.1| GENE 2 1164 - 2117 435 317 aa, chain - ## HITS:1 COG:YPO2468 KEGG:ns NR:ns ## COG: YPO2468 COG1518 # Protein_GI_number: 16122689 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Yersinia pestis # 7 310 10 317 328 231 40.0 1e-60 MGIKRHSEPRPLQLSKRANVFYLEHARVVQQDDRIVYLTQDGGEFEQMFNIPERNTAFLL LGKGTSITDAAMRRMAASNVMVGFCGTGGSPLFSVCDIAFMTPQSEYRPTEYMQAWAEMW FAPARRLEKARCFLRRRAQMTAECWQENSYLQKLGIVLSDAVLERFHSDLEQAKDVQELL LAEARWAKRLYADLARGHGFSFVREEGARRSTSKADVCNGFLDHGNYIAYGYAAVALCGL GISFAMPILHGKTRRGALVFDLADVVKDGYVMPLAFECAKEGETQKDFRQRLIEHCQEED VLDFLFDFMKNLCVKNT >gi|316924748|gb|ADCP01000014.1| GENE 3 2704 - 3765 807 353 aa, chain - ## HITS:1 COG:no KEGG:ACIAD2482 NR:ns ## KEGG: ACIAD2482 # Name: not_defined # Def: hypothetical protein # Organism: Acinetobacter_ADP1 # Pathway: not_defined # 7 353 6 353 353 301 48.0 2e-80 MAKKMNKLPGVLSFQRCLLVTDGLFYNELDNGNLSPLWVMRHGIRGTQNINKASKEGQAA 
SASAKRDEVSNIQTTDSAKLDANASALQVRFNLRGMDIQKALFACAPGQKDSMDDLNAFK ADLAAFVDRAKESEALVHLACRYVRNIANGRFLWRNRTVASSATVEVLLPDGTKKMFDAL ATPFNHFEEVSDDERTVAEVLARGWKGDPTTELRVTAHVDLGVGGAVEVFPSQNYLENKA RGFARPLYCVGGAPDGQDQHSVRIMGQAALRDQKIGNALRTIDTWYPAYEERRVPLPVEP NGASLDAQEFFRNKGDASGFKLMLRVSELDPMTPDGLFLLACIIRGGVFSGSE >gi|316924748|gb|ADCP01000014.1| GENE 4 3778 - 4611 707 277 aa, chain - ## HITS:1 COG:no KEGG:ACIAD2481 NR:ns ## KEGG: ACIAD2481 # Name: not_defined # Def: hypothetical protein # Organism: Acinetobacter_ADP1 # Pathway: not_defined # 3 253 4 267 294 153 38.0 7e-36 MSYLVFPRVRVQAANMLSASFLMGGPPVFAAYGLGEALCFHLGGGAKVTGMALIHHNREA LGQSFYGVFSPQQRRAAAFTFGKSANGSDYSSKNPHALSLQPVACAHLRVSIIWELEQVA GVKEAREFLHRARLSGGLVTGHGEIVLEDSLEAAFDRVGNGYVVTDRRDMLEDKGKNQAE LLVEALGAQPSAGEDNTWLSAACLGYAAITPFEHRGGARCGYTHAFAEPLVGMVQYRSLR QWRKEADAEEALWRPVWLDDRRGAFVLRQEQTEPEDM >gi|316924748|gb|ADCP01000014.1| GENE 5 4608 - 5852 637 414 aa, chain - ## HITS:1 COG:no KEGG:ACIAD2480 NR:ns ## KEGG: ACIAD2480 # Name: not_defined # Def: hypothetical protein # Organism: Acinetobacter_ADP1 # Pathway: not_defined # 28 371 17 354 401 165 32.0 2e-39 MDEANMEADGGLQERQAKNEKLWKAYPPETYRNTIAELVHFSSKTVNSNAALAGVRLASD DALDAPYVSARTLMARGVSLSLDYGKGGAAVAGICKQVNAVVRGNPFAEAASSEERLAIE DAMTGACSSAFSVGVEHVDHRLRQILIPKEGAEGGYVSMTPMTAGGICELLFEKDNGLVP RHNAACEEEGKRTGKPENGEEVGNAPRMKMRKLRQAQFGIGGSNPQNVGGLVRVMQRPLF VDAPRSADDLRAAFFLYYKGISIDFSLPGPLRQALLAYADFRRRHGLDNGESATPGRTDL KAREEEEALLGNIAAAVLQRADEAREMLRDYAVMLPQEQNPETGTEALVSPVLRPIAMRG LLDPALRDSTWPRDMAWLVIGGMEHASYANGNRVMVLDVTATATVAGLLEEAFR >gi|316924748|gb|ADCP01000014.1| GENE 6 5867 - 9106 1709 1079 aa, chain - ## HITS:1 COG:YPO2467 KEGG:ns NR:ns ## COG: YPO2467 COG1203 # Protein_GI_number: 16122688 # Func_class: R General function prediction only # Function: Predicted helicases # Organism: Yersinia pestis # 277 931 312 946 1095 232 31.0 3e-60 MHVIFISACQKRAIRRSRAILDSYALRAGERCWMSPMTTEGLQEVRAALARSATRQTSVA CYQNDGRRRMKLLWIVGARGRFGPEGHFPAGYRRITTPVRVSEWVRHAGLAARLAGLLHD 
LGKYSLKFQEKLRGKAELADAVRHEWISLKLWQAVRSGMDWKTAWANVYTKRLEMFIGKR EICNASRYGLASAQECVDALVSTHHGLFSSRLPCPEGRLVRENLPCPPDDALFTPWSEPE GSFGELAQRLDKRLSAAAVGMAGEAWIPYWRSVFLYARAALVFADHTVSALECPPNKDTS PAFANTHVAPDGRRLNQRLAEHLSRVSEKAADLVWRMANLVERPEASLTGLQPCSLEAVL RPADPESRFAWQNTAADALAEAREAYPTSGALVFNMAGTGSGKTRMNVRAACVLSRSDAP RFSIALNLRSLTLQTGRALQCQLGLSDSDLATVIGDDVTRKLFNASNIKEKENIAAPWSD DDGNPAETEALTSGGNWPLPAWLESFFPKPQERRIVGAPLLVSTIDYLIAAGEPHRQGHH VKALLRLMSADLVLDEIDSYDPESLVAVLRLVQWCAFFGRNVICSSATLSRPVAQAVEAA YASGAEMARALRHGKSACTDEEKPRSEAKTLYVLAFIDDALPPLIKAVPHVPEKGRAERA AALLALYDSRVSAQLEAIAALPVYRHAELLPMAEESVSTWMGAVTAGVQKLHARHALTDA RSGVAYSFGLVRVANIATAVDVARHLAKELPQARVACYHANDWHIARFHKEKRLDFLLSR AEGDRHIVADHEIRAFLDEAAHEERPDVPFIVVATPVEEVGRDHDFDWAVLDVSSAQSLV QAAGRVNRHRLRPCGDAPNIAVPQFNWRHCRNSDRGEPRAPAFCWPGYEGKDKERYSIHD LAQILPWREGRLVITADVRLGEDCRLGRRDDKAIAKRLFPYFGPETGEYSFVAERSPAAL LSEWVYNETPLRNRTGRTEMWRLGHDVERLGDVYEAFVYQGERRGTGGQWVERDERSFRE VDALPNAWLWLDAETMRAQCEAVGVDEERALSAELVSYSDTDRWEYDKGFGIVRRPAAE >gi|316924748|gb|ADCP01000014.1| GENE 7 9573 - 9722 85 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSVSTVHRATQDDTIGGCQLNTLAKIADALGVKTKRLYEEVEECKKEHS >gi|316924748|gb|ADCP01000014.1| GENE 8 9965 - 10303 359 112 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MATPSITDLQSLVNRLTADNNRLRNLVKLYEATFAAMKPGKYYVCSDDGEVSSSYWRGGA AETFASELAADRTGATVVLVTRRYEADPAPREDFGVLPGVDYPATLTPAVGL >gi|316924748|gb|ADCP01000014.1| GENE 9 10557 - 11159 460 200 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212703283|ref|ZP_03311411.1| ## NR: gi|212703283|ref|ZP_03311411.1| hypothetical protein DESPIG_01325 [Desulfovibrio piger ATCC 29098] # 38 118 138 219 332 68 43.0 2e-10 MGYRGKKALVIKLAYIEAFNALEAELQRQRGALPPADALAPSQQAQLKALVDAKVGMLPK EHQRKGYGEVWSRFARHFEIAKYTQLPPERMAEAVEYLIGLELKVAEPALPDMRSLPRFA SYDRYAASIEAAYTILSRLNALDVQFSQAHLGTHPLRGADCSRFYDAVMREVRISATMVD AARNALNNAIGLRQIAETVW >gi|316924748|gb|ADCP01000014.1| GENE 10 11162 - 11248 58 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPPITRLEIWMCGYVLGIVTMLIREMGM >gi|316924748|gb|ADCP01000014.1| GENE 11 
11294 - 11491 115 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLEQEKPRRTIAERIRKKEPKPKKAKTPKKRESYRVEVEPGESRIMRCPRPMLVTAPAWL AEWFR >gi|316924748|gb|ADCP01000014.1| GENE 12 11539 - 11943 324 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294789834|ref|ZP_06755062.1| ## NR: gi|294789834|ref|ZP_06755062.1| ASCH domain protein [Simonsiella muelleri ATCC 29453] # 3 95 1 99 153 71 43.0 2e-11 MEIKALSVRQPFASQIVIGEKTIEWRSKPFNWRGPLVICASKSAIIELDNGKRLPVGVAL GIVDVVGCRPMTRKDLVAACCEDYEDEMYGFAWELVSPREVVPVPVKGIVAPWPWRGPEL TLCPGWHDRNVLDD >gi|316924748|gb|ADCP01000014.1| GENE 13 12075 - 12329 275 84 aa, chain + ## HITS:1 COG:yefM KEGG:ns NR:ns ## COG: yefM COG2161 # Protein_GI_number: 16129958 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Antitoxin of toxin-antitoxin stability system # Organism: Escherichia coli K12 # 3 82 11 90 92 79 52.0 1e-15 MSQAITYSEARQNLAETMNRVCDHHEPVIITRQKSPSVVMMSLEDYNSIMETAYLLRSPA NAVRLREAIQAADTGKAIPHELED >gi|316924748|gb|ADCP01000014.1| GENE 14 12332 - 12592 299 86 aa, chain + ## HITS:1 COG:AGc3658 KEGG:ns NR:ns ## COG: AGc3658 COG4115 # Protein_GI_number: 15889305 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 86 1 88 89 99 54.0 1e-21 MLLTWTPQAWEDYLYWQHTDKRTVKRINELLRDAMRNPFEGMGKPEPLRFDLAGCWSRRI NQEDRLVYKVDERSEGLIVLQCRYHY >gi|316924748|gb|ADCP01000014.1| GENE 15 12645 - 13394 743 249 aa, chain - ## HITS:1 COG:VC0470 KEGG:ns NR:ns ## COG: VC0470 COG2356 # Protein_GI_number: 15640497 # Func_class: L Replication, recombination and repair # Function: Endonuclease I # Organism: Vibrio cholerae # 29 243 24 229 231 159 40.0 5e-39 MGKHVLAAVLAVLFMASGAQAAGNVWNDSFNKAKKTLEWQVYYDHRITLYCGAAFDEKKD VALPEGFTAPKHEKRAGKIEWEHVVPAENFGRAFPERREGDAQCVDKRGKAFRGRKCAER VNREYRLMQSDMYNLYPAIGAVNALRQNYNFQMLPGEEPDFGSCGMKIADRRAEPPIRAR GQIARTYKYMADAYAPRYRMSRQQAQLMDAWDRMYPVDAWECTRAKRIENLQGNENPFVK RPCREARLW >gi|316924748|gb|ADCP01000014.1| GENE 16 
13634 - 13819 162 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHIWQICDIHARDFVGMAGQARPIRLEAVRAECDRTDDPEGNFQKVMLIEAILFPSRYLK K >gi|316924748|gb|ADCP01000014.1| GENE 17 13939 - 14325 546 128 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKFVDDATAKDTAFIKCFPDDSGVYARVLTETELDILRIKSRTFNSNEKRTPELMDRRFK ILHLQRALSGWEGLEFEDGSPIPFSKEMIKELWEVNPNLMGIIYSCVSSELSFVKAAEEK NSVTGADA >gi|316924748|gb|ADCP01000014.1| GENE 18 14405 - 15043 674 212 aa, chain - ## HITS:1 COG:no KEGG:DVU1494 NR:ns ## KEGG: DVU1494 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 5 209 178 384 387 91 28.0 2e-17 MGAEAQAFATSIVVEDAKVFFTGQKIQNVTKSDDNSSKGYAVTAVDERTNTQTVAPGISG AWAVDDVVTWWMPYGPAIGNELENADSVIRIDGTAGKMRSCTIKFSTPTEFTDELGDHFP GQPIDTMRASSVDFEYYMRNDAAKRLREGSEGKEVRFDAEFGSEEGRKVVVSCPRIKNKM PAINADSATVTLSQSSDILGVALEDAVEIILE >gi|316924748|gb|ADCP01000014.1| GENE 19 15240 - 15539 368 99 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLYRQKKDGQVYNDQTMRYEPSIKDTPFKGIRKNAKIEENPELPIQLGDCIILAAASDLP VPAVPDQIIMDGETWSVVDSAPVAPGDTALVHNILIRRG >gi|316924748|gb|ADCP01000014.1| GENE 20 15605 - 16117 659 170 aa, chain - ## HITS:1 COG:no KEGG:Msip34_1624 NR:ns ## KEGG: Msip34_1624 # Name: not_defined # Def: hypothetical protein # Organism: Methylovorus_SIP3-4 # Pathway: not_defined # 1 151 1 152 167 88 37.0 9e-17 MPLIVEDGTLPAGANSFASVADADAYHAARLTAAWTDELAEVQKEAALIRASDWLNRKVM WNGRKASRSQRMAWPRSGVVTQDGEIAPDEIPAEVVEACCELAGFFVEQDYLAPLDRGGD IASLSVDVISIAYNGTTPAETVFPSLSGLLAGLGTVCTGKGGGIMEVGRG >gi|316924748|gb|ADCP01000014.1| GENE 21 16261 - 16581 254 106 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHISTVRIRSVETASGFIVINEADFDPSKHQLWNSEVTTATPEPSSAEPNANKPLDLMTL TELRDHAKAHGIAIPATITAKADVLAHVLAASGACAENETAGLPQA >gi|316924748|gb|ADCP01000014.1| GENE 22 16590 - 17642 1628 350 aa, chain - ## HITS:1 COG:no KEGG:SAR116_2254 NR:ns ## KEGG: SAR116_2254 # Name: not_defined # Def: hypothetical protein # Organism: Alphaproteobacterium_IMCC1322 # Pathway: not_defined # 2 350 3 392 394 207 35.0 4e-52 
MNDLSKVVDKLLAQGLLALRGTCVMPRLVNSDYSNLAAQQGASIDVPIPSAIKAQAVTPG ATSQDTGDISPVSATIKLDRWMEAPFYLTDKDLMEANRGVIPMQASEAVKAIANDVNATL LGLGRKFYGMVGTPGTTPFSTVVDATNARKVLNRQLAPVNDRRIVLDPDAEAAALGLSGF ADVSKSGDARPIIDGTIGRKYGFDWAMDQQVPTFEASVMTEGALTVNGANEAGAQVVSLA KATNAAGLKEGDILTIAGDAQTYVVMEAVTVSGSHVMNLAFHRDAIAFATRPLMDSANGL GNLIQSAVDPVSGLSLRLEVSREHKRTRFSYDILYGADVVRRELGCRIAG >gi|316924748|gb|ADCP01000014.1| GENE 23 17668 - 18153 684 161 aa, chain - ## HITS:1 COG:no KEGG:MCR_0943 NR:ns ## KEGG: MCR_0943 # Name: not_defined # Def: hypothetical protein # Organism: M.catarrhalis # Pathway: not_defined # 8 138 92 222 244 87 42.0 1e-16 MTERFPNLEKALADSKKDSADRLAAKEASIRTLLVKGIFDSSAFLKDKTVLPSDVAYASF GRHFEVKEENGELRVVATMNGQPIFSRSAPGTFAVSEEALEAIIDKYPMKDRILKAPDGG SGSHPNAAYAPGAKIIPKGDMSAFGTNLEAIAKGEVTVAAQ >gi|316924748|gb|ADCP01000014.1| GENE 24 18664 - 18918 98 84 aa, chain + ## HITS:1 COG:no KEGG:Oant_0622 NR:ns ## KEGG: Oant_0622 # Name: not_defined # Def: hypothetical protein # Organism: O.anthropi # Pathway: not_defined # 1 84 13 96 96 127 75.0 1e-28 MKKKHQMTLKQIFARPVSGSIRWADIEALFVELGAEVSERAGSRIVVVLFGEVRVFHRPP PAPTTNRGAVASVRIWLESHGVKP >gi|316924748|gb|ADCP01000014.1| GENE 25 18915 - 19253 453 112 aa, chain + ## HITS:1 COG:RSc1697 KEGG:ns NR:ns ## COG: RSc1697 COG4226 # Protein_GI_number: 17546416 # Func_class: S Function unknown # Function: Uncharacterized protein encoded in hypervariable junctions of pilus gene clusters # Organism: Ralstonia solanacearum # 1 111 1 111 112 115 57.0 2e-26 MNNVMTFEDGYKAVIAYDPKIEMFRGEFVGLNGAADFYAADLEGLKREGKTSLEVFLEVC AEKGIAPKKQAGRFALRLDPETYQSVAIAASASGKSINQFIVDSLKQSVQAV >gi|316924748|gb|ADCP01000014.1| GENE 26 19265 - 19552 334 95 aa, chain + ## HITS:1 COG:Cgl2962 KEGG:ns NR:ns ## COG: Cgl2962 COG3093 # Protein_GI_number: 19554212 # Func_class: R General function prediction only # Function: Plasmid maintenance system antidote protein # Organism: Corynebacterium glutamicum # 5 74 16 85 121 93 58.0 1e-19 MCAAIHPGEILREEYMAPLGLSANALAIALGIPATRIHEILHEKRGVSTDTAMRLARYFG 
TDMEMWINLQAQYEACLLEREKGQEFMRIIPHRAV >gi|316924748|gb|ADCP01000014.1| GENE 27 19570 - 19866 254 98 aa, chain + ## HITS:1 COG:no KEGG:AM1_B0188 NR:ns ## KEGG: AM1_B0188 # Name: not_defined # Def: HicB family protein # Organism: A.marina # Pathway: not_defined # 4 98 6 100 109 134 66.0 1e-30 MEKYTYRVIWSEEDQEFVGLCAEFPSLSWLEEEQDAALHGIVRLVSDTLKDMEANKEPIP EPLSLRKFSGNVVVRTTPEMHRQLAILSAEAGVSLNRL >gi|316924748|gb|ADCP01000014.1| GENE 28 19883 - 21724 1321 613 aa, chain - ## HITS:1 COG:RSc2311_2 KEGG:ns NR:ns ## COG: RSc2311_2 COG4585 # Protein_GI_number: 17547030 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 410 611 1 205 207 136 36.0 1e-31 MNLLRHPLGLASMASLPLLLFATAFVLLAGVLLFAFRDINAMLEQAAEQKIPVLTQTTAL VRECEWLRGMMFRLMQTDSELVREAIIITLKERIDDAGEMASRLEAIGLYKDKVESLKEQ LSSLDGIADDVNTLITRWIILREQRQNQVKSLRSLSEELIALDVSGVEQLEAWKQGAYRI LSLLLVLSTDLDVPYGLRVTSEMYEWVRNVRADLSRVPSTPHISGMVEAANHLYDKLILC AQGENGIIPSFKRQHAVSQQFAALTIRMDALSESIMIVAGQLMAQAKTDAEHARDGFSER VSGLSGLLYGFVVLVLAVIAATYLYLARHVIRPVIRLNNCMRLRTQGFSANIPYDGAWEI REMARSVFYFVSEIEHREHELRESHAGLEEQVATRTAELKRLSNRLLQAQEEERFKLAAE LHDDIGATMGVIKFGIERALLMLDRPDAIKVQEPLNEAVDLVKGLARQLRRIQNELRPAH IDVGLLTSLEWFCKDYQGAYSHLRLDLAADVDENDIPMTLRIVLFRVIQESLNNIAKHSG ATSVKIRLQKQGNTLKVSVTDNGVGFDITAAMGMQETGRGLKSMRERVELSGGTFIIRSK PNKGTQLTACWEN >gi|316924748|gb|ADCP01000014.1| GENE 29 21721 - 23307 626 528 aa, chain - ## HITS:1 COG:AGl2786 KEGG:ns NR:ns ## COG: AGl2786 COG0747 # Protein_GI_number: 15891502 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 33 490 94 554 585 217 32.0 5e-56 MRSMLWALCCAAFLWGFSCPASAAGIKEIVVAIESTPPHFNPALLSGSTVATVGAQLYAG LTRVDAAGNAMPYLAKSWKASPDGRSVRFELRENAFFHDGHPITSGDVAYSILMSREYHP FRPMLESVTRVDTPSPHTVIVRLSRPFPLLPKVLVPALVPILPRHVFDNGHPLPTHPANK KAVGSGPFMLESFTPDRSIRLVRNPRFFLPDRPIADRLTFRIFWDQGEIPLALASGEVDV 
YLFASCIQLRRFLDGYKFAPFTVLPLGNLHPYTVLSYNLRQKPFSDRKVRRAFALAIDRN FLADRLLHGMVKPLSGPFPPGALFYTPLPAPHDIAEANRLLDEAGYPRGKDGKRFSIIVD YSPDSPLPRELLRLLQYELAARLGIEVRTRDSESVSDWMRHVVTGDYQVVLDELFAWYDP VIGIHRLYSKNNSDRGIVWTNSGGYANDTVEALMQQATEEAIPRRRQTLYDRLQRELVED NALLWLTTSPYALVAQPSLRHLEQLGLGLMSPLDGIEKVPSLFLGGNP >gi|316924748|gb|ADCP01000014.1| GENE 30 23311 - 23967 604 218 aa, chain - ## HITS:1 COG:SA2179 KEGG:ns NR:ns ## COG: SA2179 COG2197 # Protein_GI_number: 15927969 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Staphylococcus aureus N315 # 3 214 1 216 217 154 39.0 1e-37 MSVSIVIAEDHALFRSGLKQLLLLGEGLEVIGEAGDGFEAVTVTVAGRPDLLLLDLSMPG RDGLEVIAHIRKECPDTRILIISMHATSMHIRAAFKAGAHGYLLKTADRQEFLFAIQSVL RGNRYVSTELTGTMIDWCVGEQTEEIASPLAQLTQREREILKLVAEGCSNKEIAARLSVA EKTVVTHRTNFMRKLGLRNVREVTMFALECGLIDMKRR >gi|316924748|gb|ADCP01000014.1| GENE 31 24137 - 25810 1078 557 aa, chain + ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 13 547 14 541 553 312 35.0 1e-84 MAIREGTGAVPQIADTVYGNGTILTMNPAQPEAEAVAVAGGVIIGVGALSDMKALCGAES RFVDLRGRTMLPGFIDAHSHFIDNAARTPWVNINSKPLGPVESIEDMLRLLGERAGRTPE GGWIVGWGYDDTMIKEMRHPTRADLDTVSTKHPIVIQHISGWVTAANSAALRLAGITRDT PDPENVVIRRSASGEPSGVIEASHCPVLALVPQLDQDQFLDTLSAGSDMYLAKGCTTAQE GWVADPNWFPLMGEALKRRTLKLRLVLYPLGQDISLEEYGRVFPDVPSGTPLDEEGKLVM GATKLSADGSIQAYTGFLSQPYHKTPEGKPGYVGYPSNDLEWLRQRIIDLHSNGRQIAVH CNGDAAIDVALDAYEEAQRQSPRDDTRFIVIHSQMARSDQIERMSRLEAIPSFFITHTYF WGDRHYAIFMGPERAVRMSPTGDALRCNLPFTLHNDTFVTPIDPLLLVWSAVNRLSYGGR DLGKAEQGIPVLEALKGITINAARQGFEEGVKGSIEPGKFADFVVLDENPLAVDLMHLKD IEIAATIVGDVLAYGAL >gi|316924748|gb|ADCP01000014.1| GENE 32 25897 - 27069 1107 390 aa, chain + ## HITS:1 COG:ECs4796 KEGG:ns NR:ns ## COG: ECs4796 COG0477 # 
Protein_GI_number: 15834050 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 3 383 23 408 421 148 27.0 2e-35 MALPYVHIKFYDVIREATQATNNELGLLMTIFTGVGMLLYIPGGVLADRGSPKKLLIISL AMMAALNTIFAVHTSYVMALLIWALLAVSANMIQWPTLIKAVRDTGSKEEQGRMFGTFYG ATGIFSSLIGFAGAALYSSVDDRIQGFHYMLYGQAVIAVLALIAVAMFVDDKTEYNQSVS VPEENPFKSALTVMKLPAVWMMVILIFCGYGMYIGIAYMTPYTTNVLGVSITFGAILGTI RAFGLRILTGPFSGYISDKIGSTAKILMVCFLLLIGMLFVLLRLPAGTSSTMIILLTMMF SFFGLVVYTIMFSCMEEVRIPPQYTGISVSVISLLGYLPDGIFSPLFGHWLDVYGNEGYR IIFYFLAMISLIGSIVSLLIYRRGKAMRNA >gi|316924748|gb|ADCP01000014.1| GENE 33 27084 - 28505 818 473 aa, chain + ## HITS:1 COG:no KEGG:LI0461 NR:ns ## KEGG: LI0461 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 2 473 3 498 498 372 43.0 1e-101 MKRLIALFLAAGLILSTATAAQAVDFKVKGLSQHRLSWADRNFTKGNGDDNFKAASRLRT TIEAVVSESLKGVVFLEVGHQNWGVSRDGASLGTDGKQVKVSWSYVDWSVPKTETRVRMG LQPYTLPTFTGTGSPILDGDGAGISISNRFSENVEGTLFWIRPGNDNASEETGKRDPHDA MDFIGVTLPLSLEGLKVIPWGMYGIIGRDSLHGSAGDLEVAAAGLLPQLATPSIVSLSDD AHGDAWFGGVASELTVLDPFRFAFDAAYGSVDIGSAPVNGRKLDVKRSGWYAAFLAEYKM ENVTPGLLFWYASGDDANPWNGSERMPSLDPDVYVTSYGFDGTYYGGAAQTMGYGLSGTW AVMAQLSDISFLEDLKHTVRVVYYQGTNNTQMVREKSVTNPQDTMYSMLYLTTEDKAFEV NVDSSYKIYENLSLYWELGYIRLDMDEALWKNSVGYEANKNNFKCTLSMIYTF >gi|316924748|gb|ADCP01000014.1| GENE 34 28720 - 29892 529 390 aa, chain - ## HITS:1 COG:no KEGG:Acid_4109 NR:ns ## KEGG: Acid_4109 # Name: not_defined # Def: hypothetical protein # Organism: S.usitatus # Pathway: not_defined # 7 235 6 239 455 82 30.0 4e-14 MNSSYEQKHPLYAASSTKRRKAMDLYEGGERIEGNRAYLTRHTYESDKQYDIRASRATYR NFAAPIVDVFASFINEGRPERILPDALRPIEADADRYGMTADAFFADVTRLAAAGGARFV IVDMEQKKGETVAEDRASGRRMVPYFTSISPDDVWDWGMDAKGLAWAVVHSEEMEHPAAF AAPLRYETLTVWTRDSWTRYRRPMKDKENTEYAIAGDGERRHGLGAVPLVPFLFEPTSPM 
TGLPATDDVLSLILRIYRRDSELDKMLFDRAVPLLNVGGVSQEHWDTFVVASSNALMSTE PGGITAQYVEPVRHRVSGAGRSPCPRRCGKSPFVWSVRNPPWESPPSPRPSTKHSSTRSL PASPGAAAARKPSAGSSPPGGLELRRTGSR >gi|316924748|gb|ADCP01000014.1| GENE 35 29902 - 31320 1272 472 aa, chain - ## HITS:1 COG:no KEGG:HSM_0897 NR:ns ## KEGG: HSM_0897 # Name: not_defined # Def: hypothetical protein # Organism: H.somnus_2336 # Pathway: not_defined # 13 354 4 386 515 218 36.0 6e-55 MKATVTDGIPVPVYHAEPTLARFHASDAFFRGVRGPLGSGKSVGCCAEVMSRILRQKAFM GLRRSRWAIIRNTYGELKTTTIKTWMDWYGAVTKISYGHPIMGLINMPLEDGTDVQAELV FISLDRPDHVKKLKSLELTGVWLNEASELPEEVLQMATGRVNRYPSKMQGGFSWTGIIAD TNSCNVDNWWYRLAEEERPEGYDFFAQPPALLVKKDAEGKLQYREKPEAENIENHTAGYG YYFLQLPGKHTEWVNVYVMNRYGSSNPGNLVYPEYGDGNLTGVSFDPGRDVIWTHDFNFT PLSSVILQTDKDGNVYAVDEIVLSSAVAKQAAAEFCERYAGFRGIVYLYGDASGHVGEKH GHASDYVTLEQELRRNGFRVQQKVPRTNPAIKDGQASLNAKICDAMNVRSFFVNKGKCPM LHKGLSTLKLKGGSTFQEEDAEYQHITTAVRYYTAVEFPIKKREIFFGSSHG >gi|316924748|gb|ADCP01000014.1| GENE 36 31322 - 31669 182 115 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRELRILCRIKKAMDAEEIAGTPTGKMNGQGKELLHPAMLLIGAQQTKKADGNDATSLST VSETQEMHYLRLDAAHSMVLEQIRRVSDSLARMEGEGGEESRNGVAIVLDLGGGR >gi|316924748|gb|ADCP01000014.1| GENE 37 31948 - 32376 418 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294789837|ref|ZP_06755065.1| ## NR: gi|294789837|ref|ZP_06755065.1| conserved hypothetical protein [Simonsiella muelleri ATCC 29453] # 1 132 91 222 228 87 33.0 2e-16 MDLTDREEKQANVFQNNPDAQGDWDVGMLGNLMQADGFTAEELGFTESTAVMMFDGDPRF SALFQDTEDVTATKEQIREVRDHRADSMKDMQEAQSADFYFTVVFQNQKAKDRFLMEMGV PVCEQFVNGDILSRRFGVDSGE >gi|316924748|gb|ADCP01000014.1| GENE 38 32638 - 33324 525 228 aa, chain - ## HITS:1 COG:no KEGG:LAR_0782 NR:ns ## KEGG: LAR_0782 # Name: not_defined # Def: hypothetical protein # Organism: L.reuteri_K # Pathway: Sulfur metabolism [PATH:lrf00920]; Metabolic pathways [PATH:lrf01100] # 7 223 13 228 232 199 45.0 9e-50 MAATTLLFDGIKAAANITDRVLVFYSGGKDSAVTLDLCARYFKRIQPVFMRLGPILSFQR ACLDWVERRYRVPVMIVPHPMLAEWLRYGTFREYDFDVPIISFLDVYTYARSKSGVWWIA 
AGERIADSIVRRAMIKGDGGVVNPKRGRFFPLAHWSKSDVYAYIRHHRLKVAPETQHLGF SFRSLMGSDLAAIRRVYPKDYARIESWFPLVGVSLAQYVFAKKQEGTT >gi|316924748|gb|ADCP01000014.1| GENE 39 33331 - 33489 156 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAKTQTAGQAVGTIGRRHYSTGGTTDGRRSRSDLGSRGRNMLRRKSMGGSGG >gi|316924748|gb|ADCP01000014.1| GENE 40 33548 - 33823 288 91 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNDDVQLVREIGEVKAELSAVKAELAGLRERIDDVVISILRDHGKRMAMLETRVAALEAA ETRRAGGMAALVAVAAAAGAAGNVLSRWIAG >gi|316924748|gb|ADCP01000014.1| GENE 41 33820 - 34032 114 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTLAASGCARWAGPTAATSPAPLTPGAVVTGEWSYTYRGETFTEPGEWVHLPAGEAGNLL LWIKGVEAGR >gi|316924748|gb|ADCP01000014.1| GENE 42 34140 - 34430 376 96 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEATVIDFILSTLMSFSAQYPDAARLVTALSVVMTVCGLCAVATAWIPVPKETEGLYAVF YRWAHALAAHFGQNRGAVADGKSETVKAEVKAVTGK >gi|316924748|gb|ADCP01000014.1| GENE 43 34430 - 34741 162 103 aa, chain - ## HITS:1 COG:no KEGG:RB2501_01256 NR:ns ## KEGG: RB2501_01256 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 12 94 31 113 117 100 56.0 2e-20 DGRDLLQDGSTRPRDLAGIPFPLSSAYRCPKHNKAVGGVPTSAHTRGYAVDIRCVDSHSR FVMLQALLEAGFRRIELAPTWIHVDNDPDKPRDVAFYQHGGKY Prediction of potential genes in microbial genomes Time: Fri May 13 02:00:01 2011 Seq name: gi|316924726|gb|ADCP01000015.1| Bilophila wadsworthia 3_1_6 cont1.15, whole genome shotgun sequence Length of sequence - 28811 bp Number of predicted genes - 21, with homology - 20 Number of transcription units - 10, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 80 - 466 419 ## RB2501_01256 hypothetical protein - Prom 490 - 549 3.9 + Prom 1322 - 1381 12.2 2 2 Tu 1 . + CDS 1593 - 2672 1179 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components + Term 2692 - 2729 7.1 3 3 Op 1 . + CDS 3385 - 4251 814 ## Ddes_0043 hypothetical protein 4 3 Op 2 . 
+ CDS 4256 - 4963 708 ## COG3382 Uncharacterized conserved protein + Term 4999 - 5057 6.3 5 4 Op 1 . - CDS 5244 - 5789 469 ## COG1988 Predicted membrane-bound metal-dependent hydrolases 6 4 Op 2 . - CDS 5875 - 6804 824 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily + Prom 6968 - 7027 7.7 7 5 Op 1 . + CDS 7122 - 7454 366 ## azo0151 putative transcriptional repressor 8 5 Op 2 . + CDS 7477 - 11424 3011 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II + Term 11552 - 11591 -0.6 9 6 Tu 1 . + CDS 11734 - 12723 1060 ## COG1446 Asparaginase + Term 12797 - 12831 1.2 + Prom 12875 - 12934 2.1 10 7 Op 1 11/0.000 + CDS 13104 - 14996 687 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Term 15022 - 15071 5.1 11 7 Op 2 38/0.000 + CDS 15219 - 16763 2165 ## COG0747 ABC-type dipeptide transport system, periplasmic component 12 7 Op 3 49/0.000 + CDS 16829 - 17752 1156 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 13 7 Op 4 1/0.000 + CDS 17760 - 18647 725 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 14 7 Op 5 1/0.000 + CDS 18650 - 19717 479 ## COG3191 L-aminopeptidase/D-esterase 15 7 Op 6 . + CDS 19756 - 20574 240 ## COG2362 D-aminopeptidase + Term 20597 - 20633 6.1 + Prom 21151 - 21210 1.7 16 8 Tu 1 . + CDS 21236 - 22261 453 ## COG4974 Site-specific recombinase XerD + Term 22391 - 22428 8.1 - Term 22420 - 22468 9.0 17 9 Op 1 . - CDS 22504 - 22950 423 ## 18 9 Op 2 1/0.000 - CDS 23043 - 24542 1704 ## COG3333 Uncharacterized protein conserved in bacteria - Term 24559 - 24596 3.8 19 9 Op 3 . - CDS 24602 - 25555 1240 ## COG3181 Uncharacterized protein conserved in bacteria 20 9 Op 4 . - CDS 25607 - 26839 1081 ## COG0498 Threonine synthase - Prom 27025 - 27084 8.5 + Prom 27574 - 27633 4.3 21 10 Tu 1 . 
+ CDS 27678 - 28634 1113 ## COG3181 Uncharacterized protein conserved in bacteria + Term 28692 - 28745 14.4 Predicted protein(s) >gi|316924726|gb|ADCP01000015.1| GENE 1 80 - 466 419 128 aa, chain - ## HITS:1 COG:no KEGG:RB2501_01256 NR:ns ## KEGG: RB2501_01256 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 6 119 1 113 117 128 54.0 6e-29 MAVLPLRHFSPLEFRCKCGCGAGMEKMDADLLQMLDEARDLAGIPFPLSSAYRCPKHNKA VGGVPTSAHTRGYAVDIRCVDSHSRFVMLQALLEAGFRRIELAPTWIHVDNDPDKPRDVA FYQHGGKY >gi|316924726|gb|ADCP01000015.1| GENE 2 1593 - 2672 1179 359 aa, chain + ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 43 267 43 258 332 71 26.0 2e-12 MRLRKFFSCCIALLLCLAISAPAFSAPLTKVRTAWMDEFEAFPIWYAKEKGWDKEAGLDL EILYFTSGMAILNALPSGEWQYAAIGAVPAMMGALRYNTYVIANADEEFLINRVLVRPDS PIAKVKGWNKDYPEVLGSPETVKGKTFLTTTVSSAHYALSVWLRVLGLTEKDIVLKNMDQ AQALSAYEDGIGDGVCLWAPHAYVGEAKGWKLVANPKLCGRSSPCVIVADRKYADANPEV TARFLSVYMRAMNMLQNEPLESLVPEYRRFFLEWAGRDYPADLALLDLQTHPVFNLDEQL AMFDASKGQSKAQYAQSEMARFFAEIGSINKDELKRVEDGSYATDKFLKLIKKPIPSYK >gi|316924726|gb|ADCP01000015.1| GENE 3 3385 - 4251 814 288 aa, chain + ## HITS:1 COG:no KEGG:Ddes_0043 NR:ns ## KEGG: Ddes_0043 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 279 2 281 301 323 60.0 5e-87 MIEKCEPITHLFDAVHQLWDHRRTQRAFATLIFFIYLAGLLGIEANRQGLLPPWLEHITP MSHFQAIHLAFTLILGMEVMELILTISGSLSKSLGKQFEVMALILLRDAFKELSKLPEPV SLTSGIEPVVHIASAGLGALIIFICLGFYYQLRTHQNYITNPEEQMRYVMSKKLISLVLF LLFVGIAVRDLILFVQTGNDADFFETIYTILIFADIALVLISQRFMPDFHAVFRNSGFVI GTLIMRLSLSAPWPWNIASSIFAAVFVLALTWATTYFGPSRLRKPLPR >gi|316924726|gb|ADCP01000015.1| GENE 4 4256 - 4963 708 235 aa, chain + ## HITS:1 COG:L36841 KEGG:ns NR:ns ## COG: L36841 COG3382 # Protein_GI_number: 15674160 # Func_class: S 
Function unknown # Function: Uncharacterized conserved protein # Organism: Lactococcus lactis # 5 235 6 234 235 122 32.0 5e-28 MLYEIEQAVFERFPGYARMVVVAEGVDNTREIPELAELLAQCEEGVRRDDLEDFWHVPVL ETWAEAFSGMGIKPKKNPPSVINLVKRCRAGKPLPFINPLVAIFNCISLKYLLPCGGDDL NVIEGDLRLGIADGTENYVPLGQPDVLEHPQPGEVIYYDTATKDVFCRAWCWKNGNRSKL MPETTNAVINVDAMPPLPLATAEQAAEEVAGLLRRFTGAKTAIHKLTAETPRFTL >gi|316924726|gb|ADCP01000015.1| GENE 5 5244 - 5789 469 181 aa, chain - ## HITS:1 COG:PA4962 KEGG:ns NR:ns ## COG: PA4962 COG1988 # Protein_GI_number: 15600155 # Func_class: R General function prediction only # Function: Predicted membrane-bound metal-dependent hydrolases # Organism: Pseudomonas aeruginosa # 1 160 1 160 178 144 55.0 7e-35 MPTVFSHPAPVLALAALLGGRLSTRMLLFGILCAVLPDADVIGFRFGISYADAFGHRGFS HSLAFALLMGCAGFGVAPLFLRGSRLMGFTVGLLAVSSHILLDAMTNGGLGVAAFWPFDQ TRYFCDWRPIRVSPFGLKGLLSQRGLSVMLSELRWVWAPCLAVIAAALFFGKNPMRAIPR K >gi|316924726|gb|ADCP01000015.1| GENE 6 5875 - 6804 824 309 aa, chain - ## HITS:1 COG:BMEI0908 KEGG:ns NR:ns ## COG: BMEI0908 COG1752 # Protein_GI_number: 17987191 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Brucella melitensis # 2 283 34 295 314 207 41.0 3e-53 MKIGLALGSGSARGWSHIGIIEALAELDIHPEIVCGTSIGSIVGAAYATGNLGRLKQRVC ALTKLNTASYFNFSMSIDGFVNKEKLRAFFADCVTPPGMLIEDLPVRYASVATEVSTGRE YWLKKGPLEEAIWSSISLPGLFPPVFYHDRWLFDGGLVNPVPISVCRALGADIVLAVNLN GELIGRHSAKKKPKENKEGEGDFVERIAMTIRQYGTTLFGENESPAHHPPSVFGTIASAV DIVQDRITRSRMAGDPPDIFLTPRLAHVGLLEFYKAEEVINEGKACVSRALPALETLLHR DFGHAHQGD >gi|316924726|gb|ADCP01000015.1| GENE 7 7122 - 7454 366 110 aa, chain + ## HITS:1 COG:no KEGG:azo0151 NR:ns ## KEGG: azo0151 # Name: smtB # Def: putative transcriptional repressor # Organism: Azoarcus_BH72 # Pathway: not_defined # 1 98 1 98 103 76 38.0 2e-13 MNTHYLNGLPQEWTAIAEVFSALGDGTRQTILLLFEPGESIELKTFVDILPLSRSAVVHH LKVLEQAGLLIPHKHGRALSYTLNLACAVHALGQVKTYADELLAATEAAQ >gi|316924726|gb|ADCP01000015.1| GENE 8 7477 - 11424 3011 1315 aa, 
chain + ## HITS:1 COG:CPn0922 KEGG:ns NR:ns ## COG: CPn0922 COG0318 # Protein_GI_number: 15618831 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Chlamydophila pneumoniae CWL029 # 824 1315 45 534 538 297 38.0 9e-80 MSTDSGNRKYQALALGTSYALGTFNDNFFKQAGLLLAVTTGNAVFQSQVTFLFALPFVLF SAWSGWLADRFAKKSLVICAKTLELAAMLAGAWGMVTLDWNWMLAMTFCMGLSSTLFSPA LNGSIPELFPVGQVPRINALFKLGTTASILFGVFLAGIALDQAWIETAYPFGRWLVAILA VTVAAGGLVSTAFIPAYPGAGSRNPFPWSAVIDSFRFLRTLRKDGPLHLVIWAEAFFYFL STLLLLEINNLGGAELGLSYTATSFLPVALMVGICAGSLLAARGTPESWRTLLVPAIMGI GVLLCLVPVVAAADPALRLPLLFGLYALAGTCGGLYLIPITSFIQVRPAATDKGRILGLD NCLSFTGILLAGQLYLPLSLLRPSHGHVVLGILSLGVACMFLAVMRGFGRHQNPEEPFSE PVPATSSLPPVAHGPGFRALLAFVRGLLRLRYTIEVEGLEAVRARDDGRPILFLPNHPAL IDPALVYTSLAGFAPRPLGDERQVEQPVIRTLTRLIGTISIPDLRREGRTAESGVHEALE RVAGVLRSGGNVLLYPAGGLTRTGRERLGGNRGVYSLRGLVPDVRLVLVRTTGLWGSSFS WARGTAPDILKGLARGVFELLLNGIFFMPRRRVRISVSEPELPGQADGLRTLNEALETFY NADMAPALAVPYHFLLGSTPKELPAPAMQTPDGAALADVPKAIRERVLAILREESGVEVI EDTATLATDLGIDSLSLINVSVRLEEISGQPIEQLEALRTVGDCILAAAGLLGAAGEAAE PPAAWFPTGEARTLSVPDGRNLVETAFRQAMRSPSRLMLADGAAALSARDMIMRAFVLAS FIKAKAGNGERVGIMLPASAAAVLAWLGALMAGKTPVMCNWTSGAANFSHGLEAAGVRRV FTSSRLLDRLSGQGFPVGEHADVWVALEDAKRLSLPAKLGAFLKSRLLGIPCLGEAVIPR RVPETAAILFTSGSEALPKAVPLTHMNILANCRDIAAVLKITSHDSMLSMLPPFHSLGLT GNIALPLAFGLPAVYYANPTEGARLAALTRRWKPTITVAPPTFLDGMLRKARPGDLASLR LGFVGAEKCPDSVYAAFAEAAPGGVLCEGYGVTECSPVISVNRPESVLPGTIGQPLPSVR VAVVSAEGEPRRVAPGETGMLLVSGPNVFGGYLGVDADKQPFVAFEGMEWYRTGDLVSEA ADGCLTFRGRLKRFVKVGGEMISLPQIESVLAAAFGEGRSEESGEQGPALAVESGEDGAI VLFTTLDITREEANAALRAARLSGLSAVARVQKLASIPVLGTGKTDYKALKASCV >gi|316924726|gb|ADCP01000015.1| GENE 9 11734 - 12723 1060 329 aa, chain + ## HITS:1 COG:RSc1378 KEGG:ns NR:ns ## COG: RSc1378 COG1446 # Protein_GI_number: 17546097 # Func_class: E Amino acid transport and metabolism # Function: Asparaginase # Organism: Ralstonia solanacearum # 5 323 4 319 320 301 57.0 
9e-82 MVQVPVLAIHGGAGTINREALSAEAEREYRGALQSILEKGRDALAAGASALDAVTLAVSM LEDCPLFNAGRGAVYTSAETHEMDAAIMDGSTLRTGALSCVHGVKNPIRAARVVMEQSPH VLMTSDGAMAFLREHGVEFMPDAYFDTEHRLAQLHQAQARQPGAAVLDHDGAAAASKLSF AGNPLDEKTKMGTVGAVALDSRGNLAAATSTGGMTNKLPGRVGDTPIVGAGCYADDGVAV SCTGSGEYFIRLVVGHDVAARVRYQGASLEDAVHAVLARVGELGGTGGLIAVDKKGHVTL PFISEGMYRGCVRGAEAPLVAIYGQEEGC >gi|316924726|gb|ADCP01000015.1| GENE 10 13104 - 14996 687 630 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 323 599 11 286 329 269 48 2e-71 MGPSDRNVLEVQGLTVRFDTSERSVVAVKDLGFHVRAGEVLAIVGESGSGKSVTSLSVMR LIEHGGGTIASGKISFTRRNGGKLDLAKAADSVMRTIRGGEISMIFQEPMTSLNPVFSVG TQVAEAVMLHQGLSHAEAEAEALRMLELVRIPEAKQILKRYPHQLSGGMRQRVMIAMALS CKPSLLIADEPTTALDVTIQAQILQLIRQLQEEMSMAVIFITHDMGVVAEVADRVLVMYH GEAVEEGTCEQIFHNPRHPYTQSLLAAVPRLGSMRGTDEPAPFPLLRITDPEAEQLGTAD MDETPVDMPEPAASAPSVSDGPVLSVDNLITRFDVETGFWGKVKRRVHAVEQVSFNLYPG ETLGLVGESGCGKSTIGRSLIGLETPRSGSIVFNGQELTQVSGSQLQKLRRNIQYVFQDP YAALDPRLTVGFSIMEPLLIHKVCSRQEAERRVGELLERVDLDPAMAVRYPHEFSGGQRQ RVCIARALAMNPEIIIADESVSALDVSVRAQIINLLLALQKEFRIAFLFISHDMAVIERV CHRVAVMYLGQIVELGSRRDVFENPLHPYTKRLMSAVPIPDPSRRTMSHTLLTGEIPSPV RSADYEPVVAPLKEVSPGHFVSEEQVANWF >gi|316924726|gb|ADCP01000015.1| GENE 11 15219 - 16763 2165 514 aa, chain + ## HITS:1 COG:RSc1380 KEGG:ns NR:ns ## COG: RSc1380 COG0747 # Protein_GI_number: 17546099 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Ralstonia solanacearum # 8 514 10 516 517 701 67.0 0 MFFCKTWKKVIIAATMVVALCAPQAHASKDVVLAIASTLTTTDPWDANDTLSHACAKTFY EGLFGFNEKLELRPVLAESYDVSPDGLVYTFHLRKGVKFHDGTDFNAEAVKVNFDRVTNP ENKLKRYSLYSNIAKTEVVDDLTAKVTLKTPFSPFINQLAHPSAVMISPTALKKYGKDIM FHPVGTGPYKFVEWKQTDYLKGAKNENYWRPGLPKIDTITWVPVVDNNARSAMMRTGEAH FTFPVPYEQAAVLEKDEHLKLVAAPSIVTRYLNMNMLQKPFDDLRVRQAINYAINKEALA KVAFSGYAFPSEGPLPQGVDYAVKLGPWPYNPKKAKELLAEAGYPNGFETTLWSAYNHTT GQKVIQFIQQQLAQVGIKAQVQALEAGQRVEKVESHQDAATAPVRLYYTGWSSSTGEADW 
GLRPLFAGDKTPPSMYNISYYKNPTVDADIMKALGTTDRAEKTKLYTEVQEEIWKDAPWA FLVTEKLLYATSKKLTGMYVMPDANYFFEEIDLQ >gi|316924726|gb|ADCP01000015.1| GENE 12 16829 - 17752 1156 307 aa, chain + ## HITS:1 COG:RSc1381 KEGG:ns NR:ns ## COG: RSc1381 COG0601 # Protein_GI_number: 17546100 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Ralstonia solanacearum # 1 304 1 303 307 457 77.0 1e-128 MLHYFLKRLLGLLPTLFLVAVIVFLFVHMLPGDPARLAAGPEADQVTVELVRKDLGLDKP LPEQFITYFGNIITKGDFGTSLRTKRPVSVEIGERFAATFWLTVWSMVWSVFFGLIIGMI SATKRNKWQDHAGMALAVSGISFPTFALGLLLMEIFSVSLGWLPTVGADSWKHYILPSIT LGAGVAAVMARFTRASFVEILQDDYIRTARAKGLSEWAVIVKHGLRNALIPVVTMMGLQF GFLLGGSIVVEKVFNWPGLGRLLVDAVEMRDYPVLQAEVLLFSLEFILINLLVDILYAVI NPRICFK >gi|316924726|gb|ADCP01000015.1| GENE 13 17760 - 18647 725 295 aa, chain + ## HITS:1 COG:ECs0911 KEGG:ns NR:ns ## COG: ECs0911 COG1173 # Protein_GI_number: 15830165 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli O157:H7 # 11 292 20 301 303 442 78.0 1e-124 MQNNVSVASGPTKVRTPWSEFWRKFKRQRMTLFAGSFIGLLVLIAIFWPWLVPYDPENYF DYDMLNAGPSLQHWFGVDSLGRDIFSRILAGARISLAAGFSSVLVGAVVGTLLGLTSGYF EGWWDRVVMRVCDVLFAFPGILLAISVVAVLGSGMANVILAVAIFSIPAFARLVRGNTLS LKRQTFVEAARSLGAGPMTIMLRHIFPGTISSIVVYFSMRIGTSIITAASLSFLGLGAQP PTPEWGAMLNEARADMVVAPHVAIFPSLAIFLTVLMFNLFGDGLRDALDPKIDQE >gi|316924726|gb|ADCP01000015.1| GENE 14 18650 - 19717 479 355 aa, chain + ## HITS:1 COG:RSc1383 KEGG:ns NR:ns ## COG: RSc1383 COG3191 # Protein_GI_number: 17546102 # Func_class: E Amino acid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: L-aminopeptidase/D-esterase # Organism: Ralstonia solanacearum # 9 346 8 341 341 327 57.0 2e-89 MDGLDFPRVGVLPSGQLDAITDVGDIRVGHCTLDEGNVHTGVTVVLPNDDPLMQKPLAAS 
CIFNGFGKSVGLMQMAELGCLETPLALTNTFSVGAIATAQIRAAVSAHPEVGREWSTVNP LVLECNDGYLNDIQALAVTEKHYIEACASASREFMQGSVGAGRGMSCFHLKGGIGSASRI AECGGQAFTVGALVLSNFGKAEHCLLAGRPVGHLLQERLDRIGAEAAALAAAPEKGSIIM LLATDAPLDVLQLGRLARRAGNGLARTGSLFGHGSGDVAIAFSSSQHLPQLAESAVPLLH APDGWMDRLFQAAAEATEQAIFKALFFAEPFVGRDGHRRPSIQEVLPEWRDIVRS >gi|316924726|gb|ADCP01000015.1| GENE 15 19756 - 20574 240 272 aa, chain + ## HITS:1 COG:RSc1384 KEGG:ns NR:ns ## COG: RSc1384 COG2362 # Protein_GI_number: 17546103 # Func_class: E Amino acid transport and metabolism # Function: D-aminopeptidase # Organism: Ralstonia solanacearum # 1 268 1 267 271 196 44.0 3e-50 MRILISADIEGISGVMSPEQTSPGSGEYERARLLMTREVNAAVEGALEGGATEIWVADGH GQYNNILLEELHPGACLVSGKPRLFGMMGGLDCGPWDGLFLIGYHARAGAHGVLAHTING SAFAEVRINGAPVGEYYLNGLLAGEYDVPVRLISGDDRLAEEAVSVYPNAEIVTVKRALS TRAAVHQPIGVVHKALHVAAQKALRNVSCKPLKLAGPFSVQVSTVRTYHADLFCMLPGAK RLSPTCLEFIADTPFAISRTLNCFSSMAASVH >gi|316924726|gb|ADCP01000015.1| GENE 16 21236 - 22261 453 341 aa, chain + ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 47 330 2 284 297 105 28.0 1e-22 MPLPALSSTQLPEAIPDWTQLDRKRVLEHFMLSRDYENIAEYISFQSDLLPDERVILFFL HAVYANSTKTIELYALYVIGLFNYAKKSFNQLKAWDIDAYLRHLTNKGLKPSSRNTAAAT LRSFFRHLVDSGVCTVNPAAFLKRKRNDGKGSLPGHLSHSLGRDELERLFAGMEEVGAPF RDIVLFRVLFMTGLRAEEAVSLKWMNVINWQGRWYLDVLGKGSKARRVYLPQKAQAALEE LRRHTSPRPEYPIFENLRHRGRPISRHGLYALVKKWSSLVISRGDVSPHWFRHSCFTQLA SRGARLESIQALAGHANIQTTMHYNEAAQLMSPASAIFDEE >gi|316924726|gb|ADCP01000015.1| GENE 17 22504 - 22950 423 148 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKERTIDILTAVFLIAFSLVMRSQLEGIPREGVLFPLCVLYLIMASSVLLALRAIMAGKG EMSFFGDIPPQRWCLVTAVFIAQVLGAMYVSFNICMAVGIFAMLIVLTPKKTGKALLANL VFTAAFIVFFQVFFTNIMHIYFPEPFFG >gi|316924726|gb|ADCP01000015.1| GENE 18 23043 - 24542 1704 499 aa, chain - ## HITS:1 COG:FN2105 KEGG:ns NR:ns ## COG: FN2105 COG3333 # Protein_GI_number: 19705395 # Func_class: S 
Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 16 467 14 463 494 313 43.0 5e-85 MNMFDPWVVLSNLSDPVTLLLMTLSACFGIIMGAVPGLSGTLGIALLLPFTFGLESATAL LMLGSVYCGSEYGGSIPSILINTPGTAAALCTTFDGYPMAMKGQAQKALFTALISSVFGG IFGVSALLFLSVPLANVSMKFGAPEQFWLCVFALTIIASLSSGNVLKGVIGALIGLVLAC VGMDPVTGYPRFTFNTMELMGGINVVPALIGLFAIPQALLLLRPAGEKGAMAPYNPAPGV FWQTLKDFFRRIWVTLVAGVIGVIVGIMPGAGGNIATFVAYNETKRFSKKPEEFGTGVME GIMAPEVCNNAVVGGAQIPMLTLGIPGSAPAAVMLGALMTHGLKPGFDLFSEQGDIVFTY TFGLILSNFIILLLGLVLVRIFVRALKIPQHFLVVAILALSVVGSYSISSSLMDVLSMAA CGLLGYVLVRYKYGVAPMALGLILGTTMEEGFKLSLHLGAAEGNIMLFFFSRPQSLALII LTLLSLGYSIWKECGAKKV >gi|316924726|gb|ADCP01000015.1| GENE 19 24602 - 25555 1240 317 aa, chain - ## HITS:1 COG:FN2103 KEGG:ns NR:ns ## COG: FN2103 COG3181 # Protein_GI_number: 19705393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 19 311 11 300 308 155 33.0 9e-38 MKSLFLSSLLALGLLAVPAQAAEYPAKPITLMTAFNAGGGSDVSHRLLEKFAKGVFDQPI VVTYKAGAGGEIGWTWLVGAKADGYTIGGVDLPHIVLQPMLRPEGQPGYKTEQLSPLCGL VYDADVVMVPEDGPYKTFKDLIEYAKANPGKVKVATVGKLTGDHLFLMQIEKLTGAKFTQ VPYSGSGKAIPALLSGEVDAYFGSGSSFLRMEKTRGLAIGSKERYELCPDVPTFIEQGYA IESGKYRGLATPQNIPAEARQYLETKFAELCANPEYQKAVKSSGLMPQFQTGKAFGEIIR TEGEQAKKILEAYGLLK >gi|316924726|gb|ADCP01000015.1| GENE 20 25607 - 26839 1081 410 aa, chain - ## HITS:1 COG:MA1610 KEGG:ns NR:ns ## COG: MA1610 COG0498 # Protein_GI_number: 20090468 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Methanosarcina acetivorans str.C2A # 8 404 4 396 405 218 32.0 2e-56 MHTECLGLVCRECGKRIPEAECALSCPDCGAPMRVMFSEASLRQALSAGLPAPEGRSFLR QWRSILPISDESLIDRVSLGEAETPLLPSHRYGEKLGIPDLYFKVEQGPTLSLKDRGTAL CVLKALEFGCKTVCLSSSGNNAASVSAYGSRAGLNPVVFVQKHVSAAKIFKSLVYGGRVV RIDGDMAAASRICGEMVKRRKWFQCGGPNPYRIAAKRTFAYGIVQQLGRAPDTVLIPCGG GAGMVAAHDAFREMFAAGVIDRMPRLVGVQLQACDPTARAFHEGRDAVTPVDKKPSLSDA IMNNNPYWGRYCLQAVRETGGTMISVSDADFIRTIRELGREEGLFTEPAGSVSVAALRYL 
VKVPGFENPGLTVCNLTGHGLNVPQVATDESETPGVIPPTVEAVEAFLQS >gi|316924726|gb|ADCP01000015.1| GENE 21 27678 - 28634 1113 318 aa, chain + ## HITS:1 COG:FN2103 KEGG:ns NR:ns ## COG: FN2103 COG3181 # Protein_GI_number: 19705393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 25 316 16 307 308 148 31.0 2e-35 MKTFYALLLALGICLGFTFPAMAEYPAKPIMLMTAFNAGGGSDLSHRLIEKFAKGVIPQP IVVTYKAGAGGEIGWTWLVGAKADGYNIGGVDLPHIVLQPMLRPEGQPGYKTEQLNPLCC LVFDPDILIVSEKSPFKSFGELIEYAKANPGKVKAATVGKLTGDHIFLMNVEKFTGAQFT QVPYPGGSKAAPALLGGEVDCYFASTSNFLRMQGARGLAIATKDRYELCPDVPTMNELGY KLESAKYRGLATPPNFPAEAQAYLEAALAKVCADPEYQEGVRGAGLLPYYKDGKAFGEII EQEKEKARKTLKEQGLIQ Prediction of potential genes in microbial genomes Time: Fri May 13 02:00:34 2011 Seq name: gi|316924710|gb|ADCP01000016.1| Bilophila wadsworthia 3_1_6 cont1.16, whole genome shotgun sequence Length of sequence - 16684 bp Number of predicted genes - 17, with homology - 15 Number of transcription units - 9, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 66 - 452 508 ## COG0509 Glycine cleavage system H protein (lipoate-binding) 2 1 Op 2 . - CDS 515 - 1903 688 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 3 1 Op 3 . - CDS 1900 - 2334 349 ## Desal_3028 OsmC family protein 4 1 Op 4 . - CDS 2408 - 3259 790 ## COG1712 Predicted dinucleotide-utilizing enzyme - Prom 3304 - 3363 3.7 - Term 3371 - 3401 4.1 5 2 Op 1 . - CDS 3433 - 3810 667 ## COG0251 Putative translation initiation inhibitor, yjgF family 6 2 Op 2 . - CDS 3830 - 4240 493 ## Ddes_1698 OsmC family protein - Prom 4277 - 4336 5.7 7 3 Op 1 . + CDS 4239 - 4511 89 ## 8 3 Op 2 . + CDS 4550 - 5041 367 ## COG2606 Uncharacterized conserved protein 9 3 Op 3 . + CDS 5136 - 5846 660 ## COG1802 Transcriptional regulators + Term 5849 - 5881 2.1 - Term 5981 - 6011 3.0 10 4 Tu 1 . 
- CDS 6064 - 6441 654 ## COG0251 Putative translation initiation inhibitor, yjgF family 11 5 Tu 1 . + CDS 6751 - 6906 83 ## 12 6 Op 1 . + CDS 7009 - 8181 1166 ## Dvul_2963 hypothetical protein 13 6 Op 2 . + CDS 8184 - 10463 2218 ## COG0477 Permeases of the major facilitator superfamily + Term 10580 - 10614 1.4 14 7 Tu 1 . + CDS 10685 - 12136 926 ## DvMF_2409 hypothetical protein + Term 12268 - 12304 3.0 15 8 Tu 1 . + CDS 12617 - 14506 2602 ## COG4666 TRAP-type uncharacterized transport system, fused permease components + Term 14525 - 14581 12.2 16 9 Op 1 . + CDS 14619 - 15470 1222 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 17 9 Op 2 . + CDS 15474 - 16457 1348 ## COG2358 TRAP-type uncharacterized transport system, periplasmic component + Term 16506 - 16565 3.3 Predicted protein(s) >gi|316924710|gb|ADCP01000016.1| GENE 1 66 - 452 508 128 aa, chain - ## HITS:1 COG:BH3484 KEGG:ns NR:ns ## COG: BH3484 COG0509 # Protein_GI_number: 15616046 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Bacillus halodurans # 7 126 3 125 128 117 47.0 7e-27 MNTQSLNLPEDLGYTARHVWARKDGDTLVIGITDFAQDQLGEILFVDLPDAGASFGADDE FGTVESLKTVSSLYMPVAGEVVERNEALEGKPTLVNLNCYADGWMLRIRPTETPALMTAA DYRSQLEG >gi|316924710|gb|ADCP01000016.1| GENE 2 515 - 1903 688 462 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 3 457 6 447 458 269 36 8e-72 MKLTIIGGGPGGYTAAFAAAKAGVEVTLVERAHLGGTCLHTGCIPTKTLRSSADALDTVA RLREFGIAGDCAATPDMSAIVARKRKVTATLQTGLEKTAAQLKVRVVRGDAEFVGAGLVR VASVDGSLEIAGDRVILATGSSPLELPSLPVDHRLVLSSDDALELQTVPEHLVVVGGGVI GCELAFIYRAFGAKVTVIEGQGRLLPLPSVDGEISRLLLREMKKKGIAVELSHTVSRVTP CDGGAAVEIAPFPTGAGDSRVLNASAVCVTVGRVPNTAGLAEAGIALDQRGWIVVDDTLE TSVPGVYAIGDVTGPRRIMLAHMAAAEAHTAVHNILHPEKKKVQSYTVVPSAIFTSPEIG DVGLTEEQAREQGIAARSVVFQFRELGKAQAMGALSGLFKIVAEEGTGKLLGVHIAGAHA SDLIAEATFALQKGCSARDLFETIHAHPTLSEGLYEAAGMLA 
>gi|316924710|gb|ADCP01000016.1| GENE 3 1900 - 2334 349 144 aa, chain - ## HITS:1 COG:no KEGG:Desal_3028 NR:ns ## KEGG: Desal_3028 # Name: not_defined # Def: OsmC family protein # Organism: D.salexigens # Pathway: not_defined # 1 137 1 137 142 95 40.0 3e-19 MSQSIQVSFVQEGDVYTVHTGSSVLKDIVMDYTGVPETERGGNSSTLLIAAALSCFCGSI RAALVARGVPFRAIRAQGTGTKEPNADGAMRLVGIDIDVTVDADEAYAAKVDHCAKIVKN CLITASIFEGINVSHSVRRGGSAS >gi|316924710|gb|ADCP01000016.1| GENE 4 2408 - 3259 790 283 aa, chain - ## HITS:1 COG:MTH973 KEGG:ns NR:ns ## COG: MTH973 COG1712 # Protein_GI_number: 15678991 # Func_class: R General function prediction only # Function: Predicted dinucleotide-utilizing enzyme # Organism: Methanothermobacter thermautotrophicus # 16 278 6 252 257 137 37.0 2e-32 MLPTDLLEKAMRPAFLGIIGCGAMGRLIARHVREQLPSYRLGGLFSRTTAHAEGLALELG DVPVLGLQQLIETCDLLVEATSAAAMPDIVRACLAAGKAVVPLSVGGFSLDPALLRDVEA ARAAVHVPSGAIAGLDGIRAMREMGLDAVTLTTVKHPRSLGQMPPLSELIPPTAEEALIM SPEARLLFSGTAEEAIRRFPANVNVAVSLGLAGLGFSRTRVRLFADPCAAGTLHHISARA GDCTLETMTRPAPLPQNPRSSALAMYSVPALLRGLAANPKLAG >gi|316924710|gb|ADCP01000016.1| GENE 5 3433 - 3810 667 125 aa, chain - ## HITS:1 COG:FN1973 KEGG:ns NR:ns ## COG: FN1973 COG0251 # Protein_GI_number: 19705269 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Fusobacterium nucleatum # 1 124 4 127 128 124 57.0 5e-29 MKTVISTPKAPAAIGPYSQAIRKGGTLYLSGQIGMVPATGELVSNDVKEQAAQALANMKA VLAEAGSTPADVTKVTVFIVDMADFQAVNSVYSETFGADAPARSCVAVAALPKGARVEIE AIAVL >gi|316924710|gb|ADCP01000016.1| GENE 6 3830 - 4240 493 136 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1698 NR:ns ## KEGG: Ddes_1698 # Name: not_defined # Def: OsmC family protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 136 1 136 136 205 70.0 3e-52 MATVTAKYLGDLRVECTHVASGTKLVSDAPVDNNGKGEAFSPTDLCVTALASCAMTIIGI YGKMHNVDVVGTEIEVAKTMSANPRRIGKIEVTFIMPDREYTDKQKTMIERAAHTCPVHL SLHPDVEQVFTFKWAN >gi|316924710|gb|ADCP01000016.1| GENE 7 4239 - 4511 89 90 aa, 
chain + ## HITS:0 COG:no KEGG:no NR:no MWFSFVVRVELVAESTIPRTRKMVNSIPSTYFEQRNFAVTRAWFPVFPYAFGSIPVGKRS NRVSKKHRQASEEKSGSRNSRLKQSTVSGI >gi|316924710|gb|ADCP01000016.1| GENE 8 4550 - 5041 367 163 aa, chain + ## HITS:1 COG:NMA1462 KEGG:ns NR:ns ## COG: NMA1462 COG2606 # Protein_GI_number: 15794364 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis Z2491 # 8 148 10 149 159 111 39.0 6e-25 MHGKGEAPAIRILEEAGEDVGRLEYRFVERGGTKDAARQLGLCEHDIVKSLVFDDGKGGQ AVMALMHGDERVSLHKLQRLSGVPHLQPSSPENAERLTGYKPGGICPFGLPSPLPVFAQI SLFNLGMLYINAGERGVIAVIRPEALRLSGAVAGDILSGGSRR >gi|316924710|gb|ADCP01000016.1| GENE 9 5136 - 5846 660 236 aa, chain + ## HITS:1 COG:BS_ydhC KEGG:ns NR:ns ## COG: BS_ydhC COG1802 # Protein_GI_number: 16077637 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 3 217 5 219 224 80 26.0 2e-15 MQQLNKPTYSEQVVNLIRQRIRNGKLRSGDRISEASIAEECGISRAPVREALYQLETEGF LMSHPKRGKCVTLLTGDGIRHRYELCGLLEGAAAANVVLSLPKGPWEELEALLDRMKQGV LSGTSFEDHAALGTAFHETILERASNPLLIAIARSSCRVISKYLMFQQWRTLYTTEELFQ RHNSMYEAFLTRNPYTIEEAVRAHYADSAERLAQICEAEQKPKTARSQRRAHFPTE >gi|316924710|gb|ADCP01000016.1| GENE 10 6064 - 6441 654 125 aa, chain - ## HITS:1 COG:FN1973 KEGG:ns NR:ns ## COG: FN1973 COG0251 # Protein_GI_number: 19705269 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Fusobacterium nucleatum # 1 124 4 127 128 124 58.0 3e-29 MKTVISTSKAPAAIGPYSQAIRKGDTLYLSGQIGMVPATGELVSDDVKEQTAQALANMKA ILAEAGATPADVTKVTVFIVDMAEFQIVNGVYSETFGADAPARSCVAVAALPKGARVEIE AIAVL >gi|316924710|gb|ADCP01000016.1| GENE 11 6751 - 6906 83 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLIFIALSPIAHFMETTDAERALPLVRGERLSGHVLALNVFENARKSEIIS >gi|316924710|gb|ADCP01000016.1| GENE 12 7009 - 8181 1166 390 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2963 NR:ns ## KEGG: Dvul_2963 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # 
Pathway: not_defined # 23 388 43 410 412 469 61.0 1e-131 MNRFCRLFLFMLMCGGLLFAAVPASAGDGTQAVQEKTPAHPAKKWRVAYIEGGGYTDYQR ILAATAKGLAELGVIADGDVPIPEKTDDTRPIWDWLAEHAGGDRLVFLKDGYYSANWDAA QRAANRKALLDRIREKGDVDMIFAFGTWAGLDMATADISVPVFSMSVTDAVQAGIAKSLK DSGRDNLHAQIDPERYKRQIAVFHDIFGFKKMGIPYEDTPEGRSDVSLAAIESAADELGI ELVRCTTALNVPPDQSFANLKQCVSQLAKTSDAVYLTTNSGMQWNRMRELLQPLIEAGVP SFSQSGIEETKLGVLMSLSQSSFSSEGRFGAEAIAKVMDGIRPRDVGQVFDSAIGLAINL EMARLIYWDPPFEILLAADAIYQDIKNAGE >gi|316924710|gb|ADCP01000016.1| GENE 13 8184 - 10463 2218 759 aa, chain + ## HITS:1 COG:PA4096 KEGG:ns NR:ns ## COG: PA4096 COG0477 # Protein_GI_number: 15599291 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Pseudomonas aeruginosa # 421 625 65 269 419 62 33.0 4e-09 MESSRRRLLIGGLATLLAAQCLYGILTLATLYKSYKTSLVSVQAIASEKFSLDLSRLARF GKDPERVEDMAERVRRFCEIAGVSRLAVLDGKGRSIAAWPSDASEAPPVPQDENKLKTLH GEVKTFDADGKVWIVQPIRNRTGADVGSVLAGFDEAGMTALVGKAASMHALLFLGVAVLG GSLFILLVLREGKPPFPRLGKGCLIIPLLVSQIAFLFFLRGPVVSFLEENARDAGTQLAR YIGHDVAHINDLGLKLADVPSVRTYLDHIRQSLPWAQSITVSDAGGTTFTAGNPMTPLQA GVPLSADSPVSVTVGIDESTVWEAFRSIVYDTLTIMIIAMLFMLELVPLQGVGQAAAPSA GDSLPPRIMRPIIFLCMFAIDLPASFIPLRIAEMDLGLLGLPPDVVMGLPLSFEMCAVGI GILIGSFWSQKSGWRPLLLWGALLVALGNVASGLVSDSLAYILSRGGAGFGYGLINLAGQ VFVVSHSSPEHRAGNLSALVAGLYAGFLCGSAFGGLIADNLGYASAFLVSAGLMAIIGIF LHFALPREAWTPEPSASGRISLRGLGAFFSDIKMAGLLLGNIFPCAFVTVCLFQFFLPVS LSQAGVSPAGIGRVFLLFCLVIIYLGPFFGRAVDKSPNKLVWLVGGGFLCIGGIIALLLL DGLAAAFACVALLALCNAIVASAQGTYALEIPVSRQVGSGRTVGIYNITERLGQMLGPVA LGQVIALWGVNSGLLGMAAVLAVLNILFALTGRLAKAGA >gi|316924710|gb|ADCP01000016.1| GENE 14 10685 - 12136 926 483 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2409 NR:ns ## KEGG: DvMF_2409 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 13 483 24 563 566 252 37.0 3e-65 
MIDPLAMPIFKPLWEAFPYLPGCFAAWRGMSLGEISASAYATADGERRPSDAIVDCMAKH AARSLGNDVGQAVKRYFSRNSAVLSANLHGIDCLPEMVQAVPFFGLDRLAGGTPGAVIPV LSCGGVSLQSSAYPRGLQSFHTDAPARFPLFRSSWQDTVELNAPALAAKDVRSSAAHWRP TRAYEERGLGTVLHHLLDPDTLALKRFGEQAARVNAKLYAALFPDARPVVAYLELEEIAR ELLIGDLARPDSVLHRALFTPVLREGLLRRLAGVRGCWSPCVASGGELPRTAGNGTAFFW GVDGRGRRHPLRLSPDNGAFEFPGFHLPLVPEAVIGALRDRNIYPGLFVSYMTLALEHGL CCHGGIFLVRYLPTMLHAVRDLFMECGEPLPPLPGRTMLAAFAISMQARETGRLFPAGML ELLTSGGISRADLQRLADLPLEAVLPLSLARWCLDYTPRSQRTPAWEADLRNVAWEGLTI EVS >gi|316924710|gb|ADCP01000016.1| GENE 15 12617 - 14506 2602 629 aa, chain + ## HITS:1 COG:BH2945 KEGG:ns NR:ns ## COG: BH2945 COG4666 # Protein_GI_number: 15615507 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, fused permease components # Organism: Bacillus halodurans # 17 629 44 651 656 369 40.0 1e-102 MSFALTREGIKPVAKGFALALALFQIWFTTGFGVLDGSMMRVMFVSFITVLVFLFIPCRK YKENEKEPTLFLLIDLCCAGLAIATAVYFALHLTEITTRMRYIDDVTPAAKFFAAATVLL VLEITRRTTGWALVIVASTLILYAFFGDMLPRAVKHTGFTFDVIVEHLFLLNEGVYGIPI GVATSTLFGFIMFGAFLERSKMSSIFMDLACLLTRNSQGGPAKVAIFASALFGTISGSAA ANVYGTGTFTIPLMKKVGYRAPFAGAVEAVASTGGQLMPPVMGTAAFLMADLSGAGYLNV AKAALLPAILYYLALLVMIHFEAVKNDLGKLSPDMVPETKSVVSRLYYLVPIAVLILLLG MGRSVVFCANIATLSIVLLAMLKAETRFTFKSFIEALVASSRGALMVSACCACSGLIVGV LSLTGIGYKFINLITILAGDSLFLLMVYLMLTSFVLGMGIPTTPAYIVVATLGAPALIKA GAPQLVAHMFVFYYAILSFITPPVCVAAFAGAAIAESKAMETGFIALKLGIVAFIVPYMF VYQPALLGIGETPEIIWAAITAIIGVIGIAGGMQSWLLCSTSAWERAFLLIGGLTLIYPG LSTDIIGFGLLFSVGVVQMLRRRRAPIAA >gi|316924710|gb|ADCP01000016.1| GENE 16 14619 - 15470 1222 283 aa, chain + ## HITS:1 COG:MT2360 KEGG:ns NR:ns ## COG: MT2360 COG2159 # Protein_GI_number: 15841794 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Mycobacterium tuberculosis CDC1551 # 43 278 60 306 307 86 24.0 6e-17 MIIDFRMRPPARGFLNIGVYDDVARTAKLTECFSMKQAPSVARKSPELMLQEMDTAGVTM GVIPGRNGHFKGSISNDDIISLLGDFPGRFVGMAGLNASKREESLEEIHRTVLNGPLKGI 
CMEPGALDKPMYADDPRIYPIYDLCEQHRIPVILMLGGRAGPDITYSDPKIINRIAADFP KTNFIISHGGWPWVQQILGVCFFQKNIYLCPDMYLFNCSGAADYIMAANNFMQDRFLFGT AYPLMPIVDCVSHFKGLFKPEVLPKLLYKNAAKLLNIELPEEA >gi|316924710|gb|ADCP01000016.1| GENE 17 15474 - 16457 1348 327 aa, chain + ## HITS:1 COG:AF0635 KEGG:ns NR:ns ## COG: AF0635 COG2358 # Protein_GI_number: 11498243 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, periplasmic component # Organism: Archaeoglobus fulgidus # 38 327 48 329 330 100 28.0 4e-21 MTCKHFFKAVALALSLAVGAAPALAKNFTELKAAASPAGGAWYVGMGAMGKAISTMYPEL DVSLFPGGGLANVMHVEKGASDFGIAAHSVIFAAAKGIDPYKKPTENVMGLLNLHDSTRM HFIVNKASGITSLAQIKEKKMPIRISMSHTAGNSYLFGKWILEEYGISQQDITAWGGKIY TPSTNDAASMMKDGQLDVALWIGPGEAFIVQDMMNSVDLDILPVDENVIKAVGEKYGLQR DVIPASFYGGKFGKDIVTVSASTELMINKNVSEDIAYKMVKAMCEKRDDIVIASPFWKSF MPEKAGQGLALPLHPGAAKYYKEKGWL Prediction of potential genes in microbial genomes Time: Fri May 13 02:01:26 2011 Seq name: gi|316924681|gb|ADCP01000017.1| Bilophila wadsworthia 3_1_6 cont1.17, whole genome shotgun sequence Length of sequence - 44888 bp Number of predicted genes - 32, with homology - 26 Number of transcription units - 19, operones - 7 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 32 - 1978 1987 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 2007 - 2066 4.7 2 2 Op 1 1/0.200 + CDS 2340 - 4202 2257 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 3 2 Op 2 . + CDS 4258 - 5115 1149 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 4 3 Tu 1 . + CDS 5269 - 6594 1408 ## COG0477 Permeases of the major facilitator superfamily 5 4 Tu 1 . - CDS 7445 - 7846 151 ## 6 5 Tu 1 . + CDS 8017 - 9045 967 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Term 8907 - 8950 1.0 7 6 Tu 1 . - CDS 8983 - 9192 253 ## - Prom 9264 - 9323 2.3 8 7 Op 1 . 
+ CDS 9573 - 11015 2276 ## LI0461 hypothetical protein 9 7 Op 2 . + CDS 11132 - 13204 2328 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family + Term 13226 - 13276 18.1 - Term 13209 - 13268 14.0 10 8 Tu 1 . - CDS 13288 - 14199 753 ## COG0583 Transcriptional regulator 11 9 Op 1 . + CDS 14414 - 14644 201 ## 12 9 Op 2 4/0.000 + CDS 14716 - 15759 1259 ## COG1073 Hydrolases of the alpha/beta superfamily 13 9 Op 3 . + CDS 15819 - 17009 1457 ## COG2814 Arabinose efflux permease 14 9 Op 4 . + CDS 17102 - 17644 503 ## COG0655 Multimeric flavodoxin WrbA 15 9 Op 5 . + CDS 17649 - 18131 312 ## COG4925 Uncharacterized conserved protein + Term 18251 - 18280 -0.4 16 9 Op 6 . + CDS 18374 - 18649 135 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase + Term 18790 - 18857 17.4 - Term 18778 - 18838 4.1 17 10 Tu 1 . - CDS 18855 - 20810 1815 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 20870 - 20929 6.2 + Prom 20888 - 20947 2.9 18 11 Tu 1 . + CDS 21102 - 21953 987 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold + Term 21990 - 22042 16.1 + Prom 22188 - 22247 7.1 19 12 Op 1 . + CDS 22319 - 24253 2374 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 20 12 Op 2 8/0.000 + CDS 24281 - 26173 244 ## PROTEIN SUPPORTED gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 21 12 Op 3 . + CDS 26248 - 27252 1609 ## COG2358 TRAP-type uncharacterized transport system, periplasmic component + Term 27272 - 27314 9.2 22 13 Tu 1 . + CDS 27436 - 28302 1166 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold + Term 28357 - 28393 1.0 23 14 Tu 1 . + CDS 28685 - 29719 1223 ## Plut_1380 hypothetical protein + Term 29739 - 29775 6.1 24 15 Op 1 . + CDS 29917 - 30090 69 ## 25 15 Op 2 . + CDS 30099 - 30587 524 ## 26 16 Tu 1 . + CDS 30759 - 34394 3355 ## + Term 34438 - 34477 6.7 + Prom 34636 - 34695 2.1 27 17 Tu 1 . 
+ CDS 34846 - 36009 1536 ## COG0477 Permeases of the major facilitator superfamily + Term 36023 - 36090 13.5 - Term 36011 - 36076 17.4 28 18 Op 1 . - CDS 36167 - 37576 1365 ## Ctu_12980 uncharacterized protein YbfM 29 18 Op 2 . - CDS 37815 - 39980 2340 ## COG1455 Phosphotransferase system cellobiose-specific component IIC + Prom 40316 - 40375 4.3 30 19 Op 1 2/0.200 + CDS 40604 - 41086 534 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 31 19 Op 2 . + CDS 41079 - 42740 1662 ## COG0747 ABC-type dipeptide transport system, periplasmic component 32 19 Op 3 . + CDS 42737 - 44605 1996 ## COG4585 Signal transduction histidine kinase + Term 44783 - 44815 0.1 Predicted protein(s) >gi|316924681|gb|ADCP01000017.1| GENE 1 32 - 1978 1987 648 aa, chain - ## HITS:1 COG:BH1879 KEGG:ns NR:ns ## COG: BH1879 COG3829 # Protein_GI_number: 15614442 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 207 640 120 552 555 282 37.0 2e-75 MEEHIERNKRVLEQWERFHRQEPVDESAVRPLILRSWKRCRLQGVRPDNASKVSLSPAAL EKLLDRNADLVAVAKSIMEKLYNPISTSRSFISLSDAQGIVLHALWDNTGSYPIPHLAPG NLAAESASGTNAIGTCLVEHTPVETLASEHYCRSFHGWFCSAAPIRDSRNAIVGVLNVTL PSALYHHHTRGMMEAAAHAIAEQLRLRLLLQEQKAIIEMLDEGVVVLEGDGTIRTLNNKA QAMLDLPPDAVHGNIQDIIFSSDIIRAILSESGQFSDQEAFLQLKGGSLNCMLSLTRLES GKGRVLTLRETKRIKESAVRVTGAKAVYTFDHIVGNAPATQEVVRMARMAAQSDVTTLIL GESGTGKELFAQSIHNGGRRANAPFVVVNCGALPRNLVQSELFGYDEGSFTGASRLGKPG KFELADNGTIFLDEIGEMPLEAQVSLLRLLQNGEVTRVGGKHTRLVNVRVLAATNRNLEN AIRQNAFREDLYYRLNVFTLNVPPLRERSSDIALLINHFLDHFVASLGRGPLRVTDRAMD VLLGYPWPGNIRELENVIERMVHMSQGVPSIDIDVLPANILNHEGIPGGAPRPAVPRGLL SHQEKETIVRALQEAGGNIRATAKALGISRSGLYVKMRRFGLSPDECR >gi|316924681|gb|ADCP01000017.1| GENE 2 2340 - 4202 2257 620 aa, chain + ## HITS:1 COG:lin0492_1 KEGG:ns NR:ns ## COG: lin0492_1 COG1902 # Protein_GI_number: 16799567 # Func_class: C Energy production and conversion # 
Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Listeria innocua # 1 325 27 350 364 238 41.0 4e-62 MCTKYAAPDGGVTERMLRYYEERAKGGAGLVTIEATSVDPTGNSFSRGLSIADDARLPGL TDLARRVKRHGARISIQLQHGGRAALPQFSGHAVPLVSAIPGVTPYDNSVILSEEEIARL VECWGKAAIRAREAGFDAVEIHGAHGYLISQFLSPYTNRRTDGYGGSLENRMRFALEVCR KVRESVGPDFPVTYRMSAVEGLPGGTTLEDSVALAKRLVADGIDALHVSVGLRETNFMVS PPACVEKGWNAPLSRAVRDGIEAAVPVIVAGRVADEQTAQGIIDRGDADMVAMGRALIAD PFLPAKVAAGEAGDIIRCVGCNEGCVAGSARGTGVGCALNPLSGAEGKYDLEPVASPKKV AVIGGGPAGMQAALAARFRGHTVDLYEKSDRLGGLLNVACKPPHKEDLGHVTGWFERQLK RAGVGLHLGAALTSEAVRDLGADAVIAATGSSPVFPGFCRKARNAVVAQSILSGEAKAGK KALVIGGGLIGCETAEFLAAQGSDVTILELQPELAKDMESRTRRYLMQRLREYGTRFLTG TQVLEVTEEGKVKAKFPSGSERWLDDFDTLVIAVGYRAENALAAELEEMGCPCVRVGDCA SVGKILTAIESGFQAGCSIK >gi|316924681|gb|ADCP01000017.1| GENE 3 4258 - 5115 1149 285 aa, chain + ## HITS:1 COG:MT2360 KEGG:ns NR:ns ## COG: MT2360 COG2159 # Protein_GI_number: 15841794 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Mycobacterium tuberculosis CDC1551 # 49 280 66 306 307 118 31.0 1e-26 MIIDFRLRPATRGFLNMHIFRNQERIASWSGKLGMVPPPSAKQGDMELMLKEMEEGGVTL GVVPGRQANAFMGNVPNEDIAAIVADYPEKFIGVAGIDPTNRAKALEEIERFVVNGPLKA VGMEPGVLASPMYADDRRIYPIYEYCAEHGIPVLLMGGGGNGPDLSHSNPIIIDRVCGDF PEMTVINTHGGYPWVTEILFVAFKRPNLYLCPDMYMYNMPGVSDYVMAANTFLSERFLFG TAYPFIGFREGVEQFKALPFKPEVLPNLLYRNAIKALKLDIPCGE >gi|316924681|gb|ADCP01000017.1| GENE 4 5269 - 6594 1408 441 aa, chain + ## HITS:1 COG:MA4166 KEGG:ns NR:ns ## COG: MA4166 COG0477 # Protein_GI_number: 20092959 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Methanosarcina acetivorans str.C2A # 11 425 2 425 444 209 34.0 1e-53 MSPHSSKAMTRHAWFIFSILTICYALVSFFRTSTSTLAVDIMREFAVGGGLMSVMSSAYF 
YPYGFMQIPAGILSDRWGSRNTIASFLLIGAAGSFLFALSGSVTAATAGRVLIGVGMGMV FVPALRVILYWFPPARHALGTGLILSLGTGGMLLATYPLMLLSQAVGWRGSMFAATAATV LVAGATWLWVRNRPEEAGYAPAWVSITPRKHPAPPLRETMLRIVRSRTYWSVSTWFFCMY GSFYAMTGLWAGPYFIQGYGLDKGTAGGILFCMALGAVVGPSLVGVVVTWVRWSKGVLLL AASGVTILLAAPLLADHPVLPTSLLPAWSLAFGIFCGGFGGIALTKVQEDFPPEIVGTAT GMINIYSYIGTALLQLASGWIMEAQAPGQAAYGIGQYASMFILFMAMFVTAFSATLLGLG KSRGNVTDVFSPMGEGETKSA >gi|316924681|gb|ADCP01000017.1| GENE 5 7445 - 7846 151 133 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDFTRGADLKDALVQHFHDRTLPAESGPQVELFRDGVGRSLHHAPRMQMIGFPGDGNIED ACKRAVFVADGGCGAAPDMMPGAIVLCADDGHGLPFDKASPDAVGAGYGFRGDLAGKHVG PVALTPQGVQYEP >gi|316924681|gb|ADCP01000017.1| GENE 6 8017 - 9045 967 342 aa, chain + ## HITS:1 COG:aq_091m KEGG:ns NR:ns ## COG: aq_091m COG3829 # Protein_GI_number: 15607134 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Aquifex aeolicus # 4 336 202 529 530 207 38.0 3e-53 MLSFAPLGMGKGGVLALRESRRLCGFAARTIGAGAVYALDDIPGDSPAIAQAREALRRAA RHDAPVLLTGEEGTERHGFAQAIHNAGRRREAPFVAVHCGGIPRSLMRSELFGFDGEGGR PGKCELADGGTLFLDGVEALPLTAQMGILRLLRGGETARVGGGQGRTVNVRLIAGAGPGL SEAVREGLFLKELHELLREYTITVPPLRERRSDIEALAARCASRFAQALGKQPKPIAPEA MEALLKYSWPGNVRELETVLERAVALAEGDVIGLSDLHARISAAEVSVPTGKPLPGHEAK RLLAALERTAGNVREAAKLLGISRGGFYVKLKKLGLNPDEYR >gi|316924681|gb|ADCP01000017.1| GENE 7 8983 - 9192 253 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGEMAARDDLEGQGEGIMDEERDTGMAAMSRKDRREHTSGHIRRRQALWLAVFVGVEAEF FELHVEAAA >gi|316924681|gb|ADCP01000017.1| GENE 8 9573 - 11015 2276 480 aa, chain + ## HITS:1 COG:no KEGG:LI0461 NR:ns ## KEGG: LI0461 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 480 1 498 498 372 42.0 1e-101 MKKVITLLLAAGLVIGAASAANAVDIKAKGLWHNAMSFADRNFEKHNGSDKMQNATRIRT QIDVIASESLKGVVYFEIGPQNWGKGADGASLGTDGKVVKVRYNYVDWVIPQTDAKVRMG LQNFSLPGFALPNVILGGGGADGAGITLSTQFTKNIGATLFWLRAENDNTDSATDYGKQY 
HYSDAMDFTGITLPLKFDGVKVTPWAMYGFVGRDSFENAGASGDQKKLLQGLLPLGVSNK TLTGNSLTDRHGDAWWAGFTTELKVLDPFRFALDATYGRVDLGEGLSTKGKKIDVKREGW IVSALAEYKMESVTPGLTAWYSSGDDANVNNGSERLPTINPDVKVTSYGFDGTNFCRAQQ VLGTSVDGTFGIVGHLKDISFFEDLSHTLRVAFYRGTNNTENVRQKMITKPSETVSSMYY MTTADKAWEVNVDSEYKIYKNLSLFLELGYIRMDLDSNVWKDYESKKNNFKGAVCVLYSF >gi|316924681|gb|ADCP01000017.1| GENE 9 11132 - 13204 2328 690 aa, chain + ## HITS:1 COG:MA1426 KEGG:ns NR:ns ## COG: MA1426 COG1902 # Protein_GI_number: 20090286 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Methanosarcina acetivorans str.C2A # 9 376 2 354 365 187 34.0 9e-47 MSTGQFHKLLSPLSLGSGLTLKNRMIKAPQSSWFFEDDGSAGDRVIDFYEAIAAGGVGLV IVSAITWRSDHPAGAYGALHDDRFLPGMKRLVERCHAHGCPVFCQLHHSGASAMTGHGGG LPIGPSDLGPDEIPCPPPVGKPTRGLTAEEIAEDKELYFKAAERAKAAGFDGIEVHCAHG YYLESFVSRVWNRRDDQYGPQSLENRTRLPLEIIAGLRERLGADYPLGLRMNGQEWGADK GLTIAETVAIAKIFEGAGVRYISVSGYGFGPLPFRYLPDYWPYPEPEEHMKPYLKDFMGD GLLIPAAAAIKQAVNIPVIAVGRLDEAKAEKILEEGKADLVAFGRVLWADPEFPRKVAEG RIEDIVRCTRCGTCEDPPVGRPRRCRVNPAMGRESLFAIRPAEKKKKVVVIGGGPAGMET ARVAALRGHDVTLFEKSSRLGGKLPLAAMIKGVEVEDVRPVISYLTTQVGKLNIKVRLGE EATVSKILAEKPDAVVIATGGLYRLPSLKGVENRNVAGVNSLSKQVKLPLLVFGPELLNK LTKLFLPIGKRVAIIGGQIEGLQGAVFLRKRGREVTVLEPSETFGKGIPPRYLDRLKPWL AKKDVQLLGGVTCDQITDQGVVITTKDKQRKLIEADTVMVLLSQEPDTSLLEALKGSVKE VLCAGSVNGAQVGSLIVNAIEDGRRIGCNL >gi|316924681|gb|ADCP01000017.1| GENE 10 13288 - 14199 753 303 aa, chain - ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 289 1 287 291 286 47.0 3e-77 MELRVLRYFLAVAREESISGAAEALHVTQPTLSRQMMELEEELGKTLFLRGKRKISLTEE GMFLRKRAQEIVTLVEKTESEFSAAEETISGDVHIGGGETDAMRLIARAAHRLQSAYPHI AYHLFSGNADDVTERLDRGLVDFGVFIEPADLSKYDFIKLPVTDIWGVLMRKGCPLAARA TIRPQDLLGLPLLASNQHLVKNEFSGWFGEGYEKLNIITTYNLLYNASIMVEEGMGYALC LDKIVRTSGGSPLCFRPLEPKLEVGLHIAWKKYQFFSKAAEKFLECLQREIADRREADSP AYQ 
>gi|316924681|gb|ADCP01000017.1| GENE 11 14414 - 14644 201 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKGNPKQDDPLIQNLHDAGCSQQIIEQFMQSRKEGNIFCQMHILVKQRNLLLEDLHKDQN KLDCLDYLLFQLKKGC >gi|316924681|gb|ADCP01000017.1| GENE 12 14716 - 15759 1259 347 aa, chain + ## HITS:1 COG:RSc0206 KEGG:ns NR:ns ## COG: RSc0206 COG1073 # Protein_GI_number: 17544925 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Ralstonia solanacearum # 17 347 12 342 342 492 73.0 1e-139 MRAMKQLGMTAAIIGAMSAAPVLAQEIPADASNFYKSESVTVRQVTFPNQYNMNIAGNLF LPKNLDQKTKHAAIIVGHPMGAVKEQSANLYAVKMAEQGFVTLSVDLAFWGGSEGQPRNA VSPDLYAETFSAAADFLGTNPLVDRERIGVIGICGSGSFAISAAKIDPRLKAIATISMYD MGAANRNGLRHGMTLEQRKQILREAAEQRYAEFTGGETRYTSGTVHELTETSTPIEREFY AFYRTPRGEFTPEGSSPQLTTHPTFTSNVKFMNFYPFSDIETISPRPMLFIAGEKAHSIE FSEDAYRLAAEPKELYIVPGAGHVDLYDRASLIPFGKLTDFFTKNLK >gi|316924681|gb|ADCP01000017.1| GENE 13 15819 - 17009 1457 396 aa, chain + ## HITS:1 COG:yicM KEGG:ns NR:ns ## COG: yicM COG2814 # Protein_GI_number: 16131532 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Escherichia coli K12 # 11 387 72 448 451 265 39.0 1e-70 MRNGVFQTGNPRWSAVFSLFMGVMALIAAEFIPVSLLTPIARDLAVSEGMAGQTVTAVGV LAVLTSLLLAPLTGNTDRRRILLAFSAMLVASYMLIGMAPTYSIMLIGRAILGICVGGFW SLASAVTLQLVPTKDVPRALSIVYAGVSVATIIALPVASWIGHLIGWRNVFFLGALMAAF GFVWQYRALPSLPARAGSGFRDMSALLRRTWILAGIGGNILSFGGYHIFFTYLRPFLELD LALGANTLSVILLVFGVANVLGTFIAGALLGNHFRSTMILVHLAFTAAALALLLSQEHAD IGIALAVFWGFAFGFTPVGWSTWIARTLSDKAELAGGLTVAATQFAIGLAAAVGGFTYDN LGINGIFLAAAGISVMAALLIAVSFSLFARDTGHKA >gi|316924681|gb|ADCP01000017.1| GENE 14 17102 - 17644 503 180 aa, chain + ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 177 1 176 179 147 42.0 1e-35 MKKNILILSTSPRKNGNSEMLAREFARGAEEAGHNVELLSLHDKTIGFCKGCLACQKTQR 
CVIHDDADAIVRKMKDAEVIAFATPIYFYEMCGQMKTLLDRSNPLFPSDYAFRDIYLLAT AADSAESSMDGAVKGLEGWIACFEKAALRGVVRGTGADGAGTILQVPVALKSAFELGNKA >gi|316924681|gb|ADCP01000017.1| GENE 15 17649 - 18131 312 160 aa, chain + ## HITS:1 COG:RSc0630 KEGG:ns NR:ns ## COG: RSc0630 COG4925 # Protein_GI_number: 17545349 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Ralstonia solanacearum # 44 145 1 103 118 113 51.0 2e-25 MNAKQTMRQGRRLPLKQCLLPMAALLLTALAALLPHSAPAGEPMRIRLIVDGETLPAVLE DNPAGRDFLSMLPLTLTVKDYNGTEKIGNPPRVISTEDMPDSIEPSAGDLTFYAPWGNIA IFYREFRLSRGLVRFGRITSGIEKLAAMHGDFSIAFERAE >gi|316924681|gb|ADCP01000017.1| GENE 16 18374 - 18649 135 91 aa, chain + ## HITS:1 COG:ECs0335 KEGG:ns NR:ns ## COG: ECs0335 COG0656 # Protein_GI_number: 15829589 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Escherichia coli O157:H7 # 1 91 55 145 191 122 60.0 2e-28 MGAAVRESGIPRKELSITTKVWIQEAGYEKTKKAFEASLSRLGLDTLDLYLIHMPFGDYY GSWRAMGELYENGKIRAIGVCNFLPGRLMDL >gi|316924681|gb|ADCP01000017.1| GENE 17 18855 - 20810 1815 651 aa, chain - ## HITS:1 COG:aq_091m KEGG:ns NR:ns ## COG: aq_091m COG3829 # Protein_GI_number: 15607134 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Aquifex aeolicus # 339 641 230 529 530 288 48.0 3e-77 MTDTHDSYLQSDARQEAISAQKELFLRGLPVDTTVVSDFVLRSWQRSRLAGVDPETTVRK KVDETIFRHILAANADLLESSRVIMKELFSSLVSGAGSMILSTAECISLHMETSGRDGDT YPSSKPGMITTEQLRGTNGIGTCVAERRPIEIIGAEHYLTVGHRWSCSAAPIFDSKNQLI AVLNVSQLREKYHSHTLGMVRAAAYAISEQLRLRALLQQQQAIIELLDEGVIVVSRSGEV KLMNSKAASMLGLPAPQQGENIYKFMRPSQMLDNILTGDPHIMDQEAQFPLESGSLSCFF SAMSLTREACVVLTFREARRMRGFAARVAGSKAVYTFDRILGDSPPLMEVIDQAKTIARG NTTVLILGESGTGKELFAQSIHNASPRASRPFVAVNCGALPRNLVESELFGYEDGAFTGA SRTGKPGKFELADGGTIFLDELGEMPMDAQVSLLRLLQNGEVTRIGGKSSRTVSVRVIAA TNKNLEEAVRQHTFREDLYYRLNVFTLVLPPLRSRMSDIELLAEHFLLKFAGSLGKDVRG 
FTPGALALLRRYQWPGNIRELENVMERLANIVRHPLVSEEDLPPQMVTVRQPASPQGLLH SKEAETILETLRQTGGNIRAAAALLGVSRGGLYVKLRRLGIDVESCRGGEG >gi|316924681|gb|ADCP01000017.1| GENE 18 21102 - 21953 987 283 aa, chain + ## HITS:1 COG:MT2360 KEGG:ns NR:ns ## COG: MT2360 COG2159 # Protein_GI_number: 15841794 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Mycobacterium tuberculosis CDC1551 # 48 280 61 303 307 83 30.0 5e-16 MLIDFRVRPPYKTNLTTSLYNKPAPSPDPTKQSIFLIGKERVPSAEARSIELFMQEMDET ETERAVIMGRKADEYGSVDNDEIDELARAYPGRFIPFAGVNPNLPGQVEEIERVAAMGFR GVGIDAGWLRRPLYHDDPVFDPIYAKCQELGMIVSLTSSFMLGPDLSYSEPCAIQRVAMK YPNMKIVVPHGCWPHVHTALAVAIRCPNVYLMPDCYIYIDGFALSDEYIKAANTYLKYRI LYASSYPVRSLSQCLRGWKSRGFTDEALQNTLYDNAARLLGEK >gi|316924681|gb|ADCP01000017.1| GENE 19 22319 - 24253 2374 644 aa, chain + ## HITS:1 COG:lin0492_1 KEGG:ns NR:ns ## COG: lin0492_1 COG1902 # Protein_GI_number: 16799567 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Listeria innocua # 6 362 7 363 364 226 34.0 1e-58 MSSSVLLSPLTIRKTTLRNRIMLPSMVTHYCAVNGEVSERLIAYHAERAYGGVGLNMLES TSVDDTGKSYWPGVSIASDHFIKGLRRLTDAIHAHGGKAGIQLNHAGRLAQPAVSGHPRQ LVSYIPGLTTYDDIHVMDEDDIERVIGSFCSAAERAAEAGFDVIEIHGAHGYLLSQFMSP LFNRREDAYGGNFEKRMRVPVEVIRGTRKRLGPDFPLFFRCSVEEYLPGGITLELARDIA RTVADNGIDLFNVSVGLGETNRFTGPPPCLPKGWNADRAAAIKEALGDRALVGVAGRIID RESADAVLNSGKADLVVMGRALIADPDLPNKLAANRDDAVIPCVGCNEGCVARLRERKAV ACAVNPRTGCEGMYPRTPAAVSKHVVIVGGGPAGMQAALVAAERGHRVTLFEKRTRLGGL ANIAALPPHKDLYAVLVNRFSERLIASGVKVFLGTAPSVADLAELKPDTLIVATGSVPVI PRFCEGIANAVTAEDILTGKAKAGQRVLVVGGGMVGSETAEYLALKGGDVTILELRPDLA ADMQARARSFLLESLREHKVKALTSTELVSVGPDGRVAVKDAYGSQRELPPFDTLVLALG YRPSNALCAELSLAGVPFTAIGDCRRVGKVLDAMHEAYTAAYNL >gi|316924681|gb|ADCP01000017.1| GENE 20 24281 - 26173 244 630 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 [Lentisphaera araneosa HTCC2155] # 192 550 64 413 432 98 25 6e-20 
MVPVFDRRKLRPLILALAIALCAFQLTTCTGMWLLDPLIVRATHVGLVLSMLFLWRSSNR ALRKNPRPEPAWSFLFDILMIAMSAGAAAYIITNFSYIQDRMPHIDELTAWDLFWGAGLI IAILEATRRVAGLALVIVSGIALLYAFFGHLLPGSMGHLYLAPNQIIESLYLLNDGIWGS SIGASATIIYVFILFGALLEKTNMASVFLELACLVTRNAKGGPAKAAIFGSALFGSVSGS AAANVYATGTFTIPLMKRVGYRPEFSGAVEAVASSGGLIMPPVMGSIAFVMAEYTGISYL SICKAALLPAVLYYLSLFTMIHFEALRHGLGGTPSDLIPELRSIKRKLYYLAPLALLIVL MIAGRSIIFSALLACAAVVALSFLSPETRLTPARFLAAMENAAANLLMIAACCACVGIVI GVVTMTGFGFSFVNLMGSLAQVHIGIFLLVLAGTCVIFGMGLPSLPAYILVATFGAPALV QAGVPVLAAHLFVMYFAISSGITPPVCLVAYAGAAIADAPPMKTGFTAFKLGIAAFIVPF IFIFEPALLLMGDWATILQAVFTAVLGVVCLASSMQGWLFTESSPAERVMMFVAGIVLVY PGLLTDLIGFGTASLVFFIQIARKKRLCAA >gi|316924681|gb|ADCP01000017.1| GENE 21 26248 - 27252 1609 334 aa, chain + ## HITS:1 COG:SMb20292 KEGG:ns NR:ns ## COG: SMb20292 COG2358 # Protein_GI_number: 16264030 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, periplasmic component # Organism: Sinorhizobium meliloti # 5 334 3 327 327 82 24.0 8e-16 MSLLHKAITLSGACAIGLTLAFAAPAAADDAKPINFRAVGGAVLGGTWNVGLTGVGKLVN DRYPGSAINVLQGASVSNPLRLEQNAADVTLTQTFNTVAARDGKAPYKKPLKNVASLANM NDTSRLSIIVSADLPVNTFDELMEKKLPVRLDRGAKGTLHNVVGAMLLAEYGYTYDDITK WGGAHTAVSANDRVGMFQDGTLNAYLTLGPGQQSHIQELVLNAKVKWLPVSDKVLKSVTA KTGQSIGVIPADFYGGAVGRDIPCITDSTVMLVRKNMPDADVYKITKAIVEGFEELHAVQ PTWKTLVPEHMADNLALPLHPGAEKYYREAGIIK >gi|316924681|gb|ADCP01000017.1| GENE 22 27436 - 28302 1166 288 aa, chain + ## HITS:1 COG:MT2360 KEGG:ns NR:ns ## COG: MT2360 COG2159 # Protein_GI_number: 15841794 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Mycobacterium tuberculosis CDC1551 # 52 215 64 229 307 68 33.0 1e-11 MVIDFRIRPPYKGFMNLGIVRNWQSVPDDPRKMRPTGFERLPVPSMEHASVDMLVDEMKA AGITKGVLHGRHTGNARYGDVSNAEVNELLLRYPGLFVALAGISPNAPDALEEIEHCVRD WGFKGVALDPGWCSPAMYATDPKIEPILDLCQQLGVFVSITMSAYGGPDLSYCDPTPLVP MLRKFPKVNVVIPHGCWPKVQEAVGVALLCPNLYLAPDCYFFLRNMPMRHEYANAANSFL 
KYRFLFSTSYPIRGFAQAIEDWKNAGLEPESLQRSLYDNAAELLGISG >gi|316924681|gb|ADCP01000017.1| GENE 23 28685 - 29719 1223 344 aa, chain + ## HITS:1 COG:no KEGG:Plut_1380 NR:ns ## KEGG: Plut_1380 # Name: not_defined # Def: hypothetical protein # Organism: P.luteolum # Pathway: not_defined # 44 344 16 321 321 130 33.0 9e-29 MEKQGFDLAMHYFRAFAIFSVISVHMWIVPPLEGHESMSRFLDMLRGVLFHSSTLYFVFI SGYLFHFLAQRKFDLRTYYRKKLLNVIVPYVILSVFFTGYAFFWYSYTGEVPRFTKLVTG CREFFQALLYGDVVLTYWYIPFIATIFLVSPLLLRLSEERFALLTLFTVWIPLLIPRTEL TSFFHNYAFFFPAYLLGMFYSMNRDLILAFIAKHIKILVAFSLVFTALLGINAYEEILPN IEAAYYMQKIAIGACVVHFLEKIKHIRITLLDLIATYSFALYFLHDFILFTFQNLLFTFF GILLPESLFFVSLFTFFCEVALCLLLIMGIKKLLGRRSRYVIGS >gi|316924681|gb|ADCP01000017.1| GENE 24 29917 - 30090 69 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAIREHFSAARNRYPLKQNFCAAAASSDFLDRFLLIPFHKDTFINIIGIPECRTLTV >gi|316924681|gb|ADCP01000017.1| GENE 25 30099 - 30587 524 162 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNATPYIEALGKETGMDITLSASGAAALSLGGRNLLIQWIEATHSFIIYVEVGALGGWRN GEICRLLLASNFLLADTQGGALSYNDATGMVGLNFPIPAYGLDTGDFLKTVNNIVLFSET WKARLDAMNKEQEALTEQAQQAVLEGGEAPEPSFTAGQFLRV >gi|316924681|gb|ADCP01000017.1| GENE 26 30759 - 34394 3355 1211 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPVSLQQFFNSANTVGDSASLFLQNGGESVGDTSSLHGIHKLSRSAKAEENRATVTAFLN ALDQSPQFRNINADIRGMLNAKIEGGKPLTAEDVKLVRDSVLYDEALAAGRQLADGNALP AGHATSFAQFAVVRNMDISTPGGQRDAVQAYLNEKVIRQNLGPLTQLPGLGEHGAAITTA LARLNQPFTGANGFFAHQLRADMEAHGTDGAFTRLQTAYRDANAATIDILSSLKDDMVGL LPQLPNGKDMIATLKEALPTLGRDNMQGLAMSFATNMPTLATPAERQDAVRGFMMRTAGK AEGIRQAMTLAGLPQNFSSALANNPAVIKHCTALLNDNPGPGVYPSQERVAEAMDIAVQV FVEDNLPLLREFALMAQDPPGDLNPPVTAETMPRYINAMLAGDVMVEQLLNDSVPMDAAF LERIADHADALNSAAHSFKGDYGADDIAAVLRNSVSMLLARRGVTQDMLPDLMKNAVDKF GPLANQFATLNGAIQRGLGGMRGLEFLKEGMTQFRSLEGHARALISLMSREQKVDMGIAT PGDVDPQSEEIQRQDGELLSEFLESRFEVFGDTEQIPVMLREFARAHGLDIPRLSTTQHS ALSGANRETFNAVLDELIPEQGHVVEANTDAFRAVFNSINEDGALAGLRPDAINPRPFYQ GVSQALTPLLNAANEEGNAVDAAQLRQLAGDVIGAELLGLKDTLDDIGALPAERFSDADK DVMKEIAQRYGVRDAGAIAEAFTAARELPVPTGLVNLARLDQTPGRFTQAVMDVSERFCA 
FHERYAQLPGSEDLLPMMCDFILEGMTPDELANVSANMQSDMAHKLAGACLHIVGHPRAP RDTAPLMGATQIMNNLRQNAEYRLGHNPQVDPMYFNDEINHLCEMPGDAESPLSRLGRFA PGVITDFDVQMNRHAERLTPQQWEQLRGIHTQLAQTAQGAQDFLLPYWVESSVSDLLAAL EANRGKPLSNRQIWDAMVGGPMPRGISAEHFGADLIKSVSKMYVGLLQAAAPDMPQPVMD AALMNSSSFGLSPKKLIALTRPHAHISLKDISVATGMGSLSGIDEETAYGLVTDFRRRGK NTVMQFEDRNGNGFATSPFSISDEENTSENPHFTEIIGRVRGMTHSEGQLARVMQCFSQA PLIMPRVLSTCFPGVEFSEHGNFSVSAKEQQDGSVLVDITSDPALPLILDMQIRVGTDGS HTFERLDMSRP >gi|316924681|gb|ADCP01000017.1| GENE 27 34846 - 36009 1536 387 aa, chain + ## HITS:1 COG:BS_yfnC KEGG:ns NR:ns ## COG: BS_yfnC COG0477 # Protein_GI_number: 16077799 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 6 381 23 400 409 143 28.0 6e-34 MDKKKIFLVSLGHLSCDINGGALPAILPFLITAYGFDYQAAAGLMFAFSCLSSIIQPAFG YLSDKLSKPWFIPVGVLLAGGGLASIGFLHSYWAIFAAIALSGVGAALFHPEGARFANNV SGASKGTGLSLFSIGGNSGFVFGPMLAVAAIGAFGMPGTSVFGLIALVMSVILLYQISHL SGMKKTETESAKAAGETEEVKNNWKEFSKLTLAIISRSVLFVGCNTFIPLYWIHNFGQSK AAGAAALTCFCTFGVISNFIGGMLSDRFGYLRIIRLSYVVLIPSIVLFGLVDNLYAAFAL LLPLGFSLYAPFSSMVVLGQKYLAKNIGFASGVTLGLATSMGGIVAPLLGWIADGYGLPR AIQSMTIVAVIGTVAAFLLVSLNRRTD >gi|316924681|gb|ADCP01000017.1| GENE 28 36167 - 37576 1365 469 aa, chain - ## HITS:1 COG:no KEGG:Ctu_12980 NR:ns ## KEGG: Ctu_12980 # Name: ybfM # Def: uncharacterized protein YbfM # Organism: C.turicensis # Pathway: not_defined # 49 469 35 464 465 283 39.0 1e-74 MTRLRLFFVAVAYTLLAVLTAVPAAANEDLAPETQGLDSREAFRTGTRFFDEAEVSGGLY FFRRDRRRYDVDKGRYGTNLNHASVQANADFVSGFAGGWLGFDFGVFGSHDLMNKGAVDH EMGFEPWGDPWHPDWSKRRTDDGVSVYKAALKAKAGPAWAKAGWFQPTGPGVLGVNWSIM PGTYRGVNAGADFGRLSLAAAWADEYKSPWFVNMNRFRKNDGETSVPWLWSAGVRYAFES GLTLELGYGASKDHLQNAHFKSSYRTGGGAGPLTVGYHAYFMNDSDDSGKSENDNFDGLA SLHYLFGKYEVPPWTFRLEGTYVRAPMSGPQSQGQFAYRLTDRSGSSSGAFDVWWDARSD WNADNEKAAYCGAMRTLDDLLPIAGFSAGAGAAFGFDGRGYGTAEHLKEWAFTFDLNYVK 
PDGPLEGAFVKLHYTEYRNGTAQPSWGTYRNAFQSERDLKLLIGIPFSL >gi|316924681|gb|ADCP01000017.1| GENE 29 37815 - 39980 2340 721 aa, chain - ## HITS:1 COG:lin2832 KEGG:ns NR:ns ## COG: lin2832 COG1455 # Protein_GI_number: 16801892 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 12 434 8 427 435 264 35.0 6e-70 MTSQLRKINNVIEQFLLPLAEAVSGWSTLAAIRSALIVTLPLMFLGSLAELMISFPLDAY KEFMFRQFGPEWRMFGQILKDATFSVMSLIMVFSIGHHLTDQFNHDNPVLRANPVIAGLV AFTAFFCLLQQDGDALNRRWLGVAGLFVAILVGVFATRLFLYFFSIKRLHLHLPGGTPDI AIPQAFNAFIPGMLTVLVFAALGAAMQTFVGMSLHEALYRSIRMPFDAIGEGLGRGMLYI LSLHALWFAGIHGANVLDPITHDIYGAAMLANEVAAAAGEPLPHIMTKTFMDTFVFMGGA GTGISLAGALILFGKTQASRKIGIFSLVPGLFNINEVLLFGLPIVLNPLMLIPFLLTPVL LAAISYVAVATGLVPGTNVATEWTTPILLNGYLSTGSLSGSALQLANLVVGVLIYAPFVL IANKIKVKQINDAFRSLLRRSCATADSSRRCLDHNDDAGSLARSLITDLEYDYRHGEGLF LEFQPQICSRTGRVVGVESLIRWKHPSYGLIPAPITVALAEDSGLIRPIGLWVFETACQV RKSWLDAGITDLTMAVNVSALQLERSFPKQLLDIAARYDLPPSLMEVEVTESSALDSDKP ESHILSRVYDAGFPVAIDDFGMGHSSLKYLKQFPVSVVKIDGAISREVVTNSICSDIVAS ITRLCRARNMLSVAEFVENEEQAALLRELGCDVFQGYLYSKSLLPSDCLQFIQKRNGGNG A >gi|316924681|gb|ADCP01000017.1| GENE 30 40604 - 41086 534 160 aa, chain + ## HITS:1 COG:mll4697 KEGG:ns NR:ns ## COG: mll4697 COG2197 # Protein_GI_number: 13473938 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Mesorhizobium loti # 1 154 61 212 215 125 40.0 3e-29 MPELSGVEAIPRIREVAPGARILIVSMHATAGHVRAALKAGADGYLLKTADEDEFLVAIR ALLKGRKYISTELTGSMIISYVDGNAEASSKADSLTPKEKMVLKLIAEGNSNKDIAAKLN LSVKTIDTHRTSFMKKLDLHNVREVTRFAMQNGLVGDGID >gi|316924681|gb|ADCP01000017.1| GENE 31 41079 - 42740 1662 553 aa, chain + ## HITS:1 COG:AGl2786 KEGG:ns NR:ns ## COG: AGl2786 COG0747 # Protein_GI_number: 15891502 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: 
Agrobacterium tumefaciens strain C58 (Cereon) # 53 540 82 569 585 210 31.0 7e-54 MTKRLRRADARRAAPRLSAASPFPPGTGRPVRTGHAYLAAFLSAALLLLALCAAPASASP EDNSLIVALERFPPGFNPAVASGSLTSQIGAQLFAGLVRTGKDGPIPYLAESWEIDSQAK RFRFHLRKGATFHDGTPITANDVAFSIRAAQRHHPFRSMLEAVDKVVPVDDLTLDLETSI PQPALLKCFIPALVPVLPAHIYGDGTPIDTHPANLRAVGSGPFRLASLEKGKNIVLERYK GFFLPGKPRLDRITFRVYWDQNEIPLAIMQGEADLYAFSSLSDEERIFRSNPAIEVTRDE FSLLHPFALLTFNVRNPIFRSPEVRKALAMSIDNKALAQAVPGGVRPMYGPIPPGSEWHS PVSTPYDPDEANRILDRAGYPRNENGIRFTVEIDYEPSAQFSLSILKYLQSQFIRTIGVY FRIRTAEDPGSWADRVTSGAFDVTMDELYGWHDPAIGIERIYATNTAAILWSNMSHYSNP EVDALFRSASAEKDPEGRKALYAALQERLSREHVALWLCTIPYATIRNRNVLHVADQPFG VLSPLDEACWKRP >gi|316924681|gb|ADCP01000017.1| GENE 32 42737 - 44605 1996 622 aa, chain + ## HITS:1 COG:RSc2311_2 KEGG:ns NR:ns ## COG: RSc2311_2 COG4585 # Protein_GI_number: 17547030 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 408 617 1 207 207 122 35.0 2e-27 MSIMDKSIASRIALLLGLFCLVLCAFLISLLQLLKMLSEEADETVNVALPNMAIASYISK ESEWAKGILHDSILCRDRFVHLGVVQEIEARRFPENVLESLEALAVDPGVKEKIKNNLHE LNGILAGTNQAVAQRIDNLRNTERLVKRIRTLNADLPQLERELYASGMKPGDPGFNVWKT AYSDVLTAMLLLSMHHDRPYSLRLKSEIRTDVKRLLRSAEASPFSGRLKKLSSDAATMAL AEDGLLALYEEYTALESTLDALKITHQYTSNSLIESANAVSQQSWTSIQASKKTLAERYA VIVATVGVTFFLAVFLALFIYRSIVRNVVDPIKRLNNCMLDRVNRIPTPFPESVRYELAE MTSSVRYFTSEIEKHEQELLHSHENLEKQVAERTCELKALSEKLLMAQESERFKLASELH DNIGATLGAIKFGMERSLRSIRSLPEHELQPDICDTLSTSVSLVKKLATHLRRIQNELRP PQLELGLEATIADFCEEYERVFAHMPIELSIALDEEKLPPNLPIVIFRIIQEALSNIVKH SGATRASVSLGMDGDALRLVISDDGKGFSVAEKLADVSVKSGLGLKSLRERAQLSSGVLT IGSELGRGTIIAVVWDAAALRG Prediction of potential genes in microbial genomes Time: Fri May 13 02:03:06 2011 Seq name: gi|316924672|gb|ADCP01000018.1| Bilophila wadsworthia 3_1_6 cont1.18, whole genome shotgun sequence Length of sequence - 9251 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 5, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score 
pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 187 - 1461 1854 ## COG0477 Permeases of the major facilitator superfamily 2 1 Op 2 . - CDS 1480 - 3081 1901 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold + Prom 3422 - 3481 5.5 3 2 Tu 1 . + CDS 3659 - 4834 937 ## COG0628 Predicted permease + Term 4917 - 4962 10.1 4 3 Tu 1 . + CDS 6020 - 6457 500 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component + Term 6508 - 6565 7.1 - Term 6499 - 6550 11.7 5 4 Op 1 . - CDS 6556 - 7302 612 ## COG0863 DNA modification methylase 6 4 Op 2 . - CDS 7280 - 7471 76 ## - Term 7534 - 7559 -0.8 7 5 Tu 1 . - CDS 7720 - 8760 871 ## COG3344 Retron-type reverse transcriptase - Prom 8945 - 9004 2.4 Predicted protein(s) >gi|316924672|gb|ADCP01000018.1| GENE 1 187 - 1461 1854 424 aa, chain - ## HITS:1 COG:YPO1668 KEGG:ns NR:ns ## COG: YPO1668 COG0477 # Protein_GI_number: 16121932 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 6 407 3 408 411 136 26.0 9e-32 MSNFLKYMRLFALGITFSSSMALPYVQIKFYDVFREATQASNNELGLLMTIFTAVSMALY IPGGVLADRGSPKKLLLLSLIMMCGLNAYFAFHTSYGAAQLIWALLAIAANTVQWPTLIK AIRDTGSSEEQGRMFGTFYGTTGIFSSIIGFVGAWIYSNGSSNIEGFHLMLLGQSAFCAV AAVAVGFFVEDKTPYNNNVPAPKENPFKSALTVMKLPAVWMMVVLIFCGYGMYIGITYMT PYTTNVLGASIAFGAVLGTIRAFGLRVLTGPFSGYISDKMGSAAKILAICFLIIIGMLFV ILSLPKGTSNAVIILLTMMFSFFGLVIYTTMFACMEEVSIPPQYTGIAVSVISLLGYLPD GLFPPLFGHWLDVYGNQGYSVIFYFLAGISLLGCCVAIVIYRKGRALRSGQVEVEAGNEA QVTA >gi|316924672|gb|ADCP01000018.1| GENE 2 1480 - 3081 1901 533 aa, chain - ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 5 532 18 549 553 281 33.0 2e-75 
MDRKVFYNGTILTMQAEGHTVEAVYTIGDRIAAIGNRDDVFRQITDDTERVDLEGKTLIP GFIESHAHVTDWAECEFLQFTCYHIKNIAELQDMIRENAKTKKAGEWVIGYGYNDSMVAE MRHLERADIDAACADHPVFVMHGSGHLAYVNSKALEICNIHADTPIPAGGQGEIHLGADG LPSGLLIGQAYNLALSVIPKYTVEEYKGAFRKGIAAANAKGVCSIGDGSIGYMGNQWQLL RAFHELEKEGDLNLRFYLVIMDYGYIPFMEGGLVTGFGSESLQLGSIKLLLDGSIQGRTA SVKEPYLGTDEKGLLHHTPEDLQARVERYHRGGCQIAIHTNGDNAIEEAINAIEKAQAAY PRPDARHMLIHCQMASEEQLDRMAKLGIIANFFVGHVHVWGDLHRDRFIGERALRMDPVA SAKKRNMPFCLHTDLPVTPLDPILSMYAATARRTKSGKILGEDQRVSVYDALRAWTVNAA HSMFGETVKGSIREGYLADLVILSQNPLETAPDDLLDVQVVQTVVGGKTVYRA >gi|316924672|gb|ADCP01000018.1| GENE 3 3659 - 4834 937 391 aa, chain + ## HITS:1 COG:PA2651 KEGG:ns NR:ns ## COG: PA2651 COG0628 # Protein_GI_number: 15597847 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Pseudomonas aeruginosa # 5 343 3 343 352 275 44.0 9e-74 MLTANRKTEDCAFLALLIGVTALFLWMMLPFFDVIFWAATIGILFHPVYRTIAEKWRMGR ILSAFITILLCLVLIIIPLSYILYNCVVEATLLYQRLHSDQNSFVGYIDRIKDSYPFLQD WLHYAGYDLERIKSLLTKAALSASSFLARNTVTLGENTLGFLTNLALVLYIAFFLLMDGT KIQELLIKALPFGDHREKLLFRKFAEVTRATVKGSLLVAAAQGALGGVIFWLLDIHAALL WGVVMTALALIPVVGAALVWGPVAVYLLLTGDYWDGGILLAYGSTVIGLADNILRPLLVG RDTKLPDFLVLLSTLGGFVFFGMDGFVTGPTLAVLFVTVWQIFIAEFGANAPVSPAASAI PGVIPVRPQPDSSVPTKAKKSKAPRKSRGQR >gi|316924672|gb|ADCP01000018.1| GENE 4 6020 - 6457 500 145 aa, chain + ## HITS:1 COG:PA4913 KEGG:ns NR:ns ## COG: PA4913 COG0683 # Protein_GI_number: 15600106 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Pseudomonas aeruginosa # 4 123 7 127 374 79 41.0 3e-15 MKKRMGIVAAVATAMLCSSFAFAAPIKIGLMAPITGAFASEGQDMQKICELMTEELNKAG GINGNKVQLVIEDDGSTPRSAATAASRLVAADVCAAIGTYGSAVTEASQDIYDEAGIVQI GTGCEQSRKLRISVTKIENGPDRFF >gi|316924672|gb|ADCP01000018.1| GENE 5 6556 - 7302 612 248 aa, chain - ## HITS:1 COG:XF2313 KEGG:ns NR:ns ## COG: XF2313 COG0863 # Protein_GI_number: 15838904 # Func_class: L Replication, recombination and repair # Function: DNA 
modification methylase # Organism: Xylella fastidiosa 9a5c # 12 238 11 239 243 203 43.0 2e-52 MTKDCLSDGVTLYQGDALGILATLPDAVMDAVLTDPPYSSGGVTMGARQADPAQKYQQSG TKRQYPPMLGDAKDQRSWTMWCTLWLGECWRIAREGAPLMVFTDWRQLPALSDAVQAAGW AWRGVIAWDKRSARPQIGKFRQQCEYVLFATKGRFIAHTRACLPGVYSYPVIPVQKVHLT SKPVALIEDLLAVTAPHASVLDPFMGGGSVGEACIRTGRGYVGMELSREYYDISRTRLTA VFAAKEQV >gi|316924672|gb|ADCP01000018.1| GENE 6 7280 - 7471 76 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQKEVLEEIRCRHCGKLLARGQAKIMEFKCPRCGAFTILRAMRPCSESPDGQNGANCDKG LSE >gi|316924672|gb|ADCP01000018.1| GENE 7 7720 - 8760 871 346 aa, chain - ## HITS:1 COG:alr3497 KEGG:ns NR:ns ## COG: alr3497 COG3344 # Protein_GI_number: 17230989 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Nostoc sp. PCC 7120 # 3 336 2 347 352 188 33.0 2e-47 MAKRAKHIWPDMVSFGNLLEAWRKVQLGKRYRPAVVRFRAAQEDNLLGLRDRLQSHTWEP SPCRRFRILEVKPRTIDAPTIGDCVVHHAAMNLLEPWFERRFIADSYACRKGKGTHAASL RTREFMRAASAQWGRPYVLKGDIAKYFSSIDHNRLLSMLPRIISDPDVLWLFERIVRHNG YEEKGLPLGCLTSQWLANLYLDALDHYVKDDMGVKYYVRYMDDFVIIGPSKAWCWATLES IRNFVEISLLLRLNPKTGVWPISRGIDFVGYRHWTDHVLPRKRTIKRARTAFRSFPGLYR AGKIDLDYIRSRVVSFTGYMAHCDGHTTLEHILARLVLTPGARRSP Prediction of potential genes in microbial genomes Time: Fri May 13 02:03:19 2011 Seq name: gi|316924663|gb|ADCP01000019.1| Bilophila wadsworthia 3_1_6 cont1.19, whole genome shotgun sequence Length of sequence - 7274 bp Number of predicted genes - 8, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 8.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 183 - 608 373 ## Amet_2552 hypothetical protein 2 1 Op 2 . - CDS 608 - 901 242 ## 3 1 Op 3 . - CDS 928 - 1227 211 ## HSM_0915 hypothetical protein 4 1 Op 4 . - CDS 1224 - 1646 460 ## 5 1 Op 5 . - CDS 1633 - 1896 329 ## 6 1 Op 6 . - CDS 1966 - 3447 1102 ## RC1_1137 phage-related hypothetical protein 7 1 Op 7 . - CDS 3457 - 6648 3291 ## Daci_2598 virulence-associated protein 8 1 Op 8 . 
- CDS 6658 - 7185 539 ## Daci_2599 hypothetical protein Predicted protein(s) >gi|316924663|gb|ADCP01000019.1| GENE 1 183 - 608 373 141 aa, chain - ## HITS:1 COG:no KEGG:Amet_2552 NR:ns ## KEGG: Amet_2552 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 5 123 4 113 113 67 37.0 1e-10 MALQDFTLLAKLEDFDLYTHNVCQKFQKSERHVLSASIRGNVEEIVCLVIEAAKVQLEER RKKRPPAKTLELLKRTDIRLEYLKMQIRKAYKLKQTDEKTYETWAGLARELGGLLGGWIK KVEPYADVSATSRTQKQNSLF >gi|316924663|gb|ADCP01000019.1| GENE 2 608 - 901 242 97 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIKVFVPAPRYYGKYGSTRKKSPDAGTAIRHFRHEAGLSQDGLADRMDVSPSYISMLESG KRYPSIEMLIRIALALNIKPGLMLDYIAERYLQRKES >gi|316924663|gb|ADCP01000019.1| GENE 3 928 - 1227 211 99 aa, chain - ## HITS:1 COG:no KEGG:HSM_0915 NR:ns ## KEGG: HSM_0915 # Name: not_defined # Def: hypothetical protein # Organism: H.somnus_2336 # Pathway: not_defined # 11 91 3 87 89 62 39.0 7e-09 MKTVIRNACRRLRRWGKQVLVALDQLVNALFGGWADETLSSRCWRKRDKTGWKQARMALD FVAALLGDADHCRASYDSERLRLQCPPELRSPDAQAVQF >gi|316924663|gb|ADCP01000019.1| GENE 4 1224 - 1646 460 140 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPRTEIGQHEISGLSGAEHDKAAARAAIDAETSAAILEGFDYEIDPGTGTPETLHFSYDA FDQQNFSDTANACLMLKAGAHGLPESVTWNAYRADGELVRLVLTADAFLALYAGGALAHK AACMAEGGTKKAALETEGAA >gi|316924663|gb|ADCP01000019.1| GENE 5 1633 - 1896 329 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDILISYDNQELSLSLAGGENAALYAVPTAYSEDGLYLAQWGTGELEPLPACGTWTPLLR LCLADGEDGADSLVEDFKADGVSYAQN >gi|316924663|gb|ADCP01000019.1| GENE 6 1966 - 3447 1102 493 aa, chain - ## HITS:1 COG:no KEGG:RC1_1137 NR:ns ## KEGG: RC1_1137 # Name: not_defined # Def: phage-related hypothetical protein # Organism: R.centenum # Pathway: not_defined # 62 492 431 849 850 233 34.0 1e-59 MIDKKSEYLELPLPNEQNMLEDDCPRIVEAFRKVDAHAKTADERLDGVEGRADSLEIRAS ALETAQASTEGTLQSHAASLSSLATEKADATALAGEVQARTEAVSGEASARADAITAAVA AETQAREEAMTAHDASLSAHDALVKRITVGSLSPVIGICCVEDGGGSGLWFNIDAEGQPV SPPRSYFDYHPTYNALRRVLVDGQMMQEHHKFYYKAFEIASGPFAGRRGRCISPGQMDGF 
KPFPSFMKNGQEVDTWYCGTFQATDEGGSPKKLGSRPGKAPFVNVNFPTMQSYCRNRNVG GVDGFDMWNIYQAAEIQLLALIEAATPDSQAYYGRGRVDTSSAANVDATDVATASWRGHV GLWGNVWQMCAGLEISTSGTVKLWKNDGSKEWVDTGFVCPAYDGSNPAYMQTLKTGNGEG FDFEDIFFPATTTTSAAAGTIPDGFWGRNGSAGNVCYLGASWVDGVRAGLFACDLNDPAS YATTNIGCRLAKM >gi|316924663|gb|ADCP01000019.1| GENE 7 3457 - 6648 3291 1063 aa, chain - ## HITS:1 COG:no KEGG:Daci_2598 NR:ns ## KEGG: Daci_2598 # Name: not_defined # Def: virulence-associated protein # Organism: D.acidovorans # Pathway: not_defined # 13 1062 7 1038 1038 767 43.0 0 MQTQSGKRIDNYWNRYDAGKKYMELLFRDGYGTQASEMNELQSLFAARVKSLADSLFKDG DILQDAQITVNAQTGALTAEAGLVYLSGAVWPVETASFVIPVQGTVSVGVRLRESIISEL EDPALCNPAVGSRGEGEPGAWRKKVEAVWGYDGDGGSGEFYPIHTVDDGVPRAKETPPNL DSFNQGIARYDRDSTGGGTYVCSGLTVRQAEDAGGGAQVYTVAEGRCRVSGYGVELLTSR RLSYAATPDLRFIDTEVIEADGSAKQRINVAHPPIHNITALRVTLQKTVSVVHGSYSGCA DALPDTSVMSLVEVKQGETTYEAGTDYKRTGDTVDWSPTGNEPATGSTYQVTYTYLDKTL TPEDADYDGFSVSGAVSGTGIMVSYNQALPRIDRLCITGDGTFTWVQGVAAEYNAKAPSV PETVLALCSVKQTWREASARVVQNDGVRVVAFSDMEALAARVEYALQEVARQRLEADVAT REAGARVGLFVDPLLDDSMRDQGIEQDAAIVNGFLTLPIAAKVYALSQDIQAPTARTFTP QILLEQPYRTGEMKVNPYMAFAILPGKATLSPAVDRWTEQATEWTSSVTRSFYQVIYAPN HPQHGQTVTSTSTSTEAVGTSQKALEYLRQIDVAFEVTGFGSGEVLQSITFDGIPVQAKE GQLTADDTGKLSGTFTIPAGVPAGTKLVVFRGGEGGSTAQATFVGQGTLEVTTLRQVQTV TNTNIDPLAQTFMLDKAVQLAGVDLWFTARGDSGVRVQIREVSNGVPTRTVLAEAVVPAA SLVVTGGGHTRIRFDAPVALAASTEYALVILCNDADTAVAIAEMGQFDSLHQQWVAAQPY TIGVLLSSSNASTWTAHQTRDLTFRLLEAVYAEGTNTLDLGTATVTDGATDLILLALAET PTAQSRVEYELGLPSGESMTVAEEQAVRLSESLTGPLSVKAKLAGDTAGSPVLWPGSQVL AGTVQTSGSYYTRSIPASGAKKAVLLYSASIPSGASVTPEIQVDSGSWEAMTADGTVQQG DGYVEYKFTHTLADADLVKVRLTLSGTIAARPMLYDIRLMAVA >gi|316924663|gb|ADCP01000019.1| GENE 8 6658 - 7185 539 175 aa, chain - ## HITS:1 COG:no KEGG:Daci_2599 NR:ns ## KEGG: Daci_2599 # Name: not_defined # Def: hypothetical protein # Organism: D.acidovorans # Pathway: not_defined # 3 174 1 160 161 119 41.0 3e-26 MSLATLTKAGRAAIALALSARPIHLAWGSGNPEWDAEEADLPSLVNATALVNELGRRTPA TIGFVEPDDEGDIVIPVATGAGGEVQEARYKSVTGPSPYLYVRTNYNFEDASNAVIREIG 
VFMDTELKEDLPPGQRYFTPDNLKSPGLLVAAQIIVPPINRSPSVRQTIEFVLPI Prediction of potential genes in microbial genomes Time: Fri May 13 02:04:46 2011 Seq name: gi|316924588|gb|ADCP01000020.1| Bilophila wadsworthia 3_1_6 cont1.20, whole genome shotgun sequence Length of sequence - 70934 bp Number of predicted genes - 76, with homology - 63 Number of transcription units - 27, operones - 13 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1238 785 ## Dde_3392 hypothetical protein 2 1 Op 2 . - CDS 1239 - 2426 1011 ## COG3299 Uncharacterized homolog of phage Mu protein gp47 3 1 Op 3 . - CDS 2377 - 2850 457 ## DvMF_1302 baseplate assembly protein, putative - Term 2864 - 2902 7.4 4 2 Op 1 . - CDS 2963 - 3697 740 ## DVU1104 baseplate assembly protein, putative 5 2 Op 2 . - CDS 3699 - 4517 741 ## DvMF_1300 hypothetical protein 6 2 Op 3 . - CDS 4514 - 5047 490 ## DVU1106 hypothetical protein 7 2 Op 4 . - CDS 5044 - 7566 2709 ## COG5283 Phage-related tail protein - Term 7604 - 7644 0.6 8 3 Tu 1 . - CDS 7739 - 8065 476 ## DvMF_1296 hypothetical protein - Prom 8108 - 8167 1.6 - Term 8120 - 8158 8.3 9 4 Op 1 . - CDS 8189 - 8653 608 ## DVU1110 hypothetical protein 10 4 Op 2 . - CDS 8663 - 10378 1736 ## DvMF_1293 hypothetical protein 11 4 Op 3 . - CDS 10388 - 10747 430 ## gi|212703613|ref|ZP_03311741.1| hypothetical protein DESPIG_01658 12 4 Op 4 . - CDS 10761 - 11276 674 ## DVU1113 hypothetical protein 13 4 Op 5 . - CDS 11288 - 11731 475 ## COG5005 Mu-like prophage protein gpG 14 4 Op 6 . - CDS 11734 - 12159 452 ## DvMF_1289 protein of unknown function DUF1320 15 5 Op 1 . - CDS 12279 - 13274 1064 ## DVU1116 hypothetical protein 16 5 Op 2 . - CDS 13305 - 13709 491 ## gi|212703608|ref|ZP_03311736.1| hypothetical protein DESPIG_01653 17 5 Op 3 . - CDS 13727 - 14659 973 ## DVU1118 hypothetical protein 18 5 Op 4 . 
- CDS 14741 - 15817 945 ## COG1397 ADP-ribosylglycohydrolase 19 5 Op 5 4/0.000 - CDS 15907 - 17823 1604 ## COG2369 Uncharacterized protein, homolog of phage Mu protein gp30 20 5 Op 6 . - CDS 17804 - 19363 1275 ## COG4383 Mu-like prophage protein gp29 21 5 Op 7 . - CDS 19448 - 19711 201 ## 22 5 Op 8 . - CDS 19729 - 21204 1419 ## COG4373 Mu-like prophage FluMu protein gp28 23 5 Op 9 . - CDS 21201 - 21818 686 ## DvMF_1281 hypothetical protein 24 5 Op 10 . - CDS 21828 - 22067 237 ## 25 5 Op 11 . - CDS 22064 - 22456 392 ## DVU1125 hypothetical protein 26 5 Op 12 . - CDS 22462 - 22725 208 ## 27 6 Tu 1 . + CDS 22643 - 23077 123 ## + Term 23170 - 23221 7.4 - Term 22981 - 23015 -0.1 28 7 Tu 1 . - CDS 23026 - 23667 441 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 29 8 Tu 1 . - CDS 23793 - 24152 387 ## LIC084 phage-related membrane protein - Term 24229 - 24272 10.5 30 9 Op 1 . - CDS 24321 - 24773 490 ## DvMF_1273 DNA-binding protein 31 9 Op 2 . - CDS 24776 - 25129 82 ## DMR_14540 hypothetical protein 32 9 Op 3 . - CDS 25122 - 25610 386 ## DVU1132 hypothetical protein 33 9 Op 4 . - CDS 25603 - 26031 317 ## DVU1133 hypothetical protein 34 9 Op 5 . - CDS 26036 - 26224 163 ## CDR20291_0106 anaerobic ribonucleoside triphosphate reductase 35 9 Op 6 . - CDS 26230 - 26439 104 ## 36 10 Op 1 . - CDS 26701 - 26964 204 ## 37 10 Op 2 . - CDS 27006 - 27665 413 ## GM21_3690 hypothetical protein 38 10 Op 3 . - CDS 27751 - 28008 319 ## 39 10 Op 4 . - CDS 28013 - 28195 115 ## 40 10 Op 5 . - CDS 28245 - 28538 352 ## COG0776 Bacterial nucleoid DNA-binding protein 41 10 Op 6 . - CDS 28549 - 28803 232 ## 42 10 Op 7 . - CDS 28815 - 29327 561 ## DVU1136 host-nuclease inhibitor protein Gam, putative 43 10 Op 8 . - CDS 29327 - 29608 355 ## gi|212703591|ref|ZP_03311719.1| hypothetical protein DESPIG_01636 44 10 Op 9 . - CDS 29638 - 30348 864 ## DMR_14500 hypothetical protein 45 10 Op 10 . 
- CDS 30378 - 32417 1908 ## COG2801 Transposase and inactivated derivatives 46 11 Tu 1 . - CDS 32574 - 32816 160 ## - Prom 32973 - 33032 7.1 47 12 Tu 1 . + CDS 33265 - 33657 135 ## + Term 33711 - 33750 -0.7 48 13 Op 1 20/0.000 + CDS 34109 - 34882 1060 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component + Term 34897 - 34927 5.0 49 13 Op 2 24/0.000 + CDS 35024 - 35929 1313 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 50 13 Op 3 19/0.000 + CDS 35926 - 36966 1277 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 51 13 Op 4 18/0.000 + CDS 36963 - 37724 225 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 + Term 37729 - 37783 -0.9 52 13 Op 5 . + CDS 37881 - 38627 248 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein + Term 38650 - 38687 6.1 - Term 38632 - 38679 13.3 53 14 Tu 1 . - CDS 38925 - 41936 3390 ## COG1012 NAD-dependent aldehyde dehydrogenases - Prom 42172 - 42231 7.1 - Term 42607 - 42646 3.4 54 15 Op 1 39/0.000 - CDS 42652 - 43524 953 ## COG0074 Succinyl-CoA synthetase, alpha subunit 55 15 Op 2 2/0.000 - CDS 43521 - 44705 1238 ## COG0045 Succinyl-CoA synthetase, beta subunit 56 15 Op 3 22/0.000 - CDS 44916 - 45473 710 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 57 15 Op 4 23/0.000 - CDS 45477 - 46277 859 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 58 15 Op 5 . - CDS 46279 - 47433 1167 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 59 15 Op 6 . - CDS 47430 - 47642 248 ## - Prom 47842 - 47901 2.1 + Prom 49026 - 49085 2.8 60 16 Tu 1 . + CDS 49289 - 49615 248 ## - Term 50010 - 50045 0.0 61 17 Tu 1 . - CDS 50170 - 51450 1219 ## PTH_1333 hypothetical protein 62 18 Tu 1 . 
- CDS 51587 - 52612 645 ## COG0247 Fe-S oxidoreductase - Term 52788 - 52832 4.0 63 19 Tu 1 . - CDS 52967 - 54343 1043 ## COG0277 FAD/FMN-containing dehydrogenases - Prom 54539 - 54598 5.4 - Term 55131 - 55175 8.6 64 20 Op 1 . - CDS 55222 - 55674 321 ## COG1803 Methylglyoxal synthase - Prom 55750 - 55809 3.0 65 20 Op 2 . - CDS 55811 - 56500 467 ## COG0684 Demethylmenaquinone methyltransferase - Prom 56615 - 56674 2.5 - Term 56542 - 56590 12.6 66 21 Op 1 . - CDS 56746 - 57744 655 ## COG2055 Malate/L-lactate dehydrogenases 67 21 Op 2 . - CDS 57741 - 58670 682 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases - Prom 58725 - 58784 3.2 + Prom 58683 - 58742 4.5 68 22 Tu 1 . + CDS 58775 - 59503 660 ## COG2188 Transcriptional regulators + Term 59529 - 59557 -0.9 - Term 59552 - 59593 10.2 69 23 Op 1 2/0.000 - CDS 59618 - 60778 1334 ## COG2721 Altronate dehydratase 70 23 Op 2 . - CDS 60784 - 61068 449 ## COG2721 Altronate dehydratase - Prom 61102 - 61161 2.2 71 24 Op 1 1/0.000 - CDS 61262 - 63241 2135 ## COG3333 Uncharacterized protein conserved in bacteria 72 24 Op 2 . - CDS 63299 - 64255 1180 ## COG3181 Uncharacterized protein conserved in bacteria - Term 64937 - 64981 8.3 73 25 Tu 1 . - CDS 65010 - 65921 706 ## COG3058 Uncharacterized protein involved in formate dehydrogenase formation - Prom 65959 - 66018 3.5 - Term 66115 - 66158 2.9 74 26 Tu 1 . - CDS 66406 - 67242 626 ## COG2145 Hydroxyethylthiazole kinase, sugar kinase family + Prom 67600 - 67659 3.8 75 27 Op 1 5/0.000 + CDS 67826 - 68413 730 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 76 27 Op 2 . 
+ CDS 68510 - 70876 3146 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing Predicted protein(s) >gi|316924588|gb|ADCP01000020.1| GENE 1 2 - 1238 785 412 aa, chain - ## HITS:1 COG:no KEGG:Dde_3392 NR:ns ## KEGG: Dde_3392 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 4 168 5 164 215 145 45.0 3e-33 MSEFWKYFHDRLAWPLIHNPGPLSGLVRGLSLALDGTRDDIVYFRRQWFPELCEADLVPD FGTGRGLVRHPKETAEQFRARVVGAFRWHRLGGKTEGLPEILKFYGFDALTVENLRIFQP SRWAEFQLGLRTPATQAEQDALLADLDTLLWLVNEYKPARSVLARVYTDTYDRVPTVWSG GPISEQGWSNGFWSLFSGVTWPGDGGDIVVSFGMVRRFLSERYNETGAGLGVESRTGFLA PYIDRPVWSRSAWSDVFPRNHGFTIGEIVSLHWCVRTTSSYPWHGAWDSRHWQEAATWDR ILPEWKMRWRSWARVEAVFSWPGDGKEPGDPVKVHGDGTWGDVNACYGRPQAIIYQGTRW GDAWGADPGRRELEILERRQDKGGLCTPAVHPARPQTAATRLPLAVPCRFVN >gi|316924588|gb|ADCP01000020.1| GENE 2 1239 - 2426 1011 395 aa, chain - ## HITS:1 COG:NMA1323 KEGG:ns NR:ns ## COG: NMA1323 COG3299 # Protein_GI_number: 15794250 # Func_class: S Function unknown # Function: Uncharacterized homolog of phage Mu protein gp47 # Organism: Neisseria meningitidis Z2491 # 72 395 54 349 351 82 27.0 1e-15 MPTQEQTVLPRVSRTIEDIRASVFGYVESVQDSLAARGYLPARLNLNKGVVRGLLEIFCW GYWQIYSLLERLLTQATPSSATGAWLDMHAASVDLSRRAATKARGNVRFLRAAQGNLEAN VTIPAGRIVRTMPDGAGRVYRYGTQAVAVLPAGADFVDVPVEAEDYGAAANASAGQICEL VTPVTGISGVTNPAGWLMEEGADEETDAQLRERYALQWQANNGCTKYAYMAWALSVPGVT SVSILDHHPRGQGTVDIVVRGADVLPTAALLDKVRAAIAPNTPINDDWLVKGPSAVSCVI DGAIEYTTGDPDAIRAQAENRLRALFAETSPLADVTALQIGQDLTLDLLTHTVMAVPGVK RVTWASPAQDVLPVPADGVACLENLSLSAVMAEEM >gi|316924588|gb|ADCP01000020.1| GENE 3 2377 - 2850 457 157 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1302 NR:ns ## KEGG: DvMF_1302 # Name: not_defined # Def: baseplate assembly protein, putative # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 6 147 2 143 144 181 62.0 6e-45 MSSVTTDLWGQDIALDDSGQARVAANGELLLTDGVETGVQDIRLRLFTRLGNLFYDREFG SLIHDWILEDSTAGNRAAFESEIVMRIEEDPRVVVGSVRCTVTAWDARSITALASWRFLD EDTPLNLVLQVNKLTMEMVIEDADPRTDSFTACFPND 
>gi|316924588|gb|ADCP01000020.1| GENE 4 2963 - 3697 740 244 aa, chain - ## HITS:1 COG:no KEGG:DVU1104 NR:ns ## KEGG: DVU1104 # Name: not_defined # Def: baseplate assembly protein, putative # Organism: D.vulgaris # Pathway: not_defined # 6 240 11 241 243 265 59.0 8e-70 MDEKQGKRDIVGAIRRLVELAMPDLRHYYRMTKKAKVVAVYESSGEYFCDVQPLRNDESA DPKEPVVPRVALPVLWGGPDRGVVCPPVTGALCDLSYYDGDPNYPFISNIRWGGGMSAPK AALNEFVIQLENGVEIRIDQEKRVVTLTPQEVRTEAGKNWTVKAGQGATVEARTVIVKGA QSVTLEAPNIYQNGNVTTGGHGGGSGNSIQNGDLTINGNLKVNGNTTTTGTSHAGSRSGG SCPH >gi|316924588|gb|ADCP01000020.1| GENE 5 3699 - 4517 741 272 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1300 NR:ns ## KEGG: DvMF_1300 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 2 271 3 265 266 257 54.0 3e-67 MIEGLNIRCNVGPVEVLRSPFMELVFRRRAVVSRATVVIPDPEGEVRAALTVGQAVSVRW GYRGEVGYWQEWEGTVENIDQPRADSASADAVTVSAVGLEKALTTTTVTESFYREPADVV ARRLLARTGLAVGTVDVPGDVLPYQVFSGVSVARAVKQLSHTLERSFGHDMSRHALWLGA SGLTWSAGDEPGDVYSVATAENLIAHTPPQTPDGVGVVVSVLLPGLTHSRLVHIRDTRRG VDATVRALDVVHTLQDGGNSTAISYGKDTGWL >gi|316924588|gb|ADCP01000020.1| GENE 6 4514 - 5047 490 177 aa, chain - ## HITS:1 COG:no KEGG:DVU1106 NR:ns ## KEGG: DVU1106 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 176 1 174 183 177 53.0 1e-43 MKLLTFEDGIVRIDGEELPGILSDLRVSGQVRFDEQSVDKASGKKKTPQGWEDADISLSL YLLTDEAGTCYDKLAVLEGFFKKTDGKANPSIFTVANSHLLARGIRRVVFSRLDSAENTR TDEIRASLSFMEHNPPIIRQERAQAKTPTPAELAKKAKEKAEKAAEQPETVIGVDVQ >gi|316924588|gb|ADCP01000020.1| GENE 7 5044 - 7566 2709 840 aa, chain - ## HITS:1 COG:ECs2641 KEGG:ns NR:ns ## COG: ECs2641 COG5283 # Protein_GI_number: 15831895 # Func_class: S Function unknown # Function: Phage-related tail protein # Organism: Escherichia coli O157:H7 # 69 553 201 644 696 218 34.0 3e-56 MEVFSVFATLSLVDMLSGPLDRIRRAMSGVNAATAGLGTRMGNLALAMAPAALAAGVMLG SFGACVGVAAGFEDQMAKVGAVSRASSEEMAALEATARELGATTQFTAVQVGEAEQYLAM AGFSAKENIAALPGVLNLAAATATDLGRAADISSDILSAFGLKAEEMTRVADVLALTCAT 
ANVNMELLGDTMKYVAPVARMAGLSLEETAAMAGLLGNVGIKGSQAGTTLKAMLNKMAAP TKEAQELFQKLGVTVKDSAGNLRSPVAVLGEMAEGLKTMGTAEQIAAMKMIVGEEAIAGF SELIKQEGIGAIAEYARQLEAGGGSAAEMAARMNDTLAGSLRGMGSAWESLQITIGKLFL PVVRAVVDALTGLLRLLDKAAQSKVGAAVLKMAAALATGVVAVTGFSAGLWAISRAAPFV AKALLPIKTALLGLGWPVWALIAAVGLLYAAWKKDFGGIATTLSRWGRNISLVVRGVTAV FGSLKDGVGEIRGELAEDIQAAGLVGVVTTVARVVHRIREFFAGIWDGLNFDGAVTALTP AILKVRELFDSLGELIGRVFGSEVKGAASEARGLGEILGGVLSWGLELVATVIANVVRGV DTLVSLFRWICAILTGDWTTATAMAENIWNNFCDSLMAFADLFRVGDWIRDAWAKATEWL STIDLFESGARLLDTFKEGILSKVESLKQTFSDALSKLRNLLPFSDAKEGPLSTLTLSGS RLMSTLGEGMNAGFPGLYSTLSSKLGSLKESIGNWWDGLWSEETASPAQGSGPRIPAAPG APEDVAEEQDRRARAAGDRAAAPQSWTLHIANITLPNVKDAQDFFTELQAAATEMGESMA >gi|316924588|gb|ADCP01000020.1| GENE 8 7739 - 8065 476 108 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1296 NR:ns ## KEGG: DvMF_1296 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 4 108 21 118 118 87 41.0 1e-16 MPEAKDNNRKYVSFSHTFSDPWSGENAEDALDVTLTFRFSKPTKTQIQRLQDKAPKNPGQ ASRNLLLEVVHPDDKQALTDKMEEYPGIATSFATAIIKGVGISSDLGN >gi|316924588|gb|ADCP01000020.1| GENE 9 8189 - 8653 608 154 aa, chain - ## HITS:1 COG:no KEGG:DVU1110 NR:ns ## KEGG: DVU1110 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 149 1 148 148 196 64.0 3e-49 MAANGNKYDWESITVTLPGGEAVAITEIKYEDGQGVTARYGRGSIPRGYGRGNYEASGSM VLDREEWEKLKKELTADGGGIYDHTPFTIVVSYANDDMETITDTLKDCKISKFSGGGGSQ GDDNVSPVTCEFTILKPILWNGVPAKKERASAVL >gi|316924588|gb|ADCP01000020.1| GENE 10 8663 - 10378 1736 571 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1293 NR:ns ## KEGG: DvMF_1293 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 566 1 560 561 608 60.0 1e-172 MGDVLHYLIDGTSGIATGGVDGKALVAGVCSGGTVGKAYLIGKRTDLGSMLGTGPLVDRV RDMLNTGGQEPFLVAVPVQGQPGGYISALQIKGTDVAATVSGYPSHNADVVARVATAGVI GTATLEISTDGGKTFAEPVPSATQNPISSGEEATGATLVFAEEAVLEQGASYAFTVRCPV GPVYRVGDAESPLVEVTEEATGVLDGAELVIQIVKGGARNEGTWRLSTDGGDNFGKTRTI 
PVDGKAEVPGFGVSVTFPVGSYAAGTTYECRLLPPSPSIVDVLDALETPLGIYDVEFVHV VGASDSVDWAAAQAKAEELWNRQRPTYFKLEARLPHDGEDLNDYAAALLAEKQGVACRFV TVCAQHGEITDSTGAARLRNAAGLQSGRVMSIPVQRAAGRVKDGPVSQLSLPDGWEAVRT ALEEAGFLTAKKYAGMEGTYWGDSRTLAEDSSDFRYEEALRTTFKAVRLTRQAALKSMYD EAGDPLRPDREGGLAYLKAQLENALDAMTDAGELAGYVVDIPSGQNVARDGVAVEITLIG VPIIREIRLYNRYTYAGSNFDPRIESYALAA >gi|316924588|gb|ADCP01000020.1| GENE 11 10388 - 10747 430 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212703613|ref|ZP_03311741.1| ## NR: gi|212703613|ref|ZP_03311741.1| hypothetical protein DESPIG_01658 [Desulfovibrio piger ATCC 29098] # 1 117 1 124 125 92 62.0 6e-18 MAARKKTTEEKNEQQAAATPAEQTSEQAAPGQEEGHVEDAREQTPETVETPEAPAAPETD VEALESLSVLADRHRVPSWQQAALCRFMGWEDGKMVSDAEYREALNSLKRRRLGGGRMA >gi|316924588|gb|ADCP01000020.1| GENE 12 10761 - 11276 674 171 aa, chain - ## HITS:1 COG:no KEGG:DVU1113 NR:ns ## KEGG: DVU1113 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 168 1 172 178 145 49.0 4e-34 MQTFATETITRAALAAGLPEGRVIDIVKKDNLTIERPRVEIQFLPERYTRTGRKLAVTRT KTEQIRKRELYEVELTVNVNALADDRAWLEAFSVDFVARLPRGGNDSRGNWIKVRAREAT FSRAPDKRVGDEVIEVFVKVNRLFVLTFTGRITKEEAEKLIPSFTINPTFK >gi|316924588|gb|ADCP01000020.1| GENE 13 11288 - 11731 475 147 aa, chain - ## HITS:1 COG:HI1568 KEGG:ns NR:ns ## COG: HI1568 COG5005 # Protein_GI_number: 16273465 # Func_class: R General function prediction only # Function: Mu-like prophage protein gpG # Organism: Haemophilus influenzae # 17 141 15 133 138 68 32.0 3e-12 MAVKNGVSLNWGGFDKALGKAAHKLGDTQALMESVGDALVSGTLKRFDAEEEPTGKKWPK SKRAAKEGGQTLTDKAFLRRSIDYAATPEKVMVGSNLPYARIHQLGGKTGKGHKVDMPAR PYLGVSKEDMEEVRETMADFLAGAFKA >gi|316924588|gb|ADCP01000020.1| GENE 14 11734 - 12159 452 141 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1289 NR:ns ## KEGG: DvMF_1289 # Name: not_defined # Def: protein of unknown function DUF1320 # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 140 1 142 143 166 61.0 2e-40 MILCSREHIVDLLHAKYVAACEKQNPGLVERTIEAVSGEIGDALSYRYPQPWPYVPEIVR 
YIAAVTSAYRVVEAITSLVDTEESGDNEWIPLQKQWKYCMDLLDQIARGKLKLPLEETNP DREEASVAVFSRQPFFDLRGL >gi|316924588|gb|ADCP01000020.1| GENE 15 12279 - 13274 1064 331 aa, chain - ## HITS:1 COG:no KEGG:DVU1116 NR:ns ## KEGG: DVU1116 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 330 1 330 331 340 51.0 3e-92 MLANLKGIFAPQAVAQSLKTLPPLESTIMDRFFKQRPTHPLSMLGITDLKAVVQTVPVVR RDGVPVPLDNESIETQFFAPLPIKVQVPVTAAELNDLRVLLGNQASLEAWRTRKVDQIRQ AVHATTEGMCAGVLTTGKLAWPVQLPGGRSESYGIDYGAPLTHELATKLTGTSKLSDVYR LLRAMQQEIRMAGIGGKVEFMCGEDVAAVFLDMAENYRSTAQDAPIGIKLGDGEVRIGSY VIRFMDETYPAPMTGEWVPKLDAKTLMGVAVDVPGTIWYCAIDSISANNAAVPLHIVPVK SDDDSSITLIGQAKPMPARPSRAVCKAVVVA >gi|316924588|gb|ADCP01000020.1| GENE 16 13305 - 13709 491 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212703608|ref|ZP_03311736.1| ## NR: gi|212703608|ref|ZP_03311736.1| hypothetical protein DESPIG_01653 [Desulfovibrio piger ATCC 29098] # 73 134 135 196 196 73 61.0 5e-12 MNEGYLGKHTLSGERAATGDHPVVLHHLPLSAKAKTAAIPVGTVMKRVDVMGDDDESDTV VGAAWEPLLSTDAATVLPVAVVDTPCDPTGENGESSALCVVHGGVKNRVLTTGDGKALTD VRIAQLSEHGIYAV >gi|316924588|gb|ADCP01000020.1| GENE 17 13727 - 14659 973 310 aa, chain - ## HITS:1 COG:no KEGG:DVU1118 NR:ns ## KEGG: DVU1118 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 6 310 9 312 316 204 44.0 3e-51 MGKEKWIEIARTGTFEDSAGRLRTFTAGDLDAIARSYDPARRDAPLTFGHPQTDKAPAYG WVEKLKSEGGRLYANFSQVPEQVRDLVAKGHYRHVSMSLMPDLVTLRHVALLGAEQPAID GLAAVEFADGGDAITVDFAAARGEGDTMTVEELQRQIGQLQGQLEALRAENASLKKQADS HKQEKDKAEAAKTEAEQKAEKASADFAAYRGKIEGERREARVAALVKAGKVKPAEKAGVL DFAARLATQAGTVDFAAPDGRTEKLSMEERYFRDLEARSADERGAEFSAPPAHAGGQSDN FNPAELTAKL >gi|316924588|gb|ADCP01000020.1| GENE 18 14741 - 15817 945 358 aa, chain - ## HITS:1 COG:CAC0339 KEGG:ns NR:ns ## COG: CAC0339 COG1397 # Protein_GI_number: 15893631 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ADP-ribosylglycohydrolase # Organism: Clostridium acetobutylicum # 48 341 24 318 341 
272 44.0 8e-73 MFRSAVINGHQQDNADCGNCEIYEYPESKPNEVYFDGAECEFYEKADDRPLALIMGVAVG DALGVPVEFKKRGTFYVTDMQGYGTHNQPAGTWSDDTSLTLALADNLLPESYELSDIAWG FINWYDKAAYTPHGKVFDVGNATAEAIKRLKKGVTPEKAGGTGERDNGNGSLMRIAPLTF YMFGIREAEERFRIVRDVSSLTHAHEWSVAACYIYVEMLNKLRMGRKKKAAYAELREDFA RGVPFISKATLGKFVRILENDISTLPEEEIRSSGFVVDTLEAAFWCFMTTDNYKDAVLKA VNLGDDTDTTGAVTGALAGLAYGLDSIPQEWREQLAAYDEVRRIAVAMPRWDYFRKIA >gi|316924588|gb|ADCP01000020.1| GENE 19 15907 - 17823 1604 638 aa, chain - ## HITS:1 COG:NMA1315_1 KEGG:ns NR:ns ## COG: NMA1315_1 COG2369 # Protein_GI_number: 15794244 # Func_class: S Function unknown # Function: Uncharacterized protein, homolog of phage Mu protein gp30 # Organism: Neisseria meningitidis Z2491 # 15 188 9 202 210 127 36.0 5e-29 MAKISVELPPWEIIAEPVAPDAAIEFWKQRAKLTDEEAKALGEEVKHRAFYVTGLAKQDL VQLVSDGIEEALKNGETLADFKKRIAAAIQTQGWHDYRVENIFRTNMQTAYSAGRYKKMQ AVKASRPYWQYIAVMDKRVRPSHAILHEKVYPADHEFWATNYPPNGFRCRCGVRTLSARQ VEKQGLTVETEMPKADMWTDPKTGYEYFVHFPGADKGFRNNPGKDWVQAGLDLKKHGMDT APPPPKKEPLTQKKLEADIASMDTLIKAAGDKQSVAELEAKKAELQELLDKKKTQAAKKK LNAQKKKLEQQIGEFPVKTYSGIWQADVTTADWAAKAGSIQAKKEYFESKLLFGSLTPEE TAKFKGLLQDLEEFDTQGQQLHELQKKQKNVQESLSKLKNGGKEDPNPYSEARKDAALWA QTPQEADDVLREKCGEVWRKASKAEKDALYAYTQGSGGFNRPLRGHDGYWGNFKGVGKVD LNNEGRGAAIQHMTNVINRSTYDKDIWLQRGIETAEGAASFLGIPVEALHQWSVSKLKKL EGEEIVEPAFASCGSAKGQGFSGYIFRIYCPKGTKMMYAEPFSHYGAGGKRKWDGKKTQT SFGYEDETIIQRGTKFRIMKVEKSGYKVSFEIAVIEQI >gi|316924588|gb|ADCP01000020.1| GENE 20 17804 - 19363 1275 519 aa, chain - ## HITS:1 COG:NMB1095 KEGG:ns NR:ns ## COG: NMB1095 COG4383 # Protein_GI_number: 15676976 # Func_class: S Function unknown # Function: Mu-like prophage protein gp29 # Organism: Neisseria meningitidis MC58 # 67 496 70 509 522 98 25.0 4e-20 MADGLFMPDGTFQPFNSAELSTELATRQNAGVFFGELDGWLNTLPDPDPVLRKRGDDAAI LRELSADDQVTTAMLSRKNRVLNCPHFSIRAGAPEGETPTPEAEELHRRFMRDLERTNLR TVISGMLDAPFYGVTPLELLWRFDGDWWHLVDIVPKPYHWFRFDSRNQPVFVGEYGLFCA DPRPLPAGKFVFVAHHATYDNPYGLRLLSRCLWPVSFKRGGLSFYARFVERHGMPWVVGE APAKATALEKQDMARGLSRMVQDAVAVIPYGANVKLEGAGQTQGALHESFLARQDRAISK 
VLMGQTLTVEMEGKNSQAAAQTHADVADDLADADKAMVTDAWNEIAWLYAQVNAGPGVFA PLAEYDEPEDLNVQADLGKKIREMGAKFTREYFTGRFGLKPEEFTLEDETTQEGGVDFAA PSGRKKTTAEKAQGNLDAAIVKMLPTALKSSRDFVTQVENEIRAAKSYEDLEEALAALLS PSMTRDALESFLARAMTAAAGYGAASVQAEGEEDGENLR >gi|316924588|gb|ADCP01000020.1| GENE 21 19448 - 19711 201 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRNSMLDEIEDKDVQRYRGEILEALKKLEHAQPRASLPSSGGNFPCRIKALEVCVLAMSD FLSDDHFTVTVQLHDKVEITGKTWIEA >gi|316924588|gb|ADCP01000020.1| GENE 22 19729 - 21204 1419 491 aa, chain - ## HITS:1 COG:HI1500 KEGG:ns NR:ns ## COG: HI1500 COG4373 # Protein_GI_number: 16273402 # Func_class: R General function prediction only # Function: Mu-like prophage FluMu protein gp28 # Organism: Haemophilus influenzae # 8 466 15 473 508 385 45.0 1e-106 MNVNIANVLLPYQRRWAADTSRVRVWEKSRRIGASYCEALLSALEAAKSREAGGQDTFYL SYNKEMTQTFIRDCAYWAKIFNMVAEDAEELVLRDEDRDVTVYRIRFASGFNVWGLPSEP RSLRSKQGRVIIDEAAFVDDLPELMKAAFALLMWGGSVSILSTHNGEDNPFNELVKDIRA GAKKYSLHRTTLDDAIADGLYKTICKRANPPRLWTPEDEEAWRAGIIADYGDGADEELFC IPNRSSGAYLTATIIEACMQSVPVLTWTPPAENFVDWPLPVAETYTKGWIEENLAFRLEE LPEDRAHFCGVDFGRSGDLSVIWPATEERDLRLVPPFVLELRNCPHRTQQQILFAILDRL PRFSGVSLDARGNGSALAEAARQEYGPAQVREVMISEAWYRETMPILKAGIEDKTLILPK DAGILSDFRSLRVVKGVARVPEQRTKDKTGGRHGDSAVACAMMLDARKELGSAEPWEYVG IELPGFDLNGW >gi|316924588|gb|ADCP01000020.1| GENE 23 21201 - 21818 686 205 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1281 NR:ns ## KEGG: DvMF_1281 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 200 1 198 199 145 51.0 1e-33 MGREYPTDTLWRAQELYCVDRLSYAAVAEATGVSATTLKAWGQKYGWARRREEIARAESE IRVNIIKGRQKALEQLLAAEDAKEAAPMAFAVSSLESLALKRQELAASGKIPDASAPARR KIATRADAVAALREAVERKLGLALADPDKISTATVQDVKRCLDLVAELEAGLPKETEAED ARKRGMSGELAQNIYRALGITEDAE >gi|316924588|gb|ADCP01000020.1| GENE 24 21828 - 22067 237 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRLEEMGHREELRTRRKIIEAEISSHSDSIRAALPLTGEPEDIDGEYVMMLAIKLNERV QELRGVRRKISVLERELGL >gi|316924588|gb|ADCP01000020.1| GENE 25 22064 - 22456 392 130 aa, chain - ## HITS:1 
COG:no KEGG:DVU1125 NR:ns ## KEGG: DVU1125 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 15 125 20 141 157 78 37.0 7e-14 MSAAISAALPLLEALFSLGVPGVLLMLASIPALVIAVIFILDYKHGKRVSRVLEAYREDT QESLRVMTEKFEASLREMNRKHDEVAEYYRKNVTLVKNYERMNDTLQTLVVNNTRAVEHL STIVETRTKL >gi|316924588|gb|ADCP01000020.1| GENE 26 22462 - 22725 208 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKHAVALLSVLIVLCSCGCGQKQAVPLILRYHDCPAPSVPVLPELDAAEPLDSTENVTRL LERDDRLRDYINGLKSALQCEQARGKL >gi|316924588|gb|ADCP01000020.1| GENE 27 22643 - 23077 123 144 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSGTACFWPQPQLQRTIKTLNSATACFIVHHLVLFFGGSGTRAGLFHYGRTFRRVPRGVR LPVQAQPGVSACATAPPVRFRRRFPPVAVLRRHGMGGAFRVKFGVYLGKASQQSRKQQGR RQPPHRRFHGRAPPSIPGPQPALI >gi|316924588|gb|ADCP01000020.1| GENE 28 23026 - 23667 441 213 aa, chain - ## HITS:1 COG:STM4217 KEGG:ns NR:ns ## COG: STM4217 COG0741 # Protein_GI_number: 16767467 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Salmonella typhimurium LT2 # 30 202 23 195 201 184 55.0 1e-46 MATGFTVGVGFWLALGFLCGLAHAAEVTIPRAAQQHRTTLTRAAHATWGMNAPVSVFAAQ VHTESWWRNNTVSHVGAQGLAQFMPSTARWLPSVAPETGKPEPFNPGWSLRALCTYDKWL WERNSGANDYERMAFTLSAYNGGQGWVNRDKKLARQRGLDAARWFGVVATVNAGRSAAAW KENRNYPRLILEERQHAYIRAGWGPGIEGGARP >gi|316924588|gb|ADCP01000020.1| GENE 29 23793 - 24152 387 119 aa, chain - ## HITS:1 COG:no KEGG:LIC084 NR:ns ## KEGG: LIC084 # Name: not_defined # Def: phage-related membrane protein # Organism: L.intracellularis # Pathway: not_defined # 5 119 11 124 124 87 39.0 2e-16 MKKSLCNPRLLLGLFLLCGIVLLAALLFFSPVQGPVVLYKVALVVVAAIAGMAFDFLAFP YALPSSYLDKDWRKDPDATGDDGKPDFPVATGYFRPFCAALLRRAVIIAAFVLAVALGL >gi|316924588|gb|ADCP01000020.1| GENE 30 24321 - 24773 490 150 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1273 NR:ns ## KEGG: DvMF_1273 # Name: not_defined # Def: DNA-binding protein # Organism: D.vulgaris_Miyazaki_F # Pathway: 
not_defined # 1 150 1 155 155 147 50.0 1e-34 MRTRIQEIVTLTKEGWRPYDAEPDRSVYEKLGCNHALRGQKRKPFWFVRDDVFCCIGCAD HCTLKRPAGFPLPLPIRYSVVPPDQPYTLTPLEMLARHDILTVRQAAYCLNISERQVYDY IAEGKLVKLKDTPVRVRADEVKALRKDFDE >gi|316924588|gb|ADCP01000020.1| GENE 31 24776 - 25129 82 117 aa, chain - ## HITS:1 COG:no KEGG:DMR_14540 NR:ns ## KEGG: DMR_14540 # Name: not_defined # Def: hypothetical protein # Organism: D.magneticus # Pathway: not_defined # 2 111 4 106 113 68 40.0 5e-11 MNEVDTLRAQVLERYTSIYAFCRAHPELKRATVYLVLSGRYPGKWSFQAARIRAALTGAE KAPETPAPGVTREILAETLQSIRCAHCRRLDRRECMACRDQTEREGKELFGRLFQEK >gi|316924588|gb|ADCP01000020.1| GENE 32 25122 - 25610 386 162 aa, chain - ## HITS:1 COG:no KEGG:DVU1132 NR:ns ## KEGG: DVU1132 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 7 156 19 169 169 100 44.0 2e-20 MDNQKMRLGLYRKIEIARKQLPNMDEEAFRALLRSEFGVSSRKDMNIHQLSRLVQLFAAQ YGVKYTAPARSRNNRVTPHGRPDFIEITDSMPYAREKRQILAIWRKLGYSMTSLDTRVKR AFGVHCFVWLQNGEQISTLLSDLQRREKAFEKKRKAEGGAGE >gi|316924588|gb|ADCP01000020.1| GENE 33 25603 - 26031 317 142 aa, chain - ## HITS:1 COG:no KEGG:DVU1133 NR:ns ## KEGG: DVU1133 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 4 138 3 147 152 67 37.0 1e-10 MRKDNRKFCASISVRNGEERAKLELSPAPLHGGPEGFYRVRLARRWLDTEDGAPRFFDRD GLARLAAELALCSLETPAPAPSIPCPSRVSVRRADGFYEGAWTNTEPLLNHAGRWVVNVS LGGRRVFVPVEDVVVHKERRRG >gi|316924588|gb|ADCP01000020.1| GENE 34 26036 - 26224 163 62 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_0106 NR:ns ## KEGG: CDR20291_0106 # Name: nrdD # Def: anaerobic ribonucleoside triphosphate reductase # Organism: C.difficile_R20291 # Pathway: Purine metabolism [PATH:cdl00230]; Pyrimidine metabolism [PATH:cdl00240]; Metabolic pathways [PATH:cdl01100] # 25 58 749 782 783 63 82.0 2e-09 MQEGIVDQRLPRQEEGGKVIIGEGVRFERIRRITGYLVGSVERFNNAKRAEVADRVKHAA MR >gi|316924588|gb|ADCP01000020.1| GENE 35 26230 - 26439 104 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MTRKRVGTFAPILGGYVGCVRFGCPTREFTPIDAYEYIDWPCCWHWPVTGAERIPFIPAK GNLRIFKVR >gi|316924588|gb|ADCP01000020.1| GENE 36 26701 - 26964 204 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRGAISDEVLSIVREMVSMNTANVTLPSNAVEAMINRIDHQDKMLRAYRVLVACHDNYLC EVDKTTAVSERFDRLEAARQIVAELEI >gi|316924588|gb|ADCP01000020.1| GENE 37 27006 - 27665 413 219 aa, chain - ## HITS:1 COG:no KEGG:GM21_3690 NR:ns ## KEGG: GM21_3690 # Name: not_defined # Def: hypothetical protein # Organism: Geobacter_M21 # Pathway: not_defined # 2 195 5 210 234 137 43.0 3e-31 MERDRIIERIRKLLRLSRSENPYEAALAAERVQRMLSEYNLTLEGIVDEETEKARQINRK TRKDLEEWAHILAGRTASVFDCQYFHDPNTGETSFVGVGADPEVCGWMYGYLYKTLLRLA SEHMRGPARRLRSSKSKREARKSFLLGAVGVISYRMAAQKKETPVTSCALVPVKEGLIRA AMPDDLKTNELHIGKLRDNDRLCGMIAAEGIPLERLHDL >gi|316924588|gb|ADCP01000020.1| GENE 38 27751 - 28008 319 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKYSEAIAEILLSNGEEYEQARKVIAGNLDNLAEVFGPRIRELSQFRTQLDIEAVQQMQA AGMDQGQAVLLRASTNMALANMFSK >gi|316924588|gb|ADCP01000020.1| GENE 39 28013 - 28195 115 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRFFGRNRKCRQCGAEPVLHVSLFGPLVGLRCPECDFTRTMMFTSMAKARQYWNRRNKG >gi|316924588|gb|ADCP01000020.1| GENE 40 28245 - 28538 352 97 aa, chain - ## HITS:1 COG:AGpT222 KEGG:ns NR:ns ## COG: AGpT222 COG0776 # Protein_GI_number: 16119937 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 19 95 71 147 149 63 41.0 6e-11 MTKRELIEKALETARNSHERAMTLADMGAALDSLCEVAAAELLGGGEVSLPGLGKIKMRE TAAREGRNPRTGESLHIPAGKKVVFVPGKDLKEALKP >gi|316924588|gb|ADCP01000020.1| GENE 41 28549 - 28803 232 84 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAYRLTKPDREKLGMRFAEILICRANARPGDMPELASRKDWKEAAAYERKKVASEIAEEA RGILLKSGYPKEQVEKATRHLIRY >gi|316924588|gb|ADCP01000020.1| GENE 42 28815 - 29327 561 170 aa, chain - ## HITS:1 COG:no KEGG:DVU1136 NR:ns ## KEGG: DVU1136 # Name: not_defined # Def: host-nuclease inhibitor protein Gam, putative # Organism: D.vulgaris # Pathway: 
not_defined # 1 170 1 170 170 181 57.0 7e-45 MARIKPDPHVVENRAQCEGALAEMAALDRKLSAIENEMRETVDGAKAKAGQLAGPLQARR KELADAVAVFARLNRQELFAKSKSLDMGFGVIGFRASTKIVQIRGVTAEMTLEKLHQYNL SDGIRTKEEINKDAALGWPDERLELVGLKRQQSDTFFIEISKENVPQGTA >gi|316924588|gb|ADCP01000020.1| GENE 43 29327 - 29608 355 93 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212703591|ref|ZP_03311719.1| ## NR: gi|212703591|ref|ZP_03311719.1| hypothetical protein DESPIG_01636 [Desulfovibrio piger ATCC 29098] # 1 92 1 92 93 122 91.0 6e-27 MLKENLLSILAALGEVQGNAKDEEQAALVRGCRDNLRAAAEQAEALENNLHVVSVDWTTE EAEIAVPAMELRQVARGIRASGVFPARAIREVM >gi|316924588|gb|ADCP01000020.1| GENE 44 29638 - 30348 864 236 aa, chain - ## HITS:1 COG:no KEGG:DMR_14500 NR:ns ## KEGG: DMR_14500 # Name: not_defined # Def: hypothetical protein # Organism: D.magneticus # Pathway: not_defined # 1 235 1 239 240 181 42.0 3e-44 MRDIIIETEAMSRFDTAVDEVAAADHGLSGFILAYGQAGRGKSVAAKRYAKERGGTLVHV WQGWSQTAFMQNLLAAIRGGEDMPRHSGTRCKEQIVRELEDDRRPLFVDEADRLRIDRIE DLRDIHELTGVPVVLIGEESIFGLLAERRRIWSRVINEVEFAPASPAEVALYAMRSTGLD IPMQLASEIARRTEGDFRLIRNMVFQLEKSAKASGSFKVDDGMLETVLSSRPWRRK >gi|316924588|gb|ADCP01000020.1| GENE 45 30378 - 32417 1908 679 aa, chain - ## HITS:1 COG:NMA1284 KEGG:ns NR:ns ## COG: NMA1284 COG2801 # Protein_GI_number: 15794213 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Neisseria meningitidis Z2491 # 143 536 211 622 657 66 23.0 2e-10 MPEETRVAIRAAEEKQALALCPAPAPIPALSPSTTTAIMDDKRRYKALARADLVCQYLTW QRRHGATKVQKGEFIIAYKAGAWPKLLEEVGPVSWQTLERWKLEQERAGSVLALADRRGV THKGKTMLTEEHKRVILGHVLNPNGAKVSQCVREVQKKFQAAGMALPSEPTIRRFVKHYM EECFDEWTLFREGKKAWNDKCAISLLRDWSLVGVGDVIIGDGHTLNFETLNPATGKPTRM TIVLFYDGASNCPLGWEIMPTENTDSISAAFRRSCIMLGKFPRVVYLDNGRAFRAKYFKG CPDFEQAGFLGLYRDLGCSVIHAWPYHGQSKPIERFFGTFHDMEVWMPSYTGNDIAHKPA RMKRGEDLHRQLYTKMGGRPLTMEETLVQVARWFAEYATRPQYRTHLHGRTPGDVFMEGR GEGLSPQDMQKLTLFMMQKEVRTITKDGIKVNGRLYWHEKLYSRRHPVLVRYDEHFNPYS VYVYTLDGEPLCEAKDREHYRIASGLHPVARILGTAEQQEELRANLELRKGLERSSTALM 
RGLLQSIVLPETQARIAALETETAEILPEKPKTAKVLRLSPPKVTREEEAALEEAKAQAR AEMDAEPSYTPSDLMRWKDSQARYAYLFEVKFEQGLELVPADAAWMEAYESTPEFERYLK NRYAALREMYEHKQAIQSA >gi|316924588|gb|ADCP01000020.1| GENE 46 32574 - 32816 160 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKLQPYGRGRADSKREIQRLLDAKGKNFVDVAAAAGVTPQTVSATMNGFRHSPRVLDAL RFFGIPERLLFDPRRAESAA >gi|316924588|gb|ADCP01000020.1| GENE 47 33265 - 33657 135 130 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQLPSESELADEIKGKVELSDAPLPTSSVGKNLAKLEHELDLEREERRELAAETRRLYRE KENLHREKEQLLREKEELLRENGTLREKLARLEAEHGKRISGHDDEGDFPSLFDERRSIA SSNPLPRARK >gi|316924588|gb|ADCP01000020.1| GENE 48 34109 - 34882 1060 257 aa, chain + ## HITS:1 COG:SMc00078 KEGG:ns NR:ns ## COG: SMc00078 COG0683 # Protein_GI_number: 15964676 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Sinorhizobium meliloti # 7 245 123 358 368 128 34.0 1e-29 MTPYTGSTSIRLTAKGMKRFFRTCPRDDEQGRVAQSTLAKFGFKKVAILHDNSAYAKGLA DEAKQRILDKKSADIVFFDALVPGERDYSAILTKIKGVKPDAILFTGYYPEAGLLLRQMQ EMKWAVPIIGGDATNNTDLVKIAGDAAAGYRFISPIMPADIDTPMAKAFLKAYEAKYHSL PSSIWSVLAGDAYNVLIAAIKAKGPDPAAIAGYLHNDLKNLDCFTGQISFDEKGDRVGDL YRLYEVNAKGEFILQPK >gi|316924588|gb|ADCP01000020.1| GENE 49 35024 - 35929 1313 301 aa, chain + ## HITS:1 COG:SMc01951 KEGG:ns NR:ns ## COG: SMc01951 COG0559 # Protein_GI_number: 15966238 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Sinorhizobium meliloti # 1 301 1 300 300 210 44.0 3e-54 MEEFFHQLTNGLAVGGIYALIALGYTMVYGVLRLINFAHGDLFTLGSYLGLTLLFSLGLQ GKMSGILVGGALIVMSMGLVAVVGYLLERVAYRPLRGSNRLAAVVSALGASIFFSNAIML VYGARSQVYPNGILPSVAVSLLGVDIPLMRIVMFVGTVVLMLLLYWFINHTRIGTAIRAV AIDQDTAKLMGINVDNIISMIFLIGPALGGAAGVMMGLYYGQITYDMGFSFGLKAFTAAI MGGIGNIPGAMVGGILLGVLEAMGAAYVSIAWKDAIAFLVLILILIFRPTGLLGERVAEK I >gi|316924588|gb|ADCP01000020.1| GENE 50 35926 - 36966 1277 346 aa, chain + ## HITS:1 COG:YPO3806 KEGG:ns NR:ns 
## COG: YPO3806 COG4177 # Protein_GI_number: 16123940 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Yersinia pestis # 51 319 120 410 428 193 42.0 4e-49 MSDSLFSKRYFYPVAGGIFCLVMACVPTILELAGLGGGGIAQLDMLNKIGIYAALALSLN LILGQTGLFHMGHTAFFAMGAYTTAILNTVYHWPVFATMPIAGVLAAIFAFVLAKPIIHL RGDYLLIVTIGIVEIVRIALTNNLFGLTGGANGIYGIARPSLFGMVFSSPKRQFYLIWTF TILTMFLFYVLEHSRFGRALNYIRYDDLAASGCGIDVTQYKLMAFTVGAFWAGMVGTLFA ANIRTIEPTSFNFYESVILFAIVILGGSGSIAGVVLGAFLLIGLQDVFREFESARMLIFG AAMMVMMIICPQGLLPPRRRFYNARFLVGRFPRGVQRAEPAREEAL >gi|316924588|gb|ADCP01000020.1| GENE 51 36963 - 37724 225 253 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 4 241 9 230 309 91 28 1e-17 MSLLILRNLLKTFGGLVAMNSVTFSVDEASIVGLIGPNGAGKTTTFNIITGNYRPDSGEV IFDRKDITGRPTHKIVELGIARTFQNIRLFQDMSVLENVLAGCHCRMRAGLFPSLFRTPA QRSEERQAVERAMRELAFVGLDDQYANLAGNLAYGNQRLLEIARALASEPRFLILDEPAG GMNDQETADLMLLIKAIQKRGITILLIEHDMNLVMRVCEKIVVLENGALIAEGTSAEIKR NPRVIEAYLGAEG >gi|316924588|gb|ADCP01000020.1| GENE 52 37881 - 38627 248 248 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 5 224 2 219 311 100 34 3e-20 MSEATTYLELRDLRVSYGKVEALHGISVHVEQGEIVTILGANGAGKTTTLTTISGLRKPS GGSILFQGRPLHTIPSHEIVRLGITQSPEGRRVFGTLSVRENLDLGAFTSKDRGRSAKIR KWIFDLFPRLAEREGQLAGTLSGGEQQMLAIARALMADPKVLLLDEPSLGLAPLLVRSIF DSVRKINRQGVTVILVEQNARAALKLASRGYVIEVGRVVMEDASEALLANPDVQAAYLGG GQENGQEN >gi|316924588|gb|ADCP01000020.1| GENE 53 38925 - 41936 3390 1003 aa, chain - ## HITS:1 COG:sll1561_2 KEGG:ns NR:ns ## COG: sll1561_2 COG1012 # Protein_GI_number: 16330961 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Synechocystis # 404 1000 5 585 586 566 49.0 1e-161 MDQQIEQKIVERGQEFFNSISGEAPSIFNKGWWTGKVMDWSMKNEDFKVQLFRFVDVLPY 
LNTSESLLRHIREYFASSGSEVPSVLRWGAGKAGLGGALTAKIMGGAIRSNIESMGRQFI IGQNVKEAMGGLAKLRKDGFAFTVDLLGEASVNEEESDAYAAGYHEVLDALAEEQKKWPA LSGNGPDNGMDWGSMPKVNISIKPSALYSRANPVALEDSVEGIYRRLAPLYQKTIDMGGF MCIDMEQLKYREITVELFKRLRSAPEFRHYPHLCLVQQAYLKDTEQAVRDLIAWARKEKL PIALRLVKGAYWDAETVFAKQCDWPVPVWTHKPESDLAHEKISRLILENHDIVYFACASH NVRSIAAVMEYARQLDVPEGRYEFQVLYGMAEPVRKGLRNVAGRVRLYCPYGKLIPGMAY LVRRLLENTANESFLRQSFADGAAVELLMENPAVTLERELAARQEKAPPNEEGPFPPFRN EPPVDFTIPEKRKAYTQGIAAVRAAEGRTLPLYIDGKDVVTETLLPTVNPADPGEVLAQV CQAGREEIDRALAGAKAVFPAWRDTPPLERAQYMHKAAAVARRRIYEMAALQTLEVGKPW EQAYGDVGEGIDFLEYYARDMLRLSVPRRMGRAPGEHNVLFYQPKGVAAVIAPWNFPFAI AMGMVSAAIVTGNPVVFKPSSLSSAIGYNLVEIFKEVGLPAGVFNYCPGQSSVMGDYLVE HPDISLICFTGSMDVGLHIVEKAAKVQPGQRQVKRVIAEMGGKNATIVDDDADLDEAVSQ VVYSAFGFQGQKCSACSRVIVLDAIYDTFVQRLALAAQSLRIGPAENPENFLGPLADASL QKNVLEYIEIAEAEGKTLVKRSDLPDKGAYVPLLIVEDVMPEHRIAQEEVFGPVLAVMRA KDFDEALELALSTRFALTGAVFSRSPENLAKAREKFRVGNLYLNKGSTGAMVGRQPFGGF NMSGVGSKTGGPDYLLQFMDPRCVTENTIRRGFTPIAEDDDWV >gi|316924588|gb|ADCP01000020.1| GENE 54 42652 - 43524 953 290 aa, chain - ## HITS:1 COG:NMB0960 KEGG:ns NR:ns ## COG: NMB0960 COG0074 # Protein_GI_number: 15676853 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Neisseria meningitidis MC58 # 1 289 1 290 296 325 58.0 9e-89 MSILIDDGSRVLVQGITGREGQFHTERMIRYGTRVVGGVTPGRGGQSTLGVPVFDSVRQA VRETEADVSVAFVPAALAADAACEASEAGIRLVVVIAEHVPVLDMVRVKAFFRERGTLLV GPNCPGIISPGKCKVGIMPGYIHRPGGVGLVSRSGTLTYEAVHQLTAAGLGQSSCVGIGG DPIHGLSFVDLLKLYGDDPETEAVCLIGEIGGDAEERAAEYIRSSRYPKPVFGFIAGLTA PPGKRMGHAGAIISGTSGRGEDKIAAFEEAGVTVIRELGSFGSVVAQKLL >gi|316924588|gb|ADCP01000020.1| GENE 55 43521 - 44705 1238 394 aa, chain - ## HITS:1 COG:SA1088 KEGG:ns NR:ns ## COG: SA1088 COG0045 # Protein_GI_number: 15926828 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, beta subunit # Organism: Staphylococcus aureus N315 # 1 388 1 388 388 370 50.0 1e-102 MHIHEYQAKRLLSAYGVPTPEGRLAATPGEAEQAARDLPGPVWVVKAQIHAGGRGKAGGV 
KVCRSPEEAREAAASLLGARLVTPQTGPDGERVNAVWIERGTASARECYLAVALDRASQC LTVMASPGGGMDIEAVAASSPERVFTARLDGGHYLWPFQARNLLEGWGLESGPARELASL VRRLAALATEKDMTQLEINPLALSEGGGFVALDAKMNFDDSALKRHPDIAALRDREEGDP LERAAAEKGLTYVRLGGSIGTLVNGAGLAMATMDAIKQAGAEPANFLDAGGGASEETVAA GFSIMLSDPHVRGILINIFGGILRCDIVAHGIVNAARKLDIAVPLVVRLEGTNVAEGRRI LRESGLRFHTASSMSEAARLIVGLAAESPQEAGV >gi|316924588|gb|ADCP01000020.1| GENE 56 44916 - 45473 710 185 aa, chain - ## HITS:1 COG:PAB0348 KEGG:ns NR:ns ## COG: PAB0348 COG1014 # Protein_GI_number: 14520725 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Pyrococcus abyssi # 11 181 9 182 185 129 42.0 4e-30 MENAYAVSRFLLSGSGGQGVITMAILLAEAAVKYEGLVAVQSQSYGPEARGGATRSDVIL SHKAIYYPKVEQPNILVALTDAACVKYLPLIRPGGLCIYDTELVHPGRKVEAQFKGLPLW GAVRERLGSAVSYNVAVLGALVTLTGAVRIGSIEMVLAERFPAAHHEGNLRALHLGVELA EPLLD >gi|316924588|gb|ADCP01000020.1| GENE 57 45477 - 46277 859 266 aa, chain - ## HITS:1 COG:TM0405 KEGG:ns NR:ns ## COG: TM0405 COG1013 # Protein_GI_number: 15643171 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Thermotoga maritima # 5 256 8 255 266 269 49.0 5e-72 MAIRSLIRERFFPHLWCPGCGHGHVLNAMLHSVDDLGYDPSSLVMVSGIGCSARIAGYVD FHTMHTLHGRALAFATGIKMSRPDLHVIVPMGDGDALAIGGNHFIHAARRNIKLTAIVMN NRIYGMTGGQYSPLSGIGVSATTAPYGNIDREFDTVRLALGAGATFVARTTTYHMREMIS IFKKALQHEGFAVVEVLSQCPTYFGRKNKLGDAVRMMERYRDHTAPVGSPKLAENPTLVP RGIFTDEDQPEYCARYASIIASQHQE >gi|316924588|gb|ADCP01000020.1| GENE 58 46279 - 47433 1167 384 aa, chain - ## HITS:1 COG:TM0878 KEGG:ns NR:ns ## COG: TM0878 COG0674 # Protein_GI_number: 15643640 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Thermotoga maritima # 7 376 6 379 386 340 48.0 2e-93 
MNGEKGILFTQGNEACVRGALYAGLRFFAGYPITPSTEVAELLSEELPHVGGRFIQMEDE ISSLCAVCAASVAGDKAMTATSGPGFSLMQEALGYAIMGEIPCVVTSVQRGGPSTGLPTK VAQGDVNQARWGVHGDHAIIVLTASSVQDVFAMTVEGFNMAETYRTPVILLFDEVVGHMR ERLDMPEPGELPVVERLRTSVKAGVNYYPYLPREDGRLPMSDFGGVHRYNVTGLYHDIWG FPTEQPEAVNKLLYHLVDKIEAKAAEIARWKEYFLDDAQQVFVSYGSAARSALHLVHTLR GKGIRAGLLELQTLWPFPAELVRDRTEGARNVYVVEMNMGHIMAQVRQAAWKPDRVFLVN RVDGQLITPTDIGKVMRVIEGRGF >gi|316924588|gb|ADCP01000020.1| GENE 59 47430 - 47642 248 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKVLINLQWCKGCGVCVAFCRSGALSLDGDGKPEYAPALCVGCGLCELYCPDLAVECLGE DGPGESGEGA >gi|316924588|gb|ADCP01000020.1| GENE 60 49289 - 49615 248 108 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIALKPENVVDVLFGNRLRQLLDERNISQAHFADRIGVSRSRMNNYVAQRSEPDYATLIR IANTLDTTIDFLLGKSEHQGLPQTACLSGFPDFIPSRGAGGEPPDPSC >gi|316924588|gb|ADCP01000020.1| GENE 61 50170 - 51450 1219 426 aa, chain - ## HITS:1 COG:no KEGG:PTH_1333 NR:ns ## KEGG: PTH_1333 # Name: not_defined # Def: hypothetical protein # Organism: P.thermopropionicum # Pathway: not_defined # 8 416 3 412 415 344 46.0 5e-93 MTETRCPLPKMARIRQTFARPRVDDMAAEMREQMQVLTPLIRPGMTVGLTVGSRGIQNIL TMLEVAVQAVRGCGASPVLLAAMGSHGGGTRQGQKDVLDSLGITEERLGAPVITCDVTRA IGETPGGLVAYMLESAFGVDAIIPINRVKTHTSFKGCVESGLCKKLVVGLGGPGGAGQFH SLGQAELPRLLVEVTKVILGKMPVLGGVAIVENAYEETARIKAIPAEALIEEEIRLLAWS KSLMPALPTDRLHGLIVEEMGKNFSGTGVDTNIIGRLRITGEPEMESPRIRYVSVLDLSE ASHGNATGVGLVDFVTRRLVDKIDRKATYLNNLTTTFVTRAFTPLWFDTDREMLETMMFC LRSVPLAETRLILIPNTLYLADCYVSEAILAELADTGRFEVLGPLRELAFDAQGNLISRI GLPRTS >gi|316924588|gb|ADCP01000020.1| GENE 62 51587 - 52612 645 341 aa, chain - ## HITS:1 COG:SMc00926 KEGG:ns NR:ns ## COG: SMc00926 COG0247 # Protein_GI_number: 15964532 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Sinorhizobium meliloti # 73 338 142 419 430 70 27.0 5e-12 MEPLDGWGAGETPLPWKTIRRLSRLCAGCGRCARRCARKLSTSELLADLRSQHPHWTQFV WDIWIRRVGPLWPLAGRIASLPFLPGGSSLETARALLPQETAAPWGRIKPAVQVGVGTPV AVFAGCTARNARPGWIARAETLLRRWGYVVLDGGGFTCCGGTLHHAGLFAAQAEVRRRNI 
ERWRDMGRPVIAAFCASCKHGLDAYAEEGGMEPEEAALWKRQVRGLSALLAAPVCETVPD APARIGYHQPCHWGEDDPDLPFLRQAFPLIMKGTGLCCGMGGILKMTDADVSAAIGRRCV EGFADCRHIITGCSGCTLQLASAAPSGVTVRHWLDVVDISV >gi|316924588|gb|ADCP01000020.1| GENE 63 52967 - 54343 1043 458 aa, chain - ## HITS:1 COG:BS_ysfC KEGG:ns NR:ns ## COG: BS_ysfC COG0277 # Protein_GI_number: 16079920 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Bacillus subtilis # 33 458 28 458 470 317 39.0 2e-86 MNAAKPALTPEQRSFLEALFKDDAAFEEDALRVYASDASLRRGPVLAMVRPRTVEQVREL MRWAEVERIPIHPRGRGTSLSGGCVPTRAGVVVSMLGMDRIFDISSEDFVAVIEPGVNTL AFQQACEQKGLFYPPDPASGKATSVGGNVMTCAGGLRAVKYGVTRDYVLGMEAVLPGGKL LKLGGRTHKDVIGLDISRMFVGSEGTLGIVTKLYLKLLPKPESSASVLVGYPSLDAALSS MSKVFAAGILPCAVEFMNETVLEILSRTGDVPWPDTVKSLLLFQVDGSRETVPLENARLA AQLDDALWSMRGVGKEEEDRLWAFRRRVSSSSYVLGPDRIGGDMAVPRGSLLKAVRRFEA IAASHGKRLIGFGHAGDGNIHANLHYDASDPDDARRTAAAHHALDDAALEFGGSLSGEHG GGCLKDVGKQLGADELEAMRAVRRLFDPQGILNPGKGY >gi|316924588|gb|ADCP01000020.1| GENE 64 55222 - 55674 321 150 aa, chain - ## HITS:1 COG:PM0394 KEGG:ns NR:ns ## COG: PM0394 COG1803 # Protein_GI_number: 15602259 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Pasteurella multocida # 2 149 13 151 152 152 50.0 2e-37 MNIALVAHDNCKKDLVRWAREHVNELKGHRLTCTGTTGRLIDAALRELLGDESPSTPVEC LKSGPLGGDQQMGSRIAEGRIDALIFFWDPMLAQPHDVDVKALLRISNLYNIPTACNVAT ANCLVSSPAFLQEHPQSTYDFSEYTERSLS >gi|316924588|gb|ADCP01000020.1| GENE 65 55811 - 56500 467 229 aa, chain - ## HITS:1 COG:AGpA472 KEGG:ns NR:ns ## COG: AGpA472 COG0684 # Protein_GI_number: 16119556 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 223 1 223 227 159 40.0 4e-39 MSVGFRIFTQRPLPDPELVRAYGAFATAPVADAMHRMCAMTPDVRLLSDPVGTMSGVALT VKARPGDSLMIHKALNMAEEGDVLIVSGGESGRSLMGELMFRYAASKRLAGIVVDGPIRD ADCLRGFPLPVYASGFTPGGPYREGPGEINVPVVCCGQSVEPGDIILGDRDGVIVIPSGD 
AASLYDAARQSHERDLAKTENARRGLFDRGWVDGALERKGCEIVEGPWR >gi|316924588|gb|ADCP01000020.1| GENE 66 56746 - 57744 655 332 aa, chain - ## HITS:1 COG:AGc268 KEGG:ns NR:ns ## COG: AGc268 COG2055 # Protein_GI_number: 15887519 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 315 18 333 346 275 48.0 1e-73 MILTPEQLQDLGMAVFREAGVPEESAASVVEALVRAELDGLPSHGFSRIPFYVDQALSGK VKARAIPLVRQPAPAAVLVDAGNGFAFPAILAGLEKAIPLARGNGVSVLGVTRSHHCGVV GLYAERIAEAGLISLIFSNTPAAMAPWNGSKASFGTNPLAFGCPREGKHPLVIDMSLSKV ARGKIMGAKQRGEAIPEGWALDAAGRATTDPVAALGGTMIPVGDAKGAALALMVEILSAT LTGANHAYEASSFFEPTGSAPGIGQTFILLDPKPLNPDFSARLEALCGHLLGQEGVRLPG ERRFMLRERNRVSGVNLPEALYNDLRKRAGVA >gi|316924588|gb|ADCP01000020.1| GENE 67 57741 - 58670 682 309 aa, chain - ## HITS:1 COG:MTH970 KEGG:ns NR:ns ## COG: MTH970 COG0111 # Protein_GI_number: 15678988 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Methanothermobacter thermautotrophicus # 3 309 5 307 525 220 40.0 2e-57 MAEIVISEFMDEAAVADLRASFDVLYDKTLVDRPEELAAAAAGCRALIVRNRTQVRGALL EGARLRVVGRLGVGLDNIDCKACRERGIAVIPATGANNTAVAEYVLAGLLMLARGCYCGT FAVAAGSWPRERMVGGEISGKVLGLVGFGGIARDVALRARACGMRVMAYDPFLPADAPDW EKLGVEPVMLETLLAEADAVSLHVPLTPDTRQLFDAGRLARMKPGAVLINTARGGIVEEA ALADALRSGRLAGAMVDVFEKEPLPAGSPLADVPNCLLTPHIAGVTRESNVRVSAVVARK VAECLRGAA >gi|316924588|gb|ADCP01000020.1| GENE 68 58775 - 59503 660 242 aa, chain + ## HITS:1 COG:RSc1997 KEGG:ns NR:ns ## COG: RSc1997 COG2188 # Protein_GI_number: 17546716 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 8 241 39 273 274 145 37.0 5e-35 MTPSSKLTFAPLYEQVKKAILNRIIAGEWGPGSFLPSEIALAKEYGVSQGTLRKALNALT REKRLVRYQGKGTAVAVLDEDSSLFPFFLLSDRNGKRVYPLSQIYGVRLATPSDAERAAL ELAPDAKVIHIDRVRMLNDFPVINERVALPTARFPHFNFEFDKVPNTLYEHYQIHFGITV 
AHATEQIEVTMPDSGDLKRLNIERNHPLLLVTRTSFDLQDRPVEFRISRINTGNHVYRAD LR >gi|316924588|gb|ADCP01000020.1| GENE 69 59618 - 60778 1334 386 aa, chain - ## HITS:1 COG:Ta1393 KEGG:ns NR:ns ## COG: Ta1393 COG2721 # Protein_GI_number: 16082370 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Thermoplasma acidophilum # 1 386 4 387 387 429 56.0 1e-120 MQKSFFGYRRDNGRVGIRNHVLILPLDDLSQPACNAVANNIGGTVAIQHPYGRLQFGADL DLHFRTLIGAGSNPNVAAVIVIGIEEKWTQDVVDGIAKTGKPVAGFGIEQHGQIDTIARA SRKAKEFVQYASALQKVECPISDLWVSCKCGESDTTSGLASCPTVGNAFDKLYEEGCTLV FGETTELTGGEHLVKARCANDRVRSEFTAFFDRYAKVIDDHKTSDLSDSQPTKGNIEGGL TTIEEKALGNIQKIGTKAPVIGTLDKAVTPTGPGLWFMDSSSAAAEMITLVAAAGYVVHF FPTGQGNIIGNPIVPVIKLSANPRTIRTMSEHMDVDVSGILRREMNLDQAGDKLLDMMFQ TIAGRMTAAEALKQNQFVMTRLYESA >gi|316924588|gb|ADCP01000020.1| GENE 70 60784 - 61068 449 94 aa, chain - ## HITS:1 COG:SSO1260 KEGG:ns NR:ns ## COG: SSO1260 COG2721 # Protein_GI_number: 15898103 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Sulfolobus solfataricus # 5 94 4 93 99 82 50.0 3e-16 MSVQFLVHEDGDSVGVITVEGVKAGQELTGWIMKEDKTITFKVLDDIPIGHKIALKDLNV GDTVFKYGTDIGKVVKPIRQGEHLHVHNVKTKRW >gi|316924588|gb|ADCP01000020.1| GENE 71 61262 - 63241 2135 659 aa, chain - ## HITS:1 COG:FN2105 KEGG:ns NR:ns ## COG: FN2105 COG3333 # Protein_GI_number: 19705395 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 15 496 14 491 494 321 43.0 2e-87 MTEFILPAILNLFEPLNILLMIAGLAGGIIIGALPGLSATMGVALMVPATFAMNPTSGLI MLGAIYAGAIYGGANSAILICTPGTPSSVATTFDGWPLCQKGDADIGLYTSLLSSAFGGI VGTVFLLCLAGSLARFALQFGGPESFWLCLFGLSTIAVMTPDNMGKGIVSGAVGILVSTI GLDPNTGAPRFTFGTYDLLQGVSVIPCMIGLFSFSQVLYLIGTDKTFIAEYHPHKGAFGK VLRYLTTRCKTVLVRSSIIGTWVGMLPGAGGEIASIISYNESKRWDKDPSRYGKGCIEGV AASESANNAVIGGSLIPMLTLGIPGSAVAAVILGALMAHGIQPGFKIFTASGDLAYTFIM SQFAVNLLMIPVGFVLCRTMARLLSIRLTLVAIGIVVLAYIGAYAISNSLVDIWVVLAFG 
LVGFFGGKVGMDTGAMALGVILGPMIEENLGKSVDLSRSVDGGLMSVMFEGSINKVLIAA LLLSLLTPFLLKLRNRKHAALLPEGPRPGWKTDLFAGIGFLAIAAAFLVQYEGLEGVSRV FPEVLTTLIGAGGAYFVGKGLWRRSREAAEPEGGEAVAWCKIGIICATSLCYAVLIPVFG FLVTTMGFIAGSTLILGDHSRGLPRLARTAIAFAIGFGLLVWAGFTLLLNVPVPEGMFL >gi|316924588|gb|ADCP01000020.1| GENE 72 63299 - 64255 1180 318 aa, chain - ## HITS:1 COG:FN2103 KEGG:ns NR:ns ## COG: FN2103 COG3181 # Protein_GI_number: 19705393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 25 315 16 307 308 174 34.0 2e-43 MKLLSCLRTALCLLGLLAMPAAAAYPEKPINMIIAFTAGGSSDVQARIMQKYWNKYAPQP WVFVYKPGAGGIIGFTEIAKARPDGYTIGGLNVPHIILQPLAQNAQFTPDSFKYICQVVN DPQCIAVRKDSPYQSTSEIIAAARKNPGKIRVGLVGPLSGHHLMFLDYGKKYPDAKLTSV FYKGAADQNAALLGGEIDMIFGNINDVMRSIDEFRVLNVAAEKRNDFLPDVPTLREAGID MVSDIRRCFAAPKGIKPAELDALRVLFKKICEDPEYLADMKKAGQPAEYMDGEAFAAYIA GQQAHAQEALSAAGLLKK >gi|316924588|gb|ADCP01000020.1| GENE 73 65010 - 65921 706 303 aa, chain - ## HITS:1 COG:aq_1051 KEGG:ns NR:ns ## COG: aq_1051 COG3058 # Protein_GI_number: 15606334 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Uncharacterized protein involved in formate dehydrogenase formation # Organism: Aquifex aeolicus # 140 296 134 276 283 72 28.0 8e-13 MEASKNDRLVKKFAALEDKTWFPQELLELVKDVCRLQGEVRASLAVSVDPALVCGDMEHR QGAPLLAREQFPVDAEGAEKLFRAILEVAESLPQLRTTAQLVRDKLIRGDIRPEELFRAY MLENAEPFEAWAKEAPDAPNLLPFLVYNSMEPWLEAAGEALSSAYPQNDVWQHGHCPVCG SPAFIGHLSGPEPSRNEGRDINKGGKRMHTCSYCRTTFRAKRIQCPFCLEEDAKKLDYFT TENEPGYQVHVCRSCKSYIKIADFREFFGRESIPALDDLESLPLDIAAQNEDFHRVAPSE WGL >gi|316924588|gb|ADCP01000020.1| GENE 74 66406 - 67242 626 278 aa, chain - ## HITS:1 COG:CAC3096 KEGG:ns NR:ns ## COG: CAC3096 COG2145 # Protein_GI_number: 15896347 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxyethylthiazole kinase, sugar kinase family # Organism: Clostridium acetobutylicum # 8 274 6 271 273 253 50.0 2e-67 MNPFENILSAVRQKRPLVHFITNYVTVNDCANMTLALGGSPIMADDLCEVEQIVGHCSAL 
VLNIGTLSERTVQSMLAAGKEANRLGRPVVFDPVGVGASRFRNETAAMLLREVRFDVIKG NISEIAYLAEGIGTTKGVDANLDALSTEGNLRWHTDLARKVSTQTGAAIVISGPIDIVAD SHEAWAIRNGHPMMANITGTGCMSAGVIGCCVGADPQALLPSCVCAMSAMGICGELAYEK LLSVDGGSGTYRVLLMDAMSKLDGATLTRRSKAERLRI >gi|316924588|gb|ADCP01000020.1| GENE 75 67826 - 68413 730 195 aa, chain + ## HITS:1 COG:aq_1039 KEGG:ns NR:ns ## COG: aq_1039 COG0243 # Protein_GI_number: 15606330 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Aquifex aeolicus # 1 195 4 198 1015 172 44.0 2e-43 MSVSRRGFLKFGLGAALAGAVSGLGFNLRPAFAQAKEFKLKWTKQTTSICCYCAVGCGLI VSTSTSDNPLGKGRAINVEGDPDHPINEGSLCPKGASTWQLTVNDQRPKKPLYRAPHATE WKEVEWEWAMEEIAKRVKESRDKSFTEKNDKGQLVNRTEAMASFGSAAMDNEECWAYQAI LRSLGLVYIEHQARL >gi|316924588|gb|ADCP01000020.1| GENE 76 68510 - 70876 3146 788 aa, chain + ## HITS:1 COG:STM4037 KEGG:ns NR:ns ## COG: STM4037 COG0243 # Protein_GI_number: 16767302 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Salmonella typhimurium LT2 # 1 781 229 1016 1016 668 45.0 0 MGSNPAENHPISFKWVLKAQDKGAKVVSVDPRFTHSSSKADFYVQLRVGTDIAFLGGMIN YILNNDKYFKQYVIDSTNGPTVVGERFGFSDGLFSGFNADKKSYDKKTWAFEMDANGVPV RDETLTHPRCVLQLLKKHYDRYTLDKVSETTGTPKEDLLRVYETFASTGAPDKAGTIMYA MGWTQKSVGVQNIRAMSIIQLLLGNIGVAGGGVNALRGESNVQGSTDQALLADIWPGYLP VPTSTLTTLADYNATTPKTNDPKSVNWWGNRPKYTASFLMSLYPGVTPDVAYSYVPRLDA DKKRTDYMWMSIFDRMVEGKLDGLFAWGMNPACSGPNANKSRGAMEKLKWLVNVNLFDNE TGSFWRGPGKDPTQIATEVFFLPCCTSIEKEGSIANSGRWMQWRYAGPDRYGETKPDGDI MVEMMLAIRKLYKEQGGVFPEPILGLGIDKWMEGHEFSPANTAKVMNGYFLRDVTIGGKL YKAGHQVPAFAMLQADGSTASGNWLHAGSWTDEGNMMARRDKTQTPEQEKIGLFPNWSYA WPMNRRIIYNRASVDKQGRPWNPDKPVIQWRDGKWVGDVVDGGGDPGTKYPFIMQKDGQG ALFGPGREDGPLPEFYEPMESPFEKHAFSAQRSNPVAFKAIGEVLAVADERFPFIGTTYR VTEHWQTGLMTRRCSWLVEAEPQVFAELDPELAKERGIGNGDVVRVSSARGNLLAKAIVT NRFQPMNANGKTVHMIGLPWHFGWLVPKRGGDSANLLTAAVGDPNSAIPESKAFMVNIEK MPDQDLPE Prediction of potential genes in microbial genomes Time: Fri May 
13 02:09:39 2011
Seq name: gi|316924508|gb|ADCP01000021.1| Bilophila wadsworthia 3_1_6 cont1.21, whole genome shotgun sequence
Length of sequence - 107193 bp
Number of predicted genes - 81, with homology - 72
Number of transcription units - 54, operones - 16 average op.length - 2.7

N  Tu/Op  Conserved pairs(N/Pv)  S  Start  End  Score
1 1 Op 1 . + CDS 45 - 749 622 ## COG0437 Fe-S-cluster-containing hydrogenase components 1
2 1 Op 2 . + CDS 774 - 1253 477 ## Ddes_0558 cytochrome c class III
        + Term 1288 - 1326 6.7
3 2 Tu 1 . + CDS 1596 - 2180 607 ## COG0746 Molybdopterin-guanine dinucleotide biosynthesis protein A
        + Term 2297 - 2345 0.2
4 3 Tu 1 . + CDS 2701 - 6732 3805 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family
        + Term 6794 - 6827 2.2
        - Term 6990 - 7016 -0.7
5 4 Tu 1 . - CDS 7041 - 7997 787 ## COG0583 Transcriptional regulator
        - Prom 8120 - 8179 3.9
6 5 Op 1 8/0.000 + CDS 8199 - 9191 1357 ## COG2358 TRAP-type uncharacterized transport system, periplasmic component
7 5 Op 2 . + CDS 9365 - 11302 2455 ## COG4666 TRAP-type uncharacterized transport system, fused permease components
8 5 Op 3 . + CDS 11305 - 12939 1985 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit
        + Term 12960 - 12987 0.1
9 6 Tu 1 . + CDS 13176 - 14864 1872 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases
        + Prom 14872 - 14931 2.9
10 7 Op 1 6/0.000 + CDS 15086 - 16552 1824 ## COG0043 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases
11 7 Op 2 . + CDS 16662 - 17228 709 ## COG0163 3-polyprenyl-4-hydroxybenzoate decarboxylase
12 7 Op 3 . + CDS 17225 - 17575 460 ##
13 8 Tu 1 . - CDS 17621 - 18412 698 ## Dtox_2803 4Fe-4S ferredoxin iron-sulfur binding domain protein
        - Prom 18474 - 18533 5.7
14 9 Tu 1 . + CDS 18887 - 20260 1527 ## ROP_05410 putative oxidoreductase
        + Term 20482 - 20518 -0.5
15 10 Tu 1 . - CDS 20332 - 21531 949 ## COG2508 Regulator of polyketide synthase expression
        - Prom 21707 - 21766 9.3
16 11 Op 1 . + CDS 22046 - 22426 351 ## COG0251 Putative translation initiation inhibitor, yjgF family
17 11 Op 2 . + CDS 22463 - 24025 1831 ## COG1292 Choline-glycine betaine transporter
        + Term 24075 - 24120 14.0
18 12 Tu 1 . - CDS 24418 - 25077 465 ## COG0730 Predicted permeases
        - Prom 25215 - 25274 3.8
        - Term 25409 - 25442 2.1
19 13 Tu 1 . - CDS 25507 - 26967 1403 ## COG2233 Xanthine/uracil permeases
        - Term 27002 - 27040 1.0
20 14 Tu 1 . - CDS 27112 - 27438 236 ## Glov_0942 hypothetical protein
        - Prom 27623 - 27682 2.6
        - Term 28078 - 28109 4.1
21 15 Tu 1 . - CDS 28141 - 28449 225 ## Ddes_1652 hypothetical protein
        - Prom 28505 - 28564 7.0
22 16 Op 1 . - CDS 28732 - 28965 220 ## LI1082 Fe2+ transport system protein A
23 16 Op 2 . - CDS 28962 - 29354 117 ##
24 17 Op 1 . - CDS 29458 - 29715 71 ##
        - Term 29732 - 29767 -0.7
25 17 Op 2 2/0.000 - CDS 29774 - 31051 1584 ## COG0477 Permeases of the major facilitator superfamily
        - Term 31088 - 31132 9.2
26 17 Op 3 . - CDS 31135 - 32748 1829 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold
        - Prom 32868 - 32927 4.7
        + Prom 32832 - 32891 3.4
27 18 Tu 1 . + CDS 32977 - 36090 3276 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains
        + Term 36208 - 36247 3.0
28 19 Tu 1 . - CDS 36310 - 38655 1424 ## COG4953 Membrane carboxypeptidase/penicillin-binding protein PbpC
        - Prom 38813 - 38872 2.7
        - Term 38832 - 38872 11.2
29 20 Tu 1 . - CDS 38961 - 44549 4912 ## COG2373 Large extracellular alpha-helical protein
        - Prom 44570 - 44629 6.3
30 21 Tu 1 . - CDS 44756 - 45535 767 ## COG1414 Transcriptional regulator
        - Prom 45582 - 45641 2.2
        + Prom 46052 - 46111 5.2
31 22 Op 1 . + CDS 46154 - 46516 355 ## COG3450 Predicted enzyme of the cupin superfamily
32 22 Op 2 17/0.000 + CDS 46558 - 47313 192 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16
33 22 Op 3 21/0.000 + CDS 47366 - 48472 1665 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components
34 22 Op 4 2/0.000 + CDS 48652 - 49449 1055 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component
35 22 Op 5 24/0.000 + CDS 49453 - 50289 1235 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component
36 22 Op 6 . + CDS 50317 - 51078 197 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc)
        + Prom 51343 - 51402 2.9
37 23 Op 1 . + CDS 51428 - 52210 998 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases)
38 23 Op 2 11/0.000 + CDS 52235 - 54634 2865 ## COG1882 Pyruvate-formate lyase
39 23 Op 3 11/0.000 + CDS 54653 - 55558 1056 ## COG1180 Pyruvate-formate lyase-activating enzyme
        + Term 55567 - 55608 -0.8
40 23 Op 4 . + CDS 55727 - 58099 2806 ## COG1882 Pyruvate-formate lyase
        + Term 58228 - 58269 9.7
        - Term 58220 - 58253 5.4
41 24 Tu 1 . - CDS 58465 - 59964 1604 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit
        - Prom 60157 - 60216 2.4
42 25 Tu 1 . + CDS 60310 - 61116 814 ## LI0105 hypothetical protein
43 26 Op 1 . + CDS 61232 - 61723 414 ## COG3542 Uncharacterized conserved protein
44 26 Op 2 . + CDS 61741 - 62187 336 ## COG1683 Uncharacterized conserved protein
        + Term 62421 - 62475 4.2
        - Term 62218 - 62264 9.4
45 27 Tu 1 . - CDS 62490 - 63026 773 ## COG2193 Bacterioferritin (cytochrome b1)
46 28 Tu 1 . + CDS 63296 - 64093 794 ## COG3884 Acyl-ACP thioesterase
        + Term 64141 - 64169 1.4
        + Prom 64778 - 64837 5.5
47 29 Op 1 . + CDS 64952 - 65662 893 ## COG2186 Transcriptional regulators
48 29 Op 2 . + CDS 65664 - 66461 879 ## COG0730 Predicted permeases
        + Term 66478 - 66508 3.3
        - Term 66466 - 66496 3.3
49 30 Op 1 1/0.077 - CDS 66529 - 67164 690 ## COG1266 Predicted metal-dependent membrane protease
50 30 Op 2 . - CDS 67161 - 68081 1039 ## COG1575 1,4-dihydroxy-2-naphthoate octaprenyltransferase
        - Term 68091 - 68131 3.0
51 31 Tu 1 . - CDS 68233 - 68430 62 ##
        - Prom 68484 - 68543 2.0
        + Prom 68755 - 68814 3.6
52 32 Tu 1 . + CDS 68842 - 69201 105 ##
53 33 Op 1 . + CDS 69463 - 69711 185 ##
54 33 Op 2 . + CDS 69708 - 71183 1859 ## COG0591 Na+/proline symporter
        + Prom 71509 - 71568 2.8
55 34 Tu 1 . + CDS 71590 - 72558 461 ## COG1570 Exonuclease VII, large subunit
        + Term 72610 - 72656 12.3
        + Prom 72897 - 72956 8.3
56 35 Op 1 40/0.000 + CDS 73091 - 73816 899 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
57 35 Op 2 1/0.077 + CDS 74057 - 75541 1786 ## COG0642 Signal transduction histidine kinase
        + Term 75679 - 75715 -0.5
58 36 Op 1 16/0.000 + CDS 75902 - 78055 2758 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing
59 36 Op 2 . + CDS 78149 - 78871 809 ## COG0437 Fe-S-cluster-containing hydrogenase components 1
60 36 Op 3 . + CDS 78868 - 79608 993 ## Dbac_2336 formate dehydrogenase gamma subunit
61 36 Op 4 . + CDS 79619 - 79912 617 ## COG1555 DNA uptake protein and related DNA-binding proteins
        + Term 79986 - 80041 4.3
62 37 Op 1 . + CDS 80089 - 81249 1709 ## gi|121533853|ref|ZP_01665679.1| hypothetical protein TcarDRAFT_2230
63 37 Op 2 . + CDS 81325 - 81567 318 ## gi|121533854|ref|ZP_01665680.1| hypothetical protein TcarDRAFT_2231
        + Term 81580 - 81628 9.0
64 38 Tu 1 . + CDS 81648 - 82370 774 ## MTH1541 hypothetical protein
65 39 Tu 1 . - CDS 82947 - 84365 2067 ## LI0461 hypothetical protein
66 40 Tu 1 . - CDS 84478 - 85851 1782 ## COG0786 Na+/glutamate symporter
        - Term 85918 - 85956 0.6
67 41 Tu 1 . - CDS 85972 - 87618 1827 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold
        - Prom 87692 - 87751 5.7
        - Term 88325 - 88366 -0.4
68 42 Tu 1 . - CDS 88370 - 88822 246 ##
        - Prom 88895 - 88954 6.0
        + Prom 88846 - 88905 4.5
69 43 Tu 1 . + CDS 89079 - 89720 319 ## COG0500 SAM-dependent methyltransferases
        + Prom 89910 - 89969 3.3
70 44 Tu 1 . + CDS 90079 - 93180 3016 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains
71 45 Tu 1 . + CDS 93481 - 93744 279 ##
        + Term 93827 - 93870 10.0
        - Term 93798 - 93869 21.4
72 46 Tu 1 . - CDS 94091 - 95731 1737 ## COG0303 Molybdopterin biosynthesis enzyme
        - Prom 95777 - 95836 4.2
        - Term 96238 - 96275 0.6
73 47 Tu 1 . - CDS 96298 - 96945 524 ## Dvul_2281 putative phage repressor
        - Prom 97014 - 97073 1.7
74 48 Tu 1 . - CDS 97131 - 97358 204 ##
        - Term 97402 - 97449 -0.4
75 49 Tu 1 . - CDS 97471 - 97878 347 ## COG0432 Uncharacterized conserved protein
        - Prom 97965 - 98024 3.7
        + Prom 97929 - 97988 3.5
76 50 Tu 1 . + CDS 98091 - 100292 2378 ## COG2199 FOG: GGDEF domain
        - Term 100560 - 100614 4.1
77 51 Tu 1 . - CDS 100753 - 102693 2126 ## COG3284 Transcriptional activator of acetoin/glycerol metabolism
        - Prom 102879 - 102938 9.1
        + Prom 102838 - 102897 8.7
78 52 Op 1 . + CDS 103047 - 103559 792 ## gi|302863611|gb|EFL86542.1| hypothetical protein HMPREF0326_00314
79 52 Op 2 1/0.077 + CDS 103581 - 105089 2221 ## COG3333 Uncharacterized protein conserved in bacteria
        + Term 105122 - 105151 2.1
80 53 Tu 1 . + CDS 105420 - 106397 1608 ## COG3181 Uncharacterized protein conserved in bacteria
        + Term 106440 - 106473 2.7
81 54 Tu 1 .
+ CDS 106600 - 106980 588 ## COG0251 Putative translation initiation inhibitor, yjgF family Predicted protein(s) >gi|316924508|gb|ADCP01000021.1| GENE 1 45 - 749 622 234 aa, chain + ## HITS:1 COG:STM4036 KEGG:ns NR:ns ## COG: STM4036 COG0437 # Protein_GI_number: 16767301 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Salmonella typhimurium LT2 # 4 207 31 238 300 148 37.0 1e-35 MAKAFFVDLTRCTGCRGCQIACKQWKNLPAEETTNRGGHTNPPDLSSVTYKTVHMREVGK GKDFRMLFFPEQCRHCELAPCMTASTIDGAIVQDEKTGAVLFTPLTAQLDAQAVREACPY DIPRVDPQTKQIHKCDFCNDRVHQGMLPACVLSCPTGCMNFGEREDMENLAEQRLAEVKE RFPNAVLGDNGIVHVIYLYAEDPNLYHKFAVFAQNDANPMTRRQLFAKLRRPVA >gi|316924508|gb|ADCP01000021.1| GENE 2 774 - 1253 477 159 aa, chain + ## HITS:1 COG:no KEGG:Ddes_0558 NR:ns ## KEGG: Ddes_0558 # Name: not_defined # Def: cytochrome c class III # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 32 141 24 130 147 85 43.0 6e-16 MRKELLTLAVACLLCVPAFAADDEEPKGVSDIKEVGKVIRNPITIEATGGKQKMDVVFNH SSHRRVRCQTCHHALPSDIDAKYVSCGASEECHSVPRGDDGVAPSLFKAFHAKDSDHSCY GCHMKKRKQYTGFQKGCLPCHEKAQKDPKAPAVAEKTLN >gi|316924508|gb|ADCP01000021.1| GENE 3 1596 - 2180 607 194 aa, chain + ## HITS:1 COG:slr0902_2 KEGG:ns NR:ns ## COG: slr0902_2 COG0746 # Protein_GI_number: 16331654 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin-guanine dinucleotide biosynthesis protein A # Organism: Synechocystis # 1 145 12 156 192 65 28.0 7e-11 MAGGLSSRMGRNKLRLSIHGDGKDMLERTVELLGGFTDDVFVSCRAPEDAAPFKAIPDEV DRQGPFGGVYSALRRLQKPILVLSCDLPFMDGPTLRRLLDARKARLPGTIMTTYQQEETG FIEALVAVYEPACLPWFDAAWEQGIRKFSVVIPEELRTHVPYSRSEALPFFNINYPADFE IARRLAELMRRDGM >gi|316924508|gb|ADCP01000021.1| GENE 4 2701 - 6732 3805 1343 aa, chain + ## HITS:1 COG:all4900 KEGG:ns NR:ns ## COG: all4900 COG0553 # Protein_GI_number: 17232392 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Nostoc sp. 
PCC 7120 # 489 1341 4 866 869 674 45.0 0 MNPLKSLRQRYVELSEPYRLLLQLKAVTGGFSRSLLPECLREAGVQPPRGQVWNAGDIRT ALHTLRQMGLTDDKDRVVAEFEHDLCIEGLQRMAPAVRKVLERGAGIHTAALRLRLAVYE RDLNAYERARLDAQAMEEGGNHPFAGQFADTPTDPGWLAALPPRMRLDVITNNLIPLVEA GSITGNLRNCLDVLPQLRSSLADLPPCPALILFDALAGRYAEARAQIVPLLLDRTEFRGP LFQGIIAFLEGGDAVPALREAQKRFRKTRGKRKANLPGPGGLCFALALIRADQPELREEC AAILHELSPRQPAYKGLAAAKALLGLANGRSEAAMLKRIEDLPSDPLSEAVFRAVDASLD DGSHDDRPTERQFERLKGILPLTACLFAETLARLRPQGPWKAWIEGFPIRHVRLTDFLDR RENWERVFEALVHKLQQPVVKEKAKKRIAWLVDTERFRVEALEQSSKGEDWTAGRAISLK RLHEESETLDWLTEQDRSVLATLRWSRSWNGTTCEADPYRALPALAGHPNLLRASDRVPI SLACKPVELVVRQERNGYSLALSRWMPRGEMQCVEAAPRSYILYYLPGRLEGTADLLGEH GLSVPTEGGPRVLELIRNLDEDVVRPVIRAEEVSASVRLLVRLRPLGTGIEAALHVRPFG MPGTPAFPVGDGPVAPLAEVEGRAVRAERDFEEEIHAARALVRACPTLRERGGIGPWCIE DIEEALDCLLELEQAGPELEWPEGEELRVCPQVSTARLTVDVRHSRDWFQLHGQIAVNES LVLDMAQVLERLAQSKGRFVPLGDGAFLALTKQFRQQLDRLERLAERDGASLRVHPLAAD TVCDLLDGAEVKGDATWESWLGRIRLPGGTPAVPSTLRAELRDYQLDGYVWMSRLARWGA GACLADDMGLGKTVQTIAVLLAQAEMGPSIIIAPTSVCHNWENELDRFAPTLSVHRFGPG DRAALVGALGPGDVLVASYGLLHTEAKCLSGREWQVAVFDEAQALKNADTRRARASRQIP AAFRVALTGTPIENRLEDLWSLFNLINPGLLGTRQSFQKRFAAASAPSTEENAVSEGQSA ARQALRALVRPFILRRTKSEVLTELPPRTEQVIEVDLPDDERAFYEALRRNALASLEAAK QEDAEGSQKFSILTELMKLRRACCAPVLIDPGTSLTGAKLSAFMELVEELVRGGHKALVF SQFVGCLSEARRLLDAAGYGYQYLDGSTPDRERQAAVAAFQSGKGDLFLISLKAGGQGLN LTAADYVIHLDPWWNPAVEDQASDRAYRLGQQRPVTVYRLVARGTVEESILKLHRSKRAL AADVLEGTDVPLSEAELMDLIRR >gi|316924508|gb|ADCP01000021.1| GENE 5 7041 - 7997 787 318 aa, chain - ## HITS:1 COG:CAC2394 KEGG:ns NR:ns ## COG: CAC2394 COG0583 # Protein_GI_number: 15895660 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 14 301 1 282 286 123 28.0 4e-28 MPCGTGQRGSGVGMDLKRLEYFCAIVEKGQISKAAEALHISQPPLSMRLKELEEELDVQL IYREGKAWQVTPEGEALYQRAQFILSYVAGLEKDIRESRNKVSGLVRIGVCPPCLSLAST PIPALNREFPDLRFRVWVMDNQSLERHMQERNLDFALVQLPVQNANYAMLPLARQEFVAV YGQGVPPPEARSIGVEELREAPLMLSRRRDGGGGYDLIMRAFQDKGVRPNVVLDSQDTRL 
LHKLLRQGMSAVAILPEREVAAGTLESFPMRRLDVPGLDLAPVLIWLEHAYLSRTARRVI QCLHKVNADPGEQPDFEI >gi|316924508|gb|ADCP01000021.1| GENE 6 8199 - 9191 1357 330 aa, chain + ## HITS:1 COG:AF0635 KEGG:ns NR:ns ## COG: AF0635 COG2358 # Protein_GI_number: 11498243 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, periplasmic component # Organism: Archaeoglobus fulgidus # 31 327 37 325 330 110 28.0 5e-24 MEWRKSLACALALAGMMAFGPGVRQAALAADTHNLTILGGGSGGLWAIISEGVGETLRRS MPNVRVTTEPGKDGPNQVMTSRGEVQFALATDVLTLKAIKGEAPFKGRKLDNLRLVAVMN PINALQFFVDAKTGAKSIQDIKDKKIPLRITVNRQGTLIDVATEELLKTYGITYSDIAKW GGKVHKIPGPEATDLWDAGQMDAIVEVSQFPTSRFYELGQKHDLIMLPIDPANQEKLNKE LGTSSLTIPAGTYSFQKEDCPTVSSQLLLITSADQPDEVITGVLKAMTANIDYLHSVHAN LRDLSPETMSANTSIPMHPAAEKFFRELKK >gi|316924508|gb|ADCP01000021.1| GENE 7 9365 - 11302 2455 645 aa, chain + ## HITS:1 COG:AF0636 KEGG:ns NR:ns ## COG: AF0636 COG4666 # Protein_GI_number: 11498244 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, fused permease components # Organism: Archaeoglobus fulgidus # 13 626 22 669 679 343 34.0 5e-94 MRLGLFFKEGRRRKLSPLWSRIVGVLVVLFAAVELYGAPFGFIDSFILRSLFVSFAITLT FICYTPIRTEPGQSENVPVPDILLACISLAAGAYMVLNGAELVTRWTGVDPLTPADWLVT AAFVLLTLEVTRRTVGPVLLGIILVFIVYNFVGEYLPGYFGHRGSTMEVFLDRMVYTYDG IFGTPVGVACTYVFMFVLFGQVFNAAGGGNFFFRLAASIAGRMRGGPAKICVIASALYGT LSGSPVSDVVTTGSITIPLMKRLGYGETFSGAVAASACSGASILPPIMGTAAFLMVDVAG IPYIEIAIAATLPAIIYYFGLMMQVHYRAVLKGMRPISEDTDHESAWTVLKENWLYIIPI ALLVTLIIQRVNPTVVGLLATASVVVTSWLIPGHRMGIKELYEVIRKTGLGILTVANASA AAGMVIGGIMLTGLGGKFTSLVFAATGGQSGLCLSMVAIVCIILGMGMPVPAAYVLTATL AAPALLEFKFSLMSAHLFIVYFSAISAITPPVAVAAYAAAGISEGDPNATGVQAVRLAVA AYIVPFVFMYRPALILDGSLPEILWTAVVATAAVYSFASGLEGYLHGRLRFPLRLALMAA GVVFVWPDTWADLTGFAILGAVTAHQYAIRKNNPDPVCLAQPQGE >gi|316924508|gb|ADCP01000021.1| GENE 8 11305 - 12939 1985 544 aa, chain + ## HITS:1 COG:PA2298 KEGG:ns NR:ns ## COG: PA2298 COG1053 # Protein_GI_number: 15597494 # Func_class: C Energy production and 
conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Pseudomonas aeruginosa # 6 525 3 526 574 175 28.0 2e-43 MTASQTIRLETELLILGTGAAGCGAALAARQAGIRTLMVDKGKLESSGCLGGGNDHFIAV LNTDEPQDTIDDLVKAYLKPSSGYSEKQIRDWGEVMPAMVDFLEGEGVKLLHNPDGSYLR TAGRGEPAWYINIADGQMIKRLIARKIRSMGADVLDHVMITKLLKKDGRVVGAAGYNVLD GTFYVIRAAGVILALGNSCNRATANSTGNPYNTWHSPFNTGSQFVLAYEAGADIINMDIK QQATLVPKGFGCAGMNGINSAGAHELNALGERFMGRYHPMMENCPRQFQVGGTYQQQVEG NGPPFYMEMRHIDAESLRHLQYTLMPGDKATFLDYCEQRGVDFATAPMEVELSEIEFSGM LATDDGFRSTVPGLYNGCVFYTFSGSMCSGYLAGRSAASEAGAVPDLDGLEAEIEAERAR IAAPLAPRGDALPQAMFEASIRQVMSYYMGFVRNGKGMEVALERLAFIASQADKLSASNM HELMLAHESLHLLQSSTLSTLCSLERRESGRSIYRRSDYPEKDPALDRVLAVRRGADGPE VFWS >gi|316924508|gb|ADCP01000021.1| GENE 9 13176 - 14864 1872 562 aa, chain + ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 22 420 41 461 468 186 34.0 1e-46 MPPIYSKEWNRITIQVRGDHRQGRTSPCEHACPLGNGIQQMHTLIAAGETGKALARLHAR NPFPGITGRVCPHPCETKCNRAEYDEPVAIHALERFAADTGHETRFIPLPASGKRVAVVG SGPAGLTAARFAALLGHAVTVYESAPVMGGVPRHAVPDFRLPKDVVDRETGAIVASGVQV RTNVTVGRDITLQSLLDTYDATILAVGLWKERRLDIPGKEHLVPAVGWLKRSTLERQSLT GKTVVILGGGGVAFDCAFTARRLNADAVHIVCLEPTDAMRAPAEEVQQALDEGIVIHNSH LSHAVAPEGGRLRFEARPVTSFSFDETGALHTEFAPGDPLCMNADLVICASGLQTDEIPL EGVDMARTPRGFVAVDPVSFRTSVPGLYAAGDIANGPSLIARAIGHGRQAAIAVHKALSG IDPAENIDIWIDETGRVREDHVPALPAPHVVAFKEIMHADYHEHAARQILPPAASDGPEL AFAELSGGLAAEAAVAEAGRCMHCGHCMECGSCVESCPGHILEMTDDGPAVAYPSQCWHC GCCRIACPTGSIAYRFPMTMML >gi|316924508|gb|ADCP01000021.1| GENE 10 15086 - 16552 1824 488 aa, chain + ## HITS:1 COG:MA0246 KEGG:ns NR:ns ## COG: MA0246 COG0043 # Protein_GI_number: 20089144 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases # Organism: 
Methanosarcina acetivorans str.C2A # 70 476 47 417 422 161 32.0 3e-39 MSLTAADITDLRTALKLLDQHGELAQTDVEADPVAELCGAYRHIGAGGTVARPTRIGPAM LFNKVKRHPGARVAIGVLSSRKRAALLLGTEPERLGWHLMDALQHLIQPTAIKGIAPCQE AVHRADEPGFDLRRLVPAPTNTPEDAGPYVTMGLLYGKDPENGDEDVTIHRMCLQGPDTL SVWFSPGRHIDVFRAKAEAAGKPLPVSVSIGLDPAIYLGACFEPPMTPIGFNELAIAGAL RGRAVEQCACLTVDAKALAWAEYVIEGEILPNVRIAEDAQSGNGKAMPEFPGYCGPSDPQ TPVMRVTAVTHRLHPIMQTTIGPSEEHVNLAGIPTEASILGLVERAMPGRVRNVYAASCG GGKFMAVLQFRKATIADEGRQRQAALLAFSAFPELKHVFLVDDDVDPFDMSDVLWAMTTR YQGNFDTVFLPGVRGHVLDPTQSPLYDARIPARGVGCKAIFDCTVPFDQKERFQRAPFME VDPGKFLI >gi|316924508|gb|ADCP01000021.1| GENE 11 16662 - 17228 709 188 aa, chain + ## HITS:1 COG:Z4047 KEGG:ns NR:ns ## COG: Z4047 COG0163 # Protein_GI_number: 15803256 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase # Organism: Escherichia coli O157:H7 EDL933 # 4 187 2 185 197 189 50.0 2e-48 MKKRLVVGLSGASGAILCVSLLEAMRREKDWEVHLVTSEAGERVLREETGMSGTDLAGLA FRTHDIGNIGAAIASGTFETEGMAVVPCSMKTLAGICHGYAENLLLRAADVTLKERRKLV LVARETPLGLVHLRNMTALAEMGVTILPPMLTYYQHPQSIEDMTTHIVGKIMREFQLEAP GFRRWEGS >gi|316924508|gb|ADCP01000021.1| GENE 12 17225 - 17575 460 116 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIGDATLTRLVRGKEIVLNARRQGGDCLLSVTGGDAPHIGAAALCADGEIRRLDRNGHRE GELAAELAERAASKLGCCVCAVCGIHFDGISRAEIASVVAAVRDMADTWLGSASHH >gi|316924508|gb|ADCP01000021.1| GENE 13 17621 - 18412 698 263 aa, chain - ## HITS:1 COG:no KEGG:Dtox_2803 NR:ns ## KEGG: Dtox_2803 # Name: not_defined # Def: 4Fe-4S ferredoxin iron-sulfur binding domain protein # Organism: D.acetoxidans # Pathway: not_defined # 1 262 1 264 265 176 36.0 9e-43 MEIKRVHIAYFSPTGTTKRTLRAVAEGFGCSELSETDWGDPKSRAAILKTGPDELVIAGM PVYYGRIPSLFHEGLPLRGEGTPFVPVAVYGNRHYDDAVLELKTLGEAAGCVTLGAAAFV AQHCLNPDMGRGRPDASDLAVMKRFGSALRAKADGLAGSMPPLSVPGAVPYKEYKPTPFA PELLDAETCIRCGLCARTCPVRIIDSETYAVTEPERCLFCFGCVRICPVAVRGPRPEVAP VFAAQMAGLAARCTERREPELFL >gi|316924508|gb|ADCP01000021.1| GENE 14 18887 - 20260 1527 457 aa, chain + ## HITS:1 COG:no KEGG:ROP_05410 
NR:ns ## KEGG: ROP_05410 # Name: not_defined # Def: putative oxidoreductase # Organism: R.opacus # Pathway: not_defined # 1 446 1 447 464 452 50.0 1e-126 MRDAIVIGGGIAGMSAAWRLRHSDVLVLEGSNRIGGRLRSERRGAYWLNWGGHVFGGEGS FADKLLKEVGVDALNLQGSFAALAYKGKVLNDGNVNFYPFRLPISWSARWEIVKAGIKVM RAVNAYGKVNKPRKGEDYRVRQQRIYDFMADKTFSEFTGPLSEEADAFFRPTVSRSSGDP EEISAGAGVGYFLLVWDKSSGLARNIVGGAATLPQAIAHTLGDKVKLGAEVLEVIQHRDH VEVTYKSDGKEVTEQARYAVITTPAAITHRITKNLDPLVHDALGKIQYGHYVSGAFLTNE TGRRPWDSVYAYCTPQRSFNIAFNMSNLVRTMESERQPGSSFMVFSPAKLARDLINLPDE KVLEIYRKDLEEVFPGFGSLVVEQSVQKFHLGLAYCFPGRNKLQPLLMRTPRRIYLAGDY LGSWYTETAIQTGLLAGEDINSKLYTDTLLPSNGFSY >gi|316924508|gb|ADCP01000021.1| GENE 15 20332 - 21531 949 399 aa, chain - ## HITS:1 COG:BH1994_2 KEGG:ns NR:ns ## COG: BH1994_2 COG2508 # Protein_GI_number: 15614557 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Bacillus halodurans # 213 386 123 309 313 73 32.0 7e-13 MIQNFIQEQVDTLARTLQRSVAIDDVGINLVAVSKHFDDADDARVRAILARTLEEDSCRY LFSFGIVATRDKIVRIPECRELGFKERSCYPIRWQERTLGYLWLVGKVSPAEDKAAVACA GELAVPMFSMQLEGEQAFSKHELIVRNLLSVDTRWASRAAAILSRQGRIRTDRKYRAFSA ATRKGTGGQHSELTFLRKQFREFTAAHGEAAPLFSDFSPELVIVVPADSSLSDPGTLMCI ACSGLNADLRIGIGPEVDVLHLYDSYAQASSVLQVLRAIPNLGPCSTWEGLGVYGNLTLL TRNHEDAQLPLTPNIAALREEDPALFDTLELFLDNAGSIAKTSEELCIHRSTLYYRLKRI ADITGTDLNSGLDRFTLHLELKLFRLTSVLLGDDEPATA >gi|316924508|gb|ADCP01000021.1| GENE 16 22046 - 22426 351 126 aa, chain + ## HITS:1 COG:PAE3003 KEGG:ns NR:ns ## COG: PAE3003 COG0251 # Protein_GI_number: 18313754 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pyrobaculum aerophilum # 1 123 1 123 126 110 45.0 6e-25 MRQTVSATGAPAAVGSYSQAVKAGNLLFLSGQLCANPETGRFERESVAAQTRRIMENIKA IIEEAGGTLDAVVHCRAFLSDMKHFPEFESVYRSYFSAPYPARMTVASAGIYDGLDVEMD AIAYLP >gi|316924508|gb|ADCP01000021.1| GENE 17 22463 - 24025 1831 520 aa, chain + ## 
HITS:1 COG:SA2408 KEGG:ns NR:ns ## COG: SA2408 COG1292 # Protein_GI_number: 15928201 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Staphylococcus aureus N315 # 6 499 7 507 540 259 32.0 1e-68 MSNAALDKKILLPTIAIVLLTSAPLIFYASSLQGMIQSIYDFTAKSFGWGYILLYLLGGF FVFFIAFSPYGKIKLGGHDQKPVFNNFHWGSMIIAAGHGIGMVNWSMVEPLITNNAAPLG HGANAAYSYEVATAYTFFHWGPYYWVLYLIPCIPIFYFMGVRQVKKQRVSECLTPLFGRK LIDGWFGVCLDVFIIMGLAGGIGSTLATAVQLVSGLYADYFGLPDTQALHLGVLGMFTVI TLGSIRKPLSKGMRLLSDMNSILALSLLAIVLIGGATGYFFSLGTNTLGMVLDMFPRVSG WTDPFNASGFPAKWTVFYGAWFLAYGPMMAIFFTGISMGRTLRETVLGVYFFGCLSSFLF SVIFGGFSLYLQYTGQFDVYAFYLANGLPNTVSKVVSLTPLHQIMSPAFLICCTIFLTTT IDAATRVMASMSSKEILSDQEPSVTSKYIWCISLSILVLGVLLVGGLEIIQALVVLAAIP LLGICVIMNISMFKAVREDFPNIWAANLLYIDKGKSGGKA >gi|316924508|gb|ADCP01000021.1| GENE 18 24418 - 25077 465 219 aa, chain - ## HITS:1 COG:ECs4801 KEGG:ns NR:ns ## COG: ECs4801 COG0730 # Protein_GI_number: 15834055 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 1 211 42 251 291 58 23.0 8e-09 MVALPIVACVLEMSEAVPATCLVAVTLAVYVAWMYRAHSRVAPIFPMLLGCIPGLIIGVC VLRVVPGIWLQGGLGAALIVYVAWQLLHRAGTAHPESVRGCAVSGFASGFANAAISFSGP PVAIYALYVGWDKDTTRGTLSLYFLGISICTCIVQAAAGLYTPVVVKAALWGMPGALVGL VVSLPVARFVRESVFQGILLVMIAFAGSVCIIRAVESLL >gi|316924508|gb|ADCP01000021.1| GENE 19 25507 - 26967 1403 486 aa, chain - ## HITS:1 COG:NMA1126 KEGG:ns NR:ns ## COG: NMA1126 COG2233 # Protein_GI_number: 15794072 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Neisseria meningitidis Z2491 # 19 460 13 447 463 396 48.0 1e-110 MSSAPASQPATPNPEKESELIYGLESHPPFGSSLLAALQHILAMVLSVMAPPAIVAGALG VPPEAIAYLVSMSLLFAGIGIWFQVSRPFGIGSGMLSIQATSFAFPGTLIAVGALLMKEQ GMSWEQALSTLFGVCFIGGFVVMAGSRIMPFLRKIITPTVAGITVMMIGVSLVRVGAIDL AGGFAAKADGTFGDISNLFLGTLVILAIVCVNRSKNPLLRMSSLIVGILVGFAVAIVMGK VNWNILSEQHTWFIMPVPFKFGFFGFDLHSFLVLSFLFLVVVIEAIGDLTATSVVSEQPV 
RGRVYRSRLAGGILCDGMMTSLGAIFGCFPVATFSQNNGVIQLSGVASRKVGKFCGVLFI LFALFPIIVVFFQLLPRPVLGGALVVLFGTIACSGIRILLQHEVNRRESIIIAVSIGVGL MSMTTPEVFERLPRFLKIFFDSAIVSGGVTAMIVHQILPKGICPYEGDEDDEDCNPHSEF LFNKHE >gi|316924508|gb|ADCP01000021.1| GENE 20 27112 - 27438 236 108 aa, chain - ## HITS:1 COG:no KEGG:Glov_0942 NR:ns ## KEGG: Glov_0942 # Name: not_defined # Def: hypothetical protein # Organism: G.lovleyi # Pathway: not_defined # 35 108 94 167 172 62 40.0 5e-09 MKYAWTLAICLAVLCLAAPSGAAPQKALRDAHVDLACADCHTGQKNPKQAGAVSCAKCHD APEVAKRTASRGKKNPHISPHWGTEVPCWVCHKEHAEDQNYCLICHTW >gi|316924508|gb|ADCP01000021.1| GENE 21 28141 - 28449 225 102 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1652 NR:ns ## KEGG: Ddes_1652 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 102 1 102 102 96 55.0 2e-19 MNNRFGTTSDMIIQTVQEKEGCACLAIDEKGLYLTSPAYVDRPLADPNRYAGNRKDVAAR LNALSLDADSLLEQNRHRIPKITGESKKKVNPLKASKRSMKG >gi|316924508|gb|ADCP01000021.1| GENE 22 28732 - 28965 220 77 aa, chain - ## HITS:1 COG:no KEGG:LI1082 NR:ns ## KEGG: LI1082 # Name: feoA # Def: Fe2+ transport system protein A # Organism: L.intracellularis # Pathway: not_defined # 5 74 40 109 112 65 45.0 5e-10 MIRRLHHLRNGQSGFVIAIAAAGERAQRLRDMGLLPGVGITVMGRSPMGDLLSLRVMGTT LALRVADAEDVTVSTDD >gi|316924508|gb|ADCP01000021.1| GENE 23 28962 - 29354 117 130 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTINFKINKLRQKRMLLRWLAHTLPLSGIMVHQEAELTFSYKRCARVLTLVAVSVLGLAL LVRALVAFIEALAVLFLAVVVASLALPRGISWYWRQARREIPRLLRNLGDLLDTSASEPK AAADGQARTE >gi|316924508|gb|ADCP01000021.1| GENE 24 29458 - 29715 71 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTGVHAALFWVGGTVPGYAWGEGKGERAAVRSKKFKRGRFRDSVYQYRRMFVCPSIFPLL LLGGVLGRVDFDNTGIGRLYAETCG >gi|316924508|gb|ADCP01000021.1| GENE 25 29774 - 31051 1584 425 aa, chain - ## HITS:1 COG:CC2485 KEGG:ns NR:ns ## COG: CC2485 COG0477 # Protein_GI_number: 16126724 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and 
metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Caulobacter vibrioides # 1 383 1 399 444 103 23.0 5e-22 MSQATQTTPIEAPAEEPAVPVSRPRFMFAFVVLYMLFLLDFSARLGVTSVFPAMQKALGL TDSQIGVAGSAVLCGMTLFVLPLSFIADKTSKKKAITLMSGLWGVGCLLCGLVSNFFLIL LGRFMVGMGNASYAPVSVSMLTSWTKKSRWGSVIGLYNSSMSVGLAAGTAVAGILAQSIG WRAPFIVMGVVTLLFMVLSMALPKTSAKPENAKKDDVSLKEAFGVTLKNKTLVMLGIGVG LGNMVYSSMVAWIPMFLVREMGWTSAEVGAYMSPVYLFTGIVITPLSGIISDRLGRWSRK TRAWLGVPCFVLLAACFGFGFMLKSFPLIALGVMLFLIPVTGIHIATQELVPARYKASAY GTYVTFLQGLGFFGPMLVGALSDAFGLNIAIVYVQFVFVLGALMLLCAGFMYQGDFDRAR ALDKR >gi|316924508|gb|ADCP01000021.1| GENE 26 31135 - 32748 1829 537 aa, chain - ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 4 533 15 548 553 329 38.0 9e-90 MQYASVLLFNAVVLTLDGNDAVASALALDGDRILAVGDRDGLDPLIGPDTECRDMGGAAV LPGFYDAHGHILMTAQGRGRVNLNSPPLGTCRTLGDILAAIRARAAETPEGEWVLACGLD DTLLTEKRFPSRWELDEAAPHNPVFAQHISGHLCALNSAALKLAGIDRHTPDPAGGIIRR DADGEPDGVLEESPVYETIMPLLPTQTREQRIDDLAATTRDYAARGITTAVDAALFSYDD AELLRTVQEQGRLAVRVHVNPFTSLDPDDPRLAFDGKDVTIGGVKLLADGSLQGYTGYLT KPYHTPYQGDPEWRGYPTHSRENLFALIEAAHGRGQFLIHTNGDAATDDALDALEAAQAK HPRKDCRHILIHAQTIREEQLDRLGAAGYTPSFFTAHVYYWGDRHRDLFLGPERAARMDP MRSALDRGLVITAHCDSPIVPADPLLSIWASVNRLTSSGQVLGPDQRISVLEALRAHTFN PAWQNFQENDKGSIEPGKLADLVVLDANPLEVDPAALRNIGILETIVGGKTVYSARA >gi|316924508|gb|ADCP01000021.1| GENE 27 32977 - 36090 3276 1037 aa, chain + ## HITS:1 COG:ECs3353 KEGG:ns NR:ns ## COG: ECs3353 COG3604 # Protein_GI_number: 15832607 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli O157:H7 # 627 1032 243 657 663 260 39.0 1e-68 MSTLASSVPHASLLLALQLSARPLHERELAAITGLPEIRVRSALRELSLQKHVAAAPAEG 
SPDGGSGAPAWAITRIEPDLLEDMAETLCSACPDTGVPYHMGVLGGDATLEDCEIVVRFI DDRLKNNEPAVFACFEMVVQFLLRWGRAHMDDAGQKSWRYAELVLVVQSMCIFSHHYLQL ATELSPLAYELSARNGSERFMPMIWIFRHYLQLFSGGEPPLTDRFFHGSEKMRHFKEQDM QDRIPVFEGIENFLKGHFRETLQCYARRPAQDHWWYKRFFEPLSFCASQAAMYLREYPLA TGIIESARQTAELAGERLVAVLWIAHLCFLLLRKGALDEAIVKIDYLLNCVPPEQNHKVA SSAVRALAVYHHLSGRTDAAYRVLRNETLRAAARGVPHSPFLDPLVLDLLYVFEQRGYPP IPRYEVEATIETFLRQPNRQLRGTALRVHALRLRGRGAPAEEVVSLLHDSLRELEPTGDP RELVLTHHELANMLEAMGQHREAHQQRKSVAEIIGHPLDENASYRTAAIAATGEAFPALV PAPEPEAEDTPGRILLDRCHAAFNREPTSYRSDELFQHLLDIAQQELQSERAALFRPTDD GRPEFVASVNLTRMELESDGMRSCMEWLESILRNAASTHSEHQRLCLALDVGEPYPWLLY MDSAFTSGMFQRLSPALLRDLARLFAAEVRSGLRLELVRTEEASQQQDRLLAITRQKDGD MIPVVGEGLRASLEQALRVGITDAPVLILGETGVGKEIMARNIHQISLRSGPFVVVHPAS MPESLFESEFFGYERGAFTGAIRQKIGLFELADQGTLFIDEVGEIPPLIQTKLLRVLQDQ RFMRLGGTREILSRFRLVAATNRDLWKEVREGRFREDLLYRISVVPLTIPPLRERKQDIP GLVQAFIEHFARRHDKHPLPLSPEQQRRLCEYDWPGNVRELKNEIERAVILNNAGQLDFV LGASPAAHQPADRPSPFLETVADVPTLSELEERYLRHIMERTEGRVRGPSGAETLLRMKR STLYAKLKKYGIPCGSA >gi|316924508|gb|ADCP01000021.1| GENE 28 36310 - 38655 1424 781 aa, chain - ## HITS:1 COG:RC0249 KEGG:ns NR:ns ## COG: RC0249 COG4953 # Protein_GI_number: 15892172 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein PbpC # Organism: Rickettsia conorii # 28 778 23 795 796 532 37.0 1e-150 MRKTIAFALVLTLLLPIAAFWGWLVLSPKPPLLDGVPFSPLVLDRDGNLLRLGLSEDEKY RVRIRLNEVAPEMVRAVLLYEDRHFYRHPGVNPLSLLRASAGMLGGARRMGGSTITMQVA RLRLGLSTTTLSGKFAQMARALQYEYHYGKGEILEAYFNLAPYGGNIEGVGAAARIYFRT TSGRLTRTESLALAVVPQNPVRRSPLNGPDFEAARARMQMLADAEDAPGGGQGATASFGA SGFALGRALPPLRVYGPARLPFGAPHVSSETLPLSRGGEPVRTCIDSGLQRLMERAVAGF AARGRRYGLNNAAALLIHWPSMEIRGLVGSAGFSDAAISGQVDGTRARRSPGSTFKPFIY GMALDQGLIHPQTLLPDSPRSFGGYDPENFDRVFRGPLPAHEALRASRNLPAILLASQLR PGFYGFLLRAGVDLPFSADHYGLSLVLGGAEVSMRELGGLYAMLANKGVWRHPRLYEGEA AGSAVPLLSPEAAVVTLRMLEDDAHFVRSKEGPVPLRFKTGTSNGFRDAWTAGVVGPYVL VVWVGNFDNTPNPLLVGGDVAAPLFTDIAQALASDAGPLDDLVPVQQEGLNITRETVCTA 
TGDLDVSLCGDTTETWYIPGRSPTRSSGVFRTILIDAETGLRACRPVPGRTEERVWEFWP SDLQALFLRAGVVKPPPPPFSPECGPEAGVQPGRPPRIITPKDGVVYQRRLSDPERSGIP LVASTDADASAVFWFVKDRFIGRGVPGEPLVWHPGDGAQMLRVVDNLGRAAMQKIVVRTL P >gi|316924508|gb|ADCP01000021.1| GENE 29 38961 - 44549 4912 1862 aa, chain - ## HITS:1 COG:RC0835 KEGG:ns NR:ns ## COG: RC0835 COG2373 # Protein_GI_number: 15892758 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Rickettsia conorii # 11 1862 12 1892 1892 814 28.0 0 MRHRSILFLLVLLCCVAVPALAASDQAPAFVRTAVTAPLEGREPLRVIFAPDEAACRKAY GDDWGVRCAAAPGQDGAVVSGVRLSPDIPGEWRWSGGDTMEFRPKNPWPESTAYTLSLAK LPLPSRMKLASPSLSFSTPPLAVLKMDGRLWIDPDLNGERAVSFDVRFTTQPDRQTVQRD AALKVSDKSLMLAKPEFVWGENGTCLIKSRILSLAKTPATVTLSLPGVAAEVRRDGTHWK IPKGKMEARQQVTAPGTSSLFRIKSASLETSRDASLAGEYRLTVETSLLVRPDAFAKAVT ALQLPRALEEGAVEPTPWTKAPVVDETVLNRAKPVKIEPLQPADQPAGTLRFRVPVPSDS YLFLNLPQGFGPSPAFSLAGPWREVFHAAPFQPELDFLQPGNVLALGGERKLDLHASGLT AIRWRASRVLKPYLGFLATQPQPFTNTDIPFDALSDVQEGVIALKRTDPGVPQFTVLDVA PLFKDGRGLMRIELVGMDGDKEVASASRFLLVTDLGMIVKKNADGSRDVFVCSLSEGNPI FGAAVHILGTNGLPVAEAVTDAGGRAALPSVSGLNREKRPVAVTVSVKRGADEDAAWLPL DDYSRVVDYSRFPTQGQTSNADGINAYVFSERGIFRPGETLRFGMLMRRGDWKALPPDMP FFAELSDPSERTILRRQFVVGADGLAELSWTVPEGAPTGRYRLDVQTPNADGFAVVLGSG AVRVEEFQPDTMSLSATVNPAPGKGWLNASQASADAALKNLYGLPAVDRRLRGQLSVRPA SLSFPGYEDFTFHDAMPYRGSALTLDLGEVRTDAQGRAVLPLPLEKLRGGTLHCRLLVEG FEPGGGRSVTTVRDFLVSPLQAVLGYRPTGAGGNLGFIPKGSESTLEFVALGPDLGRADP GELTFSVAERRYVTSLVTDKDGRYRYDETPVDREIASSKRSFDASGNLAWAVPTDKPGEF LLSVRNASGQIMALVPFTVAGNDDLRLAGRDELPSGNLRLHIDKADYAPGESIRLFLSSP YDGMGLITLERDSVAASRWFRVRAGNSVQEIAVPKDFEGRAYVNVSLARSLSSPDVFMQP HSYAVAPLTVNVARRDMGLRLKAPEQVLPGSALSVSLSARHPGKALIFAVDEGVLQLTAF ATPDPLRYLLNDRALEVETRQMFDLLMPDHGQLRIPAFGGDMALSGGRFHNPFKRKAEPP LSWWSGIVEVGAETSVTIPVPGYYNGRVRIMAVAASPDTAGRAETDATVRGPVVLTPQLP VLASPGDEFEAALAVANNTGQPASFALALSPDEALRVVSPPPVEIAVGAGSERVIPFRVK VGDVPGNAELRFAVTDAAGNRTVRSATLSVRPASPLRESLSVGSASASTVLKTGRELYPY EAKGSASVSALPLPALRGLIRYLDAYPYTCAEQRISRAMPYALLMNRPELLADAGRAPDA 
ARKLARERMDEAVQGIQSALNWRGVSLWPGGEPDVLVTAYAADFLLTMRESGAALPGGLL AEVFGALENALDRVPDSLEEGRAQAYGLWVLTREGRITTQAMEQLVSQLEDRFPEWRKDV ASTLLAGSCAIMRMNDDARHLIALYQAPGPDFQASGYLDALGVQALRASVLARHFPDMLD TQRTELAQELLDVLNGGRYVTLSASLGVRALLDMGGAANMPSGISLRCTSMQPGFEPVES APADLGGMLTLSAPGCAAYALDMPAGSQPLYWEVSAQGFDRKPPSEALAQGLEVSREYLD AEGRPVTTVKQGDVVTVSISARSHGGSVSDAVIVDLLPGGFEMVLNSDIKGQNPSGEPGY KSVDRREDRMIVFSDLAVEPSVFTYKIRAVNKGVYTLPPVQAEAMYNRSLQAHSAGGVMV VE >gi|316924508|gb|ADCP01000021.1| GENE 30 44756 - 45535 767 259 aa, chain - ## HITS:1 COG:yagI KEGG:ns NR:ns ## COG: yagI COG1414 # Protein_GI_number: 16128257 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 9 252 4 250 252 117 31.0 2e-26 MKPSNPSRVLTLERGLQILEYIVERRNVTVTNVATAFGIQKSASHRFLNTLKYMGYVEQT PQSDYALTGRLRYLAEGVVPRIEVRNFARPYLEELSKAAEQPSNLGHWDGREITYLAQRL FNSALEGYAAGNRIPAYCSAMGKAVLAFSPREEVEAYVARTQFTRFTEKTICDRAMFLKE LEAIRERGYAVMNEEMAAHLVGVAAPVFNKEGYPRYAVSVAGLCFRPVEAFVAEVCDDVV QTAREISEFLAHARTMEDK >gi|316924508|gb|ADCP01000021.1| GENE 31 46154 - 46516 355 120 aa, chain + ## HITS:1 COG:PA5314 KEGG:ns NR:ns ## COG: PA5314 COG3450 # Protein_GI_number: 15600507 # Func_class: R General function prediction only # Function: Predicted enzyme of the cupin superfamily # Organism: Pseudomonas aeruginosa # 7 113 9 114 120 66 31.0 1e-11 MDTGLIFLGTEQAVEHVPGRPEHVLDGDPLTNVWRNAVHPSGKMHAGVWDCRAGAFELPC HASTELCVITEGSAEIEDAQGRKTLVKPGDAFIIPQGSRTVWRIEEYVKKYYLCVLDLED >gi|316924508|gb|ADCP01000021.1| GENE 32 46558 - 47313 192 251 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 1 213 1 215 309 78 28 1e-13 MTTQDARPIKIEVKNLTKRFGDLLVLDDMNFTIRKGEFLAIVGPTGCGKTTFLNCLSKLM PTSEGDIYIDGEPANPQKHNISFVFQEPTCFPWRTVRENVAYGMEIKRWDKKLIEGKLDA ILPLVGLADTAELYPNQVSASMVQRIAVARAFAVEPDLLMMDEPYGQLDIKLRFYLEDEL VKLWKALGSTVLFVTHNIEEAVYVAERILVLTNKPTNIKKEIVVDLPRPRDTVSDDFIRI RKQVTELIRWW >gi|316924508|gb|ADCP01000021.1| GENE 33 47366 - 48472 1665 368 aa, chain + ## 
HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 48 253 47 244 332 71 29.0 3e-12 MKALRTLTILFALALMCLPAPSAAAAEKPFKLDICAMPEHESFIFWYAKEQGWDKDEGLD FQIHFFQTGMDQLEALPAKQWVVGGTGTVPILVGALRYNTYLIGIANDESWVNAVTVRPD NPALKVKGWNPEYPNLLGSPETVKGKTFLYTQSSSQHFGMSEYLKALGLTEKDVVMQNMD PAQVLAAFDAGIGDFAGIWPMWLYIGMERGNKLIYQPGDVNAVLTLTLVGDKAFCDAHPD IVVKFLRVWLRGVNMIKKEIKNPKLAEQYHRFWTEWAGQDIEPQMAKMDLEYHPVFDIDQ QLAIFDDSKGPSQVEQWQDKVLDFFASQGRFSPEEVKKIRGSGYINGTFLKMAAESIKEK PLGGYKLD >gi|316924508|gb|ADCP01000021.1| GENE 34 48652 - 49449 1055 265 aa, chain + ## HITS:1 COG:mlr4519 KEGG:ns NR:ns ## COG: mlr4519 COG0600 # Protein_GI_number: 13473801 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Mesorhizobium loti # 17 260 38 280 286 139 34.0 4e-33 MSQYQQVKTHRQRALILLPLLSLSLFFLSWELIVDMGLVPQTLLASPSAVVELFFAKLAN ASPDGMLLQEHAWISIKEAFLGYFLALAVGIPLGLAMGWFRLAEGLARPIFEFIRPIPPI AWIPLTIYWFGIGLTGKVFIIWIGGIVPCVINSYVGVRMTNPVFLQMGRLYGATNWQLFW TVCIPSALPMVFGALQIALAWCWMNLVGAELLAANGGLGFLIQCGRKLGRPDLVVLGMVT VAMTGVFIGILIHYVEKKLLAGMRR >gi|316924508|gb|ADCP01000021.1| GENE 35 49453 - 50289 1235 278 aa, chain + ## HITS:1 COG:AGpT116 KEGG:ns NR:ns ## COG: AGpT116 COG0600 # Protein_GI_number: 16119871 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 59 274 96 307 313 153 37.0 3e-37 MSQSIANEIQIESRERRPLRLDDILMNRWFLHTVSVSLFLGLWYWTVYMGVFGRGLCYPH EVLQEIGNLMTRRLAGKGLWDHAWDSTRRVFIGFGIACALGIPLGFFMGINKYFNAFFNP IFNLIKPMPPISWVSLSILWLGIGEEPKIFIIAIGAFVPIVLNAYNGIRLIDEELYDVVR MAGGGRMKEIVEVSFPAAVPSIFAGLQISLSGAWSCVLAAELVSARSGLGFVIVQGMNSG 
KASMVIAGMLLIALIASLCSYGIDSLEKKICPWRREIM >gi|316924508|gb|ADCP01000021.1| GENE 36 50317 - 51078 197 253 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 25 213 20 216 223 80 28 3e-14 MREAKIQCEGVGKTFIQKGNTQVHVLRDVSIDVGRNEFVVILGPGQCGKTTLLRMIAGFE KPSVGRVLMDGREIEGPGHDKGFVFQSYMLFAWRTVRGNVEVGPRVRNLSREERRRLSQQ YIDMVGLNGFEEHYPHQLSGGMKQRVGIARAYCNCPEIMLLDEPFGQLDAQTRLFMEKET ERIWMQEKRTVVFVTNNIDEALTLGDRIYTMKNKLPGTLCNTYKVDLARPREPTDRAFLR LRQQIIDESELVL >gi|316924508|gb|ADCP01000021.1| GENE 37 51428 - 52210 998 260 aa, chain + ## HITS:1 COG:BS_yxjF KEGG:ns NR:ns ## COG: BS_yxjF COG1028 # Protein_GI_number: 16080948 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Bacillus subtilis # 6 198 5 193 257 134 41.0 2e-31 MSNIPVALVTGGSRGIGRAAAEELAREGYKVALIARTPERLEAAARELVETLGLDAEHAP VTFALDVGDGQAVGDAVDRLAAEWGRIDVLFNSAGVSIPGTLELGADTLDLLYRTNLRAP FLFMKHVIPIMRRQGSGYIFNLASRNGKIGVAGLGGYTASKFGLVGLGEVLYRELAAQGI KVTTLCPGWVNTDMASGEGCVHPPEKMIQPEQIAATMRWLLSLGPSVSIMEVLLECTADV ERRATGELHALYALRDGTPS >gi|316924508|gb|ADCP01000021.1| GENE 38 52235 - 54634 2865 799 aa, chain + ## HITS:1 COG:pflD KEGG:ns NR:ns ## COG: pflD COG1882 # Protein_GI_number: 16131789 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 13 799 4 765 765 485 37.0 1e-136 MIDTTQGMAQPERARFLKERFLSRRPSICLEGALAKTRAFRETEGEALIVRRAKGFRRHC ETKTIVIDPRELIVGHPGCRSRSAVICPELSNTWLCREFDRMPVRDQDPYDVTEEQKRLY REEIYPYWVGKTLRERWNGQAPKDLLELIAVGGVIDNDIKIECAPGETTPEFPDHLLPKG FKGIQDEAQALLDAVDLTVPANYEKRDFWQATVIVNEGLRILCRRHAEAAEALAARESDP IRKAELLNIASDCAFLADNPPATFRQALQQIHFLFMGLYMESNAGGYSPGRMDQYLYPYY LRDRAEGILDDATALELIECFWIKSNDAVWYWDEAGIKHYAGYCSFQNVCLGGLDRETGQ 
DGVNELTYLMLKATIDLQMVQPSVSVRLSKKNPEDYSLKIAELVRTGSGFPAIYNEEVGL KMLMKKGVPLEKAWDWSCIGCVEPLMPGKTSQWSSAGHYNLAAAVEFALTNGVHRKSGKR IGLETGNPEAFATYEVFKAAVYAQLDRLIRQFSISQNLIERLQQQFFPNPLISMSILDCV ENGRDLMHGGARYNVGPGMNGNGVADFADSLVAVKKLVYDEKRLPMHTLLEALAADFQGY EAVERMLAEDAPKWGNDDPDVDAVVMELCDFIIKIHAELKGILGNPKMPALYPVSSNVPQ GLSIGALPSGRRAGKPLADGCSPSQGCDRFGPTAILRSLDKMPHACLDGGTLLNLWLTPA VVQGEEGGHRLSAFLRAFLDMDIFHIQFNVVGQDILRCAQERPEEYRSLLVRVAGYSAYF VELSREVQDDILSRTVNTL >gi|316924508|gb|ADCP01000021.1| GENE 39 54653 - 55558 1056 301 aa, chain + ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 1 276 4 275 302 221 40.0 2e-57 MKGVLENIQHYCIHDGPGIRTVVFLKGCPLRCRWCSNPTTQEHKPELLYHADKCFSCGHC AEICPQRAVTREGEAVIFDRALCDGCGLCAKECPGKALQIAGVERDTADVLEDIKKDMAF YRNSGGGVTLSGGEVLAQAAFALDLMAACKRYGIHLALETSGFGKREDLLALADAADCLF YDVKHTDDDVHRELTGQGSGLILDNLDAVLDRAAGKLTVRLPLIPGLNDDEGHLENFART IKALGRVQAVELLPYHRLGKNKYALLGRAYALESLGPMSKEDLARRAAFVAERLGGTKVL F >gi|316924508|gb|ADCP01000021.1| GENE 40 55727 - 58099 2806 790 aa, chain + ## HITS:1 COG:pflD KEGG:ns NR:ns ## COG: pflD COG1882 # Protein_GI_number: 16131789 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 2 790 1 765 765 478 35.0 1e-134 MLTSRIDQLKKQFFTHVPAVCLEGALSKTHVFKETEGEPMIVRRAKAFKRHCETKTITIQ PGELVVGNAGRTARTIHICPELSNNWVYEELDTMATRPQDPYAITEEQKMLFRNEIYPYW KGKTLRDYWNAQAPKPILDIISVGGVIDNDIKIECTPGDIVPEFKENIFAKGFGGIRAQA QALLETVDLNDVENYDRRDFWEAITITCDGFSILCRRHAQAARELAATEKDGTRKAELLG MAEACDAIAENPPKTFREAMQLHYFIFVGLFIEGNAGGYSPGRMDQNLYPYYKRDVEAGI LTDAEALELIECMWIKFGEQIWYWNEPAATHYAGFCAFQNVCIGGVDMDGLDAVNPLSYL MLQATIDTQMVQPSLSVRLSRKNPEDFFLKIAELIQTGSGFPAIYSDEIGMKQLMKKGIP PELARDWVGLGCVEANMPGKMSQWSSAGHYNIAAAVEFALSNGVHLKSGKKLGLETGDPA SFTTFEQFRDAVHAQLDHLLRTFSSMQNLLELLHQRYLPNPVASMVLLDCVEKGKDLMRG 
GARYNTGPGMNGNGVADYADSMVAVKKLVFDEKKVDMATLADAVKHDFKGYEPLLRLIDE EAPKWGNDDPEADAMVIDLTSFIIKKIAAFRGLLGNQKLPALYPVSSNVPQGMAVGALPS GRRAFRPIAEGCSPCQGADRTGPTAVLRSLGKLPHTCIDGGTLLNLKFAPASVEGEAGRI RLSAFLKSFLDLDVFHVQFNVVGHEVLRCAQAHPEEYKSLLVRVAGYSAYFVELSREVQD DILSRTAHAL >gi|316924508|gb|ADCP01000021.1| GENE 41 58465 - 59964 1604 499 aa, chain - ## HITS:1 COG:FN0050_2 KEGG:ns NR:ns ## COG: FN0050_2 COG1053 # Protein_GI_number: 19703402 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Fusobacterium nucleatum # 30 493 38 475 484 152 27.0 2e-36 MKTRRAFIQTGAVALCGLAAAPALAKPRKPKDADWTETFDVIVIGSGLAAFVAGLSALES GAKRVALLEKMGMIGGSTAISNGTISVPGSPLQKAQGIEDSPEAMLKDLLKAGKGFCHPE LTKTLVGNGVEAFDFIVKHGAKFKNTVMMPGGMTAKRLLQPDYNGATGLLAPLRESYLKM GGQLILCCKVDRLLLDRDGRVEGVAARTDYHFNTRLKSDDLENKGGTPVRYRAKRGVICC TGGFEADRAFRSQELPQFDGAFTTEHPGATASGYRMLAAAGARMINGTLYRPAFPCTDEQ PLGMMIDPATGKRFVSESADRVAIFHAASKVLASNGRRMPLSILDGDAYELVENKHRMEK NAGAGYVTKHDSLEDLAEHYKIPLDALKATIEEYNAMAEKGEDTAFKADFAKTKNNKVDK PPYYAVMVKPNLTYSLAGALITPKAEVVSVLDDAPIPGLYAAGEATSGVHGAMRLGGCAI LDCCVFGMLAGRNVMAKKV >gi|316924508|gb|ADCP01000021.1| GENE 42 60310 - 61116 814 268 aa, chain + ## HITS:1 COG:no KEGG:LI0105 NR:ns ## KEGG: LI0105 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 18 254 65 307 314 222 46.0 1e-56 MRKLIFCLCCLLWAIPACAGDLELDLECLQEAYPGFITGTETDDAGHVWFLTKNGGRLLY DDGKMKSHAELLENADIEDAMRQPYPLEPERPDFAPDEEPGRIRCYPLLKALYGADQRSV ERGIVRTLFGGKTKVRLAAPAAEAFQRIDAAWRLRPADPELNSYFSPIYGYFWRAIAKTN RLSPHSFGIAVDLNPDKGPYWQWSKLRPHPLQKTFPSAIVSLFEDNGFIWGGKWEHFDLM HFEYRPELIIKAKKLRAQANGEKPEDAS >gi|316924508|gb|ADCP01000021.1| GENE 43 61232 - 61723 414 163 aa, chain + ## HITS:1 COG:sll1188 KEGG:ns NR:ns ## COG: sll1188 COG3542 # Protein_GI_number: 16329452 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Synechocystis # 7 162 9 161 164 146 46.0 2e-35 
MDPQKLVEHFSMSPHPEGGFFAETYRSQGAIPADALPGFGGTRNFSTGILFLLRRGEYSH LHRLKQDEMWHFYLGAPLRLAIVRPDGTAEEILLGQDVLNGQYLQYTVPGGCWFGATPAE GSDFALVGCTVAPGFDFADFEMADPDVLGQTFPHAAGLVREFC >gi|316924508|gb|ADCP01000021.1| GENE 44 61741 - 62187 336 148 aa, chain + ## HITS:1 COG:FN1602 KEGG:ns NR:ns ## COG: FN1602 COG1683 # Protein_GI_number: 19704923 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 8 146 6 146 156 125 44.0 2e-29 MEPNMSKQYVVSACLAGESCRYDGGCSPCPAVQALVRTGQALPVCPEVLGGLPTPRVPSE IRGGRVVAKDGTDVTGAFTCGAEEALRLALENGCTAAILKARSPSCGSGEIYDGSFTGTR VPGEGVFARMAREAGLEIWSEETFTEGR >gi|316924508|gb|ADCP01000021.1| GENE 45 62490 - 63026 773 178 aa, chain - ## HITS:1 COG:mlr5526 KEGG:ns NR:ns ## COG: mlr5526 COG2193 # Protein_GI_number: 13474607 # Func_class: P Inorganic ion transport and metabolism # Function: Bacterioferritin (cytochrome b1) # Organism: Mesorhizobium loti # 11 150 7 146 161 79 32.0 3e-15 MSDRETRKANVVEALNKARSMELHAIHQYMNQHYNLDDMDYGELAANMKLIAIDEMRHCE MFAERIKELGGEPTSEVAGKVVKGQEVKAIYSSDSDSEDDTIFVYNELRKVCVDNGDILT AKLFETIIGEEQIHLNYFDNISDHIETLGDTFLSKIAGTPSSTGGTTKGFAIAPAAAN >gi|316924508|gb|ADCP01000021.1| GENE 46 63296 - 64093 794 265 aa, chain + ## HITS:1 COG:SP1408 KEGG:ns NR:ns ## COG: SP1408 COG3884 # Protein_GI_number: 15901262 # Func_class: I Lipid transport and metabolism # Function: Acyl-ACP thioesterase # Organism: Streptococcus pneumoniae TIGR4 # 7 214 5 212 245 79 26.0 7e-15 MLDYPEYTLGIQARYGEAGTDGKIRLGALANWFQEAAGHNASALGFGDERLFAEGKAWIL TRMAFRIKDLPSPGDKVNIRTWPAKLEHLGHRGYEVYNASDELIVAAVSAWTVLDLSTRR LTALPEELAAVYPVNTIPCIPFPSRTIPRLREGIGSADVLVRRDDLDINGHVNNSKYLGW LLECLPWSPGESLIPSLLDVTFRAECFPGDALTSQCVPLPDETDATAPDGFPQAPHGLLH VIRRTDSGDDVCRAVTRWTQKVRLP >gi|316924508|gb|ADCP01000021.1| GENE 47 64952 - 65662 893 236 aa, chain + ## HITS:1 COG:HI0054 KEGG:ns NR:ns ## COG: HI0054 COG2186 # Protein_GI_number: 16272028 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Haemophilus influenzae # 13 
225 24 237 266 105 31.0 7e-23 MPKTDSSAGRPRRSYQKLTERLRQFMEEGHFRDGDKLPPERALAESFGVSRSSVREAIRA LAEKGLLESRQGDGTYVRVPDIEPLKEAILEAVDSEGLMFDEVTEYRRIVEPGIAEIAAV RHTSEQLDRLKIIACDQQRRLLVGENDGDLDARFHLALAECTGNKLLIDTVARLNRIYAK GRTPELRDTPWRQFSVSSHLRIIDALERRSPEDSRKALEEHIDTVIHKHLFATVRD >gi|316924508|gb|ADCP01000021.1| GENE 48 65664 - 66461 879 265 aa, chain + ## HITS:1 COG:PA0340 KEGG:ns NR:ns ## COG: PA0340 COG0730 # Protein_GI_number: 15595537 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Pseudomonas aeruginosa # 3 261 1 259 267 204 48.0 9e-53 MLISLVVYCLLGAIAGVLAGLLGVGGGIVIVPMLVFAFGWQNFPPDVLMLMALGTSMGSI MFTSISSSLAHSRNKGVQWDAVRNITPGILIGTFCGSFLASHVPARFLQLFFVAFLFFVI TQMLSGKKPKPSRHLPGLGGMSVAGGIIGVVSSLVGIGGGTLSVPFLLWNNLDMRKAIGT SAAIGFPIALAGCFGYIVNGWNAANLPPYSFGYIYLPSLFGIVVVSMFTAPLGARLAQTL PVPKLKKCFALLLIVIGIRMLLKAL >gi|316924508|gb|ADCP01000021.1| GENE 49 66529 - 67164 690 211 aa, chain - ## HITS:1 COG:TM1529 KEGG:ns NR:ns ## COG: TM1529 COG1266 # Protein_GI_number: 15644277 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Thermotoga maritima # 22 210 23 207 208 67 25.0 3e-11 MTPRIPTAGQVCGTLALAAFFWWVNFGLHWVNFWIGMGTAASVLALLSFWFAGVPMKRTE WTGRAVLIGLVSAALLYAIFALGNTLSGWLFRFAPHQVSAIYDIRHEGSPLAMALVLLFI TSPAEEVFWRGFIQRWFTHRFGGKAGWLLAVCVYAGVHVFSGNLMLVMAALTAGLFWGWL YWKTDSLVPCILSHAFWTVAVFILWPLTPGV >gi|316924508|gb|ADCP01000021.1| GENE 50 67161 - 68081 1039 306 aa, chain - ## HITS:1 COG:TM1528 KEGG:ns NR:ns ## COG: TM1528 COG1575 # Protein_GI_number: 15644276 # Func_class: H Coenzyme transport and metabolism # Function: 1,4-dihydroxy-2-naphthoate octaprenyltransferase # Organism: Thermotoga maritima # 3 289 2 288 289 145 34.0 1e-34 MKKWLLATRPWSFTASVIPLTLGAALAWASDAAHAGLFLLTLLGGVAVQTGTNMLNTYGD YRSGVDTEASAHGESPILLGLISPEAMRRGGLIALCVAFAVGIVLGFACGWPILAFGLVG IAGGYGYTSGFWPYKYHACGPIMVFLLMGPLMALPAYYIQGGSLDWRPFLASLPIACLVT SIMHANDIRDIAHDREAGITTLAMLLGRRKALYLYAALCVGAYGVLLLLAAFGVLPLSGL 
LPFVLAPGLWRTLRTLGTRPLPESELVSLDGVSARHHFLFGLLLIAGILLPRGLELAQGL IRGIFA >gi|316924508|gb|ADCP01000021.1| GENE 51 68233 - 68430 62 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHPFIEAIWRFDTITLYWQLEQMAKRDCVEIRERCKRQLSQEGRDAQKEGRRGFQQSFLH GKKMW >gi|316924508|gb|ADCP01000021.1| GENE 52 68842 - 69201 105 119 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNKRYGIPLHSIVLLALLLAAGGFARVFPVHPTNGSARFNAPEYIQPLLIQEVCPPGSDE DVTSFKSKAPPPDPGTHLAASTSPPVPPLTQGRLCSLIPECPRPSWRGLLPFPLPPPSL >gi|316924508|gb|ADCP01000021.1| GENE 53 69463 - 69711 185 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQSRIPTYIFSLFIPLALVIGCFGFYNRVEPRILGFPFIYAWIFACFFLVSASMCIGWLL DPQSDRNKKRASSSRAPKENAS >gi|316924508|gb|ADCP01000021.1| GENE 54 69708 - 71183 1859 491 aa, chain + ## HITS:1 COG:BS_yhjB KEGG:ns NR:ns ## COG: BS_yhjB COG0591 # Protein_GI_number: 16078109 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Bacillus subtilis # 4 464 6 477 489 177 28.0 6e-44 MTALVFIALMAFSVVLAFMAKRGVIADSMDDVMVAGRSFGAFMVFFVTVGETYGIGTMIG VPGAIYSKGISYSLWFLGYILLGFVVGYFMNPAIWRMGKISGAMTMPDCFRWRYGSKALE VLVAVICIVFLLPWMQMQFAGLATILRYIGWDISYTVGIGISAAIAYLYIAVAGIRAPAW ISIMKDILMMAAIVSGGLVAIHNMPGGISGIFDMAIAHFPDKVVIDAEPITKNATFVFST IIFQALGFCVTPLQYQFIFTAKSEDTIRRNQIIMPLYMFMYPFLIVAAYFVLVTVPNLND PDSSFMALAAANLPEWTVGMVAAGGALTCILVIAVSALNLGGIFSRNIWGVFRSSVSQKQ AVLVTNMATGASLLISVVLAVMLPNLMLGVINIAYFWATQCFPMALATFFWRGATRTGVF AGLLTGVVAVVVLTETHVTFWGLNQGFVAMTLNALVMVAVCLCTNPDEATNETSNALFAY AAQEEPEAADL >gi|316924508|gb|ADCP01000021.1| GENE 55 71590 - 72558 461 322 aa, chain + ## HITS:1 COG:all1774 KEGG:ns NR:ns ## COG: all1774 COG1570 # Protein_GI_number: 17229266 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Nostoc sp. 
PCC 7120 # 37 289 34 291 412 62 24.0 1e-09 MPMPAMPSLSVARLSSVFEDMAGAALKNAVDDSSCVFRTQGIIGDWTAKRTFWNGVHYSI ELADGGKSICVDLPLEVIQNGGIRPGDRVDVLGIPVVYLKKSVVLFKLHVHAASVIEASA GGRGPEPVKNVEGTISHLKELGFKRIPFPKRATAISVIHSKSHAANVFADFKNELDLKSV NVESLPTAMTDPSAIARAIDQASGNVVVLIRGGGDDAEFTTFQHDDVVKALARKAAHRIT GLGHYGNLTYADIIADFCTTTPTSAGAYVREQLIRTYNMRQTEREALEEQAALIKALRIS KLKWMLIALAGIALAGYLGFFR >gi|316924508|gb|ADCP01000021.1| GENE 56 73091 - 73816 899 241 aa, chain + ## HITS:1 COG:VC2692 KEGG:ns NR:ns ## COG: VC2692 COG0745 # Protein_GI_number: 15642687 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Vibrio cholerae # 12 240 5 228 228 179 41.0 6e-45 MSTIKHSQISALLVDDDAKLCGLLRDYLEGFGFSLDVAEDASSGLEKAVANEYDVILLDM MLPDMDGLDVLRRLREIGSTPIVIFSAQDEEASRIVGLEMGADDYVPKAFSPRELLARVR AVLRRAQPVQAERKEPEEDGVVRVRGLWMDRKSMKARMDGVPMDLTALEFRILFGMAAHP GRVFTRENLLELAVGREFSRYDRSVDMHISTLRRKLGDDPHNPAYLKTIRGMGYTLLPEE A >gi|316924508|gb|ADCP01000021.1| GENE 57 74057 - 75541 1786 494 aa, chain + ## HITS:1 COG:PA3206 KEGG:ns NR:ns ## COG: PA3206 COG0642 # Protein_GI_number: 15598402 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 254 489 203 437 445 135 34.0 1e-31 MRVGLYQKFFTWAMLNLVLLGVITALFIGGVLLFNNNLYSPMLFNGSIASTFRMTSVELQ YKPESQWASVLEECGRECGLHFGILILDDMDPPLAAGGLPPRVVEAARTIPRPPYSLCAE PDSGAFSTITPAVELEAGILPQQYVIFMRDGDPAQYWYARTILVSNDEGVVRYVLLAASS PSITGNGHFFELRLAVALLLSGLGLSFLWWWPFVRHLTKPIILMTKVTERIAAGDYSSLS DDSPISLKAVGEGRGDQIGCLAKATGIMAEKVKQQVHGQRRFIRHIAHELDSPLARIKLG LAVLDSRVDEKNQQRVQEISDEVEQLSLLVEDVLSFMRSEAIPESPAREAVSLAPLLQYV TRREARDRDVRLSVDEGLKVWADASCLGRALANVVRNAVRYAGEDGPIAIAARPQGDRVC VEIRDSGPGVPAEEMEHITEPFFRGEQAKKYPGGSGLGMSIVKHCVEACGGELHYYNQYP KGFTVSITLKACDK >gi|316924508|gb|ADCP01000021.1| GENE 58 75902 - 78055 2758 717 aa, chain + ## HITS:1 COG:PAB2442 KEGG:ns NR:ns ## COG: PAB2442 COG0243 # 
Protein_GI_number: 14521004 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Pyrococcus abyssi # 45 678 7 637 665 206 24.0 1e-52 MSNTFSRRHFLQYGMAAIAGGTALGGLPVHAAQAPAGTAAPQNDLVKGYCPFCQVRCTYH ARVRNGKILELIGDRGNRWTGGAMCPKGLSIVELLNSPYRLVQPMLKQGSEWKTISYAEA VDIVVDKLRQSRAKHGDKIAERLALTSPLWDCRESELAALMTMRTAGGVNVMPAGEVCIS TASNVLGMLLGANTSTTTVNEIVNAKTLVLWGANISETYPPYTRWLDKARDAGVRIVSVD CRKTPTSAWAADQLMPLPGTDGALALGAIRFVLENGAFDRARVDESITGFELMEKGAQPW SVDKVAQATGLSEDAVTAFYRTLAESPRTIIWMGGCLSRYTNGLQSIRAIIALQALRDNL IGSGRGLLTMEGGKPEGEKEFVDAVCGPAKDSGVNFRRLLNTMKKGNLDVLFLNSSYRRY PDCTGVAEAIKKVGFVVHRGFFKTEEMDVADLFVPAAFGPESAGSHYGAEKQVVWRDKCV DAPGSCVPDWQFYRDIGRKLAPEIYPDFTGPEELSERFRNTVPSWKAMSVSRMRQSPDGL IWPQPEEGAQERIGTVFTSGTYATENGKIPLDLKLMGAFGWTLPKGNPHGAGADKEYPLV LTQGKVVTQWQQTMTNFAASLAQFSRGRYVLVHPDTAKPLGIAHGDTVRIKTATGSVEAV AELTASIRPGIIFTPSHFTGTSPHAATRSKPINAILPNYWDRVSAQFNGVGCTLEKA >gi|316924508|gb|ADCP01000021.1| GENE 59 78149 - 78871 809 240 aa, chain + ## HITS:1 COG:APE2605 KEGG:ns NR:ns ## COG: APE2605 COG0437 # Protein_GI_number: 14602171 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Aeropyrum pernix # 3 161 70 226 250 152 46.0 4e-37 MPRYVMAIDASKCLNCKACLIACQQRNAVPYGLSRNWVRETPDTASPSGWRYQPGACMHC DEPSCVDACPTHATYKAEDGVVMVDETRCIACGSCMRACPYQARHIDPARRVVDKCDYCA PSRAEGMEPACVSVCTTRARVFGDVDDPQSPVSRVLGSHKVTFVEAEDAPTKPTLAYLND VADKHWPKAESAGPVGFMGTVATGVRWLGALSLFGVVGVGLRQLIRPTGEASHEEKRKGS >gi|316924508|gb|ADCP01000021.1| GENE 60 78868 - 79608 993 246 aa, chain + ## HITS:1 COG:no KEGG:Dbac_2336 NR:ns ## KEGG: Dbac_2336 # Name: not_defined # Def: formate dehydrogenase gamma subunit # Organism: D.baculatum # Pathway: Glyoxylate and dicarboxylate metabolism [PATH:dba00630]; Methane metabolism [PATH:dba00680]; Metabolic pathways [PATH:dba01100] # 1 231 5 232 240 182 42.0 7e-45 MIRRHSLSAVCMHWFNAFCWIALLFTGFALLSNPAMQPVGMWWADLWTGVFGAYGLLVFH 
LLIGSAWLAVYFLYIIFGLRHDVVPFLKEVFSLSPASDMIWCVRKGMRLVLGEKAMRSMG LDSALPPQGFYNAGQKLAAVAAVLCSVGLALSGLFLAALALHLVAPGSEDAAQWALFIHL LCAGVMAVVLPIHIYMAAFAPGEGPALRSMFSGFIPVTFIRHHNPLWYEQLVRKGVIRPE TTNTLS >gi|316924508|gb|ADCP01000021.1| GENE 61 79619 - 79912 617 97 aa, chain + ## HITS:1 COG:NMB1657 KEGG:ns NR:ns ## COG: NMB1657 COG1555 # Protein_GI_number: 15677506 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Neisseria meningitidis MC58 # 21 87 121 187 205 65 47.0 2e-11 MNIKTIASAVLAAVVLCAAPAFADGKMNLNTATEQELSANPAVGPELAKGIVEFRENVGD IKSMDELLEVKGVTPEVLEKLKQAVTVEAIEGAECSC >gi|316924508|gb|ADCP01000021.1| GENE 62 80089 - 81249 1709 386 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|121533853|ref|ZP_01665679.1| ## NR: gi|121533853|ref|ZP_01665679.1| hypothetical protein TcarDRAFT_2230 [Thermosinus carboxydivorans Nor1] # 16 386 16 384 384 221 36.0 5e-56 MSHISLRAIGKTAAAICLAVAMSGVAMAGDAPDRVVMQSPAKDAAVLAKMQGLVNNRKAK TLSGLSRTINDGVVIRKSTIKDGKIIIPPSGLYITVRSTNWHPFDNEPLTLAGKTYHAIT DRYVRDVIRNVTMKKGQVLPSNANKSRGWELSSVENDTYGIEGGNSAVFKIVKTTGNYYG SQFPVYAGAQISNAAADGSLVKGSKDPEGHTVPTAENGLSEMIYGSNIATVGRTHVIIDA IEGDSVKVRELATDSCSAIFVSPNEPVVASYGKGDTFTIGDAKVEVTDVAANAATVKLTD KSGTVTKVLGPLTPENTRMLLMSVVDRDRLWTLSKDGKVAVHLNIRQSDKPIADGKVSLV AYNDVVEVNDGSVWPADTRFLARPET >gi|316924508|gb|ADCP01000021.1| GENE 63 81325 - 81567 318 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|121533854|ref|ZP_01665680.1| ## NR: gi|121533854|ref|ZP_01665680.1| hypothetical protein TcarDRAFT_2231 [Thermosinus carboxydivorans Nor1] # 1 71 16 86 105 86 52.0 6e-16 MFVGPNKHFSIVVDEFDGKIVKAWHIENSKGEKSPNLATRAGGKHIDLVVGKACRSTAHF ISRFYPAMYDETLNGMDSFK >gi|316924508|gb|ADCP01000021.1| GENE 64 81648 - 82370 774 240 aa, chain + ## HITS:1 COG:no KEGG:MTH1541 NR:ns ## KEGG: MTH1541 # Name: not_defined # Def: hypothetical protein # Organism: M.thermoautotrophicum # Pathway: not_defined # 116 203 82 169 169 63 37.0 7e-09 
MGKVLIALCTVFLLAFPARGEQAVTWESLTVWNDLPPLYDPAPWPSDLAGKRVILRWNGK HNGDLLLDYLHSLLSRAFPDTEFIKAYETDPALSGISKDLPESEATARALLELRPDLVIA ATGDCRACSAWLAIDQIQLERAGVPTLTILTQPFLKTFHNVRQNLRIGPLHCVAVPHPVA IIRDEKVRAKVDEAFPAILDALRSGKAADSTKIFPAETAPPAEAEAPSAAPGKQIPADLQ >gi|316924508|gb|ADCP01000021.1| GENE 65 82947 - 84365 2067 472 aa, chain - ## HITS:1 COG:no KEGG:LI0461 NR:ns ## KEGG: LI0461 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 2 472 3 498 498 370 42.0 1e-101 MKRLMTLLLAAGLVLGATSAAKAVDFKMTGLWQNRVSFADRNFEKHNGDDKMRAATRLRT QIDVIASESLKGVMFFEIGHQNWGKAAEGAALGTDGKEIKVRYSYVDWIIPQTDAKVRMG LQPYVQPTFTGIGSPILDADGAGITISNQFTENVSASLFWLRAENDNDPEMTKHDAHDAM DFIGVTVPMTFDGVKVTPWGMGGIIGHDSFKGGNFDLNYPMAQGMLPLMGTSTIVANSDK DHGSAWFGGVSADLSYFDPFRFALDAAYGSVDLGTSKLNGKNFDVKRSGWYAALLAEYKL ESCTPGLLFWYSSGDDANAYNGSERLPSIDPDVYVTSYGFDGTNYGGAAQTMGYGISGTW AVMARVKDISFMEDLSHVLRVVYYQGTNNKEMVRQGMISNPQHSQYSMMYLTTGDKAVEV NFDTEYKLYKNLSLFVELGYIRLDLDKDLWKGVGYEAKENNFKGTFSVGYKF >gi|316924508|gb|ADCP01000021.1| GENE 66 84478 - 85851 1782 457 aa, chain - ## HITS:1 COG:Cgl2722 KEGG:ns NR:ns ## COG: Cgl2722 COG0786 # Protein_GI_number: 19553972 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Corynebacterium glutamicum # 1 444 1 426 449 104 26.0 4e-22 MEASFYPYLGALCWIGMLLMLGTLIRAKVPLFQKLLFPSSLIGGLIGFVLINLDLVGMPT STGWKDITPNIFSMITFHLFAFGFVGIGLLQTKKPASGKVVMRGALWIALVFGMTFSVQA LIGKGVFVLWQDLFGGTFETVNGYLLGAGFTQGPGQTQAYATIWQTSYQTANALSVGLAF AAVGFLVAGIVGVPLAFYGIKRGWCTGGRSGELPQNFLRGLMDRGDNPPCSHSTTHPANI DNLGFHVGLMAAIYGVAYLFGLLFSKYMPAGIAGLGFGLTFCWGMFLAMLLRKFMGKIDI LHLIDGETTRRLVGGSVDFMICAVFLGIEMRALQEVLLPFIVAIAAATFITLIICVWFGR RSPEYGFERCLIMFGYATGTAASGLLLLRIADPDYETPVAVEGGLMNVFSCLTFLPVTLS MPFAPLPGYPIVWVFAAVIIATPIAMYLLKFIRKPAF >gi|316924508|gb|ADCP01000021.1| GENE 67 85972 - 87618 1827 548 aa, chain - ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # 
Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 4 544 15 548 553 345 40.0 2e-94 MGKATQIYRNGTILTMDSRGSVVSALAVEGNVVLAVGTDDDVMAHAGPDTVVTDLRGRTM LPGFIDGHSHFVWAGLMFATQLDLSSPPVGKVKDIAEIRDLLKEKAASTPKGEWILGYGY DDTALADRRHPLASDMDDVTPDHPVLLRHVSGHLCSCNSRALALVGYDRNTPDPLGGIIR RDGHGNPNGVLEEPPAMNPVAALLPEASEEDWMRSIVTACDAYTAKGVTSAQDGFTQEKQ WAQLRKAHEKGLLRNRVQILPGLGCCDLNQFTSHASGTPLTADRKISLGAVKHLVDGSLQ CYTGCLSNPYHKIIYDLPDGPMWRGYIQENPEGFIDRIVALHRQGWQIAIHGNGDEAIQL ILDAYEEAQKRYPRADARHIIIHCQTVREDQLDRIKRLGVVPSFFVVHTYFWGDRHRDIF LGPDRAKRISPLRSALKRGILFSNHNDTFVTPIDPLLSVWSAVNRITSSGQVLGEEYTIS VMDALRSVTSWAAYQACEETSKGSLEPGKLADMVVLGGNPLAVDKKAIRDIPVLATIVGN ELVYGSLE >gi|316924508|gb|ADCP01000021.1| GENE 68 88370 - 88822 246 150 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNAEKVFEQVWREFIRRLEALRNQKPKMTYEQLGEMCGVSKVTTKRWLEGAGGEKTSFPD MLRYMQAVGMNIRAVIPQSQSEKEAVLESRLSEIQKELAVCKDALKQAEKERDAYKSRWE GHLETVRAQSGNHPSGPFAHGQKPLGEKAD >gi|316924508|gb|ADCP01000021.1| GENE 69 89079 - 89720 319 213 aa, chain + ## HITS:1 COG:ECs2580 KEGG:ns NR:ns ## COG: ECs2580 COG0500 # Protein_GI_number: 15831834 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli O157:H7 # 5 145 15 165 247 62 27.0 6e-10 MTQGVGNWTFNVLPCQAGRRSEKQSPWYDMATAAVAMIARPYIPQGGKVYDLGCAAGNVG RVLAPTLHEREAQFVALDGCRDVVVAYCGPGRAIMADVASYPYKPFDVAVAFLTLMDLSS STRQLLLSTLRSKMRPGGAIIIINKEKPPRGSLSATASRLTFGCKVEREAHEARKWRGHY TLEIYSALYRSELGPDAVEAFRLGDFVGWLIEG >gi|316924508|gb|ADCP01000021.1| GENE 70 90079 - 93180 3016 1033 aa, chain + ## HITS:1 COG:lin2276 KEGG:ns NR:ns ## COG: lin2276 COG3829 # Protein_GI_number: 16801340 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Listeria innocua # 728 1023 156 452 455 254 44.0 8e-67 
MPRSLNLPQGILAIALHAAGRPLLTQELAVLAGKPQEAIASLLEPLLGAGFVAHTEAPDA WVCPGLPDAVRDEFVRLLVRNAPPGSLTHALARLGAPDMALDDCTAIAEAIGDDLGKQRP GSIICMECALRFLIAWGERHLYTAGGSESCLYGRLALTFQSMCILANRHIPLAFRLSSIA YELAGYGGNERFRPLIGILKSYMRLLCLELPEEPWNISEIETEFLRLKDACEQEMKNHIA CFAGLLYAISGNYPDSLRSHVHQGEPGDWLNRIALFHAMSISQSAMYLGRYTLAIGTVES ARHSAELAGEKTFSLLWLTHFCFLLLRKGDWDEALNHLDFLFSCTHPVENPKIFSSTVRG LALYHYLNGRLDAAYRILRRETENAIAAGSPHSPFVDPYILDMLYGFTRAGYPEIPRYAF RDTLEALLRSPNRQLRGAALRVKALCLRDGRADLWDAEETDPVSLLRQSVGCLASSQDVR ELALSRHELANTLERLGRHDEARPFRMLVSESIGRASHAGMDPLEISILATRDRDSDVFS PGVPNIGTLFLKEELMDRCHDAFNDVPIRQTLEDNLQRIVNVAQLELKAERAALFRPSAG CNSVECAASVNLTRTEMQSPAMRDRIAWIADCLSNGKGDRDRQAICLSLNVGGRQPWLLY LDTRFGPGFLTQLKKSELGELARLFAAEVRTALRLDDMLQQEIHWQNAKFNAVTLQKNTG EAPIIGEGLKPFVSQVRQVGMTDAAVLLYGETGAGKDVIARQIHLFSQRKGPFVAVHPAS TPEQLFESEFFGHEKGSFTGATSQKIGYFEMANGGTLFIDEVGEMPLSMQVKLLRVLQNQ RFMRVGGTAEIISRFRLVAATNRDLRDEVAKGRFREDLFYRLSVVPLRVPPLRERREDII GLAEAFAESYCQRYGRPSLRLTDAERACLLNYSWPGNVRELKNIIERAVILNAPLLELVS PSRATASGSGAGAFIIPDFPTIPELEERYLRYVLERTGGKVCGPDGAESLLHLKRSTIYL KLKKYGIIPSDFI >gi|316924508|gb|ADCP01000021.1| GENE 71 93481 - 93744 279 87 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSLYFSICAPSGKFLCADALLDSSIHALKEASEKAAGTFTLSFSQDQCIWSVMMEAKDGK TTVINPHLYEKAKKFSMMIEGWNKEAD >gi|316924508|gb|ADCP01000021.1| GENE 72 94091 - 95731 1737 546 aa, chain - ## HITS:1 COG:mlr0093_1 KEGG:ns NR:ns ## COG: mlr0093_1 COG0303 # Protein_GI_number: 13470396 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Mesorhizobium loti # 201 541 6 328 330 93 26.0 9e-19 MNIGSYTFQEFKRLAENFHGYAAPGLLIGGYMVEMAKARIPEGTLFEAVVETRKCLPDAV QLLTLCSAGNNWMKVHNLGRYAVSLFDKHTGEGVRVSVDPAKLDAFPEIRGWFLKEKPKK DQDEVRLLSEIEEAGDGICKAEPVTMKRRFLGHTHMSAIGLCPMCGEAYPKEDGPVCRGC QGEAPYVTASRVLKTPPTRVVPVEEAVGKTAAHDMTRIEPGAFKGPEFKAGQRISVGDIC RLQQMGRFHVAVVEDAPDAGDLVHENDVAEAFARRMAGPGVTYKLPPHEGKIDFIAEREG LFSVDAERMFRFNMLPEIMVASRQDATVVGEGARLAGTRAIPLYISRSRLGEALSALGEK 
PLFEVLPLRKAKVGILVTGTEVFQGIIEDKFIPVITAKAHQFHCEVVRSVIAPDDVKRIA EAVREIQEAGADLLVTTGGLSVDPDDMTRRALLEAGLTGVLHGVPVLPGTMSLMGRIPGD LAHPEGMQVLGVPACALYYKTTFLDLVLPRLLAGRELTRAELARIGEGGYCLGCKICTYP KCSFGK >gi|316924508|gb|ADCP01000021.1| GENE 73 96298 - 96945 524 215 aa, chain - ## HITS:1 COG:no KEGG:Dvul_2281 NR:ns ## KEGG: Dvul_2281 # Name: not_defined # Def: putative phage repressor # Organism: D.vulgaris_DP4 # Pathway: not_defined # 4 215 2 232 232 170 40.0 4e-41 MDSSSFDEIFRRFQSVTGTSTQQELADVLGIKQSTISESKKRGTVPPGWFLVLFEMRGVN PDWLKQGKGPIYLRTEDGRYMEPEPAATPLGRGVSLPYYDSDRVQDGVWEPSGTIMLPEP YAGKELRILRITGDSFSPTVRQGAFVGVDTSCVRPSSGSIFAVSVPFEGVVIKRVFCDSD TLLLRTDNPLHPSMSIPLAEAGRLIGRVAWVFQAI >gi|316924508|gb|ADCP01000021.1| GENE 74 97131 - 97358 204 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLVLPFRLAPDTLAAPVPGPQHFLMNHPEKREPVQVQRHRYKKQDVEGGVLSAALKLGN VNTRETSEISKFLLR >gi|316924508|gb|ADCP01000021.1| GENE 75 97471 - 97878 347 135 aa, chain - ## HITS:1 COG:CAC0907 KEGG:ns NR:ns ## COG: CAC0907 COG0432 # Protein_GI_number: 15894194 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 8 127 11 130 132 136 49.0 9e-33 MEICTVVTRDREEMVDITERLCAVIREKGWRHGVAALFCPHTTCGLTINEGADPDVRRDM TAFFSRIIPHRGEWRHGEGNSDAHIKASLMGPSLFVIVEDGEPRLGTWQSVYLYEGDGPR SRSIWVQWLPASENA >gi|316924508|gb|ADCP01000021.1| GENE 76 98091 - 100292 2378 733 aa, chain + ## HITS:1 COG:RSc0588_4 KEGG:ns NR:ns ## COG: RSc0588_4 COG2199 # Protein_GI_number: 17545307 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Ralstonia solanacearum # 569 724 6 165 182 99 36.0 2e-20 MDQRELRGLRTYALYCIVAAALLAVVFLFYFSRTKQNIQEQAEAQLTEVTRQYANAIKAD MRGNIQTVEALAHVFGELGQGNPETIIPLITGAVATLGLKRMGFVTLDGMAHTTDGLTFD ARDRLYFKRALTASSTVAILLVDKTDNTPINVYVSPIVHKGKVLGALFATVQTDAFIMSI DDRLFNGSGQVLVTNNQGHILFRGRGPIPLARYNNITELIPTGVPPAPGNDIPDTFGSGG LFRYDGSPYYMAHSKLEITSTDWHIFSIVPEDTLTANVGTIQREVLYLMLFFIGLSMVFL WLVLRMQGRQAKRLRRAKQFLETVIENIPGGFFRYSNDEKQEFDYISEGFLKLLGHTRAS 
FREAYGNRFDNLVHPEDRKRVLDSINSQIKHSDYDTVEYRVTKADGQILWLFDKAQLVRD ENGRSWFYVIVMDITSLRNTQQELKISEERYRILTELSERIIFEHDLVTRCSYFSPRFKE KFGYDPAVDSASGSLSEYCIHAVDRPLYRRFVTQILSGHSAPAELRFLKQPSGFLWCRVQ AVAIFDESGQPVRLVGEIEDIDAEKRDTERLRMKAQLDPLTGLYNASSARKRIEASLQQP PSVGMHVLAVIDVNHFKQINDTCGHLTGDHALIETARRLGSMLNEGDIAGRIGGDEFIAL FRGVPSMGALEKRIDELRRGLSFPLPDGLSMSASIGIAAFPNDAADYTGLFRLADASMYA DKRHRDEDEMKGN >gi|316924508|gb|ADCP01000021.1| GENE 77 100753 - 102693 2126 646 aa, chain - ## HITS:1 COG:BH1826 KEGG:ns NR:ns ## COG: BH1826 COG3284 # Protein_GI_number: 15614389 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; K Transcription # Function: Transcriptional activator of acetoin/glycerol metabolism # Organism: Bacillus halodurans # 24 644 28 622 623 334 34.0 3e-91 MLTTLPSPSAQTIRQSVPDQEDIIRRSLERSARYGVDPHLDGAPESTRLSDEQLRERING QRVFYTLAKEQIDSLYRLLRDTGFCMALADSEGYVLYVVGDSDLVEHFKRRRCIPGYRWT ERDIGTCAIGLSLEERVPVFLPGDRMYSAQARKISNAGAPVFSLDGGRVLGAISLSGGSD MMHIHTLGLVRQAAETVTSQLRERERIRELAIKNQYMRALVESDSRGIVTVDQTGRIVEA NSTARKLLKLSPGCEGKSFEESVGESYNITGYLKEGKGFRAREILARRSGTTHFASLDPI RMNSGELVGGLFTVMEKKEMMRMAVEMTGAHAHFTFESILGASESLRSALHLAHIAAGST APVLLSGETGTGKELFAQAIHNDGPRRNRPFVAINCGAIPKELLESELFGYEEGAFTGAQ KGGRPGKFELADTGTLFLDEIGDMPFDMQVKLLRVLQSGEIQRVGGLRTVPVDLRIISAT NKDLKQAIAQHQFRADLYYRISTLSILVPPLRERAEDILPLAEHFIHRHELRLNRHPVPL PQETAEAILRYPWPGNIRQLESAVERAVHLAEGGALLPEHFGIADLMENRRPAAPAPAQA TLEDIERQAIAAALVRFGGNISQTAFALGVSRPTLYRKMSKYGLEE >gi|316924508|gb|ADCP01000021.1| GENE 78 103047 - 103559 792 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302863611|gb|EFL86542.1| ## NR: gi|302863611|gb|EFL86542.1| hypothetical protein HMPREF0326_00314 [Desulfovibrio sp. 
3_1_syn3] # 29 169 27 168 169 101 45.0 2e-20 MGRLLRRNIWIEAGISLFLMLFFIYFYCHIDQWVIETLPSTISPAFFPSVVTLVLIIMSA LLTGFCLRSVRHMLAGKVDEERLELQEGGEEAGRFAALAGYVGILFLYLIGLHFIGFVYS TPIVMFLVSLMLGLRHWLIGIVCYVLFTLLLNYVAFNFMQIILPTGVLFG >gi|316924508|gb|ADCP01000021.1| GENE 79 103581 - 105089 2221 502 aa, chain + ## HITS:1 COG:BH2009 KEGG:ns NR:ns ## COG: BH2009 COG3333 # Protein_GI_number: 15614572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 8 469 8 466 504 338 39.0 2e-92 MFADICTGLISILTPETIGAVALGIFTGLMFGAIPGISGIMAISILLPLTFYVSPLVGIP MLLGIYKASMFGGSITAVLLNTPGAPPAVCTAMDGYPLTKQGKAGKGLNAALVGSVFGDT FSNILLICVAAPLSYLTLKVGPVEQCSLILLALTVVGSISGSSILKGILCAGVGILLATV GVSGTTGAMRFTFENENLMSGIALIPMVIGLLCLPEVIHQACSGIRKTFEQQFDLSGENG RLSWQEIKSRIPVLLRSSIIGSVIGAMPGLGASPAAYMAYSEAQRTSKHPEQFGKGAIEG VMAPEAANNAVTGSAMIPLLTLGIPGDDVTAVLMGAFLIQGITPGPNIFFENTTVVYGIF GSLIMCDILLYVIAKLGFRVWVRITQLPKHIIFSTVTIFAFVGTYSINQNLFDILCLILF GILGYGMRRFQFPAGPMIIGFILGPLLESAFDQTMTLSDGSFMIFLTHPFSVVLLLLTVA AVFSIARARLRRSKIAQLAQEG >gi|316924508|gb|ADCP01000021.1| GENE 80 105420 - 106397 1608 325 aa, chain + ## HITS:1 COG:FN2103 KEGG:ns NR:ns ## COG: FN2103 COG3181 # Protein_GI_number: 19705393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 16 321 1 307 308 125 28.0 9e-29 MFKTLRTALFGMGAFLLAGGLLAVPAQASDYPNKPIQLVSPYGAGGDSDLTARVWAEFAK KKLGQPVVVVNKTGGGGLTGSLFAAKAKPDGYTLFLAQAGPNIIIPLTAKGANYSFDSFD YIARIMVANCAVVVNKDAPWNNLKEFEAAAKKEPGKLVFASPAATSWLTFAMRNWFTGTG VQVKQVEYKSGGEAATAVLGGHADMTFLFPQNYAPMASADKLKILAIGTKSDVYPNAPTF AEQGYEGNYYGWAGIAAPKGTPKEILDRLAAISEEIVKDPEYIKAIKNMNATPDYSTGEA WMKQLKEQYAEMNKVLTDLGLTGNK >gi|316924508|gb|ADCP01000021.1| GENE 81 106600 - 106980 588 126 aa, chain + ## HITS:1 COG:SMb21139 KEGG:ns NR:ns ## COG: SMb21139 COG0251 # Protein_GI_number: 16264466 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, 
yjgF family # Organism: Sinorhizobium meliloti # 1 124 1 125 127 144 60.0 4e-35 MSIKRYGTPTKPGALPFSKAAGADGWLFVSGQVPRDADGEIVTGNITVQARVTLENLKKV LESAGYSLEDVVRVNVFLDDPRDFAGFNKVYAQFFTAEHAPARVCVQAMMMSDLRVEVDC VAYKKD Prediction of potential genes in microbial genomes Time: Fri May 13 02:12:26 2011 Seq name: gi|316924498|gb|ADCP01000022.1| Bilophila wadsworthia 3_1_6 cont1.22, whole genome shotgun sequence Length of sequence - 11612 bp Number of predicted genes - 10, with homology - 7 Number of transcription units - 2, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 79 - 1191 1486 ## COG0665 Glycine/D-amino acid oxidases (deaminating) 2 1 Op 2 . + CDS 1206 - 1481 337 ## RALTA_B1295 putative sarcosine oxidase alpha subunit; 2Fe-2S ferredoxin 3 1 Op 3 . + CDS 1481 - 1684 66 ## 4 1 Op 4 . + CDS 1630 - 2868 1711 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 5 1 Op 5 . + CDS 2920 - 3369 347 ## 6 1 Op 6 . + CDS 3379 - 4812 2246 ## COG0591 Na+/proline symporter 7 1 Op 7 . + CDS 4831 - 4947 221 ## 8 1 Op 8 . + CDS 5048 - 5758 933 ## Sros_5198 hypothetical protein + Term 5814 - 5874 23.6 - Term 5800 - 5862 24.0 9 2 Op 1 . - CDS 5869 - 6426 468 ## Ddes_1997 outer membrane chaperone Skp (OmpH) - Term 6448 - 6493 10.1 10 2 Op 2 . 
- CDS 6535 - 11262 4183 ## Ent638_0501 outer membrane autotransporter - Prom 11288 - 11347 2.6 Predicted protein(s) >gi|316924498|gb|ADCP01000022.1| GENE 1 79 - 1191 1486 370 aa, chain + ## HITS:1 COG:AGpT61 KEGG:ns NR:ns ## COG: AGpT61 COG0665 # Protein_GI_number: 16119833 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 22 352 21 352 372 196 34.0 7e-50 MSTTEQDIIIVGGGISGAALAYGLSGKGRKVTVLDAPTDTNKASRTNVGLIWCQSKFLHL PEYAKWGFISSRLYPALTKELEEISGHAIPVNYTGGIIPVLSEEDYQKRGDYIEKLREAL GEYKGNMIDRAELEKKLPKIGWGPEVCGAAWCEEDGVVDPLSLLRAFRAALPRVGVDYRQ TLVFDVQPHAGGYRVTTKEGTFDCQKLVLAAGLSNRRFVQFAMPTLPVYADKGQVLLVER MPFVMPIPVLGVTQTFGGTTIIGFRHEKAGHHTQVVPSAVASEGKWAIRVWPELGKKRLI RAWAGLRVMPDDSMAIYSRLPGHPNVTLVNTHSAVTMAAAHTRLLPDFILGGELPETAQG MTLKRFGYSC >gi|316924498|gb|ADCP01000022.1| GENE 2 1206 - 1481 337 91 aa, chain + ## HITS:1 COG:no KEGG:RALTA_B1295 NR:ns ## KEGG: RALTA_B1295 # Name: not_defined # Def: putative sarcosine oxidase alpha subunit; 2Fe-2S ferredoxin # Organism: C.taiwanensis # Pathway: not_defined # 4 85 15 95 118 92 63.0 4e-18 MQNASTVHITFDGQPMEVPAGISVAAAVLGHAHAGHTCKHPVDGSARAPYCLMGVCFECL MEIDGEPDVQSCLVTVREGMVVRRQLPGEAE >gi|316924498|gb|ADCP01000022.1| GENE 3 1481 - 1684 66 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDRNMDLIVIGAGPAGLSAACAARACGLDVTLVDEQAARAASFSAISKPRWRRRCSIPRN APPGSGS >gi|316924498|gb|ADCP01000022.1| GENE 4 1630 - 2868 1711 412 aa, chain + ## HITS:1 COG:AGpT58 KEGG:ns NR:ns ## COG: AGpT58 COG0446 # Protein_GI_number: 16119831 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 402 64 454 472 192 36.0 1e-48 MAQALLDPKERSAGLGLVKRFRESGATYYPGTTVWGLEPHRVSCTMNGKAEELAASHIIV APGGMERPVPFPGWTLPGVMGAGGADILLRSGGTLSADKNAPVVLAGNGPLLLLLAGHLL EAGVPIAAWLDTGWWSRRIMAGALMPAGVLDMPYVAKGMKMALRVLKGKVPIIRNVTNIR 
AVGSDHLEKVVYDAGGKTHEINASTLLRHEGIIPRTHILNSLNAKHAWDGVQRYWHPVVD ENGRTSVDGISIAGDGTYVHGGDASMLKGAIAGVEIARRLGVISDAEAAYRSGPARRQLR AMRIARGFLRYVFAPNPAIFNVPDETLVCRCECVTAGDIRKAVAEGFRDVNEVKRFTRCG MGQCQGRMCGPALAEITAAAQSKAPDAVGCLQVRQPFRPVSLENYCNLNLPG >gi|316924498|gb|ADCP01000022.1| GENE 5 2920 - 3369 347 149 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRRSLSLLCRRGICLLLAASYIVCLCAASLNAAPTVAEAGSIASLELLPRAENPHLSPDD PALAASIPETSGKSLQSWESLFPLLLVILWPHLFRAVRLRLRDHRLPRHTDIRGLLPLPA APPLADPPFFAEDSIMFSRRIMQRRAACV >gi|316924498|gb|ADCP01000022.1| GENE 6 3379 - 4812 2246 477 aa, chain + ## HITS:1 COG:PA0287 KEGG:ns NR:ns ## COG: PA0287 COG0591 # Protein_GI_number: 15595484 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Pseudomonas aeruginosa # 6 426 3 416 461 144 30.0 4e-34 MTIHWLDYAIIILYFFVVVYIGYWAMKKVTNFDDYAVAGRGLPMSIFFAAIAATLCGGGA TIGRVSFMHTTGIVVFAGLIGVVINQIFSGLYIAGRVHNIKNVYSVGDLFGLYYGRAGRL VSSIVCFFFCLGLYGVQILAMGAILQTAVGIDLIPAALISSVITLAYTWSGGMLAVTMTD AVQYVIIIIGVSLCGYLAIDHLGGFDAMMATLNAMPRYESNLKLFSGWGPIQFAGLFLSF LFGEFCAPYFIQRYASTKSAKDSKAGVLIFSVHWIFFLATTAGIGLASMALQPDVKPDLA FTNLIRDVLPIGVTGLVLAALLAAVMSSGAAFINTACVVYTRDIYNKFINPQATQDQMLR QSRMSTLLVGGVSIGVAILFQDVFGLMIYIFKLWPSAIIPPLLAGLLWGKVSPYAGAPAV VIGVVSCFLWSDKVLGEPFGIPANLIGIGLNCLTLFVVHQMMKRRVPSTGIYAPEVL >gi|316924498|gb|ADCP01000022.1| GENE 7 4831 - 4947 221 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLEFTKYLIYASFILNIACGVWVLTGIVQWIAEMNKKN >gi|316924498|gb|ADCP01000022.1| GENE 8 5048 - 5758 933 236 aa, chain + ## HITS:1 COG:no KEGG:Sros_5198 NR:ns ## KEGG: Sros_5198 # Name: not_defined # Def: hypothetical protein # Organism: S.roseum # Pathway: not_defined # 3 235 4 237 251 190 41.0 3e-47 MKGGRLYYDMPIGILCLESLFPKPRGHMRNPLTYGFPTVTRVIRGVDIPRLLFNPTPDLL EPFIQAAKELEADGVQAITGSCGFMARFQNQIAAELHIPVFLSSLLQLPLVRLMHGEKAE IGVLTASSQALTPAHFANCATPMESVHIRGMEGNPEFWETIIEGKRHDFDMERLEAEIVG SAAAFAREKALDALVLECTDLSAFSAPIQRAINLPVYDINSLVEYAYYAVCRKDYR >gi|316924498|gb|ADCP01000022.1| GENE 
9 5869 - 6426 468 185 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1997 NR:ns ## KEGG: Ddes_1997 # Name: not_defined # Def: outer membrane chaperone Skp (OmpH) # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 19 185 16 181 194 91 34.0 1e-17 MRIGSVCSCLVFVFFCLPLVSSCNAAPKIPAVVSVDMQRLMTESEPAKQASEHLGEVRAV LQKGFEALRETYKNAPKEQQEAIFANGLAVLNRQMELETRAAAKSVNDIIAEEVGKWRKS NGVMLVISRSAVIDGDWDSADYTKTILAAVNKRKAAFAALPAVNIVKGQAGKADEETKSA GSRKK >gi|316924498|gb|ADCP01000022.1| GENE 10 6535 - 11262 4183 1575 aa, chain - ## HITS:1 COG:no KEGG:Ent638_0501 NR:ns ## KEGG: Ent638_0501 # Name: not_defined # Def: outer membrane autotransporter # Organism: Enterobacter_638 # Pathway: not_defined # 961 1575 755 1371 1371 321 36.0 2e-85 MTMSKGAVGNLINRYRAVLKKCHMLNVFGSLAVAGMLVMGGAGVAGAQTIVNDEVNVQNE TVSDSSGDLDGLVYKVNDNGKLSITGGSYSGNKTARSGGVLVGTKNQAHIEIVSTVFSQN SAQYDGGAIGNFGYLTVKDSTFEGNTAQIGSDGSSHVIDNCDMGGGAIAIGSGSHVSISG STFTGNVSGKDGGAVATRKFYGAYDTDSADVGLTVSGSTFTNNTAGKSISDTPYNYNTAI TGLYESGNGGALFNTFNDTLVEGSSFTGNTAINGGAVANFYDRGKTENYGSITIKNSTFT NNTAVVDKKAKEGSGMGGAIYSYASKRATDKDFKVTVVDSQFTGNTAYNGGAIANNGNLD VSGATFKGNTASNWGGAIVNWQDMTVKGSTFEDNVAGFVGGAIGSDSTSNTVNIASSVFK NNHSVYDGGAIGSYKGLTITGSTFEGNTAQLDKSDSGEWNVAVEDATAIGGGAISLGAVS ATAIATIENTLFKNNVSGYNGGAIGTRMGKDANNSAAKLDISATFTGNRAKNGGAIYNTF YADNGLGKGAGVTVTGMFTGNSASENGGAVYNDGAQDKAQNAGGVMTITDSLFENNTADK GGAIYNTGTLHFAGTNTFKGNNATEGNDIYNVGSVTVESGVTGLNSGYTQEKGSLAVNSG AALNAAGLELKGGTMSVAGSAVAAEGGAVAFGSGSSLIINGGSVTADSGNFITDAFGAVA GVDTALVGTAGDLYLRFDGKTYTVDQYKAAKAALFEQDTEKALLTLLNGTLKVQEGETVV VSGTADPATNKVTLSGVTPVVAKSGESANDSDVTIASVPDQLVRVVDSNDKDKALQTVGN LEIGKSVFNAKGLQVAPADAAAGVVLNLVDNSKITLIGGETEDSGVLVDGNNNAVKTHVV VGDGSALQLGMAGNAGKQMASFGTVEVQGKLDVAGNGLDTTRYGYDEIKTSHGNAEVNLS NASISASTATLEQGAVNVLNATAHIGTLNSTGGTLFIDPAYVKVDSLSGDTFGSSLLVGD GSVVEIGSIADLQTKAAEAGLNPAAFGNGFTVASGESMLALGKAVALGAGGKIVVDAGVD KSGAIGGVPVGDSAYFGNKSLLVVDSAIASGDGAIRANSAADTITVKKDAKLYLVDAKAN GTYTIASGFGDTSSTVEGWNGDALITNRLVNAERIAGTDGSVTVKTTAKRASVLYPGISI 
PNTLDHMIGYKQNDVNSGNAGIKFLSRALEPQFLAEGDVIPTLDGAAQLAYAGGVQSSTL TVAQAPVRAIQDHLSLAGKVAQKGTNLHENGFDLWANAIYGSNRTRDFSAGSLDAGYNSD FAGGVIGGDWTFDAGAGKGRVGLALNVGAGDTKSRGDFNSTKNDFDFWGMSLYGGWSMDN VNVVADLGYSASKNELKQDIPSSLGLGGRIKGDVDSSVITAGVKAEYMVKTDVLDVMPHV GVRYMAVKTDSFTTKLDMGGDLFHTDSDLQHVWQFPVGVNLSKTIETESGWKVKPQADLS VVPAAGDTKTKIDVRTPGVNASDSMKRRVMDTTSFDGVFGVEVQKDNVSFGLGYNIQASE HQTGQGVTASFMYKF Prediction of potential genes in microbial genomes Time: Fri May 13 02:13:22 2011 Seq name: gi|316924494|gb|ADCP01000023.1| Bilophila wadsworthia 3_1_6 cont1.23, whole genome shotgun sequence Length of sequence - 4087 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 196 - 1716 1178 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains - Prom 1807 - 1866 4.7 + Prom 1713 - 1772 7.3 2 2 Op 1 . + CDS 1899 - 2132 265 ## Sterm_3521 4Fe-4S ferredoxin iron-sulfur binding domain protein 3 2 Op 2 . 
+ CDS 2142 - 3827 2367 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit + Term 3866 - 3893 0.1 Predicted protein(s) >gi|316924494|gb|ADCP01000023.1| GENE 1 196 - 1716 1178 506 aa, chain - ## HITS:1 COG:hyfR KEGG:ns NR:ns ## COG: hyfR COG3604 # Protein_GI_number: 16130416 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli K12 # 197 497 347 657 663 263 46.0 4e-70 MPSYPSDPFFESGAMACLIFYRTNGLHGSITCLHNMIRPYLPVVRINAILISRDFTRVIQ MFDTKLDSRRTTRISNPERGAPFVESEKINQPVLINNLDGYKVQAPFKDPDLADVPFLIH SAMIRLPLFENGEYYFCLNFWSDDYDAFTWTHVEQLQRLVRPLAGELRERLSFTEEKALP YTPAGSGFERLSLCPELAEVRQRIELAAPTRTTVLILGETGSGKESVADAIHERSDRRNG PFLKVNCGAITPSLLSSELFGHEKGAFTGAHTTRRGYFEQANGGTLFLDEIGEMPQEAQV HLLRVLESRYVTRVGDHRPIPVDVRVIAATQTDLMKKVREGTFRKDLWFRLAVLTIVIPP LRDRKADIPALVRHFLRTKAAQFEMEVPDVPAPELERLYSHDWPGNVRELEFVIERSLLL SRSKAVGTPLKFEFAPEVEGGAASPEGGWPSLAELEQRYIRRVLEKTGNKLMGPGSATEL LGVHYTTLRAHMLKMGLPLPRRKEGR >gi|316924494|gb|ADCP01000023.1| GENE 2 1899 - 2132 265 77 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3521 NR:ns ## KEGG: Sterm_3521 # Name: not_defined # Def: 4Fe-4S ferredoxin iron-sulfur binding domain protein # Organism: S.termitidis # Pathway: not_defined # 1 62 1 62 66 61 43.0 8e-09 MPPRIDPEKCTGCGLCASICPLQVFRQKQPKTTPEVAYGEECWHCNACVLDCPAKAVSLR LPLNYMLLHVDADTLHS >gi|316924494|gb|ADCP01000023.1| GENE 3 2142 - 3827 2367 561 aa, chain + ## HITS:1 COG:PA2298 KEGG:ns NR:ns ## COG: PA2298 COG1053 # Protein_GI_number: 15597494 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Pseudomonas aeruginosa # 9 530 7 520 574 127 27.0 8e-29 MIPVSKTLETDIAVVGGGIGGLCAAIAAAEGGARVMVLEKANTKRSGSGATGNDHFACYY PKQHGGDIRPILRELLDSLVGLCHDPLLSLRFLERSISIVDKWHEWGINMKPFAEDYVFM GHAYPDRPRIWLKYDGHNQKEALTRQAKKVGVTIVNHHPAVDLMRDEQGITGVLALDVSG 
ESPSFTFVKARKVILATGTANRLYPAAGSPGWPFNTAFCPACAGAAQAQGWRIGAKLVNM ELPNRHAGPKFFARAGKSTWIGVYRYPDGKLLGPFVDKATREVGDITCDVWNSAYTDVLM NGTGPAYIDCTGTGPEDMAFMREGMASEGLTGLLNYMDEKGIDPSKHAVEFMQYEPHLIG RGLEIDIDGQTSVPGLYATGDMVGNFRADIAGAAVYGWIAGEHAAAHLGDGPLADIENTP WAQERMAYYSRFMERPSGAHWKEANLALQQIMADYAAAGPHRVRSATLLNAGLKYLADLR RNAEAEIAASDAHTLLRAIETLDLIDNGEIVMHGALERKESRGMHQRSDFTFTNPLLSDK FLTLRKEQGRIIPEWRQRWTA Prediction of potential genes in microbial genomes Time: Fri May 13 02:13:37 2011 Seq name: gi|316924482|gb|ADCP01000024.1| Bilophila wadsworthia 3_1_6 cont1.24, whole genome shotgun sequence Length of sequence - 18258 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 6, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 89 - 148 4.1 1 1 Tu 1 . + CDS 181 - 1575 1691 ## COG0471 Di- and tricarboxylate transporters 2 2 Tu 1 . - CDS 1926 - 3710 1292 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains - Prom 3818 - 3877 7.1 3 3 Op 1 2/0.000 + CDS 4311 - 4985 183 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 4 3 Op 2 . + CDS 4823 - 5308 411 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 5 3 Op 3 1/0.000 + CDS 5320 - 6729 2327 ## COG1757 Na+/H+ antiporter 6 3 Op 4 . + CDS 6916 - 8079 1626 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 7 3 Op 5 . + CDS 8098 - 9777 2320 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit + Term 9891 - 9919 1.0 8 4 Op 1 . + CDS 9926 - 10153 314 ## Dhaf_2058 4Fe-4S ferredoxin iron-sulfur binding domain protein 9 4 Op 2 . + CDS 10164 - 11183 1492 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 10 5 Tu 1 . 
+ CDS 11550 - 12518 1095 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 11 6 Op 1 2/0.000 + CDS 12919 - 15414 2810 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits + Prom 15586 - 15645 3.0 12 6 Op 2 . + CDS 15860 - 18244 2865 ## COG1042 Acyl-CoA synthetase (NDP forming) Predicted protein(s) >gi|316924482|gb|ADCP01000024.1| GENE 1 181 - 1575 1691 464 aa, chain + ## HITS:1 COG:MTH788 KEGG:ns NR:ns ## COG: MTH788 COG0471 # Protein_GI_number: 15678812 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Methanothermobacter thermautotrophicus # 21 461 7 441 443 160 29.0 4e-39 MTTSLRKGCIGLAILLGYFLLVSLPCPDGLSPNGKRAIAMMFAAVLIWVFEILPVAVASV LFAVLPAVAGILPLPKMMQLFATPTIFFVFAMFCIAIAFQNSGLSRRIVLWTSLRSKGSP VRLLLLLMMVCALLSTILADIPVVAMMLPVAVLLLEKNGCVKGTSNFGKAVMLGLSIACL IGGVGTPAGSAMNMLTISLLQSTADVHIGFFEWTAIGMPMVLVLTPLAWWLVIRAFPPEI GRLAGMDLVEREYAGLGGLGRRESVFIAVLAVNFILWSTEKLHGIPLPVAAVIGCAIFSL PTVELLTWEQDKGRIGWDSLMLIGASNALGMAIWETGGAAWIAHACLSGVAGMPLAGVIA VISLFTVVIHLLIPVNTAIVAVLLPALAALAGTMGVNPAVLAIPMGFSVSAAFLLPLDAV PLVTYPAGYYRMFDMFKPGCLISVVWVVVMTLVMFAIAVPLGLL >gi|316924482|gb|ADCP01000024.1| GENE 2 1926 - 3710 1292 594 aa, chain - ## HITS:1 COG:aq_218 KEGG:ns NR:ns ## COG: aq_218 COG3604 # Protein_GI_number: 15605774 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Aquifex aeolicus # 149 437 142 430 506 256 47.0 1e-67 MPDTYQDGVFFHKSTRILAMPDVRESLIAFYKLLKGTFPLEGMVMHHFLPSSQSLLELHF IHDGGIHFLGRFIPFSREQTMFLRAFSLTGGVNAIPNSQEVRTAESVNNDLREFIENRPR AHLVCCLKGAGAPLGLLRLIGTEVGCFTEEHERRLGLLAIPLSFALVKVLREEEVQRFFS AHGYDTPSQSPHEDDAPAELIGTSGGLRNVMETVRKLSGTDAPVLILGETGTGKELVANA IQARSQRVGKPFIKVNCGAIPETLMDSTLFGHEKGAFTGAHCAVGGKFELANGGTLFLDE LGELSLQAQVRLLRTLQNHVVERVGSTTSIPVDVRIIAATNRDLHKMLREGTFREDLYHR LNVFTISVPPLRERLQDLLPLARHFIDKATRRLGLPPISGIEPDSAERLLQYDWPGNVRE 
LENLVERAVILDYNSKLKLDRYLQPVLERAVPGRSAPEHGEALEGRVRELVRRCFAEWME KGMLPEAECPVPSRSVREHRLDNAAGRGEARLSEKTGAASVGAASGFSEFRSEAALRSLD DVVREHIEAALDRTGGKIHGPGGAGELLGVHPDTLRKKMKKMGIVRPSGRREGR >gi|316924482|gb|ADCP01000024.1| GENE 3 4311 - 4985 183 224 aa, chain + ## HITS:1 COG:Ta0779 KEGG:ns NR:ns ## COG: Ta0779 COG0111 # Protein_GI_number: 16081846 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Thermoplasma acidophilum # 54 166 51 157 309 81 38.0 2e-15 MTHMLSLLPKSRFEESGVEIPGGLAFLADYRPDAIAEAAAGTEGLFMPPSHPRLDADLLA RLGSVRIIQTAGAGFDSVDHAAAAKLGLIVCNSPAQNAITVAEHVIGAVICLQRELAYAD EAIKSGAYAQARERILDRGACELFGATVGIVGLGGIGRALAPALRPSGQRSWPPTSSGPR ISRRSTASGASISTNCSPSATSSPCTARSWTPRAALWTRRGSLP >gi|316924482|gb|ADCP01000024.1| GENE 4 4823 - 5308 411 161 aa, chain + ## HITS:1 COG:AF0813 KEGG:ns NR:ns ## COG: AF0813 COG0111 # Protein_GI_number: 11498419 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Archaeoglobus fulgidus # 1 160 166 319 527 114 41.0 7e-26 MASDVFWPEDFAQEHGIRRVNLDELFAVSDIVTLHCPLMDATRGLVDAARLASMKPGAIL INAARGGIVDETALAAALERGGIRGAAIDNFESEIPSPGNPLLRLSPEARRRVLFSPHLA GVTRAAFARLIRQAIGNLENSLRGLPPQFSVNGLSSIRPSA >gi|316924482|gb|ADCP01000024.1| GENE 5 5320 - 6729 2327 469 aa, chain + ## HITS:1 COG:BH3946 KEGG:ns NR:ns ## COG: BH3946 COG1757 # Protein_GI_number: 15616508 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Bacillus halodurans # 3 456 4 458 473 274 37.0 3e-73 MNDKKTVSFGMALAIIFLPILVILLGGLVLRVGFLVPLMLATITASLLSIMAGFTWKEVE ACIVNGVHRIIIVTCILFLVGTLIGVWIQAGVIPLFLYWGLKLLSPSMFLVSTVLICAVF SLMTGTSFGTIGTAGLALLGVGEALGFPTGLTAGAIVSGAYFGDKMSPVSDTTNLASGIT GTNIFSHIGSMLYTTVPATLVALGLYWYLGLNYSGGAMPDLTPITDALSANFNRSFLLAL PPIVLVALAIRGIPPLPTLVIAILVGVVCAFVIQDGMTIKGMFKAATDGYVSQTGHPLVD 
KILSRGGLTSMSFVVFLLLIAMTLGGILEGTGALGVVVDRMTRSVTSPGGLILATLVSCY LMTIGTGNGMLSIIVPARAFEKKFRDMGIQSRVLSRTLEDAVTLGIALVPYSMAAFFIVG VLKIDAMQYIPYAFVNWIVPIFSLTYGFTGFAIWKINKDAGNSPAESEA >gi|316924482|gb|ADCP01000024.1| GENE 6 6916 - 8079 1626 387 aa, chain + ## HITS:1 COG:PH1371 KEGG:ns NR:ns ## COG: PH1371 COG0436 # Protein_GI_number: 14591174 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Pyrococcus horikoshii # 33 384 26 383 389 303 44.0 3e-82 MDTQRFSRIANGLKSASISSVTQTADALRREGKEVLVFSIGRPDFDTPEHIKAAANDALA KGFVHYTHNRGILPMREAVAAYLKRQNGLDYDPKTEIMITAGAQEALMLCTHALLDPGDE VLVPSPGFLLYYSSIPMLHAVPVPYVLKEPDYDWDGAPVSDRTKLMIFNNPNNPTGKVFS AEEMRDAADFVKKHDLLAISDEAYDRLLYGDARHRSLAAEPDMRERTLLVGSLSKTYSMT GWRLGYVAGAPELIERLARLQQNYMLSVTSFAQYGGAEALNGPQECVETMRRAFEERRKV LIEGLTGAPNIRFNNPEGAFYLFIDHRDTGLDSVTFCKRLLEEQLVACVPGDDFGPSGEG HIRISYATSTENCAEGARRIRAFLNNL >gi|316924482|gb|ADCP01000024.1| GENE 7 8098 - 9777 2320 559 aa, chain + ## HITS:1 COG:PA2298 KEGG:ns NR:ns ## COG: PA2298 COG1053 # Protein_GI_number: 15597494 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Pseudomonas aeruginosa # 1 558 1 552 574 211 29.0 4e-54 MQTHRHQADILIIGGGVCGLNAALTAREQGRSVIVMDKAVIERSGHIAGGIDHFLAYMDT GAEWDTREAYLEFTGKSARGVTNIDVVDSVYCQELPHALKRFEEINCTLRQPDGTYYRTK SYGQPGPWWINFNGKRMKPLLAKAARKAGCAVLDRVVTADLLTNGGTVCGAAGFDIRTGD FHIIRAGAVIISTGGTNRLYSNPSGMSFNTWMCPADTGDGEAMSLRAGAELANIEFLRMT VVPRSFNAAGLNALSGMGAVLINAKGEAFMDRYHPLGMKGPRYKMVQGVLNEMREGRGPV YMDCRGIAPEAQKHLIATLSCDKDSFKDYFEQLGIDLTTTPMEVDTSEGMQGGPNEVCGS GVKINKDCGTNVPGLYAGGNGADQCRSLHMAVTSGIHAGRMAAHYCASLASAPEPDEAQI QRIHERVYAPLDPERSIGWKEFEMTLQRILTEGAGPCRTEKKLLRAKEKLEQVQAASGLV GAKDLHDLLRLHEVHNLLTIGRCTVDAALYRQESRFGNCHFREDFPEQDDERFLGQVVVS AESDGRLKLELRRTDNPYA >gi|316924482|gb|ADCP01000024.1| GENE 8 9926 - 10153 314 75 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_2058 NR:ns ## KEGG: Dhaf_2058 # Name: not_defined # Def: 4Fe-4S 
ferredoxin iron-sulfur binding domain protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 2 66 3 67 67 66 47.0 3e-10 MPAMIDTKTCTGCGRCDRSCPLDVIYYNKDEKKAEIRYPEECWMCGSCRQECPSGAITLR FPLNTLYNASSNPYM >gi|316924482|gb|ADCP01000024.1| GENE 9 10164 - 11183 1492 339 aa, chain + ## HITS:1 COG:slr0619 KEGG:ns NR:ns ## COG: slr0619 COG2159 # Protein_GI_number: 16331820 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Synechocystis # 42 333 55 336 348 133 32.0 4e-31 MLYIDAFNHILPKKYQAVLEQKVPNRDMSSNLSRYSETVPTLLDLDARFRLMDSVEGYMQ VLTLAAPTVESVASPEVAVDLCRFANDEMAELVEKYPDRFAAAIAALPLNDLDASLEEID RAIRDLRLRGIQMYSDVNGMPLDAEALYPIYEKMERYNLPILIHPKRSPALPDYPGEENS RYRAWTKLGWPVASSMAMFRLVYGGVMERFPNLKIVTHHCGGVIPYLAGRMEWNDDFNEM RMGHKDILFPHKPLEYFRRMYYDTANNGYPAGLRCGMDFAGIGKLVFATDLPFCNQKGLR LIRDSIAAVDALDLAPGDRRKVFQSNAVDLFRLPLSTFV >gi|316924482|gb|ADCP01000024.1| GENE 10 11550 - 12518 1095 322 aa, chain + ## HITS:1 COG:MTH970 KEGG:ns NR:ns ## COG: MTH970 COG0111 # Protein_GI_number: 15678988 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Methanothermobacter thermautotrophicus # 49 316 41 308 525 219 41.0 7e-57 MKKHTICLEARPFAAFSDEQLDRLTARPDVEVLDCRGGAVDDPAFMGRLQSADIVISGND LHLDAALLDALPNLRCVAKLGVGLDMIDIPAAVARDIIVCNTPGANDVAVAEHTFALLLG LLRQVPRCDAGMRGGKWEQGAILGREVQGRTFGIVGMGAIGRCVASIAQGFGGRVTGFDP YWPEAFAAERNIERRDLDTLLRESDFVCIHCPLTPQTENLIGTRELGLMKPSSVLVNMAR GGIVDESALYGALTARHISGAVLDAFSQEPPSSMPFAALPNVLLSPHVGAFTEEALEKMS RIAVDQVFQFLDGERPMHMKVR >gi|316924482|gb|ADCP01000024.1| GENE 11 12919 - 15414 2810 831 aa, chain + ## HITS:1 COG:CAC2001 KEGG:ns NR:ns ## COG: CAC2001 COG4231 # Protein_GI_number: 15895271 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Clostridium acetobutylicum # 16 605 3 582 584 212 29.0 2e-54 
MNEIQQLCDEAAGTNKVLQGNIAFAVGCVRAGIHAADGYPGTPSTETIDKGLSQAQDRIT VGWSVNEAVAVSVGVGHAMAGRDCVVTMKIPGLYQACDAFTSVSCYMQPKGALIYYIASD FTPSSTQHVIDPRYLFKSCFVPVFEPRTHQEMHEAAGLAADISRTHRTPVVILAGGLLCH SEGLVKLAEQHTRPLAEVGPLALQHALPGMAHISYDTVMAERMPGLVRMVEESPLNIHYK GAGRRGVITCGATTLLMREYKERFDPDLDILSVAFTNPLPMERIRAFCASIKGEVSVIED GYRFIQEACLAAGLDVSGKDALSPVTEWTPERIAAFLGRETKSGTPAPSVPRPPMICAGC PYRLAALHLGKLHKRGDIEAIFGDIGCNTLIKGLNALDVNLCMGASEAMRMGYVLSKPEA AARCVSIIGDGTECHSGMDATRNTVFRQVPGLKIVLDNEWIAMTGGQPSPSSPCNLAGQP NAFRLDEALKSQGAHVITADAYDKAEIEARVEEGLKLAAEGLFAILIIRGTCIRKVPASA YGQKLAVDKELCRKCGRCHICPGIEASEDGTPRWNNLCSGCVSRTPACLQMCPFKALSVA GSNTETALETVTLPHAPEVIDVPPADSFRRPPRLSLAIRGVGGQGNLFFGKVLAQVAFLA GYDDRNILKGETHGMAQMGGPVISTFGCGEVFSPALVPGTANVLIAMEKAEVLRPGFLDL LEPGGTVLMADTRILPHGLKPEAYPSDEAIAAQLEGYRVVSVDVLSIALNLGDPKGRCAN VAMLGVLSTLPPFDVVPEAVWLQALRGISRKPALWDLNHAAFMAGRGMQKG >gi|316924482|gb|ADCP01000024.1| GENE 12 15860 - 18244 2865 794 aa, chain + ## HITS:1 COG:PAB1230 KEGG:ns NR:ns ## COG: PAB1230 COG1042 # Protein_GI_number: 14521901 # Func_class: C Energy production and conversion # Function: Acyl-CoA synthetase (NDP forming) # Organism: Pyrococcus abyssi # 306 777 9 450 466 156 28.0 2e-37 MNLTVAYEPITELLRNAYAEGRQFLYEYEVYNLLSLSGSETPPKCSFIPRNARLADEEVM AMPGDKAVLKIVSPTIIHKTEVGGVRIVPKTPDKVRSAVRRMLSEVPERYAEWIERHPSG APKSYRGLEGAALQNAIASDLKGVLQVQFMPPDSEAFGNELIVGLRRTREFGMVISAGLG GTDTELYAERFRKGQAIVAALTAMTDGETFFRLFRQTVSYRKLAGLTRGQRRIVTDDQLI ECFESFIRMGNHFSPDNPDAPFVIDELEINPFAFTDYLMVPLDGMCKFSLPEKEPTARPV ARIGNLLHPERIGIIGVSAKRRNFGRTILENIIGSGFDKYRIVILHDGEPDPSGVRCVPD LRSVGEPLDLFIVAVGAEHVPGLVDEVLETGAARSVMLIPGGLGETEESKEIAARMIERI REAHGKGDGGPVFLGANCMGVISRPGNYDTWFIPAEKLQAARRTSWRRTAIVSQSGAFLL NRFSQAPEMSPAYLISMGNQTDLTLGDMMRHFISSDEADVIAVYAEGFKDLDGLRFAQAV REAVRNGKQVIFYKAGRTPEGKNATSGHTASLAGDYMVCENCIRQAGAIMARNFTEFQEL ILLAEAFRNDAVNGKRLGAVSGAGFEAVGMADSIQSDEYSMSLGAYAESTRSAMQSCINE KHLDKLVTIANPLDINPGADDEVHAHMAELMLNDEGIDAVVIGLDPHSPVTHTLAETDVE AFRMNAPGGILERLSAVRERFRKPLVAVVDGGAQFEPFRNALRERNIPVFPVCDRAIATL SLYMEARLMVKGLV Prediction of 
potential genes in microbial genomes Time: Fri May 13 02:13:45 2011 Seq name: gi|316924473|gb|ADCP01000025.1| Bilophila wadsworthia 3_1_6 cont1.25, whole genome shotgun sequence Length of sequence - 8455 bp Number of predicted genes - 10, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 158 - 203 -0.7 1 1 Tu 1 . - CDS 325 - 1050 557 ## COG2932 Predicted transcriptional regulator - Prom 1124 - 1183 4.4 + Prom 1056 - 1115 5.3 2 2 Tu 1 . + CDS 1234 - 1446 253 ## + Prom 1755 - 1814 7.1 3 3 Tu 1 . + CDS 1992 - 2375 557 ## + Term 2399 - 2450 10.2 - Term 2390 - 2432 4.1 4 4 Op 1 . - CDS 2672 - 3538 1005 ## COG1814 Uncharacterized membrane protein 5 4 Op 2 . - CDS 3607 - 4428 615 ## Dtox_0030 hypothetical protein - Term 4442 - 4479 3.8 6 5 Op 1 . - CDS 4563 - 4760 60 ## 7 5 Op 2 1/0.000 - CDS 4797 - 5975 545 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) 8 5 Op 3 . - CDS 6150 - 7202 730 ## COG0502 Biotin synthase and related enzymes 9 5 Op 4 . - CDS 7199 - 8035 613 ## Dtox_0027 Pyrrolysine-tRNA ligase 10 5 Op 5 . 
- CDS 8032 - 8388 263 ## HRM2_15230 hypothetical protein Predicted protein(s) >gi|316924473|gb|ADCP01000025.1| GENE 1 325 - 1050 557 241 aa, chain - ## HITS:1 COG:NMA1884 KEGG:ns NR:ns ## COG: NMA1884 COG2932 # Protein_GI_number: 15794772 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Neisseria meningitidis Z2491 # 33 239 24 231 234 88 29.0 1e-17 MNTIPSESDSNAILDRLLSGAGLHRDAQLAALLGVSPQAVSQARRKRKIPEGWVVKVSQQ CGLSMDWLMFGKGDESTPVASHAPVASSTSQASTPAGAAVPDLDLLCIPLVAASLSAGVG SLQTEADVLDYFAFRSDWLCRKGNPDKMVLMKVYGDSMEPEICHGDMALIDQSKQQIYPH TIYAVGVNEEIYIKQIETLPGHRMLLRSLNERYEPIEVDLRGDMAESVRIIGKVIWWCRE A >gi|316924473|gb|ADCP01000025.1| GENE 2 1234 - 1446 253 70 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAISEDLRSVQATLMQGLNLDNAPLLRLVCSNLQSIAEQIESLENLPLATRDIAGKVGRR RGKRCALGAA >gi|316924473|gb|ADCP01000025.1| GENE 3 1992 - 2375 557 127 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTNKELEVIVCNAIDARKEQGTPFDRDAFAALVKMAQALYPHTTYEHQIDVVARMKSIVD PIDGFHPRWETNELFDACIKAIEAKQPSYTIPYEDVVIFINSIKMALKRLDLKVDTCSCL SALAEKL >gi|316924473|gb|ADCP01000025.1| GENE 4 2672 - 3538 1005 288 aa, chain - ## HITS:1 COG:TM0497 KEGG:ns NR:ns ## COG: TM0497 COG1814 # Protein_GI_number: 15643263 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Thermotoga maritima # 13 288 8 284 284 244 47.0 1e-64 MREETLAIVRRMQDNEATDQRVYAALAKQASLQKNSEILEKMSHDEGLHCAVWGRYTGIE AKADMFRVWLFVVLGKIFGLVFVINLMEGGEDDSAENYRKLMEELPEARSIMEDETRHEA QLAAMIHEEKLSYISSMVLGLNDALVELTGALAGFTLALNDNRMVGMAGFITGVAATLSM AASEYLSKKADTSEKHPLKAAVYTGVAYMITVAFLLLPYIVFESPLVALGFCLFDAALII LGFTYFVSVVRKESFVRGFTEMITISFSVAGISFLIGWAARSWLNIDM >gi|316924473|gb|ADCP01000025.1| GENE 5 3607 - 4428 615 273 aa, chain - ## HITS:1 COG:no KEGG:Dtox_0030 NR:ns ## KEGG: Dtox_0030 # Name: not_defined # Def: hypothetical protein # Organism: D.acetoxidans # Pathway: not_defined # 1 266 1 267 271 125 34.0 1e-27 MTRLTENDIAGIEAEWATYERRLEELTGDDLLTLTARTLGIDPETARSGVRELRVGAIPI 
SSGEGLIGGFADSLASIAGHLGFEADVLPADVPGFQLAKSGGFDLFIWADDDTYLAENIL TGTVGENGRATGRGFATALIRMAARKRLDKRALVLGAGPVGCAGAETLALAGYEVFLCDM DGEKARVACGALSGCTPCTPDDLSGLPLFECLLDAAPTNDFFPLDRLAAGACISAPCVPC IWTLRAPEGASVWHDPLQLGTAVMLLAAAFGRP >gi|316924473|gb|ADCP01000025.1| GENE 6 4563 - 4760 60 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLAGCNGWFAHAIMPSVSNPVPFMMQRAVALLPVVRHGFPCIFSAPEKTFAEIPLPAYSF LHLYV >gi|316924473|gb|ADCP01000025.1| GENE 7 4797 - 5975 545 392 aa, chain - ## HITS:1 COG:MA0153 KEGG:ns NR:ns ## COG: MA0153 COG0458 # Protein_GI_number: 20089051 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Methanosarcina acetivorans str.C2A # 3 323 4 305 363 139 33.0 1e-32 MRVTIVGGGLQGVELCWLARKAGWGTLLVDERPAPPALRLADVFAQCDVTKLGGSGVLTR KVEHQILGTDIVIPALENAAALAALDGWCAREGMLFAFDPFAYEVTSSKLRSRDLFLRCG TPIPQPAARADTKVFPIIAKPSEGSGSRGVRLLSDEAELRAYIPEGFDAEGWVLESYCPG PSLSLEICGTPGNYRIFQVTDLLMDEAFDCRGVVAPTGSSPSFVREMEAALLNLAEMLRL HGLMDLEVIQTPEGMRVLEIDARFPSQTPTAVWLSTGVNLAEHLAACFFPYAPGSGLGAP RFARYEHVLCKDGGLHFLGEHIMGQFGPLEPVNGFCGADEALVGGSLLSGSWAATLMFAG QDAEDVASRRSDCIEHLRELAERQTGKTIFKG >gi|316924473|gb|ADCP01000025.1| GENE 8 6150 - 7202 730 350 aa, chain - ## HITS:1 COG:MA0154 KEGG:ns NR:ns ## COG: MA0154 COG0502 # Protein_GI_number: 20089052 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 34 337 47 344 350 249 43.0 4e-66 MKAAICQPQMGKEALLGLLSAKSEGERAFVFDAAREARHRAFGNKVFLYGFLYLSTHCRN DCAFCQYRRSNSSLERYRKPLGEMLEAAGRLAADGVHLLDLTLGEDPYYVEPAGFHRLLE LVSALKETTGLPIMVSPGVLSAEQLVRLREAGADWYACYQETHNRDLFTRLRLEQDYDRR KRTRLEAAGCGLLAEDGLLTGVGESAEDLADSILDMAREPLDQVRAMSYVPHESTFPSTA GDTLEERRAHELLAIAAMRLVMEDRLIPASLDVDGLEGLAMRLKAGANVVTSIVPSGCGL AGVASKDLDIENQRRSVAAVVRQLGVLGLEPALPGEYRAWVEQRRRGEER >gi|316924473|gb|ADCP01000025.1| GENE 9 7199 - 8035 613 278 aa, chain - ## HITS:1 COG:no KEGG:Dtox_0027 NR:ns ## 
KEGG: Dtox_0027 # Name: not_defined # Def: Pyrrolysine-tRNA ligase # Organism: D.acetoxidans # Pathway: Aminoacyl-tRNA biosynthesis [PATH:dae00970] # 2 278 4 278 278 266 48.0 5e-70 MIFSEEQQRRLGELGAAFEDLQAGFADSAERNRAFQRLESRLVTEQHERLDALCEGPRRP FILELEERLSAVLRTAGFLQVHTPIILSRARLEKMGVFDGSIMEKQVFWIDSKRCLRPML APHLYEYMREVGRLRPRPVRLFEVGPCFRRETQGQRHANEFTMLNLVEMGLPEGTDLNAR LRELGAMVLDAAGIEGWRMTDEDSAVYGETSDFVDKNGMELASSALGPHPLDAAWGIMEN WVGIGFGLERLTMAATGESTMAKTGRSLSYLHGIRLRI >gi|316924473|gb|ADCP01000025.1| GENE 10 8032 - 8388 263 118 aa, chain - ## HITS:1 COG:no KEGG:HRM2_15230 NR:ns ## KEGG: HRM2_15230 # Name: not_defined # Def: hypothetical protein # Organism: D.autotrophicum # Pathway: not_defined # 4 114 2 114 117 117 50.0 1e-25 MSETPTTRPAPKQRTYRKNQFLFALIGKMKLWPSRKGILHGIRTMEIAGDHAVITTHCGR TFTVRDSRTSRASRWLRNKIFADPCPICAIPEWKLKKYQGTVFRKKSGAILRAEGGGQ Prediction of potential genes in microbial genomes Time: Fri May 13 02:14:34 2011 Seq name: gi|316924455|gb|ADCP01000026.1| Bilophila wadsworthia 3_1_6 cont1.26, whole genome shotgun sequence Length of sequence - 17093 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 9, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 24 - 737 647 ## COG1802 Transcriptional regulators + Prom 1085 - 1144 3.1 2 2 Tu 1 . + CDS 1233 - 1940 951 ## COG1802 Transcriptional regulators + Term 2124 - 2160 9.6 3 3 Tu 1 . + CDS 2362 - 3258 1027 ## Ddes_0512 protein of unknown function DUF6 transmembrane + Term 3292 - 3319 1.5 - Term 3280 - 3307 1.5 4 4 Op 1 . - CDS 3460 - 4254 753 ## COG0384 Predicted epimerase, PhzC/PhzF homolog 5 4 Op 2 . - CDS 4282 - 5100 230 ## COG1396 Predicted transcriptional regulators - Prom 5176 - 5235 3.1 - Term 5257 - 5298 8.1 6 5 Tu 1 . - CDS 5330 - 5590 383 ## gi|212704890|ref|ZP_03313018.1| hypothetical protein DESPIG_02957 7 6 Op 1 . - CDS 5810 - 6952 1112 ## COG0477 Permeases of the major facilitator superfamily 8 6 Op 2 . 
- CDS 6961 - 7443 392 ## DSY1001 hypothetical protein - Prom 7512 - 7571 2.2 - Term 7580 - 7617 7.1 9 7 Tu 1 . - CDS 7633 - 9213 1559 ## COG1760 L-serine deaminase 10 8 Op 1 2/0.000 - CDS 9571 - 10728 1123 ## COG2721 Altronate dehydratase 11 8 Op 2 . - CDS 10725 - 11030 409 ## COG2721 Altronate dehydratase 12 8 Op 3 . - CDS 11045 - 12121 1256 ## Amico_1250 NAD/NADP octopine/nopaline dehydrogenase 13 8 Op 4 9/0.000 - CDS 12168 - 13163 170 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 14 8 Op 5 . - CDS 13281 - 14585 564 ## PROTEIN SUPPORTED gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 15 8 Op 6 . - CDS 14587 - 15084 697 ## Dret_0071 DctQ (C4-dicarboxylate permease, small subunit) - Prom 15246 - 15305 3.9 + Prom 15221 - 15280 5.8 16 9 Op 1 . + CDS 15394 - 16161 656 ## Slin_1483 hypothetical protein 17 9 Op 2 . + CDS 16237 - 16878 686 ## COG2186 Transcriptional regulators + Term 16972 - 17008 0.3 Predicted protein(s) >gi|316924455|gb|ADCP01000026.1| GENE 1 24 - 737 647 237 aa, chain - ## HITS:1 COG:TM0439 KEGG:ns NR:ns ## COG: TM0439 COG1802 # Protein_GI_number: 15643205 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 17 210 9 196 214 71 26.0 2e-12 MASHLLEKVAQDGTNSLRERVYRHLEAALREGQLHHGTCLDQDRLCEELGVSRTPLRDAL LRLEAEGFVTIQPRKGVYITPVSDAFVKSACQILGALEADCLDTVFPLLTSRHIRQLEES NKRQEQFLERRQAVEYHAENARFHDIFLSLSSNTLLYQTIGPLRRRLDMLPDHPLAYEQA RAALNDHCRIIDSLKMKNRTAAVSVLRHEHWAPVRFLCPKSAKLSRVASAAAKRSAA >gi|316924455|gb|ADCP01000026.1| GENE 2 1233 - 1940 951 235 aa, chain + ## HITS:1 COG:BMEI0320 KEGG:ns NR:ns ## COG: BMEI0320 COG1802 # Protein_GI_number: 17986603 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Brucella melitensis # 17 211 12 203 230 72 24.0 9e-13 MKHSKKIVKDAALGEEPKVDSLRERVYQYLKDAMSAGKLRYGEFLDQDSICETLEVSKTP LRDALIRLEAEGFVTILPRKGVYINPISLDFIKSAYQIIGSIEADCLNEVFNKLTAYHVR QFEASNERQWELLKNNDYTEYYDENIRFHGIFLSLSENILLEQTLIPLRRRLYDFPRRQY 
SYEWESMNLYTHQRFIDSVKLGNREAAVSIFRIEHWSYEVHKKYFKLYYDFDQKK >gi|316924455|gb|ADCP01000026.1| GENE 3 2362 - 3258 1027 298 aa, chain + ## HITS:1 COG:no KEGG:Ddes_0512 NR:ns ## KEGG: Ddes_0512 # Name: not_defined # Def: protein of unknown function DUF6 transmembrane # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 292 1 294 309 258 58.0 2e-67 MGFFLGLFSSATFGLIPLFTLPLIHDGVSPASVLFYRFLIASLTLGGVLVLRRERFHASA IDLCKLAGMSFMYTAAALLLFYGLNYMPSGVATTIQFLYPVMVMLLMIAFFHEQFSMITA CSIVLAVAGVALLSIGGDSSRPVTLLGVGMMLLSGFCNALYITGIHVAGIRNMNGLVMTF YVLFFGAAFAFANAVGTGTFQPLSSWWEFMMAALLAVITAVLSNLTLVLAVQRIGSTLTS VLGVMEPLTAVFVGILVFNEPFTPALVGGVILISSSVTLVMLGRQVQAIARRFQRKAA >gi|316924455|gb|ADCP01000026.1| GENE 4 3460 - 4254 753 264 aa, chain - ## HITS:1 COG:PA2770 KEGG:ns NR:ns ## COG: PA2770 COG0384 # Protein_GI_number: 15597966 # Func_class: R General function prediction only # Function: Predicted epimerase, PhzC/PhzF homolog # Organism: Pseudomonas aeruginosa # 4 262 6 257 259 224 48.0 1e-58 MRYFVVDAFADKLFQGNPAGVCVADGPLSEGLMQRIASENNLSETAFLARRDDGEYDLRW FTPLEEIDLCGHATLGSSFVVFNELEPDRQSVAFHTRSGVLRVERRGELLEMAFPVRRPE RVPLDPIAPLLIEALGVNPVETWLYRDLLVLVNSQRDVERLVPDVGKMRLLPMGKAVVIT ARGDAGGPDFVSRFFAPEMGIAEDPVTGSSHSMLVPFWADRLGKDRFVARQLSARGGTLH CALTPEAVLISGRVVPYLKGEIFV >gi|316924455|gb|ADCP01000026.1| GENE 5 4282 - 5100 230 272 aa, chain - ## HITS:1 COG:L12334 KEGG:ns NR:ns ## COG: L12334 COG1396 # Protein_GI_number: 15671989 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 6 83 1 78 107 62 37.0 9e-10 MARKHLNIGGKIQARRKAMGLSQEDLAQLTGVSRQSVTKWETGQSAPDLDRLVEVSDVLG VSLDFLLREPQQVSSPPSFAVADSPPFAEDGSPGASGVPRGIDSSLNGMAPAVAVKRVAS CGQVPVAPFGRETDGLLGITKRSSAPLSGISERSSGLPETLPPESASPREPASPLRLARS DDSGFHGSDLRPLLRRAAAICGIVLFMIGGLGLLALWTLSEMYPVQLTDWDGSRHTGLWG FLLAHEIRSLFWLASGLLASGGVLAVGAWRAR >gi|316924455|gb|ADCP01000026.1| GENE 6 5330 - 5590 383 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212704890|ref|ZP_03313018.1| ## NR: 
gi|212704890|ref|ZP_03313018.1| hypothetical protein DESPIG_02957 [Desulfovibrio piger ATCC 29098] # 1 85 16 100 101 67 57.0 3e-10 MRKLIVCAAMALTLGMAGLAQAGCTAEEAQQKAATFAQVVQAKAQQDPEGYAKVMQELQP ELLQIQQKQDVDALCAFYDKAIAKLK >gi|316924455|gb|ADCP01000026.1| GENE 7 5810 - 6952 1112 380 aa, chain - ## HITS:1 COG:YPO1221 KEGG:ns NR:ns ## COG: YPO1221 COG0477 # Protein_GI_number: 16121510 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 11 379 12 382 384 254 49.0 2e-67 MPVSRSDCPPPGRAEQITTRIAYLVLGVGVSSWAALVPYAKARLGLDEAVLGMLLLCVGV GSLLSMPFTGLISGRFGCRKVILVSGFIFLAMLPLLASVESVWLMALCLFLFGASIGMMD VSLTIQAVFVEQAAGRAMMSGFHCLYSVGGICGAGGMALLLGFLAPHLAMLVICLFMIAL LAAFGRHFLPYGSEGETPLFVVPRGIVLLIGVLCFIMYLSEGTILDWGALFMTAERGTEA SRAGLAFACFSVAMAIGRLFGDRIVQALDDARVLLYGSLCAAAGFGLVVAAPWAWASLAG FTVVGLGVSNIVPVLFSATARQKFMPLSLAVSAVTTIGYLGVLAGPALMGFVAHATSLVI VFCITLALMCFVAVVSRAVP >gi|316924455|gb|ADCP01000026.1| GENE 8 6961 - 7443 392 160 aa, chain - ## HITS:1 COG:no KEGG:DSY1001 NR:ns ## KEGG: DSY1001 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 23 153 23 153 163 62 29.0 8e-09 MPGSILFPTEVSMYDTEKAFYRHYEYIQKENSLYSRWFQKRGLSYPQFLILDAVLRRPEG AEPTALAEECFMPKQTVTGLLDQLERAGFIRRERCETDRRRTRVFCLPAGDVFVTGIMEE LDRHEQAALASISAEDMETFNNVYAAIVDGLEKHLFSDPS >gi|316924455|gb|ADCP01000026.1| GENE 9 7633 - 9213 1559 526 aa, chain - ## HITS:1 COG:BS_ylpA KEGG:ns NR:ns ## COG: BS_ylpA COG1760 # Protein_GI_number: 16078649 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Bacillus subtilis # 277 519 48 288 300 191 48.0 3e-48 MTHCTACRAARPQSARLLSLFDSLGPIMSGPSSSHTAGMARIGRMGRSLVGGTPERITLH FFGALSHTYKGHASDSAVVAGLIGEREDSPNVRHALRLAAERGVEVEIRTHPDSGRNPNT VSMELVRGGRTYAVAGVSVGGGEIEMTELEGFPVCLRGNEDGALFIGPDGLGRAVFEERL 
GALSGFSCVKDGERALYLCLAEKPFPEGMDMPGLEMFPVRNILGNKLADAEPLFSTLAAM AEMAGGDLPGLIERYEARRSGVDRDTIRAAVLGSWEIMKASMTDGLAGKSDMLAGLVPGD AGFRLARRVESGQALSGRTIGMAVARALAVMENNGSMRCVVAAPTAGACGVLPGAFLSAA EERGLGDDAIVDGLLVAAAVGVLVAMRAPISGAIGGCQSEIGVASAMTAAGLAQLGGGTP EQVIHAAAIALKNLLGLICDPVAGPVEIPCIKRNAVGVSNAFAAADMALAGIASRIPPDE VVDALINVQGLLHPDLRGNLRGGLASTATGRALKDEWYARMKRMQA >gi|316924455|gb|ADCP01000026.1| GENE 10 9571 - 10728 1123 385 aa, chain - ## HITS:1 COG:STM0650 KEGG:ns NR:ns ## COG: STM0650 COG2721 # Protein_GI_number: 16764027 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Salmonella typhimurium LT2 # 6 379 10 383 390 262 40.0 1e-69 MSSLQAFPRADGSVGIRNIVLVVSTVTCANTVVNHIAWKTGAVPLTHERGCVEGEQSFRR TMLALTTFAKHPNVSGVLIVGLGCEQIRTLDLQVAIGGGKPVHAILIQEEGGSVEAEARG CELVREMQRAAQDLERQPCPLSGLVVGVQCGGSDWTTAIAGNTVIGAMTDLVIAAGGSVL MSEVPGIPGCEHIVASRAVSREAGEQILGMVDELRAEFLRAHGQPIEAVNPTPGNKAGGI TTLVEKAMGNIKKMGTSPVQGVLKLGERPPHPGLWIVDNRANGPDPVNLAGFAMAGASAT VFSTGRGSPVGSPVMPVVKLTGNPHTYAKMPGLMDFNAGVVVDGADIGETGRALYDLLLE VAGGRPTRSEENGDYEFSIPYEEAR >gi|316924455|gb|ADCP01000026.1| GENE 11 10725 - 11030 409 101 aa, chain - ## HITS:1 COG:CAC0696 KEGG:ns NR:ns ## COG: CAC0696 COG2721 # Protein_GI_number: 15893984 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Clostridium acetobutylicum # 4 81 5 82 492 68 42.0 3e-12 MKALRIEPADNVAVVAQDTRKGDVLRHAGGEVTALEDIGLGHKIAVEAVAQGGLVRKYGV PIGRASAPIAAGAFVHTHNLEDITEELCRQYVAAFRDGRGA >gi|316924455|gb|ADCP01000026.1| GENE 12 11045 - 12121 1256 358 aa, chain - ## HITS:1 COG:no KEGG:Amico_1250 NR:ns ## KEGG: Amico_1250 # Name: not_defined # Def: NAD/NADP octopine/nopaline dehydrogenase # Organism: A.colombiense # Pathway: not_defined # 1 355 1 359 359 259 37.0 9e-68 MKKIAVLSAGNGGQALAADLALRGHDVALYDLPRFAPVIEAIRRRGNTIELQNKITGTAT IRLVTTDIAEALDGAEVVYFTAPSYGQKAFFDLAVPALSDGQVIVLMPGNYGTLALKAAL CEAGKDVLVAETDNLPYACAATEPGVVNVRGVKKAVTLAAFPAGDYAAVEAAVDGAFCTG 
WRKGENVLATSMSGVNMVVHCAPMLANAGRIESEGGHFEFYYAGMTPAVCRLIEATDRER LAVARAYGLDLVSTAQTFRNQYGVEGETLYDVLQANPAFAGFAPKTLHHRFLTEDTPYSM VPMAALGRLAGVPTPIMDALIALLGELLGEDYAVTGQTAERMGLAGMTPEAIRNLVSA >gi|316924455|gb|ADCP01000026.1| GENE 13 12168 - 13163 170 331 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. PR1] # 27 289 31 291 328 70 26 1e-11 MKKVMSVLALCGLLALSGGSAFAAQWKLAHIRPADSVIERDLQAFAKEVREATDGRIDIR IYGSSQLGDYTVVQEKVSLGSVEMSCESVSTQVDKRFLSYILPYVAKDYATAKKNFGTGT PYAKYCANLFDKQDIMVLANWPVYFGGIGLVKAPEAPADPNASHHIKVRVPTMKTQEALA DGLGYQATPLPFAEFFTAAQTGMVSGIFGGGAENYYASFRDLLKCYIAANTHFENWPLLI NKELFESLSAEDQKILLEKSAAFEARRWEVAEADQAAYEKKLEEAGWTVIRLTPEQQNAF AEKGRKASWPELKKAIGDKAYEEVISTFVLE >gi|316924455|gb|ADCP01000026.1| GENE 14 13281 - 14585 564 434 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 [Lentisphaera araneosa HTCC2155] # 1 432 1 429 432 221 30 2e-57 MEPVTLAVVSLAVLMLLLSFGVPIVFCFSGALAFMSIVGGVTMKGMMMWGLQQILSPTLL CIPLFIYAGSLMSESGIAKYLLEFVDVFVGRIRGGLGFVATLTCAILGAISGSTFTGLAA TGNMLIPEMVSRGYPRAFATALITCSSILGVLIPPSTPMIIFGWVTGTNVLACFLSTVGP GIVTIIIFCGINFVLSRRFDLKLMPQLSKTEKRRQIAVRGWKALPALIMPVLILGGIYSG AMTPTEAAAVAVIYALPVGFFVYRGLNLRNAFGCTKDSAVSVGSIMIMIFCSMMLSQTYV VLQIPQAIVESVFSITDNSFLILLLINLILLFVGMIVNDTTGIILVAPLLLPLAKAIGLD PIQYAAIFGVNLAVGSLTPPYASLLYLGIRIGKVDFVEILPYVGLFLLGYVPVMLLTTYW PDVSLFIPRLFGFI >gi|316924455|gb|ADCP01000026.1| GENE 15 14587 - 15084 697 165 aa, chain - ## HITS:1 COG:no KEGG:Dret_0071 NR:ns ## KEGG: Dret_0071 # Name: not_defined # Def: DctQ (C4-dicarboxylate permease, small subunit) # Organism: D.retbaense # Pathway: not_defined # 14 156 13 155 162 119 46.0 3e-26 MLDKVLDWINVFSLLSVAVLMFVQVILRYVLKMPLMGIEELCYFPTVWLYLFAAVKASSE RGQLVARVLEIFCKRQRSIFLLRGIAAFASSAILLWLTWWGYDYLKYALRLQKETASLYL PWIYAEAAVFVSMFLMTLYTLLELRDMITLYRTTPASLPVEKEGD >gi|316924455|gb|ADCP01000026.1| GENE 16 15394 - 16161 656 255 aa, chain + ## HITS:1 COG:no KEGG:Slin_1483 NR:ns ## KEGG: Slin_1483 # Name: 
not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 1 236 1 235 250 127 34.0 5e-28 MLFLLIKLFTTPALIWASTLAARRWGPVIGGGLAGLPFISGPASLFIAVQYGTPFAASAA ASSLLGAVASCCYCLAYAHSARRFGWAGSLALALLAFLLCAFLFTRASCGLPVAVCLSIA VPAACLRLLPDIPGTTDMRRVQPPWRLPVQMFCGGLSVLILTELAGVVGEQWSGILLTFP IISSILTPFAHLSFGARAAVLTIRGLLAGFFGTSCFITIIAVCLEPLGIASGYCIAALGA LLTSILIMHCVNKRR >gi|316924455|gb|ADCP01000026.1| GENE 17 16237 - 16878 686 213 aa, chain + ## HITS:1 COG:CAC2546 KEGG:ns NR:ns ## COG: CAC2546 COG2186 # Protein_GI_number: 15895808 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 198 15 226 231 98 32.0 9e-21 MQQITAAIQSGQWKPGSKLPGEIALAEMFQVSRTCIREVLKALAYSGVVESRSGVGTFVK ELPSEAPSPAVDLLSSTDYTQILEMRKLLEGQAAYWATERATPEEIAELESILKQEGLSL KDIHTRFHNGVTALSKNPILIHMLAEFQEQFKKQRELNFVILPDEDRLEHWKVLEAIKSG SPAKARRAMHQHIDYIWKKRPHLFPGKKGPDKM Prediction of potential genes in microbial genomes Time: Fri May 13 02:15:25 2011 Seq name: gi|316924436|gb|ADCP01000027.1| Bilophila wadsworthia 3_1_6 cont1.27, whole genome shotgun sequence Length of sequence - 36821 bp Number of predicted genes - 21, with homology - 17 Number of transcription units - 17, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 137 - 1864 2157 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] - Term 2205 - 2233 0.3 2 2 Tu 1 . - CDS 2276 - 3910 2164 ## COG1620 L-lactate permease + Prom 4048 - 4107 2.6 3 3 Tu 1 . + CDS 4241 - 4741 224 ## gi|212702275|ref|ZP_03310403.1| hypothetical protein DESPIG_00286 + Term 4762 - 4794 1.7 4 4 Tu 1 . + CDS 5138 - 6043 241 ## PROTEIN SUPPORTED gi|30995401|ref|NP_438934.2| transcriptional regulator + Prom 7126 - 7185 4.1 5 5 Tu 1 . + CDS 7292 - 8932 2230 ## COG1620 L-lactate permease + Term 9026 - 9065 9.1 + Prom 9643 - 9702 5.9 6 6 Tu 1 . 
+ CDS 9840 - 10613 575 ## gi|302861788|gb|EFL84723.1| sigma-54 dependent transcriptional regulator/sensory box protein + Term 10638 - 10687 17.1 - Term 10623 - 10677 7.5 7 7 Tu 1 . - CDS 10768 - 10977 74 ## 8 8 Tu 1 . + CDS 11633 - 13513 1710 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit + Term 13620 - 13667 9.3 - Term 13616 - 13646 1.0 9 9 Tu 1 . - CDS 13654 - 14124 152 ## - Prom 14270 - 14329 4.9 + Prom 14193 - 14252 4.4 10 10 Tu 1 . + CDS 14427 - 14681 214 ## - Term 14780 - 14836 16.3 11 11 Tu 1 . - CDS 14950 - 16611 1324 ## COG4452 Inner membrane protein involved in colicin E2 resistance - Prom 16850 - 16909 3.7 - Term 16886 - 16930 9.9 12 12 Op 1 . - CDS 16931 - 17434 363 ## COG0602 Organic radical activating enzymes - Term 17493 - 17518 -0.5 13 12 Op 2 . - CDS 17601 - 17768 183 ## CDR20291_0106 anaerobic ribonucleoside triphosphate reductase - Term 18032 - 18087 10.0 14 13 Tu 1 . - CDS 18097 - 20364 2982 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase - Prom 20452 - 20511 2.7 15 14 Tu 1 . + CDS 21629 - 21922 298 ## Dbac_0602 response regulator receiver protein + Term 21929 - 21967 4.0 16 15 Op 1 3/0.000 + CDS 22043 - 22414 95 ## COG2230 Cyclopropane fatty acid synthase and related methyltransferases 17 15 Op 2 . + CDS 22305 - 22793 420 ## COG2230 Cyclopropane fatty acid synthase and related methyltransferases 18 15 Op 3 . + CDS 22808 - 23077 87 ## + Term 23193 - 23235 1.4 + Prom 23128 - 23187 3.4 19 15 Op 4 . + CDS 23357 - 35107 15403 ## COG2931 RTX toxins and related Ca2+-binding proteins + Term 35173 - 35220 18.5 - Term 35248 - 35283 -0.9 20 16 Tu 1 . - CDS 35480 - 35929 234 ## Dvul_2922 TraR/DksA family transcriptional regulator 21 17 Tu 1 . 
- CDS 36108 - 36821 521 ## COG0546 Predicted phosphatases Predicted protein(s) >gi|316924436|gb|ADCP01000027.1| GENE 1 137 - 1864 2157 575 aa, chain + ## HITS:1 COG:YPO1358 KEGG:ns NR:ns ## COG: YPO1358 COG0028 # Protein_GI_number: 16121638 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Yersinia pestis # 3 571 2 570 573 545 48.0 1e-154 MSKRTVAEVLVDELAEAGVTRLYGLVGDSLNPVSDALRRDGRIRFIHVRHEETAAFAAGA EAQLSEKLAACAGTSGPGHVHLINGLYDANRSYAPVIAIASHIPGSAIGTRYFQETHPDQ LFNECSHCNELVSSPSQMPHVLRRAMQTAVSKRGVSVIALPGDVAAMPMNEEPSHGVIPP PPAPAANAHDIQLLADEINGARKVTLYCGSGCRNARKEVMALADKLKAPIGHAWRGKQWI EPDNPFDVGMTGLIGFGGAYEAMEHCDLLILLGTDMPYSAWYPKKPRIVQLDIRGEHLGG RTRIDLGVVGDVAQTLRALLPFVEPRDDSRHLDEAIANLKKSREHLNAYVEHVSSEGRPH PEHLTSAIDANASDEAVFTVDTGLNDVWAARYITAKVGRNIIGSFNHGSMACALPMSIGA QLLYPERQVIALCGDGGMTMLMGELLTLVQYNLPIKVIVYNNGALGFINLEMRTAGYPEF QTDMKNPDFARMAEVIGMKGFRIEKGADVEPVIKAALATPGPVLVDALTDPAAIPLPPTI TAGEAKGLALGLGKLALMGRFSATMDMIKSNIRQF >gi|316924436|gb|ADCP01000027.1| GENE 2 2276 - 3910 2164 544 aa, chain - ## HITS:1 COG:BS_yvfH KEGG:ns NR:ns ## COG: BS_yvfH COG1620 # Protein_GI_number: 16080472 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Bacillus subtilis # 1 544 1 563 563 448 46.0 1e-125 MPWVQQYAPVGGVVASACLAAIPLIILFYMLAVRKAKGHVAAAAGLIGALIVAVAVWNMP VSLAISSTLNGAAFGLFPIVWIVITAVWVYNMTVESGEFEIIKDSLARLTDDRRLQALFI AYAFSCFIEGTAGFGTPVAIAAAMLVGLGFTPLWGAGIALIANTAPVAFGAIGVPLIVAA SVSGLDQMTVSAIAGRQLPILALIVPLWVCVTMCGFKRSLEVLPAIIVGGVCFAGAQFLL ANYHGPTLPDIGSAIATIVGLVLLLKVWKPSRTFRFEGEPESNLSGSGYPASVVLRAWGP YIVLAVFVFFWGLPQFKNILNAVPGANLSFGWPGLHGEVMKTAPIVAQDAVYGATFAFNW LSAGGTAILLSGLVSVPMMPNYGFGKAIACFMRTLRQLTFPIVTIAMILGLAYLMNYSGM SSTLGLAFTHTGSLFPFFSPLLGWLGVFLTGSDTSSCALFGGMQKDTATAVGMSPELAVA 
ANASGGVTAKMISPQSLSVATAATNMVGQEGNLFRFTIGHSIAMTLVLCVLTYLQAGPLS FMLP >gi|316924436|gb|ADCP01000027.1| GENE 3 4241 - 4741 224 166 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702275|ref|ZP_03310403.1| ## NR: gi|212702275|ref|ZP_03310403.1| hypothetical protein DESPIG_00286 [Desulfovibrio piger ATCC 29098] # 1 161 1 162 165 137 43.0 3e-31 MTDNRATGWKIPLLFCGVILSIVAVAALFRAHAPEPPAVPQALLKEAKGIRIDLESDPEG QSWKARIASAASGFSTQADKDGRLGEIVLTTAENKRFDASCTAAVLIRDDGLRDGLMRKI ANAASADCASLPWGVFAMHGMRDPQAQAEASALLTQRWKECHEGRE >gi|316924436|gb|ADCP01000027.1| GENE 4 5138 - 6043 241 301 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|30995401|ref|NP_438934.2| transcriptional regulator [Haemophilus influenzae Rd KW20] # 1 245 1 253 301 97 26 1e-19 MLNIRTLQILVEVARQGGFSRAAQTVHAVQPTVSKAVQAVEDRLGVRIFERANNGVRLTV EGEIVYRRALNILQEFEALEADVAALHKLEKGVLRIGFPPVASGILFADLLSKYRQLYPG INISLQEQGCASLEPMVLSGDLDLAVTLLPVPPNFSWLQIRDEPLMVIMPPDHPLADRKR LKMDELADNAFIGFEQGFLLNDRIKAVCRQRGLKLNECLRSGQLDFIMTLVATGIGVALL PRLELERRELPNIHIALLDEEELRWRAAFVWRRGAVLSSAAQAWLKLLPGGEEALASERL A >gi|316924436|gb|ADCP01000027.1| GENE 5 7292 - 8932 2230 546 aa, chain + ## HITS:1 COG:STM3692 KEGG:ns NR:ns ## COG: STM3692 COG1620 # Protein_GI_number: 16766977 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Salmonella typhimurium LT2 # 3 546 4 551 551 469 49.0 1e-132 MAWTQVYDPAGSVVLSALSAAIPLMVLFYMLAIRRAKGHVAAIVGLAVAFVLAVALWGMP FGTAVSSVAYGMANGLFPIVWIVITAVWVYNMTVESGEFEIIKDSLARLTDDRRLQALFI AFAFGAFLEGTAGFGTPVAITAAMLAGIGFNPLYAAGLCLIANTAPVAFGAIGVPVIVAG QVSGFDPNLISAIVGRQLPFLSVIVPLWLCVTMCGFKRSMEVLPAILVAGLCFAISQFVF SNYHGPTLPDIMSAIITLVGLVILLRFWKPATIWRFEGEKPTVLTGKGYSFGEVIRAWIP FIILAVMVFFWGLPQFKAFLDGISGSIATKGFAWPMLDGMVSRTVPVVPAETPYAAFFKF GWLSAGGTAILLSGFFAVPFMPKYSFGKAVACFFSTIYQLRFPVLTIATILGLAFLMNYS GMSTTLGIGFTKTGSLFPFFAPILGWLGVFLTGSDTSSNALFCGMQRSTAQAVGMPPELA VAVNSSGGVTGKMISPQSISVATAATGMIGQEGNLFRFALGHSIAMTLFICVLTYLQATV WQWMLP >gi|316924436|gb|ADCP01000027.1| GENE 6 9840 - 10613 575 257 aa, chain + ## HITS:1 COG:no 
KEGG:no NR:gi|302861788|gb|EFL84723.1| ## NR: gi|302861788|gb|EFL84723.1| sigma-54 dependent transcriptional regulator/sensory box protein [Desulfovibrio sp. 3_1_syn3] # 32 250 38 251 255 92 27.0 2e-17 MRYSPEEIFKKFINSVLSDASLDFFVKRAAVFLSMMMPVSSLFIFRSVEGSITKLAECNK TPAINISDHFAISPESMTRLRTEKHFFSEGACNLKIFTGHETCAYADVHRSVYKTDTSAV YIPLKFNMLRETSVTMVVLSMGVDNYTQEHLDICSCLQPVFIDNFLNILSENSSDRAFDL PSPKSKQEPNSKRNKDNAEPFQTFNEVTINYIIKALNRTNGKISGQDGAAELLGLNPSTL WSKIRKYRINVKGLERE >gi|316924436|gb|ADCP01000027.1| GENE 7 10768 - 10977 74 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDAWLDWIGFLGIALVAGWFFFRKQMAENELTEREEDARKNAPLFSSGEGLGGQEGDDGD DDGGGDGDD >gi|316924436|gb|ADCP01000027.1| GENE 8 11633 - 13513 1710 626 aa, chain + ## HITS:1 COG:slr1860_3 KEGG:ns NR:ns ## COG: slr1860_3 COG2208 # Protein_GI_number: 16330246 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Synechocystis # 407 619 3 241 244 120 29.0 7e-27 MLEEQRRSSVNTVALATFNVMDRYQILFRSKADQAVALQHNLSWLAKLITTLVHDPSERF KIMQNFPLPEGISLMAIDPSGEIKLVRGDLEKDDNPYLLRDIKDRPVGPTMRALAEKGLP ATCFVQWDAGGKKQRLYGTHLYLPGEGGLLVGLWANIDKLDEHQQRLKEDSIRELQNEFG KQRIGPSGFLFVTDSKGTPVIGPEGMNTTLSGINPKTGNTLIADIRNASKTPENPVEASI QSLIKRAPSRDALLFVRHVKPFDWYVTGVVYTDEASAPGKQLSFLLIAAILAATLFILPV ALLMVTRLTSPLAKLGDYARKLPEQDFMEDARPTPLLAALADGKHGGEIASLANSLMFMD KTLRSRVRELMEATSSRERLEGELSAATEIQMGFLPKPLPPDVAAGRFSLAASLVPAREV GGDLYDFFMLDDRHLCFIIGDVSDKGVPASLFMSMTLTLIRSRAGSHPSPEHLMAEVNEN LARDNAKCMFVTLFIGVLDLDTGELLYANGGHNPPLRLGDDGVAWLKGISGPVVGAMEGM NYTLLRTVVRPGETLFLYTDGVNEAMNAEGDVYSNEAMFRALASSPSREPASVLRHMLDD VSVHVNGNTASDDMTILCVRYFGGKR >gi|316924436|gb|ADCP01000027.1| GENE 9 13654 - 14124 152 156 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTDLGMMTGKAALRLAKEGSGLTRDEIAERLGVSHSVTKRYFNINDTYMPSLEMIPRLCL ALGNDILMRWLEARLQGGESFSREEIEEEMVRAANAFEELRTLVNEDPPPSVRDMQHAVN RLILELGCIQEILSGHSLHRSRGALPLCPWWKFWKK >gi|316924436|gb|ADCP01000027.1| GENE 10 14427 - 14681 214 84 aa, chain + ## HITS:0 COG:no KEGG:no 
NR:no MGIPQKSLVIGACEIACHYPELSLNDAAGDALQLAEKIRLYGIEENQKKETVFIAACRFV SADKDLTPQKAVEKALRLWDIIEA >gi|316924436|gb|ADCP01000027.1| GENE 11 14950 - 16611 1324 553 aa, chain - ## HITS:1 COG:FN1985 KEGG:ns NR:ns ## COG: FN1985 COG4452 # Protein_GI_number: 19705281 # Func_class: V Defense mechanisms # Function: Inner membrane protein involved in colicin E2 resistance # Organism: Fusobacterium nucleatum # 117 546 1 442 454 200 29.0 6e-51 MFEVMLIGLLVLVLILIPLCIGFVAIVRWIMRRTRRVSGQPPVTAAEPLPPQGAAEKGVV RTESLTAEDTFAFETSGEQGAPSEKAREPQLPVEDVVETPHNTDKAVLQDQLKDSMMFRY AVVAGIALLLLIPLLMVESLVGERSSLYREVVADISRTWGGLQQLSGPYLLVPYTERVER ERVVPVEGGKDRLVRETRLEASYFVILPSRLSFDATLDPESRRRGIYRALVYTSEIGISG QFTLPSREALTRIVPALVDVDYTRSFVVTGLSHPSALREASPFVWAGVPHSAEPGTQPFE NLPSGFRVPIALNAGQGTFDFSQRLALSGSGGIRFATAGETTEIKVRSPWPHPSFQGQVL PASYETSNTGFSAVWSVPSLARSYPNLGTLHTWPRHFTEFAVGVDLYQAGTHYKLVERSV KYGVLFIGLTFLAFIVFEMGLGARLHPVQYGIVGLSMVVFYLVLLSLSEHLAFLTSYLSA SGCIVLMVSCYVGFALRNYKEGIGIGVLLVALYSLLYTILQMEDYALLMGTALLLVMLAA LMVVSRNLARGNK >gi|316924436|gb|ADCP01000027.1| GENE 12 16931 - 17434 363 167 aa, chain - ## HITS:1 COG:CAC0481 KEGG:ns NR:ns ## COG: CAC0481 COG0602 # Protein_GI_number: 15893772 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Clostridium acetobutylicum # 4 149 2 150 153 129 42.0 2e-30 MSDEIRISGIVEESIVDGPGLRYVLFTQGCPHHCKGCHNPTTHPFDGGRMVSPDWVFADV RKNPIVRGVTFSGGEPFVQSGKLAPLAERLRAAGYNLTSYTGYLYEELLADSRHMPLLRQ LDILVDGPFILEEKSLIIRFRGSRNQRIIDVPRSLAEGRVVLHPAMD >gi|316924436|gb|ADCP01000027.1| GENE 13 17601 - 17768 183 55 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_0106 NR:ns ## KEGG: CDR20291_0106 # Name: nrdD # Def: anaerobic ribonucleoside triphosphate reductase # Organism: C.difficile_R20291 # Pathway: Purine metabolism [PATH:cdl00230]; Pyrimidine metabolism [PATH:cdl00240]; Metabolic pathways [PATH:cdl01100] # 14 47 749 782 783 68 91.0 7e-11 MNTESKKVLVGEGVKFERIRRITGYLVGTVDRFNNAKRAEVEDRVKHCKIGCCGN 
>gi|316924436|gb|ADCP01000027.1| GENE 14 18097 - 20364 2982 755 aa, chain - ## HITS:1 COG:CAC0480 KEGG:ns NR:ns ## COG: CAC0480 COG1328 # Protein_GI_number: 15893771 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Clostridium acetobutylicum # 4 734 3 674 702 518 42.0 1e-146 MVSSIRKRDARIVPFDVNKIAQAIFRAASSQGGKDYVLSLKIAAEVLERLNEQFEDAVPD VEQVQDLVEKTLIKSGHARTAKAYILYRHQRTKIREMNTRLMTTIRDLTFKDAKDNDLKR ENANIDGDTAMGMMLRYGSESSKQFYEMYILNPKHSLAHKNGDIHIHDLDFLSLTTTCCQ IDIKTLFKGGFGTGHGFLREPNDIRSYSALACIAIQANQNDQHGGQSIPNFEYGLAPGVA KTFIKELCLILDLYNMESKDAVKAGLRAYIEENESILDEKGLAFVREILTQHAPALNIDF AVKKALEYTEKATYQAMEALVHNLNTMHSRAGAQIPFSSINYGTDTSPEGRMVIRNVLLA TEAGLGNGETPIFPIHIFKVKEGVNFREGEPNYDLFKLACRVSAKRLFPNFSFIDAPFNL AYYKPGHPETEIAYMGCRTRVMANMNDPSREIVNGRGNLSFTSINLPRLGILADHNVDAF FAELDKMVDLVFEQLLERLEIQSCKKCKNYPFLMGQGIWIDSERLGPDDEVREVLKHGTL TVGFIGLAECLKALIGKHHGESEEAQELGLKIVGFIREKADKKAAETKLNFSVIATPAEG LSGKFVRLDRKRFGSIEGITDKDYYTNSFHIPVYYSISAYNKIRLEAPYHAMTNGGHITY VEVDGDPVNNLEAFEDLIRAMKDMGVGYGSINHPVDRDPLCGYTGIIGDVCPRCGRHDGE GVDIEVLRKIKGYIRDVRSEEERLEAEDKMPNMVI >gi|316924436|gb|ADCP01000027.1| GENE 15 21629 - 21922 298 97 aa, chain + ## HITS:1 COG:no KEGG:Dbac_0602 NR:ns ## KEGG: Dbac_0602 # Name: not_defined # Def: response regulator receiver protein # Organism: D.baculatum # Pathway: not_defined # 8 91 1 84 91 86 52.0 2e-16 MQGVGKVVSYTNADRILPRELLDAIQQYADGVYLYIPRKAERKRAWGEATDSRRERLARN RELYEKHLGGAPVHKLAEEYYLSAKTIYKILASMRQA >gi|316924436|gb|ADCP01000027.1| GENE 16 22043 - 22414 95 123 aa, chain + ## HITS:1 COG:MA1834 KEGG:ns NR:ns ## COG: MA1834 COG2230 # Protein_GI_number: 20090684 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cyclopropane fatty acid synthase and related methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 6 94 7 94 240 101 49.0 4e-22 MQPTSTASNPLFDRDFLLATMMGPNCVRFAEELTANIPLSPHMRVLDLGCGMGLSSIYLA RTFGVRVFAADLWINPSDNFARFRDFGLEDAVTPPPGRRTCPALCRRILRRRDLYRCLPF FRL 
>gi|316924436|gb|ADCP01000027.1| GENE 17 22305 - 22793 420 162 aa, chain + ## HITS:1 COG:MA1834 KEGG:ns NR:ns ## COG: MA1834 COG2230 # Protein_GI_number: 20090684 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cyclopropane fatty acid synthase and related methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 1 149 88 238 240 170 49.0 1e-42 MKTPLLPLRAEGHALPFAEGYFDAVICIDAYHFFGCDPEYLDAHIVPLVKPGGFIAAAIP GLRQEFTGDIPDELRPYWKEDINFHSAGWWAELWKQSTRAAVRDAFSLRSHKAAWEDWLQ CDNPYAQRDAGMMEKAGWRYFDSIGIIATVSPAACSEPYLLL >gi|316924436|gb|ADCP01000027.1| GENE 18 22808 - 23077 87 89 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKACRPCGGTASPLAAPPWKYRSPSLHTSDRHSSVSGTNPDGCIPAIPHQPKPFAGASHL RQKGIPGEPGTQQKAAASGETLSSEVKKP >gi|316924436|gb|ADCP01000027.1| GENE 19 23357 - 35107 15403 3916 aa, chain + ## HITS:1 COG:all2654_2 KEGG:ns NR:ns ## COG: all2654_2 COG2931 # Protein_GI_number: 17230146 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: RTX toxins and related Ca2+-binding proteins # Organism: Nostoc sp. 
PCC 7120 # 3714 3806 22 114 203 80 52.0 7e-14 MADIILNKPEAGTQAVFEAAGDSRIDLNFPTDQATLERSGNDLIFRFDDGSTVVLRDFYT AYTKDSMPDFVIEGTPIAGEQFFTALNEPDLMPAAGPAANAASADGGRFREYADDALING VNRLDGLDLSSNRAFFPERDPWGGLRGDDTPNYAPTLSVSGSLGVIESGVFPGGNELYEG VPSMSGRATGTDANGDTLSFGFIDANGAQVTSIVTPYGVMAMAPDGTYTYTIDNADPDTN GLALGETRTETFTVYVSDGRGGLATQEITVTLTGTNDRPELSIANAAQGIHEDTASVGGT FAVQDPDSDSGQNQTFHIEGGSNTPAADGTSPSDGSHSATGSTDATFTTDYGTLTLDPAT GQWTYALNNASDKVQQLNAGETKVETFEVTVTDEHGATSTQTITVTITGTNDIPVIDTDQ SNFHLDFKEQGVYQPSENGDGNTPTTPGGTGEGQHQTGTLSGRIFASDADKENGAGSTEH DVNKLNFHVEHAGSSLTDGGASTTVTGTGTPGTGDVVYAYTSAYGTLTFRADGSYEYTLN NKNPGEAGADGNAVNNLALGQTVTETFTVYVTDAQTGRSVPQTITVTINGTNDVPTLDLS NDNLNDLLGGDGNLHVVEDGVGREDANTPTTDPGKENTSFTGHTTDTGTASGNDVDAGHI LYFGAAAGENTKTFDPSVFNTVDSTATGGAASSVVAGGQYGSLTINSNGSYTYTMKGEGE NVSFELDGKTYNSLDQLAEGDTIYETFTIYVRDEHNAWTAKTVTVAIHGTNDIPTLNITG SDWNITQGGDLSIDGTFTVTDNDRDAGTDQTFHIAGGKDTSGTGTDGAHGTDGDTNATFT TDYGTLTLDPATGQWTYEANPDAIKGLGKDETKIETFEVTVTDEHGATSTQTITVTLNGI NDAPWLGQTSIDLKEEGVILTPEQPGETSNTETHEAPGNTGEAGQDEHRTLVEGELPWKD DDINDKPIFGISGLIGATDGILNVTIKNGDPDASNNVDVKILSSTTDPNTHIQTIVTNYG TLTLDTQTGKFTFDISGSDADKLAAGEELEFSFHTTVNDQNGGNADNRLNVTIRGTNDRP TLDLVEPTHGDNVTVVTDDKTGEVKFDITEKADVANDTTVSGTLKSDDDDRGANLRYGVA LGKQDVESEAGRNLAFGSGSDGKPGMGEPLHQTTDGKIVIEGRYGTLTIDPESNTYTYKT NENADRLGLDADGNPQTGTDEFTIYVRDEHGAWTAKPISVTVTGSNDTPTITADDAEHWV KEAGVVDTSTDHGTTTDTAKTPDPSDDSRELTDADTSLSRNEISGQVHVKDTDTTDTLTL NIGAKEGSGTTLIGDPKTDANGNITLETEFGSIILHKDGTYTYTIDEDKTQSLSQGQTEK EIFTITVSDGHGGTASVDITINIVGTNDRPTLTLTPTSDTVVSDPGYDKNHTEVAEDLTV TGTFEGADPDSNPTLEYGVSTSAGNRDTAFDANGNNPGMGGGHHSATGTYGSLTIDPSTG EYTYTLDTAKGGAADKLGLKPDGKPEQGYDTFTIYVRDEHGAWSEQTVTITVNGSNDAPA IAKTENTLTVTESGFKADNTAVDTTHDVSKGSVNATDVDTSDQGKLTYYFSDKAHNPVTF GKGDVIGHLTLADGTKTEITVTSVKSDGTIVTDYGTFHLDTKTGEYTFTKTESTGNATDQ LQLGDKVELDFSISVKDSHGETASSTHDVTVVINGSNDRPSATMQGITVKEAGVHDGNTA TTADTDGTLGAGEHRVTSGTLNITNLKDVDDDISKGFGTGEDQFKISLRGSGNCGTPSHN ADGTWTMTHLLSNGGDFNNVRATLFNSNFPKDAFDKLEAQLRAEGLLGQNQDLTYGNAAS ILSQVALGTLTVNPDGSYSFTLPPDGSAGSMIVNMFGADNSSNRTINFSVTDPHGGVFNG 
SFGVTIKGTNDRPELELLGGDDHRLVISTGTTPDGNATTHATITMTEDDKSFSANAKGTD VDFGSRLTYGIAGGHIGDADSADINDLKAAFDGDKGMGNAHTRIETEHGVFTIDSSTGKY TYTPNEDLVYGEKYTDEFTIFVRDEKGAWSQQHVTINVTGGADAPILVGKLPNAIMAEIT EAGVVPNTNTDVDGSIHVNGQLVDGHFNSGGHALGSFEVKQVDTGEGAGHLIAGFVVGGK FYAGDLHTDYGTLHAEVATENGVSKIVYSFILPEPGTKEAANLDALDAGQREKLFDNLKV GVYDSAHESLVNGGANADGSFNINTGNSNLIPTQDVDVYVKGTNDRPVFTDENGNVIAEV VTDANGKTFTKITSDQTSEGVLQEDGSHTLSGNLSAHDPDKSHGDAAGNLSYSIESGGKL VQIIEGKYGILKLNQDGSYTYEITKPELLKELNAGQSLTDSKLPQEVFDVRVTDPLGAHS SGKLVIDVTGTADMPTISFNNTVISEDNGAIVTPSEGDHSHDPSITGQLTLGDRVDAEDI GGSLTWTNKGQTGFATGADGKPLGTLNIDPETGEYTYTLTENGSKIVQSMNDGDVKTETF KVQVEIEAGKIVEKDITITIKGTNDAPTFTDTVTGLEGDVKQDAFVDPDGSDGGVPGVVF TGTLSGATDVDDPDGQLRFMLVGKDGKPVTELKTEYGTIVLTYETAADGSIITHYKYTLD NESTELDEALKDPSKLTDKGTLLDGAQVVVVDPHGKVSEEQKELTINIHKPDNEGGWDGG AGLIIDADKSEFNGAVVEDGRDLPQTPDVTEGLIFEGQLHAKWDGEGHTGTPPDRVFGIE EKDEFGHGTGKQIQSSAADGFVTAEGKYGYLVVDPVTGKYTYTLYNGENGKPGKVQDLAE GQMEKEEFNVMLNGTRTNSKITITIHGTNDAPVIDSYQNMTIQEGDDGLGNLTTSGTLKA HDIDKLLGTDGKPLPEGTETSTLKYYFEGGNNTLTTKYGTVTLTFDKDGNCTYTYTTDGA KLPDHLTEGKTLPDSFIIYVRDEHGKEVEQEITVTINGTNHGPEVVPGEHVLNVVEDVTV SQEGNLNDIIKDDEGLNNLHFSINGKGTVVEGEYGTLHIDPATGKYIYTLNNADPEVQGL DAKSSIKETFTITVTDKHGEMTTVDVTVNVKGTDDTPELTLGKVLSVREGDADAVGDTAV GFDKDIADQGHLTYSFGKGADGNPLTEITNEYGTFTIDPKTGAYTFTLDNTSETVLKMAA GRLYETSINVTVTDTSGLSDTKELVVNIEGTNTAPVITSGEHGVIIANPAPLVEDGGVSK VTGQVTAREYDEGDHVVAFKFVNDKGELVDSLTGKYGTISIDKDGNYTYTFNKGQAQHLG AGEMAAEHFNVVAVDTYGAQTTTPSDLQIQIQGTNDAPVITSPTPVLNLTELASGQAEIT GTITFNDADKKADGTFYDTHTFSVRPAGAAEAENGAAAEGKYGTLTIDEHGNYKYTLTSD ALGEGDKYTETFTATVDDGNGGKATQTITVNLTGTNDAPVITESHTDNGTTGSFIFTDAD VKADGSFYDTHSFAISVDGKAHGVTLDSTGTHGTVTIDGLGTFELTQGDGGNWHYAFTAS PEAIAGAALGSLVTHDFQIIVNDGHATAMTPAGEDSLSVSFMGTGTPPADMDLGNLTPGM AQGDHLPGMDADGHQLAYAFDKAVDGNIQGEFGSLHFNAETGQYTYTLDTSEDGLHKLAQ AQANGSALKESFGYTVSGHEGHSNGSLEINLTDLHTQLGHAGADTLGDQTAAHSQVIFGE GGDDVIHGGAGNDWLFGGEGDDQIFGGTGDDILYGGAGNDYLDGGTGHNSLYGGAGNDIL VYNQGMAHASGGEGIDFLVGAEKDTLDSLFANPDNNPIQSDIEVLITSKPDSLSLTNLDD 
LKSIGISIEGDKLHLSGDWAPTAIGGEEHGISLGNYAEFTHHSDHGDITILVQSGTPATD DLAQQIVQNTLNHGQG >gi|316924436|gb|ADCP01000027.1| GENE 20 35480 - 35929 234 149 aa, chain - ## HITS:1 COG:no KEGG:Dvul_2922 NR:ns ## KEGG: Dvul_2922 # Name: not_defined # Def: TraR/DksA family transcriptional regulator # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 107 30 135 151 88 51.0 8e-17 MNTSQLAALRWMMERELSALRSPRMSPEISLKCPDENELASRLSERAVDERLTARRVERI RDLENALRRIEVMDYGVCEACGEPIPLLRLLASPSVRYCVACQQELEEEERAVVVTGSFD APRPRRTMPPRPVTPPMPFFLPRRRQASH >gi|316924436|gb|ADCP01000027.1| GENE 21 36108 - 36821 521 237 aa, chain - ## HITS:1 COG:CAC0418 KEGG:ns NR:ns ## COG: CAC0418 COG0546 # Protein_GI_number: 15893709 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Clostridium acetobutylicum # 23 232 2 212 216 191 46.0 1e-48 GIIPTPATVAQRPGGTYKRAMQYKTILFDLDGTLTDSEPGIVNSVRYALRSFGMEAEPAT LRSFIGPPLYDSFRGTMGMSDADAKRAVDTYRVYFRDKGIFENAPYPGVPEMLEVLHAAG RRLIVATSKPEVFAKRIAEHFGFAGALEGVYGADMEGKRSSKIDVIRYAMRERGITPSSA VMVGDRKYDIAGAREAGLADIGVLYGYGSREELVEAGATRLAASVADLREMLSPSDG Prediction of potential genes in microbial genomes Time: Fri May 13 02:16:31 2011 Seq name: gi|316924431|gb|ADCP01000028.1| Bilophila wadsworthia 3_1_6 cont1.28, whole genome shotgun sequence Length of sequence - 8025 bp Number of predicted genes - 7, with homology - 4 Number of transcription units - 7, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 32 - 748 202 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 - Prom 778 - 837 6.8 + Prom 728 - 787 2.9 2 2 Tu 1 . + CDS 916 - 1005 68 ## 3 3 Tu 1 . + CDS 1484 - 1573 74 ## + Term 1608 - 1645 6.2 + Prom 1703 - 1762 1.9 4 4 Tu 1 . + CDS 1842 - 2912 1570 ## COG2855 Predicted membrane protein + Term 2939 - 2983 3.7 5 5 Tu 1 . + CDS 3509 - 4438 942 ## COG0583 Transcriptional regulator + Term 4602 - 4640 3.1 6 6 Tu 1 . 
+ CDS 4872 - 7262 1975 ## COG2199 FOG: GGDEF domain + Term 7288 - 7327 4.1 7 7 Tu 1 . + CDS 7907 - 8024 219 ## Predicted protein(s) >gi|316924431|gb|ADCP01000028.1| GENE 1 32 - 748 202 238 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 3 229 6 240 259 82 31 1e-15 MPTMLVLGATSGIAGATARAFAGSGWTLQLAGRDIQRLQEVADDCGGASIFPFDALDPAS RSALWGLLPSCPDAVLCAVGLLGDQDCACRDPKAALQVIECNFTGLVPVFLQAAEAFEAR GSGLLIGLSSVAGDRGRASNYAYGSAKAGFSAFLSGLRARLWNSGVRVLTVKPGYVATGM IAGRRLPSCIVASPELVARDILRAVRTGRDVLYTPGWWRPLLALYRALPECVAKRLKL >gi|316924431|gb|ADCP01000028.1| GENE 2 916 - 1005 68 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDMQSKLILLTVAFVLVGLVRGYCAMKEM >gi|316924431|gb|ADCP01000028.1| GENE 3 1484 - 1573 74 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNMQSLFTLLPFIVLAIGITRVYCTIKEI >gi|316924431|gb|ADCP01000028.1| GENE 4 1842 - 2912 1570 356 aa, chain + ## HITS:1 COG:ECs3050 KEGG:ns NR:ns ## COG: ECs3050 COG2855 # Protein_GI_number: 15832304 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 42 354 43 347 349 223 45.0 3e-58 MQSEKAIGNGYHLQLMNGVLFVILFASAAYQLSQWHYISLLGISPLIVGIVLGMIYGNTL RLHLPVYWLPGILFSSKVLLRVAIVLYGFRLTFQDLLLVGFNGLAISSIMLTSTFIGGTW LGIKVFGLPRRLALLTAAGSSVCGAAAVLATEPVVKGESYESSIAVGTVVVFGTIAMFVY PFLYSHGFLDMSEKAYGMFVGGTLHEVAHAVAAGQAISIDAGNTAVIVKMLRVIMIAPLL ICLGFWLNRFPEKTEEESRVPSYGTYNFSSFPWFAVLFIACIGINSLGIIPKQAVSVINS LDIFMLTMAMCALGMETSLEKVRMVGPKPFLLAGVLALWLIGGGYFVTRFILSLNF >gi|316924431|gb|ADCP01000028.1| GENE 5 3509 - 4438 942 309 aa, chain + ## HITS:1 COG:ECs3049 KEGG:ns NR:ns ## COG: ECs3049 COG0583 # Protein_GI_number: 15832303 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 4 272 23 286 293 150 34.0 2e-36 MRDAAAKLFISQAAVSSALRDFEAELGVSLFDRMGRGIRLNDKGRLLEERLAPLYNQLKN VLALVASDELAGKIRIGASTTLSDFVLPQVLYNFKMRHHQVEIECESGNTADIVRHVEHG 
LLDVGFVEGDVHNLAVEITPLAKESLVIVTADKALAEAGPYPIEALLDRHWLLREPGSGT RETFLRQLTPRGLRPQILLEFEHNDSIKQVLHNPGMLSCLSPRIVQREVRAGELFIVGVS NAQFDRTIYRVEHKALPFSSLREALSKEVESCLEEEERLCHLVLKTGGPLPEHEKERAAP RKAAQHAQA >gi|316924431|gb|ADCP01000028.1| GENE 6 4872 - 7262 1975 796 aa, chain + ## HITS:1 COG:all1175_3 KEGG:ns NR:ns ## COG: all1175_3 COG2199 # Protein_GI_number: 17228670 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Nostoc sp. PCC 7120 # 632 794 6 174 183 112 39.0 3e-24 MSLTTRMLAPIALIMSIALAFVAGSAYDNAKKIVSSNIVTIEATVVEAVSREIEFMLEST RNYMRLSAQRQIVKQYAEWLLTQKKPDRSAPEQKIFLEQINQSVAIYNYVDSLIFLNKQG TVEASTENMGEGQNRSDREYYREAMLGKSSVQGPLITRTKGYYAFFLGEPVFVNGEVSGV LVGAINMDYIASHTIDPVTMHGKGIVYVVDPNGQIILHPDRQKMIGNAMIEQAILEAISD GGAGSFENERDGMAYYSTFNTLPNGWTVIATVSRDFMMSDVQLMRDRTAAVALAAVCIAL FFMFLVVCRVVAAMRKGVQFAESVAEGNLDQTFNIRRNDELGALASALNTMVGKLKNSFE IANRQTREAEEARARATSTYRELQALIDSVDGGVARFALDDSFRVIWANTGFYALSGRTR EDYARDVGNRGINVVHPEDGLTMLKTFREHAQKNDSLKAEYRILRKDGGTSWIYLRAKRV GEWEGYPLFQGVFIDITQQKNIIRALEMEQQRYNVVTEITEEILFEQDIATDTLTFSSNF EKLFNRPRSIEHYLRDKHFLEIVHPDDLHLLPSTKSTELSEDDFMRFDARLLTSENTYQW FTICFKVLRDNAGKPTNVIGRLSNINDKKLEEDRLRREAQTDMLTGLYNKMTFKQLAEDM LSEGTHALIIVDIDDFKNVNDTYGHLFGDEVILTVASVVRDGFRSSDITGRIGGDEFAVF AHDALNENVIRNRCRQITARLAEIDYPNGYRISVSMGISFYPRHGKDYPTLFSHADAALY HLKKHRGKGGYAVYGE >gi|316924431|gb|ADCP01000028.1| GENE 7 7907 - 8024 219 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPLKEENSRERERERERERERERERERERERERERERER Prediction of potential genes in microbial genomes Time: Fri May 13 02:17:16 2011 Seq name: gi|316924394|gb|ADCP01000029.1| Bilophila wadsworthia 3_1_6 cont1.29, whole genome shotgun sequence Length of sequence - 52184 bp Number of predicted genes - 38, with homology - 35 Number of transcription units - 28, operones - 9 average op.length - 2.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 200 - 232 -1.0 1 1 Tu 1 . - CDS 302 - 709 144 ## LIC064 hypothetical protein - Term 2401 - 2450 -0.1 2 2 Tu 1 . 
- CDS 2469 - 3260 574 ## COG1192 ATPases involved in chromosome partitioning - Prom 3294 - 3353 3.0 + Prom 3999 - 4058 3.1 3 3 Tu 1 . + CDS 4197 - 4985 421 ## LIC062 hypothetical protein + Term 4994 - 5026 2.3 - Term 6054 - 6096 12.3 4 4 Tu 1 . - CDS 6189 - 6635 491 ## COG0716 Flavodoxins - Term 7278 - 7339 9.2 5 5 Tu 1 . - CDS 7388 - 8737 1348 ## COG0305 Replicative DNA helicase + Prom 9469 - 9528 4.2 6 6 Tu 1 . + CDS 9557 - 14206 4094 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain + Term 14343 - 14391 8.4 7 7 Op 1 . - CDS 14599 - 15525 748 ## COG2199 FOG: GGDEF domain 8 7 Op 2 . - CDS 15529 - 16050 361 ## Ddes_0755 4-vinyl reductase 4VR - Prom 16158 - 16217 4.9 + Prom 16080 - 16139 4.7 9 8 Tu 1 . + CDS 16264 - 17031 633 ## + Term 17087 - 17134 9.0 + Prom 17275 - 17334 6.2 10 9 Op 1 . + CDS 17421 - 18908 730 ## Dole_3188 radical SAM domain-containing protein 11 9 Op 2 . + CDS 18913 - 19152 66 ## 12 10 Tu 1 . + CDS 19293 - 20225 1082 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components + Term 20279 - 20303 -1.0 13 11 Op 1 24/0.000 + CDS 20568 - 21212 516 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 14 11 Op 2 . + CDS 21209 - 21979 241 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 22022 - 22060 4.5 + Prom 22342 - 22401 4.2 15 12 Tu 1 . + CDS 22468 - 22656 203 ## Dret_1671 twin-arginine translocation protein, TatA/E family subunit + Term 22709 - 22748 2.3 16 13 Tu 1 . - CDS 23465 - 24121 616 ## COG1802 Transcriptional regulators - Term 24349 - 24403 19.2 17 14 Tu 1 . - CDS 24433 - 25434 1526 ## COG2515 1-aminocyclopropane-1-carboxylate deaminase - Prom 25504 - 25563 2.2 18 15 Op 1 . - CDS 25823 - 26197 562 ## COG0251 Putative translation initiation inhibitor, yjgF family 19 15 Op 2 . - CDS 26257 - 27009 730 ## COG1794 Aspartate racemase 20 15 Op 3 . 
- CDS 27082 - 28317 1795 ## COG1301 Na+/H+-dicarboxylate symporters - Prom 28411 - 28470 2.2 + Prom 28710 - 28769 9.7 21 16 Tu 1 . + CDS 28922 - 30316 1616 ## COG1757 Na+/H+ antiporter + Term 30424 - 30468 6.4 - Term 30290 - 30328 -0.1 22 17 Tu 1 . - CDS 30488 - 31714 1514 ## COG0520 Selenocysteine lyase 23 18 Op 1 1/0.167 - CDS 31963 - 32817 1328 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 24 18 Op 2 1/0.167 - CDS 32842 - 34785 2486 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family - Prom 34898 - 34957 2.5 25 19 Tu 1 . - CDS 34991 - 35848 1144 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 26 20 Op 1 8/0.000 - CDS 35960 - 36937 1460 ## COG2358 TRAP-type uncharacterized transport system, periplasmic component 27 20 Op 2 . - CDS 36984 - 38876 2560 ## COG4666 TRAP-type uncharacterized transport system, fused permease components - Prom 39094 - 39153 1.7 - Term 39152 - 39183 -0.9 28 21 Tu 1 . - CDS 39193 - 41079 2304 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family - Prom 41322 - 41381 3.1 + Prom 41297 - 41356 4.6 29 22 Tu 1 . + CDS 41396 - 42340 1048 ## COG0583 Transcriptional regulator + Term 42389 - 42416 0.1 30 23 Op 1 . + CDS 42751 - 43512 624 ## COG0561 Predicted hydrolases of the HAD superfamily 31 23 Op 2 . + CDS 43601 - 45433 1384 ## COG0514 Superfamily II DNA helicase + Term 45484 - 45511 0.1 32 24 Tu 1 . - CDS 45430 - 45624 198 ## - Prom 45651 - 45710 3.8 - Term 45627 - 45664 3.2 33 25 Tu 1 . - CDS 45793 - 47226 2179 ## COG1966 Carbon starvation protein, predicted membrane protein 34 26 Tu 1 . - CDS 47457 - 48830 1572 ## COG2379 Putative glycerate kinase + Prom 48994 - 49053 3.2 35 27 Op 1 . + CDS 49095 - 50090 1240 ## COG1052 Lactate dehydrogenase and related dehydrogenases + Term 50099 - 50129 3.0 36 27 Op 2 . 
+ CDS 50175 - 50912 558 ## PROTEIN SUPPORTED gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase + Term 50934 - 50984 8.8 - Term 50979 - 51018 9.8 37 28 Op 1 . - CDS 51138 - 51521 585 ## Sdel_0545 cupin 2 conserved barrel domain protein 38 28 Op 2 . - CDS 51582 - 52001 526 ## COG1846 Transcriptional regulators - Prom 52023 - 52082 1.5 Predicted protein(s) >gi|316924394|gb|ADCP01000029.1| GENE 1 302 - 709 144 135 aa, chain - ## HITS:1 COG:no KEGG:LIC064 NR:ns ## KEGG: LIC064 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 5 115 389 497 498 67 37.0 1e-10 MLYQQQPQTVMKTEIPPSLPLDDTCIWNTDGDLIAILWPFAADAGFGPAHLAQLKRAYQL QGWEPENVSRCLRYLDWELANGVSSGPEHVTAWLRTMQRQGHYPRPEGYVDPEVLRLRQK ADEERELAEARTRFK >gi|316924394|gb|ADCP01000029.1| GENE 2 2469 - 3260 574 263 aa, chain - ## HITS:1 COG:AGc5130 KEGG:ns NR:ns ## COG: AGc5130 COG1192 # Protein_GI_number: 15890074 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 257 7 257 264 150 34.0 2e-36 MAAKVIAFANHKGGVGKTASTVNVAYCLAKRKRRVLVVDCDPQGNASLTLGTVSPYEQPR TVANLFTGLSFSAAAVPSKYEGLDLIPANLNVYATVSTLSNSIKRFFGFRQALDKAALNT YDYILLDCPPTIEGALLTNALIITDYIIIPVGVEDTYALSGVSHLIKVAETLRADTESNL AIMGVLLTMYDGRNNAAKTIRNVAIGTFGEEMVFRTTIPRNTTLNKAVMSNLAVCDYDDG CSSCRSYRELAQEMEARLDPKNA >gi|316924394|gb|ADCP01000029.1| GENE 3 4197 - 4985 421 262 aa, chain + ## HITS:1 COG:no KEGG:LIC062 NR:ns ## KEGG: LIC062 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 256 204 446 452 281 58.0 2e-74 MALLAIIESGAEGAEARHARSAGALPVKDTEEGFVPIAPNLPEAPYVSGGVEAFPEPVRS AFPESGTGKFTYRGPLLPILRLMGHKRPGKNHYTRLVTGLTCLAHGTIELVVRERGQETF YDLTHIISNLRLRGKNDAERDISVTISPFFREMYVANRLTWIDVAKRFQIRGSIAKAMYR FCQSHRENPVFRGDIRTLALALNMDLRSPLKETRRQIRDAIAELAEKKVLEKTSILTKGN IVILNRTAEALPSRRRGRRKED >gi|316924394|gb|ADCP01000029.1| GENE 4 6189 - 
6635 491 148 aa, chain - ## HITS:1 COG:MA2699 KEGG:ns NR:ns ## COG: MA2699 COG0716 # Protein_GI_number: 20091523 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 1 148 1 145 147 114 42.0 5e-26 MSKVLIVYGSTTGNTEALAEILGRLIQEAGHETTLLNAADASAPGLCDGWDMILFGCSAW GDDEIILQDDFDALFQQFDLINAKGHKVACFATGDSNFTYFCGAVDVIEAALERLGADVV VEGLKIDGQAQSDQPEIQEWTKAVIETL >gi|316924394|gb|ADCP01000029.1| GENE 5 7388 - 8737 1348 449 aa, chain - ## HITS:1 COG:XF0361 KEGG:ns NR:ns ## COG: XF0361 COG0305 # Protein_GI_number: 15836963 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Xylella fastidiosa 9a5c # 15 448 27 465 471 391 49.0 1e-108 MKTEFETATTDLARRVPPHSVEAERAVLAGLILRPSAMGQIAPMLHPEDFYLPAHQTLYA AALGLHTDNRPVDLVTLAEHLRDQGSLEQAGGAAYIADLAQSHVSAANAEYYAKLVRDKS IQRGLIEAGAKIVSTGFDTSKDLAGLIDEAEQSVMAVSSRTSSGGFRPVQALVGDVVDAV LKPAEGGITGLATGYPELDAITRGLQPSDLIIIAARPAMGKTALALNLAMRAAISEGAPV GIFSLEMSEHQLVQRMISLWGKIPQEQLSTGRLSKAEGARFFETADLLRTAPLFINETPA ISTLELRSQARRLKAEHGLGLIVVDYLQLMRSSRRTDSRELEISDISRSLKAIAKELDVP VVALSQLNRKVEERKDNRPMLSDLRESGAIEQDADIVIFLYRDEVYKPDTPKKGIAELII GKHRNGRVGTVELAFLPQYTAFEPLAKDR >gi|316924394|gb|ADCP01000029.1| GENE 6 9557 - 14206 4094 1549 aa, chain + ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 1108 1532 332 751 776 202 34.0 4e-51 MQTIASRSEDVVRRFFDAYLHLRDLDKTLECLADDIRWIGTGDFERAGKGFCDCRELLIE EFKATPGSFDVKLRNLYSVEAGKACYTDGDMSVVDPSSGKGVNVRISAMCGEVPDGRCLI QGIHASLPSGEQPDGEFFPTSFLGEGNELSSQSRRSAFQLLKNSIPGGLIGGYIEKGFPL YFINDQLLGYLGYSSYEDFVEDTGGMVENGIHPGDRDRVNTLVNESMTVTDSYEVSYRML RRDGSYIWVLDRGLRNISEDGRPVIVSIVVDITRSHELQEQLQQSVVSLKEKNAELEACH AVVFNGFAKLAADAEYTILEANDQFFCMIGYTREEVRGQFGNRAINFVYAEDLIVGNDEI 
RNRKATGRFSIPFRVRRKDGSDIWVRLDACRAEERWNGHEVLYCFYTDIDEQQRREADYR RQRYFMSLIDSSLEGGFFVTDGGDERRLAYISEGMLHFLGYEPDVFQGLMAGGLEAIVCS DDVALFRTAWREAVNGCYEQEYRIRKGDGQVIWVLEKGRLVVDGEGEPVCICLLLDITER KMRQDELIRQTRLDPLTGLYNREYAQQFIQTYLDIHCGGHSSALLVFDLDHFKRVNDRYG HLRGDAVLIAFAKLLQESFRTRDFVARTGGDEFIAFMQDVPSRTEALDVSRRIREAMSAT LGNEYADCKLSVSIGVAYSHEQISYDALFQLADDDMYRMKFRARGGRELDELPVYDSEGE HSCLFRYAYGLVLRIDLDTGLYTVPYGEFLARKKIPSQGDYEAMLKEAVRSNIVAEDSED VYALCRIENLRQVFESDKRELLCEYRAKMSAGGTRWIQSRFFFATSGRSRLCYSTITDIT ETRHERERSRVALLYDFALSEESGEIYECNLTRNTFRIIRHSSGTFLPLPDEGRLDDLKA MVRESMIHPEDLERYDRVVANARSLRLNDGVLKEDFRCLWRDDSYHWVTANVLCVEDADK LHFVWVHGIDDRKRLDEFSRENAELQRLHMMDERYRIIVEQTKSVVFDWGPEQQLHYAPY LGSLLDCRNNPGNVPDMLRSLTVHPRDMADFKAFHASLYREDQVETTVRLRRRDGAFIWC RIAATLKRDAQGKLLRIVGTISDMDDNVRALRHLRYQAEHDPVTGYSNFAKFKTDAAELL AARGDRKYSLWYCDIRNFKFINDIYGYDIGDELLSYWAELIAEGARPGETFGRISGDNFA LLRCYRDIDDLVARFLRCSDLLTRFEGLANRRFRVEMIAGIYLVERPEDILSIDDMLDRA NLAQKSVKHLSGSKYALYSEEMRKRVLYEKNIESCMEEALRNREFCLHLQPQVDIQHGDA LFGAEVLVRWERPGYGMVSPGDFIPLFERNGFIVDLDAFVFEEACAYLASRGSRGLPPLR LSVNVSRLSIAQGDFLDRYSAIRDKYGIDSGMLELECTETMVIRNFALFRELMAALPSRG FRSAMDDFGTGYSSLNMLKEIALDVLKLDIAFFRDTEGTPRERAVVESIVQMARALGMST VAEGVEKQEQVEFLRSIGCSAIQGYIFSRPVPLALFESVEASFPLYVEG >gi|316924394|gb|ADCP01000029.1| GENE 7 14599 - 15525 748 308 aa, chain - ## HITS:1 COG:aq_035_2 KEGG:ns NR:ns ## COG: aq_035_2 COG2199 # Protein_GI_number: 15605636 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 126 304 79 242 251 105 38.0 2e-22 MTAPLSTDEATALLTSILMDPQWSVDSIADLPKDDPFGKLCYDLAEIRAHVQALSKGDLS RGSKARGFVAGSLKATEANLRHLTWQMERVAQGDYSQSVSFMGDFSKAFNKMSREMHSKA EELSRLLERYRMSTDEDILTGLLNRRTFFKLAMSELRRAKESYQDSHSALCLALADLDNF KNVNDHFGHANGDRVLQLFACRLREACRADDLCCRFGGDEFIILIPRMTKDESVEHVNRL REACSLAPISGPAAELNITASFGLSFIGESELLNTFNSVEILEHAIQIADHRLSRAKQEG RNRVCSAG >gi|316924394|gb|ADCP01000029.1| GENE 8 15529 - 16050 361 173 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0755 NR:ns ## KEGG: Ddes_0755 # Name: not_defined # Def: 4-vinyl 
reductase 4VR # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 170 1 170 182 216 57.0 2e-55 MENRLYAFDWKYIGDMELGRPNLGKTARLEIYRLFQYTLRDVLETEYGTEQSDRLLYKAG FLAGTEFCKRYVGECASFSDFVAKVQAALEEMNVGILHVERADMDKLEFTLTVAEDLDCS GLPDLGHVVCTYDEGFISGLFKSFTGKSFEAKEVDCWCTGDRTCRFEVKPATE >gi|316924394|gb|ADCP01000029.1| GENE 9 16264 - 17031 633 255 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKHILFGLAAVLFASSVALADDQLTAQVKPASAEAAQPSAPVPAKVPSGADADVHTANSE SAASAEKPAPTGERGIGAFSLGMSEDEAKALGAKPTDNADMLQLPFKWMEADWTCVVQFQ EGKAVAVILYANMSDPLLARVFDDLRAQDCMPITVQPAMGHADLYQLAAQGKSDDECWTA MRKRLNVFSTATEGACSVLFTTQGFFKALASTVKDPSKEEAILTEHTADPVYAVNIDMSA NQLTYLITSWGFISK >gi|316924394|gb|ADCP01000029.1| GENE 10 17421 - 18908 730 495 aa, chain + ## HITS:1 COG:no KEGG:Dole_3188 NR:ns ## KEGG: Dole_3188 # Name: not_defined # Def: radical SAM domain-containing protein # Organism: D.oleovorans # Pathway: Biotin metabolism [PATH:dol00780]; Metabolic pathways [PATH:dol01100] # 175 493 3 321 329 184 32.0 6e-45 MMELDNCPCSGANLPRFVQPVILAVLSSGPLHGYLVVQRLAETSLFRKQPPDATGVYRML RNMEQEGVLESDWELENSGPARKRYTLTEKGGHCLDQWMRTLTSHQAFIANLLLFLQDAR SGMNSEPCPMPEHSVSLSPQEVFLSAGSPFPPASCGCGTPQPFAGAVHMDTYSFIDALKN RALRGMPASRDEVLRLLALAPDSEEAAYLGRAARDIAHIVVGNEGRVWSAIGIDCRPCSM NCGFCAFGEKWGLITEPHEWSDEAIIKAARAFVDEGASWVTLRTTEFYGLNRLCALAKKV REAVPGNYGLVVNTGEFGPLEARAMIASGIDVVYHSLRLGEGRTTCFRPEERKATLAAVR DSDLKLAHLVEPVGPEHTDDEIADVLMTALSNGAALSGAMARINVKGTPFESHDPLPDLR LAQIVAITRICGGVNVPDICVHPPRKEALEWGANVVVVETGAVPRNDAECSAEWQGFTVA DAKQLFRDAGYFVKD >gi|316924394|gb|ADCP01000029.1| GENE 11 18913 - 19152 66 79 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIGGGKGDPSTEGSPFPPPNLPSSPPKTFDLIESLISGRGMGRTVFSDMKVSSCIVTHIH FFNAETVFFFGKYNRGGIF >gi|316924394|gb|ADCP01000029.1| GENE 12 19293 - 20225 1082 310 aa, chain + ## HITS:1 COG:MK0601 KEGG:ns NR:ns ## COG: MK0601 COG0715 # Protein_GI_number: 20094039 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: 
Methanopyrus kandleri AV19 # 46 303 57 314 319 120 30.0 3e-27 MRKLLLTFVLALTLCVPSLAAETWNVGTWKTAQTIQPFLYEPFANGATVKVHPFTNPGDQ KAALLAGSLDMTGTTLALAIQAASRGEPIVLVASLGNKCSALVVKKGGVKSVGDLEGKKI GYVPGTMHEILLRETLTRAGLNPDKDAKLVRIDFFDMGTALSRGDIDAYLSGEPLPTLAK RQGYGEVLAYPYYGEGIGAINSGMIVRRDFVEKNPERVMEMLRAHRKATEQCMSDKAFWL ETSSKMFGVELDVLRDAADNMELVWDMDDTFMKQLSALGKRMLELGIIKKEPDYNALVDR RFVDALRQGQ >gi|316924394|gb|ADCP01000029.1| GENE 13 20568 - 21212 516 214 aa, chain + ## HITS:1 COG:YPO0184 KEGG:ns NR:ns ## COG: YPO0184 COG0600 # Protein_GI_number: 16120525 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Yersinia pestis # 3 194 75 263 284 115 40.0 6e-26 MGAVSEGYADAYAGRFWGDLSASLGRVGSGYLLAVIPGVALGLASGRSPALASLLSPIIN GVRAVPGISWLPLALLWLGIGFRATVFLIALAGFFPAYLNAAAGAASVPPVLIRAGRMLG FGRRAIFFRVVIPSAMPQVRTGLRVALGMSFSYLVLGELTGVPDGLGAMIMDARLAGRVD LLVSGIILIALVGWLCDVLLMRALSAVSVSLRRQ >gi|316924394|gb|ADCP01000029.1| GENE 14 21209 - 21979 241 256 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 23 244 16 241 245 97 32 1e-19 MKPFFACSGIWKSFRKGGESRLVLPGVSVAIPRGGVLALIGASGCGKTTLLNILAGFDAP DAGTVTLEGRPCGGPGPDRAVVFQEPALFPWLTALENVEIGLEAAGIPSAARRARALDML TLTGLSGRENQIPAELSGGQKQRVSLARVLALSPPVLLMDEPFAALDAITREQMQLLLVT LHARIGMGIILVTHDVEEALLLGDEVCVMAPGQGIVASRRVEVPRTGDVTDRKLLRELKT ILRQSMAGQGGPLLLS >gi|316924394|gb|ADCP01000029.1| GENE 15 22468 - 22656 203 62 aa, chain + ## HITS:1 COG:no KEGG:Dret_1671 NR:ns ## KEGG: Dret_1671 # Name: not_defined # Def: twin-arginine translocation protein, TatA/E family subunit # Organism: D.retbaense # Pathway: Protein export [PATH:drt03060]; Bacterial secretion system [PATH:drt03070] # 6 57 7 58 79 67 55.0 2e-10 MDLFSIPHLLIVLLVVMVLFGTNRLPEIGAGLGKAIRNFKQASSELDEVDATFKKDEKDI TK >gi|316924394|gb|ADCP01000029.1| GENE 16 23465 - 24121 616 218 aa, chain - ## HITS:1 COG:mlr7144 KEGG:ns NR:ns ## COG: mlr7144 COG1802 
# Protein_GI_number: 13475949 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 6 206 40 235 253 97 30.0 2e-20 MDFTKQTYCNQVTSYIRSLIRKGTLEIGAPVKEAVLSEQLGISRAPIREALQVLVQEGLI TSEPQKGKHVRQMTGKEIYDSYVVAGILEGAGVAESLPLWTEANMETFRSVVRQMEGKIS YATKLDELAEIDELFHATLFMACDNTRLVEMARTSCATISKYLYYQHWITMFTPREYADR HHAVAEAVYSRDVKHVETALRDHYKETGLRMAQFGKEA >gi|316924394|gb|ADCP01000029.1| GENE 17 24433 - 25434 1526 333 aa, chain - ## HITS:1 COG:CC2032 KEGG:ns NR:ns ## COG: CC2032 COG2515 # Protein_GI_number: 16126275 # Func_class: E Amino acid transport and metabolism # Function: 1-aminocyclopropane-1-carboxylate deaminase # Organism: Caulobacter vibrioides # 1 327 1 324 333 300 50.0 3e-81 MNLAKFPRRGYVTEPTPLEALPNFSKALGADINIYIKRDDLLPGTAGGNKTRKLDFSMAD AINQGADTIITCGAVQSNHCRLTLAWAVKEGLDCHLVLEERVKDSYNPEASGNNFLFQLL GVKSVTVVPGGSNMLGEMEKLAEKLRAEGKKPYIVPGGASNKIGALGYVSCAEEVLRQLF DRGLAIDHMVVPSGSAGTHAGIIAGMVGNNAGIPVTGIGVNRKKPVQEAAVLNLANETLE YIGAEARVPAEKVVAFDDYVGPGYSLPTDAMVEAVKMLARTEGILLDPVYSGKAMSGLID LARKGYFPKGSNVLFLHTGGSPALYAYLPTFRA >gi|316924394|gb|ADCP01000029.1| GENE 18 25823 - 26197 562 124 aa, chain - ## HITS:1 COG:PM1466 KEGG:ns NR:ns ## COG: PM1466 COG0251 # Protein_GI_number: 15603331 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pasteurella multocida # 1 123 1 127 129 111 48.0 3e-25 MKAILSTADAPAAIGPYSQAVRSGEALYCSGQLGIDPATGKLPEGVAAQTEQSLKNIRAL LAAAGAGIENVVKTTVFITNMADFPLVNAEYSKVFAENPPARSCVAVKELPLGGLVEIEV LAVV >gi|316924394|gb|ADCP01000029.1| GENE 19 26257 - 27009 730 250 aa, chain - ## HITS:1 COG:PAB0912 KEGG:ns NR:ns ## COG: PAB0912 COG1794 # Protein_GI_number: 14521575 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Aspartate racemase # Organism: Pyrococcus abyssi # 7 242 4 223 228 111 35.0 1e-24 MRQDERIGIVGGVGPHAGLDLTRKLFDHTRAEADQEHLPVMLYSFPDRIGERPAFLLGKT ADNPGEAIGDIMAELARAGATVIGMPCNTAHSPRILDAALEKLNATGRPVRFVHMIDAVV 
RHVRQRCGEGARVGILSTLATLETRLYQDSLERAGLEALHPAPDGCARVQEAISNREYGI KARNPVTERARADLLDEARRLAGNADAIILGCTEIPLALPEKELGGVPLIDATDVLAREL VRAFAPGKLR >gi|316924394|gb|ADCP01000029.1| GENE 20 27082 - 28317 1795 411 aa, chain - ## HITS:1 COG:PA5479 KEGG:ns NR:ns ## COG: PA5479 COG1301 # Protein_GI_number: 15600672 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Pseudomonas aeruginosa # 1 406 1 421 444 285 40.0 1e-76 MKGSRMSLPTQMAIGMVLGIAAGALTPAMGLDPSWYKPVGQLFINLVRMVVVPLVLTTLV AGAASVGDVSKLGRVAGKTLAYYLVTTAVAVVIGLVLANIFQPGSGLSISTEGLKAKEVA PPTLIDVFLNIVPINPIEALSKGNMLQIIFFALLFGFGLSVIGDKGKQVLHFFDGAADVM IKVTGFVMLYAPIGVFGLMAYTVSMHGLSVLLPLIKLVLVMYLACILQILIVYLPCVKVT GLTPSRFLKGLASPMMIAFTTCSSAAALSTTLLSVQKLGASRSVSSFSIPLGNTINMDGA AIYMGIAAIFAAEVYGIPMPLDRQLTVILLAVLASIGSMGVPGAALIMITMVFTQVGIPL EAIALVAGVDRIMDMARTTINVLGDATGALFVSKLESDFDPERGERLAAAE >gi|316924394|gb|ADCP01000029.1| GENE 21 28922 - 30316 1616 464 aa, chain + ## HITS:1 COG:VNG0436G KEGG:ns NR:ns ## COG: VNG0436G COG1757 # Protein_GI_number: 15789678 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Halobacterium sp. NRC-1 # 3 459 20 512 521 201 33.0 2e-51 MQEKKLEMYGGLFGGLIPLIVLVGGLVWLSVAERGGTKPFWACAWIALCVGIFFAKNKEE YCKAAMRGIGDRTGIVIVTAWLLAGVFGKLMAAGGLVQGLLWMGMTTGAQGGVFVVLVFL AAMLFALGTGTSTGTCIALSPVLYPAGYFLGADPAMLGLAILSGAAFGDNLAPISDTTIV SAYTQGAEMRDVVRSRFPLAISAASISAVVFLVFGGGGEVRPLPEMQASLNPSGLFMLLA LGIVVVSALKGRHIIESLIYGNVAAAFVGMLIGTIRPADIFSVPAAKGGSTGLIQAGIDN VVGAIIFAILILAVTQILVECGIMRRILDFAQSTLVATVRQAELFIVGVTILASIPISAN APAELLVGPSIVRPLGERFGLAAARRANLMDCAVCTIFFVLPWHIAVAAWYGALYSAAET YAIPAPPISAALYNPYSWALLAVLLFSAFTGWNRKYAKDEAPAA >gi|316924394|gb|ADCP01000029.1| GENE 22 30488 - 31714 1514 408 aa, chain - ## HITS:1 COG:alr3867 KEGG:ns NR:ns ## COG: alr3867 COG0520 # Protein_GI_number: 17231359 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Nostoc sp. 
PCC 7120 # 10 406 19 462 464 318 38.0 1e-86 MNDIYPIQEIRSLFPALCRLHNGNKAVYLDGPGGSQVVATAITAMTNYMRRGVANTHGKF PSSEETEALIESAKAALGDLLACAPNEIAFGANATSNMFAVSRALSRSWNPGDEIVVSEM DHHSHIDTWILSARDRGIMVRHLPVDPKALTLDLSDLDAIVNAKTRLVAVGYASNAVGTI NDVARISRRCREVGALLSVDAVHIAPHKAIDMDAIGADMLFCSVYKFFGGHIGIAAIRKY VFEALDTYRLHPAPSTAPGKLETGTQSHEAIASIVPAVDFIAGLGTGSTRRERLVSGFER IERHENALAGIIRQGLTDVPGLTLYQSEGPKTATVAFTLEGQQPGDVCRKLCDRYGIFAA DGDYYAETLAHRVGVDRIGGWIRVGFAPYNTEEEAELLVRAVREIAAG >gi|316924394|gb|ADCP01000029.1| GENE 23 31963 - 32817 1328 284 aa, chain - ## HITS:1 COG:MT2360 KEGG:ns NR:ns ## COG: MT2360 COG2159 # Protein_GI_number: 15841794 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Mycobacterium tuberculosis CDC1551 # 52 284 64 305 307 75 31.0 1e-13 MLFDFRIRPPFKSFLTTGGMFGGHDGSHDPAYIDPYETGKDPIPSADKGDMDLFFQEMEA NGIDKAGIIGRNCGPNGFVDNADIREFIQKDPNRFFGFAGIDPNAPDAVDEVTRCIRDWG FRGISLEAGWCDPSLKPDDPIIDPVYEKVNELGGITLISLSFCYGPDLSYSDPITIQHVA NKYPNMKICISHGGWPQVQSMLAVAMRCPNVYLMPDCYLYIPGWPYARDYVKAANSYLKY RTLYASAYPLRGFEQCYKGWCAQPFEPDALKANLYDNAARLLGL >gi|316924394|gb|ADCP01000029.1| GENE 24 32842 - 34785 2486 647 aa, chain - ## HITS:1 COG:AF1262_1 KEGG:ns NR:ns ## COG: AF1262_1 COG1902 # Protein_GI_number: 11498861 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Archaeoglobus fulgidus # 9 364 6 354 354 212 36.0 2e-54 MHTPVSLTPMNIHTLTVRNRFVLPGMVTDMAVDGGYVTERLLSYYEERAKGGVGLVIVEA TSIDVSGKTFLHGLDISDDRFIPGLRLLAERVHSHGAAVAIQLQHGGGRAHPEYSHMPRR VMGVIPGVFEPDNAITLDEAEFARLADAWAKAALRAKAAGFDAVEIHGASGYLLEQAVSA FTNRRADGYGGSLAKRLRFPAEVVRAVRAAVGGDFPILYRHTSVEDVPTGNGIDLGTTVE LCRTLTDAGVNAFDITAGMQCCFELMTPPTCMPKAWNAATSAAVKEALGDRARVMLTGRI SDADTAERVVRDGLADFAIMGRALIADSHLVEKYAAGRKDEICPCVACGQGCVGNADKMI PITCALNPLSGREASMPTVPKAETPGRVVVVGAGPAGLMAAATAAERGHEVILLERSDRH GGQISLAAVPPHKEDLRLISDYLYGKAQRAGVTFRFSCEATPESVRNLSPDAVIVATGSL 
PVVPRFCASAAGAVTAQDILSGAEAGKNVLVLGGGLIGCETAEYLAAQGRSVTVVEMLPQ LAKDMEWCARVLLLRRMAFLGIKLRPGNEILSIGAGNAVTVRNEKGREETLSGFDTLVVA VGCRPDNALSDGLSAALDCPCVAVGDCRKTAKIMDAVHSGFEAALTL >gi|316924394|gb|ADCP01000029.1| GENE 25 34991 - 35848 1144 285 aa, chain - ## HITS:1 COG:MT2360 KEGG:ns NR:ns ## COG: MT2360 COG2159 # Protein_GI_number: 15841794 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Mycobacterium tuberculosis CDC1551 # 52 213 64 229 307 85 33.0 7e-17 MIIDFRARPPYKSFLKLSLYKPWRPLPEDPAEWGAFELGREPNITADAHDMDAFVKEMDD NGIVKAALMGRHADDFGIVDNDELYELTQKYPGRFFPFAGINPREEGAVEEVERCISKMG FKGISVDPGWLNPPLKGDDPIFTPVYDKCAELGGILVMSVSAALGPDLTYSDPVIVQHVA TRYPGMKIVVAHGCWPHVERMLAVAMRCPNVYLVPDCYVYIDCMPFADTYIEAANSFLKY RTIFGSTYPERSLKQTIEGWKKRPWDPEALELSLYGNAARLLGIE >gi|316924394|gb|ADCP01000029.1| GENE 26 35960 - 36937 1460 325 aa, chain - ## HITS:1 COG:DR1649 KEGG:ns NR:ns ## COG: DR1649 COG2358 # Protein_GI_number: 15806652 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, periplasmic component # Organism: Deinococcus radiodurans # 11 323 9 316 319 94 28.0 2e-19 MRTNLIAAAVALAVALLPSAALAEKTRLTAVSGPVSGGWYLGVGLVAKAFTDANPNYEVT MLPGNSTSNVIQLQQKKADLSIGMHTMNMAAIKGTAPYKKAFPNIAAYANLDDTARFHFV ETKKSGITSIAQLRDSKMPVRLAYGAVGGSGEVFCGWIFESYGFTYKDIQSWGGKLYSNN FDDIVNMTKDGQLDAIVWVGPGESWFFTEIAQNVDLVWLPVDEKIMDDVAKKHGLGKGTI PGSLFKGMVGKDIPTVTECNELTVRADLDEETVYRLTKAFIENLDDIRKGCATWADCTPE KAAKDTGAPMHPGAVRYYKEAGLLK >gi|316924394|gb|ADCP01000029.1| GENE 27 36984 - 38876 2560 630 aa, chain - ## HITS:1 COG:BH2945 KEGG:ns NR:ns ## COG: BH2945 COG4666 # Protein_GI_number: 15615507 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, fused permease components # Organism: Bacillus halodurans # 13 629 40 651 656 375 41.0 1e-103 MSLSLDRSLLKPVATVVAIVLSLFQIYFTGGFGVTDAHILRGTHLTLIMILVLLLVPSRK 
LAPGEKEGWFPILIDLALIALAIAINVYVFQEADNISRRLKYVEDVTNLDLFYGTGLVLL VLEATRRTSGWPLVIVSGSFLAYAMFGEYMPGGLAHNGIEYDRLIEQLFLLTDGIYGVPL GAAAGMIFAFVMFGAFLESSKMSSLFMDLACLLTRKSAGGPAKVSIFASALFGTISGSAA ANVYGTGTFTIPLMKKVGYTSHFAGAVEAVASTGGQMMPPIMGAAAFIMADVIGVSYLTI AKAALIPSVLYYLTLLAIIHLEAVSKNMGTLPPELVPSAASVLRRLYYLFPLVFLITILL MGRSVIACAYYGTVCIVILSMIRSETRFTFKRLCGALELSAKNAMMVSSCCACAGIIIGV ISLTGIGYKFINVITALAGSNLLLLMAMLMVTCIILGMGVPTAPAYIIVATLGAPALMKA GVPIIAAHMFVYFYAILSVITPPVCLAAFAGAAIAETNAMKTGVTAMKLGIVAFIIPFMF VFEPALLMQGSTTEIATAFASALIGVIGIASGMQNWLLVRCRLWERALLLASGLMLIFPG LITDTIGLSSLLIVLFAQRLRKGGSPIAAA >gi|316924394|gb|ADCP01000029.1| GENE 28 39193 - 41079 2304 628 aa, chain - ## HITS:1 COG:AF0455_1 KEGG:ns NR:ns ## COG: AF0455_1 COG1902 # Protein_GI_number: 11498067 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Archaeoglobus fulgidus # 1 348 12 354 354 244 40.0 4e-64 MTLKNRIVMPAMASYHAAVNGEATEKLIRYHEERAKGGVGMNIVEATYVARSGNSFDLGL GISDDFMIKGLSKLTDAVHRHDGKIAIQLQHGGRFGNPPTSGCPRLLVSMIPGLAPTENA RVMDADDIEGVVEAYVQAARRAVEAGFDAVEIHGAHGYLINQFMSPLTNLREDAYGGSFE NRMRFPLEVLRAVRKQVGPDFPILFRYSMEEFMPGGIDMEQAVRIAKVMADNGVDMLNVS IGIGESVEYIIPPASVPDGWNADRAAAIKRAVGSRIPVAVVGRICNRKTAENIIASGKAD LVAMGRALLADPFLPAKLAEGRDDEILTCIGCNEGCTGMLNECRPISCALNPRTGYEDDY PMTQADAPKAVVVIGGGPAGCEAALTAAQRGHKVVLFEATSTLGGLANIAALPPGKGVFA TLGTYFSTMLPRAGVDVRLNTKADADAVRALHPDHVIVATGGTPIVPRFCADSPVVLAQD ILTGAAQAGSRVLVIGGGLVGSETAEFLADKGREVTVVELRDGIALDMEYKTRQMLMPKL AALGVVCLTETEVLEIGCNGNVKVKTPYLEKELSGFDTVVVALGYRPDAALCADLAAADI DFVQVGDCKKVGKIINGVWEAFQLAYRI >gi|316924394|gb|ADCP01000029.1| GENE 29 41396 - 42340 1048 314 aa, chain + ## HITS:1 COG:STM3800 KEGG:ns NR:ns ## COG: STM3800 COG0583 # Protein_GI_number: 16767086 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Salmonella typhimurium LT2 # 13 299 20 303 307 207 41.0 2e-53 MNQNKLQVSGQFLYSFTVVAKHMSFTKAAEDLCITQSAVSHRIRCLEQELGFPLFHRLTR 
KIILTEEGKTLYGALDASLKLIHETIRNIQEAELQGDLVIACAPSIAGCWLLPLLPEFQH EHPNISLHIRSGNDLISYENEEGNADVAIYCGDSITDGLHATPLLRDNLLPVCSQRYARE HDLIENPEALRNCELLHEHPEAANTPFYAGWEIWANWAGITGLPLQSGYSFDRAELTAIA AKQGLGVALGREWLVREALQKKELIVPFNLVFPAPQTYYVVSTRKGILRPKVRLFHDWIL KKARHSKPYGNAVL >gi|316924394|gb|ADCP01000029.1| GENE 30 42751 - 43512 624 253 aa, chain + ## HITS:1 COG:FN0869 KEGG:ns NR:ns ## COG: FN0869 COG0561 # Protein_GI_number: 19704204 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Fusobacterium nucleatum # 1 228 7 236 270 63 26.0 3e-10 MKIAACDFDGTLFRDGVVSEADLEAIADWRRSGNAFGIVTGRGRNTLLRDVKRFSIPYDF LICNNGAMICDEQAQDVYCAVLPEPVRAEIMDHPGMRASSQCAFFAGTAIFTHAGKTDYW ILKEHVLPRLSPVEALHMPGLHQISLAYAEPGESAAWSEAFTEGCGENAGVHFSNICIDI TAPDVSKAAGMTHLLELRRWGEAKEVLVIGDDMNDLPMIRHFNGHAVANAAPEVRDAASS VFLSVGQMLRERG >gi|316924394|gb|ADCP01000029.1| GENE 31 43601 - 45433 1384 610 aa, chain + ## HITS:1 COG:YPO3833 KEGG:ns NR:ns ## COG: YPO3833 COG0514 # Protein_GI_number: 16123968 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Yersinia pestis # 18 610 17 608 610 522 46.0 1e-147 MEQGGAGRWSDAELHGALARVFGYTAFRPHQIEIVKGILGGHDSLSIMPTGGGKSLCFQL PSHLMDGVCVVISPLISLMKDQVDAACANGLRAAALNSSLSPGGHRAVREALASGELDLL YISPERLSAPGFWDALASWPVSFFAVDEAHCISQWGHDFRPDYLALSGLVERFPQCPVAA FTATATPEVERDILSRLGLREPRLIRASFDRPNLFYHVLPKEEPHAQLLSFLGGHEGESG IVYRSTRRKVEETAAFLQKKGVKAEAYHAGLPDAERMRVQEAFRRDECPVVVATVAFGMG IDKPDVRFVAHLDLPKNVEGYYQETGRAGRDGDPAHCLLLYSAADMAQLLYFARQTEDEE QRSIAEKHAYAMLEYAERNQCRRKALLSYFGEDFEEPNCGGCDVCTGEIQEEDYTIEAQK ALSAMVRSGCRFGRVLIMDILMGADNRRIRETGFQHLPTYGVGKDRTRRFWKYLIDALTR QGLAAIEGEEYPYLRVTETGWETLRGAPFRALRIVETRARRNRAERDGASQDMDVDDTLF QLLRAERRRLAEAAQVPPYVVFHDRALREMAAGKPASREAMLGIAGVGERKLALYGDAFL AVIARYISEN >gi|316924394|gb|ADCP01000029.1| GENE 32 45430 - 45624 198 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MERQEERERERERERERERENAGIAEKKTNAHREAPLLPSQNASVEAFCRCLRVYPEQMG FEMI 
>gi|316924394|gb|ADCP01000029.1| GENE 33 45793 - 47226 2179 477 aa, chain - ## HITS:1 COG:VC0687 KEGG:ns NR:ns ## COG: VC0687 COG1966 # Protein_GI_number: 15640706 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Vibrio cholerae # 4 474 1 484 494 458 51.0 1e-128 MSGLWYFWGSLALLLGGYFVYGAFVEKVFGADFGRLTPVKSRRDGVDYIELPRYKIFLIQ LLNIAGLGPVIGPILGALYGPSALLWVVFGCIFGGAVHDYCSAMMSLRYGGASYPEIIGR NLGTGVRRFMEVFAIAFMIMVGAVFVLGPAALLANLTSFGLPFWATLIFAYYFLATIMPI DTIIGRIYPFFSVLLLVMAFGLAGSLMLSGRPVLPNTDFFVNTDPAGLPIWPLLFITIAC GAISGFHATQSPLMTRCMESEKHGRMLFYGPMIAEGVLGLIWVTLGLSFYESPEALSAVI KAGTPTLVVQEISMALLGPIGGMLAILGVVVLPISTGDTAFRSARLLVADTLRIDQGPIG KRLMIAVPLFTAGIALTFVDFAVIWRYFGWANQTMSCVTLWAIAVYLARRERFHWIATLP AAFMTVVCVSYLCYAKIGFGMSVEMSTLIGIASAAASLVLFLSRGKRMPELENEASC >gi|316924394|gb|ADCP01000029.1| GENE 34 47457 - 48830 1572 457 aa, chain - ## HITS:1 COG:PAB1021 KEGG:ns NR:ns ## COG: PAB1021 COG2379 # Protein_GI_number: 14521745 # Func_class: G Carbohydrate transport and metabolism # Function: Putative glycerate kinase # Organism: Pyrococcus abyssi # 15 457 7 431 435 290 39.0 3e-78 MTDLPPCQAASELRRLTDAALSAVAPDGAVLRHLHLDGGKLTLIDESGAPAWSGRLDAYR RIRVLGAGKGAAPMAAALENLLGDRISDGLVIVKYGHDLPEGQRTRHIRIKEGGHPVPDE AGAAAAGEIVDMARDSREDDLVLCTFTGGASALTPALHPEIPLADMQRLTCMLLECGATI HEINTLRKHLSRFSGGSLVRAAFPATVLGLIVSDVVGDDLDVIASGPTVPDPSTFADCLR VVEHYGLRWKMPQSIWAHIEGGLQGRTPETPKADEPAFGRVRNVLVASVKQALEAAADEA ARCGFVPRILTTAMSGEARRTAAQLVAEARRAQAGLRPGDAPLCLLAGGETTVTIQGSGK GGRNQEMALAATLELADDRGIDLICVGTDGTDGPTDAAGGYAFSGDLARLRAIGLHPKES LDANDSYPLLLKAGTLLRTGPTRTNVMDMAIMLVRPK >gi|316924394|gb|ADCP01000029.1| GENE 35 49095 - 50090 1240 331 aa, chain + ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum 
# 1 326 1 324 324 303 45.0 3e-82 MKIVVLDGYTLNPGDNPWDPLAALGDLTVYDRTNPEDVLERSRGAEILITNKTRLNADTL AALPDLRCIGVIATGYDVVDIAAAGKRGIPVMNVVNYGTEAVAQHAFALLLELCRRTALH DAGIRSGRWAAGPDWCFWDTPQVELSGKVMGILGFGTIGRRVAELAHAFGMKVIAASARK SAAELSASSAPVSFPVEFVEMDELFRASDVLSLHCPLTDQTRAIVNAANIASMKDGAILL NLARGPLLDEAAVAEALASGKLGGLGADVVSVEPIAQDNPLLASPNTLLTPHIAWATRTA RQNITRIIAENIAGWMAGQPKSVVNKAYLNG >gi|316924394|gb|ADCP01000029.1| GENE 36 50175 - 50912 558 245 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase [Ochrobactrum intermedium LMG 3301] # 1 245 1 243 245 219 46 3e-56 MKENKYDDPTFFGKYSRMPRSKEGLAAAGEWHVLRRMLPPFEGKRVLDLGCGFGWHCRYA VEQGAASVVGVDLSERMLAEARAMTDSPAIQYLRMPIEAIDFPADSFDVAISSLAFHYIE SFGGLCAKVNRCLSAGGHFVFSVEHPVFTAQGSQEWHCDAQGNHLHWPVDSYFSEGVRKA GFLGEEVMKYHKTLTAYVNGLLQNGFELLELVEPQPEPGLLEAYPEIMRDELRRPMMLLV SARKR >gi|316924394|gb|ADCP01000029.1| GENE 37 51138 - 51521 585 127 aa, chain - ## HITS:1 COG:no KEGG:Sdel_0545 NR:ns ## KEGG: Sdel_0545 # Name: not_defined # Def: cupin 2 conserved barrel domain protein # Organism: S.deleyianum # Pathway: not_defined # 18 127 24 131 131 165 70.0 5e-40 MAQYKLAHIDMKGNRGELHDLLNLTGAEVSCNTLPAGASVPFVHHHTQNEEVYLILEGKG MLYIDGEEVPLKEGDCFRIDPQGERCLRAADDSAMRFICIQAKAGSLEGFTMSDAVVTES GSKPSWL >gi|316924394|gb|ADCP01000029.1| GENE 38 51582 - 52001 526 139 aa, chain - ## HITS:1 COG:FN2010 KEGG:ns NR:ns ## COG: FN2010 COG1846 # Protein_GI_number: 19705306 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 11 137 10 136 160 59 27.0 2e-09 MKYSAIFSHASRLRESGNRFILAELEKVGLSDIAPSHGDILVRLLACEACNMSELAKQVH RTKSTVTALVEKLERNGYVLRIPDPEDSRGVLVRLTDKGRALEPAFEAISNGLQRLITDR LSEEEAALLDRLLDKCVNG Prediction of potential genes in microbial genomes Time: Fri May 13 02:18:19 2011 Seq name: gi|316924391|gb|ADCP01000030.1| Bilophila wadsworthia 3_1_6 cont1.30, whole genome shotgun sequence Length of sequence - 2124 bp Number of predicted genes - 2, with homology - 2 Number of 
transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 3 - 54 4.0 1 1 Op 1 3/0.000 - CDS 148 - 1020 1201 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 2 1 Op 2 . - CDS 1017 - 1841 919 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 2031 - 2090 4.0 Predicted protein(s) >gi|316924391|gb|ADCP01000030.1| GENE 1 148 - 1020 1201 290 aa, chain - ## HITS:1 COG:BS_ybfH KEGG:ns NR:ns ## COG: BS_ybfH COG0697 # Protein_GI_number: 16077290 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 8 271 9 272 306 284 59.0 1e-76 MKNETALGHVLTLMTIIIWGTTFVSTKVLLEAFTPIEILFFRFTLGYLSLWAIYPRKGTY GTLRQELLFAAAGLCGVTLYFLLENIALTYTFASNVGVIISIAPFFTAFLANWLLDGEPL RKRFFVGFAAALTGIILIGLNGNFVLKLNPVGDILATLAAVVWACYSILMKKIGAFRMNM VFCTRKVFFYGLLFMLPALFIFDCRLGLERFASPVNTLNLLYLGFGASALCFVTWNWAVR ILGAVKTTVYIYLVPVVTVVSSVIILHETITPLAIVGTALTLTGLIVSQR >gi|316924391|gb|ADCP01000030.1| GENE 2 1017 - 1841 919 274 aa, chain - ## HITS:1 COG:BS_ybfI KEGG:ns NR:ns ## COG: BS_ybfI COG2207 # Protein_GI_number: 16077291 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 1 269 1 269 275 268 50.0 1e-71 MIKETRTIHYDAELAIEAYWLQNVLEPFPDHFHDYYLIGFIEKGARELTCGGQKYITTEG DLFTLNPHEPHGCRSYDGKPFSYRGIGVLPDVMRAAMREITGQAILPRFRERILSQSELA CSLRDLHAMIVQEDKEFRKEEIFLMLLGQLLRDNAGETPLPDAYKDESDIGAVCAYLEEH SGEPVSLDRLGEVAGLSKYYLLRSFTKQKGISPYRYLETIRIAKARKLLERNVPMIEVAL QTGFADQSHFSRFFKRLIGVTPRQYAEIFGGRQA Prediction of potential genes in microbial genomes Time: Fri May 13 02:18:32 2011 Seq name: gi|316924364|gb|ADCP01000031.1| Bilophila wadsworthia 3_1_6 cont1.31, whole genome shotgun sequence Length of sequence - 29186 bp Number of predicted genes - 29, with homology - 25 Number of transcription units 
- 11, operones - 6 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 21/0.000 - CDS 51 - 1148 1496 ## COG3839 ABC-type sugar transport systems, ATPase components 2 1 Op 2 38/0.000 - CDS 1170 - 2015 1302 ## COG0395 ABC-type sugar transport system, permease component 3 1 Op 3 35/0.000 - CDS 2031 - 2924 1287 ## COG1175 ABC-type sugar transport systems, permease components - Term 2984 - 3021 9.3 4 1 Op 4 2/0.000 - CDS 3042 - 4361 2057 ## COG1653 ABC-type sugar transport system, periplasmic component - Term 4475 - 4522 12.5 5 1 Op 5 . - CDS 4535 - 5380 995 ## COG0584 Glycerophosphoryl diester phosphodiesterase - Prom 5460 - 5519 3.3 + Prom 5399 - 5458 2.1 6 2 Tu 1 . + CDS 5672 - 7162 1805 ## COG0554 Glycerol kinase + Term 7393 - 7433 -0.8 7 3 Op 1 13/0.000 + CDS 7697 - 8284 626 ## COG1556 Uncharacterized conserved protein 8 3 Op 2 . + CDS 8335 - 10491 2800 ## COG1139 Uncharacterized conserved protein containing a ferredoxin-like domain 9 3 Op 3 . + CDS 10492 - 10695 89 ## + Term 10822 - 10881 20.9 10 4 Tu 1 . - CDS 10884 - 11303 355 ## Sterm_1353 hypothetical protein 11 5 Tu 1 . - CDS 11406 - 12794 1713 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 12849 - 12908 2.1 - Term 12918 - 12955 9.2 12 6 Op 1 11/0.000 - CDS 12983 - 13921 1263 ## COG1180 Pyruvate-formate lyase-activating enzyme - Term 13945 - 13984 10.5 13 6 Op 2 . - CDS 14027 - 16519 3337 ## COG1882 Pyruvate-formate lyase 14 7 Tu 1 . 
+ CDS 16555 - 16758 151 ## + Term 16799 - 16837 0.0 - Term 17193 - 17231 8.3 15 8 Op 1 4/0.000 - CDS 17252 - 18706 805 ## PROTEIN SUPPORTED gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P 16 8 Op 2 4/0.000 - CDS 18857 - 19153 350 ## COG4576 Carbon dioxide concentrating mechanism/carboxysome shell protein - Term 19192 - 19247 6.5 17 8 Op 3 1/0.000 - CDS 19283 - 19561 438 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 18 8 Op 4 4/0.000 - CDS 19650 - 20642 1036 ## COG0280 Phosphotransacetylase 19 8 Op 5 4/0.000 - CDS 20642 - 21400 1048 ## COG4812 Ethanolamine utilization cobalamin adenosyltransferase 20 8 Op 6 . - CDS 21434 - 22360 893 ## COG4766 Ethanolamine utilization protein 21 8 Op 7 . - CDS 22453 - 22857 669 ## CLD_2520 microcompartments family protein 22 8 Op 8 2/0.000 - CDS 22876 - 23433 759 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 23 8 Op 9 . - CDS 23494 - 24831 1549 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC - Term 24845 - 24898 6.0 24 9 Op 1 . - CDS 24922 - 25854 1307 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 25 9 Op 2 . - CDS 25898 - 26083 67 ## 26 9 Op 3 . - CDS 26149 - 26349 73 ## - Term 26414 - 26457 12.3 27 10 Op 1 . - CDS 26471 - 27004 735 ## COG5405 ATP-dependent protease HslVU (ClpYQ), peptidase subunit 28 10 Op 2 . - CDS 27039 - 27434 464 ## COG0736 Phosphopantetheinyl transferase (holo-ACP synthase) + Prom 27460 - 27519 1.9 29 11 Tu 1 . 
+ CDS 27603 - 28931 1514 ## COG1004 Predicted UDP-glucose 6-dehydrogenase + Term 29043 - 29109 12.6 Predicted protein(s) >gi|316924364|gb|ADCP01000031.1| GENE 1 51 - 1148 1496 365 aa, chain - ## HITS:1 COG:AGl3073 KEGG:ns NR:ns ## COG: AGl3073 COG3839 # Protein_GI_number: 15891652 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 360 56 408 413 352 50.0 5e-97 MARVQLKNVEKTFNKNKVLKGINVEVPDGSFMVMVGPSGCGKSTALRCIAGLEEVTGGSI FIGDEDVTSKEPKDRNIAMVFQNYALYPHMNVFDNITYGLKVRGIPADERKRRAEDAAKL LGLDGLLDRMPKQLSGGQRQRVAMGRAIVREPSVFLFDEPLSNLDANLRNQMRIELRRLH QRLATTSIYVTHDQVEAMTLAEKILVLRAGNIEQYGTPDDIYLHPASVFVAQFMGSPSMN MIPATTDGEKLILPDGTPLSGVAASDIRLAAAPSKDVLVGLRCEDLLLDPEGSVQVTVDI IEALGPDTLAYCHPIAGKAGGVADQSAIIVRLPGTRRPAPGDAIRLSVRAGHGHVFDPAS GLRCL >gi|316924364|gb|ADCP01000031.1| GENE 2 1170 - 2015 1302 281 aa, chain - ## HITS:1 COG:SMb20418 KEGG:ns NR:ns ## COG: SMb20418 COG0395 # Protein_GI_number: 16264152 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 1 281 1 282 282 266 53.0 5e-71 MVEKKGLGKAIGHAFLWFGIAIVLFPLYIALVATTHTFDVLIGSTPLWFGGEFLKNFTTV LTQGLKAAGGIPVWVMMANSLVMALGIAIGKIAISITAAYAIVYFRFPGRELCFVLIFIT LMMPVEVRIVPTFQVAANLNFIDSYLGLVLPLIASATATFLYRQMFMTIPAELLEAARID GAGPWRFFVDVVLPLSRTNTAALFVVLFIYGWNQYLWPLLVTNQESMYTIVMGIQRMVNI PDAVPEWNLIMAVALLGMLPPLLVVLGMQKLFVRGLVETEK >gi|316924364|gb|ADCP01000031.1| GENE 3 2031 - 2924 1287 297 aa, chain - ## HITS:1 COG:RSc1265 KEGG:ns NR:ns ## COG: RSc1265 COG1175 # Protein_GI_number: 17545984 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Ralstonia solanacearum # 5 297 4 293 293 323 58.0 2e-88 MQDGRYAFRGKKHLLLPLLFLLPQLAITVVFFFYPAGEALMGSFYMEDSFGLSKEFVGLD NFITLFSDPSYLETLRLTLVFSLATIVLTMGTALLFAVLADQVNWGGTLYKSMLIIPYAV 
APPLAGVLWLFLFNPSGGVLTELLHHFGYTWNHKINGNQALFLLIVSASWKQISYNFLFF LAGLHAIPRSLIEAAAIDGASPWRRFRTIIFPLLSPTSFFLLVVNTIYTLFETFGIVHAV TQGGPSKATETLIYKVFNDGFIGLDFGGSSAQSVVLMLIVMVLTFLQFRYVERKVTY >gi|316924364|gb|ADCP01000031.1| GENE 4 3042 - 4361 2057 439 aa, chain - ## HITS:1 COG:RSc1264 KEGG:ns NR:ns ## COG: RSc1264 COG1653 # Protein_GI_number: 17545983 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Ralstonia solanacearum # 7 439 10 438 438 444 52.0 1e-124 MRSFRLMALLAICLMLLAAPVQAKTTITFWHGMGGELGEITDEMIKEFNASQDKYEVKGV YKGNYDEAMTAAIAAFRAKQHPNLIQIFEVGTASMMAAKGAIRPVYEIMEAAGTPVDTSK LLGSVASYYSSTDGKLIAMPFNASTTVLYYNKDAVKKAGGDPDHFPTTWPEVAELAKKIK ESGACKYGLTSGWQSWVQLESFSAWHNVPFATNNNGYDGLDTKLLFNSPLHVKHIDFLSK LQKDGVMVYVGRKSEAINAFTSGEAGILMNSSGSYAAVKAGAKFQWGVALLPYWPDVKGA PQNTVIGGAAIWAMAGHSKDAEKGAAAFLNYLLKPEVQARFHQLTGYVPVTLEGYELTKQ QGFYDKNPGTDIAVKALSDKKPTINSLGLRLGNFVQIRNIIDEELEAVWAGKKTAKQALD AAVERGNAELARFAKANKK >gi|316924364|gb|ADCP01000031.1| GENE 5 4535 - 5380 995 281 aa, chain - ## HITS:1 COG:FN1891 KEGG:ns NR:ns ## COG: FN1891 COG0584 # Protein_GI_number: 19705196 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Fusobacterium nucleatum # 16 279 24 260 261 129 33.0 6e-30 MQKYIDFHTCPPVPLVVAHRGARGHAPENTLTAAALGYAVQADLWELDANYTKDGKLVVM HDDTLVRTTDVETAFPGRPSYRVCDFTLDEIKSLDAGSWYAGRDQFGRVAAGEIDADTLK SFDGLTVPTLEEALAFTKDNGWYVNVEIKNHSHLIGHETVTKDVLDLIRRLDMVEQVIIS SFQHRYLEECRVLCPEMATGALVEHIRPRDPAALCRRLQVNAYHPDQRILAPGDLAALRD AGFAVNVWTVNDMSEAKRLVAEGATGIITDFPAACRKALGE >gi|316924364|gb|ADCP01000031.1| GENE 6 5672 - 7162 1805 496 aa, chain + ## HITS:1 COG:TM1430 KEGG:ns NR:ns ## COG: TM1430 COG0554 # Protein_GI_number: 15644181 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Thermotoga maritima # 3 480 2 479 482 669 63.0 0 MSRYILALDQGTTSSRAILFDRSGNIIQMAQQEFTQHYPQPGWVEHNPNELFDSQAVVAA 
KCLRQAGVTGSEVAAVGIANQRETTVVWNRRTGAPVYNAIVWQDRRTAGFCDSLREQGKA CLFAEKTGLVLDAYFSGTKVRWVLENVPGAREQAEAGDLLFGTVDSWLIWNFTKGAVHAT DPSNASRTLMFNIHTGDWDDELLELLSVPRSMLPEVVPSSGIMGHMHPEFLGHSLPLAGD AGDQQAATYGNACMLPGMAKNTYGTGCFLLMNTGTEARRSENNLLTTVGWNCGYGLRYAL EGSVFIAGAVVQWLRDGLQIISDAAEIEPLASTVRDNGGVFFVPAFAGLGAPYWDQYARG AIVGLTRGVTRGHIARAAIEAIALQTLDIMDCMKKDSGLPLSTLRVDGGASRNNMLMQCQ ANVLGVPVERPMITETTALGAAYLAGLAVGFWDSEEEVSALWKLDRRFEPAMDEERRAEL LHNWHRAVGRAANWIE >gi|316924364|gb|ADCP01000031.1| GENE 7 7697 - 8284 626 195 aa, chain + ## HITS:1 COG:FN1539 KEGG:ns NR:ns ## COG: FN1539 COG1556 # Protein_GI_number: 19704871 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 76 195 65 182 183 75 35.0 4e-14 MEQTPAELWKQRSIAIAATVKEVASVDEALQYALELCQAAEPHEQLMGAEPERKGNILAA PGLPAEQAEILRKGCEEKGLVFVEGGLREYAGGMEVGVAWAACGLADTGTCVVESTNEDI RLATMLPETSVLLIRRSTILPGNENGAEVLQKFFGKAEPEFLAFISGPSRTADIERVLTL GVHGPLYLHAVLIND >gi|316924364|gb|ADCP01000031.1| GENE 8 8335 - 10491 2800 718 aa, chain + ## HITS:1 COG:FN1540_1 KEGG:ns NR:ns ## COG: FN1540_1 COG1139 # Protein_GI_number: 19704872 # Func_class: C Energy production and conversion # Function: Uncharacterized conserved protein containing a ferredoxin-like domain # Organism: Fusobacterium nucleatum # 9 461 4 461 463 337 40.0 7e-92 MKEPARDIKNYRTDLSAALEDTFQRRALDKFAVDYRASRERIYSGLNDRELIAEVAARKD ESVRHLDELFVQFKEEAEKRGVQVHLARDAADANRIIAKIAEENGCKMVVKSKSMTVEEI QTNTALEAKGMEVVETDLAEWIIQLRHEGPSHMVMPAIHLSRTQVADTFTKETGKHQTED ITSLVRVARRELRRKFAEADMGISGMNFAVAESGAIGLVTNEGNARMVTTLPRVHVAVGG IDKLIPSFDDAMATLRVLPRNATGQHLTSYVTWIAGGVPTASAPDGKKSMHVVFVDNGRK AVLNDPILSQALRCVRCGACANVCPVYRLVGGHRMGYIYIGAIGLILTYLFHGKDRAKAL VQNCVNCQACKSVCAAGIDLPGLIEEIRMRYIEQDGNSLPMNLLASTLKNRKAFHTLLKF AKYAQKPLTGGEQFIRHLPSMFAKDNEFRALPAIADKAFRDRWEKLDRPVSANPSLRVAI FAGCVQDFVYPEQLEAAVKLMQGHNIRVDFPMDQSCCGLPVVMMGQRETARDVALQNMDA FEKGDYDVILTLCASCASQLKEGYVELFAGQPGRQARAKALADKVMDFSTFAKEKLGLSA ESFNHSDEKVTYHASCHLCRGLGVKEAPRELIAAAADYVPAAEEEVCCGFGGTYSAKFPE 
VSAALLRKKLDGIADTGAARVVMDCPGCVLQIRGGAEKDGKGLKVTHISELLAENLKK >gi|316924364|gb|ADCP01000031.1| GENE 9 10492 - 10695 89 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPPGGLGGHDAPQTPSKRKEAVPSPAEAENGTASFRSKSVRSTLFEVSGVAEGVLGQAMM DLLSYLF >gi|316924364|gb|ADCP01000031.1| GENE 10 10884 - 11303 355 139 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1353 NR:ns ## KEGG: Sterm_1353 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 27 136 1 110 113 184 73.0 8e-46 MISRSRGLSLGGSFPETPFPTHHGGDMKRCRITVLKRHFDEELAKEYGSKGIGKCPMLKE GQIFYADYAKPDGFCDEAWKAIYQYVFALAHGSGQFYYGDWIEKPGVAICSCNDGLRPVI FKVERTDEEAAINYTPHHD >gi|316924364|gb|ADCP01000031.1| GENE 11 11406 - 12794 1713 462 aa, chain - ## HITS:1 COG:BH0992 KEGG:ns NR:ns ## COG: BH0992 COG3829 # Protein_GI_number: 15613555 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 11 454 20 449 454 324 41.0 3e-88 MTSLAPHFESILDTLSDGVFISDAEGTTLFVNKMYETLTGLRQGEIRGKNIRTLVQQGIF DKVVNPQIVETGKSATHVQQLANGKRLVLTGYPVFDDKGQLCLVVTFARDITVLTQLQEE MTAQKKLIEQFHDRLAFLAREQTRELVPVFESREMKEVMSLVERFASSDATVLILGETGV GKDVVARLTHELSPRKEKMFLKVDCGGISESLTESELFGYMPGAFTGAGNKGKSGYFEMA DGGTVFLDEIGELSLAMQTRLLRVLQDGEIMRVGSSKPRKVNVRIIAATNRNLAERVEKG LFRRDLYYRLNVAVVNIPPLRERPDDIQPLVTHFLNMFTSKYRKHMNLTPGLMEALRQYS WPGNVRELQNLMHSIVITKDHGPLSVKDLPRHMTGNENEELFFPDDGINYERPLKDIMAD IERGILRKALKVHGSVQKVAEVFKVNRSTIFRKLHSEKDGGE >gi|316924364|gb|ADCP01000031.1| GENE 12 12983 - 13921 1263 312 aa, chain - ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 11 286 6 274 302 248 44.0 8e-66 MGSFEDRKATGTVFNIQKYSVHDGPGIRTIVFLKGCPLSCKWCSNPESQASHPQVAYNKG RCIGCHRCIKACEHDAITVNEDGTLSLDRGKCDVCKTLDCAHACPAQGMIIYGENKTVDQ 
ILKEVEKDALFYARSGGGMTLSGGEPLMHADIALPLLREARHRRIKTAIETCGCIPWDTL KEAAPYLNYVLFDVKQMDSEKHREGVGVGNELILSNLKKLLTEFPNLHVQVRTPIIPGFN DNDEFAYALGEFLKGYENVGYEALPYHRLGTQKYDFLSREYAMGDVSLPDGVAQRIQRIV DETRGAVTEEKK >gi|316924364|gb|ADCP01000031.1| GENE 13 14027 - 16519 3337 830 aa, chain - ## HITS:1 COG:pflD KEGG:ns NR:ns ## COG: pflD COG1882 # Protein_GI_number: 16131789 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 32 827 4 763 765 383 32.0 1e-106 MTQVAEIKSPHEQRLEDNIAGKEDIYRESHKRVFKLLERFDGQKPAIDVERALYFTQSMA ETVGQPLVLRWAKALMNVAKNITVMVQDDQLLLGRCGGHDGRYGILYPELDGDFLDIAVR DLPTRPQSPASISPEDAKIVVEQIAPFWKGRTYHEALNKALPAEVHKLTYDDPDGLISRF IVNETSSFRSSIQWVHDYEVVLKRGFNGLKQEMEEKLAALDPASPVDQVDKRPFIEATIL VCDAIVLWAKRHADAARKAAEACADPVRKAELIRMAENAEHVPANPARDFYEAVQSQYFT QMFSRLEQKTGTTISNGRMDQYFYPFYKKDMEAGILTDEKTLEYLECMWVGMAEFIDMYI SPAGGAFNEGYAHWEAVTIGGQTPDGRDATNDLTYLFLKSKREFPLHYPDLAARIHSRAP ERYLWDVAETIKFGSGFPKLCNDEECIPLYVSKGATFEEALDYAVSGCIEIRMPNRDTYT SGGAYTNFASAVEMALYDGKMKKYGDVQLGIQTGDARKFKSWDEFWNAYVQQHMLLLRTT FIQQYIVIQTRAKHFAQPMGSVLHALCRKHCIDLHQPQIPEGLNFGYFEFMGLGTVIDSL AAIKKLVFEDKKLTMDQLIDALEANFEGYEDIQQLLRTAPCYGNDDEYADEIGRELDRMA VSFAAKYGKEMGINNDARYVPFTSHVPFGKVVSATPNGRVAWFPLADGSSPSHGADHNGP TAILLSNHNTKNYGMRARAARLINVKFTPKCVEGDAGTEKLVQFIRTWCDLKLWHIQFNV INADTLKKAQKDPQKYRNLIVRIAGYSAYFVDLTPDLQNDLIARTGHDQM >gi|316924364|gb|ADCP01000031.1| GENE 14 16555 - 16758 151 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMCGAFRRTARKFSFPSSLVLLIIKQVPCQMGYKKCLLVISWYCDRERFEAFLWEGEKSQ NCITYQE >gi|316924364|gb|ADCP01000031.1| GENE 15 17252 - 18706 805 484 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P [Lactobacillus reuteri DSM 20016] # 8 471 10 461 477 314 39 4e-85 MDVRQQDVERIVVEVLKKMMSDQPTAAATTVVAASGCDCGDFGLFDRLEDAVQAAEAAQK KISTVAMRDKIIAAIRKAGLENAKAFAEIAHNETGMGRVSDKIAKNILVCERTPGTECLS PMAISGDMGLTLIENAPWGVIASVTPSTNPTATVINNAISMIAGGNSVIFAPHPNAKRAS QTAIQVLNKAIIEATGVANLLVAVKEPTIEVAQELFSHPRIKLLVVTGGEAVVAQARKVA 
TMRLIAAGAGNPPVVVDETANIARAARSIYDGASFDNNIICADEKEIIAVDSIADQLKAE MKAIGAVEISLEQADAVARVVLRNYPQVEGGKAPNPNPKWVGRDAALIAKAAGIDVPDSC RLLIVDVKRDINHVFARVEQLMPVIPLLRAANVDEAIEWALILERGLSHTAGMHSRNIDN MDKMARAMNTSLFVKNGPHLAALGAGGEGWTTMTISTPTGEGVTCARSFVRLRRCCVVDN FRIV >gi|316924364|gb|ADCP01000031.1| GENE 16 18857 - 19153 350 98 aa, chain - ## HITS:1 COG:sll1030 KEGG:ns NR:ns ## COG: sll1030 COG4576 # Protein_GI_number: 16329366 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Synechocystis # 1 94 1 95 100 84 49.0 7e-17 MELAKVIGQVVSTVRCPGLPYNSLLLVDLLNEKGESIGRSQVAADPIGAGEGEWVIVSRG SSARFAIDKDAPLDLVIVGIVDHVNAGKAAIYRKNQGE >gi|316924364|gb|ADCP01000031.1| GENE 17 19283 - 19561 438 92 aa, chain - ## HITS:1 COG:ECs3319 KEGG:ns NR:ns ## COG: ECs3319 COG4577 # Protein_GI_number: 15832573 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Escherichia coli O157:H7 # 1 92 15 105 111 113 79.0 1e-25 MDALGMIETRGLVGLIEAADAMVKAARVQLVSYEQIGGGYVTALVRGDVAACKAATDAGA AAVQRVGGELVAVHVIPRPHQDLEAVFPLSRK >gi|316924364|gb|ADCP01000031.1| GENE 18 19650 - 20642 1036 330 aa, chain - ## HITS:1 COG:ECs3320 KEGG:ns NR:ns ## COG: ECs3320 COG0280 # Protein_GI_number: 15832574 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Escherichia coli O157:H7 # 3 321 2 319 338 278 47.0 2e-74 MSVFERCLEKCRARKGAVVYSDGEDRRVVSAAAKLIREGLVEPILIGDPEKVRAALQESG ETGVSLQVVNPYNPALLQHNAAEYMSIQKGKGKDISEEEAIKAVKNPLAAGALMVRRGEA EIGVAGNLSSTADVIRAGLRMVGTAAGSKTVSSFFFMLKDNNVCMFTDCAVIPEPTSAQL ADIAISTAGVYKRVMGDEARVALLSFSTKGSAKHERVDKVRAALEEIKTREPNLLVDGEL QLDAAVEPEVARLKAPGSPVAGNANVFVFPSLEAGNIGYKIAQRFGGWTALGPLLQGFAH GWHDLSRGCSSDDIYKISVVGLGMNRGGQQ >gi|316924364|gb|ADCP01000031.1| GENE 19 20642 - 21400 1048 252 aa, 
chain - ## HITS:1 COG:eutT KEGG:ns NR:ns ## COG: eutT COG4812 # Protein_GI_number: 16130384 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization cobalamin adenosyltransferase # Organism: Escherichia coli K12 # 3 242 4 252 267 125 34.0 7e-29 MIFLTEEDVRQRCELGQGGEFRLQAGERLTPAATELLRSRNCRVILPGQCTVEAAPVEEA KPAAPAAEAAPAAPAAEQASFPDGTYLDANTVVSKSHPRIFLRGKLDTLISSTVLVQTGF DGNNKLPAVLRNGLSDINVWLWQILQAEVSGEAVPAQSLCGMNAEAIRLVSHDPMKYLGQ GHIVPDVALGPNVALLNWLRAQAREVEVAYVQVGMEREDILASLNRLSSAIYVLMLLTVV AESGRDISKVGL >gi|316924364|gb|ADCP01000031.1| GENE 20 21434 - 22360 893 308 aa, chain - ## HITS:1 COG:eutQ KEGG:ns NR:ns ## COG: eutQ COG4766 # Protein_GI_number: 16130385 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli K12 # 89 308 5 228 233 79 27.0 6e-15 MTKQIITAETVRLAWKAGESVIVYAKGDIVTQQAQDDARHYGITLTTETPVAPSSALAEP EAEAVPPVPPVLEEVPASQEPVHVPTPALTADQLAAAARQLQEALVPLTQALTASVAPAA QPVPPVVFTPGSNPMAAEAVTASAPAEQRSDDLLVAIRRGVLSALPTGTADEALIDRLIA SVLAEMGACAAPETRPGVRQAGGVTHVDSKAQYWGGSRAKGSVAVMDVLSPAKGDAASVG YLDWENLSFSWTFRRTEVLVVLEGDLNLSIEGTTFSAAHGDVFSIPAGTEVELSSSGHVR CATVAVAA >gi|316924364|gb|ADCP01000031.1| GENE 21 22453 - 22857 669 134 aa, chain - ## HITS:1 COG:no KEGG:CLD_2520 NR:ns ## KEGG: CLD_2520 # Name: not_defined # Def: microcompartments family protein # Organism: C.botulinum_B1 # Pathway: not_defined # 8 86 8 86 98 63 45.0 2e-09 MSLEYKPSLGFVETAGLVIAIQVADAMAKAAEVEIISAHKVDGLRVCVICKGDVAACQAA VETGALLAQSLNGLNGYNIIPSPAEEPDSLMDMLEDIKRKKAARKAAKLTRMAAAKGEAA APAAEESPKGKAKK >gi|316924364|gb|ADCP01000031.1| GENE 22 22876 - 23433 759 185 aa, chain - ## HITS:1 COG:STM2054 KEGG:ns NR:ns ## COG: STM2054 COG4577 # Protein_GI_number: 16765384 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Salmonella typhimurium LT2 # 7 184 4 181 184 138 43.0 
7e-33 MSESYTAVGMVEFNSIAAGIDAADQMVKTAQVDPLFFKTICPGKFVAAVTGDVAAVSASV NAGRETHADALVDWFIIPNIHRDVIGALAGATGITERGALGIIETFSAASIVVASDAAVK AADVQLLDVRVALGLGGKGYALMTGDVAAVNAAVEAGSTAAAESGLLVSKVVIPSPAETV FEQIA >gi|316924364|gb|ADCP01000031.1| GENE 23 23494 - 24831 1549 445 aa, chain - ## HITS:1 COG:lin1106 KEGG:ns NR:ns ## COG: lin1106 COG4656 # Protein_GI_number: 16800175 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Listeria innocua # 1 434 1 439 454 340 42.0 3e-93 MDKQAFIAAVKAAGIVGEGGAGFPAHVKYAADADTVIANGCECEPLLHTDQHHMLHHANE IVRAFSELMQATGAKRGVIALKKKYKEATKILTDAIGNRPIELALLPNFYPAGDEQTLVR EVTGRSVPPLGLPLMVGAVVANVGTLVSVAHALDGKPVTHKFLTVTGEVKKPGIVCVPIG TPLSACLEAAGGPTVSDPVFVLGGPMMGRMVDNADAFAKEVVTKTSGGLIILPRGHYLHR NATTPLNFMRRRAASACIQCRSCSELCPRHLLGHPFETHRVMRAFGSNAELTAEAGRLAL LCCDCGVCEHVACPMGLSPRRINQAIKNELRAAGMKYEGSRDVNEAYTQWREFRRVPVPR LVNKIGISRYMELPTNDLGALSPAEVRIPLRQHIGAPAQAVVKAGDRVKCGDVIGEIPEQ GLGARIHASMDGVVKSVDGAVVIGK >gi|316924364|gb|ADCP01000031.1| GENE 24 24922 - 25854 1307 310 aa, chain - ## HITS:1 COG:MT2360 KEGG:ns NR:ns ## COG: MT2360 COG2159 # Protein_GI_number: 15841794 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Mycobacterium tuberculosis CDC1551 # 44 277 65 306 307 130 34.0 5e-30 MKIIDFRFRPNTPEIINGIKNSSMFKATCEAIGFDKRVPQALPEIVADLDRRGVELCVVS GRDCETTYGFPANNNSILEFCRAYPNKFLGFWGIDPHKGMDAVREVEHVIKDLGMRGIAT DPYLAHCPPSDARYYPIYAKCVELGVPVFVTTAPPAQVPRAIMDYIDPRQIDVVARDFPE LILIMSHGGYPFVNEAIFACMRNANVYMDLSEYELAPMAEVYVDALNKMIGDKVIFASAH PFIEQADAIEIYKNLNISEEVREKVMYKTAAKILGLDKGVSLNTVKPMAPNGFNNAKFPL PFQPFAPMGR >gi|316924364|gb|ADCP01000031.1| GENE 25 25898 - 26083 67 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQYYGLKAITRTKSLYSGLLHRFQQKTVLSGSNQERFLLVPISLLSLGNLLVCAKLERHL L >gi|316924364|gb|ADCP01000031.1| GENE 26 26149 - 26349 73 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MITRSMRKIIRFLLPFAFFLVTFTRYGFPLLHRCNHGIRTLVLRIGTRSDPIAFLFTSQM MQKRTN >gi|316924364|gb|ADCP01000031.1| GENE 27 26471 - 27004 735 177 aa, chain - ## HITS:1 COG:lin1317 KEGG:ns NR:ns ## COG: lin1317 COG5405 # Protein_GI_number: 16800385 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent protease HslVU (ClpYQ), peptidase subunit # Organism: Listeria innocua # 1 177 1 179 179 221 64.0 6e-58 MELRGTTILAVRHKGHVALIGDGQVTMGQSIVMKHTAKKVRRMYKDQIVAGFAGATADAF TLFERFDAHLEQTGGNLIRAAVELAKDWRSDKFLRKLEAMLLVADRDHTLVLTGTGDVIE PDDGIASIGSGGPYALSAARALVRHSDLGAEDIALESMKIASELCVFTNGHYTLEVL >gi|316924364|gb|ADCP01000031.1| GENE 28 27039 - 27434 464 131 aa, chain - ## HITS:1 COG:VC2457 KEGG:ns NR:ns ## COG: VC2457 COG0736 # Protein_GI_number: 15642453 # Func_class: I Lipid transport and metabolism # Function: Phosphopantetheinyl transferase (holo-ACP synthase) # Organism: Vibrio cholerae # 3 130 2 124 126 89 43.0 2e-18 MAIIGIGLDLIELSRMERSLHRFGEHFLNRIMADDERSAIPGDPASPSARAVSHVAARFA AKEAAVKALGTGFAEGIGPRDVAVRSLPSGKPELVLSGKAREKADALGVKKLHLTLTHTR DNAAAVVILEG >gi|316924364|gb|ADCP01000031.1| GENE 29 27603 - 28931 1514 442 aa, chain + ## HITS:1 COG:mlr5265 KEGG:ns NR:ns ## COG: mlr5265 COG1004 # Protein_GI_number: 13474390 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Mesorhizobium loti # 1 441 1 440 443 466 54.0 1e-131 MRICIVGTGYVGLVSAACFAEMGNDVRCVDVNPEVVALLDSGRIHIFEPGLEDLVSRNRQ EGRLRFTTSLADGLDDAEFAFITVGTPSRPDGSCDLSFVEGVARQIGEHMRGPLIVVDKS TVPVGTADRVRELVAAALAARGEDIAFDVVSNPEFLKEGDAVSDFMKPDRVIVGTSSEKT ASAMRDLYAPFARSREKLIVMSVRSAEMTKYAANCMLATKISFINEIATLCEKVGADVRD VRTGIGSDHRIGYQFIYPGIGYGGSCFPKDVKALIHTAHEAGMRPELLEAVEGVNARQKR SMAFRVADYFEPQGGVFGKVLALWGLAFKANTDDMRESPALSIIEELTSRGMRVRAYDPI AGPNARKLLADNPLVFIEDDPYAICEGSDALLVATEWNQFRNPDFDRIKESLVAPVLFDG RNLYSPSVLGKRGFAYFCVGKK Prediction of potential genes in microbial genomes Time: Fri May 13 02:19:04 2011 Seq name: gi|316924358|gb|ADCP01000032.1| 
Bilophila wadsworthia 3_1_6 cont1.32, whole genome shotgun sequence Length of sequence - 5267 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 32 - 2287 2857 ## COG0209 Ribonucleotide reductase, alpha subunit - Prom 2431 - 2490 4.3 - Term 2459 - 2494 8.1 2 2 Tu 1 . - CDS 2524 - 2706 262 ## Dvul_2942 hypothetical protein - Prom 2863 - 2922 3.2 + Prom 2692 - 2751 2.5 3 3 Tu 1 . + CDS 2881 - 3117 350 ## DVU3085 hypothetical protein 4 4 Op 1 . + CDS 3218 - 4105 799 ## COG0266 Formamidopyrimidine-DNA glycosylase 5 4 Op 2 . + CDS 4150 - 5223 1603 ## COG0473 Isocitrate/isopropylmalate dehydrogenase Predicted protein(s) >gi|316924358|gb|ADCP01000032.1| GENE 1 32 - 2287 2857 751 aa, chain - ## HITS:1 COG:AF1664 KEGG:ns NR:ns ## COG: AF1664 COG0209 # Protein_GI_number: 11499254 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Archaeoglobus fulgidus # 13 750 3 752 752 752 51.0 0 MMNMPLNLTPPVINPNAEKVLSKRYYRKDENGQCVEDATGLFWRVASSVAAEEAKYEQSP WKPDALARAFYDMMTDWRFLPNSPTLMNAGSELGQLSACFVLPVGDSIEEIFDAVKFAAM IHKSGGGTGFSFSRLRPKNSRVGTTGGVASGPVSFLRIFNTATEQIKQGGTRRGANMGIL RIDHPDILEFIRAKEQEGEFNNFNLSVGITEAFMQAVEQDRDYALVNPASGKEQGRLNAR EVFNLLVDKAWQSGDPGIIFLDRINRDNPTPAQGEIESTNPCGEQPLLPYEACNLGSLNL SSFFVPGHLDETNPADRGIDWAGIERTIALAVRFLDNVVDASQFPLERIAEQVRRNRKIG LGVMGWADLLYQLRIPYDSADAITLAERIMDFIETRGRAASIQLAAERGPFPAYETSVYP TKGIPPLRNATITTIAPTGTLSILAGCSSGVEPLFALSFARNVMDGERLMEVNPHFEEAL REIDCYSQPLLEGVAELGSIQSLDQLPEDLRRVFVTAMDIAPEWHLRMQAAFQRHTDNAV SKTVNLPNSATRDDIFAIYWMAYEEGCKGVTVYRDGSKSTQVLTTSAPAAKPSEESQVRV RKRPDFVTGFTQKVQTGLGGMYLTVNEVDGKPFEIFATIGKSGHSVTAKAEAIGRLVSLA FRSGIDVADVVGQLKGIGGEHPIFQKKGLLLSIPDAIAWVLENRYLKGRNPVPTHTVSAF DRQLCPDCGGELVFQEGCHRCPNCAYTKCGG >gi|316924358|gb|ADCP01000032.1| GENE 2 2524 - 2706 262 60 aa, chain - ## HITS:1 COG:no KEGG:Dvul_2942 NR:ns ## KEGG: Dvul_2942 # Name: not_defined 
# Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 3 59 4 60 60 70 71.0 2e-11 MAVTAEQVLKAMQDAGKPVRPGDVAKALDADSKEVSKAIDELKKSGKIMSPKRCYYAPAE >gi|316924358|gb|ADCP01000032.1| GENE 3 2881 - 3117 350 78 aa, chain + ## HITS:1 COG:no KEGG:DVU3085 NR:ns ## KEGG: DVU3085 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 70 1 70 81 105 71.0 4e-22 MKYLIMEDFAGQPVSFVFPRRVDHGDMREQLPYGRVLGAGYVELRDGAFRCHGGYAELDI VARPEDEAILMEAFQKAE >gi|316924358|gb|ADCP01000032.1| GENE 4 3218 - 4105 799 295 aa, chain + ## HITS:1 COG:SP0970 KEGG:ns NR:ns ## COG: SP0970 COG0266 # Protein_GI_number: 15900847 # Func_class: L Replication, recombination and repair # Function: Formamidopyrimidine-DNA glycosylase # Organism: Streptococcus pneumoniae TIGR4 # 1 293 1 271 274 187 35.0 2e-47 MPELPEVETIARTLAPQVEGRRIVACELLNPSTFEGTIPLGKVVGGVIGRPGRRGKLLLL PLAFGSDPLPPVDEACPSGSLARCLACGDERITGLGFHLKMTGRLFVYPAGTPPEKHTRL LFDLDDGSRLFFDDTRKFGYVRVLSPSSVSCWPFWNSLGPEPLEMDADAFAACFAGRRGK IKALLLDQSVLAGCGNIYADESLFRAGIRPDAQLVSADRLKRLHAALREVLLESIDACGS SIRDYRTARGDAGAFQNAFRVYGRSGETCLECGTPLESCRIAGRATVFCPNCQLA >gi|316924358|gb|ADCP01000032.1| GENE 5 4150 - 5223 1603 357 aa, chain + ## HITS:1 COG:PA3118 KEGG:ns NR:ns ## COG: PA3118 COG0473 # Protein_GI_number: 15598314 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Pseudomonas aeruginosa # 1 357 1 357 360 434 63.0 1e-121 MQKTICLLPGDGIGPEIVAEAVKVLRAVEKKFGHSFTMTEALLGGAAIDAVGVPLPDATV EACKAADAVFLGAVGGPKWDTIDPAIRPERGLLGIRKALGLFANLRPATLFPELAGACLL RPDISAKGLDLIVVRELTGGIYFGQPAGTEVRDGLRTGFNTMIYNEEEIARIGRVAFSTA RKRRKKVCSVDKANVLAVSRVWREVMIEVGKEFPDVELTHLYVDNAAMQLVRDPSQFDVI VTGNLFGDILSDEASVITGSIGMLPSASLGTGNPGLFEPIHGSAPDIAGQGKANPLATIL SAAMMLRLAFDMDAEAAAIEGAVHKVLSDGFRTGDIMEAGKTQLSTTAMGDKVVETM Prediction of potential genes in microbial genomes Time: Fri May 13 02:19:11 2011 Seq name: gi|316924356|gb|ADCP01000033.1| 
Bilophila wadsworthia 3_1_6 cont1.33, whole genome shotgun sequence Length of sequence - 1359 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 220 - 405 87 ## 2 2 Tu 1 . + CDS 320 - 1120 833 ## COG0428 Predicted divalent heavy-metal cations transporter + Term 1189 - 1235 -0.0 Predicted protein(s) >gi|316924356|gb|ADCP01000033.1| GENE 1 220 - 405 87 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPAPNAVNAHVQHPASRAIPTGELKKSVINAASDNCENHPVFNPNALIPARAKGISILFH A >gi|316924356|gb|ADCP01000033.1| GENE 2 320 - 1120 833 266 aa, chain + ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 7 266 9 269 269 242 56.0 4e-64 MTDFFSSPVGMALLAGCCTWAFTAFGAGMVYTAKSFSRRTLDVMLGFAAGVMIAASYWSL LAPALEMSSHLGRLACVPVALGFLAGAGVLRLVDLILPHIHPTENVPDGPPSKLPRSALL VFAITLHNIPEGLAVGVAFGAAASGAPEASIAGAMTLMFGMGLQNIPEGVAVSVPLLREG FSKNRAFFFGQLSGIVEPIAAVFGALVVGIAEPILPFALAFAAGAMIFVVVEEVIPESHA SGHGDAASLGVIIGFVVMMCLDVALA Prediction of potential genes in microbial genomes Time: Fri May 13 02:19:30 2011 Seq name: gi|316924343|gb|ADCP01000034.1| Bilophila wadsworthia 3_1_6 cont1.34, whole genome shotgun sequence Length of sequence - 14042 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 8, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 26 - 319 424 ## COG0776 Bacterial nucleoid DNA-binding protein 2 2 Tu 1 . + CDS 213 - 407 62 ## + Term 516 - 569 5.1 3 3 Tu 1 . - CDS 1232 - 2497 2001 ## COG1541 Coenzyme F390 synthetase 4 4 Tu 1 . - CDS 2602 - 3993 1761 ## COG0038 Chloride channel protein EriC - Prom 4102 - 4161 3.5 5 5 Tu 1 . 
- CDS 4460 - 5518 1240 ## LI0340 hypothetical protein - Prom 5547 - 5606 2.3 + Prom 5550 - 5609 2.8 6 6 Op 1 20/0.000 + CDS 5736 - 6974 1680 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component + Term 6977 - 7028 9.1 7 6 Op 2 24/0.000 + CDS 7172 - 8077 1410 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 8 6 Op 3 19/0.000 + CDS 8074 - 9075 1323 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 9 6 Op 4 18/0.000 + CDS 9063 - 9833 216 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 + Term 9897 - 9949 3.0 10 6 Op 5 . + CDS 9979 - 10686 264 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Prom 10742 - 10801 5.3 11 7 Op 1 . + CDS 10859 - 12160 1217 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 12 7 Op 2 . + CDS 12178 - 13062 1223 ## COG0548 Acetylglutamate kinase + Term 13179 - 13218 0.8 13 8 Tu 1 . + CDS 13224 - 13742 382 ## Dvul_2350 hypothetical protein + Term 13745 - 13778 2.1 Predicted protein(s) >gi|316924343|gb|ADCP01000034.1| GENE 1 26 - 319 424 97 aa, chain - ## HITS:1 COG:XF0743 KEGG:ns NR:ns ## COG: XF0743 COG0776 # Protein_GI_number: 15837345 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Xylella fastidiosa 9a5c # 5 95 3 93 99 62 34.0 2e-10 MSKTVTKADIAKSICAQSEMKQQHAKKVLDAMLGIMKKALKEEKEMLLSGFGKFEAFTKK NRKGRNPQTGEEITLDSHDVLAFRISRKFKAAMNSEQ >gi|316924343|gb|ADCP01000034.1| GENE 2 213 - 407 62 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIPSMASSTFLACCCFISDWAQIDLAISALVTVLLMVSSPESRCASDIFWCGPSGPVPFA RTVE >gi|316924343|gb|ADCP01000034.1| GENE 3 1232 - 2497 2001 421 aa, chain - ## HITS:1 COG:AF1671 KEGG:ns NR:ns ## COG: AF1671 COG1541 # Protein_GI_number: 11499261 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Archaeoglobus fulgidus # 12 407 11 417 433 176 
29.0 6e-44 MTRKDRTEGIYNRREVLDESERRQYCLIQLKDLLSYAYRYSEDVKKRFDRAQFNVEKFKT LSDIKHIPLLKKKELIFLQSIGPRLGGLLTKDIGELKRIFLSPGPIFDPEDRMDDYWGYT EAFYSVGFRPGDVVQVTFNYHLSPAGLMFEEPLRNLGCASIPAGPTDPGTQLDIMQKLRV SGYVGTASYLMHLAQKAEEKGINLRKDLFQEVAFVTGERLSEKMRNQLEKKFDIILRQGY GTADVGCIGYECFHKTGLHISNRCYVEICHPDTGIPLKDGEVGEIVVTSFNKTYPLIRLA TGDLSYIDRAPCACGRTSPRLGSIVGRVDTTARIKGMFVYPHQVEQVMARFEDIKRWQIE VINPGGIDEMILYVEASNFHQEDELLHLFRERIKLRPELKVLAPGTLPPQIKPIEDKREW D >gi|316924343|gb|ADCP01000034.1| GENE 4 2602 - 3993 1761 463 aa, chain - ## HITS:1 COG:L113400 KEGG:ns NR:ns ## COG: L113400 COG0038 # Protein_GI_number: 15673646 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Lactococcus lactis # 21 456 3 426 512 196 33.0 6e-50 MPMKRKTKVRIARFTRQVPILREMQALRNESVLRLVVQAAFVGGCTGGVIGVFRWLYDHG NAWIVSTLRHGADDLATALIVFGTLILLALIVGQMVRKEPLISGSGIPQTELALAGQLPY PGFRIVASKFVGTLLSLSGGLSVGREGPSIQMGAAVGCGFGRIFQKLTGEDPGHAPRFLI AGSVAGLAAAFGAPFAGMLFAFEEVKTIVTVPLLLFTGVASFSAWFVITILFGFGLVFPF SSIPSLDWGQMWLPVLFGVGTGLFSALYNTALLRVTEFHDHQKLVPAPLKPLLPFLLSGV LLYVYPQVLVGLGYSTADLGGLTPNGPLLLGSVALLLFVKIIFSILSFASGVPGGLLMPM LAIGSMVGAVGGTLLIGNGFSAEPQLPAYLVLGMAGLFSGTVRAPLTGTALVAEMSGAFQ CLPEMAIVAFISTVVANGVGSPPVYDSLKRRILVNKNASSKIE >gi|316924343|gb|ADCP01000034.1| GENE 5 4460 - 5518 1240 352 aa, chain - ## HITS:1 COG:no KEGG:LI0340 NR:ns ## KEGG: LI0340 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 352 1 325 325 291 41.0 3e-77 MLIRCPECRFERQIDENAIPANAAMATCPHCQYRFRFRNPDGTPVADETPAAAPAPQTAP AGRPLPPDLDGDDPLPPGAMVPRIPGDHDASAPTPEVPEAEPQKGQPAPPQNVAEKRSDF WNNFQKKRDGESKGKGGIAAAMHVDGTDVPWEEPKRYNLFMALYQTILRVMFNAPRFFAA LPATNGKLTRPLVFYLILGMFQTLVERMWYLMSIQASGPSITDPKLQEVLGDVAQSMSLP LTILLTPGILAIQLCFFASVFYLMLRLVQPENVQFRTVFRVIAYSAAPTVVCIVPLVGPL VGSIWFGVCCFIGCKYSMRLPWSRTGLALGPLYLIAFAIGMQLIRQFLSMSA >gi|316924343|gb|ADCP01000034.1| GENE 6 5736 - 6974 1680 412 aa, chain + ## HITS:1 COG:PA1074 KEGG:ns NR:ns ## COG: PA1074 COG0683 # 
Protein_GI_number: 15596271 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Pseudomonas aeruginosa # 37 401 1 362 373 215 37.0 1e-55 MLGTMRDAVCFFRKSRAPAVPPRRSIRKTFHFSGESMNKLVKKLMLVAAMLMLAAGPSFA ADAIRIGLMCPLTGKWASEGQDMQQIVSLLVSEVNKAGGINGKQIELIVEDDAGDPRTAS LAAQKLASAGVMAVIGTYGSAVTEASQNIIDEAEIMQIATGSTSVRLTEKGLPLFFRTCP RDDEQGRVASKVIAAKGFKKVAILHDNSSYAKGLAEEAQKGLKDAGVPVVFYDALTPSER DYTAILTKLKAADPDLIFFTGYYPEAGMLLRQKKEMHWDVPMMGGDAANNTDLVKIAGKD AAKGYFFISPPSAHDFDTPEAKDFFSRYKAQYNSLPSSVWSVLAGDAFKVIVAALQAGTD ANPEAVAKYLKKDLKEYPGLTGKLGFNEKGDRIGDLYKVYEVDGNGGFVLQK >gi|316924343|gb|ADCP01000034.1| GENE 7 7172 - 8077 1410 301 aa, chain + ## HITS:1 COG:YPO3807 KEGG:ns NR:ns ## COG: YPO3807 COG0559 # Protein_GI_number: 16123941 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Yersinia pestis # 4 301 8 308 308 216 45.0 4e-56 MNEFLQQLINGLAVGGIYALVALGYTMVYGVLKLINFAHGDIFTIGAYLGMTLLVSGGLS GSMTPVLAVGLVVIIVFGLVALLGVALERVAYRPLRKANRLAAVVSALGASIVFQNAVML IYGARVYVYPENLIPTLTFNIFGLNVPLMRVIVIVSSLVLMLALYAFINRTRMGTAIRAV AIDQGAARLMGINVDRVISLVFFIGAGLGGVAGVMVGTYYGQIDFTMGWSYGLKAFTAAI LGGIGNIPGAMIGGLLLGVIEALGASYLAMAWKDAIAFLVLILILIIRPTGLLGERVADK L >gi|316924343|gb|ADCP01000034.1| GENE 8 8074 - 9075 1323 333 aa, chain + ## HITS:1 COG:YPO3806 KEGG:ns NR:ns ## COG: YPO3806 COG4177 # Protein_GI_number: 16123940 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Yersinia pestis # 29 300 113 407 428 189 40.0 4e-48 MSLFKRYYPVLVAVLVAVLPLGMNTYWTEVAVNVGLYALLALSLNVILGQAGIFHMGHAA FYAVGAYVTAILNTHYQIPILLLIPVAGAAAALFALIVARPIIHLRGDYLLIVTIGIVEI VRIALINDVFGLTGGANGIFGIARPELFGIKIRKAIQFYYLIWIMVGLTVLLFHWLSESR FGRALNCIKEDDTAAEGCGMDVAHLKLMAFVIGAFWAGMAGNLFAAKMTIISPSSFTFWE SVVVFAVVILSGGSQIGVLLGTFLIVALPEMFRDFASARMLVFGLAMMIMMVVRPQGLLP 
PSPRRYDVRRLLRRTRDFSSPVAPRTARKEGAA >gi|316924343|gb|ADCP01000034.1| GENE 9 9063 - 9833 216 256 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 1 244 3 230 309 87 26 4e-17 GRGMSLLSLQTVTKIFGGLTAVNEVSFDVEQGSIVGLIGPNGAGKTTVFNLITGNYVPDG GDIRFAGQSIKGMKPHTIVGLGIARTFQSIRLFPTLPLVENVLAGRHCRMHSGIIGSMFH TPAQRREERAALERAMNELEFVGLADSYAEEAGSLSYGNQRLLEIARALASDPKLIILDE PAGSMNDPETAVLIELIYAIRQRGVTVLLIEHDMGLVMKVCEKLLVLEYGSLIASGSPDV VRRDPKVIEAYLGQDD >gi|316924343|gb|ADCP01000034.1| GENE 10 9979 - 10686 264 235 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 226 2 230 245 106 29 1e-22 MFLEVENLQAGYGTEEVLHGVSLRVDEGEIVSILGANGAGKTTTLLTISGLVRASGGAVR FQGEDLHTLPSHEVVRRGLTQSPEGRRVFGVLSVLENLRLGAFTVGDKEKAEQTLDWIFD LFPRLHERKDQLAGTLSGGEQQMLAIGRALMGQPKLLLLDEPSLGLAPLLVKSIFETIRA INQSGVTILIVEQNARAALKLATRGYVLELGRVVMEDTAAHLLANPSIQEAYLGA >gi|316924343|gb|ADCP01000034.1| GENE 11 10859 - 12160 1217 433 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 4 433 8 466 466 473 54 1e-133 MSELTPREIVAELDKYIIGQNQAKRMVAIAVRNRWRRQRLAAELRNEVAPRNIIMMGPTG VGKTEIARRLAKLCSAPFIKVEATKYTEVGYVGRDVESMIRDLMEIGINLVRAEEAEKVK GRAEAAAEERLLDLLLPSGDGRENTREKLRELFRQGFLDDREVEFEVKEQSQPIGMLGVP GMEQLGDQMKGAFSKLFPQKTHRKKMKVGAAWRHLIEDESSKLVDEDKITDLARERVEQM GIVFIDEIDKLASGSQQRSADISREGVQRDLLPIVEGSAVNTKYGLVNTDHILFIAAGAF HLSKPSDLFPELQGRFPLRAELEALGKEEFYRILTEPHNSLTRQYEAMLETEGVRIEFTD DGLREIAAFAEDVNTRTENIGARRLHTIMEKILADISFDASEKRGSTLVIDREHVVAQLA DVRADAELSRFIL >gi|316924343|gb|ADCP01000034.1| GENE 12 12178 - 13062 1223 294 aa, chain + ## HITS:1 COG:PA5323 KEGG:ns NR:ns ## COG: PA5323 COG0548 # Protein_GI_number: 15600516 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Pseudomonas aeruginosa # 2 294 7 300 301 297 55.0 2e-80 
MDAKLQSQVLIESLPYLRKFHGETVVIKYGGHAMKDEALKAAFARNIALLKLVGIHPVVV HGGGPQIGNMLEKLEIRSEFREGLRVTDDATMNVVEMVLVGSVNKDIVNQINRAGARAVG LSGKDGMLLHAEKKAMVVTRENQPPEIIDLGNVGRVTRVETTLIFSLMKDGFVPVIAPVG VDDEGHTYNINADAVAGAVAGALRARRLIMLTDVAGVLDPEGKLIQSIRVEDAAGLYENG TVTGGMIPKLNCCIDAIEQGVEKVMIVDGRVENCVLLELLTDQGVGTEIVGEER >gi|316924343|gb|ADCP01000034.1| GENE 13 13224 - 13742 382 172 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2350 NR:ns ## KEGG: Dvul_2350 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 20 163 1 144 153 149 52.0 3e-35 MSYDYRIPQGRDKGKRTNGMFSLGGKLGSVLSALGNGEKLMQVRLWQNWEMVMGPDIAPL AWPLGARNDILIVGGEDNLALQELSFMTPEILERVNAFMDAPVFDRVELRLVMGDRPLDQ MPDIQPSTRIRPAPPRPRQLGAHLEEMNPDSPVARCYAAYLRMHGVPTERKS Prediction of potential genes in microbial genomes Time: Fri May 13 02:19:57 2011 Seq name: gi|316924323|gb|ADCP01000035.1| Bilophila wadsworthia 3_1_6 cont1.35, whole genome shotgun sequence Length of sequence - 17847 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 11, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 3.4 1 1 Op 1 . + CDS 92 - 859 751 ## COG1794 Aspartate racemase 2 1 Op 2 4/0.000 + CDS 914 - 2248 1615 ## COG0677 UDP-N-acetyl-D-mannosaminuronate dehydrogenase + Term 2338 - 2384 -0.5 3 1 Op 3 . + CDS 2505 - 3563 812 ## COG1088 dTDP-D-glucose 4,6-dehydratase + Term 3719 - 3752 -0.9 - Term 3754 - 3788 0.3 4 2 Tu 1 . - CDS 3814 - 4968 875 ## COG0438 Glycosyltransferase 5 3 Tu 1 . - CDS 5123 - 6016 995 ## COG1209 dTDP-glucose pyrophosphorylase 6 4 Op 1 16/0.000 - CDS 6130 - 6669 169 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 7 4 Op 2 . - CDS 6765 - 8834 1735 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Prom 8983 - 9042 4.5 8 5 Op 1 . - CDS 9222 - 9644 456 ## Ddes_0981 hypothetical protein 9 5 Op 2 . - CDS 9700 - 10509 804 ## COG0101 Pseudouridylate synthase - Term 10533 - 10577 1.7 10 6 Tu 1 . 
- CDS 10632 - 11282 698 ## COG2344 AT-rich DNA-binding protein - Term 11305 - 11348 9.5 11 7 Op 1 40/0.000 - CDS 11398 - 11733 631 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 12 7 Op 2 . - CDS 11803 - 12504 1032 ## COG0356 F0F1-type ATP synthase, subunit a - Term 12542 - 12580 -0.4 13 8 Op 1 . - CDS 12671 - 13123 279 ## DVU0919 hypothetical protein 14 8 Op 2 . - CDS 13098 - 13358 351 ## DVU0920 ATP synthase protein I 15 9 Op 1 . - CDS 13673 - 14788 1211 ## COG0763 Lipid A disaccharide synthetase 16 9 Op 2 . - CDS 14785 - 15753 1295 ## Dvul_1706 nucleoside recognition domain-containing protein 17 9 Op 3 . - CDS 15770 - 16627 766 ## COG1091 dTDP-4-dehydrorhamnose reductase - Prom 16651 - 16710 4.0 18 10 Tu 1 . + CDS 16768 - 17562 693 ## COG4822 Cobalamin biosynthesis protein CbiK, Co2+ chelatase + Term 17573 - 17601 3.0 19 11 Tu 1 . - CDS 17599 - 17823 92 ## Predicted protein(s) >gi|316924323|gb|ADCP01000035.1| GENE 1 92 - 859 751 255 aa, chain + ## HITS:1 COG:PAB0912 KEGG:ns NR:ns ## COG: PAB0912 COG1794 # Protein_GI_number: 14521575 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Aspartate racemase # Organism: Pyrococcus abyssi # 8 244 4 222 228 120 35.0 2e-27 MKTDTGAIGVLGGVGPYAGLDLMRKIFDQTEANCDQEHLSVIQYSLSEHIIDRTRFLLGE TDENPGEAIGEIMVRMAKAGATVIGVPCNTAHSPRIMDVAVAMLHEASPKTRFVHMIDSV VAFIRESLPHARKIGVLSTKGTYATGLYQDALSAAGFVPLFPDEEGRERVQQAISNTAYG IKAQSNPVTPEARAALLAEAEKLVEQGADAVILGCTEIPLALTEPDLRGVPLLDATKVLA RALIIAFAPERLKKA >gi|316924323|gb|ADCP01000035.1| GENE 2 914 - 2248 1615 444 aa, chain + ## HITS:1 COG:PM1003 KEGG:ns NR:ns ## COG: PM1003 COG0677 # Protein_GI_number: 15602868 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetyl-D-mannosaminuronate dehydrogenase # Organism: Pasteurella multocida # 18 399 8 390 424 448 54.0 1e-125 MSVSLPSIASFHQGAHSVAVVGLGYVGLPLAVAFARHFNVIGFDLNAARVEELSAGTDRT GEVDSEALQASTARFTSDPAALREAGVIIVAVPTPIDEHRNPDLSPVEGASRTVGRHLSR 
GAVVVYESTVYPGVTEEVCVPILEAESGLRCGADFTVGYSPERINPGDKVHTLETIKKIV SGSDAPTLDLLAELYGTVVRAGIHRAPSIKVAEAAKVIENTQRDLNIALMNELSIIFDRL GIDTLDVLEAAGTKWNFLPFRPGLVGGHCIGVDPYYLTFKAEELGCHPQVILAGRRINDG MGKHVAETCVKLLIRQGRLVNAARVGILGFTFKENVPDLRNTRVIDVIRELQEYGVDVLV HDPLADAGEVRHEYGLSFAALDELANLDALILAVPHKVYAENGVLEPAALHARFAVPDKA LLLDVKGCLSPSAVQAEGMAYWRL >gi|316924323|gb|ADCP01000035.1| GENE 3 2505 - 3563 812 352 aa, chain + ## HITS:1 COG:PA5161 KEGG:ns NR:ns ## COG: PA5161 COG1088 # Protein_GI_number: 15600354 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Pseudomonas aeruginosa # 3 350 2 352 352 462 63.0 1e-130 MKTVLVTGGAGFIGSCYVLSRRTAGERVVNLDKLTYSGNIGNLASLDGDAEHIFVRGDIG DALLVRQLLHHYQPDAVINFAAESHVDRSIHDPDAFVRTNVLGTCTLLRGVLEWWKELSE ERRNAFRFLHVSTDEVYGTLRPGEPAFTETTPYAPNSPYSASKASSDHFVRAYHETYGLP TLITNCSNNYGPRQFPEKLIPLVILNALSGKPLPIYGTGENIRDWLHVEDHCEAIVAVLE KGAPGECYNIGGHSERANIDVVRSICRILDKLRPTAAPYEKQITFVADRPGHDLRYAIDA AKIEQEIGWVPRRRFEEGLRETVGWYLENAAWVENIQSGEYRQWLETNYSGR >gi|316924323|gb|ADCP01000035.1| GENE 4 3814 - 4968 875 384 aa, chain - ## HITS:1 COG:CAC3071 KEGG:ns NR:ns ## COG: CAC3071 COG0438 # Protein_GI_number: 15896322 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Clostridium acetobutylicum # 9 377 1 357 372 303 42.0 4e-82 MDERTLPHLRVAVIHYWLTGMRGGEKVVEALCRIFPQADIYTHVVRPEALSETITSHPIH TTFIQKLPGSVRHYQKYLPLMPLALEQLDLRGYDLVISSESGPAKGVITRADTPHICYCH SPMRYLWDFYQDYLESAGAVTRLLMRPLFHRLRLWDYASAQRVDHVIANSRTVARRVKRW WGKEAAVIHPPVDISRFSSPHMAGLQNVPGTPEPGSYYLCLSELVSYKRVELAVEACTRT GRRLVVAGDGPERKRLESIAGPTVSFVGRVDNAALPALYAGCKAFLFPGEEDFGITPLEA MAAGRPVIAYGRGGVLDSVADGETGIFFERQTADALTEALDAYEASTEQTWVHDKLKRQA ESFSEEIFRKKMIAFIADVLENKV >gi|316924323|gb|ADCP01000035.1| GENE 5 5123 - 6016 995 297 aa, chain - ## HITS:1 COG:PA5163 KEGG:ns NR:ns ## COG: PA5163 COG1209 # Protein_GI_number: 15600356 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: 
Pseudomonas aeruginosa # 4 288 3 287 293 393 66.0 1e-109 MASRKGILLAGGSGTRLHPLTLSVSKQLLPVYNKPMIYYPLSVLLLAGIREVAIITTPDD GWQFRKLLGDGSQWGCSFEYITQPKPEGIAQAFLLAADFIQGSHSCLILGDNIFFGNGLE DLLISAREQKEGATVFGYHVSDPERYGVVEFDQDMKVRSLEEKPQHPKSSYAVAGLYFYD QDVVDITREIKPSPRGELEITDVNKAYLDAGKLHVGLMGRGIAWLDTGTHDSLMAAGTFV QAIETRQGLKVSCIEEIAWRKGYISTEQLMKLAAPLQKSGYGDYLMTLPERVAESWK >gi|316924323|gb|ADCP01000035.1| GENE 6 6130 - 6669 169 179 aa, chain - ## HITS:1 COG:ECs4819 KEGG:ns NR:ns ## COG: ECs4819 COG0437 # Protein_GI_number: 15834073 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli O157:H7 # 7 175 33 190 300 86 29.0 2e-17 MSIYRLVVDQDRCIACHACEASCKAAHDSAPGISLGTLLVDGPRFEDAAEEATPIIVKHA RPEAPAPLAADGEPRVVLHTRYRACVQCPQPRCVEVCPKEALIRRESDGIVFIREEACVG CGACQKACPHHLIWVDPRKRKSVKCDQCKDRLDAGLDTACVTVCPTGALKLVRKERPGR >gi|316924323|gb|ADCP01000035.1| GENE 7 6765 - 8834 1735 689 aa, chain - ## HITS:1 COG:STM2065 KEGG:ns NR:ns ## COG: STM2065 COG0243 # Protein_GI_number: 16765395 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Salmonella typhimurium LT2 # 6 638 46 715 758 325 32.0 2e-88 MKTVYSVCGMCGTRCPVAVGVENGEAVWIQGNPHSATGSTLCPRGIAALALEKDSERPQS PLMRVGARGGHKWRAVRWEEALDAVAGKLRAIRDQYGPESVMFSHRGGPFTDLYKAFARG LGTPNIYSHSVTCTRNVDQACASVLGLDRGRLVIDYRESKHIVLQSRNALEALNLAEVAG ITAARANGCKVTVMDVRATVSAAKADTFFFVRPGTDYAMNLAVLHVLISEKLYDPHMLPY IDGFGELEERVRPCTPEWAETETGIKADRIVRLARELAEAAPRVLWYPGWFTARYADSFV TVRSAYLINALLGSIGARGGMPISLSPKETGKRLRPLSALYPNITKPMADKADWQQPGLL HRAFDAAVTGDPYPVRAYISMRHNILSSLPDPDTLRSKLDKLDLIVAITTTWSPTADYAD IVLPLSPALSRESILASKLGLKPQFFRRQRAVQPRFDTRADWEILCGLASRLGLDKLAFS RIEDIWNYQLEGTGLGIEAFDATGFVPLASAPRWKTLAETTLPTPSGKIEARSRLWAASG HDTLPPYTAPERPAAPDQFRVIPGRAALHTQASTTNNPLLSELAPTNTLWIHTERAQALG IADGDWVEVCTATGPVGRLRAKVTTGIHPEAVFMLHGFGRKTPQETRAFGKGVADEACMS SGLEHEDPLGGGLALQKHFVTIRKAEETN >gi|316924323|gb|ADCP01000035.1| GENE 8 9222 - 9644 456 
140 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0981 NR:ns ## KEGG: Ddes_0981 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 140 1 140 140 228 77.0 7e-59 MPQVAARINEQQERWLKDYFRTKSAGAEFILPWAVDTFFRAITTIKGTFTGPELKTILEA HRDQRLLPAHTRLSYLLLRVADLCEQTDFHSRYGASRTSLEAKLKRLDDTEATALMVWAS AFWVSRNCSAANMDDYIASY >gi|316924323|gb|ADCP01000035.1| GENE 9 9700 - 10509 804 269 aa, chain - ## HITS:1 COG:CPn0580 KEGG:ns NR:ns ## COG: CPn0580 COG0101 # Protein_GI_number: 15618490 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Chlamydophila pneumoniae CWL029 # 1 255 1 247 267 163 39.0 4e-40 MPRLKLTIAYVGTQYHGWQTQARKNASPLPTIQNIIEDAVAHVLGERVHVHGAGRTDAGV HAEAQVAHLDVPESRARMDWQLALNTLLPRDIRIADAVLVPDTFHAQHSAVRKTYEYRLW LSKRYTPPQLFPFVWACGPVDVERMDEGSRYLLGKRDFASLKNAGTDLRTTVRTILSITR TPEGQLPEDCLELTWRFEADGFLKQMVRNTMGLLVAVGRGKLEPADIPGILDACDRRIAP LTAPACGLTMKKVWYDDMFPLAASGHEGA >gi|316924323|gb|ADCP01000035.1| GENE 10 10632 - 11282 698 216 aa, chain - ## HITS:1 COG:TM0169 KEGG:ns NR:ns ## COG: TM0169 COG2344 # Protein_GI_number: 15642943 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Thermotoga maritima # 6 208 2 203 208 178 48.0 8e-45 MLSLKSERIPRATIRRLAVYVQVLESMQRNGVEVISSGPLAEACDVNASQVRKDLAYFGE FGVRGVGYNVASLIAAIKASLGVDREWRAALIGVGNLGRALLHHAEFKARGFNIVGAFDC DPFKIGEQVYGLEVTCTSDLKSAVDAKGIEIGIITTPPERAQRAADHLVEAGVCGILNFA GARIHVPEKVVVEYVDFFHYLYALAFSITGSEHRSK >gi|316924323|gb|ADCP01000035.1| GENE 11 11398 - 11733 631 111 aa, chain - ## HITS:1 COG:asl0009 KEGG:ns NR:ns ## COG: asl0009 COG0636 # Protein_GI_number: 17227505 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Nostoc sp. 
PCC 7120 # 53 110 23 80 81 60 62.0 9e-10 MRKILVTLLSTVAMMGIASLAFAADGAPVKLDSASLGLAIFGCAIGMAVAAAGCGIGQGL GLKSACEGIARNPDAAGKIQVSLILGLAFVESLAIYSLVVNLIILFANPFI >gi|316924323|gb|ADCP01000035.1| GENE 12 11803 - 12504 1032 233 aa, chain - ## HITS:1 COG:Cj1204c KEGG:ns NR:ns ## COG: Cj1204c COG0356 # Protein_GI_number: 15792528 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Campylobacter jejuni # 29 228 17 219 226 133 37.0 4e-31 MAGGLPEPILISELVHLDHITIAGQTVEFRHVFYTWMAMLILFVVGWLVRRNISLIPGKM QNIFESIIGGLEDFTVTNMGEDGRKVFPVLGGLFLFIAVQNILGLIPACDAPTANINTNI GMALFVFLYYNYQGIKRWHGHYIHHFMGPMLPLAPFMMILEFISHLARPLSLTLRLFGNI RGEEIVLLLFFLMAPLVSTLPIYFLFLLAKVLQAFIFYMLALIYLKGAMEPAH >gi|316924323|gb|ADCP01000035.1| GENE 13 12671 - 13123 279 150 aa, chain - ## HITS:1 COG:no KEGG:DVU0919 NR:ns ## KEGG: DVU0919 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 15 142 9 133 144 78 37.0 7e-14 MQAKSDRNNGPFHWLEKRLWRSGVQDPTIREILCWQISVIALSLLFGAVLWPFHPAGAWI FWFGFGALLSAWNFFALIKFVPKVISAGWSKSSLFALLLRTNMRLLFTGILLYMVLVWFK GSISAVLLGLAVLLVGMTAGGLKKALKKPV >gi|316924323|gb|ADCP01000035.1| GENE 14 13098 - 13358 351 86 aa, chain - ## HITS:1 COG:no KEGG:DVU0920 NR:ns ## KEGG: DVU0920 # Name: atpI # Def: ATP synthase protein I # Organism: D.vulgaris # Pathway: not_defined # 4 82 50 128 134 89 53.0 5e-17 MTSKENKQGLFDSLGRASVMGLHMVSGIIVGCLLGYWLDKWFGTYPWCAGVGLIVGIGAG FRNIWLDARILLRQGEQDDAGKKRQK >gi|316924323|gb|ADCP01000035.1| GENE 15 13673 - 14788 1211 371 aa, chain - ## HITS:1 COG:NMA0069 KEGG:ns NR:ns ## COG: NMA0069 COG0763 # Protein_GI_number: 15793098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A disaccharide synthetase # Organism: Neisseria meningitidis Z2491 # 4 344 9 359 384 182 31.0 1e-45 MKTIWINAGELSGDMQAAALLTALREREPELAAIGMGGPNLARAGQKNLFRVESLSVMGI MEVLTALPRALHMLSQIKKEMARLRPDAVVLVDAPEFNFRVAKIAHGLGIPVYYFIPPKI WAWRTGRVRFLQRYVKRLFCILPFEPAFYAKHGVQVDYIGNPLVDMVNWPELEKIEPIKG 
RIGLMPGSRRKEVEALLPEFGKAARILLQQGRDVTFHCLRAPNMPEEKLRALWPSDVPVA FDAPEDRYTAMRRCGCMLAASGTATLETALAGVPTVVSYRVAPFSALVGRLLIKVKWVSL TNLIMQKELFPELLQERATGEMMASQLAAWLDMPPQIEAVRAELAELRRRCGEPGSAARA AEKLLEALKES >gi|316924323|gb|ADCP01000035.1| GENE 16 14785 - 15753 1295 322 aa, chain - ## HITS:1 COG:no KEGG:Dvul_1706 NR:ns ## KEGG: Dvul_1706 # Name: not_defined # Def: nucleoside recognition domain-containing protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 306 6 312 322 270 51.0 4e-71 MTSHLLKTLREVLMDAVHVSFDFFKVLIPISIAMKILAELDWIRYLALPLEPVMQLTGLP ADLGIAWATGIMVNFYSALIVFIGLLPGLPPLTTEQVTTLAVMMLIAHSIPAEGRIAAQC GVSFIGQAVIRLVVAVIGGVIVHTSCQAFGWLDTPARIVFTPSPDVNLLWWALGEVRNLV SIFCVIFFVMLLQRFLRYLKVADLVGRALAPALRVLGMRPAAATAMIVGVITGIVYASGV ILKEVRSGEMSRHDVFSCMTLMGLAHAIIEDTCLMLLIGAHIGGIFFLRLALAFVTCAMI NILYLRFRGDAREPETAPNEVS >gi|316924323|gb|ADCP01000035.1| GENE 17 15770 - 16627 766 285 aa, chain - ## HITS:1 COG:MA3778 KEGG:ns NR:ns ## COG: MA3778 COG1091 # Protein_GI_number: 20092574 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Methanosarcina acetivorans str.C2A # 7 277 9 264 269 152 33.0 9e-37 MSTPVALVLGGHGLLGQPLAQQLAGNGWEAQSLDFEDCNLLNPVELQPRIEFINPDVIFN TVSWNTENPAEKQPQEALSVNRGLPAFLGGLVKGTPRFLVHYSSDQVFNGRKDSPYTEED KADPISPCGKSRLAGEQALLELNADNICIIRTGWLFGPDGDSFLKRLLGRAKTEGTVEVI HDQIGSPTYAKDLAQATLQLVKLRAPGLYHVANSGQATWCELAAEAVRQASLPCSVRAVA SSDKTLRANYEVLSSAKYTALTGCPMRPWSQALREYIYSVLLASQ >gi|316924323|gb|ADCP01000035.1| GENE 18 16768 - 17562 693 264 aa, chain + ## HITS:1 COG:CAC1373 KEGG:ns NR:ns ## COG: CAC1373 COG4822 # Protein_GI_number: 15894652 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiK, Co2+ chelatase # Organism: Clostridium acetobutylicum # 1 262 1 263 278 115 29.0 8e-26 MKRGILLAAFGSGSSQGESTLRRFDAQVRKAFPDVSVRWAFTSMLMRERLASERKKSDSV HKALKKMAFEKFTHVAVQPVHVIPGLEYGDIVSDADELRADGTFASLVVGAPLLTESQAS VDRAARALLAELPSGRAPGEPVLFMGHGSRHAAESRYEALAAAVRERDPLVLMGTLNGTI 
RLEHIVEHLRASGRPFDRVWLLPLLAVVGRHTLEDMAGDSEHSWRSRLEAEGIACVPVLR GMMEYQGFMDIWVGNLAQAMKGWG >gi|316924323|gb|ADCP01000035.1| GENE 19 17599 - 17823 92 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQRFRTGLRWFGTFAFGASGSIGTPLILLWIPFSAYPLYHSNTFLFFLLQHRETNANKNR LRASVPRIDPYESL Prediction of potential genes in microbial genomes Time: Fri May 13 02:20:29 2011 Seq name: gi|316924312|gb|ADCP01000036.1| Bilophila wadsworthia 3_1_6 cont1.36, whole genome shotgun sequence Length of sequence - 11831 bp Number of predicted genes - 12, with homology - 10 Number of transcription units - 5, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 35 - 1159 1707 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 2 1 Op 2 . - CDS 1242 - 4574 2963 ## DvMF_0765 hypothetical protein - Prom 4753 - 4812 2.3 - Term 4750 - 4800 -0.3 3 2 Op 1 . - CDS 4898 - 5494 708 ## COG0349 Ribonuclease D 4 2 Op 2 4/0.000 - CDS 5496 - 6881 1450 ## COG0486 Predicted GTPase - Prom 7056 - 7115 3.6 - Term 7159 - 7200 5.5 5 2 Op 3 16/0.000 - CDS 7301 - 8521 1016 ## COG1847 Predicted RNA-binding protein 6 2 Op 4 18/0.000 - CDS 8546 - 10150 2221 ## COG0706 Preprotein translocase subunit YidC 7 2 Op 5 . - CDS 10140 - 10418 79 ## COG0759 Uncharacterized conserved protein 8 2 Op 6 . - CDS 10415 - 10693 217 ## DVU1075 ribonuclease P protein component (EC:3.1.26.5) - Prom 10798 - 10857 3.3 9 3 Tu 1 . + CDS 10646 - 10864 59 ## + Term 10922 - 10971 5.9 10 4 Tu 1 . - CDS 10861 - 10995 183 ## PROTEIN SUPPORTED gi|220905232|ref|YP_002480544.1| ribosomal protein L34 - Prom 11126 - 11185 3.8 + Prom 10935 - 10994 1.6 11 5 Op 1 . + CDS 11023 - 11226 80 ## 12 5 Op 2 . 
+ CDS 11237 - 11800 681 ## Dvul_2682 hypothetical protein Predicted protein(s) >gi|316924312|gb|ADCP01000036.1| GENE 1 35 - 1159 1707 374 aa, chain - ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 1 356 1 347 355 340 49.0 3e-93 MSFPTLNIGDLIAKTPIVQGGMGVGISLSRLASAVANEGGIGVIAGAMIGMKEPDVASNP LEANLRALRREIEKAREATQGIIGVNIMVALTTFAEMVRTSIEAKADVIFSGAGLPMDLP KIFNETCERKKEEFKTKLVPIISSGRAATLIARKWMASTGYMPDAFVVEGPKAGGHLGFS PEHIVDPNYALEQLVPQVVEAVKPLEDKAGRAIPVIAAGGVYTGEDIKKYMDLGASGVQM GTRFVATYECDADDRFKQAYIDAKQEDVTIIKSPVGMPGRALVNNFIDSMRDGGKKPFKC IFHCVKTCEQEKTPYCIAAALINAMKGNLERGFAFCGENVSRVNNIVSVHDLISSLQREF DAIINKPAAATVKA >gi|316924312|gb|ADCP01000036.1| GENE 2 1242 - 4574 2963 1110 aa, chain - ## HITS:1 COG:no KEGG:DvMF_0765 NR:ns ## KEGG: DvMF_0765 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 6 1108 42 1100 1100 836 44.0 0 MTPSPPFRIVPWDTDFLDALKNGVMKATRGQPGNAVVVFMHDRPRRYLRERFRYAPDVPK PCLIPRMFTERELMAAFRQEQHQKLRREAGKLDQIALLSRCVRELAEPDTELCMQLAKTD DAGFFPWGVRLAELLEECFTQGLTPEDMLYTEGEVAPFGAALLGSLGKIFHRYRDALIES DLTTPGFDAFVVASALEEGTDGDDLPPLPDFLAGRAILLAGFGMLTGTENTLFRYLWRHG AQVFLHVDPALAENGQGHWACTEQSNWIADWKADTVLACPSPGKKPKIHYFAGYDLHSQL DALRGDLIELERKDRLAAALKARSTTQQLSLLPSSENGTASTLPETSASGNADNKDSITS KVLGGRGKGGPGGKGRGKPLFRRVPSPFPRWARPNPAPIPTSQLDIAVALAHSGALLPVL HHLPRKDCNISLGYPLERSLLFRLLETVLEARNRRQPNGTTHWKTLADLVRHPYLRLLEA NGISLRDIFQNMETRLRNGSRHADAHAVAEGAADDFFAASSLPNVAEAMPAIRELLNRIL RDTVDTWARVHTLGGLADALSGLCDTLLVYGSGNDEDGAGNGKADIWSRFPIDAECLFRL MQRVIPALKDNGMADTPLPWPLMQAMLLELVRAERVPFEADPLTGLQVLGMLETRLLRFS RVFLVDVTDDRLPGAPIRSPLLPDSLRALLGLPDTRNREQLAAYTFHRLIAGADEVWLYW QEGVETSGLFDGKKQRSRLVEELIWQEEQARGQRLKPGREPLRTAALDVRPPVRVRKVVP KTPAIREQIAASLKHPLSATRLDAYLTCPLRYYYERLCAIAPIDEVNEDDDPAAVGVLLH NVLRDFYAPAVGKTVRRDAQSGDPELPFLDEKALRALFRTALDASGLEAALPPESAAMLS 
VTGPERLGMFLRAQPEQTEVLSLEEEYDAEIRVGGRIRRLTGNLDRVDWREQEDPEGAID EGAVILDYKTGRIKALRPDIWADDAFWDALDPEKAAEAASEPDPEHDFLPIMAQRIPTVQ LLYYCYLYGQATGKPVLDAAFVALGEDGGERPLFGKGMTIEERERALSLIPRLIGFILLH MELCPEFRPREGAHCRWCSWRNVCIISSQQ >gi|316924312|gb|ADCP01000036.1| GENE 3 4898 - 5494 708 198 aa, chain - ## HITS:1 COG:HI0390 KEGG:ns NR:ns ## COG: HI0390 COG0349 # Protein_GI_number: 16272339 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonuclease D # Organism: Haemophilus influenzae # 44 185 44 184 399 58 32.0 6e-09 MNASSNAAGGAPLSKEEINDLPMLAYEGEVMLVQTEGEMARALNFLKKETLLGFDTESRP SFKKGKSYPTSLIQLAGSELVVLIRLNLTPFCGALAGLLADPGIIKAGVAIRDDIRALQK LHEFTPGGLADLAEMAKQRGIKAQGLRTLAAQLMGCRISKAAQCSNWAKKTLTPQQIRYA ATDAWIGREIYLCMMDQG >gi|316924312|gb|ADCP01000036.1| GENE 4 5496 - 6881 1450 461 aa, chain - ## HITS:1 COG:CAC3734 KEGG:ns NR:ns ## COG: CAC3734 COG0486 # Protein_GI_number: 15896965 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 4 461 5 459 459 297 36.0 3e-80 MTTDTIAAIATAPGAGGIGIVRVSGPGALPILEKLFAPAGKGGYKPWILRRGRVQDTEGN TLDDSLAVFMPGPKTFTGEDVAEFQCHGGPVLLAAVLEACLCAGARLAERGEFTRRAFLN GRMDLTQAEAVAEMIAAPSKEGARLAAAKLDGVLGQRIGDLRERVEHLRAQICLAVDFPE EEVECLPQEGFLSAIADVKAAVASLLAGFERTRCWREGVTVALAGPVNAGKSSLMNALLG RQRAVVTEYPGTTRDFLEEPIQLAGLPVRLVDTAGLRETDNPAEAQGIQLGRSMIESADV VLLMVDGTEGTTPDTWALLSELGPERTILVWNKSDLATPPPHWYGKEMNLSVKPAAHAVI SARKGEGLETLAEAVRELALARSQGQEPEPGEAVPNLRQARSLMEVLGELEALEQDVLAG VPYDLCAVRLEGAASALAEITGLDTPEEVLNRIFASFCIGK >gi|316924312|gb|ADCP01000036.1| GENE 5 7301 - 8521 1016 406 aa, chain - ## HITS:1 COG:CAC3735 KEGG:ns NR:ns ## COG: CAC3735 COG1847 # Protein_GI_number: 15896966 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein # Organism: Clostridium acetobutylicum # 276 402 79 208 209 84 36.0 3e-16 MDDFKEFQGKSLDEAIRAACSYFDAQREKLEIDIIQDAKNGIFGLVGARKAIVRARRAQL KPRVGSLLARDAQPASAPKAKAEDASRAPKPRQGQAPRPPRGNREPKPPFEEDDSIGNRI 
LPPVEEDDSIGNRILPEETDIDDSIGNRVDEPRAPRRPRRNDGPRIPYEGNEDAFRPRRP RPERRANQDRPDRPERQDRPRTPRPPYEPRQERRPVEAPAEQDVLRDSSLEFDQDEALNE GLPSKPFSELDQEKLTAVSQEVAFKIVSSILGETPVEVKIMENRVDIHVDCGDDSGLLIG REGQTLAALQYLTSRIVSRRMEAPVRVQFDVGDYRERQDDRLRELALALAERVRATGRPC STRPMSSYHRRLVHMALQDSPDVQTRSSGEGPLKRVIIQRRRQERH >gi|316924312|gb|ADCP01000036.1| GENE 6 8546 - 10150 2221 534 aa, chain - ## HITS:1 COG:VC0004 KEGG:ns NR:ns ## COG: VC0004 COG0706 # Protein_GI_number: 15640036 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Vibrio cholerae # 1 527 1 526 541 272 33.0 1e-72 MDDKRTFLAIALAIAVLLAWTPLAEHMGWIQPQQRPAATQEATPAPAPAAQTAVAPASSL PVFTPSAGTDVKVETPLYSAVIYSGGGILRSFTLKHYDETIKADSPKVNLISPEASQTAP LGLTVNGQPSWSTGQWSFQGSDLNLKAGEQGSLTFTGIVDGVRVTRVISFNADNYLLSEN ILVGSADQAPRTVRLGFTVAATPFSNGKYDPTRLAWDANESFKEETSASTLAEKGIIEQG VFNWAGVMSNYFMNVTAPADPNNLTLKGRVQGDVWRLALERPDLLTPSNGGEVPVAVNWW FGPKDRTMLATAPDHLSQAVNFGMFSIIARPLLTILAFFHSFVGNWGIAILMLTFCIRVV FWPLSQKSFKSMEQMKKLQPMMKKLREKHKDDKEALNKEMMQLYKTYKVNPAGGCLPIVV QIPVFIGLYQALLNSIELRHASFIEYLPFTHITWLADLSAADPFYITPLLMGASMFLQQR LTPAAGDPTQQKVMMFMPVIFTVMFINFPAGLVIYWLCNNILSIGQQWWMLRKA >gi|316924312|gb|ADCP01000036.1| GENE 7 10140 - 10418 79 92 aa, chain - ## HITS:1 COG:PM1164 KEGG:ns NR:ns ## COG: PM1164 COG0759 # Protein_GI_number: 15603029 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pasteurella multocida # 14 75 16 77 86 94 66.0 4e-20 MKRLSLRRLAVLPIRLYQWTLSPVLPPSCRYHPTCSAYAIEAVLTHGIFKGSWLALRRIL RCHPWSSGGYDPVPPPHSSSFSHQEQLDHHGR >gi|316924312|gb|ADCP01000036.1| GENE 8 10415 - 10693 217 92 aa, chain - ## HITS:1 COG:no KEGG:DVU1075 NR:ns ## KEGG: DVU1075 # Name: rnpA # Def: ribonuclease P protein component (EC:3.1.26.5) # Organism: D.vulgaris # Pathway: not_defined # 2 52 30 80 96 65 62.0 6e-10 MRNRVKRVLRECFRLHQALLPPAVDLVIVPKRHLKPEQLDLAAATREFLPLIKEIGGYVV LRREQNACPDISRDGIAAPGGLEPETASGVKA >gi|316924312|gb|ADCP01000036.1| GENE 9 10646 - 10864 
59 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQPETFPEHPFDTIPHDCRSDFLGDGHTDAPRITGCTKTNKHNEVFGKKTATLIITECEI GPPEQSMPTRPG >gi|316924312|gb|ADCP01000036.1| GENE 10 10861 - 10995 183 44 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|220905232|ref|YP_002480544.1| ribosomal protein L34 [Desulfovibrio desulfuricans subsp. desulfuricans str. ATCC 27774] # 1 43 1 43 44 75 79 2e-13 MKRTYQPSKIKRARTHGFRERMSTASGRAIIRRRRAKGRKQLAV >gi|316924312|gb|ADCP01000036.1| GENE 11 11023 - 11226 80 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTYTFSASESQAQTEHDTETCEDLAEWLERFIPSREELQQRYEKTRRESGTARVAAPAFW GHTPQKR >gi|316924312|gb|ADCP01000036.1| GENE 12 11237 - 11800 681 187 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2682 NR:ns ## KEGG: Dvul_2682 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 4 184 3 183 260 193 58.0 3e-48 MQPRFRTLKTTIRTRLQEPGWDDFAKELDEVPARELVGPLFSCLPLGGEATDRAASALGK AVSRMADEHIEEARNVVRRLMWHMNEESGNIGWGIPEAFAEILAQHRRLGDEFYPILNSY VIDTGKGDNFCDNNVLRRSCFRAVERFALARPDLASKARGPLQAGLRDEDPVCREIAREA LGKIGMF Prediction of potential genes in microbial genomes Time: Fri May 13 02:21:10 2011 Seq name: gi|316924295|gb|ADCP01000037.1| Bilophila wadsworthia 3_1_6 cont1.37, whole genome shotgun sequence Length of sequence - 20685 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 10, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 11 - 56 10.0 1 1 Tu 1 . - CDS 288 - 797 427 ## DvMF_2147 zinc resistance-associated protein - Prom 858 - 917 5.9 - Term 1011 - 1055 10.4 2 2 Tu 1 . - CDS 1076 - 1693 506 ## - Prom 1824 - 1883 5.2 + Prom 1770 - 1829 2.0 3 3 Tu 1 . + CDS 1953 - 3725 1795 ## COG0642 Signal transduction histidine kinase 4 4 Op 1 . + CDS 4121 - 5500 1367 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 5 4 Op 2 . + CDS 5572 - 5952 353 ## Dvul_2198 hypothetical protein + Prom 6605 - 6664 3.0 6 5 Op 1 . 
+ CDS 6701 - 9151 2356 ## COG3264 Small-conductance mechanosensitive channel 7 5 Op 2 . + CDS 9129 - 9914 173 ## PROTEIN SUPPORTED gi|163781723|ref|ZP_02176723.1| 50S ribosomal protein L13 + Term 10132 - 10182 1.0 8 6 Tu 1 . + CDS 10495 - 11619 975 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) + Term 11674 - 11710 9.6 + Prom 11687 - 11746 3.5 9 7 Tu 1 . + CDS 11825 - 12367 794 ## LI1175 hypothetical protein + Term 12376 - 12441 15.7 - Term 12481 - 12520 9.0 10 8 Op 1 . - CDS 12710 - 13753 1229 ## COG0618 Exopolyphosphatase-related proteins 11 8 Op 2 . - CDS 13743 - 14033 447 ## DVU0507 hypothetical protein - Term 14352 - 14410 12.7 12 9 Op 1 . - CDS 14521 - 17598 3696 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 13 9 Op 2 . - CDS 17602 - 17823 136 ## Ddes_0060 protein of unknown function DUF448 14 9 Op 3 32/0.000 - CDS 17825 - 19129 899 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 15 9 Op 4 . - CDS 19126 - 19743 641 ## COG0779 Uncharacterized protein conserved in bacteria - Prom 19792 - 19851 7.5 + Prom 19955 - 20014 4.8 16 10 Tu 1 . 
+ CDS 20040 - 20627 726 ## COG0693 Putative intracellular protease/amidase Predicted protein(s) >gi|316924295|gb|ADCP01000037.1| GENE 1 288 - 797 427 169 aa, chain - ## HITS:1 COG:no KEGG:DvMF_2147 NR:ns ## KEGG: DvMF_2147 # Name: not_defined # Def: zinc resistance-associated protein # Organism: D.vulgaris_Miyazaki_F # Pathway: Two-component system [PATH:dvm02020] # 10 135 11 133 179 81 40.0 1e-14 MSLNKASIAVALAAFLAVGGIAGQALAAPHDGHDGHWRQQRSYQDCIYQALTQEKKAQYD AIMKEFADKTAPLRDKLAAKYIELRTLGNSPTPDPKAIGKATEELVALRNEFAKERSAMV DRVAKEIGINIFQGKGPGCPVERPRRCPATGMTVMPDGPQPGAEAPASE >gi|316924295|gb|ADCP01000037.1| GENE 2 1076 - 1693 506 205 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQIYTENRFCLPLFYLIVLGVALSIWICPDKAFATGASSGVHDDQPRQAPEGLPPSRMPE AFPGFSPDTPSKCPALAPQRTHDTSRFESRLFAMPLPPNLTPEQQQAALAIMREAEPYLT VIHIQLRQTLVELHNLSFASDTPPDALAVIGRRLIRLRTEAIRELQRISAQMEKLAGFNP GWGARVRGTKMEDLNPQPLNEQSVE >gi|316924295|gb|ADCP01000037.1| GENE 3 1953 - 3725 1795 590 aa, chain + ## HITS:1 COG:hydH KEGG:ns NR:ns ## COG: hydH COG0642 # Protein_GI_number: 16131833 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 340 579 217 456 465 218 46.0 4e-56 MQTTGLFSSSSPQQANVGGGSSLIGTLGAALVLVFGVVLLTGLSLDRTQTAMVRFLAEKG VALVTALESGVRSGTRSRTGIRLQYLVEELADRPDVRFIAVTMPDGTILAHSNPARVGEV LNTRGGRELGAETIAALRPSETPSWAIIDMEGSRAFVVFKTFHPKIKGEGRQDEPTVGPL PYVFLGLDLAPLEVAQAQNRERAVLLGAGVLLAGMLALLGLHAVERVRSSRRGQRVAEAL AEELAVALPDGLVVFDAKGRITRMNKAALTLLGMEAPAKGKAFLGRKPAEVLPSALAELA AKLLQEPVLPDTEIILRHGEEQQYISVRGGHVNEGYEGRLGSLMFLRDLTEVRRLEAEVR RREKLAAVGNLAAGVAHELRNPLSSIKGYATYFGGRFPEGSADREAAQVMVKEVERLNRA IGDLIGLSRPTDIRPRMTGMRRLIGDTLRLIGQDAANHKVAIRFDAPEVLPDVAIDPDRM RQVILNLCLNGLEAMPDGGELFLSLHPEPDALRLEIRDTGVGIAPDALPHIFDPYFTTKG QGTGLGLATVHKIMEAHGGSISVTSEPGQGAVFRLLLPLGGEGGKVHGHE >gi|316924295|gb|ADCP01000037.1| GENE 4 4121 - 5500 1367 459 aa, chain + ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal 
transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 7 455 8 441 441 503 61.0 1e-142 MSTPPIILIVDDDSSHRAMLRTVLRGWGYATDEADDGDTAVEKVKERAYDAVLSDVRMAR MDGISAIREILHHNPSIPVLVMTAWQSVETAVAALRLGAYDYLEKPLDFDLLHLTLERAL DHTRLAAENRELREYIRESPSGLLGRSPRMRELIEMVNTVAPTEATVLITGESGTGKERV ARAIQEASARRGKAFVTVNCAALNESLLESELFGHEKGAFTGADKRREGRFSQADGGTLF LDEIGELPLLLQAKLLRALQQGEVQRVGSDTPIIVDVRVIAATNRNLREEVSEGRFREDL YYRLNVIGVEVPSLRERREDIPVLATAFLERFALANRKEIKGFTPQAMDALLKYSWPGNV RELENAVERAAILCLGEYISERELPMAVSSAPKNNDAPTLAELQGAAGEEPISMTLDEME RAAILRTLQDTGDNKSEAARRLGITRATLHNKLRRYDME >gi|316924295|gb|ADCP01000037.1| GENE 5 5572 - 5952 353 126 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2198 NR:ns ## KEGG: Dvul_2198 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 13 125 2 114 115 150 66.0 2e-35 MNTAHLSSTKNTSPLAAFADQQIDWNLTPEMAVTLYLEWGNNDWRSEHPPVRSKSDVATY FVVDAWQDPLKVRLVRRNSESADDLVTVPLPEPLVKAFRDEYGSLKGVFEPLPVIKDWLK KELGQA >gi|316924295|gb|ADCP01000037.1| GENE 6 6701 - 9151 2356 816 aa, chain + ## HITS:1 COG:SMc00028 KEGG:ns NR:ns ## COG: SMc00028 COG3264 # Protein_GI_number: 15964691 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Sinorhizobium meliloti # 564 784 608 828 871 184 43.0 9e-46 MNASRLVSLFLCLSLALPCAAFADTAPASGQKTDAPASTPKTVPDPTQDVWTTLLQNRVE ELGAIDAETVALTKRLPDASRKLNAALSGIEEEYQRLMTLSRVSRGLPLELSVVQQRLAR LNDNLSDVLEPLEGTLNTLKSRLSEISLLEQDSAPSKDETDISPELQAFLSDLAQTQGRL NTVQIRISRVLAPARKLQENITSLTGRVAKSIPGLWQDYYLQRSGKIYDVDSWLNIQKSI NALQETFSVRMNAELPWTLAGWLGVILRAIVLILPLHGLIFVSRRMSRKWPESLRTGWTK MCGHSFVWLSFGFTFHFAAWSPSGSYHVLSIIGTLLLSLGQMALAWDLYTFQRSDLQLRS PLWPLFTPLLGGLLLLFFNLPGPILGGIWLLMSLVTLWRDYKRPLPDIPFPLVINLLKGQ AVILWIAVLMTLIGWGRLSILVCVAYAALAVCVQQAVGFMRLMNVIAEHMPQEGVKALFS GFLLALALPAMLVLATAATGLWILAYPGGEFLLTHLANMDVSVGKTSFSMLQVLFIVSAF 
YVTRSFISVGRSFIADLPAHSMRLDRSLVGPIQAGFTYLLWGLFGLYTLSALGFSLTSIA VVAGGLSVGIGFGLQNIINNFVSGLLVIFGQTLREGDVIDVGGVNGIVRRINIRSTQVET FDNAVIFVPNAEFLSGKLTNWTRNGRMVRQEVAVGVAYGSDIQLVEKLLKQVAKEHQKVL TYPEPVVLFNDFAASSLDFRLRFWVGDILHGSGIASDIRKVIDAKFTEANIEISFPQMDV HLRENETVPVELRNPRKVKEPTLQQEDKSDAPEQNG >gi|316924295|gb|ADCP01000037.1| GENE 7 9129 - 9914 173 261 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163781723|ref|ZP_02176723.1| 50S ribosomal protein L13 [Hydrogenivirga sp. 128-5-R1-1] # 42 229 29 192 228 71 29 5e-12 MLQSKTDNPLDGLHLETYVRADADGVASCAAGLLAARCAEAVSRRGVFTLALSGGSTPLT LFRLLRTEAWAARVDWAHTRVFWVDERCVAPDHPASNYGAANQELLAHVPAAEIYPMDGE IDPGESAVAYEELLRQHVPAEEGMPRLDCALLGMGDDGHTASLFPGSPLLAPEALRSGRL TGFAVAEHLAPKPESRRITLTLEMINAARCCLYLATGEGKRPPLGKALDLLAPASLPVQL VRPKGGELIWVMDEAACGGRQ >gi|316924295|gb|ADCP01000037.1| GENE 8 10495 - 11619 975 374 aa, chain + ## HITS:1 COG:PM1321 KEGG:ns NR:ns ## COG: PM1321 COG0741 # Protein_GI_number: 15603186 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Pasteurella multocida # 202 370 188 356 360 152 43.0 9e-37 MRNPLAGYAVSVTGTLALLALVTGFVPGGSNQSLEPVTGATSRAKGENPLPRPLLHIPEQ FREAMKQNLPSTPFAASGDRFPVLEVQSDGTVTMRPLHGSIPVKLETTSTSLAAMLTAAP TEPQQADFVPRAELFSNMPRINQAEPAHLDVLGKQLDFDDLPLRWNGSQQAFELAPEVLE RTEKLLGDLRQLSGLTASMRRQAEQYRPVVEKYADRYNLSPDLVFAIIYTESDFDPDLIS NRSAHGLMQVVPDTAGGEVHRWLGRTGKPSPSLLLHPETNIKYGTAYMYLLQNRHLSAIA DPQSREYCAIAAYNIGTGGMLRTFGKSRDAAFEAINAMTPEQVRNTLLKKLSSRETKAFL AKVLKSRERFSMLG >gi|316924295|gb|ADCP01000037.1| GENE 9 11825 - 12367 794 180 aa, chain + ## HITS:1 COG:no KEGG:LI1175 NR:ns ## KEGG: LI1175 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 7 162 6 164 170 151 50.0 1e-35 MRTISAIALCLMLVFATTGCAASKIAVVDPARLFQESEPGKAGIEHLKQLEAAMQEQLKT AQGMIEKAPNDEALRARFQKTFVGYQQIVNAEQQKVVQQINGLMQKTLEDFRTKNGYSVI MNAEGLLAFDPKSDVTKDVIAEMDKTKVTFAPVKLEPITAAPKADDKAAPKADAKPEAKK 
>gi|316924295|gb|ADCP01000037.1| GENE 10 12710 - 13753 1229 347 aa, chain - ## HITS:1 COG:TM1595 KEGG:ns NR:ns ## COG: TM1595 COG0618 # Protein_GI_number: 15644343 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Thermotoga maritima # 21 331 2 307 333 163 33.0 4e-40 MGIDTVYEQDMQTAHIIQPAEEMAAIIQQFDNIVIVAHGSPDGDAIGATGAMGSLVKALG KRFVLYNATGIPDYLEWVPLPGKLVTKPSAIPFKPGLIIVLDCGDAWRMGKELLAVFPEY PSVNIDHHLGNPMFASLGNWVDPGMAATGQMVAAVADAAGVPLTGELAQCVYLSLVSDTG SFTHGNTSAAVFTLAARLVANGLDAAAMREKLDNQWSMPKTKLWGKLMQTLSLECDGMVA VCPVTMEEIGSFGAVREDLEGFAEQMRRIKGVRVAVLIRQDPGNRCKLSLRSSGSDDVRS VAALFGGGGHLNAAGATIDDADMDAVTRQTIKAIQDITFGAGKEEKR >gi|316924295|gb|ADCP01000037.1| GENE 11 13743 - 14033 447 96 aa, chain - ## HITS:1 COG:no KEGG:DVU0507 NR:ns ## KEGG: DVU0507 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 96 1 96 96 144 81.0 1e-33 MVIAVLTVEYHLHGNDSLKGKRRVANSLKQKVRNTFNVAIAEYGTEDSLTRLRLMVISLS NSEQHLQSRMDKCLAMMEAVCSEEMVYSDLEFINGD >gi|316924295|gb|ADCP01000037.1| GENE 12 14521 - 17598 3696 1025 aa, chain - ## HITS:1 COG:BH2413 KEGG:ns NR:ns ## COG: BH2413 COG0532 # Protein_GI_number: 15614976 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Bacillus halodurans # 321 1020 50 726 730 608 50.0 1e-173 MTEDKTKVKDLATELGVPTKNLLQALRELDIPAKSTSSNIAAEDIDRVKNHLKDSATSEQ GDRREVQPGVIVRRRRQSTEDVAAEGEEAPRAERRAPRENHTPTARIVSVPGASEAQDAA PVPSAEPEAPAAPEVSVQETVKEGAEEKPAPVAEADKADDQKPRQPRDNRQKKRSGLREA APVAKVISRPQDIAAAKAAEEAAAKAAAEAAAAAAKAAAEAAAAAKKTEEAEKPRKAQPA ASARPEGSSAPSLLPPVSEGNRSSSSDGEDDDRRDRRKPRRPQEPATPQVRVISRPDPAA VAAARTQQAAQSDGRDNRNAGRDGQARDGRPGQPRDGRPGQGRDDNRPARTGYQGPRPAG GGAPGNFTPGATPGGPLPDRDGQSKKKRNKTARRTVDFQDTANNGRRRSDDMDDVPRRGG RRRPKASRVVSQATQPLKAVKRKIRIEEAIRVADMAHQMGLKSNEIIKVLFNLGIMATIN KALDIDTASVVAAEFGYEVEKVGFAEEQYLADHRTEDSPEQLKRRPPVVTIMGHVDHGKT SLLDAIRKTNVTSGEAGGITQHIGAYHVKTKRGEIVFLDTPGHEAFTAMRARGAQVTDLV 
VLVVAADDGVMEQTREAVNHSRAAGVPIMVAVNKMDKPTANPDRVLQELASLGLVPEDWG GDTVVCKVSAKTREGLDEMLEMLALQADILELTANPDKPARGHIVEAKLDKGRGPVATVL IQEGTLHQGDTFVCGVFSGRVRAMFNDQGRKVKEAGPSIPVEVQGFEGVPEAGEEFICLE DEKLARRIAESRAVKLRERELAKQSRVTLETFLSRKADDQEALVLNLVVKSDVQGSLEAI VDALNKQSTEKVRINIIHGGTGAITESDILLASASDAIIIGFNVRPTAKVKEMAEHESVD IRFYDIIYKLVDEIKSAMAGLLAPVSREVYLGQAEVREVFSVPKVGSIAGSHVVDGKLTR NASVRLLRDGVVVHNGKIASLRRFKDDVREVLKGYECGISLENFNDIKVGDVIEAFEMVE EAATL >gi|316924295|gb|ADCP01000037.1| GENE 13 17602 - 17823 136 73 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0060 NR:ns ## KEGG: Ddes_0060 # Name: not_defined # Def: protein of unknown function DUF448 # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 7 72 14 78 81 73 56.0 2e-12 MDRGHVPTRMCAICRRRMPKQELVRHVLSPQGGEFLVDESQTQPGRGWYVCSETACREKF RRFRTGGRKRKGD >gi|316924295|gb|ADCP01000037.1| GENE 14 17825 - 19129 899 434 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 3 427 8 437 537 350 42 3e-96 MNLELKKAIDQISKDKGLDRDMLVDTLEDAVRTSVIRKYGENIDVEVSYNDDTGEIQVHQ FKIVVDDVADPDTEIDIEKARELDPSVQLDDEIGFPLKVTDLGRIAAQSAKQVIIQRMRD AEQELIYEEFKDRVGEICSGIIQRRDKGGWIINLGRTEAILPKEEQIPREHYKRGDRVQA LIIEVKKEGRGPQIVISRSHRDYTAALFRREVPEVDDGTVQIMGVARDPGSRAKVAVLSR ERDVDPVGACVGVRGSRIQNIVQELHGERIDIVVWSPEISTYARNALAPAVISRIVVDEA ENLLEVTVPDDQLTSAIGRKGQNVKLAAKLLGWKIDIFTETRYNETNAIGRGLEQVASVA EVSVDSLVSAGFTTLEQLQDATDEELSEKLSLSDSRIGDLRAAINFLSPVVGESADKADA DADKADETAAGEDE >gi|316924295|gb|ADCP01000037.1| GENE 15 19126 - 19743 641 205 aa, chain - ## HITS:1 COG:PA4746 KEGG:ns NR:ns ## COG: PA4746 COG0779 # Protein_GI_number: 15599940 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 6 154 4 132 152 88 30.0 7e-18 MSQPDRIEFITELVTPLAASLGLAVWGVELGGAARPIARIYVDVLPGAEPAPSEKASDDD LLPQGVTIDQCAELSRLAGLALDVEDPFATNWTLEISSPGLQRPFFKIDQLRNYVGRELE VVLAAPLDTWPGRKKFSGVLAAVADEEFTLSLPDTSRKAEEPEEVTIAWPFVRKATLVHH FPEPGKKLGGKKDSKDSKGTRGGAA 
>gi|316924295|gb|ADCP01000037.1| GENE 16 20040 - 20627 726 195 aa, chain + ## HITS:1 COG:RSc0416 KEGG:ns NR:ns ## COG: RSc0416 COG0693 # Protein_GI_number: 17545135 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Ralstonia solanacearum # 4 193 3 190 192 248 63.0 7e-66 MAKKKILMLAGDFVEDYEIMVPYQMLLMVGHDVDVVSPGKKPGDVIATAVHDFEGHQTYT EKRGHNFMINADFDAVDCAAYDGLVVPGGRSPEYLRLNPRVIEIIREMDGAKKPIAAICH GQQMLVSAGILKGRSCTAYPAVKPDVVDAGGVWCEPNATATNAFVDGNLVTGPAWPAHPE WMALYLKLLGTKIEA Prediction of potential genes in microbial genomes Time: Fri May 13 02:21:40 2011 Seq name: gi|316924276|gb|ADCP01000038.1| Bilophila wadsworthia 3_1_6 cont1.38, whole genome shotgun sequence Length of sequence - 20706 bp Number of predicted genes - 19, with homology - 17 Number of transcription units - 14, operones - 5 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 141 - 195 14.1 1 1 Tu 1 . - CDS 280 - 2346 2627 ## COG0480 Translation elongation factors (GTPases) - Prom 2493 - 2552 1.5 + Prom 2625 - 2684 3.5 2 2 Tu 1 . + CDS 2729 - 3523 645 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 3 3 Op 1 . + CDS 3717 - 3992 388 ## COG0776 Bacterial nucleoid DNA-binding protein 4 3 Op 2 . + CDS 4004 - 4210 322 ## DvMF_0499 hypothetical protein + Term 4253 - 4303 1.2 5 4 Tu 1 . + CDS 4319 - 4522 323 ## PROTEIN SUPPORTED gi|220904361|ref|YP_002479673.1| ribosomal protein S21 + Term 4594 - 4625 2.1 6 5 Op 1 . + CDS 4778 - 5221 259 ## PROTEIN SUPPORTED gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 7 5 Op 2 . + CDS 5263 - 7614 2650 ## COG1193 Mismatch repair ATPase (MutS family) + Prom 7667 - 7726 2.9 8 6 Op 1 31/0.000 + CDS 7918 - 9660 1128 ## COG0358 DNA primase (bacterial type) 9 6 Op 2 . + CDS 9644 - 11416 2521 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 11510 - 11550 7.0 - Term 11497 - 11536 6.0 10 7 Tu 1 . 
- CDS 11556 - 11798 63 ## + Prom 11769 - 11828 3.3 11 8 Tu 1 . + CDS 11872 - 12108 197 ## + Term 12141 - 12184 7.2 - Term 12257 - 12287 1.0 12 9 Tu 1 . - CDS 12304 - 14316 2389 ## COG0480 Translation elongation factors (GTPases) - Term 14598 - 14630 1.1 13 10 Tu 1 . - CDS 14801 - 15535 454 ## LI0555 hypothetical protein - Term 15971 - 16005 3.5 14 11 Tu 1 . - CDS 16033 - 17184 1236 ## LI0554 hypothetical protein - Prom 17229 - 17288 3.1 15 12 Op 1 . - CDS 17295 - 17900 740 ## COG2431 Predicted membrane protein 16 12 Op 2 . - CDS 17897 - 18217 495 ## LI1088 hypothetical protein - Prom 18249 - 18308 5.2 + Prom 18208 - 18267 2.8 17 13 Op 1 . + CDS 18352 - 19083 812 ## DVU2890 hypothetical protein 18 13 Op 2 . + CDS 19095 - 19514 477 ## COG0864 Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain and a metal-binding domain + Prom 19684 - 19743 1.7 19 14 Tu 1 . + CDS 19763 - 20542 689 ## COG1469 Uncharacterized conserved protein + Term 20586 - 20630 -0.9 Predicted protein(s) >gi|316924276|gb|ADCP01000038.1| GENE 1 280 - 2346 2627 688 aa, chain - ## HITS:1 COG:aq_001 KEGG:ns NR:ns ## COG: aq_001 COG0480 # Protein_GI_number: 15605613 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Aquifex aeolicus # 5 686 7 698 699 521 40.0 1e-147 MSKALDSQRTYALIGTGGCGKTSLAEMLLFQSGVINRLGAIEEGTTTLDYEPEEIKRRGS IQPGFATFLWNKDRHFLMDVPGDTNFTGDLPYLLMGVDAAVLVIDAVDGVRPLTRRIWNA VREAGLPSFVFINKMDRDRADFDMAFNGLSSVLGMKPVTLYMPVMTDGVFTGVVDILGGK ALMFGENGAVTEALIPDAIADEAALLHDTTVENIAESDEELMEKYLEEGSLSEEDLASGL RKGVLNASLVPVVVGSSLENKGGRELLDAIARLFPSPLERPAFLDADGNERASSDEGPAC GFVFKTIADPFSGQLNMVRVISGTISSESTLKNMRTEESERLGTLLYLDGKTQTPCKDVL GPGAIIAVGKLKNTRTGDTLSDDKAPFAVAMPQLAPQLITFALAPKEKGDEDKVYAAVQK LLDEDVTLKLSRDEESGDILLSGMGQLHIETAVERAKRRYKVEILLKTPKVPYRETVRGK VQVQGRHKKQSGGRGQFGDCWIEMEGLPRGSGYVFEDAIVGGAIPRNYIPAIDKGVQESA ARGFIAGCPVVDFKVRLYDGSYHTVDSSEMAFKVAGSLAFKKAIESLKPVLLEPIVLLCV 
SVPDEYMGDVIGDLSSRRGKVLGSDSQVGITEIKAHIPMSEVLRYAPDLRSITGGQGVFT MEFDHYEEAPQPIVDKVIAEHQKAKAEE >gi|316924276|gb|ADCP01000038.1| GENE 2 2729 - 3523 645 264 aa, chain + ## HITS:1 COG:PA0592 KEGG:ns NR:ns ## COG: PA0592 COG0030 # Protein_GI_number: 15595789 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Pseudomonas aeruginosa # 6 256 8 258 268 150 39.0 2e-36 MTAAPRAKKSLGQHFLKDAKTSARIVDLLRIGPEDRVLEIGPGPGAITGIIHERGPAEFR LIEKDSYWAAHHAELERPAPAVQVLNADALAFPWESLEGPWKIISNLPYNVGSPLMWDIV SRTPDLTRAVFMVQKEVAERLYAKPGTKDYGALSVWIQSYVRVEWGFVVGPGAFNPPPKV DSAVVTFIPLPRERHPADPKALSSILKLCFQLRRKQLQSILRRAGRDDTAAALERLGIAP EARPETLTPEQFQQLAGIFGRSGC >gi|316924276|gb|ADCP01000038.1| GENE 3 3717 - 3992 388 91 aa, chain + ## HITS:1 COG:RSc1714 KEGG:ns NR:ns ## COG: RSc1714 COG0776 # Protein_GI_number: 17546433 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Ralstonia solanacearum # 1 90 1 90 90 87 50.0 4e-18 MTKADLVAQIAARANMTKAAAERSLNAMLESVQEMLAEDGKLTLTGFGTFVAETRQERQG RNPRTGDVITIAASKVVRFRPGKMLKDALNK >gi|316924276|gb|ADCP01000038.1| GENE 4 4004 - 4210 322 68 aa, chain + ## HITS:1 COG:no KEGG:DvMF_0499 NR:ns ## KEGG: DvMF_0499 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 63 1 61 62 90 73.0 1e-17 MLHGETIHSPLPQDIPWWSPDFAVFFGVLYLVLFVIGTGVGYVILRSIWDTCRGGCGCHG EEHPAESH >gi|316924276|gb|ADCP01000038.1| GENE 5 4319 - 4522 323 67 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|220904361|ref|YP_002479673.1| ribosomal protein S21 [Desulfovibrio desulfuricans subsp. desulfuricans str. 
ATCC 27774] # 1 67 1 67 67 129 94 2e-29 MPGVFLNEDDYNFDIALRRFKKQVEKAGILSEMKKRQHFEKPSVMRKKKKAAARKRLLKK MRKMNAA >gi|316924276|gb|ADCP01000038.1| GENE 6 4778 - 5221 259 147 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 [Lactobacillus johnsonii NCC 533] # 1 147 1 147 147 104 36 6e-22 MSLTTRIEQDYIAAYKAKDALRLGVLRLLKTAAKNLQVELMRPVTDEELATVVQKQAKQR QDSIEQFTAANRPDLAEKEAAELSILKDYLPEPLSEEELAAAIDAAIAALGVTNMSGMGK VIQSVMGDYKGRVDGKAVSAAVKARLS >gi|316924276|gb|ADCP01000038.1| GENE 7 5263 - 7614 2650 783 aa, chain + ## HITS:1 COG:BH3106 KEGG:ns NR:ns ## COG: BH3106 COG1193 # Protein_GI_number: 15615668 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Bacillus halodurans # 4 783 3 785 785 332 31.0 3e-90 MDSRTIKALEFGKVLEHLAGLCVSEAGRRVSLGLFPLRDADAVNAAHTLFDEVRTWSTHS GFRLSDFPDLEGLFPHLEKAAVSPSSAPLDADALWALRETLLQGRKAAQSINENGAMWPS LRDLVASMPLPEMTLSALSRCLGDDGLIKDESSPELMLVRGELRRLHLMCLRKVKDFAVQ YNIAQYLQDDYMTLASDRYVLPLKSNFKGRIQGIIHDYSNTGETCYFEPLFLVEQNNRLQ ELKREEREEERKVLRYLTGIVQNELPFIRSAWDLLVRLDVELAKCGLAALFDGACATISP DGEDAPLSLLGARHPLLALDPQIRKQGGPHPVDLIFRPTDRALVISGGNAGGKTVCLKTL GLLAIMTLAGLPVPAAKGSVIPWWTSIHAFIGDEQSLDDHLSTFTAQIRHLGNAWEATDR RTLILLDEFGAGTDPAQGAALAQAVLDGLLERGAHVVAATHFPALKTYALTREGVRAASV LFDPGTKKPLFRLAYDQVGASQALDVAREHGLPESVLRRAEQYLLLDGQDMTAVMDRLNA LAAKREGELDALKAEQQRTREKRKAVQERFERERERLIKDVRELSAKVMKDWQEGKAGHK QALKELAKVRAELHVSPEQEEAAAPAFDIAELKPGQHVMHRPWNKKAVVREVDARQNRVK LDMNGVTLWADAALLGPADAPPQQAKPKSGVLVRTTAGSDPEMSLLRLDLRGKRADQALG ELSQYLDRALLSGREGVEIVHGRGTGALRKEVHAFLKTFPGIASFARAPEDQGGDGVTIV TFK >gi|316924276|gb|ADCP01000038.1| GENE 8 7918 - 9660 1128 580 aa, chain + ## HITS:1 COG:YPO0644 KEGG:ns NR:ns ## COG: YPO0644 COG0358 # Protein_GI_number: 16120969 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Yersinia pestis # 2 358 3 359 582 245 37.0 2e-64 MGRNARAIQEIKARLNLVDIARRYVDLKRNGPRWVAPCPFHQETKPSFSINEEEGFFYCF 
GCQASGDLFDFYGQINGLDFKETLEQLAEEAGVTLERGPQKHDGQPAGQTMSKRRQLLKI HEIAAAHFTENLSGRDGAECRDYMARRGISEEVAKLFGLGWSRRDWQSLAEVLRRAGFSE SMGVEAALLGKSERGRAYDRFRGRLMFPIRSLSGNVIAFGGRIIANEDEAKYINSSDSQL YKKGEHLYGLQQARRAIATGKPAMLTEGYMDVVTLHQFGYSSAVGVLGTAFTPEQVKRIS GFTSHVELLFDGDGPGRKAALRACEMLLTRGLSCKVVLFPEGEDIDSLLRTQGTDIFEDL RRNAPEGMAFCVRCLRDMAPREAVDWAREFLRQVELPELVSRFASTLSTGLGLAESELRE RIIESRGARALPRNAGGQETQPPVRTNPRDREIMTFAVRYPSSLPRLRELGAHLVLSAAW ARDLWQKLEEYPVDEVVQHLDPREKRFWIRCRTGDVPPLDNEEGEFGAIRTMLDTLHLTA QSASVSAALRQGAGAGNFEADLEYLRALQETLERTHGEQH >gi|316924276|gb|ADCP01000038.1| GENE 9 9644 - 11416 2521 590 aa, chain + ## HITS:1 COG:XF1350 KEGG:ns NR:ns ## COG: XF1350 COG0568 # Protein_GI_number: 15837951 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Xylella fastidiosa 9a5c # 11 589 12 616 618 415 43.0 1e-115 MGNNIKDIQQIKSLIAKGKEMGFLTFEEVNKAVPLEMNTPEQFEEIIGIFDQLEIAIVDL EKDGKKIPVTSAETDGEQSEERLELVDNEDAADFSSRSTDPVRMYLREMGAVPLLDRDGE VTIAKKIEQGEQDVLYALVEVPVAVEELINVGEDLKLNRIKLKDVVKTIEEDDPTEDEMN QRTRVILLLEEIRQTFKKKRKIYAKLDECCTLERRVTAIQKEIMAFKEEIVTRLRDIKLE KTLIDRIIETVEDYVRQMRNCQRDLSAYLLSTGKNQEEIKDLFRKLDSRDISPVLAAKEL NMSVDELFSYKEMILGKIEILQRLQEKCCHNVSDLEEVLWRIKRGNNAAMRAKQELIRSN LRLVVSIAKKYTNRGLQFLDLIQEGNIGLMKAVDKFEYQRGYKFSTYATWWIRQAITRAI ADQARTIRIPVHMIETINKLIRTSRYLVQELGRDPTPEEIADRMEYPVDKVKKVLKIAKE PISLETPIGDEEDSSLGDFIEDKKAVAPAEEVVNTKLSEQIASVLADLTPREEQVLRKRF GIGEKSDHTLEEVGKLFNVTRERIRQIEAKALRKLRHPVRSQVLRSFYDS >gi|316924276|gb|ADCP01000038.1| GENE 10 11556 - 11798 63 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYKERPKIDAEQQWIETGKAFLALFIAEPEFARKAVEIARRSPRLTCKHATIIRELQLVS DLMVTIKRRIEEIDGTAPPR >gi|316924276|gb|ADCP01000038.1| GENE 11 11872 - 12108 197 78 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTFAKPFGDMLKEKRKLRGLSQMGLAMKSGRSLRCIQYLEAGTQEPTLSTLYAIAHALDL RVMDLIEALPEELNDMYN >gi|316924276|gb|ADCP01000038.1| GENE 12 12304 - 14316 2389 670 aa, chain - ## HITS:1 COG:Cj0493 KEGG:ns NR:ns ## COG: Cj0493 COG0480 # Protein_GI_number: 
15791857 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Campylobacter jejuni # 3 670 8 677 691 586 44.0 1e-167 MTQAVRNIGVIAHIDAGKTTLSERMLFYTRKIHRMGEVHDGTATMDYLPEEQERGITITS ACTTCEWNGTTVNIIDTPGHVDFTIEVERSLRVLDGAVGVFCAVGGVEPQSETVWRQSEH FGVPKLAFVNKMDRIGADFSAVLKSMQTRLGANPLPVVIPVGAAETFQGVIDLVTLERLD FDESDQGQTWTRSPLTEADAELAAPWREQMLERLAENDDTFLEQYLGGEYTETDIRSAIR RATLARRVTPVLSGSALKNTGVQPLLDAVIAYLPAPADLPPVTAHNPEDGTDGTIPCDPA LPFTGLVFKVMMDGGRKLALVRLYAGTLKEGDPCRNVTRRADERISRLYRLHADRREQVD AAKAGDIVAVIGLRSARTGDTVGAPGSKLLLESIEAYQPVISLAIEPRNADEGKALDEAL DRFSLEDPTLTVAIDEGSGHRIVSGMGELHLDVILERIRREYGIAPRVGQPQVIRRETPK RTASATGIFDRELGKETHIGEVTLSIAPRERGSGNQIRFAIDTAILPAAFVDAVRQGVEN ALQSDPVTGYPLQDADVEITAMPRRDGSTVAGYHMAAGIALRSALEAAQVTTLEPLMFVE ISAPEANLGPAISLFGTRGGKVENILDHAGLKLVQGLAPLSKLFGFSTDLRSATQGRAGL MMRFERFDVI >gi|316924276|gb|ADCP01000038.1| GENE 13 14801 - 15535 454 244 aa, chain - ## HITS:1 COG:no KEGG:LI0555 NR:ns ## KEGG: LI0555 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 28 241 5 249 251 107 34.0 3e-22 MPFSPAIEACRVPDERLAGAYEETSAAHRSWIKTTLALAEATYPAPPSRLTITSENAAAG FGFARTRETAPWAVLLIGEGYASAVRLAAAIMPARLAGVEPVFAVWTGAETAPSGLFAAL ELTGVEQVFAMRDPAPLLRELPGRGRILRFGKAPLPECPCPVWSDRAPRIERTALPDTAV LWAHPDALPADDGADVVYAGQIIIGEDTPLVLGAGLEGCWLHTGLTPDFFMNERLALSAL KLES >gi|316924276|gb|ADCP01000038.1| GENE 14 16033 - 17184 1236 383 aa, chain - ## HITS:1 COG:no KEGG:LI0554 NR:ns ## KEGG: LI0554 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 14 373 11 378 392 347 45.0 6e-94 MPLSIFKRLFICFCFALLTALPAQAEEQKNGTLPPALETALSTLLADAPQGKAAPDAATV NAVLDFVATNKQAANKVRPEARPQGSGAYLKETLNVPLRKLVEYMLDPSIPGEAIYPSAV RRNAWMPGSPILKDNAALTDAAYPPAAPIVTRGVEYEETTPDTSSGCYYSYKLNRLFVLA DYKGRTALISVSVMPGQSSVGLRGAIVGNDKDWTYVYTPEKGTNLAMLGWAETYLYGSAS ISVFMESAPGSGKVDVSIFKWAKAGWKGSNVVKVSHITAGLKRFTSGLRQVMESPRLPSS 
DAIAAKYSELKAMNDTELRAQLSSFGTHLAKQNADPLDEKAFRTVLDNGAYPGTLKRDDA IAELMKLYMRQQLGTLPAAVARN >gi|316924276|gb|ADCP01000038.1| GENE 15 17295 - 17900 740 201 aa, chain - ## HITS:1 COG:PAB0910 KEGG:ns NR:ns ## COG: PAB0910 COG2431 # Protein_GI_number: 14521573 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pyrococcus abyssi # 1 197 1 190 196 85 27.0 6e-17 MKESIIVLFFFSFGVLGGRLDLLPADWLGLVHEASNLAVYAMLAAVGMSLGFDSRAWRIL RDLKGWVVLVPLMIIVGTFLGGVAAWTMLDMSFRDVMAVAAGFGYYSLSSMLINQLADVS LGSMALISNMVRELVTLLFAPLFARVFGGLGPLSAAGAASDTCLPAIIRTSGERNTILGI FSGMVLTIAVPIFVTSIFAWL >gi|316924276|gb|ADCP01000038.1| GENE 16 17897 - 18217 495 106 aa, chain - ## HITS:1 COG:no KEGG:LI1088 NR:ns ## KEGG: LI1088 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 88 1 88 102 72 48.0 4e-12 MIVEMGCLILGVPVGFLLRRKPLVIKITDQVLTWSVRILLLLLGLALGADDRLMSQMDTI GARGIFISLCCVAGSLIGARLLEPIMNLHVGRYAVSRTAKDEGAAQ >gi|316924276|gb|ADCP01000038.1| GENE 17 18352 - 19083 812 243 aa, chain + ## HITS:1 COG:no KEGG:DVU2890 NR:ns ## KEGG: DVU2890 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 4 243 53 292 292 266 57.0 6e-70 MTYDAAPDPSARQGRKTVNESQDTPLSEGSWQPFLTVNACGRDWTLERAADMEALWESMT EFTEDERLPYWTELWPSSLVLADWLYQRRESLRGQPCLDLGCGIGLTALVAQWLGANVIG MDYEPEALRFARRNAEHNAVPQPLWTVMDWRKPAVKRRSLRFIWGGDIMYEQRFAAPVLD FLEYALAEGGAAWVAEPSRAVYDTFRSMLVNRRWAGRCVWEKNIEALYPQERPVPVRIWE IHR >gi|316924276|gb|ADCP01000038.1| GENE 18 19095 - 19514 477 139 aa, chain + ## HITS:1 COG:PH0601 KEGG:ns NR:ns ## COG: PH0601 COG0864 # Protein_GI_number: 14590497 # Func_class: K Transcription # Function: Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain and a metal-binding domain # Organism: Pyrococcus horikoshii # 3 138 2 137 138 118 41.0 3e-27 MGETVRFGVSLDSDLLEKFDALCERQGCPSRSEALRDIIRDALVQDSLHSETADAAGVLS LIYDHHVRDLSRKLTERQHDAHGLIVTTLHVHLDHHNCLEILVLKGKAGELRELADQLRS IRGVTHGTFSITTIGADLP 
>gi|316924276|gb|ADCP01000038.1| GENE 19 19763 - 20542 689 259 aa, chain + ## HITS:1 COG:TM0039 KEGG:ns NR:ns ## COG: TM0039 COG1469 # Protein_GI_number: 15642814 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 1 256 1 253 259 234 48.0 2e-61 MEDVQNSPAPVALPIDRVGVKGLKLPLLVRDRAQGTQHTVANVDVSVDLPAAFKGTHMSR FVQALANWTEDLDYAGMKRLLEDVRERLNARRAHIVFHFPYFVQKNAPATACPGILPYEC RLTGELPEDGKPSFLLEVTVPVMTVCPCSKAISREGAHSQRADVRMAVRMRGFCWIEDFI EIAEASGSSPVYSLLKREDEKFVTEDAFSRPTFVEDVVRNVASRLADHPHVSGFRVEVES YESIHAHNAFACIEHGVTL Prediction of potential genes in microbial genomes Time: Fri May 13 02:22:21 2011 Seq name: gi|316924273|gb|ADCP01000039.1| Bilophila wadsworthia 3_1_6 cont1.39, whole genome shotgun sequence Length of sequence - 2596 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 32 - 1450 1849 ## COG0469 Pyruvate kinase 2 2 Tu 1 . + CDS 1749 - 2573 879 ## COG4820 Ethanolamine utilization protein, possible chaperonin Predicted protein(s) >gi|316924273|gb|ADCP01000039.1| GENE 1 32 - 1450 1849 472 aa, chain - ## HITS:1 COG:all2564 KEGG:ns NR:ns ## COG: all2564 COG0469 # Protein_GI_number: 17230056 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Nostoc sp. 
PCC 7120 # 2 470 3 468 476 318 40.0 1e-86 MKTKILATIGPASNKRDVLSKLIAAGVRIFRLNFSHGDSSSFVDLIKMIRELEIVHKTPI TILQDLSGPKIRIGTFPNEGSLNVLKGDQLLLGPSSLMCNEEFPYIPFDHPEIFAELEQG DRLVLADGTLQFRVVEHREDGLFKLEANNNGIVTSRKGLALPGKSIPLPALTEKDKKDLV DGLALGVDAVALSFVQSPEDIREAKSIIRSHSDRDIPVIAKLERRNAVDRLDAILKEVDI VMVARGDLGIECPLPELPAMQKRIIRACNRAAKPVIVATQMLLSMVSSPSPTRAETTDVA NAVLDGADCVMLSEETAMGNYPVETVQFMSEIASKAEELMAETRRIAEPENDGTAEFLAY AACLLAQKANAKSIVAHSLSGTSARLLSARRPVQTIHALTPDVTSLKALNFSWGVLPHQV GNDDVGHLARAERFIASSSLFGLGEDVVITAGQPTSSSPQPRGTNLVKIYRK >gi|316924273|gb|ADCP01000039.1| GENE 2 1749 - 2573 879 274 aa, chain + ## HITS:1 COG:eutJ KEGG:ns NR:ns ## COG: eutJ COG4820 # Protein_GI_number: 16130379 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin # Organism: Escherichia coli K12 # 4 267 1 265 278 255 47.0 5e-68 MKQIAWDEVMKRLEVASKICNDASPVQVEGPIHVGVDLGTADVVVMAVDDNGMPVSAFLE WATVVRDGVVVDYHGAITIVKRLVSMTEERLGRKITEASTSYPPGTDARLSTNILDAANL RLVSTADEPSCLARLARLDRAAVVDIGGGTTGTAVISNGKVIASVDDATGGHHVTLAMSG ALSVPYEDAELKKRGTDNRQYAPIVKPVFERISDIVKAHISGHAVDTVYLTGGTCCFPGI APLFEKELGIKVECPDYPLLLTPLAIACLPLMEG Prediction of potential genes in microbial genomes Time: Fri May 13 02:22:36 2011 Seq name: gi|316924250|gb|ADCP01000040.1| Bilophila wadsworthia 3_1_6 cont1.40, whole genome shotgun sequence Length of sequence - 29989 bp Number of predicted genes - 23, with homology - 18 Number of transcription units - 16, operones - 7 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 223 - 264 1.1 1 1 Op 1 . - CDS 453 - 1484 1056 ## COG0859 ADP-heptose:LPS heptosyltransferase 2 1 Op 2 . - CDS 1484 - 2359 1237 ## COG0685 5,10-methylenetetrahydrofolate reductase - Prom 2396 - 2455 2.8 + Prom 2338 - 2397 3.8 3 2 Tu 1 . + CDS 2535 - 3542 661 ## DvMF_3033 hypothetical protein + Term 3546 - 3581 4.1 4 3 Tu 1 . + CDS 3661 - 4863 1455 ## COG0406 Fructose-2,6-bisphosphatase + Term 4946 - 4979 1.8 + Prom 5438 - 5497 1.7 5 4 Tu 1 . 
+ CDS 5633 - 6031 196 ## COG2199 FOG: GGDEF domain + Term 6075 - 6103 -0.9 6 5 Tu 1 . - CDS 6050 - 6502 95 ## - Prom 6566 - 6625 1.6 + Prom 6053 - 6112 1.6 7 6 Tu 1 . + CDS 6269 - 6730 569 ## CLI_0624 acetyltransferase 8 7 Tu 1 . + CDS 6879 - 7505 507 ## COG4832 Uncharacterized conserved protein + Term 7529 - 7574 5.0 - Term 7651 - 7686 2.5 9 8 Op 1 3/0.000 - CDS 7718 - 10372 3560 ## COG0013 Alanyl-tRNA synthetase 10 8 Op 2 . - CDS 10475 - 11572 1587 ## COG0468 RecA/RadA recombinase - Prom 11695 - 11754 3.8 - Term 11700 - 11742 5.9 11 9 Tu 1 . - CDS 11767 - 13320 1928 ## - Prom 13369 - 13428 2.4 - Term 13690 - 13733 6.9 12 10 Op 1 3/0.000 - CDS 13749 - 15272 2396 ## COG1012 NAD-dependent aldehyde dehydrogenases 13 10 Op 2 . - CDS 15709 - 16503 750 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 16606 - 16665 1.9 + Prom 16565 - 16624 2.5 14 11 Tu 1 . + CDS 16682 - 18010 1803 ## COG1757 Na+/H+ antiporter 15 12 Op 1 . + CDS 18179 - 18448 121 ## COG2827 Predicted endonuclease containing a URI domain + Term 18472 - 18509 7.6 16 12 Op 2 . + CDS 18548 - 19312 1054 ## COG0730 Predicted permeases + Term 19352 - 19393 9.7 17 13 Tu 1 . - CDS 19949 - 20665 181 ## + Prom 20729 - 20788 3.3 18 14 Op 1 13/0.000 + CDS 20838 - 22088 1486 ## COG0124 Histidyl-tRNA synthetase 19 14 Op 2 . + CDS 22109 - 23932 2543 ## COG0173 Aspartyl-tRNA synthetase + Term 23998 - 24042 -0.6 - Term 24063 - 24099 2.5 20 15 Op 1 . - CDS 24305 - 25012 758 ## 21 15 Op 2 . - CDS 25009 - 25182 221 ## - Prom 25202 - 25261 1.9 + Prom 25133 - 25192 1.7 22 16 Op 1 27/0.000 + CDS 25352 - 26560 1688 ## COG0845 Membrane-fusion protein 23 16 Op 2 . 
+ CDS 26570 - 29734 4236 ## COG0841 Cation/multidrug efflux pump + Term 29782 - 29820 1.1 Predicted protein(s) >gi|316924250|gb|ADCP01000040.1| GENE 1 453 - 1484 1056 343 aa, chain - ## HITS:1 COG:ECs4507 KEGG:ns NR:ns ## COG: ECs4507 COG0859 # Protein_GI_number: 15833761 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Escherichia coli O157:H7 # 11 291 6 291 352 102 28.0 1e-21 MAVDLATFSPKRILVCQLRQIGDVVLATPSIELLKRRYPDAEIHLLTEKKCAPLLEHNPH LGKVWALDKKVLSSLPQEIAWYWHVARTGYDLVVDFQQLPRCRWVVAFSGAPARLSYTPP WYTRLLYTHSSDMLDGYSAMSKASVLRPLGIEWNGERPRVYLTDEEHAFARTLLAQAGLQ PEQRLITLDPTHRQPTRRWPLAHYAGLVSLLAERDASLRFLPLWGPGEEAEIQELSRLCP AGSLLLPERMLSLREMAACIAEADLHIGNCSAPRHIAVAVGTPTLTVLGSTTPAWTFPSP EHADIALNLPCQHCNRNHCPDPRCLTGMLPGPVADKAMEMLKK >gi|316924250|gb|ADCP01000040.1| GENE 2 1484 - 2359 1237 291 aa, chain - ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 1 288 1 287 296 229 40.0 4e-60 MNIGQLIAERGKPFYSFEFFPPKDPDQWPDFFKMAERLAPLEPLFTSVTYGAGGSSQDAT LEIVSHLKKEFQFETMAHLTCVGATQDYITQYLQRLRDNGVDNVLALRGDIPKGQDIDWE TAEFRHAADLVRFAKAHYPDMGIGVAGYPAPHPESPSFASDWRYTVAKIREGADFVITQL FFDVREYLHFVDRLSDMGVSVPVIPGILPIQSLESIRRTLSLCGANIPGKLYLALEEANA KGGAEAVREAGLKFAVQQIRTLLDNGAPGIHLYTLNKASMCLRIAEEVGAL >gi|316924250|gb|ADCP01000040.1| GENE 3 2535 - 3542 661 335 aa, chain + ## HITS:1 COG:no KEGG:DvMF_3033 NR:ns ## KEGG: DvMF_3033 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 90 335 33 279 298 280 59.0 7e-74 MTLHLFSSIRHAMCRSGRGLCLAPRVFAVFRQLFLLGVLALSAVPAFADGVPSEPQDGLT AIKNALVPPQPSAEQPAPPPPPRWVFETLPPWEKLEEGLQLGLFSAQFDGGDPFEVVFLR IDPAYFDFTVETASSEKQSLPLEAWATRKGLIAATNASMYLPDGVTSTGYLRTGETVNNG RVVSKFGAFFVAGPDSPDLPGADLLDRSTDDWENLLPHYSMVVQNYRMISADRRLLWKPG GPKHSISAVGRDGTGAILFILCREPITGVDFGALLLALPIDVRVVMYTEGGSLAGLFLRT 
PVRSQIWLGRSLPEFWASGSQGAPLPNVIGVRRKS >gi|316924250|gb|ADCP01000040.1| GENE 4 3661 - 4863 1455 400 aa, chain + ## HITS:1 COG:YJL155c_2 KEGG:ns NR:ns ## COG: YJL155c_2 COG0406 # Protein_GI_number: 6322306 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Saccharomyces cerevisiae # 208 378 11 176 236 92 32.0 1e-18 MKLYVAMVGLPARGKSTLAKRIRSGLEQQGIRTAIFNNGELRRMLFGLESGSAEFFNPDN TRAQRLRDQITHQNMERARAWLDEGGDVAIIDATNGTVHQRVDLSATLRDRPVLFIECVN DDPLLLDASIRRKTRLPEFANMTQEEALESFRKRLAYYESVYTPVRKERCWIRVDAVDSC IQDEAPSNDLPYYAAIRDIISSRWVQDLYLVRHGETDYNREGRLGGDPSLTAKGIEQAEK LAAHFDGVDLPYIFTSTKQRSAETAAPLLRSRPNTISMALSEFDEINAGVCEGMRYSDVR DGMPLEYEARSHNKYGYIYPNGESYAMLKERVARGLRRALFLSGEGTLMIVGHQAINRTL LSLFLFQRASDVPYTYIPQNQYYHITITQRRKLFEMIRYA >gi|316924250|gb|ADCP01000040.1| GENE 5 5633 - 6031 196 132 aa, chain + ## HITS:1 COG:PA0575_3 KEGG:ns NR:ns ## COG: PA0575_3 COG2199 # Protein_GI_number: 15595772 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 1 122 49 173 182 86 35.0 1e-17 MDIDDFRGINDTFGHSAGDVVLRRMGAVLQEQLGDADMASRIGGGEFVFVMPGIVGKESA ESLLPRICAASFTHEDVPCPISVSIGFAAFPNDGAAFEVLHKEAGKALYHAKRQGKRQCA FYGELPEPCPAP >gi|316924250|gb|ADCP01000040.1| GENE 6 6050 - 6502 95 150 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYGEEADAVVLIREHQHAEQPPCILGAQRPPLRNGFPQIPPVLAERQPGEGIPDKRNQIH PSINVRFRRIPDGTFLIHVSLLCGGTRINEKRKPEGLRFLRVGHPANNRYRCFLSNLAGL AAPPPPDSLEPSVYRKECRLSNGSRPLHKQ >gi|316924250|gb|ADCP01000040.1| GENE 7 6269 - 6730 569 153 aa, chain + ## HITS:1 COG:no KEGG:CLI_0624 NR:ns ## KEGG: CLI_0624 # Name: not_defined # Def: acetyltransferase # Organism: C.botulinum_F # Pathway: not_defined # 1 146 1 146 149 149 47.0 2e-35 MNQECAVGYASEADIDAWMDLVALVRDAFPGLSLGEYRGNLRKAIAERRALCAKDARGLL GVLVLSDQHNGIGFLAVHPEARGRGVASALVRMMLDVLPADQDIFVTTYREDDPLGAAPR ALYKRLGFEEVELVTRYGYPCQQFVLRRGSGDK >gi|316924250|gb|ADCP01000040.1| GENE 8 6879 - 7505 507 208 aa, chain + ## HITS:1 COG:FN0105 KEGG:ns NR:ns ## COG: 
FN0105 COG4832 # Protein_GI_number: 19703453 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 12 205 12 204 204 154 39.0 9e-38 MKYEWKKHEKALYLPKAVPAPVTVPEHAFFMIRGEGNPNGEAFLEAIGVLYALSYAVKML PKKGDAPEGYYEYAVFPLEGVWDTAEPMPPGEALNKDALRYTLMIRQPDFVTDELAERIL AATSRKKPHPLYASASFGHWNDGLCVQMLHVGPYDDEPVSFNMMQAYCAEHGLRRASDSH REIYLSDARKTDPARLKTVLRFGVEPCA >gi|316924250|gb|ADCP01000040.1| GENE 9 7718 - 10372 3560 884 aa, chain - ## HITS:1 COG:ZalaS KEGG:ns NR:ns ## COG: ZalaS COG0013 # Protein_GI_number: 15803211 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Escherichia coli O157:H7 EDL933 # 6 871 7 860 878 802 50.0 0 MLTANEIRHRFLEYFKKHGHTEVASSSLIPRDDPSLLFTNAGMVQFKKIFCGQEKRDYVR ATTSQKCLRVGGKHNDLDNVGRTARHHTFFEMLGNFSFGDYFKEDAIRFAWTFITEDLKL PKDRLYITVYKDDDEAFELWQKVAGVAPERIFRLGEKDNFWSMGDTGPCGPCSEIHFDQG ADMACGPDCGIGKCDCDRFLEIWNLVFMQFEQLADGSRVPLPRPSIDTGMGLERIAGVCQ GVRSNYDTDLFQVFINYMAELAGVRYRDNADNDTALRVIADHSRAIAFMIADGILPSNEG RGYVLRRLIRRAFRFGRLMGMQEPFLYKTALKVVEVMGEDYPELRARADFMARVTREEEE RFSSTLDKGLSMLEEEMDALADKGEKIIPGETAFKLYDTYGFPLDIVNDVAEKRGFKADE AGFNEYMHQQKQRARAAWKGSGEKDIASRFQGLLEDGLKSEFFGYTALTGVGRVVALLDG DGLPVEALPSGSLGYVVTDQTPFYGASGGQCGDTGLLTAPAGSAKVLDTLKPSADLTVHH IEVDGGTLLSDQEVVLTVTESIRLDAARNHTCTHLLHAALRRVLGDHVRQAGSLVTPDRL RFDFSHIAPMTPEELAAVERDVNAAIMADYPLTAKLMGQQAAIDMGAMALFGEKYGDTVR VVTIGNPDHTESVELCGGTHLHSTGQAGSFVILSESGIAAGTRRIEAATGWNALKHARAM SEELHQLAAMLKTQPGGLAAKLDGLQKENRGLRKDLEKAAAQAASGQGGDLMSKVVEING VKVLAAKLDASNIKAMRELMDDIRSKMPSGVACIAAPVDEGKVSMILYVSKDLHGRFTAP ALIKEVAAPIAGSGGGRPDQAQAGGTNPAGIDEAMDVLKAKIGE >gi|316924250|gb|ADCP01000040.1| GENE 10 10475 - 11572 1587 365 aa, chain - ## HITS:1 COG:HP0153 KEGG:ns NR:ns ## COG: HP0153 COG0468 # Protein_GI_number: 15644782 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Helicobacter pylori 26695 # 10 362 4 347 347 410 60.0 1e-114 
MAKKPTLTPEEARQEALKTALETIERKFGQGAVMKLSDDVHVKVPVIPTGSIGVDLALGI GGIPKGRVTEIYGPESSGKTTLTLHVIAECQKLGGTAAFIDAEHALDVTYARRLGVKTDE LLISQPDHGEQALEIADMLVRSGAVDLVVIDSVAALIPQTELEGAMGETQVGGHARLMSH ALRKLTGTIHKSRTAVIFINQIRMKIGVTGYGSPETTTGGNALKFYSSVRLDIRKIQTLK DKEEAYGSLTRVKVVKNKMAPPFREAKFDIIWGTGISRSGELIDLGVDAGIVDKSGAWFA FGAEKLGQGKEKVRALLDETPELRNAIESQLIEHLGMNPRPVVHAPEETLPDPENAPVDD MDDEI >gi|316924250|gb|ADCP01000040.1| GENE 11 11767 - 13320 1928 517 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MALSKLQKGLVGLGIAVVAVIAVGYGGMRYLENAVVDAIRTWAAQTPQDAHVELGDISYT LMDNHLVLKNVRMTYVTPAKQTVAATVETLDIRNPGTTLLSLMRDPKSEIKETELPVADE IALHNLSFGPEPNITVRLRTFKGVAIETAAVKTLMSTASDDDPKVALAIIYGLSYKEDSA SGVKITSKALPFSLSTDSAVQKSYAKGHLDSSVTNGIVFTLRGQDVLTLGEIRLENMNLP PRDIMEKIYFIAPTDINDDEAFRIFQNLFAGPKPLIGLLSLKDLKTNSALLDISLDKLNI TNPSTSPYALEVSLEHLKMPVALVPELQLLSVMGVPEIDASASYAISLPNKDNQFNSTAS LSVAKLGTADFAVKGEVPYKDFFEIINNNSVTDSDIENFVEKNIKFSHIEAGYADEGLLP RLGILGQKFMGLTPEQCVDMAKKYVKESLGAAEGTENTAKLMEYIDKPGAIRLIFNTEKP IPVEAFDTLSDTDPSIKLDVNTGPKTALELMADLEKK >gi|316924250|gb|ADCP01000040.1| GENE 12 13749 - 15272 2396 507 aa, chain - ## HITS:1 COG:BH2312 KEGG:ns NR:ns ## COG: BH2312 COG1012 # Protein_GI_number: 15614875 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Bacillus halodurans # 3 483 9 484 485 390 39.0 1e-108 MIQIRNFINGQWQEETGGKTVPLYNPSTGETIGSVPVSNMETCESAIATAAEAYKTWRLV PIAKRMTYIHKIRECMIRDLEKLAQGIALDQAKHISEARGEVQRVIEIVEMACSIPALIQ GETLDQIAGSITGKVTKQSLGVFGGVAPFNFPALVFGWFIPFAIGTGNTFIYKPSTQSPY FMQLMCEIFMEIGLPGGVVNVIHGSRDIPGTWYEDPRIAGVCLVGSTPTAKKMAEGCARG AKRSMLLGGAKNMLVVMEDADIDVFIENYINSCYGSAGQRCLAGSIVAIVPELYDTVLER MIEAAKKVKVGDALDPDVYMGPLISRSAADKVKEYVDIALNHGHGCELVLDGRNPELPEK NKNGYFVGPTIIKDVTPCNPLFTTEVFGPLVATIKIADIDDALELIRQSEFGNGACIFTQ SQFYAEKFSRDADVGMVGVNVGICAPHPYIPFGGIKGSLLGTNKAQGKDGIDFFTQNKVT TIRTVDPNAKKGGDAAKPKSVRSCVAS >gi|316924250|gb|ADCP01000040.1| GENE 13 15709 - 16503 750 264 aa, chain - ## HITS:1 COG:PA0163 KEGG:ns NR:ns ## COG: PA0163 COG2207 # 
Protein_GI_number: 15595361 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 16 264 17 263 265 213 49.0 2e-55 MKSYRLTEVKSYNVPFVRLPRPVFWFNDPFEAGTWSDLHRHEHWGELAFMRSGYMVICTE LGNYAAPPQRAVWIPPGLTHEWYIPEPSVDNALYIMPHVLPPGPRFQRYHAMEVSPLVRE LIHALAPQPCDYEEPGPVARMVSVLLDQLPLLPEVGFPLPMPRDRRLVALCTALLNEPDS PETIREWCTRLGMSERTLARLFQRQTGESFGRWRQRIRLHHARAQLEAGESVTAVALNCG YASVSAFIAAFKKLFGRTPGQLAR >gi|316924250|gb|ADCP01000040.1| GENE 14 16682 - 18010 1803 442 aa, chain + ## HITS:1 COG:VCA0193 KEGG:ns NR:ns ## COG: VCA0193 COG1757 # Protein_GI_number: 15600963 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Vibrio cholerae # 20 429 27 439 447 375 55.0 1e-104 MSEKANFRALLPMIVFLVLFIGTGIALTAAGADMPFYQLSATVAIIPAIVLAILQGKDDL NKKIGVFLSGVGEINIITMCMIFLLAGGFASVASAIGGVKATVNFGLSLLPPSMILPGLF LIGAFISTAMGTSMGTVAAIAPIAVGVGEQTDIPLALLLGVVMSGAMFGDNLSMISDTTI AATRTQGCEMKDKFRMNLLIALPAALVTLALLWFAGASGQVVKHEEYQFIRVVPYLAILV MALAGVNVFIVLLAGIVFTGAVGIASVGGYTLIQWSKDIYSGFVGMNEIMVLSMLVGGLG ELMRYHGGISWLLERVNGLARKLSAESPVKAGECCISLLVLLANLCTANNTVAIILTGKV AHEIAVSNGVDRRRSASLLDIFSCVVQGLIPYGAQLLLAGSIAKLSPLSIAGNNWYCMLL AVAAVFAILFGIPRVRPVRPLA >gi|316924250|gb|ADCP01000040.1| GENE 15 18179 - 18448 121 89 aa, chain + ## HITS:1 COG:L1889726 KEGG:ns NR:ns ## COG: L1889726 COG2827 # Protein_GI_number: 15673803 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease containing a URI domain # Organism: Lactococcus lactis # 2 69 4 70 104 68 45.0 2e-12 MWYVYLLRCADATLYCGVTTDMERRLREHNAGSRGAKYTRARRPVELVCCVAQPDASSAC RLEREVKQRPRAEKAAFLLARAERPPEEE >gi|316924250|gb|ADCP01000040.1| GENE 16 18548 - 19312 1054 254 aa, chain + ## HITS:1 COG:BS_yrkJ KEGG:ns NR:ns ## COG: BS_yrkJ COG0730 # Protein_GI_number: 16079702 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Bacillus subtilis # 6 233 9 246 261 59 22.0 7e-09 
MLMGMMIIVGFIVGFLVGCTGMGGIILIPALVYLSGLSSHVAMGTTLFTFIFTTSLCSWL YIRLGHVDWKATIPICIGGFLFTYVGADVKAFTAAPYLNLILALLILMAGALVFCPVRGR RFSFMEEGRRSRFWVLFAVGSGVGFVAGLTGAGGPVLSVPIMIALGFPPLIAIGAGQVYS VPVALSGSAANFLHGAIDYKVGALMIVIQILGILLGVYMANRMDTTKLRKMVAWVCLFCG GFILVNAIRGILAM >gi|316924250|gb|ADCP01000040.1| GENE 17 19949 - 20665 181 238 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNTIESLMREMEEDRRQRRIAALSDTTQVDGGVWKVEPSPNACPTCLEFSRKTYAEKPNP PHPNCKCRISKQEKKKRYYVIGRRPLDGLGSVSTTKLDWIKKWDKNTYKSGKGGLPDLLP EHRHFFDSEGNDFGFFGDDNVRPDRKRNLPSYSYGDVKYDADIIDRAVRKYETYINQVQE KISTFTKSTKDIITNQVHGHVNVGNYGLILNNCQDYVSNILYIARRISEKERIPLILP >gi|316924250|gb|ADCP01000040.1| GENE 18 20838 - 22088 1486 416 aa, chain + ## HITS:1 COG:YPO2878 KEGG:ns NR:ns ## COG: YPO2878 COG0124 # Protein_GI_number: 16123070 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Yersinia pestis # 1 414 1 419 424 369 47.0 1e-102 MSDNVTAIKGFADLFSPDSDAFTHMEAVARETFSRYGFTELRTPILERTELFCRGIGTET DVVQKEMYTFPDRKGRSLTLRPEATAGVMRAYIESGRSGDAVTKLFTIGPMFRYERPQKG RMRQFHQINCECLGPSEPYADAELVTMAMRFLEALGLKDLSLQLNSLGCPTCRPAYRETL RKWLSELDESALCEDCRRRMVTNPLRVLDCKVPSCREHTAETPKILDHNCPDCAAHFETV RRLLDAEKVPYVINHRLVRGLDYYTRTTFEIVSDQIGAQGTVAGGGRYDGLVEQLGGPNV SGLGFACGMERLALLLPQPEAGADRPDFFTVVLTDVARDAAYALTQALRDAGFRGEMGFS SRSIKSAMRQAGKSGARFCLLIGEDELAAQTVMLKNMDSGEQSSVAFADVVARLQK >gi|316924250|gb|ADCP01000040.1| GENE 19 22109 - 23932 2543 607 aa, chain + ## HITS:1 COG:BH1252 KEGG:ns NR:ns ## COG: BH1252 COG0173 # Protein_GI_number: 15613815 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Bacillus halodurans # 20 603 4 592 595 654 54.0 0 MSEEQKELSVACQSLGDWTRTHTCGELTGADNGAEVCLMGWVQYRRDHGGLIFVDLRDRF GLTQIVFSPEYAPEAHEQAGALRSEYVLAMKGKVRPRPEGMVNPNLKTGEVEVVVSEWKL LNTSKTPPFQIEDRVEAGENLRLQWRYLDLRRPRMARNLALRSRTAMAIRNELASQGFLE VETPILTKSTPEGARDYLVPSRLNHGQFYALPQSPQQFKQFCMIAGLDRYFQIARCFRDE DLRADRQPEFTQVDIEMSFADEKQVMDMAEGLMIRVFKEALGADIPAPFPRMTWDEAMSR 
YGVDKPDTRFGLELQDVTSIVSNSGFKLFASAKLVKAMRVPGGESLTRKEIDELTEFVKI YGAQGLAWIKIRENEWQSPIAKFLSDEERAGLTEALGLSVGDIVFFQAGEPGMVNAALGN LRVHLGEKLGLIPENAWNFLWVTDFPLYEYSEEENRYVACHHPFTAPKDGDLEKMVSEPA ATKARAYDMVLNGNEVGGGSIRIHEAAVQRKMFEALGFSKESYESQFGYFIQALEHGAPP HGGIAFGLDRLIMLMSDSPSIRDVIAFPKTQKATCLMTNAPDIVSAKQLRELGLRLREEA KDKKEEK >gi|316924250|gb|ADCP01000040.1| GENE 20 24305 - 25012 758 235 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTPATDGANAPPGNSRFEAIASSTLTGRFFGVPFFRPSVSPALLFNVWAYLLGPFAFFLF GLWRKGIVLLAVGLFLYAPLILPTDASALTDILIEAYTPYYQALQALALLFVLLFCLGRG FWALLAFLTFAAVFLYGSFHTERLYIGLGFGTAYAWLNMSIIWKALFQAGLFSLLGRCWL GFAAILIDGLALWYVGLPLDLPVSVLPAAAFPVFCGMMATYDRYRKDVLRETFWW >gi|316924250|gb|ADCP01000040.1| GENE 21 25009 - 25182 221 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRHFLLRGILIGGSMGVLYALAGFSDSLPRAFGVGMIGGALAGLTLAIRQRKRKDRK >gi|316924250|gb|ADCP01000040.1| GENE 22 25352 - 26560 1688 402 aa, chain + ## HITS:1 COG:YPO3132 KEGG:ns NR:ns ## COG: YPO3132 COG0845 # Protein_GI_number: 16123294 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Yersinia pestis # 12 398 10 395 395 196 33.0 6e-50 MKSPVGTPLSLMVSVCTLFALLLVGGCKSEEKGRTASLPPPLVGVMTVEERNVPVSFTYA GQTEGSRAVEVRAQVSGILMRRAYDEGQYVKQGQLLFEIEPDTYRAALRQANGVMEQAQA KFTQARQNLNRVLPLYKKNAVSQKDRDDAQAAYDSAKADLDSAKAAVSEAEIKLSHAYVT APVAGFASREYRTVGNLITAGSQDGSLLTVVNQNDPIYANFAIPSPQFMRLRALQEQGRL KSDGTVAEITLADGTVYQTKGVITFIDKQVNTNTSVVAARAEFVNPDLFVLPGQFVRVTL SGMELVNAILIPQQAVIQTQKGSMVVVIGEGDKAEMRPVDLGDNYGDSFLLNKGVKAGER IVVEGGNKAVPGQPVRIQQASVQDAKLPDQPVGTGSDSGKAE >gi|316924250|gb|ADCP01000040.1| GENE 23 26570 - 29734 4236 1054 aa, chain + ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 19 1045 4 1031 1051 1150 58.0 0 MENKQTPAPEQAPEIKEGFFLRRPIFSTVISLIITLVGAIAISVLPVEQYPDLTPPQVEV NATYTGASANVIAETVASLLESQINGVDNMIYMNSVSSGTGSMSLTVSFNVGTDPDQATI 
DVNNRVQLALAQLPQEVQRMGVSVLKKSSSMLQIIFLTSPDQRYNTIYLSNYALLNIVDE LKRLPGVGDAKNFAAQDYSMRVWLKPDVMSQLGVTPEDIATALRDQNAQFAAGRMGAEPM SPGVGVTWQITTQGRLTTPEQFGDVILRTQPDGSILRLKDVATIELGAQSYDFVGKYNGL DAVPIGIYLSPGANALATAEVVKAKMEELSKEFPVGVAYSIPYDTTTFVKISIEEVVKTL FEAMVLVFLVVYLFLQNWRATLIPCLAVPVSIVGTFAGMYALGFSINTLTMFGLVLAIGI VVDDAIVVLENVERHISDGLPPRKATAKAMQEVTGPVIAIVLVLCAVFIPVAFTGGMAGR MYQQFAITIAVSVVISGAVALTFTPALCALLLKPGHGEPNVFFRKFNGWFEGLTGKYVSI VKLLLRRSLLAVGLFVLVLVGIGGVFERVPGGLVPDEDQGYLIALMTMPDGAAISRTSAV VEKLDKAIMADPLVADVMSFSGMDAVSGAAKTNVATFFFTLKPWGERKAPGDSSFDLARK LYGMGFGLEQGSFIAFNPPPISGMSNTGGFEMWLQDRNGRGSVELSDVAQKLAEAANKRP ELQGVSTSFTVNAPQLFVELDREKARTLGVNVSDVFQTMQATFGSSYINDFNLYGRTFRV YTQSEKDYRARPEDLAEVYVRNKQDEMIPLTALINVKPTSGPQTVERFNNFQAAKFMGNP ATGYSSGQAMTAMEEVAKEVLPDGYTIAWSGSSYQEKLVSSGGSLVFVLALVMVFLILAA QYESWSLPLTVLTAVPFGVLGAITAIWLRGISNDVYFQVALVTLIGLSAKNAILIVEFAV EQYRNEGKDAVHAAEEAARLRFRPIVMTSLAFILGCVPLAVSTGAGAASRHAIGTSVIGG MLVATLVAPLFVPFFFRWIMRASEKLMGTGKKEE Prediction of potential genes in microbial genomes Time: Fri May 13 02:24:17 2011 Seq name: gi|316924215|gb|ADCP01000041.1| Bilophila wadsworthia 3_1_6 cont1.41, whole genome shotgun sequence Length of sequence - 40714 bp Number of predicted genes - 33, with homology - 32 Number of transcription units - 22, operones - 9 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 43 - 1419 684 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 + Term 1443 - 1476 6.1 2 1 Op 2 . + CDS 1689 - 1805 134 ## + Term 1841 - 1890 12.1 3 2 Tu 1 . + CDS 1975 - 2286 284 ## DVU2941 hypothetical protein + Term 2294 - 2330 8.1 4 3 Tu 1 . + CDS 2347 - 3657 1933 ## COG0015 Adenylosuccinate lyase + Term 3769 - 3802 -1.0 + Prom 3733 - 3792 3.7 5 4 Tu 1 . + CDS 3942 - 4505 962 ## COG0461 Orotate phosphoribosyltransferase + Term 4529 - 4572 10.7 6 5 Tu 1 . 
+ CDS 4655 - 5491 479 ## COG3034 Uncharacterized protein conserved in bacteria + Term 5498 - 5539 -0.3 - Term 5436 - 5471 0.0 7 6 Op 1 13/0.000 - CDS 5548 - 6993 1420 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 8 6 Op 2 . - CDS 6996 - 9305 2649 ## COG0642 Signal transduction histidine kinase - Prom 9522 - 9581 3.9 + Prom 9448 - 9507 4.5 9 7 Op 1 . + CDS 9737 - 10213 352 ## LI1001 cytochrome c nitrite reductase small subunit 10 7 Op 2 . + CDS 10206 - 11765 2058 ## COG3303 Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit + Term 11870 - 11927 2.0 - Term 11954 - 12005 19.3 11 8 Tu 1 . - CDS 12119 - 12979 1069 ## DMR_13260 hypothetical membrane protein - Prom 13221 - 13280 3.2 + Prom 12967 - 13026 3.0 12 9 Tu 1 . + CDS 13215 - 14387 1183 ## COG1900 Uncharacterized conserved protein + Prom 14441 - 14500 3.0 13 10 Tu 1 . + CDS 14619 - 15605 1346 ## Ddes_1348 hypothetical protein + Term 15631 - 15668 3.1 - Term 15613 - 15661 10.5 14 11 Tu 1 . - CDS 15683 - 16690 1321 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 16790 - 16849 3.9 15 12 Op 1 . - CDS 16866 - 17624 544 ## LI0581 hypothetical protein 16 12 Op 2 . - CDS 17591 - 18757 1139 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase 17 12 Op 3 . - CDS 18810 - 19385 813 ## DVU3352 putative lipoprotein - Term 19603 - 19647 -0.8 18 13 Op 1 . - CDS 19662 - 19988 433 ## Ddes_0522 branched-chain amino acid transport 19 13 Op 2 . - CDS 19985 - 20743 1031 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) - Prom 20839 - 20898 2.6 + Prom 20913 - 20972 1.8 20 14 Op 1 5/0.000 + CDS 20995 - 22272 1159 ## COG0420 DNA repair exonuclease 21 14 Op 2 . + CDS 22272 - 26249 2958 ## COG4717 Uncharacterized conserved protein - Term 26527 - 26569 6.0 22 15 Tu 1 . - CDS 26585 - 26980 361 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Term 26989 - 27022 1.2 23 16 Op 1 . 
- CDS 27251 - 27844 770 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) - Term 27866 - 27904 -0.9 24 16 Op 2 . - CDS 27905 - 28045 207 ## DvMF_1382 hypothetical protein - Prom 28110 - 28169 2.6 + Prom 27906 - 27965 5.0 25 17 Op 1 . + CDS 28171 - 29520 1201 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 26 17 Op 2 . + CDS 29542 - 30108 352 ## COG0655 Multimeric flavodoxin WrbA + Term 30119 - 30174 4.1 + Prom 30133 - 30192 1.7 27 18 Tu 1 . + CDS 30373 - 30993 791 ## COG1279 Lysine efflux permease + Term 31095 - 31141 9.7 - Term 31155 - 31203 5.7 28 19 Tu 1 . - CDS 31212 - 33038 2703 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases - Prom 33144 - 33203 2.7 + Prom 33016 - 33075 2.2 29 20 Tu 1 . + CDS 33212 - 33727 442 ## DVU3362 hypothetical protein + Term 33943 - 34001 5.0 - Term 33885 - 33917 0.2 30 21 Tu 1 . - CDS 34000 - 34689 605 ## Acid_4679 TetR family transcriptional regulator - Prom 34727 - 34786 3.3 + Prom 34714 - 34773 2.2 31 22 Op 1 27/0.000 + CDS 34858 - 36054 1511 ## COG0845 Membrane-fusion protein 32 22 Op 2 9/0.000 + CDS 36066 - 39239 4159 ## COG0841 Cation/multidrug efflux pump 33 22 Op 3 . 
+ CDS 39236 - 40654 591 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 Predicted protein(s) >gi|316924215|gb|ADCP01000041.1| GENE 1 43 - 1419 684 458 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 13 458 14 457 460 268 34 5e-71 MRIFVVGLMMLGLAGCSFAPDYQRPQMELPQAWKDPGKGEQLDEQWWKRFDDSTLNALVQ EALIANRDIAAAVARVDYARAQLGVARAELLPLLSGQAQGTQTWVDNTKITNGSQSPFSA GFGATWELDLWGKLRNAKEAAMYQVLGTEAAQRGMRLSIAAQTSNAYFLLRSLDLQLSTA ERTVKTRTDALRIYTARYEQGLISELDLSRAKTEVETAKTALYQTRISRDAAESALEALL GRSPKDIMDGTVQRGMTLESIPTPPVIPAGVPSDLLERRPDIQQAEMSVKSANANIGVAK AAWFPAISLTGLFGVVSPELHTLMSNPLQTWSYGGAASVPLLDFGRVKYGVEAAEAKQRE SLATYEKTVQGAFKEMRDALTRQQEMSNVVASLERMVKELRLSVELANTRYDNGYSSYLE VLDAERSLFDSEMQLAAARSERLSSIVNVCLALGGGWK >gi|316924215|gb|ADCP01000041.1| GENE 2 1689 - 1805 134 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAHKKIERKKELDRRRKRRAERIKARIHEAQAAAKNSK >gi|316924215|gb|ADCP01000041.1| GENE 3 1975 - 2286 284 103 aa, chain + ## HITS:1 COG:no KEGG:DVU2941 NR:ns ## KEGG: DVU2941 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 100 1 99 99 80 48.0 2e-14 MPIYEYRCRKCGNVFEEWVKTFDSPELEPCPKCGGDAERIMSNTSFKLEGGGWFASCYGG KASGSASDAAPAADAAPAKADAAPAKAEAAASGSGGGSKAATA >gi|316924215|gb|ADCP01000041.1| GENE 4 2347 - 3657 1933 436 aa, chain + ## HITS:1 COG:BS_purB KEGG:ns NR:ns ## COG: BS_purB COG0015 # Protein_GI_number: 16077712 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Bacillus subtilis # 1 433 1 428 431 456 51.0 1e-128 MIERYTRPEMGKIWTPENRYQAWLRVELAVCEAWNRVGEISDADMETLRKEADFTVDRAF VDRANEIEETTRHDVIAFLTAIEEKVGPVSRFIHLGCTSSDIVDTAGSLLMVEAGELILK AFDRVLAVLKKMTLEHQGLLCMGRTHGIHAEPTSFSLKMAGFYAEFKRDRARFAAALDDM RVGKLSGAVGTYTVLSPEVEAIVCGLLGLKVDEVSTQVIQRDRHAAYFTALAVTAGTIER LCVELRHLQRTEVREVEEGFGKGQKGSSAMPHKKNPISAENMTGLSRLIRTNALAALENQ ALWHERDISHSSVERVIMPDSTILTDYVLHRLCRLLEGLRIMPENMARNMECSFGLYFSQ 
RVLTALIETGIPRQEAYVMVQRNAMKSWETRQPFPELIKADPEINSRLSEETFAGLFDPQ FYLRHEGDILKRVFEE >gi|316924215|gb|ADCP01000041.1| GENE 5 3942 - 4505 962 187 aa, chain + ## HITS:1 COG:alr5099 KEGG:ns NR:ns ## COG: alr5099 COG0461 # Protein_GI_number: 17232591 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Nostoc sp. PCC 7120 # 4 186 23 203 206 155 42.0 4e-38 MRELKKRLAKILIERSYREGDFTLASGRKSDYYFDCRQTALHPEGSWLIGTLFNELLADL DIKGIGGMTLGADPLISATTVISHEKGRPLAGLIVRKESKGHGTNQYVEGLSNFKPGDPV AMVEDVVTTGGSLLKACERVKDAGLNVVAVCTVLDRGEGGREAIEAAGYQLRALFTRPEL VELAKED >gi|316924215|gb|ADCP01000041.1| GENE 6 4655 - 5491 479 278 aa, chain + ## HITS:1 COG:aq_183 KEGG:ns NR:ns ## COG: aq_183 COG3034 # Protein_GI_number: 15605752 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Aquifex aeolicus # 42 272 139 389 397 105 33.0 9e-23 MPFECVSMRVWGCTALLLAGLMCEPEARAWEAVTDGLPDQAVVAVDKSRQRFFLMEKGKA RDYLCTTGQAQGDKQVRGDLKTPEGVYFVVRKRTERLDFEEYGGEAYILDYPNPVDRLRG KTGSGIWVHSRGRAITPFESRGCVVLNLKDIAEVGPELKRGTPVLIGERVEIAPRKDAVR EVEERTRGWRAAWRRGEAGDGFIAPERAASIRKVERQEKRVRVTFGPVRVLEGPGYVVSW FAQRTASPDGGAETGVRRLYWEKQGDGEYRIVGMAWAD >gi|316924215|gb|ADCP01000041.1| GENE 7 5548 - 6993 1420 481 aa, chain - ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 3 470 6 457 461 275 35.0 1e-73 MARILVVDDESLIRVLVADVGESLGHEVLTAASIKEGQDLAQRGVDIVLLDVLLPDGDGL AHVETFSQLPGRPDVIIITGHGNADAAEAALRSGAWEYLVKPLRVRDLSKALTQAVQWRN SRDSQSRSLLRHPDIIGKSPALAEALEALREAAASNVNVLITGETGTGKELFASALHANS LRASGPFVTVDCTTLPETLVEAHLFGHARGAFTGADRAREGLLAAADHGTLFLDEVGELP LPVQGAFLRALELRRFRPVGEVREEESDFRLVAATNRDLDDMVGMDLYRSDLRFRLRGMT IHVPPLRRRAEDIPLLAEHFTARYCQRHELPNKELTPDCYAMLADYSWPGNVRELRHTIK RACAAAGDGTQLFTRHLPTEIRIELARKRLVHLSEPEESAPTAPLPEQEPSTQPRKPETF 
PTLRDMKAKAERDYIAGLFAACQGDVRRAAGIAGVSRGHFYELLKKYGLDRTSSIPEFPA I >gi|316924215|gb|ADCP01000041.1| GENE 8 6996 - 9305 2649 769 aa, chain - ## HITS:1 COG:slr1759_4 KEGG:ns NR:ns ## COG: slr1759_4 COG0642 # Protein_GI_number: 16329648 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 239 475 17 260 270 165 39.0 3e-40 MSTFSWRVLRLPLLITLLWGLLLFTLFRWTAQREDEYTTGLARIQTATLFSSIVDTRDWN ANNGGVWVREHPGCPANPWLPEEERTLRAEGGTTLVKVNPAYMTRQIAESFTSTLASFRI SSLSPKRPENRADQWETGALLSFEKDRHELFDLVSDKEGMRYRYMAALPAKESCIQCHQD KKVGDVLGGISVSISAEPLFAAATERKRTTGLAFGIIGIIGMVGIGGATFQINRKKELAE AANRTKSAFLANMTHDMRTPLTGILGMTELLERETQDSRHRYLLANLRKATDSLLTVVDG IMRYSLLEADRQPTCCAPFSLRAELGACIAVLRPACASRDIRLSLVTDDTVPDRLVGDGF RLRQALGNLLGNAVKFTEKGSVTLRVTRAESQAPEGRCALSFQVIDTGRGIPTDEQERIF ESFEQGSGVRDSGDQHETGVGLGLAIARNIARRFGGDLTLSSTPGMGSVFTFTALFRLAA DSDGLPETPASPCPASPSPLAQPDRPAETGTSGRLVVAEDTAVTALFLSEALTQAGYAAH MASSGEDALYLIRKLQPDAVLLDMRLPDMTGLDIAGQIRSGELGVAPETPILVLTATLDP GDEQAFRRMGINRWLLKPVQAGKLASVVADLLPFTRKEQEETPMPASEPAEQPADVFDPA AALDALGGEALLKRLAGIFLGEEPNIRANLQRFALSPETLPELCPELRRQAHSLKNGAGM LHLESLRKASSDLEQAAASPAGQDFAALLRTTIEALERASAALRKHCGA >gi|316924215|gb|ADCP01000041.1| GENE 9 9737 - 10213 352 158 aa, chain + ## HITS:1 COG:no KEGG:LI1001 NR:ns ## KEGG: LI1001 # Name: napC # Def: cytochrome c nitrite reductase small subunit # Organism: L.intracellularis # Pathway: not_defined # 1 158 1 158 158 206 64.0 2e-52 MSEVSPRKCPWLKILLGGVALGVLVLAGLAWGMRVTDARPFCSSCHIMEQAARTHKLSPH AKLACNECHAPTALLSKLPFKAKEGARDFYMNTLGDVDLPIVAGMATKDVVNANCKACHF ATNENVASMDAKPYCVDCHRSAQHMRMKPISTRMVADE >gi|316924215|gb|ADCP01000041.1| GENE 10 10206 - 11765 2058 519 aa, chain + ## HITS:1 COG:ECs5052 KEGG:ns NR:ns ## COG: ECs5052 COG3303 # Protein_GI_number: 15834306 # Func_class: P Inorganic ion transport and metabolism # Function: Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit # Organism: Escherichia coli O157:H7 # 98 452 70 412 478 187 32.0 3e-47 
MNSRVFYTLIRLAVVCAALGLAAGCDDVSTAELKTPVYKTGLKDSDDNKTSAFRKDFPLQ AASYDRNNESTFMSKYKGSVNFMKNDDVDPLPEGYKQAAQPYLKNLWLGYPFMYEYREAR GHTYAIHDILEIDRINRYGEKGGMPATCWNCKTPKMVQWIKQYGDDFWAKDFNQFRTEVT DDDSIGCATCHNAETMQLQLYSEPLKDYLKSVGKDPAKLPRSEMRSLVCAQCHVEYYFND PGHGPTKRPVFPWKNGFTPEAIYSVYEDNGNVDMPGFKGKFADWVHPVSQTPMLKMQHPD YETWIDGPHGAAGVACADCHMPYQREEGKKMSSHWWTSPLRDPELRACRQCHADKTAAYL RGRIEYTQDKTYKQLLVAQEYSVRAHEAVRLALEYTGEKPADYDQLIIRAKEAVRKGQMF WDFVSAENSVGFHNPAKALDTLTSSITLSQDAINAALKATNYGIAPKLEGDIKQIVPPIL KMSRKLQQDPEYLKTHPWFAYLKPLPKADQMWEGNKKIQ >gi|316924215|gb|ADCP01000041.1| GENE 11 12119 - 12979 1069 286 aa, chain - ## HITS:1 COG:no KEGG:DMR_13260 NR:ns ## KEGG: DMR_13260 # Name: not_defined # Def: hypothetical membrane protein # Organism: D.magneticus # Pathway: not_defined # 1 265 32 302 302 194 38.0 2e-48 MNLSKYEHAERKEAEASKDAPETPQDSGAEQKHHTLFGKQFTEKEQKTLFLYLVIGIVLM ELAVTVGAIIISITNAQPSSSGVPHFQFPWIGYLVAVVMVPVLAMLLVNLVSLGFSRGAR GGEDVNLEGVPQRMQTFYALVRGAPTVILFAGFVLMCAAIYYLDGVMSLLLKLGENFHLV AIWVVGGFAVAWMVSYVVRAWMHYKTKQMEAEYAFRHEVLERTGMVILDTKHAPTTELRM LPPVPGGQPGALPPAVDVDASAALPSAEEGQEDSSQTTVDVGSEKK >gi|316924215|gb|ADCP01000041.1| GENE 12 13215 - 14387 1183 390 aa, chain + ## HITS:1 COG:MTH855_1 KEGG:ns NR:ns ## COG: MTH855_1 COG1900 # Protein_GI_number: 15678875 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanothermobacter thermautotrophicus # 7 389 1 373 378 394 52.0 1e-109 MSSHKVIKTIAEINERIKKGRAVVLNAEEMTEAVRRMGPEKAAREVDVVTTGTFSPMCSS GLLFNIGQPEPPKIRTTRVWMNDVPAYAGLAAVDAYLGATELPEDDPQNKVYPGQFKYGG AHVIEDLVAGRSVHLRATSYGTDCYPRRSLDRTVTLADLTCAQLLNPRNCYQNYNAAVNC TSRTIYTYMGPLKPDMRNVNFATAGCLSPLFNDPWFRTIGTGTRIFLGGGQGYVIGAGTQ HDPSPQRTEKGLPLSGSGTLMVRGDLKGMKPRYLRGLSLTGYGVSLSVGVGIPIPILNEE MAAFTGVSNEDILMPVKDYGYDYPNGLPRVIQHVRFSDLLSGEVEIQGRKIPTVPLTSHV ISLEIADTLKAWIEKGEFLLTEAVDRIPAY >gi|316924215|gb|ADCP01000041.1| GENE 13 14619 - 15605 1346 328 aa, chain + ## HITS:1 COG:no KEGG:Ddes_1348 NR:ns ## KEGG: Ddes_1348 # Name: not_defined # Def: hypothetical 
protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 24 328 1 306 306 449 72.0 1e-125 MGAGRSGFLSFSFRTNVARRGVLVKLPEFGFLEKCKIVGKTLLVQVCLFITLLLLAWGSQ ALYAKLDRTEFPTPVQITDADKLDENAKGKALVDAITHQMRYELDSTFGWSINDILFNRF VLDNRAYRQYGVYHATKVLMDLYSMTIAKLGTNDRESEMLYKARLNSFAIDPRSFMFPSA ESSYKKGLKLIEQYKESLDKGTGVYNCRTDDLYASFDLVIGENLLGYALGLLENSQELPF YTLDNRIYEVQGIVLVVRDFISALYELYPEISSKGNAGNMAAAIEYMNRICTYDPLYITS KVNSGELIISYVLFAKNRLEDIRNSIRI >gi|316924215|gb|ADCP01000041.1| GENE 14 15683 - 16690 1321 335 aa, chain - ## HITS:1 COG:BH3709 KEGG:ns NR:ns ## COG: BH3709 COG0451 # Protein_GI_number: 15616271 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Bacillus halodurans # 1 333 1 333 343 440 61.0 1e-123 MHVLVTGAAGFIGFHLSKRLIAEGHTVVGIDNLNDYYSVQLKKDRLAQLQALPGFTFEHT DLADDAALEAVFVRNAFSHVVNLAAQAGVRYSLINPKSYVQSNLVGFGNLLECCRHGKVE HLVFASSSSVYGMNTSMPFSVHDNVDHPVSLYAASKKANELMAHTYSHLYRLPATGLRFF TVYGPWGRPDMALYLFTKAILAGEPIKVFNEGKMRRDFTYIDDIIEGVMRVMARIPQPDP AWDSAKPNPSTSTAPWRIYNIGNNNTVELGTFISTLEDALGKKAIRNLMPMQPGDVEATW ADVSDLIADTGFRPQTSVEYGVGQFVKWYKEYYGA >gi|316924215|gb|ADCP01000041.1| GENE 15 16866 - 17624 544 252 aa, chain - ## HITS:1 COG:no KEGG:LI0581 NR:ns ## KEGG: LI0581 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 9 249 4 237 243 150 36.0 4e-35 MEIALVAPVPLNISPAFRPWRQAGITHLLLDDEVRERLLARPEPEPSEAAPPVRQHSAPP QRGERRQPQPPSYTPPPQPQPVSSAPQKPVRIADTDILAPDAWPAYWQTLLKKTPPRPSL VWSYPSLSRDLGGDADAAHRDFLRRLLGDMALPKGSHAFWPLNRYPYGDGESEQTVDARM FLSGIAALKPESVILMCGQVPPELGLAELRPLSPSIVHGHRYVVTPHVDDLIGKPQRYAQ LITFLKSIIAGR >gi|316924215|gb|ADCP01000041.1| GENE 16 17591 - 18757 1139 388 aa, chain - ## HITS:1 COG:BS_yloI KEGG:ns NR:ns ## COG: BS_yloI COG0452 # Protein_GI_number: 16078633 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus subtilis # 3 382 13 392 
406 231 40.0 2e-60 MCGSVAAYRALDLVRQWQDAGVSVSATLTPSAQKFVTPLTFAALGAAPVYTAPFDDPQAP SPFAHLEPGQVAQALVIAPASAATIARLAQGQADELLACQALAFRGKAVIAPAMNPAMWS HPATQANIATLRERGCAVVEPGCGRTACGEEGQGRLADLREIYLAGLRALAPQDMEGIRV MVTLGPTREHWDGLRFWTNPSTGTMGAAVAVAAWLRGARVEAVCGPGTPWLPAGIARHNV GSARHMLEAAASVWPQCDAGVFTAAVADFSPVPPAQGGDKKFKKSDAPDGFDVHFAPNPD ILKTLAADRRPEQKVVGFAAESQNLEDSVRGKLVSKKADMIVGNLLQDGFGTADNTVFVA DASGREERWAHLSKTDVAWRLLSWLLSR >gi|316924215|gb|ADCP01000041.1| GENE 17 18810 - 19385 813 191 aa, chain - ## HITS:1 COG:no KEGG:DVU3352 NR:ns ## KEGG: DVU3352 # Name: not_defined # Def: putative lipoprotein # Organism: D.vulgaris # Pathway: not_defined # 2 191 3 189 189 141 42.0 1e-32 MRALYCLLLICSLLPLSGCADLNISNPFETKSSDGSEVYFDQFPDVPIPRDMSVDAKRSL ISVAQDGTKTGLITVEGRVDKPSLANAMILNMNRQGWNLRGAAIGSKTMHLYEKGERYAV IYYYEQTTTAAMEIWVMTRLADGVLPTMGNGGAAAGMDAGGSSSPSFYLTPDTSTGTGVA GGGVHQQGLSQ >gi|316924215|gb|ADCP01000041.1| GENE 18 19662 - 19988 433 108 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0522 NR:ns ## KEGG: Ddes_0522 # Name: not_defined # Def: branched-chain amino acid transport # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 2 105 10 112 116 94 52.0 1e-18 MNIDLLLCILGMALVTLIPRVMPVTLLAGRELPPLLTRWLSFVPVSVLAALVAPDLLLAD GKLNISGDNLFLIATFPTLLICWYKKGSLFGALAVGMGTVALIRYWMT >gi|316924215|gb|ADCP01000041.1| GENE 19 19985 - 20743 1031 252 aa, chain - ## HITS:1 COG:lin1480 KEGG:ns NR:ns ## COG: lin1480 COG1296 # Protein_GI_number: 16800548 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine resistance) # Organism: Listeria innocua # 24 237 12 231 235 116 32.0 3e-26 MSDTIESTVMAEPEASSSSAILTGMKRGFPIFLGYVPLGFAYGVLAVQNNIPAVYAVLFS LLVYAGAGQFIAVGLWGMGASVFSIVFTTFVINLRHVLMSAAVAPWFAPFTRFQQFIIGW GLTDEVFAMHSMAMATGEKARLPLVYAANFTSHSGWVLGTFIGAVAGDFLPDPKLFGLDY ALPAMFLALLVPQCKERLYTLAAVLSALLSVILAMYDTGRWNVIIASVITSTIGALLVTH RDRNIDALRRKA >gi|316924215|gb|ADCP01000041.1| GENE 20 20995 - 22272 1159 425 aa, chain + ## HITS:1 COG:MA2363 KEGG:ns NR:ns ## 
COG: MA2363 COG0420 # Protein_GI_number: 20091196 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Methanosarcina acetivorans str.C2A # 4 422 22 435 443 242 37.0 1e-63 MPPIRFIHAADLHLDAAFSGLSRDIPADFAERLRTATFTAFRRLLDLCERESPDFLLLAG DVYNQEDASVSAQLALRDGFRRLESLSIPVFLVHGNHDPLASRLRSVRWPGNVTVFGELP DAVPVFRKGEGTPLAIIHGASHASGRETRNLAALFRRTEACGLHVGLLHATPGDADGVAR YAPFSQEDLKASGMDYWALGHIHDRREVCREPLAAYPGCTQGLHINEPGEKGCLLVTAEA RPDGGYAVRTSFRPLGPAVWKTLDMDLGGAASLDELENRLRTGLDRAAAEVWSGCEMLLV RLRLQGRTELDGLLRKGTTCAELADRLREDGSGVPRIWIKDIDVATRPCVERGALLERED LLGEVFRLSEAARTASGLQALREGPLAPLFAHARAGKALEPLTDEELARLLDDAESLCLD LLEND >gi|316924215|gb|ADCP01000041.1| GENE 21 22272 - 26249 2958 1325 aa, chain + ## HITS:1 COG:MA2362 KEGG:ns NR:ns ## COG: MA2362 COG4717 # Protein_GI_number: 20091195 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 1309 10 1290 1300 257 24.0 1e-67 MRIRSFHIDAFGTLRDVTAEGLPEAGAVFLGHNEAGKSTLLDFFRSTLVGYPRTRDARER GYLAGQSGLLGGSLTLFSDARGGDIRLIRRPNVAKGEPLLTDAQGRPLDPALWERLLGGV TREVYASVYGFSLSELQSFASLTSEGVRNALYGASFGMAGLKSPGAALKKLTGSMEDLFR ARGSNPRLSAALKEWEDVRRDMRRAEEDAARYDSLAAERDAAQGRLAALREERADRERER RVLERRLGVWERWEEWRLAGVRLERLEPVPATFPQDGPARLERALERRSDAERALEQARQ RLEQAREDLGSRVADRALLDCGERLRELAGRAASCRNALAAIPGLCADRERTLAALQREL AGLGPDWTAERVQRIGRPLSLREALERQAEQRRSAVSDMENAQAAVRRIAEDTKALESEL EEQMRQAADVPEPPMALSEADRERLREAMARAEEARRRLPEAQTSSAEAERELDQALSRL ALDPGTSGRALEALSAGQDAIAALASEILHKEVERKDRERRVAAAGSDAERAGETLARLR ERRAACPARSAVDAQWAALRRLRSALSRLAVEHVRFTDAESRCAEHGAASGTESSPALVV LGSVLAVLGLAGAVLRGVLEVAFLRLGSVALPLEGWLLGAFLLVGGAFVWAGFPRRKERR PEFAATAERLQQRRTACLQRVQAVQREIGELCRTVGLSDAEEATVDAFERAVELARERCA AGERLAEEIQRQEAFCAEAERHAQTERAALEQASADERAAMARWLAWFQTHGVDAPLPGE AAVFCARVDSARMRLASALAKRRELAALEADRAALPECVRSLLSEAFYSGDDDIRAAEAA RNALELCREADRLCDERRRLEESVQAMTLQLTRLEQSRLGAEDALEAARARQVAADAAWE GFLGGLGLAAGLSPATAREALERMDRVQALEAERLRLDEELERQERERDALRMPLRGILR 
QLGRLPDASANNAAEEPDWPGCLEALLREWESARAENAEVVRLRARSEEQAVEVREAEAV HHDAVREVERLLNMAQVPDAEAFYRRHGAKLEREALERRREDLEDALRLAARDLYGADAD IPAFFASFGEADKEVLEAELAGLAGRLSMLADEEERLADSLRTQEVRLEQAEGSETLSRL RLRAASLSGTIRGLGLEWSRYALARHLLLEARGRFEKERQPGVIRAASALFSAITGGAWV GIAASLEDSSLRVLPPHGEPVSPEVLSRGTQEQLYLALRLAHIRNHAAQAAALPVIMDDV LVNFDPDRALRTAQTFGDLASSQEGSPGHQLLYFTCHPHMADMLRKAVPGVGLYVMERGT IREEE >gi|316924215|gb|ADCP01000041.1| GENE 22 26585 - 26980 361 131 aa, chain - ## HITS:1 COG:RSc2831 KEGG:ns NR:ns ## COG: RSc2831 COG0494 # Protein_GI_number: 17547550 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Ralstonia solanacearum # 10 119 29 144 153 108 51.0 3e-24 MEEKARIEVVGGILWRGGSFLAAQRPEGHPQAGFWEFPGGKVEPGESLEAALARELAEEL SLSVRNPRLWRTVEHDYDFRSVRLHFFHITEFSGEPVANDGQAFRWVTPEEALTLPFLEA DRPLLFDLSRP >gi|316924215|gb|ADCP01000041.1| GENE 23 27251 - 27844 770 197 aa, chain - ## HITS:1 COG:NMA1182 KEGG:ns NR:ns ## COG: NMA1182 COG0138 # Protein_GI_number: 15794127 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Neisseria meningitidis Z2491 # 6 195 4 194 530 199 50.0 4e-51 MDFLPIRRALLSVTHKDGLAEFATFLHANGVELVSTGGTLKFLENAGLPVTAVSDVTGFP EILGGRVKTLHPHIHGGILADKDNPEHLAVLKEKGIKAFDLICVNLYDFESALAKGLKPR EAIEEIDIGGPCMLRAASKNFHSILVLSDPDTYAEATAELKANDMSVSLPFRQRMAARTF AKTSRYDAMIAEYLTKN >gi|316924215|gb|ADCP01000041.1| GENE 24 27905 - 28045 207 46 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1382 NR:ns ## KEGG: DvMF_1382 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 46 1 46 46 74 82.0 1e-12 MFNELVSNAMHSTPQGFVGVSLIFVYFCVIVATAFYRIKKGDHMHH >gi|316924215|gb|ADCP01000041.1| GENE 25 28171 - 29520 1201 449 aa, chain + ## HITS:1 COG:CAC0523 KEGG:ns NR:ns ## COG: CAC0523 COG2265 # Protein_GI_number: 15893813 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Clostridium acetobutylicum # 16 448 14 457 460 232 30.0 1e-60 MAKPFVNPFVSGQFLELRVDALVSDGRGLGRHDGVVVLVEDALPGQLVRARVAAVRKSLA EAALVEVLERSPDEQEPPCPHAGRCGGCSWQSLAYGRQLAWKERIVHDALERVGKVACPP MLPILPSPSEWGYRNKMEFAFGVSGDGGKALGLRERGSRNIVGVTGCLLQRPLTMRVLEA VRMLSNGAFHDLDGRFLVVREPNAGGCFVELIVGAGGSLEAGERFAEALRAETPEVTGFA LSERLAPSDVAYGERALYADGRLEERIGDVRLQMGHNAFFQVNTPAAELLYAEAARFAAL EELPDPVLWDVYGGVGSIGLYMGRNARVVGIEEMPGAVRYARGNAKALGRKAYKVEQGDA KSVLGRLVKLGPKPDVVVVDPPRAGIDAKVAQQLAEAAPLRLIYVSCNPATLARDVARLA PAFRLKAARPVDLFPQTPHVETVALLERA >gi|316924215|gb|ADCP01000041.1| GENE 26 29542 - 30108 352 188 aa, chain + ## HITS:1 COG:CAC3505 KEGG:ns NR:ns ## COG: CAC3505 COG0655 # Protein_GI_number: 15896742 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 188 1 188 190 284 72.0 6e-77 MKVIGINGSARKDGNTSLLIKAVFAELEAEGIETDLVQLHGKALEPCKACFGCGGKRQCV VKTDYFNECFAKMVEADGIVLGSPVYSADVTAGMKAFLERAGVVVATNPGMLRHKVGASV AAVRRGGGLAAVDTMNHFLLNKEVIVVGSTYWNMAYGRDAGDVLNDAEGMANMRNLGQNM AFVLKKIR >gi|316924215|gb|ADCP01000041.1| GENE 27 30373 - 30993 791 206 aa, chain + ## HITS:1 COG:FN1861 KEGG:ns NR:ns ## COG: FN1861 COG1279 # Protein_GI_number: 19705166 # Func_class: R General function prediction only # Function: Lysine efflux permease # Organism: Fusobacterium nucleatum # 1 202 1 202 207 191 54.0 8e-49 MLAYLQGLSLGLAYVAPIGVQNLFVINAGLSQPRAAAYRTALIVIFFDVTLALAGFFGIG ALIEHSELIRKGVLLAGSLVVMYMGLRLMLSREVAGPAINVNIPLMKTIGMACVVTWFNP QAIIDVSLLLGSFRVALPPEESGLFLWGVVCASCLWFLSLTTISSLFKNRFTPRLLRIIN LVCGLVIFFYGVKLGWSFVELWQSAF >gi|316924215|gb|ADCP01000041.1| GENE 28 31212 - 33038 2703 608 aa, chain - ## HITS:1 COG:CAC0970 KEGG:ns NR:ns ## COG: CAC0970 COG0119 # Protein_GI_number: 15894257 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # 
Organism: Clostridium acetobutylicum # 11 453 2 443 448 473 50.0 1e-133 MSLLYQKHASRPLTLDNRVEPELYRDVFPYTHVSRIEFDDTFLVPRPADPMFITDTTFRD GQQARPPYTVKQIARIYDLLHKLGGKSGLIQASEFFMYSPKDRKAIEVCRSRGYRFPRVT GWIRANMDDLKIAHDMEFDEVGLLTSMSDYHIFLKLGKTREQAMNDYLKVVTKALEWGIV PRCHFEDVTRADIYGFCLPFARKLMELSHEASMPIKIRLCDTMGYGVPFPGAALPRSVQR IVRAFTDEAGVPGAWLEWHGHNDFHKVLVNGVTAWLYGCGGVNGTLMGFGERTGNAPLEA LVIDYISLTGNDEAADPTVITEIAQYFEKELDYRIPDNYPFAGKDFNATSAGIHVDGLAK NEEIYNIFDTTKILNRSVPIIINDKAGRAGVAYWINQQFNLPPERQVSKKHPAVGQIHTR IMAAYEEGRNTSFSNKEIKNLVRRFMPELFDSEFDQMKRIAGELASNLVERLARDCQSTA DSEALTAQLQHFVRDYSFIQYAYVTDVKGHSTAIAISDPGDQKGYKAFPIGFDYSNREWF LQPMRTGKLHITNVHQSQVTGQLIITVSTVITDANDEIIGVLGADIQLEEIIRRAESLEA EVPNSEEE >gi|316924215|gb|ADCP01000041.1| GENE 29 33212 - 33727 442 171 aa, chain + ## HITS:1 COG:no KEGG:DVU3362 NR:ns ## KEGG: DVU3362 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 46 170 1 125 126 138 58.0 8e-32 MEERVQFRPDAGTAPQSIPFGEANPLRTGPARDNLSEKTMFQGVTMDAISIIAEQRIREA CERGAFDSLPGAGKPLELEDDSHIPEDLRMAYKLLKNAGYVPEEVLDRKEAQSIVDALEK CGDEQEKVRQMKKLEVVIARIKARHPNAPVLGDGSPYYERVVSRITVNKKD >gi|316924215|gb|ADCP01000041.1| GENE 30 34000 - 34689 605 229 aa, chain - ## HITS:1 COG:no KEGG:Acid_4679 NR:ns ## KEGG: Acid_4679 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: S.usitatus # Pathway: not_defined # 1 206 1 206 232 154 41.0 2e-36 MEKHTARKYSSPIRERQANQTRTNILDATQRLFLERGYAKTTVEAIAQEAGVAKQTVYAV FRSKNGIVAELLDRAVFTERVFELHDRSLETANIHEALKLTAQLVLQVHESQSPVFDLLR GAGMFDPQLARVQNDLRCVNRDRQENHVRFLLRGRRLKEGIDMGMALDVFWCLTSRDLYR MLVQERGWSGETYANWLYEMLANSLLHVSEEHPGEGWEMRTEEHLSAQR >gi|316924215|gb|ADCP01000041.1| GENE 31 34858 - 36054 1511 398 aa, chain + ## HITS:1 COG:YPO3132 KEGG:ns NR:ns ## COG: YPO3132 COG0845 # Protein_GI_number: 16123294 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Yersinia pestis # 13 386 6 380 395 354 55.0 1e-97 
MLVKHERRPFRFGVTLLAGLLGCLSLMGCKDEKTGAQAPTVPAVEVAVEIVTPQKVLYTT ELAGRTSGFQIAEVRPQVSGIIQKRFFEEGADVKAGDVLYQIDPATYQANLDSAKANLAR AEANVAPARLKMQRFKDLVNISAVSKQEFEDAEAAYKQALADVGVNKAAVENARIRLAYT KVTSPISGRSGRSLVTPGALVTENQSSPLTTVQQLDPVYVDVTQSSTEVLRLKRSLEDGT LQRADQDHAAVRLLLEDGSEYGLTGTLQFADVSVDESTGMVTLRAIFPNPKQELLPGMYV RAILNEGVDDQAILLPQRALLRDAKGNPTTYVVNAENKVEIRPLKVGRTQGNSWVVLDGL KAGDKVIVEGLQKIRPGSPVRIAEPTPVEQGASASEKR >gi|316924215|gb|ADCP01000041.1| GENE 32 36066 - 39239 4159 1057 aa, chain + ## HITS:1 COG:STM0475 KEGG:ns NR:ns ## COG: STM0475 COG0841 # Protein_GI_number: 16763855 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Salmonella typhimurium LT2 # 1 1016 1 1019 1049 1377 67.0 0 MARFFIDRPVFAWVIAIIIMLAGGISILNLPISQYPSIAPPTIGITAQYPGASAQTVQDT VTQVIEQNMNGLDYLLYMSSTSDSSGQAQITLTFSADANPDIAQVQVQNKLQLATPLLPQ EVQRQGISVAKRGANFLKVYAFVSEDGSMDNADIADYVATNLKDAISRISGVGEVSLFGS QYGMRIWLDPHKLQSYKLMPSDVINAINAQNAQVSAGQLGGTPSVPGQQMNLAISVQERL QTPEQFGNVILRTNPDGSLIRVKDVARVELGSESYSTLSRYNGKPATGIAVKLATGANAL ETASNIDDYLKGAEAFFPQGLKAVYPFDTTPVVKISIHEVVKTLGEAVLLVFLVMYLFLQ NFRATLIPTIAVPVVLLGTFGVLSAFGYSINTLTMFAVVLAIGLLVDDAIVVVENVERVM SEDKLSPKEAARKSMDQITGALVGIAMVLSAVFIPMAFFGGSVGVIYRQFSVTIVSAMVL SVVVAIVLTPALCATMLKPIGGDHVHSQHGFFGWFNRWFDSATLRYQSGVSYVIRRAGRF MLIYLVLIGGLVFMFKTLPTGFLPDEDQGMLFAQIQLPTGATQEETTKVLERVERYFLEE EKESIDSLMGVLGFSFAGNGQNMAMAFVRLKDWSERQDPSLKVDAVSKRAMAAFSKIRNA QVFAFAPPAIMELGNATGFDFQLQDKSGLGHEALLQARNQLLGMAAQNKNLVAVRPNGQE DQPQLRVDIDREKAGALSLSLADINAALSTVWGSSYADDFLDKGRVKKVYVQGDAPFRMV PEDMEKWFFRNSKGEMVPFSSFASAHWEYAPARLERYNGVPSVEILGQPAPGVSSGTAML EMEKLASQLPQGIGYEWTGLSYQERLSGSQAPALFALSILVVFLCLAALYESWSVPFAVI LVVPLGVLGALGAANMRMLSNDVYFQVGLLATIGLSAKNAILIVEFAKELHDKGGDLIEA TIEASRMRLRPILMTSLAFLLGILPLAISTGAGAGGQNAIGTGVMGGTFAATALGIFYIP VFFVVVTSVFSFRWKWKKPKRRRTANIVETIPGDSHK >gi|316924215|gb|ADCP01000041.1| GENE 33 39236 - 40654 591 472 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 2 471 1 
457 460 232 31 3e-60 MMSAKYMALALAVLLPGCTMAPNYVRPEAPVAEAWPDDLMPKGTVLAQATDQLASEIGWK TFFTDQHLQRLIQLALDNNRDLRVSALNIERARGLYQIQRADLMPGVSASGESSNKGVSA DLSTTGEQTVSRQHSLGVGVTSYELDFFGRIQSLKDKALETYLGTEEAYRSARLSLVSEV AQAYLALVADRERLNIATETLKSQQASYEMIARRHSVGVSSELDLRQAQTSVDTARVDIA RYSGQVAKDITALSLLAGTKVTPDMLPAKALSELPVWPDIPVSLPSTVLLQRPDILQAEH TLKAANADIGAARANFFPRISLTANIGTASNELSHLFDGGTGIWTFLPQVSLPIFEGGRN VANLRVSEADKKIAVANYEKAIQSAFREVSDALIDRVSLAGQLEAQKSLVHATSETYRLS GERYNQGIDSYLAVLDSQRAMYSSQLNLISVRVAREQNLIMLYKALGGGVKE Prediction of potential genes in microbial genomes Time: Fri May 13 02:25:20 2011 Seq name: gi|316924193|gb|ADCP01000042.1| Bilophila wadsworthia 3_1_6 cont1.42, whole genome shotgun sequence Length of sequence - 21252 bp Number of predicted genes - 22, with homology - 19 Number of transcription units - 11, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 215 - 256 6.1 1 1 Op 1 . - CDS 365 - 967 709 ## LI1044 hypothetical protein 2 1 Op 2 . - CDS 985 - 1539 457 ## COG1396 Predicted transcriptional regulators + Prom 1536 - 1595 5.2 3 2 Tu 1 . + CDS 1619 - 2203 442 ## COG1280 Putative threonine efflux protein + Term 2362 - 2428 24.8 - Term 2344 - 2416 13.0 4 3 Op 1 . - CDS 2426 - 3400 1173 ## COG3008 Paraquat-inducible protein B 5 3 Op 2 23/0.000 - CDS 3397 - 4188 291 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 6 3 Op 3 . - CDS 4185 - 5366 1414 ## COG0767 ABC-type transport system involved in resistance to organic solvents, permease component 7 3 Op 4 . - CDS 5369 - 5935 740 ## COG0576 Molecular chaperone GrpE (heat shock protein) 8 3 Op 5 . - CDS 6029 - 7048 599 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 - Prom 7074 - 7133 8.1 + Prom 7110 - 7169 2.8 9 4 Op 1 . + CDS 7213 - 7419 84 ## 10 4 Op 2 22/0.000 + CDS 7446 - 7790 385 ## COG1918 Fe2+ transport system protein A 11 4 Op 3 . 
+ CDS 7797 - 9992 2504 ## COG0370 Fe2+ transport system protein B
+ Term 10080 - 10114 -0.3
- Term 10159 - 10202 4.4
12 5 Tu 1 . - CDS 10411 - 10620 96 ##
13 6 Tu 1 . + CDS 10583 - 10990 327 ## Dalk_4820 hypothetical protein
+ Term 11003 - 11054 14.2
- Term 10992 - 11042 6.4
14 7 Op 1 8/0.000 - CDS 11186 - 12415 1253 ## COG0247 Fe-S oxidoreductase
15 7 Op 2 9/0.000 - CDS 12423 - 13712 1709 ## COG3075 Anaerobic glycerol-3-phosphate dehydrogenase
16 7 Op 3 . - CDS 13709 - 15376 1994 ## COG0578 Glycerol-3-phosphate dehydrogenase
- Prom 15458 - 15517 3.7
+ Prom 15767 - 15826 1.9
17 8 Op 1 6/0.000 + CDS 16072 - 16764 706 ## COG4149 ABC-type molybdate transport system, permease component
18 8 Op 2 . + CDS 16704 - 17519 240 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17
+ Term 17619 - 17654 2.7
- Term 17681 - 17748 14.5
19 9 Tu 1 . - CDS 17756 - 18898 1230 ## COG2005 N-terminal domain of molybdenum-binding protein
- Prom 19031 - 19090 1.7
20 10 Tu 1 . + CDS 19087 - 19869 1008 ## COG0725 ABC-type molybdate transport system, periplasmic component
+ Term 20037 - 20073 0.5
+ Prom 20174 - 20233 4.9
21 11 Op 1 . + CDS 20264 - 20764 603 ## COG2947 Uncharacterized conserved protein
22 11 Op 2 .
+ CDS 20761 - 21027 293 ## + Term 21181 - 21215 3.5 Predicted protein(s) >gi|316924193|gb|ADCP01000042.1| GENE 1 365 - 967 709 200 aa, chain - ## HITS:1 COG:no KEGG:LI1044 NR:ns ## KEGG: LI1044 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 28 199 21 193 204 140 42.0 3e-32 MKKRAPSLLLALLALAGLLTAGGCSLGRSPRPDFYMLSSPVENVVQSGKEKISGPRVAIG PVSIPGYLDRPQLFLRDGNDVKVELAEFNHWSEPFGEGVTRVLCDAVSASLTPRKGLASP MRSQQPFQWRIAVDIARFDGAPNGSVILDAGWSLVNESGEELKSGRFVQHAPAGPDIPSM VQAQSALLAQFGAVLGQMIP >gi|316924193|gb|ADCP01000042.1| GENE 2 985 - 1539 457 184 aa, chain - ## HITS:1 COG:BH2909 KEGG:ns NR:ns ## COG: BH2909 COG1396 # Protein_GI_number: 15615472 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 2 182 4 184 189 142 39.0 3e-34 MENIGHIVAENLKRLREERKLSLDAVAKCSGVSKSMLGQIERGVTNPTISTLWKIANGLK ISFTSLMMRPETDVEVVPRSAITPFVETDGKYRNYPVFPFDSSRGFEMYAIELDPGVRLD AEPHPEGTQEFITVFSGALTVLLDGERWGIADGDSIRFKADRKHAYANEGNTLCRLSMVI CYPS >gi|316924193|gb|ADCP01000042.1| GENE 3 1619 - 2203 442 194 aa, chain + ## HITS:1 COG:STM2645 KEGG:ns NR:ns ## COG: STM2645 COG1280 # Protein_GI_number: 16765965 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Salmonella typhimurium LT2 # 4 193 6 194 195 114 41.0 1e-25 MPSIAAFLSFACIMSFTPGPNNIMALSSASAYGLRKGLRFCFGVLLGVLGLMTACALFGA VLFQFLPDVEPTMRAVGAAYILWMAFGVWRSGSGNEDSRLVPVNGVVSGMLLQFVNPKGI LYGITAFSSFVLPYYDSFMALAVSIGVLSAVAYAGTCFWALFGAVFRRFLQNHHTAANAS MALLLVWCAASLYT >gi|316924193|gb|ADCP01000042.1| GENE 4 2426 - 3400 1173 324 aa, chain - ## HITS:1 COG:RSc0601 KEGG:ns NR:ns ## COG: RSc0601 COG3008 # Protein_GI_number: 17545320 # Func_class: R General function prediction only # Function: Paraquat-inducible protein B # Organism: Ralstonia solanacearum # 41 324 295 544 547 81 26.0 2e-15 MSKPANKTLIGAFVVGATALLLLAIAVFGSGKLFQTTSRYVLFFDGSISGLSVGSPVLFR GVPVGRVVEIRLTGDLDNLVFQTPVFIELNKKDEGRFSVSDGDISKKEYLDRLVSHGLRA 
TLATQSLLTGQLMIEMDFYPRSQIPYPIKEVKEYDDVPEIPTIPSQFDNILQTLTTLPYD DIASNVLDITEGVKKILSNSGTEQLIGHIDKLVVQLQDIGTNLDKTLTSIRGLAEPYTKL AQDTDKRLSAALEQASHVLSRIDNVAKETEMTVVSARGVVSKNSTTVIELNQAIREITEA ARAVRVFANTLERNPEAVLRGKVR >gi|316924193|gb|ADCP01000042.1| GENE 5 3397 - 4188 291 263 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 4 217 275 493 563 116 32 1e-25 MNTSTPIIEARDLTVGYGSYVVQKDLNFTINRQDVFIIMGPSGCGKSTLLRVLVGLLQPT KGQVLYRGQDFWAGTESERQKLLSGVGLLFQSGALWSSMTLAENVALPLQRYTKLSSAEI REQTSLKLALVGLAGFEDYYPSEISGGMRKRAGLARALAMDPEIVFFDEPSAGLDPVSAA LLDELILELKENMGMTVVVVTHDLDSIFTIGNNSVFLDAATHTMITGGDPHVLRCDQAHP DIVRFLNRGKTDSCPTSSKGYRS >gi|316924193|gb|ADCP01000042.1| GENE 6 4185 - 5366 1414 393 aa, chain - ## HITS:1 COG:XF1303 KEGG:ns NR:ns ## COG: XF1303 COG0767 # Protein_GI_number: 15837904 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, permease component # Organism: Xylella fastidiosa 9a5c # 133 386 136 389 396 162 35.0 9e-40 MAAEPMTKSRASASLGGALRAESSEGVLRVIFSGEWKRDLLGPNFFTERDEALRETERLL KEARSVELLGEALGEWDSGVLVVTTRLVTLAKEHNIPLDDARLPEGIRNLTKLALAVPPN TAAQRKAQKTTFLERVGGFTLAVPRMACDIFDFIGELVLSAGRLFRGRSSCTPGNAWLAV QESGVDALPIVSLISLLVGLILAFVGVIQLKMFGAEIYVSSLVAVSMTRIMGAIMTGIIL AGRTGASYAAVIGTMQVNEEIDALTTLGVAPSDYLIMPRVLALTAMTPLLVLYSDFMGIM GGFIVGVGILGLDPMEYYTFTQKGFNINNLWVGMVHGAVYGMLIAITGCYQGLRCGRNAE AVGKATTSAVVYSIVGIVLSTAVLTILCNILNI >gi|316924193|gb|ADCP01000042.1| GENE 7 5369 - 5935 740 188 aa, chain - ## HITS:1 COG:SA1410 KEGG:ns NR:ns ## COG: SA1410 COG0576 # Protein_GI_number: 15927161 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Staphylococcus aureus N315 # 26 185 42 208 208 108 40.0 8e-24 MTDTRDPQTEKGQNGATEENLAETAQAPEKTLEEQFREEICPTCTVKAEADETRLRALAE MENFKKRIQRDHDEYMRYASEPVLKDLLPALDSLDLAIQYGGSDETCKSLLTGVIMTRKL 
LLDALKNHGFDVAGEVGEPFNPDVHDAVSYEERDDMEPGLVSTLHQRGYRLKDRLLRPAK VSVSRKPA >gi|316924193|gb|ADCP01000042.1| GENE 8 6029 - 7048 599 339 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 38 337 38 328 329 235 43 2e-61 MSLSANASSPIPASSAPLLQAEGVTVRFDACERPFDVVRDVSFTLHPGRTLCLVGESGCG KSMTALSLLRLIPEPGRLAAGRILFEGRDLAELSESAMERVRGRDIGMIFQEPMTSLNPV IRVGEQVAEPLMRHLRLPKKAALAEAVELFRLVGIPAPDMRVRDYPHQLSGGMRQRVMIA MALACKPRLLLADEPTTALDVTIQGQILALLGDLSRERGMGLLLITHDLGVVAEMADEVG VMYAGRIVERAPVRALFAEPRHPYTWGLIRSAPTLDTPPSAKLDAIPGTVPSPGDLPQGC PFRPRCPEAHERCFKTPPVIESAGADGAREVCCWLAEAR >gi|316924193|gb|ADCP01000042.1| GENE 9 7213 - 7419 84 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDAEPVFITGSVFSWESLLVVAIIVAAWFVCRRINVMKRPVCNGGCAGCHGARQGRGCVF PEGRRRRM >gi|316924193|gb|ADCP01000042.1| GENE 10 7446 - 7790 385 114 aa, chain + ## HITS:1 COG:MA3479 KEGG:ns NR:ns ## COG: MA3479 COG1918 # Protein_GI_number: 20092290 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Methanosarcina acetivorans str.C2A # 38 111 8 81 82 68 44.0 2e-12 MFGFCHRKRGGHDCAEARSCGGEGCPISKEMREAGLISLRRMKEGQKARIAHVQASGELG RRIRDMGLIPGAEVEVVGRAPLRDPVALRLPGFTLSLRNNEADYIVVEPLEQAG >gi|316924193|gb|ADCP01000042.1| GENE 11 7797 - 9992 2504 731 aa, chain + ## HITS:1 COG:MA3477 KEGG:ns NR:ns ## COG: MA3477 COG0370 # Protein_GI_number: 20092288 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Methanosarcina acetivorans str.C2A # 1 717 7 655 670 479 39.0 1e-135 MDPQLTIALAGNPNSGKTTAFNALTGSHQHVGNYPGITVEKKEGFIKMPSGRCVRFIDLP GTYSLTAYTQEETVARHVLAQERPDVVIDVLNAGALERNLYLTIQLLELGVPVVLALNMM DEAAKQGMTIDIERLSQRLGFPVMATVARTGEGLPELLKATEAYIETKRGEPWEPLHISY GPDLDPVLRDMEAAIAEAGLLASEYPARWVALKLLEQDEEILTRSRAEAPELAADLERQV DEVARHLRDTLNTWPEAVIADYRYGFISSLLRDGVLVRLQDMHARIQFSDRMDKVLTHPF MGPAIMLGVLYLLYTVTFTIGEIPMGWVEQGFALLREGADAVLPDGQLKSLVLSGIIDGV 
GGVMSFVPLIMLIFLQIAFLEDTGYMARMAYMLDRIFRIFGLHGCSVMPFIVGGGIAGGC AVPGVLAARTLRSPREKIATLLTVPFMACGAKLPVFILFSGVFFPGHEAAVMFGLTLTGW VVALLTARLLRSTIIRGPSTPFVMELPPYRLPTMLGLAIHTGERTFEYLKKAGTVILAIS IILWAAMAYPQLPLDTRAHFAGAQQTIEANLEAAKASGGDVAALEEQLTALHGERAERAL QHSFAGRLGMALEPLTRPAGFDWRTDIALVGGFAAKEVIVATLGTAYSLGDIDPEDPTPL AQQIRNDGRWTPATALALLVFVLLYAPCLVTVAAIRQETGSWGWPVFSMVFNTLIAFGAA VAVRHVTMLFL >gi|316924193|gb|ADCP01000042.1| GENE 12 10411 - 10620 96 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSVKPCCPLSFRIVYLHPQMLKNVLSLNGKITCMRLMIESFSSIEYLVIQVKAFDIRVQR YSQNILAVS >gi|316924193|gb|ADCP01000042.1| GENE 13 10583 - 10990 327 135 aa, chain + ## HITS:1 COG:no KEGG:Dalk_4820 NR:ns ## KEGG: Dalk_4820 # Name: not_defined # Def: hypothetical protein # Organism: D.alkenivorans # Pathway: not_defined # 2 77 7 82 135 73 56.0 3e-12 MRKDRGQQGFTLIEIISVLVILGILAAVAVPKYYDLQEEAQKKAALGVIAESQARINLKF GQQLLAGDTCAKAQAAAIAVAKDDLGGWKYGTDNEISFAGDIATIESMTNPAGTVITLTN AKLSQPSCTGGTKTD >gi|316924193|gb|ADCP01000042.1| GENE 14 11186 - 12415 1253 409 aa, chain - ## HITS:1 COG:VCA0749 KEGG:ns NR:ns ## COG: VCA0749 COG0247 # Protein_GI_number: 15601504 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Vibrio cholerae # 6 404 14 403 413 253 36.0 6e-67 MKPIETVDKCTTCSTCVAYCPVTKATRAFKGPKLTGPSSERFRLYDETGGGEISEIEALD YCSNCKNCDIACPSGVKISTLNMLARAEYCKRHKPPLRDWVLSHGRMLGRLARRFPGWLV NVGSTNALTRLILDKFGVDARAPMPVFVPRSFTEWMGDHNAMLTASDRSRLTKKAVFFPG CYVNDYDPQTGKDLVFMLEKAGYEVIVPEFECCGLPMVANGFFDDARAAASRNVDTLAAL VSSGDVPILTVCPSCQLMLKQEYAEYFPELGKHESIVPHVQDACEFLIGLIESGELDIDV REDATKLVYHAPCHLRAQGIGKPGLDLLRMASANVEDARSGCCGISGSYGFKKEKYDVAA SVGSDLFKAVKDSAAEYCVTECGTCRLQISHHTKLPSMHPISWLRKLVK >gi|316924193|gb|ADCP01000042.1| GENE 15 12423 - 13712 1709 429 aa, chain - ## HITS:1 COG:VCA0748 KEGG:ns NR:ns ## COG: VCA0748 COG3075 # Protein_GI_number: 15601503 # Func_class: E Amino acid transport and metabolism # Function: Anaerobic glycerol-3-phosphate dehydrogenase # Organism: Vibrio cholerae # 7 425 4 418 436 
166 30.0 6e-41 MIKRTPYDVMIAGGGLSGLSAALFAAKAGKRTLLVTKGSGVLAIGGGTIDVLGYRSDGTP LTSPFDGFDSLSPRHPYAIVGAETVKKALSCFLEFCDEGGCPYLCNNEQNTRVVTALGTT KPSYLVPHTMSMHGVAEAANIFIAGVEGLKDFSPALAAQGLATRKGYIGKNIVPVILPSP FPLQRDLSTLDLARYLDTPEGILWLSKSLNKYIVRGVPGAVFVPAILGTAANNDVHNAIK DRTGHIVNEISSLPPAVTGLRLHALLLRLLKKYDVDLIEQSTITGAVVENGRCAALITTN NGQERRYEARSFIIATGGVLGEGFAIEPERAWEPIFNIDLPLNPSSPEWSLPEAYPACRQ TPGTPRPSHGFALLGPDVDAKLRPLGKDGNPLCGNVFFIGKTLGGYDHAAEKSGNGVALS TALFAAMNA >gi|316924193|gb|ADCP01000042.1| GENE 16 13709 - 15376 1994 555 aa, chain - ## HITS:1 COG:PM1442 KEGG:ns NR:ns ## COG: PM1442 COG0578 # Protein_GI_number: 15603307 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Pasteurella multocida # 28 551 35 545 563 379 40.0 1e-105 MQELSEYTKRHWDVVVIGGGATGAGTLRDLAMRGVKALLLEQRDLVHGTSSRFHGLLHSG ARYAVKDAEAGRECIEENRILRRIGKHCVEETEGFFVRTPEDGADFEATWVTACDKCGIE ATPLTVKEARALEPNLADNIVSVYRVPDSAVDGFRLVWQNVQNAVRHGAGFRTYTEVTSI NSANGQVTSVSVRDSRTGQTGEIPCSFVVNAAGSWVGELAHTAGLDINVKPDRGTLVAFN HRFTSRVINRLRKASDGDIFVPHGSITILGTTSKKTDKPDDTVPDTAEVHELLDIGRVLF PEIDSYRILRTFAGTRPLYTADPSAEGRGASRNFVVLDHEKEGLKGMATICGGKLTTYRL MGERMADLVCAKLGVAAQCRTAVEPLVEDTPPALLERARKVFPAQGLEQAESRLGDSFAA TVERLEAAPWKKALLCECERVTIAEFEQVASEPTSHSLNDIRRRTRMGMGTCQGSFCGLR GVGAVLEAKLLPAGMQACGTGECDALPCGAPDLLQSFQQERWYGIRPVLWGSELRETELA RGMYGATLNVDGADE >gi|316924193|gb|ADCP01000042.1| GENE 17 16072 - 16764 706 230 aa, chain + ## HITS:1 COG:RSp1144 KEGG:ns NR:ns ## COG: RSp1144 COG4149 # Protein_GI_number: 17549365 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, permease component # Organism: Ralstonia solanacearum # 13 227 42 256 264 197 63.0 1e-50 MISEFFADGMIPWSPILLSLKVAGLATAASFCAGVAAAWLLARRRSPFSAVLDAMCTLPM ILPPTVLGYYLILLVGRRGVIGPWLAECGINLIFSWQGAVVAATVVVFPLIYKSARAALE LVDPRLENAARTLGASEWKVFWQVSLPLAWRGIVAGGMLAFARGMGEFGATLMIAGNIPG KTQTLALAIYDAFQAGNDEQATVLVVLTSCLCLTILVLADRLFGGKAGGR >gi|316924193|gb|ADCP01000042.1| GENE 18 16704 - 17519 240 
271 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 10 244 268 498 563 97 31 9e-20 MPYHSCPCRQVVRRKGGRPLMLDVQIRKRLATGRKSGDSFSLDVSLSCATRRLVLFGHSG SGKTLTLQSLAGLVRPDGGYIKIGDDVLYASEKGIDLPARERHLGYVFQDYALFPHLTVR RNVEFALDKTKRTLFGRPSPAARKSVDDLLERFEVGHLAERYPREISGGQRQRVALARAL ITAPRMLLLDEPFAALDPLLRVRMRREIRVLLDEWNIPVVLITHDPADVDAFADSLAVYR NGMIRHVLDDYPARRASEPDALSLLAPLVEV >gi|316924193|gb|ADCP01000042.1| GENE 19 17756 - 18898 1230 380 aa, chain - ## HITS:1 COG:YPO1143 KEGG:ns NR:ns ## COG: YPO1143 COG2005 # Protein_GI_number: 16121439 # Func_class: R General function prediction only # Function: N-terminal domain of molybdenum-binding protein # Organism: Yersinia pestis # 244 379 123 260 263 71 36.0 2e-12 MQEEITRLVQGLDDSALRFIEACVRHERETRNAPQVPPPSMPHPGGQFTVPENVRHLTSE QLDAVSKAFLDWYKASVSTTQGRSRGRLWLVFLLIRYGALRLGEVLSIDDRTDLDFARSV VSVRGQNFRELQFPEAIMTEIRQVLESPLMFGLRGEVLHLDQGYVRRIFYERAKDVDLPK ELLSPRVIRHSRGIELLRGDVPLKIVQQFLGQQSPTLTASYLHFSREDARKIVHSHIRRE AMKKTSARNAFTGTINRIKRGDLLVEVEILTSTGLQVVSIITAESADNLELREGINTTAT IKAPWVIISTGDAPTSARNHFSGKVQSVQLGEIEAEVLVTLDEGTTVCAVITSQSARVLK LEPGKPVSVLFKAFAVVLGM >gi|316924193|gb|ADCP01000042.1| GENE 20 19087 - 19869 1008 260 aa, chain + ## HITS:1 COG:RSp0106 KEGG:ns NR:ns ## COG: RSp0106 COG0725 # Protein_GI_number: 17548327 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, periplasmic component # Organism: Ralstonia solanacearum # 1 258 1 257 257 228 51.0 1e-59 MDKRFFRRCARNAAFSLLGCLLLAAPAFAAQEITVSGAASLTNAFTEIKGLFEKKYPDIK VHTNFAASNPLLKQMEEGAPVDVFASADQETMDKAAAKKLVDTATRKDFALNDLVMVVPS DSKLNLTGAKDLTKPEVKRIAVGNPDSVPAGRYTKAALTTAGLWETLQPKYVFGASVRQA LDYVGRGEVDAGFVYRTDAKQGGDKMKVAAVMDGHKPVLYPIAVATTGSNRAGGAKFVDF VLSPEGQAVLAKYGFSNPQK >gi|316924193|gb|ADCP01000042.1| GENE 21 20264 - 20764 603 166 aa, chain + ## HITS:1 COG:PA5229 KEGG:ns NR:ns ## COG: PA5229 COG2947 # Protein_GI_number: 15600422 # Func_class: S Function unknown # Function: 
Uncharacterized conserved protein # Organism: Pseudomonas aeruginosa # 3 153 4 152 152 124 50.0 1e-28 MRHWLFKSEPDCYSFTDLENAPGQTTSWDGVRNYQARNFMRDDMKKGDLGFFYHSGKNPE IAGIVEIVREGHPDLTAQDPEAGHFDPKATPGDPRWYMVDVKLVRRFEPPVPRSLLRFVP ELAGMELMKTGSRLSVQPVEAEAYAAIVRLADQLAADKAAGQEVKA >gi|316924193|gb|ADCP01000042.1| GENE 22 20761 - 21027 293 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSELLHSPDAELAALKARVARLERLEEQVYFQERTLSALNEAITLQQRQLDDLQGRMEAV EEKFRELWELVGNEGGEATVPPHYMKLA Prediction of potential genes in microbial genomes Time: Fri May 13 02:25:58 2011 Seq name: gi|316924180|gb|ADCP01000043.1| Bilophila wadsworthia 3_1_6 cont1.43, whole genome shotgun sequence Length of sequence - 13803 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 5, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 50 - 511 676 ## COG3045 Uncharacterized protein conserved in bacteria 2 1 Op 2 . - CDS 530 - 1033 598 ## COG3153 Predicted acetyltransferase - Prom 1090 - 1149 3.3 - Term 1129 - 1158 0.4 3 2 Tu 1 . - CDS 1169 - 3784 3227 ## COG0699 Predicted GTPases (dynamin-related) - Prom 4009 - 4068 3.0 + Prom 3872 - 3931 4.0 4 3 Op 1 . + CDS 3997 - 4935 1093 ## COG2214 DnaJ-class molecular chaperone 5 3 Op 2 . + CDS 4981 - 5289 379 ## LI0125 hypothetical protein 6 3 Op 3 . + CDS 5406 - 8024 1804 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 8032 - 8080 12.7 - Term 8073 - 8118 5.0 7 4 Op 1 . - CDS 8157 - 8843 744 ## gi|212705016|ref|ZP_03313144.1| hypothetical protein DESPIG_03084 8 4 Op 2 . - CDS 8848 - 9531 797 ## gi|212705016|ref|ZP_03313144.1| hypothetical protein DESPIG_03084 + Prom 9782 - 9841 3.6 9 5 Op 1 16/0.000 + CDS 9863 - 12688 3778 ## COG0060 Isoleucyl-tRNA synthetase 10 5 Op 2 . + CDS 12688 - 13194 522 ## COG0597 Lipoprotein signal peptidase 11 5 Op 3 . + CDS 13191 - 13391 215 ## DvMF_0264 hypothetical protein 12 5 Op 4 . 
+ CDS 13455 - 13803 395 ## LI1053 hypothetical protein Predicted protein(s) >gi|316924180|gb|ADCP01000043.1| GENE 1 50 - 511 676 153 aa, chain - ## HITS:1 COG:YPO0457 KEGG:ns NR:ns ## COG: YPO0457 COG3045 # Protein_GI_number: 16120786 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 22 151 19 152 155 149 54.0 2e-36 MLKPYACLALFCSLLLPSFAHADDSDVGCVTTEWKLLGANHKVCVSAFNDPDIPGVACYI SQAKTGGVSGSLGLAEDPSNFAISCSQVGPIEIPAKLPKQANVFRESTSVFFKATRVTRI WDAKRNTLVYLAVSRRLVDGSPFNAISTVPVKQ >gi|316924180|gb|ADCP01000043.1| GENE 2 530 - 1033 598 167 aa, chain - ## HITS:1 COG:Cgl1207 KEGG:ns NR:ns ## COG: Cgl1207 COG3153 # Protein_GI_number: 19552457 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Corynebacterium glutamicum # 3 155 6 155 168 95 39.0 4e-20 MTIRQATPKDFDAIYSLVKTAFQTAKVSDGGEQDFVLKLRKGSYIPELELVAEEDGVLIG HIMLTGASIRENGGCFGTLLLAPLSVTLEWRAKGIGAALIREALAKAAALGHSSAVLIGD PGYYGRFGFHSAASISYPGVPGEYTLACELTPGALRNISGAIALPDC >gi|316924180|gb|ADCP01000043.1| GENE 3 1169 - 3784 3227 871 aa, chain - ## HITS:1 COG:Cj0411_1 KEGG:ns NR:ns ## COG: Cj0411_1 COG0699 # Protein_GI_number: 15791778 # Func_class: R General function prediction only # Function: Predicted GTPases (dynamin-related) # Organism: Campylobacter jejuni # 208 439 144 371 392 75 25.0 4e-13 MSSRPTSASMLSMLPQPIQPIPGDSFNDRILLLLAAVATTDDMLTYREYQLVQEAAQAIF GERALHAELQAKLHYALLNPPSDPADIARDMANQADAQKVSSTFVDTMLKALSRIGGQEE RIDEKARSLVNDIEWAFRKSRLERSAGRGIGLGLNVGESLGELYRKATNVLPSRKEIAGW FAPETSQFNADMERFANSLDRIAWTLDDMDLREELYIFRKMLRRQPFKIVIVGERKRGKS SLINAIIGQELSPVRESTPETATVVEFKYAHAPDYSVRFLDSSQFARLEDYLENEQDNLL LTRKIEHIRKGVSDGTFIPGKLLSGITCWDDLSDYISLEGRFSGFVARVSVGLPLDTLRA GVVLVDTPGLNDTDQFHDYLSYEESLEADCVIFVMDARDPGSNSELSLLRKLARSGRTVS IIGVLTNIDRLNSAASLEVAREQARTVLREACRSSGHVELAGVVALNTRQAVEERCRGGS AFSETLSKVSRSVSGCGELEQLLALLREIMDRDAGKEAYRHKIAEAYSRIADSARERLRQ HVQEYRESLPNPELLGMLDAHAKQLSASALSSLEQARQVVNAAAKDLDAWDESTEKALKK 
FHETLVLRLMDAVNRKVADLGHQFAKDSVWKEFDATEARAIARRAVDEFLDEQRGILHTW EDKLRLFSARMDEFSQECLARLSSNIDGLQDDPGEVNGGSSTATHFLVQTHRHMKNLAVF TTGLTVGRLTALGPISILVTAGNIIALAAASPLAAAVFAAVAGTAGLLYHLGREDKRKAA FLDKRRREAEEYAERVCGALRQELAVVREDLGKAYEFEVKRGFAPALESLFHQSVHLRLF LDVMQKIRSDVSRYDTHVQKQLEQLGGVLER >gi|316924180|gb|ADCP01000043.1| GENE 4 3997 - 4935 1093 312 aa, chain + ## HITS:1 COG:slr0093 KEGG:ns NR:ns ## COG: slr0093 COG2214 # Protein_GI_number: 16331768 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone # Organism: Synechocystis # 2 312 3 316 332 222 44.0 9e-58 MSVEYKDYYKILGVGREASKDEIAKAFKKLARKYHPDLNPGNKESEEKFKEINEAYEVLK DEQKRKMYDQLGPNWQQGQQFGGNPFGGGNPYGGGTRFTFNGQEFGGQGFDGSGFSDFFE TLFGSRQGGSAGGPFSGYTSRPQRGRDIEADISITLEDAVKGGERSLTLEGGDGTKTLKV NIPAGVKDGAKLRLAGQGYDSPNGGPKGDLYLRIRFAPHSLFHVDGTDLTYEVRIAPWEA VLGAKVKVPTLDGNVELSIPAGTGSGKKMRLRGKGLGPAKSRGDLYVRVGIDAPKDLTPK QRELWEALAAEK >gi|316924180|gb|ADCP01000043.1| GENE 5 4981 - 5289 379 102 aa, chain + ## HITS:1 COG:no KEGG:LI0125 NR:ns ## KEGG: LI0125 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 100 1 100 103 122 63.0 3e-27 MELLQNTKDVPSRSTLIVWGEFLELTGTTPDRLSELMDIGWLKPTRTAEAALLFRQDDVY RIRKLERLCSDFELHTLGGSIVMDLLDRIATLELRIRELTDR >gi|316924180|gb|ADCP01000043.1| GENE 6 5406 - 8024 1804 872 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 856 1 804 815 699 45 0.0 MDINRFTEKSREALTTAQGLAAQYGHQEVDAEHLALALVNQEDGFVPRVLERVGVAPKAL VTALEAVLKKRPSVRGPGAEMGTITISQRVAKAIANAEALAKRLRDEYVSVEHIFAELLR EPASTGLGQVAADAGLSADKFTETMMAVRGPHRVTSANPEESYEALSKYGRNLVEAASKG KLDPVIGRDAEIRRVIRILSRRTKNNPVLIGEAGVGKTAIAEGLAYRIVKGDVPEGLKNK TIFALDMGALIAGAKYRGEFEERLKAVLSEVQKSEGQIILFIDELHTIVGAGKTDGAMDA GNLLKPLLARGELHCIGATTLDEYRKYIEKDPALERRFQTVLVEEPSVEDTISILRGLKE RFEVHHGVRISDSSIVEAVVLSNRYITDRQLPDKAIDLIDEAAAMIRTEIDSLPTELDEA 
NRKVMQLEIEREALRKETDDASRERLEKLENELRNLQMTQAELKGQWENEKGVINVVRDL KGEIEQTRLAIDDATRRGDLQAASELKYAKLPELEKRLHEAENGQEAPRLLKQEVRPDDV ADIVARWTGIPVTRLLQSERDKLIHLPDKLHERVIGQDEAVQAVSDAVLRTRAGLSDPSR PQGSFIFLGPTGVGKTELCKALAEALFDSEENIVRLDMSEYMEKHSVARLIGAPPGYVGY DEGGQLTEAVRRKPYSVILFDEIEKAHPDVFNTLLQLLDDGRLTDSQGRTVDFRNTIVIM TSNIGSYRMLDGINPDGSFASDVYTEVMGELRQHFKPEFLNRVDETVLFKPLLPEQIGQI IDLQLRRLQKRLEERKIALELTEAAHEFIGDAAYDPHYGARPLKRYLQSHVETPLAKFII GGQVRDDQRVVIDATEEGLTFGVKSGDTVQPL >gi|316924180|gb|ADCP01000043.1| GENE 7 8157 - 8843 744 228 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212705016|ref|ZP_03313144.1| ## NR: gi|212705016|ref|ZP_03313144.1| hypothetical protein DESPIG_03084 [Desulfovibrio piger ATCC 29098] # 18 218 13 224 240 81 26.0 4e-14 MCTAKIPPRPRLFLLAALFLLAGCSPQYSMRVISETDGHAYPNDQYTYYIASSKKQNSDP TYQLLKQDVKSTLSEAGFTLTQDRNRGTALLSIDYTAKTSTKHITAKKPIYGQTGTVEKT HGTYDKATGRYTKTTTTTPTYGTVGYEDETKEVTECDIFLHLSAASSKTNKELWSTSIYH THDSEDISGVLSVMVRGCKDYIARNTSGIISLQVTANDDGTFGIVEKQ >gi|316924180|gb|ADCP01000043.1| GENE 8 8848 - 9531 797 227 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212705016|ref|ZP_03313144.1| ## NR: gi|212705016|ref|ZP_03313144.1| hypothetical protein DESPIG_03084 [Desulfovibrio piger ATCC 29098] # 1 217 1 224 240 70 24.0 7e-11 MTFPRPCLYALLAACLLLPGCVANQYAIKINAITNGDIQAGIGDVYYIMAVKGQEDDLVF LELKRNLESALPQAGLKVTKSLPHATAVLFAGYKSQTVARQTTVSEPVYGVTRVETRGTG EYINTLTGKTSKTSVSTPTYGVTGYKNTQKEVAQNKIVLLLAADSLKTKKKMWETIVTYT GNSFDNRKMLDMMVMGAKDYLAQTTPGDTWLDVSESDDGALSLKERK >gi|316924180|gb|ADCP01000043.1| GENE 9 9863 - 12688 3778 941 aa, chain + ## HITS:1 COG:HI0962 KEGG:ns NR:ns ## COG: HI0962 COG0060 # Protein_GI_number: 16273685 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Haemophilus influenzae # 3 933 4 927 941 895 47.0 0 MSDYKKSLNLPDTSFPMKANLTQREPEMLRWWEENNIYGTMLEASGSRGSFRLHDGPPYA NGHLHLGHALNKILKDITIKSRNIQGQQSVYVPGWDCHGLPIELKVEHELGEKKKEMPSY AVRRRCRQYADKFIDIQRKEFKRLGVLGDWDHPYKTMLPEYESATATELANFVEKGNVVR 
SKKPIYWCCSCQTALAEAEVEYADHKSPSIYVRFPLTDDRLKTVFENADPSRAYVIIWTT TPWTLPSNMAVALHPEFEYSLVEYEGSQYVLATELVESVAKACGWDMDGVKAMGTATGQQ LELVKARHPFYDRESPLILGGHVTLDAGTGCVHTAPGHGREDYEVCLQYGIDIYSPLNDR GEYLDSVEFFAGLQVQKANPNVIEKVKEVGNLMGQADITHSYPHCWRCKKPVIFRATTQW FVSMEANELRQKALKAIRNDVEWIPSWGEERIYNMIEQRPDWCISRQRLWGVPILALLCE DCGEAWNDPEWMRDIAARFAKHPTGCDYWYEADMKDIVPEGLKCPKCGGQHWKRESDILD VWFDSGTSFAAVLEKRPELGFPADLYLEGSDQHRGWFHSSLLASIGTRGVPPYKAVLTHG YVVDGEGRKMSKSIGNGIELDEIISKHGAEIIRMWVSSVDYREDVRISAEIVNRLVDAYR RIRNTCRYLLGNLKDVAKADLVDVKDMDPLDRYALDVAARTHQRVQDAYRDYEFHKVFHS LHNLCSTDLSAFYLDILKDRLYSSAPASRARRSAQTALYHILLMLVQDMAPVLSFTAEEV FRHIPEALHPGAKSVFAFQLMDADAFLLDDADRKRWEAVLAARTEVTRAIEPLRKAGTVG HALDTAATLYASPELLEVLGGIGTDLRAVCIVSQLHLAPLADAPADLAQADIAECGKLAV SVAKAVGEKCERCWIYSDELGSDPEHPTLCPRCAAVMKELA >gi|316924180|gb|ADCP01000043.1| GENE 10 12688 - 13194 522 168 aa, chain + ## HITS:1 COG:RSc2459 KEGG:ns NR:ns ## COG: RSc2459 COG0597 # Protein_GI_number: 17547178 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Ralstonia solanacearum # 14 156 26 165 173 110 46.0 2e-24 MAEPGKDTRYRVVLGVGVGALVLDQLTKLWVMASLPYYGAVTVIPGFFDLVNIRNRGAAF GFLNRSDIEWQFWLFFAATLVSAGVIFMLARSAQRGEKLLFWGLGMVLGGAVGNLVDRIR FRAVVDFLDFYVGQWHWPAFNVADIAICCGALLVCLSMWLKGPSGRAS >gi|316924180|gb|ADCP01000043.1| GENE 11 13191 - 13391 215 66 aa, chain + ## HITS:1 COG:no KEGG:DvMF_0264 NR:ns ## KEGG: DvMF_0264 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 11 65 7 61 64 64 58.0 1e-09 MSLFSLEWWQIALLFLPALLNLWGIWHAFNHTFETPLERVLWMVACVFVPVLGGVAYVLF GWRRAH >gi|316924180|gb|ADCP01000043.1| GENE 12 13455 - 13803 395 116 aa, chain + ## HITS:1 COG:no KEGG:LI1053 NR:ns ## KEGG: LI1053 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 111 20 131 305 121 53.0 9e-27 MKKPYIFPLLLGSLVLTGCANGQIDAMQMRLNQQEQQIRLLNSQLSGVQPAQADTWAQVQ 
SLRQEMSAVKGQIDDFNNATAAAGGLPGLAQRVNNHEAALQAIATQFGMELPVAAV Prediction of potential genes in microbial genomes Time: Fri May 13 02:26:44 2011 Seq name: gi|316924166|gb|ADCP01000044.1| Bilophila wadsworthia 3_1_6 cont1.44, whole genome shotgun sequence Length of sequence - 15882 bp Number of predicted genes - 16, with homology - 11 Number of transcription units - 8, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 420 526 ## LI1053 hypothetical protein 2 1 Op 2 . + CDS 429 - 878 436 ## COG1145 Ferredoxin + Term 974 - 1012 4.5 + TRNA 888 - 963 84.8 # Met CAT 0 0 3 2 Op 1 . + CDS 1473 - 2036 -266 ## + Term 2051 - 2081 -0.3 4 2 Op 2 . + CDS 2089 - 2325 74 ## 5 3 Op 1 36/0.000 - CDS 2283 - 3524 1414 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 6 3 Op 2 . - CDS 3521 - 4195 213 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 7 3 Op 3 . - CDS 4221 - 4652 548 ## Pecwa_4113 hypothetical protein - Term 4670 - 4717 9.1 8 4 Op 1 . - CDS 4727 - 6466 2650 ## COG1785 Alkaline phosphatase 9 4 Op 2 . - CDS 6500 - 7618 1261 ## COG3481 Predicted HD-superfamily hydrolase - Prom 7708 - 7767 4.0 + Prom 8136 - 8195 4.3 10 5 Tu 1 . + CDS 8223 - 8447 231 ## 11 6 Op 1 . - CDS 8706 - 9992 944 ## COG0790 FOG: TPR repeat, SEL1 subfamily 12 6 Op 2 . - CDS 10033 - 11454 1237 ## 13 7 Op 1 . - CDS 11657 - 11866 174 ## 14 7 Op 2 2/0.000 - CDS 11790 - 12896 955 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 15 7 Op 3 . - CDS 12928 - 14004 1267 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 16 8 Tu 1 . 
+ CDS 14456 - 15751 1888 ## COG0148 Enolase Predicted protein(s) >gi|316924166|gb|ADCP01000044.1| GENE 1 1 - 420 526 139 aa, chain + ## HITS:1 COG:no KEGG:LI1053 NR:ns ## KEGG: LI1053 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 3 137 173 304 305 178 65.0 5e-44 PAPQQAAPQTQAQGDIAKALYDNGVQSFNARNYKQALKSFSDFTDTYGKHKLVSNAWFWR GECNYQLGNFPAAALDYEQVISKYGSSGKAASAYLKQGMCFIKAGKKDAAKVRLQELIKK FPKSPEATRATQLMKDNKM >gi|316924166|gb|ADCP01000044.1| GENE 2 429 - 878 436 149 aa, chain + ## HITS:1 COG:all0569 KEGG:ns NR:ns ## COG: all0569 COG1145 # Protein_GI_number: 17228065 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Nostoc sp. PCC 7120 # 9 140 2 133 134 95 38.0 4e-20 MSQETKPYRTNVHLTFPPTISGAPLVCNLTRLFDLDFNISTAQITPRQEGFLTLELSGSR DACDKGIGYLREKGVLVSPVAQRIWHNEDKCMQCGMCTALCPSSALTVDIKTRLLAFDKE KCTVCARCIRICPVGAMQMDAPEEGVLTE >gi|316924166|gb|ADCP01000044.1| GENE 3 1473 - 2036 -266 187 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGAGKGRGSLFILTGISLIPSFIEYIVTVLSGAFPSLLGIVWRVSAAFLGDAVLNGTVRG MLWRACPSSKGKNHPSSPIVSRRAPSSLPRILRAALGWCPAISANSLGWTSPIGRKEPSP PYWRSNAPADGVEPFRSPGHLASAYSVPPVPPYAKRKAFLCLRGRLSVSFDAGCAVIAPL RVFSCGG >gi|316924166|gb|ADCP01000044.1| GENE 4 2089 - 2325 74 78 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGVGAELVLVEGVFLELVEVGIEALRLVRVPVFGVVRIPEPMDRGTGSLCLRGSGKMPRR FLGKGYSLSPNATAPVLR >gi|316924166|gb|ADCP01000044.1| GENE 5 2283 - 3524 1414 413 aa, chain - ## HITS:1 COG:DRB0050 KEGG:ns NR:ns ## COG: DRB0050 COG0577 # Protein_GI_number: 10957500 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Deinococcus radiodurans # 11 402 12 394 402 87 26.0 6e-17 MNPLMLIKGELRRSSGGILALALVLALSLSLGVGVSMTERAVRQGTARAGDAFDLLVGAQ GSSVQLMLGAVYLRPQSLPLVPGGAVTEILTQEGVVWAAPLAFGDRWNASPLVGTTTDMV TLGGNRTLAEGRAFAAQDEAVLGAGVPLRLGEAFSPMHGQVSASHNEHGHVRYKAVGRLP ETGTPWDNAILIPIESVWAVHDALPDAGHGLPLGHLFEGKGSPLPGVSAVVVKPESIAAA 
YRLRAFWQTATLPDADGRPVNMQGVFTGEVLTELFATLGDMRDIMTCMAYAAQFVALCGV MLVGGMAVSMRKRMLGTLRVLGAPRAYLVLSVWCVVSVSIAVGTLAGLLFGCGLSEGAAL LMFRQTGIMLSPQLTLAEGSFAAASFALGSLCALFPAYMVYRKTGAVALGDSE >gi|316924166|gb|ADCP01000044.1| GENE 6 3521 - 4195 213 224 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 18 201 16 199 223 86 31 9e-17 MNGLELQDILLTFPEGGRRRTVLDIPYLKVTSGEHIGIRGASGSGKSSLLNVVSGLVLPD RGTVRWGATVPGALPEHERDRWRGQHIGFVFQDFQLFPELEALENVLLPATFTGWRIPEA LIRRGRGLLEQMHVCPTRKVRALSRGEKQRVAIARAVLLKPGIILADEPTASLDEANAGQ VTLLLSGYARTLGSTLLVVSHDDAVLADMDRVLDLNRGIVREAA >gi|316924166|gb|ADCP01000044.1| GENE 7 4221 - 4652 548 143 aa, chain - ## HITS:1 COG:no KEGG:Pecwa_4113 NR:ns ## KEGG: Pecwa_4113 # Name: not_defined # Def: hypothetical protein # Organism: P.wasabiae # Pathway: not_defined # 1 143 19 159 159 181 62.0 7e-45 MKKLLLLFFALLCLTGTAKAATPELSFDELYSGGGVLGLQFSDKVKKLAGQRITIRGFMA PPLKAEAAFFVLTREPVALCPFCQSDADWPDNILVVYLSSSQSFVQNNTTIEVEGVLEIG SHRDGDTGFISQLRLRDARFRTL >gi|316924166|gb|ADCP01000044.1| GENE 8 4727 - 6466 2650 579 aa, chain - ## HITS:1 COG:DRB0046 KEGG:ns NR:ns ## COG: DRB0046 COG1785 # Protein_GI_number: 15795180 # Func_class: P Inorganic ion transport and metabolism # Function: Alkaline phosphatase # Organism: Deinococcus radiodurans # 25 576 22 582 588 372 41.0 1e-102 MNTRIIPALLGALLFSGVAEAAPSIYPVDTARFLGGSRFDFKVEFDGELAEKDAKVLING QDAKAVLGAKPEFIAKEKGADASALLLRGASLKPGQYKVTVESPTGKASANWDVYGTSDK PVAKNVILLIADGLSMGHRTSARLMSKGVTNGMYNGKLSMDTLPYTAMLGTCSVDSIAAD SANTASAYMTGHKSSVNALGVYADRTPDSLDDPKQETIAELLRRTTGKSIGVVSDAEIED ATPASVVGHTRRRADKAEIVDMFYSVKPDVIIGGGSAYFLPQNVPGSKRKDDKNYVEMFQ KEGYALATSNTELSKAVKGNPDKLLGLFHTGNMDGVLDRKFLKKGTVSKFPDQPDLTDSM RAALSVLSKNPEGFFLMLEAGLVDKYSHPLDWERAVYDTIMFDKVVAMAQEFCDKNPDTL LIVTGDHTHSISVIGTVDDNLPGELMRDKVGIYAEAGYPAYKDENGDGYPDDVNVSKRLA VFIGNYPDHYETYRPKMDGPFVPSVKDEKGHYIANAAYKDVPGAQLRIGNLPRTESTGVH SIDDLVVGARGPHADAFRGFMNSTEVFRIMSEALALGNK 
>gi|316924166|gb|ADCP01000044.1| GENE 9 6500 - 7618 1261 372 aa, chain - ## HITS:1 COG:MK0390 KEGG:ns NR:ns ## COG: MK0390 COG3481 # Protein_GI_number: 20093828 # Func_class: R General function prediction only # Function: Predicted HD-superfamily hydrolase # Organism: Methanopyrus kandleri AV19 # 130 242 42 151 246 63 34.0 9e-10 MNADHSRRTFLLGGLATGAVLLAGKTFLHPSAAHAATEAVTLDVCVNMTPEEMADRSQYV MAAWKYLQDAAAEIGNPGLRAAVLDIMKNPAPLLAEGDAKAIMAELKGQGLLAQDAKAVF PTCASTKKSPQPFYTAPGSGWNSHHIYPGGLVTHTALNVASCKALYDNYADMFGLKLDRD VVLASQLLHDLHKPWVFQWQADGTCRKEEPLAATGEHHVLSIAESLRRGLSPELCVAQAC AHDHPGATASEQLVVGWLKAAAVINGKDPFKAGLVEKDGKTLPMPRRMEGFVTHLADHDW VLSVPAAQWTVDALKQVAVRSYGIKEADLAKKPFNQFRNYVLSQKTAMQLYGAYSTKGLD GLTAEVAEIVRA >gi|316924166|gb|ADCP01000044.1| GENE 10 8223 - 8447 231 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKSEHLFIIRLGDGRHFEYEGTLDGAKRKASKLATPPVAIQLLAEQRLVATRRPYFGWDG KVGWHPWELLERAF >gi|316924166|gb|ADCP01000044.1| GENE 11 8706 - 9992 944 428 aa, chain - ## HITS:1 COG:STM0654 KEGG:ns NR:ns ## COG: STM0654 COG0790 # Protein_GI_number: 16764031 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Salmonella typhimurium LT2 # 119 406 59 314 331 82 27.0 2e-15 MEKVFSDKTEEGIRLIWMQFDPEKTAEGVRLLREAADAGDPDALCFLARTYMGERYVWEY AALEINGEKAASLLKEGIRRGSACAVLLAMRCGELTPSARKAMPFASLKEARDDVLGKAE AGHPFCQYMIGNTYYFGDCFEIDGIDPQTAFHDPDGLRADTARRALPWLEKALNGGLSDA AKNLYNLFSGSEGFPRDRGRQLEIALRGASAGNPYWEERCGKMCREENGREAEAMRWFRQ AAHHGQASSWFHIGSHYERGLGVDKDPAKAAEAYRNGAEKGSSDAQQALGIMAAFGRGVP QDPAQAAYWLRLAADDEKPLACGVLGHCYLRGLGVQQDDGEAFALLDRFAEAYDREKPDD GETSTYPDELVGIVYNGLGELYADGRGGPVDIKEGIRCFQAAAELGNGDARRNLGRFKRN WFGKWVRK >gi|316924166|gb|ADCP01000044.1| GENE 12 10033 - 11454 1237 473 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLLYGYTMSDELEHSATLRARAACLREHRLHFILCFGAFLAGATGSLLTASFLKANVAA PNGTLLAEGLAMLGVIGLGGLLSALFRCLSSRHSRLGKSVLSQYPAESVDPKAVFQEIDA DILQHGKRFGGLSIGKEWAIFQEAMLISRIRGIFSDILEPEREHGETAYLLCLVDADGSI 
LEASLPSRSVLDKARACLLESVPDAVSGGFEAMTDFMNMDDEGRREINRRVNARRQAGKP VSFSYAGPDGVPTSLATEQTVEEGIARIKPGDMLVLTPLSPLFLPSGEECLYIACEQDAG QTETLRLSAYIRRGDDYLRVFREMPGREAGLLFRDCFMRRAVPDIADWHVQSWENDWPEE RPILFVDDKRFENTSFEDVEAALDGVDGGDYGSFFLMFPGGYDGYLSLNGRQGDNYVVEA ALPDEDRYFRIATPRRAQVLFWFSGYYEKSRLPYMEEWKDVTKEVKKRRGEAD >gi|316924166|gb|ADCP01000044.1| GENE 13 11657 - 11866 174 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTQPCAAWTGESKGWSLFRSHRTAYKGKGFKDARRRGKAFYEGASSLFQRIRLFYILYRK TSLKKAAPY >gi|316924166|gb|ADCP01000044.1| GENE 14 11790 - 12896 955 368 aa, chain - ## HITS:1 COG:AF0890 KEGG:ns NR:ns ## COG: AF0890 COG1744 # Protein_GI_number: 11498495 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Archaeoglobus fulgidus # 35 358 58 388 397 143 31.0 4e-34 MAGFPFMRQAEAFRSSLPFILLLAVFILPCPMSAAAKPRPAIAFVHLLLPPGGEPHASPA GWIGSLDRARDTLESLRIPTIVSSASPGNAEDLNELSEMDPEVLITTSPLLYAAACDVAK RSPNTVFFACTDETVHDNMSGFQARTYEAHYLAGLIAGALSEDGVIGYLANDPYQSKASR LRNANAFTLGAQKSKAAIRVLYAPGNDPDEELKTLIAAGADIVDLERALPSLVERLREHS VRYVGTAGRTVDPLCLAAPEWEWSNAFGDLLTQVRFGIWRPRNVSYGIKEGVVGLSPFGP AVPETVRVRVREAKEAIRNGASPFEGPVRDVSGSLRIAEGSVPDDATLRGMDWRVQGLEP LPLTPDGL >gi|316924166|gb|ADCP01000044.1| GENE 15 12928 - 14004 1267 358 aa, chain - ## HITS:1 COG:AF0890 KEGG:ns NR:ns ## COG: AF0890 COG1744 # Protein_GI_number: 11498495 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Archaeoglobus fulgidus # 1 357 19 392 397 391 51.0 1e-109 MRTLVLLMAALFLAAGLASAPAQAIGVQKEMKVGFIYISPIGEGGWSYAHDQARKFLSSL PGVTTSYVESVPEGQDAERIIQNMARKHYDIIFATSFGYMDSMLNVAEQFPNSTFMHCSG YKTAPNMSNYFGRIYQARYLTGLVAGSMTQSNVIGYVAAFPLPEVIRGINAFTLGVRAVN PKAEVRVVWTKSWYDPAMEKSVAKNLIKEGADVIAQHQDSSGPQEAAQEQGTYSIGYNTD MSRVAPRAHMTSAIWSWNPVYENVVEQVRAGTWKNGSFWYGMETGVVDIAPYGPMVPQNV RDLTEAAKADIKSGKLVVFTGPIKDQKGVERIPTGMVPSDKELLGMDWFVEGVIGTID 
>gi|316924166|gb|ADCP01000044.1| GENE 16 14456 - 15751 1888 431 aa, chain + ## HITS:1 COG:RSc1129 KEGG:ns NR:ns ## COG: RSc1129 COG0148 # Protein_GI_number: 17545848 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Ralstonia solanacearum # 1 425 1 426 427 528 65.0 1e-150 MSTIASVWAREILDSRGNPTIEVEVSLDSGHVGRAAVPSGASTGTREALELRDGDKARYK GKGVSKAVENVNTEIAEAIVGLDALRQVQVDNTMLDLDGTDNKSRLGANAMLGVSLATAR AAAESLALPLYQYIGGVNAKVLPVPMMNIINGGAHAPNNLDIQEFMIMPVGAHTFADGLR MGSEIFHTLKTILAADGHITSVGDEGGFAPNLKNHDEAFSYILKAIEESGYNPGSEVVLA IDAAASEFHKNGKYMIGEKGALSSPEMIEWLAEFTAKYPLISIEDGLAEDDWDGWRELTD ALGDNIQLVGDDLFVTNPDILAEGIEEGIANSILIKVNQIGTLTETLDAIQLAKESAYTT VVSHRSGETEDSFIADLAVGVNAGQIKTGSLCRSDRLAKYNQLLRIEEDLDDAAMYYGPV MGANLAFDGEE Prediction of potential genes in microbial genomes Time: Fri May 13 02:27:53 2011 Seq name: gi|316924157|gb|ADCP01000045.1| Bilophila wadsworthia 3_1_6 cont1.45, whole genome shotgun sequence Length of sequence - 9012 bp Number of predicted genes - 9, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 62 - 376 125 ## + Term 561 - 597 -0.7 - Term 208 - 280 31.1 2 2 Op 1 . - CDS 283 - 540 394 ## - Term 586 - 628 13.3 3 2 Op 2 . - CDS 685 - 2418 2685 ## COG2855 Predicted membrane protein 4 3 Tu 1 . - CDS 3387 - 3896 623 ## COG0778 Nitroreductase - Prom 3925 - 3984 1.5 + Prom 4020 - 4079 4.3 5 4 Tu 1 . + CDS 4194 - 4715 349 ## LI0098 hypothetical protein + Term 4958 - 5012 18.0 - Term 4945 - 4999 18.0 6 5 Op 1 . - CDS 5120 - 6178 1350 ## COG0002 Acetylglutamate semialdehyde dehydrogenase 7 5 Op 2 . - CDS 6184 - 6498 358 ## DvMF_1716 hypothetical protein 8 5 Op 3 . - CDS 6549 - 7736 1437 ## COG0075 Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase 9 5 Op 4 . 
- CDS 7775 - 8959 580 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative Predicted protein(s) >gi|316924157|gb|ADCP01000045.1| GENE 1 62 - 376 125 104 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFAAFLLCKGRVFFFFFFFFFFFFFFFFFFLVSIHQAGTPFEPPPHQSKVLEKEMTGGGI TLLQKGYSSPGHLHYFFSGNGPRGRKRHVSRISIAASLMKGMSL >gi|316924157|gb|ADCP01000045.1| GENE 2 283 - 540 394 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRRPLPGYVKLLLCMFTIWLLLWVITPLWVPYSKLHQDFAAAQEKYDVPIGALYYNDIP FINEAAMEIRDTWRFLPRGPLPEKK >gi|316924157|gb|ADCP01000045.1| GENE 3 685 - 2418 2685 577 aa, chain - ## HITS:1 COG:STM2202 KEGG:ns NR:ns ## COG: STM2202 COG2855 # Protein_GI_number: 16765532 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Salmonella typhimurium LT2 # 223 546 16 325 349 71 25.0 5e-12 MANENSNVVVDRAQSHWSDLWKKEDYWAIWLGFALLIAAICIFINGAPASYKETIDKSNA IMKVEAEKAPFKTIAYIQAQDAKKGVVGTNLPIAKEIKAFIASPGKWTDNPVKSMFTSQA EADAKNAANKEKAEAAKAKAESSFAAAQAAEKLAADAGYKDASLNTAAEAAIKDWTKAKA DASKASAKAKPVNLFTTLPLLMVAFALFFGIGIFVMGQNLPKFLIGFVGLFVVVVIAMIL GKQSTMAYYGIGVEPWGIMFGMIIANTIGTPQWMKPALQVEYFIKTGLVLLGAEILFDKI IAIGTAGIFVAWVVTPIVLITTFIFGQKVLKMASPTLNITISADMSVCGTSAAIAAAAAC RAKKEELTLALGLSMTFTAIMMVALPAFIKYLGLPEVLGGAWIGGTVDSTGAVAAAGALL GPKAMYVAATIKMIQNVLIGVTAFGIAVYWCTSVEKTAGRETSLMEIWHRFPKFVIGFLT ASIIFSIYSADLGPDLGTALINKGVIDGIEKGVRTWFFVLAFTAIGLSTNFRELAPYFKG GKPLILYVCGQSFNLALTLLMAYVMFYLVFPEITAKI >gi|316924157|gb|ADCP01000045.1| GENE 4 3387 - 3896 623 169 aa, chain - ## HITS:1 COG:MA0330 KEGG:ns NR:ns ## COG: MA0330 COG0778 # Protein_GI_number: 20089228 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 1 167 6 172 179 191 55.0 5e-49 MDAFEVLHTRRSIRQFLNRPVGEDLVKELLSAAMSAPTAGGIQPWRFVVITDREKLDKIP DFHPYAGMIKQAPLAILVCGDTTSANYGKYWVQDCSAAMENLLLAARAKELASLWCGVHP VPEREQAFRELFNLPDTVSPLGLAILGYSETPFSHKKRYDEQKVHYNVW >gi|316924157|gb|ADCP01000045.1| GENE 5 4194 - 4715 349 173 aa, chain + ## 
HITS:1 COG:no KEGG:LI0098 NR:ns ## KEGG: LI0098 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 172 1 163 165 160 43.0 1e-38 MAGLPGPGDRIEARCTRCNDITGHVIVALVGGEIVKVECRACGSVHKYYPPATPKEAKGK TAVCRVKAGETRKEAVNSFKPTSTPSTAAASPAAVSRAAAKAQKAAEDMEQNWQRTMNMT AASARPYAMNESFAVGDVIDHPKFGSGVVQEIFPPDKMQILFRDGAKMLRCAC >gi|316924157|gb|ADCP01000045.1| GENE 6 5120 - 6178 1350 352 aa, chain - ## HITS:1 COG:PA0662 KEGG:ns NR:ns ## COG: PA0662 COG0002 # Protein_GI_number: 15595859 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Pseudomonas aeruginosa # 5 352 1 344 344 311 47.0 2e-84 MPQEMIRVGLVGVTGYTGMELARLLTGHPNMKLTAATSRTEAGKRLGDLYPFLNGLPGAD VILIEPEPELIAKKCDLAFLAVPHTAATEMAAALVERGLKVVDLSADFRLKSAETYEQWY KVEHKRPDLLPEAVYGLPELYASDIAKARLVANPGCYPTSIILGMTAALADGLIETHGIV ADSKSGVSGAGRSAKLGSLYCEVADSFKAYGIGTHRHTPEIEQELSRLAHGPMTISFNPH LVPMNRGILSTIYAQLKAPLSQADAQRVYEETWADSPWVRVLPSGQLPETRNVRGTMFCD MSVIVDPRTNRLIVVSVIDNVCRGASGQAIANANIMCGLPVTTGLMLCAMMP >gi|316924157|gb|ADCP01000045.1| GENE 7 6184 - 6498 358 104 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1716 NR:ns ## KEGG: DvMF_1716 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 13 102 3 106 107 129 63.0 4e-29 MNAASPEDHKACDCNKPGCDCSMPQVTFSTFILSLASSALVQLGEVPNPESGATEQDLVI AKHTIDILTMLEEKTKQCLDSDEARLLEGILYELRMKYVMKKVD >gi|316924157|gb|ADCP01000045.1| GENE 8 6549 - 7736 1437 395 aa, chain - ## HITS:1 COG:TM1400 KEGG:ns NR:ns ## COG: TM1400 COG0075 # Protein_GI_number: 15644152 # Func_class: E Amino acid transport and metabolism # Function: Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase # Organism: Thermotoga maritima # 2 362 5 361 384 265 39.0 8e-71 MFNKPRLLTPGPTPLPERVRLALAQDMIHHRKPGFKKILLETETMLQKLFGTEQPVLPLS CSGTGAMTAAVHGLFLPGETVIVVNGGKFGERWSKIAAVRGLKVVEIDVAWGQAVTPAQV EQALREHPEAKGVFMQVCETSTAAQHPVEAIARLTRTREVLLVADGISAVGISPCPMDAW 
GIDCLLTGSQKGLMVPPGMALMALSARAWKRAESIPVSCFYFNLAGERDNLLKGQGLFTS PVNLIVGLHESLSMLFEEGGLDALYRKQWALTCMVRTGALALDLPLFAPTHYAWGVTSIM LPAGVDGDKLLKVAADECGIVFAGGQDHYKGRMVRFGHMGWVDWADALAGLHALAHGLRA SGGFSASRDYLETALSAYHRALDVEPGRVIPDVRS >gi|316924157|gb|ADCP01000045.1| GENE 9 7775 - 8959 580 394 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 1 391 1 392 396 228 35 2e-59 MQQLFLKKHEDRRLRAGHLWIFSNEVDVKRSPLTAFAPGEAAQVCAADGRTIGTAYVNPA SLIAARIVSRKADEPLDAALIKKRLERALQLRETLFDVPFYRLCHGEGDWLPGLVLDRYG DVFSAQITTAGMEAQKDALLEVLGELFTPKAVLLRNDVGVRTMENLPLLVETGMGTVPDE IEVRESGAAFTVSLAEGQKTGWFYDQRPNRVEAAKYAKGKTVLDAFCYAGAFGVMAARNG AAGVTFLDASRQALDMASRNLAANAGCPGETLPGDALDTLASLRDGGRKFGVVCVDPPAF IKRKKDAEQGLNAYRRVNDLGLQLVENGGILTSCSCSHHLEAEALRRLIAQCAAKRGLNT QLLYQGFQGPDHPVHPSMPETAYLKVFILRVWKD Prediction of potential genes in microbial genomes Time: Fri May 13 02:28:33 2011 Seq name: gi|316924130|gb|ADCP01000046.1| Bilophila wadsworthia 3_1_6 cont1.46, whole genome shotgun sequence Length of sequence - 33448 bp Number of predicted genes - 26, with homology - 25 Number of transcription units - 17, operones - 6 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 134 - 493 526 ## COG0818 Diacylglycerol kinase - Term 602 - 629 0.1 2 2 Tu 1 . - CDS 666 - 3359 3376 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 3 3 Tu 1 . - CDS 3551 - 4462 1141 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Prom 4533 - 4592 1.7 + Prom 4584 - 4643 2.9 4 4 Tu 1 . + CDS 4719 - 6335 2252 ## COG1151 6Fe-6S prismane cluster-containing protein + Term 6432 - 6475 6.1 + Prom 6472 - 6531 2.3 5 5 Op 1 3/0.000 + CDS 6557 - 7240 576 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 6 5 Op 2 . + CDS 7304 - 9316 1780 ## COG0642 Signal transduction histidine kinase 7 5 Op 3 . 
+ CDS 9362 - 9943 435 ## + Term 9998 - 10038 6.2 - TRNA 10036 - 10112 79.4 # Met CAT 0 0 8 6 Tu 1 . - CDS 10319 - 11287 1336 ## COG3366 Uncharacterized protein conserved in archaea - Prom 11524 - 11583 2.3 + Prom 11324 - 11383 2.9 9 7 Tu 1 . + CDS 11443 - 12120 794 ## COG0517 FOG: CBS domain + Term 12147 - 12178 2.5 - Term 12137 - 12164 0.1 10 8 Op 1 . - CDS 12344 - 12898 459 ## COG1335 Amidases related to nicotinamidase 11 8 Op 2 . - CDS 12955 - 13341 422 ## COG1733 Predicted transcriptional regulators 12 9 Tu 1 . - CDS 13496 - 14053 798 ## COG2316 Predicted hydrolase (HD superfamily) - Prom 14231 - 14290 2.3 + Prom 14065 - 14124 3.0 13 10 Op 1 . + CDS 14223 - 15266 1117 ## COG1774 Uncharacterized homolog of PSP1 14 10 Op 2 . + CDS 15475 - 17463 2540 ## COG0143 Methionyl-tRNA synthetase + Term 17486 - 17525 7.5 - Term 17623 - 17660 4.7 15 11 Tu 1 . - CDS 17814 - 19475 2277 ## COG1620 L-lactate permease - Prom 19501 - 19560 4.0 + Prom 20036 - 20095 1.7 16 12 Op 1 . + CDS 20191 - 20835 257 ## Dvul_2391 hypothetical protein 17 12 Op 2 11/0.000 + CDS 20915 - 22138 1038 ## COG0477 Permeases of the major facilitator superfamily + Term 22140 - 22190 -0.5 + Prom 22202 - 22261 2.3 18 12 Op 3 . + CDS 22322 - 22906 569 ## COG1309 Transcriptional regulator 19 12 Op 4 . + CDS 22910 - 23731 750 ## COG1968 Uncharacterized bacitracin resistance protein + Term 23806 - 23856 7.2 20 13 Tu 1 . - CDS 24226 - 25500 1231 ## COG1896 Predicted hydrolases of HD superfamily + Prom 25712 - 25771 2.6 21 14 Op 1 59/0.000 + CDS 25801 - 26235 628 ## PROTEIN SUPPORTED gi|94987001|ref|YP_594934.1| 50S ribosomal protein L13 22 14 Op 2 . + CDS 26254 - 26643 543 ## PROTEIN SUPPORTED gi|220904336|ref|YP_002479648.1| ribosomal protein S9 + Term 26672 - 26703 5.5 + Prom 26721 - 26780 7.4 23 15 Tu 1 . + CDS 26870 - 28525 1563 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily - Term 28816 - 28864 18.8 24 16 Op 1 . 
- CDS 28880 - 30421 1255 ## COG1293 Predicted RNA-binding protein homologous to eukaryotic snRNP - Term 30458 - 30498 9.1 25 16 Op 2 . - CDS 30506 - 30868 462 ## COG1734 DnaK suppressor protein - Prom 31034 - 31093 3.0 + Prom 30996 - 31055 2.7 26 17 Tu 1 . + CDS 31094 - 33274 2674 ## COG3968 Uncharacterized protein related to glutamine synthetase Predicted protein(s) >gi|316924130|gb|ADCP01000046.1| GENE 1 134 - 493 526 119 aa, chain - ## HITS:1 COG:HI0335 KEGG:ns NR:ns ## COG: HI0335 COG0818 # Protein_GI_number: 16272287 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Diacylglycerol kinase # Organism: Haemophilus influenzae # 9 112 9 112 118 70 39.0 7e-13 MPRFINPKHLMGATKNSLSGLVFAWKGEQAFRHEVLVLVVLAALLAVTDKDAGQWLMVLG GWLGVMVVELLNSADEESFDLVTTEWNEHVKRGKDMASAAIFLAMVINAGIWTYVFLLN >gi|316924130|gb|ADCP01000046.1| GENE 2 666 - 3359 3376 897 aa, chain - ## HITS:1 COG:TP0105_2 KEGG:ns NR:ns ## COG: TP0105_2 COG0749 # Protein_GI_number: 15639099 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Treponema pallidum # 432 897 149 631 631 355 41.0 2e-97 MRTSRRSSLAGAGKIPGCPLGETMSIKQRFEQEPIYLMDGSAFVYRGFYANQSMQRSDGF PTSALFIVARILMRILREEQPKHFAFVLDGHGPNFRHELFPLYKAQRSATPEDLIKQIDP IHRLIKALGLHLEVSDSCEADDCLASLAKRYREERPVILVATDKDLKQCLHDNVMMWDPA SRDEKIVTLKSFEEETGLKPTQWPDVQAIIGDSSDNIPGVRGVGPKTAEKLFHEYASLEA IRDGFASMKPALQKKFDGQLEAMFLYRQLTTLDTSRCASLTLDAMRVRPVNRQEAATLLR EFELSSLERELATLIRDGRVAVEHDASETVDSGKSGSMRQLSLLDTPKAEPRQRLRDASD PFFDALEGNPVAVLGGHGELAIAVGDKEIVYTGLTPKLVDRLLKAKFVIAPDVKALLHTH ADWGRIPPEKWFDLGLASYLLEPEDRDYGWPKLSARWGTALGLSAANPGLLALTMTDQLL KRLSGAHLVELMKTLELPLIPVLADMESAGVKLDAEALADFLEEVQKDLDRITEDVYREA QGPFNIRSAQQLGDVLFNRLKLPVSGKTRGGQASTSQDVLEKLSGHHPVVDALLEFRKLE KLRSTYLEPLPRLMGGDGRIRTTFNQLATATGRLSSSNPNLQNIPVRGALGRRMRACFTA PEGKKLISADYSQIELRVLAHLSRDETLLAAFREGADIHARTASLLFDAPPSEITPDQRR NAKTINFGLIYGMGPQKLAQELKIPLSEAKAFMARYFERLQGLKHFYENVEEMAREQGYV 
TTLAGRRRPLPDIQAESQQARSLARRQAINTLIQGSAADIIKLAMLAVHGDETLRTLDAK LLLQVHDELLLEVPEAAAQEAGERVAALMANVRPGGIVLDVPLKADWGAAENWGDAH >gi|316924130|gb|ADCP01000046.1| GENE 3 3551 - 4462 1141 303 aa, chain - ## HITS:1 COG:aq_1333 KEGG:ns NR:ns ## COG: aq_1333 COG0037 # Protein_GI_number: 15606536 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Aquifex aeolicus # 4 299 2 297 301 237 42.0 2e-62 MICKRCKSATAIISLPSHNTGFCENCFRDFFSAQVARGIETGKLFTHEDRILVALSGGKD SLSLMLELSRQGYDVTGLHIDLGIPGSSEIVRGVIERFCGKHGFKLIVKEMAKEGLAIPL VKQRLKRPICSACGKIKRHYFNKTALEEGFTALATGHNLDDEIARLFSNTLRWDVGYLSD QGPRLDGEDGFARKVKPFWRLTEFETATYAFLEEIEHHHTPCPYSAGASFTYYKGLWNQL EEEMPGRKLSFYVDFLKRGRSAFAGLERTEGDALAPCTVCGYPTSSGVCGVCRIREVVKE GKE >gi|316924130|gb|ADCP01000046.1| GENE 4 4719 - 6335 2252 538 aa, chain + ## HITS:1 COG:YPO1360 KEGG:ns NR:ns ## COG: YPO1360 COG1151 # Protein_GI_number: 16121640 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Yersinia pestis # 1 538 1 549 550 533 49.0 1e-151 MFCNQCEQTAKGTGCTVMGVCGKLAPVADLQDHVIYALRLLSRTALKARAKGVIEQDTDD FTVRALFSTLTNVNFDPAALEEVAREAIRRRKALATKLGCTPDTCDGAVDADWESIRKVQ AGTDRFSDDVNVRSAMQLLLYGLKGVAAYADHAAVLGQRDPELNVFVYEALAAGMPDKDG TPDKARSLEDWLDLVLRCGKANLKAMELLDAGNTKAFGDPVPTQVSLGHRQGKAILVSGH DLEDLYELLKQTQGTGINIYTHGEMLPAHGYPVLKAFPHFVGHYGTAWQNQQKELPGFPG AVLFTTNCIQNPKDYGGKVFTSGIVGWPGLVHIKNRDFAPLILKALELPGFAEDAPGKEV TVGFGRKTLLDAAPAVLDAVKSGTVRHIFLVGGCDGAKPGRNYYTEFVEKTPQDTLILTL ACGKFRFFDKDLGKIGPFPRLLDVGQCNDAYSAVKVALALADALKCGVNDLPLSLVLSWY EQKAVAILLTLLALGVKNIHLGPSLPAFVSPDILNLLVENWGIKPISTPDEDLKALLG >gi|316924130|gb|ADCP01000046.1| GENE 5 6557 - 7240 576 227 aa, chain + ## HITS:1 COG:slr0449 KEGG:ns NR:ns ## COG: slr0449 COG0664 # Protein_GI_number: 16332256 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of 
cAMP-dependent protein kinases # Organism: Synechocystis # 14 226 16 234 238 113 30.0 3e-25 MSNERLEVLGGLPLFEPLSPDERERLAAGCRMRTFARGAALFREGERADGMHIVLRGVVK VVRFAPDGREMVLHLVRKGNTIGEAAMFQKGTFPASAVAVDDVETLFLPADALFTLVTEN PEMALRMLAALSLRLRMFAHKLAAQGQGGAACRLATYLLHRRQIGGGDCIRLGVSREVLA NLLGLARETLSRQLSRFSEAGLVELRGKDIVILDVPALQATAAEGDR >gi|316924130|gb|ADCP01000046.1| GENE 6 7304 - 9316 1780 670 aa, chain + ## HITS:1 COG:VC1349_3 KEGG:ns NR:ns ## COG: VC1349_3 COG0642 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 270 503 3 236 260 257 55.0 5e-68 MGIDADMFNSPEWMSEALRIGNTGLWAIEVDEENGKHRMTANETMLLLLGLETHPSPEDC FRHWYGRVDEAYRAAVDECVGKMLSTGQRYEVQYPWLHPVCGRIFVRCGGKLLPGRSKNG LLRLKGYHQDVSELESMRERLRENLSRFETACRIGRIGVFECTRGSRILFSANDIFFEQF GIPADMVCFSAFRGLWSRIAPGCRKRVLEALRRSSWKPGRCERFEIELLDPEKGSRWFDF ESEFSRDGDAVRAVGYVADITEHKQHEASLRMAAEAAEAANRAKSSFLANMSHELRTPMN AIIGLSYLALKTDLTTQQYEYISRISESSTALLGILNDILDLTKVEANKLELARTPFNLK KELGILAAVVLPEAEGKGLEFSLEIAPDVPLLLMGDALRVRQVLLNLCNNAVKFTDEGSV SLRVRVVDSTNTRVRLEFVVRDCGIGIPESEIGRIFSPFMQVDESATRRFGGTGLGLAIS KRLVELMGGTLTVGSRVNKGSTFRVELAFPLAEDVPPGDAGSFCSGGNVPDESLPLGDLV GQRVLVVEDNELNQYVILNMLARFGVETCVAENGRDAVDRYAEDQDFDVILMDVQMPVMN GYDATRRIRESGLPRCRSVPILAMTAHAMRGDEERSFAAGMNAHLTKPIDVRELVFSLSR WGSAARRNRK >gi|316924130|gb|ADCP01000046.1| GENE 7 9362 - 9943 435 193 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRNVLALLGAGLLAGCCLMHFHVPIAEQTQANARTFGSLGQRLLPEDTPPERIAAFRQCL SEAYMEHYGEGEAGTAREREDGYALFWQTVMAAAPTYDHLASYDIHVRGRKRYEADPHSF VEAGRAWSRIVFERCGAGVFPDEDRYRMLGLFSRLPAWCDPPTGVPPSPRLLDDMGTPPA AARNFHKQSARRS >gi|316924130|gb|ADCP01000046.1| GENE 8 10319 - 11287 1336 322 aa, chain - ## HITS:1 COG:MA1324 KEGG:ns NR:ns ## COG: MA1324 COG3366 # Protein_GI_number: 20090185 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in archaea # Organism: Methanosarcina acetivorans str.C2A # 23 313 19 304 311 126 31.0 6e-29 
MIPSDLMPFWLTLGRPLTRLLFAMCIGLLIANIVEALRWTQPLARLSAPLVRLARLREAA GASFSMAFFSPAAANALLSDSHAKGEISLRELVLANLFNSLPAYMVHLPTMLFMLWPVLG APALIYTGLTLAAAALRTLFTVGLGHFLLPPLHGENCIECRLPQEKVTLKSAALKAWKRF RRRLPKLVMFTIPVYVAMYLMQHYGLFRTAEAWLSDHTPWLGFIRPEALGIVLLSLAAEI GASLSAAGSALHMGGLTPPEVIIALLVGNILSTPMRTVRHQFPAYAGYYSPGLAFKLIFY NQLLRTITMIGVTWAYWLCAIA >gi|316924130|gb|ADCP01000046.1| GENE 9 11443 - 12120 794 225 aa, chain + ## HITS:1 COG:TM1140 KEGG:ns NR:ns ## COG: TM1140 COG0517 # Protein_GI_number: 15643897 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Thermotoga maritima # 1 174 1 175 215 164 48.0 1e-40 MLIREWMTKDVITVTPDTSMLKASKLMKDHNIRRLPVLDGKHVVGIVSDRDIRAASPSKA TTLDMHELYYLLSEVKVKDIMTSDPVTVYDTDAVDAAALLMENKGIGGLPVVDGSGELVG IITDHDIFRVLVDFCGASKGGLQLAFMLPDKPGVLTPIFEAISQNGGNVLSVLTSRGKTQ EGTSHVYIRLHAMDSEKEKALIENMKKAMPLEYWMDDEFHCNRNC >gi|316924130|gb|ADCP01000046.1| GENE 10 12344 - 12898 459 184 aa, chain - ## HITS:1 COG:BS_yddQ KEGG:ns NR:ns ## COG: BS_yddQ COG1335 # Protein_GI_number: 16077574 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Bacillus subtilis # 1 179 1 179 180 194 53.0 9e-50 MNTALLLIDIQNDYFPGGAWELSHADEAAAQARLALGRFREQHRPVFHVRHINTRPGAAF FLPDTPGSEIHGAVKPLDGEPVIIKHAPDAFFQTDLHDALSRAGIRKLAVCGMMSHMCID TSVRAARNHGYDITLLHDACATRDLSWNGKTIPAATVHEAFMAALHGAFADVRTTGDFLP SLPA >gi|316924130|gb|ADCP01000046.1| GENE 11 12955 - 13341 422 128 aa, chain - ## HITS:1 COG:CAC3399 KEGG:ns NR:ns ## COG: CAC3399 COG1733 # Protein_GI_number: 15896640 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 115 1 116 116 126 59.0 1e-29 MKIRTVCTCPLEIVHDIMRGKWKTIIVFQLRNGGMGLAELERGIEGITQKMLLQHLGELR AFGLIGKIEPDGYPLRVTYFLTERGEKLLRAVRIMQDIGVEYMLENGQAGILERKGIIPA AEGIPTKK >gi|316924130|gb|ADCP01000046.1| GENE 12 13496 - 14053 798 185 aa, chain - ## HITS:1 COG:TM1558 KEGG:ns NR:ns ## COG: TM1558 COG2316 # 
Protein_GI_number: 15644306 # Func_class: R General function prediction only # Function: Predicted hydrolase (HD superfamily) # Organism: Thermotoga maritima # 1 183 1 182 182 136 44.0 3e-32 MLDREQALALLNELGPEKHLIQHALASEAVMRALARHLGEDEEVWGLTGLLHDLDYPLTH EDPAKHGLVGAERIGDRLPEEALHAIRAHNGEMTGVAPSSAFDYALRCGETVTGLVVTAA LVRPTGMEGMQASSLKKKMKDKAFAASVNRDCIRQCSELGLELGDFLTLAIGAMAEIDEE LGLRK >gi|316924130|gb|ADCP01000046.1| GENE 13 14223 - 15266 1117 347 aa, chain + ## HITS:1 COG:BH0045 KEGG:ns NR:ns ## COG: BH0045 COG1774 # Protein_GI_number: 15612608 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Bacillus halodurans # 1 245 1 245 275 226 46.0 5e-59 MKRIIGIKFSHYGQIYFFYCEDPFVDRGDRVLAETGQGLGIATVMNIVDRLPEDLPPDGV RDVLRKVTDEDQIVAEENDTLTYSAHKFCESRIRERQLDMKLVDVEVLFDRSKLVFYFTA PTRIDFRELVKDLVREYHTRIELRQIGVRHETQMLGAVGSCGMVCCCRRFLRKFVPVTIK MAKEQNLFLNPSKISGICGRLLCCLSYEQENYDIFHSSCPRLGKRYQTNKGPMKVLRANM FRNSVALLTDTNEELELTLDEWQALDPRRPDAPPRPEGKPDAKPENKPSSPRPHDELMVV MADPDTIDEAFGDESDELVDDLGGLNDGEMETLEEGLRPKRKRKRKR >gi|316924130|gb|ADCP01000046.1| GENE 14 15475 - 17463 2540 662 aa, chain + ## HITS:1 COG:CAC2991_1 KEGG:ns NR:ns ## COG: CAC2991_1 COG0143 # Protein_GI_number: 15896243 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 5 547 7 533 536 437 41.0 1e-122 MSSTFYITTPIYYVNARPHLGHAYTTIVADSLRRFHTLLGEDTWFLTGTDEHGDKIVKAA EAAGQTPQEFVDGISGQFQALWPKLGIKHDQFIRTTDADHKARVQAFLQKVYDNGDIYFG EHGGHYCTGCERFYTEKELENGLCPQHLTKPDFIQEKNYFFRMSKYLPWLAGHIRENPDF IRPERYRNEVLSMIESGALEDLCISRPKNRLEWGIELPFDKDYVCYVWFDALLNYISALG WPDGDKYAAYWPGEHLVAKDILKPHGVFWPTMLKSAGVPLYKHLNVHGYWLIKDTKMSKS LGNVVEPIKMAEHYGLDAFRYFLLRDMQFGSDASFSEEALITRFNADLANDLGNLFSRVL SMNAKYFESKVPPMGELAEDDKALIELAENSRRNYVQLFGNIRFSQGLDALWDLVRALNK YVDSQAPWTLFKQGDTARLGTVIRLLLECMRKVALCLWPVMPGTAAALLEQLGQPLPASE RFGVAPAGNVNDEEGVWECLATGSVIASASNLFPRLEIPEEMKEDKKAKQKKPAKEKEAK AAPASAPAEMSVPGVAEYADFQKLELRVGTILEAGRVPKADKLLCFKIDLGEAEPRQILS 
GIAEHFEPETLVGRQVCVVANLAPRKIRGLVSAGMILTAEAPDGKLTLLAPGGDVAPGSK IS >gi|316924130|gb|ADCP01000046.1| GENE 15 17814 - 19475 2277 553 aa, chain - ## HITS:1 COG:VCA0983 KEGG:ns NR:ns ## COG: VCA0983 COG1620 # Protein_GI_number: 15601736 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Vibrio cholerae # 1 553 19 571 582 406 43.0 1e-113 MSVELLAMVAVLPILVALILMVGMRWPATKAMPLAWLVCAVSGILVWGLSPGYVAALTLQ GIVTAIGVLIIVFGAILILFTLEKSGGMETIQYGMQNISRDKRVQAIIIGYMFAAFIEGA AGFGTPAALAAPLLLALGFPAMAAAIICLVFNSFPVSFGAVGTPIVMGLSPLKPILDAGV ADGGMTYAAFYKIVGEYCTMMHIPMAFILPVFMLGFMTRFYGPNRTWSEGFSAWKYCIFA GVCFSVPYFIVAWTLGPELPSLIGGLIGLGILIYGTKRGFCVPETTWDFGPHEKWDASWT GSIAASGKTEFHPHMTQFRAWLPYAIIGLILVLTRIPELGLKAWLASFKLSFVDILGYQG VSASIDYLFLPGTIPFTLVAILTIGLHSMKANDVKDAWTTTFAKMKAPTIALFAAVALVS IFRGSGVNDAGMDSMPLALAKTLAALTGEAWPALASYVGGLGAFITGSNTVSDLLFAQFQ WDMATQLKLSKEIIVAAQAVGGGMGNMICIHNVVAVCAVVGLSGSEGMIIKKTFWPFLLY GIIVGIIACGLVF >gi|316924130|gb|ADCP01000046.1| GENE 16 20191 - 20835 257 214 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2391 NR:ns ## KEGG: Dvul_2391 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 8 198 121 311 328 163 43.0 3e-39 MGETGLRRTMPHWPQPGLIPRDPARGDRLETIAYFGSDQYEPQFVKTGAFQDALRQRGVR FVNRFQGEWHDYQHVDAVLAIRDCPPVVLATKPASKLINAWKAGVPALLGPEPAYRELRT SPLDFLEAASAEAVLDSIDRLQQESGLYRRMTEHGAKRAQAFDVNMLTQKWISLLEEARE RNRRETLYPVMRSIRYLWNRQKINLGKRLSGWRD >gi|316924130|gb|ADCP01000046.1| GENE 17 20915 - 22138 1038 407 aa, chain + ## HITS:1 COG:BS_yitG KEGG:ns NR:ns ## COG: BS_yitG COG0477 # Protein_GI_number: 16078162 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 24 407 12 401 422 111 28.0 2e-24 MTKDGLPGEASQEGQSRPVSLAGLVLVLCVALMGVMGVSITLPILPKLGAVFHQDAAGVA 
LLITCFTLPSAFMTPVAGVLADRFGRKAVLLPGLLLFACGGMGCAFSDSFENLLAWRAIQ GLGAAPLGILYGTLVGDFYKEADRPKVMGMVGATISFGTALYPAIGGFLGEMDWRWPFWI SLAALPVGLLALSVPLERPHTGMDWKQYARDSRSIIFHSAAIGLFGLTFLCFCILYGPTI TYFPLLADLLYKATPSHIGAVFTVASLGTAAIAMNLAWLGRKYSHRRLMLSATCCYVVAQ TLMLVLPDAVSSLWWLTLPIFIGGVAQGLTFPLLNARMTTLAPTRNRAIVMAMNGTVLRL SQSLSPLFFGIGWSYIGWRGPYAMGIGVALVIGALVWRVYPVSSGKE >gi|316924130|gb|ADCP01000046.1| GENE 18 22322 - 22906 569 194 aa, chain + ## HITS:1 COG:TM1030 KEGG:ns NR:ns ## COG: TM1030 COG1309 # Protein_GI_number: 15643788 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Thermotoga maritima # 1 88 2 87 200 62 35.0 6e-10 MTKKEALLKAAKELFGECGYTETTFKKISERAGVALGLLTHHYGNKEKLFLAAGMDVLER FHEVLQRACDEGKNGRESVLNFCKAYLDFSIDPDSNWLVLVRCSPYSDMKTTDDRELMFE KFNDIHKLLEQQLWRGIEDGSIRELLVKETAQILMSLVVGANRTRVLTPYAPPHLYREAV AFAARAITPCNACE >gi|316924130|gb|ADCP01000046.1| GENE 19 22910 - 23731 750 273 aa, chain + ## HITS:1 COG:YPO0649 KEGG:ns NR:ns ## COG: YPO0649 COG1968 # Protein_GI_number: 16120974 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Yersinia pestis # 7 267 10 269 272 218 48.0 7e-57 MDWFSTGLIMGAVQGLTEFLPVSSSGHLVIAGDLLGFTGPKASTFEVAIQLGSIFAVLMV YWDRFMGLLLPGRVQTSAPKPFAGLRGIWLLFLTTLPPSVLGLLLHSYIKQLFTPFSVSL SLAVGGLLMLFVESYCAKRPPKYSSIDEVTPKLALGIGLCQCMALWPGFSRSGSTIMGGM LLGEKRALAAEYSFVAAVPIMFAATGYDLLKTWSLFTVDDIPLFATGLFFAFLFGWLAIK TFIALVGRITLRPFAVYRLLLAPIVYFFMVNMG >gi|316924130|gb|ADCP01000046.1| GENE 20 24226 - 25500 1231 424 aa, chain - ## HITS:1 COG:jhp0650_2 KEGG:ns NR:ns ## COG: jhp0650_2 COG1896 # Protein_GI_number: 15611717 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: Helicobacter pylori J99 # 203 423 2 211 212 157 41.0 5e-38 MSAIRKSLLQLMFSGSYMRRWNDKLRPVELYEIDKQAHKMIVAWMLTLLNSGGYSASDQL KLQQEVIERGLFDYLYRLVTTDIKPPVFYRICENEKDYKELTEWVLKELRPVLGAPDEGF WERLSAYHRNRDRTSLPDRILTAAHHYASGWEYNVIKPFNTFDEENQSIAESFTERLDGL 
TDLCGVNELIQGHAFFSDSPTALGRFAKLCGQLRFQIRWADTPRVPETSVLGHMFLVAGY AYFFSLSLGACPARRINNFFAGLFHDLPELLTRDIITPVKRSVNQLPSLLRAYELQELER RVFGPLSAGGHDRLVERLRYYLGLVGEGVTSEFDETIRDSSGQVRCLGSFDALHANGNED GLDPKDGTLLKVCDNLAAFIEAHSSVRTGISSPNLHEAIARIRGDFRHRSLGPLSLGTII ADFD >gi|316924130|gb|ADCP01000046.1| GENE 21 25801 - 26235 628 144 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|94987001|ref|YP_594934.1| 50S ribosomal protein L13 [Lawsonia intracellularis PHE/MN1-00] # 1 144 1 144 144 246 81 1e-64 MKTFSPTPEHIDHQWFVVDADNQVLGRLAAQIAHRLRGKHKPEFAPHMDNGDVIVVVNCE KIKVTGAKLEKKKYYHHSGYVGGLREITLDKLLAEKPADVLMHAVRGMLPKNRLGRAMLK KLKVYAGPTHPHTAQGPKPLSFPY >gi|316924130|gb|ADCP01000046.1| GENE 22 26254 - 26643 543 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|220904336|ref|YP_002479648.1| ribosomal protein S9 [Desulfovibrio desulfuricans subsp. desulfuricans str. ATCC 27774] # 1 129 1 130 130 213 83 1e-54 MSEFDYGTGRRKNATARTRLYAGNGAIEVNGRKVEDYFPRKTLQMIIRQPLVLTKLVDKF DVKVNVCGGGVTGQAEAVRHGISRALLSADPSLRGVLKRAGFLTRDARKKERKKYGQRAA RARYQYSKR >gi|316924130|gb|ADCP01000046.1| GENE 23 26870 - 28525 1563 551 aa, chain + ## HITS:1 COG:CAC1683 KEGG:ns NR:ns ## COG: CAC1683 COG0595 # Protein_GI_number: 15894960 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Clostridium acetobutylicum # 6 551 8 555 555 436 40.0 1e-122 MTDTFLTITPLGGLGEIGMNCQMWNTPEGCVLVDCGIMFPDDQQLGVDVVIPPLEPILAQ RGRLLGVVLTHGHEDHIGAVPWLVSFIKGLKVYGSPMTLALVEHKLRERGLLDRAELITV TPEHELALGGLRFHFIPVSHSIPQGYALAVDTPVGKVVHTGDFKIDEFPSDGVGTDLPAL RSFAGVDGVRLLLSDSTNAESEGHTQSEKVVRETFHEIFSEAKGRIVITLFSSHIERIQM VFDMAREFDRAVVVSGRSLVNNIERGRDLGFMRMPPELFTDQTIPDVSPERMVVIATGSQ GEPLSALSRIASGEHRQLSIMEGDTVIMSSRVIPGNARAVNRLINQMYRMGADVCHDGTR PVHVSGHGRRDELRTMLDAVRPKFFVPIHGEYRHLIQHRDLARDWGIAPERTFILDDGEP LTLLPDTIRLEEKIPADSILVDGKGVGDVGNLVLRERQLLGGDGVVVVVLVLDEETGEVI HGPDMISKGFVFEQQFSHLLEDAKCLVLDHLETSPRLGIPRLGDRIRSSLRSFFRKVVGR DPVVVPVITEV >gi|316924130|gb|ADCP01000046.1| GENE 24 28880 - 30421 1255 513 
aa, chain - ## HITS:1 COG:BS_yloA KEGG:ns NR:ns ## COG: BS_yloA COG1293 # Protein_GI_number: 16078628 # Func_class: K Transcription # Function: Predicted RNA-binding protein homologous to eukaryotic snRNP # Organism: Bacillus subtilis # 251 511 305 569 572 84 27.0 6e-16 MDCHVFRRLCDELAPALMGCRIEKIHRPAKDVTLFTLYGTVGKRFLFFRAGRKAPFLFLS THKIPVGSAPPADIMRLRKYLADRRIIDVLPDWVGRRLYLHVNADTECWLTLDLREGPSL LFDAPPEPEIPAWPDPAHWAEACEGDGWRNWPVITPPLRRTLPLLPPDEQSALLLDLEAG GGDLFLYENAAGERELSAWPLPPERRRDADGTPREELVVEDAIRACAAAGEAQVLRGIAA LSRAEAAKPHQAEANRLRKLLLKLESEKKRLSDMVAAQDAARLLQSQLYRFSAEEKHASV TLDGAGGPVDLALDPRLTVRENMASLFHKASRGKRGLSMLEGRFAAVQADLARAEQAGLM AQAATSAPTPANAPSPAAPQFRAPELPKNVQPFRSSDGFLLLRGRDAKGNAQLLKLAAPH DLWMHTGGGPGAHILIRRDHAAQEIPSRTISEAAILSVLKSWRKDETQVDVIAALAKFVH PIRGAKPGTVRIDRMEPAIIVTPDPSLEEKLAL >gi|316924130|gb|ADCP01000046.1| GENE 25 30506 - 30868 462 120 aa, chain - ## HITS:1 COG:SMc00469 KEGG:ns NR:ns ## COG: SMc00469 COG1734 # Protein_GI_number: 15965557 # Func_class: T Signal transduction mechanisms # Function: DnaK suppressor protein # Organism: Sinorhizobium meliloti # 8 117 26 135 139 88 49.0 4e-18 MEQKDIEYFRTLLNNMLEEAQHKGDSTLEDLTDSNELFADPADRASAESDRAFTLRIRDR ERRLIRKIQAAIQRLDEGTYGICEDCGEDISIPRLKARPVTKLCINCKARQEEGENIRGE >gi|316924130|gb|ADCP01000046.1| GENE 26 31094 - 33274 2674 726 aa, chain + ## HITS:1 COG:slr0288 KEGG:ns NR:ns ## COG: slr0288 COG3968 # Protein_GI_number: 16331104 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Synechocystis # 1 726 1 724 724 767 51.0 0 MSGSNARSNAIRAITSYKPDLAPLNYADTKPTEIYGSNVFSDKVMKERLPRIVYKSLRKT IEQGVQLDPSLADIVASTMKDWAIEKGATHFTHVFYPLTGHSAEKHDGFLSPDGNGGVIT EFSGKMLSQGEADGSSFPSGGLRSTFEARGYTAWDVTSPAYLMENSSGIVLCIPTAFLSW TGVALDRKTPLLRSNQALNKQASRILKLFGKKTSLPIISYSGLEQEFFLIDRNFVFGRPD LLTAGRTLFGAKPAKGQEFEDQYYGVMPRRVMSCITEAERELYKLGVPVRTHHNEVAPSQ YEIAPVFETANLAIDHNQLMMTVLKKVAKRHGLHCLLHEKPFSGINGSGKHLNYSIGNAE VGTLFEPGETPHENAMFLVFLVAAIRGLHKYGGLLRATVAAASNDHRLGANEAPPAIMSM 
FLGDQLTDILEQFRVGQVRGSKGKRLMNVGVDTLPPLPADPGDRNRTSPFAFTGNRFEFR ALGSSMPASASQTALNTIMSDSLDYAATRLEKLSGGDPEKLHAAVGKLIQEIVEEHSAVI FNGDGYSEIWHQEAERRGLPNYRTTPEALMVYTSPDVVDLYSRYNVLSRQELKARQEIYI EQYCKTIRAEANLVIRMGRTIIYPAGLRFQQELLRACLDMRALNREPDTVLLDDIDDTLR KLREGLEHLENNVDMKIEDPVQEAHHKCTVVIPAMNAIRTLADHLETIVPEDLWPLPSYQ EMLFVK Prediction of potential genes in microbial genomes Time: Fri May 13 02:29:16 2011 Seq name: gi|316924091|gb|ADCP01000047.1| Bilophila wadsworthia 3_1_6 cont1.47, whole genome shotgun sequence Length of sequence - 44004 bp Number of predicted genes - 41, with homology - 36 Number of transcription units - 28, operones - 8 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 72 - 2531 2393 ## COG1287 Uncharacterized membrane protein, required for N-linked glycosylation 2 1 Op 2 . - CDS 2616 - 2837 351 ## Ddes_1959 transcriptional regulator, TraR/DksA family 3 2 Tu 1 . - CDS 2949 - 3437 606 ## COG2731 Beta-galactosidase, beta subunit - Term 3451 - 3502 -0.4 4 3 Tu 1 . - CDS 3577 - 4785 1450 ## COG4992 Ornithine/acetylornithine aminotransferase 5 4 Tu 1 . - CDS 4856 - 5338 595 ## COG0756 dUTPase - Term 5591 - 5628 2.3 6 5 Tu 1 . - CDS 5653 - 6174 171 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 7 6 Tu 1 . - CDS 6296 - 7120 207 ## PROTEIN SUPPORTED gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 + Prom 7447 - 7506 3.1 8 7 Op 1 . + CDS 7587 - 8099 669 ## COG0780 Enzyme related to GTP cyclohydrolase I 9 7 Op 2 . + CDS 8102 - 8902 884 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor + Prom 9155 - 9214 2.7 10 8 Tu 1 . + CDS 9281 - 10249 606 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control + Term 10386 - 10418 1.4 - Term 10210 - 10273 3.5 11 9 Op 1 13/0.000 - CDS 10286 - 12214 1904 ## COG1154 Deoxyxylulose-5-phosphate synthase - Prom 12287 - 12346 1.8 12 9 Op 2 . 
- CDS 12396 - 13304 1184 ## COG0142 Geranylgeranyl pyrophosphate synthase
- Prom 13444 - 13503 2.5
13 10 Op 1 . - CDS 13651 - 13890 275 ## DvMF_0124 exodeoxyribonuclease VII small subunit
14 10 Op 2 . - CDS 13899 - 14756 1006 ## COG0739 Membrane proteins related to metalloendopeptidases
15 10 Op 3 . - CDS 14743 - 16215 1434 ## COG1570 Exonuclease VII, large subunit
16 11 Tu 1 . - CDS 16529 - 18253 2488 ## COG0442 Prolyl-tRNA synthetase
- Prom 18311 - 18370 4.5
17 12 Tu 1 . + CDS 18715 - 19593 940 ## Dvul_2897 hypothetical protein
+ Term 19614 - 19666 11.0
18 13 Tu 1 . + CDS 20024 - 20617 629 ##
+ Term 20637 - 20678 12.2
- Term 20630 - 20660 3.3
19 14 Tu 1 . - CDS 20700 - 21074 334 ##
- Prom 21164 - 21223 3.1
20 15 Tu 1 . + CDS 21208 - 22566 1304 ## DVU1407 radical SAM domain-containing protein
+ Term 22769 - 22809 10.5
- Term 22757 - 22797 11.3
21 16 Tu 1 . - CDS 22991 - 24604 2001 ## COG0392 Predicted integral membrane protein
22 17 Tu 1 . + CDS 24893 - 25375 700 ## COG0782 Transcription elongation factor
23 18 Tu 1 . + CDS 25506 - 26903 546 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430
+ Term 26912 - 26952 0.1
+ Prom 27152 - 27211 2.1
24 19 Tu 1 . + CDS 27293 - 27559 274 ## COG0724 RNA-binding proteins (RRM domain)
+ Term 27577 - 27614 10.1
25 20 Tu 1 . + CDS 28545 - 29726 1490 ## COG1454 Alcohol dehydrogenase, class IV
+ Term 29738 - 29782 12.1
26 21 Tu 1 . - CDS 29868 - 30059 91 ##
+ Prom 30047 - 30106 5.4
27 22 Op 1 7/0.000 + CDS 30274 - 32226 1702 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9
+ Term 32230 - 32277 7.4
28 22 Op 2 1/0.000 + CDS 32285 - 33136 737 ## COG0294 Dihydropteroate synthase and related enzymes
29 22 Op 3 . + CDS 33179 - 33916 783 ## COG1624 Uncharacterized conserved protein
30 22 Op 4 . + CDS 33900 - 34841 693 ## LI0191 hypothetical protein
31 22 Op 5 . + CDS 34889 - 36241 1257 ## COG1109 Phosphomannomutase
+ Prom 36257 - 36316 3.1
32 23 Tu 1 .
+ CDS 36498 - 37361 813 ## COG1210 UDP-glucose pyrophosphorylase + Prom 37428 - 37487 4.5 33 24 Tu 1 . + CDS 37573 - 39924 1544 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase + Term 40066 - 40098 3.1 34 25 Tu 1 . - CDS 40100 - 40555 280 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 40307 - 40366 1.9 35 26 Op 1 . + CDS 40452 - 40661 92 ## 36 26 Op 2 . + CDS 40687 - 40896 234 ## DVU1567 hypothetical protein 37 26 Op 3 . + CDS 40994 - 41608 723 ## COG0293 23S rRNA methylase 38 27 Op 1 8/0.000 + CDS 41737 - 42483 900 ## COG0217 Uncharacterized conserved protein 39 27 Op 2 . + CDS 42710 - 43219 522 ## COG0817 Holliday junction resolvasome, endonuclease subunit 40 28 Op 1 . + CDS 43344 - 43709 454 ## LI0267 hypothetical protein 41 28 Op 2 . + CDS 43764 - 43988 274 ## Predicted protein(s) >gi|316924091|gb|ADCP01000047.1| GENE 1 72 - 2531 2393 819 aa, chain - ## HITS:1 COG:MJ1525 KEGG:ns NR:ns ## COG: MJ1525 COG1287 # Protein_GI_number: 15669720 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for N-linked glycosylation # Organism: Methanococcus jannaschii # 351 685 404 771 933 65 21.0 5e-10 MTIPPPPQSRNIPLGPVPRWLLGLLLGIATYAVAFLLRFQEWPSWQDVEYRLGNEMLLAT HDAYHWVAGAEGFEFGAGHPMSELLRILALITGAEPAQVAFWLPPVMASLVALLIFLWAW GMGSMEAGFCAGILASLSPGFLARTMLGYADTDLVTLFLPLLIGLAPAVWVMHFLRHPLA LPFRWFQRWTKRPIPIALDAPGHQPYAPISAFWVFALSASGLLAWWSQEWHSMFPYIVRY NVALIGCMALLLARPGERRTALLAGLAYALPALGGPTGAMFPLTLLIAIMGRERQFVAQL HRPAVLAAFWLMAAALLVDLSVLSTMLHHVQGYLKRAGDAVATGTPDPLVFPSVAQSIIE IQDLTLTEILVYFHPWQPIAILGLLGFLFVLCARSGALFLVPLALLALLSTKMGGRMVMF GAPIVAIGLTLPVDWLACALGDVRAKTTRKVFFLACLILAGLILLVPSCRETMIELWNSL GWLHFVIMGCLLAMFVIGLGRQRGWRLTRTLDMIGPHLLYRSAAVILMLAIVIPPLADLV PAMTHGPILNRRHAAGLRELRKATPEDAMIWHWWDWGYAAHHFSRRDTIADGAEHGGPSL YLPAAVYATDDPRFARQIIKYTAAKGNVPGNVFKGLTASQAADMITWLNNPNNPLIQADG KQYLVLSFDMLDLGFWISTFGSWNFLSKEGRGYAISIVPQALSYRLDKGEVVMKGSNINV 
PAASIDVFSDGQLDHRDYVTPPEYLPDNAAIKAWKEDMERRRNVHFMFNRVTGEKLVIDD RMYNTLMVQLLICDPGDPRFAPYFRLIFDNVFCRVYEVL >gi|316924091|gb|ADCP01000047.1| GENE 2 2616 - 2837 351 73 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1959 NR:ns ## KEGG: Ddes_1959 # Name: not_defined # Def: transcriptional regulator, TraR/DksA family # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 69 1 69 72 94 76.0 1e-18 MDIFDQAQELERLARETALQRARSSQYWEGPEWIDGQACCRECGEPIPQKRLEALPGVGL CRACQEEREREGF >gi|316924091|gb|ADCP01000047.1| GENE 3 2949 - 3437 606 162 aa, chain - ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 1 154 1 149 152 62 26.0 3e-10 MLVDHIGNWRRYGFGPAWEETMRWLENYAPGGSGSLADLTDGTHPLTDGFVSVATLTSRP LSGSLYECHKRFCDVQMVVEGQEWLFNAATTGLSLDGPFDDQRDVGFFQPAPAEVSRVTL SPGTFALLFPWDAHLPAIAVDGVPAPLRKCVGKIPFESLRLS >gi|316924091|gb|ADCP01000047.1| GENE 4 3577 - 4785 1450 402 aa, chain - ## HITS:1 COG:MJ0721 KEGG:ns NR:ns ## COG: MJ0721 COG4992 # Protein_GI_number: 15668902 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Methanococcus jannaschii # 13 402 10 398 398 356 44.0 5e-98 MTTSSLDTIKAREESLLCRTYGRYPVSIASGKGSRVTDLDGKEYVDLLAGIAVTSLGHCN EEIAQVIEKQARKLIHVSNLFYQEEQLELAEELLSLSHFGKAFFCNSGAEANEAQIKLAR RYMQRVKNVDAYEIITLDKAFHGRTYATMAATGQERFQDGFAPIPDGFKTVPWGDLNVLE AAITPKTAAVLVEIVQGEGGIRPMHAEYAKGIEALCREKGILFLVDEVQGGLFRTGKPWA FQHFDLKPDAISCAKALANGLPMGAMMTTDEVSKGFVAGSHATTFGGGALTSAVAAKVVR IMKRDHLDERAASLGGEFMESIRRISESHPGTIKEVRGLGLMIGIELAFPAKAVWEELIR RGFICNLSHEVVLRLLPALNIPEEDLRAFADTLEDILANIKK >gi|316924091|gb|ADCP01000047.1| GENE 5 4856 - 5338 595 160 aa, chain - ## HITS:1 COG:ECs4515 KEGG:ns NR:ns ## COG: ECs4515 COG0756 # Protein_GI_number: 15833769 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Escherichia coli O157:H7 # 30 160 22 151 
151 108 41.0 5e-24 MTTAHPKVRIQYLRDSQAVYTANGAEGLRYATSGSVGLDLRVCLEEDEAAIPAGGRLAIP SGIAVEPLTPGIAGFVYSRSGLGAMQGLTVAQGVGVIDPDYRGEITVVLLNTSGEERRLR RGDRIAQLVFQPALQVELEECETLGATGRGSGGFGHTGKH >gi|316924091|gb|ADCP01000047.1| GENE 6 5653 - 6174 171 173 aa, chain - ## HITS:1 COG:L182026 KEGG:ns NR:ns ## COG: L182026 COG0454 # Protein_GI_number: 15672571 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Lactococcus lactis # 2 147 5 150 177 74 30.0 8e-14 MLRQATDADLNQVEEGYQKHFLHEKEHGAFTVFQEGVYPTRADAAKALSKGALHVYEENG DILGSIIVDRNQPDEYETIDWPSRAPAEKVMVIHLVMVCPEAAGKGIGTSLVKYAMERAR QHSCETVRLDTGAQNIPAASLYKKLGFQLAETGTMKVGGVISHTGHLFFEKML >gi|316924091|gb|ADCP01000047.1| GENE 7 6296 - 7120 207 274 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 [Burkholderia pseudomallei 305] # 116 257 5 151 194 84 35 1e-15 MTETIYTLSLHACAEGDILPLAFLPATAAPAPFAVHGMAARPLPDGNRDWRVGTCLYDAS GVARLKVTGRMFLPHSILALDAPHTQICPLLEALTPIEAGTRTLAARREHWALAWITLSD KGAAGLRVDESGPLMAADTRAKLPLCHEQGFMIPDDPQTLRPLVMELALGQGYDLILTSG GTGLAPRDTTPEALLPIFERRLPGFEQAMMQASLAKTPTAAISRAVAGTLGRTIVITLPG SRKAVSENLAAILPALGHALEKLHGDPSDCGKRA >gi|316924091|gb|ADCP01000047.1| GENE 8 7587 - 8099 669 170 aa, chain + ## HITS:1 COG:NMB0317 KEGG:ns NR:ns ## COG: NMB0317 COG0780 # Protein_GI_number: 15676234 # Func_class: R General function prediction only # Function: Enzyme related to GTP cyclohydrolase I # Organism: Neisseria meningitidis MC58 # 3 141 1 138 157 124 42.0 6e-29 MAMTNGGDQTAQLTVLGTGRLPQPEGGPSASLLEVFPNRFPHRPYVVSMAFPEYTSLCPV TGQPDFGTIVVEFIPDQKCVESKSFKLYMFAYRNHQSFMESITNTILEDFVEALDPMWCR VKGLFSPRGATYLHVFAEHYKKLDDPAKAEEVRQAVADWKREAAPHTWDK >gi|316924091|gb|ADCP01000047.1| GENE 9 8102 - 8902 884 266 aa, chain + ## HITS:1 COG:BH0086 KEGG:ns NR:ns ## COG: BH0086 COG1521 # Protein_GI_number: 15612649 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # 
Organism: Bacillus halodurans # 4 261 3 250 254 141 33.0 1e-33 MHALLIDIGNTSLKIGVAGIEGLVASYTLPTDTQQSGDGLGLQLAYLLGHAGFGTPGTGP GNVALDGCVISSVVPGMNPLVAHACERFLELTPRFAHRDLPIPLENRYERPNEVGADRLV GAFGARRLLPDVRSVVSVDYGTATNFDCVTGNAYLGGLICPGVMSSLGALATRTAQLPRI ALTAHADVPIVGRSTVTSLNHGFLFGFASMTEGLYARLTKTLEGPVAFVATGGFAPDVAR VVDCFDLVRPDLVLEGLRLLWLESRD >gi|316924091|gb|ADCP01000047.1| GENE 10 9281 - 10249 606 322 aa, chain + ## HITS:1 COG:DR1207_1 KEGG:ns NR:ns ## COG: DR1207_1 COG0037 # Protein_GI_number: 15806226 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Deinococcus radiodurans # 17 222 57 258 441 137 42.0 3e-32 MAVGRFLKEKCGVSPHHVIVAFSGGADSTALAVILRCLGVSLVLAHLDHRLRPESGEEAE SARRFAERLGVPCMVRRVDVEALARSEGIGLEDAGRRARYAFFDSALLSERAEWVVTGHH LDDLSEDVLMRLVRGSGWPGLGGMKAVDPARRLLRPLLGTPRAELEAFLRSLGVDWIEDA SNRSDAFRRNRMRNHVIPLLKAENPSFSRSTRTLWELAREDEHYWNEVLAPVFAQLREEN GSLLLPRAAFVSLPRAARLRVYAGLFHRFGRGQAQSETLFRLDGAAVSSRSRKVFQFPGG VCITTDGDGVLMKTGGGGKNPF >gi|316924091|gb|ADCP01000047.1| GENE 11 10286 - 12214 1904 642 aa, chain - ## HITS:1 COG:aq_881 KEGG:ns NR:ns ## COG: aq_881 COG1154 # Protein_GI_number: 15606220 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Aquifex aeolicus # 30 631 26 614 628 609 51.0 1e-174 MNETDSTLPPLLARITHPSMVADLSLADKQILATELRQTIIRTVAANGGHLAPSLGVVEL TLALLSVFDPEEDKLVWDVGHQSYAWKLLTGRSENFHTLRQLGGISGFPKIEESPFDHFG VGHSSTSVSAALGMAMARDLAGKKNHVLAVIGDGSLTGGLAFEGLNQAGDMGRRLIVVLN DNEMSISHNVGALSLFLSRNMERGWARRVRREVKDWLKSIPGIGDEMAEYAHRTHRSLKT VFTPGMLFEALRFNYIGPVNGHNIEEVERHLRMAASIDDQPVLLHVLTRKGKGYPPAEAQ PSKFHGLGKFDVATGQTQPKPAGAPPTYTDIFGETLCRLAKEDDRIVAVTAAMSSGTGTG HFKSRFPTRFTDVGICEQHAVTFAAGLASQGYRPFVAIYSTFSQRAYDQIIHDVCIQKLP VTLCLDRAGLVGEDGPTHHGAFDLSFLRHIPNIKILAPRDEPELQAALITSLNLGQPLVI RYPRGSAPGRPLPNAEPLISLPPLPLGEGELLREGTDAVVIAVGSMVVAAQNAAERLFTE 
TGRSVAVFDARWIKPLPEKQLLDLVARFDRILFAEENALAGGFSSAVLELLVDNGTLRGQ RIKRIGLPDAFVEHGTQAQLRHRLGLDDEGVYLTLKALMEEK >gi|316924091|gb|ADCP01000047.1| GENE 12 12396 - 13304 1184 302 aa, chain - ## HITS:1 COG:alr0213 KEGG:ns NR:ns ## COG: alr0213 COG0142 # Protein_GI_number: 17227709 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Nostoc sp. PCC 7120 # 14 301 22 308 309 249 48.0 5e-66 MSTPYDSQQVKTILAERGAQVEAYLQHCLDAIPMQGRLKEAMYYSLLAGGKRLRPVLCLS TAALFGLSAEAVMPFAASLEMIHTYSLIHDDLPAMDDDDLRRGRPSCHKAFDEATAILAG DALLTDAFGFMASTVKDLPAGRVLEALASVSSAAGSAGMVGGQILDMDCTGKTNVPLETL QTLHALKTGAMFRVACVSGGLLAGASESDIASLRAYGEALGVTFQIVDDILDETADTATL GKPAGSDAEMGKTTYPSLMGLDRSRELAQEFAEKAVDSLSAFSGQDAAFLKGLALMLVTR NK >gi|316924091|gb|ADCP01000047.1| GENE 13 13651 - 13890 275 79 aa, chain - ## HITS:1 COG:no KEGG:DvMF_0124 NR:ns ## KEGG: DvMF_0124 # Name: not_defined # Def: exodeoxyribonuclease VII small subunit # Organism: D.vulgaris_Miyazaki_F # Pathway: Mismatch repair [PATH:dvm03430] # 3 79 24 100 100 72 57.0 5e-12 MAAKKTPSFEDRLRRLQEVVAALENGELPLEDSVRLYKEGLTLSRSCREQLEKARNEVRL LTEEGLEPFDEKKLDEEGE >gi|316924091|gb|ADCP01000047.1| GENE 14 13899 - 14756 1006 285 aa, chain - ## HITS:1 COG:Cj1235 KEGG:ns NR:ns ## COG: Cj1235 COG0739 # Protein_GI_number: 15792559 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Campylobacter jejuni # 106 284 89 267 273 166 43.0 5e-41 MISRKCRIALLAVCLTLLLCGGALAASLTIHAPEEAKQGSAVKVRVVIDEQLPQLTFTWL GKNIAAPTVADGGHRIAEALLPVPVNADKDLIVQATAGTLTGKATVKVLKVKWPEQQISV KKTFVNPPKEAMKRIDAERKKSSEAINRVTPERYWRGEFQRPVPGVVTSAFGGRRMFNGE LRSYHRGVDLRGAEGTPIKAVADGKVAIAQNMYFAGNTVYLDHGQGVVSSYAHMSRLDVK PGEMVKAGQQIGLVGATGRVTGPHLHLGVNILGVAVDPLSLVPKQ >gi|316924091|gb|ADCP01000047.1| GENE 15 14743 - 16215 1434 490 aa, chain - ## HITS:1 COG:STM2512 KEGG:ns NR:ns ## COG: STM2512 COG1570 # Protein_GI_number: 16765832 # Func_class: L Replication, recombination and repair # 
Function: Exonuclease VII, large subunit # Organism: Salmonella typhimurium LT2 # 2 480 8 445 449 264 35.0 2e-70 MSILTVRQLTAQIKDAVESGFPYVWVRGEVTNVSRPSSGHIYFSLKDENALLQAVWFKGS QKERETFDPLTGEVFEDGPRTSLAGTLRNGQQIICAGRLTVYAPRGGYQLVVELAQDSGE GQLHLALEALKRKLAEKGYFSLERKRPIPEHPHRVAVITAPSGAAIRDFLRLSGERGTGC EIRIHPVPVQGDDAPPRIAEALDDENRRGWADVLVLIRGGGSLQDLWAFNDERVADAVYR SRIPVVAGIGHEVDTSIADMVADLRAATPSHAALLLWPERQWYAQLVDDLEANLLEAAER RLSQADQRLDTLGRALAWLSPERGLARLEERFGTLARRLDFALEQKLERTDARLRFLEAG MSRYAESRVLDTRLEQTAALTRRLRQAQALRLEQIGGRLDTASSRLEERFARQFEGLERQ LERHDLRLRGLDPEGPLERGYAYAFTADGHFVRSIRDVQPGASLTVKIRDGEADTRVTAV RATGESHDIP >gi|316924091|gb|ADCP01000047.1| GENE 16 16529 - 18253 2488 574 aa, chain - ## HITS:1 COG:CAC3178 KEGG:ns NR:ns ## COG: CAC3178 COG0442 # Protein_GI_number: 15896426 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 548 1 545 570 588 50.0 1e-167 MRWSQWYIPTLKEAPGDAEVISHKLLIRAGMIRKLTSGIYTWLPLGLRTLNKAANIVREE MDRAGAQEILMPTVQPADLWRESGRWEHYGKELLRFKDRHERDYCLGPTHEEVVTDLIRG EVRSYRQLPMNLYQIQTKFRDEIRPRFGLMRGREFLMKDAYSFDRDDAGADISYKKMFEA YKKIFSRMALQFRPVEADSGSIGGNFSHEFMVLAETGEDTIAYCTGCDWSANIERAAVLP PAKACEETCSAIEEVTTPDQHTIEEVCGFLKVPAQKLIKTLLYVVDGKPVAALVRGDREL NEIKLKNYFKADEVVLASPEQVTQWTNAPVGFAGPVGFTAGPIVADHELMADTDWIAGAN KADTHILHVDLKRDVPAFAYADLRSIAEGDVCPRCGKPVAFAKGIEVGHVFKLGTKYSTA LNAIYLDENGKEQTIIMGCYGIGVSRVVAACIEQNNDGDGIAFPPPLAPFDLELLNLDPK NADTAAKADELYDLLTGMGLDVLLDDREERPGVKFKDADLVGLPMQVVVGGKGLARGIVE VKNRKTGEKGELPVEGFAEAFAAWRKSVLACWGM >gi|316924091|gb|ADCP01000047.1| GENE 17 18715 - 19593 940 292 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2897 NR:ns ## KEGG: Dvul_2897 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 16 292 12 302 302 216 41.0 8e-55 MIARQFRGFTVAACLMVLAAFGMPDTSWAWESDGPRFNIYTQGYEKADVDEGGSYSKTET GVSAGYKWFTLAYKRSDFSWSHSDSVNFSKKGSPWDQLNKLTLDASFNGALGESVNWFAG GSIISGFEDQIWDSFTFAPRGGLTFSPTYDLKFHVGVAGLISPVRPLVMPIVGAEWRNEH 
DYGLSGIIGFPGTRVQYRFNDLLAARVAAKWDRDIYRLSNDSSVAGKGYVEESGYTGGAY LDITPIADLKLTVGAELLFDRQLRLYDKGGDEFSKTDVDRALGAVLRASYSF >gi|316924091|gb|ADCP01000047.1| GENE 18 20024 - 20617 629 197 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRFFTFACLGVAGVMMMASNALALDDFSTFTMKGVGNDVDPFELSCKQYNGLDAGTRRL VTAFLEGYAAEGASTVFVPEAVKNFESGIMAVCAKNPGKTLDGILEEVQYDVPEGGSEPR CSDLDGMTETEDLAQLLLWTQGYLESELSENEEADQDQAIHQDMFREDVAEVLAQCKGGG SDAKLIDVMRKVIMGEE >gi|316924091|gb|ADCP01000047.1| GENE 19 20700 - 21074 334 124 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLLAFEVVLFGGLAFMFFCILRNQDSMLKTMREEHGAILSTLAKLEKRIATLQQLEASIA INPLAVQPPYPPEEKQELPQTWDRAPEPAEKPSSPLGGLSMDPPSSLGYSDDNPPKGGLP ELKL >gi|316924091|gb|ADCP01000047.1| GENE 20 21208 - 22566 1304 452 aa, chain + ## HITS:1 COG:no KEGG:DVU1407 NR:ns ## KEGG: DVU1407 # Name: not_defined # Def: radical SAM domain-containing protein # Organism: D.vulgaris # Pathway: not_defined # 1 438 1 437 491 663 70.0 0 MASPSIRPHMLVADEQGNIYDHPDFLLLCRRGEEWSLPRPDELMPLPEESELFLLPGRRA VGLNPETGETEVMEDWAVAAFAAPAHTLTAHPVYMTDEGAPMLPLFAYGAVGFANGRFYV CAKKVDEDVRQVFKGISRGKIDRSARKIIEDFPDNRLMQHIMQNCTLRYGCPAAKNLSLG RYEAPLPTSRTCNARCIGCISQQEEGSKICATPQCRLTFTPTPEEVVEIMRFHAGRETEK PVFSFGQGCEGEPLTEAPLLIESVRRYREAGGHGTINLNSNSSRPQAIAELAEAGLTSLR VSLNSARPEVYERYYRPHGYTFGDVRQSIIEARSRGVHVAVNLLYFPGITDTEEEIEALI ELFQSTGISLVQLRNLNIDPEFYPSLLEGISFGPSVGLNNFRKRIRRACPWITYGYFNPY LGDKADLGDTPMPGEWKPAPLEVSEAEEESGK >gi|316924091|gb|ADCP01000047.1| GENE 21 22991 - 24604 2001 537 aa, chain - ## HITS:1 COG:slr0712 KEGG:ns NR:ns ## COG: slr0712 COG0392 # Protein_GI_number: 16331959 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Synechocystis # 13 318 7 318 322 161 35.0 4e-39 MLKQLLLALKRLLRALGPVLVGCIFLGAVYLLYREISKYSLADIRMSLAQISTGSIILSV LLAIINYIILIGYDWLALKGIHKTLPVSRVSLVSFVGQAVSYNFGALLGGSTVRFRFYSS WGFSPMDIVRLVLMLAITFWVGALGLVGAIFMIAPPEIPPELGMHMPLDIRPLGAILFLI AISYLIVCKFIHKPIHIFGKEFAFPPFKIAVAQAVVAGADLVAAGACLYVLLPPDAHVSF LQFLPTYLMAMVAVVLTHVPGGAGVLEVVILHLTTASPQAVFAALLCFRVIYYLLPLLLA 
AVIFAIYEVRQQAIQESGVLHDAGRWMRAFAPTIMASAVFGIGAILCFYVVLPVSPERLA QIREWIPLGIVEFASMATGVAGVMLLFLTRGIQHRQRAAFRLAIAMLCIGIIGPLLHSLS WFVALMSLIVLLSVLSIRRKCCRPSSLWKLHLTPSWLFAIVSVLVCSAGLGLLIYHMDPS DPVLWTSSDYAADAARLFRTFAAEALLLICIAVGYMRTAPIRKRWNAVRRFGRRNKQ >gi|316924091|gb|ADCP01000047.1| GENE 22 24893 - 25375 700 160 aa, chain + ## HITS:1 COG:mll2568 KEGG:ns NR:ns ## COG: mll2568 COG0782 # Protein_GI_number: 13472314 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Mesorhizobium loti # 1 155 1 156 157 132 49.0 2e-31 MSSIPISKEGFQALEQELDRLKKERPHVIQAIKEAREEGDLKENAGYDAARERQGMLEAR IKYIESRMALFNVIDLSTLSSEKAIFGATVTVEDADTGEEREFTLLGPDEADYAKGTISI QSPVGVALLGKEVGDEITVNAPRGRINYEIIDIQFKKKTA >gi|316924091|gb|ADCP01000047.1| GENE 23 25506 - 26903 546 465 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 168] # 7 388 3 362 451 214 34 5e-55 MHYAPWTFCILTFGCKVNQYESQSVREAWQRMDGAETDAPAEADVILLNTCAVTANAVTD ARQAVRRLQREAPGVPVVIAGCAAEVARKQLAALPGVLRVVEQDHKSRLLDAPPLVLFAE EAAGSATPFPSSPAEAAARAATRDRTFPPFHIDGFRRARPVLKVQDGCSHGCAYCIVPLT RGPARSRPPKDCLAEMRRLLEAGYREIMISGINLRQYAMRDEGCRDFWDLLSYLDRELAP EWGSGARPDPARFRISSVEPAQLTERGIATLAETSMVCPHLHLSLQSGSADVLKAMRRGH YTPEALLSAVEGVAKLWPRFGLGADILMGFPGETEAHVLETLEVVRSLPLTYAHVFPYSA RPGTVAAELPDQVGKAVRQERAARVRAVVEAKREAFWKDTLTCERLLVALDCNEDAGSAS GQHGVDECYVPCRLRTPLRGEGHTLIPVRPVSVSRKGVIVEPLRP >gi|316924091|gb|ADCP01000047.1| GENE 24 27293 - 27559 274 88 aa, chain + ## HITS:1 COG:jhp0766 KEGG:ns NR:ns ## COG: jhp0766 COG0724 # Protein_GI_number: 15611833 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Helicobacter pylori J99 # 3 81 2 80 82 98 59.0 2e-21 MAKSIYVGNLPWSATEEQVQSLFADYGPVVSVKLVSDRETGRARGFGFVEMEEPGASAAI EALDNANFGGRTLRVNEAKPRAPRPPRY >gi|316924091|gb|ADCP01000047.1| GENE 25 28545 - 29726 1490 393 aa, chain + ## HITS:1 COG:ECs4466 KEGG:ns NR:ns ## COG: ECs4466 COG1454 # Protein_GI_number: 15833720 # 
Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 9 390 5 380 383 361 50.0 2e-99 MSVPGKMHSFFVPGISLIGVGACHGIPHYVHTYHITKPLIVTDKGIVNTGILKIITDILD GDNIPYSIYDKTVPNPTDENVTEGVAQYKADGCDGLISIGGGSSHDCCKGIGILLSNGGA IYQYEGVDKVNHMLPHYIAINTTAGTASELTRFCVITDKERHVKMSIVDWFLTPNVAVND PVLMVGMPPSLTAATGMDALTHAIEAYVSTNTTPMTDACAEQSIRLVAKYLRKAVANGKD MEAREGMAYAQYLAGMAFNNAGLGYVHAMAHQLGGFYSLPHGECNAILLPHVESFNLISR LDRFVRIAQMMGECTDGLSERAAAELAISAIKTLSKDVGIPTTITELAARYVKAIDPRDI PAMVGHAQKDTCAATNPRTMSLEIISQLYKDVF >gi|316924091|gb|ADCP01000047.1| GENE 26 29868 - 30059 91 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYFKRVTYTRHQSGGHGGHEKKRNSPFSQPIPAPPGGPKPSGYPAVPSKITLPIDSKQIP ALA >gi|316924091|gb|ADCP01000047.1| GENE 27 30274 - 32226 1702 650 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. McKiel] # 1 647 1 635 636 660 52 0.0 MNQFTRNLMLWGVISLAMVIVFNLFSQPQHSSQPSVTYSEFLRQAAKGEVSEVVIQGNTL TGKTTDGKSFQIYVPNDPGLVDKLIAEKVEVRAEPVEDSPWYMTLLVSWFPMLLLIGVWI FFMRQMQGGAGRAMSFGRSRARMLNQEQGKVTFDDVAGVDEAKEELSEVVDFLSNPRKFT RLGGRIPKGVLLVGPPGTGKTLLARAVAGEAGVPFFSISGSDFVEMFVGVGASRVRDLFV QGKKNAPCLIFIDEIDAVGRQRGAGLGGGHDEREQTLNQLLVEMDGFESNEGVILIAATN RPDVLDPALLRPGRFDRQVVVPTPDVKGRLKILEVHTRRTPLDKHVNLEVIARGTPGFSG AALENLVNEAALQAARLGQDTVFMRDFEYAKDKVLMGKERRSLILSDEEKRITAYHEGGH ALVAKLLPGTDPVHKVTIIPRGRALGVTMQLPEGDRHGYSKAFLQNNLMVLLAGRVAEEI IFDTITTGAGNDIERATGMARKMVCEWGMSDVVGPMTIGEQGEEVFIGRDWGHARNYSED TARIVDAEIKKLVETARENCHKLLQENINLLHALAKALLDRETITGDDIDLLVKGEPLPP FDADGSAAKQEPAAPAPASADAEVETFKLEAEPSQDGGTGEKTQDETKQQ >gi|316924091|gb|ADCP01000047.1| GENE 28 32285 - 33136 737 283 aa, chain + ## HITS:1 COG:VC0638 KEGG:ns NR:ns ## COG: VC0638 COG0294 # Protein_GI_number: 15640658 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Vibrio cholerae # 22 279 18 274 278 206 41.0 4e-53 
MNNERKWILRGGRVLNPAPFCLIGIVNVTPDSFSDGGKYIDPRNAVAHGLRLLDEGAGML DLGAESTRPFAEPVPEGEEIARLMPVVARLREQRPDAVLSVDTLKAGTARAALEGGADII NDVSACVADPALLDVLAEYKPGYVLMHSQGSPREMQVNPRYGNVVEEILAFFEEHLARLV KAGLPEDRIVLDPGIGFGKNKDHTVAILKGLERFASLGRPLYVGLSRKSMFREILGLELE QRAEATRLAVALLAARGIPYHRVHDVAGCAQALRLVEAMTPLA >gi|316924091|gb|ADCP01000047.1| GENE 29 33179 - 33916 783 245 aa, chain + ## HITS:1 COG:BS_ybbP KEGG:ns NR:ns ## COG: BS_ybbP COG1624 # Protein_GI_number: 16077243 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 7 239 11 249 273 182 42.0 4e-46 MDFLSFYWRDVLDIALVTLLLYRILLLIRGTRAFSALIGLVVLVAVYALSHFIGLYSLSW LLENLFGSLFLVIVILFHDDIRQALAAMGTQYLSWKKRPAPSNMVGDLVWVCQYFAKRRI GALIVLEGKVQLGDMMKGGVVLDARISRELLLTIFFPNTALHDGAVIIRKGRIAAAGCIL PLAQMDRQNFGTRHRAALGATEVSDATVIVVSEERGEVSVASKGHLAVMPDAEQLKETLN NVIEH >gi|316924091|gb|ADCP01000047.1| GENE 30 33900 - 34841 693 313 aa, chain + ## HITS:1 COG:no KEGG:LI0191 NR:ns ## KEGG: LI0191 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 73 309 1 238 240 168 37.0 2e-40 MLSNTKDPQKGNWRYLLVAFLLALGLWYTLNAREQIERVVEVRLDYKGLPTGLIVTGGQL NKVSVRLRGPQELLRSMTNREISYTMDLSGVTPGKNVIPLTTGENKPPELRAYEVLEVTP SRMILEVDKIMETNLPVKVALRASPAASSVRLKDLVVDPPQVTVRGPASVIASMKEIQAE IPVDLAAEGKAVSEEVPLLAPPAVELNPQVVKVTWKIDVKRRTLSLQRDIIFEGENPNVS AQPSRANLMVSVPQAMVKDAGYLAQFQVSIPSDTAMPAEDGAVNAPLQVAVPQGGRVLKI SPETVSISRHPSE >gi|316924091|gb|ADCP01000047.1| GENE 31 34889 - 36241 1257 450 aa, chain + ## HITS:1 COG:mll3879 KEGG:ns NR:ns ## COG: mll3879 COG1109 # Protein_GI_number: 13473323 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Mesorhizobium loti # 1 447 1 445 450 454 51.0 1e-127 MSVRLFGTDGMRGRVNKYPMTPEVALRLGLAAGTFYRDKNRRSRVVIGKDTRLSGYVFEN ALAAGLLAAGMDVFLVGPLPTPAISFLTTNMRADVGVVISASHNPFSDNGIKFFDAEGFK IPDADEDRMTEMVLDPNFDWNYPEPANTGRAYKIKDAPGRYIVYLKNSFPAHLSLEGLRV VIDCANGANYRVAPLALEELGAEVVKIGTEPNGLNINYRCGSLYPEAVAAKVRETRADIG 
LALDGDADRLIVVDEKGTVLDGDQIMALCAQDLMQQGKLPGNILVATVMSNMALEVFMKE RGGTLIRTNVGDRHVVAAMRQQGALLGGEQSGHLIFREYSTTGDGLLAALQILRIMRQRK RPLSELAGLLVPYPQELRNVHVEHKIPFEENAEIADAVARIEEGLEGRGRVLLRYSGTEP LCRVMVEGQDADKVRVYANELAGIVEKALG >gi|316924091|gb|ADCP01000047.1| GENE 32 36498 - 37361 813 287 aa, chain + ## HITS:1 COG:CAC2335 KEGG:ns NR:ns ## COG: CAC2335 COG1210 # Protein_GI_number: 15895602 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose pyrophosphorylase # Organism: Clostridium acetobutylicum # 1 275 1 275 303 267 46.0 2e-71 MNIRKVVIPVAGWGTRSLPASKNIPKEMLPIYNKPVVQYVVEEAIRAGVEDVIFVTNRDK SVIEDHFDYNLQLEGVLERAGKLKMLKEVRDVAEMVNIMSVRQKKQLGLGHAVLCAKEMV RDEPFAVMVGDDFMIGDDAGLTQLVQVASEHDMPVIGVMEVPADKVNRYGIVMGEEITPG VYNITDMVEKPAIGTVDSRLAIVGRYVLTPDIFEHLARVKPGHGGEIQLTDALASLARER GMLAVKMGGIRFDAGDWVDYLTANIYFGLRDEKLRDGLKARLRELLD >gi|316924091|gb|ADCP01000047.1| GENE 33 37573 - 39924 1544 783 aa, chain + ## HITS:1 COG:BS_priA KEGG:ns NR:ns ## COG: BS_priA COG1198 # Protein_GI_number: 16078634 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Bacillus subtilis # 249 783 268 805 805 404 39.0 1e-112 MLRTVALLTPPFASLTYAVPEWLAGFAWCPGLRVAVPLGRGMLRIGVVLGDGSPLPEGVT ARPMLWPLEKEPLLPSGHMEMVRQLALRQAVTPGQILAMVLPAGLRVTQMRLRTMEAQGK AQIRLLKDLPKLPVEVLAALGEAWMRGGAELLGPREDAAASELCVLRCDPPWAVRPTATR QIELLEFLLEHGAVSRREVLRQMGQGVAPALESLLKNGLIGLQCLEAGCAEEETGCNLLP PASEALFALSEAQEAALASFRADLDAGNPASHLLFGVTGSGKTAVYMELAKECLRRGRSM LLLAPEVALALKLRRDASLALPDVPLYFFHGYQSAALREKTFRELAKRREPCLVVGTRSA LFLPLPSLGAVVLDEEHDSSFKQDEGLTYQAKEVAWFRIAQAKGLLVLGSATPDLKTFYA VREAKIPVSTLPARVGGGTLPSIRLVDIRSMNCVESILAPETLSALKQTVEQGDQAVVLL NRRGYAPLMYCIDCGKVARCPHCDIGLTYHKGRERLVCHYCGYSVPFPSPCPSCKGLHFH PMGQGTERVEEYIGTLLPPGGRVLRLDRDSTRRPGRMEEILESFARQEAQVLVGTQMLSK GHHFPHVTLAVVADGDIGLNLPDYRAAERTFQLLVQSSGRAGRGEKPGQVIIQTRDVNHY CWQYVKNGDYEGFYDYEIALRKRRRYPPFINLALLRISYPMDWADGPTQLARITALLRSE 
SKGVTVLGPAPAPLPLLRGRRRFQCLLKAGDWQSIRALYAVLLPLASPPNLRISLDIDPV NML >gi|316924091|gb|ADCP01000047.1| GENE 34 40100 - 40555 280 151 aa, chain - ## HITS:1 COG:FN1295 KEGG:ns NR:ns ## COG: FN1295 COG0454 # Protein_GI_number: 19704630 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Fusobacterium nucleatum # 15 151 1 130 135 61 30.0 5e-10 MEIRKPGAAELKPALALAWDVFLKDAASGFSDEGIRTFKRFIEYDSMAATLANGTTTMWA AFSGLEPIGVIAARESHICLFFVAGPHQRHGIGRRLFETFRSRRLSLAPQAPLTVNAAPS AVRAYQRLGFVRTGGEQVAGGIRFVPMKHTL >gi|316924091|gb|ADCP01000047.1| GENE 35 40452 - 40661 92 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPSSEKPDAASLRKTSHARAKAGLSSAAPGFLISMRFRSLCSLFKERVQAFFNKGKGCAM PLDESGRVV >gi|316924091|gb|ADCP01000047.1| GENE 36 40687 - 40896 234 69 aa, chain + ## HITS:1 COG:no KEGG:DVU1567 NR:ns ## KEGG: DVU1567 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 69 1 70 70 71 51.0 9e-12 MWGSFSRARKVQTTVCCKHCGMPLFIERSCHEVHMFCPKCGKAFPLSEYIALADEAMEEF LSSVYCDRM >gi|316924091|gb|ADCP01000047.1| GENE 37 40994 - 41608 723 204 aa, chain + ## HITS:1 COG:BU383 KEGG:ns NR:ns ## COG: BU383 COG0293 # Protein_GI_number: 15616987 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 23S rRNA methylase # Organism: Buchnera sp. 
APS # 1 191 15 204 206 139 38.0 3e-33 MKEYRDHYFLKAKQENYPARSVYKLKEIDGRFKIFRKGMKVLDLGAAPGSWSLGAAERVG AQGKVLACDIQSTVTVFPPNVEFHQEDVFQRSEAFERQLAETGPFHVVMSDMAPQTTGTK FTDQARSLELCLEALAVAEKYLVKGGSFVVKIFMGPDVGELLKGMRPRFARVTSFKPQSS RVESKETFFVGLGFKGKEADGARD >gi|316924091|gb|ADCP01000047.1| GENE 38 41737 - 42483 900 248 aa, chain + ## HITS:1 COG:PA0964 KEGG:ns NR:ns ## COG: PA0964 COG0217 # Protein_GI_number: 15596161 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pseudomonas aeruginosa # 1 246 1 247 248 241 54.0 7e-64 MSGHSKWANIQHRKGRQDAKRGKEFTKAAKEIIIAAKNGGDPASNSRLRAAIAAAKSINL PKDKIEAAIRKGTGQDAGGDIIEINYEGYGPGGVAVIVETATDNRNRTVAEIRHLMSKGG GSMGENGCVSWKFERKGVIQFSKEKYTEDQLMEAALEAGADDLRDEGDVWEIQTAMADFN SVREAFEAAGLEMISAELNQVPQTTMEVDLETARKLLRFIELLEDNDDVQNVYSDADISD EIMAQLED >gi|316924091|gb|ADCP01000047.1| GENE 39 42710 - 43219 522 169 aa, chain + ## HITS:1 COG:VC1847 KEGG:ns NR:ns ## COG: VC1847 COG0817 # Protein_GI_number: 15641849 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Vibrio cholerae # 7 169 4 163 173 121 44.0 7e-28 MSQSITIIGIDPGSRNTGWGVVREVSGVLQLVDCGVIRPPLDGEFASRLGVIFKEMHRLL GRLKPDEASVEQVFTAKNAATALKLGQARGAAIAACAAYDLPVRDYEPTVIKKSLVGVGR ADKEQVSFMVGRVLGVKPDWAVDTGDALAAAICHLTHRRFERLVRLSGR >gi|316924091|gb|ADCP01000047.1| GENE 40 43344 - 43709 454 121 aa, chain + ## HITS:1 COG:no KEGG:LI0267 NR:ns ## KEGG: LI0267 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 15 121 15 122 122 114 50.0 1e-24 MMARFTKVVFAAALVCALCLPAVAAQSATDLTGKHWLQSSQNEKLAFLYGASNIIAIEQL IAQQQGTQASPFVTAWIKAFGNTNWTDIQKKLDAWYAAHPDQANREVFDVLWYEFMVPAS K >gi|316924091|gb|ADCP01000047.1| GENE 41 43764 - 43988 274 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRLSAILLVGALVFGTVGCTNMNKTQQGLLSGAAGGAILGAGLGALTGGSGTTGALIGG GLGALAGGIYGHSK Prediction of potential genes in microbial genomes Time: Fri May 13 02:30:24 2011 Seq name: 
gi|316924087|gb|ADCP01000048.1| Bilophila wadsworthia 3_1_6 cont1.48, whole genome shotgun sequence Length of sequence - 4218 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 219 - 1319 1590 ## COG0012 Predicted GTPase, probable translation factor - Prom 1342 - 1401 1.6 2 2 Op 1 . - CDS 1507 - 2091 797 ## COG0344 Predicted membrane protein 3 2 Op 2 . - CDS 2101 - 4185 2363 ## COG0557 Exoribonuclease R Predicted protein(s) >gi|316924087|gb|ADCP01000048.1| GENE 1 219 - 1319 1590 366 aa, chain - ## HITS:1 COG:Cj0930 KEGG:ns NR:ns ## COG: Cj0930 COG0012 # Protein_GI_number: 15792259 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Campylobacter jejuni # 1 366 1 367 367 446 58.0 1e-125 MALSIGIVGLPNVGKSTLFNALTKAQNAQAANYPFCTIEPNKATVPVPDRRIDALVDLVH PQKTINATVDFIDIAGLVRGASKGEGLGNQFLGNIRECAAILEVVRCFEDDDITHVDGSV DPLRDIETIETELLLADIQGTSKRHERMVKMSKGDKDARVAADEMARLLEHLNNGNPAST FEAKDCPAFAAAWHELGLLTAKKIIYCANVDEDSLAEDNAHVTRLREFAATRNIEVVKIC ARIEEELQGLSDEEQREMLGAYGIEESGLLRVIHSGYKALGLASYFTAGEKEVRAWTIQD GWKAPQAAGVIHTDFERGFIRAEVIGFDDYVKYKTEAACRTAGVLRTEGKEYVVKDGDVM HFLFNV >gi|316924087|gb|ADCP01000048.1| GENE 2 1507 - 2091 797 194 aa, chain - ## HITS:1 COG:FN0537 KEGG:ns NR:ns ## COG: FN0537 COG0344 # Protein_GI_number: 19703872 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 6 191 6 192 194 114 40.0 1e-25 MFDILWICLTYCIGSVPFGLVFAKTFCRIDPRTAGSGNVGATNVARLCGKAWGAATLACD LLKGTIPVFIAMQYSTSELVWTLTALAAILGHLYSCFLGFKGGKAVATSIGVLIPLAFWQ LLIAVCICLFLIWRSGFVSLGSLALVTAMPILLALSGKFGLLPLSLIILALVFWSHRENI RRLAKGEEKTWIKK >gi|316924087|gb|ADCP01000048.1| GENE 3 2101 - 4185 2363 694 aa, chain - ## HITS:1 COG:sll1290 KEGG:ns NR:ns ## COG: sll1290 COG0557 # Protein_GI_number: 16329795 # Func_class: K Transcription # Function: 
Exoribonuclease R # Organism: Synechocystis # 27 690 29 660 666 125 24.0 4e-28 MVQGLVRYPGAGCVVEFMQGNAPQIAWVLEEQNGRLRLLLPNRRETALQAARILPWPGPA YEKNCSRDAALEILERHKSRREVANVDPLELWELAQGEVEQAPAQWFAELAMSEPDMDAV AACGHALMQAKSHFKFNPPNFEVYPESVVATRMAELEAARRREELVNKGSAFIRLLWEIH QKKSTQSPGRAAESLDPDVRERLRRTIMNRIADPETSEDDGLWKLMVKGLPDDPFMPLYL AQAWGLVEPHHNYWMDRAGYAPGNGWCEEHRDELDALLAQAAEDERQEPTFPDRPIISID APTTRDVDDAFFIEARPDGGWNLTLALACPAFRWPFGGKLDKAVFNRATSIYLPEATHHM LPEALGTGAYSLLAQKTRPSLLIECTVGADGLVAACEPRVGYARLAANLCYEDCETALDG GESPASPYLEQLRQALELARAHQERRIEKGAVIIERPDITINLEGAGENITVSLDEDPLA PKAHLLVSELMVLTNAALAAWAKEHGVTLLHRTQDVAIPKEFSGIWQTPLEIARVVKALA PAVLETNPRPHAGLGEAMYAPSTSPLRRYPDMINETQIISLLRDGKPRWSKEELDTLLPL LNAHLDAAGQVQRFRPRYWKLLYFKQQGDRWWPAVITDENDAFVTVNMPKEQMIVRARRQ FFGERTHPGQELEIRLGKVHPLQNDFQLLETREI Prediction of potential genes in microbial genomes Time: Fri May 13 02:31:01 2011 Seq name: gi|316924016|gb|ADCP01000049.1| Bilophila wadsworthia 3_1_6 cont1.49, whole genome shotgun sequence Length of sequence - 76731 bp Number of predicted genes - 74, with homology - 67 Number of transcription units - 40, operones - 20 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 2 Tu 1 . + CDS 1701 - 2072 259 ## LI0201 hypothetical protein + Prom 2085 - 2144 5.6 3 3 Op 1 . + CDS 2237 - 3517 1790 ## COG0104 Adenylosuccinate synthase + Term 3596 - 3633 6.2 + Prom 3618 - 3677 1.6 4 3 Op 2 . + CDS 3754 - 4824 869 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain + Term 4898 - 4951 12.6 + Prom 5072 - 5131 2.0 5 4 Tu 1 . + CDS 5250 - 6437 1510 ## COG0535 Predicted Fe-S oxidoreductases + Term 6445 - 6485 12.7 + Prom 6507 - 6566 2.0 6 5 Op 1 . + CDS 6590 - 7582 1454 ## COG0113 Delta-aminolevulinic acid dehydratase 7 5 Op 2 3/0.000 + CDS 7582 - 8763 1323 ## COG0535 Predicted Fe-S oxidoreductases 8 5 Op 3 . + CDS 8760 - 9224 596 ## COG1522 Transcriptional regulators + Term 9351 - 9406 21.2 - Term 9343 - 9389 14.2 9 6 Op 1 . 
- CDS 9445 - 10242 831 ## DVU0864 glycoprotease family protein, putative 10 6 Op 2 17/0.000 - CDS 10239 - 11360 1608 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 11 6 Op 3 15/0.000 - CDS 11361 - 12569 1184 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase - Term 12577 - 12609 2.0 12 6 Op 4 32/0.000 - CDS 12809 - 13615 1081 ## COG0575 CDP-diglyceride synthetase 13 6 Op 5 . - CDS 13612 - 14346 884 ## COG0020 Undecaprenyl pyrophosphate synthase 14 6 Op 6 . - CDS 14444 - 14680 77 ## 15 6 Op 7 33/0.000 - CDS 14711 - 15271 898 ## COG0233 Ribosome recycling factor 16 6 Op 8 . - CDS 15277 - 15993 1032 ## COG0528 Uridylate kinase 17 6 Op 9 5/0.000 - CDS 16008 - 16796 1031 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 18 6 Op 10 4/0.000 - CDS 16793 - 17947 1282 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 19 6 Op 11 . - CDS 17956 - 18936 1274 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Term 18954 - 18984 4.1 20 7 Op 1 38/0.000 - CDS 19006 - 19896 440 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts 21 7 Op 2 . - CDS 19916 - 20701 1153 ## PROTEIN SUPPORTED gi|218888008|ref|YP_002437329.1| 30S ribosomal protein S2 - Prom 20834 - 20893 3.5 22 8 Tu 1 . + CDS 20585 - 20827 109 ## + Term 20896 - 20939 5.1 23 9 Tu 1 . + CDS 21056 - 22006 950 ## COG1575 1,4-dihydroxy-2-naphthoate octaprenyltransferase 24 10 Op 1 . + CDS 22151 - 22669 774 ## COG4803 Predicted membrane protein 25 10 Op 2 . + CDS 22745 - 23695 799 ## COG0726 Predicted xylanase/chitin deacetylase + Prom 24265 - 24324 5.1 26 11 Tu 1 . + CDS 24442 - 24837 327 ## + Term 24879 - 24916 2.9 27 12 Op 1 . + CDS 25115 - 26557 1927 ## DMR_25930 hypothetical membrane protein 28 12 Op 2 . + CDS 26591 - 26764 333 ## DMR_25940 hypothetical protein + Term 26920 - 26961 9.4 29 12 Op 3 . 
+ CDS 26981 - 27202 209 ## gi|302863291|gb|EFL86223.1| hypothetical protein HMPREF0326_01926 + Term 27382 - 27432 19.2 - Term 27368 - 27420 18.8 30 13 Tu 1 . - CDS 27568 - 28926 2075 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase - Prom 29047 - 29106 3.3 31 14 Op 1 5/0.000 + CDS 29386 - 30012 950 ## COG0035 Uracil phosphoribosyltransferase 32 14 Op 2 . + CDS 30129 - 31388 804 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Prom 31471 - 31530 1.5 33 15 Op 1 . + CDS 31653 - 32936 1373 ## COG1058 Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA 34 15 Op 2 . + CDS 32950 - 33717 508 ## COG2199 FOG: GGDEF domain + Term 33777 - 33812 7.2 - Term 33765 - 33800 6.4 35 16 Op 1 . - CDS 34027 - 34449 612 ## Ddes_1583 hypothetical protein 36 16 Op 2 . - CDS 34611 - 35579 954 ## COG0348 Polyferredoxin - Prom 35721 - 35780 4.4 37 17 Op 1 11/0.000 + CDS 36177 - 37181 297 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 + Term 37199 - 37236 9.2 38 17 Op 2 11/0.000 + CDS 37286 - 37858 719 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 39 17 Op 3 . + CDS 37869 - 39158 721 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 + Term 39182 - 39215 4.3 40 18 Tu 1 . + CDS 39494 - 40411 752 ## DSY1098 hypothetical protein + Term 40427 - 40458 -0.8 + Prom 40534 - 40593 1.5 41 19 Op 1 4/0.000 + CDS 40650 - 41120 354 ## COG0350 Methylated DNA-protein cysteine methyltransferase 42 19 Op 2 . + CDS 41104 - 42315 756 ## COG0122 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase + Term 42352 - 42393 -0.5 43 20 Tu 1 . - CDS 42520 - 43296 786 ## COG0483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family - Term 43421 - 43454 6.3 44 21 Tu 1 . - CDS 43664 - 43948 167 ## LI0028 hypothetical protein - Prom 44123 - 44182 4.0 45 22 Tu 1 . 
+ CDS 44168 - 45124 531 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) + Term 45153 - 45195 8.6 - Term 44579 - 44627 0.1 46 23 Tu 1 . - CDS 44830 - 45384 86 ## - Prom 45483 - 45542 3.7 47 24 Tu 1 . + CDS 45401 - 46513 1108 ## COG4521 ABC-type taurine transport system, periplasmic component + Term 46537 - 46585 15.5 + Prom 46853 - 46912 8.9 48 25 Op 1 9/0.000 + CDS 47107 - 48210 362 ## PROTEIN SUPPORTED gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 + Term 48223 - 48266 12.1 49 25 Op 2 . + CDS 48286 - 50151 838 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 + Term 50184 - 50219 6.0 - Term 50551 - 50578 -0.1 50 26 Tu 1 . - CDS 50617 - 51441 1158 ## COG2998 ABC-type tungstate transport system, permease component - Prom 51500 - 51559 3.2 + Prom 51480 - 51539 3.5 51 27 Tu 1 . + CDS 51670 - 52599 931 ## COG1897 Homoserine trans-succinylase 52 28 Op 1 . + CDS 52910 - 54187 1010 ## COG2873 O-acetylhomoserine sulfhydrylase 53 28 Op 2 . + CDS 54224 - 54730 249 ## COG1959 Predicted transcriptional regulator 54 28 Op 3 . + CDS 54760 - 55341 418 ## DVU1990 hypothetical protein + Term 55386 - 55439 -0.2 + Prom 55388 - 55447 3.5 55 29 Op 1 25/0.000 + CDS 55473 - 55814 501 ## COG1862 Preprotein translocase subunit YajC + Term 55831 - 55885 15.5 56 29 Op 2 31/0.000 + CDS 56011 - 57597 1765 ## COG0342 Preprotein translocase subunit SecD 57 29 Op 3 . + CDS 57663 - 58769 1355 ## COG0341 Preprotein translocase subunit SecF + Term 58873 - 58931 15.1 - Term 58870 - 58907 6.1 58 30 Op 1 . - CDS 58959 - 60413 1742 ## COG0053 Predicted Co/Zn/Cd cation transporters 59 30 Op 2 . - CDS 60475 - 61050 632 ## COG1971 Predicted membrane protein - Prom 61140 - 61199 4.0 - Term 61142 - 61206 27.2 60 31 Op 1 . - CDS 61329 - 62162 1109 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 62293 - 62352 2.0 61 31 Op 2 . 
- CDS 62361 - 62906 635 ## COG3544 Uncharacterized protein conserved in bacteria - Prom 62971 - 63030 3.8 - Term 63160 - 63207 -0.3 62 32 Tu 1 . - CDS 63227 - 63901 616 ## COG0288 Carbonic anhydrase 63 33 Op 1 . + CDS 64013 - 64372 394 ## 64 33 Op 2 . + CDS 64427 - 65566 965 ## COG1472 Beta-glucosidase-related glycosidases + Term 65598 - 65640 4.4 - Term 65659 - 65698 9.1 65 34 Op 1 . - CDS 65705 - 66088 516 ## - Term 66219 - 66251 2.2 66 34 Op 2 . - CDS 66273 - 68411 293 ## PROTEIN SUPPORTED gi|87310993|ref|ZP_01093118.1| ribosomal protein S1-like RNA-binding domain protein - Prom 68609 - 68668 1.9 + Prom 68341 - 68400 2.4 67 35 Tu 1 . + CDS 68623 - 69066 590 ## Dvul_0109 hypothetical protein + Term 69182 - 69226 14.6 - Term 69222 - 69263 4.2 68 36 Op 1 . - CDS 69344 - 70024 862 ## LI0197 endo-1,4-beta-xylanase 69 36 Op 2 . - CDS 70096 - 71256 667 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 71402 - 71461 3.5 - Term 71446 - 71485 9.5 70 37 Tu 1 . - CDS 71498 - 72301 683 ## COG0739 Membrane proteins related to metalloendopeptidases + Prom 72580 - 72639 4.4 71 38 Tu 1 . + CDS 72887 - 73552 376 ## PROTEIN SUPPORTED gi|223039866|ref|ZP_03610150.1| 30S ribosomal protein S16 + Term 73561 - 73613 15.3 + Prom 73683 - 73742 4.2 72 39 Op 1 . + CDS 73963 - 74922 1353 ## COG0861 Membrane protein TerC, possibly involved in tellurium resistance + Term 75044 - 75077 2.0 + Prom 75052 - 75111 1.6 73 39 Op 2 . + CDS 75160 - 76038 877 ## LI0248 hypothetical protein + Term 76066 - 76102 5.3 - Term 76106 - 76139 -0.3 74 40 Tu 1 . 
- CDS 76184 - 76636 -249 ## Predicted protein(s) >gi|316924016|gb|ADCP01000049.1| GENE 1 205 - 1479 1767 424 aa, chain - ## HITS:1 COG:RSc0504 KEGG:ns NR:ns ## COG: RSc0504 COG0138 # Protein_GI_number: 17545223 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Ralstonia solanacearum # 41 424 216 524 524 189 34.0 8e-48 MSSLKDMYRTVLKDTFPDTMTITLGDAVLTYKKRTWHIDGETKGLRYGENPDQPAALYEM ESGKLAIDGVEWRGPQGIMSALTEAQMLQAGKHPGKTNLTDVDNGCNILQYLAERPAAVI LKHNNPCGAAWSKESVGDALDKAFWCDRIAAFGGAVVVNRPLDMQAADLIAANYFEVVAA PDFEEGVLEKLAARKNLRIFKLPALARLGELAGVPFLDVKSMADGGIILQQSFVNRIRSA ADFIPAVATTKDGLTVSARKPTPQEADDLVFAWAVEAGVTSNSVIFAHNGATVAIGTGEQ DRVGCVELAIFKAYTKYADTLAFTRHGMTLYELKLKAKEDAEAAEQLAAIEADTQKAKGG LAGTVLVSDGFFPFRDGVDVCIAQGVTAIAQPGGSLRDYEVIGAVNEASPQVAMVFTGQR SFKH >gi|316924016|gb|ADCP01000049.1| GENE 2 1701 - 2072 259 123 aa, chain + ## HITS:1 COG:no KEGG:LI0201 NR:ns ## KEGG: LI0201 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 77 1 77 107 110 72.0 2e-23 MEQLLIKMARQLDALDEASLLALWDKYAVIVSHFEPTKRWEEAALVFSFIQAKRWKNQLF NYSWSAQVHPGAASQPVAEPDFFLDPPTGRPAPAKPKRKAAIIQFRPHRSPEKDASEEPG GKG >gi|316924016|gb|ADCP01000049.1| GENE 3 2237 - 3517 1790 426 aa, chain + ## HITS:1 COG:BMEI0351 KEGG:ns NR:ns ## COG: BMEI0351 COG0104 # Protein_GI_number: 17986634 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Brucella melitensis # 1 425 92 513 520 459 52.0 1e-129 MSNVVIVGVQWGDEGKGKFVDLLAQEIDYVVRFQGGNNAGHTVIVDGKKAALHLVPSGIL HEGKICLIGNGVVLDPVVFVEELDTLIGQGVDVSPNRLKISSKTHLIMPYHKVLDKAREN KLEKGQKIGTTGRGIGPCYEDKVGRVGVRASDLADPDLLKAKIAAALKEKNVLFTALYGI DALAVDAVFDEVMAVAPRIIPHLTDVSSELEQAWNEGKSVLFEGAQGIHLDIDHGTYPFV TSSNTVSGNAAAGSGVGPNKLDRIIGILKAYTTRVGEGPFPTELNDATGELLRTSGGEFG VTTGRPRRCGWQDIPVLRESARLNGLTDVALTKLDVLSGFDTIQICVAYEYKGKRMDYPP QEQGALDLVSPVYESMPGWKEDITACKTWDELPEAARAYVQRLEQLSGVPVSMVSVGPDR 
NQTIFR >gi|316924016|gb|ADCP01000049.1| GENE 4 3754 - 4824 869 356 aa, chain + ## HITS:1 COG:RC0410 KEGG:ns NR:ns ## COG: RC0410 COG0482 # Protein_GI_number: 15892333 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Rickettsia conorii # 2 355 15 368 370 201 35.0 2e-51 MTIAVAVSGGADSLYAMASLQREYPGQVIAFHALQKETSPEDDPVPGLEVSCRALGVPLH ILDLQEAFFRQIVRPFAESYAHGETPNPCVRCNALVKFGLWMDEARKLGADRLATGHYVS LAEHPRYGMALHQGADEAKDQSYFLALTPISRLKNAVFPLARVRKSEVRAALAEWGLSVP LPRESQEICFVPDDDYRAFLKGIGVRLPAGGPMVLLDGHVVGRHGGLWQYTEGQRRGLGV SWTEPLYVIGKDRSRNALLLGTADELPVNACAAGELNFLVPPELWPHELRVRTRYRQKAV PADIRLVGGGESGMDATARMLIRFHQPQLPSAPGQLAAVFDEHGHVLAGGIICKER >gi|316924016|gb|ADCP01000049.1| GENE 5 5250 - 6437 1510 395 aa, chain + ## HITS:1 COG:MA3035 KEGG:ns NR:ns ## COG: MA3035 COG0535 # Protein_GI_number: 20091853 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Methanosarcina acetivorans str.C2A # 1 391 1 394 399 592 68.0 1e-169 MIGISKLYCGQVEPSDALRYGRHSSQLPSHLLQFSEDKKPVVVWNMTRRCNLKCVHCYAQ AVDPDGKDEISTEQGKAIISDLAAYGAPVMLFSGGEPLVRQDLPELASYATEKGMRAVIS TNGTLITKEKARELKAINLSYVGISLDGAEEIHDKFRGVPNSFKKALEGLENCKAEGLKV GLRFTINKRNVVEIPKVFKLLRELEVPRICFYHLVYSGRGSEMIKEDLNHAETRSVVDLI MDETRALFDAGMPKEVLTVDNHADGPYVWMRMLKEDPKRAEEVFQLLQYNEGNSSGRGIG CISWDGKVHADQFWRNHTFGNVLERPFSEIWDDPNIELLHKMKNKKAYVKGRCASCRFLN ICGGNFRSRAEAYYGDEWAQDPACYLTDDEIRRPE >gi|316924016|gb|ADCP01000049.1| GENE 6 6590 - 7582 1454 330 aa, chain + ## HITS:1 COG:CAC0100 KEGG:ns NR:ns ## COG: CAC0100 COG0113 # Protein_GI_number: 15893396 # Func_class: H Coenzyme transport and metabolism # Function: Delta-aminolevulinic acid dehydratase # Organism: Clostridium acetobutylicum # 4 330 2 320 320 384 59.0 1e-106 MNEFYRGRRLRRNPVIREMVRETQLSAADLMMLYFVVDTTDEDFKKEIPSMPGQYQLSLK QLEKTVGEAVDAGLRSLMLFGIPKHKDAIASEAYAPNGIVQRAIRMLKAKWPDLQVVTDV 
CLCEYTSHGHCGILKEGDTCGEVVNDPTLELLAKTALSHVEAGADMVAPSDMMDGRILAI RETLDKAGYVNTPIMSYAVKYASAFYGPFRDAAESAPHHGDRKTYQMDPANAMEGLREAA ADIDEGADIVMVKPAGPYLDVIRMVRDNFDVPVAAYQVSGEYSMIKAAAINGWIDEERIV LESLIGIRRAGAKLILTYYAEEALKKGWVR >gi|316924016|gb|ADCP01000049.1| GENE 7 7582 - 8763 1323 393 aa, chain + ## HITS:1 COG:MA0573 KEGG:ns NR:ns ## COG: MA0573 COG0535 # Protein_GI_number: 20089462 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Methanosarcina acetivorans str.C2A # 47 388 9 348 349 416 56.0 1e-116 MNCHEPPCGMGREDSRPGTGHPGGHTGHPGGLPAGPRTLEDGSPLCRLIAWEVTRSCNLA CKHCRAEAHPEPYPGELSTEEAKALIDTFPSVGNPIIIFTGGDPMIRPDVYELIAYAGSK GLRCVMSPNGTLITPENARKIKEAGVQRCSISIDGYNAEKHDAFRCVPGAFDATMRGIEC LKAEGVEFQINTTVTRDNLHDFKKIFELCERIGAAAWHIFLLVPMGRAAELADQVITAQE YEDVLHWFYDFRKTTSMHLKATCAPHYYRIMRQRAREEGVSVTSATFGMDAMTRGCLGGT GFCFISHVGQVQPCGYLTLDCGNVRTTPFPEIWRKSKPFLQFRDQSEYKGKCGVCEFHKV CGGCRARAWSMDGDYMGEEPLCTYQPRKAKEAE >gi|316924016|gb|ADCP01000049.1| GENE 8 8760 - 9224 596 154 aa, chain + ## HITS:1 COG:MA0574 KEGG:ns NR:ns ## COG: MA0574 COG1522 # Protein_GI_number: 20089463 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 4 152 37 186 187 114 39.0 7e-26 MTEMTLDTMDKRILDIIQTDFPLESRPYAIIAERLGITEQEALDRVNALRKSRLIRRLGA NFQSAKLGFRSTLCAAKVPEDKLDAFIAEVNRHVGVTHNYLRRHAYNVWFSAIGPSWEAV CAMLDDITAKTGIPILNLPATKLYKIRVDFQMEE >gi|316924016|gb|ADCP01000049.1| GENE 9 9445 - 10242 831 265 aa, chain - ## HITS:1 COG:no KEGG:DVU0864 NR:ns ## KEGG: DVU0864 # Name: not_defined # Def: glycoprotease family protein, putative # Organism: D.vulgaris # Pathway: not_defined # 4 259 5 255 261 172 45.0 1e-41 MNTTLIMNASEGRIQFVLEQEGSLACAQEWSAPSKGTELLTPALADAFQRLGIMPSDISR IACVAGPGSFTGLRLALTTAAAFRRATGAAVAPLNALQALAGSVPFGLLFPARETRIRVI THARRGLVHGQDFLCAPGSALPSPVDEPAMWEIPAACGGERPDIMLGSGVARNLPQLEEL FGDSAPLFLPALTHPTAQALLDLTLALPDGAWGHKDLDPLYLRPCDAVDNLASIAAKRGQ APEEAYAQLDKLLGPSAEEVVRGSR >gi|316924016|gb|ADCP01000049.1| GENE 10 
10239 - 11360 1608 373 aa, chain - ## HITS:1 COG:AGc2553 KEGG:ns NR:ns ## COG: AGc2553 COG0750 # Protein_GI_number: 15888706 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 18 371 51 407 409 256 40.0 6e-68 MDFLLNILSHWESALAVLLVLGGLIFFHELGHFAVARLFRIGVRTFSLGFGPKLLKLRRG KTDYCLSLIPLGGYVALAGEEDEAEQPDPKGKEIDGVLFAPEELYSGRPAWHRLLVVLAG PVANFVLALIIYCGIAWAQGQTYLLPEVGDVTPGTPAATAGILPGDRVLSIDGKPIENWN AVAEGIGAGNGKPVTIVLSRGGSEVTLSLTPEAKTRANIFGEEKPAWLIGIRASTATGHL PLGPVEAIGAGFRQTWDMIAFTCESFVKLAQRVVPLDNVGGPILIAQMVGQQAEQGLSAV LLLAALISVNLGILNLLPIPILDGGHIVFFTLEMIMGRPVSATAREWSAKVGMALLLGLM ILATWNDLTRLFS >gi|316924016|gb|ADCP01000049.1| GENE 11 11361 - 12569 1184 402 aa, chain - ## HITS:1 COG:XF1048 KEGG:ns NR:ns ## COG: XF1048 COG0743 # Protein_GI_number: 15837650 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Xylella fastidiosa 9a5c # 19 392 6 380 396 349 51.0 6e-96 MIRYITELPSSEWENSFPRPVVLLGSTGSIGVNTLRIIEKHPDLFQVVALAGGRNVERLI EQALRWRPPYLGIQTEEGRKALLAALPSGYEPEILVGPQGYAELASLPEASTILSAQMGA AGLRATMAAAEAGKVICLANKESLVLAGGMLRETCARTGAVILPVDSEHNAVFQGLRGRN TETVRRIILTASGGPFRGKKRDFLATVTPAQALKHPNWSMGAKITIDSASMMNKGLEIIE AHHLYGLPLERIGVLVHPQSLVHSLVEFEDGSLMAQAGTPDMRMPIAYCLAWPLCLDAGV PPLDLARSGALTFEEPDLHSFPCLELACRTIGKGSALPVALNAANEVAVEAFLSGRIGFM DIPDIIGRALDECSAPDPASLEAIETLDHETRLRVGLWIEKV >gi|316924016|gb|ADCP01000049.1| GENE 12 12809 - 13615 1081 268 aa, chain - ## HITS:1 COG:BS_cdsA KEGG:ns NR:ns ## COG: BS_cdsA COG0575 # Protein_GI_number: 16078717 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Bacillus subtilis # 147 256 151 259 269 107 53.0 2e-23 MIIASKHFPRIITGAVLVSVLGTALYLSGPYLFGLLLLFTLVGLWEFYAMFWPLKANIGG RILGLALGGLLITAAWMRPEMVVMTLALSTLILACMFLFCWSHDDKYRFTRVAVLAGGLL YVPMLIMPALNFSIHEQLLLICTAAGSDTVAYFFGMRFGKHKIWPKVSPKKSVEGSAAGL 
AASVVVAVCFGLAFGVPATGISDYALLGLVLGVMAQLGDFFESALKRSRSVKDSGNVLPG HGGVLDRVDSLLFVIPTYECARALMTFF >gi|316924016|gb|ADCP01000049.1| GENE 13 13612 - 14346 884 244 aa, chain - ## HITS:1 COG:CAC1791 KEGG:ns NR:ns ## COG: CAC1791 COG0020 # Protein_GI_number: 15895067 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Clostridium acetobutylicum # 12 237 22 254 257 231 46.0 1e-60 MSEPSPLPIPAASLPAHIAIIMDGNGRWAQERGLSRSEGHKAGVRAAKAIVTECRTLGIR HLTLYTFSQENWGRPKDEVSLLFQLLVSFLGEELPSMERNGISLRVFGELDGLPLPARTA LRHAMNRTAKCSDMIVNLALNYSGREEILRAARLLMQQGVKPEAVTEEAFRSCLYSAGQP DPDLIIRTSGEQRISNYLTFQSAYSELYFTSTYWPDFTPEALHRALVEYAGRNRRFGLTQ EQIQ >gi|316924016|gb|ADCP01000049.1| GENE 14 14444 - 14680 77 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKVPFSPSRPLPLILLGFSADGERRIGSPRKAGRLCSDRFPLKEYPFQRFALPEHAVFG GILTARRAGTSGLRPFQT >gi|316924016|gb|ADCP01000049.1| GENE 15 14711 - 15271 898 186 aa, chain - ## HITS:1 COG:BH2424 KEGG:ns NR:ns ## COG: BH2424 COG0233 # Protein_GI_number: 15614987 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Bacillus halodurans # 4 186 3 185 185 187 56.0 7e-48 MDKESVLLDSEDRMDKALGALDRDFSRLRTGRASTGLVDNIKVDYYGTPTPISQLASVAI PDSRTITIQPWDRGAFAGVEKAILKSDLGLTPVNDGKIIRISIPPLTEDRRKELGKLARK SGEEAKVAVRNVRRDANDQLKKLEKDKAISEDELKKATDDVQKLTDRYVAKVDEKCAAKE KEIMDL >gi|316924016|gb|ADCP01000049.1| GENE 16 15277 - 15993 1032 238 aa, chain - ## HITS:1 COG:FN1622 KEGG:ns NR:ns ## COG: FN1622 COG0528 # Protein_GI_number: 19704943 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Fusobacterium nucleatum # 6 238 6 239 239 284 59.0 8e-77 MAELKYKRVLLKLSGEALAGEKQIGIDPHTVSKICAEIADVVDMGLEVALVIGGGNIFRG LSASAEGMDRSSADYMGMLATVLNALAVQDALEKTGHPTRVLSAIAMQEVCEPYVRRRAM RHLEKGRVIICAAGTGNPYFTTDTAAALRGMELKCDAIIKATKVDGVYDKDPMKHDDAVM FKHLSYEETLRRHLKVMDSTAITLAQENQVPIIVCNMFNGSIKKVVTGENPGTTVEGD >gi|316924016|gb|ADCP01000049.1| GENE 17 16008 - 16796 1031 262 
aa, chain - ## HITS:1 COG:VC0224 KEGG:ns NR:ns ## COG: VC0224 COG0463 # Protein_GI_number: 15640254 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Vibrio cholerae # 8 247 3 245 259 174 38.0 1e-43 MSSDTSGKPTITGLVLTYNGERLLGKCLESLAFCDAVIVVDSFSSDATESIAREHGATFV QHPWSGALPQFEYALGLVETDWVVSLDQDEICSEPLRDAILAALPGAPADLCGFTPARRS WYYDRFLKHSGWYPDRLLRVFRRNGVHFTQSGAHEHIDPNGRTRELDGDILHYPYKHFRE HLDKINSYAQQGADDLAARGKKGGLALGVLHGIGRFLRIYLLKKGFLDGKAGFINAVHGA FYAFLKYVRVDEGNWGFPYNHR >gi|316924016|gb|ADCP01000049.1| GENE 18 16793 - 17947 1282 384 aa, chain - ## HITS:1 COG:alr3012 KEGG:ns NR:ns ## COG: alr3012 COG0399 # Protein_GI_number: 17230504 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Nostoc sp. PCC 7120 # 6 376 8 381 382 259 38.0 8e-69 MNAPLIPQANPGAAFRSRRDALLKAAADVLDGGWYINGKQVRQFEESFAAFCSVKHAVGV GNGTDAIEVALRALGIGKGGLVFTVSHTAVATVAAIECAGATPVLVDVDPVTYTMSPESL AKALEAARAGRYPGKPAAVLPVHLYGCCADLDAIRAVAGDLPLIEDCAQAHGAAYKGRPA GSTGIAGTFSFYPTKNLGAVGDGGCVVTSDDALAERIRSVREYGWRTHYLSSEPGINTRL DELQAAFLNVLLPELPAKNAKRRALAELYRRELSGVPGLVLPTIPDAVEPVYHLYVVQCP DRDEVQRRLRDRGVGTAVHYPFPVHLQDAYRGSVALAPDGLPVTEALMPRILSLPMYPEL SEADAVAVAGAVRDVVSSLTKERA >gi|316924016|gb|ADCP01000049.1| GENE 19 17956 - 18936 1274 326 aa, chain - ## HITS:1 COG:VNG0063G KEGG:ns NR:ns ## COG: VNG0063G COG0451 # Protein_GI_number: 15789397 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Halobacterium sp. 
NRC-1 # 9 319 3 315 328 173 34.0 3e-43 MSLYSGAHVLITGGLGFIGSNLGIRLLEEGASVTLVDSLIPEYGGNPANIRGYESRLKVN ISDVRDPHSMKTLVKGHDFLFNLAGQTSHMDSMADPYTDLDINCKAQLSILEACRNANPG IRIVFAGTRQIYGKPDYLPVDEKHPVRPVDVNGINKMAGEWYHILYNNVYGIRACSLRLT NTIGPRMRIKDARQTFLGVWIRQVLTGKPFEVWGGEQLRDFTYVDDCVDAMMRAALHEEA FGQIFNIGGGKRISLRDLADLLVDTAGEGAYEVREYPADRKKIDIGDYYADDSRLRSLLG WEPRTQLRDGLRQIIDYYRPRLQDYL >gi|316924016|gb|ADCP01000049.1| GENE 20 19006 - 19896 440 296 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 1 294 1 261 283 174 36 2e-42 LMSITAAMVKDLREKTAAGMMDCKKALTECDGDMEKAVDWLRQKGLSKAAKKAGRATSEG LVGFELAADGKSGVAVEVKCETDFVARGDKFQGFVKDMVAQVAKGEYADSEALLAAPFVA DASVTVKEALDGVIATTGENMGLGKFAKMELAAGKSGLIGGYLHSNGKLAVLVEMQTGSD AAAASEAFHEVAKNVAMQIAAASPLAVSAEGLNPEVVEHEREVYRQKAREEGKPEQIIEK IAEGAVKKFCKDVCLLDQLYIRDDKMTISDLIKGAAKTIGEPITVVRFVRIQLGAE >gi|316924016|gb|ADCP01000049.1| GENE 21 19916 - 20701 1153 261 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|218888008|ref|YP_002437329.1| 30S ribosomal protein S2 [Desulfovibrio vulgaris str. 
'Miyazaki F'] # 1 260 1 262 271 448 84 1e-125 MAYVSMKQMLETGVHFGHQTRRWNPKMRPFIFGARNGIHIIDLQQTVKLYRTAHDKIVET VANGGRVLFIGTKRQAQEAVAAEAGRAGQFHVTNRWMGGTLTNFATIQKSIERLKKLEAM FADGSVNRYQKKEILTLQREMAKLELTLGGIKDMDRLPQLAFIIDPNREEIAVKECRKLG IPIVAVTDTNCDPDVIDYIIPGNDDAIRAIKLFVTAMAEACLEGEAMRKDSKNKDAEEEL KKAADAEAKAEEAPAVEAAAE >gi|316924016|gb|ADCP01000049.1| GENE 22 20585 - 20827 109 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDTVTGAENERAHLGVPTAGLVTEMDAGFQHLLHANVCHGMSPSTWGFASTPALTLLPVC VPRTHAAGRNEHPRPLPECA >gi|316924016|gb|ADCP01000049.1| GENE 23 21056 - 22006 950 316 aa, chain + ## HITS:1 COG:TM1528 KEGG:ns NR:ns ## COG: TM1528 COG1575 # Protein_GI_number: 15644276 # Func_class: H Coenzyme transport and metabolism # Function: 1,4-dihydroxy-2-naphthoate octaprenyltransferase # Organism: Thermotoga maritima # 28 305 3 282 289 101 26.0 2e-21 MESKLSRALAPKRPDAPTLTSRQLVKAWWQALRPPFFIVDLIPVGLGGALAARDLGYWPW GIFAVVMVGCFCLHTVANIANDLFDYLEGVDTDETIGGTKVIQNGWISPRQICLTIIGLL LSTVLIGWWLIGHSGQEWLWGPVVFAVLSAVFYVAPPIRYGCRGYGELFVCLNMGFIMVM GSYAVMANGFSAQSLALALPVGLMVAGILYYQSLPEIETDLAAGKRTLANILGKAKAELV FKLWWPAVWVLMANLWACGLVGWPVFLGLLTFPLYWRACRLIHEAVEWLDLDQHGHLVRK LYLINGALLIAGVVLK >gi|316924016|gb|ADCP01000049.1| GENE 24 22151 - 22669 774 172 aa, chain + ## HITS:1 COG:BMEII0913 KEGG:ns NR:ns ## COG: BMEII0913 COG4803 # Protein_GI_number: 17989258 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Brucella melitensis # 1 172 1 170 177 88 32.0 7e-18 MQSLIAVIFKEDMDRAEAYRVELRGREELAGLVDCSKMVAAVCDKEGKVHLQYTHALAKD GALIGGVWGALIGFLFLNPLLGIVAGAGLGAATGEVGDFGISHRFMNELATHLKPGSSAL FIPLKKEAVDETLRALGNSDGVVLSTDLKLEDEYQMEKLFEEMTAERAKKTS >gi|316924016|gb|ADCP01000049.1| GENE 25 22745 - 23695 799 316 aa, chain + ## HITS:1 COG:BH0857 KEGG:ns NR:ns ## COG: BH0857 COG0726 # Protein_GI_number: 15613420 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Bacillus halodurans # 65 279 71 278 282 90 29.0 4e-18 
MLHKIIVCCLVVVALAFPARASSLWHPEPGFLPEDARHAVPNNPAPPERTVPEQVLPPLP VQDEGTIRRVATKEKVAALTFDLCELSTMTTGYDAEIIDFLREKRIPATLFMGGKWMRSH SERAMQLMADPLFEIGNHGWSHGNFGIMSEKAMLEQIRWTQAEYELLREEILKRARAEGR SVELPEAVTLFRLPYGRCTDKALALLAREGLQVIQWSVVAETPQDNAVHGMGARVAGQVR PGAIILFHANLVPKGSAFMLKETVGELQRKGYRFVTVGELLKLGEPQRTRDGYFNKPGDN LSLDTHFGIDGTGRRK >gi|316924016|gb|ADCP01000049.1| GENE 26 24442 - 24837 327 131 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTLLERQRRESFMKGMFIPFMLLALVCSLFGDTNILHADVQKVVVYDSGSALTDMQKQSP VILESFEECMETSASKAPLKHLVPLMARFMLQLGLAVSSVVRLFVRHYVPTLRKSDSWGL LPFSLAPPCFA >gi|316924016|gb|ADCP01000049.1| GENE 27 25115 - 26557 1927 480 aa, chain + ## HITS:1 COG:no KEGG:DMR_25930 NR:ns ## KEGG: DMR_25930 # Name: not_defined # Def: hypothetical membrane protein # Organism: D.magneticus # Pathway: not_defined # 3 477 4 477 479 582 67.0 1e-164 MAKTNLNEDKIALLTGCFIFLLAALNLAHLDVLGWVVSTNMWTSMDKAFSATTGAYKGTI SGVVSLLCTYVALTAILSFGIKLLGGNVARFVKSFTVVFFISEICYMFGANAHIAATPDA QAKFGIDWSIGLTTEAGFIVALVAGIFISNAFPRIAESLHDACRPELFVKIAIVVLGAEL GVKAAAASGMAGTIIFRGLCAIVEAYLLYWAFVYYVARKYFKFSREWAVPLASGISICGV SAAIATGSSIRARPVVPIMVSSLIVVFTCIEMLILPFIASHFLYSEPMVAGGWMGLAVKS DGGAIASGAIADSLIRARALELLGVHWEAGWVTMVTTTVKIFIDVFIGVWSLVLAWVWTA KFDKTNSGRTMNFGDVLARFPRFVLGYLLTFVIMLFICVNVEWQPLGKSVISTLGPLRTI FFTLTFFTIGMVSNFHKLMEEGIGRLAIVYVVCLFGFIIWLGLFISWLFFHGMTPPVIAG >gi|316924016|gb|ADCP01000049.1| GENE 28 26591 - 26764 333 57 aa, chain + ## HITS:1 COG:no KEGG:DMR_25940 NR:ns ## KEGG: DMR_25940 # Name: not_defined # Def: hypothetical protein # Organism: D.magneticus # Pathway: not_defined # 3 57 2 56 56 66 58.0 3e-10 MAEDTKNVKFHEEVQKMEYEPLDATELKLIHWSWALGVFLLVALYFLSDFIAPGAHG >gi|316924016|gb|ADCP01000049.1| GENE 29 26981 - 27202 209 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302863291|gb|EFL86223.1| ## NR: gi|302863291|gb|EFL86223.1| hypothetical protein HMPREF0326_01926 [Desulfovibrio sp. 
3_1_syn3] # 1 73 1 73 75 76 50.0 5e-13 MGRYRVHYIEGSGENLRIRKEQTVEAPSFQDALERFTHWPAAEACEQSPACAQHPGANLC HMEAWEVFPVGES >gi|316924016|gb|ADCP01000049.1| GENE 30 27568 - 28926 2075 452 aa, chain - ## HITS:1 COG:HP0380 KEGG:ns NR:ns ## COG: HP0380 COG0334 # Protein_GI_number: 15645008 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Helicobacter pylori 26695 # 5 451 2 448 448 598 64.0 1e-171 MASSYVQKVIRSVKKSNPHQPEFIQALEEVLHSLEPLFLKDPKYQQNGILERIVEPERQI MFRVAWTDDKGRVQVNRGYRVQFNSALGPYKGGFRFHPSVNLSILKFLGFEQIFKNSLSG LSIGGAKGGSDFDPKGKSDAEVMRFCQAFMTEAFRYIGSTIDVPAGDIGVGAREIGYMFG QYKRLTSSFEGVLTGKGLKWGGSLARKEATGYGSVYFASNMLKARNMDLEGATCAVSGSG NVAIYTIEKLYQLGAKPVTASDSRGCIYHPGGINLDALKQVKEVERASLARYAELCKDAK YIPAKEYPKDQHPVWNVPCKLAFPSATQNEVSGADAANLIKNGCVLVCEGANMPSTPEAV EAFLQAGLAFGPGKCANAGGVSTSQLEMAQNASMQSWTFEEVDAKLKNIMANIFKAAHET AEEFGVPGNYVLGGNIAGFRKVADSMIEQGLY >gi|316924016|gb|ADCP01000049.1| GENE 31 29386 - 30012 950 208 aa, chain + ## HITS:1 COG:NMA0985 KEGG:ns NR:ns ## COG: NMA0985 COG0035 # Protein_GI_number: 15793942 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Neisseria meningitidis Z2491 # 1 208 1 208 208 280 64.0 9e-76 MAVHVVDHPLVRHKLGILRKESTSTSEFRMLAKEIARLLMYEATKQFKTEKRTIKGWAGP VEVESISGKMVTIVPILRAGLGLMDGVLDMVPGAKISVVGLYRNEETLEPVEYYVKLAKE IDQRIAIILDPMLATGGSLIATIDLLKKRGCKRILSLNLVCAPEGIKRVEEAHPDVDIYT AAVDSHLDEHGYIIPGLGDAGDRIFGTR >gi|316924016|gb|ADCP01000049.1| GENE 32 30129 - 31388 804 419 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 11 419 14 431 447 314 39 1e-84 MESDTTYSFRLKDSLLGAQMLFVAFGALVLVPILTGMSTNVALFTAGVGTLIFQICTRGK VPVFLASSFAFIAPIIYGVQTWGLPATLGGMVVAGAVYVVLSFIIRWRGTEIIMRLLPPI VTGPVIIVIGLVLAPVGVNMALGKAGDGSAVLVPENTALIISMAALATTVLVSLLGRGMI RLIPIMSGIAVGYLCAIGLGIHNFSKVAAAPWFGVPDFVFPEFRWEAILFIIPITLAPAI EHFGDIIAISSVTGKDFLKDPGIKSTMLGDGIATMLASFIGGPPNTTYSEVTGAVALTRA 
FNPGVMTWAAIWAIVLSCVNKLGAFLSSIPVPVMGGIMILLFGAIMVVGLNTLVRAGEDL MEPRNLAVVALIIIFGVGGMTFNIGSFRLGGIGLAAVTGVLLNLFLPRAVHVHKKVKSE >gi|316924016|gb|ADCP01000049.1| GENE 33 31653 - 32936 1373 427 aa, chain + ## HITS:1 COG:CAC3586_1 KEGG:ns NR:ns ## COG: CAC3586_1 COG1058 # Protein_GI_number: 15896820 # Func_class: R General function prediction only # Function: Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA # Organism: Clostridium acetobutylicum # 1 243 1 244 245 224 46.0 3e-58 MNAEIISVGTELLLGHTINTDAAFVARELSAIGVNLLFACTVGDNPGRLRDALEEALGRS DVVITTGGLGPTDDDLTKETIASVAGAPLKIHEDSLRRLESYFKGRPAMGENQRKQAMLP EGATVFPNDIGTAPGCGVRTASGKVIVMLPGPPSELVPMLQHYVVPFLKEGSHAVIHSCM IRTFGLGEGAGALKIADLTDAANPTAATYAKENEMFVRVTARAESEEAAEALCRPVVEEI CRRYGDVVYGVDVDNLESVVVRLLSEKGLHLATAESCTGGLVAKLITDVSGASEVFGMGL VTYANEAKMKLLGVPAAMLEEHGAVSEPVARAMAEGVREVSGSELGIGITGVAGPTGGTP EKPVGLIYIALSDGTRTWVRRMTPPGRVHGRGWLRDRAAGTALDMVRRYLSDLPLEPVQS AGPATTL >gi|316924016|gb|ADCP01000049.1| GENE 34 32950 - 33717 508 255 aa, chain + ## HITS:1 COG:PA1120_2 KEGG:ns NR:ns ## COG: PA1120_2 COG2199 # Protein_GI_number: 15596317 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 80 235 8 165 195 134 50.0 1e-31 MKHDYEACLNDITRIVKDIVWGDAEQVRALFRYTNDPSVPASVGRLAEQIGTLIVQKEVR EFQLENLIEDLFSTRTALAQARHDPLTGLPNRAMFEEMLHKACGSALECGSGLALLLVDF DRFKEVNDSLGHAAGDELLVQGAARLQGCVGDRGLIGRMGGDEFAVLLIEQAEGDVLRTA KCILEAIRAPFPLQEGEARISSSVGVAFHSLEAATPARLLKNADVAMYRAKGEGRDRLWR YRPSTFTDSGYGRVF >gi|316924016|gb|ADCP01000049.1| GENE 35 34027 - 34449 612 140 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1583 NR:ns ## KEGG: Ddes_1583 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 9 138 8 137 142 165 63.0 4e-40 MSQAQLPNELATFLENWTTDPNNAKDAFVRFKDFLLTTPDVRFDFKARPGVSYSLRAANA KNDERPLFVLVDVVDDEPEARWLSVCFYADMVNDPDELGDFVPSGLMGKDACCLNLEEDD PAMRDYILARLGEAAASAAK >gi|316924016|gb|ADCP01000049.1| GENE 36 34611 - 35579 954 322 
aa, chain - ## HITS:1 COG:yccM KEGG:ns NR:ns ## COG: yccM COG0348 # Protein_GI_number: 16128958 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Escherichia coli K12 # 11 321 36 349 357 199 37.0 6e-51 MHPFSSARLHRLIQLGFVLFLSWVAWRLHEYAQGVSGNGAFVPRPAAAEAFLPLSALLGL KRLLLTGQYDPIHPAGLTILLVALCTAILCRRGFCGYLCPVGWLSGLLARLGTRLGVSCR PGRRLEWLLSAPKYVLLAAILYSFVVSMDLPSIEWFLKSPYNLVADIKMLQYFLAPGTLT LGVIVVMVLGSLFLPGFWCRGFCPYGALLGLLSLLSPMAVNRNPASCSGCGRCAAACPSR IPVDGRKRLSGPECVGCTECVSACPSRCLDIRFGYGTGALRLPAWGIAAGTLLILLIGYA AAVFTGHWEADLPPQMIRMFLN >gi|316924016|gb|ADCP01000049.1| GENE 37 36177 - 37181 297 334 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. PR1] # 1 297 1 297 328 119 29 6e-26 MKRALALLIACGLLLGVTATSSFAADKVVIKMAGMKPDGEPETLGMRKFGEILKELSGGK YDVQVFPNSQLGKEDAYIASTRKGIIQMCATGTQTSAIQPSMAMLETPMLFDDYAHARRA MNGGKTFELITNGFTEKSGMRVLNAFPLGFRHFYTKKPVATIDDMKHLRMRVPNIPLYIN FAKECGISGQPMPFAEVTGALDQGVIDGGDSPLSDIVSIKMYETTPEITLTGHLLVIHSL YINEKFYQSLPEQDKKWIDEAAKRSADYVWDLVEKVDADAVKTITEAGGHVSEPSPELHK FMQDAGKRSWKLFYDTVPNAKEILDSADSYRQAK >gi|316924016|gb|ADCP01000049.1| GENE 38 37286 - 37858 719 190 aa, chain + ## HITS:1 COG:SMb20034 KEGG:ns NR:ns ## COG: SMb20034 COG3090 # Protein_GI_number: 16263785 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Sinorhizobium meliloti # 36 171 11 146 187 71 32.0 8e-13 MSSENKPVMPPERIEETETEEEIRREEAAKGKGERAFEVFCAAIFLGMIGLVFFNAFLRY VFRSSFAPSEEWARFLFIYITFIGGIEAFYRHKHIAVDMFVNMISGASRKAVDIVASLFM LAALVLLLFGGISVVLQTLDTYSVATDVNMAFINGTLPVMAFAAIIIHLRDIVRLIRTPA GEFNTAPKTE >gi|316924016|gb|ADCP01000049.1| GENE 39 37869 - 39158 721 429 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 1 420 3 422 435 282 35 4e-75 MELVVFLVSLFGFLALGIPIAIVLVLCSMVLMYYGGMWDSMFVMIIPQSMLDGANNYPLM 
AIPFFVFAGEIMTEGGLSKRVVKLAQLIIGRVRGGLGYAAIVSSVIFAGLMGSSVGEAAA LGSLLLPMMAQAGYKPGRAGGVIASGAILGPIIPPSTNFILLGATVGLSITKLFMIGLVP GLIIGLCLMVTWFFIVRIDGYKEKIEFKKGEASKIVRDAMPAFMMPVLLLGGIRFGVFTP TEGGAFACVYAIAVCTLYYRELTFKGLLAVSARAARTTSVVMLIVATATAVGWFITSAQI PMQVAELFQPLVDSPILLLLSINVFLFLAGMVMDLTPNVLIFAPVFYPLIQQAGIDPYFF GLLFILNLGIGVITPPVGTVLYVVCGIGNIKIGNLVRNMAPFLIVEMLVLFLLLFFPSLS IEPMKFLMK >gi|316924016|gb|ADCP01000049.1| GENE 40 39494 - 40411 752 305 aa, chain + ## HITS:1 COG:no KEGG:DSY1098 NR:ns ## KEGG: DSY1098 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 305 1 317 317 244 40.0 4e-63 MLFSASRRTDIPTYYSEWLCNRLEAGNVCVRHDARRVTRYVFSREAVDCLVLWTKNPLPL LDRLALLRGWPCVFQFTLTGYGRDVEPGLPDKLQLVEAMKRLSEAFGSERVIWRYDPIFF SGAYSLSWHVRCFDALTERLEGVTRTCVVSFLDMYPKLRRRIPALGLCGDSGQARKELLA AFAAMAERRGIRLSLCAEDVDVPGVVHAGCLSRELVERVAGCRLDVRPSQQRAGCKCVAS VDVGVYGTCGNGCLYCYANQDGIPVGRGSALHDPASPLLVGRLSADDEVTDQRCASLKSR QFSLL >gi|316924016|gb|ADCP01000049.1| GENE 41 40650 - 41120 354 156 aa, chain + ## HITS:1 COG:BH1021 KEGG:ns NR:ns ## COG: BH1021 COG0350 # Protein_GI_number: 15613584 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Bacillus halodurans # 47 152 67 171 175 141 62.0 6e-34 MLCFRETAFGRVGIEDKEGSITRLYLPGRSDAVSEGPETALLGEAFAQLEAYFAGRRKTF DLPLRFDEGTPFMKQVWQALCTVPYAHTASYKDIAEAVGNPKACRAVGMANNRNPIAIIV PCHRIIGSTGALVGYGGGLGMKERLLELERRYGPAD >gi|316924016|gb|ADCP01000049.1| GENE 42 41104 - 42315 756 403 aa, chain + ## HITS:1 COG:YPO1834 KEGG:ns NR:ns ## COG: YPO1834 COG0122 # Protein_GI_number: 16122087 # Func_class: L Replication, recombination and repair # Function: 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase # Organism: Yersinia pestis # 199 390 6 197 203 189 53.0 9e-48 MAPLTEPAFARLRSVDPVLAEAAGDAAFELETDLFAALAKGIMEKGASDAFPFAWASLAA QIGEVSPERVLEAGSALEAFGKKTANALLGAAKAVRSGKIGMERLAALPDDEAVGALCSL RCVDVRLAVRVLIGVLRRPDVLRFDDEAVRSGLMRVHGPRSCTQEGFEAFRARHAPFGSA 
ATLCLWESVERLRPVFPCGGEALDALKRRDKRLGRVIERLGPIRRGVEPDLFTALVDSVI AQQISGRAAQTISDRLHVLVGNFTPQGLAEADPSQIQQCGLSQRKVGYIQGIAREVASGA LDLEALRHAPDEELIRKLSALNGIGVWTAEMLMIFSLCRPDVLSWGDLGIRRGMALLYGD RELTRERFERRRKRYSPYGSVVSLYLWSIAGMEEALAVKLPRG >gi|316924016|gb|ADCP01000049.1| GENE 43 42520 - 43296 786 258 aa, chain - ## HITS:1 COG:STM2546 KEGG:ns NR:ns ## COG: STM2546 COG0483 # Protein_GI_number: 16765866 # Func_class: G Carbohydrate transport and metabolism # Function: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family # Organism: Salmonella typhimurium LT2 # 5 246 1 246 267 164 36.0 1e-40 MTEQLSPLALQALAVVKESGAIIGANHAHPHTIRHKGRIDLVTETDIAVEAFLKERLATL APQAGFLAEESAEDLSLPDTCWIIDPVDGTTNFAHGLPLTVTSVAYRLNGEIVLGIVNAP LLRECFIAEKGKGAWRNGESISVSSVTACENALVATGFPYEIASRVDEILERMRPVLSSC QGVRRCGAAALDLAWTACGRFDAFYEDELKPWDMAAGALLVTEAGGSISNLDGSPFDLRW SILAGNKAMHELIGKMIR >gi|316924016|gb|ADCP01000049.1| GENE 44 43664 - 43948 167 94 aa, chain - ## HITS:1 COG:no KEGG:LI0028 NR:ns ## KEGG: LI0028 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 45 88 89 132 162 65 54.0 8e-10 MRSCLSAIIPSLAAALLLTAPAFAAPAQPAKEAAVPVQTAATSSYSGNAKTRIFHVSGCR YFNCKACTVRFKSAEEARSNGYKPCKRCLAREGR >gi|316924016|gb|ADCP01000049.1| GENE 45 44168 - 45124 531 318 aa, chain + ## HITS:1 COG:NMA0440 KEGG:ns NR:ns ## COG: NMA0440 COG0791 # Protein_GI_number: 15793445 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Neisseria meningitidis Z2491 # 83 197 152 270 280 111 48.0 2e-24 MNIPSIKIRAGLWLLLLTGLILAGCAPKQRAVDDSFAFGNDYENLTGPDRSTMTPEERFA YDYRSAAVARMRDEGVPLSNVHLVERAKTAIGTPYVRGGTSMSGFDCSGFVQWAYKSVGV SLPRTAREQSVLGTPIDSDEMEAGDIVAFRHPRRGYHTGIYVGDGKFIHSPRRGKSVEIT SLSDPYFSSTFLGARRVSISESDAEAAQKLMALYESRSSRLHRTASDAALSQSKAKHVVS KSQKKGKQTLSSSRSSSKHKAVSSKKRTTVASKKAASKTSVASSKKPSSKKQVAQKRSSQ KKSTSVAQKKSTSKAKSK >gi|316924016|gb|ADCP01000049.1| GENE 46 44830 - 45384 86 184 
aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNTYRQRTSAPPQYPFPHCQSGLSAFSGPLRGLFRHSKRISALSCSIVFFSVLQQVPQQG PLHARQQKKAPNESGLSQHIASATAKHYLDFALEVDFFCATEVLFFCEERFWATCFFEEG FLLLATEVLLAAFFEATVVRFLLETALCFELDRLDESVCLPFFCDLETTCLALLCESAAS EAVR >gi|316924016|gb|ADCP01000049.1| GENE 47 45401 - 46513 1108 370 aa, chain + ## HITS:1 COG:tauA KEGG:ns NR:ns ## COG: tauA COG4521 # Protein_GI_number: 16128350 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type taurine transport system, periplasmic component # Organism: Escherichia coli K12 # 57 320 69 320 339 60 25.0 4e-09 MKRFSKMLGAFALCMGLLASPVQAEKFVVGFQPYDTISYQAIINAELGLWKKYTPEGTEI EFHPAVQGTVVASNMLADRAQIGYMSIMPAAVLCVRAKGKVKMVSTTDMSEGTRCSLILV RKDAPDFKDNEALARWLDGKVIAAPKGSASDQYMRRFFEKYKVKPAEYLNQTIEVISTNF RAGKLDAASLWEPTLSGLASEVGEGVGKIVADGSACDNEDLGIVVMRSDFMEKHPKVAEG YLRSDLEAQLFMLNPDNWEQVINMVSQYATGVPKRVLWYSVFGKVPANSPNLVREWMNFY FGEREKANIDEVVAFLHQEGIISVDKLPEGTVDDSLTRKVFKASGHKPVAPGAALGVIEG RSAADCPFKD >gi|316924016|gb|ADCP01000049.1| GENE 48 47107 - 48210 362 367 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 [Alteromonas macleodii ATCC 27126] # 30 335 3 298 327 144 30 2e-33 MRGASLFFCGWCTTSIIPVGGNTFFHVRRRVMKKVLYTVAALLASSCMVLGMANNARAEK PLTLRLGHPMAPGNNVTVGYEKFKELVEKKSNKKIRIQLFPNCQLGSDRVTTEAAQAGTL DMSSSSTPNLASFSKSYMAIDLPYVTSPANQEKLYKALDDGELGKALDKVSESIGLKTIM FSEFGYRNFVSAKKPLKEVKDLMNLKVRTTDSPVEVAVATELGMNPAPVAWGETYTAIQQ GTVDAEGNTFSLLNDAKHTEVLKYAMDSEHNYSMHILLMNKKKWDSLTPEQQQIITEAAK EATVWQRAESVKLEKKAWDAFKAKGIEITMLTPEQRKELYDRTAPVREQFAKEIPAELLQ LIADTQK >gi|316924016|gb|ADCP01000049.1| GENE 49 48286 - 50151 838 621 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 205 612 12 421 435 327 40 1e-88 MEKTNASKQGVLHWLDENFEKMFLVTGLLSITLFITWQVIYRYIITQFIERAGAAVWTEE LSRYIFIWISYLALSVAIKKRSSIRVDMLYDHLPPRLQQISWIVVEVLFFILTATIAYYG WGQIERLQEYPQHTTALRIPFLIPYLILPFGFGLMCFRLLQSLYKQVKVCGLVDTLIGLI 
AVFVIASPVIFCDYIEPLPALFGYFIVLCAIGVPVAISLGLSTLATIICADTLPIEYMAQ VAFTSIDSFPIMAIPFFIAAGVFMGAGGLSHRLLSLADEIVGGTYGGIGLTAIVTCMFFG AISGSGPATVAAIGALTIPAMVERGYDKYFSAALVAAAGCVGVMIPPSNPFVVYGISAQV SIGDLFMSGIVPGIMVGIVLMSYCYWYSRKKGWRGQEKVRNARTLMHAAWDAKWALMVPV IVLGGIYGGIMTPTEAAAIAAFYGLIVGLFLYKEMDFKCLINSCIESCETSAVIIVLMAM ATLFGNIMTIEDVPGTVARAILSFSENKIVILMLINVLLLIVGTFMEALAAIVILVPILL PIVTGVGVSPLHFGVIIVVNLAIGFITPPVGVNLFVASGVAKAKLENIASQALPMIALML IVLLICTYIPEVPLILVGGPH >gi|316924016|gb|ADCP01000049.1| GENE 50 50617 - 51441 1158 274 aa, chain - ## HITS:1 COG:Cj1540 KEGG:ns NR:ns ## COG: Cj1540 COG2998 # Protein_GI_number: 15792848 # Func_class: H Coenzyme transport and metabolism # Function: ABC-type tungstate transport system, permease component # Organism: Campylobacter jejuni # 24 274 20 268 269 240 50.0 2e-63 MRLLRAFLVLSLLLLPQMANAETLMMATTTSTQDTGLLEYLQPIFKKDTGIDLKWISVGT GKALAHGKDCDVDVLLVHAPALEEKFVADGFGVDRHQVMYNDFVLIGPAADPAGVKGKDI PTALKTFAEKKIPFVSRGDNSGTHNTEKQLWKVAGMEVPGKDATWYIDAGQGMMATIRIA AEKDGYTVTDRGTYIKYNAAAKGDAPLKILVEGDKALLNQYSVMMVNPAKCPKVKQEAAK KFINWWISPNTQKAIASFQLEGKQLFFPNAKAGK >gi|316924016|gb|ADCP01000049.1| GENE 51 51670 - 52599 931 309 aa, chain + ## HITS:1 COG:TM0881 KEGG:ns NR:ns ## COG: TM0881 COG1897 # Protein_GI_number: 15643643 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Thermotoga maritima # 1 304 1 303 304 397 62.0 1e-110 MPIKIPEDLPARPILEGENIFVMTEGRANRQDIRPLEIAIVNLMPTKIATETQLLRLLGN TPLQVNVTLLRAEGHESKNTAPQHLERFYKTFSQVRDSSFDGMIITGAPVEQLAFEEVDY WDELVGIMDYAKKHVHATLYICWGAQAGLYHHYGIPKYGLPRKLSGIFSHTINDPTNQLF RGFDDVFHAPHSRHTEVRGEDIAKIPSLEILAESDEAGVLVAGTLDGRSMFITGHLEYDR DTLDAEYRRDVAKGMEIAPPAHYYPGDDATRPPLVSWRAHAHLFYSNWLNYCVYQATPYR LDDIPTDRD >gi|316924016|gb|ADCP01000049.1| GENE 52 52910 - 54187 1010 425 aa, chain + ## HITS:1 COG:lin0604 KEGG:ns NR:ns ## COG: lin0604 COG2873 # Protein_GI_number: 16799679 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Listeria 
innocua # 4 425 6 425 425 461 54.0 1e-129 MPIRFETLAVHGGQHPDPTTLSRGVPVYRTSSYLFKSAEHAAKLFGLAEQGNVYGRLGNP TQAMLEERMSLLEGGVGALASASGTSAIFSTIVNLAKQGDEIVSANNLYGGTFTLFNDIL PQFGITTRFVKPQDFDKMEAAVNGRTRALYVEAIGNPALDVADLDAVSAIAKRHGLPLVV DATFATPYLLRPFEHGADIVVHSLTKWLGGHGTALGGIVVDGGTFDWTDSRFSLYVEPDA SYHYLRWGYDLPEGYPPFITRMRLVPLRNLGACIAPDNAWMILQGLETLPLRMERHCANA LKVAYHLKQHPKVAWVRYPGLPDDPAHAAACRMLRNGFGGMVVFGIQGGQPAGQRFIEKL GLFSHLANVGDAKSLALHPASTSHAQLSEEQQRDAGLPPELIRLSIGIEHIDDILEDLDQ ALGEI >gi|316924016|gb|ADCP01000049.1| GENE 53 54224 - 54730 249 168 aa, chain + ## HITS:1 COG:DR2094 KEGG:ns NR:ns ## COG: DR2094 COG1959 # Protein_GI_number: 15807088 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Deinococcus radiodurans # 1 136 46 181 197 98 43.0 6e-21 MRYSTRTRYGLRFLINLAGRPAGACVQLAEIAREEGVSVKYLEQIVRVLRPAGILRSARG AKGGYALAKSPDAIRMDEVFERLEGHISPVECLHGAKSCEREGVCPTRWFWRELDDHMRM FLKGMTLARFVERERALKTEKGTFPVDMDGPACFHALHRTPPEGGGNA >gi|316924016|gb|ADCP01000049.1| GENE 54 54760 - 55341 418 193 aa, chain + ## HITS:1 COG:no KEGG:DVU1990 NR:ns ## KEGG: DVU1990 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 21 188 28 196 215 160 46.0 2e-38 MLLCAACLLLSARSVQARELTALDIFAQLPITLFENTPEGLSEDEKLRLIEQGASEFWEV ERFDADRLVLVSRPFGETRVGLRVFRGGDRLLAALGTDGGAMCALELWQEDATGGFVPAN PPDDPQLSDFLASGQRLAADVSPAFMFCLEDDGLDVRPLFWGPAGLVDVPVAKSVRYIWK SGAFEKTVSGKPE >gi|316924016|gb|ADCP01000049.1| GENE 55 55473 - 55814 501 113 aa, chain + ## HITS:1 COG:Cj1094c KEGG:ns NR:ns ## COG: Cj1094c COG1862 # Protein_GI_number: 15792419 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Campylobacter jejuni # 23 103 7 89 90 80 44.0 7e-16 MFASVAHAMGTAGAGQAGGADALMQFVPLIAMLAIFYFLLIRPQQKRAKQHKAMLEALKK GDQVLTTGGLVGRVVDIDGDILSIDLGSTTVSLGRAYVVSVMDPRSKAVKEEK >gi|316924016|gb|ADCP01000049.1| GENE 56 56011 - 57597 1765 528 aa, chain + ## HITS:1 COG:RC0894 KEGG:ns NR:ns ## COG: 
RC0894 COG0342 # Protein_GI_number: 15892817 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Rickettsia conorii # 6 528 7 517 518 344 37.0 3e-94 MKSLSWRIVVSLLVLLASIVYVLPSLPGVGGSSLGNLLPDARISLGLDLKGGMHLTLGVE VDKAVSNSLSLTGQELRAQASEKGITVLRSRLTPDNRLEFLLPRAEQRTELQELLSKDFP QLAVDSPTQVEGGGLRFTAAFTPAAKAKIEEMALDQAVRTIRNRIDQFGVAEPDIRKQAD FRIQIQLPGLTDSKRAIQLVGQTAQLTFHLVRDDVNPQGILPPGVALFPMVTVRPDGSTS ESMIALDKEPLMTGEDVADARPTFDNRDNKPSVGLSFNSRGAAMFERITGEHIHKRMAIV LDGKVHSAPTIQSRIGGGKASITGSFTPTEAQDLAIVLRAGSLPAPVTVLEERTVGPSLG KESIESGILAAAVGGIAVMVVMPLYYGLAGLLADVMLVFTITMLMAGLAAFGATLTLPGI AGIVLTIGMSVDANVLIFERIREEIRQGLKPLEAVAVGFDRATISIVDSNLTTIIAAAIL YQFGTGPVRGFAVTLSLGIVASMFTAIFVSRTAFGLWLGGNDGKKLSI >gi|316924016|gb|ADCP01000049.1| GENE 57 57663 - 58769 1355 368 aa, chain + ## HITS:1 COG:YPO3188 KEGG:ns NR:ns ## COG: YPO3188 COG0341 # Protein_GI_number: 16123350 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecF # Organism: Yersinia pestis # 13 360 18 313 322 183 37.0 4e-46 MSLHIFSKQTKIDFVGMRHYAYGLSALLILVGLITIFMNGGLRYGVDFAGGAAVQLQFEK PVADEDIKKSLAEMELPGLAIQQYGEDGRDYLVRFSTPNLTSEGIRSNILSSLEKSFAGN PASIQRLEMVGPKVGADLRNAALEAMYFAILLITVYISGRFEQRWMIAALMAAVLGSAMY VMGFLGMDMVYRVIGALALTLLISWKLKLNYALGAIVGLLHDVLITLGLLAMMGKEIDLN IIAALMTLVGYSLNDTIIVYDRIRENLQNQPEDNPAPLADIINLSVNQTLGRTIMTSATT LIAALSLTILGGGAIHDFALTMSLGVFIGTFSSVFVSNPILLLLGDTRQYMVARKKVEYE RPGEHGVV >gi|316924016|gb|ADCP01000049.1| GENE 58 58959 - 60413 1742 484 aa, chain - ## HITS:1 COG:lin2517 KEGG:ns NR:ns ## COG: lin2517 COG0053 # Protein_GI_number: 16801579 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Listeria innocua # 28 306 17 285 291 140 29.0 6e-33 MTTETAHHEAHSEKQHAALSSLFWALFLTLIKLGAGLATNSLGILSEALHSSLDLVAAGI TFFAVRVAARPADETHPYGYGKIENLSALAETVLLLVTCGWIVHEAVNRLFFDAPEITPS 
WWGVIIILISLLVDVNRAAMLRRVAQKHKSQALEADALHFTTDIWSSGVVLVGLLCVQAS EYLPADSPFRGILHMADAIAALFVSGIVVVVGFRLSRRAVTMLLDGGGKEHSEALEQALA KQLPLCRVRRLRVRESGADVFMDITVEAPATLRLDAAHDVSCVIEDIAHEIVPSADITVH VEPAQEETESMLQLVRTVAAAHKLSVHNLILSEQSGGLLVFLHVEASPDMTLREAHEYVV MFEKALGKRLNTTRIETHIEPEDRLCAPENALPFDADLHYVRAAVDSILPEFPMVSNVHD LRLTHMGGTPLVSFHCIIDGDLSLAAAHDVATDMERRLRTCAPKLDRILIHTDPSALIVM PEER >gi|316924016|gb|ADCP01000049.1| GENE 59 60475 - 61050 632 191 aa, chain - ## HITS:1 COG:STM1834 KEGG:ns NR:ns ## COG: STM1834 COG1971 # Protein_GI_number: 16765175 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Salmonella typhimurium LT2 # 1 187 19 203 206 115 37.0 4e-26 MSFPELFAIAVALAMDAFAVSVCAGCALKRVTAVHFLRLSLTFGFFQFLMPVIGWALGLT VRGFIESWDHWIAFALLAWIGGNMIRGGGDDSEDGESCSVDPTKGRKLLILAVATSIDAL AVGLSFSLLNINVWGPSTLIGVVCAAITALGLLAGKGLAHADIFGRRAELVGGCVLIGIG LKILYEHGVLS >gi|316924016|gb|ADCP01000049.1| GENE 60 61329 - 62162 1109 277 aa, chain - ## HITS:1 COG:BMEI1244 KEGG:ns NR:ns ## COG: BMEI1244 COG0697 # Protein_GI_number: 17987527 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Brucella melitensis # 1 272 1 274 281 192 42.0 4e-49 MSFEIILLILLAALLHASWNAVIKGGSNKLYETGLNCFGGGVGVLFIVPFLPFPALESLK FLAGSCTVHIAYYLCIAVAYRNVDMSFAYTIMRGTAPLLTSVFMLLSGHEMPLAGWLGIL CLCAGVLTLTRDNIKNGKFNINGALAAVGTAVVIMGYTLFDGYGARASGNPISYVCWLYV INTFPINVILLLRQRKTYIPYFCNRWKHGLFGGLCSLGSYGVALWAMTKAPIAMVAALRE TSVIFGMLLAVFFLGEKFSAVKLLAVVLVATGIFFMH >gi|316924016|gb|ADCP01000049.1| GENE 61 62361 - 62906 635 181 aa, chain - ## HITS:1 COG:all7633 KEGG:ns NR:ns ## COG: all7633 COG3544 # Protein_GI_number: 17158769 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 25 174 75 235 242 82 38.0 3e-16 MRTASLTVLALVAAFLFSVLPAQAAMDHGSGAAHYSDRAFLSGMIAHHEGAVDMAKVFLA TPKKDQDPQVTAWADDVIKVQEKEIAEMKELLKPLGGIEESAYAPMKKAMQHMLEEGKSL GANMRFVELMLPHHAMALEMSVPALLGSNNPQILNLAENIIISQAKEMRQFKAWIAEHHK K >gi|316924016|gb|ADCP01000049.1| GENE 62 63227 - 63901 616 224 aa, chain - ## HITS:1 COG:BMEI0222 KEGG:ns NR:ns ## COG: BMEI0222 COG0288 # Protein_GI_number: 17986506 # Func_class: P Inorganic ion transport and metabolism # Function: Carbonic anhydrase # Organism: Brucella melitensis # 5 204 3 205 213 161 42.0 1e-39 MVSHDLLKRFSAGFQRFQKTWYCPENNIYEDLRVGQHPYALVIACSDSRVDPVLLLDATP GDLFVIRNVANLVPPYEPDSHHHGVSAALEYAVRHLHIGHIMVMGHAKCGGFTSLLEASH SDDEFLNIWMNLACRAKAEVDSSLPGADPDERQRACEMWGVRFSLDNLMGYPWIKSAVDG GELLLHGLYFDMGSGELLYFDAESETYVPMVNACASGTSSVKRR >gi|316924016|gb|ADCP01000049.1| GENE 63 64013 - 64372 394 119 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPDFLRWIIAGVTSVVVAGLVFLLATVGIVVAFFVFLVLFVVFGLAMRRARKNVQYGEDG SRFIIYTNIPGAGTGNADRVRRDDPDTYELSPDEYTVEAAPSKKPGDTDPKALGDHGKS >gi|316924016|gb|ADCP01000049.1| GENE 64 64427 - 65566 965 379 aa, chain + ## HITS:1 COG:mll9166 KEGG:ns NR:ns ## COG: mll9166 COG1472 # Protein_GI_number: 13488104 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Mesorhizobium loti # 39 378 46 380 394 243 39.0 3e-64 MFLHACLRVVRLWRMGCLVLCGFFLCFGGGAAAFAAPSVEEMAGQMLLVGFKGQEPEECG AILRDIQTRHLGGVILFKRDARNAKLPRNIRDAAQVKRLIAALRGASAGHPLFVAVDQEG GKVARFQPGDGFPAYPSAAELGRGTPDATRRTALGMGRMLRELGVNLNFAPVLDVNVYPA SPAIGRLGRSFSAAPQAVAAHGAAFADGLNDAGIVAVFKHFPGHGSARADSHKGVTDISA TWSERELSPYRSALGRPGQRMVMTGHLFHAGLDPAFPATLSPSVINGLLRGRLGYDGVVV TDDLQMDAIAAEYTLEEVVLRAIGAGADILLFGNNLEYDPAIVAKVQAVIVRAVEDGTIS RARLEASWRRILKLKQQMS >gi|316924016|gb|ADCP01000049.1| GENE 65 65705 - 66088 516 127 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKTVVVKPEFAGLTEEQVIDAIDAYEAQVDEYYANKGDEEECGCGCSHDEGSCEDSDDDV IPSDDPIVLEVERLIAAYSDRFDELCKENGEVPEEALTYKPRTPIEQVAFDIFTDALHDS LMDEDDE >gi|316924016|gb|ADCP01000049.1| GENE 66 
66273 - 68411 293 712 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|87310993|ref|ZP_01093118.1| ribosomal protein S1-like RNA-binding domain protein [Blastopirellula marina DSM 3645] # 581 708 822 941 1043 117 44 2e-25 MQNYASIIAQELNIRPQQTEAVIRLLDEDATVPFIARYRKEATGSLDEVAVTAIRDRLAA LKELDKRREAILASLEERGLLTPELRAKVEAADALTVLEDIYLPYRPKRRTRASMAKERG LEPLADQLARQDRTLPEAMAAAYVDAEKGVPDAEAALAGARDILAERFAEDQPARERLRK LFGREGLMRSKAVSGKEEVPDAAKYRDYFAWDEPAAQAPSHRILAMFRGEEEGFLRLSLR PRNEEQAQDLLARLFVRNDGPCGREVRAAALDGYGRLLAPALETELRNQLKQRADAESIR VFAENLRQLLLAPPLGAQNILAVDPGFRTGCKIVCLNREGKLLEYATINPHAGSDKARTE AGETLLRLAKKHGAEVAAVGNGTAGRETEAFIRALPGWSIPVVMVSESGASVYSASEAAR AEFPDLDLTYRGAVSIGRRLADPLAELVKIDPKAIGVGQYQHDVNQSELKRCLDDVVVSC VNAVGVDLNTASEQLLTYVSGLGPTLARNIVAHRHENGPFQTRRDLLKVPRLGPKAFEQC SGFLRIRDGKQPLDASAVHPEAYAVVERMAKDTGTTVRELMQSPERRKAIRLSDYVTDKL GLPTLTDIMKELEKPGRDPRGPFKPFSFAEGVSTMQDLKQGMKLPGIVTNITAFGAFVDI GVHQDGLVHISQLANRFVRDPNEVVKVHQEVLVTVLDVDIPRKRISLTMRGE >gi|316924016|gb|ADCP01000049.1| GENE 67 68623 - 69066 590 147 aa, chain + ## HITS:1 COG:no KEGG:Dvul_0109 NR:ns ## KEGG: Dvul_0109 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 8 135 1 134 166 66 28.0 3e-10 MNTPSSALSDEMKGAMSTLLNAVIFEQWLRFSWIEEDEEGDFCIQIPAETVSELVEDYPE YEGLIAQLNGTIVDADMACSAVLGYARSSLGEQSLAVLEHNEFQNMVGRFHQWLNDNVEA LDQDPKNFDQWCELFLADLQQAKDGNA >gi|316924016|gb|ADCP01000049.1| GENE 68 69344 - 70024 862 226 aa, chain - ## HITS:1 COG:no KEGG:LI0197 NR:ns ## KEGG: LI0197 # Name: not_defined # Def: endo-1,4-beta-xylanase # Organism: L.intracellularis # Pathway: not_defined # 29 224 37 232 235 187 43.0 2e-46 MSYFVRFLLLCCTVALAIPTYANALNERDAVVSRQTGKLQIRVHYPITGNARVDADVADW AHQAVDTFQNTYGEEPDLGVPYELETTYSTTRSTPSVLSIVWKTASYTGGAHGNLEITTT TYDMKSGALIDLYDVFENLDTALDVMSRYCTKALTKSLGDMYNDDMLRSGTAPEAENFSS FALTPEGIRIFFQPYQVAPWAAGSQVVDIPLDALADAGPRLSLWGK >gi|316924016|gb|ADCP01000049.1| GENE 69 70096 - 71256 667 386 aa, chain - ## HITS:1 COG:XF1469 KEGG:ns NR:ns ## COG: XF1469 COG0726 # Protein_GI_number: 
15838070 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Xylella fastidiosa 9a5c # 3 165 4 161 284 107 40.0 5e-23 MSAYSLPVLMYHYVSRFPGAIAVSPEHFEDQCRGMAEHGWRGIGLDEAEGFLLKGAPLPP RSLLITFDDGYLDNYVYAWPILRKYGHKGVVFAVTERMEAEKKCRPTLADVWEGLPPSSL PPVDAPMHDTPFGYQVRRDMFFSWEEARHMESSGVMAVAAHSARHLAVFAGPEWGPVNRH DRHQKPASALEAAGQRFHVPGTRANTFNAVDFPEVWGLPRFKERPFLYSRAFIPSPDLVA AVQRLVPQEPAEARTFFQSAGNVAALETLVAGFSPDRLGTLESEAARRSRVHEELGACAE TLRRELGHPVRSLCWPWGSGSEVAREEGRKAGFSVFFTTRMGANPPAAPEAVHRFKVRDA GWSWLRLRLEIYSRPWLARLYGACRI >gi|316924016|gb|ADCP01000049.1| GENE 70 71498 - 72301 683 267 aa, chain - ## HITS:1 COG:CC1872 KEGG:ns NR:ns ## COG: CC1872 COG0739 # Protein_GI_number: 16126115 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Caulobacter vibrioides # 131 246 258 373 383 107 44.0 2e-23 MTLKPSYLSFSAAKRLALFSPLLCLLLLTACAPQQSGLQHHQYSIVRYQDSLTALHDYRP PTREALSYTARRELQTAGAAVSQNGSNVRFTPIPDEEFDDGDIEVTPELLKEPSLYPALS LIQPIKETIRVTSSFGPRKHPVKRRRLMHAGVDIAGNRGEKVVASAPGKVVYSGRKGSYG LTIDIDAGKGVTLRYAHLDKLGVRKGQKVKQGQYIGNLGRTGRVTGPHLHFEVRLRDKPI NPMQFLTHEHQWASNTGTKQRKRSGNL >gi|316924016|gb|ADCP01000049.1| GENE 71 72887 - 73552 376 221 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|223039866|ref|ZP_03610150.1| 30S ribosomal protein S16 [Campylobacter rectus RM3267] # 1 213 16 236 286 149 39 4e-35 MSFSLAGCGLMDYGGGGSGTRGTKTYTVRGKTYRPYLSADGYREDGVASWYGRDFHGKTT ANGERYNMYAMTAAHKLLPLGTKVRVTHLRNGKSIVVRVNDRGPFVGDRIIDLSYASAKE LGMIGTGTARVRVEAIETFGGASPGDMNGSFYIQIAALSSQASAQDLVRRLQNRNLGGRA FYAPSLGLWRVQAGPFNSLNRAEDLSDELNRQYPGNFVVAD >gi|316924016|gb|ADCP01000049.1| GENE 72 73963 - 74922 1353 319 aa, chain + ## HITS:1 COG:ECs1354 KEGG:ns NR:ns ## COG: ECs1354 COG0861 # Protein_GI_number: 15830608 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane protein TerC, possibly involved in tellurium resistance # Organism: Escherichia coli O157:H7 # 10 313 13 334 
346 262 48.0 6e-70 MVVEHSVTEVVLFCVLIIVCLWCDLHAHQADKPVSARNAAIWSCIWVGLALVFAGYIGYS FGSEQVQLFLTGYLLEKSLSVDNLFVIMAIFSSFAVKDAFQHRVLYYGVLGALVLRLIFV AAGSSLVAMFGPYALASFGIFVLWTAWKMWQSMHSGEKEEIVDYSEHWAVRYTKRFIPVH NQLSGHDFFVKAPDTTGKLIWKATPLFLCLFVVEVSDVMFAFDSVPAIIAVTHDPFLVYT SNVFAILGLRSMYFLLAAGKRYLRHLEKSVVIILAYIGVKMLLDVVGIVHISPLISLGVV IGLLAIGILASLLPEKSAK >gi|316924016|gb|ADCP01000049.1| GENE 73 75160 - 76038 877 292 aa, chain + ## HITS:1 COG:no KEGG:LI0248 NR:ns ## KEGG: LI0248 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 184 292 107 216 216 100 42.0 5e-20 MNRTRTLLIIGVFMLSLAVLAALEFRVQDEPPSEGQMVTGSLSKDMPALPKLPSLGDSEG NKGGVLVPLDNGTLRGSAAPLDMSAIPPDEETPAGKDKPETAAPAADPAVSAEKPAPAPM GEIAPADTFTEKPKQPVAEEKAPQQEKKAEKAVAEKPAEKPQPKVSEPKQEKEQAKEGVV VLTTKAPAKLAAGQTAITATRLELGKSVVFRMTGAAPIKTKTLLLKDPNRYVVDLQGNWG IQLPRVPKDLWISGIRLGHHEESTRLVFELTRAPVSAKVVKINNTTVEVRIQ >gi|316924016|gb|ADCP01000049.1| GENE 74 76184 - 76636 -249 150 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPISLWAKTQSFLFPTRSSLVKEPEAVASAAAFPSAAKRELTPPGPACQQLFFRTPKFFF GARREDFSPAQKRELIPTPSRCQPLSSAFFKKLVRPAVPASRPCPAQRDIPSRSGENGLC TSTPPLSTTFYFFSDFFAPRSPFLPCFRFG Prediction of potential genes in microbial genomes Time: Fri May 13 02:33:17 2011 Seq name: gi|316923983|gb|ADCP01000050.1| Bilophila wadsworthia 3_1_6 cont1.50, whole genome shotgun sequence Length of sequence - 34237 bp Number of predicted genes - 38, with homology - 31 Number of transcription units - 24, operones - 9 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + 5S_RRNA 243 - 342 93.0 # CP001197 [R:1104568..1104682] # 5S ribosomal RNA # Desulfovibrio vulgaris str. 'Miyazaki F' # Bacteria; Proteobacteria; Deltaproteobacteria; Desulfovibrionales; Desulfovibrionaceae; Desulfovibrio. 1 1 Op 1 . + CDS 959 - 1420 426 ## COG3133 Outer membrane lipoprotein 2 1 Op 2 . + CDS 1496 - 2050 759 ## Ddes_2019 hypothetical protein + Term 2078 - 2116 9.3 3 2 Tu 1 . 
+ CDS 2129 - 3031 724 ## COG2962 Predicted permeases + Term 3185 - 3211 -1.0 - Term 3222 - 3250 1.3 4 3 Tu 1 . - CDS 3315 - 3557 360 ## DVU0805 hypothetical protein - Prom 3740 - 3799 3.5 5 4 Tu 1 . - CDS 3828 - 4502 343 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division - Prom 4531 - 4590 1.6 6 5 Tu 1 . - CDS 4616 - 4939 89 ## - Prom 5069 - 5128 2.1 + Prom 4632 - 4691 4.5 7 6 Tu 1 . + CDS 4905 - 5123 142 ## 8 7 Op 1 . - CDS 5412 - 6185 1027 ## COG0565 rRNA methylase 9 7 Op 2 . - CDS 6188 - 6586 420 ## DvMF_3036 cytochrome c biogenesis protein transmembrane region 10 7 Op 3 . - CDS 6499 - 8061 1185 ## COG4232 Thiol:disulfide interchange protein - Prom 8174 - 8233 78.0 + TRNA 8157 - 8232 82.7 # Lys CTT 0 0 - Term 8227 - 8278 8.4 11 8 Op 1 . - CDS 8362 - 8520 58 ## Amuc_2074 acetyltransferase including N-acetylase of ribosomal protein-like protein 12 8 Op 2 1/0.286 - CDS 8517 - 8945 250 ## PROTEIN SUPPORTED gi|229547905|ref|ZP_04436630.1| acetyltransferase including N-acetylase of ribosomal protein family protein - Prom 9014 - 9073 8.8 + TRNA 9359 - 9434 82.7 # Lys CTT 0 0 - Term 9436 - 9471 6.5 13 9 Op 1 . - CDS 9577 - 10383 701 ## COG0730 Predicted permeases 14 9 Op 2 . - CDS 10475 - 10930 -346 ## - TRNA 10759 - 10845 69.3 # Leu TAA 0 0 15 9 Op 3 . - CDS 10945 - 12201 649 ## COG0546 Predicted phosphatases 16 10 Tu 1 . + CDS 11833 - 12624 794 ## COG1189 Predicted rRNA methylase + Term 12675 - 12733 -0.5 - Term 12812 - 12854 5.4 17 11 Tu 1 . - CDS 13060 - 14133 986 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 14155 - 14214 1.9 - Term 14479 - 14520 -0.4 18 12 Tu 1 . - CDS 14598 - 14822 120 ## + Prom 14546 - 14605 6.5 19 13 Tu 1 . + CDS 14821 - 16338 1610 ## COG0531 Amino acid transporters + Term 16363 - 16410 11.3 - Term 16356 - 16394 5.2 20 14 Tu 1 . - CDS 16432 - 17316 1120 ## COG0583 Transcriptional regulator + Prom 17340 - 17399 3.2 21 15 Tu 1 . 
+ CDS 17434 - 18822 1800 ## COG1027 Aspartate ammonia-lyase + Term 18827 - 18872 3.4 - Term 19069 - 19099 -0.9 22 16 Op 1 42/0.000 - CDS 19103 - 19969 1113 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 23 16 Op 2 25/0.000 - CDS 19962 - 20729 223 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 24 16 Op 3 . - CDS 20716 - 21636 1091 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin + Prom 21647 - 21706 1.6 25 17 Tu 1 . + CDS 21783 - 22154 386 ## COG0735 Fe2+/Zn2+ uptake regulation proteins - Term 22149 - 22200 4.0 26 18 Op 1 . - CDS 22314 - 22598 371 ## 27 18 Op 2 . - CDS 22623 - 22904 196 ## 28 18 Op 3 . - CDS 22935 - 23438 626 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase - Term 23557 - 23602 10.3 29 19 Op 1 . - CDS 23614 - 24288 857 ## COG1994 Zn-dependent proteases 30 19 Op 2 . - CDS 24292 - 25488 333 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 31 20 Op 1 . - CDS 25607 - 26086 617 ## COG2606 Uncharacterized conserved protein 32 20 Op 2 . - CDS 26099 - 26710 591 ## COG2755 Lysophospholipase L1 and related esterases - Term 26756 - 26793 6.4 33 21 Tu 1 . - CDS 26891 - 28270 1746 ## COG0733 Na+-dependent transporters of the SNF family 34 22 Tu 1 . + CDS 28001 - 28492 71 ## 35 23 Op 1 10/0.000 - CDS 28683 - 30260 2049 ## COG0029 Aspartate oxidase - Term 30294 - 30323 1.1 36 23 Op 2 9/0.000 - CDS 30501 - 31550 1330 ## COG0379 Quinolinate synthase 37 23 Op 3 . - CDS 31547 - 32401 403 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 - Prom 32435 - 32494 2.4 + Prom 32651 - 32710 5.4 38 24 Tu 1 . 
+ CDS 32742 - 34187 1687 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) Predicted protein(s) >gi|316923983|gb|ADCP01000050.1| GENE 1 959 - 1420 426 153 aa, chain + ## HITS:1 COG:RSc2792 KEGG:ns NR:ns ## COG: RSc2792 COG3133 # Protein_GI_number: 17547511 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane lipoprotein # Organism: Ralstonia solanacearum # 2 152 5 154 155 76 41.0 2e-14 MLKIKHVAVCMLALAMLSAGCTNYSGNTYSGSQVRSAQTVQYGTVVSVQPVTLEEDRPAV LGTVGGGVVGGVLGNMVGGGRGKTLATIAGAALGAAGGYAGEKALTKQNGLEITVELENG QQLSIVQAADQQFSPGERVRVLRGSDGSARVTR >gi|316923983|gb|ADCP01000050.1| GENE 2 1496 - 2050 759 184 aa, chain + ## HITS:1 COG:no KEGG:Ddes_2019 NR:ns ## KEGG: Ddes_2019 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 184 1 184 184 306 78.0 3e-82 MLDPKQCVLFSGGAAGTEQFFGAQAEAWGIEEVNYSFDGHQIERKRGVRVLTSEELALKD VSLTYVSRLMGREFTKAPLFRKVLQSICWQVSSGQEVVVVGNIQPDQTVKGGTGWGAEFA KICNKPLLVFDQEQDGWYRWLNDAWVKVEEPVIEHTHFTATGTRFMEANGRKAISDLFAR SFNR >gi|316923983|gb|ADCP01000050.1| GENE 3 2129 - 3031 724 300 aa, chain + ## HITS:1 COG:VC0195 KEGG:ns NR:ns ## COG: VC0195 COG2962 # Protein_GI_number: 15640225 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Vibrio cholerae # 1 293 1 293 302 192 38.0 8e-49 MFAVPSVKSNSLGLLAGLCAFLIWGFLVVFWHMLDAVAPFEILCYRIVFSFVSLLPVVML TRRWAEVVSAFRNRRVALTMFASSCVVGLNWFLYIWAISTGQILETSLGYYTNPLMNVLF GFLFLRERPSRLQGIAILLACIGVLLSFLGHNQFPWLALSLAFTFALYGFIRKTVSVEAL PGLFIETIVLIPFSLGWILWLYWNGEGFLCSPTVREGFLLFLAGPVTSLPLVMFAYAARH MRLMTLGLLQYISPTCTFFLGIFMFGETLNPAALVTFIFIWLALLLYTVESWRQTRHLPH >gi|316923983|gb|ADCP01000050.1| GENE 4 3315 - 3557 360 80 aa, chain - ## HITS:1 COG:no KEGG:DVU0805 NR:ns ## KEGG: DVU0805 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 78 1 78 98 97 57.0 2e-19 MSANLISAVHDLVLESGLGAKNIAAAVGKPYSTLLREVNPFDDGAKLGAETLVDIMRVTG NIQPLEHIAEQFGYELKRTH 
>gi|316923983|gb|ADCP01000050.1| GENE 5 3828 - 4502 343 224 aa, chain - ## HITS:1 COG:BH4060 KEGG:ns NR:ns ## COG: BH4060 COG0357 # Protein_GI_number: 15616622 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Bacillus halodurans # 19 166 14 158 238 71 30.0 1e-12 MRSSSVSPYDLQDWAKRAGFELTEETLPPLAGYLGLLIQWNRVMNLVGTRTAEDTFFTLV VDSLHLGRFLREDVEYSAAPCCWDLGSGAGLPGLPLRMIWQEGDYWMVEAREKRALFLST VLAKYPLPGTHVFRGRAEAFMAGPPARTADLIVSRAFMPWPGVLELVKGNLNPNGVVVLL LRERLQESPDWEQAAQNWRIAGQYTYTASRTQRYLYALTAQDAL >gi|316923983|gb|ADCP01000050.1| GENE 6 4616 - 4939 89 107 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEASSSGQPPPTTRLLFTSSPFWDRVLLTVLGIVSFFLWSHSKGDVPRDIFGTLFPLSFL LAVLYPMFFLPIAPMSPEEKRRWHLKKQKKRLKRQKAKADKKRRCKS >gi|316923983|gb|ADCP01000050.1| GENE 7 4905 - 5123 142 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGGGCPEEEASIPGWLSEGGASGKMFVRNVCKPDYDKKQEHDDEECGNEFRRVSASDTIG DHPFGGTYAKPP >gi|316923983|gb|ADCP01000050.1| GENE 8 5412 - 6185 1027 257 aa, chain - ## HITS:1 COG:PA3817 KEGG:ns NR:ns ## COG: PA3817 COG0565 # Protein_GI_number: 15599012 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylase # Organism: Pseudomonas aeruginosa # 10 253 2 248 257 122 33.0 8e-28 MLVQASQDALRHLEIIMVGTKFPENIGMAARACANTGCGRLTLVSPAWWDKEKARPLATA KGEPLLDAIEVKPTLGDALAPNVLTFGTTARTGGWRRGLLTPEQAAGEIAPLLHEGSRVA IVFGCEDRGLSNADIEQCQRLVTIPTAGEASSLNLAQAILILTYECMKAVSRVDPQPSLG NDPGQQSRRITHEEQALLYARLKETLLAIDYLKSDNPDYFLMPLRRFLGKSGLRRHEMDM LMGICRQVDNLRKEAGK >gi|316923983|gb|ADCP01000050.1| GENE 9 6188 - 6586 420 132 aa, chain - ## HITS:1 COG:no KEGG:DvMF_3036 NR:ns ## KEGG: DvMF_3036 # Name: not_defined # Def: cytochrome c biogenesis protein transmembrane region # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 9 125 602 718 731 145 60.0 6e-34 MRLHLVERAARPEPAPWETFRADTFRSLLKKEPLMVEFTADWCPSCKFLEQTVLTPKRLH 
AITERYGLRLIKVDLTRPDPEAQALLRAIGSVSIPVTAIFPKGLLSNSPIVLRDLYTASQ LEDALATLSPRK >gi|316923983|gb|ADCP01000050.1| GENE 10 6499 - 8061 1185 520 aa, chain - ## HITS:1 COG:CT595_2 KEGG:ns NR:ns ## COG: CT595_2 COG4232 # Protein_GI_number: 15605325 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol:disulfide interchange protein # Organism: Chlamydia trachomatis # 239 496 1 272 401 143 32.0 1e-33 MKTKNTLHIWLVLCLVTGFLLYAASLTHAESLPFSLDTEYAELDGHPAAVLWVTPQKGYH TYSHYADAPIPTSLNVRNVQGHPTAATVLYPQGRLTPDTFEPAKQILAYTARFPIFIRFD APVSGPLSADLSMLLCSDKNCVPVQQTFSLALPETLPPPSPAALEAYGQLGKAETPPASS EPISPQAPPQQSVTQLTPLSGPSVQLAGSPAEPDEWRFTPRFPQESLEPTALGTALFLGL LAGLILNVMPCVLPVLTMKVSALLSSSGYETEGQRLAHFREHNVLFAAGILTWFLVLAFC VGALGLAWGGLFQNTHLVYGLLILVFLLSLSLFDVFTLPVLDFKVGASRNPKTQAYLTGL VATLLATPCSGPLLGGVLGWAALQPLPVIVAVFTATGIGMALPYLVLAAWPGAARILPKP GAWTGIMERLVGFFLMGTAIYLLSILPESQRLAALVTLLVCALAAWMWGQWGGLRASGRQ KLFTGALALLMVCGSIWWSVQPAPNPRRGKRSGRTRSVRC >gi|316923983|gb|ADCP01000050.1| GENE 11 8362 - 8520 58 52 aa, chain - ## HITS:1 COG:no KEGG:Amuc_2074 NR:ns ## KEGG: Amuc_2074 # Name: not_defined # Def: acetyltransferase including N-acetylase of ribosomal protein-like protein # Organism: A.muciniphila # Pathway: not_defined # 1 47 143 189 198 66 65.0 3e-10 MKRLGMLYQYSYKEQWQPKNILTTFCMYQLNFDGQDKRVYKGYLDQSPNQAD >gi|316923983|gb|ADCP01000050.1| GENE 12 8517 - 8945 250 142 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229547905|ref|ZP_04436630.1| acetyltransferase including N-acetylase of ribosomal protein family protein [Enterococcus faecalis ATCC 29200] # 9 122 21 129 204 100 44 5e-48 MGFTNTPTLETERLLLRRFSEKDLYALFKIHSDKDVNTYLPWFPLQSIEEASLFLRERYL DTYRQPYGYKYAICLKNDDIPIGYVNIGINEPHDLGYGLLKAFWRNGIVTEAANAVIHQV KKMAFLTLRPHTISRIPVAEMS >gi|316923983|gb|ADCP01000050.1| GENE 13 9577 - 10383 701 268 aa, chain - ## HITS:1 COG:PA0340 KEGG:ns NR:ns ## COG: PA0340 COG0730 # Protein_GI_number: 15595537 # Func_class: R General function prediction only # 
Function: Predicted permeases # Organism: Pseudomonas aeruginosa # 7 262 5 259 267 144 39.0 1e-34 MLSMLALLVLGGAFVGLLSGLLGVGGAVIIMPLLNVFFDRLGFSVDTTQHLSLGTSLATI LFTSLSSVLAHRRYGSVRADIWKKMAPGIIAGTLGGALLAPHLPGLFLRGFFAFFVMLVG VHLLFNSTPRPKTGRLERAMLPVSILIGLISSLAGIAGTMLCVIFLVWAAIDWADAVGTS AALSLPISLTGTLGYAIAGWNFPELPPYSVGFVYLPGMFCLLVSSMSMAVVGARLAHSPR LPMQALRRCFAVGNILLGLSILRSVLFR >gi|316923983|gb|ADCP01000050.1| GENE 14 10475 - 10930 -346 151 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHFVDRKPKGEQPRLLTPHPDTSILYSSCPDGGTGRRKGLKIPRPYGCTSSILVPGTIKK QTGYDTNPHPFTSQNKPPHACSCNCKCKQEGVFLFSSLLPLPPPSAPQPLRTRRHVFLHT LWAGSAYFPWGVGGASRASGGHPGSLSISAG >gi|316923983|gb|ADCP01000050.1| GENE 15 10945 - 12201 649 418 aa, chain - ## HITS:1 COG:BB0676 KEGG:ns NR:ns ## COG: BB0676 COG0546 # Protein_GI_number: 15595021 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Borrelia burgdorferi # 199 410 5 216 220 166 39.0 9e-41 MELLLAHVHRIHLRRAVLQQAVRKPARGSPGVQTDEPGHIQLEMLYGGQQLIRATADVLL DADQCEGRIQRIFVTGLIHFLPHAGNIFQRHFARHDQAFRHFAAFRQTLRKNQLVRTFLT LRHTAPASEKIAAIPRHRARVSKIQPRCNPLHSLRTRFPLERHGVLCHAAVSPVPFACVW HSGYNGFQRIAMNKQNHHAAIFDLDGTLLDTLDDLANAANAALHSAGYPQHPVAAYRQFV GNGLRMLVRRALPEGEADRVGPAGFEALVERTGANYARDWAVKTRPYPHIPELLQELQRR GIPLAVVTNKPHEWTLHMLKHYFPESPFLFIQGAMPNLPHKPDPTGALNAARHLDSIPNE TVFVGDSNVDMLTAHNAGMTAVGVDWGFRGAQELKESGADKILYDPLELLPFFEKYAS >gi|316923983|gb|ADCP01000050.1| GENE 16 11833 - 12624 794 263 aa, chain + ## HITS:1 COG:aq_773 KEGG:ns NR:ns ## COG: aq_773 COG1189 # Protein_GI_number: 15606155 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase # Organism: Aquifex aeolicus # 7 251 2 240 258 206 48.0 3e-53 MAQGKERADELVFAQGLAESREMAKRLIMAGKVALEDVSGVRQKVDKPGHKYPLDTAFAL VGIEKYVSRGAYKLLTAIEHFKLDVTGFVCLDAGASTGGFTDCLLQHGAAKVYAVDVGKE QLHERMRRDPRVISMEGTNLRTAPEELIPEPVDIIVADVSFISLTLILPPCVRWLKEGGL VAALVKPQFELGPHQTDKGVVRDPALRQQAVDKVLLFCQERLGLECLGVVPAAVKGPKGN QEYIVCLRRLGAGQVACGSEVEQ >gi|316923983|gb|ADCP01000050.1| 
GENE 17 13060 - 14133 986 357 aa, chain - ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 50 289 47 282 332 75 27.0 2e-13 MRRSTLSGLFGFFALLLALTADLSAAYQEPPFKLETGWLPEHETFLIWYAKQKGWDKQEG LDIVLNRFDSGKDLIGNADKWVIGACGAFPILTSPYPEQFTIIGVGNDESLANAVMVRPD SPLLDTKGANKGYPNLYGKADDVRGKTIICPASSSAQYLLTKWLAAYGLTTEDVRIQNTD AVPGLQAFFGGEGDAIVLWAPYSYTGDEHGLKVAATSRSVNAPQSVLLMADKDFAAAHPS QVASFLRVYLRAVRMMREESVATLAEDYKTFFKEWAGKTMSDAEVIKDITIHQVFLLEDQ LRMFNAEHSRSDMQDWLASIIRFHAGDAYSQDKTEELLGLVTGAFLEKVERPIPEYR >gi|316923983|gb|ADCP01000050.1| GENE 18 14598 - 14822 120 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTKSPSLSSRFKSGFFALFFVHRRIPALNEKGRIIQKLYWNVGIVPEGTNSLASAEPKYL NRHCKEAFIRINRL >gi|316923983|gb|ADCP01000050.1| GENE 19 14821 - 16338 1610 505 aa, chain + ## HITS:1 COG:SA0541 KEGG:ns NR:ns ## COG: SA0541 COG0531 # Protein_GI_number: 15926262 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Staphylococcus aureus N315 # 1 478 1 478 494 328 39.0 1e-89 MSNEHVESGREELKKTMSPAEVWALAVGAIIGWGCFVLPGARFLPEAGPIGSVLAFIVGG GLLCFVALAYSILVKAYPVAGGAFTYAYVGFGKAWAFICGWALILGYLCVIAANGTALAL LSRALLPGVFDVGYLYSVAGWDVYAGELAMMTCAFLFFGYMNFRGMDFASSIQLILAFAL VAGVLILTVGSFSTSTASLDNLFPLFAEGRSPWACVISIVAITPWLFVGFDTIPQTAEEF DFPPEKSRRLMLNSIICGATLYALVLISVAIIIPYTDLLGQNHTWATGAVADMAFGRFGG MILAIPVLAGILTGMNGFFMATTRLLFSMGRGKFLHPWFLKVHPKYGTPTNAVLFTLGLT LIAPFFGRSALNWIVDMSAMGTALAYLSTCLVVYKYASRFADQTPWWSKSVAVVGALTSI ACFLMLAVPGSPAAIGFESWIMLLAWVAIGACFYFSRASILRSIPETSMQYMLLGTADRP LLFTPKAAEEEVFEEDDAKSAVELG >gi|316923983|gb|ADCP01000050.1| GENE 20 16432 - 17316 1120 294 aa, chain - ## HITS:1 COG:AGc4564 KEGG:ns NR:ns ## COG: AGc4564 COG0583 # Protein_GI_number: 15889777 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Agrobacterium tumefaciens 
strain C58 (Cereon) # 1 292 9 304 306 171 37.0 2e-42 MDQKLLEDFLSLCRHRSFSHAAQERNVTQPAFSRRIRALEEWLGVVLFDRTALPVRLTAQ GEQFLPVARDIVDRMAEARREFAAGNRSDGIVRLISLPTLSINVLPDLLFRVRQTCDQLR FVVNPFPQSVEEHFCALMEHQVDMLLTYRLDEYESDSEFLSHIESATVGRERFLPVVGSA LAERLPSASPLPYLSYSNFSFANRIILPLYEKTPCELQSVYESSLSEGIVKMLEKGIGMA WVPETLVSEQLKRGTIRRIWEDRPDLSVEVDIKLYRNKTVHRAAIETFWKAILP >gi|316923983|gb|ADCP01000050.1| GENE 21 17434 - 18822 1800 462 aa, chain + ## HITS:1 COG:Cj0087 KEGG:ns NR:ns ## COG: Cj0087 COG1027 # Protein_GI_number: 15791475 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Campylobacter jejuni # 3 460 4 461 468 598 62.0 1e-171 MPRLEHDFLGEMRIPSDTYYGIQTYRAVENFPISGVSIASFPELIKALGLVKKASAIANM QFGVLPREIGDAICAACDEIIEGKLHDQFPVDCIQGGAGTSTNMNANEVIANRALELLGH KKGEYEYCHPNNHVNCSQSTNDIYPTALRIALYTMLGQLTGDMAYLQQGFAAKSEEFSDV IKMGRTQLQDAVPMTLGQEFGAYATTIGEDILRVEESRKLLLEVNLGATAIGTSINAPKG YAEVASKALAEVSGIPVVLSTNLIEATWDTGAYVQTSGVLKRIAVKLSKICNDLRLLSSG PRTGINEINLPRLQPGSSIMPGKVNPVIPEVVSQVAFDIIGKDVTITMAAEAGQLELNVM EPIIAFSLSNGIHRLRQAMRTLQDRCVSGITANKERCRSMVENSIGIITALNPYLGYETS ASIAKEALETDGSVIDIVLARNLLSREKLDDILAPEHMINRK >gi|316923983|gb|ADCP01000050.1| GENE 22 19103 - 19969 1113 288 aa, chain - ## HITS:1 COG:HI0407 KEGG:ns NR:ns ## COG: HI0407 COG1108 # Protein_GI_number: 16272356 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Haemophilus influenzae # 27 275 8 254 261 129 35.0 6e-30 MLDLEPLYRLVSLFPFECLQAGFMQQALVALILLAPMAATMGVQVVSFRMAFFSDAISHS AFAGVALGLIFSINPHVAMPVFGILVGLGIMAVQRNSALSSDTIIGVFFSAVMAFGLAVV SRDNAVARDLQQFLYGDILTITETDIRWLIGLFFALAAFQIWGYNRLLYIGLNSVVAKAH RINVFFWQYLFAGLLALVVMFSVWAVGVLLVTALLIVPAATARNLARTAGGMFWWSILVS VTSAVAGLVLSAQEWAGTATGATVVLVSCGWFLLSAVLAAIRGEGRRQ >gi|316923983|gb|ADCP01000050.1| GENE 23 19962 - 20729 223 255 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 4 
219 2 218 311 90 30 1e-17 MEPTNAVVFRDVSVRRSGLAILEHVNATVPQGSCTVIVGPNGAGKTTLILALIGEMRHDG DIDVMTGRSGKPLRLGYVPQRISIDRGMPLTVVEFLVMGIQKRPLWLGIRPSLKAHSLEL LSMVKAEHLASRRLGDLSGGEMQRVLLALALQQEPELLVLDEPSAGVDFQGEHLFCELLD ELRAAKGFTQLMVSHDLGMVFHHATHVICLKRHVFAEGTPDEVLTQENLMALFGMHMGLI NPHAPTADQPKDHHA >gi|316923983|gb|ADCP01000050.1| GENE 24 20716 - 21636 1091 306 aa, chain - ## HITS:1 COG:all0833 KEGG:ns NR:ns ## COG: all0833 COG0803 # Protein_GI_number: 17228328 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Nostoc sp. PCC 7120 # 24 301 62 336 339 150 36.0 4e-36 MRKTTAGIVSLMILTLMCGAALAQTRVLATTFPVYQIVRNITQNVPDVEVQLMLPAQAGC PHDYALTPQDMSKLAQADILVLNGLGLEAFLGSPSARAQKELHTIDSSKGISGLLPYTDA EAAHEEHEGHHHGGMNPHLFASPRMAAQMTRSIAGQLADLDPANAATYWANAENYARTLD ALADEFAALGGKLKNSRIITQHGVFDYLARDMGLDVVAVIQADDTQAPSASDMMKLIKAI RSQHVGAIFTEPQYPDKVAATLSRETGVATAKLDPVATGPAIAPLDYYEKTMRANLHTLE STLGTN >gi|316923983|gb|ADCP01000050.1| GENE 25 21783 - 22154 386 123 aa, chain + ## HITS:1 COG:DR0865 KEGG:ns NR:ns ## COG: DR0865 COG0735 # Protein_GI_number: 15805891 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Deinococcus radiodurans # 2 123 4 126 132 89 42.0 2e-18 MKRKTSQRAAIEQVFCQLDRPLGIEEILETGRMAVESLNQATVYRNVKLLLENGWLKQVC HPSLGTLYERTGKGHHHHFHCRVCNRVYDLPGCALNEREAAPSGFLVEDHECFLFGVCPA CHA >gi|316923983|gb|ADCP01000050.1| GENE 26 22314 - 22598 371 94 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKPLFYLLTAAAHPFGLYVVVPLYMEHCYVVTGSDGAGRAMAAGFAELFAIALWTLGVVI VSLLVSRLHYKEWLPTIGINTIIILIYLRLLLGL >gi|316923983|gb|ADCP01000050.1| GENE 27 22623 - 22904 196 93 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKGLFHALTAAAYPFGAYVLLPAITRGVGTAGDPMGRGLSRGFAALFFMGFWTLCVLIV SLTGSKAQYGECLTSVVVNAFAIVLTTLLCQLI >gi|316923983|gb|ADCP01000050.1| GENE 28 22935 - 23438 626 167 aa, chain - ## HITS:1 COG:aq_185 KEGG:ns NR:ns ## COG: aq_185 COG2870 # Protein_GI_number: 15605755 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Aquifex aeolicus # 12 159 5 150 157 164 56.0 7e-41 MDMLKHPGMADMPELLEKLDALRKEGKKIVFTNGCYDILHPGHVDLLERCKAQGDVLVLG LNSDESVKRQGKGDDRPVNPYPVRAFVLAHLESVDFVVRFDEDTPAKLIEAVQPDVLVKG GDWSVDRIVGRESVQRRGGEVISLPLLQGYSTTALIEKIRRTAPGSR >gi|316923983|gb|ADCP01000050.1| GENE 29 23614 - 24288 857 224 aa, chain - ## HITS:1 COG:aq_1853 KEGG:ns NR:ns ## COG: aq_1853 COG1994 # Protein_GI_number: 15606892 # Func_class: R General function prediction only # Function: Zn-dependent proteases # Organism: Aquifex aeolicus # 20 219 15 217 217 139 45.0 4e-33 MLDIDLAVSIKRLSVAFVPLMLGIILHEVAHGWAALKRGDPTAAMLGRLTLNPVPHIDPM GLFVFVLTSLTGPFVFGWAKPVPINPRNFRNIVKDTMLVSFAGPATNFLLSIGFAVLLRL LIEFFPLGEWQGNTVWDFFFLMFQTGVVVNIGLGWLNLMPIPPLDGSKILWGVLPPKLGF QYMQLERYGFLVLILLLMTGALGYVLYPLIQFSVNTVFSLIVLS >gi|316923983|gb|ADCP01000050.1| GENE 30 24292 - 25488 333 398 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 39 391 42 410 418 132 30 2e-30 MIPVDKQMQLIKRGITNLIDEGELRKKLERGKPLTVKAGFDPTAPDLHLGHTVLIHKLRH FQELGHRVVFLIGDFTGRIGDPSGRSATRPPLTTEQVIANAETYKEQVFKILDPEKTVVE FNSRWLNDFTAADFIRLGSRYTLARMMERDDFAKRYRENTPIALHELLYPLMQGYDSVAL KADVEMGGTDQTFNLLVGRTLMGQYDLEPQCILTVPLLEGLDGVRKMSKSYGNYIGINEP AADQFGKAMSVSDDLMWRYYELISSKSLDDIAALKKDVEEGRLHPKKAKEALAYEIVARY HGEDQAQEALQGFNSVFADGGIPDDAPEFACEHGEASKPPVFLTDSGLAASRGEAKRLIK QGSLSLDGERCDDAETPLEPGSYVVKLGKKRFLRLTVR >gi|316923983|gb|ADCP01000050.1| GENE 31 25607 - 26086 617 159 aa, chain - ## HITS:1 COG:all0659 KEGG:ns NR:ns ## COG: all0659 COG2606 # Protein_GI_number: 17228155 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 5 158 2 155 156 171 57.0 4e-43 MPPKKTNAARLLDELGIGYELHEAPYDEADLSATAMARSLGVPVEEVFKTLVVRGDKTGV LEVCVPGAAELNLKELAAVSGNKHVEMVPLKEVQPLTGYIRGGCSPLAGKKHYPVFVDES AILHERIYVSAGHRGVQLRLAPDDLLRAVEGTYAAIARY >gi|316923983|gb|ADCP01000050.1| GENE 32 26099 - 26710 591 203 aa, chain - ## HITS:1 COG:mlr4347 KEGG:ns NR:ns ## COG: mlr4347 COG2755 # Protein_GI_number: 13473671 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Mesorhizobium loti # 3 76 55 127 238 60 44.0 2e-09 MHLVCFGDSICYGYGARPGEGWVAQMTRMLATLREPVAVTNAGVSGDTTEDGLRRMQWDV ERHHPNAVFVQFGLNDASFWYTSVGRPLVGFRTYLANMGEIIQRSFRSGAHTVFLATNHR PAEAPDIPGAELYRRTVREYNEGLRATFSGMKGIVLIDMERLILDQFPNPDTILSPDGVH LNRTGNDFYFMAIGRRIARSLAG >gi|316923983|gb|ADCP01000050.1| GENE 33 26891 - 28270 1746 459 aa, chain - ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 6 455 8 454 459 501 59.0 1e-141 MSQNQQREQLGSRIGFLLLAAGCAIGLGNVWRFPYIAGAYGGALFVFIYIGFLLAVGVPI LVMEFSVGRAAQRNLGAALQKLEPKGSRWHTFGPLSLIGSYLLMMFYTTVAGWMLAYCWS MLKGDLSGLTPEQVGGFFGGLISDPTSSCLWMGVAVIFCFTVCGMGLRRGVERVVKFMMV GLFALMIALVVRAVTLPGGEAGISFYLMPDPEKLTGGLWAAVTAAMGQAFFTLGLGVGSM TIFGSYIDKSRSLTGEALYIVLLDTVVALMAGLIIFPACFAFGVNPGSGPGLVFVTLPNI FNSMPFGRFWGSLFFVFMSFAALSTVIAVFENIVSYCMDVWGWTRKKASTVNCVTMFLLS LPCTLGFNVLSGFQPFGEGSSVLDLEDFLLSNNLLPFGALLFLSFCCHRWGWGWNNFIAE TDQGQGARFPHWLKPYLQYILPCVLVFLFIQGYVDKFMK >gi|316923983|gb|ADCP01000050.1| GENE 34 28001 - 28492 71 163 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPARAFGLQLLQGRAQISLRRPAHGELHDQNGNADRQQEADIDEHEQRAAVGSCDIGETP DVPESDGAARGEKKKTDAAAELFSLLVLTHAFLNPLCPAGTSVFRVSLVYGNLPHLAAVT TEGGGRNAPVGRNNVAPPLLQKSEEAGKRENRADKAFFGTLFL >gi|316923983|gb|ADCP01000050.1| GENE 35 28683 - 30260 2049 525 aa, chain - ## HITS:1 COG:PA0761 KEGG:ns NR:ns ## COG: PA0761 COG0029 # Protein_GI_number: 15595958 
# Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Pseudomonas aeruginosa # 4 525 3 518 538 399 42.0 1e-111 MTQQRLKTQVLVIGAGISGSTSALTLADAGYEVTLIVPGKELDGGNSALAQGGIVYRSDP NDAKDLEKDILIAGHHHNYVKAVRHLCQHGPEAVERILMERAPIPFDRLDGKLDFTREGG HGKHRIVHCADHTGKTIMEGLLQAVRMSPNIRILTGRTAVDLITTDHHARYKEYRYQLDN QCLGAYVYNEDTREVETLLADFTVLATGGVGQVYLHTTNTAACTGSGIAMAQRAGVRLDN LEYVQFHPTALYTRQSHSFLITEAMRGEGARLTNAKGEFFMKRYDERADLAPRDIVARAI MDEMLHSGEPCMYLDVSGVKHDIPTRFPTIYQHCMELGIDINKKPIPVVPVAHFFCGGIL VDASARTTLTRLYSVGECSCTGLHGANRLASTSLLEALLWGYSAGQDIAQRITKRGYISK RLADAIPDWESTGDERNDDPALIAQDWATIRNTMWNYVGISRTASRLHRAFDDLRALSRH LHDFYKNTAISKPIVDLFHGCQAAYSITQSALRNPRSLGCHHRIN >gi|316923983|gb|ADCP01000050.1| GENE 36 30501 - 31550 1330 349 aa, chain - ## HITS:1 COG:BS_nadA KEGG:ns NR:ns ## COG: BS_nadA COG0379 # Protein_GI_number: 16079837 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Bacillus subtilis # 13 345 30 365 368 256 41.0 7e-68 MTTAPLSNDARAIEELRSKLGGRLTIVGHHYEQEATIQHCDIRGDSLELARRVPGIASDY IVFCGVYFMAESAALLAREGQQVLLPDHSADCVMAQMTPARLLDRVLGRLTASGRKLVPL AYVNTSLAVKAVVGRYGGAVCTSANAEKMMNWAFRQGDGVLFLPDKNLARNTARKLGITG RDTHILDVRKTGEAVDLAAADKAALVLWPGLCAIHARFHPEQIEAVRKADPSCKVIVHPE CSPEVVRAADGAGSTTYIIEYVRNAPDGAHIYVGTEINLVERLAREQEGRIRVEALRSSA CSNMAKITPEKLRATLEGIVAGNTDPITVPSEDSAPAKASLERMLEACA >gi|316923983|gb|ADCP01000050.1| GENE 37 31547 - 32401 403 284 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 31 278 41 283 286 159 38 2e-38 MHLLSILELSLAEDGPDLTTEGVFGPGDRLQAIIVAKEQTIIAGLPVIPLVMDLCAAHPE ELSYTWQSLAKEGEEVAKGSIVARISGPARQVLRAERVILNFVSHLCGIANLTRRYAEKL EGTGVRLLDTRKTMPGLRYPDKYAVLCGGGHNHRRNLAEMLMLKDNHIDAAGSMTKAVEL LRGRYNPCPPIEVECRNLDEISEAVACRVDRIMLDNMTPDMLPEALEMIPSFIETEISGG VSLDTIRAYATVSTVRKPDFISVGRITHSAVTADLSMRIAKESL >gi|316923983|gb|ADCP01000050.1| GENE 38 32742 - 34187 1687 481 aa, chain + ## HITS:1 COG:SA0867 KEGG:ns NR:ns ## COG: SA0867 COG2239 # 
Protein_GI_number: 15926597 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Staphylococcus aureus N315 # 64 480 41 457 461 249 34.0 7e-66 MSDTAPEEKKTAPDGIARSLEEKERKNSGERPETKSDEPADNTRQRALSQCIEYLEDGES EYSHPADMAEHLENLSLKEQVCLFRHLPADEAAEALAELDQEVAVDVLENLDPDEAAQII AEMSPDDAADVLDELEEGHRDVLLSNLDRDDAEELRNLLAFDPDTAGGIMNTEIILLEQD ITVDEAISLIRRGMEEDMEIPYYAYVVDEDDKLVGVFSLRDLMLSKPGTILRDSLHDQDV IAVRYDTSSEEVARRMSHYNFMAMPVVDYEGRLLGVATYDDILDIMQDEASADLLGMVGA GQDETVDTPWLESVQIRLPWLIVNMVTSMMSAFVVYMFEGSIAGMALLAVLMPMVANQAG NTGQQALAVMIRQLATEKFDRKRAWIAVLREGKIGIASGVIMAMLACVGVWFTSSSPELG MVMGAALMGDMLLGALAGGSIPLIFRAVGRDPAQASSIFLTAITDSAGFFIFLGLATAFL L Prediction of potential genes in microbial genomes Time: Fri May 13 02:34:44 2011 Seq name: gi|316923971|gb|ADCP01000051.1| Bilophila wadsworthia 3_1_6 cont1.51, whole genome shotgun sequence Length of sequence - 12611 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 4, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 164 - 190 0.1 1 1 Op 1 . - CDS 230 - 2692 3137 ## COG0466 ATP-dependent Lon protease, bacterial type 2 1 Op 2 . - CDS 2734 - 3414 368 ## COG2003 DNA repair proteins 3 1 Op 3 . - CDS 3482 - 4261 767 ## LI0150 hypothetical protein - Term 4375 - 4412 8.5 4 2 Op 1 . - CDS 4490 - 4954 648 ## LI0151 hypothetical protein 5 2 Op 2 . - CDS 5047 - 7557 3705 ## COG0495 Leucyl-tRNA synthetase 6 2 Op 3 . - CDS 7633 - 8127 699 ## COG0781 Transcription termination factor 7 3 Tu 1 . + CDS 8150 - 8812 119 ## + Term 8975 - 9015 10.3 - Term 8429 - 8472 9.6 8 4 Op 1 18/0.000 - CDS 8480 - 8950 649 ## COG0054 Riboflavin synthase beta-chain 9 4 Op 2 15/0.000 - CDS 9008 - 10234 1686 ## COG0108 3,4-dihydroxy-2-butanone 4-phosphate synthase 10 4 Op 3 16/0.000 - CDS 10263 - 10925 673 ## COG0307 Riboflavin synthase alpha chain 11 4 Op 4 . - CDS 10910 - 12049 1036 ## COG0117 Pyrimidine deaminase 12 4 Op 5 . 
- CDS 12046 - 12543 573 ## COG2131 Deoxycytidylate deaminase Predicted protein(s) >gi|316923971|gb|ADCP01000051.1| GENE 1 230 - 2692 3137 820 aa, chain - ## HITS:1 COG:BS_lonA KEGG:ns NR:ns ## COG: BS_lonA COG0466 # Protein_GI_number: 16079872 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Bacillus subtilis # 49 819 3 773 774 761 51.0 0 MTFDSDNDNDEKKEETPARRRRRPARRRVPAQIQDEQMDASEDVTPELQEIPSTMPLLPL RDVVVFNYMIVPLFVGREQSVQAVEAAATHGRHIFLCAQKDGQVDNPKADDLYPVGSVAL ILRLLKMPDGRIKALVQGVSRARLVDLNESGPYLSANVELMPEPEAVAPESEQEALIRFA REQCERILSLRGIPTGDIMGVLSNVNEPGRLSDLIAANLRLKMEEAQEILQCIDPMDRLR LIITHLVHESEVATMQIKIQTSAREGMDKAQKEYYLREQLKAIRKELGDGPDADEELEDI TKAIEKAGLPADVRKEADKQLKRLATMHGDSAEASVVRTYLDWLAELPWKKMSKDQLDIV KAKDVLDEDHYGLTKIKDRILEYLSVRKLNPDSKGPILCFAGPPGVGKTSLGRSIARALG RKFQRISLGGMRDEAEIRGHRRTYIGAMPGRIIQAMKQAGTRNPVIILDEIDKLGNDFRG DPSSALLEALDPEQNFNFSDHYLNVPFDLSKVLFICTANHLENIPGPLRDRLEIISLPGY TQQEKLAIARKYILPKEMHENGLKDRELTLPDAAMNRIIREYTREAGLRNLEREIGTVCR KIARRKAEGSKPPFRVTAAGLPRLLGAPIFIDDEAERKLIPGVALGLAWTPAGGEILYIE VSAIKGKGGLTLTGQLGDVMKESAQAALTYARAKADELGIAPDFAEHTDIHIHVPAGATP KDGPSAGVTLVTALISALTGKTVRGDICMTGEITLRGRVLPVGGIKEKVLAGVARGIGHV ILPAKNQKDLEEIPQELRRKIVVHTVDSIDDVLPLVFEKN >gi|316923971|gb|ADCP01000051.1| GENE 2 2734 - 3414 368 226 aa, chain - ## HITS:1 COG:BMEI0718 KEGG:ns NR:ns ## COG: BMEI0718 COG2003 # Protein_GI_number: 17987001 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Brucella melitensis # 4 221 27 244 249 150 39.0 1e-36 MSTDTPHYAGHRARLRERLLKDSTALADYEILELLLGYALLRRDTKPLAKELLSRFGSIR GVLDALPAEMQQVDGVGEGVAALRLLLREMMARYAEAPMRERKVLCTPRDVAQMVIPRLS GSPHEELWTATVDAQNRLLTWERLARGTVGMVPCYPRDILERVLQRKASGFFLVHNHPGG TPKASPEDIEMTRNIQRIATSMGLRLLDHLIVGDGACYSIREDGLL >gi|316923971|gb|ADCP01000051.1| GENE 3 3482 - 4261 767 259 aa, chain - ## HITS:1 COG:no KEGG:LI0150 NR:ns ## KEGG: LI0150 # Name: not_defined # Def: hypothetical protein # Organism: 
L.intracellularis # Pathway: Purine metabolism [PATH:lip00230]; Pyrimidine metabolism [PATH:lip00240]; Metabolic pathways [PATH:lip01100]; DNA replication [PATH:lip03030]; Mismatch repair [PATH:lip03430]; Homologous recombination [PATH:lip03440] # 1 257 67 325 325 243 46.0 7e-63 MRNAHALTADVWKRLSAALSRPNPQTWPLFCLEVAWEKGQPKIPAHIAKLPCFTFADAKG WIWRSPGLEPRSLRRHIQIRSKALGLDLEPGCVDILAETLLPDAAAVDAELSKLQLLAGG KPLTQEQARSVSPTTEFNVFAFLRQLQAGQTASVWKSILEEQAKGEEPLFYLLAMLQREA RQLWQILAGEQVRMGPSDQQAKQQTASRLGAAGLAKLWDAMHTAELSVKSGRRSPSQALD ALMGDLTLLFTPAQRRSPR >gi|316923971|gb|ADCP01000051.1| GENE 4 4490 - 4954 648 154 aa, chain - ## HITS:1 COG:no KEGG:LI0151 NR:ns ## KEGG: LI0151 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 2 154 23 175 175 200 58.0 1e-50 MLTGCGYVWRGQEGSLSENSVLGNGSKTLKIKSVEQTSLYPWLTYQVRSLVRDDINARNL AKWVDDGQADYTLTVRIPSFKVRSYGQYRSASQLYTATISIEFIVYDGKTNTEVWRSGPI YYEENYENANEESAIKSILEMAVRRCMDALQQRF >gi|316923971|gb|ADCP01000051.1| GENE 5 5047 - 7557 3705 836 aa, chain - ## HITS:1 COG:TM0168 KEGG:ns NR:ns ## COG: TM0168 COG0495 # Protein_GI_number: 15642942 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Thermotoga maritima # 5 834 2 823 824 941 53.0 0 MSKPKRYDPAAIETKWQSRWENEKTYACHVDKDKKKYYVLEMFPYPSGNLHMGHVRNYSI GDVVARFKRMQGFNVMHPMGWDAFGLPAENAAIKHNIHPSVWTHANIDNMRAQLKRLGYS YDWDREVATCDEPYYRWEQLFFLRWLEKGLVYRKKASQNWCPHCNTVLANEQVVDGLCWR CDTPVVQKELTQWFLKITDYADELLADLSKLESGWPDRVLSMQRNWIGKSVGAEITFPLE SGEGDIKVFTTRPDTVFGVTFMTLAPEHPLVESLISGKPNEAEARAFIERTHNMDRIDRQ SDSLEKEGVFTGSYCLNPFTGRQVPIWLGNFVLAEYGTGAVMAVPAHDQRDFDFSKKYGM ERIVVIQPEGEAPLTPENLTEAYTAPGVLVNSGEFDGLPNEDAKKAIADALEASGKGKRT TNWRLRDWNISRQRYWGAPIPVVYCDTCGMVPVPEDQLPVRLPLDVQTHTDGKSPLPHTP EFYNCTCPKCGGAAKRETDTMDTFVESSWYFARYTDARNDKAPFDMEALRYWLSVDQYIG GVEHAILHLLYARFFTKVLRDLGYFPKEIDEPFANLLTQGMVLKDGSKMSKSKGNTVDPT EMIAKYGADTVRLFCLFAAPPERDFDWSDTGIEGASRFLNRIWRLYADTCEVLSPVGACS 
STAADATTAAAKEVRLKEHLTVKKAGEDIGNRYQFNTAIAAIMELVNALYLAKDELATTE EGRKILSSAMATVLTLLAPITPHVCEELWEDLGHARSIDQEPWPEWKEDALQRDVLTVVI QINGKLRGKIEVPASASKEEVEQLALTEQNIVRHLEGLTVRKVVVIPGKLVNVVAN >gi|316923971|gb|ADCP01000051.1| GENE 6 7633 - 8127 699 164 aa, chain - ## HITS:1 COG:BS_yqhZ KEGG:ns NR:ns ## COG: BS_yqhZ COG0781 # Protein_GI_number: 16079488 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Bacillus subtilis # 9 144 3 121 131 87 36.0 1e-17 MSKNRPHARRAARSRAFQVLYSLQFSPSTTLGDVRTAFLEMPDPTDADMDENAETREEKL YGFAWELVEGVWSNVKNLDSVIERFSQNWRVDRLGKIELTLLRLAVFEMLYRADVPPKVA INEALELSTRFGDAKAKSFINGILDAAIKAQEAGTLTVAGKADA >gi|316923971|gb|ADCP01000051.1| GENE 7 8150 - 8812 119 220 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MECRKNELPGASRKTARQRRKNAADKTLGNRALSVAAGNRFVPAGAYPRIRREPFRLWMA PTAYRGLAAGVFSEEGPIQRNTRRTASPPSKMIAEGKRGRRFYERLPRTVLNLLQYADGL KDGGGGFNALVSGLGARAFDGLLDVVRGQQAEAHRHAAFERHMGKALGGFGRHEIKMRRP APDDGADGDDAVVFAGGRQLLAGEGHFEGAGNADDGQVFL >gi|316923971|gb|ADCP01000051.1| GENE 8 8480 - 8950 649 156 aa, chain - ## HITS:1 COG:PA4053 KEGG:ns NR:ns ## COG: PA4053 COG0054 # Protein_GI_number: 15599248 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Pseudomonas aeruginosa # 4 156 3 155 158 176 59.0 2e-44 MNQIKTIDGHLNAADLKFAILATRFNDFIVDRLVGGAVDYLIRHGGSEENLTIVRVPGAF EMPLACKKLAASGKYDGIVAVGAVIRGGTPHFDFVAAEATKGLAHVSLESGVPVGFGLLT TDNIEQAIERAGTKAGNKGVEAASAVLETVRVLQQI >gi|316923971|gb|ADCP01000051.1| GENE 9 9008 - 10234 1686 408 aa, chain - ## HITS:1 COG:RSc0713_1 KEGG:ns NR:ns ## COG: RSc0713_1 COG0108 # Protein_GI_number: 17545432 # Func_class: H Coenzyme transport and metabolism # Function: 3,4-dihydroxy-2-butanone 4-phosphate synthase # Organism: Ralstonia solanacearum # 1 201 24 224 226 273 68.0 6e-73 MAICSVEEAVEDIRQGRMIILVDDEDRENEGDITIAAEHVTPEAINFMAKYARGLICLPL GPSLVDKLDLPLMTQRNGSKFGTNFTVSIEARNGVTTGISAADRARTILAAVADDVRPED LVTPGHVFPLRAQPGGVLTRAGQTEGSVDLAVLAGLKPAAVICEIMKDDGTMARMPDLEK 
FAEEHNLKIAAVRDLIRYHLRHGQLGVRRVAETKLPSRYGNFTMIAYENDISADTHVAIV KGDVSEKGGPDPVLVRVHSECLTGDAFGSLRCDCGNQLAAALTRIEKEGRGAVIYMRQEG RGIGLANKLKAYALQDQGYDTVEANEKLGFKADLRDYGVGAQILLDLGIRKLRIMTNNPR KIVGLEGYGIEIVGREPIEVGCCATNEEYMRTKQEKMGHMLHVSGDRK >gi|316923971|gb|ADCP01000051.1| GENE 10 10263 - 10925 673 220 aa, chain - ## HITS:1 COG:Cgl1558 KEGG:ns NR:ns ## COG: Cgl1558 COG0307 # Protein_GI_number: 19552808 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Corynebacterium glutamicum # 1 193 1 193 211 156 46.0 3e-38 MFTGIINGQGKVLAVESRGHETRFRIQALYDLADIILGESIAVNGTCLTVETSGPSLFSA YASAETMQRTALGLLKPGAKVNLERALAVGDRLGGHIVSGHVDCVAEILSVRREGESRRI RIGFPASFGAEVIGKGSVALDGISLTVNDCGPDFLEVNVIPETWRATTVAEWAPGTRINM ETDVIGKYVRHMVAPHLGLASEEAKAGKLSVEFLRENGFF >gi|316923971|gb|ADCP01000051.1| GENE 11 10910 - 12049 1036 379 aa, chain - ## HITS:1 COG:TM1828_1 KEGG:ns NR:ns ## COG: TM1828_1 COG0117 # Protein_GI_number: 15644572 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine deaminase # Organism: Thermotoga maritima # 5 147 2 144 144 144 48.0 3e-34 MSHPYESFMREAAELAERGRWSAAPNPTVGAVLVRDGVAVARGWHTAYGKSHAEVECLKD AEAKGVDPSACTLVVTLEPCNHQGQTPPCTEAVIAAGIRHVVIGLRDPNPKAAGGMERLA EAGVEVEAGVCEELCRDLVADFLIWQTTKRPYVMLKLAMTLDGRIATRTGHSRWITGETA RRQVHELRANVGRAGGAILVGGNTLHTDNPLLTARLDDPVERQPLAVSISSRVPAPDSLL LFKERPTETIFFTTASGAATPRAAQLRERGVRIRGLDRWKSGEDLVQILEYLRQEAGCPY VLCEGGGRLGLSLLEAGLVDEFHLHIAPKVLGDNDARPLFDGRTPLELDEALSLRLVRME PCGEDGHLIFRPVRACSQA >gi|316923971|gb|ADCP01000051.1| GENE 12 12046 - 12543 573 165 aa, chain - ## HITS:1 COG:Ta0312 KEGG:ns NR:ns ## COG: Ta0312 COG2131 # Protein_GI_number: 16081448 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Thermoplasma acidophilum # 3 148 6 151 170 192 58.0 2e-49 MSRVAWPDYFMNIAHLVAERSTCLRRRVGAVAVKDKRILATGYNGAPSKVAHCLDIGCLR EQLGVPSGQRHEICRGLHAEQNVIIQAAVHGISLAGAEVYCTHQPCLICSKMLINCGITK IWYASGYPDELAMQMLNEADITLELLPLRPTPDGCGHTGPSGEGA Prediction of 
potential genes in microbial genomes Time: Fri May 13 02:35:11 2011 Seq name: gi|316923958|gb|ADCP01000052.1| Bilophila wadsworthia 3_1_6 cont1.52, whole genome shotgun sequence Length of sequence - 12683 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 7, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 27 - 61 -0.7 1 1 Tu 1 . - CDS 233 - 1471 1685 ## COG0112 Glycine/serine hydroxymethyltransferase 2 2 Op 1 27/0.000 - CDS 1528 - 2778 1460 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase - Term 2852 - 2905 6.7 3 2 Op 2 22/0.000 - CDS 2933 - 3166 507 ## COG0236 Acyl carrier protein 4 2 Op 3 1/0.000 - CDS 3287 - 4030 241 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 4052 - 4111 2.1 5 2 Op 4 16/0.000 - CDS 4113 - 5117 790 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III 6 2 Op 5 . - CDS 5145 - 6185 1353 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme 7 2 Op 6 . - CDS 6172 - 6363 268 ## PROTEIN SUPPORTED gi|218888187|ref|YP_002437508.1| ribosomal protein L32 8 3 Tu 1 . - CDS 6479 - 7045 172 ## PROTEIN SUPPORTED gi|170758590|ref|YP_001787770.1| ribosomal protein L32 family protein + Prom 7001 - 7060 3.9 9 4 Tu 1 . + CDS 7114 - 7323 66 ## + Term 7346 - 7396 11.2 10 5 Tu 1 . - CDS 7441 - 8622 1547 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 8723 - 8782 3.0 + Prom 8850 - 8909 5.3 11 6 Op 1 . + CDS 9000 - 10727 2010 ## DvMF_0342 hypothetical protein + Term 10746 - 10784 4.2 12 6 Op 2 . + CDS 10879 - 11661 800 ## COG1385 Uncharacterized protein conserved in bacteria + Term 11773 - 11817 13.2 + Prom 11870 - 11929 4.3 13 7 Tu 1 . 
+ CDS 11983 - 12441 339 ## COG1943 Transposase and inactivated derivatives + Term 12515 - 12558 10.2 Predicted protein(s) >gi|316923958|gb|ADCP01000052.1| GENE 1 233 - 1471 1685 412 aa, chain - ## HITS:1 COG:SA1915 KEGG:ns NR:ns ## COG: SA1915 COG0112 # Protein_GI_number: 15927687 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Staphylococcus aureus N315 # 1 410 1 411 412 537 64.0 1e-152 MDHILLEDPELARAFLLESDRQMSKLELIASENFVSSAVREAQGSVFTHKYAEGYPGKRY YGGCEFVDIAENLAIERAKQLFGCDYVNVQPHSGSQANMASYFALAKPGDTILGMNLSHG GHLTHGSPVNFSGRLFNVVSYGVDKDTCLINYDEVRRLAHEHRPTVIVAGASAYPRTIDF AKFRAIADEVDAKLLVDMAHIAGLVAAGLHPTPIGHAHVTTTTTHKTLRGPRGGMILSDE SFGKTLNSQIFPGIQGGPLMHIVAAKAVAFGEALRPRFKDYQAQVLRNTVTLGEELKNAK FNLVSGGTDNHLLLVDLTSKDITGKDAEHALDAAGITVNKNTVPFETRSPFVTSGIRIGT PALTTRGFREQDMVKVAGWIDAAIANAGNETRLAEISKEVAVFARQFPLFAW >gi|316923958|gb|ADCP01000052.1| GENE 2 1528 - 2778 1460 416 aa, chain - ## HITS:1 COG:BS_yjaY KEGG:ns NR:ns ## COG: BS_yjaY COG0304 # Protein_GI_number: 16078199 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Bacillus subtilis # 1 416 1 413 413 440 55.0 1e-123 MSRKRVVVTGLSALTPLGGNVTETWDNLLAGKSGIGPITLFDASGFDSRIAGEVKGFDPE AYGVPAKQARRMDRFVQFAVAAGNMLVEHSGLVINDDNAGRVSVILGVGLGGLHTIEVFH SRLVEAGPGKVSPFMIPMLISNMAPGQLAIATGAKGANMVLTSACASGTHAIGAAYTEIV MGRCDACISGGVEATVTPMGISGFTALKALSTAHNDDPEKASRPFDKDRDGFVMGEGAGL LMLESLEHAQARGATIYAEIIGAGSSDDAFHMTAPRDDGEGMIAAMQRAIADAGVSPDVV DHINAHATSTHLGDICETGAVKQVFGERAYQIAISATKSQTGHLLGAAGGVEAVFSVLAL HTGYVPGTINYETPDPECDLNCMTDGAKKLDPQYALSNSFGFGGTNGSILFKKFTA >gi|316923958|gb|ADCP01000052.1| GENE 3 2933 - 3166 507 77 aa, chain - ## HITS:1 COG:HI0154 KEGG:ns NR:ns ## COG: HI0154 COG0236 # Protein_GI_number: 16272121 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # 
Organism: Haemophilus influenzae # 1 75 1 75 76 75 70.0 3e-14 MSVEEQVKKIIMEQLGVTAEEVKPEASFVEDLGADSLDLTELIMAMEEAFDVEIADDDAQ KILKVKDAISYVQNHTN >gi|316923958|gb|ADCP01000052.1| GENE 4 3287 - 4030 241 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 246 5 241 242 97 31 5e-20 MSEQLSTALVTGGSRGIGKAIAETLAKAGYQVYLTYVSKADEAEAVARNIVAAGGKACAF RLDVADAEAVAAFFKERIKDQVELDVLVNNAGITKDGLMLRMKDEDFDRVIQTNLRGAFV CVREAAKIMTRQRHGRIINISSVVGQMGNAGQINYASAKAGLIGLTKSAAKELAGRNVTV NAVAPGFVETDMTASLPDDVRAAYIDAIPLKRLGTPQDIADAVAFLASPGAGYITGQVIA VNGGMYC >gi|316923958|gb|ADCP01000052.1| GENE 5 4113 - 5117 790 334 aa, chain - ## HITS:1 COG:CAC3578 KEGG:ns NR:ns ## COG: CAC3578 COG0332 # Protein_GI_number: 15896812 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Clostridium acetobutylicum # 1 334 1 323 325 283 43.0 4e-76 MTAIHIQGLGTYVPTKRITNVDLMSLVDTNDEWIVSRTGIKARHMLADDENGSDAGTEAA LKALEDAGIAAEDITHVLTATCTPDYLCPSTACIISGKIGSHGAMAFDLNAACTGFVYGL DVANSILAGKPGAKVLLVGSEAFTRRLNWEDRTTCILFGDGAAAVVLTNGDAGTPKRCLP SAPRLRDVRCGADGRQYPLLTVGGGTNRNYKPGDPVQDDFYLQMQGREVFKQAVRSLSAV CSELLEDNGLTLEDIDLFIPHQANLRIIEAVGDRLKLGSEKIFVNLDEYGNTSAASIPLG IGDACAQGRIRPGSRVLLSAFGGGFTWGAALLEF >gi|316923958|gb|ADCP01000052.1| GENE 6 5145 - 6185 1353 346 aa, chain - ## HITS:1 COG:aq_1101 KEGG:ns NR:ns ## COG: aq_1101 COG0416 # Protein_GI_number: 15606370 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Aquifex aeolicus # 8 330 4 324 337 298 52.0 1e-80 MSSNKTVIAVDAMGGDLGPSVVVPGAIEAARQTGAKILLVGNEATLDGELNRLSPSGVDL EIVHAPEVAGMDEKPSDILRRKKNASIQVACRLVRDGAAQGVVSAGHSGASVACGMFIMG RIPGVERPALASLLPTEKEPVVLLDVGATVDCKPYNIFQFGLMGDAFARDILNKESPRVG LLSIGEEEGKGNSQVKEAYELFKMAQNLNFSGNIEGRDLFTGEMDVAVCDGFVGNVALKL SEGLGLSLSRVLKRELLNSGFLPKLGSLLAKSAFRRFAKVVDYAEYGGAPLLGLQNISIV CHGRSNAKAICNATRMATLFVEKETNKRLMETICANEELTRFGRSC 
>gi|316923958|gb|ADCP01000052.1| GENE 7 6172 - 6363 268 63 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|218888187|ref|YP_002437508.1| ribosomal protein L32 [Desulfovibrio vulgaris str. 'Miyazaki F'] # 1 56 1 56 59 107 85 3e-23 MAVQQNKKSHSKKGMRRSHDRVATPTVVYCTCGAPTVPHAACPSCGTYRGRQVVEQKTAD EQQ >gi|316923958|gb|ADCP01000052.1| GENE 8 6479 - 7045 172 188 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|170758590|ref|YP_001787770.1| ribosomal protein L32 family protein [Clostridium botulinum A3 str. Loch Maree] # 53 181 51 160 166 70 33 5e-12 MAELRIALNNIAPEGKTFVMDDPAIWSVPMTECGMDCRVVQPLVGTVTLLPQEDGCLVRG NLKGEVVVPCNRCAEDAHLIIDSSFDSFEPFPQADDETEPRNGKEKPFDSEADELIVKLV DGAPEINLAGLLWEEFVLALPVRPLCKPDCKGLCPDCGKNLNEGSCSCVRDEGDPRLAAL RGLKVKKN >gi|316923958|gb|ADCP01000052.1| GENE 9 7114 - 7323 66 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGKAKEAAPRENVPVRRFWCGFQSLEISLKHGRPRVFLKGDEKRGKAAASPLAFGGMGLK GTLIARPGK >gi|316923958|gb|ADCP01000052.1| GENE 10 7441 - 8622 1547 393 aa, chain - ## HITS:1 COG:BS_yhdR KEGG:ns NR:ns ## COG: BS_yhdR COG0436 # Protein_GI_number: 16078022 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Bacillus subtilis # 4 388 6 385 393 316 42.0 6e-86 MSIISTQMSSSIKNGSWIRRMFEAGIQLKQKYGDDAVCDFSLGNPDLAPPPAVGKALAEF VKHVDEPFSLGYMPNNGFGWAREKLAAHLSKEQGVELTANDVILTCGAAGALNVIFRTVL EPGDEVLTVTPYFVEYGSYVGNHGGTLKKVPTLPGTFGLDLPAIEAAITPKTRAMIINSP HNPTGTVYSREELEGLTAILAKASERNGRPLYLVVDEPYRFLAFDGVEVPSLLPMYPYAV LASSFSKNLCLAGERVGFIALSPLFAERAELMGGLTLANRILGFVNPPVVGQHIMAGALG SQVDVNIYARRREMMGNVLSDAGYEFQMPKGAFYFFPKAPGGDDVAFVNKLLDERILAVP GSGFGGPGHFRLAFCVEDEVIARSAEGFKRARG >gi|316923958|gb|ADCP01000052.1| GENE 11 9000 - 10727 2010 575 aa, chain + ## HITS:1 COG:no KEGG:DvMF_0342 NR:ns ## KEGG: DvMF_0342 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 569 1 569 572 765 65.0 0 MSLLWKEAFTLGDFSWSNTNSFSDIIEITDLEALKHFLDAVHVKFCLLKPWCETENYPLV 
DSRDLLPSFEAELWEYKGLPGFSMVAFARPLESFSEIFQYDILHPVLSAVNTAEGASCPL ETHVIARNVQMFLSRLPKKLQDPFRDQFRRVDTAVLDYYPALMPYLLAMDRAHVFATDVY GHYHLAGLFASFPSDMDGEIKRFGLRTGKFKIGDNEMYERNRTFVCQFLMELYGFPISSE RRTSAALFSRRLHKLGERFLVRVLGQSDRTITTIWNNGENRPYPRVEKIALVKLDPEQKD LIRACEEGGFFVDAEKRVVIIRISYKQHRYNADNVRQDRALSVERQELIHPLTGRAMPDV NVVKDTSTMILRLNDIVRGEYVGRAVYKRNELVENTDTDEKRLKFLFAWLSKNQRRIIGY SEEFYNNTTKVLDAYLKNPENQEAFALLHDLKQEVMSKYAYIRQARTVRYLEDIVARNYK GERLTYGRMLLEAIELLRDLKFELGNYFEPLMLSVLHYMEMILDDRYLLRRYINCPDEKL IKGGVEIRKNYRRLVSLRDEFESVRKSRQKPVAEG >gi|316923958|gb|ADCP01000052.1| GENE 12 10879 - 11661 800 260 aa, chain + ## HITS:1 COG:PA0419 KEGG:ns NR:ns ## COG: PA0419 COG1385 # Protein_GI_number: 15595616 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 18 250 6 231 240 75 30.0 1e-13 MAENEIPEPPDWSDARTFFLEPDLWHEPYELDASESHHLTRVLRIREGEDVRVLDGRGRE GRFRVLPYKKNAKAVALRLLDEWTYPEPESKVILAAGWTKAARRGWILEKAVEFEASGIW LWQAERSQFPVPSDIKESWQGQLAAGAKQCRNPWLPELRTMPGGVDELIALAEELGCEHR HVLVESGHKSVMSLTPDTLGQPGRTLCVVGPEGGFTAQEVEKLTRAGFLPATLGERVLRW ETAAVLCLGLHWWKRQLGGK >gi|316923958|gb|ADCP01000052.1| GENE 13 11983 - 12441 339 152 aa, chain + ## HITS:1 COG:STM0946 KEGG:ns NR:ns ## COG: STM0946 COG1943 # Protein_GI_number: 16764308 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Salmonella typhimurium LT2 # 1 152 1 152 152 266 82.0 1e-71 MSDIQSLAHTKWNCKYYVVFAPKYRRQVFYGEKKREIGEILRRLCEWKGVTILEAECCPD HVHRLLEIPPKMSVSGFMGYLKGKSSLMLYERFGDLKFKYRNREFWCRGYYVDTVGKNKE KIRDYIKKQLEEDKLGTQLCLPYPGSPFTGRK Prediction of potential genes in microbial genomes Time: Fri May 13 02:35:36 2011 Seq name: gi|316923946|gb|ADCP01000053.1| Bilophila wadsworthia 3_1_6 cont1.53, whole genome shotgun sequence Length of sequence - 10597 bp Number of predicted genes - 12, with homology - 9 Number of transcription units - 9, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score 
pairs(N/Pv) 1 1 Tu 1 . - CDS 47 - 487 549 ## COG0757 3-dehydroquinate dehydratase II - Prom 607 - 666 2.2 + Prom 596 - 655 4.9 2 2 Tu 1 . + CDS 709 - 1401 744 ## COG0518 GMP synthase - Glutamine amidotransferase domain + Term 1424 - 1465 6.1 - Term 1405 - 1457 4.5 3 3 Op 1 4/0.000 - CDS 1550 - 2326 179 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 4 3 Op 2 39/0.000 - CDS 2323 - 4125 1937 ## COG0573 ABC-type phosphate transport system, permease component 5 3 Op 3 . - CDS 4426 - 5253 1298 ## COG0226 ABC-type phosphate transport system, periplasmic component 6 4 Tu 1 . + CDS 5347 - 5829 -121 ## - Term 5772 - 5818 3.1 7 5 Op 1 . - CDS 5907 - 6419 553 ## Ddes_1082 hypothetical protein 8 5 Op 2 . - CDS 6471 - 6962 612 ## - Prom 7197 - 7256 2.7 + Prom 7447 - 7506 2.0 9 6 Tu 1 . + CDS 7564 - 8073 686 ## COG4803 Predicted membrane protein + Term 8105 - 8142 6.1 - Term 8093 - 8129 6.3 10 7 Tu 1 . - CDS 8201 - 8980 873 ## COG0647 Predicted sugar phosphatases of the HAD superfamily - Prom 9091 - 9150 1.6 11 8 Tu 1 . + CDS 9129 - 9890 1004 ## COG1349 Transcriptional regulators of sugar metabolism 12 9 Tu 1 . 
- CDS 10047 - 10469 503 ## Predicted protein(s) >gi|316923946|gb|ADCP01000053.1| GENE 1 47 - 487 549 146 aa, chain - ## HITS:1 COG:BH2801 KEGG:ns NR:ns ## COG: BH2801 COG0757 # Protein_GI_number: 15615364 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase II # Organism: Bacillus halodurans # 1 142 1 141 145 136 49.0 1e-32 MNRILVMHGVNLNLFGKRDPKHYGTETLEDINGKLSALAGELGVEVSFFQSNYEGAFVER IQEAHGDGTAGIVLNAGAWTHYSYAIMDALAILSIPAVEVHMSNVHAREAFRHFSVLSAV TRGSIAGFGSGSYLLGLRAVVDLVKG >gi|316923946|gb|ADCP01000053.1| GENE 2 709 - 1401 744 230 aa, chain + ## HITS:1 COG:RSc1158 KEGG:ns NR:ns ## COG: RSc1158 COG0518 # Protein_GI_number: 17545877 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase - Glutamine amidotransferase domain # Organism: Ralstonia solanacearum # 1 229 1 232 242 187 47.0 1e-47 MKRCLVMQHVAFEDLGTFAPVLEEKGFAIRYCPVDEPPTEEEWLAADLAVVLGGAIGVYD QVLYPFLKDEKDLIRKRLESQKPLLGICLGAQLIASVLKSRVYPGKGKEIGWGGVELTEA GRKSPLRFLDGYRPVLHWHGDTFDLPEGAELLASTPVTPHQAFRMGNHVLALQFHPEVDA DKLERWFVGHACELARAAIDPRRLRDETKRAGESAKETGQALLREWIDQW >gi|316923946|gb|ADCP01000053.1| GENE 3 1550 - 2326 179 258 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 249 1 232 311 73 26 6e-13 MTLAARIRRLSVAFHERTILHDITLDIPAEGITVLLGRSGSGKTTFLRALNRLNECLPGC RTSGEAELRLGGGLRPIYAGRTGEAALSDLRRRVGMVFQTPNILPASIRQNMLLPLQLAT AFPASEWRERMERSLKEVGLWHDVSSRLREPASILSGGQQQRLCLARTLALEPEILLLDE PTASLDPATTHTIETLLLKLAERYPILLVSHGLPQAKRMASRIVLFGKGRLQGMLSPEEL PNEAEMNRFFDGENAGNR >gi|316923946|gb|ADCP01000053.1| GENE 4 2323 - 4125 1937 600 aa, chain - ## HITS:1 COG:SP2085 KEGG:ns NR:ns ## COG: SP2085 COG0573 # Protein_GI_number: 15901901 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 23 258 13 253 287 114 34.0 6e-25 
MHLNADGLMNTSLTERGGLPLLAAWLAGASVCAVFALIACFALPVVFSAQGGQIFSWVWR PDTHEFGILPMIVGSLLLSVSALLLSWPLAVGVACAIQDGGKKRLFGCTVAGIVRFMTTI PTVVYGFAAVFLLVPVVREAAGKGSGLCWLSAALMLALLILPTMVLVMDSAMRTKERNVR ITAAALGFTRSQTLACLVLPASRRWLLTAAILGFGRAAGDTLLPLMLTGNAPHVPESLFG SLRTLTAHMGLVTATEVGGPAYNSLFVAGGLLLLTSACVSLALRKLASESGEQLPEPPAL WRRYAPSALRPLSIVSGAIVASAIVCLLGFLLYRGLPTLEPSLLFGDTPPLRALLGTLPV WDGIWPACAGTFCLIMTTMLLAIGPGIGCGIYLAEYASPFMKRLFGVIMDILAGIPSIVM GIFGFTLILFLHRLGVAGASPGILLAGGCLALLVLPSLVVTTRTALEGLPASLRLTGFAL GLTHSQIVRHILVPQASRGILGGIMLAMGRAAEDTAVILLTGVVANAGLPSGLAVRFEAL PFFIYYTAAQYQTPEELERGFGASLTLLLLSGGLLLAAWWLYARYQRRKDACSLREGVSA >gi|316923946|gb|ADCP01000053.1| GENE 5 4426 - 5253 1298 275 aa, chain - ## HITS:1 COG:MTH1727 KEGG:ns NR:ns ## COG: MTH1727 COG0226 # Protein_GI_number: 15679719 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Methanothermobacter thermautotrophicus # 34 271 32 268 271 182 43.0 6e-46 MKKFLLPALAATLLLAATAVAAPLDAFKGMKGTLDIAGGTAHIPVMKEAAKRIMTANPDI RITVAGGGSGVGVQQVGEGLVQIGNTGRPLKDKEIEKFGLKTFPFAIDGVALVVNPANNV SDITAQQAADIYNGKITNWKELGGTDAPITLYTREDGSGTREVFVERALNKGSIVQSANV VNSNGAMKTAVAQDKQSIGYVGIGHVDKNVKALVFDKMVPSQENASNGTYKVTRLLFMNT KGAPEGITKAFIDYIYSPEGTEIIKKSGYIPTGRQ >gi|316923946|gb|ADCP01000053.1| GENE 6 5347 - 5829 -121 160 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLGGASILGMECAWMGEKSGVTSPEHVRCSGGKTVCGCKARAQTTRDGTARYGWGKEEAV CEGWDHDRSIAASRAFGNGIQRILFGGRTGSVQIRHNAIHYAVCRPLHTFEFTCCELCVL FEKILGIHPLPLWGNELSVVSLFQRKKPPNGTPVAWPPIR >gi|316923946|gb|ADCP01000053.1| GENE 7 5907 - 6419 553 170 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1082 NR:ns ## KEGG: Ddes_1082 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 7 167 3 185 188 105 38.0 8e-22 MSEQTEKKILSPEQAVQAACILFELNTREQDLLEMAQRAGVPCGDDASRRALTREWYGFV HAAVVYGLMAQASNQVMAEYLRSTRTLLSQMAGYTPEQIESFIDDTFSSYIRLMAQNQQK QCPSLFYQRLVGEGALETLPKERVAFLSGMMAITMCAILDKLGQYQFGVE 
>gi|316923946|gb|ADCP01000053.1| GENE 8 6471 - 6962 612 163 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSAYAEICSMFEQRRKVEEEYASAAYSQLLTFVELFEFYIGCQQPDLVVPLRIDECCNPL PDQTAPFRIVDNKMIAAVCIRTEDKNDCTWVPVSVKPISLVECFMEIGYGTDRVKQFHVN FSYGAIGIKQTSSDNALSYLEAMFKEEASFNPFHTKDDTNEIG >gi|316923946|gb|ADCP01000053.1| GENE 9 7564 - 8073 686 169 aa, chain + ## HITS:1 COG:sll1106 KEGG:ns NR:ns ## COG: sll1106 COG4803 # Protein_GI_number: 16332299 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Synechocystis # 1 159 1 159 171 149 57.0 3e-36 MRKLIAVTYDTEFKAEEVRLQFLKMQKSYLVSLEDAVVVEKKQNGKVKLHQMYNLTASGA VGGGFWGVLIGLIFMNPLLGLVVGAGAGAVAGALSDVGINDDFMKQLAEKLTPGTSALFV LVDSDLTDKVLDALRGTGGTVLQSSLSHEDEAKLQAALNSARAAQKDAE >gi|316923946|gb|ADCP01000053.1| GENE 10 8201 - 8980 873 259 aa, chain - ## HITS:1 COG:TM1742 KEGG:ns NR:ns ## COG: TM1742 COG0647 # Protein_GI_number: 15644488 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Thermotoga maritima # 1 259 1 257 259 160 39.0 3e-39 MLNKKCFVLDLDGTVYLGDIPIQETVDFILRHWDTIDFHFLSNNTSKAPTTYVNKLTRMG IPATLDRILSPVTPLIAHLRGNGIRTVYPVGNRDFVACLRERMPELNVLDYGVSEGAEAV VLAYDTELTYEKLTHAALLLQNPEVAYLATHPDLVCPSPQGPLPDAGSFMSLFETATGRR PQHIFGKPDPAVLGTLLQSYDKKDMVMVGDRLSTDKKLAENAGIDFILVLSGEAKLSDLP GLERQPTLVVDNLGQLESA >gi|316923946|gb|ADCP01000053.1| GENE 11 9129 - 9890 1004 253 aa, chain + ## HITS:1 COG:YPO0120 KEGG:ns NR:ns ## COG: YPO0120 COG1349 # Protein_GI_number: 16120465 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Yersinia pestis # 3 250 2 249 252 241 49.0 7e-64 MSKTSARHQALQEFLSARGYATIEELARHFEVTPQTVRKDINALAEEGKVQRFHGGAGMV SGSENIVYDERKGICQEEKRRIGALLATHIPDGVSVCINIGTTTEEAARALLEHKNLRVV TNSINVASICARNSSFDVVVACGTVRHRDNGIVGASAERFIREFRVDYGIIGISGIDEEG NLLDYDYREVAVARTIIECSRRVFLVTDRSKFGRPAMVRVAHLSDINALFTDGPIDKKWA DLIHEHGVELFMV >gi|316923946|gb|ADCP01000053.1| GENE 12 
10047 - 10469 503 140 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKGVWREIRESLDIGRFAGSFRDRWVWLAFLLGGIGLLFPSGGLPPLWRLAFAAMTEELA YRALLQRQLEQCVPGKWGLFTGGMLLTSALFALSHLPTHTPLMAALTFFPSLAFGALWTR HRSLWLCAAVHFWYNLLFFL Prediction of potential genes in microbial genomes Time: Fri May 13 02:36:17 2011 Seq name: gi|316923931|gb|ADCP01000054.1| Bilophila wadsworthia 3_1_6 cont1.54, whole genome shotgun sequence Length of sequence - 19049 bp Number of predicted genes - 15, with homology - 9 Number of transcription units - 2, operones - 2 average op.length - 7.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 5 - 1012 1151 ## COG0457 FOG: TPR repeat 2 1 Op 2 . - CDS 1012 - 1497 474 ## 3 1 Op 3 . - CDS 1481 - 1933 801 ## COG2165 Type II secretory pathway, pseudopilin PulG 4 1 Op 4 . - CDS 1986 - 3026 1105 ## gi|302489774|gb|EFL49705.1| general secretion pathway protein K 5 1 Op 5 . - CDS 3013 - 3681 864 ## 6 1 Op 6 . - CDS 3698 - 4048 428 ## 7 1 Op 7 . - CDS 4045 - 4524 633 ## 8 1 Op 8 . - CDS 4521 - 6059 1269 ## 9 1 Op 9 . - CDS 6072 - 7007 618 ## 10 1 Op 10 6/0.000 - CDS 7095 - 8810 2147 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 11 1 Op 11 . - CDS 8815 - 10791 2696 ## COG1450 Type II secretory pathway, component PulD 12 1 Op 12 . - CDS 10796 - 11656 823 ## Dole_1040 PDZ/DHR/GLGF domain-containing protein 13 1 Op 13 . - CDS 11659 - 12900 1343 ## COG1459 Type II secretory pathway, component PulF - Term 12907 - 12945 7.4 14 2 Op 1 1/0.000 - CDS 12959 - 15790 2167 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold 15 2 Op 2 . 
- CDS 15876 - 18713 2420 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold - Prom 18784 - 18843 1.9 Predicted protein(s) >gi|316923931|gb|ADCP01000054.1| GENE 1 5 - 1012 1151 335 aa, chain - ## HITS:1 COG:FN1787 KEGG:ns NR:ns ## COG: FN1787 COG0457 # Protein_GI_number: 19705092 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 55 316 66 328 628 91 24.0 2e-18 MSAHRTVPQFGPSRRSPFRFAAAALCTAFVLMAASGCTPMRKQGAVPTFPASTMNDAVEA YQKGDCRESIRRFSAALQQQEHPALLNGLGMAYLSCNQPRNAAQAFERAVSISPGSAALH ANAGTALYADNDYKSAERQFDAALRIDPTNPEALVGKAGILIQRKEPEKALRQLSLVSGT DAASPEVLYNKALAMYQMGLTDDAGTDLGTYAREHPNDAEAQNALGVVMLRAGNYASAKA HLDRAIALRPEQGEYYYNRANVLKEQKEFKAAIDDYTRAVAFIPDLAGAYINRGDVRFLL RETEGACQDLKKACELGECDRLEKYEDAGRCRDFF >gi|316923931|gb|ADCP01000054.1| GENE 2 1012 - 1497 474 161 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCPTAERRFAGFTLLELLIVLVIMGVLVAVGAGALSVERNPLYTGAQALVTASSTARSRA LLLNAPVTLELGQTFMEISSENGDRPLREAFPKGMSAASVDEHYLLGGSQKLLFHPLGVV QEHVVHLQSGQDTLSVYIPATGTARILEGQFSLEQIRKEFL >gi|316923931|gb|ADCP01000054.1| GENE 3 1481 - 1933 801 150 aa, chain - ## HITS:1 COG:PA3101 KEGG:ns NR:ns ## COG: PA3101 COG2165 # Protein_GI_number: 15598297 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Pseudomonas aeruginosa # 6 146 7 147 148 144 51.0 5e-35 MDNGHIRQRNNGGFTLLELMIVIVILGILGAIVAPKFMDEPHKARVVQAKMQIENLSTAV KKFYLDNGFYPSTEQGLEALVTRPAIGKTPKNYPANGYITKIPKDPWGNDYVYTAPGQKV PFEIMSLGSDGAEGGEGEAADIWGEDVPDR >gi|316923931|gb|ADCP01000054.1| GENE 4 1986 - 3026 1105 346 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302489774|gb|EFL49705.1| ## NR: gi|302489774|gb|EFL49705.1| general secretion pathway protein K [Desulfovibrio fructosovorans JJ] # 41 309 42 319 368 65 26.0 4e-09 MCGTRGTFFKDKRRGAVLILVLFVLTGLSVLTVEFSRDILLDHAMSTSTRSILAAKPLME SGERLATAVLTRNSEAGTPDHVREDWGLFGLGLERLSAELPGADLSGSIEDENSRFPINA 
IFYEREADRARAEAFADILTRLLAGLMRVHGYAGGPDNALALAKEYVDSLRQWGGQTETD QDTLKWYLTQTPCYLPPGRPLLAPEEILLVRWPHVEQEWGRDVLRGTKELPGLLETLTIW TQGPMNMNTLQPAVLSALVRDERQARPFVNAVLHYRDNPDNDLGENWYKDLLSAYDVPAL PNGCLDVRSRWYRLNLTVRQGARKNTLTSVGWVTHEYVTWEYRAVH >gi|316923931|gb|ADCP01000054.1| GENE 5 3013 - 3681 864 222 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYGFTLVELLIAMTLTAVIGMVLFSTYSMVMDNGKTVRNRVLERESERVFWGILDNDIAG LCLIDDKRSTLPPLSREPIVPSDAFYRLTEKDKPQPSDDEVLLSFATSSHLADMPGTPLP GPVCVEYVLRNGNRSAAFIRRERAYCGVEGDFPWSELVLVRNVKSLEVALYSAKTQFVED WPSPLPPGAVPEAIRFTLHREEEEQPELFVVPVFPRRSHVRN >gi|316923931|gb|ADCP01000054.1| GENE 6 3698 - 4048 428 116 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNRGFTLLEVLIALAILSTVALLAVRVSGDSLTQLAETGWEDGVLRAGRGKMIQMLRESP DSLDQWGTLSPEYPDVEWHSKLIALRCMEGKRLEFRLVENRGTSSRELLLEYILPR >gi|316923931|gb|ADCP01000054.1| GENE 7 4045 - 4524 633 159 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSVSRPLLLGLGGLALLGVVQWVILPAFEYRADNASRAERARQRHLSMQLLTDDYARLSQ SNAAAPAKNRSLFSLVNKEADRLSLSRRIEALRPAARKENDGSERIEMRLTGLYLKQGVQ WLHALESHPGVRVENLTFRRSAKNLLDMDMTVSLSGGTQ >gi|316923931|gb|ADCP01000054.1| GENE 8 4521 - 6059 1269 512 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIAKDSFIILQSGTELTGALCRVGLKSRDVSYLHHVTCADLTPQAVASGLRSLLDGLAAA GLKRPDDVRVCLSAGAALFRDWAFPFRSRAKVEQALSLMLETEFPFDADTLSHRVCLTGN AAASAGYKSGVQAISVSLRKEERDSWLDAFEAEGLFPRLVTVDPFPLLMNLPSRQNGMAL LLYVQRGCSTLALLDNGVIRRIRTIPVGWPPETDAADLNVSLSERENAALPGFADRLRRE TSLVLDGTPFVPDRLLAYGEAFLGNGASARFAEAFELPVSVLGQEVPLAGQVARLGETDP SRLLALCVAAMPLPAPWRPPLFPSFHRPPKGGHLSSEHGRRLAWAACGALAVGAACLASV WAEGYAAGQQAVRHEDAARSLFRKALPDVRGSFNPVQMESILKNRIAGLRGGGENDATFP ALHLLQDMHAAVPDSLDIRLDRLSLDARRCGLSGTAASYEQVNALRVALSGLPGVREAKI LSAASRTGKPEPGTPSGAVIFEIELALEGGQS >gi|316923931|gb|ADCP01000054.1| GENE 9 6072 - 7007 618 311 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMRTLLDRLLFALPSPPSRASLVRGGLYLLFGLLLYGLFLGVLYMRELGVVGLRQWCGRL PGVQVSMSRPEMSFFPPALEIADLTVQPPNASEPLAFRNVRAGLTVFPLGISLDADIAGG GLAATVTPSSLWNPERLAVWSSLSGVGIEPLLRPFMGKTSLVQIRSGKLEGSATLDLPLL 
NGRPEPLAGEGSLNLSLRGGLADLSLPMLKSSRLDKLEGTVETGWKRDRLTLHQLAVRSP MLACTVQGQVTLVPRDLPASRMDVQSALRIPLEQVREELMPERTLQSLKDKGEVRVRIRD TFRRPSFDVQP >gi|316923931|gb|ADCP01000054.1| GENE 10 7095 - 8810 2147 571 aa, chain - ## HITS:1 COG:aq_1474 KEGG:ns NR:ns ## COG: aq_1474 COG2804 # Protein_GI_number: 15606637 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Aquifex aeolicus # 118 568 34 467 469 424 49.0 1e-118 MKTLTSLLERLNLDPEVLEACARNAEALGIPLERALYEAERINENQYLETAAARHAMLCE LRLAALMDKEGSGWATDAERVCKVPLPWLRKQKVVPVRDKDGKLALAICHPSGWLLAQEL GMLLGERLERPVLAREDDITDIINRIFGESTRSEGSVSDVLGDSVNAIDEFNEDAVEDLL EDSSEAPFIRLVNMILAQAVRAGASDIHIEPYRDVSRVRFRLDGVLYERHTLNKAHHAAV VSRIKVMAKLNIAEKRLPQDGRIAISLGGRQAGLRVSTLPTSFGERVVLRLLEKSERVLS LTELGLSREDLQLMHSLVGITHGIVLVTGPTGSGKTTTLYAVLQELTAPDKNILTIEDPV EYELEGVGQIQVNPKIGLTFADGLRSIVRQDPDVILIGEIRDAETAAIAVQSALTGHLVF STLHTNDAPGAVTRLFDMGVEPFLLSSVLRGVVAQRLVRMLCPHCREAYLPDGQELEKLG AARSAYRPGQPLYRAKGCPDCLDTGYRGRMAIYEIMPVSDALKRLIVDKADANVLGACAL SEAMRNLRHDGMLKVIAGLTSLTEVARVTNE >gi|316923931|gb|ADCP01000054.1| GENE 11 8815 - 10791 2696 658 aa, chain - ## HITS:1 COG:PA3105 KEGG:ns NR:ns ## COG: PA3105 COG1450 # Protein_GI_number: 15598301 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulD # Organism: Pseudomonas aeruginosa # 32 627 44 617 658 260 34.0 5e-69 MFDSFSKAATALRVSVLFLALLCAQGFPSAAPAWAEPESGLTLDFKDADIHALIKYISEA TGRNFITDPTVKGKVTVYSPVKISPDEAFETFVSILRVQGYAVQKSGTAYKIVPLKEGLG QGEDAAVGRKMGSELETVVTQIVPLKVGVAAELAKILPSLLGKDYAISAYTPSNTLALTA PAPNVAKAMAFLEQVEASDTAGTSATLGLQYGDSKTLAATLAKILKSRDEEYAKKGRPTV SLVLADERTNSLLIYGDPEAIGMARDAVGSLDIPTPKGKGDVHLISLSNAKAEDLAQVIN TLVERQRAAGTEDQKPDTVLSKDIKVVADKSTNSLVVTARPDEFEALSNIVAKLDVVRKQ VFIEALIMEVSSEASFSFGINWAIGGNTGDAAIVGGVSLNGGAVSLSSSGANKTVSLPAG 
VSIGAILKDAITVGNTSYNIQSILNAVRGNSDVDVLATPQLLTLDNEEASVEVVDNIPFT KESTTRNDNDFTTQSMDYKDVGVKLKITPRISDDGSLRLEVEQEVSRVTQGLITLTNGDQ LVAPTTRKRLVKTTILLQDSQTAVIGGLLDDQKTYNQSEVPGLGSIPVLGWLFKSRNKKS TQTNLFIFITPKVIRNAADSADLTREKQLVLHETSVGHDGLGLPIMSKPKLLKPVFVN >gi|316923931|gb|ADCP01000054.1| GENE 12 10796 - 11656 823 286 aa, chain - ## HITS:1 COG:no KEGG:Dole_1040 NR:ns ## KEGG: Dole_1040 # Name: not_defined # Def: PDZ/DHR/GLGF domain-containing protein # Organism: D.oleovorans # Pathway: Bacterial secretion system [PATH:dol03070] # 1 285 1 293 293 77 25.0 4e-13 MSRRLTTLFSLFCIALIAWFLARATFALRETALETSGSASRLAENAAAAPEPLAAFDGIT TRNLLGVSVFPPEARRSLKSGTDADSGEENGLLSADGIDALPVSKQGWKLLGTIVNTGPG KASRAVIQVNGAEQPYREGDTIQGWKIALVQRRTVVVAKGGSKERLLMSEDPILQKEDTK PDEQKTVSRARLREELGDVGSLMRAVSVSPQTVGGYQGLRILDMQSGSYIEELGLRKDDL LLGANGKPLRGFGDLGGLGDLADKSAITLEVLRNGKKTIIRYDVQS >gi|316923931|gb|ADCP01000054.1| GENE 13 11659 - 12900 1343 413 aa, chain - ## HITS:1 COG:VC2731 KEGG:ns NR:ns ## COG: VC2731 COG1459 # Protein_GI_number: 15642725 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Vibrio cholerae # 1 411 1 403 406 226 32.0 5e-59 MANFCYKAISAKGKTVRGLVEADTERQAIASVREKGLVPLSVESASASVANPGKPSSKSS FKKILESLHPVPKTAVAAMARQLATLLHAGLPLDEALASICAHDDASRMQGIVSRLRDRI MAGGDLADGLAEFPNVFSSTFVTMVRAGEASGTLELVIERFAEHIEQQVALVRKVQATLA YPILMLFVGIGVVIFLLTFVIPKVTQIFSDMGRALPVPTQILLAVSDGIRSGWWIILLLV LCIMIGVWRMRRTEGGKRYAHRLLLRLPGIAGIYRPLIVGHMTRTLGMLLKNGVTLLKAL HIVKSVADNKLMAQAVQNMIDGVQAGRDLSEFMNDPLVFPPLSRQMVAAGERSGQLGEML LWVANDSENRVASRLQVVTSLMEPVMILLLGGIVGFVVIAIILPIFEMSTLAG >gi|316923931|gb|ADCP01000054.1| GENE 14 12959 - 15790 2167 943 aa, chain - ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 378 721 280 548 553 108 27.0 
4e-23 MKNRTSKYAFCFAAAIALVLGGGMQLRAADKKADTVYHNGTIYTVTEDFRKPSQLNTPNT VEVVATLNGKIVFVGSESEAKAQGFLNASNVNKIVDLKGKTMLPGFVDGHGHFPGQGELD LFQANLNSPPIGTMTSIKDYIPVLAALAAQTEEGKAVTGKGFDDTLVTEKSYPTKEDLDE ASTKHPIVIVHTSDHINAGNSLAFKLAGFKRSDIGEDGVYVRDGKPYIGVRTVKKADGWD FTGVCAETEAMGLLNAATQNNSPDITDKSVRSVARASQVYTAAGVTTADLGGAVLAMPTE YGTVGFALNQIQVALQRGVLEPRVVVHPFAYMAYGPAEIGKINRMALGWKGDDYSDPGDS PAKGSDITSLSLKGVSAQMGGKAPEGLPANRFFMGAWKIIYDGSNQGYTGYFKTPGYYDP EYGGYEKGYNGMPNATTFSKEVLEKTIDIYHAANQSVEVHTNGSWAAEDYVTALEKAVAA HPEITDTRDCAIHGQMMERQHIERLVGDYSKLDATKDMYTELSGTAVDPALRAALQDGEL MKKQNLVNSYFVNHAFYWGDRHMEIFMGPGRAKNMNPCGWSAAYDLPFSVHNDTTVTPIS PLRSMQDAITRISSPTPLGKGGTLISGEGKDLDAVAMYPETKDGPQRAFWNYDQRVNVLQ ALHAITIVPAYQNHMEDKIGSIAPDKFADFTILDQDPFKIDPATIASMRVTTTIVGDKPV YGVLPDSETFAMQIAPSYDQPNGTSVLSFNGTSIDNATADKDYAPLPKGSKRLGTFAFTA ETTAGKSAVFQMNFLGNGATVGELSLHKLYANKTAPYAYGKPSADELPTASGMWWVADIK APTKALAPADVLEMDHTYMAFFIIADNDATFDIEPADGVIKDPVSLATTGPLPNNGNSGS SNDDGGSSSGCTVGSTPSYDLLVLFLGMSAVAAIRVLRRRNEQ >gi|316923931|gb|ADCP01000054.1| GENE 15 15876 - 18713 2420 945 aa, chain - ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 16 718 1 548 553 143 23.0 2e-33 MKNRVKKQALSFAAAMAMILGGGMQLYAADKIADTVYRNGKIYTITETVQEAKDVKNAKK VDVVATLNGKIIFVGSEADAKAQGFLDENKVSKIVDLRGKTMLPGFVDGHGHFPSQGDND LFKVNLNSPLLGGTVDTMDKLVAELHAKAETLPAGSPIIGWNYDDTQLAEQVHPTRADLD KASTTHPILVVHISGHMAVANSAALDKYGVNKSTNVEGVVKDANGEPTGLLLEMKAQGLV PLASELKTNSAFGVARATQVYAAAGVSTADSGGTTVSSQVPLFQKTLAADQLGVRVLVHP LAYYYAPGLGLIGDDAMGISNRKAVGWKDTGDGKYTDASDALKIGYDMTTLKASKADVPS GLPANALMLGAYKFIFDGSPQGYTAWMKNPGYYDWGKYTAENSFNKAGYFNGLEGTLNMP VPNLKENIKIYHAAGQSVEVHTNGSAAAEAYVAALEEAVAAFPNVTDTRHTSIHGQTMER QHVERLSGHYENLEATADMYADAEFDGAFKDGKVDLSMGGKLPGNNLDKLMRAQNTVNSY FNNHTYFYGYRHTNQFFGPGRAYNMSPAGWSEAYGQRYTFHNDTFVTPISPLRSIQSGVT 
RFSGDARAAAGQENITVNGTGHDLNATVNYEARKGDPTTTRKFWTYDHRVNPLQAIHAVT IGPAYQNKVEDRIGSIAEGKLADFVILDEDIMDVAAKEPLRIADMRVASTIVADKVVHGV LPDSKTFISQFCAAYEQPTLDTVVTVQSSQMIDNATADKEYAALERGEKRFGTLQFTAEV AADSSAIFQMNMLGNGEKISALKLYKLTANKKSEYTYGRPAPDALGSASGQWWIASFDNP TVQLHQDTVLEMDKQYVAFFIIHDNDSVFDADKTDGVIVDPVTMVSTSGTLPTNGGTADP TSNDDDGGSSSGCTVGSTPSYDLLVLLLGMSAVAAIRVLRRRNEQ Prediction of potential genes in microbial genomes Time: Fri May 13 02:37:45 2011 Seq name: gi|316923923|gb|ADCP01000055.1| Bilophila wadsworthia 3_1_6 cont1.55, whole genome shotgun sequence Length of sequence - 9730 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 698 - 732 5.1 1 1 Op 1 6/0.000 - CDS 740 - 1438 664 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase 2 1 Op 2 . - CDS 1441 - 2709 1589 ## COG0014 Gamma-glutamyl phosphate reductase - Prom 2755 - 2814 2.5 + Prom 2788 - 2847 2.8 3 2 Tu 1 . + CDS 2939 - 3625 912 ## Dvul_1216 hypothetical protein 4 3 Op 1 11/0.000 + CDS 3800 - 5653 1970 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits 5 3 Op 2 . + CDS 5656 - 6261 688 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit + Term 6371 - 6412 -0.7 + Prom 6432 - 6491 1.5 6 4 Tu 1 . + CDS 6535 - 8079 1090 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains 7 5 Tu 1 . - CDS 8734 - 8949 74 ## 8 6 Tu 1 . + CDS 8938 - 9342 435 ## COG4957 Predicted transcriptional regulator + Term 9471 - 9515 12.3 Predicted protein(s) >gi|316923923|gb|ADCP01000055.1| GENE 1 740 - 1438 664 232 aa, chain - ## HITS:1 COG:all5063 KEGG:ns NR:ns ## COG: all5063 COG1057 # Protein_GI_number: 17232555 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Nostoc sp. 
PCC 7120 # 5 224 2 201 208 87 28.0 2e-17 MTPAQTIGILGGTFNPVHIGHLRLATAVAEALRLKHVDLMPCAVPPHKADSGLLSFEMRV SLLQGALETPPNAAPSDARLQVSTLEGELPHPSYTWNLITEWRKRHTSESPMFILGGEDF MHLDTWHRGLELPNITNFVVVPRCQADEETFRATIGRHWPKAVITEPDENNLLSAAITDE TSCLYLPLPHLDISASLLRAKWLLGESIRYLTPDPVIDILDTYKEDVRLCWR >gi|316923923|gb|ADCP01000055.1| GENE 2 1441 - 2709 1589 422 aa, chain - ## HITS:1 COG:aq_1071 KEGG:ns NR:ns ## COG: aq_1071 COG0014 # Protein_GI_number: 15606350 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: Aquifex aeolicus # 9 422 12 431 443 387 46.0 1e-107 MTVSIELQMEQMGQRAKKAARVLSAAAPAAKTQALRALAALLIERQPDILKANALDVNAA KAANMDAPRLDRLTLTPRIMADMADACLHVAEMADEVGVIEQQWQRPNGLMIGKMRIPLG VIAMIFESRPNVTIDAAILCLKAGDAVILRGGSEAIHSNLALAGLLHEALRSAGLPEDAA QVVETTDRAAVAALCKLDKYVDVMIPRGGESLVRAVCEAATMPVLKHYMGVCHLYIDEGA DLGMALRLAHNGKVQRPGVCNALECLLVHEKEAAAFLPMLAGKLGADGVEFRADARALPL LSGAKAVPARPGDFGQEFHDLILAVRVVDSLDDALDHIASYGSNHTEVICTNNHANATRF LREADASLTAVNASTRFNDGGQLGLGAEIGICTSKLHSYGAMGVRELTTTKFVALGTGQI RE >gi|316923923|gb|ADCP01000055.1| GENE 3 2939 - 3625 912 228 aa, chain + ## HITS:1 COG:no KEGG:Dvul_1216 NR:ns ## KEGG: Dvul_1216 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 4 228 5 229 231 131 42.0 2e-29 MADPLRPSGPQIPGSLMGEIQSEVAVEATPLLSFVLRNSRIIVTCIVLLVLVIAGVGGWQ WHQTRVEREAHLELGRILVSTQGPERIAALETFLPAAPSAMKSGVQLEIATTALGLEQYG KAADAYAAVAAADPKGSIGMMAAINQADLLQRQGKYAEALAVFDSLEKSAPESLRPAILE GQAMSAELAGKLDRALAAYESIATSLGDAANNGYFQAKIAELKARMAS >gi|316923923|gb|ADCP01000055.1| GENE 4 3800 - 5653 1970 617 aa, chain + ## HITS:1 COG:MTH1852 KEGG:ns NR:ns ## COG: MTH1852 COG4231 # Protein_GI_number: 15679840 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Methanothermobacter thermautotrophicus # 5 610 6 609 618 463 44.0 1e-130 MSHPLLAAEAGVRHLLLGNEAIVRGALEAGINVVTCYPGTPSSEIPDTFHYLASCGRYRM 
EYSVNEKVAVEVGAGAALAGAMSMTTMKHVGVNVAADPLFTAAYVGLPGGFVLVSADDPL CHSSQDEQDNRTYARFMGMPCFEPATAQEAKDMTREALLLARELQQPVMLRTTTRVNHLR GPVTFGALGEPAPIVPFERNPMRFVPVPAVAQGRHKVLVEHLENARAKAEASPWNKVSGE GRIGVIASGISRAYLADALAENGWEKDVKVLELGFTWPLPENLLADFMSGCDTVLVLEEL QPLVEQDLRALAQERKIDVSIVGKGPDLTIFGEYSTGAVARALAAVLGKELPSADGAAID VSRLPGRPPNLCAGCSHRAMYYAVRKVFGDEAVYSSDIGCYTLGMVPPLRAADFLFCMGS SVSAGSGFSMVSDRPVVGFIGDSTFFHSGMTGLANAVFNKHDVLLVILDNGTTAMTGHQP NPGVCTSVLGEGCNHLDIESVVRGIGVTDVAKVKPFNMRGTLKTIEEMKARSGVRVIIAE EPCMLFARRTLKQNRPQVAEVATQGEEALKCLAELACPAFRRGGTAEAPVVSVDESQCSG CMVCLQVTPAIRARKRS >gi|316923923|gb|ADCP01000055.1| GENE 5 5656 - 6261 688 201 aa, chain + ## HITS:1 COG:PH0764 KEGG:ns NR:ns ## COG: PH0764 COG1014 # Protein_GI_number: 14590633 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Pyrococcus horikoshii # 7 191 6 192 202 125 40.0 4e-29 MSNRIRIYMTGVGGQGTLTATTLLARVAVAQGVEVVAGEIHGMAQRGGVVESTILLGGWK SPKLGFGEADIVLGFEPLETLRGLRYLKQGGVVFSSTDVIPPLSVSAGREVAPGLDAVEQ AVRDRASSAWFLPCRSMGIELGSAQCGNNILLGALCASGLLPFGFEALEEGIRTFMPAKL VDVNLKAAERGREILASARLF >gi|316923923|gb|ADCP01000055.1| GENE 6 6535 - 8079 1090 514 aa, chain + ## HITS:1 COG:aq_218 KEGG:ns NR:ns ## COG: aq_218 COG3604 # Protein_GI_number: 15605774 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Aquifex aeolicus # 1 508 7 500 506 233 31.0 7e-61 METLQRIAAGMGPQRGFQSSLNSLLSLLTERHGFLRPHLVIFDPETRTLRLCVADGAPRS AQVVYEPGMGVTGQVFVTGKPVIVECLKGHPVFLSKFFMRTDEELSSLAFLSVPVLAPQM SWEPGREQARKVIGVLSVDTPRASREELERQRCFLEVVAGMIAAQATYLQDDMARQQRLA ERRGDRPHVAGDLDGGILAVSESMCEALKQASYAGHGRGPVLLRGEPGTGKARVAAFIHT ASVRNELPLVRFYASRLQSPGADRLTEEQAERELFGYRKGAFPGAVQTRKGLFELANCST LFIEDIDKLSLSIQAQILHVLQEQSVVRMGGGQPVGVDVRFICSSSADLENLVAQGAFLE DLYNRITVFPIALPPLRERPEDILPLADMFLKASAEALGRTVERISTPAQELLLQYPWPG 
NAGELAQCMKLAVQGCDDLVLRAAHLPQSLQTSGDSRAALPFNDAVARFEQELLIDALQH AHGNMLQAARDLQVSYRIVNYKVKKYNIDPRSFA >gi|316923923|gb|ADCP01000055.1| GENE 7 8734 - 8949 74 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIIHQGCLLGTKLLKRLCHESRNLSCIHSPQNPVRLFPRYPLTPEVIRGAETARNETVPP AKHPYQPVHCP >gi|316923923|gb|ADCP01000055.1| GENE 8 8938 - 9342 435 134 aa, chain + ## HITS:1 COG:SMc00058 KEGG:ns NR:ns ## COG: SMc00058 COG4957 # Protein_GI_number: 15964752 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Sinorhizobium meliloti # 10 131 19 140 143 73 31.0 9e-14 MDDHLKEALDIVKAQASVRIMSEDEIISMVRRLSSDIKNVAEGEGCVEELASSDVQSDPR KSIKEKSITCMECGKTFKILTRKHLASHNLDAAEYREKWGLKKDTPLVCKGLQRERRKKM KDMKLWEKRRKVEE Prediction of potential genes in microbial genomes Time: Fri May 13 02:38:02 2011 Seq name: gi|316923918|gb|ADCP01000056.1| Bilophila wadsworthia 3_1_6 cont1.56, whole genome shotgun sequence Length of sequence - 8009 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 7 - 3255 4196 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family + Prom 3400 - 3459 3.0 2 2 Op 1 1/0.000 + CDS 3679 - 5178 1910 ## COG0747 ABC-type dipeptide transport system, periplasmic component 3 2 Op 2 . + CDS 5248 - 7464 2692 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 4 2 Op 3 . 
+ CDS 7468 - 8004 550 ## Ddes_0940 protein of unknown function DUF456 Predicted protein(s) >gi|316923918|gb|ADCP01000056.1| GENE 1 7 - 3255 4196 1082 aa, chain - ## HITS:1 COG:PA0799 KEGG:ns NR:ns ## COG: PA0799 COG0553 # Protein_GI_number: 15595996 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Pseudomonas aeruginosa # 492 1080 46 656 663 434 42.0 1e-121 MTEARTSEKMNVREENAARALIQGFLKDSVPEYIKDTGQEILNSGGVHKLTIRKEGDTWD VEGIVQGEDFQNYSPHLMLNPGDSQISYNCNCHEAFMGVCRHVAATALKMFADLDKDYGA PEEPQRLNTEWRQSFRSFFSSSLEPETGRHYFIFRFYPEPGRLIVALFRARQNKTGLSSV HQETSLEQILRNPEWCDQSPQLLQVARQIGHYLDYYGHRIEIPEGLVSWFFWAIRREYYM FWKDTDIPCTIVSTPLTVKLRPGLDEDGLRFDVLLQRGEKKPFSISEEDSEITFHGQMPL WVCWNQSFYPVQTSLSPSLVKQLVADHPVVPQDNISEFLDRVWARLPASELYEPDEFLKI MGPIFQPATYDPKLFLDEEGSLLTLEVQNTYETVHGEFVLPGPNPDFQTGSYVFEGHTFL VRRDQTEEAALMAQLAEMHFQPRSTRLWFMEPEEAIAFLLDSYPTLVENWRVYGEKALTR YKVRMSQPVISAKVESNEKEKWFTLDIDVEYDGQHLPLERIWKAWVRGRRYVQLKDGSYT SLPESWLEKLAHKLQALGFDPTKPPKRQFKQFEAPVLDNLLDDLPNAETDSFWNSLREKV RNFTEVEPVSTPKGLTATLRNYQLQGVSYLNFLSEYGFGGILADEMGLGKTIQTLSFIQH MVNHGHEGPNLIVVPTSVLPNWERESEKFVPHLKRLIIYGTRREGMFRKVADSDIVVTTY ALLRRDLEELEKHYFNSIILDEAQNIKNPNTITARSVRSIKARMRLCLSGTPIENNLFEL WSLFEFLMPGFLGSQHAFQRGVVKPIRDGDGESLEYLRSRVKPFILRRTKAEVAKDLPPK IESVTYCNMTDEQAELYTALTRKLRDQVLADVESKGMAKSQMSILDALLKLRQICCHPRL LKVDMPGFSTGSLPSGKFEAFKDMIFDVVEGGHKVLVFSQFVQMLQIIRGWLQLTDIPFC YLDGTSKDRLDQVDRFNNSPEIPIFLISLKAGGTGINLTSADYVIHYDPWWNPAVESQAT DRTHRIGQTRQVFSYKLICQNTVEEKILKLQEMKRGVAEAVIPGQDTWKSLTKEDLEMLF EV >gi|316923918|gb|ADCP01000056.1| GENE 2 3679 - 5178 1910 499 aa, chain + ## HITS:1 COG:BS_appA KEGG:ns NR:ns ## COG: BS_appA COG0747 # Protein_GI_number: 16078203 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus subtilis # 1 490 49 543 543 297 33.0 4e-80 MGTIGEPSNLIPYMASDSASGEITGLLYVAPLKYDKDLNVVPWAAASYEVLEGGRLLRFV 
LRDDIRWEDGVPLTADDVEFTYKLMIDPKTPTAYSGDFLAIESFKKTGRLSFEVRYAKPF ARSLMTWMGAILPKHVLEGQDIMTTPVARKPIGAGPYRLAKWDAGSMLTLTASDTYFEGR PNLDEVVYRIIPDPSTMFLELKAGKLDMMSLSPQQYLRQTQGPQWERDWRKYRYLSFSYT FLGFNLEHPFFKDVRVRRAISMAIDRQSLIDGVLLGQGVPTVGPYKPGTWAYNDKLTPVR QDVDEARRLLDEAGWKKNKDGLLERDGKPFLFTILVNQGNDSRIKAAIIIQSQLKALGIT VHIRTVEWAAFIKEFVNKGHFDAIILAWTITLDPDIYDVWHSSRAKPGGLNFTGYRNAEV DELLVEARSMTDQDRRKELYDRVQEILDAEQPYCFLYVPYALPIVQARFQGIKPALNGIM YNFDRWWVPKDLQRYHVRP >gi|316923918|gb|ADCP01000056.1| GENE 3 5248 - 7464 2692 738 aa, chain + ## HITS:1 COG:lin1558 KEGG:ns NR:ns ## COG: lin1558 COG0317 # Protein_GI_number: 16800626 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Listeria innocua # 32 732 26 735 738 630 44.0 1e-180 MSTVEAPASHLPLPMQNILSRLAANSTEAEQALVRKAYAYAAAAHAGQVRLSGEPYLSHP LAVAEVLAELGFDAHSVAAGLLHDTVEDTKVTLEEVDAEFGEQVADIVDGVTKISMMTFD SKEEQQAENIRKMILAMSHDIRVPIVKLADRVHNMRTLDFQKAHKRQRIAQETMDIYVPL ANRLGLHRLKLELEGLSFKYIHPDVYAQISDWLESNQVVERQLIAKIIAKLEGILSSNQI EGTVWGRIKHIYSIYKKMTEQNLTLDDMHDILAFRVIVKDVRDCYAVLGLVHAQWKPVPG RFKDYISMPKANGYQSLHTTVIGPEGERIEIQIRTEEMHRLAEHGVASHWLYKERHHAVS VKDAPEFEWLREIVNRQGQESDSKEFMHTLRMDLFKDEVYVFTPAGDVKELPEGATPLDF AFLIHTQVGSRCVGAKVNGKLVPLNTALKSGDTVEIITDKNRRPSRDWLKIVKTAKARSR IQQYLRTEERAAAMGLGREMLEKEARKAGINLSKAEKDGHLSTLVNSLAAGSVDELMASI GYARFTTKQVVKRLQAIVSPPEILETPDAVDTVVSHRKRKEQSPPPPPKTGGISVRGVDD MMVRIAHCCNPVPGDSIVGFISRGRGVIVHTATCPNIQEMEPDRLVSVQWDGHETQPFPV RIHIMARNQKGSLADIAIVLRDEDVNIDGCLLQALVDGRSEMEMVVQVRDVAHLYHVIDR LRHLPSVMEVLRKTANEE >gi|316923918|gb|ADCP01000056.1| GENE 4 7468 - 8004 550 178 aa, chain + ## HITS:1 COG:no KEGG:Ddes_0940 NR:ns ## KEGG: Ddes_0940 # Name: not_defined # Def: protein of unknown function DUF456 # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 4 170 11 178 200 147 54.0 2e-34 MEVLFPTIFLVLLGFVLLLNVITLPANWIMLGLIGLWRFAYPSPGDMGVFFFAMLVGLAL FGEVIEYIAQGWGSKKYGSSTSGMWAGLLGALVGALAGLPLLFGLGAFIGALVGAWIGCY 
LMERYKGRNDYEARQAAKGALVGRFLGIVVKCGIGAVMLGLTCHAIFQVPVYPEVMTF Prediction of potential genes in microbial genomes Time: Fri May 13 02:38:12 2011 Seq name: gi|316923911|gb|ADCP01000057.1| Bilophila wadsworthia 3_1_6 cont1.57, whole genome shotgun sequence Length of sequence - 7201 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 52 - 109 4.3 1 1 Tu 1 . - CDS 281 - 1579 1795 ## COG1541 Coenzyme F390 synthetase 2 2 Op 1 . - CDS 1762 - 2844 1175 ## COG4012 Uncharacterized protein conserved in archaea 3 2 Op 2 . - CDS 2847 - 3245 476 ## Ddes_1770 hypothetical protein - Prom 3307 - 3366 1.8 - Term 3309 - 3369 4.3 4 3 Op 1 2/0.000 - CDS 3384 - 4373 958 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 5 3 Op 2 . - CDS 4370 - 5158 967 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 5400 - 5459 2.7 + Prom 5359 - 5418 3.1 6 4 Tu 1 . + CDS 5588 - 6796 1629 ## COG0133 Tryptophan synthase beta chain + Term 6973 - 7024 16.5 Predicted protein(s) >gi|316923911|gb|ADCP01000057.1| GENE 1 281 - 1579 1795 432 aa, chain - ## HITS:1 COG:AF2013 KEGG:ns NR:ns ## COG: AF2013 COG1541 # Protein_GI_number: 11499595 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Archaeoglobus fulgidus # 3 431 11 439 440 481 52.0 1e-135 MDYFDKAETWDRATLETVQLARLKTTIQQASRAPFYAKRLAEAGVSADNLRSLEDIRRIP FTSKDDLRSQYPYGLATVPHSEFVRMHCSSGTTGTPVAVCYTQPDINSWADLMARCLYMA GVRKDDVFQNMSGYGLFTGGLGIHFGAERLGCMTIPAGAGNSKRQLKLVKDFGTTVAHIL PSYALHLGTTLIEEGEDPKALSLRIACVGAEPYTEETRRRIEALFDMKVYNSYGLSEMNG PGVAFECQYQNGLHLWEDAYLLEIIDPNTGKLVPDGEVGELVLTTLCRHGMPILRYRTRD LTRIIPGDCPCGRKHRRIDRILGRSDDMIIVKGVNIYPMQVERVLMAYPEVGQNYVIVLE RDGLKDTMKVQVEIREESFVEDMRVLRGLQETIARALKDEILITPKVELVQHNSLPRTEG KAQRVIDKREKS >gi|316923911|gb|ADCP01000057.1| GENE 2 1762 - 2844 1175 360 aa, chain - ## HITS:1 COG:MA0184 KEGG:ns NR:ns ## COG: MA0184 COG4012 # 
Protein_GI_number: 20089082 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in archaea # Organism: Methanosarcina acetivorans str.C2A # 23 355 3 359 373 158 35.0 1e-38 MAELSPASAGDAVARLLAAAGPILSLDIGSGTQDVLLALPGERAENWPRFVLPSPALGIA DRIRRHTAAGRPVWLYGHNMGGGFAVAVQEQVAAGLAPAASPDAALALHDNPERVKAQGV SITPSCPKGYTAVPLADYEPGFWDGLLSAAGLPKPSLIVAAAQDHGHHPEGNRVGRFNLW RALLTETQGNPARWLYDTPPAPCTRLNALQQCTGGPVADTATAAVLGALAAPEVAKRSQR QGVTVVNVGNSHVAAFLVFKGRILGVYEHHTGMLDTDALLFDLKEFGFGWLPDEQVRAKG GHGCAFLAPLPPEAEGFAPTFAVGPRREMLLGHAQFIAPHGDMMIAGCHGLLHGLALREA >gi|316923911|gb|ADCP01000057.1| GENE 3 2847 - 3245 476 132 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1770 NR:ns ## KEGG: Ddes_1770 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 12 126 35 147 150 142 55.0 4e-33 MAVYDVFTPECLNDLFPIQRTNDFFDALFGDAEEGAYDIRLVVDPEESDDGELHFYFELH QRAGRCLVCSLTYGLPQVFERHPIINLKGLVSDIASRAGWELDDVWWRIGTTESLSSALH RIPLVLIQNKGK >gi|316923911|gb|ADCP01000057.1| GENE 4 3384 - 4373 958 329 aa, chain - ## HITS:1 COG:aq_1221 KEGG:ns NR:ns ## COG: aq_1221 COG0008 # Protein_GI_number: 15606455 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Aquifex aeolicus # 7 283 4 274 473 229 40.0 8e-60 MSASRPVRGRLAPSPTGFLHLGNAWAFLLAWLACRSKGGSLVLRMEDIDPDRSRPEYADA IIRDLRWLGLDWDEGPDAGGPAGPYVQSARMGLYADALNRLGRAGHIYPCYCTRKELRTL AGAPHVGDAGAAYPGTCRNLPPERRAELEAAGRRPCIRLRCPSQNYAFEDAVFGPFSMTL EACGGDFALRRSDGVIAYQLAVVVDDGLMGITQVVRGEDLLVSTPRQLALFDLLGYPRPA YMHLPLLCDPEGERLAKRHASLTLASLRDAGVSPAAVAGYLGWKAGLIGALAPAHPRNLL PAFDPGRLRFLPERVLVEADLTALLSVVC >gi|316923911|gb|ADCP01000057.1| GENE 5 4370 - 5158 967 262 aa, chain - ## HITS:1 COG:PA4783 KEGG:ns NR:ns ## COG: PA4783 COG0697 # Protein_GI_number: 15599977 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter 
(DMT) superfamily # Organism: Pseudomonas aeruginosa # 1 256 39 284 296 90 31.0 3e-18 MMCGLRMMLAGVLLYLWSWARGERNLPTRKDLSQSFVLAFFMVFMASGFLAKGQESISSG TAAMILGAVPIWMVLGGWLFCGDPRPSLVQFFGLGTGFAGLILLSVNQTASGTDSGWGIL LVLCAAFGWVTGSFLSKKQASETQLSVIQTSGLLMFIGGLQSLVGAAVLGEFSTFSMDSV TPLSAGALLYLVIFGAIIAYTCYFWLLLHTRTVVAISYEYVNPVIGVFLGWLLAGEQVDG VIVTACCLTVLSVFFIVSRKHG >gi|316923911|gb|ADCP01000057.1| GENE 6 5588 - 6796 1629 402 aa, chain + ## HITS:1 COG:lin1669 KEGG:ns NR:ns ## COG: lin1669 COG0133 # Protein_GI_number: 16800737 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Listeria innocua # 7 399 2 395 400 475 59.0 1e-134 MNTSSASLSASESTGFFGAYGGQFVPDDLKARLDEVTEAFDRYSDNSFFKDELDYYFKHY TGRPNPIFFCANLSERLGGAKIYLKREDLNHLGAHKINNTLGQCLLAKRMGKKRIVAETG AGQHGVATAATAALLGLECTVYMGAEDMKRQRLNVIRMEMLGAKVVPALSGQRTLKEAVD EALAAWIANPDTFYVLGSAVGPHPYPTMVRHFQAVIGCEAREQMLAECGKLPVAAFACVG GGSNAIGLFAGFVDDADVRLIGVEPAGRGLTYGDHAASLCKGEPGVMHGFHSYMIKDENG EPGAVYSISAGLDYPSVGPEHSYLKDIGRAEYVSVTDKEAVDAFFLLSRTEGIIPALESS HALAQAVKVAPTLPKDAALLVCLSGRGDKDVEQMEAFLESAK Prediction of potential genes in microbial genomes Time: Fri May 13 02:38:38 2011 Seq name: gi|316923882|gb|ADCP01000058.1| Bilophila wadsworthia 3_1_6 cont1.58, whole genome shotgun sequence Length of sequence - 36548 bp Number of predicted genes - 29, with homology - 26 Number of transcription units - 13, operones - 5 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 514 - 1059 560 ## 2 1 Op 2 8/0.000 - CDS 1056 - 3158 2718 ## COG4666 TRAP-type uncharacterized transport system, fused permease components 3 1 Op 3 . - CDS 3312 - 4289 1721 ## COG2358 TRAP-type uncharacterized transport system, periplasmic component - Prom 4317 - 4376 3.7 + Prom 4272 - 4331 1.6 4 2 Op 1 . + CDS 4363 - 4467 70 ## + Prom 4472 - 4531 4.8 5 2 Op 2 . 
+ CDS 4562 - 6277 1163 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase + Term 6285 - 6330 2.1 + Prom 6537 - 6596 4.1 6 3 Tu 1 . + CDS 6749 - 8092 1045 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 + Term 8114 - 8151 6.0 7 4 Tu 1 . + CDS 8499 - 9797 1420 ## COG1158 Transcription termination factor + Term 9806 - 9855 7.2 - Term 9858 - 9899 7.1 8 5 Op 1 . - CDS 9941 - 10573 498 ## DVU1008 hypothetical protein 9 5 Op 2 . - CDS 10570 - 11808 774 ## LI0103 hypothetical protein - Prom 11864 - 11923 5.0 + Prom 11809 - 11868 2.9 10 6 Tu 1 . + CDS 12113 - 12325 238 ## + Term 12464 - 12500 5.1 - Term 12583 - 12619 2.3 11 7 Op 1 34/0.000 - CDS 12728 - 13480 284 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 12 7 Op 2 31/0.000 - CDS 13461 - 14255 1104 ## COG0765 ABC-type amino acid transport system, permease component - Term 14365 - 14404 10.5 13 7 Op 3 . - CDS 14450 - 15193 1195 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 15299 - 15358 2.6 + Prom 15244 - 15303 1.6 14 8 Tu 1 . + CDS 15435 - 16049 458 ## COG0241 Histidinol phosphatase and related phosphatases + Term 16195 - 16230 4.2 - Term 16181 - 16218 3.0 15 9 Tu 1 . - CDS 16260 - 17939 1771 ## DvMF_1897 polysaccharide biosynthesis protein - Term 18153 - 18182 0.5 16 10 Tu 1 . - CDS 18418 - 19806 1834 ## COG1362 Aspartyl aminopeptidase 17 11 Tu 1 . - CDS 19924 - 21330 1407 ## COG1921 Selenocysteine synthase [seryl-tRNASer selenium transferase] - Prom 21436 - 21495 2.1 18 12 Tu 1 . - CDS 21608 - 22873 1220 ## COG0285 Folylpolyglutamate synthase + Prom 23069 - 23128 7.6 19 13 Op 1 29/0.000 + CDS 23149 - 23709 350 ## COG2001 Uncharacterized protein conserved in bacteria 20 13 Op 2 . + CDS 23817 - 24884 1094 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 21 13 Op 3 . 
+ CDS 24881 - 25186 259 ## DVU2511 hypothetical protein 22 13 Op 4 26/0.000 + CDS 25220 - 27304 1859 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 23 13 Op 5 26/0.000 + CDS 27310 - 28758 1587 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 24 13 Op 6 28/0.000 + CDS 28749 - 30167 1286 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 25 13 Op 7 28/0.000 + CDS 30157 - 31233 1387 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase + Term 31461 - 31498 -0.2 26 13 Op 8 25/0.000 + CDS 31518 - 32834 1082 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 27 13 Op 9 31/0.000 + CDS 32831 - 33976 1337 ## COG0772 Bacterial cell division membrane protein 28 13 Op 10 26/0.000 + CDS 33988 - 35064 1027 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 29 13 Op 11 . + CDS 35079 - 36443 1780 ## COG0773 UDP-N-acetylmuramate-alanine ligase Predicted protein(s) >gi|316923882|gb|ADCP01000058.1| GENE 1 514 - 1059 560 181 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRTKTPFRRKRLISRGSSSSTSVEVKVERELQCSRCKAIFPDYLARCPECGSEEWIGLVE VNPYTRMPMETFLKACGHLLWLVGTIGFLVLLWQTDSPDAETNKLFIYGAFLLLFCSVLF SAAYFGMSEIMRRILRMQRRLRAFHENYRDDQPGSGHTRTVQHKLAVRRVQGYGSPINKM Q >gi|316923882|gb|ADCP01000058.1| GENE 2 1056 - 3158 2718 700 aa, chain - ## HITS:1 COG:VC0429 KEGG:ns NR:ns ## COG: VC0429 COG4666 # Protein_GI_number: 15640456 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, fused permease components # Organism: Vibrio cholerae # 17 673 10 732 854 382 38.0 1e-105 MPHKMIEHIEQSPNGDDYREKLVAAEHGLRGDTVGITRIITMILALCWSLFQLASASILL LDTVYIRAIHLAFAISLVYFNIPMIKISGTSWRWDLRILLAMNRVTILDYLLGIIAAVSA LYIVLDYEGIASRAGVPTTRDIIFGLMLVVLLLEATRRVIGPALPIIASIFILYVFTGPH LPDFLAFKGASLSRFISQMTMDTEGIYGVPLHVSATVVFLFVLFGTMLERAGGGNFFIQL AISGLGRFRGGAAKAAVLGSALTGVVSGSSIANVVTTGTFTIPLMKKVGYPPEKAAAVEV ASSINGQLAPPVMGAAAFIIAEYVNVPYIEVAKAAAIPAFAAYAGLLWITHVEACKLGLR 
GLSKSELPSFWTTLKEGLHFLIPICMLLYELIIMQHSAEMSVFRAIVVLALIMLGQPVVK AFVHKKSKRKALKEGIKTLLLSLSAGGSNMAGVAMATASAGIIVGCVSLGLGQQITSFVE ILSGGNIFLLLLITALASLLLGMGLPTTANYIIMASLTAPILVQLASGIVVHGVALAVPL IAAHLYCFYFGILADDTPPVGLAAYAAAAIADTSPIATGIQGFLYDIRTAILPLMFIFNH DIILWGIDSVPEGIFLFFMTVLGCMSFASLTQGWFIAKNTLPDALLLACSTIIMLYPALL TGFFLPHDQRYWGYLIGIGIMGLLAYMQHARTTQTTEAAA >gi|316923882|gb|ADCP01000058.1| GENE 3 3312 - 4289 1721 325 aa, chain - ## HITS:1 COG:VC0430 KEGG:ns NR:ns ## COG: VC0430 COG2358 # Protein_GI_number: 15640457 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, periplasmic component # Organism: Vibrio cholerae # 11 325 17 328 328 265 46.0 7e-71 MHKHLLAAVLSLAVAMTGASALEAHAKTTFVTIGTGGITGVYYPTGGAIAKIVNAKKDKY DIRATVESTGASVFNINAIMAGDLEFGIAQADRQYQAYNGLSEWEGKPQKDLRAVFALAP EAVTFVAAEDSGIKSLKDAKGKVVNIGNPGSGNRQNAIDVFEAAGINIEKDLKAESIKAA DAPRMLQDGRIDGFFYTVGHPNGNIKEATAGKRKTRIVSITDIEPLVKKFPYYSLTNIDM AQYPEATNANEKVTTVGMLATFVTSAKVPDDVVYAITKEVFENLDEFKKLHPALEGLTRE TMLEGLTAPIHPGALKYYKEAGLMK >gi|316923882|gb|ADCP01000058.1| GENE 4 4363 - 4467 70 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVSGILEVYIFVKRAVSFRDGNRFGDMGQWFREN >gi|316923882|gb|ADCP01000058.1| GENE 5 4562 - 6277 1163 571 aa, chain + ## HITS:1 COG:ECs1922 KEGG:ns NR:ns ## COG: ECs1922 COG1473 # Protein_GI_number: 15831176 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Escherichia coli O157:H7 # 15 432 15 433 441 309 44.0 8e-84 MSVSSLPFLHGALAAVLPKLAAWRRDFHAHPEAGWTEFRTAALIIRRLRELGYAIRMGEA ACASDRRMGVPSSAVLAAARERALEHGADAELVAGMGDGHTGFWADLDCGSGPVTAFRFD MDCNEVAECAEENHRPVREGFASRWPGLMHACGHDGHAAVGLGLAEVLVSIRRHLRGRIR LIFQPAEEGARGALPMSEAGAVDGVDVLFGFHIGFKAEGPGSLICGTQGFLATTKRDVLF SGVPAHAGAAPEEGKDALLAACSAVVNLHAISRNGKGGTRIAVGRLEGGEARNVIPASAR LFMETRGETTELDAYMTAESERILRASADLWGCACEWRTVGNSAGGGSSPELARLIATVA EAMGGWKTVIPMSGFGATEDFACLMSRVQTSGGLASYLQVGTDRAAGHHSDRFDFDETCL 
GRALELLARLAVRMAGGGQEDTETETESDMEFLDEVVRRKGIHVQLWLEGEDGEGFGRGR VELLQLVDELGSLSKAAKQLGMSYRGAWGKIKKAERIAGETLVDASGTKRDGYSLTPAGR ELVQRFQQWYADVESFAKKRAEALFSDFDKE >gi|316923882|gb|ADCP01000058.1| GENE 6 6749 - 8092 1045 447 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 4 437 7 439 456 407 48 1e-113 MAWLKSVNDVVWGPGMLVLLVGTGLYLTVGLKFFTFRNFFRGLKNLWAGRSNQSEQGEIS PFNALMTALAADIGTGNIVGVSTAIYIGGPGALFWMWVTALVGMATKFSEVLLSVHFREK TPAGNWVGGAMYFIKNGLGPKWMWLGSCFALFGIVACIGTGAMVQGNSIGEAIQANLSIP TWFTAIAVFLLAGAVLIGGVKRIGAVAGKVVPAMAVIYVLISLIVIILHIDQLPRVLAWV VAEAFTPTAAEGGFAGATVMMAIRMGMARGVFSNEAGLGTAPMAHAAATSASPLVQASIG MLDTFIDTIIVCTMTGFAILVTGQWTSGLTGAAMTSAAFETSLPGIGGIAVTICLTFFAF TTALGWCVYGERCAIYFFGDKAQIPFRIVYCIAIPVGVLTQLDVVWLLADTCNALMAIPN LIAILLLSPVLFKLVKEDESKDSIFKH >gi|316923882|gb|ADCP01000058.1| GENE 7 8499 - 9797 1420 432 aa, chain + ## HITS:1 COG:AGc5136 KEGG:ns NR:ns ## COG: AGc5136 COG1158 # Protein_GI_number: 15890078 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 15 431 2 420 421 568 69.0 1e-162 MRKKKVATEEVQEGATMSLTELKNMSMSVLMELAEKYEIENASSMRKQELIFAMLQACAS RSGAIYGDGVLEILPDGYGFLRSPLSSYMPGPDDIYVSPSQIRRFYLRKGDVVSGQIRPP KEGERYFALLKVKEIGFEPPENARHVVLFDNLTPIYPDRKLIMENGEKNLSCRVIDLMSP IGRGQRGVIVAPPRTGKTILLQSIANSINANHPEVYLIVLLIDERPEEVTDMERTVKNAE VISSTFDEPPQRHVQVCEMVLEKAKRLVERKRDVVILLDSITRLGRAYNAVTPSSGRVLS GGLDANALQRPKRFFGAARNIEGGGSLTIIATALIDTGSRMDEVIFEEFKGTGNMEIYLD RHLAEKRVFPAIDINRTGTRKEDLLLSDDVLKRVWILRKILAPMSSIDSMEFLLDKMRGT KSNEEFINAMSK >gi|316923882|gb|ADCP01000058.1| GENE 8 9941 - 10573 498 210 aa, chain - ## HITS:1 COG:no KEGG:DVU1008 NR:ns ## KEGG: DVU1008 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 14 210 59 262 267 136 37.0 6e-31 MISRCFRFFLFLPLLLLVTVPALARQTPEAVLKEIQAAIDASDLAAFERRVDVDALLDQS SSALIAALQKAGQVDTSGLPPMLALMVASVQDPSMAKQIKGLLVQETGTFTRSGIESGFF 
AGKPKANAKPKGLLAPLFGNVSTGRKELRPRGASRKVNGNVILPATIHDFGNGRDYKVDL GMSPSGESWKVTSIANMDKLVFRLQKEAAE >gi|316923882|gb|ADCP01000058.1| GENE 9 10570 - 11808 774 412 aa, chain - ## HITS:1 COG:no KEGG:LI0103 NR:ns ## KEGG: LI0103 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 8 406 14 419 424 256 34.0 9e-67 MSKPNRYRTDKDDDRETEPSRKGHHGLALLILILLAGVVVWFWLARDPEAREEFSNRMAN LRESAINLAADLMSRTTTPPSPPDVGAVAGNSLPGAPPKYRSPIETPAELDEALAEQRPV QPGGAEVRGPVAPGEQVPPDAGRKDDQVVRIAFIDDLASWLVSHYVPAATPGRSGRLSAS LQGANLRYGMGMTGLAWIGDDLPAGRTAALNHLFTPGMLDAIYRLYVDRFMEAVNRSAAT PLPSGEALSPAQRSEFFKLYARQFRGVAGALQALASMPDFNRQMENLSAAAQRVVDTNAQ YSELTFAADEARSNGELTRYSTLRQQMAAKGQQYQQAVIAREQAKSAFVQALKRTPEARY LDDDALLFIASWIDRRTHNNPEKLTAAGQAANLFRDLAGHFDAAAANPGASQ >gi|316923882|gb|ADCP01000058.1| GENE 10 12113 - 12325 238 70 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGCCCNNTKDDKNQSPDCMASKSTVVTTPEGEDVEVNFNMGTGEVTYGKKTGCCSDKSDG TSSGGCGCSK >gi|316923882|gb|ADCP01000058.1| GENE 11 12728 - 13480 284 250 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 1 226 1 223 312 114 32 1e-24 MTATESNEAPIISIQNVWKFFGQLTALHDVSFDVQPGERVVIIGPSGSGKSTLLRSINRL EVIDKGTILVEGSDINAPENDINKIRQDLGMVFQSFNLFPHKTVLQNLTMAPMKLRHISK HEAEERALQLLKKVGLSEKVNVYPSMLSGGQQQRVAIARALAMQPNIMLFDEPTSALDPE MIGEVLDVMVKLAQEGMTMVCVTHEMGFAREVADRIVFMDQGQILEVAPPAKFFSDPEHP RLQQFLKQIL >gi|316923882|gb|ADCP01000058.1| GENE 12 13461 - 14255 1104 264 aa, chain - ## HITS:1 COG:AF0232 KEGG:ns NR:ns ## COG: AF0232 COG0765 # Protein_GI_number: 11497848 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Archaeoglobus fulgidus # 55 259 13 221 224 170 45.0 3e-42 MNNAPNIRIEVTEGAQIPDPRDRRLLNAWTLAFCVAVVALVTLCATQPEPYLEIVKFLPD GLIITFEVTVLSLLCTLPIGLLTGLGRLSRNPVINLIASTYVEIIRGVPLLVQLFYIYYA LGRIIQVPPMVSAVIAISFCYGAYMGEVFRAGILSVPKGQTEAARSLGFNNFQTMTLVIL 
PQAMRTILPPIGNECIAMLKDTSLVSIIAVADLLRRGREFASQTFDYFETYTVIALVYLI ITLLLSKGVSMMEGRLTYYDRDRK >gi|316923882|gb|ADCP01000058.1| GENE 13 14450 - 15193 1195 247 aa, chain - ## HITS:1 COG:alr3187_1 KEGG:ns NR:ns ## COG: alr3187_1 COG0834 # Protein_GI_number: 17230679 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Nostoc sp. PCC 7120 # 8 244 14 247 261 132 33.0 9e-31 MIRLFSATLLSLLLCASLANAAEKPLIVASDATWPPIEMLDENKNVVGYSIDYLKAVAKE AGLNVEFRNTAWDGIFAALESRQADIIASSVTITDKRKKAMGFSDPYCEIRQAVVVSTNS DLKNLKELDGKKVGGQIGTTGLVETLPKAKSKAIVKTYDEVGLALEDLAKGNIDAVICDD PVAKFYANKKQEYAGKLKVAFITDDVEFYGFAVRKSDTDLVKKLNEGIKAVKEKGIDKQV VEKWIGK >gi|316923882|gb|ADCP01000058.1| GENE 14 15435 - 16049 458 204 aa, chain + ## HITS:1 COG:YPO1074 KEGG:ns NR:ns ## COG: YPO1074 COG0241 # Protein_GI_number: 16121375 # Func_class: E Amino acid transport and metabolism # Function: Histidinol phosphatase and related phosphatases # Organism: Yersinia pestis # 6 174 7 182 188 101 32.0 8e-22 MSVSCAVLLDRDGTLIEDKHYLSEPSGVALLPGVGPALSRLVQAGHRLFLVSNQSGVGRG YFTEQAVVACQKRLEELLDPYGVAFTDAVWCPHAPEESCFCRKPLPGMFDELRARHGLLP ETTFMIGDKLDDLGFAANAGLSAGLLVLSGKGAEHARKAGYPVPVSGIVEVPGAPRRVVA ADFESAVAWIVRDIENRGGEEGAF >gi|316923882|gb|ADCP01000058.1| GENE 15 16260 - 17939 1771 559 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1897 NR:ns ## KEGG: DvMF_1897 # Name: not_defined # Def: polysaccharide biosynthesis protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 59 536 3 480 511 536 59.0 1e-151 MIAAIALWRTPKGHCAEAEDSVRRREILLPSKRPMTEERESAPNQAPDTDQATHQRPPQA SLARRYMGKLLANIATVPVYLIMEAVLPRALGPQVYGNFNFSTSVFTQLTGFLDMGTSTC FYNALSRRQYEMPLVAFYGRVSALVFLVTMLAGVFLLVPGLGDMLLPDVPLWYAPLAAFY GFLLWGTRVVRSMNDALGLTVPGELLRSGVNLLGAIIILALFWFGFLNAGSLFGLQYLMM GSIVIGCLAIMRRSWAHIDLSLTRDQRLSYTREFSDYSMPLFVQALLSALALSAERWVLQ WFDGSTQQGYFALSQRVSAACFLFVSAMTPLIMRELAIAWGKDDREHMGHLLTRFAPLLF 
GIAAWFSCFTAVEASVLVHIFGGAEFAAALVPVQIMALYPAHQAYGQLASSVFHATGKTK VLRNLAFIEHFGGFLLVWLLVAPQDMLGMGLGATGLALKTVTTQFIMVNLLFWMASRLIP LNFFRNLMLQGLILGSLLGIAWVCKQATGIYFADDAHQIIRFFLSGILYSFMSFGSWSPA RGCSGAHGRISGKCSYASD >gi|316923882|gb|ADCP01000058.1| GENE 16 18418 - 19806 1834 462 aa, chain - ## HITS:1 COG:CAC1091 KEGG:ns NR:ns ## COG: CAC1091 COG1362 # Protein_GI_number: 15894376 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 1 460 1 463 465 410 44.0 1e-114 MSNEFDFDTKNGWTHYADAASQEQMDVLAARYMDFLSHAKTERETVDLVVEALKAAGFSE DFKKDLVFRTYRGKAVFVARKGKKPLASGVRLISAHTDAPRLDFKQRPLQEQVGIGQAKT HYYGGIRKYQWLARPLALHGVIIKEDGTSIKVVLGEDPGDPVFTIADLLPHLAQKEVTKT VADAFDAEKLNVILGHSPLPSASDTDETAKEQKDPVRANILALLNRKYGIREEDLMSAEL QAVPAGPARYVGLDAALIGGYGQDDRVCVFTALSAFLDAQAPEHANCLIFWDKEEIGSEG STGAQGRFLEYCMEDLTEAWEPGTKVRDVFQNSTAVSGDVHAATDPDWQELHEKLNASII GHGPTFCKFTGSRGKYGANDAHPEYIGWLRGVLNRKNIPWQMAELGKVDHGGGGTVALYL AAYGMDTIDLGPAVLSMHSPFELLSKADLYSTKLAYQAILEA >gi|316923882|gb|ADCP01000058.1| GENE 17 19924 - 21330 1407 468 aa, chain - ## HITS:1 COG:aq_1031 KEGG:ns NR:ns ## COG: aq_1031 COG1921 # Protein_GI_number: 15606324 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine synthase [seryl-tRNASer selenium transferase] # Organism: Aquifex aeolicus # 1 454 1 439 452 354 42.0 2e-97 MNNLFRSLPSVDACLSGLENAAPTLYATTPRPLLRSLTNAFLDGCRRDIKEGRLTDSGQL GLDALLPRLAAFAEKEAAPRFRRVLNATGVVIHTNMGRSVLADEAVNAILKACRGYSNLE LNLATGERGSRYSHVEELLCTLTGAEAALVVNNNAAAVLITMDALCSGGEVVVARGQLVE IGGSFRIPDVMERSGATLREVGCTNRVHPQDYENAISERTVALMRVHTSNYRVVGFHADV PVEELADMAHRHGLPLIEDLGSGSFLDFTAAGLPGEPTVRSVVAAGADVVTFSGDKVLGG PQAGIIVGTKAIIEKIKRNPLNRALRIDKMTLAALEATLRLYLDEQLARKRIPTVAMITA APDELKRRASRLASRLRKACGDKAEIRLARGESRVGGGSFPENALPTTLVRAVPTGCTPE CLKQRLLETVPPLIGRLEDHAFLLDPRTLADEELPMAANVIAAALNQA >gi|316923882|gb|ADCP01000058.1| GENE 18 21608 - 22873 1220 421 aa, chain - ## HITS:1 COG:DR0340 KEGG:ns NR:ns ## COG: DR0340 COG0285 # Protein_GI_number: 15805369 # 
Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Deinococcus radiodurans # 29 343 48 351 416 167 40.0 3e-41 MRFHSFDDVQDHLDALGLFHMDFGLDRMRNALDALGLLTPPFVTVQIVGTNGKGSTSTFL SCVARAHGLKVGLYTSPHFVTPRERIRINGTMLPADRWPVLADRVMEAAPNLTYFEFLTA LGLLAFAEAGVDLVVMEAGLGGHYDATTAMPVQAVCFTPIGMDHEKILGPTLTDIASDKS QAMRPGVPAFTAPQEAEALDCLLRTAQEKGAELRETASLPFPQSALGLAGPHQRVNARLA IAVWDWLADQHHWPNMPETAAKGLASAHFPGRFQRIPACNGLPPLILDGAHNPHGLRAFE TAVRDADIQPAAVIFSCLADKDISDMLPFIRRIAGDAPLFVPTIQDNERAMNGEELAKLL AEGRGPAITQPTQRLSLALKETASFVPAEDADRHPVLLCGSLYLLGEFFNLHPQTLEQEL V >gi|316923882|gb|ADCP01000058.1| GENE 19 23149 - 23709 350 186 aa, chain + ## HITS:1 COG:BS_yllB KEGG:ns NR:ns ## COG: BS_yllB COG2001 # Protein_GI_number: 16078577 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 39 184 1 141 143 93 34.0 2e-19 MVLYICHPLDKVGGMDDKVGESGKKWSILTKGPRRPDMLFRGQSYRSLDAKGRLMLPPEF RDALTAASADGTFVLTTYDGCLVGYPAPLWNELEERFGRLRNSSRKIRDFRRLVLGGAED QSFDAQGRIRLSRAHVEYAGLEHDAVVVGQGDKFEIWDQARFKALLSQDFDDVADELAES GIDFSF >gi|316923882|gb|ADCP01000058.1| GENE 20 23817 - 24884 1094 355 aa, chain + ## HITS:1 COG:CAC2132 KEGG:ns NR:ns ## COG: CAC2132 COG0275 # Protein_GI_number: 15895401 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Clostridium acetobutylicum # 8 317 5 311 312 258 44.0 2e-68 MPSPESVHISVLLNEAVDALAPKAGGRYLDGTVGMGGHSFAIMERTGGEGFLCGLDRDTQ ALELARIRLAPFGDRVHLVHTRYSEFEAALDGIGWDAVDGALIDIGVSSLQIDSAERGFS FSSDGPLDMRMDRDSEELPVSRLVNRAKLEDLKDIIERYGEDPQAGRIARAIVEARVRKP IETTGELAALVERAYPAAWRAKARNHPATRTFQALRMAVNDEIGELERFLDTILGRLKPG GRVAVISFHSLEDRVVKHRMKAWAQGCICPKHIPVCVCHHAPEALLVTPKPVCPSERELM LNPRAGSAKLRVAEKLDPHARKADEDSGEEAARLRYEAKRAARLARHGREGRERA >gi|316923882|gb|ADCP01000058.1| GENE 21 24881 - 25186 259 101 aa, chain + ## HITS:1 COG:no KEGG:DVU2511 NR:ns ## KEGG: DVU2511 # Name: 
not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 8 95 7 94 95 97 61.0 2e-19 MSQMSSSGWLLAFIISLLLSLLMGLALVWLSIERTDKAYSIRQLRNEVEQRAVLKTKLEV ERDRLLAPHVLRRKAIQLGMSEAKPGQIRRMPDPGPEAGVQ >gi|316923882|gb|ADCP01000058.1| GENE 22 25220 - 27304 1859 694 aa, chain + ## HITS:1 COG:VC2407 KEGG:ns NR:ns ## COG: VC2407 COG0768 # Protein_GI_number: 15642404 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Vibrio cholerae # 40 578 26 573 580 299 34.0 1e-80 MAYLGARKRTTRQASSQAVNRRRSEGGPRAWFASIDWNSVRFRTVAVLFFVVWGALWVRA GYVQLWEGPFLAERARRQHMAAETVAMPRGMITDRNGHILARSVECCSVYANPSIITDIE ATATTLAGVLGRPVDQIKPLLEKKRSFVWIARRVDDATAEAIRQADLTGVELSREFERIY PYRQVAGQLLGFVGMDGKGLEGIERSFDEQLGGLSVRQAVRRDASGREFYVNSNYEAQPA EDVHLTLDLQVQSIAEEEISKAVTEFGAKWGGVLVVDVAAGDVLAWAQFPFFNPNSYNQY RPSEYRNRLAQDALEPGSTFKPFLIASALQEGVVTRDTTFNCENGLWKSRYITIRDDSRP KQNIQSVAKILANSSNIGCGKIGLELGAVKYQRYLSRLGFGERTSVQLAESRGILRPARE WSEADLISSSFGQSLSVTVLQMAQAYLTLANEGVYKPLRIVLTDDVGGGDQRIFSKNTTR EVLSMMREVVDEGTGKRAAIPGVSVAGKTGTAQKAFRGKYGGERTASFVGLVPAEKPQYL VVIFIDEPSKVKYGGVIAAPVFKSVTSRVMAYHGSLPDPGALTPAQIKAQEKAEARARAR AVRRGSKEKTVLGEYRSELATKKTALPVRDSGTVPDVVGQSVRRAVEMFARQGLVPVIKG NGSRVVRQTPEPGVRWAGKDSAPTSCVLWLSEQE >gi|316923882|gb|ADCP01000058.1| GENE 23 27310 - 28758 1587 482 aa, chain + ## HITS:1 COG:slr0528 KEGG:ns NR:ns ## COG: slr0528 COG0769 # Protein_GI_number: 16332016 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Synechocystis # 18 479 38 500 505 325 41.0 9e-89 MYPLSLLESRVAAQGMELCIDSRKATPGCVFVALPGSSADGSQFIADAVARGAGYVVCRP ESAGNCGEAEVVDCADPRQALGLLARARYGTASLPFPVVGVTGTNGKTTTTYLLDHLFAS AGKKTGVLGTVSYRWPGHHEDAPMTTPDCLDVHTMLADMRAAGVDMAFMEVSSHALDQNR VAGVGFGGAVFSNLTQDHLDYHHDMETYFKAKAKLFLDGDGKPFADRVMAIGTDNPWGAR LAGMAPEAIGFGLTASGASGRYLEGKVLSSTTAGLRLHARFEGREWEFTSPLVGNYNAEN LLAVQAVALGFGLDPEAFRCFETFCGVPGRLERILNPQNLDVFVDYAHTPDALINVLNAL 
RGAGFKRIVTVFGCGGNRDRAKRPLMGEAVARLSDVAVLTSDNPRHEEPEAIMADVMPGL SGAAEVFADPDRRRAIEKGLELLHPGDALLIAGKGHESTQQIGDVKHPFSDQQTVREILG CA >gi|316923882|gb|ADCP01000058.1| GENE 24 28749 - 30167 1286 472 aa, chain + ## HITS:1 COG:BS_murF KEGG:ns NR:ns ## COG: BS_murF COG0770 # Protein_GI_number: 16077524 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Bacillus subtilis # 1 464 2 454 457 184 32.0 4e-46 MRLSFEEIVRAVDGSAVAGAAPCGIVEGVGTDSRAVKPGSLFVCIPGETFDGHDFAVKAI EAGAAALLVSRNPFDAVPPVPLILVEDTVKALGKLAHCWRERLGETRVIGLTGTAGKTTV KELLAQVLSRRGLTAKTHMNLNNQIGLPLSMLAATGEEAFWVMEAGISHPNDMDELGAIL EPDLALILNVGPGHAAGLGDRGTAHYKSKLLAHLAPGGTAIISADYPDLAREARAVRQEL VFFSTSGRQVDYRAAYVVPAGEDKGLFRLWLDGVSIDVEAPFRGAFGAENVIAVAAVAHR LGLGAEEIAAGFAGAALPKQRFACSRAGDWLVIDDSYNANPLSFSRMLEAAAEMAGDHAD RPLVCVLGEMGELGSLSEDEHRNLGRLVADIKPRLVCWKGGHLEEFEDGLHAGRYGGAFC PVTSAEDMFKGLAACDLGSGGGVILFKGSRSNKLETLVSAFTEAQAGEPHAV >gi|316923882|gb|ADCP01000058.1| GENE 25 30157 - 31233 1387 358 aa, chain + ## HITS:1 COG:RC0910 KEGG:ns NR:ns ## COG: RC0910 COG0472 # Protein_GI_number: 15892833 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Rickettsia conorii # 1 358 1 361 361 338 51.0 9e-93 MLYNLLYPLSVDYSILNIFKYITFRSAWALVTALAVSILVGPYFIRWLQAIKCRQEILKE VKAHQCKAGTPTMGGLLIVFAVSVSVLLWADLTNAYVWMTMLVFIGFSAVGFIDDFMKVS HHQNKGLSPRAKFAGQMLVAGIAMLLLVSEPAYSTRLAFPFFKTLNPDLGYWYIPFAMLV MVGTSNAVNLTDGLDGLAIGPMVVAGLVFSVYIYITGHVRFAQYLQLEYISGVGEVAVFC SALAGAGLGFLWFNAYPAQMFMGDVGSLGLGGVLAFLAVLCKQELLLLIVGGLFVAETLS VILQVSYFKASGGKRIFRMAPLHHHYEMKGIPESKIIIRFWIISVMLGLIALSALKLR >gi|316923882|gb|ADCP01000058.1| GENE 26 31518 - 32834 1082 438 aa, chain + ## HITS:1 COG:aq_2075 KEGG:ns NR:ns ## COG: aq_2075 COG0771 # Protein_GI_number: 15607039 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: 
Aquifex aeolicus # 16 433 4 408 415 223 35.0 6e-58 MFTKPAPAVKPGDLAVVVGAGVSGVAAAKLLHKLGARVRLLDRSLDRIPADFAEWAESAG VEIVCGEHSPSQFRDAKLVVPSPGVAAAVISKFLPDGTPPEIMAETELAWRQLSGEPVLG VTGTSGKTTTTSLCAAMLKEQGLTVFTGGNIGTPLSEYILSGEKADVVVLELSSFQLQTC STLHPRVGILLNISENHLDFHKDMDEYISAKMRLFANQTETDLAVLQDGMDELAGKYGIK ARKIHYTESDRFPDMQLMGPHNRCNAEAAWLACREFGVTEEAAARAVAAFQPLEHRLEKV AEINGVLYVNDSKCTTVEALRVALSSFDRPVVLLAGGKFKGGDLPGLCPLLKEHARCVAL FGASRERFEPAWRDTLPVTWDRTLEQALRRVAGKDGQPALAHEGDVVLLAPACSSYDQYP NYLRRGEDFKRVVHEVLA >gi|316923882|gb|ADCP01000058.1| GENE 27 32831 - 33976 1337 381 aa, chain + ## HITS:1 COG:BS_spoVE KEGG:ns NR:ns ## COG: BS_spoVE COG0772 # Protein_GI_number: 16078585 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus subtilis # 43 381 26 365 366 226 40.0 8e-59 MTMARRRVPSSGFRPTQESTPVGQYDWGLIALFLMLLCIGLLMVLSASGVVAERINGDKY FFFKRQLIYAVIGGVVMWVLAAVPRHILYKLQYPFLLFVLMLLFVTLSPLGARVNGAQRW ISVKFFSIQPLEFAKIALALYLAYFMSTKQELVKTFSKGIIPPFAMTALFCFLLLAQPDF GGAVVLSLILFFMCLIGGTRFIYLFMAIGVGLAGALALIIFEPYRARRLVAFLDPFADAQ NAGYQLVQSLYALGSGGFFGVGMGGSSQKMFYLPEAHNDFIMAVVGEELGFFGMTLIMVL FAMLFMRCYKIIMGQSDLRDRFSAFGVTLVLAIGATLNLAVVMGMAPPKGVAMPFLSYGG SSLLASMMCIGLLLNFSRTAR >gi|316923882|gb|ADCP01000058.1| GENE 28 33988 - 35064 1027 358 aa, chain + ## HITS:1 COG:PA4412 KEGG:ns NR:ns ## COG: PA4412 COG0707 # Protein_GI_number: 15599608 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Pseudomonas aeruginosa # 1 356 1 356 357 229 40.0 7e-60 MGVRIILTTGGTGGHIFPALAVAEQLRREGAELLFVGSQYGSEAKLAKQAGLEFRGLPVR GVLGRGLRSVGALWGLFRAVFMARAIVKDFRPDAVIGFGAYASFPSLVAAKLSGVPIAVH EQNAMPGLTNRMLAKLAKRVFLSLPDVTGAFDAKKSQLTGNPVREAIVESGKNAVGHPGT RRLLVMGGSQGAKAVNSVILASLERLTKAGIEIRHQTGSFDLERVLAGYRAHGVDASGVT PFIEDVAAAYQWADLVLCRAGATSVAELAVAGKAAVLVPFPYATHDHQTYNAQVMVDQGA ALLVAEKDLPHLDAGGMLINLLLDPGTLRTMSQMAHTCARPDAASKVAQGVLALCRQS 
>gi|316923882|gb|ADCP01000058.1| GENE 29 35079 - 36443 1780 454 aa, chain + ## HITS:1 COG:NMA2061 KEGG:ns NR:ns ## COG: NMA2061 COG0773 # Protein_GI_number: 15794939 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Neisseria meningitidis Z2491 # 1 453 39 502 506 454 52.0 1e-127 MRTKFTKIHMIGIGGSGMSGIAEVLLNLGYEVHGSDMADGPVVQRLRSLGADIHIGHAAG NVDDVNVLVKSSAVAEDNPEVVAARQKGIPVIPRAEMLAELMRLRTGIAIAGTHGKTSTT SLTAAIFDAASLDPTVIIGGRINAYGANARLGQGEYLIAEADESDGSFLCLLPIINVVTN VDRDHLDHYGNQEAIDEAFVSFMNQIPFYGLNVVCGDDPGVERLLPRVKRPVVTYGFGSG NHIRAEVLECGVINRFKVFIKGEELGEVTLRQPGRHNILNALAAIGVALDAGIEPHFCIE GLARFGGVGRRFEHKGEKNDILVVDDYGHHPVEIAATLATARTAYPDRRVVVAFQPHRFS RTQALFGEFCHVLGTVDKLLLTEIYPASEKPIPGVNGQNLAQGIRQVSDADVSYFQNFDE MVAALPDILKPGDLFLTIGAGNVTTVGPRFLALE Prediction of potential genes in microbial genomes Time: Fri May 13 02:39:39 2011 Seq name: gi|316923873|gb|ADCP01000059.1| Bilophila wadsworthia 3_1_6 cont1.59, whole genome shotgun sequence Length of sequence - 8580 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 73 - 132 3.0 1 1 Op 1 . + CDS 180 - 1070 861 ## COG0812 UDP-N-acetylmuramate dehydrogenase 2 1 Op 2 . + CDS 1058 - 1903 736 ## DVU2501 cell division protein FtsQ, putative 3 2 Op 1 35/0.000 + CDS 2133 - 3365 1502 ## COG0849 Actin-like ATPase involved in cell division 4 2 Op 2 . + CDS 3410 - 4693 1656 ## COG0206 Cell division GTPase + Term 4756 - 4802 14.1 + Prom 4833 - 4892 2.8 5 3 Tu 1 . + CDS 5021 - 5758 675 ## LI1049 hypothetical protein + Term 5946 - 5987 10.6 - Term 5934 - 5975 10.6 6 4 Tu 1 . - CDS 5981 - 6187 268 ## DMR_39680 hypothetical protein - Prom 6220 - 6279 3.2 7 5 Op 1 . + CDS 6291 - 7097 844 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) 8 5 Op 2 . 
+ CDS 7178 - 8536 1257 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs Predicted protein(s) >gi|316923873|gb|ADCP01000059.1| GENE 1 180 - 1070 861 296 aa, chain + ## HITS:1 COG:TM1714 KEGG:ns NR:ns ## COG: TM1714 COG0812 # Protein_GI_number: 15644461 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Thermotoga maritima # 9 294 4 278 284 137 31.0 3e-32 MVRIVASPRMASLTSLHLGGRALALATFESAADLETLPEAADRIGGRVAMFGGGTNILAA DGELPVVLVRCGMDEAPAVIGEDGGLRLVRVGAGVKLPRLLVWCMKHGLSGLEGMAGVPG DVGGAVAGNAGAHGMDMGTVLRSVDVFSPDRGFRTLGREDVRCEYRFFGLKDGGKPWFAI SGIVLALRPAEPEAIRTALRDNIERKLRVQPVRSWSAGCVFKNPPEGTSAGKLLDEAGFR GKRLGSMCFSEIHANFLVNEGKGSADAALELIRSAQSAVWERFGIQLQTEVKLWVF >gi|316923873|gb|ADCP01000059.1| GENE 2 1058 - 1903 736 281 aa, chain + ## HITS:1 COG:no KEGG:DVU2501 NR:ns ## KEGG: DVU2501 # Name: not_defined # Def: cell division protein FtsQ, putative # Organism: D.vulgaris # Pathway: not_defined # 12 280 9 277 278 238 48.0 2e-61 MGILSSSKRNNRRPARNSYTSSQRRAPSSGGGSFFAHVKSALSWTLGLGLGVALLVGLGV GGLQLHRMATTSEFFAIKRVEIRGTTHFSREEVLKAANLQSGVNSLTVNIADVEQGLRDN PWVLSVAVKRRLPDAFEIRIRERIPAFWMLKDGVLYYADNRGQIIAPVNVGNFLSLPTLE ILPGGEELLPQMDELSRAFQAAHLPVNMASVSLFRVSAAKGFEVFIENRNLVLCIAAEDW DANLRRLSLVLSDLARRGELKTAREVWAADGNVWVVEPENV >gi|316923873|gb|ADCP01000059.1| GENE 3 2133 - 3365 1502 410 aa, chain + ## HITS:1 COG:PA4408 KEGG:ns NR:ns ## COG: PA4408 COG0849 # Protein_GI_number: 15599604 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell division # Organism: Pseudomonas aeruginosa # 1 380 4 383 417 424 55.0 1e-118 MSKSELIVGLDIGTTKICAVVGEPSENGGMDIVGIGTSPSTGLRKGVVVNIEQTVQSIKR ALEEAELMAGCEIRSVYAGIAGSHIKGFNSHGVIAVKGGEVSPRDIERALDAAKAVAIPL DREVIHILPQEYIVDDQRGIADPMGMAGVRLEVKVHIVTGAVTSAQNIVRSCHKSGLDVS DIVLESLASAKAVLTEEEREIGVALIDLGGGTCDIAIFANDSIKHTGVLALGGQNLTNDI 
AFGLRTPMAAAEKIKIKHGAAIAEMVRPDEYIEVPSVGGREPRRLSRQVLAEICEPRMEE ILTLLDQELVRSGLKNMIGAGVVLTGGTALIQGCQELGEQVFNLPTRIGYPRNVGGLKDM VNSPKFATAMGLLRFGAEKEGMEQKFRIRSDGNVFNSILSRMKKWFSEIS >gi|316923873|gb|ADCP01000059.1| GENE 4 3410 - 4693 1656 427 aa, chain + ## HITS:1 COG:CAC1693 KEGG:ns NR:ns ## COG: CAC1693 COG0206 # Protein_GI_number: 15894970 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Clostridium acetobutylicum # 12 348 12 337 373 301 57.0 1e-81 MDMQFEADTPPANIKVIGVGGGGGNAVQNMIMAGLKGVSFICANTDAQALLRSKAEIKLQ IGEKLTKGLGAGADPNVGRDAAQESIGAIKDAIGDADMVFVTAGMGGGTGTGAAPIVAQA ARELGALTVGVVTKPFLFEGTKRARAAEQGIAELRENVDSLITIPNNRLLTIAPKKAKLS DMLKCADDVLHRAVRGISDLITVPGLINVDFADVRTVMSVSGLAMMGAGIAVGEGRAIEA ARKAITSPLLEDVSIAGAKAVLINITANEDLLFEEFNDASAYINDALGEADTNIIIGCAT DENAGDEIRITVIATGIEGNAAPKVVQGGQANMATVRPQQRPAQQPPQPSHSGLNYQKQA LAEEHAMPRMRMPRTVGNFSDEERIVPTFLRDRDQLSKQTTHNPGREEFIFEEEEIELPT FIRKQAN >gi|316923873|gb|ADCP01000059.1| GENE 5 5021 - 5758 675 245 aa, chain + ## HITS:1 COG:no KEGG:LI1049 NR:ns ## KEGG: LI1049 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 2 238 5 239 244 218 48.0 1e-55 MKTIRFFLLWCMVVAISACTPFYRGFEGSTLVSPARPDVSVSVVDMPMLAHGQIAPFLNT DQGYQFPETLVSVYGTDAASPVAIVALSFVPSNAWEWDPLSFSGPMAQQNMGTVFGGESF YGSVRIVNGAKDPFAPLFAPEDQWASLDWLVQRFACLNDFRRSKLILEYREPLPASLKGV SEVPVYNEAVQAFRERAAKVFHVQYGEAPMPKDEAPYIKTLNARYMGNFLGSMSVKEPLF PNYSD >gi|316923873|gb|ADCP01000059.1| GENE 6 5981 - 6187 268 68 aa, chain - ## HITS:1 COG:no KEGG:DMR_39680 NR:ns ## KEGG: DMR_39680 # Name: not_defined # Def: hypothetical protein # Organism: D.magneticus # Pathway: not_defined # 1 66 1 66 71 86 59.0 4e-16 MALNPELLALLACPVCRGELDPVDNESGLECPACGLVFPVRDNIPIMLQEEAIRKDDWER GQRKKQSR >gi|316923873|gb|ADCP01000059.1| GENE 7 6291 - 7097 844 268 aa, chain + ## HITS:1 COG:BH3372 KEGG:ns NR:ns ## COG: BH3372 COG0613 # Protein_GI_number: 15615934 # Func_class: R General function prediction only # 
Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Bacillus halodurans # 7 251 13 250 266 151 33.0 1e-36 MQPRFVDLHTHTTASDGTDAPRDLIRRAASLKLAAVAVTDHDTVSGLDEAEAAGREYGVE IIRGCELGVQGQYGEIHLLGLWLPRHSAPLDAELSRLRGHREERNLKILDRLRSIGINIG YQEVLDEAGGESVGRPHIARVLQKRGIVSNFAQAFELYLGYYGAAYVPRTLLTPEEGVNL MADLGAVVSFAHPMLIRCPPSWFDEIIPRLKEAGLGAIEAYHSEHSARDERFCVELAARY GLGLSGGSDYHGMAKPGVELGRGKGGCG >gi|316923873|gb|ADCP01000059.1| GENE 8 7178 - 8536 1257 452 aa, chain + ## HITS:1 COG:SMc01406 KEGG:ns NR:ns ## COG: SMc01406 COG1167 # Protein_GI_number: 15965838 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Sinorhizobium meliloti # 34 450 36 461 472 254 41.0 2e-67 MKLLRLYQDQAREHPGEAKYMLVANAIAQGVEDGCLAPGESLPTHRELAEALGVTIGTVS RGYAEAAQRGLTVGVTGRGTFVTAPCHDVDVYEGAGRMHDLGFIAPFEYLNPDLNEAVAE LSRYADLKELSNYQQPRGLLRHREAGAHWAGRYGLAVSPENLLVCAGAQHALLTVLASLF NPGDRIAAEQLSYPLLKQLARRLRLNLTPVRMDQGGMLPDSLEAACRAGGIKGLYLMPTC QNPTLAQIPEYRRRELVEICRRYDVMIIEDDVYALSLEHNLPPLASLAPERCCFIASTSE ALSGGLRIAYLCPPDAVFAELERTISYTISMAPPLMAELATMWIRNGTADRVLAAKRREA AERNALARQLLDGFPLETRTTGFFCWLKLPDPWAAVAFAEAARQRGIIVADSDLFALNHA SPEQGVRLALGGVRTREALADALGTLAGMLHG Prediction of potential genes in microbial genomes Time: Fri May 13 02:40:03 2011 Seq name: gi|316923856|gb|ADCP01000060.1| Bilophila wadsworthia 3_1_6 cont1.60, whole genome shotgun sequence Length of sequence - 17356 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 11, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 194 - 253 8.2 1 1 Tu 1 . + CDS 326 - 640 467 ## + Term 681 - 715 7.3 2 2 Op 1 . + CDS 799 - 1923 862 ## COG0438 Glycosyltransferase 3 2 Op 2 . + CDS 2140 - 3162 1098 ## COG2008 Threonine aldolase + Term 3185 - 3229 9.9 4 3 Tu 1 . 
- CDS 3498 - 4151 760 ## COG1739 Uncharacterized conserved protein - Prom 4258 - 4317 2.6 + Prom 4189 - 4248 3.8 5 4 Tu 1 . + CDS 4367 - 4855 841 ## COG1592 Rubrerythrin + Term 4879 - 4932 1.5 - Term 4987 - 5026 9.1 6 5 Op 1 32/0.000 - CDS 5240 - 5926 841 ## COG0704 Phosphate uptake regulator 7 5 Op 2 . - CDS 5983 - 6792 292 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 8 6 Op 1 40/0.000 - CDS 6995 - 8806 2086 ## COG0642 Signal transduction histidine kinase 9 6 Op 2 . - CDS 8808 - 9494 983 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 10 7 Op 1 39/0.000 + CDS 9712 - 10533 1301 ## COG0226 ABC-type phosphate transport system, periplasmic component + Term 10542 - 10592 9.0 11 7 Op 2 38/0.000 + CDS 10748 - 11635 1100 ## COG0573 ABC-type phosphate transport system, permease component 12 7 Op 3 . + CDS 11645 - 12520 998 ## COG0581 ABC-type phosphate transport system, permease component + Term 12534 - 12577 -0.7 - Term 12602 - 12645 9.6 13 8 Tu 1 . - CDS 12693 - 13118 236 ## COG1943 Transposase and inactivated derivatives - Prom 13283 - 13342 1.8 - Term 13387 - 13412 -0.5 14 9 Tu 1 . - CDS 13415 - 16015 2843 ## COG1032 Fe-S oxidoreductase 15 10 Tu 1 . - CDS 16122 - 16442 596 ## LI0572 hypothetical protein - Prom 16465 - 16524 2.6 + Prom 16481 - 16540 2.6 16 11 Tu 1 . 
+ CDS 16649 - 17350 727 ## COG0571 dsRNA-specific ribonuclease Predicted protein(s) >gi|316923856|gb|ADCP01000060.1| GENE 1 326 - 640 467 104 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRGFMKYACVLALPLLVAAGAAQAQTWPDGGNPSEGATLYDSCVPCHTLNGNGLAGKSVS DLMDKMKAYQSGTFTDAKLLGMQQVLKPLSDKQLLDLAAYINKM >gi|316923856|gb|ADCP01000060.1| GENE 2 799 - 1923 862 374 aa, chain + ## HITS:1 COG:XF0879 KEGG:ns NR:ns ## COG: XF0879 COG0438 # Protein_GI_number: 15837481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Xylella fastidiosa 9a5c # 25 344 22 322 370 74 26.0 4e-13 MSSFRSIQVVNVRWFNATAWYGLELARLLNAAGHESRVVALADTETFAKAEEMGLRPLAM PLNAKNPLEFPGLIRDMGRLVRAFRPDVVNCHRGESFLFWGLLKGMGGYALVRTRGDQRL PKGNLPNRILHTRVADAVVATNSVMARHFAEKMRVPAERLHMILGGVDTERFRFDPEGRV EVRARYGFTDDECVVGLLGRFDLVKGQRETIAALAKLVGEGVRNIRLLLLGFSTATLQEE VEAWIREAGMERYVTITGKVPDVTACLSALDVGVVASLWSETIARAALEIMACGRPLVST SVGVMPDLLPASALVAPGDVDALAGVLRRAATDEAWRRGLAGHCSERIMSLRDKDFLEQT LAVYAEAYRRRCSS >gi|316923856|gb|ADCP01000060.1| GENE 3 2140 - 3162 1098 340 aa, chain + ## HITS:1 COG:PA5413 KEGG:ns NR:ns ## COG: PA5413 COG2008 # Protein_GI_number: 15600606 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Pseudomonas aeruginosa # 5 336 8 340 346 252 43.0 6e-67 MDCTFASDNTSGVHPKVMEALNKANVGAAEPYGDDPWTAEAEGCFKALFGDDVDVFLVPL GTGANVLGFNRMIRSWHSILCSDMAHTHTSESGAVEAVVGCKMTPIPSVHGKISAGALSA YLTDMGSPHHSQPGLVALTQSTEVATVYTPEEIKAIVDVAHANGLFVHMDGARIANAAVA LGCDVRSFTKDLGVDMMSFGGTKNGMMMAESVVVFNKDFVRDFVTLRKQNLQLASKMRFL AAQYIAYFKDGLWLENARHANNMARLLADLISGMPHVELAHPVESNGVFVNMKPEHIAAL QRQYMFHEVEPSAHTVRWMLSFNTSEEQVRAFAKAIGALA >gi|316923856|gb|ADCP01000060.1| GENE 4 3498 - 4151 760 217 aa, chain - ## HITS:1 COG:STM3985 KEGG:ns NR:ns ## COG: STM3985 COG1739 # Protein_GI_number: 16767255 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 20 209 15 204 204 150 39.0 2e-36 
MSVSYFIPDLAPEDLHRCEEIVRRSRFIVSLAHTPTPEAAKSFIERIKQEFPDATHNCWA YAAGAPGQTSKVGYSDDGEPHGTAGRPMLTMLLHGGVGELSAVVTRYFGGIKLGTGGLVR AYQGMVKLGLETLPTREHMIPARVEVVLDYPHITVFRRLLPDYRATVASENFGVDATYEL LLPDQNIPALETALTELTDGAVLITRLDDGTMPPETP >gi|316923856|gb|ADCP01000060.1| GENE 5 4367 - 4855 841 162 aa, chain + ## HITS:1 COG:MA0639 KEGG:ns NR:ns ## COG: MA0639 COG1592 # Protein_GI_number: 20089526 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Methanosarcina acetivorans str.C2A # 4 162 3 163 163 163 59.0 1e-40 MSKTEKNLMDAFAGESQANRKYLAFAKQADKEGLPQVAKLFRAAAEAETIHAHAHLRNAG KIGDTTANLKAAIEGETYEFTKMYPEMIADAKEEGQDRIAKYFDMVNKVEEVHANLYKKA LADPSSITGDLYVCTVCGYTQEGPCDKCPICGAVAAAFAKID >gi|316923856|gb|ADCP01000060.1| GENE 6 5240 - 5926 841 228 aa, chain - ## HITS:1 COG:PA5365 KEGG:ns NR:ns ## COG: PA5365 COG0704 # Protein_GI_number: 15600558 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Pseudomonas aeruginosa # 6 218 10 226 242 100 31.0 2e-21 MAYQEHYTQQLLDNLRAKLLIMGSKTQQALDDAAIAVLNHDLPRAAAVLDGDTDIDDLEN QIDEATLNILARTQPVARDLRFLMSVVRMVLDLERIGDESVVVAEQVTLSDKPIPSIVEK DLRALCTRTSAMLRNSLLAFQNGDAPSALAVSRYDDETAQMMVNIFQKLMQAVGDRTLEP WDSMHIVLITRALDRVCRRAENIAEHAYFMVEGVSLKHRRTLPQGWEK >gi|316923856|gb|ADCP01000060.1| GENE 7 5983 - 6792 292 269 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 26 255 5 230 245 117 35 7e-26 MPQPIASPVTLDNAQSLSGTPKMAAQNVNFFYGSSQALFDVSLTFPERQVTALIGPSGCG KSTFLRCLNRMNDLIPGTRTEGLITLDKEDINAPNFDVVNLRRRVGMVFQKPTPFPKTIF ENVAYGLRVNGERDESLIADKVEESLRRAAIFEEVKDRLHTSALGLSGGQQQRLCIARAL AVEPEVLLMDEPASALDPIATQRIEEGIHELKSQLSIVIVTHSMQQAARVSDRTAFFYMG KLIECDLTEKIFTNPSLKQTEDYITGRFG >gi|316923856|gb|ADCP01000060.1| GENE 8 6995 - 8806 2086 603 aa, chain - ## HITS:1 COG:BH3156 KEGG:ns NR:ns ## COG: BH3156 COG0642 # Protein_GI_number: 15615718 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine 
kinase # Organism: Bacillus halodurans # 4 593 2 577 589 242 29.0 2e-63 MRDTKFRTRLFLCFLAVIVMTLVLPGLYVRKNIKEDGIDETVRNARREAVLIRMQMETAP DGVHGLSSTFDRIGTELGIRITFIGADGTVLAESDSSRINLNDLDNHADRQEIGAALREG FGVSIRHSNTLDTDLVYAAIPFAAVGKLPDGFIRVAVPLKHVEERIAQSERQWLAIVGLA TLIALLFAYGLSTHLERSLKEMIRVVESIATDPSGQGNRRRLHILPGREFRRLAHAVNDM ADRTEEHLRTIAENKAQLKTILDVMNEGVLVIGEKGRIRLVNPALVRLFPQAKDAEGKFP VEVIPAAEIQQALDELLSAPQDKPRLITLEVEPRPDMILSVQFIRPREAIHDVLAVAVFH DISEMARLMRVRKEFVANVSHELRTPLTAIAGYAETLRDSAAEDPSICAKFAETILRNAQ HMGTMVEDLLKLSRIESGAVPMEMETVKAASILSDVTTACQPQTAAHNLTLETDIDPALT VRADPHFIGQVFRNLIENACRYAPEGSTIKVSGGKRLDHNGLPEALFMICDDGPGIPQAD IARVFERFYRVEKHRSSPASTGLGLAICKHIVERHGGRIWAEPGPGGCLQFTLPLAAETE KRT >gi|316923856|gb|ADCP01000060.1| GENE 9 8808 - 9494 983 228 aa, chain - ## HITS:1 COG:CAC1700 KEGG:ns NR:ns ## COG: CAC1700 COG0745 # Protein_GI_number: 15894977 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 8 225 6 226 232 216 49.0 2e-56 MISESATILIVEDEQDIRELLAYNLEKEGYATVQAADGKEGLELARSRKPDLILLDLMLP KMDGLAVCRELERDSGTVRIPIIMLTARGEDVDRILGFELGADDYVVKPFNIRELLLRIR AILRRQVMVESNPVLTRHGVSVDPAAHKVTVLGQDVELTATEFRLIEDLLRNAGRVRTRE ELLAAVWGYQFEGYARTVDTHIRRLRNKLGDAAEIIETVRGVGYRCKE >gi|316923856|gb|ADCP01000060.1| GENE 10 9712 - 10533 1301 273 aa, chain + ## HITS:1 COG:MA0887 KEGG:ns NR:ns ## COG: MA0887 COG0226 # Protein_GI_number: 20089771 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 26 272 68 317 317 198 45.0 9e-51 MKRIVLASLLALSCLLPAQAKAADEIVVNGSTTVLPIMQKVSEAYMAANPNVQIALSGGG SGNGIKALLDGLANIAMSSRDIKGSEKELAAKKGINPVRTAVAVDALVPVVNPKNPINEL SLDQLKDIYTGKITNWKELGGADANIVVVSRDTSSGTYETWEEMVMKKAKVMPKALLQAS NGAVEQVVAKNPNAIGYVGLGYLAPSIKGLHIGKVAASAETALSKEWPLSRELYVFTNGE PAGASGALIKYILDPAKGQKAVKEVGFVPLAKK 
>gi|316923856|gb|ADCP01000060.1| GENE 11 10748 - 11635 1100 295 aa, chain + ## HITS:1 COG:MA0888 KEGG:ns NR:ns ## COG: MA0888 COG0573 # Protein_GI_number: 20089772 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 2 288 3 292 296 249 50.0 4e-66 MSRELKEKSIRWLLTGVAFVSLLSLACIMLFLFMEGLPLFADYPLWDFLTGHLWYPTSEP PEFGIFPLIMGSLAVTILASCIAVPLGIMTAAYLAEIAASRTRRIVKPFVELLAALPSVV IGFFGMVIVAPFLQDWFGADTGLNLLNAALMLAFMSVPTICSVAEDALFAVPRTLREASL ALGATRWETLIRVVIPSALSGIGTAVMLGMSRAIGETMVVLMVAGGAAIIPLSIFDPVRP MPASIAAEMAEAPFRGDHYFALFATGIVLFLFTLLFNMLAFRIAEKHKQTGSSGL >gi|316923856|gb|ADCP01000060.1| GENE 12 11645 - 12520 998 291 aa, chain + ## HITS:1 COG:MA0889 KEGG:ns NR:ns ## COG: MA0889 COG0581 # Protein_GI_number: 20089773 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 17 290 31 304 307 272 54.0 4e-73 MTTITFHSDNTSRQRREAMMFGLLRVAAGINILALVGVCVFLLWNGSPAISWEFLTEPPR RMMTAGGVWPCIVGTFLLAFGAMLIAFPLGVASAVYLHEYGGKGRYTRYLRLGISNLAGV PSVVFGLFGLAFFVTFLGMGVSLLAGVLTLAVLTLPVIINTTEEALRQVPDAWREASLAL GATRSQTIARVVLPAAVPGMLTGAILGLARAAGETAAIMFTAAVFYTPKMPDSVFSAVMS LPYHMYVLATAGTEIEKTRPLQYGTALILLVLVLGMNLIAIIIRDRMQRKR >gi|316923856|gb|ADCP01000060.1| GENE 13 12693 - 13118 236 141 aa, chain - ## HITS:1 COG:SP1064 KEGG:ns NR:ns ## COG: SP1064 COG1943 # Protein_GI_number: 15900934 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pneumoniae TIGR4 # 5 141 6 143 157 138 48.0 3e-33 MKQIQSLSHTVWECKYHIVWIPKYRKKVLYGKIREELVGVFHELANRRGCKILEGHLCVD HVHMLLSIPPKYGVAEVVGYLKGKSAIEIAKHFEKVRKITGASFWARGYFVSTVGVGERE IRLYIRNQEKEDRKVEQLKLL >gi|316923856|gb|ADCP01000060.1| GENE 14 13415 - 16015 2843 866 aa, chain - ## HITS:1 COG:CAC1254 KEGG:ns NR:ns ## COG: CAC1254 COG1032 # Protein_GI_number: 15894536 # Func_class: 
C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 1 589 5 602 622 508 41.0 1e-143 MKELLSLLPRPSHYIGTEEGSVHKEPASVRLHCALAFPDLYEVGMSYLGHKILYTILNNR EDIFAERVYAPCRETGRLLREHGVSLATLESDTDIVKTHMFAFAITHELCFTNVLYMLEL SGIPLRAADRGDDLFRWPLIVAGGGCAIASEPLAPFMDLMLLGEGEEMVPELCDLVIKAR EEGWSRSRLIEEAVNIPGVYAPSLYTHDANGVLTPLKPDLPTPGRRIVADFDRAAYPEKQ VVPFGAVHNRLSLEIARGCTRGCRFCQAGVLYRPARERSLPNLEKILENCLNDTGFDDVS FLALSTGDFSALKTLFLGTMDRCEAEQISVSLPSLRVGSIDDDIMRRLAGIRRTGATLAP EAGSQRLRDIINKGVTEEGLMLHVRKLFEHGWQQVKLYFMIGLPGETEEDIEAIVDLCRK ARDAAGRGMPRLQVTAAISPFVPKSHTPFQWEPQISLEQVRERVQYLRDAFRAEKCLKLR WHEPEMSFLEGVLSRADRRIADVVEKAYRRGAIFASWMDHFSIDPWLESLAECGLTAEEF TGARELDAPLPWDHLNAGVSREFLLRERRRAFEGKISDDCRYAACRQCGACDTAAGKSLL PRTPGLEEGTHRNSLNFKQRDQLEHQPNLDENGRLLMPPKPPKATEPPAINSALAVKAVR YRVWHTKEAEAAYISQLELQSLLERAMRRAGLPMAFSQGFHPLPLISFGRALPVGVESQA EWFSIVLREPLSAEEVMKRLAPRMLRGLRLDRLEEIPVNDKSVGSVQETFSLRFVGSDAD RRLFMEAWDDFTATDSLMFTRETKKGPRTADIRPLFQVIEWDEHGTLYIVTDWSETYISP MTLARAITPWAEQHQLKIMKLSQMFG >gi|316923856|gb|ADCP01000060.1| GENE 15 16122 - 16442 596 106 aa, chain - ## HITS:1 COG:no KEGG:LI0572 NR:ns ## KEGG: LI0572 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 95 1 95 111 144 74.0 1e-33 MSKPIRLFKLVTGELVLGKFDEEANMLEDVAIIQVVPTQQGVQMMMLPYGYPFEQEFKGS ISGEHFMYEYKKLPDDLETKYLEACTNLTLSTGGLAGAAGKSPLIK >gi|316923856|gb|ADCP01000060.1| GENE 16 16649 - 17350 727 233 aa, chain + ## HITS:1 COG:Cj1635c KEGG:ns NR:ns ## COG: Cj1635c COG0571 # Protein_GI_number: 15792940 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Campylobacter jejuni # 3 226 4 222 224 159 41.0 5e-39 MTLDVLQDRIGYRFKDESLLVLAVTHSSWANEHSAGNAHNERLEFLGDAVLEIAVSAQLF ARFPDAREGELTRLRSSLVNEATLAVIARKLHLDDCLRLARGEENQGGRQRDSLLSDAME AVFGGVFVDGGIEKARAVVERLYTDLWPKNATKVRRKDFKTKLQEATQRISKGLPVYALE DSYGPEHAKIFSVRVDLPDGRQFRASGPGLKRAEQEAAHVALVALGEEDGEEK Prediction of potential genes in microbial genomes Time: Fri May 13 
02:40:36 2011 Seq name: gi|316923829|gb|ADCP01000061.1| Bilophila wadsworthia 3_1_6 cont1.61, whole genome shotgun sequence Length of sequence - 28101 bp Number of predicted genes - 28, with homology - 24 Number of transcription units - 17, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 243 - 274 3.4 1 1 Op 1 . - CDS 295 - 975 243 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 2 1 Op 2 . - CDS 972 - 1667 952 ## COG4662 ABC-type tungstate transport system, periplasmic component 3 1 Op 3 . - CDS 1686 - 2162 713 ## Dvul_2229 hypothetical protein 4 1 Op 4 . - CDS 2221 - 3396 1442 ## DVU0740 hypothetical protein 5 2 Op 1 . - CDS 3812 - 3934 61 ## 6 2 Op 2 . - CDS 3951 - 4631 695 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN 7 2 Op 3 . - CDS 4634 - 6187 2122 ## COG0007 Uroporphyrinogen-III methylase 8 3 Tu 1 . - CDS 6359 - 6688 317 ## DVU0758 hypothetical protein - Term 6719 - 6767 11.5 9 4 Tu 1 . - CDS 6790 - 7374 891 ## DvMF_2797 hypothetical protein - Prom 7404 - 7463 4.4 + Prom 7463 - 7522 3.4 10 5 Tu 1 . + CDS 7676 - 8950 1654 ## COG0172 Seryl-tRNA synthetase + Term 9036 - 9073 6.2 11 6 Tu 1 . + CDS 9116 - 10474 1699 ## COG1160 Predicted GTPases + Term 10477 - 10528 2.0 - Term 10596 - 10633 8.7 12 7 Tu 1 . - CDS 10730 - 11242 674 ## COG0703 Shikimate kinase - Prom 11439 - 11498 3.9 + Prom 11432 - 11491 2.6 13 8 Tu 1 . + CDS 11569 - 12312 1289 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 12333 - 12376 13.1 - Term 12451 - 12475 -0.3 14 9 Tu 1 . - CDS 12502 - 13017 658 ## COG0778 Nitroreductase + Prom 13339 - 13398 4.7 15 10 Op 1 15/0.000 + CDS 13503 - 14888 1500 ## COG0277 FAD/FMN-containing dehydrogenases 16 10 Op 2 . + CDS 14961 - 16256 1436 ## COG0247 Fe-S oxidoreductase + Term 16319 - 16371 14.9 17 11 Tu 1 . 
- CDS 16441 - 17808 894 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 17944 - 18003 3.8 18 12 Tu 1 . + CDS 17725 - 17883 67 ## + Term 17909 - 17943 -0.8 - Term 18174 - 18229 1.8 19 13 Op 1 . - CDS 18370 - 18990 542 ## Sterm_3311 hypothetical protein 20 13 Op 2 . - CDS 18983 - 19759 669 ## COG0457 FOG: TPR repeat - Term 19893 - 19936 9.1 21 14 Op 1 . - CDS 19978 - 20214 191 ## 22 14 Op 2 . - CDS 20224 - 20340 159 ## 23 14 Op 3 . - CDS 20414 - 22822 3482 ## COG1882 Pyruvate-formate lyase + Prom 23156 - 23215 8.7 24 15 Tu 1 . + CDS 23364 - 24254 735 ## COG1180 Pyruvate-formate lyase-activating enzyme 25 16 Op 1 8/0.000 - CDS 24445 - 25095 271 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase 26 16 Op 2 11/0.000 - CDS 25270 - 26013 724 ## COG0368 Cobalamin-5-phosphate synthase 27 16 Op 3 . - CDS 26010 - 27044 1042 ## COG2038 NaMN:DMB phosphoribosyltransferase 28 17 Tu 1 . - CDS 27475 - 28101 664 ## COG1492 Cobyric acid synthase Predicted protein(s) >gi|316923829|gb|ADCP01000061.1| GENE 1 295 - 975 243 226 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 4 226 1 219 245 98 32 5e-20 MTPLFHLENIVRNYPDPASSDPHAARTVLNLSELDIRAGEILGIRGHNGSGKSTLLRIIA LLEPPDSGTVLFEGRPAGTDDLHLRRQVTLLLQTPYLLSRSVASNVAYGLRVRGIRNASE LQNRIAASLLAVGLDPAVFLHRRRHELSGGECQRVALAARLALRPRVLLMDEPTASVDQQ SAERIALAARHAADSGSAVVVVSHDHEWITPLSDRLVTLREGKLVE >gi|316923829|gb|ADCP01000061.1| GENE 2 972 - 1667 952 231 aa, chain - ## HITS:1 COG:Cj1539c KEGG:ns NR:ns ## COG: Cj1539c COG4662 # Protein_GI_number: 15792847 # Func_class: H Coenzyme transport and metabolism # Function: ABC-type tungstate transport system, periplasmic component # Organism: Campylobacter jejuni # 1 227 12 238 239 167 41.0 2e-41 MSVITQAFSDAFQLLASGDEETFAAVQATLSTSGLAMLGCIIPGVPLGFLLGYTRFPGRN MLRILVDTALSFPTVVLGLLVYLLLARQGPFADFELLFTVPGIAIGLALLGLPIVIAHTC 
LAVEQADRLLAPTLRTLGAGPWQLLFSTVRELRFHLLTACITAFGRVVSEVGISMLVGGN IKWATRTITTAITLETGKGDYARSIALGIVLVALAFVLNMGLAALRRRMGP >gi|316923829|gb|ADCP01000061.1| GENE 3 1686 - 2162 713 158 aa, chain - ## HITS:1 COG:no KEGG:Dvul_2229 NR:ns ## KEGG: Dvul_2229 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 8 158 29 179 179 176 58.0 3e-43 MHAYLRSLALGLALLLPAVPAFAADAPAKTDAAIPDKPTRITERNTPVAVEYEGNDSIGS RLATRVKETLNSSNLFSLTDKDTPKIRILIATQPEFKDRPGVGSAYAVVWVFSQSENILR HYLTREVGVVTPDEINGLAAKLVEETDSLATRYGYLFQ >gi|316923829|gb|ADCP01000061.1| GENE 4 2221 - 3396 1442 391 aa, chain - ## HITS:1 COG:no KEGG:DVU0740 NR:ns ## KEGG: DVU0740 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 4 387 5 387 391 421 58.0 1e-116 MFPPKPPKGADPRMYFFLFLLTLGTTLGYQGWTLLYTNFAVESAHLSAADNGLVQSLREV PGLLGFLIIPLLLVMKEHRVAAVAALATGLGTALTGFFPSFVPVILTTLLMSFGFHFFEA VNQSLLLQYFDVRTTPVVMGKLRGLAAGGSLAASVFVFFCSDFLPYTMMFLIVGALCMFT GVWGLFLDPTNKELPIQSKKMVFRRKYWLFYLLTLFLGARRQIFIVFALFLLVEHFRFSV NTVSVLFMINYAINWFLNPLIGRTINRIGERKLLSIEYSTAILVFTGYATTGSAWVAGVL YVIDYIVFNFSIALRTFFQKIAEPQDIAPTMAVAQTINHIAAIFVPALGGWLWVEFGYQI PFFIGAALSGCSLLLVQIIDREIRLHAPAKA >gi|316923829|gb|ADCP01000061.1| GENE 5 3812 - 3934 61 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVLRGRGAFLKKGPSPPKLPERLAPPAPKAFPCGLQTDSG >gi|316923829|gb|ADCP01000061.1| GENE 6 3951 - 4631 695 226 aa, chain - ## HITS:1 COG:aq_857 KEGG:ns NR:ns ## COG: aq_857 COG0299 # Protein_GI_number: 15606207 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Aquifex aeolicus # 3 194 2 193 216 188 45.0 7e-48 MSLKLAVLASGSGTNFQAMVDAVRRGALDADIRLVICNRPGAKVIERAKAAGIVCAVMDH KLWPSREAYDLAVADAILKSGADTVALAGYMRMLTAGFLNAFPHRVVNVHPALLPSFPGI HGAADAQAWGVKITGCTVHLVDEIMDHGEVIIQAAVPAIAGEPLDDLQSRIHAQEHRIYP QALQWLAEDRIKMDEDGRSLHLLPGSRPLAAPAPGVLVSPPLEEGF >gi|316923829|gb|ADCP01000061.1| GENE 7 4634 - 6187 2122 517 aa, chain - ## HITS:1 
COG:BS_ylnD KEGG:ns NR:ns ## COG: BS_ylnD COG0007 # Protein_GI_number: 16078625 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III methylase # Organism: Bacillus subtilis # 11 256 3 247 257 254 50.0 3e-67 MRQENNEDRMKVYLIGAGPGDPGLLTIKGKDILEKADVVVYDYLANDTLLGYARPDAERI YVGKVAGNHALPQDGINKLIIEKAKEGKIVARLKGGDPYIFGRGGEEAEELLDAGVPFEE VPGISSTIAGPAYAGIPLTHRSFSSSVTLITGHENPDKPGSVHNWKALAASANTLVFVMG MKNLPDIARNLIEAGLSPDTPAALVHWGTTAKHRSLAATLGTLHEEGVRQGFTNPSVIIV GKVVTLRDRLNWFEQKPLLGRSVVVTRAREQASGLAAQLADLGAEVIQFPTIDIKPLEDY SSVDAAVRNLGAYDWLIFTSANGVKCFWERLEAQGLDARALYGLQVAAIGPATAQAVRTH GIAPDFVPEAYIAESVAEGLIGLGMDGKKVLLPRAREAREVLPEELRKAGAQVDVLPVYE TVPAAAHRDEVLQRLEAGTLDAVTFGSSSTVDNFFAQIPADTIRNQPEEKRVKFASIGPV TTKTLAKYGFACDIQPEDFTISALVKALTAHYEKQGA >gi|316923829|gb|ADCP01000061.1| GENE 8 6359 - 6688 317 109 aa, chain - ## HITS:1 COG:no KEGG:DVU0758 NR:ns ## KEGG: DVU0758 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 7 109 51 149 149 102 47.0 6e-21 MREEPCKTYGHLFPADQHIADAVGAVLSDWNIPECTELEGDIFRISFEGVFFPLDDVLDA LRPLLCAESSGKIDLIDMEAWTLTRAAFSGTEITVKTVGLNHVLAYSGH >gi|316923829|gb|ADCP01000061.1| GENE 9 6790 - 7374 891 194 aa, chain - ## HITS:1 COG:no KEGG:DvMF_2797 NR:ns ## KEGG: DvMF_2797 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 26 193 27 194 196 175 54.0 7e-43 MSILIRRVVTLALTLVTLTLLAACGPSNNVRLIYNTNTSAVLPLPGSPSVAVVMFNDERT RQYIGERKDGSVFTASSTIADWFSRSLADELGRQGVQVSFASTLEQARAANPTYIVTGTI TDVWLQEQSATRVQCTIKAKVALSNKKGVLYSENLSSTQERQFIPSASAIESLLSDTLKD LLVPAAKKIQSQIH >gi|316923829|gb|ADCP01000061.1| GENE 10 7676 - 8950 1654 424 aa, chain + ## HITS:1 COG:FN0110 KEGG:ns NR:ns ## COG: FN0110 COG0172 # Protein_GI_number: 19703458 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 421 4 423 424 478 55.0 1e-135 MLDLKLLQKNPEVVAAALAKRHSPLNMRTFEELDNKRRELLAQVEALKAERNRASAEVAR 
LKRNGEDAEAFIASLSDLSSRISALDSATEAVKADLADWMLTVPNIPDASVPDGVDENDN VEVHRWGTPREFDFEVKNHWDLSNDNGGLDFERGAKLAGSRFTVSWGWAARLERALVCFY LDFHHEQGRTEILPPLMVNRKTMQGTGQLPKFEEDLFKVADWEYYLIPTAEVPVTNLHAD EILDEEALPKRYCSQTPCFRSEAGSAGKDTRGYIRQHQFTKVEMVYITHPDDSFKYLEEM RRSAETLLERLELPYRTITLCAGDMGFSACKTYDVEVWLPGQRTYREISSCSNCVDFQAR RANIRFRPKGGKPEWVHTLNGSGLPTGRSLVAIMENYQQKDGSIVVPKVLVPYMGGQEVI EPTK >gi|316923829|gb|ADCP01000061.1| GENE 11 9116 - 10474 1699 452 aa, chain + ## HITS:1 COG:FN0170 KEGG:ns NR:ns ## COG: FN0170 COG1160 # Protein_GI_number: 19703515 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 1 445 1 436 440 307 39.0 2e-83 MLPKIALLGRPNVGKSTLFNRLIRSNRAITHDRPGVTRDRMEGVVRSRSGRSFTLIDTGG VTLDSHHSVADGPEGLRGFEAEILQQTQAAMDEASVFCLVVDGRDGLMPFDEHLASFVRR AGKPTLLVVNKVDGMEHEDTMLAEFHALGFPILALSAEHGHNIRALEEEMVDLLPVEDPD AEAKPENVLRLAMLGRPNAGKSSLVNAMIGEDRMIVSDVAGTTRDSVDIPFVSDGRACEF VDTAGVRRRTRITDTVERFSVNSSLKTTTKADVTLYVIDALEGVTAQDKRLIDLLDERKT PFMILVNKIDLVGRKEQAALEKNFKEVLQFCPHVPVLMVSAMKQVGLGRIVPLAAQIREE CNTRIGTGALNRAMEEVLTKHQPPVVKRSRPKFFYLTQAEVAPPTFVFFVSDADRVSETY ARYLDRSLRKIFRIEHAPIRVRLRSSHKKKSE >gi|316923829|gb|ADCP01000061.1| GENE 12 10730 - 11242 674 170 aa, chain - ## HITS:1 COG:YPO3215 KEGG:ns NR:ns ## COG: YPO3215 COG0703 # Protein_GI_number: 16123374 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Yersinia pestis # 1 168 1 168 174 179 56.0 3e-45 MGHTIYLVGARAAGKTTFGGALARQLGCNYVDTDIHLRETTGKTVADIVAREGWDGFRKR ESAVLRAVTAPGTVIATGGGMVLAEENRRFMRENGIVLYLSAPAEVLASRLQANPNAAQR PTLTGKSIAEEVAEVLAAREPLYRETATHILNAAATPKELLAEALAILKP >gi|316923829|gb|ADCP01000061.1| GENE 13 11569 - 12312 1289 247 aa, chain + ## HITS:1 COG:sll1270_1 KEGG:ns NR:ns ## COG: sll1270_1 COG0834 # Protein_GI_number: 16330176 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: 
Synechocystis # 13 245 32 266 275 127 34.0 2e-29 MFRTLAVSLLSLLLCASVAFAEKPLVVASDPTFPPNEMLNKDKEIVGFSIDYIKAVGKEA GFDVQVKNIAWDGIFAALASNQVDVIAASVSITDKRKKAMLFTDPYYELHQAVVLPLGKE IKDLEELAGKRVGGQIGTTAMVQTIPASKIKMIVKTYDEVGLAFEDLAKGNLDAVMCDDP VAKYYANTKDEYRDKFHIGLVTGEPEFYGFALRKNDKELAQKLNAGIKAVQEKGIEKQIL EKWFGTN >gi|316923829|gb|ADCP01000061.1| GENE 14 12502 - 13017 658 171 aa, chain - ## HITS:1 COG:CAC3555 KEGG:ns NR:ns ## COG: CAC3555 COG0778 # Protein_GI_number: 15896791 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 1 171 1 172 174 150 44.0 1e-36 MLELLKHRRSIRTFTDEPVSQENIDSLLKAALLAPSSMGKKPVECIVVRNKETIARLKTY KKHGTTPLGTAPLAIVVIADGQKSDVWVEDASIVSILIQLEAEKLGLGSTWIQLRRREGD SGPSEEAFRQELGIPEHYGVLSVVAIGHKNEQKKPYTDADLDFTKVHYETF >gi|316923829|gb|ADCP01000061.1| GENE 15 13503 - 14888 1500 461 aa, chain + ## HITS:1 COG:BS_ysfC KEGG:ns NR:ns ## COG: BS_ysfC COG0277 # Protein_GI_number: 16079920 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Bacillus subtilis # 1 460 1 459 470 409 45.0 1e-114 MLSLSFIKEIEKIIGRENVLTGEADRQNYAYDAAVLPPVVPGLVVRPTRTDQLGPLIQKC YEEGIPITIRGSGTNLSGGTIPDSTDAVIILTNSLDRILEINEEEMYAVVEPGVITCDLA AAVAARGLFYPPDPGSMSVSTMGGNIAENSGGLRGLKYGTTKKYVMGLEVLTETGEIMKT GACGRGCEYGYNLTELLVASEGTLALTSKAVLKLVTPPKASKAMMAIFPEVKRASQAVAG IIAAHVVPCTLEFMDNATINYVEDYVNIGLPRDAAAILLIEVDGHPAQVADEAVSVEKVL KDSGALEVVVAKDAAEKTRIWEARRVAIPALARCRPTLMLEDATVPRSKIPAMLEALDKI AARHHVTIGTFGHAGDGNLHPSILCDKRDKEEFSRVERAVDDLFNVALELGGTLSGEHGI GTSKKKWLELETSKGTLEYMRRMRSAFDPKKLLNASKIVSL >gi|316923829|gb|ADCP01000061.1| GENE 16 14961 - 16256 1436 431 aa, chain + ## HITS:1 COG:DR1730 KEGG:ns NR:ns ## COG: DR1730 COG0247 # Protein_GI_number: 15806733 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Deinococcus radiodurans # 20 424 23 416 425 180 30.0 5e-45 MDELTQLANALMALDDKLAACMKCGFCQAFCPMYMTTRIEGDLTRGKIALVENLAHRIIE 
DPEAVNEKLSRCLLCGSCQANCPSGVKTTDIFLEARAIVATYLGLSAIKKAAFRMLLPNP RLFGTLLRLSAPFQNLILKDEAKPQNTVCAPLLKPMLGDRHMPKLAAKPLHSTVGNVNTP MGKSRIKVAFFPGCLGDKIYTNVSEACLKVFDYHEIGVFLSDNYACCGMPALASGDRQGF EKMVMHNVDILKDQNYDYIVTPCSSCTVAIREFWAQFSDNLPPRYRDAINALAPKAIDIN ALLVDVLHVSPKTQAKGQTKVTYHESCHLQKSLGVSKQPRDLIRMNPGYNLVEMAEANRC CGCGGSFTLTHYDLSLKMGQRKRDNVIASGAEVVATGCPACMMQLSDMLARNNDPVQVKH TIEIYAESLPL >gi|316923829|gb|ADCP01000061.1| GENE 17 16441 - 17808 894 455 aa, chain - ## HITS:1 COG:BH3899 KEGG:ns NR:ns ## COG: BH3899 COG3829 # Protein_GI_number: 15616461 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 26 444 157 558 559 263 40.0 4e-70 MSATGAFLYCNQAFLSMFNLPSDIRGKHVTDFFLTAEQGVMTTIRTRKPTFCSSLTTTDA QGISFRYPLLDDKGKLYGVIIESISPSIGPERMQKFLDTIHDLEEKSNYFEQKVRKKPGS LYTFESIIGSSNAIMALKKKGKIFARGNEPILVLGESGTGKELVAQALHSASSRADKPFV TVNCAALPHDLMEAELFGYEGGAFTGAKSSGIKGKFELANKGTIFLDEIGELPLSMQAKL LRVLESGEIQKLAHRGQLHSDFRLIAATNKDLKESVERGSFREDLYHRLNILELTIPPLR DRVGDIPLLARHFIELHAGAKRGREIQISNELYRAFGLYPWHGNIRELKNVITYALFNLE EDEKVLSTRHLPERFFRELLTEQPEIKEKELCPDNLNLDEVGAQAERKALLLALSSTKYN KTLTARVLGISRNKLYKKMRDFNLLSPSGKNEGSI >gi|316923829|gb|ADCP01000061.1| GENE 18 17725 - 17883 67 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFAPDIRRQIKHAQKRLVAIEERTRCRHNGNSCVQKLKNIGVRAFGKRLQSG >gi|316923829|gb|ADCP01000061.1| GENE 19 18370 - 18990 542 206 aa, chain - ## HITS:1 COG:no KEGG:Sterm_3311 NR:ns ## KEGG: Sterm_3311 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 5 200 8 203 205 152 42.0 6e-36 MNKDELIDAIVQIEWPMFAGVNNEGGKAACQMDLATFRIMRISQYSAWGEELLESCLADL RAAQNQGRNLMTEKYARMMKTTFPDEYPAIEKSLPPLDPAAARQIEDIVACHVQWKTALD QKYPHLGDRSRPVRTQEDRTGLPSVETYTRAELQTCSPRTISLYHAATMKRIERGENEAE ENLLNQVRQYGFASLEDAEQYFSTHE >gi|316923829|gb|ADCP01000061.1| GENE 20 18983 - 19759 669 258 aa, chain - ## HITS:1 COG:all3838 KEGG:ns NR:ns ## COG: all3838 COG0457 # 
Protein_GI_number: 17231330 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 2 254 453 709 710 63 26.0 3e-10 MSQGNPEEAGKIYEELCRMEQLTAAERGVALFGLGTCLFLHESYAAASLQLRESWELLIA SLGMEDPLTTRTMVLLSRTLIALGDLESGMEIGRGALKNLVKLYGQDADQAATAAFFLSS GAYQFGRLAEAEELTLQALQAWEKIYGNVSLQAAACLDALGKLRNVCGEKREGTDFHRRA ADIKMQVLGEHETTAASLGHIGMAEAELGNWKEAETLLASSLEQFDRLGVGKNAEGIAAF REKLNECRNMLAKEPTDE >gi|316923829|gb|ADCP01000061.1| GENE 21 19978 - 20214 191 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MELRVSLDEREDPDEEQEIEVEGIPFVVNEDVIDSYGLKYTIAVDEHNMPAVSSELQKAS PAPESDGACSEEKACSLS >gi|316923829|gb|ADCP01000061.1| GENE 22 20224 - 20340 159 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLEITIAPEVIAKFKELLVEEDNEDAVFRIRETKVGGG >gi|316923829|gb|ADCP01000061.1| GENE 23 20414 - 22822 3482 802 aa, chain - ## HITS:1 COG:SPy2049 KEGG:ns NR:ns ## COG: SPy2049 COG1882 # Protein_GI_number: 15675819 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Streptococcus pyogenes M1 GAS # 10 799 12 803 805 335 31.0 2e-91 MGYAIDWSDLEQRIQDQKAALLDTSQNMDPERLKFLVETYDECKGESVFITRAKLFEKVL NNKKIYLDGNPIVGSLAGGRAMIYAYPEWQCNWVKDDLDSEKLALSSLGEVRFSDETKEI LKKVYKTWKGKTTYDRANKLYKELYGGNAELLVKCGWIYPVNDNSTGSGVADYPAMLTKG IRGILEEVEAAYRALPKHATNWTKIEFYRACKIALNAVIAYAHRYADLAEETAKAETNPK KQAELMEIAEVCRRVPEFPARNFREALQSFWFTHCCIQIEQCGCGHSLGRYGQYMYPFYK KDLDEGMLTKEQVLTLLKCQWVKHLEIAVYQGDAYAKAFSGHTGQTISLGGYTADGDDAS NDLEELLMDTQIAMNNIQPTLALFYTPKMKPSYLEKAAEVVRSGSGQPQFMNMNAAVARS LVRFASRGITLDEARTLPVIFGCVGTGIQGKGSYVTFEGQPNLAKLVEFAMYDGYDPHTR KQVFPNVKPAEECATFEELYDALLRHMDHAYDAQRKISDLGNSTREQIVPNIFRSCLLDG CIESGLCEEAGGPKYSQSLCITSTGIDAANSLYAIKHLIYDTKQLTWEQLKKALAANFEG YEDIQKLCFGAPKHGNDIEDVDQLTRRFFRDVERIYRSHGPDYFGYEAHMDPFSLSYHNY FAPMTGALPNGRQKGVALTDASVSAMPGTDVNGSTALIKSAAQAIDTVRNNCNHMNMKFL PSALEGPSGTRMLLNLIKTYFDLGGGHIQFNCVSSETLCDAQEHPQNYKNLVVRVAGFSS YFTRLYKGVQDEIIKRTEYQNV >gi|316923829|gb|ADCP01000061.1| GENE 24 23364 - 24254 735 296 aa, chain + ## HITS:1 
COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 1 261 4 262 302 213 42.0 4e-55 MDGIVYNIQRMSTKDGPGLRTTVFLKGCPLHCPWCSNPESQSFKPQLMFFSNLCVGCGAC ERICPNGAVVKIGDVYNRDRSICTDCGLCVESCPSKAREMSGKRMTVEEVMRVVDSDSLF YDNSGGGVTFGGGEPTAAGAFLLGLLDASIRRGYHICLDTCGVCEPERFRKIMNQVELFL FDCKHMDPDEHKRLTGLDNVIILQNLHALFEAKKALHIRVPLMPGINDTEKNIAQMAAFL HKHGHNEIDVLPCHTFGHSKYAALNLADPVMVPYQPEELAAALERFAKYDLKVTIV >gi|316923829|gb|ADCP01000061.1| GENE 25 24445 - 25095 271 216 aa, chain - ## HITS:1 COG:STM2018 KEGG:ns NR:ns ## COG: STM2018 COG2087 # Protein_GI_number: 16765348 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Salmonella typhimurium LT2 # 6 209 2 175 181 108 34.0 7e-24 MEKTELTLYLGGTRSGKSARAEAQALRTGGPVLYVATAEARPDDPSMHERIRRHRDRRPP HWSTLECPLHLAKRISLMFPAFFLEAAASVSPEQPPESGVGSSPKDAGEAAQERPTILID CVTLWVSNILFALPEPEDLTAFETAVRTEVNALLALMERSNCRWILVSGETGLGGIASDR ISRNYCDGLGLANQLIAASARKVFLVVAGCSLVLAE >gi|316923829|gb|ADCP01000061.1| GENE 26 25270 - 26013 724 247 aa, chain - ## HITS:1 COG:cobS KEGG:ns NR:ns ## COG: cobS COG0368 # Protein_GI_number: 16129933 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin-5-phosphate synthase # Organism: Escherichia coli K12 # 3 243 5 245 247 97 35.0 3e-20 MKFRAALGFFTRLPVGSAPLPPTFQGVIVWLPAIGLIVGLLAALAVALATLLLPAQLGGV IGCLVWVFITGGLHLDGVADCGDGLLVEAPPERRLEIMKDSRLGTFGGAALFFVLALKAA ALCTLAASFDGSWSGLLTLAGACCMAGALARSMVFPAMHIPSARPGGLGAALHNGVTRRH DMLALGIGLGICALNGGRGFTALIAALLVAWLLLSAAQKRLGGVTGDVFGCLIELTESAV LVACCLR >gi|316923829|gb|ADCP01000061.1| GENE 27 26010 - 27044 1042 344 aa, chain - ## HITS:1 COG:CAC1372 KEGG:ns NR:ns ## COG: CAC1372 COG2038 # Protein_GI_number: 15894651 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB 
phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 17 342 21 345 352 273 44.0 5e-73 MNNLTIPAYSEQAAQAVKERLDSLAKVPGSLGKLEELAIRLAGITGRTCPSFPQKSVVLF AADHDITLKGVSATGQEVTEIQVRNFLKGGGTINAFCRNAGASLTVVDVGIKGDVGGAEG LVRRKVVHAARDFSEGPAMTREEALACVQVGIDMAHEEAKKGVTLLSAGEMGIGNTSPSS AVAAVLTGAPVEGVTGIGSGIPSERVRHKAELIRQGIERNRPNPADAVDVLAKVGGPELG AMAGLMLGGASLRIPVVVDGFIAGAAAAIAIGIRPGVRDMLVGSHSSFEPGHRILMNHLN IPTYLDLGLRLGEGTGAVLLYPLIDASVRILTEMRTLKELDINR >gi|316923829|gb|ADCP01000061.1| GENE 28 27475 - 28101 664 208 aa, chain - ## HITS:1 COG:CAC1374 KEGG:ns NR:ns ## COG: CAC1374 COG1492 # Protein_GI_number: 15894653 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Clostridium acetobutylicum # 1 208 284 488 491 147 36.0 2e-35 GDPDVVMLPGSKSVVPDLDDLRRSGLADNILGHAERGKWIFGICGGLQILGRAILDPHGI ESAAPEVPGLGLMDLRSTFAADKTLVRVARAETPLGVPSGGYEIHHGLTDHGPSALPLFL RADRTYPSEAERICGYVSGRRWATYLHGVFDDDTFRRTWIDHVRTDLGLTPQRRCLASYD LEKALDRLADVVRANSDMETIYRSMGLK Prediction of potential genes in microbial genomes Time: Fri May 13 02:41:33 2011 Seq name: gi|316923826|gb|ADCP01000062.1| Bilophila wadsworthia 3_1_6 cont1.62, whole genome shotgun sequence Length of sequence - 3901 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 2 - 2135 1699 ## COG1492 Cobyric acid synthase 2 1 Op 2 . 
- CDS 2132 - 3898 1382 ## COG1797 Cobyrinic acid a,c-diamide synthase Predicted protein(s) >gi|316923826|gb|ADCP01000062.1| GENE 1 2 - 2135 1699 711 aa, chain - ## HITS:1 COG:STM2019 KEGG:ns NR:ns ## COG: STM2019 COG1492 # Protein_GI_number: 16765349 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Salmonella typhimurium LT2 # 423 710 2 286 506 282 50.0 2e-75 MKLDAHGGDLLAMASVSKRDPASLLDFSVNVRPEGAPEFLRAALVRANTGLAAYPSPNAE EALAAAARRYGLPAGRFAFGNGSNELIHALARVLKRRGAPCVSIVEPAFSEYALACRLAG LESSPLWGGIGSGADGQPSGDLLAGLAETPSGSAVFLANPCNPSGLFRTPGECLSLLAAR PDLLWIIDEAFIEYAGPEADVSILRRMPSNGVVLRSLTKFHALPGVRIGYLAASAELAEA VRADLPAWNVNVFALSAAVAALGDTSDFAERARAENAERRADLAAVLSGLPGVEVFPSSA NYMLFRWRGAPKDLYGILLRRFGIAVRDCSNYCGLDDGTWFRAAVRFSEEHHRLAGALRE VMEEGDVPKKSAADTPLLAYGGMKSREEDEGDLSLKKISPSPADVPISGSSSVFPAGRSS RRTPALMLQGTSSNAGKSILAAAYCRIFRQDGYNVAPFKAQNMSLNSGVTANGDEMSRAQ IVQAQAARADPDARMNPILLKPHSDTGSQVVILGQPLGHMDVLEYFGKKRELWSAVTDSY DSLAAECDIVVLEGAGSPGEINLKSHDLVNMRMADYARASVLLVGDIDRGGLYASFLGTW MSFTDAERRLLTGYIVNRFRGDASLLGPAHDYMLAHTGVPVLGTIPYIRDLNIPEEDMAG FSWGHTDCGEKKAGTLDIAVVMLRHVSNYTDFAPLAAEPDVRLRPVRRAEE >gi|316923826|gb|ADCP01000062.1| GENE 2 2132 - 3898 1382 588 aa, chain - ## HITS:1 COG:sll1501 KEGG:ns NR:ns ## COG: sll1501 COG1797 # Protein_GI_number: 16329614 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Synechocystis # 1 356 88 438 482 234 39.0 3e-61 MGLFDSRDPGDPAGGTADCARALGIPVVLVFNARGMACSAAALVAGFRLHASRMGVQLAG VIANNVGSPRHADILRRALESERLPPLLGALPRNEAWRIPERQLGLLPSEEAGTTEAWLD ALADVAESSVDMDRLLSLTEARRPEARAVLPPRGIRPRRMGIAKDRAFCFYYEENERALA ARGWELLPFSPLEDTALPPGIDALYLGGGYPEVFARELSGNAAMREAIRAFAEQGGEIYA ECGGYMYLCTRLEASEGKGGKGGRTASWPMCGVIDATARMGGRIQSLGYREVTMLGDAPF GLGGDVFRGHEFHWSDIELHRGYAPLYAVRTASGHADSGIAAGNVRASYVHLYWGNTGEA NYAGRPAPSDFTACRPEHRAARPGEAKATCENIGQVILLNGPSSAGKTTLAKVLRDRLYA MHGICSLMLSIDQLLRSATGGHESVLDGLERTGLPFIETFHAGVAAAAKAGAWTIVDHVI 
GEDPRWIEDLLGRLEAIPLLSVQVLCDDEELRKRESGRSDRSPDWPHAQRQARHIHLPLP NQMVVDTTRTSPEDCAACILAALSAEKNGIPIRPGGGAPISTTERGSL Prediction of potential genes in microbial genomes Time: Fri May 13 02:41:53 2011 Seq name: gi|316923797|gb|ADCP01000063.1| Bilophila wadsworthia 3_1_6 cont1.63, whole genome shotgun sequence Length of sequence - 36174 bp Number of predicted genes - 30, with homology - 27 Number of transcription units - 19, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 467 103 ## COG1797 Cobyrinic acid a,c-diamide synthase 2 2 Op 1 5/0.000 - CDS 746 - 3346 2501 ## COG2875 Precorrin-4 methylase 3 2 Op 2 6/0.000 - CDS 3343 - 4683 1069 ## COG2242 Precorrin-6B methylase 2 4 2 Op 3 5/0.000 - CDS 4680 - 5939 971 ## COG1903 Cobalamin biosynthesis protein CbiD 5 2 Op 4 . - CDS 6159 - 6800 438 ## COG2082 Precorrin isomerase 6 2 Op 5 . - CDS 6896 - 7831 689 ## COG3366 Uncharacterized protein conserved in archaea 7 3 Tu 1 . - CDS 8065 - 8766 746 ## COG2243 Precorrin-2 methylase - Prom 8904 - 8963 7.7 + Prom 10136 - 10195 4.0 8 4 Tu 1 . + CDS 10364 - 11683 1025 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 9 5 Tu 1 . - CDS 11707 - 11853 64 ## - Prom 11917 - 11976 2.5 + Prom 12985 - 13044 5.8 10 6 Op 1 . + CDS 13289 - 13939 948 ## COG5012 Predicted cobalamin binding protein 11 6 Op 2 . + CDS 13998 - 15413 1790 ## MA2972 monomethylamine methyltransferase + Term 15459 - 15523 6.2 + Prom 15576 - 15635 2.9 12 7 Op 1 1/0.000 + CDS 15662 - 16723 590 ## COG0407 Uroporphyrinogen-III decarboxylase 13 7 Op 2 . + CDS 16891 - 17826 867 ## COG0407 Uroporphyrinogen-III decarboxylase 14 8 Op 1 . + CDS 17932 - 18240 393 ## Hore_15330 methylamine-specific methylcobalamin coenzyme M methyltransferase 15 8 Op 2 1/0.000 + CDS 18237 - 19202 710 ## COG0523 Putative GTPases (G3E family) 16 8 Op 3 . 
+ CDS 19175 - 21082 1576 ## COG3894 Uncharacterized metal-binding protein + Term 21108 - 21163 -0.9 - Term 21060 - 21092 1.2 17 9 Tu 1 . - CDS 21118 - 21366 62 ## 18 10 Tu 1 . + CDS 21259 - 21978 456 ## Ccel_3200 hypothetical protein 19 11 Tu 1 . + CDS 22131 - 22583 40 ## COG3293 Transposase and inactivated derivatives + Term 22714 - 22748 2.0 + Prom 22736 - 22795 3.2 20 12 Tu 1 . + CDS 22881 - 23549 534 ## Dhaf_3800 hypothetical protein - Term 24237 - 24271 -0.8 21 13 Tu 1 . - CDS 24474 - 25043 400 ## PROTEIN SUPPORTED gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) - Prom 25232 - 25291 4.7 22 14 Tu 1 . - CDS 26266 - 26826 426 ## COG1192 ATPases involved in chromosome partitioning - Prom 27036 - 27095 6.4 23 15 Tu 1 . + CDS 27583 - 27750 182 ## + Term 27996 - 28042 -0.6 24 16 Tu 1 . - CDS 28095 - 28982 456 ## COG0659 Sulfate permease and related transporters (MFS superfamily) 25 17 Tu 1 . + CDS 29073 - 29669 159 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains - TRNA 30350 - 30426 59.7 # Glu TTC 0 0 - Term 30795 - 30840 13.4 26 18 Op 1 . - CDS 30874 - 32532 2380 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 27 18 Op 2 . - CDS 32604 - 33200 727 ## IL0782 hypothetical protein 28 18 Op 3 . - CDS 33290 - 34108 1034 ## Ddes_0053 hypothetical protein - Prom 34171 - 34230 4.7 + Prom 34080 - 34139 2.6 29 19 Op 1 11/0.000 + CDS 34381 - 35226 512 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase 30 19 Op 2 . 
+ CDS 35085 - 35831 387 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase + Term 35958 - 35994 -0.4 Predicted protein(s) >gi|316923797|gb|ADCP01000063.1| GENE 1 2 - 467 103 155 aa, chain - ## HITS:1 COG:MK1573 KEGG:ns NR:ns ## COG: MK1573 COG1797 # Protein_GI_number: 20095009 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Methanopyrus kandleri AV19 # 7 113 3 109 445 100 49.0 8e-22 MNTTFHAFCLAAPRSGEGKTTTGIALMRALARRGLKVQSFKCGPDYIDPTFHAQATGRPA CNLDTWMMGREGVRALWDNRAHDADACVCEGVMGLFDSRDPGDPAGGTADCAAPSASPSC SCSTPGAWRAPPPHSLRGSGFTLPGWASSLPESLP >gi|316923797|gb|ADCP01000063.1| GENE 2 746 - 3346 2501 866 aa, chain - ## HITS:1 COG:MJ1578 KEGG:ns NR:ns ## COG: MJ1578 COG2875 # Protein_GI_number: 15669774 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-4 methylase # Organism: Methanococcus jannaschii # 5 251 7 253 259 247 49.0 7e-65 MKALVEFVGAGPGAEDLITVRGLRALEQADLVVYAGSLVNPAHLKACKADCTCLDSASMN LGEQIEAMSDAALAGKRVVRLHTGDPAMYGAINEQIRGLAQKGVAASIIPGVSSVFAAAA ALGCELTSPDVSQSVVLTRTPGRTPMPQGEDAAAFARTGAMLVFFLSTGKVGELMRHLME QGGLAEDTPAAIVYRASWPDERILRGTVGDIARQAEEAGLGRQALVIVGRALGANGTASR LYDADFSHGYRNRLVSEDFDGRCALYAFTDKGLTRAREIAAGLGLPTVLHSTHPSGAEGI VHIPAQDFDSRLAANWAQFDAHIFIGATGIPFRKAAPLLRGKNIDPAVLACPESGSHVIA LTSGHFGGTNRLARRIARITGGQAVIGSPADVNGLPAFDEAAAQEHARILNPEAVRALNA ALLDGSPIAFCGTRAVFERHFASTGQVAFFENPQDVTCGHAVLWDSENTLPEGMLHLDVS SRAFVLGVGCRRGVKPQELRLVAERYISEFGLNAENIAGIATCTVKEDEPAILGLGEAWQ VPVASHSAEELDAVPVSAPSEKVREKVGTASVCEAACLLSAGYGSIPQPTLYAPKSAFGD VTLALARLPHLAVPQNGQGEIVVAGLGSGAPGHITPDVDTALRRCDTVAGYSHYVDFIRD RIAGKPVIQNGMKGEVERCLSALEAALAGQNVCMVCSGDPGILAMAGLLYELRTREPRFR DIPIRVLPGITAATIAAASLGAPLQNGFSLVSLSDLLVPADEVRRNIRSVAQSLLPVALY NPAGRKRRALLDETLAVFREHRGRDVLCAYVKNAGREQETKWVGKLSEFPAAEVDMSTLI IIGGPRTRLDSGVLYEPRGYVEKYME >gi|316923797|gb|ADCP01000063.1| GENE 3 3343 - 4683 1069 446 aa, chain - ## HITS:1 COG:BMEI0716_2 KEGG:ns NR:ns ## COG: BMEI0716_2 COG2242 # Protein_GI_number: 17986999 # 
Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6B methylase 2 # Organism: Brucella melitensis # 245 417 3 159 183 108 41.0 2e-23 MTNGRTRDAAGHIDIVSCGTGFPSDTGVLTLLEEADAVYGSRSLMALCPVPVKSTRTIGS CVRKDAEEALALAREGKRVVVLASGDALYHGLGGTLAALRKPGDDLAYHPGITAFQALFC RLGLPWQDARLFCVHSGEGLPSRGIAEAPLSVTYAGSRYPAHAIARAVLDTHPASARRAA IIAERIGSDDERILSGTLGELADAACGPTSILVIFPAAHADCHAQDAAATTRGSASIQAP ILALGLPEEAFERENNLITASDVRAVILSRLRLPAWGTLWDVGAGSGSVGLEAAALRPTL SIHGIERNPERCAMIERNRLSMGIANYTLHPGNALSVIHPSCPEPRAPSAYGGILPDPDR VFVGGGGKDLPALLTACRDRLRPNGLIVVSAVTLESFGTLLSWAPDHRASLCRIDIANER PLAKRSHHFAPQNTIYVFTFHKEAIS >gi|316923797|gb|ADCP01000063.1| GENE 4 4680 - 5939 971 419 aa, chain - ## HITS:1 COG:MJ0022 KEGG:ns NR:ns ## COG: MJ0022 COG1903 # Protein_GI_number: 15668193 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiD # Organism: Methanococcus jannaschii # 5 282 6 272 362 137 32.0 3e-32 MKKDRKNLRWGYSTGACAAAAAKAAWIRLTRGGAPQSIWVHFLDGRERELPLLQSGAGHM AAIRKNGGDDPDCTHGATLYADIRACLSGEIRAEDYVLQIGNGTLILRGAEGIGLCNRRG LDCELGRWAINTGPRNMISENLRRAGFSSGCWLLEIGVENGEELAKHTLNSHLGIMGGIS LLGTTGLVRPYSHEAYIHTVRICVKSHFLSGGSTMVFCTGGRTKSGAELHLPQLPATAFV SIGDFIAESLAAACRYGMREIAVACMPGKLCKYAAGFENTHAHKVSQDMDLLHAEVRRTL PAEASLHGALKYSASVREALLSIPPDARDGLLRRLARTALRQFSRRCTGNPALRLLAFDF EGEFLFEEARGRPEDASPRQSTAIAAEPDEHMNGDQPDPAIADPGEIVGPSYFIQGHSI >gi|316923797|gb|ADCP01000063.1| GENE 5 6159 - 6800 438 213 aa, chain - ## HITS:1 COG:FN0970 KEGG:ns NR:ns ## COG: FN0970 COG2082 # Protein_GI_number: 19704305 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin isomerase # Organism: Fusobacterium nucleatum # 11 210 12 208 219 143 39.0 2e-34 MEVRWNLSGEEIERESFRTIEAECDLHIKLPAPEWRVVRRLIHTTADPRIADTLVFRHDA VASGLKALRSGAPIFCDSKMIRSGLSLARLRRLNPGYGPERLHCHIDDPDVLARAKAEGR TRALCSAEKARPLLDGAIVLIGNAPLALARIARYVLEESARPALVVGMPVGFVNVVESKA LLARCAVPQIALEGRRGGSALAVATLHAVMESA >gi|316923797|gb|ADCP01000063.1| GENE 6 6896 - 7831 689 311 aa, chain - ## HITS:1 COG:MA1324 KEGG:ns NR:ns ## 
COG: MA1324 COG3366 # Protein_GI_number: 20090185 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in archaea # Organism: Methanosarcina acetivorans str.C2A # 22 308 29 311 311 80 26.0 5e-15 MKLETFGRLFAIVGIAAFCGGLMEARQWHMVLALLMGRLTRMARLPEIVGLAMPTALCSN AAANSILVSSHAEGQIRTSALIAGGMMNSYLAYISHSIRVMFPVIGAIGLPGVLYFGMQF TGGFLVLLCILLWNRWYVSRHGESPCATESGSTSQPLAWPHAISKAAIRSFSLLFRMACM TVPLMLGIEWLLKNGVLDFWETALPHAVTELFPTELLSAVAAQFGGLVQSSAVAANLRAE GLIDNSQILLAMLVGSALGNPFRALRRNLPSALAIFPVPIALSIVLGMQLSRFLVTLAGI AGVIWIMLNTP >gi|316923797|gb|ADCP01000063.1| GENE 7 8065 - 8766 746 233 aa, chain - ## HITS:1 COG:STM2024 KEGG:ns NR:ns ## COG: STM2024 COG2243 # Protein_GI_number: 16765354 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-2 methylase # Organism: Salmonella typhimurium LT2 # 4 233 3 232 237 108 34.0 7e-24 MKTGKLYGVGIGPGDPQYLTLRAADVLRSVDVVFTVISPNASNSVSQSVVDYLQPRGEVR LQVFSMSRDKTIREEQVRANAEAIIAELRAGHDCAFATLGDAMTYSTFGYVLEIIRKAIP NLDLEVIPGITSFATLTAKAGTVLVENGEQLRIIPSFRAEMAEALDFPKGSTTILLKSYR SRGALLDRLAREEKVQVLYGEHLAMKEQALLTDPDAIRARPEKYLSLIMVKKQ >gi|316923797|gb|ADCP01000063.1| GENE 8 10364 - 11683 1025 439 aa, chain + ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 69 432 106 456 461 250 39.0 3e-66 MYNLPESVIGKHITDYFLTGERGVMSSIRTRKMVICSSQTKNNVWGVSFRYPIQDEQGQL RGVVVESIPSNLDKDKLLALLDTVRNLEMKSYSFSEQKEAHKNSGLYTFEAIVGEAPCIE NMRCLGRRFAFSQEPILVCGESGTGKELVAQALHMASQRSDRPFVTVNCAALPPELMESE LFGYETGAFTGAKVGGVKGKFEMADTGTIFLDEIGELPLPMQAKLLRVLETGEIQKIAHK GQLHSDFRLIGATNRNLAEMVRQGQFREDLYHRLSVFELDIPPLRDRVSDIPLLVRHFVT QSVGDNRQKDIRIDDALYDAFAQYPWRGNVRELKNVLVYALYSLGDDQSVLTVQHLPPRF MRELEAAAVAGTTPDAEDSRQDSQNFSEASARAERKVLWDALVNSRYNKVLAARTLGISR SKLYRKLREHGLLAKFEAQ >gi|316923797|gb|ADCP01000063.1| GENE 9 11707 - 11853 64 48 aa, chain - ## HITS:0 
COG:no KEGG:no NR:no MAFAPALPRPRQPLNIARRKDGNRSYTRQPGLRQWDSQQAANTTGFRR >gi|316923797|gb|ADCP01000063.1| GENE 10 13289 - 13939 948 216 aa, chain + ## HITS:1 COG:MA0859_1 KEGG:ns NR:ns ## COG: MA0859_1 COG5012 # Protein_GI_number: 20089743 # Func_class: R General function prediction only # Function: Predicted cobalamin binding protein # Organism: Methanosarcina acetivorans str.C2A # 7 216 12 218 270 147 41.0 2e-35 MANVACIEELKNAILDFDEDAAVTAAEKIIKEGVSPMDAIDAMGDALKVLGDKFQIMEVF LPEILLATDAFKAGLKVLEPELMKTIDAGSFQEKPKVVIGTVKGDVHAVGKDMVATMLTV GGFDVKDVGADVDSEVFITAAEEFGAQIIGLSALMSTTMPSQKEVIDFLEAKGLRNKYKV IVGGGPVTPQWAEQIGADGFSKDAVEAVELAKSLLK >gi|316923797|gb|ADCP01000063.1| GENE 11 13998 - 15413 1790 471 aa, chain + ## HITS:1 COG:no KEGG:MA2972 NR:ns ## KEGG: MA2972 # Name: mtmB2 # Def: monomethylamine methyltransferase # Organism: M.acetivorans # Pathway: Histidine metabolism [PATH:mac00340]; Tyrosine metabolism [PATH:mac00350]; Selenoamino acid metabolism [PATH:mac00450] # 11 467 11 456 458 173 26.0 2e-41 MRIGARGKFIDFWSRAVTGPICFESDFDKRVYWPKLKKITEKWGIKYDPDHMVPCDDEML DRLWQAAIDLVAEVGVLCTDTRRIIEFSRKEILDAAANTADRYTVGVGKDSFTAVHRGFE DYDHVKNPVSVLGRILGPVSQDIYHQIAMSYAQVPQIDMCHFQGNLTEIYGMPITPDSPW EMFSELWSVAQVKDVCRQVCRPGLADGGIRAIAMSAMQAAFDPGWGANKGDFRCCLTLPH QKVEYKHLSRALQWHTYGINFYSVMTSYPGGLSGGPATSAVTGTAEWIIQKLLFDVPLNG SWSVDAMYFSNTSKYSLWCSNHQNAAVTKNTVCEPLTGGGWQMTHGIGHENFFWESAASA ISAVVLGNGVSGGTGAQSGLKDHQSGLGLQFSAEVGEAVAKARLTRAHANDLVKRIMAKY QPTIDSHTAHKMGGDFRECYNLVTVQPQKWYMDMYDKVKKELTQMGLPMEY >gi|316923797|gb|ADCP01000063.1| GENE 12 15662 - 16723 590 353 aa, chain + ## HITS:1 COG:MA4379 KEGG:ns NR:ns ## COG: MA4379 COG0407 # Protein_GI_number: 20093166 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III decarboxylase # Organism: Methanosarcina acetivorans str.C2A # 6 349 8 342 342 112 28.0 8e-25 MPRTMTSKERVKAAFAHNPVDRVPMMILLGETWMIEREKISFKDLREMDDLGAELIVRTY DEMQSDSVTTGLGCWIGLLEALGCPTEISKIGAPIEVKPCIHDVAADISSLDRSKIRERL 
ENSELIQKMMRQSREIKKLVGDRKSVAGQMVGPFSGASMMVGVKEFMILLGKKSPYIKPL LEYATDCCAEIANMYCENGCDLIQTCDPCSSGDMISPKMYEQHVVPTLEGLTSQLKNCET FLLHICGKAGMRLPHVKALGIDGFSVDSPVDLKESLEAAGKELTMVGNFNPNELLCMGTS EAVYAAAYANAEIAGLDGGYVMMPGCDLAARTPLENILAMVRASSDYAASVRN >gi|316923797|gb|ADCP01000063.1| GENE 13 16891 - 17826 867 311 aa, chain + ## HITS:1 COG:MA0146 KEGG:ns NR:ns ## COG: MA0146 COG0407 # Protein_GI_number: 20089044 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III decarboxylase # Organism: Methanosarcina acetivorans str.C2A # 2 298 5 306 339 97 26.0 2e-20 MTGKEVLFKTLRHEETERPAWVPMAGVHAGFLKGYTADEVYQDADKLVESLLEVEKMYAP DGLILLFDLQLEAEILGCTIKWEKNGPPSVRSHPLESTFEIPDIKITRDSGRIPLVLDVT RRIKKAVGDRTALYGLFCGPYTLASHLRGTNFFRDARQHPEYVKELMAYTTELGLQMADF YLEEGVDVIVPVDPVVSQISPTHFKNFVFEPYAKIFRHIRNKGAFSSFFVCGNATHILEL MCQTKPDSLAVDENVNIVAAHKLTDSYDVAIGGNIPLTTALLLGTQLDNMKVGIDLIDSL NTRKNFIFSPG >gi|316923797|gb|ADCP01000063.1| GENE 14 17932 - 18240 393 102 aa, chain + ## HITS:1 COG:no KEGG:Hore_15330 NR:ns ## KEGG: Hore_15330 # Name: not_defined # Def: methylamine-specific methylcobalamin coenzyme M methyltransferase # Organism: H.orenii # Pathway: not_defined # 2 102 348 448 448 145 63.0 5e-34 MDVDVELPDYGNLKKPLLEAFTLDSTACAACTYMWGVAQDAKKHFGDRIDVVEYMYNTPE NISRIKKMGVKQLPSLYLNGELKYSSLIPNLDDLIRQIEEAL >gi|316923797|gb|ADCP01000063.1| GENE 15 18237 - 19202 710 321 aa, chain + ## HITS:1 COG:SMc03799 KEGG:ns NR:ns ## COG: SMc03799 COG0523 # Protein_GI_number: 15966935 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Sinorhizobium meliloti # 1 279 1 285 329 122 30.0 1e-27 MNNPCGPVPLLFITGFLGSGKTTLLNRILDEAAAQGKKIGVIINEWGRVNIDSSLIRAKD IEIEELNDGQVFCSCLSGNFLEALVLLAKRSLDVVIVETSGMANPFPLRNILCDLKRLTG GHYVYQGMIALIDPESFLDLVEDINAVEEQVIASQRIIINKIELADGETLARIRKKIRQL NPHAGIIETSYACVEGILDSSMHEVSTSSPLESKFVKRSAKESYTRPGQYIITTTEGLPP ERVEAFVREILPGALRVKGILSDTEHGWFHVDGVNDKVETRVLEASGNESKIVIIPKAGD EIAEKLIAAWNTQCRVPFSLS >gi|316923797|gb|ADCP01000063.1| GENE 16 
19175 - 21082 1576 635 aa, chain + ## HITS:1 COG:AF0010 KEGG:ns NR:ns ## COG: AF0010 COG3894 # Protein_GI_number: 11497631 # Func_class: R General function prediction only # Function: Uncharacterized metal-binding protein # Organism: Archaeoglobus fulgidus # 93 624 74 595 597 317 35.0 6e-86 MPGSVFAFLNGFGQQGWADMSIMNIEVLAGGKRYRFQSEAGTSLKDCLLKAGILEGAECG GRGICGKCAVKVRSGTLAPVEGDESFLMTRGDGKVLACQSLIREDVVIEISAGRDDVRRK VRLPNLQKEKKYSDSLIEKKFIQLSKPSLRDQASDLERILAQVGKNKKVAFSLLGGLPEI VRKADFEVTAVLVEDELIAVEPGDTHALRYGFILDIGTTTIAVYLVDLLSGEVLDADGTA NPQRVFGADVLSRITAAAGRENLEKMQAVTIKGISETMRGLLRSLSIDENHVYSVVAVGN TTMSHLFLGVDPKNLSVAPFIPCYRPRTVVKGGRLGLPMHPEGTVHVLANISGYVGSDTL GVAMATKLWEQKGYSLAVDIGTNGEIILGYKGWLLACSAAAGPAFEGAHIQNGMRAGDGA IESVLLENGTVRLGVIGDMPPQGICGSGLIDAVAELLRCGLLGVSGRLAGEKSPALSQPL GGRLRTVGGMREFVLAFAGEQGNERDIVITQKDIRELQLAKAAIAAGIAVLLKEVKIEAS QIDRIYLAGAFGNYLDREKAVALGMFPGISVDKIIPIGNAAAEGAGLCLLSLGERKMADR IASFVKPVELSTHVEFNDLFVREIGFPPLGGKGAS >gi|316923797|gb|ADCP01000063.1| GENE 17 21118 - 21366 62 82 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQPLLQKVHGRRIPHIGCEHFQLGTQDFTGDKQCFHTIPNVAVKKCRAFAQCGFTGQSPK IPYKGQERAHKGSVSTPAPPAA >gi|316923797|gb|ADCP01000063.1| GENE 18 21259 - 21978 456 239 aa, chain + ## HITS:1 COG:no KEGG:Ccel_3200 NR:ns ## KEGG: Ccel_3200 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 224 2 235 253 117 32.0 3e-25 MKTLLIACEVLRPELEVLASDMRNPPPMNFLEQRLHDYPEKLRSAFQGLVDAFEQENDGP LAVLCGYGLCGRGLSGVHASRATLVFPKLHDCIPLLLGLDQKGANASSREGATYWISPGW LKYFLVPFHLESYRRFSVYEKKFGAAKAARMMKAENALLDNYKNACHIRWPEMGDAYVDE ARKVAEATLLPYSEIWGSSAYLAELLHGGQSGERFLHLVPGQTIDMDVEGTICAVACPA >gi|316923797|gb|ADCP01000063.1| GENE 19 22131 - 22583 40 150 aa, chain + ## HITS:1 COG:CC0928 KEGG:ns NR:ns ## COG: CC0928 COG3293 # Protein_GI_number: 16125180 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Caulobacter vibrioides # 6 79 6 79 123 62 37.0 4e-10 
MSVYRLTNAQWEYIRLLLLVRPRLRGCLPADNRQTLEAILHVLTTGCPWSELPRYLGNYV TAWRRFRKWGADGTFARIWPYLRAEFEHAEKIDFSRHKFECLQIQTDSPQLLIRRSGERE GRKRARSSGSGGPNPQLPGRRMVSFSAKIA >gi|316923797|gb|ADCP01000063.1| GENE 20 22881 - 23549 534 222 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_3800 NR:ns ## KEGG: Dhaf_3800 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 4 218 2 215 228 132 32.0 6e-30 MDQNVILVETVPLEFDLESLGCRLRIRPGQTKFMDRLSRLAEEAARVARPKAVARLCGLT ILDEEKVQVGEVTFSSPLLRQKMDGLGRAFPYLASEGTELADWSLSLSSSLEQVFASALR EVAVQQAENLLERTLLERYGIAQVSAMNPGSLKVWSLEEQVPLFELLAPLPEKLGVTLLP SLMMRPEYSVSGVFFQTDSKFYNCQLCPKKECPNRRTPSLVT >gi|316923797|gb|ADCP01000063.1| GENE 21 24474 - 25043 400 189 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) [Campylobacter concisus 13826] # 8 188 3 182 185 158 44 4e-38 MQDLLNGRCGWCGTDSLYVKYHDEEWGRLVTDDRLLFEFLTLESAQAGLSWITILRKRER YRKAFYDFDVNKVAQMTEQDVERLMQDEGIVRNRLKIKSAISNAKLFIAIQKEFGSFYNY TLSFFPEQKPISYQPKTLKDIQASTPESDAMSKDMKKRGFKFFGSTICYAYLQATGFVND HLVNCICRK >gi|316923797|gb|ADCP01000063.1| GENE 22 26266 - 26826 426 186 aa, chain - ## HITS:1 COG:PA5563 KEGG:ns NR:ns ## COG: PA5563 COG1192 # Protein_GI_number: 15600756 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Pseudomonas aeruginosa # 3 177 77 252 262 148 47.0 5e-36 MSSLEAGLDILPANIRLAEAELAFAGRIGRENLLKKALLSVSGEYDYVLIDCPPSLGLLT VNALNAANGLLIPVQVEYYALAGLALIRQTAELVRDLNPALAILGLVLTFFDARKTLNKD VAAALADEWGDTLFSTRIRDNVSLAEAPSNGQDVFSYKRSCYGAKDYAAFAAEFLERTEG CYGTRG >gi|316923797|gb|ADCP01000063.1| GENE 23 27583 - 27750 182 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWLEGVATLFVLTRFAGHCPAFAESYRQIYEYIGSHPEAQKIYAAVETKSPLHYT >gi|316923797|gb|ADCP01000063.1| GENE 24 28095 - 28982 456 295 aa, chain - ## HITS:1 COG:CT856 KEGG:ns NR:ns ## COG: CT856 COG0659 # Protein_GI_number: 15605592 # Func_class: P 
Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS superfamily) # Organism: Chlamydia trachomatis # 2 268 292 560 567 196 38.0 4e-50 MELISQGIGNVASVFFGGFSATGAIARTATNIRAGAMSSISAIVHSLFLVAVVMWLLPLI ELIPLAALASVLVIVAYDISDLRTVRHIFQGPKSDWSVMLLTFALTVVFDLTVAVYTGVI PASLLFMRRMGELTGLHTCVSREDEASGHEIPVPDKNNVPDGVEIFAINGPLFFGVADRF QSTLNAMETPPKVFIMYLHNMSAIDMTGIHALEEFLERRRKGCRVLFAAVRKPVHRTLQR VGILRTVGEENVFPSLDEALLRAEEILEDTIMARGKSFATAEFEILKLSRERSKN >gi|316923797|gb|ADCP01000063.1| GENE 25 29073 - 29669 159 198 aa, chain + ## HITS:1 COG:aq_1792 KEGG:ns NR:ns ## COG: aq_1792 COG3604 # Protein_GI_number: 15606847 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Aquifex aeolicus # 1 196 329 492 497 110 36.0 2e-24 MLSATNIDLETSLAQGRLRRDLFYRLASVTLRLPPLRDRQSDIPILVDYYIKQFSKSWLI EPPHLSASVLGALCNHDWPGNIRELRNVVSRLLLHSLDGAVTEALVRETLHEWDNISGQT AKPAQLLSLSASAEIPLQAPSRGNDTDLPSLEENERAHILEALRLTGGRLSGPRGAAALL KVPRSTLQHRIRKLGIVV >gi|316923797|gb|ADCP01000063.1| GENE 26 30874 - 32532 2380 552 aa, chain - ## HITS:1 COG:CAC3170 KEGG:ns NR:ns ## COG: CAC3170 COG0129 # Protein_GI_number: 15896418 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Clostridium acetobutylicum # 1 549 1 547 552 660 61.0 0 MRSQKMTFGLEKAPHRSLLYALGLTREEMKRPLIGVVNAANEVVPGHLHLGKIAEAVKAG IRMAGGTPMEFPAIAVCDGLAMNHEGMRFSLTSREVITDSIEIMATAHPFDALVFIPNCD KIVPGMLMAMMRLNLPSIIMSGGPMLAGRNNTDLISVFEGVGAVRNGTMTEEELEDLTEN ACPGCGSCSGMYTANTMNCLCEAMGVALPGNGTTPAVSAARIRLAKQAGMQVMELLEKNI RPRDIVTEQSVHNAVAADMALGGSTNTTLHLPAVFGEAELSLSLDIFDDISRKTPNICKL SPAGTQHIEDLHRAGGIPAVMGALLDKGLVHGEALTVTGKTVAENIKELKAHIIDPDVIR IDKPYAKEGGIAILRGNLAPDGAVVKQSAVAPEMMVRDVTARVFNSEEEGVEAILGGKIK KGDVVVIRYEGPRGGPGMREMLTPTSAIVGMGLGADVALITDGRFSGGSRGAAIGHVSPE 
AADGGLIGLVQDGDTIHIDIPGRKLELVVPEAEIEERRKHFVPVEKDVKSPFLRRYAKLV TSASTGGAYRKI >gi|316923797|gb|ADCP01000063.1| GENE 27 32604 - 33200 727 198 aa, chain - ## HITS:1 COG:no KEGG:IL0782 NR:ns ## KEGG: IL0782 # Name: not_defined # Def: hypothetical protein # Organism: I.loihiensis # Pathway: not_defined # 1 148 3 150 236 63 25.0 5e-09 MHEIEDYIEEAIRVVSRSDMPVSEKRSMIYSLLRLEEYGDCGFTNLRTLKEMMDCQYTFV FDKTEMYDYEANRGYYDDLSKKGGCSQGAPYTLVARDAVTNEWVKHGDKVCIDSGSDAWR GMVAAGAIAGEGAAPVERLENLDVLRKVKKLWGPMDDYFMQAHGGLFLLSGAIDDLPEEE FPEHFGMTKQDFADEYGD >gi|316923797|gb|ADCP01000063.1| GENE 28 33290 - 34108 1034 272 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0053 NR:ns ## KEGG: Ddes_0053 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 4 272 40 310 317 231 42.0 3e-59 MSNHFEGLGKTWLTLLNDPEKEVPAVVMQVMKEGKTRDCWQRKDSKEETMVLAWPVETGF RAGVTVHGNAGEQLRPVSTYPLLEGAPNDMTVNETYLWQNETEGEVSATCNEGANPLWFY SPFLFRDRENLTPGVRHTFLIAGLAYGLRRALLDEMTITEGVEYERYVAEWLAQNPGKTR LDVPQLTVDLRGARIVVPGDVASEYQIRVPVTSVEEMHIQNEKIYMLIVEFGLNTPNPLR FPLYAPERVCKIVPQAGDEIDAIIWLQGRIID >gi|316923797|gb|ADCP01000063.1| GENE 29 34381 - 35226 512 281 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 25 281 26 287 508 201 42 4e-51 MREIHVSAITDTVAELAVAACCRLPDVMYRAIESSVEREASPSGKDVLRQLLLNADIAAS ENTPICQDTGLAVVFAEVGQDVHVVGGSFEEAVNAGVAKGYTEGYLRKSSVAEPLFERKN TGDNTPAVLHVRLVPGDRIRLKLAPKGAGSENKSALKMLVPADGIEGVKKFVVDTVKAAG SSPCPPMVVGVGIGGTLELAGLCAKRAAMRDVDTRNPDPRYAAFEEELLGLLNDLGTGPQ GLGGTTTAFKVNVEFCATHIASLPVAVNINCHAARHAEAEL >gi|316923797|gb|ADCP01000063.1| GENE 30 35085 - 35831 387 248 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 1 245 260 493 508 153 38 1e-36 RDGPAGARRDDHGFQGQRGVLRHAHRQPPGGGEHQLPRCAPRRGGTVITSSPRLRRGGKQ EISVSEPIRLTTPLTQDKVRSLHIGDRVLISGTIYAARDAAHKRMVETLDRGEPLPVDLR DQIVYYVGPSPAKPGQAIGSAGPTTSGRMDAYAPRLMEKGLSGMIGKGNRSQAVKDAMQR 
YGTVYFAATGGAGALLSRCIRSYTVLAYAELGPEALAAMEVVDFPVIVVGDVEGGDYYME GPKAYAKR Prediction of potential genes in microbial genomes Time: Fri May 13 02:42:48 2011 Seq name: gi|316923793|gb|ADCP01000064.1| Bilophila wadsworthia 3_1_6 cont1.64, whole genome shotgun sequence Length of sequence - 3701 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 29 - 88 3.4 1 1 Op 1 . + CDS 131 - 517 480 ## Dde_1362 EF hand domain-containing protein 2 1 Op 2 . + CDS 587 - 2284 2084 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases + Term 2341 - 2389 12.6 + Prom 2540 - 2599 2.9 3 2 Tu 1 . + CDS 2693 - 3652 638 ## COG0726 Predicted xylanase/chitin deacetylase Predicted protein(s) >gi|316923793|gb|ADCP01000064.1| GENE 1 131 - 517 480 128 aa, chain + ## HITS:1 COG:no KEGG:Dde_1362 NR:ns ## KEGG: Dde_1362 # Name: not_defined # Def: EF hand domain-containing protein # Organism: D.desulfuricans # Pathway: not_defined # 1 119 1 120 133 78 44.0 9e-14 MKKLLFLAALTLCAAPAYAAPDRFEQMDKDGNGQVDWEEFQAAVPGMKRPAFDTIDADKS GGICRSEWDNFMKSHMGGQKGMPPAGMGMPAGMGGKAMPPAGAEKGGMPMIQPPVQQGEA PGIVPPKN >gi|316923793|gb|ADCP01000064.1| GENE 2 587 - 2284 2084 565 aa, chain + ## HITS:1 COG:VC0997 KEGG:ns NR:ns ## COG: VC0997 COG0008 # Protein_GI_number: 15641012 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Vibrio cholerae # 4 562 3 554 556 689 59.0 0 MNMDASSTPENSVGLDFLHVRINQDNASGRFGRKVHTRFPPEPNGYLHIGHAKSICINFG LAQEYGGLCNLRFDDTNPVKEDTEYVDSIREDVHWLGGEWDDREYYASNYFDQLYAFAEQ LIRDGKAYVDSQSAEEIRASRGTLTQPGTESPYRNRSIEENLDLFHRMRGGEFADGEHVL RAKIDMASPNVVMRDPTLYRIRHAEHHRTGDKWCVYPMYDFTHCLSDSLEGITHSICTLE FVNNRELYDWVLDALNVYHPQQIEFARLGLTYTVLSKRKLIQLVKGGFVRGWDDPRMPTI CGMRRRGYTPEAIRDFCSRIGVARAENLVEYSLLEFCVREHLNAIAPRTMAVLDPIKVVI ENYPEGQVEWFDMPFSQDGSVEGSRKVPFSRELYIERDDFREDPPKKFHRLFPGSEVRLR 
YAYYVTCKDVIKDADGNIVELRCTYDPESKGGATPDGRKIKGTIHWVSVPHAVSAEVRLY EHLFTSPTPGNTPEGVEFTDLLNPDSMRVVTAQVEPALAEFPAGSRVQFERLGYFCVDPD SKPGAPVFNRTVTLKDSWAKIEGKQ >gi|316923793|gb|ADCP01000064.1| GENE 3 2693 - 3652 638 319 aa, chain + ## HITS:1 COG:SP1479 KEGG:ns NR:ns ## COG: SP1479 COG0726 # Protein_GI_number: 15901329 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Streptococcus pneumoniae TIGR4 # 45 229 265 448 463 130 39.0 3e-30 MILARLIRVCASALLCAGVSLFPGNASADGAGTVVDGRAILKMSKGSNLCALTFDDGPSS HTERLMEILRERGIHATFFVVGKQVKYHPELIVRLQQEGHEIGNHTYSHKSLRHLSVDAQ RSEILSVQELLRKLGVHSRYIRPPYGNFDANTVAIAKAEGADIMLWSIDSMDWSKRASIE NMKTLISGQKLRGVFLFHDTHEQTIEALPEILDRLIADGCQFVTVSEYLAALRQDALPEG VNVASPAIQENPHQPVPPQGTQAEREAPQQLFQPGDVHEVRQPMAPRPKAAETPGAEQMA GHSTNMNDAEPAALRTSML Prediction of potential genes in microbial genomes Time: Fri May 13 02:42:52 2011 Seq name: gi|316923790|gb|ADCP01000065.1| Bilophila wadsworthia 3_1_6 cont1.65, whole genome shotgun sequence Length of sequence - 1974 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 34 - 522 668 ## COG3012 Uncharacterized protein conserved in bacteria + Prom 678 - 737 3.3 2 2 Tu 1 . 
+ CDS 883 - 1632 832 ## COG1763 Molybdopterin-guanine dinucleotide biosynthesis protein + Term 1858 - 1886 -0.2 Predicted protein(s) >gi|316923790|gb|ADCP01000065.1| GENE 1 34 - 522 668 162 aa, chain - ## HITS:1 COG:ychJ KEGG:ns NR:ns ## COG: ychJ COG3012 # Protein_GI_number: 16129194 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 3 162 4 152 152 129 43.0 2e-30 MSLCPCGSGRPYEECCETYIEGRETAPTAEALMRSRYAAHTLGKYDYLNETVHPSIRDEA DHEDMKKWSEAVEWEGLEIFSTKDGAETDETGEVSFEARYSVNGMPQSLREDAFFRREDG RWYYVDGNVHGQDPYRRETPKVGRNEPCPCGSGKKYKKCCGK >gi|316923790|gb|ADCP01000065.1| GENE 2 883 - 1632 832 249 aa, chain + ## HITS:1 COG:PH0081 KEGG:ns NR:ns ## COG: PH0081 COG1763 # Protein_GI_number: 14590034 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin-guanine dinucleotide biosynthesis protein # Organism: Pyrococcus horikoshii # 5 249 2 240 240 133 32.0 4e-31 MQATSIIGFSNSGKTTLISRLSECLEARGLKVAIAKHTHHELDKPDTDTALLMGPKRTIV GLSNTKDGKGEAMIHWGHPCFLRDLVPLLDADILLVEGGKTIGWLPRILCLRTTPELLTA LPEGCKALRPELALATYGDNKLPGLPSFTAEDLDALVDLVLEKGFLLPALDCGACGEADC TAMTQRIVAGEKTPGDCVAARGSIEVTVNGQSVGLNPFTAQMLSGGIKGMLGALKGMVPG GEVIIRMKG Prediction of potential genes in microbial genomes Time: Fri May 13 02:43:06 2011 Seq name: gi|316923762|gb|ADCP01000066.1| Bilophila wadsworthia 3_1_6 cont1.66, whole genome shotgun sequence Length of sequence - 31499 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 13, operones - 7 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 520 536 ## COG0778 Nitroreductase - Prom 541 - 600 4.5 + Prom 543 - 602 4.5 2 2 Op 1 . 
+ CDS 633 - 1034 367 ## COG1733 Predicted transcriptional regulators 3 2 Op 2 23/0.000 + CDS 1105 - 2643 1630 ## COG0541 Signal recognition particle GTPase 4 2 Op 3 19/0.000 + CDS 2795 - 3046 314 ## PROTEIN SUPPORTED gi|218887971|ref|YP_002437292.1| 30S ribosomal protein S16 + Term 3074 - 3122 10.0 + Prom 3097 - 3156 1.9 5 3 Op 1 12/0.000 + CDS 3295 - 3525 377 ## COG1837 Predicted RNA-binding protein (contains KH domain) 6 3 Op 2 . + CDS 3529 - 4071 182 ## PROTEIN SUPPORTED gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 + Term 4093 - 4134 -0.3 7 4 Tu 1 . - CDS 4303 - 5043 600 ## M446_5919 putative esterase + Prom 5122 - 5181 3.4 8 5 Tu 1 . + CDS 5264 - 6106 836 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 6228 - 6263 -1.0 - Term 6310 - 6343 5.4 9 6 Tu 1 . - CDS 6497 - 7777 1806 ## COG0460 Homoserine dehydrogenase 10 7 Op 1 . + CDS 8005 - 9087 945 ## COG0082 Chorismate synthase 11 7 Op 2 . + CDS 9091 - 11322 2470 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 12 7 Op 3 . + CDS 11319 - 12806 971 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) + Term 12823 - 12865 6.1 + Prom 13335 - 13394 4.1 13 8 Op 1 . + CDS 13524 - 15173 2253 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 14 8 Op 2 . + CDS 15202 - 15414 331 ## gi|212702956|ref|ZP_03311084.1| hypothetical protein DESPIG_00993 15 8 Op 3 1/0.250 + CDS 15426 - 16397 1473 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 16 8 Op 4 . + CDS 16402 - 17562 1555 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 17 8 Op 5 9/0.000 + CDS 17638 - 18726 1441 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component 18 8 Op 6 . + CDS 18793 - 20709 393 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 19 9 Tu 1 . 
+ CDS 20858 - 21847 1007 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Term 21895 - 21932 1.1 + Prom 21899 - 21958 8.4 20 10 Op 1 3/0.250 + CDS 22110 - 23015 1178 ## COG0280 Phosphotransacetylase 21 10 Op 2 . + CDS 23033 - 24106 1005 ## COG3426 Butyrate kinase 22 10 Op 3 . + CDS 24103 - 25335 1502 ## COG0477 Permeases of the major facilitator superfamily 23 11 Op 1 11/0.000 + CDS 25447 - 27315 2271 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits 24 11 Op 2 . + CDS 27329 - 27931 863 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit + Term 27990 - 28027 1.8 25 12 Tu 1 . + CDS 28231 - 28857 664 ## COG1280 Putative threonine efflux protein - Term 28966 - 29017 12.5 26 13 Op 1 . - CDS 29133 - 30755 1811 ## COG4650 Sigma54-dependent transcription regulator containing an AAA-type ATPase domain and a DNA-binding domain - Term 30910 - 30952 9.3 27 13 Op 2 . - CDS 31000 - 31359 466 ## Dbac_2274 response regulator receiver protein Predicted protein(s) >gi|316923762|gb|ADCP01000066.1| GENE 1 2 - 520 536 172 aa, chain - ## HITS:1 COG:CAC1484 KEGG:ns NR:ns ## COG: CAC1484 COG0778 # Protein_GI_number: 15894763 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 3 172 2 172 172 176 50.0 3e-44 MNDFLSLAQKRYAVRSYLPKPVEAEKLERILEAGRVAPTAKNTQPFRFLVVQHPERLKKL SACTNVKGYPLAIIVCSVASEVWVRPFDGKSKPDTDAAIAATHMLLEATDLGLGSCWLMH FDPAPIREQFRIPEGTEPEYILAIGYPAEDSHPSERHTKRKPLEDLVVEESF >gi|316923762|gb|ADCP01000066.1| GENE 2 633 - 1034 367 133 aa, chain + ## HITS:1 COG:BS_ytcD KEGG:ns NR:ns ## COG: BS_ytcD COG1733 # Protein_GI_number: 16079955 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 5 103 3 101 126 112 54.0 2e-25 MKQIKPQYNCSLELAMELVGGKWKLVLLWHLLSGTKRFSEIKRKLGDITQKMLTLQLREL EAAGLLTRTVYPVIPPHVEYSLTERGIELAPVLRGLCAWSGKYAVEKGITLAPLPPSRDG KDLPGAAESFPEA 
>gi|316923762|gb|ADCP01000066.1| GENE 3 1105 - 2643 1630 512 aa, chain + ## HITS:1 COG:SP1287 KEGG:ns NR:ns ## COG: SP1287 COG0541 # Protein_GI_number: 15901147 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Streptococcus pneumoniae TIGR4 # 2 510 3 510 523 438 47.0 1e-122 MFESLSDRLSSAFRSLSGRAQLTEENIRDGLREVRLALLEADVNFKVVKEFVERVRERVL GQKVEKSLTPTQQLVKAVHDELVTLLGGETTELNLRGAQPGVIMMVGLQGSGKTTSAGKL ANLLRKQKMRPYLVPVDVYRPAAIDQLRTLARQLDIPCFASSADMKPVDIVRAALEDAKE KQCTVVILDTAGRLHIDAPLMQELVDIKSAVQPQEILFVADAMTGQDAVTVAEAFNKDLA LTGVVLTKMDGDARGGAALSIKSVTGASVKYVGVGEKLSDLEVFHPERVAGRILGMGDML TLIEKAQSSFEAEEAEAMAKKLQKATFDFEDFRTQMRRMKKIGSLENILKLIPGLGGLTS KLGDLKAPEQEMARTEAVINSMTMKERRNPDLINGSRRQRIAAGSGTTVAQVNQVLRQFT QMRQMMQQVMNPKAKGGRSKMPPLPKGMGGMGGMPGMGGLPGMGGLSGMGGLPGMGGLPG MGGMPGLPGAPGSGKSATKKKKTERQKRKKKR >gi|316923762|gb|ADCP01000066.1| GENE 4 2795 - 3046 314 83 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|218887971|ref|YP_002437292.1| 30S ribosomal protein S16 [Desulfovibrio vulgaris str. 
'Miyazaki F'] # 1 78 1 78 79 125 74 3e-28 MSVKLRLTRMGNKKRSFFRIVATNTASRRDGRPLEFLGYYNPQVSPADIKIDTDKVQKWM DLGAEMSDTVRSLMKQYAKKADA >gi|316923762|gb|ADCP01000066.1| GENE 5 3295 - 3525 377 76 aa, chain + ## HITS:1 COG:CAC1756 KEGG:ns NR:ns ## COG: CAC1756 COG1837 # Protein_GI_number: 15895033 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein (contains KH domain) # Organism: Clostridium acetobutylicum # 1 75 1 75 75 73 58.0 8e-14 MKDFVEFVAKGLVDKPEDVQVVEVEGEQGAVLELRVAKEDLGKVIGKQGRTARALRTLLG AASSKAHRRVMLEIVE >gi|316923762|gb|ADCP01000066.1| GENE 6 3529 - 4071 182 180 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 [alpha proteobacterium BAL199] # 10 173 3 163 179 74 30 7e-13 MAEQRNIPADLVEIGTLARPHGIRGEIRVNYYADSLELLRGDVVYLQAGNKPPRKMEIDT VRMHQGTPLIRFVEAPDRTAAEFLRGQTLLIPESALPELDEDEVYLHDMLGLSVVLDATG QKLGVLDHVLFHGEQELWSILTPGGKEILLPAVPEFVADIDLDTEIIRITPPEGLLELYM >gi|316923762|gb|ADCP01000066.1| GENE 7 4303 - 5043 600 246 aa, chain - ## HITS:1 COG:no KEGG:M446_5919 NR:ns ## KEGG: M446_5919 # Name: not_defined # Def: putative esterase # Organism: Methylobacterium_4-46 # Pathway: not_defined # 11 243 6 241 243 184 44.0 2e-45 MDTTETLLGRKPAFVLVPGSWCGAWCWKPVADRLRNAGHTVFPMSLTGLAERSHLLSDRI TLETHVMDVVNLIKYNDLRDVVLVGHSYAGIVLTAVAERIPQCLRHIVYLDAMVPKPGEC AMDLIPNDEAEQRVLRARHDGGLSIPAPTPGHFATEAMREWFRDHMTPQPIKPYFDRIDV RVPQGNGVPVTYVSCTPVKLHPIALSVERARRLPLWRVVEIASGHNVHLHRPDDVAEILM ECAERA >gi|316923762|gb|ADCP01000066.1| GENE 8 5264 - 6106 836 280 aa, chain + ## HITS:1 COG:RSp0710 KEGG:ns NR:ns ## COG: RSp0710 COG0697 # Protein_GI_number: 17548931 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Ralstonia solanacearum # 7 278 13 285 303 178 43.0 1e-44 MNSTMCTALLTLLSMTLFAGNSVLCRLALLEYKMEPVTYTAVRLISGALMLWLIMAVRHK 
NVFKSGTWPGALSLFTYMACFSWAYVELPTAVGTLIIAVAVQTTMIGFGFFSGERPSRQQ GIGIGIALVGLVFLLLPGLTAPPFFSSLIIFISGAAWGVYCLCGKGVSDPAASTAGNFMK AVPFTLLMMLCLPSQVSFDNPGVWYALAAGALASASGYVIWYMVVVRFTVTVAAVVQLSV PVITAIGGVLFVNEPITLRVALSSAAILGGIFFATAFRRG >gi|316923762|gb|ADCP01000066.1| GENE 9 6497 - 7777 1806 426 aa, chain - ## HITS:1 COG:CAC0998 KEGG:ns NR:ns ## COG: CAC0998 COG0460 # Protein_GI_number: 15894285 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Clostridium acetobutylicum # 5 414 4 415 429 342 45.0 1e-93 MNTTLNIGLAGLGTVGGGLVRLLSENAEEIRARSGCDFNLKAVAVRNPNRKRDLPEGVRL TTDPMTLADDPDIDVVIELMGGIDTAKELITRSLAKGKQVVTANKALLAEEGESLFRLAD EKGATLLYEASVAGGIPIVQTLKESLTGNHITSLEGILNGTGNYILSEMTSKGVNFAPAL AEAQEKGYAEADPTLDIDGFDTAHKLVLLIRLAWGVDYPYTKMPIEGIRNLDKMDIDFAR EFGYRIKLLGRARMRDGKLEAGVFPTLVNHTYLLARVGGAYNAVRVEGNAVGSLFLHGLG AGSLPTASAVLGDLISIARQNNKLNSGFVKQVLPQADILPPEEACNTYYMRFMVKDDPGV LRDLSGALSDQGVSIAQAIQKGQSEAGVPLVFMTHEAPVSAIRKAVETMRKSDFLLAPAI CYRVMG >gi|316923762|gb|ADCP01000066.1| GENE 10 8005 - 9087 945 360 aa, chain + ## HITS:1 COG:aroC KEGG:ns NR:ns ## COG: aroC COG0082 # Protein_GI_number: 16130264 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Escherichia coli K12 # 1 357 1 348 361 320 52.0 3e-87 MSGNTFGRLFRLTTYGESHGPALGGVVDGCPAGIALSEELLQKELDRRRPGGGTAGTTRK EPDAVRLLSGVFEGATTGTPIGFQIENTNQRSGDYGPLAAVWRPGHADMTYDAKYGVRDF RGGGRSSARETAARVAGGAIARALLAAQGISVRSCTLEIGGMPAPAFTPEDMAGAASRPY CAPCEAAVEPWEALVRTVRGEGDTLGGIVRVEVLGVPSGLGEPVFDKLDAVLAQALMSVG AVKGVEIGDGFAAARSRGSFNNDAYRPAGAGSTPGNPATNHCGGILGGISTGQPLVMTVA IKPIPSIAKEQQSVNAEGSPVSLRVGGRHDICAIPRVNPVLEAMAALTVADALLLQRRMG >gi|316923762|gb|ADCP01000066.1| GENE 11 9091 - 11322 2470 743 aa, chain + ## HITS:1 COG:CPn0123 KEGG:ns NR:ns ## COG: CPn0123 COG0507 # Protein_GI_number: 15618047 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Chlamydophila 
pneumoniae CWL029 # 17 727 6 720 732 574 41.0 1e-163 MTQLSLSSLNDTVELQGTLERVVFHNEENGYTVFRVRPEGKDEVDLVTVVGHMGSPQAGA SLRIKGRWVNNARFGRQVQMDSFESLLPATAEGIKLYLASGLIKGIGKSIAARIVKAFGE DTLRIFDEEPQRLLEISGITQKKLVTIVECWTEHQGVRNLVQFLQPHDIGASFAVRIYKH YGPQALSIVQENPYRLAMDIRGIGFLTADALAAKLGFESEHPLRIQAGTLYTLMKQIDDG HVYYPRRALVEQTCSQLGIAEEFVEEAVDCLAREERVVLEELDDEIGVYLTRFHHYESKI AYYLRRILASPKSVRFPKADEVVEKVVSRLGITLAEEQLEAVRTSATSKVMVLTGGPGTG KTTILNAIIQVFAENKAKILLAAPTGRAAKRMSEAIGREARTIHRLLEYTPKDDGFARNE DNPLACGLLVVDEASMMDTMLAYHLLKAAPLGATIVFVGDVHQLPSVGPGNVLGDLIASG AMPVVELVEVFRQAAESEIVCNAHLINRGELPRLESSKDRLSDFYFMRQDDPDRAADIIV DLVKNHIPRRFQLDPFDEIQVLSPMHKGTVGAANLNLRLQQALNPEGEALQRGERLYRLG DKVMQIRNNYEKDVYNGDIGRVSSVDVQEKCLVVRYDDRYVGYDWEELDEIVAAYAISIH KSQGSEYPAVVIPLMTQHYMLLQRNLIYTGVTRGKRLVVLVGEPRALAMAVKNNRMQKRY TWLARRLGAAPEAAGPQTEGERT >gi|316923762|gb|ADCP01000066.1| GENE 12 11319 - 12806 971 495 aa, chain + ## HITS:1 COG:HP0087 KEGG:ns NR:ns ## COG: HP0087 COG0791 # Protein_GI_number: 15644717 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Helicobacter pylori 26695 # 84 493 27 444 457 189 29.0 8e-48 MMVSGARKGFWGVPPYERLGGRGVVSVPQMVKGRSLRLIALGVLLCCALALGGCGGGRKG GSLPPAESGSFEGPKGSIADLRRLPQDLLVYARQGNPDKPLMSAAEQARQDARFNSLFFG PWEAVRSSVSASEAFAIFGGQKKRSKARGWAENLLPWTQENWDKLTANAARDAYPSRFDK AITVRPTVLREAPTHKPRFGNPAEAGEGYPFDMFMYSTLPVGMPLLVVHTSADGAWVFVE TGLVSGWVPTEDTAVTDAPFRSRYENGTYAVIVRDDVPLVDELGRYVTTGSLGTMLPLSG GSGSSLRLLVPVRDPQGRAVAVPVRVSPSDAVRKPIPLTARAAAEIGNRMMGQPYGWGGY LFNRDCSLAMRDLFVPFGVWLPRNSSAQAKAWQFISFVKASPSGKESIIKDEGVPFATLL WLRGHITLYIGEYKGEPVMFHNVWGVRTDDGNGEGRHIIGRAVVTSLQPGAELPNVRREN LILSRLQGMSVLRYD >gi|316923762|gb|ADCP01000066.1| GENE 13 13524 - 15173 2253 549 aa, chain + ## HITS:1 COG:mlr9192 KEGG:ns NR:ns ## COG: mlr9192 COG1053 # Protein_GI_number: 13488234 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Mesorhizobium 
loti # 12 529 11 532 578 149 25.0 2e-35 MLVTETAWGKRIKTDVLILGTGASGVGAALRAAELKADVLMVGQGRLESSGCLGGGNDHF MAALGTDEPNDTREAFVGFFMKSAFGYREAQLNQWFDAIKPCVKILEEVDTEFLIKDGKR YRSVGFGQPGAWWLHITNGTTIKRHLARRVRKSGVNILDDFHITRLFERGGHIMGCMGFN VLTREVYAIECKTAICSLGWHPQRLTNNSTGNPYNCWHMPYNTGSYFVLPMQIGASLVNI DISGRATLIPKGWGAPGMNGINNMGGKEINALGERFMFKYDPMGENGKRRNQVMGTWQEQ VEGNGPPFYMDMTHFSDEDVHHLQYVLMPADKETYLDYCAARGIEFKKAPLEVEVSEMTV SGMLLADDRLETTVKGLFAGSNFTSFSGAMCCGYVAAFHAANDAASTEMGVIDDAEAQAE HDRILAPWERKGTNLLKYNDFEDPIRQIMDYYAKYRRNMAGMRLALEKLALVESYTDRVV ATNNHELMRLHEAFDLLELCRAHLEACLQRKESGRGMYQLSDYPEKDPELAKGLVLTRKD GAFQFGWTE >gi|316923762|gb|ADCP01000066.1| GENE 14 15202 - 15414 331 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702956|ref|ZP_03311084.1| ## NR: gi|212702956|ref|ZP_03311084.1| hypothetical protein DESPIG_00993 [Desulfovibrio piger ATCC 29098] # 5 70 502 566 566 73 47.0 3e-12 MPPVFDSKKCVKCGRCVKDCPGYILELDKEHDETPHVIEAYKRECWHCGNCRISCPHQAV SFEFPLYTLV >gi|316923762|gb|ADCP01000066.1| GENE 15 15426 - 16397 1473 323 aa, chain + ## HITS:1 COG:MK0297 KEGG:ns NR:ns ## COG: MK0297 COG0111 # Protein_GI_number: 20093737 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Methanopyrus kandleri AV19 # 47 320 37 312 522 218 44.0 1e-56 MSKHTVFVEAPPFAKFSDAEMRRLLDDERLEVLDWRGKRVNTPGFVEALATADIAVTGNS LTITDELLEKLPNLKLIAKLGTGLDMIDIPSVLRRGILLCNTPGANSVAVAEHTFALLLG YLRNVPQCDNAVRTGQWEKARTMGGEICGKTVGIIGLGNIGSRVASRMAGFEARLLGTDP CWPEALAAKYGIERRELNELLAESDIVCVHCPLDETTAGFIGKAELALMKPSALLVNMAR GGIVDEDALYEALRGKVISGAIIDAYSQEPLTASPLFSLDNVILSPHAGAFTTDALNAMS RMSVDQLFQYVDGATPDNLVTAE >gi|316923762|gb|ADCP01000066.1| GENE 16 16402 - 17562 1555 386 aa, chain + ## HITS:1 COG:PH1371 KEGG:ns NR:ns ## COG: PH1371 COG0436 # Protein_GI_number: 14591174 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Pyrococcus horikoshii # 32 383 26 383 389 
283 42.0 4e-76 MPSFLNAMTRQIPSSSIHAATQKAEAYRREGKKVVIFSIGRPDFDTPAHIKEAAKAALDK GYVHYTPNMGILPLREAVAEHIKRQTGVSYDPKTEVMITCGGQQAILLTMKAVLEPGDEV LLPSPGYGLYYNCASIADARVRAYALKAPDFGWGGAEAGERTKMLFINSPHNPTGAVLSK EELGQIADFAKANNLLVVSDEAYDRLLYDGLEHRSIASEPGMRDRTVILGSFSKTYSMTG WRIGYLAGPAEVIRGMALLQQSFVLSVNSFAQWGAVEAMTGPQDCVEEMRQQFDIRRKAM MEALSTIPNVSFAAPKGAFYIYLNHEKTGLDPVRFCAKLLDEYYVACVPGSEYGPYAGCN TRLSCATGLDDCLEGVDRIRKMVASL >gi|316923762|gb|ADCP01000066.1| GENE 17 17638 - 18726 1441 362 aa, chain + ## HITS:1 COG:PM1525 KEGG:ns NR:ns ## COG: PM1525 COG1638 # Protein_GI_number: 15603390 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Pasteurella multocida # 26 346 20 339 340 108 26.0 2e-23 MKRTLLTLTLAAALCLPALFAQKSAAEEPIVLTFNLGMPPIHQRWVNAIKPWCDELEKRS NGRIKIEPYFANALGKRSDAMDSVRTGIADLAEAPFSSNPGAFPFHSQIFSAANPSMALG NAYEMLSDFYKAHPEVLKKEIKGVKLMFIHAYPVGDCVMTKSAPILKLDDIKGKKLGFEG GGLRFETMQALGASVVGMNMSDLYQAMQGGIIDGIVMDFDPLISRRYGEEVKHVTLLNIT GTAFYVVMNQERYDSLPDDLKAIVDDMSQNYGPNLLEKFWAENVYGSLDKWIKEMGGTVH VLSDEDYARADKLAAVPAQAWFKSLDKAGYRGAELEKTFHQLEAKYFTPWRQSEAYRYVK QQ >gi|316923762|gb|ADCP01000066.1| GENE 18 18793 - 20709 393 638 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 223 634 15 426 435 155 26 2e-37 MSSNAIETPPLAAQLESLSALIRRTEPFCQRISAVGMYIFMGLVILTFADVLLRYVIGSP ISGSIEITEMLMSVVLFSSVAYTYWRRGHVSMDIVTGKLSELNKNRLGAITTVWSLAVVF FCITCMGKYAMTTTDTTSVWNMSYKPFIWFAVFGCVLLFLAMLSHLLDQLAAIIRGGGMG SAFLMLGIGLLGVALSVWVAVERLPGISAPAQGVIGMVYMFTMFFLGMPVAFALMGTSLV FIASLRGLTAALNLFGTAWFSTCSSYTWAPLMFFLLMGFLCFYSRFGEDLYRTARNWCGH FRGGLAIASVCACTALGAVVGDALVGCIAMMTIALPEMRRHGYDDKLAIGTLACSGGIGS LIPPSSNFIIYGVLAEQSIADLFIAGVFPGLVCMACFIVVIMIMVWRNPELAPALPRVPM HERLVSLKTGLPIIIIFIVVIGGIYAGMFTATEGGGIGAFTTVFLALVMGRLSWSIVTSG LNDTGKTISMAFTVLGGAGVFSYFMTMSKIPMILAGVIASMNMPPMAVMFAIIVCMSLLG CFIPAIPLMLICVPIFLPLAALFGWNLIWFGVIINILVTMAGMTPPFGVSLFVAKELADV 
PLSLVYRSSIPFVLAFFLCLGFCIAFEPLSTWLPAMMR >gi|316923762|gb|ADCP01000066.1| GENE 19 20858 - 21847 1007 329 aa, chain + ## HITS:1 COG:MTH970 KEGG:ns NR:ns ## COG: MTH970 COG0111 # Protein_GI_number: 15678988 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Methanothermobacter thermautotrophicus # 60 326 60 318 525 152 34.0 1e-36 MKVLSLLPRSRFTESGMTLPETLSLHFLPPNGSHVLIAAAESAQALLLPPSQPYVDAFCL ERMPSIRFIQTTGAGFDSVDHLTAAELGIPVSNSPNMNASSVAEYVMAAIVNLQRGLAWA DGEIRRGRYAASRAELLEQGGLELRGCRIGLFGLGNIGRAVVPLAKAFGCSVAACDAFWP EEFAAENGVERMDVAELFAECDVVSLHCPLNGSTRNLVDYNLLSSMKPHALLINAARSGV VVEADLARILAEGRIRGAALDCFADDGRAENPFLSIPAERVLLTPHLAGVTRAAFGRMLS QALENLERVLVHGQPPRFVVNGVLADYSD >gi|316923762|gb|ADCP01000066.1| GENE 20 22110 - 23015 1178 301 aa, chain + ## HITS:1 COG:CAC3076 KEGG:ns NR:ns ## COG: CAC3076 COG0280 # Protein_GI_number: 15896327 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 1 300 1 300 301 204 40.0 1e-52 MLQHFSELIDKVKAMPRTVVCVANADSETVEAARMALDKGLADVILYGDEAVVRPLVEQA GIEGRAELRHAADPATALREAVRSVRGGESGVLMKGMLNSSDFLRGVLNKEEGLRTGRRL SHIAAVELPGYHKLIYSTDGGMNINPDLAAKKDILANALLALRSWGYECPKVACLAANER VDPHNPATIDAAGLVEAWERGEFPTQCIVEGPIAMDVALNRESALHKGIESRVSGDADIF LMPIMEVGNVAIKGLLHFLKGSQVAGLILGAAAPVVMTSRSEPPSAKLYATALGCLAARN A >gi|316923762|gb|ADCP01000066.1| GENE 21 23033 - 24106 1005 357 aa, chain + ## HITS:1 COG:CAC3075 KEGG:ns NR:ns ## COG: CAC3075 COG3426 # Protein_GI_number: 15896326 # Func_class: C Energy production and conversion # Function: Butyrate kinase # Organism: Clostridium acetobutylicum # 1 350 1 354 355 325 45.0 6e-89 MFRILVINPGSTSTKIAVYEDEVPLFVESRDHDTAKLSPAVMEQFELRHRLLLDVLTEKG IDPASLHACVGRGGLLPPVRSGAYRVNDAMLDVLRHRPVMQHASNLGAVLADAVARPLGI PAFIYDPVTVDEMDPIARITGFSSIERKSVGHMLNMRACALRYATQNNAKYPELSLIVAH MGGGITLSLHVGGRVVDMISDDEGPFAPERSGGLPCFQLAEMATQDGVTFPEMMRRMQRK 
GGLMDWFGTSDAREIERRIHEGDGKAALVYEAMAHNLAKNIGKLAVVTRGRLDAILLTGG VARSRMLTDWVAERVSFLAPVHVLPGENEMESLALGVLRVLRGEEEAHTFAERTEAS >gi|316923762|gb|ADCP01000066.1| GENE 22 24103 - 25335 1502 410 aa, chain + ## HITS:1 COG:AF0367 KEGG:ns NR:ns ## COG: AF0367 COG0477 # Protein_GI_number: 11497979 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Archaeoglobus fulgidus # 5 387 3 397 397 178 31.0 1e-44 MKTPSRLIILLAGLVLCTVCGVLYAWSIFVVPLEQAFGWQRPETSLTFTFMITFFSLGMF AGGKLLRFGPARAVQIGGALLCVGLLWASRIDSVIGLYLSYGVVSGFGIGIVNLVPAAVC LRWYPERKGLVSGLLTMALALGTLLFGTVGAGWLIGKVGVSSTFMALAVLFLGIIVLGSL FLRMPEAAPGKEEGDGVGLRDMLHTPSYWMIWGWMLTIQIGGLMIIGHIVPYALECGLTA AQAGLGMGVYAIANGVGRLFFGYVHDRFGRAWGMGLDAVFMGCGLVLLAVLPSRFGMAGF LIAAVPVALAFGGTIPQLAALIMAFFGPRHFGVNYGFSTSPLMVASVCGPFIGGLIRAWS GDYLVALYVAAAITLLGVAPAVVLREKAHKRESVAREGCPSPASLSSQNG >gi|316923762|gb|ADCP01000066.1| GENE 23 25447 - 27315 2271 622 aa, chain + ## HITS:1 COG:MTH1852 KEGG:ns NR:ns ## COG: MTH1852 COG4231 # Protein_GI_number: 15679840 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Methanothermobacter thermautotrophicus # 3 621 6 615 618 503 45.0 1e-142 MSVLLEREIGKKALLMGNEALVRGAYEAGLDYASCYPGTPSSEVSNLLYELQGAGGFRMD FATNEKVAMEAVAGASMGGLNCLTAMKHVGLNVASDPLGTLTYLGVRGSLVVYSADDPSM FSSQNEQDNRQYARLFGLPCFEPATAQEMKDMTVAAFALSAQMGMPVMVRATTRVAHSRG VVELGDIRPKAGPRHYEKDYAFVPLPGNAVRHHKRLVERMRSLVPVSDASPFNRVEGPAD TRLGVVASSAAVNYVLDAVKECGLSDTVGVFRLGMAWPLPENALLEFLRGKDTVLVLEEL EPLVEEALRAMAQKNGLTLTILGKGTADLSTLYEYTPARVSAAVAEAFGVADVTPAPLDL TDAPQLVNRPPSLCAGCPHRMSYYGVKLGCEGRDVIFATDIGCYSLGFMPPLRMADIGVC MGAAASMPAGLELAVGPEQRIVGFIGDSTLFHSGLTGIANAVYNHHRFVLVVLDNMVTAM TGHQPSPGRDASLPQAPGTPPLTSIDMEAVIRALGVEHIQTVRPTNLKKVAEATRAALDH DGVSVIICREPCPLHMRRLSKAKKPVFGIDGERCVNCHTCVDTFGCPAFQLRDGKVSIDP 
VQCIGCAVCAQVCPNNAIRPQK >gi|316923762|gb|ADCP01000066.1| GENE 24 27329 - 27931 863 200 aa, chain + ## HITS:1 COG:MTH1853 KEGG:ns NR:ns ## COG: MTH1853 COG1014 # Protein_GI_number: 15679841 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Methanothermobacter thermautotrophicus # 6 195 5 195 196 149 43.0 3e-36 MNTLRIYIVGVGGQGNVLASKMIGEAALSMGVPVLMSETHGMAQRGGVVESTVVLGGAQS PTISDNCADILLAFEPMEAVRALSKANADTIVITNTAPVVPFIPGGSAAYPPVDDLLACL RAKVGRVIAFDARKESLEAGNPLGLNMVMLGALYGAADMPLTKESQMEAIRKNGKPAFVE SNLNCFERGFKAASEETAKD >gi|316923762|gb|ADCP01000066.1| GENE 25 28231 - 28857 664 208 aa, chain + ## HITS:1 COG:MA1855 KEGG:ns NR:ns ## COG: MA1855 COG1280 # Protein_GI_number: 20090705 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Methanosarcina acetivorans str.C2A # 10 208 13 211 211 142 42.0 4e-34 MPPVETLGAFFVASVVMGLAPGPDILFVLTQSALYGARAGFATTCGLITGLFVHITAVSL GVAALFQSSETAFNVLKFAGAAYLLYLAWLSFRSGTSKASLQKAQFPGYGTLYRRGVIMN ITNPKVTLFFLAFLPQFADPARGGLTAQIIALGALFQLATLLVFGCVSLLAGRVAGRFNS SVKGQLFLNRAAGCVFTGLAVMLLVSSR >gi|316923762|gb|ADCP01000066.1| GENE 26 29133 - 30755 1811 540 aa, chain - ## HITS:1 COG:rtcR KEGG:ns NR:ns ## COG: rtcR COG4650 # Protein_GI_number: 16131296 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Sigma54-dependent transcription regulator containing an AAA-type ATPase domain and a DNA-binding domain # Organism: Escherichia coli K12 # 17 535 20 526 532 419 43.0 1e-116 MRSIVFSTVSHLDMYTGEDRKDRWRPLLELLRIHDFKVDRLYFFISHLYRHIVPTLVKDM NNVCPETEIVPVITNLNGVLTYEDIAPAYKVFSAYFEQYRFDLSNERYFFHLGPGNLFQH ALMLIMLFHFKRLPFQMLRLAPNPEKPGDIIMEIYEGDIRKWIASIADSEKRNVAAQTFL KSCIETRNPVFNKLIEEMEHVATHSTAPVLLSGPTGSGKSHLARKIYELRYNKGLVPGSF VEVNCATLQGESVLSNLFGHVRGSFTGATTVRQGLLKTAHEGILFLDEIAEIPLQIQVIL LKAIEEKKFYPFGSDTPVHSDFQLICGTNRDLAEEVRQGRFRLDLLSRINMWHFRLPALR 
ERLEDIEPNINYELDRHTRLKGFKADFLPEARERYLNFAMSPEAIWPGNFRDLSSSIERM QSYALGGIINEELVEEEIIRLRAIWHTPALAELPAPKQGAPLLSRLFTPSQLDQMDLFDK IQLENVIRVCCDCSTRTEAGRKLFGVSRGRKKHADDTTRLNKYLKSQGVTWEQIQSLRYR >gi|316923762|gb|ADCP01000066.1| GENE 27 31000 - 31359 466 119 aa, chain - ## HITS:1 COG:no KEGG:Dbac_2274 NR:ns ## KEGG: Dbac_2274 # Name: not_defined # Def: response regulator receiver protein # Organism: D.baculatum # Pathway: not_defined # 1 115 1 112 119 66 34.0 2e-10 MELLLVTPRPEVWNECLPVFQRGGNTLQQAASLEDAAPIIRDTPPVLAILDLELEGKALR QAVIDIMMINASVHTAVVSDMDPDEFHEATEGLGILMPLPTSPKADDAERLLKALAGVM Prediction of potential genes in microbial genomes Time: Fri May 13 02:43:53 2011 Seq name: gi|316923723|gb|ADCP01000067.1| Bilophila wadsworthia 3_1_6 cont1.67, whole genome shotgun sequence Length of sequence - 50818 bp Number of predicted genes - 41, with homology - 37 Number of transcription units - 21, operones - 8 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 14 - 394 567 ## Ddes_1864 dinitrogenase iron-molybdenum cofactor biosynthesis protein 2 1 Op 2 8/0.000 - CDS 449 - 1333 1101 ## COG1149 MinD superfamily P-loop ATPase containing an inserted ferredoxin domain 3 1 Op 3 . - CDS 1326 - 2213 935 ## COG1149 MinD superfamily P-loop ATPase containing an inserted ferredoxin domain 4 1 Op 4 . - CDS 2221 - 2907 695 ## COG1342 Predicted DNA-binding proteins - Prom 2927 - 2986 4.9 + Prom 2880 - 2939 2.8 5 2 Tu 1 . + CDS 3153 - 3653 421 ## + Term 3706 - 3750 6.5 - Term 3896 - 3946 -0.8 6 3 Tu 1 . - CDS 4006 - 4524 630 ## COG0607 Rhodanese-related sulfurtransferase - Prom 4769 - 4828 2.8 + Prom 4680 - 4739 3.7 7 4 Tu 1 . + CDS 4794 - 6146 1473 ## COG0733 Na+-dependent transporters of the SNF family + Term 6177 - 6226 17.0 + Prom 6296 - 6355 2.0 8 5 Op 1 . + CDS 6495 - 7022 357 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 9 5 Op 2 . 
+ CDS 7103 - 7666 650 ## COG0655 Multimeric flavodoxin WrbA - Term 7701 - 7740 1.1 10 6 Tu 1 . - CDS 7837 - 12216 4493 ## COG0642 Signal transduction histidine kinase - Prom 12438 - 12497 1.9 11 7 Tu 1 . + CDS 12564 - 13046 272 ## Dvul_1794 hypothetical protein + Term 13161 - 13196 0.3 + Prom 13185 - 13244 2.4 12 8 Tu 1 . + CDS 13341 - 15434 1912 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain - Term 15291 - 15335 -0.4 13 9 Tu 1 . - CDS 15528 - 15818 79 ## 14 10 Tu 1 . + CDS 15712 - 16776 1032 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase + Term 16812 - 16855 12.4 - Term 16805 - 16838 2.7 15 11 Op 1 . - CDS 17038 - 17448 468 ## Ddes_0558 cytochrome c class III 16 11 Op 2 10/0.000 - CDS 17512 - 19425 2647 ## COG0642 Signal transduction histidine kinase 17 11 Op 3 13/0.000 - CDS 19422 - 19928 359 ## COG0642 Signal transduction histidine kinase 18 11 Op 4 13/0.000 - CDS 19859 - 20602 943 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 19 11 Op 5 12/0.000 - CDS 20617 - 22479 2424 ## COG0642 Signal transduction histidine kinase 20 11 Op 6 9/0.000 - CDS 22531 - 22920 613 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 21 11 Op 7 1/0.000 - CDS 22943 - 23350 387 ## COG0784 FOG: CheY-like receiver 22 11 Op 8 . - CDS 23352 - 24269 1267 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins 23 11 Op 9 . - CDS 24274 - 24792 622 ## Dvul_2716 hypothetical protein 24 11 Op 10 . - CDS 24802 - 25170 363 ## Dde_3710 acidic cytochrome c3 25 12 Op 1 . - CDS 25277 - 26593 1796 ## COG0247 Fe-S oxidoreductase 26 12 Op 2 . - CDS 26609 - 27268 976 ## Dde_3708 hypothetical protein 27 12 Op 3 . - CDS 27296 - 28540 1555 ## DVU0266 hypothetical protein + Prom 29393 - 29452 1.8 28 13 Tu 1 . 
+ CDS 29589 - 30563 1208 ## COG1879 ABC-type sugar transport system, periplasmic component + Term 30583 - 30628 6.0 29 14 Op 1 21/0.000 + CDS 30922 - 31680 262 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 30 14 Op 2 . + CDS 31680 - 32687 915 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components + Term 32814 - 32843 0.4 - Term 32793 - 32841 12.1 31 15 Tu 1 . - CDS 32851 - 33930 1085 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis - Prom 34087 - 34146 2.9 - Term 34124 - 34179 23.1 32 16 Op 1 . - CDS 34303 - 37872 4884 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase 33 16 Op 2 . - CDS 37917 - 41627 4703 ## COG1038 Pyruvate carboxylase 34 17 Tu 1 . - CDS 41777 - 42841 598 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase - Prom 43032 - 43091 1.5 35 18 Op 1 . + CDS 42840 - 44003 854 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase 36 18 Op 2 . + CDS 43996 - 45648 1360 ## COG3894 Uncharacterized metal-binding protein 37 18 Op 3 . + CDS 45738 - 46916 1617 ## COG0126 3-phosphoglycerate kinase 38 19 Tu 1 . - CDS 46922 - 47218 69 ## 39 20 Op 1 . + CDS 47222 - 48415 1009 ## LI0463 hypothetical protein 40 20 Op 2 . + CDS 48434 - 49828 1611 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases + Term 50024 - 50059 6.5 41 21 Tu 1 . 
- CDS 49936 - 50391 -226 ## - Prom 50520 - 50579 3.0 Predicted protein(s) >gi|316923723|gb|ADCP01000067.1| GENE 1 14 - 394 567 126 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1864 NR:ns ## KEGG: Ddes_1864 # Name: not_defined # Def: dinitrogenase iron-molybdenum cofactor biosynthesis protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 126 1 126 126 159 68.0 4e-38 MSTSTLAIPSALPGGLEAGMGMHFGHCDIYTIVQIEDGAIKSVGTLPNVPHQQGGCLAPV QHLASHGVTALLAGGMGMRPLMGFQQAGVSVYYAGNYPTVGLAVQAFLDGKLPAFSTEFT CGGGNH >gi|316923723|gb|ADCP01000067.1| GENE 2 449 - 1333 1101 294 aa, chain - ## HITS:1 COG:AF2381 KEGG:ns NR:ns ## COG: AF2381 COG1149 # Protein_GI_number: 11499958 # Func_class: C Energy production and conversion # Function: MinD superfamily P-loop ATPase containing an inserted ferredoxin domain # Organism: Archaeoglobus fulgidus # 1 286 1 282 283 240 42.0 3e-63 MREIVVISGKGGTGKTSVCASLAHLAQNKVVCDLDVDAPDMHILLDPQVHTREAFVSGNE AVIDRDACRRCGICFEHCRFDAVKKDGDVYGIDPLRCEGCGVCVALCPAKAIAFPEKECG EWYVSDTRFGPFVHAQLYPGQENSGRLVTLLKQQARELAKRQGLDLVICDGSPGVGCPVI SSLSGASLAVAVVEPTPSGRHDFERVAALCDHFRIPVAVLINKADLNHEEVQAITRLADD KGYTVVGALPFDPAVTGAMIRRKALTETDSPLASTLSAIWGRIRELAYAPRKRG >gi|316923723|gb|ADCP01000067.1| GENE 3 1326 - 2213 935 295 aa, chain - ## HITS:1 COG:MA4242 KEGG:ns NR:ns ## COG: MA4242 COG1149 # Protein_GI_number: 20093032 # Func_class: C Energy production and conversion # Function: MinD superfamily P-loop ATPase containing an inserted ferredoxin domain # Organism: Methanosarcina acetivorans str.C2A # 1 285 1 273 284 203 39.0 3e-52 MRIAIASGKGGAGKTTVTASLASVWDRPFIAVDTDVEAPNLHLFLPPAVEASETVGLEVP ILDPERCTLCGACRAICRYKAIAQFASRLTIFTDMCHGCGGCFAVCPSQALTPGSRELGV LDQGTVLEGRGRFLMGRSRIGEAMTPPLLRALRKKLDLMLTAIPADALIDSPPGVSCPAM TVGLDADAVLLVAEPTPFGFHDFRLAHQAFRQIGKPVAVIMNRAAMPGNAEGDGALRAYC AEQGLRVLGELPFDRAAAETYAKGRLIAATSPEWRVRFESLRDAVLTFAEGACHA >gi|316923723|gb|ADCP01000067.1| GENE 4 2221 - 2907 695 228 aa, chain - ## HITS:1 COG:MA4245 KEGG:ns NR:ns ## COG: MA4245 COG1342 # Protein_GI_number: 
20093035 # Func_class: R General function prediction only # Function: Predicted DNA-binding proteins # Organism: Methanosarcina acetivorans str.C2A # 2 94 6 98 112 89 50.0 4e-18 MARPRNCRYVERKPHVTYFKPRGIPMTELTETLLTVEGLEALRLADLEGLTTGEGAERMR VSRHTFGRTLAEARRAVADALVNGRALCIEGGTYAVLPPQPEADKPHKEFHMQKVAVSSE GPSLDDMVDPRFGRAGGFVIVNPETMETSYLDNGVSQTMAQGAGIETAERMSAAGVTVVL SGYVGPKAFEALKAAGIKVCQDLDGMTVREAVEKYKNGDAPFADAPNK >gi|316923723|gb|ADCP01000067.1| GENE 5 3153 - 3653 421 166 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MERENISVDIITVHGHEHVVVINGMAFELDPGMFDPAIHKIEWVAGQGVIYWEDGRENTL FGKDGYDVHIAPLVRAFGEEQRRVEDNVIILDAERRQRTYATAKQRANLLSQIREVEEKM ARSTQAILAAQLAGKVPEGEDVHHFLSNYTRKLELRAQLTTLDTDM >gi|316923723|gb|ADCP01000067.1| GENE 6 4006 - 4524 630 172 aa, chain - ## HITS:1 COG:ygaP KEGG:ns NR:ns ## COG: ygaP COG0607 # Protein_GI_number: 16130582 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 2 172 3 173 174 158 49.0 5e-39 MLPAISPQEAFERFSKGEARLVDIRETDEYTQEFVPGSRLIPLSIIAKHPLKDADAPDKP IVFFCHSGNRTANASDLLERLAGDVQAYRLDGGISGWEKAGLPVEHISSTIPLFRQIQIA AGSLVLIGVIGSAFWHPFFWLSAFVGAGLVFAGISGFCGLGVLLSHMPWNRR >gi|316923723|gb|ADCP01000067.1| GENE 7 4794 - 6146 1473 450 aa, chain + ## HITS:1 COG:BS_yocR KEGG:ns NR:ns ## COG: BS_yocR COG0733 # Protein_GI_number: 16078994 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus subtilis # 9 450 7 445 445 301 42.0 3e-81 MAETSRDLFASRLGMLAATLGSSVGLGNIWKFPALTGEHGGASFLLVYILSTLVVGLPLM IAELLIGRSTRSNAYQAFKTLSSNRFWWIIGGLGIAGAVTILSFYSDVAGWVFAYIPKAL TGSVHTQDPAVAKEAFSSLLACPWKSLFWQWLVLGLTASVILRGASGGIEKATRILIPLL FILLVVVCIRSLTLPNAMEGLKFLFMPDFSRIDGSVVLMAMGLAFFKMSIGFGCMITYGS YFRSDTHVPFLAVRVMVCDLLVSILAGIAVFPAVFSFGFEPTAGTSLLFLTIPAVFASMP GGQLFTTLFFVLSAVASMGAMLSLLEVPVAWLSETFRISRPRATVLMTLALILLGAPATL STSVLSDVTVFGLSLFDLYDFLSSNLMLPLDGLLLSLFVGYVWSRSDALDALTNGGTLHN LACAKAILFLCRYVTPILIVIILLNGLKVF 
>gi|316923723|gb|ADCP01000067.1| GENE 8 6495 - 7022 357 175 aa, chain + ## HITS:1 COG:STM2587 KEGG:ns NR:ns ## COG: STM2587 COG2110 # Protein_GI_number: 16765908 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Salmonella typhimurium LT2 # 1 163 2 169 274 94 34.0 7e-20 MKCLLCDINPAMREAWEKELERRPRLAALCSVVAGGITDLRVDAVVSPANSFGFMRGGVD GVYTRVFGEGVESRLQAIIRTLPAEELSVGEALIVPTGHSGIPWLISAPTMRRPSVLHDG DPVRRSARAAMRAGLEQAFVSIAFPGMGTGTGRLPFDAAAKAMFDGMEEALFSPR >gi|316923723|gb|ADCP01000067.1| GENE 9 7103 - 7666 650 187 aa, chain + ## HITS:1 COG:MA0445 KEGG:ns NR:ns ## COG: MA0445 COG0655 # Protein_GI_number: 20089336 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 5 185 3 182 182 96 32.0 2e-20 MGKNILCILSSPRQNSNSGLLAEAVLEGAAEHGGKAEVIRLVERNINPCHGCFACTRSDR GCVQHDGMTDLYPLIASADALVMATPIYYFNMSGQLKTFIDRCIAVDVGGGKRGLRGKKL AVTMAYEGEDPFDSGCINAIRCFQDICRYTGMELKGQAYGTALEPGAILANKKLLDEARE LGRILLS >gi|316923723|gb|ADCP01000067.1| GENE 10 7837 - 12216 4493 1459 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 914 1319 258 669 676 199 35.0 5e-50 MNTLFLKAPQLTAEIMRAHYCEHDLDKILPLFHETLTWIGAGEDQICYDYETIKDFFLRT YAEKAVPDCIITDEDYRLVSWDEHSCLVAGRCWIATKPETGYVLRAHQRVTFSYKLVDGE LKITLIHISNPYADMQEGEDFPAQIGRQSYEYLQRLLVEKSRQIALLNSTVSSGLKANWD DEYYSLFYVNEGLCRMLGYTEEELMAKCRGRMTELVYPPDLPQALADCERCFANSLTYST EYRMQRKDGKLIWVWDTGSKSKNEEGKTVINSVIVDITERRHTNDTIRRQKAFFQSLYDT TLCALVQYDLNGTFLNANAFTFDVIGYTEEQFRTETGGSLMSIVHPCDVDTVREHIERLI ADRRPTVYNCRIIRRDGAVRWVCASANIINNMDGVPVIQAAYSDVTELQRVERERDSTYD SIPGGVAKVLIGTQLSLLEANDNFFQMLGTDRTAYKGTLSAVAPADRGAIVTAFMEAAET DAPVDIEYRCRRFDNEQTIWIHLIARFVENAHGAKVYQCVFIDITKQKTAQIQLYRERDR 
YRIIMENSADVIYEYDRKTDTVVFYETIRRGDETEIAKHVSPNFSKKLYDQKIAHPEDAE TAMRVFSGIQSGSAEVRLRNLKINGDYVWCLLQGQPVYENGALARVVGIIRDITENKRIS QEKERLQRIFDLELRRDYESICQINPSTGRYVMWTPSNASYYDIPTSGIFSEELAHAISR IVCGEDQETCLKTLSIGNMLKTLEEEKEGTCYYRVLTPDGSLRWKCARYTYFGDDGSILL NVRDVHDIRIAQQQEENRFRAILRETCEYIIETDVETKSYTLHLPTLINRYPLEACSDYG SLIARYSERYVAPEDRESFLRAVSLPEALSRMRREGGSCSIKYTVNTNGSPAYKTWNMSL YRYDDNREYMLSYILDITKLVLEQQEKEREAERNRQIIKDALTAAEQASRAKSDFLSRMS HEIRTPMNAVIGMTTIAAASLDNRDKLTDCLGKIGLSSRYLLSLINDILDMSRIESGKVS IINEEFDFRSFVEGISSLIYPQAKNKNIVFDLNIEGVVDERYRGDPLRLNQVLINILSNA LKFTPEWRSVHLSIRETRRVRDRAYLQFIVRDTGIGMEKGLLERIFEPFEQGGASISHSY GGSGLGLAISSNLISLMNGHISVSSTPGVGSEFVVELPLLTVPDNTPKQDVSLEDIRVLV VDDDLVTCEHTTLILNRIGVDAEYVTSGKAAVTRVKSALQRHTCYNIALVDWKMPDMDGV ETARSIRRIVGPDTLVIIMSAYDWTEIEARAREAGVDFFISKPIFQSVVQDVLLKATRRR QSADTLPVQKEDFAGRRILLVEDNEINMEIARTLLEFRNASVDGACNGQQAVEMFRSSPQ NHYDAVLMDVRMPVMDGIAATQAIRGLDRADAATVPILAMTANAFAEDIERSRKAGMNEH LAKPIEPETLYARLASYFR >gi|316923723|gb|ADCP01000067.1| GENE 11 12564 - 13046 272 160 aa, chain + ## HITS:1 COG:no KEGG:Dvul_1794 NR:ns ## KEGG: Dvul_1794 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 103 18 133 163 76 48.0 3e-13 MYRKEKGFTLIEIISVLVVLGILAAVALPKYYDLQEEARIKAATAAVSEAQARINLAFAQ YLLNHGECKDIAALLMTPLGDGKGDGSLYIGDDDAHFEDKTAGKVGGWIFWVDESKKLET VNEDTRVGLLEDPDGNKIDMSETGLYLRIPQCNSSNKDEK >gi|316923723|gb|ADCP01000067.1| GENE 12 13341 - 15434 1912 697 aa, chain + ## HITS:1 COG:slr2100 KEGG:ns NR:ns ## COG: slr2100 COG3437 # Protein_GI_number: 16330586 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and an HD-GYP domain # Organism: Synechocystis # 5 346 10 353 368 228 36.0 4e-59 MGTEKRQILIVDDVELNRAILAELFQDSYEVLEAENGCQALELLEAHHDDILIVLLDIIM PVMDGFEMLQNMAHRTWKEEIPVVLITSENSDNALLKGYELGVSDIINKPFNPNIVKRRV DNTIELYLHKRHLEALVRQQVETLEKQTLKLNRFNEFIIDTLSTVVEFRSCESGTHIRRV REITRLLLESLSFRYPEYDLPHGAIDKMVSAASMHDIGKIAIPDAVLNKPGRLTAEEFEI 
MKTHSLKGCELLQSINEGQDEEYYRFCYDICRSHHERWDGSGYPDGLAGNNIPIWAQVVS LADVYDALTSDRVYKAAYTHAQAVSMILNGECGVFNPRLLNSFLSVASRLENGEIGPMFR ERAAAPAKPSHKKHSLSARTLWLLEREREKYRVLSELSGDIVFNYDVKRDMLECSEKWYE VFGWDITVPNARKTLLRSPFIHEDDRAIVLKGLSGITPKHPRCRMEIRLMTSGGGYEWFD VYANALWDADSGSRLGYLGKLTNINERKSEVNRWREQANTDPLTGLSNRKRIEEQIAQAL EDDKENGAAFLFIDVDNFKAVNDTLGHMFGDEVLRHVASEIRRKVRTSDIVGRVGGDEFI VFLRNIRSLDAISKKAGEICAAFKSKYSEAIPHGGISCSVGIALYPGDGACYEDLMYKAD QALYAAKEKGKNCFAFYDVTFRNRAFPSVLSEVESGA >gi|316923723|gb|ADCP01000067.1| GENE 13 15528 - 15818 79 96 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSLSKITFSRADTGKAHNRDTITPKTTSHSRFHRLMPSFLSLEGEKRVTRFSDPDGTRRR FGRRGSNVRENVFVPVFFPADALAAPRASRMADERP >gi|316923723|gb|ADCP01000067.1| GENE 14 15712 - 16776 1032 354 aa, chain + ## HITS:1 COG:SSO2243 KEGG:ns NR:ns ## COG: SSO2243 COG1957 # Protein_GI_number: 15899016 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Sulfolobus solfataricus # 29 231 3 179 307 127 39.0 2e-29 MRRWNRLWDVVLGVIVSLLCAFPVSAREKVILDSDMVEGFDDGVAMLALAQSPGIELIGV TIVAGNTWVSDGVAYALRQLEIAGQNIPVAAGVDRPFRPQRYELFGLERQLFGMGHDAWV GAFGYPKPESWQKVYRERYGKEPQSRPDPRHAVDFIIEEVRKHPGELTIAEIGPCSNLAL AVLKAPDIVPLIKRVIFMGGSFFKPGNVTPTAEFNWWFDPEAARIVVRTPFREQIMVGLD VCEKMPFSFDRYQAFLAGQRPEMKKLLESTYAGQQFAKDKAFIQYVWDVLAAAILIDPSL IAEERTCAVDVNAEFGPSYGQALAYPDNGPQGSQKARIVMTIDQERFWNMLTAR >gi|316923723|gb|ADCP01000067.1| GENE 15 17038 - 17448 468 136 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0558 NR:ns ## KEGG: Ddes_0558 # Name: not_defined # Def: cytochrome c class III # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 132 1 130 147 83 32.0 2e-15 MKRLILCALAVAGLCIPALATDPTPEQKAQIEKPMTMNNTGNEKKQVIFTHSAHAKTDCA FCHHKAVEGNIYVGCAAKGCHDNMDKKDKSEHGYYFTMHNKKSEKSCMGCHQKAVAENPD LKEKFKGCNPCHAKNS >gi|316923723|gb|ADCP01000067.1| GENE 16 17512 - 19425 2647 637 aa, chain - ## HITS:1 COG:atoS_3 KEGG:ns NR:ns ## COG: atoS_3 COG0642 # Protein_GI_number: 16130156 # Func_class: T Signal transduction mechanisms # 
Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 373 632 12 269 278 107 28.0 8e-23 MKETILLVDDEQGIRTVMGLSLRDAGYNVITVASGEEALQLFAERPIPIVITDIRMPGMN GLDLLKHIKILAPETEVILISGHADLEMAIQGLKLEASDFITKPIDDDLLHISLKRALER IAMRNKLKEYTSNLEKMINEKSAQLVEAERQLAARQVVDGFAFGLHTLGSSMSGEQQSFN ELPCFIAMHDRFMEVLAVNDLYRERLGNLVGHPSWEPYVLHSNEAMFPVERAIVEGCGAR SEQILRDKNGKEIPVLVHTAPILNNEGEVELVLELSVDEQESRRLREELRVTRERFRQLF EESPCYVSVVGRDLQVLEANRAFRHTFGMQSGKRCYELFAGSSAPCDECPVQRSFQDGKP CHTEKVVIDHEGRPVNVLVWTAPLRDTSGNITECIEMATDITELRQLQDRLASLGLLMGS TAHGIKGMLTALDGSVYRLGSGIDKGDEARTRDSLKDMRQLVARLRKMVLDILYFAKERK LDWNILVAEEFMRSVVSTVSDKAAEAGIAMTLEADGDPGTFEADASALSAALVNLLENAV DACRADGSKKEHYVRVSVKGLPDAVEVRIEDNGIGMTEETRTKLFTLFFSSKGKGGTGIG LYVARQVVMQHGGRIEVASVKGEGSTFTVTMPRILAK >gi|316923723|gb|ADCP01000067.1| GENE 17 19422 - 19928 359 168 aa, chain - ## HITS:1 COG:AGl216 KEGG:ns NR:ns ## COG: AGl216 COG0642 # Protein_GI_number: 15890218 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 39 153 407 522 526 66 37.0 3e-11 MPCGPRAARPPCGQASGRPAASLSIALQFQDAADSDPIPLDAERLEGCLLNLVGNALEAF PAPGLRARPAEVRVEIARTPDALIYDVRDNGTGLSAEAAERLRDGLFTTKKDGTGFGLLG TRKALLEMGGRLLWENLPDSGALFRIVLPLRVERTPAAAEANHLREDS >gi|316923723|gb|ADCP01000067.1| GENE 18 19859 - 20602 943 247 aa, chain - ## HITS:1 COG:SMb20613 KEGG:ns NR:ns ## COG: SMb20613 COG2204 # Protein_GI_number: 16265273 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Sinorhizobium meliloti # 1 138 1 141 460 98 39.0 1e-20 MDTSVRVLLVDDEPGIRHVLGALIRDFGYDVHTAESAREALEAFTANPFPIVVTDIRMPG MDGLALLERVRNQNPDTQVVMITGHGDMDNAVECLRLGAADFIAKPVNDDLLEHSLKRAA EQYSLREQIRRHTEHLEELVASRTRELLEAHRVAVVGETVACMAHTIKNLAAALEGSLFV LKQGMESGNREYLDDGWAMLEEDISRVRDKLLHLLRIGQQTELRECLVDPVQPVRHVVKR LEGRPPV >gi|316923723|gb|ADCP01000067.1| GENE 19 20617 - 22479 2424 620 aa, chain - 
## HITS:1 COG:BMEI0947_2 KEGG:ns NR:ns ## COG: BMEI0947_2 COG0642 # Protein_GI_number: 17987230 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Brucella melitensis # 347 601 1 259 269 151 38.0 3e-36 MHSRFGAYRLMVKTITMTALLFLAGVGAVLWLMPYEWLWIVLPAVTLLLCAVSVALYWYE VRLPIGRMIASMQAMREGQAVTLDDARRDDEFGQLARAIVEFSIRNQHHLEEVRRRKDDF QRLFDMVPCGISVQDREYRLLRWNHSFAVRYDPQPGNTCYEVYKGRTTPCPECSVQRTWE EGAIQCNQESRVNPDGTRDYWFVQTVPLFDKDGNVSSVMEMSIDMTLIHTLQHQLQASER THKAIFDSIPNAVFLLDAAELTILDCNPASVKMYGRQRENELIGRGILDLFLPEEREQYA SQLRAFTVFSGVTNIKADGTPLRVDIRSASAMIENRRVRILCATDVTERIEMEQKFIQAG KMATLGEMATGVAHELNQPLTVIKGAASYFLRKTRRSEPIAPETLSELSVEISGQVDRAS DIINHMRAFGRKSDLALLDTDINGVVAQACDLFGRQLVVHGITLETSLAPALPPVLAIPN RLEQVIVNLILNARDAVEERVKSAPEPPAVIGVSTAMDGNTVLLSVWDTGTGIPAHLLNK IFEPFFTTKPVGKGTGLGLSIIYGLVKDFGGSISARNRDEGGALFEIRLPITRRGDLAPN APADGNRAGGVSGPQPEQIS >gi|316923723|gb|ADCP01000067.1| GENE 20 22531 - 22920 613 129 aa, chain - ## HITS:1 COG:all1736 KEGG:ns NR:ns ## COG: all1736 COG2197 # Protein_GI_number: 17229228 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Nostoc sp. 
PCC 7120 # 1 121 1 125 231 67 28.0 7e-12 MPKKILTIDDDPYIVKYITEVLSDNGYATCSASSVEEALSVLEAERPDLVTLDMEMPDEW GPRFYRKMSMNPAFKDTPVIVISGLQGIHLAIRNAVATLQKPFDPEELLSIVNRVLESKD AARKAMDGE >gi|316923723|gb|ADCP01000067.1| GENE 21 22943 - 23350 387 135 aa, chain - ## HITS:1 COG:TM0468 KEGG:ns NR:ns ## COG: TM0468 COG0784 # Protein_GI_number: 15643234 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Thermotoga maritima # 4 131 3 122 122 70 32.0 6e-13 MDGRRALVIDDEVHFRYFLRVLLEKAGFAVKMARNGIEALDVLRSWRPDVITLDVMMPEQ SGLTFYGAVCRDEDWKRIPVVMLSAVPISVREHAFATIGLTQGPLPAPAACLEKPCTPEA LLEVVNRLVPEPTTA >gi|316923723|gb|ADCP01000067.1| GENE 22 23352 - 24269 1267 305 aa, chain - ## HITS:1 COG:MJ0531 KEGG:ns NR:ns ## COG: MJ0531 COG0589 # Protein_GI_number: 15668711 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Methanococcus jannaschii # 161 303 24 168 170 62 28.0 1e-09 MLFKNILFASSGTAGSDNAAKLAFGLAEKQQAELTLYHCCGVPSRGFTTNVTDTRSGSLE QPDEAYQEMVREELNTAYAYQIEACTRPFRMELSVGMPSTEILRYARQIKPDLIVMGASS AASDPNAARMRSIVGNTVRAVAKKAPCPVLIVNRPCTTCWHLFSNIVFCTDFSEAANHAF RFALNTARQLNAKLYITHAVDITSIQGMTMDQSEIERHTEQMREKIDKLYLSKLDGFANA EVIVREGIPYVEILKVARENEADLIVMAHHSSDLSDDDADIGSTVEQVVLRSACPVASVT RPEAR >gi|316923723|gb|ADCP01000067.1| GENE 23 24274 - 24792 622 172 aa, chain - ## HITS:1 COG:no KEGG:Dvul_2716 NR:ns ## KEGG: Dvul_2716 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 170 1 182 184 127 43.0 2e-28 MDAIRLHSDGTLSSEAPFAGLLRMRLELDGDVTLRSFFRLLKRNPLARELVPGLGDALAE AEACPASGCRYPGLEALRLGKRVELTGFPGTPRVDVYLAVEGVGDELPELRFFRLRDMLD TPLQTGTARHVLLGETATALDAETSFTVFDLLEGMGWELGFQGGSLTCNLKE >gi|316923723|gb|ADCP01000067.1| GENE 24 24802 - 25170 363 122 aa, chain - ## HITS:1 COG:no KEGG:Dde_3710 NR:ns ## KEGG: Dde_3710 # Name: not_defined # Def: acidic cytochrome c3 # Organism: D.desulfuricans # Pathway: not_defined # 26 122 28 124 128 103 49.0 2e-21 
MPLHALARAGLAVTLLLGLVGTAHALTLKSEEFKTHRRPAVTFDHDNHNERAKIEDCIVC HHGGENGVIDPEVSSEDQPCSECHKANMPSGRTPLMRAYHKNCIECHTAQNKGPTTCGAC HK >gi|316923723|gb|ADCP01000067.1| GENE 25 25277 - 26593 1796 438 aa, chain - ## HITS:1 COG:AF0547 KEGG:ns NR:ns ## COG: AF0547 COG0247 # Protein_GI_number: 11498157 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Archaeoglobus fulgidus # 42 429 65 470 477 187 29.0 4e-47 MGIEKRLIEDANLTRGMQLATPERITKVIRDVLKGEAGARVKVYEQTCMRCGACAKACHF SLSHPDDASYTPVAKLDKTIFKMVRESGKLNAEQMRGIAQIAHTECNMCRRCIHYCPVGV DIAYLMSLVRRICNKLGITPTFIQDTANSHSATFNQMWVREDEWIDSLVWQEEEAREEFP GIRIPLDKEGADFMYSVIAPEPKFRTQLIYQAAAIFHQAGSDWTMPSRPGWDNSDMCMFT GDYEMMGRIKRAHFELAQKLRVKRIVMGECGHAFRSVYDQGNRWLGWKDSPVPVVHAIEF YWELINEGKIKITHQFEDPVTIHDPCNTIRGRGLADKLRDVVHFLCANVVEMTPNREHNF CCSAGGGIINCGPPFKSVRMEGNRVKADQLRNTGVHTVVAPCHNCHGGLEDIIKHYKLGM HTKFIGDLIYELMEKPEV >gi|316923723|gb|ADCP01000067.1| GENE 26 26609 - 27268 976 219 aa, chain - ## HITS:1 COG:no KEGG:Dde_3708 NR:ns ## KEGG: Dde_3708 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 1 219 1 219 219 207 51.0 2e-52 MNALYNFAVGPLAWIAFTVCIGGMIWRLWSLGRLARQRDASALAYMNWKDSLTSIIRWTL PFSTCGWRENPLLTVANFVLHIGILLAVLFYSAHNVMWDYNFGFSLPSLPDGFMDAVTLA TILACLVLGWRRLAVPASSYVTRQADWFSLLLVTGVMVTGFASAHGFGSESFMALLHVLF GEAILICLPFTRLSHAILIPFTRAYMGSESIGVRRTCDW >gi|316923723|gb|ADCP01000067.1| GENE 27 27296 - 28540 1555 414 aa, chain - ## HITS:1 COG:no KEGG:DVU0266 NR:ns ## KEGG: DVU0266 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 27 412 18 403 408 386 48.0 1e-106 MEDQKRYDWEPSVRKLGETEMLPETAWQEEWQISPDCESFAAVSALEDGTFTVRLNGALW ETRTDKMYNCQFAPDGRLTALSQSDGEWAIVVDDAESEEHADYLWGTRFNKAGTIAVPMQ TGMEYGMLVDGAPWEQLYTAATDFVLSETGKTAAVVQTAGLGQADLEGFSKGIYTIAVDG QAWEECYLNAWSPCFDREGHRVASTVRVTPYEYTISINGQRWSETYPCAWEPIFEPKSGD VIAPIRKEGKWGLARNGSLFWKPMFAQCWAPQAAATDGEYIWAVAAPSYGAFTVACNATP WNCRFPSVTDLVLSPDGKHAAALGSQNNSRFQIAVDGKVWDDTFDMAWPVVFSPAGDRAA 
AKVRRDGKFALYVDGNAVIENLDGVWNPTFSPDGTVLLFCSLKDGVFSRHTVRL >gi|316923723|gb|ADCP01000067.1| GENE 28 29589 - 30563 1208 324 aa, chain + ## HITS:1 COG:mll1508 KEGG:ns NR:ns ## COG: mll1508 COG1879 # Protein_GI_number: 13471512 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 29 324 43 335 335 185 37.0 9e-47 MKRILMLGLLALSLLCGTAFAKTGEGMTIWFDTGGSVGDGYGTIVQNGAKAAAADMGCEL RLLYSDWNPETMITHFRNAVAARPDGIVVMGHPGDAAFGPLVDAAVKQGIVVTSIDTALP ETQARHRAKGFGYVGTDNYTQGKALAEEVLRRTNLKKGDRAFVWGLKRIPDRGRRAQAIV EMLEQAGVTVDYQEITPEIDKDPVLGAPVVAGQLGRHPDTKVIIVDHGSLTAQMGNHLKN AGVKPNTVYVAGFSLSPATAAAIENGYVQLVSEAQPYVMGYFGVVQTVLSKKYGFTGFSI DTGGGIVDAANIGAVSAFAKQGLR >gi|316923723|gb|ADCP01000067.1| GENE 29 30922 - 31680 262 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 4 239 1 226 245 105 31 5e-22 MGMLLSVRGIRKSFGAVAALRHADLDMAEGEVVALLGDNGAGKSTFVRLLSGAGRPDGGT FRFGGNPVDLAKYSVRKARDMGIETVHQDRCLCEGQSLWRNMFVGRHVRTRWGFIDIAAE REATRVMLGEWLGLGGAGLDPDADVRVLSGGERQALAIGRAMYFGARLVILDEPTTALSL KEVKRVLEFIQALRDKGRSVLLVSHHVHQAYDVADRFLFMDKGRTVGEVRREGTSPADLT ERLLSLAEGRGA >gi|316923723|gb|ADCP01000067.1| GENE 30 31680 - 32687 915 335 aa, chain + ## HITS:1 COG:mll1505 KEGG:ns NR:ns ## COG: mll1505 COG1172 # Protein_GI_number: 13471510 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Mesorhizobium loti # 7 312 29 333 344 172 35.0 9e-43 MGGRIRRALPASTRNQLWLTGLFLALLALFMMTGARAFLAWPMYESLLSTLPLYGFLALG LTLVVAAGEMDLCFPSTLAVAGFGFALTAIHTGQVWLGVIVAILIGLGIGLCNGFLVAYV GVPSIIATIGTQFFWRGAVMLLSQGLALGLADLGGSPTHSVLAGRLGGFLPMQSLWLIVM ALALWLLLNRHPLGDAIRFTGEQADIASRLGINVPAARLGVHTLMGGIAGFAGAIGCMEM GSWWPTQGDGYMLLVFAAVFIGGTSVFGGSGRLYGTLIGIAIVGMIEAGIVSSGLSGFWT RAVHGMVIVVAVSSYAILSGNADERILRLVRWRKR >gi|316923723|gb|ADCP01000067.1| GENE 31 32851 - 33930 1085 359 
aa, chain - ## HITS:1 COG:CAC1797 KEGG:ns NR:ns ## COG: CAC1797 COG0821 # Protein_GI_number: 15895073 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Clostridium acetobutylicum # 3 352 1 349 349 389 58.0 1e-108 MELPRKQTRQLKLGSVRIGGGAPIVVQSMANTDTRDVEATLGQIKRLHDAGCEIVRVAVP DETAARALRAIHDASPIPVIADIHFDYRLALIALEVGLEGLRINPGNIGERKNVEMVVDA AKARGAVIRVGVNSGSVEKRLLEQYGGPTPQAMVESALGHVRILEEHGFYDTKISIKSSS VLNTIECYRLLSQRCDYPLHLGVTEAGGVLRGAIKSSVGMGVLLSEGIGDTLRVSLTAAP EEEMTVAWELLRALGLRQRGPEIISCPTCGRTEIDLIGLAQEVERRLRTENAPIKVAVMG CVVNGPGEAREADLGMAGGRDKGIIFRKGEVIRSVRGQEALLAAFMEELDKLLVERRDL >gi|316923723|gb|ADCP01000067.1| GENE 32 34303 - 37872 4884 1189 aa, chain - ## HITS:1 COG:all0635 KEGG:ns NR:ns ## COG: all0635 COG0574 # Protein_GI_number: 17228131 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Nostoc sp. PCC 7120 # 152 655 90 573 821 221 33.0 9e-57 MTKTPSATKNRQDEVPVEDILKRLVLTGADIVRIGEDAELLVGGKNYNTALISRVEGVRI PQFRAISSVAFHIVLDECKVCAALIRSMVDEAYNRIDWASPEVTKDHEFLPKFVRSVAAE IRKASEADPELPHIRLRTFVNNVVDGFASSQEGIDQLRKRSVMVQAAILSCPLPERVDKA VRSAYRDICAEAQESDVPVAVRSSAAGEDSRKKAFAGLQDTFLNMVGDDAVATAYLWDCA SAYNLRSMIYRREAILDALTQSETTGQEELAAQAKKEWSIENTSLSVCIMRMINPVVSGT AFSADTATGCRGTVRKDLVSIDTSYGLGEAVVGGRVTPDKLYVYQKDDGSEVVIRFMGSK TMKIVYDENGGTKEVPVPERECMLWALTPTQAEQVAKGVRAVSKAYDGMIMDTEFCIDSK GMLWFVQARPETRWNEELALHPHTIFMRRREVEPKAAAAAEILLTGNGASRGAGQGKVRF LRSALELNRVGKGEILAAERTDPDMVPGMRVASAILADVGGDTSHAAITSRELGIAAVIG IQNPTALQALDGLDVTVDGTRGRVYRGLLPLHEVGGEMDVEALPQTKTRVGLVLADIGQA LFLSRLRNVPDFEVGLLRAEFMLGNVGVHPAALEAFDNGELERLVQKKIGELDSKLRKTI REQLDAGLVSLDLRLREHVGNITGFAEELEKLGKKDDLRNPDEIMAVYRRMREVEKLLDE YTERAARFYSTLETSADLAEHVRIIMGYESELLALSGDSPEVKARRAAIEKEIGDQVSYA ALNPAVQETLRQIAALREDVGARCGLAKEIAALRSIPGEIGTLIRARGHKNGHDHYVQTL SQGLALFAMAFYGKPIVYRTTDFKTNEYRNLLGGNLFEDHEDNPMLGYRGVSRDIHDWEI 
ESFKLARSAYGAHNLQLMLPFVRTLEQARAMRGYLDTVHNLRSGEDGLKIILMSELPSNA ILARQFIQEFDGFSIGSNDMTQMVLATDRDNSSLSHIYDEEDPAVVWAILVTIFSGQKFG KKVGFCGQGVANSEVLRGLVAISGITSASVVPDTYYRTKQDFAAAEALNISASGLGKWLG EQHQAKLVKLLEAAGKADVAKLASDPAKIRAWYDAETARLHGELRDSLGGSRENTARKAL KTFRATFHKPVIYAAWNWNETVEDALHQAGFATFEEQAAALAEQRKKLA >gi|316923723|gb|ADCP01000067.1| GENE 33 37917 - 41627 4703 1236 aa, chain - ## HITS:1 COG:SA0963 KEGG:ns NR:ns ## COG: SA0963 COG1038 # Protein_GI_number: 15926699 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Staphylococcus aureus N315 # 16 1202 5 1145 1150 477 30.0 1e-134 MAAKTFEQVLDEVRGKPILVANRGIPARRICRAIRERFAAVPVMTATDIDKAAPAASSAQ ELLLLGSDPRAYLDLDLIIAKAKQRGIIAIHPGWGFASEDERFPQKCEEAGITFIGSKAD SMNLLGNKVQVRKIAKRLGIPVVPGSEGAVDVPAARKVVDEITLPIMLKAEGGGGGRGIF LVREKSELEDAFFKASAMAQASFGNPRLYVEKYLEQVRHIEIQVIADQHGNVFAFDERDC SVQRNHQKLVEITPSPWAGITPELRERLKEYSRMLVREVGYYSLATVEFLVTPQGEPYLI EVNTRLQVEHGITESRYGVDLVEEQIAVAFGAKLRLTEENTKPSHFAMQVRINCEDPQNN FAPNSGLITRYVSPGGPGVRIDSNLSAGYEFPPNYDSAGSLLIAYGRDWQKVLGIMDRAL TEYMVGGVKTTIPFYRQVLKHPAFRAGMFDTGFIPSAPELMIYADLAPESERLAKLIAEI SAKGYNPFVQLGEYRSHTTPRLPRQDVVLPAIPSKVRKEPSPYPHGDRVALLDYVRDSGR VHFTDTTTRDSTQSNSGNRFRLAEDRLVGPYLDNCGFFSLENGGGAHFHVAMMANMTYPF TEAKEWNKFAPKTLKQILIRSTNVLGYSPQPKNLMRLTGEMICDNFQIIRCFDFLNHIDN MRPFAEVALSRRDVVFEPAISMSWANGFDVAHYLGVAENVLSVCGDVAGMSEKEVSRHII LGLKDMAGVCPPRFMTEVVTALRKRWPELVLHYHRHMTDGLFVPSVGAAAKAGVQIVDTN LGACVRSYGQGDTLATAAYMEGELGLKTAMNKDMVRDANFVLKQVIPYYDRYCAPYFQGI DNDVTEHAMPGGATSSSQEGALKQGYIHLLPYMLKFLAGTRKLVRYHDVTPGSQITWNTA FLAVTGAYKRGGEEEVKYLLGVLDRVNDVPDEAELSEGTRAARLALYQDCNDAFRNLLLG KFGKLPLGFPPDWVYESAFGNAWKQAIADRTTHSPLEKLTDMDIEAERKAFRDIIKREPT EEEMVMYLNHPGDAVKTVQFRAKYGDPNRLPLPVWFEGCSVGEEINFVDTSGKPHQFLLV SMSPANDAGESMVRFVLDSEIFSQLIKVAPPKNGGAGGAVMADPTDKYQVGAPSNGDLWV MYVHPGDIVEEGEELFNVSIMKQEKAVLAPVAGIVKRVLKTADYKETRKMETVREGELIV ELGPVPRICTNEACSRPLPMNDIEFCPYCGERLSRI >gi|316923723|gb|ADCP01000067.1| GENE 34 41777 - 42841 598 354 aa, chain - ## HITS:1 COG:VC0319_2 KEGG:ns NR:ns ## COG: 
VC0319_2 COG0340 # Protein_GI_number: 15640346 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Vibrio cholerae # 105 349 10 235 237 75 31.0 2e-13 MTHDSTFRLNGTVRLLHEGVHSERPPHDRLADPFPADHAAGSALWNGLPEAAFTMPGTST ACPYLEQFFTAPVSQTENTPETASVPAPVYLCGSATSSMDVARALAAHGQLPVWGSVLAL SQRSGRGQLGRNWVSPEGNVYAALRLPQSHPFTGTAAAPAVGGLIAEALTHMGFDVHMKW PNDLLRREEQEEGPGWCKVGGILLEERPLPPHRETPAENLLVAGIGLNLVSSPPAALMRA QRAVPAGLLSSATKSNVSPLSVAGLWMRLVSRIFFCYVEEIDAKGKNAWRSLAERHLAFL GQTVLLTDGPDEQERHTGILEGLDDFGGLRLRNRKGTNSFLSGSLRLDRPPMQP >gi|316923723|gb|ADCP01000067.1| GENE 35 42840 - 44003 854 387 aa, chain + ## HITS:1 COG:HI1606 KEGG:ns NR:ns ## COG: HI1606 COG0617 # Protein_GI_number: 16273496 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Haemophilus influenzae # 12 305 1 324 416 149 34.0 6e-36 MAIENWPETGRIPVYLVGGAVRDRLLGRTPREYDFAFDADEATFLQRNPEARKVGKSVSV FLLRGQEFMPLEGTLEEDMRRRDLTINALAEDRDGNLLGHPHALSDLREGILRPASPTSF RDEPVRVFRLARMACELPSFTVHPEAVAQMRAVVGEGLLGRIPAERVGRELMKALASPKP SRWLSVLAEGNCLSPWFRELETSMDIPAGPVAYHSGSVMAHLMDVMDAVAGDPLCVWMAL CHDLGKIGTDPALLPHHYGHELRGAEPAEHVAKRLALPARYGAAGVLSSQLHMKAGIYEM LRAGTRCDLLMQVHNAELDGPFWKLAEADSGRSLRPVVNDDLAVLLRISLPPEWRDRGKE SGRRLRAMRCQALAMHMAKRKRENADG >gi|316923723|gb|ADCP01000067.1| GENE 36 43996 - 45648 1360 550 aa, chain + ## HITS:1 COG:AF0691 KEGG:ns NR:ns ## COG: AF0691 COG3894 # Protein_GI_number: 11498299 # Func_class: R General function prediction only # Function: Uncharacterized metal-binding protein # Organism: Archaeoglobus fulgidus # 117 543 131 564 585 134 28.0 6e-31 MADGAGMAGAPLDVSGRKPGAAGGIHAVDSDAAVVVDAVHAAQGVCVVDSSGLNHILPVE AGQTLAQTIWLSGELSPPALCSGLGRCGACRVRFFEGTPKPCDADSAILGAEAVRGGWRL ACRHAAVAGMVVELPPPPSEKRKRMDIRPDAGPFRLAVDLGTTSIHWRLLDGTGHEAASG QALNPQMGAGSDVVSRLAAARNQEGRERLGHLVLRFLQRVVSDVGVPVAELCIAGNTAMT SILLNEDVAGLCAAPYRLTEPGGRTAELPGLPPAWIPPQPAPFVGGDISAGMAALLYGEP 
PEFPFLLADMGTNGEFVLALDKERSFIASVPLGPSLEGIGLRYGGVADTGSVSGFRLGPF GLSPVVIGNTEPKRICGTGYLSLLDALLRTGFLDATGRLASASVSPLAARLLGTVERGAA GWSLPLPGGMELAGADVEEILKVKAAFSLALESLLAASGLESRALARVCLGGALGEHMPE TALERLGFLPQGLQARTVAEGNTSLRGAALLLTRPELRERLVRWSSGCTLVDLAARPDFT ALYMRHMVFG >gi|316923723|gb|ADCP01000067.1| GENE 37 45738 - 46916 1617 392 aa, chain + ## HITS:1 COG:XF0823 KEGG:ns NR:ns ## COG: XF0823 COG0126 # Protein_GI_number: 15837425 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Xylella fastidiosa 9a5c # 1 391 110 500 500 453 61.0 1e-127 MNVRLMSDLDLQGKTVVFREDLNVPVKEGKITSDKRIRAAIPSLRFALDKGAGIIVLSHL GRPEEGVYSEEASLAPVAKRLEELLGVPVRLEKDYLDGVSVKPGECVLCENVRFNKGEKK NNEEVARKLAALGDVYVMDAFATAHRAQASTEGAIRFAKVACVGLLMAAELEAVTRILHA PKHPLLAIIGGSKVSTKLEVLNNLSKRVDKLIVGGGIANTFLKAAGYEVGKSLYEPDLVE SAAKIMEEAKGRGAAIPLPVDVVVGPALEEHAPATVLKVEEVRPDDMILDIGPATAAMYA QIIAEAGTVVWNGPVGAFEIDQFGKGTEAIAKAVAETSAYTVTGGGDSIAALEKYGYASQ VDYISTAGGAFLEVLEGKTLPAVAALEARAAS >gi|316923723|gb|ADCP01000067.1| GENE 38 46922 - 47218 69 98 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPCRVLDRRGHGPYRAVGTRRIRLAEARTRPTGAAFSGTVLRIVSGTPLPCMAVPGGTFL IGATVLPGCGALFRRATGRGTRGPLFGSPRVVSTFRGG >gi|316923723|gb|ADCP01000067.1| GENE 39 47222 - 48415 1009 397 aa, chain + ## HITS:1 COG:no KEGG:LI0463 NR:ns ## KEGG: LI0463 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 3 389 32 409 416 315 42.0 2e-84 MCRPLFSPLDAFCRNALAQLPKALDAVMPLSRNHRADLPEAIRDLSAMLTYERGGLGRSY WSAPRYVSAYLRYFLPWNLVRLTGLLPGLDLPVPVCDAETPVTIADLGSGPLTLPIALWL SRPDWRAVPLTLVCVDTVPRPMEMGRSILEHMAKLSGEPLNWTIRLVRSPLMQSFRELRS PYLLMAGNVLNELKDKPGVSVDERMADLAVAVGRTLHPEGTALFVEPGTRLGGTLTAKLR ETALEEGLTPVAPCPHLGPCPLLETRERRWCHASQLAVAPAWLADLARYAKLPKDSLSLS FMQLRPESEAAPIAKTALFPAMDPNGVVARILSESFPVPGMGHARYACTEDGFAIIPAAG DIPSGALVACRRPASPRKDAKTGAVELLWQPEQKPQR >gi|316923723|gb|ADCP01000067.1| GENE 40 48434 - 49828 1611 464 aa, chain + ## HITS:1 COG:YPO2984 KEGG:ns NR:ns ## COG: YPO2984 COG0008 # 
Protein_GI_number: 16123165 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Yersinia pestis # 3 464 2 463 471 450 49.0 1e-126 MSKIITRFAPSPTGHLHIGGARTALFCWLLARHYGGEFRLRIEDTDTERSKQEYTDAILA SMRWLGLDWDGELTYQTQRTDRYNEVIDKMIESGHAYWCSCTPEEVEAMREAARAKGEKP RYDGRCRERGLDAGPGRVVRLKIPTEGRVVFDDMVKGHIATDVSELDDMILRRSDGMPTY NMAVVVDDHDMGITHVIRGDDHVSNTPKQILIYQALGWELPVFGHVPMILGKDRQKLSKR HGARSVVEYQNDGLLPHALVNCLVRLGWSHGDQELFSMQELIDLFDGKNLNSSASAFDPD KLLWFNAHYLRETPLDDLARLVLPFIHQKGFTGATEASIEPLVPLYRERAKNLIELADGI AQLLYKSADLPYDEAGVAKWLTDEGKEHVKVIRDQLAALPSFDKESIEHVIHSYVESLGV KFKMVAQPVRVAITGVIGGPGLPEFMLAIGKDETLARMDRGLTL >gi|316923723|gb|ADCP01000067.1| GENE 41 49936 - 50391 -226 151 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGAKPEARYDTQPKPVDASRESFSDGGSIPPASTIQKICCPTWSNEGRNLSNTKRFRPFL YLLALKTSHYHKNEEGKNEDDCQASPPSRTLVFGIPPLPFCWMSDQRLIQPLRRKPGYNC IPACLRGVIKERSSRYQAIPPLTPKCRTGSR Prediction of potential genes in microbial genomes Time: Fri May 13 02:47:06 2011 Seq name: gi|316923559|gb|ADCP01000068.1| Bilophila wadsworthia 3_1_6 cont1.68, whole genome shotgun sequence Length of sequence - 206859 bp Number of predicted genes - 177, with homology - 154 Number of transcription units - 98, operones - 36 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 38 - 66 2.1 1 1 Tu 1 . - CDS 97 - 1539 2096 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 1588 - 1647 3.6 - Term 2108 - 2144 5.6 2 2 Tu 1 . - CDS 2166 - 3185 1045 ## COG0628 Predicted permease - Term 3363 - 3398 0.4 3 3 Op 1 . - CDS 3456 - 3737 349 ## Dvul_1309 hypothetical protein 4 3 Op 2 11/0.000 - CDS 3818 - 4225 346 ## COG0239 Integral membrane protein possibly involved in chromosome condensation 5 3 Op 3 . - CDS 4239 - 4613 387 ## COG0239 Integral membrane protein possibly involved in chromosome condensation 6 3 Op 4 . 
- CDS 4613 - 5371 900 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 7 4 Op 1 . - CDS 5610 - 6110 525 ## CYA_1382 hypothetical protein 8 4 Op 2 17/0.000 - CDS 6098 - 6868 533 ## COG1116 ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component 9 4 Op 3 . - CDS 6865 - 7800 1195 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 7923 - 7982 2.2 10 5 Tu 1 . + CDS 8168 - 8722 792 ## COG0590 Cytosine/adenosine deaminases + Term 8750 - 8790 7.2 - Term 8739 - 8775 3.2 11 6 Tu 1 . - CDS 8937 - 10340 1560 ## COG0739 Membrane proteins related to metalloendopeptidases - Prom 10460 - 10519 1.7 + TRNA 10770 - 10856 59.4 # Leu CAG 0 0 12 7 Op 1 . + CDS 11121 - 12320 420 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 13 7 Op 2 . + CDS 12335 - 12523 160 ## + Prom 12537 - 12596 2.1 14 8 Tu 1 . + CDS 12651 - 13631 461 ## gi|302862827|gb|EFL85759.1| hypothetical protein HMPREF0326_01462 15 9 Op 1 . + CDS 13757 - 13975 229 ## XALc_1802 hypothetical protein 16 9 Op 2 . + CDS 14003 - 16654 1912 ## COG5519 Superfamily II helicase and inactivated derivatives 17 10 Tu 1 . - CDS 16719 - 16913 90 ## - Prom 17148 - 17207 2.0 18 11 Tu 1 . + CDS 16824 - 18050 853 ## LKI_10806 recombination protein 19 12 Tu 1 . + CDS 18197 - 18550 351 ## gi|302861370|gb|EFL84308.1| putative protein MobC + Prom 18556 - 18615 2.5 20 13 Op 1 5/0.042 + CDS 18652 - 21048 1634 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 21 13 Op 2 27/0.000 + CDS 21067 - 22527 1528 ## COG0286 Type I restriction-modification system methyltransferase subunit + Term 22678 - 22718 1.3 + Prom 22877 - 22936 4.1 22 13 Op 3 . + CDS 22956 - 23822 -118 ## COG0732 Restriction endonuclease S subunits + Prom 24423 - 24482 8.1 23 14 Tu 1 . + CDS 24548 - 24667 68 ## + Term 24769 - 24814 -0.8 + Prom 24762 - 24821 5.1 24 15 Tu 1 . 
+ CDS 24892 - 25095 59 ## + Prom 25381 - 25440 5.9 25 16 Op 1 . + CDS 25464 - 26069 -324 ## COG3344 Retron-type reverse transcriptase + Prom 26072 - 26131 4.6 26 16 Op 2 . + CDS 26159 - 26362 148 ## - Term 26539 - 26576 0.1 27 17 Tu 1 . - CDS 26582 - 27508 298 ## COG3177 Uncharacterized conserved protein - Prom 27625 - 27684 3.3 - Term 27696 - 27730 -0.5 28 18 Tu 1 . - CDS 27733 - 27972 112 ## Cag_1364 hypothetical protein - Prom 28122 - 28181 7.8 + Prom 28613 - 28672 2.0 29 19 Tu 1 . + CDS 28759 - 29193 442 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain + Term 29197 - 29244 1.2 + Prom 29284 - 29343 2.5 30 20 Op 1 . + CDS 29390 - 30361 915 ## COG2768 Uncharacterized Fe-S center protein 31 20 Op 2 . + CDS 30492 - 31613 617 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold + Term 31633 - 31694 11.2 - Term 31626 - 31678 10.1 32 21 Tu 1 . - CDS 31770 - 32864 1353 ## COG0180 Tryptophanyl-tRNA synthetase + Prom 33463 - 33522 6.4 33 22 Op 1 . + CDS 33693 - 34970 1599 ## COG0045 Succinyl-CoA synthetase, beta subunit 34 22 Op 2 . + CDS 34975 - 35262 377 ## Moth_0398 hypothetical protein 35 22 Op 3 . + CDS 35311 - 36207 1066 ## COG0074 Succinyl-CoA synthetase, alpha subunit 36 22 Op 4 . + CDS 36263 - 37147 1387 ## COG2301 Citrate lyase beta subunit 37 23 Op 1 . + CDS 37596 - 38840 1655 ## COG0281 Malic enzyme 38 23 Op 2 15/0.000 + CDS 38886 - 40277 1653 ## COG0277 FAD/FMN-containing dehydrogenases 39 23 Op 3 . + CDS 40526 - 41815 1165 ## COG0247 Fe-S oxidoreductase + Term 41838 - 41884 5.8 - Term 41882 - 41915 -0.4 40 24 Tu 1 . - CDS 42010 - 43443 1648 ## COG0477 Permeases of the major facilitator superfamily + Prom 43857 - 43916 2.9 41 25 Tu 1 . + CDS 44145 - 45404 1347 ## COG0814 Amino acid permeases + Term 45427 - 45467 6.2 + Prom 45503 - 45562 3.1 42 26 Tu 1 . + CDS 45622 - 46047 597 ## Dde_1828 hypothetical protein - Term 45927 - 45958 -1.0 43 27 Op 1 . 
- CDS 46197 - 47345 1481 ## COG1454 Alcohol dehydrogenase, class IV 44 27 Op 2 . - CDS 47384 - 48388 1474 ## COG1087 UDP-glucose 4-epimerase + Prom 48035 - 48094 1.8 45 28 Tu 1 . + CDS 48284 - 48541 69 ## + Prom 48711 - 48770 2.3 46 29 Tu 1 . + CDS 48805 - 49569 247 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 47 30 Tu 1 . - CDS 49704 - 50633 543 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 50657 - 50684 1.5 48 31 Op 1 . - CDS 50698 - 52002 1254 ## Dacet_1613 hypothetical protein 49 31 Op 2 . - CDS 52005 - 52793 889 ## Dacet_1614 hypothetical protein 50 31 Op 3 . - CDS 52808 - 53560 609 ## COG2091 Phosphopantetheinyl transferase - Prom 53591 - 53650 5.1 + Prom 53702 - 53761 3.9 51 32 Op 1 . + CDS 53868 - 54650 498 ## AM1_3384 hypothetical protein 52 32 Op 2 . + CDS 54653 - 56620 1942 ## COG0370 Fe2+ transport system protein B 53 32 Op 3 . + CDS 56642 - 62929 7989 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins 54 32 Op 4 . + CDS 62926 - 65049 2797 ## COG1629 Outer membrane receptor proteins, mostly Fe transport + Term 65094 - 65133 12.1 55 33 Op 1 . + CDS 65178 - 67574 2637 ## COG1033 Predicted exporters of the RND superfamily 56 33 Op 2 . + CDS 67571 - 68674 882 ## PA14_54900 hypothetical protein 57 33 Op 3 . + CDS 68671 - 69099 518 ## 58 33 Op 4 . + CDS 69102 - 70271 1336 ## COG3320 Putative dehydrogenase domain of multifunctional non-ribosomal peptide synthetases and related enzymes 59 33 Op 5 . + CDS 70262 - 74695 4241 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 60 33 Op 6 . + CDS 74789 - 75886 1317 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 61 33 Op 7 . + CDS 75934 - 76833 940 ## COG1045 Serine acetyltransferase 62 33 Op 8 35/0.000 + CDS 76835 - 78586 256 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 63 33 Op 9 . 
+ CDS 78579 - 80312 195 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P + Term 80333 - 80382 -0.9 - Term 80516 - 80540 -0.3 64 34 Tu 1 . - CDS 80614 - 81036 66 ## + Prom 80559 - 80618 3.1 65 35 Op 1 . + CDS 80839 - 81150 187 ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase 66 35 Op 2 . + CDS 81206 - 81703 242 ## COG0350 Methylated DNA-protein cysteine methyltransferase 67 36 Tu 1 . - CDS 81719 - 82669 770 ## COG0837 Glucokinase - Prom 82711 - 82770 3.9 - Term 82784 - 82827 12.2 68 37 Op 1 13/0.000 - CDS 82891 - 83826 1127 ## COG0320 Lipoate synthase 69 37 Op 2 . - CDS 83768 - 84412 618 ## COG0321 Lipoate-protein ligase B - Prom 84533 - 84592 3.6 + Prom 84739 - 84798 3.5 70 38 Tu 1 . + CDS 84834 - 87482 2893 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase + Term 87671 - 87709 7.1 - Term 87659 - 87696 -0.7 71 39 Tu 1 . - CDS 87734 - 88669 898 ## COG0583 Transcriptional regulator - Prom 88690 - 88749 3.6 + Prom 88661 - 88720 2.4 72 40 Op 1 . + CDS 88740 - 90497 1949 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 73 40 Op 2 . + CDS 90579 - 92267 1505 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 74 40 Op 3 . + CDS 92447 - 93907 1752 ## Dhaf_4598 sodium/sulphate symporter + Term 93932 - 93971 8.0 + Prom 94175 - 94234 5.3 75 41 Tu 1 . + CDS 94255 - 94668 317 ## 76 42 Op 1 . + CDS 94961 - 95413 578 ## RSKD131_4191 tripartite ATP-independent periplasmic transporter, DctQ component 77 42 Op 2 . + CDS 95417 - 96685 666 ## PROTEIN SUPPORTED gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 78 42 Op 3 . + CDS 96753 - 97292 112 ## DVU2560 hypothetical protein 79 42 Op 4 . + CDS 97295 - 98017 286 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 80 42 Op 5 . + CDS 98025 - 98411 511 ## DvMF_1005 acyl carrier protein, putative + Term 98522 - 98554 0.6 81 43 Tu 1 . 
+ CDS 98716 - 99615 328 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase + Prom 100022 - 100081 1.5 82 44 Tu 1 . + CDS 100102 - 101274 657 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes + Term 101297 - 101346 0.7 83 45 Tu 1 . - CDS 101473 - 102174 758 ## COG2186 Transcriptional regulators - Prom 102360 - 102419 3.9 84 46 Op 1 . + CDS 102662 - 103438 1177 ## COG3257 Uncharacterized protein, possibly involved in glyoxylate utilization 85 46 Op 2 . + CDS 103514 - 104539 1159 ## COG0540 Aspartate carbamoyltransferase, catalytic chain + Term 104608 - 104647 -0.2 86 47 Tu 1 . + CDS 104696 - 105649 895 ## COG0549 Carbamate kinase + Term 105819 - 105854 3.1 + Prom 105773 - 105832 7.5 87 48 Op 1 . + CDS 105940 - 107262 1210 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 88 48 Op 2 38/0.000 + CDS 107313 - 108905 2309 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 108957 - 109001 1.3 89 48 Op 3 49/0.000 + CDS 109042 - 110022 279 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 90 48 Op 4 44/0.000 + CDS 110025 - 110900 1084 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 91 48 Op 5 44/0.000 + CDS 110937 - 111920 1045 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 92 48 Op 6 . + CDS 111963 - 112982 1182 ## COG4608 ABC-type oligopeptide transport system, ATPase component 93 48 Op 7 . + CDS 113051 - 113635 862 ## COG0590 Cytosine/adenosine deaminases 94 49 Tu 1 . + CDS 113741 - 114805 1324 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component + Term 114843 - 114877 5.1 95 50 Tu 1 . 
- CDS 115151 - 115510 387 ## - Prom 115720 - 115779 3.4 + Prom 116049 - 116108 4.5 96 51 Op 1 2/0.167 + CDS 116287 - 116862 404 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 97 51 Op 2 1/0.250 + CDS 116880 - 118172 1042 ## COG2010 Cytochrome c, mono- and diheme variants + Term 118193 - 118220 0.5 + Prom 118230 - 118289 2.4 98 52 Tu 1 . + CDS 118331 - 121201 2032 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs - Term 121831 - 121871 4.1 99 53 Tu 1 . - CDS 121896 - 123122 669 ## COG0500 SAM-dependent methyltransferases - Term 123414 - 123460 7.4 100 54 Op 1 . - CDS 123639 - 124484 795 ## COG0157 Nicotinate-nucleotide pyrophosphorylase 101 54 Op 2 23/0.000 - CDS 124481 - 125164 681 ## COG4149 ABC-type molybdate transport system, permease component 102 54 Op 3 . - CDS 125177 - 125899 910 ## COG0725 ABC-type molybdate transport system, periplasmic component 103 54 Op 4 . - CDS 125934 - 126578 182 ## PROTEIN SUPPORTED gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 - Prom 126598 - 126657 1.9 104 55 Tu 1 . - CDS 126662 - 127144 -322 ## - Term 127173 - 127226 15.0 105 56 Op 1 2/0.167 - CDS 127318 - 128766 763 ## COG0531 Amino acid transporters 106 56 Op 2 . 
- CDS 128857 - 129441 483 ## COG1945 Uncharacterized conserved protein + Prom 130055 - 130114 6.8 107 57 Op 1 33/0.000 + CDS 130138 - 131286 946 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 108 57 Op 2 35/0.000 + CDS 131303 - 132334 547 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 109 57 Op 3 8/0.000 + CDS 132336 - 133094 227 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 110 57 Op 4 17/0.000 + CDS 133091 - 133783 278 ## COG0500 SAM-dependent methyltransferases 111 57 Op 5 1/0.250 + CDS 133856 - 134602 587 ## COG0500 SAM-dependent methyltransferases + Term 134787 - 134822 -0.6 + Prom 134837 - 134896 2.7 112 58 Op 1 3/0.167 + CDS 135021 - 135773 444 ## COG0747 ABC-type dipeptide transport system, periplasmic component 113 58 Op 2 38/0.000 + CDS 135767 - 136597 880 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 136602 - 136642 13.5 114 59 Op 1 49/0.000 + CDS 136655 - 137584 808 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 115 59 Op 2 44/0.000 + CDS 137586 - 138425 764 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 116 59 Op 3 17/0.000 + CDS 138422 - 139228 381 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 117 59 Op 4 . + CDS 139219 - 140010 427 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 118 59 Op 5 . + CDS 140088 - 141512 900 ## LI0461 hypothetical protein + Term 141536 - 141578 6.3 - Term 141523 - 141566 6.1 119 60 Op 1 11/0.000 - CDS 141605 - 142273 621 ## COG0378 Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase 120 60 Op 2 . - CDS 142286 - 142639 162 ## COG0375 Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) - Term 142797 - 142828 -0.4 121 61 Op 1 . - CDS 142847 - 144145 791 ## COG1160 Predicted GTPases 122 61 Op 2 . 
- CDS 144147 - 145103 599 ## COG0502 Biotin synthase and related enzymes 123 62 Tu 1 . + CDS 145005 - 145199 130 ## 124 63 Tu 1 . - CDS 145237 - 145641 277 ## Dde_0725 hydrogenase-like - Term 145726 - 145753 -0.1 125 64 Op 1 2/0.167 - CDS 145853 - 146137 233 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 126 64 Op 2 . - CDS 146077 - 146427 85 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes + Prom 146642 - 146701 2.8 127 65 Tu 1 . + CDS 146764 - 146964 89 ## + Term 147029 - 147063 0.2 128 66 Op 1 . - CDS 147493 - 148482 814 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 129 66 Op 2 12/0.000 - CDS 148479 - 149468 936 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 130 66 Op 3 3/0.167 - CDS 149484 - 150059 828 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA 131 66 Op 4 13/0.000 - CDS 150056 - 150730 921 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 132 66 Op 5 12/0.000 - CDS 150741 - 151313 884 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 133 66 Op 6 12/0.000 - CDS 151317 - 152273 1148 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD - Term 152532 - 152581 1.3 134 66 Op 7 . - CDS 152726 - 153949 1435 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC 135 66 Op 8 . - CDS 153960 - 154739 647 ## DVU2791 cytochrome c family protein - Term 154767 - 154809 -0.9 136 67 Op 1 . - CDS 154933 - 155313 360 ## Ddes_1502 ferredoxin hydrogenase (EC:1.12.7.2) 137 67 Op 2 . - CDS 155326 - 156579 938 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 138 68 Tu 1 . - CDS 157094 - 158512 1179 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes - Prom 158616 - 158675 4.5 139 69 Tu 1 . - CDS 159080 - 160075 316 ## COG2801 Transposase and inactivated derivatives - Prom 160119 - 160178 3.0 140 70 Tu 1 . 
+ CDS 160707 - 161642 730 ## COG2207 AraC-type DNA-binding domain-containing proteins 141 71 Tu 1 . - CDS 161948 - 162694 342 ## COG0500 SAM-dependent methyltransferases - Prom 162806 - 162865 2.3 - Term 162774 - 162807 -0.4 142 72 Tu 1 . - CDS 163037 - 163354 80 ## 143 73 Op 1 . - CDS 163464 - 163739 121 ## gi|239628392|ref|ZP_04671423.1| predicted protein 144 73 Op 2 . - CDS 163742 - 164101 164 ## - Prom 164180 - 164239 5.1 - Term 164205 - 164243 5.1 145 74 Op 1 . - CDS 164280 - 165740 1992 ## COG0471 Di- and tricarboxylate transporters 146 74 Op 2 . - CDS 165802 - 166014 257 ## gi|255525440|ref|ZP_05392378.1| nitroreductase 147 74 Op 3 1/0.250 - CDS 166058 - 166585 572 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 148 74 Op 4 . - CDS 166563 - 167747 1309 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit - Prom 167819 - 167878 3.6 + Prom 167770 - 167829 5.2 149 75 Tu 1 . + CDS 167926 - 169851 1660 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Prom 170126 - 170185 4.8 150 76 Tu 1 . + CDS 170287 - 170544 244 ## COG3811 Uncharacterized protein conserved in bacteria + Term 170638 - 170686 12.1 - Term 170587 - 170625 -0.9 151 77 Tu 1 . - CDS 170865 - 171407 385 ## - Prom 171436 - 171495 2.5 + Prom 171408 - 171467 5.5 152 78 Tu 1 . + CDS 171499 - 171744 193 ## + Term 171906 - 171954 7.5 - Term 171894 - 171940 11.4 153 79 Tu 1 . - CDS 172039 - 172695 331 ## + Prom 173043 - 173102 4.9 154 80 Op 1 . + CDS 173220 - 174506 465 ## GYMC10_5081 extracellular solute-binding protein family 1 155 80 Op 2 . + CDS 174503 - 176755 1918 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain + Term 176897 - 176945 13.0 + Prom 176953 - 177012 3.9 156 81 Tu 1 . + CDS 177124 - 178317 1189 ## gi|302862801|gb|EFL85733.1| transporter, major facilitator family + Term 178441 - 178473 -0.7 157 82 Tu 1 . 
- CDS 178448 - 178672 248 ## gi|302862698|gb|EFL85630.1| DNA-binding protein - Prom 178840 - 178899 2.7 158 83 Tu 1 . + CDS 179510 - 180784 925 ## COG2015 Alkyl sulfatase and related hydrolases + Term 180841 - 180878 3.8 + Prom 181114 - 181173 7.3 159 84 Tu 1 . + CDS 181279 - 181521 347 ## DvMF_2842 hypothetical protein + Term 181714 - 181751 7.1 + Prom 181650 - 181709 1.6 160 85 Tu 1 . + CDS 181756 - 182004 98 ## + Term 182057 - 182109 1.0 161 86 Tu 1 . - CDS 182009 - 183466 1782 ## Dhaf_4598 sodium/sulphate symporter 162 87 Op 1 . - CDS 183610 - 185325 1957 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 163 87 Op 2 . - CDS 185337 - 186977 1451 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 187181 - 187240 7.8 + Prom 187176 - 187235 7.1 164 88 Tu 1 . + CDS 187344 - 190412 1916 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains - Term 190511 - 190555 9.1 165 89 Op 1 . - CDS 190568 - 192178 1460 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold - Term 192227 - 192278 9.0 166 89 Op 2 . - CDS 192283 - 193677 1966 ## COG0786 Na+/glutamate symporter - Prom 193773 - 193832 7.2 + Prom 193741 - 193800 5.2 167 90 Tu 1 . + CDS 193967 - 195409 1239 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Term 195346 - 195382 0.4 168 91 Op 1 10/0.000 - CDS 195584 - 196465 193 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 169 91 Op 2 2/0.167 - CDS 196541 - 197251 762 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 170 91 Op 3 . - CDS 197263 - 198186 902 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 198378 - 198437 2.1 171 92 Tu 1 . + CDS 198588 - 199979 1243 ## COG2199 FOG: GGDEF domain + Term 199988 - 200041 9.3 + Prom 200347 - 200406 3.0 172 93 Tu 1 . 
+ CDS 200450 - 200713 74 ## + Prom 200746 - 200805 1.6 173 94 Tu 1 . + CDS 200948 - 201277 362 ## DVU0236 phage integrase family site specific recombinase - Term 201165 - 201208 1.4 174 95 Tu 1 . - CDS 201286 - 201594 290 ## + Prom 201547 - 201606 1.9 175 96 Tu 1 . + CDS 201632 - 201949 99 ## 176 97 Tu 1 . - CDS 202507 - 206634 4162 ## COG4625 Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain 177 98 Tu 1 . + CDS 206666 - 206858 168 ## Predicted protein(s) >gi|316923559|gb|ADCP01000068.1| GENE 1 97 - 1539 2096 480 aa, chain - ## HITS:1 COG:PA0766 KEGG:ns NR:ns ## COG: PA0766 COG0265 # Protein_GI_number: 15595963 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Pseudomonas aeruginosa # 7 473 8 474 474 331 43.0 2e-90 MLHKYKMLPLVAVMLVMSLVFSAQAAPVVPDFVPLVQSAGKAVVNISTEKKVMQRGGMPS EMFRGLPPEFERFFEQFEGNSRNAKPRTQRSLGTGFLISSDGYIVTNNHVVAGADTVLVN LQGSTGKEHSLKAQVIGTDEETDIALLKVKADAPLPFLNFGNSDKMEVGEWVLAIGNPFG LGHTVTAGILSAKGRNIQSGPFDNFLQTDASINPGNSGGPLINMEGKVIGINTAIIASGQ GIGFAIPSSMAERIVNQLKQDKKVSRGWIGVTIQDVDENTAKALGLPEATGALIGSVMPD EPAAKGGMKDGDVVLEVNGQKIDDSSALLRAIATEAPGSKVNMVVWRDGERKNITVQLGE RNLKASSGKSGSAVEQDAPSIGLSVRPLTKEEARTANVKPGTGLLIVGVEPGKLAAEAEL REGDIILSANLKPIKSVEEFSKIIREDAKKRGVVMLQLQRQGQTFFRSLALSDTPDGAKK >gi|316923559|gb|ADCP01000068.1| GENE 2 2166 - 3185 1045 339 aa, chain - ## HITS:1 COG:VC0624 KEGG:ns NR:ns ## COG: VC0624 COG0628 # Protein_GI_number: 15640644 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Vibrio cholerae # 4 327 15 341 361 102 26.0 2e-21 MTRVLACLAIGAWVFLLWPFPITTFMATCMACVTYPAYRKLHRKMSPSWAMTTYTVGLAA ITILPIATVVSLVTPQAVAGLKILDGLRDSGWIHSPEAQAWFASVDAWLKNLPGLEGGLD QLASTAAGLAGTAARTVLAGGVGIAGGAFQAVLVLFLFVMITMMCVTRADLIHEFACRLT QWPAEVIDRFVSTIRKAIFGVLVGVVFVALIQGFLCGVGFAIAEVPQPAFWGLIAAFVAP IPFVGTALVWLPVCIWLWLTGSTVACIGLAIWCALVVAGVDNLLRPFFLKTGIDASVVTL 
ILSILCGLAAFGPVGVFAGPVLVAVAIQAGNESTLCRKH >gi|316923559|gb|ADCP01000068.1| GENE 3 3456 - 3737 349 93 aa, chain - ## HITS:1 COG:no KEGG:Dvul_1309 NR:ns ## KEGG: Dvul_1309 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 91 1 96 98 88 54.0 9e-17 MRTLLMLPLLLLPFTAQAASLLPGGDYPAPDCRSPLRPLPGDSPMDWRMYRSDMEAYRQC VEAYLATARQDAERIRKRMEKAVREYNEESGNL >gi|316923559|gb|ADCP01000068.1| GENE 4 3818 - 4225 346 135 aa, chain - ## HITS:1 COG:CC1981 KEGG:ns NR:ns ## COG: CC1981 COG0239 # Protein_GI_number: 16126224 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Caulobacter vibrioides # 8 120 5 118 127 71 42.0 4e-13 MIFSRNALAVMLGGGAGAVCRVMTGFAVMEKLPSGFPLGVLCVNVLGGFLMGLLQGWMRR TGNTFATGYCLLGTGFLGGFTTFSTFSLDTFLLYRSGDASLAALNIALNMVLCLCAVWGG YALLAPRPERPTAPS >gi|316923559|gb|ADCP01000068.1| GENE 5 4239 - 4613 387 124 aa, chain - ## HITS:1 COG:VC0060 KEGG:ns NR:ns ## COG: VC0060 COG0239 # Protein_GI_number: 15640092 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Vibrio cholerae # 9 114 9 115 126 77 47.0 5e-15 MMPAEMLCVAAGGALGALCRHGVSLWCADRFGKAFPCGTLLVNVSGSLLMGLLAGCWQRW GTPSAFALFAGAGFLGALTTFSTFSMDTLAAFHAGHPAKAMLNIGLNTTLCLTATAAAYF LIVP >gi|316923559|gb|ADCP01000068.1| GENE 6 4613 - 5371 900 252 aa, chain - ## HITS:1 COG:AF0086 KEGG:ns NR:ns ## COG: AF0086 COG0600 # Protein_GI_number: 11497706 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Archaeoglobus fulgidus # 7 243 11 243 244 193 49.0 3e-49 MRALFRYLLALAALLAIWQIGSLALGEFLLPRPLDVLAYFGEALTTGAFWQHAGASAWRV LSAMSLAWLVAFPLGLILGHVRSVDNTLSPMVFLTYPLPKIVLLPLFLTLFGLGDLPRVL LIALTTGYQILVVTRASAQGLDKKYLDSFRSLGGTPLQMLRHVLIPAALPDAMTALKVAS 
GTAVAVLFMAESFATRRGLGFLIMDAWGRGDQLEMFTGILAMSLLGVALYELCNVLETRS CRWRNFHMAGKR >gi|316923559|gb|ADCP01000068.1| GENE 7 5610 - 6110 525 166 aa, chain - ## HITS:1 COG:no KEGG:CYA_1382 NR:ns ## KEGG: CYA_1382 # Name: not_defined # Def: hypothetical protein # Organism: Cyanobacteria_CYA # Pathway: not_defined # 8 161 7 160 166 72 31.0 6e-12 MSTLIILLLCGAAGGFIFEWFGLPGGAMTGAIIAVILLKSFSSLPQAAFPRPFQFCIYAG LGVLVGNMYRPEMLLAVRDTWPVLVISTLLVLFAGMLIAIFVVKFGNLDVTSAYLATSPG GLNVVVGLAADMGPNAPIVLAYQMVRLYTIILTVPLAARILHKFLS >gi|316923559|gb|ADCP01000068.1| GENE 8 6098 - 6868 533 256 aa, chain - ## HITS:1 COG:AF0087 KEGG:ns NR:ns ## COG: AF0087 COG1116 # Protein_GI_number: 11497707 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component # Organism: Archaeoglobus fulgidus # 9 249 2 243 243 221 46.0 9e-58 MSRASDAGIAVRDLSKGYGEGPVLDGLSFDLPSGETLSVIGPSGCGKSTLLYLLAGLDKP DRGTVSTGEVSRRDKGRISFILQDYGLFPWKTVQENLSLPLELQGVAPSTRRKAVADMLD ELGLGGLGMRYPAQLSGGQRQRVAIGRALITDPDILLMDEPFSSLDALTREHLQTTVLSL WQRRRPTCVLVTHNVPEAVFLGKHVMVMNSHPARNVMWLENPCFGDADPRGEERYFSLTK KVYAALSETKDFSCPH >gi|316923559|gb|ADCP01000068.1| GENE 9 6865 - 7800 1195 311 aa, chain - ## HITS:1 COG:AF0088 KEGG:ns NR:ns ## COG: AF0088 COG0715 # Protein_GI_number: 11497708 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Archaeoglobus fulgidus # 25 309 25 299 300 146 33.0 5e-35 MMRRLLVLLSLLVLAAVQPAHSAPLKIGVLPAADTIVLHVAADEGLFAKQGLEVELIPFQ SALELGAAMRAGALSGHFGDIINVLMQNESGSPQAIVATTSHSSPDNRCFGLVVSPKSKA QTLADLKGKDIAVSSATIIDFLLAQLLLQEGAAPDFLNRQDIRQIPVRLQMLLSGQIESA LLPEPLVSLVEAKGARTILNDCKLNTPLAVIALKRDVLDAPDGTQTVAKFREALREAAKR INETPDAYRPIMEAKGLLPKGASANYTMVRFDMTHTPTGLPSEADIKTFADWMKANRILK KDPAYGDVVFQ >gi|316923559|gb|ADCP01000068.1| GENE 10 8168 - 8722 792 184 aa, chain + ## HITS:1 COG:BMEII0619 KEGG:ns NR:ns ## COG: BMEII0619 COG0590 # Protein_GI_number: 17988964 # 
Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Brucella melitensis # 5 179 8 192 199 170 46.0 1e-42 MQEKLFNRLMDVIEYDVAPLTHKRVAIGCKVFGAAVLRKDDLSLVLAETNMEAFSPLWHG EVYAIKQFWELQGHPDPKDCIFIATHQPCCMCASAIAWSGFPELFYLFGYENTSKDFHIP HDQKMIRELFHCTSPAEDSSYYTSRPIMGLCDTPEAAARFERIKNLYAELSAVYQAGEKK MVLS >gi|316923559|gb|ADCP01000068.1| GENE 11 8937 - 10340 1560 467 aa, chain - ## HITS:1 COG:PA0667 KEGG:ns NR:ns ## COG: PA0667 COG0739 # Protein_GI_number: 15595864 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Pseudomonas aeruginosa # 31 446 18 434 447 231 32.0 2e-60 MRRILWCIISLCLILLSLALLLRFRDDADPLPAPLENVFSHVEGLLSLSPEHAVDEFKQD EEEEVQKEDGEEEDAKEADVYEANPRYTIEQDPEQGEVLRSEIQKGDTVGKILGDWMDAN DLAELLAAAKPVYALTKVRFGQPFAVVRDPQTKAFRCFKYEINQEKYLIVEKKDDRFVAR LEEIDYQTSLAVIKGEIKSTLSGAVTEQGENVALAIALANVFASEINFISDLREGDAFEV LVEKRFRHDAFEGYGRVLGAKFTNQNKQHTAYLFHNERGRETYYNAEGDNLHRELLKAPL SFLRVTSRYSMARRHPVFGNTRPHQGIDYGAPTGTPIMAVGDGVITNIGRAGGYGKQVII RHDNGLESLYGHMSRFAKSMKNGKRVRQGQTIGYVGATGTATGPHLDFRIRKQGQFVNPD KLIIPRDQALEKRRMADYKLVVHAVDAYLTGKDTASYDPDTWFKDED >gi|316923559|gb|ADCP01000068.1| GENE 12 11121 - 12320 420 399 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 4 394 6 403 406 166 29 8e-40 MKLSDAAVRNARANGKVQKLSDGGGLYLHVTATGSKLWRLAYRFEGKQKLLSFGAYPAVS LKDARHRRDEAKELLARGIDPGEEKKQAREEKLAKEREERDTFEFVAREWFAKYEPTLSE KHAKKLRRYLENTIFPVIGGKPVTQLEPADFLQLVQPSERLGHHETAHKLMRLCGQVTRY ARITGRVKYDVAAGLTEALTPVQTTHFAAVTLPDDIGQLLRDIDAYVGYTSVVYCLKILP YVFTRPSELRLAHWSEFDFKNAAWIIPASRMKMRREHVVPLSKQVLVLLKELHAYTGNGE LLFPSARALTTPISDAAPLAALRRMGYAKETMTLHGFRAMASTRLNELGFRADVIEAQLA HKEPDTVRLAYNRAEYMEERRQLMQKWANYLDELRSTKQ >gi|316923559|gb|ADCP01000068.1| GENE 13 12335 - 12523 160 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MSDTIHSRKSPQFQEQDTFVTIGKVPEDEQITVADTGWSEETKERMRQAGYDDMDMIRAW YS >gi|316923559|gb|ADCP01000068.1| GENE 14 12651 - 13631 461 326 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302862827|gb|EFL85759.1| ## NR: gi|302862827|gb|EFL85759.1| hypothetical protein HMPREF0326_01462 [Desulfovibrio sp. 3_1_syn3] # 1 216 1 212 311 107 34.0 8e-22 MLDQQKISLGRVRQAYRLTSNEVETVIQALWPHAVFPSELWRRAHEAYWKDFADSNQTVY NIESPTLPQPCLSDKRIHTYLFGCDIMDFFKKLSWNIHANALELGLKYILEETYDEIGNI FPEEREFLFHSLKRLETEDFFDQYSPDSPSFENPISIPASSYHFISHIFVDTILVSRYDT IRFLRNYPLHKEVKGITRALVESILKSMPPENISPESSQVQTEQPAVEEAEDGKLMFRIP STVWRGKPDHAVYDGMKDTYPLPVIAYVLFYWCSADKPETGHKTQIGRKTQLGRLFTEKE YSDPKSYRNLMNRLLEEADAYTIIQG >gi|316923559|gb|ADCP01000068.1| GENE 15 13757 - 13975 229 72 aa, chain + ## HITS:1 COG:no KEGG:XALc_1802 NR:ns ## KEGG: XALc_1802 # Name: not_defined # Def: hypothetical protein # Organism: X.albilineans # Pathway: not_defined # 1 69 1 83 85 75 50.0 4e-13 MAQRKPSFPEFGFVRLPQILACIPICESAWWEGCRTGRYPKPVKLGPRTTAWRVEDIREL IERLGNGEANVR >gi|316923559|gb|ADCP01000068.1| GENE 16 14003 - 16654 1912 883 aa, chain + ## HITS:1 COG:STM2745 KEGG:ns NR:ns ## COG: STM2745 COG5519 # Protein_GI_number: 16766057 # Func_class: L Replication, recombination and repair # Function: Superfamily II helicase and inactivated derivatives # Organism: Salmonella typhimurium LT2 # 309 865 114 680 689 325 38.0 2e-88 MSSDILHSFAECLRAAGLEVETVQADGLLHRCGTTDRPHRKDGAYKAFLDVPASIWWKNW RTGDEGTWTYRPEKELPTAERKALHERIRAIRAHNEAEQARRWQVAAKLAISIWNHARPA EDNHPYLQRKEVPAIGLRQTEDGRLIVPVQNTSGKIQSLQFILPDKPAEDTDKFFLKSGK TSGGFFSIPAKNGTKDGPLLIVEGYATAASLHLATGYAVLIAFNAGNLQAVARTARARYP DREILLCADNDCETVKPDGTPWNPGKEAACRAAQAVGGKLAVCPAHDGKATDFNDLHRLR SFEAVRAVVEAARKQDTECPMPEGFFLVAEGKRAGLYKLETKPDGEMNEVRIGPPLSVKG MTRDSEGNEWGLMLEWADPDGKKHTWPMPIELLFRQGADWYSSLASGGWLGNPSARKKLM DFLSAVRPTRRIRCVPRTGWDNTAYLLPDAVYGDTSGESVVLQSAHHGDLYRTAGTLEGW REIAVLAVGNSRLSFALCAAFAGPLLRLAGLEGGGFSFEGGSSSGKTTALQIAASVWGGP EHVRSWRATDNGLENIAVLHNDNVLILDEVGQVNGKVLAECAYMLANGQGKGRSSREGNL 
RKSHSWRLLFLSSGELGLADKLAENGLKSRGGQEVRFVGLPVDTSMLTELHGFPHAGAVV NRLKELSAIHYGHAGRAFLHKLTEPDTMTTVLSELQSALANTVSHLVPVGSDGQVRRVAQ RFALCGLAGGLAAQMEILPPDFDAPGCAERCFHDWLAARGGIGASEDAAILAAVRLFIEQ HGASRFQDLDRIADTCPNRVGFRRTRNSMTEYLILPESFRAEVVKGYAETRAVRVLREAG WLRTPDKNRLKAQERLPGLGRVRVYIVRLPDDADEGTAQTLSD >gi|316923559|gb|ADCP01000068.1| GENE 17 16719 - 16913 90 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAFALASVVALDAANGVFLEFVHMQNEIGHGLFSLFIFSGDSGDSGDHQAMTRLFGGHHS NREW >gi|316923559|gb|ADCP01000068.1| GENE 18 16824 - 18050 853 408 aa, chain + ## HITS:1 COG:no KEGG:LKI_10806 NR:ns ## KEGG: LKI_10806 # Name: not_defined # Def: recombination protein # Organism: L.kimchii # Pathway: not_defined # 1 223 1 221 361 146 41.0 1e-33 MSYLVLHMDKFKKDAIRGIQSHNRRERESHSNPDIDYSRSVGNYDLHESASDNYSQTIQN RIDDLLMVKAVRKDAVHMCGLIVSSDKSFFTRMGKEETRRFFAEAAAYLTDFVGRENVVS AMVHMDEKTPHMHFLHVPVTPDGRLCAKDIYTKAALRKLQDELPRHLQNRGFQIERGVEQ QKGSAKKHLDTREFKQQQEMMAAMRQDAARLEAILFDLQKQIELAEAQQVTLEEENESRK KVISEAESRLKNGPKLPAANFFNFKEVLAHAQEKLNAYQKALSDKEEIAQERDMRERQAQ TLLYERDRAHEAVRMAQSHIQALKTENQKLRASMAQAQEKAASALKDMQDFILFSGNQLR FQEFQLQREDERKMNARNREQARLAAERDRQIPQEAEKPNRPRGMRMR >gi|316923559|gb|ADCP01000068.1| GENE 19 18197 - 18550 351 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302861370|gb|EFL84308.1| ## NR: gi|302861370|gb|EFL84308.1| putative protein MobC [Desulfovibrio sp. 3_1_syn3] # 1 97 1 93 123 115 73.0 7e-25 MSITASQVKEIRQEFAKLRPSAVTLDGNRAMTVKQAIFTLAPTLERMKKRGFDTQEIVEK LHEKGIEVKPQTLTKYLTEARRQKEGRKNQKRDTPPPLPRHEQRGSFITPDTPDDEL >gi|316923559|gb|ADCP01000068.1| GENE 20 18652 - 21048 1634 798 aa, chain + ## HITS:1 COG:alr3473 KEGG:ns NR:ns ## COG: alr3473 COG4096 # Protein_GI_number: 17230965 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Nostoc sp. 
PCC 7120 # 10 798 11 776 776 599 42.0 1e-171 MSVNRFGNESEEDTKLRRITPAIQKAGWKADQILMEYSLKADRYRIIPGQNFAQKDNQSA RTKPDYILCRSINRPLAVVEAKKATKTAQDGLDQAIAYARMLGVDFAYASAGQDFIEYEV PTAKQRTITLDAFPSPDDLWKRWRELHGIADEDGQKLENALYYTSADGHTPRYYQMQAIN RTVDAIIGRHRRRVLLVMATGTGKTYTAFQIVWRLKKAGIVRNVLYLADRNQLVDQTLAD DFAPFNTATKIRQGKIDRNYEIYFGLYQQLKGDQTEPVAEHYRQVPPDYFDLIIVDECHR GSASEQSSWRDILDYFQPAIQIGMTATPNEKDGSNNLDYFGEPLITYTLKQGIEDGFLAP YQVVSVHLDKDVQGWEPEEGDVDEEGQPVPMRKYTLADFDRKIELRSRTRKVAEVVSNYL QHLGRMSKTIIFCTTQRHAANMRDAMRACNRDLAAVDHRYVVRMTADDEEGRGLYEDFIS INRPYPVVVTTSKLLTTGANTKCVKLIVLDSNIKSMTEFKQIIGRGTRLRPDVDKTFFTI LDFRGACALFHDPDFDGPADDETNWDGNGDPPIRPRRPDGDDSSTSPVDTLPDGQDGAPD TGQDGGDDQQPREMIVVDGVEITILGKSVSYLDENGNLVTEKFAEYTRKNILSCFPTEEA FRQAWNGGRAKKIIIEELQQRGILVEHLKKELGNPDLDEFDMIRHIAFGGAMLTRQLRAS KVRGAKFLEKYQDTARAVLERLLDAYTINGIREIDDLAALKTYCGNLGGMKKIFQAFGNQ ENFMSAVREMENILYDAA >gi|316923559|gb|ADCP01000068.1| GENE 21 21067 - 22527 1528 486 aa, chain + ## HITS:1 COG:SP0509 KEGG:ns NR:ns ## COG: SP0509 COG0286 # Protein_GI_number: 15900423 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Streptococcus pneumoniae TIGR4 # 1 481 1 484 487 539 53.0 1e-153 MSLASFIKKMRDITRLDQGINGDAQRIEQITWMLFLKIYDDREYDWEALEHDYVSIIPDP CRWRNWADTGKALKGDDLIRFVDGMLLPTLKDLPIPPGCPLRKSIVKTVFTDIHNFMKDG VQLRQLLTEINECDFNDPQEAHAFGSIYESILKLLQSAGSSGEFYTPRALTDFMARHVGL KLGDKVADFACGTGGFLNSARAWLEGQAKTNAQREILARSFHGTEKKPLPYLLCVTNLLL NGVDEPLIRYGNSLTKSTGDYTEADKFDVVLMNPPYGGSEQLTIQQNFPSNMRSAETADL FLILIMARLKATGRAAVVIPDGFLFGGGNKTEIKRELLSNFNLHTIVRLPTSVFSPYTSI ATNVLFFDGNGPTKETWFYRVDMPEGYKHFSKTKPMLLEHLADLDAWWDKREPLEVNGSD KARKYSKEELEALQYNFDLCGFAQEDEEILPPAELIAHYKAERARHEKIMDEALGKILAL IGEARA >gi|316923559|gb|ADCP01000068.1| GENE 22 22956 - 23822 -118 288 aa, chain + ## HITS:1 COG:MJ0130m KEGG:ns NR:ns ## COG: MJ0130m COG0732 # Protein_GI_number: 15669898 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Methanococcus jannaschii # 20 278 
150 391 425 130 32.0 4e-30 MNSQEERQFCADVRVNGVCQANINAKKIGAFSIPVPPIDEQQYIVSCLNELLPLVEEYGK SQSALHVLETELPGKLRASLLQQAIMGKLVPQLDDEPAVDIDAEEPEEVPFAIPEKWKWV RLRDIGAIFSGATPKTNVTEYWSPAIVPWVTPADLGKNKKKTISCGERSISKKGYLSCSA VLLPKGSVVYSSRAPIGHIAITENELATNQGCKSIAPNFEIVLSEYVYYGLIALTPDIQS RASGTTFLEISSKKFGETFFPLPPLAEQRRIITRLNELLPYLNSMIKN >gi|316923559|gb|ADCP01000068.1| GENE 23 24548 - 24667 68 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEVPEQYATHLVDRNQKLETTTASKKYRQEFIFLFKCKS >gi|316923559|gb|ADCP01000068.1| GENE 24 24892 - 25095 59 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVESQNMEKNYWNIIERFLGKRKTLYNLNLRMYYMCLLNMGDTHVSSGRLLFLVKSSMGR ISEPAKL >gi|316923559|gb|ADCP01000068.1| GENE 25 25464 - 26069 -324 201 aa, chain + ## HITS:1 COG:SA2010 KEGG:ns NR:ns ## COG: SA2010 COG3344 # Protein_GI_number: 15927789 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Staphylococcus aureus N315 # 5 199 70 224 338 81 31.0 1e-15 MQDFIKQEILDTNYVSSQISESVFSYRKALSIVDNAKFHIGAKWIIKTDIENFFDSIKEY HIYHFFHSLGYASLLSLELSRLVTWPSFDINFDLGLSQDRSLNEDITFTSKYKFYCSSDV IGSLPQGAPTSPQLSNIIFSPIDEKFKILSEEYGYIYTRYADDLFFSTRNNIDINSARNL LNEVRYILNSSNYKMNNKKQK >gi|316923559|gb|ADCP01000068.1| GENE 26 26159 - 26362 148 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNIFFAEKNGLEAQRKFIKFHGNTHTFLKYLLGKLNFANQVEPDFANIWKKRLLNLAKNS GFTMIYY >gi|316923559|gb|ADCP01000068.1| GENE 27 26582 - 27508 298 308 aa, chain - ## HITS:1 COG:SMa2105 KEGG:ns NR:ns ## COG: SMa2105 COG3177 # Protein_GI_number: 16263601 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 10 251 94 347 406 79 27.0 1e-14 MATIESVGSSTRIEGSKLSDRQVESLLANLSIRAFTSRDEQEVAGYAETMEQVFQSWEHI PLTENHIRQLHRDLLRHSDKDERHRGQYKNTPNSVAAFDAHGQQIGIVFETAPPFDTPRL MQELVDWTNAELNAEAFHPLLVIGIFIVSFLAIHPFQDGNGRLSRILTALLLLRCGYAYT PYSSLESVIENSKEGYYLSLRQTQTSIQTDSPDWHPWLLFFLRALHQQMKRLERKMEQEH IVMSALPELSSQILNYAREHGRVTVKDMVILTGVSRNTLKEHFRKLIANGQLGMRGKGRG TWYILKIE 
>gi|316923559|gb|ADCP01000068.1| GENE 28 27733 - 27972 112 79 aa, chain - ## HITS:1 COG:no KEGG:Cag_1364 NR:ns ## KEGG: Cag_1364 # Name: not_defined # Def: hypothetical protein # Organism: C.chlorochromatii # Pathway: not_defined # 5 75 2 72 73 89 61.0 5e-17 MIISYTPISIDREALTIMGVPFPSLEAWESAAAAIGSNMFEGYQPTKRGIEIIRDYTTGK ISFDAFIPAAKHKAYREGR >gi|316923559|gb|ADCP01000068.1| GENE 29 28759 - 29193 442 144 aa, chain + ## HITS:1 COG:AGl997 KEGG:ns NR:ns ## COG: AGl997 COG1917 # Protein_GI_number: 15890615 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 15 141 23 149 151 169 62.0 2e-42 MEKDTLSTAVSEPVILRNGSQPSIKGPDAWFTGTVRIDPLFPAHTPARASSAYVTFEPGS RTAWHTHPLGQALIITAGVGRVQVQGGPIQEVRPGDTVWFPPYVKHWHGAAPGVAMTHIS VAEEEGGNNVEWLEKVTEEQYTGN >gi|316923559|gb|ADCP01000068.1| GENE 30 29390 - 30361 915 323 aa, chain + ## HITS:1 COG:MA0367 KEGG:ns NR:ns ## COG: MA0367 COG2768 # Protein_GI_number: 20089264 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S center protein # Organism: Methanosarcina acetivorans str.C2A # 32 323 53 353 355 242 43.0 9e-64 MAEQFSRRSFLKGGALTVGAFAFGGFPRIDTAMAAPQEKAKVFFTKDISLEGLMKVYARV NHGMSGKIAIKLHTGEPHGPNILPREMVRGFQANIPDSSIVECNVLYPSPRQTTEGHRET LRTNGWTFCPVDIMDEDGDVSLPIPGGKWLTELSVGKHILNYDAMLVLTHFKGHTVGGFG GSLKNISIGCASGKLGKQQIHQLPGDGTWPGGPLFMERMVEGGKAITNHFGQHITYINVL RNMSVDCDCAGLGAAAPTTPDLGIIASTDILAVDQASVDMVYALPEAQRRDLVERIESRS GLRQLEYMKLQGMGNNQYDLITV >gi|316923559|gb|ADCP01000068.1| GENE 31 30492 - 31613 617 373 aa, chain + ## HITS:1 COG:XF1739 KEGG:ns NR:ns ## COG: XF1739 COG2220 # Protein_GI_number: 15838340 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Xylella fastidiosa 9a5c # 25 372 26 371 385 378 53.0 1e-105 MRRGMRVAFGLVLLLCILIGGGYAFMSQASFGKLPEGDRLDAISQSPHYEGGAFRNTEPL 
PPIKGNGGIVGALMKYVFSSTEGQKPTRPMPSAKTDLRGLDRNTDTVVWLGHSTYFVQLG GRRLLIDPVFSTYASPVFFANRAFEGTNLYTAEDMPDIDYLLISHDHWDHLDYATATALR SKVGQVVCPLGVGAHFEAWGYGKETVFEGDWYSVLEGKDGFAIHILPARHFSGRSLTRNK TLWAGFALVTPERRIFFSGDSGYGKHFAEIGARFGGFDLAMLDCGQYDENWRYVHMMPED TAQAAEDLRARALLPGHVGKFAIAYHTWDDPFKRIVAASRGKPYRLLLPLIGEPVALPIA SDGESLRWWEAGR >gi|316923559|gb|ADCP01000068.1| GENE 32 31770 - 32864 1353 364 aa, chain - ## HITS:1 COG:L0358 KEGG:ns NR:ns ## COG: L0358 COG0180 # Protein_GI_number: 15672048 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Lactococcus lactis # 5 347 6 341 341 429 62.0 1e-120 MEQIILTGDRPTGRLHVGHYVGSLKRRVELQNSGKYDKIFIMIADAQALTDNAEHPEKVR QNIIEVALDYLACGLDPVKSTMFIQSQIPELCELSFYYMNLVTVARLQRNPTVKSEIQMR NFEASIPVGFFTYPISQASDITAFKATAVPVGEDQAPMLEQTKEIVRKFNAVYGDTLVEP EMLLPDNKACLRLPGIDGKAKMSKSLGNCIYLGEDADEVKKKVMSMFTDPNHLRVEDPGQ VEDNPVFIYLDAFSRPEHFERHFPDYQNIQEMKAHYTRGGLGDMKVKKFLNNVLQDELEP IRNRRRELEKDIPAIYDILKEGSMRAREAAARTLDDVRRAMKINYFEDMDLISEQAKKYA RSCG >gi|316923559|gb|ADCP01000068.1| GENE 33 33693 - 34970 1599 425 aa, chain + ## HITS:1 COG:MK0731 KEGG:ns NR:ns ## COG: MK0731 COG0045 # Protein_GI_number: 20094168 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, beta subunit # Organism: Methanopyrus kandleri AV19 # 1 394 1 358 359 192 33.0 9e-49 MAKFLEYQGKEWLAKGGMPVPKGRVASSPEEAEEAARWIGGPVAVKGQVQAGGRGKAGIV KLADTPEEARALAEEILAKTVKGLPVEKVLIEEKLDIKKEYYCSFVINNAREARCPMLMF STEGGMDIESVDESLLFRLNIDPQYGLQSYDAIDVCTQAGIAPADLSKFASFLTKLSKLY KKYDCQTLEINPFVMTGDGSLYCADCKMEIDNSAVFRHPDFGIKIARDLPGQPTDLDVIG WGIEETDARGTGFLMNMGYDEVSPGYVGYHPIGGGSAMMGLDALNAVGLKAANYADTSGN PVAAKIYRVAKATLSQPNIEGYLLGGFMMANQEQWHHAHALVKVLREMLPKKPGFPCVLL LCGNREDESLEILRTGLADMMTPDGIGRRIEIYGREHVTDTKFIGERLLALCKEYSAEKT SQKAG >gi|316923559|gb|ADCP01000068.1| GENE 34 34975 - 35262 377 95 aa, chain + ## HITS:1 COG:no KEGG:Moth_0398 NR:ns ## KEGG: Moth_0398 # Name: not_defined # Def: hypothetical protein # Organism: M.thermoacetica # 
Pathway: not_defined # 1 95 2 96 96 120 67.0 1e-26 MEIKEKSSTVSIDVSKCLSCETKACVAACKKYARGILELKDGVPSVEHLTAEEVLRLGTE CLACEFACTFHGNKAITIDVPVAGLPEYLAKRGLA >gi|316923559|gb|ADCP01000068.1| GENE 35 35311 - 36207 1066 298 aa, chain + ## HITS:1 COG:APE1072 KEGG:ns NR:ns ## COG: APE1072 COG0074 # Protein_GI_number: 14601168 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Aeropyrum pernix # 1 298 6 297 297 246 45.0 3e-65 MGILINKNTKVVVQGITGREGSVRTKYMKSYGTHVVGGTSPGKKGSDVHGVPVYNTVKEI AREHGEIDFSVIFVPGHVLKTAVMEAADAGVKNVIPCVEGTPIHDIMEMIAYCKAKGTRL LGPGSIGIITPGEAVVGWLGGNVEWANTFFKRGSIGVFSRSGGQSGTIPWVLREGGFGVS TVIHTGTEPVLGTSMADLLPLFEEDPETEGVAVYAEIGGSQEEECAEVIASGKFTKPFVV YVSGAWAPEGQRFSHASNIVERGRGSAKSKMDAITKAGGYVAMTPTDIPVILHEKLKK >gi|316923559|gb|ADCP01000068.1| GENE 36 36263 - 37147 1387 294 aa, chain + ## HITS:1 COG:FN1379 KEGG:ns NR:ns ## COG: FN1379 COG2301 # Protein_GI_number: 19704714 # Func_class: G Carbohydrate transport and metabolism # Function: Citrate lyase beta subunit # Organism: Fusobacterium nucleatum # 5 282 9 284 296 155 33.0 7e-38 MAVMRSIMYVPGNNPKMVEKAPSIPADIITLDLEDSVPPSEKEAARKLVAEKLKYAGSGG AQVYVRINNWETEMTNDDLEAVVWEGLDGVTLAKTGHPDDVKRLDWKLEELERRRGIPVG TVKISMLLETAKGIMNAYECCMASSRNVNAIFGAVDYCRDMRVKLTSEAVEQMWGRAKMA IAARAAGIVAIDAPFVAYQDIPAFERNVMEGKQMGYEGRMIIHPSQVEPSNRLYAPDPAD VEWANGVVKVFEEEGLAKGKAAVSYLGKMVDTPVYLNAKDILAAQAEIDAKVKK >gi|316923559|gb|ADCP01000068.1| GENE 37 37596 - 38840 1655 414 aa, chain + ## HITS:1 COG:BH3168 KEGG:ns NR:ns ## COG: BH3168 COG0281 # Protein_GI_number: 15615730 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Bacillus halodurans # 8 411 5 405 410 445 62.0 1e-125 MSRIKNLREEALALHKDNVGKIDIRIKVPARDNDDLTLAYSPGVAEPCKEINKDPAMLDV YTNHSNFVCVVSNGTAVLGLGDIGAGAAMPVMEGKSLLFKRFGDVDAFPLCVDTKDTAKI VELVELVAPTFGGVNLEDIKAPECFEIEDSLKARGVFKGPIFHDDQHGTAVVTLAGLLNA LKIVGKNIEDIKVVTSGAGAAGTAIIKLLMAVGLKNVIMCDSKGPIWEGRPEGMNKYKDA 
IAKATNPDKVKGTLADAIKGADVFIGVSVPDSLNEDMIRSMAKDPIIFAQANPIPEIWPI ERATQAGAKVVATGRSDIKNQINNVLAFPGIFRGAIDVRATDINDAMKIAAAHAIADLVT PEKLSPGYIIPPATDPDVAPAVAAATARAAIESGIARNPVDPQSVAENLRKRLS >gi|316923559|gb|ADCP01000068.1| GENE 38 38886 - 40277 1653 463 aa, chain + ## HITS:1 COG:BS_ysfC KEGG:ns NR:ns ## COG: BS_ysfC COG0277 # Protein_GI_number: 16079920 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Bacillus subtilis # 1 458 1 457 470 424 47.0 1e-118 MPSASLIKEFEQLVGKENVFSSEADRQSYAYDAAVLKPMIPSLVVRPTEVEQLGQVVKRC YEEGIPMTVRGSGTNLTGGTIPDCSDTIVILTTGLNKILEINTEDLYATVQPGVITAKFA AAVAAKGLFYPPDPGSMSVSTIGGNVAESSGGLRGLKYGTTKDYVMGMEVFANTGDLVKT GSRTVKCATGYNIAPLLVGSEGTLAVTSKVTLKLIPPPKASKAMMALFKDMDGASQAVAG IIAAHVVPCTLEFLDHASINYIEDYVKIGLPRDAGAMLLIEVDGHPAQVEDEAVIVEKVL RDNSATEIVVARDAVEKSRIWEARRVAIPALARCRPTLMLEDATVPRSKIPAMIKALDEI AARHNVTIATFGHAGDGNLHPSILCDRRDKEEFSRVERAVDDLFNAALELGGTLSGEHGI GTAKIKWLEQETSHGTIMFSRRLRKAFDPKGLFNPSKIVGFGD >gi|316923559|gb|ADCP01000068.1| GENE 39 40526 - 41815 1165 429 aa, chain + ## HITS:1 COG:BH2135 KEGG:ns NR:ns ## COG: BH2135 COG0247 # Protein_GI_number: 15614698 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Bacillus halodurans # 16 424 26 433 451 194 29.0 2e-49 MDEISSLAKSLMVLDDKLASCMKCGFCQAFCPVFDTTGKEGDVTRGKIALVENLAHLIIQ DPEAVNEKLSRCLLCGGCQFSCPSGVPTLEIFMEARAIVTAYLGLSPVKKAIFRKLLPNP RVVDTLLRAGSPFQKLLLKDEQNPQKTACAPLLKSLFGDRHMPLLSDKPLRSKVGKVDRP AGKSGLRVAFFPGCMGDKIYTGVSEACLKVFEHHGVGVFLSDNFACCGIPALVSGDREGV EKMIEHNLGILSRTHFDYIVTPCSSCTMTIKEYWPEMSRNLPASVQETAKKFGAKAIDIN AFLVDVLGVHPEGPAKGGLKVTYHESCHLKKSLGVSKQPRELINMNKNCELVEMVEPDRC CGCGGSFTLTHYDISQRIGQKKRDSIVATGADVVATGCPACMMQLSDMLARNGDRARVRH TIELYADTL >gi|316923559|gb|ADCP01000068.1| GENE 40 42010 - 43443 1648 477 aa, chain - ## HITS:1 COG:STM3887 KEGG:ns NR:ns ## COG: STM3887 COG0477 # Protein_GI_number: 16767171 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion 
transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 8 463 9 464 475 574 68.0 1e-163 MTFSLHKLVGLPAIAAAAFFMQALDATILNTALPAVARSLDRSPLAMQLAVVGYTLTVAM LIPVSGWLADRFGTRRVFISAVNFFILGSLLCALSTSLPMLVASRVLQGIGGAMMMPVAR LAVLRAYPRTELVAVLNFISIPGLVGPVVGPLLGGWLVTYATWHWIFLINIPIGLIGILY AYHIMPDFTAPRRPFDWMGFFLFGLSLVSFSSSLEVFGDRVGPDFLPPVLLISGIALLGL YIRHARRHPSPLIRLSVFHTRTFSVGIGGNILSRLGTGCVPFLMPLMLQVGLGYPALVAG AMMAPTAIGSITAKTFAMKILNRFGYRVTLTGVTVCIGLMIAQFALQTPTLPLWMLILPL FMLGMAMSTQFTAMNSITLGDLSTEDASTGNGLLSVTQQLSISFGVASSTVVLRFYNSFA SGTLLDHFHYTFITIGGITMMAAFVFMLLRKDDGDSLLPGRKKPVFETTSSSTPRNP >gi|316923559|gb|ADCP01000068.1| GENE 41 44145 - 45404 1347 419 aa, chain + ## HITS:1 COG:YPO1285 KEGG:ns NR:ns ## COG: YPO1285 COG0814 # Protein_GI_number: 16121568 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Yersinia pestis # 11 419 10 414 414 361 50.0 2e-99 MQEQTGEGMGSQGNVLGGTMIIAGTAIGAGMLALPMISAGMWIYWSLLLMVLTWMLMLRA SQAILEVNLHYEPGSSFHTLVQDTLGPVWSTINGLAVAFVLYILVYAYVSGGGSTVQQTV MAVTGNDPGMMGSSLFFSLILMACVWWSTRFVDRLSVILMGGMVLTFILSMTGMLSQIRL PVLLDLGENGSGGGAAIFIWCALSTYLTSFCFHASVPSLVKYFGKRPADINKCLRYGTLI ALVCYVAWIVAADGIISRGQFKAVIAAGGNVGDLIRAAGSGIDSSFILRMLEAFSFFAVA TSFLGAGLGLFDYMADLCKFDDSVMGRAKTTLVTFLPPLLGGLIKPDGFLAAIGWAGLAA TIWSVIVPALMLRASRKKFPAAAYSAPGGSLTIYVLLVYGCITAVCHILFVLHYLPMYE >gi|316923559|gb|ADCP01000068.1| GENE 42 45622 - 46047 597 141 aa, chain + ## HITS:1 COG:no KEGG:Dde_1828 NR:ns ## KEGG: Dde_1828 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 6 140 4 140 142 87 40.0 1e-16 MSDILLWMHPLMQVCAAVLGVWAMWQGLKRVAMLLGKKTLFPWKQHVKLGTLALVLWILG ALGFYVTLDLFGSTHITGLHAELAWPIIGLAILGLITGYIMNRYKKKRKILPLIHGTINV LLIILVAVECYTGVQVWKDFS >gi|316923559|gb|ADCP01000068.1| GENE 43 46197 - 47345 1481 382 aa, chain - ## HITS:1 COG:aq_1145 KEGG:ns NR:ns ## COG: aq_1145 COG1454 # Protein_GI_number: 15606402 # 
Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Aquifex aeolicus # 6 380 7 381 387 181 33.0 3e-45 MHCHFHNPVSIHSGPGMRRQIGALAAKYGKKALIVTTGSAHSRALAEEAGASLEAAGLGW AHYNGIRPNPLGSMAAEGAQAARDNGCDMVIGIGGGSVMDASKGIAFAYYNDAPIFEYIY GRKQGGQALPILLIATTAGTGSEGNWTAVFTDTDHVKKGFALPALYPKESIVDPELMTTL SSRGIAGPGFDALAHAIESFLSVRANPFSKLYSRQAILWLSEGLPKVQANPSDLETWERV ALGSTFAGIAIGNAGCTAPHGIEHPISGLLNVAHGEGLAAIYPEYMRFMRPHAQADFAEL ARLLGAETSGLTEEEASLRAVEQVDALLKTLGIAFTLSDLGVRESQLDWLSRTALDSMPA VFANNPAAMTSEDVKTILERRL >gi|316923559|gb|ADCP01000068.1| GENE 44 47384 - 48388 1474 334 aa, chain - ## HITS:1 COG:BS_galE KEGG:ns NR:ns ## COG: BS_galE COG1087 # Protein_GI_number: 16080937 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Bacillus subtilis # 4 328 3 328 339 390 58.0 1e-108 MGKILVTGGTGYLGTHMLVELVKQGYEPLCVDNLHNSNPSVLLAVKEICGKEIPFIKADA GDIAAMRALFAEYQIDAVIHFAGHKAVGESVAHPLMYYRNNYDSALTILEMCLKHGSALV FSSSATVYGVPHFLPLTEDHPLSAVNPYGRTKLYIEETIRDVALAHPEFNVSILRYFNPV GAHESGLIGENPNGIPNNLMPFVCQTAAGIRKELQVFGDDYDTPDGTGVRDYIHVTDLIS GHIAALKKLETKPGCIIHNLGTGNGTSVLEMVHRFEAVNGVSVPYRIVARRPGDVASCYA DPSKAFEELGWKAVKTLDDMVRDSWMWQLKGAKN >gi|316923559|gb|ADCP01000068.1| GENE 45 48284 - 48541 69 85 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEVVHAQGLVALLDELHKHVRAEIAGSAGDENLTHIVFLLTPGLSGLNLSCCLIQPGDPR KSRRWGSAGDWVTFAQQRIGKEVLP >gi|316923559|gb|ADCP01000068.1| GENE 46 48805 - 49569 247 254 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 3 243 1 232 242 99 30 9e-20 MKLQGKTAIITGAAKGIGAAAARRFAAEGARIVLADKDEANGHEVAKAIVKEGGEAAFCL CDVGNEADVQAALDTAARTYGKLDIVVNNAGWQLNKTLLETTAEEFNAVLNTNLTSMFLF TKGAANMFIAQKTGGAIVNVCSTFAVVGSPGYVAYHASKGGVASFTRAAAISLMPHNIRV NAVGPGTTETPGLHDGARDTGDEAKGMASFLALQPLKRFGKPEEIASVIAFLASDEASFV TGALWMADGGYTIV >gi|316923559|gb|ADCP01000068.1| GENE 47 49704 - 50633 543 309 aa, chain - ## HITS:1 COG:alr2625 KEGG:ns NR:ns ## COG: 
alr2625 COG2207 # Protein_GI_number: 17230117 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Nostoc sp. PCC 7120 # 140 291 178 333 347 106 37.0 5e-23 MTPTSIHSTCTCRFFHQGQCSCPDVTLFVIDDDSHAAGPFRLQQDGARMTFFYCLNGQAR IDAQAADGEAFSEEIGKGNSLLFCLNGGFSISRAGKRIQAVGLQLRPECLRLFACQLGFV HDGLRKSSSRTLKGDIPLHLRILLEQILNCRDDGLLRDVFLAHKKYELLYQQIELLDREN SHKCVSPAECRAAHRAYAILLQDIGKPPSLPELAAAVGVNRTRLTALFRMLFEDTVFGIL RRERLECARRLLNEKGKNITEIAYLCGFSSPSHLTKAFSAQFGISPKQYQIARRHGHPAA PVSCGEQHT >gi|316923559|gb|ADCP01000068.1| GENE 48 50698 - 52002 1254 434 aa, chain - ## HITS:1 COG:no KEGG:Dacet_1613 NR:ns ## KEGG: Dacet_1613 # Name: not_defined # Def: hypothetical protein # Organism: D.acetiphilus # Pathway: not_defined # 25 434 26 405 405 202 34.0 1e-50 MKTLFRACAVCLFWFGFAHAAVEGDPFSDGKAFSDLHAIEASDTFLLGGFAESRNQFSLN DPETPISLRQKVQVEALWRRDVLSLFGSLEGSYEGAAHTWPGDHSPWKADLRELYLTYDT DNIDIFLGRKTHRWGTGDGINPMDLINPLDTRDPVTTGRADNRLPVWLFSGTVSGNGMSF EGVFLPKAEVNALPRSGNPWEPRALRELRRQRDDGLFTITAPDEPDRWFRDVEYGGRLSA NLGGWDLALMAFHGFVDNPMFVSRNNGGGMRWTAEYPRFTAFGTSFAKGFGSQTVRGEVA YKPRFPVQGSFGFHRADLWQGVLGWDYDIDSKYYLNFQLFGDVQEGSEASGSRTWHGATY EISGKWFRDALKTGVRGKLYTSGEGTLTEVFLEYELDDHWKASTGVMFWTGGEDTILGEY TDNDFVYLTLRYAF >gi|316923559|gb|ADCP01000068.1| GENE 49 52005 - 52793 889 262 aa, chain - ## HITS:1 COG:no KEGG:Dacet_1614 NR:ns ## KEGG: Dacet_1614 # Name: not_defined # Def: hypothetical protein # Organism: D.acetiphilus # Pathway: not_defined # 13 262 4 253 253 240 48.0 3e-62 MQFMFSRKSGIVLAALLGVQLILPASGWTLTGRDVALKADQVDTSRTSDMSMAMIIKRGD QQLVRFMAMKKKKFPDSEKQHIRFLEPGDIRNTAYLTWSYRDINKDDDMWVYMPAESLVR RISGGSKKGSFMRSDYANEDISRREVDKDTYTLLPDEALSGVDCHVLEAKAVFPEKTNYS KRIIWIRKDIWLPAKIDFYDQGGNRCKELVFGGYKEIQGIWTATRQRMRTVGSDSETIME IREVAYNTPIAEDIFLPQDLKR >gi|316923559|gb|ADCP01000068.1| GENE 50 52808 - 53560 609 250 aa, chain - ## HITS:1 COG:slr0495 KEGG:ns NR:ns ## COG: slr0495 COG2091 # Protein_GI_number: 16331528 # Func_class: H Coenzyme transport and metabolism # Function: 
Phosphopantetheinyl transferase # Organism: Synechocystis # 50 210 6 171 246 102 33.0 7e-22 MFSPTPLSANSVTASPPTPPCEDGKPSDGFRLLLLSGDAAAPPADSDCVHVWSRPLTGPV HSGFRDLLSEEECAAASRFHTPELAERYRQAHGWMRSVLSRYLDTPPETLRFRSGPDGKP RLAHGSLHFNLAHSDSLALLAVRQDAPVGVDVEPVRPLPEAMLIACRWFSAPEIRWIAAS DDRDRAFLRCWVCREAFLKATGEGLGRPLDSFALHPCGDTLRLDGAALAECTPLAGHVAA VVCETLFPSC >gi|316923559|gb|ADCP01000068.1| GENE 51 53868 - 54650 498 260 aa, chain + ## HITS:1 COG:no KEGG:AM1_3384 NR:ns ## KEGG: AM1_3384 # Name: not_defined # Def: hypothetical protein # Organism: A.marina # Pathway: not_defined # 15 173 3 165 184 81 31.0 3e-14 MKFKFIFTPVFGTVMKHSLLFWSRKSHCWAGVYLSAVTLLWLGEMALLPAVYEPRFEVAS PSPAVSSVGTPAVSVQDVLDRFSHERVDGEPLTPDTVTFLPKEGSWVVRDAGRYVSVTYD VRTGEARERAFDSAKLVEEKNGLAWLSPELGAALKLSFQPLFILLCVTGIHLLWGRNRRP GKEKERAATLDAVRTGETCLYERADDPCLAGRLASLGFLPGVAVRMLRNAKRGPLLVLAR HTRVALGREIASGIVVARKG >gi|316923559|gb|ADCP01000068.1| GENE 52 54653 - 56620 1942 655 aa, chain + ## HITS:1 COG:MA3477 KEGG:ns NR:ns ## COG: MA3477 COG0370 # Protein_GI_number: 20092288 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Methanosarcina acetivorans str.C2A # 13 645 11 658 670 300 32.0 8e-81 MGALRNHPAGKSLTVAVAGQPNTGKSTLFNRLTGLNQRVGNWTGKTVDRKEGAALWNGVA FTFVDLPGCYGMTPHSPEEEIARDYLLSGGPDAVLAVVNAASVERSLYIVSELVALGLPV VVALNMTDVAEQEGRAVDAAVLSSALGLPVIPLVGTSADGTAVDALLRALSEASCPASVP AVPAKVSLLAEIIGAALPCAPLWAACKLFEGDEGLLARTETLSGERREAIRVLMGDASVA PELYASRQAWIDAACGKAVRTSGTGRGLTERWDRVLLHPLWGRLAAFLVIPAGCVLGAAL GMMTGGLVLFAVLAWAPDLKAAYPGPLGSLAADALLPAFGWVVALLSMISALYAIFSFLE DTGYLARVAYLMDSFLSGLGIDGKSAIPLMMGFLCNTVSIAGSRVVDSPRRRLITLCMLP FIPCSGQMGVAFLFAFALFPAGTALLVVLGVSCLNLFMAGVVGRIMHATLPGRYANGLIM ELPLYHRPNFRTIFGNVRTRAVLFLRGAAVNIFAALIVVWAISTFPGGTVETSWLYHFGR WLEPLGSFLGFDWRFTVALLSSFVAKETTAGTLAVLFSVGATDHEAVVQALRASITPAGA LAFIVASNLYIPCIASISVLRSELGSWGRTLALLAAMFALAMGMACAVYHIAVFV >gi|316923559|gb|ADCP01000068.1| GENE 53 56642 - 62929 7989 2095 aa, chain + ## HITS:1 COG:all2649 KEGG:ns NR:ns ## COG: 
all2649 COG1020 # Protein_GI_number: 17230141 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Nostoc sp. PCC 7120 # 759 1860 50 1114 1583 469 29.0 1e-131 MNQIHPEPLSGTLVDILEETAARYPEHGIFHVASGDTSEEFQSYPELLRNAKEMAQVLYR RGLRPRTPLILSVDASRTFLEIFWGCLFSGVLPAPLAHMRTPKADSMEAQKIFHVWEAIK APIVADSSNERAFAVLTDMFKGTGATLIASEDLITEARIVDFQEDDRYIPQHDDSAVLQF SSGSTGMPKGARLTHRNLIANIRALRAIEGGTSEDRLVTWLPYFHDFGLFGCHLMPMLAG MDQIKMDPFQFAQRPFLWMEKIHEHRGTITSATNTGIEHLSAYIGLRADRLPEVDLSCLK VWTVGAEMISAESCRILQEQLAPMGLAPHLLMPGYGLTETTLVATCHPRNTPVKTFTLDR RLMVSEQVVRYVPKGEDAAEFTSVGKPVQYCEVRIADKDGKPLPADRVGVVEIRGDNVIS DYFDNPEATESSFNGEWFSTGDMGFMDAEGDLCIVGRSKEIIIVRGQNYYPADIEHIALS GMEKAFRLVVACGVYDPREGREIILMFYIPAKKGADAELAQLLHRMNEQVSTLAGFSVDR FIAARQGDIPRTSSGKVMRKALCEGYLNGDFDGKITVLEHEGPVLDPASMDHEQIVLGVW SDVLELPVDAIGTKKNLFRLGGDSIRAMRMQARLEDIYRAKMESNFCYLFPTVEQQVQYF RTRDFSIEPPQNEIEALLQKIVASSLGIKAEAVSVSAELMPLIGDISKAFALFDAIREVF AEAEIGEDFLRLTTIRQMADYLWPRVFCETHADGDVAYFPLMHFQETLYFHRKGFVQNEP SGLSCYIYLNARMDGDFRPDVFDKALNYVVGRHPIMRSVIDEEREKPRFKVFKTVPEVHA RHIDVSHLPPSEERAYILQRGLELNDYRFDLGEWPLFFCEITKFANDRYVFAMNIDHMLV DGFSYMQVFDELFNTYDRMVLGEPWELPEAAMTFGDYVRVENLRQRTQEYKNALEFQLGL FKDLPSKALLPTKRNPALLKEVYFDTFYQEIRPEIIEGLNAIAAEEQISLNALLLAAYFK LMNVWCHQDDMIINMPVFNREQYFAGARKTVGSFIDIFPVRLQTHFDEPIVQIARKAEAF TRKLLEVPVSSIELSRELFERQGLRATSMSSIIFSNSIGMYAGEVSGMKTIKLETPEFRT GAPGTFIDLVIYDYRVRRNDNDVFYFNWNYIRDLFDREFIETLAKQYHILLEQLIRYRNE PDHPFSGEDIVPERYRSLIASLNRTEAAIPESTLHGLIEEQIRRTPDSEALTYEGRSLSY AAFGKRADQVAGLLRHLGVSSNEFVALFLNRSFDTLAGQLGIMKAGAAYLPIGVDYPTDR VAYMLEDSGARVLLTQSAHLKELEGALGRVEHILVMDEGASAAAIPESLRERVVMPSGIY VDRPDVALPAGSPDDLAYMIYTSGSTGKPKGAMITHRNIVNFLTWVKEELGITAAERLAF VTSYAFDMTMTSNWTPFLVGASLHVLSEEKTKDVNNLLRFISEKGITFLNITPSHFSLLS SAREFLADADIPLPETMRIMLGGEVISTKDLNQWLKFYPGHRFINEYGPTEATVASTYFR IPVNADNQVDLPVVPIGKPVYNTQIYILNRFREHCMPGVPGELYIGGMGVSRGYHNKPEK NAEAFVPNPFNPADPSDRLYRTGDVVRMLDTGDLEFLGREDHQINLRGYRIEAGEIESAL 
REHESVTEAVVVPREDTAGSLTLVAFHTGSEVPAAALRDHLAKRLPEYMIPAHFEPLGEM PCTPSGKLDKNRLPDVVIEAGKRDAAVIRPVTELEKRITAIWEDVLGVTDLGMTSNFWDV GGDSLKAMRLIMRMKKEGFIDFGLKEAFEYQTVASIVSRILRKGEGKAEEAGIVALTDVE RPEARLFCLPYACGNPTMYRQFGRLLPASYAVLAANLPGHGKAGEPMRSIPEMAALCVEQ LAAFNDGTPLFLLGYSFGGFLAYEIARRLEEKGRPVAGVVLVASPPPGVIGGLRAIIDSS EDEIVRVSKEVYHYDFAEMTEAERRDYLNTLRVDTQAMLDFAFGAAVEAPMLNLVGTLEE EEELKTMAEAWNAVFANPSHDRTEGAHMLIKTHPEELAGKVRHFMNELLKREGKA >gi|316923559|gb|ADCP01000068.1| GENE 54 62926 - 65049 2797 707 aa, chain + ## HITS:1 COG:all4026 KEGG:ns NR:ns ## COG: all4026 COG1629 # Protein_GI_number: 17231518 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Nostoc sp. PCC 7120 # 33 707 189 854 854 385 35.0 1e-106 MKRNGIAGILAAVLLSAWSLPAGAEETYKLDPVLVTAEKRTENVQDVPVSVTAISEQQIK DSGIRSIQDVARQVPNLFIANWGFRGNSYAFIRGIGAVNNDPAIGFYVDDVNYMDSRVFD TNLFDIERIEVLRGPQGTLYGRNSLGGVVNIVTKKPDNEFHYGLEQTVGNENLYETTLYM RAPLIKDKLFFGVSGTSEQMDGYNTNDFLDKKVDRRRGLNGRMQLRWMPTDKLDVTANVD GEKVNDGVFPLTDMDQADKNPHHVSYDYEGRDKRDTLGSSLRVAYDAPWFKMTSITAYRG YNDVTRNDQDFTPYDLITAREDIKDRQFTQEFRFASPEGSGPLKWLGGLYLYKKHQDHTL DLNYGQGAADMGMVPMAMTNTADSDIKTYGYAVFGQATYTLFDKLDLTAGLRYEYEKNKL DYASDYLAGGMVVPGMGSDIRGRKHDDVFLPKAQIAYRWTPDFMTYAGVSRGYRSGGFNT SFLDVSDLAFDPEYSWNYEVGFKSSWFNNRVNFNTSLFYIDLSDQQVTQVLPTANTVIRN AGKSRSMGFEVEASALITEGLLFEGSFGYTDAKYRRYSDKVSGMDYVGNRTPLAPEYTYN LALQYSLPLLESFDFFHKEDSLTWITRAELQGVGKFYWNDANTLKQDPYELVNLRTGLET DNYSITFWAKNVFDKKYNCVAFAFSGSSALAQVGAPRSFGVTFRADF >gi|316923559|gb|ADCP01000068.1| GENE 55 65178 - 67574 2637 798 aa, chain + ## HITS:1 COG:PA3079 KEGG:ns NR:ns ## COG: PA3079 COG1033 # Protein_GI_number: 15598275 # Func_class: R General function prediction only # Function: Predicted exporters of the RND superfamily # Organism: Pseudomonas aeruginosa # 13 774 19 773 793 126 22.0 2e-28 MSSFFFGRLTDWVLRRRGAVAVLIVLLTAFFLWKMQDLQFDNSNEIWFSEGDPSLERINT FHKLFGNDDFVYLVFDADTLFSPESLRLLSDLVSDIRKKVPYLRDVTWLGNAERIEADAE 
GIVVSDFIDPETYDAASMPELRRRALAEKNYRDNLISSDGKTVGMLLEMRTYPRDSVEPR SEVTHALREVLAQPEYAGFSIHAVGQPILHNDYNELSFKESATFFGLCLLVQMVLLFWLG KGVRGVIGPISVVVLSVIWTLGMIQVLGFTLNLFIILIPTLLICVCIGDSMHIIATYNQH RRGGLGVVAALRQAMSETGMPCLLTSLTTAAGFLGFCAADIQPFREMGVYASVGALMAYI LSILVVLLLYSTERDKAPSAPTPGMPGAPARTASPSTPIRIGKPDIFDRFLGRVYLINVR YPKVLLALFVVLLGVSVYGYTLVEVETGTARMLSKSLPLRQTYDFVDERMGGSMSVEIML DTGRENGVKSQAFLRGLERLQQHLDVSPLVTKTVSVLDIIKKINESMHGGDKAAYALPDG DDAAIAQYLLLYEMSDGRELDKLVSFDSRVARLTAKTRTLGTGDVRRLSEDVAAFSRGVF GDTVKVRMAGNLDWTKSMNDLLADGQRQSFLAALLVVSVIMCFALGSLRLGLLSMLPNIF PVFVTLGLMGVSGIYMDMPLMSFSAIIIGVVVDDTIHFLFHYRESFARTGSYEKALEETL LSTGRPMLFTTITMLCGFSVLLFSDMVGVVKFGGLGCFAFGWALLADYFLVPAILLTFRP LKAPSNPPFSGETARRGV >gi|316923559|gb|ADCP01000068.1| GENE 56 67571 - 68674 882 367 aa, chain + ## HITS:1 COG:no KEGG:PA14_54900 NR:ns ## KEGG: PA14_54900 # Name: not_defined # Def: hypothetical protein # Organism: P.aeruginosa_PA14 # Pathway: not_defined # 2 352 6 364 384 172 34.0 2e-41 MNGRMIGVIGGTGRVGRECLRYLHENTAFGLLIGGRKPPREALPGSFLSVDVFDEASLAR FCGQCSLVINCAGPASAVRERVAAAALAGGCHYVDPGGYTPLFPILSSRRPEIRAKRLTF LLTLGILPGLSELFPVYVARTCFDQVEGFEYACVGRDRWTFPSAWDIAWGVGNIGNGEAP VYYEQGVRRQAGLLASGRRMDLPAPVGRHTVFRLMRDDLQLFVEESGISEAHVYGNNWGC WVTLATVLVRLAGWYGTERRLAQSARLIMRAAELDMRGRKPGFMLHLRMRGTLRGQPRSV VRTLFLEDTYRATGLCAAIGARLAAEGMEPDVFRAAQMPDPQAFMRHFLAQGYVVTGGAL PGEWGRL >gi|316923559|gb|ADCP01000068.1| GENE 57 68671 - 69099 518 142 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKLYARLVVALGFPIAAACTLSLPLILLGAVAPDPRPWNWPTVYIVDFLYGLALFSQLVG FLCRTSRNRLWTAAVGGYTALAFGQAALEALLDLLGVQPAPITAKWFLFLFMSAGLILGA LMLLLAFRHGRYWTAFSLSQER >gi|316923559|gb|ADCP01000068.1| GENE 58 69102 - 70271 1336 389 aa, chain + ## HITS:1 COG:alr5357_2 KEGG:ns NR:ns ## COG: alr5357_2 COG3320 # Protein_GI_number: 17232849 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Putative dehydrogenase domain of multifunctional non-ribosomal peptide synthetases and related enzymes # Organism: Nostoc sp. 
PCC 7120 # 15 377 8 374 377 167 32.0 4e-41 MTNLASAPAYGRPGVLLTGATGFLGGYLLKELVERTEMPIYCLTRSDGTLGARARLMDNL HFLFGADAVEAWPLERIAIVEGDLAKERFGLSEAGYAELAERTGAVFHAAAMLWHFGKLE QFLKVNVQGTENLLAFCSAGMPKTLNHISTLAVSGRRCDNPKNLFTEADFHESMECPNVY VETKYEAEKRLRPAMLAGHAVRIFRPGFIMGDSKTGRFKKHITSDAQYLHLQGHIFMRTA PPLYDDDYMDLTPVDYAAAAIVHIAFQPDTPPGTYHVCNPQPILKSQIWDIIRDYGFPVR TVPAERYLEEVLDSDDELFLRGLQSVIVYLGDYEKSPAIFDASETLRRLKGSGISCPPPD PALLRRYLNYCVDIGFLPHPSTLGGLEWS >gi|316923559|gb|ADCP01000068.1| GENE 59 70262 - 74695 4241 1477 aa, chain + ## HITS:1 COG:CAC2428 KEGG:ns NR:ns ## COG: CAC2428 COG1924 # Protein_GI_number: 15895693 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Clostridium acetobutylicum # 399 712 3 313 319 256 42.0 4e-67 MVMSASSFSGFPADARILGLDIGSISVDVVLLDGLGTPLYSRYVRHHGQPDKVLCAELEA LERTHAGIAAAVTGTGADRVGRLLGACAVNEVIAQVAAASFRCPQARSVIDIGGQDSKYI QLEPAGEGGGLRLKDFSVSSMCASGTGSFLDQQATRLGLDIETEFGEAALRSRAPARIAG RCSVFAKSDMIHLQQKGTPVEDIVAGLCHALARSFKANVVGVKALELPVALVGGVAANAG VVKALRSVLHLEEDGLFVPEDFALSGAFGAALFLLGGQGEAVPYKGLAAFREGLRQRVPM KTLVPLQPVSVPPPGCRGASDEGPKVSAGFMPPLRPSASSLPDGAGTRGLPGRPEARIPE GGSPSSAPREPAASPLAADDGDSIRSSLCLPLERRTRVFLGIDIGSISTNIVLMDENREV VAKYYLMTASRPIEAVRTGFRDILERYGELADVQAVGVTGSGRHLIGDLAGADVVVNEIT AHATASAFICPEVDTIFEIGGQDSKYVRLENGLVKDFTMNKACAAGTGSFLEEQAEKLGI SIKRDFARLALAAEHPVDCGEQCTVFIDSEVVRHQQRGTPVSDIAGGLAYSIATNYLHRV VEKRPVGDHILFQGGVAFNTAVLAAFERLTGKAITVPPHNEVMGAIGCCLIARRKMLETP GFATGFTGFGVLEKGYRQESFQCNLCANQCDISKILVEGHRPLFYGGRCERYEVRRSSGG GGLRDLFAERERLLMSAYQPKGKAGSRGVIGYPRMLTFHEYYPFFQAFFSELGFSLLLSP PTNAEIVRRGVSGVASAACFPTKVAHGHAAWMKEAVLEGKAGAMLIPSLRETFPTAEAHP YANHCSYIQFIPDLVNEAFKLEASGIRLIRPALHFRMGREQVLRELERTAASLGVSSRAE VRRACDEAYAAQARFREARNALGREALDALGPDGKAVVLVGKAHNIHDPGTNLNLARILR GMGIQTIPSDLLDLFHSPDVGEAWRNMTLAMGQRTLAAADIIRRDARLNAIYLANFGCVN DSMYPRFFGREMGEKPFLLLEIDEHSAEAGVVTRCEAFLDAVANFDAAREIAPRRTRKIE FDPGGDRVLYLPHAANGMAVWAAALRAHGINAKLLPPPDERSLEWGRRCLDGKECLPCTL 
MTGDMVRLIKEDGVDPAKAAFFMPGSCGSCRYDLFNTLQQIVFEDMGLGGAALVDEYQGA NRKLHAIMSGASCGMLAWRGFIAADILEKLRLHIRPYETGAGDTDRAYYACLDRLVEVVE AKGDVERAVIGMVEAMRAVPVDRSRPRPLIGLVGEAYLRNVDYASNNIIQSVEQMGGEIR MPAIMEVLWYSLYKQRYFQELGRHRVKAFIHRVQHGILNRIERKMRRHAASVLPDPYEKP IWEVIGQSGLSLDAGLGFGASVEMARSGISGIIHAIPFNCVPGTVIQGLEGRFRSLFPGV PFMTVGFSGQADLGVRIRLEALVHQCRSLAPGHAARM >gi|316923559|gb|ADCP01000068.1| GENE 60 74789 - 75886 1317 365 aa, chain + ## HITS:1 COG:MA3614 KEGG:ns NR:ns ## COG: MA3614 COG0037 # Protein_GI_number: 20092414 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Methanosarcina acetivorans str.C2A # 9 317 10 329 379 63 21.0 9e-10 MFRYTNHVCKKCFLQNDPFHKVTLNEDGVCSLCAKPAPEEARRDWDALEKRFAAHLDRVR GTRPYEGLIMMSGGKDSAYLADLLKETYGMRLLGFIIDINYEYPETFENAKTIARKLDLP YVVFRQEPEAMRRYYRFLFTEKALLQPDGGQVCTFCGRFLIHSACAFARRMGIPLVFSGH NPDQIFLMGESIHNTPEQDVLIEFLMETLSEETGKALARWRESYGEPELPLFPDSIHVEG TELLFPFQYFPYRPEAMMRHVRERLQWLPIKRFSKTYIASGCRLVKLWAYMAYLNNTNSY VDFEFCNQIRNGTLSADTVRQFYEQAEIDYEELAELIAELDMAGPMKALLTPYGEKAEAL LRLLP >gi|316923559|gb|ADCP01000068.1| GENE 61 75934 - 76833 940 299 aa, chain + ## HITS:1 COG:all4037 KEGG:ns NR:ns ## COG: all4037 COG1045 # Protein_GI_number: 17231529 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Nostoc sp. 
PCC 7120 # 130 296 4 161 253 152 46.0 7e-37 MSQIFLSPFAMPELDRVVEQLCAPSSLEGVFHRPLHDAPMPSLDVLGEILSRLKIALFPG YFGMSNVRMESMRYHLSANLDSIFRKLAEQIRCGACFACAEYAQECLECERISREKALEF IERLPHIRRLLATDARAAYQGDPAATSAGETIFCYPSILTMIHYRIAHELYLLEVPIIPR ILCEMAHAITGIDIHPGASVGEAFFIDHGTGVVIGETCVIGRGCRLYQGVTLGALSFPKG ADGVLIKGIPRHPKLEDDVTVYAGATILGNITIGAGSVIGANTWVTRDVPRNSKVVSGD >gi|316923559|gb|ADCP01000068.1| GENE 62 76835 - 78586 256 583 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 333 560 127 354 398 103 32 8e-21 MENTNSLGLLLSFARPCRRLLAASVGLAVLGVASGMVPYFAVSYMIVDLYAGTATPESLF LFALIALLGQAGRVALGTASTVLSHRAAFAILKGIRTDIAAKLSRMPLGSVIETPSGKLK TLIVDTVEKMEVPLAHLIPELTSNLLVPLFMAAYLFWLDWRMALLALATFPVGLFCYMAM TKDYAERYATVQDAGKGMNAAIVEYINGIEVIKAFNQSSASYGKFVAAVRANRDAMKDWF RATNGYYVAGMAIAPASLITVLPAGIYFHMSGTLDAPAAITCFILSLGLIQPILQALGYT DSLAMMDSTLKEVSDLLGRPEMVRPSNCVPLRGHGIEFDQVSFAYTGTGREVLQDVSFKI VEGGMTAIVGPSGSGKSTLAKLLVSFWEADRGRILLGGVDVRKLPLSQVMQSVAYVSQDN FLFNVSLRENIRLGRPDATDREVEEAAEAAACGFIKALPGGYDTPAGDAGARLSGGERQR IAIARAMLKDSPIVVLDEATAFADPENEAFIQESISRLVKDKTLVVIAHRLSTIVRADRI VVMDGGRVSAVGTHAELLQTSPLYGRLWESHTGARDAEGAEHD >gi|316923559|gb|ADCP01000068.1| GENE 63 78579 - 80312 195 577 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 340 555 135 351 398 79 27 9e-14 MIETIRRLLDFSGRCRTDLLCAFGYGVLYSVSEVIPVLAIVVALEAVLSPSGGGWFAVAA SGGLMGLSVAGKIVFGKRATERRMLASYDMCADKRIEVGEILKRVPLGYFSQNRLGELTA TLTTTLGEIENNAVAILDKVVNGFVHAVAITLLISWYDWHAGLITAAGLVVSLFVYGAMQ RKGRKLSPVRQAAQSRLVAAVLEYVQGMAVVKAFGLGDRSNRAVDEAIEACRASNTDLER SFSTLAAAYQLVFKVAAFGVLLSSGLLYLSGDMDLIKCLLLMVSSFMLYAHIEMMGSVSA LSRVISVALDRMEALKNAPLLDERGEDFVPESFDIRMEDVVFSYDGTRLLEGINLVIPQG TTTAIVGPSGSGKTTLCSLIARFWDVQEGRVLFGGRDVREYTCDGLLRNISMVFQNVYLF EDTILNNIRFGKPDATMEEVREAARKARCHDFIMALPEGYETPVGEGGVSLSGGERQRIS LARAMLKDAPVIILDEATASVDPENERQLQEAFEALTRDKTVIMIAHRLSTVRKADQILV 
LDKGRIIQRGRHEELMEQGGLYADFIRIREQAVGWSL >gi|316923559|gb|ADCP01000068.1| GENE 64 80614 - 81036 66 140 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIDDPVTRIVTIIFRHAQDLSHQPGVLVPADQPRNLAVRGNTPFRYFLNNRKYLVYQILV YHAAPHGIFPTLDGNRAFFLTPSISTNLRFLSTPTKYPVAYRGLRVATFTTHQQRTRYLL ERHDLLQIPGGGNGPYTLTE >gi|316923559|gb|ADCP01000068.1| GENE 65 80839 - 81150 187 103 aa, chain + ## HITS:1 COG:lin0580 KEGG:ns NR:ns ## COG: lin0580 COG3695 # Protein_GI_number: 16799655 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Listeria innocua # 5 101 1 97 98 96 51.0 1e-20 MGCRMVHKDLIYEILSVVEEIPKGCVATYGQIARLIGRDKNSRLVGKVLSMAEYYGDYPC HRVVNHAGRLVPGWLEQRLLLEAEQVVMKDDRHVDLKKCRWDY >gi|316923559|gb|ADCP01000068.1| GENE 66 81206 - 81703 242 165 aa, chain + ## HITS:1 COG:L118481 KEGG:ns NR:ns ## COG: L118481 COG0350 # Protein_GI_number: 15672513 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Lactococcus lactis # 9 152 23 166 169 152 50.0 2e-37 MLAAKGDALAGLWIEGQKYFADGLKGEAREQDDFPVFVQAKEWLDRYFAGEKPRISELKL EPAGSEFRKAVWEALCQIPYGEVVTYGELAQKMAVLHNRERMSAQAVGGAVGHNAISIII PCHRVVGANGSLTGYAGGIDKKIKLLTHEGAYSESFFVPKKSTAP >gi|316923559|gb|ADCP01000068.1| GENE 67 81719 - 82669 770 316 aa, chain - ## HITS:1 COG:BMEII0251 KEGG:ns NR:ns ## COG: BMEII0251 COG0837 # Protein_GI_number: 17988595 # Func_class: G Carbohydrate transport and metabolism # Function: Glucokinase # Organism: Brucella melitensis # 4 311 20 329 348 95 28.0 1e-19 MPYIMAVDLGGTNCRFAGFTLDKGILSLHRMGKRKTAELPDTGALISACGTALDRHPRTA DAFVVGMAGPVADPLKARLTNAPLEVDLTDAGERYGIRSCRIINDFTAEACACLTEAGSS AHCVLDAPGPRPAGPIGIIGAGTGLGTASLIRDSHGSWLPLPAEGGHVIFPFIGKEEAAF QDFAARELGLACLSGDDVLSGRGLRLLHLFLTGERREAHEIAAEAFGSETGTLRWYARFY ARACRNWALSTLCTGGLFITGGIALRNPLVTECAAFREAFYEGPHRRLLERIPVRRFTDM NSGLWGSAWFGMRMTA >gi|316923559|gb|ADCP01000068.1| GENE 68 82891 - 83826 1127 311 aa, chain - ## HITS:1 COG:mlr0392 KEGG:ns 
NR:ns ## COG: mlr0392 COG0320 # Protein_GI_number: 13470626 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate synthase # Organism: Mesorhizobium loti # 11 291 28 309 321 274 50.0 1e-73 MSAPENSKPSLRIPPWLRVKLPCNHTFSDTRALVEGLDLHTVCHSAKCPNMFECFSRKTA TFLILGNTCTRNCAFCNIGGGIIHAPDPEEPARVGKAAAALGLRHAVVTSVTRDDLPDGG SAHFAATIRAIRDSGETCPTIEVLIPDFRGSREALRTVMDAEPHIINHNVETPPAHYSRI RPQADYQQSLELLRRVKEAGRVSKSGLMVGLGETDEEVHGVLADLAAAGCDIVTIGQYMR PSQKHHPVERYVHPDIFESYAQKGRELGIPFVFSAPLVRSSYNAEQAYASLLNRQQHMPS EAAEPRGSEQP >gi|316923559|gb|ADCP01000068.1| GENE 69 83768 - 84412 618 214 aa, chain - ## HITS:1 COG:DR0764 KEGG:ns NR:ns ## COG: DR0764 COG0321 # Protein_GI_number: 15805790 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase B # Organism: Deinococcus radiodurans # 2 208 99 316 336 151 40.0 8e-37 MNIVDLGVMPYSEALAVQLECHERVRQGEEDTLFLVEHPPVITFGRHGGEENLLLGRAEL AARGVEIVKTDRGGNITCHFPGQLVAYPVFRVGKRTNGLHGFVRTLEEIVIRSAAAFGVE AARWEGRPGVWIGNRKLCSLGMCVRHWVSFHGFALNVGNDLSLFSAITLCGLHDAEATSL SRECGDDSLSMQEVKDVCTREFQTLFADPPVAPC >gi|316923559|gb|ADCP01000068.1| GENE 70 84834 - 87482 2893 882 aa, chain + ## HITS:1 COG:TM0272 KEGG:ns NR:ns ## COG: TM0272 COG0574 # Protein_GI_number: 15643042 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Thermotoga maritima # 2 865 4 867 881 957 57.0 0 MRYVYLFHEGNARMKELLGGKGANLAEMTNIGLPVPCGMTITTDACREYYNNGKKLPEGL VEEVTRNLKAIEAEAGKRFGDEDDPLLLSVRSGAVFSMPGMMDTILNLGLNAATFKALAR HTGNMWFACDTYRRFIQMFSDVVMEIPKDKFEHILQEQKAAQGVTQDQELSVESLLAVID KSKALYRQEIEEDFPEDVTRQLFLSIEAVFRSWNNHRAIVYRNLNKIDHNLGTAVNIQAM VFGNMGEDSGSGVAFSRNPSTGERKLYGEYLINAQGEDVVAGVRTPRPIAELGEDMPEIF EQFRSIAKKLERHYRDMQDIELTIERGKLYILQTRNGKRTAQAALKIAHDMVGEGLIDKK EALLRIDPEHLAHVLHRQIDSSADLTVLAAGLAASPGAAFGSVVFDANEAEHLGRIGSKV ILVRVETTPDDIHGIVQAQGILTSRGGMTSHAAVVTRGMGKPCVCGCEAVKVDYEQQLFT VGSTVVRKGELISIDGTTGQVILGAVPLKDPELSKEYQTILEWADEVRTLQVRANADTPE 
DAEKSRKFGAQGIGLTRTEHMFMAQERLPYVQRMILATTTEERMGALLPLRIMQENDFYS ILKAMHDLPVCIRLLDPPLHEFLPSLEKLLVETTELRIRKDNPQLLEEKERLLAQVVKLH EANPMMGHRGCRLGITYPEVYEMQMHAIFNAASRLTRDGYTVLPEIEIPLTISKAEMEIL KGRCDRIARECMELHRVSFTYLCGSMIELPRAALLAGEIAESAEFFSFGTNDLTQTCFGF SRDDAEGKFLPAYIHQHILKDNPFAVLDREGVGRLMRIAVEEGRKTRPDLMIGICGEHGG DPSSVEFCHEIGLDLVSCSPYRIPIARLAAAQAALKHPRPVG >gi|316923559|gb|ADCP01000068.1| GENE 71 87734 - 88669 898 311 aa, chain - ## HITS:1 COG:ECs1923 KEGG:ns NR:ns ## COG: ECs1923 COG0583 # Protein_GI_number: 15831177 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 10 303 7 296 302 76 24.0 5e-14 MIEELNGDFLQWLRGFYYVSQTGSIRRAAQMMNRNPSTISYQIRSLEEELNTVLFDRYKK SLRITPEGKKLFEWTVYTFETLRGLRSEVGTLSGKLQGTVTFSSNLPFAAQVVGIISGFR ERHPDVNIRIRRALTYEVVDDVESSKVDFGLTGVTVLPDHSELEELFQARPLLIVHRDNT YHLPKKPKLSDLERLPFVSFLSERMDDSGEPYFENMDATLFVKNTVLSVNNYHLMLRYVL HGVGAAIMDEMCLKASSYGTDWRPLVSYPLDEFLPTVRYGILVRKRKHLSPQAKGLIENI REELAKTNLPA >gi|316923559|gb|ADCP01000068.1| GENE 72 88740 - 90497 1949 585 aa, chain + ## HITS:1 COG:PA2298 KEGG:ns NR:ns ## COG: PA2298 COG1053 # Protein_GI_number: 15597494 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Pseudomonas aeruginosa # 43 556 7 521 574 194 29.0 3e-49 MMFFSQKEAWDAADPRRGPPPQRTSQQEGTMNTSTHPLGTLIETDVLVIGSGASGCGAAL GAREQGLRVLLMDKGKLESSGCIGGGNDHYMAVLDEEGVAHDAAEDLIKFYAKPLNGWTP AMLRNGWYAHMKPMLEKLEAAGVEFGRTPDGRYHRTQGFGQPGTWWVHIANGMTIKRVMA RIVRESGVDVLDRVMAVKILTDGGKACGALGWNVSTGEFHIIRAKTVVSAQGRSATRGTD NSTHNPYNVWMYPYNTSAGVVLGYDAGAAVTELDTYQRATMLPKGYGCPGMNGINSSGAH EINALGERFMGKYDPMWENGVRNNQIQGTFQEQLEGSGPPFYMDMRHVDEAVVRELQDIL MPGDKATFGDWAECTGTDFQHKLLEVEIGELIFGGTIAVNDQFETSVPGLFCGSIFLYCS GAMCGGFEAGRQAALKASGMSAAGAVDEALAARVRDEIFAPLGNEETLSYKELEEAARNV MNYYMGFRRSMTGMARALEKIRFLSDQASRLHADTLRDLMRCHEAKDVLTICELAIQATM ERKESGRCVYRLTDYPEINPEMAKPLLLMRGENGPVYQWGKAPLL >gi|316923559|gb|ADCP01000068.1| GENE 73 90579 - 92267 1505 562 aa, chain + 
## HITS:1 COG:Ta0414 KEGG:ns NR:ns ## COG: Ta0414 COG0493 # Protein_GI_number: 16081537 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermoplasma acidophilum # 28 420 47 449 484 201 34.0 2e-51 MPPRYSREIEQPIVMGDSLKDQFRTSPCEASCPAGNPIQKMQSLVEKGRFGEALRYLRAR NPFPGVTGRVCPHFCQAKCNRDRLEGCVNTRALERAAYDHAPKGIEPFTRAPETGKHVAV VGGGPAGMTAAYFLSLLGHDVTVYESGPVLGGIPRYAVPDFRLPRDVVDREIGWILETGV RACVNVTVGKDISFAELRASSDAVVVATGTPKENALPIPNAECGLKAVEFLRKAALGQRP DVGRTVVIMGGGGVAFDCAFTARRLGAADVHVLCLEAEGAMRAPEEDLEQARREGVHVHN SCTMSGVRKDGDRVSGVDYFDVQECRFDEQGRLSLIPVPGGEHVLACDTVIFAVGMKTDL GFFEGEVPECTPRQWIVADAAQKTSLEGVFAAGDVASGPSSIAGAVGAGRRAAFGVHAYL TGENSRVYIINEEGRIEARDRLATSQPPYVVPFEEIYGVEQYGRAEPQRQGIREGLSFRE INEGYTPEQARDEAARCMHCGHCKGCGTCVDDCPGYVLELKHLEERDRPEVAFGDECWHC ANCRTSCPCGAIGFSFPLRMQV >gi|316923559|gb|ADCP01000068.1| GENE 74 92447 - 93907 1752 486 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_4598 NR:ns ## KEGG: Dhaf_4598 # Name: not_defined # Def: sodium/sulphate symporter # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 14 486 17 489 489 254 34.0 4e-66 MSANAQMTAKGDAMYYVHSFLCLLIMFGFGQLPPVEPLTALGMRLIGIFLGLLYGWIFID IVWPSMAGLLALMLVGGMKPVALLNRSFGDPIVVMMFFIFVFCATINHYGLSRFISLWFI TRKFVAGRPWVFTFTFLASMFILGGLTSASPAAIIGWSILYGICDVCGYKKGDGYPTMMV FGIVFASQVGMSLIPFKQAALTVFSAYETMSGVGIDYAKYMLIAACSCVLCSLLFIALGK YVFKPDMEKLKKLDASKLDTDGALKLSKVQKLILCFLFALVALLLLPNFLPADLFISKFL KSIGNTGICVFLVTVMCFLKVDGKPLLKFKAMIDSGVAWGIILLLAVVQPLSGAMAAQES GITNFLMMIVEPIFGGSSPVVFALFIGFVATALTQVMNNGAVGVALMPVIFSYCSSMNVA PELPLIMVVMGVHLAFLTPAASASAALLHGNEWSDSGSIWKTAPLVILLSWIAIAVVTVV LGGVLF >gi|316923559|gb|ADCP01000068.1| GENE 75 94255 - 94668 317 137 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQSQVLRYIRFFIFGCWGILLLLMLLYGPWMIQAKQITIPDISITNFMIRDTSSFFPMHQ EAAEQLIGEGKAVLSATEAVPNKLPNMRLVVFLLLLFPLCTKVLRTRFSESLLVPHLSDI RGMLPLPLAPPSGLCFS >gi|316923559|gb|ADCP01000068.1| GENE 76 94961 - 95413 578 150 aa, chain + ## 
HITS:1 COG:no KEGG:RSKD131_4191 NR:ns ## KEGG: RSKD131_4191 # Name: not_defined # Def: tripartite ATP-independent periplasmic transporter, DctQ component # Organism: R.sphaeroides_KD131 # Pathway: not_defined # 2 150 1 146 147 84 32.0 1e-15 MLALLVLLLGMEVFSRFLLGKSFSWMEELCRYLFVWSSYIGVAIAVKHKEQLRILMFMDV LKKRFPQLVRILYVVSELTFTVFCALVFYYSLGMLENMIRFKQVSAALEINVMYAYLIIP ISMALTIFRTLQGLCRDIRNNTLEFESRED >gi|316923559|gb|ADCP01000068.1| GENE 77 95417 - 96685 666 422 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 [Lentisphaera araneosa HTCC2155] # 1 421 6 428 432 261 32 2e-68 MLLVISFVVFMLLGLPLFGVIGLASIISMIGMPFPMEFIPQNLYTGMDQFPLIATIGFVM AGFLMEPAGITGEIIEVAKKAVGNIRGGLAMVTILACMIFAALSGSGPATTAAIGSIMIP GMIAAGYRKDDAAAVSATGGTLGILIPPSNPMIIYGVVANTSIAGLFMAGFIPGFVLTSL MCITAYCIAKKRGYQGTNEKFSIVGLCKAIWDGKWALATPAIILGGIYWGVFTPTEAAEV GVLWTLFVGLFVYRKLTWKNISSALIRTSAFAGSATILVGVSMAFSRLLTLYHIPQTVGA FLGSISTDPTITLLLIAFFIFLCGFVADTLAMVVVLAPVFLPITNALGIHPIQLGILFVV CCETGFLTPPFGANLFITMKITDVKLEEVALRAFPYLLAMWLLIIVVAACPDFVMFLPRL LM >gi|316923559|gb|ADCP01000068.1| GENE 78 96753 - 97292 112 179 aa, chain + ## HITS:1 COG:no KEGG:DVU2560 NR:ns ## KEGG: DVU2560 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: Fatty acid biosynthesis [PATH:dvu00061]; Metabolic pathways [PATH:dvu01100] # 79 179 7 100 128 63 36.0 4e-09 MFPVRTFRRMEVHGGLEPLAQALAAWGKRGRMEYVSRISGVCGGRIAFSDWYGYTKSMEN EKAIPGGRVDGSGNREVRGTWRFSGEHPLFDEHFPGAPRVPGTLVIEAMRAGAEALFPGW RVLGVKRFRFRHFIEPGAYDYAFTPQLDAAAIRCVLLSGERRMAEGTLLVAADPARGEA >gi|316923559|gb|ADCP01000068.1| GENE 79 97295 - 98017 286 240 aa, chain + ## HITS:1 COG:RSc1052 KEGG:ns NR:ns ## COG: RSc1052 COG1028 # Protein_GI_number: 17545771 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Ralstonia solanacearum # 
8 233 11 246 249 75 28.0 1e-13 MLFKGPVIITGGSSQLGMALARALREEGAKTLSVCRSSEGLARCGEAGLSCLPLDEPESL PERCRGLLGEPPAHLVDLMHSRFESLIAGAAPEAIMEWAAVDIGLRARLVRAVGRSMLAR RFGRCVFVSSSAAECPAEGQGYYAAAKLAGEALYRSLAVELSGRGVTACSLRLSWLDAGR GAVFLERRREAVEKRMPIGRLVRMDEAVETVLFLLSASASSVNGTVVTLDGGLSATKTTF >gi|316923559|gb|ADCP01000068.1| GENE 80 98025 - 98411 511 128 aa, chain + ## HITS:1 COG:no KEGG:DvMF_1005 NR:ns ## KEGG: DvMF_1005 # Name: not_defined # Def: acyl carrier protein, putative # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 13 122 19 132 137 93 49.0 2e-18 MDYSEITDKVCGIVADIFECDPRALSGETRLMTDLPCESIDLLEIGARLGQTFHILIDDD VVFLRSLRYHVAHGGEPEAVIRREYPYLSPERVKALALGLSNPDAVPQLCLDDIAAYIRA ALPSQEYE >gi|316923559|gb|ADCP01000068.1| GENE 81 98716 - 99615 328 299 aa, chain + ## HITS:1 COG:aq_1717 KEGG:ns NR:ns ## COG: aq_1717 COG0304 # Protein_GI_number: 15606798 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Aquifex aeolicus # 43 295 156 411 415 174 39.0 1e-43 MLDFDRENGLPPADSEALDALWLLRWLPNTGNTALARFLDIHGEGLVLGTACASALQAIG EGFRRVRYGLSETVLCSGGDSRLSRGGMLGYAKAHALSRNPVPSEASRPFDEARDGFVPG EGGAAFVLESLESARARGAAVFAELLGFGASLDGGALTAPDESARFAEQAVRGALADAGL EPHAVDWVSAHGTGTPLNDRSEAVLLERVFAQAEARPAVTALKSWIGHGSAACGGMELAL MLAAWRSGRLPPIRNLRAPCSGLLDFAVATRPFPGPVGVLENFGFGGQNAALVVRCGDD >gi|316923559|gb|ADCP01000068.1| GENE 82 100102 - 101274 657 390 aa, chain + ## HITS:1 COG:PM1901 KEGG:ns NR:ns ## COG: PM1901 COG0156 # Protein_GI_number: 15603766 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Pasteurella multocida # 14 371 7 368 387 168 32.0 1e-41 MADAKRTLRECLDERLKEGRARGLERRPPLLDMAGEGPVVHMGGRAFLNFTLNDGLGLAS SPEWRAEVGECFAAFPPSASASRLAGGRSRITEEAEQAVAAYFGFDECLFLPSGYQGNLA CAMALIHAGQPVFVDRRVHASIARGLVLGKADVRTYAHADYGHLERRLVSAPPATVQPLV MTESLFSMDGTELDVSRMAELRSKYGFFLMVDEAHAVGALGPGGRGLCAGVPGTADIVLG 
TFGKSLGLFGSFLLLPKGFAAFFESLSSAVMHSTAMPPAHAAAVLKLLERLPLLDAERAR LRDNAVFFRTRLCELGIPTRGTAHIVAVPTGGEARTTHLGEQLAERGVLALAARYPTVPY DDGLLRFGLTALHTQGMLERTARLLADLWG >gi|316923559|gb|ADCP01000068.1| GENE 83 101473 - 102174 758 233 aa, chain - ## HITS:1 COG:CAC2546 KEGG:ns NR:ns ## COG: CAC2546 COG2186 # Protein_GI_number: 15895808 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 226 1 230 231 120 33.0 2e-27 MFDKVVKSTVPGQVIEQIKELLVKGELKRGDRLPPERQLADMLGVSRPSLREALRALEYA GMLETRVGEGIFVADGDSIMMNNLLMLHLIKQYALEEMIEVRKVLETSNVRFAVLRARDE DLAALKEILEQSRGQIANKAAFIKSDYAFHQAIAVASGNSILATMLQTMRTMMSDFNSQL LTSQEGRQQVYAHHKKIVEAILNRDEKAAQDAMFLHLENVVQSMKKASPSKSK >gi|316923559|gb|ADCP01000068.1| GENE 84 102662 - 103438 1177 258 aa, chain + ## HITS:1 COG:ylbA KEGG:ns NR:ns ## COG: ylbA COG3257 # Protein_GI_number: 16128499 # Func_class: R General function prediction only # Function: Uncharacterized protein, possibly involved in glyoxylate utilization # Organism: Escherichia coli K12 # 2 258 9 261 261 235 47.0 8e-62 MGYPSDLLSSRAVVKPGMYAIIPPEGRVFNVIPGIEGCRMTILCTPKMGAGFVQHIGTAL PGGGTTVPYGASGQIETFIYVLDGEGSLTVTVGGRTEVMPQGGYAYAPAGVGISFRNETD KPLRFLLYKQRYIPHPDPAMQPYAVFGNTNDIEERIYDNMENVFVRDLLPVDERFDMNMH ILSFAPGGCHPFVETHVQEHGAYLYEGEGLYLLNDDWVPVKAEDFVWMGAYCKQCCYGVG LTRLSYIYSKDCHRDAEI >gi|316923559|gb|ADCP01000068.1| GENE 85 103514 - 104539 1159 341 aa, chain + ## HITS:1 COG:PH0720 KEGG:ns NR:ns ## COG: PH0720 COG0540 # Protein_GI_number: 14590597 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Pyrococcus horikoshii # 22 340 4 307 308 157 35.0 4e-38 MTISASLKPLDLRDELEAANMKGKNLTFLDDYSKEELGHLFKAAELLEPYSKNQIPLLAN KLLYTLFFQPSTRTRCSHEVAMHRLGGRVITETDPMGHSAMAKNESLYDGIRVISQYADV LVLRHGDSEQVLSTLERLGSDACPIISGGYGHVTHPTQGLLDMYTALRALKKPFEDMRII ISTPDLSRARSGQSFALGAARMGAHIIYTGSSGLRTPQVLIDKLKSMGASFEEHFDLTHD ENVELMTRADLLYLPGSSLKKEDPNREDFMRKVANHYLYLNDLEYIKKKTGRVVGIMHSL 
PRNDFEFEQSLDKSEFELYFKQIGFSIPLRMALLASICGVN >gi|316923559|gb|ADCP01000068.1| GENE 86 104696 - 105649 895 317 aa, chain + ## HITS:1 COG:yahI KEGG:ns NR:ns ## COG: yahI COG0549 # Protein_GI_number: 16128308 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 4 316 2 314 316 278 47.0 9e-75 MPTRKVAVIAIGGNSLILSKDAQTVQDQYRALCVTGKHIVSLVELGYQVVVTYGNGPQVG FLLWRSEVAHATSGLHGLPLVTCVADTQGGIGYQIQQTLKNEFLRRGLCEDSIAVLTQVV VDKDDPGFRNPTKPVGEFYPPERAEVLRRENPSWVLREDAGRGFRRYVPSPRPLEIVEEA GLRTLLASGLHLIAGGGGGIPVFRTEGGYEGVDAVVDKDLTSCLLAKRIKSDLFVISTAV SNVAIHFGTPEQENISRMTVAEAMACVEAGYFAPGSMLPKVLAAVDFVRATGKEAIITSP ENVRAAVVEGKGTHIVP >gi|316923559|gb|ADCP01000068.1| GENE 87 105940 - 107262 1210 440 aa, chain + ## HITS:1 COG:ybbX KEGG:ns NR:ns ## COG: ybbX COG0044 # Protein_GI_number: 16128496 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Escherichia coli K12 # 15 439 25 453 453 275 37.0 1e-73 MLVTADRVFGGHLYVRNGKIAAISDDGTLPAREELDAAGLYVLPGMLDCHAHINDPGFTW REDLPHASEAAAVGGVTTLIDMPLQNTPPLTSADAFANKLAVFEGRSLVDYALWGGLVTD NTAELAGMDAAGAVGFKAFIAPVSLGYSSVDMGLAREALFRIKAFGGLAGFHCEDHALIC AGEAKAKAGGGTRRAFLDSRPVVAEVIATANVIELARETGARVHICHVSHPRVAELVRRA QADGLSVTGETCPHYLVFTEENLLSCGTVFKCAPPLRTAEARDGLWEYVLDGTLSCIGSD HSPSRPDEKDEAVHGVMGAWGGLSGLQSLVQVMFDQAVTRRGCSPSLLARFASSAARVFG LDGRKGALGVGLDADAVLIDPGRAWTITAHSLRYLNPFSAFEGLEGVGLPVCTVVRGRVV AREGNVLAPFGHGLFQPRKP >gi|316923559|gb|ADCP01000068.1| GENE 88 107313 - 108905 2309 530 aa, chain + ## HITS:1 COG:AGl2786 KEGG:ns NR:ns ## COG: AGl2786 COG0747 # Protein_GI_number: 15891502 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 35 511 87 565 585 345 40.0 1e-94 MASLTNGWKLFSAAALTAAMLFTAVPAKAEAIPDGGTFTVTWAQNPVSLNPGLSSGISSG IPGAQLFASPLQYDDQWNPHPYLAEKWEMAPDGLSLTLHLVKGAKFHDGTPITSEDVAFS 
IMAIKANHPFKAMYAPVSGVDTPDPYTAVIRLSKPHPAILLCMSPVLCPIMPKHVYGTDP NIRQNPANKAPIGSGPFKFVEWKPGDYIMLEKNKDFFIKGKPHVDRVIIKIIPDMNNRVM AVERGEVDAMPFFDSLREVKRLSGDKNLVVTNKGYAGIGSLDWVAFNTGKKPFDDVRVRQ AAAYAIDRNFFAKVIMMGLVTPSATPITPFSPFYTKDVNMYDVNLEKAKKLLDEAGYPVK ADGTRFTVTVDYAPGSPTTKMMAEYLKPQLAKVGIDAKIRISPDFGTWAERVSNYNFDIT TDNVFNWGDPVIGVHRTYSSANIVKGVPFSNTQQYRNPKVDAIMEQAASEVDNEKRAKLY KEFQQIVMNDVPIYFMTTTAYHTIYNKRVGNVPASIWGFLAPYMDVYLKK >gi|316923559|gb|ADCP01000068.1| GENE 89 109042 - 110022 279 326 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 59 322 33 313 320 112 26 2e-23 MTFLRVVGKRLFWAAVLLVLVLCLNFFLIRLAPGDPAVTIAGEMGGASPELLASIREQYG LNQSLFNQFVAYFGKVLHGDLGYSYFFNLPVMELIQDRLPATLLLVVTAVCAAVLIGVCL GVQAGRRPNSWTSHLVTIVALIGFSAPAFWTGMMLLILFASWIPILPVSGMTTIGAGYTG FAYVLDVAHHLILPVVTLAILYIAQYSRMERASMIDILGTDYIRTARAKGLRERTVVYKH ALKNALIPVVTLAGMQFSQVFAGAVLVETVFNWPGMGRLAYESILRKDYPTLLGILFCSV FVVIIANVLTDFAYRFLDPKMRVGRR >gi|316923559|gb|ADCP01000068.1| GENE 90 110025 - 110900 1084 291 aa, chain + ## HITS:1 COG:AGl2782 KEGG:ns NR:ns ## COG: AGl2782 COG1173 # Protein_GI_number: 15891500 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 16 288 63 336 341 264 48.0 1e-70 MSEQSAAWEYKEESPLREAWKMYLQNRAAVLGLVLLGAVLLLTCVGPLVYPVDPFEMVAP PMSAPGEYGLICGSDYLGRDIFAGIIAGGKATMAVGGFATLVTLCIGITIGALAGYRQGI WDELLMRATEFFQVLPPLLLAMVVVALFSPSLTSVAVSIGIVSWPQVARVTRAEFMRIKE MDYVAAARTMGAGNATIMCRVILPNAIPPLVIVATLTISSAILFESGLSFLGLGDPNIMT WGMIIGSNKDYIMNSWWAVTFPGLAIFLVVLAIGLIGDGINDAFNPRLRRR >gi|316923559|gb|ADCP01000068.1| GENE 91 110937 - 111920 1045 327 aa, chain + ## HITS:1 COG:BH0028 KEGG:ns NR:ns ## COG: BH0028 COG0444 # Protein_GI_number: 15612591 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type 
dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Bacillus halodurans # 3 327 10 330 347 316 50.0 4e-86 MEDTLLEVRDLQVEFSTKMGIVRVLDKVNLKVDPRQTVGIIGESGCGKSMMALSIMGLVP IPPGRIAGGEILYRGTDLLTLEKDAILDIRGKRIAMIFQEPMNSLNPVFTVGNQLVETLL RHEDMTRRDAWDKSVDMLKAVGIPSPEKRMHDYSFQMSGGMRQRVMIAMSLCLDPDILIA DEPTTALDVTVQAQILDLLREMQDRVGASIIMITHDLGVVAEVSDKVAVMYAGRKVEEGS VEQILFNPQHPYTKALKGCIPHLQRTPTSGRHRLHEIPGMVLGMAELGKDRCSFYERCPC GKPECMEHNPPAKAIDAGHEVACWLYS >gi|316923559|gb|ADCP01000068.1| GENE 92 111963 - 112982 1182 339 aa, chain + ## HITS:1 COG:CAC3183 KEGG:ns NR:ns ## COG: CAC3183 COG4608 # Protein_GI_number: 15896431 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Clostridium acetobutylicum # 8 324 5 323 323 317 48.0 2e-86 MSTENEGLLEVRDLKVHFPLPKKSLFGERSYVKAVDGVSFRVRKGTTFGIVGESGSGKTT IAKAVMGLVGVSGGSIRLEGNELVGIPEDRLKALRPKYQIVFQDPYSSLNPRMRVGKIVQ DPLNLMDLGSHEEREEREKEVFEQIGLHQAQRALFPHQFSGGQRQRISIGRALASYPELI ICDEPVSALDVAIQAQILNLFMDLQEKNNLTYLFISHDLGVVQHMCDDIGVMYLGRFAEV ADKVSLFTSPAHPYTYSLLSVVPQVSRSPKGGRVKLLGDPPSPIDLKPGCRFNTRCPFAE DVCRHEEPELRKIGDNHFVACHLVRDGLGPHQRGGEWKV >gi|316923559|gb|ADCP01000068.1| GENE 93 113051 - 113635 862 194 aa, chain + ## HITS:1 COG:mll2512 KEGG:ns NR:ns ## COG: mll2512 COG0590 # Protein_GI_number: 13472273 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Mesorhizobium loti # 30 170 10 147 156 97 34.0 2e-20 MEKKRLTVHGNEVTDGRFAWVTPEKDIELLRKGIQAGFKSLAYGNEPYGCVLAGPSGELL LEGLNTCLTEHDPLAHGEMNLCREAAQKYDPEFLWQCSIYVPGSPCSMCTCAIFYTNIGR IVHATSIYDKQDYTVMDLPYLGLSPLEILRRGNKDIVIDGPYPELAEECMKKFEHFDPSG IEFYAKSAHHLMKR >gi|316923559|gb|ADCP01000068.1| GENE 94 113741 - 114805 1324 354 aa, chain + ## HITS:1 COG:SMb20036 KEGG:ns NR:ns ## COG: SMb20036 COG1638 # Protein_GI_number: 16263787 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type 
C4-dicarboxylate transport system, periplasmic component # Organism: Sinorhizobium meliloti # 14 292 16 294 338 134 31.0 2e-31 MKRCAMVLLACAALAGLVSGLAVAAIPVKVASSVGTSEQNYVNVTYGKLVELAQKYSGGA FEFQLYPDMRLGDEQKTVRALQHGDIQMSVLATNNFMPFAPSCGWLNMPYLFGSLEEFRK LVDLMWDQHNAWAVKESGARVLAIVDIGYRQLTTDAAHPVRNLADARGLAVRTPQNALAV SAFNALGFKPHPASFADTYGMLAKGTVNGQEGCFNNVVTMKFADHQKYATCINYAVHSAN IIVNEEWLQGLPEQARDALIRAGREAMAYERTKVSQMLVADDRALQEQGMELLGVVEDLS EWTRLGRTSWLKCYDVLGYGNADKGKAIMAVVLSKKEALANAWEDWLRQPPATR >gi|316923559|gb|ADCP01000068.1| GENE 95 115151 - 115510 387 119 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYTLAHLHVICRDMEKMIRFWREALGAEFVEYRTFGDDPGAIMRFGGLEIYFKEIPQAKE LQSGQVGYEHVGVYVPHLETSVESLVKGFGCTLASMPANQNRVAFVRGPENLMLELIEA >gi|316923559|gb|ADCP01000068.1| GENE 96 116287 - 116862 404 191 aa, chain + ## HITS:1 COG:PA4620 KEGG:ns NR:ns ## COG: PA4620 COG2080 # Protein_GI_number: 15599816 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Pseudomonas aeruginosa # 10 178 1 169 179 218 55.0 6e-57 MSGETMTAEIEKRPLSMIVNGKEVGPIDVPVGMPMIDFLHHYLNLTGTHFGCGQGVCHAC TVLEVQPDGRLVDTRTCIANAHAFNGKTIITIEGQAKTDAEGNVTELTPIQKAFIEKFSF QCGYCTPGFVAGATAFIDRLKHEPVKRADLESAIEDALNDHICRCTGYVRYYEAVRDVAL ATPGCVIGSVD >gi|316923559|gb|ADCP01000068.1| GENE 97 116880 - 118172 1042 430 aa, chain + ## HITS:1 COG:PA4619 KEGG:ns NR:ns ## COG: PA4619 COG2010 # Protein_GI_number: 15599815 # Func_class: C Energy production and conversion # Function: Cytochrome c, mono- and diheme variants # Organism: Pseudomonas aeruginosa # 45 430 29 415 415 274 38.0 3e-73 MRNWIVAIVVIAVVAALVVLSGLFRTSSVAPQAQQTLSPERMQTLIPRGRELALAGDCFG CHSMPQGPMGAGGLAIATPFGTLYSTNITPDKQYGIGNYTRADFHRVMRDGIAPGERNLY PAMPFVFSHITTPDDIDALYAYNMSIPAMPIANKSNTGVFVLPVRPFMDFWTLLNFPDRK VLRNDQRSAEWNRGAYLVEGLAHCGSCHSPRNFMMGVEFSRSLQGGEVDGVVVPDITAAA LAKRGFDVPTLSTYLATGMAPQGTSFDSMYTVTHFSTSAMEPEDVKAVAIYLLTGKDGKL ALPAASPVPLPQAAVPKNGTPMAAGRLAYMASCAGCHGMEGEGIPNVSPAMKGNAALAMD 
NPQTIINVVLNGTPTQIFKNGERMYAMPPFAHRLNAPEVADLVTWIRAEWGGQTVPVTVE QVSAQETAVK >gi|316923559|gb|ADCP01000068.1| GENE 98 118331 - 121201 2032 956 aa, chain + ## HITS:1 COG:PA4621 KEGG:ns NR:ns ## COG: PA4621 COG1529 # Protein_GI_number: 15599817 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Pseudomonas aeruginosa # 22 954 47 941 943 856 47.0 0 MIGRLPDAHALEVHSGPSPLDWMSPDGKARYRWDALRKVTGQKTFARDFRARDLSGWPKE QAHAFFIKATRADRIFEGVDLSLLGPKLQPDRLVLHEELFGDGVTVPQPESLAAGFYGKN FLVPKGQTPPLIGHPVALLVYHDFDRFEAAKRMLRFAQDVVKYGAATGPNTPPNYGAARY VRIGGDTPASSSVYSPMQDAIIWGAFDGNNPVWPPESPTGVFVPTSVMLKQRGGGGVRDP LTVARETYEPMRRGMDAAADIEKAIEAARKDESKLVLDRSGFSQSIDPCALEADNGNAWY DAKTRTLHLIVGAQSPYEVARVAALMVKDNKRFPVETIKLLSGTTVGYGSKDHSLFPFYA IAACFYGDGLPVRLANDRYEQFQLGIKRHSVEMDVTIVADRKSGKFEILKVFYKFNGGGR ANFSFSVAQVGATAAQSIYYFPKSDLTTVALATPAVEAGSMRGYGTLQAMSITELMVDEL AVELGMDQIELRRRNALQAGFENTQGAQPLGELNNIEMLDLAAKHPLWVNREKAKADFDK ANPGKRYGVGFAQVQKDYGTGADTSALALEFDADGKVRMRHCVQEIGTGATTAQQVIVRD MLGKAPDFVEFGVAEFAELPMVSNWEPYSTTQEQQDEFQKNPYWVPFMLPAMSASNSAYF IGFGTRQAARFLFEHALWPAARAIWSEGPAGGQIVSARMTLSDLRVVEGGIGGGGMETLS FERVARKAHEMGLVTGVALHCFSRWEWTTATFDIPTIGSISVAADVISVRYGDGAAPELK RRMTTGGYDFIKRSSVNYPAVQRNNAGVTTYTPAACIVELNVNTFTGEIEIMRHHSLVDS GQMIVPELVSGQLQGGLAMGIGHALMEELPLYEDGPGNGTWNFNRYTLPRAKNVAVWNQT ADYLAPLSETSPTRGLGEVVMVPIIAATGNAIAHAIGKRFYQLPVTPEKIRKALAL >gi|316923559|gb|ADCP01000068.1| GENE 99 121896 - 123122 669 408 aa, chain - ## HITS:1 COG:VNG7121 KEGG:ns NR:ns ## COG: VNG7121 COG0500 # Protein_GI_number: 10803668 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Halobacterium sp. 
NRC-1 # 116 408 99 386 391 234 42.0 3e-61 MKIKTATTIFEVLASEVRLSIFRLLVKYAPEGLVAGEISQMLDIPKTNLSFHLKNIMYSG LVSMEREGRNTRYRANIPLMLETIAYLASECCSGNSAHCQPYLAEGGIEPEFLSVYCQKQ TYDTEITTMDTADKTKSSEDVKETVRAGYAAIATGQRSCCCSSQRGSQADPARLAAAVGY DAESLAKLPDGANMGLSCGNPVAIAALREGQTVVDLGSGGGFDVFQAGEKVKASGRVIGV DMTPEMLAKARKNIGQYRQRTGLDNVEFRLGEIECLPVPDNSVDVVLSNCVINLSPDKPK VWREIYRVLKSGGKVSVSDLALLKPLPDTVRDMAAALVGCVAGAVLVEETKALLEKAGFT SIVLTPKPDYVRNMQDWNDPLYKQIAETLPQGEEMADYVVSLSIEARK >gi|316923559|gb|ADCP01000068.1| GENE 100 123639 - 124484 795 281 aa, chain - ## HITS:1 COG:PM1111 KEGG:ns NR:ns ## COG: PM1111 COG0157 # Protein_GI_number: 15602976 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinate-nucleotide pyrophosphorylase # Organism: Pasteurella multocida # 3 279 2 278 282 153 31.0 4e-37 MSFHISQRFIDELLEEDCPYMDLTVEALGIGREPGLLTCHPKAACVVAGVEVAARLLESS GCRVACEAASGDRLEAGQVFLRAEGSAGAIHRSLKIAQNVMEYSSGIATRAAGMLESARK ANPHVQLAVTRKHFPGTKRLSLAAAQAGGAIAHRLGLSDSLLVFDQHRVFTGGLDGFAAR VSECRKAFPEQKLGAEVATPEEALLLARAGIDSIQCERFTCADLEETVRGVKAVNAAVQV LAAGGVTGENAEAVAATGVDVLVTTWAYFGKPADIKMVVSA >gi|316923559|gb|ADCP01000068.1| GENE 101 124481 - 125164 681 227 aa, chain - ## HITS:1 COG:PA1862 KEGG:ns NR:ns ## COG: PA1862 COG4149 # Protein_GI_number: 15597059 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, permease component # Organism: Pseudomonas aeruginosa # 8 226 6 224 228 110 33.0 3e-24 MLGELLHDEEVLFSVWLTLKVAAVCLALHLITAVPLALWARSPKAPFRRTLNFVVTLPLV FPPIALGYLLLMALGQTGLGEPLQRLFGVRLIFSQAAVVLAAYIAGLPLVIKPVQAALGS ETVRKLTEAARVTGADPIAAFFLVVIPLIRTSLVVGLLLGVTRSFGEVGISLMLGGNIAQ RTNTLSLEVFNAVSRGDFERATALCVLLACISLCLYLAIDQLQRRKA >gi|316923559|gb|ADCP01000068.1| GENE 102 125177 - 125899 910 240 aa, chain - ## HITS:1 COG:jhp0425 KEGG:ns NR:ns ## COG: jhp0425 COG0725 # Protein_GI_number: 15611492 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, periplasmic component # Organism: Helicobacter pylori J99 # 6 
237 8 242 246 75 23.0 1e-13 MKRWILALVMLGLSVVPALAETAVTTGMGYKKMLSELCALYKQESGKQLTEVYSGTIGQS LAQYKAGSGVSVFVSDRQTLEDSDVSFASFQPLGTAVLVLAWKKGLNVESPQDLQNPKIQ SIGYPDKKGAIYGKAAERFLTQSGLRGPLKDKLRMFSTVPQVFSYLTTGELDAGFVNTAV VKAQGKSIGGSMDIQSGYDPIEMVAAVVKGAEKDPEVEAFLTFLQSDKARSVYAKHGLRP >gi|316923559|gb|ADCP01000068.1| GENE 103 125934 - 126578 182 214 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 [Haemophilus influenzae PittAA] # 1 206 2 205 205 74 31 3e-12 MRVLQVSIQKQLEAFALQVSFVLASHGITVLWGASGSGKTTLLQCLAGLLRPDAGRIACR EAVWFDAERGVCLAPERRRLGYVFQDVRLFPHLSVRSNMLFGRRFRGPSRVSFEDVVALL GLGRLLHRTPSDLSGGEKQRVAVGRALLACPELLLMDEPLTGLDRGKREEIMAYVKAIPE RFGVPVLYVTHSDAERRFLADRVLNLEDGKLTEY >gi|316923559|gb|ADCP01000068.1| GENE 104 126662 - 127144 -322 160 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQGDSTGCSVSKRNTSRPQVLKQVPQPVHFETSIELITAFLLVCIGSQRTCCHREPATCR KICAYRHNKDTKGFMLSSKRCRSQHNIEIVSMSAETAVSMNGKRLWSPCLSVRALYRYVP QLWKMAMASSKNSVKGWGTWTVPEDCGRVSRASHTSLSLL >gi|316923559|gb|ADCP01000068.1| GENE 105 127318 - 128766 763 482 aa, chain - ## HITS:1 COG:CPn1031 KEGG:ns NR:ns ## COG: CPn1031 COG0531 # Protein_GI_number: 15618939 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Chlamydophila pneumoniae CWL029 # 4 482 7 485 485 524 60.0 1e-148 MANNTKKLGVIALASVVVSSMVGGGIYSLPQNMAAQASVGAVIIAWVVTGIGMFFLANSF RILSDIRPDLKAGIYMYGREGFGSFVGFLIAWGYWLCQIFGNVGYAVITMDALNYFFPPY FAGGNTIYAIIGGSILIWLFNFVVLRGTQQAAVINTIGTIGKLIPLFVFIIIMIFVFHID KFDFDFFGKLAVDNGQHLGGLGAQIKSTMLVTLWAFIGIEGAVVLSDRAQSQDDVGKATI MGFVGCLIVYVLLSVLPFGFMTQQELAAVPTPSTAGVLERAVGTWGSWIMNIGLIIAVLA SWLAWTLITAEMPFAAAKNGTFPRQFSRENANGAPSVSLWVTSALMQLALLLVYFSNNAW NMMLSITAVMVLPAYLISTLFLWKTCEDGQYPQGAATGRASALACGFLGSAYGLWLIYAA GLHYLLMASVFIAIGIPVYIWSRKQHPDQNPMFLKYEKVLLVLLVLVALFSLYLFMRGIV KL >gi|316923559|gb|ADCP01000068.1| GENE 106 128857 - 129441 483 194 aa, chain - ## HITS:1 COG:CPn1032 KEGG:ns NR:ns ## COG: CPn1032 COG1945 # Protein_GI_number: 15618940 # Func_class: S 
Function unknown # Function: Uncharacterized conserved protein # Organism: Chlamydophila pneumoniae CWL029 # 5 193 4 193 195 311 73.0 6e-85 MTTLGTRYPTLAFISGGVGEAEDGIPPQPFETFCYDSALMQAKIENFNIIPYTSVLPKEL YNNIVPVDSVASSFKHGAVLEVIMAANGARLEEHRAIATGLGICWGKNKQGELIGGWAAE YVEYFPTWIDDDIAKAHAEMWLNKSLKHELSLRGVEQHSEFQFWHNYLNLTKPYGYCLTV MGFLNFEFADPVKR >gi|316923559|gb|ADCP01000068.1| GENE 107 130138 - 131286 946 382 aa, chain + ## HITS:1 COG:YPO1343 KEGG:ns NR:ns ## COG: YPO1343 COG0614 # Protein_GI_number: 16121623 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Yersinia pestis # 7 377 12 372 378 177 29.0 2e-44 MIHILGRLSRVVMACALSVLFLGGTGKTILAAPAISFTDIAGREVQLDKLPKTFVVANYI ANFLMVGGAGSLDKVVGMTFDGWEETRYGEYVVYTETFPKLKAIPSIGGYHDNILDSEKI LSLRPDVLLVGRSQFADNNQKIDIFEKAGIKVVVLDYHAMKVENHTKSTMILGQLLDREA VAKEQCDVYASALEDVYRKIAALPDSAKHKTVYMELGNKGIGEYGNSYNKDVLWGAILKN LGADNLAENATQPYAPLDTEFVLASNPQLIVIGGGIWRNNAEGDQMSMGLTIDEATAQKR LKGFAARPAWKNLAAVKSGEIYAVDHGSLRNMIDYTLTLYLAKILYPNTFQEVDPMGDMR AFYAKYLPALKFDGTFMIKCAR >gi|316923559|gb|ADCP01000068.1| GENE 108 131303 - 132334 547 343 aa, chain + ## HITS:1 COG:MA2149 KEGG:ns NR:ns ## COG: MA2149 COG0609 # Protein_GI_number: 20090992 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 9 343 17 352 355 209 37.0 6e-54 MPSGNAHMYRQLNGRRLLTGMVLAILTLGLIVIDLGMGSSGIGPGEVVDALLGGPDGDTA NTAILWSIRLPMTLTCVFVGGSLSLAGLQIQTITNNALASPYTLGITASASFGAAIAITL GLSVAGYLWIGTALLALVFALAVSLLIFYLGRLKGMSTSTLILSGIIMNFFFQALQQYLQ YRASPEIAQIISGWTFGNLQRSSWMSVVVSGCLLVMGAALLSGWSWRLTVLTTGEERARS LGINVERLRLHVFLICSFLIAGAVGFIGTVAFVGLVAPHCAKLMLGEDQRYLLPSATILG GLMLLASSIVSKLLSGGSMLPVGIITSIVGVPFLFVLLMKNGR >gi|316923559|gb|ADCP01000068.1| GENE 109 132336 - 133094 227 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii 
AND4] # 4 211 2 217 245 92 26 2e-17 MLTLSIQNLSVHFSGNHVINDVSATLHGGEMVAVVGRNGVGKTTLIKAIARLIKRSGDVA LYDDKGNLFSDRDIAYVPQLESVTSRLTAFEMVLLGLVKDLRWKVTDEQIEKVADTLAEL GLDSLSRKPVCSLSGGQKQLVFMAQAFVSRPKVLLLDEPTSALDLRHQLVVMDLARKYTR ENGAITLFVVHDLMLASRYSSRLLMLHEGRIKAFDTAERVLHPHLLGDVYDVEASVERTR PGFLNVIPVRPL >gi|316923559|gb|ADCP01000068.1| GENE 110 133091 - 133783 278 230 aa, chain + ## HITS:1 COG:MA3459 KEGG:ns NR:ns ## COG: MA3459 COG0500 # Protein_GI_number: 20092272 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 9 206 7 203 249 160 42.0 2e-39 MNRCRLETIRAYWSGRVAAYSAVNADELGTIQAKVWDELIAEQLPWKHPLRILDAGCGPG FFSILMARRGHEITGVDYSEAMIECARENVENHSPEASAYFSQMDAQNLTFEDDTFDVVL SRNLTWNLEHPDRAYAEWLRVLRPGGVLLNFDANWHAHLFDPELARLYAEDAQTLRAMGY AVEDEHGDPIMDELIPQLPLSREQRPGTRPAWNGSVAATCLSGRLCPQTS >gi|316923559|gb|ADCP01000068.1| GENE 111 133856 - 134602 587 248 aa, chain + ## HITS:1 COG:MA3459 KEGG:ns NR:ns ## COG: MA3459 COG0500 # Protein_GI_number: 20092272 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 7 218 7 221 249 153 40.0 3e-37 MTELEAIRRYWNTRAEGYSLKTAYDLQHAEKVWLERLRPFVLASGKLDILDIGCGPGFFS ILLARLGHSVVAFDYTEGMLERASRNAAEADVSIVVRQGDAQDLPFPDETFDLIVSRNLM WNLEHPESAYAEWLRVLRPGGRLVNFDGNHYRHLYSEPYAEELKQPDYTDGHNPDFMLGV DPAPIHAIAANLPLSRVDRPQWDVETLLKLGARNVTVDVERKHFVDGTGQPVSIIKRFMV SAQKEEQQ >gi|316923559|gb|ADCP01000068.1| GENE 112 135021 - 135773 444 250 aa, chain + ## HITS:1 COG:BH2074 KEGG:ns NR:ns ## COG: BH2074 COG0747 # Protein_GI_number: 15614637 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 46 225 1 180 486 227 56.0 1e-59 
MFMGKLFVVVIALLGLAGCSQEEPPSKAEAPRTELAYASTKDIRNINPHLYSGEMAAQNM VFEPLVVNTKEGVKPWLAESWEISPDGKEYTFHLRKDVAFTDGTPFNAEAVKANMDAIVS NLPRHAWLDMVNEIDRNEAVDEYTYKLVLKHPYYPTLVELGLTRPFRFISPKCFIDGETK NGVSGYVGTGPWVLTEHKDKQYAVFTRNERYWGPKPALESVRWKVRTIRLFCSHSRRGKS ILFSVPTVTC >gi|316923559|gb|ADCP01000068.1| GENE 113 135767 - 136597 880 276 aa, chain + ## HITS:1 COG:BMEII0487 KEGG:ns NR:ns ## COG: BMEII0487 COG0747 # Protein_GI_number: 17988832 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Brucella melitensis # 5 275 251 522 526 222 37.0 8e-58 MLNLDAFAALQKEGTYKTEISDPVASRAILLNAHQPITRDINVRKAFQHAIDKKAIADGI LNGSETIADTLMAKTVPYCDVDLPVRSYDVVKANELMEKAGWSMGKDGYRYKDGKKCAVT IYYNSNNSQERTISEYMQNDLKKIGVELKIVGEEKQAFLDRQRTGEFDLQYSLSWGTPYD PQSYLSSWRIPAHGDYQAQVGLERKEWLDKTITALMIEPNEDTRKNTYKELLAYIHEQGV YVPLSYSRTKAVHVPALKGVTFCTSQYEIPFERMSF >gi|316923559|gb|ADCP01000068.1| GENE 114 136655 - 137584 808 309 aa, chain + ## HITS:1 COG:BH2075 KEGG:ns NR:ns ## COG: BH2075 COG0601 # Protein_GI_number: 15614638 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 1 307 1 308 310 271 45.0 8e-73 MGSYLFRRLLNLIPILIGISFLSFVLINLSPSDPAEVAVRVNEVTPTEEVLAETRAKLGL DKPFLTRYVAWLNNSLHGDFGNRYVDDKPVLGEIQKALPATLVLAGAALCLTLVCSLSAG IVCALYEGTFLDRLIRGGVFLGTAMPSFWAGLLLMWLFAVKLDLVPTSGMDGPSSVLLPA VTLSLAYIATYTRLIRANMIQNKQENYILYARVRGLPERSITRHMFKNSLQASLTALGMS LPKLIAGTFVVESIFAWPGIGRLCVAAIFNRDFPVIQAYVLIMAVLFVVGNVLVDILSAA VDPRLRKEF >gi|316923559|gb|ADCP01000068.1| GENE 115 137586 - 138425 764 279 aa, chain + ## HITS:1 COG:BH2076 KEGG:ns NR:ns ## COG: BH2076 COG1173 # Protein_GI_number: 15614639 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus 
halodurans # 7 272 7 272 293 238 45.0 9e-63 MSPWIRLQRDRLALVCMGFLGGVLVLGLLAPILPLSDPTLIDVGDKFAPWSWHHPFGTDQ LGRDVLSRLVYGIRATVFLSLLTMAITSALGACVGLVSGFFRNRIDDVLMRICDVMMSFP SEVMILAIVGMTGPGLENVVIANVVAKWPWYARMVRSIVRQYSDKDYIRFAKVAGGSSIR IMMRHLLPGTAGELFVLTTLDTGSVILMISALSFLGLGVQPPTPEWGMMLNEAKEVMTLY PLQMIPAGMAILLVVAAFNFLGDSLRDAFDPKHAGRERS >gi|316923559|gb|ADCP01000068.1| GENE 116 138422 - 139228 381 268 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 3 255 8 263 563 151 36 2e-70 MSLLEVRDLTVRDSLTGALIVRDVHFCLEPGTCLGVVGESGSGKSLTCLALLGLAPPSLR VSGSVRFGGIDLLQAGRETVRGIRGKRIAMIVQQPMTAFDPLYSMGAQLLETLRATTSAF SEKESRGRIVEALEMMHIHNPLDVLKKYPYQLSGGMLQRCMIAVALLQRPDIIIADEPTT ALDSMNQREVVAQFHWIRERFGSSLILVSHDLGVVRQLAQEVLVMKDGVGVEYGGAELFS APRHPYTRYLVDTRATLSKAFERVMQCR >gi|316923559|gb|ADCP01000068.1| GENE 117 139219 - 140010 427 263 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 10 251 11 265 329 169 41 1e-40 MSVAHRETPLLEVRNVSKSYPVSGGFWRKKRQPVLRDVNLTLSEGSSVGLIGESGGGKST LGRLILGLEQPDSGQLFLEGRPVQVWSKEHPGQMSVVFQDYTSSANPRFTVEELIREPLD ILRLDASCAIPELLERVGLPKALRNRYPHELSGGQLQRVCIARAIATKPRFIVFDEAVSS LDVSVQAQVLELLLELKGDMTYLFIAHDVQAVTYLCDHIMFLHEGTIAESLEREHLARAS SGYAQRLLQSVIPFNPDACVATV >gi|316923559|gb|ADCP01000068.1| GENE 118 140088 - 141512 900 474 aa, chain + ## HITS:1 COG:no KEGG:LI0461 NR:ns ## KEGG: LI0461 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 474 1 498 498 347 42.0 5e-94 MKRAMMLFLAAGLLLSMTSGARAADVNIKGQWQNGFSWADRNPLKSANASVDRFKASIRL RTQIDIVASDSLKGVVFFEVGHQNWGTNDAALGTDGKVVKVRYSYVDWNIPETDAKVRMG LQPFDIATFAVPYAAFMTDGAGISLSGNFTENVGATLFWVRAYNNDERTVSVYDKDNTLL YTYNSHDAMDVIGLAVPVQFDGIRMNPFGMYSAIGKDTRFGAGANNKTGVASGLLPVGTG KVVADSKDNHGNAWWGGLSAEMTLFSPLRVAVDGVYGKVDMGKTGNQDLKRAGWYAALAA ECKTDFVTPGLTFWYASGDDANALDGSERMPVIDTDVALTSFGYDGGMYNRSSTFNGTDI 
SGSWGIMAELKDISFIEALSHAFRVAMVKGTNNTEMVRGGVVARDAMATVGNNMYLTTKD KLWEVNFDSQYKLYEGLTLAFELGYIYLDADKELWNDLYKENNFQAAFSVNYKF >gi|316923559|gb|ADCP01000068.1| GENE 119 141605 - 142273 621 222 aa, chain - ## HITS:1 COG:aq_671 KEGG:ns NR:ns ## COG: aq_671 COG0378 # Protein_GI_number: 15606084 # Func_class: O Posttranslational modification, protein turnover, chaperones; K Transcription # Function: Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase # Organism: Aquifex aeolicus # 3 219 35 251 259 229 49.0 3e-60 MEIRVIRNVLEANEKLSARLKEHFARHGILTLNLISSPGAGKTSLLERTLADLSPEFRMA VIEGDLQTDNDAQRVAATGAQAVQINTEGGCHLDSNLVMEALGALDLSEIDILFIENVGN LVCPVEFDCGEDAKIALLSLPEGDDKPEKYPFLFNRASAMILNKIDLLPYLDFDMETAAR HARHLNAALPVFKISCRTGDGLEEWYDWLRQAVRAKRSAPNE >gi|316923559|gb|ADCP01000068.1| GENE 120 142286 - 142639 162 117 aa, chain - ## HITS:1 COG:aq_1021 KEGG:ns NR:ns ## COG: aq_1021 COG0375 # Protein_GI_number: 15606317 # Func_class: R General function prediction only # Function: Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) # Organism: Aquifex aeolicus # 1 115 1 111 115 57 30.0 5e-09 MHEMSLVTSLLSIIREETERHALHRLLQVRARYGALANIVPEALSVAFEALTAGTDWEGA VLQAEEIPLSLKCSRCGQIFSPARQKRFSAPCPFCGEEQGHSIGAGRELYIQSMEAG >gi|316923559|gb|ADCP01000068.1| GENE 121 142847 - 144145 791 432 aa, chain - ## HITS:1 COG:CAC1651 KEGG:ns NR:ns ## COG: CAC1651 COG1160 # Protein_GI_number: 15894928 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 5 399 8 401 411 369 48.0 1e-102 MLDAPKGLRLHIGLFGRRNVGKSSLLNALAAQSVSIVSNTPGTTTDPVEKTLEFAPLGPV VFLDTAGLDDEGELGTQRTERTLAVLPRTDMALVVTAENVWGHYEETIVSRLREQAIPFL VVMNKTESSVASKDRLPDAMRGLPMVCASAKTGEGLETIRRELVRLSPGESLHEAQLVAD LLPEKGVVILVVPIDSGAPKGRLILPQVQTIRDALDGRKLCLVVTEGELGAAFACLKEPP ALVVCDSQVVRRVALETPQSVPLTTFSILMARLKGDLPLLAAGAAAIGNLKPGDSVLMME ACSHHPQQDDIGRIKIPRLLQQYAGGELRFDMCAGKSLTDCPGSYDLIVHCGGCTLTRRQ 
MLGRLRSAQSRGIPMTNYGVAISFTQGVLKRVLTPFPDALAACEPAHETCHDAQSKTNKE SGWRGYIFIGPR >gi|316923559|gb|ADCP01000068.1| GENE 122 144147 - 145103 599 318 aa, chain - ## HITS:1 COG:CAC1631 KEGG:ns NR:ns ## COG: CAC1631 COG0502 # Protein_GI_number: 15894909 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Clostridium acetobutylicum # 1 293 17 309 350 252 43.0 6e-67 MQRNDILKLLFETPYEALRERADEVRAREKGEHVFVRGLIEFSNRCERNCRYCGLRSANP SLKRYRLDEVQIVEAAERAVAFGVDTLVLQSGEVHDEPQRIAHVVDALRTRFPVAVTLSV GEQPTASYALWKEAGASRFLLKHETADASLYAALHPGYTLAQRVDALQRLRRAGYEIGSG FIVGVPGQRPETLADDILLARELHVDMCGAGPFIPQADTPLGNEPQGSVELALRVMAVLR IALPWSNLPATTALASLDPVSGQREGLLAGGNVLMPGFTPAAHREDYCIYDNKHRVSMDE ARQVIESAGRTHSLHREG >gi|316923559|gb|ADCP01000068.1| GENE 123 145005 - 145199 130 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLSLFPGPHLVGALTQSLVRRFEKQFQNIVPLHTLSPAFNPSFNTRSCGKGDRQSIRRMG YIPF >gi|316923559|gb|ADCP01000068.1| GENE 124 145237 - 145641 277 134 aa, chain - ## HITS:1 COG:no KEGG:Dde_0725 NR:ns ## KEGG: Dde_0725 # Name: not_defined # Def: hydrogenase-like # Organism: D.desulfuricans # Pathway: not_defined # 1 132 1 130 483 112 43.0 5e-24 MVERNHINVLRREVLRRVAESFLHYADFGNSVERIPFVMRPKNTRPNRCCIYKDRAILRF QVMAALGFRLEDEEDDSTPLSVYAEKASLEREQPSAPILTVCDTAWQGCIPARYYVTGAC QNCIAHPCIGPCHF >gi|316923559|gb|ADCP01000068.1| GENE 125 145853 - 146137 233 94 aa, chain - ## HITS:1 COG:TM1267 KEGG:ns NR:ns ## COG: TM1267 COG1060 # Protein_GI_number: 15644023 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Thermotoga maritima # 3 93 157 247 473 112 54.0 2e-25 MESIDAIYGAEENGSRIRRVNVNLAPLSVEGFRRLKERNIGTFQLFQETYHRPTYGRVHL AGPKKDLDWRASSFDRAMQAGIDDVGMGLLYGLI >gi|316923559|gb|ADCP01000068.1| GENE 126 146077 - 146427 85 116 aa, chain - ## HITS:1 COG:TM1267 KEGG:ns NR:ns ## COG: TM1267 COG1060 # Protein_GI_number: 15644023 # Func_class: H Coenzyme 
transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Thermotoga maritima # 1 87 59 145 473 105 54.0 3e-23 MAVHSQDLLDELFHTAKSVKEEIYGNRIVLFAPLYISNLCSNECPYCAFRRSNKEAIRRA LNPDEIRQETKILLRQGHKRVLMVAGEAYPGSGIDYGWNPLTRFTVRKRMAAAFAG >gi|316923559|gb|ADCP01000068.1| GENE 127 146764 - 146964 89 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHYFRWPLSYTARERLRFELALPSMSSIQNLSYKPYTDNAAGKVNIIPFQGKEFARADAG THINAK >gi|316923559|gb|ADCP01000068.1| GENE 128 147493 - 148482 814 329 aa, chain - ## HITS:1 COG:MA0657 KEGG:ns NR:ns ## COG: MA0657 COG1477 # Protein_GI_number: 20089544 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Methanosarcina acetivorans str.C2A # 32 328 37 335 339 171 33.0 2e-42 MMTRRVFLGMAGVAGAGLWLGVPGLAQAAVSRETKVMLGTFVDVSVAGVSSMQASDALGL AFAEASRLERVFSRHDGGTPVSELNRAGRLRAAPAELVRVVNRSLFYGALTGGSFDVTVQ PVIDLFRAHRNPSGELTLDDSELRAARALVGRRGLQVSGADLSFARSGMGITLDGIAKGY IADRVSAVLTSAGVKNHLVNAGGDIMASGHKSPGVPWRVAVQSPTGPTYAGELSLSGKAI ATSGSYEIYYDASRRHHHLINPASGFSPAVGSVSVVAGTAMEADTLATALSILPPTDALK LVQGLPGRECCILSPDGCIYTSPGWASFA >gi|316923559|gb|ADCP01000068.1| GENE 129 148479 - 149468 936 329 aa, chain - ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 5 265 8 263 264 159 38.0 9e-39 MVMISVLVLLGLGLVTATVLAVASRVFHVEEDPRVQAVLEALPGANCGGCGYAGCEGYAT AVATDPAVPANRCCAGGAETSIAVGELTGKTVAASDPLVSFRRCDKVAGNVALRYDYQGM PSCAAAAGLVGGSDSCSYSCLGFGDCVQVCPFDAIEVRGGLARVNASKCTGCGKCAETCP RNVLELVPARARVMVFCSTKDRLKAVTEVCKVGCIKCGRCVKSCPADAVTLENDRIHINH KVCLTYGPECGEACAAACAREALRVLCPSAPLKSEAPAGATSAPGKPLDAKSAAQAAPAS QGKAAPAPVQSAPAAPEASPAPAAKENVS >gi|316923559|gb|ADCP01000068.1| GENE 130 149484 - 150059 828 191 aa, chain - ## HITS:1 
COG:TM0248 KEGG:ns NR:ns ## COG: TM0248 COG4657 # Protein_GI_number: 15643020 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Thermotoga maritima # 1 189 1 188 191 164 54.0 1e-40 MSVFELFISAIFVNNIVLAQYLGNCPYLGCSRDKGVAVGMGSAVIFVIFMATLCTWLMQR YVLVPFELGYLQTIVFILVIAGLVQFVEMFLKKAIPPLYSALGIFLPLITTNCAVMGVTI LVQREEYDLTTSLLYAVASSLGFMLALILMAGIRERLDTCRVPKALMGTPIALIMAGLMS LAFMAFKGMTS >gi|316923559|gb|ADCP01000068.1| GENE 131 150056 - 150730 921 224 aa, chain - ## HITS:1 COG:TM0247 KEGG:ns NR:ns ## COG: TM0247 COG4660 # Protein_GI_number: 15643019 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Thermotoga maritima # 5 200 4 196 200 185 53.0 6e-47 MQQLVKEFTKGLWTELPPFRLVLGLCPTLAVTKTADNGLGMGLAVIFVLVLSNMLISLVR DIIPKKVRIVCFIAISASLVVAVELLMQAYAYPLYQQLGIFVPLIVVNCIILGRAEAFAA KNTVPLAIADGLGMGLGFTMSLTFLGSIREVFGAGTLFGTPVMWEGFRPFSVMVEAPGAF LCLGLILAAMNLINRWQAKRKGREEPENISSGCASCGGCSGGKS >gi|316923559|gb|ADCP01000068.1| GENE 132 150741 - 151313 884 190 aa, chain - ## HITS:1 COG:MA0661 KEGG:ns NR:ns ## COG: MA0661 COG4659 # Protein_GI_number: 20089548 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Methanosarcina acetivorans str.C2A # 67 180 69 185 188 67 38.0 1e-11 MLAIIRMIVVLSSICGLSGFALSYLKISTAPRIEEQVLTYVQGPAILKVFADIDNSPIAE RKTFTLDGAKVTVFPGKKDGKLVAVALEHFGKGFGGDVGVMVGYDVNRDTLTGIGITTMK ETPGLGTRVADPAFTGQFTGKPADARLKSQGGDIDAVSGATISSTGVVTALGNAAKVYAA LKPEIVKAWQ >gi|316923559|gb|ADCP01000068.1| GENE 133 151317 - 152273 1148 318 aa, chain - ## HITS:1 COG:FN1595 KEGG:ns NR:ns ## COG: FN1595 COG4658 # Protein_GI_number: 19704916 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Fusobacterium nucleatum # 10 317 4 308 314 155 34.0 9e-38 MSQTNQQPVLLAVSAPPFWHCGRTVKKASYAMLLALAPAAFMAVWHWGIPAARVMALAMV 
TGIATEALCQKIMGQDISVDDFSGAVSCLLFAFLLPANAPWWLVMLGAALAIGLGKMAFG GLGANAVNTALVGWAMLYVSWPALMDPNSMQLDTSFIDPLVRLKYFGAGAVSHIDLTDLL LGNQIGGLGASQAGALFIGGSYLAARGTIRWEIALSFFVGVFLTAALYNVIDAERFATPF FHLCTGSTFLGGFFLATEWASSPGRQIPMMLYGLIGGAMVIIIRVYGIYPDGVPFAILLI NLLAPLLDSIHPKPFGAR >gi|316923559|gb|ADCP01000068.1| GENE 134 152726 - 153949 1435 407 aa, chain - ## HITS:1 COG:PM0385 KEGG:ns NR:ns ## COG: PM0385 COG4656 # Protein_GI_number: 15602250 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Pasteurella multocida # 96 395 132 448 835 128 29.0 2e-29 MKQRFQILNDPRFHLVYGVTQRFSYGPSPKLVRLNREGFTLVPGLKRKTPVYPGMLLGEH PAPDKGDLFASIHGEISEITERSVFLAATDPSKDAPEVEPVDLLGQGLEGDELALAVKKM GVNTRSLGRRCKTLIINALNPDPGVTWAEPMLVSHVENLRTGMELHRRMARADNVILAVP KGSKVAYEGIPLAYVEPEYPNSVDQLVIKAVTGEENPEGVGIVGLHNIWSLGRVGRLGRP LIETVVTIGSYEHSGNYIVRDGSTIGELLQFANIRLKDGDTVVRGGPLRGESLDRLDRSV TKGTYGIFVVEAGTIPPMEGHSPCVSCGACVLVCPARLKPSNLSRYAEFALHERCRAEHI ESCLECGLCGYVCIARRPVLQYIRLAKRKLAEMERDKGLVELKTKQL >gi|316923559|gb|ADCP01000068.1| GENE 135 153960 - 154739 647 259 aa, chain - ## HITS:1 COG:no KEGG:DVU2791 NR:ns ## KEGG: DVU2791 # Name: not_defined # Def: cytochrome c family protein # Organism: D.vulgaris # Pathway: not_defined # 1 259 1 259 259 274 58.0 2e-72 MPRRHVFVTVFCCLMAVVAVVGYTHDTVGKTPVRLLLENAGGRVVFDHRRHAEDYKVACE TCHHESAEARENVQPCGACHGVDFDGSFREDHVAAFADDETTCATCHHMGSAPAKWDHAA HAEEYGLSCTDCHHANADIEPEPARCTACHLDKPMGDIPDMKTAAHAKCADCHQKWFDAG MKGCTSCHPFTDNRKLSASGQPVDVNPGDADCTVCHGDVKASELIPDRMEAFHGNCMKCH EKHGKGPFRKDQCNQCHTK >gi|316923559|gb|ADCP01000068.1| GENE 136 154933 - 155313 360 126 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1502 NR:ns ## KEGG: Ddes_1502 # Name: not_defined # Def: ferredoxin hydrogenase (EC:1.12.7.2) # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 124 1 123 124 174 69.0 7e-43 MSFATVTRRGFLKGACMLSGGILLGIRMTGKAAAAVKAFKEYMGDRINGVYGADKQFSKR ASQDNAQVRTLYESYLGKPLGHKSEELLHTKWFDKSGALKELTAKGVYPNPRHVKEFVAS GYPYGE 
>gi|316923559|gb|ADCP01000068.1| GENE 137 155326 - 156579 938 417 aa, chain - ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 84 377 2 293 372 286 50.0 6e-77 MTRVVMEHVEYELNVPETGVQPDSLTFVEIDQEKCIGCDTCQQYCPTGAIYGETFEPHTI KYRELCINCGQCLTHCPSMAIYEVRSWVPKLEKKLKDSHVKCVAMPAPSVRYALGEAFGL PVGTVTTGKMLSALKALGFSHCWDTEFAADVTIWEEASEFVERLGGQKALPQFTSCCPGW QKYAETFYPELLPHFSSCKSPIGMNGALAKTYGAQRMQYAPKDIYTVSIMPCIAKKYEGL RPELNSSGQQDIDATLTTRELAYLIRKAGIDFASLPDGERDSLMGESTGGATIFGVTGGV MEAALRFAYQAVTGIRPDSWDFKQVRGLSGLKEYTVTLNGTELRLAVVHGAKRFAEICDQ VKAGNSPYHFIEFMACPGGCVCGGGQPLMPSLFASLERKALGFFAGFRKRLAQSANV >gi|316923559|gb|ADCP01000068.1| GENE 138 157094 - 158512 1179 472 aa, chain - ## HITS:1 COG:CAC1356 KEGG:ns NR:ns ## COG: CAC1356 COG1060 # Protein_GI_number: 15894635 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 1 472 1 472 472 690 70.0 0 MYNPSSLNAEEFICHDEALASIAYASENRNNLGLVDSIIEKAERKKGLSHREASVLLACD VPEKVQKIYALAERIKQDIYGNRIVMFAPLYLSNYCINSCSYCPYHAKNKHIARKKLTQE DIVREVTALQDMGHKRLAIESGEDPLNNPIEYILESIRTIYSVKHRNGSIRRVNVNIAAT TVENYRKLKDAGIGTYILFQETYHKQSYEKLHPAGPKHNYAYHTEAMDRAMQGGIDDVGI GVLFGLERYRYEFAALLMHAEHLEAVHGVGPHTISVPRVKRADDIDPDEFGNGIDDETFA KICACLRLAVPYTGMIISTRESKATREKVIRLGVSQISGASKTSVGGYGSPDSEEENSAQ FDVSDNRTLDEVVCWLMELGFIPSFCTACYREGRTGDRFMALCKSGRISDCCHPNALMTL KEYLEDYASEQARRTGSALIRRELGNIPNERIRYIAAERLEKIAAGQRDFRF >gi|316923559|gb|ADCP01000068.1| GENE 139 159080 - 160075 316 331 aa, chain - ## HITS:1 COG:CC0624 KEGG:ns NR:ns ## COG: CC0624 COG2801 # Protein_GI_number: 16124877 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Caulobacter vibrioides # 32 282 29 253 326 65 26.0 
1e-10 MESFNQNVIKHKAGLLNLAAELGNISKACKIMGFSRDTFYRYQAARDAGGVEALFEVSRR KPNLKNRVEEAIEVAVTAFAIDFPAYGQTRASNKLRKQGVFVSPSGVRSIWMRHDLASMK QRLRALEKLSAEQGIVLTEAQVQALERKKHDDRACGEIETHHPGYLGSQDTFYVGTIKGV GRIYQQTFVDTYSKWAAAKLYTTKAPITGADLLNDRLLPFFSSMNMGLIRMLTDRGTEYC GRVEAHDYELYLGVNGIEHTKTKARHPQTNGICERFHKTILNEFYQVAFRRKLYQSLEEL QAAGHMDRQLQHPKNPPGQNVLRKDSHADAY >gi|316923559|gb|ADCP01000068.1| GENE 140 160707 - 161642 730 311 aa, chain + ## HITS:1 COG:mlr2197 KEGG:ns NR:ns ## COG: mlr2197 COG2207 # Protein_GI_number: 13472034 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Mesorhizobium loti # 16 299 8 289 298 209 39.0 6e-54 MTRAIRVNTGNSWAILRDIILRNTASRSESYITPIKGFEFHRQVSNADPKPHFYEPVIIV VVQGKKLVKIGAEEHHYGENICFVCGVDMPVSSCVMEASKETPYLSMSLKLDTGLIASLA SQIPSPSDSTVYRGAGTQEVDPDLLDAFVRLAELTEKPEQILVMRDMLMREIHYRLLAGA FGNTLRSLNTLGSQGHQITKAIAWLKEFYKEPLLVEDLANRSHMAPSTFHKYFKRITTLS PLQYQKRLRLGEAQRLMLSEGYDVTQAAMAVGYESATQFIREYKRLFGDPPRRNVMSMKN IAKGAPQWAAM >gi|316923559|gb|ADCP01000068.1| GENE 141 161948 - 162694 342 248 aa, chain - ## HITS:1 COG:yjhP KEGG:ns NR:ns ## COG: yjhP COG0500 # Protein_GI_number: 16132127 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 248 1 248 248 384 74.0 1e-106 MDIPRIFNITESAHRIHNPFTSEKLATLGAALRLERGTRVLDLGSGSGEMLCTWARDYGI IGVGIDMSQLFSEQAKLRAEELEVSDRVKFIHGDAVGYVTEEKVGVAACIGATWIGGGVA GTIELLAKSLCTGGIILIGEPYWRQLPPTEDVAKACHANAVIDFLILPELLASFGDLNYD VVEMVLADQEGWDRYEAAKWLTMRRWLEANPDDDFAKEVRTQLTLEPKRYATYTREYLGW GVFALMAR >gi|316923559|gb|ADCP01000068.1| GENE 142 163037 - 163354 80 105 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHPYLNKNINVQMISFVIPLYMNNQKEIDGYNRNARVDNPAIKDGQSTDRHATCSFAQRK GNVEPLCWKLAAWWFGDKEDGIEIKKQRVQWRRKGGGKGQAPRKI >gi|316923559|gb|ADCP01000068.1| GENE 143 163464 - 163739 121 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|239628392|ref|ZP_04671423.1| ## NR: 
gi|239628392|ref|ZP_04671423.1| predicted protein [Clostridiales bacterium 1_7_47_FAA] # 1 50 1 50 582 84 80.0 2e-15 MKSIQTKFIVLILGCILLSSSVIGGAGILNAKHVVDEDSVKIMNLMCKEKNRKSMRCLEN PAIGQDAGGLPPLQSRIYAADIGLVLKQNDA >gi|316923559|gb|ADCP01000068.1| GENE 144 163742 - 164101 164 119 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MERKLTILMRALLAVALVAITLVLTYEFWRAGNGKSSLDVQEIRLQASISRYPPQHKGHQ ARVRPTPLNVFANHILILFVKPTTVFLNHAQEIRLRPDCPNRKHGCPLTDERQISHGEA >gi|316923559|gb|ADCP01000068.1| GENE 145 164280 - 165740 1992 486 aa, chain - ## HITS:1 COG:MTH788 KEGG:ns NR:ns ## COG: MTH788 COG0471 # Protein_GI_number: 15678812 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Methanothermobacter thermautotrophicus # 57 486 25 442 443 77 23.0 4e-14 MASAAGPQNSGNVNYFFHIIVFLCITFLFGKLDPIAPLTPFGMNTIGVFLGVIYAWVFID IIWPSMVGLLALMLLDVLPATALLNKGFGDPTVIMMMFIFVFSATLDRYGLAKYISLWFV SRKCVMGKPWRLTFALLLAIAILGGLTSATPAAVIGWSLLYGVFDICGYKKGDGYPIMMI IGTVFAAQLGMSLVPFKSLPLVAISAYEKLSGTSVDYALYLLASLTTCLFCLLLFIALGK HLFRPDVSKLESLDIKLIIKESDMTLTGIQKLLLGFLVALIIFMMLPGFLPKDLAVTIFF KKIGNTGVCILLVALLCAIRVKGKALLPWRAMVNEGVAWPIIFILAFTLPLAGPLSDPKS GITAFMLEMLQPLFGSGSGTIFVLCMGIVAVIMTQFINNTALAVALMPVVHTYCSTNGVS SELPVILITIACCLAFLTPAASSTAAMLHGNDWTNTKSIWKIAPFLIVLSLIVASAVVIL IGKVFL >gi|316923559|gb|ADCP01000068.1| GENE 146 165802 - 166014 257 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|255525440|ref|ZP_05392378.1| ## NR: gi|255525440|ref|ZP_05392378.1| nitroreductase [Clostridium carboxidivorans P7] # 1 56 1 56 268 62 42.0 8e-09 MPIFSIDAESCVNCGKCVKICPLDVLREGKTTPEIVYREDCQSCFLCIIYCPKHAITVDT ERGRATPEPY >gi|316923559|gb|ADCP01000068.1| GENE 147 166058 - 166585 572 175 aa, chain - ## HITS:1 COG:AF1463 KEGG:ns NR:ns ## COG: AF1463 COG1053 # Protein_GI_number: 11499058 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Archaeoglobus fulgidus # 3 159 407 564 575 80 31.0 2e-15 
MTGWKAGEGAAEACEGQAQLPVDEAQLAALKEKTFAKFGHKGHMSVGDAIYRIQSLTIPA KYNIIRSEASLREALAGLDEVERDIEANVAVRDMHELMLYHELHGMLATARLTFTSALQR KESRGSHYREDYTETDGEHWNVWLKSSFKDGKIVVEAEPIPLEKFERYGLDVLPR >gi|316923559|gb|ADCP01000068.1| GENE 148 166563 - 167747 1309 394 aa, chain - ## HITS:1 COG:mlr9192 KEGG:ns NR:ns ## COG: mlr9192 COG1053 # Protein_GI_number: 13488234 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Mesorhizobium loti # 8 377 11 400 578 120 29.0 5e-27 MIPKTERLSADVLILGGGLAGCMGAVAAIKKGCTVILADKAWVGSSGESTFAAGDILYYD PEQDDMHEWLERWNKAGDSMFDPEWLEWYGKNVHELILMLDSWGMEFEKDAQGRFKRKPG RGHREAVVFPGYKMQKKLRGILEKLGVVIVDRVMITELLTSDGHVAGATGFNVRTGQFYV FEAKNTVLSTGACSFKGQYHGQDMVSGEGNDMALRVGAEFTNMEFANTYNATAKEFDICG MSRFQCLGGRFTNALGETFMHKYDPVQGEGAMLHILVRAMTQEVRAGRGPISFDLRGMSE EDKDLSRRMLPMFFEACASKGVDPFSEPVEWIPGFMGSTSCGAGLTLKSFSCDTTIPGLF AAGDMANQGLVMGAIAGPGGDQPGLGFGDGLEGR >gi|316923559|gb|ADCP01000068.1| GENE 149 167926 - 169851 1660 641 aa, chain + ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 57 632 46 626 627 310 33.0 4e-84 MPALPPGGIMKDILVLVTLQPLLRAIQKYILEEDRARVDIQYCKNLNGIIDFIDKKLPSS VEVIISTPGPSFFIAQLIKKKIPILPLEYNNIDIIKSLHMALSVCPGCVAYGHYLQETQW LDDIRKMVGQDFGNFLFGNDDATNADILRKLQGRGVRAIVGGGYICNIAQEMGFLVFPVE VNRFTVKETIHKALSIADTQKYARYSQKNIDTILSNQAEAVITVDQDNEITFFNKSAEKL FAVSGADVIGRKSWNIFPQNTFEAVLKGGEPQETHPHTVHGVDIVGNYRPVFDNGSVIGA VGTFSTMTDIQKKDEFIRKYYAPKTAQAKHSFDDFYGGGLLFQELLERARCFARTDETIL ITGESGTGKEVMAGSIHNASRRGSKPFLSINSAAIPATLMESELFGYEPGAFTGGKKNGS PGMFEFAHGGTLFLDEIGEMPLELQSKLLRVIQEKEVRRIGASRVIPVDVRIIAATNKDL NGEVAANRFRADLYYRLNILHLHLPPLRAYTESAGEIAEKILRKLAPGSEPDRAAPLRAL LAKTGQYRWPGNLRELENIVRRYLALSPYLSRSIKLSDIFEPSELAEGPRESGGWENSGE LQKILSTYYRMGCSKTLTAQELGISRSTLWRKLRQIRNPDS 
>gi|316923559|gb|ADCP01000068.1| GENE 150 170287 - 170544 244 85 aa, chain + ## HITS:1 COG:PA0712 KEGG:ns NR:ns ## COG: PA0712 COG3811 # Protein_GI_number: 15595909 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 1 85 1 85 85 106 58.0 9e-24 MNISRFEQRTLHALAKGGRIIIIRDDSGKVADVECYTREGWLLCDCTMEIFKKLKSKKLI CSKNSLPYRITSEGLLAVRSQSDNR >gi|316923559|gb|ADCP01000068.1| GENE 151 170865 - 171407 385 180 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MADLKNMTGTEAFQLAKDESALTIEEIAERLTVSPSVIKRYLNAGDNYLPSLEMIPRLCS TLGNNILLRWLEAQVEAEENAVPPAQNRTEVLTSVVRAGAALGNVQRILAETQVIVPHNA RKLRSALNDVITECRVAKESLQPLAARKDLAECTHILCMKKASFDKENLFQNWRKPQRKN >gi|316923559|gb|ADCP01000068.1| GENE 152 171499 - 171744 193 81 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSLSKEELVLAACLLETTDASLSAEDAMSDVKQIMMNLPESLDPAYRGLLAKAACILLSS NRFSPGAAIAEARKVMTLAGF >gi|316923559|gb|ADCP01000068.1| GENE 153 172039 - 172695 331 218 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLKNVQRQLKKVPDYDIVLPHNPLQHEVLEAMKRFFQNRQYDVLVSLRQLLLTAAVLCLF CIPCFAGEWSVSLGMNGGDADTITGDLSLRYTLNPFYGNDMLEIRPLIECAGVYWKRDHD AIWGGGIAGGAVVDFWREGSWRPYVSGSFGGFMLSDNKIASHSLGCNFQFRTKGSAGIRF GESYRHDVQVDVAHFSNAGLNNHNSGFNTYGVSYGFRF >gi|316923559|gb|ADCP01000068.1| GENE 154 173220 - 174506 465 428 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_5081 NR:ns ## KEGG: GYMC10_5081 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 52 352 51 372 446 72 24.0 4e-11 MRKLFFGFILSQCFLLSGFYFQRGNQNVEYDNPSLYPQVLLTFLGVQYPSENKAVTEHIL NNYSKSHPNILVSYEWVPLRTYPCILKKRICSNSLDDVFMLSQDSRYQVPEGVLADLSSL PTLAAYRQQVLAQMRKGDTIPYVTTSLGAFGLFCNLNMLERNRLKVPQTLEEFLAACAAF HRIGITPLAVTRECLRALLIARAYEPAVFGDANAFFHSLNTDEAFLRRTLLRGFLFLALL RDKEYFDVDDLVLIRENMPGMFFSAGEQPFMIGGSWFSPVLGKAQSAFRFSVFPLPVTSQ GAVTVLGLDTPLSVSSRGEHVPQAVQLVEGLTSPENILIFNDGQGGFSPLKNAPLPADKA VHPLWKSMDAGRGIFHSDIRLRYNLWHRLDSGIELILSGASAEEATESVIQALHDAQQGK EGKQGKQS >gi|316923559|gb|ADCP01000068.1| GENE 
155 174503 - 176755 1918 750 aa, chain + ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 304 736 323 754 776 202 32.0 2e-51 MMTSRRQNLFIGSVGALLFGVLCWISFDFMHSVQQSLWDNSVKNIMESTARGANVLQRGY VKDLEMLRMLASELEQGKSSDSGRIRSKLKTFLSDTGNLSALIFEDGTGYVDSGLAVTLT PQEMETFKGLHESSGLLFPHRNRGTGRRTFTIYTVVDFPDGRKAYLFKGYNVETLYATYA MSFYNNTGFSYVVAPDGNIAMRSVHPASNKTFSNLFDLISQSENNPDVVESFRNSLKNGK IGIAVFSNRHEEFVFCYVPMPEMDGWYIVSIVPNVEIMREANSIIQKTLFICFLIFVGFL ICFLIYLYITRGYQKEIYALAYTDKLTGVRNFTKFRQDGEGMLQAPDGLPCALLSINMLN FKIYNDVMGYSEGDALLKAFARILEREAPRPVLVARMLADAFLVLFPYKGKGELVTYCEA YSRALNGFIVSREQNYQLELRSGICCVEDSDASDINSLLDRANMALKTIKVKGAAQWKFY DHTMRDKLLREKDLEIRMEKALADGEFLVYIQPKYWVGTRALAGGEALVRWKSPDQGFLS PGEFIPLFEKNRFIVKLDQFMFTSVCGLLRQWLDAGIEPRPLSVNVSRMQFHTPDFVERY IRIKSQYVIPDGLLELEFTESIFFDNVALLSSAVHELRRAGIKCSIDDFGAGYSSLNVLK ALPVSVLKLDGMFFRTAKDDEQAKIVIRNIMRMAGELKMATVAEGVETHEQVAFLETTGC DIIQGYVFARPMPAAEFTELLAGEGGHAER >gi|316923559|gb|ADCP01000068.1| GENE 156 177124 - 178317 1189 397 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302862801|gb|EFL85733.1| ## NR: gi|302862801|gb|EFL85733.1| transporter, major facilitator family [Desulfovibrio sp. 
3_1_syn3] # 11 378 27 390 411 338 51.0 4e-91 MRFFADRTTTLLQIFHVIVDGLYDSVPILLSFMIVAFGAQEKDAGVILSVAALLGTLAGL GTKGCSQRFGFLRTLCLITGAYGVGYAANTFSQNIWFSGFFFVIAMMGYGVFHNAAFSYL TVRTERSMLGKVLGDFTAIGDIGRIPITAFAGFLAAWSFAGIPGWRIVCALFGAVACCIT LYLLWTWLRNGDAGGSPCEERKKPQSLLPPLSLLRDRHTASTVVANILDGFSGDQIFAFL PFLLFAKGMDPKVIGSFALAFTVGCFAGKMACGRLVGIFGSRKVFITSKLLMSALLAVLV TAQGLPVIIVTSILLGIVTKGTVPVLQTLLVEPVTDPQAYDDLFAVNTFARGSTNILTPL LFGFIASAFSAEYIYVLMAIVSVFAVAPVLFGRPMRA >gi|316923559|gb|ADCP01000068.1| GENE 157 178448 - 178672 248 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302862698|gb|EFL85630.1| ## NR: gi|302862698|gb|EFL85630.1| DNA-binding protein [Desulfovibrio sp. 3_1_syn3] # 1 69 20 88 89 80 59.0 3e-14 MPMILRRYRDAAGLPQQQLADYAEVSKGSISALEGGRSVPNVDMLITLARAFGVRPGERL DAVVDETEKEVPPR >gi|316923559|gb|ADCP01000068.1| GENE 158 179510 - 180784 925 424 aa, chain + ## HITS:1 COG:YOL164w KEGG:ns NR:ns ## COG: YOL164w COG2015 # Protein_GI_number: 6324409 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Alkyl sulfatase and related hydrolases # Organism: Saccharomyces cerevisiae # 25 420 103 509 646 201 33.0 2e-51 MLKPNGAAELTVFTEKSYKETITPIVPGIWHVLGVGHSNAVIIEGRTSVILVDTLDTLER GQKLLEIIRSRTEKPVKTILYTHGHPDHRGGAGAFSGTDPEIIAFAPATPVLKKTEMLQD IQNLRGIRQFGYALGDEENISQGIGIREGLAYGESRAFRQPTTVYDEDKVVREIDGVRLE LVRLPGETDDQIMIWLPERQVLCCGDNYYGCWPNLYAIRGSQYRDIAGWLDSLDEILSYS ATYLLPGHTRPLCGKEEVRSVLTGFRDAIRYVLEKTLAGMNEGKDIDTLASEIVLPQEYA SLPYLGEYYGCVEWTVRAIFTAYLGWFDGNPTNLHPLPPKERAEKAVALAGGADAVLRAA EEAVRKGECQWCLELCDLLLTIGVNAEAARRQKAVALTKLAAYETSANGRHYYLVCAHEL ENRH >gi|316923559|gb|ADCP01000068.1| GENE 159 181279 - 181521 347 80 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2842 NR:ns ## KEGG: DvMF_2842 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 79 1 79 98 86 56.0 2e-16 MHEQLIKEIQKMVRDGEVPAKSVAEAVGKPYSTLMREINPYDKGAKLGVETFMAIIETTG DPTPLKLMAYELGYRLIPDK >gi|316923559|gb|ADCP01000068.1| GENE 160 181756 - 182004 98 82 aa, chain + 
## HITS:0 COG:no KEGG:no NR:no MGFPFHGNISVGECVVALKSSLPVFSKKVTPLSVLNSRPQHIFRLASADYREYMRRFLSH VFPFYLLKLFVVRRAVWGEGLL >gi|316923559|gb|ADCP01000068.1| GENE 161 182009 - 183466 1782 485 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_4598 NR:ns ## KEGG: Dhaf_4598 # Name: not_defined # Def: sodium/sulphate symporter # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 10 485 14 489 489 242 32.0 2e-62 MPLLPSCKDKEIAYYAHSLISLLIMFGCGQLTPITPLTPLGMNLIGIFLGVLYGWIFIDI IWPSMAGLLALMLIGGMKPNELFQTSFGDPLVMMMFFIFVFCATINYYGLSKFISLWFIT RKCVAGKPWLFTYTFFLSIMLLGALTSASPAVVIGWSILYGICDKCGYQKGEGYPTMMVF GIVYAAQIGMSIIPFKQVPFTVLGAYENMSGMTIDYAKYMIIAITCCALCSLLFIVMAKY VFKPDMKKLIPLDTEGLDTEGALRLNKVQKIVMGFLFALVVLLLLPNILPATSGIARFFK TIGNTGICMLLVTVMCLLKVDGKPLLRFKTMVDSGVTWGIILILAVVMPLSHAMANDESG ITKFLMALMTPFFGNESSLVFALCMGFFATVLTQFMNNGATGIALMPIVYSYCTGMNVPP ELAIIMVVMGVHIAFLTPAASASASLLHGNEWSNTKAIWRTVPIVILATWATISAIVVAL GVALF >gi|316923559|gb|ADCP01000068.1| GENE 162 183610 - 185325 1957 571 aa, chain - ## HITS:1 COG:MK0828 KEGG:ns NR:ns ## COG: MK0828 COG1053 # Protein_GI_number: 20094264 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Methanopyrus kandleri AV19 # 34 527 30 532 556 137 27.0 6e-32 MKEQGYQYTTDLLVIGCGFSGMWAAMRARDFLGDVLVVDKGPRDWGGLGGLSGGDMIVKQ PDMKLDDLLDDLVYYYDGLADQKLLGQILTQSYDRFMDFENLGHKFVRNDKGELCFIPQR ALDYMRYYYYHPYGMGGIEKARLLKQECEKRGVRRLGRTVITDVITDGERVTGAVGFHSQ SGVPVFIKARAVLLATNTGGWKPSYHQNTPASEGVSIAWNAGCAMRNFEFWKVWNVPVDF AWEGQTGLLPKGARFLNAKGEDFMKKYSPKFGAKADPHYNTRGMVHEVRAGNGPIRFDCS QMKPEDVETMRPRAGWMGLNDKKLRELGIDFFGQELEWMPQVRHTYGGIVADLDGSTAIK GLYAAGLARNPDPGVYMGGWATCITATTGYSTGEAAAQFVQGHDAVAFDEAYAASRLEAF TGYLGRDGIAPKDVISDMREVMSAPDIALMKTGKGLSRGLDRVEEIRAEVLPHLGARDPH ELAKLFEATSTVLLTELCLNAALMRKESRAGHYREDYPERDNEHWLKWIEQKQVDGKREV HTVPVPLNDYPIKPYRYYMDNFSWPTPPKAV >gi|316923559|gb|ADCP01000068.1| GENE 163 185337 - 186977 1451 546 aa, chain - ## HITS:1 COG:TM1217_2 KEGG:ns NR:ns ## COG: TM1217_2 COG0493 # Protein_GI_number: 
15643973 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 6 432 50 469 472 212 34.0 1e-54 MIHSIDPAACTGCGTCTKTCPLDVFRLEPAQEALSPCMAACPAGVDIRGNHYLIQQGHLG EAARRYRIVQPFPAITGRVCFHPCEEKCARNHVDEAVNINAVEQIMGDWDLKAPLEKPAR RHITKVAVIGSGPAGLSCAWFLAQMGYPVTVFEAMPLPGGMLRYGIPTYRLPDAIVAAHI ARLEAMGIAFRCNTKIGEGADLSLSDLKRLGFKAVMLAPGTTVSRKVRMEGVELPGVHWG LEFLRANRGDEQPRLSGDVLVIGGGDVAVDAAITAKRLGAAKVSMACLESRETMPAYPHN QADALREGIECHCGFGPGTILQENGHVAGMELKTCIRVVDEQGRFAPLFDENATMRIRAD HIIFAIGQASELDGFAKDVRIEQGRIVIEDVTFSTSAWGIFAAGDAATGPSSVVSAIAGG RECAFSIDRMLKGADIRGERERKRPELPEEQWPREGIRHEPRQERASLPAPADDPFAETL RPLDMEACFAEALRCMTCGSKSRITYTDDCMTCFNCELNCPSGAIYVHPFKERFARTLDQ IEADNR >gi|316923559|gb|ADCP01000068.1| GENE 164 187344 - 190412 1916 1022 aa, chain + ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 683 1018 125 456 461 251 43.0 4e-66 MAKEKAWYADSAVGTLLADLFEGKWSPELLAQLAQIPLAEAERGAFPGCPPSSGRKRRFR LVECVQAHKEACSFLLARPSGSGIESFLRLSADGEELGALETMLGLGLSRMNTLRPAGIN LVLRLVHRSLLGVGLRGLNADGAKRYRRLCRDVHPLIVCSGSNTPSLARLFFRARGLAVK FGDRRDMALFDLLVGSINVCNTPEHGNPRFHAIMARGYAALEALKEPDLFEQAAPYLGIY HFIEGNYEQAMNLFSRASRKLRAQEHHLVEMFYVRHWSFAASCRGNFELAAGLLLSRLRM FAARSDNQLARSIRGQLAALYLRMGQYEKALEQLDIAQIGVSPQVDIVSGVTNARHLAYY HMLSGNPKAAYKVLHSALDEARRQGYERPIYLGGALLELLCAFQEQGCPPLPCYSAEQEL QRCLRGPNRMLRGVAARLSGNALMRGGNEDGALGLYQDSLALLEKIGCPLEAAKTRLALA SLALRRNDKEKAVLLVSDAWPMYDCLKDFFWPEELLALVPTYMRKHGEANLSSRALLEAY RESFSPHRTWEQFEAFSQALLAESARILGTTQGYLFHVPYPHAPLRLVAVLGGKLESVTY PYASLPDLMELVAEGAPLILDNVQEGGMGFGSMLIGIPIDCRPDGVYALCHVGAFLPEVR TVLEENVLKDIGRVLAWECSRVMERERRHAERLETLADTSGEMICVSDEMERFLGDVELA AKTDASILLCGESGVGKEMVARYIHEKSGRSGRLVIINMASLQDELFESEFFGHEKGSFT 
GAMTSKPGLVEMAENGTLFLDEFTEASTRVQAKLLRVLQERCFHRVGGTQTISVNFRLIA ASNRDITHAVRQGLFRADLYYRIAVINLKIPPLRERKADILAIARYYLHFFSHWHHRECV QDFSPENRRILESWSWPGNIRELRNVIEQSVVLTGGRHLTLPDPVEHTEASRGVGALGGG EMPEELFSGNPSLADLEKRYIERTLEQSGWRIEGKRGALAVLRISRSALYDKMRRYGVKR PF >gi|316923559|gb|ADCP01000068.1| GENE 165 190568 - 192178 1460 536 aa, chain - ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 13 534 23 548 553 261 33.0 2e-69 MLTNECTLISCPEILTMDEGKPTAEAVCISEGKILAVGTLEELRALAPRHRRKEIHLEEG VLLPGFIDSHSHLSMYAQCRTQFFCDTAHGTIGNLLSAFRTHAATQTDTEWIIGYCYDDT GMSDHRHLTRHDLDAVSQERPVFVSHITSHMGYANTLGLAKLGVTADFSVEGGMVVLGDD GMPEGLLKENAYFKVFQKIPACPPEKLPEKIEYAIDDYNRAGFTMFQDGGIGLSNGAHGI LRAYNRLAREKRMNARGYLHFMPNIMDEMLELGTWNMPVSDYLYYGGVKSFADGSIQSLT AVLGEPYACKPDFCGDHVLTPEQIAELIAKYHCQGVPVAFHANGDAAIEFVVRGFEEALR KCPRKVPGDMIIHVQMATDEQLSRLHACGVTPTFFVRHVNVWGERHVNLFLGEERAARLD PCGSCVRMGIPFGLHVDSPVQPVDAIRSIHTAVNRTTTAGRILGPDQRISPLDALKAYTV NAAKCSARDHAVGTIAPGYYADFVQLSGNPLTVPPESIETLSVLKTISGGRIVYES >gi|316923559|gb|ADCP01000068.1| GENE 166 192283 - 193677 1966 464 aa, chain - ## HITS:1 COG:Cgl2722 KEGG:ns NR:ns ## COG: Cgl2722 COG0786 # Protein_GI_number: 19553972 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Corynebacterium glutamicum # 5 414 6 391 449 86 23.0 1e-16 MGSGFENLILFGLLSLMLLIGTLLRAKVKLFQDYLVPASIIGGLIGFAIVSTGWLKIGEW QITSSSFNWFLFHAFNISYISLLLTRPRGNQENTSKEVVRGGMWQTLIWTISLPAQALIG GAVIWLYNLATGNSLSEFIGMIVTHGYTQGPGQAYAFGTLWEKGGIADCATVGVIYAALG FLSAAIVGVPVARWFVRKGLNANKTGASITKEFLTGIMDEKSTATNGRETTHASTIDTLA FHVALVGIVYLITYAELSWIEAHIKPFFDQYKWLKGFGATLSMPMFFIHGLIVAWLLRTL LLKLGAGRLMDPVVQTRITGASVDYLLTATLMSIHIVVLKQYVIPIFLVAFSVTLFALAL NLWFGRRTNYGPERVLCQFGCCCGSTATGLLLLRIIDPDFSTPATLELAFFNVGILVTCA PILYFFAPAFYTFTGMEILMIYGAITVIGIAAMFALKLVGQKQW >gi|316923559|gb|ADCP01000068.1| 
GENE 167 193967 - 195409 1239 480 aa, chain + ## HITS:1 COG:BH1879 KEGG:ns NR:ns ## COG: BH1879 COG3829 # Protein_GI_number: 15614442 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 155 473 238 552 555 263 42.0 5e-70 MPFPIASDEDFKRLVENPLFHQLLDALSIGVSLTDPTGTVRYFSQSCYHIYGLDPSESVV GKKIDAIFQTGRAGVLNSLQTRRINTVNSISYNGVEGLCRRCPILDDKGNLVCCLSEVIV TTHDNERIEELLHNLQQLKRKVGYFIAQETTGGGLRTFDDLVGDTSVMQALKAMGKRFAR SREPVLILGESGTGKELFAQAIHKASPRANGSFISVNCAALPRELAESELFGYVEGAFTG ARKGGQKGKFELADKGTIFLDEIGELPLYLQAKLLRVLESNEIQKIGMSGSKYSDFRLIA ATNRNLPELVDKRQFREDLYHRLNILELSLPPLRKHRGDIPSLITQLIEGICGPQKALEM RVSQDVLDLFMRYGWPGNVRELKNVLAYAYCCMDDDAVELTPRYLPERIFMGSRSEDVPA AVEPGAFSFQAMQREAEKRAIESALSLTGGNKSKAAKLLGFSRNTLYLKMKALGMGLKRK >gi|316923559|gb|ADCP01000068.1| GENE 168 195584 - 196465 193 293 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 123 288 75 238 242 79 34 2e-13 MDCQRDTEARSPLQTQLFSLQGHTALVTGAGSGLGKYMAHALLHAGANLILVGRTLDRLE ASADEIFLSLKQQGGHLAYDSDPAARSAHRDPDQNKRIACIAFDVSELKTLPELAQKASQ PFGAPDILVNAAGLNPRKPWNELTPEIWEYTLRLNLSAPFFLAQALVPEMMRQGWGRIIN IASLQSSRAFPNGLPYGASKGGIAQLTRGMAEAWSRPGTGITANAIAPGFFKTQLTAPLF DKPEVVEALARQTTMNRVGFAEDIQGLTMFLASPASGYITGQVIHIDGGWTAI >gi|316923559|gb|ADCP01000068.1| GENE 169 196541 - 197251 762 236 aa, chain - ## HITS:1 COG:SSO3004 KEGG:ns NR:ns ## COG: SSO3004 COG1028 # Protein_GI_number: 15899712 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Sulfolobus solfataricus # 2 236 3 251 299 166 38.0 4e-41 MRTCVITGGTGGIGLHLCEGFQRAGYAVTALDVEQHKALPDGIGFIQTDLRQAEAVNEAF RRVTERHGAVHVLVNNAALAHFHKPVMEMTADEFDNALSVNLRGAFLCAQAFIKANAGQD 
YGRIINIASTRWNQNEAGWEAYGASKGGLVSLTNTLAVSLSATPITVNAVSPGWIQVDGY ETLSHADHAQHPSGRVGIPRDIVNACLFLAHEENDFVNGHNLVVDGGMTKRMIYVE >gi|316923559|gb|ADCP01000068.1| GENE 170 197263 - 198186 902 307 aa, chain - ## HITS:1 COG:FN1038 KEGG:ns NR:ns ## COG: FN1038 COG0697 # Protein_GI_number: 19704373 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 1 300 2 295 303 213 41.0 3e-55 MNTLQLRNSFLLLLTAVIWGVAFVAQSVGMDYVGPYTFTCVRSFIGGLFLIPCIALLNRL NPVSPGGIRPSNAKSKDQLWIGGVCCGVMLCFASCFQQIGIMYTSVGKAGFITAFYIIIV PLLGLFFKKRCGLFVWLGVALAIVGLYFLCITESLTIQFGDFLIFICAILFSFHILIIDY FTLRVDGVKMSCIQFFVCGLLCAVPMLLFETPDITQLLAAWKPVLYAGIMSSGVAYTLQI VGQKGMNPTVASLILSLEAVVSVLAGFVMLDQQLTMRETMGCAFMFCAIVLAQLPQKNIK NAEAVSA >gi|316923559|gb|ADCP01000068.1| GENE 171 198588 - 199979 1243 463 aa, chain + ## HITS:1 COG:AGl3214_2 KEGG:ns NR:ns ## COG: AGl3214_2 COG2199 # Protein_GI_number: 15891726 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 304 457 1 161 189 90 34.0 5e-18 MTTSFSKAQYVSGEHEDCPESLLQIIMEEVDEIVYVSDIATYEILYLNEFGKKVFGLTDI GKGRKCYEVLQGIDAPCPFCTNQFLNCETFYTWELTNPISHRHYLLRDKLIRWNGRLARL EFAVDITEKENISQAVQRKLDIESTLLECIRILNREENFPQAVDMVLENLGMMHQADRAY IFEYSVLKNGGLIANNTYEWCSEGIFPQKEVLQNVPISYMSCWQKMFERREDVVIDNLES IRDSDLDMYSVLKPQGIESLIVVPLVLNDVISGFIGVDNPKANRDDHSLLHSLAYFVTNE QRKRNMQSELKRMSYCDDLTGLHNRNSYISVLQKLEKNPPDSLGVVFVDLNGLKRINDQQ GHDSGDKYIRDISRIFSRYFRGDDLFRIGGDEFVFLCPNIPERVFYAKIAALQREANKAY PESLSLGQVWAEGDMHIMDMVRQADKRMYQEKAEYHARRGDTV >gi|316923559|gb|ADCP01000068.1| GENE 172 200450 - 200713 74 87 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPIADLTPNLLSSLKAQLLKTPTGNTKPKKATKRPKAKAPTRKMLVGQTVNNIFSFMRSA INRAVATGMWRSQSLVHQGRHLENGKG >gi|316923559|gb|ADCP01000068.1| GENE 173 200948 - 201277 362 109 aa, chain + ## HITS:1 COG:no KEGG:DVU0236 NR:ns ## 
KEGG: DVU0236 # Name: not_defined # Def: phage integrase family site specific recombinase # Organism: D.vulgaris # Pathway: not_defined # 1 109 268 375 380 95 47.0 5e-19 MLQAYTRKPAEPIFQQPRKKTAFEKTPACFQTAVRKLNLAPEDGDSLYAVTLHTMRHTFA SWLAQSGKVTLMELQKLMRHKNTTMTMRYAHLFPGQESEKLSIIGDMLA >gi|316923559|gb|ADCP01000068.1| GENE 174 201286 - 201594 290 102 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEIEGRTIILSTPEELLPLITQAVRTAEQGKTTDAPDLPGKKLLAPKDVEREFGIHQKTL AYWHMEGVGPVYTNFGRRVFYERAVLEEYIASGRVQTSESIK >gi|316923559|gb|ADCP01000068.1| GENE 175 201632 - 201949 99 105 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIFCNQFQHRVTFVRQEHFFPVPKDAAVADSKTFTDFSDAQTAHSEFKNFPAARYQSNSR YTRLRFLIEVMQQLNVAMNAEGFSNCCQMPIYRSHLDGEIPCNNF >gi|316923559|gb|ADCP01000068.1| GENE 176 202507 - 206634 4162 1375 aa, chain - ## HITS:1 COG:mll1534 KEGG:ns NR:ns ## COG: mll1534 COG4625 # Protein_GI_number: 13471533 # Func_class: S Function unknown # Function: Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain # Organism: Mesorhizobium loti # 602 1000 172 593 1008 89 28.0 4e-17 MTHRFRFTCTRLLPACALAALLVGPPGFAPSANAADVTYPGSNLDKGPLWNIDNSLFPAG SLSDNVVTINSGNVGGDVYGNDVDAAFSPVSNNTVILSGGSVGGDILGGANNGAVTDNNV AISGFGSVLGSVYGGYGAAEGTVNGNDVSIFDSGSVTGNVLGGYSRSVNSHVIGNTVTIS GGTVRDIYGGQSGKGNALNNRVTLDGAASQTNVIYGGRVEQGTARENAVVMKNGSVTLGI FGGIATADGGQAQDNHVTMSGGAVGEHLIGGYVQNGSGAATGNSVIFNGGSVTENVYGGR SVNGPAQNNSVTMTNGSAKWLLGGYSNSGDASGNRVEVSGGTLSGGVNGGETTSGNATGN SVDFSNVTATYVQGGYSGSGSATGNSLAIRSGTVQNNAFGGYVDSGSGEASGNSVTFNGG SVTNNIYGGMSAAGLAQNNSVTMTNGSAKWLLGGYSANGNVIGNSVNVSGGTLTGVSGGE SNSGSATGNIVSISGGTVQSNVNGGFVASGSGKATGNIVNISGNADLSTATVAGGISSSD AFTGNTLNKNSDAAVHIARNFASVNFGYSGNANIGELDSTPTGSALSGVTVNTNANNVSF VGVISGSGSMTKAGAGTLILSGTNTYSGGTTISAGTLSIGSDTNIGSGTNTIGNKGTLLL SGNGTYTNDWTLSGTGSAIATDNNNTLSGVLSGNGGLTKTGAGTLTLTGNNTYADGTAIN DGTLKGNIASGTDLSIAASAIYDGDNKARSVGGLNGGGKILNTDGLTVQSGTFGGVIGNS NTSLIKTGAGTLTLTGTNAYTGSTTISEGTLKGNIASGTDLSIADSATYDGDNKARSVGG LNGAGNILNTDGLTVQSGDFAGSIDNSNSGLTKTGAGTLTLSGTNTYTGMTTVRSGTLAL 
GSDLTSNQLTLYGGTVFDRGSHNHSLDNGILSVNGANGQSAMYKGDLSARNATLNFISPV HPTQPLLRVTGDADVSGSACNVGLAGGTSLASGSTLTLLEVDPDKTLTANNLQRGNGIVQ IGSTVAHDITADVNLDPTTRRLNAVTAQVSPGRATDQSKALSEGFLGGLALNLQGADLVA GRGMDSAVRASSGTDDAERHGFAGFGALSGGSLRYNTGSHLDMNSLSLLTGLAWGIDLAP GRLTLGAFFEYGNGSYDTHNSFTNAASVDGDGNAYYLGGGILARMDFVNIGPGRFYAEAS GRAGKTHNEYDSSDLRDVAGRKADYDSSSPYYGLHFGTGYVWNINDAATLDLYGKYFWTR QQGDSVGLSTGEHLSFDDINSSRLRFGGRFAYILNEHVAPYIGAAWEHEFDGKARARTNG FDIDAPNLRGNTGIGELGLSLTPSADLPLTVDLGVQGYTGKREGVTGSLMVKWEF >gi|316923559|gb|ADCP01000068.1| GENE 177 206666 - 206858 168 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLAANGAQRKRPHPEKGQGHVFGRSGNAESRATPGRERGGWAMSCVCVCVCVCVCVCVC VCVC Prediction of potential genes in microbial genomes Time: Fri May 13 02:54:16 2011 Seq name: gi|316923479|gb|ADCP01000069.1| Bilophila wadsworthia 3_1_6 cont1.69, whole genome shotgun sequence Length of sequence - 92330 bp Number of predicted genes - 87, with homology - 78 Number of transcription units - 47, operones - 22 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 331 148 ## - Term 380 - 415 -0.7 2 2 Tu 1 . - CDS 449 - 3019 2159 ## COG2909 ATP-dependent transcriptional regulator + Prom 3234 - 3293 3.4 3 3 Tu 1 . + CDS 3337 - 3603 180 ## COG2801 Transposase and inactivated derivatives + Term 3626 - 3665 2.1 - Term 3358 - 3419 1.8 4 4 Tu 1 . - CDS 3553 - 3837 155 ## 5 5 Op 1 11/0.000 + CDS 3700 - 4020 79 ## COG2801 Transposase and inactivated derivatives 6 5 Op 2 . + CDS 4121 - 4318 116 ## COG2801 Transposase and inactivated derivatives 7 6 Tu 1 . - CDS 4461 - 5129 615 ## COG3619 Predicted membrane protein - Prom 5359 - 5418 1.9 - Term 5774 - 5810 11.0 8 7 Op 1 22/0.000 - CDS 5836 - 6645 1273 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen 9 7 Op 2 32/0.000 - CDS 6724 - 7386 960 ## COG2011 ABC-type metal ion transport system, permease component 10 7 Op 3 . 
- CDS 7379 - 8380 1198 ## COG1135 ABC-type metal ion transport system, ATPase component - Prom 8413 - 8472 3.8 + Prom 8276 - 8335 3.0 11 8 Tu 1 . + CDS 8540 - 8689 57 ## - Term 8412 - 8439 -0.8 12 9 Tu 1 . - CDS 8623 - 9297 318 ## PROTEIN SUPPORTED gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 - Term 9576 - 9607 2.1 13 10 Op 1 . - CDS 9637 - 9981 351 ## COG0662 Mannose-6-phosphate isomerase 14 10 Op 2 . - CDS 10047 - 10883 524 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase - Prom 10909 - 10968 2.4 15 11 Tu 1 . + CDS 11064 - 11480 513 ## COG0105 Nucleoside diphosphate kinase + Term 11574 - 11604 3.3 - Term 11562 - 11592 3.3 16 12 Op 1 9/0.000 - CDS 11600 - 12409 961 ## COG3302 DMSO reductase anchor subunit 17 12 Op 2 16/0.000 - CDS 12412 - 13041 729 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 18 12 Op 3 . - CDS 13107 - 15545 2619 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Prom 15614 - 15673 2.7 - Term 15681 - 15714 6.1 19 13 Tu 1 . - CDS 15741 - 17123 1929 ## COG0733 Na+-dependent transporters of the SNF family 20 14 Op 1 28/0.000 - CDS 17263 - 20961 3389 ## COG0419 ATPase involved in DNA repair 21 14 Op 2 . - CDS 20972 - 22201 1237 ## COG0420 DNA repair exonuclease 22 14 Op 3 . - CDS 22281 - 24479 2120 ## Dbac_0772 hypothetical protein - Prom 24651 - 24710 2.4 + Prom 24488 - 24547 1.8 23 15 Tu 1 . + CDS 24669 - 24935 333 ## + Term 24942 - 24975 5.4 - Term 24927 - 24967 7.2 24 16 Op 1 . - CDS 24970 - 28152 2245 ## COG2982 Uncharacterized protein involved in outer membrane biogenesis 25 16 Op 2 . - CDS 28149 - 31403 2599 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 26 16 Op 3 1/0.214 - CDS 31419 - 32741 1718 ## COG0205 6-phosphofructokinase 27 16 Op 4 . - CDS 32744 - 33604 300 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 - Prom 33713 - 33772 2.2 + Prom 33613 - 33672 4.0 28 17 Tu 1 . 
+ CDS 33772 - 35589 1108 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 29 18 Op 1 . + CDS 35759 - 36307 318 ## COG0398 Uncharacterized conserved protein 30 18 Op 2 . + CDS 36334 - 37278 517 ## COG0457 FOG: TPR repeat + Term 37306 - 37333 1.5 - Term 37294 - 37321 1.5 31 19 Op 1 . - CDS 37330 - 38148 923 ## COG0457 FOG: TPR repeat - Term 38159 - 38204 1.5 32 19 Op 2 . - CDS 38287 - 39327 1213 ## CPR_2030 hypothetical protein - Prom 39383 - 39442 8.0 33 20 Op 1 . + CDS 39326 - 39514 69 ## 34 20 Op 2 . + CDS 39526 - 40509 708 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 35 20 Op 3 . + CDS 40506 - 41759 806 ## COG0826 Collagenase and related proteases + Term 41890 - 41938 5.8 36 21 Tu 1 . - CDS 41958 - 42914 809 ## COG2933 Predicted SAM-dependent methyltransferase + Prom 42942 - 43001 5.5 37 22 Tu 1 . + CDS 43057 - 46656 3461 ## COG0277 FAD/FMN-containing dehydrogenases 38 23 Op 1 5/0.071 + CDS 46762 - 47613 843 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 39 23 Op 2 . + CDS 47600 - 49024 1329 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases + Term 49046 - 49086 4.0 - Term 49693 - 49750 14.7 40 24 Op 1 . - CDS 49777 - 51108 1120 ## COG0739 Membrane proteins related to metalloendopeptidases 41 24 Op 2 . - CDS 51171 - 51593 448 ## COG0517 FOG: CBS domain - Prom 51734 - 51793 3.9 + Prom 51678 - 51737 4.1 42 25 Op 1 . + CDS 51846 - 52508 451 ## COG2518 Protein-L-isoaspartate carboxylmethyltransferase 43 25 Op 2 . + CDS 52584 - 53438 801 ## COG0489 ATPases involved in chromosome partitioning + Term 53465 - 53510 8.6 + Prom 53458 - 53517 3.9 44 26 Op 1 . + CDS 53625 - 54185 322 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 45 26 Op 2 . + CDS 54185 - 54544 205 ## Ddes_1194 septum formation initiator 46 26 Op 3 . 
+ CDS 54507 - 55529 550 ## LI0371 hypothetical protein
47 26 Op 4 . + CDS 55556 - 56506 657 ## DvMF_0405 hypothetical protein
+ Term 56525 - 56563 3.0
48 27 Tu 1 . + CDS 56599 - 57603 826 ## COG0158 Fructose-1,6-bisphosphatase
+ Prom 57660 - 57719 2.5
49 28 Tu 1 . + CDS 57740 - 58828 606 ## PROTEIN SUPPORTED gi|229879751|ref|ZP_04499249.1| (SSU ribosomal protein S18P)-alanine acetyltransferase
+ Prom 58852 - 58911 3.6
50 29 Op 1 11/0.000 + CDS 58988 - 59311 260 ## COG0526 Thiol-disulfide isomerase and thioredoxins
51 29 Op 2 . + CDS 59308 - 60234 533 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9
52 29 Op 3 . + CDS 60237 - 60959 353 ## COG4105 DNA uptake lipoprotein
+ Term 60978 - 61029 10.6
- Term 61081 - 61123 5.2
53 30 Tu 1 . - CDS 61140 - 61568 563 ## COG4747 ACT domain-containing protein
- Prom 61619 - 61678 2.3
54 31 Tu 1 . - CDS 61689 - 61889 284 ## COG1826 Sec-independent protein secretion pathway components
+ Prom 61855 - 61914 3.0
55 32 Op 1 . + CDS 62144 - 63040 325 ## DVU1370 hypothetical protein
56 32 Op 2 . + CDS 63040 - 63726 388 ## COG0546 Predicted phosphatases
57 32 Op 3 . + CDS 63764 - 64063 399 ## LI0363 integral membrane protein
+ Prom 64121 - 64180 1.9
58 33 Op 1 . + CDS 64246 - 64479 371 ## Ddes_1902 hypothetical protein
59 33 Op 2 32/0.000 + CDS 64479 - 66170 1338 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase]
60 33 Op 3 . + CDS 66214 - 66705 488 ## COG0440 Acetolactate synthase, small (regulatory) subunit
+ Term 66714 - 66761 11.2
- Term 66706 - 66743 5.6
61 34 Tu 1 . - CDS 66911 - 67090 78 ##
+ Prom 66790 - 66849 4.0
62 35 Tu 1 . + CDS 66998 - 67987 1314 ## COG0059 Ketol-acid reductoisomerase
+ Term 68015 - 68052 6.2
- Term 68001 - 68038 3.0
63 36 Tu 1 . - CDS 68057 - 68275 110 ##
- Prom 68328 - 68387 2.9
- Term 68534 - 68577 14.0
64 37 Op 1 .
- CDS 68589 - 68894 188 ##
- Term 68901 - 68940 1.6
65 37 Op 2 . - CDS 69061 - 69273 61 ##
- Prom 69388 - 69447 3.8
- Term 69384 - 69425 10.3
66 38 Op 1 . - CDS 69495 - 70880 1733 ## COG0471 Di- and tricarboxylate transporters
67 38 Op 2 . - CDS 71043 - 72719 1919 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit
68 38 Op 3 . - CDS 72764 - 73006 277 ## gi|266621852|ref|ZP_06114787.1| ferredoxin, 4Fe-4S
- Prom 73135 - 73194 11.5
- Term 73112 - 73160 7.1
69 39 Tu 1 . - CDS 73229 - 73939 800 ## Dde_0463 Crp/FNR family transcriptional regulator
+ Prom 74139 - 74198 3.1
70 40 Tu 1 1/0.214 + CDS 74260 - 74634 369 ## COG3111 Uncharacterized conserved protein
+ Term 74643 - 74691 15.2
71 41 Op 1 40/0.000 + CDS 74726 - 75418 437 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
+ Term 75447 - 75485 -0.8
+ Prom 75425 - 75484 1.5
72 41 Op 2 . + CDS 75573 - 76793 663 ## COG0642 Signal transduction histidine kinase
+ Term 76913 - 76964 3.9
73 42 Op 1 . - CDS 77028 - 77810 472 ## COG0500 SAM-dependent methyltransferases
74 42 Op 2 . - CDS 77876 - 78526 349 ## DVU1787 nuclease domain-containing protein
75 42 Op 3 . - CDS 78544 - 78831 250 ## LI0344 hypothetical protein
- Prom 79070 - 79129 3.6
+ Prom 78823 - 78882 4.7
76 43 Tu 1 . + CDS 79093 - 80775 1904 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains
+ Term 80785 - 80855 26.9
- Term 80785 - 80828 8.6
77 44 Op 1 . - CDS 80865 - 81812 1097 ## COG0331 (acyl-carrier-protein) S-malonyltransferase
78 44 Op 2 . - CDS 81821 - 82267 369 ## DvMF_0187 hypothetical protein
79 44 Op 3 . - CDS 82357 - 83349 1123 ## COG1494 Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins
- Prom 83522 - 83581 2.7
80 45 Tu 1 . + CDS 83522 - 86065 1102 ## COG1643 HrpA-like helicases
+ Term 86093 - 86132 4.1
+ Prom 86144 - 86203 4.4
81 46 Tu 1 .
+ CDS 86245 - 87900 1345 ## COG0018 Arginyl-tRNA synthetase 82 47 Op 1 . + CDS 88096 - 88689 451 ## LI0230 hypothetical protein 83 47 Op 2 23/0.000 + CDS 88694 - 89491 584 ## COG0767 ABC-type transport system involved in resistance to organic solvents, permease component 84 47 Op 3 13/0.000 + CDS 89501 - 90331 283 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 85 47 Op 4 13/0.000 + CDS 90364 - 90807 519 ## COG1463 ABC-type transport system involved in resistance to organic solvents, periplasmic component 86 47 Op 5 5/0.071 + CDS 90810 - 91439 498 ## COG2854 ABC-type transport system involved in resistance to organic solvents, auxiliary component 87 47 Op 6 . + CDS 91451 - 92305 409 ## COG2853 Surface lipoprotein Predicted protein(s) >gi|316923479|gb|ADCP01000069.1| GENE 1 2 - 331 148 109 aa, chain + ## HITS:0 COG:no KEGG:no NR:no VCVCVCVCVCVCVCVCSKKRTGTRLRSGPLPPATFLAYTQPGPPSKDDGGRTADGEAEHA AWKTSDDYGQKGIYVSLCFLLGKKANNFCVSPFWGDTRMPLSESHLIVF >gi|316923479|gb|ADCP01000069.1| GENE 2 449 - 3019 2159 856 aa, chain - ## HITS:1 COG:PA3921 KEGG:ns NR:ns ## COG: PA3921 COG2909 # Protein_GI_number: 15599116 # Func_class: K Transcription # Function: ATP-dependent transcriptional regulator # Organism: Pseudomonas aeruginosa # 17 459 35 465 906 69 28.0 3e-11 MEHNVCTGISDVRYFSPRLLAALTGILSSPLTVIEAPMGYGKTVAVREFLRRENLAVIWV SVLDSPDGAFWRSLCREFERLPETAETARALRRLGPPFAPDDAARRDAALELLESLDFAK PTVLVVDDVHLLASAPSCVRFFQILARQAMSNLHIVLTTRHFQTDDALLSDLKGELARLG PALLAFSEEDIRAYCALCGLKVSPEQIRALHGATGGWVSGVYLHCRHYAQHGSFSLPSFA RSFTLPSPQSSSLSSPQYSENELPPDMAALLEEQLYRPLSPDVRDMLFALCPLEQFTLPQ ADFCCGTDTRAALEALVRQNSFVRRREGSGVYTVHAIFRSLLLRLFRALPREQRQAVHRR CGDWFAAEDEFIPAMEHYHAARDFERALSVMERDMARHLVTESAAFFARLFQDCPEEVLS RHPKAAFKHALAALSASDFPAFADRCRWLARYCAALDEAHPATPVLRGELEMLLALAEYN DIAAMSVRHRRAWELLGRPTGLYPPESTWSMGCPSVLFMFHRESGNIREEVRLMRECMPH YYKAAAWHGAGGELLFEAEALYMAGDFTEALRLCRQAETVAALHGQLCNTLCALFLRARL ALARKNTSTACAAVREMRDLITKKQDYFLLHTAEVCAARLSGLLRRPDEIPAWVHEGRGE 
RMYAFAQGDFRLAQGRALLLAGDAAAVLGLFHALLQTPLFDKHRLFFIYAHIFLAAAHAM LDAREKALESLRAALDAALPDDVLMPFVENADLVLPLLLAPQETRHEDGVRRILALAEPW LRHLGLPGTRASEIPFGLSVKQYEIARLAAEGRTGPEIACRVGLGLNTVKTHLKTVYRKC GANNRPELRRLLHGEK >gi|316923479|gb|ADCP01000069.1| GENE 3 3337 - 3603 180 88 aa, chain + ## HITS:1 COG:STM2765 KEGG:ns NR:ns ## COG: STM2765 COG2801 # Protein_GI_number: 16766077 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Salmonella typhimurium LT2 # 1 88 1 88 88 116 68.0 9e-27 MKKSWFSEHQIITILKSGEAGRTAKDVCREHGISSATFYAWKSKFGGMEASDIKRMRDLE HENALLKQMYEDLSLENQPLKDVIEKKF >gi|316923479|gb|ADCP01000069.1| GENE 4 3553 - 3837 155 94 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVPCISLATQHNKIFGKSVSRVAFDQLKKRRYHKAAVTRRWFVKKHASTYAQCRARLADG KSMLFVNEQHDVTLLGLASELFFDHILQGLIFQA >gi|316923479|gb|ADCP01000069.1| GENE 5 3700 - 4020 79 106 aa, chain + ## HITS:1 COG:STM2764 KEGG:ns NR:ns ## COG: STM2764 COG2801 # Protein_GI_number: 16766076 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Salmonella typhimurium LT2 # 23 104 6 87 155 97 59.0 4e-21 MFLYKPTPRDGSLVITTLLELVERYPRYGFAKYFVVLRREGYTWNHKRVYRSYRQLQLNM RRKGKRRLPSRAPVRLEAQTVVNGCWSVDFMSDALMHGQRAWRFQP >gi|316923479|gb|ADCP01000069.1| GENE 6 4121 - 4318 116 65 aa, chain + ## HITS:1 COG:YPCD1.69 KEGG:ns NR:ns ## COG: YPCD1.69 COG2801 # Protein_GI_number: 16082757 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 1 62 132 193 193 60 46.0 1e-09 MERFNRTFREEVLNFYVFSRLSEIRTIVDAWVQEYNEQRPHESLGNLTPEEFALKHAGGS SLALH >gi|316923479|gb|ADCP01000069.1| GENE 7 4461 - 5129 615 222 aa, chain - ## HITS:1 COG:lin0467 KEGG:ns NR:ns ## COG: lin0467 COG3619 # Protein_GI_number: 16799543 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 8 218 6 213 221 160 45.0 2e-39 
MKHGKHIQVSESFLLCALLTMTGGFLDVYTYITRGHVFANAQTGNVVLLGLNLAEGNIKE VAFYLFPIIAFALGILFTEWIRAKFKEYDLLHWRQIVVFFEALLLFFPAFMASGMWDTAV NILVSFVCAVQVESFRKVNGHSVATTMCTGNLRSATEQIFHYVRTRDPDTKKTILSYYEI VVFFALGATLGAALSALFAEKAILFCCLFLIIAFLSMFAKEI >gi|316923479|gb|ADCP01000069.1| GENE 8 5836 - 6645 1273 269 aa, chain - ## HITS:1 COG:FN0658 KEGG:ns NR:ns ## COG: FN0658 COG1464 # Protein_GI_number: 19703993 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Fusobacterium nucleatum # 23 268 17 260 261 234 52.0 1e-61 MKNFRSALLAAALSLVVAAGSATAAQATTKIRVGASPTPHAEILKVANDVLKPQGYELQI IEYSDYVQPNMALEGKELDANFFQHKPYLDDFNKEKGTKLVSIGTVHYEPFGIYAGKTKS LGALKDGAMVAVPNDTTNEARALLLLQSNGLIKLKDGAGLTATRRDIAENPKKLKIEEIE AAQLVRALPDVDLAIINGNYAILGGLKVADALSAEKADSIAATTYANILAVRAGDENRPE LKALIDALKSDQVKEFMTKKYEGAVVPAN >gi|316923479|gb|ADCP01000069.1| GENE 9 6724 - 7386 960 220 aa, chain - ## HITS:1 COG:FN0659 KEGG:ns NR:ns ## COG: FN0659 COG2011 # Protein_GI_number: 19703994 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Fusobacterium nucleatum # 2 219 14 231 233 183 49.0 2e-46 MFDQATIEMLLEGVVDTLYMTLVSTFFSYVFGMMMGTVLVICRQDGITPRPMIYAVLDVV VNLTRSFPFLILMIAVIPLTRLLVGTTIGNNATVVPLVIAAAPFVARLVESSLLEVDGGV IEAAQSMGASTMQIIMKVLLPEALPSLINGSAIAATTILGYSAMSGAVGGGGLGKLAIMY GYNRYQTDIMFITVVLLIVIVQVFQSFGNWATKRSDRRIS >gi|316923479|gb|ADCP01000069.1| GENE 10 7379 - 8380 1198 333 aa, chain - ## HITS:1 COG:BH3481 KEGG:ns NR:ns ## COG: BH3481 COG1135 # Protein_GI_number: 15616043 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Bacillus halodurans # 5 328 1 333 338 300 47.0 3e-81 MTEAIIQIQDLEKRFRSKNTEVYALQGINLTIRKGDIFGIIGKSGAGKSTLVRCINMLER PTGGSVMFEGRDMCRLGSRDLQIARRSMGMIFQQFNLLMQRTAEENICFPLELAGVKKDA ARERARELLELVNLSDRAQSYPSQLSGGQKQRVAIARALATNPKILLCDEATSALDPATT 
ESILSLIKDINRRLGITAIIITHEMSVIEKICNQVAIISHGRIAETGSVEEVFFHPQTEE ARQLVIPEALQDMPHSRLYRIIFNGRSSFEPVISNLVLECGAPVNIMFADTRDIGGTAFG QMVLQLPDNEEVVRRILAYADSKGLMLEEMKNV >gi|316923479|gb|ADCP01000069.1| GENE 11 8540 - 8689 57 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDRTVIHPVSREWGNVTPMVLRGGLGCAHRSMDLMCGLSLSRLISCVTL >gi|316923479|gb|ADCP01000069.1| GENE 12 8623 - 9297 318 224 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 [Bacillus selenitireducens MLS10] # 14 224 17 234 236 127 34 3e-28 MDITVFTELEFFFRILIAGICGGLIGYERNNRLKEAGIRTHLIVALAAALIMVVSKYGFS DVTTLKGVALDPSRIAAQIVTGVGFLGAGMIFVRNQTISGLTTAAGVWATAGIGMTIGAG LYFLGVAATLLIVAAQMTLHKNFTWLNFPVAEQISMEIEDTGDGVASIRDKLLSHNLEII SMKSRNNGNGLISLDLYVKLPRNYNVTQLMSLLSDNPHIKSIDL >gi|316923479|gb|ADCP01000069.1| GENE 13 9637 - 9981 351 114 aa, chain - ## HITS:1 COG:TM1287 KEGG:ns NR:ns ## COG: TM1287 COG0662 # Protein_GI_number: 15644042 # Func_class: G Carbohydrate transport and metabolism # Function: Mannose-6-phosphate isomerase # Organism: Thermotoga maritima # 16 112 23 119 121 94 47.0 6e-20 MIRHYDELTKLDMQHKGGKGHILAAELLNGEDFAGKGRVFNHCVLKPGCSVGRHRHVGDF EVYHVLSGTGLYFDNGELKPVTAGDVMICKDGEEHMLENDGTEDLEFIALILYA >gi|316923479|gb|ADCP01000069.1| GENE 14 10047 - 10883 524 278 aa, chain - ## HITS:1 COG:BH3849 KEGG:ns NR:ns ## COG: BH3849 COG0656 # Protein_GI_number: 15616411 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Bacillus halodurans # 3 275 2 275 278 238 40.0 7e-63 MSIRDIAEPVSLNDGTAMPGYGFGCYAAHGEEIINAIRWAVRDGYRYIDSAAMYGNEAEV GEAIRTCGVQREDLFISSKIWPTRFDDPDASLAQSLRDLGIDVLDCCLLHWPGTDRKQRL SAYEKLLRRREEGFVRRVGVSNFQIKHLEEIRSAFGSFPVLNQIELHPLYQERDLCAYCR ENGIRLVAWGPLFRGKLMQHPAIVEIAAAHGKTPGQIILRWHIQKGHIAIPKSSNESRIH ENGDVFSFILTPDDMDRIDALDCGEHIGDDPYTFTGSK >gi|316923479|gb|ADCP01000069.1| GENE 15 11064 - 11480 513 138 aa, chain + ## HITS:1 COG:SMc00595 KEGG:ns NR:ns ## COG: SMc00595 COG0105 # Protein_GI_number: 15964917 # 
Func_class: F Nucleotide transport and metabolism # Function: Nucleoside diphosphate kinase # Organism: Sinorhizobium meliloti # 2 138 3 139 140 174 64.0 4e-44 MIERTLSLIKPDAVQRNLTGEILAMIQGAGLKVVALKMIHMTKAQAEGFYAVHRERPFFD SLTDYMSSGPVVCSILEGEDAIHRYRELMGATNPEKAAEGTIRKKYAVSLEANSVHGSDA PETAAFETRYFFSAFEIV >gi|316923479|gb|ADCP01000069.1| GENE 16 11600 - 12409 961 269 aa, chain - ## HITS:1 COG:PM1756 KEGG:ns NR:ns ## COG: PM1756 COG3302 # Protein_GI_number: 15603621 # Func_class: R General function prediction only # Function: DMSO reductase anchor subunit # Organism: Pasteurella multocida # 7 269 9 278 282 72 30.0 1e-12 MHGYWSLVIFTLLGQAAAGMLILSFFSRTADTSRAKAWAACILLGVGALASLEHLSDPTV SFYTITNVGTSWLSREILFVGLFGAGLLLWLITLNAWARRLAAILGLAFVYVMSRVYTIP TVPFWNSLFTYWLFLATSLLLGSSLLLFMDALAARKDPEKKATLLLGWYPVFIVLAFILQ MLVIPLQLLLAQSPFSTYLLAWHLTLLLFGAALGPLLLIRNGVNEMLPGACRTCPCPLWI RAGIILLLIVAGEVCGRALFYSGYTWFGM >gi|316923479|gb|ADCP01000069.1| GENE 17 12412 - 13041 729 209 aa, chain - ## HITS:1 COG:dmsB KEGG:ns NR:ns ## COG: dmsB COG0437 # Protein_GI_number: 16128862 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli K12 # 1 209 1 205 205 214 49.0 1e-55 MIKNPAFYFNSELCTGCKACMIACIDKHNLNKGVLWRRVLEYSGGEWLPVGDAYEQNVFA YYVSLSCNHCENPVCAEACPTQAMHKDENGIVSVDPDRCVGCRYCEWNCPYGAPQFDPER KKMTKCDFCRDYLEQGLPPSCVAACPCRALDYGEYDELLKRYGEQAAIAPLPGPELTRPH FICAPNRHSKPQGSTEGLRGKRISNPEEV >gi|316923479|gb|ADCP01000069.1| GENE 18 13107 - 15545 2619 812 aa, chain - ## HITS:1 COG:STM0964 KEGG:ns NR:ns ## COG: STM0964 COG0243 # Protein_GI_number: 16764325 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Salmonella typhimurium LT2 # 4 812 6 814 814 921 57.0 0 MRKPHCTEQALLKRRDVIKGGAVVLASAFFPFSIEILNAGSAVAAPPAPSGTGEERILWN SCNVNCGSRCALRVHVKDGVITRVETDNTGDDRYGMQQLRACPRGRSMRQRIYAEQRIPY PLRRVGNRGEGKFERISWEEAFKEIGQRLRGTIDTYGNEAVYLNYGTGALGSTMGKSWPP 
AATPVARLMNLVGGYLNHYSDYSTCQITVGMPYLYGGSWVDGNSLSDMENSELAVFFGNN PSETRMSGCKAKTLQHARFTRNTRVIIIDPRYTDSMVSVGDEWIPIRPGTDAALSAALAY VMITEDLIDKPFLAKYTIGYDEESLPKGAPAGSSYKSYILGQGPDKTPKTPAWASRITGI PPARIEKLAREIAGARPCFICQGWGPQRTTNGENISRAIGMLAVLTGNVGIKGGNTGARE NAGYKLPMATFPTLENPVKTELSCFNWYQAIDDYKQMTATTAGIRGRERLIAPIKFIWNY AGNCLTNQHGGINQMHPILLDDKKCETIVVIDTTLTPSARYADFLLPSCLNLEEHDWTSD GDSNIAYVIFDNKCIEPLGEAKSIYDICAGVANELGVKEAFTEGRTQYQWLEKLYAESRK AIPELPPTLEEAYTMGVYKRYFPENHIAYKAFRDDPEANPLPSPSGKIEIYSPRLAELAE TWTLLPGQAITALPEYIPNPEGAISPERKEWPLQLIGHHYKQRTHSTYGNCWWLQEVAPQ ELWINPIDAKARGIEFGDRVKVFNGRGVSFVKAKITPRIMPGVVSLPEGAWHTPNAAGED TNGCVNVLTKLLPTALAKGNPHHTNLVQVEKA >gi|316923479|gb|ADCP01000069.1| GENE 19 15741 - 17123 1929 460 aa, chain - ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 2 456 4 454 459 501 61.0 1e-141 MEQGIEREKLGSRLGFLLLSAGCAIGLGNVWRFPFITGAYGGAAFVLIYLVFLVILGLPI MVMEFSVGRAAKQGIGLAFRKLEPKGTFWHLYGYGGIIGCYVLMMFYTTVTGWMLSYCWY MGSGQLSSLTPEQIGAFFGGTLGDPFDQVGWMAVTVVAGFLVCSMGLQRGVERITKIMMG CLLVVMLFLVIRSVTLPGAEKGILFYLKPDFGKMFQHGIWEPIYAAMGQAFFTLSLGIGA MTIFGSYIDKQRSLTGESIHILLLDTFVALMAGLIIFPACFAFDVDAGSGPGLVFVTLPN VFNSMPGGQLWGMLFFVFMSFAALTTIIAVLENIVAYGIDVLKWTRKKAVTVNFVLVFLL SLPCALGFNVLSGIQPFGPGSMILDLEDFIVSNNLLPLGSMVFLFFCCYKRGWGWDNFIA EADTGKGLMFPKGLRLYVKYVLPFIVFFVFVQGYIDKFFK >gi|316923479|gb|ADCP01000069.1| GENE 20 17263 - 20961 3389 1232 aa, chain - ## HITS:1 COG:PA4282 KEGG:ns NR:ns ## COG: PA4282 COG0419 # Protein_GI_number: 15599478 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Pseudomonas aeruginosa # 1 1222 1 1205 1211 243 26.0 2e-63 MRILQIRFKNLNSLAGEWAIDLTHPAFLSDGIFVITGPTGAGKTTILDAVCLALYGRTPR LPRVTKSENEIMSRQTGECFAEVTFETRAGTFRCHWSQWRARKKASGELQAPRHELSDAV LGRIIENNIRGVADKVEEATGMDFERFTRSMLLAQGGFAAFLQAPADQRAPILEQLTGTE 
IYSDISKRVHARVVLERTKLETLEAELAGIHLLEEAEEQRLHADLASLIEQASEAARQSD RTQQALLWLEGIRTLQAELAMLENRKKQLAGRIEAFQPERRKLERANRALELSGAYAGLV SLRDAQAQALRQHKEYSEALPACEAELRQTETAFASVGTVLEQGKREREQGLRTTRAVRE YDTKLLEKEAALSALKRELAERENAFTTAQSTHQSLNAQLADIRASLTEVLAALQHTAAD QGLVENLSAIQHSCGQLRQLAETGREKKRAVEAALARQQETAKAHMDALALKERQDRETA RLEQAVALRRKALAELLGDRSVADWRASLMNAATRSSLLDKTEEARRQYETCLLGIKELE QRQEKRARNEAELTQHLQSAEKEHRRLEAEYEKREAEQAECSRLRAFEEARRHLHNGEPC PLCGATEHPFTGFSVPAPAGPEEHLARLKEELKRSGTGLASLQGDLAGLSREREWVTEQL QKEAATLSEHETHVRECLAALAHPSITPATPKQPLPAALLLTLQNLRETTETERRLAERI LQQAEQEDAGLQADMTAWEKARHEANRLELALQTALHHKDSAEQEAVRAERERSSLLEQY TALRQTVLQELRPYGIEALAADGPETLLHSLSERRERWVSREKRRDLLQQKMSALELESR HLASRLLDMEKDLGKQREAARSLSEDREKLRLDRRRLLGDKSPDEEEKRLNGLVEEAEKR LEGVRQTVDAAKQRFAGLTSKRETLDRSIAERQAQIGPLEEDFRLRLSQTGFADEAEYRD ACLTETVRNTLAQREQELLTERAELEARLADRIAQLAAEREKQVTDRTREELDETLATLR DTLKTQQEAIGGLRQKLHDNHMVKLAHQERTAAVEAQRRECRRWNDLHDLIGSADGKKYR NFVQVLTFEIMIEHANRQLRRMTDRYLLLRNKEQPLELDVIDGYQAGEVRSTKNLSGGES FIVSLALALGLSQMASKTVRVDSLFLDEGFGTLDEDTLDTALDTLAGLHRDGKLIGVISH VAALKERIGTQIQVTPKTGGRSVVSGPGCSSV >gi|316923479|gb|ADCP01000069.1| GENE 21 20972 - 22201 1237 409 aa, chain - ## HITS:1 COG:YPO3206 KEGG:ns NR:ns ## COG: YPO3206 COG0420 # Protein_GI_number: 16123367 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Yersinia pestis # 6 399 1 397 414 258 37.0 2e-68 MKTTSLRILHTSDWHIGRTLYGRRRYETFSAFLDWLADTVRDRHVDVLIVAGDVFDTSAP SNRAQELYYRFLCRVMPFCRHIVVVAGNHDSPSFLTAPKELLRALNVHVVGSISDDPGHE ILRLDGPDGEPELIVCAVPYLRDRDIRVVEPGESIEDKERNLIDGIRRHYAAISAQAETY RTRPDLPILATGHLFASGGQTADGDGVRQLYVGSLAQVTADCFPDNIDYLALGHLHIPQK IGGSETRRYSGSPVPMGFGERGAKSVCLVDFAGRQASVECLPVPVFQELERIEGAWEAIE SRLKAIAAGGGHPWLEIIYTGSELISDLRDRVEALTAGADMEVLRIKNARIMERALERDD DADTLDALDVYEVFERCLAAHGVPDEQRPDLRLAYRETVASFLEEDPHA >gi|316923479|gb|ADCP01000069.1| GENE 22 22281 - 24479 2120 732 aa, chain - ## HITS:1 COG:no KEGG:Dbac_0772 NR:ns ## KEGG: Dbac_0772 # Name: not_defined # Def: hypothetical protein # Organism: 
D.baculatum # Pathway: not_defined # 5 710 7 711 747 562 43.0 1e-158 MPNHAAFNWKFHRIGGLDQVTLRTPEELLHLNELDPKLWVALSCPIDKLQFDARTLELLD ADKDGRIRIQEVLDAVHWTVDRLSDPALLAESRPELALEEIRQDTDEGRLLYSTASRILT QSGKEGALTQEDVAEAIDAATQTAFNGDGVMTPHPSFDPDMNRFIEDIVATTGGATDAGG QQGATLELAQTFMRNIQDYKAWFDELAAYADTFPLGSGTDTAFQIFQSVRPKIDDYFTRC QLVAFDVRASDALNAPETIFAQLVDHALNTDDEAMRELPLAHIAADQPLPLEKGLNPAWR DQIIALKDTIAVPLLGVRDELTSDKWQTIKTRFAPYAELVARKPENGVEKLGMERLGELL SSDLPSRFEALTQQDEDAAGHLKTLTDVERLVLYHHHLHRLLMNFVSFCDFYALSRPTTF QIGTLFIDGRGCNLCLRVDDITNHAAQAQPSHLCLAYCECSRLDTGQKMNIVAAVTAGDS NLLIPGRHGVFVDTQGQSWEAVLVKLVDNPISIRTAMFAPYKRFGRMITDQLEKLASSKD SALMSDASKSIDKLTADAQKEAKPFDIGRSMGIFAAIGLALGAIGTALASIAAALFSLAW WQFPLLFVGIFLCISGPSMFLAWLKLRRRTLGPVLDASGWAVNSQIPINFMLGSCLTDAA ALPPNASRSFDDPFRKQSRWKKWAVALGVVCLAVLLAGGGYWGWKEYQKRHPAKQEQTSD IAAPKVPAPSAK >gi|316923479|gb|ADCP01000069.1| GENE 23 24669 - 24935 333 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPKLQAARVTQCGLDFAVVLVDPKLFMNPSELEAFSERAQALFPGVPVAMLSYDDLKVTR YHGPDDVVAFLKQVRVVSLPWVEYTIGE >gi|316923479|gb|ADCP01000069.1| GENE 24 24970 - 28152 2245 1060 aa, chain - ## HITS:1 COG:PA5146 KEGG:ns NR:ns ## COG: PA5146 COG2982 # Protein_GI_number: 15600339 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in outer membrane biogenesis # Organism: Pseudomonas aeruginosa # 626 1003 229 664 750 68 22.0 5e-11 MKRSTSILLYVLLPLFWALVLALGFFIYTFNQDPVGFARSLSARFSAPEAGFVLSADQAS LSLFPRPAASVSGLTIRTPAMTLFVDEGAVYPDFWALLHGETRIAGVRLLKPTLLLEPPA GTTPDATQQPRFSIPPQLAEMDVELKDGTIASLLPGANKSGIHSQWRMSGISGSATVPGN GEPGELSLSVGKMEWYGNTPPEKDGQAPPIQTLSNVDLEISDLEYAFPADAAAMLRFKAT CAMPLSFGEGSPRFVLGVTARTTQGTLAIDGVASLDGTFSLKRQSVPVHVLLPFTTEAPL GSLFQGAPLPQIAIKGASLKVEGDQATLDGRLIFGADYVPTVRGTLALKHLSLPRWFGFA RDLPPGVQVALDNISGTLPFELTPQKLVASSVTATTLNTVFTGGGGVNDFSNPVIALHLA TKDAPLNRVFPEVENKAVSAPSYKVPPLLGGDDSNTAVGYDIHLEAARATLWKWAANGVS VRITPDPASRTEQTKVAIRCGSLYGGSAQGDLIPGDVMALNLTASGVNVDSLFSPVAGYP VLKGTLTGSASFTARPTSPAAFLSSLKGKGEALIEKGNLSLSRAKKEKNLAFSQLRVAFQ 
GAGSRTQGTQPRYAYTGKWQGSLTTPAGQSSLDLTGALQFPTSGPFNLFADAVSASGKLS ANGIGGQASGKLSLNTQANTLEAKELSGQLLTKDASASFTGSVSGTRLDANPAWDASLSV STGNLRAFLAQWGMLPSSLPQQALRQAQIKAKIRADESILRLSELEGRVDDTRLAGQIEG TKGTPPHWTAKLRLGTLRLGDYLPASSKYAQPSTPWQTEWLRKVQLDGDLSVERLVIARI PHENLTVPVTIKNGVLTADPIKARVAGGTTGAGLRAEGTAGGLLARLRYTLSAVNVLTLC KERGQEQLLSGTGSLDADVSGLLRSGADIPAALSGTLNFVIRNGELDAKKPGPMSRFSSL SASGALSKGILTTRDLNLSGGLSVRGQGSINLINKTLNYALNVTGPGIPEIPVRYYGSLD APQRSFNATGILANVFNSIGSGVLNILDIVVSAPLRLLAP >gi|316923479|gb|ADCP01000069.1| GENE 25 28149 - 31403 2599 1084 aa, chain - ## HITS:1 COG:CAC2262 KEGG:ns NR:ns ## COG: CAC2262 COG1074 # Protein_GI_number: 15895530 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Clostridium acetobutylicum # 11 829 25 891 1252 96 22.0 3e-19 MPSHDPQLRQIRASAGSGKTYELTTSFLKHLSGAAEAGGGSFSGCSAVHSGPHGWPEILA VTFTNRAAAEMQERIIGRLKDTALGTDKPAPGWTREQARRWVGIILRRYGALNVRTIDSL LHLIVRLTALELDLPPDFEPVFATDEAIAPLLDSLLEQSRRDERLHSLLEEACRNVFFHS PRQSFLAGKSLREQVMEVLLPVMEARETPLAHPSEIADRLGALIRNLRDAAENLLFLLTE EKLAVSKHLLNALDACRKNPPKNLPPKSTMWDKGSLDECLNKASKGKASDKSLSAFGELQ DAVQKLKSDGELLRRAQTVMPFVELARELSGQVPDFLKREGAVPAAFVPRLARQVLSGDY GVPEAFCRLGTSLTHILVDEFQDTSREQWEAIHPLVLEALSRGGSLTWVGDVKQAIYGWR GGDATLFDEVRSDAELCAVAPEPRVDILPTNWRSCRTIVETNNTLFRQLSETATAKAVLS AMLPKDTPSALLAAILEEGAQLLKEGFAGSEQNVAPDKAEGFLRLQRVYGDKSEDLDEEV RERLLGCVQEVVSRRPWGDVTVLVRSNGKAAQVAGWLMEEGIPVVTDNSFLLAEHPLVEQ ITALLTFLDSPRNDLAFWTFLSGRQMLLPLIPLSEQALEDWAAARRTSERRNMPLFMAFR EDFPDIWRKWIAPFHADAGLLTPYDVTREALGRLDIWSRYPDEAAFVRRFLEIIHVAEGQ GYGSLSSFLDYWNKHGQQEKAPMPETLDAVRVMTMHKSKGLQFPVVIVPWHNFSQRVDSP AVETHVDGLTVLAPRSPASGYAHYKAIADNAREALHLLYVAWTRAEEELHAFLTETSSSR NASGLGSGLNVLLGALPMTKETFETGTPPFVRPAEQAREPKTMRDGKAVAEELALIEPES IPAGQTNERWRPMHWLPRLRIFRNPLEEFSFTQKRRGTFAHHCLACLQTAGQLTGHPEED ARQAFKQGLRTFPLPIRDPETVEHEIVEMLAWYAALPEAGEWLRYGTPEQEIVDESGELY RSDLVVDDGKRITVVEYKTGAPTPAHEIQLQRYMRLISKAAPRPVRGVLVYLDLKRLDYK QLQT 
>gi|316923479|gb|ADCP01000069.1| GENE 26 31419 - 32741 1718 440 aa, chain - ## HITS:1 COG:TP0108 KEGG:ns NR:ns ## COG: TP0108 COG0205 # Protein_GI_number: 15639102 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Treponema pallidum # 11 427 9 453 461 405 48.0 1e-113 MTSEYSIPVREIPIATLGQAKIPSPLPYGNMINTGKVMLSLSHEYDEDIDTHQTLLFEEA GPRQDLYFDPGKTKCAIVTCGGLCPGLNDVIRAIVLEAYHAYNTPSVLGIRYGLEGFIPS YSHNVMELTPASVEHIYQFGGTLLGSSRGPQEPSEIVDALERLNVSVLFVIGGDGSMKAA SAIAKEVRQRNIRISIIGIPKTIDNDINFVPQSFGFDTAVDKATEAIGCAHVEAVGTPNG IGIVKLMGRESGFIAAQSALALREANFVLIPEAPFQLHGDGGLLPALERRLKLRGHAVII AAEGAGQHLLQQNEARDASGNPVLSDVGSLLRTSIVDYFHGKMPISIKYIDPSYIIRSVP ANANDRVYCGFLGQHAVHAAMAGRTEMVVAKIMDRYVHIPLDLVTKKRRKLDIRSGLWRA VLESTGQGELTGMLPEGEKA >gi|316923479|gb|ADCP01000069.1| GENE 27 32744 - 33604 300 286 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 38 270 6 238 255 120 31 3e-26 MITEKNTSPRKAFRSAPKAFEQDASLENASCSDSKEEQAVLSGVKPVLELLEREPERIDA VLVRKGKRSQDTDRILDLCRTAKVRFTLADAQSLDRLCPAGHQGVVARLFEAGFTEFADL LTDATDAPLPLILVLDQVQDPGNAGTLARTLYAMGGAGLVIPRHNGTFLGAGARRAAAGA LERLPVAKVMNIARALDEARDAGFLIYGAAFGEGSLDAFTTRLHTPALLVLGNEEHGIRP QVAKRCHHLLHIPMLRTFDSLNVAQAGGILTSCFARQHLEKNPSGE >gi|316923479|gb|ADCP01000069.1| GENE 28 33772 - 35589 1108 605 aa, chain + ## HITS:1 COG:PA1812 KEGG:ns NR:ns ## COG: PA1812 COG0741 # Protein_GI_number: 15597009 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Pseudomonas aeruginosa # 118 600 110 525 534 206 31.0 8e-53 MRLNGVLPSCVAFRPSGVLGVLLLCVLVALTGCSSKNITSKGPFVLQEPSKGYPSVPSSS GQKTGTVYLPSDDGRPLTKAEQEAFLSEGEIDRNLPQEELGDVLLHFKYLVHKDRYTVEK NLERAQLYMPFIYETLRSRGLPRELAYVAFIESGYNPMATSSSGAAGMWQFISSTGKHYG MEQDWWMDERRDPYQSTRAAADYLDKLYKMFNDWHLAVTAYNAGEGKIQRGLAATGAKTF FELRRKNEQIYSVRDRLSDENKQYLPKFLAVCKIVRNLDKLGFSCATFASSSQIAEVRAK 
PGTDLMLFSKSIGMSWEEFSAHNPAYQRYVSHPSRSTKIYVPRAMAGKAQALLFKLPPAN SGKSYAGLRDYKISRGDTMASISRKTGVPVAELRRINQVSEPLRAGRVLKIPGNNRTGVD PRALASVRGVPSAKGTSVASAGRQVTPSRPVSASAPVREKVVSTASAASHEVKQGDTMYS IAKRYGVTQEAILAANGMKNHNISLGQQLRIPGKGAAVASQSAKKAPVASSVRPVVASVS SVPVSRPVAEAKATNKIVQYTVQNGDTLWAIARKFNVSPVELLSLNNMSRNTALRPGDTV RVAVN >gi|316923479|gb|ADCP01000069.1| GENE 29 35759 - 36307 318 182 aa, chain + ## HITS:1 COG:CAC0677 KEGG:ns NR:ns ## COG: CAC0677 COG0398 # Protein_GI_number: 15893965 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 19 169 75 225 229 61 29.0 1e-09 MPSGYMGGFFYVCSVAVLICLAVPRQLLSFAGGYAFGPLCGAVFATLGVTLGCVLAFGMA RYCGRSFVERRYGDKAAAFNRYAVQRPFILAILIRLFPSGNNLVFSLLAGVSRIPAWPFF LGSCLGYIPQNLLFAMIGSGMRVDKGWRIGLSALLFAASCALGWWLYKRYAAAYPMGARN EH >gi|316923479|gb|ADCP01000069.1| GENE 30 36334 - 37278 517 314 aa, chain + ## HITS:1 COG:FN0847 KEGG:ns NR:ns ## COG: FN0847 COG0457 # Protein_GI_number: 19704182 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 1 314 295 599 599 251 40.0 1e-66 MKGLELSRAYWEKCALPLFRRELPAFLERAAVGLVGEGSECFGFDDEISQDHDWGPGFCL WLPEGEWAEWQDAVEALLARLPDTYEGFPARMASAKRMGRVGPLSIEGFYGRFIGMPQPP QTWKQWRLVPEQFLAVSTNGAVFSDRLGAFTAFREALLGFYPEDVRLKKIAARCMGMAQA GQYNLLRSLKRGETATAMLAAARFSEQAVSMTFLLNKRYMPFYKWAHRGVEQLPILGRET AACVTALAGLDWRLGPRVEAVAGDIVESLCRDVAGRLRSDGLSDADGDWLVEHGPSVQAR IETPELQRMSVMLE >gi|316923479|gb|ADCP01000069.1| GENE 31 37330 - 38148 923 272 aa, chain - ## HITS:1 COG:all3780 KEGG:ns NR:ns ## COG: all3780 COG0457 # Protein_GI_number: 17231272 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. 
PCC 7120 # 12 259 197 450 499 94 29.0 3e-19 MQLDEHFQRRLNEARMFFAEGKALAAEKIYRDLLKPDLPAGGRVLVLDGLGRCLHIQGRL EEAETFFRESLRFLEELFGPNHIHVAGGLQNLARLRSERGECEEAATLGERALDILKQNL PADDLRIADALLNLSSHQYTAKKYDAAEANLKMALHLWEAKEGRRCFGVSTCLNNLGRIC EERGETRQGVLYHQEAVSIRKEILGIHPETAFSLGNCGAALAGDAQWEKAAQTLEEALSC YESLGLSDSPEAVTCRNNLNLCRNAVKQSAAQ >gi|316923479|gb|ADCP01000069.1| GENE 32 38287 - 39327 1213 346 aa, chain - ## HITS:1 COG:no KEGG:CPR_2030 NR:ns ## KEGG: CPR_2030 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 1 346 1 351 351 359 60.0 7e-98 MNYWISVLFRIIPLAMGAICLGYGWYIWDMGSDANTYVAGHVVLFLSIICVALFTTAATI IRQLIHTYSAAFKVALPLIGYVAAAIGIIGGLIFVRSGTGAEHFVAGHVVFGLGLITCCV STVALSSTKFILIPSNSKAGPGHHPDQAFSGGMVALLFAIPIICALIGIVWGIDIVRSDE TPNVVAGHVLAGIGLVCASLVALVASVVRQIQNTYSEADRRFWPWLVIVMGTIDILWGLY LLIFQYGPVTIAPGFVLIGLGIVCYSILSKVLLLALVWRHPYPLAKRIPLIPVLTALTCL FIAAFLFEAATINPAYMVPARVMVGLGGICFTLFSIVSILESGTSS >gi|316923479|gb|ADCP01000069.1| GENE 33 39326 - 39514 69 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MACSFAGGMYEQSQAHRIYEYHYKLNGLDVLKPGFHSVGSRCFRSEYQLFSLRGLVISAF PR >gi|316923479|gb|ADCP01000069.1| GENE 34 39526 - 40509 708 327 aa, chain + ## HITS:1 COG:PA1060 KEGG:ns NR:ns ## COG: PA1060 COG0697 # Protein_GI_number: 15596257 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Pseudomonas aeruginosa # 19 308 12 291 301 199 47.0 4e-51 MMRFIQNHVSLSVRAGLALFGAVLLWSSSFIALKIAVSAFDPMVMVFGRMLSSLVALMLL RVTVWRRAEAPMLLDRRVTRREWKYIVLLALCEPCFYFVFEGYAMIYTTASQAGMVVAAL PLAVVVAAWLLLGERPHRRVWIGFVLAVVGVVWLSAGSEATESAPNPIFGNFLEVLAMLC GALYVVCAKQLSSRCSPVLITTMQSLIGLLFFLCLLPLPAVKLPDSFPLLPTLAILYLGV GVTMLSFLLYNFAVRSVPASRTGAFLNLVPVLTLFMGMVFLDERLTAGQWAASALVFGGV ILSQWKTAQSEEKADSSATASLSEGAE >gi|316923479|gb|ADCP01000069.1| GENE 35 40506 - 41759 806 417 aa, chain + ## HITS:1 COG:CAC1687 KEGG:ns NR:ns ## COG: 
CAC1687 COG0826 # Protein_GI_number: 15894964 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Clostridium acetobutylicum # 8 283 4 274 406 231 42.0 1e-60 MKERERLPELLVPAGGREQLESAILYGADAVYMGGPELSLRTACEGFSGEELGLAVADAH TAGVRVYYCLNAMPYDAQLPAVEAVLERLPGMGVDGLIAADPGVIWLAKKHCPSVPLHLS TQAHSVNGAAVAFWREAGVERINLARELGFKQIRALAEAFPGVDFEVFVHGAMCLALSGH CLLSAWVNNRPANQGRCTQPCRFEYRGLSLLVEEQKRSGEALWEIREGEAFSGFWAPQDL CLLRYVGCLADLGVRALKLEGRTKSGGYVAQIADVYRTALDRHARREAGGSDCGKVEIST AALLEELFHTASRPLSTGFFLPRRRVEAPPAGLLPRPVVARLAEPEGDGWRVQVRSPWSS DREASVLVPGMRRPALLPGAYRLENHRGERVDLLHPGMEGVLHCEIDGLGKGLYVRA >gi|316923479|gb|ADCP01000069.1| GENE 36 41958 - 42914 809 318 aa, chain - ## HITS:1 COG:HI1195m KEGG:ns NR:ns ## COG: HI1195m COG2933 # Protein_GI_number: 16273647 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Haemophilus influenzae # 162 281 189 314 363 82 34.0 7e-16 MKKISADASISGNASPITVPFTAYLAAKGFLEDLLAELGDEVIGVRGRLVLAKGAPRSAA WAQNTWETPVFLPIESIGDGARKLTAIQRNWHLHSEENHRRASLIQEKLPHVSEKPRHFG EAAPTAPLGSWTLWEPNLILASARCSSPFPDGVVQFHENKIDPPSRAYLKLWESFTLLPK RPGPGDLCLDLGSAPGGWTWVLASLGAQVFSIDKAPLAPQVDHMPGVSHCIGSGFALDPR HAGAVDWLFSDMICYPDKLYEVICRWIELGSCRNFVCTLKFQAETDHATAAKFAAIPGSR LMHLSCNKHELTWAKLEQ >gi|316923479|gb|ADCP01000069.1| GENE 37 43057 - 46656 3461 1199 aa, chain + ## HITS:1 COG:NMA1206_1 KEGG:ns NR:ns ## COG: NMA1206_1 COG0277 # Protein_GI_number: 15794150 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Neisseria meningitidis Z2491 # 100 703 130 791 805 350 36.0 1e-95 MPHKGPHISIAPDFVVNRILRINLEDFADWPESVRHLATEIAEELFLVAYNPFVDSATVK ASVKERFDREVFALAHHYANTIGEGITLFWSGHEAEVAFRKELVERLGAFLPPDAIVTRP SALVACATDATDLRMELPLMVVEPANAEQVSELVKLANELKFALIPRGGASGLTGGSVPM RRRSVIVRTTRFTKMSPVDHENMSVTLDAGVITQAAIDAVAKEGFLFTVDPASKTASTIG GNVAENSGGPFAFEYGTTLDNLLSWRMVTPTGEIISIERKNHPRHKILPDEVAVFEVKDI 
SGGVRSVVELHGSEIRLPGLGKDVTNKALGGLPGMQKEGVDGIITDATFIVHKKPAQSRV MVLEFYGRSMHQAAVVIGQIVDLRNRIREEGDYARLSALEEFNAKYVRAIEYQKKSDKHE GVPISVIIIQVDGDEPYLLEKCVQEIVDIVAPHDNVALLVAKDEKEAELFWEDRHRLSAI ARRTSGFKINEDVVIPMQRIPDFALFLEQLNLECSATAYRQALQELGRLPGMALEDKDLN REFVNVTRVAQGGVPSSELSDEEMEERAVEFLRLMAERYDRLAPKIKKISDNMLVGRVVV ASHMHAGDGNCHVNIPVNSNDLHMLEIAEEAAMRVMAEAQEMGGAVSGEHGIGITKIAFL GKDKMDAIREFKNRVDPRDVFNPAKLTQRELPVRPFTFSFNRLIEDIRQSGLPDKDRLIS LLASVQMCTRCGKCKQVCPMMYPECSYHFHPRNKNMVLGAIIEAIYYSQINKGRPDPSIL AELRAMMEHCTGCGRCTSVCPVKIPSADVALQLRAFLDEEGAGGHPLKSKVLNWLVRDPA HRIPQAVKAAALGQRMQNRIIGVVPQAIKKRLYNPLFSGKGPEPGYRNLYEALHLERGNI FVPRSGDGAPVQENTALKEAVLYFPGCGGSLLSRTIGLSALGLLLRTGVAVILPEAHLCC GYPLLSSGADAQFASNMARNKKALQATIACATKRGFRVTHIVTACGSCREGIERLEPSTL LGEGGGELVHLDVMQFVQGRMDANPDLFKGNVLYHASCHPEWVGVHKVKGVQKQAGAIAR LTGAAIEVSPGCCGESGMGAIASPLVYNTLRKRKMDVLEAALADYPAQSPILVGCPSCKV GITRSLMAMHERRPVLHTVEWLATLLFRERWGEKWIRVFRRRIAPSADAQGVRIVELDG >gi|316923479|gb|ADCP01000069.1| GENE 38 46762 - 47613 843 283 aa, chain + ## HITS:1 COG:MA3786 KEGG:ns NR:ns ## COG: MA3786 COG0543 # Protein_GI_number: 20092582 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Methanosarcina acetivorans str.C2A # 1 279 19 296 299 277 51.0 2e-74 MPTTILHKESLIPGRTSKLVLDAPQIAATAKPGHFVMLRVNEQGERFPLTIADTDPEKGT ITIVYLVMGKSTMLLEALSEGDSILDVAGPLGKATHIVKGDSVICVGGGTGIAAMHHIAK GHHKAGNYVIAIIGARSKDLLLFENELKSFCDEVLISTDDGSYGHKGFVTDLLKDRLEKD KKVQEVVAVGPVPMMQAVAGTTLPFGVATTVSLNSLMVDGIGMCGACRVTVDGETKFTCV DGPEFDGHKVDFAELRQRLSAFRSQEKQSIEHHEHRCSCHDKH >gi|316923479|gb|ADCP01000069.1| GENE 39 47600 - 49024 1329 474 aa, chain + ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 9 474 2 461 
468 476 55.0 1e-134 MTNTENVKKPAKTPRVDMPCQPAAVRAHNFKEVALGYSLEQAQLEASRCLQCKNPRCRQG CPVEVRIPEFIAQVVKGDIAEAYRILKSTNSLPAVCGRVCPQENQCEGKCILGVKGEPVA IGRLERFVADNAMHDPCVGVSDNDACAVTNPDLKVACIGSGPSSITVAGYLAARGIKVTV FEALHEAGGVLIYGIPAFRLPKDVVAAELDGLRQNNVEFRTNWVGGRTVSVQQLFDEGYK AVFIGVGAGLPQFLGIPGENLIGVFSANEYLTRVNLGRAYDFPNFDTPTYPGKHVTVFGA GNVAMDAARTALRLGAESSTIIYRRSRAEMPARHEEIEHAEEEGVKLFELVGPLHFNASE SGVLKSVTLQRMALGEPDASGRRRPMPIEGEVFEHETDLAVVAVGTRSNPILLEATPGLE LNKRGYIVVNEETGETSIPNVFAGGDIVTGAATVILAMGAGRKAAQEIAKRLLG >gi|316923479|gb|ADCP01000069.1| GENE 40 49777 - 51108 1120 443 aa, chain - ## HITS:1 COG:aq_1743 KEGG:ns NR:ns ## COG: aq_1743 COG0739 # Protein_GI_number: 15606815 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Aquifex aeolicus # 20 432 17 415 425 217 33.0 4e-56 MGRIIRNTISYGFVTLILAALGFCIYLFLDDLDGPEVTMTPDTGRISPTQEITLKLADAK SGVRSVVVTIHRNNQSLVILDKVFSEPAPVQTTSFNLKNAGLRDGAFELEITARDNSLMS FGKGNGTTRKWNMQLDTQPPKVRIKTTVPALRRGSVTAIAYSVSEDASMTGVQLGEQFFP AFLQPNGLYYCFFPFPLDTPKGKFTPELLTRDLAGNESRNRVLVNAGERNFRKDTLNISD NFLNSKEDVFKELVPDDISNLERYVRINNEVRISNEKTLLEIGKNTSPTMLWSGAFKRLP GSASKASFGDQRTYMHNGTKIDEQTHMGQDLASVAHAPIPAANDGKVVFAEPLGIFGNLV VIDHGLGLQSLYSHMSEIQTNVGATVKKGDIIGLTGTTGLAGGDHLHFGILMHGIQVQPL DWLDPKWIKNTITDRLDAAGAER >gi|316923479|gb|ADCP01000069.1| GENE 41 51171 - 51593 448 140 aa, chain - ## HITS:1 COG:VC0737 KEGG:ns NR:ns ## COG: VC0737 COG0517 # Protein_GI_number: 15640756 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Vibrio cholerae # 1 137 21 154 169 94 40.0 4e-20 MIYVSDLMTSRVFTLRRTDTLQDVRSLMQLAKIRHIPVTEDGDRFVGLLTHRDLLGYAVS HLAEINREEQEEIESSILVGDIMQTDVRTVAPDTLLREAAEILYRNKYGCLPVLDGDNKL VGIITEADFLRLAIALLHDA >gi|316923479|gb|ADCP01000069.1| GENE 42 51846 - 52508 451 220 aa, chain + ## HITS:1 COG:XF0857 KEGG:ns NR:ns ## COG: XF0857 COG2518 # Protein_GI_number: 15837459 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: 
Protein-L-isoaspartate carboxylmethyltransferase # Organism: Xylella fastidiosa 9a5c # 14 215 19 220 225 171 49.0 1e-42 MTELGQHRIFPDFKRQRARMVKLLEEQGITDEAVLAAMRTVRRHLFVPEALQGRAYEDHA LPIGFGQTISQPYIVAFMSQLLEAKPGMKVLEIGTGSGYQAAVLREMGLEVFTIERVREL YEAARELFPEGSPGIRMKLSDGTLGWRVQAPFDRIIVTAGGPNLPLPLLEQLADPGIMAI PVGRARREQKLLRVSKREGRVTARAFGNVAFVDLVGDHGW >gi|316923479|gb|ADCP01000069.1| GENE 43 52584 - 53438 801 284 aa, chain + ## HITS:1 COG:MJ0283 KEGG:ns NR:ns ## COG: MJ0283 COG0489 # Protein_GI_number: 15668458 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Methanococcus jannaschii # 20 284 25 288 290 250 45.0 2e-66 MSSCSNQAPQGQATIPMTAAMQNKVLTQNLKDVRHKLFVMSGKGGVGKSSVTVNLATALA SRGFTVGILDVDIHGPSVPRLLGASASVMADENGKMLPVPCGERMSLISMDSFLKDKDTA ILWRGPKKTGAIRQFLTDVQWGALDYLVIDSPPGTGDEHLTVLDAIPDAGCIVVTTPQEI SLADVRKALDFLKQVQAPVLGIVENMSGLSCPHCGKEIDLFKKGGGEQLAKQYELPFLGA IPLDPATVIAADRGVPVVSLTENSPARQGFMALADAVIAATEAE >gi|316923479|gb|ADCP01000069.1| GENE 44 53625 - 54185 322 186 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 1 174 477 664 904 128 40 9e-29 NTMFNLANQLTLARIFFVVPIILLLYFPGKITCFLAAVLFGIASLTDFLDGHIARKGNMV TSFGKFLDPLADKLLICSILIMFVELGWVPAWVTVVIIGRELAVTGLRAMAIDEGVVIAA DKYGKVKTVMQIIAIIPLLLHYPLLGVNVHLIGNFLLYIALVLTIFSGLNYFYKFYGIWR ENDRAA >gi|316923479|gb|ADCP01000069.1| GENE 45 54185 - 54544 205 119 aa, chain + ## HITS:1 COG:no KEGG:Ddes_1194 NR:ns ## KEGG: Ddes_1194 # Name: not_defined # Def: septum formation initiator # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 15 115 2 97 100 108 57.0 4e-23 MNVRSTANTSSKATLWKVFILILAIIVNVVLASRLLWGPQSLVSYRELASQYAELLKERD GFDVVNAGLSREIRLLQSDEKYVEKMIRQRLNFVRSNEILYLFTEDANARGALPHDGKN >gi|316923479|gb|ADCP01000069.1| GENE 46 54507 - 55529 550 340 aa, chain + ## HITS:1 COG:no KEGG:LI0371 NR:ns ## KEGG: LI0371 # Name: not_defined # Def: 
hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 8 339 1 292 293 148 33.0 3e-34 MRVEPCLMMEKIEWYQEVLKLDPDSKVFFPLAKLLRDSQQPDKAIEVLRAGLQHSSVFLE ARLLLIQILFEQSRAGECSEELSTVTGLLERYPAFWEVWAESVSEKNRDLALAIRLMAST IRHPEHSLSLILESGLGILQDVRRSFSSSQSIEASDSEPPCGVSHEETLCTATPSAQELS SCTVPVEESSIADAELLESMLEESAPVQEHPSCVTPFLPNKAKSVHWVDPDEDSDLMADE NPEEPTLRTRSMADVLAEQGDIVGALEIYQELEAAAPTPEEARELHDSVAALVSRMAGNS TEPGEQTEIGEPSYGDVGGQNQLMSLLESLADRLEARARV >gi|316923479|gb|ADCP01000069.1| GENE 47 55556 - 56506 657 316 aa, chain + ## HITS:1 COG:no KEGG:DvMF_0405 NR:ns ## KEGG: DvMF_0405 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 29 302 30 307 321 267 48.0 5e-70 MSLQSSFITTALIGSLVFIGGCSTFDSTWKSTKAFYGEYINPPAQIDYDDKGVLNDAETM LASRMVGIDIQLEQLERYLQNSDKPPTGESVAVLFHRFPWLSGLAAVDANGMVLAQEPPA AMKELDFAKMLEQKARGNELRGLRGLVEDTPLGPEVLTGIPIYSGSEMLGLLVAHFDMRS LLTYTSGAEDLVVLAPQGVLWPGRFEVDATPLTGQDWSELTKDSTHGTVSNKTGSFIWMV RFLGTQPIVFATPSEGHFPEQPGQLDALSHPAAFSSQGMMAPVTESHVIEGGDSSILTAP LPPIKGLGMEESAISD >gi|316923479|gb|ADCP01000069.1| GENE 48 56599 - 57603 826 334 aa, chain + ## HITS:1 COG:VC2544 KEGG:ns NR:ns ## COG: VC2544 COG0158 # Protein_GI_number: 15642539 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphatase # Organism: Vibrio cholerae # 1 328 1 327 336 319 49.0 4e-87 MSSEITVTEHLLLQQQRAPQATGHFTALFHDLVLSAKIIARSVNKAGLLDVLGGTGDINV QGENVQKLDDFANRVLLYRMERSGVLCAIASEENAELVRVSAAFPRGDYMLVFDPLDGSS NIDVNINVGTIFSILRRKPGRTGEVQLDEILQPGSEQVAAGYFLYGTSTMLVYTTGQGVH GFTLDPSVGEFLLSHPDIRIPETGSIYSVNEGNWNHWDDKARAAIDYFKHPDDPNATPCS ARYVGSLVADFHRTLLYGGVYMYPPDSRKGKSRGKLRYLCEASPLAFVAENAGGAATDGH NRILDITPTTLHERVPLFIGSRKDVEAINAIYKK >gi|316923479|gb|ADCP01000069.1| GENE 49 57740 - 58828 606 362 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229879751|ref|ZP_04499249.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Slackia heliotrinireducens DSM 20476] # 2 322 439 752 781 238 44 1e-61 
MLCLGIETSCDETALALVRDGKCVESVLATQMDVHALFGGVVPEIASREHYRFLGALYDE LMRRTGLTLADVDVVAVTRGPGLLGALLVGVAFAKGLSLAGNKPLIAVNHLHAHLLAVGL EHELLYPLLGLLVSGGHTHIYRVDAPESLELLGHTLDDAAGEACDKFAKMLGLPYPGGVL LDRLAQRGAADAHLFPRPYTHVDNLDFSFSGLKTAALTYLNEHPVLVEEGRRVREAGTDS ASPALCDVCASYMLAVAETLSLKMEKAFHRERQKGITGIVVAGGVAANTQVRAAMHVLAE KVGKPLLLPSRFLCTDNAVMIAHAGELLACKGYGQGLAFPAIPRGQKLPDDLGELRAPLH SL >gi|316923479|gb|ADCP01000069.1| GENE 50 58988 - 59311 260 107 aa, chain + ## HITS:1 COG:slr0623 KEGG:ns NR:ns ## COG: slr0623 COG0526 # Protein_GI_number: 16331825 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Synechocystis # 4 103 6 105 107 133 58.0 1e-31 MATQVNDSNFDAVVLQSQLPVLVDFWAPWCGPCRAIGPIIDELANEYEGKLSVVKLNVDE SPSTPGKYGIRAIPTLILFKNGEVVEQVTGAVSKSSIATMIEQKALA >gi|316923479|gb|ADCP01000069.1| GENE 51 59308 - 60234 533 308 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 4 306 2 306 306 209 37 3e-53 MKHYDAIVVGGGPAGITAALYLCRSGISVAQIEMLAPGGQILKTESIENYPGFPKGIKGW EMADAFAAHLDDYELDRYNDAVLKMEQVPGGWSFSVGKETIVGKAVVVCSGANPRPLGVP RETQLTGRGVSYCALCDGNFFRDQVVAVVGGGNSALEEALYLSRIASKLYLIHRREGFRA AKVYQDKIRAASDKIELVLDTVVTGLMGEDSLQGLHLKNVKTGEETQLPVDGMFVFVGYE PQNSFLPAGLELDPQGFIITDCEMRTNLPGLFAAGDIRSKMCRQVTTAVGDGATAATAAF TYLEQLDA >gi|316923479|gb|ADCP01000069.1| GENE 52 60237 - 60959 353 240 aa, chain + ## HITS:1 COG:RSc1627 KEGG:ns NR:ns ## COG: RSc1627 COG4105 # Protein_GI_number: 17546346 # Func_class: R General function prediction only # Function: DNA uptake lipoprotein # Organism: Ralstonia solanacearum # 7 202 30 237 289 86 30.0 4e-17 MYFRHLVLAACLLLLSGCGIIDYFFLPPPEDTAQELYEGANDAMQEKNYSQAAQYYTKLK DNFPFSPYTVEAELSLGDAFFLDGKYPEAAEAYKEFESLHPRHEAIPYVLYQVGMSNLKS FISVDRPTTSTQEALEFFGRLRETYPNSEYAQKSVEEMKNCRRLLAEHELYLGDVFWNMN NYGPAWRRYTYIVDNFPDVPEVSAHAKEKALSAYYRYREQQSQKAREQIQGSWKRWFDWL 
>gi|316923479|gb|ADCP01000069.1| GENE 53 61140 - 61568 563 142 aa, chain - ## HITS:1 COG:MTH1854 KEGG:ns NR:ns ## COG: MTH1854 COG4747 # Protein_GI_number: 15679842 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Methanothermobacter thermautotrophicus # 1 140 1 140 143 133 50.0 8e-32 MKVEQLSIFLENKAGRLAQVTKTLAEAGINIRALSLADTSDFGILRLIVNDTEKAINIMK EAGFTVGRTAVVAVEVDDKPGGLNNILEALSGQNVNVEYMYAFVQEGGGSATMIFRFDRI DQAIEVLKAKNIPIIPADRING >gi|316923479|gb|ADCP01000069.1| GENE 54 61689 - 61889 284 66 aa, chain - ## HITS:1 COG:slr1046 KEGG:ns NR:ns ## COG: slr1046 COG1826 # Protein_GI_number: 16329622 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway components # Organism: Synechocystis # 1 45 44 88 126 58 60.0 4e-09 MYGIGMQELLLILVIVVLIFGSKKLPEIGGGLGKAIRNFKRASSEPDEIDITPKDEKKKD DTTHSA >gi|316923479|gb|ADCP01000069.1| GENE 55 62144 - 63040 325 298 aa, chain + ## HITS:1 COG:no KEGG:DVU1370 NR:ns ## KEGG: DVU1370 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 291 1 285 290 123 35.0 1e-26 MQKESDDSIVYVPTMHREFFPEGFSGCVKPLWPGLSSRYWGQRNPDAWHKISLPYTPTEA AACLAELTQLDEAGIAALSDTVASSRSTAQKVSQERDDLKCFAKTGECREAGKDTVSTED MRRWAQRFLLLGWLQEERVLEMEQLSARYRAGAEKLAAHLGTRDTEREPADEDAEMLSGL LGMMRDLVPEDPATLLPSWCFILDLLAVLLPEGTIACTADQRMAKAFAEAGICQESLSPA LLARLPEGWHAPEGYAVTYGEEPMWKLIGKKAPQSDRPWLDRRQLVILCTADDVLERA >gi|316923479|gb|ADCP01000069.1| GENE 56 63040 - 63726 388 228 aa, chain + ## HITS:1 COG:alr4944 KEGG:ns NR:ns ## COG: alr4944 COG0546 # Protein_GI_number: 17232436 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Nostoc sp. 
PCC 7120 # 23 214 6 199 212 67 26.0 2e-11 MQSSVTTLASLLERTFPAGLSGLIFDCDGVMVDSRDVNIGYYNLLLREVGKPPITPEQAG YVQMSTAKEALEYIFSPEELKLLPAIAERYPYRDVALPQLELEPGLADMLHWLRERGVRL GIHTNRGSGMWDLLARFNLCEMFDPVMTAEIVPSKPDPAGVHRILETWKFGPESVGFIGD SATDAAAAQGGGVPLLAYNSPDLPAAVHVDNFAHLRSALERLPRLGRT >gi|316923479|gb|ADCP01000069.1| GENE 57 63764 - 64063 399 99 aa, chain + ## HITS:1 COG:no KEGG:LI0363 NR:ns ## KEGG: LI0363 # Name: not_defined # Def: integral membrane protein # Organism: L.intracellularis # Pathway: not_defined # 1 99 1 99 101 116 72.0 3e-25 MFVLGNILFAVARVLDTLLTLYFWVVIISALLTWVRPDPYNPVVRTLNALTEPVLYRIRK WLPFTYISGLDLSPIVVLVAIQLVQAIVVRSLFQYAAML >gi|316923479|gb|ADCP01000069.1| GENE 58 64246 - 64479 371 77 aa, chain + ## HITS:1 COG:no KEGG:Ddes_1902 NR:ns ## KEGG: Ddes_1902 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 75 1 75 76 79 77.0 3e-14 MEQRELLLLEKYAAVNPELKELWEDHILYEKQVEKLEAKAYRTPTEEQTLKQLKKQKLEG KTRLHAILDGYNEQEGN >gi|316923479|gb|ADCP01000069.1| GENE 59 64479 - 66170 1338 563 aa, chain + ## HITS:1 COG:MA3792 KEGG:ns NR:ns ## COG: MA3792 COG0028 # Protein_GI_number: 20092588 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Methanosarcina acetivorans str.C2A # 5 561 11 559 564 644 56.0 0 MLCTGARILLESLKREGVDLFFGYPGGAVIDIYDELSNHPDLRHILVRHEQGAVHAADGY ARACGKVGVCLATSGPGATNTVTGIATAYSDSIPMVIFTGQVSTPLIGNDAFQEVDIIGI TRPCTKHNFLVKDVKDLAETVRKAFYLARSGRPGPVLVDIPKNIQQASCEFVWPEDVVMR SYCPTFRANLNQLRRAMDVLVTSRKPLIVAGGGVIMANAAEELCTLAHLMHAPVCSTLMG LGAFPGDDPLWIGMLGMHGTFAANRAVTQADVILAVGARFDDRVTGKLSQFAPNARIVHI DIDPTTLRKNVRVEVPVVSDALSALQGLRSILESGHAQKDWAGDHADWMGQVLEWQKEHP LSWAKGDVLKPQQVVERMYEITKGEAIITTEVGQNQMWAAQFYKYHKPRTYLTSGGLGTM GYGLPAAIGAQMAFPDRLVVDVAGDGSIQMNIQELMTAVENGLPVKILILNNRHLGMVRQ 
WQELFYDANYVATDMKGQPDFVKLADAYGAEGYRITTEEELEELLPKALASPRTAVIDVL VDREENVSPIVPAGASLDEMLIV >gi|316923479|gb|ADCP01000069.1| GENE 60 66214 - 66705 488 163 aa, chain + ## HITS:1 COG:alr4627 KEGG:ns NR:ns ## COG: alr4627 COG0440 # Protein_GI_number: 17232119 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Nostoc sp. PCC 7120 # 1 156 10 165 182 150 56.0 1e-36 MRRVLSVLVENEPGVLSRVAGLFSGRGFNIESLNVAPTLEDGVSLMTITTSGDEQIIEQI VKQLRKLVTVVKVVDFYGMAYVEREMMVIKVHAEESKRGEVLRIADIFRCKVVDVSLTDL TLEATGDHSKLQAIIQLLQKFGIKELARTGTLAVRRTMQSDDI >gi|316923479|gb|ADCP01000069.1| GENE 61 66911 - 67090 78 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRMALAAVTDNGNGLVFEKIQVRVLFIEGFHVLSLTGNSKHSENTAQRGPSQNHLRACA >gi|316923479|gb|ADCP01000069.1| GENE 62 66998 - 67987 1314 329 aa, chain + ## HITS:1 COG:sll1363 KEGG:ns NR:ns ## COG: sll1363 COG0059 # Protein_GI_number: 16332126 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Synechocystis # 2 328 39 365 367 419 65.0 1e-117 MKAFYEQDANLDFLKNKTIAVIGYGSQGHAHAQNLRDSGLNVIVGQRSGGANYALAKEHG FAPMSAAEAAAAADVIMILLPDQHQARVYREDIAPNLKPGKSLVFAHGFNIHFCQIEPPK DVDVFMVAPKGPGHLVRRTFTEGGGVPCVFAVHQDATGEATQKALAYAKGIGGTRSGVIE TTFKEETETDLFGEQAVLCGGVTELIKAGFETLVAAGYQPEIAYFECCHEMKLIVDLIYE GGFGKMRHSISDTAEFGDYRTGRRIITDETRKAMKQVLSEIQDGTFARDFILECGCGYPS FKAKRRIENDHQIEQVGGKLRALMPWLKK >gi|316923479|gb|ADCP01000069.1| GENE 63 68057 - 68275 110 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPLPMPEEEELHRCPHGERCQALMLCSLDCATSGAHPDRKTKADGRFIVTPWWCIHKCKW FSEHKNEAPKKS >gi|316923479|gb|ADCP01000069.1| GENE 64 68589 - 68894 188 101 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MITLSEAVLDELKRLRGRTERTFRIEPGCSCCHSPTLELILDEPEDDDVTTEVSGFTFCI RKRMLDQLGNVRIEADHAEGYRIFSERSLSSFVKNGRLCAS >gi|316923479|gb|ADCP01000069.1| GENE 65 69061 - 69273 61 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MRTYSQSGCKHPRLYEHYTLSLLTFTYIPHEYSAFYLTTLRKVHGMRFPACVGSGKASAS SREWKAPPLI >gi|316923479|gb|ADCP01000069.1| GENE 66 69495 - 70880 1733 461 aa, chain - ## HITS:1 COG:SA0645 KEGG:ns NR:ns ## COG: SA0645 COG0471 # Protein_GI_number: 15926367 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Staphylococcus aureus N315 # 73 461 122 517 517 62 21.0 2e-09 MNPVQALKWTMTLLIPCGVYFLSPQDVSPHFPLYMGLTSAAIIAWALNVFPAIGVAAMLT FAYILCGVADAEVVFGPWATVLPWLSFAAVIIGEGMEKTGLAKRLALRCLKLTGGSFSGL VAGFFLAGLVLVVILPSILARVVIFCAIAVGIIQALQLEAKSRMSSTIVMMAFFAAAAPQ FMFLHSSESFIWAFGMMLKGTDKTVNFWDYAYQATFINLIYYALSMATVYIVKGRETLVS GAHFSTFIEQSCREMGPIQPREIKLLVLVFGIILGFMLEPLTHIDPVYVCCVLALLSYLP GINVLDRESFSNLNIVFLVFITGCMAIGFVGGSVGANKWAVASIVPLLQGWGETMSVVCA YAAGVVINFLLTPLAATAAFTPAFGELGTAMNVNPLPLFYAFNFGLDQYIFPYEAVYFLY IFITERVLLRHIVTALAIRMLIVGIFVVVLAVPYWNGIGLM >gi|316923479|gb|ADCP01000069.1| GENE 67 71043 - 72719 1919 558 aa, chain - ## HITS:1 COG:PA2298 KEGG:ns NR:ns ## COG: PA2298 COG1053 # Protein_GI_number: 15597494 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Pseudomonas aeruginosa # 9 528 7 520 574 143 27.0 9e-34 MIPIKERLESDVLIVGGGIAGLMAAIAAADKGASVTLLDKANTKRSGAGATGNDHFLCYI PEKHGTINVVYEEFMDSQNATSNDTPLVMRFLETSPTIVNMWNDWGINMKPTGDFHFEGH AYPGRPKIFLKYDGHNQKKVLTEQAKKRGVRIVNHSPVLELLRDDNGITGALALDISSDE PAYRIVKAKAVVVATGYVSRLFTTDPTPSQMFNTNMCPSGTGDSIAQAWRLGAHLVNMEK TYRHAGPRFLARCGKATWIGVYRYPNGKPIGPFITKPNVETGDMTSDIWASAFTDLKLNG TGPAYMDCSGASPEDLEHMRWAMRCEGLTALLDYMDKEGIDPGRHAVEFGQYEANLCTNG IEIDINGESNIRGLFAAGDMMGNISGDIGAAAVYGWIAGHHAGDYKTLGQPADVDGAPFC QERMAFFSGLYERSGGASWQEANFALQQIMTEYADCGPHRLRSDTLLTAGIKYIGDLRKK INKEMSVSDAHELMRAAEVLNLLDLGEALMIAARERKESRGRHLRADFTFTNPLLRDKFL RVWQEDGTPRTAWRDRLK >gi|316923479|gb|ADCP01000069.1| GENE 68 72764 - 73006 277 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621852|ref|ZP_06114787.1| ## NR: gi|266621852|ref|ZP_06114787.1| ferredoxin, 4Fe-4S 
[Clostridium hathewayi DSM 13479] # 1 69 1 68 83 66 55.0 5e-10 MPPVIDRTKCIGCGMCADVCPLQVFRHDPKKDKIPEVRRPYECWHCNACVLDCRAKAIEL RLPLTHMLLYVDSDSLKPKE >gi|316923479|gb|ADCP01000069.1| GENE 69 73229 - 73939 800 236 aa, chain - ## HITS:1 COG:no KEGG:Dde_0463 NR:ns ## KEGG: Dde_0463 # Name: not_defined # Def: Crp/FNR family transcriptional regulator # Organism: D.desulfuricans # Pathway: not_defined # 30 235 12 220 221 95 28.0 1e-18 MAKHDSIPVSSSKLPLFAPFAVPKIGNMATAWHEILHLGTKQRHSAGSIVKVEGETCFDL FYIDEGKVHVVFDTIDGRMRSVVSFEPGSIFNLAPAATRCEASGQYQCMTDAVIWQIPGK VLHDPVFAARYPNLMLSVIELLGTLVLTYHTYLTDMLMDDFVTRFSRFLISLSLERGSDE FPLGMTQEQLASVFGVHRATLARAIQHLKQEAIIACFTCRRVEIIDMERLRHMAHL >gi|316923479|gb|ADCP01000069.1| GENE 70 74260 - 74634 369 124 aa, chain + ## HITS:1 COG:ECs3906 KEGG:ns NR:ns ## COG: ECs3906 COG3111 # Protein_GI_number: 15833160 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 123 1 128 130 75 38.0 3e-14 MRYIMALLLSLALAVPAFAGFEGPNASTGGFAGPGPQTITKAAQVQNALDDTPCTLEGNI LERLSKNKYVFQDDSGKITVEIGQRIFGSNRVTPETRVRLTGEVDHKKQGRANEVDVRYL EIVK >gi|316923479|gb|ADCP01000069.1| GENE 71 74726 - 75418 437 230 aa, chain + ## HITS:1 COG:XF2336 KEGG:ns NR:ns ## COG: XF2336 COG0745 # Protein_GI_number: 15838927 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Xylella fastidiosa 9a5c # 9 230 16 237 241 221 50.0 8e-58 MEAGRIRALIIEDNQDITANLYAFLEPLGYELDCSANGHIGLEMAVAGTFDVIVLDIMLP GMDGLAVCRALREEHVDPTPILMLTARDTVADRVTGLDCGADDYLVKPFSMKELDARLRA LVRRARGRQTRSVLRWEDIELDPAAHSATRSGVPLRLSPTGFVILETLLRAAPAVVPRTE LEQAIWGDFPPDSDALRTHIHELRQKLDKPFTCPLLKTIPHVGYTLVRHE >gi|316923479|gb|ADCP01000069.1| GENE 72 75573 - 76793 663 406 aa, chain + ## HITS:1 COG:XF2535 KEGG:ns NR:ns ## COG: XF2535 COG0642 # Protein_GI_number: 15839124 # Func_class: T Signal transduction mechanisms # Function: Signal transduction 
histidine kinase # Organism: Xylella fastidiosa 9a5c # 139 389 102 356 373 119 32.0 9e-27 MTLVTGGIFSLSAYLSYDYAITHVIRWHMEPIMRLLIAAEEGHTLRKGERLSVDSELLAK ELKVKWYVGEAIPDDLRPERSKVRELVRIKRDRYAMTYRDKEGQEYAIVGKIKDLDDLEE VMADIALACVVGSLIGAGLLAYWLSRRLVSPLVELTGKVNRGEALDDTLLFSRDDEVGDL ARAFAEREQALQEFLNREQLFTGDVSHELRTPLTVLQGGVEILESRLSGDDKLLPIVGRM QRTVASMTAMVGTMLLLARKPEQLEFRPFDLCVLARQEEVEIRERLRGRPVTFSSLLPDR LTVLGSPELAAMVLHNLLDNACRYTEQGRILLEFRADEMLLTDTAPVIDPDVRTRMFERG VRGTSKSPGSGLGLSLVLRGCERLGWKVAHEHWEGGNRFRVRFSQG >gi|316923479|gb|ADCP01000069.1| GENE 73 77028 - 77810 472 260 aa, chain - ## HITS:1 COG:MA1347 KEGG:ns NR:ns ## COG: MA1347 COG0500 # Protein_GI_number: 20090208 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 33 149 47 178 251 61 28.0 2e-09 MWNAETAQFLHDWYATPEGTYAITQENRLFQHLISQWPRRGHTLLDIGCGAGIFLEMLWH YGFDVTGLDTGTDLLDMARERLGNRAEFQLGRPEHLPFDDEEFDYAALLTVLEYVDNPED VLREAIRVSHRGVIIGFMNSFSLYQIQRRLHRPTLEYRHRRHNLNFWSLARMVRRIRPKA TLSFRSVLLGPPNTWKKEGLWGKINSLQTPLPIGAYLGLCIDTMPRVPLTPLLLRAKEKA FKVYTGLQPEASREGAPSRP >gi|316923479|gb|ADCP01000069.1| GENE 74 77876 - 78526 349 216 aa, chain - ## HITS:1 COG:no KEGG:DVU1787 NR:ns ## KEGG: DVU1787 # Name: not_defined # Def: nuclease domain-containing protein # Organism: D.vulgaris # Pathway: not_defined # 33 214 20 199 208 171 46.0 1e-41 MLSAVLVTAFFVFQTAIAHADGLFSATVTQCTDGDTLVLDTGQRVRLAGVDTPEKGSKDT PPQYYAREAARFTCERTRKQRVKVIPLPGASRDRYQRLVAEIILPDGRSLNEQLLQQGMA SFYAHKNLPSQLVRRLTAAQKDALDKRAGCWGFILTRPQAQEPYIGNRNSKRFFSKACLR TANISKKNQIRFSDLEEAFRSGYAPARPCGIWPSAE >gi|316923479|gb|ADCP01000069.1| GENE 75 78544 - 78831 250 95 aa, chain - ## HITS:1 COG:no KEGG:LI0344 NR:ns ## KEGG: LI0344 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 95 1 95 95 134 74.0 1e-30 MSSTPDEITIEYEESGQILIKELDKVILSKGAWTTILFRYQELDAETGEYGPDKYAIRRY 
QKSGGEYRQKSKFNISSAEQARKIVDALSGWLADA >gi|316923479|gb|ADCP01000069.1| GENE 76 79093 - 80775 1904 560 aa, chain + ## HITS:1 COG:RSc2913 KEGG:ns NR:ns ## COG: RSc2913 COG0488 # Protein_GI_number: 17547632 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Ralstonia solanacearum # 9 560 5 555 555 665 59.0 0 MSNEPDKIIYSMIRVSKRHGQREVLKDISLSYFYGAKIGVLGLNGSGKSSLLKILAGVDQ SFEGKTVLAPGYTIGYLEQEPLVNEMRTVREVVEEGAQELVNLVKEFEEINAKFAEPMEP EEMDKLIERQGQVQELMDAKGAWDLDSKLEMAMDALRCPPGDTPVSVISGGEKRRVALCR LLLQNPDILLLDEPTNHLDAESVAWLERYLRNFPGTVIAVTHDRYFLDNVAGWILELDRG RGIPWKGNYSSWLEQKEKRLAQEDKAEDERRKTLSRELEWIRMSPKGRHAKGKARINAYE AMLSHESEKRAPELEIYIPAGPRLGKKVIEAVGISKALGDKELMEDVNFIIPAGAIVGII GPNGAGKTTLFKMLVGQEKPDAGTLTIGDTVQFAYVDQQRESLTPGKTVYELISDGAETI KLGNREINARAYCSRFNFTGSDQQKKVDVLSGGERNRVHTARMLKSGANVILLDEPTNDI DVNTMRALEDGLENFAGCVLVVSHDRWFLDRIATHIMAFEGDSSVVFFDGNYSEYEEDRK KRLGKDADQPHRLKFRKLTR >gi|316923479|gb|ADCP01000069.1| GENE 77 80865 - 81812 1097 315 aa, chain - ## HITS:1 COG:CAC3575 KEGG:ns NR:ns ## COG: CAC3575 COG0331 # Protein_GI_number: 15896809 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 7 290 5 288 308 181 38.0 2e-45 MSTVPTAVLFPGQGSQEPGMGRDVAEASKEAMELWKKAEQISGLPLRAVYWESDDAALMA DTKHLQPALTVVNVTLWQALSGKLSPACAAGHSLGEYSALAAAGSLSPESTLELVSLRGK LMADADPDGKGGMAAILKLNREAVNEIAKAAAEATGEILIVANHNTPAQFVISGTRAAVE AALPLVKEKKGRAVPLPVSGAFHSPLMDTAAQELAKALNKMTWSRPRFPVYSNVTGKAVT DGESLRELATVQMISSVLWIDTIANQWHDGIRSWVEVGPKGTLSRMVKPILDAVPVEEEV AITAVGSLEGVNAFA >gi|316923479|gb|ADCP01000069.1| GENE 78 81821 - 82267 369 148 aa, chain - ## HITS:1 COG:no KEGG:DvMF_0187 NR:ns ## KEGG: DvMF_0187 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 12 140 44 173 176 119 50.0 4e-26 MALFFRLGLAGLLLLLTVLPASAETLYNQPPFSEKELNQFIADLPRFRAWIKTNKEKAHP 
IVNEAGEPDFLYSKNAAGYIEAAGWKPERFFCIMGRAAAAVAIIQQGDAITKEPPIEMPN VSDDELDVVRRNLPGLLKAISPAPTPKK >gi|316923479|gb|ADCP01000069.1| GENE 79 82357 - 83349 1123 330 aa, chain - ## HITS:1 COG:Cgl0990 KEGG:ns NR:ns ## COG: Cgl0990 COG1494 # Protein_GI_number: 19552240 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins # Organism: Corynebacterium glutamicum # 5 329 7 331 335 326 54.0 4e-89 MSLMEAPERNLAFDLVRVTEAAALASARWLGRGDKNAGDGAAVDAMRLSFNSLPIAGKVV IGEGEKDEAPMLYNGEIIGMGNGPSLDIAVDPVEGTNLLAYGRPNAIAVVGAAPGGSMYN PGPSFYMQKLVVGQAARDVIDLDAPVHVNLRLIAKALGKDTNDLVVFVLDKPRHQKLIQE IRAVGARIQLHTDGDVAGALMAVDQRSEVDVMMGTGGTPEGVLAACAIKGAGGAMLARLD PQSEAEKAAIQEAGIDLAQILDADTLVQGDDVFFAATGISGGTFLRGVQFSGQGAVTHSM IIRSKTGTMRYIESHHNWNKLMQISSIKYD >gi|316923479|gb|ADCP01000069.1| GENE 80 83522 - 86065 1102 847 aa, chain + ## HITS:1 COG:PA3961 KEGG:ns NR:ns ## COG: PA3961 COG1643 # Protein_GI_number: 15599156 # Func_class: L Replication, recombination and repair # Function: HrpA-like helicases # Organism: Pseudomonas aeruginosa # 9 836 4 831 838 629 48.0 1e-179 MFNHTTEVLPIEALLPDLLETLGSGTRCILEAAPGAGKTTRVPLALLRAAWLDGKGILML EPRRMAARNAARYMASLLGEKVGQTVGYRVRLDSRVSASTRITVVTEGVLTRMLQDDPEL QGVGCLIFDEFHERSLNADLGLALALDCQQGLRDDLRLLIMSATLDSGPLLKLLQQDFSG AVPVLRSEGRAWPVETHHLPLPRADIPMERWVADAVRRALREENGSILVFLPGTGEIRKV EALLQDVRSDTVHLCPLYGDLPAAEQDAAIAPVWAPVRKIVLATSIAETSLTIEGVRVVI DSGLARTVRFDAGTGMSRLVTERVSLASAEQRRGRAGRTEPGVCYRLWHQGDEAGMAAHI RPEILDADMAPLCLELAVWGITAPSSLHWLDEPPVEAVNQAIPLLRSLEAVDAAGRVTAR GRVMARLPLHPRLAHMVLAAKAFGLGKAACLLAAMADERDPLRLRDVDIRPRLALLGHGQ GSPAVFRIREAAHQIGRLVDAAGEPPHRTPMSIPEEEQAGVLLALAYPDRLAQKRSRGSY RMANGRGGFLDDTDSLADEPVLAVGAVNGGSGNVRIWQAAPLSQETVAALYQTQFTKNED VVWDSREQAVAARKRVSFGAFIIEETPLRNGDASLADRMAQAVLSGIRELGMSCLPWTEE LYQWRLRVELLRAVDKGGDWPDVSEEALLTGLEDWLVPFLQGISRRSQFSKIDLAAALHA LLPWRKQRELDAMAPTHMEVPSGSFIRLLYEASAEGIVPVLAVKLQEMFGMQESPAIAEG KVPVLVHLLSPAGRPLQITRDLKGFWKNGYPAVRAEMRGRYPKHPWPEEPFTTLPTRLTN RRLQNRS 
>gi|316923479|gb|ADCP01000069.1| GENE 81 86245 - 87900 1345 551 aa, chain + ## HITS:1 COG:BH3808 KEGG:ns NR:ns ## COG: BH3808 COG0018 # Protein_GI_number: 15616370 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Bacillus halodurans # 25 551 32 556 556 488 47.0 1e-137 MRAEQHLEASFRAVLKNMNLPWPDKAVIEPSKDPKHGDLASNLALVSAKAAGMPPRELAA KMAEALRNEDASLASVEVAGPGFLNVTFSPEFWRETIRRIDDAADRFGQATVGKGCKVQV EYVSANPTGPLHIGHGRGAAVGDTLARILRFAGYDVTTEYYINDAGRQMRLLGLSIWLRA KELAGIAVEWPEDYYRGSYIIDIAREMMDKDPGLTSLSDAEGQDRCFAYGMQTILDGIKE DLREFRVEHQVWFSERSLVERGAVEETFERLKKAGLSFEKDGALWFRTTDFGDDKDRVLR KSDGSLTYFASDIAYHDDKYRRGFDRVVDVWGADHHGYVPRMKAAVQAVGRKAEDLDVVL IQLVNLLENGTQVAMSTRAGQFETLADVVKEVGVDAARFMFLSRKSDSHLDFDLALVKQR SMDNPVYYVQYAHARVRSVLRKAAEAGLTLPEKTDLALLAPLSEAEDLALLHFLDRFEDV AKGAAQALAPHHISYYLMELAGMLHSYYAKHPVLQAGDPELALARLALLRSVGQVVKNGL ELLGVSAPESM >gi|316923479|gb|ADCP01000069.1| GENE 82 88096 - 88689 451 197 aa, chain + ## HITS:1 COG:no KEGG:LI0230 NR:ns ## KEGG: LI0230 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 193 29 253 258 116 31.0 6e-25 MSFSGLITSGIITIIAIGWIFAFGVIVGRGYNPEKKMPELARLLPPPEGQDAPKEAKGIL KPEELTFMTDLKQTQPAANAAPNKPEAKAAHPTAQPVAAATPKPEAASADKAKYDFVFQA VAYKSKDSADKLRERMEGEGLRTRMTIEKDNKGRPKWFRVQVLVRGTDADASAAKQVLVK MGLKDATQVSKKPVRGR >gi|316923479|gb|ADCP01000069.1| GENE 83 88694 - 89491 584 265 aa, chain + ## HITS:1 COG:RSc2961 KEGG:ns NR:ns ## COG: RSc2961 COG0767 # Protein_GI_number: 17547680 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, permease component # Organism: Ralstonia solanacearum # 26 264 23 254 255 164 43.0 2e-40 MDILSRFFAVPGALFLRWIDALGDTFLFLLEGLRQIFIAPKLFIKVLQQLYVIGTKSLFV ICLIALFTGMVLGLQGYYVLVKFGSVGFLGAAVSLTLIRELGPVLTAIMVTGRAGSSMAA EIGVMRITDQIDALEVMDIPGMGYLVAPRFVASLIAFPLLTAIFDVVGIIGGYLTGVLLL 
GVNEGAYFHGIESSVLMPDVTEGFIKSFVFALLIALICCYQGYNAHRRRDGMGPEAVANA TTSAVVISCVFVLVADYVVTSAMLR >gi|316923479|gb|ADCP01000069.1| GENE 84 89501 - 90331 283 276 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 226 14 241 329 113 30 3e-24 MKDQHSSWDIRLEHLRVGYGSRVVLDDINAVLPGGKITVILGESGCGKSTLLRHVIGLSR PMQGHILYGGQDLFALPERQFRRVRRRFGVLFQDGALLGSLTLAENVGLPLHEHTRLPKA TIRSTVLRILDLVGLAEFADYYPNELSGGMKKRGGLARAIVTEPPLLFCDEPTSGLDPIN SAQMDQLLLDMRKAYPEMTLIVVSHDLASVARIAEHVLVLRDGRVVFSGSYEALKASTDE YLRRFMDRKAEEKASRVATTATDPDVRAALDAWLEG >gi|316923479|gb|ADCP01000069.1| GENE 85 90364 - 90807 519 147 aa, chain + ## HITS:1 COG:RSc2960 KEGG:ns NR:ns ## COG: RSc2960 COG1463 # Protein_GI_number: 17547679 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, periplasmic component # Organism: Ralstonia solanacearum # 9 146 10 148 173 104 41.0 6e-23 MSFSKEAAVGFFVMLGLVCVAYLTVKLGRMEVFHSEGYVLTASFNSVSGLRPGAEVEIAG VRVGRVKSIRLDDKQPRAVVELQLGDNVHLTDDVIASVKTSGLIGDKYISLEPGGSGEPL KNGDEITDTESAVDIESLISKYVFGKV >gi|316923479|gb|ADCP01000069.1| GENE 86 90810 - 91439 498 209 aa, chain + ## HITS:1 COG:NMB1963 KEGG:ns NR:ns ## COG: NMB1963 COG2854 # Protein_GI_number: 15677793 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, auxiliary component # Organism: Neisseria meningitidis MC58 # 24 198 23 192 196 69 27.0 4e-12 MMRTIRHMVLALCSVFILCGTVHAAESPLAVLQTQIDQILNVLKEPNYKDPIKRVPLRAQ IEKYVHEIFDFSEFSARTVGRNWPSFSDAQKERFDKAFANLLLITYLDKIQGYNGEKIEY SGEVLSTKGDRAEIQTIVTLSDGKPVPVAYRMMLKNGKWVVYDVLIENVSLIKNYRSQFQ DVLTRGTPEQLIERVEARARELQAQSTVN >gi|316923479|gb|ADCP01000069.1| GENE 87 91451 - 92305 409 284 aa, chain + ## HITS:1 COG:RSp0916 KEGG:ns NR:ns ## COG: RSp0916 COG2853 # Protein_GI_number: 17549137 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: Surface lipoprotein # Organism: Ralstonia solanacearum # 28 273 68 289 296 126 33.0 5e-29 MTAEQRFFPIWKMCLCAAVVLSLMGCAAKHGDKAVNSPDVQAVETTSSVGAASVADSDSA FEDYDDEESEAIADPLEGWNRFWFGFNDVLLLKVVKPVYTGYTYITPSPVRTGLSNAFHN IQMPIRFLNCVLQGRFGEAWIEVGRFVVNTTAGFGGVFDVAKQSKPLIPVDNRDADFGQT LGVWGFGEGIYLVWPVVGPSTVRDTVGLAGDWTTSAFFWISEPIGPLEFEPALAASLGLR FNDMGTVISTYESLKKSAVEPYIAARDAYVKYRRAGIMGNRFQW Prediction of potential genes in microbial genomes Time: Fri May 13 02:57:31 2011 Seq name: gi|316923394|gb|ADCP01000070.1| Bilophila wadsworthia 3_1_6 cont1.70, whole genome shotgun sequence Length of sequence - 78869 bp Number of predicted genes - 86, with homology - 78 Number of transcription units - 31, operones - 18 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 77 - 1381 617 ## COG0303 Molybdopterin biosynthesis enzyme - Prom 1444 - 1503 2.4 2 2 Op 1 1/0.000 + CDS 1628 - 3664 1666 ## COG0556 Helicase subunit of the DNA excision repair complex 3 2 Op 2 . + CDS 3696 - 5792 1551 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) 4 2 Op 3 . + CDS 5811 - 6602 638 ## COG0289 Dihydrodipicolinate reductase 5 2 Op 4 . + CDS 6602 - 7444 619 ## COG0171 NAD synthase + Term 7458 - 7513 15.9 - Term 7441 - 7501 2.1 6 3 Op 1 11/0.000 - CDS 7525 - 8184 635 ## COG0378 Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase 7 3 Op 2 . - CDS 8196 - 8549 210 ## COG0375 Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) - Prom 8610 - 8669 2.0 8 4 Tu 1 . + CDS 8745 - 9464 667 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold + Term 9487 - 9515 1.3 - Term 9470 - 9509 6.5 9 5 Op 1 . - CDS 9510 - 10136 541 ## COG1280 Putative threonine efflux protein 10 5 Op 2 . 
- CDS 10224 - 11174 663 ## COG1660 Predicted P-loop-containing kinase
- Term 11186 - 11227 13.5
11 6 Op 1 11/0.000 - CDS 11261 - 11797 599 ## PROTEIN SUPPORTED gi|46580040|ref|YP_010848.1| ribosomal subunit interface protein
- Prom 11872 - 11931 1.9
12 6 Op 2 17/0.000 - CDS 11935 - 13413 956 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog
- Prom 13528 - 13587 5.3
- Term 13651 - 13687 2.0
13 6 Op 3 . - CDS 13726 - 14448 260 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16
14 6 Op 4 . - CDS 14448 - 15023 425 ## Dvul_1507 OstA family protein
15 6 Op 5 . - CDS 15085 - 15792 426 ## DVU1626 hypothetical protein
16 6 Op 6 . - CDS 15789 - 16382 537 ## COG1778 Low specificity phosphatase (HAD superfamily)
17 6 Op 7 3/0.000 - CDS 16372 - 17187 521 ## COG2877 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase
18 6 Op 8 . - CDS 17296 - 18948 2001 ## COG0504 CTP synthase (UTP-ammonia lyase)
- Prom 19197 - 19256 4.6
+ Prom 19212 - 19271 2.4
19 7 Op 1 . + CDS 19389 - 20222 766 ## COG0047 Phosphoribosylformylglycinamidine (FGAM) synthase, glutamine amidotransferase domain
20 7 Op 2 7/0.000 + CDS 20259 - 21113 502 ## COG0327 Uncharacterized conserved protein
21 7 Op 3 . + CDS 21149 - 21916 858 ## COG1579 Zn-ribbon protein, possibly nucleic acid-binding
+ Term 21936 - 21975 -0.3
22 8 Tu 1 . + CDS 22389 - 23612 510 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase
+ Prom 23614 - 23673 2.1
23 9 Op 1 . + CDS 23788 - 25611 1245 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II
24 9 Op 2 . + CDS 25608 - 26072 233 ## COG1490 D-Tyr-tRNAtyr deacylase
+ Term 26100 - 26147 10.1
- Term 26088 - 26135 10.1
25 10 Op 1 . - CDS 26163 - 27068 680 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9
26 10 Op 2 . - CDS 27096 - 27770 264 ## COG0438 Glycosyltransferase
- Prom 27870 - 27929 2.1
27 11 Op 1 . - CDS 28195 - 28644 402 ## DvMF_2251 hypothetical protein
28 11 Op 2 . - CDS 28692 - 30344 1614 ## COG0513 Superfamily II DNA and RNA helicases
- Prom 30589 - 30648 4.1
- Term 30591 - 30629 2.6
29 12 Tu 1 . - CDS 30687 - 31562 447 ## COG1243 Histone acetyltransferase
+ Prom 31777 - 31836 5.7
30 13 Tu 1 . + CDS 31944 - 32369 201 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases
31 14 Tu 1 . + CDS 32771 - 33073 170 ## PROTEIN SUPPORTED gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35
+ Term 33256 - 33302 11.2
- Term 33247 - 33287 8.2
32 15 Op 1 . - CDS 33310 - 33903 476 ## COG2755 Lysophospholipase L1 and related esterases
33 15 Op 2 . - CDS 33969 - 34721 305 ## PROTEIN SUPPORTED gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15
34 15 Op 3 . - CDS 34711 - 34884 215 ## DVU3173 hypothetical protein
- Term 35120 - 35161 10.3
35 16 Tu 1 . - CDS 35185 - 36456 1596 ## COG1301 Na+/H+-dicarboxylate symporters
- Prom 36580 - 36639 3.5
+ Prom 36539 - 36598 5.5
36 17 Tu 1 . + CDS 36820 - 37029 174 ##
+ Term 37054 - 37101 12.2
- Term 37046 - 37084 0.1
37 18 Op 1 . - CDS 37132 - 37899 761 ## COG0005 Purine nucleoside phosphorylase
38 18 Op 2 . - CDS 37902 - 38564 464 ## Tery_4650 ribulose-5-phosphate 4-epimerase and related epimerase and aldolases
39 18 Op 3 . - CDS 38554 - 39687 889 ## COG0182 Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family
- Prom 39713 - 39772 2.4
- Term 39869 - 39909 9.6
40 19 Op 1 7/0.000 - CDS 39932 - 40477 624 ## COG0680 Ni,Fe-hydrogenase maturation factor
41 19 Op 2 8/0.000 - CDS 40480 - 41172 969 ## COG1969 Ni,Fe-hydrogenase I cytochrome b subunit
42 19 Op 3 11/0.000 - CDS 41183 - 42979 2084 ## COG0374 Ni,Fe-hydrogenase I large subunit
43 19 Op 4 . - CDS 42982 - 44139 1063 ## COG1740 Ni,Fe-hydrogenase I small subunit
- Prom 44282 - 44341 1.8
- Term 44329 - 44377 11.7
44 20 Op 1 12/0.000 - CDS 44399 - 45748 1509 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis
45 20 Op 2 . - CDS 45772 - 46890 554 ## COG0438 Glycosyltransferase
46 20 Op 3 . - CDS 46963 - 48159 643 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair
47 21 Tu 1 . + CDS 48507 - 50198 1125 ## COG3378 Predicted ATPase
+ Prom 50250 - 50309 4.5
48 22 Op 1 . + CDS 50388 - 51938 945 ## DVU2879 hypothetical protein
49 22 Op 2 . + CDS 51940 - 52701 927 ## EUBREC_2116 hypothetical protein
50 22 Op 3 . + CDS 52698 - 54056 1605 ## gi|302493220|gb|EFL53082.1| hypothetical protein DesfrDRAFT_0130
+ Prom 54058 - 54117 5.2
51 22 Op 4 . + CDS 54151 - 54402 287 ##
52 23 Op 1 . + CDS 54512 - 54862 438 ## gi|302862905|gb|EFL85837.1| putative iron receptor
53 23 Op 2 . + CDS 54872 - 55240 283 ## gi|212702697|ref|ZP_03310825.1| hypothetical protein DESPIG_00725
54 24 Tu 1 . + CDS 55417 - 56172 739 ## DVU0193 hypothetical protein
- Term 56428 - 56456 -0.9
55 25 Tu 1 . - CDS 56510 - 56761 135 ##
56 26 Tu 1 . + CDS 56787 - 57002 226 ##
57 27 Op 1 . + CDS 57184 - 59043 1529 ## COG5525 Bacteriophage tail assembly protein
58 27 Op 2 . + CDS 59043 - 59249 148 ## gi|212702701|ref|ZP_03310829.1| hypothetical protein DESPIG_00729
59 27 Op 3 . + CDS 59272 - 59568 262 ## Bcep18194_B1537 hypothetical protein
60 27 Op 4 . + CDS 59651 - 59965 229 ## COG1396 Predicted transcriptional regulators
+ Term 59987 - 60028 8.8
61 28 Tu 1 . - CDS 59924 - 60085 102 ##
+ Prom 59972 - 60031 6.2
62 29 Op 1 . + CDS 60118 - 60345 120 ## DVU0196 hypothetical protein
63 29 Op 2 . + CDS 60349 - 61200 425 ## DVU2872 lambda family phage portal protein
64 29 Op 3 4/0.000 + CDS 61110 - 61970 981 ## COG5511 Bacteriophage capsid protein
65 29 Op 4 . + CDS 61963 - 63168 1028 ## COG0616 Periplasmic serine proteases (ClpP class)
66 29 Op 5 . + CDS 63178 - 63567 692 ## gi|212702706|ref|ZP_03310834.1| hypothetical protein DESPIG_00734
67 29 Op 6 . + CDS 63571 - 64584 1221 ## MCA2923 hypothetical protein
68 29 Op 7 . + CDS 64587 - 65219 756 ## DvMF_2867 protein of unknown function DUF847
69 29 Op 8 . + CDS 65219 - 65434 349 ##
70 30 Op 1 . + CDS 65727 - 66029 363 ##
71 30 Op 2 . + CDS 66026 - 66346 453 ## gi|212702712|ref|ZP_03310840.1| hypothetical protein DESPIG_00740
72 30 Op 3 . + CDS 66343 - 67011 537 ## gi|212702713|ref|ZP_03310841.1| hypothetical protein DESPIG_00741
73 30 Op 4 . + CDS 67029 - 67505 480 ## gi|212702714|ref|ZP_03310842.1| hypothetical protein DESPIG_00742
74 30 Op 5 . + CDS 67498 - 67839 342 ## gi|212702715|ref|ZP_03310843.1| hypothetical protein DESPIG_00743
75 30 Op 6 . + CDS 67843 - 69315 1919 ## COG3497 Phage tail sheath protein FI
76 30 Op 7 . + CDS 69326 - 69850 669 ## Spro_4913 major tail tube protein
77 30 Op 8 . + CDS 69862 - 70203 414 ##
78 31 Op 1 . + CDS 70309 - 73467 2400 ## COG5283 Phage-related tail protein
79 31 Op 2 . + CDS 73457 - 73666 270 ## gi|212702721|ref|ZP_03310849.1| hypothetical protein DESPIG_00749
80 31 Op 3 1/0.000 + CDS 73666 - 74703 1185 ## COG3500 Phage protein D
81 31 Op 4 . + CDS 74715 - 75353 595 ## COG4540 Phage P2 baseplate assembly protein gpV
82 31 Op 5 . + CDS 75350 - 75742 570 ## gi|212702725|ref|ZP_03310853.1| hypothetical protein DESPIG_00753
83 31 Op 6 . + CDS 75742 - 76083 433 ## Amico_1105 gpW/GP25 family protein
84 31 Op 7 4/0.000 + CDS 76070 - 77200 1147 ## COG3948 Phage-related baseplate assembly protein
85 31 Op 8 3/0.000 + CDS 77193 - 77933 843 ## COG4385 Bacteriophage P2-related tail formation protein
86 31 Op 9 .
+ CDS 77930 - 78869 978 ## COG5301 Phage-related tail fibre protein Predicted protein(s) >gi|316923394|gb|ADCP01000070.1| GENE 1 77 - 1381 617 434 aa, chain - ## HITS:1 COG:CAC2021 KEGG:ns NR:ns ## COG: CAC2021 COG0303 # Protein_GI_number: 15895291 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Clostridium acetobutylicum # 39 429 38 404 407 227 34.0 3e-59 MQKDFLHVLPVGEVISRLLDVSPLPQESKRLDKLSGGPILAETVMATETLPPANRSGMDG YAVQASDLFGASEANPVWLDCVGEIAIDRPPNFTLQSGQCAAIVTGGYLPEGADAVIMVE HTKPFGAGVIEMRRSVAPGEYVMQKGDDAQEGTPLLERGTKLRAQEIGLLAALGITEVPV IRRPRVSILSTGDELVSPEQTPRPGQIRDVNTLALSAMLRPVADVSTFGIAPDRLGPLTA SLKAALAGENGLPADVVFLSGGSSIGVRDLTLEALQSLGDTEILCHGVALSPGKPLILAR CGQTLVWGLPGQVASAQVVMHVLGVPFLRHLAGHSLMAQFFGKGNFRYGHAFDQTFWPSR QAILSRNIASRQGREDYIRVRLEHQANGLPRAVPVPGLSGLLRTLLDSEGLVRISARIEG LEAGTPVDVLLFES >gi|316923394|gb|ADCP01000070.1| GENE 2 1628 - 3664 1666 678 aa, chain + ## HITS:1 COG:HI1247 KEGG:ns NR:ns ## COG: HI1247 COG0556 # Protein_GI_number: 16273166 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Haemophilus influenzae # 6 676 7 676 679 788 58.0 0 MNTTETPLFKLRTEYVPTGDQPAAIEQLASNIEAGVRDQVLLGVTGSGKTFTVANVIAQV NRPALVLAPNKTLAAQLYNEFRALFPENAVEYFVSYYDYYQPEAYVPASDTYIEKDSAIN DDIDKLRHAATHALLTRRDVIIVASVSCIYGLGSPEYYARLIIPVEVGQRVSMDAIITKL VDVQYQRNDMDFHRGTFRVRGDVLEVIPAYEHERALRLEFFGDEIEAVREIDPLTGEILG NIGKTVIYPASHYVSDRDNLERAMSDVRDELGERLRLFQSQNKLVEAQRLEQRTQLDLEM MQELGYCTGIENYSRHLDGRKEGDPPATLLDYFPDDFVLFADESHISIPQVGGMFKGDRS RKTTLVDFGFRLPSALDNRPLEFHEFLERLNQVVYVSATPGKWELERSQGIVAEQIIRPT GLLDPEVEVRPVKGQIDDLLGECRARVEKHERVLVTTLTKRMAEDLTEYLNSMGVAARYL HSDIDTMERMAIIKALRAGEFDVLVGINLLREGLDIPEVSLVTILDADKEGFLRSTGSLI QTFGRAARNAGGKVLLYADTVTASMRAAMGETARRRAKQQDWNETHGITPTTISKPLATP FDSLYTSSGEGKGKRGRGKQAKQPAEASASVDITAQNVARYLKQFEREMRDAARDLEFEK AAALRDRIKQLREQFLIA >gi|316923394|gb|ADCP01000070.1| GENE 3 3696 - 5792 1551 698 aa, chain + ## HITS:1 COG:ECs3283 KEGG:ns 
NR:ns ## COG: ECs3283 COG0272 # Protein_GI_number: 15832537 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Escherichia coli O157:H7 # 16 679 5 669 671 568 48.0 1e-161 MNQGTLQSLLVEDADKKRLEELRAFVIYHNHRYHTLDAPEITDDEYNAAFQELLRLEERH PEWRSPDSPTNRIGGQVLSSLETKAHTRRMYSLDNVFDAEEWQGFLKRLDNAQEGLEHAF WCDPKMDGLALELVYENGRFVEALTRGDGEVGELVTEAMQTVRNLPKVLHGDAPERLEVR GEVIFRRRDFALLNERLRKENLKTFANPRNAAAGSIRQLDTSVTASRPLRFLAYGFGDVR FGGVQPWSTYEEVMGRLRDFGFETPPGGRLCRGSEEVEAYYASLSEKRESLAYEIDGVVM KLNDLEAQEALGYTARAPRFAVAWKFPAQQATTLLLDITVQVGRTGVLTPVAELEPVNVG GVLVSRATLHNEDEIRNRDVRIGDRVVVQRAGDVIPEVVRAVLSERAPDSQPFVFPHVCP SCGQAASRLEGEVAWRCVNVSCPAMIRQSLAHFVSKAGLDIEGLGQRWIELLVASGRVKT PADLFTLRVDELLHYERMGVKLATKFVDSLDRAKKEATLQRLLCALGIRHVGEQTAKTLA ATYADMDVLRAASPEELQNLPDIGPEVASSIKAFFDDEPNVALLARLRELGLWPVRQEAA VASMPVGPLAGQKILFTGSLSIPRSKAQQMAENAGAEIAGSVSRRLNLLVVGDEPGSKRE KAQALGIRIVDEAGFLALLQADNEVTASSSENAAQDSE >gi|316923394|gb|ADCP01000070.1| GENE 4 5811 - 6602 638 263 aa, chain + ## HITS:1 COG:RSc2745 KEGG:ns NR:ns ## COG: RSc2745 COG0289 # Protein_GI_number: 17547464 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Ralstonia solanacearum # 3 258 1 261 263 191 45.0 9e-49 MSLSIIVMGASGRMGTTIANLAKEQGMTLAAVLERPERLDALAHWGCLAGTDPEVVYPQV PGAVVIDFTAPEASLANARCAVKNGNPIVIGTTGFTPEQKAELDAIAQNGRVFWSPNMSV GVNVLLELLPELVQMLGPDYDLEMVELHHNRKKDSPSGTALRLAESLASARDWDLKDVAC YHREGIIGERPKKEIGVQAIRGGDVVGVHTVYFMGPGERIEVTHHAHSRENFAQGALRAA SWLPGQPGGKVYAMGDILKSRLK >gi|316923394|gb|ADCP01000070.1| GENE 5 6602 - 7444 619 280 aa, chain + ## HITS:1 COG:alr2485_2 KEGG:ns NR:ns ## COG: alr2485_2 COG0171 # Protein_GI_number: 17229977 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Nostoc sp. 
PCC 7120 # 10 280 7 276 295 286 53.0 3e-77 MSSIPAKDWKSVWELLTSGIRDYVRKNGFTDVVLGLSGGMDSALVAALAVDALGKDHVHG VMMPSPWSSEGSITDSEALAANLGMETFTVPISPMMEAFEKALAPAFAGRERDVTEENIQ SRIRGVIMMSFSNKFHWMLLATGNKSEVAAGYCTMYGDTCGGLAPIADLYKTEVYQLAQW FNEREGMCAIPQNIFDKAPSAELRPGQKDQDSLPEYDMLDSILHALIEESKRAEDIDLPG VTPEDVNRVQSLMRRSAFKRLQLPPLLPVGNHAFGVHVHM >gi|316923394|gb|ADCP01000070.1| GENE 6 7525 - 8184 635 219 aa, chain - ## HITS:1 COG:aq_671 KEGG:ns NR:ns ## COG: aq_671 COG0378 # Protein_GI_number: 15606084 # Func_class: O Posttranslational modification, protein turnover, chaperones; K Transcription # Function: Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase # Organism: Aquifex aeolicus # 3 219 35 251 259 224 49.0 7e-59 MEIPVIRNVLEANGKLAASLKDQFAKHGILALNLISSPGAGKTSLLERTLTDLASEFRMA VIEGDLQTDNDARRVAATGAKAVQINTDGGCHLDSNLVMEALGAFDLNEIDILFIENVGN LVCPVEFDCGEDAKIALLSVTEGDDKPEKYPLLFNRASAMVLNKIDLLPYVDFDVEVAAR HARHLNADLALFEVSCRTGEGLEKWYSWLRQAAKAKKGA >gi|316923394|gb|ADCP01000070.1| GENE 7 8196 - 8549 210 117 aa, chain - ## HITS:1 COG:alr0699 KEGG:ns NR:ns ## COG: alr0699 COG0375 # Protein_GI_number: 17228194 # Func_class: R General function prediction only # Function: Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) # Organism: Nostoc sp. 
PCC 7120 # 1 117 1 109 113 69 38.0 1e-12 MHEMSVVTSLLSIVREEMEKHDVHRLLLVRVRYGALSNIVPEALSFAFEALTAGTDFEGA VLETEEVPITLKCSQCGHTFPAVKGEHFFAPCPACGERYGHSMETGRELYVQHIEAE >gi|316923394|gb|ADCP01000070.1| GENE 8 8745 - 9464 667 239 aa, chain + ## HITS:1 COG:PH0093 KEGG:ns NR:ns ## COG: PH0093 COG2159 # Protein_GI_number: 14590044 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Pyrococcus horikoshii # 2 237 14 247 247 212 45.0 4e-55 MQKIDAHSHVGYFGGWADVGYTDEQMVAEMDRYGIEKSVVSYMDNTVVEQAVKRFPNRFV GITWANPYEGDKAVETVVREVREHGFQGIKLHPLLNVFTANDAVVHPLMEVAQQMDLPVF IHSGHPPFSLPHSIIQLAEDFPKVRIVMVHMGHGNGIYIQAAIDLSKKHDNVYLETSGMP MHIKIREAYNTVGSERVFWGSDAPFHHYRVEMLRTEVCGLPESALENIFYRNIKRFLAL >gi|316923394|gb|ADCP01000070.1| GENE 9 9510 - 10136 541 208 aa, chain - ## HITS:1 COG:BH0429 KEGG:ns NR:ns ## COG: BH0429 COG1280 # Protein_GI_number: 15612992 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Bacillus halodurans # 2 206 1 205 207 138 41.0 8e-33 MLPLETLGAFFVTAIVMGLAPGPDNIFVLTQSALYGFRAGIVTTLGLMTGLFGHTAAVAL GVAALFQTSEMAFTVLKCAGAAYLLYLAWLSFRSGASRAWLEQSTFPGYWALYRRGVIMN ITNPKVTLFFLAFLPQFAKPELGNVPLQIVTLGLLFQLATLAVFGGVSFLGGRLAEWFNA SIRGQIILNRITACIFTALALLLIFSSR >gi|316923394|gb|ADCP01000070.1| GENE 10 10224 - 11174 663 316 aa, chain - ## HITS:1 COG:lin2617 KEGG:ns NR:ns ## COG: lin2617 COG1660 # Protein_GI_number: 16801679 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Listeria innocua # 31 314 8 286 291 222 42.0 7e-58 MTDQPLLPEHAAEKPTDSASECPKTNADKPIFIITGLSGAGKSTVLHVFEDMRLFTADGV PPELMPEMVRLLQSSSVEPSHGIALGFDQRRSEFMAELDQALHRLSGMGFRTKILYLEAD PAVIMRRYATTRRPHPLEREGVGLEQAVREEMERLAPVREIADKVIDTSSFSIHDLRRII QRKWNSTLERLHTIKVNLISFGFKYGVPREADLVFDLRFLPNPYFVEELRPQTGKESAIA EYVFSEPSGKEFKKRLIDFLSFLLPLYDAEGRYRITIALGCTGGKHRSVAMTEALMRALK RQDYTVSVEHRHMELG >gi|316923394|gb|ADCP01000070.1| GENE 11 11261 - 11797 599 178 aa, 
chain - ## PROTEIN SUPPORTED ## NR: gi|46580040|ref|YP_010848.1| ribosomal subunit interface protein [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 177 1 178 179 235 66 6e-61 MHISFTFKNFEPSEHLRKYARRRMEKLGRFFGKNPALDAQVNMAVDKFRQRVEVQLTGDG INIAASEVSEDMYASIDLVLDKLEAQVKKFVSRNKEIHRKGRANDIDVYTFEVTEEDGER TITGRDHFSPKPMSPDEAAMQLDANDFEFLTFLNSENDRINVIYRRKNGNLGLIDPIF >gi|316923394|gb|ADCP01000070.1| GENE 12 11935 - 13413 956 492 aa, chain - ## HITS:1 COG:STM3320 KEGG:ns NR:ns ## COG: STM3320 COG1508 # Protein_GI_number: 16766615 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Salmonella typhimurium LT2 # 8 489 6 474 477 300 38.0 4e-81 MALELRQQLKLSQQLVMTPQLQQAIRLLQLSRLELVETVQRELLENPFLEEMQDEPSNEP PAEVRPESPRDEPTYDREVARSEDWEDYLGEFASTPQQVQPRDYELPEEMSSLEARYASS PTLESHLMWQLRLSSLSDDQKELGEIIIGNLSSSGYLQASLEEMAEMARADFAGETSTAE AKDKAWPTVEEVETVLKAIQLFDPVGVAARTPQECLLIQIKALGYDRDQVLVDLVRDHLE DLEAHRYKPLLRKFRLDMDELKEYLDIIQSLDPMPGASFGEGMSTFVSPDVFVYKVDGEF LIVLNEDGLPNLHLSPVYDNASENASSKEKEYFNEKIRSAAWLIKSLHQRQRTLYKVVES IVKHQRGFFEEGISKFKPLILKDIADDINMHESTVSRITTNKYVATPFGVYELKFFFNSA LELDDGSQVGSESVKALIKKCISEEDPKNPLSDERIGEILKEHLKVNIARRTVAKYRMAM DIPSSSRRKAHF >gi|316923394|gb|ADCP01000070.1| GENE 13 13726 - 14448 260 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 9 215 17 222 318 104 32 1e-21 MARLSGVDLRKWYGQREVVRGVSPEVNQGEIVGLLGPNGAGKTTSFYMLTGIIKPSSGRV LLEDQDIGTWPLHERARVGISYLPQESSVFRKLTVRQNLQIILEHTGLSRKEQKERADVL LEDFGLTRLEKSHAMHLSGGERRRLEIARALIREPKFVLLDEPFAGIDPLAVGDIQGLVR ALRDRGIGVLISDHNVRETLTICDRAYLMVQGQVVLSGTPETIVNNEQARSVYLGEGFSL >gi|316923394|gb|ADCP01000070.1| GENE 14 14448 - 15023 425 191 aa, chain - ## HITS:1 COG:no KEGG:Dvul_1507 NR:ns ## KEGG: Dvul_1507 # Name: not_defined # Def: OstA family protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 4 185 8 185 196 156 46.0 4e-37 
MKKLFLGITTLSLSLFCSAPLWAAPVTPPKPAKNVETRITSDQLTYLAEKQLVIFDKNVH VVRPDIEIWADRITVYLKPPKGDAQKKEGEKGGMPAGMAAGDVDRIVAERNVRMKSENRN GTCAKATYTMDDGVLLMEGDPRLTDGENTVTGETIKYFTEENRSEVMGGSKKRVEAVFSG SKNSSPIRGNR >gi|316923394|gb|ADCP01000070.1| GENE 15 15085 - 15792 426 235 aa, chain - ## HITS:1 COG:no KEGG:DVU1626 NR:ns ## KEGG: DVU1626 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 58 228 36 199 214 133 44.0 5e-30 MKRSRILLITALVALLAGGWYVAKRGGETTLIGRLLDVANKGGIKSLDGAKLQGNATQSG NPLEEVAGLAIKGINLFQGNKGLELWRLKASWAHMSQNGDTIDVDKPVVRYALGEGSASN PDDDVLDVQAQLGRITDNQRFLTLWDNVVITRYDDVITSSRMNYDANKRLMTFPEGAALE SPTASGTATFFTWDLATNEMHGSGGVLVVLKPRPDAPNANAAPSPREPQANPQQE >gi|316923394|gb|ADCP01000070.1| GENE 16 15789 - 16382 537 197 aa, chain - ## HITS:1 COG:aq_2171 KEGG:ns NR:ns ## COG: aq_2171 COG1778 # Protein_GI_number: 15607107 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Aquifex aeolicus # 30 179 8 157 163 141 46.0 1e-33 MTCNTLYSDGTGISGHSPTPRRESIESLAKHIRLLVLDVDGVMTDGGLYYDANGLVMKRF HVLDGIGIRLAKTAGIEVAVLSGMDVPCVVRRLEVLGVTEYHGGADNKCTILDGMRKRMS LEWQEIAYLGDDWVDLAPMSRVGLPAAVANAMPDVKKLAKFVTQKEGGCGAVREFVDLLL TCQGKREALLEHWMRLE >gi|316923394|gb|ADCP01000070.1| GENE 17 16372 - 17187 521 271 aa, chain - ## HITS:1 COG:XF1289 KEGG:ns NR:ns ## COG: XF1289 COG2877 # Protein_GI_number: 15837890 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase # Organism: Xylella fastidiosa 9a5c # 11 261 29 279 295 290 54.0 2e-78 MMTSDELYGQLRKDHFIFAGPCALESFELALDTAHAVAEASKEAGLFAVFKSSWDKANRT SGKGFRGPGLVKGMEWLARIKEETGLPVVTDIHLPEQAGPVGEVADILQIPAFLCRQTDL LLAAARTGRVVNVKKGQFVAPWDMGPVRDKVAAAGNDRVLLTERGSSFGYNNLVVDFRSI PIMRALGVPVIFDATHSVQLPGGQGSCSGGERHHVPTLARAAAGAGVDGIFMECHPDPDK ALCDGPNSWPVAKLPALLKQLSAIWNIPYDL >gi|316923394|gb|ADCP01000070.1| GENE 18 17296 - 18948 2001 550 aa, chain - ## HITS:1 COG:CAC2892 KEGG:ns NR:ns ## COG: CAC2892 COG0504 
# Protein_GI_number: 15896145 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Clostridium acetobutylicum # 1 550 1 535 535 712 60.0 0 MKTKFIFITGGVLSSLGKGLSAASIGALLKTRGLTVTIQKLDPYINVDPGTMNPYQHGEV YVTDDGAETDLDLGHYERYLNTALSQKNNTTSGSIYYKVITKERHGDYLGGTVQVIPHIT DEIKNAILSLAKDEPENPAPDVAIIEIGGTVGDIEGLPFLEAIRQLRSELGRDNCLYIHL TLVPYLRSAGEHKTKPTQHSVKELRSIGIQPDIILCRCEEPVPADLRAKIALFCNVDKDA VFSAVDVSNIYELPLKLYEEGLDQKVAIMLRLPARNAKLEPWEELVHNVKNPQGRVTIAI VGKYVELKEAYKSLHEALVHGGIANRVAVDLRYVNSEEVTAENVAETFKGADGILVPGGF GYRGVEGKIESIRYARENNVPFFGICLGMQCAVIEFARHVVGIEEANSEEFNPFAKDKLI YLMTEWYDFRKKAVERRDAESDKGGTMRLGAYPCAISEGTHAFEAYKTANINERHRHRYE FNNAYRERLADMGMVFSGQSPDGTLMEIVEIPDHPWFLGCQFHPEFQSRPMKPHPLFKDF IRASCLARKK >gi|316923394|gb|ADCP01000070.1| GENE 19 19389 - 20222 766 277 aa, chain + ## HITS:1 COG:Ta1318m KEGG:ns NR:ns ## COG: Ta1318m COG0047 # Protein_GI_number: 16082610 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, glutamine amidotransferase domain # Organism: Thermoplasma acidophilum # 13 243 13 226 257 155 37.0 1e-37 MSQVNTLVITGYGTNSHQETAHAARLAGADRADVVHFSDIVAAKVRLSEYQFLVFPGGFL DGDDLGAAQAAAQRWLHLSDAEGQPLLDSLNRFVDAGGLVLGICNGFQLLVKLGVLPALD GKRFERQVSLSHNDSARYEDRWVQLKPNPDSPCIFTKGLAVHGVLPMPVRHGEGKLVARD AATLQRLQDENLIALQYTHPETGEATQEHPWNPNGSPLAIAGLTDPTGHILGLMPHPEAF HHATNHPGWTRGEVAVPGTALFANAVRYLRENPVNHS >gi|316923394|gb|ADCP01000070.1| GENE 20 20259 - 21113 502 284 aa, chain + ## HITS:1 COG:SP1609 KEGG:ns NR:ns ## COG: SP1609 COG0327 # Protein_GI_number: 15901449 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 1 280 1 262 265 84 28.0 2e-16 MKQSELIALIERTAPLAIAAPWDKSGVQVASARQDINRLAVCLDPTPESIRIALSGGAEM ILAHHPLTMEGRFTDRLDSYHEVLSLLFRADVPLYSAHTSLDANPLGPVSWLADELGLCR IPSSDTKDGEAGHGMPLPLTVLEQTGTMERGGGSYACGFGIVGDCAVDMTPEDLKKMLAL WLVGSCPRLAGALPERIRRIAICPGSGSSLAPEAAACGADLLITGDLKYHTALDLPLPVL 
DVGHFSLEEEMMRRFALQLKENVSDVAVQFVPAQDPLAPFSPTD >gi|316923394|gb|ADCP01000070.1| GENE 21 21149 - 21916 858 255 aa, chain + ## HITS:1 COG:Cj0706 KEGG:ns NR:ns ## COG: Cj0706 COG1579 # Protein_GI_number: 15792055 # Func_class: R General function prediction only # Function: Zn-ribbon protein, possibly nucleic acid-binding # Organism: Campylobacter jejuni # 1 234 1 231 238 67 24.0 2e-11 MSLYLEQIRQLVALQRVDDAIHSVEMELEQAPKELEDLKNRFAATNTQRERVLEKLAHLK EQEKRITGELDDDSSRIKKSKNKMMQVSNSREYQAMAREMDNMEKVNRSREEERAALLEE KLHQDNALQEVDAIWADLKAELEAKQISLETRQDEARKRLDELAQVRSETGSAVPRPVLD RYEFIRRRLSHPVIVPVTAGVCSGCRIMIPPQTFIELQGGHKIINCPNCQRLIYWVEHFN EETNQTADEMAHHAE >gi|316923394|gb|ADCP01000070.1| GENE 22 22389 - 23612 510 407 aa, chain + ## HITS:1 COG:FN1788 KEGG:ns NR:ns ## COG: FN1788 COG0245 # Protein_GI_number: 19705093 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Fusobacterium nucleatum # 242 398 4 153 160 142 43.0 2e-33 MAWTLLLAAGQGNRLASATGGLAKQFLDWKGAPLYWESALVFSRCARVEGLVFVFPEACI DEERERVAGLMRNASLGIPWKVVAGGALRQDSVRNGLSALPDDCMEVLIHDAARPFVSPV LVNRILDGLHGEASGECVAACIPGIPVVDTIKVMDEEKKVVATPDRKHLMAVQTPQGFSL AFLRAAHQRAEQEGWVVTDDASLLELCGHAVHVAEGEAGNKKITTPEDLEMLRMTGERIP CVGYGYDVHKYADGNEARQPARPMRLGGVPIAGSPDVLAHSDGDVLLHALMDALLGCIGA GDIGTFFPDSDPAFDNVNSAVLLDTVLEHVQKANVHITHVDLTVIAQIPKVGPHREMIRR NVARLLGLDMGAVNVKATTEEGLGFTGERLGIKAVAVVTGLRGNRDT >gi|316923394|gb|ADCP01000070.1| GENE 23 23788 - 25611 1245 607 aa, chain + ## HITS:1 COG:BH3104 KEGG:ns NR:ns ## COG: BH3104 COG0318 # Protein_GI_number: 15615666 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Bacillus halodurans # 8 567 3 562 566 526 47.0 1e-149 MSDLQQGMDCPWFSSYDKGVPHDIVIPEGGLQNILDTAAETYGQREAIVFQNTHISYREL KELAEKLAASLRRRGVRPGDRVSIMLPNLPQTFIAFWGVMKAGAVAVMTNPLYMESELTH QIKDSGAKHLITLDIFWPKIAALREQLELDVCYVTRVRDALKFPLSWLQPFSARRQGTWV 
PVPFDGKEVVAWKDLFKTTERLSVEVTDAHETIALLQYTGGTTGLSKGAMLTHANLLANL TQLMAIIQQPPEKEHVFLALLPLFHVFGLTITLLLPARMAARVVPMPRYVPADMLEAFKK YQFTAFIGAPSVYISLLQQKNLAQYDLHHIIFCISGSAPMPVEWLKKFEEVTGTPITEGF GLTEASPVTHANPVFGKQKPGSIGVPVPGTLARIVDTEDGERVLAHNEIGELVIKGPQVM KGYWNRPEETARTIRDGWLYTGDIAYMDEEGYFYIVDRKKDLIIVGGYNVYPREIDEVLH THPKIREAVTVGVNHRSRGETVKAYIVPEEGANLTVPEVVAFCRQKLASFKVPRLIEFRE ELPKTMVGKVLRRILREEELHKSDTEQDKEQLVPEETSKEEVSAPEAGIGEKEAKTQGEP EKTDGNA >gi|316923394|gb|ADCP01000070.1| GENE 24 25608 - 26072 233 154 aa, chain + ## HITS:1 COG:SPy1980 KEGG:ns NR:ns ## COG: SPy1980 COG1490 # Protein_GI_number: 15675771 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Streptococcus pyogenes M1 GAS # 1 151 1 144 147 135 47.0 4e-32 MRLLLQRVREGKVTIEGQNVASIGKGLVVLVGFGPEDTIDLRGGKLWNTLIEKMVGLRIF PDDEGKMNRGIEEAGGEIILVSQFTLYADSRKGRRPSFHLSAPPGVAEPLFQHFVEDVRS RLPGRVQQGVFAADMDVSLTNWGPVTLLLDSADF >gi|316923394|gb|ADCP01000070.1| GENE 25 26163 - 27068 680 301 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 2 300 6 305 308 266 46 3e-70 MDLLQSIGKTPLIRLAKLTAGLDAEILVKVESRNPGGSIKDRAALRMIQGALEKGTLKPN GTIVEPTSGNTGIGLAVVSTCMGFKLILTMPESMSDERKALLRGFGAELILTPAAKGMGG AVEEAQRLAATEGYVLLDQFSNPDNAEAHYRTTGPEILSDAGTVDAFVAGVGTGGTITGT GRYLREHLPSVGLFAVEPAESAVLSGNPAGPHLIQGIGAGFVPALLDRSLLTEVLPIPGL EAIKMARRVMTEEGISCGISSGANVAAALLLANRPEWKGKRIVTVLPDTGERYLSTQLFK A >gi|316923394|gb|ADCP01000070.1| GENE 26 27096 - 27770 264 224 aa, chain - ## HITS:1 COG:PH1844 KEGG:ns NR:ns ## COG: PH1844 COG0438 # Protein_GI_number: 14591592 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Pyrococcus horikoshii # 26 224 176 378 381 111 35.0 1e-24 MKAFLPNSEACARAISWCCPKHKIHVIPNGIPDDRVIPTRPASAVKQELDIPEQALIFTY VGNNNPAKGIEPLLKAFSSACLDAYLVMVGANPALWLPLCDTLGIREKVRLIGQTERVSD YLQIADAFVFPSKNMDSAPNTLLEAIRMGLPVVAATVGGVPEIAQNNGLLVPPDDTEALT 
RALQEMASDPERRKLWGANSARRGQSYSVEARCVELEKIYHSVL >gi|316923394|gb|ADCP01000070.1| GENE 27 28195 - 28644 402 149 aa, chain - ## HITS:1 COG:no KEGG:DvMF_2251 NR:ns ## KEGG: DvMF_2251 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 3 149 18 164 164 120 43.0 2e-26 MTEEVLGNFSAILESMDFAQELELLGIRKLHFRQRKRAIRELRAMGIGLWRLGLKRSFPS DGEFLFERFMLGLYEQAHTNKEREQANAFDLLVRSYIDKFGERGDTDFTTVSGHIVSLFR RKPGDTAAQRLKLALLMRNTYTDIFRHLI >gi|316923394|gb|ADCP01000070.1| GENE 28 28692 - 30344 1614 550 aa, chain - ## HITS:1 COG:BH2384 KEGG:ns NR:ns ## COG: BH2384 COG0513 # Protein_GI_number: 15614947 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Bacillus halodurans # 55 494 17 447 539 280 37.0 4e-75 MTEQMQEAKATPAAQKAKAKSHPSNTIQTSSEIISVDEPENALPEVSLADLPQLLQTACA KAGWNSLMPVQSRALPYLLEGRDLMVQSRTGSGKTGTYLLPLMARLNPAMPAVQALILVP TRELAVQVEQEAKTLFKGSGFTVAAVYGGVGYGKQMDALRQGVSVVVGTPGRVLDHLLRR TLNLDHISALIFDEADRMLSIGFYPDMKEIQRYLPETSIHAMLFSATYPPHVLKLAGEFL TDPQMLSLSTTQIHVAEVQHLYCECKSMEKDRTLIKILEVENPASAIIFCNTKATVHFVT AVLQGFGFNADELSADLSQSRREDVLSRLRKGTIRFLVATDVAARGIDIPELSHVFLYEP PDDRESYIHRAGRTGRAGAAGVVISLVDIMEKMELQRIAKYFKVPLTQHMAPTDEEVAHA VGMRVTALLEARFRQLNGLERERMKRFEPLVQSIAEDPEQRHLLTLLLDDCYQKSLNPTA FLPAGTPRKESEGAARPPKPHTGGGRRGPRENGPRREGGRREGQPFSKEGKREGGRGFKG RKRSSDKPSD >gi|316923394|gb|ADCP01000070.1| GENE 29 30687 - 31562 447 291 aa, chain - ## HITS:1 COG:CAC1749 KEGG:ns NR:ns ## COG: CAC1749 COG1243 # Protein_GI_number: 15895026 # Func_class: K Transcription; B Chromatin structure and dynamics # Function: Histone acetyltransferase # Organism: Clostridium acetobutylicum # 6 224 56 276 358 145 33.0 7e-35 MERTASRPIELAFYGGTFTALPERLQMECLALAMQAKEKGIVCRVRCSTRPDALRPDRLQ ALRHAGLDLVELGIQSFHTEALLDAQRGYNGDRAREGCRLIKESGLKLGIQLLPGMPGST PQRFSEDVEEALAFSPSCLRFYPCLVVDGTPLAERWRMGQYAPWELDTTITTLGKALASA 
WARRIPVIRLSLAPERELDESVLAGPRHPALGNIIQSEALFETVRSHFALNGFLPPDELF FPQYCQGFFSGHKGSLLPRWEKLGIQPSSIQWVEGEYASLRWNKPLEEVLS >gi|316923394|gb|ADCP01000070.1| GENE 30 31944 - 32369 201 141 aa, chain + ## HITS:1 COG:BH1189 KEGG:ns NR:ns ## COG: BH1189 COG0537 # Protein_GI_number: 15613752 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Bacillus halodurans # 4 141 2 142 142 120 43.0 1e-27 MPASQHEDCIFCKIARGDIPCTSVFESEELIAFLDISPVNKGHTLLVPKAHMETLFDMPA GIGEMLFAAMKQVGSAVMKATGAEGLNVVQNNYSAAGQQVPHVHWHLIPRFADDGYTAWP QGAYQDMQEMAALADAIREKL >gi|316923394|gb|ADCP01000070.1| GENE 31 32771 - 33073 170 100 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 [Haemophilus influenzae PittEE] # 5 93 4 92 96 70 37 3e-11 MNRALTKADIVEAIYEKTDRNRADVKNTVEILLGLMKQAIKKDDALLVSGFGKFEAYDKE ARKGRNPQTDETITLPPRKVVVFRLSRKFRAELNGEEVEA >gi|316923394|gb|ADCP01000070.1| GENE 32 33310 - 33903 476 197 aa, chain - ## HITS:1 COG:alr1529 KEGG:ns NR:ns ## COG: alr1529 COG2755 # Protein_GI_number: 17229021 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Nostoc sp. 
PCC 7120 # 6 196 13 206 206 101 33.0 7e-22 MKTFFFFGDSITLGVNDTLAGGWIGRFAGLASQRAGLPVPPSTFYNLGVRKHSSLQIRER WASEYRSRVNDATVPYLIFCFGTVDMAAPNGNVAVPMQESTQNAQAILSEAQMEAPVLMM GPPPVKNPDHLERLNKLNETYAGLCLDLGVAYLDLLKGLPAVYVADLDDGLHPGKTGNML IAEQLLNAPIVQGWIRS >gi|316923394|gb|ADCP01000070.1| GENE 33 33969 - 34721 305 250 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 [Kordia algicida OT-1] # 28 249 1 221 221 122 33 8e-27 MNAENSQELLEEKTGTATDAHAAKVSGLFARIVKWYDPLNRLLSLGLDQGWRKCLADAVL PGQAEGKRVLDLAAGTLDVTLAVRKRHPAAQVLAMDFCPPMLVHGQKKLSGEDKDFVLSV GADARALPLPDACMDGLTMAFGIRNIAPRSASFAEMARVLKPRGRACILEFGTGKTRIWL GIYNFYLKRILPVVGRLSGDPGAYAYLARSIIEFPSADALSDEMRAAGFKRIYHIPLCSG IVCLHVAEKG >gi|316923394|gb|ADCP01000070.1| GENE 34 34711 - 34884 215 57 aa, chain - ## HITS:1 COG:no KEGG:DVU3173 NR:ns ## KEGG: DVU3173 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 57 1 57 58 63 63.0 2e-09 MGKENKEALQVTKEIIVKFIETQRISPSNFAEHFPAIYKIVLDTLEDNATQDDTSER >gi|316923394|gb|ADCP01000070.1| GENE 35 35185 - 36456 1596 423 aa, chain - ## HITS:1 COG:DR0656 KEGG:ns NR:ns ## COG: DR0656 COG1301 # Protein_GI_number: 15805683 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Deinococcus radiodurans # 9 402 5 406 431 282 43.0 7e-76 MSKKHSKMSLPVQMVIGLVLGVIVGSMTSAEFTAAYLRPFGQLFITLIRMVVVPLVLATI IAGAAGIHDLSKLGRVATKTLVYYFLTTGVAVAIGLFLANIMQPGIGLDLSTKGLEAKQI TPPSLIQTLLNIVPMNPIDALAKGNMLQVIFFAVIFGFALSSLGETGKPLLRIFELIGDV MIRMTNMVMMYAPIGVFGLISYTVSQHGLKVLLPLGKLILVSAIASVLHVIICYSPLVKY VVRIPLPTFFRGVFEPWLIAFTTCSSAAALPTNLQSVRRLGASKGVASFSIPLGNTINMD GTAIYMGVAAVFAAEVYGIPLTLSDQLTVMLMGLLASIGTAGVPGAGLIMVSLVFTQISI PLEALALIAGIDRVLDMIRTSINVLGDATGALLVSKLEGDLNTEPFAENEDVSELNTGTE TFE >gi|316923394|gb|ADCP01000070.1| GENE 36 36820 - 37029 174 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDNKTYDVIFVARKNAENKNAGMIAENVDKATAIRFAQRYCHRNEDNEGVMIAYHGTTVQ PEGSEWWFH >gi|316923394|gb|ADCP01000070.1| GENE 37 
37132 - 37899 761 255 aa, chain - ## HITS:1 COG:PH0125 KEGG:ns NR:ns ## COG: PH0125 COG0005 # Protein_GI_number: 14590069 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Pyrococcus horikoshii # 4 251 6 245 260 217 46.0 2e-56 MGIKVGIIGGSGLDNLNIFTNARDTFTGTVWGEPSSPLREGEIAGIPVAVLARHGRSHTI APSSVNYRANIQALKDAGCTHILATTATGSLRKEIRRGDLVILDQFIDFTKQRKMTFHDY FKPGQPVHAPMAEPFDASLRQILIDGCRKNDFPFHPTGTVVTIEGPRFSTKAESRMFRMW GADVINMSVSTETALANEAGIPYAAVAMSTDYDCWKEDEAPVSWEEVSLVFKQNAEKVTT LLVESVPAIAQAALE >gi|316923394|gb|ADCP01000070.1| GENE 38 37902 - 38564 464 220 aa, chain - ## HITS:1 COG:no KEGG:Tery_4650 NR:ns ## KEGG: Tery_4650 # Name: not_defined # Def: ribulose-5-phosphate 4-epimerase and related epimerase and aldolases # Organism: T.erythraeum # Pathway: not_defined # 7 215 5 197 198 134 35.0 3e-30 MTAEPIDGVVKYQASHTRGDVETSLRTLPAGIRETALDALTLFPELDAARTTLHDAGLIG VYPSGIGYGNVSLRLAGNLFLISGSGTGSSRLLGKQGYSLVRAFDPLENTVASFGPVQAS SESMTHGAVYGAANKARCVIHIHSPFLFTSLLAEGFPRTPESVAYGTPALSREVARLITE ELSPSEGVFVTAGHNEGVFAYGESIASTLNLILSLNMTKD >gi|316923394|gb|ADCP01000070.1| GENE 39 38554 - 39687 889 377 aa, chain - ## HITS:1 COG:mll7284 KEGG:ns NR:ns ## COG: mll7284 COG0182 # Protein_GI_number: 13476068 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family # Organism: Mesorhizobium loti # 1 370 1 364 364 377 54.0 1e-104 MNIDGKAYRTIWLDEADRLHVIDQHVLPHRFETATITSFSALCDAIRNMLVRGAGLIGAT AGYGMWLAACEAPAKDTEAFDTFMSDAATALKRTRPTASNLAWAVDRQLAAMTRGSSPEE KRHIAKVTAQAIADEDAEACKRIGEHGKTLLEALAAKKANGAPVNILTHCNAGWLAFVDY GSALSPVYAAHNAGLPVHVWVDETRPRNQGASLTAWELSRHGVPHALIADNAGGHLMQHG QVDIVIVGADRVTRRGDAANKIGTYLKALAAHDNGVPFYVALPSSTFDFSLDDGVAEIPI EERGAAEVRIMSGKTPDGSVLDVQICPDETPARNWAFDVTPNRLITGLISERGVCQATQE GIDSLYPESAGTARHDS >gi|316923394|gb|ADCP01000070.1| GENE 40 39932 - 40477 624 181 aa, chain - ## HITS:1 COG:ECs1131 KEGG:ns NR:ns ## COG: ECs1131 COG0680 # 
Protein_GI_number: 15830385 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase maturation factor # Organism: Escherichia coli O157:H7 # 2 176 4 179 195 171 54.0 7e-43 MERIVVLGLGNILYGDEGFGVRVAERLYSRYAFPDNVEIVDAGTQGHPLLAFVERATRLL LLDAVDFGLQPGTTVEKDSTDIPAYLSAHKMSLHQNSFSEVLALAELKDCLPEEIRLIGA QPLDMTYGNTLSPLLLSRLDTLVDMALHQLQAWGVPGRPACPESVFQNPEISLERYVPLP A >gi|316923394|gb|ADCP01000070.1| GENE 41 40480 - 41172 969 230 aa, chain - ## HITS:1 COG:ECs1130 KEGG:ns NR:ns ## COG: ECs1130 COG1969 # Protein_GI_number: 15830384 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I cytochrome b subunit # Organism: Escherichia coli O157:H7 # 12 228 12 228 235 224 51.0 1e-58 MNEPIQEGKPIYVFQLPIRIWHWSMVLSFLVLIPTGYIIGKPWHSLDGDPTYLFYMGYTR MAHFIAGFIITIGLLWRIIFAFFGNKYSRQVFIIPFWRKSWWLDLLSDFRWYLFLDRTPR EHIGHNPLAQLGMMTCINLLIIMILTGFGMYVQSSDSVILQPFHLVVDFIYWIGDSGRDL HSYHRLGMLFLMCFIIIHLYMVIREEIMGKSTLVSTMFSGFRLLRSGKGG >gi|316923394|gb|ADCP01000070.1| GENE 42 41183 - 42979 2084 598 aa, chain - ## HITS:1 COG:hyaB KEGG:ns NR:ns ## COG: hyaB COG0374 # Protein_GI_number: 16128939 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I large subunit # Organism: Escherichia coli K12 # 1 597 1 596 597 863 65.0 0 MAYEYTTQGYTVNDSGRRLVVDPVTRIEGHLRCEVNINDDNVITNAVSCGTMFRGLEIIV KDRDPRDIWAFVERICGVCTGTHALASVQAIEDALKIDIPDNANIIRNLMQLALWYHDHL VHFYQLGGLDWIDVVSASKADPKETSRLAQSLSPWPQSSPGYFASVKEKLVRIISTGQLG IFKNGYWGHPGYKFPPEANLMLVAHYIEALDFQKDVVQIHTVFGGKNPHPNWLVGGVPLA LNINGKGGADVINMERLELVLSIIKRCREFSAQVLVPDSIALAKFYPEWLHLGTGLSNQS LLAYGAFPSIANDYSQKSLLIPGGAIINGNFNEVLPVDLSDEDQIREEVGRSWYTYPKGV TSLHPYQGETVPHFELGPATKGSRTDIKQLDENAPYSWIKTPRWRGNMMEVGPLARTLIA YQLKQPDIVARVDDLCARIGAPVTSLQSTMGRILTRAQEAHWAADTMQVFFDKLITNLKN GDSTAVFTNKWDPDTWPQEARGVGFTEAPRGALGHWTVIKNKKVDVYQCVVPTTWNAAPR SDGGQLGPYEAALLGTKMDVPKQPLEILRTLHSFDPCLACATHVLGPDGSELLTVHMD >gi|316923394|gb|ADCP01000070.1| GENE 43 42982 - 44139 1063 385 aa, chain - ## HITS:1 COG:STM1786 KEGG:ns NR:ns ## COG: 
STM1786 COG1740 # Protein_GI_number: 16765127 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I small subunit # Organism: Salmonella typhimurium LT2 # 3 378 2 368 372 521 64.0 1e-147 MFNPEETQYEVLRRRGICRRDFLKFCSMAAVALGLGPMGHLEIAHAMETKPYLPVLWING LSCSCCTESFIRSAHPLASDIILSMIALDYQDTIMAAAGDQAMECFEEAISTYKGRYILA VEGNPPLGEEGMYCIDGGRPFMDKLQRGAAGAKAIVAWGTCASWGCVQAARPNPTQATPI HKVIHDKPIIKVPGCPAIPDVMSNIVAYILTYDQIPKLDSQGRPEVFYGKRVHDQCVRRA HFDAGQFVESWDDAAASLGYCLYKMGCKGPTTYNACPVTRWNNGVSYPIQSGHGCIGCAE QNFWDHGSFYSRITNIPQFGTNTTAETVGVAAVAGIGAGVVTHAAISTAVHLKHRYGKDG DCSKETKTAQAEKTASNDSTPSERN >gi|316923394|gb|ADCP01000070.1| GENE 44 44399 - 45748 1509 449 aa, chain - ## HITS:1 COG:all2854 KEGG:ns NR:ns ## COG: all2854 COG2148 # Protein_GI_number: 17230346 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. PCC 7120 # 47 447 52 468 469 275 39.0 2e-73 MLISGYRKGLFKVLDAVCILAALSITGWFMLPADLSVLDDYTGASLFTVVSYLFFFYVLD AYNVGAEDFKDTTGRVLIAFFLGAIASASAFYAFQHWRFDRRTLLLLFSLCLLFCLGWRA LYYKHIGRFVHKARILLIGTDRAGKVRQTISESLTDADIVGYVGDCDNDREIPYLGTPMQ VEEIAKQQNVTMIVLLPDAPIDEDIATELLHAKLHGQMIVDVRSFCEHMVHRLPVSQISS EWLLTEEGFSLNTRGSLRRLKRAFDLFATLGLLICTSPIMLLTAIAIRVESPGPVIYRQR RVGLFGQDFTVYKFRSMRTDAEKNGAVWAMKSDPRVTKVGKIIRKTRIDELPQLWNVLKG EMSLIGPRPERPEFVKELEKEIPFYSLRHAVKPGVTGWAQVCYPYGSSVEDSRRKLEYDL YYAKNMSILLDVRIILKTIGVVLFPKGAR >gi|316923394|gb|ADCP01000070.1| GENE 45 45772 - 46890 554 372 aa, chain - ## HITS:1 COG:SMb21231 KEGG:ns NR:ns ## COG: SMb21231 COG0438 # Protein_GI_number: 16264483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 3 370 11 382 384 249 40.0 9e-66 MNIIIMNNRASFLVSFWSVLISRLQEEGHQVTCLVPVGDSEAEATLKGFGASIRNYPLDN KGLNPVHDLKTCLALYRIFREERADILYASTIKSVIYGIPMAALAGIRSRYAMITGLGYM FEANTPVKKMLTYLASSLYRISLSFSDAVFFQNTDDVQTFRDWHCLPRGAHVVMTKGTGV DTDKFAVAPLPEAPLTFLLVGRLLEAKGLYEYAEAARLVKKRYPNARFQLLGAPESSRGG 
VPLETVKGWEREGILEYLGVTRDVRPYVGQANVVVLPSWREGLPCSLMEAMSMGRPIVAT DVPGCRDVVVDGKNGFLVPVRTPEALAKALESFLEDSALTARMGKEGRFIAETELDARKA ADLILSVMKLVS >gi|316923394|gb|ADCP01000070.1| GENE 46 46963 - 48159 643 398 aa, chain - ## HITS:1 COG:CC2466 KEGG:ns NR:ns ## COG: CC2466 COG0389 # Protein_GI_number: 16126705 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Caulobacter vibrioides # 7 389 40 416 418 256 40.0 6e-68 MQKWFMHVDMDAFYASIEQKDHPELRGKPVIVGGGGPRGVVSAASYEIRKFGVHSAMPIA QALQLCPHAILVPVRMARYAEVSRTVIDVLRSYSPRVEKASVDEAYLDATGLERLFGPVE DMARRIKLEVKEVTGGLTCSIGLAPVKFLAKIASDLNKPDGLSILYPDKLAAFLQSLPVE QIPGVGKTMVRELQSLAIRTAGDVPRYPKVFWERRFGKAGITLYERAQGIDPREVEPYTP PKSESAETTFDIDTRDIGFLKSWLFRHADRIGRTLRKQKLQGRVITLKIKYADFRLMTRR VTLDAPTSATETIYETACDLLDHMRLEEKIRLIGVGVSGFDSPPHQLRLPTIGKDSQSNE ERRTKLDKVMDELQDKFGQTSVVRGRLFQQAAQPKKRT >gi|316923394|gb|ADCP01000070.1| GENE 47 48507 - 50198 1125 563 aa, chain + ## HITS:1 COG:XF2505 KEGG:ns NR:ns ## COG: XF2505 COG3378 # Protein_GI_number: 15839095 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Xylella fastidiosa 9a5c # 67 491 360 764 819 127 26.0 4e-29 MSGKKYKDKVKQRTKEEEKAMKAKVAAKKQDLMSELTNEVIRTYFDEDGVGDGKLFNRLH RDKIVGVIDSDDFLFWNGAHWEKAKEKQEFRAIEDVVRLYERLAVEKEKEFDSVDKRDDP DLKKELQKQLGAIRRRIKTLRDAPGQDNLKKMTARVDPPLLVYPEQLDDKPRLLPCPNGV INLETGELEQGRPRDYLLTACETEYDPGLLDVEDPCPVANDFLLRSMDGDKELVAFIWRL LGYGLIRERKDHIFMIFHGEHGRNGKDTLIKLITTTLGKALSGDVPVEMLLQTPNVKNSS GPSPDVMRLRGMCIAWINEAEENQKFALAKLKKLSGGSYITGRSPYSKEETSWKQTHLPI MTTNELPKAKADDAAFWQRALILKWNLSFVNKPDPAKPYQRQADKYLDEKLEKERKGVLA RMVRGAIEYLKYGGLQVPEKVYRWTESQRTNWDDLAQFLSEWCVREPGHERIEDYKTSIS ATDLHEAFCLWYARYKDRRFSISAKKFAEMLNKKEIPSKKSNGIWRLGITLTPDADIELQ KAREFNPPKSSHKKGENDSGNIL >gi|316923394|gb|ADCP01000070.1| GENE 48 50388 - 51938 945 516 aa, chain + ## HITS:1 COG:no KEGG:DVU2879 NR:ns ## KEGG: DVU2879 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: 
not_defined # 19 512 26 496 518 310 39.0 1e-82 MATMLENYRHRFGSAVKAQGNGFNGPCPLCGGEPGKSDRFIIWPDREHDLGHTCAVNHIP GVCYCRQCRFTGDSIKYLMEIEGLSFREACAELGISNAPVRLRHRPAPREPRAESCTFTP QAWELPTEKWVAYATKLQAEAEQEIWNHPEALKWLAARGITEEAVRTYRLGYLVGENGKA GRYRSRSALGLAPKEQDGKAMTMLFIPRGITIPLFAEDGRLINLRIRKPNADLAKEEGRK CLKYIELEGSCRRPLLLRPEAERARLSVYVIVEGELDAVLCHYATGGGIGALAVRSNTRK PDAEAHSLLEGAVRLLVALDYEDSLNGVAGLKWWMDTYPHARRWPTPEGKDPGEAYGLGV DIREWISEGLPRSVSLPDAPGSVEAFSCGRVFEGGGGETPTNSPSLEKGKGQDGCMASVK EALPAGLREAMPAYLAVNDVPPDVLHAWALWQGLPVRFIKEDGGFRWLYSHSWAKRHRDQ FEAFWRFQDGSDALWDWLSAHVAAEIGAHNLLKIWG >gi|316923394|gb|ADCP01000070.1| GENE 49 51940 - 52701 927 253 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2116 NR:ns ## KEGG: EUBREC_2116 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 19 240 22 243 307 98 28.0 2e-19 MDFSTENDTLSRISKALYGDALTLAVMDPKDISLLKKNARILKKDVFQQLTANIGRDKRL SSVPLCHRLSDGRVEVLSGNHRVQASVEAGIERILVMIIEEDLTRSQAVAIQLSHNALVG EDDPALLAELWAEIEDIAAKTYAGLSSDVVEKLDKIDLTSFTTPQVSTRTMTFAFVDSEA ERLNAVLDDLDGLPAKEIWLADVGQFDRFFDLLEATKRTFDVRNASLAMLKLMDLAEEAI ANHKPEQATEGAA >gi|316923394|gb|ADCP01000070.1| GENE 50 52698 - 54056 1605 452 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302493220|gb|EFL53082.1| ## NR: gi|302493220|gb|EFL53082.1| hypothetical protein DesfrDRAFT_0130 [Desulfovibrio fructosovorans JJ] # 1 451 1 444 449 495 56.0 1e-138 MSFIGAVATSVRQVLAQYAKDVHLPCLIVGAGNFTVPSVLRSAGFAGTITACDVTLYTSA LGAYLSGWTLEAREREDCPEHLRGLLRTGSPLELTASISLLMDLREVWKGDNAFKMRMVE HSREAWDTLIEKTCARLEGYKAHIGPIDYQARDGFDLLEKSASGHTVFAFPPTYKAGYEK LEALLRATVEWTPPAYREMTDKSLELFEAIARFDSYYVVLEKDLPDVYALLGQPSAVLPR GRGRTTYIVAKHAKKVVIRSSVKTAPVGPIWPANRAVSGDVVPGFAPVKRAQSLRLNELY LAKRIDYFDGGVDVCIVLTLDGQVIGKADFMKTSHAQWKLPEGNPGGDESLYIMCDLAVA SDVEKRLAKLVLLLLTSREVKEWVDAKLNKRVGWVITTAFAKGPVSMKYRGCFQLYSRKQ DKKTGQYALNYYAPFGARTLTESFALWKKKYK >gi|316923394|gb|ADCP01000070.1| GENE 51 54151 - 54402 287 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDARRKQELKELMVTAWNLARFGANRFGGPASLYFWMALRIAWWERGGKSVYYTKGNVAQ 
MWMGIVPRQEKVKRGQIMLPGLA >gi|316923394|gb|ADCP01000070.1| GENE 52 54512 - 54862 438 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302862905|gb|EFL85837.1| ## NR: gi|302862905|gb|EFL85837.1| putative iron receptor [Desulfovibrio sp. 3_1_syn3] # 3 103 24 124 141 74 38.0 2e-12 MTEREARKLAKEVVSDEYAVIDEIWNRRRVNYHSVAADYDRDTIKDINRKLPNLLVKNGG VALDELADEYGFESTCDLIDMFLAYTPKRVRLEQLVAQFLEENPQPSGDYDGDVPF >gi|316923394|gb|ADCP01000070.1| GENE 53 54872 - 55240 283 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702697|ref|ZP_03310825.1| ## NR: gi|212702697|ref|ZP_03310825.1| hypothetical protein DESPIG_00725 [Desulfovibrio piger ATCC 29098] # 3 109 2 109 120 140 64.0 2e-32 MSIENSFFSSKAPKERKVCIAKWHRNWSGPRAERFAPSDPNAVDWKAAYRKELESRFPTP SSLRLYLQEIEKRTPDPILCCFELNPEECHRRVLAAFIKENINLDVPEWNGRRHDGQFSL LP >gi|316923394|gb|ADCP01000070.1| GENE 54 55417 - 56172 739 251 aa, chain + ## HITS:1 COG:no KEGG:DVU0193 NR:ns ## KEGG: DVU0193 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 14 238 6 229 243 163 46.0 4e-39 MEDLQKDKDQGEKLRSLVEVSRKNDIPALLNAKNRAQQDVYSDPSKENLAVLERATAMLE KAMDAGQNCKNWKEALTYLQEDCGRKIGQTKLFADIKAGRLRKQPDGTFKRRDLDRYAAS LPTAGTPDKLATDAARRQREKEEQEIRRIRAVADKEEFILKVKQGQYISRDDVYQELAAR AVALSASLKTEFEARSLDVIALVEGNPKKSGPFVEHIEQVIDEAMNEYAKPIEIEVTFTA EPEAGTESDDE >gi|316923394|gb|ADCP01000070.1| GENE 55 56510 - 56761 135 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRHAVADFGAFGVVEPIHRPDEVARDAPDALESHSNANQLFGRGTDTCLIFVCNHSFGSV MVKSNCIVSHIIAISPSRALRAF >gi|316923394|gb|ADCP01000070.1| GENE 56 56787 - 57002 226 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKLRHGMTVMTARYAVVLRGRNQCFPDQWWGHPIIEGKIMKRHTIVLKAEDVVALPVPEG FQRVFPGLLNG >gi|316923394|gb|ADCP01000070.1| GENE 57 57184 - 59043 1529 619 aa, chain + ## HITS:1 COG:RSc0853 KEGG:ns NR:ns ## COG: RSc0853 COG5525 # Protein_GI_number: 17545572 # Func_class: R General function prediction only # Function: Bacteriophage tail assembly protein # Organism: Ralstonia solanacearum # 9 408 28 406 660 157 33.0 
6e-38 MKRRRPVPISAWAEKHRILEMSAIRGRWRNVFTPYLTGIMDVSGLPGVETVIICKSPQTG GSECGHNIVGYCIDRLPGPVMYVFPDELTARENAKDRIIPMIEASPRLRQYMTGYGDDAS SLRINLLHMPIYLGWSGSVSRLGNKPIRILILDELDKYKNPKNEASSESLAEKRTTTWRT RRKVVKISTPTTEDGPIWKALTEEAGARFDFWVRCPHCGFFQHMDFERIAWPGKDEEKSP DAETVLAKRLATYACEYCGTVWDDGDRDRAVRGGEWRERTSGLELMAHVAAHRPVKVGFH IPAWLSYFVSLSEVAHAWLKYKESGKLDDLKNFRNQYAAEPWVESHAARSEDAILALCDD RPRGKVPGPVDGKERVSVLLATVDTQQHYFRYVIRAYGYGETEESWLVASGSADNLAALE EILFGSVYADPDGREYMVKAAMIDAMGGRTAEVYRWAVRHRGRVFPWQGVRSMAQPYTPS HQEYFPDAKGNKVKIPGGLMLYRCDVTFFKSDLAFKLGIHPDDPGAFHLHANDGGQLEQY AKELCAEVWDDEKQGWENPANKPNHFWDCEVMQRAFAFILNVRHRRRPDEEAKKPARPPR PSERGGGGIGSRLANLRRS >gi|316923394|gb|ADCP01000070.1| GENE 58 59043 - 59249 148 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702701|ref|ZP_03310829.1| ## NR: gi|212702701|ref|ZP_03310829.1| hypothetical protein DESPIG_00729 [Desulfovibrio piger ATCC 29098] # 6 56 1 51 69 63 50.0 4e-09 MALYDLSDRLNWQQACEILGCSKAQLYRLVKEKKIPVYGTGKRYRWYLRTDLKLFLETGY CEKNDLTN >gi|316923394|gb|ADCP01000070.1| GENE 59 59272 - 59568 262 98 aa, chain + ## HITS:1 COG:no KEGG:Bcep18194_B1537 NR:ns ## KEGG: Bcep18194_B1537 # Name: not_defined # Def: hypothetical protein # Organism: Burkholderia_383 # Pathway: not_defined # 1 97 1 97 98 110 56.0 1e-23 MEKNKPHCPLSRVKALIEAGKVHMTTTARNGAAALGYDRKRAYAEIMCLSPHEFYKSMTT YHDSSVWQDVYRHKADVGMLYIKLTVIDDVLVVSFKEL >gi|316923394|gb|ADCP01000070.1| GENE 60 59651 - 59965 229 104 aa, chain + ## HITS:1 COG:XF2491 KEGG:ns NR:ns ## COG: XF2491 COG1396 # Protein_GI_number: 15839081 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Xylella fastidiosa 9a5c # 6 103 34 131 133 122 58.0 1e-28 MLETHADFCPVCGEGVLSDEEADRLDGLSEVFRRKVNEELFDPAFVLSVRKKLGLDQRQA GELFGGGANAFSRYELGKAKPPQALVQLFKLLNNDPSRLNELRG >gi|316923394|gb|ADCP01000070.1| GENE 61 59924 - 60085 102 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSETVETVETVETVETKNFQKDKKKPGWEAGLFCANLFKASAPQFIETGWIII >gi|316923394|gb|ADCP01000070.1| GENE 62 60118 - 60345 120 75 aa, chain + 
## HITS:1 COG:no KEGG:DVU0196 NR:ns ## KEGG: DVU0196 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 72 50 125 131 63 51.0 1e-09 MSTIWTREELLDLIACWKAAYKAASTGKSYTVQGRTLTRYDLPEIRQQLVYLQGELAALD TGRRGPAIVLARVRR >gi|316923394|gb|ADCP01000070.1| GENE 63 60349 - 61200 425 283 aa, chain + ## HITS:1 COG:no KEGG:DVU2872 NR:ns ## KEGG: DVU2872 # Name: not_defined # Def: lambda family phage portal protein # Organism: D.vulgaris # Pathway: not_defined # 24 209 45 226 566 142 44.0 1e-32 MALLDQFGHPLPPVSTSRMTARASRDAGAYRGSISGWRGPQVHSPEGESRERDVMQRRAA DLAANDWAAHSAVEAISGNAIGTGLVPKASIPADMLGISSESARELGKRMEWAFALWTSE ADVRGQCHFADLQNLGIRTMLSLGEMLHLAVMLNEKERERQNRAFSFALQTLSPARLMTP DDQQGEPLIRDGVRLSEYGRPEGYWLATPKASPQSSFVSVEGALCWRRTSPMSRPASATA RGCSTCSGTRRTSRCAACPLFPRASSCSATCPTPSATSCSRRS >gi|316923394|gb|ADCP01000070.1| GENE 64 61110 - 61970 981 286 aa, chain + ## HITS:1 COG:STM2606 KEGG:ns NR:ns ## COG: STM2606 COG5511 # Protein_GI_number: 16765927 # Func_class: R General function prediction only # Function: Bacteriophage capsid protein # Organism: Salmonella typhimurium LT2 # 2 241 243 491 526 105 28.0 1e-22 MRGVSAFSKGIELFRNLSDAISYELFAQVIAASFPVFVALENGGVQLPDYVTEGQEGDGE RRERQLVQDLSPGQVLYGNENEKPYVLESKRPSANFSAFVEIVLRATAASVGIPYESLTK DFSKTNYSSARAALNEAWKLYSFYRNWFGRLYCQPVYEMVIEEAFLRGMFELPKGAPGFY EARAFWCNVDWIGPSRGFVDPVKEITATILALQNRLMTYGEAWAETGRDFDEGYARMLEE SPLLALLAPLNLSTKIGKPGKDAAPEDGEKPDEEGDPEKETGEEDE >gi|316923394|gb|ADCP01000070.1| GENE 65 61963 - 63168 1028 401 aa, chain + ## HITS:1 COG:ECs1633 KEGG:ns NR:ns ## COG: ECs1633 COG0616 # Protein_GI_number: 15830887 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Escherichia coli O157:H7 # 36 377 71 399 439 105 31.0 2e-22 MNELWALPFDMAERVLAELASAKSNPQALVEGFPERKARGYELVGGVAVIPVSGAIVREQ GWYGTGQDAVASSLKAALADPSARAILFDITSPGGVVAGTKELADAIAEARTKKHCAAYA 
NGLCASAAYWLASATGTVYAPLTATVGSIGVIMTITNYAKLEEKWGISTVTITGGKWKAA GQGGELTDEERRYFQERINTLHQIFKADVGRHMGLTADPQLWGEAQLLLAQPARELGLVT DIVRDRDAAIRKLAVEAQMTREELAAQSPELVDALLAEGRLKAEAENKANMDKAAADAVA GALAVVKAVAGDETASRVETTLNTLRATGMSAEQIATVAPLLAKAEAPVHENAEAKSRAD ILAGLQAAHRQPAAAAPGTVPTATTKSPLLADAERRAEVAK >gi|316923394|gb|ADCP01000070.1| GENE 66 63178 - 63567 692 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702706|ref|ZP_03310834.1| ## NR: gi|212702706|ref|ZP_03310834.1| hypothetical protein DESPIG_00734 [Desulfovibrio piger ATCC 29098] # 1 128 1 123 132 80 44.0 5e-14 MSKIIVNTEVMGPDFSELVLHELNYEWSREVVTLAASEKELPFGLVLMREAGKYKPLTES TVSEAQKLGGDPIAVLITGLPAGESDQTGTVIRRGAILNGAALKFDASVTTLQAEAKLAL SDLGIVIKE >gi|316923394|gb|ADCP01000070.1| GENE 67 63571 - 64584 1221 337 aa, chain + ## HITS:1 COG:no KEGG:MCA2923 NR:ns ## KEGG: MCA2923 # Name: not_defined # Def: hypothetical protein # Organism: M.capsulatus # Pathway: not_defined # 5 332 2 326 329 221 37.0 2e-56 MPIQNYPTVFDCTEMTAAVNKLPALPVYFRRLFEVKGVKTTTVSLDIRKGRIVLIGDSER NTAPESLAGRGAKREWMHLSCAHLAMSDTLAPEDLQDVRAFGSTEPISVAEVYNDKMQQL KDNMTATMEFHRLGAIKGVVLDANGTTVLHDIFNTFGVTKKKMDISFPKTAADDANPILT SILNAKRHVEAAMGGTPFSHIECIIGSDAYDMLTSHKLVREYFERWLSNRSNFGDNDYRK RGFTYGGLTFVERSDVVGGQTMVEAKKGHVYPVGPRIFKQYHAPADWMETVNTVGLEYYA RMDLKEKGRGIDIEVQSNPLTLCTFPEALIELNFKAA >gi|316923394|gb|ADCP01000070.1| GENE 68 64587 - 65219 756 210 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2867 NR:ns ## KEGG: DvMF_2867 # Name: not_defined # Def: protein of unknown function DUF847 # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 198 1 183 192 109 36.0 5e-23 MADFDLAYAPVAKWEGGWTHDSGDKGGETFRGCARNFFPNEPIWPVIDREKSHPSYKKGK AAFSAHLMGIPSLTGCVKGWYKKEWWDKLGLERFDQIVADELFEQAVNLGKAGMGRYLQR LCNAFNWRKDGSADGVRLFDDLQTDGVVGPKTLSALSIVLSRNDARRIVHLMNCMQGAHY VNSAANRLPLRKFCVGGWPTRTYDPGQEVF >gi|316923394|gb|ADCP01000070.1| GENE 69 65219 - 65434 349 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDFATLMDSQSGIVALGMAAVSGVCAFICAFMPAPTEQSGMLYRIVYELLNWIGCNKGKA KNADDAGNGGK 
>gi|316923394|gb|ADCP01000070.1| GENE 70 65727 - 66029 363 100 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPLTTETLLAYSMGIIGTLLVLLISLVVYVFLTLREEVRGVSSDLSELNKHRVKLVHIDD CRLTVARVHERLDDYEDAMQGLSERMARTEALLQERGGHS >gi|316923394|gb|ADCP01000070.1| GENE 71 66026 - 66346 453 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702712|ref|ZP_03310840.1| ## NR: gi|212702712|ref|ZP_03310840.1| hypothetical protein DESPIG_00740 [Desulfovibrio piger ATCC 29098] # 6 104 3 103 105 89 50.0 5e-17 MNQSFFKEILEQEIHSVFLNPAEFGESVTLEGRTLDAVVDRPEIAWPEADDRPGVSHKLV VLLVALSDFPDELYPGTSVTFNGERWFVATADREALRTIRLYREAA >gi|316923394|gb|ADCP01000070.1| GENE 72 66343 - 67011 537 222 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702713|ref|ZP_03310841.1| ## NR: gi|212702713|ref|ZP_03310841.1| hypothetical protein DESPIG_00741 [Desulfovibrio piger ATCC 29098] # 2 215 16 225 228 110 32.0 5e-23 MIRLDIPNMDETIRALTASLQHMPKECEIAVSRAINRTLNAMRAEAIRIARRAYVYVPPG RLFDQLYLKKAQRGTTKACLYISGRRGISQYHFRPEPKFPGTKPPAGVSAQIRQGGTRKV YQEPGYSKPFIMKKLRGIDFGGYGVFMRKKGVNNFHKKGRKGAEGLVWKGVKMLFGASPI QSLLKKENQQQIVDKASEVFPRRLQHEVNFQIGKLAASGKMR >gi|316923394|gb|ADCP01000070.1| GENE 73 67029 - 67505 480 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702714|ref|ZP_03310842.1| ## NR: gi|212702714|ref|ZP_03310842.1| hypothetical protein DESPIG_00742 [Desulfovibrio piger ATCC 29098] # 9 148 17 154 163 145 52.0 6e-34 MAVKEMLTEAMKEYPFPAPDGSCEDLQVFLHGLPDEQGRRTYPFICVRWVSGDINEGVDG YIGAEGRETLALVLGMYAPESQEQAGLILAELLDWTRAVLRRNRVVAKKFQLELPLKASI PDPEKQWMEYHMATVFPEYQYIIPSTPLGGTLKEHTYE >gi|316923394|gb|ADCP01000070.1| GENE 74 67498 - 67839 342 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702715|ref|ZP_03310843.1| ## NR: gi|212702715|ref|ZP_03310843.1| hypothetical protein DESPIG_00743 [Desulfovibrio piger ATCC 29098] # 34 110 7 86 86 62 48.0 7e-09 MSEQESPKTARKSPAKAESPSPELLARRKQALTVYVGPDRPFGLPLRTSAVLRGEPLPQL AAVIEANPDLKKLFVPVEELAETRCQLRKEGSGMQRLFKTINEASRKARKAKE >gi|316923394|gb|ADCP01000070.1| GENE 75 67843 - 69315 1919 490 aa, chain + ## HITS:1 
COG:STM4213 KEGG:ns NR:ns ## COG: STM4213 COG3497 # Protein_GI_number: 16767463 # Func_class: R General function prediction only # Function: Phage tail sheath protein FI # Organism: Salmonella typhimurium LT2 # 3 469 5 452 475 144 27.0 4e-34 MAFRHGVYTSELPTSILPARSVDSNVVFAVGTAAVDRLETDKPRYVNRLRMYYSYDEFVS EMGWDEENFNKYSLQELAYSHFALYRGAPLVVCNVFDPAVHKTSVSSEAVSFDAKGAASL KHGSVSRLVLKNAESSTTYVEGTDYTLDPISGELSRIEGGSLPAEANVTAGYDYADVSLV DSTDVIGGINESTGESEGLELIDSVFPQFRLVPGSILAPRFSEDPAVAVVMAAKADGING LFKAVALADIPTEGEHGVKKYTDVPAYKQNNNLSDELLIVCWPKVKLGDRVFGLATHLTG LISQTDADREGVPYASPSNKRLEITSIGYPDEKEEGGWKELFLGLDKCNYLNGEGIYTAV NWDGGMKSWGGRMSAYPSNTDPKDCQDAIRRFFNWYQSTFILTYFQKVDNPLTRRQIQTI LKSEQIRLDGYAAREMILGGSISFDESDNPTTDLIDGIARFHLRITPPPANREIDGIFEF DTDNLSVLFS >gi|316923394|gb|ADCP01000070.1| GENE 76 69326 - 69850 669 174 aa, chain + ## HITS:1 COG:no KEGG:Spro_4913 NR:ns ## KEGG: Spro_4913 # Name: not_defined # Def: major tail tube protein # Organism: S.proteamaculans # Pathway: not_defined # 4 172 6 173 173 118 36.0 9e-26 MSRPEQTIAYRVYWQGKDLLGTAQIEMPQVQYMTETLSGSGLAGEIESPTIGLTQSMTCK MTFTSATKDVFDILDWTLQPLFECYSALQIVDESTSIRESIPYRLNIVGRPKNMSLGTME QGKKHGNDLELEVTRLEILLDGEEQLLIDKINFIHRVKGNDLLAAVRVQMGLNA >gi|316923394|gb|ADCP01000070.1| GENE 77 69862 - 70203 414 113 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEKTAQATLSAPITVQGKKTDILTLRRATLGDDEDAMDMAISFNRGNNPVTVELCTLSIV TGVPYDVLRTLDEDDIGAIRAAHNSLRPTKPKKKEEAETATATTQGEGSTASA >gi|316923394|gb|ADCP01000070.1| GENE 78 70309 - 73467 2400 1052 aa, chain + ## HITS:1 COG:XF0730 KEGG:ns NR:ns ## COG: XF0730 COG5283 # Protein_GI_number: 15837332 # Func_class: S Function unknown # Function: Phage-related tail protein # Organism: Xylella fastidiosa 9a5c # 124 706 58 628 739 139 24.0 3e-32 MASEFGVSFSLGANLDGSFGSAFRSANGQIAKVTQSIRAMEGTPVGKIGASLLAQREKTQ KLVGSLKEAKGQLAGYWAEAERTGNITGTLAAQIERAERKVASLKGRLYQSNAAFREQNA EAVKVSGSVTKLRHDYDALNAAMNRAKSHRDALSANIARKNELRDQRSDLNGRLIGGAAQ AATAAIPVKLAVSAEDTFADLKKVMNGADDELLGQVYQDALKMSSETGKSFEDVVAIMTS 
GAQAGLGKTREEMRSNTEQAIQMSIAWGVTAEQAGDSLATWRSSMGMTSQEARHTADVIN ALSNEMNGEAGEIDRIFTRMGPLLKGSGMASQDIAALGMAFKASGAEVEVAGTAMKNFTK VLSLGNAMTKDQKEIFKHLGLDPNALQKQMQTDAKGAIMTLLTQLKRVPKEMQNAVSMKL FGDESIAAIAPLLDNLDLLKQAFKIANGNVDDSVLEEYLNRMETTATEEAKLAQQTRNLG ITVGNAALPAYNAFLKTLSKGVGVITGFAKEYPNVTTALLGGVGALAALTVGGIVFGYAY NGLATTINAVKGGMLALRGATIANTAATRGGTIATLMNRAAHLSWADVGKGSVSTVKSLG SGMLSLIGIQKGTAIGMVWGSRATRAWEKSTKLLGKGLGALKFAFGPVGIAIAGIGLAAY WLIENWDVVGPYFGKAWDWICGKFKWAADFIKGIVDWVFNAVDTIAKKWTESETFKRNTA DALNMNFGGWQGSTAEEGAAWVRTGNEKGKESLESPPDFVGPKPQDKAPDKPAGPKPLET AKQLPGMPTGDAPGGDFVDDSLPAPDFSGWGDEDGKKKKGKKGKGAGPVTVVSLDSGNRF STVFIPAASKKDKDASKPVGTSVLLPSSSGDSETVAFSKAGQNLVGGLNKTFDRLPKLFD ASLSKVSEPDIPPAVVNIPASSSSPQPAPRAIFHPVQRQDRSGSPFAVLKNALGAAPDQW SRTVGAKFGRDALPPVLPQTPMLLERNKKASAQRQPEASGDIQIVQHFNIADAGNLPALK KELRRLEPEFEKLVRRALERMRSDKARTAHAQ >gi|316923394|gb|ADCP01000070.1| GENE 79 73457 - 73666 270 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702721|ref|ZP_03310849.1| ## NR: gi|212702721|ref|ZP_03310849.1| hypothetical protein DESPIG_00749 [Desulfovibrio piger ATCC 29098] # 7 69 6 68 68 64 49.0 2e-09 MPSEKTTRQGQAWDQLAKDAYGDELRLGTLFPENVDELDVLIFGGDVRVAAPEAPSVAKV SSLPPWERM >gi|316923394|gb|ADCP01000070.1| GENE 80 73666 - 74703 1185 345 aa, chain + ## HITS:1 COG:STM4208 KEGG:ns NR:ns ## COG: STM4208 COG3500 # Protein_GI_number: 16767458 # Func_class: R General function prediction only # Function: Phage protein D # Organism: Salmonella typhimurium LT2 # 12 342 23 342 347 114 24.0 2e-25 MRRAAVTVSIKGHDVTLDLMPYLVSLTYTDKADEELDDLQIVLEDREGIWQGDWLPQTGD VIEASILTENWREIGAVEELPCGKFEVDEMELESSAEGGDTVTVKAVPAAVKSSLMLQKK TRSWEKTPITTVIADIAGAAGLDTLYRGPELVYERVEQRQESDLEFMQRITKEQGLRLAV KSDRVVVYAGQTADQLEPIAIRRASEADPGEGLDFQSFRAKRTTEGIYTQCVVGYTKAAD SETIETQYEPNIPPTTGRVLYINKRIENQAQAERMAKAELRDKNRKEQTASLSGMGDTRF RAGTVLDIQGWGRFDSKYVIAQATHTFSADGGYTTSLELEKALDY >gi|316923394|gb|ADCP01000070.1| GENE 81 74715 - 75353 595 212 aa, chain + ## HITS:1 COG:XF2492 KEGG:ns NR:ns ## COG: XF2492 COG4540 # Protein_GI_number: 15839082 # 
Func_class: R General function prediction only # Function: Phage P2 baseplate assembly protein gpV # Organism: Xylella fastidiosa 9a5c # 8 210 19 193 195 66 32.0 3e-11 MNELARVGFVVSRQPEKHRVRVEFRDTVTAKLVSGWLPVLVPRASADMAFDLPDVGDQVL CLFLGNGLEEGFVLGSMYGAQTPPVSSGDKFHRTFSDGTTLEYDRAAHKLRASVRGDVEA SVTGNVEVTLQGNGKVTAGGALELTSAAKIGLNTPALSMGGSGGGGTEAATQGNIRHRGN ITVTGGDVTVNGISFLAHVHDCPHGGTTGAPK >gi|316923394|gb|ADCP01000070.1| GENE 82 75350 - 75742 570 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702725|ref|ZP_03310853.1| ## NR: gi|212702725|ref|ZP_03310853.1| hypothetical protein DESPIG_00753 [Desulfovibrio piger ATCC 29098] # 2 130 1 128 128 146 58.0 3e-34 MMYQGVLGTFFFTVTDAEVATFRDLKQQREIQFAEHKCVSGLPKVQHTGRNLDTLSLTAQ LFPLTPLALTVDMRIDALRELAVLGEEVPLVLGLTYYGLYVLKSVEVQHRIFHNGVTMSA EIALNLTEYN >gi|316923394|gb|ADCP01000070.1| GENE 83 75742 - 76083 433 113 aa, chain + ## HITS:1 COG:no KEGG:Amico_1105 NR:ns ## KEGG: Amico_1105 # Name: not_defined # Def: gpW/GP25 family protein # Organism: A.colombiense # Pathway: not_defined # 25 105 25 103 106 62 48.0 4e-09 MELTVDMSVPASVEIGATGLRGLAQEIRTALATRKGSVPLDRDFGLSWELIDLPLPESRP LLVAEIGRGLERCVPRIKVKSVTFRTDTSGAADGKLTPVVTVEIRKEYLNDFR >gi|316923394|gb|ADCP01000070.1| GENE 84 76070 - 77200 1147 376 aa, chain + ## HITS:1 COG:STM4202 KEGG:ns NR:ns ## COG: STM4202 COG3948 # Protein_GI_number: 16767452 # Func_class: R General function prediction only # Function: Phage-related baseplate assembly protein # Organism: Salmonella typhimurium LT2 # 14 375 8 370 371 177 34.0 3e-44 MTFADLSGLPSVSFAPQSAGETETAIITAYEAIAKATLQPGDPVRLFLESLAYVISVQNG LIDLAGKQNLLAYARGGHLDHLGAPMGVIRIQPQPARTTVRFGVDGALAFAVPIPAGTRV TTQSGGVMFATLSDAVLPAGELFVETSAKATEAGASGNGLLPGQICRLVDPLPYITRVSN VATTLSGCDEEGDERFRDRIRMAPESFSVAGPNGAYEARVKAVSADISAVSVTSPTPGIV DVRFVMTDGELPDEAMIEEVENALTPKDVRPLTDKVLVGSPETVEYALAGKWFLSSSDST LLASITKAVDAAVEGYRLWQRSKPGRDINPDELIARMRNAGAKRVELATPVFQRLTETQI ARETSVSMTFGGVEDE >gi|316923394|gb|ADCP01000070.1| GENE 85 77193 - 77933 843 246 aa, chain + ## HITS:1 COG:STM4201 
KEGG:ns NR:ns ## COG: STM4201 COG4385 # Protein_GI_number: 16767451 # Func_class: R General function prediction only # Function: Bacteriophage P2-related tail formation protein # Organism: Salmonella typhimurium LT2 # 84 244 51 210 210 76 32.0 4e-14 MSSRRIGSTPFLELLPDSIAGDPAIRAAADALDGLLVPSVKAIPSLLLYARLYGKEPDLL PPLRRLAEQAGGLRALEEPLLDLLAWQLHVDNYDIARTYAERLEMVKTSIAVHRKKGTPW AVETAVTAALGNVETTVTEWYDYEGGQPYHFKVLVTLFEQGIVADDINRARQLILETKNT RSHLDHLGITVALGSNCETRFGAVLGMGNTMTIWPEEITDLEQELSLNTGAVAHWQHILT IAPEEI >gi|316923394|gb|ADCP01000070.1| GENE 86 77930 - 78869 978 313 aa, chain + ## HITS:1 COG:STM4200 KEGG:ns NR:ns ## COG: STM4200 COG5301 # Protein_GI_number: 16767450 # Func_class: R General function prediction only # Function: Phage-related tail fibre protein # Organism: Salmonella typhimurium LT2 # 1 159 1 159 581 122 38.0 7e-28 MSQQFRTVTTNAGRNAVREALTQGKTVKLSHMSVGDGGGNPVTPLSTMTKLVNERFRAQI NDIVLDPATPDLFTSELFIPQAEGGWYIREVGLWMDDGTLFAVGNTPLTEKPDISSGAAT DLLVRLIIRVLDAATISIEIDPAQVLATREYVDRKLDAHNKDGGAHETLARKSVQIKAGT GLTGGGTLEADRTLTIKYGNTAGTACQGNDVRLADARTPKPHKATHQTGGSDAITPADIG AADKTIQIKPGTGLTGGGTLEADRTLTVSYGTAAGTACQGNDARLSNARTPTAHKTTHKT GGTDALTPADIGA Prediction of potential genes in microbial genomes Time: Fri May 13 03:02:28 2011 Seq name: gi|316923312|gb|ADCP01000071.1| Bilophila wadsworthia 3_1_6 cont1.71, whole genome shotgun sequence Length of sequence - 82442 bp Number of predicted genes - 86, with homology - 63 Number of transcription units - 53, operones - 16 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 362 - 421 1.9 1 1 Tu 1 . + CDS 462 - 1064 464 ## LIC025 hypothetical protein + Term 1087 - 1128 7.3 + Prom 1078 - 1137 4.6 2 2 Tu 1 . + CDS 1225 - 2739 1451 ## Bpro_5014 hypothetical protein + Term 2742 - 2781 2.0 + Prom 2764 - 2823 4.7 3 3 Tu 1 . + CDS 2884 - 3132 449 ## + Term 3362 - 3406 -0.4 - Term 3128 - 3160 5.3 4 4 Op 1 . 
- CDS 3235 - 3711 315 ## Ddes_1727 hypothetical protein 5 4 Op 2 2/0.000 - CDS 3730 - 5064 707 ## COG3864 Uncharacterized protein conserved in bacteria 6 4 Op 3 . - CDS 5069 - 6070 483 ## COG0714 MoxR-like ATPases - Prom 6090 - 6149 5.7 + Prom 6081 - 6140 2.8 7 5 Tu 1 . + CDS 6176 - 6439 84 ## + Term 6474 - 6509 7.2 + Prom 6519 - 6578 4.6 8 6 Tu 1 . + CDS 6649 - 6933 145 ## + Term 6958 - 6989 3.2 + Prom 7296 - 7355 4.9 9 7 Tu 1 . + CDS 7426 - 7713 264 ## + Term 7734 - 7778 4.0 10 8 Tu 1 . + CDS 8122 - 8400 200 ## + Term 8429 - 8459 1.0 - Term 8671 - 8703 5.0 11 9 Tu 1 . - CDS 8761 - 9147 402 ## COG3111 Uncharacterized conserved protein - Prom 9191 - 9250 5.3 + Prom 9121 - 9180 3.4 12 10 Tu 1 . + CDS 9205 - 9732 304 ## COG1525 Micrococcal nuclease (thermonuclease) homologs + Term 9882 - 9935 1.7 - Term 9682 - 9726 2.3 13 11 Op 1 . - CDS 9769 - 10161 303 ## 14 11 Op 2 . - CDS 10178 - 10549 336 ## 15 11 Op 3 . - CDS 10569 - 12344 908 ## COG3472 Uncharacterized conserved protein - Prom 12395 - 12454 2.7 - Term 12542 - 12591 1.1 16 12 Tu 1 . - CDS 12611 - 12868 88 ## - Prom 12891 - 12950 2.7 - Term 13710 - 13752 1.1 17 13 Tu 1 . - CDS 13891 - 14388 261 ## gi|212702749|ref|ZP_03310877.1| hypothetical protein DESPIG_00777 - Prom 14478 - 14537 5.2 18 14 Op 1 . + CDS 15002 - 15478 549 ## DVU0230 transcriptional regulator CII, putative 19 14 Op 2 . + CDS 15541 - 15711 116 ## 20 14 Op 3 . + CDS 15708 - 15863 169 ## 21 14 Op 4 . + CDS 15860 - 16231 329 ## 22 14 Op 5 . + CDS 16228 - 16470 266 ## 23 14 Op 6 . + CDS 16467 - 16868 297 ## 24 14 Op 7 . + CDS 16855 - 17235 460 ## DVU1754 hypothetical protein 25 14 Op 8 . + CDS 17232 - 17387 147 ## 26 15 Op 1 . + CDS 17612 - 18442 621 ## gi|212702699|ref|ZP_03310827.1| hypothetical protein DESPIG_00727 27 15 Op 2 . + CDS 18445 - 18783 242 ## 28 16 Tu 1 . - CDS 18761 - 19930 739 ## COG0582 Integrase - Prom 20142 - 20201 7.6 + Prom 20525 - 20584 7.5 29 17 Tu 1 . 
+ CDS 20727 - 22445 1111 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold + Term 22471 - 22501 4.1 - Term 22459 - 22489 3.3 30 18 Tu 1 . - CDS 22535 - 22828 345 ## 31 19 Tu 1 . + CDS 23244 - 24227 576 ## COG0604 NADPH:quinone reductase and related Zn-dependent oxidoreductases + Term 24231 - 24274 0.4 + Prom 24467 - 24526 3.9 32 20 Op 1 20/0.000 + CDS 24614 - 25462 580 ## COG0822 NifU homolog involved in Fe-S cluster formation 33 20 Op 2 . + CDS 25459 - 26619 933 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes + Term 26642 - 26682 6.6 - Term 26635 - 26665 0.2 34 21 Tu 1 . - CDS 26690 - 27244 614 ## COG0431 Predicted flavoprotein - Prom 27298 - 27357 2.2 35 22 Tu 1 . - CDS 27366 - 27797 258 ## COG0590 Cytosine/adenosine deaminases 36 23 Tu 1 . - CDS 27935 - 28585 679 ## COG1611 Predicted Rossmann fold nucleotide-binding protein - Prom 28788 - 28847 3.2 + Prom 28660 - 28719 4.7 37 24 Tu 1 . + CDS 28820 - 30439 1060 ## COG1236 Predicted exonuclease of the beta-lactamase fold involved in RNA processing + Term 30442 - 30488 7.1 + Prom 30444 - 30503 2.3 38 25 Op 1 14/0.000 + CDS 30548 - 31114 333 ## COG0742 N6-adenine-specific methylase 39 25 Op 2 . + CDS 31093 - 31689 371 ## COG0669 Phosphopantetheine adenylyltransferase 40 25 Op 3 . + CDS 31686 - 32600 440 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase + Term 32640 - 32686 14.0 - Term 32628 - 32672 10.2 41 26 Op 1 21/0.000 - CDS 32694 - 33911 1341 ## COG0282 Acetate kinase - Prom 33944 - 34003 1.9 42 26 Op 2 . - CDS 34034 - 36151 2393 ## COG0280 Phosphotransacetylase - Prom 36207 - 36266 3.8 - Term 36256 - 36294 10.7 43 27 Tu 1 . - CDS 36304 - 39900 4288 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 40096 - 40155 4.2 + Prom 39836 - 39895 1.6 44 28 Tu 1 . + CDS 40069 - 40263 99 ## + Term 40303 - 40339 -1.0 - Term 40327 - 40362 1.2 45 29 Tu 1 . 
- CDS 40454 - 41446 550 ## COG2840 Uncharacterized protein conserved in bacteria - Prom 41483 - 41542 1.9 + Prom 41425 - 41484 3.8 46 30 Tu 1 . + CDS 41633 - 42685 905 ## PROTEIN SUPPORTED gi|126667548|ref|ZP_01738518.1| Ribosomal protein S7 + Term 42714 - 42753 10.5 - Term 42702 - 42741 9.7 47 31 Op 1 . - CDS 42786 - 43136 338 ## Dde_2026 hypothetical protein 48 31 Op 2 3/0.000 - CDS 43146 - 44489 1159 ## COG0373 Glutamyl-tRNA reductase 49 31 Op 3 . - CDS 44491 - 45324 741 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 50 31 Op 4 . - CDS 45311 - 45988 399 ## COG1648 Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) - Prom 46036 - 46095 3.4 + Prom 45959 - 46018 2.9 51 32 Tu 1 . + CDS 46249 - 46560 240 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 52 33 Tu 1 . - CDS 46633 - 47310 928 ## COG0563 Adenylate kinase and related kinases - Term 47372 - 47406 3.2 53 34 Op 1 . - CDS 47419 - 47880 213 ## DvMF_3041 protein of unknown function UPF0153 54 34 Op 2 . - CDS 47885 - 48310 274 ## COG1832 Predicted CoA-binding protein - Prom 48332 - 48391 5.7 55 35 Tu 1 . + CDS 48312 - 48611 96 ## 56 36 Tu 1 . + CDS 48673 - 49893 640 ## COG0006 Xaa-Pro aminopeptidase + Term 49936 - 49962 -1.0 - Term 49772 - 49807 -0.7 57 37 Tu 1 . - CDS 49867 - 50010 85 ## - Prom 50134 - 50193 2.0 + Prom 50109 - 50168 2.0 58 38 Tu 1 . + CDS 50195 - 53629 1794 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) + Prom 53636 - 53695 2.2 59 39 Op 1 . + CDS 53852 - 54778 772 ## COG0760 Parvulin-like peptidyl-prolyl isomerase 60 39 Op 2 . + CDS 54818 - 55687 327 ## COG1426 Uncharacterized protein conserved in bacteria 61 39 Op 3 . + CDS 55691 - 56461 266 ## LI0301 hypothetical protein 62 39 Op 4 19/0.000 + CDS 56495 - 57433 770 ## COG0752 Glycyl-tRNA synthetase, alpha subunit 63 39 Op 5 . + CDS 57433 - 59520 1311 ## COG0751 Glycyl-tRNA synthetase, beta subunit 64 40 Tu 1 . 
+ CDS 59716 - 59988 299 ## PROTEIN SUPPORTED gi|218885396|ref|YP_002434717.1| 30S ribosomal protein S20 + Term 60015 - 60050 5.1 - Term 60001 - 60036 5.1 65 41 Tu 1 . - CDS 60063 - 61622 1149 ## COG0497 ATPase involved in DNA repair - Prom 61690 - 61749 2.0 66 42 Op 1 49/0.000 - CDS 61753 - 62616 715 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 67 42 Op 2 . - CDS 62613 - 63623 653 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 68 42 Op 3 . - CDS 63646 - 64170 447 ## LI0252 hypothetical protein 69 42 Op 4 . - CDS 64181 - 64795 208 ## DvMF_0918 hypothetical protein - Prom 64824 - 64883 2.1 - Term 65172 - 65207 0.1 70 43 Tu 1 . - CDS 65335 - 66144 284 ## DVU1356 HD domain-containing protein - Prom 66194 - 66253 2.6 + Prom 66139 - 66198 3.9 71 44 Op 1 . + CDS 66297 - 69824 2941 ## COG0587 DNA polymerase III, alpha subunit 72 44 Op 2 . + CDS 69847 - 70233 231 ## COG0720 6-pyruvoyl-tetrahydropterin synthase + Term 70302 - 70369 16.1 + Prom 70324 - 70383 6.8 73 45 Tu 1 . + CDS 70626 - 70838 385 ## + Term 70932 - 70971 8.3 - Term 70915 - 70964 17.0 74 46 Op 1 4/0.000 - CDS 71011 - 71451 444 ## COG4917 Ethanolamine utilization protein 75 46 Op 2 . - CDS 71456 - 71794 205 ## COG4810 Ethanolamine utilization protein - Prom 71864 - 71923 3.1 76 47 Tu 1 . - CDS 71988 - 72695 135 ## Dvul_1842 phosphoesterase, PA-phosphatase related - Prom 72868 - 72927 1.8 77 48 Op 1 . + CDS 72988 - 74727 1257 ## COG2414 Aldehyde:ferredoxin oxidoreductase + Term 74760 - 74802 4.9 78 48 Op 2 . + CDS 74809 - 75549 480 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 79 48 Op 3 . + CDS 75549 - 75794 158 ## Dde_2458 hypothetical protein + Term 75986 - 76021 6.1 + Prom 75828 - 75887 4.0 80 49 Tu 1 . + CDS 76061 - 76789 391 ## LI0137 putative integral membrane protein + Term 76807 - 76846 6.3 + Prom 76931 - 76990 2.2 81 50 Tu 1 . 
+ CDS 77066 - 77377 291 ## + Term 77393 - 77430 4.5 + Prom 77414 - 77473 3.1 82 51 Tu 1 . + CDS 77502 - 78413 444 ## COG1090 Predicted nucleoside-diphosphate sugar epimerase + Prom 78478 - 78537 3.0 83 52 Op 1 . + CDS 78598 - 78822 265 ## 84 52 Op 2 . + CDS 78868 - 79329 -176 ## + Term 79377 - 79417 1.1 + Prom 79437 - 79496 2.4 85 53 Op 1 . + CDS 79527 - 80183 426 ## Dvul_0437 hypothetical protein 86 53 Op 2 . + CDS 80199 - 82388 1311 ## COG5000 Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation Predicted protein(s) >gi|316923312|gb|ADCP01000071.1| GENE 1 462 - 1064 464 200 aa, chain + ## HITS:1 COG:no KEGG:LIC025 NR:ns ## KEGG: LIC025 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 46 198 34 177 179 79 32.0 9e-14 MTIPQLHMYDLKTGEYTGSRDATQRPNGEYILEATGATPVALPASIPSGHVARWTGDAWE TVEDHRQHMDERGRKEGGTQYWLPCDTWRSEPRYTEELGPLPDDALLERPERPLDEYKAD KRREVSDAYASALTATLTMPAANPSALEVTMGAALFAADDAVGLADVQKILTARRDALFA LVDGAASRAELGQIEVSYPV >gi|316923312|gb|ADCP01000071.1| GENE 2 1225 - 2739 1451 504 aa, chain + ## HITS:1 COG:no KEGG:Bpro_5014 NR:ns ## KEGG: Bpro_5014 # Name: not_defined # Def: hypothetical protein # Organism: Polaromonas # Pathway: not_defined # 6 498 16 515 597 228 31.0 5e-58 MYNASFYPTPPEVAEKMLAKVGKLYERSILEPSAGKGDLADAAVGKLDRYYNRCREVVHC IEIEPELQAAIRGKGYPLVGTDFLTFWPDEKYDLILMNPPFVSGEAHLLHAWEILDHGDI VCLLNEQTLLNPCTVHRKLLATIIEEHGEVEHLGSCFAEDALRKTQVRVSMVHLRKKREE PKFSFDAGSDEEGAAVFSDGSRFEGEVATRDTVGNLVAQYGRCRELFVRIAHLAQELAHY AGPLGTDGGETLKELMRQKPTRRAQEDAYNRFVRSLKKSAWREVLRLTDVRNLASHGVQK EIDRILESNERMAFSEENVYALVESIFLNRGAILQQCVVEAFDIMTRYYDENRVHVEGWK TNDAWKVNRRVVLPRVVSVTFSGSGYLSYGNSRQNLNDIDRAMAFLEGKKLESVPCTAVR ALEGHLKACGDDFSGVLFESTYFEMRCYKKGTLHMYFKDKELWERFNLTAARGKNWLPDD VKAREREARARNRRADQYGLPLSA >gi|316923312|gb|ADCP01000071.1| GENE 3 2884 - 3132 449 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNNATLIAERDSLRAAFIRYIALDQFISCGCNPDDFDAHFENMQDALRGGAMEETIRTLS SSIEDVVFDVLNECETFQTEAE 
>gi|316923312|gb|ADCP01000071.1| GENE 4 3235 - 3711 315 158 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1727 NR:ns ## KEGG: Ddes_1727 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 157 1 157 164 160 54.0 1e-38 MHIEWNITKRRGNIRPVLHYTVTLEEHERELALPFIRVVSTIPEPPDSWQEFCYPGQHER AENPASGKTYDLEIPSHKGRLWKQSLRLPWREENDYPEVEQSFKKLRDAFEAELKAAYGS LPMDESNSLETSFDARRFIAPGILAERFLMLARNAKAS >gi|316923312|gb|ADCP01000071.1| GENE 5 3730 - 5064 707 444 aa, chain - ## HITS:1 COG:DR1169 KEGG:ns NR:ns ## COG: DR1169 COG3864 # Protein_GI_number: 15806188 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 224 440 162 376 379 77 31.0 4e-14 MTALERQAHLAMIRARAALVLDHPFFGSIALRLTLKPDPTCSDLWTDGRTLGFNPSYAAA LSEAALIGAQAHEVMHLACAHHVRREERDTALWNKACDIVVNQLLLDAGFSLPQGAVHDP AYAGFSVEALYSELARLQDEAPNKGAKRPEAQEETEQTEGGSGQPGEGKGQGKGQNDPTE GERGEAELLGGHGASESSLDKGKGQRAKPVAFTGEVRDHPVLDGGSGTAQKQAEQEADIE LVQAMQRAKHMGDMPAGLLRLFRKRLHPTLDWRGILQRFLENCADGDSTWTTPNRRYLYQ GIYLPSRQEPRIPHIVLAVDSSGSVDNALLEMFCTELSGILESYDTLLTVLFHDTRVQSV QTFTRQDLPLRLAPAGGGGTDYRPVTAYIEENDLAPTCMIWFTDLECDRFPEEPAFPVLW LAEQPNGTTPPFGETGYLKERPSA >gi|316923312|gb|ADCP01000071.1| GENE 6 5069 - 6070 483 333 aa, chain - ## HITS:1 COG:DR1171 KEGG:ns NR:ns ## COG: DR1171 COG0714 # Protein_GI_number: 15806190 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Deinococcus radiodurans # 5 277 3 271 340 171 40.0 1e-42 MDGFMTPSQIFSALHTLLSAHQPVFLWGAPGVGKSQVVAKVAADRGMALRDIRAVLLDPV DLRGLPRLENGRAEWCPPAFLPGPDDSEQGIIFLDELNAAPPLVQAACYQLVLDRRIGEY RLPDGWSIIAAGNREKDKAVTHRMPSALANRMVHLEFDVSPDDWILWAQQAGIRREVIAF LRFRPKLLHDFDPLSSGKAFASPRSWAFLSGILDANPDPDVEYELFRGTVGDGAAAEFMG FLRVWRGLPSVEDILANPADALVPDDPAALYAVCEALSEKAADGTVNALVTYAGRLPSEF GVLLMRDAVCRDERIVRTSAFADWAQANAHVLM >gi|316923312|gb|ADCP01000071.1| GENE 7 6176 - 6439 84 87 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MAQDALYHRRRVRDLPLLRHWSRYRPWHKRSLRTKPWTLPDAPDLDMTPVRINISLPKCV LEGLDRKASARGMTRSALIAKAAQAYM >gi|316923312|gb|ADCP01000071.1| GENE 8 6649 - 6933 145 94 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKEASIVEIERKAIALIDRFRKEAGLSEAKLGELAFPEAKNYRQKINSLRNARGSGNEPL RLRLGDFCAICHALGKNPAQELLLLWGEADKENS >gi|316923312|gb|ADCP01000071.1| GENE 9 7426 - 7713 264 95 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEEKMRVALERKAVELFEKRRLDKGLSVEALAARLYPDVPAANARMNLNRLRKPQINGKP KRLSFGDFIDLCIALDMVPERIVSQTITEVLEQEK >gi|316923312|gb|ADCP01000071.1| GENE 10 8122 - 8400 200 92 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPVDEGTAVKIERDILSYLDTVKKERGLTDEKWGEQAFQGSVNGRRKVQNLKRPQSNGQP QKLCIADFVRLCSVLHVDPARVLSKALEDNNL >gi|316923312|gb|ADCP01000071.1| GENE 11 8761 - 9147 402 128 aa, chain - ## HITS:1 COG:STM3176 KEGG:ns NR:ns ## COG: STM3176 COG3111 # Protein_GI_number: 16766476 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 1 123 1 123 130 88 46.0 4e-18 MKRLFALVLAATLAAPTIAAAGFQGPNSAQGGGFQGPTTGIEADTVAKAQKSWDDARVVL TGNITQRVAGSDDKYIFKDTTGEMIVEIDFELFAGRTVTPQNKVRLSGKVDKDFMESPKV DVKVLEIL >gi|316923312|gb|ADCP01000071.1| GENE 12 9205 - 9732 304 175 aa, chain + ## HITS:1 COG:Cj0979c KEGG:ns NR:ns ## COG: Cj0979c COG1525 # Protein_GI_number: 15792306 # Func_class: L Replication, recombination and repair # Function: Micrococcal nuclease (thermonuclease) homologs # Organism: Campylobacter jejuni # 36 175 43 174 175 95 38.0 3e-20 MRQADRIQAERGYMEANMLRILLLIVALAFPLPAYAWPGTVLDVHDGDTMTVAPMGDVRT PLKIRLYGIDAPELEQKGGPQSRDHLLSLVRPGQDVEVIKMSTDKYGRTVALVATDRVLN ADMLEAGQAWAYPAFCNAPFCKGWKKLEQDAKEARRGLWSRKNPTPPWKWRQKRK >gi|316923312|gb|ADCP01000071.1| GENE 13 9769 - 10161 303 130 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRMKVRKKMCRFLALIIMLCMGSSAYAETPTAEAQKKVAAYLTLLDDMCFTDPFLSYGFA QDPGKSWKEGLEKLEIELNADPEVSPAVKKTPGLLIHLGLQTFYAKHCLADSNWDRKNVL DELGEYAEGI >gi|316923312|gb|ADCP01000071.1| GENE 14 10178 - 10549 336 123 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MRFLAPCLALVLLLALSPIRAQCASNATETVREAINDLLDDFDDFKDSEIFQQCVYGCGS ENPGKEWRGRIKVLQRQAMRREDIPTRLKDAIGELWQMGRTYARGNARKAAELRRRIETV LED >gi|316923312|gb|ADCP01000071.1| GENE 15 10569 - 12344 908 591 aa, chain - ## HITS:1 COG:all3615_2 KEGG:ns NR:ns ## COG: all3615_2 COG3472 # Protein_GI_number: 17231107 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 237 584 2 356 377 236 36.0 9e-62 MSFRTAEPNLKDVLADIHNGIIQLPDFQREWVWDDRHILELIASVSLSFPIGAVMFLEAG GVPFQTRLFEGVELSPAPAPKKLVLDGQQRLTSMYLALYSGKPVKTKTDKDDVVHRVYFL DMKKCLDPRADREEAILSLPDTFKILSDFGRKIDLDLSTLEQQYEKKMFPASIIFDPVKS KEWRKGYRRYHGEQEDSFLMDFEDDVLSSFQQFKVPAIELGIDTTREAVCKVFEKVNTGG VTLTVFELVTATFAASHFNLPEDWKARSERMRERHSLLKVADGTAFLTALTLYASFRKAQ SESSAVTCKRKDVLELVVADYQRLADEMEYGFKKAAQLLAEEKIFDPRFLPYATQLIPLS CICAALSNQIDNASIKEKVLRWYWCGVFGELYGGANETRFSQDLPDVVDWVNGGDTPRTV KDASFAPRRLLSLQTRISAAYKGLSILLMQRGGKDFISATPIAINAYFVTPVDIHHVFPK AWCTAKNLPREKWNSVINKTPLSTTTNQYLSGDAPSLYLKRIEEKKGVPSETLDACLRSH AIPVEELRQNAFDDFIRQRAILLLDMIEAATGKAVSGRDSDETVMAFGAAL >gi|316923312|gb|ADCP01000071.1| GENE 16 12611 - 12868 88 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLEDALGEQTPEKLEWVERKGYFVIYHEKISNWICRVYYGSGYLKYVMFKDKTTLHMKAG DRLYKRDCPKLKEVFLQVYDGKREA >gi|316923312|gb|ADCP01000071.1| GENE 17 13891 - 14388 261 165 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212702749|ref|ZP_03310877.1| ## NR: gi|212702749|ref|ZP_03310877.1| hypothetical protein DESPIG_00777 [Desulfovibrio piger ATCC 29098] # 1 75 1 75 115 117 74.0 2e-25 MELFNRVKEIAKHFTGSDKALAEKLGFKQATFSGYLNEKRQDNLWPLLPNILSLFPSLSR DWLYFGEGPMLKTDATESPSPNQSPAQTDNSALLELLVRNKELEEKVFDLKGRLADKEKL IALLEENKRLAESVSAVVTPTEARRDNPQHATSVRPVTGVPDGGI >gi|316923312|gb|ADCP01000071.1| GENE 18 15002 - 15478 549 158 aa, chain + ## HITS:1 COG:no KEGG:DVU0230 NR:ns ## KEGG: DVU0230 # Name: not_defined # Def: transcriptional regulator CII, putative # Organism: D.vulgaris # Pathway: not_defined # 7 151 6 150 154 81 36.0 7e-15 
MKEDTPNIGAICQRLAKHAPSGLSAEQIAYRLGRPYNTLMSELSPLRDTHKFDVNLLIPL MRLAGSTEPLHVMARALGGVYVDFLPVSDAAHPVHGQCMASVKSFGDMMVGTAKALEDNI ITSEERRELARLGYRAVGDILALLLQIDEAEARDRGRA >gi|316923312|gb|ADCP01000071.1| GENE 19 15541 - 15711 116 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDRNEIYTGKARLVNRGNGIVLRCATKGELAELLALLRALFPHGGRMKDLMGRAAA >gi|316923312|gb|ADCP01000071.1| GENE 20 15708 - 15863 169 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRYVIGDSARLVTPYRGYSWVTIIGYEGDGYCVELTSGLEIVVREDELEDV >gi|316923312|gb|ADCP01000071.1| GENE 21 15860 - 16231 329 123 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSRKIRELAIPKYKKDWPGKTLLMKEACPTTRMSPYEYGERLPSLIEAGVLVKLERFLSK SEATLSGHSDLYQWAEKEGQRVIKIGWRCPRCAVCHEDYIPESFIRQKKAIFVEFTGTEG EEA >gi|316923312|gb|ADCP01000071.1| GENE 22 16228 - 16470 266 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSVINQLDIEHIDEALRKFANLENMNGFAAILIGFTDEGDLSCLIGKGKLNQGDMLYSFS KGIQNLFERMDALKQDGGVQ >gi|316923312|gb|ADCP01000071.1| GENE 23 16467 - 16868 297 133 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMPLDWSEHAKLERGAALPVDESLLVMLRERRKGPERVEATEMTLAEDPLFAIERRIEEE GDDGEKRLTSRIESVFFSYEEAETYLEQFKYRLHGCRLTVLPTFGSLKRVLRAGEFLDEI ERRLAEGGHDGAA >gi|316923312|gb|ADCP01000071.1| GENE 24 16855 - 17235 460 126 aa, chain + ## HITS:1 COG:no KEGG:DVU1754 NR:ns ## KEGG: DVU1754 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 76 1 76 135 88 56.0 8e-17 MERHEFDAAYARICEVCGMKTQTELSAYLGIRQSSISDAKRRMMIPAAWLLTLLTREGVN PTWILTGEGSKFLVPASLPPSAPTLLLARLAGPELEQRIRAGLTSATECIAEEIRAFAEG KEEKGE >gi|316923312|gb|ADCP01000071.1| GENE 25 17232 - 17387 147 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIDLLIEYRHWLAGTVMGAGFVWLFVVQEVARKRRLRRMARECRHESGQEG >gi|316923312|gb|ADCP01000071.1| GENE 26 17612 - 18442 621 276 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702699|ref|ZP_03310827.1| ## NR: gi|212702699|ref|ZP_03310827.1| hypothetical protein DESPIG_00727 [Desulfovibrio piger ATCC 29098] # 28 272 1 249 252 70 25.0 1e-10 
MKLFRKEKKPAGAYCCPICKRGYRHAGMAERCTRTAVCRLYNTPPAEVREAWRLVGGAAS LGWFLAHPILGTEPEDSGLYGAARAVQDTTGELYAMLHGGFPCADHVRRALHAALTGEVA GIWPPGHPAHLGHVGDVIRSVICDARGEAVARAVRPGLLDGMRELEERVEALYDEIIPEG EADYEEDAIEGIVRLSDAVIGPKPEGRKPSLYLVNERHLVVGRGRADVRRVMMGFGLSKP RIQGISPGEKFEDGRTAEDIIKTAVRVPALIGRMEE >gi|316923312|gb|ADCP01000071.1| GENE 27 18445 - 18783 242 112 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSQSHEKQLIEALRVVCGGEKSEVSPELGDEIVKAIREEIAPLVELAAFPHREYLDEKAA SFYCSTPVSTLQRKRVDGKGPVYIKDGAKVLYARKDLDRYMAARKVKTYEQS >gi|316923312|gb|ADCP01000071.1| GENE 28 18761 - 19930 739 389 aa, chain - ## HITS:1 COG:Ta1314 KEGG:ns NR:ns ## COG: Ta1314 COG0582 # Protein_GI_number: 16082303 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Thermoplasma acidophilum # 203 371 104 271 283 61 27.0 3e-09 MALRDKTKYTGVYTRTSKTKRYLGKADVCFEITYKVGPKLIWEKVGWKSEGYTAVLASEI RAERIRVSRHPDQFPSPPKADPLTYGRAWEIFAEKRLPLLKNHKALRHHYAIHIQPVFEN VLLKDITSFSLESFKVALLAKEAIRKTKTGKVFPRGNTLSPSTVNFILLDVLNVIGRMIE WGLHPGPAPKVKLLPSNNERQRFLTPIELERLLDVLEILSCRVYRIALISMHTGMRVGEV LRMRGQDVDFENRLIHVDGKMGKRAAFMDDTVINVLQSIVPVRPADLVFTTAKGLQIRPN ALTHTFTKAVNILGLNDGITDPRFKVVIHTLRHTFCSWLASQNVPLYTIGKLVGHTSLRS TQRYAKLSPDAKWDALKLIEKIASSAHTS >gi|316923312|gb|ADCP01000071.1| GENE 29 20727 - 22445 1111 572 aa, chain + ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 33 571 18 550 553 347 39.0 4e-95 MIRRTFLKGAAAVSFSLMVSPALTAFAKDSVRVYVGGPIFTMNKNNDIAEAIAVKGETIL AVGKKAEVMAAAGSGATVVDLKGKALIPGMIDGHSHFPSGAFNELTMVNLNVPPLGRAES IADMQSLLKERTAQTKKGEWIVGYNYNDLAIKEQRHPTRADLDAVSTEHPIFVKHVSGHL GVANSKALELAGITEETPNPEGGKFRRGPDGKLDGVLEGPAAQAPVSAIRPKPTAAQYEE AVRRDNMIYAAAGITTANNGGSPTVDDFFLKASENGDLGIRVVIWPNGRNAKLIESYGEK RQGAQLDKAGKVFLGPAKLFADGSPQGYTAWFSKPYFKQLPGKPADFRGFPVFNSQEELF 
ALVQKLHDAGWQITTHTNGDQAIQDMIDAYSAALEKNPRKDHRHILNHCQFCRPDQVVAI AEKGFVPSYFVTHTWFWGDIHRDMVAGPERAAHISPLKAALDHKITFALHNDTPVTPISP LMDVFSAVNRLTSSGKVLGPDQRIDVMEALRGVTINGAYMQRLEDKIGSLEKGKLADMVI LDKDPTKVDPVKLKDIMVEETIVGGNTVYKRA >gi|316923312|gb|ADCP01000071.1| GENE 30 22535 - 22828 345 97 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MASTESGRQMYPSFKEAHEAFVAEYTALYQGFPKYLEDLVIGFGWGEDAAGQRWFIISAN PQGLQHLTYPELVSWDFESFTDIPELIQHAHLDNISF >gi|316923312|gb|ADCP01000071.1| GENE 31 23244 - 24227 576 327 aa, chain + ## HITS:1 COG:BH0935 KEGG:ns NR:ns ## COG: BH0935 COG0604 # Protein_GI_number: 15613498 # Func_class: C Energy production and conversion; R General function prediction only # Function: NADPH:quinone reductase and related Zn-dependent oxidoreductases # Organism: Bacillus halodurans # 1 315 1 310 322 160 31.0 4e-39 MRAIVMTAFGGPEVLRLTDIPEPEPEHGQIRVRLYAAGVNPAEAYIRTGQYAFFKPTLPY TPGFDGAGVVDKVGEGVQGSKPGDRVFVTALTARRNTGTYAEKVVCDAEAVHLLPDAVTL EQGAAVGFPGMAAYRALFQCAGLKPAERILIHGASGGVGTVAVQLARACGAFVIGTAGSP ASMELVRSIGAHIVLDHTQPGYLDTLAALTDGVGPDVILEMLADKNLENDMRMIAKHGRI VVIGSRGSLEMTPRLLMAKESAVMGMAIWHSAPEEARMAEAAVAAALRSGALRPVLGDIL PLEKAAQAHEDIIARGGKPGKMLLRIE >gi|316923312|gb|ADCP01000071.1| GENE 32 24614 - 25462 580 282 aa, chain + ## HITS:1 COG:all1456_1 KEGG:ns NR:ns ## COG: all1456_1 COG0822 # Protein_GI_number: 17228950 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Nostoc sp. 
PCC 7120 # 1 175 1 183 183 211 60.0 1e-54 MWEYTDTVREHFLKPRNAGELADANAIGEVGSLACGDALKLFLKINDKGIIEDASFQTFG CASAIASSSALTEMIKGKTVEEAAKVTNKDIATYLGGLPREKMHCSVMGQEALEAALRNW RGEPAEKQHEHEGKLVCKCFGVTDVQIRRAIEENNLKTVEEVTNYTKAGGGCGECLDTIQ DILDEELGKPKLKEFNPAPKSIMTNVQRMQKVMQVLQDEVRPRLAADGGDIELVDVDGHR VVVALRGLCSNCSSRTVTLKDLVEKILREQVEPEIVVEEVKA >gi|316923312|gb|ADCP01000071.1| GENE 33 25459 - 26619 933 386 aa, chain + ## HITS:1 COG:all1457 KEGG:ns NR:ns ## COG: all1457 COG1104 # Protein_GI_number: 17228951 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Nostoc sp. PCC 7120 # 1 383 1 383 400 514 63.0 1e-146 MSVIYLDNNATTSVAPEVFESMIPFFTEKYGNASSMHTFGGQVGKSIQDAREKVAAMLGA QPEEIVFTSCGTESDSTAVMSALQAQPEKKHVITTRVEHSALIALGQHLEQKGYELTYLG VDSKGRLDLDELSEAMRKDTAIVSIMFANNETGTVFPIQKIAEMVKERGILFHTDAVQAA GKLPIDLAKMPVDFLALSGHKLHAPKGIGVLYVRKGTRFIPFLRGGHQERGRRAGTENAP YIVGLGTACELAMNHMTDENTRVPALRDKLEKGLLAAIPDAIVNGDVENRLPNTSNIAFQ YVEGEAILLLMDQLGICASSGSACTSGSLEPSHVLRAMGVPFTFAHGSIRFSLSRYTTDT EIDYVLKNLPPIIEQLRAISPFRAMK >gi|316923312|gb|ADCP01000071.1| GENE 34 26690 - 27244 614 184 aa, chain - ## HITS:1 COG:PA1204 KEGG:ns NR:ns ## COG: PA1204 COG0431 # Protein_GI_number: 15596401 # Func_class: R General function prediction only # Function: Predicted flavoprotein # Organism: Pseudomonas aeruginosa # 1 182 1 183 185 149 44.0 2e-36 MNPTLNILGISGSLRAQSRNTGLLRFAKQCLPPTVNFQIADLADVPFYNADRCDSKPEPV QTLIRQVTEADALVLACPEYNYSIAPALKNALDWVSREPDCAPFNGKPVAIMGAGGGMGT SRAQYHLRQVCVYLNLRPVNKPEVFSNAFSASFNESGDLIDPSLQQQVTALMNALLAWHS QLTK >gi|316923312|gb|ADCP01000071.1| GENE 35 27366 - 27797 258 143 aa, chain - ## HITS:1 COG:XF1012 KEGG:ns NR:ns ## COG: XF1012 COG0590 # Protein_GI_number: 15837614 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Xylella fastidiosa 9a5c # 9 143 26 160 167 135 52.0 2e-32 
MLQAIAKARQAPQQGEVPVGALIVDPAGTILAAEHNQPITLSDPSAHAEIMAMRSAGAVL GNYRLEGCILVVTLEPCLMCTGAIVHARLAGVVYGAADNKAGAVESCFNGLDLPFHNHRV WHMGGIAAPECASLLQNFFQKSR >gi|316923312|gb|ADCP01000071.1| GENE 36 27935 - 28585 679 216 aa, chain - ## HITS:1 COG:Cgl1086 KEGG:ns NR:ns ## COG: Cgl1086 COG1611 # Protein_GI_number: 19552336 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Corynebacterium glutamicum # 13 210 46 242 256 204 47.0 8e-53 MSQSPIDMLSIRESWRMFRILAEIVDGFETLSDLGPCVSIFGSARVKPENPLYAEGETIA KLLVESGYGVITGGGPGLMEAGNKGATEAGGTSVGLHIQLPHEQECNKYVKIRSDYRYFF LRKLMFVKYALAYVVMPGGMGTIDELSEAFVLAQTKRIRPLPIILYQSTFWNGFLDWVRS TMVSGGYIRASEVDDLVTVCDTPEQVVQQIRRRVIL >gi|316923312|gb|ADCP01000071.1| GENE 37 28820 - 30439 1060 539 aa, chain + ## HITS:1 COG:PA3614 KEGG:ns NR:ns ## COG: PA3614 COG1236 # Protein_GI_number: 15598810 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted exonuclease of the beta-lactamase fold involved in RNA processing # Organism: Pseudomonas aeruginosa # 5 459 6 459 467 383 44.0 1e-106 MKVHFLGAAQTVTGSCYLIEACGIRFTVDCGMHQGNSEIEARNFDTDIYQSGNIDFILLT HAHIDHSGLLPRIVHEGFKGSIYCTPPTAALAGLMLEDSAHIQEMEAEWRQKKQKRHTGQ NGKSKEIVPLYTVEDARKVPDQFRLVEYNKTFEPHPGIAVTYKDAGHILGSAFLELTVTE DGKTTRVVFSGDLGRPGTLLMHDPVVASQADYLFIESTYGDRNHKNEEATFDELAEAIAY SYNNHDKVIIPAFAVGRTQEILYCLYLLRQKGKLPDDMPIFVDSPLAIRATEVFKEFKDY LDTPEIDLSGNMSALLPNLKFTLSALESQALNVMKGPAIIISASGMCNAGRIQHHLRHNA WRPTASIVFVGYQGVGTPGRKIVDGASSIRLFNEDVAIKAKVFTIGGFSAHAGQSQILDW IRKMAHPGMQVVLVHGEEKAQQILSGLIKDQFHLSVHAPGYLEEMALAAGATPEVVVTAQ PKSYAHVDWELLLEETEGKLSKLRTRLEKAGERPWDEQVDIRDRILELNKELLSVLTQL >gi|316923312|gb|ADCP01000071.1| GENE 38 30548 - 31114 333 188 aa, chain + ## HITS:1 COG:FN1329 KEGG:ns NR:ns ## COG: FN1329 COG0742 # Protein_GI_number: 19704664 # Func_class: L Replication, recombination and repair # Function: N6-adenine-specific methylase # Organism: Fusobacterium nucleatum # 1 151 1 150 182 120 41.0 2e-27 
MRIIAGALGGRNLKTVEGPGYRPATAKVREAIFSMLSSRGVVWSGLRVLDLFAGSGSLSF EALSRGAQEVCLVEREPKVVQCLNQNVEALDVSDRCRVAESDVLRFLRGRAYQPYDVIFA DPPYGENRLVPTLKAIMKGGWLAPDGYLLAEIEGLLRFDAAAAHEELELEIDRNYGQTRI ILWHKTNE >gi|316923312|gb|ADCP01000071.1| GENE 39 31093 - 31689 371 198 aa, chain + ## HITS:1 COG:PA0363 KEGG:ns NR:ns ## COG: PA0363 COG0669 # Protein_GI_number: 15595560 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheine adenylyltransferase # Organism: Pseudomonas aeruginosa # 9 161 5 157 159 166 52.0 3e-41 MAQDKRVAIYPGTFDPLTNGHANIIRRGLRMFDNIIVAVAADTGKSPLFSLEERVAMAEK VFAKEPNISVEPFQGLLVEYVARRNVHTVLRGLRAVSDFEYEFQIALMNRKLRPDIETLF LISDYRWLYISSTIVKTVASLGGDVRGLVPDHVLSCLRERFGFTHGEIEPVSLPPVPELS ELARLQELEASLDRDTDK >gi|316923312|gb|ADCP01000071.1| GENE 40 31686 - 32600 440 304 aa, chain + ## HITS:1 COG:mll1448 KEGG:ns NR:ns ## COG: mll1448 COG0324 # Protein_GI_number: 13471468 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Mesorhizobium loti # 7 293 20 311 321 164 36.0 2e-40 MTKPRVVCLVGPTGVGKSAIALRLAGQFNGEIINADSRQVFKAFPIITAQPSQAEQSVLP HRLYGFLKTDARFSAGAWGEKALEHIEHTAFPVFVGGTGLYLRAFFDSIVDIPPIPDDIL TRLTEACRIEGSLVLHRKLKEIDPAYAARIHENDRQRIVRALCVYEATGKTFSWWHSQTP PPRDADVLRIGLRLPLDTLTPLLARRIDLMLEAGALEEARSEYAVFPEGTLPGWSGIGCR ELHLYLSGVLSLDAARELWIKNTRAYAKRQLTWFNADSRIQWFAPNEGKEILTLVSEWQQ RKKD >gi|316923312|gb|ADCP01000071.1| GENE 41 32694 - 33911 1341 405 aa, chain - ## HITS:1 COG:TM0274 KEGG:ns NR:ns ## COG: TM0274 COG0282 # Protein_GI_number: 15643044 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Thermotoga maritima # 1 402 1 400 403 426 53.0 1e-119 MKILVINAGSSSCKYQLFNMDDQSVLCSGVVERIGQPMGKLSHKIAPGTDKEEKIVVERP FPTHVEGMEDVISLLLDSEKGVIQDKSEISAIGHRVLHGGEAITDPVLIDEKVKEIIRDC FPLGPLHNPANLMGIDVAEKLFPGVPNVGVFDTEFGMTLAPEAYLYPLPYSLYEELRIRR YGFHGTSHKYIAKATAAFLGKPLSELNSITMHLGNGSSMSAVQNGKCIDTSMGLTPLEGL 
MMGTRCGSIDPAIVPFMMEKKGMTPQEADTVLNKQSGLLGICGTSDMRDVHANIEKGDEK AALAFKMLTRSIKKVLGSYFFLLGNVDSIVFTAGIGENDEFVREAVCEGLEPFGIKVDKK ENHTRKPGARVISTPDSRIPVLIIPTNEELEIATTTMRIVEESQK >gi|316923312|gb|ADCP01000071.1| GENE 42 34034 - 36151 2393 705 aa, chain - ## HITS:1 COG:MT0421_2 KEGG:ns NR:ns ## COG: MT0421_2 COG0280 # Protein_GI_number: 15839794 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Mycobacterium tuberculosis CDC1551 # 372 697 6 331 334 410 62.0 1e-114 MAHSLYITGTEGSSGKTVVTLGLMHFLQSQVRKVAFFRPIIDSEDEARRDSSINLILKHF ELDMLYRDTYACTYKEALELVTSGNMSLLIEKIFQKYKALENEYDFVLCQGTDFRDKDTA VQFELNSEIAASLNIPLALVINGKDKSLDAIQASVRSNLELLKDKRREVGCVFVNRVSFT TEDCPTCASTIIEGSGAFTPLFFISETPALCNPSVGEVQKWMNADVLFGKEGLNNLVHDY LIAAMQVGNFMNYLEQDLLIVTPGDRSDIILASLTSHLSSTYPNIAGILLTGGIDLPESM QKLMEGWTGIPVPILSVKGATYDTCQELLKLHGKISPEDYRKITVALDAFSEGVDKETLV NKIFNFRSDRVTPMMFEFNLAEQAQKHRMRIVLPEGEELRILRAAESLCERGIADIILLG DTDAIQEKIKKFGLKLQDATIIQPTASPRFNAYAQQYYEMRKSKGLTLEQAQERMQDSTY FGTMMVQIGDADGMVSGAVNTTAHTIRPAFEIIKTKPDTSIVSSVFFMCLKDRILVFGDC AVNPNPTASQLADIAISSAHTARVFGVEPRVAMLSYSTGSSGKGEAVDEVIEATKLAHER APELLLDGPIQYDAAIDPEVARTKAPTSPVAGHASVFIFPDLNTGNNTYKAVQRAANALA IGPVLQGLNKPVNDLSRGCTVPDIINTVMITAIQAQAEKGLITLK >gi|316923312|gb|ADCP01000071.1| GENE 43 36304 - 39900 4288 1198 aa, chain - ## HITS:1 COG:CAC2499_1 KEGG:ns NR:ns ## COG: CAC2499_1 COG0674 # Protein_GI_number: 15895764 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Clostridium acetobutylicum # 1 411 1 411 413 560 63.0 1e-159 MAKKMKTMDGNTATTHIAYALSDVACIYPITPSSVMGELADDWAAAGKKNLMGQTVTVRE LQSEAGAAGAVHGSLVAGAMTSTFTASQGLLLMIPNMYKIAGELLPTVFHVSARALSTHA LSIFGDQQDVMAARQTGFAFLCASSVQECMDIALVAHLATIESSVPFCHFFDGFRTSHEI QKIEVIDYEDIRKVVNWDKVNEFRQNAMNPEHPHQRGTAQNPDIYFQNTERRNLFYDAVP NVVVEAMKKVEGITGRKYRPFDYVGHPEADRIVVAMGSACETIEEVVNLLNAQGQRVGLV KVRLFRPFSTEHLLQVIPSTVKTITVLDRTKEPGCLGEPLYLDVCAAFMEKGKTPEILGG 
RYGLGSKEFTPAMAEAVFANMTTVGPKNHFSVGITDDVSYTSLEVGTDIDTVAPGTVQCK FFGLGADGTVGANKQAIKIIGDNTDLYAQAYFAYDSKKSGGFTVSHLRFGKSPIHSTYLV NQADYIACHKSAYVHQYDVLEGIKEGGVFVLNSEWNTVEDLSRELPAAMKRTLARKNIKF YNVDAVKVAMEIGLGGRINMIMQTAFFKLAEVIPFEQAVALLKDSIKKTYGRKGDKVVNM NLAAVDNAIAALTEIKIPAEWAEAVDEHKAEACSCGCGCGHNHEPAYITDVVRPILAQQG DKLPVSAFEPDGLVPLGTTAYEKRGVAVNIPEWISENCIQCCQCSFVCPHAAIRPVLATE EELSCAPDTFVTKDAIGKELKGLKFRIQVYAEDCLGCGSCAEVCPAKTKALVMKPLHTQI EAQVANLKFATECVEPKDNLVARDSLKGSQLQQPLLEFSGACAGCGETPYVKLITQLFGE RMLIANATGCSSIWGGSAPTVPYTTNKDGHGPAWGSSLFEDAAEYGFGMFSAVKHRREKL ADIVAEAAKLPGLPEGLVDALNGWLENKDDAEGSKKFGEEILAALSDAPEHELFEQIWDM NDLLTKKSVWVFGGDGWAYDIGYGGLDHVLASGEDINVLVMDTEVYSNTGGQASKSTPLG SVAKFAAAGKRTGKKDLGRMAMTYGYVYVASISMGANKQQTLKALKEAEAYKGPSLIIAY APCINQGIRKGMGKSMEEGKLAVDSGYWPLYRFNPELAEQGKAPLTLESKAPDGTLRDFL AGENRYAQLKSIAPQDSERLQNDLEKAYNDRYQLLKYMAAFSGSAEQPKAAEEPKVTE >gi|316923312|gb|ADCP01000071.1| GENE 44 40069 - 40263 99 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCTLGEKSPLAENTDLCIRVYCYVAAPALAFRPRTPRKATPSLALAMKPDTFVFSQGRCH STHF >gi|316923312|gb|ADCP01000071.1| GENE 45 40454 - 41446 550 330 aa, chain - ## HITS:1 COG:RSc2343 KEGG:ns NR:ns ## COG: RSc2343 COG2840 # Protein_GI_number: 17547062 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 195 310 105 217 221 75 37.0 1e-13 MNEEKPFEGENPFRKLDKKRFLSHDEKKKAKSLKKNGLISNNDGEVSSFYAEFVSPQDEG ETRAFLNAVSGVTKLGAPANAKREKRPSVTSLDNPVLGKQALWETDRSLSKTKKNNPTPE IAGKNAASSTPKKTGNETGEEDINFFAAMQDVTPLSGKGREVAAEAPVSIPPVQTPPNPL QEFIDGKLEFALAFTDEYVEGHVVGLDLMLVGKLQAGQFSPESHLDLHGMNAQQAFDALV GFFRAAYFKGQRTVLVVPGRGLNSPHGISILREKVQEWFTQEPLKRVILAFCTAKPSDGG AGALYVLLRKFRKGEGKIHWERKPVDPDLI >gi|316923312|gb|ADCP01000071.1| GENE 46 41633 - 42685 905 350 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126667548|ref|ZP_01738518.1| Ribosomal protein S7 [Marinobacter sp. 
ELB17] # 9 341 10 339 354 353 53 2e-96 MPLTRAQAYTEAGVNIDAGNSLVSRIKSIVSQTHIHGVLSDIGGFGGLFKPDMSGMAEPV LVASTDGVGTKLKLAFAFDKHDTVGIDLVAMSANDILVQGARPLLFLDYFATGKLDVDKT TQVIAGIAEGCKQAGCALLGGETAEMPDMYADGEYDLAGFCVGMADNAKIVDGSGIRVGD VLIGLASSGIHSNGYSLVRKILDKSGLGPDDTMPGSDRTVKDVLLEPTYIYSDVVRNLMR DLPVKGMVHITGGGFYDNIPRVLPNSVTADIKFASWDVQPVFHWLREEGGLTWPEMLQIF NCGIGYVFIVPSEVAEEAMGRLEAMHKGAWVIGTIGRRQDKDEEQVKILF >gi|316923312|gb|ADCP01000071.1| GENE 47 42786 - 43136 338 116 aa, chain - ## HITS:1 COG:no KEGG:Dde_2026 NR:ns ## KEGG: Dde_2026 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 1 109 1 109 111 105 45.0 6e-22 MRCYLVDELQPEVIANLEAQLTKNGCQSGIERLYWLPLEKERLLPVQREHESSCGPHCLA LEILDDAVRLEFLVRAKGRMRCECVCYLSPETERHMMDWLDARIMEAEIEQGLIQQ >gi|316923312|gb|ADCP01000071.1| GENE 48 43146 - 44489 1159 447 aa, chain - ## HITS:1 COG:aq_1279 KEGG:ns NR:ns ## COG: aq_1279 COG0373 # Protein_GI_number: 15606498 # Func_class: H Coenzyme transport and metabolism # Function: Glutamyl-tRNA reductase # Organism: Aquifex aeolicus # 4 361 3 350 406 298 46.0 2e-80 MEQQIYLIGLNYRTAGVEVRERFALTDHCSKDTWAIPLRPSGENDSAETTLDEALILSTC NRVEILAVGQGDVPGLILEAWASAKNRKAVELAPYVYIHKGREAIAHLFTVASSLDSMVL GEPQILGQLKGAYRKATLAGTTKTIINRLLHKSFSVAKRVRTETGIAASAVSISYAAVEL AKRIFGDMGQYQAMLIGAGEMAELAATHLIQAGISNVKVANRTYQRAKELAAEFKGEAVP FEHLFEHLADVDIIISSTGSPEAIIRARDIKDVLKRRKYRPMFFIDIAVPRDIDPDVNNV DNVYLYDIDDLKEVVEENLAHRRVEAEKAQGIVDEEVNAFAFWLESLELQPTIVDLLSHF ETHAQEELQRTLKRLGPVNPETREALEAMLHGLVRKLSHEPITFLKEAKSETASRNVELI RRMFGLDDEHPMGRCTRRVHRNESQLP >gi|316923312|gb|ADCP01000071.1| GENE 49 44491 - 45324 741 277 aa, chain - ## HITS:1 COG:MT0551 KEGG:ns NR:ns ## COG: MT0551 COG0755 # Protein_GI_number: 15839923 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Mycobacterium tuberculosis CDC1551 # 69 272 113 320 324 94 32.0 2e-19 
MILPELLPVLAIHLYALGTAAVVAGILARNEWLKRAALVLTVLAFTTHTLLLIATFMDDG FTGLTRSVYVQLLAWCITLSGLVAWLRWRYEALLLTVAPFSLLTFLIALLLRHAETPLPP VLSGMTFTIHIAAIFISIGLMALAFGAGVLFLIQAKSIKSKSKLAGFQKDLPALSALDKI NAFTTTVGFPLFTAGVLFGFISARINWGTILSGDPKEFISLVVWGLYAWLFHQRFTQGWQ GRKPAILAIWIFTVCAFSLIVVNLFMTTHHSFLTTPR >gi|316923312|gb|ADCP01000071.1| GENE 50 45311 - 45988 399 225 aa, chain - ## HITS:1 COG:aq_1237 KEGG:ns NR:ns ## COG: aq_1237 COG1648 # Protein_GI_number: 15606466 # Func_class: H Coenzyme transport and metabolism # Function: Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) # Organism: Aquifex aeolicus # 3 164 4 163 187 108 35.0 5e-24 MHYYPVFLELSGQSCLVVGAGAVGCRKIASLLECPIERLHIIDLAEPDTNLQVLLEDKRV SFTKRPFTPSDVEGCALVFAATSNRQTNDAVARACTERGILCNCADAPKDSSFIVPALVE QGNIAIALSTGGASPALARKIREDLEAWLGERYTGISELLMRLRPLVLALHHETRQNTTL FRSIVDSPLSEALQRRDRQTCESLLRELLPIELHPYIMELLHDLA >gi|316923312|gb|ADCP01000071.1| GENE 51 46249 - 46560 240 103 aa, chain + ## HITS:1 COG:MA3570 KEGG:ns NR:ns ## COG: MA3570 COG0454 # Protein_GI_number: 20092376 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Methanosarcina acetivorans str.C2A # 1 102 40 139 140 78 41.0 3e-15 MRTLYLPLSEIVVDEDRATGEVVAFMAFVEDYLAALFVAPAHQKRGVGSRLLALAKKMRG TLDLSVYAENERAVAFYQKNGFRITGERIEEVTGRTELLMAFP >gi|316923312|gb|ADCP01000071.1| GENE 52 46633 - 47310 928 225 aa, chain - ## HITS:1 COG:AF0676 KEGG:ns NR:ns ## COG: AF0676 COG0563 # Protein_GI_number: 11498284 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Archaeoglobus fulgidus # 1 177 1 175 217 138 43.0 8e-33 MNILMFGPNGSGKGTQGALIKKAFGLAHIESGAIFREHVGGGTELGKKAKEYMDRGELVP DSITIPMVLETLKTKGADGWLLDGFPRNIAQAEQLHKAMQEQDMKLDYVVEIDLPRNIAR DRIMGRRVCKNDGNHPNNIFIDAIKPNGDACRVCGGELTARADDQDEAAINKRHDIYYDD KTGTLAAAHYFKALAEKGETKYIVLNGQGTIDSIKETLIKELGLN >gi|316923312|gb|ADCP01000071.1| GENE 53 47419 - 47880 213 153 aa, chain - ## HITS:1 
COG:no KEGG:DvMF_3041 NR:ns ## KEGG: DvMF_3041 # Name: not_defined # Def: protein of unknown function UPF0153 # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 6 144 28 166 175 183 61.0 1e-45 MKKTVFDCQMCGQCCEGEGGIVLSPKDLKRLYEGLNLEKQAFLDAYGVFRNGKWQVRTGE DGNCIFFRAGQSCTVHAIKPDVCRAWPFFRGNMADAESLHLAKAFCPGIRPDATHEEFVE EGQSYLEENGLVASDPEHEGHALLPVSKTDSSF >gi|316923312|gb|ADCP01000071.1| GENE 54 47885 - 48310 274 141 aa, chain - ## HITS:1 COG:SP0033 KEGG:ns NR:ns ## COG: SP0033 COG1832 # Protein_GI_number: 15899979 # Func_class: R General function prediction only # Function: Predicted CoA-binding protein # Organism: Streptococcus pneumoniae TIGR4 # 1 139 1 142 145 106 42.0 1e-23 MLQTFLSDGRMREFLTQARTIAVVGAKDKEGQPVDRVGKYLIQAGYQVIPVHPVRKDVWG LQTYPSLAEVPFPVDIVNVFRAPQYCPDHARETVALSPLPQLFWMQQGIVSPEAASIAGK AGIAVVEDLCIMVEHKRLLGN >gi|316923312|gb|ADCP01000071.1| GENE 55 48312 - 48611 96 99 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDLTKRVQKSLVGYANKGNEKVLGVYSSSRRARKELLFSELPQAFLKHVMAHSLLKLCKF IIPNILPKQTAKNISPDYKEIFQILCCNGRQNKENSKDQ >gi|316923312|gb|ADCP01000071.1| GENE 56 48673 - 49893 640 406 aa, chain + ## HITS:1 COG:MA4232 KEGG:ns NR:ns ## COG: MA4232 COG0006 # Protein_GI_number: 20093022 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Methanosarcina acetivorans str.C2A # 20 403 9 385 388 246 37.0 6e-65 MSSPDLLPLGEIEFRYARCRRLLRTLVPDAGGMLVTSRLGIYYLTGTLGWGLVWLPVEGE PVLLLRKGVERAQLESPLRHILPFKSYKEITSLCAGCGSPLSSAIAVDKNGFNWSMAEML QSRMEGVRFTSCDAVLAQARAVKSEWELEKLRRCGALHAKVLDEMLPERIHPGMSEFDVA RTYVEAVFACGGSGMLRMNAPGEENFFGYASADTSGIYPTYYNGPLGCKGMCPAIPFMGN AERLWQKRALLSIDMGFNVEGYNTDRTQVYWSGAADTIPGPIRRAHAVCVEIFEQTAAAL KPGAVPAEIWAEACAVAEREGQMDGFMGLGRDKVPFLGHGIGLTVDETPVFAKGFTAPLE QGMVVAIEPKIGIPGSGMVGLEHTLEITASGARPLTGTSKQIILIG >gi|316923312|gb|ADCP01000071.1| GENE 57 49867 - 50010 85 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGKGLHKSLPKGKRIVEANNPASERHHNHIGRTRGMKGKSTDQDDLL >gi|316923312|gb|ADCP01000071.1| GENE 58 50195 - 53629 1794 1144 
aa, chain + ## HITS:1 COG:BS_mfd KEGG:ns NR:ns ## COG: BS_mfd COG1197 # Protein_GI_number: 16077123 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Bacillus subtilis # 86 1092 95 1114 1177 649 36.0 0 MTEIARFKAGRIHVARAGLATQARLCVDMLQQGRGMAWVVKNRDELLTARALMRLFSPEL SSGDFLPDAFDKGGWTSLPPFSPRSANREGWTERLSALYALRSGQAKGLLLTADNLLPKL VPLDFFENREVTLIRGEDVSLDMILEQAVEWGYERVSMVSNPGEIARRGDILDIMPPGYD KPIRLDFFGDLIEEIRLFDPVSQRSKANLSEVRVLPVSPIRQSPADQAKMQARWKSLFQK NMLTENERFGLTRLQEQGDFRLLPGCCYDTATNLEAWLPADTVWVLPGLSDFRDMLSSAE RLWTEAFEAQDVEGRPQPQRLAMREASDVESACERFACVHAEPLVMGIAKGIHDLPERSL SSFEDLFPSTAEQDRPWQTLVQALKRWQNDKRQTILCFASEKSRTKFLNLASQDGLSPLL RYSGEQHGLFALVAPFRAGAELIWDQSLILGEDLLQPHAEKSRRVPTGAFKGLDRYDELK PGDLLVHRDYGIARFGGLLRMETGSTANDFLLLHYSGDDRLYVPVDRLSLIQRFKGGGEA QPSLDRLGGGAWQSGKEKARKAIEKIAEDLIEMYAWRKVAKGFRYPPLGELYREFEASFG FEETPDQAKAIQDVLADMEKPEPMDRLVCGDVGFGKTEVALRAAFRAASEGRQVALLCPT TVLAEQHYQTFRSRLSGFALNVGLLSRFVSSAKQKEVLKAAASGQIDILIGTHRLLSDDV VLPNLGLLILDEEQRFGVRHKEKLKKIKKNVDALTLTATPIPRTLQLSMSGVRELSVIET APPERKPVSTALLERDKATLRQVIERELAREGQVFWVHNRVQGLERVVEFVKELAPNARI GMAHGQLPEKKLEETMHAFWHGELDILVCTAIVESGLDFPRANTLIVDQAHLFGLGQLYQ LRGRVGRSDRQAFACFVVSDLERLPAATKERLRIILDMDYLGAGFQVAMEDLRLRGAGNI LGEVQSGHMGRVGLELYLEMLEQAVNKIKNGGVSLQIETELNLGLTAHIPEDYITDGRER LRWYKRLSAAPDAQARQELELELRDRFGILPQPLEIFMAVLALKQFLSGAQALKADVYED RLRVLWDEKQNAIAPEKLVPFLSAQKGNAKLIPPSSLELKLDMQLPTPRRLDAARLALGT LLTD >gi|316923312|gb|ADCP01000071.1| GENE 59 53852 - 54778 772 308 aa, chain + ## HITS:1 COG:CAC3215 KEGG:ns NR:ns ## COG: CAC3215 COG0760 # Protein_GI_number: 15896462 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Clostridium acetobutylicum # 14 226 53 266 353 66 25.0 5e-11 MRIVSTAFLLFFLITSPVRAAINGVAAVVNGEMITAFDLQAETAPEAMRRGLNPKDPNQA AAIEELTKATLERMINNIILTQEAQRLKISVGDSEVDNEIQQIMSRNKMTPEEFQRQLQI 
QRTTEKDFRERIRSSILRNRLLANMVGRKVIVTKEEIADYYNQHKQTFMNNQKVRFAVIV YPPTENAEAQAARIRSGKLSFEQAARQVSVGPRAQEGGDVGVVDWNSLDPTWQDRLSQLK PGDVSGLFEVNNGLKGQLKLLSMESGDGQTLEEATPQIERILREPKLQERFREYSEQLRK RAVVEIRQ >gi|316923312|gb|ADCP01000071.1| GENE 60 54818 - 55687 327 289 aa, chain + ## HITS:1 COG:alr1052 KEGG:ns NR:ns ## COG: alr1052 COG1426 # Protein_GI_number: 17228547 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 10 283 12 285 292 74 25.0 2e-13 MEETETVLTLADFGALLRERRIHKGLTEENVAAELKITSRLVKAIEEGDMESMPHAVYAR GFIRAYAKLLAVDDSVTHAACALLKDPEEELREQEIRVVPAAARREESHVPWLAILLCAL FLAGGAWYFRDSIPGLSSNTVSREKTASPAVSSPAVPEADTVVTTPAIPVEPAANATLPA ASEEPIVLGGTPLPAETETPTEQAALPATGHRMLLTAQGDCWVSTTADGKNSQRVMHKGD TLNVDFQEKLVMKLGNAGAMLITYDGKELPPVGKIGQVKTVTFPNDAQN >gi|316923312|gb|ADCP01000071.1| GENE 61 55691 - 56461 266 256 aa, chain + ## HITS:1 COG:no KEGG:LI0301 NR:ns ## KEGG: LI0301 # Name: recO # Def: hypothetical protein # Organism: L.intracellularis # Pathway: Homologous recombination [PATH:lip03440] # 1 256 1 249 249 266 46.0 8e-70 MEWSDTALVLGVGRFRESDLWLRMLTRRHGIVSAFAFGGSRSRKRFCGCLDLFNELQIST KTTRNGMYLSLQEGNLIRGPRRLRTDWNRLGMFMNCVRFVEALGVPQDGAAGVFLLLKDT LELLEQSETVQDILPILFRLRLASQQGYAPALTACVSCGKKDFDHAGFLVSEGTIVCPDC VSGRGNIVEISGKSLDVLRQVQEVSPLHWHFLTSGSSEEAGGVLLPAERRECARAVDGFV QYHLGLTWDRGRFRRM >gi|316923312|gb|ADCP01000071.1| GENE 62 56495 - 57433 770 312 aa, chain + ## HITS:1 COG:all1985 KEGG:ns NR:ns ## COG: all1985 COG0752 # Protein_GI_number: 17229477 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, alpha subunit # Organism: Nostoc sp. 
PCC 7120 # 22 308 1 287 294 417 64.0 1e-116 MAVFSEGNTPSGECRYCHGGSMYFQDVILTLQRYWASRGCVVAQPMDIECGAGTFNPSTF LRVIGPEPWSVAYVEPSRRPTDGRYGENPNRLQHYFQFQVILKPSPDDVQDLYLGSLRAL GIHQEEHDIRFVEDDWESPTLGAWGLGWEVWLNGMEVSQFTYFQQVGGIDLSPVSAELTY GLERLTMYLQGKDSVYDIQWNEHVTYGQIYHQNEVEQSRYNFDESDAAMHREHFDQFEAE CLRLTGEGLLWPAYDYCLKCSHTFNLLDARGAISITERTGYIGRVRTLASSVAHLYAEQR KEMGYPMLGGNR >gi|316923312|gb|ADCP01000071.1| GENE 63 57433 - 59520 1311 695 aa, chain + ## HITS:1 COG:HI0924 KEGG:ns NR:ns ## COG: HI0924 COG0751 # Protein_GI_number: 16272861 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, beta subunit # Organism: Haemophilus influenzae # 4 694 6 688 688 447 39.0 1e-125 MSTFVLEIGTEELPSRFLSNVEKELADRFTAGLKEAGYTFSLLDTCATPRRSVVIVEGLE DTQPVREELVMGPPARIAFDAEGNPTKAAQGFAKTLGIDVSALGRTQTEKGEYLSGMKKT GGVPTLDVLAGLCPAIIAALPFAKRMRWGDGEFAFARPIRWVLALLGDVVVPFEVGKVAS GRVTQGHRVLGPGPFTIKTAGDYVPTIVKEGAVQLKAAERRAYIVSEGNRIAEAVNGKII WIDGLLDEVQGLVECPVPLLGDFDPSFLELPREVLLTSMQEHQKSFGVEDASGNLMPHFL TVLNLHPKDLSLVKKGWERVLRARLEDGRFFWKTDLEATFDEWLEALDAVTFLAPLGSMG EKTRRISALCRWLAEKVQQDPEQAARAGRLSKADLVSAMVGEFDTLQGIMGGIYARKKGE TEAVAAALAEQYLPSGPDSPVPATELGSILSIADKVDTLVGCFGLGMIPTGAADPYALRR CALGITRIMLERGYRFDVKELFEEAQRLYGDRKWKLAPAEAIAKLNDFFIARVKNYFLTQ GKETLLVEAVTAVDPDNVWALGRRLGALESMSRQDDFPQAAQTFKRVANIIRKQGHEAGV DLQGTWKRELLQEPAEMALAEALEKMFAAFEAAWANDDFEALFSMLAALRPAVDELFDKV MVMCEDPALRANRLNMLKALTLRMERLADFSALQL >gi|316923312|gb|ADCP01000071.1| GENE 64 59716 - 59988 299 90 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|218885396|ref|YP_002434717.1| 30S ribosomal protein S20 [Desulfovibrio vulgaris str. 
'Miyazaki F'] # 1 87 1 87 87 119 71 4e-26 MANHKSAIKRHKQSVKRAARNRTVKTRIKNVVKSVRAAIKADDQAAATQLLSTATSSLDK AATKGVIHWKKAARKISRLAKALNAAKAAA >gi|316923312|gb|ADCP01000071.1| GENE 65 60063 - 61622 1149 519 aa, chain - ## HITS:1 COG:FN0268 KEGG:ns NR:ns ## COG: FN0268 COG0497 # Protein_GI_number: 19703613 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Fusobacterium nucleatum # 1 514 6 537 558 224 31.0 3e-58 MLEYLHIRDLALISDMELEFASGMNVLTGETGAGKSFILKALNFLMGERLGADMVRPGKE RAQVEALFMRSGEELIVRRELIAETGRSRLFINDTLASQETIKELRTTLLVHTSQHGQQK LLQPNFQAKLIDDWMNQPDLLAARNTLLKELKEAAERKEALRTRFRELADRRELLEMHLT EIEKVSPAEGEEEQLEAMRAEWRGTEQLRRHYERAQMILRGEETGLLEQIGHLERALELL AGDDEDMAAYLESAVSFRQTIAELERKLRRPPLPQADIDPEQIEARLFELAQLKRKLHRT LPEILALREEIQDNLSFLDACTLDIRQVEKREKQLQEQLAGVLALLNPERRKAGESFTRK LESELQGLGFSQYVRVSADWSEVALFPGCVEDRVRLLWSPNPGQQPQPLDKIASGGELSR FMLAVVSVQEHDEEATLIFDEVDAGVGGLTLNKVAERLEALAQRRQMLLITHWPQLASRA FRHFRVTKQVKGGKTFTQCIRLDERERKTELARMAGIES >gi|316923312|gb|ADCP01000071.1| GENE 66 61753 - 62616 715 287 aa, chain - ## HITS:1 COG:PAB0093 KEGG:ns NR:ns ## COG: PAB0093 COG1173 # Protein_GI_number: 14520362 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Pyrococcus abyssi # 27 283 27 283 287 240 49.0 2e-63 MNADKTLLTPLCKRLGATLSRRGMLFVGLGIVIIMSAAALLAPWLSPYDPTALHLKSILV PPSAQFPLGTDALGRDVLSRLLYGARVSLWVGFVSVSISIAIGIALGLLAGYFRGIVDEI IMRGVDVMLCFPSFFLILAVIAFLEPSLNTIMVVIGLTSWMGVARLVRAETLSLRERDFV AAARLAGTRPFRIMVMHILPNALTPVLVSATLGIAGAILVESSLSFLGLGVQPPDPSWGN MLMEGKDVLEIAPWLSLYPGLAILITVLGYNLLGESLRDTLDPRLKR >gi|316923312|gb|ADCP01000071.1| GENE 67 62613 - 63623 653 336 aa, chain - ## HITS:1 COG:BS_appB KEGG:ns NR:ns ## COG: BS_appB COG0601 # Protein_GI_number: 16078204 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type 
dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 14 329 6 315 317 242 42.0 7e-64 MFTNETTSFILRCVRKTIWMLLVLWGITLVSFFVIHLAPGTPTDMQTTLNPLAGEAARAR LEALYGIDRPLYIQYFDWLSRIVQLDFGNSMSSDARPVIEKIAERLPLTIGINLISLLLT LLIAVPIGVLSAWKQGSLFDKGMTVLVFLGFAMPGFWLALLLMLYTGLEHQWLPISGLMS LDYESLSFWEKLMDLGRHLVLPITVITVGSVAGMSRFMRSSMLEVLRQDFILTAKAKGLP ARVVIFRHALRNALLPVITLLGLSVPGLIGGSVIIETIFALPGLGQLFYAAVMARDYPLI MGNLVLGAVLTLAGNMLADIGYGLADPRIRAQGKRQ >gi|316923312|gb|ADCP01000071.1| GENE 68 63646 - 64170 447 174 aa, chain - ## HITS:1 COG:no KEGG:LI0252 NR:ns ## KEGG: LI0252 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 170 1 169 173 215 62.0 5e-55 MNEPYLGQTTDMVLGQEFLTWLWFRCETQPMGFKDKEGVPFSMNLEQRIVVQGGEGDSLE TASVSGSLSQLREVRLGLRTGKKVTRALVRFEREELAWQTTIKAEDFSLGSFKTPKVERE EDDDPDAAFLEKMYLMELCLGLFDACYKQFLDIRLSSLWDKEVQDMSAWMSQQA >gi|316923312|gb|ADCP01000071.1| GENE 69 64181 - 64795 208 204 aa, chain - ## HITS:1 COG:no KEGG:DvMF_0918 NR:ns ## KEGG: DvMF_0918 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 203 1 203 205 286 68.0 4e-76 MGILSASCSFTRFKITEPVPKELWMEIPAKLRQFAFQDIDDIAEERGWGWTSFEDMLDMQ WRSAPPEKGAYLAFALRLDTRRIPPAVLKKYVTIALREEEGRIKEQGKKFIARDRKNELR DQVKLRLMGRFLPIPAVFDVAWATDTNIIYLASTQTKLIELFMNHFTLTFDLHLEPMTPY ALAASMLDEKTLARLDDLEPTSFV >gi|316923312|gb|ADCP01000071.1| GENE 70 65335 - 66144 284 269 aa, chain - ## HITS:1 COG:no KEGG:DVU1356 NR:ns ## KEGG: DVU1356 # Name: not_defined # Def: HD domain-containing protein # Organism: D.vulgaris # Pathway: not_defined # 15 262 7 258 267 229 48.0 6e-59 MMHTYENNTPQAWDDIRNHEALFESFASMYLREHPGDMLRLKREHTYKVLAHARAIVAQE GLASQEGRAALLAALYHDTGRFPQYVRWRTFSDAESENHGYLGVHVVKKEHFLTGEPPNI HKWVLTAIALHNRYALPALPEPYLTITHAVRDADKLDIMRIMAQHLSRPIPTRDVVLRVQ DAPKLWSQSIVDTVLSGGIPSYHDLRYVNDFRILLGSWIHDLHFTSSKKTCVASGFLQEV LKGLPSTPELQPVTAYLLNEFSTVRNLCG >gi|316923312|gb|ADCP01000071.1| GENE 71 66297 - 69824 2941 1175 aa, 
chain + ## HITS:1 COG:VC2245 KEGG:ns NR:ns ## COG: VC2245 COG0587 # Protein_GI_number: 15642243 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Vibrio cholerae # 3 1157 10 1142 1164 879 42.0 0 MSEFVHLHCHTEYSLLDGAIRLNDLCARAKDFGMPAAAITDHGNLYGAAYFYTTCKSYGI KPVIGCEVYVTKDHRDKDSEFSRVRHHLILLAKNNEGYHNLVRLVSKGFLDGFHYKPRVD KALLRQHSEGLIALSACLAGEIPRTLMGANKLITNGGTFDDAIRQTQEYMDIYDGRFYLE VQANTLPDQATLNNKLLELAEHTKVPLVATNDCHYLNADDVEAHDVLLCIQTQAKVNDAK RMRFEARDLYYKSAEEMEQAFKHIPEALSNTVRIAEECHVEMDFTHHYFPVYELPEGMTL STEFQRLAREGLKQRLELHPDRDTIDPKIYWDRLEMELKVICEMGFPGYFLIVQDFINWA KGNDIPVGPGRGSAAGSIVAWALRITNLDPLPYNLIFERFLNIERVSLPDIDVDFCEDKR TRVIQYVSQKYGIDSVSQITTFGKMKAKAVVRDVGRALDMSFKETDRIAKLIPDDLKMTI KKALDAEPELATLYKEDQIIRKLIDISMRLEGLSRHASTHAAGVVVSDKPMDEYLPIYRG KKGELVTQFDMKMVEKVGLVKFDFLGLRTMTLIDNTLKAIEEQGKKAPNLDILPLTDPDT YDVFSRGDTDGVFQVESSGMRQYLRMLRPNCFEDIIAMLALYRPGPLGSGMVDEFIKRKH GEVDVTYPLPSLEGCLKDTYGVIVYQEQVMQIAQIVAGYTLGGADLLRRAMGKKNAEAMA KERTLFVDGAVKNGTSKEKATEIFDLMEKFAEYGFNKSHSAAYALISYHTAYLKTHHKVE FMAALLTSEIGNQDKILKYIAACKDNDIEVRQPDVQVSRREFIVRDEAVVYGLGGIKNVG DEAIREIVAAREKDGPFLSFLDLCIRVSLRKVTKRVLESLIKGGALDCFGCSRAAMVAAI DPVVARAQKKIKEKQSNQISLLTLSPKKIEENQASGIGFSCEEESISEWDEDQKLRFEKE ALGFFLTSHPLQPYRHELNRLDLRPLEDCREMADKATIKCAVLVTSIREILNKRGNRMAF VAVEDLTASGEVTFFTEELNASRDLLNSEQPLLLTATIDNRESSSYSPDSDDDSDDEAPV KEIKLRGVSVQALNDACSASDAPISWELDPRRLNAEGMESLKAILERHKGNTEMQLAFCL DGTFCRVRLGPQWMVTPGPAFQQDMHRWCSANPIQ >gi|316923312|gb|ADCP01000071.1| GENE 72 69847 - 70233 231 128 aa, chain + ## HITS:1 COG:MA0956 KEGG:ns NR:ns ## COG: MA0956 COG0720 # Protein_GI_number: 20089834 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Methanosarcina acetivorans str.C2A # 6 125 2 122 123 85 42.0 2e-17 MSRPFWKLTVRSEFCAAHALRHYQGKCEHLHGHNYAVEVVVKGYTLSQNTELLMDFGDLK ALLKQALEPLDHAYINDVPPFDAVNPSSENLARYIWNQMAPSLPETVKMYSVTVAEKGIQ SATYMEEE >gi|316923312|gb|ADCP01000071.1| GENE 73 70626 - 70838 385 70 aa, chain + 
## HITS:0 COG:no KEGG:no NR:no MESSTLKKKAKDAFDKTTDYLSEAKDSVTENVGDFVDEAKDNLSKVKDKAKEGINAAGEA VQKAYEKAKK >gi|316923312|gb|ADCP01000071.1| GENE 74 71011 - 71451 444 146 aa, chain - ## HITS:1 COG:ECs3323 KEGG:ns NR:ns ## COG: ECs3323 COG4917 # Protein_GI_number: 15832577 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli O157:H7 # 1 143 1 142 159 148 50.0 3e-36 MKRYVLLGTVGAGKSTLYAALHGIACDEAKKTQAMQYDLDGGVDTPGEFFCHPMYYPALL STTVDTDVLIYVHPANDPLCRLPAGFLNIYTQREVICAITKVDLPDADFEATKAMLMDHG VSGPFFPLGKDRPELLQELVDWLQKE >gi|316923312|gb|ADCP01000071.1| GENE 75 71456 - 71794 205 112 aa, chain - ## HITS:1 COG:ECs3324 KEGG:ns NR:ns ## COG: ECs3324 COG4810 # Protein_GI_number: 15832578 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli O157:H7 # 3 112 26 135 135 143 68.0 7e-35 MEEKARIIQELVPGKQVTIAHLIAHPDEDLAKKMGVPGAEAVGILTLSPREAAMIAGDFA TKAASVQIGFVDRFTGALVIQGSVASVEEALRYTVKALHDVLGFAACETTKS >gi|316923312|gb|ADCP01000071.1| GENE 76 71988 - 72695 135 235 aa, chain - ## HITS:1 COG:no KEGG:Dvul_1842 NR:ns ## KEGG: Dvul_1842 # Name: not_defined # Def: phosphoesterase, PA-phosphatase related # Organism: D.vulgaris_DP4 # Pathway: not_defined # 26 209 11 195 204 111 36.0 3e-23 MNPISYPIRILLAPSPDYRWGVHLCLLAPLGAIMAFVILCLGWWGEPVRQFFSPLATADI HITRFMSVVTHIGNPLIYVFYLVLLLVALTKRDWRETMFVLRCVLFSVFFLCFVMQIMKY GLGMPRPGFPWPPYPLGFINQYASFPSGHTATIITAAVPLALWFRNRPFSLFLSAVIALM GFSRVWLGQHHPIDIVGGMLLGSIAARCIACERLPMGKEEQQALGEQKIREETGK >gi|316923312|gb|ADCP01000071.1| GENE 77 72988 - 74727 1257 579 aa, chain + ## HITS:1 COG:PH1019 KEGG:ns NR:ns ## COG: PH1019 COG2414 # Protein_GI_number: 14590859 # Func_class: C Energy production and conversion # Function: Aldehyde:ferredoxin oxidoreductase # Organism: Pyrococcus horikoshii # 3 557 7 560 607 262 35.0 1e-69 MEKVLRINVGAENGPVATIEELGAYAGLGGRAMTTTVVCNEVPPNCHPLGPENKLVFSPG 
LMAGSAAASSGRISVGCKSPLTGTIKESNSGGTGGQFMARLGYAAIIIEGERKTDDLWKI VITKDAVMFEKCNEYRMFSNYPLVAALQEKYGMEPGYVTIGTAGEMLMCNSTICFTDPEG RATRHAGRGGVGAVMGSKGIKAIVLDPAGTPVIRKPKNAEAFKAASRAFAQGLSSHPVCG TGLPTYGTNVLVNILNEAGGLPTKNFSVGRFEGADKICAETMVEIQKRRGGNPTHGCHRG CIMKCSGICVDENGEFVSKQPEYETVWSHGAHCGVDDLDKIILCDRLEDDYGLDTIETGA ALGVLMEAGALKWGDIDGIIAMIHEIGKGTPMGRILGAGTATTARCFGIERAPVVKGQAM PAYDPRAVKGQGVTYATTTMGADHTAGYAVATNILGCGGKTDPLSAEGQAEISRNLQIAT AAIDATGYCLFTAFALLDQPETMQALVDTINAMYDLNMSLDDVTELGKDILRKERAFNAA AGFTKAHDRLPLYFSRESVAPHNVRWDVSDEDLDSVFNF >gi|316923312|gb|ADCP01000071.1| GENE 78 74809 - 75549 480 246 aa, chain + ## HITS:1 COG:MA0255 KEGG:ns NR:ns ## COG: MA0255 COG0476 # Protein_GI_number: 20089153 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Methanosarcina acetivorans str.C2A # 20 244 7 230 247 110 32.0 3e-24 MKDAERTNRDTPAALDAIPERYWRNARFLSRTDQEKLLQAHVVIIGLGGLGGTVLEELLR LGIGTITGVDMDVFELSNLNRQLLATEENIGLHKAEAARLRARQINSGVRFIPVTEKQDF GSMCDLFRDADVVVDALGGLADRQALEKAAAEKGVPVVSAGIAGLTGWCKAIFPGETGPA SLLAGEGDRPTPDEILGNLAPTVFLAASLQAALVLQILTGKPVRREALFFDLEDGTFTQV VLDRRP >gi|316923312|gb|ADCP01000071.1| GENE 79 75549 - 75794 158 81 aa, chain + ## HITS:1 COG:no KEGG:Dde_2458 NR:ns ## KEGG: Dde_2458 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 1 68 1 69 74 72 50.0 5e-12 MKLTVKCFATLMPLTPPGGSLEFHGTDICDLLRQLSIPESEARIIFVNGIGVEKDALLHD GDRVGIFPPWEEGDFRLCTIR >gi|316923312|gb|ADCP01000071.1| GENE 80 76061 - 76789 391 242 aa, chain + ## HITS:1 COG:no KEGG:LI0137 NR:ns ## KEGG: LI0137 # Name: not_defined # Def: putative integral membrane protein # Organism: L.intracellularis # Pathway: not_defined # 8 242 10 262 262 270 57.0 4e-71 MESSTYWLDLIAETYIGICVLCCLFIISKQRRRPAPMMRVMLLVWPIITLWAGPLGIWAY ETSNRRMPSHDGDGGARHDMSDMHMEMPMQPMHSHTPHWKSVMTGTLHCGAGCTLADLAG PFLFRMAPFVLFGSSLYGEWAVDYVLALIIGVFFQYAGLASMSHDRGLSLWFRAFKVDFL 
SLTAWQVGMYGWMAIAVFLLVGPMSPDQPVFWLMMQISMVCGFITAYPMNWWLIRIGIKS AM >gi|316923312|gb|ADCP01000071.1| GENE 81 77066 - 77377 291 103 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDVFDLADNLRAEFQEKGVSNEEFLLKIAETYDIKRVFVSSVADELFDKIPDKRIAEVPE VGMDEAKHLWFAFGIGKILLRDRGLEPSNFDCMQFSNRLLQMK >gi|316923312|gb|ADCP01000071.1| GENE 82 77502 - 78413 444 303 aa, chain + ## HITS:1 COG:yfcH KEGG:ns NR:ns ## COG: yfcH COG1090 # Protein_GI_number: 16130239 # Func_class: R General function prediction only # Function: Predicted nucleoside-diphosphate sugar epimerase # Organism: Escherichia coli K12 # 1 301 1 292 297 221 45.0 9e-58 MRVIILGGSGFIGRALTQALISRGDEVVVPTRHVPAATPSTGPEYQLWDGQDHSSLSQLL NGADAVINLLGENIAAKRWSSVQKERIISSRLLAGQALVIALQMPIVKPKVLIQASAIGY YGFWPENASAPDCTEDSPAGSGFLATTTIQWERSTQQAEHLGLRRCIIRTAPVLGLGGGM LAKLLPVFQLGLGGPVGTGRQPFAWIHLDDEVAAILFLLDHEELSGPFNLVAPEHSTMND FVQSLGKVLKRPVWLPVPSPLLRLGFGDMADELLLAGQKASPVRLLEHGFVFRHPTLDSA LFA >gi|316923312|gb|ADCP01000071.1| GENE 83 78598 - 78822 265 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVNLWKQVRFIEIVDPGNQFVKNLMFDGVYSSDFALLSEDILDWFSAINERYASVYSDTE HHRAVKATIDQMYA >gi|316923312|gb|ADCP01000071.1| GENE 84 78868 - 79329 -176 153 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFFSILCRKIASKKTFAPPKRCPLAPPFIYRSHAAWYERKRMDRTRPGPFFRKKTACIID TSGFWLHTAAATASAPARAASAQGFGRHWPDRAHLERLRPVFQVCGRKQPFGVFTSAIGA RRTGFIGADHVFKLARAGRAIKIVHRHKGITPF >gi|316923312|gb|ADCP01000071.1| GENE 85 79527 - 80183 426 218 aa, chain + ## HITS:1 COG:no KEGG:Dvul_0437 NR:ns ## KEGG: Dvul_0437 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 55 218 37 201 201 112 37.0 1e-23 MSLPQRIPYGFSACLILALLLFFLAEAHARSASGEERGSRTSVELKTSDGSLSNGQTPKL ELRTFTIEMLNDGVHVSTGIGLENGGVVRSQLRDGAVMTLTCKLALERVRTLLSNEVISE ESRSYQLRHDLLSREFILSSPGHPIVRQKQFDTLLASAFQHLDFLLPLQAPLVSGETYRV QLKITLEHAEVPPWLEKALFFWSWEVTPPLSFSQDFIF >gi|316923312|gb|ADCP01000071.1| GENE 86 80199 - 82388 1311 729 aa, chain + ## HITS:1 COG:mlr0399 KEGG:ns NR:ns ## COG: mlr0399 COG5000 # 
Protein_GI_number: 13470633 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation # Organism: Mesorhizobium loti # 50 708 37 690 738 279 29.0 1e-74 MTEQEKVVQIGVPDERERRRRRYELLAAFLLLLLVLGGTWTQLTLYGVDSWMFIALLNIN SIFMLIILFLVARNVVKLIMERRRKVFGAQIRTRLVVVFVSLSLIPTVIMFLASNRVVAT SVDYWFTRQTETSLQAALDVGQSFYAAAAERLHSRSEAIAQEAAQRRIAWTASSANTLLQ TKQKEYGLILVGFVTPQEKELYWHAPKAFTEGWQEARTRIDWTHVARATFGSLLWATDDA DYVIGVLAIDGGKTGYLITAESIGQGLLAKLERISKGFEEYAQLKQLKKPLKVSFLLILG VLGMITIFGSVWFGFRLSKEFTAPILALAQGTTRIAQGDLDFRLEDKGADELGLLVQSFN HMAKDLQEGRMSLTHANAMLAEHNRYIETVLDNITTGVITLDADGRIRTMNKAASSIFDA QPERLEGRNPAEFLPRSYSNIFVSMLETLREHPEQNWKRQEDFLLGDRSWKLILHAVALS GPGGIRAYVIVVEDITELEKMQRMAAWREVAKRIAHEIKNPLTPIKLSAQRLDRKFGKQI EDPAFGQCTDLIVKQVERLQEMVQEFSSFAKLPEVQLSPDDVAPLLDELTTLFRTSHSSV MWDLHLPERLPKIALDPAALHRALLNIFTNAAEALDTLPPEATKRVRITAVHDRGRGSLR LIISDNGPGLQPEERERMFEPYFSRKKGGTGLGLAIVKSIIADHRGTIRAVAAQGGGTSI VLEFPVMRA Prediction of potential genes in microbial genomes Time: Fri May 13 03:07:18 2011 Seq name: gi|316923264|gb|ADCP01000072.1| Bilophila wadsworthia 3_1_6 cont1.72, whole genome shotgun sequence Length of sequence - 55020 bp Number of predicted genes - 52, with homology - 43 Number of transcription units - 24, operones - 10 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 44 - 400 148 ## + Prom 950 - 1009 2.1 2 2 Tu 1 . + CDS 1044 - 2609 914 ## COG1757 Na+/H+ antiporter + Term 2637 - 2674 6.0 + Prom 2665 - 2724 4.5 3 3 Op 1 . + CDS 2850 - 3098 248 ## COG3077 DNA-damage-inducible protein J 4 3 Op 2 . + CDS 3085 - 4875 1436 ## COG1757 Na+/H+ antiporter 5 3 Op 3 . + CDS 4947 - 6116 937 ## TTE0837 isoaspartyl dipeptidase + Term 6142 - 6183 7.0 - Term 6130 - 6169 7.4 6 4 Op 1 . - CDS 6196 - 6636 202 ## 7 4 Op 2 . - CDS 6793 - 7020 128 ## - Prom 7218 - 7277 3.4 + Prom 6898 - 6957 2.4 8 5 Tu 1 . 
+ CDS 6992 - 7231 161 ## + Term 7299 - 7343 10.2 - Term 7287 - 7331 10.1 9 6 Tu 1 . - CDS 7405 - 8898 1064 ## COG2326 Uncharacterized conserved protein - Prom 8968 - 9027 2.4 + Prom 9461 - 9520 4.3 10 7 Tu 1 . + CDS 9586 - 10299 490 ## COG2186 Transcriptional regulators 11 8 Op 1 . + CDS 10536 - 11021 548 ## gi|302862986|gb|EFL85918.1| putative tricarboxylate transport protein TctB 12 8 Op 2 . + CDS 11052 - 12563 1884 ## COG3333 Uncharacterized protein conserved in bacteria 13 8 Op 3 . + CDS 12585 - 13814 940 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 14 8 Op 4 . + CDS 13840 - 14811 1151 ## COG3181 Uncharacterized protein conserved in bacteria 15 8 Op 5 . + CDS 14869 - 15927 1005 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components + Term 15960 - 15990 5.0 - Term 15945 - 15981 8.0 16 9 Tu 1 . - CDS 16007 - 17011 821 ## COG4213 ABC-type xylose transport system, periplasmic component - Term 17427 - 17464 5.1 17 10 Tu 1 . - CDS 17502 - 18644 613 ## COG0477 Permeases of the major facilitator superfamily - Prom 18784 - 18843 6.8 + Prom 18934 - 18993 8.1 18 11 Op 1 13/0.000 + CDS 19118 - 19420 445 ## COG0831 Urea amidohydrolase (urease) gamma subunit 19 11 Op 2 17/0.000 + CDS 19435 - 19839 625 ## COG0832 Urea amidohydrolase (urease) beta subunit 20 11 Op 3 10/0.000 + CDS 19876 - 21594 2160 ## COG0804 Urea amidohydrolase (urease) alpha subunit 21 11 Op 4 16/0.000 + CDS 21845 - 22489 566 ## COG2371 Urease accessory protein UreE 22 11 Op 5 17/0.000 + CDS 22428 - 23222 792 ## COG0830 Urease accessory protein UreF 23 11 Op 6 9/0.000 + CDS 23278 - 23916 1005 ## COG0378 Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase 24 11 Op 7 1/0.000 + CDS 23927 - 24895 1008 ## COG0829 Urease accessory protein UreH + Term 24954 - 24992 3.3 25 11 Op 8 . + CDS 25033 - 26034 1125 ## COG4413 Urea transporter 26 11 Op 9 . 
+ CDS 26065 - 27279 1715 ## COG0004 Ammonia permease + Term 27288 - 27320 2.9 27 11 Op 10 11/0.000 + CDS 27328 - 27708 434 ## COG0239 Integral membrane protein possibly involved in chromosome condensation 28 11 Op 11 . + CDS 27718 - 28113 471 ## COG0239 Integral membrane protein possibly involved in chromosome condensation 29 11 Op 12 . + CDS 28053 - 28268 84 ## - Term 28136 - 28176 8.1 30 12 Op 1 . - CDS 28385 - 28924 146 ## DVU0821 hypothetical protein 31 12 Op 2 . - CDS 28926 - 29615 83 ## DVU0822 hypothetical protein - Prom 29643 - 29702 6.5 32 13 Tu 1 . + CDS 29584 - 29841 76 ## - Term 29863 - 29894 -0.5 33 14 Tu 1 . - CDS 29898 - 30635 321 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 34 15 Tu 1 . - CDS 30908 - 32575 603 ## COG1240 Mg-chelatase subunit ChlD 35 16 Op 1 1/0.000 - CDS 32926 - 33921 595 ## COG1239 Mg-chelatase subunit ChlI 36 16 Op 2 . - CDS 33938 - 37837 2711 ## COG1429 Cobalamin biosynthesis protein CobN and related Mg-chelatases - Term 37844 - 37899 9.4 37 17 Op 1 . - CDS 37928 - 39976 1676 ## COG4206 Outer membrane cobalamin receptor protein 38 17 Op 2 35/0.000 - CDS 39973 - 40806 248 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 39 17 Op 3 . - CDS 40799 - 41800 548 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component - Prom 41820 - 41879 8.2 + Prom 41820 - 41879 4.0 40 18 Op 1 45/0.000 + CDS 41956 - 42804 673 ## COG1131 ABC-type multidrug transport system, ATPase component 41 18 Op 2 . + CDS 42819 - 43544 583 ## COG0842 ABC-type multidrug transport system, permease component + Prom 44063 - 44122 2.3 42 19 Op 1 30/0.000 + CDS 44268 - 44837 330 ## COG0811 Biopolymer transport proteins 43 19 Op 2 . + CDS 44827 - 45234 236 ## COG0848 Biopolymer transport protein - Term 45784 - 45826 1.0 44 20 Tu 1 . - CDS 46067 - 46270 147 ## - Prom 46334 - 46393 1.5 + Prom 46852 - 46911 5.1 45 21 Tu 1 . 
+ CDS 46961 - 47233 247 ## DvMF_2842 hypothetical protein + Term 47271 - 47303 -1.0 46 22 Tu 1 . - CDS 47289 - 47771 -8 ## 47 23 Op 1 . - CDS 47985 - 51077 2245 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains 48 23 Op 2 . - CDS 51146 - 51361 127 ## 49 23 Op 3 . - CDS 51444 - 51875 502 ## gi|182418212|ref|ZP_02949512.1| transporter, major facilitator family 50 23 Op 4 2/0.000 - CDS 51872 - 52435 529 ## COG0477 Permeases of the major facilitator superfamily 51 23 Op 5 . - CDS 52521 - 54134 1465 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold - Prom 54375 - 54434 4.5 52 24 Tu 1 . + CDS 54693 - 55020 160 ## Nmul_A1248 integrase catalytic subunit Predicted protein(s) >gi|316923264|gb|ADCP01000072.1| GENE 1 44 - 400 148 118 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRFGVIGAALLLLSGCAANDTETPSNLEGAVSEAPAANASAPAVTDGQLHTLSVELMTY QGGTNGAREYVVAQAKQACDSINQEVYIENLTSETTWRGGRAELTFACVDKSDSRLKQ >gi|316923264|gb|ADCP01000072.1| GENE 2 1044 - 2609 914 521 aa, chain + ## HITS:1 COG:VC1131 KEGG:ns NR:ns ## COG: VC1131 COG1757 # Protein_GI_number: 15641144 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Vibrio cholerae # 9 477 11 496 533 300 40.0 5e-81 MEPYYAGWLSLLPPVIAITLALLTKEVISSLLIGILTGTFIYSVGTNSDVLFMGTIESAF DTMANKVDFNILVFCTLLGALVYTISMAGGTRAYGKWATRRIKSRKAALLSTGGLGAFIF IDDYFNCLTVGTVMKPVTDSYKISRAKLAYIIDATAAPICIIAPISSWAAAVGSNLRTTG AFSSDFAAFVSTIPYNFYALLSIAMVGLICLTNSDFGPMRTAEERAQHGDLGAVDGSSEK QVQPAERGTVWDMLLPIGSLIVFAVLGLLYSGGYWGKDPAFHTMGAAFGNCTAAKALVWA SVGALTVSFLLFVPRGLLSFRTFMDGAVEGMKAMLPANIILVLAWTISGVCRDLLQTPIF VQSLVADGGVSGAFLPAIVFLIAGFLSFSTGTAWGTFGILIPIIVPVAQAVDPDLVVVSL SATLAGSVFGDHCSPISDTTILSSAGAGCNHIEHVSTQLGYACIVAFCCFVGYVVAGFTK ANLWWSLGSSLVLLLISVFILHMLGNKRAAARETAAIGGNA >gi|316923264|gb|ADCP01000072.1| GENE 3 2850 - 3098 248 82 aa, chain + ## HITS:1 COG:RC1344 KEGG:ns NR:ns ## COG: RC1344 COG3077 # Protein_GI_number: 15893267 # Func_class: L Replication, recombination and 
repair # Function: DNA-damage-inducible protein J # Organism: Rickettsia conorii # 1 82 6 87 87 69 48.0 1e-12 MVHIRVDEKLKREACAALDGMGLSLSDAVRLLLVRVAAEKALPFEIRVPNTSTIRAMEDV NVGTLESFKSVTALMASIDENN >gi|316923264|gb|ADCP01000072.1| GENE 4 3085 - 4875 1436 596 aa, chain + ## HITS:1 COG:VC2037 KEGG:ns NR:ns ## COG: VC2037 COG1757 # Protein_GI_number: 15642039 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Vibrio cholerae # 156 596 34 472 481 290 36.0 4e-78 MRTIEATPTFKEDLKRVCQDPGRKRTVEDLLSILARDLEFMVGMPAHTLPGKWSDYLECQ LQSNLFLIYRKPDKNRLQLVRLGSCSDFECRDDEYDIGFSVSCGNKEEPSRFFKPDNATT GRPTVTTHKKPTLAFALFTLLSIVTIIACGMIFFKLKLHLLMMSCWVVCALFARRLGYTY TELEVGAYELIQRAMGAVIILMCVGALIGAWISAGTVPVMIYVGLQIISPSLFLVTSLIL CSVTSLTTGTSWGTIGTVGLAIMGIGAGLGFDPGITAASIICGAFFGDKLSPLSDTTNMA AAVSGVPLLRHVRHMVNTITPAYIITIILYAIIGFKQGSGVADTSQLNAILTGISEHFQI GIIPALPMLLVLGLLIRQTNPVLAIVSGALFGVLIAVGYAGMDLTTAFNSMWSGYKADFA NPMLAKLLNRGGITSMLDIAALVIFACGLGGMLRHIGIIDVVLEPVARRATSGLSLVLAT LFIGYGTLMLTAAAYFSIVMNGTVMAPLFRKRGYRPENCSRVVEDAGTLGGPLVPWASNA LFPMSMLSVSYMDYAPWAFVLYLTPLMSILYAAFNINMTRLTPEEMAEENKEFVAE >gi|316923264|gb|ADCP01000072.1| GENE 5 4947 - 6116 937 389 aa, chain + ## HITS:1 COG:no KEGG:TTE0837 NR:ns ## KEGG: TTE0837 # Name: not_defined # Def: isoaspartyl dipeptidase # Organism: T.tengcongensis # Pathway: not_defined # 4 389 3 392 392 276 40.0 1e-72 MATILIRNASLFSPDPLGTNDILIINDRIAAIAQNVSVPEWLGPVQEMDGKGLFAVPGFI DAHVHITGGGGEAGFSSQVPPLPLSKLIRSGVTTVGGLLGTDGVTRHVDAVLAKANSLEE EGVSSFIMSGGYPVPSPTLTGSIRSDIAFIEKVRGGKIAIADHRVAPVSAETLLAVATEA RIGGMLRGFIGMLIMHIGAAAEGLSCVFAALERAPHLGRHLIATHINRSPFAFSEAAKLV AKGGFMDISSGLNVQTLGPDTLKPSEAIALAMRQGVAKERILMSSDGNGSAARYGDDGSV SGLVASDLGSLHTEFADCVKEGMPLSEALCPITRNVAQAFSLSRKGHIAVGADADILLLA PDLALHTVIARGECMMREGTLCKKGTFEE >gi|316923264|gb|ADCP01000072.1| GENE 6 6196 - 6636 202 146 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDFIKVYERIRERGMEAEWGTLNNGLPQISVRVLGITFFVVPDHFPGYEKAESDFICLRT FMPMETVNVRHKAFLDAAFNKIASNIRMVKIYSVENDGILYAFFEVQVLSTPEDFAQRLS 
PYANECVRAIREVAAFVNSHPSLAES >gi|316923264|gb|ADCP01000072.1| GENE 7 6793 - 7020 128 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MINSNWGNSIAFSLKKYVYECLTEYNVQPKVTFQQAGLKRRSTSVRRFHSQCSLPDRPYT RLYPFMDTDLSYTEY >gi|316923264|gb|ADCP01000072.1| GENE 8 6992 - 7231 161 79 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEFPQFEFIIKRIFYFFLLLIVFALAAVFWFMFDVFLFWKYTPLSEWSPLGRVLWMTFSV LTTIGVNLLVQKKIRHRSS >gi|316923264|gb|ADCP01000072.1| GENE 9 7405 - 8898 1064 497 aa, chain - ## HITS:1 COG:PA3455 KEGG:ns NR:ns ## COG: PA3455 COG2326 # Protein_GI_number: 15598651 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pseudomonas aeruginosa # 1 495 1 496 496 381 42.0 1e-105 MFESAALGRSYPAKEYAALENDLRMKLFKAQGTCIEHKLPVLITIAGVDGSGRGAVANML SEWMDAKTIRNHVFWMQTDEERTRPEAWQFWNKLPAGGEIGVFLGGWYGGTIRRFCCGDI GEREFNASMERWRRLEHTLASSGTVIVKLWLHLGKKVQKARLKDRLKHQEIHHFTPYDKK SAENYDGLVSAAAKAITLTDRVDAPWTVIDAYDGNFRNASVARAIIAAVEAAVAVKRQNA VPVVPTKASEEEIGQISALDAIDLSRTCERGVYKKELAELQSELYDLTYKAYKKGISSTI LFEGWDAAGKGGAIRRLTAGIDARITRVIPVSAPTDEELAHNYLWRFWRHIPRAGFVTIY DRSWYGRVLVERVEKLTQPEDWKRAYAELNDFEEQLQEHNNILLKFWLHISPDEQLRRFK EREEIPWKNYKITPDDWRNREKWPAYVEAADEMFLRTSTEYAPWHIIPAEDKKCARLDVI RIYRDALRKALEQKKKK >gi|316923264|gb|ADCP01000072.1| GENE 10 9586 - 10299 490 237 aa, chain + ## HITS:1 COG:HI0054 KEGG:ns NR:ns ## COG: HI0054 COG2186 # Protein_GI_number: 16272028 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Haemophilus influenzae # 14 226 24 237 266 103 31.0 4e-22 MLKETPLPTTVRPRHYQKLSERLRQFMTENHFQDGDKLPPERALAESFGVSRNSVREAIH AFAERGLLESRHGDGTYVRVPDMEPLRSAILEAVDSEGHLFDEVMEYRRILEPAVAELAA LRRTPEQLDRLKIIACDQQRRILIDGDDGELDAQFHLCLAECSGNRLLINTVALLNEQYA SGRTADLRDASWHQFSVASHLRIIDALERQSAEDCRKAVEEHLDPIVHKHLFVTARD >gi|316923264|gb|ADCP01000072.1| GENE 11 10536 - 11021 548 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302862986|gb|EFL85918.1| ## NR: gi|302862986|gb|EFL85918.1| putative tricarboxylate transport protein TctB [Desulfovibrio sp. 
3_1_syn3] # 1 161 1 161 161 184 70.0 1e-45 MKSPTDVVSGFILLGLCGVGAYSVATLPDAGGMEHVGPGAFPAGILVLLTALSLVLIVQG FRRSPVKVYWPESKVFKKILVFIGLFYLYLVTLTGLGELFLNMENPPFQANGAFSISTFL FLLIALPLLGRRKPIEILSVAVLTTAVLVFAFGWFFQVLLP >gi|316923264|gb|ADCP01000072.1| GENE 12 11052 - 12563 1884 503 aa, chain + ## HITS:1 COG:BH2009 KEGG:ns NR:ns ## COG: BH2009 COG3333 # Protein_GI_number: 15614572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 8 474 1 466 504 363 44.0 1e-100 MTDWRGFMLANILDGIMNAISMGSLLANLMGVTLGIIFGALPGLTAAMGVALLIPLTFGM PAVEAFSALLGMYVGAIYGGCITAILVGTPGTVSAAATMLEGPALTARGESRRALDMATI ASFVGGIVSALALIFIAPMLARAAMSFGAPEYFAVAVFGLTVVASLSSGHLVKGLISALA GLFLATIGLDPVTGDMRNTFDNPNLFNGLSLVPVLVGLFAVSQVLVTVEDVVRGVSLKES TVSKRGISLKDLTGNVVNFIRSSVIGTLIGIIPATGVSAASFLAYSEAKRFSKTPKMYGK GCVEGIAATESSNNAVCGGALIPLLTLGVPGDIITAIMLGALMIQGLTPGPLLFVEHPVT VYGIFAAFIIANIMMLVCGLIAVRGAGKIIAIPGAVLMPIVITLCVVGGYAVNNSTFDLL VVAIFGTVGYLMIKCDFPQPPLLLAMILEPIAEANFRRALTISQNDYSIFYTSPVACIIL LVSLLVLLKPIYDDFRAGRKAAA >gi|316923264|gb|ADCP01000072.1| GENE 13 12585 - 13814 940 409 aa, chain + ## HITS:1 COG:BH0352 KEGG:ns NR:ns ## COG: BH0352 COG0624 # Protein_GI_number: 15612915 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Bacillus halodurans # 20 391 8 366 374 196 34.0 9e-50 MELSQCKERLDAWFSDKEPEMLSMLERIVNMDSFTNDAADVNKVGEVVTGWLAEAGFHTA KLPKKPAPADEQWMNGLGNVFSARTHSVECGPGVAFIGHMDTVFPAGTAGARPFRLDRAA DRATGPGVTDMKAGIVQNMFVARALKELGLMDVPMTLTFSPDEELGSPSTTPILGEQLNG AQAVICTEPGYPGGGVTLERKGSGHMFLEIMGISAHAGRCYQDGASAILELAHKILAFDA YVDLDHDTTVNTGLVNGGTSANSVAPNASARIHITFKTLETGRRLAEALRAETAKNHIPG TSSHISGGVRLPPLVPTPGVMKLFSLAEQAGELIGYSVHSAPSKGTAESGYCSSVLGVPA ICSMGPEGSNLHSADEYMIPSTFVPRCKLAALTAIQAARAFAPAAKVSL >gi|316923264|gb|ADCP01000072.1| GENE 14 13840 - 14811 1151 323 aa, chain + ## HITS:1 COG:FN2103 KEGG:ns NR:ns ## COG: FN2103 COG3181 # Protein_GI_number: 
19705393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 29 313 16 298 308 142 31.0 9e-34 MKKLQALFCALLVCTLLLSGLPRTASAAYPDRPVTIIIPFGPGGAVDIAARILAEYFQNK HQITLNIVCKAGGAGAPAMLDVAKARPDGYTFGFPAIATFSTTPQIKKTGYTLADFRAVV QVTNMWLSLAVNADSGIKTINELMAAAKATPGKYNYATHGALSTQRLFMSRLLKAFPGVD LPHVSYTSGHEVSTALLGRHVTSGFGVTTNQKPYVLSGDFTMIGVSSPERLAEFPDVPTF AEQIGPEYTFASSHGLVAPKRVPEDRILTMQNLVKEALADPDVQAKFAKAGLTTDYLSAE DFQKALDNMWKTIGDIMRENKFN >gi|316923264|gb|ADCP01000072.1| GENE 15 14869 - 15927 1005 352 aa, chain + ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 47 257 47 249 332 70 25.0 4e-12 MGFKRSFVGICLVLLALAWSTPARSASPIPMKTAWLGEHETFAVWYAKQKGWDKEAGLDL TMLRFDSGKAIVEGVLAYDWAIAGCGAVPALTAALSERLDIIAVANDESAANALYVRADS PLLSIKGTNPAYPDIYGDQATVKGKEILCPKGTSAHYMMAAWLKSLGLEEKDVRVKYMSP VQALGAFAGGLGDVVALWAPLTYDADAKGFKPAALSSNCGVSQPVLIVANHEYAIQYPER VEAFLKMYLRVITFMRETPVETLVPDYVRFYEEWTGRKLTSEIAAKDLANHAVFNLEEQL ALFDDAQGKSKLQNWLTDIAAFYESIGMVRKEDRQRLRRMNAVTDDYLKALR >gi|316923264|gb|ADCP01000072.1| GENE 16 16007 - 17011 821 334 aa, chain - ## HITS:1 COG:YPO4037 KEGG:ns NR:ns ## COG: YPO4037 COG4213 # Protein_GI_number: 16124157 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, periplasmic component # Organism: Yersinia pestis # 5 332 8 331 331 171 30.0 2e-42 MPKRLLCCIFLVLMYVSGCDDAQKPKIGISFGVGEAKRWPAEKGYMEERAQELGMEVETR FNKADAPKTQMQDCFELIDSGISVLILIPRDARKANEILAYAKKKNVKVISYARAVMGED IDFFVGYDTYRIGQSLGLHLTEKTYKGNLAILKGDKNDFNSPLLYDGAMISIRPLVERGA IHMILDEYVNGWSVDLAKRMLTDAIIKNDYKIDAVFAHNDIFAGAAAEVVKELGIKNPVI ITGMDAETPALKRLLKGTQDATVYMDLKSMAYTAVNEAYNMATKKKPNVNSEFDNESKFK IDAFLINGKLITRENIDRVLIAPGHFTHEQIYGN >gi|316923264|gb|ADCP01000072.1| GENE 17 17502 - 18644 613 380 aa, chain - 
## HITS:1 COG:MA0181 KEGG:ns NR:ns ## COG: MA0181 COG0477 # Protein_GI_number: 20089079 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Methanosarcina acetivorans str.C2A # 1 370 19 388 410 104 23.0 3e-22 MLFGVRLLSAMGVTLILPVMPAMARTFGLSIAEAGMVVVCFTLAEATMTPVAGVLSDRFG RMAVLLPALLVFAGGGILCLFAESWRDVLICRVIQGIGAGPLGVLYTILAADMVDEKHLP RIMGRLTAVSSLGTIVYPVIGGLLGEWSWRAPFFVFTLALPTAVLSLMVTLEKPQGAMDW RRYWKQTGNILRERRAIGFFVLVFLCYCAIYGPINTCFPMMAEAKYHASSSRIGLVFSFV AIGSYLAAVGLPRLHAKWSFRTLILVAGFCYTLPLAFLASVPGLWLCAVPLFVSGAAQGL SLPIINDNVALLGTPDDRAAILAVSETSVRVSQSVSPLLFSIISMKWLWDGAYASGFAVG ILILLVAFFVFEPRTAPSQK >gi|316923264|gb|ADCP01000072.1| GENE 18 19118 - 19420 445 100 aa, chain + ## HITS:1 COG:BMEI0649 KEGG:ns NR:ns ## COG: BMEI0649 COG0831 # Protein_GI_number: 17986932 # Func_class: E Amino acid transport and metabolism # Function: Urea amidohydrolase (urease) gamma subunit # Organism: Brucella melitensis # 1 99 1 99 100 132 70.0 1e-31 MHLTPRELDKLMIHTLADVALKRKAKGLKLNHPETVAVLSAAALEEARAGKTVEEVMAET RKVLTRNDVMEGVVEMIPTVQVEAVFTDGSRLITIHNPIN >gi|316923264|gb|ADCP01000072.1| GENE 19 19435 - 19839 625 134 aa, chain + ## HITS:1 COG:BMEI0648 KEGG:ns NR:ns ## COG: BMEI0648 COG0832 # Protein_GI_number: 17986931 # Func_class: E Amino acid transport and metabolism # Function: Urea amidohydrolase (urease) beta subunit # Organism: Brucella melitensis # 4 133 17 152 159 165 61.0 2e-41 MTNKTPNKTKTPVGGLILGSDPIEFNAGRKTITLKVRNTGDRPIQVGSHFHFFEVNRALE FDRHAAYGMRLNISSTTAIRFEPGDEKTVSLVSIGGTSGTYGFNNLVDGWTGDEHERVIN EERADKLGFKNAKK >gi|316923264|gb|ADCP01000072.1| GENE 20 19876 - 21594 2160 572 aa, chain + ## HITS:1 COG:YPO2667 KEGG:ns NR:ns ## COG: YPO2667 COG0804 # Protein_GI_number: 16122873 # Func_class: E Amino acid transport and metabolism # Function: Urea amidohydrolase (urease) alpha subunit # Organism: Yersinia 
pestis # 1 571 1 571 572 874 75.0 0 MPTISRQEHAALFGPTVGDKIRLGDTDLFVEIEKDLRHYGDESMYGGGKTLRGGMGGDAR LNRDAGVLDLVITNVTVIDAVQGVIKADVGIRDGRIVGIGKSGNPSMMNGVDPDLIVGNS TDAITGEHLILTAGGIDSHVHYISPQQGYAALSNGVTTFFGGGVGPTDGTNGTTITPGPW NISKMLEAIDAMPVNMGILGKGHAYNSVPLVEQIEAGACGFKVHEDWGAMPSMIRAALSV ADDMDVQVSVHTDTLNESGYVEDTIDAIAGRVIHTFHTEGAGGGHAPDILRVASHPNVLP SSTNPTLPFGINTQSELFDMIMVCHNLNPNVPSDVSFAESRVRAETVAAENVLHDMGVIS MISSDSQAMGRVGENWLRVIQTADAMKAARGKLPEDAPGNDNFRVLRYVAKLTINPAITQ GVSHLIGSVEVGKYADLVLWEPQFFGAKPKMVIKGGMISWANMGDPNASLPTPQPTYYRP MFGGMGLAKPRTRITFVSQAAMGRNIREKLGLRSCVEPVCNIRGLNKHDMVRNGNLPKIE VDPETFAVKVDGVHATVKPAQKVALGQLYFFS >gi|316923264|gb|ADCP01000072.1| GENE 21 21845 - 22489 566 214 aa, chain + ## HITS:1 COG:BMEI0646 KEGG:ns NR:ns ## COG: BMEI0646 COG2371 # Protein_GI_number: 17986929 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Urease accessory protein UreE # Organism: Brucella melitensis # 1 176 1 176 201 191 50.0 1e-48 MQLVENVLGNSNDPAWKARLADATIDTLELSQWDAQKNRLRKDTRNGQAIAVSLDRNAFL HDGDILLWDEKTKQAVICKIDLCEVLIIDLSGLQKLPADQLIERCVQLGHALGNQHWPAI VQNGFVYVPMAVNRLVMNSVMNTHHFKDITFRFAPGSEIVDLLEPTQARRLFGGTERPMD GGHTHASPDGHAHHHDHHHTHAHEGHCHDGHCHN >gi|316923264|gb|ADCP01000072.1| GENE 22 22428 - 23222 792 264 aa, chain + ## HITS:1 COG:YPO2669 KEGG:ns NR:ns ## COG: YPO2669 COG0830 # Protein_GI_number: 16122875 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Urease accessory protein UreF # Organism: Yersinia pestis # 29 262 2 226 228 229 53.0 5e-60 MTITTRTHTKDIATTAIATTKAPEHTVKNILSLSRMMQFGDSMLPVGAFAFSNGLESAVQ KGVVHDAETLRQYTHTALEQAAKGDAVAVVWATRAALAGDLEDLIRVDREVLCRKLNEET RLMATRMGRKLAEMGADITENPLVIGWRDAIKDGRAPGTYPVSLAVQFVAMGLSTQEKLD TGTLDEVLTVHQYGVAMTILNASMRIMRISHIDMQRVLYSLTRDFDAMCRTAMRTPLEQM SNYAPMTDILAASHVKAHVRLFMG >gi|316923264|gb|ADCP01000072.1| GENE 23 23278 - 23916 1005 212 aa, chain + ## HITS:1 COG:YPO2670 KEGG:ns NR:ns ## COG: YPO2670 COG0378 # Protein_GI_number: 16122876 # Func_class: O 
Posttranslational modification, protein turnover, chaperones; K Transcription # Function: Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase # Organism: Yersinia pestis # 2 203 9 210 220 309 76.0 3e-84 MKKITRIGVGGPVGSGKTAIIEAVTPVLIRRGIKPLIITNDVVTTEDAKHVQRELDGILE AEKIVGVETGACPHTAIREDPSMNLAAVEELESKFPDSDLIFIESGGDNLTLTFSPALAD FFIYVIDVAAGDKIPRKSGPGVCQSDILVINKEDLAPYVKASLEVMDRDSRKMRDGKPFI FTNCMTGKNIEELTDMIIRDALFDFKPAEKTA >gi|316923264|gb|ADCP01000072.1| GENE 24 23927 - 24895 1008 322 aa, chain + ## HITS:1 COG:BMEI0643 KEGG:ns NR:ns ## COG: BMEI0643 COG0829 # Protein_GI_number: 17986926 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Urease accessory protein UreH # Organism: Brucella melitensis # 23 321 4 301 302 311 48.0 1e-84 MSQLSSRARVELAKAALSRIGLEAPELRPYQDEPAQMPSGTVGKDGYLRLEFADRGDRSV MAFMDRRVPFLVQRALYWDEAMPQMPCIFIITTTGCVLQGDRMALEIEVGKNAQAHVTTQ SATKVHMMNANYASQLQDIVVEEGGYLEYMPDPLIPHRTSRFLSKTRLSVAETGSLLYAE VVLPGRKYHHEDEMFGFDLYSSTIEATRRESGEKLFVEKFIIDPRETDLSRTGIMNGYEV FGNVILMTTKEKTMQVREAVEAGVDREARLAYGASLLPGECGLIFKVLGMTSESVRGKIR EFWQIARKAVAGRDLPQAFLWQ >gi|316923264|gb|ADCP01000072.1| GENE 25 25033 - 26034 1125 333 aa, chain + ## HITS:1 COG:YPO2672 KEGG:ns NR:ns ## COG: YPO2672 COG4413 # Protein_GI_number: 16122877 # Func_class: E Amino acid transport and metabolism # Function: Urea transporter # Organism: Yersinia pestis # 12 332 11 330 330 291 52.0 2e-78 MNAVAKDSPNPWNALASQNPVVHFIDVCLRGAGQVMFQNNPLTGLFFLVGIFWGAYSAHM ISVGIGAVLGTIMGTLTAYALRAPRENINMGLHGYNGILVGCALPTFFAATPLLWGYIVA GSIFSTVLMMAVSSMLRTWKVSAMTGPFVITTWFLMLAAYNFGNIQIISLPHPAISVQPA ASDLFALNMEQFWRAAFAGVSQVFLINNVITGILFLIGLAVSSIWAAVFAFVGSIIAICT ALILGGGSTAIIAGLFQFSAVLTAIGLGTTFYNPNWRVVCYTFLGTVFTVVAQGALNVLL NVYGIPTLTFPFVVAAWIFLLPNIDLLPKNYQS >gi|316923264|gb|ADCP01000072.1| GENE 26 26065 - 27279 1715 404 aa, chain + ## HITS:1 COG:MA3918 KEGG:ns NR:ns ## COG: MA3918 COG0004 # Protein_GI_number: 20092714 # Func_class: P Inorganic ion transport 
and metabolism # Function: Ammonia permease # Organism: Methanosarcina acetivorans str.C2A # 4 402 2 400 404 362 49.0 1e-100 MLSLNTGDTGFVMLCTALVCMMTPALALFYGGMVRKKNVLTIMMQNFISMGVTAIIWIFG GFSLAFGPDVGGLFGDISYHFALDKVGIAPSPVYASTVPFILFFAYQLMFAVIAPALITG AFAGRLNFRGYLKFLILWMIFIYIPVCHWIWGGGFLAKIGVMDFAGGIVIHVSAGFAALA SVIFLGRRDDMKPGEPTVPNNLPLVAVGAGLLWFGWFGFNAGGAYAADGLAAYAFTNTTI AGSIAMIVWMLYDWHYDRKPSFSGVLVGAVAGLATITPCAGYVEPWAAIIIGAVGSSVCY YAKYVQERLGFDDALEVWRAHGMGGVTGALLLGILASSNIDKVNASIHQFLMQLLGVIIV AVYSFVVTIIIFKVISAWGPIRVTKEEEEAGLDESLHGEEAYKM >gi|316923264|gb|ADCP01000072.1| GENE 27 27328 - 27708 434 126 aa, chain + ## HITS:1 COG:VC0060 KEGG:ns NR:ns ## COG: VC0060 COG0239 # Protein_GI_number: 15640092 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Vibrio cholerae # 9 126 8 125 126 88 46.0 3e-18 MTELAGTLWVAGGGALGAYCRFQISAWFAAWFGKGFPYGTLFVNVLGSFIMGLLVGAIKM NAVPFIPWHDFIGEGFLGALTTFSTFSMDTFAAFRDGQPGKAIANVVVNMVLCLCGTAFG FFLMLQ >gi|316923264|gb|ADCP01000072.1| GENE 28 27718 - 28113 471 131 aa, chain + ## HITS:1 COG:VC0060 KEGG:ns NR:ns ## COG: VC0060 COG0239 # Protein_GI_number: 15640092 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Vibrio cholerae # 13 121 12 120 126 69 38.0 1e-12 MSYIENTIAVLCGGAAGAVCRFKLNAAIMSGLTMTFPLGILCINVLGGFLMGLLQGAMKR SGKPFTVGYSLLGTGFLGGFTTFSTFSLDTFNLYHTGDLMLAGLNILLNAVICICAVWAG YRIVFPRTAAA >gi|316923264|gb|ADCP01000072.1| GENE 29 28053 - 28268 84 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYLRRVGWLPHCLPTHRRRIRKLCLPSQENRDKDPVIGRVFVLSLFPTLNLRNRGINRFE KAILTKAKKIV >gi|316923264|gb|ADCP01000072.1| GENE 30 28385 - 28924 146 179 aa, chain - ## HITS:1 COG:no KEGG:DVU0821 NR:ns ## KEGG: DVU0821 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 13 172 21 184 199 127 39.0 2e-28 
MNTAKICVFGLALFCLCGCTDSHYEMKQVPPQLSQAEMSRTDIVKVGYELADGLIKNLHA PLNEGETVIVASFADVDNLTQSSTVGRLLGDIVGSRLSQQGFTVVDIRTRQNSIFIQEQG GEFVLSRDVRALSKANNAACVVAGTFGRLGKSTVVSVRMIRASDNVILSSADGVLVRGR >gi|316923264|gb|ADCP01000072.1| GENE 31 28926 - 29615 83 229 aa, chain - ## HITS:1 COG:no KEGG:DVU0822 NR:ns ## KEGG: DVU0822 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 18 209 4 198 218 113 32.0 6e-24 MWKYFLCLMSMLLTLTAWGCSLAPQPVARPVTGQQTLEAVDHWRILAQQIVKEMQLTTGS SVYVSEQDRSPFGRAFTTLLRHEVAASGARLSGAREGSLCIDWGVQIIKYSEPRQTIHVY PGTIAAITGGGIGAGYIIKNRPSSWPVVGAAGAGLLGEAANLLDMATPKYPIDTEVLINI TGAVDGAVTYDYSGLFYMRAKDSDQYWERPPFRGKEQPMQAKTYRSVGN >gi|316923264|gb|ADCP01000072.1| GENE 32 29584 - 29841 76 85 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLIRHKKYFHIPVLLHVVHHYNSLLLNSSFYISYHILNVVYQMIPPNIRCHMDTKGNLIL HVTVSYPFSWLWLSCFQRSFLVLRQ >gi|316923264|gb|ADCP01000072.1| GENE 33 29898 - 30635 321 245 aa, chain - ## HITS:1 COG:MA3922 KEGG:ns NR:ns ## COG: MA3922 COG0614 # Protein_GI_number: 20092718 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 2 215 112 328 378 81 30.0 2e-15 MPNFEVLAELRPDVILAWKTNPGPELERQLEPLGIAVLRLDLTEPGKLPEEMRTLASILG PEAQRRTEAYWKWVARWTEQIQKSIAGQPKPTVLAEHFTPLRIAGPGSGLYDLTQMAGAN NLADDIGIRSMQVDSEWVLERNPQCFVKSILLGKRNAEEDTRRTDECLRSVLERDNWQLL DAVKENRVYILDSDIASGPRYLVGLAELAAWLYPDASVPSGKRIHEEWANAAWMPMYKEN DGGQD >gi|316923264|gb|ADCP01000072.1| GENE 34 30908 - 32575 603 555 aa, chain - ## HITS:1 COG:MA0877_2 KEGG:ns NR:ns ## COG: MA0877_2 COG1240 # Protein_GI_number: 20089761 # Func_class: H Coenzyme transport and metabolism # Function: Mg-chelatase subunit ChlD # Organism: Methanosarcina acetivorans str.C2A # 258 552 13 304 320 221 45.0 4e-57 MEAAFREGVRVLQPGLLAAAHRGILYIDEVNLLSDHVADIILEACSEGVNRIRREGISAE HPSRFVLVGTMNPEEGELRPQLLDRFGLAVSVSAPGDVEERLEVLRLRERFDADPQGFLL 
QYAEREAALAVRIREAQRLVAQVRIPRLLLKRMAELAAEAHCAGHRGELALERTARAFAA LAGRLDVREEDVLEAAELALAHRRRMTPPPPQEQKEPERNERSSSQEQEQSRRETDGDEQ REHSHSQEQPFDTVPQFPPQANSSGAAGLGDVEQVFAIGDPFTIKPIRLRKDRVQRKGTG RRSRTATLQKAGRTVGSLPLVGSEMKYADIAWDATLRAVALRHRVGEGGWTRPVITLEDI RTRRREKKIGNLIVFSVDASGSMGAARRMEEAKGAVLSLLMDAYQKRDKVAFIAFRGQQA EVLLEPTGSVEQAYRRLEELPTGGRTPLASGLEESHRIIRNQLRKDPDTRPILLVLSDGR ANAAPDGMKPMPAALEAARRIAADGRTHSLVIDVERQGLVQFAMAKTLSEGLEAEYMHLE ELEADTLVRTLKDIL >gi|316923264|gb|ADCP01000072.1| GENE 35 32926 - 33921 595 331 aa, chain - ## HITS:1 COG:MA0877_1 KEGG:ns NR:ns ## COG: MA0877_1 COG1239 # Protein_GI_number: 20089761 # Func_class: H Coenzyme transport and metabolism # Function: Mg-chelatase subunit ChlI # Organism: Methanosarcina acetivorans str.C2A # 8 323 9 328 384 384 59.0 1e-106 MKTLRQRYPFSAIVGQEELKQALLLNLIYPGIGGVLIRGEKGTAKSTAVRALEAILPEID VVDGCPCGCDPHGDALCPWCLEQEALESVSRQVRVVDLPVGSTEDRVVGSLDMETALREG RRRFEPGILADANRGILYVDEINLLDDHLVDVLLDAAAMGVNTVEREGVSWSHPSRFVLV GTMNPEEGELRPQLLDRFGLCVEVRGMAEPEARLQVMTRRAAFEDDPVGFRATWQERENE IAERIVAARRKLGSVAVPEDVMRAIVELAIETEVDGHRADIVMLKTVKAIAAWRGDGEAA REDVMQAARLVLPHRMRRKPFQNVGSAKAVR >gi|316923264|gb|ADCP01000072.1| GENE 36 33938 - 37837 2711 1299 aa, chain - ## HITS:1 COG:MA0872 KEGG:ns NR:ns ## COG: MA0872 COG1429 # Protein_GI_number: 20089756 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobN and related Mg-chelatases # Organism: Methanosarcina acetivorans str.C2A # 1 1295 1 1298 1302 1352 50.0 0 MSVVCVAWSSHVGALMSAASRPGMPSVRVLSSRMLEEPDGVERCLETMRSATALFVFRTT DALWDQLDEGIRLIGKTVPVVCVSYDPAAWALSSVPVETAQTAYRYLTYGGAENLGNLFR FLDALPNAAAVPEPVPVPWEGLWHPDAPVRAFATVRGYLEWYAGYACERGLSLDPERTVG LLFGRHYWVNDMPDVEAALVHALEAKGLGVFPAFTNTLRDKATGNKGATIWSREVFLGES GSRIGALVKFLPYFTNNGGNMPAFVGDDSPARESVRVFRELGVPIFQPVFASSKTLEEWE ADPQGLNSEVSWAVAMPEFEGAIEPFFLGGGTLSQAGVGGTEIERRTPHPERVERFASRI ARWLRLRNKPVAERRVAFLLNSDPCASVEASVGGAAKLDSLESVSRILRAMRQAGYAVDV PESGAALIETIMERKAISEFRWTTVQEIEAKGGVLAHVDLATYRRWFDAYPENVRQKVAE 
AWGNPPGEPMNGVPAAMVLNGDILVTGVRWGNAVVCIQPKRGCAGSRCDGQVCKILHDPS VPPPHQYIATYRWLQDGFGADVVVHVGTHGNLEFLPGKSVGLSGACFPDLALHEVPHVYI YNSDNPPEGVIAKRRSYAELVDHMQTVMVQSGLYDALEELDRLLGEWEQARAGNPNRAHQ LEHLIREGIAAANLESQVSPETSPDFATLASRIHAALGLLRNTHMEDGMHVFGETPQGNR RAQFIASIVRYDAGQADSLRKRLCTAQGFELETLLAEPGGVDKRLGQSHASLLEKVEKQL VAVCEILMEGTDPEVLPACIRSLLGDACLVPDALGGLVSVGRRILGIIERMEATDETGSL LSAFTGNYVLPGPSGIITRGREDILPTGRNFYTLDPRRLPTRAAWRVGQNLARALIAKHL EEEGRYPENVAMFWMCNDMMWADGEGMGQLLYLLGVVPRWLGNGVVEGFNVIPLEELGRP RIDVTVRVSGLLRDSFPAAMHMLDAAVQAVAALDEPLESNFVRKHTQERLAAIEADDPDA WRSATFRIFSSEPGTYQAGVNLAVYASAWQTEADLADIFLHWNGYAYGKDAFGVKRPRAL EASLSTVDVTYNKVVSDEHDLFNCCGYYGTHGGMTAAASHFRGGQVKTYYGDTREPEQVQ VRDLTDEVRRVVRTRLLNPKWIEGMKRHGYKGAGDISKRTGRVYGWEATTQAVDDWIFDD IAKTFVLDPETRAFFDENNPWALEEIARRLLEAEARHLWKADPDVLAELREAYMDIEGTL EDRTEAFGGEFQGGSIDIVTSDDVAEWKRALEVVRGNNQ >gi|316923264|gb|ADCP01000072.1| GENE 37 37928 - 39976 1676 682 aa, chain - ## HITS:1 COG:PA1271 KEGG:ns NR:ns ## COG: PA1271 COG4206 # Protein_GI_number: 15596468 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Pseudomonas aeruginosa # 14 581 12 521 616 125 27.0 2e-28 MKKQGTCLAALLGALCLFPQGALAEETVSATMKPVVVTATKTEHSLGETTADVSVVTREQ IEKMPANNVLDVMRTMPGVTVDSARSFYGTSTQNKVIIRGMGGDDVNGRVLVLMDGLPVM TAGNNIFNWDTISLDTVERIEVVRGPASALYGSSAMGGVINIITRKPTEEGFKTTVGTKF GRYNTWQNKLYHTGAIDKFSYAISGSMLKSRGFNVLPEHSPKTSSNRNEFNSAREKMENY NGALALNYRFDETADLSIHGEMSSFENTGRWHIEDFNLYSNRHQGIGARLHKDFGVVDSS FSIRGDFTKSDYDNASKTVKTSEAPSRMRTVSWDQSNTFALGDMHLFTVGVAGSWGKFDS ESNYFTSTRYTQKGGKELNLSGYLQDEISIWDGKLTVVPGVRYDYWRSKGYLQDTNDRVN PDKQDFSSTSNVRFSPKLGVRVDPWDNLVVFRSNYGEAFRAPTLNDLFGGSIIGSSRYKS NPNLKPELSKTWDVGVDVNPTDRLTLNVTGYKTWAEDYIANIPKGKEDGYNIKIKENIDK VTITGIEANIHYQYNEYLKLFAEGTALRPEIRSGANQGNHLHNVPTRKASLGFTFSHPDW FTLQASATWLGRIWQDQTDNGDIAEGNFWLGEFKLSKRFDYERYWLEPFIEMQGITSKDE IRYTNGSRVPINMFFVGLEAGF >gi|316923264|gb|ADCP01000072.1| GENE 38 39973 - 40806 248 277 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 
50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 1 226 278 509 563 100 28 2e-20 MLELRDLSFAYPAGRPLIEGLSFTFRSGGICAVLGPNGTGKTTLLKLLLGILKPSSGEML LSGMPIGRMTAQRRARKLAYIPQDVSLRFPLSVFEFVLLGRKPWFVWGPSPSDVALTGTV LSRLDLDALAERPLANLSGGQRQKVALARALVQETDWLLLDEPTSSLDLRHQVEVLTTAR RVATKQGKGVILSMHDINLAARYADTILLMKDGGIVASGAPADVLTGERIASVYGLPMRA FHAADNPQVSLWFPDTTEEGAVFPSHSGIFQEEDMEQ >gi|316923264|gb|ADCP01000072.1| GENE 39 40799 - 41800 548 333 aa, chain - ## HITS:1 COG:MA3921 KEGG:ns NR:ns ## COG: MA3921 COG0609 # Protein_GI_number: 20092717 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 46 326 76 361 369 239 46.0 6e-63 MHAIPAARYLARRKRQNLLLGLLALVALCMLLLGVSVGNGVVRLPLGLSEHELALFVRLR LPRVLMGALVGAGLAMSGTLAQAVLRNPLASPFTLGISSGAAFGAALAILLGGASQWGMA GNAFVFAALTAFAALGFARLRDSRPETLILGGVAIMFLFSAATSLLQYVATEYEVQAIVF WGFGNLGRVGWSELGMAAVMILLPVPFALKLSWDLNALLAGEETASSLGVNVRRLRIGGI LAASLMAAGAICFTGVIGFIGLVAPHIARLALGAEHRFLIPGAALFGASLVTFSDVLARN ICPPQIIPIGILTSFLGVPFFFWLLMRKGQNHA >gi|316923264|gb|ADCP01000072.1| GENE 40 41956 - 42804 673 282 aa, chain + ## HITS:1 COG:alr2486 KEGG:ns NR:ns ## COG: alr2486 COG1131 # Protein_GI_number: 17229978 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Nostoc sp. PCC 7120 # 5 278 8 300 300 205 39.0 7e-53 MIQTEHITKSFGRTRILSGISLHVRDRELVAYLGPNGAGKTTTVRILSGQARADGGTVKL AGRNLTEHPLEAKALCGVVSQHLNLDAELSVRENLDIHGRLFGMNASERKEGITSLLETV EMAHKVDVLTKTLSGGEKRRIMLARALLHKPRILFLDEPTVGLDPLIRRKLWGLIKRIQQ EGTTILLTTHYIEEAEFLADRVIFLDKGRIVAEGTPNALMNRIGQWALDVQCNGLLTTRY FNERDEAARETTQEQGTSTVRRVNLEDVFLSITGRKVSGSAQ >gi|316923264|gb|ADCP01000072.1| GENE 41 42819 - 43544 583 241 aa, chain + ## HITS:1 COG:all4219 KEGG:ns NR:ns ## COG: all4219 COG0842 # Protein_GI_number: 17231711 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 4 208 28 240 275 102 34.0 6e-22 MAAVYLREMLILRRRILKQFASWLVSPLLYLVAFHYAMNDTLVGSRPYADFLLPGLVAMS SMTQSWAIASDINISRFYWHTFDEFQTAPLHAAAYVTGEVLAGMTRAVLACLVVMGIGLL GGIQIAYGAPGLWLALLLNAWFFSSLAVALALHVKAHADQALLSNFVITPMAFLGGTFFP LDRLPGWAQHVLELLPLPHAAQAMQAAAIGQPVRFLSCGLLFLGAGAAFTWAIFSVRSAQ G >gi|316923264|gb|ADCP01000072.1| GENE 42 44268 - 44837 330 189 aa, chain + ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 10 175 4 193 202 87 30.0 1e-17 MNFLMSFADFFAKGGPLMWPILLCSVLALAIVLFKTLEYTTALHTLGKIKQGDSIPLPWF LAPICERLGTADEETLSLTANRQVRKLERGLGVLELITTIAPILGLTGTVTGMISTFQAI GEHGSRVDPSVLAGGIWEALITTAAGLLVALPAHVAYHFLENRLSELIQTVQEIVDVRRI AFCGISHEN >gi|316923264|gb|ADCP01000072.1| GENE 43 44827 - 45234 236 135 aa, chain + ## HITS:1 COG:sll1405 KEGG:ns NR:ns ## COG: sll1405 COG0848 # Protein_GI_number: 16329195 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Synechocystis # 1 118 9 127 142 66 29.0 1e-11 MKIKRRKRARNSMSITPLVDVVFLLLLFFALTLHFSPEEAISVELPTSSSAKQQSETEII LTVTPEGVIRLNGKDVPSQSLETELASLRKIDEKQAVQVRADQEVEVGKLVAIIDAIRNA GFQHFDLMTQQVNRK >gi|316923264|gb|ADCP01000072.1| GENE 44 46067 - 46270 147 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDFICGYKVISQVAIKPSYTTDGVVTYGTAGWSRMLDEKHVGDGLNSVPETQGSISFRLR LLWQHLR >gi|316923264|gb|ADCP01000072.1| GENE 45 46961 - 47233 247 90 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2842 NR:ns ## KEGG: DvMF_2842 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 77 2 78 98 84 50.0 1e-15 MNEFYQLIQDTVSRSPIGAKAVAVKIGKPYSTMMREVNPNDKGAKVGADTLMDIVRATKD ISPLVFMAKELGYRLVPVSKDEGEEDSAAL >gi|316923264|gb|ADCP01000072.1| GENE 46 47289 - 47771 -8 160 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MPGFLEFPLHGNVSRIAITYGYFSRPVIVVSYIMYHPKVIFVIGVGVTAIALPHCCFGLR YLFLCIRYDIRAQWQTPGGANPRRGAHIHTFQVPHGSGRRVFAAPFHVGPVNSQGQACSR TSPVIERGPFPEPGLVGLLANGPRGEHEKRRLQRTAFAKK >gi|316923264|gb|ADCP01000072.1| GENE 47 47985 - 51077 2245 1030 aa, chain - ## HITS:1 COG:hyfR KEGG:ns NR:ns ## COG: hyfR COG3604 # Protein_GI_number: 16130416 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli K12 # 640 1025 263 657 663 246 38.0 2e-64 MTSQQIAHVLALHLAGRSLFIEELALLTGNVLPASDERWRELERLGLAERVSSPHGTAWK AGDETAAFIRSQLPLLKEFFRPDEKAYALTCLHEQASLSDCETVLSRISQCAERDRNILP LLELAVLFLLRWGRRNGSTAGLDGKFLNLVFAVQGLSFNSTELLKKILRLSALAYGLARK NGNKRFLALLMLSRMYLHVFIGSNTTRLTQNLQQALCRVRSYFDGDTQEALPLFEGLLAY LCGDLKKNIDWFNRCSEETPWYYKRFYDTLAVCTLFSAGYLQQFHFALGCIEFLRRTATL TDDILSATLYEVNTCFLLLRKRDSGKALEFINRLRVSPLYTQHTVVNSLTTRAHALYHFL AGDIERAYAVLNDETRRAITRDIRPATFKDPLVLDMLYVFAHNGYVPIPYYELEPTLGHL LLGANVHLKGSAMRVQALLLRDRGEGPQKQLELLRQSFELILTTRDMHELTLTAHELANT CELCGSMEEARSLRNIVCKATGKPLDASTSYHQASLACLNVPPAAFSLFTCSLPDAESEE KYIIERCHKALNTAPQREIFEERLHLLLRASIDGFRVERAAFFRTVDEDNPLQFVQAVNL SALELKSEHMRPCMQWLVQVAQQRSATTPVHFHKQYGFCLSIDIGTEPPWLLYLSSTYSV LPADEFSLFELRCAARLFTAELRSALFVRRLKERENELQQNKLRSIFLQEDRGECLVLGA GLNTLLNEAKYAAVTDVPILLWGETGVGKEIMARHIHRLSGRQGPFIAVHPASMVESLFE SEFFGHERGAFTGASNQKIGLFEMADQGTLFIDEIGEMSPLIQTKLLRVLQEQRFMRVGG TAEIKSRFRLIAATNRDLWKEVREKRFRQDLLYRISVVPLLLPPLRERKQDILPLVDAFI RHSSQRYGKTIFPLSPEQKQALVDYSWPGNVRELKNIIERAVILCEQGASLHFHFDQANM EPFFSSQAGDGGILADTPTLEQLEERYLRKIMAMTRGRVRGEHGAAALLHMKIPTLYAKL RKFSIPCGKR >gi|316923264|gb|ADCP01000072.1| GENE 48 51146 - 51361 127 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MARGAPEKRGASDTYRQRYNRIVVEKHNFVWCNAKMTAFPSWRAVLLWNVFIASRFRFVH PESGTLVAKAH >gi|316923264|gb|ADCP01000072.1| GENE 49 51444 - 51875 502 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|182418212|ref|ZP_02949512.1| ## NR: gi|182418212|ref|ZP_02949512.1| transporter, major facilitator family 
[Clostridium butyricum 5521] # 4 140 268 404 411 82 33.0 1e-14 MITPFSGWLSDRLGLWSRRTRAWLGVPCFLILAAVFAAGFYYQIAACCAIGMFLFIIPVT GVHIATQELVPTRYKATAYGTYVTLLQGLGFFGPMLAGALSDAFGLQLALVYMQLVFVIG GLIMLVAGFTYVKDYNRARAMEA >gi|316923264|gb|ADCP01000072.1| GENE 50 51872 - 52435 529 187 aa, chain - ## HITS:1 COG:CC1819 KEGG:ns NR:ns ## COG: CC1819 COG0477 # Protein_GI_number: 16126062 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Caulobacter vibrioides # 31 185 30 185 479 70 28.0 1e-12 MPSIVLDREEEMESQTESVVASVRQDGASAEKGLAASDEPTSPVGKKWFLFGFGVLYMLF LLDFAARLGITAVFPAMQKDLGLSDSQVGVAGSAVLLGMTVFVLPFSFLADKGSKKHAVN LMSAVWGVGCTLCGLVSHLFLIVLGRFMVGIGNASYAPVSVSMLTSWTRRSRWGSVIGAY NSALPAF >gi|316923264|gb|ADCP01000072.1| GENE 51 52521 - 54134 1465 537 aa, chain - ## HITS:1 COG:SMa0101 KEGG:ns NR:ns ## COG: SMa0101 COG1574 # Protein_GI_number: 16262503 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Sinorhizobium meliloti # 24 537 24 540 541 212 31.0 1e-54 MFCTEKLKLYTGAKIVTLYDPMPIVEAVCVIGERIHAMGSLEDMERCAALCGEYEKIDLG GGVLYPGFIDTHSHLSMYADTFSQVYCGSRLGTVPRVLDGLREAAGNTPQDRWIVGYAYD DTGIDEKRHLTREDLDAVSTERPILVKHLSVHFCYLNSLAIEKLGYTAETRVAGGTVCLG ADGRPNGILEENASYQAIAKLPATSFDETRKNMLRAVRDYNAHGFTTFMDGGIGLSGSYK TDMTAYLSLARDKALDARGYLQFMPPVLDALEPYGLWGFPSEYLSFGGVKYFTDGSIQGY TGALLEDYHSRPGYRSELVCTPEELGNLVMHYHAAGIQIAVHTNGDAAIEATLQAYERAQ RALPRPDLHHIFVHAQMASDGQLRRMKACHALPTFFVRHIEVWGDGHYDTYLGPSRANRL DPAGSAVRIGLPFALHVDTPVLPVTALDSIHAAVNRISPSGRLMGADQRISPREAIKAYT VYASLFCMGEHDRGRIEPGRLADFVLLSDDLEAIDPLAINRVKVRMTLCGGRIVYQA >gi|316923264|gb|ADCP01000072.1| GENE 52 54693 - 55020 160 109 aa, chain + ## HITS:1 COG:no KEGG:Nmul_A1248 NR:ns ## KEGG: Nmul_A1248 # Name: not_defined # Def: integrase catalytic subunit # Organism: N.multiformis # Pathway: 
not_defined # 1 109 1 109 347 174 73.0 9e-43 MESFNQNVIKHKTGLLNLAAELGNISKACKMMGFSRDTFYRYQAARDAGGVEALFEVSRR KPNLKNRVEEAIEVAVTAFAVDFPAYGQTRASNELRKQGIFVSPSGVRS Prediction of potential genes in microbial genomes Time: Fri May 13 03:10:16 2011 Seq name: gi|316923163|gb|ADCP01000073.1| Bilophila wadsworthia 3_1_6 cont1.73, whole genome shotgun sequence Length of sequence - 102950 bp Number of predicted genes - 105, with homology - 66 Number of transcription units - 41, operones - 20 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 33 - 404 188 ## bglu_1g32480 integrase, catalytic region - Term 1066 - 1104 5.2 2 2 Op 1 . - CDS 1177 - 12345 5624 ## CtCNB1_2656 hypothetical protein 3 2 Op 2 . - CDS 12421 - 12666 168 ## - Prom 12697 - 12756 3.5 + Prom 12645 - 12704 3.7 4 3 Tu 1 . + CDS 12757 - 13302 265 ## + Term 13314 - 13344 2.1 + Prom 13316 - 13375 4.3 5 4 Op 1 . + CDS 13440 - 13766 226 ## DvMF_2721 hypothetical protein 6 4 Op 2 . + CDS 13768 - 15216 393 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) 7 5 Op 1 . + CDS 15323 - 16339 160 ## DvMF_2723 radical SAM domain protein 8 5 Op 2 . + CDS 16336 - 16599 61 ## + Term 16617 - 16647 3.3 + Prom 16764 - 16823 3.9 9 6 Op 1 . + CDS 16914 - 17600 617 ## COG3646 Uncharacterized phage-encoded protein 10 6 Op 2 . + CDS 17659 - 17937 436 ## 11 6 Op 3 . + CDS 18014 - 18193 191 ## Gura_0580 hypothetical protein + Term 18212 - 18253 3.1 - Term 18198 - 18239 3.1 12 7 Op 1 . - CDS 18294 - 20054 851 ## 13 7 Op 2 . - CDS 20063 - 20473 329 ## 14 7 Op 3 . - CDS 20487 - 20846 298 ## 15 7 Op 4 . - CDS 20850 - 21467 538 ## 16 7 Op 5 . - CDS 21464 - 22006 582 ## Desal_2351 hypothetical protein 17 8 Op 1 . - CDS 22190 - 24196 895 ## Nmul_A1444 hypothetical protein 18 8 Op 2 . - CDS 24231 - 24797 614 ## 19 8 Op 3 . - CDS 24810 - 26096 437 ## gi|239835198|ref|ZP_04683524.1| Conserved hypothetical protein 20 8 Op 4 . - CDS 26093 - 26338 173 ## HSM_0915 hypothetical protein 21 8 Op 5 . 
- CDS 26335 - 26898 695 ## CV_0338 hypothetical protein 22 8 Op 6 . - CDS 26909 - 30253 818 ## 23 8 Op 7 . - CDS 30263 - 30829 418 ## 24 8 Op 8 . - CDS 30816 - 31520 703 ## ECH74115_3509 hypothetical protein 25 9 Op 1 . - CDS 31890 - 32339 498 ## 26 9 Op 2 . - CDS 32348 - 33433 1622 ## Smlt1961 putative phage-related protein 27 9 Op 3 . - CDS 33444 - 34547 950 ## 28 9 Op 4 . - CDS 34561 - 34857 197 ## 29 9 Op 5 . - CDS 34841 - 36940 2536 ## Aave_2355 hypothetical protein 30 9 Op 6 . - CDS 36940 - 38244 1368 ## BURPS668_A2354 hypothetical protein 31 9 Op 7 . - CDS 38248 - 38724 502 ## Mmc1_1689 hypothetical protein 32 9 Op 8 . - CDS 38727 - 38990 272 ## 33 9 Op 9 . - CDS 38999 - 39298 148 ## 34 9 Op 10 . - CDS 39295 - 39525 235 ## 35 9 Op 11 . - CDS 39525 - 39905 396 ## RB2501_01256 hypothetical protein 36 9 Op 12 . - CDS 39977 - 40240 92 ## 37 9 Op 13 . - CDS 40240 - 40515 300 ## 38 9 Op 14 . - CDS 40539 - 40865 140 ## 39 9 Op 15 . - CDS 40795 - 41331 237 ## CLL_A2280 VRR-NUC domain protein 40 9 Op 16 . - CDS 41328 - 41744 152 ## LI0834 hypothetical protein 41 9 Op 17 . - CDS 41753 - 42814 425 ## Noc_0052 hypothetical protein 42 9 Op 18 . - CDS 42817 - 43131 141 ## 43 9 Op 19 . - CDS 43128 - 43919 669 ## COG3645 Uncharacterized phage-encoded protein 44 10 Tu 1 . - CDS 44036 - 44593 507 ## 45 11 Tu 1 . + CDS 44670 - 44930 167 ## + Term 45061 - 45095 -0.9 46 12 Tu 1 . - CDS 44884 - 45123 147 ## - Prom 45155 - 45214 2.9 + Prom 44938 - 44997 3.5 47 13 Tu 1 . + CDS 45180 - 45869 68 ## Ppro_0272 putative phage repressor + Term 45945 - 45986 -1.0 + Prom 46696 - 46755 7.9 48 14 Op 1 . + CDS 46775 - 46999 96 ## 49 14 Op 2 . + CDS 47004 - 47195 101 ## + Term 47223 - 47255 3.0 50 15 Tu 1 . - CDS 47199 - 47675 -195 ## - Prom 47808 - 47867 3.1 + Prom 47798 - 47857 1.8 51 16 Tu 1 . + CDS 47881 - 48120 268 ## DvMF_2842 hypothetical protein + Term 48201 - 48236 9.5 + Prom 48730 - 48789 3.8 52 17 Tu 1 . 
+ CDS 48812 - 49168 202 ## + Term 49226 - 49255 1.1 + Prom 49220 - 49279 4.3 53 18 Op 1 . + CDS 49505 - 49747 226 ## SPAB_05352 hypothetical protein 54 18 Op 2 . + CDS 49731 - 49979 229 ## SPAB_05353 hypothetical protein + Term 50000 - 50041 0.3 55 19 Op 1 . + CDS 50123 - 50473 439 ## 56 19 Op 2 . + CDS 50470 - 50676 136 ## 57 19 Op 3 . + CDS 50673 - 50810 76 ## 58 19 Op 4 . + CDS 50807 - 50941 143 ## 59 19 Op 5 . + CDS 50934 - 51806 984 ## Cphy_2988 hypothetical protein + Term 51812 - 51848 4.0 60 20 Op 1 . + CDS 51866 - 52333 513 ## COG0629 Single-stranded DNA-binding protein 61 20 Op 2 . + CDS 52355 - 53341 1195 ## DvMF_1683 protein of unknown function DUF1351 62 20 Op 3 . + CDS 53412 - 54338 922 ## DVU1525 hypothetical protein 63 20 Op 4 . + CDS 54332 - 54580 150 ## 64 20 Op 5 . + CDS 54583 - 54777 140 ## 65 20 Op 6 . + CDS 54813 - 55160 300 ## DMR_23570 hypothetical protein 66 20 Op 7 . + CDS 55206 - 55505 259 ## + Term 55514 - 55559 9.5 + Prom 55815 - 55874 3.4 67 21 Tu 1 . + CDS 55913 - 56107 151 ## 68 22 Tu 1 . + CDS 56261 - 56746 454 ## 69 23 Tu 1 . + CDS 57096 - 57476 82 ## - Term 58249 - 58278 -0.2 70 24 Tu 1 . - CDS 58403 - 59569 410 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 - Prom 59631 - 59690 4.3 - TRNA 59774 - 59866 69.4 # Ser GCT 0 0 - Term 59709 - 59760 11.6 71 25 Tu 1 . - CDS 59961 - 62375 1408 ## COG1452 Organic solvent tolerance protein OstA - Prom 62409 - 62468 3.9 + Prom 62546 - 62605 3.1 72 26 Tu 1 . + CDS 62793 - 63947 1051 ## COG0787 Alanine racemase 73 27 Tu 1 . - CDS 63988 - 65100 1121 ## COG2230 Cyclopropane fatty acid synthase and related methyltransferases - Prom 65181 - 65240 2.6 - Term 65317 - 65369 -0.7 74 28 Op 1 . - CDS 65383 - 65670 340 ## Dde_1968 TRASH 75 28 Op 2 . - CDS 65692 - 66207 237 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 76 29 Op 1 . + CDS 66270 - 67280 985 ## COG4974 Site-specific recombinase XerD 77 29 Op 2 . 
+ CDS 67299 - 68303 979 ## COG0618 Exopolyphosphatase-related proteins 78 29 Op 3 . + CDS 68242 - 70047 2154 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase + Prom 70049 - 70108 1.9 79 30 Op 1 . + CDS 70141 - 70647 421 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 80 30 Op 2 . + CDS 70658 - 71068 621 ## DVU1651 hypothetical protein 81 30 Op 3 . + CDS 71085 - 73796 3000 ## COG0249 Mismatch repair ATPase (MutS family) 82 30 Op 4 . + CDS 73793 - 74260 424 ## DvMF_0354 hypothetical protein + Term 74482 - 74538 4.2 - Term 74474 - 74521 7.2 83 31 Op 1 . - CDS 74656 - 75123 464 ## COG1267 Phosphatidylglycerophosphatase A and related proteins 84 31 Op 2 . - CDS 75135 - 75755 490 ## COG1351 Predicted alternative thymidylate synthase - Term 75780 - 75831 3.0 85 32 Op 1 29/0.000 - CDS 75875 - 76903 755 ## COG2255 Holliday junction resolvasome, helicase subunit 86 32 Op 2 . - CDS 76903 - 77514 714 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 87 32 Op 3 . - CDS 77609 - 78688 481 ## COG1541 Coenzyme F390 synthetase 88 32 Op 4 . - CDS 78688 - 80055 629 ## COG1964 Predicted Fe-S oxidoreductases 89 32 Op 5 . - CDS 80052 - 80489 219 ## Ddes_1819 hypothetical protein 90 32 Op 6 . - CDS 80498 - 81208 171 ## COG0500 SAM-dependent methyltransferases 91 32 Op 7 . - CDS 81205 - 81420 320 ## Ddes_1821 hypothetical protein - Prom 81471 - 81530 3.5 92 33 Tu 1 . - CDS 81575 - 83938 1431 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Term 83958 - 84018 15.6 93 34 Tu 1 . - CDS 84071 - 86797 3753 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs + Prom 86854 - 86913 2.4 94 35 Op 1 . + CDS 87100 - 87390 215 ## COG2158 Uncharacterized protein containing a Zn-finger-like domain 95 35 Op 2 . + CDS 87464 - 87688 386 ## + Term 87710 - 87761 1.2 96 36 Op 1 . + CDS 87829 - 91359 4105 ## COG0642 Signal transduction histidine kinase 97 36 Op 2 . 
+ CDS 91359 - 93224 2355 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Term 93311 - 93352 7.2 98 37 Tu 1 . - CDS 93386 - 93847 464 ## COG2954 Uncharacterized protein conserved in bacteria 99 38 Tu 1 . + CDS 94140 - 95990 2459 ## COG1217 Predicted membrane GTPase involved in stress response + Term 96130 - 96178 11.9 - Term 96120 - 96160 6.0 100 39 Tu 1 . - CDS 96254 - 97225 1177 ## LI0327 hypothetical protein - Prom 97399 - 97458 1.6 101 40 Op 1 . + CDS 97691 - 99055 1820 ## COG0439 Biotin carboxylase 102 40 Op 2 . + CDS 99074 - 101323 3106 ## COG0825 Acetyl-CoA carboxylase alpha subunit + Term 101338 - 101376 6.0 103 41 Op 1 . + CDS 101396 - 102046 935 ## LI0490 hypothetical protein 104 41 Op 2 . + CDS 102048 - 102392 389 ## LI0491 hypothetical protein 105 41 Op 3 . + CDS 102416 - 102950 571 ## COG0629 Single-stranded DNA-binding protein Predicted protein(s) >gi|316923163|gb|ADCP01000073.1| GENE 1 33 - 404 188 123 aa, chain + ## HITS:1 COG:no KEGG:bglu_1g32480 NR:ns ## KEGG: bglu_1g32480 # Name: not_defined # Def: integrase, catalytic region # Organism: B.glumae # Pathway: not_defined # 3 123 226 346 348 220 81.0 1e-56 MEMGLIRMLTDRGTEYCGKVEAHDYELYLGVNGIEHTKTKARHPQTNGICERFHKTILNE FYQVAFRRKLYQSLEELQADLDTWIDSYNTQRTHQGKMCCGRTPMQTLLDGKSLWAEKVG QLN >gi|316923163|gb|ADCP01000073.1| GENE 2 1177 - 12345 5624 3722 aa, chain - ## HITS:1 COG:no KEGG:CtCNB1_2656 NR:ns ## KEGG: CtCNB1_2656 # Name: not_defined # Def: hypothetical protein # Organism: C.testosteroni # Pathway: not_defined # 1517 2595 1282 2366 3594 674 42.0 0 MKTIGEYRARYPQLRELTDYQLTRAVYDTTAPDMPLEDFAKAFGGVTEEDPEEVRIQEYN LSHPDDTIRREDMGKASLLSDTGHGIAAGINELAAIPFWLAEEMGIEPAGRAKDYFLKSA EEKRKGYSPEMKLARAEEFFTEEPDGSYGLGGAWTSPRKILGSVAESAPGMVGGMGLSAL AARGLMKAGFSRGLAWAVGASLGEGTIGGAQNALDVHEAILAMPEDKLAASPEYQEILQR TGDPKRAREELANSVAAMTGLKTGASTAVLSAPSGYMFGKILGGETGKTLLGTAAKQGAA EFGEEAGQSGAEQYLSQAELQRADPTIDPMSGVGEAAVSGGFAGMAMGGGMGGFGHVLGS 
RAQQQPGAPAVSDAQPVLPPEQQAALPEGGHGSPLELGPGQNIVMGTDRAYEDPIATPPP TSPLQKGMEAILGGPLGLSTAPAAPDFAVTPGGRAVSANGIVQPDAFGPIAAGMRSDMPP LLADQVYDVEYSVVPGAPKALSEAPRGLPLGQDKSPVAPESPQPFDAQANGGSAPLQLTR INDKSFMLHGDTGPFEEAIKAAGGRLDRKVGGWRFPLVREGEIRELFGIPLGAETVPGDN LVSEVPEASPEVLRETPLPVSGDMAMSGQTGLEVTHGGLQQGTPGTGTGELLSESVPVDA AGNPGELDAVTGRPGDFPMGDRGSRPDMDTAVPGAGGTEAGLGAVVQRAAGPVEIPAEGV DAALLSPDRRNVPDGGAGRHAEGGRRGTVFRGHGQVGRDVPGGSERTERAALEGNGAQGV EQKLLRAIERQNGGLTPEAEAWARSLPSARQEAAWQQLTLDAFKGYTASNLRVDEASLKF IPLSKDTKTSLQLKALANVFGQDLILFRDASKQPTGILGATIPAINNVIFMEENREGYRP HLSVLGHEMLHSLHNSSFKLYSDFRKYLLSELKPEALQHYREVLDRRTGHDGTVGRMSDD KILEEIGADLVGKRLTEESFWAKMAEERPSLFARVSQFVRDLLDRVTAAFRADPLPAAWV KDFDAIRRHIDAMMGQWAEKNARAKTGGTPPHSGNTREVLDGRDETVRNAGRNGGEGSDH AGGLLQSNRGRHGAPVDLVGDGGRGGDRGRAAQPAPEGREGERPRQPVAVKIPNRADEPA HYELLEADDVRASHLPSRGFQKNPAYGLENERRYHDEPGSRAKVMENAAKLDPAFLMESV DANHGAPVIDHDNNVLGGNGRAMSIARAYESFPERGEQYREALKANADRLGIDPAQVDAM HSPMLVRRLERGMSREERQQLVTAMNDDFKDAKEKRASGKSRGERFGRRTLDMLSAGLKD AESLREYFDTPASVAVVERMMEDGVIQRSERNALVGADGLLNPDGKKVVEEALRGRIARS YEALAKLPADVVGKLDAVIPHILVAEGIGKPWNITEHVRDSVDLLVGFKGSGVKEPGTYL KQVNMLTGRAPVQDFSKQAIALFRMALDAKKGEYVKAFESYLKNAKLSPEAGNIPGVAKP QDKAFRDAFGMKTEATPNKPEKREAEKPQTEEETKGQPPEKSPKKEPQEQTSGKPGDAGH VERHRALYKSLLNGKASLEDYHRGFSALLENEEAVRAELSGMTVNALLARGGAMFAWRHK GDKKADIVDALYRDMLSDFTLGQPYSVSGFMGMGPEEYRKATVASIRRIVDAQSEQTLAD FARDAAAQREEYASRRKAEREAVENPKTLEDFRSFIRFHMDEGKTAGEARDLLSFEQREQ FDRLAAEASRSTRKTKAEPVLSSSSQTTGGEIIATKHTQKGHDLFVVRLEDRVSREDYLA LLSSAKKLGGYYSKFRGNGAVPGFQFTGREQAEAFLKLAQGDKAEAQEAVKARRDAYEDD RSQTAVQRLNEMADRLDEDADASLNAARKVNTAKRAGQAAAADAMAYADKSLAKTMRNIA GAIGNGTATFLDRVREKKQVEQLRDALRMAKWEEDKTKYPAWDQREKHSEEPATTQTVEF AAFPHYRLYRSSWAELGRKLSEKDGTKSLGKKILSVADDVTETYLEFAHSHLEEVSTYAT QAGKPAVFKSREDAEKAIARSGYRGKAVVLPVKRGENRIILSPSAAQERGIWKGDDDRML SLRPELVDELFDKQGSTSRDIPWQLFDIRQKRKALARMGIETPAEYRMALREFVGLQAAA KQDKVRELERSVMRDAANNRGWLDFFPTPVAVTEEMLAAADIRPGMGVLEPSAGMGHIAD RIREKGVEPVVAELEPQKRELLEAKGYEVIGKDFMKDIPEGESFDRIVMNPPFSKRQDTE 
HVRRAYDLLNPGGKLVAIVSEGSFFGKDKKASEFRDWLEEVGGTSEKLAEGSFNDPSLPV TTGVNTRMVVIEKEGTPMASVTPGEFLPAMPSTRLRIRKAAVQRVADALGKRAANAADTR VVQSFEELPEHIRQLYGEVSSRLEGVYDPASGTVYLVADNLRGTARAAEVWMHENMVHHG LNGLLGMDEKRRVLNRLWQSMGGMGNAGIASIARKYGVDPRSDVEGRALVMEEYLAHLAE KRAAGKLSDQEQALWRRFVEAVLRAWHALTDAVTGRTGSMKYENVDRLLSALDRYVFEGR PEGMAEGGMVPAMASVDRSGSDMEAAHAAWEQVQRDAEEWGRQVDGYMPGVKGRAGKDTT LLTVCRTPDVLQKLGAVDLPMTMTRDNLTKILSDKQDHALPKDLVKQLPQAVAEPIMVFE SATQADAFVVLTELRHEGRSVMVAVHLDTERQHIRVNDIASAYKRGNEGWYVRQIEEGRL LYQDKKKSLAWARTARLQLPLVRKLPSRLSGNKILTEADVVKPIAPDNKPLASLRDVFGT TAKEEQRAEDAAPKTPLDSFRSKIGGHRPTVAERFAAFRENWKRKVEQQVFDHFASIKDV SKDGYVQARMTTSLGTQIESLIKYGAPVWNDGIMGVEGKSLTDVFAPVASEMDSFMGWMV ANRAARLKKEGRENLFTDAEIAAAQKLNEGVMPDGSKRAATYKQVFEDFARFKKAVLDFA QEGGIIDPESRSAWENADYIPFYRVKDGLDAVRSARNNSGLAGQSSGIKTLKGGKANIGD PVENIMRNFIHLIDAAQKNRAADMILQDMAKARLARKLTPEEVARLEKRQGDELTEVLAQ DKELAAQCGPLTDSQKKVIAATFNPTEFNYVRVQKDGKSLYYSVEDPMLLDALTGMWKQD GRQGPVWKLLRGTRRLLTTGVTSMPDFMLRNFLRDTLHAWTIEHQSGYRPLVDSLAQIKN TFNVDEDTKQLLAAGSAFMGSGYRLGNSGEEAAKALNKVLDKYGMDKKAFQDSVLDTGEK LKAFVGKLREGYEHAGSAAENAARLAVYKKMRAKGFSHREAAFAAKDLMDFTVRGQSQLI QSLCETVPFLGARLAGLHRLGRGAVENPRAFAAKGMTIALASMALYALNQALNPDDWDKM EDWEKDTYFHFWVDGEHFRLPKPFEVGFIFGTIPERMMEQLITDKSGGVFASRVFAGIWD QLAFNPFPQAVRPLVEQVANKDFFRETPIVGMGMDRKAPVDQFDSRTSLTVRAMASFMDA LLTPVMGKTGADVLRSPKRLEHLVNGYFGGLGAFVLGASDMVLRPLGNEPEAPERKLEEL PIIRSFYGGSGEKRTRYESELYEVMKQGNKLHASLKQIAEEGGNVSAAKAELSHDDIVAL ANRKGLTALTKEFGKLNKAAEKVRINRVMSSSDKQRELDNIQRRKNILAQRAMKRIEEMR VQ >gi|316923163|gb|ADCP01000073.1| GENE 3 12421 - 12666 168 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSLSKEELVLTACFLKSTDMSISIEDALGDVKQISTSLPESFDPAHSRLLAKAACILLAS NRLSPGDAIAEAQKVITLAGL >gi|316923163|gb|ADCP01000073.1| GENE 4 12757 - 13302 265 181 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MADLKNMTGTEAFQLAKDESALTTDEIAERLNVSPSVIKRYLKSGDSYLPGLEMLPRLCA VLGNTILLQWLEAQVESEEESVPPAQSRAEVLTAVARAGVALGDVQRILVESKVIAPHHA REIRSALNDVITECRIAKESLQPLAAHRDMTKYAPLFSLKNLPGEEKNRPPRSWWKFWKK D >gi|316923163|gb|ADCP01000073.1| GENE 5 13440 - 13766 
226 108 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2721 NR:ns ## KEGG: DvMF_2721 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 23 106 25 109 116 70 40.0 3e-11 MEKQQVPFQLGEDGMKGTILASKKFFERDSVLTVSGKYSEQYFVSVQPVGEFDVEITISA KAHSSLSEDLLKQFMNDLIDQQIRIDLQKEFGQLRNIIIEYAFSPVSK >gi|316923163|gb|ADCP01000073.1| GENE 6 13768 - 15216 393 482 aa, chain + ## HITS:1 COG:CAC0658 KEGG:ns NR:ns ## COG: CAC0658 COG0641 # Protein_GI_number: 15893946 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Clostridium acetobutylicum # 23 250 51 293 518 80 25.0 6e-15 MYQMLPFQFARFPEHDVLMVNECGEFLFLEENRFDSLVRHEFDESSEDFLNLKGKLFVAQ DDPEVALQKIAAKYRSRKSFLREFTSLHMMVITLRCNQRCEYCQVSCAEQDAYKYDMPVD VAEKIVDMIFESPTKHPKIEFQGGEPLLNWNVITATVTYAEKKAALLNKNVSFVICTNLI GITEEQLLFCRDHNISVSTSLDGPKTIHDKCRIVRTGGSSYDRFLERLDTSRRILGHDGV DALMTTTAFSLHRLEEVIDEYIRQGMSGIFIRSLNPYGFAAEQANTLGYSMAAFTSQYLR ALQYILEKNKSVFFPEHFATLLLSRILTPFSTGFVDLQSPSGAGISGVIYDFDGSVFPSD EARMLARMGDRHFCLGNVLHDSYQEIFAGPKLKQMTATACVETTPSCAWCVYQAYCGTDP VRNYLETGDELRNMENSPFCIKHKTLFTGLFELLRNSSDDEDSIIWSWITRNPALVARHA HN >gi|316923163|gb|ADCP01000073.1| GENE 7 15323 - 16339 160 338 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2723 NR:ns ## KEGG: DvMF_2723 # Name: not_defined # Def: radical SAM domain protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 4 336 42 375 384 314 45.0 4e-84 MGCLGYAGCITNGEKGTFLHPDTLWHADDLDQVKEGDIVCLNDSGTAIVLWEKDSLQNSL MLTEACNCNCLMCPQPPQKHAPELLQQAENILDLLHGKHVPHICITGGEPTLLKDNFIRL LYRCTKEHPESVINILTNGKTFSDMTFARDAAHAASANTLFCVSLHSEIDQIHDELVGKK GSFDKTQLGIYNLAQCGAYIEIRHVITKLNYKRLLHFAEHVYSYFPFCSHYALMGLELCG YAAANKERIAVSPHEYKEELALAVLYMHRRGLPVSVYNIPLCLCTPRIRPFARQSISTWK NRYVQQCSECTRQEDCAGFFSTSVSLPLEHIQPIREEV >gi|316923163|gb|ADCP01000073.1| GENE 8 16336 - 16599 61 87 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKAVYLLFLFASFVASGLGLNKSEAQATSSITNDSALTTSDSGKLYFSDLIEGHNGSGL IAAHYSHRSHGSHGSHGSHQSHYSSRY 
>gi|316923163|gb|ADCP01000073.1| GENE 9 16914 - 17600 617 228 aa, chain + ## HITS:1 COG:PM1774 KEGG:ns NR:ns ## COG: PM1774 COG3646 # Protein_GI_number: 15603639 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Pasteurella multocida # 12 115 27 128 239 96 45.0 4e-20 MSQALLLSDPVPSVSLHDGRPATTSREVAHYFRKRHDNVVRDIRSIMDNCPEEFTALNFE VSNYLDQTGRSLPMYIIFRDGFTLLAMGYTGPEAIRFKLAYIEAFNRMEAELARRNRPAL PAAPRFDEAAMLELAAEIREAQQHYYRTFGRLCSRLISMSIPVFTALESRVYKQAPDRPF SGVRIGAQWERYFTERMTAALHSLDDRLPDEKNPAMLLLEYARAMSAR >gi|316923163|gb|ADCP01000073.1| GENE 10 17659 - 17937 436 92 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLLFDDGKKLEKAVGEEAAKTIVEVLERFDESQRSASASKGDLRETELRLMKEIDGVRLE IQKAKAETIKWVAGIITAQTVAIIAAIIALMK >gi|316923163|gb|ADCP01000073.1| GENE 11 18014 - 18193 191 59 aa, chain + ## HITS:1 COG:no KEGG:Gura_0580 NR:ns ## KEGG: Gura_0580 # Name: not_defined # Def: hypothetical protein # Organism: G.uraniumreducens # Pathway: not_defined # 1 58 1 58 74 67 56.0 2e-10 MKANDFDEYFDDGGSVMPFADMSKAERPNVKTKRVNVDFPAWMVEASDKEAAYLTAPRV >gi|316923163|gb|ADCP01000073.1| GENE 12 18294 - 20054 851 586 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGRGGYASLAGAAPINSFGQDFLGGAAAGEKMRQVVGEIQINDRLKQIGDAVTQAGGSID GIDPTLFSDPVGLRALGEYTTQYNSSQEGLQKAWKNHDQAIMRMFETTTYAPQAFKQSGD PSILDQASDKLGLPYQTQYNPETKRHEVMYAGRFGPEATGQSFTSEELSNLYGGLAKDPK RFSQIAFQYSLATMRQNEDYLKDPSKWLYDNDGKTVFVPRKMMTANGFRSGYIVTDTANK TSKFMTNEELQQAGVFPKDLQTMKGLFDMEKGRADMANDKVRLGYEGQRVSQGWATHNLN RERFEYEKGDGLIPGMPGINRKGFKKTMDADFNQWMAAAGYAEKNGVFYKPEYNKDGSIA VDDEGKPKLRAMSPAEVGEARAEHQQDFVKRYTGQSAPGQGGDVGNIILSRGLEYKGGKP SVPPAGNGGNIPSQGKTSTEQPARLGGGMFGNSVGPTDKPQRAEDAKLSGAMPGTAASSS TQEKERILPPKGSVWERMPDKYEAYTKVFGKRPDKVKPALAVKELDAELERLAEEKLREG MDRPNWWEADLRGLGSEGALKPAIEKAKQFIMDEILTSKGYYSSQS >gi|316923163|gb|ADCP01000073.1| GENE 13 20063 - 20473 329 136 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGTLGWTPAAQLATQNQSGAAASYASMDKAVNQTQTTKVSGGSSNIFGDILPVVGAAAGF ALGGANPLAGMALGSTLGGGIGSAIGGQSAATLGGLGTLAAGTLSDMSLKSGEGVQNLIG 
DDTAKKNIANRLGKFW >gi|316923163|gb|ADCP01000073.1| GENE 14 20487 - 20846 298 119 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLMSYTSAMQQGFVPRLQSMMPQAPQGPSLAGGMLGVGVGQEGDGSLIPTPQGDAGGDA FKDLLPQSGPQQTQPGQQPTQPGMGTMRQNVTPDVRGQLRQGIGLETQPAMKAAGGAYA >gi|316923163|gb|ADCP01000073.1| GENE 15 20850 - 21467 538 205 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRIYTRIEMEMRTGRVLHACSSEYAGPIAWLKGGKGGGSSTTNISSEPDKEYNARMASVA ERQQGLSDQFFKWWQDVQAPVEKANAEAQLGLIPVQTDIVKEGLEGFKGIQTDFYNASRP TSTETYAGRAVTDANMALSKATGSMNRNMSRMGISPTSGAAQSASRDLAIQGAASIAGAA NTGRMQGEEMNLKKLSGAMAYGLGG >gi|316923163|gb|ADCP01000073.1| GENE 16 21464 - 22006 582 180 aa, chain - ## HITS:1 COG:no KEGG:Desal_2351 NR:ns ## KEGG: Desal_2351 # Name: not_defined # Def: hypothetical protein # Organism: D.salexigens # Pathway: not_defined # 9 171 23 183 196 79 29.0 5e-14 MYGMTHKSALPDVALYELYDRLKDAGRFHETFYDGEIEDSHHGFRDYARRPDIHLWALIH GGELSGMCWLTHVCHGGTAFAHFAMLPNLHPLVTLRLGRFAVASMLRLRDERGAFILEAL HGTTPVRRPKAVRWVQKVGFVSVGELPYAVFMADTGRNEPGLISYADRTTAPGAWCGEKE >gi|316923163|gb|ADCP01000073.1| GENE 17 22190 - 24196 895 668 aa, chain - ## HITS:1 COG:no KEGG:Nmul_A1444 NR:ns ## KEGG: Nmul_A1444 # Name: not_defined # Def: hypothetical protein # Organism: N.multiformis # Pathway: not_defined # 1 668 1 676 678 168 26.0 6e-40 MPVIDVTTFTGMRPAVAGHLLEQNEAQSAFNVDTSTGVLSPVYASRLEAQYPFIVGSLFR HDARASGKGLVWRVYPDKRQFVESPVAGDPHSRLYMSSPNGLRFIDGHGNEYALGIKAPE KAPRIGDAGQRGGPVSFDAETKTMTFTSTDRAACPYLTPGGKVVFSGMPPSPLAAGTTYS VLEGVPDGQGGHRYRLGNPASSSPNTPIAFADGTAQCSVAYEGRQQNRVYVFTVVNSYGD ESAPSLPASVTTDISCNQRILDLVYAPAAGEAPMAKKRIYRLATGENGNSDYLFVAEIDG GLTEFTDNRLDVELAESLPSLNWRQPDPGLRHVASLPGGVLAAHTAGGVYLSESYKPYAW PEAHNYTFQGRIQTIAVSQRTLFALTESVVHALTVDDPASAFATTLDGYAPCLSADGTVT SPLGVLFPSSDGLYLVSQGMTSPQNVTDGLISDREWHDLNPSSFFAVFYDTTYLAFYRRL NGEYGTLMLDFGKNGQARMRLMDEWGMAIVVVPGGRKIYYAKQIGNTGESGLYEMFGNED APYIATWRSKEFVFPTPVNMAAAIVESEGDEMDGGEEEILFWGGAVGDQMPGEIPFGDED SDVYPGGTPISSVLRVFADGKLRATLYVKPNRFMRLPQGYAARRWEFEITTLKPVKRIAI GTAIEELR >gi|316923163|gb|ADCP01000073.1| GENE 18 24231 - 
24797 614 188 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDYGQIIHRTFDDSYVITKNSMSYHVPNEGEFAEEWAEVRAYAEAHPECVVVEQPYVPPV PTLEELKAAKKARIDAETSAAILAGFDYAVDGVNYHFSYALDDQQNFSDTANVCLMKQSG MLGLPDSVTWNAYTVPDDELVRLTFDAPGFLALYAGGAMKHKNETMQRGGERKAAVEAAA TADVITAA >gi|316923163|gb|ADCP01000073.1| GENE 19 24810 - 26096 437 428 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|239835198|ref|ZP_04683524.1| ## NR: gi|239835198|ref|ZP_04683524.1| Conserved hypothetical protein [Ochrobactrum intermedium LMG 3301] # 2 91 6 95 369 77 44.0 1e-12 MISLRNNAQSVLTLPVSAEQTLLNLSLGDGGKFPDLSLGDSFRCAIKDSAGNVEFIRVVQ RAGDILTVERGQEGTRARDWKVGARVQLRMTAKTWEEMAGEHWRRVMDASGLPITPTVVG PSSFSLPGDFVSLFAQTRSFRLYTGDGTFLYGYVANASLSGNATLIVVEGISLPSSVVAV DIGLPLNVHPKAVNAAPVAHLTDSAAHSAIIIPLRGDIDEVKQILSGLVRSDGTIDANML PPATKKTLGGVIIGSGLKVSDAGLVQVDEAVFAKVPNAVWADGAQFALRLRREGNVDTVW NWIDPGGQPSHVWGGNDSVNMYVYNPANFSVNYANSSNYANSAGSAPANGGTAGAVSGVV VLSLNRDHACVLPSGGTWVYFYFASYQTDRDGYSSSNHSCGTAAGGSTIASGTGYYSYSL KGIAIRIS >gi|316923163|gb|ADCP01000073.1| GENE 20 26093 - 26338 173 81 aa, chain - ## HITS:1 COG:no KEGG:HSM_0915 NR:ns ## KEGG: HSM_0915 # Name: not_defined # Def: hypothetical protein # Organism: H.somnus_2336 # Pathway: not_defined # 8 79 12 89 89 68 48.0 5e-11 MTYGKRTLIAVDQLLNTLLGGWPDETLSSRCYRWARDGVRAWPRRVVDGLFFWQREHCKS SYESEKCGRQLPPELRNRGTV >gi|316923163|gb|ADCP01000073.1| GENE 21 26335 - 26898 695 187 aa, chain - ## HITS:1 COG:no KEGG:CV_0338 NR:ns ## KEGG: CV_0338 # Name: not_defined # Def: hypothetical protein # Organism: C.violaceum # Pathway: not_defined # 56 165 57 166 188 88 49.0 1e-16 MIDYTNIIHRTSDDSYVITKNGFPYHVYPYAAEFAQEWDEVLAYAEAHPECVTEEQPYVP PVPTLEEVKAAKLSEINAAADRAIATLTATYPDREISTFDKQESEARAYAADATASTPLL SALAEARGISLPDLVERVLAKADAFAVASGSIIGQRQALEDRLDACTTLEEVRGITVNIS MPGGGEA >gi|316923163|gb|ADCP01000073.1| GENE 22 26909 - 30253 818 1114 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIPTVNVTVAVHEQDGSPVRDALVLAKLTAVERYNGYVVADEYTGRTDERGRAVVAVFPN ELGSEGSEYRFRIVTPAGKTFSVYATVPNSDCNLHQICELEPSERRGAGQVVSTEMAGYV 
TQAETARDKSQEAANRAQAAAVQVDVSAQTATAAASQALSNANAAKRAAEDATGLVQRTE TAVAGFESEVIGRVEAETQRLGGEASTAVSTAKDQAMTALDAHMEQKTEGLDLHAADLKA ALTASLGTREEEAIGAVRIERDAALVVLREEGAQFREDLNTLAERSEDAAKRAACSAAIT VKAASDVDEALTDTAIDLLAPQVVAEAVRQATEIALDSANTATVEAEKSRQSAATACACA DESAASAQGAADSAAAAETSAGAAADSASVAAASEAVATAKAGEASTSATAAKASENAAK ASEQTATEAANTATMKAGEASVSAAAADVSETNAASSATAAADSANVASAAQTAAEAACE ETKKVAVLPATTTSRGSVMPDGLTISVTADGVITVKDVAIGGDLGDLASARGQVGDARQL ADLDFNMLTVPGFYRTTGNPKNGPIGISGASIGSLFVSGMQGNGNRLFQIMVSGNGIYWR TSISGGTTWEIWNQVLTGNKIGDGIRNTNGIISVPEMQGATSAQAGVSGLVPPPLAADAG KVLGSDGTWSFPKDVAIGGDLEDLASARGIFDTLTKGSVDCNTLTEQGVYAVSLAGTVNG PGFLAKLVVFNGKGSAFTNQMALASGVTSSGSVRVAYRAKNSEDIWSPWSEGILSGRIGD GITVNNGIISAPEYEGATASTAGTSGLVPPAAAGQHESFLTGGGEYKPALSTGGGVLSGS IQINDLENAIGAAPSTSTERGLFLADKNSVVMGGFDVIQRASDNAKYTQFYSKNSRGDIT SLAAVTYEDGTRELVADSPLQINDIQIKQVVDGGRRVIVLSGARGNQGHSFRFSPDTGEA YMDGRVIHAKADTAGYADTAGSAPANGGTAWAANRLRREGGVDTIWWWSGQGGQPGWLWG GNDGVNMYVYNPANFSVNYANSCNYANGAEYANNAGSVVGTPNINVNNNGFNTLPPGGTW RLIQSDGSGFLQIKGEYAGGTYVGYSKFMYIRIA >gi|316923163|gb|ADCP01000073.1| GENE 23 30263 - 30829 418 188 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPRLDDVIADILPEAQNAPADVLKKAMLDAARTLCRRSKVWRVDVEGTLIPGIAQVELEL PRETSVVDVAYLYTQREQLFTGRFSAAPNGTVTLADVPVDGGLLRAVVSLVPTTKATCID DGLYDHWGTVIRHGARWLLKSMHGREWFEAQVALYHQQEFERGIGQAAHAALNACRKSPI YPLKNSFL >gi|316923163|gb|ADCP01000073.1| GENE 24 30816 - 31520 703 234 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_3509 NR:ns ## KEGG: ECH74115_3509 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 4 230 5 212 216 65 31.0 2e-09 MLCSEILRSVSGKLQDEDADARRWPWESASGAYSLMDSLNAAVREIVTQRPDATALTEPM RLEPGMMQRIPRADIHMTSRNAVSLINVIQNFDPDGNTPGRPVFRVELDALRTAAAWGKA GGRVENWAYSPLDNREAFWVYPGVESGRDVWIEAVYSAEPVRAATPSDRFPLPESFANAA YLWMLFDVLAGDHSESNFAKAQAFLQAFAQSLGVKLQTDLAFPIRQGGVNDAQA >gi|316923163|gb|ADCP01000073.1| GENE 25 31890 - 32339 498 149 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MATFKSDAVKSGLMFLGTAQPGCVLCRSGRVKEKFTAADVAELVPIPKGAMVLDVRVVNE 
ALDACTSVSVGDKDDPDRYFAALDLSSAGTHSAQSEGPTTAHNHVYADEAVLTVTVPATQ TAAKAITAHVLYKMVEGCLTDEADVFPAA >gi|316923163|gb|ADCP01000073.1| GENE 26 32348 - 33433 1622 361 aa, chain - ## HITS:1 COG:no KEGG:Smlt1961 NR:ns ## KEGG: Smlt1961 # Name: not_defined # Def: putative phage-related protein # Organism: S.maltophilia # Pathway: not_defined # 1 360 1 367 369 301 45.0 2e-80 MAGTEFPLNHPLAVQVWSNSLAVESGKRQYFSKFMGTGESALIVVKTELQKQAGEKITVG LRMKLREDGVEGDNLIEGTSAEEALTFFSDSLFIDQKRKGTKSKGKMSEQRVPYNLRKEG RDALATWWSEYYDEQFMMYLSGARGINADFISPLSFKGRANNPLQAPDAEHMVYGGSATG KANLTANDKMSLGIVEKLVAKAETLDPMMQPITVEGERKHVLLMHTFQAFSLRTSVSQND WLDIQKAAGVRGDGNRVYKNALGEYADVILHKHRNVIRFNDYGASGNVGAARALFLGAQA GLAAWGGASGQGRYTWNEEKDDRGNALAITAGAIFGVKKSRYDNKDFSVIAVDTACADPN V >gi|316923163|gb|ADCP01000073.1| GENE 27 33444 - 34547 950 367 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEEGMMEASQEAPVQQAVSPAEPAPKPAPEQAGHDELDFFDEAVSEAELRGETPDGEAHG QDRTETPAPKEPKPETKPGDEPTGGDTSPGEDTSADGAKPPKGFVPHAALSEERTKRKEA QQRVEQLEQELAARKATHEQREMPEAPKDASEAARRFAEQNPQYAALVFEDSRDGELLRD KLDTYGEEDAIVLAKTLYVERELAEQKRRESGSADAAFLGTCVREMDAMFEGGLNGQQAK ELIGYLQNEAGLTPDTITLLTSPNTIVIDPRTGRQSYLGGRALEVVGLFKDAHTLAAASS PERIRESIEAEVTKKVMQKINGEQAAFRELGDVPGHGDAPVGNIPATEDEFARLSPEQQE RLLRGEL >gi|316923163|gb|ADCP01000073.1| GENE 28 34561 - 34857 197 98 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDRPTDFGAMTPAQEAAFDRKAEAWCRGHRPCPSGRDPRFWNRRDRPGAMDAYRNGYDRI RWGTPDTNDTGRTDSSPAQSLSDASAVMDAPNHSPAQA >gi|316923163|gb|ADCP01000073.1| GENE 29 34841 - 36940 2536 699 aa, chain - ## HITS:1 COG:no KEGG:Aave_2355 NR:ns ## KEGG: Aave_2355 # Name: not_defined # Def: hypothetical protein # Organism: A.avenae # Pathway: not_defined # 7 643 44 690 779 233 28.0 2e-59 MAKHDNRMPIQTALGYVEEARAASRDWRAKSWRDHEMYDGDQWTPEDRQRAVDAGIDPLT INRIFPAINLILGSQELNRANIIAKARTAKDGQIAEIMTEALAFVLDQNDGQYRIGQAFK DAVIPGIGWLYCGFNNDPRQERIKLDFRDWKEVFWDPFASPWLESDKCRYAFFQRWMDLF DLQCLYPEREKEIGEAFSGLSAHDSDYSYMDDEADIVEQDKRVLGSTRWSDPERRRIRPV QLWYPVLEKAVFALFPDGQCVEVNTKLPDAQVYMLVRNAQQLITTSVRKLRVKTFIGSYE 
LSDEPSPFPHGQYPFIPFIGYLDRYLNPFGVPRMLSGQNEEINKRRSMNLAMLQKRRIIV EEGAADDLQDLYEEANKPDGFMVLKPGGRSKMEIIEGAQLSQYQIQVLEQSEKEIQQISG ANDEAMGYTSNANSGKAIELRRQQSSTIMASLFGNYRRSMSRLGQLVIANVQGAWTAEKV LRITDKMTNAERFVTVNQKVLGESGDVVEIRNDITQGMYDVIVSDAPATDSVREQNMNLL IEWCKQSPPEVIPYLMGMAMEMSNLPNKDQLMMKLKPMMGITPEEMDMSPEELQQRAQQE AEAKAQAEQMQQQAQQQLMQAGLEKAGLENELLRAQIDKTRSEAGMKAREQDRKEFQTGI EAGNAIRQARNEDAAVAASLAPPAPISTPQPPMPYGQTY >gi|316923163|gb|ADCP01000073.1| GENE 30 36940 - 38244 1368 434 aa, chain - ## HITS:1 COG:no KEGG:BURPS668_A2354 NR:ns ## KEGG: BURPS668_A2354 # Name: not_defined # Def: hypothetical protein # Organism: B.pseudomallei_668 # Pathway: not_defined # 2 416 7 450 490 416 49.0 1e-115 MEITLPHNWRPRDYQIPMWRYMEHGGRRSVLLWHRRAGKDDNSLRYLASTAMEKTATYWY LLPKAVQVRRAIWEAVNPHTGKRRVIEAFPDAIVARTRDNEMTLTLANGSSVHFLGADNF DTLVGSPPYGIVFSEYSLTNPLSWAYLKPILEENGGWAIFNFTSRGRNHAATLYEYAAGE ESWFAQRLPVTETSVFTPEQVEEIRKEMHRTYGEEDGEALFRQEYMCDLDAPVVGAYYGK LLARAADEGRITGVPYDPAAPVFTAWDLGMDDSTAIWVAQCVGREIHLIDYYEANGQPLA HYADWVRGRGYGRPTHYLPHDARARELGTGKSREEVLAGLDIGPVLVVPQQSVADGINAV RTILPRCWFDQIKCGAGAEALRNYRKEYDEKRKVFHDRPLHDWTSHAADAFRYLALSCGQ HQKSGSKGFRPRRR >gi|316923163|gb|ADCP01000073.1| GENE 31 38248 - 38724 502 158 aa, chain - ## HITS:1 COG:no KEGG:Mmc1_1689 NR:ns ## KEGG: Mmc1_1689 # Name: not_defined # Def: hypothetical protein # Organism: Magnetococcus_MC1 # Pathway: not_defined # 1 148 1 163 202 64 27.0 1e-09 MRKQRYDWETIRAEYEAGSSMGKLSDKYGVDKAAISRRAKKEAWAQDVTGAVDRAVDAKV NGIVNTVDPEKKAAAIASAADEKVAVIIRHREEWETQRELVTTAIEKNDFDKAKLAKITA ETLKIRQEAERKAWGIRDVDAQPDGGALTVRILRVSGE >gi|316923163|gb|ADCP01000073.1| GENE 32 38727 - 38990 272 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQMLRDLGEVKAELSAVKAELAGLRERIDDVVISNLRDHGKRMSMLETRVAALEAAENRR AGGMAALVAVAAAAGAAGNVLSRWIAG >gi|316923163|gb|ADCP01000073.1| GENE 33 38999 - 39298 148 99 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRGDMPCRRSFAFWKRWHPSGLHSARMPNAAALPLLVMLCLPLLCGCSAASSADSRTIPL SPILTSLRPVTLDGIEGAWMDWRDAQALAAWIDGVETAR >gi|316923163|gb|ADCP01000073.1| GENE 34 39295 
- 39525 235 76 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDFSMISDFLNSQTGSWALFLSAASAVCAWAATLMPAPSETSGVVYRTLYKVINWIGANI GKARNADDAQKQRKLQ >gi|316923163|gb|ADCP01000073.1| GENE 35 39525 - 39905 396 126 aa, chain - ## HITS:1 COG:no KEGG:RB2501_01256 NR:ns ## KEGG: RB2501_01256 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 4 119 1 115 117 109 47.0 3e-23 MSTLRFFKPEEFACKCGCGRGYDDMDAGLLRMLDEARALAGIPFSLSSAFRCAKHNKAVG GVADSAHTHGYAVDIKCTSSHYRFRIVSALLEAGFRRIEAGPTWVHVDNDPAKPQDVIFY AAGRVY >gi|316923163|gb|ADCP01000073.1| GENE 36 39977 - 40240 92 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAIGYIPQLLFSRKEIRDAFQVGDDTVTRWIEQGAPIVVEGRGNNVRYCAEVAALQAWRV VTNRAQRSKSLRPENLVNPPDVQSPYG >gi|316923163|gb|ADCP01000073.1| GENE 37 40240 - 40515 300 91 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MELMEAAVLLGNGGQRGGAVMIAALARRMEEARKKHPVFAEGKYHALGVIGAEYEELVRA VERETPERVRDEALDVAVTALRLWAGEEICL >gi|316923163|gb|ADCP01000073.1| GENE 38 40539 - 40865 140 108 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSSQQQTSSPSRIVEVVVPFREDESFVDRLARANAALGRAIHAQLPAWNVKRNEIAVFGP QRAHRNGSVSYVYRLVEDRKPLFGKGRGAERALMRHEIWSQQPESAAQ >gi|316923163|gb|ADCP01000073.1| GENE 39 40795 - 41331 237 178 aa, chain - ## HITS:1 COG:no KEGG:CLL_A2280 NR:ns ## KEGG: CLL_A2280 # Name: not_defined # Def: VRR-NUC domain protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 35 145 3 114 129 97 45.0 2e-19 MSDTCTFAEFLDHCTGRKVMPPPAQAKKRPSCPPESEEQKALFDWWQRTPYARHFVMYHI PNGGRRDKITGARLKAEGVVAGVPDIFLASPRQGFHGLYIEMKRQRGGTVQATQKELITA LRQAGYRVEVCMGWWEAREAIENYLTGEIPKGASRRVEPAANQQSLADRRGCCAVQGR >gi|316923163|gb|ADCP01000073.1| GENE 40 41328 - 41744 152 138 aa, chain - ## HITS:1 COG:no KEGG:LI0834 NR:ns ## KEGG: LI0834 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 21 132 26 139 147 93 40.0 2e-18 MSEALTAESLSGPLRSVAERLGMPFVFRLVEHFGGTTVAIPSGSAGKKQYRRLCAALGDE SASALCREFARRAIYIPNLKQAMLDKRNSSLNMERDELAGKGHSERTLVAILARRYSLSE RQVWRILKQPRTPGEATQ 
>gi|316923163|gb|ADCP01000073.1| GENE 41 41753 - 42814 425 353 aa, chain - ## HITS:1 COG:no KEGG:Noc_0052 NR:ns ## KEGG: Noc_0052 # Name: not_defined # Def: hypothetical protein # Organism: N.oceani # Pathway: not_defined # 184 282 92 187 199 77 47.0 1e-12 MGGYFKVWRKIEDSKSWSRGALYRGLMITLLQKANWKQGYFHGQEILPGQLACSGASLAS ELDLSRYQVMRMLATLEDDGFISRQTFGKVCTLITVVNWQLYQSATEEAAQQPHNGRTSS AQVPHTIEEGKKARKEIPPASADAAEERASLLGKEKQEGAGARTGSAHVTNKAPRSAASA TGEGPAPEPAYRTATKRVLTGQRLAWFNRVWDAFGYKRNKAEAADAFIDIEGLSEPLVAA ICRAAEQEAARRPDLVARGKTPKMLTGWLSGRRWEDEADAPPPLVPVAARGPLLGDPVID VPTQEQRDEGWKAGLSFMEKWRHGERPNQAGQFDRRKPLPIPANFRSVLQRAL >gi|316923163|gb|ADCP01000073.1| GENE 42 42817 - 43131 141 104 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIIRCRHRIPSPEAVGFLTVEGLREAAAQYPHLRPCKKHGWNWLYRTACAKCGDKVEVPL EGSAREHEKGPLWSNGPKGEKMMQVHQNVENSMPAIGSAVKGKV >gi|316923163|gb|ADCP01000073.1| GENE 43 43128 - 43919 669 263 aa, chain - ## HITS:1 COG:SA1801_2 KEGG:ns NR:ns ## COG: SA1801_2 COG3645 # Protein_GI_number: 15927569 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Staphylococcus aureus N315 # 147 257 9 119 126 117 54.0 2e-26 MSGLRIFQNREFGAVRVIEYGGEPWFVARDVCAVLGTETRDLPDILEHDEQRPIVDIIHT LNDSTGLRRDSRIISEPGLYSLVLRSRKPEAKAFKRWIVHEVIPSIRRTGGYGALALPNF RNPAEAARAWADKEEQRLLEEQKRLALEQKMEEVRPKVVFAESIEVAKTSILVGEMAKLI KQATGYDIGQNRFFEWLRNRGYLHKDGSQTNMPTQRSMDAGWMEIKEGTRIGSSGESRIT RTPKITGKGQIYFINLFKKMVES >gi|316923163|gb|ADCP01000073.1| GENE 44 44036 - 44593 507 185 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MADYKNMTAIETLREAKDASGMTAESIAQGVGITATHLRRYLDPNDNYAPSLHVIPVLCR VMRNTILLQWLEAQIVADDTPVTPAATRADVLTAVARAGSALGEVQRIVAEAQVLYPSTA REIRSGLGDVIAACRSAQAGLQPLAERRDRDVVLASLSDGEQAPPVAMPEPLTGECRKPW WKVWR >gi|316923163|gb|ADCP01000073.1| GENE 45 44670 - 44930 167 86 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPVPTVHLLITLRGPVWRVRILSGGTIRWKCYRAEDYPTPEAVARRCAAGLVTIKPENKP HEGEEGHLRGDIRPVHDSPLRVRWEC >gi|316923163|gb|ADCP01000073.1| GENE 46 44884 - 45123 147 79 aa, chain - ## HITS:0 COG:no 
KEGG:no NR:no MNTELFRTHLRERMQITRATQIEVEEKTGVSQASISRFLSGARINSDNLFKLWEFVYEGQ FPATLSTPTEPEEVNHGLA >gi|316923163|gb|ADCP01000073.1| GENE 47 45180 - 45869 68 229 aa, chain + ## HITS:1 COG:no KEGG:Ppro_0272 NR:ns ## KEGG: Ppro_0272 # Name: not_defined # Def: putative phage repressor # Organism: P.propionicus # Pathway: not_defined # 141 229 147 235 235 66 40.0 7e-10 MTFCPQYRMQERMELEEKLMARLRDISDQRQIAAFAQKCGVSQANLSRALGVKAQQLGLD KVSKILSAMGALVIFPDEERYPVMRRMACHSPTENVTGDNLHEIPVFEEAGAGLPAEFFS TAPENMIPVLPQYNLPDVRAVKVTGDSMEPTILKGAYVGVIPLDDELEDGGIYLVQRPPF GLVVKRVMQDEDGNIILHSDNPRWKPQKVSNEGYDNIIIGKVVWTWQLV >gi|316923163|gb|ADCP01000073.1| GENE 48 46775 - 46999 96 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MELRSPFNPSYFVASESQPVSYQCTRALLESIDGGSHRFISSGQISGIQPMLPQQFSPMP VPFLQAMDQRTQEE >gi|316923163|gb|ADCP01000073.1| GENE 49 47004 - 47195 101 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNNLPKNNPKSGQLIVYRNKGTSALNTGFFIQTTSVPTMPSVNIQIHPVGTIYSVSPNFS TSK >gi|316923163|gb|ADCP01000073.1| GENE 50 47199 - 47675 -195 158 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVICRFLSRITARRSSHLRMPGLLLPAFSDGFMACRFNRSIPAHSIPVPPSLRGLSSHHP ISCRRVLPAPKGLRLPCSYLLGLPPSGSSFSGPLPRHSPSSFVVKTIYTYTNFASIKISY MYILNQKIPPTPNGIDGHARQGTKKPLTRRGWIKKKCL >gi|316923163|gb|ADCP01000073.1| GENE 51 47881 - 48120 268 79 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2842 NR:ns ## KEGG: DvMF_2842 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 77 2 78 98 93 62.0 2e-18 MSKLLESVHTLVIDGDMPAKAIASAIGKPYSTLLRECNPYGKGAKLSAETFMAILKATGN IQPLELMARELGYKLIPID >gi|316923163|gb|ADCP01000073.1| GENE 52 48812 - 49168 202 118 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRGDFAMIGGKGDMQVLNETGMAAFEESIRKALEERRKVIGMTEQALGSLAFPHVADSR RKVQSIRKGQGSGENRKPQQLRMTDVMNLLAALGLPWEKVIKQAFADAETAQKEEQEK >gi|316923163|gb|ADCP01000073.1| GENE 53 49505 - 49747 226 80 aa, chain + ## HITS:1 COG:no KEGG:SPAB_05352 NR:ns ## KEGG: SPAB_05352 # Name: not_defined # Def: hypothetical protein # Organism: 
S.enterica_Paratyphi_B # Pathway: not_defined # 1 80 1 80 80 84 47.0 9e-16 MPVILIVGPYRFFFYSNEGNPVKAPHIHVRSQDGEAKISLVEPYGVLLNAGFSAQELRKI CKLVQEKRDILKGAYHDYFA >gi|316923163|gb|ADCP01000073.1| GENE 54 49731 - 49979 229 82 aa, chain + ## HITS:1 COG:no KEGG:SPAB_05353 NR:ns ## KEGG: SPAB_05353 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Paratyphi_B # Pathway: not_defined # 1 82 1 82 82 113 68.0 2e-24 MIISPKKVWFDEDSMWVGLNDARVIGVPLAWFPRLLNATVAERERFELSAFGIHWEHLDE DISVEGLLAGQGDLTRTPIKVA >gi|316923163|gb|ADCP01000073.1| GENE 55 50123 - 50473 439 116 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCYGTNCGREGAFGTCYHPEDCIMRAIERDAEENLAARLARDTALSREHFPCPNCLEQGE RHSLTCENGLFTCPECGGECDAAELIALYDDIRAGHVSDAEVVGLWIEKLDARRVA >gi|316923163|gb|ADCP01000073.1| GENE 56 50470 - 50676 136 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSAAIFCPHCKLKYDKAVRLRRHRDFWICSSCAEHYTAETLATACENAARSFLAKANYLK IMARRAAA >gi|316923163|gb|ADCP01000073.1| GENE 57 50673 - 50810 76 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKIDIWKRPWLAVLLLFLCFLLVGYFERQDQELFERMAPFTEAMR >gi|316923163|gb|ADCP01000073.1| GENE 58 50807 - 50941 143 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNWTDDAYSDEHGPWTDEELMIAAGNAARRKALRRKTTEGDDDV >gi|316923163|gb|ADCP01000073.1| GENE 59 50934 - 51806 984 290 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2988 NR:ns ## KEGG: Cphy_2988 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 10 181 5 176 238 138 44.0 3e-31 MCDQPQPTGIYQAMAEILAEIPSIGKDNKNKEQGFKYRGIDDVYNALHPLLAKHKVFMAP TVLSRASEDRTTAKGGAMQCVTLSVEYRFFHADGSSISCTVMGEGRDTSDKATNKAMAVA HKYALLQTFCIPTEDIGQDDPDAETPEPVQPRQDARPQVMSGPVDFDKVRAELARMTTEE ELQEYWKKVRVEKEHRHFNQLSRLFLDRRKELAPKRETPSPEKPRPTSSIPNIIPADQVI AEFNACETITALYAAATRLAVPENHPEIAAIREAFRVRRKQIEAGQTRAA >gi|316923163|gb|ADCP01000073.1| GENE 60 51866 - 52333 513 155 aa, chain + ## HITS:1 COG:NMA1672 KEGG:ns NR:ns ## COG: NMA1672 COG0629 # Protein_GI_number: 15794566 # Func_class: L Replication, recombination and repair # 
Function: Single-stranded DNA-binding protein # Organism: Neisseria meningitidis Z2491 # 1 144 1 154 174 116 42.0 1e-26 MSLNKVMIIGRLGREPELKYTQSGSPVCTFSVATDESYTDNSGQKVEKAEWHRIVVFQKA AENCSQYLAKGSLVFIEGRLETRKWQDQQGQDRFATEIKAQRVQFLDRKADGQSSQQQEG GRRQQGRRHAPPPSAYEDLGPAFPSEASGMDDVPF >gi|316923163|gb|ADCP01000073.1| GENE 61 52355 - 53341 1195 328 aa, chain + ## HITS:1 COG:no KEGG:DvMF_1683 NR:ns ## KEGG: DvMF_1683 # Name: not_defined # Def: protein of unknown function DUF1351 # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 28 322 6 286 299 163 40.0 8e-39 MTQTAEILEALPPVQMQPTGLAQFDLAVTATPLVINWNRDAVSALLDATLEQYEKLVVQE EDVPAIKSEMAGLNKLRDRLDNARKEITRQIAGPLEAFDAEAKALVARVVEVREGLNRQV KEFERRDREGRHQSVQFVIDALKNSEGVPELDIPIKEPWLNKSIKQAQLHAEIQNIILKH KQDKAAAEQLERARADRATMVEAALKAKAEEYGFSLPMGKFAPCLSLDITSEEAAGVIGQ VFAAEASLREQQAKDKAAREAAAEEARQTRQAVPTPPPAAPFIEEDDFPTAPPVTTTLIL SVAYEPAREQAVQELLAQLRSVATVTVM >gi|316923163|gb|ADCP01000073.1| GENE 62 53412 - 54338 922 308 aa, chain + ## HITS:1 COG:no KEGG:DVU1525 NR:ns ## KEGG: DVU1525 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 61 202 162 309 330 77 38.0 8e-13 MSQEQQDATDYVRVTLTIHEFSEDGECCIVSDKHGHGTELSTDAFLHDAEDIDVGDTIAC DILREAWEDTGLTPDDAGPHDVRACGRENIEVLVDLTDEQLLELGSEMADALRERDKLET ELLAVKKDYKARIDLSVSKAAEAAAEYRSGKRFETVSCDRFEDRTTMEVVWCDSVTGKEV SRRPMTAEERQHRLELVTPDKPADNGEGRAEVLTLPAPPTANARTCLSCRHLSADGTEKA EPCMACAQANGGDADNWEPRRECKTCAHVTSTVDGFPCGGCSLNPDPGHGGDEDRWTWKD APKEGIPC >gi|316923163|gb|ADCP01000073.1| GENE 63 54332 - 54580 150 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLTRRMAKRQHCPLMPMAKIYEHHVNGVQINCEIIEYEKCTGEECPKWQDGKTDGKVCPY EETCRICSTCRTCPDRYGYCGG >gi|316923163|gb|ADCP01000073.1| GENE 64 54583 - 54777 140 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPGYDDPNRPEWTGKRKAKNFARINAACKGLKTSGGKPPTECPKCGTSHKFAACPVCGCP RPKR >gi|316923163|gb|ADCP01000073.1| GENE 65 54813 - 55160 300 115 aa, chain + ## HITS:1 COG:no KEGG:DMR_23570 NR:ns ## KEGG: DMR_23570 # Name: not_defined # Def: 
hypothetical protein # Organism: D.magneticus # Pathway: not_defined # 3 108 14 119 234 84 41.0 9e-16 MPFQDAYERILQSTGLRTQTDIAAMLGVKQSSISDAKRRNHIPDSWILTLFNKKGLNPSW IRTGEGPQYVAGTDTPPTPVLSEQQAAESLEPILRAALLGVVPELADQLRQKMNP >gi|316923163|gb|ADCP01000073.1| GENE 66 55206 - 55505 259 99 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MELVDEVRRCERAVQQAKNALEIAKRDAAVAACPYKEGDLVSGWDRDGTSPAKVDKILFT PSYPYYDLRVLPITEGGKPSRRHRYAYNVLDVTPYEGDE >gi|316923163|gb|ADCP01000073.1| GENE 67 55913 - 56107 151 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MITPEELDYIRTAAIGDMLGDSKALDEMGSAATIFRLCRELEHAQNEKTELQEVVVRIYK ALKG >gi|316923163|gb|ADCP01000073.1| GENE 68 56261 - 56746 454 161 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTLTLNLSEKLVKDIIESDASLKAEIDALVIDKAAKSFLTKGIEKRVERMIKNDLSDSV NKAIEKYIQSGAFGLNHYSTIKEVLYDRIASLVDAEVEKRIRKSIEEYAEVKIKELVDTR LGKESDTILLYIDRVIEFKINSIIAGLMDMRKKEGGACVRN >gi|316923163|gb|ADCP01000073.1| GENE 69 57096 - 57476 82 126 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSEELTLLGGSFNGSELWVACGNMRKCRMAGPSRKTVAEAVAAWNAMPRDLTWTDEPPKE VGNYWWRWDAGSKHWIYNVRLIPTIHGLITYDHSAYDAEHVEDIGGQWAGPIPEPREPKE QGGYDG >gi|316923163|gb|ADCP01000073.1| GENE 70 58403 - 59569 410 388 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 34 388 43 403 406 162 30 6e-39 MLTDTKIKQAKVTDKPYKIYDSEGLYIEVAPTGSKRWRWRYRWNGKEKLLSFGIYPEVPL KLARDRRAAAKELLVSGVDPSQARKEEKVESRAAENSFAAMAEEWYSLYSVPWSEHYKGL VRRRIDEYLIPRLGKRSLSEITPIEIMGILTSLEKRGVIETANRVLGICSQIFRRAVATG KAKSDPCRDLRGALAPAQEKHHAALTTKDGARAVMRALDAYQGSFVVCAAVRFTALTFVR QVELRFATWDEIDWEERMWLIPAERMKMRREHMVPLSRQAIAVLEEMRRVNGTQPYIFTG QGRRRRPISENTVRCALQSMGFAGEMTAHGFRSMASTLLNEMGWRSDVIERQLAHVDKNK VRSAYNRAEYITERRQMMQAWADFLDSL >gi|316923163|gb|ADCP01000073.1| GENE 71 59961 - 62375 1408 804 aa, chain - ## HITS:1 COG:HI0730 KEGG:ns NR:ns ## COG: HI0730 COG1452 # Protein_GI_number: 16272671 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Organic solvent tolerance protein OstA # 
Organism: Haemophilus influenzae # 72 365 68 350 782 94 28.0 6e-19 MIVPLPRGMFRLVTKTVCAPIALLALFLCSPVHSYAATDVLTIPMEDTKEASWNLEADKL VTLSDNTIVEAQGGVVLQRGNDILKADFARYYSATNWVYLKGNVFVRMGKDDIHSDEAEF DLHSKTGWLTNGHVFMEGPHIYFSGSRIIKHWGDRYTFNKAKVTTCDGTSPAWSMNAEQA VVEIDGYAQLFHSTFDVKNTGIFYSPFMVLPAKTTRQSGLLPPDYGISDKRGFYYTQPYF WAIDESHDMTFYGGWMTKIGPSLSVEYRANEFTDQKTWLAATGIYDKDTVAIPGTSRVSE NKQTLRTNNDRYWLRGMADGFLGASTWRYRSNIDYVSDQDFLREFNQGPTGFDRTRDNLF RMFGRDLQEDDQNRVSAALVSNDWQRVGIVASMRYEQDPALGHGNRAQSQDELVQRLPQL DLFLYKGRIVPQIPLELEAQFQSGYMYRASGTTGGRTEIYPKLSVPLDFGFGSVIGTVGL RQTYYNTDRKEHTSPLAMYMDNSASPRQTGESRTMIDMDIQGYTEASRIWQLGDESSIPL KPENAGKQMWTAVRHEIQPRIRYSRTPHVDQEKNPFYLMEDRILPRDELTYSITNIVTRK GSIVSVTGEGDKQEARRSTFYQDLLRWRIESGYDFEEARRDRYRDEYGRRPFMDIVSDFE IYPWPWLGYHDKTYFSAYDGQVTRHDHDINLRYKDKISWYTGMSFRDKYYDYRKKFQYEN WNNVQLTSDLRLIHNDLTINLTPEWSIRFDDYRNMRQGGTFGKTYDQSIDIAYSAQCYRI IGRYNYDGYDKSYSIMVELPGIFD >gi|316923163|gb|ADCP01000073.1| GENE 72 62793 - 63947 1051 384 aa, chain + ## HITS:1 COG:alr2458 KEGG:ns NR:ns ## COG: alr2458 COG0787 # Protein_GI_number: 17229950 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Nostoc sp. 
PCC 7120 # 2 365 28 391 401 196 35.0 7e-50 MFLSSAIITVHLENLRHNLKTLLARHPSLMPVIKADAYGHGVVAVARVLAEEGIQHMAVG SVGEGALLRQEGHTAFLLALMGLARDEDTALAASYDITPLVHSRESLERILAQSHLTGRA KPLTVAIKFDTGMSRLGFRVDEAAELADYLRTLKEVRPVLVMSHLAASDTPALDDFTHEQ ARRFHEATESMKAVFPGLKTSLTNSPGLLCWPSYVGDLARPGVTLYGGNPLHGTDRAKLG MGLLPVMEMAAPVLSVHPVVKGATVSYGCLYTAPKDIRAAVVGAGYADGYPRSLSMRGSV LIRGQRAPILGRVCMQMCIVDVTDIPGVEPGDTAYLLGGSGPLAIRPEELAEWWGTISYE VFCALGRNRRVSEKKFSEYSNRRA >gi|316923163|gb|ADCP01000073.1| GENE 73 63988 - 65100 1121 370 aa, chain - ## HITS:1 COG:YPO2390 KEGG:ns NR:ns ## COG: YPO2390 COG2230 # Protein_GI_number: 16122613 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cyclopropane fatty acid synthase and related methyltransferases # Organism: Yersinia pestis # 6 365 19 378 383 434 54.0 1e-121 MDAKAVLTSMLAEADIRVGGDRPWDIRVHNEGLYKRVLREGTLGAGEAYLEGWWDCDQLD VMFCKALHAHLEDKVRRNLPNALIVASQLLFNLQSVARAPMVAEQHYNTDNAMFARMLGP TMNYSCGYWKDAETLDEAQRNKMDLICRKLDLKEGMSVLDIGCGWGGLSLYMAKEYGVKV TGVTISTEQLAYAREHDAGHLVNWLLQDYRSMEGQFDRIVSVGMFEHVGRKNYDIFMKTT KRLLKPSGLFLLHTIGNSKKKTGTDPWINKYIFPNGMLPSPVCVAKAITGLYVMEDWHNF GADYDKTLMAWHQRFEEGYAEGAFQCSERVRRMYRYYLLSCAGAFRARDIQLWQIVLSPE GVEGGYCCQR >gi|316923163|gb|ADCP01000073.1| GENE 74 65383 - 65670 340 95 aa, chain - ## HITS:1 COG:no KEGG:Dde_1968 NR:ns ## KEGG: Dde_1968 # Name: not_defined # Def: TRASH # Organism: D.desulfuricans # Pathway: not_defined # 1 82 1 82 100 116 62.0 2e-25 MWKWLILILVGYALYRMFMNDRKKSGEDTKKEKEHLIATGEMVKDPVCGAYIDSDSNITV RDGKTVHRFCSYDCRDEFLKRIGKLPEKTGGDDDE >gi|316923163|gb|ADCP01000073.1| GENE 75 65692 - 66207 237 171 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 4 146 119 253 278 95 42 7e-19 MNERHMAYVCMGSNMGEPETNLARAREVLGALPGWKIEASSPVYFTEPQDMRDQPWFANQ VLKVSCALDMTAPDFLDMLLDVEKRLGRVRETSDPAMRFGPRVIDLDLLLFDQERWDTPH LALPHPRMSERAFVLVPLRDIEPNLLLSDGRTPGEALRSLAHAVEGNRIRQ >gi|316923163|gb|ADCP01000073.1| GENE 76 66270 - 67280 985 336 aa, chain + ## HITS:1 
COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 30 318 4 297 297 268 48.0 1e-71 MKTTTRIHASPVSDKNGQRPDTPAGFPARLASRWLDGLIAERGLSRNTVAAYRQDLDALQ DFLDELETPLSGLDDENITLFIAWLRKRGDATRTLARRISSLRSFLAWCVERGELASNPA ALIDTPKLPSLLPDVLTQDEIVRLLNAPDATSKLGLRDRAMLELLYAAGMRVSELIELQP IDLDLQRGVVRIFGKGSKERLVPLHDAAVMRMAEYLKIVRPLFTPVEDRVFLNRSGNGLS RQGVWKLVKRYALEAGIRKPISPHTFRHSFATHLLEGGADLRSVQILLGHADMSATELYT HVQSERLLQIHRKYHPRSQHAEPGPDDASSPEDDLS >gi|316923163|gb|ADCP01000073.1| GENE 77 67299 - 68303 979 334 aa, chain + ## HITS:1 COG:all3989_1 KEGG:ns NR:ns ## COG: all3989_1 COG0618 # Protein_GI_number: 17231481 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Nostoc sp. PCC 7120 # 11 289 3 281 309 125 28.0 8e-29 MKPSLASAPTVITCHSNADWDALSSMIGMSMIYPDSIMIFPGSMEKPLNQFFNETAVFLY TFKNIKEIDHASVKRVVVVDTQIRSRVPQVQDLLDLPGVEVEVWDHHPTPYSGEKNKVIN ADVTHIGTTGSTCTLICQALQERGITPACQEATFLGLGIYGDTGAFTYTSTKPEDFLAGA WLRQHGMDLPFIADLVQTGMTSVHIKVLNELLDSATAHEIGPYSVVLAEATLDSFMGDFA YLAQKFMEMESCNVLFALANMEEKVQVVARSRVDAVDVGQICKALGAGAPFRGVRFGQEH PHPRTQGRHLPTAVHAGKPEQAGAGSHVLARRRH >gi|316923163|gb|ADCP01000073.1| GENE 78 68242 - 70047 2154 601 aa, chain + ## HITS:1 COG:TM0715_3 KEGG:ns NR:ns ## COG: TM0715_3 COG0617 # Protein_GI_number: 15643478 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Thermotoga maritima # 144 582 7 423 430 238 34.0 3e-62 MQVNPNKQARDLMSSPAVGIEDRQTIREAEQLMNRYGLKAAPVFRAGTRHCIGYMECQTA SKAIAHELGDMPVSEYMQRTILTVPPDAPLQRLMKIIVGAHQRLVPVVEKNEVIGVVTRT DLINMFVEDPSGVPIPVNTSARERNLAKLLSTRLPQHMLHMLHRAGELGDRLQVSVYAVG GFVRDIMLSRPAVEFDDVDLVVEGDGIAFARALAQELGGRVREHRTFMTALIIYHDENGE EQRLDVATARLEYYKYPAALPTVELSSIKMDLFRRDFTINAMALRLNKGQFGCLVDFFGG QSDIQRKTIRIIHALSFVEDPTRIIRAVRFEQRYGFHISTQGEKLIKNALSLNLVEKLSG 
ARILHELNLIFREDEPETCLRRLNELGVLAAIHPSLVLNADKNELLDSLREVIDWYRLLY FKETPELSTLYLMALCSAVPAIETADILHRLGLTPTMREEILSLRESVRMTLVELINWHR DTQKKGQAEQSVSRLCALLAPLPLESTLYLMARSDNEEISSSVSQYIYKWRQIKVDINGD DLHRLGLEPGPQFGSVMRMVLAAKLDGKAETREAQLKLAEELIRKGFPGNGTAGAAPKKP R >gi|316923163|gb|ADCP01000073.1| GENE 79 70141 - 70647 421 168 aa, chain + ## HITS:1 COG:PAB2301 KEGG:ns NR:ns ## COG: PAB2301 COG0537 # Protein_GI_number: 14520282 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Pyrococcus abyssi # 5 156 21 171 185 179 51.0 2e-45 MTHPLWAPWRMDYILGPKPDSCVLCLPPDDLSHDEERLVLYRGKTAFVIMNKFPYNNGHI MVAPLRHVMDLPLLAAEESTEIMELLKQCTTILREFFKPQGINVGLNLGEAAGAGIRDHL HFHLVPRWNGDSSFMAVMSETRVIPDHLASTYTKLKPLFARLRPGQPA >gi|316923163|gb|ADCP01000073.1| GENE 80 70658 - 71068 621 136 aa, chain + ## HITS:1 COG:no KEGG:DVU1651 NR:ns ## KEGG: DVU1651 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 94 1 94 123 90 52.0 2e-17 MRYIKVLLLVILFFLVMMFFVQNQSAFSQAVALKLDLLFLPPVESAPLPFYTLLIICFVL GALCILAMLMWDRVSLSAKLTMANMRARGFEKDVAKALKTNEALQKKLDAAEAKAVQLAE DVENAKKSASALTEQA >gi|316923163|gb|ADCP01000073.1| GENE 81 71085 - 73796 3000 903 aa, chain + ## HITS:1 COG:CAC1837 KEGG:ns NR:ns ## COG: CAC1837 COG0249 # Protein_GI_number: 15895112 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 27 894 3 862 869 630 39.0 1e-180 MPTDGPSQGRAWALLQHMSQQPASPRLTPMFEQYLRIKEGYPDALLFYRMGDFYELFFED AEVASRELQIALTSRNPNAEAPIPMCGVPWHAAEGYVSQLLNKGYKIAFCDQVEDPRAAK GLVERAVTRVYTPGTAVEDVSLEPKGHTYLGALFWSADTDRGGFAWVDVSTGYWAGLHVK KSQELWQWAQKIAPRELLLPEDADVPASLHLTDIQAVRVPLRSHFDYKRSAERVLAAQSV AELGALGLEGRKELVQACGALLAYLEQTQMQDTKHLAPFEPLDLGQHLLIDDVTERNLEL FRRLDGRKGVGTLRHVIDSTQTPMGGRLLEERLRNPWRELAPIQETQDAIAWLIAHPENR 
KKLRETLSGVYDLERLSTRIALNRTSPRDMLSLRQSLAALPPVKSAITSSDADTPRVLRS ITEHWDDLGDHAATLQKALADDPPQFITEGGLFKQGYNAELDELLDLVEHGENRVKALLD EEQKASGISKLKLGYNRVFGYYFELSKAVGGTPPEHFIRRQTLANAERFTTVRLKELEEK LLSAADRRKSLEYKLFQQLRGALAEARPRILFMADMLAQLDYWQSLAETAVRHNWSRPVL HTGQSITIREGRHPVVEGIIGEAAFVPNDLHMDEDRRLLLITGPNMAGKSTVLRQTALIC LLAQMGSFVPAREAQLGLCDRIFSRVGASDNLAQGQSTFMVEMMETARILRQATKRSLVI LDEIGRGTSTFDGLALAWAVAEELARRAGGSIRTLFATHYHELTALEGKIPGVHTMNIAI REWNGEIVFLRRLIPGPSDRSYGIEVARLAGVPQPVVQRAREILAQLEQNKGSSPVRQVM PNLLPGIQLPEAKPKKAVEEVAVAEPEHPLLVALRDTNPDALTPLEALKRITEWKLLWGA PKQ >gi|316923163|gb|ADCP01000073.1| GENE 82 73793 - 74260 424 155 aa, chain + ## HITS:1 COG:no KEGG:DvMF_0354 NR:ns ## KEGG: DvMF_0354 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 9 154 13 146 173 103 39.0 3e-21 MKTWITPILCCLLLLAAGCGKKGDPVPQDKKNLFSWESADAALTGNGCLAISALMKGAAR NVDGFSIELEPLAASADSTLPKELLTPQDTCEGCPFTPRETQELTPQQAVPTDSGTRFAF TYCPQTKAAAYRWRLVARNVFMAFPYALTPVKTVR >gi|316923163|gb|ADCP01000073.1| GENE 83 74656 - 75123 464 155 aa, chain - ## HITS:1 COG:ECs0471 KEGG:ns NR:ns ## COG: ECs0471 COG1267 # Protein_GI_number: 15829725 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphatase A and related proteins # Organism: Escherichia coli O157:H7 # 14 147 30 162 171 87 40.0 8e-18 MDKLILAFCRLGFAGLSPIMPGTCGSALAAVLAPFLFIPLPFVARIVVLVLVFWIGGMAA TRGEQILGYKDPGEIVIDELLGMWLVLLPFSNPGWAKIGIAFVLFRIFDMWKPWPVHASE SWLPGGYGIMIDDVMAGLQALLVMWLLQLVGFWGV >gi|316923163|gb|ADCP01000073.1| GENE 84 75135 - 75755 490 206 aa, chain - ## HITS:1 COG:PH0762 KEGG:ns NR:ns ## COG: PH0762 COG1351 # Protein_GI_number: 14590631 # Func_class: F Nucleotide transport and metabolism # Function: Predicted alternative thymidylate synthase # Organism: Pyrococcus horikoshii # 22 195 49 222 243 139 46.0 3e-33 MWPRLLAGDIGREKQAQFVASIMESGHASPVEHVSFTFALGGVSRALTHQLVRHRIASYS QQSQRYVDGSDFDYLIPPEIRKNPEALARFEGIMREIGSAYRDLKGLLEEAGRTGSKANE 
DARFVLPQATASNIVVTMNCRALLNFFEHRCCRRAQWEIRHVADEMLALCRGVLPEVFGL AGAKCERLRYCPEGEKFSCGRFPAKA >gi|316923163|gb|ADCP01000073.1| GENE 85 75875 - 76903 755 342 aa, chain - ## HITS:1 COG:CAC2284 KEGG:ns NR:ns ## COG: CAC2284 COG2255 # Protein_GI_number: 15895552 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Clostridium acetobutylicum # 28 340 15 327 349 386 57.0 1e-107 MSDSPNTPSFLPPADLAKELSAPDSGSSGDESIRPSRLEDFIGQEELRANLRVFLNAARE RGQAMDHVLFYGNPGLGKTTLAQIMAAELGVNLICTSGPVLERSGDLAAILTNLSRHDIL FVDEIHRMPIAVEEILYPALEDYKLDLVIGQGPAARTVKIDLEPFTLVGATTRIGLLSSP LRDRFGIISRLEFYTPEELSRVVIRSSRILGVPITEGGALEIGRRSRGTPRIANRLLRRV RDFAAVYGSGVVDEEQASHALRRMDVDENGLDQMDRKLLRVLVDIYGGGPVGVKTLAVAC SEEVRTIEDIYEPYLIQCGFLKRTSRGRMATAKAYKHLNMLG >gi|316923163|gb|ADCP01000073.1| GENE 86 76903 - 77514 714 203 aa, chain - ## HITS:1 COG:NMB0265 KEGG:ns NR:ns ## COG: NMB0265 COG0632 # Protein_GI_number: 15676189 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Neisseria meningitidis MC58 # 1 199 1 194 194 111 39.0 9e-25 MIAYIEGRLAEVCANACVVVTEGGVGYEVYLPQQTRLQLPERGGHVRFYICHIVREDAQE LFGFETWDERQTFIVLTSISKVGARTALGILSAFRPDDLRRLVLEDDVLALTQVSGIGKK SAQHIFLELKYKLKVEDMPAAASLQLGVPGSVYRDALTGLEGLGYAEAEAAPVLKNILHE EPDLEVSEALRAALKALARERQK >gi|316923163|gb|ADCP01000073.1| GENE 87 77609 - 78688 481 359 aa, chain - ## HITS:1 COG:BH0195 KEGG:ns NR:ns ## COG: BH0195 COG1541 # Protein_GI_number: 15612758 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Bacillus halodurans # 16 337 7 326 445 140 30.0 3e-33 MKRPFLDDWIARCCGLEVLSEESLRRYQLNKINEVLRYTERNSLLYRKRLAGVFERHGEA IRAGMWPASPDDVTQLPFTTARDLEDGWKRFVCVPLDAIARMVTLSTSGTTGEPKRLAFA SADLERTLDFFAHGISVLVRPGDTVLILLPGAERPDGVTDLLIRALPRIGARGVAGNPAA EQAGFCRELELHRPDCLVAAPGQLRRLLGAHPASPGIRAILSSAEPLPQDLEEALVHGWH CEVFDHYGLTETGYGGGVECCGKQGYHLREGDLFFEVVDPVSGEPVPDGTPGEVVFTTLT 
RQAMPLIRYRTGDMAAMLPGPCVCGSPLRRLSRIRGRFRKVGGRLEVLAPRKGWMEDSG >gi|316923163|gb|ADCP01000073.1| GENE 88 78688 - 80055 629 455 aa, chain - ## HITS:1 COG:MTH831 KEGG:ns NR:ns ## COG: MTH831 COG1964 # Protein_GI_number: 15678851 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Methanothermobacter thermautotrophicus # 10 441 7 477 497 254 34.0 2e-67 MTPPARSLPQATESVCPVCLRRIPAEYVLDPAEEGIPSVLLRKECPEHGVFSVPVWRGMP DFRSWVRDKTPSHPRHPFVERERGCPFDCGLCPDHAQHTCTGLIEVTGRCNLRCPVCYAS AGEQVAPEPSLERIAFQMDRLRQASGACNVQLSGGEPTVRDDLPEIIRMAKARGFALIQC NTNGLRLGTEPGYAASLREAGLDSVYLQCDAADDAAHEILRGRRCLPEKLEAIRVCGEAG IGVVLVATVAAGVNADRLWPLVEMGLRLGAHVRGVHFQPMSSFGRCPWRSDGAPRVTLPE IAAELERQSQGQIRWTDFHPPGCENALCSFSAVYRRNGETLELVQGASSCCDCGETPSAA EGARKAKAFAARHWSAPASPAAARDGDAFDRFLASAGIEQRFTVSCMAFQDAMTLDLERV KGCCIHVVSPSGTLIPFCLYNLTSFDGTTLYRGRV >gi|316923163|gb|ADCP01000073.1| GENE 89 80052 - 80489 219 145 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1819 NR:ns ## KEGG: Ddes_1819 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 142 5 151 154 120 55.0 1e-26 MLELIPLIREGYCCSQLLVLLALQQQGVENPGLVRAAGGLCHGMGQSGGACGLLTGGAAA LGYLACKGAAGEAAHPMAEPLINDYAAWFAQHVCTGGCRDVSCPSIQEKTGGGSDMTLCG DLLAECWDKLVDLCAEYGIDMTEPR >gi|316923163|gb|ADCP01000073.1| GENE 90 80498 - 81208 171 236 aa, chain - ## HITS:1 COG:Rv3729_2 KEGG:ns NR:ns ## COG: Rv3729_2 COG0500 # Protein_GI_number: 15610865 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Mycobacterium tuberculosis H37Rv # 11 147 61 207 311 65 32.0 1e-10 MIPQHREEPLYQRELFRLASADCLRPGGVELTERGLAHCAFGVGERVADLGCGPGVTLAL LAERGLSPVGMDRSAAMLQEAERRLSGVPLLAGTLEGLPFRDACMDGIVCECVLSLSCTP ERALGEMGRVLRPGGRLLLTDIVVREGSHGAGGQGCARGAVPADVVAERLARQGFRILAS EDHSRLLGELAGRLLFQGVPRSALLAWMGACPGGGGSACSGKRFGYWMFVAEKGNK >gi|316923163|gb|ADCP01000073.1| GENE 91 81205 - 81420 320 71 aa, 
chain - ## HITS:1 COG:no KEGG:Ddes_1821 NR:ns ## KEGG: Ddes_1821 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 71 1 71 71 104 76.0 1e-21 MAMEPNYAPDPGDWVCTRCQCPLEQVKVQVGYLGSAFDVSLPRCPSCGLTMIPKSLAEGK MAEVEALLEDK >gi|316923163|gb|ADCP01000073.1| GENE 92 81575 - 83938 1431 787 aa, chain - ## HITS:1 COG:Ta0414 KEGG:ns NR:ns ## COG: Ta0414 COG0493 # Protein_GI_number: 16081537 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermoplasma acidophilum # 21 195 47 229 484 96 36.0 2e-19 MEQADLRSWERQCTQDDMPKCRAACPLQMDVRPFLERMAHGDTPGARKVLERHLPLPGVL GRICEHPCETACVRQELGGPLAVGALERVCVSQCAQQTRNLPRPPKSKKVAVIGDGMAGL VAAWDLSRKAYPVDLFYAGGRPAEGLIAAFPVLTPEALDGELEAMRKSGVTLTRREPDKA LFQEVEGYDAVFVDVSCAPGLAPASRDAVDSVTLNVAGGPEHVCFGGWPSPDGTGTPVAW AAEGRRAAMTLERSMTGVSLTASRENEGASATRLHTPLDGLAPLQRLEPEHPGEGYSLEE AKTEAGRCLQCECLICVRECLYMQKYKGYPRVYARQMYNNAAIVKGHHQANTMINSCTLC GQCEVLCPEGFSMADLCLSFREDMVRRGMMPPSAHEFALEDMAAANGPECALSFVGSGAD GKAAERCGQVFFPGCQLAGARGEQVLAVYETLRKDLGSVGLLLQCCGVPAQWAGEEKLFA ETVEALKSTWESLGRPRVIAACASCCKTLREALPEVSVVSLWEVLDTECPSLSFREEACC GGVPTLSIHDPCSARHDEAWLRSVRSLLSKRGVPFEEPRLSGETTPCCGYGGLTWDANPQ LASAIAADRAGQLEHDAVTSCIMCRERLVAEGKPSLHMLDLLYPGESLHAAATAKGSGLS ARRAGRAALRTEVLRRYAGESVAETADDGIPVRIAPDVLEKMEERHILREDAVRVVRHAE ASGDTFLNRDNGHFLASLRPVRVTFWVEYSVEDGVCVVHDAYCHRMEVPGTSTPKGRYEA VRNPFHL >gi|316923163|gb|ADCP01000073.1| GENE 93 84071 - 86797 3753 908 aa, chain - ## HITS:1 COG:mll4880 KEGG:ns NR:ns ## COG: mll4880 COG1529 # Protein_GI_number: 13474083 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Mesorhizobium loti # 156 908 1 763 774 275 30.0 4e-73 MISKTLIVNGMTKTLLVNPEDSLADVLRGQLLLTSVKVGCGTGQCGACNVILDGKLVRTC TLKMKRVEDGAAVTTLEGIGTPEHMHPLQLSWMFHGGAQCGFCTPGFIVSAKCLLDNNPA 
PTRNDVRDWFQSHRNACRCTGYKPLVDAVMDAAAVMRGEKSVKDIEYKEPVDGRTWGSRR PRPNAVAKVTGTHDFGADTALKMPPDTLHLALAQAKVSHANIKGIDTSEAEAMPGVYRVL THKDVKGKNRITGLITFPTNKGDGWDRPILNDTKIFQYGDALAIVCADSEAHARAAADKV KFDLELLPEYLNAPDAMAPDAIEIHPGTPNTYFEQKEVKGQETAPFFADANNVVAEGSYY TQRQPHLPIEPDVGYGYMNEEGKLVIHSKSIGLHLHGLMIAPGLGLDFGNDMVMVQCNAG GTFGYKFSPTMEALIGVAVLATGRPCHLRYNYEQQQQYTGKRSPFFTTVRYAADKATGKI KAMEADWIVDHGPYSEFGDLLTLRGAQYIGAGYGIENIRGLGKTVCTNHGWGAAFRGYGA PESEFASEVLIDELAEKLGMDPLELRAVNCYKPGDTNPSGQAPEVFSLPDMIDIMRPKFK AAKEKAAANSTDTVKRGVGVAIGIYGCGLDGPDSAESWAQYNEDGTVTIGVCWGDHGQGA DAGALGTAHEALRPLNIAADKIRLVMNDTSKAPVGGPAGGSRSQLVVGNAIRVACETLIE AMKKPGGGFYTYDEMKAEGREVRQYGKWTTPCTSPDENGQGEPFCCYMYGLFMAEVAVDI TTGKTAVEKLTMIADIGKVNNRLVTDGQLYGGLAQGIGLALTEDYEDIKKHSTLIGAGLP YIKDVPDDIELIYLENSRKDGPFGASGVGELPLTAPHAAIINAIYQACGARVRHLPARPE KVLAALKK >gi|316923163|gb|ADCP01000073.1| GENE 94 87100 - 87390 215 96 aa, chain + ## HITS:1 COG:CAC2444 KEGG:ns NR:ns ## COG: CAC2444 COG2158 # Protein_GI_number: 15895709 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a Zn-finger-like domain # Organism: Clostridium acetobutylicum # 1 85 1 84 88 87 49.0 5e-18 MKNSSRFFRNTSCAHFPCHPNADPETFNCLFCYCPLYFLPECIGTPRWNANGIKDCTLCR VPHQPDNYDRIIQKLSAAIRERAAEGKEKGPRQPTE >gi|316923163|gb|ADCP01000073.1| GENE 95 87464 - 87688 386 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPNKLEQAQEALAKVEAHMETLTPQTQARHMAERVRDNLAACIAMAQCNPKAGEILMPNV LTAAHEYLSGLGKN >gi|316923163|gb|ADCP01000073.1| GENE 96 87829 - 91359 4105 1176 aa, chain + ## HITS:1 COG:barA_1 KEGG:ns NR:ns ## COG: barA_1 COG0642 # Protein_GI_number: 16130693 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 340 643 217 523 525 218 42.0 4e-56 MRLLTVLLFGVFLVATQVFLVLKAMQDNLFERDVDPVDAVMERQASISDELTPQERLWLS KNHTVRVGVDPDFYPLEKFDAEGRYTGIGPDYLRILHKMTGLSFRIAPPSDWTNTIDMAT SHKVDMYIAAAETQHRSQYMLFTPSYIVLPGIIVTRQPTKGESVTRGAATDETEPSIDGI 
KDLAGKRVAVVNRYSWHDFLEELHPEITPVPVKNTLEGLQKVAFGEVDAMIDYQFNITEK INNSGIRNLRAAGSIDAPYGHAFGIRKDWPELHSIINKALSKITPEERRIIAQKWLQPYE KKAFSKQTIWMMLFAAEVLFTVFAFVLYWNFSLKRQVSRRTAQLNQELAKYDRAEKELRE SRAMLEQSNVELEHRVSERTRDLQAINAELQLAKEAADAATTAKSQFLANISHEIRTPLH GIMAFAELALLKEKTPPHRYLRAILQSSRALLEIINDLLDVSKIEAGHLELEQAPFMLDD VVHHVCTVALHNAHAGGIELVVDMDPNMQIALIGDAGRLQQIIMNLVTNAVKFTGPGGQI CVTLKQGHAEPLPSDTCPEQRRVQIICFVRDTGVGIAPEFLGQLFQPFRQVDASTTRRHG GTGLGLCICRQLVEMMHGEIWVESEPNVGSTFAFSIPLCTQPDSGATAQLPPPPAQPIHA LVLSASPLQTEILCRHFSALGIDALTATNTEEAVRKAQELPKKQPDLLFLDRKLSEPNTL YALRALRTAFGRTIPAILMDEHAGEQILSVSNRDAAQHDGQVAILSLVTLRGLHESIVSL NEPCLVAYSGRLHPQPSPMETPLFHDVRILVAEDNPTNQEIMEALFEDTGVQLTIVSNGK QALDALRVGAQRGERFDLVLMDVQMPVMDGYEATRIIRTMPELEGLPIVAITAHAMHEDK MRALSAGIARYLSKPLSRAALFGTLRALLPNKILPRKPEQATHRSESAALSAPIVLNNTA LPPCLDAGAIERLNVSPETYQKILRGYVRNTSEALPALRKGLYLDPAMPEQALCRPAWEW LIREAHNLKGASANVGAIAVQQDAHALELALKQFVRQSGNAAVETAALAQTFSPLLGALE EAFATVQTSVLEALPETPQPQEAPPQPVQAGTLDEDQKAQLQHFKEMLALADPERIAEAL MPLTAFLPSALMEKLQQAIDFYDYDEALALLPPDTL >gi|316923163|gb|ADCP01000073.1| GENE 97 91359 - 93224 2355 621 aa, chain + ## HITS:1 COG:ECs3742 KEGG:ns NR:ns ## COG: ECs3742 COG3829 # Protein_GI_number: 15832996 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli O157:H7 # 304 615 272 589 592 260 46.0 5e-69 MEPLLMETEYEKPLVLVVDDDPTNLRILVSKLNREYRLGVAKSGTKALEYMGKQIPDLVL LDVMMPDMDGYAVCEHIKRDPRLCDIPVVFISYVDDPSQKTRGFEVGGVDYITKPFHDAE VLARVRTHIMNKQMREQLKRHNAEIGKELDEHRRQLLALLDNLPGLAYRETVIGGIPDAR RAVSFVSDGVLGLTGYAPDRFMGEERLGLLDIAHEEDRETIRSAIATALKEHRRWELVYR IITAWGEEKWVWEQSSGAFDASGTLITIEGLVNDITEKQKNELGIRRENEELRERLKARC FNNIVGDSPPMREVFELIARAGATEDCVVVFGESGTGKELAARAVHECSARCDKPFIAVN CGAIPENLFESEFFGYKKGAFTGALADRKGCLDRADGGTLFLDELGELSLSAQTKLLRAI EGQGFTPVGGSDLHKPNFRIIAATNRNLAERVASGQMREDFFYRIHVIPIHLPPLRQRKE DIPLLIEYFLNAYPRVGDLPALNGEVMQAFMQYDWPGNIRELHNTLYRYLTLGKVQLGTL 
QVGSGQLPAGGQTGTGEAEHRQALPQEPLASALDRFEREYLLETLRRNDWKRAETADILG IDRRTLFRKIKQFDLEDEEGK >gi|316923163|gb|ADCP01000073.1| GENE 98 93386 - 93847 464 153 aa, chain - ## HITS:1 COG:all4694 KEGG:ns NR:ns ## COG: all4694 COG2954 # Protein_GI_number: 17232186 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 1 153 1 153 153 178 58.0 4e-45 MNVEIERKFLVKSEAWRGLAEGVRFRQGYLQTHPCTVRVRTEGERGVLTIKGQTRGFSRE EFEYEIPRGDADRMLDTLALPSLIDKIRYKIPYQGFVWEVDEFLGDNAGLVVAEIELSDE GQAFERPGWIGEEVTGDRRYANSALASRPFCMW >gi|316923163|gb|ADCP01000073.1| GENE 99 94140 - 95990 2459 616 aa, chain + ## HITS:1 COG:mll4112 KEGG:ns NR:ns ## COG: mll4112 COG1217 # Protein_GI_number: 13473495 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Mesorhizobium loti # 9 608 2 604 609 631 55.0 1e-180 MKDLVRNEKLRNIAIIAHVDHGKTTLVDALFRQSGIFRSDQQVDDRVMDSMDLERERGIT ISAKNCAVSWNGVKINIIDTPGHADFGGEVERALSMADGAILLVDAAEGPLPQTRFVLRK ALERGLKIIVVVNKIDRKDARAQEVLNEVYDLFIDLDATEEQLEFPVLYAIGRDAVAMRE LDAPRVDLSPLFETILEVVPGPSYDPEEPFQMLVSDLDYSEYLGRLAVGRSLHGTVRSNE AMVCVNEAGVPVPLRVTRLQAYEGLRLVDITEAMPGDIVIIAGMDDVAIGDTICTKENIR VLPRLRVDEPTVAMRFTINTSPLAGREGKNVQSRKIRDRLLKEALINVAIKIEEPDEKDS FIVKGRGEFQMAILIETMRREDFELCVGRPEVIFKRDENGQLLEPIEQLYVDCDENFMGV VTDKIAQRKGRMVNCVNNGTGRVRLEFSVPSRGLIGYRDEFLTDTKGTGIMNSLLEGYEP HRGDFPSRFTGSIVSDRAGNAVAYALFNLEPRGQLFVVPGDPVYEGMIVGEHNRDNDIDV NPTKEKKLTNMRASGKDEAVVLTPVRPMTLEHALHFVREDELVEVTPHSIRLRKAELNAL KRYQSAGKKKIASPGK >gi|316923163|gb|ADCP01000073.1| GENE 100 96254 - 97225 1177 323 aa, chain - ## HITS:1 COG:no KEGG:LI0327 NR:ns ## KEGG: LI0327 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 34 323 20 307 308 127 31.0 5e-28 MESMSVLERLSELLPALPPGLIEAWGQLVLFAVPVCLYLAAVVLPFVVLYGLIRAHFGRR SLLERCSRQQARFANFLNWLFLVCFVGLGLSKVVPYETFPPLYQAGLYLSAGLLLGGTII WTVVVAAWKPLRGWPVLHGFLAFLTGSSLACLPIIGLILGRVALQGTELPQENDLQTLIG 
LLLPSASDPFWLYFGLIHFLEVASAGALGLFWLLVRRKIDDFGRDYYVFAANWCGEWAAW GGWFSLIMAGVLCFMLQTQDLLTLENQGALLFVAALFAALLIPSVIWTVIARSATPMRHK IGMIFSLLLLVVAIANSGVLVLL >gi|316923163|gb|ADCP01000073.1| GENE 101 97691 - 99055 1820 454 aa, chain + ## HITS:1 COG:aq_1470 KEGG:ns NR:ns ## COG: aq_1470 COG0439 # Protein_GI_number: 15606634 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Aquifex aeolicus # 1 430 15 414 477 240 35.0 5e-63 MRIVQACRKLGLEFVCVHTAEDIHSGHVRIAKELGGEKSLYQVSSYHDANEILSVADDAG ATAIHPGYGFFAEDFRFARRVTTRERKLIFIGPSWRVIRELGDKINTKRLARSLGVPTVP GSDRPIYDEMEAEKIARSLYDFQLQQGITRPLVLVKASAGGGGMGIEEVYDIDQFRSVYR RIRNYALRQFKDEGVLIEQRICGFNHLEVQIVSDRSGKNPVHFGTRNCSIQSTGLQKRIE VAPGFDSSSIEYDFDADKLLEDITRHSLALARKVGYDNVGTWEWIVTKDGSPFLMEVNTR IQVENGVSARISKVKGQGDVDIIAEQIRIGLGDPLGYTQEDITFEGVGIEYRLIAEDPDN RFTPWVGRIDAFGWPSHPWLKMHTHVPTDESYEIPTEFDPNLALAIIWGENLEQAKERGM QFLDELTLQGENQSGELKSNVNFLRANTGRILRF >gi|316923163|gb|ADCP01000073.1| GENE 102 99074 - 101323 3106 749 aa, chain + ## HITS:1 COG:CC2995 KEGG:ns NR:ns ## COG: CC2995 COG0825 # Protein_GI_number: 16127225 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase alpha subunit # Organism: Caulobacter vibrioides # 32 299 15 273 320 141 35.0 4e-33 MDIEKRAHHLSERLNYIRDIFDGKHPEDIELLHNRLNEFFRRATEGIQSDPDAELTTLEE LFTFMERKLETKIQPMDKVRIVRHPQRICLRDILENVYDNFTEIGGQEEHSIDPSMLIAR ATITRRRGKKTYNQLIMVIGQEKGHGEEFRNGGSVKPWGNSKALQYMKVAETEGIPIHTY VFTPGSFPIEDTPGAAQQIAKNLYEMAGLTVPMVAVFSEGGSGGAEAISLADRRLMLSHG YYSVISPEGAAAIEGRLKPGQRATPELIERCATQLHITAEDNLQFGYIDRVIQEPSLGAR PYHYDFFRTLRQEIIRATDETVLSVRSGMFRGALLRRMSRDDINLDEMYIRWHLSQGARE RLVLRRQKKFLRLSRGAYIDRRPFLNKMRNSMRESWGNISARIKYALITKHQRKFAYLMD EMTSEMHLLKRRLTAPFCRIPRDQRPSIEPETVRNLTTLSDWDEESESRKGKWTYISPRA KEDRAITCPNAGTHGCLDLWSPDLYGEFAGVCTWCGHHFPMEYQWFVKNVFDEDSVREFN GEVEASNPLQFEGFDARLAEARERTKLKSGCITFEAKLDGTKMIVALFAGTFRGGSVGSA EGTKFVEAAELAMKKRYPLLAYVHGTAGIRIQEGTHGVIQMPRCTVAVRRYINAGGLYTV LYDTNSYAGPVASFLGCSPYQYAMRSSNLGFAGRGVIKETTGIDIEPHYHSAYKALARGH 
IQGVWDRREARANLKQVLLTMGGRNLYYR >gi|316923163|gb|ADCP01000073.1| GENE 103 101396 - 102046 935 216 aa, chain + ## HITS:1 COG:no KEGG:LI0490 NR:ns ## KEGG: LI0490 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 216 1 216 216 323 69.0 2e-87 MLDVTSLLEEIKASPFEEQEIVAPHTGVVTFGSLKLGDKVIGPNGTWKERPGTLIATITR ERNPKPLNAVQKGEVCAIHSELEGTYVQAGTPLATIRHFLSKDEVLRILLKQALNLFVAP ERAKYYFVPQIDTKIKVSGCRSVSVYEGMELFIVSRMKREMPLYYTGPDGLIYTVYFEHN ENVDAGSPLIGVCPPDQLQQIEEVVLKVQTEWQEQE >gi|316923163|gb|ADCP01000073.1| GENE 104 102048 - 102392 389 114 aa, chain + ## HITS:1 COG:no KEGG:LI0491 NR:ns ## KEGG: LI0491 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 114 1 114 114 123 57.0 3e-27 MGKLLQIRVSAWTYREEDVVRAWPALTALAWPRPQYPDEKRGVLELVTALENGLSFEDWP QAVAEGLREGIGRAAAVKKDLEEALFQWEPRKANELSETLEEELATLNERAPKP >gi|316923163|gb|ADCP01000073.1| GENE 105 102416 - 102950 571 178 aa, chain + ## HITS:1 COG:PM1950 KEGG:ns NR:ns ## COG: PM1950 COG0629 # Protein_GI_number: 15603815 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Pasteurella multocida # 2 158 4 158 166 117 43.0 9e-27 MLNKVMIIGRLGRDPELRYSQSGSPVCTLNIATDESYTDRDGNRVDRAEWHRVVVFQKAA ENCSQYLTKGSLVFVEGSLQTRKWQDQQGQDRYTTEIKAQRVQFLDKRGGEQGGMPSGAG GSEGGYSAPRRPAAAQQGGGSRPQQRPQQRQDDYEDLGPGFLPRPAAWTTCRSREIWG Prediction of potential genes in microbial genomes Time: Fri May 13 03:21:06 2011 Seq name: gi|316923121|gb|ADCP01000074.1| Bilophila wadsworthia 3_1_6 cont1.74, whole genome shotgun sequence Length of sequence - 54668 bp Number of predicted genes - 47, with homology - 41 Number of transcription units - 23, operones - 9 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 105 - 164 2.1 1 1 Tu 1 . + CDS 318 - 1946 654 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 + Term 2016 - 2069 12.3 - Term 2128 - 2167 4.4 2 2 Tu 1 . 
- CDS 2266 - 2775 330 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases - Term 2781 - 2821 9.7 3 3 Op 1 . - CDS 2827 - 4380 1517 ## COG3119 Arylsulfatase A and related enzymes 4 3 Op 2 . - CDS 4391 - 5803 1795 ## COG1757 Na+/H+ antiporter - Prom 5951 - 6010 4.8 - Term 6187 - 6225 5.6 5 4 Op 1 . - CDS 6252 - 6749 409 ## COG5598 Trimethylamine:corrinoid methyltransferase 6 4 Op 2 . - CDS 6638 - 7681 442 ## COG5598 Trimethylamine:corrinoid methyltransferase 7 5 Tu 1 . + CDS 7622 - 7846 105 ## - Term 7702 - 7753 2.0 8 6 Tu 1 . - CDS 7786 - 9345 1416 ## COG1292 Choline-glycine betaine transporter + Prom 9638 - 9697 8.9 9 7 Op 1 . + CDS 9868 - 10848 952 ## COG5598 Trimethylamine:corrinoid methyltransferase 10 7 Op 2 . + CDS 10912 - 11346 431 ## COG5598 Trimethylamine:corrinoid methyltransferase + Term 11357 - 11399 5.4 - Term 11337 - 11395 14.1 11 8 Op 1 . - CDS 11519 - 12469 777 ## COG2962 Predicted permeases - Term 12522 - 12561 2.0 12 8 Op 2 . - CDS 12580 - 14004 1502 ## COG1027 Aspartate ammonia-lyase 13 8 Op 3 . - CDS 14045 - 15502 1511 ## COG1027 Aspartate ammonia-lyase 14 8 Op 4 . - CDS 15551 - 16591 1303 ## COG2423 Predicted ornithine cyclodeaminase, mu-crystallin homolog 15 8 Op 5 8/0.000 - CDS 16628 - 18550 2201 ## COG4666 TRAP-type uncharacterized transport system, fused permease components 16 8 Op 6 . - CDS 18620 - 19597 1267 ## COG2358 TRAP-type uncharacterized transport system, periplasmic component 17 8 Op 7 . - CDS 19655 - 20359 585 ## COG1741 Pirin-related protein 18 8 Op 8 . - CDS 20352 - 21095 553 ## COG1396 Predicted transcriptional regulators 19 8 Op 9 . - CDS 21014 - 21166 66 ## - Prom 21246 - 21305 6.2 - Term 21252 - 21300 6.0 20 9 Tu 1 . - CDS 21408 - 21650 114 ## - Term 22234 - 22281 6.8 21 10 Op 1 1/0.000 - CDS 22283 - 23500 726 ## COG0438 Glycosyltransferase 22 10 Op 2 . 
- CDS 23532 - 25616 2434 ## COG2206 HD-GYP domain 23 10 Op 3 1/0.000 - CDS 25690 - 26532 478 ## COG3672 Predicted periplasmic protein 24 10 Op 4 13/0.000 - CDS 26533 - 27909 1505 ## COG0845 Membrane-fusion protein 25 10 Op 5 3/0.000 - CDS 27920 - 29956 261 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 26 10 Op 6 . - CDS 30159 - 31556 1457 ## COG1538 Outer membrane protein - Prom 31800 - 31859 3.7 + Prom 31545 - 31604 2.5 27 11 Tu 1 . + CDS 31727 - 32152 193 ## gi|255061425|ref|ZP_05313515.1| nucleotidyltransferase substrate binding protein, HI0074 family + Term 32193 - 32229 -0.9 + Prom 32208 - 32267 3.1 28 12 Op 1 . + CDS 32412 - 32558 141 ## 29 12 Op 2 . + CDS 32551 - 33264 925 ## COG0591 Na+/proline symporter 30 12 Op 3 . + CDS 33294 - 33977 793 ## COG0591 Na+/proline symporter 31 13 Tu 1 . + CDS 34086 - 34376 344 ## + Term 34580 - 34609 1.1 - Term 34568 - 34597 1.1 32 14 Tu 1 . - CDS 34620 - 35798 1716 ## COG1454 Alcohol dehydrogenase, class IV - Prom 35839 - 35898 3.5 - Term 35842 - 35877 1.2 33 15 Tu 1 . - CDS 35979 - 36815 941 ## COG0648 Endonuclease IV - Prom 36837 - 36896 2.0 - Term 36839 - 36893 19.1 34 16 Tu 1 . - CDS 36918 - 38990 3313 ## COG3808 Inorganic pyrophosphatase - Prom 39164 - 39223 4.2 - Term 39376 - 39418 8.1 35 17 Tu 1 . - CDS 39444 - 39923 547 ## Dde_2611 putative lipoprotein 36 18 Op 1 . - CDS 40045 - 40635 625 ## COG0566 rRNA methylases 37 18 Op 2 . - CDS 40666 - 42681 2634 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 38 18 Op 3 . - CDS 42697 - 43587 1058 ## COG2519 tRNA(1-methyladenosine) methyltransferase and related methyltransferases + TRNA 43835 - 43911 85.4 # Arg TCT 0 0 - Term 43671 - 43703 0.2 39 19 Tu 1 . - CDS 43908 - 44120 57 ## + Prom 44620 - 44679 5.4 40 20 Tu 1 . + CDS 44845 - 47382 2360 ## COG0642 Signal transduction histidine kinase - Term 47541 - 47585 3.2 41 21 Op 1 . - CDS 47676 - 49004 810 ## COG4973 Site-specific recombinase XerC 42 21 Op 2 . 
- CDS 49059 - 49580 489 ## LIC025 hypothetical protein 43 22 Tu 1 . - CDS 49929 - 51446 1322 ## GK0543 hypothetical protein - Term 51552 - 51610 18.9 44 23 Op 1 . - CDS 51638 - 52303 851 ## Bcep1808_4559 putative bacteriophage protein 45 23 Op 2 . - CDS 52300 - 53460 1598 ## COG3299 Uncharacterized homolog of phage Mu protein gp47 46 23 Op 3 . - CDS 53460 - 53813 561 ## Reut_A2409 hypothetical protein 47 23 Op 4 . - CDS 53810 - 54541 700 ## BCAL2970 hypothetical protein Predicted protein(s) >gi|316923121|gb|ADCP01000074.1| GENE 1 318 - 1946 654 542 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. AzwK-3b] # 203 498 43 341 425 256 47 2e-67 MKALTRLGQRRYPVQGGYTTEQARELALLSRALGRQVGLVIDRQGKVDMVIVGDPASILI PELPRGRGAAGRLRGIRLMHTHLSPDGLSQEDLMDMLFLRLDSVSVLTVNDYGDPVSFQS GHLLPPNADSKPYRIHPMTAWDRVDIDFNAEAVSLEEELGRVLSEASEAGDSPRAILVSV SPLPRAIQETHIEELRELARSAGIVVTGTLIQRVADIHPRHILGKGKLTELEILALQGQA SLIIFDGELTPAQLNSLSEVTERKVLDRTQLILDIFAQRATTRAGKLQVEMAQLKYTQPR LVGKNRAMDRLMGGIGGRGPGETKLETDRRRIRERIAKIKKELDGLRQQRAFTRARRARQ GLPVAALVGYTNAGKSTLLNALTRSEVLAEDKLFATLDPTTRRLRFPEEHELVLADTVGF IRNLPKELTEAFQATLEELEAADLLLHVADASHPELDRQIAAVDGILADMELNEVPRVLI LNKWDRLEDEMRDILRDRWPDALPISAETRDSLNVLSRCIENTIHWETTANIEITGPMPK VY >gi|316923121|gb|ADCP01000074.1| GENE 2 2266 - 2775 330 169 aa, chain - ## HITS:1 COG:CAC0738 KEGG:ns NR:ns ## COG: CAC0738 COG0847 # Protein_GI_number: 15894025 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Clostridium acetobutylicum # 6 164 3 163 306 94 32.0 1e-19 MADPVYVALDFETADYQPDSACAVGLAKVRGGEVVDTLYSLIRPPRRRVLFTWVHGITWK DVQGSPTFLEFWPQMASFLQGVTHLVAHNAPFDRRVLEACCQSNGIALPDLPFVCTLRES RRKLKLPSHKLDAVCRCCGIPLDHHHAGSDAIAAARILIYLQGESQGGE >gi|316923121|gb|ADCP01000074.1| GENE 3 2827 - 4380 1517 517 aa, chain - ## HITS:1 COG:STM0886 KEGG:ns NR:ns ## COG: STM0886 COG3119 # Protein_GI_number: 16764247 # Func_class: P Inorganic ion 
transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 8 359 4 333 495 115 28.0 3e-25 MSPIKNVIFIMLDTLQFNYLGCYGNTNVKTPNLDRFARQGFLFENAYSEGLPTIPVRRAI MTGRFTLPYAGWKPLDLNDTSLTDMLWCREVQTALVYDTPPMRLPKYGYSRGFDYVKFCN GHELDHETFHNVPLDPAVKAEDYLSPGMLHRNEDGEYDSSSQSLIREAECYLKQRQFWKS DADNYVSVVAREADDWLRRKRDPKRPFLMWVDSFDPHEPWDPPSVWEGKPCPHDPDYQGN PMILAPWTEVKGVMTEEECAHIRALYMEKVELVDKWIGNLLDSIREQGLWDETMVIITSD HGQPMGNGEHGHGIMRKCRPWPYEELVHVPLLIHVPGLEGGKRIESFVQNVDVTATMMDA LGYGQSALSEAGHEGIQTYGADEMHGISLLPVMRGETDTVREVAIAGYYGMSWSLITKDW SYIHWLKNDIDTDEMNRLFYDGSGKGGNAGRQSAELEMKEEMWTCVQGAEVTVPEQDELY DRRADPFQLNNLAGEHPEKAKELLQQLKLYIGELRTL >gi|316923121|gb|ADCP01000074.1| GENE 4 4391 - 5803 1795 470 aa, chain - ## HITS:1 COG:FN1422 KEGG:ns NR:ns ## COG: FN1422 COG1757 # Protein_GI_number: 19704754 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 23 458 10 444 473 297 38.0 3e-80 METKGNKPSLGWAILSLVLPVAMILYGTLVVGVRPPVLPLIGAVALAGLMGLKTGYRWEE LQEGMFEALGRIQIAIAILALVGMIIAAWLASGTIPAIIYWGLKLIAPEHFLLSAMVLCS VASIATGTSFGTMGTIGVALLGVGQALGYPPAMTVGAIVSGAYIGDKMSPVSDSTNITAS VCEVPLFDHILSMLWTTVPAFVAAGIVYTVLGFSFAQEQTASLESLNAILSGLEGHFSLD FVAFIPPILMIALAYKRFPVLPVMVACLLSAVAIAVYEGVAFNDLAKMMTSGYVSKTGIK QLDPLLSRGGLLSIMPTVLLLCSGMAFGGILERSRVLEVLLDAVLKGARSAVRLVASVLA AAYIINLGTGSQMLAVIVPGRAFLDSFKKADISPLVLSRTCEDAGTIGCPLVPWSVHAFY IAGVLGVSAFDFAPYAFLNWFVPIFSILCALTGFGIWRLNGTPVHGSRKA >gi|316923121|gb|ADCP01000074.1| GENE 5 6252 - 6749 409 165 aa, chain - ## HITS:1 COG:SMa1478 KEGG:ns NR:ns ## COG: SMa1478 COG5598 # Protein_GI_number: 16263258 # Func_class: H Coenzyme transport and metabolism # Function: Trimethylamine:corrinoid methyltransferase # Organism: Sinorhizobium meliloti # 1 160 339 498 511 77 29.0 1e-14 MPTRTLAGNTDAKVPDIQAGYETMQNYIQLLMGGTHMINECLGILDGMMTVSYEKYIIDE EMLRRVGCMMRGLDTSEASFDMGVLLETPHMEPFLMHESTLAACSGQWQPDVACWSNYDT WFAEGCPSILDRAAEKCRERLESAPDDLLTPKLNQELSAFAEATV >gi|316923121|gb|ADCP01000074.1| 
GENE 6 6638 - 7681 442 347 aa, chain - ## HITS:1 COG:mlr8280 KEGG:ns NR:ns ## COG: mlr8280 COG5598 # Protein_GI_number: 13476841 # Func_class: H Coenzyme transport and metabolism # Function: Trimethylamine:corrinoid methyltransferase # Organism: Mesorhizobium loti # 8 293 43 325 516 155 31.0 1e-37 MSYFNREFKVLSREQLERVHALTLDILRVKGVLFHSEVAREILAAHGAKVDGACVTFPAS LVERCLSQCPAGFVWRARDPQKSIYTGEGQTDVFVMQDHGPVYVQERHGERRHGTMQDVI NFYKLGQTSRVNAIVGQCTVDPHEVDGPNKHLLVTHQLLRHTDKPIMSWPVATIGENEKV FKMIEMVMGEGYLSSHYFVTASVCALSPLQYAQESADTIIAYARANQPVTVLTAPMTGVS TPISDIGALVAQNAELLAGIVLAQLVQPGVPVIYGTATYAADMRSGAFITGSPFPTLLIA PRCSLPNRFTICPPVHSRAIPTPRCRISKPDMRQCRTISSCSWVAPI >gi|316923121|gb|ADCP01000074.1| GENE 7 7622 - 7846 105 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDPFQLFTAQYFEFSIEIGHIKLLFSMAAHKGNRTGMAGGYSRPFGKGCPAYVPAYSVSS CCLSCFIIFTMEMI >gi|316923121|gb|ADCP01000074.1| GENE 8 7786 - 9345 1416 519 aa, chain - ## HITS:1 COG:PA5291 KEGG:ns NR:ns ## COG: PA5291 COG1292 # Protein_GI_number: 15600484 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Pseudomonas aeruginosa # 1 512 1 498 661 270 32.0 6e-72 MESANDPKKFSILGNYDNKIFWPVAILYGCIIAYAIIDPSGAGATFSSIQRFIIAHFSWL ILLTGASAIFFSVWMACSSRFATVKLGAADEKPEFSFFAWVAMLFCAALGTGFVIFGAAE PLYHLFTAPTVMDAGSAGAVRGVPEAIRLSVVNWGLFGWPLFAVGGWAIGYAAYRHNKPL RTSTGLYGLLGERCNDTLVSKAVDVLAAIGTIGGVSMMIGLGVASISYAFQILFGIELGA TGKFSIMLCFILTYIISTSTGLARGMRYLSESNGYITLGLLFAVLILGATPFTYVINMIM QVAGEFLFRLPQNLLWTDAGNFEPREWSGSWFIFYILWNISYVPFTGGFIARISRGRTMR EFVCGTVLVPLFMTLLWFSVWGSNSCYEQLKGFLPLWETVQGSPEQALYILLGSFPFGSV LCFIAFICFCLFAITTADAASHFIAQQTTNGIEVPRLTMRVFWGCTIGFTGILFQVTGGF AAIKSLAIAAAAPFVLVTFAYIISIVKMMKHDRQQEETE >gi|316923121|gb|ADCP01000074.1| GENE 9 9868 - 10848 952 326 aa, chain + ## HITS:1 COG:MA0932 KEGG:ns NR:ns ## COG: MA0932 COG5598 # Protein_GI_number: 20089810 # Func_class: H Coenzyme transport and metabolism # Function: Trimethylamine:corrinoid methyltransferase # Organism: Methanosarcina acetivorans 
str.C2A # 1 325 1 332 495 217 39.0 2e-56 MAQRWKKAGSLTTGGMSLNVFTQDEMDAIHRATLEVLEHTGVLIGSPEARALLVEAGASV IDEDIVRFPQWMVADAVSCAPETLLLAGRTPDRDVVLDSTRVMFTNFGEAVYLIDPDTGE VRNTTKKDVEKLTRVVDALDVIPVCERMAGAQDYPEQVAELHNYEALLMNTTKHVFTGGG NGKLTQYMIDMAKAAVGEENFEERCPVTFNTCPISPLKLTADVCEVIMTAARNGATVNVL SMGMAGGSTPVNLAGALVVHNCEALAGLVLAQTTRRGAKFIYGSSSTAMDLRYGAAVVGT PELAVLNAGVAAMARYYKLPSWAAGG >gi|316923121|gb|ADCP01000074.1| GENE 10 10912 - 11346 431 144 aa, chain + ## HITS:1 COG:MA0528 KEGG:ns NR:ns ## COG: MA0528 COG5598 # Protein_GI_number: 20089417 # Func_class: H Coenzyme transport and metabolism # Function: Trimethylamine:corrinoid methyltransferase # Organism: Methanosarcina acetivorans str.C2A # 2 137 357 492 495 89 36.0 2e-18 MLAGANIIYGLGMLEMGMTISYSQLLMDAEMAEMMLFSMDGIVVNDETLSVDVIKEVGPR SDFLAHMNTFENMYIQSKPKLIDRLTRDRWNEAGHLDMESRALIAAKELLATWEPEPLPE EACARVRAVLNAAERDYGVPESLE >gi|316923121|gb|ADCP01000074.1| GENE 11 11519 - 12469 777 316 aa, chain - ## HITS:1 COG:SMc02545 KEGG:ns NR:ns ## COG: SMc02545 COG2962 # Protein_GI_number: 15964870 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Sinorhizobium meliloti # 17 304 17 304 313 170 33.0 4e-42 MEHRSPRTAAGGLSQRQGMGILFGVYLSFGLSPLYWNLLHGVSALEMVSHRSLWGSLIIA AFMVWKGQFPAFFRAIRDRREIALIGLCSLAHLWNWWIYIWAVTNGKVLECSMGHYIVPM ISTILGFLFFRERPGRLQWMGIIFAGIGVLILLMGYGAVPWASLNVALSAALFAFFRKRA TVGAAPGMLMELLFSAPFLWGYLIWLGERGQGHFLAGGAYMDLLLIGCGFVSAIPQLGLS LGIRSVPMISLGIMQYILPTTVFMLGAFVMREPISASKLLVFLFIWVGVGCFLSGTVFST GRKKLRQVTEIGRVKY >gi|316923121|gb|ADCP01000074.1| GENE 12 12580 - 14004 1502 474 aa, chain - ## HITS:1 COG:CAC0274 KEGG:ns NR:ns ## COG: CAC0274 COG1027 # Protein_GI_number: 15893566 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Clostridium acetobutylicum # 2 456 4 458 465 323 41.0 4e-88 MRKETDALGEMLLPDDVYYGIQTERNRQCLAITDDPLSAYPAFIKAVAQVKKACALTNKE IGALDAGKADAIMRACDEMADGAFNADFPLCIFRGSGTPFNMAANEVLANRANEILTGYK 
GSDQVHPNTHVNMCQSSNDVVPTAKGIAIHNEVGRVLKAAAHLEAALDAKAAEFADVVKM GRTCLQDAVPITLGQEFSGYAAGIRRNRLLLERERARWCTGVLGATAVGTGMGCMPGFSE HIYKNLSAVCGREIRRDANLFDGLQASDSFIILHAHLQALATVAAKAALDIRLMGSGPRA GFGDIVIPAVQPGSSIMPGKINPVMTEVMILACHRVAGNQAGVGFGAFSGELDLGASSAV PIRSILNSIDILSRSMVLFADKCVKGLSANAEKCLHTAERSTSLATMVSALFGYKIGSRV ANLAYEHNITCREAAEREHLLSHEAADDLFDLLSLTDVKKTEALFAKYAGIRNV >gi|316923121|gb|ADCP01000074.1| GENE 13 14045 - 15502 1511 485 aa, chain - ## HITS:1 COG:CAC0274 KEGG:ns NR:ns ## COG: CAC0274 COG1027 # Protein_GI_number: 15893566 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Clostridium acetobutylicum # 2 453 4 445 465 347 40.0 3e-95 MRIEKDALGEMPLPDGAYYGIQAFRSLSNFDVTDKTFNDYPAVVRALAEIKKACALANRD IGALGADKAGAIARACDDILEGRMEGQFPVNIWRGGGTSINMNLNEVIANRANELLTGHK GYEQVHPNTHVNMCQSSNDVYPTAENIVLYRETGRALEGVAALENSFARKSKEFGGVVRL GRTCLQDGVPMTLGQVFAGFHGVIRRSRERLEALREGFRVGVLGGTVIGTGLGVLPGYVE AVYPHLSAIVGFEMRLPDTPEGSPVSDAGLFDAMQNADGLLTLSGALKVLACSAGKIAND FRLLSSGPRSGFGEIRLPAVAPGSSIMPGKINPFMCDLVVQIMHQVAANDWAITMNASAS DLDLASNATVSFFALLQSLEMIGNGFALFSEKCVEGIVANEAVCRRYAEESTSLATIVST LYGYETGSRIAGIAYREGITCKEAALREKLIPEDAAEELFDVEALAHRAGSVALLKKYGG MRSIG >gi|316923121|gb|ADCP01000074.1| GENE 14 15551 - 16591 1303 346 aa, chain - ## HITS:1 COG:RSp0418 KEGG:ns NR:ns ## COG: RSp0418 COG2423 # Protein_GI_number: 17548639 # Func_class: E Amino acid transport and metabolism # Function: Predicted ornithine cyclodeaminase, mu-crystallin homolog # Organism: Ralstonia solanacearum # 2 339 5 339 340 176 31.0 5e-44 MADTSHSLLFLSQREVIEAGVLNMEQVVPLMEKVYGLHWRGETVLPSKGVIRWGGVETEW EKGRINALPGWIGGDIRTGGIKWIAVSPEGVRAPGMPKVAALVIINDPVTLYPVAIMDGV LLSAIRTGANMGAAALHLAKEDTRTIAIVGGGFQGRTQLMALLVARPQTREIRIYDVNRA QAENFARHMEKRTGRKVAVCATVDETVKGADIVVTATGSTEPLLLRRHLEPGMLYIHVGG NECEYEVISAADKRYVDDWEQIKHRDVSSLAHMFFAGKLKDEDITAEIGAVVVGDRPGRE SDDEIIYVNTVGLGVQDVALGSLLLAKAREKGLGTTVKMWDEPFVL >gi|316923121|gb|ADCP01000074.1| GENE 15 16628 - 18550 2201 640 aa, chain - ## HITS:1 COG:BH2945 KEGG:ns NR:ns 
## COG: BH2945 COG4666 # Protein_GI_number: 15615507 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, fused permease components # Organism: Bacillus halodurans # 29 634 37 641 656 401 42.0 1e-111 MSTTDLPDDDTLETLSEEGYTSRKPTGALKYLYYAIGLGMAAFHIWFLGYSSMEPWALYY SHICFGVVLAYFLYPFSRRSNKRMPTLGDWACMLLAASACIYFILELDTIIYRVGVAPTS LDLFFSTVIVVLILEMTRRTNGWILPSIAIFFLAYALLGKYLPYSLGGHRGYSFARVFSY ITGMDGMLSTPLATSASFVFLFFLFSAFLASTGAGQFFIDIAMAVAGGKRGGPAKVAVIG SALFGTISGNSAANVVASGTFTIPMMIKVGYSPRFASAVEAVASTGGQMTPPILGAAAFI IAELTGTPYLDVALATVIPAILYFVSIFYMIDLEAYKEGLHGYAKDELPDARKVLLERGH LVIPIFVLIFVLIGLNASVIKAAIWAIYSTILCTFLRKSTRLTPEQFVAGFADGAKQAVG LIAACATAGIIIGVLNLTGTGLKFASGIIALSGGILPIALILTMGSSLILGMGLPTAAAY LICAAVIVPALTGLGVPALTGHLFIFYFACLSAITPPVALAAFTAASLGKTKPMGVALTA VRLGIVAFIVPFMFVYAPSLLWQGSLPEILGTLATSLCGVFFLGSALQGAFRDRVLNMGQ RGLFLLGSLALIQPGLVTDLIGVGLVACALLWQWLGSHRG >gi|316923121|gb|ADCP01000074.1| GENE 16 18620 - 19597 1267 325 aa, chain - ## HITS:1 COG:BH0601 KEGG:ns NR:ns ## COG: BH0601 COG2358 # Protein_GI_number: 15613164 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, periplasmic component # Organism: Bacillus halodurans # 29 323 48 342 346 150 34.0 4e-36 MFRTFWKFVGVAVMALAVSAPVSHAKQRLLMGTSTGGGSYYVLGGTWSNALNGRLGDKVD LSIEVTGGPETNIMLIENKEMDLGMVTAWQAGDMYNGKGKVPQKFQAMRSFIPLYPSYLQ IYALADSGLKTIHDIDGKHVASSSAGSSSFLAARAIIETLGLKPAKVSGMPASQQLNTLR DGQTQASFSVMGVPAPVIMEMEASNEINLLTMTQEDFSKLLEAYPYWTQGVIPKGTYKAA KEDIHVISLWNFAVAHKDLPEDLVYDMTKATFEVLPQLANAVKDMAKTKPEDILYSSVPL HKGAIKYYREIGLTIPDKLIPAEAK >gi|316923121|gb|ADCP01000074.1| GENE 17 19655 - 20359 585 234 aa, chain - ## HITS:1 COG:RSc2208 KEGG:ns NR:ns ## COG: RSc2208 COG1741 # Protein_GI_number: 17546927 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Ralstonia solanacearum # 1 233 1 231 232 88 27.0 1e-17 MIDVIPLSACHTEQYGTISVTSLLSFNNYFDPRNAAFGSLLVFNDYRVPPAAGFKMHPHH 
NCEQLFFVVEGTQKCVDTLNNSLVLKPVTVQRVTAGTGYARETANYGNSMLRFMSVRFQP RCLHTHPYHEMRTFSPSLMADHLLNVVSGRAESPSKGLPLPFDVDADVYLASLEQTPLEH FVPAGSKTFIYVVSGRIRMDRQILQPGDHVRVSGPETVSLAAAPRAWLIVIDMR >gi|316923121|gb|ADCP01000074.1| GENE 18 20352 - 21095 553 247 aa, chain - ## HITS:1 COG:BH1288 KEGG:ns NR:ns ## COG: BH1288 COG1396 # Protein_GI_number: 15613851 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 1 79 1 77 112 68 46.0 1e-11 MSFGKRLKSLLAKKRQSQMDFALRLGITRSRLCNYINDRSEPDYATLCTIAEALQVSVDY LLGRNGIQDSDFQGSRIFSDFIIGREKPASEGMTVWIPLYLSRSETLEEPPVPSGWLRET SSFVQTGEFRQPYAIIVGDDSMAPEIMPGDIAYIQPCFIYHPFMEQNLGRDLFGVRLDAG DAVGTSLKRCCVQSNLLIFYSNNVKYSPVILDMNKILFVPLIGKVVSIWRSYLDSDMLER IREADDD >gi|316923121|gb|ADCP01000074.1| GENE 19 21014 - 21166 66 50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKNISLYVDRKGTIFHTLKTKRELCLLESGLNLFWQKSVNLKWISLCVWV >gi|316923121|gb|ADCP01000074.1| GENE 20 21408 - 21650 114 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCLVISFSGEFCTIGYGWDATIRSHTLLAHTHPIFTYLLSTTSFLLIKHIDAIRMTDFSA FSQALGVLTASFDFVHLYFR >gi|316923121|gb|ADCP01000074.1| GENE 21 22283 - 23500 726 405 aa, chain - ## HITS:1 COG:sll1466 KEGG:ns NR:ns ## COG: sll1466 COG0438 # Protein_GI_number: 16330066 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Synechocystis # 1 391 1 405 413 204 32.0 2e-52 MRILFLHHTFPGPFRQLAARLGGLPGNEIVFLSERSRRDVWIPGVRNLTVSGVQPVMAKD RAERELLQMMRYGSRFANALLKLQQGGFEPDIVYAHPRWGCSFFAQDIFPQAFHAVYAEW YYTKGANYTFFTQGAARPAVEFAQSRVRNLCQLNALAECDLVVTATSWQKKQYPQQIAKH IHVIHEGVDTDFFSPKPERFRIEGCDLSTVKELVTFSGRGLEPFRGFPQFYRSLPRLLAA RPECHVLIMASERQGSDETVSLERLREEVPVDSRRVHFVGFRPYEEYRLLLRASTVHVYF TAPFALSAGLFESMSCGCLLVSSDTEPVREVVRHGENGFLCDFWDHDMLADMTTELLARS DAMGPVRAAARQSIVEEYNLKVQIPRHMDLLLSTYADWKKARQTS >gi|316923121|gb|ADCP01000074.1| GENE 22 23532 - 25616 2434 694 aa, chain - ## HITS:1 COG:VCA0931 KEGG:ns NR:ns ## COG: VCA0931 COG2206 # Protein_GI_number: 15601685 # Func_class: T Signal transduction 
mechanisms # Function: HD-GYP domain # Organism: Vibrio cholerae # 503 659 268 424 460 93 33.0 2e-18 MEQVIPQSKVGNKSAAYRRMGWFMAVILVLVLCAAGSLAWFNINMAETSLKQDVERRLQF TAQNKANALSLWFNSMQNQANRLISADLFRLFASEVNGLGNDLSPLLKASGDSPSGNDDL SQLASQLPLMKNLLQEFISYSGFLRARITNADAQTYLSTDVTPPALSLEQQQGIRQAVES GKLGILPVRKTSNGLVLDLVVPIFAPQYVENRSEKPVATLLLSLMVSSRLGETIDTAKGE NSFGVTHVFQIVGTKLQDLLPLSADIQNLPDWQLGQSDSLPFGIRGGEAGSPDEVYSIGV KVPELPWLVVQEVPVAAALKPFLAQRNAIVIWAVIAVVVVLLALLAVWWWLVGRNARNVS AELLQLYQISNQQKQLLDGINSALVDGIVLTDKGGMVQYANQAFARMVGRSDEELVGMDC AAIFGYDTALRLYKQLDVAIQSEQSFMFKDVMWLQSKKYHYQITCSPYRNESGVITGTVS VFRDITQLVDAQERNQRMVRQTIAAFMHAIEAVDPYLGGQSSYMANLGGDLVKTLGLSEE DESTVRTAASLSQIGKMQLPRELLTKPGAFTPEERQAMERHVEYARETLKNINFELPVVE AISQMNEFMDGSGYPEKLHGDEIGICGRILSVANTFCALIRPRSYRNAKSVDVALNILES ESAKYDQHVVHALREYLKTPAGEKFVRSLADENA >gi|316923121|gb|ADCP01000074.1| GENE 23 25690 - 26532 478 280 aa, chain - ## HITS:1 COG:VCA1081 KEGG:ns NR:ns ## COG: VCA1081 COG3672 # Protein_GI_number: 15601831 # Func_class: S Function unknown # Function: Predicted periplasmic protein # Organism: Vibrio cholerae # 136 267 46 180 220 89 37.0 7e-18 MILWRWLRVLAGLLAVSIFVGLPYPGAFASDDDDEILLKRSSVGSLLSNRNKAEGARVAA PSDPPPAVTPAEPAREPEAQAAPAGKRGEGGPIRLFGTLEMRSKIGKMPKWTSVLEKERK NPGYVANRQFPGQGAWKDIKAKLSDMSPLEQVKAVNVLINRWPYRTDMDVWGVMDYWETP VEFFQKSGDCEDFAIAKYFALRDLGFPASQMRIVVLKDTLRNLDHAVTAVYLDGDAWILD NLSNAVLSHKRLSHYRPQFSVNEEYRWAHLTPSAPKKTGR >gi|316923121|gb|ADCP01000074.1| GENE 24 26533 - 27909 1505 458 aa, chain - ## HITS:1 COG:VCA1080 KEGG:ns NR:ns ## COG: VCA1080 COG0845 # Protein_GI_number: 15601830 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 9 457 15 481 481 285 37.0 2e-76 MPQQPEKPQQPQLPNDTRTQAIGSEELPYVNEVEAALARKPRLGARFLSLAVGLFFLILG IWANFAEIDEVTHANGQVISSQRTQIIQNLEGGILGAVLVGEGEIVEKGTPLAQLDNKLA ESQYRDAQNRILENELSIIRLDAELKGESPAFPEDLSVSHPQAIADQMAIFHAREQQQNA EMGLLQSQYEQRSREVEELTGRKRQTERSLALAVEQRNIAAPLMQRKIYSRVDYLGLEQK 
VVSLQGDIESLASSIPKAQAAAEEAKQRLTLRKAEMEAEINAEISKRRTELNSLKETLAA GGDRVTRTELKSPVRGTVKQIYINTVGGVVKPGEAIMEIVPLDDTLLVEARVRPADVAFL HPGQKAMVKISAYDFSIYGGLEADLEQISADTIEDKRGEFFYLVKVRTHKNAISYRKEQL PIIPGMVTTVDILTGKKTVLDYILKPILKARQNALRER >gi|316923121|gb|ADCP01000074.1| GENE 25 27920 - 29956 261 678 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 438 670 126 357 398 105 30 7e-22 MAGLAGTGRVSPGACLHAAKRAGLEGKIVYRKSLQDISTLVLPCILLLSRDRSCVLTSLD DGKAGVIFPETGEGVQPVPLQMLADEYTGYAIFATREARLDQRADRIRLLKGKRWFWDVL LYYMPIYRHVAFASVVINLIGVISPLFVMNVYDRVIPNNAVDTLWVLAIGILIAYLFDFL LRNLRSYFVDVAGRNADVVLSSRLVQKVLTMRLDAKPESTGALVNNLREFESLREFFSSS TLLAFIDLPFLVVALLLLGYIGGPLVILPLCAIPVLIITGIVLQEAGKRTAEQGYKQNMQ KNALLVELVNGLETLKACMAESRMLHLWEQVVGVSAKAGSVAKKYNNLAITISTLVTQAV SVGMVIWGVYRIADGTMTMGGLIGSNILVGRAMAPLMQIASLLTRLQNSRMSLQALDLLM QLPSEGQNDNEYVEFGKLDASFSFEDLAFAYPGAERLALTDISVFIKPGEKVGVVGRMGS GKSTLGKMLIGLYQPRDGAVKFGGVDIRQLAEADLRGRVGFLPQDVVLFYGTIRDNIALG DPTINDQMILRASTLAGVTEFIRSNPAGFGAQVGERGMSLSGGQRQAVALARALVRDPDI LILDEPTSNMDNASEQMIKNRLKAVMGNKTLVLITHRLSMLELVDRLIVMEGGRIIADGP KNEVLRRLREQPAPRTEQ >gi|316923121|gb|ADCP01000074.1| GENE 26 30159 - 31556 1457 465 aa, chain - ## HITS:1 COG:VC1621 KEGG:ns NR:ns ## COG: VC1621 COG1538 # Protein_GI_number: 15641628 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Vibrio cholerae # 29 449 15 433 445 148 27.0 2e-35 MKSSHLRMLLLLGALGLACSPESAFSQEKAAAQAYAGKTVTGEVTVKQTVEDTLQSHRGL KVMQENLDVVRHELRRAKAGWGPSVDAVGRTGASRLSNTTTRPLNADKDMYGANSISLTL TQPLWDGFATRSRVRTGEATVDSMTYRVLDNATSFALDAIIAHVDLLRRREILQLAKDNV RQHVEILDSQKERVELGAGSSADVTQTKGRLARAQSTLTDAEASLREGEASYIRLTGKPV PASLAEVYVPEPMYTGYDAVMEVADKNNPKLKAYMSDIKAARGEKELAESAYHPKINFEV GPSYSDRSGPGSQWTSGMDAGLVMRWNLFNSGADKAGTEAAESRTRMATETLYNFHDELA LEIENTWTRYLAAKEQKKYYEEAIGYNTATRDAYLEQFKLGERSLLDVLDAESELFNSST 
QYTTANGNVVVGAYRLYALTGMLLPELGIKEDPLYESPRNQQEKR >gi|316923121|gb|ADCP01000074.1| GENE 27 31727 - 32152 193 141 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|255061425|ref|ZP_05313515.1| ## NR: gi|255061425|ref|ZP_05313515.1| nucleotidyltransferase substrate binding protein, HI0074 family [Geobacter sp. M18] # 28 132 29 133 137 72 36.0 9e-12 MPSLFTPSSASLQHFLQTLSRLEEAFLLSPLSGLEQEGAVRRMRFAVDAGLKAMKDHLAA SGHLPDTPTPPAVLAAAWRRRIIFDGHIWMDMLARRSQLARDDSPETLQAAVKELESRFL PELARLRDWLVQHPCSGERKQ >gi|316923121|gb|ADCP01000074.1| GENE 28 32412 - 32558 141 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLAALTNLPMDGAVGVALVSMFVSFSMSLFGLLRNVKPAKRRMERNHD >gi|316923121|gb|ADCP01000074.1| GENE 29 32551 - 33264 925 237 aa, chain + ## HITS:1 COG:MTH1856 KEGG:ns NR:ns ## COG: MTH1856 COG0591 # Protein_GI_number: 15679844 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Methanothermobacter thermautotrophicus # 36 207 45 216 526 86 34.0 4e-17 MTEILLWAGYFAVFAFLVHKGRSKDAVLSGSVGFGVQAFAYVATYISAVALVGFGGLAHA YGLQMLLVAAGNVWFGTWAVYRFLAWPTRKWQERLGARTPAQLLGLGHGSPSLTRAMAFV FALFLGVYGSAVIKGAALLLAEILPVPVWALIWLVAIIVGLTVFIGGLRGVLYTEAMQGV VMLVGMLMLVGAIFSKVGGPWEGMQALAALPPSDLANNGFVSLSGGERGCSSSRWSS >gi|316923121|gb|ADCP01000074.1| GENE 30 33294 - 33977 793 227 aa, chain + ## HITS:1 COG:MTH1856 KEGG:ns NR:ns ## COG: MTH1856 COG0591 # Protein_GI_number: 15679844 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Methanothermobacter thermautotrophicus # 1 225 267 518 526 59 25.0 4e-09 MMQRHFAIADPRQLKKTTPLAMFILTVLVGGAYFAAALSRLILPEVASPDQVMPKLVQML LPEFALHIFVLAIVSASLSTATGIYHIAVSALSEDLRGRPSNRLSWFTGIAICVLVSGGC AQIKGQLVALLCTTSWSIVGAMALVPYVALVRFGRRNAGAAWASAACGFFSCLAWYLIAY GPTSIWGPQLGSVAAGIPPFFIGFLFSWIGWIAVSALSESPALEQEA >gi|316923121|gb|ADCP01000074.1| GENE 31 34086 - 34376 344 96 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MSQPEINIRILDLNGPRMAQTERSLRRHLKQHAVEARITCVGCGLEIARQGFTNATPALL MNQYTITEGKEITEEAIETFCKQLLVWIEKQQAGTR >gi|316923121|gb|ADCP01000074.1| GENE 32 34620 - 35798 1716 392 aa, chain - ## HITS:1 COG:TM0111 KEGG:ns NR:ns ## COG: TM0111 COG1454 # Protein_GI_number: 15642886 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Thermotoga maritima # 2 391 3 387 387 291 40.0 2e-78 MKFTYSIPTKILFGPGSFNDLATTPLPGKKALVVITAGKSMRANGYLDRLIDMLDKQGIG HALFDKILPNPVLRHVHEGAALAREQGCDFVIGLGGGSSIDSAKSIAVMAKNPGDYWDYI SGGSGKGQPLVNGALPIVAIATTAGTGTEADPWTVITKEDTNEKIGFGCAETFPVLSVVD PEMMLTIPAHLTAYQGFDALFHATEGYIANVATPLSDAYALKSIELLAKYLPAAVKDGSD LEARTQVALASTLSGMVESTSCCTSEHGMEHALSAFYPKLPHGAGLIMLSEAYYSFFADK SPERFTAMAKAMGVVTGHLPEAERPMAFVKALVELQKACGVDGLKMSDYGITKEDMAKCA ANARHTMGGLFELDPYALSLDETTQIMMNAYK >gi|316923121|gb|ADCP01000074.1| GENE 33 35979 - 36815 941 278 aa, chain - ## HITS:1 COG:BS_yqfS KEGG:ns NR:ns ## COG: BS_yqfS COG0648 # Protein_GI_number: 16079568 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Bacillus subtilis # 1 273 1 282 297 199 37.0 6e-51 MLHIGCHLSSSKGYRHMGEDALSINADTFQFFTRNPRGSKAKKADPKDAEALRGLMEEHA FAPVLGHAPYTLNACALDEGLRAFARDAMREDVMTLDTYLPHMLYNFHPGSHVGQGIDAG MALIVELLDAILRPEQTTRVLLETMSGKGTEVGSRFEELGHILRSVSLGDKMGVCLDTCH VYSAGYDIVNDLDGVLEQFDRHVGLGKLYAIHLNDSLMPFASRKDRHARIGEGTIGLEAI ARIINHPALRHLPFYLETPNELPGYAHEISVLRAAWIE >gi|316923121|gb|ADCP01000074.1| GENE 34 36918 - 38990 3313 690 aa, chain - ## HITS:1 COG:AGc2169 KEGG:ns NR:ns ## COG: AGc2169 COG3808 # Protein_GI_number: 15888511 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 680 8 703 714 722 64.0 0 MFSFIVVLACGLFALYYAYDTSKRVFSAGTGTEEMQKIAGAIQEGAQAYLNRQYKTIAIV GVIVAVLLAWALSGLVALGFVLGAVLSGVAGYIGMNVSVRANIRTTEAARQGMTPALNVA FQAGAVTGLLVVGLGLLGVVLYYGLLTMFGVEQKTMLEALVALSFGASLISIFARLGGGI FTKGADVGADLVGKVEAGIPEDDPRNPAVIADNVGDNVGDCAGMAADLFETYAVTVVATM 
LLAAIYFTGELQHLMMMYPLLIGGVCILGSVFGTKYVRLGSDNNIMKALYRGLTVAALAS AGLIIVVTFLLFWGAPEMQIQGKTFGVGNLLMCSLTGLVITAALVWITEYYTATAFRPVR SIAQASTTGHGTNVIQGLAVSMESTALPVIVISIGILFSYSNAGLFGIAIAATTMLALAG MIVALDAYGPVTDNAGGIAEMSNLPGDVREVTDALDAVGNTTKAVTKGYAIGSAALAALV LFACYTEDLGRFFPNLNVSFSLQDPYVLVGLFIGGLLPYLFGAMGMMAVGRAGSAVVVEV RRQFREIPGIMEGTAKPNYGAAVDMLTKAAIREMIMPSMLPVFAPIIVYFIIDAFGGQSA AFATLGAMLMGTIVTGLFVAISMTSGGGAWDNAKKYIEDGHHGGKGSDAHKAAVTGDTVG DPYKDTAGPAVNPMIKIINIVALLLLQALA >gi|316923121|gb|ADCP01000074.1| GENE 35 39444 - 39923 547 159 aa, chain - ## HITS:1 COG:no KEGG:Dde_2611 NR:ns ## KEGG: Dde_2611 # Name: not_defined # Def: putative lipoprotein # Organism: D.desulfuricans # Pathway: not_defined # 5 159 15 171 171 128 39.0 6e-29 MNLRVFALLGVLLLAGCSGLSVRHLPSNPWKNEASQNIQMRYLAFQYQVIPVGNELGIVA EAYPVIDKLPDWASWYGEIMLSVYVSDEYGRVLASKDTVLVPRPLNREAGLPLEVSLDLG TNRHQPLSISFGYRLVLTDTQPDGTAGRRMLVSERALEK >gi|316923121|gb|ADCP01000074.1| GENE 36 40045 - 40635 625 196 aa, chain - ## HITS:1 COG:aq_1661 KEGG:ns NR:ns ## COG: aq_1661 COG0566 # Protein_GI_number: 15606763 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Aquifex aeolicus # 5 193 7 195 211 140 39.0 2e-33 MAREITERRRAKLERVLRWRQKDLTLVLANIHDPHNVSAIYRSADAFGVAALHLYYTDTA FPELGKKSSASARKWVESVRHRSAEELYAALKAGGHQILATSCTPESKPLGAYDFTKPTA VIMGNEHSGVPAELNSLVDGEVYIPMYGMIQSFNVSVAAAVILSEASRQRVQAGMYDRPS YPPDELAARLEAWIEK >gi|316923121|gb|ADCP01000074.1| GENE 37 40666 - 42681 2634 671 aa, chain - ## HITS:1 COG:AGl1953 KEGG:ns NR:ns ## COG: AGl1953 COG0488 # Protein_GI_number: 15891093 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 527 29 554 567 320 35.0 8e-87 MKITIQNLSKAFGGRDIFSDFSLDIDSGVRLCVCGPNGCGKSTLIRMLADADAPDSGRII MPRGCRVGYVQQDLDEAVLDTPLLEWVVDVLPDWHDFWSAWEAAAASRDEAAIARLGARQ AELEALYGYNPEHKAQAVLSGLGFDEGKWHLPIKQLSGGWRERAKLARVLVAGADVLLLD 
EPTNHLDIEAVEWLEDFLMDYKGALVFVAHDRVFMDRIGTHVLYLGGSKPVFRKATFSQF VELQEELEEQREREAQRLNAEIERKMDFVRRFGAKATKARQAGSRQKMAKKLEKELEGFR PEAKRKELSFKWPEPPKADKTILSVADLAFAFPDGVSLWPPLTFTIYRGQRIALVGHNGC GKSTLLKILAGRLERKGGTQVMGSLVKMGYYSQHQTELLNSSGTVLGEIRRLSDPRMTEE ELMSVLGLFLLGQNYFDRQVSSLSGGERSRLVLASLFLARANFLVLDEPTNHLDLESREA LVEALNAFDGTLLMVAHDRYLLSEVADEAWSMAGDGITVYKEGFAEYDVARRAALAASKA GKEREKDAGRKAEGPVPGKALDREELKRLKREQAEQRNALYKKMRPKQEAYAKLETQLEA LLTEQSDVEAALADPEVYADGNKTTELLKKFSQLKESSEQALEKLGELEAELAELEAQRA ALSMNGGGDEL >gi|316923121|gb|ADCP01000074.1| GENE 38 42697 - 43587 1058 296 aa, chain - ## HITS:1 COG:TM0748 KEGG:ns NR:ns ## COG: TM0748 COG2519 # Protein_GI_number: 15643511 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA(1-methyladenosine) methyltransferase and related methyltransferases # Organism: Thermotoga maritima # 5 250 11 255 265 179 41.0 5e-45 MPAYGDLVIIVTPKGKRSLRRVDENQDMHTQDGILRMADVVEAPFGSEVCTTLGVPYRIQ KPTLNDIVKGVKRQTQILYPKDIGYICMRLGVGPNRTVIEAGTGSGSLTVALSWFSGTTG KVHTFEAREEFHKLARRNLEWAGVGQNVTMHCQDIAEGFGDVTGADALFLDVRTPWDYLK HIPAAVTPGAALAFLLPTVDQVGKLLLGLEQGPFDDVEVCEILVRRWKPIADRLRPEDRM VAHTGFLIFARHQERSAKWDECRTASLGTRERKQEAARRERLGLDGMETASEQIDE >gi|316923121|gb|ADCP01000074.1| GENE 39 43908 - 44120 57 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSLALSKGTCGRTFGMANQSGGRAILCRKRASGRHHMDKTGRKTGNQWLTKAGNDKEDTY TLVCILLKSS >gi|316923121|gb|ADCP01000074.1| GENE 40 44845 - 47382 2360 845 aa, chain + ## HITS:1 COG:PA3462_2 KEGG:ns NR:ns ## COG: PA3462_2 COG0642 # Protein_GI_number: 15598658 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 323 601 1 279 385 185 38.0 3e-46 MSTFLLYQTKEKVSTAYKHPYTVSNTAREIHSRALDTKFFYRKLLSSETSDKKKLALIIH ERFLKMNMDRDKIKKKYLGPEKDIERLFDATDAFHNALLEGLSYSPSHSKTEILSYIDAN VEPAYIELEDSLKTVIAFSNGKMREFVGQSSFTVNMATAGSFFLILVVLTVAFFFYRLQK KAEQAIAYREKLFDILCNSIDDVFVIYHVRDKRIEYVSGNADRILGLGNNDISILYSRLS AANEKVLDDFSKQAPFDAPKGCDFSMKDTLTGEDRFMHLHLYPVKENAGTIRYVVSLSDR 
THEVKTRQTLKDALASAQQANTAKRDFLSRMSHEIRTPMNAIIGMATIAAAHIQNHGRVA DCLHKISFSSKHLMSLLNDVLDMSKIESGKLAVHHEEFDLPNLIDSIVSIMYPQTESQGQ HFSVALSGIEEEKLVGDPLRINQILLNLLSNARKFTPEGGSIKLEVSQKRKNGGVLMRFT VSDTGIGLSEAFQKRLFKPFEQADSSISQKYGGTGLGLAITHNLVTLMNGTIGVRSKEHE GSCFTVELPLALPPNGHIQKKEQIARDMKVLVVDDDLDTCEYAALILRRMGIAAKWVLTA REALDQVVDAHERGNGYDVCLIDWKMPEMDGIEATRRIREVVGPETLIIIITAYDWTSIE QRAREAGANAFLSKPLFSSALYHTLSAVSQKKPASAAAELEPGTEGQAPSLRGRHVLLVE DNDLNREITEEILKMKGVSFTCAENGQIAVDIFTASAPGTFDAILMDIQMPVLDGYAATA AIRASHSPESQAIPIIAMTANAFHEDVVSALSAGMNSHISKPIDPECLYQVLIASLRDQA KPAGA >gi|316923121|gb|ADCP01000074.1| GENE 41 47676 - 49004 810 442 aa, chain - ## HITS:1 COG:XF1483 KEGG:ns NR:ns ## COG: XF1483 COG4973 # Protein_GI_number: 15838084 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerC # Organism: Xylella fastidiosa 9a5c # 117 346 55 285 294 76 29.0 1e-13 MGLRFVEYRRRQWVLRYRDPWSGKQRIKSFSTEGEARQFEAVQAEMHIRERELMRRAKRR MSSASATKATVAELLERYAGTLENPTTRSTTTQHAEPLTAIFGHRRAHCVTCDDVLAWCE VQRQRGVGQSTVHRRVSILRAAYGWAVRARLLLSNPLVSLRIAKPKAQRIAPPSVKEARL LYDAAAPHIQRVIVLGMATGARIGPSELFRLRWSDVDVGTGVIRMPNAHKGAGEESRDVP IRDDVLGLLRRWHEEDRQTGCAYVIAYKGRPVRSISSGWHNALRRAGIARRIRPYDLRHA FASLALVYGADIKCVAETMGHKNITMLLSVYQHTLFEQRRRAVNAAPGLFSGTGVKRGKK KRLPPGRGRRPARSSNRPRNAGRRLRTTRRPHGALPLAFALEPARRTGRMPGLRTHGSRT EAGRDERGFPAGWPFWRVAPLA >gi|316923121|gb|ADCP01000074.1| GENE 42 49059 - 49580 489 173 aa, chain - ## HITS:1 COG:no KEGG:LIC025 NR:ns ## KEGG: LIC025 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 46 125 34 115 179 82 48.0 5e-15 MTIPQLHMYDLKTGEYTGSRDATRRPNGEYMLEATGATPVALPADIPAGHIARWTGDAWE AVEDHRQHMDERGRKEGGTPYWLPGDTWRSEPRYTEELGPLPTEALLAKPEKSAEELAQE AQAKIKGRLAALDAKYLTPRTLAGLAVGDEYALEQWRAHEAEAEPFRIQLAAL >gi|316923121|gb|ADCP01000074.1| GENE 43 49929 - 51446 1322 505 aa, chain - ## HITS:1 COG:no KEGG:GK0543 NR:ns ## KEGG: GK0543 # Name: not_defined # Def: hypothetical protein # Organism: G.kaustophilus # Pathway: not_defined # 
219 321 304 406 870 93 53.0 2e-17 MPTSVINEILSFCPQGTIASGDIMALEDYKNDVQRLRGHQPGIARRELENMALRQSSFVA AGLAQFVAKRYAPGVRDDADLDALETAITAAITALIAASNAQTATRLATKRTIGGVAFDG SANIHHYATCSTAAATAAKTVALAGFALATGARVLVKFTVTNSAANPTLDVNGTGAKPIQ YRGAAIAAGTLAANRTYEFVYTGAQYELVGDVDTNTTYPLATASNDGLMAKGDKGKLDGI AVGAEVNQNAFGNILVGSTTIAADTKTDTLTFVAGTNVTLTPDAANDKLTIAAKDTTYAA ATQSVAGLMSAADKKSVDYCEALRLSMIGVPRYWRSTSLPANHVWANGDLVLFSDWPELK KVYDGGGFTGMLLAYNATSATIAANLGKWRPNAANPTGLYVPKLSDQFFRAWTGGAGREA GGWQEDTMRNITGRIVNVVQGLTDNTVQRAEGAFYPSGAGAVGYRGEDLSIRHIHLDTSL VVPTGPENVPPHVWQPVAIYLGLQA >gi|316923121|gb|ADCP01000074.1| GENE 44 51638 - 52303 851 221 aa, chain - ## HITS:1 COG:no KEGG:Bcep1808_4559 NR:ns ## KEGG: Bcep1808_4559 # Name: not_defined # Def: putative bacteriophage protein # Organism: B.vietnamiensis # Pathway: not_defined # 4 213 7 216 220 192 46.0 1e-47 MSGYLGLVTSEHRNRPRFMATVAAVTDPLCGLQELLETMRAAFDVDSAVGGQLDRTGEWI GRSRHLRLELDDVYFEWGREAVGWARGSWKGLYDPETGMVRLPDETYRLLLKAKIGANRW DGTVPGAYEVWESAFADTGSLILMQDNQDMSVVIGLAGTPLDAVMRNLLLQGYLPLKPEG VRVAWYAVAPERGPLLGWNCETGGLSGWGKGIWPVRLEPLP >gi|316923121|gb|ADCP01000074.1| GENE 45 52300 - 53460 1598 386 aa, chain - ## HITS:1 COG:XF1704 KEGG:ns NR:ns ## COG: XF1704 COG3299 # Protein_GI_number: 15838305 # Func_class: S Function unknown # Function: Uncharacterized homolog of phage Mu protein gp47 # Organism: Xylella fastidiosa 9a5c # 3 385 7 387 387 117 27.0 4e-26 MALATIDETGLHLPDYPTVLEDVKARFRGIYGDDLYLEPDSQDGQLCAVFALALHDAYTL AGSVYNAYSPATAQGTGLSRMVKINGLRRKPSGRSTVDLRLVGQAGTVIRGGMAGDAAGK RWLLPDEVAIPQSGEITVTATAEESGDIRAAAGDIVKILTPARGWQSVGNPAAALPGAAV ETDAELRRRQAISTALPSLTVFEGTLGAVASIPGVTRSRGYENDGGVPDADGIPGHSICM VVEGGDTAAIAEAIAAKKGPGAGTYGTTEALVRDKFGVPNVIKFFRPVETPVYATVTIRP FPGYLSTTGESIRKNVAEHINGLNIGDDVSLSRLYSPANAANAASYDIESITLGTSQGAQ SAANVAVAFNAVASCSVDRVKLVVRP >gi|316923121|gb|ADCP01000074.1| GENE 46 53460 - 53813 561 117 aa, chain - ## HITS:1 COG:no KEGG:Reut_A2409 NR:ns ## KEGG: Reut_A2409 # Name: not_defined # Def: hypothetical protein # Organism: R.eutropha # Pathway: 
not_defined # 1 117 1 116 116 124 57.0 9e-28 MKYRKLTENGDYAFGRGGADMHADTPEAVGQAVLTRLRLFAGEWFVDLKEGTPYVPGVLG KHTQDTYDPVFRERILDTEGVTGIVSYASSFDGETRKLSVRAVIGTVYGETTIQEVF >gi|316923121|gb|ADCP01000074.1| GENE 47 53810 - 54541 700 243 aa, chain - ## HITS:1 COG:no KEGG:BCAL2970 NR:ns ## KEGG: BCAL2970 # Name: not_defined # Def: hypothetical protein # Organism: B.cenocepacia_J2315 # Pathway: not_defined # 1 240 1 233 236 205 46.0 1e-51 MDRRERWAEPVEALRAALDGRQAEMWTALPGIVQSFDPAAMTVSVQPAVAGRISDEAGKA ASVDLPILPDVPVVFPGGGGFALTFPVAAGDECLVVFASRCIDAWWQSGGVGEPMEPRMH DLSDGFALVGVRSQPHRLSPAVHTGNTQLRADDGSAYVEITPGGAVTAVGPSSVTVRSGG SITLDAPRIVIKGLLSMQSQGGGATTATLAGSLNATGDVTASNISLNSHTHPGDSGGTTG GPQ Prediction of potential genes in microbial genomes Time: Fri May 13 03:22:25 2011 Seq name: gi|316923106|gb|ADCP01000075.1| Bilophila wadsworthia 3_1_6 cont1.75, whole genome shotgun sequence Length of sequence - 10653 bp Number of predicted genes - 14, with homology - 8 Number of transcription units - 3, operones - 2 average op.length - 6.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 812 864 ## Reut_A2406 hypothetical protein 2 1 Op 2 . - CDS 805 - 1110 453 ## bglu_1g15740 hypothetical protein 3 1 Op 3 . - CDS 1154 - 1675 611 ## Xaut_3644 putative bacteriophage protein 4 1 Op 4 . - CDS 1672 - 4416 2638 ## 5 1 Op 5 . - CDS 4416 - 4568 187 ## 6 1 Op 6 . - CDS 4595 - 5002 611 ## SG1658 hypothetical protein 7 1 Op 7 . - CDS 5005 - 5448 596 ## Bmul_1816 putative bacteriophage protein 8 1 Op 8 . - CDS 5465 - 6949 2259 ## SG1660 hypothetical protein 9 1 Op 9 . - CDS 6963 - 7547 657 ## Bmul_1818 putative bacteriophage protein 10 2 Op 1 . - CDS 8070 - 8333 72 ## 11 2 Op 2 . - CDS 8330 - 8851 72 ## 12 2 Op 3 . - CDS 8929 - 9219 365 ## 13 2 Op 4 . - CDS 9219 - 9605 451 ## RB2501_01256 hypothetical protein - Term 9778 - 9824 -0.9 14 3 Tu 1 . 
- CDS 9986 - 10372 353 ## Predicted protein(s) >gi|316923106|gb|ADCP01000075.1| GENE 1 2 - 812 864 270 aa, chain - ## HITS:1 COG:no KEGG:Reut_A2406 NR:ns ## KEGG: Reut_A2406 # Name: not_defined # Def: hypothetical protein # Organism: R.eutropha # Pathway: not_defined # 17 264 5 240 322 246 50.0 6e-64 MAEATSRSGERRDEGRQWLRRCSLLVGPKEGGALELGELRVAFTVKVSEQETPNSASIKV YNLSEATAGRIRKEFTRVILQAGYQGNCSVIFDGNITKVALGQEGGPGAGKNGGEGEGLD TCLEISAGDGDRAYNYALVNTSLAAGSTPDDHVRACMKAFSAKGIEPGYIPLLPGQKLPR GKVMYGMARAYMRDTARRTGTAWSFQKGKMQMVPASGYLPGEAVVLSAGTGLIGTPKAND KGIEIKCLLNPRLRIGGRVRLDNAGGAGRQ >gi|316923106|gb|ADCP01000075.1| GENE 2 805 - 1110 453 101 aa, chain - ## HITS:1 COG:no KEGG:bglu_1g15740 NR:ns ## KEGG: bglu_1g15740 # Name: not_defined # Def: hypothetical protein # Organism: B.glumae # Pathway: not_defined # 3 98 6 105 105 82 47.0 4e-15 MRYVIPLTPEPQRFSIVLAGRELLLAVRWMDAPEGGWLLDMADAEGVPLVSGIPLVAGCD LLEPYAYLGLGGALLLSGDEPPSPDTLGRGVDLLFEVADHG >gi|316923106|gb|ADCP01000075.1| GENE 3 1154 - 1675 611 173 aa, chain - ## HITS:1 COG:no KEGG:Xaut_3644 NR:ns ## KEGG: Xaut_3644 # Name: not_defined # Def: putative bacteriophage protein # Organism: X.autotrophicus # Pathway: not_defined # 17 171 21 170 179 130 45.0 2e-29 MNWLTQPLERVFLRPLRSLGGLSFDVVVSEEHEDTLTIAKHPVEQGANISDHAYRNPCKV VIRGASSESTYGLPVWDPYNATLYNALLALQNAREPFDIVTGKRKYGNMLLEKLTVTTTP DSEHALMVTAECREVIIVRTQVIAVPAEPGRHRNPAKTGGTANKGQKQAVPVR >gi|316923106|gb|ADCP01000075.1| GENE 4 1672 - 4416 2638 914 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAVAGEQILTRLGFEVDGISLKSSLEKAAAFGSALRRAFADASAAFVGVAKGEAEVAAQA DALGVSVERFAEWRFIAEQTGASSDALGASLKAMLANNPGLSDAASELERAGGAMREMGD AQREAYARGLGIDPSLIPMLTRDVSGLRDMFRSLYATAGTDAAQAAEDSKGFLDGMARLA GLCDLLAKTVALSLVGTARSGIEALGDAFVAHFDDIRRVLELVVALASAAAPAIGAALGL LVSWGGQLAGLLGTLDADLVAGIGKGIAAFMALRGAVGAAIIGFANLKEAWALLTAAFSA NPYLLAIGAAVTLAVVVMDNWGAVKAFLLGVWDAVASGVRAAVDAVASAFDNAVNFVTGA GSAIAGAVGPAVSSAMGFAGAAVSGAADLLASAFGGAADVVSGLFSGMGAAIGIALQGIM DGFRLLGGLAAAIFSGDLSGALDAGIGLFRNFGETALGVFPALGEGILGGLSSLWGRVME 
AFPDFGAWASGVAESVDSAFSGAAEAVAPAVSGLAEAAVSAFGGVADTLGAAFGDVAGAA SSAFSGALDMVGPAASEAADRLSSALSGAVAFVSDRFPSLGAAAEAVGQGLASEFQSIRD LVASVFSGDLSGALDAALGLFSGFRETAFAIFAALGEAVLGVFAFVWGQVTASFPDFGAW AASSASAVAGAFGKALGWVREKLYGLVDMLPDWALEKLGWKRLPDAGAEDGMPGFRAPYA HGGQGAPAGNGGSGYGGLSSGNGSSGNAWGAGFAAGVSSGGAWGAGSVGGTPSDGAFAGA PAGGLGTGGSGAGSFGADGPGGPFFPGASGGSFGGAPGSSGQGRAPGGGSSRAGLFGPGF GTQPFFSPSTEAYLMAAPWQSAAMASAGGAVSLESRTEINVSGASSPDEVARRVAVEQDN INADLIRYAQGAAR >gi|316923106|gb|ADCP01000075.1| GENE 5 4416 - 4568 187 50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRPVLRGLCRYEGLKDGTLSLEDVALMNDALTVQEENERRFMAAKEKERA >gi|316923106|gb|ADCP01000075.1| GENE 6 4595 - 5002 611 135 aa, chain - ## HITS:1 COG:no KEGG:SG1658 NR:ns ## KEGG: SG1658 # Name: not_defined # Def: hypothetical protein # Organism: S.glossinidius # Pathway: not_defined # 1 117 1 121 135 65 33.0 6e-10 MQFTIKDKTFDSGRLNAFQQLHVVRRLAPVTERLVALAGSAGDPEAFLGPLARTVGELPD ADVDYILNACLDVTQIRQDTGGFARLRVNGVVMFPLDLTMLLGIAAHVLKDNLSGFFADL PSVLNRAGKAAESDG >gi|316923106|gb|ADCP01000075.1| GENE 7 5005 - 5448 596 147 aa, chain - ## HITS:1 COG:no KEGG:Bmul_1816 NR:ns ## KEGG: Bmul_1816 # Name: not_defined # Def: putative bacteriophage protein # Organism: B.multivorans # Pathway: not_defined # 1 146 1 145 147 127 48.0 2e-28 MGHSYSFLDVQASIYGPGGNVSLAGDEAGVEQGGITVTPAGERSKMTVGADGSVMHSLLG DKSGTVSVKLLKTSSVNAALQIMYNLQTTTGAQHGMNTIVIRDVARGDVITCQNCAFAKQ PAITYGTDAGAVEWTFNAGSISFLLGM >gi|316923106|gb|ADCP01000075.1| GENE 8 5465 - 6949 2259 494 aa, chain - ## HITS:1 COG:no KEGG:SG1660 NR:ns ## KEGG: SG1660 # Name: not_defined # Def: hypothetical protein # Organism: S.glossinidius # Pathway: not_defined # 1 494 1 492 492 445 49.0 1e-123 MARALSVDRVVRVGINLQPMAAARRNFGTLLIIGASGVIDMEERLRAYTGIDGVAADFGM DAPEYRAAELYFSQSPRPAQLCVGRWGKTPTPAILKGGILSDGEADASAWASVKDGSFAV SVGGVSKDITGLDFSGATNMNGVAAVVSAALASAGASCAWDGQRFAMKTSTLGASAGIGY LAPLSEPAGTDISAMLRMTASTGLPPVAGTDGETAKEAVAALADKSGDWYGCVFADEGLA VEDHLDVAAFVEASAKARIYGVTVTDSRALDAGYAEDAASKLKELARKRTIVAYSRNPYA 
IVSALGRAFTVNFSANRSTITLKFKQLPGVVAEGLTETQAQALEAKRCNVFAAYDNDTAI FQEGVMSGPAYFDEIHGLDWLQNAIQSETWNLLYQSKAKIPQTDAGANQIITCIEAVLGE AVNNGLVAPGTWNADGFGLLERGDYLDKGYYVYTTPVAEQAQSEREQRKCPPIQIAAKLA GAIHFVDVQIDVNR >gi|316923106|gb|ADCP01000075.1| GENE 9 6963 - 7547 657 194 aa, chain - ## HITS:1 COG:no KEGG:Bmul_1818 NR:ns ## KEGG: Bmul_1818 # Name: not_defined # Def: putative bacteriophage protein # Organism: B.multivorans # Pathway: not_defined # 1 165 1 162 196 89 36.0 8e-17 MNTSATGGFLQPEPGLARSDIERLIGDVIAGITGLPRDLVRERIAVPQADPETGATWCSF GITKRSESKSQVRHYETEDGQGMSRVLTLETLTVPASFFGPACEDMALKLKAGLHISQNR EPLWHANIALVQAGDVITPPASSAPGVPDDLRWSGRADLTLTFRRGPSSPQGKAAEGTVA ITHAEQPDCGIRGN >gi|316923106|gb|ADCP01000075.1| GENE 10 8070 - 8333 72 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSEEENIRLVREIGEVKAELSGLKAEVAGTNQRLDDIVITQLKDHGKRLAMQDIRISSLE QAENRRRGGFPRSRRSPRSPEASGRCS >gi|316923106|gb|ADCP01000075.1| GENE 11 8330 - 8851 72 173 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPAVGRLMALLCALLPGVCGCAQWTGLALADATAPEGPARPVCTFPASASEWGMPAPLSD TLEGIMPEPDPLPGLPASGAEGRLSGILAHTPDSPLPSAGAFPEPSQPCVAGSAEGLPFV WGTVSPASPCSGGEGLGLAREAGPEAPYPGRRAHPVPGSGGWGLRFKGMEAAR >gi|316923106|gb|ADCP01000075.1| GENE 12 8929 - 9219 365 96 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METAVIDFLFSALAQLAAQYPDAAWLAVALSAVMTVCGLCAVATVWMPVPKETAGPYAAV YRWAHAFAAHFGQNRGAVADGRSETVQAEVKAVTGK >gi|316923106|gb|ADCP01000075.1| GENE 13 9219 - 9605 451 128 aa, chain - ## HITS:1 COG:no KEGG:RB2501_01256 NR:ns ## KEGG: RB2501_01256 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 6 119 1 113 117 124 53.0 1e-27 MAGFTLRHFSPVEFRCKCGCGAGMEKMDADLLHLLDEARDLAGTPFSLTSAYRCPKHNKA VGGVPTSAHTRGYAVDIRCVDSHSRFVILQALLEVGFRRIELAPTWIHVDNDPDKPRDVA FYRHGGAY >gi|316923106|gb|ADCP01000075.1| GENE 14 9986 - 10372 353 128 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGELWVNYAELADAIGEDGAEKLCRAVGGVSTYIPRTQPEGSPLCGVIGMERMRRLCSAF GGLRVTLPNRRRSEPSKVRIARLLESGKAPGAVALEIGVTERYVRMIASRCKIGSGGSGG RGSVCRPD Prediction of potential genes in 
microbial genomes Time: Fri May 13 03:24:59 2011 Seq name: gi|316923102|gb|ADCP01000076.1| Bilophila wadsworthia 3_1_6 cont1.76, whole genome shotgun sequence Length of sequence - 3477 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 833 391 ## Dalk_4586 hypothetical protein - Prom 1075 - 1134 6.8 2 2 Tu 1 . + CDS 1102 - 1770 739 ## COG2932 Predicted transcriptional regulator 3 3 Op 1 . + CDS 2781 - 3125 285 ## Nmar_0969 hypothetical protein + Term 3135 - 3160 -0.5 4 3 Op 2 . + CDS 3193 - 3475 364 ## Predicted protein(s) >gi|316923102|gb|ADCP01000076.1| GENE 1 2 - 833 391 277 aa, chain - ## HITS:1 COG:no KEGG:Dalk_4586 NR:ns ## KEGG: Dalk_4586 # Name: not_defined # Def: hypothetical protein # Organism: D.alkenivorans # Pathway: not_defined # 2 142 3 147 273 155 53.0 1e-36 MKYRKISPCIWNDAKVRELSDKGKLALLFLLTHPHMTPLGAIRANAPGLACELGWSEKVF RKAFGEVLALGIAKEDAKAPLVWFPNFLRHNMPESPNVVRSWVGAFRDLPESPLKGAMLE RTGEAVRTLGEGFRSAFAEAFPGVLGRAAPGVSASRPGNLAGAPSEALPIPFGEAFPEGM PNQEQEQEQRRKGEGARARAARSGMGGLWRGRGLSRLAGAGLRAGPSLAPEAFRRTEARG MGPLRTGACRGNIHRENAHRGEALRRGCPCPADGFLA >gi|316923102|gb|ADCP01000076.1| GENE 2 1102 - 1770 739 222 aa, chain + ## HITS:1 COG:NMB1078 KEGG:ns NR:ns ## COG: NMB1078 COG2932 # Protein_GI_number: 15678006 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Neisseria meningitidis MC58 # 89 222 10 145 146 66 31.0 3e-11 MPHGNLPIYEKTCYGSGMKTDYEKAIEWLNKKAEKAGGITNLGKTVGAPASTFFRVLKGG SPPGADKLLDWISQLGGKIVFPDERMEGYTLIPKVVAQAGAGSSLITSDEVLGMYAFRDD FLRRVGVHTKESVMLDVIGHSMEPMIRHKDTILVDQSVKELRDGDIFLVGFGEELLVKRV QRTPRGWLLKSENRDFSDIVVEGPDLETFRVYGRVRWFGRVV >gi|316923102|gb|ADCP01000076.1| GENE 3 2781 - 3125 285 114 aa, chain + ## HITS:1 COG:no KEGG:Nmar_0969 NR:ns ## KEGG: Nmar_0969 # Name: not_defined # Def: hypothetical protein # Organism: N.maritimus # Pathway: not_defined # 1 74 1 77 109 69 47.0 
3e-11 MRMWMLPPETMCRKHLLGEHVELHMLLGSLRRGKNIDGFLAGKLVDPRRMFRRHEELVLE MERRGYRHASPLDEAECETLARRYGHTGTGIDAGANAAELARRCPECAKRLRRP >gi|316923102|gb|ADCP01000076.1| GENE 4 3193 - 3475 364 94 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRNPYEAEPAEEPRVCATCARYFDFDRMCNVPEAVEKLLRERFGIVLGGSEFAVEPEKIT DCAAWVSARLIDRYALLRAKQDGKAAARAASAPA Prediction of potential genes in microbial genomes Time: Fri May 13 03:25:17 2011 Seq name: gi|316923097|gb|ADCP01000077.1| Bilophila wadsworthia 3_1_6 cont1.77, whole genome shotgun sequence Length of sequence - 3643 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 21 - 509 557 ## COG4917 Ethanolamine utilization protein 2 1 Op 2 . - CDS 741 - 1082 524 ## COG4810 Ethanolamine utilization protein 3 1 Op 3 8/0.000 - CDS 1128 - 2111 797 ## COG4302 Ethanolamine ammonia-lyase, small subunit 4 1 Op 4 . - CDS 2143 - 3504 1659 ## COG4303 Ethanolamine ammonia-lyase, large subunit Predicted protein(s) >gi|316923097|gb|ADCP01000077.1| GENE 1 21 - 509 557 162 aa, chain - ## HITS:1 COG:STM2469 KEGG:ns NR:ns ## COG: STM2469 COG4917 # Protein_GI_number: 16765789 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Salmonella typhimurium LT2 # 9 148 6 143 159 73 31.0 1e-13 MPKRWKCLLAGPPEAGKSTLCAALLEPGSERRVIKTQSPVFHKDLMVDLPGEYITYPQRR TAFLVAAEDVRAILYVQSATAEPAHVPPGLLQTVPNLLIAGIISKIDHPGADIPRAERGL ARMGLRGPFFTVSAFRPETLEPLRRWLGENDLVPCAGTGKGA >gi|316923097|gb|ADCP01000077.1| GENE 2 741 - 1082 524 113 aa, chain - ## HITS:1 COG:ECs3324 KEGG:ns NR:ns ## COG: ECs3324 COG4810 # Protein_GI_number: 15832578 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli O157:H7 # 10 112 32 134 135 143 63.0 7e-35 MPNEANTGTRAYVPGKQVTVAHLIAHPATEICEKAGIPDVDAIGILTLTPGETAIIAGDM 
AVRSADVNVAFLDRFSGTLILYGSVGAVEQALTQVNDMLNRLLQFTICPVTRH >gi|316923097|gb|ADCP01000077.1| GENE 3 1128 - 2111 797 327 aa, chain - ## HITS:1 COG:STM2457 KEGG:ns NR:ns ## COG: STM2457 COG4302 # Protein_GI_number: 16765777 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine ammonia-lyase, small subunit # Organism: Salmonella typhimurium LT2 # 1 319 1 298 298 291 51.0 1e-78 MREQDVERIAAEVLAALREGAPAGADRPVGHGETGTGVVPSAGRGGQPEAAHTGMPTAQD EGRDGGGTLPDLAGPEFRRGCFIASPHRRDVIDDFVSHTGARIATGTVGTRPPTRACLRF LADHARSKGTVFKEVPEEWISGHGLWSVSTLAEDKDTYLTRPDLGRLLSEESLRELKRRY ADAPQVVIVLSDGLSTDALLANYEEILPPLLNGLKQAGIRAGAPLFLRHGRVKAEDRIGE AVGCDVVVMLVGERPGLGQSESMSCYAVYRPTADTLESDRSVLSNIHREGTPPVEAAAVI VDLVRDMLHWKASGIKLNRRQAGDAGA >gi|316923097|gb|ADCP01000077.1| GENE 4 2143 - 3504 1659 453 aa, chain - ## HITS:1 COG:eutB KEGG:ns NR:ns ## COG: eutB COG4303 # Protein_GI_number: 16130366 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine ammonia-lyase, large subunit # Organism: Escherichia coli K12 # 1 453 15 467 467 654 68.0 0 MKLKTVLFGTTYAFGSIKEVLNKAGELRSGDVLAGVAASSMRERVAAKQVLSELTLGDLR ENPVVPYEDDAITRVIQDAVHTPVYGSIRNWTVAEFREFLLDGRTSAADIERVRKGLTSE MAAAVSKIMSNADLICAAKKMPVVVRSNNTVGLPGRFSSRLQPNDTRDDVSSIVAQVYEG LSYGAGDAVIGINPVTDTVENTKAMLNALWEVIERHRIPAQNCVLAHVTTQMEAIRQGAN AGMIFQSISGSEKGLGEFGVSVGLLDEAYDLAGHYCQAAGPNVMYFETGQGSALSADAHY GCDQVTMEARCYGLARRYQPFMVNTVVGFIGPEYLYNHQQIIRAALEDHFMGKLHGLPMG CDCCYTNHADTDQNSNENLMLLLAVAGVNFIISLPMGDDIMLNYQTNSFHDIATARQLLN LRPAPEFEQWLERHGIMENGMLTARAGDASFLF Prediction of potential genes in microbial genomes Time: Fri May 13 03:25:29 2011 Seq name: gi|316923084|gb|ADCP01000078.1| Bilophila wadsworthia 3_1_6 cont1.78, whole genome shotgun sequence Length of sequence - 19115 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 7, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 203 - 238 2.7 1 1 Tu 1 . 
- CDS 391 - 1563 2013 ## COG1454 Alcohol dehydrogenase, class IV 2 2 Op 1 . - CDS 1681 - 1893 79 ## 3 2 Op 2 . - CDS 1894 - 3957 1716 ## COG3284 Transcriptional activator of acetoin/glycerol metabolism - Prom 4088 - 4147 2.5 4 3 Op 1 1/0.000 + CDS 4387 - 5274 1434 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 5 3 Op 2 7/0.000 + CDS 5409 - 6602 1672 ## COG0665 Glycine/D-amino acid oxidases (deaminating) 6 3 Op 3 3/0.000 + CDS 7000 - 7308 366 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 7 3 Op 4 3/0.000 + CDS 7308 - 7811 274 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 8 3 Op 5 . + CDS 7763 - 8698 1161 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 9 4 Op 1 . + CDS 9046 - 10404 2154 ## COG2610 H+/gluconate symporter and related permeases + Term 10414 - 10448 6.1 10 4 Op 2 . + CDS 10469 - 12034 2136 ## COG1760 L-serine deaminase + Prom 12079 - 12138 2.5 11 5 Tu 1 . + CDS 12206 - 13216 1385 ## COG3938 Proline racemase 12 6 Tu 1 . + CDS 13685 - 15184 2221 ## COG1012 NAD-dependent aldehyde dehydrogenases - Term 15355 - 15390 6.1 13 7 Op 1 4/0.000 - CDS 15399 - 18491 601 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 14 7 Op 2 . 
- CDS 18516 - 19115 783 ## COG1566 Multidrug resistance efflux pump Predicted protein(s) >gi|316923084|gb|ADCP01000078.1| GENE 1 391 - 1563 2013 390 aa, chain - ## HITS:1 COG:TM0111 KEGG:ns NR:ns ## COG: TM0111 COG1454 # Protein_GI_number: 15642886 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Thermotoga maritima # 2 360 1 356 387 248 38.0 1e-65 MMKRIDYFIPTRIVFGAGRLKELATLSLPGKKALLCVTEDKLMEKLGIQQRVIELLGQNG VSVAVFDKVTPNPTRTGVMAAAALAKESGCDFVIGLGGGSSIDTAKAASIMMANPGDLWD YASAGSGGRKPVSGAFPVVAISTTAGTGTEADPYAVVTNEETNEKLDFTLDEIFPAISII DPELMLTLPHKLTLFQGFDALFHASECYVNNGNENRLTELFAVDAIEKVAKHLPIVSEDG SNLESRSNISYAANILCGFTQSLVCTTSHHIIAQALGGFFPNVPHGASLLLIADAYYKKV CSLLPTEFDAIGEIMGEKADPAKPGYAFVTALAKLMERTGMDKLAMSDFNINKDDLPKVA HIAAAEVGFECDRYTVTEADVVEILMQSYK >gi|316923084|gb|ADCP01000078.1| GENE 2 1681 - 1893 79 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSERMGREGWGRGEKRGALPPFFRNVSPTFPRVSLSENATVWPFLVKSVALRYRFRVLRY RYFERTNRAL >gi|316923084|gb|ADCP01000078.1| GENE 3 1894 - 3957 1716 687 aa, chain - ## HITS:1 COG:BH1826 KEGG:ns NR:ns ## COG: BH1826 COG3284 # Protein_GI_number: 15614389 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; K Transcription # Function: Transcriptional activator of acetoin/glycerol metabolism # Organism: Bacillus halodurans # 20 590 8 541 623 290 33.0 5e-78 MDTYEKRNLLEKSYQSAVHEAWKQFIANKPIREKDIPPHILRSWKLSREAGMDPLNPQVP PVLNKRELASLCRQHQTLIDSAKPILDMLEVSIRDTGYIAILAVASGHLLAAVGDDNLLV QARSQYNIPGAQRSIKTIGASALSMSIVERAPIQITGYEHYNCWFHEWKCASAPIFNDND IPIASLTISSHISCKDIHTLTLTTSCANCISIRLRESALIDTEKRLNAMLQRVLNSLPEA VVAIDGNGSITHANNKAANYLLPKNGVLVGKNIDDLFPKPELPRVRQFMRRGVPETGDVT ILTHEGERTHFCRFEPIQLSNGDFGMTLSISMKSQIINIANHAGGNYAKYSFDAIRGKSP ELKAQIDLARRTARTASRVLLTGESGTGKELFAQAIHNSSSVCKGPFVAVSCAAIPRDLI ESELFGYVGGAFTGARKGGMIGKMELAKGGTLFLDEVNSLPLEMQAKLLRALQQMEIVRI GDTKPTPVDVRIIAATNEDLKDAVAQGTFRGDLYFRLNVIEIAIPPLRERKADIGYLAGI FLDRLSQASGQGAPEISGPALKAMQDYAWPGNIRELENVCERAWLLSGGAAITKAHLPRA 
SLRRRTGRSPARRAAGLARASMPGHGRVRRAMRPPGTGGRVRKRGHGVLRAYPQHAGRLQ RQPEQGGGAARRGPQHAVQKAAKVWDY >gi|316923084|gb|ADCP01000078.1| GENE 4 4387 - 5274 1434 295 aa, chain + ## HITS:1 COG:PA0223 KEGG:ns NR:ns ## COG: PA0223 COG0329 # Protein_GI_number: 15595420 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Pseudomonas aeruginosa # 4 282 5 283 293 175 37.0 8e-44 MKQIHGIFAVTVTHFQENGEIDYAACAKHINWLIESGVHGLLPLGATGEFSALTLEERKT FAEFAMKEVAGRVPVIIGAVSTNVDVTLEVAKHAASIGADGVMILPPPGLHPSQDEIYAF YKHISENVTLPVMIYNNPGSSGVNILPETLDKIADLPHMGFLKESTGDIMRLTRAVDELA DRLVIFCGCESLAYESFVMGAKAWVCVLANVAPAQSARLYDLIVNQGKLEEARALYRQIL PLLRLTEETGELWQIVKYILKQRGFGTGTLRLPRQPISEGVKAQLDELLGKADFA >gi|316923084|gb|ADCP01000078.1| GENE 5 5409 - 6602 1672 397 aa, chain + ## HITS:1 COG:PA2195 KEGG:ns NR:ns ## COG: PA2195 COG0665 # Protein_GI_number: 15597391 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Pseudomonas aeruginosa # 8 391 6 409 417 144 29.0 2e-34 MNKHENAEIVVIGGGAVGTAVACYLARDGADVALVERGEYAWGSSRRCDGHAVTYDSAPG YFSQFCKIGQDMFPEISRELPCDIEFEPEGLGLLVDDERDMETVLANYEGKKNEGVDVTF WDRDELLRHEPHVSDKVIACLNFNGDAKLNPMRLCFGLAELARQKGAKAFNRTAVTGITV RNGAVQSVETSSGSIATKKVVLASGVWTPGLGDMVGVKVPIRPRQGQILVTERLDGLVSK NYAEFGYLAAKSGKKRPGVTPEMEQFGVAMVLEPSAAGTVLIGSSRRFVGMDTTPHPAVM QAIAQRAKHFFPSFSGVKLIRAYAGVRPASPDGKPIISPTHVEGVYVAAGHEGNGIGLSL ITGKLVSQMLRGETPLVDLAPLCIDRFGMNPPSLPSA >gi|316923084|gb|ADCP01000078.1| GENE 6 7000 - 7308 366 102 aa, chain + ## HITS:1 COG:AF0273 KEGG:ns NR:ns ## COG: AF0273 COG0446 # Protein_GI_number: 11497889 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Archaeoglobus fulgidus # 13 94 23 104 534 72 42.0 3e-13 MFKSVFPPQRALVTITFEGEAIQVEEGTTVAAAVLDRAGGETRTTARKHDGRGPYCHMGV CYECLMEIDGVPNQQSCLIPVREGMSVKRQSGAPSFRTEENK >gi|316923084|gb|ADCP01000078.1| 
GENE 7 7308 - 7811 274 167 aa, chain + ## HITS:1 COG:SMa2225 KEGG:ns NR:ns ## COG: SMa2225 COG0446 # Protein_GI_number: 16263653 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Sinorhizobium meliloti # 5 129 4 129 458 86 40.0 2e-17 MRTQWDIVIIGAGPAGMSAALQATRHGLSVLVLDRQAEPGGQIFRSAGSASADKRKQIGA DYARGEALVREFRQSPATFLGGANVWHLAPGRAYVSHQGTSHVMQARQILIATGAMERPV PLPGWTPAGRARCRGRRRAAQVRLHARPKARSCCAATARSSSRPSCT >gi|316923084|gb|ADCP01000078.1| GENE 8 7763 - 8698 1161 311 aa, chain + ## HITS:1 COG:AGpT58 KEGG:ns NR:ns ## COG: AGpT58 COG0446 # Protein_GI_number: 16119831 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 303 159 459 472 131 33.0 2e-30 MLCGNGPLILQTVVHLKHFGIPIAGVALTGSPASLFKAALRMPGALLRPQYLLHGMGMGL RTFFGAPCHPAANDISIRRDGETLTVDFTSFGKRKSLQGTAVLLHEGVVPETRITRLARC RHAWNPRQRYWHAETDVWGKTNVCGIRVAGDSAGVRGADAAIASGEAVALDICREIGALT LETRDRLAKAALFRLYRCDAMQPFMETFFAPSPSALLPADDAVVCRCEELTAGELRRTIE AGCYSPDGLKSQARPGMGTCQGRMCSAAVAEMIAHAHGLPIETLPPYHAQPPLFPLPLEE LASMSIPPEGL >gi|316923084|gb|ADCP01000078.1| GENE 9 9046 - 10404 2154 452 aa, chain + ## HITS:1 COG:HI1015 KEGG:ns NR:ns ## COG: HI1015 COG2610 # Protein_GI_number: 16272950 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Haemophilus influenzae # 8 439 8 453 488 215 34.0 2e-55 MDGNLSLILAFLVSISVVLALVIKFKLHPFLSLFIGCLLMGVCSGLDPMAIIKTSCKGFG DTMGDIGIIILLGVAMGQILHESGCTREIANTMLRLCGREKSALALNLTGYIISIPVFFD AAFVILIGLIREMAREGKLALNTLVTALVVGLTCTHALVIPTPGPVAVAGNMGADIGWFI VYGVLIALPASLIGGVLYGKFLGRKSPIFLTDEEPDAALVEQALAKSDHPSGALGIAIIF LPIMLIFFANILNAFLDPASSASRVVSFFGNTNIVLLITVFVAYFSLREHLPEPFDTVVS RGAESVGSILAIIGAGGAFGAVIGASGISTAIVDVMQSWSIPVILLGFLMSQCLRIGLGS ITVSIVTASAVLAPVASQLGASPVLVGLAICCGGIGLGLPNDSGFWTICKMSGLSTKQAF LVYPIPTLISGLVGLAVLLILNMFASSLPGLM 
>gi|316923084|gb|ADCP01000078.1| GENE 10 10469 - 12034 2136 521 aa, chain + ## HITS:1 COG:BS_ylpA KEGG:ns NR:ns ## COG: BS_ylpA COG1760 # Protein_GI_number: 16078649 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Bacillus subtilis # 237 520 2 286 300 167 39.0 4e-41 MAKLPASLFNDVIGPVMRGPSSSHVAGAARIGALVRQSLGGDVRSVLVEFDVNGSLAESY HGHGSDIGFVSGLLDIDLTDPRVPDACAIAEREGVEVAFSILDYGASHPNNYRIAATGGD GRTLHWEAVSVGGGMVEMQRFEGFPVSIDGGYHEYLLLIDAHAAPLEQAAASVKAVASSD ELRAESDGARLLINLKRASALDAETVARLRALPGVAGCIALDPILPTRSWGGCSVPFSNA AGLLEYAKQDGRPLWELAARYESVRGNTTEEDIFNKMDRLVGIMEKAVEDGLAGTEYADR ILGPQAHLIDAGQRNRTLLPNDLFNNVTKYITAIMEVKSAMGVIVAAPTAGSCGGLPGTL IGVGRTLGLPREEIVKAMLAAGLIGVFIAESATFAAEVAGCQVECGAGSGMAAAGVALML GGTVAQCLDAASMALQNITGLACDPVANRVEVPCLGKNIMCGFNAVACANMALAGFDKVI PLDETIAAVYDVGLMLPLDLRCTLGGLGKTPASFAIRKRLS >gi|316923084|gb|ADCP01000078.1| GENE 11 12206 - 13216 1385 336 aa, chain + ## HITS:1 COG:mll3979 KEGG:ns NR:ns ## COG: mll3979 COG3938 # Protein_GI_number: 13473394 # Func_class: E Amino acid transport and metabolism # Function: Proline racemase # Organism: Mesorhizobium loti # 2 334 1 331 333 211 33.0 2e-54 MLASQSVFVVDAHTTGTPIRVVTGGIPPLKGASVAEKMEDMRRNHDWLRTCIMQQPRGFL SLVGAILTEPCSPEADYGVFYIDALTYQPMCGAGTLSVAKVLVETGMVKRVEPETKIVLE TPTGLVTVYVKWENHTVASINLENVPAFLYRKDLHIDLAGYGDVSVDIGYGGNFFVLADV HSLGLTITKDTVNLLRSLSKDILAAANKVIKVAHPTNPAINYLDQVLFCQNVPEEDGGYL AQCIFGDAQADISPCGTGTSTRLAQRYFRGLIGLEETFVQKSVCGGAFHAKGLRETTLGG VPAIVPYVSCSDVHITGFNHLIVEENDKLKNGFVSW >gi|316923084|gb|ADCP01000078.1| GENE 12 13685 - 15184 2221 499 aa, chain + ## HITS:1 COG:AGl1790 KEGG:ns NR:ns ## COG: AGl1790 COG1012 # Protein_GI_number: 15891010 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 497 11 501 511 546 54.0 1e-155 MTLLSHADYERIAANLSYPTEAFIGGTFAPALSGETMPTVNPATGKEIARVASCGREDVD KAVAAARKAFESGAWSRMHPSERKKILMRFIGLIEKHQVELAVLESLDSGKPVRECLLTD 
LPETIECLEWHAECADKQYGGISPSGDGRLGLIVREPAGVVACVLPWNFPLMMVGWKLGP ALAEGNSVILKPASVTSLSTLKLAEFAAEAGIPEGVFNVITGPGSSVGEALGLHPGVDVV SFTGSTEVGRRFLEYAARSNLKRIVLELGGKSPFVVLDDVADYAAVAAQAAAAAFWNMGE NCTANSRIIVPRKRKDAFTEALLAELENWRIGDPLDPENNLGSIVSEGQFRTIMGYIEKG KAEGGHVLTGGAPLAIGSGLFIPPTIFDGVTPDMTIVREEIFGPVTAILPADSDEEAVGL ANATAYGLQATLFTEDVTKAHKYARALKAGTVSVNCFSEGDNTTPFGGYKLSGFGGKDKG RESHDQYTETKTIFLNLDR >gi|316923084|gb|ADCP01000078.1| GENE 13 15399 - 18491 601 1030 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 560 1023 5 456 460 236 34 1e-61 MPASLIAAGDALARFFAAPEIGSVPHRSARQGILILVAFIGGMLPISNMAMLSSAYLALS GDFNADLIDLGRLALGNLLIMGLVQPLVNHIHAGLGLRGALFLALCVVSVAGVGAASSGS LTEMTLWYALAGFGGGLFLSLSARILFVAVPPESRRLMMAIWSFFIGFGSNLSPLLGGFL VEFLTWHSLFLLSPVVALPCLVILFFLLPDSRPAGMPPFDWCSFAFLVAFAVPTLIAVCY GQTFGWNSSFILIMIFCAVSALFLFLVSCLSAEAPLLHLRNLRIKGVGPAVMAALLLVIS HVGMRIQMILFMRNVLGYPPSEIGLMFLLPLCVFVLTVVPTGIMVATHGNPKPFLLAGIA CIAGAGLLLSRLDANADWWHMAVPLALRSFGYAMGSASATPFLLRNVPPEKQPQTLPAIN SVRFFSMCLCMGSITTLSVLLNKWYFALIAAQVTQGSSAASASISRWSALFAGQGLSPSQ SRHDALLIVSSSIKKQASVFTFDHLFLILSLVACVAFVCALLSRRVDPHPHAVRAHARNS LSRLWAAVRSRLPRPAFPLAGLLLCCLLAAGCTLGPDYKRPDLDLPAASVAADEPGALFT RERWWEVFHDGALNRLEEAALAHNSNLAQAMARVEEARAAAGVAFADRLPAVGLRSQDGK RLMTEGEELAHHYASRTQEAYKAVGFLSFELDLWGKYRRLDEAARAELLSTEAARDTIRL AVASETALAYFQLRTLQEQERIAADMLKSYERTCKVYETRYRLGQSPETTLRRFAAERDK TQAQLYDLEERRIRCEGALAVLAGFSPAGIVDGAGRAFQEGRTLGELAPPPDVPSGIPSD LLQRRPDVRSAEGRLMAANARIGAARAAFFPSFSLTAEGGYSSTHLDRVFLDPARIWGVM GGLAQPLFEGGRLTAKQRMAEAKYEEARAAYVGSVRNAFTETRDALAGNRISRKALAAST ERVKELTRSNEIMEKQYDAGLSSVMDLLDVRRQLLAARQEQAEARRRQLAAVVGLCKALG GGWTEAKGFE >gi|316923084|gb|ADCP01000078.1| GENE 14 18516 - 19115 783 199 aa, chain - ## HITS:1 COG:aq_1060 KEGG:ns NR:ns ## COG: aq_1060 COG1566 # Protein_GI_number: 15606343 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Aquifex aeolicus # 28 198 216 371 374 105 34.0 5e-23 
AQADRDAASSAYRAAQANVAAAQEELALKKEGSRPEVIRAAEAQVMQAQAELDEINVLCQ DSVIPSPVNGVVAQKLVSAGELVAAGQKLFTLVNTDDIWLNARIEETRIGQIRVGQDVAF TIDGYPGRTFSGRIYEINPAACSVFSLISTENVAGYFTKIMQRVPVKISLPRLDSPETDP GEPEVVFRIGMQGTIEIQL Prediction of potential genes in microbial genomes Time: Fri May 13 03:26:10 2011 Seq name: gi|316923028|gb|ADCP01000079.1| Bilophila wadsworthia 3_1_6 cont1.79, whole genome shotgun sequence Length of sequence - 65729 bp Number of predicted genes - 59, with homology - 45 Number of transcription units - 39, operones - 13 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 259 367 ## gi|121533547|ref|ZP_01665375.1| secretion protein HlyD family protein 2 1 Op 2 . - CDS 250 - 861 709 ## COG1309 Transcriptional regulator - Prom 993 - 1052 5.7 - Term 1158 - 1197 7.3 3 2 Op 1 . - CDS 1315 - 1833 690 ## COG3448 CBS-domain-containing membrane protein 4 2 Op 2 . - CDS 1851 - 2456 633 ## COG3448 CBS-domain-containing membrane protein + Prom 2530 - 2589 2.4 5 3 Tu 1 . + CDS 2624 - 2890 401 ## COG1734 DnaK suppressor protein 6 4 Tu 1 . + CDS 3125 - 4486 1511 ## COG1757 Na+/H+ antiporter + Term 4584 - 4621 1.1 7 5 Tu 1 . + CDS 4654 - 4746 103 ## + Term 4844 - 4886 6.1 - Term 4830 - 4874 11.1 8 6 Tu 1 . - CDS 4914 - 5300 402 ## gi|302863062|gb|EFL85994.1| conserved hypothetical protein - Prom 5521 - 5580 3.9 - Term 5428 - 5464 -0.3 9 7 Tu 1 . - CDS 5612 - 6493 1093 ## COG1284 Uncharacterized conserved protein - Prom 6646 - 6705 2.2 + Prom 6684 - 6743 5.2 10 8 Op 1 . + CDS 6768 - 7187 247 ## 11 8 Op 2 . + CDS 7201 - 7395 153 ## + Term 7435 - 7468 5.2 + TRNA 7601 - 7675 54.2 # Gln CTG 0 0 12 9 Op 1 5/0.000 - CDS 8135 - 8608 583 ## COG0680 Ni,Fe-hydrogenase maturation factor 13 9 Op 2 11/0.000 - CDS 8630 - 10303 2358 ## COG0374 Ni,Fe-hydrogenase I large subunit 14 9 Op 3 . - CDS 10315 - 11265 1077 ## COG1740 Ni,Fe-hydrogenase I small subunit - Prom 11500 - 11559 3.9 - Term 11779 - 11812 2.0 15 10 Tu 1 . 
- CDS 11902 - 12150 329 ## - Prom 12337 - 12396 2.5 + Prom 12275 - 12334 5.3 16 11 Tu 1 . + CDS 12367 - 12765 669 ## LI0138 hypothetical protein + Term 12773 - 12802 3.5 - Term 12816 - 12854 10.1 17 12 Op 1 . - CDS 12859 - 13332 423 ## 18 12 Op 2 . - CDS 13388 - 14584 1541 ## COG1301 Na+/H+-dicarboxylate symporters + Prom 15169 - 15228 3.0 19 13 Tu 1 . + CDS 15265 - 15702 498 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 15757 - 15793 8.0 20 14 Op 1 . + CDS 16130 - 17215 391 ## PROTEIN SUPPORTED gi|167854911|ref|ZP_02477687.1| ribosomal protein L11 methyltransferase 21 14 Op 2 . + CDS 17228 - 17440 350 ## Ddes_0904 hypothetical protein 22 14 Op 3 . + CDS 17467 - 18012 420 ## Ddes_0903 hypothetical protein 23 14 Op 4 . + CDS 18000 - 19100 514 ## COG0063 Predicted sugar kinase 24 15 Op 1 . - CDS 19018 - 19188 87 ## 25 15 Op 2 . - CDS 19254 - 19718 420 ## amb4030 hypothetical protein + Prom 20135 - 20194 4.7 26 16 Tu 1 . + CDS 20260 - 20790 87 ## DVU3296 hypothetical protein + Term 21002 - 21041 7.1 - Term 21119 - 21152 -1.0 27 17 Op 1 2/0.000 - CDS 21162 - 22142 895 ## COG2199 FOG: GGDEF domain 28 17 Op 2 . - CDS 22317 - 23423 867 ## COG0477 Permeases of the major facilitator superfamily 29 18 Tu 1 . + CDS 23939 - 27340 3189 ## COG4625 Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain + Prom 28884 - 28943 8.6 30 19 Tu 1 . + CDS 29055 - 29843 377 ## gi|302861788|gb|EFL84723.1| sigma-54 dependent transcriptional regulator/sensory box protein + Term 29868 - 29899 3.4 - Term 29833 - 29904 17.5 31 20 Tu 1 . - CDS 29922 - 30392 447 ## COG2606 Uncharacterized conserved protein - Prom 30440 - 30499 1.7 32 21 Op 1 . + CDS 30552 - 33950 3820 ## COG0642 Signal transduction histidine kinase 33 21 Op 2 . 
+ CDS 33937 - 35742 2213 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains + Term 35776 - 35815 5.6 34 22 Op 1 24/0.000 + CDS 36454 - 37209 800 ## COG1116 ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component 35 22 Op 2 2/0.000 + CDS 37266 - 38060 1069 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 36 22 Op 3 24/0.000 + CDS 38066 - 38908 1065 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 37 22 Op 4 . + CDS 38932 - 39711 263 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 38 22 Op 5 . + CDS 39713 - 40099 270 ## 39 22 Op 6 2/0.000 + CDS 40193 - 41272 1337 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components + Term 41298 - 41346 11.2 40 23 Tu 1 . + CDS 41377 - 42456 1428 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 41 24 Tu 1 . + CDS 42760 - 43299 469 ## + Term 43303 - 43350 12.5 - Term 43291 - 43338 15.7 42 25 Tu 1 . - CDS 43361 - 44296 677 ## DvMF_2792 YceI family protein - Prom 44469 - 44528 1.7 + TRNA 44586 - 44660 85.8 # Gly GCC 0 0 + TRNA 44675 - 44749 73.3 # Cys GCA 0 0 + TRNA 44755 - 44829 85.8 # Gly GCC 0 0 + TRNA 44849 - 44923 84.9 # Gly CCC 0 0 + Prom 45177 - 45236 2.8 43 26 Tu 1 . + CDS 45360 - 47144 1762 ## COG0513 Superfamily II DNA and RNA helicases + Term 47387 - 47442 19.5 - Term 47574 - 47608 1.2 44 27 Tu 1 . - CDS 47617 - 47862 199 ## - Term 48033 - 48068 3.3 45 28 Tu 1 . - CDS 48140 - 49681 836 ## DvMF_1023 transcriptional regulator, LuxR family - Prom 49797 - 49856 3.1 - Term 50002 - 50037 2.0 46 29 Tu 1 . - CDS 50065 - 51099 333 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 - Prom 51188 - 51247 3.0 47 30 Op 1 . - CDS 51355 - 52935 1244 ## COG0861 Membrane protein TerC, possibly involved in tellurium resistance 48 30 Op 2 . 
- CDS 53007 - 53468 -143 ## + TRNA 53689 - 53763 54.2 # Gln CTG 0 0 + TRNA 53778 - 53854 59.7 # Glu TTC 0 0 49 31 Op 1 . + CDS 54093 - 55304 435 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 50 31 Op 2 . + CDS 55301 - 55468 162 ## + Term 55489 - 55537 6.1 + Prom 56217 - 56276 2.1 51 32 Tu 1 . + CDS 56331 - 56546 81 ## + Term 56598 - 56629 3.4 52 33 Op 1 . + CDS 56671 - 56889 194 ## Slin_1968 phage transcriptional regulator, AlpA 53 33 Op 2 . + CDS 56882 - 58663 961 ## COG4643 Uncharacterized protein conserved in bacteria + Term 58672 - 58735 20.8 + Prom 58746 - 58805 3.3 54 34 Tu 1 . + CDS 58865 - 60052 828 ## pE33L5_0003 mobilization protein 55 35 Tu 1 . + CDS 60197 - 60583 199 ## gi|302861370|gb|EFL84308.1| putative protein MobC 56 36 Tu 1 . + CDS 60711 - 61208 264 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 61340 - 61388 8.7 + Prom 61240 - 61299 2.1 57 37 Tu 1 . + CDS 61541 - 63949 1272 ## COG3468 Type V secretory pathway, adhesin AidA + Term 63958 - 63996 -0.8 58 38 Tu 1 . + CDS 64550 - 64942 125 ## 59 39 Tu 1 . 
+ CDS 65177 - 65729 72 ## Predicted protein(s) >gi|316923028|gb|ADCP01000079.1| GENE 1 1 - 259 367 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|121533547|ref|ZP_01665375.1| ## NR: gi|121533547|ref|ZP_01665375.1| secretion protein HlyD family protein [Thermosinus carboxydivorans Nor1] # 10 82 18 88 342 68 46.0 1e-10 MAVSVRKPVLYAFGVVLCAGIIGGIWLWYRTHCVLITNDSRIEGTLVGVASRLSDRVVQV MVNEGDTVHKGQGLVRIDSRSVMARK >gi|316923028|gb|ADCP01000079.1| GENE 2 250 - 861 709 203 aa, chain - ## HITS:1 COG:BH1965 KEGG:ns NR:ns ## COG: BH1965 COG1309 # Protein_GI_number: 15614528 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 10 196 8 196 196 85 31.0 5e-17 MTAKPGTTSTADRILDAAIGLMNTKGYNPVSVREIAKAVGLSEMTVFRHFPSKYAMLMAA MRRAPYAEMIEDVFTTHVRWELEHDLRHFMRRYMDIAEQRIGIWRIYFSAIGQIDEHSTE LVSDVTSFRERLRAYFEEMIRRGHIRQLDSLYVSSLFWNLIFGFVMTLVIGRKRPGLVRD EYIENAVSVFIGGVAPKEGEAWR >gi|316923028|gb|ADCP01000079.1| GENE 3 1315 - 1833 690 172 aa, chain - ## HITS:1 COG:slr0789 KEGG:ns NR:ns ## COG: slr0789 COG3448 # Protein_GI_number: 16331896 # Func_class: T Signal transduction mechanisms # Function: CBS-domain-containing membrane protein # Organism: Synechocystis # 14 171 37 184 185 104 39.0 1e-22 MAGGEQSPPRAKAGEIFSAWVGSCLAIACIALLERYATGNSGFPLLIGSFGASAVLAFGA IRSPLAQPRNLVGGHFLSALVGVSCYFLFPGTPWLASCLAVATAIALMHVTKTLHPPGGA TALIAVTGGEGIHQLGYLYAFVPCLLGALIMLAIALIVNNIPKAQRYPQFWW >gi|316923028|gb|ADCP01000079.1| GENE 4 1851 - 2456 633 201 aa, chain - ## HITS:1 COG:PA1854 KEGG:ns NR:ns ## COG: PA1854 COG3448 # Protein_GI_number: 15597051 # Func_class: T Signal transduction mechanisms # Function: CBS-domain-containing membrane protein # Organism: Pseudomonas aeruginosa # 8 197 188 370 385 74 27.0 1e-13 MNRNISLASRDPLPLGREDIYKGMEEIGAYLDVTPRDFQEIYAHAYKIARGRLLSSITAG DIMHVPVLCVAEEQSVRDLVIFLDDHRISGAPVVGAEGRLSGVVSESDVVRFVGGGETVT VMHLMHTLMRQGCVSGADLEAPVGSIMTRECVSVGEGAHLGDMLELLRTRRINRIPVVDA GMRPVGIVSRTDIINAFGVMQ >gi|316923028|gb|ADCP01000079.1| GENE 5 2624 - 2890 401 88 aa, 
chain + ## HITS:1 COG:PA4870 KEGG:ns NR:ns ## COG: PA4870 COG1734 # Protein_GI_number: 15600063 # Func_class: T Signal transduction mechanisms # Function: DnaK suppressor protein # Organism: Pseudomonas aeruginosa # 1 88 1 88 88 109 69.0 1e-24 MASGWAGDGAVQDQIQDSINDEVARARRNLPKGESLHFCEECGEPIPEARRKALPGVRLC VACQEEADKDQRAVSLYNRRGSKDSQLR >gi|316923028|gb|ADCP01000079.1| GENE 6 3125 - 4486 1511 453 aa, chain + ## HITS:1 COG:FN0978 KEGG:ns NR:ns ## COG: FN0978 COG1757 # Protein_GI_number: 19704313 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 21 436 5 421 431 251 37.0 3e-66 MAWNHVLLIHTDAKEPDVKAIVLSVFLGGLAVCVLTGASVLWALLLGLACFAGYALRQGH APRDVALMLWSGVRSVRNILIIFGLIGMLTAVWRASGTLPFIIHHTLQWVDPAYFILWVF LLCCLLSLLLGTAFGSVSTLGVIFMMLARSAGLDELATAGAIMSGIYVGDRCSPVSSSAA LVCALTNTNIYANMRRMWTTSALPFALTCAGYLALSLSGPARLAAPDAAGQMDRFFILSG WTLLPAACIVVLSLFRVDVKQAMFWSILAGGAVCLTVQGMEPGELFACLAFGYTPAPGAE ILAGGGIRSMLQVAGIVLLSSSYSGIFDATDLLSGFGGLIRRLAERIGTFRAMTLASVPV SGISCNQTLATILTAQLCRPCYARRQDMAIPLENSAILIAALIPWSIAGSLPCATLGVTM ACLPYALYLYLVPLANALRKDPGPDGPEPATSI >gi|316923028|gb|ADCP01000079.1| GENE 7 4654 - 4746 103 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEGLIGPLISFIGAVLIGIVVGKYIASKRK >gi|316923028|gb|ADCP01000079.1| GENE 8 4914 - 5300 402 128 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302863062|gb|EFL85994.1| ## NR: gi|302863062|gb|EFL85994.1| conserved hypothetical protein [Desulfovibrio sp. 
3_1_syn3] # 41 128 31 118 118 105 59.0 1e-21 MSLQSKAFSHPCRRRSVPRRPLRAAPPPSPASDLFPHPEGEIRRYSFPALAIRCERELLE AQQAEDMKEILSNVSASVDAFASLMELMDAEAKVDGVIIHILGRELQRIGKILFKMQDTY STVELVGA >gi|316923028|gb|ADCP01000079.1| GENE 9 5612 - 6493 1093 293 aa, chain - ## HITS:1 COG:BH1678 KEGG:ns NR:ns ## COG: BH1678 COG1284 # Protein_GI_number: 15614241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 14 289 1 279 290 139 29.0 9e-33 MGFLNGNVLRRPHLSLRREWHTLLAATIGTLLTCFAVVALTVPYQFAGGGVLGLALISNY AWGISPAWVLSVGNAVLLLWGWKALSLRFALWTLYVTVLTSVAIPVFELFQYPVIGNTIL AALLAGAVGGVGFGMLFRVGASSGGTDVIVMVARKRWGVDIGSMSFYINVVILFASFVVV DIEKILMGGLLLYVETVTIDRVLKSFDRRSQVLIISKRTQDIADFILNELDRSATIIPAK GAYSDRPHDMLLVVLTRRQTVDLRQYIAEIDPEAFLIFSDVTEVVGLGFKNWE >gi|316923028|gb|ADCP01000079.1| GENE 10 6768 - 7187 247 139 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLSTFAKNRLFIKSFLSILMLLMLFCTPDALKANASFDTSVTTSFVPSEYLPLETEQATA VVRANKAFLPERGLNRETGFPLLTLLFLLLYPLCTRSLRTLSWADLLLPRLAEIRGMLPL PGAPPLHRISFVSTLRACV >gi|316923028|gb|ADCP01000079.1| GENE 11 7201 - 7395 153 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRQGRPNRARRTHGGYAPPWRNAPAFPDKRQGVLMDIGLKVVLICAGFMAICVVLGMIGK KLAE >gi|316923028|gb|ADCP01000079.1| GENE 12 8135 - 8608 583 157 aa, chain - ## HITS:1 COG:aq_667 KEGG:ns NR:ns ## COG: aq_667 COG0680 # Protein_GI_number: 15606081 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase maturation factor # Organism: Aquifex aeolicus # 5 155 2 152 162 112 40.0 4e-25 MAEQVLVLGVGNILFSDEAIGVRTVEHLQQCASLPGNVELMDGGTLGIRLMDAIMGCDLL IVVDAVLGGGEPGTLYRLEGEGLRESMSFRDSMHQTDLVDTLIYCDLAGHRPDAVVIGME PADYHTMEIGLTPVCQARLPDLAGKVVEELRARGVIA >gi|316923028|gb|ADCP01000079.1| GENE 13 8630 - 10303 2358 557 aa, chain - ## HITS:1 COG:STM1538 KEGG:ns NR:ns ## COG: STM1538 COG0374 # Protein_GI_number: 16764883 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I large subunit # Organism: Salmonella typhimurium LT2 # 16 556 19 599 600 489 
45.0 1e-138 MSEARKTPQSHFTGPVVIDPLTRIEGHLRIEVEVKDGRVSEARSVGTLYRGLETILVGRD PRDVQHFTQRTCGVCTYTHALASTRALEDAIKVEIPKNATYIRNLVLGMQYLHDHIVHFY HLHALDFVDVTSALQADPVKAAKICSSVSPRPASADGFKAVQAKLKAFVESGQLGPFTNA YFLGGHPAYYLDPEANLIATAHYLEALRLQVKAARAMAVFGAKNPHTQFLVAGGVTCYES LTPERIAEFEGLYKEVNDFVNQVYIPDLLLVGGAYKDWTKIGGTANFMTFGEFPGDERNL ESRWFKPGVVFDRKLEALPFDPSKIEEHVRHSWYAGDAVHKPFQGVTEPKFTFMGDKDRY SWMKAPRYDGRAVETGPLAQVLVAYLKGNAEVVPVVDSVLQTLSLTPGDLFSTLGRTAAR GIETAVIAKKTGEMLQEYKENVASGDKKIVENCEVPDKGEGAGFVNAPRGGLSHWLVIEN KKVANFQLVVPSTWNLGPRCSKGIPSAVEEALVNTPIADPKRPVEILRTVHSFDPCIACA VHVIDGRTNEVSKFQIL >gi|316923028|gb|ADCP01000079.1| GENE 14 10315 - 11265 1077 316 aa, chain - ## HITS:1 COG:jhp0574 KEGG:ns NR:ns ## COG: jhp0574 COG1740 # Protein_GI_number: 15611641 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I small subunit # Organism: Helicobacter pylori J99 # 10 315 29 335 384 290 47.0 2e-78 MRIAVGLGKENVEARLERQGVSRRDFLKFCSMVAVTMGMGTGFAEQVARALTMPRRPSVV YLHNAECTGCSEALLRTGQPFIDELILDTISLDYHETIMAAAGEAAEAALEQAISAPEGF ICVVEGAIPTANEGKYGYIAGHTMLDICGRILPKAKAVVAYGTCAAFGGVQAAKPNPTGA QGVNECFGAKGIKAINIGGCPPNPLNLVGTLVAFLRGDNIELDANNRPMMFYGSSVHDQC ERREHFDNGEFAPSFDSEEARNGWCLYELGCKGPDTMNNCPKVKFNGTNWPVGAGHPCIG CSEPGFWDKLSPFYEN >gi|316923028|gb|ADCP01000079.1| GENE 15 11902 - 12150 329 82 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSQKVKTWESHVPGCFGGSDHIFAGHPAEEKRAKELRKVCSEQRIPMDDVAKAIEHFLKS KKVGDALIQEQVERARKFLKLS >gi|316923028|gb|ADCP01000079.1| GENE 16 12367 - 12765 669 132 aa, chain + ## HITS:1 COG:no KEGG:LI0138 NR:ns ## KEGG: LI0138 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 123 1 122 128 98 43.0 6e-20 MKHNNALLHGCLLAAGLTLLPFAAHATGLSNHFLAENSFTQSVKDGYQDIKKEVKETVDE LTGDDKTDTQKFQERRDKDLKKYQKEVRDAQKDYAKKREKAQREYLKHHKQLPFQEDLQK DLESAPAPTTAK >gi|316923028|gb|ADCP01000079.1| GENE 17 12859 - 13332 423 157 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPWVILAAALCVLVWFWPRNTLIGLAVVGTAGVLAGIAFWGYGVYDERQRELVAITVEPA 
AEAASRTSPGIACGEGELLVSITNGSGKTVTGVRFNVGAYREGRSTNIAKGNWFDDDHII NPRGTVSACWKAPGLTRDPEGDRVIWKASIIKLEFAE >gi|316923028|gb|ADCP01000079.1| GENE 18 13388 - 14584 1541 398 aa, chain - ## HITS:1 COG:FN1148 KEGG:ns NR:ns ## COG: FN1148 COG1301 # Protein_GI_number: 19704483 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Fusobacterium nucleatum # 1 388 1 388 390 393 57.0 1e-109 MSAKKSDFGLIIKLLVALAIGAFLGRMANVEVMDAVASVKYALGQIIFYAVPLVIVGFIT PAIARLGQNASKMLLTAVAIAYLSSVGAAAFSAASGYAIIPHLSVPTEVSGLVDLPKPSF VLDIPPLMSVMSALVTALLLGLAVSWTKAEIINKALAELEAIMTSVVTRVIIPILPFFVG ATFCELSYEGRLTVQFPIFLKVIGIVIVGHFIWLAVLYSIGGILSGRNPWRVLRYYLPAY LTAVGTMSSAATLPVALSCARRSPALSKTVTEFMIPLGATVHLCGSVLTETFFVMTISLM LYGTLPSVGTMAFFIVLFGIFAVGAPGVPGGTVMASLGLVTSVLHFDPAGVALLLAIFAL QDSFGTACNVTGDGALAMMMEGIFNRNRQLDDRFGTGE >gi|316923028|gb|ADCP01000079.1| GENE 19 15265 - 15702 498 145 aa, chain + ## HITS:1 COG:MJ0531 KEGG:ns NR:ns ## COG: MJ0531 COG0589 # Protein_GI_number: 15668711 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Methanococcus jannaschii # 4 142 27 165 170 58 30.0 5e-09 MELKKLLYAVDLEDAPHPAIEHVKMVASLTGASVSIVYVLPSETVYESNYLSGEGQPNRI APTQALLEKMEHFLNTNFPEGVDDCAFLMGKISEEVVKYAHKSDIDCIVVGAHGRIGFRQ LFAGSVATEVVKKAHCSVSVIRPRY >gi|316923028|gb|ADCP01000079.1| GENE 20 16130 - 17215 391 361 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167854911|ref|ZP_02477687.1| ribosomal protein L11 methyltransferase [Haemophilus parasuis 29755] # 212 357 2 147 151 155 48 7e-37 MRKNLFATTPGIIIVGAVIGLIAVWLQKLGNPANMGICVACFERDVAGGLGLHRAAIVQY LRPEIMGLVLGSLAAAIATREFKPRGGSAPIVRFVLGMIAAMGALVFLGCPWRAILRLAG GDGNAILGLAGLATGIWFGTLFFKKGYSLGRSNSQSVVSGALFPVIAVCLLLLYLIFPQI PDQPQSGPLFYSVKGPGSQHAPFIASLGLAFIIGIFAQRSRFCTMGAFRDLFLFRYTHLF LGLAAMFAAAFIANALTGGLKFGFEGQPVAHSDFLWNYLGMVTAGLAFALAGGCPGRQLF MAGEGDSDAGIFALGMLVGAAMAHNLGTASSGTGIGVYGMQATIIGFAVCLIIGFVHSKK A >gi|316923028|gb|ADCP01000079.1| GENE 21 17228 - 
17440 350 70 aa, chain + ## HITS:1 COG:no KEGG:Ddes_0904 NR:ns ## KEGG: Ddes_0904 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 4 70 5 70 71 82 62.0 6e-15 MRTIDTRGLSCPQPLVLFHQAIAENADTPLDVLVDNEASLENVTRAAEKKGWNVTPKDEG EGVFRLELRK >gi|316923028|gb|ADCP01000079.1| GENE 22 17467 - 18012 420 181 aa, chain + ## HITS:1 COG:no KEGG:Ddes_0903 NR:ns ## KEGG: Ddes_0903 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 4 179 41 216 220 199 56.0 5e-50 MFQGKPRKGASPPSRTSTEYGFLVFQHTGEVIRAERILREAGFAVEVKGPPPELRTGCDM IIVFDLMQEPLIRNALGAARLEPIQSVPMRDTLLEPVSLFHVQDYGDWFMVRAANMKITV ERATGKIVNVSGGGCPDVPYLAALLTGQCIGEAEEPRVKGQTLCSYSLQRAFEEARRLWR G >gi|316923028|gb|ADCP01000079.1| GENE 23 18000 - 19100 514 366 aa, chain + ## HITS:1 COG:MK0264 KEGG:ns NR:ns ## COG: MK0264 COG0063 # Protein_GI_number: 20093704 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Methanopyrus kandleri AV19 # 56 283 10 227 233 177 44.0 4e-44 MAWMICGTVPDASFPLTGGRWRLDGGFLHAEGGGIAPLSVQRGTPALLGTALLTCETLGV EPPTALLAGDTGNGDGSRKLYSRLAASPSLSGVRGITFHYLFPDLDGHNRVLMALEEAGP KPVLVADAGFMYVAKMSGYADAYDLFTPDAGELAFLADEKAPHPFYTRGFLLAADEDIPS LVERAYQHGNAARFLLIKGKVDHLVEGGRFLGDVSEPQVAALEPIGGTGDLVTGLVTGLL AGGMEMPQACLTAARASRITGLLANPTPATQIAELLPFLPEALRCALEGASPLRAPPGRE TVQTRAAPSLRILTEKPPSSRLGAFFTRQPEPFQAIPMATTFPETPFRWISFAHPTNTKR PDRLSI >gi|316923028|gb|ADCP01000079.1| GENE 24 19018 - 19188 87 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKFFLVLLFCVFPFHVLAGDVSYWADGQIKSIDGQDASYWSDGQMKSIGKEFPER >gi|316923028|gb|ADCP01000079.1| GENE 25 19254 - 19718 420 154 aa, chain - ## HITS:1 COG:no KEGG:amb4030 NR:ns ## KEGG: amb4030 # Name: not_defined # Def: hypothetical protein # Organism: M.magneticum # Pathway: not_defined # 1 150 1 149 362 142 46.0 4e-33 MRLLGNILWHFPCFGFVTSLFSFLAGILLTVTVVGAPIGLGLIQYAKFALAPYSYSMIDK RELHPNKSGNILYAILCLIVRILYFPLGVLFFLWGIAQVVVMCCTIVLLPMAIPYAKSLS 
TFFNPVGKVCVPVAVMDELERRKAEKEVDAYLNK >gi|316923028|gb|ADCP01000079.1| GENE 26 20260 - 20790 87 176 aa, chain + ## HITS:1 COG:no KEGG:DVU3296 NR:ns ## KEGG: DVU3296 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 5 164 14 170 175 102 34.0 4e-21 MPERQLTIQQKRYLKFAHLFFAALWGGGATTMVLLFCLFHPSTAHEQITFSKILFYIDFI IVGPGAGGCLATGLIYSLYGNWGFFKFRWITLKYVINILFITYGMLVFLPFTHGQYSYYL SQPTEAIIPEESIWMNIFCTSQNFCTILMFLFVVYLSVFKSFKKKNRERPRPFATD >gi|316923028|gb|ADCP01000079.1| GENE 27 21162 - 22142 895 326 aa, chain - ## HITS:1 COG:DR0267 KEGG:ns NR:ns ## COG: DR0267 COG2199 # Protein_GI_number: 15805298 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Deinococcus radiodurans # 161 317 344 502 511 83 35.0 6e-16 MSSVYRIENEEGMGPLGRRGIELTHEIMYRHYRRDEDFVYAQMAEDVTWIGPSRAQYTTG VDKLRELLQIEQLVTFTMEQETYQVAYEDEHSCLIFGCCTVTSDEETGLFIRTLQRVSFF YRLIGDRLHVVHMHLSHPYEVVDSDEVFPFRYGKDAFDYIQQTHQMAFTDSLTELGNRNA YETSCVRMAHDFDSVRSLCLILFDLNGMKRVNDTWGHLAGDRLLRDFASLLRETMPPTAK LYRYGGDEFIAALHEVSLSDVKKCQQKLEQRIARYNEENEIHLSAAWGHAFFNPETDHGL SEIVKRADGMLYAMKRAMKQAAENAS >gi|316923028|gb|ADCP01000079.1| GENE 28 22317 - 23423 867 368 aa, chain - ## HITS:1 COG:ECs2302 KEGG:ns NR:ns ## COG: ECs2302 COG0477 # Protein_GI_number: 15831556 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 362 52 413 417 106 26.0 6e-23 MYAPQPMFNSVSRDFGVDKGSTGLLVSVFMLSLAVSPLCVGMLLSRIGVRRAILGASLLL GISGVGIWFAPSFLVLLGIRALQALLVPVVLTAVMSGIAVMFRHLDLNRALAGYVTSNLV GSLCGRIIGGWAAQMFGWRATLAVICFFFFLVLLFIRSIPEMRNASARLHSPREYLAVFK GRGVPSLFFAESCGIFVFAAIGNLIPFRMAELGQGHSEGLIGLMYFGYSVGLVASLSLAP LMRVFKTPTRLILFASAVYVASCLTLAVPSLWALFGGLWLIAFGEFVVHSISPGLINHQA MLSGHGDRGMVNGLFLSCYYFGGLLGSYIPGALYSGFGWFACYLCIQLVQVAAFFVLFRL HRTAPDLR >gi|316923028|gb|ADCP01000079.1| GENE 29 
23939 - 27340 3189 1133 aa, chain + ## HITS:1 COG:RSc3162_2 KEGG:ns NR:ns ## COG: RSc3162_2 COG4625 # Protein_GI_number: 17547881 # Func_class: S Function unknown # Function: Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain # Organism: Ralstonia solanacearum # 621 1108 53 551 575 66 24.0 3e-10 MPRNRFFHCLTLAAAFFLLWETQASAYETRVVTDHANTPLFQLRFFDQGEEYGNALSETE GEVSTWQLSAAQKDAVTQAVELWADILGPGANNAVPAAINVGTMNEVNADAISVANPTWQ GTPWEGASGLAGALIGDQSMNPPAQIRIGKMNFSIPDIPSPLPTGNGIDLVGTLYHEMGH ALGISSLALVGEDDGGNPVYGFDTEISPWDRHLVDRYGVTPTTDMEVVRSEPETAPGADP VFTVGEMTESGVLFKGTHVSEVLAGALNGGLPIEGFEGDSLDLSHIELDRSLMSHQDYRN YQVFMEAELAALQDIGYTIDRKNFYGFSVYGDNLTLSNGQGYFARNAGGTAYLPGQPNTA AYGVGLHIYGSGNDITQTADLLACGTAGTGIRVDGEANTLRIAPGVRVSADGAYGTGLLL AYGKGQNVVSRGEIRATGTGGAGARFDFGGSLFGLDSGLTAEYRGSYIRSRTEAAEDESG NIVTQLYERYDLLDELKGPLAERFDVSGPLSGNAAAIYISRNAWVKNINIVNGAALSGDI ISDWNPGNSDVQQAQAAANGDDLLTSLSFGRLAAGDGSATDAPDPDFRLRYDGNILGRHS IRMSVDGGALSFNGRAEVLDVNVREGATLSGNAEYTLNALDNGYVSVDGTFTNAGTVAPG NSIGTITINGDYHQTATGRLFTEFDASGTSDRLAVNGNAVVNGTLYLAPLPGYYAGSLSL TPITASGNLTEAYATNLLLDSPTLEMAATPQGGTATILSSRAADAYSRYAANGNAANVGR ALSSAEGVRDDMRNLYAALDFSAADGSTVRDALSQLSPDAYGNAALASFDMHRMLSDLIL PGTFSRAPQKDGEWHVFAQPYAGTFDQPGRSGMGGYDATNVGLIGGAERSTPGGLTVGGH VVFNHQSMTGDANGKLRGEGLYLGAQGLYAPADWDGWNVFGIGRLGVENWRMKRAVSFNG YNRENNKDWTGFSGSVRAGGGYEADWGTIKAGPFAALDYAFSSRPSLTEDNGMGSRLHLD SETFHSLRSSLGVRMSTNERQLGEHATWKAHASAAWNHELLDKAGTMHASFVEAANAGFS NTVKVPGRDSLGLGAGVRFNTDKHVSFSLNAGSELFRRDATSVYGNLAVEWKF >gi|316923028|gb|ADCP01000079.1| GENE 30 29055 - 29843 377 262 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302861788|gb|EFL84723.1| ## NR: gi|302861788|gb|EFL84723.1| sigma-54 dependent transcriptional regulator/sensory box protein [Desulfovibrio sp. 
3_1_syn3] # 22 256 28 251 255 75 25.0 4e-12 MNHTSEELFSVFFSTVFESSCISDFTQKMASFFKRYMPVSAVYISFFEKDTIFKVAEYAD YSKNPFPDQITIPEDTLRRMYIEDNFFTGDVHSIKNFTGEETCVFSEFRNSIFNNEESSS IYIPLKHHVLEGIHIYMSIYSIGKNRYSQEAIDICSFIQPMLIDTFNSILLYNDMNALKH SFIAERHPSANKNIKEKYELSSSGDAFPTLDTIIIDHIKMAIAKANGKIAGPDGAANLLG LNTSTLWSKIRKYNIDPKTLTE >gi|316923028|gb|ADCP01000079.1| GENE 31 29922 - 30392 447 156 aa, chain - ## HITS:1 COG:FN0673 KEGG:ns NR:ns ## COG: FN0673 COG2606 # Protein_GI_number: 19704008 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 152 1 153 154 137 49.0 1e-32 MRIDEVKRQLEAFGVLDGYREFATSSATVELAAAAAGCEPGRIAKTLSFKTAEGPIVIVV MGTARIDNRKFKDQFKEKAKFPQGEEVESLIGHPIGGVCPFAVNEGVRVFLDLSLKAFDP VYPAAGAPNNAVCLSLADLERVTGGTWVDVCKQEAE >gi|316923028|gb|ADCP01000079.1| GENE 32 30552 - 33950 3820 1132 aa, chain + ## HITS:1 COG:VC1349_3 KEGG:ns NR:ns ## COG: VC1349_3 COG0642 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 351 600 2 246 260 213 46.0 2e-54 MRIFVFLGTILVIFILFTGHVLLVTKDAPDDSYLPSLESSISMLSPMTAMPQLFTPKEQT WIDEHPVITVSMTPAFIPLSDSKQLNAYAGISLDYLRLLSRIIGIQFRIEPQSNWTTALE DAKQWRTDLFAMVEPTLAVDANLLLTRPHISLPGIIVIHENAQHATSLKELAGKRVSVVY RHYWHNYLESRYPDIILDPVSNPLQGLFRVMSGHSDALVDYKASVLPKLEDNTHLRLQAT STIPAQSGLSIGVRSDWPELHSILSKALYQIQPPERELINNRWLSRQPSLHLPPRTFWTS LLGIEVVLSVLLLIIFWNFQLRRKVEERTKRLAAELEKSAKAEDLQRLNTELQQAMKAAD AATEAKSRFLANISHEIRTPLHGIISFTELAYLKNNPLPRNHQRTILDLSYALLDIVNDT LDFSKIEAGEMDTDTAPFMLDEVILRVCDMTMRGSMARELECLIDIDPSTPLALLGDAGR LQQILTNLMGNAVKFTPPGGRIHLQVRHGGNLHLSENQNKAQFLFFVHDTGVGIAPDQLP QLFQPFKQADSSLTRRHGGTGLGLSIARYMVENMGGEIWVESEVGQGSTFAFSIRLALQE QVEVLGGNTFMDLQAVVVSSSDTGVRVMRRTLEALGIQCQSHIASRVGPLSVLASLGKEP LRPDLIIIDRLLGDGDCPPALRLAAELQLRCAAPVPLVLVGGPREGLAISEFKDKPYPVE VIPAITVRCVRSTLRKLFDRKKAGFRGPRLAQSPVTPDLSSVRILVAEDNPVNQEIMSVL LEETRAQLKMAGNGIEALALLEREAFDLMFLDIQMPDMDGYETIKIMRERGYVLPVVALT 
AHAMQSDKQRCLDAGMDAYLSKPFKQALLFDTIHSLLPEYYRVPEAQAESPARPSVPDAE SWAFLPPCFSLETVIQTGLSPQQYPSILQSFARNHAQDAALFRHAVRTHDWESLRERAHA LKGASANLGALHLRDMARTLELEAQSQLDKGYPNAALSTLQGRPLAALLPELEEALGEVL EAAAKLCPSDKETKPADGPSAIASKPKLDIAKLGTYYNDLVEALLQAAPGRIRIALGSLL VICNADEYPLLHTLKMHVDNYDYEAALTVLERIRPSLFSSSAPSTEPSDAAQ >gi|316923028|gb|ADCP01000079.1| GENE 33 33937 - 35742 2213 601 aa, chain + ## HITS:1 COG:aq_218 KEGG:ns NR:ns ## COG: aq_218 COG3604 # Protein_GI_number: 15605774 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Aquifex aeolicus # 270 599 173 501 506 256 43.0 1e-67 MQPNKPVVLVVDDDPVNLDILVQTLEQDYFLIIAKNGKRALDLAFSHHPDLILLDILMPE MDGFEVCQRLKADKETDGIPVIFLSVMESPGQKHRGFEVGGVDYVTKPFHADEVLARVRT HITNKRLREELESHNMLIGMELHEKSRQLETLMNNLPGLAYQGEHSPDRKVRFVSQGVTR LTGYTPEHFMGPDGMGLLALVHPDDRAHVSETLENALRKKEPFELTYRIITSWNEEKWVW EQGAGVHAGGAGEQAPLLVEGFINDVTEKQKQDNTIRQENKELRERLKARCSGDIVGNSP QITAVFDLIAKAAAVDDNVVVYGESGTGKELAARAIHDGSSRYQQPFVAVNCGAIPESLF ESELFGYKKGAFTGATADKKGYLDQADGGTLFLDELGEISLMGQVKLLRAIEHGGFTPVG GTQIHRPDVRIIAATNRDLLERVNEGAMRRDFYYRIHVIPIHLPPLRERKADIPLLVEHF LKAYPSNKRFEYMDSEMLHAFMLYDWPGNVRELQNVLYQYLSLGTARLGDTVIGETTAQA AASEAPEDSLTLAEALSRFEKVYLLNALKKNNWKRGPTADSLGMDRRTLFRKMKDFGLEK E >gi|316923028|gb|ADCP01000079.1| GENE 34 36454 - 37209 800 251 aa, chain + ## HITS:1 COG:MK0603 KEGG:ns NR:ns ## COG: MK0603 COG1116 # Protein_GI_number: 20094041 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component # Organism: Methanopyrus kandleri AV19 # 11 248 5 243 246 220 44.0 2e-57 MPTTPESRIKIQVNNLTKRFGDLTVLDGINFNIKEGELLAIVGPTGCGKTTFLNTLSKLM PATEGEILIDGEEANPKKHNISYVFQEPTCLPWRTVRENVAYGMEVKGVGKEEREARATQ IMDLVGLSSCADLYPNQVSASMMQRIAVSRAFAVNPDLLLMDEPYGQLDVKLRFYLEDEL VNIWKKLNSTVLFVTHNIEEAVYVAERILVLTNKPTKIKAEIPVDLPRPRNLIDPKFVEL RKQVTELIRWW >gi|316923028|gb|ADCP01000079.1| GENE 35 37266 - 38060 1069 264 aa, 
chain + ## HITS:1 COG:mlr4519 KEGG:ns NR:ns ## COG: mlr4519 COG0600 # Protein_GI_number: 13473801 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Mesorhizobium loti # 16 259 38 280 286 147 35.0 3e-35 MAYKEVSVKQPLLLTILPYVSIIIFFAFWEGVVRSGIIPNTLLASPSQVFAAFIDKLSNP NPDGAIMATHAWTSIQEAFIGYVLSLVVGIPLGLAMGWFNVVEGLFRPIFELIRPIPPIA WIPLTVFWFGIGLAGKVFIIWIAGIVPCVINSYVGVRMTNPTLIQMARTYGANNWQIFTQ LCIPSALPMVFGALQLALAYCWTNLVAAELLAADSGLGFLITMGGRLGRPDIIVLGMICV GLSGAVIGFIIDQIEKKLLAGIRR >gi|316923028|gb|ADCP01000079.1| GENE 36 38066 - 38908 1065 280 aa, chain + ## HITS:1 COG:mlr4519 KEGG:ns NR:ns ## COG: mlr4519 COG0600 # Protein_GI_number: 13473801 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Mesorhizobium loti # 31 273 39 282 286 144 31.0 2e-34 MTSSIANETVDHSKNREPFSFYKFITNKYFLNVVSIVLFLALWDYVAKEKIFRDSLARPL EVVDQLYRLTYMKFAGTNLIGHIWASTQRVLIGFIAASVVAVPLGLFMALNKYVNAIVKP LFDLFKPMPPIAWVSIAILWFGIGEMSKVFIIIIGTFVPCLLNAYNGVRLVDPDLYDVIR VLGGKRRDEIFHVCFPASFPAVFAGLQISLSSAWTCVLAAELMNSRDGMGFLIKRGMDTH QPTLVLGGMVLIAAAAYGTSLLVTLFEGKLCPWKRTIENL >gi|316923028|gb|ADCP01000079.1| GENE 37 38932 - 39711 263 259 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 12 235 2 231 245 105 32 5e-22 MTTCNCEPKLKLRCENVSKTFIQKGNQEVPVIRDVTLDVYENEFLVILGPGQCGKSTLLK IFAGLETPDTGFVTLDGQPITGPGPDRGIVFQGYMLFAWKNVRDNVELGLKLRNIPANER HEIAQHYIDMVGLTGFEKHYPHQLSGGMKQRVGIARAYANSPNIMLLDEPFGQLDAQTRM FMQKETARIWEQEKRTVIFVTNNIDEALFLGDRIILMEGKLPGSIKTEYMIDLPRPREHT SMELLKLRSIITDNTALVL >gi|316923028|gb|ADCP01000079.1| GENE 38 39713 - 40099 270 128 aa, chain + ## HITS:0 COG:no KEGG:no NR:no METTTVSLIIGSLCAGLVWWSVQGFVRRCSGKAQPKNNDAQRQCDDAPWERTGEVYRKLL YLRLASFALCILAAAALYFHLGTALAVGLAGIGCGLQFCAFRIRTRHVQAVRKATVENAE KAAAGTSR >gi|316923028|gb|ADCP01000079.1| GENE 39 
40193 - 41272 1337 359 aa, chain + ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 47 267 47 258 332 69 27.0 1e-11 MRFKKLISAFLAAAFCVSLAGMAFAEKLTKVPTAWMDEHETFLIWYAKEKGWDKEAGLDI DIQYFGSGMDILNALPSGSWVFAGMGGVPAMMGNLRYGTSVIAIGNDESMTNSVLVRPDS PIAKVKGYNKDFPEVLGSPDTVKGKTFLTTTVSSAHYGLSSWLKVLGLTDKDITIKNMDQ AQALAAFDNGIGDGVALWAPHMYAGQEKGWKIAGDLHMCKVGNPIVLIADTAYAEKNPEI TAKFLSIYLRAVNMLQKEPLESLVPEYQRFFLEWAGKNYSPELSLTDLKTHPVYNLEEQL ALFDTSKGMSTAQKWQSDIAKFFASVGSINKDELKKVENGAYATDKYLKLVKTPIPSYK >gi|316923028|gb|ADCP01000079.1| GENE 40 41377 - 42456 1428 359 aa, chain + ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 3 295 1 280 332 75 27.0 1e-13 MNVKKWLRGLCAAVLCVAYVGTAGAAELQKVPTAWMGAQETFPIWYAKQKGWDKEVGLDV ELLYFSSGMDALSTLPAKTWVFGGMGAIPALMGALRHDTYVIANCNDEAMVNAVMVRPDS PIMQTKGYNKSYPNVYGTPESVKGKTVLCTTVSSAHFALSSWLKALGLTEKDVTIKNMDQ ASALAAFEYGIGDIVTLWAPLTYVGEKRGWKVASTPHDCYRGLPIVIVADREFADKNPEV TAKFLSIYMRAIDMMKNEPMDKLLPEYLRFYVDWAGTDYSKDLAEMDLKNHPVFNLEEQL QMFDASEGPSQAQSWQGDLAQFFAAIGRISQDELKKVENSSYVTDKFLKLIKTPLPSYK >gi|316923028|gb|ADCP01000079.1| GENE 41 42760 - 43299 469 179 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKHTTCLCAGTLLACLALSNAYASTPRHFLPYTADVSVPAHSAGSENPLVADRDDHREQM RQQRQLERRERMHRAEDRYQQGKMRQEQERMIREQRRTEREQARRNGQHWDNRDRKDSKW NDRDRDHPGRKHDKDRLERERRKDHQYNRHPQPRHDDGQYRPEELKRRGFDDGQYHPGR >gi|316923028|gb|ADCP01000079.1| GENE 42 43361 - 44296 677 311 aa, chain - ## HITS:1 COG:no KEGG:DvMF_2792 NR:ns ## KEGG: DvMF_2792 # Name: not_defined # Def: YceI family protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 2 310 5 312 314 219 42.0 1e-55 
MQQIRTVSIGEVQAFLQNHPGGFLIDVLPPEFHVQQHIPGSSGVCVFETAFQEKMRALVP DMAAPLLVYGAGGSLDSAVAAEKLQREGYTDISLFAGGLEAWRKAGLPLEGEGVDFPVRD ESPLPMFKEYMLIPEKSFIQWACHNTVHSHDGTLSVSGGELRFPHGPQGEGNGFLTMDMN GIACRDLAQDEMLPVLIAHLKSIDFFDVMAYPTAQLDILSLMPLTGATVTGRTHRLQGQL SVLRTERAIECDAELRNLPDGELSMFCQLVWDRTLWGVRYGSARFYRFLGMHSVDDNISL SAMLFFRSQRP >gi|316923028|gb|ADCP01000079.1| GENE 43 45360 - 47144 1762 594 aa, chain + ## HITS:1 COG:BH2384 KEGG:ns NR:ns ## COG: BH2384 COG0513 # Protein_GI_number: 15614947 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Bacillus halodurans # 5 570 8 535 539 453 46.0 1e-127 MTASFTELGLSQELLKAVEDLGFEEPSPIQVLAIPALLTGKDAVGQAQTGTGKTAAFGLP ILEKIASGKSVQALVLCPTRELAIQVSEELSKLAVHKRGVSVLPIYGGQPIERQLRALAK GAQVVVGTPGRVIDHLQRGTLRLNEARIVVLDEADEMLDMGFREDIELILEQSPADCQRV LFSATMPQPIRELSKRFLREPEMLTIAHKMLTVPAIEQVYYEVRPYQKMDALCRVLDSQG FRKALVFCATKRSVDEITVHLQQRGYQADGLHGDMNQTQRDRVMSRFRTDGIEILVATDV AARGIDVDDVDAVINYDIPHDVEGYVHRIGRTGRAGREGKAFTFVTVREQYKIREIIRYT KARIQPGQLPTLRDVSNIRTSKLLDEVRQTLAESSLDRWRVLVEDFQTEHFPDGDASSRD ISAALLKLLMQRDFGNQDNVGEVDELTMAPQRPAKNAEAKSKGRMQPLSRRQESGPMSRL HIDVGQAHDVTPRALVGAITGESGIPGRSVGAIDIQDNFSIVEISAELAAHVLATLNKGV FISGVKVSAKAADETDSPHPRKPFAPGQRRQRPGKFGKKPRGATLGRKAYSESR >gi|316923028|gb|ADCP01000079.1| GENE 44 47617 - 47862 199 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRFEEALLLAGKGQLITRPGYGVSFAAIREGQAVYGHFIGETGFTDVRAYVFTDEDKSAT DWELFIRVLPDAWEGCDVPNG >gi|316923028|gb|ADCP01000079.1| GENE 45 48140 - 49681 836 513 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1023 NR:ns ## KEGG: DvMF_1023 # Name: not_defined # Def: transcriptional regulator, LuxR family # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 421 499 601 673 684 64 46.0 8e-09 MGSSAEEGFWPSDARTVQALLCTALLSALFFSWLWQAFRLYSYPGLVPYNDEAGELFHLP LFTGLALLAYGLRLRFRFTEQESCQGYVIPVLGSTRLRRRLVTLLLMWTPVVLMVTPVSN EFSGWVRSGSWAVSALAGIGAGIAGCRIGLALCRFSGGQLVLSLGLASIVAPICNYLVLL 
CPEPYWVIPALLFPALFALLPSVYPTASGEEGEHTYPFSRIPLFLWCGCLLTAFSESAYY NLYGALEPVMPGPSLSVLVLVVLGGLSALLVLGQAHRSPIWYAFLLVIPVLVVGYTAWPL LHRESPGLSLGGLLFGYTLLNIYMMTAFLHTVSYRKGTRRLQLLACGIGGIALASWMGSH NSGWITESIARGDSLNSLFAFQAFAILAGSFIFVVYLEKVGFFRLAKGEAAAPERVAGQA PSTPMSLMDWPIEELETHFRELGLTRQQAMIAALLARKTPDTTICDNLNISPSTLKTHIR NIHRRLGISSRHELAWLVTANPAGISETQGGGR >gi|316923028|gb|ADCP01000079.1| GENE 46 50065 - 51099 333 344 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. PR1] # 9 334 1 314 328 132 27 4e-30 MKRNLLQCMARVLLLPVALMFICGALVSAHAEYSGPKVRFRLAHPCPPGHHTSLAFEKFA ELVAQKSKNKIKVHVFPNAVLGSDVVMLKGALTGGLEMAVSSSPNLTWVVPSLMVFDLPY ITSVNYQKKLYEALDNGELGRYFKGKFNEVGLEPIMYCEYGYRDFFSVKKPIRSVGDLAG IQVRTTASAVEVEVAKALGMKPKPLAWGETYTALKEGIVEGEGNTFSFLLDAEHEDILKY AFASRHNYSMQILSANKKWWEELDPQVRSIIREAAAEALKYQREVLAPQSEEEAVKRFMA DGLEVSKPSEAELEELKRKTRPVWNLFSDTLSQELIDLVVATQQ >gi|316923028|gb|ADCP01000079.1| GENE 47 51355 - 52935 1244 526 aa, chain - ## HITS:1 COG:BMEI0671_1 KEGG:ns NR:ns ## COG: BMEI0671_1 COG0861 # Protein_GI_number: 17986954 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane protein TerC, possibly involved in tellurium resistance # Organism: Brucella melitensis # 7 239 7 239 241 296 63.0 5e-80 MFLDFSWVAEPTAWGGLGTLILLEVVLGIDNLVFISILSSRLPEDKRRHAFTIGLSLALI MRLGLLSTVAWLIGLTKPWFSLFGYSFTARDIILLIGGLFLLLKGTMELHERLEGFQPTE DKAVHQAVFWQVITQIVVLDAIFSLDSIITSVGMVKELSIMMLAVIIAVLAMLLASRPLT EFVDRHPTVIILCLGFLLMIGLSLILEGLGYHIPKGYLYAAIGFSIVVEAFNQLALRNRR NRITTRGLRESAARAVLELLGGTTIPGGETEMEMAALGYSDCRDKVFRPEERAIVARVIR FGGRTVRYIMTPRHKVQWLDSNESPEALLKLVGASKHAFLPVMRGDTDEVLGVVDLRELL WRYQKTGRFSLEASVVPVPMVFEHTGLPDVLDAFRQHPAPMGIVLDEYGSAVGVVTPMDI LSAIAGHMGDVAPEPDSFRQPDGSWLLPGRMAVDEGLHTLGIQPEEELSCATMAGLLLER LGHIPAAGESLFFWGHLWTVASMDGLRIDQIRIHPQRSGDITEKSE >gi|316923028|gb|ADCP01000079.1| GENE 48 53007 - 53468 -143 153 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MALQYYPVDIPIRTRLVLLLAIVMAGGFLFFPSQGTSLSSDSFYSAAVFPCTENTPEPCP 
VPLVLSPEEASFGGEAFWSLPEQGRAAYTPYERSTPGTSRLERGHAFSSFVALSGNSFCP GILLRRGSVLSAVSILPRGWVSCHACPRSPPAV >gi|316923028|gb|ADCP01000079.1| GENE 49 54093 - 55304 435 403 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 22 391 25 403 406 172 30 6e-42 MPLTDTHIRSLKPDVKPRKYFDGGGLFLFVPANGSKLWRMAYRFDGKSKLLSFGEYPTIS LKDARERREEAKRMLSKGIDPSDHKRQLRQARAIAERDSFQNIAREWHETRMAEFSEKHQ GTVMYRLETYIFPAIGKTHIAKLETRDVMEVVKPLEQRGNYETSRRVLQIISQVFRYAVI TGRAKHNVAADLRGALRPRKTVHRAAVLEPEKVGQLLRDIDAYEGYFPLVCALKLAPLVF TRPTELRAAQWKEFDLEAGEWRIPAERMKMRRQHLVPLSRQAMSILRELQKCSGEGKYLF PSIRTEARSISDATMLNALRRMGYQKHEMSVHGFRSIASTLLNELGYNRDWIERQLAHGE QDEVRAAYNYAEYLPERRKMMQAWADYLDGLRNTQQKRIREEA >gi|316923028|gb|ADCP01000079.1| GENE 50 55301 - 55468 162 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKEDTMTDRDVCISIGKIEDETQVTSNDTGWSEEEKERMRQYGYDDMDIIRAWYS >gi|316923028|gb|ADCP01000079.1| GENE 51 56331 - 56546 81 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKEKYPLEVIAYVLLYWCGSQTSGTGHKTPQGRKTHVGRLLAEKEYRDEKSYRNLVDTLL KGTDAYSIVKA >gi|316923028|gb|ADCP01000079.1| GENE 52 56671 - 56889 194 72 aa, chain + ## HITS:1 COG:no KEGG:Slin_1968 NR:ns ## KEGG: Slin_1968 # Name: not_defined # Def: phage transcriptional regulator, AlpA # Organism: S.linguale # Pathway: not_defined # 12 68 13 69 72 68 49.0 8e-11 MKQHQAPIPVHGLLRLPQVLSMILISKSAWWEGCRTGRYPKPVKLGPRTTVWRAEDIAAF IENLGRQGENHE >gi|316923028|gb|ADCP01000079.1| GENE 53 56882 - 58663 961 593 aa, chain + ## HITS:1 COG:RSc1865_1 KEGG:ns NR:ns ## COG: RSc1865_1 COG4643 # Protein_GI_number: 17546584 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 6 288 9 307 359 155 38.0 3e-37 MSNPLQIFRDILTDKGLIPAEIMADGKLHHCPTQTKPHKQNGAYIAHVDIPATLWWCNWE NGEQGTFCTEEKQMLSVFELSAWRERQHSIQRQREAEYAERHAEAAQLARQEWNSARMCD ANHPYLRRKGIPALEGIRQARDGALLIPVLDAADNLQSLQRIYPDGTKRFLVGGKVSGGQ FIIQGQPEKPIAICEGFATGASIHLATGWTVHVAFSANNMPVVAKPARDRFTDRAIIICG 
DNDEAGRKRGEEAAGLANAQLLFPHFTDDNGTDFNDLHQIEGIEAVRSQLETALTKQQGL IALDMGEFLSMSIPERGYLLSPVLPVQGIGILYAPRGIGKTFAALSIAVAVASGGAMFNW RAPMPKRTLYVDGEMPATSMQNRLSALVNGMSIPPHTLKNMALITPDLQPCPMPDLSTAG GQAMIEPFLKDVDMVVLDNIATLCRTGKENESQSWQTMQAWLLELRRRGMTVLLIHHAGK SGDQRGTSAREDIMDTVISLRRPREYSMAEGARFEVHLTKARGILGDDAKPFEANLITEG NALHWRVRDIEDVELEELKRLLGEGYSIRDCAEEMGKSKSSVHRLKRKLEGLA >gi|316923028|gb|ADCP01000079.1| GENE 54 58865 - 60052 828 395 aa, chain + ## HITS:1 COG:no KEGG:pE33L5_0003 NR:ns ## KEGG: pE33L5_0003 # Name: mob1 # Def: mobilization protein # Organism: B.cereus_ZK # Pathway: not_defined # 1 382 1 385 411 145 32.0 2e-33 MSYLVLHMDKFKKEAIRGIQSHNRRERESHSNPDIDYDRSAANYELHEAAASNYAEAIQN RIDELLLVKAVRKDAVRMCGLIVTSDKAFFDGLTPEETRRFFEESKAFLTEFVGAENVVS AMVHMDEKTPHMHFLHVPVTPDGRLNANKIYTRQSLRKLQSGLPAHLQSRGFVIERGVEQ TPGSVKKHMDTREFKQQQEALEKLIQESEETSRNSRQLISALEQREEELRKSIEEYERQA EEAEKVLREDSSLPKASLFNYPSVLEKASSLIEELKKALAVKHLVQKQKESLQQEVESLR RKQTRLETEYTAHRKQNHDEKEELENQLKKMKRIMAGYREFLLLPEIRPLHIEFVERKRA EQIQRQQEEERQRQEQEARDRERRQARSARGMRMR >gi|316923028|gb|ADCP01000079.1| GENE 55 60197 - 60583 199 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302861370|gb|EFL84308.1| ## NR: gi|302861370|gb|EFL84308.1| putative protein MobC [Desulfovibrio sp. 
3_1_syn3] # 1 102 1 102 123 124 68.0 2e-27 MAITATQLKEIRQEFAALKPASVTLEGNRTMSVKEAVVALAPTLERMRKRGFEVQELVER LHEKGIEVKAPTLAKYLSEFRRRKGKKKDTPPAPASTGRMTEGVKRSVSENEHRQNSFIV PDMPIDEL >gi|316923028|gb|ADCP01000079.1| GENE 56 60711 - 61208 264 165 aa, chain + ## HITS:1 COG:SAP015_2 KEGG:ns NR:ns ## COG: SAP015_2 COG1961 # Protein_GI_number: 16119215 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 5 118 2 115 165 89 42.0 2e-18 MDKLSGKNFQRPQYRALLRRLRPGDLLCITSIDRLGRNYEEIQRQWRLLTKEKKADILVL DMPLLDTRRDKDLLGTFIADLVLQVLSFVAQNERENIRKRQAEGIAAAKKKGIRFGRPPR PLPAEFPTVLRLWSDGKMTLADAAKTCGMAESSFRYRAGVEQKSR >gi|316923028|gb|ADCP01000079.1| GENE 57 61541 - 63949 1272 802 aa, chain + ## HITS:1 COG:YPO2796 KEGG:ns NR:ns ## COG: YPO2796 COG3468 # Protein_GI_number: 16122995 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Yersinia pestis # 343 802 204 638 638 137 26.0 1e-31 MNMFASLAIACTITIAVPTISLAENYIISKSTNWGDYTLENYNHRARMLHGGETLTILND AIVTFDFTNPSSAIPITKGGCIIYSVDFFHSDGYKAGDVAISGGTILGKSAHPETGYGIY VYEDYPSKIELQNTNLILSDFNMGISAGNGTVDITNSSTIFTNINYGVYTKNNAHIDIKS KDFSISDSYIPVYAGGNSSVNIQEADNIDLNFSHSGIYSKDSATVAMKSSNFSMTGQYGL NASGNSSILVEADNVSLDAGALAYGEDNSTIRLAANSSGTLYGDVAAFGASNISLELGGT TQWTGASGKSAEARLQVELEDTATWHVLPAYTSETEYQPSEASRLQFTGGGIADLASARE YQKLSIGELAGNGGTFRLKVEDDSTDGVDQVDIVRYDSGSNSIYVVSSGNTEISAEEMNT YLVRQENGSGTFGLANTDQQVDLGLYLYKLASRTNDATTEWYLQRVESIVPVDPDDPTPP VNPNPPLSPTGETEAALSGLAGHYAMWYGQLTDLRKRLGEVRYGTQTGLWVRGFADKSRL DGLGGTSFTQNMFGGSIGYDTLASVSEESLWLVGMQLRSARAHQSVNGHWGGHGDLTSVG GGLYSTWAHADGWYVDAVGTMDWYNHKLRTSMLDGTRVHDDRSSYGLGASLEAGRKLDFA FSNEGRDYWFLEPQLQLSYFWVKGGDFHASNGMKIEQKNMDSLTGRAGLVLGKKFSLEGG NGERYMQPYVKAGVNHEFLGEQEARINGVRMTSDLDGTRVYYGAGVDWQATDNLRLYMQA EREHGEHFTREYNVSAGLKWKF >gi|316923028|gb|ADCP01000079.1| GENE 58 64550 - 64942 125 130 aa, chain + ## 
HITS:0 COG:no KEGG:no NR:no MRRPRPNLHPQAAKLLGIRLMYAPPYRHVPHVPARPLVIRSSLRFPGAMRYCADMRRNLR RRIIMDIAQLLSGLALIPPFSFLAGKLIDDPLWFLISIIPQSCIIFGLTYTRPFRSFYMG FANWWMRSLV >gi|316923028|gb|ADCP01000079.1| GENE 59 65177 - 65729 72 184 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAESSRMFFNAIVDALKETFPDVDKFHWEIVDDKSGSSLPEETSLVIKCLDGLIAVCGTS GEKLAKLQTENREIVYLLKNNPAFSERLGFIESWLEDHELFFEDMRHALQGIFKKLSGDA ALPSYGWVGNELIAYLIQFDAGKFPKKRESKEILNIRNWLQKIMEAPCPVRKPTEITRNG PVKL Prediction of potential genes in microbial genomes Time: Fri May 13 03:29:41 2011 Seq name: gi|316923001|gb|ADCP01000080.1| Bilophila wadsworthia 3_1_6 cont1.80, whole genome shotgun sequence Length of sequence - 33218 bp Number of predicted genes - 41, with homology - 26 Number of transcription units - 14, operones - 7 average op.length - 4.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 45 - 275 126 ## 2 1 Op 2 . + CDS 315 - 587 179 ## 3 1 Op 3 . + CDS 584 - 868 239 ## 4 1 Op 4 . + CDS 963 - 1241 296 ## + Prom 1355 - 1414 4.8 5 2 Tu 1 . + CDS 1440 - 2477 170 ## gi|302861828|gb|EFL84763.1| conserved hypothetical protein 6 3 Tu 1 . + CDS 2658 - 2750 86 ## 7 4 Tu 1 . - CDS 2969 - 3178 110 ## - Prom 3408 - 3467 10.3 + Prom 3585 - 3644 8.0 8 5 Op 1 3/0.000 + CDS 3669 - 4628 500 ## COG4962 Flp pilus assembly protein, ATPase CpaF 9 5 Op 2 3/0.000 + CDS 4656 - 4985 273 ## COG3838 Type IV secretory pathway, VirB2 components (pilins) 10 5 Op 3 3/0.000 + CDS 4990 - 5283 177 ## COG5268 Type IV secretory pathway, TrbD component 11 5 Op 4 1/0.000 + CDS 5280 - 6050 700 ## COG3451 Type IV secretory pathway, VirB4 components 12 5 Op 5 . + CDS 6097 - 7809 979 ## COG3451 Type IV secretory pathway, VirB4 components 13 5 Op 6 . + CDS 7806 - 8486 510 ## COG3701 Type IV secretory pathway, TrbF components 14 5 Op 7 . + CDS 8498 - 9781 563 ## Desal_1701 hypothetical protein 15 5 Op 8 . + CDS 9778 - 10509 475 ## Desal_1700 type II secretory pathway component PulD-like protein 16 6 Op 1 . 
+ CDS 10666 - 11340 392 ## Desal_1700 type II secretory pathway component PulD-like protein
17 6 Op 2 . + CDS 11340 - 12587 469 ## Desal_1699 hypothetical protein
18 6 Op 3 . + CDS 12577 - 13173 223 ##
19 6 Op 4 24/0.000 + CDS 13178 - 14698 940 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB
20 6 Op 5 . + CDS 14695 - 15789 560 ## COG1459 Type II secretory pathway, component PulF
+ Prom 15851 - 15910 1.7
21 7 Op 1 . + CDS 15935 - 16339 250 ## Desal_1695 PilS domain protein
22 7 Op 2 . + CDS 16388 - 16807 111 ## gi|283851716|ref|ZP_06368994.1| PilM protein, putative
23 7 Op 3 . + CDS 16804 - 17937 515 ## Desal_1691 hypothetical protein
24 7 Op 4 . + CDS 17948 - 18826 324 ##
+ Term 18827 - 18862 -0.5
25 8 Tu 1 . - CDS 18819 - 19121 170 ##
+ Prom 19456 - 19515 9.1
26 9 Op 1 . + CDS 19547 - 20017 351 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains)
27 9 Op 2 11/0.000 + CDS 20030 - 20929 563 ## COG3504 Type IV secretory pathway, VirB9 components
28 9 Op 3 1/0.000 + CDS 20933 - 22273 443 ## COG2948 Type IV secretory pathway, VirB10 components
+ Term 22365 - 22396 -0.5
+ Prom 22514 - 22573 4.0
29 9 Op 4 . + CDS 22622 - 23392 250 ## COG5314 Conjugal transfer/entry exclusion protein
- Term 23286 - 23326 10.3
30 10 Tu 1 . - CDS 23389 - 23643 109 ##
+ Prom 23866 - 23925 4.5
31 11 Op 1 . + CDS 23996 - 24889 461 ## COG3846 Type IV secretory pathway, TrbL components
32 11 Op 2 . + CDS 24898 - 25443 347 ## PMIP19 putative conjugal transfer protein
33 12 Op 1 . + CDS 25663 - 26142 402 ## PSPA7_4466 hypothetical protein
34 12 Op 2 . + CDS 26153 - 26671 6 ##
35 12 Op 3 . + CDS 26697 - 27092 216 ##
36 12 Op 4 . + CDS 27113 - 29428 1606 ## COG0550 Topoisomerase IA
37 12 Op 5 . + CDS 29449 - 29778 289 ## PA2593 hypothetical protein
38 12 Op 6 . + CDS 29857 - 30030 71 ##
39 12 Op 7 .
+ CDS 30064 - 31008 644 ## Dvul_0781 hypothetical protein + Term 31057 - 31106 -0.8 - Term 31183 - 31232 -0.7 40 13 Tu 1 . - CDS 31276 - 31491 76 ## 41 14 Tu 1 . - CDS 32348 - 33049 -222 ## Predicted protein(s) >gi|316923001|gb|ADCP01000080.1| GENE 1 45 - 275 126 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNIKSWFTPPEGPIDMEALTEKIQKALERRQKTAFLAGFLASCESWNGEYPYEGNGAAWR DGVGSAYARWVESQQK >gi|316923001|gb|ADCP01000080.1| GENE 2 315 - 587 179 90 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKISIFRLAQWLGWACCLAMTWMAQALYHKGDPMDGIFCLLLGILFSLILIHLLLRRSYE EKNKTRYAAITFASAKEKAATPISSKKEKQ >gi|316923001|gb|ADCP01000080.1| GENE 3 584 - 868 239 94 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRNTIQNYLSGFKIETFSPVSMALCIGVMAFSLYTPHERNFIYELYESLLHIKAVLVYPI LGLLLIKILLLRELDTFFDIVTLLILLWLIYSVS >gi|316923001|gb|ADCP01000080.1| GENE 4 963 - 1241 296 92 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRMRRKRNRAATPRNREEQHPLVNPLFAPLCNTKDLTIKGRFYDILAEERALHNSDERAF YRTALEYRQRLFSPRQQEILVQTARAKFGLAV >gi|316923001|gb|ADCP01000080.1| GENE 5 1440 - 2477 170 345 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302861828|gb|EFL84763.1| ## NR: gi|302861828|gb|EFL84763.1| conserved hypothetical protein [Desulfovibrio sp. 
3_1_syn3] # 1 340 1 324 384 97 27.0 2e-18 MRTFSPTSLFFHVRDKSQPRFIKHDELISIGSSYLYGIIFEFCGDKDHCYPSQTTLALLC KSSTRTVQNYTAQLVKAEYIAVKKDEKTGRNIYYLLLSPRLLALLRKLGIPTHETQISGT AAKNEGLQEAHENFSPYIRDQRDQDTPLSPLPKPAAPSPSCHSRTGFPSSAPTTQRVQGR GDFSSLSRTGAPQTTHVLHRPTEHEFENLWAAWPQAASWAMPHNRQLALRVYRRMRRARQ LPPIEKLLAIVEQYKLNDSRWLNGYPPECSNWLHTRQFEKAPLVRPKKSFFGKLFGKPDP DPVLTPEERHQAEIYRAQVKALHERFTSRTFSAMPPSSPVATRTR >gi|316923001|gb|ADCP01000080.1| GENE 6 2658 - 2750 86 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPQFEVSIPYTGIQTRIVLADSPEEALEKA >gi|316923001|gb|ADCP01000080.1| GENE 7 2969 - 3178 110 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVPIIIYVTRRLLFIELKNFLSIFIKLNPMKIELRRIGKQHNRKSSLRFHHGAHVILFQT LQGRIEGVR >gi|316923001|gb|ADCP01000080.1| GENE 8 3669 - 4628 500 319 aa, chain + ## HITS:1 COG:AGpT89 KEGG:ns NR:ns ## COG: AGpT89 COG4962 # Protein_GI_number: 16119853 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 16 309 33 325 343 258 44.0 7e-69 MSSIETSTLYSDWEAKIRSCAGDVINQALDDPLTCDVMVNPDGRIWQERFGEPMKCIGMM ESYNLESMIRFLASILNKTVSYEQPQLDGEYPGGFRFSGAIPPIVSSPACTIRKPASRVF TLEEYLESGIITERQKESLCRAVAEHQNTLVVGGTGSGKTTFTNALIAEMVRQFPDERHI IIEDTREIQCAALNVVFFHTTDEVPAEKCLKQGLRFNPTRIHFGEVRDGIALDLLAAWNT GHPGGISTLHANSAKDGLERLSELVSRNPHPPRHIEKAIGKAVQCLVFIAQTPQGRRVKE ILTVKGYSSKIQDYDLQAA >gi|316923001|gb|ADCP01000080.1| GENE 9 4656 - 4985 273 109 aa, chain + ## HITS:1 COG:XF2055 KEGG:ns NR:ns ## COG: XF2055 COG3838 # Protein_GI_number: 15838647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB2 components (pilins) # Organism: Xylella fastidiosa 9a5c # 22 107 45 130 144 72 41.0 3e-13 MNKPLLLLVAVFALLALAAPDALATMGAGGGLPYESGLEKIQASLTGPIPFIFALAGIIG CGSALVLGSDMNGFLRGFLVVIIGVCILLGAPTVISVVTGKGALLTAGV >gi|316923001|gb|ADCP01000080.1| GENE 10 4990 - 5283 177 97 aa, chain + ## HITS:1 COG:XF2054 KEGG:ns NR:ns ## 
COG: XF2054 COG5268 # Protein_GI_number: 15838646 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, TrbD component # Organism: Xylella fastidiosa 9a5c # 12 93 14 95 106 96 57.0 9e-21 MAFTPFHHSLIRPNLILGCDREMIMISGIAAFALIATALEWVAAFASVLLWTLSLLLFRR MAKADPLLRFVYIRHRAYQAYYPPRSTPFRVNTREYR >gi|316923001|gb|ADCP01000080.1| GENE 11 5280 - 6050 700 256 aa, chain + ## HITS:1 COG:AGpT83 KEGG:ns NR:ns ## COG: AGpT83 COG3451 # Protein_GI_number: 16119850 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 35 253 4 221 822 151 35.0 1e-36 MNALTLIIWCISGVGLLLFIELIRQINKVGKLQKLQRYRSKKAGVCDLLNYSLLADNGII ACKSGALMAAWTYTGHDLNYSTNEEREYFSDIINRALRGMGDGWMIHVDAVRHPAPRYPK PELSHFPDTVSAAVDEERRRFFEERGVMYEGHFVLTVTWLPPLLAERKMIELMFDDDSKP DSRSKEYENLLETFTKNITALEERLSSVLKLHRLQAHSYMREDGETETYDDFLQHLNFCI TGITHPVQFARVPHIS >gi|316923001|gb|ADCP01000080.1| GENE 12 6097 - 7809 979 570 aa, chain + ## HITS:1 COG:AGpT83 KEGG:ns NR:ns ## COG: AGpT83 COG3451 # Protein_GI_number: 16119850 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 564 243 816 822 554 49.0 1e-157 MGKKFIQVVAIEGFPSASYPGMLSALTDLDLEYRWSTRFIFLDQQTALSHIKKYSKKWGQ KERNIVDVVFNRFNGRVNKDAADMHNDAEQAFAEVQGGHVAAGYYTSVVVLMDEDRRKVE NETLRLQKVIFNLGFAARIETLNTMEAFLGSLPGHGFENIRRPILTTQSLADLLPSSTIW TGSGTNPCPLYPPDSPPLCHCVTSGSSPFRLNLHVRDVGHTMMFGPTGAGKSTALALLAM QMRRYENASIFVFDKGMSMYATTKACRGQHFNIAGEKNGLQFAPLYALGTTQDRTWALDW IDTILKLNGVNSSPGDRYKIANTLKLMDKEGKRQLSDFVGLINVPHIKEALLPYTDSMLL NAKEDTFALSSFTTFEMEELMGLGERWALPILLYLFRRIEKSLHGQPAFIILDEAWLMLA HPAFREKITEWLRVLRKANCALLMATQNISDAVDSPIWNVLLSQTATKIFLPNFQANDMA ETYANMGLNPHQINIIARAVAKRQYYLVSENGCRLFNFALGPLALAFAGASDKETVQKIQ NLEETHGDEWVSVWLESRGLSLADYQGEAA 
>gi|316923001|gb|ADCP01000080.1| GENE 13 7806 - 8486 510 226 aa, chain + ## HITS:1 COG:XF2052 KEGG:ns NR:ns ## COG: XF2052 COG3701 # Protein_GI_number: 15838645 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, TrbF components # Organism: Xylella fastidiosa 9a5c # 6 226 19 238 238 175 41.0 8e-44 MTKKATETPYLNAKKEYNEREGSLNASHQLNKIINILCLLIALSAVGGLVYYRYLLQLVP FVVEVDKLGQTYAVARADRAAPADPRVIHATVARFITLARMVTPDATLQRKGIFDLYASL ASSDPATFKMNEFLGAKSEENPFKRAAKETVEIQITSVLPQSNETWQVDWMETVRDRGDG SQLARYPMRALLNVYVVPPNHKTSEEQLRKNPLGIFIRDFSWSKQL >gi|316923001|gb|ADCP01000080.1| GENE 14 8498 - 9781 563 427 aa, chain + ## HITS:1 COG:no KEGG:Desal_1701 NR:ns ## KEGG: Desal_1701 # Name: not_defined # Def: hypothetical protein # Organism: D.salexigens # Pathway: not_defined # 343 426 155 241 242 78 47.0 4e-13 MKQCVLFCLLSLALTGCALRGTHEVAGVKTADALSETYPADAERFAAQAALELSRRYPAG QTGLSLVTVPGQFGTSLENQLRDHGFAIFPSDSSSGLRIGYVLDEIIGEITPTGYLQIRT SDGTAFSMIRKLMGGLPSTASQAPALPAPVSTPEEPAQGTPVVTPPARTPDPEPVPAVKE ASLAEAPQTPVPQTPTAHKEEPSKASGDSRMMPIITAVRTVIPVKWQYRIQGTELRQKQV PYPKNVPWRRAIMDMATATDSTVDINAKARLVTFMPKGMSSAPALANSTAPSQPESPVVA TALADAATASEAASATPKTSEKPATPKVEAAPIPASPGVTIVAEPVSLPEWRLSQGSLRT QLDVWANRASYQLVWNADTDLDMQSRASFRGNFVAAITQLFEGLHAAGFPLRATLYPANN VLEVSDR >gi|316923001|gb|ADCP01000080.1| GENE 15 9778 - 10509 475 243 aa, chain + ## HITS:1 COG:no KEGG:Desal_1700 NR:ns ## KEGG: Desal_1700 # Name: not_defined # Def: type II secretory pathway component PulD-like protein # Organism: D.salexigens # Pathway: not_defined # 27 242 21 216 497 91 30.0 2e-17 MRKYLALFLAAFVFFSGGCTVKNGGAPTQMQVEAKRDQLTELSSSKTVSIVGQPYLGAKP IVRESGEEIPVLGKNVTLKQQGTLSEIAASLSAIAPLSAQVAEADGEDVQPAKPAASEPP LDDLGLPPLATSLGGAGQIAVSYTGSLRGLLDTIASQSGYGWDYDAKTNRVTFSATQVRT YTVLAAPGAVSYESQLSNKSRERTGGSSISGSNINSTVSSGDTSAQTSQINKTELKFDIW REV >gi|316923001|gb|ADCP01000080.1| GENE 16 10666 - 11340 392 224 aa, chain + ## HITS:1 COG:no KEGG:Desal_1700 NR:ns ## KEGG: Desal_1700 # Name: not_defined # Def: 
type II secretory pathway component PulD-like protein # Organism: D.salexigens # Pathway: not_defined # 1 221 271 495 497 198 48.0 1e-49 MRVWALELSDTSSAGFNLQALFENDNISVVAGSLGDLGSASTAAVSVVHGKLKGSSGTLK ALREWGRATQLTSAGGLLTSNQPLPVLAIKRHAYLAGMSLSTSDYNQTSEITPGEVTTGF AMTIIPHILADRRVILQYNITLSALDDMQEIDREEVYVQLPQVSTRSFAQRSKMKMGQTL VLAGFEQSRQTANNAFGILNTGRDADYGKTLLVVTIELESAENV >gi|316923001|gb|ADCP01000080.1| GENE 17 11340 - 12587 469 415 aa, chain + ## HITS:1 COG:no KEGG:Desal_1699 NR:ns ## KEGG: Desal_1699 # Name: not_defined # Def: hypothetical protein # Organism: D.salexigens # Pathway: not_defined # 1 414 1 415 417 127 26.0 1e-27 MRRIILNGQSYAVGLKGFHFRGTREEAIQEAHGRDAQFDLIVVLTDHEQYGLGRSEGSSW KKTKSLAASLVRNNCPDAVHVFSFHDESTKEPFWWVIGILKKRISAQSDRCFGNKEEAEA LAISIQNSLGVEHKSFSEEESLAFFSNHLGAPKKKLFLQDATALSSLRETDRIKLLKAGL WVMAFALGWWTVDAVLEYKAARDAMEQARILTQNKERRAKELAAHPERYFSSVWMKSPEP DSFLKQCAPAMFRFPTAANGWRLAGLSCSGDFLNATWEHTEFSDFMYLPFNGALDHKKPQ IATSSKMLPVLPEGQREPGILLSQDEATRRIYALTQHFRLKLKKLSFTKRETRTVEKIQL TCPWIRGSWEINKIPAVMITDYANVGPAFAIPGLIITELSYDKDTWTIRGELYAR >gi|316923001|gb|ADCP01000080.1| GENE 18 12577 - 13173 223 198 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRDKRQWLALAALLCIMVGGGAWYWWPQQPVAVPTPIRPIKKTAPSPAQEKPQGQTLKGD AGIKVAPITEITPRTVYSGDLGTLTGIQAAADIKKMQLEFAQLDAKVKELKKQPEPPMQA LPLLPPRPSSSDMTQKTESSPHIVVLSVWGIGTTLTARLRTKDGTHTVKTGDDIPGFGTV QHISRDRVVVGGAAIPWK >gi|316923001|gb|ADCP01000080.1| GENE 19 13178 - 14698 940 506 aa, chain + ## HITS:1 COG:gspE KEGG:ns NR:ns ## COG: gspE COG2804 # Protein_GI_number: 16131205 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Escherichia coli K12 # 74 463 102 458 493 144 29.0 4e-34 MDTNFVPTVPQSLSNRLIWRDGSFHASEELQANMELDALFARYMDEMGGVAVPCVWYEPR TFDNRFGIRFDGKEDGENQTVRYIVDLLKKAKEQKVSDIHITYAGPYTTIEFRQMGQLQP YQTLTGTEGFELISALFQSRISQTQSTFSTYERHDGRIADRKYLPEGVFAVRLHSEPIQT 
PLSAEPGVFVAIRLLFDATGARGTLEERTASLGFTPAQQRLLKSFTERSGLTIIAGATGH GKSTLLKNVMEAMAEHQPTKNYMTQEDPPEYVIRGAKQVCVVTSSSEKARKEELVDSIAG LVRSDPDVIMIGEIRYREMAEAAINAALTGHGVWTTLHASSAFGILIRLREMGVPLEDMC ADGVLTGLVYQRLLPILCPHCRKPLLAHQEAISDRLWDRLGRIYREDELKGIYVREKSGC PECRGVGLVGLQVAAEIVPINRTILDFLKEKRMREAQEYWLKRLDGMTHIEYARRRVAAG EVDPSLAEERLGVTLDHDHDSPEEAA >gi|316923001|gb|ADCP01000080.1| GENE 20 14695 - 15789 560 364 aa, chain + ## HITS:1 COG:VC0836 KEGG:ns NR:ns ## COG: VC0836 COG1459 # Protein_GI_number: 15640853 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Vibrio cholerae # 70 358 52 336 340 74 22.0 4e-13 MIDLNASVRLPAVLERRLARLAFDAKTRERCWRKLASHQRHRMPLEESFNLFIRQAQAGH SLAEHCYRGIRDRLAEGKSMGEALSGFASPEETLLIHSSLKGGNFSDGLTLAAELLAARR KIIDAVIGALSYPTMLGSALIIFLYVISAVVMPQMTAITDPEQWNGPAALIYRVSLFVNS PSGVVVLIAFFLSIAFIVATLPRWTGKGRRWADKIPPWSIYRLLVGVSWLHTVATLMSAG QKLVDILDSMIKDNNTTPYMRAISRRLLIHASRGANLGDALEATKLHWPDRVLVDELQAY ANLPGFSQQIRSIATDWLNEGIGMIVQASRILNVLCISLLGLLIIFMAMSVGSIVENIIH GMGM >gi|316923001|gb|ADCP01000080.1| GENE 21 15935 - 16339 250 134 aa, chain + ## HITS:1 COG:no KEGG:Desal_1695 NR:ns ## KEGG: Desal_1695 # Name: not_defined # Def: PilS domain protein # Organism: D.salexigens # Pathway: not_defined # 6 134 31 159 159 76 33.0 3e-13 MLFGDSKLGTAQQDLGGLRINIQKTYLDLRTYSSISNDELIASAAVPSNMLNADKTAIVN AWNGDVTVSPADGGRSFTIELALIPQAECVKLAAFQFDTWESVSVNGTELAEGTLVTTDQ CTDVNSNTITYTSH >gi|316923001|gb|ADCP01000080.1| GENE 22 16388 - 16807 111 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|283851716|ref|ZP_06368994.1| ## NR: gi|283851716|ref|ZP_06368994.1| PilM protein, putative [Desulfovibrio sp. 
FW1012B] # 1 138 1 141 142 77 35.0 3e-13 MYPLAVLGLFIGIIFSLNSILPHMPIQKYPEEAIAKNFCVYRLAVANYIQENSNVVSISD NALELPDGFIKVRDWRTRISNGYCYIYGEATPEEIVLIRKNMGNSVLIGRNQGNRLYPSG IVVNAAIPAESVVSIVSLP >gi|316923001|gb|ADCP01000080.1| GENE 23 16804 - 17937 515 377 aa, chain + ## HITS:1 COG:no KEGG:Desal_1691 NR:ns ## KEGG: Desal_1691 # Name: not_defined # Def: hypothetical protein # Organism: D.salexigens # Pathway: not_defined # 3 373 6 373 375 275 42.0 2e-72 MKKRRQSGFTLLEMLITLTVIGILMPIAYQIWYAGFIEEQQSHAADQLRQVNAAADAYVK RHFETLLASASLASGPQITVAQLVDAGLLPNGFRNSNIWGQSYEIYVRSPHENTLNTITI TRGGRAHEEGSSFATLIVPGAAMRLGGAGGFVPTGTLPGQIEGTLHGTAGGFILDLAALG IASPGPGHLGAYSAFDEYDLGTDYLYCGEVPGHPEYNQMTTELDMTGHGIENVGSVQYVS RTVTDGESCSAQDEGKMFLDQLQGMYLCRNGQLVLMADTGNSTQFKMSTVANNGDRITKP SCAPGTGTVPQIYVAPAIASSGAQSPAMSALQAWATSVSATEWQVHLRVLNTSNSDAWVY PTPEYGRVMVFTTYAKN >gi|316923001|gb|ADCP01000080.1| GENE 24 17948 - 18826 324 292 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTESFKYYVTDGDTEWETTDLEEAIRFDRNEYFFLTESGVSCKATNIREALKKASKDYGM DVQYALCTISSGPEDGHVFIIGDSIKNIIEKLLKKKYINDDELVSNFLREANYIYWIIIP ATKELVGYCKSYEPYFMYNAPKYIIRHNFADIYETDDPEKISYDINQLIQKDKNKYFIIS YTGDLHQATNIREAFEKGHNHPYAVCNSFNLLHSNNIKIFSVGNSIREAIESMLEEKHKT DNELISNFLLGYGWSCWTIFPITQNLLGYYTEHKECDCEYCQIQNSIINLIK >gi|316923001|gb|ADCP01000080.1| GENE 25 18819 - 19121 170 100 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTPLNKEMYLAISLLFMIWENPKNGALFNYMLGIIKYPHSDIFKYKVKTLHNAIDSTRGR GERTLHEMLKDMTEGKCKKYDFPLIREILKKEYNIVYPLI >gi|316923001|gb|ADCP01000080.1| GENE 26 19547 - 20017 351 156 aa, chain + ## HITS:1 COG:STM2877 KEGG:ns NR:ns ## COG: STM2877 COG0741 # Protein_GI_number: 16766183 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Salmonella typhimurium LT2 # 1 156 1 143 160 63 32.0 2e-10 MKHILFLLAFLLLTATAHADVFDAPCSRYRIPKRLVLAIAKTESGLDPWCVNVAGKDYRP GSRSGALAIIRQARARGLSHDIGLMQINSWWLKHLNISPEAALEPRNNATLGVWILAKEI 
QRHGYNWKAVGAYHSPTPARQKMYAQVVSQKYRTLK >gi|316923001|gb|ADCP01000080.1| GENE 27 20030 - 20929 563 299 aa, chain + ## HITS:1 COG:XFa0042 KEGG:ns NR:ns ## COG: XFa0042 COG3504 # Protein_GI_number: 10956753 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB9 components # Organism: Xylella fastidiosa 9a5c # 35 298 26 294 296 280 53.0 2e-75 MNRSIVTLVLLVAQALPFSALAASKGTPDELPDLPLETLYFGGENPVLTPSEERALKIAR EWQAKAATQLTPVPGPDGAIQFLYGAVQPSIVCAVMQVTDIELQAGEQINNINIGDSARW LVEPAVTGAGVAEVQHIVIKPMDVGLETSLMVTTNKRTYHFRLRSHRTQYMPKVSFIYPD TVQQRFMALKNRQEEDRKEKTIPETQEYLGNLDFGYSISGSSPWKPVRVYNDGTKTIIQM PPKMRQTEAPSLLVLNGDEEVIVNYRLQGDRFIVDQLFDKAMLIAGVGSKQTKIVITRK >gi|316923001|gb|ADCP01000080.1| GENE 28 20933 - 22273 443 446 aa, chain + ## HITS:1 COG:XFa0040 KEGG:ns NR:ns ## COG: XFa0040 COG2948 # Protein_GI_number: 10956751 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB10 components # Organism: Xylella fastidiosa 9a5c # 21 446 15 469 469 308 42.0 2e-83 MGWKLWGKPEKELEQPQVLDPNAVPSGSGGKRANNIPLLLIILVVFIFLAIVAYVAFERS NSQLSPTEKEENRPPKENRSATQMANELTSGWGGTVVIPTEPSPPATDEEKAHTTRDAAP NKTFADLSLVQKSAPDPYLLEHRMRVQELKIQALEAARTSATRVHIELEKRPAPTAMDVN ARIAATRQRLADMSDPSAAYQARLAQLRGESPESTDALYEPTRSGKNDVRQFTQKDSWNL DSQVEGPASPYMIRAGFVIPATMISGINSDLPGQVMAQVSQNVYDTATGKYLLIPQGTRL IGAYSSDVAFGQERVLMAWQRLIFPDGKALDIRAMPGADSAGYAGFSDKVNSHWFRTISS AVLMSGVIAAVDMSQNDRNSDSNNDRQRASDSLSEALGQTLGQTLSQIITKNLNISPTLE IRPGYRFNVMVVKDMSLPGSYRAFDY >gi|316923001|gb|ADCP01000080.1| GENE 29 22622 - 23392 250 256 aa, chain + ## HITS:1 COG:XFa0039 KEGG:ns NR:ns ## COG: XFa0039 COG5314 # Protein_GI_number: 10956750 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Conjugal transfer/entry exclusion protein # Organism: Xylella fastidiosa 9a5c # 6 254 12 259 259 147 41.0 3e-35 MWSISSIKKLSVSISLVLFLPFSVRAGGIPVFDAANLQQSAISAVEAVNQTLKQLEEYAL 
QLQQYEDQIKNSLAPAAYVWDQAQRTMNKIVALQDQLDFYMNQAGDVNTYLRKFGSVSSY RASPYFGPGGGTKENRKTLMEAEELGSEAQKHANDNVVKTLENQQQALKKDAATLEQLQA KAQGAQGRMEAIQYASQLSSHQSNQLLQMRALLTAKIAAENAREQTVAAREARRQAMEEK AKESRYKKSDKMGWKP >gi|316923001|gb|ADCP01000080.1| GENE 30 23389 - 23643 109 84 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRIALFLTAIMTCSLLFACTDQEAEKRIQAAEERAVAAEKRAAVAEERLLILEKKITENE ERARKEKIAKESRYKKSDNIGWKP >gi|316923001|gb|ADCP01000080.1| GENE 31 23996 - 24889 461 297 aa, chain + ## HITS:1 COG:XF2046 KEGG:ns NR:ns ## COG: XF2046 COG3846 # Protein_GI_number: 15838640 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, TrbL components # Organism: Xylella fastidiosa 9a5c # 3 295 119 420 464 107 36.0 2e-23 MMIGAQAGGQENLEPSGIVDLGFTIYDAAQQNLDIWSGTSIAAYIISLIVLVIFALIGIN MLLQLCSAWVLAYAGIFFLGFGGSKWTSDMAVNYFKTVLGLGASLMTMTLLIGIGTSIVT ESIAQMDTSGAQNLTEMAVLLITSITLFMLVDKLPSMVAGIINGASVGSTGIGAFGAGAA IGTAAAAMGMAVTSIAKSASIITGAVKGGAGLATGFQAVKNEMMGEQADIASKMQEHRKE AGLPPESGFVKPPSATAVLGRMAQMGGAMMKQGIKNKASEAVSNTAGGKIADMLKKD >gi|316923001|gb|ADCP01000080.1| GENE 32 24898 - 25443 347 181 aa, chain + ## HITS:1 COG:no KEGG:PMIP19 NR:ns ## KEGG: PMIP19 # Name: not_defined # Def: putative conjugal transfer protein # Organism: P.mirabilis # Pathway: not_defined # 9 180 10 182 182 148 44.0 1e-34 MKFLTLLVLFLFPFAAQADNQELSGDTKLACEAILCLSSGTRPGECGPSLSRYFGISHKK WKDTVKSRRNFLNQCPTVGEDPSMPTLVEAILNGAGRCTADLLNRQLRKQVPVEECIPDD VWRRMSREDRENTPRCRTKLVWIIDDALPSYCRTYAGHEYTWKVGVNYVGEPLNGGHWID E >gi|316923001|gb|ADCP01000080.1| GENE 33 25663 - 26142 402 159 aa, chain + ## HITS:1 COG:no KEGG:PSPA7_4466 NR:ns ## KEGG: PSPA7_4466 # Name: not_defined # Def: hypothetical protein # Organism: P.aeruginosa_PA7 # Pathway: not_defined # 1 116 85 202 250 62 36.0 4e-09 MRKNIRLFTVNPEYYLNLEQFRAPFMSFCFHDGKGYVAEDGCHRACIAKFFLYSQPSPFL HGVHLTEVQTDARMTNLFYRLKKLLPVWCAAFPNSQEVTRNDDAKGWSMSFYGNTLRIEN HRRKGFNADFQAEELEGGLLPALLSPFKRRFGTYRNLLN >gi|316923001|gb|ADCP01000080.1| GENE 34 26153 - 26671 6 
172 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKPIRRLFIDTGMRQSIELFLKEVSFCAMRASCCSYGEVTIYIPHFLLTPSASAGILERY YGKKIIIELQFERMLTQILITVGTFTDFTNPVHTTDRFKVNSYHERRHLRTENYTVSLTV AYKFILALIPHLHSRTLNVITVSPDAASNLIHSFETPGDPSGSYRRVLLANI >gi|316923001|gb|ADCP01000080.1| GENE 35 26697 - 27092 216 131 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGIYICKDDERNSICIIQTLVNNVNDSLNKDDCYPVFGAIGNTFMVEIAQGPIERRNILK TFYIDIKHNLQTEILIENKTYADEEILISIKQEYLLIAMLYITKSIFEQSPYILHPLNAR GNYPYGWEYGI >gi|316923001|gb|ADCP01000080.1| GENE 36 27113 - 29428 1606 771 aa, chain + ## HITS:1 COG:XF2059 KEGG:ns NR:ns ## COG: XF2059 COG0550 # Protein_GI_number: 15838651 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Xylella fastidiosa 9a5c # 1 651 1 676 685 534 44.0 1e-151 MRLFIAEKPSLGKAIAAELGVTQTCQGYTVCGNDAVTWCFGHLLEQYDPDDYDDAWKLWR RSSLPMIPREWRLKPKESARAQLQVISNLLGEAATVVNTGDPDREGQLLVDEVLEHFRYT GPVQRIWLASLDSRSIQKALATLKDNRDYANLRDSARARSQADWLIGMNATRAMTLRGRE SGRDGVLSMGRVQTPTLALVVNRDREIAAFTPIDYLVLQATLQHDAGTFSALFKPSETQP GLDSEGRLVDGATAQGIMDAVRGKNGIITSVTREKKKKPVPLPHCLSSLQKAASSKLGMT AQQVLDTAQSLYEKKLTTYPRTDCRYLPEEQFSDAARIITALSGVSGLEAVTAKADSALK GPVWDTKKITAHHAIIPTGEEPRSLTAQEKELYLMIAVQYFLQFYPPMLYEAQKILATIV ETAWEARGRMIIEPGWTGFAAEEDDEDAKKKEAEQSLPSVGNNDAVLCADVDALKKKTTP PSRFSEGSLIEAMANVHRFVSDAKAKAVLKENEGIGTEATRASILETLKGRGFITASGKS LVSTPLGQSLIDMTPDTLRDPVTTAQWEQRLEAITRGETSLEDFMREQYAALPLLLAPVL STPAALQPGAFPCPKCGKALRRREGENAGEFFWSCSDADCRTFLPDEDGKPGKPRERAIP SEHPCPVCGRPLYSGKNDRGTYWACYNKQGHPDGNNVFLPDDNGKPGQPKPRTPRIVTEF ICPDCGKPLLIRQGTNAKGPWTMFSCSGFPQCKASFWDKDGKPNFNNRVKH >gi|316923001|gb|ADCP01000080.1| GENE 37 29449 - 29778 289 109 aa, chain + ## HITS:1 COG:no KEGG:PA2593 NR:ns ## KEGG: PA2593 # Name: not_defined # Def: hypothetical protein # Organism: P.aeruginosa # Pathway: not_defined # 1 109 1 112 190 82 42.0 4e-15 MLLHITPKLLTAHAFSTAGLASVEIPEFRLKLSGEKELMTRKPFSNKRYYVGCRRSGKAS SGFLLELPHTVDEYTVISEWETVSGLRTHTVRYVVLDNELDAASDEMLL >gi|316923001|gb|ADCP01000080.1| GENE 38 29857 - 30030 71 57 aa, chain 
+ ## HITS:0 COG:no KEGG:no NR:no MEMHRGTYRGECHDIYGPGGIIMERKQSFRIPTIERERLTLDVGFNLPSPDTAFVLS >gi|316923001|gb|ADCP01000080.1| GENE 39 30064 - 31008 644 314 aa, chain + ## HITS:1 COG:no KEGG:Dvul_0781 NR:ns ## KEGG: Dvul_0781 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 181 1 188 204 127 39.0 4e-28 MSLMKTSIPLTVYRVEPSTLANVTGDRIRQYAFRSIDQTPDELGSGFVPDTDMFADLNIS IPEKGNYMNFGFRIDTRKVPSTIFKKHLAEMTQEELQRTGKTFLSKNRKRELKELCKTRL LSKTEPRPSMSGVAIDKTTGLVYFASTSKSVLELFEKYFKTAFNGELERLSPSTLAASDS DHPLEDFMRDLYTESMALSLNGKEYHITEQGKATLSQAGGATVSVTDVPNSALAGLESGL LFKSLKIRLSTMPDDELVSVFTLNADFSFSGLKTPRIKIEKGNDDPDASFMLKIGFIEET VTAMHSLFRQHAGN >gi|316923001|gb|ADCP01000080.1| GENE 40 31276 - 31491 76 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDCPAAHPLTPRFAAEEKDCPAAVPLKAVLTSCPLKLCVCVGCVWLWLLRELATVCPTLY PDWRPRRKRPR >gi|316923001|gb|ADCP01000080.1| GENE 41 32348 - 33049 -222 233 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSQLAQTGESAPQGRFRTSLSRWYIRRLLGNLPPFHPCLSAERRPRIVPYGTSPHYGHTP RKSRHSKPFLPSYQCIRNWSRSSEELCQTIGDIFSGGLVEAAPLLESSNRTQCSVSRGES GQHPKKQMRLEIIHGVLTTFQQPVGDVMYTGQEIMHGLTKTQYLTLPESSGRRKRHVPST VLCGILSERDDMLFLCPIQSIRNWSGLSAAGYPITGGYSCAAKVHRRVHITAP Prediction of potential genes in microbial genomes Time: Fri May 13 03:33:54 2011 Seq name: gi|316922979|gb|ADCP01000081.1| Bilophila wadsworthia 3_1_6 cont1.81, whole genome shotgun sequence Length of sequence - 25417 bp Number of predicted genes - 26, with homology - 16 Number of transcription units - 14, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 599 - 1063 99 ## - Term 1066 - 1105 -0.0 2 1 Op 2 . - CDS 1121 - 1474 199 ## 3 1 Op 3 . - CDS 1480 - 2229 201 ## Amet_3814 hypothetical protein - Prom 2278 - 2337 6.2 + Prom 2816 - 2875 3.8 4 2 Tu 1 . + CDS 2961 - 3104 63 ## + Prom 3717 - 3776 6.1 5 3 Op 1 1/0.000 + CDS 3806 - 5236 508 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 6 3 Op 2 . 
+ CDS 5233 - 6423 419 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase
7 3 Op 3 . + CDS 6423 - 6845 148 ## Ddes_1578 hypothetical protein
+ Prom 6869 - 6928 4.3
8 4 Tu 1 . + CDS 7087 - 7482 147 ## COG1475 Predicted transcriptional regulators
+ Prom 7524 - 7583 1.7
9 5 Op 1 . + CDS 7647 - 7856 136 ##
10 5 Op 2 . + CDS 7820 - 8161 242 ## COG0620 Methionine synthase II (cobalamin-independent)
11 5 Op 3 . + CDS 8061 - 8345 118 ##
12 6 Tu 1 . - CDS 8546 - 9094 -251 ##
- Prom 9137 - 9196 3.1
+ Prom 9096 - 9155 3.9
13 7 Tu 1 . + CDS 9233 - 9970 -345 ## COG3547 Transposase and inactivated derivatives
+ Term 9986 - 10023 -1.0
14 8 Tu 1 . - CDS 11471 - 11716 56 ##
- Prom 11809 - 11868 3.1
+ Prom 12328 - 12387 4.0
15 9 Tu 1 . + CDS 12466 - 15117 1625 ## Ent638_0501 outer membrane autotransporter
+ Term 15126 - 15167 10.9
+ Prom 15193 - 15252 4.1
16 10 Op 1 . + CDS 15380 - 15829 183 ##
17 10 Op 2 . + CDS 15885 - 16358 180 ## DMR_32200 hypothetical protein
18 10 Op 3 . + CDS 16434 - 16955 55 ## COG3293 Transposase and inactivated derivatives
19 11 Tu 1 . - CDS 17144 - 18919 233 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains
- Prom 18951 - 19010 2.7
20 12 Tu 1 . - CDS 19481 - 19588 85 ##
- Prom 19625 - 19684 6.0
+ Prom 19581 - 19640 3.8
21 13 Op 1 3/0.000 + CDS 19827 - 21020 819 ## COG5441 Uncharacterized conserved protein
22 13 Op 2 . + CDS 21074 - 21901 985 ## COG5564 Predicted TIM-barrel enzyme, possibly a dioxygenase
23 13 Op 3 . + CDS 21997 - 23391 412 ## Ddes_2094 sodium/sulphate symporter
24 13 Op 4 . + CDS 23424 - 24017 302 ## COG0655 Multimeric flavodoxin WrbA
25 13 Op 5 . + CDS 24077 - 24493 125 ## COG2140 Thermophilic glucose-6-phosphate isomerase and related metalloenzymes
+ Term 24515 - 24552 9.4
- Term 24492 - 24554 8.2
26 14 Tu 1 .
- CDS 24641 - 25153 -225 ## - Prom 25265 - 25324 4.7 Predicted protein(s) >gi|316922979|gb|ADCP01000081.1| GENE 1 599 - 1063 99 154 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPDYRGRFLRGLQSGHSVGQTVANSLKSHNHTQPTHTHSFSGQLVSTALSGTAAGQSFSS AANLGVSGWAAGQSITSGGAGGQSFSYQGSTSSSSGGYVTFRHDTGTTNGYGAGGIIPGY SATPPTASWGITITENNGFFIPSTVAAQVGIGVQ >gi|316922979|gb|ADCP01000081.1| GENE 2 1121 - 1474 199 117 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKLQFVVVKIMNLMEEGGSETLKHFVSEIFVELEQWGLRGDPLLWKYLKNYYAAIELPY PVEHLKEDILRIFKDFTGELPVREKYYFVKEFSKNCNCIKYGDNAMKEFFFRIVSLC >gi|316922979|gb|ADCP01000081.1| GENE 3 1480 - 2229 201 249 aa, chain - ## HITS:1 COG:no KEGG:Amet_3814 NR:ns ## KEGG: Amet_3814 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 1 249 1 248 248 342 65.0 9e-93 MSEQFINLTIDNIDKEHLCCAISDKKHQVGVATKKNWLRERIAEGHIFRKLDEKGKVFIE YAPLETAWVPVHGDNYLYIYCLWVSGSFKGKGYAKSLIEYCINDAKEKGKSGICVLSSKK KKPFLADKKFLLKYGFEVVDTVKDEYELLALSFNGEKPYFSETVKEMKNSSEELTIYYGL QCPYIPNCIEQVQEYCTKNNIPFHLIAVDTLDKAKNLPCIFNNCAVFYHGEYETNYLPNE TFLKKKFQV >gi|316922979|gb|ADCP01000081.1| GENE 4 2961 - 3104 63 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGAIGLIRKNEPGIYTLNEWESLTDIRTGPRCNNDSDRHTMRIHGQM >gi|316922979|gb|ADCP01000081.1| GENE 5 3806 - 5236 508 476 aa, chain + ## HITS:1 COG:lin1348 KEGG:ns NR:ns ## COG: lin1348 COG0553 # Protein_GI_number: 16800416 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Listeria innocua # 13 462 2 390 399 127 27.0 4e-29 MIVSSADWKLTTTAYAHQRAAIEKLSQLKVGALFMDMGTGKTRTALELIWLRRKRIVKCV WCCPVSLMEETRREILRHTSCIDTDIHMVGPRTREENIPQAMWYLVGLESLGRSPRVAYV LDRLIDGGTFLVVDESTYIKGHRAKRTRRLIGFGTRAPYRLILTGTPIQQGIEDLYTQME FLSPLILGYPSRHAFQSALAVFQDRKQAFPVEGVSLDEVCRAIAPYVYQVSKEDCLKLPP KLYRRILCSFSEEQTSLYQAVKTRFLDEINRYGLSDLNMAIFRLFSGLHAVSCGVLPSGF LGVRRLRNRRIDTLFNELRQFRTSHVIIWANYLESVAALNEALPTAFPQIPVYTLHGRVP 
ATERAAKIDLWKKRGGVLVATQGTGGYGLTLTESHQVFFYSENWRYALRLQAEDRCHRIG QKDSVCYVTLQGESRFDERIRSALARKADALEELKREVRRLNGNRNAIRKLMEHAV >gi|316922979|gb|ADCP01000081.1| GENE 6 5233 - 6423 419 396 aa, chain + ## HITS:1 COG:PA2127 KEGG:ns NR:ns ## COG: PA2127 COG3969 # Protein_GI_number: 15597323 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Pseudomonas aeruginosa # 2 385 4 399 408 337 47.0 2e-92 MKRWLGMDVFTAARSRLEEVFDAFERVFVSFSGGKDSGVVVQLALDIAREKGRLPLDVLH IDLEAWYAHTDAFVTRIMTRPDVRPHWICLPIHLRNSVSQIQPHWLCWDPDARDLWVRKL PTHPGVIADETYFSWFRRGMEFEEFVPLFAEHFAEGRSCASVVAIRCDESLNRFRTIAST TKRRWNGRSWSTVTPSGVVNVYPIYDWKTEDVWTATGRQGWDYNKIYDLMHLAGVGIHQM RLCQPYGDDQRKGLWLFKMLEPETWSRVVGRVQGANFGNRYVELSGNVLGNIRIRLPEGH TWRSYAEFLLDTMPPPTAEHYREKINTFLKWWAKHGFEEIPQEADPKLESRRKAPSWRRI CKTLLKNDYWCKGLSFGQTKRDMERQAQTIMKYMEL >gi|316922979|gb|ADCP01000081.1| GENE 7 6423 - 6845 148 140 aa, chain + ## HITS:1 COG:no KEGG:Ddes_1578 NR:ns ## KEGG: Ddes_1578 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 132 1 133 143 80 36.0 2e-14 MELLEFKHENDPARNRELFGLLGEYATNSVIWEKLGGTISSEPGRLWFIATDGEARHVVA FGSMRFGRTRTRILHLYALEEGADIPVLEHCLECARKNGTAHLTTTDYWTRKELYLRYGF APARKIGRFVRFEKDLTCER >gi|316922979|gb|ADCP01000081.1| GENE 8 7087 - 7482 147 131 aa, chain + ## HITS:1 COG:ECs0640 KEGG:ns NR:ns ## COG: ECs0640 COG1475 # Protein_GI_number: 15829894 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 2 119 88 206 209 114 47.0 3e-26 MPVVVARDGKGHVVVDGFHRRQVAKTDVQIRESLGGYLPVVRLDKSIEDRITSTVRHNMA RGTHQVELTARLVTLLRNHSWTNERIGTELGMEPDEVLRLKQMQGLAEAFADREFSRAWD MCFQDTAKEQE >gi|316922979|gb|ADCP01000081.1| GENE 9 7647 - 7856 136 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQTYILFFPRIGRNRELKKVLEGYWKGELSEGQLRNTVDELKNGIGRFSMAQGFPSSLWA TFLSAIFLI >gi|316922979|gb|ADCP01000081.1| GENE 10 7820 - 8161 242 113 aa, chain + ## HITS:1 COG:CC0482 
KEGG:ns NR:ns ## COG: CC0482 COG0620 # Protein_GI_number: 16124737 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Caulobacter vibrioides # 3 111 63 191 777 80 38.0 6e-16 MGDFSFCDFLDLTIALGTVTYIFRNVGSAFDTYFVMTRGDGSRNIPVMEMTKWLNTKYHY IVLELSSNQTFQPSLEWLENDDALAGQLGFHAKPVLPGPITNLPLSKDKTEVS >gi|316922979|gb|ADCP01000081.1| GENE 11 8061 - 8345 118 94 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTPSQGNLVFMLSLFYPVPLPTCHSPRTKQRFPSGKAYRSLEIYISFLERLTPQCGWIQV DEPILCTDMADRRWVFRPEMREKHNYRGVPDGMQ >gi|316922979|gb|ADCP01000081.1| GENE 12 8546 - 9094 -251 182 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYRFFWIVKAIKRILRNVNPNNTSPFHCSSHPCKYGLYAQATVRDCHEMSGDLDSSASFE LRPRIDLPLMLGPPLRGEVEFSHMTKDFFEWQYTRPRIVPYGTSSPYGLLPRNQGTASPF CLPLCVPERGRARGGTTVQDYREIFLRGHGSQTSTHYGTVEYEDISLLYEERTFVALFLF ID >gi|316922979|gb|ADCP01000081.1| GENE 13 9233 - 9970 -345 245 aa, chain + ## HITS:1 COG:all0306 KEGG:ns NR:ns ## COG: all0306 COG3547 # Protein_GI_number: 17227802 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 3 234 77 308 320 156 34.0 4e-38 MKVVNQRQVRNFARSLNKLCKTDAIDAKILVEFALSRCIISDTPKSEHGTKLSALLVRRE QLQFLITQEKCHIESTEGQYAHYIEAILDTLKAQLKEIDASISQVVKKDEKFAQDNKIIQ SIPGVGPIVSATILTECPEITRIGRKQLAFLVGVAPFNRDSGRFRGQRHITAGRAKIRKV LYCAMRPCLRWNSTVRNWFEHFLSSGKPYKVAVVACMRKLLMVLRAMLISQRRWHPSLKE DAETI >gi|316922979|gb|ADCP01000081.1| GENE 14 11471 - 11716 56 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLCLILCKYDSSITECRSWHATASCGLGCCYFSLFPQAIYLCEHGRKWIWFAAKTDSIYL CGRNTFRLSFPNVFSFFLGNE >gi|316922979|gb|ADCP01000081.1| GENE 15 12466 - 15117 1625 883 aa, chain + ## HITS:1 COG:no KEGG:Ent638_0501 NR:ns ## KEGG: Ent638_0501 # Name: not_defined # Def: outer membrane autotransporter # Organism: Enterobacter_638 # Pathway: not_defined # 400 883 895 1371 1371 317 38.0 1e-84 MDNLNLNVISKDSTISNYASASGISMFCVNDNSFKAGKTNIEIEVKNPLKGIYATGIDLT ASKNNVVALDDTKIIIKGSSENERFGVTHRGIYATGENNTITTGKLDIDLSGEGRAKMDA YAVTVKGDGKLSTATLGDGNISAFAKSPNDAFSVGITVHDGAVVSKGDGSITAIAEAGQG TGYGVGIRAKGEGAKLDVGHDDVTVKVTGGLDDGDKFIQSIGAIADSGGSLSLAGGSITA SADSEDAFIRGVLSTGGSSMTLGTEEGSLAVKAFGNDSAAALDVRGTSTIALQGNVSVEA PIALYGSGTIKNFGNLTVTSGSVADFIGTFTQDDGNTNLSGQDFFGGNVNVFKGALTVGM PDRGGVNLQNNEAMLALGQNVTLGEQAKLVVGDAANSGATVAFGDNSLLVVDGDKAASSF MISAPDAAPRSISVSDGANLYIVNAKIGETYKITDGFVAMDGDVAGWTGDNLVTGRMINA IRSDGTDGKVIVTTELKSASVVFPGVSIPNTIDEIFVSNKNDVDSENPGVAFISRALEPL YLPESDVVTTLDSAAQLSYAGGVQASTIAVAQAPTRAIQDHLSLATNVAQRGTCLHEDGF DLWANALYGANRARDFSAGSLNAGYNSDFAGGVIGSDWTFGAGVGKGRIGLAFNVGTGDT KSRGDFNSTKNDFDFWGVSLYGGWSTDNINVVADFGYSASKNELKQDIPASLGMGGKLKA DVDSSVLTTGVKAEYMVKTDVLDVMPHVGVRYMAVKTDSFSTKLDQGGDLFHTDGDLQHV WQFPVGVNLSKSFETESGWKIRPQADLSVVPAAGDTKAKIDVRTPGVDASDSMKRRVMDT TSFDGVFGIEVQKDNISLGLGYNVQASEHQTGQGVTASFMYKF >gi|316922979|gb|ADCP01000081.1| GENE 16 15380 - 15829 183 149 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIFIPYSKAQKTPLGEFPNWGDSIEQVIKERNIKYPITENSMSLIFGIPVQTMLAKDVFD EGIHIDLYSFFYNKLIEFSISIEPNNFEKTYYEIKNNIELNFKKYKDSHFIKNEDIFIDN DKKTIIILMKTKTIILYFIDFESYKKYKK >gi|316922979|gb|ADCP01000081.1| GENE 17 15885 - 16358 180 157 aa, chain + ## 
HITS:1 COG:no KEGG:DMR_32200 NR:ns ## KEGG: DMR_32200 # Name: not_defined # Def: hypothetical protein # Organism: D.magneticus # Pathway: not_defined # 13 156 6 148 178 67 28.0 1e-10 MFYSEDIMISDGELFKLLINKYPKEPIPFIVEQFMIFKFCLIQAEKKMDSLSIEKATQQP IRLEDQAQKNKKYQYHLVMNPERAINVDIVQCCLCGRKFQSLTANHLLSHGISVDEYKKL CGYAPTQKLICGNLLKKLRENAQKARRTREKKISREH >gi|316922979|gb|ADCP01000081.1| GENE 18 16434 - 16955 55 173 aa, chain + ## HITS:1 COG:msl9596 KEGG:ns NR:ns ## COG: msl9596 COG3293 # Protein_GI_number: 13488445 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mesorhizobium loti # 91 171 47 127 167 95 60.0 6e-20 MHLAPSALEKKEFRETWQVTALRDRVRTRYFQRKETRVLQEARRFPCLHLLYLSESQLER IKPFFPRSHGAPRVDDRRVVSDIIYGLPDVRLHPPQNSPHGVNSAQKEDSPRLIGRTKGG LNSKLHAVCDGHGRPLLLLLPEGQVNDYKGAAILQHLLSDTCTFLADGGSDTS >gi|316922979|gb|ADCP01000081.1| GENE 19 17144 - 18919 233 591 aa, chain - ## HITS:1 COG:aq_1792 KEGG:ns NR:ns ## COG: aq_1792 COG3604 # Protein_GI_number: 15606847 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Aquifex aeolicus # 243 590 145 492 497 259 42.0 1e-68 MFTENVSRLRRLVEGARHAEKPLVFSIPGSGLSAISAVKGGIPFLVVLNSGIYRISGVNS YASFLPFGNANEQTEELIRSQILPRCPQTPLVAGMLVGDPSCPEQERFRRLKSLGVAGII NWPPVSMNQGFFFDALRRRGFSEEREVAMLCEAKEQGFVTFGFSASIDSAKLFATEAGVD VLIINVGWTFYGSEPLEKQDRVQYAIRHVNQALEAVREATHGNCPICLFYGGGINTAQDT LQLYQQTNIDGLGVGSALEYFPVKDLIRGLSREFLEISKKKTLIQELRESPSDVIGTSQA MLELYQRVRRVAAYDVNVCIEGESGAGKEVIANSLHGLSRRSMGPFITINCGAIPETLIE SELFGHEKGAFTGALERRLGKFELANHGTLFLDEVAELSPKAQVSLLRAIQQKEIVRVGG RKNISVDIRIITATNRCLKELVSKGEFRADLFYRLNTITLYIPPLRERIRDIEALVNHFL KKIRQTFGCKATSISKDFLRRLKQHSWPGNVRELQQIIAEAAIMEDGDILMGYSFHPDSE MPHSAGDGMDNKKFNLRETLIRTNGNKAQCAKILGVSRKTLYQWIKKYGLN >gi|316922979|gb|ADCP01000081.1| GENE 20 19481 - 19588 85 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQKKGIDSYREHDVGFLWGSDQRKSIKPFVPGNGH 
>gi|316922979|gb|ADCP01000081.1| GENE 21 19827 - 21020 819 397 aa, chain + ## HITS:1 COG:mll9388 KEGG:ns NR:ns ## COG: mll9388 COG5441 # Protein_GI_number: 13488177 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 1 395 7 404 408 393 53.0 1e-109 MKHVYIIGTCDTKHTEILYVRNLLFSQGISTCIVDVGIHERMLPPDIGNVVHLREKQLDV FADITDRGKALALMGTLLEEFLVEHQKDISGVIGLGGSGNTAIVTRGMRALPVGLPKLMV STVASGDVAPYVGASDICMMPSVADVQGLNIITRKILGNAAHALAGMVQHPIPEEKENRT LVGMSMFGVTTPCVQQVCALLPEQYEPLVFHATGTGGKALEKLIDSGMVSLAIDITLTEI CDLFMGGVMSAGDDRLGAVIRTGIPYVGSVGALDMVNFAAMSTVPEKYRERNLYVHNENV TLMRTTVEENAAMGRWIGERLNLCTGKVRFLLPEGGISAIDAPGMPFYDPEADAALFKAI EQTVKQTDTRRIIRVPHHLNAPEFAEAVRTHFLEAIG >gi|316922979|gb|ADCP01000081.1| GENE 22 21074 - 21901 985 275 aa, chain + ## HITS:1 COG:YPO3838 KEGG:ns NR:ns ## COG: YPO3838 COG5564 # Protein_GI_number: 16123973 # Func_class: R General function prediction only # Function: Predicted TIM-barrel enzyme, possibly a dioxygenase # Organism: Yersinia pestis # 5 273 10 278 280 403 73.0 1e-112 MAITRSEILANLRAKIARNEPIIGGGAGTGISAKCEEAGGIDLIVIYNSGRYRMAGRGSL AGLLAYGNANDIVLELAREVLPVVKKVPVLAGINGTDPFVNWDYFLRRIKEEGFAGVQNF PTVGLIDGLFRANLEETGMSYSLEVEAIAKAHQMDLLTTPYVFSPEDAKNMALAGADILV PHMGLTTAGTIGAQTAKTLEECVPLIDECVAAARAVRPDIICLCHGGPIANPEDAAYILQ HCKGIQGFYGASSMERLPTEIAIKEQVRKFINIKF >gi|316922979|gb|ADCP01000081.1| GENE 23 21997 - 23391 412 464 aa, chain + ## HITS:1 COG:no KEGG:Ddes_2094 NR:ns ## KEGG: Ddes_2094 # Name: not_defined # Def: sodium/sulphate symporter # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 461 8 466 466 234 33.0 4e-60 MTLSRSVKWIIAVVIALLVYWGIDVPTDLADPKTPLFLAITVGTVVVWAFELMPAAAAGV GMLFLYALFVVPPKVAFATFNTFLPWITFSSLIIADAMLNSGLGKRIALRSMLLLGSSYS KTMIGLMLSGFVVVLLVPALLARVVIYMAIAQGLVAALDVDPKSRTSSSLIFGGFLAAVA PSLFILTGSEINLMGMYSTWDVTGEPKPWTDFLVQMGPLNVLYMAFSTFMIFMIRGKDPL PGENNLENILRVRLAEMGNIKPSEIKVLVLLVVGLLAFIFEKQIGYPGTMVYSLIMLLAF LPGVDLCDGKSFSKLNLSFVFFLASCQAIGMVAGALHVDKWLSDLMLPLLQGQGNTMAVL 
ITYLSGLLINFLLTPMAATGSMCGPLAQLALQLGIDPQVLTYSFLYGLEQYVLPYEIGVF MYIFVTGAITHKHVVPALALRMVFVPIMLAVVAVPYWTFLGLLK >gi|316922979|gb|ADCP01000081.1| GENE 24 23424 - 24017 302 197 aa, chain + ## HITS:1 COG:MA0327 KEGG:ns NR:ns ## COG: MA0327 COG0655 # Protein_GI_number: 20089225 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 195 1 189 191 213 52.0 2e-55 MHVVAFNGSPRKNGNTARLVRHCLDTLESEGLSTEMVQVGGTNIRGCCSCLSCRKKAVEL CAITSDPFNEWIEKMKNAEGIILASPVYMFGTTPEMKALIDRASFIARGRVRRGEQGSMF YRKVGGAISAVRRAGAIQALQTMISMFAVTQMVVPMSNYWTFGMGEAPGEVEKDAEGWRS MEELGRNMAWVIKKLYS >gi|316922979|gb|ADCP01000081.1| GENE 25 24077 - 24493 125 138 aa, chain + ## HITS:1 COG:SMa1331 KEGG:ns NR:ns ## COG: SMa1331 COG2140 # Protein_GI_number: 16263180 # Func_class: G Carbohydrate transport and metabolism; R General function prediction only # Function: Thermophilic glucose-6-phosphate isomerase and related metalloenzymes # Organism: Sinorhizobium meliloti # 1 135 1 135 137 119 43.0 2e-27 MSNLRSFVTVDDVETQSFDWGKLQWLTDPRVTGSQCMVSGIVTLDPGQGHARHNHPGCCE NLFLLEGEGEQMIEKEDGSRETRKVYPGTMISLQRGQYHSTYNVGTGVLRILACYEFAGP EAALRADPGCTVIPPKNA >gi|316922979|gb|ADCP01000081.1| GENE 26 24641 - 25153 -225 170 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFAALSGVASFHRACLLAQDFTHEPACQTGDSAPQGRFRHQPEQMVHPPFAWQLASFPSL PVCGEEAPYCPLRNFATLRAYSPEIKAQQAIFAFLPVYPALSALVGSRMPDYRGIFLRGH GSQTSTHYGTVVHSSVSLGGLQGMVFGRFRPLSIQGGNMEYNSGTRPSAN Prediction of potential genes in microbial genomes Time: Fri May 13 03:36:12 2011 Seq name: gi|316922975|gb|ADCP01000082.1| Bilophila wadsworthia 3_1_6 cont1.82, whole genome shotgun sequence Length of sequence - 3786 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 160 - 534 108 ## gi|302861318|gb|EFL84257.1| putative tail fiber protein - Prom 610 - 669 6.2 2 2 Tu 1 . 
- CDS 869 - 1585 128 ## COG0477 Permeases of the major facilitator superfamily - Prom 1820 - 1879 3.9 - Term 3242 - 3284 9.4 3 3 Tu 1 . - CDS 3418 - 3786 104 ## gi|302861318|gb|EFL84257.1| putative tail fiber protein Predicted protein(s) >gi|316922975|gb|ADCP01000082.1| GENE 1 160 - 534 108 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302861318|gb|EFL84257.1| ## NR: gi|302861318|gb|EFL84257.1| putative tail fiber protein [Desulfovibrio sp. 3_1_syn3] # 1 123 88 196 198 75 42.0 9e-13 MYPELVALIGWNVPNYQGVFLRGYGGQTSYHYGAVGHWSAGLGELQGDGIREIWGELSYL PRSRDGEVGQSGSLAFWNEGRNQWMNDAGKAPSGAMNFYASRSTPVVGEVRPVNRAVRYL IRAR >gi|316922975|gb|ADCP01000082.1| GENE 2 869 - 1585 128 238 aa, chain - ## HITS:1 COG:AGc2694 KEGG:ns NR:ns ## COG: AGc2694 COG0477 # Protein_GI_number: 15888785 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 12 222 275 481 498 116 37.0 3e-26 MGAMALFCLWRINPANIDYAVARGLVAEESGYTQSKSESFRYLFTDKSLLVVGMTLFFFH LGNAALLPLLGQSAVARFDVNAASYTAGTVVLAQVTMILTALWGAHVAQRKGYGPLFYMA LLALPLRGCIAGLWNTPWNIIPVQLLDGVGAGLLGVATPGMVARLLEGGGHINMGLGMVL TIQGIGAALSSTYGGLFAHHSSYDAAFFSLAAAPCLGLLLFLAGVRFLPTLRNAMQNH >gi|316922975|gb|ADCP01000082.1| GENE 3 3418 - 3786 104 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302861318|gb|EFL84257.1| ## NR: gi|302861318|gb|EFL84257.1| putative tail fiber protein [Desulfovibrio sp. 
3_1_syn3] # 2 121 91 196 198 70 42.0 5e-11 AELVALIGWNVPNYQGVFLRGYGGQTSYHYGAVGHWSAGLGELQGDGIREIWGELSYLPR SRDGEVGQSGSLAFWNEGRNQWMNDAGKAPSGAMNFYASRSTPVVGEVRPVNRAVRYLIR AR Prediction of potential genes in microbial genomes Time: Fri May 13 03:36:37 2011 Seq name: gi|316922973|gb|ADCP01000083.1| Bilophila wadsworthia 3_1_6 cont1.83, whole genome shotgun sequence Length of sequence - 2236 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 251 96 ## - Prom 299 - 358 1.6 + Prom 134 - 193 2.3 2 2 Tu 1 . + CDS 309 - 1763 532 ## COG4973 Site-specific recombinase XerC + Prom 1783 - 1842 2.4 3 3 Tu 1 . + CDS 1885 - 2236 169 ## Nmul_A1248 integrase catalytic subunit Predicted protein(s) >gi|316922973|gb|ADCP01000083.1| GENE 1 2 - 251 96 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVCLACHPPERKAQQPLFVFLPVYPELVALIGWNVPNYQGVFLRGYGGQTSYHYGAVGHW SAGLGELQGDGIREIWGELSYLP >gi|316922973|gb|ADCP01000083.1| GENE 2 309 - 1763 532 484 aa, chain + ## HITS:1 COG:XF1483 KEGG:ns NR:ns ## COG: XF1483 COG4973 # Protein_GI_number: 15838084 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerC # Organism: Xylella fastidiosa 9a5c # 94 326 40 275 294 73 24.0 1e-12 MATIRYSMRQKKWIVTWKVRGATFHHECASEESALSFSEIQRTISQKEKERFTQTKPIRK SITIRELFDHHFSLCSKRPITIKQDLYHARPLLQSLGTKQVASITQKEILAFCKSQKASG LAQSTIHRRISLLRSIASWGKREGYLSSAIENCSVSKGKARRTPPPTVSELNALLRVAVP HVQRVILLGIYTGARIGPSELFSLKWDDVDLTHGVISMPNANKGSKLKKRDIPIRKDLIP TLARWQHQDADCPYVINWSGKPVRKIAAAWKTALQKSGIRAIRPYDLRHAYATYSIRSGA DIKTSTEIMGHENARMILEVYEHVDWAQKVRAIEGIPDFFALAPKAVRNPIHERRPRKRE IDEPGAKYCVRNTVPALNDRKHLIHSEIENQYGNPADKDADHKRNAKTQNPCGPECGSQK ERGHKYELSQQTEQKKLKEEIHKPPPFVAKAEYRAEKQHDRTQPRNDSTHDNLHDGAPQW DAPG >gi|316922973|gb|ADCP01000083.1| GENE 3 1885 - 2236 169 117 aa, chain + ## HITS:1 COG:no KEGG:Nmul_A1248 NR:ns ## KEGG: Nmul_A1248 # Name: not_defined # Def: integrase 
catalytic subunit # Organism: N.multiformis # Pathway: not_defined # 1 116 1 116 347 187 72.0 1e-46 MESFNQNVIKHKTGLLNLAAELGNISKACKMMGFSRDTFYRYQAARDAGGVEALFEVSRR KPNLKNRVEEAIEVAVTAFAVDFPAYGQTRASNELRKQGIFVSPSGVRSIWMRHDLA Prediction of potential genes in microbial genomes Time: Fri May 13 03:36:53 2011 Seq name: gi|316922968|gb|ADCP01000084.1| Bilophila wadsworthia 3_1_6 cont1.84, whole genome shotgun sequence Length of sequence - 8445 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 5, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - 5S_RRNA 15 - 73 93.0 # AE015927 [R:2797299..2798807] # 5S ribosomal RNA # Clostridium tetani E88 # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. 1 1 Tu 1 . + CDS 81 - 536 -131 ## + Prom 548 - 607 1.7 2 2 Tu 1 . + CDS 712 - 5109 3339 ## COG0642 Signal transduction histidine kinase - Term 5497 - 5537 6.3 3 3 Tu 1 . - CDS 5572 - 6153 693 ## DvMF_1121 hypothetical protein - Term 6233 - 6274 9.1 4 4 Tu 1 . - CDS 6288 - 6710 263 ## PROTEIN SUPPORTED gi|90022209|ref|YP_528036.1| ribosomal protein S2 - Prom 6840 - 6899 2.4 + Prom 6848 - 6907 3.0 5 5 Tu 1 . 
+ CDS 6935 - 8209 1672 ## COG0019 Diaminopimelate decarboxylase + Term 8376 - 8430 15.2 Predicted protein(s) >gi|316922968|gb|ADCP01000084.1| GENE 1 81 - 536 -131 151 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTTSLPISLWAKTQSFLFPTRSSLVKEPEAVASAAAFPSAAKRELTPPGPACQQLFFRT PKFFFGARREDFSPAQKRELIPTPSRCQPLSSAFFKKLVRPAVPGLSALPGSARHSVPQR RERFMHLHAASVNDFLFFFFTPGNAPLTSGV >gi|316922968|gb|ADCP01000084.1| GENE 2 712 - 5109 3339 1465 aa, chain + ## HITS:1 COG:slr2104_3 KEGG:ns NR:ns ## COG: slr2104_3 COG0642 # Protein_GI_number: 16330590 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 541 706 23 189 270 65 29.0 7e-10 MHSFLPFKHSRLRVYMPLFAALLLVLTPGMGAADPAPASPVPEVSTATPSVVQEALQAVV EQTAPETPAAPVVQTASALPAVSGEPGSALAKEGASVPLQAEWLLDPKGTHTLTDVLSPN AQKEFKPYEPESLPHQAGTIWMRLEVEGKAPVFPLVLDLNTRIAGQLPGIPQVWLVRPGE SNGTPVRPSSDGLYSLPNPLPDQTGIYIRVNGIPAPGFAPMLRNAASLTLVDELGTQPQL VLLAVLLFLCLLRGVTERREWRMWAALYIAAVWVQAFWGLPTTPAGEVSRWDMPGLLAPG VALLILPHVGRHMLRTRHHAPFIDMQFVLLALFGIALSVAPLIPGYTWTLQFLPLWPLFM LLLLPGTLAACARRLPGAKRFLLICILPPLGMLALFPLSRLLPESLIGLLPHALDGFMTP GVVSLIPLTGLTLSALFAALSPSPKPLPAPNRSTRDTKTGRAGVAALELGGASKASETSF ERLPSLSAEPEGLELANFPNKKPSSANVHSEPARAAEASRPTSRTPEKRPYPQALQAPLS AGMVEESLRAPLDALLRAISAVDQSQLSAEARRRTDALGVAGRNLATAIGNMGRGVSLSQ DFDRKERFDLNQLLLETHEAVSSLAESKNLGLSWFTAPYLPRCYEGRRAQLANVLALLVE SAVLATDRGMVQIRAQRLPESTDPGHLLFTVSDTGSGMPPLERSTLALVRTWELVGPDGE LVSLESGPKGTTISFSMRLTARIEQQPIPAAPEPDKETLSRLPASSLRIIVASNVPANRQ MLSYYLDELPHEIIEARSAEEAKALYRRTPGALIIFDDDMPEESIADAVADIRIFEGEHN FPLASILALVNSNEQIDALRRAGCTHFLKKPITRKDLRVLTLRLAPVSRRFKDTDGAPQK TAPKPQAPSAKGSPRVSPDIPNLPELPEPAKPLAPALSAPRDSHAAEPAPVKRGIFASLF ARFRKQAKPAAPEAPTVVDQVDEPVMTLTEKAPEKPIKLSSVGEPMPISKASPEEAPAPR PRPERPNPLEERAKHMPAQSSAPTNAAEWVGEPMPITKKQDSDQGLAKQVEAPLEMPTQE KEAEKRPAPEHASAPQRSEAATPLNTVPDLNLDEWVGEPTPIIKPLPVSAPLPAPEEAPL TLEPEQKVSPKKRDNAPLTLAPADDAPLTLTAQTDKADEPLMLGTPLPDKKSSLSNDVLM LDEPLGNQPELSLAGLGLEPQAEVINLTEPVRKPTPAQPAEKDAPIPDLFGDARPAPQPG 
PRSLLDEAACLAPCGQGSRPTASTNELAIDDIVELGVPVAPVIQPATEKDIALPEPQEDT LPHVLPAGDGEAAAQEQPVPAIPETPEQEPALPTKDGQKGPLQDAEQESEASDEAIRALL TELDEALERAIQGEQSGDAQMVCQAAAHIGRLAETYDLRVLDDPARCLEEVACSGNMDEI VQLMPDLVSAINRNRASFEEAERDG >gi|316922968|gb|ADCP01000084.1| GENE 3 5572 - 6153 693 193 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1121 NR:ns ## KEGG: DvMF_1121 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 189 1 189 193 257 61.0 1e-67 MKVVFEITGLSCDLALHPISPQTAESMREGGRDIYSQKYMNWWRRGNTRTFGMRLNESSE IRLQVDGEDVSFNANLLYRNVYTLRQRMYLSSKAQFLAVLGYDNETCTFRWIWENIDHFD ASKFNFVVTDWDNVLGTKGYRVLDNVFYNDKHADDEEWLNPSGFTLMDPIVIDLEDVRRE IEEELKGTADYGR >gi|316922968|gb|ADCP01000084.1| GENE 4 6288 - 6710 263 140 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90022209|ref|YP_528036.1| ribosomal protein S2 [Saccharophagus degradans 2-40] # 1 140 6 146 151 105 36 9e-23 MAVKDGNTVRVHYTGTFSDGEVFDSSREREPLEFTIGDGSLIPGFEDALLGHNAGDRFTV TIPADEAYGEHLEELLMEVPVSEVPEDIKPEVGMMLQIATDDGDMEVQIVEVNDKVVVLD ANHPLAGEDLTFDIEVIDVK >gi|316922968|gb|ADCP01000084.1| GENE 5 6935 - 8209 1672 424 aa, chain + ## HITS:1 COG:MTH1335 KEGG:ns NR:ns ## COG: MTH1335 COG0019 # Protein_GI_number: 15679335 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Methanothermobacter thermautotrophicus # 14 419 14 417 428 301 42.0 2e-81 MSNVRSAQTDKVRFFGNTTPQELAAQYGTPLYVYNEDVLRRRCRELLSLSSLPGFHVNYS AKANTNLALLRIVREEGCHADAMSPGELHINKLAGFTPDRLLYVCNNVSAEEMKNAADNG LIVSVDSLSQLDQYGKVNPGGKVMIRINPGIGAGHHKKVITAGKETKFGIDPTSLDEVRA LLKKHSLTLAGVNQHIGSLFMEPDNYLNAIEFLLHFVQSDLADLLPGIEIIDFGGGLGIP YRKYEEEPRLDMAELGRRLHALLSAWVEETGYKGKFFIEPGRYVVAECGVLLGTVHATKF NGENRYVGTDLGFNVLVRPAMYDSFHDIEIFRDGGEPDTDLVEQSIVGNICESGDILAKK RMLPLIKEGDIVAALDAGAYGFVMSSSYNQRPRAAEVLITSDGTPKLIRRRETLDDLTRC FVEE Prediction of potential genes in microbial genomes Time: Fri May 13 03:37:20 2011 Seq name: gi|316922953|gb|ADCP01000085.1| Bilophila wadsworthia 3_1_6 cont1.85, whole genome shotgun sequence 
Length of sequence - 17079 bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 9, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 117 - 2456 2561 ## COG2217 Cation transport ATPase - Prom 2554 - 2613 4.0 + Prom 2447 - 2506 4.2 2 2 Op 1 . + CDS 2564 - 2758 336 ## Dvul_0933 heavy metal transport/detoxification protein 3 2 Op 2 . + CDS 2818 - 3285 458 ## COG0219 Predicted rRNA methylase (SpoU class) 4 3 Op 1 . + CDS 3841 - 5214 1605 ## COG0534 Na+-driven multidrug efflux pump 5 3 Op 2 . + CDS 5227 - 6117 1193 ## COG4866 Uncharacterized conserved protein + Term 6171 - 6208 9.4 - Term 6159 - 6196 9.4 6 4 Op 1 . - CDS 6259 - 7110 316 ## PROTEIN SUPPORTED gi|219848628|ref|YP_002463061.1| ribosomal protein L11 methyltransferase 7 4 Op 2 . - CDS 7120 - 7848 224 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) - Prom 7931 - 7990 4.1 8 5 Tu 1 . + CDS 8042 - 9352 2189 ## COG0281 Malic enzyme + Term 9374 - 9423 12.3 - Term 9362 - 9411 8.5 9 6 Tu 1 . - CDS 9463 - 10026 434 ## DSY0041 hypothetical protein - Prom 10274 - 10333 1.9 - TRNA 10359 - 10435 87.6 # Asp GTC 0 0 - TRNA 10527 - 10603 87.6 # Asp GTC 0 0 - TRNA 10609 - 10684 93.9 # Val CAC 0 0 + Prom 10756 - 10815 3.6 10 7 Op 1 . + CDS 10840 - 11022 84 ## 11 7 Op 2 3/0.000 + CDS 11015 - 12346 1862 ## COG0793 Periplasmic protease 12 7 Op 3 . + CDS 12442 - 13992 1280 ## COG2861 Uncharacterized protein conserved in bacteria + Term 14053 - 14101 5.0 13 8 Tu 1 . + CDS 14146 - 14946 1150 ## COG0345 Pyrroline-5-carboxylate reductase 14 9 Op 1 . + CDS 15129 - 15686 680 ## DVU0566 GAF domain-containing protein 15 9 Op 2 . 
+ CDS 15697 - 16713 1410 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase Predicted protein(s) >gi|316922953|gb|ADCP01000085.1| GENE 1 117 - 2456 2561 779 aa, chain - ## HITS:1 COG:CAC3655 KEGG:ns NR:ns ## COG: CAC3655 COG2217 # Protein_GI_number: 15896888 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 9 777 76 812 818 623 46.0 1e-178 MDQQKEVLFTVGGMHCAACSSRLERVLNGMDGVQSATVSLAANSATVVPDPALSESDTEA LVHQIEERAVDMGFTATPVAPEADMVDTWEAQQKETVGQLATLKARLWPEFGFTILLLLV SMGHMWGLPLPAIIDPMHSPESALNHALLQLVLTLPVLWSGRHFYLTGLPNLWRLTPNMD SLVAMGTGAAFLYSLWNTVEVALGHTGKVMDLYYESAAVLISLISLGKYLEAVSRFRMSD AIGALMNLTPETALRLPAPDKADQAEEVPVKVVRVGDYLQVKPGGRIPVDGVVTNGASSV DASMLTGESMPVPVGVGSSVAGGTMNTTGSFIMRAERVGADTALSRIIKLVREAQSSKAP IARLADDVSLIFVPTVMALAVIAGAGWLYWGHVPVSEAFRIFVAVLVVACPCALGLATPI SIMVATGRGAQMGVLVKNGTALELAGRLDVLVFDKTGTLTEGKPRLLAAESFDPLFDEER ILGYAASLEGVSEHPLALAVTSAAEERNLHLYPVSDFASVSGLGVSGNVDLENESIPLLL GNKRLMEERQVSFDAVKDLDARLAELSDSGATPLLLAVSGRLVGMLAVADTIRPETAGVV KELRQLGLRVIMLSGDNRRTAEAIAKSAGIDEVIADVLPDGKEKVISELQAAGLRVGMIG DGINDAPALARADVGMAMGNGIDVAVEAGDIVLLGSDTRSGKGLRGVVTALELSRAALRN IRENLGWAFGYNILCLPIAAGVLKIFGGPSLSPMIAGAAMAFSSVSVVLNALRLRRFGK >gi|316922953|gb|ADCP01000085.1| GENE 2 2564 - 2758 336 64 aa, chain + ## HITS:1 COG:no KEGG:Dvul_0933 NR:ns ## KEGG: Dvul_0933 # Name: not_defined # Def: heavy metal transport/detoxification protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 63 32 94 96 83 58.0 2e-15 MPTLTVKGMSCNHCKQAVTQALEALPGVSDVNVDLEKGEATWKESQPLDIAEVKKAINKL GFEA >gi|316922953|gb|ADCP01000085.1| GENE 3 2818 - 3285 458 155 aa, chain + ## HITS:1 COG:BH1023 KEGG:ns NR:ns ## COG: BH1023 COG0219 # Protein_GI_number: 15613586 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Bacillus halodurans # 5 154 3 155 157 128 42.0 4e-30 MPTCMHVVLFQPEIPPNTGNVARLCAAMQVSLHLIEPLGFKLEDRYLKRAGLDYWPHVDM 
AVWPDLDAYIRNSGAGRRLVLTSARRGAALHRFEFTENDSLVFGRETSGLPPEVIGLSPH HVRIPIKGEVRSINLSTAAGIVLFQALVSAGLVGE >gi|316922953|gb|ADCP01000085.1| GENE 4 3841 - 5214 1605 457 aa, chain + ## HITS:1 COG:TM0815 KEGG:ns NR:ns ## COG: TM0815 COG0534 # Protein_GI_number: 15643578 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Thermotoga maritima # 25 450 36 457 464 108 25.0 2e-23 MKLNTERIHTIWHLTWPQCIMLLCQFVIGITDVWAGGRIGSEVQASIGLITQCHMMFMAL AMAAVSGAVASISQSLGACKFVRARRYVGLVSIGCIAIGSAIAVAASVWREPLLRLIQTP ESIMPVAIMFLTATIWGIPGQYTLTIGAAVFRSAKMVLIPLYVTGTACLLNVFGDLAFGL GWWGFPAYGATGIAYSTLVSVTVGAALMLFLLMRHELFTRDSFPGWRWIKAGAPYLLKVA GPAFGTSFLWQTGYMVLYVITASLPFGRVNALAGLTTGLRVESILFLPAVAFSMTASVLV GHALGEGNHREAKRTLLATLGIACAGMCCVGAAIWPWRMELAGLIAPDPAVQVETVKYLS FNIMAVPFTVASVVLAGGLNGAGATVYPMVSFSFAVWAVRLPIAWLFGHIIWQDASGVFL STFVSQVVMSLSLLWVTLRCNWTRFALAVRPHTCASR >gi|316922953|gb|ADCP01000085.1| GENE 5 5227 - 6117 1193 296 aa, chain + ## HITS:1 COG:FN0277 KEGG:ns NR:ns ## COG: FN0277 COG4866 # Protein_GI_number: 19703622 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 30 292 25 286 290 117 30.0 3e-26 MLTENFSSIELGDAAAYAPYFRALPLHAADYTFTNLWGWGTHYNLEWRTAHGLCWIRQKR NDLPVERLWAPVGDWYAADWAAMPELVPGTTILRAPEPLCELLLERLPGRVAIEETPGQW EYLYTQEALSTLAGNKLHKKKNHVNGYMKAYGEDYRALNGEIMPQVLALQDDWCKWRECE KSASLLAESDVVCSVLRNWDALPGLIGGALYAGEEMAAFAVGEPLDDQTIVVHFEKGRPE YRGVYQAINFCFAKYAAKNFVFINREQDADEEGLRQAKESYMPSGYLKKNTIRIVK >gi|316922953|gb|ADCP01000085.1| GENE 6 6259 - 7110 316 283 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|219848628|ref|YP_002463061.1| ribosomal protein L11 methyltransferase [Chloroflexus aggregans DSM 9485] # 1 280 1 324 327 126 31 1e-28 MATPDLIRLDIVVDEEQYDTVAGLLVRNISYGWEEDSLPTGETRFRVHCDNAIVQENLLS ALRAWLPGLDVEQTSIPRQDWTVAWREFFTPVRAGQFIVLPPWLLESTPLEGRNPIIIEP KSAFGTGHHNTTVLCLEAITELLASGRLKAGQRFFDVGTGSGILGIACCLNGLAGLGSDI DPVAVDNALENVVINKVADDFRIVPGSAEAGEGERFDLVVANILAGPLRELAPALIARMK 
PGACLVLSGLLDVQADAVEAAYAELGKARRVQSGDWVALVWGE >gi|316922953|gb|ADCP01000085.1| GENE 7 7120 - 7848 224 242 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 18 218 20 221 223 90 31 5e-18 MSLIEIRDVHKWYGEFHVLKGISERVEKGEVLVICGPSGSGKSSLIRCLNRLEPIQKGDI LFEGTSIYAPEVDVNALRAEVGMVFQQFNLYPHLTVLENVTLAPIKVRGMGKKEATDLAM ELLGRVGIARQAAKRPSEISGGQQQRVAIARSLAMQPRAMLFDEPTSALDPEMINEVLSC MKDLAKDGMTMVCVTHEMGFAREVSDRVIFMDHGVILEEGTPEQFFTNPQHERTKAFLRE IL >gi|316922953|gb|ADCP01000085.1| GENE 8 8042 - 9352 2189 436 aa, chain + ## HITS:1 COG:CC3549_1 KEGG:ns NR:ns ## COG: CC3549_1 COG0281 # Protein_GI_number: 16127779 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Caulobacter vibrioides # 4 426 14 438 453 489 60.0 1e-138 MALFTKEESLAYHEEPRPGKLEVLPVKPCFTQKDLSMAYSPGVANACLAIADDPSLANAY TGRGNLVAVVSNGTAVLGLGNIGPLAGKPVMEGKSVLFKAFADVNAYDINLALTDPDKII EVVKALEPTFGGINLEDIKAPECFYIEEVLKKEMNIPVFHDDQHGTAIISGAGLINALEI TGKKAEDFIVVVSGAGAAAIACAKFYTELGIDPANIRMFDSKGILHKGRTDLNKYKAQFA LSEDMTMTEALKGADLFLGLSKKDLLTPDMIKGMADHPVIFACANPDPEIAYPLAMETRP DCIMGTGRTDFPNQINNLSGFPYIFRGALDVYATEINEAMKLAAARALAALAKEPVPAEV SAAYQGQTFSFGQGYVIPKPFDPRLIEWLPPAVAQAAMDTGVARKPIEDMDAYKASLRVR IQKAQGRANALIESYK >gi|316922953|gb|ADCP01000085.1| GENE 9 9463 - 10026 434 187 aa, chain - ## HITS:1 COG:no KEGG:DSY0041 NR:ns ## KEGG: DSY0041 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 3 158 2 157 165 183 51.0 3e-45 MKKQLILVGGAMGVGKSAVCRELLRQLTPGVWLDGDWCWNMNPFVVSEENKRMVLSNITH LLRAYLNNSSYRYVLFCWVMDQPLLFEAVLGPLRDIPFTLHSFSLVCTEQALRERLERDV RDGIREADVIPRSLRRLPAYAALPTCKLDVTSLTPYEAACAIAASVGRASSRRVPGLSGA GESWHGA >gi|316922953|gb|ADCP01000085.1| GENE 10 10840 - 11022 84 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNLQGFPGGMSAPSSRPIPAGGKETRKYALSEHLRFQKNPGIFNPIVFSYCFQLRRLLYV >gi|316922953|gb|ADCP01000085.1| GENE 11 11015 - 12346 1862 443 aa, chain + ## HITS:1 
COG:PA5134 KEGG:ns NR:ns ## COG: PA5134 COG0793 # Protein_GI_number: 15600327 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Pseudomonas aeruginosa # 35 440 43 434 436 330 47.0 4e-90 MYRKSRAVALFLLSLLTVGSLTFGQAFAAGEDSRFDALKRFSQVLDIVERYYVKETPRPD LVNGALKGMLESLDPHSTMLSKEEFKDMQESTSGEFFGIGIEITMENNQLTVVTPIEDTP ADKGGMKSGDIILAVGGKPTLEMTLQEAVSHIRGPKGSEVVLTILHRDSKEPVDLRIKRD AIPLISVKSRELEPGYYWVRLTRFSERTTQELLDALSDAKRKGPIKGIILDLRNNPGGLL DQAVSVSDAFLNKGVIVSMRGRQEETAREFVAKPQDTDIIDTPLVVLVNGGSASASEIVA GALGDQKRALLVGERTFGKGSVQNIIPLSDGSGLKLTVALYYTPSGRSIQAEGIMPDLEV PFEAPKEKPAPLSSLRMIREKDLNKHLEKTDGDKGKASGKKVETPAPAANEPLPDAKEFL ERDNQLRMSLQFVKSLPKIRTIH >gi|316922953|gb|ADCP01000085.1| GENE 12 12442 - 13992 1280 516 aa, chain + ## HITS:1 COG:PA5135 KEGG:ns NR:ns ## COG: PA5135 COG2861 # Protein_GI_number: 15600328 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 269 489 26 240 257 93 32.0 8e-19 MEWFPALRKRLAARGVHLPDPYPYTARTFKGSVLATLLAAILFLIAYAVWEAANYTARNV QRTAFMTIETTAATSRLVTEKAGQAVEALFPKPAPVTEASLESVLPPGSHLGNQMKDAIR QNTLPYQEGSGTLTDAARQIDFALLQTVLRLKLDKGRILLLTSDYRTQGKDIYHLQRMRI YLPPVVQDPAQQAPRNENGNAPSAETGDVKGPPEPVFRFLNALAESLDTWADRAVLTESP GKLTLATKGVLTHEIWFEATTDAFPILPPTDKAPRLTIVMNELGGDKAVTASLLALKLPI TFSVLPFAKDAAATATAAHEAGQEVLVDMPMETMQSPFVKAGPGEITTKMSEEDMRILMD DALGHVPYATGASNFMGSRLTTDTAATRRFCEILARSGLYVLDDVTHQESILYAEARRRG LPAWRRALTLNDGPKTEGAVLADLKKAEETARAKGHAVVIATPAPHVLAALKRWSQERDK DIRLVPLRLQPVEEETFASEDIPSERQDDPTSPIEH >gi|316922953|gb|ADCP01000085.1| GENE 13 14146 - 14946 1150 266 aa, chain + ## HITS:1 COG:lin0414 KEGG:ns NR:ns ## COG: lin0414 COG0345 # Protein_GI_number: 16799491 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Listeria innocua # 10 266 2 263 266 166 38.0 4e-41 MDTPASLTSKRIGCIGCGNMGGAILGGLAEVPGLELYGYNRTPQRLEPLCAKGVTAVPDI PGIAARCDILVIGVKPYLVGGVLAEALPSLKPETVVISIAAGVTLHDLRDAVQGRCHVVR 
VMPNTPALVGAGVFGIQEDPALPKDVFAMILDLFGLLGSTIVLPEKKFNAFMALVGCGPA YVFHFMDALAEAGVTMGFTRQEALELVTQLVLGSAKLAALPGSHPAILREQVCSPAGVTI AAVNHLDRTAVRGHLIDAVLAAYAKS >gi|316922953|gb|ADCP01000085.1| GENE 14 15129 - 15686 680 185 aa, chain + ## HITS:1 COG:no KEGG:DVU0566 NR:ns ## KEGG: DVU0566 # Name: not_defined # Def: GAF domain-containing protein # Organism: D.vulgaris # Pathway: not_defined # 1 185 1 185 185 270 74.0 2e-71 MTGHDYFRALLEVATVINSSLEPTVVLHKITEQTAKAMNCKASTLRLLDRTGKLLLASAA WGLSSGYMRKGPVEVAKSGLDGEVLKGKLIHLRDACSDGRFQYPESAKAEGLVSVLSAPL MVNGKAIGILRVYSSEERDFTPDECDFMLGVANISAIAIENARMHEATRLNYELLTSYNY QVFED >gi|316922953|gb|ADCP01000085.1| GENE 15 15697 - 16713 1410 338 aa, chain + ## HITS:1 COG:BH3560 KEGG:ns NR:ns ## COG: BH3560 COG0057 # Protein_GI_number: 15616122 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Bacillus halodurans # 5 338 4 335 335 392 60.0 1e-109 MQKVRVGINGFGRIGRQVFRALHAKYQDKVEVVAINDLFDAETNFHLLEYDTNYGRANLD AVVEGNNATVGDWKIHCFAERDPKLLTWGAYGVDVVVESTGIFRSAKQAHVHIENGAKKV IITAPAKEEDLTIVMGVNHNDYDPAKHHVVSNASCTTNCLAPVALVVERLFGIVSGAMTT VHAYTNDQRILDLPHKDLRRARAAACNIIPTSTGAAQAVAKVIPSLKGKFTGWSLRVPTP TVSVVDFTAILNKDTDTDTMRAALKEAAEGELKGILAYSDAQLVSMDFKGNPHSSIVEAE YTTVQDGKLAKIVSWYDNEWGYSNRVADLIMWMKEKGF Prediction of potential genes in microbial genomes Time: Fri May 13 03:39:36 2011 Seq name: gi|316922900|gb|ADCP01000086.1| Bilophila wadsworthia 3_1_6 cont1.86, whole genome shotgun sequence Length of sequence - 39963 bp Number of predicted genes - 53, with homology - 52 Number of transcription units - 18, operones - 6 average op.length - 6.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 19 - 53 5.1 1 1 Op 1 50/0.000 - CDS 127 - 510 502 ## PROTEIN SUPPORTED gi|218885214|ref|YP_002434535.1| 50S ribosomal protein L17 2 1 Op 2 26/0.000 - CDS 500 - 1543 988 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 3 1 Op 3 36/0.000 - CDS 1556 - 2182 964 ## 
PROTEIN SUPPORTED gi|94987423|ref|YP_595356.1| 30S ribosomal protein S4 4 1 Op 4 48/0.000 - CDS 2207 - 2596 633 ## PROTEIN SUPPORTED gi|46579738|ref|YP_010546.1| 30S ribosomal protein S11 5 1 Op 5 . - CDS 2693 - 3061 573 ## PROTEIN SUPPORTED gi|94987421|ref|YP_595354.1| 30S ribosomal protein S13 6 1 Op 6 . - CDS 3079 - 3192 199 ## PROTEIN SUPPORTED gi|220903956|ref|YP_002479268.1| ribosomal protein L36 7 2 Op 1 2/0.000 - CDS 3295 - 4062 445 ## COG0024 Methionine aminopeptidase 8 2 Op 2 53/0.000 - CDS 4065 - 5378 1190 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 9 2 Op 3 . - CDS 5386 - 5841 625 ## PROTEIN SUPPORTED gi|46579733|ref|YP_010541.1| 50S ribosomal protein L15 10 2 Op 4 . - CDS 5838 - 6020 194 ## PROTEIN SUPPORTED gi|220903952|ref|YP_002479264.1| ribosomal protein L30 11 2 Op 5 56/0.000 - CDS 6026 - 6517 720 ## PROTEIN SUPPORTED gi|94987417|ref|YP_595350.1| 30S ribosomal protein S5 12 2 Op 6 46/0.000 - CDS 6539 - 6901 434 ## PROTEIN SUPPORTED gi|94987416|ref|YP_595349.1| 50S ribosomal protein L18 13 2 Op 7 55/0.000 - CDS 6914 - 7456 728 ## PROTEIN SUPPORTED gi|220903949|ref|YP_002479261.1| ribosomal protein L6 signature 1 14 2 Op 8 50/0.000 - CDS 7468 - 7848 536 ## PROTEIN SUPPORTED gi|46579728|ref|YP_010536.1| 30S ribosomal protein S8 15 2 Op 9 50/0.000 - CDS 7877 - 8062 308 ## PROTEIN SUPPORTED gi|218885200|ref|YP_002434521.1| 30S ribosomal protein S14 16 2 Op 10 48/0.000 - CDS 8075 - 8614 855 ## PROTEIN SUPPORTED gi|46579726|ref|YP_010534.1| 50S ribosomal protein L5 17 2 Op 11 57/0.000 - CDS 8625 - 8945 472 ## PROTEIN SUPPORTED gi|46579725|ref|YP_010533.1| 50S ribosomal protein L24 18 2 Op 12 50/0.000 - CDS 8957 - 9325 576 ## PROTEIN SUPPORTED gi|46579724|ref|YP_010532.1| 50S ribosomal protein L14 19 2 Op 13 . - CDS 9336 - 9602 400 ## PROTEIN SUPPORTED gi|46579723|ref|YP_010531.1| 30S ribosomal protein S17 20 2 Op 14 . 
- CDS 9607 - 9792 216 ## PROTEIN SUPPORTED gi|78357292|ref|YP_388741.1| 50S ribosomal protein L29 21 2 Op 15 50/0.000 - CDS 9794 - 10207 663 ## PROTEIN SUPPORTED gi|218885194|ref|YP_002434515.1| 50S ribosomal protein L16 22 2 Op 16 61/0.000 - CDS 10207 - 10848 1030 ## PROTEIN SUPPORTED gi|220903940|ref|YP_002479252.1| ribosomal protein S3 23 2 Op 17 59/0.000 - CDS 10852 - 11190 465 ## PROTEIN SUPPORTED gi|94987407|ref|YP_595340.1| 50S ribosomal protein L22 24 2 Op 18 60/0.000 - CDS 11203 - 11484 442 ## PROTEIN SUPPORTED gi|218885191|ref|YP_002434512.1| 30S ribosomal protein S19 25 2 Op 19 61/0.000 - CDS 11493 - 12323 1317 ## PROTEIN SUPPORTED gi|218885190|ref|YP_002434511.1| 50S ribosomal protein L2 26 2 Op 20 61/0.000 - CDS 12326 - 12613 337 ## PROTEIN SUPPORTED gi|218885189|ref|YP_002434510.1| 50S ribosomal protein L23 27 2 Op 21 58/0.000 - CDS 12624 - 13244 852 ## PROTEIN SUPPORTED gi|46579715|ref|YP_010523.1| 50S ribosomal protein L4 28 2 Op 22 40/0.000 - CDS 13260 - 13889 924 ## PROTEIN SUPPORTED gi|218885187|ref|YP_002434508.1| 50S ribosomal protein L3 29 2 Op 23 . - CDS 13901 - 14218 518 ## PROTEIN SUPPORTED gi|46579713|ref|YP_010521.1| 30S ribosomal protein S10 + Prom 14899 - 14958 3.6 30 3 Tu 1 . + CDS 15009 - 16091 1436 ## COG1253 Hemolysins and related proteins containing CBS domains + Term 16313 - 16356 6.4 31 4 Tu 1 . - CDS 16629 - 17996 1681 ## COG0477 Permeases of the major facilitator superfamily + Prom 18178 - 18237 4.6 32 5 Tu 1 . + CDS 18480 - 18890 520 ## Dvul_1942 cytochrome c-type biogenesis protein CcmE 33 6 Op 1 . + CDS 19044 - 21029 2491 ## COG1138 Cytochrome c biogenesis factor 34 6 Op 2 . + CDS 21029 - 21688 169 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 35 6 Op 3 14/0.000 + CDS 21685 - 22359 925 ## COG2386 ABC-type transport system involved in cytochrome c biogenesis, permease component 36 6 Op 4 . 
+ CDS 22364 - 23038 985 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component + Prom 23086 - 23145 1.8 37 7 Op 1 . + CDS 23243 - 23380 288 ## 38 7 Op 2 . + CDS 23373 - 24008 841 ## DVU1045 hypothetical protein + Term 24027 - 24070 12.0 - Term 24244 - 24277 2.0 39 8 Tu 1 . - CDS 24340 - 24564 285 ## DVU1905 hypothetical protein - Prom 24641 - 24700 3.6 40 9 Tu 1 . + CDS 25023 - 26075 1566 ## COG0136 Aspartate-semialdehyde dehydrogenase + Term 26079 - 26106 0.1 41 10 Tu 1 . + CDS 26208 - 27149 1336 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase + Prom 27579 - 27638 5.8 42 11 Op 1 24/0.000 + CDS 27747 - 28955 1011 ## COG0004 Ammonia permease 43 11 Op 2 . + CDS 28968 - 29306 268 ## COG0347 Nitrogen regulatory protein PII + Term 29425 - 29462 5.0 + Prom 29441 - 29500 2.5 44 12 Tu 1 . + CDS 29550 - 29879 430 ## LI0949 hypothetical protein + Term 29907 - 29948 9.8 - Term 29960 - 30002 7.0 45 13 Tu 1 . - CDS 30094 - 31164 862 ## COG0438 Glycosyltransferase - Term 31355 - 31398 15.1 46 14 Tu 1 . - CDS 31418 - 31984 681 ## DVU1176 hypothetical protein - Prom 32201 - 32260 5.4 + Prom 31970 - 32029 3.8 47 15 Tu 1 . + CDS 32144 - 32521 544 ## DVU1174 hypothetical protein + Term 32620 - 32658 -0.6 - Term 32782 - 32810 -1.0 48 16 Op 1 . - CDS 32819 - 33367 430 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 49 16 Op 2 . - CDS 33351 - 34103 576 ## Geob_0103 hypothetical protein 50 16 Op 3 2/0.000 - CDS 34166 - 36025 1976 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division - Prom 36252 - 36311 4.6 - Term 36191 - 36240 3.2 51 16 Op 4 . - CDS 36354 - 37580 1461 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Term 37603 - 37636 4.0 52 17 Tu 1 . - CDS 37722 - 37994 305 ## Dde_1811 hypothetical protein - Prom 38037 - 38096 2.2 + Prom 38146 - 38205 2.5 53 18 Tu 1 . 
+ CDS 38235 - 39956 1642 ## COG1944 Uncharacterized conserved protein Predicted protein(s) >gi|316922900|gb|ADCP01000086.1| GENE 1 127 - 510 502 127 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|218885214|ref|YP_002434535.1| 50S ribosomal protein L17 [Desulfovibrio vulgaris str. 'Miyazaki F'] # 1 124 1 124 133 197 75 6e-50 MRHSNSGKKLGRTPSHRKALFRNMASAMITYGKIRTTEVKAKELRRVVEPLITLALRNDL HSRRLAYETLNDHKLVQRLFDVVAPLFAGVPGGYTRITKMALPRKGDCAPMAILEFTRQP EAAAEKN >gi|316922900|gb|ADCP01000086.1| GENE 2 500 - 1543 988 347 aa, chain - ## HITS:1 COG:CC1272 KEGG:ns NR:ns ## COG: CC1272 COG0202 # Protein_GI_number: 16125521 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Caulobacter vibrioides # 13 332 4 322 338 326 51.0 5e-89 MLFRQGNRLINSRNWAKLVRPEQIVCDSDPNDTMYGKFTYEPLERGYGVTIGNALRRVLL SSLQGAAFVAVKISGVQHEFTTIPGVLEDITDVILNIKQVRLGMTTDEPQHLTLSVSKKG PVTAADIMTNANVEVLNPELHIATLTEDMELSMEFEVRMGKGYVPADMHEGLPDTIGLIK LDASFSPVRKVAYTVEQARVGQMTNYDKLILEVWTDGSVLPEDAIAYSAKIIKDQISVFI SFDERISGESGGEGGGSADINDNLFKGIDELELSVRATNCLRSANIATVGELVQRPEAEM LKTKNFGKKSLDEIKAVLESMGLDFGMKIDNFEKKYQEWKRKQHHEA >gi|316922900|gb|ADCP01000086.1| GENE 3 1556 - 2182 964 208 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|94987423|ref|YP_595356.1| 30S ribosomal protein S4 [Lawsonia intracellularis PHE/MN1-00] # 1 208 1 208 208 375 85 1e-103 MAKYNDAKCRLCRREGTKLLLKGDRCFTDKCAFDRRPYAPGQHGRARKKLSDYAVQLREK QKVRRVYGVLEKQFHGYFVHADMAKGVTGANLLAFLERRLDNVVYRLGFANSRTQARQLV RHGVFTLNGHKVTIPSLQVKVGDSVEVPEKSRTIPVLAEAQEAVARRGCPSWLEVDAPNF KGIVKALPQRDDIQFPINEHLIVELYSK >gi|316922900|gb|ADCP01000086.1| GENE 4 2207 - 2596 633 129 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579738|ref|YP_010546.1| 30S ribosomal protein S11 [Desulfovibrio vulgaris subsp. vulgaris str. 
Hildenborough] # 1 129 1 129 129 248 91 4e-65 MARPKRVAKKREKKNVPLGVAHIQASFNNTIITFTDTRGNTVSWASAGQSGFKGSRKSTP FAAQVAAEQAARKAQDNGMRTVGIYVKGPGSGREAAMRAINAAGFKVAFIRDVTPIPHNG CRPPKRRRV >gi|316922900|gb|ADCP01000086.1| GENE 5 2693 - 3061 573 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|94987421|ref|YP_595354.1| 30S ribosomal protein S13 [Lawsonia intracellularis PHE/MN1-00] # 1 122 1 122 122 225 86 4e-58 MARIAGVDLPRGKRADIALTYIYGIGRVTALKILDASGVNWTRGIDDLTAEELNEVRKEL EQSYKVEGDLRREISMNIKRLMDIGCYRGLRHRKGLPVHGQRTHTNARTRKGPRRGAVGK KK >gi|316922900|gb|ADCP01000086.1| GENE 6 3079 - 3192 199 37 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|220903956|ref|YP_002479268.1| ribosomal protein L36 [Desulfovibrio desulfuricans subsp. desulfuricans str. ATCC 27774] # 1 37 1 37 37 81 100 9e-15 MKVRPSVKKICPKCKVIRRKGVLRVICENPRHKQRQG >gi|316922900|gb|ADCP01000086.1| GENE 7 3295 - 4062 445 255 aa, chain - ## HITS:1 COG:all1019 KEGG:ns NR:ns ## COG: all1019 COG0024 # Protein_GI_number: 17228514 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Nostoc sp. 
PCC 7120 # 2 252 19 269 275 226 41.0 5e-59 MKKYRGVFLKNEREIGLLREANRMVAMILDELGRQVRPGLPTMHFEEIVQNMCREFKVKP AFQGMYGFPYGLCCSVNEVIVHGFPSEDVILKEGDIVSFDVGTVYEGFYGDAARTFAVGD VSPEAARLLRVTEESLALAVAEARSGNELNDIAGAVQKHAEGAGFHVVRRFVGHGIGSTL HEKPEVPNYVVTTKPALPLKTGMVLCIEPMITVGTPEVEILDDKWSAVTRDRSLAAHFEH CVAILPGGPQILDLP >gi|316922900|gb|ADCP01000086.1| GENE 8 4065 - 5378 1190 437 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 11 436 19 445 447 462 52 1e-129 MALGGADNLARSPELRTKLLWTLGLLCAYRIGVHIPVPGVDGGALAHFFESVAGTLFGLF DMFSGGGLRNVSVFALGIMPYISASIILQLLQVVSPELKRMAKEEGAAGRRKITQYTRYS TVLITIIQGFGIAVGLETMYSPGGVPVVLEPGWTFRIMTMLTLTAGTVLIMWLGEQITEK GIGNGISLIIFSGIVVGIPGALVKSFQLIKLGDMSLFIALALVILMVAVLIGVVFMERAQ RRIPIQYAKRQVGRKMYGGQSTHLPLRVNTAGVIPPIFASSLLLFPATMANFDIADWLKT AASWFTPSSILYNVIFIALIFFFCFFYTAIIFDPKDVAENLKKAGGFIPGIRPGEKTQEY LDAVLSRLTLWGGVYISVISVLPMLLIAEFNVPFYFGGTSILILVGVAMDFMSQIESHLI SRQYEGLMGKTRIKGRS >gi|316922900|gb|ADCP01000086.1| GENE 9 5386 - 5841 625 151 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579733|ref|YP_010541.1| 50S ribosomal protein L15 [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 148 1 147 148 245 81 3e-64 MNLNELYPFAEDRKSRKRVGRGSGSGLGCTSGKGNKGQNARAGGGVRPGFEGGQMPLQRR LPKRGFKNYLFKVEYEVINIARLVAAFEGKSEISLDDIYDRGLCPFGAPVKILGEGELSA AIKVEAHKFSQSAADKIRAAGGEVKELEVEG >gi|316922900|gb|ADCP01000086.1| GENE 10 5838 - 6020 194 60 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|220903952|ref|YP_002479264.1| ribosomal protein L30 [Desulfovibrio desulfuricans subsp. desulfuricans str. 
ATCC 27774] # 1 57 1 57 58 79 64 3e-14 MSEITIKLAKSRIGCTPNQKKTLDALGLRRREMVKTFPDNPAVRGMIAKVSHLVVEVTKA >gi|316922900|gb|ADCP01000086.1| GENE 11 6026 - 6517 720 163 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|94987417|ref|YP_595350.1| 30S ribosomal protein S5 [Lawsonia intracellularis PHE/MN1-00] # 1 163 1 163 163 281 88 3e-75 MEQNELGFIEKIVSLNRVAKVVKGGRRFSFSALMVVGDGNGNVGFGLGKAQEVPEALRKA TEHARKNMVKVPLIEGTLPYEILGEFGAGRVMLKPASRGTGIIAGGAVRAVMEAAGVNDV LAKAIGTNNPHNVLRATMAGLAALRSADAVSEIRGMKLEAPRK >gi|316922900|gb|ADCP01000086.1| GENE 12 6539 - 6901 434 120 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|94987416|ref|YP_595349.1| 50S ribosomal protein L18 [Lawsonia intracellularis PHE/MN1-00] # 1 120 1 120 120 171 70 5e-42 MSTMTKNEARQRRKIRIRKKISGTAERPRLVIFRSNLHMYAQVVDDLTGATLAATSTLVL SKGGEKVSCNKAGAEAVGKEIARLAKEKSIEKVVFDRNGYLYHGKIKAVADGAREGGLEF >gi|316922900|gb|ADCP01000086.1| GENE 13 6914 - 7456 728 180 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|220903949|ref|YP_002479261.1| ribosomal protein L6 signature 1 [Desulfovibrio desulfuricans subsp. desulfuricans str. ATCC 27774] # 1 178 1 178 179 285 79 4e-76 MSRIGKLPIAIPSGVEVKVGADVVEVKGAKAALTTPVCDLLSYEVADGHITLTRKAETRE SRAQHGLRRTLLANCIEGVTKGFSKTLEVIGVGYRVAVKGNVIELQVGFSHPVLVELPAG LSAKVEGQKLTIMGADKVLVGEMAARIRRIRKPEPYKGKGIKYETETILRKAGKSGGKGK >gi|316922900|gb|ADCP01000086.1| GENE 14 7468 - 7848 536 126 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579728|ref|YP_010536.1| 30S ribosomal protein S8 [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 126 1 126 126 211 79 7e-54 MMTDPIADMLTRIRNAHLALHKEVSVPSSKMKEAIAAILKQEGYVDDVTVEDRNINIQLK YFKGKPAIEGLKRMSKPGRRVYVGAHEIPRVQNGLGICILSTSHGVLAGDKAHQMKVGGE LLCEIW >gi|316922900|gb|ADCP01000086.1| GENE 15 7877 - 8062 308 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|218885200|ref|YP_002434521.1| 30S ribosomal protein S14 [Desulfovibrio vulgaris str. 
'Miyazaki F'] # 1 61 1 61 61 123 90 2e-27 MSRTSLEVKAARKPKFSSRAYNRCPICGRPHAYMRKFGLCRICFRNMALRGDLPGVRKSS W >gi|316922900|gb|ADCP01000086.1| GENE 16 8075 - 8614 855 179 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579726|ref|YP_010534.1| 50S ribosomal protein L5 [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 179 1 179 179 333 90 7e-91 MTRLENVYREKVVPVLQKEFAYSSPMQIPGIVKISLNIGLGSASQNNKLMEEAMNELSVI AGQKAVMTRAKKSIAAFKLREGMPIGCRVTLRKERMWDFLDKLMNFALPRVRDFRGIPDR GFDGRGNFTLGIKEHAIFPEMEADRIENPKGMNITIVTTATTDKEGKFLLEQLGMPFRK >gi|316922900|gb|ADCP01000086.1| GENE 17 8625 - 8945 472 106 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579725|ref|YP_010533.1| 50S ribosomal protein L24 [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 106 1 106 107 186 78 2e-46 MKQFRIRKDDKVMVIAGKDQGKIGKVLKVLHKHDSVVVEKVNVAKRHVKPNPYRREPGGI VDKEMPIHVSNLMVVCPACAAPTRVGYRYAEDGKKIRFCKKCNETL >gi|316922900|gb|ADCP01000086.1| GENE 18 8957 - 9325 576 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579724|ref|YP_010532.1| 50S ribosomal protein L14 [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 122 1 122 122 226 95 2e-58 MIQVESTLQVADNSGAKKVACIKVLGGSHRRYATVGDIIMVAVKEAIPHSKVKKGDVMQA VIVRTAKEVRRVDGSYIKFDSNAAVLLSKQGEPVGTRIFGPVARELRAKNFMKIVSLAPE VL >gi|316922900|gb|ADCP01000086.1| GENE 19 9336 - 9602 400 88 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579723|ref|YP_010531.1| 30S ribosomal protein S17 [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 88 1 88 88 158 86 4e-38 MSQAMENRKGRMLIGTVVSDKNDKTIVVRVETLVKHPLLKKFVRRRKKFTAHDPMNECGI GDKVKIVEFRPMSRNKRWHLVTILEKAV >gi|316922900|gb|ADCP01000086.1| GENE 20 9607 - 9792 216 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|78357292|ref|YP_388741.1| 50S ribosomal protein L29 [Desulfovibrio desulfuricans subsp. desulfuricans str. 
G20] # 1 60 1 60 62 87 70 9e-17 MKAKELKELGVEELKAKLAEQRQELFNLRFQHTTAQLEKTSSIPAARKNIARILTVLKEK E >gi|316922900|gb|ADCP01000086.1| GENE 21 9794 - 10207 663 137 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|218885194|ref|YP_002434515.1| 50S ribosomal protein L16 [Desulfovibrio vulgaris str. 'Miyazaki F'] # 1 137 1 137 137 259 89 1e-68 MLAPKRVKFRKWQKGRLRGPALRGATIAFGDIGLKTVEHGKLTNQQIEAARIAMMRHIKR GGKVWIRVFPDHPVTAKPLETRQGSGKGAPVGWCAPVKPGRILYEIKGVSLELAKEALTR AAHKLPVKTVIVVREGL >gi|316922900|gb|ADCP01000086.1| GENE 22 10207 - 10848 1030 213 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|220903940|ref|YP_002479252.1| ribosomal protein S3 [Desulfovibrio desulfuricans subsp. desulfuricans str. ATCC 27774] # 1 213 1 213 213 401 92 1e-111 MGQKVHPYGFRLGYNKNWQSRWFSKKDYPAFVYEDHEVRKYVKKLLYSAGISKIEIERAG GKIRLILSTARPGIVIGRKGVEIEKLRNDLRAKFGREFSLEVNEIRRPEVDSQLVAENIA QQLERRVAFRRAMKRTVTMARKFGAEGIKVTCSGRLAGAEIARTEWYRDGRVPLQTLRAD IDYGFAEARTTYGIIGVKAWIYKGEILDKEVER >gi|316922900|gb|ADCP01000086.1| GENE 23 10852 - 11190 465 112 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|94987407|ref|YP_595340.1| 50S ribosomal protein L22 [Lawsonia intracellularis PHE/MN1-00] # 1 112 1 112 112 183 77 1e-45 MESKATAKFVRVSPRKTRLVARNVKGMPVEAAMNLLRFTPNKPAGVILGVVRSALANAEH NASMDVDALVVKEILVNEGPTWKRFMPRAQGRATNIHKRTSHITVILAEGQE >gi|316922900|gb|ADCP01000086.1| GENE 24 11203 - 11484 442 93 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|218885191|ref|YP_002434512.1| 30S ribosomal protein S19 [Desulfovibrio vulgaris str. 'Miyazaki F'] # 1 93 1 93 93 174 86 6e-43 MPRSLKKGPFVDDHLMKKVETAVETNDRRVVKTWSRRSTVLPEMVGITFAVHNGKKFLPV FVTENMVGHKLGEFAPTRTFHGHAADKKTKAKK >gi|316922900|gb|ADCP01000086.1| GENE 25 11493 - 12323 1317 276 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|218885190|ref|YP_002434511.1| 50S ribosomal protein L2 [Desulfovibrio vulgaris str. 
'Miyazaki F'] # 1 276 1 276 276 511 89 1e-144 MAVRKLKPTSAGRRFQTVSDFEEITRTTPEKSLTEGLTKKAGRNNHGRVTSRRRGGGVKR LYRIIDFKRNKLDIPATVAHIEYDPNRTARIALLHYADGEKRYILAPLGVKQGDVIVSGE KADIKPGNALPMARIPVGTVLHNIELNPGRGGQFCRAAGTYAQLVAKEGKYALLRMPSGE VRKVLATCLATIGQVGNIQHENISLGKAGRNRWLGRRPKVRGVAMNPVDHPLGGGEGRSS GGRHPVSPWGMPTKGYKTRDKKKPSSRLIVKRRGQK >gi|316922900|gb|ADCP01000086.1| GENE 26 12326 - 12613 337 95 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|218885189|ref|YP_002434510.1| 50S ribosomal protein L23 [Desulfovibrio vulgaris str. 'Miyazaki F'] # 1 95 1 96 96 134 69 9e-31 MDHTQILLRPVVTEKATFLRDGANQVAFFVHPDANKIEIKKAVETAFNVKVTDVNVVTRK PTERVRNRKVSRVPGWRKAYVTLASGEKIEFFEGV >gi|316922900|gb|ADCP01000086.1| GENE 27 12624 - 13244 852 206 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579715|ref|YP_010523.1| 50S ribosomal protein L4 [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 206 1 206 206 332 81 2e-90 MAVVKVYDQDKTEQGELTLTPEVFEVAVRPEILHLVVRAQLAAKRAGTHSTKTRAMVSGG GVKPWRQKGTGRARAGSNRSPVWRGGAILFGPQPRKYDFKVNKKIRKLALRMALSSRLAG DNLLVVKGFDLPEAKTKVFAKIAGNLGLSKALIIAPEESRNLVLSSRNLPGITLTTPDQL SVYEILKHKQLVLLEGAVEPVETRLK >gi|316922900|gb|ADCP01000086.1| GENE 28 13260 - 13889 924 209 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|218885187|ref|YP_002434508.1| 50S ribosomal protein L3 [Desulfovibrio vulgaris str. 'Miyazaki F'] # 1 209 1 209 209 360 80 7e-99 MAEKMGILGRKLGMTRIFAGDGSAVAVTVVQAGPCPVTQVKTAETDGYNALQIAFEEAKE KHVTKPMKGHFAKAGTALYRKVREIRLDGPAAYEVGAVLTADVFAAGDKIKVTGTSLGKG YQGVMRRWNFKGSKDTHGCEKVHRSGGSIGNNTFPGHVFKGRKMAGHWGAEQVTVQGLEI VDVRTEDNVILIRGSVPGPKNGLVLVRKQ >gi|316922900|gb|ADCP01000086.1| GENE 29 13901 - 14218 518 105 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579713|ref|YP_010521.1| 30S ribosomal protein S10 [Desulfovibrio vulgaris subsp. vulgaris str. 
Hildenborough] # 1 105 1 105 105 204 99 9e-52 MTTVSSDRIRIKLKAYDYRILDKAVAEIVDTARNTGAGVAGPIPLPTNINKFTVNRSVHV DKKSREQFEMRIHKRLMDILEPTQQTVDALGKLSLPAGVDVEIKL >gi|316922900|gb|ADCP01000086.1| GENE 30 15009 - 16091 1436 360 aa, chain + ## HITS:1 COG:sll1254 KEGG:ns NR:ns ## COG: sll1254 COG1253 # Protein_GI_number: 16330748 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Synechocystis # 1 343 1 342 346 223 36.0 3e-58 MITLLLTVFLVVLISAICSMTEAALYSVPWTYIENLRKQGSATGELLYQLRSRIDQPIAA VLTLNTVANTAGAAIAGAVAANVLGADNTALFAAGLTILILALGEILPKTLGVAHACGVA SGMARPLRLMVLIFKPFIWFSSMLTRLVASPQSGPSATEDDIRAITSLSRQTGRIQQYEE NAIRNILSLDVKHVREIMTPRTMVFSLQEDISVKDAYNHPQIWHYSRIPVYGDDNEDIVG IVLRKDIGRYVSQGQGEKTLFDIMQPVRFVLENQTVDKLLLEFLESRLHLFIVLDEYGGL AGVVSLEDVLEEMLGREIVDETDAVADLREAARQRRSALTQARNQQNAALKAARADAAKE >gi|316922900|gb|ADCP01000086.1| GENE 31 16629 - 17996 1681 455 aa, chain - ## HITS:1 COG:ECs2778 KEGG:ns NR:ns ## COG: ECs2778 COG0477 # Protein_GI_number: 15832032 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 14 430 21 437 438 288 37.0 1e-77 MKTTEQIEHDKNLRRVVVSSLVGAVIEWYDFFLYGVVAGLIFNHLYFPEFDDRIGTMLAF ATFAVGFVARPLGGLVFGHFGDKIGRKKVLVLTLMIMGTGTVAIGCIPSYASIGIWAPIL LILCRVAQGLGLGGEWGGAVLMSFESAPAHKRAFYASLPQVGLSLGLLLASGVIGFFSAV LSNEAFISWGWRIAFIMSAGLLFVGSYIRSSVHETKDFAEAKAKMPEVKYPLLDAFKRYP KMMVACMGARLIDGVAFNTFAVFSLSYLTQTRGLERSSALAVVMIASVVMSFFIPFWGYV ADRIGKARVFGFCALLLGILGFPAFWVFHNYSDNYLYVCLALALPFGVVYAAVFGTMSSL FSDSFDPSVRYSAISFVYQFSGIFAAGLTPMVATMLVAWNGGEPWYLCAYLGSAGLISAA CTLWIRHLNRKGFPPALPVEGETPVSEPLMEGESA >gi|316922900|gb|ADCP01000086.1| GENE 32 18480 - 18890 520 136 aa, chain + ## HITS:1 COG:no KEGG:Dvul_1942 NR:ns ## KEGG: Dvul_1942 # Name: not_defined # Def: cytochrome c-type biogenesis protein CcmE # 
Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 136 1 137 137 189 67.0 3e-47 MAKKSGKNVYLVALLLFLGGVGYLAYSGFSENSVYFLNVSEALATPSDKLKAARLFGTVA EDGLSRSETGRDVKFRLEDQENKATTLWVEYTGAIPDTFKAGAEVIVEGGLRPDGSFQAK TLMTKCPSKYQKENRG >gi|316922900|gb|ADCP01000086.1| GENE 33 19044 - 21029 2491 661 aa, chain + ## HITS:1 COG:HI1094 KEGG:ns NR:ns ## COG: HI1094 COG1138 # Protein_GI_number: 16273022 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis factor # Organism: Haemophilus influenzae # 63 661 52 643 648 263 31.0 9e-70 MYYFAFVLLALSMLCALAGAAWAVKSLWPRTAPAAGKASRRTSVPGLGFVEKANLCLTGC YIIASAILIYALASYDFSLVYVASYTDRLLPLFYRITAFWAGQAGSMLFWAFSVAICGGL FQLTRSYKSLTPDTRLWYWVFYLGIMSFFGLILVTWNNPFLMNHAVPQDGNGLNPLLQNP GMIFHPPLLFLGYGGFVIPSCLALAQALSRRQADEAAWTDVARPFTLAAWAMLTAGIVLG GWWAYMELGWGGYWAWDPVENASLIPWLIATAALHTMLIQTRRNKLHGVNVFLMALTTIS AFFATYLVRSGVVQSVHAFGSGGVGVPLLLFILISVALSFWIALLARRSDTGELAGIESR EGFLILTSWLLLALSLIILIATMWPVFSAFWKETVMGLARDVVLDEAGHDHGGAVGLTAD FYNRVCMPLFAAMVAILSICPWLGWNGGVRNMRKLLLVVCSFIGAAAAMYSLGYTLPVAV LGASASVAALVSMSLKLFERPVYSHAPSFAAYGIHIGVALIALGIAFSGPYKIESEPTMA MGETVKVGQFEVTFKNLYEGEGAGYIFLEGELEVRKDGKLIGIAAPQRRVYAKWGQMQFA EAAVIPSLGNEFYATLMAVDQQNRAVFRLSSNPLVNWLWIGGILMCLLPFVSFRRRNGRE N >gi|316922900|gb|ADCP01000086.1| GENE 34 21029 - 21688 169 219 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 3 197 5 210 305 69 32 3e-11 MLLKLAGIGKLFGARAVLRNISFEVHPGTVTLLVGANGAGKTTLLKIMAGLARPTVGTVE RFCEDGGLGYLGHATFIYPGLTALENLAFWSGMHGNPTDKATLSEALARVELAPFAEERA GTFSRGMAQRLNLARILLQSPPLLLLDEPGTGLDVRSLAILHREIAASRDRGAGIVWITH DVAGDAKRADRIIAIENRTIGYCGPASGYEGVRTGEAVC >gi|316922900|gb|ADCP01000086.1| GENE 35 21685 - 22359 925 224 aa, chain + ## HITS:1 COG:DR0407 KEGG:ns NR:ns ## COG: DR0407 COG2386 # Protein_GI_number: 15805434 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c 
biogenesis, permease component # Organism: Deinococcus radiodurans # 1 217 1 216 221 64 27.0 2e-10 MKAACLIARKDLRLVLSRGAGLVQALLLGLLLIFVFSLSRETGETMSGQGAATIFWLSSA FCQVLSFNMLYGLEEANGSRAGLLLLPTPVQSVLLGKAAAGLCIILAAQFLFLPATIVFL GQSLGDGWPLALLALVLTDIGMASLGSLLGALSQGQAARESLLSIVLFPLIIPILLAGIR VCAGGFSEALPEGVESWLGIAVAFDAVFLAAGLVLFPFVFSGDE >gi|316922900|gb|ADCP01000086.1| GENE 36 22364 - 23038 985 224 aa, chain + ## HITS:1 COG:DR0348 KEGG:ns NR:ns ## COG: DR0348 COG0755 # Protein_GI_number: 15805377 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Deinococcus radiodurans # 39 216 38 219 236 129 38.0 5e-30 MRYTWMLPCAVLGGVLMAACQVLIYRYAPVEQTMGLVQKIFYTHLPLAWWSFVSFFLVCV SGVAYLKTRNRHWDAVAGAAAEVGVVLSGLALVSGSIWARHSWGVWWTWDPRLTTTMILW FLYAGYLVLRKMDMPRERQANLCAVVGIVAFVDVPLVFLSARLWRSIHPAVFANKSGGLE PEMKIAAIAAVACFGLVWAGLVGLRTLQIKQKERLDGLAVHGEL >gi|316922900|gb|ADCP01000086.1| GENE 37 23243 - 23380 288 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTDMDWLMYANIAVWIGLGAYLAFLLRNQIALNRRLTQLETLRND >gi|316922900|gb|ADCP01000086.1| GENE 38 23373 - 24008 841 211 aa, chain + ## HITS:1 COG:no KEGG:DVU1045 NR:ns ## KEGG: DVU1045 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 10 198 17 194 198 132 39.0 8e-30 MTDQTFGKNRRSILLLITAGLVLMLVSTLSYFSANPGLVSHTTTGTQAAAPSGQQMPTSA GMDEATQQGVMALMQKLQKNPTDLEALVGLTEHFMHTQDWQRAETFALRAVVAAPNETQP LYMLGIIQHSQNRNAEAAASLEKVVAAKEDPSVRYSLGILYAYYLEQPDKGLEQLQKALD NPNTPADLKKAVAEEVTKIKEHKNHTPQQAQ >gi|316922900|gb|ADCP01000086.1| GENE 39 24340 - 24564 285 74 aa, chain - ## HITS:1 COG:no KEGG:DVU1905 NR:ns ## KEGG: DVU1905 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 4 68 71 136 142 67 54.0 1e-10 MATRLSPADEAPFNIQIKNLADEELLDFWEETQQIESALRAEYQQTVVPVQDYERLIVLE LQLRSCQRQKSCGA >gi|316922900|gb|ADCP01000086.1| GENE 40 25023 - 26075 1566 350 aa, chain + ## HITS:1 
COG:aq_1866 KEGG:ns NR:ns ## COG: aq_1866 COG0136 # Protein_GI_number: 15606903 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Aquifex aeolicus # 8 339 5 340 340 397 61.0 1e-110 MSEKKLVVAVVGATGAVGREMLKTLESRNFPATEVVPFASARSAGTKVPYGDSELVVREL KEDVFEGIDIAIFSAGGSTSEKYAPHAAAAGCIVVDNSSAWRMDERCPLVVPEVNPEALE GHNGIIANPNCSTIQMVVALKPLHDAAGIRRVVVSTYQAVSGSGQKGISELETQVRQMFN LKEPEVNVYPHRIAFNCIPHIDVFLENDYTKEEMKMVHETVKIMRDPNVKVTATCVRVPV FYGHSESVNIETERKLSAKEARAILAQAPGVQVYDNPSEKMYPLAVDAAGEDATYVGRIR EDDTIPNGLNMWIVADNIRKGAALNAVQIAEELVRRDLLGVKDRTVFIKK >gi|316922900|gb|ADCP01000086.1| GENE 41 26208 - 27149 1336 313 aa, chain + ## HITS:1 COG:MK1627 KEGG:ns NR:ns ## COG: MK1627 COG0115 # Protein_GI_number: 20095063 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Methanopyrus kandleri AV19 # 49 304 26 283 295 119 29.0 8e-27 MIPVLDTNAWIEKLRELERPGADNILAFYEHRFGAICRDPRLMLAPLDDHLVHRGDGVFE TIRFTERKVIHLDAHLRRLANSAAGLSLTLPCPIEEIRDIVLAVAKAGDEPEGNIRILSG RGPGGFGIALKECPQPSLYVAAYRVPVRTDAWYDKGLTAFRSDVPVKPAMFARLKTTNYL SAVFMTLEAMQKHMDVALTFDANGCLTEAAIANVAVVDAKGALVLPEFKNALVGTVATKA MELAKTFMPVEIRPIPQAELDSVREMMILGTAHECIGVTHFEGRPIGDGKTGPVAHKLRK LIREDLLSGGEAF >gi|316922900|gb|ADCP01000086.1| GENE 42 27747 - 28955 1011 402 aa, chain + ## HITS:1 COG:CAC0682 KEGG:ns NR:ns ## COG: CAC0682 COG0004 # Protein_GI_number: 15893970 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Clostridium acetobutylicum # 1 402 1 404 405 354 51.0 2e-97 MNPADIAFIIICTGLVMLMTPGLALFYGGLVRSRNVLSTTMHSFITMGIVPVLWFVVGYS LAFGPDLGGIIGGFSHVLLRGVDMEGAGAASGIPPLLFMLFQCMFAVLTLALISGSYAER IHFPAMLLFSILWLLFVYCPMAHWVWGGGWMAKMGAVDFAGGAVVHMASAAGGLAAARTL GPRLGLGRKSFAPHNLPLTLLGGGILWFGWFGFNAGSALAANHLAVHAMVTTQMAAAAGV LGWLLVEWKHVGKPTSLGAASGALAGLVAITPAAGFVDIWASLVIGLVGGIVCYVGVLCK 
NRLGYDDALDVVGIHGVGGTWGALATGIFAVSSVNGVDGLLYGNPGQLWVQFVSVVGTWG FVYVASLIILRVVDTLVGLRVAPDPEIAGLDMNEHNERAYQI >gi|316922900|gb|ADCP01000086.1| GENE 43 28968 - 29306 268 112 aa, chain + ## HITS:1 COG:HI0337 KEGG:ns NR:ns ## COG: HI0337 COG0347 # Protein_GI_number: 16272289 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Haemophilus influenzae # 1 112 1 112 112 130 56.0 6e-31 MKKLEIIIRPSALDNVKDKLASLDIHGMTCTEVRGFGRQRGHTEVYRAAVYQVDFVAKIR LEIVVHDSMTESVVQAVMEAARTGNVGDGKIFVTPVENAWRIRTGEEGDAAL >gi|316922900|gb|ADCP01000086.1| GENE 44 29550 - 29879 430 109 aa, chain + ## HITS:1 COG:no KEGG:LI0949 NR:ns ## KEGG: LI0949 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 4 109 3 107 108 84 43.0 1e-15 MLKRTLCAAFAVAVLVAFSVPAFAGSHHKMRVRGTVESVDPAQKTFVVKDRDGKAIPFRV DDRSEFELEHRDKPDQDVPFTELKTGDRVKVKSFKGEPPHLVDDVDIYR >gi|316922900|gb|ADCP01000086.1| GENE 45 30094 - 31164 862 356 aa, chain - ## HITS:1 COG:TM0622 KEGG:ns NR:ns ## COG: TM0622 COG0438 # Protein_GI_number: 15643387 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 76 353 90 374 388 89 28.0 1e-17 MRVLLLDLGLELRGGQRQVYYLARALARTPDMEPLVACPRTGKLAELLRDEGLPVQGLPG RSPANPLLLRWMGQRLRDFPPDIVHTHDANAATVGAFYKLLHSGTLLIHSRRVSYPLRRG LRSWKYRIADAVVGVSREIADGMIGAGIPASRVSAIHSGIDPSRYRPREARQDGGFLFQS IGAFTPQKGYSVLVRAMAELRKRSLPPWAVRIVGDGPLLDPIKEEARTLGVDALLALPGR RDSVDMLPDCDALVVPSVDGEGSSGAIKEGWVTGVPVICSALASNQELVRGDENGLLAAV GDPVSLADAMARCLTDEGLRARLAEAGSRSVLEFTDTRMAEQYMDLYRRLMGKGTE >gi|316922900|gb|ADCP01000086.1| GENE 46 31418 - 31984 681 188 aa, chain - ## HITS:1 COG:no KEGG:DVU1176 NR:ns ## KEGG: DVU1176 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 183 1 182 185 249 64.0 3e-65 MTAFTTEKIDVEPWFDMELFMGVSQETRLGGDVMDRFMTLWKNWLPHLTVKGIDTGKIKY LLVALDESVEQDVDKAWEGSPSNAFLYNALAQTMCMAAVHAVIPEVQDAGCAPAPKPTAT 
LREALEAEGIPYSNDKDPILSRRYAVVTHYPFKGGCEICVLQSNCPKGQGQMEATSVVLP GYEHPQQM >gi|316922900|gb|ADCP01000086.1| GENE 47 32144 - 32521 544 125 aa, chain + ## HITS:1 COG:no KEGG:DVU1174 NR:ns ## KEGG: DVU1174 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 118 1 118 125 179 73.0 3e-44 MSQQYPSRIRYEYEQDPETRLQYTHGVWGGINPQGEIEISFYLESDKMPPYSERIIAPDG SFGHEIAPYNDEERTITRHIHSRVVMNYHTARAMLEWLEDKVAALEAEEDGSAPLSFDES GFQHQ >gi|316922900|gb|ADCP01000086.1| GENE 48 32819 - 33367 430 182 aa, chain - ## HITS:1 COG:RSc1314 KEGG:ns NR:ns ## COG: RSc1314 COG0791 # Protein_GI_number: 17546033 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Ralstonia solanacearum # 20 181 10 180 258 89 33.0 3e-18 MIYHAEMEGRVVKQWFVMGSACRFMALAGLLCLALLLSGCGTTYISAPDYGGRYDTSSLG ARVAQTARSQIGAHYRLGGTTPKGFDCSGLIWWAYRQHGINVPRVTDAQAKAGYGVGRGI AMRPGDILVFDTNRGRTGLHTALYTGNGRFVHSPTSGKRVREDSLAQEYWKKRLIRIRRI VR >gi|316922900|gb|ADCP01000086.1| GENE 49 33351 - 34103 576 250 aa, chain - ## HITS:1 COG:no KEGG:Geob_0103 NR:ns ## KEGG: Geob_0103 # Name: not_defined # Def: hypothetical protein # Organism: Geobacter_FRC-32 # Pathway: not_defined # 1 199 15 215 261 122 43.0 1e-26 MTALVGAGGKTTLMYALARRMADAGRRVVCTTTTKIFPPEDGLPVVLLEGAADPVAAVHD ALSAAPCVVAAGRPLPDVRKLDGVSPRMLAVLSAALPEALFLVEADGAARKPLKAPAAHE PVLPEPLGCCVAVVGLDSVGQPLDDGHVHRSALVCAAAGQEPGSPVTPATLACLVEHLEG LFRNCPAGCRRLVFANKGDGPGALDAASEAAALSRSVAWFAGSAAQGWCVPLTAGAQRAL GMEPFRDLPC >gi|316922900|gb|ADCP01000086.1| GENE 50 34166 - 36025 1976 619 aa, chain - ## HITS:1 COG:gidA KEGG:ns NR:ns ## COG: gidA COG0445 # Protein_GI_number: 16131609 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Escherichia coli K12 # 2 618 7 625 629 652 54.0 0 MFDCIVVGAGHAGCEAAMTLARFGQRVLLITGNVDRIGHLSCNPAIGGLAKGHMVREIDA 
LGGMMGLWADRAGIQFRTLNASKGPAVRATRAQIDRDSYMEAAKAALFAEPDITIWQDTV DEVLELGGRAAGVRTELGQTFEAPHVILTTGTFLCGLIHVGLTHFPGGRLGDAAAMKLSD SLRKHGLTLGRLKTGTTPRLLRSSIDFSQMEEQPGDDPAPGFSFYGPKPGQPQVPCYVTW TNERTHDAIRSGMDRSPLFTGVIEGTGARYCPSVEDKVARFPERERHHVFIEPEGLNGQE CYANGISTSLPLDVQLAMIASIPGLEHAKMVRPGYAIEYDYVDPVQLEPTLEAKVLPGLW LAGQINGTSGYEEAAAQGMWAAINVGCRLTGRQPFLPGRDVAYMSVLVDDLVTRGTQEPY RMFTSRAEYRLLLREANADARLTPLGRELGLVGDEQWSAFRRKQDALGRLLDMLREVRVS PDAATSDAFRELGEPVPNKSLTLEEVLRRPSMTLERLERLHPGVAAFPEEVLLEAETSVK YAGYLVRQDELVKRSARLEEVRLPADLDYPSVSGLSAEIVEKLCAVRPRTLGQAGRISGV TPAALTCLEIHLKKRSLLG >gi|316922900|gb|ADCP01000086.1| GENE 51 36354 - 37580 1461 408 aa, chain - ## HITS:1 COG:TVN1079 KEGG:ns NR:ns ## COG: TVN1079 COG0624 # Protein_GI_number: 13541910 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Thermoplasma volcanium # 13 406 2 390 392 310 41.0 3e-84 MTRFSRLIESIGQQRDKVIEYQTRMTELPALGPENGGTGEMPKALYLEGLLRELGVTDIL RIDAPDPRVPDGVRPNVVARIPGASPRRLWILGHMDVVPPGELSYWKTDPWKVVVDGDKI RGRGVEDNQQAIVCGLLIAQELKAQGITPDLSLGLIFVSDEETSSRYGIHYILKTHADLF GPDDFVVVPDYGVADGSSIEISEKGQLWMRVEVLGHQCHASRPQEGRNSLVAAADMILHV RDLESIYAQVDPLFQPPCCTFVPTRHEENVPNINSLPGKDVFYIDCRILPGISHDDVLAS AREIMEAVAERHGVTVDITTVTNAPASPATPPDSEVVLRLSAGIREIYGIEPHCAGSGGG TVAVGFRDMGIPAAVWASVVPTYHLANEYSLISRTIGDAQVLARMLFD >gi|316922900|gb|ADCP01000086.1| GENE 52 37722 - 37994 305 90 aa, chain - ## HITS:1 COG:no KEGG:Dde_1811 NR:ns ## KEGG: Dde_1811 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 1 89 1 89 89 119 66.0 4e-26 MSDEKTLVGAVKTEQGVEVKGMLMQPVVEKCEGCDRVRTFEEQQFCGSYPNPAKKWFDGR CNFATHIKTESGKAAKVNPLKASKRAAKGR >gi|316922900|gb|ADCP01000086.1| GENE 53 38235 - 39956 1642 573 aa, chain + ## HITS:1 COG:SMc01541 KEGG:ns NR:ns ## COG: SMc01541 COG1944 # Protein_GI_number: 15966038 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium 
meliloti # 254 544 60 372 393 63 27.0 1e-09 MPTTTLPYRYTHASTESMTGYFSCEPPEDLCFEDALARLEAAPLDEFLHRHLLRRIAALP PDEAAALRTPSRPVLMGLLYETAFLNAGHAGMLNGVSAGELERLEGATPLPYLSFARKSL TKEGQEEQANLAAWNALFEENLSRHHPLPHPDDNDLPLPAYVRNAEPPKHGRTAAEVHAE RAVPGQPIWSRPPAQHTAAEALASLATCGVIAGTEMRHESSLAPVGLLRNWNVDIAVRNG KLDYTLQGEATTWGRGLSIATARASYSMEMVERASAYLSVDGDAITDRLHPTPIVRASHA ELLAQGRAAIDPRGLPIDAEYNDQPLYWMEGRGVSGSAILVPVQAVGLFCNLDEPALFLS PGSTGMASGNTLDEAKVGALTEILERDAEATVPWRRGQCFELLADGEHPLAVLLADYARR GIHVWFRDMTTEFGVPCYQSFVTCGDGSVVRGAGAGLSGAGALLSALTETPYPYPNGPAS APVPAGLAQRRLSELPDYRLGSPSDNLRLLEEVLISHGHAPVYVELTREDLEFPVVRAIV PGLELTADFDAHSRPSVRLFRRYLEKAGSFERE Prediction of potential genes in microbial genomes Time: Fri May 13 03:40:34 2011 Seq name: gi|316922889|gb|ADCP01000087.1| Bilophila wadsworthia 3_1_6 cont1.87, whole genome shotgun sequence Length of sequence - 14327 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 6, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 75 - 134 2.8 1 1 Tu 1 . + CDS 316 - 1422 593 ## COG4188 Predicted dienelactone hydrolase + Term 1510 - 1548 12.3 - Term 1482 - 1552 31.2 2 2 Op 1 51/0.000 - CDS 1617 - 3695 2500 ## COG0480 Translation elongation factors (GTPases) 3 2 Op 2 56/0.000 - CDS 3708 - 4178 734 ## PROTEIN SUPPORTED gi|46579710|ref|YP_010518.1| 30S ribosomal protein S7 - Term 4339 - 4368 0.4 4 2 Op 3 . - CDS 4397 - 4771 612 ## PROTEIN SUPPORTED gi|46579709|ref|YP_010517.1| 30S ribosomal protein S12 - Prom 4809 - 4868 2.8 + Prom 5142 - 5201 4.1 5 3 Tu 1 . + CDS 5260 - 6441 1790 ## COG1454 Alcohol dehydrogenase, class IV - Term 6726 - 6777 19.0 6 4 Op 1 . 
- CDS 6820 - 7656 905 ## COG1253 Hemolysins and related proteins containing CBS domains 7 4 Op 2 22/0.000 - CDS 7872 - 8987 1135 ## COG0842 ABC-type multidrug transport system, permease component - Term 9036 - 9081 5.9 8 4 Op 3 45/0.000 - CDS 9088 - 10254 1408 ## COG0842 ABC-type multidrug transport system, permease component 9 4 Op 4 10/0.250 - CDS 10251 - 11132 352 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 10 5 Tu 1 . - CDS 12408 - 13403 1124 ## COG0845 Membrane-fusion protein - Prom 13536 - 13595 6.0 + Prom 13497 - 13556 4.3 11 6 Tu 1 . + CDS 13616 - 14281 524 ## COG1309 Transcriptional regulator Predicted protein(s) >gi|316922889|gb|ADCP01000087.1| GENE 1 316 - 1422 593 368 aa, chain + ## HITS:1 COG:RSp1248 KEGG:ns NR:ns ## COG: RSp1248 COG4188 # Protein_GI_number: 17549469 # Func_class: R General function prediction only # Function: Predicted dienelactone hydrolase # Organism: Ralstonia solanacearum # 34 339 47 372 372 94 28.0 4e-19 MTRRFLFFLLAFLILTCTAYAAPLQPGFKTLGIWDPAKNVRLDFAVWYPSRSAPFQVNYG DWNFSAARGRPPVEGKHPLILLSHDSAGSRFSLHELASELARNGFVVLAFTHPGDNVDDM GALFMPVQVTDRAKQLTQALDIALADPETAPLIDPDRIGVLGVGPGGTAAMLIAGARLDA TGWPIYCAGKEENADPYCTPWARQRMGAFSTTPNLSAPYRDRRVRAAAAVSPSYAMMFTP ASLSRIRIPLLLLRAERTPLYTLQHAERLLSAMPQPPQLGVLPDADTASLMSSCGGNLDQ TLPEMCLAVSPSRRKAIQAKMAAESAAFFLKYLGTPNPPPLPPEPEDNPQITSHPATPEK NAARKKKR >gi|316922889|gb|ADCP01000087.1| GENE 2 1617 - 3695 2500 692 aa, chain - ## HITS:1 COG:HP1195 KEGG:ns NR:ns ## COG: HP1195 COG0480 # Protein_GI_number: 15645809 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Helicobacter pylori 26695 # 1 689 1 689 692 889 62.0 0 MSRLAPIEKMRNIGIMAHIDAGKTTTTERILYYTGENHKIGETHEGGATMDWMAQEQERG ITITSAATTCFWLDHQINIIDTPGHVDFTIEVERSLRVLDGAVAVFDAVAGVEPQSETVW RQANRYGVPRICFINKMDRIGANFFRSVDMIRDRLKAKPVCLQIPIGSEDKFDGVVDLIN GRSVRFEKESKGLQITYGEVPEDLKDLYEEKRLELLDTVAEEDEELMEKYLEGHELTVEE 
INSCIRKGTIRQSIVPVLCGTAFRNIGVQPLLDAVVNYLPSPLDIDQMVGHNPDKPEEEI VCPSSDKEPLAGLVFKLASDPFVGHLAFFRIYSGVIEAGSTLYNANTGKKERLGRLLRMH ANKREDIKSAGAGDIVALVGMKLASTGDTICDEKRPVVLESLDIPEPVIEVAIEPKTKTD RDALSAALNKLAKEDPSFRVKGNEETGQTLIAGMGELHLDIIVDRLVREFNVNANVGKPQ VAYRETITKPSKSDLKYAKQSGGRGQYGHCVIEVEPNPEKGYEFVNAITGGVIPKEYIPS IDKGIQDALKSGVLAGFPVVDVKVTLVFGSYHEVDSSEQAFYVAGSMAIKDAMNKATPAL LEPYMDVEVVTPDDYLGDVMGDLNGRRGRVQSMEARAGAQVVRAQVPLSEMFGYATDLRS RTQGRATFTMQFHHYEKVPAAIAEEVSKKAAN >gi|316922889|gb|ADCP01000087.1| GENE 3 3708 - 4178 734 156 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579710|ref|YP_010518.1| 30S ribosomal protein S7 [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 156 1 156 156 287 87 3e-77 MPRKGPIPRREVLPDPIYGSRLAARFVNRLMYDGKKGAAEKIFYGALEVLAEKTGEEALR AFEKALENVKPHLEVKARRVGGATYQVPMEVRPDRQVSLSIRWLITYARSRGEKGMSNKL AAELLDAFNSRGGAVKKKEDTHRMAEANKAFAHYRW >gi|316922889|gb|ADCP01000087.1| GENE 4 4397 - 4771 612 124 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579709|ref|YP_010517.1| 30S ribosomal protein S12 [Desulfovibrio vulgaris subsp. vulgaris str. 
Hildenborough] # 1 123 1 123 123 240 96 5e-63 MPTINQLIRVARKKVVKHKKTPALQACPQRRGVCTRVYTTTPKKPNSALRKVARVRLTNG LEVTAYIPGEGHNLQEHSVVMIRGGRVKDLPGVRYHIVRGTLDTSGVQDRRQRRSKYGAK RPKS >gi|316922889|gb|ADCP01000087.1| GENE 5 5260 - 6441 1790 393 aa, chain + ## HITS:1 COG:ECs4466 KEGG:ns NR:ns ## COG: ECs4466 COG1454 # Protein_GI_number: 15833720 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 10 392 6 382 383 386 53.0 1e-107 MAIVEKVDGFFIPNVTLIGVGAAKAIPERIVYLNATKPLLVTDKGIVHTGILKQITDILD EAAMEYAIYDETVPNPTDLNVAAGVELYKKEECDSLISIGGGSSHDCCKGIGLVVSNGGK IHDYEGVDKSTKAMPPYLAVNTTAGTASEITRFCIITDSARKVKMAIVDWRITPSVAIND PILMVGMPPALTAATGMDALTHAVEAYVSTGATPLTDACAEKAIKLVSENLRRAVANGSD IHAREGMCYAQYLAGMAFNNASLGHVHAMAHQLGGFYNLPHGECNAILLPIVEEYNLLAH LDKFINIARMMDENIDGLSKRDAAELAISAIRRLSQDVGIPASITELAKRYGKEVSRSDI PTMVANAQKDACGLTNPRKMTDAAVQQLYEIAF >gi|316922889|gb|ADCP01000087.1| GENE 6 6820 - 7656 905 278 aa, chain - ## HITS:1 COG:TP0649 KEGG:ns NR:ns ## COG: TP0649 COG1253 # Protein_GI_number: 15639636 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Treponema pallidum # 38 265 21 246 265 172 41.0 6e-43 MDRSTDKSLWDRVCNLFGGRSGDSIEQAVMEARDEGELDLAEGSMILSILRLDDVQVQDI MTPRTDFDCMPTGAPVSEVAQCILETGHSRLPIYKDTRDNIIGIAYAKDLISLLVDPAKH HTAVDEIMRSPFFVPETKIVSELLQEFRSRKNHIAIAVDEYGGTSGLATIEDILEEIVGD IEDEHDAPKEEDIHPLGDSRYALSGRAYLEDLEELGINLEADEVDTIGGYLCLEAGHVPE KGEIFECGGWRFIVDEADAKQIRRVIVEPATQEENGAE >gi|316922889|gb|ADCP01000087.1| GENE 7 7872 - 8987 1135 371 aa, chain - ## HITS:1 COG:SMb21204 KEGG:ns NR:ns ## COG: SMb21204 COG0842 # Protein_GI_number: 16264618 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Sinorhizobium meliloti # 7 370 5 368 370 328 45.0 1e-89 MMRTLRRILAIARKELLVLMGTRQSRFMVLIPPVIQIFIFAWAATMEVKNVDVVVINEDS GYWSQEVISRLRGSPTFRHITFEDDFKKAEEAINRQDVLVVMHFQNDFSAKVERGVPAEV 
QLLLDGRRSNTAQILTQYVQQIVQGLATATPAFVQAKPSRVEVEELNWFNPNLDFRWFIL PNLIGTINFLMGMIITGLTVARERELGTFDQMLVSPASPVEIACGKLIPGCVVGLVHGTI FFFSIIGFFKVPFVGSVAVLYVTMLLFSLSVSSIGLMVSSFASTQQQAFLGCFTVGVPCI LLSGFMTPVNNMPMFLQDLSQLNPLRHFIVILQGLFLKDITMSAALDSSARIAAITAVSM LCAIWMFTRKA >gi|316922889|gb|ADCP01000087.1| GENE 8 9088 - 10254 1408 388 aa, chain - ## HITS:1 COG:SMb21205 KEGG:ns NR:ns ## COG: SMb21205 COG0842 # Protein_GI_number: 16264619 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Sinorhizobium meliloti # 7 386 15 383 384 332 49.0 6e-91 MSNADPGRARPVFQWLAQLFALMGKESKQIIRDPSSYLVAAILPMIFLLLFGYGITLDAG VMDIVILDESNSQSGLQLQENFAHSPNFRVHQARSRSEAGRMMRDSVISGFIVIPYNFEA LLVGGQGKAAVAPIQVIVSGGEPNTANFIRGYSQGVIANWQQTRTPGGTIQKQPVDVAPR YWFNPAAKSRWYLVPGSITIVMTLIGTMLTALVIAREYERGTMESLFATPVTRMQILLGK LIPYYFLGMFSMTICALAGVFLFEVPLRGSVFALFVLSSTFLMPALGQGLLISILTKQQL LAAQTGLITGFLPAVILSGFIFDINSMPDALQYLTLAIPARHFNTALQTVFLAGDIWPVF LPRMAFMLGLGAAFLVITYRKLVKRLDV >gi|316922889|gb|ADCP01000087.1| GENE 9 10251 - 11132 352 293 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 40 248 6 214 311 140 39 7e-33 MEEEGWGARGKGGESFSPEKFLLPSPGGSTPHPAPAAQPVIRAVSLTKKFGTFVAAHDIT FEVQPGRIFGLLGPNGAGKSTTFRMLCGLSRPTEGECFVAGMNLLTAGGAARAKLGYMAQ KFSLYGELTVQQNMRLMADLYGLERGRVKPRIEQLVEALDLETFRDVRAFNLPLGQKQRL AMACATLHEPPVLFLDEPTSGVDPRTRREFWKHINAMTQNGVAVLVTTHFMEEAEYCDEI ALVFQGGIIARGTPDELKGKAPGAAKNGGAADMTLEEAFIAYIEEVQRKGGHA >gi|316922889|gb|ADCP01000087.1| GENE 10 12408 - 13403 1124 331 aa, chain - ## HITS:1 COG:Z1015 KEGG:ns NR:ns ## COG: Z1015 COG0845 # Protein_GI_number: 15800546 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 EDL933 # 18 329 21 330 332 252 48.0 7e-67 MLKRIIVILVVLACAAGGIWWLYDGKRQPSDLLTLYGNVDIRQVDVGFRVGGRVTELFKE EGDAVKTGERLARLDAKPYEEVFDQAAAQLSMQEIELRKMVAGYRTEDIEQARATLSGAQ 
AVYANAASNLRRVERLRQQNAVSQQSLDEARASYGDALSRRNAAREQLQLMESGYRSEDV ERQKAAVEAARAELARATTNLQDTELFAPQDGVVLTRVHEIGAVVQGGQTVYTVTLNNPV WIRAYVTQPNLGNIRPGQEVLLSIDATPDKTYRGRIGFISPTAEFTPKTVETKEVRNDLV FRFRVIADDPDNVMRQGMPVTVTLRKDGGKP >gi|316922889|gb|ADCP01000087.1| GENE 11 13616 - 14281 524 221 aa, chain + ## HITS:1 COG:RSp1395 KEGG:ns NR:ns ## COG: RSp1395 COG1309 # Protein_GI_number: 17549614 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Ralstonia solanacearum # 4 103 7 102 224 60 38.0 2e-09 MNKTNPPQRPKRAQRSDGRSTRAVVLEAAGKVFAERGFAEATSKEICERAGTNGAAVNYY FGGKEGLYEEVLIEAHRQMLSLEDLNRIITSEATPEEKLRVFLKHIIRTAMNASELWGIR IFLRELASPSPFVPKFITTAVFPKSQKLRELIRDITGLPPDSPAMQRATALIALPCMGLI LFPEKLRTLMLPATAGDAEGLLEDMLAYMLGGLRALGETAR Prediction of potential genes in microbial genomes Time: Fri May 13 03:40:47 2011 Seq name: gi|316922873|gb|ADCP01000088.1| Bilophila wadsworthia 3_1_6 cont1.88, whole genome shotgun sequence Length of sequence - 20593 bp Number of predicted genes - 15, with homology - 12 Number of transcription units - 8, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 189 - 223 4.0 1 1 Tu 1 . - CDS 289 - 1572 1799 ## COG0422 Thiamine biosynthesis protein ThiC + Prom 1698 - 1757 2.1 2 2 Tu 1 . + CDS 1841 - 4525 3334 ## COG0612 Predicted Zn-dependent peptidases + Term 4616 - 4655 -0.3 + TRNA 4715 - 4790 86.2 # Thr TGT 0 0 + TRNA 4899 - 4984 72.9 # Tyr GTA 0 0 + TRNA 5049 - 5124 90.4 # Gly TCC 0 0 3 3 Op 1 . + CDS 5283 - 6476 1500 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 4 3 Op 2 . + CDS 6481 - 6630 243 ## PROTEIN SUPPORTED gi|218886543|ref|YP_002435864.1| 50S ribosomal protein L33 + Term 6668 - 6734 30.0 + TRNA 6646 - 6722 73.8 # Trp CCA 0 0 + Prom 6648 - 6707 80.3 5 4 Op 1 . 
+ CDS 6747 - 6989 268 ## DvMF_1449 preprotein translocase subunit SecE 6 4 Op 2 45/0.000 + CDS 7001 - 7690 694 ## COG0250 Transcription antiterminator 7 4 Op 3 55/0.000 + CDS 7745 - 8167 637 ## PROTEIN SUPPORTED gi|220904897|ref|YP_002480209.1| ribosomal protein L11 8 4 Op 4 43/0.000 + CDS 8258 - 8962 725 ## PROTEIN SUPPORTED gi|148263131|ref|YP_001229837.1| 50S ribosomal protein L1 + Term 8981 - 9009 1.0 9 4 Op 5 47/0.000 + CDS 9146 - 9664 632 ## PROTEIN SUPPORTED gi|218886548|ref|YP_002435869.1| 50S ribosomal protein L10 10 4 Op 6 . + CDS 9710 - 10090 542 ## PROTEIN SUPPORTED gi|46581331|ref|YP_012139.1| 50S ribosomal protein L7/L12 + Term 10115 - 10153 8.1 11 5 Tu 1 . - CDS 10477 - 10770 171 ## - Term 10966 - 10992 -1.0 12 6 Tu 1 . - CDS 10994 - 11332 152 ## 13 7 Tu 1 . + CDS 11479 - 11700 57 ## + Term 11788 - 11847 7.6 + Prom 11803 - 11862 3.6 14 8 Op 1 58/0.000 + CDS 11928 - 16091 3847 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 15 8 Op 2 . + CDS 16351 - 20514 5812 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit Predicted protein(s) >gi|316922873|gb|ADCP01000088.1| GENE 1 289 - 1572 1799 427 aa, chain - ## HITS:1 COG:MJ1026 KEGG:ns NR:ns ## COG: MJ1026 COG0422 # Protein_GI_number: 15669215 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis protein ThiC # Organism: Methanococcus jannaschii # 3 423 12 434 438 382 45.0 1e-105 MSIITQNAALQGLLAAHLPALAEYEGLTPEAITSAIEAGTMVLLGNPNHANVKPILVGQP SRVKVNANIGTSPFQNHPGCEMRKLEVAQEAGADTVMDLSIAGDLVAIRKDMLSRTPLPL GTVPLYSVAQRYIDKDRDPADIDPEELFAEVEMQAEQGVDFMTLHCGLTRRGAEWAARGE RALGIVSRGGSILARWMLKNDKENPLLTGYDRLLELARKYNVTLSLGDGLRPGAGCDAGD AAQWEEVMTLGALAKRGLEYGVQCMIEGPGHVPMNEVEAQIISIKKMTHGAPLYVLGPLT IDSSPGYDHIAGAIGGAMAVRVGVDFLCYLTPAEHLTLPDMDDVRQGIMASRVAAQAGEV ALGRPHALAREAKMAKARMALDWPAMREAAIDPAILDKRREPHKHEEACAMCGNFCAVKM LRDAKQM >gi|316922873|gb|ADCP01000088.1| GENE 2 1841 - 4525 3334 894 aa, chain + ## HITS:1 COG:all1021 KEGG:ns NR:ns ## COG: all1021 
COG0612 # Protein_GI_number: 17228516 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Nostoc sp. PCC 7120 # 53 832 66 874 945 200 24.0 1e-50 MPSFSKLLTGALMSLFLGVAAGAAIAAPSPSSGASDIVPVIPDPLSGKEEQVTKLPNGLS VLILKDTRFPLVSTRLYVHAGSSYETPDQAGISHVLEHMVFKGTDSRPKSAISQEVESAG GYLNAATSYDYTVYITDMPDRHWKLGMDVVRDMAFHPTLDPQELESEKNVIVAELQRGED DPGSRMFKTLLADTLKGTPYDRPIIGYEKTIRALTTQNLRDYIAKYYQPQNMLLVVVGNV DPAEVLAEAEKMFAPYKNTAPLKEVMPYEADRLPLPGSKPALVVQPGPWNKVYLAAALPV PGSSSYQSATLDVLAYLLGGDRTSLFYKTYKYERQLVDSISVSNVGFERIGAFVVTAELD ADKVEPFWTSLTKDFAALDASTFTPEQLERAKLNLEDDLYRSKETLSGLASKIGYFQFFM GGDQGERNAIEALRNVDNGMLKQVLAAWVQPDRLTTVVLPPKDAKMPDMQAILDKEWKSG AKASAAKTAEAGKTEVIDLGKGRTVVLIPDKTLPYVSANLTYSGGDALLKPSEQGLSALV SNVLTKGTAKRSATEMQAFLADRAAGLAASAGRKTFSVNLTTPARFNRDLFDLLGEVVTA PAFSKDETARGIKDQLAAIKSREDQPLGLAFRKLPPFLFPGSVYGYLQLGEPENVQKYDE AQLRSFWNRQKARPWVLAVSGDFDRDQILAFAKSLPAPDQRKVDVPVPAWGTRPELDIPM PGRNQAHLMLIFKTAPDTSPDTPALDLLETSLGGMGGPLFRDLRDKQGLGYTVTAFNRQT SENGYMVFYIGTEPGKMAQAEEGFKRIINDLHQNLLSEEDVNRGKNQLEGDYYRNMQSLG SRSGEAAALTMEGYPLSFTKDQIEKSKNVTPEQLREIVKKYLNVDSAYTIKVLP >gi|316922873|gb|ADCP01000088.1| GENE 3 5283 - 6476 1500 397 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 397 1 407 407 582 68 1e-166 MGKEKFTRTKPHMNIGTVGHIDHGKTTLTAAITKVAAMKQGGKFIAYDEIDKAPEEKERG ITISTAHVEYETDNRHYAHVDCPGHADYIKNMITGAAQMDGAIIVVAATDGPMPQTREHI LLARQVGVPHLVVFMNKCDLVDDPELLELVEMEVRELLSSYGYPGDEIPVVRGSALKALE SDSADSPDAQCVLELLAACDSYFPDPVRETDKPFLMPIEDVFSISGRGTVVTGRVERGII KVGEEVEIVGIRPTVKTTCTGVEMFRKLLDQGQAGDNIGALLRGTKRDEVERGQVLAAPK SITPHKKFKAEVYVLSKEEGGRHTPFFTGYRPQFYFRTTDITGIIALEEGVEMVMPGDNA TFNVELIHPIAMEKGLRFAIREGGRTVGAGVVTEIVE >gi|316922873|gb|ADCP01000088.1| GENE 4 6481 - 6630 243 49 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|218886543|ref|YP_002435864.1| 50S ribosomal protein L33 [Desulfovibrio vulgaris str. 
'Miyazaki F'] # 1 49 1 49 49 98 85 4e-20 MRVNILLACTECKRRNYATVKNKKNTTGRVELKKFCPWCRTHTVHRETR >gi|316922873|gb|ADCP01000088.1| GENE 5 6747 - 6989 268 80 aa, chain + ## HITS:1 COG:no KEGG:DvMF_1449 NR:ns ## KEGG: DvMF_1449 # Name: secE # Def: preprotein translocase subunit SecE # Organism: D.vulgaris_Miyazaki_F # Pathway: Protein export [PATH:dvm03060]; Bacterial secretion system [PATH:dvm03070] # 5 79 9 83 83 72 52.0 4e-12 MTKKPTTAQEEKSEGIVAKAKAFRNYLELSKTELRKVTWPTVKETRTTSLVVLAFVVVMA IFLGVVDLGLSKLISFILAA >gi|316922873|gb|ADCP01000088.1| GENE 6 7001 - 7690 694 229 aa, chain + ## HITS:1 COG:YPO3752 KEGG:ns NR:ns ## COG: YPO3752 COG0250 # Protein_GI_number: 16123889 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Yersinia pestis # 54 229 6 181 181 194 53.0 1e-49 MIEENNASSGAVESSAQQSQAEAKPAQSELTAQDILAASTVQTELSAGNQDTGKSRWYIV HTYSGFEQRVEATIKEMMRNAQDNGLIHEVVVPTEKVIELGKGGAKRTTTRKFYPGYIML RMTMTDFSWHLVQSIPKVTGFVGGKNRPAPMKDEEAARILSLMETRQEQPRPKFSFERGD DVRVIDGPFGGFNGVVEDVNYDKGKLRVSVSIFGRQTPVELDFIQVSKG >gi|316922873|gb|ADCP01000088.1| GENE 7 7745 - 8167 637 140 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|220904897|ref|YP_002480209.1| ribosomal protein L11 [Desulfovibrio desulfuricans subsp. desulfuricans str. 
ATCC 27774] # 1 140 1 140 140 249 87 8e-66 MAKKEVAKIKLQIPAGAANPSPPVGPALGQHGLNIMQFCKEFNARTMEQKGMIIPVVITA YADRSFTFITKTPPAAVLVMKAAKVEKGSGEPNRTKVGSITMQQVEEIAKLKLPDLNANG MEAAVKSIMGTARSMGIEIK >gi|316922873|gb|ADCP01000088.1| GENE 8 8258 - 8962 725 234 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148263131|ref|YP_001229837.1| 50S ribosomal protein L1 [Geobacter uraniireducens Rf4] # 1 226 1 226 233 283 59 5e-76 MPFGKKYRKAVEGLDLSQRFSVEEAVEKSLGASFAKFDETVDVAIGLGVDPKYSDQMVRG AVSLPHGLGKTVRVAVFCKGEKEAEARAAGADIAGAEELVAKIKEGWLEFDAAVATPDVM ALVGQVGRVLGPRGLMPNAKTGTVTFDVATAVKELKAGRVDFKVDKAGVLHAPLGKVSFG PEKILGNLKALVETVNRLKPSAAKGTYMKNMAVSTTMGPGFKVDMTLIKKFIEG >gi|316922873|gb|ADCP01000088.1| GENE 9 9146 - 9664 632 172 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|218886548|ref|YP_002435869.1| 50S ribosomal protein L10 [Desulfovibrio vulgaris str. 'Miyazaki F'] # 1 172 1 172 173 248 73 3e-65 MNRSEKAALIAQIKAKADAASFVVVTDFKGMTVEELTRLRAKLYECGGEYLVVKNTLARI ALTDGMHDSVKDMFKENCGIALTSQDPVAVAKAVSEFAKTSKLFTVRHASLEGKMLSAAQ VDALAKLPGKQELLGQVLGTMNAVPTNFVSLFANMVRPLMYALKAIEEKKAA >gi|316922873|gb|ADCP01000088.1| GENE 10 9710 - 10090 542 126 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|46581331|ref|YP_012139.1| 50S ribosomal protein L7/L12 [Desulfovibrio vulgaris subsp. vulgaris str. 
Hildenborough] # 1 126 1 127 127 213 90 9e-55 MAVTKEEVVEFIANMTVLELSEFIKELEEKFGVSAAAPAAAMMVAAPAEAAAPAEEKTEF DVILKSAGANKIGVIKVVRALTGLGLKEAKDKVDGAPSTLKEAVSKEEAEEAKKQLVEAG AEVEVK >gi|316922873|gb|ADCP01000088.1| GENE 11 10477 - 10770 171 97 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDSVFGIFFGALAIIAFGACIGAGLRGRPGRLFGLAILKIGLACLFVLLLFVDSFPHNEI ILWLVLMLMAGALICFGGACWLAARSMGKDDEPPGQD >gi|316922873|gb|ADCP01000088.1| GENE 12 10994 - 11332 152 112 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFWNDQRGAHGRSVYRFFLALGVPLVFGACIGAAFRGRFGRGLGLVTLALGLAGLLYGAL TFTTYRDGYGLFAWVLQILMAGGVFCFGGACWLAARRKRQGVEKTPWDWRQR >gi|316922873|gb|ADCP01000088.1| GENE 13 11479 - 11700 57 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPLLGTQPSDGSGPSPVSKSLQDERRSLEEREGRTPGKESPFSLSLGSQPFHYSKTMPQM KIFRFVRKKSLPL >gi|316922873|gb|ADCP01000088.1| GENE 14 11928 - 16091 3847 1387 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 3 1359 14 1391 1392 1486 56 0.0 MGQLTKQFGKIKVEVEIPHLLNLQVDSYRKFMQEGRTERLPDEGLEGVFRSVFPIEDFNR TASLEFVSYEVGEPKYDQAECISKGLTYESPVRIKVRLVVYDVDDETEARTVRDIKEQDI YFGTLPLMTEKGTFIINGTERVIVNQLQRSPGIIFEHDGGKTHTSRKVLYSCRVIPMRGS WLDFDFDHKDILYVRIDRRRKMPATILLKAMGMSKTEILDCFYKKEIYELEDDGRVSWHF NPELHRKEAAFADIVDKEGNVLVKAGKPITKRAWRQMEAAGVEKIEVAPETPLGLFLAED LVDNATGEILAEAADEITAGLFDACREAGITTISALHTRGTDTSSSIRDTLVQDKTPTME KAQEEIYRRLRPSSPPTPEIAASFFDNLFRNGDYYDLSPVGRYKLNQRLNKTDHADLRTL ADDDILDAIKVLVQLKDSHGPADDIDHLGNRRVRLVGELVENQYRIGLVRMERAIKERMS IQEVATLMPHDLINPKPVAAVLKEFFGTSQLSQFMDQTNSLSEVTHKRRLSALGPGGLTR ERAGFEVRDVHVSHYGRICPIETPEGPNIGLIVSLTTYAKVNDYGFIETPYRVIRDGYMT DEFVHLDASRETGHVIAQANAAVDADRRLVDDYVTARVGDDVLMAAREEITLMDISPSQM VSISAALIPFLEHDDANRALMGSNMQRQAVPLLRSERPLVGTGMEVDVARDSGACILAES SGVVHYADANRIIVAYDDPALYPNMGGVRAYDLLKFHKSNQNSCFGQVPSCVPGQIVKKG DVLADGPAIHDGGLALGKNMLIAFMPWCGYNYEDSVLISERVVKEDIYTSVHIEEFEVVA RDTKLGPEEITRDIPNVGEEMLRNLDESGIIRIGAPVKPEDILVGKITPKGETQLTPEEK LLRAIFGDKARDVKNTSLKVPPGVEGTVIDVKVFNRRSGEKDERTRNIEDYEISRLDAKE 
QDHIRAITRRMRERLLPIVDGKQIATTLLGDKKGEVLAEAGAAMTEELLMALPVKKLADL FQSKEVNENVGELLVQYDHQIEYIKSIYDSKREKVTEGDDLPPGVIKMVKVHIAIKRKLS VGDKMAGRHGNKGVVSRILPEEDMPFFADGRPVDIVLNPLGVPSRMNIGQIMEVHLGWAA RELGRQLAEMVDKGTALQSVRDEVKDIFASPEIAAEVDGMDDEEFRKSVLKLRNGIITYT PVFDGAEESEIWNWLSRAGLDDDGKSVLYDGRTGEKLENRVTTGVMYYLKLHHLVDEKIH ARSTGPYSLVTQQPLGGKAQFGGQRLGEMEVWALEAYGASYLLQEFLTVKSDDVTGRVKM YEKIVKGDNFLEAGLPESFNVLVKELMSLGLNVTLHQEEGKKRPKRVGFAEEEHQQYLQQ QQDMADE >gi|316922873|gb|ADCP01000088.1| GENE 15 16351 - 20514 5812 1387 aa, chain + ## HITS:1 COG:PA4269 KEGG:ns NR:ns ## COG: PA4269 COG0086 # Protein_GI_number: 15599465 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Pseudomonas aeruginosa # 16 1352 10 1362 1399 1654 60.0 0 MSLDDLFSVRSSANANASQIRNLKAIQISIASPENIREWSYGEVKKPETINYRTFKPERD GLFCAKIFGPVKDYECNCGKYKRMKHRGIVCEKCGVEVIASKVRRERMGHIELAAPVAHI WFLKTLPSKIGTLLDMTMADLEKVLYFDSYIVLDPGSTNLQKMQVISEDQYLQIIDHFGE DALTVGMGAESIRGLLEELNLEEIKVLLREESQTTKSQTKKKKLTKRLKVVEAFLESGNR PEWMILEVIPVIPPELRPLVPLDGGRFATSDLNDLYRRVINRNNRLKRLMELGAPDIIIR NEKRMLQEAVDALFDNGRRGRAIAGTNGRPLKSLSDMIKGKQGRFRQNLLGKRVDYSGRS VIVVGPYLKLHQCGLPKKMALELFKPFIYSELEKRGHASTIKSAKKMVEREELVVWDILA DVVKEYPVLLNRAPTLHRLGIQAFEPLLVEGKAIQLHPLVCSAYNADFDGDQMAVHIPLS VEAQIECRVLMMSTNNILSPANGTPVIVPSQDMVLGLYYMTVERSFEKGEGMVFCAPWEV EAALDADQVSMHARIKVRMEDGKLTNTTPGRILVWRGLPAGIKFENVNKELTKKNIAKLV DSAYRDAGVKACVILCDRIKAIGYEYATRSGISIGVKDMLIPDSKKRIIDAASAEVDNIE SQYRDGIITRTEKYNKVVDVWTKATQDVSAEMNKEISTDILTDPKTGKQEANLSFNPIFM MSNSGARGNADQMRQLAGMRGLMAKPSGEIIETPITASFREGLTVLQYFTSTHGARKGLA DTALKTANSGYLTRRLVDVVQDVIVSEHDCGTVDGIEIRAIKDGGEVKQKLSERALGRVL LYPVYDPATGDLLYPENTLIDEPVAKTLDDKGINSVMIRSALTCQSERGICSLCYGRDLA RRHLVNIGETVGIIAAQSIGEPGTQLTMRTFHIGGTASKEIERSNFEAQHDGRVVLTRVK EIRNREGVAQVLGKSGQVAVIDEQGREVERYTLPNGARLHVTDGQSVSNGQLLAEWDPFN EPFVSEVDGFIKFTDLVDGKTYQEKLDEATHQASMTIIEYRTTSFRPSVSVCDENGDPKL RGAGQLPAVYSLPVGAIIMVKDGEQVLAGDIIARKPRETSKTRDIVGGLPRVAELFEVRK PKDMAVVSEIAGTVSFAGEAKGKRKLIVTPEVGESKEYLVPKGKHITVSDGDFVECGDLL 
TEGNPELHDILRTKGEKYLAAYLVDEIQEVYRFQGVGIDDKHIEVIVRQMLRKVTVTEPG GTSFLVGEQVDKAEFKAENQKAMSEGRSPATAEPLVLGITQASLTTSSFISAASFQETTK VLTEAALKGKMDYLRGLKENVIVGRLIPAGTGYREYVDQDITVPDQKERPDRFLDELSGN PIIADLD Prediction of potential genes in microbial genomes Time: Fri May 13 03:41:18 2011 Seq name: gi|316922864|gb|ADCP01000089.1| Bilophila wadsworthia 3_1_6 cont1.89, whole genome shotgun sequence Length of sequence - 11322 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 36 - 95 2.8 1 1 Op 1 1/0.000 + CDS 221 - 1336 1410 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 2 1 Op 2 . + CDS 1610 - 3340 1275 ## COG0728 Uncharacterized membrane protein, putative virulence factor + Term 3552 - 3593 -0.5 + Prom 3461 - 3520 3.3 3 2 Tu 1 . + CDS 3640 - 5670 2459 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily + Term 5737 - 5770 3.6 4 3 Tu 1 . + CDS 6027 - 6743 600 ## + Prom 6898 - 6957 3.4 5 4 Tu 1 . + CDS 7000 - 8694 2262 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] + Term 8878 - 8916 3.2 6 5 Op 1 . + CDS 8967 - 9857 1157 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase + Term 9873 - 9914 7.1 7 5 Op 2 . + CDS 9938 - 10330 422 ## COG0139 Phosphoribosyl-AMP cyclohydrolase 8 5 Op 3 . 
+ CDS 10323 - 11201 1475 ## COG0040 ATP phosphoribosyltransferase Predicted protein(s) >gi|316922864|gb|ADCP01000089.1| GENE 1 221 - 1336 1410 371 aa, chain + ## HITS:1 COG:FN1481 KEGG:ns NR:ns ## COG: FN1481 COG0343 # Protein_GI_number: 19704813 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Fusobacterium nucleatum # 1 370 1 371 373 430 53.0 1e-120 MNTPGTFQIHATDGAARTGVLQTAHGPVNTPIFMPVGTVGSVKAIAPDDLNAIGAEIILG NTYHLYLRPGDELVARRGGLHGFNAWDKPILTDSGGFQVFSLSTLRTIKEEGVTFRSHLD GSKHLFTPEKVVSIQRNLNSDIMMVLDECVPAGADHTYTARSLTMTTRWAKRCRDAYPKG TAGNLLFGIVQGGMFKDLRSESARQLIDLDFEGYAIGGLSVGESKDVMMEMLYHTAPILP SEKPRYLMGVGTPLDIIKGIDAGVDMFDCVLPTRNARNGTLYTSLGKLNIKRREFAEDDG PLDPACSCYTCRTFSRAYLRHLYVSQELLSFRLNSIHNLTYFLDTVRGARAAIREGRWAE YKRRYEELYDV >gi|316922864|gb|ADCP01000089.1| GENE 2 1610 - 3340 1275 576 aa, chain + ## HITS:1 COG:YPO2043 KEGG:ns NR:ns ## COG: YPO2043 COG0728 # Protein_GI_number: 16122282 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, putative virulence factor # Organism: Yersinia pestis # 3 295 14 306 511 134 32.0 4e-31 MGTLISRLLGFVRDAGIAWLLGGSGAADALTAALRIPYMARRLFGEGTLSLSLTAACTRE RLRGGSGCGLALAVTRKLALWTGFLALACMAGAGIIMRAIAPGLEERPEVFGEAVTLFRI CAPYIWSVMMAAGCMAALHSRQRFLLPSLTPSLFNLCVIGFALLAAFNPSLQPGVLVACG VLCGGILQWLAQIPAIRILQREEGKRGKPADARTVSEAFRRLPAGIVGAAMPQLAFLGAS ALASLLPEGHMASLFYAERLLEFPLGVLGAAVGMAAAPRLAELAASEGLSRSSRFHEIPS FSLSQPQKPEQADPPPPLSASRTARNDTKAIPGARLQKPWDGEGMEFEDGEPFFKRGEPP HGSLSPSPSSLPTAPPHSFSDEIQRAALLSLGLNLPAAAGLAAISLPLVAVVLGHGAFDA QAVSATALALCAYAPGLPAYALSRPLLAACHALESGLPLKAAAIALAVALAGGYALTLRF GAWGPPLGVSVGLWCNAALLWIGLSRGVSLRLALRSLAVQLAGTALTFGSAYGVVLWAGH ASNIAQLALAIPAGAAVYAASLLIGDRNWFRLLKKR >gi|316922864|gb|ADCP01000089.1| GENE 3 3640 - 5670 2459 676 aa, chain + ## HITS:1 COG:PA1689 KEGG:ns NR:ns ## COG: PA1689 COG1368 # Protein_GI_number: 15596886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol 
transferase and related proteins, alkaline phosphatase superfamily # Organism: Pseudomonas aeruginosa # 6 596 15 608 700 150 27.0 1e-35 MQKTSFFPPFIRTLSVLFIGTLLALVLARGAMLGILWPEVAGNPASDILNALYIGTKFDI RMVVLGLIPIALILGIPPLERLLVRSRGFRGLIDLLYAAVFAAATLVYIIDFGWFFYMHT RVDASLFELLGDPDIAGTMVWESYPVVWITFGLIVVTALYALFFGKTLARHEATPRLGWK GRTGWSVATFVILFLLGYGQISSNLFPLRWSNAFFSVDKNISILALNPIQNLFDTVNAAR GTRPDIEATRESYPRVAAWLRVPNPDPQKLDFMRTVPGTPAERQEKPLNVVVIIMESMTW PRTSFSPNLTGIPEDTTPNLMALSKDSLYYPLFFAPTRTTARAIFTTMTGIPDVNRSGGT SSRNQALVDQALMMNEFKGYSKYYMIGGSASWANIRGFLSHNIEGLHLLEEGSWKAPNTD VWGLSDLDLFREAAAALTKSPKPFVAVIQTAGFHRPYTIPEDNAGFQIKQPSEAILKNYG FTGADEYNSLRFSDHSLGEFFKIARQQPWFDNTVFAIFGDHGLNDTSLNMSPGYLACRLQ SNHTPMLIYAPGLVAAGKLQPGVDGRPCGQPDIFPTLASLAGIPYRYNAMGRNLLDPDTK RDEMQFLGGETESTVRLVENGYCYIRETDEHMYKMDAPVLEDLLNTDPERAAHMRQYAQD FYNVSKYLLYNNKKFD >gi|316922864|gb|ADCP01000089.1| GENE 4 6027 - 6743 600 238 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPHPTIHAVCLASSGRALSGTCSGPAAFVNNRKKTSLHESILSGIKARFGALARYAAALD EKTAARLSAADHPQQAAKQLETLLRLPETGTLLHFEAGDGGFLEAFHAARPGWELSAIES GDAFMRLLKKAFLRSAYNADYRDADIVSRFDIIAVTKPLDEVEKPLNAVRWLSRRLADNG TLFLWQPALRMGTGLSLHGLPIRIANLTALCKTAGLSVDTITTEGEVVCLCARNTPQR >gi|316922864|gb|ADCP01000089.1| GENE 5 7000 - 8694 2262 564 aa, chain + ## HITS:1 COG:MA3792 KEGG:ns NR:ns ## COG: MA3792 COG0028 # Protein_GI_number: 20092588 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Methanosarcina acetivorans str.C2A # 2 564 8 561 564 617 54.0 1e-176 MQLNGARILIECLIREGVNIIFGYPGGSVLDIYDELPQYGEQLRHILVRHEQAAVHAADG YARATGKVGVCMATSGPGATNTVTGIATAYCDSIPMVIFTGQVPTGLIGNDAFQEVDIAG ITRPCTKHNYVVKDVKDLAKVVRQAFYLARSGRPGPVLVDLPKNVLQGSAEFVWPEDVSL RSYNPTYRPNYAQIRKVVDALYNAERPLLFGGGGVIMSNASEEFRNLAKNLSIPVTCSLM GLGAFPCDHPQWLGMLGMHGTYAANKAVSNADLLIAVGVRFDDRVTGRLSSFAPKATIVH 
IDIDPTSIHKNVNVRIPVVGDCRNSLIAIREHLAAREPLPWKERFAAWQGQLDEWKQEQP ITWTPGETTIKPQEVIKAVDKITEGKAILTTEVGQHQMWAAQLYNFTEPRTLLTSGGLGT MGYGFPAAIGAQFAYPNKLVIDVSGDGSFQMNLQELITAVSNKLPVKVLLLNNGCLGMVR QLQDLFYGGRLNAVDIPDQPDFVKLAEAYGAEGYRVAKLEDLESTLRTAFASPNTAIIDV IVERDENVYPTVPSGASLDEMLLL >gi|316922864|gb|ADCP01000089.1| GENE 6 8967 - 9857 1157 296 aa, chain + ## HITS:1 COG:RSc1145 KEGG:ns NR:ns ## COG: RSc1145 COG0329 # Protein_GI_number: 17545864 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Ralstonia solanacearum # 8 295 6 293 294 287 49.0 2e-77 MTTGRFHGAFTALVTPFKNGKIDEEAYREFIEWQIEQGIDGLVPCGTTGESATMTHDEHE AAIRICVEQVNKRVPVIAGAGSNNTREAIPLTQFAKNVGADAALHITPYYNKPTQEGLYQ HFKTICSEVSMPVILYNVPGRTGCNMLPPTVARIARDVPDVVGIKEATADLIHVSDLVEQ CPEGFQILSGDDFTVLPHIAVGGCGVISVTANVAPKLMADLCKATLAGDMDEARRLHFKL MPINRAMFLETNPIPVKTAVCMMRGLDLEFRLPMVPLMPDNLEKLTKILNDCDLLH >gi|316922864|gb|ADCP01000089.1| GENE 7 9938 - 10330 422 130 aa, chain + ## HITS:1 COG:MA0908 KEGG:ns NR:ns ## COG: MA0908 COG0139 # Protein_GI_number: 20089786 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosyl-AMP cyclohydrolase # Organism: Methanosarcina acetivorans str.C2A # 16 128 12 120 120 140 63.0 7e-34 MESSLAFAPDFSKSGGLIAAVAQDAETLDVLMLAWMNPEAWEKTLETGEAHYYSRSRKCL WHKGGTSGHVQKVKSIRLDCDGDAVVLLIEQIGGAACHEGYRSCFSRELRDGKVSRCSEL VFDPKEVYNG >gi|316922864|gb|ADCP01000089.1| GENE 8 10323 - 11201 1475 292 aa, chain + ## HITS:1 COG:MA0217 KEGG:ns NR:ns ## COG: MA0217 COG0040 # Protein_GI_number: 20089115 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 5 292 1 288 289 151 33.0 2e-36 MANPILKLGLPKGSLEDPTIDLFDRAGWKVRKHPRNYFPDINDPEITARLCRVQEIPLYI QDGVLDLGLTGKDWVKETKADVVVVSDLIYSKVSNKPARWVLAVADDSPYQKPEDLAGKR IATELMGVCTEYFKERNIPVNINYSWGATEAKVVEGLADAIVEVTETETTIRAHDLRVID 
EVLVTNTVLIANAAAMQDPEKRRKIEQIDLLLQGALRAESLVCLKMNAPAAKLDAVLALL PSLNSPTVAPLQNGEWLSLETVVSTGIVRDLIPQLREAGAEGILEYALNKVI Prediction of potential genes in microbial genomes Time: Fri May 13 03:41:34 2011 Seq name: gi|316922857|gb|ADCP01000090.1| Bilophila wadsworthia 3_1_6 cont1.90, whole genome shotgun sequence Length of sequence - 7134 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 6, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 269 - 769 195 ## COG1576 Uncharacterized conserved protein 2 2 Tu 1 . + CDS 852 - 1505 661 ## COG0009 Putative translation factor (SUA5) + Term 1508 - 1575 14.9 - Term 1632 - 1676 10.4 3 3 Tu 1 . - CDS 1861 - 3162 579 ## PROTEIN SUPPORTED gi|227993272|ref|ZP_04040336.1| SSU ribosomal protein S12P methylthiotransferase - Prom 3232 - 3291 2.9 + Prom 3451 - 3510 2.4 4 4 Tu 1 . + CDS 3558 - 4238 413 ## PROTEIN SUPPORTED gi|165937309|ref|ZP_02225873.1| non-canonical purine NTP pyrophosphatase RdgB + Term 4315 - 4348 -0.4 + Prom 4249 - 4308 2.1 5 5 Tu 1 . + CDS 4383 - 5717 1848 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase + Prom 5741 - 5800 1.5 6 6 Tu 1 . 
+ CDS 6045 - 7118 1389 ## COG0820 Predicted Fe-S-cluster redox enzyme Predicted protein(s) >gi|316922857|gb|ADCP01000090.1| GENE 1 269 - 769 195 166 aa, chain - ## HITS:1 COG:SP2238 KEGG:ns NR:ns ## COG: SP2238 COG1576 # Protein_GI_number: 15902041 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 15 165 3 158 159 97 37.0 8e-21 MLSPQQQGIMTLKPLRIVAVGRLKTPFWKAAAEHYVERLVRWRAFTESIIKDGDPSLPIV DRNALEGKGILSALSASDIPVVLDERGKTFTSRQFSVFLERLSENASRRPCFIIGGAFGL DDSVKQAASHLIALGPLTMTHELARVVLFEQLYRAESLTRGLPYHH >gi|316922857|gb|ADCP01000090.1| GENE 2 852 - 1505 661 217 aa, chain + ## HITS:1 COG:TM0852 KEGG:ns NR:ns ## COG: TM0852 COG0009 # Protein_GI_number: 15643615 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Thermotoga maritima # 17 214 17 213 335 123 36.0 3e-28 MASSQASGVNTPPYVDFKRAVELMLDGHPVIFPTETFFALGSRALDADATARVYRAKHRS TVRPLPLILGDWEQLDMIAKVHDELMPLLKRFWPGPLSVIFPARLRVPDILTGGTGRVAV RLSSHPVARALAQAVGEPITSSSANISGNPAVVSVNQLDEELIASVMGILDEPPVPAGGL PSTLVERMEDGRLRLLRAGAVTSAQIAEAGFEVVEEE >gi|316922857|gb|ADCP01000090.1| GENE 3 1861 - 3162 579 433 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227993272|ref|ZP_04040336.1| SSU ribosomal protein S12P methylthiotransferase [Meiothermus ruber DSM 1279] # 19 429 1 415 442 227 37 2e-59 MSVSLRCYSTSLGCPKNRVDTERLLGSLGVALVPVESPDEAELVFINTCAFIQPATEESV RTIAQAIADIEELPKRPLLAVAGCLVGRYHAEDLAPELPEVDLWLDNSDLEAWPAMLARK LGLPEPTVPGRILSTGPSYAWLKISDGCRHACSFCAIPNIRGGHRSHCKDMIVREARALL DQGVKELNLVAQDLTEWGCDLNGKQDLRTLLDGLLPLQGLERLRLMYLYPAGMTHEMLKY LREAGKPFVPYFDVPVQHSHPDVLSRMGRPFARNPREAIDRIRNVFPEAALRTSIMVGFP GETEEHFEHLREFIEEVRFQHLGVFAYRAEEGTPAAAMPDQVEDRVKEWRRDLLMEAQAD ISAEWLSRFEGDRLPILVDAPHPEWPGLHTGRAWFQAPEIDGMVYISGPGVEPGALVDAD IVECRDYDLVALA >gi|316922857|gb|ADCP01000090.1| GENE 4 3558 - 4238 413 226 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|165937309|ref|ZP_02225873.1| non-canonical purine NTP pyrophosphatase RdgB 
[Yersinia pestis biovar Orientalis str. IP275] # 6 211 14 217 223 163 48 3e-40 MRKEQTMSEQPHDTEPVTIVLATRNQGKVRELAEPLRAFGLRVVGLDAFPDLPEVEETGT TFEENALLKAREVSKRTGLVAIADDSGLEVDALNGAPGVYSARYSEDMPDLPGATKDERN TMKLLAALSSVRLWNRSARFRSVVAVCTPEGETLIAPGTWEGSVACSPRGKNGFGYDPVF LDPELGLTAAEMSPEEKMSRSHRAKALRELLRLWPSFWEGYLRKRA >gi|316922857|gb|ADCP01000090.1| GENE 5 4383 - 5717 1848 444 aa, chain + ## HITS:1 COG:aq_2158 KEGG:ns NR:ns ## COG: aq_2158 COG0617 # Protein_GI_number: 15607098 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Aquifex aeolicus # 131 444 151 507 512 104 26.0 3e-22 MDQATLMDKIKTGTTLCKAIMRNGYDAYAINAPLQGLIFEKTGVFDVDIACACDTETLFK IFPNAVPYTEEGAMAMLQENGVTLRFYSTDVEDASHPEMSQLRITPRLARLLAELQLKGQ LKSSNTNDPEAKEFEDFDKAGSVKLVGLPSKTLAKNYLLAIRALRYAANFDLPVNPHTWV SIIQSANRILDYVPAREIMEEWRKVAAESMWKFVQLLFEAQLLHGLMPEIAALSCIKQER NDDGDIESVFDHTIECMKRYPEEGFSMDWYGTLATMFHDVGKLYTAEHLNGRWTFYQHHR VGAGVTRKILRRLHFTSEDIDLICHLVRHHMMFHFMLTDRGIRRFKALPETERLIAMCRA DIEARDGSYTYFNHNSKYLTRAETDELLLEPLLNGNEIMEYTSLPPGPAVGTIREALLKA QVAGEVTDRDSAIAFVKNYAAKLK >gi|316922857|gb|ADCP01000090.1| GENE 6 6045 - 7118 1389 357 aa, chain + ## HITS:1 COG:CC0134 KEGG:ns NR:ns ## COG: CC0134 COG0820 # Protein_GI_number: 16124389 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Caulobacter vibrioides # 1 351 24 377 404 283 43.0 3e-76 MIDILNLTFPELERFIVEDLGQPKFRAAQVWQWIWQKHATSFDAMTDVSKQLRAKLAEVA EIVLPEIVTVQTSSDGTEKLLLRLRDGALVETVILPSTGQDGSVRIAQCVSSQIGCAMGC TFCSTGTMGFIRNMTAGEILSQVLVARMRLGDNRIDHPIIRNLVFMGMGEPLLNLRETTR ALEMLNHDKGMDFSPRRITVSTCGIKAGLRELGDSGLAFLAVSLHAPNQDLRAKIMPKAA NWHLDDLMATLESYPLKTREHITFEYLLLGGVNDQPEHARELAKLVSRVKGKLNLIAYNP SETQLYKAPTEADILAFEKILWSKGVTAILRKSKGQDIKAACGQLKSDWEREKGDEE Prediction of potential genes in microbial genomes Time: Fri May 13 03:42:00 2011 Seq name: gi|316922812|gb|ADCP01000091.1| Bilophila wadsworthia 3_1_6 cont1.91, whole genome shotgun sequence 
Length of sequence - 51019 bp Number of predicted genes - 45, with homology - 34 Number of transcription units - 23, operones - 11 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 145 - 181 -0.4 1 1 Op 1 . - CDS 392 - 1516 1301 ## COG0077 Prephenate dehydratase 2 1 Op 2 4/0.000 - CDS 1519 - 2511 1241 ## COG1465 Predicted alternative 3-dehydroquinate synthase - Term 2555 - 2590 3.1 3 1 Op 3 . - CDS 2708 - 3511 978 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes - Prom 3653 - 3712 5.1 + Prom 3759 - 3818 3.5 4 2 Tu 1 . + CDS 4029 - 4478 599 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 4486 - 4529 10.0 5 3 Op 1 . + CDS 4842 - 5078 230 ## 6 3 Op 2 . + CDS 5163 - 5594 480 ## - Term 5741 - 5785 8.2 7 4 Op 1 . - CDS 5807 - 8209 2841 ## COG0457 FOG: TPR repeat 8 4 Op 2 2/0.000 - CDS 8212 - 8991 1098 ## COG0107 Imidazoleglycerol-phosphate synthase 9 4 Op 3 . - CDS 8985 - 9632 537 ## COG0118 Glutamine amidotransferase + Prom 9824 - 9883 3.8 10 5 Tu 1 . + CDS 9978 - 10688 750 ## COG0797 Lipoproteins + Prom 11131 - 11190 5.5 11 6 Op 1 2/0.000 + CDS 11307 - 12623 1568 ## COG2221 Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits 12 6 Op 2 . + CDS 12642 - 14093 1892 ## COG2221 Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits + Term 14101 - 14139 5.4 13 7 Op 1 . + CDS 14216 - 14512 343 ## azo2629 hypothetical protein 14 7 Op 2 . + CDS 14515 - 16218 1556 ## COG1797 Cobyrinic acid a,c-diamide synthase + Term 16390 - 16420 0.4 + Prom 16414 - 16473 2.8 15 8 Tu 1 . + CDS 16525 - 17049 814 ## COG2077 Peroxiredoxin + Term 17132 - 17172 6.2 - Term 17120 - 17159 9.8 16 9 Op 1 . - CDS 17166 - 17450 387 ## 17 9 Op 2 . - CDS 17571 - 19898 2387 ## CHAB381_0287 hypothetical protein 18 9 Op 3 . - CDS 19889 - 20404 655 ## - Term 20420 - 20456 -0.4 19 9 Op 4 . - CDS 20467 - 21240 690 ## Neut_1458 hypothetical protein 20 9 Op 5 . 
- CDS 21240 - 22898 1277 ## HCH_05654 hypothetical protein 21 9 Op 6 . - CDS 22895 - 24031 1070 ## 22 9 Op 7 . - CDS 24031 - 25029 965 ## 23 9 Op 8 . - CDS 25066 - 26256 1327 ## 24 10 Tu 1 . + CDS 26248 - 26499 203 ## 25 11 Tu 1 . - CDS 26442 - 26795 365 ## - Term 26810 - 26868 22.8 26 12 Op 1 . - CDS 27007 - 27969 1124 ## HTH_0895 hypothetical protein 27 12 Op 2 . - CDS 27979 - 28413 289 ## 28 12 Op 3 . - CDS 28425 - 28847 564 ## COG4387 Mu-like prophage protein gp36 - Term 28933 - 28966 -0.8 29 13 Op 1 3/0.000 - CDS 28987 - 29931 1127 ## COG4397 Mu-like prophage major head subunit gpT - Term 30104 - 30132 -0.9 30 13 Op 2 . - CDS 30165 - 31286 1024 ## COG4388 Mu-like prophage I protein - Prom 31469 - 31528 3.1 + Prom 31513 - 31572 5.3 31 14 Op 1 . + CDS 31596 - 32000 491 ## COG3383 Uncharacterized anaerobic dehydrogenase 32 14 Op 2 . + CDS 32094 - 33638 1693 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing + Term 33860 - 33902 -0.6 + Prom 33748 - 33807 2.0 33 15 Tu 1 . + CDS 33933 - 35948 2272 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases + Term 36138 - 36168 -0.4 + TRNA 36229 - 36305 87.5 # Arg ACG 0 0 34 16 Tu 1 . - CDS 36897 - 38405 1286 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 38480 - 38539 3.7 - Term 38888 - 38925 5.3 35 17 Op 1 1/0.000 - CDS 38943 - 39923 1681 ## COG3181 Uncharacterized protein conserved in bacteria 36 17 Op 2 . - CDS 39955 - 41448 1912 ## COG3333 Uncharacterized protein conserved in bacteria 37 17 Op 3 . - CDS 41458 - 41943 468 ## gi|302863691|gb|EFL86622.1| putative tricarboxylic transport TctB 38 18 Tu 1 . - CDS 42049 - 42522 718 ## gi|302863690|gb|EFL86621.1| conserved hypothetical protein - Prom 42548 - 42607 2.0 + Prom 42490 - 42549 5.1 39 19 Tu 1 . + CDS 42586 - 43044 -69 ## 40 20 Tu 1 . 
- CDS 43663 - 45057 1722 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 45130 - 45189 4.8 - Term 45155 - 45192 8.4 41 21 Op 1 11/0.000 - CDS 45219 - 46520 680 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 42 21 Op 2 11/0.000 - CDS 46520 - 47026 604 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component - Term 47034 - 47076 8.1 43 21 Op 3 . - CDS 47086 - 48084 282 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 44 22 Tu 1 . - CDS 48354 - 49187 1043 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold - Prom 49276 - 49335 5.0 - Term 49717 - 49762 5.5 45 23 Tu 1 . - CDS 49808 - 50716 981 ## COG1180 Pyruvate-formate lyase-activating enzyme - Prom 50934 - 50993 2.3 Predicted protein(s) >gi|316922812|gb|ADCP01000091.1| GENE 1 392 - 1516 1301 374 aa, chain - ## HITS:1 COG:aq_951_2 KEGG:ns NR:ns ## COG: aq_951_2 COG0077 # Protein_GI_number: 15606269 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Aquifex aeolicus # 101 370 1 270 277 201 37.0 1e-51 MPESNDTPDGWSEAAAAKQSEALQGLRLQIDAVDRELLALLNRRASLSLEVGRVKANTSG PVFRPQRERQILDNLASENSGPLPEEHLRSIWHEIFASSRTLQRPVSVAYLGPEGTNSYF AAVDLLGNLMDYRPCKNFREAFAAVHGGECALGVIPLENVLQGSVGQCFDLFMEYDVYIQ TESVARIQHTLLSVESSPEAIKVVYSHPQALAQCQRWLRQHLPEASLMACDSTAAAARRA VSEPGSAAIGHKSLSTMLGLPILAYNLEDEPTNQTRFVVIGPKPTDTSAANKTSVLFTLP DKPGSLAGILDLLFKAGVNLRKLESRPLRSQSWKYAFFADLETNLLKPELSGVLAELRNT CHSFRLLGCYTAGE >gi|316922812|gb|ADCP01000091.1| GENE 2 1519 - 2511 1241 330 aa, chain - ## HITS:1 COG:aq_1922 KEGG:ns NR:ns ## COG: aq_1922 COG1465 # Protein_GI_number: 15606938 # Func_class: E Amino acid transport and metabolism # Function: Predicted alternative 3-dehydroquinate synthase # Organism: Aquifex aeolicus # 10 329 10 330 331 231 42.0 1e-60 MPDIYFKSVPFSKEDVTLALESGVDGIITEAEHVEGVRSLALCDVRAEADMPSVGLGSKA EEESIASRIAGGERLVLAQGWEIIPVENLLAQPAVAGKLAVEVASLDEARLAAGVLECGV 
PVVVVLPEALASLKNIVSELKLSQGTLTLDKATITEVKTVGLGHRVCVDTLTLMERGQGM LVGNSSAFTFLTHAETEHNEYVAARPFRINAGGVHAYAVMPGDKTCYVGELRSGDEVLIV DKDGRTSLATIGRIKTEVRPMLMITAQMETPEGTRTGSVFLQNAETIRLVRPDGTPVSVV ALRPGDEVICRADVAGRHFGMRIQENIREI >gi|316922812|gb|ADCP01000091.1| GENE 3 2708 - 3511 978 267 aa, chain - ## HITS:1 COG:aq_1554 KEGG:ns NR:ns ## COG: aq_1554 COG1830 # Protein_GI_number: 15606691 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Aquifex aeolicus # 1 266 1 264 264 322 60.0 5e-88 MNLGKQVRLQRIFNRETGRAIIVPMDHGVSVGPIEGIENIHKTVSDMADGGADAVLMHKG LCRCCFRASGEGKDVGLIIHLSASTSLSSYSNKKRLVCTVEEAIRRGADGVSVHVNLGDD NESDMLADLGEVARVAEEWSMPLLAMLYARGPRISNEYDPAVVAHCARVGVELGADIVKV PYTGDIDSFADVIAGCCVPVVIAGGPRTETTREFLEMVDNSLKAGGSGLSVGRNVFQHPK RVQLVRALRGLVHQGLSLDEALAVVEG >gi|316922812|gb|ADCP01000091.1| GENE 4 4029 - 4478 599 149 aa, chain + ## HITS:1 COG:MJ0531 KEGG:ns NR:ns ## COG: MJ0531 COG0589 # Protein_GI_number: 15668711 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Methanococcus jannaschii # 5 144 27 165 170 76 35.0 1e-14 MPALKKILCALDLSDQSESVAEYAVMLAKMSGASIVAVYAAPTLTQYTGFHVPPNTIDNF VGEIVSGAERSMTDFVSEHFTGVDARGVVVVGYAAEEILALAESEQADIIVMGTRGRKGI DLILFGSVAEKVVKNATCPVLTIRPTDDD >gi|316922812|gb|ADCP01000091.1| GENE 5 4842 - 5078 230 78 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIPGAFLCALFLLMPYGGWIPALICAFAGALFAPATAALKGLDFGLIPTMLWGFMAGMTA YPGVVLPQAAQPEDEPAG >gi|316922812|gb|ADCP01000091.1| GENE 6 5163 - 5594 480 143 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGFSIRSELASYARLKRRCYFLFVISYGGLLAGLTYAAMTHQSPPLWIGLIPGAALSVPL LLLPTQSWLPCLFHVLLGVVFAPIICARREADLSLIEAVELGAVAGLALIYPFYILCSTL EKEGTLKKYAPDQRRKGSQETEE >gi|316922812|gb|ADCP01000091.1| GENE 7 5807 - 8209 2841 800 aa, chain - ## HITS:1 COG:MA3260 KEGG:ns NR:ns ## COG: MA3260 COG0457 # Protein_GI_number: 20092076 # Func_class: R General function prediction only # 
Function: FOG: TPR repeat # Organism: Methanosarcina acetivorans str.C2A # 585 770 165 349 395 72 29.0 3e-12 MSGLTTGLVRRDLLDLEQRLRDSLYRVLPFEAHAVYFPRQTRSKEPEWLPDEGKLLLPLV RHGELLGVFMARVPDAERVTALLPSLPGIVDLCLENLELYKAGRLDPRSGLATHEVLLER LTQETGAIQSAFAREFGLESGEGSSRGGTLGLIVVRFDGVREVARNISYGFADRLVGKLV EAFSGQLPEQAFGARIGDSAVAVFLPEATRPECENLSAAILRAMEQVRLPDPLIGRQFGV QPHAGYALYPQDMDGTRQRSSEEHGRILLHKAQLAAEVARTRGPGGFQAGRVMGYGRLLL EGGHIRQVMPLSRVLTSLGRSVGAREGQHFSVWSVNYAVKGGSGDESLQPLYKGEIVLLE VRESESVAEILHLGDPAWPLEPDDALTLLQEEQRLSVQNAAPEGQDDGVFHRPDPLTGLL RHGDFLAHLARACSECERFSLALLHVDMARRDGESSGAIQPMTQPEHIMAQVADLARSVC GRKVPGGRFGLNSLIFFHPDLEAEPLRELYEKLCADIASRLGVRAGVGLACWPFLDLRPS DMIEGARKALEYALLLPAPHIGQFGSLALNISADKRHCRGDVFGAIEEYKLALLADEDNV LAWNSLGVCLASLGRHAEARRFFEEAIQRTPDDPALAYNLGAVCQSLHDNEAAAEHFRTC IRLSPSHLYALIRLGQLDEAEERLEEARARFESAAALDTGSPLPYRHLARLALRLGKADQ AREHLHQALLRNPRDVAALSLMADLYLDGGEDPELAESLARQSVALRPEYRNGWLVLSRA LEVQGRLSDAREALLKAGEL >gi|316922812|gb|ADCP01000091.1| GENE 8 8212 - 8991 1098 259 aa, chain - ## HITS:1 COG:MJ0411 KEGG:ns NR:ns ## COG: MJ0411 COG0107 # Protein_GI_number: 15668587 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Methanococcus jannaschii # 1 257 1 271 272 294 53.0 9e-80 MLTKRVIPCLDVRNGRLTKGVKFVGNEDIGDPVETARQYYEAGADEIVFYDITASAEARG IFLDVVEKVASEIFIPFSVGGGISTVQDMRAVLLAGAEKISVNSAAVKHPEIISQGADAF GSQAIVVGMDVKRVPVTPEIPSGYEIVIHGGRKAMGIDAIEWAKRCEALGAGELCVNSID ADGTKDGYELTLTRLICDAVRIPVIASGGAGSPQHMVDAVTEGRASAALIASIVHYGEYT IGQLKQYMADRGVPVRLTW >gi|316922812|gb|ADCP01000091.1| GENE 9 8985 - 9632 537 215 aa, chain - ## HITS:1 COG:PA5142 KEGG:ns NR:ns ## COG: PA5142 COG0118 # Protein_GI_number: 15600335 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Pseudomonas aeruginosa # 2 204 5 209 213 173 42.0 3e-43 MLAILDYKAGNQTSVRRALDHLGIPCVITADPAVVDSAAGVIFPGVGAAGQAMRHLQDSG LDAVLRNVVSSGRPLLGICLGCQIMVAHSDENDTPTLGLVEGRCVRFDEHLEEGGAPIRI 
PHMGWNTISRKQESPILKDVPADAAFYFVHGYYVETAPEKVIATSCYGTEFCAVYGQDGL WAIQFHPEKSGRPGLKILSNFYEWCMARSAGGRVC >gi|316922812|gb|ADCP01000091.1| GENE 10 9978 - 10688 750 236 aa, chain + ## HITS:1 COG:NMB0267 KEGG:ns NR:ns ## COG: NMB0267 COG0797 # Protein_GI_number: 15676191 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipoproteins # Organism: Neisseria meningitidis MC58 # 36 203 61 216 239 114 39.0 1e-25 MIKRAITLLLLALPFYLAAVQQAEGRSPGPMLEVFTGGATFYSDSFHGKKTANGERYNKN EFTAAHRSLPLGTIVRVTNLSNGNNLLVRINDRGPGKKKLILDVSRAAASKLNMIRRGVI SVQVEVVADKRGIPVLRNNAFYLRLASARTLKDAQNKLKTLSSASSSKLTKQTSRNRSGE LKILSERLPNGRTQYFVGQGPFTRYRDAQHALSKARARHTEASVACLPTLVAENTR >gi|316922812|gb|ADCP01000091.1| GENE 11 11307 - 12623 1568 438 aa, chain + ## HITS:1 COG:AF0423 KEGG:ns NR:ns ## COG: AF0423 COG2221 # Protein_GI_number: 11498035 # Func_class: C Energy production and conversion # Function: Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits # Organism: Archaeoglobus fulgidus # 7 429 4 412 418 466 57.0 1e-131 MAIKHPTPLLDQLETGPWPSFVSDIKQEAEYRAANPDNVEFQVPADAPDDLLGVLELSYK EGETHWKHGGIVGVFGYGGGVIGRYCDQPEMFPGVAHFHTVRVAQPSGKYYSTEFLRGLC DIWELRGSGLTNMHGSTGDIVLIGTQTPQLEEIFYDLTHKLDVDLGGSGSNLRTPAACLG QSRCEYACYNTQDACYQLTMDYQDELHRPAFPYKFKFKFDGCPNGCVAAMARSDFAVVGT WKDDIKIDQEAVKAYVAGEFAPNAGAHAGRDWGKFDIEAEVINLCPSKCMKWDGSRLFIN NAECVRCMHCINTMPRALHIGDERGASILCGAKAPVVDGAQMGSLLVPFVPVEAPYTEVK EIIEKIWDWWMEEGKNRERLGETMKRLSFQKLLEVTEIPALAQHVKEPRSNPYIFFKEEE VPGGWNRDLAAYRSRHQR >gi|316922812|gb|ADCP01000091.1| GENE 12 12642 - 14093 1892 483 aa, chain + ## HITS:1 COG:AF0424 KEGG:ns NR:ns ## COG: AF0424 COG2221 # Protein_GI_number: 11498036 # Func_class: C Energy production and conversion # Function: Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits # Organism: Archaeoglobus fulgidus # 14 381 5 368 368 444 55.0 1e-124 MAFVSSGYNPEKPMEGRISDIGPRKYDSFFPEVIKKNFGKWLYHEILEPGVLVHVAESGD KVYTVRCGGTRTMSVTNIREICEIADKYCDGHVRWTTRNNIEFMVTDEATLKALKEDLAG 
RKFAAGSYKFPIGGTGAGVSNIVHTQGWVHCHTPATDASGPVKAVMDTMFDEFKNMRLPA PVRISLACCINMCGAVHCSDIGIVGIHRKPPMIDDQWVDQLCEIPLAVAACPTAAVRPVK SEHDGKKVNSVAIKQDRCMYCGNCYTMCPALPISDGEGDGIALMVGGKVSNRISMPKFSK VVVAYIPNEPPRWNTLTSTIKHIVEVYSENANKYERLGDWAERIGWESFFELTGLEFTHH LIDDFRDPAYYTWRQSTQFKFSELALAAHGGEAHEAASAAEVTAEDKEIVINFLKDKMSR PGAKTKYYFKDFLELFPAKGTRDVKNVLSVLVSEESLEYWSSGSTTMYGLKGAGKQASSE GEN >gi|316922812|gb|ADCP01000091.1| GENE 13 14216 - 14512 343 98 aa, chain + ## HITS:1 COG:no KEGG:azo2629 NR:ns ## KEGG: azo2629 # Name: not_defined # Def: hypothetical protein # Organism: Azoarcus_BH72 # Pathway: not_defined # 1 96 1 104 117 97 46.0 1e-19 MAQNIRLRHAQSGVITTGFYGFSWTTLFFSGFPAIFRGDLITGVIVLILSASSFWLVAII WAFLYNRVYTTRLLERGYVFDDDPEKVREAKRALRIQE >gi|316922812|gb|ADCP01000091.1| GENE 14 14515 - 16218 1556 567 aa, chain + ## HITS:1 COG:alr3934 KEGG:ns NR:ns ## COG: alr3934 COG1797 # Protein_GI_number: 17231426 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Nostoc sp. 
PCC 7120 # 13 556 3 488 514 192 28.0 1e-48 MDAAFPSITAPRLVVSGLSGGGGKTLLSLGLARALTLRGLRVKPCKKGPDYIDAAWLGLA SGRTPTNLDPFFLTDARLRALFCTSFGDADLAIIEGNRGLYDGRDVQGSCSTATLARALG APVLLTLTVTKMTRTAAAVIAGLAHFEPVNLAGVVLNRVASSRHAALIRQSIETYTGIPV LGEIPRLAENPIPERHMGLVSMHDETPDALGRASLLAALDSLAGLIERHMDIDAALRLAR SAPDLRDVEPFWEGGEDAAFREAGTLREGSGNAPQSADATAEDTDLGKTFIAPGDTTQPH NEGVSAESASSGMEPSRPPEPASTRKPSASALIHSGLPPVTPVTIGFVRDAALWFYYEEN FEALRRAGAELVELSLLSPEPWPGDRLDGLYLGGGFPEMVPERLADSPHLAEIREYSMRG MPIYAECGGFMVLCQELQINGKQYPMTGIFPARAEFCPRPQGLGYVEATVEAENPFHPVG ALLRGHEFHYSRCVALGELEPTLRLSPGVGMSGPGHRAKGLAAEGPDNLKSRDGLLVRNT FAAYTHLFAPAVPHWAARFAAACRKNA >gi|316922812|gb|ADCP01000091.1| GENE 15 16525 - 17049 814 174 aa, chain + ## HITS:1 COG:BS_ytgI KEGG:ns NR:ns ## COG: BS_ytgI COG2077 # Protein_GI_number: 16080001 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Bacillus subtilis # 9 173 4 167 167 205 60.0 4e-53 MIEERTGIITFQGNPLTLIGKGVSVGEAAPEFCVLANDLTPRTLADYKGKVLVISVVPSL DTPVCDMQTRRFNTEAAKLSDNVRILTISCDLPFAQTRWCGAAGVDAVETLSDHRDLSFG TAYGVAIKELRLLSRAVFVICANGVIAYEQIVKEVTHDVDFEAALGAVKACLEK >gi|316922812|gb|ADCP01000091.1| GENE 16 17166 - 17450 387 94 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MASLTPTGLETITYGQQGWNAIVTANMQRLNTWLGKLLPLMDSRSRTAGAVPVWNAVNGN WEPSNAVQALQTELEALKSRVAALEAAAEEPPSA >gi|316922812|gb|ADCP01000091.1| GENE 17 17571 - 19898 2387 775 aa, chain - ## HITS:1 COG:no KEGG:CHAB381_0287 NR:ns ## KEGG: CHAB381_0287 # Name: not_defined # Def: hypothetical protein # Organism: C.hominis_BAA-381 # Pathway: not_defined # 125 717 46 627 684 107 22.0 3e-21 MGVGLIIAAVTIGATVATTLFAKGPRSTKGADMAPANLDAFRLTTAEEGTVIPRVFGTVR LPGNLLYYGNLSSEPEYEETTVGGKGGKKKKQKVLQGYHYRMDVWQGIGMGPLELVGVYQ DDRLLTEQGGAISCAEQVWNNGTGAFYPAQAGPYASRLPGVSHIWLRQFYLGFNVSMMPT LHYVVRFCGDIPLEHAILSNGVSPAAIILQLLLDAGASWSSIDKPSFSAAASFWAQKGYG LNIVFSRQKPVRDHIMQVLGAVGGWLIERADGTLSLRAPDPDVAPSATLGEGDFLEGSFS FKRAAWDTTWNDLSGKFTDAAQDYSERAVTANNAASIQLLGLRRKKSVDLTAFTDRSAAQ 
RRIEELRDVESYPAASFSFDVSRDYAHIEQGQILEITHSRFGLSGVRVRVLEVVRGNLSE NRISIQARQVVERLSGTFVPPGETLPDPMAPPTVVYPPVTGIPPWSAPDLSPVPLRHARL FELPRNPETGSEPVVLVLAARELLSEEGVLVERSATNADYSPVGLLSAPWAQYGTLAEAY PASTLAVDDTQGLLYRPYKEDPSFGPVSRTDLFGARRMALVGDELMLFQTVVLEGSETTR LSGVVRGYMNTPVQAHASGAAIWVFRNPGEGNTVQGLPIGTHNIKLRPVSGDEVLGADRV DRLSLAVTGKAMVPWPVAGLRAVRTGDSVAVSWSPVDVAFGGAGTRTENENEPVPAGFSG DFVLTAAGTEHVVNGTSITLTRSGRFALSVTARQLGYVSPAAYVTVESQDGEYVA >gi|316922812|gb|ADCP01000091.1| GENE 18 19889 - 20404 655 171 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHYFDNEAAWEAFRAEAESWLGTPYRHLQRCRGRGADCTLFVGQALLDAGLLTRLEYDYY PRDWHEHTRDEYVLEASHRHMRDYLRPGLEMASLPVGTPLLRGDWLAFSTTERRVTNHCG LAWPCADGGFQMLHAINDRGVSFTPLGNWWLRRMTRHFRIVIAEAEVAAWA >gi|316922812|gb|ADCP01000091.1| GENE 19 20467 - 21240 690 257 aa, chain - ## HITS:1 COG:no KEGG:Neut_1458 NR:ns ## KEGG: Neut_1458 # Name: not_defined # Def: hypothetical protein # Organism: N.eutropha # Pathway: not_defined # 23 251 4 229 236 100 31.0 7e-20 MRTVLPQIVELYEIIADTIHTLTTSREPVLWRGAEWTPAAIERGELRGSVDGKAATVTLT ACMDTLLTRYLASLPILPTRVNIYEHEPESGDTVQVFGGVVLEVIPGADATQVSVNCLSS SCILDAMLPRMIYSGQCQFVLFDSGCGLAAQSWGVTASVSVDGYLLASPSFAAYPNGYFT RGYAAAGGDFRFIVAHGGDTITLQLPFDSRVGNGGTVTAYPGCDGSPATCRDRFGNSARF GGCASIPSRNPAVWGFK >gi|316922812|gb|ADCP01000091.1| GENE 20 21240 - 22898 1277 552 aa, chain - ## HITS:1 COG:no KEGG:HCH_05654 NR:ns ## KEGG: HCH_05654 # Name: not_defined # Def: hypothetical protein # Organism: H.chejuensis # Pathway: not_defined # 31 551 42 562 563 95 24.0 6e-18 MIFEVLWPRAGGVPECSYSVPGTSGAWVGETPHRGDEVRHEATPLPRPSGHRMAESVADL IYGQIHLYPSEVHLGLLSGMEALDVVLWNATFAPVQLVGVNSASSAGTTLSGFRPGVLPP TGALKGLLTVLSSGPAQQDTTYTFVTGIGERSLTITASRVLLFPFWPDWSDGLEIDYAFD TVLTRGENGDEQRRPLAKRPLRTLRATIWGDGVNGQRLHHLVQHGKDRVFGVPLWQEALD VTDIDATRQVLMLGRDFSDCWNLTRLCNLIMLHERRSGTFMACSLLSRDPAGRTLSVTAP VSDVFAGGATRLVPLFTGILTSAEPAVVSDGMETWSVEFRELAGAQPALGALPNAPSGTG AYLWPHRPDWSGDGVGGTSSLLRTLRTVRGGVMELGVRRDVAPSTHRQKYLLRERELADL LDRATALRGRWKALLVRDPRQYFTLTRGSTADKAILYVRDNGAKDGFIPNQRLWIALPGG 
DVLLRRLVAVETANVGELALHLSSELGVTVPPASRVGRDYFARLDTDCVAIRHESAGVSR CELSFCTIPEES >gi|316922812|gb|ADCP01000091.1| GENE 21 22895 - 24031 1070 378 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAQPCKVFSYTNVGTEKGVLANIASAAQAGGWTVDKNAVDADGELYLHSAGGGNRRLFFS LRLLQAHDNAERFLLAVHGNTGFDASAAWDAQPGRFTERLAHGYCSRTTGKPIWLRTPGK YSITSTGWWILPPVAEQIVLVCPTFVMTAMRVVYTFSDGANTPYSGWVPLMFGAADGFDA ETELNMVLWSAWSANSAMGLMLSALYVAQRDVNNEYYFCNNNYGNVGLLWKGANAEKFPP EGTYGYRTSPLASSVVRTSVTVRNVIGRVNTDYCLTGTVNAGGKTINYTHKGGVCASVPQ YNAALLQNSGTLRHMLIKPLLYVCNGADVRLAGELPYWAVNLHGLKPKDRISIGSRVFMV LPDISDSDEIGLAVEVEA >gi|316922812|gb|ADCP01000091.1| GENE 22 24031 - 25029 965 332 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAFSGKITAANHADLLAKVTNFITGDPGTPGRDWTVARQDSLTWGPATVFRNTGLSGSEE VYVGLCAATYTDGVKGGLVCKVYKAFDSAPGGTGFLDTAYGNGTGQNGTHAFLPCWNAAM NVWIWSNKARVVIVAECNGVYANAYLGQLRRFSLPSENPWPLACLTDGYTSLWQYTTWHD AITSTGSRDADLHRRNLAFVRRGSFPLATTFSYYHACHQICRPDGVWTSHFAICPTTSLL GSSSGYETDMKIDTQGTGIVFPEGTPRLLLPMYVIQLENTGDYTGASAIGEMYGVRWAPD SLAGIESDVDGYILFPDVNRVEWHSFMAIGDE >gi|316922812|gb|ADCP01000091.1| GENE 23 25066 - 26256 1327 396 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGHGGYTLELSLKDSISAGLKDMGRALSNVRELVGSLRADVSALNKESGQAVSGLDLGKL AEGAADQLEGLSKGALQELDAAFRDSGKTMADYYAERRRQAGETADMERENASEQAERMA QELEARKDAYQAELEAFVAHGQNWAEAQQAQADREVSIKAEQADREKAIDAQRLEGALSL AGGMADAMKQVYESGLAQSKGVYQLYQALAIAEATISTYKAAQAAYAQGMEWGGPVVGAI MAATAVAAGMARVAAIKSTPLKGFAFGGLIGGQDKGERADNVLIRATPGEYMLDRPTVRH YGVSALEALRRRSVPRELLEPFASPRLPASGGKRTAYALGGEIGNGSLTEGGKASGNEGL TVINVMDFQREFDRALASTRGRRVLINILGEEGIAS >gi|316922812|gb|ADCP01000091.1| GENE 24 26248 - 26499 203 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSHVPLLCFQKGFELPPVRMQRHTQGKGGVLHAFVPALRLGLKRFDERAPPILEYVPVPG PRQPHHRTEEFPFRAASIRSLRN >gi|316922812|gb|ADCP01000091.1| GENE 25 26442 - 26795 365 117 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRKSAEFTMENDKGTLSVRLQELTVREVLDMCEQFGKGPLVHDLERLLPRMSNLTTNDLM DMTPGELEMLWEQVKEVNAAFFRIAGALGFGRVLEAFRAQLRSDLMEAALNGNSSVR >gi|316922812|gb|ADCP01000091.1| GENE 26 27007 - 27969 1124 320 
aa, chain - ## HITS:1 COG:no KEGG:HTH_0895 NR:ns ## KEGG: HTH_0895 # Name: not_defined # Def: hypothetical protein # Organism: H.thermophilus # Pathway: not_defined # 7 123 3 119 243 68 34.0 3e-10 MTASPQTRNYLYGKGELFFRAVGSEGYDHLGNAPAFTISLTEEKLEHFSSMSGTKTKDLQ LVTQKGATVAFTLEEFTTGNILRAFKGAAVAKQMQAAATVSGQSVSANKGLYTFVGKEKL GFTRLEHGTVAGGTFAPGSSVVGSTSSATATVAYVTDGVLECVNVRGTFVPGEEIAASAI KATLQGIARVADVVLTDKASAPTVRYRQGVDYDLNARTGLLRVRESCSADTVFLTADCES SDEQLVDALTASDVTGELLFVGQPDQGPGLVVQCWKVTLSLSGEVGLISEELASIPMTGE VLADDLNHPESPFFRVRYLG >gi|316922812|gb|ADCP01000091.1| GENE 27 27979 - 28413 289 144 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEVEDAILELLQPLRETPGAKTVAAFQGQPDKDTWARLRRGFPAVLVLYAGSPAFTPSGR RLEERMEFEVYVMDKSYRSAANGQKPEPGHPGTYALLATARERLLGVPPLPGMGPCLPSS IQGFVLDGASVYRMTVSTIHNIAL >gi|316922812|gb|ADCP01000091.1| GENE 28 28425 - 28847 564 140 aa, chain - ## HITS:1 COG:NMB1101 KEGG:ns NR:ns ## COG: NMB1101 COG4387 # Protein_GI_number: 15676981 # Func_class: S Function unknown # Function: Mu-like prophage protein gp36 # Organism: Neisseria meningitidis MC58 # 1 121 1 118 140 76 40.0 2e-14 MAYATLDQLVSLFGADEIRTLSDRQGTGELDEAVISDALERASSEVDSYLADRYATPLSD SDPIPPVVVSVAGDIARYRLTGGDIRDTDPIRERYTKALNWLRDVADGKAGIPGLPPAGT ETPGAVLLEPGTRPWDGVTP >gi|316922812|gb|ADCP01000091.1| GENE 29 28987 - 29931 1127 314 aa, chain - ## HITS:1 COG:NMA1847 KEGG:ns NR:ns ## COG: NMA1847 COG4397 # Protein_GI_number: 15794735 # Func_class: R General function prediction only # Function: Mu-like prophage major head subunit gpT # Organism: Neisseria meningitidis Z2491 # 3 314 2 300 300 288 43.0 1e-77 MAVVTSSLVSALRVGFQREFQDALSSAPSQWDRLSTRVPSSSAGNTYGWIGQFPKLREWS GDRSFKNIKEHGYSVMNNLYEATVDIPRTAVEDDDIGVYAPLFREMGYAAGTHPDEIVFG LLKNGMSGTCYDGKAFFAVDHPVYLNADGSGDAETVSNWLRPAAVDGTVTDGTPWFVLDV SRPLRPFIFQERTAPELQVITNPDNDYVFMKDKIPYGIRYRCNGGYGFWQQAVCSTQELN AANFAAALEAMQSFRADGGRPLGLGFGGEAGTMLVVPPSLQSAARRVVSAEQDPDGGSNI WYRAATLLVSHWLI >gi|316922812|gb|ADCP01000091.1| GENE 30 30165 - 31286 1024 373 aa, chain - ## HITS:1 COG:NMA1848 KEGG:ns 
NR:ns ## COG: NMA1848 COG4388 # Protein_GI_number: 15794736 # Func_class: R General function prediction only # Function: Mu-like prophage I protein # Organism: Neisseria meningitidis Z2491 # 25 368 25 350 354 134 33.0 4e-31 MKQSLLHPLLLSAVADSLRRPFPEIRLLPDGAFAARDGRPGTLTGGNLNAWNLSGPGAEH VLDQWRRRETPLAVDYEHQSLNARHNGQPAPAAGWIESLRYDPGQGLFASIRWTEGAKAF IEQDEYRFISPVFSFNPQNGDVLELKGAALTNVPALDGLGAVAATEDFPPSDTPQPETVM NALNRLKQLLGLPEDAAEETLQAELDKLESLLTPANPAASDPPALPGQPPFPHQADPLPG NARPTLFDFLQACHPQAALTSLVRANTALRDQLSVALSVTQGDRVARSVETAVTDGRLSR GLVGWATALGRQNPEALETYLAAVSPIAALSSFQSTGSRPALSAPAASPLSDEERFVCAQ LGLSEAEYLAVRG >gi|316922812|gb|ADCP01000091.1| GENE 31 31596 - 32000 491 134 aa, chain + ## HITS:1 COG:MTH1552 KEGG:ns NR:ns ## COG: MTH1552 COG3383 # Protein_GI_number: 15679548 # Func_class: R General function prediction only # Function: Uncharacterized anaerobic dehydrogenase # Organism: Methanothermobacter thermautotrophicus # 3 134 199 325 865 124 47.0 6e-29 MKKVLTTCGYCGCGCNLYLTVEGGRITGVLPKPDHPVSQGMLCSKGWQGHGFARHPDRLD QPLLRQEDGTLKGVSWSEAYRAIAEHLRAVLRTHGPDKFAMLASARCTNEENYLAAKLTR AVIGSPNIDHCARL >gi|316922812|gb|ADCP01000091.1| GENE 32 32094 - 33638 1693 514 aa, chain + ## HITS:1 COG:MJ1353m KEGG:ns NR:ns ## COG: MJ1353m COG0243 # Protein_GI_number: 15669895 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Methanococcus jannaschii # 1 511 163 671 673 457 47.0 1e-128 MIGSNTSAQHPLAYARITAAKKRGAKLITVDPRRTPVATLSDIHLAIRPGSNLALVNALM HVILFEEKGQDDAFINERTEHFDALRASLEGCTPEWAAPLTGLQPEAIRETARLYAKGPN SIILYCMGVTQQVTGTRTCGALANLVMLCGMIGRPSTGLIPLRGQNNVQGSCDMGALPET LPGYVKADSELGHQRFARLWGPFADEQGKKLTEIMDGMRDGSIRAAYIMGENPMQSDPDA AKVEAGLRNLDFLIVQDIFMTPTARLADVILPAASYLEKQGTFTNTERRVQLINEVLPPL PGTRPDWRILCDVINALGGRADYDSPREIFNEIRQVVPSYAGMDYTRLAEEQGLCWPCPT PDHPGTPYLHAASFTRGRGLFTVNDVPDDLGCATDETYPFTLITGRVSHHYHTGTMTRRS WALDREYPEAFLDMHPDDAAALRLKDNWKVRVTSRQGSIVARLRTATDLQPGTVFLPFHF AESPANALTSHEHLDPTVKIPALKLTPVRVEEAK 
>gi|316922812|gb|ADCP01000091.1| GENE 33 33933 - 35948 2272 671 aa, chain + ## HITS:1 COG:PAB1785 KEGG:ns NR:ns ## COG: PAB1785 COG0543 # Protein_GI_number: 14521075 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Pyrococcus abyssi # 392 668 5 288 292 165 32.0 3e-40 MFKFKASDLPEILTRWSARYSVFVPSGSPDNAQMRIWSRRTRKEVRFMEPDEYTNLIVAP KGFVFGEREELFRWEGNEKTCTAISAPSSSSLQEEDKILFGLRPCDTYGLAYMDRFFLGE HHDINYHLRRQHVFIVAVNCLEAGPECYCASMGTGPFAEITAHTEYGMQAGKGYDLLLTP DYGPDHKKGGKGENDWYWVEAGSDRGKALLSHVAPLLYRDLEFTGRRRKKALQEDALKTF RRTLDTSTVRQVLAAHFKDEEWDAIASSCIACTGCTRVCPTCTCFTTEEEQDTPHSGTRV RVWDSCQSVSFTRNAEFHNPRSKTSAVRYRIYDKLQYIEERFGMKGCTGCGRCAAVCPAS IDMVDIMARMKEQTPHEVLEAPAPAVNVHYEREERLFDPQPYTPLVAEIIDIFEEAKGIK RFTVRYRDRPNQGRPALRGQFFMLTVFGAGEIAISVPFSDRVKDAFTFYVKKVGKVTTAM HNLKVGDMMGLRGPFGVPLPYETLKGRDLLVVGSGVGHAPVRATLVRAIENKPDFGRIAI MASASTYDGLLLKDDLREWAKVPGVEVHYSLSKPTDQVDAHIGYINDLLPGLGLDWKNTS AIICASARRIKAVARDLMQLGMKPSDIYTALETNMHCGIGKCGHCKVGSHYMCVDGPVFT YEEMLQLPPEF >gi|316922812|gb|ADCP01000091.1| GENE 34 36897 - 38405 1286 502 aa, chain - ## HITS:1 COG:BH1879 KEGG:ns NR:ns ## COG: BH1879 COG3829 # Protein_GI_number: 15614442 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 179 495 241 554 555 294 48.0 3e-79 MRSEIHAAVPLWINTPLEPPLPSGSPVTERWRLVQKHLPPLAELLDVEIVIMNAEGYCVG GTGPYLRGVGFRMPADNALTYSLRSGQSSMVLTPGKEEVCRTCSGKGACADVANFTGPVV IENEIVAVVQIVAFTDNQRAELLMKSEKAFELISQIIGFAWARGRDVESLPEQPRDDRFP SIIGESKAILRLKAAILKAAGTNATVLIQGESGTGKELIAQAVHDNSPRHTGPFVAINCG AIPESLMESELFGYEPGAFSGANSGGQKGLLEQAHTGTFFLDEVSEMPVSLQVKLLRVLQ ERVVRRVGGKVNHSVDIRIIAASNRNLRELVAQGAFREDLFFRLDVIPLFVPPLRERQGD IRLLVAHFLQSFSRERGCTYRVATELMHAFEAYPWPGNVRELKNFVEYGVGFCEHGVLTL DLMDSRFKTASPPPVKTEKGMPSPPPGPARDTAEYDRIMQLLGRYGRHTEGKKRTASELG ISLATLYRRMGQLGITARKHAR 
>gi|316922812|gb|ADCP01000091.1| GENE 35 38943 - 39923 1681 326 aa, chain - ## HITS:1 COG:FN2103 KEGG:ns NR:ns ## COG: FN2103 COG3181 # Protein_GI_number: 19705393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 26 317 10 302 308 115 27.0 9e-26 MKRNLSRMFRTGVAAAAALLCAMPFQAAAEDYPSRPITLIMPFGAGNAPDTAARIIGDYL QRKHNITLLITSKPGGSGIPATLETVRARPDGYTISLTSANVLTVVPQYKKCGFTYKDLA PVAQVNVFTMGWGVRADSGIASVQDLMDKAKAERGKYSLASPGAFTAQRFYHANVMKLFP ESDLPYVAYNGGAEIVTALLGNHISAGFTPVVNFKPHKDIRVIAVCGAQRDPNYPDAPTF KEMFGDGFVFDSVYGIVAPLKTPKDRIERLQSLIKEALSDPDVQAKFAKVNMTTNYLPAD EFGKVIEGYYKLFEEPIRKAKEAEKK >gi|316922812|gb|ADCP01000091.1| GENE 36 39955 - 41448 1912 497 aa, chain - ## HITS:1 COG:BH2009 KEGG:ns NR:ns ## COG: BH2009 COG3333 # Protein_GI_number: 15614572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 468 2 466 504 353 43.0 3e-97 MSEVFAGIEAALSFGSLIANVTGVSLGILFGAMPGLTAAMGVALLIPLSFGMPPVEAFSM LLGMYAGAIYGGSITAILVGTPGTVAAAATLLEGPKLTAKGQSRKALEMATFASFFGGIF SAVALVTCAPLLATAAMSFGPAEYFALAVFALTVVATLSSGAMCKGLAAAFVGLFLSTVG IDPVSGDFRNTFDVPELFNGISLVPALVGLFAVSQVLLSLEDVFLGKSGIVKEGKLSNQG LTLREIWTNKFNLLRSSLLGTLIGIIPATGASAATFIAYGEAKRFSKHPEAFGKGTLEGV AATESSNNGVTGGALIPLMTLGVPGDVLTAILLGALMIQGLTPGPLLFQEHGATVNGIFA SFFVSNILILVVGLIAVRILGKVVLVPTAILMSVVLTLCAIGSFAANNSTFDIGVMAAFG LLGYLMLKAGFPQPPMLLAMILGPLAESNFRRALSLSRDDFSTFFTSPISCAILAVSACV VLKTIWDEYRGCKREAV >gi|316922812|gb|ADCP01000091.1| GENE 37 41458 - 41943 468 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302863691|gb|EFL86622.1| ## NR: gi|302863691|gb|EFL86622.1| putative tricarboxylic transport TctB [Desulfovibrio sp. 
3_1_syn3] # 1 161 1 161 161 130 45.0 2e-29 MKNLNDIFSGAVLLCLCAVGAYEVSASIAPSDGEVVGADALPTLALGGMALCGVFLILQG LLRSRPGRSWGNRSAVIKTLLFFGFFVMYLSGMIWLGDRLVEQSWFPWPHNGGFTISTFL FLLFSLPLLGRRNPVEIIGVAALTTGALLYAFGYFFQIMLP >gi|316922812|gb|ADCP01000091.1| GENE 38 42049 - 42522 718 157 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302863690|gb|EFL86621.1| ## NR: gi|302863690|gb|EFL86621.1| conserved hypothetical protein [Desulfovibrio sp. 3_1_syn3] # 1 156 7 162 163 213 67.0 3e-54 MEWNPEVVRRVRGAMQDRARSQACLFKAFSSVMTREQAEALAREGIAEYGRIRAERDGRK LAPEDWMNEHYTMMGGVFETEISDTEEYCEMKMHYCPLLEEWKKMGLSPEDQDLFCDVAM ELDRNRAKEHGIPCDIVERLGKGDPFCRVVLWKKKKD >gi|316922812|gb|ADCP01000091.1| GENE 39 42586 - 43044 -69 152 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRFLCHTAHTPHLTIVTDFLSLQTHYSHPEILFSHLIEKKFSLHENTVPFMPFSLFSDRY AAQLRQRSFRTPLPTGIQGTKTHRQDPRMGPQPARARKRMRNGKRKIDRMARCERRKKKP EKMPRGKDNGSLCSPACFSGETGEGDKRLRVG >gi|316922812|gb|ADCP01000091.1| GENE 40 43663 - 45057 1722 464 aa, chain - ## HITS:1 COG:BH0992 KEGG:ns NR:ns ## COG: BH0992 COG3829 # Protein_GI_number: 15613555 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 9 456 20 451 454 302 37.0 1e-81 MLQEYLDQLFDAFKEGICISDVEGTVVHLNRRHAEITGIPREEMLGHSVMEFVRRGRLDV VLNPEVMRTGQPATRVQTVSDGRKLILEANPVFDAQGKLVLCITFLRDVTMLSEMREQMS VQKELLEAFQQLSTAGDAQELSRKTPKLFTSSAMIKLYGEVNTIAETDATVLLLGETGVG KDVIARRIHALSLRKNKTFIKVDCGSIPENLIETELFGYAAGTFSGASKTGKIGLIEAAA GGTLFLDEIGELPMPMQTRLLRFLQDWEIMRVGSTTPKQIDVRIVAATNKNLERAIARGE FRSDLYYRLKVAVIAIPPLRDRQADILPLARMFLRFYGSKYHRQLTFSEEAESLLLDYKW PGNIRELENLVQGLAVTSKDSVIRASDVPISGVAKPAASVRRDGLDFELDGKTYKEIMKD LENKVLSAAMERYGSITEVARHFQVDRSTIFRKVKELEKEGKMR >gi|316922812|gb|ADCP01000091.1| GENE 41 45219 - 46520 680 433 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. 
PR1] # 1 430 1 430 431 266 32 2e-70 MTPLAIALIVLVAMFLLGLPIFMSLIISSVVAILAGGDILPLSVIHNSLFDGLNLFPLLA IPCFVVAGTLMEFGNITQQIVDVVKQLVGRVYGGLGITTILACTFFAAISGSGPGTVAAV GTILVPAMVRNGYSKDYASAAASSGGTIGILIPPSNPMIIYAILGNLSVTAMFTAGFIPG FIVAFAMIMTAYLLAKRNGFKGDENAAPFNMKFFLRSCGNAGFALATPFIILGSIYTGWA TPVEASVVAIVWALFVGIVINRVLRPRHIYRALLEGAMTCGAVLLIVGASTLFGKILTFE EAPQRLASIVLGISDDPHLVLLMIIGVLYVLGMFMETLATIIILVPVLLPMILQLGIDPI HFGIVLVVTNNVAMLTPPLGVNLFVASRIAGISVERISVAVIPYLIALTLCILLFTYVPA ISTWLPSLMGYGQ >gi|316922812|gb|ADCP01000091.1| GENE 42 46520 - 47026 604 168 aa, chain - ## HITS:1 COG:AGc4972 KEGG:ns NR:ns ## COG: AGc4972 COG3090 # Protein_GI_number: 15889991 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 21 104 82 165 237 59 38.0 3e-09 MSFFKQLYDNFEEGACALFVAVMVLCLFLQVAMRIAVGSSLAWTEELSRYSFLWAVYVGA ALAVKRGGHVRITAQFMFLPTRWRLVFRAVGDIIWIGFSLYVAWIGLACIEEGLYYPEVS PTLGVVKAWVECIIPFSFVLVSWRIVEEYVVRWRNGTLGELVRYEEAA >gi|316922812|gb|ADCP01000091.1| GENE 43 47086 - 48084 282 332 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. 
PR1] # 53 310 53 307 328 113 31 2e-24 MRRFLTAFAAAVLMLSLGTANAAEYKKMTIRAATANPQGSLHVVAIDKFKEIVEKESNGA ITVQTFYGGSLGDEQANVKQLRNAEIHLAVLADGNLTPFAPQAGVFILPYMFPKISDAEK LFGNEAFMNKTADAIAKQSRTRPLSWLVGGYRIITNSKKPINTMADLKGLKIRVPAVELQ LAAFRSWGVEPHPLAWSETFNGLQQGVVDGQENPHAINRDQKFWEVQKYITNIHYMLWVG PMLVSDPWFRKLDPQTKALVEKAAKEAAAYEWKWSAEQDEIALKECLARGMVINDVSDEP AWTEAARSVWPQFYDKVGGKAVVDEALAIMQQ >gi|316922812|gb|ADCP01000091.1| GENE 44 48354 - 49187 1043 277 aa, chain - ## HITS:1 COG:MT2360 KEGG:ns NR:ns ## COG: MT2360 COG2159 # Protein_GI_number: 15841794 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Mycobacterium tuberculosis CDC1551 # 44 277 64 305 307 137 33.0 2e-32 MKIIDFRFRPNIPSTVQGMIDHPVFGEMHALFKFHERARSQPIEDIVRDMEAQNVVKGVI TSRDAETTYGIASGNKGVIPFLEQYPDRFIGMAGLDPHKGMDAIDELGLMVETHGFRGAA IDPFLAKIPANHAKYYPIYAKCCEFDIPVVISTGMATLVDGADPEHCHPRYIDAVARDFP KLKIVVSHGCYPWVNEIIMVVQRNRNVYLELSEYEQSPFSEGYIQAANTMIGDKVIFASA HPFLDFKGQIALYRKLPFSPQALENIFYNNAAKLLGL >gi|316922812|gb|ADCP01000091.1| GENE 45 49808 - 50716 981 302 aa, chain - ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 8 288 6 278 302 220 40.0 3e-57 MLDRQQTGIVFNVQKFSVHDGEGIRTLVFLKGCPLHCPWCSNPESQRREPERAYNPTRCL TAAVCGRCAKACPTGAVSIVGGLVCFDRSKCTGCNACVRACPSGAQTVYGETQSVDQILS RVEEDGVFYTRSGGGLTLSGGEALAQPDFALALLREAKKRHIHTTIETCGHYPTEVLDQA CRVLDALIFDIKCLDSARHKKATGVGSELILKNIGHVFEHFPDLPVLIRTPVIPGFNDTE EDILGIREMIPRKANIRYEALTYHRMGQPKYGYLGRRYELEGVKADEAFMKRLNIMLKSY EK Prediction of potential genes in microbial genomes Time: Fri May 13 03:45:08 2011 Seq name: gi|316922809|gb|ADCP01000092.1| Bilophila wadsworthia 3_1_6 cont1.92, whole genome shotgun sequence Length of sequence - 3835 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, 
operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 7 - 2481 3462 ## COG1882 Pyruvate-formate lyase + Prom 2793 - 2852 4.7 2 2 Tu 1 . + CDS 2991 - 3824 1047 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold Predicted protein(s) >gi|316922809|gb|ADCP01000092.1| GENE 1 7 - 2481 3462 824 aa, chain - ## HITS:1 COG:SPy2049 KEGG:ns NR:ns ## COG: SPy2049 COG1882 # Protein_GI_number: 15675819 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Streptococcus pyogenes M1 GAS # 40 821 28 803 805 400 35.0 1e-111 MSQCCCLSPQEERLQGTKKVNRQGRERVYKILDRIQFTVPHVDIERARYFTESMRQTEGE LLTLRWAKALKNVAEKMTVYITPDQLLAGRVGQLGRYGILYPEIDGDFYIEVMKDLPNRE KSPFQIDPTDMQILMEEIAPYWEGKTYHEHLNKVLPAEIRGVTYHDERGLKSKFVVSETS SYRSALQWVPDYEKAMKRGFIDIQNEAKAKLAGLDLTNSVDIWEKKPFLEAMIIVCDAIM IWAKRHAQLARDTAAATSDPVRKQELLRMADICEHVPAYPARNFREAVQCQWFVQMFSRI EQKASAIISNGRMDQYLYPYYKKDIEEGTLTSEEAKELLECMWVDMAQFIDLYINPTGNE FQEGYAHWEAVTVGGQTPEGEDATNELSYLFLESKREFPMTYPDLAVRIHSRTPDRFLYE IALTVQDGSGFPKLINDEEVVPLNAIKGCPINEALDYAISGCTETRMPNRDTYTSGCVYI NFATALEMLMNNGRLHYYGDELIGLETGDPTRFQTWEEFYEAYKAQHINLLQKAFQQQHI VDRLRPQHFAAPLSSVLHNLCMKNMQDLHSEKIEGGVDYSYFEFLGYATVVDSLAAIKKL VFEEKRLTMREVLDAMNANFVGYEPIQEMLKNAPCYGNNDPYADSIAKDVDRFTQVEAEK SSRDRGIHVDVRYVPITSHVPFGKIIAATPNGRVAGFPLADGSSASHGADHNGPTAVLLS NYHSKNYGMINRASRLLNIKLSPKCVAGEQGAKKIMSIIRTWCDLKLWHLQFNIVNRDTL LAAQKDPNSYRNLIVRVAGYSAYFCDMSPDLQNDIIDRTEHADL >gi|316922809|gb|ADCP01000092.1| GENE 2 2991 - 3824 1047 277 aa, chain + ## HITS:1 COG:MT2360 KEGG:ns NR:ns ## COG: MT2360 COG2159 # Protein_GI_number: 15841794 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Mycobacterium tuberculosis CDC1551 # 44 274 64 303 307 128 35.0 1e-29 MKAIDFRFRPNTPDAVDGLIHSPFFGDMCAFFNYPDRAWSASLDKIVEIMDEHEITGVIT GRDVETTYGCPCSNMGVVEMMRAYPHKFYGMAGLDPHKGMQAIDELSRMVNEFGMKGASI 
DPYLAKLPADHRKFYPIYAKCCELKVPVVISTGPGTRVPGAIMGDASPARVDEVARDFPD LTIVMSHGGYPYVQEACMVCHRNANVYMEWSEYETWPFADGYVKAGNELIPDKLIFASAH PFYDSWERAKLYETLGFRPDVLENVLYNNAARILGIA Prediction of potential genes in microbial genomes Time: Fri May 13 03:45:10 2011 Seq name: gi|316922806|gb|ADCP01000093.1| Bilophila wadsworthia 3_1_6 cont1.93, whole genome shotgun sequence Length of sequence - 2376 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 189 - 264 49.0 # Glu CTC 0 0 - Term 572 - 611 7.4 1 1 Tu 1 . - CDS 629 - 901 286 ## COG0724 RNA-binding proteins (RRM domain) - Prom 928 - 987 3.9 2 2 Tu 1 . - CDS 1173 - 2321 1250 ## Mlab_0021 hypothetical protein Predicted protein(s) >gi|316922806|gb|ADCP01000093.1| GENE 1 629 - 901 286 90 aa, chain - ## HITS:1 COG:ssr1480 KEGG:ns NR:ns ## COG: ssr1480 COG0724 # Protein_GI_number: 16330189 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Synechocystis # 4 84 2 83 83 91 60.0 3e-19 MTKSLYVGNLPWSATEDDVRDLFAPYGEVTSVKLVSDRETGRARGFGFVEMASEGVQAAV EALDNFSFSGRNLKVNEARPREARPSYSRY >gi|316922806|gb|ADCP01000093.1| GENE 2 1173 - 2321 1250 382 aa, chain - ## HITS:1 COG:no KEGG:Mlab_0021 NR:ns ## KEGG: Mlab_0021 # Name: not_defined # Def: hypothetical protein # Organism: M.labreanum # Pathway: not_defined # 1 381 1 380 382 372 48.0 1e-101 MILLDRPYVSKHFLNYLERSGQPVLETPFTAGLARDYGLNLVDDAEAEKRCRNGERLYTS SENALEWIHAHLGDMPVARHIDAMKDKAALRRLLAGMYPDFFFREIPVGELGKTDVSGWK KPFILKPSVGFFSLGVYTITSDDDWTAAVADIERNLRESRDRFPESVVDQSSFLVEEYIT GEEFAVDVYFDGEGEPVILNIFTHRFASLSDVSDRLYYTGKDVIEANKARFEAFFRKVNA LMGIRDFPAHVELRVEGDRILPIEFNAMRFAGLCTTDMAYFAFGINTVDCYLNDRKPVFD AILRGREGKIYSMVILDKPKSLPPSSAFDYDRLAAHFAKVLELRKVDVPELPVFGFAFTE SDAGRTEELDTILQSDLTEFLR Prediction of potential genes in microbial genomes Time: Fri May 13 03:45:23 2011 Seq name: gi|316922791|gb|ADCP01000094.1| Bilophila wadsworthia 3_1_6 
cont1.94, whole genome shotgun sequence Length of sequence - 14803 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 10, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 154 - 1419 1737 ## COG2873 O-acetylhomoserine sulfhydrylase - Prom 1575 - 1634 3.7 2 2 Tu 1 . + CDS 1635 - 2606 996 ## COG0024 Methionine aminopeptidase + Term 2614 - 2670 3.8 + Prom 2695 - 2754 1.6 3 3 Op 1 . + CDS 2782 - 3930 1306 ## COG3853 Uncharacterized protein involved in tellurite resistance 4 3 Op 2 . + CDS 3934 - 4914 693 ## ESA_04385 hypothetical protein + Term 4953 - 5000 14.8 + Prom 4941 - 5000 2.0 5 4 Tu 1 . + CDS 5241 - 5699 -300 ## + Term 5816 - 5853 -0.9 6 5 Tu 1 . - CDS 6093 - 6317 65 ## gi|302863944|gb|EFL86875.1| putative helix-turn-helix protein - Prom 6486 - 6545 2.7 7 6 Tu 1 . - CDS 6705 - 7022 64 ## gi|302863943|gb|EFL86874.1| putative helix-turn-helix protein + Prom 7268 - 7327 4.8 8 7 Tu 1 . + CDS 7374 - 8066 368 ## DvMF_2676 putative phage repressor 9 8 Tu 1 . - CDS 8148 - 8345 217 ## Ddes_0329 transcriptional regulator, XRE family - Prom 8415 - 8474 5.9 - TRNA 8467 - 8556 66.9 # Ser TGA 0 0 10 9 Op 1 . + CDS 9149 - 9790 570 ## Dvul_2268 hypothetical protein 11 9 Op 2 16/0.000 + CDS 9801 - 11903 2978 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 12 9 Op 3 . + CDS 11916 - 12683 974 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 13 9 Op 4 . + CDS 12688 - 13944 1891 ## DvMF_2687 polysulphide reductase NrfD + Term 13969 - 13996 0.1 14 10 Tu 1 . 
+ CDS 14150 - 14638 661 ## DVU1185 colicin V production family protein + Term 14687 - 14721 1.3 Predicted protein(s) >gi|316922791|gb|ADCP01000094.1| GENE 1 154 - 1419 1737 421 aa, chain - ## HITS:1 COG:PM0738 KEGG:ns NR:ns ## COG: PM0738 COG2873 # Protein_GI_number: 15602603 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pasteurella multocida # 1 421 1 421 422 546 60.0 1e-155 MEIETQCLYAGYSPKNGEPQTLPICQSTTFKYESTEQVSKLFDLEEAGFFYTRLGNPTVD AVEQKIATLEGGVGALCTASGQAATTFSILTLAAAGDHIVSSSAIYGGSLNLFAVTLKRL GIEVTFVSPEASEDEIQAAFRPNTKALFGETIANPSIEVLDIEKFARIAHRNGVPLIVDN TFATPVLCRPIEFGADIVVHSTTKYMDGHALQMGGVIVDGGTFDWTNGKFPEFTEPDDSY HGTIYTQAFGKAAYIVKARTQMMRDMGACQTPFGAFLINQGLETLPLRIERHSQNADAVA HWLEKHDKIESVSYPTLEGNPYKERAAKYLPNGCSGVISFSLKGGREAGARFIDSLKMAS LLVHVADIRTCVLHPASSTHRQLTDEQLVSAGITPGMVRLSVGIENIKDILADLEQALAK A >gi|316922791|gb|ADCP01000094.1| GENE 2 1635 - 2606 996 323 aa, chain + ## HITS:1 COG:CPn1009 KEGG:ns NR:ns ## COG: CPn1009 COG0024 # Protein_GI_number: 15618917 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Chlamydophila pneumoniae CWL029 # 40 321 3 286 291 234 43.0 2e-61 MRAGRSMLCSLFVRDFLLLRPPSPHSPKRSEKDVDQKRNRKTPSWCGSRKRKAARTPYDA RITEYAHQGHLVPPRSILKTPEQIVGIRESGKINIAVLDHVAAHIKAGMTTAEIDRLVFE KTRELGGVPAPLGYRGFPKSTCTSINEEVCHGIPADDVALKDGDIVNVDVSTIYNSYFSD SSRMFCIGAVSKEKRQLVDVARECVELGLQEVKPWGFLGDMGQAVHDNAKKHGYSVVREV GGHGVGIRFHEEPFVSYVTKRGTDMLLVPGMIFTIEPMINMGKDAIVLDKANGWTIRTAD GAPSAQWEIMVLVTEDGHEVLAY >gi|316922791|gb|ADCP01000094.1| GENE 3 2782 - 3930 1306 382 aa, chain + ## HITS:1 COG:BS_yaaN KEGG:ns NR:ns ## COG: BS_yaaN COG3853 # Protein_GI_number: 16077094 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in tellurite resistance # Organism: Bacillus subtilis # 10 354 42 386 386 237 43.0 3e-62 METSATTIVVPPTAEEQRIASIADSINLADPSLTVTYGTEAMQGISRFADDLLSRVQAKD 
SGELGESLTNLMLKVKDVNVSEVMGSPGFLQSLPLVGSLFSSAKRTVAKFNTLSEQIEII VDKLDEAMIGLLRDIETLEMLYEHNARFHAELTAYIEAGKRKLEEARTVELPRLKAQADA SGDLMEAQQVRDLSEQINRFERRLHDLQLSRTITVQTAPQIRIIQSNNRTLAEKIQTSIL ATIPIWKSQMVLALSLHGQKNAAALQKTVSDTTNDMLRSNAELLEQAAVDTAREVERSVV DIETLREVHEKLIGTIEETLRIAQEGRERRAAAEKELAVMETELKDRLTSLAARKSREEI AAASGTETRQSAPGQGGTGSGV >gi|316922791|gb|ADCP01000094.1| GENE 4 3934 - 4914 693 326 aa, chain + ## HITS:1 COG:no KEGG:ESA_04385 NR:ns ## KEGG: ESA_04385 # Name: not_defined # Def: hypothetical protein # Organism: E.sakazakii # Pathway: not_defined # 95 325 82 326 328 89 30.0 1e-16 MHEGEKEHMVVGPNAEEPKGSFWDAIRRFFKRILFHISAFIIGAILSTHVPQIHPGEAAL GVYLGLICIVARGASRKPSWLVPAVSGAGLAVLLILFGMPFPYALFWGGLQSWVQRLLVK SFRMGTEWVPFSLLLIMAFSVRQTLSLPLAMLFCALLAGGQGILWALARKRAAELAPKPK PEPKPEPAPSPDEPASMKASRASLAELECKLPGLPAETQASVRAIMQSANNILGCMATDP RDVDHGARFLKRYLAATHGIVDTHLRFAHDKDISPDVASALARSNDMLARLKVAFAKEHA LLLQNDVTDFSADLKTLDTLLKMDGN >gi|316922791|gb|ADCP01000094.1| GENE 5 5241 - 5699 -300 152 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLTKSGFEVGNFFHCAVDFPKLFSQGCSLLPSGSSITGMEGSSSTVLRNNIRGSISRAWA ITTIVESTGSFSPFSSLANCCGVRPINFAKRYRETPCSIRAARTTSPKGHGVLSVFSLNR HHRLLLATHGFCRLIHQSSRLHAQRVGEQNQR >gi|316922791|gb|ADCP01000094.1| GENE 6 6093 - 6317 65 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302863944|gb|EFL86875.1| ## NR: gi|302863944|gb|EFL86875.1| putative helix-turn-helix protein [Desulfovibrio sp. 3_1_syn3] # 1 68 110 177 183 100 67.0 2e-20 MLLSQCRSVTKLSQNYISSIAKYDHRSLQKVEKGTQLPGIINAVKLVMATGVDAGVFFDI FSEKLETIKVKKRI >gi|316922791|gb|ADCP01000094.1| GENE 7 6705 - 7022 64 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302863943|gb|EFL86874.1| ## NR: gi|302863943|gb|EFL86874.1| putative helix-turn-helix protein [Desulfovibrio sp. 
3_1_syn3] # 16 105 2 91 91 125 70.0 6e-28 MKLCTSSERISPYLFESLKEDMFSRTPEPHEVSGYGELLRYCRLQRGVSQKRIAKNIHYD LRSLQRVEKGEQEPLVTTAVKLVAAIDVPPGQFFEQLWFFLSRIG >gi|316922791|gb|ADCP01000094.1| GENE 8 7374 - 8066 368 230 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2676 NR:ns ## KEGG: DvMF_2676 # Name: not_defined # Def: putative phage repressor # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 5 229 4 230 231 212 47.0 7e-54 MNLLYEETMARIKIVSRTQTQTELAKLLEVSQSSVAEAKRRQSIPADWYLKLFEKLGVNP DWLKKGNGPVYLRTEAGYVPSDGDGPPIDPGLLGSPLAQSALAPVYAMCGDSTKTGSPAA ALQPIGKIALPKPYAREGIIVLEMDNESAAPTARCGAYVGIDTDASHPASGELFAVLLPY EGIILRRLAWDRKKKCFVLRAENPAYPEIRIQEDLTGRILGRLAWVMQKV >gi|316922791|gb|ADCP01000094.1| GENE 9 8148 - 8345 217 65 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0329 NR:ns ## KEGG: Ddes_0329 # Name: not_defined # Def: transcriptional regulator, XRE family # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 64 1 64 64 62 53.0 7e-09 MVKSNLAKLMKEKKLTYRRLAELSGVAGETINRARGTLIYECKLSTLDALAKVLDVNIKD LFDEN >gi|316922791|gb|ADCP01000094.1| GENE 10 9149 - 9790 570 213 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2268 NR:ns ## KEGG: Dvul_2268 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 213 1 210 210 273 62.0 4e-72 MEDRQNNAPSHEKARSLAKRLEFGPAPFIVGLVVALVFGWWIFPELLFSRQDQPVQFSHE THLKDASLDCSVCHHLRADGTFDALPSTKDCAVCHSQMLGKSEAERVFFNEYVKKGKEVD WKVYQKQPDNVFFSHAVHSLATCNNCHEFSERELCSTCHLDMASSKKPPVFRENRISGYS QNTMKMWQCEACHANPNHLGSTNASNACFVCHK >gi|316922791|gb|ADCP01000094.1| GENE 11 9801 - 11903 2978 700 aa, chain + ## HITS:1 COG:AF1203 KEGG:ns NR:ns ## COG: AF1203 COG0243 # Protein_GI_number: 11498802 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Archaeoglobus fulgidus # 35 526 20 539 741 104 24.0 7e-22 MLDRRGFLKFIGGAAVGTLATPVVWKGLDDISIWSQNWPWIPSLQYGNHENTYVRTVSKL CPAAGATKIRLVGGRPVRVLSDPESPLGGGVSALAVTEVQMRYSPARLKRPLLRNSDGGY 
REITWEEAEKLLLEKLAAAKKQTEHDSVVCISGDENGTMNELLAGFAAQTGSGRFFAMPS DAQATAQAWKLMGGRGRVGFDIPNSDYVFAVGANVLETWGSVVVNRHAWGQARPAGAEPA MRLAYAGPVQNNTAAGADLWLPIKPGTELFLLLGVARQLIKDGVAAPAGGLDDFAALVAK WTPEKVCEITGLMPERFTAVVDGLKKAKKPLVVVGSDMDQGGGTGPVRIGMAINMLLDRV NKEGGMRMIPLAPPAVNGAASYDVLMQGDLVRYAADMAQGNMPEVGVLMVYEANPVYALP GKDIEAVFKKSDFSVAFTCFFDETARRCDLVLPNALGLERYDDVAEPFSYGKFVYALVRP VAEPLYQARPAGDVVIDMAYKLGINLGVSDVVTMLKAKAFNIGADWGSLSDGNVYVSDIV VPNKTLAYRFSPDDLALVEKAEAAAASSGKELAVAFVSKLGLGTPETAIPPFNTKLITDD ELDKNMLVAAVNGATLKKLGLYEGNRIVLTSKAGKVMAKIRVFEGVTNDTVALTMGFGHT AFGEFNDGKGMDVMALVTPSSEPGSGLSVWNDTRVNVAQA >gi|316922791|gb|ADCP01000094.1| GENE 12 11916 - 12683 974 255 aa, chain + ## HITS:1 COG:AF0499 KEGG:ns NR:ns ## COG: AF0499 COG0437 # Protein_GI_number: 11498110 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Archaeoglobus fulgidus # 10 233 42 263 269 145 38.0 6e-35 MSEVKEFDIKWTMVVDLDKCTGCGACMVACQAENNVAPNPDGTNKVRSINWMKVYRLSNH KPFPEHDTAYLPRPCMQCGKPSCVSVCPVVATDKNEDGGIVSQIYPRCIGCRYCMASCPY HARYFNWYDPIWPEGMEKTLTPDVSVRPRGVVEKCTFCHHRWMKAKDKAIAEGRNPYELA DGEYVTSCTEVCPNGAISFGDSKNPEHKVYELIRSPHAFRLLERLGTDPQVYYLSKREWV RRQGDNYLENEKTKG >gi|316922791|gb|ADCP01000094.1| GENE 13 12688 - 13944 1891 418 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2687 NR:ns ## KEGG: DvMF_2687 # Name: not_defined # Def: polysulphide reductase NrfD # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 4 418 3 417 417 646 79.0 0 MEVKNYNQPVDASLFPEGCPRCSLLKFLLHLVPVAAVGLWGAYAAFRVLAHGLGETGLDD YFGFGLWITFDLAVIALGAGAFFTGALRYLLNIDALKNIINLTVVVGFLCYSGAMLVLVL DIGQPLRAWFGYWHPNVHSMLTEVIFCITCYCTVLIIEYVPLALENRQLAKNKLAHAIAH NFHLIMPLFAGIGAFLSTFHQGSLGGMYGVLFGRPYILRDGFFIWPWTFFLYILSAVGSG PMFTVLVCTFMEKLTGRKLVSWDVKQLMGKIAGTMFTIYMVFKLVDTAGWVMDILPRSGM TFDQMFHGWIYGKWLLFAELVVCGVLPAYLLFSKKCRSNPALFYLGGILACVGVTINRYV MTVQALAIPVMPFDTWEFYNPNWAEWGASFMVIAYSWLCLAISYRYLPMFPQERELNK >gi|316922791|gb|ADCP01000094.1| GENE 14 14150 - 14638 661 162 aa, chain + ## HITS:1 COG:no KEGG:DVU1185 
NR:ns ## KEGG: DVU1185 # Name: not_defined # Def: colicin V production family protein # Organism: D.vulgaris # Pathway: not_defined # 8 159 4 156 159 116 39.0 2e-25 MTLPYDLNILDVVLLSVIALFTLRGALRGFLDEVAGLVGILGGVWLAGRYYGELGRIFSQ YTTSQWVYIVAYVLILCMVMFVISMISRALHSFLKMAYADWINHLAGAAVGGLKGFLICA VMVTLLTYFINDADFIKKSRMIQPIKETIVVFKQFLPEQYRK Prediction of potential genes in microbial genomes Time: Fri May 13 03:47:07 2011 Seq name: gi|316922729|gb|ADCP01000095.1| Bilophila wadsworthia 3_1_6 cont1.95, whole genome shotgun sequence Length of sequence - 72246 bp Number of predicted genes - 63, with homology - 59 Number of transcription units - 32, operones - 17 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 - CDS 104 - 1543 1962 ## COG0374 Ni,Fe-hydrogenase I large subunit - Term 1588 - 1622 2.3 2 1 Op 2 . - CDS 1642 - 2583 1243 ## COG1740 Ni,Fe-hydrogenase I small subunit - Prom 2686 - 2745 8.0 - Term 2832 - 2876 9.1 3 2 Tu 1 . - CDS 2908 - 4290 1140 ## COG0534 Na+-driven multidrug efflux pump - Term 4673 - 4715 9.3 4 3 Op 1 . - CDS 4745 - 5302 553 ## COG1434 Uncharacterized conserved protein 5 3 Op 2 . - CDS 5515 - 6552 1720 ## COG0687 Spermidine/putrescine-binding periplasmic protein - Prom 6707 - 6766 3.5 - Term 6731 - 6777 8.3 6 4 Op 1 36/0.000 - CDS 6913 - 7698 998 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 7 4 Op 2 30/0.000 - CDS 7698 - 8567 1256 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 8 4 Op 3 . - CDS 8554 - 9660 1224 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components + TRNA 10014 - 10090 75.8 # Pro CGG 0 0 - Term 10142 - 10184 6.3 9 5 Tu 1 . - CDS 10221 - 10589 490 ## PCC8801_3547 cupin 2 conserved barrel domain protein - Prom 10819 - 10878 2.5 + Prom 10628 - 10687 3.4 10 6 Tu 1 . + CDS 10729 - 11301 623 ## + Term 11319 - 11359 1.2 - Term 11384 - 11430 6.4 11 7 Op 1 . 
- CDS 11662 - 13044 1720 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases 12 7 Op 2 . - CDS 13066 - 13710 195 ## PROTEIN SUPPORTED gi|154175107|ref|YP_001408238.1| ribosomal protein L22 + TRNA 13984 - 14059 87.8 # Thr CGT 0 0 13 8 Tu 1 . + CDS 14526 - 14642 185 ## + Term 14693 - 14741 12.2 - Term 14681 - 14729 11.1 14 9 Op 1 . - CDS 14878 - 16305 2296 ## COG0499 S-adenosylhomocysteine hydrolase 15 9 Op 2 . - CDS 16330 - 17259 316 ## PROTEIN SUPPORTED gi|223485607|ref|YP_002589442.1| ribosomal protein L11 methyltransferase - Prom 17296 - 17355 5.4 + Prom 17509 - 17568 3.2 16 10 Tu 1 . + CDS 17592 - 17894 305 ## Sterm_1552 hypothetical protein + Term 17906 - 17951 3.4 - Term 17900 - 17936 7.2 17 11 Tu 1 . - CDS 17961 - 19136 1182 ## COG4383 Mu-like prophage protein gp29 18 12 Op 1 . - CDS 19417 - 21006 1111 ## DVU2705 phage uncharacterized protein 19 12 Op 2 . - CDS 21008 - 21280 446 ## 20 12 Op 3 . - CDS 21277 - 21870 329 ## LIC035 hypothetical protein 21 12 Op 4 . - CDS 21863 - 22507 490 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 22 12 Op 5 . - CDS 22567 - 22929 406 ## HD1548 hypothetical protein 23 12 Op 6 . - CDS 23018 - 23530 480 ## LI0834 hypothetical protein - Prom 23718 - 23777 2.6 24 13 Tu 1 . + CDS 24148 - 24555 498 ## COG2932 Predicted transcriptional regulator + Term 24718 - 24767 6.1 - Term 24704 - 24755 10.8 25 14 Tu 1 . - CDS 24868 - 25602 810 ## DVU2641 putative lipoprotein - Prom 25660 - 25719 4.8 + Prom 25715 - 25774 3.2 26 15 Op 1 10/0.000 + CDS 25938 - 29939 5547 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 27 15 Op 2 1/0.000 + CDS 29933 - 30892 1395 ## COG0650 Formate hydrogenlyase subunit 4 28 15 Op 3 6/0.000 + CDS 30906 - 31340 578 ## COG3260 Ni,Fe-hydrogenase III small subunit 29 15 Op 4 . + CDS 31342 - 32181 890 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 30 15 Op 5 . 
+ CDS 32217 - 32777 899 ## Dvul_0965 NADH dehydrogenase (ubiquinone), 30 kDa subunit 31 15 Op 6 . + CDS 32774 - 33862 1545 ## COG3261 Ni,Fe-hydrogenase III large subunit + Term 33898 - 33925 0.1 + Prom 33940 - 33999 4.2 32 16 Op 1 . + CDS 34191 - 34532 278 ## COG0375 Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) 33 16 Op 2 . + CDS 34572 - 35069 570 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 + Term 35096 - 35134 11.4 34 17 Op 1 . + CDS 35161 - 36006 1139 ## COG3494 Uncharacterized protein conserved in bacteria 35 17 Op 2 . + CDS 36035 - 36517 784 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins + Term 36566 - 36610 6.1 36 18 Tu 1 . - CDS 36854 - 38323 1950 ## LI0461 hypothetical protein - Prom 38467 - 38526 4.0 - Term 38512 - 38556 1.3 37 19 Tu 1 . - CDS 38681 - 39142 454 ## LI0864 hypothetical protein - Term 39275 - 39314 1.3 38 20 Tu 1 . - CDS 39513 - 40301 1060 ## NT01EI_1069 hypothetical protein - Prom 40352 - 40411 8.0 + Prom 40369 - 40428 10.6 39 21 Op 1 . + CDS 40678 - 42612 2445 ## COG1048 Aconitase A + Term 42644 - 42678 0.7 40 21 Op 2 . + CDS 42685 - 44574 2968 ## COG0760 Parvulin-like peptidyl-prolyl isomerase + Term 44600 - 44640 5.1 - Term 44587 - 44628 9.1 41 22 Op 1 . - CDS 44658 - 45719 239 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 42 22 Op 2 . - CDS 45725 - 46336 604 ## COG1636 Uncharacterized protein conserved in bacteria 43 22 Op 3 . - CDS 46326 - 48953 2463 ## COG1530 Ribonucleases G and E - Prom 49095 - 49154 3.3 - Term 49158 - 49205 3.4 44 23 Tu 1 . - CDS 49229 - 49969 889 ## COG2129 Predicted phosphoesterases, related to the Icc protein 45 24 Op 1 . + CDS 50094 - 50786 699 ## COG0212 5-formyltetrahydrofolate cyclo-ligase 46 24 Op 2 . + CDS 50777 - 51532 991 ## COG1496 Uncharacterized conserved protein 47 24 Op 3 . 
+ CDS 51533 - 53314 2192 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + TRNA 53387 - 53469 76.5 # Leu TAG 0 0 + Prom 53394 - 53453 80.4 48 25 Op 1 29/0.000 + CDS 53569 - 54891 2130 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) 49 25 Op 2 24/0.000 + CDS 54959 - 55693 872 ## COG0740 Protease subunit of ATP-dependent Clp proteases 50 25 Op 3 18/0.000 + CDS 55702 - 56958 253 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 + Term 57062 - 57095 1.0 51 25 Op 4 . + CDS 57260 - 59728 3180 ## COG0466 ATP-dependent Lon protease, bacterial type + Prom 60102 - 60161 3.4 52 26 Op 1 . + CDS 60257 - 60577 420 ## COG2076 Membrane transporters of cations and cationic drugs 53 26 Op 2 . + CDS 60611 - 62014 1745 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains + Term 62111 - 62149 1.1 54 27 Tu 1 . + CDS 62186 - 63694 2088 ## COG0260 Leucyl aminopeptidase + Term 63795 - 63830 3.2 55 28 Op 1 . + CDS 63841 - 64314 513 ## COG3467 Predicted flavin-nucleotide-binding protein 56 28 Op 2 . + CDS 64338 - 64880 203 ## PROTEIN SUPPORTED gi|239918553|ref|YP_002958111.1| acetyltransferase, ribosomal protein N-acetylase + Term 65079 - 65117 6.0 57 29 Tu 1 . - CDS 64749 - 65000 140 ## - Prom 65138 - 65197 2.3 - Term 65061 - 65110 13.4 58 30 Op 1 30/0.000 - CDS 65261 - 65764 841 ## COG0066 3-isopropylmalate dehydratase small subunit 59 30 Op 2 6/0.000 - CDS 65913 - 67172 1762 ## COG0065 3-isopropylmalate dehydratase large subunit - Prom 67252 - 67311 1.9 - Term 67264 - 67313 4.1 60 30 Op 3 1/0.000 - CDS 67408 - 68979 2256 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases - Prom 69047 - 69106 2.8 - Term 69092 - 69146 14.0 61 31 Op 1 14/0.000 - CDS 69170 - 69922 827 ## COG1183 Phosphatidylserine synthase 62 31 Op 2 . 
- CDS 69977 - 70624 681 ## COG0688 Phosphatidylserine decarboxylase - Prom 70690 - 70749 3.4 - Term 70702 - 70745 1.0 63 32 Tu 1 . - CDS 70756 - 72219 1610 ## COG0168 Trk-type K+ transport systems, membrane components Predicted protein(s) >gi|316922729|gb|ADCP01000095.1| GENE 1 104 - 1543 1962 479 aa, chain - ## HITS:1 COG:CAP0142 KEGG:ns NR:ns ## COG: CAP0142 COG0374 # Protein_GI_number: 15004845 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I large subunit # Organism: Clostridium acetobutylicum # 8 479 2 449 471 278 32.0 1e-74 MAEATQTKTTIAIDPVTRIEGHLKAEVVVENGKVVDAKLTGGMYRGFESILRGRHPRDAA QIVQRICGVCPTAHATAASIALEKASGTAVPSNGRATRNLILGANYLQSHILHFYHLAGQ DFIQGPDTAPFMPRYAKPDLRLPPAINAVGVDQYVEALEVRAVCHEMVALFGGRMPHVHG ILAGGAAEIPTKEKLVEYAARFKKVRKFVEEKYLPVVYLVGSQYKDLGAFGQGYRNALCV GVFPLDDEGKEFVFKAGAYQDGKDMPFDVKRVTEDVKFSWFADSTSGKPFTKGENVLEVD KKGAYSFVKAPLYNGKPMEVGPLARMWVNNKPISPIGQKLFKEYFGLDVKNFRDMGEDLA FSLLGRHVARAEESYLMLDVVERFLKEVRPDEETFSMPVMKDSEGFGFTEAPRGSLMHFT KVRNGKIDNYQIVSATLWNCAPRNDTGMRGALEEALIGVPVPDIENPVNIARLIRAFDP >gi|316922729|gb|ADCP01000095.1| GENE 2 1642 - 2583 1243 313 aa, chain - ## HITS:1 COG:CAP0141 KEGG:ns NR:ns ## COG: CAP0141 COG1740 # Protein_GI_number: 15004844 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I small subunit # Organism: Clostridium acetobutylicum # 46 310 40 288 291 187 39.0 3e-47 MALSRRDFVKLCSGTVAGFGVSQMFHPAIHEAFAQTLTGERPPVFWVQGQGCTGCSVTLL NSTHPSIADVLLKIISLEFHPTVMAAEGEGAYEHMMRVAEKFKGKFIFAVEGAVPVAHDG KCCVVAEADHHEVTMTEVTKVLAANAAAVLAVGTCAAYGGIPAGKGNETGAMGVSAFLKK EGIPAPVINIPGCPPHPDWIVGTIGLGLQALATNTLGLLVKQGLDANGRPKAFYKNVHMN CPHLSAFEAGHMVKTMSDKDGCRFSMGCKGPRSACDSFERKWNNGVNWCVNNATCIGCTS PTFPDGQSPFYVN >gi|316922729|gb|ADCP01000095.1| GENE 3 2908 - 4290 1140 460 aa, chain - ## HITS:1 COG:FN1469 KEGG:ns NR:ns ## COG: FN1469 COG0534 # Protein_GI_number: 19704801 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 19 429 13 425 
440 184 27.0 3e-46 MHGLDLMLHGPLVRTFLRYCIPWTLAMLLQSSAAIVDGFFVGRYVGAMSLAALNLVMPMF SLFFGVGVMLASGGAVRAGKYLGEKRPEAASAIFTKTMLSLLLVGLAASAGMLLFSGELV RFLGADEELRGPAEAYLRGLMFFGPMIPCGLALSYFVRVNGKPALASVGLMLSALINIVL DAVFIPVFGMGVEGAAYATGIAYTITTGVLCLHFFTPEAKCRFTRAIGAWKEVGQACWNG VSELINETSIGLVILFINWILLARIGSYGVAAFTIINYAAWFGASVSYAISDSLSPLVSA NFGARNFARARQFLGLALGMVFGIGLMLYGLFALYPEMLLKIFLPGEEKVVAVTLEFIDW FKLAFLFSGLNMALASFFTALHMAAASAAVALMRSLLFPVGFLWLLPQLFGNVGIYVAIP LAEMCTLVVASLIFAASRKHFADRKAHHGPERAARGGKNL >gi|316922729|gb|ADCP01000095.1| GENE 4 4745 - 5302 553 185 aa, chain - ## HITS:1 COG:RSc2460 KEGG:ns NR:ns ## COG: RSc2460 COG1434 # Protein_GI_number: 17547179 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Ralstonia solanacearum # 27 182 27 185 185 160 58.0 1e-39 MGQGWYGAIGKLAIAALLIGAVLIWRAEAWLHVDVPLEHADVIVVLGGESGQRVIGAAEL YHQGVAPKVFVTGSGDGGLIVRRLGMAGVPDAACESQSRSTYENAVLTKKALEASHPRSV MLVTSWFHSRRALAVFRDVWPEVTFGVHPVYAGATFTARFRIYEMGYIFSEYVKTLWYMV RYGIA >gi|316922729|gb|ADCP01000095.1| GENE 5 5515 - 6552 1720 345 aa, chain - ## HITS:1 COG:STM1222 KEGG:ns NR:ns ## COG: STM1222 COG0687 # Protein_GI_number: 16764577 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Salmonella typhimurium LT2 # 16 344 19 347 348 311 47.0 1e-84 MKKLLVALVALLIAASPAFAEEAKELNVFSWSEYIPQEVVNGFTKETGIKVVLSTYESNE AMYAKLKLLGAKGYDLVVPSGYFVEVMAQDNLLAKLDKSKIKGLGNLDPASLNQPFDKGN ALTIPYMWGTVGLLVNKKAVDVSTIKSWKDLARPEFKGRVLLSDDLRDTFGVGLKACGYS INSTNPDEIKAAYEWLKALKPSVRVFDVTATKQAFISEEVVAGLAWNGDAFIAMQENPDL EFIVPEEGMLIWLDNFAIPAGAEHKENAYKFISYLMRPEVAKLCVEEFNYSTPNKAAEKI LEPEYAESKVIVLDPEELAKGEVTVNVGKARELYEKYWEQLKAGD >gi|316922729|gb|ADCP01000095.1| GENE 6 6913 - 7698 998 261 aa, chain - ## HITS:1 COG:HI1345 KEGG:ns NR:ns ## COG: HI1345 COG1177 # Protein_GI_number: 16273255 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: 
Haemophilus influenzae # 10 252 1 243 247 209 46.0 3e-54 MMRKLGKLWLWIVYAFLFLPIFVVIAYSFNAAKYTSAWKGFTLKWYSQLFGNAALIDAVV NTLSVAALAASLATLLGVLAALCLKRTEFPGRQLLHASLYVLTVSPEMVMGISLLIFFIS VKLPLGFVSMLIAHTTLGLPFVALTVLARLAEFDEHLVEAARDLGATEAQAFRYVILPVI MPAVVAGWLLAFTLSMDDVLISFFVAGPTFEVLPLRIYSMVRLGVKPDINALSAIMFCVT IALVVLAFCLSRPRKSLRKQG >gi|316922729|gb|ADCP01000095.1| GENE 7 7698 - 8567 1256 289 aa, chain - ## HITS:1 COG:PM0263 KEGG:ns NR:ns ## COG: PM0263 COG1176 # Protein_GI_number: 15602128 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Pasteurella multocida # 8 277 6 275 284 252 48.0 6e-67 MKQTRAFFNRFSIGVTALWLLLFALLPNIGLLVVTVLTRGEQDFVLPQFTLDNYLRLLDP TFLKILWESLWLAFMSTLACLLVGYPFAYRIARASAKMKPWLLLLVIIPFWTNSLIRTYA LILILKANGLISTLLVWLGVTEQPVSFMYGDFAVFMGILYTFLPFMVLPLYASIEKLDSR LLDAARDLGASGFQSFWHVTFPLTLPGIIAGSMLVFLPSLGAFYIPEILGGAKSMLIGNF IKNQFMVARDWPLGAAASTILTLLLALLIAVYRMANRKVASRDRMDEVA >gi|316922729|gb|ADCP01000095.1| GENE 8 8554 - 9660 1224 368 aa, chain - ## HITS:1 COG:PM0264 KEGG:ns NR:ns ## COG: PM0264 COG3842 # Protein_GI_number: 15602129 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Pasteurella multocida # 7 366 20 379 380 409 57.0 1e-114 MTGQDNIIELRGVDKSFEDTRALEGIDLSISNGEFLTLLGPSGCGKTTILRILSGFETPD QGDVSIGGQRMNDVPPERRQVNTVFQNYALFPHMTVRDNVAFGLRMQSCPKGEIEGRVLE ALRMVHLEQYADRRPHQLSGGQQQRVAIARAVVNKPLVLLLDEPFSALDYKLRRAMQLEI KRLQRRLGITFVFVTHDQEEAFAMSDRVVVMNQGRIEQIGTPQEIYEEPSNLYVARFVGE INILPARIMSVPGEGIYIADISGRRFTLRTSRPFAAGDRVNVLLRPEDIRVYAHDDERPE GPYLTGRIEETVYKGATVDISIALDSGETLSVAEFFNEDDAEISYNRGERVAVTWVDGWE VVLPYEAD >gi|316922729|gb|ADCP01000095.1| GENE 9 10221 - 10589 490 122 aa, chain - ## HITS:1 COG:no KEGG:PCC8801_3547 NR:ns ## KEGG: PCC8801_3547 # Name: not_defined # Def: cupin 2 conserved barrel domain protein # Organism: Cyanothece_PCC8801 # Pathway: not_defined # 17 112 10 106 107 115 
54.0 5e-25 MNGKRNILFVDKDAYPSLEEPELFDTLAAGAGSIRVERIVSNGQVTPEGEWYDQDLDEWV VVLEGEARLHYMDGEEVGLKKGDSLFLPKRRKHRVVYTSSPCIWLAIHADLLTPQSVVAD AI >gi|316922729|gb|ADCP01000095.1| GENE 10 10729 - 11301 623 190 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRFGYTLLLAGALSLGIGIQAQADDFIPQKAGNVTISLPSGWEVLPKGVLKQFSQENGPE ILLLAQGPADDFLKLSIIRNPDPGTQEAFLKQDAAQTEKRCKKLIGELESQLGTGKAEAT CGKVENNGVAALATQMTIPAQDNRPELINMTWTYPNGDKGVIANAMFLKKDAGKYEADVK KALQSVKFDK >gi|316922729|gb|ADCP01000095.1| GENE 11 11662 - 13044 1720 460 aa, chain - ## HITS:1 COG:alr3658 KEGG:ns NR:ns ## COG: alr3658 COG0017 # Protein_GI_number: 17231150 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Nostoc sp. PCC 7120 # 1 460 1 463 463 546 55.0 1e-155 MLRTPVAEALGAASAASEIRIGGWVRTRRDAKGFSFLELNDGSCLGNIQCIVDEGTAAWD KLEGVNTGACVAVTGELVESPGKGQKWEVRAKDMTVFGLADPETFPLQKKRHSDEFLRTI AHLRSRTNKYGAAFRIRAEAAHAVHDFFYQNGFYYVHTPILTGSDCEGAGEMFRVTTLPL NTPVKPGENPYAQDFFGKECSLTVSGQLEAEALALGLGKVYTFGPTFRAENSNTPRHAAE FWMIEPEVAFTDINDNMDLAESLTCFVINRVLERRADDMDLFDRFVDKGLVDRLKGMVAQ PFARCSYREAIDILKASGKNFEYPAKFGLDLQTEHERFLAEEHFKRPVAVYDYPKEIKAF YMRQNDDRETVAAMDVLVPRIGELIGGSQREERLDVLESRIREMNQDPANYWWYLDLRRF GSVPHSGFGMGFERLVMMLTGISNIRDALPFPRTPGSLEF >gi|316922729|gb|ADCP01000095.1| GENE 12 13066 - 13710 195 214 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|154175107|ref|YP_001408238.1| ribosomal protein L22 [Campylobacter curvus 525.92] # 8 214 2 198 199 79 27 4e-14 MTDLSFFEQAISFVLHIDQHLFALVAEYGALLYLFLFIVVFCETGLVVTPFLPGDSLLFA AGTVAGAGHLSYPVCMLVLLGAAILGDAVNYEIGRHVGPSIFSRETRFLNKEHLLKAHAF YERHGGKAIILARFIPIVRTFAPFVAGIALMSPVKFLSFNITGAILWVVGLVSAGYFLGN IPIVKNNFSVVIYGIIIVSVLPVVIEFIRAKIKK >gi|316922729|gb|ADCP01000095.1| GENE 13 14526 - 14642 185 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPYAITTPEHGTAFDIAGKGIAKTKATEEAIRIQSMNL >gi|316922729|gb|ADCP01000095.1| GENE 14 14878 - 16305 2296 475 aa, chain - ## HITS:1 COG:XF1037 KEGG:ns NR:ns ## COG: XF1037 COG0499 # Protein_GI_number: 15837639 # 
Func_class: H Coenzyme transport and metabolism # Function: S-adenosylhomocysteine hydrolase # Organism: Xylella fastidiosa 9a5c # 35 475 1 446 446 608 66.0 1e-174 MSNAKPLDLSLPYQVADISLADFGKKEMQLSEREMPGLMELIRRYGDKKPLKGFKVMGSL HMTIQTAMLIKTLYELGADIRWASCNIFSTQDHAAAAIAELGYAKVFAWKGETLEDYWWC TEMALTWPDGSGPDLIVDDGGDATLMIHLGVDADRDPSVLDKPCDSKEARVLMDRIILGH KTNPGHWTKVAAKVRGVSEETTTGVHRLYQLSEQGKLLFPAINVNDSVTKSKFDNLYGCR ESLADGIKRATDIMIAGKVVVIAGYGDVGKGCAHSMRGFGARVLVTEIDPICALQAAMEG FEVTTMEDAVKEGDIFVTCTGNYHVITGDHMNNMKDEAIVCNIGHFDNEIEMTWLEENPA NKRIQIKPQVDKWVLPNGRSLIILAEGRLVNLGCATGHASFVMSNSFTNQVLAQMDLAAT DHEVKVYTLPKKLDEEVARLHLARLGVKLTTLTEEQAAYIGVSVDGPFKSDLYRY >gi|316922729|gb|ADCP01000095.1| GENE 15 16330 - 17259 316 309 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223485607|ref|YP_002589442.1| ribosomal protein L11 methyltransferase [Brevundimonas sp. BAL3] # 5 271 9 273 324 126 32 4e-28 MSNAIRYFKALSDETRLRLTYVLERYELSVNELVSILEMGQSRVSRHLKILTEAGLLSSR RDGLWVFYSSVKEGEGAAFLKAAMPFLDEDAEMRADVEMAARIIEDRALKTRQFFNTIAE HWDTLSREVLGGFDLAGAVCDTMPEGCDTSVDLGCGTGIVLERMRGRARQIIGVDGSPRM LELSRRRLAGVEGEPEGVSLRIGELSHLPLRDCEADFASINMVLHHLSNPENALAEIRRV LRSGGLLVVADFDRHTQERMRLDYGDLWLGFDEATMARLLHAASFEVVSTTRYPVEQGLA LNLTLARRL >gi|316922729|gb|ADCP01000095.1| GENE 16 17592 - 17894 305 100 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1552 NR:ns ## KEGG: Sterm_1552 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 98 1 98 98 113 58.0 2e-24 MKKKFAFLLMGKEFDTAKDTAVFETEHMISYIFTVTSFEEALKRAVACAEDGVGAIEVCG AFGKEWADKLIAATGNKVAVGYVVHNPEQDELFRQFFAKA >gi|316922729|gb|ADCP01000095.1| GENE 17 17961 - 19136 1182 391 aa, chain - ## HITS:1 COG:NMA1851 KEGG:ns NR:ns ## COG: NMA1851 COG4383 # Protein_GI_number: 15794739 # Func_class: S Function unknown # Function: Mu-like prophage protein gp29 # Organism: Neisseria meningitidis Z2491 # 11 387 39 415 519 340 46.0 3e-93 MLALEPQMNLTHGLTPKRLRGILELADQGDILEQHLLFADMEDRCEHLAAEMGKRKRALL TLDWEILPGRANDARAARIAAAVREQFDAVNGFEDLLMDLADGIGHGFSACEIEWDYRYG 
LHLPAAFHFRPQSWFQVLREDRNQLRLRNGKPDGEELWPFGWIVHTHRSRSGWLPRCGLF RTVAWAYLIRSYALEANIRYVQVHGLPFRLGKYPPGSREEDRRALYNALRTLGQDAAGII PQGMDILFETPASSSQDLAGALISRCELGMSKAILGGTLTSQSDGAASTNALGRVHDRVR RDLTISDAMQIASTLSRQLLAPLAVLNCGVSDPRLLPWFRFDTREAEDIATFAQALPALA SVMKISARWAHDKLKIPEAENENDILGAWRA >gi|316922729|gb|ADCP01000095.1| GENE 18 19417 - 21006 1111 529 aa, chain - ## HITS:1 COG:no KEGG:DVU2705 NR:ns ## KEGG: DVU2705 # Name: not_defined # Def: phage uncharacterized protein # Organism: D.vulgaris # Pathway: not_defined # 12 508 38 536 558 720 71.0 0 MRGKRRNSPKLPDASAGTPEERRERARQDFTFFRRAYFPHYCLVGGDSRLHEWLDAELPR MADAPEGVRLAIAAPRGEAKSTFVSLFFVLWAVLTGRKHYILIIADALEQAASLLGAVKD ELEFNEALKRDFPGVGKGHVWNVGTVVTPGNVKIQALGAGKRMRGLRHGPHRPDLIVLDD LENDGNVDKPEQRDKLQSWLQKTVLNLGAADGSMDVVYVGTLLHYDSVLARTLRKPLWKA RTFRSILRWPDRMDLWDEWESLLSSRGEAEARRFYRQHRAKMEAGAEVSWPASRPLYRLM CLRVEDREAFDSEQQNDPLSANEAPFNGCITFWVDRSRDWLLFGAVDPSLGKQGAGRDPS AILVGGFLRGSMTLDVVEASIRKRHPDRIIEDVIALHDVHHPLLWGVEAVQFQEFFAHVL VQRAAERGIALPVRPLVNSADKLLRIESLQPHMAQGRIRLHVSQQALIDQLRHFPRADHD DGPDALEMLWRLASTGFVSMGDAYIRMPLEREWCDDGEDGRWGGGRGPF >gi|316922729|gb|ADCP01000095.1| GENE 19 21008 - 21280 446 90 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTQAALQFDAHTLLLGLLGMVCGFLAYWGTRLEQRVDELKARQCLLAEEMHKSYVPREDC RERTAQILGGLERADEKLDRIADAIRTGGR >gi|316922729|gb|ADCP01000095.1| GENE 20 21277 - 21870 329 197 aa, chain - ## HITS:1 COG:no KEGG:LIC035 NR:ns ## KEGG: LIC035 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 18 186 11 226 234 87 27.0 3e-16 MTRNVLAVGLLCAACCAAGYVRGVGIAETRGRVLLAEQAEASASEREALAVAAAEAERLA RQRLEAETERAAAIAGELSETRNRLAAERQAFTRRMDRVAEDASRHCAGLPAGWVRLYNE ALGLAPAAPEASAPAAGIGAPAGSASATGAGVRSGASVMVSPEDLLAHARDYGGYCRNLR AQAEALLAVEKGREGRP >gi|316922729|gb|ADCP01000095.1| GENE 21 21863 - 22507 490 214 aa, chain - ## HITS:1 COG:STM4217 KEGG:ns NR:ns ## COG: STM4217 COG0741 # Protein_GI_number: 16767467 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein 
transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Salmonella typhimurium LT2 # 5 202 8 195 201 179 51.0 3e-45 MMIRMVLMMALLSAPFPQPGLASDIPRAAERHRAELIRVSRAVWGVEAPVAVFAAQVHTE SWWRNGTVSPAGAQGLAQFLPSTAEWLPRAVPELEREAGRSAPFNPGWALRALVSYDKWL WDRLNGADACQRMAFTLSAYNGGIGWVGRDRKEAERQGRDPARWFGQVEKVNAGRSASSL RENRRYVRLILLERQYWYRKAGWGPGVGCGGGHD >gi|316922729|gb|ADCP01000095.1| GENE 22 22567 - 22929 406 120 aa, chain - ## HITS:1 COG:no KEGG:HD1548 NR:ns ## KEGG: HD1548 # Name: not_defined # Def: hypothetical protein # Organism: H.ducreyi # Pathway: not_defined # 2 120 3 124 124 75 40.0 4e-13 MGFFRLLLRPRWRLFFFGLVALGLQIALLLISPAQVPVVLYKLALVMLAAILGMFFDVAV FPFATPDSYLDDDWKRAPGAVRLRCADFAIAKGYLWPFVMACLRRAAVVAAFVVAVSVGL >gi|316922729|gb|ADCP01000095.1| GENE 23 23018 - 23530 480 170 aa, chain - ## HITS:1 COG:no KEGG:LI0834 NR:ns ## KEGG: LI0834 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 27 157 17 147 147 100 39.0 2e-20 MEHMAKNRERRSASSEMFPGQARTRGQGVNELVELIGLPAAIELIRAKGGTSFSVPLGIT LRGQEQREKLVQIIGREQATKLIGRYGGTTLYIPTCRQAFVDTRDRNINLERDELAREGL SERALVSVLAVRHGLSDRQIWRILKKSGPVRPDMAACAAARRVSRSSARC >gi|316922729|gb|ADCP01000095.1| GENE 24 24148 - 24555 498 135 aa, chain + ## HITS:1 COG:NMA0738 KEGG:ns NR:ns ## COG: NMA0738 COG2932 # Protein_GI_number: 15793714 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Neisseria meningitidis Z2491 # 2 132 95 226 228 68 32.0 3e-12 MIPMVEARLSAGHGSLEVGGDSERSYAFRSDFLHRKGNPREMVLMRVSGDSMEPEIMDND IVLIDQGKRDVTPGRLYAIGFDEAIYLKRIDILPGKVILKSTNAAYPPVELDTRGDCEDA FRVIGRVLWCGREYK >gi|316922729|gb|ADCP01000095.1| GENE 25 24868 - 25602 810 244 aa, chain - ## HITS:1 COG:no KEGG:DVU2641 NR:ns ## KEGG: DVU2641 # Name: not_defined # Def: putative lipoprotein # Organism: D.vulgaris # Pathway: not_defined # 25 232 17 230 244 152 43.0 1e-35 MSARFPLSLRLFFTTCLCLCVCLAAGCSGKQVHVTVENEFNMMAKRLAPVLKAHGVIDEH 
GAYVAPVFSTPELPPQLGEYLFQRLSPAFRFKVDPALLPPTFALSRTAGDTVEMQPYGFM LGQGADIVTVTLLAQTDWNDDGLNEWLLLCRVKPIIGKNNMRDYYLLVEKPGASILVPKL LAVYDCLSQSCKLFVDVDQKKPPYAPEEPTIEVKIGQKDVTLPPNAPPPPGAPQHEFKES KLGE >gi|316922729|gb|ADCP01000095.1| GENE 26 25938 - 29939 5547 1333 aa, chain + ## HITS:1 COG:PH1451 KEGG:ns NR:ns ## COG: PH1451 COG0651 # Protein_GI_number: 14591241 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Pyrococcus horikoshii # 792 1244 64 514 617 187 28.0 1e-46 MDSSVTPTSPSEMAILVAGLLIMLFGVYKVYRERGRPGLLILWGGLVDLGLALMGLGLGP SGDVGSIMIVIYHAIARLAAWIGLSGLVANPYSCVANDIRGALHRNPVGAVLLAFALEAS LGISPAMTPEGQHLVLHAMAMHSIDVPLGGPVLAIIMALGYTVLLWLSAEVVLQVCFTRP DEPICTDVPLGGGGFARALFDQETELLGVPSGSWCPIVPNRWSKTPAAISLYVLLGMLVV LGLGRFFVQDFGAWLIGIDPHTLVELEGPWHVTPLLLYVGSFLVLCPRIIRPAYRAKYTF ALFLAAFILIYASDLPPLSRLFSLIISGIGTLVACYSTNYIEEDGRKHWYWFFLLLTFGA LLNIVTTDNIGTLASQWEVMTWASFALVAWERTSKARDAAIKYVVLCCSAAYLMIVGFFM IGGNHTQYGEIVANLAVHSDDVLKFALVFTLLGFAAKAGLVPLHSWLPDAHPAAPSSVSA PLSGVLTKMGVFGVVTVYFGLIGYKELASTGTYGGLTAPGLMVTFMGLCTMAYGEWMALR QEDLKRMLAYSTMGQVGEIFTVLGLGTWLAAAGALSHVLNHAIMKDLLFLCAGGLIFRVG SRKLSDLAGLVHEMPWAVGCMSIGIVSIMGLPPFNGFVSKYLMIVACMDAGQPLIAAALI AASLVGAIYYMRILRTILLDPAPAGRRRVQGVTTTMRVPMIILAGLCVILGVAPQTGLSL VTPVINELMPHLGSGAGHLSAALPSLDVGWPVYVILPMIGAVVPIWFRKDRVKAGWGAAA ILFLTTLHVLAFGQDLDTLSFIMALFIPAIGTVNLVYAVGYMEHSHTQWRFYAFFLLMTG GLLGVAASSDLFSFFTFWEIMSSWALYFAIAHEGTPEALREGFKYFLFNVAGAGFLFLGI GLIVAYTGTGEFAGVAKGFASLSPAVGTSIMLLMAVGFCMKAAQVSLRIDWQMHPALAPT PVSGYISSVLLKIAVFGLVKLFLVFGSVYAQSPDTAGLFSQQAVMYATAWIGAFTLLYSA VQAMLQNSLKLVFIWSTVSQIGYMVLGVAVGTSLGMAGGLLHMANHVFFKDLLFLMVGAV MLRTHADTISELGGVGRKMPVTMFCFFIGSLAAVGVPPTNGFTSKWIIYQALMAEGEPLL ALISLIGSVITLAYLARFMHAVFLGQPGRNLDHIEEAPWVMRAPMLLMAFMVILTGVFPG LMLAPLNAALAEYGMPSLDVAFYGLATGPGAWDATAVAVLMLVAFGGCWLALRFLLSRVK IRVAPPHACGHDASREASRIPPEAIYPALVNLCTGNKPGSATCPLPELALSLCRALRQSL GRNGRINKERPSC >gi|316922729|gb|ADCP01000095.1| GENE 
27 29933 - 30892 1395 319 aa, chain + ## HITS:1 COG:PH1439 KEGG:ns NR:ns ## COG: PH1439 COG0650 # Protein_GI_number: 14591230 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 4 # Organism: Pyrococcus horikoshii # 25 318 22 318 321 123 34.0 6e-28 MLDLLVGILHLVVFPGGLFALALGLFLKGCDRRVEARLQRRVGPPLIQPFLDLVKLSTKE VIIPTSAVRGAFLAAPVVALGGIAMCAALLPVPGVTDGLPKMGDLLVIFYLFPIPAMAIM LAGSSSGSPYGGIGFSREMIMMLAYEVPLLMIMLTVAMKVGGGSGAEFSLTKIMDYQAEH GSFGLTPVMWPAFLAWLMFLPATLGVPPFDQPEAETEILEGPLLEYSGVLLAFFHMASAL KMVVALALGVVLFFPGTISDIPAVNLVWFVVKCALFMLFALTVVKSATGRLRIEQALSFF LKYPAGLALLSLILVWIGC >gi|316922729|gb|ADCP01000095.1| GENE 28 30906 - 31340 578 144 aa, chain + ## HITS:1 COG:PAB1890 KEGG:ns NR:ns ## COG: PAB1890 COG3260 # Protein_GI_number: 14520931 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III small subunit # Organism: Pyrococcus abyssi # 10 135 28 152 170 149 53.0 2e-36 MNLIKKLSTRSPWLFRINAGSCNGCDVELATTACISRYDVERLGCKYCGSPRHADIVLIT GPLTARVKDKVLRVWNEIPEPKVTVAVGICPISGGVFREGYSIEGPVDRYLPIDVNVPGC PPRPQAIMEGVLEAVAIWRERMEG >gi|316922729|gb|ADCP01000095.1| GENE 29 31342 - 32181 890 279 aa, chain + ## HITS:1 COG:PH1440 KEGG:ns NR:ns ## COG: PH1440 COG1143 # Protein_GI_number: 14591231 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Pyrococcus horikoshii # 1 116 4 118 136 62 32.0 1e-09 MEFLKILWDNLKKGPVTDAFPFGETYTPDRLRGRVEIDPALCVGCGTCVHVCAAGAINIS KFEDGSGFEITVWRNSCCLCAQCRHYCPTKAVTLTNDWHSAHRATEKYTQITRAKVKYDY CAVCGAKMRVLPQEVLNRIYAKHSEVDVDVISHLCPNCRRIDVAVEEDRACHIDQLEKMA AREEVCLVPPRKAEKASADKENAKEAPAQAASAPKTAAPSREATAQPKSDPAASKAAPSA QTPPAEQAASAAAETRPDTPSGTESAAEKPSPAAGNTAD >gi|316922729|gb|ADCP01000095.1| GENE 30 32217 - 32777 899 186 aa, chain + ## HITS:1 COG:no KEGG:Dvul_0965 NR:ns ## KEGG: Dvul_0965 # Name: not_defined # Def: NADH dehydrogenase (ubiquinone), 30 kDa subunit # Organism: D.vulgaris_DP4 # Pathway: 
not_defined # 3 179 2 180 181 166 49.0 5e-40 MNTTWSSRSINPELVKGLVDLASEERHNWDSWGNGFEWVTLMNAGQLPAAAALLAKHEAR LCTVTALNKQLAEPVTTLAYHFDVHGYTVTVTVPLDPAENSVPTITPWFRNADWNEREFA ELFDVHVDGNTNPKRLFLDPEIDEGILNEVIPLTIMMNGACTKDMWERIMEVNAEPGEEV EKGVHE >gi|316922729|gb|ADCP01000095.1| GENE 31 32774 - 33862 1545 362 aa, chain + ## HITS:1 COG:PH1437 KEGG:ns NR:ns ## COG: PH1437 COG3261 # Protein_GI_number: 14591229 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III large subunit # Organism: Pyrococcus horikoshii # 9 362 11 381 427 265 37.0 1e-70 MSQTSSFSLPIGPVHVALEEPVYFHLDVDGETVRKVEITSGHVHRGMEGMATQRNFFKNT TLTERVCSLCSNSHSLTYCMAVENVLGMTPPPRAQYLRVLAEETKRVASHLFNIAIFAHN VGFKSLFLHIMEVRENMQDVKETIYGNRMNLSANCIGGVKFDMDDSLTMYMLERLDTVEE AVKRVEKVFRTDPLIAKRSRGVGVLSREDAWALGVVGPVARGSGLDIDVRKNTPYLVYPD LYFDIVKEQDGDVWSRAMVRVREVFESIRLLRQCLDKLPDGPLAVHMERIPESESVARSE APRGELIYYLQTNGTDTPARLKWRVPSYMNWEALTVMMKDCQLADIPVIVNSIDPCVSCT ER >gi|316922729|gb|ADCP01000095.1| GENE 32 34191 - 34532 278 113 aa, chain + ## HITS:1 COG:aq_1021 KEGG:ns NR:ns ## COG: aq_1021 COG0375 # Protein_GI_number: 15606317 # Func_class: R General function prediction only # Function: Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) # Organism: Aquifex aeolicus # 1 111 1 111 115 80 36.0 6e-16 MHEASIVAGMMHILEDEARRHNVSRITRVRLSVGLFTAVEPKTLEACFELYAEGTAAEGA ELDITTVPAEGRCLECGHGFPMTSPRCRCPECGSLRLEGKGGRDFLITGIDAC >gi|316922729|gb|ADCP01000095.1| GENE 33 34572 - 35069 570 165 aa, chain + ## HITS:1 COG:STM2843 KEGG:ns NR:ns ## COG: STM2843 COG1142 # Protein_GI_number: 16766149 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Salmonella typhimurium LT2 # 1 158 1 151 181 95 36.0 4e-20 MSRYMIESVPEKCKACRRCEIACIAAHHGLTFKEALKKRDELVARVHVVKSGSVKSPISC RQCENAPCARICPTGALQQDDGIVTMNAQICSGCQLCIMACPYGAISLESIGLPSATSET MAQKSMRSVAVRCDLCKDWMKREGKSVPACVEACPVHARSVVQIG >gi|316922729|gb|ADCP01000095.1| GENE 34 35161 - 36006 1139 281 aa, chain + 
## HITS:1 COG:CC1910 KEGG:ns NR:ns ## COG: CC1910 COG3494 # Protein_GI_number: 16126153 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Caulobacter vibrioides # 5 257 4 265 280 148 38.0 1e-35 MSESIGIIAGSGQFPRLVAEDAKAAGYGVVVCAFHGFTDPGLEALADAYTTVYLGQFDKV IDYFRKHGVRRLCMAGAINKPRALDLRPDFRAARILFSLRGKGDDALLRAIMADLEKEGF TLIQAAELSTSLLCPEGVLTRRGPSAEEIAEIDYGWPIAEALGRFDIGQCIVVKQGMVVA VECLEGTDAALRRGGELRGEGCVAIKRFKPKQDERVDLPSIGLQTVRLLIEQHYRCLAVD AGKTLFFDRAEALALADKHNFCIVALTEDSFGKLRLQAKAD >gi|316922729|gb|ADCP01000095.1| GENE 35 36035 - 36517 784 160 aa, chain + ## HITS:1 COG:mll0141 KEGG:ns NR:ns ## COG: mll0141 COG0503 # Protein_GI_number: 13470434 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Mesorhizobium loti # 8 160 5 164 166 186 55.0 1e-47 MGKADRYQKLYPVTWDQLHRDAKALAWRLLEQGPFKGIIGVARGGLVPAAIVARELNIRL VDTLCICTYQGREQTGGEEVLKAIEGDGDGWLVVDDLVDTGNTATIIRNMLPKAHFATLY SKPAGKPLVDTFITEVSQDTWILFPWDSEVQYTAPLVEQN >gi|316922729|gb|ADCP01000095.1| GENE 36 36854 - 38323 1950 489 aa, chain - ## HITS:1 COG:no KEGG:LI0461 NR:ns ## KEGG: LI0461 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 2 489 3 498 498 459 47.0 1e-127 MKRLTTLLLAAGLICSAFSAANAADIKAKGMWDFSFEYTNDSFDKHNSSDRFGAAQRLRT QIDVIASESLKGVVYFEIGDTHWGHAKDGASLGTDGRTVEVRYSYIDWAIPETDVLVRMG LQPFVMPGFVAGSAVLDGDGAGITIGGNFTENVGANLFWLRAENDNTNGYGKWHDDQNNA MDFVGLTLPLTFDGVKITPWGMYGFVGSRSLGGDAKGDIDDMRAGMLPVLPSGSDLTTLE GNRSEGNAWWVGITGEMTTFSPFRLAGDFNYGSVDMGTVRSLSGYDGATSKTIDLKRSGW LASAIAEYKLDFGVPGLLLWYGSGDNSNPYDGSERMPTVDAGWSGSSFGFDGGYGISSDT ILGTSPVGTWGVALRMKDISFMENLTHAIQVGYYRGTNNKNMPANAGMTTFHASYPDATG SMVGSTYLTTKDGAWEATFDTDYQIYKDLTLAVELGYIHLSLNDGVWGNVLDNTNRNAYK ASVNMRYAF >gi|316922729|gb|ADCP01000095.1| GENE 37 38681 - 39142 454 153 aa, chain - ## HITS:1 COG:no KEGG:LI0864 NR:ns ## KEGG: LI0864 # Name: not_defined # Def: hypothetical protein 
# Organism: L.intracellularis # Pathway: not_defined # 4 148 9 158 165 122 40.0 6e-27 MAALLVLGWFSADAQAHALLAKELASGDARVVEFSYSTGDVAAYAEAKLYGPDSADVEFQ NGRTDAQGRFAFLPDKPGTWTIVVADNMGHKVSHPMTITLSDGGTEEREVAPSGFSESLW ALPVLFRAVLGVSLLLNLFLGLALFRHRGGRRA >gi|316922729|gb|ADCP01000095.1| GENE 38 39513 - 40301 1060 262 aa, chain - ## HITS:1 COG:no KEGG:NT01EI_1069 NR:ns ## KEGG: NT01EI_1069 # Name: not_defined # Def: hypothetical protein # Organism: E.ictaluri # Pathway: not_defined # 22 260 21 259 261 246 48.0 7e-64 MLFKVLLAGMCSLCLVGTATLASAHEFIVKPDKTQAAKGESVGVQAQAAHVFMISEEAEP VETVVVELIQGNAKEPVKLAEDAKVKALVGKAALPADGPAMIVGHRLPQTWSDTTEGVLE GGRKDLEAKGKKVIKVGKYEKFAKTMLNPAANDGLYKKVLGHDLEIVLLTNPADIKAGDD IKAEVLLNGKPVKAPLGLTYDGYSAEQDAYMAKAETGADGMASFKVTKPGLWMLRTEVTE KLTDGSADKRNMRATYVFPVMK >gi|316922729|gb|ADCP01000095.1| GENE 39 40678 - 42612 2445 644 aa, chain + ## HITS:1 COG:aq_1784 KEGG:ns NR:ns ## COG: aq_1784 COG1048 # Protein_GI_number: 15606842 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Aquifex aeolicus # 6 643 9 652 659 720 58.0 0 MNLTQKIIGSHLVSGTMTPGSEIGLRIDQTLTQDATGTMAWLQFEALGIPRVKTDISLSY VDHNTLQMNFRNPDDHRFLRTVAAKFGGVYSGPGTGICHQLHLENFAKPGATLVGSDSHT PTAGGICSLAMGAGGLSVALAMAGEPYYIPMPKVVRVFLTGALTGWASAKDVILELLRRR TVKGGVGRVFEYAGPGVATLSVPERATIANMGAELGATGTLFPSDGVTRAFLKAMGREAD WRELGPDADAAYDEEEAIDLSALVPLAAQPHMPDRVVPVAELDGLPVEQVCIGSCTNSSY ADLKLVAQILDGHRINPGTDAMISPGSKQVLTMLAQEGLLEPLIGSGARLLECSCGPCIG SGGAPVTSGVSARTFNRNFEGRSGTKDAKVYLVSPLTAAQAALNGKFTDPATWGTPPAKP ELPADPPSIRHLFVFPPENGDGVEVLRGPNIVALEKFPRLDASGDVLEANVVIRLGDNIT TDHIMPAGAEILALRSNIPAISEHVFVRVDPDFVKRARAVSEAGGNGVIVAGENYGQGSS REHAALAPRHLGIRAVIALSMARIHRANLVNFGILPLVFVNREDYAKVAQGADIRIPLTE ITPGGVTNIEIAGVGVLPVRNDLSAAELDIIRSGGLLNAVKEKM >gi|316922729|gb|ADCP01000095.1| GENE 40 42685 - 44574 2968 629 aa, chain + ## HITS:1 COG:RSc1715 KEGG:ns NR:ns ## COG: RSc1715 COG0760 # Protein_GI_number: 17546434 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like 
peptidyl-prolyl isomerase # Organism: Ralstonia solanacearum # 1 628 1 637 648 227 30.0 6e-59 MLDLIRANAQSWGVKIAFGIIILVFVFWGVGGLTGGPSTVILTVNGEPITIQEFQRKYEQ LEQQVRAQYPDLDAAGLKAMQLKQQLIQNLILENLIMQEAKRVGIVVTPVELRKLIESFP AFHNAEGKFDPDAYVRVIKAQRNTPGNFEAELRNNMLMNKLRADVTAGAFVPEAEVRDLF RYEGERRILEYVFYPLEDYTSKVTVDDAQIKDYYEANQASFTVPPQADVEYLLIGAEALA AAQNISDAAVSEYYEKNAAQFATPEMVRARHILILSDAKASAEDQAKAKAKIEEIAKRIK AGEDFGEVAKEVSEDPGSGPQGGELGWFAHGQMVPEFDKASFALNPGELSEPVKTQFGWH LIQLEEKKAAGQKPLDEVKDQIRTRLAQDEASGKVQEALEQVQLAVIGGKSLKEAGEPLK LAPQETGLVNTTELTEKMGIKPENLPALLAAKPGTVLDTPFVTKTGYVIAKVNESKPQTV KPLDAVKDDIKARLQQDKARTLAFEAANAERKTFTADLPKELQGKLKKTEPVGRQGYMGE LGMNAELAKAAFAAKVNEWFPVAYTLDGGAVIARVAEVVTPSDEDWKTAAPQITDAVLNA KREQMFRGFLSLLRSNAKIELKNETILAD >gi|316922729|gb|ADCP01000095.1| GENE 41 44658 - 45719 239 353 aa, chain - ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 4 213 5 213 344 104 31.0 3e-22 MVKVSVIIPVYNVAPYLGECLDSVIQQSLTDIEIICIDDCSTDDSLQLLQKYAVSDSRIR ILCNDKNRDVSATRNRGIREAVGKYIFFMDSDDTFFSNNALESLYDVAITDLADVVWGRT VNWYCELGKSYYEINTLANPDKDIRRGTVAGFPVLSYNVTVWNKLLLRSFVLKYCLFFDE DLLKFEDNDFSCRICIHAKTISYIAINTYKYRQRIRGDRSKMYIKGCEDAFWKCHAAKNM VSAVYNAEIGIQKIYAENIGDILRGAYRDFIRYRKDESRIEEFMMAMRDIFLVMPEGVFK NLPMEIQLIAFPLSREKYSQAWNMLFTYKKLRVWYYIRYLPSLFQYQLERFSL >gi|316922729|gb|ADCP01000095.1| GENE 42 45725 - 46336 604 203 aa, chain - ## HITS:1 COG:NMA1447 KEGG:ns NR:ns ## COG: NMA1447 COG1636 # Protein_GI_number: 15794352 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis Z2491 # 3 188 21 202 241 113 36.0 3e-25 MAGNGKKVLLHACCGPCSIMCIQSLRDEGYDVTGYFANPNIHPVSEYFRRREAMEQVAEQ MDLPMLWQDDVYDLPGWLKMVHDLGIADNQGYARCGYCYESRLALTCAIASENGFDCFTT SLLYSRHQQHDAIRGIGTRVAASGDYRSDAGFGSEFLYRDFRPLWQAGIDRSKEMGLYRQ NYCACIFSEYERFEKKLRKLSAE 
>gi|316922729|gb|ADCP01000095.1| GENE 43 46326 - 48953 2463 875 aa, chain - ## HITS:1 COG:BU347 KEGG:ns NR:ns ## COG: BU347 COG1530 # Protein_GI_number: 15616953 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Buchnera sp. APS # 372 840 2 464 902 308 40.0 4e-83 MTDSEKIAAPAEQPAASEKPMSAAADAVETPSEPAPEKKPRRPRRKKPAVEQATLELSAP ENAVPAQPVEEQSVSAPPAPEAPARKRPAPKQSAAKTARKQSEQPKQPVEAPVEPKQPEQ PVLTEHPVQIPAVAASVLPAEAVASAVPAVSEPAAVPAVASPEESAAADASETAEGDAAA PAKRRRSRRSGRGRRRPAEGDAAVDAAAPEAGETPILDEPVAADAPAEAFAESAPAVEDE PAGAESGQESAATGEEAPRKPRSRSRRRKPARPAAEQAEETLKDQQAAVPEDEVFVFDDD DDGPAAAPAEAEGEEARAPRVPRKPRRRRNKKASAAAALQEAEDESAEEGEEEQDAKKGR KSAKAAASGVDRKMLVSVLPGDMVEVVLTEDGSVREYYVEMTHQAKIRGNIYKGVINNID TNLQAAFVNYGNVKNGFLQIDEVHPEYYLQPHEPTKGRKYPPIQKVLKPGQEVLVQVVKE PNGSKGAFLTTWLSLAGRFLVLTPGQEQIGISRKVEDGEERNRLRELIRGLEPGEGLGAI VRTVSAGTSKTTLQKDLSFLKRVWKDIRKRATTEKVPSLIYQEPGLASRSVRDYLSDDVS EVWVDNAETAEAIREMASLLFPRKGDLVHLYEKHDKTLWEHFGLLNQIEQVHSREVLMPS GGRLVFDQTEALMAIDINSGKIGGKTNFEAMAFRTNMEAATAIARQLRLRDIGGQIVIDF IEMRDPEHCREVEKELRNAMKGDRARHDIGKMSSFGLLQIVRQRTGSSAISISMETCPHC KGTGQRRNMEWQSVQTLRDLHRAMRSAAAAGQTEYAHSVDAELGLYLLNHKRERLSLMEK EFHIRLDIRIEGISSAPAAPAQQQGHQHNGQNNGR >gi|316922729|gb|ADCP01000095.1| GENE 44 49229 - 49969 889 246 aa, chain - ## HITS:1 COG:PAB1249 KEGG:ns NR:ns ## COG: PAB1249 COG2129 # Protein_GI_number: 14521860 # Func_class: R General function prediction only # Function: Predicted phosphoesterases, related to the Icc protein # Organism: Pyrococcus abyssi # 55 231 31 201 213 124 37.0 1e-28 MPSAQPLSSFAVPASGTSGAAAPSSGVRTWIVMGDLHDKAARLGEIQGLEEADGIIVTGD LTVTGGAAQARNVLEKLTRYNPVIYAQIGNMDRAEVTDWLAKQGWNTHLCVRELAPGVAL MGLGGSTFTPFGTPSEFPESRFADWLEHMWREARTYRHVVLSVHTPPHDTLCDIVGDGIH VGSSAVRDFILDAQPDVCLCGHIHESRAVDRLGRTVLVNPGAFATGGYAVLRLAGDDLSV TLHMLD >gi|316922729|gb|ADCP01000095.1| GENE 45 50094 - 50786 699 230 aa, chain + ## HITS:1 COG:BS_yqgN KEGG:ns NR:ns ## COG: BS_yqgN COG0212 # Protein_GI_number: 16079545 # Func_class: H 
Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Bacillus subtilis # 22 228 2 183 187 88 31.0 8e-18 MSLPPTSPSATGTPSGHAVPDRNVLRSKLIRARKALDPAEWERHSAAVQRRVLAIPEWRD ARTVAMYIAVRNEISTDEIIRRAWAEGKQVLLPRCLPPSAGEGIMEFVLCRGYDELTPGA FGLSEPGPACVPLPRTGWGQDAAGTFVPPSGTDASLRLPDLILVPAVGISPAGARLGYGK GFYDRLLALPGWGGAKRLALVHSLQLAAFPAGPLDIPMHGYATEKELIWL >gi|316922729|gb|ADCP01000095.1| GENE 46 50777 - 51532 991 251 aa, chain + ## HITS:1 COG:STM2661 KEGG:ns NR:ns ## COG: STM2661 COG1496 # Protein_GI_number: 16765977 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 10 250 9 242 243 112 32.0 7e-25 MALSFIPFAFPGAPGVRCAFQTRDWGVSEGPYGGGNIAYTVGDDPLRVSANRLDLAASLE LAEIAELDQIHGDTLVFDPPPSPLERFVEPPAFPKGDGLATDKPGIGLIIKTADCQPVLV AHKEGRHIAAFHIGWRGNRLRFLQSGIAAFCGRYGLDARDLLAVRGPSLGPQRSEFVNYG KEWGPDFANWFNARDKTMNLWRLTRDQLLEAGLPDEGIFGLDLCTASMPDSFFSYRRDGI CGRQANVIWIA >gi|316922729|gb|ADCP01000095.1| GENE 47 51533 - 53314 2192 593 aa, chain + ## HITS:1 COG:ygeV KEGG:ns NR:ns ## COG: ygeV COG3829 # Protein_GI_number: 16130771 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 3 583 9 588 592 379 37.0 1e-105 MPMLLDIRDTVADYAELIARVVGVDVEVVDAGLNRLAGTGLYASGVGENIRAEGETYRHV MRIRRSFVMETPREHFICTRCPGRDLCRETLSISTPIIDGSDVLGVIGLVCSTDEDRARV LGHKDVYVQFIERCAEFILHKLHDHADLLRARSFLDIMLRILEINSRGIVIFNAKGGISY LNDIARRDLGLKDDGLPTDVQFKRTGESFSDLEEFVVTARSRKHTLMGQMTPLAPSDYHF ATVFTFESLPRMADRVSSLGDSLSGVENLVGRSPAMLQLKDQIRQIAGSTSTVLVTGESG TGKELVARAIHAASGRRDKPFIGINCGAIPDALLESELFGYTGGAFTGASAKGRIGKFEL AHKGVLFLDEISTMPLYLQVKLLRVLQERAFTRLGSNRLIEVDIRIIAATNEDLAGAVRQ GRFREDLFYRLNVIPLQVPPLRDRHGDIELLSSHFLARYSARFNKPVPSLGPDMLAALSA YPWPGNVREFENVMEFMVNMAPSGGVLHPDMLPASVRGALPSSAPSPSAPLPPGGIIPLR DLERNAILDAVRRCGDDTPGKKAAAAALGIGVATLYRKLKEYEDEAAALSRTT >gi|316922729|gb|ADCP01000095.1| GENE 48 53569 - 54891 
2130 440 aa, chain + ## HITS:1 COG:SA1499 KEGG:ns NR:ns ## COG: SA1499 COG0544 # Protein_GI_number: 15927254 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Staphylococcus aureus N315 # 15 427 15 428 433 162 28.0 2e-39 MEFVVENLSPTRKKIAFTLTADDVNAAINTAVKGFQKDLSLPGFRKGKVPASVVEKRFGE DVYARATQDEINGLLQKTLKDSSLTPVSSMEMDNHDAFARDTEFKCNVTFDVLPEIEFPN YEGLEVDQAKADVSDDDVNEIIENLRGRMAELTEVTEDRLPQDGDTVDVDYSGTDEEGNK IDDVQGENFGVALGQGQALPDFEALVKTAKVGEEKTGPVNFPADYPHKPLAGKTVIFTIK VNKIQTSQKPEVNEEFAQKVGQESLEKMKASIVEHVSATKAQAARSEAMKKLIDGLLEKV SFEIPDSMLNARVERVVQEQAMKLQRMGLDDLRKDQAQEEKREEAKKEALETLRPQVFLM ALAQKENITVNDQEVEMAIYGMAMRAQQDYKKVSEAYHKSGLIYELRDRILADKALEMVY SKAKINEVPAVSLSEPKNAD >gi|316922729|gb|ADCP01000095.1| GENE 49 54959 - 55693 872 244 aa, chain + ## HITS:1 COG:aq_1339 KEGG:ns NR:ns ## COG: aq_1339 COG0740 # Protein_GI_number: 15606541 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Aquifex aeolicus # 39 235 4 200 201 280 69.0 3e-75 MVALCYFPFLWLHPETTAQSCLSDGESPDAQLSYGDSMQDILDMTVPIVVETSGRTERAY DIFSRLLKDRIVLLGSEVNDAVASLICAQLLFLESQDPEKEIYLYINSPGGSVTAGMAIY DTMNYITPPIATVCMGRAASMGAFLLSAGQKGMRYALPNSQVMIHQPLGGFQGQATDIDI HAREILRMRETLNGLLAKHTGQPIEKIAQDTERDNFMTAEMAQAYGLVDKVLASRRDLLA EKSE >gi|316922729|gb|ADCP01000095.1| GENE 50 55702 - 56958 253 418 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 156 383 258 432 466 102 30 8e-21 MPNTTPPNEPRCSFCGKTETQANQLVSGPDGVHICDECIATCNEMMAQNAIEQVREDERL LTPQEIKARLDEYVIGQDEAKKILAVAVHNHYKRVFYADALKGDDGVELEKSNILLVGPS GSGKTLLAKTLARVLRVPFAIADATTLTEAGYVGEDVENILVQLLQNADYDLEAARKGII YIDEIDKISRKADGPSITRDVSGEGVQQALLKIIEGTEANIPPKGGRKHPQQEFIRMDTS NILFIVGGAFIGLDKIVEQRMRGGSMGFGAKVETKKERTLGELLEQVHPADLVQFGLIPE 
FVGRIPVLTHVDDLGEDDLVRILTEPKNALTRQYQKLFELDNVTLRFTSDALRAIAHRAI ERKTGARGLRNVMESVMLDIMYQLPSMPGVKECVINDAVIEKKAPPLYIYDTEAHIAS >gi|316922729|gb|ADCP01000095.1| GENE 51 57260 - 59728 3180 822 aa, chain + ## HITS:1 COG:CC1960 KEGG:ns NR:ns ## COG: CC1960 COG0466 # Protein_GI_number: 16126203 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Caulobacter vibrioides # 19 804 7 781 799 780 53.0 0 MNETNDAITSNATLEGFELPLMPLREVVMSPHSIMPLLVGREASIKAIESAVNDYGKRIC LVTQREPELEKPDPADLYAVGVVGRVLQYMRLPEGPIKVLFEGLYRISWRPLTPEDQENP FGTDAFPKVVVTPLPIMRNDGPETEALVRATKEILAEYAHTNKKLTPEILSAVSALRDPG QLADTILPLLKIEYPKKQEALELADPGQRLEKVYEFLNTEMERVSMERRIKSRVKDQMDK NQREYYLSEQIKAINKEMGREDDPQAEIAELEQMLKAKNMPEEAREKGMSELKKLRMMAP ASAEYAIVRNYVDWILELPWNDMRETSIDIPEARAILDADHFGLEKPKQRILEFLAVQKL TNALKGPILCLVGPPGVGKTSLARSVATATGREYVRLSLGGVRDEAEIRGHRRTYVGAMP GKIIMALKRAKSSNPLFCLDEIDKMSSDYRGDPASALLEVLDPEQNKTFNDHYLDIDYDL SKVFFITTANSLHNIPAPLQDRMEIIELSSYLEVEKKHIARDFLLPRQIKEHGLKPGNIS LPEETLTEVIRSYTREAGVRSLEREIGALCRKTAIKLVEEDNLDQCVTITPEALETYLGV KKYRMDENEKSPKVGVVNGLAYTGVGGVLLNVETVIMPGKGNVATTGKIGEVMSESAKAA LSYVRSRAAALGLKPDFHSEIDIHTHFPEGATPKDGPSAGTAITTSLVSALLGIPVRHDV AMTGEVTLRGRVLPIGGLREKLLAASRAEMATVIIPRDNAKDLKEVPDEILQTLRIELVD HVDEVLPIALDAPEDAIWDKSDCPSLIESLRRHHEPAVTTAQ >gi|316922729|gb|ADCP01000095.1| GENE 52 60257 - 60577 420 106 aa, chain + ## HITS:1 COG:ECs5129 KEGG:ns NR:ns ## COG: ECs5129 COG2076 # Protein_GI_number: 15834383 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Escherichia coli O157:H7 # 1 102 51 152 155 113 64.0 1e-25 MHWFLLVLAGLLEVGWAIGLKSSHGFTRFWPSVATLVLMALSFFLLASALKHLPIGTAYA VWTGIGAVGTALLGIVLFGESASPARLLCIGLILAGIAGLKLLSDA >gi|316922729|gb|ADCP01000095.1| GENE 53 60611 - 62014 1745 467 aa, chain + ## HITS:1 COG:BMEI0868 KEGG:ns NR:ns ## COG: BMEI0868 COG2204 # Protein_GI_number: 17987151 # Func_class: T Signal transduction mechanisms # 
Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Brucella melitensis # 2 464 3 453 453 364 45.0 1e-100 MARILIIDDEEPIRFSLRGILEDEGYEVLEAATAEEGLEVADAERPDLVFLDIWLPGMDG LTAQARLKGNHPDLPVVMISGHGTIETAVSAIQQGAYDFIEKPLSLEKVVIVAARALEAG SLRRENQVLRTVLPEQDELIGQSPVMLKFQELLARVAPTDVWVLLTGENGTGKELAARAL HAGSRRADAPMIAVNCAAIPEELIESELFGHEKGAFTSADQARAGRFEMANNGTLFLDEI GDMSLKTQAKILRILQEQSFERVGGTRTIKVDVRVIAATNKNLEEAIAAGTFREDLYYRL RVVPLHLPPLRERGGDLDLLLAAFTERLCRVHACKAPVYAPETMERLRRYPWPGNVRELR NFAERMVILFGGKTVLPVDLPPEMTPQGKPEPSAEAASAACEPAFLPQSAVLGPDLDFKK ARAVFEARYLEAKLHECGGNITRLAETIGLERSYLHRKLKGYGISSE >gi|316922729|gb|ADCP01000095.1| GENE 54 62186 - 63694 2088 502 aa, chain + ## HITS:1 COG:VC2501 KEGG:ns NR:ns ## COG: VC2501 COG0260 # Protein_GI_number: 15642497 # Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase # Organism: Vibrio cholerae # 1 502 1 495 503 286 38.0 6e-77 MELRFQAGGPESWSADAVIVFLSEDEKLEKAYPELLDAAPWMSIAPGMRDFHGKKGEMGL LYGPPAHPLSRAVVVGLGKLDALTYAETLEEIRKGAGAAAVRCRELGVDTLALPVSGLAR FNADVERTIEEIVCAAMLGLWRNKAFKSKEPEDADPRWLALLFCDANVPDFPRLAARRGE THAEAVILARNLANTPANSLTPEDMAEEARKLARRHDMNCEVLEREDIVSLGMGAFAAVA AGSANDPRLIILEHAPAGHADEAPLIVVGKGITFDSGGISIKPAAGMWEMKGDMGGAAAV MGLFEALGQLETPRRVIGLMACAENMPDARATRPGDVVKTLSGKTVEIVNTDAEGRLVLC DALTYAQRRWNPSMIVDVATLTGACVVALGDDVAGLFCQDATLAQRIKDFGDIVGEPYWP LPLWDRYFELLKSETADFANAGARAGGASSAAVFLKQFITFDGSWAHLDIAGPGFVTRKI PNCPAGGTGFAVRTLIELAENK >gi|316922729|gb|ADCP01000095.1| GENE 55 63841 - 64314 513 157 aa, chain + ## HITS:1 COG:MA2197 KEGG:ns NR:ns ## COG: MA2197 COG3467 # Protein_GI_number: 20091038 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Methanosarcina acetivorans str.C2A # 1 146 1 148 152 79 31.0 3e-15 MGMKDRFRADRDFIDGVLNDADDMVVALNDGDGAPYAFPVNFVLLGDALYIHTAFTGKKM DLIRKNPRVGFSAYADVRIIREKATTTFRSVCGTGTAIIVEDKEEKRTALDAITVRYKSL CPRPAPDSMINRVAIIRIDIDSLMGKYAPGEEEALKA 
>gi|316922729|gb|ADCP01000095.1| GENE 56 64338 - 64880 203 180 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239918553|ref|YP_002958111.1| acetyltransferase, ribosomal protein N-acetylase [Micrococcus luteus NCTC 2665] # 1 147 2 146 170 82 32 5e-15 MPLVTERLRVRLWTEADRAAFRRMNADPRVMKFFPSTLSEAESDALLARIQAHQAEHGFT LWAVEDKATGNLAGLTGLARVAFDAPFTPCVEIGWRFLPEYWGTGYALEAARSVMGYAFK VLELPEVVAFTTETNLPSQGLMRRLGMLHNPAEDFDHPALPPDHPLHRHVLYRMLKDFWE >gi|316922729|gb|ADCP01000095.1| GENE 57 64749 - 65000 140 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRWGDSCRVFPLLPERLEGMGTEGAAWKRCLRARILLRTLSFPEVFQHPVEYMAMERVIR GQGGMIEILGGIVEHPEPPHEAL >gi|316922729|gb|ADCP01000095.1| GENE 58 65261 - 65764 841 167 aa, chain - ## HITS:1 COG:MJ1277 KEGG:ns NR:ns ## COG: MJ1277 COG0066 # Protein_GI_number: 15669463 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Methanococcus jannaschii # 9 162 11 164 168 158 51.0 5e-39 MPFTGVSHKVGEHIDTDAIIPARFLVTTDPKKLGENCMEGLEHGWVKRVQVGDILVAERN FGCGSSREHAPIAILGAGMKAVIAHSFARIFYRNAFNMGLLLLEIGDDVNKIKDGDKLEV SPEEGVIKNLTTGETIKVPALSPVMQNLIDKGGLVNYVREEIEAEGK >gi|316922729|gb|ADCP01000095.1| GENE 59 65913 - 67172 1762 419 aa, chain - ## HITS:1 COG:PAB0891 KEGG:ns NR:ns ## COG: PAB0891 COG0065 # Protein_GI_number: 14521549 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Pyrococcus abyssi # 1 419 1 421 423 503 56.0 1e-142 MPHTLAQKILQAHTDEEVREAGQIVNCHLSGVLANDITGPLAIKSFRAMGAAKVFDKDRV FLVMDHFTPQKDIDSANQVMVTRKFAQEMDVTHYYEGGDCGVEHALLPEQGLVKPGDLVI GADSHTCTYGGLGAFATGLGSTDVAAGMALGRLWFKVPPTIRVEFEGKMPKWLRGKDLIL RLIGDIGVDGALYKALEFGGETLSELDVEARLCISNMAIEAGGKCGLFPSDEKTLAYCRD HNSPDAKLIAPDAGATYERIVRIDVSNMEPQVSCPHLPENTKPVSEVKDVTVNQVVIGSC TNGRISDMRDAAEVLKGRKVAKGVRCIVLPATPTIWRQAMKEGLFDIFTDAGCIVGPPTC GPCLGGHMGILGDGERCIATTNRNFRGRMGSLEAEVYLASPCTAAASAVTGVITSPEKL >gi|316922729|gb|ADCP01000095.1| GENE 60 67408 - 68979 2256 523 aa, chain - ## HITS:1 COG:NMB1070 
KEGG:ns NR:ns ## COG: NMB1070 COG0119 # Protein_GI_number: 15676954 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Neisseria meningitidis MC58 # 2 501 4 503 517 505 53.0 1e-143 MSNRVIIFDTTLRDGEQSPGATMNLQEKLRMARQLESLGVDVIEAGFPASSVGDFEAVHA IANAVTDVQVAALARCNKNDIDRAWQALNGARNPRIHVFLATSPIHMEYKLRKTPDQVVE MAVAAVRHAAKYTNNVEFSAEDASRSNPDFLVRVFTEVINAGATTINVPDTVGYAQPDEY GKLIKYVIENTPNSHKAVFSVHCHDDLGLAVANSLSAIQNGARQAEVTLCGIGERAGNAS LEEIVMAMKVRPDVYGVDCGIQNEQLYPACRLLSLIIGRPIAANKAIVGSNAFAHESGIH QDGMLKNRETYEIMTPQSIGRKSTDLVIGKHSGRNAVRTRLEELGYRLDDEQVNVVFAAV KDLADKKANVHDEDLEALVFSEVYRLPDRYRLGNVSVQTTMGSSMPATAAVVIQAGEEEM RRAGFGAGPIDATFNVISQIVSRAPDLEQFSINAITGGTDAQGEVTVRLREGEYSATGRG SDPDIIVASAKAYIDALNRLDHKERDALRTQEATGEPAINTCN >gi|316922729|gb|ADCP01000095.1| GENE 61 69170 - 69922 827 250 aa, chain - ## HITS:1 COG:XF0442 KEGG:ns NR:ns ## COG: XF0442 COG1183 # Protein_GI_number: 15837044 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Xylella fastidiosa 9a5c # 9 230 11 241 256 169 43.0 5e-42 MQEEKPVRRGVYILPNLFTTASLFTGFLAMLLAVQGNIEGSALAILFSALMDGLDGKVAR LTNTASEFGIQYDSLADLVAFGVAPAFTSWMWVLEGYGKLGIGVAFLYVACTALRLARFN VCVTTVSKKFFVGLPCPAAGCAVAMFILFSSYLPSWLEARTPFLGLVVTFVMALLMVSRV RYFSFKEYGFLKTHAFSSMISALLVFVLILSQPVVFGFLLGAIYLLSGPIYTYIILPRRN RQLLGSLTQN >gi|316922729|gb|ADCP01000095.1| GENE 62 69977 - 70624 681 215 aa, chain - ## HITS:1 COG:RSc2074 KEGG:ns NR:ns ## COG: RSc2074 COG0688 # Protein_GI_number: 17546793 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Ralstonia solanacearum # 8 210 10 212 215 188 51.0 6e-48 MRKGNIGVAPEGLPFLFLLALTALAFAALRFWPLALVFLVLTWFSGHFFRDPERVVPSED DIAVSPADGRVIRVEPKADPITGEVRPCISIFMNVFNVHVNRSPVAGRVEAIRYFPGKFF NAALDKASTDNERCAYLVRDDGGRSWVMVQVAGLIARRIVCRVEEGDTLARGERYGMIRF GSRVDLYLPPDYSAVVANGDVVAAGESILARKFQA >gi|316922729|gb|ADCP01000095.1| GENE 63 70756 - 72219 1610 487 aa, chain - ## HITS:1 
COG:MA1483 KEGG:ns NR:ns ## COG: MA1483 COG0168 # Protein_GI_number: 20090342 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Methanosarcina acetivorans str.C2A # 1 485 1 474 476 329 41.0 1e-89 MQIRSVLFIVGVLLVCVGASLVLPLGVSLWYGDGAFLGLLSSLVIMVGGGLALAYTNRSQ TPLNLTLRDGFAVVGICWIVASFAGGLPFVLTSTASVTDAVFEAASGFTTTGATIFADIE GLPHGLLMWRSLTHWIGGLGIILLSLIVLPLLGVGGMQLYRAEVTGPSPDKLTPRMQDTA LTLWRVYCLMTIVLTVLLYIAGMDWFDALNHSFSTVATGGFSTKNNSIAAYPQASIQWIL IFFMFIAGINFSLHAQALRGVVKVYLRDRECRFYTVLLILGSAVIVAGLVEHGLFGMRSF PEFEAVTRAAVFQLVSVCTTTGFVSENFNLWPSITLVIILMFMLMGGCAGSTGGGVKVMR LVMLFRLAYLEIFRLLHPHSVRHLKIAGKSVPVEVISGVVGFFLLYLIVTTVGTIVLTVQ DMDLVSAFTATLTCISNVGPGLGSVGPVDNFSHVPTLSKWVLTLCMLLGRLEIYAILVLF IPEFWRD Prediction of potential genes in microbial genomes Time: Fri May 13 03:48:52 2011 Seq name: gi|316922696|gb|ADCP01000096.1| Bilophila wadsworthia 3_1_6 cont1.96, whole genome shotgun sequence Length of sequence - 43482 bp Number of predicted genes - 31, with homology - 31 Number of transcription units - 17, operones - 8 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 197 - 238 -0.0 1 1 Tu 1 . - CDS 340 - 1731 1645 ## COG0569 K+ transport systems, NAD-binding component - Prom 1896 - 1955 2.7 + Prom 1864 - 1923 3.3 2 2 Tu 1 . + CDS 1961 - 3958 577 ## PROTEIN SUPPORTED gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 + Term 4152 - 4177 -0.5 + Prom 4203 - 4262 2.7 3 3 Tu 1 . + CDS 4377 - 5135 1111 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 4 4 Op 1 . + CDS 5253 - 6170 1412 ## COG0451 Nucleoside-diphosphate-sugar epimerases 5 4 Op 2 . + CDS 6195 - 6965 1032 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase + Term 7072 - 7113 9.0 + Prom 7057 - 7116 1.9 6 5 Op 1 . 
+ CDS 7239 - 7808 606 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 7829 - 7853 -0.3 7 5 Op 2 1/0.000 + CDS 7872 - 8783 1271 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 8 5 Op 3 . + CDS 8788 - 9432 1002 ## COG0546 Predicted phosphatases + Term 9663 - 9700 6.3 9 6 Op 1 . + CDS 9981 - 12206 2811 ## COG0210 Superfamily I DNA and RNA helicases 10 6 Op 2 . + CDS 12207 - 13205 1229 ## COG0611 Thiamine monophosphate kinase + Term 13393 - 13443 3.9 + Prom 13599 - 13658 5.8 11 7 Tu 1 . + CDS 13807 - 15717 2497 ## Ddes_2311 hypothetical protein + Term 15730 - 15762 1.3 + Prom 15836 - 15895 2.7 12 8 Tu 1 . + CDS 15944 - 16987 1431 ## BF3822 putative sodium-dependent transporter + Term 17157 - 17190 -0.5 - Term 17352 - 17394 7.0 13 9 Op 1 1/0.000 - CDS 17404 - 18771 1662 ## COG0471 Di- and tricarboxylate transporters - Prom 18793 - 18852 2.9 14 9 Op 2 . - CDS 18953 - 20266 1741 ## COG0281 Malic enzyme - Term 20390 - 20443 4.0 15 10 Op 1 36/0.000 - CDS 20574 - 21359 1056 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit 16 10 Op 2 8/0.000 - CDS 21519 - 23390 2398 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 17 10 Op 3 . - CDS 23402 - 24052 756 ## COG2009 Succinate dehydrogenase/fumarate reductase, cytochrome b subunit - Term 24719 - 24756 7.8 18 11 Tu 1 . - CDS 24781 - 25164 470 ## COG2033 Desulfoferrodoxin - Prom 25228 - 25287 8.9 - Term 25348 - 25380 5.4 19 12 Tu 1 . - CDS 25402 - 27108 1359 ## COG2821 Membrane-bound lytic murein transglycosylase + Prom 27401 - 27460 3.1 20 13 Tu 1 . 
+ CDS 27513 - 28595 1635 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components + Term 28625 - 28663 7.3 + Prom 28698 - 28757 4.0 21 14 Op 1 7/0.000 + CDS 28897 - 29514 535 ## COG1150 Heterodisulfide reductase, subunit C 22 14 Op 2 2/0.000 + CDS 29515 - 30435 1259 ## COG2048 Heterodisulfide reductase, subunit B 23 14 Op 3 2/0.000 + CDS 30432 - 32393 2636 ## COG1148 Heterodisulfide reductase, subunit A and related polyferredoxins 24 14 Op 4 2/0.000 + CDS 32381 - 33850 1816 ## COG1908 Coenzyme F420-reducing hydrogenase, delta subunit 25 14 Op 5 6/0.000 + CDS 33856 - 34908 1261 ## COG1145 Ferredoxin 26 14 Op 6 . + CDS 34920 - 35756 1229 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases + Term 35882 - 35923 6.4 - Term 36332 - 36361 -0.5 27 15 Tu 1 . - CDS 36391 - 36990 419 ## COG1489 DNA-binding protein, stimulates sugar fermentation - Prom 37228 - 37287 2.7 28 16 Op 1 . + CDS 37304 - 38479 1597 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 29 16 Op 2 . + CDS 38553 - 39896 1552 ## COG0166 Glucose-6-phosphate isomerase + Prom 40095 - 40154 3.4 30 17 Op 1 13/0.000 + CDS 40276 - 41865 2009 ## COG0642 Signal transduction histidine kinase 31 17 Op 2 . 
+ CDS 41884 - 43263 1876 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains Predicted protein(s) >gi|316922696|gb|ADCP01000096.1| GENE 1 340 - 1731 1645 463 aa, chain - ## HITS:1 COG:PA0016 KEGG:ns NR:ns ## COG: PA0016 COG0569 # Protein_GI_number: 15595214 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Pseudomonas aeruginosa # 13 450 1 441 457 283 37.0 4e-76 MLFRRAPQKVEELKIVIVGAGEVGFHIAQRLSEEQKQVVVIDQNPDKLRRIEDNMDVQTV LGSGSIPSVLKNAGAGDAAIFLAVTDSDETNIVACLFANTIAPKAMKLARIRTEEYTAYP QLLGSGSLQISMLINPEQEIVRSIERLLTLPGAVEYGQLADGFIRMVGMRVESGPLIEQP LTRFREIVQDDGIMVGAIARGQKLIVPSGSDVIKAGDTVYFAYKPTSQRALLRALHKTRG SLGTACIVGGGNIGLRLARLLENKGVDTKLIDISEERCEQLADQLQGTLVLHGDGTDKSL LQEEHIDQMDAFIAVTGDEESNILSCLLAKSLGVKETVARVNKAAYLPLVEAIGIAHSVS PRLSAVNSILQYIRQGKVLSSVSVGGDAAEMLEALVDEESLVAGKRVHELGLPKGILLLG VIRGGEAFIPSGQTVIQPQDRIVLLSLREKMSKLEGIVSKKKE >gi|316922696|gb|ADCP01000096.1| GENE 2 1961 - 3958 577 665 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 [Haemophilus influenzae 7P49H1] # 4 553 7 558 618 226 28 1e-58 MDIPLLPDIVAIFCLSIGVLLVCHQVKIPPIVGFLLTGVLCGPTALGLVQNPHAVELLAE IGVVLLLFSIGLEMSGEELMRLKRPVFVGGTAQVLLTVGAFMCLGVLTGQTWQQSMMYGF LASLSSTAIVLSRLQQKAQSESPQGRLDFSVLIFQDIAVVPMMLAIPILAGRGDTDLGGM LVSAGRTLVILVGGWVLARHVVPRIMQLVLRTRSKELMLMTVLGLCFAIALGTASLGLSL ALGAFLAGLLLSGSEYSLNVLEGILPFKDVFTSLFFISVGMLLDVGFLVHHLDKVFLFAA LLIFLKSILSLPPMLLVGYPLRVSILAAMSLAQIGEFSFVLARSAVNSGLMDNDGYQMFL AASIVTMMLTPTVMEIAPKVASFVSRHMHMPVDEEAAAQRDESLKDHLIIVGFGVGGKHL ARTAREAGIPYVILEMNPDTVSRYGGKEPIHGGDASKPLVLEHFGIQSARVIAVVISDPS AVRAITAVARKLNPKVHIVVRTRFLGEVDALRRLGADDVIPEDFETSIEIFSRVLGHYLV PRQTIERFVNSIRHEYYNMARQLRMTGMDLPSLADEVLTGLEVVACKVEPGCALDGKRLM DTSLRKKYGVTVVGIRHAGQIIPSPGGDAFLHGDDTVFLFASPASLTTVMPFFRTNPEPE RAEDL >gi|316922696|gb|ADCP01000096.1| GENE 3 4377 - 5135 1111 252 aa, chain + ## HITS:1 COG:STM2289 KEGG:ns NR:ns ## COG: STM2289 COG3836 # 
Protein_GI_number: 16765616 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Salmonella typhimurium LT2 # 11 242 13 242 267 129 35.0 5e-30 MKTAQEIRAMMRGGNATIGTWLQFPSADVAEVMGASGYDWVAVDMEHGAFTRAILPDVFR AIERGGSTPFARIAQAELRPIKDALDSGAQGLVFPMIESREQLDRAISLSLYPDAGGSRG VGYSRANMYGRDFDPYRVDVAKDILLVAQIEHIRAVDQLDAILSHPRLDAIMVGPYDLSG SMGLTGQFDHPDFVAVMARINEACKRHNMPMGNHVVQPEPERLAKCIADGYRFIAYGIDA VFLLHGCACPKR >gi|316922696|gb|ADCP01000096.1| GENE 4 5253 - 6170 1412 305 aa, chain + ## HITS:1 COG:MA4464 KEGG:ns NR:ns ## COG: MA4464 COG0451 # Protein_GI_number: 20093250 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Methanosarcina acetivorans str.C2A # 1 203 1 225 298 121 33.0 1e-27 MKIVFFGGSGFLGSHVCDKLSDAGHDVTIVDLRPSPYLRPDQKMVTGSLLDEDLVNTVVS GADAVFDFAGLADIGESNQKPVETARINVLGNVILLEACRKAGVKRYVFASTLYVYGKSG GFYRCSKQACELYIENYHDMFGLEYTILRYGSLYGPRADRRNAINRFVAEALEKGTITYY GAPTALREYIHVEDAALSTVEILGPEYANQNIVLTGNQPMQVGDLFRMIEEMLGKPIEIK YQHDPNSGHYQITPYAFMPKVGRKMTPHLSTELGQGVLKVMEEIHQELHPDLKELGGYLM VDKAD >gi|316922696|gb|ADCP01000096.1| GENE 5 6195 - 6965 1032 256 aa, chain + ## HITS:1 COG:FN0807 KEGG:ns NR:ns ## COG: FN0807 COG1212 # Protein_GI_number: 19704142 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Fusobacterium nucleatum # 1 252 1 245 245 160 36.0 1e-39 MNIIAIIPARMGSSRFPGKPLADIHGIPMVGHVTIRTAMCKELSSTWIATCDQVVMDYAA SAGLKAVMTADTHVRCTTRTAEAMLKIEEMTGQRADIVVMVQGDEPMITPDMIDAAVEPM LKDPSINVVNLMADLATEEEFEDPNEVKVVTDLNGDALYFSREPIPSRRKGTPHVPMHKQ VCIIPFRRDYLLKFNAMDECPLEIIESVDMMRILEHGEKVRMVPTDKRTLSVDTQEDLER VREMMRDDALRISYTK >gi|316922696|gb|ADCP01000096.1| GENE 6 7239 - 7808 606 189 aa, chain + ## HITS:1 COG:alr1276 KEGG:ns NR:ns ## COG: alr1276 COG0110 # Protein_GI_number: 17228771 # Func_class: R General function prediction only # Function: 
Acetyltransferase (isoleucine patch superfamily) # Organism: Nostoc sp. PCC 7120 # 1 186 54 265 275 92 31.0 5e-19 MKHFLEELCIAALAWIPTAAGMGARLLLWRPLFKRCGRARFGTGIAMQGCKNMSLADGVR IGRGCQLYAEGGTLDMGEDAALSPGVTVDASGGLIRIGKQVAIGPGTVLRAANHCFDSLE KPIMLQGHLYGEIVIEDDVWIAANCTITPGTRIGHGAVVGAGAVVTRDVEPYAIVGGVPA RVIGSRRKD >gi|316922696|gb|ADCP01000096.1| GENE 7 7872 - 8783 1271 303 aa, chain + ## HITS:1 COG:AGpA578 KEGG:ns NR:ns ## COG: AGpA578 COG0111 # Protein_GI_number: 16119626 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 299 5 308 317 197 36.0 2e-50 MKIAITTSSFARYSDEPLKLLEQAGVEYVMNDKGRALTEEEAIQILEGCHGVAAGTEPLT KKVMDALPDLKVISRCGTGMDNVDRAYAAEKGIEVRNTPDGPTLAVAELTLGLILTLLRQ VPHQDRELRSGVWKKRMGNLLHGKNVGIVGFGKIGQAVAHLLEPFGVNVGYHDPFADVPG YTRLELDDLMGWADVITLHCPKTENGAPLLDLGRLSLMRPGSIILNIARGGLIDEKALLG LLTAGHLAGAALDCFTKEPYDGPLKEMDNVILTPHIGSYAKEARIIMETDTIKNLLDVLF PGA >gi|316922696|gb|ADCP01000096.1| GENE 8 8788 - 9432 1002 214 aa, chain + ## HITS:1 COG:jhp1130 KEGG:ns NR:ns ## COG: jhp1130 COG0546 # Protein_GI_number: 15612195 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Helicobacter pylori J99 # 1 184 1 188 222 100 28.0 3e-21 MALSLIVFDCDGVILESVDAKTKAFGQVAEPLGAEARDRLLLYHTLHGGVSRREKFAWLY REVFHREITPEEMEDCCARFVSYALDNVLNAPLVPGVMDVLERWHGRVPMYVCSGTPQEE LQSVLEQRGLARYFTGICGTPPAKAELLKGIVRKERIDPADAVMVGDATTDSDAAEAAET LFYGRGPFFENTAYPHGSDLTGLNRWLEELAGNE >gi|316922696|gb|ADCP01000096.1| GENE 9 9981 - 12206 2811 741 aa, chain + ## HITS:1 COG:mlr7946 KEGG:ns NR:ns ## COG: mlr7946 COG0210 # Protein_GI_number: 13476580 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Mesorhizobium loti # 4 635 20 666 697 344 37.0 3e-94 MIDISTLNPAQLEAVTAPDGPVLVIAGAGSGKTRTIVHRLAWLAEQGVPASDMLLLTFTR 
KASREMLLRATDLLGYSIGGVHGGTFHSFAFSVLRQYRPAWAEGPVTVMDSADSASAIQQ CKERLKVGKGDRSFPKTQTIIGLLSKARNKEISIGDVLQRDAQHLLPHADALESIGEAYR GYRRQHGLLDYDDLLFELEDLLKGDPEAGREGLAERFRERYRYIMVDEYQDTNRVQARLV RLLAGEAGNVMAVGDDAQSIYAFRGADVRNILDFPKLFPGTRVIRLEENYRSTQPVLDVA NAVLAPASEGFRKNLFTTKENTPKTPRVRLVRPMSDLTQANVVAARVEELLDRYQAKEIA VLFRAGYQSYHLEVALSKRGIKFRKYGGLRYAEAVHVKDVVAFVRLAINPLDMPSFERVA GLSKGVGTKTAEKIYHVAAQGDFDALRKACTKYPDLWSDMLLLDKLREHNLTPAALIEMV IEHYTPRLQAIFPDDWPRRQQGLSELAHIASAYTDLEQFVADLSLETPEDDADEFDEAGR VVLSTIHSSKGLEWDAVILLDLVEDRFPSRHALVRPEDFEEERRLMYVACTRAREDLELF VPATLYSRQNGGNEPATPSPFVRELPFSALEEWQEGYTGRISKRSTSFAGDPAFSRPSLD IPRELANPNAGRVKGVFPPPVLPEAKGDRAASKGGAGCGYCRHKVFGRGKIVEQLPPDKC RVNFPGFGLKVILSAFLTLEE >gi|316922696|gb|ADCP01000096.1| GENE 10 12207 - 13205 1229 332 aa, chain + ## HITS:1 COG:BS_ydiA KEGG:ns NR:ns ## COG: BS_ydiA COG0611 # Protein_GI_number: 16077657 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate kinase # Organism: Bacillus subtilis # 25 292 20 275 325 134 37.0 3e-31 MSDLVSEDHILARIASHFPNEGPDVLLGRGDDCAVLGVSGPLCVSTDLFMEDVHFRRSYF TPEDIGWKALAVNLSDLAADGARPMGFTVGLSLPPDADMALVDGLCAGMAELAAQASVPL VGGDLSRADKLHLCLTVFGQAEKTLLRGEAQPGDVIFLIGRTGLARAGLLLLEERGRAAL NEWPIPCEAHLRPAPRLKEGMRLSRLAADWGREKGDPTCGRLGLMDLSDGLARDLPRLVG PGMGADIDMPMPHTEILRFMRSRNEAEPVAAAKRHAFLGGEDYALIGTCSPELAVHVMVA NAETTMLGKVTEGGVIRVDGVPISGGFDHFAG >gi|316922696|gb|ADCP01000096.1| GENE 11 13807 - 15717 2497 636 aa, chain + ## HITS:1 COG:no KEGG:Ddes_2311 NR:ns ## KEGG: Ddes_2311 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 22 636 33 681 681 532 48.0 1e-149 MATLILLNKTELPKGTPSEALVAVWDKGSVPDGQISIPVELNERLLPIRDDLAAWTYETG CARINGKLLEEHLRADDNLSMWWCSTLVEKHPKVTHNLFPALKLRALELLLDEKGVTRLE LCAAAGADPWMEDVLGRFCKATGREFTVRRIGGAEAAQPEGLKAKLKACYYRLPAPVKAL VRFPAWLWIVRRRLPRTPLSRPALPEGVKPASIVTYFPNIDMAAAKNGRFRSRYWEKLHD ALQPADGKPNRVNWVFLYFPAPQCSFPDAIKLRDRFRANGQDGASFHFLEEFLTNGDIAK SLVRFGKLLLSSRAVEPQVRELFRFPGSQMDLWPVLKDNWGDSTRGWRALERCLMREAFR 
RYAEWAGPQEWTTFPQENCPWERMLCQSMHDGDAGPVYGAQHSTVRPTDFRYFDDPRMIN DPACRRAMPELWLCNGTGAQDALLAGHMPEDRTALVEALRYLYLAPKPDAEAAPETPAKP TRLLIVTSFFADETDAHLATLAASAKAGTLDGWEVVVKPHPYLPVVERLKNLYGGAEPPC VVDGPIGNFLTPGTVVWASNSTTVALEAAFRKLPVLVQAAEDDVDLCPLQGLPGVVNVRT VSDVAAALAAPKAPGLPPDYLALDPELPRWKERLGL >gi|316922696|gb|ADCP01000096.1| GENE 12 15944 - 16987 1431 347 aa, chain + ## HITS:1 COG:no KEGG:BF3822 NR:ns ## KEGG: BF3822 # Name: not_defined # Def: putative sodium-dependent transporter # Organism: B.fragilis # Pathway: not_defined # 4 300 1 295 305 223 41.0 1e-56 MKKLLRALKQWTLLIAIVVGAVGHSFFSQFTWLAPWLLASMLLLTFCNISPRDLRFHPLH LILMSLQIVMSLGLYALTLPWHPAIAQGVCMAALTPTATAAAIITGMLGGSVAFLAAYAF LSNLAIVVLAPLVLPLIATGQIDTPFLETMSNVFMRVGPTMLFPLLAAWLIQYAAPKVNA VLLRWGILAYYLWAGMLVILMGGTFEQLLKPGEKDYQLEIFLALTGIAVCTANFILGKSI GSRFHRRIAGGQALAQKNIILPMWLTFQYLDPIASVSLAAYSIFQNIVNAAQIWLKGRRD DRIIQRLHDFHEKKHARIQAAQQLVPQEEHAELLSELPTRVKDRLKK >gi|316922696|gb|ADCP01000096.1| GENE 13 17404 - 18771 1662 455 aa, chain - ## HITS:1 COG:STM3356 KEGG:ns NR:ns ## COG: STM3356 COG0471 # Protein_GI_number: 16766651 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Salmonella typhimurium LT2 # 48 455 6 420 422 250 39.0 5e-66 MKRFMWSFLALLIVACVGVYLAGGFECFAASPGTAGAAIPGDRTKAYITLGILIVAALLF VTEALPLPITAMLVPVGLSSTGVLTSKVAFSYFGDNTVVLFMAMFIIGESTFVTGFADRV GQMALKLSKGDMKKLLLFSMIAVGGLSTVLSNTGTTAVAVPMIMGMCASAKVAPSKILMP VAFASSLGGTVTLVGTPPNGIINSMLDQVGIPQFGFFEFGKFGLILFAVGMLYYWAFGYR LLPEDKGGEDHSFGSELKERRTAKMPYSMGIFAFVVIMMVTGALPLTTAAVLGACLVVAT RCMTMKEAFHCIDWVTIFLFAGMLSMSAAMQKSGAAALVANAVVSNVSGPTMLMFVSCAL TMLLTNFMSNTATAALMAPLALPIAVGGGISPLPLMMGICMSASACFLTPIATPPNTIVL TPGRYTFLDYMKAGWPLQIISLALCVFVIPMIWPF >gi|316922696|gb|ADCP01000096.1| GENE 14 18953 - 20266 1741 437 aa, chain - ## HITS:1 COG:VNG1624G_1 KEGG:ns NR:ns ## COG: VNG1624G_1 COG0281 # Protein_GI_number: 15790581 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Halobacterium sp. 
NRC-1 # 7 430 5 428 433 472 60.0 1e-133 MALFTREEALEYHRSPRHGKTEVVLTKRCESQKDLSLAYSPGVAEACKAIAEDQALVSEY TGRANLVAVISDGTAVLGLGNIGPYAAKPVMEGKGVLFKNFADVDVFDLCLKTGSTEEFI AAVRTMEPTFGGINLEDIKAPECFIIEETLKKEMGIPVFHDDQHGTAIISGAALLNACEL TGRRIEDVTVVVVGAGAAGMACARFYCALGVKRGNIRMFDSKGLIHKKRSNLAEYKLEFA QEEDLGSLADCMKGTDLFLGLSTKNLVSQDMVRSMADKPVLFAMANPDPEISYEDAKAAR PDCIMGTGRSDYPNQINNVSGFPYIFRGALDVEATEINEAMKIAAAEALAALAKEPVPAE VCSAYNVDSLEYGFDYIIPKPLDPRILTCITPAVAKAAMDTGVARKRIEDLDGYARELER RVKASHDRIRPFVASYE >gi|316922696|gb|ADCP01000096.1| GENE 15 20574 - 21359 1056 261 aa, chain - ## HITS:1 COG:jhp0177 KEGG:ns NR:ns ## COG: jhp0177 COG0479 # Protein_GI_number: 15611247 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Helicobacter pylori J99 # 3 237 6 241 245 272 51.0 4e-73 MSRKLHIEVFRYNPLDPASVPHMQSFYVEEYSSMTLFIALNIIREQQDPSLQFDFCCRAG ICGSCGMVINGRPGLACHTQTRDLPDHIVLHPLPVFKLLGDLSVDTGVWFREVGRKIEAW VHTQDKAFDPRAEEKRMENELAEQIFELDRCIECGCCIAACGTARMRKDFLGAATINRLA RFYIDPRDNRSEDEYYDVIGNDQGVFGCMGLLGCEDVCPKKIPLQDQLGIMRRMLALKSV KGILPKALYEKFKGCGSHCHS >gi|316922696|gb|ADCP01000096.1| GENE 16 21519 - 23390 2398 623 aa, chain - ## HITS:1 COG:Cj0409 KEGG:ns NR:ns ## COG: Cj0409 COG1053 # Protein_GI_number: 15791776 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Campylobacter jejuni # 1 619 1 631 663 681 53.0 0 MRTIYSDVLCIGAGLAGERVALAAAMEGFHTICLSVVPPRRSHSSAAQGGMQAALGNAVM GKGDSTDIHFADTVKGSDWGCDQEVARIFADSAPIVMREMAHWGVPWNRVVAGKHTYYKG GKPFEAEEEESKHGLIHARAFGGTAKWRTCYTADGTGRSVLNTLDSKCLQYGVEIHDRMQ AEALIHDGERCMGAIVRSLRDGELVAYIAKATLIATGGYGRIYRATTNAIICDGGGQICA LNTGVVPLGNMEAIQFHPTGSVPTDILMTEGCRGDGGTLLDVNEYRFMPDYEPEKAELAS RDVVSRRMTEHMRKGFGVKSPYGDHLWLDIRHLGEHHITTKLREIYDICTHFLGVNPIHQ LIPVRPTQHYSMGGVRTDKDGHAYGLKGLFAAGEAACWDLHGFNRLGGNSLAETVVSGRY IGSKMVEYLKGSESVFKTEPVNDARKLVAKTIDDIISCRNGKENCFDLRNAMQDVMMDDV 
GIFRNAKDLQNGVDRLLELSERAKHIGLHGSVKGFTPELSMALRVPGMIKLALCTAYGAL QRTESRGAHTREDFPARNDAEWLTRTLAYWKEGDSLPTLKYEDASPWFELPPGERGYGGG KIIPAEIPADKIKTKEQAEEAIK >gi|316922696|gb|ADCP01000096.1| GENE 17 23402 - 24052 756 216 aa, chain - ## HITS:1 COG:Cj0408 KEGG:ns NR:ns ## COG: Cj0408 COG2009 # Protein_GI_number: 15791775 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, cytochrome b subunit # Organism: Campylobacter jejuni # 21 206 25 228 260 94 34.0 1e-19 MAQANITSHVRLRNHISGYTDVIQMLTGVLLCCFVLMHMVLVSSVILSPKIMDSLAVFLE TSYLAQIGGPIILLVMILHFILAARKMPFSPLELREFWRQAKMMHHMDTWLWLVQVATAI VILVMASAHVINILSNLPISADKSAAAIQGGMVPFYLVLFAALDLHIAIGLYRVGVKFGI LNRENRLKWRKYALYLVIGLALLSLATHYSFATMAI >gi|316922696|gb|ADCP01000096.1| GENE 18 24781 - 25164 470 127 aa, chain - ## HITS:1 COG:AF0833 KEGG:ns NR:ns ## COG: AF0833 COG2033 # Protein_GI_number: 11498439 # Func_class: C Energy production and conversion # Function: Desulfoferrodoxin # Organism: Archaeoglobus fulgidus # 1 126 1 124 125 102 42.0 1e-22 MTYRYQLYKCPICGTVTEIVHDGVSAMICCGKTMEVLDDREPVGTKESHMPVVEATEKGC KVSVGSKLHGMDADHLIEWAEIRTSDDTVLRKYFKAGETPVVEFPVPFDQVVFARAYCNK HGAWRAK >gi|316922696|gb|ADCP01000096.1| GENE 19 25402 - 27108 1359 568 aa, chain - ## HITS:1 COG:PA1222 KEGG:ns NR:ns ## COG: PA1222 COG2821 # Protein_GI_number: 15596419 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-bound lytic murein transglycosylase # Organism: Pseudomonas aeruginosa # 271 568 97 378 385 174 37.0 3e-43 MERFPAHEFDEPFRWAHGAPVASGAGVFSHALGKQGLALGVHDIVHKDERGAPDFGYPRA DMDQIVIPGGPPIAALGFAYGQHDAEAFDLRVTHAHLADVLAAGTFEEVQVFRVIEIPHR IGFSIGDAVGEPNIGSGKIVHEQHPLLKVPQKRGAGTAALAAAVLALGLLAGCAAPLPAP SAPVRDVAGDLVPENDEQAAVLAGKLDRAGQRLHSWKELAPALRASRAHIAGREPGQVAV AHGDVAVTWGDVAKTLDCLEALLPRLDAEPGLLAERFRWVKLKDGAAFSGYYEPVVKASR TRKPGYTAPLYRVPPDLRELNLGSFKSELIGQRVVYRMEKGKPVPYYTRAEIDGLDGRPG VLRGKGLELAWLSDPVDAFFLQVQGSGRLRFEDGKEMPVRFAGSNGKPYLSIGRYLADQG EIPTGQVSMQSIRQWLREHPELRDDLLRRNQRYIFFRKGPETSSGSITSGPVGSMGSPLS 
SMVSLAVDRTTFPLGSVLAFNVNIPDPSSPVEEGPVSTTPLFGIGLAQDTGEAIKGRRVD LFCGKGARAAYIAGHLNGPGEIWMLLAK >gi|316922696|gb|ADCP01000096.1| GENE 20 27513 - 28595 1635 360 aa, chain + ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 12 268 8 258 332 66 26.0 8e-11 MKRWFKMLTGSLAVVLFITFSCAAFAAAPVKLRTAWLDEHEAFLVWYAKEKGWDKEEGLD IEMLLFSSGMAQLNALPAGEWVLAGTGAVPGMMGALRYGTYTIAVTNDESYTNALLLRPD SPILKTKGYNKDYPEVYGHPDDVRGKTFLITTMSSPHFTLSHWLRVLGLKESDVTIKNID QAQGLAAFDSGIGDGVCLWAPHMFFGIDKGWKVAGTPNTCGVGLPTVLMGDKKFCDANPE LVAKFLRVFLRSANMLQKETPESLAPEYCRFFQEFVGKSFTPEMAALDIKTHPVFNLEEQ LALFDGSKGQSKAAKWMADIAEFFAGVGRITPDELKVVKDGSYATDKFLKLVKQPIPDYK >gi|316922696|gb|ADCP01000096.1| GENE 21 28897 - 29514 535 205 aa, chain + ## HITS:1 COG:MTH1878 KEGG:ns NR:ns ## COG: MTH1878 COG1150 # Protein_GI_number: 15679866 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit C # Organism: Methanothermobacter thermautotrophicus # 27 120 157 250 325 80 34.0 2e-15 MKKVYWSLNHSTAGRIMETINLETTRDEAFLREVEAESGQKVTNCYQCGNCTAGCPCGPE YDLQVSQVMRAVQLGNKEMALSSRSLWLCVSCSTCTSRCPNNIDVAKVMDVLRHMAWKEG KTNYAMASFWQSFLTTVRHFGRTYELGVMAMFMARTGRVFTDVDLAPRILPKQKLPFKPH TIVGKDAVARIFKRFDEQRSAEGDK >gi|316922696|gb|ADCP01000096.1| GENE 22 29515 - 30435 1259 306 aa, chain + ## HITS:1 COG:SSO2358 KEGG:ns NR:ns ## COG: SSO2358 COG2048 # Protein_GI_number: 15899116 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit B # Organism: Sulfolobus solfataricus # 1 282 4 288 293 182 36.0 7e-46 MKLAYYPGCSGQGTSAEYERSTRAVCRALEIGLKEISDWSCCGSTPAHACDHVLSSALSA RNLALAAAEGAERVGTPCPSCLANLKTAKYRMQDEAFRDKVNALLDNPCPEELPETVSIL QVLVEDYGTGAIAEKVKKPLEGIKVACYYGCLMSRPADIMQFDDAENPMAMDNIMTALGA EVVPFPLKTECCGAAMGIPRRDITARLTGRILERAQAFGADAVVVACPLCHMNLDLRQRQ 
AVGGSMKMPVLYFTQLMGLALGLPHQELGFEKLCVSPDELLRKIDAAQAAKAAAAKADEA TEEAKA >gi|316922696|gb|ADCP01000096.1| GENE 23 30432 - 32393 2636 653 aa, chain + ## HITS:1 COG:MK0265 KEGG:ns NR:ns ## COG: MK0265 COG1148 # Protein_GI_number: 20093705 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit A and related polyferredoxins # Organism: Methanopyrus kandleri AV19 # 2 652 7 653 656 634 51.0 0 MKIGVFICHCGSNIAGTVDCGKVAEIAKTYPDVAFSTDTMYTCSEPGQEEIVAAIKEHKL DGVVVASCTPRMHEPTFRRTVERAGLNKYMFEMANIREHVSWIGKDKEANTNKAAELVRI AVEKLRRDNPLFSKSFESTKRVLVIGGGVAGIQAALDCADAGLDVLLVERESHIGGKMAK LDKTFPTVDCSSCILGPKMVDVAQHPKITLMAASEVTNVSGYVGNFTVTVKKKATYVDWS KCTGCGACMDKCPAKKTPDKFNEFVGPTTAINIPFPQAIPKKATINAEFCRKLTSGKCGV CAKVCPTGAINYEMKDEEVTETVGAIVAATGYDLMDWTVYGEYGAGQYPDVITSLQYERI MSASGPTGGHIKRPSDGKEPKSVVFIQCVGSRDKSIGRPYCSGFCCMYTAKQAVLTKDHI PDSQSYVFYMDIRSNAKLYDEFTRRAVEEYGTQYIRGRVSAIQPDGHGQYIVRGADTLLG KPVEVKADMVVLAVGIEGSKGANKLAETLHISYDGFGFFMESHCKLRPVETNTAGVFLAG VAQGPKDIPSSVAQGSGAAAKVIGLLSKPLLESDPQISVVDIKRCVGCGKCIKVCPFQAI VEKEIRGEKKAQTIEAVCQGCGLCTATCPQGAIQLSHFTDNQILAEVDALCRF >gi|316922696|gb|ADCP01000096.1| GENE 24 32381 - 33850 1816 489 aa, chain + ## HITS:1 COG:MK0268 KEGG:ns NR:ns ## COG: MK0268 COG1908 # Protein_GI_number: 20093708 # Func_class: C Energy production and conversion # Function: Coenzyme F420-reducing hydrogenase, delta subunit # Organism: Methanopyrus kandleri AV19 # 10 134 7 131 141 160 58.0 6e-39 MSVLTGKELRIVGFLCNWCSYGGADTAGVARFTQPTDLRIIRVPCSGRVNPLFPLKALLS GADGVLVSGCHPRDCHYSEGNYYARRRLETLKEFLPIIGIDPRRFEYTWVSASEGQRWQA VVTAFTERVHKLGPAPKFEDAKPMYVMPNLDLPAPLRPLGCGVNPAAMTELKDQIKAALE AKEVEFVMGWQKGFDGLHATPLYMRKPEDVEKLIWGPLNVHSLATYLPLFKGKKVGIVVK GCDSRGVVELLQENLINREDVVVFGMGCNGTIDINRVLAKIGDVTEVESVTGSGATLKVR ADGKDYEFAMQDVAQDKCRACTVPNAVIHDHFAGSPTNIPDGAQPAMPAIMTFLDGLSLE DRMGFWRGHIDRCIRCYACRNACPMCVCRDNCVADSREPHWLTQEDSPTQKMFFQLIHAM HLAGRCTGCGECNRACPMGIPVGALKLQMGRVVKKLFEYAPGMDVDAVPPLLGFQLEEKN IHEHHIEGA >gi|316922696|gb|ADCP01000096.1| GENE 25 33856 - 34908 1261 350 
aa, chain + ## HITS:1 COG:Ta0046 KEGG:ns NR:ns ## COG: Ta0046 COG1145 # Protein_GI_number: 16081223 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Thermoplasma acidophilum # 7 339 5 299 306 146 28.0 8e-35 MSAIRFLAADKLASWLAEVAASRQVLAPRQEGKAVVFRPLVAGETPSLARATISPKAAVF PACETLVHFTGTKDPENPAKLNMQLDDTPDAPARMVFGCRSCDARGFAVLDKPFLEGKFQ DAYYKARRDNLLVVTRTCDSPCSTCFCHWTGGGPADPTGSDVLLTAVNGGFVLEAVSEKG EAFLASTKLADGSDKMDEAKAAREKAAASQQPAPDLSKAARRLEQRFTDVDFWAEQTAKC LSCGACTYMCPTCQCFNISDEGDPLEGRRLRSWDNCMSPLFTREASGHNPRTAKALRMRN RVSHKYWYAPDYSDGRFACTGCGRCIKQCPVSLDIREIVLNAIADDAEAK >gi|316922696|gb|ADCP01000096.1| GENE 26 34920 - 35756 1229 278 aa, chain + ## HITS:1 COG:PAB1785 KEGG:ns NR:ns ## COG: PAB1785 COG0543 # Protein_GI_number: 14521075 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Pyrococcus abyssi # 1 267 8 280 292 192 36.0 5e-49 MAGTHNPLLPEMATVIETVQETHNIKTFRVRFDDEEKMKNFTFQPGQVGQLSVFGVGEST FVINSSPTRMDYLQFSVMKAGENTAALHKLNAGDKIGVRAPLGNWFPYEQWKGKNVFFIG GGIGMAPIRTIMVYLLDNRKDYGDISLLYGAKTPADLSFQSDMPEWLERKDLNVTLTIDN PADGWEHKVGLIPNVLKEIGPKPKNTIAVLCGPPIMIKFTLAALVELGFPDDQIYTTLEK RMKCGIGICGRCNIGGKLVCVDGPVFSNKQLKELPPEL >gi|316922696|gb|ADCP01000096.1| GENE 27 36391 - 36990 419 199 aa, chain - ## HITS:1 COG:sll2014 KEGG:ns NR:ns ## COG: sll2014 COG1489 # Protein_GI_number: 16330301 # Func_class: R General function prediction only # Function: DNA-binding protein, stimulates sugar fermentation # Organism: Synechocystis # 1 189 44 227 237 122 39.0 6e-28 MIGLLRPGAPVLLSPAQNPERKLRWTLELVWCGDPAAYPRGGFWAGVNTSTPNRMLEAAF RAGKLPWAEGYTSFAREKVRGESRLDGLLEGPGMPRLWVECKNVTMSEDDVALFPDAVSE RGLKHLGTLKGIVATGERAAMFYCIQRPDACCFGPADMIDPAYAVGLRDAVAHGVEVYPH LAVLTVEGIGLEAECLFMV >gi|316922696|gb|ADCP01000096.1| GENE 28 37304 - 38479 1597 391 aa, chain + ## HITS:1 COG:alr4853 KEGG:ns NR:ns ## COG: alr4853 COG0436 # Protein_GI_number: 17232345 # Func_class: E 
Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Nostoc sp. PCC 7120 # 1 386 1 382 388 325 45.0 1e-88 MQFSERMAHIKPSATLTINAKALELKSQGVDVTSLAIGEPNFPTPAHVCDAAKQAIDEGF TRYTAVPGIPPLREAIAGYFKRAYGVDVAPDCTIASNGGKQSLYNLFAALLNDGDEVLIP SPYWVSYPDMVYLNGGVPVAVKAPASQGFKVTVEQLEAALTPKAKILVFNSPSNPTGACY SVSERDAIVKWALDRGLFIIADEIYDQLVYEPNKPSSIIVWWKQYPDRIAVVNGLAKSFA MTGWRVGYTVTHPDLVKKLSQITGQATGNICSVSQKAAVAALTGPYDCVESMRQAFQKRR DMAWKEIASWSNVVCPKPEGAFYLFADISALYDEKMPDATAACTRLLSEAGVALVPGDAF GAPDCIRFSYAVADDVLMDALARVRKVLIGQ >gi|316922696|gb|ADCP01000096.1| GENE 29 38553 - 39896 1552 447 aa, chain + ## HITS:1 COG:TM1385 KEGG:ns NR:ns ## COG: TM1385 COG0166 # Protein_GI_number: 15644137 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Thermotoga maritima # 63 438 56 437 448 189 32.0 1e-47 MHTLDWKHAWTGRLLTTAPTRPQALKLLEPQLDSCNARLIGEVASGKLPFLNLPFRSSLT ARLKALTPQLRRFRHMVVLGIGGSALGTRALQKAFFPQQDLPNHQGPWLWILDNVDADVL EAQMSTLNPEETIVVPVSKSGGTIETLAQYFFMKAWLQRALPNTWHEHMLLVTDEKKGYL REEADRHNIVSLPVPDHLGGRYSVLSAVGLVPAAFMGLPWEDFLEGAASVNAPLVNDELG KPENLRNHPAWALANWCFELSRHDYSQLIFFTYIPSWASFGQWFGQLWAESLGKNGKGTM PLPAVGVTDQHSLQQMFLDGPADKGCIQLHCPNLSKGPQFPDDVPDSWEWLRGKTFGDLL NAETLGSAAALVHNGVPLTRLEAAESSMRAAGEVMGLLMATTVLTGWLMNINPLDQPAVE LGKRLAYSRLGSSSYPEEAAILKAFLD >gi|316922696|gb|ADCP01000096.1| GENE 30 40276 - 41865 2009 529 aa, chain + ## HITS:1 COG:atoS_3 KEGG:ns NR:ns ## COG: atoS_3 COG0642 # Protein_GI_number: 16130156 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 248 481 40 270 278 131 32.0 3e-30 MFGKKDSVDKGLPFGLTRFLSWISLILILASSVMLAFFIGNSARNTLLLRQQQYAMLMAS NLNQQIYRRFTLPTFLAFGRIALRQSMQYQQLDDVVQSTILGLHLESLRIYGADRVVNYS INQMELGRTDITPASMERVMSSGIPFFETLSSVNMLSAMFMLKIPDGSYSLRTMFPLTVD LGISESTGKPESLVTGVLEITQDITDDYDSVVRFQWLILATCLGSSTILFAILQFFIIKA ERTLAERMSRNRKLEAELHQNEKLASMGRVIASIAHEIRNPLGIIRSSSEFLIRRTPEQE 
KGSPAQRILTAIFDESCRLSQIVNDFLDYARPRVPRQDRVDLNALLNQATGFLDGEMKRL GVECVRDTEGSLFVLGDKDLLYRAVYNILVNAYQAIGTNGTIHIRGERRDDGMIALSFHD SGPGFPPALLHKLLDPFFTTKDHGTGLGLPIVNTIITSHNGKLLLTNPEEGGAMITILLP VASEPDIGDTPEKDERPAASADDEPVTLTDPLPSEKHDLREGDASPQTR >gi|316922696|gb|ADCP01000096.1| GENE 31 41884 - 43263 1876 459 aa, chain + ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 7 456 7 456 461 386 45.0 1e-107 MNDKVHLLLIDDEKNYLLVLETLLTEAGYAVTALNDPETALAFLEESEVDIVITDMKMPR ISGREVLERVKKNWPHIPVLIMTAFGSIESAVEAMRYGAFNYITKPFSNDELLLSIQNAA ELARAHRQYQLLRESLEERYGKHQIIGRSRAIRDMLALVDRAGPSRATVLITGESGTGKE LVARAIHFSSPRSQEPFVSVNCMALNPGVLESELFGHEKGSFTGAVAMRRGRFEQASGGT LFLDEVGELTPELQVKLLRVLQERRFERVGGSEEIEVDIRIVAATNKDLMPLVQAGTFRE DLYYRLNVVHIPIPPLRERREDIPLLVAHFAEKAAKENGIPPKTFSTEALNHLSGYEWPG NIRQLQNVVERCLVLVPGDVITLEDLPAEIRDEEAQFKSAVDLLPVQLDLADTLERIEAA LIRRALVRADFVQAKAAELLGISKSLMQYKLKKYSITGH Prediction of potential genes in microbial genomes Time: Fri May 13 03:49:20 2011 Seq name: gi|316922678|gb|ADCP01000097.1| Bilophila wadsworthia 3_1_6 cont1.97, whole genome shotgun sequence Length of sequence - 19179 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 8, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 408 - 1472 978 ## COG0144 tRNA and rRNA cytosine-C5-methylases + Term 1625 - 1654 0.5 - Term 1527 - 1581 18.5 2 2 Op 1 . - CDS 1601 - 2023 414 ## CDR20291_1132 hypothetical protein - Term 2150 - 2189 8.2 3 2 Op 2 . - CDS 2242 - 3258 1137 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase 4 2 Op 3 25/0.000 - CDS 3279 - 4253 1063 ## COG1475 Predicted transcriptional regulators 5 2 Op 4 2/0.000 - CDS 4313 - 5101 886 ## COG1192 ATPases involved in chromosome partitioning 6 2 Op 5 . 
- CDS 5604 - 7241 1215 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 7 3 Op 1 5/0.000 - CDS 7448 - 8377 1040 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 8596 - 8655 3.7 8 3 Op 2 . - CDS 8844 - 9974 1280 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis - Term 10044 - 10074 3.4 9 4 Op 1 1/1.000 - CDS 10082 - 10528 472 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 10 4 Op 2 5/0.000 - CDS 10512 - 11672 1433 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 11833 - 11892 1.7 11 4 Op 3 . - CDS 12022 - 12792 1014 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) + Prom 12782 - 12841 2.8 12 5 Tu 1 . + CDS 12974 - 13399 686 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism + Term 13626 - 13673 16.2 - Term 13612 - 13661 9.0 13 6 Tu 1 . - CDS 13712 - 15208 1277 ## LI0827 lipoprotein - Prom 15376 - 15435 2.6 + Prom 15433 - 15492 1.9 14 7 Tu 1 . + CDS 15540 - 16598 1415 ## COG2089 Sialic acid synthase 15 8 Op 1 . + CDS 16950 - 17426 578 ## 16 8 Op 2 . 
+ CDS 17457 - 19151 1926 ## DVU0348 hypothetical protein Predicted protein(s) >gi|316922678|gb|ADCP01000097.1| GENE 1 408 - 1472 978 354 aa, chain + ## HITS:1 COG:BS_yloM KEGG:ns NR:ns ## COG: BS_yloM COG0144 # Protein_GI_number: 16078637 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Bacillus subtilis # 4 353 80 445 447 156 30.0 6e-38 MKLMIGIAAYELLYLDRIPAHASVSAAVDAVRARFGQGLSRVANGVLRSLIRLSESEDLK AFAFYESRIKDPMERLSVFHSVPRWMLELWTKGYGPEKAETFARAVSVVPSPCVRVNAAR DEWEALRDFLCKEGEAVGISGVRFAPGTQPEYMRDYLRQGRLSIQGAGSQLVLEALDAPR WEGPVWDACAGRGGKTLALVELGVPVLAASDTYQPRLRGMRDDAKRLVLKTPPLFCASAA EPALRGTPRTILLDVPCSGLGTLARHPDLRTLRTPGQVAGLVDLQRRILDAVWSYLPSGG HLAYITCTMNPAENEGQIDAFLARTPGASLEKQWQSTPDAFGSDLMYGAMLKKA >gi|316922678|gb|ADCP01000097.1| GENE 2 1601 - 2023 414 140 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1132 NR:ns ## KEGG: CDR20291_1132 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 140 1 140 140 179 59.0 2e-44 MESTTSKCVMVLDENLPLGLLANTAAILGITLGKHMPEAVGADVLDGSGKPHLGIITFPV PILRGDAEQIRAIRETLYGVDYQDVIVVDFSDVAQCCKNYGEYIGKAAQADESEWRYFGL GLCGPKKLVSRLTGSMPLLR >gi|316922678|gb|ADCP01000097.1| GENE 3 2242 - 3258 1137 338 aa, chain - ## HITS:1 COG:HI1526 KEGG:ns NR:ns ## COG: HI1526 COG2870 # Protein_GI_number: 16273426 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Haemophilus influenzae # 21 334 12 318 476 211 41.0 2e-54 MSDASPVLLTPDVVGGLAGKRVLVVGDIMLDAYLIGDADRISPEAPVPVVRIEKERYLLG GAGNVARNIVALGGVATLAGVMGKDASADRIRGLVRDEGICGAFVALPQRPTTVKTRVMA RRQQMLRLDSEDASPLSSADQEALLSVITEQLPAHDVVILSDYSKGIVSLSFMSRLRGLL AASPHQVKVLIDPKPSNVGLYGGSFLLTPNTKETGECAGMPVRSREEILAAGRAILEKVG CPHLLTTLGSDGMALFSGPSEVWHVPTMAQDVFDVTGAGDTVIATLGLGLAAGLPLLASC VLANYAAGLVVAQVGAAVASPDALREAIRELPVTITKW >gi|316922678|gb|ADCP01000097.1| GENE 4 3279 - 4253 1063 324 aa, chain - ## HITS:1 COG:RSc3325 
KEGG:ns NR:ns ## COG: RSc3325 COG1475 # Protein_GI_number: 17548042 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Ralstonia solanacearum # 5 237 6 226 303 172 49.0 6e-43 MSANERGLGRGLDALFRNVTPAERPSSAKADSAEVPAPSVPKNTTTLPIAALTPCKGQPR KHFDEAALDELAASIRSQGVIQPLLVRPRRTETATIYEIVAGERRWRAAQRAGLTEVPVY LRELSDEDALTAALIENLQREDLNPLEEAQAIQSLRERLPYSQEELAQRLGKSRSAVANS LRLLQLPRPMQDALKDGVFTPGHARAVLALPERALQDILFNAVMTRRLSVRDAEEAVIHW KRHGLLPSSLMGASAPVARSARAPKPACIKLAIRQLRENVAPKASISGTDRAGRITLPYE SEEQLAELLSRLGLSVELSEAKPE >gi|316922678|gb|ADCP01000097.1| GENE 5 4313 - 5101 886 262 aa, chain - ## HITS:1 COG:BMEI0009 KEGG:ns NR:ns ## COG: BMEI0009 COG1192 # Protein_GI_number: 17986293 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Brucella melitensis # 1 253 17 272 278 256 51.0 4e-68 MARIIAIANQKGGVGKTTTAINLAASLAIMEKRVLLVDCDPQANTTSGLGFGQDVLPANL YTSFFQPEKVNEAILSTVSPYLFLLPSGTDLAAAELELVDKMGREFYLSELLEPLEKRFD FIILDCPPSLGLLTLNALCAAKEILVPLQCEFFALEGIVKLLQTYEQVKKRLNPQLGLLG VVLTMFDGRNRLTRQVQDEVNRCFPDQIFKTVIPRTVRLSEAPSFGKSILHYDIKSKGSE AYLSLAKEIVMRKPGSRTAKAE >gi|316922678|gb|ADCP01000097.1| GENE 6 5604 - 7241 1215 545 aa, chain - ## HITS:1 COG:alr0522 KEGG:ns NR:ns ## COG: alr0522 COG0564 # Protein_GI_number: 17228018 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Nostoc sp. 
PCC 7120 # 1 313 1 306 312 224 41.0 4e-58 MTRTYAFTVDGGRVRLDRFLGEALREQDVSREKVKRAIRDGGCLVDGAVCTDVSAKVGAG QSIELRMEAEPTSVQPEEGELEILYQDAWLAVLNKPAGMTVHPAPSCPDGTLVHRLVARF PSLRAQEGFRPGIVHRLDKDTSGLICVALTEEARMRLSEAFAAREIRKEYLALVQGVPAP AGTVDAPLGRHPTVKVKVAVAKNGKEARSEWRVLHKGQGYSLLEVRIFTGRTHQIRVHMA HIGHPLWGDKLYGRGGQVLTPPVPGVLLKHDPAPRQMLHAWHLSFIHPFTGEKMAFTCPP PEDFVQTALTLERHMRRVVVTGVAGSGKSLFMRMLAEEGVPTWSADAAVIRLYEPGREAW QALRLRYGERFIPDDRSPVDRKALAAALLPSAESGVDVHELENLLHPLVLDDLERFWSGQ EEAGRGYAVAEVPLWFESGWSRKLCGERRPYVVGISCEQGERCRRLLEVRGWSDTLMARM DGLQWTQERKLAGCDQVIANSGTEGALRDKARTFVREMEHWDAESRAAFLGQWEHLVNEP CPEAG >gi|316922678|gb|ADCP01000097.1| GENE 7 7448 - 8377 1040 309 aa, chain - ## HITS:1 COG:alr3071 KEGG:ns NR:ns ## COG: alr3071 COG0463 # Protein_GI_number: 17230563 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. PCC 7120 # 1 204 1 203 318 120 35.0 5e-27 MPKVTVIMNCYNSAEHLREAMDSVFRQSWPDWEIVFWDNCSTDDSAAIAQSYGEKVRYFL AEKTVPLGAARNLAIAKAEGSLLAFLDCDDEWLSTKLEKQVMLFNANPRLGLVSTDTEMF SEKGTLSRMFESTHPARGMVFRELMTRQWISMSSAMIRKSALDSLGEWFDESLNVCEEAD VFYRIAKTWELDHVDEPLTRWRVHGVNTTFRKFGQFADETLRILDKHRKLYPGYDRDYPD LVGILTRRAAFQKAVTLWREGNGAEARRLVAPYVASPKLRLFWLASWLPGSLFDPLSQLY FKLPGFLRK >gi|316922678|gb|ADCP01000097.1| GENE 8 8844 - 9974 1280 376 aa, chain - ## HITS:1 COG:BS_spsC KEGG:ns NR:ns ## COG: BS_spsC COG0399 # Protein_GI_number: 16080840 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Bacillus subtilis # 6 360 12 370 389 210 35.0 4e-54 MSMRLSRSIVGEAEAEAVHRVICEDGYLGIGKEVQQFEADVAAYLGVPASWVISVNTGTA ALHLAVEAVLGHGSGAEVLVPSLTFVASFQAIGGAGAVPVACDARLDTATIDLADAERRL TPRTKAIMPVHYASNPVDLDGVYAFAEKHGLRVIEDAAHAFGCLYKGRKIGSFGDVACFS FDGIKNITCGEGGCIVTSDQAVAEAARDGRLLSVEKDTEKRFSGQRSWEFDVERQGWRYH MSNVMAAIGRVQLTRLDGEFAPKRRELAELYRERLSGVPGVALLTAEPEAWIVPHIFPVR 
ILNGRQHDVIGALDAKGIPTGQHYKPNHLLSLYGGGKPSLPATEQLHGELLTLPLHPGLS REDVEAVCAGIIEAVK >gi|316922678|gb|ADCP01000097.1| GENE 9 10082 - 10528 472 148 aa, chain - ## HITS:1 COG:mll5319 KEGG:ns NR:ns ## COG: mll5319 COG1898 # Protein_GI_number: 13474438 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Mesorhizobium loti # 40 148 66 175 176 66 39.0 2e-11 MESVINDVELLDLAVIPTDGGPVMHMMRPASPLFGEIGEVYFSEVEPGCVKAWKCHTRQT QRFAVPVGQLKIVLYDDRPESPTRGRIMEVLLGRPDNYALLQIPPRVWYGFTAAGGVPAV ICNCPDIPHDPAEGLRKDFDSRDIPYHW >gi|316922678|gb|ADCP01000097.1| GENE 10 10512 - 11672 1433 386 aa, chain - ## HITS:1 COG:STM2091 KEGG:ns NR:ns ## COG: STM2091 COG0451 # Protein_GI_number: 16765421 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Salmonella typhimurium LT2 # 4 368 5 353 359 317 44.0 2e-86 MFGHIYEGRRVFVTGHTGFKGSWLTAWLLQMGAEVAGFSDVVPTSPSHFDVIGLGGRIRD YRGDVRDRKALVEAMREFKPEIVFHLAAQALVRASYDNPADTIEVNAMGTLNVLEAVRVT PSVEAVVSITSDKAYRNDEWVWGYRETDHLGGYDPYSASKGCAEIIAHSYFKSFFGDPVN APYCATTRAGNVIGGGDWAADRIVPDCARAWSEGREVEIRSPHATRPWQHVLEPLSGYLW LGAKLFAAHRDAVKAGKGDGCHEGFRLDGEAFNFGPSSDASHTVADVVQSLESQWSGFGS KMDLAGQAGMKECTLLKLCCDKALACLNWRATLDFDTTMRFTAKWYEVFYREQGADIYAF TQSQIQDYCRKAAALGLPWSSDGIRY >gi|316922678|gb|ADCP01000097.1| GENE 11 12022 - 12792 1014 256 aa, chain - ## HITS:1 COG:SMb21416 KEGG:ns NR:ns ## COG: SMb21416 COG1208 # Protein_GI_number: 16264991 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Sinorhizobium meliloti # 1 254 1 251 255 267 50.0 1e-71 MKVIILCGGKGTRLREETEFKPKPMVEIGGRPVLWHIMSMYARYGHKDFVLPLGYKGEVI 
KQYFHDYRMRNADFTVDLATGSTTYHPTAQVTDWRVTMSDTGPETLKGARIKRIAQHIDT DRFMVTYGDGVSDINISELVDFHIKSGKMATFTGVRMPSRFGSVKTDENGNILSWQEKPV LDEYINCGFFVFKREFLDYLSEDESCDLEKEPLERLAAEGQLSMYRHTGFWQCMDTLRDA QALNSLWDKGNAPWCK >gi|316922678|gb|ADCP01000097.1| GENE 12 12974 - 13399 686 141 aa, chain + ## HITS:1 COG:MA0735 KEGG:ns NR:ns ## COG: MA0735 COG2050 # Protein_GI_number: 20089620 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Methanosarcina acetivorans str.C2A # 5 124 12 128 146 73 38.0 1e-13 MSPSVQAIRRFLEAHDRFAAHNGITVEDADYDYGKVSLAFEDHAKNSLDMLHGGALFTLA DMAFGLASNFGQEDGTMISTNATISYMKSAKTGPIVAEARLANGGRHLATYEVKIYDGQG TYIACAMISGYRLDTKLAIEE >gi|316922678|gb|ADCP01000097.1| GENE 13 13712 - 15208 1277 498 aa, chain - ## HITS:1 COG:no KEGG:LI0827 NR:ns ## KEGG: LI0827 # Name: not_defined # Def: lipoprotein # Organism: L.intracellularis # Pathway: not_defined # 129 496 294 662 671 384 48.0 1e-105 MVLACLLLLAGCGRTVFVSGGGYPSSQPGHAVSGQPLSEQADAAWRAGNYAEAERLYNVV ARDAALPTEQRALAWERVARAALSSGNMQGAQNALKFWKDLVPGAENSSAWLDVQSKLSA AGGAGQPSFSSGCVALALPLSGAYGPFGNKIAAGATAAQGELSKSGMMMDVRMIDTESSD WLDKLSQLPPQCVMVGGPLRADRYTAMKSRSIQQSRAVFAFLPSLEGSDEGTVAWRFFAS PQDQIDAVLNFTRNLGISSYGVLAPTDTYGQRMTDLFLKAVRTNGSTAKIATYPSGDTTS WGEVMRGFVGGTMRGKTPVPTSTFQAAFIPDSWKNLELLVPFLFYQGEDRLVLMGTSLWE QGLSNRSSVNVANLDLAIFPGAWNPASPSPAAGALVRAMAESGKGTPDFWEGIGYDFVRM ASVMNLQTPWTPAQVNQRLASAQNMEWSMAPMSWSNGKAAQALFVFRPVQSGFELAEPGA FKTRYNEIIARHARRVGR >gi|316922678|gb|ADCP01000097.1| GENE 14 15540 - 16598 1415 352 aa, chain + ## HITS:1 COG:MJ1065 KEGG:ns NR:ns ## COG: MJ1065 COG2089 # Protein_GI_number: 15669254 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sialic acid synthase # Organism: Methanococcus jannaschii # 21 351 18 334 337 209 38.0 6e-54 MNLIDLISKHPVPVTRFQRPYVIAEAGVNHEGSMDIARRLIDEAALGGADAIKFQTYKAG TLASKHSPAYWDTSKEPTPSQYELFKKHDSFWKGEFENLKKHCEQAGIAFMSTPFDVESA 
KFLNDMMDVFKISSSDLTNRPFIEFMCDFKKPIIMSTGAASLAEIAEAVSWIEAKGNPLA LLHCVLNYPTMDENAALGMIPALVRHFPQHVIGYSDHTLPNDMKVLEVAALLGAQILEKH FSHDKTLPGNDHYHAMDYKDLQLFRQNFERTLSMIGEMRIEALASEEPARQNARRSLVAN RNIPAGKIIDKDDLTWKRPAGGISPRHYYEVLGMAARADIEEDTVLQWAHLE >gi|316922678|gb|ADCP01000097.1| GENE 15 16950 - 17426 578 158 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKMLIAAVCLMLLSPAHAGASLFGSDGPKDTDEITPLQYASLLGLRDIDPVMFRTKLMP LIDEAMKDGKITVGECKAIEKAAGNVAPAFYAAAKAPRLQDSISETMDKAKKDGKELGSK LEDTLNNQLPQLFDDAMNLFRDQMRQYNKEKPEAPTNL >gi|316922678|gb|ADCP01000097.1| GENE 16 17457 - 19151 1926 564 aa, chain + ## HITS:1 COG:no KEGG:DVU0348 NR:ns ## KEGG: DVU0348 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 9 562 21 572 574 419 44.0 1e-115 MTTQSITTLILGTAPENATPATHILAGPWCTAPASAEETGFDMPPEPLADPERLETAARQ ARQLVTQLAPRLRDYLNGLYGTQHDDRYWDFVLAPWLVRAVETLIDRLYRAAGILETFGG RPLSVPVLDASEFAFADTQDFIMGGILNPQWNHWLFSRLLKRDWPAAWTAEELPPVRREA APTRRENLLKHVARNILFSLPFPRVKGFSLRQSFDLSVVLTLNRDRRDDTIPLRLLGDPS APPLPLGFTEDEAFDLAVALLPESLRKASIPQTLGHEQARIRSRVVSVAACEDDAYRIKM ALHRGRGCRMIFIQHGGEYGYVRTSVAYPLVEYCQHRFITWGWKRHGLMPGNFLPLPHPQ LAALRDAHREKSPTLLLVGTEMALLPYTLKSMLRGRQQFAYREDKARFMAALPEALRKQS QYRPYFDVPSSLEDAPWLLRRFPEVTRCTGPLEPRLLGCRLLVLDHAGTTLAQALAANIP FLLYWNPETASFTPEALPLLDKLREAGILHDTPESAAAKAAEVWGDVPGWWASVRAACKP WASQHALTNKDDVPHVWKLVLKDV Prediction of potential genes in microbial genomes Time: Fri May 13 03:49:53 2011 Seq name: gi|316922665|gb|ADCP01000098.1| Bilophila wadsworthia 3_1_6 cont1.98, whole genome shotgun sequence Length of sequence - 15758 bp Number of predicted genes - 13, with homology - 10 Number of transcription units - 7, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 115 - 151 6.5 1 1 Tu 1 . - CDS 310 - 933 734 ## COG0681 Signal peptidase I 2 2 Op 1 2/0.000 - CDS 1089 - 1592 711 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase 3 2 Op 2 . 
- CDS 1594 - 2865 1599 ## COG0151 Phosphoribosylamine-glycine ligase - Prom 2928 - 2987 2.1 + Prom 2892 - 2951 2.2 4 3 Tu 1 . + CDS 2995 - 4293 1570 ## COG1541 Coenzyme F390 synthetase + Term 4442 - 4475 5.4 - Term 4427 - 4466 8.4 5 4 Op 1 . - CDS 4685 - 5608 830 ## COG1566 Multidrug resistance efflux pump 6 4 Op 2 . - CDS 5605 - 5808 227 ## - Prom 5840 - 5899 5.3 7 4 Op 3 . - CDS 6021 - 8147 2023 ## COG1289 Predicted membrane protein - Prom 8304 - 8363 78.0 + TRNA 8287 - 8362 95.6 # Val TAC 0 0 + Prom 9019 - 9078 1.7 8 5 Op 1 . + CDS 9144 - 10481 1869 ## COG0477 Permeases of the major facilitator superfamily 9 5 Op 2 . + CDS 10673 - 12124 2198 ## Ddes_2093 hypothetical protein + Term 12229 - 12275 11.0 10 6 Op 1 . + CDS 12468 - 12752 258 ## 11 6 Op 2 . + CDS 12783 - 13646 1060 ## Ddes_1310 17 kDa surface antigen + Prom 13723 - 13782 2.1 12 7 Op 1 . + CDS 13832 - 15577 1578 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain 13 7 Op 2 . + CDS 15605 - 15758 101 ## Predicted protein(s) >gi|316922665|gb|ADCP01000098.1| GENE 1 310 - 933 734 207 aa, chain - ## HITS:1 COG:RSc1061 KEGG:ns NR:ns ## COG: RSc1061 COG0681 # Protein_GI_number: 17545780 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Ralstonia solanacearum # 18 188 75 292 305 144 38.0 8e-35 MNEQLSLIQRFRKTVIGEYAEALIWAVALALVLTTFVVQAFKIPSGSMLETLQIGDHLLV NKFLYGLRNPFNDDYLIRGVEPKVGDIIVFRYPKDRSLDYIKRIVGVPGDTLEMRNKVLY RNGVEVQEPYTQHSQPLIMIPGRDNWGPITVPADKFFALGDNRDDSADSRFWGFLDRNDI RGKAWRIYWSADGLSNIRLNRIGRAVE >gi|316922665|gb|ADCP01000098.1| GENE 2 1089 - 1592 711 167 aa, chain - ## HITS:1 COG:TM0446 KEGG:ns NR:ns ## COG: TM0446 COG0041 # Protein_GI_number: 15643212 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Thermotoga maritima # 5 163 3 159 171 130 45.0 2e-30 MEKAKVAVFIGSASDEATVAPCADILKKLGIPFRFTVTSAHRTPERTAELVDSLEAAGCE 
VFICAAGMAAHLAGAVAARTMKPVIGIPVAASPLGGMDALLATVQMPPGFPVATVALDKA GARNAAWLAAQILALHDPELAHKIRISRERFKSDVAVAGAEIEAKYC >gi|316922665|gb|ADCP01000098.1| GENE 3 1594 - 2865 1599 423 aa, chain - ## HITS:1 COG:VC0275 KEGG:ns NR:ns ## COG: VC0275 COG0151 # Protein_GI_number: 15640304 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Vibrio cholerae # 1 422 1 425 429 417 53.0 1e-116 MRVLIIGSGGREHALSWKVARSGKAEKIYCAPGNGGTAQEGENIPLKDSDVAGLVSFAKG NVIDFVIPGPELPLTLGVVDAMTQAGIRCFGPSQWCAQLEGSKAFAKDIMQKAGVPTAAC GVFTELEAAKSFVEEKGAPIVIKADGLAAGKGVVVAQTKEEALEALDQIMGGVFGAAGSR VVIEEFLVGEEVSLLAFCDGKTALPLPSAQDHKAVFDGDTGPNTGGMGAYSPAPIVPDAE LEKMADIAIRPILAEMARQGHPFTGILYAGLMMTKDGPKVLEYNVRFGDPECEPLLMRLE SDLLEIMDACIDGRLDEVKLSIRPESALGVVIAAEGYPGSYPKGMEIKGIDAVDALPDTK VFQAGTKREGSRTLSSGGRVLCVTALGKGLAEAQKRAYEAVAKIDMEKSQHRSDIGAKGL RRL >gi|316922665|gb|ADCP01000098.1| GENE 4 2995 - 4293 1570 432 aa, chain + ## HITS:1 COG:MA1725 KEGG:ns NR:ns ## COG: MA1725 COG1541 # Protein_GI_number: 20090577 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanosarcina acetivorans str.C2A # 19 429 24 433 435 402 50.0 1e-112 MSPAKPRFLPHLPADFLAEHQLAGIRWSVQQAMKSPQYRAKLAAAGIACGEDIKTLDDVR RLPFSDVADLRDGYPLPLLCVEPSEVVRVHASSGTTGKRKILAYTQKDVDTFALQMARCY ELAGLTTEDRMQIAVGYGLWTAGAGFQLGSERFGALTIPVGPGNIDMQLQLLVDLQTTCI GCTSSMALLLAEEVERHNLRDKIALKKIIFGSEPHSLKMRQSFTEKLGLEASYDIAGMTE MYGPGTGLNCEVGEGIHYWADLFYIEIIDPHTLQPVPEGEVGEMVVTSLSKEAVPLIRYR THDLSRLIPEPCPCGRAIPRHGHIRGRSDDMIIFRGVNIYPGQIADVLNLYPDVGGEYHI ELTRSEGLDYMRLRVERQSGVGSGNDAGIARAIGDEMHKRIMARISVEIVDPGALPRTFS KAKRITDLRLND >gi|316922665|gb|ADCP01000098.1| GENE 5 4685 - 5608 830 307 aa, chain - ## HITS:1 COG:STM1442 KEGG:ns NR:ns ## COG: STM1442 COG1566 # Protein_GI_number: 16764790 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Salmonella typhimurium LT2 # 1 288 13 295 298 192 39.0 5e-49 
MSINSLLRWFLTLLIIVIACVFCYIRYNAYFRNPWTRDGMVQAEVVQVAARVSGEVKTVH VRDNQFVKVGDPLFDLDDRPSRVALSQAEANVRQKQAVAKAARDRAGRDARLQQNAPGAI SPEAFQQAEDQLLSAEADVAVAQAQLDQARLNLDFTHVTAPVNGYITNLTVQPGTMSTAY RPLVALINADSFRVDAFFRETQIRDFRPGDRALITLMTYSDTPLEATVESIGWGIARENG STGEQLLPSVSPTFEWIRLAQRIPVRLHFDALPKGIELRAGTTASVLVRVHPEDTAASIP PLPTIGQ >gi|316922665|gb|ADCP01000098.1| GENE 6 5605 - 5808 227 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPHELALVGIYFSPLLPIVLFGILGALATAFVLNCTGLSGWFANPPWVFMALIVIYVCLL LPFGMVL >gi|316922665|gb|ADCP01000098.1| GENE 7 6021 - 8147 2023 708 aa, chain - ## HITS:1 COG:PA2431 KEGG:ns NR:ns ## COG: PA2431 COG1289 # Protein_GI_number: 15597627 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pseudomonas aeruginosa # 15 172 25 184 724 86 33.0 2e-16 MTAELSSRIHTAAVTALAFVCTYVLAFYWNLDRPFWAGFTVFVISLPSIGQTLQKGVLRM FGTIAGAVAALVLLGLFAQERMLLLSVLSLYLCLMLYLMLTSVYYGYAFFISCIVTLIIC LMAVHEPQDAFHLSVYRVEETLLGIGVYTVVTLVFSPRTSIKSLYRGVQDLMAGHKALFV MNEGAGAEGQMSRMYTQYVGMREILDKVGQLVPAVQLETYQVYRYREHWERAVRCSAELL ELQRRWMGTLVAMKDLDMASLFPHFESRVAELGKLFDRLDALGKEGSPGGSSKPEDVQPL AFDESVFEKLGSTRKGLAFSAVKLFEEQTAHCRELLMLAGFLLHGGPMPELKPSASKKAL SVIKPEQYAYMSQMFVIFWVATLCWILLNPPGLESIAFLELTIVLGLIGIMTGEDKPLRQ VLIFALGIGCTGLAYTFIYPLVSNLGEFIVMMGVSGFFITFAFPRKEQNFTKQAFMLPWL SIGNFTNLPTYDFGQLLVGSFTLLFGISIVSLVHYALFMPNTEALFLQRQKAFFRSSEAL LLHLRACREQRRGSLFRLRMLFRLHRTRFLAEEIGILAKKLPVTFVKPELVRVFSTEVQD MVSSMQGLYARLRSVDSDLLCPSQPGTARVSLVEAPLKDRLESMEAVVRSMIDGLRESAR AEKGPDAPAGGQVVQDLLVSCAGIMKSLSNTLDLMRSLPVEVDSRNKF >gi|316922665|gb|ADCP01000098.1| GENE 8 9144 - 10481 1869 445 aa, chain + ## HITS:1 COG:BS_yyaJ KEGG:ns NR:ns ## COG: BS_yyaJ COG0477 # Protein_GI_number: 16081136 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 12 441 18 451 451 185 27.0 2e-46 
MSQSTVKKIPNNYFDGMPVTGFHKIVFFIIMMAYFCEQMDNWNFGFIAPALMHTWGLQMS DIGVVTLWYFIGMTAGGFFGGVISDFIGRRKTFLISIATFSFFSVLNGFTDNFHVFVIAR SMTGFGVFCLMVCSQAYIAEMAPAESRGKWQNLVAAVGFSAVPVIGVMCRAIIPLAEEAW RFIFYFGGVGLIAFFVGLRYLKESPRWLVARGRLEEAEQVVFEITGKHVDLSEAAKNVQT KVPVMEVLTGMFRRQYIKRTLVLLAIVVFTNPATFTVTNWTATLLKAHGFSLEDSLMATT IISVGVPAGLFFSSFVTDLGGRKIPIMCMLIILAVLGPIFGNMTGYWPVVLTGVVLTAVV MAMGFTVFSYTAESYPTRMRNTATGFHSSTGRLAVAAAQPLIPMAYAAYSFDGVFWIFSF LCFVPAVVIAIWGQRTGGKSLEEIA >gi|316922665|gb|ADCP01000098.1| GENE 9 10673 - 12124 2198 483 aa, chain + ## HITS:1 COG:no KEGG:Ddes_2093 NR:ns ## KEGG: Ddes_2093 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 480 1 480 483 818 81.0 0 MGHPTIYPTGVTVYNPEKCWNGFTIFQALEVGAVLMNMNGRENKVWKGVHGFPNKIFPGG YLMTSRGSRDGRYGVQDGLDLVQIDWDGNVVWKFDRNEYIEDPGIPGRWMARSHHDYQRE GSTTGYYAPGMEPKTDSGNTLVLAHRNARNPKISDKQLLDDVILEVDWDGDIVWEWHCNE HFDEMGFREGPKNTLARDPNYRQTQPEGMGDWMHINSMSVLGPNKWHDAGDERFHPDNII VDGREANIIFIISKATGKIVWKLGPDYDNSPEAKAIGWIIGQHHCHMVPRGLPGEGNILI FDNGGWGGYDVPNPGSPTGVKAALRDHSRVLEIDPVAMKIVWQYTPTEAGFLAPMDCNRF YSPFISGMQRLPNGNTLITEGSDGRVFEVTKDHELVWEFISPYWGQKLPMNMVYRAYRVP YEWVPQLGKQEETPIERIDVNAFRMPGAAALGDRDSEIAIEGCAPYEGDNALCVASVDDP EDQ >gi|316922665|gb|ADCP01000098.1| GENE 10 12468 - 12752 258 94 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPGPARKGFPVHDFQSAESWLRKALRNAPKPLPSGVFPKLLDEAEQAGFSHSTLNDVVDE WLNFGYCRVTDHVSNDIALTPEGDEYFGHRTIDE >gi|316922665|gb|ADCP01000098.1| GENE 11 12783 - 13646 1060 287 aa, chain + ## HITS:1 COG:no KEGG:Ddes_1310 NR:ns ## KEGG: Ddes_1310 # Name: not_defined # Def: 17 kDa surface antigen # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 16 273 9 248 263 111 40.0 3e-23 METGMKGSFTTRKGALYCIVMTCLCSLLATGCASKYGAQTTDVHYYPDCYQPIADLRSAE KSFNTTMVMGTTMGALLGAVIGASQTGKAEGALAGAAIGAGAGAGASYLVAKYNNERDDR VRLASYARDLNADVNSLNRVTAAGQVAYNCYSAKFRAALEDYKAKRITRAELDQRYAEIK SGLAEASAILGSTLSEADKREAEYRQVLTIEAKKASRPIPPVQTVATTSKKTSKATKTVT RKQTGDQLQNLSNGVGSYKESTAQLTAVKSDLDEIQLQMDKVMLAQG 
>gi|316922665|gb|ADCP01000098.1| GENE 12 13832 - 15577 1578 581 aa, chain + ## HITS:1 COG:mll9173 KEGG:ns NR:ns ## COG: mll9173 COG0265 # Protein_GI_number: 13488111 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Mesorhizobium loti # 373 578 321 531 552 106 38.0 1e-22 MGNPLHFIQASDTSAFAALHTEQGALPSQYEALQAYLKKIGLGRFGTLWAEPVTGQATPA GYASISWYTPLEGVGTPLSRLDAGEAEQARAALRRAREAVTPHLDDTPGGVLLRKALFIP SADNVFVVNGHPVLVNWGLLPQGTDEATAPILWGEPENGETPSGVTGAATAATGASAAAV AGTAARNAVPGAGRAAGPSAGTRTAGATGTAAGTAAGIAGATAGNVIIRRSKAGLFGALL LGILLGVLLLFLLRSCVKTAQEQPVAPETPPVAAESPEEAARKAEWANLLREKAFWQGLL ELTPCELKAFFDGNAGKTDAKPDAGETIPGKDVPENQPGEAVPGISLPPLPSANATVPAK AVDLSTLTIPQRLEQGTVLILAPVPGGAQQGTGFFINPDHVLTNRHVVANAIDDMVMVTN ANLRGARRGMVVARSDTPGYDFAVIKVMLVEGDQVPALPLSPTVNRTDKVSAWGFPALVS EQDPAYLRLLRGDFNAVPEVVFTDGVVNAILQTSPRNIVHSAVVSQGNSGGPLVNEKGDV IGINTFIRLDADSNRQTQTALGSDAIMGFLREKGIPFTEHQ >gi|316922665|gb|ADCP01000098.1| GENE 13 15605 - 15758 101 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGYFILSTRKGDYRALAFQGSQVYACHAQLSGLLRTHLGEAHRPSCWPDA Prediction of potential genes in microbial genomes Time: Fri May 13 03:50:31 2011 Seq name: gi|316922662|gb|ADCP01000099.1| Bilophila wadsworthia 3_1_6 cont1.99, whole genome shotgun sequence Length of sequence - 4806 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1311 1160 ## Ddes_1312 hypothetical protein + Term 1545 - 1586 0.6 + Prom 1485 - 1544 2.2 2 2 Tu 1 . 
+ CDS 1728 - 4757 3645 ## COG4457 Uncharacterized protein conserved in bacteria, putative virulence factor Predicted protein(s) >gi|316922662|gb|ADCP01000099.1| GENE 1 1 - 1311 1160 436 aa, chain + ## HITS:1 COG:no KEGG:Ddes_1312 NR:ns ## KEGG: Ddes_1312 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 191 435 202 435 436 83 32.0 1e-14 MDPQGTAVDWYTSGPVQPLAELPAETQEAVKTRLQGLLSDIEELAASLQTDSDPYKSLCG TMLHLATRFPTQECLYASLPSGEPASPPQPVLVCWGMALSSSTAQEGIYAMGATPFSPEP QTPLASGASGDPDRIVREGEVTERPIVAPLPPTVVVEGEPWWKKLLMLLLGLLLALLLFW GLHFLFPWVPPLPLPAGCARTPAVETPQPAVQPADPGPSEADLTALRAELEVLRQAAQKR AEACQPPAPPAVEEPKVETPAEPPKDLGDLLGDLTTIPKEEPKPEPKPEPKKEQPAPKKE EPKPEPPKEPPKKGDPMKLPDKDDKSMDFLDGCWNCRTGLSDTRGNPIKVRFCFGKNGKG QITIIDRQGKTYTGRADAQMQNGRLHIDTGDATSRNSRGSFNGLRIDCSPGAGNDAMCYG RNKGDNSPWSAKFFRD >gi|316922662|gb|ADCP01000099.1| GENE 2 1728 - 4757 3645 1009 aa, chain + ## HITS:1 COG:mll9171 KEGG:ns NR:ns ## COG: mll9171 COG4457 # Protein_GI_number: 13488109 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria, putative virulence factor # Organism: Mesorhizobium loti # 2 1008 7 1040 1041 554 33.0 1e-157 MNYTSAVSIIPGGCPQFLDCRLDAKSLGRFWAYFREIPQTGESAIASARRQVELVPVPLD DTGQPGDYDYKIDANRALEAFTGRWVPVPFLRLSNEQWKDGAFKCEKGPSNWARLHVSRE DSDGAYRLTFLFDTTIEEREQPTGQYFALCDDDVAENARFALSPKSRDNAWFLNTLWVDE WIAEIYDAHQTARHNGRTTWRENTPFIMEHLATYLTLLEALAASGTVPTVRVVDPAHLTP VDVDLVLDLGNSRSTGMLVETLPQRQTNLNDSYLLQIRDLSQPDRTYTGPFATRIEFAEA TFGNPRLSARSGRSTPAFVWPSVVRVGPEAARLAQHSVGAEGNTGMSSPKRYLWDESPRA QQWRFNAWGTGSELEPPVTRGVFIQQINREGTPLCCFDDPGLSPRPKILRTQQPEVAFEA HFTRSSTMMFLLSEIIMHALATINSPAQRGEREQPDVPRRLKRIIFTVPSAMPIAEQRIY RRWVTWAVRMIWETLGWGQWYTTKQGMRDTRPDYRVSPEVRCSWDEATCTQLVYLYNEIT EKFQGDARHFFALMGRKRGADAPSLRIANVDIGGGTIDLSITTFAVTGDEATAARIKPHM AFRDGFNIAGDDVIREIVEQHVLPCIGQATGLSDPRNLLGQLFGRDTVGGSQRNRALRTQ FARQIAGPVVTRMLEGYEQADLLVGGVQERKLSAFFRPEHAPQESDPASPETEGLPEQPS AALIQYVNETVERQTGKPFSLMDVALRIDPRAIDRTIRNTLGQILANLCEVIHAYNCDLL 
LLTGRPSKWHAIISSFFAKLPVPADRIIPMRDFRVGSWYPFADNRGEITDPKTTVVVGAI LCALSEGHLEGFSFDTGSLFLKSTARFIGAMDAGGQIRQAQVWFETDTDNPSGGELHKAI QFSGPIPIGFRQIEAERWTTTRFYMMDFASPAARNNARNRLPYTVKLAFTVADLADAPNA ASRDEGELAVNEIEAVDGTPVNPRDLEIRLQTLPADEGYWLDTGVFNIL Prediction of potential genes in microbial genomes Time: Fri May 13 03:50:45 2011 Seq name: gi|316922658|gb|ADCP01000100.1| Bilophila wadsworthia 3_1_6 cont1.100, whole genome shotgun sequence Length of sequence - 6482 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 426 - 3080 3601 ## COG4458 Uncharacterized protein conserved in bacteria, putative virulence factor 2 1 Op 2 . + CDS 3082 - 4131 1264 ## gi|302862711|gb|EFL85643.1| putative LigA + Term 4284 - 4321 1.0 3 2 Tu 1 . + CDS 4411 - 6390 2514 ## Ddes_1316 PpkA-related protein + Term 6425 - 6452 0.1 Predicted protein(s) >gi|316922658|gb|ADCP01000100.1| GENE 1 426 - 3080 3601 884 aa, chain + ## HITS:1 COG:mll9170 KEGG:ns NR:ns ## COG: mll9170 COG4458 # Protein_GI_number: 13488108 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria, putative virulence factor # Organism: Mesorhizobium loti # 6 879 3 894 901 479 36.0 1e-134 MTIDDQKLVEQCEAGIKTLAGVDEWVRANPQATGSSGEALLRESNRLARALRKEAATAAR KMCVGVFGPSQSGKSYLISALAQDADGSLLTALGDESADFIQDINPAGGKESTGLVTRFT LTPSGAPAMFPVKLRLLSELDIVKILTNTYYADCRHLTPPDEDALAARVDALAKKAKGEP WRASFSEDDMIDLKEYVTRNFRATAVVQRLEHLYWPKAIHAAGRLAPEDRAALFEILWDE AKPFTALYLRLSGVLDALGYPDEAFCGKEALLPRETSIIDVETLRGLGETEGADSLELVT KDGRRVAAGRSEIAALTAELAITMRHKPDDFFEHTDLLDFPGYRSRLKTDDVARELAKPD QIRQFFLRGKVAYLFERYKTDLELTSMLLCIGPSNQEVQDLPAVINDWVSDAAGKTPELR QGRHTTLFLVLTKFDMEFEKKKGAVDDETRWSNRLHASLLDFFGKQHDWPEQWTPDQPFN NTYWLRNPKFRWEAVIAFDGDRETGIRPEQEAYVSDMKAKFLNTPEVRRHFADPEAAWNA AFTLNDGGVAFLRRQLRPVCDPAIKRRQVADRVADNLRPFVEHLRRFHRIDDKAALREQQ RQLGMRLARSLALTAQNQRFGELLRAMLMRDHELYALYYQVENRLMRENEQAPVPQPSVG 
SAASAQDIMDDLFGDMAPPVPASAETPEAAPQPLDEAGAFAELVVGAWIEQLNALAADPV RQRYFGLEAEDFGQLVHEIIQGMSRLGLEKDMAQAVRDVSGYRNIRRDKLIWRQASMAAY CISAFVDWLGFNPASVPEAKRTIQLAGRTRTLFGQQPDPEGYPALAAEPLAFDRQYYTDW LAALMHLIEGNADYEGGQTFDPEQNARLGRLINNLNPSVFTQDN >gi|316922658|gb|ADCP01000100.1| GENE 2 3082 - 4131 1264 349 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302862711|gb|EFL85643.1| ## NR: gi|302862711|gb|EFL85643.1| putative LigA [Desulfovibrio sp. 3_1_syn3] # 226 349 184 312 334 67 35.0 9e-10 MTFRCALRFEPDKAKGPGYGLILASCPGFSAAPNTLLYMIKDPSNGMSLGPNGWQPGDAH LVPDAVAPSGEILSLSVGPAVVDSLELQAIYQVVLFTDAEHVWQGGLSINRIVYSRDPSL LPDLELAPKAAPSAGREPLPMPEMETGPAPTSLSFDGKPGSAVPASELLNLKERPKSYKS LILNIIFAILTVILLAGAGYFIYTRVLPKTDTTLEEKPAEVLKHEAADLPPGQEAPVKYA PDDIKGPKGLQVARDALKRKVEPDTAMALAQQLFAAADDVKAQDGAFLIADELAKKGNAQ ALLMLGDFYSPKQPQRGTIQKDTDMANDCYKKALAAGAAEAQQRLDALK >gi|316922658|gb|ADCP01000100.1| GENE 3 4411 - 6390 2514 659 aa, chain + ## HITS:1 COG:no KEGG:Ddes_1316 NR:ns ## KEGG: Ddes_1316 # Name: not_defined # Def: PpkA-related protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 14 659 31 671 671 678 56.0 0 MKRILIAILTALTLLAGWTPAAHAADRTPLLMEGKKTLYQRVITHPGASLYAGQGEGSTV IQKAVRPFSVFYVYTRDGNWLEVGASTTKPDGWIKATDTTAWNQALTLLFTDRSQRQPVL FFKDHKAIMDVCQAEDLPGSLSKLRAEAKAAQQGDAPADLPILAVEPDDAQGAVSRNRFY LMPILKVDTPFEGTKLLEVASIDPGSAGDKPDSNAPKDAKDDGVPRTGIAMVIDTTISMK PYIDQSLNVVRAIFDSVAKDKLDDKVSFGIVAFRNSTKASPKLEYVTKVVSDFRDATKRD ELEKALSGLEEAKASSHSFNEDSLAGVKAALDQLNWQPYANRVLLLITDAGPLPLSDANA STSLDVQELADLAASRNIRLVVAHVRTPAGKGNIDYAAKAYATLSSVPGGKSAYIPIQAT DAAKGSASFAKAATGLSSALVSSVKQALAGKAPVKPQEAPAQSPEERAKQIGEELGYAMQ LEYLGKKQGTRAPEVVTSWIADADLDALSAGKPVSAVSVAVLLTKNQLSDLQRQLKIIID NAERTRKTDSRDFFQGILSASAQLARDPAAFSQKPGANLRDMGVLGEFLDDLPYKSDIML LSEDDWYRMSVGEQTAFINRLKSRVARYEEYDRDVSNWESFGSKDPGDWVYRVPLAMLP Prediction of potential genes in microbial genomes Time: Fri May 13 03:51:17 2011 Seq name: gi|316922654|gb|ADCP01000101.1| Bilophila wadsworthia 3_1_6 cont1.101, whole genome shotgun sequence Length of sequence - 6117 bp Number of 
predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 21 - 80 3.1 1 1 Op 1 36/0.000 + CDS 149 - 868 184 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 2 1 Op 2 . + CDS 871 - 3825 3049 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 3833 - 3873 4.2 + Prom 4044 - 4103 5.7 3 2 Tu 1 . + CDS 4264 - 5721 1861 ## COG0753 Catalase Predicted protein(s) >gi|316922654|gb|ADCP01000101.1| GENE 1 149 - 868 184 239 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 27 216 22 204 245 75 32 1e-13 MNLCELRNVALERKARDYTYRLRIESLAIRAGEKIALIGPSGCGKSTALDLLAMILPPSG AEKFLFNAPGDTPEDVAELWRRGGRERMARLRLRHIGYVLQTGGLLPFLSARDNMTTPGR AKGMDAKAVDAGLKELAGRLGIGHLLSALPDKLSIGERQRVAIARALLPKPELVLADEPT AALDPVTAKNVMRLFTELAGECALVVVSHDHRLVEENGFRCLRLDVDGGDGSINTTLRG >gi|316922654|gb|ADCP01000101.1| GENE 2 871 - 3825 3049 984 aa, chain + ## HITS:1 COG:PA0072 KEGG:ns NR:ns ## COG: PA0072 COG0577 # Protein_GI_number: 15595270 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Pseudomonas aeruginosa # 1 397 1 388 399 214 39.0 7e-55 MRFSIICGMALRDYWHERALSLCAVLALATVLAPLLILFGVRNGVISNLQERLLQDPRNL EIVPVGSGKYGKAFFEELRKRPDVGYVVPQTRAIAATIGLLPASEKKGGLSPNPVNVSLI PSGAGDPLTRRFAGSAALKDDEIILTAPAADKLGARAGTILTGRVTRARGMTREQAETPL RVKAVLPLSALQKEAAFVPPSLAEDAETYRDGMGVPERGWTGTARGDGERLYPSFRLYAG NLDGVASLRDFLGARNIDVYTRAEEIENVKMLDRSTTLVFGLICLAAGVGFLASTASNLV AAIRRKRRVLGIIRLMGFSGADIVCFPLMQTLFTSALGMGLAFALYAAVAGVINALFPLG AELGAVCRLSWAHVGGLPASCSRCPCWPPCRRAFRAAAFHLPRLSAMNKWCFLLIAACLL APDTGHAALKARDDVKAEDAFNPNPAPDDLILPMPCGQSMVLKAVGVRGKGLLWDLETRF GRRDGGSDDRGYYDSPYASAISGPFVLKDLPPDWRQKIKAANPDADAMQFYFEGKYEVSK RQWDAVMGGQCMDGNALPALSPEDARPVVEVSWHEAQEFTRKYTEWLLANAPQSLPGFQG DDRNTAFVRLPTEAEWEYAARGAQKVSPLSLSQEDFFEMPMGDAIKNYAVFRDSEGTSEE 
TLQRIGSRKPNPAGFYDMAGNAAEMVQDGFQFSLGGRLHGSTGGFIRKGGGFLSTQDEVM PGRREEVAPFLKDGPNKARDLGFRIALSGINTPGGARPAELASEWKKAGIELSAELNPSG DPLELVAQLEARAGSDAEKASLKGLQNLIRENNISLERQQRSTVSNEIRSAVYLVTMLKD CLIKREIIGKELQNFEDKKKQITAALPKMKAAEANQYKQVLASLDKGIALSKTGIGNQEA EFRSMLLFYKQTVENTRKVPQSLFDEALKGIGQEMSGKAPHLVKQNASLQIYRKHVAALR GGKLALLSSDALRKDILSLIPKGK >gi|316922654|gb|ADCP01000101.1| GENE 3 4264 - 5721 1861 485 aa, chain + ## HITS:1 COG:HI0928 KEGG:ns NR:ns ## COG: HI0928 COG0753 # Protein_GI_number: 16272865 # Func_class: P Inorganic ion transport and metabolism # Function: Catalase # Organism: Haemophilus influenzae # 6 482 15 491 508 759 75.0 0 MSNDLLTNNAGAPVVDNENALTAGPRGPMLLQDTWFLEKLAHFDREVIPERRMHAKGSGA YGTFTVTHDITRYTKASLFSQIGKKTDLFMRFSTVAGERGAADAERDIRGFAIKFYTDQG NWDLVGNNTPVFFLRDPLKFPDLNHVVKRDPRTNMHSAKNNWDFWTLLPEALHQVTVVMS DRGIPASYRHMHGFGSHTFSFINANNERFWVKFHFRTQQGIKNLTDAEAEAIIAKDRESH QRDLYEHIEKGDFPRWTLFIQIMPEADAEKLPYHPFDLTKVWYHKDYPLIEVGELELNRN PENYFAEVEQAAFNPANVVPGISFSPDKMLQGRLFSYGDAQRYRLGVNHHLIPVNKSRCP FHSYHRDGQMRVDGNYGSYTNYEPNSNGAWKDQPDFSEPPLKLSGDADHWKYPCDDADYY EQPGKLFRVMTPEQQEALFGNTARAMAGVPREIQIRHIYNCSKADPAYGKGVADALGIPM SEVKL Prediction of potential genes in microbial genomes Time: Fri May 13 03:51:24 2011 Seq name: gi|316922642|gb|ADCP01000102.1| Bilophila wadsworthia 3_1_6 cont1.102, whole genome shotgun sequence Length of sequence - 11885 bp Number of predicted genes - 12, with homology - 9 Number of transcription units - 9, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 38 - 439 469 ## COG3439 Uncharacterized conserved protein - Term 509 - 546 8.2 2 2 Tu 1 . - CDS 607 - 2430 1936 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains + Prom 2548 - 2607 3.5 3 3 Tu 1 . + CDS 2714 - 2965 366 ## + Term 2987 - 3049 9.2 - Term 2979 - 3026 15.3 4 4 Tu 1 . - CDS 3144 - 3338 67 ## - Prom 3381 - 3440 2.8 5 5 Tu 1 . 
+ CDS 3198 - 3476 66 ## + Term 3532 - 3587 12.8 - Term 3769 - 3814 8.2 6 6 Op 1 . - CDS 3882 - 4511 947 ## COG0457 FOG: TPR repeat 7 6 Op 2 . - CDS 4486 - 4887 487 ## Ddes_0887 hypothetical protein 8 6 Op 3 . - CDS 4887 - 5576 850 ## DVU3274 hypothetical protein - Term 5638 - 5681 10.0 9 7 Op 1 . - CDS 5693 - 5881 270 ## Dbac_0275 4Fe-4S ferredoxin iron-sulfur binding domain protein 10 7 Op 2 . - CDS 6096 - 7532 1711 ## COG0498 Threonine synthase - Term 7894 - 7920 -1.0 11 8 Tu 1 . - CDS 8035 - 9003 1109 ## Geob_2729 hypothetical protein - Term 9092 - 9144 9.4 12 9 Tu 1 . - CDS 9179 - 11617 3079 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit Predicted protein(s) >gi|316922642|gb|ADCP01000102.1| GENE 1 38 - 439 469 133 aa, chain - ## HITS:1 COG:SSO0995 KEGG:ns NR:ns ## COG: SSO0995 COG3439 # Protein_GI_number: 15897871 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sulfolobus solfataricus # 10 85 5 80 123 63 42.0 1e-10 MHTTEGLVSKPCAGSVETVWNRLLETLKILNVPLFATVDHAANARAAGLFMPETRVAFFG KPAVGTHLMLERPEVALELPLRIVISEVPGEGTMLFMPDILWLAHRYGIDPQLDSLRKTL GFMALVSDKVCGD >gi|316922642|gb|ADCP01000102.1| GENE 2 607 - 2430 1936 607 aa, chain - ## HITS:1 COG:DR0302 KEGG:ns NR:ns ## COG: DR0302 COG0449 # Protein_GI_number: 15805332 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Deinococcus radiodurans # 1 607 37 642 642 531 48.0 1e-150 MCGIIGYAGHRPAVPVIIEGLRRLEYRGYDSAGVGFIQGRELKVIKAEGKLAALEEKLAH YPNTVAMNGIGHTRWATHGVPAERNAHPHIGGNNELAIIHNGIIENFQELKEGLLAKGYV FKSETDTEVLAHLIAEGRKHNPTLLEAFAWALRQAHGAYAVAVICPDEPGTILSARMSAP LILGVGVGEHFIASDIPAFLPYTRDVVFLEDGEIVRITDATWQVLSLETLEPVRKDVQTV QWDMQAARKGGFKHFMLKEIFEQPKVISDCLAGRVDEEKGMVVLPELDGLPVPSRLRIIA CGTSYHAGMWGQHLLENWARMPVVPEIASEFRYRNVILEPDDMVLVISQSGETADTLAAL RLAKEKGCIAIALCNVVGSSIPREADAVIYTQAGPEISVASTKAMCSQMVIMALIALYYG 
QRKGLLDAKTRHEALAALHGLPKQIEDALPAMRETAQMLSPLYASAHSFFYLGRGHAFPL ALEGALKLKEISYIHAEGYASGEMKHGPIALIEPEFPTFALAFNDALFPKVKSNIVEVQA RRGPVIALANPGADLHVEHLWTIPSAWGPFNAFLALPAMQLFSYEMADYLGKDVDQPRNL AKSVTVE >gi|316922642|gb|ADCP01000102.1| GENE 3 2714 - 2965 366 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNTILKTAIFATALLMLGGCATKALTPEEQAFQDAQRSCTEQTNSMIGGSRYSWSDQLPW SNYFEWCMEGKGYTKDQLKSIWY >gi|316922642|gb|ADCP01000102.1| GENE 4 3144 - 3338 67 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLMYGRVLYRLFIGSCHAVIGFSSEGHSTSCSHTKERENQKCNHFFHKRYSLGAAIVTLP LISQ >gi|316922642|gb|ADCP01000102.1| GENE 5 3198 - 3476 66 92 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKMIALLVLAFFGMTATCGMTFAAEANNGMATANEQAVQDPAIHQQKKAPHAVKNQEKK NAQHVNTKQQKKVTKKHDNAASQQKKTSDSQE >gi|316922642|gb|ADCP01000102.1| GENE 6 3882 - 4511 947 209 aa, chain - ## HITS:1 COG:all4382 KEGG:ns NR:ns ## COG: all4382 COG0457 # Protein_GI_number: 17231874 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. 
PCC 7120 # 8 190 61 243 311 75 27.0 6e-14 MKDRYENLDEYIADLRAALSQNDGCANHHYNLGVALLSKGDFVAAEESFLAAVRNSSHLA EAYVQLGGICLQRGDLEGCLRYNEEAANCRAKFPVPWSNIGFVHLQRGEPEKAISALQKA LKWDAEFIQAMATLGAAYYMNGQYDESIKISEKAIEKQPGFAPAWNNLSLAWFEKGDYDK AAEYADKAIEFGFDVRPEYLEEIAAHRNK >gi|316922642|gb|ADCP01000102.1| GENE 7 4486 - 4887 487 133 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0887 NR:ns ## KEGG: Ddes_0887 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 2 125 5 129 130 177 68.0 8e-44 MISATMFRDSTKQELTEEQEQLKREIYERMNPRRRKFIDRIGYDEWDPFQKPNDPMDIRV DPSKRTTQQLFRGFMHSLGERTEGNDFNKGALDCALGIVNRDERYLGIYEFCVWYYKLLK KEGLIDEGPVRES >gi|316922642|gb|ADCP01000102.1| GENE 8 4887 - 5576 850 229 aa, chain - ## HITS:1 COG:no KEGG:DVU3274 NR:ns ## KEGG: DVU3274 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 7 228 4 221 222 287 63.0 2e-76 MINIMPELTPVFTAYEALVAEADAVFARVGEMHPDCVTCKPGCSDCCHAMFDLSLVEAMY LNARFIEKYDFGPERSRILENAGSVDRKLTRMKRDYFRELRDAKGSEDAMEHIMEEAAKA RIRCPLLDSNDQCVLYDYRPITCRLYGIPTAIGGKGHVCGQSNFAAGASYPTVHLDKIQE RLDRLSVEIRDRVQSRFKELHEVFVPVSMALLTNYDDAYLGIGPAPKES >gi|316922642|gb|ADCP01000102.1| GENE 9 5693 - 5881 270 62 aa, chain - ## HITS:1 COG:no KEGG:Dbac_0275 NR:ns ## KEGG: Dbac_0275 # Name: not_defined # Def: 4Fe-4S ferredoxin iron-sulfur binding domain protein # Organism: D.baculatum # Pathway: not_defined # 1 62 1 62 62 70 69.0 2e-11 MSWNVTVDADKCVGCGECVDVCPVEVYEMTDGKSSPAHAEECLGCESCVEVCEHDAINVE EN >gi|316922642|gb|ADCP01000102.1| GENE 10 6096 - 7532 1711 478 aa, chain - ## HITS:1 COG:MJ1465 KEGG:ns NR:ns ## COG: MJ1465 COG0498 # Protein_GI_number: 15669656 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Methanococcus jannaschii # 16 415 5 376 405 214 35.0 3e-55 MSDQFPTYRAQMEYVCLGCGARHPIDELLYTCPSCGGVFLLEDTTFDKLKEHGGAYWRNL FDSRAATRNTALRGVFRFYELMAPVLEQEDILYLGEGNTPVVDGAPELQKKLNASFAFKN DGQNPSASFKDRGMACAYSYLRALVRKHGWDEVLTICASTGDTSAAAALYGAYVGHPIKT 
AVLLPQGKVTPQQLSQPLGSGATVLEVPGVFDDCMKVVEHLAEHYRVALLNSKNSWRILG QESYAYEVAQWFNWDLAGKVLFVPVGNAGNITAVMSGLLKMRRLGIISDLPRLFGVQSEH ADPVWRYYSKPKAERVYNPVTVRPSVAQAAMIGNPVSFPRVKALVDAYEAAGGEFGVVQV TEQAIMDATILANRHGHIVCTQGGECLAGLLKAREEGRVAAGEKAVLDSTAHALKFAGFQ NMYFTDTFPPEYGVAPDASLANRPSLVVDAAEKARLSAEEFTRAAAKAVAERLELAGK >gi|316922642|gb|ADCP01000102.1| GENE 11 8035 - 9003 1109 322 aa, chain - ## HITS:1 COG:no KEGG:Geob_2729 NR:ns ## KEGG: Geob_2729 # Name: not_defined # Def: hypothetical protein # Organism: Geobacter_FRC-32 # Pathway: not_defined # 29 293 30 284 304 127 31.0 4e-28 MKYLGVAGVLLGLSLLGVPALADEAKQPVPRVGQVVIESMGPASGTIPKGKLATIVNYIH ADKHTWYKNGSRHTAKYPFNKGTQRNLHDRFVLKMRYGLGDGYDVRFATPIVMNNFRAHS TLTPNSVSSKNGIGDTTFVLHKQFLSQREGSPVDMAWDLGVWTPTGGTSDDGVGTGAWGA MAGLGVGHAFDGGRQFVEGEVMYLYRGVGHTSRVDVSDVFRLNGRYVYALSQHWDMGVET QFEYNTDSRRYGRSNNDASTTWFAGPSVTLKLPEYNASLGVSAQFSLYQNYDAVARIGNV DYPTAASLGERWKLEAKLAIVF >gi|316922642|gb|ADCP01000102.1| GENE 12 9179 - 11617 3079 812 aa, chain - ## HITS:1 COG:CAC0007 KEGG:ns NR:ns ## COG: CAC0007 COG0188 # Protein_GI_number: 15893305 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Clostridium acetobutylicum # 5 811 10 814 830 818 51.0 0 MSQNVNIEHEMRKSYLEYSLSVIIGRAIPDVRDGLKPVHRRILFAQQELANSYNRPPKKC ARIVGDVIGKYHPHGDSAVYNALVRLAQDFSMRDPLEDGQGNFGSIDGDAAAAMRYTEVR MSRLASEFLNDIDKETVDFRPNYDNTLAEPVVLPTKVPNLLLNGSSGIAVGMATNIPPHN LSELSDALLMLIDAPESSVEDLMDVVQGPDFPTGGYVYAGQGLYDAYTTGRGTVKVRGKV EIEDRKKGFQSLIIREIPFGLNKSSLVEKIAALINDRKIDGVSDLRDESDRRGIRVVIEL KRGTMPEIVINSLYKYTPLETSFGINMLAVVDNRPVLLNLKTALAYFLDHRREVVVRRTK FELRKAEARAHILEGLLKALDHIDEVVSLIRSSATPPEAKERLMQRFELSEVQAQAILDM RLQRLTGLERGKIEEELRDLLSKIAWYQSILGDASVLWGVIRDEVEYIKQTYSTPRRTEV IREALSNIDIEDLLPDDDVVITLSRRGYIKRTSLGIYQQQRRGGKGVAGVHTGEDDFVQE FITTTNHQFLLMFTNKGRMHQLKVHQVPEGSRTAKGVHIANLVPMEKDEWVNTILTVREF AEDKSFLFATRRGMVKRSSAALYARSRKGGLIAVGLREDDELIMVREITDDDFVVLATAD 
GIAIRFSCRDVRNMGRGATGVKGIALRHGDTVVACLVLKEDSPAIMTISALGFGKRTNVD LYRVQSRGGKGIINFKVTPKTGPVIGAKSVLDDEALVLLTSTNKIIRMGVDEIRSVGRAT IGVRLVKLDDGARVVGFDTVNSDADAEGEEAL Prediction of potential genes in microbial genomes Time: Fri May 13 03:52:12 2011 Seq name: gi|316922628|gb|ADCP01000103.1| Bilophila wadsworthia 3_1_6 cont1.103, whole genome shotgun sequence Length of sequence - 18405 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 9, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 49 - 2436 3086 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit - Prom 2677 - 2736 2.5 - Term 2730 - 2787 2.8 2 1 Op 2 16/0.000 - CDS 2793 - 3941 1497 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) 3 1 Op 3 . - CDS 4080 - 5648 1411 ## COG0593 ATPase involved in DNA replication initiation + Prom 5950 - 6009 3.9 4 2 Tu 1 . + CDS 6091 - 7503 1146 ## COG0739 Membrane proteins related to metalloendopeptidases + Term 7696 - 7740 5.1 5 3 Op 1 . + CDS 7883 - 8980 716 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 6 3 Op 2 . + CDS 9047 - 9145 107 ## + Prom 9596 - 9655 7.2 7 4 Tu 1 . + CDS 9793 - 10008 103 ## + Term 10180 - 10215 1.5 - Term 9874 - 9902 1.0 8 5 Tu 1 . - CDS 9941 - 10543 656 ## COG1896 Predicted hydrolases of HD superfamily - Prom 10570 - 10629 2.8 9 6 Op 1 . - CDS 10653 - 11822 596 ## COG2081 Predicted flavoproteins 10 6 Op 2 . - CDS 11819 - 12355 549 ## COG3028 Uncharacterized protein conserved in bacteria - Prom 12446 - 12505 3.2 11 7 Op 1 13/0.000 - CDS 12646 - 13590 1185 ## COG0167 Dihydroorotate dehydrogenase 12 7 Op 2 . - CDS 13593 - 14408 818 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases - Prom 14564 - 14623 3.9 + Prom 14513 - 14572 6.2 13 8 Tu 1 . 
+ CDS 14602 - 15786 1453 ## COG1364 N-acetylglutamate synthase (N-acetylornithine aminotransferase) + Term 15818 - 15859 -1.0 14 9 Op 1 . + CDS 16281 - 17207 698 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 15 9 Op 2 . + CDS 17209 - 18108 1173 ## COG0730 Predicted permeases + Term 18307 - 18338 2.4 Predicted protein(s) >gi|316922628|gb|ADCP01000103.1| GENE 1 49 - 2436 3086 795 aa, chain - ## HITS:1 COG:PA0004 KEGG:ns NR:ns ## COG: PA0004 COG0187 # Protein_GI_number: 15595202 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Pseudomonas aeruginosa # 1 795 1 806 806 715 50.0 0 MTANANAYTADSITILEGLQAVRKRPAMYIGSTDYRGLHHLVYEVVDNSIDEAMAGYCTR IEVVLHMDNSVTVRDNGRGIPVDIHPKEKVPAVQVVMTKLHAGGKFDNSAYKVSGGLHGV GVSCVNALSERLEVTVRRDGKRWKQVYSRGVPQEELQYIGESDSHGTTVRFKPDETIFET TEFSYDTLKKRFEELAYLNKGLQIECRDERTSEVHLFHAEGGIHQFVKDLNSGEVGIHAI VDGEGVADGVSVEFAIQYNAGYKENMYTFANNIRTKEGGTHLAGFKTALTRAINNYIKSQ PDLTKKMKGQALSGDDVREGLTAVLSVKLPQPQFEGQTKTKLGNSEVAGIVGGVVYGKLD TYFQENPKDARLIIDKAVDASRARDAARRAKELVRRKGALSDNALPGKLADCQSKDPADS ELFIVEGDSAGGSAKQGRNPNTQAILPLRGKILNTERTRFDKMLANKEVKALITAMGAGI GEEDTDYDKLRYHKIVIMTDADVDGAHIRTLLLTFFFRNYAGLIDRGYLYIAQPPLYRAH NSRMEKFIKDDYELNAFLMQRVSDDMEVEAPNGTLFRGEEIKALMGQIENISHRVRDAEL AGINRDLFLALVDVEDRVTPDCLTSGTRCYDFLKERGYELTTETEAHEDGDRVFALIENA NGHRTRVSVEFFISKMYLNSWEALDAIRQKCGGLSFMLRNKEQEYAASDIFDLHALTLEE ARKGLNIQRYKGLGEMNPEQLWVTTMNPENRTLLRVTVEDAEAASDTFEQLMGDRVEPRR EFIERNALTVRDLDI >gi|316922628|gb|ADCP01000103.1| GENE 2 2793 - 3941 1497 382 aa, chain - ## HITS:1 COG:AGc520 KEGG:ns NR:ns ## COG: AGc520 COG0592 # Protein_GI_number: 15887650 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 373 32 403 403 122 24.0 1e-27 MLFTVKKENIIEGLQKAAGIIPTRAGAAYLRSLWIKAEGEVLTIMATDANIEFTGTYPAD VKEPGLVGVNGRNFVDLVRRLPAGELRIRLDDATGTVILEQGRRTYKLPANDPTWFQSLA 
PFSSEGAVVWSGDFFQEIIDRVFFCVSDDESSDAIACLYFKPVGEGHIEVCGLNGHQFAL TRFTHDDLSARLPEDGVLIQRKYVGELRKWLGGDEIEVNLTDKRLFVRSGDGHETLSLPR AGYAYPDYSPFMTRLASPDVSLLKLSRKDCLDALDRISIFNTESDRCTYFELSSAEAMLS AQGQDTGSANESLEVSYNGSIGRIAFPTKNLMEILTHYQSGELTFTLTGAEGPCGINGTE DPDYTVLIMPMKIAEESYYEEN >gi|316922628|gb|ADCP01000103.1| GENE 3 4080 - 5648 1411 522 aa, chain - ## HITS:1 COG:sll0848 KEGG:ns NR:ns ## COG: sll0848 COG0593 # Protein_GI_number: 16330393 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Synechocystis # 198 514 132 435 447 135 29.0 3e-31 MELRNAVSKETLRKYLLTSSAEAIPQARPSLLSCFTNAENPKPQLQDRLSAAGEALESCS ETALQSWFDPLEILLDQEAGTLAVRFPHAFFGIRFSSLFQKLFEKKARELWGKGLSITYG AGKFTPSPPVPFVQPAPESRQDALHPAVANPMPFGEEWTFDTFIGNGKHKWALSLARDIT RRAVYRATHGRAPSLDDIGEAPGLLVLCGPHGTGKTHLLRAIGNELFRTLGSDLYYASLS DLELLFAGRSVLAARQELLSKEAVLIDDFQHLSRIPDKTPTTASRFRAQIVAEPEQPAGP SVREELCLLLDRFMDQGKPVIVAGVGRPKEWSLGKALLSRLETGLWAELPEPDLDVRLRY AQQQAKFRRLPLSREQLLLIAQHCPDVRRLSGVILRVASHRSLLGRDLSEQDLLNIVRQG GDSSALTPQLIVSIVGEHCGVPAKEILGEKRRPDLVQARQLAMYLCRELLGHSYPVIGRM FGGKDHSTVMHGVKKIKLLQESDRLAHSMVTELTKACLEHRG >gi|316922628|gb|ADCP01000103.1| GENE 4 6091 - 7503 1146 470 aa, chain + ## HITS:1 COG:PA0667 KEGG:ns NR:ns ## COG: PA0667 COG0739 # Protein_GI_number: 15595864 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Pseudomonas aeruginosa # 104 435 77 409 447 228 36.0 2e-59 MRNTFSAGTILISFICLFLVVGGMYLVNSRREASLLIPESQPVAVSSVAQRLQDQVTPSL FALWERAKSAVAVLPEEPPAQEMPEGEAAIVSEMQQVPLEQEPPSDPSVINGVIAKGDSA SELLSPYMSASSVQQLLNVTRKLHPLNNLRTGQPYTLVCSPDGRDMERFEYEINDRKKLI VTKTDEGLVAHVEPIVYDFSLVRVSGTIRSSLFETLASTGESPILAVRIADIFGSEINFI KDLREGDSFSLLIEKRFRENEFKGYGKVLGATFTNQGKTYEAYNFATEDGEAFFNAKGES LKKTLLKAPLAFTRISSGYSMNRKHPIFKTHKPHQGVDYAAPTGTPVKAVGDGTIEKAGW GNGFGNMVILKHSGGLESMYSHLSGFASGAKRGARVRQGQVIGYVGATGYATGPHLDFRL KQNGKYVNPAKVVAPRGGSIPRNRMNDFKNRKALIGEYLSGKRSLSDYKR >gi|316922628|gb|ADCP01000103.1| 
GENE 5 7883 - 8980 716 365 aa, chain + ## HITS:1 COG:BS_queA KEGG:ns NR:ns ## COG: BS_queA COG0809 # Protein_GI_number: 16079825 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Bacillus subtilis # 16 364 6 341 342 310 46.0 2e-84 MYSIQQSEADFLLSSYRYELPPERIAQHPGERGASRLLVLDRETGKRIHSHFSHLLEYLP KGTLLIANNSRVVPARMIGQRPSGGKMEFLLLTPMPLLEQHGEGDGWHCAEADGLIKPAK NARIGDYLVFGDDLRVEVLAKGEFGRHRVRLHWTGDIRALFESRGHLPLPPYIKREDTKD DRGTYQTVYARDDKAGSVAAPTAGLHFTPELRAQMASAGVEWAEVTLHVGYGTFSPVRCE DIREHVMHEEFVEVPQATVDAIGKAKAEGRPVIAVGTTSARTLEGVAGAHGGELTAHQGW INCFIWPGYRFHVLDGLITNFHLPESTLLMLVSALTGRERMLETYTEAVKMEYRFFSYGD AMLIR >gi|316922628|gb|ADCP01000103.1| GENE 6 9047 - 9145 107 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWEALFLILAIAAALVVLRLAKKAGLDIMPCG >gi|316922628|gb|ADCP01000103.1| GENE 7 9793 - 10008 103 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCYLFYEQYMNNIMHISGEKEQYGVSHRNKKSLHTEAFLFYEREWKIGMVSSSFPTSCDR GRDTTNPYDRF >gi|316922628|gb|ADCP01000103.1| GENE 8 9941 - 10543 656 200 aa, chain - ## HITS:1 COG:AF1432 KEGG:ns NR:ns ## COG: AF1432 COG1896 # Protein_GI_number: 11499027 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: Archaeoglobus fulgidus # 12 187 7 172 173 81 33.0 8e-16 MSAEQLKQRDAMQRLTDFLNEVGMLRHTPRSGYKFLGSGQETVAEHSHRTAVIGYVLAKK TGADAARTVMLCLFHDLPEARTGDFNYVNRLYDTSRERDALEDAVEGTGLEEDIMSIWDE HACRTTPESLLAHDADQLDLILNLKRESDLGNRYADKWLESAVERLRTDIAKELAQTILK TDHTDWWYLGPDRNWWERKS >gi|316922628|gb|ADCP01000103.1| GENE 9 10653 - 11822 596 389 aa, chain - ## HITS:1 COG:PA0559 KEGG:ns NR:ns ## COG: PA0559 COG2081 # Protein_GI_number: 15595756 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Pseudomonas aeruginosa # 4 385 5 390 392 276 39.0 6e-74 MIRDAIILGAGAAGLMCAMTAGRRGFSCAVIDHSPVAGRKVRLAGGGKGNVTNRYISPEW YVGTQKGFPDTLLKRCSTDFVLSMLASFGIEWEEREYGQIFCTVQAARLVESMAAACRNA 
GAELLFNRTFGEIRHEGGLFSVETSGGTLQAPRLVIATGSPAWPACGATDGGMRLARQWG HKVVPVRPVLAPLLLPESWPLHGLAGVSLPAAVSVGGRTFRYPLLFTHKGLSGPAILQAS CFWRQGEALHIDFLSDHPALELMHQPEHGKSTVLGLFKKVLPTRLAERLIPGSLASRKVA QLAKKDREYLALAIHDHMVTPSRSEGMGHAEAAAGGVDTSDVNPRTLESRKQPGLYFAGE VLDVTGLLGGYNLHWAWASGKAVGESWKK >gi|316922628|gb|ADCP01000103.1| GENE 10 11819 - 12355 549 178 aa, chain - ## HITS:1 COG:PA4473 KEGG:ns NR:ns ## COG: PA4473 COG3028 # Protein_GI_number: 15599669 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 7 174 21 192 200 96 35.0 2e-20 MPPKKRYQWDTKDSEFEDDRPSRSQKKRDSTALQRMGEELTTLGSSVLAKMPLTPNIREA VLEWQRLSSHEGRRRQMQYIGRLMREEADPQAVRDALDAIKLGHTGETASFKRSEKLRDD LMNATDAEMDTLLAAFSAEDATEIRDLTAKARNEREHSRPPHAYRALFRKLKSLPAEQ >gi|316922628|gb|ADCP01000103.1| GENE 11 12646 - 13590 1185 314 aa, chain - ## HITS:1 COG:BH2534 KEGG:ns NR:ns ## COG: BH2534 COG0167 # Protein_GI_number: 15615097 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Bacillus halodurans # 21 310 11 299 305 291 52.0 1e-78 MDRFTVRLSSSHKPSGNTKPLVFKNPILTASGTFGYGVEFANYGDLTHLGGIVVKGLSLK PRPGNPLPRVAETPCGMLNAVGLQNDGVESFLKNKLPRLPWHETPVIANLYATSPAEFAE LAGILAAEEGIAGLEVNISCPNVKNGGVLFGQDPALAAEVTQAVKKHAGNLPVIVKLSPN VTDITAIAKAAEQAGADAISCINTITGMGVDVKTRKPLLANVVGGLSGPAVKPVALRCVW QVCNAVKIPVIGIGGITCAQDVLEFILVGAHAVEIGTMNFVRPDAAFRIAEELPHLCQQL GIKNLAEFRGTLQL >gi|316922628|gb|ADCP01000103.1| GENE 12 13593 - 14408 818 271 aa, chain - ## HITS:1 COG:BH2535 KEGG:ns NR:ns ## COG: BH2535 COG0543 # Protein_GI_number: 15615098 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Bacillus halodurans # 40 269 34 259 259 115 32.0 7e-26 MSQSAFYDLAVIDNVPFGLTGSQGRFFALRLERPDWKSWAPGQFVMLRPQGWALDMPWAR PFSICRVSTNELVIFFQVAGRGTEKMAKLRRGDKVHLWGPLGNRFAVEGGTPTLLLAGGI 
GIAPFVGYAERHPTPGVLSMVFGHRLPEDCYPVESLGERIDVESFHENTSEDLEWFLNHV RSRIEAYAERNGLVLACGPAPFLKAVQGFALAAKARCQLSLESRMACGVGACLGCVTKTT EKWPVEEKAGTPVQTCTHGPVFWADQITLEA >gi|316922628|gb|ADCP01000103.1| GENE 13 14602 - 15786 1453 394 aa, chain + ## HITS:1 COG:sll1883 KEGG:ns NR:ns ## COG: sll1883 COG1364 # Protein_GI_number: 16330800 # Func_class: E Amino acid transport and metabolism # Function: N-acetylglutamate synthase (N-acetylornithine aminotransferase) # Organism: Synechocystis # 3 394 18 419 419 290 44.0 3e-78 MSTAPKGFRFATACAGFRKEGRADLALLVSDVPAVAAGTFTQNRFPAAPVLVAKAMLAER PAARAVVINSGQANACTGDEGMRNCRTTQELVASALGLEASDILPASTGVIGAQLKMDLW EKAAPALAANLGKATPEDFARAIMTTDAFHKISHREVSLPGGVVKLVGMAKGAGMICPNM ATMLSVVLCDAQVEASVWQKMLRDAVELTFNRVTVDGDTSTNDTLYGLANGASGVSASDE ESLKALSAALTSVLGDLAYMLVKDGEGASKVARIHVTGAASDADAERVARTVGHSQLVKT ALYGRDANWGRIVAAVGRSGADFNPDDVVVTLCGVELFRKGQPTDLDFDALLEEPLKQRD LPIDIVLGSGSGSYTLLASDLGHEYVNVNADYRS >gi|316922628|gb|ADCP01000103.1| GENE 14 16281 - 17207 698 308 aa, chain + ## HITS:1 COG:HI0412 KEGG:ns NR:ns ## COG: HI0412 COG0564 # Protein_GI_number: 16272361 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Haemophilus influenzae # 5 264 16 267 322 123 34.0 5e-28 MPEPLQVTAAEAGQKLLQFLSRRFEEPQSVLHRWIRTGQVRINGGRAKPFDRVKLNDEIR VPPFAGAGTKAERVPASSGGKELPPIVAETADVIVFCKPSGLPVHPGTGHTDSLTTRLEA AFAGSPFIPAPVHRLDRDTSGLLLVGKTYAAVRRLSDALAAHDGSVAKDYLAWVQGECPW SRPKRLEDHLAKRTVGAQGREKVVAGKSRTGGPDEKPASLTVRSLTVREGRSLVLIRLHT GRTHQIRVQLSERGFPLVGDVKYGGPRCGDGLKLHAVRLRVGEETYTALPPWAGPWRVAH LPPEWEDV >gi|316922628|gb|ADCP01000103.1| GENE 15 17209 - 18108 1173 299 aa, chain + ## HITS:1 COG:AF1688 KEGG:ns NR:ns ## COG: AF1688 COG0730 # Protein_GI_number: 11499278 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Archaeoglobus fulgidus # 12 297 5 270 274 153 35.0 4e-37 MYFPTAEIEVSPFVPPLVAFAVSFFTSMGGISGAFLLLPFQMTFLGYTNTSVSATNQLYN 
VFSNPGGVFRYYREKRMLWPLSWIVMAGAVPGVVIGGLVRINWLADVRPFKLFVALVLLG IGLEMIRDLFGWMRFKPAPKYEQSPPKPYVEVVERNLSRVSYTFDGHLYSFSVPILLFYS IAVGLVGGIYGIGGGAIIAPFLVSVFHLPIYTVAGATLAATFVNALAGVLFYIAISPLYP DLSIAPDWGMALLLSLGGLLGMYFGAKFQKCISPVLLKWMLVVILFGTVIAYFVEFVRG Prediction of potential genes in microbial genomes Time: Fri May 13 03:52:27 2011 Seq name: gi|316922625|gb|ADCP01000104.1| Bilophila wadsworthia 3_1_6 cont1.104, whole genome shotgun sequence Length of sequence - 1967 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 21/0.000 - CDS 32 - 1039 1299 ## COG0306 Phosphate/sulphate permeases 2 1 Op 2 . - CDS 1032 - 1835 718 ## COG1392 Phosphate transport regulator (distant homolog of PhoU) Predicted protein(s) >gi|316922625|gb|ADCP01000104.1| GENE 1 32 - 1039 1299 335 aa, chain - ## HITS:1 COG:CAC3093 KEGG:ns NR:ns ## COG: CAC3093 COG0306 # Protein_GI_number: 15896344 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Clostridium acetobutylicum # 18 324 19 324 330 265 55.0 1e-70 MFEVPVLLVLIVVVALIFDFTNGAHDCANAIATVVSTKVLSPRVAVVMAASLNFIGAFLG TKVATTLGAGIVNTDMVANCQPLVLAALLGAIAWNLITWYFGIPSSSSHALIGGLMGAAV AYAGFSTLNGASIAEKILLPLILSPLAGFGMGFLTMFLIMLLCARANRNKLTRAFTKLQI VSAAFMATSHGLNDAQKTMGVITLALFIFNKIDTIAVPFWVKCVCALAMAMGTALGGWKI IKTMGHKIFKLEPVHGFATETSAAAVICGASLFGAPVSTTHTITACIFGVGSTKRLSAVR WGVAGNLVIAWILTIPASAFIAAIAFWCMKIGGIV >gi|316922625|gb|ADCP01000104.1| GENE 2 1032 - 1835 718 267 aa, chain - ## HITS:1 COG:CAC3094 KEGG:ns NR:ns ## COG: CAC3094 COG1392 # Protein_GI_number: 15896345 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate transport regulator (distant homolog of PhoU) # Organism: Clostridium acetobutylicum # 65 265 6 208 210 63 24.0 6e-10 MASTGSGRHCFSLLLIPSEKIFHNERFYSPVDFSCLLARSISFQYNGGIYESDLLGGSML ARFLPQTVPFFKLLMEQNAVLQRMAQSLIVVMENTYESESHLKQINILEEEADKLNRKIT 
WHLSQTFITPIDREDIHAINLAQERVADGIQNLASRFFMCGFMYQRFPAHMMTRNIKGMI EETELMLGQLEKKKDVSGHLHSLKSRKSDCEMLQAAGISEMMDSEIPNFEKVRELILWSQ VYERIERAVDMISDLGDTLEEVVLKYV Prediction of potential genes in microbial genomes Time: Fri May 13 03:52:38 2011 Seq name: gi|316922607|gb|ADCP01000105.1| Bilophila wadsworthia 3_1_6 cont1.105, whole genome shotgun sequence Length of sequence - 20643 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 14, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 288 - 1307 1005 ## COG0564 Pseudouridylate synthases, 23S RNA-specific - Prom 1434 - 1493 2.7 2 2 Tu 1 . + CDS 1561 - 2898 1464 ## COG1206 NAD(FAD)-utilizing enzyme possibly involved in translation + Prom 3048 - 3107 2.2 3 3 Tu 1 . + CDS 3186 - 3485 369 ## COG1359 Uncharacterized conserved protein + Term 3523 - 3554 2.4 - Term 3503 - 3549 6.1 4 4 Tu 1 . - CDS 3666 - 5603 2359 ## COG3276 Selenocysteine-specific translation elongation factor 5 5 Tu 1 . + CDS 5415 - 5774 114 ## + Term 5895 - 5958 18.5 - Term 5813 - 5839 -0.3 6 6 Op 1 31/0.000 - CDS 5955 - 7376 479 ## PROTEIN SUPPORTED gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 7 6 Op 2 . - CDS 7386 - 7667 425 ## COG0721 Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit - Prom 7698 - 7757 4.2 + Prom 7758 - 7817 4.0 8 7 Tu 1 . + CDS 7843 - 8379 548 ## Dalk_4820 hypothetical protein + Term 8387 - 8431 14.8 - Term 8375 - 8419 14.8 9 8 Tu 1 . - CDS 8540 - 9511 887 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 9751 - 9810 5.6 10 9 Tu 1 . + CDS 9879 - 11792 2665 ## COG0443 Molecular chaperone + Term 11854 - 11924 32.2 + Prom 11890 - 11949 3.2 11 10 Tu 1 . + CDS 12062 - 12559 555 ## COG3467 Predicted flavin-nucleotide-binding protein + Term 12572 - 12615 11.7 - Term 12558 - 12603 8.3 12 11 Op 1 1/0.000 - CDS 12662 - 13171 245 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 13 11 Op 2 . 
- CDS 13377 - 14732 1753 ## COG1109 Phosphomannomutase - Prom 14884 - 14943 3.6 + Prom 14999 - 15058 4.3 14 12 Op 1 21/0.000 + CDS 15219 - 16442 1608 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 15 12 Op 2 . + CDS 16442 - 17290 757 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 16 12 Op 3 . + CDS 17314 - 18009 664 ## LI0677 hypothetical protein + Term 18105 - 18160 2.2 17 13 Tu 1 . + CDS 18273 - 19280 1093 ## COG2951 Membrane-bound lytic murein transglycosylase B + Term 19285 - 19322 8.5 18 14 Tu 1 . + CDS 19425 - 20588 1357 ## GAU_3543 hypothetical protein Predicted protein(s) >gi|316922607|gb|ADCP01000105.1| GENE 1 288 - 1307 1005 339 aa, chain - ## HITS:1 COG:AGc4432 KEGG:ns NR:ns ## COG: AGc4432 COG0564 # Protein_GI_number: 15889714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 52 319 35 309 340 79 28.0 8e-15 MPSSANRPENGGISFRVPEEAQGWRLDKALGLLLASPSPEQEAARPGLFALADLGLRARR RLCDRSLVLVNGKPGIPGLKLRAGQEIIILPDPEGAAIPEDAPSLVYKDGGIAALYKPAG MHTAALAGSLSPSLETLLGDLLPSEEEGYASRLLNRLDAPTSGLVLASCTDGGERRWYRA ERIGNTDKLYLALIEGQPLYDFTVARRLDTDTRNKTRVRHTDDPDPVRHTDVTLLAPLTA GDVPGLVESDDPDAPLMLVGCRIRKGARHQIRAHLAAAGHPLAGDSLYGAELPCPSGFLL HHGRVSLPDFQAFRLPAWLPLLPREAQEKATAFLGMEEE >gi|316922607|gb|ADCP01000105.1| GENE 2 1561 - 2898 1464 445 aa, chain + ## HITS:1 COG:SA1094 KEGG:ns NR:ns ## COG: SA1094 COG1206 # Protein_GI_number: 15926834 # Func_class: J Translation, ribosomal structure and biogenesis # Function: NAD(FAD)-utilizing enzyme possibly involved in translation # Organism: Staphylococcus aureus N315 # 3 429 2 424 435 416 48.0 1e-116 MTTQRIALVGAGLAGCECALQLARRGFAVTLFEQKPEAYSPAHSMPQLAELVCSNSLRSD ELASGVGLIKQELRELGSALMAVADATRVPAGKALAVDRELFAGKVTELVEAEPNITLVR KAIADLDDPALEGFERVVISAGPLASESLSASIARAVGAEHLYFYDAIAPIVAADSIDMS IAFWGSRYGEPGEGDYLNCPMNEDEYRAFYQGLLDGEKVASREFEQEKHFEGCMPVEALA 
ARGEKTLAFGPLKPVGFTDPRTGRRPFALLQLRPENGNKTMFNLVGCQTKLTYPAQAVCF RKVPGLEHAEFVRFGSMHRNTYVNAPVSLNEELALKSRPNVHLAGQITGVEGYVESIACG LWVGQMLAAKLAGRSLPKPPATTALGALLNHLSTPAKHFQPSNVHFGLMPEPEVRVKKKE RKQWYADRARAAFGEWLKGLEKEGD >gi|316922607|gb|ADCP01000105.1| GENE 3 3186 - 3485 369 99 aa, chain + ## HITS:1 COG:sll1783 KEGG:ns NR:ns ## COG: sll1783 COG1359 # Protein_GI_number: 16330238 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Synechocystis # 32 93 78 139 147 65 40.0 2e-11 MDTCFVTAKLTAKKGMEAQLEAEVVKNIPNVRAEKGCIRYDFHKNRSEDGTFLFYEIWES PEALDAHGKTPHMLAYKERTKELLACPTVVTVWSQVDCR >gi|316922607|gb|ADCP01000105.1| GENE 4 3666 - 5603 2359 645 aa, chain - ## HITS:1 COG:SMa0015 KEGG:ns NR:ns ## COG: SMa0015 COG3276 # Protein_GI_number: 16262460 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Selenocysteine-specific translation elongation factor # Organism: Sinorhizobium meliloti # 3 640 1 629 666 329 37.0 1e-89 MAIVMGTAGHIDHGKTSLVRKLTGIDCDRLEEEKRRGITIELGFAFCDLPGGGRLGIVDV PGHEKFVKNMVAGASGIDFVMLVVAADEGVMPQTREHLEICSLLGIKHGLVAVTKIDMVD PELLELAVEDISEFLKGTFLEGAPLFPVSSQTGEGVDKLRDYIVKQEKELAPRRRTDLFR LPVDRVFTLKGHGTIVTGTMISGSVKVGDALELLPKKLATRARSLQSHGESVEVAESGHR TAVNLQGLDVADVERGDVLALPGTLFPSDRWLVRLTCLGSSPRALRHRAEIHFHHEAREI AARLYFLDRDKLGPGETALCEVRLDEPLVGVFGDHCVVRAFSPLRTVAGGVVLDPISAGL RRRDATPDRVASLLGLEDASDEDRVRMQIELAGNRGANLAQLSVLTNLDSKRLDKVLQAL SGKGKIFCFDREEKGYVAAGASAELAKRCLAVADAFHKKEPLKQGMARGTLLSGGTGREA WSKGIPPKLAFFVVERLLRSGELVSEGDVIRMASHTVSLKSDQAGLRDALLKAHVDGAFT PPNLKDVLEELSVDAKAAAPVLKLLCEDGSLVKVKDGLYFHGPVIQELKARMQAWFGSHD DLDPAGFKELSGGLSRKYVIPLLEYFDRERVTIRVGDKRQFRGRG >gi|316922607|gb|ADCP01000105.1| GENE 5 5415 - 5774 114 119 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGHVHDAEPSAAGQVAEREAELDGDAPAFFLFEAVAVDPGQLAHERGFTVVDVTRRSHY DSHSNASGGRAAARTRASRRAGGNDFPRTPSVPPFGLQPKRRFVRRPHGGPVGVRSFRF >gi|316922607|gb|ADCP01000105.1| GENE 6 5955 - 7376 479 473 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163737840|ref|ZP_02145257.1| 30S ribosomal 
protein S4 [Phaeobacter gallaeciensis BS107] # 1 455 1 435 468 189 33 2e-47 MSDLIQASLAEIRARLAGGDVSAEAVAKACLDRIAETEPSIHALITVREQALEEARALDA QGPDASKPLWGVPVTVKDAIVTKGTRTTAGSKILGGFNPFYDAFVVERLKEAGAVIIGKN NMDEFAMGSSTENSAFGPTRNPRDPERIPGGSSGGSAASVAANQCYASLGTDTGGSIRQP AALCGCVGLKPTYGRVSRYGVIAYGSSLDQVGPLTRTVEDAALVLSVIAGHDKRDSTSSP RPVDGYADFSRSDLKGVRLGVPREFMAEGLDGPVAKACQDALARARDLGAELVDVSLPHA TRHAIAAYYIVAMAEASSNLSRFDGVRYGYRAAEPKNLDDLYCRSRSEGFGQEVQRRILL GTYVLSAGYYDAYYKKAAQVRRLIRQDYLNALGSCDALFGPVSPVTAWKLGSIIDDPLKM YLMDIYTVSLNLAGLPGLSFPVGEADGLPVGMQLIGKDFDEAGILGIGNVLSK >gi|316922607|gb|ADCP01000105.1| GENE 7 7386 - 7667 425 93 aa, chain - ## HITS:1 COG:aqq_07 KEGG:ns NR:ns ## COG: aqq_07 COG0721 # Protein_GI_number: 15607091 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit # Organism: Aquifex aeolicus # 1 93 1 92 94 60 40.0 6e-10 MLTTDRMAHLARLARLAPEEGTLAKFGEQCADIIAYMDILAEVDTSAVEPLYSPVEHSTP YRADEPVQKNMHSDVLANAPETDGQFFVVPRIV >gi|316922607|gb|ADCP01000105.1| GENE 8 7843 - 8379 548 178 aa, chain + ## HITS:1 COG:no KEGG:Dalk_4820 NR:ns ## KEGG: Dalk_4820 # Name: not_defined # Def: hypothetical protein # Organism: D.alkenivorans # Pathway: not_defined # 5 70 10 75 135 73 59.0 2e-12 MAASKRRSGFTLIEIISVLVILGILAAVAVPKYYDLQAQARVKAALVAVQEAQARINLLF AQEILSGRLTDCKKFNVTTLHSQSNIGNNYNFSIVDSDDGTQGAGTVGMWRVYLKNGFKN TNVRRQDEVNVLIPPNEDWYVINGSSNSQASGYVKDKTSYPVPSELKSLFVLSLPSCE >gi|316922607|gb|ADCP01000105.1| GENE 9 8540 - 9511 887 323 aa, chain - ## HITS:1 COG:YPO1320 KEGG:ns NR:ns ## COG: YPO1320 COG1686 # Protein_GI_number: 16121602 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Yersinia pestis # 37 276 69 315 432 184 40.0 3e-46 MAGFLLIAASLVLAPAAEARNARKAHAAQSTASRQILNVRSAVLMDARTGKILYSQNPDV RIPPASLTKIMSMMVTFDAIRSGKVKLSDQIRVSRHAARQGGSRMGLRAGERVSLNRLLM GMAVSSGNDASMAVAERVGGSGRAFVKMMNNKARQIGMSSSTFKTPNGLPAAGQYTTARD 
MARLGYVYLKNNPSALQYHRVRVLKHRGAVTTNKNPLLGACPGADGLKTGWVTASGYNII STVRRGNTRLIAVILGADSAGLRSHEVRRLVEAGFKSRTNGITVASALTKYSYPNKATAS SVKKHKIRRNKATAQASRRTGRS >gi|316922607|gb|ADCP01000105.1| GENE 10 9879 - 11792 2665 637 aa, chain + ## HITS:1 COG:CC0010 KEGG:ns NR:ns ## COG: CC0010 COG0443 # Protein_GI_number: 16124266 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Caulobacter vibrioides # 1 635 1 623 631 756 65.0 0 MGKIIGIDLGTTNSCVYVMEGKDPKCITNPNGGRTTPSVVAFTDKERLVGDAAKRQAVTN SSRTVFAVKRLMGRMANSPEVEHWKQHAPYKIVAGANGAAVVEIDGHQYTPQEVSATILS KLKADAEAYLGEPVTEAVITVPAYFNDAQRQATKDAGQIAGLDVKRIINEPTAASLAYGF DKKANEKIVVFDLGGGTFDVSILEVGDNVVEVRATNGDTFLGGEDFDQRVINYLVEEFKK EQGIDLAKDSMALQRLKDAAEAAKKELSTSMESEINLPFITADQTGPKHMLIKLSRAKLE QLVSDLVQKTLEPCRKALADAGLKPSDIDEVLLVGGMTRMPLVQKTVADFFGKEPNRSVN PDEVVAMGAAIQGGILAGDVKDVLLLDVTPLSLGIETMGGVFTKLIDRNTTIPTRKSQTF TTAADNQPSVSIHVLQGERPMAADNMTLARFDLTGIPPAPRGVPQIEVAFNIDANGIVNV SAKDLGTGKEQSIQITASSGLSDADIEKLVKEAEAHAADDKKKQDLISARNQADGLIYGT EKSIKDLGDKLDAALKSDIEGKIASLKTLMEGEDVEAIKKATEELAQASHKLAEQLYSQA QGQQPGADAAGAQPGAGAAGGKKADDDVIDADFTESK >gi|316922607|gb|ADCP01000105.1| GENE 11 12062 - 12559 555 165 aa, chain + ## HITS:1 COG:CAC2475 KEGG:ns NR:ns ## COG: CAC2475 COG3467 # Protein_GI_number: 15895740 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Clostridium acetobutylicum # 4 150 5 151 154 103 44.0 1e-22 MVAMRRKDRQVSDEEAMAYLKAAEYGVLSTVGPDGEPYGVPLTYAVEEDGKGLVFHCARD GYKLACFGANPRAHFVAVQETNVLPEEFSIEYKSVMVSGLLEEIEDREEKVRCAQVVGDK YSTISSEEYANRAADKIRVFRLNIENISGKRLVKAGEPGMGYKKG >gi|316922607|gb|ADCP01000105.1| GENE 12 12662 - 13171 245 169 aa, chain - ## HITS:1 COG:MA3114 KEGG:ns NR:ns ## COG: MA3114 COG0454 # Protein_GI_number: 20091932 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Methanosarcina acetivorans str.C2A # 12 157 
7 152 166 124 46.0 1e-28 MPCLKAVTPHHIVQAEEADLEAILELQRLAYQGEARLLNDFSIPPLMQTLEEMKEEFRSG IFLKAVDEKGKIVGSVRGTLRGDTLLIGKLMVHPEHQGNGLGSCLLQELEKNCPAPRLEL FTSNKSLRNLCLYERNGYTRCVEKAVSPALTLIFLEKKPSHPASAQNPV >gi|316922607|gb|ADCP01000105.1| GENE 13 13377 - 14732 1753 451 aa, chain - ## HITS:1 COG:PA5322 KEGG:ns NR:ns ## COG: PA5322 COG1109 # Protein_GI_number: 15600515 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Pseudomonas aeruginosa # 4 444 11 453 463 421 45.0 1e-117 MNAGNVFRAYDIRGVVDKDFNAEWVTHLGKACGTYLAGRGISSAVIGYDCRHSSPAYHDA LVEGILSTGTDVISIGMVPTPALYFAVKHLNRQGGVMITASHNPPEYNGFKVWAGQSTIH GEEIQKIKAIFEEGAFAEGTGTASRIDIIPTYKQDILSRFKLARPVKVVLDGGNGAGGEI CADILTKLGATVIPMFCEPDGDFPNHHPDPVVEANMQALMARVKEEKADLGIGLDGDADR LGIVDPDGRLLFGDEVLSLYARELLTRKPGSTVIADVKCSSRLFNDIKAHGGTPMMWTTG HSIIKAKMQEVGAPLAGEMSGHMFFDDNWYGFDDAIYGSARFVALFSAQDKPMTELPGWP ASFATREINIPCPDNAKFAVVEKVKAHFRALYDTIELDGARVNFPHGWGLVRASNTQPVL VTRFEADSAEALAAIREEMETPLKKWIEEAE >gi|316922607|gb|ADCP01000105.1| GENE 14 15219 - 16442 1608 407 aa, chain + ## HITS:1 COG:TM1822 KEGG:ns NR:ns ## COG: TM1822 COG0330 # Protein_GI_number: 15644566 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Thermotoga maritima # 73 352 3 280 308 238 42.0 2e-62 MNWDWDKLQEKRQRQNWGNSNNNSNNNNNNNKNDEEPPFGSGSGNGGGKGPDYEKFSDAF KRLPHLSAPAGGKVKWILVALVAVWLLSGIYIVNPDEEGVVLRFGKYDRTVGAGPHYALP FPIETVYKPKVTQVQRVEVGFRSVGQGRTFQQGANRSLPEESGMLTGDENIVNVQFSVQY QIKNPVEYLFNVTDQAAVVKNAAEAAMREVIGNSLIDSALTDGKLQIQTEATQLLQEILD RYKVGVRVIAVQLQDVHPPKEVSDAFKDVASAREDKSRIINEAEAYRNELIPKARGLAAE VENQAQAYKETRIRNAEGEANRFLALLKEYEQAKDVTKQRMYLETMEEILSRPGMEKLVL PKDAADRVLPLLPLMQSAPSAGKIAPQEQSSGTQLPEATLSRPRGNN >gi|316922607|gb|ADCP01000105.1| GENE 15 16442 - 17290 757 282 aa, chain + ## HITS:1 COG:TM1823 KEGG:ns NR:ns ## COG: TM1823 COG0330 # Protein_GI_number: 15644567 # Func_class: O Posttranslational modification, protein turnover, 
chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Thermotoga maritima # 7 260 4 261 283 211 41.0 2e-54 MEKRFPFLIGLIIILVLIGSQSIFIVNQTEKALVIQLGDPVDKVFGPGLHFKIPLIQTVV RFDARVLDYEARAAEALTSDKKAIVLDNYARWRIIDPLQFYRSVRTIPGAQARLDDVVYS QLRAQVGRHSLTEVVSSKRSGIMADVTRRASDIMKEYGIEVVDVRIKRTDLPAENQRAIF GRMRAERERQAKQYRSEGVEEATKLRSEADRERAVILAEANRRSSVIRGEGDATAARVFA EAFSRAPDFYKFQRGLEALKKGFEQNSRIVITNDDPFLAPIR >gi|316922607|gb|ADCP01000105.1| GENE 16 17314 - 18009 664 231 aa, chain + ## HITS:1 COG:no KEGG:LI0677 NR:ns ## KEGG: LI0677 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 229 22 251 253 281 60.0 1e-74 MTFFQEVYDRIRCATNARTQTELAAVLEIRQSSISDAKRRNSIPSDWYMKLFEKFGLNPD WIKSGTGPMFLKIDQVYTPVEGPPEGFHEEPAHYADPNAISTLTKVYSTQCAYEEGHPAP DLQTSGKIALPQSYVNAQTLVFHIESDSFSPLVRRGAYVGVDTSRTHPISGEVYAVYMPH EGVALKRLFLDGDHDRFILRTEQPAHPGVALLPQDCPGRILGRVSWVLQNV >gi|316922607|gb|ADCP01000105.1| GENE 17 18273 - 19280 1093 335 aa, chain + ## HITS:1 COG:PA4444 KEGG:ns NR:ns ## COG: PA4444 COG2951 # Protein_GI_number: 15599640 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-bound lytic murein transglycosylase B # Organism: Pseudomonas aeruginosa # 102 297 102 265 367 91 33.0 3e-18 MRRTLSAVLLLLALAGAGCAQNRIVYAQSGVQVQPPQYPSYQGSYTNGAIPSAWLPLYQR LRADGLDGPDLPALFASMGTPSQDPMGRKIKELYTKAFAPKPKPTPPSSKPTTPARPKPL VYPGVITAENVEKCRAFLEANKPAFDYAERTFGVPRQISVSLLFVETRLGTFLGKEKAFY TLASMASSRRPEAISTYVAQLPGSTAADRQSWIQQRMEQRSDWAYKELVALLKNIRSSGE DALAMPGSIYGAIGLCQFMPTNISHYGADGNGDGVVNLFTVPDAVASLANYLAQHGWNKA ATRAQKQKVIKTYNRIDIYANTILGLAEAQGYVAE >gi|316922607|gb|ADCP01000105.1| GENE 18 19425 - 20588 1357 387 aa, chain + ## HITS:1 COG:no KEGG:GAU_3543 NR:ns ## KEGG: GAU_3543 # Name: not_defined # Def: hypothetical protein # Organism: G.aurantiaca # Pathway: not_defined # 33 385 23 369 372 202 36.0 2e-50 MTYRLSGRLLNKGLMGAVVALLLLMSLLAPARAELVQFVYTSDQHYGITRKAFRGLDKVS 
SREVNAAMVQAINTLPGISLPEDGGVRAGQPVQWADAVISTGDIANRMEGTDERLIPSAT ECWDLFEKQYINGVSLKDRAGKAAEVLAIPGNHDVTNAVGFYKAMTPAKDNGSLLAMYNR ANNTSLAPEAFDAKRDKVFLNREYGGVRLLFVQMWPDSAARAWLDAQLANVPSGKPVLLF THDQPDIEAKHLINPNGSGDINAKDKFENLVSDVSSVQKAKEIPVKEHRELAAFLKKHPS IVAYFHGNENYNEFYTWGGPDNDIAMPVFRVDSPMKGNASSKDEKLLSFQVVSIDTDARK MTVREALWNADPKHPAITWGESKTIGF Prediction of potential genes in microbial genomes Time: Fri May 13 03:53:20 2011 Seq name: gi|316922589|gb|ADCP01000106.1| Bilophila wadsworthia 3_1_6 cont1.106, whole genome shotgun sequence Length of sequence - 26767 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 10, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 171 - 207 7.2 1 1 Tu 1 . - CDS 343 - 1908 2033 ## COG0728 Uncharacterized membrane protein, putative virulence factor 2 2 Tu 1 . - CDS 2055 - 4916 1503 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) - Term 5151 - 5183 7.0 3 3 Op 1 . - CDS 5206 - 6300 1556 ## COG0075 Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase 4 3 Op 2 . - CDS 6297 - 7187 1197 ## COG0796 Glutamate racemase 5 4 Tu 1 . - CDS 7294 - 8220 914 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase - Prom 8353 - 8412 2.6 6 5 Tu 1 . - CDS 8494 - 9282 956 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases - Prom 9351 - 9410 2.3 7 6 Op 1 1/0.000 + CDS 9387 - 11213 2307 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 8 6 Op 2 . + CDS 11213 - 12775 1234 ## COG2121 Uncharacterized protein conserved in bacteria + Term 12782 - 12845 23.3 9 7 Tu 1 . + CDS 13168 - 13914 898 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity + Term 14115 - 14183 30.3 - Term 13978 - 14010 -1.0 10 8 Tu 1 . 
- CDS 14177 - 14833 853 ## Pecwa_1808 hypothetical protein - Prom 14925 - 14984 2.2 - Term 14980 - 15019 9.1 11 9 Op 1 6/0.000 - CDS 15102 - 16280 1748 ## COG5557 Polysulphide reductase 12 9 Op 2 . - CDS 16290 - 17087 989 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 13 9 Op 3 . - CDS 17087 - 17473 642 ## DVU1288 cytochrome c family protein 14 9 Op 4 1/0.000 - CDS 17477 - 19141 2207 ## COG0247 Fe-S oxidoreductase 15 9 Op 5 . - CDS 19154 - 20185 1350 ## COG2181 Nitrate reductase gamma subunit - Prom 20205 - 20264 3.7 16 9 Op 6 . - CDS 20275 - 20802 560 ## DvMF_0071 hypothetical protein - Prom 21002 - 21061 6.4 - Term 21166 - 21225 20.6 17 10 Tu 1 . - CDS 21331 - 25713 5478 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) - Prom 25869 - 25928 8.0 Predicted protein(s) >gi|316922589|gb|ADCP01000106.1| GENE 1 343 - 1908 2033 521 aa, chain - ## HITS:1 COG:XF2420 KEGG:ns NR:ns ## COG: XF2420 COG0728 # Protein_GI_number: 15839011 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, putative virulence factor # Organism: Xylella fastidiosa 9a5c # 14 396 19 398 536 135 30.0 2e-31 MGIAALIMAGSVFLSRLMGLVRDKVVSWQFGAGAESDVYFAAFVVPDFLNYLLAGGYISI TLIPLLSKRFEEDEADGWRFFSAVFWWAALGIAALTAVAWIFAPELARIVGPGFSPEKQA RLAHFLRIILPAQVFFLPGACVSALLYIRKQFLAPALTPLIYNGCIIAGGLLVTGRGMEG FCWGVLFGAALGSFLLPVVAARSSGSPLPEGIPAGLRLRFNLRHPLLKRLLLLALPLMLG VSIVAMDEQFVRIFGSMAGEGAVSLLSYARRIMLVPVGVVAQAAGVASFPFLAALAARGD DAGFDKTLGTALRGSMLVVIPLTAYMMAVALPTLGFIFEGGRFSAEETILAAPLLQILLL SVPFWVVQQVIGRAFYARQNTLTPAIVGTVATLAALPVYPLAVKLWGAFGVAMLTTLCLF VYTLALSWFWIRKHGTGAFDGMGHLLLKGFLLVLPGTLLAFFAVYGLPGRLPLWFPSLYA YLPAAMRHAAICGIAGVIFAVPYLLLAKLFMPEALSLRRRR >gi|316922589|gb|ADCP01000106.1| GENE 2 2055 - 4916 1503 953 aa, chain - ## HITS:1 COG:BS_comEC_2 KEGG:ns NR:ns ## COG: BS_comEC_2 COG2333 # Protein_GI_number: 16079611 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # 
Organism: Bacillus subtilis # 632 928 5 297 307 102 26.0 4e-21 MPTKQPMPPVPPLQPLLFWETGVLLFVAGIVTARFPVPALTACALFAWVDSRTRRPLCCL LAAACVISGWWIGEKSVPRVPQEYPAWLEKSLSSRRAVTVEGVIVGSRGLPDQRLQIILD DVGPAEKEPLPGRMALTWQDMPDVPRPLPGQRITADLKIRPVHGFHNQGTWNSEAYWHRQ GVFFQAWAKQDDAAIRTSGTPSAGAELRERLRLRVAAALDPPEESGLRTLSPSSTDRNAS RQAALPSQNAWSEPPSPPRRENLPSAPEQIGEVQTEEVREHSGSPAHSAAPADTRRFSRV RDGGSSIIPALLFGDRYGLNTPDMERINAAGLTHSLALSGQHLAVVGLGALALTGIVGLL APGLFLRFPAYSLIGLLSLPLASAYLWLGDAPPSLVRAALMLAIVCLLRCVPDLLPERFR RNLRPAFTFADVLLLALFCMVLADPLCLYDLGVQLSFSAVAGIALCSPWLSKLWNDGPLS FSPLKVLQGGLSPMRAAGGRFIRLLWLTLGCSVAAQLATLPLVLDAFGRSTLWFPINLLW LPALGFIVLPLSFLGLIAAATGLEQAAGFLLHLANIPCEALLHSLRWLQAHAGLDLFVSP RPHWTAILGFGAIAVALAMRIHRDHFPHAAKRLLISGALLLSVGPLLWVHAFFEPKISLR VLDVGQGQAVLLEWPYGGRAMVDGGGLFSDRFDVGRDLVSPVLTANNLPRLDFIAVTHPD RDHLKGLLFIAANYAMKAAYTAPLEGIDTPQHGSPRPLSEAFTAILASRGIPRHTLGAGN VLPLADGLALEVLAPAPGVTPSGNDGLVFRLVLNGHGLALLPGDAEAPYLRALLRSGADL SADVLVLPHHGSAGSLVPALYDAVSPKLAIASAGAYNPYRLPSRKVRDALEWRDIPLHIT GNEGEIAVHWDLKKNAGKKNILQEGFPPPRPHLSQYVQPMGRARESSPAGNTK >gi|316922589|gb|ADCP01000106.1| GENE 3 5206 - 6300 1556 364 aa, chain - ## HITS:1 COG:CAC0221 KEGG:ns NR:ns ## COG: CAC0221 COG0075 # Protein_GI_number: 15893513 # Func_class: E Amino acid transport and metabolism # Function: Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase # Organism: Clostridium acetobutylicum # 5 362 4 358 377 183 31.0 6e-46 MSYAPIPMVPGPVALHEDVIAVLGRDYGSGQVESDFLCLYDATSRSIGKLMGTKDDVVLM TGEGMLALWGALKSCLKPGDHVVSVGTGVFGDGIGEMAESFGCIVEKVSLPYDCSIRESD LAAVEEAIRRVKPVMLTAVHCETPSGTLNPIGLLGKLKKDLGVPLFYVDTVAGLGGAPVH MDEWNVDLMLGGSQKCLSCPPSMSMVGVSAAAWERMKEVNYQGYDAILPFRTVRTDGRCP YTPNWHGVAALYAGTQAIFKEGMDAAFARHEAVAAQCRAGLAELGIKLWTAPDAVNAPTV TAAMIPEGFTWPEWKEALRRHGLICTGSFGPMDGKVFRLGHMGTQAQPYLMEQALDAIAA TLGK >gi|316922589|gb|ADCP01000106.1| GENE 4 6297 - 7187 1197 296 aa, chain - ## HITS:1 COG:NMA2026 KEGG:ns NR:ns ## COG: NMA2026 COG0796 # Protein_GI_number: 15794906 # Func_class: M Cell wall/membrane/envelope biogenesis # 
Function: Glutamate racemase # Organism: Neisseria meningitidis Z2491 # 10 268 7 261 270 206 43.0 4e-53 MTTEQCGQDQLPIGLFDSGVGGLTVLDAMRRLMPGEDYLFLGDTARVPYGAKSQKTIVRY SLQAAAKLIGQRIKLLVIACNTATAAALPALRETWPDMPIIGVIEPGSRAACEASPSGDI AVIATESTIRSGAYAEAILRRRPEAQVRSLACPLFVPLAEEGWFDGPIAEGVVSRYLEPL FEKSPAPDCLVLGCTHYPMLAAAIRKVVGPEVHIVDSAATTAEVVKRRLADKGLAHPQAG RTGRIRFFITDDPQRFTRTGSLFLNMTITDSDVRLVDLENVPLPDAAETPTQRKSS >gi|316922589|gb|ADCP01000106.1| GENE 5 7294 - 8220 914 308 aa, chain - ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 1 285 1 281 290 283 47.0 3e-76 MRNPVPRIAAIHDMSGFGRTSLTVAIPILSCMGIQVCPMPTAVLSTHTVEFTDYTLCDLT PELGGILDHWERLGLHFDGVYSGFMASPEQMDSAARCIRNCLAPGGLAVVDPVLGDNGIL DPTMTPEMVEKMRWLISCADIITPNITEVALLLDEPFTPRISPEEIKGRLRRLSAMGPQT VVATSVPLLEGGRDPGQNTSVIAYERDEDRFWRIDCAYIPAHYPGTGDTFSSVLTGSLIQ GDSLSIALDRAVQFVTLGIRATFGQGLPSREGILLERILGSLFAPVSTCKCRIMDGDGCC NPVLPFED >gi|316922589|gb|ADCP01000106.1| GENE 6 8494 - 9282 956 262 aa, chain - ## HITS:1 COG:aq_1464 KEGG:ns NR:ns ## COG: aq_1464 COG1187 # Protein_GI_number: 15606629 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Aquifex aeolicus # 5 260 4 239 249 181 43.0 1e-45 MEKTVRINKALADAGVCSRRRADELIAAGRVSVNGEPVASPGLQVTPGTDKLEVDGKPVQ PIAQAPCYLLLNKPVRVVSTAYDPEGRTTVLDLVPAKWKARRLYPAGRLDFFSEGLVLLT DDGELTNRVVHPRHHMPRVYHVLIRGGVSDRALEVMRKGMTLAEGEKLAPVEIRVLPGQV HDLSGVSRNAGPPAGQRGTLLEMTLHQGLNRQIRRMCRDLHLTILRLVRVAQGPLRLGVM KPGEARELTPAEVAALKKAVEM >gi|316922589|gb|ADCP01000106.1| GENE 7 9387 - 11213 2307 608 aa, chain + ## HITS:1 COG:RSc2200 KEGG:ns NR:ns ## COG: RSc2200 COG1132 # Protein_GI_number: 17546919 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: 
Ralstonia solanacearum # 24 595 22 592 592 397 39.0 1e-110 MSKKKETKEERSERRRKELRLALRVFRYFLPYKWAMVVALLASAVVSGTTAGTAWLIKPA LDDIFIRQDGTALLYVPLAFIVLTALKGAGRYLQNWCMHYSALHVLETMRQELFQKIITL PLHFYEESQVGALMSRVINDVGMIRQSLPAFIQIIRQVITMISLLYVVFQQNFELACWAI IVLPVAGFPLSLFSRALRRYGRKNAEVNASISSMLQELLSGIRVIKAFATEKQETGRFNK ENARIININFRQSCVSELSSPVMELIGAIGIGLVIWYGGREVIQGDMTPGTFFAFMGALA MLYTPFKSLNGANMNVQNALAGAERVFAILDDPALKTERGGTLPLDEPFRELTFRDVSLH YGDESTPALRGVSLTVKAGERIAFVGPSGAGKTTLVNLIPRFYDPQRGEILLNGKPLKEY SLASLRRSVSMVSQDAFLFDMTIAENIAYGRDLSSELDMDRVRSAAVAAYADGFIRELPE GYETPIGERGVRLSGGQKQRLTIARALLKDAPLLILDEATSALDSESEHMVQKALDNLML NRTSLIIAHRLSTILEADRIVVMECGRIVDIGRHEELLGRCELYTRLYNMQFRTHENICG CETDLVES >gi|316922589|gb|ADCP01000106.1| GENE 8 11213 - 12775 1234 520 aa, chain + ## HITS:1 COG:jhp0255 KEGG:ns NR:ns ## COG: jhp0255 COG2121 # Protein_GI_number: 15611325 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Helicobacter pylori J99 # 12 176 18 176 217 100 33.0 5e-21 MPHSGKSLLGALSAPVYWLYRLWCRSLRYTEINRAAIENTTDQGRPVVLSLWHDELFPLI YLKRRLNIIALVSQSDDGDLLAGVLERMGLETARGSSSRGGVKALLAAARRMRESGICGC VTVDGPRGPRHEVKEGAIFLAARADAPIVPIRLFMERRKVFKSWDRFQLPLPFSRVSMVC ADAYRVECDIRDPEAVARECRRLEEKLEALKAPELPPSPSLLHQLSQAFSRAMCRVGYGF ALLLGKLGFSRIRSLARGLGSLLWTCLPKRRRLATESIARHLELSQATAESLARASFTHN ARSFLESVLVPEFGLSHPLLDVERPDLLERLKRGERPSVITTAHLGAWELLASLLGDVSD HPRLTVVRTYKNKLMDYVTTRLRSSHGADVIGHREAAFPVLRALRKNGYAAFLADHNTSR SEAFFLPFLGEEAAVNKGPAVLAVRAKALVWPIALIRDGDRYRIIIEEPLDTALLEGDAE EKALAVAAFYTEANERMVRRAPDQWFWMHNRWKTKRVMDD >gi|316922589|gb|ADCP01000106.1| GENE 9 13168 - 13914 898 248 aa, chain + ## HITS:1 COG:ECs2148 KEGG:ns NR:ns ## COG: ECs2148 COG4221 # Protein_GI_number: 15831402 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Escherichia coli O157:H7 # 3 247 2 247 248 286 53.0 2e-77 MPVAYITGATSGIGRATAERFIEEGWTVVAMARREERLQELQDAHPGSVHCFKLDVRDKA 
AVEHVFSEAKERFGAPDVLVNNAGLALGLEPAQACSLDDWDTMVDTNIKGLLYCTRAALP GMVERHSGHVVNLGSIAGTYAYPGSNVYGASKGFVLQFSRGLRCDLHGTGVRVTDVEPGL LESEFSNVRFKGDESRFDTLYENAHPLRPEDIADTIWWVVSRPAHVNVSQVEVMPTTQSL ASARVFKG >gi|316922589|gb|ADCP01000106.1| GENE 10 14177 - 14833 853 218 aa, chain - ## HITS:1 COG:no KEGG:Pecwa_1808 NR:ns ## KEGG: Pecwa_1808 # Name: not_defined # Def: hypothetical protein # Organism: P.wasabiae # Pathway: not_defined # 74 186 86 195 237 95 43.0 2e-18 MNFTFQRNPERLLDLCGLSPIRDRRQEKLFMEKAAALDARDLRPLVLVIGGFLDQILGNS YAVSARYPADLRARHDVWFREHYESRKMRDIVNIYASKGHSVALIGHSWGGDAAVNLVAR KLDAPIDLLISLDPVSRKGAPRRKIPNVRHWLNIHIDYSQSTWLDIPNLVARIGGPWEAA ENADVNVSCPPDMTHAWAWGMFERYGEKVLRERAEGWK >gi|316922589|gb|ADCP01000106.1| GENE 11 15102 - 16280 1748 392 aa, chain - ## HITS:1 COG:VNG0831G KEGG:ns NR:ns ## COG: VNG0831G COG5557 # Protein_GI_number: 15789982 # Func_class: C Energy production and conversion # Function: Polysulphide reductase # Organism: Halobacterium sp. NRC-1 # 15 370 23 408 435 108 26.0 1e-23 MLELILKGRPRFYLWLCFLGAIFGLGFIVYIFQANIGLAITGMSRDVSWGLYIAQFTYFV GVAAGAVMLVLPAYFHHYEKFKRIIIFGEFMAVGAVIMCMLFIVVDLGQPQRALNVLLHP TPNSVMFWDMTVLFGYLLLNIIIGWVTLEAARNDVKPPKWIKVLIYISIIWAFSIHTVTG FLYAGIPGRHYWLTAIMAARFLASAFCSGPAILLLLLMAVRRVTGFEPGREAMKTLSTII TYAMCVNVFFYLLEIFTAFYSGIPGHQHSILYLFTGHDGHMAWINSWMWTAVVFAVLSLL MLIPPALRYNDKILPWALILLVIASWIDKSLGLLIGGFVPNMFETVTEYTPTIPEILVAL GVYGLGGIIISVLWKIAMDVKKENGTFALKDN >gi|316922589|gb|ADCP01000106.1| GENE 12 16290 - 17087 989 265 aa, chain - ## HITS:1 COG:AF0499 KEGG:ns NR:ns ## COG: AF0499 COG0437 # Protein_GI_number: 11498110 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Archaeoglobus fulgidus # 42 263 26 263 269 184 42.0 2e-46 MKSRRDFLKIAGLSAFALGTASALTLATGSKASALPAGTAPAKGSYTNEPDALHAKRWAM VIDTRKFTSPEHFQRVIDACHHAHNVPTIPGNQDVKWIWTDKYAHVFPDDVNAYMPEKFL HGDFLLLCNHCENPPCVRVCPTAATFKREDGIVVMDPHRCIGCRFCMAGCPFGARSFNFR DPQPYVKDVNPEFPMRTRGVVEKCTFCTERLAQGKLPACVEASEGAMIFGDLNDPGSPVR 
QALSENFTIRRKPTLGTQPGVYYII >gi|316922589|gb|ADCP01000106.1| GENE 13 17087 - 17473 642 128 aa, chain - ## HITS:1 COG:no KEGG:DVU1288 NR:ns ## KEGG: DVU1288 # Name: not_defined # Def: cytochrome c family protein # Organism: D.vulgaris # Pathway: not_defined # 1 127 1 125 126 186 66.0 3e-46 MYNAKYIIPGVLIAVVAFTSPFWLNLGGKTYVYPEVALPTGEGKDKCIESKEWMRAEHMA LLNTWRDEAIREGKREYVATDGRKWVISLQDTCMACHTNKADFCDKCHNSNNVDPYCWTC HIAPRGNN >gi|316922589|gb|ADCP01000106.1| GENE 14 17477 - 19141 2207 554 aa, chain - ## HITS:1 COG:AF0502 KEGG:ns NR:ns ## COG: AF0502 COG0247 # Protein_GI_number: 11498113 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Archaeoglobus fulgidus # 1 540 1 553 555 489 47.0 1e-138 MAKMPKPEEMIASRASFPEKSWMDVKTEIKAGMYAYPAKVDIMEGLHMPHPHAWQPEEED WHLPENWEEIIYKALKERLAKHRSLKIFMDTCVRCGACADKCHFFLGTNDPKNMPVLRAE LLRSVYRRDFTTAGRLLGKVAGARKLTTDVIKEWFTYFYQCTECRRCSLFCPYGIDTAEI TAIVRELLHELGLGINWIMEPVSNCNRTGNHLGIQPHAFKEIVEFLCEDIETITGIHIDP PFNEKGHEILFITPSGDVFAEPGIYTFMGYLMLFHELGLDYTLSTYASEGGNFGSFVSFD MAKKLNAKMYAEAKRLGSKWILGGECGHMWRVINQYMATYNGPAPEGMMDVPTSPITGTR FENARATKMVHIAEFTADLIHHNKLNLRPERNSGIITTFHDSCNPARAMGLLEEPRAVLR AVCPEFVEMPPHTIREETFCCGSGSGLNTEEIMELRLRAGFPRGNALRYVQEKNGVNWMS CVCAIDRATLPPLANYWAPGVTVSGLHELVANALVMKGEQPRTMNLRQEDLECPDPEPEE EAVEELAASAEEDN >gi|316922589|gb|ADCP01000106.1| GENE 15 19154 - 20185 1350 343 aa, chain - ## HITS:1 COG:AF0501 KEGG:ns NR:ns ## COG: AF0501 COG2181 # Protein_GI_number: 11498112 # Func_class: C Energy production and conversion # Function: Nitrate reductase gamma subunit # Organism: Archaeoglobus fulgidus # 26 305 5 286 332 226 42.0 4e-59 MIVSLVAVLLIGAIAWSGAVAGFQSLLGVALPLLAAVVFVCGIIWRITCWWAKSPVPFAI PTTGGQERSLDWIKPNRLDTPMSNAAVVGRMILEVLLFRSLFRNTSAEVSSVGPRVTYFS SKWLWVFALTFHYCFLVIFIRHFRFFIEPVPVCLTWLEFFDGIMQVGVPRLFMTDMLIVV AVLFLFGRRLANPKVRYISLANDYFPLFLIMGIIGTGICMRYFDKVDIAQAKVFIMGILH FTPQSAVGLNPLFFTHVALVSVLLIYFPFSKLMHMGGIFMSPTRNMRCNTREVRHVNPWN NPNAPYRTYAEYEDDFRDLMAEAGLPLEKQPEDVEPAAPAPSK 
>gi|316922589|gb|ADCP01000106.1| GENE 16 20275 - 20802 560 175 aa, chain - ## HITS:1 COG:no KEGG:DvMF_0071 NR:ns ## KEGG: DvMF_0071 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 165 1 165 181 141 43.0 8e-33 MSTEDIISARRDAAIQKWTEAVFAMYPFETTGFIRTQRDQFANPVGHATRAAGEQIYDAV TGRDVDMEKVHASVAALIRIRAVQDLKPEQAVGVLYLYKSVLRELLLADMLAAGDVQGFL DMGDRLDTLCLMAFNLYLADREQVYAERVAQQRREASQIRRWAARHGLVENQDGE >gi|316922589|gb|ADCP01000106.1| GENE 17 21331 - 25713 5478 1460 aa, chain - ## HITS:1 COG:CAC2401_1 KEGG:ns NR:ns ## COG: CAC2401_1 COG1924 # Protein_GI_number: 15895667 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Clostridium acetobutylicum # 1 671 1 663 663 711 51.0 0 MERSVFLGLDIGSTTAKLALVAADGKLLEAQYLRHGAAVRQTLSTMLDQLALKYPGMAVS AAITGSASLGLSETLGLPFVQEVVAASRAIAVLAPQTDVAVELGGEDAKILYLSQGMDLR MNEACAGGTGAFIDQMAALLHTDASGLNELAENYTTLYPIASRCGVFAKTDVVPLLNEGA AREDLAASIFQAVVEQTIGGLACGNPIRGKVAFLGGPLHFLPELRKRFIATLKLAPEEVV PFPNAQYMVALGTALSLVDLPGAARMADMEPVTLGTLAERARVRGDAGDRTASLPPLFDN EADYDLFRERHNRDAAPRKPLESASGPLFLGVDLGSTTVKAVLADADGAVLTSWYERNQG DPLAGLLPYLADLIDSLPEGAWIQSSVATGYGAQLAQAALGSLSADVETVAHLKAACRLV PDATYVIDIGGQDMKCLKARDGLIAGVTLNEACSAGCGAFLETFARSLNLSMEAFVRAAL FARHPVDLGSRCTVFMNSKVKQAQKEGADIGDIAAGLCYAVARNALYKVLRLRTPAELGD RVLVQGGSFLNDALLRVMERLLDHHVCRPDIAGLMGAYGAALLARSRTPDGAAPTPLTSA SLRSLDISAKTLRCKGCGNHCLLTVNRFSNGRKLVSGNRCERGASGGEAARPSMPNLYAW KERRLFGYEPLPQDDAPRGRIGIPRVLNMYEHYPFWFTLFTELGYRVELSPPSGKELFDL GLSSMPSQTVCYPAKLAHGHIASLLRKGLKRIFFPCLPRERRESKDAADGYNCPVVAGYP EVIRLNTDELLEQDCTLYTPFVSLEHPDTLVAALHDLFGIPKKELRAAVRVAALEQEAYR TELRAEGERVLAELERTGKLGIVLSGRPYHADPAVHHGLPELVASLGAAVLSEDSVAHLG HPAEPLRVVDQWTYHSRLYRATALVCTRPNLELVQLTSFGCGLDAITADQVSELLTSAGK LHTLIKIDEGASLGAARIRIRSLLAAVEERRETPPTPRPAASAFRPVFTRPMRQTHTILA PQMSPLHFDIIGKAISSAGYNLEILPTVSREAIETGLNYVHNDACYPAIVVIGQLIDALQ SGRCDPRRTALMLAQTCGPCRATNYPALLRKALCEAGFGDVPVLTLSGGSLNRQPGFAIS 
ARLLHRLILGCLYGDMLQRVSLSTRAHERNAGDTDDRLAHWMNRVKFSAARGDSSLFKSH MRSIVRDFSAIRQAHAPLPRVGIVGEILLKYHPDANNQVIRHIMEEGGEPVLTDLMDFFL YCLLDPVYLWRHMGGKAFPAFSNWLLIKRIESLRDAMRRALEGSRFLPVSRIADLARSVR GIVSTGNQAGEGWLLTAEMLELIDHGVGNVLCLQPFGCLPNHITGKGVLKELKRIRPHAN LMAVDYDPGSSEANQLNRIKLFMTVAHSRIPQEAQGDHAPASLNESVAQLAGRLRDNLKG KLENTIGGSGNSGDEVTNLA Prediction of potential genes in microbial genomes Time: Fri May 13 03:53:33 2011 Seq name: gi|316922585|gb|ADCP01000107.1| Bilophila wadsworthia 3_1_6 cont1.107, whole genome shotgun sequence Length of sequence - 3959 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 61 - 1398 1418 ## COG0860 N-acetylmuramoyl-L-alanine amidase - Prom 1426 - 1485 4.7 2 2 Op 1 . + CDS 1589 - 2551 910 ## COG0679 Predicted permeases 3 2 Op 2 . + CDS 2564 - 3907 1760 ## COG0534 Na+-driven multidrug efflux pump Predicted protein(s) >gi|316922585|gb|ADCP01000107.1| GENE 1 61 - 1398 1418 445 aa, chain - ## HITS:1 COG:Cj1269c_2 KEGG:ns NR:ns ## COG: Cj1269c_2 COG0860 # Protein_GI_number: 15792593 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Campylobacter jejuni # 193 441 76 324 329 130 34.0 8e-30 MKITQPLKKPSPHYGMDTSQQPRDENKITHKNSLFLSWCILLALCFAFVSAVPAAAGPNP AKYDKAIRDMGQLQKSKKRLQREPWEKLAETFLTVYRVEKKWKERSAALFRSAEALDHLA RCASNAKDARRSVDRYLQLVRLYPKSSLADDSLYRAARLRGQILRDKAGAQELLQQILKK YPSSNTAKDASSYLATLSPKEKRQSPSAASKASKQKQPRGKPFRLGVKTVLIDPGHGGKD PGTHHNGIREKDLTLDISKRVGAILSSRGLNVRYTRRSDTWITLEQRADKVRTNKADLFI SIHVNANPSEGVQGFETYYLDVSRTSASTRLAAVENALRDRSRATREKLPPHRLFTIQKQ ESRRLARNVHETTLKYLRKKNYRTHDGGIKTAPFHVLRRSGVPGVLIEVGYCTNKTEAER LAVDAYRASIARGIANGILIYTGKM >gi|316922585|gb|ADCP01000107.1| GENE 2 1589 - 2551 910 320 aa, chain + ## HITS:1 COG:AGc3303 KEGG:ns NR:ns ## COG: AGc3303 COG0679 # Protein_GI_number: 15889099 # Func_class: R General function prediction only # Function: 
Predicted permeases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 317 1 312 312 89 23.0 9e-18 MLSILSTLVPVFGIILFGIVVERMGFLPIETSGCLNQFVYWIGLPMMLFNQLARMEAGQM SGAMVGGILLGYGITYLFAYVVFSSLLRRRWEESSVFALLSSFPNAAFMGLPIVVLLLPD SSEAAIVASLCAVMTSANLLFTDGRLEAGKHKGEGRKQVFLSLLRSLFHNPLLIYSALGA AVSLLHIAVPKPILSMSAMLGSTSAPCALFCMGMILSKQMTSSQGFVKGWARRQFPLHVL KLVVEPLFIFGVLYLLGVRGIALASATIVAAMPTAVAAYIIAEKYQVATEDSSLGIVVDT ALSAVSIPLLIVAFQYYGLL >gi|316922585|gb|ADCP01000107.1| GENE 3 2564 - 3907 1760 447 aa, chain + ## HITS:1 COG:BS_yisQ KEGG:ns NR:ns ## COG: BS_yisQ COG0534 # Protein_GI_number: 16078145 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus subtilis # 12 444 21 454 455 311 41.0 2e-84 MPQEQQNIRRWLVRLTGPIFIEMVLIILLGIVDTVMLSQCSDNTVAAVGVVVQLLNMIFL VFEATTAGTSVLCSQFLGARQMDNVFRAIGVSLAFNMMMGVIVSAALFFGAERILRLMEL RPELMGDGVTYMRIVGGFAFFQAISLTLSAILRSAGMAYYPMQVTFLINIINIIGNYTLI FGHFGFPALGVEGAAISTSFSRGVAMCLLLFILFRKKLTFPPSFLWPFPFGVLRKMLSIG LPSGGEQLSYSLSQVVITYFVNMLGNEALAARTYAVNIVTFSYVFAMSVGQGGGICIGHL IGKGHKNAALLLGRYCIRITLMISFSVALLSALLGKSVMGLLSDNPEVVSLTALILCIDV VLELGRAVNILCVNFLRATGDAVYPFIIGLIFMWGVATVGGYVLGVTLGFGLVGMWIAFT MDESIRAALFWRRWQSRRWMGKSIVAR Prediction of potential genes in microbial genomes Time: Fri May 13 03:53:35 2011 Seq name: gi|316922582|gb|ADCP01000108.1| Bilophila wadsworthia 3_1_6 cont1.108, whole genome shotgun sequence Length of sequence - 3055 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 36 - 773 926 ## COG0730 Predicted permeases 2 2 Tu 1 . + CDS 787 - 1041 92 ## 3 3 Tu 1 . 
+ CDS 1370 - 2977 1754 ## LI0781 5-enolpyruvylshikimate-3-phosphate synthase Predicted protein(s) >gi|316922582|gb|ADCP01000108.1| GENE 1 36 - 773 926 245 aa, chain - ## HITS:1 COG:AGc1383 KEGG:ns NR:ns ## COG: AGc1383 COG0730 # Protein_GI_number: 15888104 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 241 1 235 243 90 31.0 2e-18 MSGFTPLLFAYVFTAVFLAGIVRGATGFGFSMIMIVLLTLFFPPAQVAPVILFWEVLASI GHLPFVYKQVHWKSLRWLALGVALGTPFGVYCLVSIPVDAMRLIINAVVLILTSMLYCGL RPKNAPTPPQTTGVGLLAGVINGASANGGPPIILFFLSSPLGAAVGRASLIAFFLFTDVW ASLFYWQQGLISLDTIIFTLVFMVPMFGGMWFGNRWFSTVDEARFRKVVLALLMIISVVG LVKAL >gi|316922582|gb|ADCP01000108.1| GENE 2 787 - 1041 92 84 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVRGRMPVPENALRERGIRDRPNVGVSPLSSEPQGGKDNGLRLRVKAFRGIITSLYDSSF RKLRLAYGTFISRLELRPLTSTPP >gi|316922582|gb|ADCP01000108.1| GENE 3 1370 - 2977 1754 535 aa, chain + ## HITS:1 COG:no KEGG:LI0781 NR:ns ## KEGG: LI0781 # Name: not_defined # Def: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: L.intracellularis # Pathway: Phenylalanine, tyrosine and tryptophan biosynthesis [PATH:lip00400]; Metabolic pathways [PATH:lip01100]; Biosynthesis of secondary metabolites [PATH:lip01110] # 1 516 60 572 574 384 37.0 1e-105 MLTRLPNAGTSDHERELRTSWEENASAVSRDPKLIRQIFALLQEVEVAPADMEQPSAFNL APARKALAVELPAPASDRLPRVRMVLAASGATECTLHGVPLNGPVMECLKGLNQVGARLR WEEDGRILCQGGEPVSGYNKSILDKVVHVGDDPFNLYLMLFQMVTRPARLKIIGESGLKF VDLAPIRHFLPLLGARLTSVVPGQEGLPARLESSAMLPSDVAVPAELPADALEALLVATA GWERDVTVDLSGHAEGRNIVSKVLPILQSAGIKASLTDTEDTLSLHVVPGKAQFSDELLP GINVVAAATLLAMPAFVGGSVKLSGRWEATGDARSGDAVKGLFSKLGVNLSVADGELSAS FGEGIAEGEPLPSPNPVDDLTGLPSALLPLGLALSLIPAVRAKGGVMPKLPESVDKVLVE SFLNQLGLSCEDGRLVSIEPSTTPWASPTVQWALALSLAAFLRPNIKLSNPGIVTNYLPV YWNIYNTLPTPSMTRKAQEEETAPAANSRPARRRVLASHTPESEMPDEIVYPDED Prediction of potential genes in microbial genomes Time: Fri May 13 03:53:51 2011 Seq name: gi|316922577|gb|ADCP01000109.1| Bilophila wadsworthia 3_1_6 
cont1.109, whole genome shotgun sequence Length of sequence - 3651 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 132 - 187 18.3 1 1 Op 1 . - CDS 267 - 1301 830 ## LI0798 hypothetical protein - Term 1386 - 1424 1.2 2 1 Op 2 . - CDS 1455 - 2117 540 ## COG0457 FOG: TPR repeat + Prom 2055 - 2114 5.3 3 2 Tu 1 . + CDS 2139 - 2543 300 ## LI0797 hypothetical protein + Term 2568 - 2614 -0.7 + Prom 2563 - 2622 2.9 4 3 Tu 1 . + CDS 2687 - 3610 787 ## Predicted protein(s) >gi|316922577|gb|ADCP01000109.1| GENE 1 267 - 1301 830 344 aa, chain - ## HITS:1 COG:no KEGG:LI0798 NR:ns ## KEGG: LI0798 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 287 1 290 329 213 37.0 9e-54 MLTNRSHSPLYLPLLIAVCGILFSVWNALDAASVPCVSAGCTLYQNFSINGYSLWWGGVG IFSLLGLLALTGRAVLGRWLAGLAVLLDCVLLGIMILTLPCVACMVAALLLALSYITFRA ATLADTHRSALHHASPLLVLWGALFLLTLGGLARSEVGPWAIQAPEMEDETTARVFFSPS CSACRQLVMGMSEADARKIIWCPVAEEEKDLAIILNLKKRIEMGTPLTRSFLPSLETEPM TFWDLFSLEVLTTQFRLWCNDARVISSSEGRLPLVEFMGIPSALVKAKSPASPAARPTAP SAPSEPMFGGQSVLPGQTAPQDHGLPFDMGNSGSCGGPNAVPCP >gi|316922577|gb|ADCP01000109.1| GENE 2 1455 - 2117 540 220 aa, chain - ## HITS:1 COG:alr1677 KEGG:ns NR:ns ## COG: alr1677 COG0457 # Protein_GI_number: 17229169 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. 
PCC 7120 # 32 201 20 187 280 67 30.0 2e-11 MIMNIESFNISKNVLYKLINKINPVRFRAIAGGLALLSILALPGCSLLEPEKMPPKPANP TGVSGKHDPQAEQLFAKAHVLWKGETCTDPEKALEYLDEALKIEPDYPQALIRRGLALSQ LGYADDAFDDLTKAIRLEPSAEAYLSRGICLLQQGNTAGARKDLEEALRRDDRSYRVWNI LGAVSLKEGKEQEACDAFEKACSSGDCAGIEAARREKICK >gi|316922577|gb|ADCP01000109.1| GENE 3 2139 - 2543 300 134 aa, chain + ## HITS:1 COG:no KEGG:LI0797 NR:ns ## KEGG: LI0797 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 125 1 125 128 77 33.0 2e-13 MFRHKVPAIKVARDRLLDLPVEQARTGALVLGGLRRACELADTLDSCITEDMTFAKHFFN ELATLPHDDESHWMNLLEDLALIFRAKRLAFPDLPEEGEERRLLEFFETSEEWGDPETEV GSWYWKLLPERLSR >gi|316922577|gb|ADCP01000109.1| GENE 4 2687 - 3610 787 307 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRSGILALWLILSLVSVVTSGCGYRVAATIQANAEATPVPTKLSWAAVPVPALSASAPE TLSGLQTVSVPLATALSLYNWKWLSNPDQAAEADVLVRIWWMTDGPQYITERADPFYRPG LSFGTGIGFGSSPWHRGPFGYARQAFYVPEPSIQAIYSRVLVVEALRADALPKATLEALL PAAKPSGIASKPAPAAKPDGDPSKGPYAPPIALEGEALSKPPYAAPLKATGADPDQEPPY APPLLASASQGVPGGAVLWRVVVTSGGSKGNTYEILPQLATAAAQAVGKNMQADVFVDSD MRVTFGK Prediction of potential genes in microbial genomes Time: Fri May 13 03:54:17 2011 Seq name: gi|316922574|gb|ADCP01000110.1| Bilophila wadsworthia 3_1_6 cont1.110, whole genome shotgun sequence Length of sequence - 2781 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 24 - 536 556 ## COG1247 Sortase and related acyltransferases 2 2 Tu 1 . 
- CDS 720 - 2645 2240 ## COG0303 Molybdopterin biosynthesis enzyme - Prom 2665 - 2724 1.8 Predicted protein(s) >gi|316922574|gb|ADCP01000110.1| GENE 1 24 - 536 556 170 aa, chain - ## HITS:1 COG:SA2317 KEGG:ns NR:ns ## COG: SA2317 COG1247 # Protein_GI_number: 15928108 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase and related acyltransferases # Organism: Staphylococcus aureus N315 # 1 158 1 158 163 192 55.0 4e-49 MIRPATEADLQAILDIYNDAVINTTAVYTYTPHTLDMRRQWFNEHREAGLPVFVLEEDGV IAGFATYGNFRPWPAYKYSIEHSIYVHKDFRRRHIATRLMEKLLEAANGAGYATMIAGID ADNAASIRMHERFGFTFAGKIVKAGYKFGRWLDLVFYQKLLDGPKQPVGE >gi|316922574|gb|ADCP01000110.1| GENE 2 720 - 2645 2240 641 aa, chain - ## HITS:1 COG:CAC2020_1 KEGG:ns NR:ns ## COG: CAC2020_1 COG0303 # Protein_GI_number: 15895290 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Clostridium acetobutylicum # 31 410 30 407 407 245 36.0 2e-64 MKRNIYLKTIPPAEAVARVKASIDREALIREETITSHEASGRVLAHAVHARYSSPTFHSA AMDGYAVNARSTFAAREGHPLTLKAGETCFAVNTGHPMPEGCDAVIMIEQVVLDGGSVTI EAPAFPWQHVRRIGEDIVATELLFPRNHQLRPWDVGALLSGGIWDVPVWERVTVRIIPTG DEVLDFTTHPDPGKGQVVESNSQMLAALARELGCVVERAAPVPDNPEALHQALKDSLDAG RHLTIFCAGSSAGSKDFTRATLEKEGEILTHGIAAMPGKPSLLANCRGRLVAGAPGYPVS ALVCFKELLEPLISWLSHREPPAKTVVPVELTRTVPSRPGVEEHVRVSIGRVGDKLVATP LGRGAGNITTVTRAQGDVRIPEQAEGLNEHAVVPAELSVSEAELDRILVCVGSHDNTLDL LADELMGLPEPFRFASTHVGSMGGITALKNGSCHLSGMHLFDPGSDDFNFPFIKKFLPDV DVTVINLAIRHQGIIVPHGNPMNIQGIDDFTRVRFINRQRGAGTRILLDWKLKQAGLKPS DVKGYDKEEFTHMAVAVNVLTGAADCGMGIYAAAKALGLDFVPLALERYDLVIPTRFLDD PRVQAVRALLDSPAFKARIEAQGGYDTPLTGQIMAEGEKRM Prediction of potential genes in microbial genomes Time: Fri May 13 03:54:26 2011 Seq name: gi|316922564|gb|ADCP01000111.1| Bilophila wadsworthia 3_1_6 cont1.111, whole genome shotgun sequence Length of sequence - 13744 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 
1 Tu 1 . - CDS 4 - 2019 2143 ## COG0322 Nuclease subunit of the excinuclease complex - Prom 2062 - 2121 3.5 + Prom 2093 - 2152 4.8 2 2 Tu 1 . + CDS 2186 - 3853 1203 ## + Term 4088 - 4131 10.6 - Term 4076 - 4120 12.5 3 3 Tu 1 . - CDS 4166 - 5971 2670 ## COG0481 Membrane GTPase LepA - Term 6119 - 6155 7.5 4 4 Op 1 34/0.000 - CDS 6274 - 6948 1204 ## COG0765 ABC-type amino acid transport system, permease component 5 4 Op 2 34/0.000 - CDS 6952 - 7818 505 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 6 4 Op 3 31/0.000 - CDS 7830 - 8519 994 ## COG0765 ABC-type amino acid transport system, permease component - Prom 8769 - 8828 3.2 - Term 8861 - 8898 11.5 7 4 Op 4 . - CDS 8908 - 9660 1359 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Prom 9871 - 9930 5.5 8 5 Op 1 . + CDS 10153 - 12075 2649 ## COG1166 Arginine decarboxylase (spermidine biosynthesis) 9 5 Op 2 . + CDS 12290 - 13477 1811 ## COG1748 Saccharopine dehydrogenase and related proteins + Term 13596 - 13633 6.7 Predicted protein(s) >gi|316922564|gb|ADCP01000111.1| GENE 1 4 - 2019 2143 671 aa, chain - ## HITS:1 COG:PA2585 KEGG:ns NR:ns ## COG: PA2585 COG0322 # Protein_GI_number: 15597781 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Pseudomonas aeruginosa # 7 666 12 605 608 335 35.0 2e-91 MERPASSTLPTTPGIYIYKDAQGRIIYVGKARNLRKRILSYFRDASALTPKTVAMIGHAA SLETLSTTTEKEALLLEASLIKKHRPHYNIVLRDDKQYVLFRIAKDAPYPRLEVVRQARR DNARYFGPFTSGQAARETWKTIHRVFPLRRCVDRAFKNRVRPCLYHHIGQCLAPCTENVP VEEYASLIHRVELLLSGRSRELLDTLRHAMADASEAMNYEQAAVFRDQIKAIERTVERQS VVLPEGGNMDVAGVAPAKGGLALGLLFVREGRLVDGRTFFWPDLELAEGPELLWSFLGQF YGPQTSIPPRIVVPWLPEDTVLPGMSGEQDAPDSASPVKDHIATLEGEAEPPHAAGTDSN GHAPDEVESLEALEAALADARGGIVRIAKPRNAAEARLVDMAVSNAREAAVTKSEVPMSD RLAAAFALDKVQSALPGAVPMASRPIRRIECVDVSHTGGTSTRVGMVVFEDGQPQKSDYR TYALEGEAACNGDDYAALAEWMRRRLESGPPWADLVLIDGGRGQVSAVQRIVRESNAEGL 
FLLAGIAKARDDMGRADRRAGNVGDRIFLPGRSNPLPLRDGSAELLFLQHIRDTAHHFVI GRHRRARAGAALSAELLRIPGVGKATARLLWDHFKALDAIIAATPAQLAAIPGIGPRKAE ALAGSLKRLKE >gi|316922564|gb|ADCP01000111.1| GENE 2 2186 - 3853 1203 555 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MARKDPYDPEEGIIDLFDIVDDPEASPEPEPRGALHDDGGQALPGQPEAADPETPTGEES ERVVSRYPFENSDDFLEEIEKTPLMTLLPPKGEASALPEEPVSDGAEEEAPAESLGEPVR EEPDEALIRAFEAEMAKAGAVVHGYGVETAPGMADASGVPDETEAPALSAGAVDQEFAEM HLEAQEPPIPEPVERQYAEDLELLFGDAPVADAPQVEPSPVAPEPAMEAPMLVSPEGEEG SKADPGLSQQAEAAPEGAVEEPSEALPSSVSADEVAEPLPEPSGDGADRALAFVPVAVPD VPVSAAVPQAEPSCGHEMEERIIRLEEALSRLNERVVALEQRADGAAEQDLNVSAISGDI QSLLTEGNALCGQLKSLAAELGSTPSGVSAEAPEAAPLPGSSAEVAEEPTFPPVPSRSAE IPWESSSQEADPDDGPDVLGLALESLERRLALLESRPVQAPDAAGIVQDVLALVRVDMEK ASEEQEASARALEQLERRVQELESRPLPQLILPDLPDTEAITADVMSRIQNELDRIAAEA AARVLREEIAGLMGK >gi|316922564|gb|ADCP01000111.1| GENE 3 4166 - 5971 2670 601 aa, chain - ## HITS:1 COG:BH1342 KEGG:ns NR:ns ## COG: BH1342 COG0481 # Protein_GI_number: 15613905 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Bacillus halodurans # 4 600 10 606 609 785 63.0 0 MPVQEHIRNFSIIAHIDHGKSTLADRILEITGLVSEREKRDQYLDRMDIERERGITIKAQ SVRIPYKSKNGKQYLLNLIDTPGHVDFNYEVSRSLAACEGALLVVDATQGVEAQTLANVY LALDHNHEIIPVLNKIDLPSSDVERVKSEIEEDIGLDCSKALPVSAKTGDGVPDVLEAIV ERLPAPEGNPNAPLKALIFDSWYDSYQGVVVMFRIMDGTLRKGDKIELFATKKTYEVIRL GVFSPESRDMDQLSAGEVGFLCGNIKELGDAKVGDTITLAERPCAEAVPGFKEVKAVVFC GLYPSDPSDYEQLKFALEKLQLNDAAFSWEPETSQALGFGFRCGFLGLLHMEIIQERLER EFQVDLIATAPSVIYRVETTDGKTTEIDNPSKLPDPARITNLYEPYVRIDIHVPSDYVGN VMKLCEEKRGIQKNMGYLAQNRVVITYELPFAEIVFDFFDRLKSATRGYASMDYEVIDYR ASNLVRLDILLNGEAVDALAVIVHKDKSYAYGRALALKLKRTIPRQMFEVAIQAAIGQKV IARETISALRKNVTAKCYGGDITRKRKLLEKQKEGKKRMKRMGNVELPQEAFLAALQVGD E >gi|316922564|gb|ADCP01000111.1| GENE 4 6274 - 6948 1204 224 aa, chain - ## HITS:1 COG:BMEII0600 KEGG:ns NR:ns ## COG: BMEII0600 COG0765 # Protein_GI_number: 17988945 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, 
permease component # Organism: Brucella melitensis # 20 217 37 234 244 140 42.0 1e-33 MQWLDPDFIFQTVLPALNRGLVMSIALIVPSATIGFVLGVLTGVARVFGPKWLRSLGNGF TTIFRGVPLVVQLMILYFGLPNLGIYLEPYPASVLGFILCTGAYQSEYVRGALLSIRQGQ IKAAYALGFTKLQTILWVVIPQAARRALPGCGNEIIYLIKYSSLAYIITCIELTGEAKVL VSRSFRPTEVYIVAGIYYLVLVSFATWFLQKLERKFAIPGFGKK >gi|316922564|gb|ADCP01000111.1| GENE 5 6952 - 7818 505 288 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 6 245 1 242 245 199 41 1e-50 MNTTPILRATNIVKKLGGKTILNDVSLDVHKGDVKVVIGPSGAGKSTFLQCLNYLLPPDS GDIWLEGKKVNARDTRELCALRQQVGMIFQDFNLFDHLTAEENVSIALRKVMGCNKAEAR NRALTELSRVGLAKRAALYPAQLSGGQKQRVAIARALAMDPKVMLLDEPTSALDPELVGE VLSVIRDLADGGMTMIMATHQMDFARALATDILFMEQGKIIEQGAPDVLLAPGSGTRTSD FCGKLFDLRGTEKSDGPTVSELLTGDLSHDITDDGSESSMIPDAPKKD >gi|316922564|gb|ADCP01000111.1| GENE 6 7830 - 8519 994 229 aa, chain - ## HITS:1 COG:TM0592 KEGG:ns NR:ns ## COG: TM0592 COG0765 # Protein_GI_number: 15643358 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Thermotoga maritima # 8 220 5 216 216 137 38.0 1e-32 MLEIVDKIPVILNALPYILQGSVVTLITVIGSLALGFCIGLPLAVLQVYGPSFIRRLIGV YVWFFRGMPILLLLFLFYFGLFEVIGLNLSTITASCLVLGMASAAYQSQIFRGSIETLPV GQFRAARALGMNDGQAIRHIVLPQALRLSIPGWSNEFSILLKDSALCFVLGTPEIMARTH FVASRTYEHLPLYITAGLLYFGITLVGVHILRVLERKVHIPGYAVSGSM >gi|316922564|gb|ADCP01000111.1| GENE 7 8908 - 9660 1359 250 aa, chain - ## HITS:1 COG:TM0593 KEGG:ns NR:ns ## COG: TM0593 COG0834 # Protein_GI_number: 15643359 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Thermotoga maritima # 12 238 10 236 246 133 37.0 3e-31 MKRIALRWLGALLLSITLAAPALAAGKHLINGIDANYPPFAYVDETGKPAGFDVDSMNWI AKKMGFTVEHKPMDWDGIIPALLAKKIDMVCSGMSISPERKAQANFSDPYWTIRKIFITK KGSKVTEDQIYNGKIKLGVQRGTNEHEMLQKAQAEKKYNYQLRFYDSGPMAIEDLLNGRI 
DAIGLDAAPAEDAMRKGKPVQEVGVFGSDEFGVAIRKDDAETMKLVNEGYKLLKADPYWK ELQAKYLGNK >gi|316922564|gb|ADCP01000111.1| GENE 8 10153 - 12075 2649 640 aa, chain + ## HITS:1 COG:all3401 KEGG:ns NR:ns ## COG: all3401 COG1166 # Protein_GI_number: 17230893 # Func_class: E Amino acid transport and metabolism # Function: Arginine decarboxylase (spermidine biosynthesis) # Organism: Nostoc sp. PCC 7120 # 10 638 53 677 679 582 43.0 1e-165 MAIKRTLQQWSVEDSTELYGIRNWGAGYFGVSPEGDVVIYPSGENKGVAVSVPEVIRGMR ERGYDMPVLLRIENILDSQITLLHESFRKAIASLGYKGEYRGVFPIKVNQQQHTVEKIAQ FGSRFHHGLEVGSKAELISAISQLKDPEACLICNGYKDEEFIDLGLSAVRMGFKCIFVME MPGELELILERSAALGVRPLLGVRVKLITKGSGHWAGSCGERSAFGLTTAQVVDIVDTLK QHDMLDCLQLLHYHLGSQVPNIRDIRTAVMESTRVYAGLVQEGAKMGYLDLGGGLAVDYD GSHTNYVSSRNYSMDEYCTDVVEAVMTILDQQGIEHPHIVTESGRATVAYYSILLFNILD VSIAETGESIPDVLPEDVPEPVVNLREVLQGLSVRNLQECYNDAVYYRDEMRQLFITGRV TLRQRTLADKYFWAIINRIAEEKEKLKHTPKELADIDSTLADIYYGNFSVFQSLPDAWAI DQLFPVMPVHRLTEFPSRKAVISDITCDSDGRIDKFIDPQGMRTSLDLHPLVDGDEYYLG VFLVGAYQETLGDLHNLLGDTNVVSIRISEDGSYQFVREIRGDSVADILDYVEYDPRRIL EDLREAAERAVREKRITPSERYKILQTFEDGLRGYTYFER >gi|316922564|gb|ADCP01000111.1| GENE 9 12290 - 13477 1811 395 aa, chain + ## HITS:1 COG:BH3957 KEGG:ns NR:ns ## COG: BH3957 COG1748 # Protein_GI_number: 15616519 # Func_class: E Amino acid transport and metabolism # Function: Saccharopine dehydrogenase and related proteins # Organism: Bacillus halodurans # 1 391 1 389 410 563 67.0 1e-160 MARVLIIGAGGVGSVVAHKCAQNPEVFTEIMLASRTVSKCDAIAASIKERTGRIIETARV NADDVPELVALIKRYKPVMVINVALPYQDLTIMDACLEAGVHYMDTANYEPLDVAKFEYK WQWAYQQKFKDAGLMALLGSGFDPGVTNVFAAYAQKHLFDEIHVLDIIDCNAGDHGQPFA TNFNPEINIREVTAKGRYWERGEWVETDPLSWSMTYDFPDGIGPKKCFLMYHEELESLVQ NLKGIRRARFWMTFSENYLNHLKVLGNVGMTGIEPVEFQGQQIVPIQFLGKLLPDPASLG PLTKGKTCIGCVMKGVKDGKEKSAYIYNICDHEACYAEVGSQAISYTTGVPAMIGAMMMV TGKWMQPGVWNMEQLDPDPFMEQLNLRGLPWVVQE Prediction of potential genes in microbial genomes Time: Fri May 13 03:54:59 2011 Seq name: gi|316922560|gb|ADCP01000112.1| Bilophila wadsworthia 3_1_6 cont1.112, whole genome shotgun 
sequence Length of sequence - 2980 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 58 - 1278 1161 ## COG0019 Diaminopimelate decarboxylase 2 1 Op 2 . + CDS 1278 - 2030 738 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 3 1 Op 3 . + CDS 2027 - 2686 539 ## COG0110 Acetyltransferase (isoleucine patch superfamily) Predicted protein(s) >gi|316922560|gb|ADCP01000112.1| GENE 1 58 - 1278 1161 406 aa, chain + ## HITS:1 COG:BH3958 KEGG:ns NR:ns ## COG: BH3958 COG0019 # Protein_GI_number: 15616520 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Bacillus halodurans # 8 406 4 379 379 415 52.0 1e-116 MTETYLEQLKPERFPSPGFVVDEAKLRSNVAILDEVQRRTGAKILMALKCFAMWDVFPII SRSGEGALYGCCASSPDEARLARECFGGEVHAFAAGYSEADMVELLETVDHVLFNSFAQW ERFKGMVEAKNAGRATPIECGIRVNPEHSEGAVPMYDPCAPGSRLGVRLREFERFASAGG CPAQTPPPGLKGLSGLHFHTLCEQGADALDRTLKAFEMKFGRYLKGLKWMNFGGGHHITK EGYDIDLLCKCIERVRDRYGVQVYLEPGEAVALNAGVLVSTVLDVVRADMPVAILDTSAA THMPDVLEMPYRPMVIGSGEPGEKRWSCRLAGKSCLAGDVIGEYSFDEPLKAGDRLVFTD MAIYSMVKTTTFNGLRLPSIVRWNPETDETRLVREFGYEDFKMRLS >gi|316922560|gb|ADCP01000112.1| GENE 2 1278 - 2030 738 250 aa, chain + ## HITS:1 COG:NMA1465 KEGG:ns NR:ns ## COG: NMA1465 COG0037 # Protein_GI_number: 15794367 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Neisseria meningitidis Z2491 # 16 226 21 229 319 106 33.0 4e-23 MSKEKLSYAQQVCVKSAGKAMQRTGMVGPGAKVGVAVSGGVDSWVLLEVLRRRQRIVPFR FDIMAIHLNPGFDAENHAPLVEYLAKHGVAGHIEVTDHGPRGHSPENRRNSACFYCAMLR RTRLFEVCQRYGLTHLAFGHNADDLVTTFFMNLVQNGRVEGMGMCDDFFKGALKVIRPLL LVEKPDIIKAARRWELPVWSNPCPSAGKTNRANFQAKIDALHGGDKMLKTNLFNGLCRWQ LAQSESGNKA >gi|316922560|gb|ADCP01000112.1| GENE 3 2027 - 2686 539 219 aa, chain + ## HITS:1 COG:SA2342 
KEGG:ns NR:ns ## COG: SA2342 COG0110 # Protein_GI_number: 15928134 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Staphylococcus aureus N315 # 125 216 95 182 199 80 40.0 3e-15 MIGKLRKAIERRLSAQDILSYAAITLQRHWSKDWGTLALRIKALLFGVEVGPGVTACGSV ILGRWPGSHIRLGAGCSLISSSRRATASTLYAPVRLRTYAPTARIDLAEGVQLSGTAITA RSCTISIGKNTMVGPNCVITDSDFHAHWPAETRHIEPAFELDRGVSIGANVWIGMNSLIL KGVTIGDGAIVAAGSVVVRDVPPKAVVAGVPAKVVKVGE Prediction of potential genes in microbial genomes Time: Fri May 13 03:55:02 2011 Seq name: gi|316922556|gb|ADCP01000113.1| Bilophila wadsworthia 3_1_6 cont1.113, whole genome shotgun sequence Length of sequence - 6150 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 13 - 3066 2638 ## COG1205 Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster + Term 3191 - 3239 11.0 2 2 Op 1 3/0.000 - CDS 3377 - 4063 938 ## COG1861 Spore coat polysaccharide biosynthesis protein F, CMP-KDO synthetase homolog - Term 4107 - 4163 0.6 3 2 Op 2 . 
- CDS 4347 - 5984 1941 ## COG3980 Spore coat polysaccharide biosynthesis protein, predicted glycosyltransferase - Prom 6033 - 6092 1.8 Predicted protein(s) >gi|316922556|gb|ADCP01000113.1| GENE 1 13 - 3066 2638 1017 aa, chain + ## HITS:1 COG:MA2672 KEGG:ns NR:ns ## COG: MA2672 COG1205 # Protein_GI_number: 20091495 # Func_class: R General function prediction only # Function: Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster # Organism: Methanosarcina acetivorans str.C2A # 4 797 2 765 912 515 36.0 1e-145 MQNNVADYIRSLLASERFARQVTHHRVLPAREARYGETRRPWPRAIAEVLRERGIASLYS HQALTADTVRAGRDVVVATPTASGKTLCYSLPILEKCLQDPDSRALMLFPLKALAQDQLA AFGELTGHWPEAARPSIAIYDGDTTDHFRRKIRKNPPNVLITNPEMLHLAILPFHEQWTT FLASLSLVVVDEAHTYRGVLGSHISQLFRRLNRICARYAARPNFVFCTATVGNPEELAGN LLGRELVQEKGLPSENAFPFSPLFSPSSSSEPVALGKETSLNAPQESADIPVIRESGAPT GKRHMVFIDPEDSPSTTAIALLKAALARGLRTIVYCRSRRMTELIALWAADKAGSFAGRI SAYRAGFLPEERREIEARMSDGQLLAVVTTSALELGIDIGSLDVCILVGYPGTVISTMQR GGRVGRAGQESAVLLVAGEDALDQYFIHQPEAFFGREPERAVINPDNEVIVKRHLECAAA ELPLPSDDPWLRGPGAQAALRELEREGLLLKSADGREWVAARKRPQRHVDLRGCGASCTI VDAEGRPIGSVDGHQAYKETHPGAVYLHRGKTYVVKSLDMAERVVRCEVPQQRVNWHTRV RSHKETAIIEVKASGSAFGAPVAFGRLRVTETITGYEQRSVADNRLICVVPLDLPPLVFE TEGLWFCVPDGPRRSTEDSLMHFMGSIHALEHASIGLMPLMVMADRNDFGGISTPMHAQL GMPAVFVYDGLPGGAGLCRSAFPRLAELFAAVRDLLLRCPCELGCPSCVHSPKCGSGNRP IDKAGALFLLERIMAAPAPSGDMAVSGLESEQPKEKTVMAADIQLGGPAAGSSERIVAPL PERFMVLDVETRRSAAEVGGWHRADLMGVSVAVLYDSKGDCFTEYEQEDLPAMFERLREA GLVIGFNSSRFDYAVLQPFAGYDLRSLPTLDMLVEVKKRLSYRVSLDNLARATLNAPKSA DGMQALQWWKEGNLASIAEYCRKDVEITRDVYLFGHREGYLLFTNKAGQQVRVVVEW >gi|316922556|gb|ADCP01000113.1| GENE 2 3377 - 4063 938 228 aa, chain - ## HITS:1 COG:MJ1063 KEGG:ns NR:ns ## COG: MJ1063 COG1861 # Protein_GI_number: 15669252 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Spore coat polysaccharide biosynthesis protein F, CMP-KDO synthetase homolog # Organism: Methanococcus jannaschii # 1 206 1 196 243 116 32.0 4e-26 
MKTVIVIQARLGSSRLPCKTLLNLHGLPVIDWVVGRCARSELADDLVVALPDTRRDEVLA RHLRAQGVNTFRGSEQDVLSRMHGAAAAHGADLVVRVCADNPLIWGGEIDNLIRFYQREH TAGNCDYAYNHIPRNNLYPDGLGAEIVSFELFTKAMNEATLPAHREHCLSYIVDNPELFA IRTFDPLDPALHHPEMKLDMDTADDFINLALRDIRPDITPREIVELFR >gi|316922556|gb|ADCP01000113.1| GENE 3 4347 - 5984 1941 545 aa, chain - ## HITS:1 COG:MJ1062_1 KEGG:ns NR:ns ## COG: MJ1062_1 COG3980 # Protein_GI_number: 15669251 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Spore coat polysaccharide biosynthesis protein, predicted glycosyltransferase # Organism: Methanococcus jannaschii # 211 470 2 272 330 107 29.0 4e-23 MTETRRCIVIPAIKKNAVIPDQLVKRLAGVTLIQRAIDTARGVVPAHDVVVVTDSQEISL ICERNGVRVHYNAGLRFTSLDIITEMKSILRELGNDYGHIIIYRASCPLLTWVDIDDAYK TFLEDEADCLVTVKSVRHRIWEVHQGRLESFMSEDESELVVESKALIIVKSDALDGGTIR HTVPYFLNDRAMEINSYQDWWLCERLLTQRRVVFVVAGYPAIGMGHVFRSLMLAHEIANH KVFFVCTKESELAASNIAARDYKTVIQQGELWEDVLALDPDLVINDMLDTPREYMEHLKA ANIPVVNFEDEGPGSVLADQVVNALYEEPQNETNGKQPERFLYGHKYFCLRDEFLQAEQN AFRPAPKCILITFGGTDMPDYTRQTLDTVEPLCRERGIAIRVVTGPGYAHRDELVRHIKA LGNPLLRFEYATNIMSRMMEGVDLAICSAGRTVYELAHMHIPSIVLAQHEREARHTFARA DHGFAYMGIMRKFNAGRLRKVFVELIDEPERRNVLYQRQSRIHFEKNKAKVVSGILKLLK KEKES Prediction of potential genes in microbial genomes Time: Fri May 13 03:55:09 2011 Seq name: gi|316922548|gb|ADCP01000114.1| Bilophila wadsworthia 3_1_6 cont1.114, whole genome shotgun sequence Length of sequence - 9840 bp Number of predicted genes - 10, with homology - 8 Number of transcription units - 9, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 74 - 1249 1680 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 2 2 Tu 1 . - CDS 1423 - 2499 1522 ## COG1454 Alcohol dehydrogenase, class IV - Prom 2535 - 2594 2.2 + Prom 2455 - 2514 3.2 3 3 Tu 1 . + CDS 2628 - 3650 815 ## COG0240 Glycerol-3-phosphate dehydrogenase + Term 3694 - 3734 1.8 4 4 Op 1 . - CDS 3776 - 3871 64 ## 5 4 Op 2 . 
- CDS 3855 - 4340 112 ## ACP_0584 ISAca3, transposase orfB - Prom 4579 - 4638 2.8 6 5 Tu 1 . - CDS 4762 - 5124 150 ## Dd586_0657 hypothetical protein - Prom 5259 - 5318 3.2 7 6 Tu 1 . - CDS 6281 - 6949 80 ## COG0438 Glycosyltransferase - Prom 7117 - 7176 4.6 + Prom 6497 - 6556 3.3 8 7 Tu 1 . + CDS 6647 - 6970 115 ## 9 8 Tu 1 . - CDS 7485 - 8894 169 ## COG0836 Mannose-1-phosphate guanylyltransferase + Prom 9106 - 9165 10.7 10 9 Tu 1 . + CDS 9398 - 9688 161 ## GOX1471 transposase Predicted protein(s) >gi|316922548|gb|ADCP01000114.1| GENE 1 74 - 1249 1680 391 aa, chain - ## HITS:1 COG:PAB0774 KEGG:ns NR:ns ## COG: PAB0774 COG0399 # Protein_GI_number: 14521367 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Pyrococcus abyssi # 16 386 11 363 366 192 34.0 7e-49 MDIKVNFSGRAIKYTEDEIAVVVEAMRNAEPLTQGKHLQAFQKAFGEYIGAEHCFAVMNG VSALELSAQLCNFKPGDEVVIPSHTFTASAYPYAKKGAKLVWADVDPVTHVVNAETIEKC ITPKTKAIVVVHLYGYVADMPAIMEVAKRHNVLVIEDAAQSIGADIDGIKSGAWGDMAIF SFHSHKNLTTLGEGGMLVVKDPKLAALVPALRHNGHCGYPEPRPNYWTPAMGNVDMPMLD GDMLWPNNYCLGEIECALGVKMLERIDRINAEKRARAIRFIDALKDYPELEFLRDDTTRH NYHLLAGCMKNGKRDAFMHAISEEKGIKCVVQYIPLDRYDFYKKLGLGEANCPNADAFFD GQISFPFQHWLSEEDFGYMLKSTREVLESLR >gi|316922548|gb|ADCP01000114.1| GENE 2 1423 - 2499 1522 358 aa, chain - ## HITS:1 COG:SMb20147 KEGG:ns NR:ns ## COG: SMb20147 COG1454 # Protein_GI_number: 16263895 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Sinorhizobium meliloti # 69 280 60 272 368 112 33.0 8e-25 MYRNTKNVPYYVFGRGSLAQLGDLLKPRREAVDGPVVYFVDHFFRDHELIGRLPMEKGDQ LHFVDSSSEPEVTRIDAFKEAVSAADSRIPCCVVGIGGGNALDTAKAVANLLTNPGKAED YQGWELVKNPAVYKIGIPTLSGTGSECSRTCVLTNIPKNLKLGMNSDYTVYDQLLLDPDL LATVPRDQYFYTGIDTFMHCVESLQGSYRNAIIDAFSVRAVQLCEEIFLSDDMMSEENRE KMMIASYIGGMAAGNVGVIHPFSAGLGVVLHVAHGLGNCLALNVLGDIYPKEYKQFRTMI ERQGVNLPTGICKDLTEEQYEGLYRGCIVHEKPLTNALGPDFKQILTRENLIERYKSI 
>gi|316922548|gb|ADCP01000114.1| GENE 3 2628 - 3650 815 340 aa, chain + ## HITS:1 COG:BH1640 KEGG:ns NR:ns ## COG: BH1640 COG0240 # Protein_GI_number: 15614203 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Bacillus halodurans # 6 340 3 331 345 272 45.0 8e-73 MSGKTSIAVLGGGSWGTALAHLLAHAGHSVALLLRDTEQAAHINAHHENPRYLRGVQLHP GIRAVAGDTGGPEQAEVLADVSILVLSVPCQAMRGTLRRLAGVVPPDCVLVNTAKGIEVS ELVTVERMVREELPGFVDRYAVLSGPSFALEVATCKPTAVVLGCRDERLGARLREVFATP WFRSYSSTDVPGVELGGAIKNVIAIAAGLSDGLGFGLNARAALVTRGLAETSRLGVALGA RASTFMGLSGLGDLMLTCSGDLSRNRQVGLRLGSGESLEDIASSMCMVAEGVKTTQAVHR LAQRLGVDMPVTGTMYSVLYEGFAPREAVQALMGRRLKEE >gi|316922548|gb|ADCP01000114.1| GENE 4 3776 - 3871 64 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKLSSNDTTGVYYDLVQVDYEMTPSMPQPC >gi|316922548|gb|ADCP01000114.1| GENE 5 3855 - 4340 112 161 aa, chain - ## HITS:1 COG:no KEGG:ACP_0584 NR:ns ## KEGG: ACP_0584 # Name: not_defined # Def: ISAca3, transposase orfB # Organism: A.capsulatum # Pathway: not_defined # 63 161 3 101 117 124 59.0 9e-28 MLALAEDLRERGKRDLREAFIDGTFVPAKKGGRTVGKTKRGKGTKIMAVADAAVFPLALH VASASPHKVTVVEDTLDCTFTNELPGSLIDNKAYDSDALDARLEEAWGIEVIVPNRKKRA KTQDDRPLLRYRRRWKVERLFAWLQNFCRLVVIYEFHEKTF >gi|316922548|gb|ADCP01000114.1| GENE 6 4762 - 5124 150 120 aa, chain - ## HITS:1 COG:no KEGG:Dd586_0657 NR:ns ## KEGG: Dd586_0657 # Name: not_defined # Def: hypothetical protein # Organism: D.dadantii_Ech586 # Pathway: not_defined # 1 83 394 477 521 72 41.0 4e-12 MLHGDCYAQQNQTVTAHVDGLDCQIAARGIWGYLDHPFSSRYQYGWESGSERIKAHEKLI RAIRSRPNILFWSQSQCFQFLRNLMETPITVQNDKPLALKKEKHTEYRMECRYRGQTHSL >gi|316922548|gb|ADCP01000114.1| GENE 7 6281 - 6949 80 222 aa, chain - ## HITS:1 COG:CAC3066 KEGG:ns NR:ns ## COG: CAC3066 COG0438 # Protein_GI_number: 15896317 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Clostridium acetobutylicum # 4 210 181 385 385 79 28.0 5e-15 MQALFSQCVCEPNVCDYDLFHTAVTEPLEEPEELRFIPHPRLIFIGALSEYKVDFDLIRF 
VAQCLPNVHWVLIGPDGEGQPDSNKPPVLPNVHMLGPKPYHRLPSFLKYGDMAVLPAVHN NYTDAMFPMKFFEYLASGVQVVATALPSLREFQDLYFPADSKEDFVSAIKIVLAGEKRNA VSIEAACHRHSWESRFERMEKILWGALNAKKTAEWKRTDDVR >gi|316922548|gb|ADCP01000114.1| GENE 8 6647 - 6970 115 107 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIGLGTKHMNIWQYRRFITVRLPFSIRADQDPMNIGQALSHKAYQVKINLVFTQCPYENK PGMRNKTKLFRFFKRFCHCCMKQVIVTDIGFTDALREQCLHPFLQSC >gi|316922548|gb|ADCP01000114.1| GENE 9 7485 - 8894 169 469 aa, chain - ## HITS:1 COG:YPO3099_1 KEGG:ns NR:ns ## COG: YPO3099_1 COG0836 # Protein_GI_number: 16123273 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Yersinia pestis # 1 352 3 354 354 339 46.0 7e-93 MIYPVILCGGSGTRLWPLSREMYPKQFINFGEGRTLFKDTVCRAMNISDSVSPLIVCNEE HRFYVSTSFLECGISGRILLEPEARNTAPAIAFAAFVVQKLEPDGILLVLPSDHMLRDEK TFVDTVQSAVSAAQKGFIITFGITPLYPATGFGYIQQGETLSDGGYRVARFVEKPDGKTA ASMLSQRSYLWNSGMFMFSASTYLRELKIYAPKIFEHCLQAWEERKEDGAFIRPGRAAFT TSPSNSIDYAVMKYTAHAAVFPLCSDWDDLGSWEAFYQTGKKDVSGNICTGDVIVEDARG CYLYGTSRLVAALNVENLAVIETKDAVLVAQRDAVHNVKSIVARLKDKDRQEYRLYPLVY RPWGIYESLAMGPRFQVKRIVVNPGAQLSLQLHHHRAEHWVVVSGVAEVTVDTLVKRIEE NQSIYIPVKTKHRLCNPGTEPLIVIEIQSGSYLGEDDIVRFEDVYGREN >gi|316922548|gb|ADCP01000114.1| GENE 10 9398 - 9688 161 96 aa, chain + ## HITS:1 COG:no KEGG:GOX1471 NR:ns ## KEGG: GOX1471 # Name: not_defined # Def: transposase # Organism: G.oxydans # Pathway: not_defined # 55 89 37 71 133 63 74.0 3e-09 MEFDKKYVRYIYNTVLFKGRKTGYVGDTEVSHAYPVLSFRKSVGAYKIVLYLFAPRVDDR RVVSGIIYAIKHGLQWKDASDEYGPHKMLTIALQVD Prediction of potential genes in microbial genomes Time: Fri May 13 03:55:52 2011 Seq name: gi|316922533|gb|ADCP01000115.1| Bilophila wadsworthia 3_1_6 cont1.115, whole genome shotgun sequence Length of sequence - 22325 bp Number of predicted genes - 18, with homology - 14 Number of transcription units - 11, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 636 - 893 86 ## 2 1 Op 2 . 
- CDS 908 - 3232 256 ## COG1541 Coenzyme F390 synthetase 3 2 Tu 1 . - CDS 3431 - 4201 274 ## Arad_4842 glycosyl transferase protein - Prom 4259 - 4318 2.4 4 3 Tu 1 . - CDS 4439 - 4870 70 ## Dd586_0663 glycosyl transferase group 1 - Prom 4934 - 4993 1.9 - Term 6527 - 6559 2.2 5 4 Tu 1 . - CDS 6736 - 8370 244 ## COG0438 Glycosyltransferase 6 5 Tu 1 . + CDS 8396 - 8641 109 ## - Term 8907 - 8947 1.2 7 6 Op 1 11/0.000 - CDS 8983 - 11580 797 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 8 6 Op 2 26/0.000 - CDS 11610 - 12689 0 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 12764 - 12823 3.2 9 6 Op 3 1/0.000 - CDS 12902 - 13987 163 ## COG0438 Glycosyltransferase 10 6 Op 4 . - CDS 13987 - 14610 211 ## COG0406 Fructose-2,6-bisphosphatase 11 6 Op 5 . - CDS 14627 - 15250 -54 ## HM1_1151 hypothetical protein - Prom 15460 - 15519 3.6 12 7 Tu 1 . + CDS 15401 - 15607 154 ## 13 8 Tu 1 . - CDS 15838 - 16305 115 ## gi|223934765|ref|ZP_03626685.1| hypothetical protein Cflav_PD5796 - Prom 16491 - 16550 2.2 14 9 Op 1 26/0.000 - CDS 17016 - 17672 171 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 - Prom 17703 - 17762 3.0 15 9 Op 2 . - CDS 17765 - 18583 293 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 16 9 Op 3 . - CDS 18583 - 19530 275 ## COG1541 Coenzyme F390 synthetase - Prom 19720 - 19779 3.2 17 10 Tu 1 . - CDS 19917 - 20954 -71 ## COG0381 UDP-N-acetylglucosamine 2-epimerase - Prom 21157 - 21216 6.2 + Prom 21985 - 22044 2.6 18 11 Tu 1 . 
+ CDS 22064 - 22186 63 ## Predicted protein(s) >gi|316922533|gb|ADCP01000115.1| GENE 1 636 - 893 86 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWHKINLFIIEGKINSPLLTLEDTKNVVQKHIKDEKFPDTYLFHTRRYFSSIKYIHNPNC QIVYTEDDLRHTIDILVILLIILSV >gi|316922533|gb|ADCP01000115.1| GENE 2 908 - 3232 256 774 aa, chain - ## HITS:1 COG:MA1063 KEGG:ns NR:ns ## COG: MA1063 COG1541 # Protein_GI_number: 20089933 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanosarcina acetivorans str.C2A # 4 353 54 415 445 114 27.0 9e-25 MVVFAGEYVPYYRELFAKIHFKPIWLKNSPVAYNELPFLDKSIVQEQGKRLYAEGRSPLF YRITGASTGTATTIAYDRDGLDWTSASNILVLGWAGCPLSSSHVHLSTEFNNRTLRERVT EYCKCIVINKTNIFTTTLDAEQMQHFYEEIQKAHPRLIQGHPSTLYALACYLEKSGFKET PLFPVFESTGEKLEPQKRKKIETVLGCQIFDRYGCAEFGVVAHPRNSDDPRLALLEHQVW AENYPISGGLSELVFTGLTNQGMPLLRYRSGDLGNVVTTEEGTFLSALEGRVHDVISLNG KTYPSHYFHDVLTRVGGLSDFQIILNRDNTLQELLVVMNTTDDEDAVRKKVYSLTGVSFP IRFCHDIAEFDRVGWRNKFHYMVHREKIESSQNKRTVSEGRPHICIVCHSMAINGANNHV LELVRLFHIHCDFSILAVSEGPMRSFLQPYVHSVEIYDPESTYDFTGYSMVMGNTLMTTH ILPEVLYQRVPVCVVVHESWHPEYLQQHINTFEFGGHVTEACIRQSLNGADQVIFPAHFQ ADLYKPLLPRAVCIREIGCTRPFDDISAYMQKVSPAEARRKLGLSDTAVVFLQLGTVTHR KNQLGTVRAFLRFVEQHSDIEAILLLVGARRSRGNESAYVDNLLHFINSSPIKAEIHVVD VVPNPYPYLRAADGMVHPSYNEVLPLVLLEAGAFGIPVIGANQDGLPEIVTDGKSGFLLP PDDIQGIADGMARLAQDTELRKRMGKYLRDTVLKTHSIKQFERQYEEVFRHKKD >gi|316922533|gb|ADCP01000115.1| GENE 3 3431 - 4201 274 256 aa, chain - ## HITS:1 COG:no KEGG:Arad_4842 NR:ns ## KEGG: Arad_4842 # Name: not_defined # Def: glycosyl transferase protein # Organism: A.radiobacter # Pathway: not_defined # 1 248 73 321 325 196 45.0 6e-49 MPLVISPIVWIREHKEHYPLWEIRSLMDMATAVLPNSRAECDQLIELFEVNPEKCTPIVN GVDDIFFKTVSASLFRETFSLTEPFILCMGNIESRKNQLRLIQAVKGLGLHLVLAGQDRE ADYAEHCRAVADETVHFVGRLEHKSELQRSAYAAASVLALPSSLETPGLVALEAGATGCR LALTSEGCTKEYFGDFACYLDYHKVDSIRTAIEKALTLPRSEQLSAFVKEKYTWHRAAQQ LADVYHGILEHEGEVS >gi|316922533|gb|ADCP01000115.1| GENE 4 4439 - 4870 70 143 aa, chain - ## HITS:1 COG:no KEGG:Dd586_0663 NR:ns ## KEGG: 
Dd586_0663 # Name: not_defined # Def: glycosyl transferase group 1 # Organism: D.dadantii_Ech586 # Pathway: not_defined # 3 129 867 993 1006 104 39.0 9e-22 MIFIDEMLTPAENVNLLNTCDAFVSLHRSEGFGRLLAETMLLGKLLVASDFGGSTDYVLP ETGCPVPGKLTPVPDGGYLFGEGQSWFDPDEEYAAELFRDILANWKLYTSRVAQGQSFIM KNHTPEVVGKKALTALQQAGFLA >gi|316922533|gb|ADCP01000115.1| GENE 5 6736 - 8370 244 544 aa, chain - ## HITS:1 COG:TM0622 KEGG:ns NR:ns ## COG: TM0622 COG0438 # Protein_GI_number: 15643387 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 306 459 174 319 388 85 37.0 3e-16 MNQSIDPIPAFSVCAASDRFVSRERKIWISSGVAWFLSLLDTDPFRLATNVAEVKTGDVF LLLPTEEYSDKFLRSVFTIYIVNPIDNIAKYITKIRKNARFSLAHDGWMQTLACHGLSVD ALPSSPEILVEKIHTLVNADVKGIALDNIPEETIQPLSVLLLMDQIGIGGMENVLLNLTR HMQARGWHPLIGYMDSISPHMEKKIRLLGIPCCHLSRDTEEQLAFCKKEAIQCINAHYSL EMTEAAAQLGIPCIQTVHNMYIWNDGSDLDFWRFSYTNMDGFLCVSDACARVIEERYWVP KEKIHVIENGVPVLSPPKQTTQALRQHLGIRPDTFIFLNMATLSPIKKQDVLVSAFAQAF SHDENVALILLGNPVSASFSQKIQSMIDQYGLARKVFVAGFHEPVSQWLEMADSFVLPSV VEGWSLALDEARSCGLPLIATAVGGAPEQLDSAADILLPSYFSEDDTFASCVFSRIVEDR EADSRVAKALADALWHHFVSVVRGEPIKRSSPLSNDVVFSRHCAYIYSCWDKKQVRIQQN MKEV >gi|316922533|gb|ADCP01000115.1| GENE 6 8396 - 8641 109 81 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKICEIYHALRTFTNFLCRNSPHDCVGRDILSHNSPTGHDSPLPDGNTGEDYGPRPNPGM IPYLNWRSNSREFRLMNIMVQ >gi|316922533|gb|ADCP01000115.1| GENE 7 8983 - 11580 797 865 aa, chain - ## HITS:1 COG:PAB0796 KEGG:ns NR:ns ## COG: PAB0796 COG0463 # Protein_GI_number: 14521394 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Pyrococcus abyssi # 473 701 2 230 331 113 28.0 2e-24 MGYDLDKACPEQGVWHWNFTEHSDRPGKAVLVTCIMAPLPTISGGAQYVANALFPLAEEY ELHLLIVGYKYMKDQVESFRTRYGELFRSVTVVARDSLPATAEEEEEEEAYYGERLEHGL PFLDISYYSARVVDAAAALITRFSIGLLELHSTHVAYLKHFFPDIPALLVSHNIESELFP FWIPATSKLPLERMEQIAEQSRKNAREVEIDNRWHFEAMTFISPSDMHKVTPRVERHHLP 
LGFPVTDKAYQVRSGACNVLWLGAFDWSPNAEGMNWFAEKVFPLLRDRLEESQLVFHIVG AKPSQQVLALHDGIHVIVHGFVENLDAIFAEADMLMAPLLRGGGVRVKVIQAMSQGLPVI GTTVGCSGIGLTHGKNVWIADTPEAFADGLVELSRSGELRTKIAEEGKIFLLRHHNIKYS LAMKRDIYNRLLAGSIFDPTKDTMRSPYDGLTYLKPIPCVTLDKDYQIADETRFSVIIPF RNEAEGLSEFMQDLAAQTARPAEILMIDHDSTDSSCQIATNLAKKYQLPIRLLHAEEGPQ ARSGRKTTVAGNRNYAIQEAQTDLLLFTDAGTRLHPCFFANMIGPMQEDPSLEMTGGIYY AQSAQLTATLVYDWDTVDWNSFLPACRALAMRKALAIRAGGLPEFLTFAGEDALFDITFR RLCNKWAFNRRAIVYWNAPDTVESLRKKFYTYGIGDGESGIGDFLFSRNALHLQRTGTLP PELTILDYNYAAFQGYFAGRARRGTIDRTLRGVRNIVLLPVEKPLFAAKSTRRLIGRLVG ENTRVLCIIADAESTNQGKLCFMDFDFTLCELHYMKDFRIEDFLARYGNPELPQQIAVMR DPDNTSERVVAFIASLRPFSLGGDE >gi|316922533|gb|ADCP01000115.1| GENE 8 11610 - 12689 0 359 aa, chain - ## HITS:1 COG:PAB0796 KEGG:ns NR:ns ## COG: PAB0796 COG0463 # Protein_GI_number: 14521394 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Pyrococcus abyssi # 1 198 35 236 331 89 29.0 1e-17 MVDGNSIDDTCQIIREESDLLGLPVTVLHTADASHVKAGKRATLGGDRNYAIRRAKTDLL VFTDAGNHLPPDFFANLLGPLLDDEQTDLCGGIYRCENHDLDCDWSTRNWQTFLPACRCM AIRRHIALRCGMFPEFLSYAGEDTLFDIWYRRFSSQWVFNQQAVVWWDSPESGEEAWSKF FRYGVGDGENGVGDWNNFYRFARILRNTGELPEGLGTRSNYTLMKQALYGYIRGKTLRGE IDRTRRGVGEVVLLTVEKPLHAAPSSLELVYSLAEQGKRVVCLAADSSCVPCVDAIPRLF DADLSLCDLEFTDDFLLSDFLQRYGWSVREAVLQLMEDPWNTGERARRFQTRLQTFFHR >gi|316922533|gb|ADCP01000115.1| GENE 9 12902 - 13987 163 361 aa, chain - ## HITS:1 COG:sll1231 KEGG:ns NR:ns ## COG: sll1231 COG0438 # Protein_GI_number: 16330676 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Synechocystis # 183 326 222 363 399 66 31.0 7e-11 MQPGRVLRLLVSDLYLNDAIGNFVLSLADEVPARKIPITVYAERYMEGINLDGQYDDFFR EVRPDDIVLYQLSNGDPGFERLMRLPCWRKIVFYHNMTPGHFFRPFSSEIADLLDEGRTS LKLLGLLGSNDAVFANSAYSLSEVLPYLDASVFRAHMPPFTERILKRIQLDEEQEVCAGA GHPYLLTVGRLVPHKNLEGGLALYDRLRQYIPELEYVIMGAGFWPYEKKIKIKAAEYSKK GAHIRFAGQTDNAEASRLFSGAYALLCPSLHEGFCVPIVEAMSRGIPILAFDQPAMWETL 
GGQGILVSEQASQEQVSLWTEKLSNDRSRHSLVCGQRSRLENLLQCVRTSALWSLMSGEP L >gi|316922533|gb|ADCP01000115.1| GENE 10 13987 - 14610 211 207 aa, chain - ## HITS:1 COG:ECs5353 KEGG:ns NR:ns ## COG: ECs5353 COG0406 # Protein_GI_number: 15834607 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Escherichia coli O157:H7 # 1 175 2 180 215 72 28.0 6e-13 MKLYLLRHGQAVANVHRLVTGTPDDPLTAEGERQARQAQMLLRKCSFDAYVVSHWKRARQ TGELATGNNDFFVDRRLGETDAGLVSDWSRAHFEQQHPDFFAYFDPSAKFPGGESHDDLF KRVIMWMKETAAVMPENASVLAVSHMGPISCMLQYALGIPMSLFPRFKPGNASLTYLKVG KKAIPEQIQIVYYAVTPDIVCRDRENV >gi|316922533|gb|ADCP01000115.1| GENE 11 14627 - 15250 -54 207 aa, chain - ## HITS:1 COG:no KEGG:HM1_1151 NR:ns ## KEGG: HM1_1151 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 17 199 185 365 371 112 32.0 1e-23 MHNSRIYPKECRTVLPEARTGDFFYGGRLRKKQNYGEWFLKEDIQKYEIDFYGLFGEFSK DEEVLIREGNAGRIRYHGYIPSGNRYFSILSSYFYSLVLWAPLDEGTRYACPNKLYDALA CGVPFITGPILLATELVEKFHCGIIMDSWDYTAFCKALTYAEAIKGSSEYAQMVYGCWKA MQEECSWSIQFDRKIIPLLASAELIPT >gi|316922533|gb|ADCP01000115.1| GENE 12 15401 - 15607 154 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLKRIQNQKWPVLFDILSEYDKKIPLKKHHDIRMERFTEGKNFLSALFSSFGCGMGMLTN RCSYDSEL >gi|316922533|gb|ADCP01000115.1| GENE 13 15838 - 16305 115 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|223934765|ref|ZP_03626685.1| ## NR: gi|223934765|ref|ZP_03626685.1| hypothetical protein Cflav_PD5796 [bacterium Ellin514] # 1 154 240 398 412 82 34.0 1e-14 MLSGIPTAVIDYTNSPVYVPAAWTITAPEHMGPVVQGLLNPDPKRLLHQDICLHDSLECQ TPATPRLIELIRRMGMAHREKRPLPDRILNIPGMVSAMPEQPFPLDACFPRNEFYQNRDM VELQAALEQMKAFLASLPLSELQTRYLLQFSWYKG >gi|316922533|gb|ADCP01000115.1| GENE 14 17016 - 17672 171 218 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 13 198 19 214 305 70 27 9e-12 MPSAIRELRSRRIVLDNVSLTIHRGETFGFIGKNGAGKSTLLGLIAGVLRPEKGWVITQG 
RVSPLLELGAGFHPELTGRENILLNGVLLGLRKEEVHRHFDEIVEFSGLEEFIDQPIRTY SSGMFAKLGFSVVAHLKPEILLVDEILSVGDIAFAQKCEKRIEELRKNKNVTIILVSHTM ASVEKICDRAAWIDSGHVRMLGDAKQVVDSYQQYMLGE >gi|316922533|gb|ADCP01000115.1| GENE 15 17765 - 18583 293 272 aa, chain - ## HITS:1 COG:CAC2329 KEGG:ns NR:ns ## COG: CAC2329 COG1682 # Protein_GI_number: 15895596 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Clostridium acetobutylicum # 25 270 11 256 258 112 32.0 7e-25 MTKEISSTCTTAAQNASFSSRWGWYRDLISVLLHKELEVRYKSSILGYLWSIMNPLAQAA VFYLVFSVYMRFAVAHYLVALLAALFAWQWFTNSVLQGPHTFLANPTLVKKVSSPRYIIP LVTVLQDMVHFLISLPVFLVFKLTDGLLPAVNWIWGIPSLTLITFITIYGICLLLGTLNL FLRDIGNLVGILVNILFFSCPIMYVLDVVPKEYLIYFKINPIAPLFISWRALLLNNTFGD EFLLLSIGYAAIFLVMGIWVYKKLEYKFAEVM >gi|316922533|gb|ADCP01000115.1| GENE 16 18583 - 19530 275 315 aa, chain - ## HITS:1 COG:MA1063 KEGG:ns NR:ns ## COG: MA1063 COG1541 # Protein_GI_number: 20089933 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanosarcina acetivorans str.C2A # 22 211 145 336 445 72 25.0 1e-12 MIYYSQEDLDWTAAQNILMLEWAGKKLGEKEVHLSTRFLEPPTPEGIRYEKRKCFLLNRV NILTKDHSEKEMEDFLRRLQKARAVSVQGHASTMYALANYMESKNIISKRLFDIFISTGE SILDSRKQKIERFIGCKVADRYGAAEFGVMAQQLRQSNDLLVSDSLVWPEVTDCDQEGVG ELVFTAIRNKAMPLIRYRMGDLGRLDERDTGWWVSHLSGRIHDEVSIGGHVYPTHYLLDV LDHRCPGILDFQVLARKGVAEGLNIVPGEGQDITWIAQHLQKEFPGLPVRQIQPDELRFV GRRGKFRYIFSLGEE >gi|316922533|gb|ADCP01000115.1| GENE 17 19917 - 20954 -71 345 aa, chain - ## HITS:1 COG:RSp1017 KEGG:ns NR:ns ## COG: RSp1017 COG0381 # Protein_GI_number: 17549238 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Ralstonia solanacearum # 8 341 41 378 379 283 45.0 2e-76 MSTVLFSTGQHGEMLQQAWQSFGITPDIDAHVMTQAQSLAGLSSRLFTLIDAELENIKPD FVLAQGDTTTVLVSSMCSFYRQIPFGHVEAGLRSNNIQSPFPEEFNRRIAGLTATLHFAP 
TSGAAENLRKEGVEEKSICVTGNTIVDALHYMRDRVRVSPLELPQNVHEAVRAGRPAVLI TLHRRELSVEELEAVCNTVAVLARQFQQAVFIWPVHLSPRISSIVHLLLSDRENIFLIHP LGYASLIYLLDKCHFVISDSGGIQEEAPSFGKRVLVLREETERPEAVEAGFCKLLGTDVE ALYTEASLLLSSPTPPEHHAANPFGDGHAAERIVQSIKDFFIMNN >gi|316922533|gb|ADCP01000115.1| GENE 18 22064 - 22186 63 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDDRRVVSGIIYAIKHGLQWKDASDEYGPHKMLTIALQVD Prediction of potential genes in microbial genomes Time: Fri May 13 03:57:10 2011 Seq name: gi|316922513|gb|ADCP01000116.1| Bilophila wadsworthia 3_1_6 cont1.116, whole genome shotgun sequence Length of sequence - 19502 bp Number of predicted genes - 19, with homology - 17 Number of transcription units - 9, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 675 - 751 80.8 # Arg CCG 0 0 + Prom 845 - 904 2.4 1 1 Op 1 . + CDS 932 - 1624 713 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 2 1 Op 2 . + CDS 1640 - 1843 295 ## 3 1 Op 3 22/0.000 + CDS 1871 - 3028 1389 ## COG0795 Predicted permeases 4 1 Op 4 . + CDS 3025 - 4098 1468 ## COG0795 Predicted permeases + Term 4150 - 4175 -0.1 - Term 4474 - 4516 1.1 5 2 Op 1 5/0.000 - CDS 4533 - 5576 1209 ## COG2896 Molybdenum cofactor biosynthesis enzyme 6 2 Op 2 . - CDS 5630 - 6283 683 ## COG0746 Molybdopterin-guanine dinucleotide biosynthesis protein A 7 2 Op 3 . - CDS 6280 - 7056 632 ## COG1526 Uncharacterized protein required for formate dehydrogenase activity 8 2 Op 4 . - CDS 7053 - 7280 329 ## - Term 7416 - 7481 22.7 9 3 Tu 1 . - CDS 7503 - 8780 1246 ## COG0144 tRNA and rRNA cytosine-C5-methylases - Prom 8862 - 8921 3.2 10 4 Tu 1 . + CDS 8924 - 10573 2410 ## DvMF_2025 hypothetical protein + Term 10636 - 10677 9.3 + Prom 10646 - 10705 2.0 11 5 Tu 1 . + CDS 10905 - 12134 1406 ## COG1171 Threonine dehydratase + Term 12144 - 12190 0.2 + Prom 12241 - 12300 4.7 12 6 Op 1 . + CDS 12482 - 13261 669 ## Dvul_2461 hypothetical protein 13 6 Op 2 . 
+ CDS 13287 - 14105 981 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase + Prom 14234 - 14293 6.9 14 7 Op 1 12/0.000 + CDS 14471 - 14866 555 ## COG2076 Membrane transporters of cations and cationic drugs 15 7 Op 2 . + CDS 14863 - 15192 526 ## COG2076 Membrane transporters of cations and cationic drugs + Term 15405 - 15436 3.4 - Term 15393 - 15424 3.4 16 8 Tu 1 . - CDS 15566 - 16951 1049 ## COG1519 3-deoxy-D-manno-octulosonic-acid transferase - Term 16966 - 17025 9.5 17 9 Op 1 . - CDS 17043 - 17942 1275 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 18 9 Op 2 . - CDS 17899 - 18597 572 ## LI0683 hypothetical protein 19 9 Op 3 . - CDS 18608 - 19303 530 ## COG2068 Uncharacterized MobA-related protein - Prom 19323 - 19382 3.6 Predicted protein(s) >gi|316922513|gb|ADCP01000116.1| GENE 1 932 - 1624 713 230 aa, chain + ## HITS:1 COG:MK1542 KEGG:ns NR:ns ## COG: MK1542 COG2220 # Protein_GI_number: 20094978 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Methanopyrus kandleri AV19 # 5 227 3 227 230 186 45.0 3e-47 MSVSVTWYGHSNFKVSCGDVTVFIDPFFTHNPSCPVTWNEAGKPDLVLVTHDHGDHTGDA VAICKSSGATCGCIVGTAERLIDAGMPQASIPAGIGFNIGGTIEVKGVRVTMTQAFHTSE SGAPAGYVVTMPDGFTFYHAGDTGLFSSMELIGSLYPLDLALLPVGGFFTMDGLQAAHAA RLLKPKAVIPMHWGTFPVLAKDTSAFASHLASVAPDVRLVSMKPGDSAAF >gi|316922513|gb|ADCP01000116.1| GENE 2 1640 - 1843 295 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTLIEHGELVRRALEYVQEERCRRPDAPLSSLLDEAGMRFNLTPLDAAQLARLFSETGQE PSAPCKR >gi|316922513|gb|ADCP01000116.1| GENE 3 1871 - 3028 1389 385 aa, chain + ## HITS:1 COG:alr4069 KEGG:ns NR:ns ## COG: alr4069 COG0795 # Protein_GI_number: 17231561 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Nostoc sp. 
PCC 7120 # 5 361 1 365 371 89 22.0 8e-18 MPSLLQRQIFKEIANLFVLAVGVLLTLILISRAVQMRELFLGLDLGILDTVLLFGYMTPL FLMLVIPIACMLSVFLTFLRMSTDRELVALRAGGINIYQMLPAPLLFSVICMLLTLWVSL HWLAWGMGHFRETILEIANTRARVVVQPGVFNTDFPNLVLFARQVNPGDGDMSQVLVDDR SHPERHMTILAPEGRFDTDTERGELVFLLEDGKIYTTDKKGASVLAFDEYKVRLPLDSLF KSLDLGDVRPREMSWSKLSSITREEALKVDANYANKLEVERHKRWAYPVACIALTLFVLP LASAFEGLHRQTGLMLALAMFFVYYSLMSLGFSTGESGAIPPSIGLWVPNILFLCLGIYG LWLTAHERTPHVATFFRNLRTRRTS >gi|316922513|gb|ADCP01000116.1| GENE 4 3025 - 4098 1468 357 aa, chain + ## HITS:1 COG:alr4069 KEGG:ns NR:ns ## COG: alr4069 COG0795 # Protein_GI_number: 17231561 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Nostoc sp. PCC 7120 # 6 355 3 371 371 70 21.0 4e-12 MSLLFRYLTKNNAMILLPTLAVGIGLYVLTDLFERLDNFIEAGLSVGMVLTYFVVKMPLV ISQILPVIFLLSTVIQLCIMARSRELTALQAGGISLGVVANSMILCGIFWGGVQLGFSEY LGVAGERESARIWQEEVRKKNLAATVLKDVWFTDGDWIVSLGTLDPQAHGTGFSGYELSD DGLSIKRIVQASTFTAEPNHWALQDVRVYTPDTFTQEQEPDFVLPLRQDPETFRLVSTGT KPQQLPLWQLGDAISQLKSSGSNVEALRTAWHAKLAYAASILVMAFVATAIVSWKDNIYI AVTVALLCTFLYFAVYTLGTTLGQRGILHPFLAAWTANLIALFFAFWRLIPLLLRRE >gi|316922513|gb|ADCP01000116.1| GENE 5 4533 - 5576 1209 347 aa, chain - ## HITS:1 COG:lin1039 KEGG:ns NR:ns ## COG: lin1039 COG2896 # Protein_GI_number: 16800108 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Listeria innocua # 17 347 4 333 333 219 36.0 8e-57 MNAIPPFAHLCEPAASLEDGHGRTVRYIRLSVTDRCNLRCTYCRSGMETFIPHESVLRYE EMEQLVDMAMDMGVEKVRLTGGEPFARKGFADFLERLRAAHPALDIRVTTNGTLIGPHIQ TLKAIGLNAVNLSLDTFDRDKFEQITGRDLFGKVRENMDALLDAGIPFKLNAVALRGFND DELPAFIDYAMRHPIDVRFIEFMPMGEGTRWSDSCFWSAPDILDAVKGLVAVAPVEQEQR NGGPARLYTLSGPDGPGLGRLGLISPLSSHFCTSCNRLRITSDGALRTCLFDDREYRLRN ALRHPKLGIEAVRRIVTLATRDKPIGARLLERRHNAVAQRRMTAIGG >gi|316922513|gb|ADCP01000116.1| GENE 6 5630 - 6283 683 217 aa, chain - ## HITS:1 COG:all0961 KEGG:ns NR:ns ## COG: all0961 COG0746 # Protein_GI_number: 17228456 # Func_class: H Coenzyme transport and metabolism # 
Function: Molybdopterin-guanine dinucleotide biosynthesis protein A # Organism: Nostoc sp. PCC 7120 # 1 190 4 186 207 65 31.0 6e-11 MIGVVLAGGRSTRLGQDKVRLRLPGDGRDMLARTADLLAACTDGVVISCRAPDAGEETLA LPGIRSIPDAEPGLGPLGGVWSALRELRQPILVLSCDLPFMDMPTLRRLIDARGARPPEA LMTTFQQAETGFIEALVSIYEPACLPFFEEARARNLRQLNLVIPEKLQSRVVYTRAEALP FFNINFPGELEQARRMAEAAREQRHSTACHASEEGIR >gi|316922513|gb|ADCP01000116.1| GENE 7 6280 - 7056 632 258 aa, chain - ## HITS:1 COG:Cj1508c KEGG:ns NR:ns ## COG: Cj1508c COG1526 # Protein_GI_number: 15792822 # Func_class: C Energy production and conversion # Function: Uncharacterized protein required for formate dehydrogenase activity # Organism: Campylobacter jejuni # 114 257 130 258 260 87 37.0 3e-17 MNTPLSYSVTRWKDGGWQHIDDAVSPEEPVYVSWSGSSCRLWAWPDDLEPLAIGHVLLDR RHLGPEAGPSPFRMSGSVRAVSVPPGIDAKHAFAVELKRMEKKGHDETPSIVMTPDAIVK HMDAFLTLEGSCPLLWESTGCFHRAGMFDPKTGAFLRLAEDIGRHNCLDRLAGWACLNGI DPAETVLFVSARITSSLYAKARRAGFSFLISRSAVTSTPVEMARQQNRLHPVHPVTVVGF CRPREERLTVFAGEGRIA >gi|316922513|gb|ADCP01000116.1| GENE 8 7053 - 7280 329 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLIEIRRFATLPPHTPENDRLDVAEGTTAGEAMHILNIRSAETLILVNGVHAGPDRILND GDRLAFVPAAEAANS >gi|316922513|gb|ADCP01000116.1| GENE 9 7503 - 8780 1246 425 aa, chain - ## HITS:1 COG:MK0370 KEGG:ns NR:ns ## COG: MK0370 COG0144 # Protein_GI_number: 20093808 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Methanopyrus kandleri AV19 # 1 276 43 321 323 139 36.0 1e-32 MTDTPSLRRSFRLVCNPGQIPSVEALLSAQGFVFEPEPFSPLARRLLQEPFPLGRSLAAF WGYVYIQDRSSMLPPLALAPGEGARVLDMCASPGSKTGLLAQLVGREGLVLGNEPARPRL ANLRRNLAALNLLQAVTCSWPGESLPLPDASWDAVLLDPPCSGWGTTDKNPQAIKRWQGD RLKPMLDLQRKLLTEASRLLRPGGKLVYSTCTTNVDENEGQVRFAVEGLGLEPIPLEPFP GFVFAAPELPGCEGTLRVDEGASNAQGFYIALLRKPGDSAAVPGLARGTAATAAYRAIPP AFLAEFGLSPALLPPGDLAVFEDSLHFLPAPALAHLPAAVRWQGMALGKASAQGLIPSAR LRALLDPEPQRIPRLDVDDVPVLERLLQGGSLDTGLPGKEASLFWRGLPLGRLRLKAGRA VWSDR >gi|316922513|gb|ADCP01000116.1| GENE 10 8924 - 10573 2410 549 
aa, chain + ## HITS:1 COG:no KEGG:DvMF_2025 NR:ns ## KEGG: DvMF_2025 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 27 546 28 546 554 605 56.0 1e-171 MHLIFRTALMLCTMLVFVAQALAADVRSYAVLPFKVNGSSGFTYLEKAIPSMLNSRLYWQ GHFQPVADAALAKAGAVSDAGSAAKALSATGADYVIWGDVTVMGDNASLDVRVRDKAGKE WRKGAKTRVNDLIASLQGVSDSINAEVFGRGGSGVVAAAPAASAGQANPNFVQKDATQSQ VYLNPQFRYQGSDGTRYRSQTLPYASVGMIVADVNGDGKNEVVVLGEKRLYVYQWQNERL VQLGEYKIPTRMMPLALRSIDLNRDRAEEIILTAYEEDYTQPYSMVLSFKGNTFTEVAER LPYYLNVVRIPPDYMPTLIGQKGDPSRIFSRGGVYEMMKQGNSLVQSKKLDLPSGVNALN FAWLPGTPGKETEKLVVLTDEERLRVFSDKGSQLNQTDEKFSGSAIGIAEQTQMPGMGKD NILIPSKYFVPLRMIPSDLEQDGSWELLVNKPISVSAQFFENYRFFPEGEIESLYWDGVG LGLQWKTRRIKGSVVDFALADPNNDGTKDLVVCLNTHPGALGLQNRKTVVVFYPLDLSMM DPKTAPALD >gi|316922513|gb|ADCP01000116.1| GENE 11 10905 - 12134 1406 409 aa, chain + ## HITS:1 COG:STM1002 KEGG:ns NR:ns ## COG: STM1002 COG1171 # Protein_GI_number: 16764362 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Salmonella typhimurium LT2 # 6 400 7 402 404 442 55.0 1e-124 MDNMLYHINAQRLRPGSGASLDMLNAAEARRARRFHAGFAQYAPTPLVPLPYLASTLGLG NIRAKDESKRFGLNAFKVLGASYAMGRYLAERLGKSIDDLTPADLCSPAVSERLGPITFV SATDGNHGRAVAWTAQQLGKKAVIYMPKGSDPARLENIRSHGAEASITDLNYDDAVRLAW KMACENGWIMVQDTAWDGYEDIPTWIIQGYAALAVEALAQWRAEELEPPTHLFLQAGVGS FASGVLGYMASELGDALPKTIIVEPHAADCIYRSALAGDGQPHNVTGDLSTLMAGLACGE PCTVGWPILRDYASAYVSCPDYVAANGMRLFAAARGGDAPIVSGESGAAPLGALDHIMRN PALAPLRESLGLGPDSRVMLISTEGDTSPRVWRDVCWYGRHGDELRVGE >gi|316922513|gb|ADCP01000116.1| GENE 12 12482 - 13261 669 259 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2461 NR:ns ## KEGG: Dvul_2461 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 2 244 18 237 243 163 42.0 5e-39 MSSCKRCGQCCRLGGPVLHRDDLSLLDRLDAPAKGTVPFGMADLVTLRTGELVRDDVIGT LTPLESECVKLAPARGRTDWECRFLVRMPDAVPGRDAGCGIYDRRPAQCRALSCSDTRGI AELYGHDRASRADVMRAVGAPEEWLGLFPAYEEMCAYSRIAPLAKLVMPPINPGTPAEKE 
ATEALLETVRYDISFRQLCVERGNIPEGCLPFLLGRPLSETLVMFGLALTRNNNRMGLVR RGEGRYGGPVFPDGKDLYR >gi|316922513|gb|ADCP01000116.1| GENE 13 13287 - 14105 981 272 aa, chain + ## HITS:1 COG:NMA1815 KEGG:ns NR:ns ## COG: NMA1815 COG0351 # Protein_GI_number: 15794705 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Neisseria meningitidis Z2491 # 8 265 9 266 268 182 46.0 4e-46 MPTPPCILTIAGSDSGGGAGIQADLKTITVIGGFGMSVITALTAQNGVGVAGIHAPEPGF VGLQLKTVREGFPIAAAKTGMLFSAEIIEAVEAGLQGREFPLVVDPVCVSQSGQRLLRED AVNALRWHILPKADLLTPNRPEAELLADMKIDSEKDIDVAARKLIDMGPKAVLIKGGHFG SASAAEFTDWLALPDEPLIALKQPGVKTSNNHGTGCTLSAAIATYLGLGLGIRDAIVHAQ KFLNHCLRHGYAPGLGAGPPNHASWIAKRVGE >gi|316922513|gb|ADCP01000116.1| GENE 14 14471 - 14866 555 131 aa, chain + ## HITS:1 COG:PA1541 KEGG:ns NR:ns ## COG: PA1541 COG2076 # Protein_GI_number: 15596738 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Pseudomonas aeruginosa # 7 127 1 119 122 79 43.0 2e-15 MSSVSHVKHWFFLFCAIVFEVGGTSVMKMSQHEGWLLGPNAGLMAMFALIGLSYYCLALA ATGLPIGVAYAFWEGLGLTMITLVSVFLLGEAMNMQRFFALLAVLGGALLIHHGTSEGGP AQRPAKAGVAS >gi|316922513|gb|ADCP01000116.1| GENE 15 14863 - 15192 526 109 aa, chain + ## HITS:1 COG:STM1483 KEGG:ns NR:ns ## COG: STM1483 COG2076 # Protein_GI_number: 16764828 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Salmonella typhimurium LT2 # 14 109 14 109 109 82 54.0 3e-16 MSSFFSLSLLAVIVAALLDIVANLLLAKSQGFRRKFIGFASLGMVGLAFYALSLAVRDMD LAVAYAMWGGFGILGTSIGGWLLLGQRLKPCAWLGMVLLIGGMTVLKLS >gi|316922513|gb|ADCP01000116.1| GENE 16 15566 - 16951 1049 461 aa, chain - ## HITS:1 COG:XF0105 KEGG:ns NR:ns ## COG: XF0105 COG1519 # Protein_GI_number: 15836710 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic-acid transferase # Organism: Xylella fastidiosa 9a5c # 59 451 64 431 448 
143 30.0 6e-34 MKRSVLRALLSGAYGLAWLAARPVLCRHKRLQEGFPQRLVPDGWPGSALGMETGDGSASS HTRSDIWLQAASGGEAYLVWELLAHLAVLCEKQGTPEPLRVLATTWTRQGLDILQDMSGK LHEKHPWLSVRSAFFPLDAPKLMEKALDQVRPRVVGLLETELWPGLMLACEKRHVPMLIL NGRMTDKSLRGYLKLEAAIPGFWESIAPKHVCAISKADAGRFARIFGGDRVEAVPNIKFD RATATAIPAVSDPLLKLLPPELHARQTVLLASVREQEEPALLSVIQTLHAHDAPTIVVAP RHMHRVKPWQALLSGAKLPAVMRSKQEGTIPAGSIVIWDTFGELGQLYQLADAVFVGGSL APLGGQNFLEPLALGRIPCCGPHLDNFAWALEPSGEAERDSLETLGLLQTGENAKAVAAL LQQQLTLPTPHDAVRERFQHWLAPRLGGSARCAQRLLDEMR >gi|316922513|gb|ADCP01000116.1| GENE 17 17043 - 17942 1275 299 aa, chain - ## HITS:1 COG:BH1621 KEGG:ns NR:ns ## COG: BH1621 COG1181 # Protein_GI_number: 15614184 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Bacillus halodurans # 1 299 1 305 305 207 42.0 2e-53 MKVLLLAGGWSPEREVSLKGGEQIAAALKERGHSVTLCDPAKDFERLMDMAREHDAAFIN LHGAPGEDGLVQAILDRVNCPYQGAGPAGSFLALHKAAARQIFRDAGLNIPDGVFLPRHP GPNWKPDLQYPMFVKSNTGGSSLHLSRVTNEEELYTALNSLFSMGEEAIVETAVVGREVT CGVLGEEALAPILVVSKGNYFDYHNKYAPDGAQEICPAPIAPEETAKVQDAALRAHHALG LKGYSRADFILQDDGTLYILEVNTTPGMTSTSLVPREAAVKGMSFADLVERLLELALER >gi|316922513|gb|ADCP01000116.1| GENE 18 17899 - 18597 572 232 aa, chain - ## HITS:1 COG:no KEGG:LI0683 NR:ns ## KEGG: LI0683 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 41 232 25 213 213 232 56.0 7e-60 MSQSASEHAVHFVTPPLRSGRPYENPALPELPNLWSLDHPDTAPIPDEAACRALWTRYAM FSHIERHSECVAGMAEALARRAVETGATRHPELVALSLAAGLLHDIAKSYTVQFGGSHAQ IGASWVVDSTGNHKVAQAVYHHVEWPWPLPEDLVHPVFFVIYADKRARHDEMVSLDERYE DLLVRYGKSEHSRAAIHRGWEHSKTIERVLSAQLEFPLHESTVVGGRLVSRA >gi|316922513|gb|ADCP01000116.1| GENE 19 18608 - 19303 530 231 aa, chain - ## HITS:1 COG:AGpA742 KEGG:ns NR:ns ## COG: AGpA742 COG2068 # Protein_GI_number: 16119729 # Func_class: R General function prediction only # Function: Uncharacterized MobA-related protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 20 221 18 220 230 78 37.0 8e-15 
MGFGEGRGKLFKSFPSLPQAPRTTMTTTRSPLPAIVLAAGFSRRMGRCKLTLTLDGEPLV RRAVRAALDAGLAPVLVTVRPDASPELLEAVSGFDARVEIVPAPDAHLGQAESLKAGIRR LVSRFACPPPGTVVLLGDQPLVGAELVRELTAFYLQKPECAAAPACDGVRGNPVILPAGA FGDTLLLSGDKGARGILAAFGLRLMPTNDTAAITDVDTWEAYESLSRAKPL Prediction of potential genes in microbial genomes Time: Fri May 13 03:57:56 2011 Seq name: gi|316922480|gb|ADCP01000117.1| Bilophila wadsworthia 3_1_6 cont1.117, whole genome shotgun sequence Length of sequence - 33086 bp Number of predicted genes - 33, with homology - 29 Number of transcription units - 22, operones - 8 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 5 - 1870 1708 ## COG1032 Fe-S oxidoreductase + Prom 1934 - 1993 2.8 2 2 Tu 1 . + CDS 2035 - 3252 1204 ## COG0438 Glycosyltransferase + Term 3307 - 3366 15.0 3 3 Op 1 . - CDS 3144 - 3359 62 ## 4 3 Op 2 . - CDS 3376 - 3993 795 ## COG0778 Nitroreductase - Prom 4107 - 4166 4.1 5 4 Tu 1 . + CDS 4123 - 5397 1170 ## LI0438 PhyA2 + Term 5476 - 5531 17.4 - Term 5555 - 5585 -0.5 6 5 Tu 1 . - CDS 5781 - 6128 662 ## LI0809 hypothetical protein - Prom 6231 - 6290 2.5 7 6 Tu 1 . - CDS 6294 - 6929 961 ## COG0177 Predicted EndoIII-related endonuclease - Term 7069 - 7099 1.4 8 7 Tu 1 . - CDS 7121 - 7354 350 ## - Prom 7391 - 7450 1.9 + Prom 7483 - 7542 1.9 9 8 Op 1 . + CDS 7567 - 7887 336 ## COG1324 Uncharacterized protein involved in tolerance to divalent cations 10 8 Op 2 . + CDS 7887 - 8831 1345 ## COG0524 Sugar kinases, ribokinase family + Term 8980 - 9014 2.3 + Prom 8942 - 9001 1.8 11 9 Op 1 . + CDS 9025 - 9777 904 ## Dvul_2301 hypothetical protein 12 9 Op 2 . + CDS 9789 - 9998 241 ## COG1942 Uncharacterized protein, 4-oxalocrotonate tautomerase homolog 13 10 Tu 1 . + CDS 10524 - 11027 -122 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 11065 - 11110 13.5 - Term 11052 - 11098 9.1 14 11 Op 1 . - CDS 11135 - 11800 646 ## COG1309 Transcriptional regulator 15 11 Op 2 . 
- CDS 11905 - 14802 4224 ## COG1026 Predicted Zn-dependent peptidases, insulinase-like - Prom 14951 - 15010 2.5 16 12 Tu 1 . + CDS 14620 - 14964 113 ## - Term 15204 - 15243 10.5 17 13 Op 1 . - CDS 15324 - 15734 638 ## COG3193 Uncharacterized protein, possibly involved in utilization of glycolate and propanediol 18 13 Op 2 . - CDS 15751 - 16380 722 ## COG0108 3,4-dihydroxy-2-butanone 4-phosphate synthase + Prom 16719 - 16778 3.0 19 14 Tu 1 . + CDS 16809 - 17372 521 ## COG0071 Molecular chaperone (small heat shock protein) + Term 17567 - 17636 20.1 + Prom 17783 - 17842 2.3 20 15 Op 1 . + CDS 17867 - 18970 1428 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 21 15 Op 2 . + CDS 18973 - 20202 1393 ## COG0283 Cytidylate kinase 22 15 Op 3 . + CDS 20205 - 20987 758 ## COG0682 Prolipoprotein diacylglyceryltransferase + Term 20997 - 21061 8.2 23 16 Op 1 . + CDS 21318 - 21536 275 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 + Term 21561 - 21597 7.1 24 16 Op 2 . + CDS 21616 - 22110 518 ## COG3760 Uncharacterized conserved protein 25 17 Tu 1 . - CDS 22361 - 23437 1175 ## 26 18 Tu 1 . - CDS 23545 - 27222 3673 ## Dde_0225 hypothetical protein - Term 27272 - 27322 5.6 27 19 Tu 1 . - CDS 27348 - 28049 1016 ## COG0670 Integral membrane protein, interacts with FtsH 28 20 Tu 1 . - CDS 28152 - 28763 890 ## COG0353 Recombinational DNA repair protein (RecF pathway) - Term 29038 - 29075 6.9 29 21 Op 1 . - CDS 29222 - 29533 485 ## COG0718 Uncharacterized protein conserved in bacteria 30 21 Op 2 . - CDS 29595 - 29945 321 ## Dvul_0189 DNA polymerase III, subunits gamma and tau (EC:2.7.7.7) 31 21 Op 3 . - CDS 29846 - 31579 979 ## COG2812 DNA polymerase III, gamma/tau subunits 32 21 Op 4 . - CDS 31646 - 32569 1446 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase - Prom 32761 - 32820 3.4 33 22 Tu 1 . 
+ CDS 32793 - 33026 400 ## DVU0136 hypothetical protein Predicted protein(s) >gi|316922480|gb|ADCP01000117.1| GENE 1 5 - 1870 1708 621 aa, chain - ## HITS:1 COG:MA4618 KEGG:ns NR:ns ## COG: MA4618 COG1032 # Protein_GI_number: 20093399 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Methanosarcina acetivorans str.C2A # 12 593 121 704 742 459 43.0 1e-129 MFDLTLPARTPLPQPPFLPTTRAEMDAIGWDSLDILLVTGDAYVDHPSFGVPLLGRWLVG HGYRVGIVAQPRWKDEGQGIADLARMGRPRLFAGVSAGALDSMLAHYTAFRKKRHDDAYT PGGEAGSRPNRAVIVYAGLIRKAFPGLPLLAGGIEASLRRITHYDFWADSLRRSILFDAR LDILSCGMGERALLDVARRLDAVAELVGDLSVLEPVDGELWPDLWAGIPGTARLVKTASI PSDAEELDGLELVRLPSHDEMLAVPRAYLDGTVRLERETHQSRRILAQPNGDRTVLLMPP AAPLTTEELDGLYALPFSRRPHPSYKEPIPAVEMIATSITTHRGCGGGCSFCSLALHQGR RIASRSEASILDEAKRIAAMPRGGSISDVGGPSANMWGAACRLDPSKCRRDSCMYPSICK GFSVDQRACIDLLRDVQATPGVKHVRVASGVRFDLALKDATALAAYTGEFTGGQLKIAPE HCVPDVLDLMRKPGMKPFEAFLEAFVRYSEAHGKEQYVIPYLMSAFPGCTDAHMRQLGDW LAARNWSPRQVQCFIPTPGTVATAMYYARCTPDGAPIYVARTDEERRRQHEILLGRPSGQ RPRTQGNRNAFAERERRKRER >gi|316922480|gb|ADCP01000117.1| GENE 2 2035 - 3252 1204 405 aa, chain + ## HITS:1 COG:alr2839 KEGG:ns NR:ns ## COG: alr2839 COG0438 # Protein_GI_number: 17230331 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. 
PCC 7120 # 177 330 183 333 381 79 33.0 1e-14 MARIALILPRLSRYGGVEQFAFRLAEALAETRNSEHEVEFICARSECLPPVGVRTHIVGR PGGLKFIKMLWFLIRAEQVRKRGNYDLVISLGKTWNQDMMRVGGGPQKTFWELSEKAWPA GFSRWFKHLRRRLLPSNWLTRIIDNHQYRSGCRIICVSDAVRHWTQKAYPGIPVPEVIYN LPDLSRFTPPTPEQKLLSRIALNIDNNHVAIATATSNFALKGTGILIRSVAMLPETVHLF IAGGRDSEPYQRLAKKLGVAGRIHFLGKVEDMPALYRAMDLFVLPSFYDACSNAVLEALA CGLKVLSTTANGSSVFLPQEHVTPDPGDAEDLAARIKVLIDEPAPGPFRIPDHIQAGLDA WVQVVNEECDQRTGKSCGMRERKADGTEENEGNGKKPHPSPSQDF >gi|316922480|gb|ADCP01000117.1| GENE 3 3144 - 3359 62 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNGKEKSLNETPFRLFFSRSAASPRKGLPPRGLPANQKSWEGEGWGFFPFPSFSSVPSAF LSLIPQLFPVR >gi|316922480|gb|ADCP01000117.1| GENE 4 3376 - 3993 795 205 aa, chain - ## HITS:1 COG:PAE2336 KEGG:ns NR:ns ## COG: PAE2336 COG0778 # Protein_GI_number: 18313271 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Pyrobaculum aerophilum # 19 202 45 252 274 74 27.0 1e-13 MTKRFIPVLAACLFLMVGTAQAATSLAFPPPDTKGGKPLMQTLSGRATNRSFTAKPLSDK LLGDLLWAAYGVNRPNGKRTIPTAQNRQDLEIYVLRSDGAWLYDAPKHSLEQVSSSDLRN FLAGQGFVREAPVGLVYVTDTQKNSKELFAAMHTGSAYQNVGLFCASVGLHNVVRASYDA EGLASALGLPNSKHILISQSIGWGE >gi|316922480|gb|ADCP01000117.1| GENE 5 4123 - 5397 1170 424 aa, chain + ## HITS:1 COG:no KEGG:LI0438 NR:ns ## KEGG: LI0438 # Name: not_defined # Def: PhyA2 # Organism: L.intracellularis # Pathway: Inositol phosphate metabolism [PATH:lip00562]; Riboflavin metabolism [PATH:lip00740] # 24 380 23 381 441 281 40.0 4e-74 MKWGLSILALCALLAATAPEGGAAEQGGGDAKLLKMVVLSRHGVRSPTQSSETLESWSRK DWPEWPVKRGELTPRGAKLVTAMWEQEAAFLREAGLLPSKGCPEAGTIAVRADRDQRTRV TGEAVLEGLAPGCGFKPIVNETDHPDPLFHPLEAGYCALDPAVVRKEIPVGAIEGLEQSL SGPIGELAAILGPASPEFCRKHQLPEGCTVADVPTRLTLAKDNRTVHLDGKLGTASSAAE IMLLEYGQWDHPAGWGAVDKGALQRLLPVHSTVFDAVNRAPSVAAGRGSELLLDMANALT GRYADPAVNKAKVVVFVGHDTNIANIGGMLGLHWQLPGYAPDEIPPASALVLTLWLQNDV YQLRARMIGQSLDTLHDPAMKGEVLRQDIEVPWCGPYEDGKNCTLTDFELRVRDVLRPEC VRER >gi|316922480|gb|ADCP01000117.1| GENE 6 5781 - 6128 662 115 aa, chain - ## HITS:1 COG:no KEGG:LI0809 
NR:ns ## KEGG: LI0809 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 112 1 112 129 141 61.0 9e-33 MKRLIVALALVASLGMAGHALAMDKKINIFNGAQYDIYTLYLSPTNANDWEENLLKQETL PNGDKVDVEVSRTEKAEAWDVKVTNKAGETMTWIGVPLNKAGQITLLPDGKYEAR >gi|316922480|gb|ADCP01000117.1| GENE 7 6294 - 6929 961 211 aa, chain - ## HITS:1 COG:all3970 KEGG:ns NR:ns ## COG: all3970 COG0177 # Protein_GI_number: 17231462 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Nostoc sp. PCC 7120 # 1 206 9 214 223 241 55.0 6e-64 MTDKQRAAKVLELLAERYPDLETHLMAESPWELLVATVLAAQCTDKRVNQVTPELFRRWP DPAALAQATIPELEEVIHSVGFYHSKAKHLIAAAQLVVKEFNGETPNTMKDLIKLPGVAR KTANVVLWGGFGINEGLAVDTHVKRISGRLGLTKHTDPVDIEKDLVKLFPQSEWGKVNHR MVWFGRHVCDARKPLCDECEMAPFCPKVGVE >gi|316922480|gb|ADCP01000117.1| GENE 8 7121 - 7354 350 77 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSAYGSIVSDVMVKGGTRCRLDTVTYTSPEFDELETLVEVCEPLCETEKPTMQAPVEQGP EPVIFVSPEFDEIETLM >gi|316922480|gb|ADCP01000117.1| GENE 9 7567 - 7887 336 106 aa, chain + ## HITS:1 COG:MTH1509 KEGG:ns NR:ns ## COG: MTH1509 COG1324 # Protein_GI_number: 15679506 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in tolerance to divalent cations # Organism: Methanothermobacter thermautotrophicus # 1 100 1 100 105 97 45.0 7e-21 MALLVYMTAASADEAERIAGDLVESRLAACVNVMAPIRSVYSWKGELCRSEEIPFIAKTD DDRFEALAARVRALHSYETPCIVALPVARGDADFLAWITESTHPEE >gi|316922480|gb|ADCP01000117.1| GENE 10 7887 - 8831 1345 314 aa, chain + ## HITS:1 COG:RSc2791 KEGG:ns NR:ns ## COG: RSc2791 COG0524 # Protein_GI_number: 17547510 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Ralstonia solanacearum # 1 306 1 306 311 215 39.0 9e-56 MSIYVAGSLAFDRIMSFNGAFADHILADKLHILNVSFLIDGLVEKRGGCAGNIAYTLALM GEKPLILATAGKNFSEYGAFLESKGISLEGVRVMKDEFTASCTLITDKNNNQINGFHPAA 
MGFPCEYAFPHPDASADWGIVSPGNLDDMKALPRLFREKGIRYIYDPGQQIPALSGDDLL DAITGSALLVTNDYELEMISKATQRTRAELRALTGGVITTLGEQGSVIDNGERGSVGIAA PKTVADPTGAGDSFRSGLLKGLLHGLDVPASARLGATCASYCIEHQGTQEHVFTYEDFAA RHRAAFHEEPGVKW >gi|316922480|gb|ADCP01000117.1| GENE 11 9025 - 9777 904 250 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2301 NR:ns ## KEGG: Dvul_2301 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 5 245 4 248 256 91 32.0 2e-17 MFKIIIPILVIALAAFVGMNAKSGLIRGVADGMLVSPARPAVAVKPAADFASVDARRVDL SPSVQNSMLQATSVQAVYALYNHEGPQAAPAHLAALLAVSQNDETWPVEPELGFPAIRHK KQDIDGFSGFADTYVLDAKDDPWASDEPAQWENGSLVRRFTFILWFYKAKLIVEYREPLP APSDMPLEDNIPLLTAFEARALESFRLLNGDPKNGGVELPRPKEKLPYPPAGINRQALTD FIGTLWDMKK >gi|316922480|gb|ADCP01000117.1| GENE 12 9789 - 9998 241 69 aa, chain + ## HITS:1 COG:NMA1685 KEGG:ns NR:ns ## COG: NMA1685 COG1942 # Protein_GI_number: 15794578 # Func_class: R General function prediction only # Function: Uncharacterized protein, 4-oxalocrotonate tautomerase homolog # Organism: Neisseria meningitidis Z2491 # 1 69 1 69 69 90 72.0 8e-19 MPYVNIRVTRENGTPTAEQKAELIKGATELLVRVLGKNPATTVVVIDEVDTDNWGVGGET VTRRRRQGK >gi|316922480|gb|ADCP01000117.1| GENE 13 10524 - 11027 -122 167 aa, chain + ## HITS:1 COG:BS_ycxC KEGG:ns NR:ns ## COG: BS_ycxC COG0697 # Protein_GI_number: 16077424 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 46 160 184 297 312 65 39.0 4e-11 MIVCVTAKGFESSFNGIGYIMLCMAVLSYSLYCVSAEKAYKTTSIEKTYVMIACGACFFA TMAIARSIAAGTVVDLFMLPFQNTVFLFSILYLGIASSVLSFVLFNISISSIGTNKASSF VGISTAVSVLAGVFILDEKFSAMQCYGTLIILLGVYIANMKSRFFWR >gi|316922480|gb|ADCP01000117.1| GENE 14 11135 - 11800 646 221 aa, chain - ## HITS:1 COG:RSp0108 KEGG:ns NR:ns ## COG: RSp0108 COG1309 # Protein_GI_number: 17548329 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Ralstonia 
solanacearum # 28 185 31 188 216 94 37.0 2e-19 METKGIAPATPATERRQRTPLAVTRERVLGIAEQMFRQSGVQAVSVDAIAQAAGIKKMTL YRCFPSKEELVMACMDQWEAAFRRIWDQAQDQYPNESARQLLAFFQSIYELVSQPGYSGN VFMHLMTDYTDPEHHICVRVREQRASLRADVRNQLVKAEASDPDQLADTYCLLLEGLFTA ARTHGFDSGPVRNAPAIAEKTLRQGCGSRAFPPEPRRFRRW >gi|316922480|gb|ADCP01000117.1| GENE 15 11905 - 14802 4224 965 aa, chain - ## HITS:1 COG:CAC3006 KEGG:ns NR:ns ## COG: CAC3006 COG1026 # Protein_GI_number: 15896258 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases, insulinase-like # Organism: Clostridium acetobutylicum # 5 946 12 956 976 638 38.0 0 MNTRGFSLVREERLSEVSGTVKLWRHDATGAELLSIVNNDENKCFGATFRTPPKDSTGVA HILEHSVLCGSEKYPVKEPFVELLKGSLQTFLNAFTFPDKTCYPVASANLQDFYNLVDVY LDAVFFPRIDENCFQQEGWHIEADSPAGPLRYKGVVFNEMKGVYSSPDSVLAEHSQQSLF PDMTYGLDSGGNPEVIPQLTYKAFKSFHESHYHPSNTRFFFWGDDPEEQRFALLEPYLSR FTARETDSAVPLQPRLDVPRQLEFPYASGEDGDKGHVTLNWLTCETADTGELLVLEMLEH ILLGLPGSPLRKALIESGLGEDLTGGGLETDLRQTFFSVGLRSITPGTAEDVEMLIMETL AELAENGIPAAAVEAAVNSVEFDLRENNSGRFPRGLAAMIRSLATWLYDGDPIAPLAWEK PLAALKARLASGEKVFEGAIKRWFLDNEHRSTVILTPDSGLAAEREAAEAAKLQRIYDAL SDEDHKEIVACTEALRASQQAPDSPEALAAIPSLTLADLPRENVILPKEEGKAGDLAILA HDIDTSGILYAEILFPLDAVPSELLPLVPLMGRSLTEMGTSKRDFVELGTLLASKTGGMD AAPLVATMRGTRMPVAKLCLGGKATADKADDLFSLMAEVLTDTNFDNPQRFTQMVLEERA RLEQSLIPAGHGTVIARLRAAYSLAGQISEAIGGITYLEAIRALSERVVSDWDSVRADLE ILRGLILNRQDAILNLTADAGTLAAVQPYAAALGRALPTAFSVPLEREPLRAAANEALIV PAQVNYVGKGCNIYDLGYTWHGSAHVITRHLRMGWLWDQVRVQGGAYGAFCALDRMSGSL ALVSYRDPNVEKTLATYDATADYLRKLDLSDRDLTLAIVGAIGDLDTYLLPDARGAASLS RHLTDDRDDLRQQMREEILGTTRRHFTEFADVMAEAAKAGTVCVLGGSAAENAATEHGWT KKKVL >gi|316922480|gb|ADCP01000117.1| GENE 16 14620 - 14964 113 114 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGYAGGVFRGGAERRAEALVLVVVDYGEQFRSGGVVPPQFDGAGNFGKPFLAHEAKAAGI HLGLLILHERLGERARRAGAHFPSGPPFSGGFTPERYSPFTWAAIAARRHYEKE >gi|316922480|gb|ADCP01000117.1| GENE 17 15324 - 15734 638 136 aa, chain - ## HITS:1 COG:SSO1224 KEGG:ns NR:ns ## COG: SSO1224 COG3193 # Protein_GI_number: 15898076 # 
Func_class: R General function prediction only # Function: Uncharacterized protein, possibly involved in utilization of glycolate and propanediol # Organism: Sulfolobus solfataricus # 24 132 29 137 139 57 32.0 6e-09 MNNAIPFRAVLDVLVRLESEAAGGAPVCVAIVNRTGKLAAFLSLDGTPERAGAIAQSKAY TAMRMELTTQAFHDRLVRENLSIADFCDPGFSTLPGGVPVFDETGRCIGGVGISGRKPQE DAELAERLVQKLMGKG >gi|316922480|gb|ADCP01000117.1| GENE 18 15751 - 16380 722 209 aa, chain - ## HITS:1 COG:VCA1060 KEGG:ns NR:ns ## COG: VCA1060 COG0108 # Protein_GI_number: 15601811 # Func_class: H Coenzyme transport and metabolism # Function: 3,4-dihydroxy-2-butanone 4-phosphate synthase # Organism: Vibrio cholerae # 13 206 16 209 218 256 65.0 2e-68 MSQHQFTTTRNERVEAAVAAVRKGRGVLVVDDEDRENEGDIIFSAASITHEQMALLIREC SGIVCLCLPEDKVHQLDLPQMVANNTCKNRTAFTISIEAAEGVTTGVSAADRVKTIRTAI ADDAVPGDLNHPGHVFPLCARSGGVRERAGHTEASVDLMRLAGLPPYGVLCELTNPDGTM ARLPEIMAFSQLHDMPVLQVRDLIEYMEH >gi|316922480|gb|ADCP01000117.1| GENE 19 16809 - 17372 521 187 aa, chain + ## HITS:1 COG:TM0374 KEGG:ns NR:ns ## COG: TM0374 COG0071 # Protein_GI_number: 15643142 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Thermotoga maritima # 32 183 10 145 147 98 39.0 9e-21 METTRNAGSCCGHHGATSAPAHRTAAPAPQGYTPVDQLHNEIDRLFGDFFGGFFSPWRAF SGMAPRSADNAPDMLIPHMDLSVTDTAYKATVELPGVAQDQVNIEVRDNMLIVEGEKKNE TEDKDEKKGYYRMERSYGSFRRVLSLPEDVETDKITATHKDGVLSIEIPRKEPEKPAARK IEVVKGE >gi|316922480|gb|ADCP01000117.1| GENE 20 17867 - 18970 1428 367 aa, chain + ## HITS:1 COG:MJ0955 KEGG:ns NR:ns ## COG: MJ0955 COG0079 # Protein_GI_number: 15669145 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Methanococcus jannaschii # 6 367 10 372 373 213 34.0 3e-55 MSPIPMRSLVRGFKPYTAGLSIDEIREKYGIERVIKLASNENPLGTSPLVQHTLETHVGL AFRYPQAGNPRLVKAIAAHHGVSPSRVIVGDGSDEVIDLLFRICVEPGVNNAVAFMPCFG 
IYTTQAALCGVELRQTPVNADFSFPWDNLLKLVDDKTSLVFVTTPDNPSGYTPPVAELET LAKALPDTCLLVLDEAYMDFTDDEADYSLAKRLDEFPNVIISRTFSKSFGMAGLRLGYAI VPEELADHYWRVRLPFSVNLMAEEAGIAALQDVAFHDETVRVVREGRAWLTGELTKLGCR VFPSQSNFLMFELPADGPNAASLFEDLLRRGIILRPLTSYGLPRNLRVSVGTAEENTMLI NAMRELL >gi|316922480|gb|ADCP01000117.1| GENE 21 18973 - 20202 1393 409 aa, chain + ## HITS:1 COG:PM0802 KEGG:ns NR:ns ## COG: PM0802 COG0283 # Protein_GI_number: 15602667 # Func_class: F Nucleotide transport and metabolism # Function: Cytidylate kinase # Organism: Pasteurella multocida # 6 210 6 212 227 156 47.0 9e-38 MKCLPIITLDGPAGVGKSTLAKRLATILGIPYLDTGAMFRTIALRLGPGAEALPEDELRA RCKAFRFKLQGGGEHSVLLCNGVPVGPEIRTEEVGRLASRLATSTVVRDCLKEAQRSLGE SGLVAEGRDMGTVVFPTARFKFFLDARPEVRGMRRFEELQAKGEPADLAQITEMIRQRDD MDRNRAVAPLKPATDAVIVDTSDLGIEGVLKVLTDTVAASSERIVREVRYCPKDAAEDAV SGMRDAAFSHMAGDGSISMVDVGSKNITRRVAIVRGAVEMNAHTLGLLKEHALPKGDVLV TAQVAGIMAAKRTSELIPLCHPVPLSFVDVRFAIQDEPPAVLIESEARTSDRTGVEMEAI IAAQVAAATIYDMCKAVQKDMVIRDVRLVHKSGGRTGTFERRESSGERT >gi|316922480|gb|ADCP01000117.1| GENE 22 20205 - 20987 758 260 aa, chain + ## HITS:1 COG:PA0341 KEGG:ns NR:ns ## COG: PA0341 COG0682 # Protein_GI_number: 15595538 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Pseudomonas aeruginosa # 1 255 1 252 266 289 59.0 5e-78 MLAYPNISPVALSLGPVQVHWYGLMYLFAFLAGWGLARWRASRPAWIAAGWNGQKVDDLL TWIMLGVVAGGRLGYVLFYDLPVYLDQPSEIFKIWNGGMSFHGGLIGVLLMGWWWSRRNK KKFMDVVDFVAPLVPPGLFFGRIGNFINGELWGGVTDGPLGMVFPTGGPLPRHPSQLYEA GLEGLVLFVVLWVYSSRIRPTGRVSGLFAVLYGVFRFAVEFVRQPDPQLGYLAFGWLTMG QLLCVPLLIAGLWLLFRPVK >gi|316922480|gb|ADCP01000117.1| GENE 23 21318 - 21536 275 72 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 [Streptococcus pneumoniae TIGR4] # 1 72 1 72 72 110 73 1e-23 MAKEDAIEVDGVVQEALPNAMFLVELENGHTVLAHISGKMRKFYIRILPGDRVKVELSPY DLTRGRITYRMK >gi|316922480|gb|ADCP01000117.1| GENE 24 21616 - 22110 518 164 aa, chain + ## HITS:1 COG:PA1841 KEGG:ns NR:ns ## 
COG: PA1841 COG3760 # Protein_GI_number: 15597038 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pseudomonas aeruginosa # 18 157 19 158 165 96 40.0 2e-20 MEDREIRKVLDVLDNAGIRYRMVSHDPVYTIEEMERLHLDADAEIAKNLFLRDARGKRHF LVVLCSCKTVDLKALRGQLGTSALSFASEDRLKRFLGLEKGAVTPLGILNDEARAVEVLF DRDLARLPSLGVHPNRNTATVFLAFSDLESLIRSHGNSVSFVAI >gi|316922480|gb|ADCP01000117.1| GENE 25 22361 - 23437 1175 358 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNLPYMPGVRRVSPAGLPCDPGRVESGVRKRFRFIALLTLAVFMAGGTAASAATIANHMD YPVEGFSLWCQDPQQHWIERKLTGHLDMEQKAAVTLPEDPCFFLEADMGDYLLSFPIKQE LQGEDLLDTSCMPDPTLEVYRKAEPLFIADGEEKDRALERDPDRNPPEVVPIGRLLDTFK GGMSEQECLMYGIPLRRESYNDSQNFSLKTAMVSDGVVWHGIISLYKDAETNIIGLSLTT PLSEGALEKLFSALSGRAYKHVALGDDGPVYMDDPILSPDPAVRKEAIRNMLAQLTVPNA DPYDTHFEPDSPECREKDAKKEQNEAQAACKELSADIRIAPGSDRIELFLQEGEESAE >gi|316922480|gb|ADCP01000117.1| GENE 26 23545 - 27222 3673 1225 aa, chain - ## HITS:1 COG:no KEGG:Dde_0225 NR:ns ## KEGG: Dde_0225 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 46 1202 17 1235 1261 317 26.0 3e-84 MASKQRKRRRKRRAVKKAPPLPPVSAQAEEEKARTRGRRSAKGWRACLFGIGLLCALWTA LAVSLPPTVLRGVVESQLSEAVNAPVTIESLSVNPFTLGIAVRGVRVPYLEETGTPDGAL LTIERLDIRPRLRLFADAAPIQAALRVYNPVLDITYLGNGLFSFSKLLPAPSGADTAPDT PLFALSDLDLKGGTILLRDGPVGMVHTVSDVNFHVPLLRSADLPFAPTLSALVNGTRIDV EGRTEPDAETLRTTFAIHTAPLHMEHFKRYLSDFTPLNLNSGTVALDMLFTLSQPHLGRV ESSLSGTVKVEDAEIATPEGTIVGRLRRGQAALKDFTLSERRIRLDEMELDGLYLRVGRD KAGRIDWVDWLSGTDANPENLSPETPFVVEGADLILRNSDFTWVDASLADTQQIEITGVD GRIAEYSTRPGARTAMRLSFGISTEGVLAVDGEGTLTPPSLNASLWVDDLPLIVLRPLLG ETPLNDILGRIAIKGGVRFGSDGKTPQTPAAPASATSLAVTGAEASVSALSFGRGNSLSA ADRKAGKDAGSPAVRIRALTFNGLHFDTQARSASVKTLSVEAPEVRVTSSGGMGLAIPGA KPSAANPAPALKTRLPEGMFTAALPDAAKAFLSGWKLKAGGFRIAGGRLEHVAAGKAEKL ISSLDLETGPLSGDLTDTISLTLRARGTSSDSLNLKGTLRPVPLRFSASMDAADLPLEWL GPPLRASTNLSPSGRLSANLDATITEEKGKDLGIQASGSLTVRDLRLKDARTEKVYAVLR RLSADTFRFSSASSSFEAKEMLLDLLRMDVALNADKTLDILECVPKKQTGQEPSSPFRFS 
VASLRLQDAALLFRDQAHGSVSAVQDINATVSGLSSSGGLSDIVLTGQIGGAPITLSGSC NPFSTPPAAKLAFTAKGVDLARYSAYTRAYLGYPVVQGRLDLESAFATSGWTFSLDNHIR LEKPVLGPKDTRPGAPDYPVSLGFALLEDLRGNIALDLPISGRLDDAALQVGGLVGKALG GLFTKVVTSPFALLGGIIGLVTPGDPALQVIAFPPGDTRINPAAQGRLKRIAKALEERPR VKIELIGMYEPASDTRGLKRLRVLRKVQARQYAALPAKQRAANPVGATKLSSGEYERFLL HVYKASPAGRKAKGNEEPDIMEQKLQALETVTQADLEALARSRAEEVRAFLLKHGPGLGK RVNIASKGGLPDVRSGTAQVEIQLR >gi|316922480|gb|ADCP01000117.1| GENE 27 27348 - 28049 1016 233 aa, chain - ## HITS:1 COG:YPO1163 KEGG:ns NR:ns ## COG: YPO1163 COG0670 # Protein_GI_number: 16121459 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Yersinia pestis # 19 233 22 236 236 194 53.0 9e-50 MESLRDSTRSTSISVASVFMRHVYQWMTVGLLVTAGTAFFVASSPALLSAIFGNTFGLII LAIAVFAMPLVLSGMISRLSAFAATALFVVYSALMGAFLSSVLLVYTGASVMSTFVTCAA MFGGMSIYGTVTKRDLTGMGSFMMMGLFGLIIAMIVNIFLQSSAMTFVISALGVVIFTGL TAYDTQKIRAFGENAPLDDATAIRRGALLGALTLYLDFINLFLMLLRLMGDRR >gi|316922480|gb|ADCP01000117.1| GENE 28 28152 - 28763 890 203 aa, chain - ## HITS:1 COG:lin2850 KEGG:ns NR:ns ## COG: lin2850 COG0353 # Protein_GI_number: 16801910 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Listeria innocua # 6 202 4 198 198 156 37.0 3e-38 MDHRIPEPLKALVEQLARLPGLGPKSAMRVAMTLLKWPEAETRRLGKGIHDLRDNLRLCS RCGALTETDPCNVCSDKERDDSLLCVVAEWDSMLTLEEGAFYKGRYLILGGLLAPLDNLS AESLELDRLTKRLEEGQVREVVLALGATVEAETTGALVRSLVNRRFPGVTVTRLAQGIPL GAEVKFMDRETLRQSLQYRQEIR >gi|316922480|gb|ADCP01000117.1| GENE 29 29222 - 29533 485 103 aa, chain - ## HITS:1 COG:CAC0126 KEGG:ns NR:ns ## COG: CAC0126 COG0718 # Protein_GI_number: 15893422 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 3 102 13 112 112 73 39.0 7e-14 MRNMNDLVRQAGVMQNKIAKMQQEMAERTVEASSGGGMVKVVATCKQDIVSITIDPKALE GGDVEMLQDLVLTAVNEAVRIGRATMDREVSAITGGIKLPGII >gi|316922480|gb|ADCP01000117.1| GENE 30 29595 - 
29945 321 116 aa, chain - ## HITS:1 COG:no KEGG:Dvul_0189 NR:ns ## KEGG: Dvul_0189 # Name: not_defined # Def: DNA polymerase III, subunits gamma and tau (EC:2.7.7.7) # Organism: D.vulgaris_DP4 # Pathway: Purine metabolism [PATH:dvl00230]; Pyrimidine metabolism [PATH:dvl00240]; Metabolic pathways [PATH:dvl01100]; DNA replication [PATH:dvl03030]; Mismatch repair [PATH:dvl03430]; Homologous recombination [PATH:dvl03440] # 3 115 497 609 622 82 38.0 6e-15 MDFVEFCRKNQSDVTLSLPALTQANGVFSEGVLTLATTSAIQYEQLSAPSRLSELERLAS DYSGKAVSVQVSAPERLHKTEAELKKEFASHPVIKSLTETFGASLIRCIPVEHTRS >gi|316922480|gb|ADCP01000117.1| GENE 31 29846 - 31579 979 577 aa, chain - ## HITS:1 COG:aq_1855 KEGG:ns NR:ns ## COG: aq_1855 COG2812 # Protein_GI_number: 15606894 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Aquifex aeolicus # 1 312 1 312 473 276 46.0 1e-73 MSHASLAARYRPQTFAEVTGQETVKAILSRAAAEDKVAPAYLLSGTRGVGKTTIARIFAK ALNCVHAPTGEPCNQCEQCRRITMGNHVDVVEIDGASNRGIDDARRLREAIAYAPMEGRY KVFIIDEAHMLTRESFNALLKTLEEPPPRATFILATTEAHKFPITIVSRCQHFIFKAIPE AELVAHLTRVLNKEGIPFEENAVRLLARRAAGSVRDSMSLLGQTLALGGDHLTEASTRSV LGLAGQELYERLLNALKAQDCLAVAALTQELLERGVDLGFFLRELTTLWRNLFLIRQAGA AATAALDMPDAEKQRLLDLAPQFDPAYIHAAWQMVLESQRQVLTSLEPSAALELLLLNLA LLPRLVSLETLSRTTVAPASGTPAAPAPSAPAAPPAPASPAPASSSPVSSAPVASVRPVP AAPVSSPSAPAAAPVQPTFSGTQPPSAKPSQTSAIPRAPLRQRPVRRFRPALWRPPPFQP GLPPRKARRRGKQLLRPPRSSLNLPQRSRPQTPRTLWTPLRRRMPSPQDPRKNMLPPCPR PRRHGWTSWSFAARTRATSRCPFRRSHRRTAYFRKES >gi|316922480|gb|ADCP01000117.1| GENE 32 31646 - 32569 1446 307 aa, chain - ## HITS:1 COG:RSc0567 KEGG:ns NR:ns ## COG: RSc0567 COG0115 # Protein_GI_number: 17545286 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Ralstonia solanacearum # 8 303 11 307 309 357 58.0 2e-98 MLQKTEFIWFDGKLVPWDQAQVHVLAHGLHYGTGVFEGIRAYACPDGSSAVFRLPEHSKR 
LVNSAKILGINMPYTADEISKAIVETVVANKLSEGYIRPLAFAGEGDMGVFPGNNPTHVI IAVWPWGAYLGAEALEKGIRIKTSSFARMHVNTLMSKAKAAGNYVNSVLAKMEVKQDGYD EALMLDTNGYVCEATGENFFIVRNGVIKTPPLTAILDGITRDSIIKIARDLGYTVEEQLF TRDEVYYADEAFFSGTAAELTPIRELDNRTIGEGHAGPVTKALQAAYFKAIKGQDPRYGH WLTNYKF >gi|316922480|gb|ADCP01000117.1| GENE 33 32793 - 33026 400 77 aa, chain + ## HITS:1 COG:no KEGG:DVU0136 NR:ns ## KEGG: DVU0136 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 10 77 7 72 72 70 70.0 1e-11 MDQPLPFEIPNVTVLLQPEEQVLEIPRHKVKNVKQLLKYLGIRECTALVARGRELLTPDR AVLPHDNLLVRKVTSSG Prediction of potential genes in microbial genomes Time: Fri May 13 03:59:32 2011 Seq name: gi|316922437|gb|ADCP01000118.1| Bilophila wadsworthia 3_1_6 cont1.118, whole genome shotgun sequence Length of sequence - 49943 bp Number of predicted genes - 46, with homology - 42 Number of transcription units - 29, operones - 12 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 464 - 970 569 ## Dde_0111 zinc resistance-associated protein - Prom 997 - 1056 2.1 2 2 Tu 1 . + CDS 1087 - 1362 103 ## + Term 1385 - 1424 -0.5 - TRNA 1253 - 1329 90.7 # Pro TGG 0 0 - Term 1202 - 1242 4.3 3 3 Op 1 20/0.000 - CDS 1380 - 1877 166 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 - Prom 1939 - 1998 2.6 4 3 Op 2 . - CDS 2125 - 3447 1738 ## COG0823 Periplasmic component of the Tol biopolymer transport system - Term 3509 - 3547 5.2 5 4 Tu 1 . - CDS 3574 - 4257 628 ## Dvul_0279 TonB family protein 6 5 Op 1 . - CDS 4755 - 5741 942 ## LI0693 colicin uptake membrane protein 7 5 Op 2 30/0.000 - CDS 5823 - 6263 637 ## COG0848 Biopolymer transport protein 8 5 Op 3 . - CDS 6273 - 6965 858 ## COG0811 Biopolymer transport proteins + Prom 7171 - 7230 3.2 9 6 Tu 1 . + CDS 7285 - 7680 406 ## COG0735 Fe2+/Zn2+ uptake regulation proteins + Term 7684 - 7737 12.4 + Prom 7692 - 7751 2.2 10 7 Op 1 2/0.000 + CDS 7832 - 8407 810 ## COG1592 Rubrerythrin 11 7 Op 2 . 
+ CDS 8427 - 8612 272 ## COG1592 Rubrerythrin + Term 8661 - 8692 -0.3 - Term 8577 - 8612 3.1 12 8 Tu 1 . - CDS 8669 - 8974 61 ## 13 9 Tu 1 . + CDS 8828 - 9880 933 ## COG0859 ADP-heptose:LPS heptosyltransferase + Term 9887 - 9946 5.0 + Prom 10194 - 10253 5.5 14 10 Op 1 11/0.000 + CDS 10430 - 11380 868 ## COG1740 Ni,Fe-hydrogenase I small subunit 15 10 Op 2 . + CDS 11396 - 12820 1516 ## COG0374 Ni,Fe-hydrogenase I large subunit + Term 12980 - 13006 -0.7 - Term 12860 - 12920 18.6 16 11 Op 1 . - CDS 13154 - 13630 289 ## gi|302861564|gb|EFL84500.1| nitroreductase family protein 17 11 Op 2 . - CDS 13646 - 13777 78 ## gi|302861564|gb|EFL84500.1| nitroreductase family protein 18 11 Op 3 . - CDS 13828 - 14751 518 ## COG2378 Predicted transcriptional regulator - Term 15028 - 15065 6.6 19 12 Tu 1 . - CDS 15097 - 15837 1298 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 15861 - 15920 3.6 + Prom 15987 - 16046 1.9 20 13 Op 1 31/0.000 + CDS 16256 - 17014 1218 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 17043 - 17078 5.6 21 13 Op 2 . + CDS 17098 - 17799 1000 ## COG0765 ABC-type amino acid transport system, permease component 22 13 Op 3 . + CDS 17853 - 19553 1978 ## COG0659 Sulfate permease and related transporters (MFS superfamily) + Term 19692 - 19729 6.0 + Prom 19783 - 19842 3.2 23 14 Op 1 . + CDS 19957 - 21483 1230 ## COG0606 Predicted ATPase with chaperone activity + Term 21547 - 21582 2.1 24 14 Op 2 . + CDS 21684 - 22655 744 ## COG0709 Selenophosphate synthase + Term 22856 - 22908 21.1 - Term 22852 - 22888 9.4 25 15 Op 1 . - CDS 22990 - 23439 706 ## LI0903 hypothetical protein - Term 23455 - 23490 6.3 26 15 Op 2 . - CDS 23513 - 24535 171 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 - Prom 24745 - 24804 2.2 - Term 24832 - 24870 11.6 27 16 Op 1 . 
- CDS 24940 - 25257 529 ## COG2920 Dissimilatory sulfite reductase (desulfoviridin), gamma subunit 28 16 Op 2 7/0.000 - CDS 25485 - 25994 427 ## COG0319 Predicted metal-dependent hydrolase 29 16 Op 3 4/0.000 - CDS 25991 - 28510 663 ## PROTEIN SUPPORTED gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 30 16 Op 4 . - CDS 28473 - 29483 970 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase - Term 29698 - 29733 6.0 31 17 Tu 1 . - CDS 29961 - 30803 909 ## COG0084 Mg-dependent DNase - Term 30893 - 30947 9.6 32 18 Op 1 . - CDS 30965 - 31327 533 ## DVU2212 hypothetical protein 33 18 Op 2 . - CDS 31492 - 32742 1643 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase 34 19 Tu 1 . + CDS 32692 - 32847 58 ## + Term 33023 - 33090 31.8 + TRNA 32999 - 33074 80.6 # His GTG 0 0 - Term 33160 - 33215 9.8 35 20 Tu 1 . - CDS 33286 - 34077 1038 ## COG0708 Exonuclease III - Prom 34277 - 34336 2.7 + Prom 34024 - 34083 1.7 36 21 Tu 1 . + CDS 34287 - 35903 1312 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases + Term 36058 - 36114 2.0 + Prom 36094 - 36153 1.6 37 22 Tu 1 . + CDS 36194 - 38935 2255 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily + Term 39149 - 39188 -0.5 + Prom 39248 - 39307 1.5 38 23 Tu 1 . + CDS 39332 - 39685 530 ## COG3169 Uncharacterized protein conserved in bacteria + Term 39720 - 39755 7.4 - Term 39708 - 39743 7.4 39 24 Op 1 5/0.000 - CDS 39890 - 40531 802 ## COG0602 Organic radical activating enzymes 40 24 Op 2 . - CDS 40724 - 41443 855 ## COG0603 Predicted PP-loop superfamily ATPase 41 25 Tu 1 . + CDS 41638 - 43014 1679 ## COG1207 N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) 42 26 Op 1 . + CDS 43307 - 43531 350 ## 43 26 Op 2 . + CDS 43547 - 43810 205 ## DvMF_1183 hypothetical protein + Term 44037 - 44079 -0.8 + Prom 44024 - 44083 3.1 44 27 Tu 1 . 
+ CDS 44187 - 45746 1969 ## COG1418 Predicted HD superfamily hydrolase + Term 45771 - 45804 5.4 + Prom 46194 - 46253 3.1 45 28 Tu 1 . + CDS 46341 - 48530 2245 ## COG3968 Uncharacterized protein related to glutamine synthetase + Term 48686 - 48728 12.6 + Prom 48887 - 48946 3.3 46 29 Tu 1 . + CDS 49036 - 49878 709 ## COG0253 Diaminopimelate epimerase Predicted protein(s) >gi|316922437|gb|ADCP01000118.1| GENE 1 464 - 970 569 168 aa, chain - ## HITS:1 COG:no KEGG:Dde_0111 NR:ns ## KEGG: Dde_0111 # Name: not_defined # Def: zinc resistance-associated protein # Organism: D.desulfuricans # Pathway: Two-component system [PATH:dde02020] # 51 134 60 143 172 72 44.0 7e-12 MRTYLKAFLFLALTASIACAGANFAQAAETAPAASAGNGMMPPVGTFDNNAAYQALTPEK QAKYNEIVKSAEEAMLPLREKMMAKRLQLDTMASMPNMNEEAIAKAAAEVASLHTQMIKV HDMMADRLANELGISIQRGTCSDGYTPCPMHRGMMRGYHHQGMMMPMY >gi|316922437|gb|ADCP01000118.1| GENE 2 1087 - 1362 103 91 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRARRCGGWQWGRRDRDAAGNGACRDARKEAESGEGENDGQKKRTPAVLESFCVILVGAT GFEPATPWSQTRCATKLRHAPLWLTPPRKAA >gi|316922437|gb|ADCP01000118.1| GENE 3 1380 - 1877 166 165 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 60 163 244 347 347 68 32 7e-11 MNTFKRLALVVTLVVAMGAGFGCAKKQTGADVAPATTADDNGAANAAIERAAQAITDGIV YFDFDKSDIKAESRDMLRQKAELMKAYPSIRVRIEGNCDARGTQEYNLALGERRARAAYE YLVMLGVNPDQMEMISFGKERPAVEGTGPAVWAKNRRDDFRVIAK >gi|316922437|gb|ADCP01000118.1| GENE 4 2125 - 3447 1738 440 aa, chain - ## HITS:1 COG:RSc0735 KEGG:ns NR:ns ## COG: RSc0735 COG0823 # Protein_GI_number: 17545454 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic component of the Tol biopolymer transport system # Organism: Ralstonia solanacearum # 14 436 16 428 432 120 25.0 4e-27 MKPFRLFSQVLLLLGLVMVCAAGSSKAAMQIDIYGPGQNIVNLAMAAPLSTGQPAQGLGK ELDAAIHDNLSFLPFMRLTDPKAVLGGTTLPGYQPPNVDFKRFQLAGADLLVTAGWPQGD KSGSTVELRVYETFSGQFVFGNAYSGVTKEEVQDVADRFCADLMKALTGHGDFFLATLAF 
VKNSGKNKRDVWITKPTGRNLRKITDIPGIAMSPSWSLDGRFIVFSHMDDKSHALGVWDR LTNRVQRIRFPGNTVIGPTFTPGNKVAVSLSTGKNPDIFMLNHAFQKERTLEANGTINVS PTFDAAGTKMAFTSNRMGGPQIFMKDLASGSVSRVTKQGNYNTEPSMSPDGTLIAFSRLT SEGNRIFVQDLTTGTEQQISFGPGNDVLPSFAPDSYFIAFTSNRSGPNQIYLTTRHGGDA KKVPTGSGDASFPRWGAIPR >gi|316922437|gb|ADCP01000118.1| GENE 5 3574 - 4257 628 227 aa, chain - ## HITS:1 COG:no KEGG:Dvul_0279 NR:ns ## KEGG: Dvul_0279 # Name: not_defined # Def: TonB family protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 17 213 19 207 221 91 37.0 2e-17 MPKSIMRLLAAFGFILCSALPGMAEPTGMPLIAGADTAGGYAGDVLVKVLPAWQPPAGAS GIVTISLRIGSDGRPLYCEASRKSGNAALDESACQAVVRAGTLPTPPYGAITEVYLTFMT DPGALKGTAPQGQASAPKERSYAEEIMFRAKPYIQIPQGVHGEYSVELTLRINETGGIEQ MSVSKSSGKAEVDNAVLTGVIREGVIPPRPAGSEPLTMRLLFTLKNN >gi|316922437|gb|ADCP01000118.1| GENE 6 4755 - 5741 942 328 aa, chain - ## HITS:1 COG:no KEGG:LI0693 NR:ns ## KEGG: LI0693 # Name: not_defined # Def: colicin uptake membrane protein # Organism: L.intracellularis # Pathway: not_defined # 2 324 3 316 317 160 40.0 4e-38 MFGSYFFSILLHLALALLIFLWPSSPPVKLDQPMMQISVNMGAPGGNRMASPVLGPQGKP MPTKAAPQPAPAEQAASAVPVAREDTVQPKAEPKKESQPKPKPKPEPKPDEVALAQKKQP KKPKDEEEEEPKEKPKEKPKDDGKKDKKEASENAKKTDDKAKPAKESDKKKDGVDPSKAL ADALNDAKKKAGTSRGTKEKGGKSSVAGALADFQKSAGGAGGGGGGEGDGPGGGGIYDVY MGQVILAVRPNWSMPTYSRANLSVQVNVKLDPNGKVLSCTVARSSGRAEVDASAVNAVIR TKVLPAPPTPDQQELLLTFNTQEMMGRR >gi|316922437|gb|ADCP01000118.1| GENE 7 5823 - 6263 637 146 aa, chain - ## HITS:1 COG:AGl2238 KEGG:ns NR:ns ## COG: AGl2238 COG0848 # Protein_GI_number: 15891232 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 12 130 27 145 152 101 48.0 5e-22 MGSSVGGNKGFVSEINVTPFVDVMLVLLIIFMVTAPMMTEGLDVDLPQTRAVETLPTESD NMILTVRKDGAIFLDSYEVGLDELQDKINLLVKQQNKQLFLQADKDVAYGLVVDVMGRIR EAGIDKLGVVALREDTPAVPEKKGKK >gi|316922437|gb|ADCP01000118.1| GENE 8 6273 - 6965 858 230 aa, chain - ## HITS:1 COG:PA0969 KEGG:ns NR:ns ## COG: 
PA0969 COG0811 # Protein_GI_number: 15596166 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Pseudomonas aeruginosa # 4 218 10 227 231 154 42.0 2e-37 MDLSILSLLAQATLVAKAVLVVLLFMSVFSWALMFRKWMNIRAALKKAADGMDRFNHARD LREAVQSLGGDPDSPLYTVAHEGVTEFNRSKEAGNSEDVVVDNVRRALRQGVDMEMTRLT SSLSFLATSANTAPFIGLFGTVWGIMNTFHAIGAMKSASLATVAPGISEALIATAMGLLV AIPATIGYNTFHGSLGVLETRLVNFAGMFLNRVQRELNAHRAVSRSGQAE >gi|316922437|gb|ADCP01000118.1| GENE 9 7285 - 7680 406 131 aa, chain + ## HITS:1 COG:BH0951 KEGG:ns NR:ns ## COG: BH0951 COG0735 # Protein_GI_number: 15613514 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Bacillus halodurans # 1 130 13 143 145 91 39.0 4e-19 MSQPQTRMTRQRMVILEELRKVKTHPTADELYAMVRTRMPRISLGTVYRNLDFLTESKEI LKLESAGSIRRFDGDTRPHQHVRCRVCGKIGDVIPPVPTPSVEGVSVEGFTITEARVEYE GICEECARKAS >gi|316922437|gb|ADCP01000118.1| GENE 10 7832 - 8407 810 191 aa, chain + ## HITS:1 COG:CAC2575 KEGG:ns NR:ns ## COG: CAC2575 COG1592 # Protein_GI_number: 15895835 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 187 1 191 195 191 52.0 1e-48 MKSLKGTQTERNILTAFAGESQARNRYDYFAGAAKKDGFVQIADIFTETALQEKEHAKRL FKFLEGGDVEITAAFPAGIITGTEANLYAAAAGEHHEYTEMYPSFANIADSEGFSEVACV MRNIAIAEQYHEERFLAFAKNIKEGRVFVRETPVVWRCRNCGCLVTGTHAPALCPACAHP TAHFEVLHNPW >gi|316922437|gb|ADCP01000118.1| GENE 11 8427 - 8612 272 61 aa, chain + ## HITS:1 COG:alr1174 KEGG:ns NR:ns ## COG: alr1174 COG1592 # Protein_GI_number: 17228669 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Nostoc sp. 
PCC 7120 # 15 60 188 233 237 63 56.0 1e-10 MADPKDMYRCPVSNCGYVYDPDRGDKRRKIPAGTRFEDLPEDWTCPVCGASKKNFKPLSE A >gi|316922437|gb|ADCP01000118.1| GENE 12 8669 - 8974 61 101 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGLEQGGKPPAHVKIDLRTGKRVRQRLDQRQRQYRIPQKSRIPYRDPLHKAPPFGQTASP IRENPQFLFRNGTLSRVPCPHKRREKPRQASGLFSMGIRLF >gi|316922437|gb|ADCP01000118.1| GENE 13 8828 - 9880 933 350 aa, chain + ## HITS:1 COG:FN0544 KEGG:ns NR:ns ## COG: FN0544 COG0859 # Protein_GI_number: 19703879 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 4 335 3 317 342 141 30.0 2e-33 MKRIAVWNTAFLGDAVLTLPLIQTLADAFPGSEIDFYVRRGLASLFEAHPALHAVYEYDK RRIAGNNRLSALLKSGRILGGRRYDIWIGAHPSPRSALLARLSGAPLRVGYIGGPVAALC YNRRVSRRFSELHEIERLLELLKPILPPETPRQHWPELTLPSEALEKAAAFRESLRQGDQ EPVLLGLHPGSVWGTKRWLPSGFAEIARRAAARGAHVLVFAGPGEEGVARDVIALSELKG NPLLHDLSCALTLPELAAYLRMLNCYVSNDSGPMHLAWAQHTPVTALFGPTVRELGFFPR GDSAKVFEVSLECRPCGLHGPHHCPKKHHRCMVDIDVEAVWRDVAGKLGV >gi|316922437|gb|ADCP01000118.1| GENE 14 10430 - 11380 868 316 aa, chain + ## HITS:1 COG:CAP0141 KEGG:ns NR:ns ## COG: CAP0141 COG1740 # Protein_GI_number: 15004844 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I small subunit # Organism: Clostridium acetobutylicum # 46 313 40 288 291 199 38.0 4e-51 MELTRRDFVRICTGTVAGIGVSGMFHPFVRDVLAETLTGQRPPVFWVEGQGCTGCTVSLL NNAHPGIAKVLLEIIAMHFHPTIMAAEGQLAFQSMLDKSKEFNSQYVLIVEGSIPLKADG KYCVVGEVDHHEYTVAETTDILARSAAAVVAAGTCSSYGGVPAARGQQTQAVSVSRFLKM RGIPTPVINVPGCPPHPDWMVGTLALLLDAMKRKGTEGGVLEIMRGLDDVGRPKVFYPNT HLTCPYLSFFEDGIFSPFMTDKKGCRFELGCRGPWSGCDSATRKWNGGVNWCIANATCVG CTSPNFPDGMSPFYEN >gi|316922437|gb|ADCP01000118.1| GENE 15 11396 - 12820 1516 474 aa, chain + ## HITS:1 COG:CAP0142 KEGG:ns NR:ns ## COG: CAP0142 COG0374 # Protein_GI_number: 15004845 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I large subunit # Organism: Clostridium acetobutylicum # 1 474 1 449 471 276 33.0 5e-74 
MSKTVIAIDPVTRLEGHLKVEVQVEDGKVADAWITGGMFRGFEAILRGRNPRDASQIVQR ICGVCPVAHATASSLAIEAVCGVEVPENGRIARNLMLAGNYLQSNILHFYHLGGQDYFHG PDTVPFIPRYRNPDLRLSEEQNTLAMDEYIEALEVRQVCHQLVALFGGRMPHLQGILGGG AAQIPDRETILEYAARMKQVRKFVENRYLPLVYTIASRYMDMFEMAHGYKNALCVGVFPL AKKGEQFFNAGAYINGRDEPFDGNRILEDVRYSWFEPAPSGTPLQKSESNPQVDKEGAYS FIKAPLYAGHRVEVGPLARMWINDKPLSPIGQRFFADMFGVRAETFRQIGEDPAFSIMGR NVARVEEVYQTLGMIEYWLHELEPGAQTFALPEVPQAGEGIGFTEAPRGALCHYMRVKNG VIDDYAVVAASMWNCSPRDDAGKRGAVEEALIGVPVPEVDSPVNVGRVIRAYDP >gi|316922437|gb|ADCP01000118.1| GENE 16 13154 - 13630 289 158 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302861564|gb|EFL84500.1| ## NR: gi|302861564|gb|EFL84500.1| nitroreductase family protein [Desulfovibrio sp. 3_1_syn3] # 1 158 50 207 207 258 79.0 1e-67 MNGKEERANLLRVENMTSRDECEQMLDGFGMTDEVQRNMYREAMPRQFSMLYNAGCLILP FFKIREPLLQPSSLSSLNDFASIWCCIENMLLAAASEGLLGVTRIPMAEESEHIKTAVGH PENYVMPCYIALGYPAKNASVPAQKSICAKDKIHINIW >gi|316922437|gb|ADCP01000118.1| GENE 17 13646 - 13777 78 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302861564|gb|EFL84500.1| ## NR: gi|302861564|gb|EFL84500.1| nitroreductase family protein [Desulfovibrio sp. 
3_1_syn3] # 1 43 1 43 207 73 79.0 5e-12 MDVYEAIRARRTIRDFEDRQVGMGTIERIIDAGLKAPTNNHLR >gi|316922437|gb|ADCP01000118.1| GENE 18 13828 - 14751 518 307 aa, chain - ## HITS:1 COG:CAC3494 KEGG:ns NR:ns ## COG: CAC3494 COG2378 # Protein_GI_number: 15896731 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 1 300 1 299 300 209 38.0 8e-54 MHVERLFQIVFLLMSGSCGTAKELAEHCSVSVRTIQRDLDALSLAGIPVFSSRGYGGGIG LLKEFTLDKTFMTEQEQADILHGLQALEGAGYPDGNAALHKLAALFRRKEDRWLRVDFSS WCGSGFGKEKFHRLKEAILAKKVVRFTYFSSENRVSERFAEPLCLLFRERAWYVCVFDRK KSREMVLRASRIRDLHVMEETFERTMQEDPMATPDYSKSYTLQRVVLRIDAECAFRAFDE FPEKDIRPQPDGSFLIDEEMPVNEWLTGYLLSFADGLEVIEPATLRTALKEKIHSMQKKY DKQVSGS >gi|316922437|gb|ADCP01000118.1| GENE 19 15097 - 15837 1298 246 aa, chain - ## HITS:1 COG:CAC0380 KEGG:ns NR:ns ## COG: CAC0380 COG0834 # Protein_GI_number: 15893671 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Clostridium acetobutylicum # 34 245 54 263 272 146 39.0 4e-35 MFRTVLTSLLAVLVLCGSAFAAEKTLIAAANPTWPPMEFLDENKNIIGYDRDIIAAIAEE IGMKSEFRNIAWDGIFASLESGQANVIASCVTITDKRKKAYVFSDPYYEVHQAVVVAKDS AIKKPEDLKGKKVGVQIGTTAIEALRKMNVDIDLRTYDEVGLVFEDMRNGRLDVVVCDDP VARYYASRKEGYKDLMKVAFLTEDVEYIGFVFSKDNAELAAQVNKGLKAIRENGKEKAIR NKWLGE >gi|316922437|gb|ADCP01000118.1| GENE 20 16256 - 17014 1218 252 aa, chain + ## HITS:1 COG:STM0830 KEGG:ns NR:ns ## COG: STM0830 COG0834 # Protein_GI_number: 16764192 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Salmonella typhimurium LT2 # 4 248 9 248 248 235 49.0 5e-62 MKRIAALIMGFVLMTVLVAPASAKELVVAHDTNFMPFEFKGPDGKFTGFDIELWETIAKK LGLSYKFQPMDFNGIIPGLQTGNVDVGIAGMTITPERAKVVQFSNGYYTSGLKILVRDDE KGISKVEDLAGKVVAVKTGTSSVPFMKDFGKAKELKQFPNNDGMFFELLSKGVDAVVFDM 
PVVTAFANSAGKGKAKVVGPLYEGQKYGIGFAQGNEELVKKVNGVLDQMRKDGSYAKLYE KWFGFAPKESDM >gi|316922437|gb|ADCP01000118.1| GENE 21 17098 - 17799 1000 233 aa, chain + ## HITS:1 COG:STM0829 KEGG:ns NR:ns ## COG: STM0829 COG0765 # Protein_GI_number: 16764191 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Salmonella typhimurium LT2 # 5 219 3 218 219 205 55.0 5e-53 MAFDFEPSIILETAPQLLGGMKLTLIITLGGLLVGLLLGVVTGLANTSRSRFLRGIAILY IEAIRGTPLIVQVMFLYFGLPLALGMRIPPDVAGVIAIGVNSGAYIAEIVRGAIQSIDKG QTEAGRSTGLTQAQTMLYIIWPQAFRRMIPPLTNQCIISLKDTSLLVVIGVGELTRQGQE IIAVNFRAFEVWLTVAIFYLAMTLSISAVLRYYERKLMIGRRAVPASPSDPQM >gi|316922437|gb|ADCP01000118.1| GENE 22 17853 - 19553 1978 566 aa, chain + ## HITS:1 COG:CT856 KEGG:ns NR:ns ## COG: CT856 COG0659 # Protein_GI_number: 15605592 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS superfamily) # Organism: Chlamydia trachomatis # 13 565 11 564 567 452 43.0 1e-126 MSSIQPAQHKSPLVPSLFKTIRSGYSLGMLSKDVSAGITVGVVALPLALAFAIASGLTPE RGLYTSIIAGFFMAFFSGSRFPVSGPTGAFVVIIYSIVSRHGYEGLVLTTLMAGILLVIF GFLRLGALVKYIPYPVTTGFTTGIALLIFSSQMKDFFGLPLVDTPPEFFDKWGAYAQNAM DFSPATFGVAVFTLLVILIVRHKIPKVPAPVVAVFLSTLLVWLFSLPTDTIGSRFGTLPT GFPDFTLPYGITFERIRELFPDALTIALLAGIESLLACVVADSMTGDRHNSNMELISQGI GNVASVFFGGFAATGAIARTATNVRAGAHSSISAIVHSLFLVVVVMWLLPLTEYIPLAAL AAVLVMVAYDMSDLRTVRHIFQGPKSDWSVMILTFALTVIFDLTVAVYTGVMLASLLFMR RMSELTGIHTCVSGEEEEAAAHDIPLPDEETVPDGVEIFAINGPLFFGVADRFQSTLDAM ETPPKVFIMYLHNTPAIDMTGIHALEAFLERRQEGCRVLFAAVQEPARRTLQRVGILRAV GEENIYPSLDEALLRAEEILEEEKRK >gi|316922437|gb|ADCP01000118.1| GENE 23 19957 - 21483 1230 508 aa, chain + ## HITS:1 COG:slr0904 KEGG:ns NR:ns ## COG: slr0904 COG0606 # Protein_GI_number: 16331658 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Synechocystis # 1 505 1 507 509 453 47.0 1e-127 
MIVRLACGALQGVDAFRVDLEADFLRKGMPAFVMVGLAEGAVREAKERVFAAMRSCGFTL PAARITVNLAPADRRKAGSAYDLPLAVALLGAAGVLPAERLEGWFLAGELSLSGAIKPVP GVLPLGMLARNEGAKGIVTAPENASEAALVKGLPAYGPATLDEAVRFLMGECELMPAEPP VSDDDPARGILDFADVKGQEHAKRAIEIAAAGAHNLLLIGPPGSGKTMLAKRIPSILPPL EPDEALEVTKIYSVAGMLGGKGLVSERPFREPHHTVSDVALVGGGAYPRPGEVSLAHRGV LFLDELPEYGKNTLDVLRQPLEGGTVTVSRSAHSVTFPADCMLVAAMNPCPCGHATDPRH TCVCSPGLIRRYRARLSGPLLDRIDLHINVPAVPYEELQAGASPVTSARMRERVLAARAV QQARYAEVPYCRTNADLSGSLLEKHCRMGAPERAFLREAVQRLALSARAYTRILRISRTI ADLAGSESIGVPHLAEAIGCRVLDREAF >gi|316922437|gb|ADCP01000118.1| GENE 24 21684 - 22655 744 323 aa, chain + ## HITS:1 COG:Cj1504c KEGG:ns NR:ns ## COG: Cj1504c COG0709 # Protein_GI_number: 15792818 # Func_class: E Amino acid transport and metabolism # Function: Selenophosphate synthase # Organism: Campylobacter jejuni # 14 298 7 284 308 194 38.0 3e-49 MASLPGVPDPEGRVLHGFEHNEDAVILRTPPAGMALVQTVDVLSPLGNNPRLFGQVAAAN ALSDVYAVGGVPWSAMNIAAFPAQDVPLEVFAEILAGGLEKIVEAGAVLAGGHTLEDAEI KYGLSVTGYVDPGAVASNAGLRPGDALVLTKPLGTGVLATAIKAQWNGSGKAEADLYRWA THLNANAAEVLRAMNLKAATDITGFGLGGHLLEMARGSRVVVEIDTASLPFIDNVEAFAS DGLIPAGSYANRRHCSCRTFVSPDVDSLRATLVFDAQTSGGLVLAVPQERVEEALERLAA LGEPGWLVGHVLPLRDGPQLLLS >gi|316922437|gb|ADCP01000118.1| GENE 25 22990 - 23439 706 149 aa, chain - ## HITS:1 COG:no KEGG:LI0903 NR:ns ## KEGG: LI0903 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 36 148 42 155 156 103 42.0 2e-21 MQLIKKSALFALVLSSALICGQQVQAAKPSAAPAQETTLEVKAKLDAFAKSYVARANDTL KNNRQNMSVTKQGKGYVARYTEVDASTMTTEIYPGKGPGCEYVGHIVYLEKVYECTGKTI SEAKTGTFTTPKARRIRELTRYDGKKWIY >gi|316922437|gb|ADCP01000118.1| GENE 26 23513 - 24535 171 340 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 237 336 244 344 347 70 35 2e-11 MKFSRSIVLAVALIFAWATVGFAAHSAEICQKRIESFNYLVDYSGSMMMNFPSVGKTKMA VAKEVIKRVNDMVPELGYQGGLYTFAPYGAVVNQGPWVRSTLAAGVDSLKDNLETFARFT PMGDGIQAHNGIISQMTPRAAVILVSDGESNRGISPIDEVKAIYAANPNVCFHVISVASS 
PEGQATLDAIAALNACSVSVKAIDLLKSDAAVDKFVGDVFCQERVAVVEDVVVLRGVNFA FDKYDLTPEAQGILNEAARIIMEHPNMKVQLLGWTDSIGTDAYNLKLSQRRADAVKNYLV AQGVPASRMIAIGKGKSFRYDNNTEEGRYMNRRTELVFMD >gi|316922437|gb|ADCP01000118.1| GENE 27 24940 - 25257 529 105 aa, chain - ## HITS:1 COG:AF2228 KEGG:ns NR:ns ## COG: AF2228 COG2920 # Protein_GI_number: 11499810 # Func_class: P Inorganic ion transport and metabolism # Function: Dissimilatory sulfite reductase (desulfoviridin), gamma subunit # Organism: Archaeoglobus fulgidus # 1 105 1 115 115 93 44.0 8e-20 MAEVTYKGKTFEVDEDGFLLKFDEWCPEWLEYVKESEGITDLNEDHQKILDFLQDYYRKN GIAPMVRILSKNTGYKLKEVYELFPSGPGKGACKMAGLPKPTGCV >gi|316922437|gb|ADCP01000118.1| GENE 28 25485 - 25994 427 169 aa, chain - ## HITS:1 COG:aq_1354 KEGG:ns NR:ns ## COG: aq_1354 COG0319 # Protein_GI_number: 15606551 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Aquifex aeolicus # 49 147 33 125 150 63 37.0 2e-10 MSPLERFPQEQTDAPSDSGALRLNLSGGGLCWRLPFCRAELHRALSAMLRAAGSGPAELD LVLVRDAGMADYNLRYMGCHGPTNVLSFPIDEQIAGPEDEDVPVQLGSLVFSVDTLHRET LLYGQDPEEHCLRLLAHGLGHLVGYDHGPEMDELCSEMLSAAEAWLTAQ >gi|316922437|gb|ADCP01000118.1| GENE 29 25991 - 28510 663 839 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 [Bacillus selenitireducens MLS10] # 249 732 249 732 750 259 32 2e-68 MMTTRNSPLRINELSRITRTLFRQHHHGVGLFVLIATLFVLSFLAGLQPPMQFRVFMAGE VADSDVVAHRSLTVEDTEAAKIKRNQVAMLQPTVFDLSVTEIAALRQKIFGMLKLINGQE TSPEAQDALRASLAEKLATPVSSDLLAQWMSPEVQEYVLSTGLPWFEDRLREGVVGDVRL ALPSKNGIIVRDVDTKAETLRPDVSDIRDVPAVLAAFTQKLRSAEKLTPQGRRAVLVLFS PLIMPTLTLNREATQALGNAVAQTVEPVYYHIQKGEVVVYQGERVTREKQLKLQSLYQKQ EGLIHSRTVAGTFVFSLILTLGLFMAPSGKPGTPLRKKDFLFISLLLFVFGIAAKGTYML VGKVVQPQDIALVIHFFPVAGAAGLCALIFAARRYCVVGLLISLFACVMLKADLPLFLFF FLSAMLNTWLVIRAQSRQDVVIGIIPLALGQIIIALGSGWLEGFRGGNTFLWLFGIVSLN AFISIAILFAISPIIEMAFRYTTRFRLMELMNLEQPLLQDLMVAIPGTYHHSLVVSNMVE AGAKAVGANSLLCKVAALYHDIGKLAYPDYYIENQFNGPNRHDKLAPAMSALILLSHVKK 
GTELAAQHHLGDEIEDIIRQHHGTGLIKFFYAKAKESGENPRIEDYCYPGPRPQTREAAI VMLADAVEASSRTLTDPTPARIKTHIDTIMKGIFSEGQLDESELTFKDLHKLSESFARIL TGLFHQRIAYPDLNKDKKTASDKPEQAKVQAKPEEKTEQRPDVKRTVAYAASSVMPGSSG VAAPNGPEQQGAAPAAPHTPVPPSANPGPPIDLVYPVVPGPPKPVNKGPLSMKPVGKRP >gi|316922437|gb|ADCP01000118.1| GENE 30 28473 - 29483 970 336 aa, chain - ## HITS:1 COG:BH1361 KEGG:ns NR:ns ## COG: BH1361 COG1702 # Protein_GI_number: 15613924 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Bacillus halodurans # 14 330 9 319 320 295 50.0 8e-80 MQTPASSTAFVETLSFDDPTLANALFGPHNAHLNVIAERTGADLSTRGTTLAVTSDDPAL RERILNLFTQFYGLLRSGHALQSSDLEQGLQLLESDPGSDLRRSFRDEAVLSLPKKTVTA RNAAQRAYLETLRSNEMVFSVGPAGTGKTYLAVAMAVHMLNARRVRRIILTRPAVEAGEK LGFLPGDLAEKVNPYLRPLHDALLDMLGPQKMASLQETGIIEVAPLAFMRGRTLNDAFII LDEAQNTTREQMKMFLTRLGFGSRAAITGDITQIDLPAQGEDSASRSGLVHALHVLKGVK GVAFRHFTSADVVRHPLVGRIVQAYDDYEKLPSSHQ >gi|316922437|gb|ADCP01000118.1| GENE 31 29961 - 30803 909 280 aa, chain - ## HITS:1 COG:BS_yabD KEGG:ns NR:ns ## COG: BS_yabD COG0084 # Protein_GI_number: 16077107 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Bacillus subtilis # 23 273 4 252 255 176 38.0 6e-44 MSKKKDTPRPLPDTLGLPPTGADSHAHLDSEGIIERLPEVMGRARSTGLASIGQVFLGPA AYHANKGTFAGYPGVFFLMGIHPCDGQECTGATLKAMREAFAEDPRLRAVGEIGLDFYWK DCPPFIQEEAFRVQLALAKETDRPVVIHSRDAAKDTLRILEAEGFSGRPVLWHCFSGDAV AFLDRFLANGWHISIPGPVTYPANHDLREAVKRIPPDRLMVETDCPYLSPVPWRGKPNEP ALVAFTAETVALERGMDPAELWTLCGDNTRRFFNAPAKGL >gi|316922437|gb|ADCP01000118.1| GENE 32 30965 - 31327 533 120 aa, chain - ## HITS:1 COG:no KEGG:DVU2212 NR:ns ## KEGG: DVU2212 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 119 1 119 120 189 78.0 4e-47 MDTTKILADLKKRPGFTENVGMVLVHNGVVRGWARADHAPVKSMKVSHDRAKMDAICREM EQQPGIFCIHAEAVEGELKPGDDVLFLVVAGDIREHVKATFSELLDRIKAEAVIKQEFTE >gi|316922437|gb|ADCP01000118.1| GENE 33 31492 - 32742 1643 416 aa, chain - ## HITS:1 
COG:RSc2953 KEGG:ns NR:ns ## COG: RSc2953 COG0766 # Protein_GI_number: 17547672 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Ralstonia solanacearum # 1 413 1 418 421 426 55.0 1e-119 MDKLIIEGGVPLTGVVHVSGSKNAALPILMGSILLDEPVTYTNVPDLRDIRTTFKLLELL GCPNAFENGQVTVTPGALSPEAPYELVKTMRASVLVLGPLLARLGEARVAMPGGCAIGAR PVDLHLSALEKMGARFDLEDGYILGRCRKLNGAHIHFDFPTVGGTENLLMAAALADGETI LENAAREPEVVDLARFLIACGARIEGHGTSVIRIQGVPRLGGCTYRIMPDRIEAGTLLAA AGITDGELLLTDCPFEELDAVIAKMRQMGMEIEPTADGVLARRSCALRGVDVTTQPFPGF PTDMQAQIMSLMCLSEGSGVISETIFENRFMHVQELVRMGADIRLSGHTAIIRGVQRLAG APVMASDLRASASLVLAGLAAKGTTTVQRIYHLDRGYEHIENKLNAVGARIRREAE >gi|316922437|gb|ADCP01000118.1| GENE 34 32692 - 32847 58 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHDAGQRNAAFDDQFVHELSLYVGPEKPGVAGWMENAPREGGTLARSVCGS >gi|316922437|gb|ADCP01000118.1| GENE 35 33286 - 34077 1038 263 aa, chain - ## HITS:1 COG:MTH212 KEGG:ns NR:ns ## COG: MTH212 COG0708 # Protein_GI_number: 15678240 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Methanothermobacter thermautotrophicus # 1 262 1 257 257 258 47.0 7e-69 MDTIRLVSWNVNGFRALSGKPDWDWFASTDADVVALQETKAEPSQIAEEHRSPEGWNAEW LAAQVKKGYSGVAVFSRQKAALEPLAVHRELPDPRYQGEGRLLHIEYPAFHFFNVYFPNG TKDDGRLAYKMGYYDAFLAHAEELRRTKPIVVCGDFNTAHRPIDLARPKANEENSGFLPI ERAWVDRFIAAGYVDTFRHIHGDEPHQYSWWSYKQRARVNNVGWRIDYFFVSEELAPAIR DAWIENNVYGSDHCPVGLELAIA >gi|316922437|gb|ADCP01000118.1| GENE 36 34287 - 35903 1312 538 aa, chain + ## HITS:1 COG:mlr3017_1 KEGG:ns NR:ns ## COG: mlr3017_1 COG0737 # Protein_GI_number: 13472651 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Mesorhizobium loti # 9 535 12 533 656 335 38.0 2e-91 MKKIRLPLLSLLLLCFAAPSQAFELTVLHTNDVHSMYGGTTEKGTACYAAQCAGGSGGSV RLKQAVDTVRAAEPNVVLLDAGDEFQGTLFYTQFKGDVAAEVLDALDYTAFTPGNHEFDD GCGEFRRFVERTHVPVLAANLTLPPVPGGKPLTRPWIVVERQGRKIGIVGLVNEETPSLA 
SPCKEAVFGPAETALREAVASLRAQGVNIVIALTHLGLNVDCELAGRVDGVDVFVGGHTH SLLSNTNPKAVGPYPIVKHSPSGEPVLVVTAASSCKLLGHIAIDFNDAGTAQRWNGEPIV LDGRNVTVPPDAKLSARLDSYAAQLRSLIGQPVGKILLAGDTSGRQVDLEEDAHLCRVQE CPSGDVLMDSLLWGARDTGATVALSVGGTVRSPLHTGVVTMGDLLATMPFDNTLVVGDLT GKQLLGALEHGASGYENGAGRFLQVAGLTYSVEPAKPVGQRVSAVSVRTKAGGWEPLRPE AVYRVATVDYAAEGGDGFAAFKKIKWQYTGRAHIECLRDYIAKAPVEVRSGGRITVLR >gi|316922437|gb|ADCP01000118.1| GENE 37 36194 - 38935 2255 913 aa, chain + ## HITS:1 COG:VCA0802 KEGG:ns NR:ns ## COG: VCA0802 COG1368 # Protein_GI_number: 15601557 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Vibrio cholerae # 332 814 65 555 657 229 34.0 1e-59 MTKRMIAAQLSVGALLLVLVALMESYTGWDTAAQRLWFDSATHEWVVSNELHARLAWFFY DGPKILLVVLGIACVAGVLGGARWNLPPECRRGCLLLLLSLAFVPMLLGGAKQFTNVYCP KQIEEFGGEYVHQGVLECRNPANEGRSPGRCFPAGHASGGFALMMLFFCFRSRRDRWAGL GAGLIAGWGMGFYQMLRGQHFLSHTLFTMIGAWMIILLVTWALRGFSLNKLVSINICPAV LPRLSRNRNSSCVTTRSPNRIFSFKRGFIMYAFLDAVRYLVRRLLPFIGIYFFAELTELS ILALRESSNLHLSLKGFLVSFPVWVGTTMVSCLFSILPVLAYLLLLPRKWHGGRWDRRLS ILFFFLFTAGHLFEEVAELLFWDEFTSRFNFVAVDYLVYTNEVIGNISQSYPVALFLGGI TVAAGVITLLARRWLSTVRTVPRLLMRFAGAALLVLCACSLNMVNFMDISEDTGDRYLTE LSKDGLYSLFHAFFSNELSYNDFYLTRPDADTVATLAPLMASDARRVGDPASLAYEVAPH EKEIRANVVIVLMESMGSEFFSEFRDDGQKLTPELEKLASESLYFSHVYSTGTRTVRGIE ALTLARPPLPGMPIVRLQGNDNLRGIWSVFRERGYDTKWIYGGYGYFDNMNAYFSGNGFT VVDRTVMQPEEITFSNIWGVCDENLFARAIKEADASHAAGKPFFNFVLTTSNHRPYTYPD GKISIPSKSGRNGGVMYADYSIGKFMEEARKHPWFDDTVFVFVADHGASSSGREEIKQGN HHIPLIIYAPKFIKPERHDQPISQIDAVPTLLSLLHFKYTGEFYGTNALDRDYVSRLFLS NYQKLAYVKGNEMVIMRPVRGVHFYRDGQQIGSAEAAKPQDRVKAPDASLQQVLDEGISY YQHSARWREFLKE >gi|316922437|gb|ADCP01000118.1| GENE 38 39332 - 39685 530 117 aa, chain + ## HITS:1 COG:XF0449 KEGG:ns NR:ns ## COG: XF0449 COG3169 # Protein_GI_number: 15837051 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 1 116 1 116 116 114 56.0 3e-26 
MNLPVPVPVATIVLLILSNVFMTFAWYGHLRYKALPLLAAIVISWGIAFFEYLLQVPANR VGYGYFSAAELKAIQEIISLSVFAFFSTLYLGESLRWNHLVGFSLIVLAAFFIFKKW >gi|316922437|gb|ADCP01000118.1| GENE 39 39890 - 40531 802 213 aa, chain - ## HITS:1 COG:RSc1449 KEGG:ns NR:ns ## COG: RSc1449 COG0602 # Protein_GI_number: 17546168 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Ralstonia solanacearum # 1 213 1 212 212 249 58.0 3e-66 MAYRVKEIFYTLQGEGAQAGRPAVFCRFSGCNLWSGRPEDRATAKCRFCDTDFVGADAGV FATAEELAQTIAATFPVLAPQAYGGRKPYIVFTGGEPALQLTRELIDRLHALGFELGVES NGTLPLPEGLDWITVSPKGSNPLATTSGHELKLVWPQQGCSPEDFEDLDFRHFLLQPCDD PRNKANTRECIAYCLLHPRWSLGLQTHKWVGVR >gi|316922437|gb|ADCP01000118.1| GENE 40 40724 - 41443 855 239 aa, chain - ## HITS:1 COG:CC3160 KEGG:ns NR:ns ## COG: CC3160 COG0603 # Protein_GI_number: 16127390 # Func_class: R General function prediction only # Function: Predicted PP-loop superfamily ATPase # Organism: Caulobacter vibrioides # 1 235 1 232 242 287 59.0 1e-77 MTSNTDSSALVIFSGGQDSATCLGWALNRFDRVFTVGFDYGQRHHVEMECRRNVLNAVRS DASLPASWAARLGEDTLLSLGLFNQIGETAMTSDMEIAFNEQGIPNTFVPGRNLVFLTAA AALAWRKGIRHLVMGVCETDFSGYPDCRDDTVKAMQVALNLGMDARFVVHTPLMWLDKAR TWTLAEQEGGAPFVECVRTVTHTCYIGDRGELHPWGYGCGQCPACKLREHGWKEYSEGK >gi|316922437|gb|ADCP01000118.1| GENE 41 41638 - 43014 1679 458 aa, chain + ## HITS:1 COG:PA5552 KEGG:ns NR:ns ## COG: PA5552 COG1207 # Protein_GI_number: 15600745 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) # Organism: Pseudomonas aeruginosa # 14 455 6 443 454 345 45.0 7e-95 MSNSSALSSSCCALILAAGKGTRMHSKKPKVLHTILGEPLLGHVAGALRPLFGEAVWAVI GHEADMVRTAFAGRDLRFVEQKEQLGTGHALMVALPELKRAGMKKVLVVNGDTPLITTDT LRDFMFYAEGADVSVATLTLENPGAYGRIVRQNGELRAIVEAKDFDVAVHGEPTGEINAG IYMLNLEAVEILVPKLGNENKSGEYYITDLVGMAVAEGMVVRGLSCGSDPNLLGINNPAE LAASEELRRRAIVEERLAAGVAMHGPDMVRMGPYVTVEPGAELFGPCELYGHTHIASDVI 
IESHCVIRDSRVESGTVVHSFSHMDHAEVGPDCLVGPYARLRPGAVMERGAHMGNFVEMK KARLCEGAKANHLTYLGDAEVGARANIGAGTITCNYDGVNKYKTVIGEHAFIGSNTALVA PVTVGAEALVGAGSVITKDVPDGDLAVARGKQMNVHRK >gi|316922437|gb|ADCP01000118.1| GENE 42 43307 - 43531 350 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MELFDLLETRVVELLGQIDTLREENKTLQEKVNALDEENKTLKESLEQEQHTKEEINGRI DTLLSTIREFTAEA >gi|316922437|gb|ADCP01000118.1| GENE 43 43547 - 43810 205 87 aa, chain + ## HITS:1 COG:no KEGG:DvMF_1183 NR:ns ## KEGG: DvMF_1183 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 86 1 86 87 78 51.0 8e-14 MHSFQLNVLGMDISFRTDADSARIERAQDFVEKQYERLKNQGGQFGREKLLTLLVLGVAD DLLQTQQQLDGVETRLANLLELIEKTD >gi|316922437|gb|ADCP01000118.1| GENE 44 44187 - 45746 1969 519 aa, chain + ## HITS:1 COG:CAC1816 KEGG:ns NR:ns ## COG: CAC1816 COG1418 # Protein_GI_number: 15895092 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 7 519 9 514 514 451 52.0 1e-126 MEFFTALSVLAAICVSGGVGFYLNKHLNAKRIGDATELATRIVEESRKEAQAQKKEILLQ GQDELFNQKRELEHEFKERERELKARDRKLQEQGERLEEKLEKATEKEHELLSVEKDLSK KERKLAELEETLNERIDEQEHRLQEVSGLTAEEARQRLFAEIESRTRHEAAKMMRLIEAE ARETADRKAKEIIACSIQRYAGDYVGEHTVTAVTLPSEDMKGRIIGREGRNIRALEAATG VDLIIDDTPETVILSAYSPLRRQVAKMALERLIQDGRIHPARIEDVVHKCEQELEVQIRE VGEQATFDAGVHGIHPELVRFLGQLRYRTSFTQNVLQHSLEVSALCGMMAAELGMDIKKA KRAGLLHDIGKAVDHEVEGPHALIGADLAKKYNESKEILHAIAAHHEDQRPETALAVLVQ AADSLSGARPGARKELLESYVKRLEDLENIASEFDGVSKAYAIQAGREIRVMVNSENVDD DQTYMLCKGITGKIEENLTYPGQIRVTVIRERRAVGYAK >gi|316922437|gb|ADCP01000118.1| GENE 45 46341 - 48530 2245 729 aa, chain + ## HITS:1 COG:slr0288 KEGG:ns NR:ns ## COG: slr0288 COG3968 # Protein_GI_number: 16331104 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Synechocystis # 21 728 18 723 724 702 49.0 0 MSSTVRETALDHISTAQPVAPLPAPEQGKALANGYGCDVFNTRVMRQRLPREVYDRLMRT 
MTHRTPLDPADADIIAGAMKDWALEHGATHYTHWFQPMTGLTAEKHDAFLSPTADGQVFG EFSGKMLIKGEPDASSFPSGGIRSTFEARGYTAWDPTSPVFLLGDEYGRKTLYIPTVFYS YTGEALDRKTPLMRSVEAVSAQALRILRLLGNTTATGVQAMVGSEQEYFLVDRRLYVQRP DLMMCGRTLYGAKPAKGQEMEDHYFGSIPSRVLAYMQDVEQRLFALGIPAHTRHNEVAPG QFEIAPVYEDVNIATDHNMLTMEVMRRTARRHGYVCLLHEKPFAGVNGSGKHNNWSLCDS DGHNLLNPGETPMKNAQFLVFLAAVLRAVHKHGINLRLGIIGASNDHRLGANEAPPAVLS VYLGDSLSAIIRAIAYGKEAAGGCSEPLQIGVSVLPNIPRDLSDRNRTSPFAFTGNKFEF RALGSSQNIATANISLNAAMACALDDIASMLEAELAQGTPLNAAIQSLLAKLFAEHMPIV FDGNGYSDEWLAEAEKRGLPNLKDTVAALAHYSDKDVMAVFERHGVLSPREMLSRQEILL ENYTHSVSIEGHTALKLGRTRILPVALAYQTRLAKAASSAAALVDDAAEEKAYFVRVREQ VRGLMSALDTLEAAVNGDFGGTALARAAYARDTVLPAMNACRNHADALECLCASHDWPLP SYAELLWLH >gi|316922437|gb|ADCP01000118.1| GENE 46 49036 - 49878 709 280 aa, chain + ## HITS:1 COG:slr1665 KEGG:ns NR:ns ## COG: slr1665 COG0253 # Protein_GI_number: 16332245 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Synechocystis # 1 277 3 276 279 284 50.0 1e-76 MRFTKMQAAGNDYVYMDTLKAPISSPEALARRISDRHFGVGSDGLILICPPSDPGEADFR MRMFNADGSEAEMCGNGIRCVGKYVYDNGHTDKEQVRIETLAGVLTLRLHVSGEKVETVT VDMGEPRLAPKDIPVSGPGEDFINRPLRVQGRVWNVTAVSMGNPHAVVFMHGIDNLDLSG IGPQFEHHPLFPRRTNTEFVEVVGRHHLKMRVWERGAGETLACGTGACASVVAAVLTGNT ERKVRMKLLGGELDIEWDEATNHVFMTGGAVTVFTGEWLG Prediction of potential genes in microbial genomes Time: Fri May 13 04:01:21 2011 Seq name: gi|316922421|gb|ADCP01000119.1| Bilophila wadsworthia 3_1_6 cont1.119, whole genome shotgun sequence Length of sequence - 18065 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 11, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 120 - 1370 1200 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Term 1512 - 1554 3.0 + Prom 1398 - 1457 3.4 2 2 Op 1 6/0.000 + CDS 1683 - 3008 1151 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase 3 2 Op 2 . 
+ CDS 3005 - 3787 530 ## COG0287 Prephenate dehydrogenase
+ Term 3897 - 3941 4.0
+ Prom 3944 - 4003 6.0
4 3 Tu 1 . + CDS 4127 - 4615 312 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases
+ Term 4660 - 4713 17.2
- Term 4648 - 4701 17.2
5 4 Op 1 15/0.000 - CDS 4793 - 6091 1603 ## COG0247 Fe-S oxidoreductase
6 4 Op 2 . - CDS 6095 - 7480 1956 ## COG0277 FAD/FMN-containing dehydrogenases
- Prom 7636 - 7695 1.5
+ Prom 7616 - 7675 2.2
7 5 Op 1 . + CDS 7812 - 9440 527 ## PROTEIN SUPPORTED gi|225086089|ref|YP_002657117.1| ribosomal protein S15
8 5 Op 2 . + CDS 9430 - 10254 689 ## COG2720 Uncharacterized vancomycin resistance protein
+ Term 10371 - 10418 11.7
- Term 10362 - 10403 12.5
9 6 Tu 1 . - CDS 10436 - 12148 1380 ## PROTEIN SUPPORTED gi|225873902|ref|YP_002755361.1| ribosomal protein S1
- Prom 12281 - 12340 4.1
- Term 12298 - 12356 10.8
10 7 Op 1 . - CDS 12518 - 12985 474 ## COG0328 Ribonuclease HI
- Term 13106 - 13134 2.1
11 7 Op 2 . - CDS 13170 - 13796 815 ## COG0125 Thymidylate kinase
12 7 Op 3 . - CDS 13781 - 14848 1141 ## COG3481 Predicted HD-superfamily hydrolase
- Prom 14920 - 14979 1.6
+ Prom 14810 - 14869 4.5
13 8 Tu 1 . + CDS 15038 - 15790 828 ## COG0496 Predicted acid phosphatase
+ Term 15853 - 15888 3.0
14 9 Tu 1 . + CDS 15927 - 16850 1152 ## COG0191 Fructose/tagatose bisphosphate aldolase
+ Term 16866 - 16914 6.0
15 10 Tu 1 . - CDS 16909 - 17133 64 ##
16 11 Tu 1 .
+ CDS 17011 - 17994 1448 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase Predicted protein(s) >gi|316922421|gb|ADCP01000119.1| GENE 1 120 - 1370 1200 416 aa, chain + ## HITS:1 COG:MTH52 KEGG:ns NR:ns ## COG: MTH52 COG0436 # Protein_GI_number: 15678081 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanothermobacter thermautotrophicus # 1 407 1 407 410 574 65.0 1e-163 MAQINDDYLKLPGSYLFADIAHKVNAFKENHPEMELIRLGIGDVTRPLPPSVIKALHKAV DEQATAEGFRGYGPEQGYRFLREAIAQGDFASRGVEIDPDDIFVSDGAKCDLGNFQELFG RGNVIAVTDPVYPVYVDSNVMGGRAGTFDGKGSWSGIVYLPCSAENGFVPSLPLKRPDVI YLCLPNNPTGTALSRPELQRWVDYARGNQCLILFDAAYEAYIREDGIPHSIYELEGAKEV AVEFRSFSKPAGFTGLRCGYVVVPETVKARAADGRMLPLKPLWNRRQTTKYNGCPYIVQR AAEAVYSPEGRQDVRDNVGYYMENASTIRGGLQAAGLDVYGGVNAPYIWLKTPGGMDSWR FFEALLNRFGIVGTPGVGFGPSGEGYFRLTAFGSHENTRKAMRRIADAGHWSEWKI >gi|316922421|gb|ADCP01000119.1| GENE 2 1683 - 3008 1151 441 aa, chain + ## HITS:1 COG:all5019 KEGG:ns NR:ns ## COG: all5019 COG0128 # Protein_GI_number: 17232511 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Nostoc sp. 
PCC 7120 # 3 433 16 417 425 228 32.0 2e-59 MITVQAPASKSVSHRMVMGAALAQGDSVVSRVLESKDLERTMAILRGAGAGIVRTGEGEY AISGVGGQPHGGSVDPLSCDVHESGTTCRLLTAVLAAGMGRFRVHGAPRMHERPIGELTA ALETLGVSFTFEGKPGFPPFVLKTCGLDGGEVGIGMDESSQYLSGVLLAAPLARAPLTVN IGGSKVVSWPYVGLTLQALENFGVPFSVERKEGGAWSAVDWHTLEQAEPGNVRFRMVPAM YRAGRYAVEGDWSGASYLLAAGAIGPRPVRIEGLRADSLQGDRVMLDILRDMGARIDIEP DAVTVHPSELRGVVADMSRCPDLVPTVAVVAAHASGPTRLWNAAHLRIKECDRIAVPAQE LSKVGVRCDEHDDGLTIHGDPALASRLHSLDGIAFSAHGDHRIAMSLALLELRGGRLTLD DPSCVSKSFPNFWECWGQVRV >gi|316922421|gb|ADCP01000119.1| GENE 3 3005 - 3787 530 260 aa, chain + ## HITS:1 COG:HI1290_2 KEGG:ns NR:ns ## COG: HI1290_2 COG0287 # Protein_GI_number: 16273204 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydrogenase # Organism: Haemophilus influenzae # 15 216 1 204 275 90 30.0 3e-18 MSGNTAESGFTPRRLVIVGCRGRMGTLLSARWSAAGHTVAGLDLPLTDEAFAEALPGADA VFLCIPAGAMAEVLPHLVPHLDGRQILADITSVKMQPLGQMERAYAGPVVGTHPLFGPKP QPSDLRVCITPGAAATDTHIGLVEGLFKDMGCSTFRSTAEAHDSAAASIQGLNFISSLAY FATLAEHEELLPFITPSFRRRLEASRKLLTEDAPLFEWLFEANPMSQESIRQYRSFLNVA AGGDVNVLVQRAQWWWKEAR >gi|316922421|gb|ADCP01000119.1| GENE 4 4127 - 4615 312 162 aa, chain + ## HITS:1 COG:CAP0111 KEGG:ns NR:ns ## COG: CAP0111 COG0454 # Protein_GI_number: 15004814 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 160 3 162 162 183 50.0 1e-46 MRIRTATQDDLDAICRVEAICFPASEAGTRESFAARLKVFPRHFLLLEDEGRLIGFVNGM VTDDRTISDVMFEQAELHKEDGKWQSVFGLDVLPEHRRKGYAGQLMRALIEHSREDGRCG CILTCKEHLLPYYTRFGFRNLGVSLSQHGGAVWYDMILEFSR >gi|316922421|gb|ADCP01000119.1| GENE 5 4793 - 6091 1603 432 aa, chain - ## HITS:1 COG:DR1730 KEGG:ns NR:ns ## COG: DR1730 COG0247 # Protein_GI_number: 15806733 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Deinococcus radiodurans # 20 426 23 416 425 178 29.0 2e-44 MDELTLLANALMALDDKLAACMKCGMCQSVCPVFGETMMEADVTRGKLTLLEKLAHRLIE 
DPDAVNEKLNRCLLCGSCQANCPSGVSILEIFMEARAIVAAYKGLSPLKKAIFRTLLPNP KLFGFLLKIGAPFQGLAMRDDNNAQGTATCSPLLRPFLGERHMQPLAKQTLSAKIGAVNS PAGKSRCKVAFFPGCMGDKIFTNVSEACLKVFDYHEVGVYLPTNYSCCGIPALSSGDLEG FEKMVMHNVDVLKEGSFDLIVTPCSSCTETIRSLWPKMAGKMPYKYRDAINELSQKAMDI NAFLVDVLKVKPRASGRHGQTKVTYHESCHLLRSLGVSAQPRELIRMNPDYDLVEMKEAD RCCGCGGSFTLSHYDLSRKIGQRKRDNIVASGAEVVATGCPACMMQLSDMLAHNGDSVTV KHTIEIYADSLK >gi|316922421|gb|ADCP01000119.1| GENE 6 6095 - 7480 1956 461 aa, chain - ## HITS:1 COG:BS_ysfC KEGG:ns NR:ns ## COG: BS_ysfC COG0277 # Protein_GI_number: 16079920 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Bacillus subtilis # 15 458 15 457 470 420 49.0 1e-117 MPSASLVKEFENLIGKENVFTSEADRQSYAYDAAVLPSVVPGLVLRPTSTEQLGALIERC YANGLPMTIRGAGTNLSGGTVPDADKSVVILTQGLNKIIEINEEDLFAIVEPGVVTAQFA AAVASKGLFYPPDPGSQAVSTIGGNIAENAGGLRGLKYGVTKDYVMGLEFYDHTGELVKS GSRTVKCATGYNLGGLLVGSEGTLGMISRAILKLVPPPQASKAMMAVFDNVQKASEAVAR IIAAHILPCTLEFMDNMTINLVEDDVKIGLPRDAQAILLIEVDGHAGQVADEALIVEEQL KKSGATQIVVAKDAEEKNRIWEARRKALPALARFKPTTILEDATVPRSKIPAMMKVIKGL GEKYNLTVGTFGHAGDGNLHPTFMCDKRDKDEFERVERAVDELFNTTIEMGGTLSGEHGI GTAKQKWLEKETSRGTIGYSRRLRKAIDPKGLFNPSKLVGV >gi|316922421|gb|ADCP01000119.1| GENE 7 7812 - 9440 527 542 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225086089|ref|YP_002657117.1| ribosomal protein S15 [gamma proteobacterium NOR51-B] # 16 531 26 510 517 207 32 4e-53 MFAPLPSPAEMSGWDRAAIDLGLPELLLMENASREALHVLSAETGQMPGKRVLLFMGGGN NGGDAACLARHLHDGGAEVLVLHTRPLGAYRGVTGQHVRLARRCGVAFAPAAGWPEKYRN TPWDVYADASVCERGGPDIVVDGILGTGFSGSLRPLELGLVARINALAQRAFIFSLDIPS GLSGLTGRPCPVAVRAHATVTFEAAKPGLVLPEAAPYTGRLHIRPIGIPAMARREHPASY QMLTKAIAGAFPEAAPGWHKGTAGAILIVGGSEGLTGAPHLAARAALRAGGGLISVAAPH GLCRDIKADCPDIMTRALGPAGNVRWSPALLEELLPFLRQCGAMVLGPGIGREPETAAFV QALLMCPSRPSAVVDADALHALAMHPEALSSLRACDVLTPHPGEAATLLHTTPALVQADR FAALAGLRRLAPSVWILKGAGTLIGTEGQPVTIAPYAEPNLAVGGSGDVLSGCLGTLMAQ TAAHPSEGCESSSFLAACMGVHLHARAGRLLREAFPQRGNSATDIAEMLPRARIREAAGD GA >gi|316922421|gb|ADCP01000119.1| GENE 8 9430 - 10254 
689 274 aa, chain + ## HITS:1 COG:PA0749 KEGG:ns NR:ns ## COG: PA0749 COG2720 # Protein_GI_number: 15595946 # Func_class: V Defense mechanisms # Function: Uncharacterized vancomycin resistance protein # Organism: Pseudomonas aeruginosa # 11 270 10 270 273 199 44.0 4e-51 MARKLFCDLCPLTYELSRWKGIAFRHLQDLRCSSPFARSRREEPLPVLAYRHASLIRRRL GNVNMRLQENKAVNLRLAAPKVSGVLIRPGEVFSFWRLVGATSARKGYREGLMIKRGQPS QGIGGGLCQFTNLLHWLVLHSPLRIVEYHHHDGVDLFPDCGRQIPFGIGTSISYNYLDYR FQNPTGTTFQLLVWTTDTHLCGELRADADLPLSWHIRSEEEAFVREGDRVYRNNVIAREC VDKRSGNLLSRETLKRNHALVLYEVPEDRIMEEI >gi|316922421|gb|ADCP01000119.1| GENE 9 10436 - 12148 1380 570 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225873902|ref|YP_002755361.1| ribosomal protein S1 [Acidobacterium capsulatum ATCC 51196] # 1 564 78 633 637 536 45 1e-152 METMENHGMDAEFNFESALEDYLSTDFGDLEEGTIVKGEIVRVDDDNVLVDVNFKSEGQI PTAEFRDAEGNLVVKVGDRVDVFVARKNEQEGTITLSFEKAKRMQLFDQLEDVQEKNGVI KGRIMRRIKGGYTVDLGGVEAFLPGSHVDLRPVPDMDALVNQEYEFRVLKINRRRSNVIV SRRVLLEEERDSKRQDLLQTLAEGQVVTGKAKNITEYGVFVDLGGLDGLLHITDMSWKRI RHPREMVTLGQDLELKVLSFDKDNQKVSLGLKQLVPDPWQDITARFPEASRHNGKVTNLV DYGAFVELEPGVEGLVHISEMSWTRKLRHPSQMVRQGDEVEVVILGVDPEKKRISLGMKQ IKPNPWELVGEKYPEGTILEGVIKNITEFGMFIGIEDGIDGLIHVSDISWTKKIRHPNEL FKVGDTVQAKVLTVDQESEKFTLGIKQLTEDPWTNVPTAYPVGGLVKGIITNITDFGLFV EVEEGIEGLVHVSELSNKKVKTPAELYKEGEEIQAKIIHVSAEDRRLGLSIKQLKDEEER KKPREYSRSGPEAGQSLGDLLKQKFEESEN >gi|316922421|gb|ADCP01000119.1| GENE 10 12518 - 12985 474 155 aa, chain - ## HITS:1 COG:YPO1081 KEGG:ns NR:ns ## COG: YPO1081 COG0328 # Protein_GI_number: 16121382 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HI # Organism: Yersinia pestis # 2 155 3 154 154 187 59.0 8e-48 MRNVTIFTDGSCLGNPGRGGWGCILRCEGHEKELSGGYAHTTNNRMEIMAAIAGLEELKE PCAVTLYTDSQYLRHAVEKKWLAGWRKNGWKTAAKKPVKNRDLWERLQVQLDRHSVTLEW VRGHNGHAENERCDELARSQSARHDLPEDTGYEPE >gi|316922421|gb|ADCP01000119.1| GENE 11 13170 - 13796 815 208 aa, chain - ## HITS:1 COG:DR0111 KEGG:ns NR:ns ## COG: DR0111 COG0125 # Protein_GI_number: 15805151 # 
Func_class: F Nucleotide transport and metabolism # Function: Thymidylate kinase # Organism: Deinococcus radiodurans # 1 189 5 186 206 166 49.0 2e-41 MFITLEGIEGSGKTTLIENLADVFRTLNHEVLVTREPGGCALGRELRQMLLNPETEICPE AELFLFLADRAQHVAEVIRPALKRGEVVLCDRYADSTVVYQGYGRGLDIEKLRSLNDVAI GGLWPDRTFVLDMDPADALKRARRRNAELGLSEKEGRFEAEQMPFHTRIREGFKLWAAHN TKRIVVLDAADSPEGLMHQALANIDMFE >gi|316922421|gb|ADCP01000119.1| GENE 12 13781 - 14848 1141 355 aa, chain - ## HITS:1 COG:BS_yhaM KEGG:ns NR:ns ## COG: BS_yhaM COG3481 # Protein_GI_number: 16078057 # Func_class: R General function prediction only # Function: Predicted HD-superfamily hydrolase # Organism: Bacillus subtilis # 19 324 11 310 314 173 34.0 5e-43 MQKIQSAPDRRWAKDVAPGDEIRALYLIGSASQLQAKNGPFWRLELKDASGDLEARIWSP LSQQFAEIPSGVIAEVEGRAESFRDKLQVNVNALRVLSPEETASVDTSAFMASSSRPPQE MLDELEALCRKEFTYKPWRKFILSVLEDEAIRGPLLTAPAAKSVHHAWVGGLLEHTLSVA TLCLRFCDHYPDLDRQTLLAGAICHDLGKIWEFSGGLANDYTDAGRLVGHINLCLGKLDR HLAKSGLDEELILHFQHLILSHHGLYEYGSPRLPQTAEAFALHYADNIDAKITQSRSLFG ELEDGESGWSPYQKSLERQLFQAPKTPEAESRKPRTSKRTAAEPEAPRLNQCSLL >gi|316922421|gb|ADCP01000119.1| GENE 13 15038 - 15790 828 250 aa, chain + ## HITS:1 COG:alr4846 KEGG:ns NR:ns ## COG: alr4846 COG0496 # Protein_GI_number: 17232338 # Func_class: R General function prediction only # Function: Predicted acid phosphatase # Organism: Nostoc sp. 
PCC 7120 # 1 236 1 249 265 165 38.0 6e-41 MRILLSNDDGIHSPCLRALHDALCEAGHELDVVAPLTEQSGVGCSVTLHNPLRLYPVQEP GFSGTAVAGTPVDCVKLALTTLLPQPPDLVVVGINNGANKGVDVFYSGTVGAATEAALRG LPAVAFSRPRPELEPPQALARHAASLVDAVDWRCCAGKVLNVNYPRCRVAEIKGIRAARM AESRWAENYERREDPAGRPYWWIADFLKRDSGGDDTDIALMEGNWVAVTPLQVDRTDREL LGRLQERFGV >gi|316922421|gb|ADCP01000119.1| GENE 14 15927 - 16850 1152 307 aa, chain + ## HITS:1 COG:TP0662 KEGG:ns NR:ns ## COG: TP0662 COG0191 # Protein_GI_number: 15639649 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Treponema pallidum # 3 307 9 332 332 305 49.0 6e-83 MPLTSPRAMFERAYKEGYAIGAFNVNNMEIIQGIMEAGTEEKAPLILQVSAGARKYAGQN YIVKLIEAGLLEADLPVVLHLDHGADFDICKACVDGGFTSVMIDGSHHSFEENIAVTKRV VEYAHAHGVWVEAELGRLAGVEEDVSSEHSIYTDPDQAVEFVERSGCDSLAIAIGTSHGA YKFKGEAKLDFERLEKIGSLMPGYPLVLHGASSVPQEFVEMCNTYGGKVAGAAGVPEELL RKAAGMAICKINIDTDIRLAMTASIRKQLVEHPEEFDPRGYLKPARQAVKDMVQHKIKHV LGCSGKA >gi|316922421|gb|ADCP01000119.1| GENE 15 16909 - 17133 64 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGQRSVVRAGIDHHGEFVALSHKADKITTDAAEAVDSDASHWGTPSIREKVCGTILMRPT PGLARPHKERRNAV >gi|316922421|gb|ADCP01000119.1| GENE 16 17011 - 17994 1448 327 aa, chain + ## HITS:1 COG:BS_gap KEGG:ns NR:ns ## COG: BS_gap COG0057 # Protein_GI_number: 16080447 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Bacillus subtilis # 3 326 4 333 335 345 56.0 7e-95 MARIGINGFGRIGRYLVRLMAERNEFPVVINARADNASLAHLFKYDSVHHTFKGTVGHDE NGIIINGTHIAVTRCKPNEWQWKDYGVDLVVESTGTIKKRAGLAEHMACGAKKVIMSAPV SDADITIVMGVNDGLYDPAKHDVVSSASCTTNCLAPVAKVLNDTFGIEHGLMTTIHSYTM SQRILDGSQKDIRRARAAAMSMIPTTTGAAKAVALVLPELKGKLDGIAIRVPTPNVSIVD LNCVMKRDVTVDEVNAALTAAANDNFGVSDEPLVSIDYNGDTHGGVVDLLSTSVMSGNML KVLIWYDNEAGFTNQLLRLITMVAGKM Prediction of potential genes in microbial genomes Time: Fri May 13 04:01:45 2011 Seq name: gi|316922396|gb|ADCP01000120.1| Bilophila wadsworthia 3_1_6 cont1.120, whole 
genome shotgun sequence
Length of sequence - 27267 bp
Number of predicted genes - 25, with homology - 23
Number of transcription units - 15, operones - 7 average op.length - 2.4
N Tu/Op Conserved S Start End Score pairs(N/Pv)
- Term 194 - 248 14.1
1 1 Op 1 . - CDS 409 - 1206 1010 ## COG1852 Uncharacterized conserved protein
2 1 Op 2 26/0.000 - CDS 1370 - 2374 1222 ## COG0223 Methionyl-tRNA formyltransferase
3 1 Op 3 . - CDS 2375 - 2893 690 ## COG0242 N-formylmethionyl-tRNA deformylase
+ Prom 2848 - 2907 3.1
4 2 Op 1 . + CDS 3063 - 4070 866 ## COG0042 tRNA-dihydrouridine synthase
5 2 Op 2 . + CDS 4153 - 5031 499 ## Dalk_4820 hypothetical protein
- Term 4842 - 4875 0.9
6 3 Tu 1 . - CDS 4998 - 5678 661 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation
+ Prom 5653 - 5712 2.1
7 4 Tu 1 . + CDS 5768 - 6790 910 ## COG0618 Exopolyphosphatase-related proteins
+ Term 6808 - 6858 17.1
- Term 6934 - 6975 12.0
8 5 Tu 1 . - CDS 7150 - 7761 805 ## COG2095 Multiple antibiotic transporter
- Prom 7849 - 7908 2.6
+ Prom 7875 - 7934 2.5
9 6 Tu 1 . + CDS 8040 - 9413 1323 ## COG1760 L-serine deaminase
+ Prom 9441 - 9500 1.6
10 7 Op 1 . + CDS 9549 - 9785 337 ## COG1758 DNA-directed RNA polymerase, subunit K/omega
11 7 Op 2 . + CDS 9816 - 10952 1086 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain
+ Term 11123 - 11163 13.4
12 8 Tu 1 . - CDS 11454 - 11837 425 ## Ddes_0042 hypothetical protein
- Prom 11872 - 11931 3.7
13 9 Op 1 . - CDS 11995 - 13167 1512 ## COG1408 Predicted phosphohydrolases
14 9 Op 2 . - CDS 13172 - 14254 1263 ## YE1029 putative integral membrane protein
15 9 Op 3 . - CDS 14283 - 15086 1040 ## COG5006 Predicted permease, DMT superfamily
16 10 Tu 1 . - CDS 15451 - 18099 3977 ## COG0525 Valyl-tRNA synthetase
17 11 Op 1 . + CDS 18111 - 19511 1113 ## DvMF_2764 hypothetical protein
18 11 Op 2 . + CDS 19498 - 20613 1101 ## COG0391 Uncharacterized conserved protein
+ Prom 20765 - 20824 2.7
19 12 Tu 1 .
+ CDS 20899 - 21423 577 ## COG1051 ADP-ribose pyrophosphatase - Term 21550 - 21593 10.0 20 13 Op 1 . - CDS 21793 - 22503 873 ## COG0325 Predicted enzyme with a TIM-barrel fold 21 13 Op 2 . - CDS 22507 - 23442 1170 ## COG1159 GTPase 22 13 Op 3 . - CDS 23511 - 23912 343 ## - Prom 24069 - 24128 2.3 23 14 Op 1 . - CDS 24205 - 24396 60 ## 24 14 Op 2 . - CDS 24490 - 25350 1209 ## COG0501 Zn-dependent protease with chaperone function - Prom 25420 - 25479 6.9 + Prom 25367 - 25426 3.6 25 15 Tu 1 . + CDS 25551 - 27248 1402 ## COG0145 N-methylhydantoinase A/acetone carboxylase, beta subunit Predicted protein(s) >gi|316922396|gb|ADCP01000120.1| GENE 1 409 - 1206 1010 265 aa, chain - ## HITS:1 COG:FN1837 KEGG:ns NR:ns ## COG: FN1837 COG1852 # Protein_GI_number: 19705142 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 122 249 34 162 176 123 44.0 4e-28 MSISKSAFSLSKDEYEGTGKRLFIGLVIGTAILLCVLLLLIWLVPTIGFAAIHPWLPFIA GIVFGGAILGILWLSVGLVLHAYTGKPILGTSRLRGITIRLLLPIMELMGRMMRIPVAAV RRSFIKVNNELVLSSGIQCEPRQLLVLLPHCIQSSRCTHRLTYHIDNCARCGACPLKDVL NLRDTYGVQVAIATGGTIARRIVVQARPKLIVAVACERDLSSGIQDTHPLPVFGVINERP NGPCLDTFVSIRRLESAIRHFIGMK >gi|316922396|gb|ADCP01000120.1| GENE 2 1370 - 2374 1222 334 aa, chain - ## HITS:1 COG:ECs4153 KEGG:ns NR:ns ## COG: ECs4153 COG0223 # Protein_GI_number: 15833407 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Escherichia coli O157:H7 # 6 314 3 302 315 241 44.0 2e-63 MAAPEQKMRVVFMGTPDFAATVLRHVAAWPGCEVVAAYCQPDRPAGRGHKLQPPAVKVLA QELGIPVFQPLNFKDEADRAALAGLRPDALVVAAYGLILPQSVLDIPTIGPFNVHGSLLP QYRGAAPIQRAIMDGNHLTGITIMRMERGLDTGPMLLQRALGIGIDDTAATMHDELADLG GRLMVEVLRQYADGDPSTPIPQEEALATYAAKLTKADGHIDWDEDAAVIHARIRGVTPWP GAQTVFLLPGRDPLPALLQPGRVGETFTGGHPAPGTLVALRGGKLLIACRDALYEVSTLK PAGGKPMSAEAFWNGYCRAANKDGCGRAVSPSLG >gi|316922396|gb|ADCP01000120.1| GENE 3 2375 - 2893 690 172 aa, chain - ## HITS:1 COG:BMEII0264 KEGG:ns NR:ns ## COG: BMEII0264 
COG0242 # Protein_GI_number: 17988608 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Brucella melitensis # 1 172 13 184 187 137 44.0 9e-33 MSLKIVTYPNPLLGKPSLPITEVTDEIRKLAEEMTEAMYKSDGIGIAAPQVGQLIRLVII DVTGPEKREGKMVLVNPVWTPLPDAGYVESEEGCLSVPDYRSKVRRTARVHVEATDLDGN PVSFDADDILAICVQHEIDHLDGKLFIDRISRLKRIMFENKLKKGTRQSQVK >gi|316922396|gb|ADCP01000120.1| GENE 4 3063 - 4070 866 335 aa, chain + ## HITS:1 COG:TM0096 KEGG:ns NR:ns ## COG: TM0096 COG0042 # Protein_GI_number: 15642871 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Thermotoga maritima # 20 312 7 287 306 165 35.0 1e-40 MVATLPNDVFLPFAPDKPWLAPLAGWSDLPFRLLCRRFGAAVCCTEMISAKGLIYKSPGT AELFATCPEDGPVVVQLFGSEAPFMEAAAGQLRDKGFVWFDCNMGCSVPKVARTGSGAAM CKDIDNALAVAEALIRAAGRGRVGFKLRLGWDEPGRSPAAETWRVLAPALEALGAGWLTL HPRTAKQGFTGTARWEYLSELKALVSLPVIASGDLFTAEDGVRCLAETGVDTVMYARGAL RDPAIFKAHLDLLAGREPEVADVATLLARIREHARLARELSTEKVALLKMRTIVPRYVRH IEGSKQLRADIIACRSWDDFEKALRSFEAGKSSSE >gi|316922396|gb|ADCP01000120.1| GENE 5 4153 - 5031 499 292 aa, chain + ## HITS:1 COG:no KEGG:Dalk_4820 NR:ns ## KEGG: Dalk_4820 # Name: not_defined # Def: hypothetical protein # Organism: D.alkenivorans # Pathway: not_defined # 9 68 14 73 135 69 60.0 1e-10 MKSSPYRLGFTLIEIISVLVILGILAAVAVPKYFDMQDDAEKKAALSAVAEAQSRIQLSF GQQILQGKPCDKAVEEVSEISKLSDDGKSKFGDFSLGIDKSASGGTLTTAGVAIYAKRGD NDTAVETGGKLYLPSCTDEKGLSGNFSQAVLGYLDRVLATDSHDFRDDILGEVISLGNGV SYTVKNIVNQGGNGAYVDLQYTNASGDTLTFQVHKNSNGESWIQQMVAHSNDGKSINLIK DNYRANLSDAQIRQAQNIVSSMGMDTSSFGPAFDSKLGAGVLYFDKGFVPKQ >gi|316922396|gb|ADCP01000120.1| GENE 6 4998 - 5678 661 226 aa, chain - ## HITS:1 COG:DR1206 KEGG:ns NR:ns ## COG: DR1206 COG0424 # Protein_GI_number: 15806225 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Deinococcus radiodurans # 16 214 9 191 195 124 44.0 
2e-28 MQETMPSGAFRALRPIILASSSPRRRELLGSLGIDFTIKVVDGSEPAPTLEDDPAAYAMR AAQAKAEAVARAAGPETPDAVVLGSDTIVVLHEEHGPIILGKPASRDEALAMLLHLSGRT HTVYTGCCIIWLSARGESRTELFYDAADVTFAEWPEDVLRAYAATGECDDKAGSYAIQGL GSFLVSGIKGQWATIVGLPVLAVAQRLMAGGSIAPRHCLGTNPLSK >gi|316922396|gb|ADCP01000120.1| GENE 7 5768 - 6790 910 340 aa, chain + ## HITS:1 COG:MJ1633_2 KEGG:ns NR:ns ## COG: MJ1633_2 COG0618 # Protein_GI_number: 15669829 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Methanococcus jannaschii # 28 340 11 315 316 140 33.0 3e-33 MSSTRKLSSRLTQLLELFNRNDDWLIVINADPDAMASAMALKRIMSHRTGKVTIARINEI SRPDNLAMIRYLRIPMLPLTDKLKASYSHFAMVDSQPHHNPAFAGIPFSIVIDHHPAVPE HPVDAAYVDIRPQYGAVSTLLTEYLRAYHIRPGIRLATALQYGIRTDTATFTRTGTEIDL RAYQYLAAHGDTALLTRITRSEYLPEWLKYFARAFSSMHQCGSGAYCYLDTVENPDILVV VADFFTHVHSIKWVGVCGVYNDTVVVIFRGDGHVDLGEFAASRLGALGNAGGHRALARAE FPLEATEGRSVDVFVFRKLTEKPQKKKVEDVESVEGGEEE >gi|316922396|gb|ADCP01000120.1| GENE 8 7150 - 7761 805 203 aa, chain - ## HITS:1 COG:PA4857 KEGG:ns NR:ns ## COG: PA4857 COG2095 # Protein_GI_number: 15600050 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Pseudomonas aeruginosa # 8 203 4 197 197 127 37.0 2e-29 MDYSIGDFFSSVIKLFALLTPPAALSAFLSGTQTYGGRRKRRTALRTGTAVFIVGTVLYF FGESLFSVFGFTLDAFRIGAGALLFLTSVALMGEQRESHTPDSPDEDISVVPLAIPICMG PASIGTLMVLGASAHSMTERVIGSAALFVASALITLMLLMANHVQRILGKTGLAVLSKLT GLLLAAIAAQVVFTGVKNFLIVG >gi|316922396|gb|ADCP01000120.1| GENE 9 8040 - 9413 1323 457 aa, chain + ## HITS:1 COG:sdaB KEGG:ns NR:ns ## COG: sdaB COG1760 # Protein_GI_number: 16130704 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Escherichia coli K12 # 9 457 3 454 455 312 41.0 9e-85 MPQRYIDTTLFDLFKVGPGPSSSHTIGPMKAGHHFAETCAALPRDLLVRAVRFRVRLYGS LSATGTGHGTDAAVVAGLLGTAPESCPAGLLQELMDDPHTRRKRSLGDGQVSVCVDDVSH DAIIHDFPYSNTLVADLLDAEDKVLHSQEYYSVGGGFIQWKGWEPPSLGKPVHRYSNMTE 
LRAIVKEKGLNIYEIILDNEMAITGASRPSIIYSLNQIIDHMESSVRRGLDSEGQLPGPL KVQRKARMLWQQASRMQSSPDQFLTRINAYAFATAEENASGGVIVTAPTCGSAGVMPALV YALRHEMFIGDRAIREAFLASAAVGFIAKHNASIAGAEVGCQGEIGVASAMAAAFVADAR GYRSRVTENAAEVALEHHLGITCDPVQGYVQIPCIERNAMGAIKAYNAAVITSGTDPYSQ RVSLDAAIAAMAETGREMSCKFKETSLGGLAVSMVAC >gi|316922396|gb|ADCP01000120.1| GENE 10 9549 - 9785 337 78 aa, chain + ## HITS:1 COG:RSc2154 KEGG:ns NR:ns ## COG: RSc2154 COG1758 # Protein_GI_number: 17546873 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, subunit K/omega # Organism: Ralstonia solanacearum # 1 57 1 57 67 59 54.0 2e-09 MARITVEDCQERVENRFLLVQMAIKRVRQYREGYEPLVESRNKEAVTALREIAAGKIFPE DMSQYRMPAGETQPDMDD >gi|316922396|gb|ADCP01000120.1| GENE 11 9816 - 10952 1086 378 aa, chain + ## HITS:1 COG:RSc2634 KEGG:ns NR:ns ## COG: RSc2634 COG0484 # Protein_GI_number: 17547353 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Ralstonia solanacearum # 1 355 1 358 380 347 50.0 3e-95 MAQRDYYEVLGVARDASEDEIKRAYRKMALQNHPDHNPDNPEAEQRFKEAAEAYEVLRDP ERRARYDQFGHAGVGNNGDFGGFGSAEDIFAHFGDIFGDMFGFSMGGGTRRGGPRPTAGA DLRYNLNITFAQAARGAEITLNIPRMVTCGECHGSGAAPGTSRETCRQCGGSGQVRNSQG FFQFVVPCPTCHGEGYTISKPCPKCRGKGQVQETRELSVRIPAGVDTGTRLRLRNEGEAG TNGGPNGDLYVFISVDEDKTFRRQGQDLVLTREISFVKAALGHTITVPGLDGDLELKIAK GTQSGTVLRLPGKGLPYLNQKRNGDLLVEILVKTPTNLSARQEELLREFESSAEESIGDK IIKGVENLLGGKSKKKKK >gi|316922396|gb|ADCP01000120.1| GENE 12 11454 - 11837 425 127 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0042 NR:ns ## KEGG: Ddes_0042 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 8 126 6 124 125 125 52.0 5e-28 MTDDALTSRANVVLHLDNGAEMQFSGRPFAGGSWFDDETGELTRQKLYTTESGEHVYSIV TGKGQQRSRRAYRVSLHGEACTINDGRTEMTLDLEMLMLAVRALSGLEKDDAPTLDMVEE TLRAANC >gi|316922396|gb|ADCP01000120.1| GENE 13 11995 - 13167 1512 390 aa, chain - ## HITS:1 COG:Cj0846 KEGG:ns NR:ns ## COG: Cj0846 COG1408 # 
Protein_GI_number: 15792184 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Campylobacter jejuni # 117 384 112 372 374 160 34.0 5e-39 MSMITLFSAFMGLYVIVRLILPSGMNWPLKLILSLFALACAEKLLLTKLVYGTMGAFMPE PVQLASGYLHSAVTILFLLLVARDALLLLTWPFRRSTGRQRKIFYGHKEKKPASGFWAFT LVLLALALSGYGMREALRVPPVREVRMQVPGLPDALNGFRIAQLSDLHIGPTFGKAWLTD VVARTDSLNPDLIVITGDVVDGSPSRLEEDVAPLADLKAKYGVIFAPGNHEYYSGIQQWL PVFQRLGMHVLMNENTQIRVNGTPLAIAGVTDTAALNWGLEGPDPEKALSGLSKDITKLM LSHRPSLAPESAKAGASLQLSGHTHGGLILPVTPLIAAFNGGYVSGPYTVDGMPLYVSSG SGLWGGIPLRLFVPSEITLITLTGTGFDTN >gi|316922396|gb|ADCP01000120.1| GENE 14 13172 - 14254 1263 360 aa, chain - ## HITS:1 COG:no KEGG:YE1029 NR:ns ## KEGG: YE1029 # Name: not_defined # Def: putative integral membrane protein # Organism: Y.enterocolitica # Pathway: not_defined # 1 350 1 341 349 276 50.0 1e-72 MKLPKKQAAVVRAALEEWRASGLLDEETGKRLLDDLTPLPFDWYRLGRYALWSALTCILI GVAALLGDELFLELINRLFVMTELGRSLFLILAAAGLFFWGVRRRRRAPQKRLTNEGLFF LGVLAIAGSLASLAAWLYAYGGEDIGFLNISSLFLLAAVIYGALGLLLDSRLIWVFALFA FGSWLGAETGYRSGWGAYYLGMNYPMRFVLLGLLLCGLSVLFKRNDLWSRLAAFERSTLS VGLLYLFISLWILSIFGDYGDMTSWYQARHMELFHWSLLFGAAACAAIWLGVKDDDAMLR GYGLVFLGLNLYTRFFEWFWDSLNKGLFFIIVGLSLWVLGAHAERLWNLGKKPSSSPKDA >gi|316922396|gb|ADCP01000120.1| GENE 15 14283 - 15086 1040 267 aa, chain - ## HITS:1 COG:ECs0891 KEGG:ns NR:ns ## COG: ECs0891 COG5006 # Protein_GI_number: 15830145 # Func_class: R General function prediction only # Function: Predicted permease, DMT superfamily # Organism: Escherichia coli O157:H7 # 1 267 21 287 295 236 51.0 4e-62 MLSIQCSASLAKSVFPIIGPEATTALRLMFAALVLLPVMRPWRAKLTRKQWLPIILYGLS TGVMNMCFYQAISRIPLGVGVALEFTGPLAIAMLGSRRLIDFLWIALAVGGLILLLPIHE FSGNLDPVGVAFALGAGFCWAMYIFFGKRAGNAGGGASVSLGMIVGACAILPFGVASAGT SMFSMSVLPLALLLGVFSSALPYGLEIVALKQLPAQTFGILMSMEPVLAALSGIIFLGEQ LNVAQWVALACIIVASIGATLTIRRKA >gi|316922396|gb|ADCP01000120.1| GENE 16 15451 - 18099 3977 882 aa, chain - ## HITS:1 COG:TM1817 KEGG:ns NR:ns ## COG: TM1817 COG0525 # Protein_GI_number: 15644561 # 
Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Thermotoga maritima # 6 881 4 864 865 892 50.0 0 MSENALPKGYEPHAVEDHWREYWEKNKTFTPDPDAPGEPFSIVIPPPNVTGALHIGHALN HTLIDVLCRHARQKGKKVLWLPGTDHAGIATQNVVERALAKEGKTRHDLGREAFVERVWQ WKEDYGNRILNQIRMLGDSVDWTRERFTMDEGLSKAVRKVFVELYKEGYIYKGKYIINWC SRCHTALADDEVDHMPEKGHLYHVRYDFEDGSGSVIIATTRPETIMADTGVCVHPEDERY AGLVGKKIVVPVIGRVVPLFADRYVDKEFGTGALKVTPCHDPNDWTLGERHGLEFIQCID EDGNMTAEAGPYAGLSKEECRKRIVEDLQASGQLVKIEDLDHSVGHCYRCKTVVEPHMSE QWFVASTKLAPRARAAVPEMTQIFPENWMKTYYNWLDNIRDWCISRQIWWGHRIPAWTCP QCGKLIVSEEDPTSCPDCGSTELVQESDVLDTWFSSALWPFSTMGWPDKTKDLATFYPTS VLVTGFDILFFWVARMMMMGLHFMDQVPFKHVYLHALVRDAQGRKMSKSTGNVIDPLVMI EKYGTDALRFTLTAFAAMGRDIRLSEDRIEGYRHFVNKLWNASRFALMNLPEDVPAAVEL DKVQGLHHQWLLHRLEEVKKDVDGAIEAYRFNDAAQTLYKFLWNEFCDWYLELIKPDMKE EGPRKTEAQYVLWTALREFLILMHPIMPFVTAEIWQALPGGAGSDAAVQLLPEARPGCLK PEAAARMEFVQEVIGMIRTIRAELNIAPSYRLTVLLRPSDAHQKDVLEANREVILTLARL GELIIDADGEAPKASASHVAQGCEVIVHLSGAVDFAAELARLEKELGKIDKELGGLNQKL ANEGFVSRAPAEIVERERARVAELTDARVKLTALQQRFRDVM >gi|316922396|gb|ADCP01000120.1| GENE 17 18111 - 19511 1113 466 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2764 NR:ns ## KEGG: DvMF_2764 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 28 456 1 423 425 397 50.0 1e-109 MKGFKGSCQRGRAGVVNSARLFYAEGTMTDIAIFWDASQLWGLLVWRAAEAFGLPYRLVK AKEIAQGALSDKTSLLLVPGGTARHKSAALGEKGREAVRAWVRGGGRYVGFCGGAGLGLS DAADPVRTAEIGKGLCLCPWHRAEIGERVQHFVSGHVRVRFQGGHPLVPEFFSEPVAPGS EPAIPIWWPGRFAASSGGPDDGVDVLARYAGSDPRTCPPDLCVADLPLSSLSPAVLEQWI ELYGVSLAPGFLNGQPCALHGRYGKGSYTLSYSHLETPGSPDANRWFAHILRTLAGFEPR ADTVPAWRPGEMPVLWHDPDLLEARRGMGELIRLGLAHDLLFERAPWLTGWRSGVPGSGL NALFMGLCVLTGVSPSPEAETFWAAQRIRFGETFAVFRQGVEGLLLGLRLATIMPEEVPR KILAEQRLALFGSAMQGGGLYKELMDMVDELLFLSLGKAGGCDGSA >gi|316922396|gb|ADCP01000120.1| GENE 18 19498 - 20613 1101 371 aa, chain + ## HITS:1 COG:SA0721 KEGG:ns NR:ns ## COG: SA0721 COG0391 # Protein_GI_number: 15926443 # Func_class: S Function unknown # 
Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 9 305 5 238 331 86 24.0 1e-16 MVPPDGGSRILFFSGGTALAPVAAELSRHTRNAVHVITTFDSGGSSAELRRAFDMPAVGD IRARIMALADRSLEGNPETVELLGYRLPKDAGPEALHRELASLASGAHPLVAAIPEPMNE VVTQHLSAFRSLMPVDMDLAGASIGNLILTSGYLSLDRQLEPVVRVFSGMVQARGVVMPV ADSCAHLCVRLENGEVIVGQHRFTGKTATSITSPILDMWLSASLDEPSPVSVPIQPRLAH VIRTADLICYPVGSFFSSVMANLLPLGVSCAVREAACPKVFIPNLGTDPELFGLTVQDQV AYLLRFGADGCPAGQALNWLLVDEDESRYPGGIPYEWLARLGIRVRKAHLVSEPPYLDAG LLCRELLGDWK >gi|316922396|gb|ADCP01000120.1| GENE 19 20899 - 21423 577 174 aa, chain + ## HITS:1 COG:MJ1149 KEGG:ns NR:ns ## COG: MJ1149 COG1051 # Protein_GI_number: 15669336 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Methanococcus jannaschii # 32 157 38 163 169 126 49.0 2e-29 MHMGNIPFILLNGLMMKTELTCPKCGAIVEKYLNPKPTVDIVIHDRERGLVLVDRKNPPY GNALPGGFIDLGESAEQAAVREAFEETNLRVRLTGLLGVYSAPDRDPRQHTISTVFIAEP LNPEQLRAGDDAAKVAFYSMEALPALVFDHAKILADFKGVLYGGRSAAGLTTPW >gi|316922396|gb|ADCP01000120.1| GENE 20 21793 - 22503 873 236 aa, chain - ## HITS:1 COG:aq_274 KEGG:ns NR:ns ## COG: aq_274 COG0325 # Protein_GI_number: 15605814 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Aquifex aeolicus # 11 232 5 226 228 198 47.0 6e-51 MNPSETAALGERLEAVRGRLAGAARIAGRKPEDVRLIAVSKLHPVEAILAAYGFGQRVFG ENYVQEALAKQEALPDLDVEWHCIGHVQTNKAKDVTGRFALIHTVDNLKFAETLARRLPE DIPVQRVLLQVNIGNEPQKAGVDEHDLPALAEAVLALPRLEVRGLMCLPPFFDDGEAARP YFARLRELRDDLEARLGIKLPELSMGMSGDCVQAVEEGATLVRVGTDIFGPRPVKS >gi|316922396|gb|ADCP01000120.1| GENE 21 22507 - 23442 1170 311 aa, chain - ## HITS:1 COG:YPO2719 KEGG:ns NR:ns ## COG: YPO2719 COG1159 # Protein_GI_number: 16122923 # Func_class: R General function prediction only # Function: GTPase # Organism: Yersinia pestis # 7 305 9 298 303 226 39.0 3e-59 MATAHRCGWVALMGPPNAGKSTLTNALVGQKVAIVTAKPQTTRNRIVGILTQKDAQVIFM DTPGVHALRGQTRGQLGKIMVQSAWQSFAVANCIVLVIDGDLYLRKPDFMERDLAPLIQP 
LAEEERPVVVVVNKVDLFHDKSRMLPLLESVAQMFPKAEIFPASALRRNGVEQLLELIRS HLPEGEAQFPEDQLSTAPMKFMAAEIIREKLFEKLYQEVPYSVAVDVEVWDEEDDRVLIH AAIYVAKPSHKAMVIGRAGEGIKAIGTAARKEIRDLVDKKVHLELWVKVREDWVDDPQFL HSLGFGAEAEY >gi|316922396|gb|ADCP01000120.1| GENE 22 23511 - 23912 343 133 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQYGSLSIGLALAACLGVSGCGEKRTLGEEPVPVQRSECGGIFVLAPASYMQYIQHDEAT LLASEGRYEDLPVFCTEEMARMLEERLEHDHELLPGLWRVYRLEGVWEEDVVEVRPGEWR MRRPAQLIALSAS >gi|316922396|gb|ADCP01000120.1| GENE 23 24205 - 24396 60 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAPPPFTVHFLHVSKEDGFRIRRKSDAEPFLLIPEQSDTPLATNRKEYVSFLRSNGSESL DPF >gi|316922396|gb|ADCP01000120.1| GENE 24 24490 - 25350 1209 286 aa, chain - ## HITS:1 COG:PH1256 KEGG:ns NR:ns ## COG: PH1256 COG0501 # Protein_GI_number: 14591074 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Pyrococcus horikoshii # 5 283 10 285 292 270 52.0 3e-72 MTSQLKTFMLLALLSGLIIVLGGALGGKTGIVIAFGLALIMNVGSYWYSDKIVLRMYQAR ELSESEAPMIYSMVRELAANAQIPMPRIAVVPEEAPNAFATGRNPENSVVAVTEGILRLL SPEELKGVLAHEIAHIANRDILIQTVAGVMASTIVSIANFMQFAAIFGMGRSSEDGEGGG NPLMAIVLAILAPIAASLIQFAISRSREYLADASGARYCGQPLALAAALGKLQSWNQRIP MQNGNPTTAEMFIVAPLFGGGMAKLFSTHPDINDRIARLREMAGVR >gi|316922396|gb|ADCP01000120.1| GENE 25 25551 - 27248 1402 565 aa, chain + ## HITS:1 COG:MA2889 KEGG:ns NR:ns ## COG: MA2889 COG0145 # Protein_GI_number: 20091711 # Func_class: E Amino acid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: N-methylhydantoinase A/acetone carboxylase, beta subunit # Organism: Methanosarcina acetivorans str.C2A # 10 563 2 559 568 285 34.0 2e-76 MSKERKTPAVLGLDMGGTHTDAVLLRGGTLLASVKVPTNHQDLLVSTREALHALFESGAA FPADVSRATFGTTLAVNAVVQNACAPVGLLIGAGPGLDPGWFGLGGHTALVPGGVDHRGT EVAPPDREAIRRVVSGWREEGLAAFACVAKFSTRNPAHEDAMAEVIREVFSDGPEPVIAL GHRLSGRLNFPRRIATAYWNAAVWTLHNRFADAVEASLKDMGIDAPAYLLKADGGAIPLS VSRERPVEALLSGPAASVMGMMALMPSCSEDTLVLDMGGTTTDIALLAAGQPVLSPDDLV 
VNGRSTLVRALKSVSIGLGGDSQVTVAPGIQVGPLRKGPARAFGGTDGPTFLDCLNVLGH ADAGDVVASRAGVESLAAAHGLSAESLSQEVLDCARSRVASAVRSLLDEVNSRPVYTLAA LLEERAVRPARAVLVGGPAEAVAPLLGDALGLPVETLGDPVLGPVANAIGAALTRPTASL DLFADTAAGMLLVPSLDIRKPITRRYTLEEAKAEACGLLRGQAAFASASPEIDVTEAQLF ATLDEYGRGGRDIRVRCQLRPGVER Prediction of potential genes in microbial genomes Time: Fri May 13 04:02:38 2011 Seq name: gi|316922383|gb|ADCP01000121.1| Bilophila wadsworthia 3_1_6 cont1.121, whole genome shotgun sequence Length of sequence - 13870 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 11, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 22 - 1104 1282 ## COG0409 Hydrogenase maturation factor + Prom 1193 - 1252 2.4 2 2 Op 1 . + CDS 1309 - 2895 1675 ## HRM2_45680 hypothetical protein 3 2 Op 2 . + CDS 2904 - 3380 633 ## COG1522 Transcriptional regulators + Term 3381 - 3419 0.5 4 2 Op 3 . + CDS 3467 - 4741 1614 ## COG0001 Glutamate-1-semialdehyde aminotransferase + Term 4844 - 4903 19.1 - Term 4832 - 4889 17.9 5 3 Tu 1 . - CDS 4947 - 6260 1075 ## gi|302863651|gb|EFL86582.1| putative PE-PGRS family protein 6 4 Tu 1 . - CDS 6923 - 7189 352 ## DVU1780 hypothetical protein - Prom 7297 - 7356 5.0 + Prom 7416 - 7475 4.5 7 5 Tu 1 . + CDS 7596 - 7862 365 ## DvMF_2842 hypothetical protein 8 6 Tu 1 . - CDS 7933 - 8790 338 ## PROTEIN SUPPORTED gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase - Prom 8829 - 8888 4.2 9 7 Tu 1 . - CDS 9008 - 9601 524 ## LI0727 hypothetical protein - Prom 9748 - 9807 5.1 + Prom 9712 - 9771 4.4 10 8 Tu 1 . + CDS 9894 - 11069 1562 ## COG0192 S-adenosylmethionine synthetase + Term 11153 - 11184 3.2 11 9 Tu 1 . + CDS 11223 - 11399 75 ## + Prom 11801 - 11860 2.6 12 10 Tu 1 . + CDS 11941 - 12933 288 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase 13 11 Tu 1 . 
+ CDS 13046 - 13864 792 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family Predicted protein(s) >gi|316922383|gb|ADCP01000121.1| GENE 1 22 - 1104 1282 360 aa, chain - ## HITS:1 COG:alr0696 KEGG:ns NR:ns ## COG: alr0696 COG0409 # Protein_GI_number: 17228191 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Nostoc sp. PCC 7120 # 6 359 7 363 383 332 45.0 7e-91 MPTPTLHDPELCQALLRRLEDLLDRPMRFMEVCGTHTVSIFRGGLRTLLPEQVTHLTGPG CPVCVTHDREVAAFLKLAEQPNVIIATFGDLMRVPGPDGRSLKHAQADGARVSVIYSPLD ALTLAADNPDATVVFLGVGFETTAPAVAATVLAAEQRKLDNFAVFSCHKLVPPALAALLG DPDNGIDAFLLPGHVSTVLGLSPFRFVAEDWKRPAIVAGFEPADILDALCRMARQYREGE FKVENAYPRAVNDDGNPKARAILSQVFRTADALWRGLGVIPGSGLALAPAYQRFDALARL GLSLPETKPLPGCRCGEVLKGKMPPNECPLFGKVCTPANPVGPCMVSTEGSCSAYFKYGL >gi|316922383|gb|ADCP01000121.1| GENE 2 1309 - 2895 1675 528 aa, chain + ## HITS:1 COG:no KEGG:HRM2_45680 NR:ns ## KEGG: HRM2_45680 # Name: not_defined # Def: hypothetical protein # Organism: D.autotrophicum # Pathway: not_defined # 16 516 9 508 513 189 29.0 3e-46 MSYSIVRRRRRPNASLLAEAVRTIGKRRFEIRQQVSCPLEDRKLHYLVETWFFFPQSLQI NRWSYTATDYQQSLKNYIRLGVPIRPLESLLGGELPPPLRKLEAEEGGGEEIIELTLPEQ FPDILEDCSRRLDDLLWQNTPETRERYEDSLKLFCIAFRVSLLARKNAVLAIEDGSERTR AAYDLTRLAVACLKRYRKLLGKRAKAARNLLRAPAFLYCDEYLSIITTRTLGEIVRRLSP VHKCSTYILACFGAQRAYRRSCYPESMPSKTGDNELPVFRWSVLKKYVDMPLFLNVQRHS GNSLLEHVLYSLIAAVAMSLALTVTLLWEGAGSLSAPIFVIAVFAYICRERIKDVLKHKL FKVFGKWIPDRVLRVCDGYGRRLGHCAEQFRFADWDKLPKEVRVLRNRTHFVDILNAFHN EDILYYSKRIDIKELPDPFHEGKNLLLDISRFDISDFLRHADEVLDEPQGGMDDVVGGDK VYHVDMVRKITHRCGSDLERFRIVLTHAGIRRIDEIKPLFITTGDNYH >gi|316922383|gb|ADCP01000121.1| GENE 3 2904 - 3380 633 158 aa, chain + ## HITS:1 COG:MA0575 KEGG:ns NR:ns ## COG: MA0575 COG1522 # Protein_GI_number: 20089464 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 8 155 10 156 158 111 36.0 6e-25 
MSTAFTDTERAILRIVQFQLPDTLTPYADIAREVGTDEETVLNLLRRMKEDGSIRRFGAS IKHQKTGWTHNAMVAWIVDEADSDAIGEVAAKNQRISHVYFRPTSAPDWPYTLYTMVHGR SEDECLQVIEELRRETALDEYAVLDSLEELKKTSMTYF >gi|316922383|gb|ADCP01000121.1| GENE 4 3467 - 4741 1614 424 aa, chain + ## HITS:1 COG:aq_816 KEGG:ns NR:ns ## COG: aq_816 COG0001 # Protein_GI_number: 15606182 # Func_class: H Coenzyme transport and metabolism # Function: Glutamate-1-semialdehyde aminotransferase # Organism: Aquifex aeolicus # 5 420 2 418 424 452 54.0 1e-127 MKTTRSEKLFKEACELIPGGVNSPVRACLGVGADPLFIASAKGSHVTTVDGQEMIDFVES WGPMLLGHSHPEVTKAVVEAAGKGTSYGAPCPAEVELARMVVDAFPGMDMVRMVNSGTEA TMSALRLARGVTGRTKVLKFIGCYHGHADPFLASAGSGVATLSIPGTPGVPESTVRDTLL APYNDVRAVEDIFHLYGKDIAAIIVEPIAGNMGLVLPEEGFLKALRALCDTHGALLIMDE VITGFRAEYGGAQTRFGITPDLTTLGKIIGGGLPVGAFGGKREYMSRVAPAGDVYQAGTL SGNPLAMAAGIATLSVLKASDYAALEARVEAFAAELESILRGKGVPVTVNRLASMYTVFF SDGPLRNFEDVKNSDTEMYSRFFRAMRERNIFVAPSAFEVAMVSFAHTEEDFAKTLDAVR GIKL >gi|316922383|gb|ADCP01000121.1| GENE 5 4947 - 6260 1075 437 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302863651|gb|EFL86582.1| ## NR: gi|302863651|gb|EFL86582.1| putative PE-PGRS family protein [Desulfovibrio sp. 
3_1_syn3] # 55 412 269 668 681 72 26.0 5e-11 MGARDTTPDREGRLALDIIQELGADEMAARLPFWENRGKQEGRMVVNLQQSKTVQIRGRK YRLEAKGFLSYASLATNEGELAYVAAVQGAAASLPEDERYNIQNNDTDLPYIPDHIGLSP LPGDSGRVSPWYWRNGGFRAAESGRRAALLKCLWPEASPLMLRVLTDPGFRADPDKLAGV LLAEWSAWDIPGDVLGPALLEGNENLRKRPPGTTIWFSPSGLSSYPWYHRNSTPMFSVWL LLAALRDDVALPREVREAAPALADLCDIRAALGRERWRKAGRHLCAAAWLEEQTPTDKLL EKAVDDWNSPTDMKLIIQAWSGGGRIVPKGLWRKPDAIRCWAARRTEAGPDAGEPLVCVA SMLGHVDEATGKRMIRMMLASGAKPDVRDENGMFPEEAARRSGASAELLKLLAAPPAGWT QGTDGAPQRDKQAPPSP >gi|316922383|gb|ADCP01000121.1| GENE 6 6923 - 7189 352 88 aa, chain - ## HITS:1 COG:no KEGG:DVU1780 NR:ns ## KEGG: DVU1780 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 84 12 95 101 103 59.0 3e-21 MTTFFRHYKNCYYQIVGEALDTRDDTLVVVYRTLYPSEYSLFTRPKEEFFGSVRCPDGSE CLRFTPVEYAELPEDARSRVVHEVEIWR >gi|316922383|gb|ADCP01000121.1| GENE 7 7596 - 7862 365 88 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2842 NR:ns ## KEGG: DvMF_2842 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 81 1 81 98 103 65.0 2e-21 MSNDLITTVHDLVLESPIGAKAIAQAVGKPYSTLLREVNPYDTGAKLGAETLMHIMKTTG NVTPLEKMALEMGYRLEPAEGMGSTVRA >gi|316922383|gb|ADCP01000121.1| GENE 8 7933 - 8790 338 285 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase [Clostridium acetobutylicum ATCC 824] # 1 276 1 273 642 134 31 3e-31 MATVLRAKTAGFCMGVGLALRKLDQALKEHRARSSGRLVMLGPIIHNPQVMNTYMQQGVV LAHSLDAVQPGDTVIIRAHGVPREDEARLLDLGAHVLDATCPKVKQAQLAIDEATRQGTS LYLFGEAEHPEVQGLVSYANGPCLVFGSLKELKEKQTFKDTHNVVLAAQTTQEREEFEAI KKTLAEGHSLSVLETICDATRRRQQEALMISQQAEAMIVVGGKTSGNTRRLADVAESCGI AVWHIEVPEELPLEALKPYGFIGLTAGASTPRSIIDAVQESLKTL >gi|316922383|gb|ADCP01000121.1| GENE 9 9008 - 9601 524 197 aa, chain - ## HITS:1 COG:no KEGG:LI0727 NR:ns ## KEGG: LI0727 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 3 190 5 194 196 135 40.0 7e-31 
MNPAYQGQLDVFCAAYAVINAMRHINATRLLTCRAILHEALLDAARDEESFCALLEQRTD YVDWVDAMLSRLEKQGSLLTARPFPTHFPGPSAPSPAVLWDAIADWLRRGERNAVLLQFV RILYPTEAVIRHWTCCNDVVGDTLMLFDSSIEPGALHQLSRESLISDPADDGPGKILIVP YTIRFLGPRLQGGKRRQ >gi|316922383|gb|ADCP01000121.1| GENE 10 9894 - 11069 1562 391 aa, chain + ## HITS:1 COG:YPO0931 KEGG:ns NR:ns ## COG: YPO0931 COG0192 # Protein_GI_number: 16121235 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Yersinia pestis # 6 389 2 381 384 487 61.0 1e-137 MQPEKGRYFFTSESVTEGHPDKVADQISDAILDTLIEQDPGSRVACETLVTTGMAVIAGE ISTKGYADLPKVVRETIKEIGYNSSEMGFDWQTCAVISSIDHQSGDIAQGVDREDPENQG AGDQGMMFGFACNETPTLMPAPIYWAHQLSQQLTKARKDGLVDFFRPDGKTQVSFEYVDG KPIRINNVVVSTQHAASASQADIIEAVKRTVIRPVLEPTGLFDEKDCEIFINTTGRFVVG GPMGDCGLTGRKIIQDTYGGSGHHGGGAFSGKDPSKVDRSAAYMGRYVAKNVVAAGLAPR CEVQIAYCIGVAEPVSVLVSSLGTSDIPDEVLTKAVREVFDLRPYHITKRLDLLRPIYKK TSCYGHFGRELPEFTWEHTDAAADLRTAAKV >gi|316922383|gb|ADCP01000121.1| GENE 11 11223 - 11399 75 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYAEGLSESQQYPKVVLFGKSLSGSASELGLDLSLFCKFFFVDGIDFIRGYTALVSYL >gi|316922383|gb|ADCP01000121.1| GENE 12 11941 - 12933 288 330 aa, chain + ## HITS:1 COG:yijP KEGG:ns NR:ns ## COG: yijP COG2194 # Protein_GI_number: 16131793 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli K12 # 2 328 236 562 577 203 33.0 5e-52 MPRTYVFIVGESANRNHLSLYGYKRNTTPKLEAMRDELVVFEDVISPDTHTIPSLRKVLL FSELQKGDTILTSPSVLTLLNSAGFTTYWISNQAVNVEGATGVRLFAEDAKQVSFLNMAR DEGRSVSYDSVLLPELEKILGNSVERKAIFIHLMGSHLTYALRYPPEFGVFSSTDDIPTK PWRTESEKQYVNTYDNSIRYTDHIVSSVINAVKRKGGDAFVVYFSDHGQEVYDTRPVRGQ IANDPSKHMLDIPFLCWFSPEYQRRNAEFMQRVKGAIHKPWMTSGFADAAAELARLSFGG GKPEKAPFSASYEPWVRYAPNGSRYDSLDK >gi|316922383|gb|ADCP01000121.1| GENE 13 13046 - 13864 792 272 aa, chain + ## HITS:1 COG:DR0470 KEGG:ns NR:ns ## COG: DR0470 COG1387 # Protein_GI_number: 15805497 # Func_class: E Amino acid transport and metabolism; R General function 
prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Deinococcus radiodurans # 3 266 4 257 260 121 31.0 2e-27 MIVADVHNHTNASHGKASVGEMYAAAKDRGLAVYGFSEHSPRPEAYSYPVEYREHLRQNF VRYASEVMQLRALSGSPKVLLGMELDWFPSERPFMEAAVAAYPFDYIIGGIHFLGSWGFD FTQDDWKISPQQCYTRYENYFRTLADMARSGLVDIAAHPDIIKLYSVDVFHQWLAMPESL ALISEALTAIRDNGLVMEISSAGLRKPCNEIYPHPAIMKLASDLGVKISFGSDAHCPNTP AYAFDQLEAYARSYGYTSSVIFENRKPREIGF Prediction of potential genes in microbial genomes Time: Fri May 13 04:03:37 2011 Seq name: gi|316922353|gb|ADCP01000122.1| Bilophila wadsworthia 3_1_6 cont1.122, whole genome shotgun sequence Length of sequence - 32349 bp Number of predicted genes - 28, with homology - 27 Number of transcription units - 13, operones - 8 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 119 - 178 1.7 1 1 Tu 1 . + CDS 302 - 1747 174 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) + Term 1841 - 1884 14.2 - Term 1829 - 1871 10.2 2 2 Op 1 5/0.000 - CDS 1957 - 2469 661 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 3 2 Op 2 . - CDS 2482 - 3558 1654 ## COG3261 Ni,Fe-hydrogenase III large subunit 4 2 Op 3 . - CDS 3569 - 3961 480 ## DVU0431 ech hydrogenase, subunit EchD, putative 5 2 Op 4 1/0.000 - CDS 4003 - 4494 288 ## PROTEIN SUPPORTED gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B 6 2 Op 5 1/0.000 - CDS 4525 - 5373 1412 ## COG0650 Formate hydrogenlyase subunit 4 7 2 Op 6 . - CDS 5373 - 7307 2712 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit - Prom 7527 - 7586 2.2 + Prom 7486 - 7545 3.0 8 3 Op 1 . + CDS 7739 - 9316 1190 ## COG0815 Apolipoprotein N-acyltransferase + Term 9362 - 9387 -0.1 9 3 Op 2 . + CDS 9503 - 10492 1195 ## COG1186 Protein chain release factor B + Term 10571 - 10619 10.2 + Prom 10804 - 10863 4.8 10 4 Op 1 . 
+ CDS 10888 - 11163 354 ## COG0776 Bacterial nucleoid DNA-binding protein 11 4 Op 2 . + CDS 11170 - 11430 153 ## 12 4 Op 3 . + CDS 11452 - 12183 699 ## DVU1866 hypothetical protein + Term 12208 - 12246 10.1 - Term 12196 - 12233 6.1 13 5 Tu 1 . - CDS 12261 - 13304 1204 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 13373 - 13432 3.0 - Term 13442 - 13477 6.3 14 6 Op 1 39/0.000 - CDS 13485 - 14366 1272 ## COG0074 Succinyl-CoA synthetase, alpha subunit 15 6 Op 2 . - CDS 14366 - 15487 1674 ## COG0045 Succinyl-CoA synthetase, beta subunit 16 7 Op 1 . - CDS 15734 - 17128 2049 ## COG1012 NAD-dependent aldehyde dehydrogenases 17 7 Op 2 . - CDS 17202 - 18707 1918 ## COG0471 Di- and tricarboxylate transporters - Prom 18832 - 18891 4.2 - Term 18950 - 18993 9.0 18 8 Tu 1 . - CDS 19014 - 20033 1257 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Term 20415 - 20461 15.1 19 9 Op 1 . - CDS 20491 - 21462 1379 ## COG3181 Uncharacterized protein conserved in bacteria 20 9 Op 2 . - CDS 21502 - 21957 611 ## gi|302863851|gb|EFL86782.1| hypothetical membrane protein 21 9 Op 3 . - CDS 21970 - 23487 1852 ## COG3333 Uncharacterized protein conserved in bacteria - Term 23700 - 23749 16.2 22 10 Op 1 1/0.000 - CDS 23888 - 25267 416 ## PROTEIN SUPPORTED gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 23 10 Op 2 2/0.000 - CDS 25326 - 26729 1854 ## COG0477 Permeases of the major facilitator superfamily - Prom 26807 - 26866 5.2 - Term 26920 - 26980 22.2 24 10 Op 3 . - CDS 27017 - 28393 1726 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 28426 - 28485 2.2 25 11 Op 1 . + CDS 28672 - 29487 836 ## COG0338 Site-specific DNA methylase 26 11 Op 2 . + CDS 29484 - 29984 514 ## COG0338 Site-specific DNA methylase 27 12 Tu 1 . 
+ CDS 30111 - 31634 1615 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains 28 13 Tu 1 . + CDS 31824 - 32336 411 ## COG3963 Phospholipid N-methyltransferase Predicted protein(s) >gi|316922353|gb|ADCP01000122.1| GENE 1 302 - 1747 174 481 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 270 477 20 222 223 71 29 6e-12 MFVTYPGTERHILNDFNLTLYEGEHAVIRGGNGAGKSTLLRLLRGEQWPDQIDHRRAGRV LWHGPEGADPSPLTGRKVTSLVSAMQQERVVHQEWRVDGERLVLGGFSDAIYIAQQPTSE MCETAYQLVRLLGGVHLLKKPVTAMSQGQLRLMLVARSLVREPEVLLLDEVTDGLDARAR NTLLDALERASELSTLVMTTHRPETLPSWIGRQIVLENGKAVDGPMLETAVEPEKEPAPV ASAPELKGIRGCSARIAIKDASVFIDRVPVLYDINWTINPGENWAVLGGNGAGKSTLLRL LAGDEIVAYGGEIVRELPRQGGVVDRLEVLRKGVRLVSDRQQATYTYDITGEELVFSGID NSVGVYREPSEKELAQVTDILASLHLEFLAKRTIRSCSTGEFRRLLLARALAGEPDLLLL DEPFSGLDAPSRNEFFALLNQLARQGVQMILVTHHKADIFPAITHMLQLENGRISAIWEQ G >gi|316922353|gb|ADCP01000122.1| GENE 2 1957 - 2469 661 170 aa, chain - ## HITS:1 COG:PH1440 KEGG:ns NR:ns ## COG: PH1440 COG1143 # Protein_GI_number: 14591231 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Pyrococcus horikoshii # 3 90 6 94 136 64 41.0 9e-11 MLMINTILRNFLKRPATRKYPMVKRDPFSNYRGRLINNVESCIFCRMCSTKCPAQCISTD PKEAFWGYDPFSCVYCGICVEVCPTKSLFMLSTHRSPVPTKFVVYHKGTARVAKPRLASV PKVAPSEVHSETPPPKVVAAVETPAEAPIEEPKPTPKATPPATPGGKKKK >gi|316922353|gb|ADCP01000122.1| GENE 3 2482 - 3558 1654 358 aa, chain - ## HITS:1 COG:PH1437 KEGG:ns NR:ns ## COG: PH1437 COG3261 # Protein_GI_number: 14591229 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III large subunit # Organism: Pyrococcus horikoshii # 6 358 11 381 427 276 40.0 3e-74 MSRTIIPFGPQHPVLPEPIHLQLVMDNETVVEALPKLGYVHRGLETLTTLRDYNQMIYIV ERVCGICSCIHALCFCRAIEHLMDVEVPRRAAYLRIIWSELHRMHSHLLWLGLFADALGF 
ESLFMQFWKIRERIMDINEATAGNRVIISTNIVGGVRKDLSREHQRWILDEMDTLEAEMR RLEKVTMEDYTVKKRTVGIGVMTAQEAIQLGAAGPTLRGSGVAQDVRQTEYEAFAELGFT PVSVLDGDCWARTKVRFLEVLQSMEIVRQAISHLPDTEINVKVKGNPSGEVISRVEQPRG ECMYYVKGNGSKYLDRVRIRTPTFANIPPLLHMVKGIQLADVPVVVLSIDPCISCTER >gi|316922353|gb|ADCP01000122.1| GENE 4 3569 - 3961 480 130 aa, chain - ## HITS:1 COG:no KEGG:DVU0431 NR:ns ## KEGG: DVU0431 # Name: not_defined # Def: ech hydrogenase, subunit EchD, putative # Organism: D.vulgaris # Pathway: not_defined # 1 130 1 129 130 139 51.0 4e-32 MPPSVARLEAEPITLETVRAVAKANFDAGYRFVTLSVHNLGDGNLDIIYHYDKNLNMRHY RLTVPLGQAVPSISDIYFCALLVENESRDQYGITWDGLILDFQGSLYLEKDTPPPLLRGP SCTLSTVTKK >gi|316922353|gb|ADCP01000122.1| GENE 5 4003 - 4494 288 163 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B [Campylobacter curvus 525.92] # 10 154 21 161 170 115 36 3e-25 MNFSALHEKLNQYINKGRLKSPWVVHYDCGSCNGCDIEVLACLTPVFDVERFGIVNIGDP KHADVLLVTGTVNQRNKDVLKNLYDQMPEPKAVIAIGACACTGGVFQDCYNVVGGVDHVI PVDVYVPGCAAKPEAIIDGVVLALEVVKGKLGLRDMPQTTIFE >gi|316922353|gb|ADCP01000122.1| GENE 6 4525 - 5373 1412 282 aa, chain - ## HITS:1 COG:TM1213 KEGG:ns NR:ns ## COG: TM1213 COG0650 # Protein_GI_number: 15643969 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 4 # Organism: Thermotoga maritima # 1 279 1 287 293 84 29.0 3e-16 MLKLIFALIAVAVAVPAGGLLAGVDRRLTARLQSRQGPPLLQPFYDVLKLFGKAPCVTNP WLIFSSYVCLMASAIALLMFFMQGDMLLLFFVMTVGAVFKVAGALSADSPYSNVGAQREL LQMLAYEPLVILTFVGMSAATGSFMISDIYALDVPLLPHLPFLFIALGYALTIKLAKSPF DIASCHHGHQEIVRGVLTDYSGPQLALLEISHWLDVILLLGLCSLFWHTSITGMVVLLVL TYALEIVVDNVCARMTWGWMLKHAFGITLGLAIFNILWLYVR >gi|316922353|gb|ADCP01000122.1| GENE 7 5373 - 7307 2712 644 aa, chain - ## HITS:1 COG:BS_shaA_1 KEGG:ns NR:ns ## COG: BS_shaA_1 COG1009 # Protein_GI_number: 16080212 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA 
subunit # Organism: Bacillus subtilis # 120 440 50 378 603 196 39.0 8e-50 MLQLLLFGCIVFPLIAAVAVALDKEGTLRRRIVYAGTGITALSALGLALYGSFVLPISPD SSLQPLMTVLDLALLVFILYVAVNIRHRLSMGLALGQLAGMIYLDFFMLEGHQATAFLAD PLALVMVLVISLVGGIVCIFGLGYMQEHEDHLRLAKSKQSRFFFFLLLFLGAMNGLVLCD SLTWVFFFWEITTLCSFFLISHDGTREAKRNATRALWMNMVGGIGFLAAMLFMQKAIGTL SIQAMLAQSVVMHSTAAMLPIAFLCFAAFTKSAQLPFQSWLCGAMVAPTPVSALLHSSTM VKAGVYLVLRLSPAFAGTMLAGIVSLTGAFTFVAASALACGQSNGKKILAYSTIANLGLI IACAGMATPAAITAAILLIIFHAVSKGLLFLCVGTIEQHIGSRDIESMRGLYKIMPRVAV ITLFGIVTMMLPPFGALLAKWIAIEASASAPAFMPVLVVLIAFGSALTVLFWARWAGLLL GSDPLSDKRPVPEHLDGTMSFALRTLLYGAIALSLFVPWIFSGIEQSIASLFGVEGMFAS DWGILTNGRGLFAVYPMFFLLALGMWYALRQSKKAERGPCALPYLSGIQYVQDGKVGFNG PLNTFVEPKSSNYYMEAWFGEQTLTGRINTIAIVLMVVMLGGLL >gi|316922353|gb|ADCP01000122.1| GENE 8 7739 - 9316 1190 525 aa, chain + ## HITS:1 COG:HI0302 KEGG:ns NR:ns ## COG: HI0302 COG0815 # Protein_GI_number: 16272257 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Apolipoprotein N-acyltransferase # Organism: Haemophilus influenzae # 67 498 80 493 522 174 31.0 3e-43 MNMLWWIFCGAAGLWIGMPNPVASFPAAALLYPASLFLLGTGSTSWKHAFRLGWLCGLAG ASACLYWLAIPVHDFGGLPWALAAPCPLLMGAYIGLYGGLFAALAHALRGEQAWRKGVAL GLGWYLMECFRGWFFTGFPWITLASAFTPWTPVIQLASVIGAYGLGGLLAGLSCLCVEGI LRVRHPHALRKRWLPALGGSLAGIFLVVAFGVFSLSFAPSGQGGLWVSLEQGNLDQNVKW EPAMQRLTVKRYLSLSAESLSVPESERPELLIWPETAMPFDYQTAPELSAAIRAFARDRR VALLFGAPGFRNRGDGVVDAFNRAYLIAPNGVGEGWYDKEHLVPFGEYVPPFLDLPFLRP LLQGVGEFLPGESSGPLVLPATPQLSPDRGPLVLGVLICYESIFPELARKQVAQGATLLV NISNDAWFGRSSAPEQHLSLGILRAVEQRRWIARSTNTGISAFVDPTGAVVARSGLFKAE SLSHPVVPLTEKSVFFTLEPWLPWAGLALFLTFFAPVLSRFRRHI >gi|316922353|gb|ADCP01000122.1| GENE 9 9503 - 10492 1195 329 aa, chain + ## HITS:1 COG:BS_prfB KEGG:ns NR:ns ## COG: BS_prfB COG1186 # Protein_GI_number: 16080582 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Bacillus subtilis # 41 317 86 361 366 292 54.0 8e-79 MLREKSRLEAELDRLDTLKNCHASMLDWQEMAQELSGGAEIKEGGDAAEALESLSAEIER 
LTELLEETETNLLLSSEEDHADAILEIHPGAGGTEAQDWAQMLQRMYLRWADSHGFSVEE LDFLPGDEAGIKSVSLRISGENVYGFLKGERGIHRLIRISPFDSSGRRHTSFASVDVIPD AGDDIEIEIKESDLRVDIFCASGPGGQGVNTTYSAVRITHIPSGISVQCQNERSQHHNKD SAMRILRGRLYDLELSKRDAARQAEYAGKNAISFGSQIRTYTLQPYRLVKDHRNNCEIGD TDAVLDGRIDKLLRDYLLWQHTNRQQQQQ >gi|316922353|gb|ADCP01000122.1| GENE 10 10888 - 11163 354 91 aa, chain + ## HITS:1 COG:RSc0910 KEGG:ns NR:ns ## COG: RSc0910 COG0776 # Protein_GI_number: 17545629 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Ralstonia solanacearum # 1 90 1 91 138 83 49.0 1e-16 MNKSELVKALADQANISLDEATLVVNTFVDSMKDSLLEGGRVEIRGFGSFKVKEYGSYAG RNPRTGEKVAVEPKRLPFFRAGKELKEYLNA >gi|316922353|gb|ADCP01000122.1| GENE 11 11170 - 11430 153 86 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MREFVSVNTDGVKTSRRCPGEPASPGFLVLCLLLFSFIGTAVASELFPEGEGLRTAPELS VSATAQPILRILYAADSHGALHPCPS >gi|316922353|gb|ADCP01000122.1| GENE 12 11452 - 12183 699 243 aa, chain + ## HITS:1 COG:no KEGG:DVU1866 NR:ns ## KEGG: DVU1866 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 16 238 7 225 233 123 35.0 6e-27 MARRATVLEGYRREPVPTLTLVGADEFSPDTELPDPAKTGDASPEAVRAVYDALHVDCGM LSAGAAAWFGASRPRNFFEVGDTPVTKRFHVGGVPVAVILFPPLSAGGTPETEAPTPKLL ASVLAAADAASDAAIRIGISPWGFEGEFAVREALEQRYHLLLGGGPGAAFAGEVNAQAPG LIWSRADKDGRSVMAIDIMALPEPGVPFAWEWGLSMQAQEVRLTSAVPSNPHMEALVAGK AAR >gi|316922353|gb|ADCP01000122.1| GENE 13 12261 - 13304 1204 347 aa, chain - ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 6 248 5 244 332 69 25.0 8e-12 MRFRVLMSLLAFVLCLLPSHAKAAEQLQTAWIGEHEAFLVWYAKQKGWDKDAGLDIRMLR FGSGDKIVSRQGDYQWKIAGCGAVPTLMAASKGQFYVVGIGNDESDLNAIYVRPDSPILK AKGANPRYPDVYGSADTVRGKLILCPEKTSAHYLLSTWLHILGLKDEDVKIQNVLPGPAV 
DMFSKGFGDAVSIWAPATYTAAQKGYQIAASSPDCGIRQPILLLADREFADKNPEQVRAF LRVYLRVVDAMHSEGFDAFVQEYIRFNKTWNEKLMTREEAVEDLRAHPVFSLGEQITLFH PESGNLRNWLRGITAFYGRIGELSQEDVANLNAFGFLNDSFLKGLQQ >gi|316922353|gb|ADCP01000122.1| GENE 14 13485 - 14366 1272 293 aa, chain - ## HITS:1 COG:APE1072 KEGG:ns NR:ns ## COG: APE1072 COG0074 # Protein_GI_number: 14601168 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Aeropyrum pernix # 1 292 6 295 297 251 46.0 2e-66 MSIFIHKDSRVVVQGVTGKEGAFWAKHMKDMGTQVVFGVTPGKEGQDVDGIPVYHSVRRG IKDHPADVAMLFVPPKFTKDAVFEALDAGIKKICTIADGIPLHEAIQIRRAALSCGAMVV GGNTSGIISVGEAMLGTIPYWIDRVYKKGHVGVMTRSGSLTNEVTAEIVKGGFGVTTLIG VGGDPVPGTRFAELLPLYEADPDTHAVVIIGELGGTMEEEVAEAMEAKAFTKPLVAFMGG RTAPEGKRMGHAGAIVTGGRGTVKGKTEAIVKAGGKVAKRPSEVGALLKALLG >gi|316922353|gb|ADCP01000122.1| GENE 15 14366 - 15487 1674 373 aa, chain - ## HITS:1 COG:MTH1036 KEGG:ns NR:ns ## COG: MTH1036 COG0045 # Protein_GI_number: 15679054 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, beta subunit # Organism: Methanothermobacter thermautotrophicus # 1 369 1 362 365 229 39.0 9e-60 MKLFEYQAKEAFREKGLPTPKGVLVRGMDELDGAFAEVGFPCVLKSQVLSGGRGKAGLIK VVKDAESAKATAKTLFESEHNVRMLLVEEAVDIDREIYLSITVDPVESKIMLIGCAEGGV EIEKLAATDPDKIVTVRVDIAEGLGGYHVNNLTYGMGLSGDLGKQVGAVVRGLYKTFRAY DAQLAEINPLFITKDGKAIAGDGKLIIDDNSVWRQPSYPLTRDYFDSEVEFEAAQEGIPY LQFNGDIALMCAGAGLTTTVFDLINDAGGHPATYVEFGGANYTKAVRTMELCLKTPSKVI LVVTFGTIARADVMAQGLVEAIKAHQPKQPIVTCIRGTNEEEAFAILRDAGLEPLTNTEE AVQRAVDIAAGRA >gi|316922353|gb|ADCP01000122.1| GENE 16 15734 - 17128 2049 464 aa, chain - ## HITS:1 COG:CAP0035_1 KEGG:ns NR:ns ## COG: CAP0035_1 COG1012 # Protein_GI_number: 15004739 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Clostridium acetobutylicum # 2 442 7 444 448 288 38.0 1e-77 MKTVSEQMELARKAQAIVNDYTQEQIDEICLAIGWEVYEDENIAKLAKMAVEETGYGNVQ SKIIKHKRKVGGVLHDIKGAKSVGLIERNEETGISKYAKPVGVVCAILPATNPTATCGGK 
AVGILKGRNAVIFKPSSRALKSTTEAINMMRAGLRKVGAPEDLIQVLEDPSREAITELMK VSDLIVATGSGQVVRAAYSSGTPAYGVGQGNACAIVAEDADVAEAAKMICGSKLFDYATS CSSENAVIPVEAVYDQFMAEMKKNGCYLVTGEDREKLKNHMWKPNAKGKIALNPDIIAKS AQVIADGAGISIPEGTDILLVEGMEPILGDKFHDEKISPVLTVYKAKDFKDAYRILVELT NLVGRGHSCGIHTYKHEYIEFLGEHMKSSRITVRQSMSAGNGGHPFNRMPSTATLGCGTW GGNSTTENVHWRHFINVTWVNEPVAPWTFTDEDMWGDFWKKYGK >gi|316922353|gb|ADCP01000122.1| GENE 17 17202 - 18707 1918 501 aa, chain - ## HITS:1 COG:MTH788 KEGG:ns NR:ns ## COG: MTH788 COG0471 # Protein_GI_number: 15678812 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Methanothermobacter thermautotrophicus # 55 483 13 428 443 179 30.0 1e-44 MSRLDNCLTAIVGDKLPIETARKTFFFLLGIAVMLIVIALPSPSPFYKGEEAIPLTANAK IVMAVLCFAVIQWMTEAIPFPATSLCLIVFLHILGVASFDKLVTLGFGNNVLLFLMGAMG LSAAMTSSGLARRFMLWMLTKVGRRTDRIVLAFIAIGTMTSMWVTDMAVAAMLLPLGVSI LESSGCKPLQSNFGRALMIGIVWGALIGGTATPAGCGPNVLAMQYVRDMAHMDVSFAQWM AVGVPGAMIMVPLGWFCLMKLFPPEFREIPTSLESIRAELNALGGLNTKEIRTLVVFLTM VTLWLGGSNLEPYIGFKAPEGFVALLGFVLLFVPGLRVFDDWKVAAKSIDWGGLVLIAGG ISAGMMLASTGAARWLAWGMLSGVGELHPVLRVLAVIGMVELLKIFFSSNSVTGAVVMPL IIALAMDIGMNPWILAGPAGIATSMAFIMVTSSPTNVIPYSSGYFRISEFAKSGVVMTII GILAVTASVAIFGRFANMNIW >gi|316922353|gb|ADCP01000122.1| GENE 18 19014 - 20033 1257 339 aa, chain - ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 1 256 1 249 332 90 27.0 6e-18 MHKLLGGLCAALLFLAGLPQIGHAADPVPLKTAWLGEHEAFAAWYAKQKGWDLEEGFRLE MLSYDSGKQLMAGMNTAHWEIAACGAIPALTASLDNQAEIIAIGNDESLATGIYARKDSP ILGHTGYNALYPEMAGTGGTVRGKTILCTAGSSAHYTVHKWLEALSLTEKEVTIKDMPSE EALKAFLNGEGDAVALWSPYTLEAEQHGLMPVTLSSQCDASQPTLLLANKAFAKEHPALV TAFLKMYFRGITALQETPREQLIQDYRAFYHAWTGRDLPPQDAAWELAKHPVYGLKGQLT LFDPAQKKLHTWLQELASFAGGKNRLSDVTDVYLKKLAQ >gi|316922353|gb|ADCP01000122.1| GENE 19 20491 - 21462 1379 323 
aa, chain - ## HITS:1 COG:FN2103 KEGG:ns NR:ns ## COG: FN2103 COG3181 # Protein_GI_number: 19705393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 13 312 1 295 308 155 32.0 1e-37 MRRLLATLFCAAMLFGATSAQAASPADAYPSGPISLVVCYTPGGATDLQARLSALVAQDP KYFGQPIVILNKPGAGGMTGWNWIMERGSKDGLTMTAYNMPHFIAQSIVNKTKFSVDTFE PLGNWGADPAVLIVPKDSKFKNVADLVAFAKENPGKVTINGAGLYVGHHIATLQLQKATG VKMSYIPEKGGTEAIQNVLGGKVMAGFNNLADVYRSQDRLTILAIADVKRHEYLPDVPTL QELGYNVDDASVNFRGYALPKGVDPSIVDKAAKIVPVMFHDPEVVKRMKDSGSPMLIMDR AQVLEMFQKKQETLKALLSDLRK >gi|316922353|gb|ADCP01000122.1| GENE 20 21502 - 21957 611 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302863851|gb|EFL86782.1| ## NR: gi|302863851|gb|EFL86782.1| hypothetical membrane protein [Desulfovibrio sp. 3_1_syn3] # 4 147 28 172 176 124 57.0 2e-27 MLYKEFISALVMTGIVTAFAIPMTQLSGESQVVPLLLLICMGIFNVGQYILAGLRYAAKI DVKMTMRGYPLRRVAVLFVLTVLYLLFLETLGFYIGALLYFIAASLIAQPMKITPKLAAK RVAVCFVCIAFLYCLFTVLLAVQIPKGIFGI >gi|316922353|gb|ADCP01000122.1| GENE 21 21970 - 23487 1852 505 aa, chain - ## HITS:1 COG:BH2009 KEGG:ns NR:ns ## COG: BH2009 COG3333 # Protein_GI_number: 15614572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 463 2 462 504 327 40.0 2e-89 MEQFLTYFPSAFSWGNLAVLILGSAGGLFFGAMPGLSPTMAVALLVPFTFYLPADTSLIL LGAVYTSAVAGGSISAILLSIPGAPSSIATLFDGYPMAKQGRAQEALYTVFVSSLIGGVF GVLVMILFSPPMAEFAMRFGPSELFWTAIFGITVITGLSSGAMLKGLFGGALGMLLSTVG YSTLTGEPRFVIHEALNGGIALVPALIGLFAVPQIIDLMADSHVLLEKLTFKPQKGMLRQ VIKAHRRYLRTLGIGSLIGTIIGVIPGAGGQVAGLMSYDQTKKFDKNPDRFGTGDPEGVC AAESANNATVAPAMIPLLTLSIPGSPTAAVLLGGLLIHGLFPGPDLFTEKADITYTFISG LLLAQFFMCFFGLLASRYSHLVANAPNYMMFAAVMILCVFGSYCVQNSYEDVILMFILGV LMFVLAKFGFPSAPIVLGIILGPIAEENFLRGKMIADTDVGVFSYFFTGGLNLALIALCL LSLIWGIYSEVKYMRNHSRKAAAAN >gi|316922353|gb|ADCP01000122.1| GENE 22 23888 - 25267 416 459 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 
[Phaeobacter gallaeciensis BS107] # 12 446 11 453 468 164 30 5e-40 MFDTLYSITITQAAAALRAGELTSAALVRHALEAVRLHDGTLNAMLRVSASALEEAERCD AEAAQGSFRGPLHGIPLVIKDNIDVRGLPTTVASPLFTHAAPAKADAVAVARLRAAGAVI FGKTNLDELAAHVSGRTSCFGPTVNPWHPERRLSPGGSSSGTAAAVAAGYCPGGLGTDTG GSIRLPAGWCGLYGIRSTQGQSDISGIYPRAASLDTVGAMARSARDIDLLLQILADPPLR DTLRASAPIRYLRIGVFPELVRAKGSPEVIEAYAEAVGRWEEFGECIPVGLPLLDDPAVV DSVGTIRSYEFARDIRKDIEGNPFAGKMHPVPLADYKNGQQATPEAYHAALLHKQALSRQ VDAFFASLQVDFLLLPVAFSTAPSIDAPQEAFTAGRALVNLFSVAGVPTLVVPGELTAGG MPLGIQLVGPALSERLLLDAGTAYETTYGPFPTPPLASA >gi|316922353|gb|ADCP01000122.1| GENE 23 25326 - 26729 1854 467 aa, chain - ## HITS:1 COG:PA3467 KEGG:ns NR:ns ## COG: PA3467 COG0477 # Protein_GI_number: 15598663 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Pseudomonas aeruginosa # 10 463 6 455 455 168 28.0 2e-41 MQASNLQQEANLISRMERMPLNKALLTLVGLLSWCWVMEAFDLGMIGQVVAVLKKIWDLD ASTLGLLGSCSTAGVVIGTASAGFLTDRFGRKRILLWGVFIFTFFTLIGSLYENLAWIVT MRFIGGLGAGAVFPLPYLMLSEIAPAKHRGILVCICNAILTASYLLPTLCGSWAINNFSL DVAWRVPFIVGGLPIVTIYFLHRWLPESPRWLMRRGRHDEVRQLVERLEKSAGIEHDDTF INPDVLRSLEQAAAAERTRTGASWKALFRAPYLSRSLVSWSMYSSGLITWYVVMVYVPTI LTTYGFQLSNSVILTGCMMVIGGAGSVVIGPLADKYGRKPIWSLYVIITIISLFLLTSTD SIPMLLFIGSFVAFFGTGIMPVCKVYIAEQYPTELRGVGTGFGEAVSRVVGGVLATYYLA FFLSVGGVKGVFTFMAVAFAIAIIALWIWGQETARRSVEDTSAAPQK >gi|316922353|gb|ADCP01000122.1| GENE 24 27017 - 28393 1726 458 aa, chain - ## HITS:1 COG:BH3899 KEGG:ns NR:ns ## COG: BH3899 COG3829 # Protein_GI_number: 15616461 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 5 448 113 555 559 313 41.0 5e-85 MDISQRNLLELAMSSLPTGLFICDRDGIIRFINDAYANYLRVRPEDAMGRHITDFIPDSG IPAVIASGEPELGAWRSIQGSERKILVNRIPLRDQDGHVIGAFSLTLFDTPEQMQALLQR 
VDFLDKKVNSYARRIKSALRASHTINSILGNSPAITAFKSYLLRYARTESPVLILGATGT GKELAASAIHMASNRPDGPFVSINCAAIPKELFESEVFGYVPGAFSGAHKDGKIGQIELA DQGTLFLDEVGDLPLHAQVKLLRVLEEKTLSRLGSSQPRAVDFRLVAATNRDLKKMIAAG TFREDLYYRINPMTLNLPPLSERVEDIPLIVRDVLNRMGGEGVRCTESAMNALMRYPWPG NIRELRNVLIRALSLCQDNQITLTDLPSELRQQAVAESAGTDGKLQSVVKNSEAQTILLA LGDHHWNVAKTARALGISRASMYEKMRKFSIKRPPNGF >gi|316922353|gb|ADCP01000122.1| GENE 25 28672 - 29487 836 271 aa, chain + ## HITS:1 COG:ECs4229 KEGG:ns NR:ns ## COG: ECs4229 COG0338 # Protein_GI_number: 15833483 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Escherichia coli O157:H7 # 1 270 1 268 278 262 49.0 4e-70 MDLTRPFLKWAGGKYRLLDRLLPSIPAGARLVEPFVGSGAVFLNAGFACYLLCDLNADLI GLYRTLRRDGERFIAEARALFTPGNNTQEAFLRLRETFNVSSDAEERSVLFLYLNRHSYN GLVRYNSKGIYNVPFGKYKAPYFPERELTAFLDKLRGCDVTFAVQDFRSTFAALRPGDVV YCDPPYAPVSATANFTSYTGGGFGVQEQIDLASEASAAAKRGVPVVLSNHDVPLIRELYA DARLESFPVQRFISCNGARRGAARELLAVFP >gi|316922353|gb|ADCP01000122.1| GENE 26 29484 - 29984 514 166 aa, chain + ## HITS:1 COG:RC0690 KEGG:ns NR:ns ## COG: RC0690 COG0338 # Protein_GI_number: 15892613 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Rickettsia conorii # 85 162 107 180 182 72 48.0 4e-13 MTSEQGGAIANRQGNILEQQVRQAFASHGFREVAFAEYEKLASGSTLPGVPVPDLLVRRV PYQSIYGHRGVTEFLAVSASRGLAICIECKWQQSQGSVDEKFPYLYLNCIQAMPEREIIL LIDGNGYKPGALAWLKQAVASQDAKLIHVFNLVEFLVWANRRLRQK >gi|316922353|gb|ADCP01000122.1| GENE 27 30111 - 31634 1615 507 aa, chain + ## HITS:1 COG:aq_218 KEGG:ns NR:ns ## COG: aq_218 COG3604 # Protein_GI_number: 15605774 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Aquifex aeolicus # 9 497 6 500 506 381 44.0 1e-105 MQVRNEVKELRLLFEVLQILDSASDLSDNLETVLEVMAEHTGMMRGVITLLDEAHGEIAI EAAYGMSAEAQSKGRYKLGEGITGKVIESGKPLVIPNVLVEPLFLNRTGSRSRKEAVSFI 
CVPVKMEGRVIGALSADRLFAASEALEEDMRLLSVLASLISKAVKNRQERHRMLEENRRL LDVLRERSRPEMFVGRSPGLRAVLTQLAQVAPTSATVLLLGESGTGKELAANTIHAGSPR AGHPFIKINCAALPEGLIESELFGHEKGAFTGAVGMRKGRFEAADGGTLFLDEIGELAVS TQVKLLRVLQEQEFERLGGMETRKVDVRVVAATSRNLEQMVKDGLFRRDLYYRLNVFPVS MPPLRERQGDIPLLVENFLEKYGHKIGKRGLRVAPEAEALLLAHSWPGNIRELENVIERA VILTTDGLIRPDLLPPAMQAPCALCSLRGTLPDALEQLERQLITEALQEHRGNMGHAAQA LGISERVMGLRMRKFGLGYKAFRRGAP >gi|316922353|gb|ADCP01000122.1| GENE 28 31824 - 32336 411 170 aa, chain + ## HITS:1 COG:SMc00414 KEGG:ns NR:ns ## COG: SMc00414 COG3963 # Protein_GI_number: 15964087 # Func_class: I Lipid transport and metabolism # Function: Phospholipid N-methyltransferase # Organism: Sinorhizobium meliloti # 2 163 35 198 200 74 28.0 1e-13 MVCPSGMPLARSMAAFAPRKGDGLVVELGAGTGTVTRQLLDAGIAPRRLVVVEQSPVMVS LLRERFPQLAILEADARWLAGCLPRDRQVDCIVSSLPLLSLNERVRNQIISAMFEVLHEG GLLIQYTYSWRKANAFLKDGFRCIGSSQVWHNVPPARIFRFRREAGARVR Prediction of potential genes in microbial genomes Time: Fri May 13 04:04:36 2011 Seq name: gi|316922304|gb|ADCP01000123.1| Bilophila wadsworthia 3_1_6 cont1.123, whole genome shotgun sequence Length of sequence - 58269 bp Number of predicted genes - 48, with homology - 46 Number of transcription units - 24, operones - 14 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 77 - 113 3.1 1 1 Tu 1 . - CDS 128 - 658 731 ## COG2193 Bacterioferritin (cytochrome b1) - Prom 728 - 787 3.5 - Term 787 - 820 7.5 2 2 Tu 1 . - CDS 845 - 2197 971 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 - Prom 2292 - 2351 5.4 3 3 Op 1 16/0.000 + CDS 2714 - 4798 2229 ## COG0729 Outer membrane protein 4 3 Op 2 . + CDS 4798 - 9243 3957 ## COG2911 Uncharacterized protein conserved in bacteria + Term 9341 - 9407 27.8 5 4 Op 1 5/0.000 - CDS 9416 - 11524 2633 ## COG2217 Cation transport ATPase 6 4 Op 2 . - CDS 11517 - 11957 594 ## COG0640 Predicted transcriptional regulators - Term 12235 - 12276 7.1 7 5 Op 1 . 
- CDS 12524 - 14017 2170 ## COG4145 Na+/panthothenate symporter 8 5 Op 2 . - CDS 14017 - 14343 521 ## Desal_2644 protein of unknown function DUF997 9 6 Op 1 . - CDS 14533 - 15333 838 ## COG2602 Beta-lactamase class D - Term 15374 - 15408 -0.0 10 6 Op 2 . - CDS 15424 - 15771 547 ## COG1416 Uncharacterized conserved protein - Prom 15858 - 15917 3.4 + Prom 15759 - 15818 3.5 11 7 Op 1 . + CDS 15948 - 17252 1151 ## COG0038 Chloride channel protein EriC 12 7 Op 2 . + CDS 17321 - 18121 425 ## COG1451 Predicted metal-dependent hydrolase 13 8 Tu 1 . - CDS 18155 - 18673 433 ## - Prom 18762 - 18821 2.3 + Prom 18729 - 18788 3.2 14 9 Tu 1 . + CDS 19012 - 20673 1019 ## LI0603 two-component response regulator + Term 20713 - 20744 2.1 - Term 20748 - 20809 12.2 15 10 Op 1 . - CDS 20915 - 22369 1882 ## COG4783 Putative Zn-dependent protease, contains TPR repeats 16 10 Op 2 . - CDS 22354 - 23811 2115 ## COG0215 Cysteinyl-tRNA synthetase 17 11 Tu 1 . - CDS 24050 - 24508 709 ## COG0698 Ribose 5-phosphate isomerase RpiB - Prom 24580 - 24639 4.2 18 12 Op 1 . - CDS 24823 - 25146 437 ## 19 12 Op 2 . - CDS 25146 - 25952 1166 ## DvMF_0479 hypothetical protein 20 12 Op 3 . - CDS 25949 - 27718 2045 ## COG0457 FOG: TPR repeat - Prom 27752 - 27811 1.6 21 13 Tu 1 . - CDS 27822 - 29057 1609 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) - Prom 29183 - 29242 2.5 22 14 Op 1 . + CDS 29340 - 31757 3222 ## COG0646 Methionine synthase I (cobalamin-dependent), methyltransferase domain 23 14 Op 2 . + CDS 31773 - 32249 612 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 32258 - 32302 11.1 24 15 Op 1 . + CDS 32352 - 32810 624 ## COG1246 N-acetylglutamate synthase and related acetyltransferases 25 15 Op 2 . + CDS 32853 - 33383 702 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 26 15 Op 3 . + CDS 33401 - 34141 878 ## LI0615 hypothetical protein + Term 34282 - 34328 8.5 27 16 Tu 1 . 
+ CDS 34431 - 34760 346 ## COG5561 Predicted metal-binding protein + Term 34810 - 34847 6.1 28 17 Tu 1 . + CDS 34879 - 35172 293 ## COG3070 Regulator of competence-specific genes + Term 35213 - 35245 2.3 - Term 35203 - 35231 1.4 29 18 Op 1 . - CDS 35357 - 38584 3200 ## COG1379 Uncharacterized conserved protein 30 18 Op 2 . - CDS 38594 - 38881 193 ## Dde_3501 hypothetical protein - Prom 39109 - 39168 2.8 31 19 Op 1 . - CDS 39511 - 39819 510 ## DvMF_2244 stress responsive alpha-beta barrel domain protein 32 19 Op 2 . - CDS 39849 - 42374 1905 ## COG0068 Hydrogenase maturation factor - Prom 42476 - 42535 3.0 + Prom 42515 - 42574 4.1 33 20 Op 1 . + CDS 42616 - 44310 1923 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 34 20 Op 2 . + CDS 44363 - 45079 769 ## COG1794 Aspartate racemase 35 20 Op 3 . + CDS 45097 - 45633 535 ## COG0703 Shikimate kinase 36 20 Op 4 31/0.000 + CDS 45630 - 46454 935 ## COG0765 ABC-type amino acid transport system, permease component 37 20 Op 5 31/0.000 + CDS 46593 - 47399 1008 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 38 20 Op 6 . + CDS 47396 - 48334 890 ## COG0765 ABC-type amino acid transport system, permease component + Term 48537 - 48579 4.0 39 21 Tu 1 . + CDS 48586 - 49137 408 ## LI0734 hypothetical protein + Term 49201 - 49238 5.4 - Term 49304 - 49342 0.4 40 22 Op 1 18/0.000 - CDS 49365 - 50069 240 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 41 22 Op 2 19/0.000 - CDS 50062 - 50868 260 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 42 22 Op 3 24/0.000 - CDS 50865 - 52109 1903 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 43 22 Op 4 20/0.000 - CDS 52120 - 53028 1361 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 44 22 Op 5 . 
- CDS 53254 - 54369 1717 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component - Prom 54556 - 54615 5.4 + Prom 54540 - 54599 2.4 45 23 Tu 1 13/0.000 + CDS 54634 - 55524 622 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase + Term 55575 - 55643 30.4 + TRNA 55557 - 55632 72.6 # Gln TTG 0 0 + Prom 55558 - 55617 80.3 46 24 Op 1 9/0.000 + CDS 55677 - 56612 1240 ## COG0462 Phosphoribosylpyrophosphate synthetase + Prom 56632 - 56691 2.8 47 24 Op 2 22/0.000 + CDS 56854 - 57432 659 ## PROTEIN SUPPORTED gi|94987179|ref|YP_595112.1| 50S ribosomal protein L25/general stress protein Ctc + Term 57581 - 57613 3.3 48 24 Op 3 . + CDS 57651 - 58247 581 ## COG0193 Peptidyl-tRNA hydrolase Predicted protein(s) >gi|316922304|gb|ADCP01000123.1| GENE 1 128 - 658 731 176 aa, chain - ## HITS:1 COG:SMc03786 KEGG:ns NR:ns ## COG: SMc03786 COG2193 # Protein_GI_number: 15966922 # Func_class: P Inorganic ion transport and metabolism # Function: Bacterioferritin (cytochrome b1) # Organism: Sinorhizobium meliloti # 12 166 5 159 161 79 32.0 2e-15 MTQEAAKKERVAKVIEVLNKARSMELYAVHQYMNQHYNLDDLDYGEFASKIKLIAIDEMR HAEQFAERIKELGGEPNSELSSKITKGQKVMDVFPFNVKVEDDTIETYNKLLEICRTNGD SVSARLFERILDEEQVHDNYFCNMANHIETLGNTYLSKIAGSDSPEIPSRGFVDAE >gi|316922304|gb|ADCP01000123.1| GENE 2 845 - 2197 971 450 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 4 444 3 440 456 378 44 1e-104 MEQLTKIVSEINGFLWGMYCLIPLLCGVGIYFTLKLDFIQIRRFGLAVKSTFGGLTLHGE RAGKQGMSSFQSLATAIAAQVGTGNLAGAATAIAMGGPGAIFWMWIAAFFGMATIFAEAV LAQTYKTTDDQGHVTGGPAYYISKGLGSKKLAAFFSISIIIALGFIGNMVQSNSIADAFQ TAFSLPPLLIGVIISALAAFVFFGGMGRIAAFTEKVVPIMAALYLLGGLIVLLANASHII PALKMVFVGAFDPAAATGGIIGAGVKEAMRYGVARGLFSNEAGMGSTPHAHAVAKVKYPA QQGFVAIVGVFVDTFIVLNMTAFVIFVTNSIDGSTTGIALTQKAFESGLGSFGITFVAIC LFFFAFSTIIGWYFFGEQNIKYLFGTKGTTPYRLIVMGFIVLGSVLQVDLVWELADMFNG LMVLPNLIALIGLSKIVSMALRDYDANNPR >gi|316922304|gb|ADCP01000123.1| GENE 3 2714 - 4798 
2229 694 aa, chain + ## HITS:1 COG:mll1662 KEGG:ns NR:ns ## COG: mll1662 COG0729 # Protein_GI_number: 13471632 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein # Organism: Mesorhizobium loti # 13 694 24 617 617 234 29.0 6e-61 MGFALCLLLSGCLFGSGNKQDTLQPMTDEAPNPAKKTIRYSVKLQSDPQNGDLVSELREN SQLVWLHDDLPDSRVGLERRALEDVETARKILHSQGYYDGTVRHHINWEAQPPEASITLR PGERYVMGPTKLRYERTGPAGEPVDKDLPESVRGIDFMENAPDTLEAFGLAKGSPAEAQT VLNAVTSVVTAMRKAGYPLAEQGKARYVIDRSTHTLEADVLIKTGPLLRMGPVLIKEENV RPADNPDAPGVPAVNEDYLNKLSPWIEGQYWNDDLLKEYRTTLQETGLFSAIDMKPARLS PEQAKAAGWTDRSLAEAGFSELPLGDSSDAPSPAPPADPPAGGTGKKAAGSDMPIWMTSD GTTPVSLTVRDAPPRTVSGGLQYSTDTGFGVRGAWENRNLFGGGEQLRVTAPISEDSQSL NASFRKPAFGIRDQALVGEAWAINETTDAYDQSALSFAAGLERRFRKDWRNWWASARVSV EAGSLKDNYRGRRTYSLLGFPLSVRRDTTNSLLNPTTGTRVTLEVTPYTGTFNGPLTTVR TRLDASGYWSPGWDRLVLAGRAAAGSLTGENVRNIPASLRFYSGGGGSVRGYKYQSLGPH DRNGDPLGGLSFTDVSLEARFRITESFGIVPFVDGGMVYETTMPDWGRDLSWGVGLGFRY YTLIGPIRLDIATPLQDRDNNKAFQIYLSIGQAF >gi|316922304|gb|ADCP01000123.1| GENE 4 4798 - 9243 3957 1481 aa, chain + ## HITS:1 COG:mll1661 KEGG:ns NR:ns ## COG: mll1661 COG2911 # Protein_GI_number: 13471631 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mesorhizobium loti # 444 1481 1064 2078 2078 166 23.0 4e-40 MTVPPSLRRAGRILGWSLLGLVLVLFFALAGGMAFLRTQPGERWIADTAVSVLKNVGIEA QIGSLEGPLPFSLHLRDVRLADGQGVWLDVPEAALRIGAGALLRGHLLVEELSVLDASMP RVPEFPPADPPPPPSDPLELLRPSFLQPLASDIAGYLKLATVGELRIERFRLGQAVAGIP IEATLEGSGPLTDFRSELESTLYRPQIEKAGGAAEPLLALSGALNLGNEGNWLSDGVKRP QPEERPDAGVGLDLRLAVPRAAGNGGDEAQGAQDAPEQGSLRLLLTLAGARFSVPELMAE VPGGLVTGHGLFFDDAKVGGDLSVLLSDPDALMRLVAALTGEAQAGLPLASANLEAALSG TLDAPGLVLKASVPGIVLDAAQPDARLDVASSWTLAVRSLLSEPSVTADATVNLGGPFVR KAMSANQAAKNGKEGERSGEVPIRLHVEASETAETLTVKALCVESEWLALAGDASLETAT GKLAAALRLDMPSLRRLAQFPPVAALLAGSPFRELAGTVGLKADVRRDDASSPFAGTLSL KLADMRWGLAPLQDTVGETASLGASFSAQPESGSASVSDLTLKAGQISGKGNASTDGKKL DAALSLALSSLKGIAPSLSGPFELKLRASGPLLSPGGELTLASPHLGVETATLEGLRVRL 
NTPSVGAAGGKGTLDVSATLRDASIKALQAGAPLRFSTEWQFAADRFSLHKTKLDAPGIA LSGALDALPAKRRLDGGFRLAVTDWAALSSLTGVPLDGSAAVANVSFSQAKTQQLAADWS FGALSAGSLFSMRTFKGNLRLDDLFGKQGIGLSASLGNGSASDFRWRSGKIDVSGSLKRL QASVALQGKTSADIGLTLDIAGQKAQVERLTFTDRRKRTLVGVRLNRPVNIAFGNGLSVD NLDLSVLPQGTLTALGSLDGRKLALEARLRDVAINMARLFTDVPVPDGLLNAAIALSGTP SQPRGTLDVSLRNIAFPESDMPPAAVDINGTLRSSALVLNVKTDGMGTSPATGMLSLPLS FSASGVPSPALDKPCSGNVNWEGSLASLWRFVPLANSSLTGQGSLNATLSGSLSAPELSA SLKIEKAAFEEILLGLALSDINLDASLQSGGMSRLSLSATDGQGGAINVNGTVGPLASGL PLSLHGTIKELAPLHRNDLSVTLSGTADIVGPATSPDVRAAITVNKGQFQIVSSFGTSIP TLNVVEAGQEESTASSSGSGPKLDVNVLIPNRFFVRGKGLESEWKGNLQVSGPATNPVVT GSINSIRGQFSLLGKQFTLSRGDVEFSGATPPDPLLNVLVTYAAANITAEATVSGPASSP TLTLSSQPPLPQDEVVAQVLFGQSASSLGRMEAIQLAAELASLSGFGSGGMGVLGEVRDT LGFDVLRFGSMQNGPKQQTSRNAGLLQPPGQNSGSSAQEDSIPSLEVGKYVMDNVYVGLE QGMNGDASGVRVEIELAPNLNLEGVSTPQGSEVGLNWKKDY >gi|316922304|gb|ADCP01000123.1| GENE 5 9416 - 11524 2633 702 aa, chain - ## HITS:1 COG:slr0798 KEGG:ns NR:ns ## COG: slr0798 COG2217 # Protein_GI_number: 16331908 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Synechocystis # 2 702 9 721 721 543 46.0 1e-154 MTKRYAVHNLRCAGCAARIEDAIAGLPGVKSAAINLDSARLRVEGEQPELAAMIRLADAI EPGTLFSEMDENEPLPESSEPQPSLFRSEYKLWIAAALFAVGMIFEERLDALLTPFVAKA VFFGIPYLLCGLSVLKKGLQSLLKFDFFNEYTLMGGATIAAAAIGQLPEAVGVMLFYSIG EALQDRAAGNSRRSIRALLAARPTVAHLIEEKDGKQMVNDADPAEIRPGMRILVKPGEKI PLDGTVLTGSSQVDTSPLTGESVPVSVGAGKQVYAGTVNLEGAITVEVTSRFADSSVARI LELVEQASANKAPIERFITRFARYYTPAVVFIAALVAVIPPLAGHGTFDEWLYRALVLLV ISCPCALVISIPLGYFGGIGAASKKGILVKGGHVLDALHNIKTVAFDKTGTLTRGVFEVT RVLPAEGASEEDVLNAARLVESRSTHPIARAIMKTAPGSHGFALEAVESKEIPGKGLSAE HKGRTLLVGNALLLQDAGISVPSLPAGQGSVVYVAVGNAYLGAIEVSDVLRPESAPAIES LRGQGIERLVMLTGDRPESAAIVAKELHLDEYRAGLLPEGKAAELESLGPRQSVLFVGDG INDAPVLAMSGVGVAMGGLGSEAAIEVADAVILDDSPSRLPEMFSIAAKTRVVVWQNIVM ALGIKGLFMVLGVVGLSGLWEAVFADVGVALLAVLNATRTAR >gi|316922304|gb|ADCP01000123.1| GENE 6 11517 - 11957 594 146 aa, chain - ## HITS:1 COG:CAP0103 KEGG:ns NR:ns ## COG: CAP0103 COG0640 # 
Protein_GI_number: 15004806 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 3 73 15 85 135 62 38.0 2e-10 MFDFLTTCRALADENRIRILMALRGRQLCVCQVTAFLDLAPSTTSKHLSILRQARLIESN KQGRWVYYRLADDSPRPNIREALAWMKDSLANDPTIIEDEARIAEILRAEEACGFGHGES DCFHSLEVHSLAQRFENAPQEEENHD >gi|316922304|gb|ADCP01000123.1| GENE 7 12524 - 14017 2170 497 aa, chain - ## HITS:1 COG:FN0685 KEGG:ns NR:ns ## COG: FN0685 COG4145 # Protein_GI_number: 19704020 # Func_class: H Coenzyme transport and metabolism # Function: Na+/panthothenate symporter # Organism: Fusobacterium nucleatum # 22 490 19 482 484 441 51.0 1e-123 MAELSTIVPVVIYLSLSFLAALWARKQSQKTSDSHGFIEEYFIGGRSMGGFVLAMSIIAS YTSASSFVGGPGVAYKLGLSWVLLAMIQVPTTFLTLGVLGKRFAIMARKTRSVTLTDFLR ARYRSDAVVVLCSLALLVFFMAAMLAQFIGGARLFQAVTGYPYIVGLALFGLTVILYTAV GGFRAVVLTDAIQGMVMVFASVVVLVAVVNAGGGMERCVATLKAIDPGLITPIGPGDAVP QPMILSFWVLVGLGILGLPQTSQKCMGYKDSRSMHDAMIMGTLIIGFLLLCVHLAGTLGR AVIPDLPAGDLAMPTLIMKLLSPFWAGVFIAGPLAAIMSTVDSMLLLASAAIIKDLYIHY GLKGDASRMTPVRLQTMSLVCTAVIGLIVFAAAIEPPDLLVWINLFAFGGLEAVFLWPII LGLYWKEANASGAVASIVAGVACFFALSILKPGMGGIHAIVPTSIVAFAAFVIGAKCGRP ASEEVIRLFWGEGNGTS >gi|316922304|gb|ADCP01000123.1| GENE 8 14017 - 14343 521 108 aa, chain - ## HITS:1 COG:no KEGG:Desal_2644 NR:ns ## KEGG: Desal_2644 # Name: not_defined # Def: protein of unknown function DUF997 # Organism: D.salexigens # Pathway: not_defined # 7 88 6 87 97 129 71.0 3e-29 MNKPFKQDWRYAQANREALLTLGAYGLYFVWWYACAYGLGDGDPEGYDYIFGLPAWFFLS CIAGYPLLSALLWFMVRRFFKDIPLDEEGGAPKREGLEVTDERHPEAF >gi|316922304|gb|ADCP01000123.1| GENE 9 14533 - 15333 838 266 aa, chain - ## HITS:1 COG:BS_ybxI KEGG:ns NR:ns ## COG: BS_ybxI COG2602 # Protein_GI_number: 16077278 # Func_class: V Defense mechanisms # Function: Beta-lactamase class D # Organism: Bacillus subtilis # 11 265 19 265 267 179 39.0 4e-45 MKKHIIIGTLFALVLLASVARASETVERTDLADIFQGFDGTFIMYDEGADRYTVFNKALS ETRLPPCSTFKVFHALIGLDSGVLDRDDARTLMKWNGKPSSIAAWNHDQTLASAMRHSVV 
WYFQRVATGIGEERMQRGLDRIGYGNRDISGGLTRFWLQSSLKISPREQVGLLRALFNGS LPFAAEDIAVVKRDITLSSGKGLRFMGKTGSGSDDADRGIIGWFVGGVETPHNRWSFAVN IQGPDASGVRARGIAEAALKRLSILP >gi|316922304|gb|ADCP01000123.1| GENE 10 15424 - 15771 547 115 aa, chain - ## HITS:1 COG:AF0913 KEGG:ns NR:ns ## COG: AF0913 COG1416 # Protein_GI_number: 11498518 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Archaeoglobus fulgidus # 5 115 3 112 112 72 40.0 1e-13 MFYDLLVHVDLPDESRFVMALGNINNYLAALNGEPCTVVLLANGGAASFLARKDNAQTEA IEKLAQAGVSFRVCANAMKKIPITKEDLLPCVEVVPAGVVEIVRLQREDFAYIKP >gi|316922304|gb|ADCP01000123.1| GENE 11 15948 - 17252 1151 434 aa, chain + ## HITS:1 COG:L113400 KEGG:ns NR:ns ## COG: L113400 COG0038 # Protein_GI_number: 15673646 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Lactococcus lactis # 25 428 24 418 512 182 34.0 1e-45 MSNPIHSTSAKSVLPMLAMRGLATGLCAGCVVGLLRLSHDRAAARVALWLEEWREVWWTV PIWFGVLVALAVILRRIVREEPLISGSGIPQTELALAGRLSFTWWRVLLAKFLGSWLALF GGLALGREGPSIQMGGAVGAGFHALWRPAAPHALTGSPYVVAGAVAGLAAAFGAPAAGVL FAFEEMKSRRDTSLIVAACASAIGAHLMIRIVFGMGRILPFAGFEAPPLSSFWIVALEGA VFGVLGVGYNKTLLWLHDREAGQTLIPDRWRALPPLLLAGCLALFAPLLIGGGESLIIFV GEHDVALKTLILLLAFKYLFAQYSTVASIPGGLLMPILCLGALWGRLWAELPVSALAAHG LASGSVQPYVLFGMVSYFAATVRAPLTGIVLVTEMSGTYTCLPGSLLAGLIACKVANLLH CPPVYDSLKERIRL >gi|316922304|gb|ADCP01000123.1| GENE 12 17321 - 18121 425 266 aa, chain + ## HITS:1 COG:MA3501 KEGG:ns NR:ns ## COG: MA3501 COG1451 # Protein_GI_number: 20092311 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Methanosarcina acetivorans str.C2A # 25 260 11 241 245 95 28.0 1e-19 MRFSCFAQKSSEGPEEGVEHLLWNGQSIAFTIRRGSLRRKRTLIRVTRAGKVEVILPPFL LRKEAFAAIQSRGGWILDRLREIQSCPQPSPLYYTDGEEHLFLGTYCRLQLVVCGVKKNF LSSDEKIGQSVVTGRSQLIQLRVDSDDPETVHKQLFLWYKTQIQKHIAYRLAALCPRIPW LTAVPTWRVRTMRRRWGSCTGSGVITLNTHLIKAPPQCLDYVLLHEIAHLQEHNHSQRYY AVLERLLPNWKAVRRELEAWSPLVLN >gi|316922304|gb|ADCP01000123.1| GENE 13 
18155 - 18673 433 172 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPSNTPAEISPLLSQLGPSGPEGLRRCARNASTRWKILLPLSVCFLLAGLAGLFLRPDRR PDLPLLQKRFELLMTWTLPEIHCDDMHLSGLHKQGKEWLATYRFVVYATQGSEASPVIRA RLKRRFPECGELLFQTGKACTVEERVRFVPVQQHGLVPEKVILDYPELLSQM >gi|316922304|gb|ADCP01000123.1| GENE 14 19012 - 20673 1019 553 aa, chain + ## HITS:1 COG:no KEGG:LI0603 NR:ns ## KEGG: LI0603 # Name: not_defined # Def: two-component response regulator # Organism: L.intracellularis # Pathway: not_defined # 376 550 345 520 524 97 34.0 2e-18 MRHQERNVSEFDEIITYPPSAAPSGEKQVMSCVQAFTAALGFAMSAGWDWLPSMHTLWLR DYIEAVSMPTPFGFFGTLSGSIFFKMIMLLALCPVMERLFHAFPLDRVAFRRKPFPNADY ALLPPALRERIYLSTPLILSQLCAILLLLVLTFSLPPCLDGLMPWLLGTVSALFGLIWFV LVTLLPGVWSGISYGLAITFSILISLLLDTAPAESLARFRMLFPILAIPLIFFAMPPSDD IRRRIRRHHRQRLFSTRVKGVLLRGLLLRRLSYLDILSKQGSVYLVLCVFPIMYGVQILL DFSIGCPMVSFNLMLEPPLERHAFLIMSGEISGAFLASSLIAFFPSHSLLVPMLGLSLFG CGSLLTSVFPSQPLNSVAFYIMQLSSGCCAAFGLFVLHQFFQRSGFLFRTLSRALFLLTL FGSMGGYLLWKFADNIHNSYLSSHALFMHLLTIGAILGLFFVYLIRRPLRHLMVTTPRRS LEEAFVPIIQEDPFEQLTPREREVAELVQTGMKNLEISVKLNITETTLRVHLRRIYRKLG IQGRSNLRDFNSL >gi|316922304|gb|ADCP01000123.1| GENE 15 20915 - 22369 1882 484 aa, chain - ## HITS:1 COG:aq_972 KEGG:ns NR:ns ## COG: aq_972 COG4783 # Protein_GI_number: 15606285 # Func_class: R General function prediction only # Function: Putative Zn-dependent protease, contains TPR repeats # Organism: Aquifex aeolicus # 73 452 59 421 425 135 29.0 2e-31 MGYHLNARYGAAGNAPGRLWRGLVAVLLSFALFLQGAIPVQAALFGSFGVKDEKELGRKF DVLVRSRMPLVEDPEIKGYIQSIVDRLSKTFPPQPFPFTANVLLHNSMNAFAVPGGSVFV HTGLIMQLDHESELAGVLGHELAHVTQRHIATRMERAQAVTIGSLVGALAAALLGGGQGS GAIFAGSVAAGQAAMLNYSRMDETEADQIGLQYLTSAGYRPQGLQGAFEKIRRKQWASGI DIPEYLSTHPDVGGRINEIHARIQGLPAAVRNRKDDDTRFNRVKTLIWARYGDPDAAARF FAKQLEGKKPDCLAYMGQGILAERRNQIKDADAAFSKALACAPGDALVVREAGRFYYNKG DARAGRTLLRALELDPNDIMAQFFYARLLDGSGDKASAHKYYQQVLRRLPQDAEVHYYYG RSLGEAGKVFDAYLHLAYSSLYQNDKRKTESWLKQARPLARTPVEQDQLKRFEAIYAERQ EVWK >gi|316922304|gb|ADCP01000123.1| GENE 16 22354 - 23811 2115 485 aa, chain - ## HITS:1 
COG:PA1795 KEGG:ns NR:ns ## COG: PA1795 COG0215 # Protein_GI_number: 15596992 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Pseudomonas aeruginosa # 1 482 2 456 460 433 48.0 1e-121 MQFYNTMSRKKETFTPVRKGLVGVYACGITAYALSHIGHARSAVAFDILVRLLRYKGYEV TFVRNFTDVDDKIIKRANDEGVSSTEIAETYIKAYHEDMDRLGVIRADIEPRATEHIQEM LNLTERLIADGKAYATPSGDVYFRVRSFPGYGKLSGRTPDDLRSGARVVPGDEKEDPLDF ALWKSAKPGEPYWESPWGKGRPGWHIECSAMSEKYIPLPLDIHGGGLDLIFPHHENEIAQ TEAALGKPLANIWMHNGFVQVDSEKMSKSLGNFKTIRDILESYLAETLRYFLAGKHYRSP IDFSLDNMDESERSQKRVYECLREVDKALARENWKPGAAPAELVEELKQQDQSFMDALED DVNTAAAMGHLFNMVRIAGRVLEDKKLRNSEGGRDILRAFRDATAKWDTLLGLFAQKPEG FLASLREIRARRRGLDMNKVAGLLRERQEARAAKDFARSDAARDELAALGVEVRDTPEGP VWDII >gi|316922304|gb|ADCP01000123.1| GENE 17 24050 - 24508 709 152 aa, chain - ## HITS:1 COG:CAC2880 KEGG:ns NR:ns ## COG: CAC2880 COG0698 # Protein_GI_number: 15896134 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Clostridium acetobutylicum # 4 150 3 140 152 152 51.0 2e-37 MATVYIGSDHAGLELKAALVKRLSEKGHDVHDLGPATTESCDYPVYAKGVCGKVLADAGN ATPGDEPASFGILVCGTGIGMSMTANHIPGIRAALCGCEFQARATRQHNNANVLCLGERV TGQGVALDIAELFLNTAFEGGRHLRRINQIEG >gi|316922304|gb|ADCP01000123.1| GENE 18 24823 - 25146 437 107 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDVQTSLNRIEELFRIMYLGLWVLWAETRWIAGPDIGYQRRLTLMRRRQGAIQDELSRMA TLADPRREQLAGDLALLEGDIVRLEKDREAHRAGHLAPLRARFGWLL >gi|316922304|gb|ADCP01000123.1| GENE 19 25146 - 25952 1166 268 aa, chain - ## HITS:1 COG:no KEGG:DvMF_0479 NR:ns ## KEGG: DvMF_0479 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 22 267 41 294 296 222 48.0 1e-56 MNIRLSAPFRLILAALLVLSLQACATRGTEIGTSAPSDASNAAWQAYQGYASARASDHDP YRLSGSLRYGSQGDTRRVETLLWSNGYLPIRLDVMAGIGALVARIQETQDSFTAYAPNEN KAIVHKGPQRVQLNFGRPVPFSLRDFSSLMRGRFHEVFGAAEGLNPRALPNGDIAFTLSG GILPGMLELRPDGLPVRWSAEKGWVMDITYDDGNPPLPYKFKLTHPDGYSAILLVKDRQK PNAAFRDDQLALELPAGTIIEPIKKGGN 
>gi|316922304|gb|ADCP01000123.1| GENE 20 25949 - 27718 2045 589 aa, chain - ## HITS:1 COG:aq_854 KEGG:ns NR:ns ## COG: aq_854 COG0457 # Protein_GI_number: 15606205 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Aquifex aeolicus # 98 589 40 545 545 139 24.0 1e-32 MSFRIRLTHPAAPLAAAPSHARRGNSRCGAGLVRNLLRGMVCAVLTVQMLTGCARSTPQE PVQWDLSPEAQRTYATLLLDQSIRGDDKEGVLEAVKLLLTLENRPQPFIDAAAWLMLNRE TSEARSLLEQAVKRVPGDLGLHLLLAETWLEQGDNEQALKILKEYRKQRPDSELVQQELG ILYVKTGHFKDADKVFSALPQRLRTSFVRYAHAQALVGLKQPHRAIQQLRLAVQESPEFV DAWFELARLLEADRQFSEANEIYNSLIEQDPDNPDIWIRMVEGQIRAGKGQKALDYALNG PATYGYKLTAATLFLDARMYPEAEALLDGLKSDPNAPDEVHFYLAAIAYEYHKDVQETLN FLESIPPENRFYDRALRLRIQLLHDQQRYEDAMQLILQGQSQFPTERDFRLMEIHLYLLQ DRYKEALTAATAAQQIWPSDDEIAYVYGSVLDSLGRKREALAVMESIVARNPEDYQALNY VGYSLAEQGKDLDRAVLLLEAALKQAPDHAYILDSLAWAHFRRGENAEAWALVRRATSLP DGGDPTIWEHYGDIANAQGLKNEARTGWERALELDHPNPETIRKKLNSL >gi|316922304|gb|ADCP01000123.1| GENE 21 27822 - 29057 1609 411 aa, chain - ## HITS:1 COG:CC3098 KEGG:ns NR:ns ## COG: CC3098 COG0568 # Protein_GI_number: 16127328 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Caulobacter vibrioides # 124 394 10 283 298 189 42.0 1e-47 MAKGRTNHMSRRISTKAAESGQTTGGPADDAIAPPVSSATAAEEHHTVVEPEIIEPEILP KQASGKTAAHKAAKPMDVTDAEDEDSEDIGDDLPEDDVAALDPDADILPEGDTLPAPSQQ HLPSLSSSKDSLHLYLREVSRFPMLKPDEEFDLARRVQKTGDSDAAFRLVSSHLRLVVKI AMDFQRRWMQNVLDLIQEGNVGLMRAVNKFDPDKGIKFSYYAAFWIRAYILKFIMDNWRM VKIGTTQAQRKLFYNLNRERQKLIAEGFDPDAATLAERLGVGEDQIVEMQQRLDASDMSL DATVGDESGSATRMDFLPALGPGIEESLAGMEIAELLQSKIRDILPSLSEKEAYILEHRL LTDDPVTLREIGERYNVTRERVRQLEARLLQKLKTHLSTDIQDFSEDWISH >gi|316922304|gb|ADCP01000123.1| GENE 22 29340 - 31757 3222 805 aa, chain + ## HITS:1 COG:TM0268_1 KEGG:ns NR:ns ## COG: TM0268_1 COG0646 # Protein_GI_number: 15643038 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I (cobalamin-dependent), methyltransferase domain # Organism: Thermotoga maritima # 6 
284 8 285 285 205 42.0 3e-52 MADFRSALASGRTLLLDGGMGTMLQARGLPAGEHPEQFCLDRPDVLRGIHADYLAAGADI ILTCTFGGSRLKLPAGIDVTPFNRTMARIARAAVDAAGREAFVAGDMGPCGQFVRPLGDL HPLELYEALREQARGLVEGGVDLFLIETQFDLAEVRIAVAAIRAESDLPIMVSMTFEQGT SLTGTTPEIFAETMQNLGVDALGLNCGLGPEQMAPLMERFLACSSVPVLAEPNAGLPELV DGKTVFRLPPEPFAEKTAAFVGMGARLVGGCCGTTPDHIAALRRAVDAVGPCPAPAVTPS GIVLTTRTRLVRVSPAEPFKIIGERINPTGKKDLQAELQAGEYGLAMRFASEQVALGAPI LDVNVGAPMVDEALLLPELVQRLTGKYAEPLSLDSSHAEAIAAALPFCPGSPLVNSISGE ADRMEHLGPLCRQWGAPFILLPIQGRKLPVKAADRIAIIENMLDKAAMLGIPRRLVLVDV LALAVASKAEAAREGLETIRWCAAHGLATTIGLSNISFGLPARELVNTTFLSMAMGAGLS SCIANPSSGRLREAKAAGDVLMGHDRDAAAFVNGYAMWTPGNGGTADAAAFARATAQTVE EAVILGDRESVVGLVEKELEAGADPFELVRGRLIPAITEVGSKYERREYFLPQLLRSAET MQTAFARLKPLLEREANAEAQKIIVVATVEGDIHDIGKNIVSLMLGNHGFKVVDLGKDVK AEAIVEAAVAHKADLIGLSALMTTTMVRMRDTVDLVKQRGLGVDVMVGGAVVTPAFAESI GANYSSDAVDAVRLAKSLIAARKNQ >gi|316922304|gb|ADCP01000123.1| GENE 23 31773 - 32249 612 158 aa, chain + ## HITS:1 COG:BS_yneN KEGG:ns NR:ns ## COG: BS_yneN COG0526 # Protein_GI_number: 16078864 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 46 151 58 165 170 72 30.0 3e-13 MRRLSCLIALLACLVLALPAFAAPQKADAISGISSEKVLDIVKANKGKLVFINFFATWCP PCREEIPELISIRKAYPADKVVIVGISLDSDRGKLESFAEAMNFNYPVYLAEPDVPALFG ISSIPFNLLYDKTGKLVAGGPGMVGKEDLEEAFSGLLN >gi|316922304|gb|ADCP01000123.1| GENE 24 32352 - 32810 624 152 aa, chain + ## HITS:1 COG:aq_1359 KEGG:ns NR:ns ## COG: aq_1359 COG1246 # Protein_GI_number: 15606556 # Func_class: E Amino acid transport and metabolism # Function: N-acetylglutamate synthase and related acetyltransferases # Organism: Aquifex aeolicus # 5 152 15 162 181 162 49.0 2e-40 MNLSVRKARMGDIRAIHALLMTSSADGLLLPRSLTDLYGHLRDFFVVEGDGATVGCGALS IIWENMAEVRSLAVAAHARRKGCGRLIVEACIAEARELDIHRLFALTYQLPFFNALGFSI VEKEVLPQKVWVDCVNCPKFPDCDETAVLLEI >gi|316922304|gb|ADCP01000123.1| GENE 25 32853 - 33383 702 176 aa, chain + ## HITS:1 
COG:DR1376 KEGG:ns NR:ns ## COG: DR1376 COG0634 # Protein_GI_number: 15806393 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Deinococcus radiodurans # 10 173 12 175 176 147 44.0 7e-36 MRITNLKPLISEEAIQTRVKEMAGEISTLYKDEPLVVVCVLKGAFMFFSDLVRHLTCKPE LDFVRLASYGSAAQRSKTITFTKDVEIPLEGKHVLIVEDIVDTGHSMDFLYRQFQARGAR SLRLAVLVDKNERREVPVTSHFVGFTLPSGFVVGYGLDYAESYRELPAIYEAEIEE >gi|316922304|gb|ADCP01000123.1| GENE 26 33401 - 34141 878 246 aa, chain + ## HITS:1 COG:no KEGG:LI0615 NR:ns ## KEGG: LI0615 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 242 10 248 248 266 58.0 6e-70 MIVTCPNCSTKFNLPETQAAPGAKLRCSVCKHVFQLSDGVKPAEIRMEPDLSLSSPSLSM PPKKKGGIWGWVLTLLILCAVAGGTWWAWTYTPLFDTVKEMIAPPKKQDPVELVKNIALR GVRQYNISNEKLGNISVVEGKVVNGFNQPRELIRIEASLYDSAGNALVSKQQLAGTSLSL FQLQVLGEQDIEQALANKIDIMATNTNVLPGGEVPFMVVFYSPPDNAAEFGVKVIDARIP PEKEKK >gi|316922304|gb|ADCP01000123.1| GENE 27 34431 - 34760 346 109 aa, chain + ## HITS:1 COG:MA3162 KEGG:ns NR:ns ## COG: MA3162 COG5561 # Protein_GI_number: 20091980 # Func_class: S Function unknown # Function: Predicted metal-binding protein # Organism: Methanosarcina acetivorans str.C2A # 3 109 12 116 117 58 33.0 2e-09 MKKVGIIRCQQTEDLCPGTTDFVTAAKGKGSFEAFGECQIVGFVSCGGCPGKRAVARAKM LVDRGAEAVALASCIGKGSPIGFPCPNREQMVQAIRAKIGDIPLLEYTH >gi|316922304|gb|ADCP01000123.1| GENE 28 34879 - 35172 293 97 aa, chain + ## HITS:1 COG:CAC2476 KEGG:ns NR:ns ## COG: CAC2476 COG3070 # Protein_GI_number: 15895741 # Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Clostridium acetobutylicum # 1 91 8 98 114 88 46.0 3e-18 MQSVAARLEDAGTITYRKMFGEYCVYCDGKLFGCVCDDRLFVKITEPGKAFMPDGKLELP YEGGKPMLRVERLEDRAFLKKLVRMTCEALPEPKKKR >gi|316922304|gb|ADCP01000123.1| GENE 29 35357 - 38584 3200 1075 aa, chain - ## HITS:1 COG:MK0424 KEGG:ns NR:ns ## COG: MK0424 COG1379 # Protein_GI_number: 20093862 # Func_class: S Function unknown # 
Function: Uncharacterized conserved protein # Organism: Methanopyrus kandleri AV19 # 4 419 5 389 417 194 33.0 8e-49 MKYFRADLHVHSRFSRATSGRLNIRNLAAWSMIKGLSVMSTGDFTHPAWRDELRRDLVYD DNSGLYRVREKTPLETEIPGFSRPDGVSEPQFLIQAEISSIYKKDGSVRKVHNVVIMPSL ESADKLSNKLAAIGNITSDGRPILGLDCEHLLEMVLEADERGVLIPAHIWTPWFSLFGSK SGFDALEDCFGSLSSHIFALETGLSSDPDMNRLWSALDRYALVSNSDAHSGENLGREANL FEGTPSYDGIFDALRRAARDEEGSGCIYRGTVEFFPEEGKYHLDGHRACNVVLEPEESMK IGNICPVCGKPLTVGVLHRVMALADRKAPVMPKHDPGFVSLFPLPEMLGELLSVGPKSRK VQERQSELVRLFGSEMDILHTVPESDLRQHWDALGEAVARMRRGDVIKEAGFDGEYGVVK VFSEEERKQFVTGRYRSSSLLDALPEAQKPGRKPKAAPSKEPASKQVSLFAAMTPPAPQT TPDPKAFPYSEAQQKAIQAGPNPVLVLAGPGSGKTRTLVGRVQRLLKDGIPAQRILAVTF TRRAAAELRERLERALGEGVPLPQADTLHALALSHWGTAEEALPAVLPEETARSVFAKAN ALSAADARRAWDELSLARERLDTLTDEQSALQERYHAAKRERNLADYTDLLEHWLARLQA EPERQWTNVLVDEIQDLSPLQVALVRALMPGDGHGFFGIGDPDQAIYGFRGAHPDVQSAL REAWPSLESVTLAASHRSAAGILTSASALLGPSSACGKLIPTRNMEACLHLFSAPDARKE AAWVADQVALLLGGTSHTLEDSRRRDATLATPCSPGEVAVLVRMKALIPPLKTALERRGV PCSAPETAPFWGDPSAALILELAGLRFQRPFAAPEGPGSLDVDVPQLAASLPEQLWQAGP SALASHLDAAKLLAPLFTESAACNALLRAFREQGKTWEGLLDWVCLRQDLDLVREQAEQV QIMTLHAAKGLEFRAVFLPALEEGILPFPGADALLHGDPGGAIDPDALAEERRLFYVGLT RAQDAVFVSHAAQRHLYGKALELPPSRFLSALPELFRHTRLVRHVQTKATQMKLF >gi|316922304|gb|ADCP01000123.1| GENE 30 38594 - 38881 193 95 aa, chain - ## HITS:1 COG:no KEGG:Dde_3501 NR:ns ## KEGG: Dde_3501 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 2 94 165 260 275 67 40.0 1e-10 MFASAASSLLEEDGRFACIISPDRLQDMLAALNGAGLTPYLLQYIHKQKTSPATFVLIEA RKGAHAKPSVAEPIALYGPERFFTLESLAFCPFLR >gi|316922304|gb|ADCP01000123.1| GENE 31 39511 - 39819 510 102 aa, chain - ## HITS:1 COG:no KEGG:DvMF_2244 NR:ns ## KEGG: DvMF_2244 # Name: not_defined # Def: stress responsive alpha-beta barrel domain protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 101 1 100 101 123 64.0 2e-27 MVRHIVWWTLKPEAEGRTAAENAKLIKQRLEALMGQIPSLKSLEVSYDFLPTCTMPVNVI LMTTHDDAEGLKAYAEHPAHVAVGKELIKLVTESRQAIDYTF 
>gi|316922304|gb|ADCP01000123.1| GENE 32 39849 - 42374 1905 841 aa, chain - ## HITS:1 COG:aq_672 KEGG:ns NR:ns ## COG: aq_672 COG0068 # Protein_GI_number: 15606085 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Aquifex aeolicus # 2 836 4 742 746 562 39.0 1e-159 MKRHGYIIGGQVQGVGFRPFVYRAATRLRLTGHVGNTSDGVRVEVQGTDEALAAFAHALE HELPPLARITSLRREELPLADGEAAFAIAHSEGHHGHAVLVSPDVATCGNCLADMRDPAN RRYRYPFTNCTDCGPRYTITHSIPYDRATTSMSCFPLCPDCAAEYNDPLDRRFHAQPNAC PVCGPKVWLASTPEGDPRIPGPELAEGQRAIERTAAALRAGNIVAVKGLGGFHLVCDATS SEAIALLRERKCRPHKSLAIMVPDLETARRVACVSDAEAAVLASSERPIVVLASRGALPP AIAPDVGSIGVMLPYTPLHHLLLEAFAALSPLPALVMTSGNAGGEPISLGNREALARLRH IADLFLFHNRDILIRADDSVVRVESSPGECFSTDNNPQYYPQKNSIPSASSPSRPNGTSA GIPSPSAKNLEGWGAGGGNLSSENVLSRESPPLHFFRRARGFVPRPLELPRDAGRCVLAT GGELKNTLCLTRGKDAFLSQHVGDLKNLETFEFFQNMAEHLGGLLEVKPEAIVCDLHPDY LSTGYALDSGLPVLRLQHHFAHVYGVLAENGHDAPALGLALDGTGYGTDGTIWGGELLFV DPKSAGAERYGGRIGRLAPFPLPGGEAAIREPWRITSGFMGLPEAEGLEGFAEDMDALFG VDGKHAIHDAILEMVRRKATPMTSSCGRLFDAVSALLGLCPSITYEGQAAIRLEAHQDLS VTMAVPFSIKERGGLLELDTAAFFLHLLELRRTGVSVPDLARRFHLGLAEGLADLALSGA RRCGVRTVALSGGVFHNRTIALLLPEALARRGLVPLTHHALPPGDGGLSYGQAAWASHVL R >gi|316922304|gb|ADCP01000123.1| GENE 33 42616 - 44310 1923 564 aa, chain + ## HITS:1 COG:MTH657 KEGG:ns NR:ns ## COG: MTH657 COG0318 # Protein_GI_number: 15678684 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Methanothermobacter thermautotrophicus # 8 541 5 538 548 682 59.0 0 MSDFTVQELTLGQLLENTAARFPDTDAVIHADRELRQTWSEFSACVDDMARGLMAIGVQK GEKVAVWATNVPHWVTLMFATARIGAILVTVNTNYRETELRYLLQQSECENLFIIDSVRD HDFIASTYEVLPELRRQRRGMLEVEGLPHLKRVFFLGTDKHRGMYSLPELFSLAGDVSDE EYESRKASISPYDVVNMQYTSGTTGFPKGVMLTHVNIVNNGYWIGYHQKFSSRDRVCLPV PLFHCFGCVLGVMACVNHGATMVLLDAFSPVQVMTAVEQEKCTALYGVPTMFQAILDHRV FSRFDFSSLRTGIMAGSVCPEPLMRRVMDQMCMSEITICYGLTEGSPVMTQTRTDDTVER 
RVKTVGRSMPGIEVTIRDPETNEEVPRNAVGEVCCRGYNVMQGYYNMPKDTAEAIDEEGW LHSGDLGIMDEDGYVSISGRIKDMIIRGGENVYPREVEEFLLKMDGVADVQVVAVPSRRY GEEVGAFLIPKEGADVAPEDVRDYCRGKIAWYKIPRYVAVVEGFPLTASGKIQKYKLREM AAAYFPEAMRERNGLPSRGRRGTA >gi|316922304|gb|ADCP01000123.1| GENE 34 44363 - 45079 769 238 aa, chain + ## HITS:1 COG:ECs3697 KEGG:ns NR:ns ## COG: ECs3697 COG1794 # Protein_GI_number: 15832951 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Aspartate racemase # Organism: Escherichia coli O157:H7 # 1 228 1 229 230 269 57.0 4e-72 MKTIGLIGGMSWESTVTYYQIVNRLVQERLGGFHSAKCILHSVDFHEIEACQRAGDWERS GELLAEAARGLERAGADCVLICTNTMHKVADAVQAATSLPLLHIAELTADELEAAGVSTV GLLGTRYTMEQDFYTRKLVERGITVLIPEEDGRAFVNRVIFDELCLGTVSESSRRAFAAI IGGLTARGAQGVILGCTEIGLLVRPGDTSARLFDTTEIHARRAALYALGEAVPRDGGE >gi|316922304|gb|ADCP01000123.1| GENE 35 45097 - 45633 535 178 aa, chain + ## HITS:1 COG:MA3237 KEGG:ns NR:ns ## COG: MA3237 COG0703 # Protein_GI_number: 20092053 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Methanosarcina acetivorans str.C2A # 12 149 9 147 175 103 38.0 2e-22 MKVPGLTEDKCVTLIGMAGAGKSTVGRAVAERLGWAYVDTDHLIESTYGARLQDVTDALD KERFLDVEARVIQSLRMQRAVLATGGSVVYRPEAMRYLTSLGPVVFLDVPLPLILERISR KPDRGLAIAPGQTVEDLFREREALYRQWATCRVAAGDIDVSETANAVFDAIAGCGQAL >gi|316922304|gb|ADCP01000123.1| GENE 36 45630 - 46454 935 274 aa, chain + ## HITS:1 COG:VCA1038 KEGG:ns NR:ns ## COG: VCA1038 COG0765 # Protein_GI_number: 15601789 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Vibrio cholerae # 30 274 18 264 266 178 45.0 1e-44 MSERACGVRLHPAARLLRWMGRHAVGVCFLLIGAWLFRAVLAGADGISYDWQWYRVWRYL GRWTDGHFIPGPLLDGLGMTVRIALFGLALAVAAGLGAALLRLSQWPVARGMAHVYVGCL RNTPLLLQLFFVYFLFAPAIGVGPFGAAVLALGLFEGAYMAELFRAGLQSVPRAQWEAGI SLGLGVWNTLRLVILPQAVRRMLPPLTSQLVSLIKDTSLVSAIAVADLTMQAQVLIADTF LAFEIWLIVAALYLALTLCVALPARWMERRYAWR >gi|316922304|gb|ADCP01000123.1| GENE 37 46593 - 47399 1008 268 aa, chain + ## HITS:1 
COG:VCA1039 KEGG:ns NR:ns ## COG: VCA1039 COG0834 # Protein_GI_number: 15601790 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Vibrio cholerae # 13 267 24 274 278 230 45.0 2e-60 MKKFFAVLMLAAALILPGLAQAKDAQTSMIDDVVKRGVLRVGFSSFVPWAMQDKNGEFVG FEIDVAKRLAKDLGVELQLVPTKWAGIIPALMAGKFDVIIGSMSVTPERNLKANFTVPYD YATIEVMANKEKTKGMKFPEDFNKPEVVVALRTGSTPVPVAKKVLPNATFRLFDDEAPAV QDVLSGRADVMFSSAPLPAFEVLRNPDKLYQPSTDAFYRQPVGMVIRKGDPDSLNVLDNW IRTVEAEGWLPERRHFWFKTTDWEAQLR >gi|316922304|gb|ADCP01000123.1| GENE 38 47396 - 48334 890 312 aa, chain + ## HITS:1 COG:VCA1040 KEGG:ns NR:ns ## COG: VCA1040 COG0765 # Protein_GI_number: 15601791 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Vibrio cholerae # 20 307 4 291 298 153 35.0 4e-37 MSSGGSRKALRREWSPYAEYGRQGRLFPTFGRVDAVLVVLLLLGACFFLWRSETIAAYRW SWPELADFLVRRTPGGYEPGLLIRGLIVTLRLGVWSMALALLIGGVLGALSAGQRGVAAL PIQLFVNSVRNIPPLVLLFLLYFFAGNLLPVSDMEQALRSMPPFVRDAVAWGFAPEGQLD RMVAAVLTLGVYEGAYVTEIVRGGIEGVPHGQWEASAALGFSRPQQLRLVIFPQAVRAIL PPLVGQVITTFKDSALASLISLPDLTFQALEVMAISRMTFEVWISAGAIYLLLGVLCARY GRWLETRETWKA >gi|316922304|gb|ADCP01000123.1| GENE 39 48586 - 49137 408 183 aa, chain + ## HITS:1 COG:no KEGG:LI0734 NR:ns ## KEGG: LI0734 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 172 1 156 160 140 46.0 2e-32 MKWFTHKAVAVAGALAAGAHPAALFAVMIGSVLPDMVDTAMARGDKKVWRRIHRQTSHWF GWYLVIIILGFLIPTQRIVLDLLRATHITFPGVSPSALAQVSGIGNDLLVWVGIGGLIHV LLDALTPMRVPLYPFGGSKRFGINLVSTGTWRETIFLFVALGAIALQFDQARAVFEEAVR TIL >gi|316922304|gb|ADCP01000123.1| GENE 40 49365 - 50069 240 234 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 225 1 230 245 97 26 2e-19 MLELRNVNTFYGNIQALRDVNIRINEGEIVTLIGANGAGKSTTLMTICGGTPPRSGEVLF 
RGHPIHTLRPNKIVALGISQVPEGRLIFPDLTVQENLDLGAFLRNDKDNIKKDLDYIFDL FPILAQRRAQTGGTLSGGEQQMLAISRAIMARPTLLLLDEPSLGLAPIIIQQIFSIIEKI NADGTTVFLVEQNANQALRIANRAYVMENGRIVMEDSADKLLTDPAVQKAYLGM >gi|316922304|gb|ADCP01000123.1| GENE 41 50062 - 50868 260 268 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 5 256 1 231 245 104 26 1e-21 MNTPVLEVTNLSKNFGGLRALNEVDLTINRQEIVALIGPNGAGKTTFFNCITGIYLPTEG TVQLHLPERNGKPASSTTLNGMKPNEITSLGMARTFQNIRLFPTMTVLENVMIGRHCRTS AGIAGAVFQDSRTKAEEQATIDRSYELLREVGLNMHYQDEARNLPYGAQRRLEIARALAT DPAVLLLDEPAAGMNPQETLELKSLILHLRKEFDLSILLIEHDMGMVMSLSDRIYVMEYG SRIAMGTPQEVRENPRVIKAYLGESDHA >gi|316922304|gb|ADCP01000123.1| GENE 42 50865 - 52109 1903 414 aa, chain - ## HITS:1 COG:YPO3806 KEGG:ns NR:ns ## COG: YPO3806 COG4177 # Protein_GI_number: 16123940 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Yersinia pestis # 101 396 104 419 428 289 50.0 6e-78 MNMILRALMVSVWFMALTFPLVVLKVDALNDTILFRWENMASIGVGAFVLSIFWRWSMER KARGAVSTGPSLFARALEAVRQPKPRYTCLGLIAVLMLVMPYISSMYQINIMISALIYIM LGLGLNIVVGLAGQLVLGYAAFYAVGAYTYGLLNTYFGLGFWAVLPVGGIMAALFGLMLG FPVLRLKGDYLAIVTLGFGEIIRLVLENWTSVTKGSFGLSNLSRPSLFGMEMGVTEATNY IYYIVLAAVVVTIIVVGRLKNSRIGLALQALREDEIACEAMGIDLTKVKLSAFALGSCWA GFAGVIFAAKTTFINPASFTFMESAMVLSMVVLGGMGSVLGVTIAAAILVLAPEYLRAFS EYRMLLFGAVMVIMMIYRPQGLISGEKRTYKITNKDKIDSSKSAFDVPATGGQA >gi|316922304|gb|ADCP01000123.1| GENE 43 52120 - 53028 1361 302 aa, chain - ## HITS:1 COG:YPO3807 KEGG:ns NR:ns ## COG: YPO3807 COG0559 # Protein_GI_number: 16123941 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Yersinia pestis # 5 302 7 308 308 268 52.0 1e-71 MDLQYFVELFCGGLTRGSIYALIALGYTMVYGIIELINFAHGEIYMIGAFTALIVASALG IMGFPAAGIIVIAALAAVIWCAAYGYTIEKIAYRPLRGAPRLSPLISAIGVSIFLQNYVS 
LAQTSDFVAFPQLITDFEFLDPISHIIGTSDFLIIVTSLVTMVALTLFIQYTKMGKAMRA TAQNRKMAMLLGIDADRIISLTFIIGSSLAAIGGVLIANHVGQVNFFIGFIAGIKAFTAA VLGGIGSIPGAMVGGLVLGLCESFATGYISSAYEDALAFALLVLILIFRPAGILGKPKVQ KV >gi|316922304|gb|ADCP01000123.1| GENE 44 53254 - 54369 1717 371 aa, chain - ## HITS:1 COG:PA1074 KEGG:ns NR:ns ## COG: PA1074 COG0683 # Protein_GI_number: 15596271 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Pseudomonas aeruginosa # 10 371 12 373 373 235 37.0 1e-61 MARAWMKAALAGMAIMAMTAPVHAADTIKIGVPGAHSGDLVSYGMPSLNAARIVADEYNA KGGILGKQIEIVAEDDQCKPELGTNAATKVLSEGAVAVMGSICSGATKAALPIYNDAKIV SISPSATTPELTQKGDHPYFFRTIASDDVSGHLAGSFAKDKLGLKKVALLHDKGDYGRGF VNYAREMLEKGGVQIVLEEGITPGAVDYGAVIQKIRKAGADGIIFGGYHPEASKLVQQLA KKKIDIPFIGPDGVKDEQFIKVAGKNAEGVYATGPTDVSKLPMNIQAHENHKKKYGTEPG AFYDNAYSATVALLEAIKAAGSTDSQKIMEALRTNMVDTPLGKIKFDAKGDAEGIGFSIY EVKDGKFVELK >gi|316922304|gb|ADCP01000123.1| GENE 45 54634 - 55524 622 296 aa, chain + ## HITS:1 COG:BH0061 KEGG:ns NR:ns ## COG: BH0061 COG1947 # Protein_GI_number: 15612624 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Bacillus halodurans # 10 285 4 270 287 118 33.0 2e-26 MDDGTMETAALRIGCKVNLTLRITGVRPNGWHELDTVFLPLAEPSDTLRLALRPGGGLAL HCAEPGIDPENNTLTKAYRLFAEASGFRPGVEAVLEKGIPHGAGLGGGSADAAALLGWLN ARAPEPLPLPELVGLAARIGADVPFFLYNVPCRASGIGERLVPCPEWLDAAGVAGAGLVL LCPRERVSTPWAYAAWDEWNRPLTEARNGAINRASQKPLDGASSIHWLENSFEPPVFEAF PRLRRLREQLLRQGAFAAVMSGSGSSLFGLFRDAGTAFRVAEGFREKDVAAYSHAL >gi|316922304|gb|ADCP01000123.1| GENE 46 55677 - 56612 1240 311 aa, chain + ## HITS:1 COG:alr4670 KEGG:ns NR:ns ## COG: alr4670 COG0462 # Protein_GI_number: 17232162 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Nostoc sp. 
PCC 7120 # 5 311 22 329 330 329 55.0 4e-90 MYGDLKILTGTSNAPLAQAICEHLGCQLTPALCEKFSDGESRIEISSSVRGDDVFVIQPT CAPVNFNLMQLCLMLDALKRASAGRITAVIPYYGYARQDRKVSPRAPISAKLVADFLSTA GANRVVTVDLHAGQIQGFFNCPVDNLFASQVLLEPFMSMKGEIVVVSPDAGGVERARSFA KRIEAPLAIIDKRRDRPNQATATHVIGDVDGKIAVLVDDIIDTAGTICAGAEVLLREGAR EVYACATHGVLSGPALERLNNSVFTKVVVTDTIPSGERLDVCSKLQVVSVASLLAKSIHN IHTGSSVSVLF >gi|316922304|gb|ADCP01000123.1| GENE 47 56854 - 57432 659 192 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|94987179|ref|YP_595112.1| 50S ribosomal protein L25/general stress protein Ctc [Lawsonia intracellularis PHE/MN1-00] # 1 192 1 195 197 258 61 5e-68 MSDLKVLSVQKRAGLGKGANRRARQEELIPGVFYTAKGENLPVQMSVRSFAKLFSQVGRT TVFNLEIEGDGTHPVLIWATQRDPITSRFTHIDFYGVDLDKPVKITVPVEFTGVARGTKV GGKLETYREQIQLMAKPLDMPAKVTIDISGMDVGTTIQIADVKLPEGVKAAYDNNYAIVS VLMPGGDDAAAE >gi|316922304|gb|ADCP01000123.1| GENE 48 57651 - 58247 581 198 aa, chain + ## HITS:1 COG:NMB0795 KEGG:ns NR:ns ## COG: NMB0795 COG0193 # Protein_GI_number: 15676693 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Neisseria meningitidis MC58 # 6 172 7 173 192 138 54.0 6e-33 MAVAGLIVGLGNPGKQYESTRHNMGFLFVDELLREAKNVSSMSGDKFRCELWKAALPGSP DQWLIAKPQTFMNLSGECVQPLAAWHRLLPENILVVHDELDIAPGRMKFKKGGGNAGHNG LKSITQRLGTPDFYRLRLGIGRSPHGGEDTVNWVLGRLSPEAQDAFRKQLPAAFEVVRLF AEGNIPAATHAANAFVIE Prediction of potential genes in microbial genomes Time: Fri May 13 04:05:41 2011 Seq name: gi|316922299|gb|ADCP01000124.1| Bilophila wadsworthia 3_1_6 cont1.124, whole genome shotgun sequence Length of sequence - 5044 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 175 - 223 8.1 1 1 Op 1 . - CDS 322 - 1323 453 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 2 1 Op 2 . - CDS 1327 - 2211 1312 ## COG1651 Protein-disulfide isomerase - Prom 2232 - 2291 4.1 - Term 2275 - 2323 13.0 3 2 Tu 1 . 
- CDS 2360 - 4081 2615 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases - Prom 4103 - 4162 2.0 4 3 Tu 1 . - CDS 4435 - 4791 65 ## + Prom 4523 - 4582 5.4 5 4 Tu 1 . + CDS 4606 - 4968 345 ## HRM2_43550 putative two-component system sensor protein + Term 4995 - 5034 1.3 Predicted protein(s) >gi|316922299|gb|ADCP01000124.1| GENE 1 322 - 1323 453 333 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 23 324 19 309 317 179 36 5e-45 MIITRNPDALDLPLPLDLSQGTSVTVGNFDGVHLGHQALLAMTCERAAATGTLPVAMTFN PHPLEVLTPYAPPRLTTPQARLALLEASGMALVMLVVFTSEFAAMSPETFVRRILLGTLN MRELLLGYDFTLGKGRAGTPEVLAQLGKAEGFNVDRMDALSVHDEVVSSTRIRELLHQGQ VWEASSLLGRLYSVQGEVIHGKNRGGRLLGFPTANLAPTPTMLPKPGVYVTLATPSESAG SGLPFVFDPAHPTPNTYPAVTNIGYNPTFGPGALTVETHLLDFDADLYGKHLEVAFVERL RGEVTFTGPDSLVAQIKKDADQARKILETVTGE >gi|316922299|gb|ADCP01000124.1| GENE 2 1327 - 2211 1312 294 aa, chain - ## HITS:1 COG:AF1354 KEGG:ns NR:ns ## COG: AF1354 COG1651 # Protein_GI_number: 11498950 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Protein-disulfide isomerase # Organism: Archaeoglobus fulgidus # 116 289 126 304 305 90 34.0 5e-18 MHALSSNPAIKETSMKRKILWAFLIIWLILTAYLLFLSNKWASAATPSYAASSAVTTDNF AELLRETLQKNPDLLLSVLRENSEAVLDIAQEGSNQRRHKSLIAQWKAELNQPKEVDIKD RPFRGAADAPVTIVAYSDFTCPYCQQAAGTMEKVLKENLGKIKYVFKHFPLETTGAARLA AEYHVAAARQDPELAWKFYDLLFARRADVLKDGEPAIVNAAKDAGLNMKKLAADVKRKDV RAEVDADIAEGQRIGVQGTPYFLINNLVARGALSSDLFKEAINMALQAAPSGQK >gi|316922299|gb|ADCP01000124.1| GENE 3 2360 - 4081 2615 573 aa, chain - ## HITS:1 COG:AF0400_1 KEGG:ns NR:ns ## COG: AF0400_1 COG0446 # Protein_GI_number: 11498012 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Archaeoglobus fulgidus # 4 466 2 454 455 322 43.0 1e-87 MSRNVIVIGGVALGPKAASRLKRLDPTANVTMIDENINISFGGCGIPYYVSGEINSLDAL RSTTYGTVRTPEYFEHKGVHTLNQTRVTAIDRAAKTVTAKNLVTGEEKTLPYDRLIIATG 
STPKVPPVEGRDLKNVGGANNLEDADALRKACASGSVQNAVVIGAGFIGMEIAVALADMW DIKTSVVEFMPQAMPGVMSASLSDMVRHDLEEHNVDVYTGEKVQRLEGENGVVARVVTDK RTLDADLVIFATGFAPNTALARAAGLDLEERTGAILVNEYMQTSDPDIYAGGDCVAIPNL ITGKPFVLALGSLANRQGRVIGTNAASDDGKAAAFHGAVGTWCVKIFKMSACGTGLTIER AKAFGFDAISASLEQLDRAHFYPEKHMMTLELVVERGTRRVLGIQGVCEDGDALKARIDA VATMLQFGKPTIDDLANAEIAYAPPFASAMDAVNAVANVADNILSGQLKPISSKEFGELW KDRANNNVFFADARPAVAGNATAAKYPGEWHAIALEDIEAKFDSIPKDRPVALVCNTGLR SYEVMLYLHNHGVTDVVNALGGMQALIKRGDNM >gi|316922299|gb|ADCP01000124.1| GENE 4 4435 - 4791 65 118 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRLAITGTQGITNPPCRIPIRRGEFRQQFPVELFVSHQALDAGFPINAGNNAPRLSFPI RHISPSVKYCGKIMPNMPSLCNVSFLGLKPHGGADQVGTRRKSGHTPRIRVHSSLRFF >gi|316922299|gb|ADCP01000124.1| GENE 5 4606 - 4968 345 120 aa, chain + ## HITS:1 COG:no KEGG:HRM2_43550 NR:ns ## KEGG: HRM2_43550 # Name: not_defined # Def: putative two-component system sensor protein # Organism: D.autotrophicum # Pathway: not_defined # 10 98 1063 1151 1272 71 41.0 8e-12 MSDRETQPGSIIPGVDWESGVKRLMGNEQLYRKLLAKFAASYGDAAGRIRDALSAGDRQT AHNELHTLKGVTANLSLAPLADLVLAAEQAVKYDDTEHENECIDAMSRELDAVIKALSKL Prediction of potential genes in microbial genomes Time: Fri May 13 04:06:12 2011 Seq name: gi|316922251|gb|ADCP01000125.1| Bilophila wadsworthia 3_1_6 cont1.125, whole genome shotgun sequence Length of sequence - 52364 bp Number of predicted genes - 50, with homology - 41 Number of transcription units - 31, operones - 10 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 40 - 1368 1264 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 2 1 Op 2 . + CDS 1464 - 1760 300 ## + Term 1802 - 1845 9.0 + Prom 1989 - 2048 2.9 3 2 Tu 1 . + CDS 2100 - 3503 1404 ## COG1858 Cytochrome c peroxidase + Term 3525 - 3561 7.0 + Prom 3995 - 4054 4.7 4 3 Tu 1 . + CDS 4149 - 4640 417 ## DMR_32200 hypothetical protein + Term 4648 - 4684 10.3 5 4 Tu 1 . 
+ CDS 4828 - 5205 258 ## COG0607 Rhodanese-related sulfurtransferase + Term 5238 - 5278 2.1 + Prom 5376 - 5435 3.9 6 5 Tu 1 . + CDS 5588 - 6142 506 ## COG2199 FOG: GGDEF domain + Term 6185 - 6222 1.1 - Term 6273 - 6316 7.6 7 6 Tu 1 . - CDS 6385 - 7551 1378 ## EcSMS35_3514 VWA domain-containing protein - Prom 7624 - 7683 1.6 8 7 Tu 1 . - CDS 7699 - 10113 2357 ## EcSMS35_3513 hypothetical protein - Term 10322 - 10368 1.4 9 8 Op 1 . - CDS 10477 - 11571 1383 ## COG0714 MoxR-like ATPases 10 8 Op 2 . - CDS 11581 - 13239 1668 ## EcSMS35_3511 hypothetical protein 11 8 Op 3 . - CDS 13236 - 14567 1320 ## EcSMS35_3510 SWIM zinc finger domain-containing protein - Prom 14806 - 14865 1.7 - Term 14856 - 14884 -0.2 12 9 Tu 1 . - CDS 14917 - 16080 1366 ## COG0477 Permeases of the major facilitator superfamily + Prom 16402 - 16461 2.5 13 10 Tu 1 . + CDS 16640 - 18337 1418 ## COG2202 FOG: PAS/PAC domain + Term 18478 - 18513 4.2 + Prom 19003 - 19062 4.5 14 11 Op 1 10/0.000 + CDS 19105 - 21450 1592 ## COG0642 Signal transduction histidine kinase + Term 21538 - 21577 1.9 + Prom 21519 - 21578 1.6 15 11 Op 2 . + CDS 21628 - 24351 1483 ## COG0642 Signal transduction histidine kinase + Term 24399 - 24449 4.5 16 12 Tu 1 . - CDS 24473 - 24682 109 ## COG0582 Integrase - Prom 24795 - 24854 2.3 - TRNA 24864 - 24948 69.2 # Leu GAG 0 0 + Prom 25075 - 25134 2.0 17 13 Tu 1 . + CDS 25317 - 26255 632 ## COG0583 Transcriptional regulator 18 14 Tu 1 . + CDS 26424 - 27398 1123 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 27399 - 27437 1.1 - Term 27435 - 27470 6.0 19 15 Op 1 9/0.000 - CDS 27491 - 28513 1522 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component 20 15 Op 2 . - CDS 28581 - 29888 670 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 21 15 Op 3 . - CDS 29885 - 30430 699 ## DP0596 DctQ (C4-dicarboxylate permease, small subunit) - Term 30616 - 30657 6.2 22 16 Tu 1 . 
- CDS 30675 - 31682 1672 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component - Prom 31702 - 31761 2.1 - Term 32432 - 32487 16.5 23 17 Op 1 . - CDS 32564 - 32836 273 ## Ddes_1035 hypothetical protein 24 17 Op 2 . - CDS 32889 - 33059 262 ## gi|302863290|gb|EFL86222.1| conserved hypothetical protein 25 17 Op 3 . - CDS 33092 - 34528 2166 ## Ddes_1033 hypothetical protein 26 18 Tu 1 . - CDS 34720 - 35094 269 ## - Prom 35126 - 35185 3.0 27 19 Op 1 21/0.000 - CDS 35219 - 36226 958 ## COG0306 Phosphate/sulphate permeases 28 19 Op 2 . - CDS 36219 - 36848 672 ## COG1392 Phosphate transport regulator (distant homolog of PhoU) - Prom 37081 - 37140 4.5 - Term 37389 - 37427 5.1 29 20 Op 1 3/0.250 - CDS 37610 - 38344 729 ## COG3701 Type IV secretory pathway, TrbF components 30 20 Op 2 . - CDS 38354 - 39472 1288 ## COG3846 Type IV secretory pathway, TrbL components 31 21 Tu 1 . - CDS 39613 - 39996 343 ## Desal_1705 P-type conjugative transfer protein TrbJ - Term 40050 - 40076 -0.6 32 22 Op 1 . - CDS 40128 - 40415 355 ## Desal_1705 P-type conjugative transfer protein TrbJ 33 22 Op 2 1/0.500 - CDS 40431 - 40823 507 ## COG3451 Type IV secretory pathway, VirB4 components 34 22 Op 3 3/0.250 - CDS 40836 - 42845 1446 ## COG3451 Type IV secretory pathway, VirB4 components 35 22 Op 4 . - CDS 42839 - 43135 206 ## COG5268 Type IV secretory pathway, TrbD component 36 22 Op 5 . - CDS 43132 - 43476 339 ## DPPB13 conjugal transfer protein TrbC 37 22 Op 6 . - CDS 43442 - 43657 69 ## 38 22 Op 7 . - CDS 43676 - 44653 764 ## COG4962 Flp pilus assembly protein, ATPase CpaF 39 22 Op 8 . - CDS 44653 - 45006 284 ## gi|302861831|gb|EFL84766.1| putative glycosyltransferase 40 23 Op 1 . - CDS 45126 - 45854 951 ## Desal_1711 conjugal transfer protein TraL 41 23 Op 2 . - CDS 45932 - 46213 217 ## - Prom 46372 - 46431 3.9 + Prom 46331 - 46390 3.5 42 24 Op 1 . + CDS 46453 - 46752 248 ## 43 24 Op 2 . + CDS 46817 - 47116 145 ## + Prom 47474 - 47533 1.9 44 25 Tu 1 . 
+ CDS 47560 - 48363 -68 ## Rcas_0692 hypothetical protein + Term 48438 - 48471 1.3 + Prom 48367 - 48426 2.6 45 26 Tu 1 . + CDS 48539 - 48673 97 ## + Term 48833 - 48873 -0.5 - Term 48606 - 48643 -0.6 46 27 Tu 1 . - CDS 48670 - 48885 109 ## - Prom 48969 - 49028 3.1 47 28 Tu 1 . - CDS 49347 - 49805 -559 ## - Prom 50004 - 50063 3.0 + Prom 49926 - 49985 2.8 48 29 Tu 1 . + CDS 50023 - 50481 -125 ## Rcas_0691 hypothetical protein + Prom 50794 - 50853 3.9 49 30 Tu 1 . + CDS 51061 - 51588 -271 ## AM1_2410 DNA modification methylase, putative + Term 51796 - 51832 -0.8 50 31 Tu 1 . - CDS 51902 - 52363 -67 ## COG3293 Transposase and inactivated derivatives Predicted protein(s) >gi|316922251|gb|ADCP01000125.1| GENE 1 40 - 1368 1264 442 aa, chain + ## HITS:1 COG:PA5483 KEGG:ns NR:ns ## COG: PA5483 COG2204 # Protein_GI_number: 15600676 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Pseudomonas aeruginosa # 3 441 10 448 449 308 43.0 2e-83 MLRVLVADDDPGIVRTLMLGLKMMGCQPVGVESAEAAIKALEDKSADLLLTDMRMEGMSG VDLVRDAHARWPEMICVVMTAFASYENAVSAIKAGAYDYLPKPFTTEQLEHLIRKVEMLV DLQRENSRLRAGSGDWFDGLTSPKCQALQSVVERIAPSDATVLLSGETGSGKTELARSIH RRSARADKPFVEVTCTAIAENLFESEVFGHVRGAFTGAVRDKAGKFELADGGTLFLDEIG ELSATAQSKLLRFLEDKVIERVGDNKPIRLDVRIIAATNRDLAAMMKSGTFREDLYYRLN VFECTVPPLRERPEDIEPLAAKLLRSASAKYGTPPTLSQAARQALLRYGWPGNVRELRNV MERVALLAAGREVTLADLPPALSAGEGVRSGPDEEPHILTLRELEAEHIRRVLGLGVSME RAAELLGINTVTLWRKRKELGA >gi|316922251|gb|ADCP01000125.1| GENE 2 1464 - 1760 300 98 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGFIAGDSVLDAAYGYGMVCDVIRNPDETYPVICRFWQNGRPVEISYTRDGFNDVSDMIP SLVHAGTGAAKGKPVRLSFEDFDEEYRELFGGAGQKKP >gi|316922251|gb|ADCP01000125.1| GENE 3 2100 - 3503 1404 467 aa, chain + ## HITS:1 COG:ECs4398 KEGG:ns NR:ns ## COG: ECs4398 COG1858 # Protein_GI_number: 15833652 # Func_class: P Inorganic ion transport and metabolism # Function: Cytochrome c peroxidase # Organism: Escherichia coli O157:H7 # 55 
453 59 455 465 426 51.0 1e-119 MRTRSALLVASCVLAASVVTGGALNESVGADKGIPPNHPQLTKFSKVSDIVMDKCMACHS RNYDLPFYANIPGIKEIIEKDFNDGLRAMDLNLELVEAAKDKPIGEATLAKMEWVIVNET MPPAKFTAVHWGSRVSSEDRAAILDWVKASRAAHYATGLAAPRHADEPLQPLPDALPVNA AKVALGEKLFVDKRLSGDNTVACVTCHDFSKAGTDNKRFAEGIRGQFGDINAPTMFNAAF NTKQFWNGRAADLQEQAGGPPMNPIEMGSKDWDEICAKLAQDPELTAAFTAVYPDGWNGK NVTDAIAEYEKTLITPNSRFDKWLKGDDKALTAQEIEGYQRFKMYRCSSCHVGKSVGGQS FEYMDLKKAYFEDRGNPLGSDEGLKGFTGKAEDLHRFKVPNLRNVELTAPYLHDGTVTTL DEAVRIMGVYLSGMDIPKGDRDLIVGFLRTLTGEWNGKPLAGEAVKN >gi|316922251|gb|ADCP01000125.1| GENE 4 4149 - 4640 417 163 aa, chain + ## HITS:1 COG:no KEGG:DMR_32200 NR:ns ## KEGG: DMR_32200 # Name: not_defined # Def: hypothetical protein # Organism: D.magneticus # Pathway: not_defined # 1 150 1 132 178 62 30.0 4e-09 MISDAELLKLVMERYPEQPTPFIIEQFTVLKAGLIQANNKLAAMDMEEPVCEQVIIEEPM PEAAEAAPSAAPKKKFTKRSLVVRPEEAIGDDVVQCCLCGRGFQNLTAKHLLSHGISVDE YKKLCGYVPEQKLICKNLLEKLQENVQKAQRSREQKLSGEHLK >gi|316922251|gb|ADCP01000125.1| GENE 5 4828 - 5205 258 125 aa, chain + ## HITS:1 COG:STM2798 KEGG:ns NR:ns ## COG: STM2798 COG0607 # Protein_GI_number: 16766109 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Salmonella typhimurium LT2 # 5 103 6 103 175 67 40.0 9e-12 MLETLSPQEAMTLFRNGGAFFVDVRTEKEFLKRAIPGCPLVPLDKLEREGGLDSRFSGKP VVFFCRSGSRTQNKAELLERSASGKAYQLGGGILAWEQAGFPVWSIRELLPCFAQEPAFI SGVLA >gi|316922251|gb|ADCP01000125.1| GENE 6 5588 - 6142 506 184 aa, chain + ## HITS:1 COG:VC0653_2 KEGG:ns NR:ns ## COG: VC0653_2 COG2199 # Protein_GI_number: 15640673 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Vibrio cholerae # 5 173 85 249 254 95 33.0 7e-20 MAECKQAVLVEKETAHRDCLTGCYDRMTIQHLVDLCFLKSDVRQSHAMFVIDVDNFKRVN DTWGREFGDIVLNDVAAVLRRIFRRSDYIGRIGDDELLVLLRDVEPERLSEKKAQEFYTA LGTLFSARGDERYKPTCSVGIARYPMDGTLFETVFQSADAALHEAKRLGLNRVAFAPGCE GRKR >gi|316922251|gb|ADCP01000125.1| GENE 7 6385 - 7551 1378 388 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3514 NR:ns ## 
KEGG: EcSMS35_3514 # Name: not_defined # Def: VWA domain-containing protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 387 1 386 389 645 82.0 0 MDTAEGERLRRWRLVLGEEAQASCGLAGLSLEDRQMDEALAALYEADSRHSLRGGSGASS PRVARWLGDIRRFFPSSVVQVMQKDAFERLNLHSMLLQPEMLESVQPDVNLVSTLMSLRG IIPEKTKETARLVVRKVVDALMKQLEEPMRTAVSGALNRAVRNRRPRHAEIDWNRTIRAN LKHWQQEYKTIVPETLIGYGRKSQRVQREIILCIDQSGSMASSVVYSSIFGAVMASMPAV KTHLVVFDTAVVDMTEQLDDPVDLLFGVQLGGGTDINRAVGYCQSLIRAPRNTILVLISD LYEGGVEKNLLQRANELVQSGVQVITLLALSNEGTPFYDKALAAKLAGLGIPSFACTPDL FPGMMAAAIRKEDVNLWAAKQGIVVPKG >gi|316922251|gb|ADCP01000125.1| GENE 8 7699 - 10113 2357 804 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3513 NR:ns ## KEGG: EcSMS35_3513 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 800 1 739 749 825 56.0 0 MPDQTVRLFGIRHHGPGCARSLLRALEAMQPDCLLIEGPPDGESVLPFVLESGMCPPVAL LVYAPDDSRRAVFYPFAEFSPEWRALRYGLGQSLPVRFMDLPIAHQFGLDKAFEDECRAK EEALRDAEGRTKTDAAEGTEAPASGAQAPENTATNTLAPEQPEGGDTGDTDGNAGGEASA QEDVYGDPLDWLGRAAGYGDGEAWWNHMVEERIDGLELFDAIREAMTALRAEAPRRERGE RETRREALREAYMRKTLRQAKKEGFQRIAVICGAWHVPALGAETPAKQDNDLLKGLPKMK VAATWTPWTYANLSSRSGYGAGVTSPAWYEHLWRSGKGDRAIGWLTHAARLFREQDMDCS SAHIIEASRLATSLAALRERPRPGLPELYEALQTTVCMGDSAPLRLIERQLIVGDKLGTI PETTPTVPLQRDLEQQQKSLRLKPEAARKVLDLDLRQANDLARSHLLHRLRLLEIGWATP GGSRNAKGTFHELWEMQWVPELPIAVIAASRWGNTILEAATAKAVELSGEADLLRLAELV NDILFADLPDAVGHATRMLEEKAAIANDVGQLLEAIPPLAAIARYGNVRQTDAGMVARVL EGLIPRASIGLPGACTSLDDENAAAMRARIIAAHNAIRLLGNEGLWESWLSALHQTALRD GMVHELLRGMAVRLLFDEQRLPVEEAARLMSLSLSAAAAPASASAWIEGFLNQSALVLLH DDALWGVLANWLDGLNDTHFTNILPMLRRTFSDFSAPERRQLGERAKRAAGKPMQKQAET RWDAERAALPVPLLRRVLGLTAQA >gi|316922251|gb|ADCP01000125.1| GENE 9 10477 - 11571 1383 364 aa, chain - ## HITS:1 COG:yehL KEGG:ns NR:ns ## COG: yehL COG0714 # Protein_GI_number: 16130057 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Escherichia coli K12 # 6 359 27 379 384 221 40.0 2e-57 MSTATNENKLRHHAEQQFAEELEELKKSDARQRPANWELSPWAVCTYLLGGELDNGFTVS 
AKYIGNRRIIETAVATLATDRALLLYGVPGTAKSWVSEHLAAAISGDSTRIIQGTAGTSE EQMRYGWNYAELLSKGPSRAALVESPLMRAMSEGRIARVEELTRIPADVQDTLITILSEK TLPIPELNDEVQAVRGFNLIATANNRDKGVNELSSALKRRFNTVILPVPATEEEEISIVS KRVSEMGRALELPAEPPAMHEVRRVVQIFRELRNGQTEDGKTKLKSPTGTMSTAEAISVL NSGMALAAHFGDGVLHARDVAASLVGAVVKDPVQDDLVWREYLETVVKERSDWKDLYRAC REVD >gi|316922251|gb|ADCP01000125.1| GENE 10 11581 - 13239 1668 552 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3511 NR:ns ## KEGG: EcSMS35_3511 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 6 551 7 528 534 278 36.0 4e-73 MNMPFLTMQALLGTDKRPPELPADDSEPGRLTQAIAATRDGSENAEALRLLRAAGVQAVC GLAGYQPPRADSALPEPCPPEPRASVDKAAVIDVLRQVLSSGSDRMRLEAFRLMLGAGKV LPPALLPQALDLGRRTPSLRGPIALIAGERGRWLGGRNAAWNLFATNAENELDPEVWDNG TLMQREAYLKSLRAQDPAKARERFETASASFDARERAAFTGCLGEGLSAADEAFLETLLE KDRSKEVRQAAASLLVRLPESGYVRRMGERLAACIVVPEARGGLIGRIASAFSAPLPAVE PPESFDPEWKKDLLEEKKPQYEEFGPRAWWLYQIGRCVPLAWWEAHTGLSPEALIDWAQK TDWRNVLLRSWLEAARRERHAAWASALLAHVFKEQIEVSVGKGRFSAFAFIGLLPPPERE AALLERFPCPGSAGGGATFAQQHYKQVMQFGQFAASEWEGDATFSLEGGHSLVRRIRFWA DHSDGIGQISYQTSIALGHIIEATFWLIPVSLLDDLLEGWPHGENGAPFCRSGYDFLTTA IRTRKALHQYFA >gi|316922251|gb|ADCP01000125.1| GENE 11 13236 - 14567 1320 443 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3510 NR:ns ## KEGG: EcSMS35_3510 # Name: not_defined # Def: SWIM zinc finger domain-containing protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 441 1 438 440 408 52.0 1e-112 MELTSESVLTLAPDASSVKAAQALLKPGQWPTLGFNENAVWGECKGSGSKPYQVEADLSG PVFKCTCPSRKFPCKHSLALLLLRVQQEAAFTQGEAPDWVKEWLDAREKRAARQEQKQEN KGQPADPAAAAKREASRLKRMAAGLDDLERWMCDLIRHGLGQLSGTPAWQEEAAARMVDA QLPGIAARLRNLQGVTSSGEDWPGVVLAQFGQLQLLIDAFRHLDALSPEEQADVRAALGL SPDKDEVLATGERLSDLWLVLGVGYAEENRLWRRRVWLRGQNSGRAALLLDFSHGGKHFE QSFVVGSLADMTLTFYPGAVPLRAIATGDPVRAKTAPIPTVSLREALEGMAKAIAAQPWQ WPLPLMVSDGVPCRSDDSWFLQTDAGDRLPLSITDNDGWELLTQGGGRPLSVWGEWDGAR FQPLSAWEPGTSGSPLWREGARP >gi|316922251|gb|ADCP01000125.1| GENE 12 14917 - 16080 1366 387 aa, chain - ## HITS:1 COG:BMEI0267 KEGG:ns NR:ns ## 
COG: BMEI0267 COG0477 # Protein_GI_number: 17986551 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Brucella melitensis # 15 363 22 369 397 159 30.0 1e-38 MPFIPLFQSPKSFYRLAVASFFFLQGLVFASWASRIPDIKQALALNEAQLGGVLFAIPVG QMSAMALSGYLVSHFGSHRMLLLAALLYSGTLMALGFISETWQLFAMLVFFGVAANLHNI SVNTQAVGVERLYRRSIMAAFHGLWSLAGFVGGIVGALFAAFSVSTRVHFSFIFAICMGI VAIMFRLTLPRDRARDTVPGHKRPKGKIDPYVVLLGLIAFGCMASEGTMYDWSAVYYEAI IKPSPELIRLGYIAYMCTMVCGRFMADGLVTRFGVIRILQASGALIAAGLLISVLLPHVA TATFGLALVGFGTASVVPVCYSMAGKSQIMHPSVALAVVSTIGFLGFLLCPPVIGFIAHA SSLRHSFALIAVFGLMTAVLARFLRRN >gi|316922251|gb|ADCP01000125.1| GENE 13 16640 - 18337 1418 565 aa, chain + ## HITS:1 COG:alr2279_1 KEGG:ns NR:ns ## COG: alr2279_1 COG2202 # Protein_GI_number: 17229771 # Func_class: T Signal transduction mechanisms # Function: FOG: PAS/PAC domain # Organism: Nostoc sp. PCC 7120 # 263 383 306 425 425 98 43.0 4e-20 MIQTDPIAFSPSIPLDSTEKLWEWHIPSDRLFFSKGTRLAFGLSDGESPATMAAFFEHIP EGCLQSLCELREGVIGGDRGFLETAYPVENFFVRERLIVLERDAEGRAIRIAGQFEIAPG TMAYLPPVSTVSKGQAQICFWNCSLEKRTLWVDANGSRLLGYAEAMPRVFNLDEWKERLH PEEGTDCRYQLIIEQPLFGDELTDDLRVRKEDGSYVKIWVRGSILSRDADGHAVSMSGTW QPFGFVEERHEQRTPENGSLLSALNAAGDGLWDWNPITDEVYYSPRWLSMLGYTAEQFPG RLETWKEKIHPDDLKKIVEPQRKLADSPKYGDTFECTYRLKRLDGSYAWILGRGYVTHRD AKGRATRIVGLHTNITASQADRDYFEKLAQNDALTGLHSRTYFDIKLANIENKGIRPLCV ISCDVNGLKLANDYMGHLVGDKLLQTVAALCKESVRTSDCVARMGGDEIVILLPTCPREA GEEILAKLRRNFDRHNACSDQMPALVSFGLACAESADIPASAILVEADREMLKEKRSHRK AAHERIKQWIEQNMDVVVSIEDDRY >gi|316922251|gb|ADCP01000125.1| GENE 14 19105 - 21450 1592 781 aa, chain + ## HITS:1 COG:all4496 KEGG:ns NR:ns ## COG: all4496 COG0642 # Protein_GI_number: 17231988 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 235 496 307 572 575 172 37.0 3e-42 MEWNWILEEYDDMVYVSDLNNYELLYVNRAGLDMFHLSREELFSGKKYLCYEIFHGRNTP CPFCTNNYLKDSGFYEWEHYNEQLGKTYLLKDRKVLWDGRESRIEFVFDVSKYKNKLDKQ GKQQAAMLRSLPGGIARLDARDERTILWYGSEFLSIIGYTEEQFKEELQSQCLYVYPEDM EQAEAAMHEAKETGRSSIMQGRIVTRSGKIRHLTITFSYMDGKDSEDGIPSYYSLGIDVT ETVERQRLQQQALEDACRVAKHANQGKTEFLSRMSHDIRTPMNAIVGMTAIAGAHIEDTQ KVRDCLKKINTSSKHLLNLINEVLDMSKIESGRLDVRANDFSLSNLVQNVFEVCRPMIIE KGHHLTMNISKVRHEGLYGDESRLQQVLVNILSNAVKYTPAGGRLHFSIEEKPTHAEGAS VFYFTFEDNGIGMSEEFVGRIFEPFSRAEDSRISKIQGTGLGMAISQNIIHMMNGEITVD STLGKGSRFTVSVALKFWDAEDGEHPVLTDLPVLVVDDERDVCEYASLLLKEVGMRAFWV LSGQEAVAEIVRAHEERDDYFAVILDWKMPDMDGLDTAREIRRRVGPDLPIIMLSGYDCS DVGADFLAAGVDVFIMKPLFKSNMVHLLRNFAEDRGCGHASAVPRPEGQRLDGLHVLLVE DNEINQEIAKELLLMEGASVDVADNGEQALNIFARSEEGYYQLVLMDIQMPVMDGYTATA ALRSLRRDDAKTAIILAMTANVFSDDVIKAEASGMNGHISKPFDPDKLYAMIAASCYKKR G >gi|316922251|gb|ADCP01000125.1| GENE 15 21628 - 24351 1483 907 aa, chain + ## HITS:1 COG:MA4377_3 KEGG:ns NR:ns ## COG: MA4377_3 COG0642 # Protein_GI_number: 20093164 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 364 615 7 255 311 175 41.0 4e-43 MKLSPHDSSLDALKFVVDMADVPMVLTDEDGRFVFANGHTLDIVGKAEDACIGKPLASLL SVSRRPRFNRSGKAPVLFTREGRFYEVDGISRFHTKSGMLELSRIHDVTSTYRDKEMLEF LKSNAPIGIFQFVMDEHFSFFYVSEGFCAMHGYTQDQMKTAPDIRGSELIHPDHLQRFND LIRTAYEDGKSSVGFEMKVIRRDGKVRWLLTQCSFVKTADGTIVYGYVSDCTDLKNMEEK TKNVQDVCRFTVNHDYEWILLVDIPADRYEYFFSGRNSMIGSGMQGVFSELVRLRSETVV HPDDVEAYRAFFKPERLAWENGTPGLSHELKYRVLMGNGQARWHCLRAYLFDPVRRMLLF CAKDVEEEERQKATLRMALVAAEQANHAKSEFLSRMSHDIRTPMNAIIGMAAIAGMNVGN PERLSECLNNIALSSRFLLSLINDVLDVAKIESGKMSLASEPFDFDELVSSVSSFTYGSA VAKGIIFNLFASPLLEKIYVGDPLRIKQVLMNLLSNALKFAPEKGRIEFNITPVQKVGHR ELVRFVISDNGIGMSKSFQERMFQPFEQEMTDQRGLSGSGLGLAIVRSFVQLMDGTIRVE SEQGKGSSFSVDIPLGLFEGALYGWDKDSLEQSESVRVLVIGDDRAACEHTAVLLRRMAV EASFVLSGEEGVVCVRQARERRRDYGMILIDWETPGMDDMETVRHIREIVGKATTPIAMS ACDWREMEAPARTAGVDYFIHKPVLREHLRDMLLVVTHHRHAFDVPAMPEEVRFNREKIL 
IAEDNDLNAEILKTLLESRNLSVVWAENGKVAVGLFAESEPGEYAAILMDVRMPVMDGLE ATRHIRGMTREDARRIPIFALSANAFADDIQRSLQCGMNTHFNKPVDMDSICAALHYWLE HDKGNRP >gi|316922251|gb|ADCP01000125.1| GENE 16 24473 - 24682 109 69 aa, chain - ## HITS:1 COG:YPO3438 KEGG:ns NR:ns ## COG: YPO3438 COG0582 # Protein_GI_number: 16123586 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Yersinia pestis # 2 62 12 72 419 85 65.0 3e-17 MKPSGKVQKLSDCDGLYIHVSPAGGKLWRLFYRFDGKQKTLALGKYPEVSLAEARKRRDE ARALNESCP >gi|316922251|gb|ADCP01000125.1| GENE 17 25317 - 26255 632 312 aa, chain + ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 3 292 1 289 296 166 33.0 5e-41 MNVSFEYYKVFYHVARLGSITLAAKALFLSQPAVSKCIRQLEKVLGCALFYRMHKGMRLT PEGEVLYQHVAQACEQIAIGEKKIQAMLHLDWGSIHIGSSDMIMHAFLLPYLERFHKDYP KVRISTITASTPETIASLRARRIDLGVVFSPVEENEDLDIIPVCRVQDTFIAGNTFRHLE GSVLSPDEVAKLPIVCPEKGTSTRAYLDAFFQSHGVVLDPECELATTVLIVPFAERGLGV GITVRRFAEESLRAGSVFELQTTHRIPERAIAVVTRKRSQISHAGHAFIRCLTGKETEEA GGEQLREKCSPG >gi|316922251|gb|ADCP01000125.1| GENE 18 26424 - 27398 1123 324 aa, chain + ## HITS:1 COG:CAC1045 KEGG:ns NR:ns ## COG: CAC1045 COG0697 # Protein_GI_number: 15894332 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Clostridium acetobutylicum # 1 305 1 305 317 276 54.0 4e-74 MKRGYLFIALATLFFSTMEIALKEVAGLFNPVQLNLTRFLIGGLVLIPFARRMLRKRGVR IDGLSLVKLAGLGFLGIVVSMTLYQLAVENTNASVVAVLFSCNPVFVLVFAGLILRTQIL RQHVMALVLECLGILILINPLDTSISTAGITFTLLSTAVFALYAVLGTKMCAKYSGVVVT CGSFLFASLEMAAIVAVSRLSAVAGVLADIGLGLFADIPLLSGYTPFSALMMLYICVGVT GAGFACYFMAMEATSPITASLVFFFKPALAPVLALVFLQEAIPVSMVVGVLFILAGSLVS LIPALSMIPAMALGKVWCWKQGRA >gi|316922251|gb|ADCP01000125.1| GENE 19 27491 - 28513 1522 340 aa, chain - ## HITS:1 COG:AGc4976 KEGG:ns NR:ns 
## COG: AGc4976 COG1638 # Protein_GI_number: 15889993 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 48 304 42 299 335 124 30.0 3e-28 MSCKTIVSLLMAGFFLCAASASAAPAAAPHTTLTIAMGDPESSEMGVVGNAFKKYVEDKT NGAIQVVCIYGGSPGDDEGERFRKVQKGTLDMALGGIANIVPLEKKLGLLTLPYLFANLN EVVAGTNGAPAELLNKYATEAGFRILTWTYTGFRFISNDKHPITKLSDMKGLKFRVPQSA VMIATYKAFGAIPSLIPWSMTFNALQSGDVDGQCYGYIGFRAMKFNEANQRYLTEVHYTY HLQPLVISERVFEKLTPEVRQILIDAGKYAQEKSLLFQVEQAEASKRELVAGGLKVSQLE DEEQWKKVAIEKVWPEMADFVGGKDAINEYLKACGKQPWN >gi|316922251|gb|ADCP01000125.1| GENE 20 28581 - 29888 670 435 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. PR1] # 2 431 1 430 431 262 33 3e-69 MMEALSPANIGLLLFGIFFVLLLIGSPIMVALGVATMACFIVLDIDLSLMIERAFASLTA FPLMALPAFVLAGSLMEAAGVSRRLVHVAENIVGPTPGGLAISTTLSCVFFGAISGSGPA TTAAVGMLMIPAMAKRGYNVGYAAAATATAGGIGIIIPPSIPMVIYGVSGQQSISKMFMA GIIPGILIAVGLSVMHFFLCRNLKTEGLDWSMKTFIHSLRDGFWSILAPVIILGGIYAGI FTPTEAAVVAIFYTILVGIFIHKELTLKSFMASLKTTSWLTGRVLVLVFTATAFGYLLTS YRIPVEIANWILSFTNNVNLVWFFVVILLLFLGMFMETLAIIMLVTPVLLPIMTAYGVDP IHFGIILICCCGIGFSTPPLGENMFIASGIANISLEEISLKALPLVAINIAVIAILVMFP DIVLFLPNLMGTGLQ >gi|316922251|gb|ADCP01000125.1| GENE 21 29885 - 30430 699 181 aa, chain - ## HITS:1 COG:no KEGG:DP0596 NR:ns ## KEGG: DP0596 # Name: not_defined # Def: DctQ (C4-dicarboxylate permease, small subunit) # Organism: D.psychrophila # Pathway: not_defined # 1 179 10 188 202 197 54.0 2e-49 MKKIVLALLDHFESYLCQFLLVFFVVVILLQIVLRQINMSLPWTEEIARFAFIWFILFGA CYATRLCALNRVTLQFSRAPKWVGNLFLFIGDIIWLCFSLIMAWEGYIAVLDLVEFPYAT PALDWDLGMVYLVFPLSFLLMAIRIVQVNVIKYILKEEILDPDQESIKESQKTLMAEGDK E >gi|316922251|gb|ADCP01000125.1| GENE 22 30675 - 31682 1672 335 aa, chain - ## HITS:1 COG:SMc00271 KEGG:ns NR:ns ## COG: SMc00271 COG1638 # Protein_GI_number: 15965452 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type 
C4-dicarboxylate transport system, periplasmic component # Organism: Sinorhizobium meliloti # 9 295 7 289 325 140 30.0 4e-33 MSYKKMLPLVLAALLLWTVPASAAPETTLKVAVGDPEDSEMGVVGHAFKKYVEEKTGGKV EVQCFYGGSLGDESEAFRNVQKGTLPLAMGGIANLVPFEKKLGLLTLPYLFANIDEVVAG TNGAPAELLNKYATKAGFRILTWTYTDFRYISNSKRPITKMADMQGLKFRVPQSAVLIAA YKAFGGSPTPISWAETFTALQQGVVDGQCYGYIGFKAMKFNEANQKYLTEVHYTYQLQPL VISERVFKKMTPEMQKLIVDAGKYAQDAVLKFQLENAEAAKKELIAGGLVVSQLEDEDLW KKAAIEKVWPEMADFVGGKDAINEYLKACGKDAWK >gi|316922251|gb|ADCP01000125.1| GENE 23 32564 - 32836 273 90 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1035 NR:ns ## KEGG: Ddes_1035 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 70 1 70 74 68 48.0 7e-11 MSQYKLVYYSGMNMNLVQGASEIVEADSFNDALSLKCSWPVFEARDHLSAAAQNPGTCVY YTEMWEAVLMDPKQASTSHDCYGDFSGMRY >gi|316922251|gb|ADCP01000125.1| GENE 24 32889 - 33059 262 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302863290|gb|EFL86222.1| ## NR: gi|302863290|gb|EFL86222.1| conserved hypothetical protein [Desulfovibrio sp. 
3_1_syn3] # 1 56 1 55 56 68 67.0 1e-10 MAEKKNDVKFHEEIQKMEYEPMDETELKLIKGGITLGIVLLVVLFVVSKYLLPGLH >gi|316922251|gb|ADCP01000125.1| GENE 25 33092 - 34528 2166 478 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1033 NR:ns ## KEGG: Ddes_1033 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 476 1 477 480 674 76.0 0 METVRKPLNEDRVALLLGSVIFILALLKFSGIDMLGWAVKIGMWVGDPLDAWRPATKGLL PGYGALFATYAILTGALACCVKLMGNDVVKFVKSFTIIFFIAMICYTLGANAYIAANPTQ LKAQGIPWALGLSTEAGLILALVMGIVISNFMPNLAESLKDACRPELFVKIAIVIMGAEL GVKAADAAGFAGHIIFRGLCAIVEAYLLYWCVVYYVARRYFKFNREWAAPLASGISICGV SAAIATGAAIRSRPVVPIMVSSLVVVFTCIEMLILPFIAQHFLTGEPMVAGGWMGLAVKS DGGAIASGAIAESLILAKAAEAGINWEPGWIIMVTTTVKIFIDVFIGVWALVLAWVWCTK FDKSGDRTMHWGDVWARFPRFVLGYIITFAILLIICLQSPELQKTGASVSGTLNAFRVIF FLLTFFTIGMVSNFRKLMEEGIGRLAIVYVVCLFGFIIWLGLFISWLFFHGMTPPLAS >gi|316922251|gb|ADCP01000125.1| GENE 26 34720 - 35094 269 124 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPIPSVTQKFMCTLFVFTMICGVWQYAAIENTFYFEDRENIVSEEICEYAVTQRIAEHKR LNYNTNTSGIEPLLPLLSLLLGFLLYAPVVRKIHLCRLQNDFLPKLCDFHGMLPLPFAPP RSLS >gi|316922251|gb|ADCP01000125.1| GENE 27 35219 - 36226 958 335 aa, chain - ## HITS:1 COG:DR0925 KEGG:ns NR:ns ## COG: DR0925 COG0306 # Protein_GI_number: 15805949 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Deinococcus radiodurans # 7 322 7 320 333 251 52.0 2e-66 MFDVSFLLILIIIVALIFDFTNGAHDCANAIATVVSTKVLSPRTAVIMAATLNLIGAFLG TKVANTLGSGIVHPDIVANCQPLVLAALIGAIGWNLFTWHFGIPSSSSHALIGGLMGAAV AYAGFSSLNGGSILTKILLPLVLSPLAGFGMGLLVMFLIMFMCAKCARNKLNTAFTRLQV LSAAFMATSHGMNDAQKTMGVITLALFIFNEIETIAVPLWVKCLCATFMALGTAMGGWKI VKTMGHKIFKLEPVHGFASETSAALVISGASLLGAPISTTHTITACIFGVGSTKRLTAVR WGVAGHLIIAWVLTIPASGLLAACSFFLLSAVGFA >gi|316922251|gb|ADCP01000125.1| GENE 28 36219 - 36848 672 209 aa, chain - ## HITS:1 COG:CAC3094 KEGG:ns NR:ns ## COG: CAC3094 COG1392 # Protein_GI_number: 15896345 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate 
transport regulator (distant homolog of PhoU) # Organism: Clostridium acetobutylicum # 7 207 6 208 210 70 26.0 2e-12 MFARFLPQTVPFFELLVQQNALLQEMSEALAVVMDNTEASERHLKRINLLEEEADRLYRS ITQHLSQTFITPIDREDIHAINMAQERVADKIQHLANRFFVSGFMYQRFPAQMITERIRG MLRDTKSMLDEISMKKEVSAHIHTLKSRKSDCEMFQSTGLAELMDSEIETFERVREIILW GQVYDRMERTVDAVSDLADTLEEVALKYV >gi|316922251|gb|ADCP01000125.1| GENE 29 37610 - 38344 729 244 aa, chain - ## HITS:1 COG:AGpT74 KEGG:ns NR:ns ## COG: AGpT74 COG3701 # Protein_GI_number: 16119846 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, TrbF components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 224 5 214 220 125 33.0 7e-29 MALFKTKNESPPEPSSSSNPYLNGREEWLERYGSYISRAAQWRMVAFFCLILTGISITGN VMQANQVKTVPYIIEVDKLGKSSVVNRADRASATPQRLIQAEIAACISDWRTVTADVELQ QKMIERLSFFMAGSAKGVLRQWYEANNPYEIAKTGKLVHVEIKGLPLPVSSDSYRVEWQE TVRSHAGVLLDSYTYEATVTIQINPPTVDAVLLRNPGGVYITSLSAGKVVGAQAPAKPQP SEKQ >gi|316922251|gb|ADCP01000125.1| GENE 30 38354 - 39472 1288 372 aa, chain - ## HITS:1 COG:XF2046 KEGG:ns NR:ns ## COG: XF2046 COG3846 # Protein_GI_number: 15838640 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, TrbL components # Organism: Xylella fastidiosa 9a5c # 40 273 40 279 464 84 26.0 5e-16 MKRKYLPLGILALALCAVVISAGEAHAANEDFVSTLVREFYNKTSNWEPTLKRYALGIFR WLVILEVCFLGIKAALNRDQLTDIFKQFVMLLLMAGFFMAVINYYQEWAWNLINGLGAIG KELTPGGYSSESPFLTGMQLVKLVLDKLSVWSPGNSIALLIAALVIIVCFALISAQVVFI KCEAMVAMAAALLLVAFGGSSFLKDYAVNAIRYVLAVAFKLFVMQLVLGVGIAFIEGFST STAELQDIFVVIGASVVLLALVKSLPDVCAGIINGSHVSSGAALTASAAAVGGAAIGAVV AGSNTVQNVRDAAKVAGMEGGSGLGKVASMAKSLWGARQDAKTGGEKDLSTRTRSEMQER LERARMNKNDNA >gi|316922251|gb|ADCP01000125.1| GENE 31 39613 - 39996 343 127 aa, chain - ## HITS:1 COG:no KEGG:Desal_1705 NR:ns ## KEGG: Desal_1705 # Name: not_defined # Def: P-type conjugative transfer protein TrbJ # Organism: D.salexigens # Pathway: not_defined # 3 105 151 252 276 79 42.0 4e-14 
MDEARQRYRQEWDNWAESVDQASQATFQLSGQQLADLQRDPARFQQYIDSLLARPDGQQK AIMAGNQLSALQVQEARQLRELMATQVQSQLASQMKEEKESQMSQEAWRETIKTNRIGKV KAKPDPF >gi|316922251|gb|ADCP01000125.1| GENE 32 40128 - 40415 355 95 aa, chain - ## HITS:1 COG:no KEGG:Desal_1705 NR:ns ## KEGG: Desal_1705 # Name: not_defined # Def: P-type conjugative transfer protein TrbJ # Organism: D.salexigens # Pathway: not_defined # 10 92 21 103 276 71 49.0 1e-11 MKKYILLPALVLWICATPAYAITVTCTNCSTNLIQLLDRITNVEQLTNAIKQYQEAVEQT RQQITMVQQNIEQYQNMLQNTAQLPANLVNELKGS >gi|316922251|gb|ADCP01000125.1| GENE 33 40431 - 40823 507 130 aa, chain - ## HITS:1 COG:AGpT83 KEGG:ns NR:ns ## COG: AGpT83 COG3451 # Protein_GI_number: 16119850 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 130 685 813 822 146 53.0 7e-36 MRKANCAVILATQSLSDARNSGILDVLAESCPTKIFLPNSAAEDAGQKELYTGMGLNDKQ LAILKSGIPKQDYYMVSPQGRRKVQLALKGKALAFVGASDKASIARIRELAAEHGPGNWQ HIWLRERGVA >gi|316922251|gb|ADCP01000125.1| GENE 34 40836 - 42845 1446 669 aa, chain - ## HITS:1 COG:AGpT83 KEGG:ns NR:ns ## COG: AGpT83 COG3451 # Protein_GI_number: 16119850 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 669 1 679 822 611 45.0 1e-174 MLKLIDYRTKTRGLPDLLPYAALIAPGIVLNKDGSFLAGWEVRGQDTASSTESDLSQVSS RFNNAVKLLGSGWMLHMDAIRSSHRAYPPADKGYFPDPVTRLIDDERREFFSGNRWFSTS AILTVTYKPNFQDSKIAGKAQAGVATSDALEKARSYFQNMLDELEDALSSILRMERLVEY AMPDADGFDWVQSDLFSHIQYCTTGTLHPVRVPQTPMYLDALLGGETLVGGIVPRLGNRH LAVLSLDGLPQESYPAILRDLDALALEYRFSSRFICLDQYDAAKEINSYRKGWRQQVFRF LDQFFNNPNARANRDALLMAEDAETALTEVQGGYVSAGYLTTSIVLMHEDQEQLQDWARD LRRTVQTLGFGCRIESINALEAWLGTHPANSYAKLRRPMVNTLNLADLLPLSSVWTGSPV CPCPFYPPNSKPLAVLMTDNSTPFWFNIHAGDLGHTLIFGPTGAGKSTLLAFIAAQFRCY ENARIFAFDKGMSMFPLCFGASGDHYNIGNAEQLAFAPLQRIDSEEERTWAEEWIASLME 
LQQFTVMPAHRNAIHTAMQTLAANPPHLRSLTSFYHVVQDREIKEAIQHYTVQGAMGRLL DADADNLNLSRFMVFEIEDLMNLGDKNLVPVLTYLFRRIEKALDGSPTLLVLDEAWIMLG HPVFRAKIR >gi|316922251|gb|ADCP01000125.1| GENE 35 42839 - 43135 206 98 aa, chain - ## HITS:1 COG:AGpT85 KEGG:ns NR:ns ## COG: AGpT85 COG5268 # Protein_GI_number: 16119851 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, TrbD component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 91 13 97 99 59 36.0 2e-09 MRRYLPIHQSLHRHAHVLGAERELVMTSALIALLVGVGGLTAVSIVSAIVFWIASVFALQ RMAKADPIMSRVWLRHIKQQEFYPARASRWRLVGGFTC >gi|316922251|gb|ADCP01000125.1| GENE 36 43132 - 43476 339 114 aa, chain - ## HITS:1 COG:no KEGG:DPPB13 NR:ns ## KEGG: DPPB13 # Name: not_defined # Def: conjugal transfer protein TrbC # Organism: D.psychrophila # Pathway: Bacterial secretion system [PATH:dps03070] # 13 108 7 99 108 88 57.0 9e-17 MNANHLPSQSLSSRLFVTGLAAFCLLALPDIAAASGGITEFSSPLEKVVNTITGPAGKWI SIVAMALCGVIFIMNKDDISGGFKLLLSVVFGISFIAFAASIVNSVFSFSGAVI >gi|316922251|gb|ADCP01000125.1| GENE 37 43442 - 43657 69 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKAPIPNTPEIRCRGCNKLLARGAIEAGDIELHCPRCKTRLHLRATRPSHAPHDGLHGDL HERESSSQPKP >gi|316922251|gb|ADCP01000125.1| GENE 38 43676 - 44653 764 325 aa, chain - ## HITS:1 COG:AGpT89 KEGG:ns NR:ns ## COG: AGpT89 COG4962 # Protein_GI_number: 16119853 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 308 29 323 343 246 45.0 4e-65 MLMRPCDDPRLLEGLRHSCGPLFMDALEDPEVIEIMLTPDGSLWIERYGQDHERIGEIPV AQGRLILSQVASGLNLTVNERSPIVEGEFPLDGSRFEGTFPPIVGPGPSFSLRKKASRVF TLAEYTASGSISPEVANIIEDAVLRRLNIVVVGGTSSGKTTFVNAVIDAIYRLTPSHRLL ILEDTAELQSKSPNVVFFRTSVLANVDMRTLAKVSMRYAPKRILIGEVRDAAALELLKLW NTGHPGGVSTFHADSAEEALPRLEELVEEAGLGPKQKLIGRAVDLAVYMEKTQDNKRRIS SIVKVNRFNHKEEVYEKETLYTVAA >gi|316922251|gb|ADCP01000125.1| GENE 39 44653 - 45006 284 
117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302861831|gb|EFL84766.1| ## NR: gi|302861831|gb|EFL84766.1| putative glycosyltransferase [Desulfovibrio sp. 3_1_syn3] # 1 117 55 171 173 93 41.0 4e-18 MLVKTHKTVVDDHDPILMVVTILNAHLTEVDKLQARHREGLARLMADKTGAYVSGVQAAV GQLTDSLSSASVEGIRKVFDEHAARLKLFKSNITWLAAIVAVSALLNMAVFVLGGLR >gi|316922251|gb|ADCP01000125.1| GENE 40 45126 - 45854 951 242 aa, chain - ## HITS:1 COG:no KEGG:Desal_1711 NR:ns ## KEGG: Desal_1711 # Name: not_defined # Def: conjugal transfer protein TraL # Organism: D.salexigens # Pathway: not_defined # 1 231 1 230 239 231 49.0 2e-59 MATIHFIQQGKGGVGKSMIAVILYQVLGHLGKEIMAFDTDPVNSTLAGFREFAVTRLDIL KNGDIDPREFDVLINEIAALPVDTHVIVDNGASSFLALNSYIKENDVLGILTESGHNVFF HTVITGGQAIGDTVLGLRSLALGFPESPIVVWLNPYFGEIRMDDRSFEEFKVYQEFSDQF HAVITIPQGNKATLGKDLEMLFAKRQSFATAINFSQSIVMRSRLSRYWNELVSVVEQAAI AF >gi|316922251|gb|ADCP01000125.1| GENE 41 45932 - 46213 217 93 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAAEGKRLSNNMDGIEFVACMETIKEFLQKSWKHSQIHRQLKQEGKITMSYGAFCYHMRR LSLDQHKPGETPPPVQAKKLLRHLCAHQPGHAM >gi|316922251|gb|ADCP01000125.1| GENE 42 46453 - 46752 248 99 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTTGYNFAMALERVFVELVAKRVKERGIKKGEFAALVWPEDSPKAAAARWTAMRSKASNT GKPQGVQISDAQRMAEVLGEDLSYLMAIAKEQARAQAEA >gi|316922251|gb|ADCP01000125.1| GENE 43 46817 - 47116 145 99 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGVPRQLTINDQEIDLEKAVADALEKILAPVAETICEMERIRRKEYLTEKEVEKLFSLSA ATLKTQRSRGGGPKYIKQGTRILYNRKELIAYFSRANVS >gi|316922251|gb|ADCP01000125.1| GENE 44 47560 - 48363 -68 267 aa, chain + ## HITS:1 COG:no KEGG:Rcas_0692 NR:ns ## KEGG: Rcas_0692 # Name: not_defined # Def: hypothetical protein # Organism: R.castenholzii # Pathway: not_defined # 1 266 114 379 380 399 69.0 1e-110 MMHFLEVDSLSKLSHMIRPVKKGEPVSPSWVQVYRELVEWAILFALLEKDFGTDTLIICD GLLRSKVFAQDLFPKLLSGMKERIDEHFSRSRRKILLAGVAKHSKVLSRYRLAMALEGIL VTDYPAYVEIPREVEERAYVWSEFARGDDHATESGEINKFVGGKMFLVKFGNRRRDPIWP VDIFLPQISDAQLILGSMLADAINGFPVPHYPRCLQKAHENAALVDFDFDILQDFIYEGV RSSLGSQAETLDSFQLQDADPAQRRYS 
>gi|316922251|gb|ADCP01000125.1| GENE 45 48539 - 48673 97 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKLFSGVLHRFLQKENFLMVLEKNLILGLFVMDALFQRTYENST >gi|316922251|gb|ADCP01000125.1| GENE 46 48670 - 48885 109 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAPICIDILAKSQVPYGWPINIVCGHFAQNVIIWECYFTAYVWQSTVRRNKGEPRSVFSQ NSKHSYIYPVF >gi|316922251|gb|ADCP01000125.1| GENE 47 49347 - 49805 -559 152 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MINKAPRSYFRPSFTQAIYGSAAGAFSLRTDCASSTTAIVGIVFGSAAVNSCWLRSKILR RIKPDRINDCPPRICDTSMIHSFPSERAWIRRSINWLLLLCNILTIEVMLPRAASASASC RWGSRSSSLQISVSGVELPFSSIILTSSSHLF >gi|316922251|gb|ADCP01000125.1| GENE 48 50023 - 50481 -125 152 aa, chain + ## HITS:1 COG:no KEGG:Rcas_0691 NR:ns ## KEGG: Rcas_0691 # Name: not_defined # Def: hypothetical protein # Organism: R.castenholzii # Pathway: not_defined # 1 152 551 700 700 134 41.0 9e-31 MSFEKEFSMRDPNYNQPSGNTYARSIFSSTSGTASPITAPCQHTPPAGVLMQPSEDECNP IDIMNSIRENSIKALQGNSDLSQKINSAEGAAWGALKAFFIRTLPEHLDNKNDFAYELVA VAMNSIYGQQGTAWESYRNPERDMTSYVRLKR >gi|316922251|gb|ADCP01000125.1| GENE 49 51061 - 51588 -271 175 aa, chain + ## HITS:1 COG:no KEGG:AM1_2410 NR:ns ## KEGG: AM1_2410 # Name: not_defined # Def: DNA modification methylase, putative # Organism: A.marina # Pathway: not_defined # 3 171 198 364 368 158 45.0 8e-38 MRRAKRYYSYQPPSILSEVRQADSRESTSFSPDELSRKFDWVITSPPYYGMRTYIPDQWL RHWFLGGPDTVDYSNDKQIIHSSPELFERNLQSVWQNAALVCNDNARLVIRFGGIPDRRA NPLELIKNSLTDSGWKISTIRQAGTSHEGKRQADSFLRKKTTPLKEYDIWAKKYI >gi|316922251|gb|ADCP01000125.1| GENE 50 51902 - 52363 -67 153 aa, chain - ## HITS:1 COG:BMEI1402 KEGG:ns NR:ns ## COG: BMEI1402 COG3293 # Protein_GI_number: 17987685 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Brucella melitensis # 24 148 1 125 146 174 62.0 6e-44 PPHGGKPAQKGGSARLIGRTKGGLNSKLHAVCDGHGRPVLLLLTEGQVSDCKGAAILQHL LPEDGIFLADGRYDVAWLRESLKARGMTVCIPPRKTRNAAFPFDKTLYKRRHLIENTFSK LKDWRRIATRYDRCAHTFFSAICLAVCVIFYLH Prediction of potential genes in microbial 
genomes
Time: Fri May 13 04:09:10 2011
Seq name: gi|316922225|gb|ADCP01000126.1| Bilophila wadsworthia 3_1_6 cont1.126, whole genome shotgun sequence
Length of sequence - 40936 bp
Number of predicted genes - 36, with homology - 23
Number of transcription units - 23, operones - 9 average op.length - 2.4

N Tu/Op Conserved pairs(N/Pv) S Start End Score

1 1 Tu 1 . - CDS 3 - 339 109 ## COG3293 Transposase and inactivated derivatives
- Prom 584 - 643 3.1
2 2 Tu 1 . + CDS 351 - 809 -70 ##
+ Term 824 - 860 -0.8
3 3 Tu 1 . - CDS 1089 - 3002 -21 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits
- Prom 3132 - 3191 3.8
- Term 5435 - 5494 15.0
4 4 Op 1 . - CDS 5540 - 5746 138 ## Slin_1968 phage transcriptional regulator, AlpA
- Prom 5843 - 5902 1.9
5 4 Op 2 . - CDS 5959 - 6555 -106 ## BMD_2552 hypothetical protein
+ Prom 6474 - 6533 1.9
6 5 Tu 1 . + CDS 6711 - 7181 -294 ##
+ Prom 7345 - 7404 2.4
7 6 Tu 1 . + CDS 7529 - 7630 59 ##
8 7 Tu 1 . - CDS 9595 - 9783 95 ##
- Prom 9836 - 9895 3.4
9 8 Op 1 . - CDS 9943 - 10464 179 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15
10 8 Op 2 . - CDS 10386 - 11165 358 ## COG0582 Integrase
- Prom 11240 - 11299 3.2
+ Prom 11361 - 11420 4.9
11 9 Tu 1 . + CDS 11503 - 12066 128 ##
+ Prom 12138 - 12197 5.8
12 10 Tu 1 . + CDS 12366 - 13013 -198 ##
+ Term 13045 - 13072 0.1
13 11 Tu 1 . + CDS 13756 - 14274 -137 ## Dvul_2576 hypothetical protein
14 12 Op 1 . + CDS 14696 - 14923 155 ##
15 12 Op 2 . + CDS 14986 - 15486 88 ## gi|212704542|ref|ZP_03312670.1| hypothetical protein DESPIG_02601
+ Term 15532 - 15581 14.3
+ Prom 15546 - 15605 4.7
16 13 Op 1 . + CDS 15687 - 18428 867 ## COG1002 Type II restriction enzyme, methylase subunits
17 13 Op 2 . + CDS 18421 - 20436 1955 ## Noc_1466 hypothetical protein
18 13 Op 3 . + CDS 20448 - 21701 725 ## Dd703_3041 hypothetical protein
19 13 Op 4 . + CDS 21731 - 22273 277 ## COG2184 Protein involved in cell division
+ Term 22276 - 22313 0.5
- Term 22680 - 22731 18.2
20 14 Tu 1 . - CDS 22741 - 23199 180 ##
- Prom 23248 - 23307 7.0
- Term 23369 - 23409 1.3
21 15 Op 1 . - CDS 23420 - 24538 516 ## Acid_5978 hypothetical protein
22 15 Op 2 . - CDS 24576 - 25160 136 ##
- Prom 25245 - 25304 2.0
23 16 Op 1 . - CDS 25364 - 26224 619 ##
24 16 Op 2 . - CDS 26266 - 26718 -84 ##
- Prom 26924 - 26983 5.7
25 17 Op 1 . - CDS 27028 - 29571 598 ## COG0457 FOG: TPR repeat
26 17 Op 2 . - CDS 29625 - 30293 180 ## Acid_5978 hypothetical protein
27 17 Op 3 . - CDS 30312 - 30530 176 ##
+ Prom 30844 - 30903 5.8
28 18 Tu 1 . + CDS 30923 - 31267 122 ## Bcen2424_0921 XRE family transcriptional regulator
+ Term 31327 - 31375 9.5
+ Prom 31270 - 31329 3.8
29 19 Op 1 . + CDS 31382 - 32077 241 ## DVU2032 ERF family protein
+ Term 32078 - 32124 13.3
30 19 Op 2 . + CDS 32140 - 33318 226 ## LI0183 hypothetical protein
+ Term 33542 - 33579 3.5
+ Prom 33657 - 33716 3.6
31 20 Op 1 . + CDS 33777 - 34754 604 ## COG0714 MoxR-like ATPases
32 20 Op 2 . + CDS 34771 - 35766 921 ## Dvul_2612 hypothetical protein
33 20 Op 3 . + CDS 35770 - 37341 189 ## Dvul_2614 von Willebrand factor, type A
+ Term 37421 - 37482 -0.1
34 21 Tu 1 . + CDS 37668 - 37808 105 ##
- Term 37802 - 37868 9.0
35 22 Tu 1 . - CDS 37880 - 39316 490 ## COG0582 Integrase
- TRNA 39656 - 39749 74.1 # Ser CGA 0 0
+ Prom 39600 - 39659 2.6
36 23 Tu 1 .
+ CDS 39900 - 40922 322 ## COG0859 ADP-heptose:LPS heptosyltransferase Predicted protein(s) >gi|316922225|gb|ADCP01000126.1| GENE 1 3 - 339 109 112 aa, chain - ## HITS:1 COG:BMEI1403 KEGG:ns NR:ns ## COG: BMEI1403 COG3293 # Protein_GI_number: 17987686 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Brucella melitensis # 1 110 1 110 122 147 61.0 5e-36 MRTLFYLSESQMERLSPFFPRSHGIPRADDRHVVSGILYVIKHGLQWKDAPAEYGPYKTL YNRFVRWSRLGVFSRIFTELANQTPFDGSLMIDSTHLKAHRTAASLRKKGAP >gi|316922225|gb|ADCP01000126.1| GENE 2 351 - 809 -70 152 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSSISPVFTSLGIMSPDPSYIEFIFQGFPNFLCSTFGRFSSSININILFDKFLYFSCCLI CYMFFTRVYGNGCLIYNLPVYINITTSNNIAVSINILDLVLCCRMTIAQRSIKLFNLCVP VVTNLSQLCFRKLSPTRMSRIVVPNYCVAVKA >gi|316922225|gb|ADCP01000126.1| GENE 3 1089 - 3002 -21 637 aa, chain - ## HITS:1 COG:YHR164c KEGG:ns NR:ns ## COG: YHR164c COG1112 # Protein_GI_number: 6321958 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Saccharomyces cerevisiae # 143 598 1047 1452 1522 106 27.0 1e-22 MAPAYTDDSELLTRPLWSTISAEINSLDLLNRKARVRLFHWREQTWFPYLVSNSQIDLLN DVFITKGISTFKWHDICSNILSSVGTPPIATPDLNAARTMAAVPPQRRGTDPTTPIASVL WNATALCSNSIVPQITAATIADSAEQTHNLNASQKVAVKNAIEQQMTIIWGPPGTGKTKT LAALVHSLTRALSGDRLGTNILIAGPTYKAVEELIERIIIAIANDASCLADIFIAYSSSQ RPKSFSPPYTKGRLASFNLRDREQLADECYASLSDRNKITIVATASMQAWKFAERLTNNC VGQVFDTIIIDESSQVQVTTAISPLATLREQGRLVIAGDHLQMPPIMALEPPVGAEYLVG SIQRYLIERFQIVSCALEENYRSNEDIVAYARSIGYRSSLASVFPHISLHLIAPLDTAIS SLPQGLPTSTLWKEILEPSRKVVTLLHDDDLSSQSSRSEAKIVAGLVWCLRNSVSGEIDG RGGGAHHPPTPKEFWEKCIGIVTPHRAQRALVVRELKRIFPSDPPELIDNAVDTVEKFQG GERHTILVTYGVGDGDVIAGEEAFLMQLERTNVAVSRAMAKCVVIMPFSLAGHVPQDKKA LKTAHALKGYLFEFCNKEQDGQILFGQKGKKAKIRYR >gi|316922225|gb|ADCP01000126.1| GENE 4 5540 - 5746 138 68 aa, chain - ## HITS:1 COG:no KEGG:Slin_1968 NR:ns ## KEGG: Slin_1968 # Name: not_defined # Def: phage transcriptional regulator, AlpA # 
Organism: S.linguale # Pathway: not_defined # 2 56 4 58 72 74 58.0 1e-12 MTTMTTVPTVGFLRLPQILQLVPISKSAWWEGCKTGRFPKPVKLGPRTIAWKAEDIAALV ERLGDKKE >gi|316922225|gb|ADCP01000126.1| GENE 5 5959 - 6555 -106 198 aa, chain - ## HITS:1 COG:no KEGG:BMD_2552 NR:ns ## KEGG: BMD_2552 # Name: not_defined # Def: hypothetical protein # Organism: B.megaterium_DSM319 # Pathway: not_defined # 5 192 358 530 547 65 26.0 8e-10 MDHPLEELPTLYQYWGTLQVLQAMLSVGEQRGFVVESQRLLRRDMSGLVFSVFPQGRAAL ILRHPTTGSRISLYPEQSFGRSSTKGFYSLSYQKRPDICIVCEEVEAPPQLLLFDPKYKL ISEEKGTMNDSRPLREDIDKMHTYRDAIRHEEIERPVSFAGIIYPGKTEHFGNGIGAIGC IPGCTGQEDIQEVLERFL >gi|316922225|gb|ADCP01000126.1| GENE 6 6711 - 7181 -294 156 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRWPALKTSLRFLPATLVCLLVEDGVVVFLQDHAGTAQCDFRMYPQWSLLAYPARSWGVH FPGCQPGSELERDLQRMGGGRRMLLPMVAHLLASVVAFELTLQAKQATQLFLQRFHRLLF EGEEAPVLAMIHAKAQPGHQPSHLVFAGKKRCHRVS >gi|316922225|gb|ADCP01000126.1| GENE 7 7529 - 7630 59 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFMVARPHIGNLVQHQLQLVLAGNIFKRCKDGI >gi|316922225|gb|ADCP01000126.1| GENE 8 9595 - 9783 95 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYNQVVWGRKSLLSEKLPVLRLVLGLVWGGVHEFHGLVFLLKGKLPIAAYTPSVYIIDSM IS >gi|316922225|gb|ADCP01000126.1| GENE 9 9943 - 10464 179 173 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 1 164 249 403 406 73 34 2e-12 SILRSASLRTSEAKWTEFNLDAAEWIIPSERMKVGRPHLVPLPRQAVELFKKMYAFSRTK EWVFPSTSSIGAGKPVSSMALIQAFRKMGYTAENGNRFVTHAFRGLFSTTAYNIFGASSL AVELQLAHVEQNKVKAAYHKTSLRTALAERRALLQQYADYLDELRAKALKAME >gi|316922225|gb|ADCP01000126.1| GENE 10 10386 - 11165 358 259 aa, chain - ## HITS:1 COG:RSc1871 KEGG:ns NR:ns ## COG: RSc1871 COG0582 # Protein_GI_number: 17546590 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Ralstonia solanacearum # 7 244 23 246 405 92 30.0 7e-19 MQRIPVGDSLFLRVSPKGKKTWYLRYDTLSPEGTRKQNLITLGEFPALGLKEARAEADAR 
KAQAKQENVNLTQLRKQERIRLAQPKVVHTFQSVADAWLDLKMMEWEERSGKQNRGRLVA NVYPVIGDVPIDELTVNDIERALKHIIARGSLEVARRVHTLIVSIFKYALAKDLIQQPDI VVRLSWYKDQMPKRRRQSLYSEELGPEDVGKLLRSIDEHKNRWTVPVSMALQLAPYCAVR PSELLKQNGPNLTLMRRNG >gi|316922225|gb|ADCP01000126.1| GENE 11 11503 - 12066 128 187 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCDRTNKLNLNGIYNLRSDKIVLPRNININYLDKEIFLKKIEEELNSFTNSYHSLFVRML PILIDDYCVSMYNLKNIDTSLSLLGCIASIAAADRGFHMVKSPDDQLNNLSVIISAGIHS GGGKSTSLEPFINIFKKREELEGKRVEEKNYCIQADLDEIKNKLRRAQKSSNREINIQCR KKLMNLH >gi|316922225|gb|ADCP01000126.1| GENE 12 12366 - 13013 -198 215 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIMYSPSSLRLSEQYSRYHNSDAYNLFERALNTIFDNSETYLESVEREHIIYTLSDTASE YYKNFKADMAQEAEVSRVPEWCRRAAQQSLKLAAIFRLAEYPDNDPNNIISNTEMDSAIA VILQVYEHVKRSVISFNDEKTHQCILRIAKMCCEDLGSEFTAKDVAQSLKTSYSSSIVHD ALNYLEYKNFIYEKRTNRVPRRGRPQSKIYVNNIY >gi|316922225|gb|ADCP01000126.1| GENE 13 13756 - 14274 -137 172 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2576 NR:ns ## KEGG: Dvul_2576 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 150 36 181 198 69 30.0 6e-11 MLDNHCKVIQTCFDLHYPDLQDFEYMNRDIYSFINEFIKSLNRRYCSGHYVDAKYLWVRE QKTSCHPHYHIVVFCNGNAIQSPYTIFNKAEYYWFKTIGYQDKKLIDYYDRSNGIKHENG IMIDVNKNDFEKQLNNGFRASSYLAKICSKDIRDKYSCVFGSSQIPSFKQNK >gi|316922225|gb|ADCP01000126.1| GENE 14 14696 - 14923 155 75 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKYCEEDLPYKIVSANDLLIFFGDALVAIINDKMPPEVPHSVREFLVECAELGGLELLDL YQEIIILISEYRKSK >gi|316922225|gb|ADCP01000126.1| GENE 15 14986 - 15486 88 166 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212704542|ref|ZP_03312670.1| ## NR: gi|212704542|ref|ZP_03312670.1| hypothetical protein DESPIG_02601 [Desulfovibrio piger ATCC 29098] # 1 159 8 172 176 157 55.0 2e-37 MALEDKDLFSAIFNKYSDQPIEFVMEQYEKAKNINLEIERRQSLRGYVDITPPSIPQNTG EVKQIEPDVESVPKKKFSKRDLIIKPDEAITDDTIKCCLCGKERSSLTLRHLATHGISVE EYKKLCGYAPEQKLMSNNHVEKVRDNVMKAQKARKSSKKIADIEEK >gi|316922225|gb|ADCP01000126.1| GENE 16 15687 - 18428 867 913 aa, chain + ## HITS:1 COG:BS_yeeA KEGG:ns NR:ns 
## COG: BS_yeeA COG1002 # Protein_GI_number: 16077744 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Bacillus subtilis # 21 901 19 876 879 611 39.0 1e-174 MATLNQAKIIKALEKMVTAPNDDFIFSFLSAYGTPNATIERLQMGDPQRNVAKIDGDIAL PNQMYFHPVTDGSSLEKAFEDILALPVIEQRKIRFILVTDYTTVIAYDRTVKDQTTFDYS DFKTNYEFFLPLTGLYEKAIAYSEHPADTKACEKMGRLYDHIQIINQYENHDEKHTLNVF LTRLLFCFFAEDTGIFPEQNQMVKALKSVTQENGSDVAEFFEKLFTVLDLPSDAPERTAF SATFQSFPYVNGGLFKEKTVIPTFDAKARRLLLDCGFLTWSDISPVIFGSMFQSIMDPEK RRSLGAHYTSEKNILKVVRPLFLDELHEEFQKILGLKKGKKKALEDFHAKIASLGFLDPA CGCGNFLIVSYRELRELELQVLLAIKAETKGDTRFLALDIRPLIKVSISQFYGIELEEFP VEVARVSMWLMEHVMNLKVGKTFGQVISSIPLQHSATIVCANALTIDWKDVVVPEKLHYI MGNPPFSGYQYMDEQQKEYIANLFKKNKYSRSIDLVSGWFLLASKFIVKTDIEVSFVSTN SITQGEQVYPIWNDIFKIGIKINFAYKTFKWNNESKGNAAVHCVIIGFSYKETLSYRLYS SDDNFKFVSGITPYITESDGKSNYIVCQSKHPICAKQNMALGNMPKDGGNLIIEFEELYS FLQEAPEAKPYIRRLYGAEEFIRNIKRYCLWLKYAPQDILDIPVVKKRLEGVVAMRLSSK AASTRAYAKYPHLFRQITQPDDTPFLIIPLVSSERRKYIPIGFMGKENIVTNLVSIVPNA TLYDFAILTSAMHMTWMRTVCGRLKSDYRYSRDLCYNTFPWPDASEEQKKAVSELADMVL NIRNMYPDKTLAEMYDPDKMPEPLAEAHHNLDMAVDSLYRNTPFESDEERLQLLFKLYEK LVAAKNAKEKPHA >gi|316922225|gb|ADCP01000126.1| GENE 17 18421 - 20436 1955 671 aa, chain + ## HITS:1 COG:no KEGG:Noc_1466 NR:ns ## KEGG: Noc_1466 # Name: not_defined # Def: hypothetical protein # Organism: N.oceani # Pathway: not_defined # 1 671 1 681 683 810 59.0 0 MPDLLTVRYEQSGKSTRQDDMGMREMQARAYAARTAQYLLLKAPPASGKSRALMFLALDK VIRQGLKKVIVAVPEMSIGGSFRDTNLTSSGFFADWHVDCDLCTAGNEGKVEQVAAFLNN TESRYLLCTHATLRFAYERIGNCAAFNDVLVAIDEFHHTSAEDGNKLGAMVDALMADSTA HIIAMTGSYFRGDQVPILSPETEARFTQVTYTYYEQLNGYKYLKSLGIGYHFYQGRYFDA LKEVLDLDKKTILHIPNVNSMESTGQKLHEVDTILDAIGSVVAKDSRTGIMTVKTRKGKT LRVADLVTDDSMRLNVLKYLREIRERDQMDIIIALGMAKEGFDWPWCEHVLTIGYRSSLT EIVQIIGRATRDCEGKTHAQFTNLIAQPDAEDEDVKNSVNNMLKAITLSLLMREVLAPNI TFRPRSLMRPGETLQPGDIILDDTEHPVSERVQKILSSGGIDQVVTTLLNTPEVINAAIT KTTDPKLVTDVEMPKIIMQLFPDLDMSEVSTVNDVAKTQLAIHANGGLISKLDPKFEVVG 
GTSEKQKNDDDGSGIPNGAKEFLNIGNKLIDIDNLNIDLIDSINPFQGAYEILSKSVNAA MLKTIQDQVATARSTVTDEEAVLLWEDVKAFKREYGVAPSLNANDAYERRLAEVLAYVRN KKAQQMQSQAQ >gi|316922225|gb|ADCP01000126.1| GENE 18 20448 - 21701 725 417 aa, chain + ## HITS:1 COG:no KEGG:Dd703_3041 NR:ns ## KEGG: Dd703_3041 # Name: not_defined # Def: hypothetical protein # Organism: D.dadantii # Pathway: not_defined # 3 416 2 416 419 279 41.0 2e-73 MKLRSENTRWGKSVDDILHEEDELGLLDGIAIKPKTAQKEDSAVTVFLNLVDFYKRNHRK PDQSNPDEKTLYWQLMGYQTRPELRAKVMHLDSVRLLKPSVTAKIPEPEKEEADTESNVK SLADIFDDDDLDLLDDIDTSIYAVKHVSQKKDKELPDEIASRKPCDDFFRYEKFFQDIHK VLPTKMVMKERIVQESDAKVGQVFILNGLLCLVDSIIKEDTGESKRENPRLRVIFENGTE IDLLKRSLTRALYKDKHGRRVNFDPNLFANGSISISHKDKPTGYVYILSSETKAPALAQL KAAGKLVKIGYSTQDVHERIKNASKDPTYLEAPVNILASIQCFNLNPQKFENLIHAFLHK QRLGMTLISSKGKAYKPEEWFAVDMNTAVEVCKHIIDGTITQYRMDNTSGKIVQKNN >gi|316922225|gb|ADCP01000126.1| GENE 19 21731 - 22273 277 180 aa, chain + ## HITS:1 COG:NMA0004 KEGG:ns NR:ns ## COG: NMA0004 COG2184 # Protein_GI_number: 15793038 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Protein involved in cell division # Organism: Neisseria meningitidis Z2491 # 2 179 13 190 191 269 74.0 2e-72 MNIDGISLNNAYRLFESGDIAAMEVGTTKGLQQIHKYLFDGLYDFAGTIREANISKGNFR FANSLYLKEALAAIEKMPENTFEEIIAKYVEMNIAHPLMEGNGRATRIWLDMILKKNLQK VVDWHQIDKERYLQAMERSPVNDLELRMLIQPSLTERIDDREVIFKGIEQSYYYEGFRRK >gi|316922225|gb|ADCP01000126.1| GENE 20 22741 - 23199 180 152 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKLTFIGSVVLSVICLIFTYSYAETYHKASKQYEMCFQDNSLSTYEKNNCYDAEYRRIL TCILNDVNQFKEISSIPQDTRENLVDHIDKWFNYEQSRVQIELLTAQNIGGMYYVIGVLN EQIQRALHMQDWVMNMIEAIGDKYVESSDDAE >gi|316922225|gb|ADCP01000126.1| GENE 21 23420 - 24538 516 372 aa, chain - ## HITS:1 COG:no KEGG:Acid_5978 NR:ns ## KEGG: Acid_5978 # Name: not_defined # Def: hypothetical protein # Organism: S.usitatus # Pathway: not_defined # 207 371 99 286 287 66 29.0 2e-09 MKKYSLTGLMLFVLLAFIPSSGHAVGRLNFMLTNLTGLDITDVRIAPTYYPNYISENLLK 
TNLDPNTRLYIGPNYYGDQRFWNITVSWSNGFQYTWTHNQLTRYNSYVVYANPYGVRMRQ GYERSFARYGDNMPSMYAGAQPGVSVSVGIPEKVNAVAVADAGKVGNSTRKTRDLVFDDE EETEHPVVAGSTADTTKGETISVKATIELTRDGKLSTVLPTESFKSGDKVRLLFSTNRDG NVYWVAKGTSGQYQVLFPSPKAGMNNTVVKNNSYTVPAKGAWRFDEQKGTETLVCILSPS RVAELDKAVALGGEGKKNEASEIISSVVNRHESKRTTRDLVFEEEDNQDVNTKTQTSSDN EPFVATYELIHN >gi|316922225|gb|ADCP01000126.1| GENE 22 24576 - 25160 136 194 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MELLGTFTHNWKYQSTTHELFYEDHRALARSIYGLAIYAGDVDSSLDPQKFIEGVLGYHY RVTQVCAWLNAVVSQKTSSPELDEENLIGVLLSDGVIAIKGGNFVPTGKYSHILAASQGK KRSFSDNLRHERLHVFWDEDSVFRERAQQEWKTLSEEERQKIRKTLHQYAQENEAQLVEE WAVKRAETSRMSIE >gi|316922225|gb|ADCP01000126.1| GENE 23 25364 - 26224 619 286 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVSKKMRKIALWTLLVVMLAGLVWAGLDLLGQSNKSAADAGSSAADMEMASLDSIKQSTK SACDWNQERALRKKIDAADASYRTKLASAKSEIERTGKVSESTRSAGIALAKQFQNASEN YAAFWDKNNGKTRAKLAREAGASRVKSAEMAFNNVDASKIDAYNDQQASLRKAQKAYFSE AKEDVSPQDLASLKSSLTPKLEKMGSDLMALVQSVTNILSQVKDQVGSTLSVGGIGGCAK QVATGGGTAAVNDGVASLLSPLQSLLSLVQSMGSNVQGMLSDIATF >gi|316922225|gb|ADCP01000126.1| GENE 24 26266 - 26718 -84 150 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MELPDSSFLQTRYVILSACNTGVIFAPKTLKDERTFTDFNQSENMEEELRKVGWIPGIDQ VSFVDVFMRRKVNNVYGTLWFADDAASAYLMSHFMKKLVNQGEHQDAVAAFSETQRQYIK ESKEGKKPLGEDYPVPLHPYFWAVGALFGK >gi|316922225|gb|ADCP01000126.1| GENE 25 27028 - 29571 598 847 aa, chain - ## HITS:1 COG:sll0499_1 KEGG:ns NR:ns ## COG: sll0499_1 COG0457 # Protein_GI_number: 16332039 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Synechocystis # 122 481 123 461 563 82 25.0 4e-15 MFRFAAAFLVSFMLCTAPVQAAGFEQTFKEAQAAYKTGKYADAARLFVQTADLLKKAKET AKAQMVLGNAAIAYMQAEDYASAVTIYENLLSNPGKADQKILLKSYKNLVICHSHLEQHA LKIQTLERMMQALPKLPKAEISNVYAQMGDAYRALEIYAPASAAYDKAAHLLPQDADPRV RGRLLTAMGLCQGNMGDFAGASENLAKAKELAARINEPLTLAESDSNLGILYWERGDYPQ ALQLLNNSLETERKNTLRRNEGVDLNNLGLVKKSMGYFPDAMRAFEDALAIAREVGNVKD EAIALSNRALLNRITGKLSEARADYRAALELYERTGFQEGKAGALLGVGKIAEREDRDLE 
TALNCYREALDIYSQLRLPRNQAEALLQIGGVLKQTALPGRTSRDLVFDDEPTVPKIDKT EALAEAEKAYQSALLLAEQVGSKEMLWAAHQGLGFCAFQKGQPEEALEHYTFAINLVTAM RTSLESVELLGEYMAGKEDLYSEAMEVCARLHGETQDVKYLEMQMQFDETLRNEIQKASA ALVRMEFADTDKQKLYDKLVALGRKQAKAESTVPVVAPVPTEASEETKLVHKIKTEEAKK QKATVQQLDQDYQTLLTEWKEKYPGDAVIFESSARLDIPKIQQALSDDQVLLHYMQLPDK LVIVCISNDRVDSKIVNISKKELDEAIRKQFLVEYIQTYGSKSPPTPSEELDYMDKSVSI LSILHNCLILPINNLLKNKKRLYICAGGFIAQVPFSALVSSYEDKNPHFLIDDYDIGNIR PSFISALTDPNKKSSDKTLLAVGNPHNNTIYMKNLDGAEKEIQNVNSTLYKENFIKDIQY TNTANIS >gi|316922225|gb|ADCP01000126.1| GENE 26 29625 - 30293 180 222 aa, chain - ## HITS:1 COG:no KEGG:Acid_5978 NR:ns ## KEGG: Acid_5978 # Name: not_defined # Def: hypothetical protein # Organism: S.usitatus # Pathway: not_defined # 55 158 97 202 287 65 33.0 1e-09 MKRYSLLLILGALFMVWGDDSWAARTTRDLVFDDEEPAQTQAMESSGKQTTALKTTMLLK RDGTTSTVLPSHEFKSGDSVKLVFTPNIDGYVYWLAKGSSGNYSVLFPSKANMDNAVQRN QEYTIPPKGTFRFDDTPGNEELLCILSAEKLPDMDKAIAEADAAQINAQSSTQLAKLEEK NTPKRTTRDLVFDDEDEGDVNTKQQVAPKGEPFVAHYVLTHK >gi|316922225|gb|ADCP01000126.1| GENE 27 30312 - 30530 176 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDVRPAWDTTGVRNMALPEEGTYILALDKPDGTPISSGMFTIKAMSISGSPLPEETLGTT LAKIFNKYLPKK >gi|316922225|gb|ADCP01000126.1| GENE 28 30923 - 31267 122 114 aa, chain + ## HITS:1 COG:no KEGG:Bcen2424_0921 NR:ns ## KEGG: Bcen2424_0921 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: B.cenocepacia_HI2424 # Pathway: not_defined # 9 113 17 121 132 62 32.0 5e-09 MAEREVQRLAKILGNSIVERRKILGMTQIELANRLDMAPDALSRIENGYVAPRFNRIEEM AKVLECSVAELFREKNDSLKTRSDCIADIISPLPAEKQEEVISLVSHFVSILRK >gi|316922225|gb|ADCP01000126.1| GENE 29 31382 - 32077 241 231 aa, chain + ## HITS:1 COG:no KEGG:DVU2032 NR:ns ## KEGG: DVU2032 # Name: not_defined # Def: ERF family protein # Organism: D.vulgaris # Pathway: not_defined # 1 230 1 236 237 259 58.0 4e-68 MESLCSDQIRELAQALIKVQEQLQPATKDANNPFTKSRYATLNSVMDSCRDALLSNGIWL CQYPIPAEPGYLGLVTKLTHAESGQWQSSLAVVPLPKADPQGVGISMTYMRRYALSAMLG 
IVTEEDTDGEFNSDRLNRPQRQKNAVNAPQRGKTTQDDSGQAKKILSASNRTPDGLSKLP HLDGISYEIVTAQDGRECILATGNTAAKKEQLSAAGFHWNPQRKIWWKYAA >gi|316922225|gb|ADCP01000126.1| GENE 30 32140 - 33318 226 392 aa, chain + ## HITS:1 COG:no KEGG:LI0183 NR:ns ## KEGG: LI0183 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 7 391 4 392 400 283 39.0 6e-75 MDGTIRNTGAEGVLEILCCGLERDAEQRTANRLGDRSRYVGLSDIAKMLGCPRAALAAKL CLPEYKNAGEALKHQLLLQRGHWFETGVHQALMRYALSPLSQLEIEISHEDVPIKAHLDF TLISTQPQPTVRILEVKSTAKLPATLSESYAMQIGGQTALLKAYWNLPVFNLVQDTGEVL HHRTFLEICNECLGVSLPDASACDIQGWVLCLSMCDAKAFGPFLPENMDFARCLDMASEF WEAMNDLKENRLNLNTIRTAQELAPLCPSCFWKEDCPRFKGSSHPEWDAALAQLMDLKVQ KKSLDEAIGELEVRLKVAYQLSHTVRGEWINTGNHTFRVIPQSGRVTLDRKRLAEELKNL LGEQKAQTLMARCEKQGEPFGRLYAVQTANEF >gi|316922225|gb|ADCP01000126.1| GENE 31 33777 - 34754 604 325 aa, chain + ## HITS:1 COG:AGl2407 KEGG:ns NR:ns ## COG: AGl2407 COG0714 # Protein_GI_number: 15891317 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 8 323 16 325 338 121 32.0 2e-27 MNQNLFKELASLVPMQLDAGQVFTGMPTGKTVKGYKDACLYTPSIDPGYIFHESARDVVV WFMHDLDPLYLYGPTGCGKTTLIKQLAARLNYPTFEVTGHGRLEFADLCGHISLQKGNMV YEYGPLPLAMRYGGILLINEIDLLSPDVAAGLNGVLERGVLCLPENGGELIEPSPMFRFA CTANTNGGGDDTGLYQGTTRQNIAWLDRFMLCEVGYPAPETEKELLSRQFPNLPEHILSN MVSFANEVRKLFMGDADSYENTIEVTLSTRTLIRWADLTLKFQPLANQGIEPLSYALDRS LAFKATRSTRAMLHELVQRIFAVTF >gi|316922225|gb|ADCP01000126.1| GENE 32 34771 - 35766 921 331 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2612 NR:ns ## KEGG: Dvul_2612 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 7 331 3 331 331 355 52.0 1e-96 MSTIIRSDIRILDSLLALNLNISVWSARKKMCLEDFGGAELPPEDLASLGSKRIADPDSL KIFTTLKARAFNYLDRHGIRFMSGWAIPEDKAGDIIDELIQIRETFLHEKETFLADYDQS IENWINRHCKWGNIIRESTVGLDYVRSRLSFSWQLYKVSPLTDHDNPNAVCESGLNEEVE GLAGTLFNEIANSATEIWQKVYAGKDTVTHKALSPLKTLHQKLCGLTFVEPHVAPVASLI 
QTAINSIPAKGNITGKDILLLQGVVSMLRDPSSMLQHSQRLIEGHSPQDVMSALLANDVF AVCQSPDIPAEISLPPVHQRQSANIPNIGLW >gi|316922225|gb|ADCP01000126.1| GENE 33 35770 - 37341 189 523 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2614 NR:ns ## KEGG: Dvul_2614 # Name: not_defined # Def: von Willebrand factor, type A # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 517 1 531 533 350 36.0 1e-94 MIRTKDIMSCLPLLACILGKQYNITVEIGGTQAYTDGRTIHIPALKMDTDDIFIKMTRSY LDHEAAHIRYTDFQLLQQANLRRLQFHIFNIIEDWRVETLLGRHFPGCRKNFDFIIAYLF GKERQKAGSSAPVFFILEYILLTVRSWDSSEVEKNRTLSRKEMVMACPKIEKELGACLET IHANTRTTQDAIAHALLLESIIKKWIPEQSQGSTSPTENQDSPKATQGIISGEETELQNS IEDIFPKTLGTILKERLSVQAGDTETEHCTVAKPRNITPDVIPPDMLRNIDRITKGLSVR LQGLMQSLSLSAPYPSTRGRLNTAKLFRIKTGNPKVFIQKTEAVAINTSLHILLDASASM YGKRMELATASCHAIASACSGIRGLNITITAFNGNHRGDACSVYPLLKSGQPVHARINLM PSGGTPLAPALWWVMQQLLFTREQRKMLLVLTDGQPHDMNATQKAIETASKIGLEVYGLG MLDRSIGDFLLDTSRVICRLEKLPAMLFELLHDVLTRKSRPCL >gi|316922225|gb|ADCP01000126.1| GENE 34 37668 - 37808 105 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEKLLYFVGGIIVGSISTLAALGFTEEELDEVGECGGNFGDVGNDE >gi|316922225|gb|ADCP01000126.1| GENE 35 37880 - 39316 490 478 aa, chain - ## HITS:1 COG:SMc02297 KEGG:ns NR:ns ## COG: SMc02297 COG0582 # Protein_GI_number: 15964353 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Sinorhizobium meliloti # 32 475 97 494 513 75 25.0 2e-13 MQDNISQQFGITKENIKSVGTFLFNSTLKKDLSPASLTLLLPSFLREQKGTLSVKMFSEK AAPVRKDPIQDKDPAVRQEGNILTRSAEKTRRRIRLEGKKELPSLRDAFDAYVKAKTLTW SAASAKDIPPQVRQFVEIVRELEHGRDICVDELSRAHIRSYFDTLKHLPCRLCGQRQYAG KGWLQLADMGRSGQIERLLSVKTMEVRQTNVRSFVNWCELEYRGAVQAKYVNSGFPKVLS DKDIRRKGVKREAFTCDELHALFGDMEQYTKATEVVSSRFWAPLIALYSGMRLEEICQLH LSDIVKVDGVLCFSINEESGGSGYVKHVKSSAGIRKVPVHPYLWEKIGLGKFASFRWEQT KKEKHRSALLFPDLQERVNTVNHATVKLGSALTHWFTRYRRSVGVGGQHGETSTKAFHSF RHTVIEYLHKEARVDLSMLQSVVGHEMIDMGVTENYAGDWPIRALLTDVIAKLNWNIG >gi|316922225|gb|ADCP01000126.1| GENE 36 39900 - 40922 322 340 aa, chain + ## HITS:1 COG:aq_1543 KEGG:ns NR:ns ## COG: aq_1543 COG0859 # 
Protein_GI_number: 15606683 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Aquifex aeolicus # 16 334 5 300 312 100 26.0 4e-21 MTTSSFHDASAPTEWVVIRLSALGDVALTTGVLEYWHRTRGWTFTVITREAFAPVLERHP AVCSIVGLAKEDLRFPRQLGVFRDLAEAHAGQGLLDLHGTLRTRLLSLLWKGPVKRYRKL GLERRLFLRSKGRLFRNELRLWNVPQRYALAVEATPPPRAALLPHIWLSDEEIRQGQRLL EQLPDKAGTPPIALHPYATHPDKAWKEAYWRQLMELFESRGIPWIVIGRGSCMEGIPASR NFTNRTSLRETCALLKSSSLLITGDSGPMHLAGAVGTPVLALFGPTTEEWGFFPAGPKDR VLESQLDCRPCTLHGKKHCDRNHACMQSITPEQVMRALDE Prediction of potential genes in microbial genomes Time: Fri May 13 04:13:00 2011 Seq name: gi|316922197|gb|ADCP01000127.1| Bilophila wadsworthia 3_1_6 cont1.127, whole genome shotgun sequence Length of sequence - 35766 bp Number of predicted genes - 28, with homology - 26 Number of transcription units - 23, operones - 5 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 175 - 999 582 ## COG0169 Shikimate 5-dehydrogenase - Prom 1132 - 1191 5.9 - Term 1657 - 1700 0.3 2 2 Tu 1 . - CDS 1862 - 4537 3201 ## COG0342 Preprotein translocase subunit SecD - Term 4557 - 4598 1.1 3 3 Tu 1 . - CDS 4724 - 5296 372 ## - Prom 5410 - 5469 2.3 + Prom 5366 - 5425 5.1 4 4 Tu 1 . + CDS 5491 - 5739 323 ## DvMF_0268 hydrogenase assembly chaperone HypC/HupF + Term 5808 - 5832 -0.3 5 5 Op 1 . - CDS 6075 - 7355 690 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 6 5 Op 2 . - CDS 7352 - 7897 785 ## DP0596 DctQ (C4-dicarboxylate permease, small subunit) 7 6 Tu 1 . - CDS 8000 - 9007 1399 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component 8 7 Tu 1 . - CDS 9169 - 10812 1953 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 11036 - 11095 5.7 9 8 Tu 1 . - CDS 11097 - 11246 58 ## 10 9 Tu 1 . + CDS 11175 - 12308 1544 ## COG0686 Alanine dehydrogenase + Term 12323 - 12369 13.1 + Prom 12388 - 12447 1.5 11 10 Op 1 . 
+ CDS 12469 - 13839 1883 ## COG0161 Adenosylmethionine-8-amino-7-oxononanoate aminotransferase + Term 13851 - 13892 10.2 12 10 Op 2 . + CDS 13967 - 15130 1608 ## COG1454 Alcohol dehydrogenase, class IV + Term 15155 - 15209 14.5 - Term 15237 - 15290 8.1 13 11 Tu 1 . - CDS 15322 - 16110 735 ## COG0778 Nitroreductase - Prom 16251 - 16310 1.7 14 12 Tu 1 . - CDS 16529 - 19453 3664 ## COG0178 Excinuclease ATPase subunit 15 13 Op 1 . - CDS 19629 - 20330 1102 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump 16 13 Op 2 . - CDS 20327 - 20818 730 ## COG0680 Ni,Fe-hydrogenase maturation factor - Term 21275 - 21310 -0.2 17 14 Tu 1 . - CDS 21319 - 23244 3010 ## COG0326 Molecular chaperone, HSP90 family - Prom 23332 - 23391 1.6 - Term 23355 - 23421 29.4 18 15 Tu 1 . - CDS 23528 - 24331 788 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III - Prom 24568 - 24627 3.2 - Term 24516 - 24544 1.0 19 16 Op 1 . - CDS 24785 - 25855 1602 ## COG0216 Protein chain release factor A 20 16 Op 2 . - CDS 25855 - 26904 1175 ## COG3872 Predicted metal-dependent enzyme 21 17 Tu 1 . - CDS 27070 - 27279 298 ## PROTEIN SUPPORTED gi|78358022|ref|YP_389471.1| 50S ribosomal protein L31 - Prom 27452 - 27511 3.9 + Prom 27392 - 27451 8.2 22 18 Tu 1 . + CDS 27646 - 27921 476 ## COG0776 Bacterial nucleoid DNA-binding protein + Term 27984 - 28017 5.2 + Prom 28089 - 28148 1.8 23 19 Tu 1 . + CDS 28201 - 28923 846 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump + Term 29161 - 29192 -1.0 24 20 Tu 1 . - CDS 29017 - 29793 919 ## COG4106 Trans-aconitate methyltransferase - Prom 29843 - 29902 2.0 - Term 29876 - 29914 4.1 25 21 Op 1 . - CDS 29985 - 31133 1868 ## COG0538 Isocitrate dehydrogenases 26 21 Op 2 . - CDS 31208 - 32320 1288 ## COG1194 A/G-specific DNA glycosylase + Prom 32546 - 32605 3.6 27 22 Tu 1 . + CDS 32627 - 33133 913 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family 28 23 Tu 1 . 
+ CDS 33484 - 35457 2236 ## COG0826 Collagenase and related proteases + Term 35628 - 35678 18.4 Predicted protein(s) >gi|316922197|gb|ADCP01000127.1| GENE 1 175 - 999 582 274 aa, chain - ## HITS:1 COG:slr1559 KEGG:ns NR:ns ## COG: slr1559 COG0169 # Protein_GI_number: 16332159 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Synechocystis # 9 260 10 274 290 177 39.0 2e-44 MSTFTPPALHGIIGYPLGHSMSPLMHNTAFRTLNLPGVYLSWPIEPGRLPAFVDAARLLN IRGCSITIPHKIDIIPLLDKTSENVREMGACNTLYRDGEKICGENTDVIGFTAPLRKRSL SPETRVLVLGAGGVSRAAIAGLRRLGLANITLTNRRKERAEALCNEFGLLCAPWEERGDV PADLIINTTPLGMTGAQEHETPYPRAFQDNGIAYDLIYTPFQTRFLRDAAAAGWETVSGL DMFIGQGDAQFHLWTGHHLPEEAITAVTSALYGQ >gi|316922197|gb|ADCP01000127.1| GENE 2 1862 - 4537 3201 891 aa, chain - ## HITS:1 COG:RC0894 KEGG:ns NR:ns ## COG: RC0894 COG0342 # Protein_GI_number: 15892817 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Rickettsia conorii # 23 519 25 517 518 307 36.0 7e-83 MKLSFRYAIAAIVMAGALFLLSPTLMPEDVNLPGASRISLGLDLKGGVQLTLGVDTEKAV QSALINSGQVLRQKAREAGITVLGPRQMPGGVQELIIARANQVDAFLALAKKDAPQVVPG TPRTGADGAVHLPYSFTPAFVKQTQDLAVDQVLRTISSRIDQFGVAEPDIRKQQGDRILI QLPGLTNVNRAVQLVEKSADLSFHLVRDDISQGYLPPGVAFYPMTGKHGEATNAKLALDS TPLLSGKEVADARPGFDNKNTSMVSLSFTPRGAALFEMATGEHVGKRLAIVLDGTIHSAP VIQEKIAGGTASISGQFTPEEAQDLAISLRSGSLAAPVHVLEQRTVGPALGEASITSGIL AALIGTLAVMIIMPLRYGWSGMLANGMLLCTLSLLMGGMAALGATLTLPGIAGVVLTIGM AVDANVLIFERIREELALGLTPPKAIKAGFERANLSIVDSNLTTIIVAAILYQFGTGPVR GFAVTLTLGIIASMFSAIFLCRIAFDAWMKHTNGTRLSMHGPMELRALSGYLSHFPFLKR VKHVGVLTLLLVMVAVSVGTWRGGLQYGVDFSGGVAAQVRFEHPVSDKQLIGALDPMHLE GLMTQQYGDNNSVWLLRFGLPDMPAQQLGEALLSHLQTVDDGGAVSLDRLETVGPKVGDD LRSSALEAIYYALLLITVYISGRFEQRWGTAALLAGALTATMIALQWMGVPTEGRILAAL LLTLFLCWRFRLSFAAGAMASLIFDVTTTTSLLVILGQEIDLNIVAALLTIIGYSLNDTI VIYDRIRETLRRENPDSPRPLPEIIQEALGDTMSRTLLTAGTTLVAALALFLLGGPVIYG FALTMLIGVFMGTISSLFVAAPMLPLFGDTLSFKTAVTLGVFERPGEHGVV >gi|316922197|gb|ADCP01000127.1| 
GENE 3 4724 - 5296 372 190 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNTLKPIKDATTSFSRSVLGTLLFAFGLFLCSSFSAQAHIDNDVSRQFEALLEAAPSADV FKAQHSELEGNDIGPQLQRFDAVDEDSALVEASSGWRQLDAALARSERNLFASISETLNG NISQTTARTPKSVTDKPLPHSSGYAAALVSAPVLPQISVTPVALPIGTPLCFLPAERGLV SYPTPPPYGC >gi|316922197|gb|ADCP01000127.1| GENE 4 5491 - 5739 323 82 aa, chain + ## HITS:1 COG:no KEGG:DvMF_0268 NR:ns ## KEGG: DvMF_0268 # Name: not_defined # Def: hydrogenase assembly chaperone HypC/HupF # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 82 1 83 83 87 55.0 1e-16 MCLAIPAKVIEKSDDGMLKATVGNGPTCLTVSGILLPEEVEIGDYIIIHAGFAMHKMEKT EAEESLRLFRELAQATGQEATF >gi|316922197|gb|ADCP01000127.1| GENE 5 6075 - 7355 690 426 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 6 422 7 426 435 270 35 9e-72 MSPAGVLFATFAVLMLLGSPITVALGVAAMAALYTVDQNLVTLVQIAFTSVNSFPIMALP AFMLVGALMECAGISRRLVAIAENIVGPIPGGLAVSTALACVFFGAISGSGPATTAAVGM LMIPAMIRRGFSKGYAAAAAATAGGVGIIIPPSIPMVVYAVSGQQSITKMFLAGVMPGIM IAVGLAAMHLFLCRNMPVAEDLQSGSGDSLLKALKDGLWSLMAPVVILGGIYGGLFTPTE AAVVAIFYSIFVGLFIHRELTFGGIRHSLQITSWMTGRVLIIMFTAYAFERLLVQYRIPD MIVEHILGFTSDVAVIWMFVIALLLFLGMFMETLAIILLATPVLLPVMTAFGVDPIHFGV VLVCCCGVGFSTPPLGENIFIASGIADTTLEDISYNALPFVFVTVGVIILCVFVPDIVLF LPRMMY >gi|316922197|gb|ADCP01000127.1| GENE 6 7352 - 7897 785 181 aa, chain - ## HITS:1 COG:no KEGG:DP0596 NR:ns ## KEGG: DP0596 # Name: not_defined # Def: DctQ (C4-dicarboxylate permease, small subunit) # Organism: D.psychrophila # Pathway: not_defined # 1 175 9 183 202 224 64.0 1e-57 MLKKVLLAILDNFEGYVSQVLLAFFVLILLLQIFLRQFGHPLYWTEELARYSFVWFVFFG ASYAARLAAHNRVTIQFRPFPKWVGDACMLLSDGVWLFFNIVMVVESLTVIRELREFPYA TPALDWQLSYIYFIFPIAFTLMSARIIQVNVMKFVLKKELTAPDRIETEETRKAFMDGGK P >gi|316922197|gb|ADCP01000127.1| GENE 7 8000 - 9007 1399 335 aa, chain - ## HITS:1 COG:SMc00271 KEGG:ns NR:ns ## COG: SMc00271 COG1638 # Protein_GI_number: 15965452 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type 
C4-dicarboxylate transport system, periplasmic component # Organism: Sinorhizobium meliloti # 2 295 3 289 325 153 32.0 4e-37 MLKSLLAAVVTLALLTCPFPARAYEKTILKVGMGDPIDSEMGAIGTRFKEVVEARSEGKV EVQLFPSGQLGDETEMIQNVRAGNLDIAVVGIANTVPFVKKLGILTMPYLFDDMYDVVRA TTGPAHDLLNGYAIREGGFRILGWTYTDYRYISNSRKPIKNLNDIKGLKFRVPQSAILLA CYKAWGANPVPISWAETFTALQQGLVDGQCYGYITFLACKFNEVQKYITEVHYTYQLQPM ILSQRAFRKMSPEMQTLITDAGRDAQEYCLAFQLVEGVRARQRLIESGVQIDQLEDEADW RSAAISQVWPEMETFVGGRAAINAFLSAIGKKPWK >gi|316922197|gb|ADCP01000127.1| GENE 8 9169 - 10812 1953 547 aa, chain - ## HITS:1 COG:BH0992 KEGG:ns NR:ns ## COG: BH0992 COG3829 # Protein_GI_number: 15613555 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 97 537 20 452 454 322 41.0 1e-87 MLGEILKLSPGRMDDEPFLPRLTRCGFDEVEESLSEPHGGELIAHHKEGSMYIIRWKSVS IDDSAGGEQVRLITMSKIPVDQINRVAELPWQELTVLLDSIHDGIWVIDSDGITLRVNKA MERIAGLRAEEVIGKHVTEPMHKGRFETCVTLRALIEKRSVTMFDDYSNGKRCLNTSTPI FDEKGNVWRVIASIRDMTELETLQRKLTDLEMETLAYKARLENLETEMDAGFVGHSAPMR RLRKEASKAARTEAITLILGETGTGKTLTAKAIHDMGQRSAEPFIAVNCGAIPMSLMESE LFGYEKGAFTGAAKSGKPGMFELAHKGTLLLDEIGELPLPMQAKLLQVLDGHPFHRVGGT KPITVDVRVIAATNKPLADMAASGQFREDLFYRLRVLTVEIPPLRERPDDIPVLAMHFLQ EIIKKSGLQKNFDPQVLNCFLTYRWPGNVRELRALVQSLATMSEEETITMQDLPQYMQAQ SPLPQGSVSFRQPMREAVAELERNMIMTALTETGSTYKAARRLKVSQSTIVRKAQRYKIG LVEIAHK >gi|316922197|gb|ADCP01000127.1| GENE 9 11097 - 11246 58 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNASGSDADTEFLNFDFGGDTYTHLSASESSMLKYVLKFRVGLYSMPSL >gi|316922197|gb|ADCP01000127.1| GENE 10 11175 - 12308 1544 377 aa, chain + ## HITS:1 COG:SA1531 KEGG:ns NR:ns ## COG: SA1531 COG0686 # Protein_GI_number: 15927286 # Func_class: E Amino acid transport and metabolism # Function: Alanine dehydrogenase # Organism: Staphylococcus aureus N315 # 1 368 1 362 372 401 57.0 1e-112 MRVGIPTEIKVQEFRVGITPAGVHALKEAGHTVLVQKGAGLGSMITDEEYVAAGAQMVAT 
AKECWDCDMVVKVKEPLAPEYDLFHEGLILYTYLHLAPEPALTKALLEKKVIGIAYETVQ FDNGFLPLLAPMSEVAGRMATQVGAQMLTKIEGGMGLLMGGTAGVQAAHVVILGAGTVGL SAAKVAMGMGARVTILDSNLFRLRQIDDLFGGRIQTLASNAFNIAAATKDADLLVGSVLI PGALTPKLVTEAMVKTMKPGSAIVDVAIDQGGCIEPTAKHGATYHDKPTFKYPVNGGEVV CYSVGNMPGAVARTSTFTLTNATMPYMVDLANKGWKKACQDDKALARGINTYDGKVYFKG VSDALGYELHCTCDILK >gi|316922197|gb|ADCP01000127.1| GENE 11 12469 - 13839 1883 456 aa, chain + ## HITS:1 COG:BS_yhxA KEGG:ns NR:ns ## COG: BS_yhxA COG0161 # Protein_GI_number: 16077991 # Func_class: H Coenzyme transport and metabolism # Function: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase # Organism: Bacillus subtilis # 12 450 19 450 450 350 41.0 2e-96 MTYDKAELVALDKKYVWHHLTQHKNFEPAIYVKGEGMRITDIDGKTYLDAVSGGVWTVNV GYGRKEIVDAVAKQMMEMCYFANGIGNVPTIKFSEKLISKMPGMSRVYLSNSGSEANEKA FKIVRQIGQLKHGGKKTGILYRARDYHGTTIGTLSACGQFERKVQYGPFAPGFYEFPDCD VYRSKFGDCADLGVKMAKQLEEVILTVGPDELGAVIVEPMTAGGGILVPPAGYYETIREI CDKYELLLIIDEVVCGLGRTGKWFGYQHFNVQPDIVTMAKGVASGYAPISCTVTTEKVFQ DFVNDPADTDAYFRDISTFGGCTSGPAAALANIEIIERENLLENCTKMGDRLLEGLKGLM AKHPIIGDVRGKGLFAGIEIVKDRATKEPIAEAVANAMVGAAKQAGVLIGKTSRSFREFN NTLTLCPALIATEADIDEIVAGIDKAFTTVEQKFGL >gi|316922197|gb|ADCP01000127.1| GENE 12 13967 - 15130 1608 387 aa, chain + ## HITS:1 COG:AGpA199 KEGG:ns NR:ns ## COG: AGpA199 COG1454 # Protein_GI_number: 16119364 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 28 385 45 399 399 215 37.0 9e-56 MAFVHYTVKKIVHGLGAIKEAANEVKNLKGSKAFIVTDPGLAKIGVQKPLEEALTAGGIE WKLYAEAQLEPSMDSIQHCTDEAKAFGADVIIGFGGGSALDTTKAASVLLSNEGPIDKYF GINLVPNPSLPCILIPTTSGTGSEMTNISVLADTKNGGKKGVVSEYMYADTVILDAELTF GLPPRVTAMTGVDAFVHAMESFCGIAATPITDALNLQAMKLVGANIRQAYANGKNAAARD AMMYASALAGMGFGNTQNGIIHAIGTTLPVECHIPHGLAMSFCAPFSVGFNYIANPEKYA IVADILRGDDRSGCMSVMDRAADVEDAFRDLLNDLDIATGLSNYGVKREDLPACADRAFA AKRLLNNNPRAASRDQILALLEANFEA >gi|316922197|gb|ADCP01000127.1| GENE 13 15322 - 16110 735 262 aa, chain - ## HITS:1 COG:MA2742_2 KEGG:ns NR:ns ## 
COG: MA2742_2 COG0778 # Protein_GI_number: 20091565 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 87 259 1 182 184 121 36.0 1e-27 MVSLELNQDHCIRCGRCISVCPQRILGRHTNGSVDVLHGALARCIRCGHCVAVCPKAALT LEHIAPSSLPLVEDAPLSDLQRDMLFKTRRSTRAYKDEPVDRNVLLKALEEARYAPTASN CEEVAWLLVEGRDRLHDLASRVADWMSTLTGKYSHVASAFRAGQDPILRGAPSLILAHGD ANMPWNALDCAAAVSYLELALHSYGIGTCWSGFVIAAASNGVDLGIPLPEGRKICGGLMI GYPAVQYARVPPRKPVRLTVIE >gi|316922197|gb|ADCP01000127.1| GENE 14 16529 - 19453 3664 974 aa, chain - ## HITS:1 COG:VC0394 KEGG:ns NR:ns ## COG: VC0394 COG0178 # Protein_GI_number: 15640421 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Vibrio cholerae # 6 961 4 935 940 942 50.0 0 MSHSSIHIEGARQHNLKNLSLDIPRDELVVVCGPSGSGKSTLAFDIVYAEGQRRYVESLS AYARQFLPQMDKPDVDKIEGLSPAISLEQQTTGRNPRSTVGTVTEVYDFLRVFFARLGKM YCPQCGRPIEARAADEIIADILALPEGTKVIIMAPLVELQKGTHLDRFKKLKAEGFVRVR VDGQLYGMEDIPALDKNKKHTIDLVIDRLVIKDGIRGRLADSVELALRYGEGRIIVNEPD KGADGDTLHATESVCPVCKISLPAPSPQLFSFNSPQGACPRCAGLGTVEYFEPMLIAPNR GLSLNTKAILPWANPKTFARYEEALTALGKRFGFTLSTPLSAYSPEALQALFYGEDDPSQ KSKSGLRRNRLGGSVALESPEYDDNAPAPVKAYDGPYKLATDNWPGVIPLLERGMQYGDM WRDLLSRYLQSMDCPTCHGARLRPESLAVRVDDLNIHQFCSLPVERALRWLNGREFDGRH ALVAEPLLKELNHRLSFMTNVGLDYISLGRTMTTLSGGESQRIRLASQLGSGLVGVTYVL DEPSIGLHPRDNERLIATLRSLQGRGNTVLVVEHDEATIREADHIIELGPGSGAHGGDMV YHGSFENLIKHSETLTAKYMRGDLSVPIPDERREPKGWLTLRGVTTNNLKDIDCPIPLGT LTCVTGVSGSGKSSLVVDTLYKHLALAQGIRVDQPGSIRGIDGVEAIERIVAIDQTPIGR TPRSNPATYTKIFDEIRNIFAMTPDARKRGYQPGRFSFNVKGGRCEACSGDGQIRVEMHF LPDVYVTCDVCGGKRYNHETLEVRYRGLNISEVLDMPVREARQFFSNYPVLERRLAVLED VGLDYIRLGQPATTLSGGEAQRIKISRELGKRSLPGTMYILDEPTTGLHMHEVGKLVTVL HHLVELGATVVVIEHNTDVILASDYVIDLGPGGGENGGRIVAAGTPEAIMADPNSVTGKF LTEERRTRRKLGDG >gi|316922197|gb|ADCP01000127.1| GENE 15 19629 - 20330 1102 233 aa, chain - ## HITS:1 COG:AF1248 KEGG:ns NR:ns ## COG: AF1248 COG1811 # Protein_GI_number: 11498847 # Func_class: R General function prediction only # 
Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Archaeoglobus fulgidus # 5 217 4 214 228 149 46.0 3e-36 MIPLGAIINALGITIGSLVGLAFGARLPERVRAIVFQGLGLCVLVIGFKMALLTQNPLIV IFSIVIGSVAGEMIGLEARLIRVGDWLKGRLKSSNPLFTEGMVNASVLFCIGAMAIIGSF DEGLRGDRAVVLSKTIIDSFAALAMASAYGLGVLFSAIPVLIYQGALTLLAGSLQSWLDP ATMTELTAVGGTLIIGIGLNMLEITRISLSNMLPALLAVVGLCALTASFAGTI >gi|316922197|gb|ADCP01000127.1| GENE 16 20327 - 20818 730 163 aa, chain - ## HITS:1 COG:aq_667 KEGG:ns NR:ns ## COG: aq_667 COG0680 # Protein_GI_number: 15606081 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase maturation factor # Organism: Aquifex aeolicus # 4 152 2 150 162 92 34.0 4e-19 MNKLLVLGIGNMLLTDDGVGVFASQELMKEDWPEQVTIREGGTFTQDIFYSFKGYSHLLV LDVVHCGGKPGTLYRLTEDALIKDDKQRLSIHDIDLIDSLLMAEKLFGTKSELLVLGVEP KDYVTWNIGLSDDVRAVFPEFLALARKEIGEWLKRYGNGETRA >gi|316922197|gb|ADCP01000127.1| GENE 17 21319 - 23244 3010 641 aa, chain - ## HITS:1 COG:STM0487 KEGG:ns NR:ns ## COG: STM0487 COG0326 # Protein_GI_number: 16763867 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Salmonella typhimurium LT2 # 10 628 17 630 632 412 38.0 1e-115 MSEQTTNHEFRAETRKVLNILTHSLYSNREIFLRELISNASDALDKLRFLQSQTQHEAIR DAELPLEIRISSDKVQNCLTITDTGIGMTRDEMIENLGTIAHSGSEAFLKEHTQTAEDGT TSDASNIIGRFGIGFYSVFMVADKVEVTSLSATGDGPAYRWISDGTGTFTVEEAPDQADI KRGTTIRAFIKEGDKEFLEKYRIEGIIRTHSSFIPFPILVDGERVNTTPALWREPKFSIK KEQYDEFYKFLTYDQKDPLDVLHLSVDAPVQFNALLFIPDITKDYFGAYREHWGLDLYAR RVLIQRENKELIPDYLAFLKGVVDTEDLPLNISRETLQENIVLRKISQTISKQILSHLEK MAKDDAEKYNSFWKLHGKVFKLGYSDFVNRDRITPLLRFNSSAMEDADGLTSLDDYISRA RESQKEIWYVAAPSREAAKVNPHSEVFRRKGLEVLYLYEPVDEFALETLAKYKDYTFKAV EHADSAVLDAFADTDEQPKAAPLSNEDMNEFDKLLETIRKVLGDQIKDARVSHRLADSPA CLVSPDDGVTSSMERLMRVMQKDDSIPQKIFEINRDHPLLRTMLKLSKADPDDPQLAEMI QSLFDSTLLLDGYIKDPHALASRATKLLEQASAWYADVKKL >gi|316922197|gb|ADCP01000127.1| GENE 18 23528 - 24331 788 267 aa, chain - ## HITS:1 COG:CAC1330 KEGG:ns NR:ns ## COG: 
CAC1330 COG1234 # Protein_GI_number: 15894609 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 1 267 1 267 268 281 48.0 9e-76 MEKLIVLGTGNAGATRCYNTCFILLDGKEPLLVDAGGGNAIFTQLERAGIRPSDIHHAFL THCHTDHLFGMIWMLRSVAQNIRGGSYEGTFTLYCHDELAGVVHSVADMTLPASMTQYIG DRILIVPVGDGEERPFGSYRATFFDIGSTKAKQFGFMLGLHSGKKLSCLGDEPCTPRGRK YAAGSGWLLSEAFCLYADRDRFKPYEKHHSTVREACETATELGVENLVLWHTEDTRLPER KTLYTAEGRQYFSGNLFVPDDLESIAL >gi|316922197|gb|ADCP01000127.1| GENE 19 24785 - 25855 1602 356 aa, chain - ## HITS:1 COG:CAC2884 KEGG:ns NR:ns ## COG: CAC2884 COG0216 # Protein_GI_number: 15896138 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Clostridium acetobutylicum # 1 356 1 357 359 398 54.0 1e-111 MFAKLESLERKFLDLEQELADPTVFNDQDRYRKLTKAHSDLKPVVDIFRQYRELKQNLAD NQELLNDADPDIREMAHEEIKSIEEKLPELEHDLKVLLLPTDPLDEKNIILEIRAGTGGE EAALFGGDLFRMYCRYAEKMGWKVEIMSQSESDNGGYKEVIALISGDKVYSRLKFESGTH RVQRVPATESQGRIHTSAATVAIMPEAEEVDLDLKPEDLRFDVYRSSGPGGQSVNTTDSA VRVTHIPTGLVVCCQDEKSQHKNKAKGLKILCSRLLQMKQDEQNAALADQRRAQVGTGDR SERIRTYNFPQGRVTDHRINLTLYSLGKVLEGEIQELVDALVTHAQTEALKAQASS >gi|316922197|gb|ADCP01000127.1| GENE 20 25855 - 26904 1175 349 aa, chain - ## HITS:1 COG:FN1092 KEGG:ns NR:ns ## COG: FN1092 COG3872 # Protein_GI_number: 19704427 # Func_class: R General function prediction only # Function: Predicted metal-dependent enzyme # Organism: Fusobacterium nucleatum # 37 334 8 294 304 211 40.0 2e-54 MSTRTGRPAATSVSSARSGKRFPMSLGRLLPFLAPAPLVGGQAVMEGVMMRNGDVYALAV RTADGNISVENRPWFSLTRSELLKKPFLRGFPTLIETLVNGIKALNLSAERSTEGTGEEL KDWQLVLTLIVSLLFAVGLFVVVPHLLSIIMNWIGLGGDVEGFSFHIWDGLFKFLIFIGY IVGIAFLPDIRRVFCYHGAEHKTIHAYESGDTVTPESAIRFSRLHPRCGTTFLLFVMSIA ILLHTVLVPLLLLVWTPDSAVAKHLFTIAFKLLLMVPISALSYELIRYAARLGDGFWGRI LRAPGMFLQLLTTREPELDQVEVAVAALASAVGQHPPRLETPPTCNRSH >gi|316922197|gb|ADCP01000127.1| GENE 21 27070 - 27279 298 69 aa, chain - ## PROTEIN 
SUPPORTED ## NR: gi|78358022|ref|YP_389471.1| 50S ribosomal protein L31 [Desulfovibrio desulfuricans subsp. desulfuricans str. G20] # 1 69 1 69 69 119 73 3e-26 MKEKIHPKVYKAKITCACGNQVELLSTKGETVHVEICSACHPFFTGKQRFLDTAGRIDRF RKKYANLNK >gi|316922197|gb|ADCP01000127.1| GENE 22 27646 - 27921 476 91 aa, chain + ## HITS:1 COG:YPO3154 KEGG:ns NR:ns ## COG: YPO3154 COG0776 # Protein_GI_number: 16123316 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Yersinia pestis # 1 90 1 89 90 75 50.0 3e-14 MTKADLVDKIHAKAGLGTKAATEQFLDAAISVLSETIANGEQVSFTGFGSFKVVERSERK GRNPRTGEECLIPASRVVKFTPGKALKEAVK >gi|316922197|gb|ADCP01000127.1| GENE 23 28201 - 28923 846 240 aa, chain + ## HITS:1 COG:STM3115 KEGG:ns NR:ns ## COG: STM3115 COG1811 # Protein_GI_number: 16766416 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Salmonella typhimurium LT2 # 1 236 2 230 235 139 37.0 4e-33 MIGPVANSVCIVAGSILGALCGPVLSQEFRKKIMNVFSCITLGIGVTMLSKGDSLPPVVL SLLFGTALGELTRFESLVMKGAFKVAGLFKRNGKPKGGKSETLSETVFRDQFAVATVLFC VSGLGIIGSMREGLTGECSLLLIKGVLDLFTAMLISASIGSVVGVLAVPQCAIQFALLFG ATMLMPYTTPTMFADFSACGGIIMLGAGLRQMGLVQVPILSMLPGLFFVMPLSAVWKQIM >gi|316922197|gb|ADCP01000127.1| GENE 24 29017 - 29793 919 258 aa, chain - ## HITS:1 COG:ECs2126 KEGG:ns NR:ns ## COG: ECs2126 COG4106 # Protein_GI_number: 15831380 # Func_class: R General function prediction only # Function: Trans-aconitate methyltransferase # Organism: Escherichia coli O157:H7 # 3 254 4 248 252 193 41.0 3e-49 MQWDSAAYLRFKAERTQPSIDLVKRIDLEQPRKLLDVGCGPGNSTQVLADAFPNALRIIG IDSSPEMIEAVKDDHPDMEFRICDALNLPSLGEDGFDVVFSNACIQWVPDHPRLIRDMLA LLRPGGMLAVQVPMNYQEPIHRIIGELAASDEWRAELASARIFHTLSQEAYFDILAAEAG QFQMWQITYLHRMPSHEAIMDWYKSTGLRPYLSLLSGERAAAFEQAVLEEVAARYPKQGN GEIIFRFPRFFFTATPKA >gi|316922197|gb|ADCP01000127.1| GENE 25 29985 - 31133 1868 382 aa, chain - ## HITS:1 COG:TVN0195 KEGG:ns NR:ns ## COG: TVN0195 COG0538 # 
Protein_GI_number: 13541026 # Func_class: C Energy production and conversion # Function: Isocitrate dehydrogenases # Organism: Thermoplasma volcanium # 4 378 19 396 405 418 54.0 1e-117 MRKTVYWIEGDGIGPDVWKSARPVIDEAIRLSYGDERGFDWKELLAGEKALKETGTLLPD ETLAALRGAELAIKGPLGTPVGTGFRSLNVTLRQTLDLYACIRPIRYFEGIESPVKHPER VDMIVFRENTEDVYAGIEYKAGTPEAAKLIAFLRDELGANVDASAAVGIKPMTEKGSKRL VRAAIRHALAQNLPSVTLVHKGNIMKFTEGAFRQHGYDVAKEEFAAQTVVEAEAASAPGK LVIKDRIADAMFQEALIRPEQYSVLATPNLNGDYISDALAAQVGGLGLAPGVNMSEKLAF FEATHGTAPSLAGKDRANPGSLILSGALMLEHIGWKEAATRIYKAVNTAIARRAVTEDLA AQMTDARTVGCTEFGEIITKLL >gi|316922197|gb|ADCP01000127.1| GENE 26 31208 - 32320 1288 370 aa, chain - ## HITS:1 COG:BH0931 KEGG:ns NR:ns ## COG: BH0931 COG1194 # Protein_GI_number: 15613494 # Func_class: L Replication, recombination and repair # Function: A/G-specific DNA glycosylase # Organism: Bacillus halodurans # 8 356 11 354 372 254 37.0 1e-67 MNDTQRFTLFRKDLLRWFAENRRPLPWRADYTPYRTWIAEVMMQQTQMDRGVQYFLRWME RFPDVAAVAAAPEEDLLKAWEGLGYYRRARNIQAAARVIMERHGGNFPTSYADILALPGV GPYTAGAIASTAYNEEVPCVDGNVERVLSRVFDIDTPVKEEPAKSRIRELAQALIPKGEA RNFNQGLMELGALVCRKKPECERCPLAGLCESRHLGIQNERPVPGKKAAVTQIEVVCGVL LHEGKVFIQRRNEKDVWGGLWEFPGGCVEPGETPEQAVAREWMEEVGFKVAIVRPLDVIR HNYTTYRITLRCYQLRLEGKPKGCPVPEELAEATACQWIAPQDIEAFPLPAPHRKLADNC SLFDNPASGE >gi|316922197|gb|ADCP01000127.1| GENE 27 32627 - 33133 913 168 aa, chain + ## HITS:1 COG:ECs0587 KEGG:ns NR:ns ## COG: ECs0587 COG0652 # Protein_GI_number: 15829841 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Escherichia coli O157:H7 # 6 168 2 164 164 202 61.0 2e-52 MANPTVLLETTSGDILIELYADKAPATVENFLKYVNEGFYANTIFHRVIKGFMIQGGGMN MKMEEKATHAPIKNEADNGLANERGTIAMARTRDPHSATAQFFINTVDNGFLNFSSPDVN GYGYCVFGKVIEGMEAVDKIEKEKTTTRGIHSDVPVSAVLITNASVFE >gi|316922197|gb|ADCP01000127.1| GENE 28 33484 - 35457 2236 657 aa, chain + ## HITS:1 COG:MA0538 KEGG:ns NR:ns ## COG: MA0538 COG0826 # Protein_GI_number: 
20089427 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Methanosarcina acetivorans str.C2A # 4 313 5 321 855 187 37.0 6e-47 MTRKPEILAPAGDTQSFLAAMAAGADAVYLGLKHFSARMQADNFGSAELSRMVELANENG RRIYVAFNTLVKPDDVPAAGRLIARLARDVKPHGLIIQDPGVLALARQAGYEGGLFLSTL ANLTHPASLLAAKELGADRVILPRELSIDEVKQISAACPEGLDLEMFIHGALCWCVSGRC YWSSYMGGKSGLRGRCVQPCRRVYKQRNHSGRFFSCLDLSLDVLAKTITDIPHLSCWKIE GRKKGPHYVYHVVTAYKMLRDNPDDPQARKDAEKILEMALGRPSTRARFLPQRTSEPTSP DSQTSSGRLVGKIQTDAEGRPFFKPHIELISYDYLRVGYEDENWHSTLPVTRRTPKAGTF TLRLPRHKTPKAGTPVFLIDRREPELVQLIAEWKRKLDRCKGSEPTAVEFTPKMPKPAHP RRRPDMTLRASLPYGSETRRSRNSVIGLWLSPKSVREVSRTVAPNMAWWLPPVIWPDEEE LWKRIINEAIRNGAKHFVCNEPWQAAFFKGREVDLVAGPFCNAANAFTLASFAKLGFSAA FVSPELTKEDFLALPKTSPLPLGIVLAGFWPMGIGRHDPLGIKPNEVFQSPKGEGFWTRR YGQNTWIYPAWPLDITVHRPELEAAGYSFFARMEENPPKSMPAATRPGLFNWEGELL Prediction of potential genes in microbial genomes Time: Fri May 13 04:13:31 2011 Seq name: gi|316922193|gb|ADCP01000128.1| Bilophila wadsworthia 3_1_6 cont1.128, whole genome shotgun sequence Length of sequence - 3407 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 293 - 1618 1888 ## COG2252 Permeases - Term 1660 - 1699 3.3 2 2 Op 1 15/0.000 - CDS 1731 - 2261 498 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 3 2 Op 2 . 
- CDS 2255 - 3154 951 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs Predicted protein(s) >gi|316922193|gb|ADCP01000128.1| GENE 1 293 - 1618 1888 441 aa, chain - ## HITS:1 COG:MJ0326 KEGG:ns NR:ns ## COG: MJ0326 COG2252 # Protein_GI_number: 15668500 # Func_class: R General function prediction only # Function: Permeases # Organism: Methanococcus jannaschii # 8 441 6 435 436 370 49.0 1e-102 MAQTKGILERVFKLSEKGTTVKTEILAGLTTFVAMAYIIFVNPSILADAGIPKEAAIAAT VWSAVIGSAAMGLWANFPVAVAPGMGLNAFFAYYVCGVLGLHWTVALGAVFFSGIVFLLL TVTHVRQLLIDAVPMNLKYAIVVGIGMFIAFIGLQNAGIVVKNDATVVALGHVTQPGPLL ACCGLMLTAGLMARNVQGSLLIGILVTAVLGMVFGVSPAPSGIDSVMSFSLPSLAPTLMQ LDIMGALKYGIISIIFTFTIVELFDNMGTLIGLSRKARLMDDKGHIENLDKALVTDSVGT VVSSFLGTSTVTSYVESAAGISQGGRTGLTALTVAVLFAASLVFAPLIGLVPAFATAPAL LIVGALMMMEVMNIDFNDFTEGFPAFMTIIMMPLTYSIASGFGFGFVSYAAVKLLSGRAR EVSLFMWIITAMFVINFAMRS >gi|316922193|gb|ADCP01000128.1| GENE 2 1731 - 2261 498 176 aa, chain - ## HITS:1 COG:ygeU KEGG:ns NR:ns ## COG: ygeU COG2080 # Protein_GI_number: 16130770 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Escherichia coli K12 # 4 151 8 155 159 176 61.0 2e-44 MLKTIRCTVNGKARELSFDVRASLLEVLRAEGLTSSKQGCGVGECGACTVLVDNIPVDSC IFLAVWADGKTIRTAEGEAKGNRLSRVQQAYVDAGAVQCGFCTPGLVMSSTAFIEKHKGQ PVTREEIRRGHAGNLCRCTGYDAIIAAVESCLNDTPPQNTCACLRKPDGPCHHHQS >gi|316922193|gb|ADCP01000128.1| GENE 3 2255 - 3154 951 299 aa, chain - ## HITS:1 COG:ygeT KEGG:ns NR:ns ## COG: ygeT COG1319 # Protein_GI_number: 16130769 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli K12 # 1 290 1 287 292 282 49.0 6e-76 MFDIQSYQKAESVSDAIRLLGEDPEARLIAGGTDVLIKLREFEGFSRLVDIHDLPELKPI TREADGTVRVGSGASFTDLEESPIIRECIPMLGESAASVAGPQIRNMGTIGGNLCNGAVS ADTCAPVLALNGYLNIRGAEGERTIPALGFHTGPGRVALKRDEVLLSVEFRPEDWQGWGA 
AYHKYAMREAMDIATIGCAAAVRLDGTVISGIRLAYSVSAPIPVRCPSAEAAALGRSATP EALPATLAAISAAVEADVQPRTSWRANRDFRMHIIRTLAERVIARCVVRAVELQEAASC Prediction of potential genes in microbial genomes Time: Fri May 13 04:13:53 2011 Seq name: gi|316922154|gb|ADCP01000129.1| Bilophila wadsworthia 3_1_6 cont1.129, whole genome shotgun sequence Length of sequence - 43258 bp Number of predicted genes - 38, with homology - 36 Number of transcription units - 12, operones - 9 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 3 - 43 2.1 1 1 Op 1 6/0.000 - CDS 78 - 2411 2883 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 2 1 Op 2 . - CDS 2461 - 3336 948 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family - Prom 3472 - 3531 3.2 - Term 3580 - 3635 15.0 3 2 Op 1 1/0.000 - CDS 3649 - 6957 3765 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 4 2 Op 2 3/0.000 - CDS 7075 - 8445 1372 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 5 2 Op 3 . - CDS 8472 - 9743 1524 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases - Term 9770 - 9813 -0.5 6 2 Op 4 . - CDS 9820 - 12399 2885 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs - Prom 12534 - 12593 1.5 7 3 Op 1 8/0.000 - CDS 12768 - 13964 1657 ## COG0078 Ornithine carbamoyltransferase 8 3 Op 2 1/0.000 - CDS 13977 - 14939 1215 ## COG0549 Carbamate kinase - Term 14956 - 15005 21.1 9 3 Op 3 . - CDS 15075 - 16295 1634 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Prom 16370 - 16429 5.7 - Term 16616 - 16654 6.7 10 4 Op 1 . - CDS 16794 - 16916 90 ## 11 4 Op 2 . 
- CDS 16919 - 17860 1110 ## COG1045 Serine acetyltransferase 12 4 Op 3 16/0.000 - CDS 17878 - 19275 1635 ## COG0305 Replicative DNA helicase 13 4 Op 4 27/0.000 - CDS 19373 - 19888 630 ## PROTEIN SUPPORTED gi|46579371|ref|YP_010179.1| 50S ribosomal protein L9 14 4 Op 5 11/0.000 - CDS 19900 - 20163 410 ## PROTEIN SUPPORTED gi|218886447|ref|YP_002435768.1| 30S ribosomal protein S18 15 4 Op 6 . - CDS 20165 - 20467 421 ## PROTEIN SUPPORTED gi|46579369|ref|YP_010177.1| 30S ribosomal protein S6 - Prom 20617 - 20676 2.3 16 5 Tu 1 . + CDS 20662 - 20958 444 ## + Term 20994 - 21029 6.0 + Prom 21314 - 21373 2.8 17 6 Tu 1 . + CDS 21526 - 22239 713 ## DVU0372 hypothetical protein - TRNA 22603 - 22678 85.2 # Phe GAA 0 0 - Term 22544 - 22599 11.2 18 7 Op 1 . - CDS 22794 - 23384 856 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family - Prom 23425 - 23484 1.6 - Term 23392 - 23434 7.2 19 7 Op 2 . - CDS 23487 - 25202 1943 ## COG0608 Single-stranded DNA-specific exonuclease - Prom 25244 - 25303 2.2 20 8 Op 1 . - CDS 25509 - 26207 857 ## COG0284 Orotidine-5'-phosphate decarboxylase 21 8 Op 2 4/0.000 - CDS 26212 - 26844 625 ## COG0194 Guanylate kinase 22 8 Op 3 4/0.000 - CDS 26837 - 27094 314 ## COG2052 Uncharacterized protein conserved in bacteria 23 8 Op 4 . - CDS 27097 - 27975 1127 ## COG1561 Uncharacterized stress-induced protein - Prom 28168 - 28227 4.1 24 9 Op 1 . + CDS 28301 - 28696 293 ## PROTEIN SUPPORTED gi|154175150|ref|YP_001408716.1| 30S ribosomal protein S15 25 9 Op 2 . + CDS 28715 - 29590 633 ## COG0253 Diaminopimelate epimerase + Term 29700 - 29735 -0.6 - Term 29744 - 29796 13.6 26 10 Tu 1 . - CDS 29990 - 30928 1005 ## COG1242 Predicted Fe-S oxidoreductase - Prom 30953 - 31012 4.0 + Prom 30918 - 30977 2.7 27 11 Op 1 22/0.000 + CDS 31209 - 32252 1321 ## COG1077 Actin-like ATPase involved in cell morphogenesis 28 11 Op 2 . + CDS 32271 - 33410 851 ## COG1792 Cell shape-determining protein 29 11 Op 3 . 
+ CDS 33407 - 33877 465 ## LI0394 hypothetical protein 30 11 Op 4 19/0.000 + CDS 33858 - 35690 1820 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 31 11 Op 5 . + CDS 35971 - 37080 1210 ## COG0772 Bacterial cell division membrane protein + Term 37215 - 37254 0.1 + Prom 37243 - 37302 6.0 32 12 Op 1 . + CDS 37343 - 37759 542 ## Dvul_2190 H+-transporting two-sector ATPase, B/B' subunit 33 12 Op 2 . + CDS 37822 - 38355 629 ## LI0399 F0F1-type ATP synthase, subunit B 34 12 Op 3 41/0.000 + CDS 38352 - 38903 459 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 35 12 Op 4 42/0.000 + CDS 38908 - 40416 1842 ## COG0056 F0F1-type ATP synthase, alpha subunit 36 12 Op 5 42/0.000 + CDS 40430 - 41326 969 ## COG0224 F0F1-type ATP synthase, gamma subunit 37 12 Op 6 42/0.000 + CDS 41344 - 42756 1801 ## COG0055 F0F1-type ATP synthase, beta subunit 38 12 Op 7 . + CDS 42766 - 43170 403 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) + Term 43195 - 43229 6.1 Predicted protein(s) >gi|316922154|gb|ADCP01000129.1| GENE 1 78 - 2411 2883 777 aa, chain - ## HITS:1 COG:ygeS KEGG:ns NR:ns ## COG: ygeS COG1529 # Protein_GI_number: 16130768 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli K12 # 9 775 2 752 752 880 57.0 0 MAIGQSKQRLDAEAKVTGRARYTDDMGLPGMRHAAYVHSTIAHGMVLSIDASEALALPGV EAVFTADDVPGFLFPTAGHPYSMDPSHGDVADRLLLTKHVRYYGDEVAVVVARDGLTARK AAHLVKVEYEPLPVMTSAETALAPGAPVLQPSVNPDGNLLKQHTLECNGTLDEALAAADV VVEGAYRTPTMQHCHLENQTAYAYMDDMRHIVIVSSTQIPHICRRVVGQALGIPWSSVRV IKPYIGGGFGNKQDVVLEPIVAFLTMKLDGAPVSMELTREECMLCTRVRHAFAMTARAGA TKDGKLLGYGLDVLSNTGAYASHGHSIAAAGGSKVCYVYPHATYRFSAKTFYSNIPVGGA CRGYGSPQTAYAIECLMDDTARALGMDPLDFRLKNTGRNGDISPLNGKPVATLGISDCLE EGRKKFRWDERKAACKAFNEEAVRTGSPLRRGVGVSAFSYGSGTYPANVEPGSARLILNQ DGTVNLMTGATEIGQGADTAFAQMVSETLGVAYENIHVISTQDTDITPWDPGAYASRQTY 
TCAPAVHAVADGLRRKLLDYAAEMTGHTPAALTIAPTAQGDAVVFVRNPESVVVTLHDLA MDSFYNKDRGGQLSAEASFKTRQNPPSFGGCFAEVEVDIDLCKVRITHILNVHDAGVIIN PALATAQVHGGMGMGIGWALYEELLVDPATGRVHNNNLLDYKFPTTCDIPDLACAFVETQ EPSGVYGNKSLGEPPLISPAPALRNAVLDATGVAVNAIPMTPKLLFEEFVKAGLIKG >gi|316922154|gb|ADCP01000129.1| GENE 2 2461 - 3336 948 291 aa, chain - ## HITS:1 COG:yqeB KEGG:ns NR:ns ## COG: yqeB COG1975 # Protein_GI_number: 16130777 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Escherichia coli K12 # 9 266 271 524 541 240 49.0 2e-63 MYQYQQFAILLRGGGDIATGIALRLYRSGFRRLIILETAHPMAVRRRVAFSEAVYEGLCT VEGITAALAPRPEEIAPLWDAGCIPVMVDPQGVTIPRLRPEVVIEATLAKRNVGVGITDA PLVIGVGPGFTVGENVHCIVETNRGHNLGRVLYSGSAEPDTGIPGDIVGMTTERVLRAPQ TGIFLSRHDIGDHVKAGDVVATVEANGVSKEIRTVISGVIRGLLRSGTPVTDRIKVGDVD PRDNTACHHVSDKAFAIGGGILEAILGRFNRPCYIRRGILHCPVDKNVPTA >gi|316922154|gb|ADCP01000129.1| GENE 3 3649 - 6957 3765 1102 aa, chain - ## HITS:1 COG:Z4217_2 KEGG:ns NR:ns ## COG: Z4217_2 COG0493 # Protein_GI_number: 15803415 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli O157:H7 EDL933 # 457 1044 1 550 582 288 33.0 3e-77 MHTDRFTPLDAATLLRWMQHDLPNKQLFGIDRELFFMPKPDDPFRMERYGKVLETPLGVA AGPHTQLAQNILSAWLCGARYIELKTVQVLDELNVAKPCIDMADEGYNCEWSQELKLDQS FEEYLKAFVLLYVLRDMLNLPVEAEPGKGPGFIFNMSAGYNLEGIKSPAVQRFLDRMENA EEDVKRIKAGLAPLYPAIEKMPIPARLSDNLTVSTMHGCPPDEVESIGRYFITERKYNTA IKLNPTLLGPERLRDILNTQLGYDVCVPDEAFGHDLKYNDGVAIIRNLAAAAETAGVAFG LKLTNTLETANEKQNLPRSEGMVYMSGRSLHPISINLAERLQREFDGKLDIAFSAGVDTF NVADTLACGLRPITMSSDILKPGGYGRLHQYLDTLRAEMRKAGAHTIAEWEKSRSPIAGN PGFANLVSYAAAVLGKRSRYHKERFPYVSVKTERPLPRFDCAAAPCMSGCPAEQDIPRYL DAVARGDFELAWRVITATNPFPNVQGMACNHQCQSRCTRINYDKPLMIREVKRFVAEKMA EAASGVPAAQTGKRAAVIGAGPAGLSCARFLAAQGVEVHLYEEKPVLGGMAGDAIPAFRL 
TGDSLRRDIDAILKLGVRLHKDTPVDAALFDTLAEENDAVYIAVGAQESLPLGIPGEDAA GVLDQLSFLSAVRRCEPTGIGSHAVVIGAGNSAMDCARAARRLVGENGSVTIAYRRTRKE MPADIEEIEAALDEGVRIVELCAPEEVLADGGRVSGLRCRRMRLVPDPDGGRPRPVPTDE TFTLGADTLIVSIGQRVKAGFLPDAMTLKADPHTSQTSLPNVFAGGDAVRGAATLINAVA DGRHAAETILARLGLSAQAATATPSDERRPDLDGLRIRQATRVMGPALPERSPEDRLDFD PAIRTLTEEEAMDESRRCLQCDLVCNVCTTVCPNRANVALLSLPMPHPVQVAVRDGDGVR VETLSNRRLEQSYQIVNIADACNECGNCATFCPSAGAPYRDKPRIHLSRESFDNAPDGYR LASPSRLEGKRGGKAFSLAAEKDGFVFESDALIAHLDGGTLCATKVTLNGDVNEAALSGA VEAATLFRLLARKQPFVGPKHK >gi|316922154|gb|ADCP01000129.1| GENE 4 7075 - 8445 1372 456 aa, chain - ## HITS:1 COG:ygeZ KEGG:ns NR:ns ## COG: ygeZ COG0044 # Protein_GI_number: 16130775 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Escherichia coli K12 # 1 454 5 457 465 400 46.0 1e-111 MRMLITGGLVARADGVFNTDILIEGDRILELGERLHEAQQPEGTEIVDVSGCVVMPGGVD AHTHVNLTVGDAHVSDGFEAGTLAAAWGGTTTIVEHPGFGPQGCDLPHQPTAYLEQADGR CHTDYALHGVFQHVDDAVLARIPEVVAQGFPTLKAYTTYDGRLDDEGLLQVLAALRDAGG LLAVHCENHAITRFLGDKLRREAPRDPMSHPRSRPARCEAEAVDRILKLAKTAEAPVYIV HLSTAEGLACVREAQKAGQPVIAETCPQYLLLDESAYAERDGLKYIMAPPLRTEADRAAL WEGLADGSISVAATDHCSFSLAQKRERGKESVLDCPGGVPGVETRIPLLFSEGVLHGRLT LPRFVDVVSTSPARLMGLASKGRLEPGADADIVVIDPTDERLIKTRNLHQKADCTPYEGM VVRGWPRHVWLRGEPIIFGRQLSGYAGQGRFVPRTL >gi|316922154|gb|ADCP01000129.1| GENE 5 8472 - 9743 1524 423 aa, chain - ## HITS:1 COG:ssnA KEGG:ns NR:ns ## COG: ssnA COG0402 # Protein_GI_number: 16130781 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli K12 # 45 420 64 457 464 189 31.0 8e-48 MSLYLRNATWIDPETFKATTTTIKVEEGPSGGMALDAHAPVECPPEDTVLDCTGRLVTPS FGCGHHHIYSALARGMPPAPKAPANFLEVLEYVWWRMDKKLDHDMIEASALVTGLYCAKN GVTFVIDHHASPFHLDGSLETIAKALDRVGLGHLLCYEISCRDGEEIKEKGLDETDAFLS SGRKGHVGLHASFTVDDDLLGRAVDLARKHKTGVHIHVAEGIEDQEHCARTYGKSVVRRL 
SDAGVLDLPLSILGHCVHLDAEERDLIRTSPCWVVQNTESNLNNNVGLGRYNDFPRVMLG TDGMHSDMIRSAQSSYFIAGLAEGGMSPLAAYQRLRAVHAHIQGHNAPGGGANNLVILNY DSPTPLTQDNFPGHFCYAFDARNVESVISQGRLIVDHGRLAGMDEADILAYANEQARRLW ALL >gi|316922154|gb|ADCP01000129.1| GENE 6 9820 - 12399 2885 859 aa, chain - ## HITS:1 COG:BH0748 KEGG:ns NR:ns ## COG: BH0748 COG1529 # Protein_GI_number: 15613311 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Bacillus halodurans # 174 855 18 740 760 296 31.0 1e-79 MIAFRLNGQATTYDGDPAVSLLSYLRNDRHLTAAKDGCSGQAACGACLVELNGKACLACS TPMSKVTDGDVVTLEGLPETLRRTLGMAFVNRGAVQCGFCTPGFLMRAKVLLQTNPNPTR EEVVKAVRPHLCRCTGYVKLVDAILDAATLLPEGNIPEPDDNPRLGSGWPKYGGYERAVG TRPFVNDIDAPGMAHGAIVFSEHPRARVLAIHTDEAADMPGVVRILTADDIPGERVLGLY AKDWPVYIKVGEVTRYIGDALACVVAETEAEARAAAACVKVDYEVLTPLLDMEEAETSPI HIHENGNILLDKTIRRGEPVDEALARCAFTAEATYRTQAVEHAFIEPEATLAVPEAGGVR LYVQSQGIWHDKTDIAALLHVPTPKVTVTLADSGGAFGGKEDFTCQPHAALAAYLLGRPV KVRLSRPESIRMHPKRHGMKLHYVIGCDEKGILQAAKVRIIADTGAYASAGGPVVTRTGT HATSAYHIPSVDVHVKAVFTNNLPAGAFRGFGVNQSNFAMECLIDELCEKGGFDRWQFRY DNAVAPGRMVTTGQVLGEGTALRETLLAVRDAFRSHPRAGIACAIKNSGIGNGIPEPCDC YLEVHPAPEQPAGARLVLHHGWTEMGQGINTVARQMLCEVAELGPDVAIDITASTEYGAF AGSTTASRGTFQLGHAVVNAARQLRGDLNKYGGRLDRLVGRFYKGHYDSAGQTSGQGAPG EIRNHVSYSFATHVVFMDEEGGIEKIIAAHDSGRVVNRPLYEGQVQGGVVMGMGYALTES IPVKGGYLTSGKLNACGLLRSVDVPEIEVIAVEVPDPEGPCGAKGIGEIACIPTAPAIVN ALYLVDGVRRRELPVRKKR >gi|316922154|gb|ADCP01000129.1| GENE 7 12768 - 13964 1657 398 aa, chain - ## HITS:1 COG:ECs3743 KEGG:ns NR:ns ## COG: ECs3743 COG0078 # Protein_GI_number: 15832997 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Escherichia coli O157:H7 # 21 396 19 394 396 537 67.0 1e-152 MDKIRIRKLISNLAKRDYRRMYMNDFFLTWEKTDDEIAATFLVAEILRGLREANISCRMF DSGLGISLFRDNSTRTRFSFASACNLLGLEVQDLDEGKSQIAHGETVRETANMISFMADV IGIRDDMYIGKGNAYMHTVSEAVQEGYRDGVLEQRPTLVNLQCDIDHPTQCMADMLHIIN 
HFGGVEKLKGKKIAMSWAYSPSYGKPLSVPQGVIGLMTRFGMDVVLAHPEGYEVMGGVEE VARRNAAATGGSFARTNDMAEAFAGADIVYPKSWAPFAAMQQRTDLYGAGDMDGIKKLEK ELLAQNANHKDWECTEALMASTRNGEALYMHCLPADISGVSCAEGEVAASVFDRYRVPLY KEASYKPYIIAAMITLARKRNPADLLLDLLEKGGQRCL >gi|316922154|gb|ADCP01000129.1| GENE 8 13977 - 14939 1215 320 aa, chain - ## HITS:1 COG:yahI KEGG:ns NR:ns ## COG: yahI COG0549 # Protein_GI_number: 16128308 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 3 319 2 315 316 279 50.0 4e-75 MSRKLAVIAIGGNSLIKHRDKQSVEDQYLALCETMEHIADVIQQGWQVLITHGNGPQVGF IMLRSEIARAVAGMHIVPLVSCVADTQGAIGYQIQQSLDNVLKRRGLTDKTGRTVSLVTQ VRVDIADPGFSDPDKFVGEFYAEDQLAELHQQHPDWILKPDANRGWRRVVPSPKPCEIIE LDAIKNLLDSGFNVVAVGGGGIPVVRMPEGLFGVDAVIDKDLASSLLATQLGADMLAIST GVESVALNYGTPMQRPLHNVPVSEMERYLAEGHFPAGSMGPKIKAAVDFIRNGGSEVVIT SPEYLNAALTKGAGTHITKE >gi|316922154|gb|ADCP01000129.1| GENE 9 15075 - 16295 1634 406 aa, chain - ## HITS:1 COG:ECs3745 KEGG:ns NR:ns ## COG: ECs3745 COG0624 # Protein_GI_number: 15832999 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli O157:H7 # 4 403 5 401 403 592 69.0 1e-169 MSTVDFSRVNELAKKYEPEMTRFLRDMIRIPSESCQEEGVIRRIKEEMEKVGFDRVEIDK MGNVLGFIGNGPRIIAFDAHIDTVGVGNRSNWTFDPYEGYEDAETIGGRGASDQEGGMAS MVYAGKIIKELGLCPKDCTIVMVGTVQEEDCDGLCWQYIINEDGLKPEFVVSTEPTDGGI YRGQRGRMEIRVDVSGVSCHGSAPERGDNAIYRMAHIMTELEQLNARLADDPFLGKGTLT VSQIFYTSPSRCAVADSCAISVDRRLTFGEDKDLAISQIENLPSVKAAGDKAKVSMYTYE VPSWTGLVYPTDCYFPTWVLPEDHVVTKSTEETYRSLFGREPRTDKWTFSTNGVSIMGRY GIPVVGFGPGKEKEAHAPDEKTWKQDLIECAALYAALPATYVANSN >gi|316922154|gb|ADCP01000129.1| GENE 10 16794 - 16916 90 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRALLLCACLLALLLAGCSESATVRAKGQWEGSVSTGGSF >gi|316922154|gb|ADCP01000129.1| GENE 11 16919 - 17860 1110 313 aa, chain - ## HITS:1 COG:all4037 KEGG:ns NR:ns ## COG: all4037 COG1045 # Protein_GI_number: 17231529 # Func_class: E Amino acid 
transport and metabolism # Function: Serine acetyltransferase # Organism: Nostoc sp. PCC 7120 # 131 297 4 161 253 154 48.0 2e-37 MKPLAPIAPTSMPDLDLVVEQLCAPSSLETVLHHPLHDSPMPSLEELGEIMSRLKATLFP GYFGVSCVHVESMRYHLSANLDSIFRKLAEQIRRGGCFACASYATDCMSCEEVSMRKAME FMQRLPHIRRLLASDAKAAYEGDPAATSAGETIFCYPSLLTMTHHRIAHELYNLGVPVIP RIISEMAHSATGIDIHPGATIDEEFFIDHGTGVVIGETAVIGKGCRLYQGVTLGALSFPK DGDGVLVKGVPRHPILEDNVTVYAGATILGRITIGSGSIIGGNVWVTKDVPAGTKLVQKL SAQPTEESLMGKA >gi|316922154|gb|ADCP01000129.1| GENE 12 17878 - 19275 1635 465 aa, chain - ## HITS:1 COG:PA4931 KEGG:ns NR:ns ## COG: PA4931 COG0305 # Protein_GI_number: 15600124 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Pseudomonas aeruginosa # 1 455 4 458 464 424 50.0 1e-118 MASPADPNRATNDLLRRVPPHSVEAEQAVLGGIFIRNSVFHTLVDTLSSDDFYLPAHQTL YTCFLELYRKNAPIDLVSVASYLKDHGQLEAVGGAAYLSELAQTVVSGANAEYYSTIVRD KSLQRSLIGACSDIISNCFDQSRSVDALLDESEQAVFSISERTSGKVFKSSKELVNRVFE ELTKRFEQKEAVTGVTTGYNRLDQMTAGLQPSDLIIVAARPSMGKTAFALNMAMRAAVQD NVPVAIYSLEMSMDQLMMRMLCAWGKVDLGHLRRGYLDSEEWSRLYHAADVLGQAPIYID DTPALSPLELRARTRRLKAESGVGLVMVDYLQLMRGSKRTDSREQEISEISRSLKGLAKE MDIPVVALSQLNRKLEERTDKRPLLSDLRESGAIEQDADVIMFIYRDEVYHKQPDNPHKG SAEIIIGKQRNGPIGTATLAYLGSYTAFEDLEPGLSPQPSEAGGE >gi|316922154|gb|ADCP01000129.1| GENE 13 19373 - 19888 630 171 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579371|ref|YP_010179.1| 50S ribosomal protein L9 [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 171 1 167 167 247 79 1e-64 MKIILRADVENLGRLGDVVTVKAGYGRNYLLPQGLAMLVTPGNLKAFELERKKLQARMDA LRAAADDIAGKLEGLVLPIVMRVGDNDKLYGSVTTAIIGDALTAQGIEVDRRRILLDHPI RTLGDHPVRVRLHADVVASLTVKVVSEDKAHLVEEEAAEAPAEAPTEEAAE >gi|316922154|gb|ADCP01000129.1| GENE 14 19900 - 20163 410 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|218886447|ref|YP_002435768.1| 30S ribosomal protein S18 [Desulfovibrio vulgaris str. 
'Miyazaki F'] # 1 87 1 87 87 162 90 3e-39 MAFKKKFAPRRKFCRFCADADLPLDYKRPDILRDYITERGKIIARRITGTCAKHQRLLST EIKRARQMALLFYTAGHSSDVKKKSNI >gi|316922154|gb|ADCP01000129.1| GENE 15 20165 - 20467 421 100 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46579369|ref|YP_010177.1| 30S ribosomal protein S6 [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 100 1 101 101 166 85 2e-40 MRKFETLLLLSPELAADAREALLGTFTGVIERAGGRFTADHWGMRDLAYPVRKQMRGYFV RLEYVAPGDTVAELERIIRISDGAFKFLTVKLADAAEEVA >gi|316922154|gb|ADCP01000129.1| GENE 16 20662 - 20958 444 98 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDTERKYPTREVDGVDVPQMGIKELIKQLGSALETMEGNRKSYTSPEEIEAGTLMVALFG NSLYYMRLFDKSLRTFEAAQDAVLEEQLQQADTVGGIC >gi|316922154|gb|ADCP01000129.1| GENE 17 21526 - 22239 713 237 aa, chain + ## HITS:1 COG:no KEGG:DVU0372 NR:ns ## KEGG: DVU0372 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 32 235 26 229 230 176 48.0 8e-43 MKGFRLDRPADRWCLGCGLVLVLSLLLSDTVLRAYVALTDAHPFAAGFCKFALLATFGES LAQRMLVGRYLPPRFGLVPRALAWGVLGICITLAFMIFSAGVPLAMSRIGMDWAARALAG PFGTEKLLTAFAISLCLNTLFAPVLMVAHKVSDLHIARYEGTLRCLWHLPDVGGLLKSVN WDVMWRLVLFRTVLFFWIPVHTITFLLPEAFRVLFAALLGAVLGLILAWAGSRQARG >gi|316922154|gb|ADCP01000129.1| GENE 18 22794 - 23384 856 196 aa, chain - ## HITS:1 COG:RSc1165 KEGG:ns NR:ns ## COG: RSc1165 COG0652 # Protein_GI_number: 17545884 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Ralstonia solanacearum # 32 195 28 188 190 199 60.0 2e-51 MQSIRIAVLALALCFGASLLPVQSHAAAPLPKVKIETSMGDIVVELNPAKAPKTVSNFLY YVKSGFYNNTIFHRVINGFMIQGGGHTADMQEKANSRKPVVNEAGNGLKNDAYTIAMART NDPDSATAQFFINVANNDFLNHKNETPSGYGYAVFGKVIQGQDVVDRIKSVSTGSYGPYQ DVPKTPVIIKSIVPVQ >gi|316922154|gb|ADCP01000129.1| GENE 19 23487 - 25202 1943 571 aa, chain - ## HITS:1 COG:L0259_1 KEGG:ns NR:ns ## COG: L0259_1 COG0608 # Protein_GI_number: 15672614 # Func_class: L Replication, recombination and repair # 
Function: Single-stranded DNA-specific exonuclease # Organism: Lactococcus lactis # 20 567 23 554 558 317 34.0 6e-86 MKIWKFRNQPDKGPCPAGWADKLGVPQVVADLLWQRGLETSAQMDSFLSPGLRHLAPPDC WPGMQEAVDALEQGIREGRNVLIWGDYDVDGITGATLILQTLRFHDVPSTVHLPDRRKEG YGLNIPEIERLAAEEGPGILLTVDCGISDVKAVERARELGFMVVVSDHHMPPEELPPAHA ITNPRLSEDNPCPHLAGVGVAFFLMAALNGKLEAMSGKRMDMRQVLDLVALGTLADMVSL TGQNRILVKNGLLKIAEAKRPGLAELKGASGFSPVAALGAGQVVFNLAPRINAAGRLGSP TLAHDMLLTPSHDEAAKLAQTLTSLNDERRSEEDRIYKEALEQAEANPKRLGFVLYGKDW HQGVIGIVASRIVEVYYRPVLILCSDGESLKGSGRSVPEFDLHAGLTRCADMLLGFGGHR QAAGLRIAPGRLDELRERFDAVIREELGEAPLTPSLKIDAEMPFSQASDFTVLKGLELLQ PFGIGNPEPIFASLPLRVKKRKAFGHSREHISLEVTEEGSGITLQAKAWRQADQIPESIQ GQRIRLAYTPGINAYNGIASVELRVRDWEVL >gi|316922154|gb|ADCP01000129.1| GENE 20 25509 - 26207 857 232 aa, chain - ## HITS:1 COG:PM0797 KEGG:ns NR:ns ## COG: PM0797 COG0284 # Protein_GI_number: 15602662 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Pasteurella multocida # 2 225 3 225 233 168 46.0 7e-42 MAQLIAALDYTNAEDALATAESLRGCPIWMKVGLELFTHEGPAVVKKLKDMGFKVMLDLK MFDIPNTVAGGVRSACLMGVDLITLHALGGERMIHAAVDAVRQNAEEGGPKPLLFAVTVL TSMAPGELPGYGENLSGLAADLAAGGQAWGLDGVVCSGHEVEAIKSRCPGLMCLTPGIRP ASGSASDDQRRIMTPAQAVRIGSDFLVVGRPITKAPVPADAARAILDEMEKA >gi|316922154|gb|ADCP01000129.1| GENE 21 26212 - 26844 625 210 aa, chain - ## HITS:1 COG:CC1681 KEGG:ns NR:ns ## COG: CC1681 COG0194 # Protein_GI_number: 16125927 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Caulobacter vibrioides # 13 207 10 205 213 168 45.0 6e-42 MPDTTPTITARSGVLLVVCAPSGTGKTTLIQRLRDEFPNFAYSISCTTRAPRGHETDGKD YHFLSVEEFLRRREAGFFAEWANVHGNYYGTPLAPVLETLKAGQDVLFDIDVQGAAQLHL SLPRGQYVFLLPPSLSELERRLRGRGTDDEASIARRLSNAASEIRQAHWFDAWIVNDNLD KAYDELRAVYLASTLHPAHRPGLANTILEG >gi|316922154|gb|ADCP01000129.1| GENE 22 26837 - 27094 314 85 aa, chain - ## HITS:1 COG:TM1690 KEGG:ns NR:ns ## COG: TM1690 COG2052 # Protein_GI_number: 15644438 # Func_class: S Function unknown # Function: 
Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 6 80 4 78 92 105 64.0 2e-23 MSKERIINLGFGNFVVASRVVGIINPASSPMRRLREDARTEGRLVDATQGRKTRSIIITD SNHVILSAILPDTLGQRFMQEEDNA >gi|316922154|gb|ADCP01000129.1| GENE 23 27097 - 27975 1127 292 aa, chain - ## HITS:1 COG:BH2514 KEGG:ns NR:ns ## COG: BH2514 COG1561 # Protein_GI_number: 15615077 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Bacillus halodurans # 1 292 1 294 294 139 32.0 6e-33 MLRSMTGFGRCVMEDADWTQTWEIRSVNNRHLDLKWRLPLQARGLESRLERVVRRFAARG RMEITLTLQQRGAAANLRFDAAQASAMLDQVAALADLHGDTFEPDYNALFAIPTLWERES EDGDDEMEERLEEGLIAALEDWNESRETEGAALARDMTSRIAQMEEWVSRIDERAPEIKE ERFAVLRERLSEALAAVNGELEEGRFLQEMVVLSDKLDVSEELTRLHAHLERLRDLLEIG TDAGRRLDFTLQECFREIATCGNKIQDAQTSRLVVDFKNELEKCREQVQNLE >gi|316922154|gb|ADCP01000129.1| GENE 24 28301 - 28696 293 131 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175150|ref|YP_001408716.1| 30S ribosomal protein S15 [Campylobacter curvus 525.92] # 6 128 3 129 130 117 45 1e-25 MAVLCVEDLDHLVLTVADIKATCRFYQQVLGMTPFTFGNGRTALSFGNRKINLHEVGKGD LPQAHNPLPGTADLCFLSRTPAVEMLDHLKANGVPVEEGPVRREGALGPITSVYFRDPDG NLIEVANYTEA >gi|316922154|gb|ADCP01000129.1| GENE 25 28715 - 29590 633 291 aa, chain + ## HITS:1 COG:FN1732 KEGG:ns NR:ns ## COG: FN1732 COG0253 # Protein_GI_number: 19705053 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Fusobacterium nucleatum # 1 274 5 263 265 92 27.0 1e-18 MRVLPFTKYSPCGNTTILVRESSLSPADRARVAAEIIAPGHLEAEQAGYVDTAAPVPRLD MMGGEFCVNATRAFAALLAEEGKLSPESGGLGGIVSVSGMPERLRVRVRRLAAHRFESSV LLDLPQAPPLENVAPGMYLVRVPGIAHLVLDAAAHPLPADKDRDTAALFARFGLLGEDAA GCIWLHREPSGLRITPFVWVRGTGTTYAETACGSGTLAASIVCQGVYGDGGELSLMQPGG EPLRVVPDGAAYPGGWAAWVGGPVRRIARGDVFVECLDGEGRTGGREGKPF >gi|316922154|gb|ADCP01000129.1| GENE 26 29990 - 30928 1005 312 aa, chain - ## HITS:1 COG:lin1770 KEGG:ns NR:ns ## COG: lin1770 COG1242 # Protein_GI_number: 16800838 # Func_class: R General function prediction only # 
Function: Predicted Fe-S oxidoreductase # Organism: Listeria innocua # 18 308 17 305 321 214 36.0 2e-55 MYRHGVGGYPTERPVLMHTWNTFTVHLRQRFGQRVQKIPLDAGASCPNRDGTLSRAGCTF CNAIGSGSGLGLAGMNLPAQWEHWREHFRKSNRASLFLAYLQSFSNTYGPASRLAAMLEI FRTLPDLAGVSVGTRPDCLDPEKMAILGTAPWKEKWLELGVQTLNDATLRRINRGHDAAA SAKAIELAEKTNVQVCAHLMLGLPGETPDDVHATVRRLNALPVHGVKLHNVYVCRNTALE RAYRAGEYVPLTEDAYVELAVDALTELRPDIIIHRVVADPAPGELVAPDWATRKGDLVRF IQHAYHRRCGGA >gi|316922154|gb|ADCP01000129.1| GENE 27 31209 - 32252 1321 347 aa, chain + ## HITS:1 COG:ECs4123 KEGG:ns NR:ns ## COG: ECs4123 COG1077 # Protein_GI_number: 15833377 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Escherichia coli O157:H7 # 10 342 27 360 367 407 63.0 1e-113 MPKLVNFILGAFSNDLAIDLGTANTCVYVKGQGIVLREPSVVAVKKDSRGNSVVLAVGQD AKRMLGRTPGNIQAIRPMKDGVIADFEVTEAMLRHFISKVHNSRRHLVRPRIMVCVPTGI TQVEKRAVKESAQSAGAREVYLIEEPMAAAIGANLPIQEPTSNMVVDIGGGTTEVAVISL SGIVYSRSVRVGGDKMDESIMTHVKRKYNMLIGESSAEEIKIKIASAYPMDPEQQIEVKG RDLVTGIPQNIIITSEEVRKAISEQVDSIVQAVRIALEQTPPELAADIVDRGIVLTGGGA LLKGLDQLLREETSLPITVVEDPLSTVVMGTGRALDNLNELGDVCID >gi|316922154|gb|ADCP01000129.1| GENE 28 32271 - 33410 851 379 aa, chain + ## HITS:1 COG:PA4480 KEGG:ns NR:ns ## COG: PA4480 COG1792 # Protein_GI_number: 15599676 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Pseudomonas aeruginosa # 82 306 27 253 330 84 28.0 3e-16 MPPQAFRSAARLPVGSADRFISAFLWIPPPIGGPFSGQSMSPKHILLVPALALLMYLGMY SWNQRTHILDNFAANTGLEASGVVLKSVRMVQDTVMGAWNRYLDLVDVREENDELKKQVE QLRHKLLMASEERAELVRLRRLLTLTPPEGWQTLGARVLAGRMGANSALASVLIGRGYMT GAIPGTPVMTPQGVAGRVLRAGPSSATVLLLVDPGSRIAVISQESRIQGVLVGGGPDKPL EMRFVSHNDVIRAGEILVTSGLDDAFPKGIPVARVLSATPSDLSPFQSIQAAPLATLSDL EELLLISRVEGSSVSGSLSGVLEAPRSPSSDSAGSPAAQRAQPQPGQQGSTIPGPPRPDA PFFLSGPTPPPGLPLRSRP >gi|316922154|gb|ADCP01000129.1| GENE 29 33407 - 33877 465 156 aa, chain + ## HITS:1 COG:no KEGG:LI0394 NR:ns ## KEGG: LI0394 # Name: 
not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 4 152 5 153 156 134 51.0 1e-30 MTLNILWWVAFFVFGLALQQALPGTDVLVAGLFLALQERRPFQLAVVLLALILVQEGVGT LDFGASVLWYLLVITLFFIGRWMFETENWLFVLLLSGCIGLAHYGVIWLMTRLQFIPLDT TQLLDESILQALLTPFVWQCSMMARRWVVPNENNTA >gi|316922154|gb|ADCP01000129.1| GENE 30 33858 - 35690 1820 610 aa, chain + ## HITS:1 COG:PM1924 KEGG:ns NR:ns ## COG: PM1924 COG0768 # Protein_GI_number: 15603789 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Pasteurella multocida # 24 601 36 631 644 328 34.0 2e-89 MKTIQLETEGYQPPKSGLVLLQCLVGLLFFTFVIRFWYLQIHRGADFAQQAQANHLRQES VYASRGLIRDVKGTLLAENRPAFGLALIREDCKDISATLAQVSEWAGIPLEQLTNRFQQD RRKVKPFEPILLLNDMPFDQVARIEAQLMHWPGLEIVTRSKRYYPEGKEFAHILGYVAEA NEQELAADPYLALGDTVGKQGLEYVFEQRLRGHKGLYNVEVDVLGRSLGKTLVEEPQNGE NLQLCLDTKLQKQLAALLGDQTGSIVVMEPQTGRLLALVTNPSYDNNVFVGGLSQKDWVA LRDDPFHPLQNRAIQSVYPPGSVWKLMMAGLFLKEGISPSFRVVCTGAVKLGNREFRCWR KGGHGAVDMIQSLLHSCDVYYYVLGEKLGIDRIESFAKACGFGAPTGIDLPHEKSGLVPS RAWKKRRSGEAWYRGETLNVSIGQGYTLVTPLQMANFVSSLLNGGKLMKPQLLVDAKPEV RGNIPFSEEARKLVLEGMRVTADVGTAKVLRRTDAVIGGKTGTAQVVKIRMVGERRQRTA EMAYLERDHAWIASYGRKDGKDVVVVVMLEHGGGGSSAAGPVARDVYNILYGTPLGPQPQ RAVTRTVRED >gi|316922154|gb|ADCP01000129.1| GENE 31 35971 - 37080 1210 369 aa, chain + ## HITS:1 COG:VC0949 KEGG:ns NR:ns ## COG: VC0949 COG0772 # Protein_GI_number: 15640965 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Vibrio cholerae # 26 367 34 369 373 237 42.0 3e-62 MAIDRRLLTYINWGLVAATLLLFWVGIGNLYSASGVRVEDGISLAPYYERQMIWGAFGLI AMVACMSFDYRHLQAMALPFFLIVLFSLCLIPLFGKVIYGARRWIDLGFFHFQPSEMAKI AVLLMGAQVLSLDGEPLSWKKLFQVSCVGGIPAAFIVCQPDLGTALTVLAILGGMILYHG LKKRVLLVCLISIPLLLPMAWFALHDYQKQRIMTFLDPSNDPRGAGYHIIQSKIAIGSGQ IWGKGFLEGTQSKLSFLPEKHTDFAIAVFGEEWGFVGCVALMALFSLFLLSIFETVRGAK DRFGSNLAAGIFIYFFWQIFINAGMVVGIMPVVGIPLPFISYGGSATVVNFSLIGLVLNI SMRRFLFKT 
>gi|316922154|gb|ADCP01000129.1| GENE 32 37343 - 37759 542 138 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2190 NR:ns ## KEGG: Dvul_2190 # Name: not_defined # Def: H+-transporting two-sector ATPase, B/B' subunit # Organism: D.vulgaris_DP4 # Pathway: Oxidative phosphorylation [PATH:dvl00190]; Metabolic pathways [PATH:dvl01100] # 1 138 1 138 138 85 40.0 6e-16 MVDLNITLWIQLANFLVTLVVLNYLLISPIRKIIRKRKDNVEGLIGEIEAFTAEKQQLLD EYESELRKAREAAAIYRKDGKVMGELERARIFDAASKDAQSEVRTTQAAVRADAGVTRRA LQAKMHEFTEAAMAKLLA >gi|316922154|gb|ADCP01000129.1| GENE 33 37822 - 38355 629 177 aa, chain + ## HITS:1 COG:no KEGG:LI0399 NR:ns ## KEGG: LI0399 # Name: atpF # Def: F0F1-type ATP synthase, subunit B # Organism: L.intracellularis # Pathway: Oxidative phosphorylation [PATH:lip00190]; Metabolic pathways [PATH:lip01100] # 1 177 16 192 192 147 42.0 2e-34 MAVATLILLGVVSHFSDPYNPWINLLARVGNVCVFLYILWRAAGKAIVDGLSDRRAAIVD ELDSLAVRKAAAEEQLAAMIEKIDSLDAECETILKDSRARGEAIREALVADALEEAEKIR NAAHRAADSETKKAIEELRSQMADEIVQAVEATLKEKLDMKKHVKLIDNALKKVVLN >gi|316922154|gb|ADCP01000129.1| GENE 34 38352 - 38903 459 183 aa, chain + ## HITS:1 COG:SMc02498 KEGG:ns NR:ns ## COG: SMc02498 COG0712 # Protein_GI_number: 15966790 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Sinorhizobium meliloti # 6 180 11 183 186 89 32.0 4e-18 MNGDIVAQRYAKALFALGQQEGMAKLEQYGENLSALEGVLEDSPELVRLFHIPVISVAEK QKVLSQVLDKLDVDPMIKNFCSLLAEKERLPLFEEIAEAFGKLLDEAWGVVRGKLLTAVS LSKEQQAGILARLEKQTSKQIALKFEVDPSILGGVVLQVGDNVLDASLRAQLAILRDIIK RGE >gi|316922154|gb|ADCP01000129.1| GENE 35 38908 - 40416 1842 502 aa, chain + ## HITS:1 COG:aq_679 KEGG:ns NR:ns ## COG: aq_679 COG0056 # Protein_GI_number: 15606090 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Aquifex aeolicus # 7 500 8 501 503 651 63.0 0 MHINAGEISKIIKEQIQNYEQRVDATETGTVLSVGDGIARVYGVQNAMSMELLEFPGGLL 
GMVLNLEEDNVGVALLGEDTGIKEGDPVKRTGRIFSVPIGDAVMGRVLNPLGMPVDGLGP VEATEFSAVEVKAPGIIARKSVHEPMVTGIKAIDAMTPIGRGQRELVIGDRQTGKTAICI DAILAQKNSDVHCFYVAIGQKRSTVALVADTLRKHGAMEYTTIISATASEPAPLQYIAAY SGCTMAEYYRNNGKHALIVYDDLSKQAVAYRQMSLLLRRPPGREAYPGDVFYLHSRLLER AAKVNDSLGAGSMTALPIIETQAGDVSAYIPTNVISITDGQVYLEPNLFNAGIRPAINVG LSVSRVGGSAQIKAMKQVAGTMRLDLAQYRELAAFAQFGSDLDKSTKQKLERGARLVELL KQPQYQPMPTQEQVVSMYAATRGFMDDVPVASVQAFERDFLDFIRGAKSDILDDIVARKV LDADLEARLKDAILEFKKGYNA >gi|316922154|gb|ADCP01000129.1| GENE 36 40430 - 41326 969 298 aa, chain + ## HITS:1 COG:BMEI0250 KEGG:ns NR:ns ## COG: BMEI0250 COG0224 # Protein_GI_number: 17986534 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Brucella melitensis # 1 297 1 292 292 208 43.0 9e-54 MPSLKDVKMKIVGVRKTKQITKAMNMVASAKLRGAQSRIERFRPYAEKFQAVLGDLAAKS DGSAHALLERRAECHVSVIILATSDRGLCGGFNAMLIAKALDAARTKTAEGKTVKFICVG KKGRDAIRKTGFEVLSSYADVMGSFDFSLASGIGKDVVSGYTHKAMDEVTLIYGKFVSVA RQVPLVQTLLPVEAAAPATADSAADGTSAQGVSEEYTYEPEVTKLLAEILPRFVNVQIYR GLLDTSASENAARMSAMDNATRNCDEMVGALTLLFNKTRQTSITNELIDIVGGAEALK >gi|316922154|gb|ADCP01000129.1| GENE 37 41344 - 42756 1801 470 aa, chain + ## HITS:1 COG:SPAC222.12c KEGG:ns NR:ns ## COG: SPAC222.12c COG0055 # Protein_GI_number: 19114063 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Schizosaccharomyces pombe # 4 466 53 518 525 654 71.0 0 MSENIGKIVQVIGAVVDVAFPDGHLPNILTALEIKNPNNSDAPDLICEVAQHLGDDVVRT IAMDATEGLVRGMDVVDTGKPIMAPVGSASLGRIMNVVGRPVDEMGPIKAEKYLPIHREA PSFVEQNTSVELLETGIKVVDLLVPFPKGGKMGLFGGAGVGKTVILMEMINNIAKQHGGI SVFAGVGERTREGNDLYHEMKDAGVLEKAALVYGQMNEPPGARARVALTALTCAEYFRDE ENQDVLLFIDNIFRFTQAGSEVSALLGRMPSAVGYQPTLGTDLGSLQERITSTNKGSITS VQAVYVPADDLTDPAPATTFSHLDGTLVLSRQIAELGIYPAVDPLDSTSRILDPNVVGEE HYGVARGVQQVLQKYKELQDIIAILGMDELSDDDKLVVARARRIQRFLSQPFHVAETFTG TPGVYVKLEATVKAFKGILNGDYDHLAEDDFYMVGDIDMALAKYQKRQEN >gi|316922154|gb|ADCP01000129.1| GENE 38 42766 - 43170 403 134 aa, chain + ## HITS:1 COG:BS_atpC 
KEGG:ns NR:ns ## COG: BS_atpC COG0355 # Protein_GI_number: 16080733 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Bacillus subtilis # 4 131 3 130 132 106 46.0 1e-23 MESTLQLEIVTPDKVVVSQTVDYVGVPGIEGEFGVLPHHIALLSALAIGDLYFRVDGKTE HVFVSGGFAEVSDNKLSILAESAERGEDIDTARANAARERAEARLAAKTAETDMLRAHLA LQRAVERLHIASLV Prediction of potential genes in microbial genomes Time: Fri May 13 04:14:26 2011 Seq name: gi|316922145|gb|ADCP01000130.1| Bilophila wadsworthia 3_1_6 cont1.130, whole genome shotgun sequence Length of sequence - 10656 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 405 - 1286 1348 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 1364 - 1423 1.9 - Term 1460 - 1497 7.0 2 2 Tu 1 . - CDS 1527 - 3218 500 ## PROTEIN SUPPORTED gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 - Prom 3276 - 3335 3.4 + Prom 3544 - 3603 7.3 3 3 Tu 1 . + CDS 3769 - 4263 512 ## COG1522 Transcriptional regulators + Term 4293 - 4326 4.3 4 4 Tu 1 . + CDS 4379 - 5623 1630 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) 5 5 Op 1 16/0.000 + CDS 5960 - 6337 680 ## COG0509 Glycine cleavage system H protein (lipoate-binding) 6 5 Op 2 12/0.000 + CDS 6337 - 7659 1811 ## COG0403 Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain 7 5 Op 3 . + CDS 7659 - 9140 1894 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain + Term 9216 - 9249 3.6 + Prom 9240 - 9299 3.2 8 6 Tu 1 . 
+ CDS 9328 - 10368 1137 ## COG0095 Lipoate-protein ligase A + Term 10500 - 10548 0.1 Predicted protein(s) >gi|316922145|gb|ADCP01000130.1| GENE 1 405 - 1286 1348 293 aa, chain - ## HITS:1 COG:PA1060 KEGG:ns NR:ns ## COG: PA1060 COG0697 # Protein_GI_number: 15596257 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Pseudomonas aeruginosa # 1 285 1 286 301 187 41.0 1e-47 MTVSKNVLMALLILSYATWGGGMIAMKYAFESFTAMQVVFARVAFAGVIYLALYRLWSHI PYQKGDWKYLLAMVLFEPCLFFLCETFAMTYTTASQGGVIAACFPLCTAVAAWLFLGEKL TKKTIIAMLLAVAGVAGASLAAESSDQASNPILGNLLMVGAVLSATGYAVCVRFISRRYS FLSISAIQALGGTIVFLPFVFTTPMPVNVTMPAIGGLLYMGLGVGFLVYLSFNFALKHLE AGIVALFGNLIPVFTLIFAFTILGERLTLAQTLGVGLTFVGVIIAALPEKQRQ >gi|316922145|gb|ADCP01000130.1| GENE 2 1527 - 3218 500 563 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 [Haemophilus influenzae 7P49H1] # 15 562 16 565 618 197 25 4e-50 MLHNVDLILTLAGGLSAALVLGFITQKLRMSPIVGYLLAGIIVGPYSPGFVADADTASQC AEIGIILLMFGVGLHFHLKDLLAVQRIVLPGAAAQIGAATLLGMVVSHGFGWSWAAGAVF GMAISVASTVVLTRVLSDNKALHTTTGHVALGWLVVEDLFTILLLVLLPAIVGGNEQSES VWHVLAVTTLKLGALVVFALVVGQRIIPKLLAYVARTGTRDLFTLSVLVLALGVAVGAAN FFGASMALGAFLSGMVVGQSEFSSQAAAEALPMRDAFAVLFFVSVGMLFNPASLISDWPL ILATLGIVLIGKPLAAFIVVKLFGKPLRMALSVAVALAQIGEFSFILAAMGVSFGVLPKE ASSAIIVTAIVSITLNPLLYRRINGAAHWLESKGIGLPRLDENELSHPEGAQRIIVVGYG PVGKQLTGILRDNNLDVVVVEMNIDTVRQVSALGEKIIYGDATQREVLLHAGAEEAEGLI ISSSSAPAGEIVEVARAINPNLRILVHTTYLRVAENLKSEGVSAVFSGEREVAVAMTEHL MRDFGATDEQIDRERRKLRERMA >gi|316922145|gb|ADCP01000130.1| GENE 3 3769 - 4263 512 164 aa, chain + ## HITS:1 COG:BH3641 KEGG:ns NR:ns ## COG: BH3641 COG1522 # Protein_GI_number: 15616203 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 18 163 3 148 152 62 29.0 3e-10 MYPGFRKSPTHRNVMKTKLDETNLAIIKALRNGRASFRDIAADLGLSEVTVRTRAARLIE 
DGVLDICGQVDVEKLPGHTLALVAIKLRSPDLVGEGEVFSHLRGVVSVAVLTGRHDLILT VLLNEEFGLLDFYTNEFSKCHDRVSSIETFIVYKGFSLKVPYIL >gi|316922145|gb|ADCP01000130.1| GENE 4 4379 - 5623 1630 414 aa, chain + ## HITS:1 COG:BH2816 KEGG:ns NR:ns ## COG: BH2816 COG0404 # Protein_GI_number: 15615379 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Bacillus halodurans # 2 258 4 246 365 161 36.0 2e-39 MLHQTALHATHEALGATLVDFGGWDMPLWYPTGAVKEHLAVIQHAGLFDIGHMAGLMVSG PDALEALQWAFTKDLSLSGPRAAYGAFLDASGGVVDDAIVYPLSEGRYFVVLNVGKGERI AESLSAWAASEGKSVTIRDLAGTYGKLDLQGPATVRVMTKLLKDPEAVFDKLPYFSFKGD IELANSDIFLADGTPIFLSRTGYTGEQGFEIFVPYDKIVDIWNRILEAGKDEGVIPCGLA ARDSVRAGAVLPLSQQDIGPWAFVNNPWPFTLPKDENGNWTKNFFGRSALEEALASGNVT YTYAYCGFDPRKVTSPDHDTHPQVLLDGEVIGDVLTCVADVSIGRVDGVITSVASPEKPE GFAPKGLVCGFVRVNRPLENGTKLTLADPRRKIQVETVTDIRPARTARKKLGSL >gi|316922145|gb|ADCP01000130.1| GENE 5 5960 - 6337 680 125 aa, chain + ## HITS:1 COG:ML2077 KEGG:ns NR:ns ## COG: ML2077 COG0509 # Protein_GI_number: 15828123 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Mycobacterium leprae # 1 122 1 130 132 113 44.0 9e-26 MSNIPAELRYSEDHEWISTSAPWKIGVSDYAQNALGDLTYVELPSVGDTFEKGQEFGTLE STKSVSPLFMPVAGTIKAVNEALVDAPEKVNQDPYGEGWLIEIESAEGVSELLDAAAYKA HLETL >gi|316922145|gb|ADCP01000130.1| GENE 6 6337 - 7659 1811 440 aa, chain + ## HITS:1 COG:TM0213 KEGG:ns NR:ns ## COG: TM0213 COG0403 # Protein_GI_number: 15642986 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain # Organism: Thermotoga maritima # 4 439 5 436 437 372 45.0 1e-102 MHRYFPHTAEDEAQMLGVVGADRVEDLFASIPADCRHEGALPLTAMTEWELTAQAEALAA SMPAAGSAWIGAGSYQHYIPAVVPALAGRSEFYTAYTPYQPEISQGTLQGIFEFQTLIAR LLGMEVANASMYDGATALAEGALMACRLTKRSAVAVSQALHPHYRTVLDTYCNANGIEVR DLTMTEEGGTSLEHLKDDAELAALIVQSPNVFGVVEHLAEQAEWIHARKGLLITGFTEAM 
AYGLLATPGSQGADIVCGEGQSFGLSQAFGGPYLGLMATRKAFVRNLPGRLVGQTVDNKG KRAFVLTLSTREQHIRREKAVSNICSNAGLCAMTCAMYLASMGGTGLRHMAQLNRDKAEY LKVGLAKLGFRPVYSGQTFNEFVMAAPAGFDAKWKRLLEEKKIMFGMDLSKWYPLADSYL FCVTETRSRADIDAILKEIA >gi|316922145|gb|ADCP01000130.1| GENE 7 7659 - 9140 1894 493 aa, chain + ## HITS:1 COG:lin1387 KEGG:ns NR:ns ## COG: lin1387 COG1003 # Protein_GI_number: 16800455 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Listeria innocua # 17 489 8 482 488 502 55.0 1e-142 MSNQSWPGVSSLVLNEPLLWEKGRPGRVGVSLPESDVPAAPYEAEGMVRTDLNLPDLAEL DVVRHYTRLSTWNFGVDTGMYPLGSCTMKYNPKINEKIAALPGFAGAHPLFPSEYSQGAL RVLYETEQMLCAVTGLEACSLQPSAGAHGELTGIMLIQAWHRAQGNTRTKMLIPDTAHGT NPASAALCGFSSVKVPLGPDGILTVEDVAALMDDDCAGIMITNPNTLGLFESNIKEIAEL VHARGGLVYGDGANLNAIMGYAHMGHIGIDVMHMNLHKTFSTPHGGGGPGSGAICVSKTL VPFLPGPRLVETDGKLDWVDNSVDAGGQSIGRMHPFWGQFGVVLRAWAYMKTLGPDLKRA SELAVLNANYIKFSLKDLYDLPFPQPSLHECVFTDKKQEEFGVSTMDIAKRLIDCGIHPP TIYFPLVVHGAIMIEPTESENLEDLDAFIEAMRSIAELAETDADLLHASPTRTKVGRVDE VGAARKPVLTGDM >gi|316922145|gb|ADCP01000130.1| GENE 8 9328 - 10368 1137 346 aa, chain + ## HITS:1 COG:SP1160 KEGG:ns NR:ns ## COG: SP1160 COG0095 # Protein_GI_number: 15901025 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Streptococcus pneumoniae TIGR4 # 1 338 1 323 329 251 39.0 2e-66 MYWIVNHSLDPTFNLALEEYFLGRIEPGHPGYAILWQNSPAIVVGRFQNTRQEVNADFVR ERGISVVRRMTGGGAVYHDAGTLNYTFIHHLDKEGALPAFSEAGKPIAEALQKLGLPVTF SGRNDLMLDGLKVAGVAHCRRGMRYLHHGCILVNSDLDVLSQALNVDPAKYKSKGVASVR SRVGNLAEYLAVHSPDLPPLTVQRVRDAIMEHRSGDEYRMNAADFVAITRLRDAKYSTWD WTYGASPPFTERKAQRFPWGKVEVSYDIRQGSVVECHFYGDFFMAGPGKSLTDEVSSCEI YDLEEAMRGVPYTNEALSPVLDRFPLHLLFSGCDPDELKAFLLPGR Prediction of potential genes in microbial genomes Time: Fri May 13 04:14:32 2011 Seq name: gi|316922136|gb|ADCP01000131.1| Bilophila wadsworthia 3_1_6 cont1.131, whole genome shotgun sequence Length of sequence - 9795 bp Number of predicted genes - 8, 
with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 16 - 1440 2047 ## COG0064 Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) - Prom 1462 - 1521 1.7 - Term 1570 - 1607 8.7 2 2 Op 1 . - CDS 1638 - 2270 878 ## DvMF_0579 hypothetical protein 3 2 Op 2 . - CDS 2273 - 4045 2152 ## DvMF_0580 hypothetical protein 4 2 Op 3 . - CDS 4042 - 4911 1003 ## COG0061 Predicted sugar kinase + Prom 4878 - 4937 2.8 5 3 Op 1 . + CDS 5133 - 5741 735 ## COG0279 Phosphoheptose isomerase 6 3 Op 2 . + CDS 5764 - 6000 207 ## + Term 6012 - 6061 4.8 7 4 Tu 1 . + CDS 6138 - 7064 1310 ## COG0181 Porphobilinogen deaminase + Term 7198 - 7263 10.0 8 5 Tu 1 . + CDS 7280 - 9745 2796 ## COG1067 Predicted ATP-dependent protease Predicted protein(s) >gi|316922136|gb|ADCP01000131.1| GENE 1 16 - 1440 2047 474 aa, chain - ## HITS:1 COG:aq_461 KEGG:ns NR:ns ## COG: aq_461 COG0064 # Protein_GI_number: 15605949 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) # Organism: Aquifex aeolicus # 4 474 5 476 478 514 52.0 1e-145 MSAYETVIGLEVHVHLCTKSKLFCSCPTTFGVEPNTNVCEVCAGMPGVLPVLNQTAVEYA TRAGLALDCTINNDSVFARKNYFYPDMPYNYQISQFERPLCEFGHLDINVGGETKRIGIT RIHMETDAGKNIHEGDHSYLDLNRAAVPLIEIVSEPDMRSAEEAVAYLKALHAIVVYLGI TDGNMEEGSFRCDANVSIRPKGAAEFGTRAELKNLNSFRHVQKAIEYEVSRQQDVLEDGD KVIQETRLFDSVKGVTMSMRDKEEAHDYRYFPDPDLVPVHITDEQLESWKASLPELPAQR RRRFMDEWKLPEQDADIITSDRAQADLFEAAVALYNEPRKIANYMTGPMLREINQTGVEL KDSALKPEALAELAKIVDGGLISAKIAQDIFSELYANGVMPEAYVREKGLVQVSDTSAID AAVAEVIAENPAEAEAYKGGKTKLISFFVGQVMRKMKGKANPALVNEALGRLLK >gi|316922136|gb|ADCP01000131.1| GENE 2 1638 - 2270 878 210 aa, chain - ## HITS:1 COG:no KEGG:DvMF_0579 NR:ns ## KEGG: DvMF_0579 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 208 1 217 222 240 62.0 3e-62 
MTTPTQLFTLLDACFDAQNEATALWHSQEPEADDVAPDASASPETLLALVRAQHLRNFRL WHIEDTARRRDVTPDVIADCKYRIDRLNQERNDRIERVDACLVALLSPLLPASPAPHINT ESLGMAIDRLSILSLKIWHMDEQVRRTDVTPEHIASCEKKLAVLKEQRADLSLAVKHLVT EFVEGTKTPKLYFQFKMYNDPTLNPELYGK >gi|316922136|gb|ADCP01000131.1| GENE 3 2273 - 4045 2152 590 aa, chain - ## HITS:1 COG:no KEGG:DvMF_0580 NR:ns ## KEGG: DvMF_0580 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 585 1 580 581 599 51.0 1e-170 MNRLPDIGQVSDLRLGDHPGFDAWFYSFCAENNIEHGINPSGVASPEQLRFMVAMDERQV YAPCSDATFRELAASFHLRSFPPRVRSQYIAAWRSIIRVVRYEKDRQKRRDMINYCRHRF RGCLALGNILPSRLVKRLVTTLISHFDAGDPWLNERLFYNETLASFLRSQTLQKALGRLP DGLSAEGIPDLRRALDLAELARLFHLAGRSHHTLTQLIHNCAAAESGKCELPDIFTGSEA FIPQVEELFPGPPRTFLYICAMEGGLALDLRIIQTLLRLGHKVILTLKEAPVYYAPTVWD VDRDPLLVDNLPESHIFKAPAASKNELLRRLRENRLLIISDGTGERLNLYRTSVTFARAW KESDAIIARGRCNRDVLLGTSHLFTRDVFCFWEDRGEVRMQLKPHAPGIRKFSEQALTAK ARTIIKSMRASKDSGKAVMFYSCIIGSIPGQTATAIKVADTFVRSLRERLDQVFIINPAE YFEPGMDGDDLMFMWEQVQRSGLINIWRFQSMEDIEASFGLMGLKVPPVWSGKDATFSTG CTKEMRIALDMQRSHPELQIVGPGPEKFFRRGDYGVGKFFDATISNANQE >gi|316922136|gb|ADCP01000131.1| GENE 4 4042 - 4911 1003 289 aa, chain - ## HITS:1 COG:RSc2650 KEGG:ns NR:ns ## COG: RSc2650 COG0061 # Protein_GI_number: 17547369 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Ralstonia solanacearum # 40 275 59 289 302 150 39.0 4e-36 MHSVTSILLVTKARHAAAEATAAAVSQWLEAHGVACSALPADCPSEKLVGRARTSDAILI LGGDGTFVGVGRKLAGLDIPLLGINFGQVGFLTELSAVGWEPALERLLAGKMITRTCLLL AWELLRGGTPIASGHAANDVVVGRGAIARVLPVHVFVDGEDMGVVRSDGVIVSTPLGSSA YALSAHGPLVHPKVQALTLTPISPFFKSFPPIVLPADSRIRLETDAAAPDAFLTVDGQEG IPLCGGDVIRVQSLDAGLRVLSCSSGTYFQRLRERGFIQTDGTSAPEHV >gi|316922136|gb|ADCP01000131.1| GENE 5 5133 - 5741 735 202 aa, chain + ## HITS:1 COG:Cj1424c KEGG:ns NR:ns ## COG: Cj1424c COG0279 # Protein_GI_number: 15792742 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Campylobacter jejuni # 30 191 32 
193 201 181 53.0 7e-46 MDKALQMISQHAEDGAKLRRDFFAVHGGKVAEAARRMAVAVARGGKILLAGNGGSAADAQ HWAAEFVNRFLMDRPPLPAIALTTDSSVLTSIGNDFGYDLVFTKQVQALGRPGDVFVGIS TSGNSANVIGALQAARKQSLFTIGLTGDGGGRMAPLCDILLAVGHPSTPLVQETHAAIGH MLCALTDYYLFENVMAIKPMLD >gi|316922136|gb|ADCP01000131.1| GENE 6 5764 - 6000 207 78 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPMFEYRCTRCGHEFEELVFGDENPVCPACRAEATEKLISKPCRHCEGGSLGDFGAPSAP SSGGGCAGCSGGNCASCH >gi|316922136|gb|ADCP01000131.1| GENE 7 6138 - 7064 1310 308 aa, chain + ## HITS:1 COG:hemC KEGG:ns NR:ns ## COG: hemC COG0181 # Protein_GI_number: 16131657 # Func_class: H Coenzyme transport and metabolism # Function: Porphobilinogen deaminase # Organism: Escherichia coli K12 # 4 306 13 314 320 304 57.0 2e-82 MEKLVIATRGSRLALWQANHVKDSLEAVHPGLAVELNIIKTKGDIILDVPLAKVGGKGLF VKEIEEALLSGAADIAVHSMKDVPMELPEGLILGIVPEREDPTDLFLSVDYDSLENLPAG AVVGTSSLRRQAQVLAQRPDLEVVSLRGNVDTRLRKLTEGQFAAIIMATAGMKRLGLSVP KECPLAPPAFIPAVGQGALGIEFREDRKDLAEMLSFMEHRPTRICVEAERGLLAGLDGGC QVPIAGHAEMLDDGTFELEGLVGEVDGSVIIRRKVTGRADEARAVGFALARSLAEDGGAE ILARVYAR >gi|316922136|gb|ADCP01000131.1| GENE 8 7280 - 9745 2796 821 aa, chain + ## HITS:1 COG:VCA0975 KEGG:ns NR:ns ## COG: VCA0975 COG1067 # Protein_GI_number: 15601728 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent protease # Organism: Vibrio cholerae # 36 801 33 781 786 436 32.0 1e-122 MSKIKSLPGQRLCSSLPPDRVPWETSDAIPRNGHRKVAPQPRALKALELALHIHTAGYNI YLSGESNLGRTYMLREFLEPRIRKAQTPPDLLYVNNFDDPDRPMLLSVPAGQGKKLRSVL SQALADIRKELPVRLDNEKYVKKRSELQDRFQAVRSGLIKQMDKTAGKEGFNLDMDEQGS LTLYPLVEGKRLSEDEYEHLDPTLRQGLKQKGDTLLRAMSGMVRKLNSAEQSFRSDERTL EQEVIGSVLSAVLTPAVERILKNCAGPDGEPDSTLKKYFEALRQDILDHPEGFLPRETQP AQALSAALSGDAQHPETDTYLYEVNVFVDNSGNKGAPLIIEDHPTTANLLGCVERESEMG ALVTDFTLIKAGSLHRANGGFLVLHLEDILQHPGAWEGLLRALRAGSARLEEAGDVPDGA VRTKGIEPQPVPLNLKVVLIGNEELYEALLTNDDRFAKLFKIKAHMTETVERTAADIRSW LIRVAGIIDETHLLPFSRTALAGLVDFSSRVAEDQRKLSLKFPLMRDVMIEASAMAAMKQ SALVEAEHLEEALEARMYRANLVEELYMEEYDRELIKVRTSGSAVGRVNGLSVSWYGDFE 
FGLPHQISCTVGVGHGGIIDLEREAELGGPIHTKAMLILKSYLVDQFARNKPLVMTGSLC FEQNYGGIEGDSASGAELAALLSALSGVPLKLSLAITGAVSQSGQIMAVGGVSRKIEGFF GICSRRGLTGEQGVIVPRDNVDHLMLSPRIREAVDTGKFGIYPVEHITEALELLTGIPAG KPLKNGGFTRDSLYDLVDRRLQEFGHSAEHAYSKPLRRRKK Prediction of potential genes in microbial genomes Time: Fri May 13 04:14:51 2011 Seq name: gi|316922134|gb|ADCP01000132.1| Bilophila wadsworthia 3_1_6 cont1.132, whole genome shotgun sequence Length of sequence - 1653 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 113 - 155 7.8 1 1 Tu 1 . - CDS 260 - 1444 1190 ## COG0477 Permeases of the major facilitator superfamily - Prom 1553 - 1612 2.0 Predicted protein(s) >gi|316922134|gb|ADCP01000132.1| GENE 1 260 - 1444 1190 394 aa, chain - ## HITS:1 COG:ECs5316 KEGG:ns NR:ns ## COG: ECs5316 COG0477 # Protein_GI_number: 15834570 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 17 269 42 309 453 62 25.0 2e-09 MPPKDIPMPFRTALPWTGFVALLFLLNYMSRSTLTPLLVSIEEDLGIGHAQATSLLLMQS AGFSSALAASGFLLSRFKPRQIATIPLAIAGGILLLMPLVHTLGQARLVFIAFGLGVGFY FPAGMATLSSLVFPKDWGKAVAIHELAPNTGFILIPLLAQAGLMFTDWRGVFAIMGVLMI CTAGAFLLWGRGGNTRTDAPSFKGCGVLLKNPASWIVALLMAVSMIGEFSIYSILQIFLV SAAGFGPEEANLGLSISRLAMPVIVIAAGWAADRFNAKRTVSACFLLHAVALCLMSVDAS VSRIPALCGVFLQAASMAFVFPPLFKVFAQCFSADEQPILLSLTMPLAGLISAGGIPFFI GYCGEYYTFGLAFLTIAAMSVASAVSVAYLKNRE Prediction of potential genes in microbial genomes Time: Fri May 13 04:15:54 2011 Seq name: gi|316921996|gb|ADCP01000133.1| Bilophila wadsworthia 3_1_6 cont1.133, whole genome shotgun sequence Length of sequence - 169568 bp Number of predicted genes - 156, with homology - 113 Number of transcription units - 92, operones - 35 average op.length - 2.8 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 29 - 1306 1639 ## COG1541 Coenzyme F390 synthetase - Prom 1411 - 1470 3.9 + Prom 1272 - 1331 3.0 2 2 Op 1 . + CDS 1506 - 1988 460 ## COG0799 Uncharacterized homolog of plant Iojap protein 3 2 Op 2 . + CDS 2014 - 3600 1571 ## COG0696 Phosphoglyceromutase 4 2 Op 3 . + CDS 3597 - 3839 211 ## DVU1620 hypothetical protein + Term 3947 - 3983 5.4 + Prom 3952 - 4011 5.5 5 3 Tu 1 . + CDS 4202 - 5092 542 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 5111 - 5170 2.5 6 4 Op 1 . + CDS 5370 - 7136 1803 ## DVU0536 high-molecular-weight cytochrome c 7 4 Op 2 6/0.000 + CDS 7151 - 8356 1598 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 8 4 Op 3 . + CDS 8360 - 9649 1890 ## COG5557 Polysulphide reductase 9 4 Op 4 . + CDS 9667 - 9810 148 ## 10 4 Op 5 . + CDS 9829 - 10509 852 ## DVU0532 hmc operon protein 5 11 4 Op 6 . + CDS 10543 - 11934 1747 ## COG0247 Fe-S oxidoreductase + Term 11954 - 11986 -0.9 - Term 12058 - 12096 6.2 12 5 Tu 1 . - CDS 12294 - 13508 1368 ## COG1454 Alcohol dehydrogenase, class IV - Prom 13566 - 13625 4.9 13 6 Op 1 . - CDS 14066 - 14548 216 ## gi|302342746|ref|YP_003807275.1| outer membrane protein domain-containing protein 14 6 Op 2 . - CDS 14439 - 14690 85 ## - Prom 14906 - 14965 2.7 + Prom 14870 - 14929 4.0 15 7 Tu 1 . + CDS 15057 - 15242 56 ## 16 8 Tu 1 . + CDS 15713 - 16453 299 ## COG1720 Uncharacterized conserved protein + Term 16666 - 16706 4.2 - TRNA 16704 - 16779 84.9 # Thr GGT 0 0 - Term 16654 - 16692 7.6 17 9 Op 1 25/0.000 - CDS 16827 - 17651 857 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase 18 9 Op 2 18/0.000 - CDS 17665 - 18126 509 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases 19 9 Op 3 15/0.000 - CDS 18160 - 19194 1049 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase 20 9 Op 4 . 
- CDS 19195 - 19749 769 ## COG2825 Outer membrane protein - Prom 19883 - 19942 2.9 - Term 20021 - 20047 0.3 21 10 Tu 1 . - CDS 20185 - 21861 1475 ## COG0860 N-acetylmuramoyl-L-alanine amidase - Term 21895 - 21930 3.2 22 11 Op 1 . - CDS 21976 - 24675 3695 ## COG4775 Outer membrane protein/protective antigen OMA87 23 11 Op 2 23/0.000 - CDS 24632 - 25372 228 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 24 11 Op 3 . - CDS 25369 - 26661 1408 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component 25 11 Op 4 . - CDS 26667 - 28250 2221 ## COG1190 Lysyl-tRNA synthetase (class II) - Prom 28383 - 28442 2.7 + TRNA 28673 - 28748 81.9 # Asn GTT 0 0 - Term 28747 - 28784 4.0 26 12 Op 1 . - CDS 28907 - 29029 93 ## 27 12 Op 2 . - CDS 29101 - 29373 84 ## 28 12 Op 3 . - CDS 29364 - 29741 65 ## DVU2043 hypothetical protein 29 13 Tu 1 . - CDS 30177 - 30806 56 ## COG0714 MoxR-like ATPases - Prom 30999 - 31058 5.4 - Term 31006 - 31042 5.1 30 14 Tu 1 . - CDS 31110 - 31352 150 ## COG2026 Cytotoxic translational repressor of toxin-antitoxin stability system - Prom 31575 - 31634 5.4 + Prom 31227 - 31286 4.9 31 15 Tu 1 . + CDS 31444 - 31623 64 ## 32 16 Tu 1 . - CDS 31662 - 32375 429 ## LI0183 hypothetical protein 33 17 Tu 1 . + CDS 33274 - 33525 73 ## 34 18 Tu 1 . - CDS 34273 - 36180 1135 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 36222 - 36281 7.0 35 19 Tu 1 . + CDS 36085 - 36273 70 ## + Term 36465 - 36501 -0.9 + Prom 36275 - 36334 1.9 36 20 Op 1 . + CDS 36512 - 37549 470 ## COG2515 1-aminocyclopropane-1-carboxylate deaminase 37 20 Op 2 . + CDS 37605 - 39242 690 ## COG0069 Glutamate synthase domain 2 38 21 Tu 1 . + CDS 39426 - 41009 656 ## COG3653 N-acyl-D-aspartate/D-glutamate deacylase + Prom 41461 - 41520 6.5 39 22 Tu 1 . + CDS 41540 - 42517 548 ## COG0549 Carbamate kinase + Term 42722 - 42765 -0.4 + Prom 43068 - 43127 3.8 40 23 Tu 1 . 
+ CDS 43197 - 43601 67 ## Sros_6673 hypothetical protein - Term 43879 - 43922 8.4 41 24 Op 1 . - CDS 43994 - 44257 165 ## 42 24 Op 2 . - CDS 44268 - 44627 315 ## COG0373 Glutamyl-tRNA reductase 43 24 Op 3 . - CDS 44723 - 44944 58 ## - Term 45556 - 45590 4.2 44 25 Op 1 . - CDS 45655 - 47109 968 ## ECP_4035 hypothetical protein 45 25 Op 2 . - CDS 47144 - 48727 875 ## COG0074 Succinyl-CoA synthetase, alpha subunit - Prom 48913 - 48972 3.1 + Prom 48675 - 48734 3.4 46 26 Tu 1 . + CDS 48847 - 49104 147 ## + Term 49137 - 49186 0.3 47 27 Tu 1 . - CDS 49069 - 49605 342 ## COG0590 Cytosine/adenosine deaminases - Prom 49709 - 49768 5.8 48 28 Op 1 . - CDS 49861 - 50550 459 ## Sterm_2685 AroM family protein - Prom 50577 - 50636 3.1 49 28 Op 2 . - CDS 50638 - 51909 597 ## COG0044 Dihydroorotase and related cyclic amidohydrolases - Term 52037 - 52071 -0.5 50 28 Op 3 . - CDS 52085 - 53563 492 ## DSY0401 hypothetical protein - Prom 53772 - 53831 9.6 51 29 Tu 1 . + CDS 54414 - 54731 91 ## - Term 54578 - 54618 -0.9 52 30 Tu 1 . - CDS 54670 - 55953 1319 ## Taci_0805 extracellular ligand-binding receptor - Prom 56033 - 56092 3.3 - Term 56079 - 56105 -0.6 53 31 Op 1 13/0.000 - CDS 56127 - 57542 1042 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 54 31 Op 2 . - CDS 57527 - 59989 1603 ## COG0642 Signal transduction histidine kinase - Prom 60016 - 60075 2.3 55 32 Tu 1 . + CDS 60082 - 60225 111 ## 56 33 Tu 1 . - CDS 60212 - 61570 1637 ## COG0534 Na+-driven multidrug efflux pump 57 34 Tu 1 . - CDS 61718 - 62353 614 ## COG1309 Transcriptional regulator - Prom 62408 - 62467 8.5 - Term 62568 - 62606 9.2 58 35 Tu 1 . - CDS 62691 - 63593 793 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 63856 - 63915 2.5 59 36 Tu 1 . + CDS 63977 - 64735 786 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit + Term 64746 - 64783 2.6 60 37 Tu 1 . 
+ CDS 64888 - 66021 1363 ## COG2768 Uncharacterized Fe-S center protein + Term 66074 - 66133 1.9 61 38 Tu 1 . - CDS 66011 - 66286 81 ## 62 39 Op 1 2/0.062 + CDS 66155 - 66808 549 ## COG1691 NCAIR mutase (PurE)-related proteins 63 39 Op 2 1/0.062 + CDS 66805 - 67647 671 ## COG1641 Uncharacterized conserved protein 64 39 Op 3 . + CDS 67712 - 68551 699 ## COG1606 ATP-utilizing enzymes of the PP-loop superfamily 65 40 Op 1 1/0.062 + CDS 68877 - 69488 506 ## COG0716 Flavodoxins 66 40 Op 2 . + CDS 69511 - 70050 591 ## COG0655 Multimeric flavodoxin WrbA + Term 70139 - 70183 12.0 - Term 70232 - 70285 2.7 67 41 Op 1 . - CDS 70301 - 71941 1957 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold 68 41 Op 2 . - CDS 71967 - 73460 2325 ## COG3333 Uncharacterized protein conserved in bacteria 69 41 Op 3 . - CDS 73475 - 73954 485 ## gi|302864357|gb|EFL87288.1| TRAP-T family transporter, small (4 TMs) inner membrane subunit - Term 73970 - 74007 6.1 70 41 Op 4 . - CDS 74017 - 75015 1497 ## COG3181 Uncharacterized protein conserved in bacteria - Prom 75104 - 75163 4.9 - Term 75172 - 75217 6.1 71 42 Tu 1 . - CDS 75247 - 77211 2098 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 77454 - 77513 2.2 + Prom 77142 - 77201 1.7 72 43 Tu 1 . + CDS 77342 - 78376 1159 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components + Term 78401 - 78437 3.1 - Term 78703 - 78739 5.5 73 44 Tu 1 . - CDS 78761 - 79768 1404 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component - Prom 79968 - 80027 12.0 74 45 Tu 1 . 
+ CDS 80496 - 80915 567 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism + Term 80947 - 80984 3.0 - Term 80986 - 81025 10.0 75 46 Op 1 24/0.000 - CDS 81046 - 81930 1296 ## COG1116 ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component 76 46 Op 2 24/0.000 - CDS 81943 - 82719 944 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 77 46 Op 3 24/0.000 - CDS 82734 - 83570 1204 ## COG1116 ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component 78 46 Op 4 21/0.000 - CDS 83567 - 84436 1174 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 79 46 Op 5 . - CDS 84586 - 85698 1761 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 80 47 Tu 1 . - CDS 85929 - 86324 248 ## - Term 87035 - 87095 7.9 81 48 Tu 1 . - CDS 87115 - 89007 1453 ## COG0642 Signal transduction histidine kinase - Prom 89127 - 89186 5.9 82 49 Tu 1 . + CDS 89434 - 90402 920 ## COG0657 Esterase/lipase + Term 90453 - 90494 10.9 + Prom 90483 - 90542 4.2 83 50 Tu 1 . + CDS 90606 - 90815 168 ## + Term 90857 - 90898 5.0 - Term 90891 - 90932 10.2 84 51 Tu 1 . - CDS 91062 - 92873 2872 ## COG1966 Carbon starvation protein, predicted membrane protein - Term 93083 - 93120 -0.7 85 52 Tu 1 . - CDS 93328 - 93759 399 ## - Prom 93948 - 94007 4.9 + Prom 94537 - 94596 3.6 86 53 Tu 1 . + CDS 94701 - 97736 2780 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain + Term 97743 - 97770 -0.9 - Term 97623 - 97653 2.3 87 54 Op 1 . - CDS 97811 - 98053 388 ## 88 54 Op 2 . - CDS 98104 - 98331 56 ## 89 55 Tu 1 . + CDS 98252 - 98878 729 ## COG1280 Putative threonine efflux protein + Term 99116 - 99150 -0.2 90 56 Tu 1 . - CDS 98939 - 99892 1257 ## COG0598 Mg2+ and Co2+ transporters 91 57 Tu 1 . + CDS 100466 - 100594 114 ## + Term 100635 - 100669 2.4 + Prom 100654 - 100713 3.9 92 58 Tu 1 . 
+ CDS 100829 - 102421 1726 ## COG2721 Altronate dehydratase 93 59 Op 1 1/0.062 + CDS 102603 - 103454 849 ## COG3618 Predicted metal-dependent hydrolase of the TIM-barrel fold 94 59 Op 2 1/0.062 + CDS 103470 - 104252 1070 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 95 59 Op 3 . + CDS 104265 - 105473 1738 ## COG0738 Fucose permease 96 59 Op 4 . + CDS 105655 - 106548 1077 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase + Term 106551 - 106601 14.9 97 60 Tu 1 . + CDS 106658 - 107443 860 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold + Term 107448 - 107488 7.2 - Term 107433 - 107479 10.2 98 61 Op 1 . - CDS 107498 - 109135 2084 ## COG0303 Molybdopterin biosynthesis enzyme 99 61 Op 2 . - CDS 109159 - 109836 947 ## DvMF_0370 outer membrane lipoprotein carrier protein LolA - Term 109845 - 109879 -0.2 100 62 Op 1 . - CDS 110077 - 112932 2747 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 101 62 Op 2 . - CDS 113007 - 113564 860 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) 102 63 Op 1 . + CDS 113563 - 113793 57 ## 103 63 Op 2 . + CDS 113759 - 114394 802 ## COG0218 Predicted GTPase + Term 114560 - 114609 16.1 104 64 Op 1 . - CDS 114746 - 115765 938 ## COG4974 Site-specific recombinase XerD 105 64 Op 2 . - CDS 115802 - 117091 1199 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake - Prom 117112 - 117171 2.1 + Prom 117144 - 117203 1.6 106 65 Op 1 . + CDS 117235 - 117711 437 ## COG2872 Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain 107 65 Op 2 . + CDS 117800 - 118708 1228 ## Ddes_1119 diaminopimelate dehydrogenase + Term 118736 - 118769 5.2 - TRNA 119057 - 119131 85.8 # Gly GCC 0 0 + Prom 119203 - 119262 1.9 108 66 Tu 1 . 
+ CDS 119303 - 120466 897 ## COG2821 Membrane-bound lytic murein transglycosylase + Term 120550 - 120591 -0.9 + Prom 120500 - 120559 1.9 109 67 Op 1 . + CDS 120602 - 122050 1601 ## COG1696 Predicted membrane protein involved in D-alanine export 110 67 Op 2 . + CDS 122069 - 124057 1559 ## COG0627 Predicted esterase + Term 124176 - 124230 16.7 - Term 124168 - 124212 11.1 111 68 Tu 1 . - CDS 124231 - 124893 650 ## COG0546 Predicted phosphatases - Term 125351 - 125386 -0.5 112 69 Op 1 . - CDS 125428 - 126441 1245 ## COG4213 ABC-type xylose transport system, periplasmic component 113 69 Op 2 4/0.062 - CDS 126435 - 128612 1722 ## COG2200 FOG: EAL domain - Prom 128831 - 128890 8.3 114 70 Tu 1 . - CDS 129434 - 130171 253 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 130330 - 130389 3.4 + Prom 130289 - 130348 5.3 115 71 Tu 1 . + CDS 130393 - 133380 2821 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains + Term 133393 - 133437 1.4 116 72 Op 1 . + CDS 133599 - 134615 842 ## COG1250 3-hydroxyacyl-CoA dehydrogenase 117 72 Op 2 . + CDS 134612 - 135058 513 ## 118 72 Op 3 1/0.062 + CDS 135134 - 136624 1680 ## COG3333 Uncharacterized protein conserved in bacteria 119 72 Op 4 . + CDS 136681 - 137631 1252 ## COG3181 Uncharacterized protein conserved in bacteria 120 73 Op 1 . + CDS 137750 - 138076 289 ## COG5561 Predicted metal-binding protein + Term 138107 - 138133 0.1 121 73 Op 2 . + CDS 138186 - 139883 1529 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] + Term 139930 - 139987 18.0 122 74 Tu 1 . + CDS 140017 - 142842 2225 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs - Term 142708 - 142765 2.0 123 75 Tu 1 . - CDS 142878 - 143492 521 ## Sterm_3311 hypothetical protein 124 76 Tu 1 . 
- CDS 143716 - 144660 938 ## COG2958 Uncharacterized protein conserved in bacteria - Prom 144681 - 144740 3.5 - Term 144767 - 144805 9.3 125 77 Tu 1 . - CDS 144832 - 145347 174 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 - Term 145948 - 145995 11.4 126 78 Tu 1 . - CDS 146021 - 147544 1988 ## COG0069 Glutamate synthase domain 2 - Term 147560 - 147615 16.7 127 79 Op 1 . - CDS 147626 - 149206 1896 ## COG3653 N-acyl-D-aspartate/D-glutamate deacylase 128 79 Op 2 . - CDS 149222 - 149959 804 ## COG0069 Glutamate synthase domain 2 - Term 149997 - 150060 5.3 129 80 Op 1 . - CDS 150080 - 152743 1702 ## COG0642 Signal transduction histidine kinase 130 80 Op 2 . - CDS 152821 - 153171 235 ## gi|302864281|gb|EFL87212.1| probable sensor/response regulator hybrid protein 131 80 Op 3 . - CDS 153183 - 154271 282 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain - Prom 154428 - 154487 8.6 132 81 Tu 1 . + CDS 156574 - 156813 130 ## + Term 156851 - 156896 -0.9 133 82 Op 1 . - CDS 156892 - 157383 293 ## gi|302861414|gb|EFL84351.1| conserved hypothetical protein 134 82 Op 2 . - CDS 157419 - 157850 303 ## 135 83 Tu 1 . - CDS 158238 - 158555 60 ## 136 84 Op 1 . - CDS 159231 - 159452 135 ## 137 84 Op 2 . - CDS 159449 - 159754 146 ## 138 84 Op 3 . - CDS 159732 - 160013 88 ## - Prom 160123 - 160182 2.7 139 85 Tu 1 . - CDS 160219 - 160551 218 ## - Term 160587 - 160631 8.4 140 86 Op 1 . - CDS 160640 - 160816 185 ## - Term 160838 - 160871 5.1 141 86 Op 2 . - CDS 160880 - 161131 130 ## 142 86 Op 3 . - CDS 161095 - 161361 119 ## 143 86 Op 4 . - CDS 161354 - 161770 178 ## LF82_p096 hypothetical protein 144 86 Op 5 . - CDS 161784 - 162017 98 ## 145 87 Op 1 . - CDS 162542 - 163201 400 ## 146 87 Op 2 . - CDS 163194 - 164030 505 ## Mmc1_2921 hypothetical protein 147 87 Op 3 . - CDS 164044 - 164334 123 ## 148 88 Op 1 . - CDS 164631 - 164816 106 ## 149 88 Op 2 . - CDS 164786 - 164968 107 ## 150 89 Op 1 . 
- CDS 165422 - 165664 335 ## 151 89 Op 2 . - CDS 165742 - 165927 122 ## - Prom 166014 - 166073 3.5 - Term 166035 - 166062 0.1 152 90 Tu 1 . - CDS 166092 - 166379 87 ## - Prom 166447 - 166506 5.1 - Term 167250 - 167297 7.3 153 91 Op 1 . - CDS 167523 - 167729 202 ## 154 91 Op 2 . - CDS 167726 - 167887 65 ## 155 91 Op 3 . - CDS 167880 - 168332 620 ## COG3793 Tellurite resistance protein - TRNA 168578 - 168652 71.4 # Gln TTG 0 0 156 92 Tu 1 . + CDS 168686 - 168922 56 ## - TRNA 168831 - 168918 59.6 # Ser TGA 0 0 - TRNA 168957 - 169040 28.2 # Pseudo ??? 0 0 Predicted protein(s) >gi|316921996|gb|ADCP01000133.1| GENE 1 29 - 1306 1639 425 aa, chain - ## HITS:1 COG:MTH1855 KEGG:ns NR:ns ## COG: MTH1855 COG1541 # Protein_GI_number: 15679843 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanothermobacter thermautotrophicus # 1 422 10 430 433 511 57.0 1e-144 MPREKIREIQLSRLQDLCRRVYANVPFYRRSFDEKGIKPSDVQSLDDLKFLPFTLKQDMR NNYPYGLFAVPMEMIQRIHASSGTTGKATVVGYTKRDIETWAECAARSLAAAGATANDIV QVSYGYGLFTGGLGAHYGAERLGATVIPMSGGSTKRQVQLMRDFGATVLCCTPSYALHLH DAGLEIGINIKDLPLRLGVFGAEPWTEEMRREIEDKLGITATNIYGLSEIMGPGVSQSCA KEQHGMHIWEDHFLPEIIDPVSGEILPEGSTGELVITTLTKQGIPLVRYRTRDITSLNYT PCSCGRTHVRMNRITGRSDDMLIIRGVNVFPSQIESLLLEVQGVSPHYQLILTRENNLDY VEVQVELDGAMFSDAIKDLQQREQRIQKGIKEFLGVTTKVRLVEPHSIPRSEGKAVRVID RRVMQ >gi|316921996|gb|ADCP01000133.1| GENE 2 1506 - 1988 460 160 aa, chain + ## HITS:1 COG:PM1922 KEGG:ns NR:ns ## COG: PM1922 COG0799 # Protein_GI_number: 15603787 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Pasteurella multocida # 41 135 8 102 103 88 41.0 5e-18 MEQHNSTSTASQATPANTPAPKAQAARKFSTASASEKTAALVKWLEESKARDVVALDVAG KSPCMDVVIVVTASSLRHAKSLADGLMEQCTKNNYEFLRMEGYQTGQWILADLNDIVVHI FQQDVRELFRIESLWKESPALYGEMPAVSPRHAEPAGDQD >gi|316921996|gb|ADCP01000133.1| GENE 3 2014 - 3600 1571 528 aa, chain + ## HITS:1 COG:BS_pgm KEGG:ns NR:ns ## COG: BS_pgm COG0696 # Protein_GI_number: 16080444 # 
Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Bacillus subtilis # 1 524 1 510 511 469 47.0 1e-132 MSLTPTLLLILDGYGLAPAGQGNAASLARTPNIDRLISLPGATRIDASGRAVGLPAGYMG NSEVGHLNIGAGRIVYQDMTRIDVAMETGELASNAALLELLANVKRTGGRLHFAGLLSDG GVHSHINHLGALMDIAAAHGVDVRVHAFMDGRDTSPTSGAGFLAQLGDMMARTRAAHSGV SVEQAALVGRFYAMDRDKRWERVKVAWDMMVHGEGQRASDPVAAVEALYAAGETDEFLKP QVFGDPADVCVRNGDGIFFINFRADRGRELVSAFHFPDFDGFDRGGVPALAGLVTMTSYD SSLHVPVAFPKENLVQTLGEVVADAGAHQLRIAETEKYAHVTYFFSGGREEPFPLEDRIL VNSPKDVATYDLKPQMSVLEVTDRFLEAWAAGPEKDGVPYTLAVCNLANPDMVGHTGVIE AAVKALEYVDGCVARLVEVVLSSGGRVLMTADHGNVEVMLDETGHPQTAHTCNQVPLVVV ERGADGERAVPLRSGGKLGDIAPTLLDLWGLAKPGVMSGESLVEGVRG >gi|316921996|gb|ADCP01000133.1| GENE 4 3597 - 3839 211 80 aa, chain + ## HITS:1 COG:no KEGG:DVU1620 NR:ns ## KEGG: DVU1620 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 79 1 79 80 103 70.0 2e-21 MSLSSRPVQPVPPESMELVFFYRCPGCGRRNPLIAPTQPSMTRCESCGEPFPVMPVDERT VHYVKIMLANGRAALDPDFA >gi|316921996|gb|ADCP01000133.1| GENE 5 4202 - 5092 542 296 aa, chain + ## HITS:1 COG:Cgl0553 KEGG:ns NR:ns ## COG: Cgl0553 COG0454 # Protein_GI_number: 19551803 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Corynebacterium glutamicum # 151 290 15 152 152 125 48.0 1e-28 MTTSHYQAARENLFGHVPFLQRRMPGMLIRDLGGALLVDSGLGSDTFNKVLWECGGDGRP CRVERGAEDGLLQASKDWFAGKRPRDLLELPGMALPPATERPFTVWASAEDEGALNRQRT AFGRNGYGIGELETGMALPLNRWTPEGRPPEGLRIVQVGTSAELSAYADVLAANWNPPDE DVKRLYKAAEQVLFQAVAPMRLFVGFAGEVPVVSGELFLSEKGNTAGLHMISTREAFRRR GFGGAMTAALLRAGQAAGASLAVLQASSEGEPVYRRQGFEPCGLFVEYVLSEEILF >gi|316921996|gb|ADCP01000133.1| GENE 6 5370 - 7136 1803 588 aa, chain + ## HITS:1 COG:no KEGG:DVU0536 NR:ns ## KEGG: DVU0536 # Name: hmcA # Def: high-molecular-weight cytochrome c # Organism: D.vulgaris # Pathway: not_defined # 1 588 16 560 560 466 46.0 1e-130 
MENGRKLLRGIALAAIAVIGLGGIALFASPSLAAQTRPDVIRIDAIGQLKKLEMPPAVFL HDEHTKALAATGQDCSVCHTPTANGHTVKFQRKEDGTDAKKLENIYHNGCIGCHENMASS NQKTGPLDGECRACHDTKLPFKAEQKPVKMGSKSLHYLHVSSKAIVNPANSEENCGVCHH VYDEKLNKLVWKKGQEDACAACHGEKAVGSTPSLQTAVHTKCVWCHENVAQSSRAYLTAQ VEAKKAAEPKSTKKLSAKEVQAEAAAEAASIEAAIVTGPTTCAGCHTEEAQSKFKQVNPV PRLMRGQPDATVLLPVNNAKRPVGAPEAGMKPVVFNHKAHEAAVDSCRTCHHVRIESCTV CHTVDGNKDGNFVKLADAMHAKTSDSSCVGCHQQTVMQKKECAGCHGAVPVMPADSCATC HKDVKGITSAQIADGSAFDLTKEQLADIAAKDLAAQPAPAKPFPAVEVPETVTIGALSND FEPSVFPHRKIYEALVKGAADSGLASAFHTSPTAMCAACHHNSPIADLKTPPKCASCHGI EADKMATSVNKPSLKAAYHQQCMACHDRMKIEKPAATDCAGCHTPRVK >gi|316921996|gb|ADCP01000133.1| GENE 7 7151 - 8356 1598 401 aa, chain + ## HITS:1 COG:ECs3881 KEGG:ns NR:ns ## COG: ECs3881 COG0437 # Protein_GI_number: 15833135 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli O157:H7 # 1 257 1 287 328 165 36.0 1e-40 MNRRSFLTLMGGLGIGSALGGAKSASAAGGTFHGYPDSKGVLHDTTLCIGCRRCEQACNK VNDLPKPEKPFTDLNVLNEKRRTSAKEWTVVNKYRLASLDKDVFRKSQCMHCEEPACASA CFVKAFTKNPDGSVTYDPTLCVGCRYCMVACPFNIPGYTYDRALNPLVQKCTLCHPRLQE GKLPGCVEACPTGALVFGKRKDLVKIAWDRITAHPERYQNHVYGEHEMGGTAWMTISGAE FKEVGLNEDLGTKAAGEYTAGALGAVPMVVGIWPVLLGGAYAITKRKEQIAKEEQNDAVN AAVARTEEEAAKKLQASLDKAAKEAAKEKDRAVADEVKKAVAEAEEALRAQLEAEKAEAV AKAAEEAAAKAAEEAAAKAAAKPSKPEKGKKGKHDTNGGEA >gi|316921996|gb|ADCP01000133.1| GENE 8 8360 - 9649 1890 429 aa, chain + ## HITS:1 COG:STM3148 KEGG:ns NR:ns ## COG: STM3148 COG5557 # Protein_GI_number: 16766448 # Func_class: C Energy production and conversion # Function: Polysulphide reductase # Organism: Salmonella typhimurium LT2 # 42 419 18 376 392 166 32.0 6e-41 MKLWKNLSNRLDAVADVMAAKPCMQGCVGKTLLSIPRTPGNLITAVILAIGLLITIGRFG FGLGAVTNLDDNNPWGIWIAFDLLCGVALAAGGYVTSSACYLFGMKRYHSAVRPAILTAF LGYAFVVVALLYDVGQPWRLPYPLLVSQGTTSLLFEVGLCVGIYLTVLFIEWSPVGLEWL LGMKDAPCWLVRLRPRMHTIRKAVLCFTIPLTILGVVLSTMHQSSLGALFLIAPSKMHPL WYSPFMPVFFFISSMVAGLSMVIFEGTLSHKALHNKMDETHLREADGVVFGFGRAASFVL 
IGYFIIKVLDTTMDNDWHYLASGYGAWFAVEMVGFVLLPAFLYALGVREKNITLIRVASV FGVLGIVVNRFNVCLVAFNWQLDSADRYFPSISEVFLSIFIVTLIVTAYRFVCSKMPVLY EHPDFKDAH >gi|316921996|gb|ADCP01000133.1| GENE 9 9667 - 9810 148 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEDFPTLYSFMIFTKGWTYVLMGVALVGFLGYWRFLFGRDERTGRPE >gi|316921996|gb|ADCP01000133.1| GENE 10 9829 - 10509 852 226 aa, chain + ## HITS:1 COG:no KEGG:DVU0532 NR:ns ## KEGG: DVU0532 # Name: not_defined # Def: hmc operon protein 5 # Organism: D.vulgaris # Pathway: not_defined # 1 226 1 226 226 280 69.0 2e-74 MYEFLTGPMLWAAFLIFVIGLARRVVLYIRGLDWRLERVAYGPGRKIGMKGAISSVLQWL VPFGTHSWRRQPYFTIAFFLFHIGVVIVPLFLAGHMVIMQERFGFSLPSLPMWFSDAFTV LAIVGAFMMILRRVALPEVRFLTTLGDWGILLLVLFVLVAGFLARMQAPGYESWLLWHIF AAELVLILAPFTKLSHIVLYFMSRGQLGMDYAIKRGGASRGPAFPW >gi|316921996|gb|ADCP01000133.1| GENE 11 10543 - 11934 1747 463 aa, chain + ## HITS:1 COG:AF0547 KEGG:ns NR:ns ## COG: AF0547 COG0247 # Protein_GI_number: 11498157 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Archaeoglobus fulgidus # 60 452 68 475 477 174 30.0 3e-43 MPEGKFCNRRPINTEEDLKALLGDKGGQQYYKEMEMLEVDTVALWETLQKTCKSRTRTWL EICAHCGLCANSCFLYTANNRDPKQVPSYKIQSTLGEMLRRKGQVDTKFMIDTMEVAWAK CTCCNRCGTYCPHGIDMGVMFSYLRGVLYSQGFVPWELKIGAGMHRVYGAQMDVTTEDWV DTCEWMAEENAEEWPGLEIPIDKEDCDVMYILNAREPKHYPEDVAQAAILFHVAGEKWTV PSVGWENTSLSLFAGDWEACANNVKRIYGAVDRLRPKMVVGTECGHAHRATVVEGPYWAG REDGKPPVPFLHYVEWLADALRKGKIKIDPAKRLKMPVTLQDSCNYIRNHGLAEHTRYIM SQIAEDFREMKCNREDNFCCGGGGGFNGIGKFRPQRDKGLYMKLQQIRETGAKMVISPCH NCWDAIRDMMEVYHVHDIKWSFLKPIIIDMMIVPDHLKPQDDE >gi|316921996|gb|ADCP01000133.1| GENE 12 12294 - 13508 1368 404 aa, chain - ## HITS:1 COG:TM0111 KEGG:ns NR:ns ## COG: TM0111 COG1454 # Protein_GI_number: 15642886 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Thermotoga maritima # 24 397 16 385 387 187 33.0 3e-47 MWDTRFDERDVKEIRTKTTTYLGVGAIAKIGDIAADLKARGIDSILCVTGGHSYKLTGAW DYVTAACAKHGIRISLYDKVTPNPTTDSIDEAAALGRSADAKAVLCIGGGSPIDAGKSAA 
IILANPDKTAEELYTFAFTPTKAVPIVVVNLTHGTGSEVNRFAVASITKHNHKPAIAYDC LYPLFSIDDPGLMIKLSPKQTRYVSIDSVNHVIEAATAVTANPLSILLAGETIRLVAEYL PKALANPEDLKARYCLSYASMIAGTAFDNGLLHFTHALEHPLSGIKPELAHGLGLSMILP AIIEECYPACSATLGTILRPLTPDLKGTPDEAHKAARGVETWLADMGVPQKLEDEGFTEN DIPRLCELTRTTPSLGLLLSVAPVPATPERIEKIYRRSLKKMDA >gi|316921996|gb|ADCP01000133.1| GENE 13 14066 - 14548 216 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302342746|ref|YP_003807275.1| ## NR: gi|302342746|ref|YP_003807275.1| outer membrane protein domain-containing protein [Desulfarculus baarsii DSM 2075] # 2 157 7 169 223 73 29.0 4e-12 MSIRSLLLCCILLFPCIASASPLSIQSSVSTLGVGVDISYRQNSFIAYRLSAFTQPFDTK IKLKGIQYKTEDSSHSVGLLLNIFPFENAFRISAGAYYFKLDTDLSTSVDAINPIYEELI NQTSGKATWERLAPYIGLGWERAQQSESSLGWHASLGPYL >gi|316921996|gb|ADCP01000133.1| GENE 14 14439 - 14690 85 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDSTHYQHIPAYLHWHRKILTRLKQFIVANSFLDMPIWRERRSREVYHVHSFSSLMLYPF ISMYRFGISAFHPIFCEHTWGWS >gi|316921996|gb|ADCP01000133.1| GENE 15 15057 - 15242 56 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAFFLFSLMLKPGQTGTAVWGRGELFPLGVFYAAACRGLGKRIYALCGGRRGICTRLLSL R >gi|316921996|gb|ADCP01000133.1| GENE 16 15713 - 16453 299 246 aa, chain + ## HITS:1 COG:yaeB KEGG:ns NR:ns ## COG: yaeB COG1720 # Protein_GI_number: 16128188 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 4 221 9 233 235 189 43.0 3e-48 MNVIASIRNDFQEKFGIPRQSGVVETALSEIIFEPDYRSAEALRGLEGFSHIWLIWAFSA SAGKGWSPTVRPPRLGGNARVGVFASRSPFRPNPLGLSAVKLEDIVLQSPEGPLLRVSGA DLLHGTPILDIKPYLPYADCIPQATGGFAGTKPEPRLEVSAAREVLQGLPTGKWKTLREV LALDPRPSYQDDPERIYGFAFAGRNVRFRVNGSTLEVLSIDAAGEKFRDGEIPTGSCTTA NKSPRP >gi|316921996|gb|ADCP01000133.1| GENE 17 16827 - 17651 857 274 aa, chain - ## HITS:1 COG:lpxA KEGG:ns NR:ns ## COG: lpxA COG1043 # Protein_GI_number: 16128174 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Escherichia coli K12 # 6 258 8 260 
262 247 49.0 2e-65 MASQQVHPTAIVHANAQIGKDVEIGPYAIIEEHVVIGDRCRIDAHAVIKDYTRMGVGNHI HSHALVGGEPQDLKFQGEVTWLELGDDNRIREFATLHRGTEGGGGITRIGSRNLCMAYTH IAHDCQLGNDIVMSNGATLGGHVRVDDFAIIGGLSAVHQFGHVGTHAFVGGMTGVAQDLP PWMLAAGSRALVHGPNLVGLRRAGASRETISAFKQAFRLIWRSEMPRSEALDLLANDYAN LPQVMEFVDFVRSSERGLCPAEKNVEKKLDDDAV >gi|316921996|gb|ADCP01000133.1| GENE 18 17665 - 18126 509 153 aa, chain - ## HITS:1 COG:RSc1415 KEGG:ns NR:ns ## COG: RSc1415 COG0764 # Protein_GI_number: 17546134 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Ralstonia solanacearum # 5 152 13 163 164 152 52.0 2e-37 MDQPTSFDIRKILEFLPHRYPFLLVDRVNEIVDGEYIKAHKCVSFNEPFFQGHFPGVPVM PGVLIIEALAQAGGLLVLHGFDPSETAGKLFLFSGLEKVRFRRPVFPGDRLVLECRLIRH KLKLWKMEGRAYVDGVLAAEAEITAAVTDRGDM >gi|316921996|gb|ADCP01000133.1| GENE 19 18160 - 19194 1049 344 aa, chain - ## HITS:1 COG:FN1909 KEGG:ns NR:ns ## COG: FN1909 COG1044 # Protein_GI_number: 19705214 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Fusobacterium nucleatum # 38 333 38 331 332 217 40.0 3e-56 MVARKLSEIAAQLGLILQGDDAEVIGVNTLEGAGPDEVSFLANPKYADQLAATRAGAVIV RPEHAGDVRRALISENPYQDFGRVLELFAAPQGSFSGISPLAFIHPDAELGDGVTVYPFV YIGPHATVGSGVKLFPGVYVGENVRIGKGTTVYPNAVLMAGTHVGEGCILHPGSVLGADG FGFARTPAGIQKIPQVGKVTIGNAVEVGANAAIDRAVLDATRIGDGTKIDNLAQIGHNVQ IGRNGFVVSQVGISGSTTVGDNCTFAGQVGIAGHLHIGDNVTIGPQSGVAKDIPSDVVVG GTPTVDQRTYMRTLALMPRFPELFKRLSKLEKELEELKGGKPDA >gi|316921996|gb|ADCP01000133.1| GENE 20 19195 - 19749 769 184 aa, chain - ## HITS:1 COG:PA3647 KEGG:ns NR:ns ## COG: PA3647 COG2825 # Protein_GI_number: 15598843 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 9 168 10 167 168 60 26.0 2e-09 MFRVLSLALVACLLMAGSAFADMKIGVFNSQAVAMDSDAAKAAQQKLQSQFGAEKTQLEK QAKDLQSKGEALQAQVAKMSAKEREDKQMEFLRLRRAFEEKSRNFARKVENTENQIRQSM 
AQQIYDAATTIAKQQKLDLILDAAAGSVMYATPTLDITKPVLAEVNRLWKAGGSKFPEPN TGKK >gi|316921996|gb|ADCP01000133.1| GENE 21 20185 - 21861 1475 558 aa, chain - ## HITS:1 COG:jhp0709 KEGG:ns NR:ns ## COG: jhp0709 COG0860 # Protein_GI_number: 15611776 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Helicobacter pylori J99 # 340 546 248 461 469 125 38.0 2e-28 MILTGSQTPHRSSPAIGRLFRPLLVAFALALFVTAFSDAHAAPSFEQQYEKAKQDMEFLK SDSKRGGWREPWEKLAQSFFDLHEKYPRWRNRPAALFRSALAMEELAKRSMLRQDAQAAV DRYGVFLKSYASHVLADDALFGIARIKAERFNDFSGAQEALNTIQNQYPRGDVAPEAKLY AQRLKAALEAAKSSTPGKKTAALLTDMKWENQKNLAVITLEFDRPIIWSIDTQSGSKKND IPNRMVVDLMGVNPASTIRPGIKVQGSSLRRMRLDLSAPDKTRLLLDFRQVKRFMVKTES SPFRIVITTSATDAAMPRGISFGQGLQSSDPLFRFATRSIMIDPGHGGKDPGTVHNGVIE REVTLDIAKRVGSILAGKGIQVHYTRTDNTWVSLSTRSYMANQARADVLLSIHVNASPNE NACGLETYFPDTASSSADAGKLAGLENANGGHKGDTGSPAASSSVRSQESRRLADTVQKC TLSVMKEKTFTVRDGGVKSGPFKVLLGANIPGILVEVGYCSNAQEAANLAIPTYRQALAE GIASGILAHLGSLDGNKR >gi|316921996|gb|ADCP01000133.1| GENE 22 21976 - 24675 3695 899 aa, chain - ## HITS:1 COG:RSc1412 KEGG:ns NR:ns ## COG: RSc1412 COG4775 # Protein_GI_number: 17546131 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein/protective antigen OMA87 # Organism: Ralstonia solanacearum # 161 899 26 765 765 281 29.0 6e-75 MKRCVLALLALALCNLLFHGTAQAAPKQAASDAVRVAVLPFQINGGADLNYLNDELPKLF EQRLAQKGFKVIPAKDTLRLLKKQQISQLNLATARSIARQAGANYAVYGSFSKTGDAFSI DARMVDAAGQQGAQPYFAQKQNMIELLPAVDELVDKMGGGAVKGDAITDIQIRGVKVLDP DVVLMRLSTRKGDVVDPSTLNEEIKRIWDLGYFSDVNADIEQGGQGRVLVFTVKEKPRID DVIVNGSDEVKKEDILAAMSSKTGSVLNDRLLAEDIQKVTELYRKEGFYLAEVNYRIEQK ANSASAVLVFDVKEGKKLYIKEIKIEGLQGIDAGDLKKELALQEHGILSWVTGTGVLREE YLERDSAAITAYAMNHGYVDIQVAAPQVDYSEDGITVTFTVNEGKRYKLGEIGFKGDLID SDERLKEVIKLDDYKDSNGYFSLTVLQDDIKALTDFYGNYGYAFAEVDVDTPKNDEEGII NVFYVPHKKQKVHIRRVVTQGNTRTRDNVIFRELRLADGDQFDGSKLRRSNERLNRLRYF TQADTTIVPTDKDDEVDLRVNLKEDRTGAIMGGVGYSTFYQFGVSGSIQERNLFGRGYSL GLQGFVSGKSSYLDLSFVNPRIYDTDFGFSNNSYAIWEEWDDFKKKTIGNTIRLFHPIGE 
YTSVSVGYRLDRYTLFDIPDTASRAYKEYEGKNLSSVVSTSITYDSTDSRERPTSGVVGR VSMEYGGGGIGGNDNFFKPIGELQSFHTLFRNPNHIFHWRGRVGGVFENSNKPVPVFDRF FIGGIDSIRGYDSEDLAPRDPKYHDEIGGDRMGFLNFEYIWTFHPEMGIALVPFFDIGYN IDSKQTSDPFSDLKKSVGLELRWRSPMGDLRFAYGFPLDKNVKGDRPGGRFEFSMGQFF >gi|316921996|gb|ADCP01000133.1| GENE 23 24632 - 25372 228 246 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 1 223 1 211 312 92 32 1e-17 MMEQTAYATAPPVLYELSGVEKSCEGPAERITILKNMDLTVRAGESLAIVGASGSGKSTL LHLLGALDTPTAGKVLFEGKSLPDMTPVEKAHFRNKKLGFVFQFHHLLPEFSTEENVAMQ ALIAGMNRSKALGLARQALERVGLSDRKDHRVTTLSGGERQRAAIARAILLEPRVLLADE PTGNLDQRTGDSVAELLLELNRTLGMTLVIVTHNREMAGTLGRCLELRSGELYEEMRTRA AGAGAV >gi|316921996|gb|ADCP01000133.1| GENE 24 25369 - 26661 1408 430 aa, chain - ## HITS:1 COG:RSc1117 KEGG:ns NR:ns ## COG: RSc1117 COG4591 # Protein_GI_number: 17545836 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Ralstonia solanacearum # 3 430 5 416 416 244 35.0 2e-64 MRFELFVALRYLFARRKQAFIYIISAMSVLGVAIGVAALVVVLGVYNGFTTDIRDKILGA NAHIIVSGPLAALTAPPSDADFEKMRGGEAPELSGMAAVLRRINATPGVVGSTPFLYAEG MLSSPRGVKGVVLRGIDPKTAPDVLTMLDNLSSGSVADLNASVEGAPPGVLIGKELAQRL GLVVGSRVNLLSPSGQKTTAGFQPRIRPYKVAGIFQTGMFEYDSSLGFISLAAARDLLGL PSDQISGIEVTVKDVYKADTVAAALQKELGPLASVRTWMDMNANLFAALKLEKIGMFILL AMVVLIGSFSIVTTLVMLVMEKTHDIAILMSMGATKGMIRRIFMLQGTIIGVVGTLLGYV LGIGLALLLQRYQFIKLPPGVYTLDHLPILLNWVDIVVVGVSAMTLCFLATIYPARQASS LEPAEALRFE >gi|316921996|gb|ADCP01000133.1| GENE 25 26667 - 28250 2221 527 aa, chain - ## HITS:1 COG:ECs5111 KEGG:ns NR:ns ## COG: ECs5111 COG1190 # Protein_GI_number: 15834365 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Escherichia coli O157:H7 # 36 526 14 504 505 498 51.0 1e-141 MSSPETTHHKPVKLATKSAHASYFMPMLQSLADKDELNEVVKNCVVKSCELMDAGVPLHL 
NGFVKTHDAGAIRAEYEACSATELEELNESFTCAGRIISHRSFGKVAFFHLMDRSGRIQC YASKDLMGEEAYKHFKKFDIGDIVGVHGKLFRTKTDELTLACDQVTLLTKSFRSLPEKHK GLTNVEIRYRQRYVDLITNPKVRDIFRKRSKIIRTFRAFLEERGFVEVETPMLHPVAGGA AARPFITHHNALDMELYMRIAPELYLKRLLVGGFEKVFELNRNFRNEGISIRHNPEFTMC EFYWAYATFYDLMDLTEELFSTIATEVCGSPIITYQGTELDLTPGKWQRLTFHESLEKIG GHSPEFYNDFDKVLAYVRERGEKILKTDKLGKLQAKLFDLDVEPKLVAPTFIYHYPTEIS PLARRNKDNPDITDRFELFIMGREYGNAFSELNDPIDQRMRFEQQVREKAAGDDEACAMD EDYLRALEYGMPPAAGEGIGIDRLVMLLTDSPSIREVIFFPLLKQES >gi|316921996|gb|ADCP01000133.1| GENE 26 28907 - 29029 93 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEKLLYFVGGIIVGSIGTLAALGFTEEELDEMRQEEDESL >gi|316921996|gb|ADCP01000133.1| GENE 27 29101 - 29373 84 90 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFIGRRVWKCPRCGYVGEPDSTRLGERLGTAIGGIIGGFVGYCACGGSLIPTTGKYRGLV SGLMFGASTGNQIGQALDMYLIHVCKCPCC >gi|316921996|gb|ADCP01000133.1| GENE 28 29364 - 29741 65 125 aa, chain - ## HITS:1 COG:no KEGG:DVU2043 NR:ns ## KEGG: DVU2043 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 9 119 421 531 533 93 39.0 2e-18 MRVRSIRSLKSGQPVHARINLTPRGGTPLAPALWWIMKQLLFAKEQRKMLLVLTDGQPHD MNTTQKPVEIASKTGLEVYGLGMLDRSIGNFLPDTSRVICRLEELPAMLFELLHDVLTRK SRSCL >gi|316921996|gb|ADCP01000133.1| GENE 29 30177 - 30806 56 209 aa, chain - ## HITS:1 COG:YPMT1.88 KEGG:ns NR:ns ## COG: YPMT1.88 COG0714 # Protein_GI_number: 16082881 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Yersinia pestis # 6 147 170 313 411 97 40.0 2e-20 MHDLDPLYLYGPTGCGKTTLIKQLAARLNYPTFEVTGHGRLEFADLCGHISLQKGSMVYE YGPLPLAMLYGGILLINEADLLSPEVAVGLNGVLDGAPLCLPENGGEVISPHEMFRITCT ANTNGGGDDTGLYQGAVRQNLALLDRYMAYEVGYSNPSSKNGYRNNHNEVHLQQKIRMIQ RLHKASFQEEKRNYETILRICSPKRWGRF >gi|316921996|gb|ADCP01000133.1| GENE 30 31110 - 31352 150 80 aa, chain - ## HITS:1 COG:STM1550 KEGG:ns NR:ns ## COG: STM1550 COG2026 # Protein_GI_number: 16764895 # Func_class: J Translation, ribosomal structure and biogenesis; D Cell cycle control, cell division, chromosome 
partitioning # Function: Cytotoxic translational repressor of toxin-antitoxin stability system # Organism: Salmonella typhimurium LT2 # 1 80 1 80 94 104 70.0 4e-23 MSYKLRFHEQALKEWKKLGTTLQEQFKKKLAERLEQPRIPSAALSGMSDCYKIKLKAAGY RLVYRVEGDIMYVTVIAVGK >gi|316921996|gb|ADCP01000133.1| GENE 31 31444 - 31623 64 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPGGDKVGLRLVMIQHGNRVSLRRSKNTRKISLQFRLTVMSAWIVSILDLFLASIYTLN >gi|316921996|gb|ADCP01000133.1| GENE 32 31662 - 32375 429 237 aa, chain - ## HITS:1 COG:no KEGG:LI0183 NR:ns ## KEGG: LI0183 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 2 233 147 385 400 140 34.0 5e-32 MQIGDQTASLKTYWDLPVFNLVQNTGEVLLYRTFPEICNECLGVSLLDASACDIQGWVLC LSMCDAKAFGPFFPKDTDISRCLDMASEFCETMIALRDNLLNLNTIQTVQGLSSLCPSCL WRKDCPHFKGSSHPEWEDTLAQFMDLKTQKKSIEAEIGELESRLKVAYQLSHTVKGEWIN TGNHTFRVIPQNGRVTLDRKRLAEELNTLLGGQKAQTLMAKCEKQGEPFERLYAIRI >gi|316921996|gb|ADCP01000133.1| GENE 33 33274 - 33525 73 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPTPWGSALGRGTTARLDCHCPDSAWVNFVTKPRYPGSAGTGYWQSQMPFESNASRHESM TLFKVAYLLLVNGLLASLVAVGR >gi|316921996|gb|ADCP01000133.1| GENE 34 34273 - 36180 1135 635 aa, chain - ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 31 552 28 556 627 290 32.0 8e-78 MIQDYKLAALVHSQLVADVVHNEILRLGWDIHIEVTSYETALQDAASLLEQGYEALLCHG GFREELFARFGPCIVFIERSDIDLIKSLAEARKISTTVALTAHVNETRDIEFMEQLLDMS IIPVRYTLKDDLARKIQELFAQGVQVFVGGGGTGRIVSRLGGSVFLDLPQRANIRNALNR AIILAENIRMERAYRSNIQAIMHYSKEGMICINTEYEVVFHNDQALNILRVATPNRLVPF FRPLYLEEVLREQTPHIDKLVTINERQLLINAFPLQLSSTATGALCFIHDVRSLQRISRK ISADLQARGLVANYTCRDIVGNAPAIVRLKKNIQRYAHTDIPVFIYGETGTGKELVAHSL HAASSRSQQPFVVVNCAALPDPLLESELFGYEEGAFTGAKRGGKQGLFELANGGTLFLDE IGDISHAVQLRLLRVLDAREVMHVGGDRFIPINVRVLCASNKQLLHLVRKGTFRMDLYFR 
LTGVCLEVPPLRNRLEDIALFAAPLLRRYGKDKKALGPEIQKRLKAYNWPGNVRELMAVL ESYLVLLGDQATSLACFEEVFRHRDVLPLTPPPLPAATTPDAGDDHNLGRQTTDFRRRKV MEELALNLGDRRKTAAALGISYSSLRRILENMKED >gi|316921996|gb|ADCP01000133.1| GENE 35 36085 - 36273 70 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNIPSKTKNFVVYNVRYQLGMNKSCQLVILDHDFSLCSKIDRYYAFLINIIMQNGLVNQV FY >gi|316921996|gb|ADCP01000133.1| GENE 36 36512 - 37549 470 345 aa, chain + ## HITS:1 COG:CC2032 KEGG:ns NR:ns ## COG: CC2032 COG2515 # Protein_GI_number: 16126275 # Func_class: E Amino acid transport and metabolism # Function: 1-aminocyclopropane-1-carboxylate deaminase # Organism: Caulobacter vibrioides # 9 339 3 323 333 193 36.0 4e-49 MIEHIRQRIAHAPRIPLGVWPTPFMAMDHLRETIGSSCPRLWVKREDLTPLGAGGNKVRK LEFVLARAMAEGADVLLNTGEVQSNQVVQTAAAAAHLGIPCELFLGCMDPPLSEDEKDTG NILLCRILGARIHLLPPGEDRAAAMRTRAEELKKEGRHPYIIPRGSSTQEGSLGSLSCFF ELLEQAVEHDFVPDAIVVTVGSSGTTAGFLVGAQAMRRTMNRKIGIWAFDVFGSEYPVSA HDRIMSHAEESWRSLELPGNCGEDSLHLSGEFVGPGYCRPYQGMLDAVRLVAGAEGFVAD PNYTGKCLAGLLHLLRSGVFRPDQNIVYLHSGGLPALFAMHRYFN >gi|316921996|gb|ADCP01000133.1| GENE 37 37605 - 39242 690 545 aa, chain + ## HITS:1 COG:TM1058 KEGG:ns NR:ns ## COG: TM1058 COG0069 # Protein_GI_number: 15643816 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Thermotoga maritima # 21 543 2 506 508 387 39.0 1e-107 MSIPKSNEKVGAMNRGTPCESGLCNFCRTDCQGRCETWLSSLEGRRTLTPREYGNATIGS STTREVGISYDALRIRGRVFGGAGLDKEKARLGDAAYCDAVLSTRIGSHVKAPSRYPFVI GALSRNPVVDKYWDSFAIGAALCGIPLVIGENVGGGDNKTEFNPDGSIRALPDLDRRIAV YQRYYDGQGMLIVQVNVNDAYNGVSEYLARKYGDKICIEIKWGQGAKPIGGEGIIRDIEY ARFMKSRSYCLRPDPDDPDVQEAFRTGHVEYFKRYTGLTYPSLDTCEEVLSELASKVREL RDQGASSLALKTGSFGMADLALAIRAASELDFEMLTIDGSGGGTGMSPNDVLDTWGVPSL LLHAKAYDYAALRASAGKSVVDLAIGGGLARPSQVFKALALGAPYVKAVCMSRAFMIPAF LGCNIEGALHPERRAAVNGAWDKLPKSVLDIGDTPETIFAGYHALKERLGAEEIRHVPYG AIAMWTMCDRLGAGLQHLMGGARKFGVDYIDRTDIAAINRETANETGIPFITDQDDDMAR QIVAG >gi|316921996|gb|ADCP01000133.1| GENE 38 39426 - 41009 656 527 aa, chain + ## HITS:1 COG:PAB0090 KEGG:ns 
NR:ns ## COG: PAB0090 COG3653 # Protein_GI_number: 14520359 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: N-acyl-D-aspartate/D-glutamate deacylase # Organism: Pyrococcus abyssi # 3 525 5 523 526 417 43.0 1e-116 MLDIAFINARIVDGTGTPWQRGCVGVHDGRIVQVGSNGDLPEAANVIDVDDHVLCPGFID SHSHSDLSVLENPESSAKIMQGVTTENVGLDSLSVAPISDRNKEQWQISLSGLDGILNRA WDWNSFSSYLDRVDAAMPSVNISSYVGLGTVRLDVMGMENRPPTAAELRRMKESIAECMQ QGARGISAGLIYTPNKYQSTEELIALAKTAASYDGLFDVHMRNEADHMTEALDEIIHIAR ESGIRVMITHFKARGRRNWGSGRKHLDTIDRARQEGVDISIAQYPYTANSTLMHVVVPPW YHSRGMDGMLKALVEEREQVKKDMLTTDGWENFSEVMGWENIYVSSVNKPQNSWCEGLSA VQIGERLHCSPEDAILDLLIDERLAVGLLGFGMSEEDVIEGIKHPSMCVITDGLLSGKKP HPRTYAAFPRFLARYVREKHILTLEEAVRKMTFNTARRLRMKHKGLILQGMDADIVVFDE QRILDVNSFEEPRIHPQGIDLVCVNGEIVVRDGVHTGARPGRTIRDH >gi|316921996|gb|ADCP01000133.1| GENE 39 41540 - 42517 548 325 aa, chain + ## HITS:1 COG:PAB0593 KEGG:ns NR:ns ## COG: PAB0593 COG0549 # Protein_GI_number: 14521090 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Pyrococcus abyssi # 17 324 5 313 314 295 49.0 6e-80 MPTSSVSAYGKDRNKLIVVAVGGNALLTERQRGTFKEQMENIEYCSTELCKIIQDGYQLV LTHGNGPQIGLIMLRDSLSRVVRPAMPLYVYSAESQGQIGFLLMRSLCNHVALVGIQPRI TTILTQVIVRRDDPAFSRPTKPVGPFYTLEEMQDLQKISPFPYVEDSGRGYRKIVCSPEP VGIVESSFIKGLTSLGLSIIAGGGGGIPLVQNDDGTLSGCEAVIDKDLTSALLAKEIGAS TLVILTGVERVAINYNRPEQQWLDTVSLAQMKQYYQEGHFAEGSMGPKVLAAMRFVENSG SRAIITCLEKVVEALRGQTGTIIVA >gi|316921996|gb|ADCP01000133.1| GENE 40 43197 - 43601 67 134 aa, chain + ## HITS:1 COG:no KEGG:Sros_6673 NR:ns ## KEGG: Sros_6673 # Name: not_defined # Def: hypothetical protein # Organism: S.roseum # Pathway: not_defined # 22 133 171 292 296 61 43.0 1e-08 MHKAQEWIRGADFSPSLFVSGLMGLGPGLTPSGDDILGGILIALHCLGRLDQAGILWDAI GREAHRTNRISQAHLYAAAHGMGVEVLHGWIHALQGEADRQTLYSLCERLGRVGHTSGWD AALGACLASLNARP >gi|316921996|gb|ADCP01000133.1| GENE 41 43994 - 44257 165 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MELAKRIIRRHGATSGHADGARERVKELATQFRCEAVPFDQMIERLANIDIVISSTESPE 
AIIRARDFRDVLKRRMMEQLDQLVAEV >gi|316921996|gb|ADCP01000133.1| GENE 42 44268 - 44627 315 119 aa, chain - ## HITS:1 COG:VC2180 KEGG:ns NR:ns ## COG: VC2180 COG0373 # Protein_GI_number: 15642179 # Func_class: H Coenzyme transport and metabolism # Function: Glutamyl-tRNA reductase # Organism: Vibrio cholerae # 2 117 45 161 419 96 48.0 1e-20 MLILSTCNRVEILAVGKGNIVGREIDAWARARGHSTSELAPYVYVHKDEVAITHLFTVAS SLDSMVLGEPQILGQLKEAYRNATQARTTRIINRLLHKSFSVAKRVSTETDIAASAVIH >gi|316921996|gb|ADCP01000133.1| GENE 43 44723 - 44944 58 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGAPHEMPVSCGTVIRWPDTGPMGEFASPPLLPELRFFLSFFTLELEKSPGAAFTLSVSF TALLALRSGKDLP >gi|316921996|gb|ADCP01000133.1| GENE 44 45655 - 47109 968 484 aa, chain - ## HITS:1 COG:no KEGG:ECP_4035 NR:ns ## KEGG: ECP_4035 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 8 482 2 467 474 530 57.0 1e-149 MTCNTQPSSIVTRPLNVINLGVDSFADPLAFAQVPVENCAWQPPAEGDSELGWKLAELLN NPEIDAANEVALSRFMEANPVLIGVGTAGQTIPGFEGRMLLHAGPPIDWERMCGTQQGGV IGAIIYEGWAETPEEARIMVEEGKVRLAPCHHYNSVGPMAGVTSPSMPVWIVENATHGNR TYCNFNEGTGKVLRFGSYNDETLNRLRFMADILAPVIASGVANLPEPLELKNIMAMALHM GDEIHNRNLAATTLFFRQLAPAAVRGMANAPTGSPAAKSENVAAALAFISATDHFFLNLT MPACKAMLDAAAGVPKSSLVTTMARNGVEFGIRISGLPDQWFTAPSPYVDGLYFPGFGPE DAGRDLGDSAITETAGVGGFAMASAPATIQFVGGTVADAIRNSRVMRRICLTTNPAFSLP ALDFAGTATGIDCRKVVDTGILPAINTGIPHKEASIGQIGAGRTHAPLACFTKAVAALCD TVKS >gi|316921996|gb|ADCP01000133.1| GENE 45 47144 - 48727 875 527 aa, chain - ## HITS:1 COG:ECs0369 KEGG:ns NR:ns ## COG: ECs0369 COG0074 # Protein_GI_number: 15829623 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Escherichia coli O157:H7 # 1 521 1 505 515 338 43.0 2e-92 MKTERRIFQNVYLDSVTLMRISANLMKLEGVAEASVLMGTPSNMDLLVQSGLLAEPPVVK SSDLVVVIKGDEGVLDNAFTEAAEALRPKESGTGEVSILPPPTSLRCAVAENADSSLALI SVPGAFAAAEALKAIGAGLNVMIFSDNVPVWQEVALKRAATARGLLVMGPDCGTSIINGA PLAFANAVRRGSIGIVGASGTGIQQVTCLIHGLGQGISQAIGTGGHDLSAEVGGLTMLAG 
IDALARDDSTKIITLISKPPSREVGTTILKAAADTGKPVVVNFFGADTVALIATLDASAC SRFCIATTLEEAAHRSVALAESKAPTFTSVIPGTNSPAETILRARAKALRAQASLTPQQT RLRALYTGGTFCYEAQWLLGNCLGDIYSNAPAGSSNSLENPFKSTGNTIVDLGDDVFTRG KPHPMIDPTPRNGRLIQEMADPTCGVLLLDVVLGYGSHEDPAGELAAAIAKGKAATANPP ICIASVCGTDQDPQHLDEQCAKLEAQGVVLCQSNAQAAALAACVLMA >gi|316921996|gb|ADCP01000133.1| GENE 46 48847 - 49104 147 85 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHRDIAGSVTCSLRAMLLCYKNEFHLKHACLIWGLTGIRKIVGTIPFMLFRLRHFFDTLL VHSESALYLVQYRVLIVLFFLSVFF >gi|316921996|gb|ADCP01000133.1| GENE 47 49069 - 49605 342 178 aa, chain - ## HITS:1 COG:MA3407 KEGG:ns NR:ns ## COG: MA3407 COG0590 # Protein_GI_number: 20092219 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Methanosarcina acetivorans str.C2A # 26 132 13 118 162 107 48.0 8e-24 MNADFQRRVKNTIQLDPEELAQKKKFMELALEEGRKAMNSGAGGPFGAVIVRNNEVVSCG YNMVFETLDPTAHAEIVAIRNATQKLGQLDLSDCELFTTCEPCPQCLAATYWAGISRIYY GITQEENVKMGFPGAKRMYEAFEKHGSDKQVVVFAYEDICQKLMAEWLKKDAQEKQYY >gi|316921996|gb|ADCP01000133.1| GENE 48 49861 - 50550 459 229 aa, chain - ## HITS:1 COG:no KEGG:Sterm_2685 NR:ns ## KEGG: Sterm_2685 # Name: not_defined # Def: AroM family protein # Organism: S.termitidis # Pathway: not_defined # 20 229 21 226 227 73 26.0 4e-12 MTQRMAVLAAAPLDPTGFTDMQAIWGDKVELSFSSVITEKTAEAMQPYTADKGGRVLMSG LNDNTVTTISIDKLTPPVEKLLADTRDKADVAFFACAGDFRKLPSSVLTVQPNILLRSIV QSLLQPSMRLGVITPGEPQVPHVFASWLPYLEAVGLTKEQLVVDWTPPNLEMATKCAQRL ADQNVDLVAVECLGFKEELRKPIADIVKKPVLLVRTVVASTIKEIASSF >gi|316921996|gb|ADCP01000133.1| GENE 49 50638 - 51909 597 423 aa, chain - ## HITS:1 COG:L81189 KEGG:ns NR:ns ## COG: L81189 COG0044 # Protein_GI_number: 15673051 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Lactococcus lactis # 1 417 19 440 441 280 39.0 4e-75 MNMLLKNGRVIDPSCGRDLQTDILIRDGKIAEIGTCDPAQAECILDCTGMIAAPGLIDAH 
VHLRDPGLSYKASIATDTYAAAKGGVTHVVAMANVIPVPHTVSAHQMIMERMRKEAKVKA TQVASLTQSIQSQNIISKELSDIDELVASGVEIFSDDGNCLMDLTVLYKILMKTKEHNML IMLHEEDEHMKWFEPTAFLHSAESSIVARDLELIRDVGGRRIHFQHISTRRSVELIRQAK ADGLSVSSEVCPHHLLLTKAARDIYGTNAKMAPCLQTQDDIEALIEGLEDGTVDIIVSDH APHTPEDKATELVKAPNGVIGVEVMFPVLLNEFVHKRKFSYSWLLEKLTIKPARLLGLSG GTLNVGEPADVCIFSDREWVIKAENFASISRNTPYDGWEVKGCVEMTLVDGKVVYSSGII ENR >gi|316921996|gb|ADCP01000133.1| GENE 50 52085 - 53563 492 492 aa, chain - ## HITS:1 COG:no KEGG:DSY0401 NR:ns ## KEGG: DSY0401 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 14 481 19 486 496 223 30.0 2e-56 MSETASLNKTPMSYWIHSAVYLFLTFGMGYIPTESISSLGMSIVGIFIGMLYGWTFIGFI WPSMMSIFALGVCGYFDTPQLAFSNAFSNQLVIFMLLILVFTAYFERSGINKKIAAWFLS RPSVQGRPWFFTFMVLLATFVVGFLVDGNAVNFLIWNLLYGIFKEVDYREGENYPAYLCA GVSFAALLSYGCKPWGNANIMAIGALQSASGGIYSLNYVKFMLLAIPLCFLFLMGYFIVM KYIFKPDVKKLMSLSNDYLQQLRNDLTLNIKEKIAAGALLVFISILLLINILPSSSSGIL GILSKGNFLSALIVILIALNFSRLEGGPILDFGGCAQKGIHWNVFWLNAAAMPVSAALSS DAAGITKWFGLMMNNYFIGVEAVVFIIAFTLLLLLATQVAFNMTLVVVAVPIVWQVCQTL GLNPLGVTVLILMAISCAIATPAASSNAAIMFSNVDWIGVPRSFKAGISAMLASMVILLG VGFPLVALIYGF >gi|316921996|gb|ADCP01000133.1| GENE 51 54414 - 54731 91 105 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWKYPVSQIDRYLSEGLAARRGFSGLVRETFSSAAGERDTGACLLEGSLGSALSRAAESV WEEHISQNSRLRYCGRLLCKNLLRLISDWAEYWSCRARGFCPLRW >gi|316921996|gb|ADCP01000133.1| GENE 52 54670 - 55953 1319 427 aa, chain - ## HITS:1 COG:no KEGG:Taci_0805 NR:ns ## KEGG: Taci_0805 # Name: not_defined # Def: extracellular ligand-binding receptor # Organism: T.acidaminovorans # Pathway: ABC transporters [PATH:tai02010] # 27 334 24 323 384 66 25.0 3e-09 MNRSQKFLGITLMASLLCGFTPMQSEAKEYKIGGMYTYDYHDGRHADWGEYELHAAKMAI EDINASGMLGGNTLNMPPELVIDYHCWPEGAAEKARSLLKQNIVALTGVDCSGPAVLIAK EAEKFKTPVVSVGANAASLSSPTEFPYYYRNVTPSTKYEGYLLEVAKHYDIKEIALFHTT DAWGSGAAAVILNEAAKLGIEVKASYGYSRNTPVEEVQKRMAAVKELGIKSIFITMPTPD TVIAFRTLTNLGMNQPGYSLFAAEMTSADERPDAINGAFGYLAPMTKLVPSKALDEFAAR 
FSKKIGKKVDMNSKAFFYGVLSYDHMMALGLAMKNVQDQKLEMNGDTLMASLRALSFDGL SGRQNLAPGTNDREIMAVEIMNCQGYKEDGKTVRFVPVGFVDSVTGKLTIEEDKILWPGK TSTPPNR >gi|316921996|gb|ADCP01000133.1| GENE 53 56127 - 57542 1042 471 aa, chain - ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 3 469 6 456 461 278 37.0 1e-74 MARILVIDDDETFSYVLKRACSRQGHEVMLSASACEAKEASAGMFDIVFLDISLPDGNGL SLLPLLLGSEGHPEVIVITGHDNRDWLEQALMAGAWDFIRKDDSLDTVFAALEQALLYHQ NRCGAQLSSSGQRTTPLKRNRLVGSGAAIGRCLESVIQCADSDVCVLLSGETGTGKEVFA RTIHENSGRQNGPFIVVDCAALPENLAESLLFGHVRGAFTGAITRENGLIPEADGGTLFL DEVGELPLSLQKVFLRVLQEKRVRPVGSTQETPCDFRLIAATNRNLEHMVEEESFRRDLF YRLQTVHIPLPPLRERQEDILPLAHHFCAHFSELYALPAKQLATGTATALQLYGWPGNVR ELCGAMERSVLTAGRASVIFPQHLPTGIRIHAATQRLAALSVPKEEKGSSVSESGLPSYA EFREQVWQKEEGRYLARILDQTGGSISEACEVTGLSRSRLYALLKRHGLTR >gi|316921996|gb|ADCP01000133.1| GENE 54 57527 - 59989 1603 820 aa, chain - ## HITS:1 COG:SMa0939 KEGG:ns NR:ns ## COG: SMa0939 COG0642 # Protein_GI_number: 16262960 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Sinorhizobium meliloti # 11 642 25 632 633 176 25.0 2e-43 MKPLAALAKGICLFCAIFAAQAMAFADVAERTAPLPPAEVLIINSFHSGHFWEYNIIKRI SEELENTETGFRIRLHCEYLDYERHPPGSLDNELVSLFAPKYAGVSLKAIIVTDNDALDF MLAHGNRLFPGVPVVFCGIADPPAELAERRAHYTGILENFSIHRILESIPLVHPEARHLA IICGDTTSARTALRQAAPELAAMKPGIAVRTLAALPAQAMQKALEELPRDTVLLNFGYYR TADGQSYSMKESLQRLRSWTDLPMYSPWSGQLGKGVLAGQCEFNEFHAVHAARMVLSVLG GTPPDAIPLLHEPSPYLIYDHAVLTRYGISESDLPPDSVIINRPLSFYEQHRAALLPAMT IMLVLFGIILLLMYLLRVKQRSESLLRQEKAVLAQANALERRSQLERRMEAIGRMAGGIT HDVNNILGGIAACAQLALPEIPRENPAYEDVLHILDATVRGKDLMKQIRMTDSTKTLDAR SETPLVSKLIRECAETLQPQLPPHISLNVRNACPEARIRAVSVEVHQVLLNLCLNAIQAM PDGGELTVSAGLYAVERHRSGDELAEGSYVRIDISDTGKGIPSGLMEFIFDPFFTTKQEN GGSGLGLAQVHSLVQRNRGAVRVANRPCGGALFSVFLPHDFPSPERDDAGKATGTPLSET 
VSPVDSIPQPVENGFPTGHPRTAPPGGTAGEILVVDDDPGILYLLEKLLNQMRLSVVTAA DAETALSLFEKGMHPALVITDQKLPGIQGIAFARLVRKRFPRMPIILCSSCSAMHDAQSC HMGADEDIQRFVKPFDPQRLKEAVDRALQPKTGRWIWRAF >gi|316921996|gb|ADCP01000133.1| GENE 55 60082 - 60225 111 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSAPSANDSKRPSASAALDLAPYTDKDQEIARDAETPVKGFYTVRGF >gi|316921996|gb|ADCP01000133.1| GENE 56 60212 - 61570 1637 452 aa, chain - ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 10 440 8 437 457 258 35.0 2e-68 MAHADPSSELATKPLGRLLLRFAIPSIFAQIVNLLYNIVDRIFVGRIPEIGPLALAGLGV AFPIILIISSFSFLAGMGGAPLASIAMGRGDNAKAERILGSACVFLLAMSVVLTVLCFWA MRPMLLMFGASEQTIGYAVDYFSLYLLGTPAVQLSLGLNYYISCQGFAKTSMMTVLIGAA LNIILDPIFIYGLDMGCKGAALATICSQAVSAVWIFAFFFGKRTILHFRRQYLRLDLKEL VPVILLGLSPFFVQVTDCIVPLVMNAGMQQYGNDYYVGTMTVMFSIEQLLRLPADGLCQG AVPIMSYNYGAGKPERVKKTFFLILAFSFGFTTIASWLILLFPKTFIMPFTDSQRLIEIT IPAIRIYFGGMMFIGVLYACQRTFMALGRTKLSLLGAMMRKLVILLPLCFILPRLGYGTD GLLYAECIADVLGSAIVFAIFIWNFKSMLKTP >gi|316921996|gb|ADCP01000133.1| GENE 57 61718 - 62353 614 211 aa, chain - ## HITS:1 COG:PA0839 KEGG:ns NR:ns ## COG: PA0839 COG1309 # Protein_GI_number: 15596036 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 20 202 4 187 196 99 29.0 4e-21 MQNHAQDAKGRGRPPKQRGDKLLTREMLVHVGTEVLTEKGYSAVGVDEILQRTGISRCSF YYYFKTKEGFGAELIDRYRLSITQKLERCLSDTSLSPLGSLRAFVGEVRDEMARTNCRSG CLAGKLAQEVNTLPENFREQLSAVFAEWQDVFARGLKRAQETGEISPELDCASTAAFFWY GWEGALQRANLELNIKPVDIFIGNFFSRLTG >gi|316921996|gb|ADCP01000133.1| GENE 58 62691 - 63593 793 300 aa, chain - ## HITS:1 COG:RSc0625 KEGG:ns NR:ns ## COG: RSc0625 COG2207 # Protein_GI_number: 17545344 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Ralstonia solanacearum # 34 296 40 302 321 232 46.0 8e-61 MMETGSGETARTASLLKEKLLQRAPEHGKYPMDIEGLVITRRHEANRIETCFSKPSVSVI 
IQGSKRTMLGSEEYCYGENQCLVAGVDMPSSFYVTDASPERPFLALCLDLDKYLITRLAA EVPPPCATGACSYKGLSVADVDPDILHAFLRLVELLEKPEQIPILGPLIVREIHYRLLIG PQGEFLRRLNTLGTQSNQIAQAITWLRDNYREPLQVDKLAQKVNMATSTFHRHFKEVTTL SPLQFHKRLRLYEAKRLMLTESKDASSASLAVGYESPTQFNREYKRLFGEPPHRDVMRMR >gi|316921996|gb|ADCP01000133.1| GENE 59 63977 - 64735 786 252 aa, chain + ## HITS:1 COG:MA0409 KEGG:ns NR:ns ## COG: MA0409 COG0599 # Protein_GI_number: 20089302 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Methanosarcina acetivorans str.C2A # 1 251 1 250 250 248 47.0 6e-66 MSRAALAEKKQLERFGNARSPLEATDPDFAEMRDRLIWGEVAWHGSLDAKMQELITLVVL TASQTLDGFAPHVGAALQVGATPEEIKEAMYQCAPYIGFPKTEKALRLVNEVFREKRIPL PVASQKTVTEDDRFMQGVKVQKSIFGAAIDAMHKSTPQNQRHLLRDMLSAFCFGDVYTRK GLDLRTREILTFCIISSLGGCESQVKSHVQGNVNVGNTKENLIDALTCCLPYIGFPRTLN ALGCVNAVIPEN >gi|316921996|gb|ADCP01000133.1| GENE 60 64888 - 66021 1363 377 aa, chain + ## HITS:1 COG:MA3446 KEGG:ns NR:ns ## COG: MA3446 COG2768 # Protein_GI_number: 20092258 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S center protein # Organism: Methanosarcina acetivorans str.C2A # 10 377 1 360 360 394 53.0 1e-109 MSVPVYLAKLAVKNPSDHRLAKIKRLCDAVGLADKVKPLGTVGEDEVVAIKLHFGEAGND TYLHPTFVRQVVDCVKATGARPFLTDTCTLYKSTRHNAVDHLETAYRHGFTPYVVDAPVI MADGVTSKCYEEVAVNLKHFASVKIAQAFLSANAMVVLSHFKGHAMGGFGGAIKNLAMGC APQAGKIDQHGRNVVIYPKCIGCGQCVPLCPRSALSLEKAEKGRHAVIDKERCIGCYECV TACKQGAIGVDTPNEYSDFAERMAEYAYGAVKGKEGRICYLNFLLNVTPQCDCAGWSEPA IVPDLGILASEDPVALDTACFDLVKEAHSLIPLGEHGAHSCGYDKFSALHPNTRGWHQVE YAESIGMGSRDYELITV >gi|316921996|gb|ADCP01000133.1| GENE 61 66011 - 66286 81 91 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPRAPAKTGEQLHERLERLPFAKHRFRQPDPGRPGVIKQNAVIHILTRLLENSETPRRRG EGRGEPRPTGAGSHAPGRIRIMAARERRLTP >gi|316921996|gb|ADCP01000133.1| GENE 62 66155 - 66808 549 217 aa, chain + ## HITS:1 COG:CAC0776 KEGG:ns NR:ns ## COG: CAC0776 COG1691 # Protein_GI_number: 15894063 # Func_class: R General function prediction 
only # Function: NCAIR mutase (PurE)-related proteins # Organism: Clostridium acetobutylicum # 8 213 41 245 248 140 41.0 1e-33 MDHGILFDHARTARIGLPEAVFCEGKPFEALVELLSRFGRGSGHPVLFTRLAPDVFARVP AEIRAGYDYHPLSRTAFGDALPPKGKGRVAVISAGTADGFVAWEAARTLAYLGIGHKLFE DCGVAGLWRLAERLEEINTFDAVIVVAGLDAALVSVMGGLTPKPIFGVPTSVGYGAAQGG RAALASMLSSCAPGVGIMNIDNGYGAACAAARVVNGL >gi|316921996|gb|ADCP01000133.1| GENE 63 66805 - 67647 671 280 aa, chain + ## HITS:1 COG:CAC0774 KEGG:ns NR:ns ## COG: CAC0774 COG1641 # Protein_GI_number: 15894061 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 105 261 100 256 420 119 40.0 5e-27 MSEAHAHGDHAHRHEHHHEHGHTHETAHGDGRGKARGGRILTIRSHSGLSGDMFLAGLLR MTELGEAETDRLLAAVLPELAGTVRLTRRQVNRIGGWFADVSLPHQHEHRTLEDILGIIA GSGMDDEPKRLASDTFTLLAKAEAAVHGMKPEAVRFHEVGALDSILDICMVCVLFTRLSP ARFVVSPLPIADGEVVCAHGVIPVPAPAVLELLEGIPVRPFSGEGETVTPTAIALLRSLG ATFGPWPAMLVEKRALVYGSRVFANAPNGTIFACGTELEG >gi|316921996|gb|ADCP01000133.1| GENE 64 67712 - 68551 699 279 aa, chain + ## HITS:1 COG:CAC0775 KEGG:ns NR:ns ## COG: CAC0775 COG1606 # Protein_GI_number: 15894062 # Func_class: R General function prediction only # Function: ATP-utilizing enzymes of the PP-loop superfamily # Organism: Clostridium acetobutylicum # 18 208 6 207 271 92 31.0 9e-19 MDSSMTAASPDETALLPRLESVLGVLAAGNRFALAYSGGLDSRFLAHAAQRFGFEPVLLH IVGPHIPPEETDYARHWAASRELAYEELPADPLDLALVASGDRRRCYACKRNLFSLLKAR TDLPLCDGTNASDAGQYRPGIRAVEELGILSPLASAGLAKADIHRCAALTGMEDPEQKAR PCLLTRLPYGMKPERSLLAGLAAGERAVHGVFASAGLPGPDFRLRLVDAERLELHVLPEP ALAPGLCAELARRIAEAVPQLPPPRVVKVETLSGFFDRA >gi|316921996|gb|ADCP01000133.1| GENE 65 68877 - 69488 506 203 aa, chain + ## HITS:1 COG:MA0407 KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 46 201 8 177 179 162 44.0 3e-40 MSKALEIDSSRRSLLKYTVAGLLAGFGGRLLPHVSSAYAATEGGAKALVVYYSRSGNTRA 
VAEAIHAAVGGDIVELQPVTPYPEAYRATTDQAKQELASGYKPPLKDRIGHIEAYDVVFV GSPNWWGTVAGPVRTFLSEYDLAGKRIAPFITHEGSALGRSVADIKTFCPKAVVLDGLAV RGSRAASAQGEVAAWLRKIGMGK >gi|316921996|gb|ADCP01000133.1| GENE 66 69511 - 70050 591 179 aa, chain + ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 177 1 177 179 241 63.0 7e-64 MSKNILILSGSPRRGGNSDILCDRFMEGARESGHRAEKVFLRDKNIGYCIGCEACHQNNG VCVQKDDMAEILGKMIAADVIVMATPVYFYTMDAQMKTLIDRCVARYTEISNKAFYFIAT AADGEKRSLERTIEGFRGFLYCLEGAQEKGIVYGSGAWKKGDIKESPALNEAYEMGRHA >gi|316921996|gb|ADCP01000133.1| GENE 67 70301 - 71941 1957 546 aa, chain - ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 7 543 18 548 553 346 36.0 7e-95 MQQATHIYRNGTILTMDSRCSIVSCLAVSGETILAVGSEAEVAPFQGPETRIVDLGGRFM MPGFYDCHSHFMRAGMYNKYYLDVNSHPIGDVRTHGDIRRKVREALSGMPAGEWLLCAGY DDTAVAEERHFTLAELDAMAPDHPLFLRHISGHLALCNSKAFEAAGITDETLNPSGGVFR HDGNGRLTGLVEEPAAMEMVLAASPQMTEEKWLGAVERATDDYVAKGVTTAHDGGVTTAM WKNYMTAHKRGMLKNRVQLLPKHGWFDFSLAPTVQCGTPLTKDGLLSMGAVKLFQDGSLQ GYTGYLSNPYHSLPDAISDGSWRGYPIYNPRELVNIVTRYHEEGWQVAIHGNGDAGIEDI LNAFEEAQKAYPRANARHIIIHCQTVREDQLDRIERLGVVPSFFTVHTYYWGDRHRDIFL GKARASRIDPLRSALKRGIPFTSHNDTSVTPMDPLLSVWSAVNRLTGSGKVLGEDQTVSV LDALKSVTIWGAYQFHEERMKGSLEPGKLADMVILGENPLEIAPERIRDIPILATLVGNR LVYGSL >gi|316921996|gb|ADCP01000133.1| GENE 68 71967 - 73460 2325 497 aa, chain - ## HITS:1 COG:BH2009 KEGG:ns NR:ns ## COG: BH2009 COG3333 # Protein_GI_number: 15614572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 467 1 466 504 358 44.0 9e-99 MFDMFMTAFSSACSLEALVANFIGVALGIVFGALPGLTAVMGVALLIPLTFGFPAVIAFS 
SLLGMYCGAIYAGSITAILVGTPGTAAAAATMLEGPQFTARGESLKALEMTTIASFIGGI FSCLVLATVAPQLAHFALDFSAPEYFSLGIFGLTIVATLSEGALLKGCIAALLGMLISMI GMDPLSGNLRMTFDSPDLINGVSLVPALVGLYALSQVLITVEDVFMGRKLSTAEISRKRM PLSEIWTNRAALLRGSIIGTFIGIVPATGSGTASFAAYSETKRHSKHPELFGKGSIEGIA ATESANNAVTGGALIPLLTLGVPGDVVTAIMLGALMIQGMTPGPLLFQEQGTLVYSIFIA LFVSNVFMLLLGYYAVRLFAKVVLIPGGILMPLVTTLCVVGGYALNNSNFDLAVMAGFGL LGYIMTKARFPLAPLLLAMILSGIIETNFRRALSISNQDFSVFFTRPVCASFLAISLFIL FNLLWKEWKKYRAASAA >gi|316921996|gb|ADCP01000133.1| GENE 69 73475 - 73954 485 159 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302864357|gb|EFL87288.1| ## NR: gi|302864357|gb|EFL87288.1| TRAP-T family transporter, small (4 TMs) inner membrane subunit [Desulfovibrio sp. 3_1_syn3] # 1 159 9 165 165 144 56.0 2e-33 MKNKTDFFIGLTLLAFCCVMGYEIHLIPEPAALDFFSAGSFPMGVTIALALLSLVLVARA VLGRDESAACWPERKLAVKIGLMAAWILIYVIGFIFLGEYAYDAEWPEGTGFVISTLVFL SGAQVLTGYRNPFYISLISAGISAFLYAVFAIFFKVPLP >gi|316921996|gb|ADCP01000133.1| GENE 70 74017 - 75015 1497 332 aa, chain - ## HITS:1 COG:AGpA473 KEGG:ns NR:ns ## COG: AGpA473 COG3181 # Protein_GI_number: 16119557 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 312 2 298 318 111 27.0 2e-24 MKKSIIASLLIACSLFIGGNTDARAEYPYDKPINIIVPFGPGGGVDIAARILADYFQQNY NITINVVNKPGGAQAIGINEMLRARPDGYTLAFPGFSALATTPKLTNVGYTLKNIKPVAH IASMECVLSTNKSSGIDTWEKFLQAAEKNPDGTVYGTTGSISTQRLYMTKLTDRFHGDLK IRHTAYTSGHEVSTALLGKHITAGFQVPANILPYANSGDFNVIAISRKERRADLPNTPTF RELYADKMTPADEKWIDLGAWHGLVVSSKVKDDRIAALEPLIEKAMKDPEVIEKFNKIGL DAGYLPAKEFGQLIQSSSDLADEVLAGRKSLD >gi|316921996|gb|ADCP01000133.1| GENE 71 75247 - 77211 2098 654 aa, chain - ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 54 641 48 626 627 412 38.0 1e-114 
MLPQYEFQMTLIAPYKGLDARIFRQVAKDLRCRIKFMDLAFDEAIEAAKRLSPDTCDVVL SRGVTVDVVKQNSSIPVVPIDFSAWDLLQALQPYAGHVRNVAFFRYSTPLPGLSSVEKAL GMRIKEHLYGSKNEMHLRLIQLDPADVELFVARGTLVCQWATAAGFPTLEIIDGEISAKR TLLEAVNVARARRSERQRTARFGAILDAIDEGIVVYDAQGKVNLITPSAESLLNCAKKEA LGEHIRTVMPGVFSPGTLAGDKAEHGRVHDIRGTTLVINRVPILFQGQNVGTVCSISDAR RIYKAEAKLRNKLKSKGFTTRYSFGDIRTRSPHVRHLKELGVLYASTDANLLICGESGTG KELFAQSIHAASLRKDKPFVAVNCAAIPEGLLESELFGYEEGAFTGARRQGKAGMFELAH TGTLFLDEIGDLPLTLQGRLLRVLQERELVRVGGTQVIPLDVRVLCATHQDLRQLVAEER FRADLFYRLNVLSLRLPPLRERLEDIVDLAVSHLREHLDEPPAESLLVSQLERPLLRHPW PGNIRELLSLMERMAIVANHMAEVTDWEQLLLSLWEDPPLEESPITPSLPQEPEDLPPLT LRSHMAREEARFIRQAVARCGGDMGKAAHLLGISRMTLWRKLQGLAETNAEHFE >gi|316921996|gb|ADCP01000133.1| GENE 72 77342 - 78376 1159 344 aa, chain + ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 43 248 47 244 332 80 32.0 3e-15 MRFKTLFAVCALFGSLFQSSPSLSETLRTAWLGEHEAFLTWYAKEKGWDKEAGLDIIMLR FDSGKSLIENVRAYDWAIAGCGAVPAMQATMSDQVEVIAIANDESSANMILTRPDSPILA EKGFNPDFPNVFGTPESVKGKLILCPKKTSARYLLDKWLAALGLEEKDVKIEELEPTPAL GAFKNGYGDILAVWSPFTREAETLGFKVAAHSQDCGATQPVLLVADRAFTEKNPDAVRAF LKVYLRVVDEIKAQGPETLAPAYVRFAEAWMGKKFSEADAIAELREHPVFSLEEQLALFG EDGESPLKAWLAEIAAFSEKMNPDTAHRHATPEAVSDRFLKALK >gi|316921996|gb|ADCP01000133.1| GENE 73 78761 - 79768 1404 335 aa, chain - ## HITS:1 COG:SMc00271 KEGG:ns NR:ns ## COG: SMc00271 COG1638 # Protein_GI_number: 15965452 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Sinorhizobium meliloti # 9 295 7 289 325 138 29.0 2e-32 MKCPKILSLLLMILLLSTGYASAAPSQTLKIAAGDPEDSEMGVVGNAFKEYIEEKTKGDV EIQCFYSGSLGDESECLRNVQKGTLPMAMAGIANLVPFEKKLGLLTLPYLFSNIDEVVTG TNSAPAELLNSYATKSGFRILTWTYTDFRYISNAKRPITKMADMQGLKFRVPQSAVLIAA 
YKAFGGSPTPISWAETFTALQQGVVDGQCYGYIGFKANKFNEANQKYLTEVHYTYQLQPL VISERVFKKMTPEMQKLLIDAGKYAQEKVLKYQIEQADAAKKYLIDNGLQVSQLEDEDVW KKAAMEKVWPEMADFVGGKETINAYLKACGKPLWK >gi|316921996|gb|ADCP01000133.1| GENE 74 80496 - 80915 567 139 aa, chain + ## HITS:1 COG:MA0735 KEGG:ns NR:ns ## COG: MA0735 COG2050 # Protein_GI_number: 20089620 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Methanosarcina acetivorans str.C2A # 11 138 18 145 146 94 38.0 8e-20 MPSYEEVRYHFNHHDPFARHMGIELLEVGPEHGVAIMPLDERHRNGMGHAHGGAIFALVD MTFATVSNAAGLYCVNAQTNISYLEPGRIGPLRGEARKIRSGRNLGTYDVRITDSDGTLV AIATVTGFMTKYPIQQKDA >gi|316921996|gb|ADCP01000133.1| GENE 75 81046 - 81930 1296 294 aa, chain - ## HITS:1 COG:PA2294 KEGG:ns NR:ns ## COG: PA2294 COG1116 # Protein_GI_number: 15597490 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component # Organism: Pseudomonas aeruginosa # 1 260 1 260 284 211 39.0 2e-54 MSDIGMGYGEMQVTNLCKGYGIGPLHKEVIKDCSFTLEKSKLTVLIGPSGCGKSALINML AGYETPDAGSILIDGEKVSGPGPDRLVVFQETALFPWMTVFENVSYGPRVRGELSGKDLE NTTMKLLEMVGLEDFRSKYPSQLSGGMQRRAELVRAMINQPKVMMMDEPFRGLDAMTRAL MQEYYVRLFEEHRGTNLFVTSELEEGIFLADRLVVLTNQPCQVKKVMDVNLPRPRTFDMM AMPEYVDIKREVLELLHDEAMKSFATGAVNAADFLEAYSQLGQDQAQPRMEGHK >gi|316921996|gb|ADCP01000133.1| GENE 76 81943 - 82719 944 258 aa, chain - ## HITS:1 COG:PA2295 KEGG:ns NR:ns ## COG: PA2295 COG0600 # Protein_GI_number: 15597491 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Pseudomonas aeruginosa # 39 254 42 257 265 148 39.0 1e-35 MSSIKKSLCSATARRGIISIIVFTVVWEICTRLHVPIIGNVPAPSSVIGSMGKQLTDAFY WISWRDSMMRIVGGFVLAQAIGIPLGLLLGASKAAHNLIYPVFEIMRPVPPLAWVPVAVI FWPTPEMSMVFVTFLGAFFIVVINIVDGVRSIDARYLRAARSLGSSRSDMFWHILLPGSL 
PSIVVGMTVGMGVTWAVVVAAEMIASRTGLGYLTWSAYVAGEFPVIIIGMMSIGIAGYVC SAIIRFTGTRMTPWLRTF >gi|316921996|gb|ADCP01000133.1| GENE 77 82734 - 83570 1204 278 aa, chain - ## HITS:1 COG:MJ0412 KEGG:ns NR:ns ## COG: MJ0412 COG1116 # Protein_GI_number: 15668588 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component # Organism: Methanococcus jannaschii # 12 272 17 267 267 216 40.0 3e-56 MSGNETPTRGHIQINNVSKIYDPDGAKVLAVDRCTMDIAPGEFCVVVGPSGCGKTTLLNA IAGFHSISSGEILLDGEVLCSADKQAQPGSDRIVVFQNGALFPWFTVLENVTFGPIVQKA MSKEEAEDKAMEMLGQMGLMGIENNYPSEISSGMRRRVEIARAMMNNPRVLLLDEPFRAM DALTKTVVHQFLLQVYDRSKTTVFFITHDLEEAIFLGSKVYIMTTRPATIKKILDVDIPR PRDFRVLSSPTYTRLKEECISAVHEEALKAFKAGEREM >gi|316921996|gb|ADCP01000133.1| GENE 78 83567 - 84436 1174 289 aa, chain - ## HITS:1 COG:PA2295 KEGG:ns NR:ns ## COG: PA2295 COG0600 # Protein_GI_number: 15597491 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Pseudomonas aeruginosa # 90 282 65 257 265 152 46.0 9e-37 MSQAIMQRQTTKGPSAWEVTLANTKLWLKSPTPYLNALGLVAFVLFWYLTTEYFQLPYFE KLPGPVASFQEWVSKDPIYGISIYTSDYYAHIGISVWRVFQAFMLATLLGVPTGLFMGWN KTFKDYSFPLLETLRPIPMLAWVPLAILMWPGREASIVFLTFLGSYFATVLNTLLGVESI DESYFRAARSLGARPRDVFFKVILPGAMPFIFTGLQISMGYAWFSLVAGEMLAGEYGLGY LIWNSFMLVQYPVIIIAMITLGVIGSLSSLLIRMIGNRLMQWKVREEVR >gi|316921996|gb|ADCP01000133.1| GENE 79 84586 - 85698 1761 370 aa, chain - ## HITS:1 COG:AGpT119 KEGG:ns NR:ns ## COG: AGpT119 COG0715 # Protein_GI_number: 16119873 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 21 262 24 258 320 61 25.0 3e-09 MKIKSLLGTLALSLGLLASPAAAEKFVVGFQPYDTISYQAIVNAELELWKKYTPAGTEIE FQPALQGTVVANNMLANKAQVGYMSVMPATILCSKPKQAEIKMVATLGMSKGTRCSLVVV RKDAPEFKSNEEVARWLDGKIIAAPKGSASDQYLRRFFEKYNVKPGEYLNQSIEVIATNF 
RIGKIDAASLWEPTLSRIATDVGEGTARIVADGSACDNPDLGILNMRADFVKNHRDVAKG YLRAELEAQRYMLDPANWENVINMVSKYATGIPKNVLWYSIYGLVPSDSSDPVREWKNFY FGDRENANIVEVAPFLFKSKIISMEKLPDGTVDDTLAREVFKEAGYAPASPDAALGVIKG AKAADCPFKN >gi|316921996|gb|ADCP01000133.1| GENE 80 85929 - 86324 248 131 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRYASERLSLFRTVFVQVAALLLFAAGGMLLAVAERELNVRPPHHAEMAPAAELLDNRLP AVLKGQPDELPAPAEKSFQKKTRGSQPPFALSGLSPMLQPERPVRLSEAVSALLPSWRGL VAFPLPPPASA >gi|316921996|gb|ADCP01000133.1| GENE 81 87115 - 89007 1453 630 aa, chain - ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 215 475 15 275 280 187 40.0 6e-47 MIALRWHVLRLPLLLSLLWCLLLFALFQWTVQREDENATELAQIQTRTLFSSIVNTRGWN ADRGGVWATEQPNCQPNPWLPEDQRTLLTADGRRLVRFNPAYMTRQIAERFQSKTMNFHI CGLSPKRPQNRPDQWETSALLDFDAGKTERFQLVPGRRGAYYRYMAPLRAEESCLQCHQR NKIGDVLGGISVSISAFPILASVEQRNSMVGLTFGLIGFIGVLGIGGATFQINRKKEQAE EANRAKSVFLSRMSHDIRTPMSAIIGLTAIAKKNAGDPERLNECIDKIALASQFQLSLVN DILDVSKIESGKMQLASEPFDLAGIIDDVTLFTSSSATAKGIVFETSVDPRIGRMYVGDP MRLKQILMNLLSNALKFVDENGKVTLRVESLRRTGRQEAVRLIVSDNGIGMSNAFQKRMF LPFEQDAAPRRKRSGSGLGLAIIGNLTALMGGSIAVDSELGKGTRFTVDIPLERTETAEK HDTAGENAEERCLFRGETLLLAEDDGLNAEVLENLLGYINLRVVHAENGKVAVDLFERSA PGEYAAILMDIQMPIMDGLDAARTIRALPREDAASIPIFALSASVFAEDVERSLCSGMNG HLGKPVDIAQICSMLQTWLYPQQTDTQEQA >gi|316921996|gb|ADCP01000133.1| GENE 82 89434 - 90402 920 322 aa, chain + ## HITS:1 COG:MA2629 KEGG:ns NR:ns ## COG: MA2629 COG0657 # Protein_GI_number: 20091452 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Methanosarcina acetivorans str.C2A # 11 321 8 318 321 320 52.0 3e-87 MPAASSRSPHLTKATKAFVDAITAENNPPLYTLPPQAARRILADAQAKPIKKLTMDMHDE DLPVGPTGKVPTRFIRPEGAGGKLPIVFYFHGGGWILGDKDTHDRLVRELAKGAKAAVVY PNYTPSPEAQYPVPLEQAYAAMLYIIEHADAYGLDPSRIAVAGDSVGGNMATVMTLLAKE RKGPEIAFQLLLYPVTDANFDTESYRTFAEGPWLTREAMKWFWDAYAPDARRRGEITASP 
LRATPEQLAGLPPAMIITAENDVLRDEGEVYARKLIEAGVHVACVRYNNTHHDFMMLNAL AETTPTRAAVAQAAAALKKMLH >gi|316921996|gb|ADCP01000133.1| GENE 83 90606 - 90815 168 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVEQDELLEMLPCSHCKNEKPHLVSCRPEGRTADLWRVECPCEKAPTQWSVSKTAAVRLW NRYMTNMKE >gi|316921996|gb|ADCP01000133.1| GENE 84 91062 - 92873 2872 603 aa, chain - ## HITS:1 COG:CAC1669 KEGG:ns NR:ns ## COG: CAC1669 COG1966 # Protein_GI_number: 15894946 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Clostridium acetobutylicum # 1 579 1 577 592 551 53.0 1e-156 MNALTVVFAALCIFAIGYRFYGLFIANKVLNLNDARVTPAVKYSDGHDYVDTNKYVLFGH HFAAIAAAGPLLGPVLAAQFGYMPGLLWILIGCVLAGGVHDMVVLFASVRHRGQSLAYIA SQEIDKTTGSVAAWAVLAILLLTLAGLSIAVVNAMHNSLWSTYTVAATIPIAIIMGLYMQ IWRKGDVLGASIIGVVLLALCILTGPYVAAHPETFGWLDIDKKPMSILIPVYGFAASVLP VWMLLLPRDYLSTFLKIGTIGALALGIVFVMPDFNMPAFTEFTKGGGPIVGGPVIPFIFI TIACGALSGFHATIGTGTTPKMIGKERDVLFVGYGAMLVEGFVAIMALIAACVLVPADYF AINAPADKFAALGMSVVDLPTLEKEVAESLMHRPGGSVSLAVGMAHIFGQIPWMAHLMSY WYHFAIMFEAVFILTAVDAGTRVGRFFLQEMIGKVIPKFGDKNWWPGIVVTSFLFTGAWG YLVYTGDISSIWPLFGISNQLLASVTLLIGTTMLLRMNKTKYAWITAAPGIFMTFITFWA GIWLIMYQYIPTQKYLLASLSVLVMVMMGFVIIGTLRRWSVLLKETKIVRDPYGDEVKEI VQE >gi|316921996|gb|ADCP01000133.1| GENE 85 93328 - 93759 399 143 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYASQRNSLSPTWHMLLMGLMSVILFFGGLLTSNDRPVPIVDNLPNEGLLRPTSLLPWES VELSETRLLFRHEEMQEVKSPQRLNQPTQHNLLARHHGFQPFASALPTPTFAERILPAYE KAFLPYAGWDGPVSFALPPPLLA >gi|316921996|gb|ADCP01000133.1| GENE 86 94701 - 97736 2780 1011 aa, chain + ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 592 1002 345 753 776 194 32.0 6e-49 MEQMCTAVFAFLESAKPFFQYGKPLQELLPFFVRALGGILPLDALAIVRMEGKKGVCETA CWSGTPHLPEALMLRAFNGKEFPKAFDTPFSEEALAALFPEETAGKLNWLALPIACDADA 
LGLLFVGRESGETEWPDRERDFLRLVCAMLGFFFAGKAHCEQQSFHIGVLNAAMDQVKVG LYVTDPHTDAILYMNKFMKDVFKLEHPEGKICWQVLQSGMNGRCPFCPVDRLMADANQNQ VFRWEERNTLTWRIYDNSDSLMRWTDGSLVHLQQSVDITDSLRLHKEANYDELTGLLNRR AGKAALADALVRLDREESSLIVGMFDLDRLKEVNDVYGHGEGDRALRTIAQEMQRSLHAP DMCFRLSGDEFVVLFHNTNRHAVDGLVAGVLERLKARREQLGLPYSLEFSFGCFKVMPGC GMTVTEVLSKADESMYEQKKRAHIRKAERRLQEKQGGGDIPPEALEYDSLRLYNALVKST DSYIFVSNMKTGIFRYSPSMVEEFGLPQSIVENAAAVWGSKVHPDDKAAFMEANQIIADG RSDFHCVEYRAKNRKGEWIWVRCRGYLERDGDGEPSLFAGFITNLGQKNKIDHVTGLFNK LKFEEDIESMLEKRPEHPLHLLVLGLDGFKHINELYGKSFGDEILRVIGQRIQGMLPLSA SVYRLDGDEFGITVSGERFEMMELYRSLSESFRSQQEYDGKKYFCTISAGSASYPEDASG YTELSEYASHSLKYAKKLGRNRIVFFSRAILEQEMRSLELVELLRESIERQFEGFELVYQ PQIAVDTRHVVGAEALARWTCTKYGPVSPGEFIPLLEQSGLIIPFGRWVFREAAAQCKAW TKSRPDFKISINLSYLQVVSNSMVPFINNTLERLDLSPANLTVEFTESCMIRENARIRAV FESIRNLGIRIAMDDFGTGYSSLGMLKNSPADVVKIDRTFVRDILTSQFDATFIHFVVAL CHDVNIKVCLEGVEREEEFDRVRSMGLDFIQGFLFGKPVPPDVFERDFLMP >gi|316921996|gb|ADCP01000133.1| GENE 87 97811 - 98053 388 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLGLMHDHPYVFTPTIAKIMLPVWVCLLVSTVRAIRRDMKKEKAGDLSKEETDERMGINA LVLFFSFAFTVGCLAALTSF >gi|316921996|gb|ADCP01000133.1| GENE 88 98104 - 98331 56 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMTSGPGVIAISSDATKNAHVVSKESIRPPQALLYRKAPGDSIRGGRGKRDSTGQEDMPE GRLFRKRLADGTNAA >gi|316921996|gb|ADCP01000133.1| GENE 89 98252 - 98878 729 208 aa, chain + ## HITS:1 COG:BH0429 KEGG:ns NR:ns ## COG: BH0429 COG1280 # Protein_GI_number: 15612992 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Bacillus halodurans # 2 206 1 205 207 125 36.0 4e-29 MLSFETTCAFFVASLLMAMTPGPDVIIVLTQSSLYGMRAGVLTTLGLMTGLLGHTLAVAL GVAVLFQTSEAAFTALKFLGAAYLLYLAWQSFRSGVFRAFLTQSLFPGYGTLYRRGFLSN ITNPKVTLFCLAVLPQFVEPERGHPTLQILSLGGLYELACLIVFTAIAALGGRMATWFNR SDRAQMLMNRIAGCIFLGLALMLAFASR >gi|316921996|gb|ADCP01000133.1| GENE 90 98939 - 99892 1257 317 aa, chain - ## HITS:1 COG:FN0332 KEGG:ns NR:ns ## COG: FN0332 COG0598 # Protein_GI_number: 19703675 # Func_class: P 
Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Fusobacterium nucleatum # 112 316 153 351 351 88 27.0 2e-17 MKQHAITLAHGTYTELKDTEGAKGTEDKLLWLDIQGTSEELTHFLKERQIPPFLRRRVLS SQRFPSVFSGNDYVLLSVPARKTWVQKRSVSITFILFAQEVITLCRDEGYNFDGERKQLA EGDTASIRSASDLLIALLDSLVETNICNFIEARSQTENLSLHVDNASGKISEQTILDTRR QINHLVNQFEDFFYGLADLHALTPHAMLSDAVREKLRDIRDAQNHLAKNALRLVTRLSEL LQHCQFVLQQQTDQRLRQLTVLSAIFMPLTLITGVYGMNFAYMPETGWRYGYYATLAGMV WLAIFLWITLKRKGWFR >gi|316921996|gb|ADCP01000133.1| GENE 91 100466 - 100594 114 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEKALMDQVGAVFIDYDPTGSFRIYTERPLSIFTTGGGSCSL >gi|316921996|gb|ADCP01000133.1| GENE 92 100829 - 102421 1726 530 aa, chain + ## HITS:1 COG:CC1488 KEGG:ns NR:ns ## COG: CC1488 COG2721 # Protein_GI_number: 16125735 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Caulobacter vibrioides # 2 530 10 502 502 284 35.0 3e-76 MQKILRIDPKDNLIVTLRDLAQGDIIENEGQRIQLVTDVPAKHKFTREPVPVGGIVTLYG VPVGKAVAPLQSGERITVDNVVHYAAEVDLSDAVPYMWKAPDVSRWANRTFDGVIRPDGR VGTANYWLIIPLVFCENRNALKLRDALERTLGYAGDHLADFARSLVGGTGCAPAPRPFPH IDGIRAITHNGGCGGTAQDAWTLCRMLAAYADHPNVAGLTVFSLGCEKAQIGLFQEALRE RNLGFDKPCILLRQQDWSSEVKMMEEAVRKTLAHFKNADLVERRPVPLSKLKLGVKCGGS DGFSGISANPAIGEVSDRVVTLGGGSALAEFPELCGVEANMISRCIRKEDKARFLELMRR YEAAANACGASISDNPSPGNIHDGLITDAIKSAGAAKKGGKAPISAVLDYAEPMPDAGLS LVCTPGNDVEAVTGLVAAGVNVVLFSTGLGTPTGNPIVPVIKVATNTAVAERLSDLIDFD CGPIIDGEPIAAAADRLLEKVLDTASGRYSTKADRLGQHDFLFWKREVSL >gi|316921996|gb|ADCP01000133.1| GENE 93 102603 - 103454 849 283 aa, chain + ## HITS:1 COG:AGc5105 KEGG:ns NR:ns ## COG: AGc5105 COG3618 # Protein_GI_number: 15890062 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 280 17 284 287 170 36.0 3e-42 MSTVWRVDAHQHFWRFDPAAYGWIGDDMAVLKRDFLPAELRFELDIRHIGGSVAVQARAS EAETDFLLGLASSNPWILGVVGWIDLLAGDLESRLEARASSAVLKGYRHQVQDEPSPSAF 
LEDGRFNRGVETLQRGGKVYEVLIHAKDLPAAIAFCGRHDLGPLVLDHLGKPDVRHESAA EWARRIAPLAAQEHVSCKLSGLITEAHWHGWDERDLLPYLDAALECFGPSRLLFGSDWPV CLLSGTYDQVCGLAGKATASLSEAERDAVWGGNACRVYGIGNL >gi|316921996|gb|ADCP01000133.1| GENE 94 103470 - 104252 1070 260 aa, chain + ## HITS:1 COG:Cj0485 KEGG:ns NR:ns ## COG: Cj0485 COG1028 # Protein_GI_number: 15791849 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Campylobacter jejuni # 1 257 1 257 262 265 50.0 6e-71 MDLRIKDKVYLVTGGGSGIGGGISVALAREGAIPVILGRSPLQKSFRAEVRALQPLMHFI QIELTDEQACADAVQEAVSMYGRIDGLINCAGGNDNVSLETGVEAFRVSLERNLVHYYTM AHYCIGELKKSKGSILNVSSKTALTGQGGTSGYTAAKGGILSLTREWAASYLKDGIRVNA VVPAEVWTPLYERWVNTFPRPAEKLETIVRNIPFGHRMTTVEELADMAVFLLSPLSSHTT GQWVVVDGGYTHLDRTLTAR >gi|316921996|gb|ADCP01000133.1| GENE 95 104265 - 105473 1738 402 aa, chain + ## HITS:1 COG:Cj0486 KEGG:ns NR:ns ## COG: Cj0486 COG0738 # Protein_GI_number: 15791850 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Campylobacter jejuni # 6 393 4 405 418 293 46.0 4e-79 MSQITSPQYRKAFILVTTLFFMWGLSYGLIDVLNKHFQVTLNVSKAQSGLIQAAYFGAYF VVAIPAGLFMEWKGYKAGILCGLSLYAIGALLFVPAAGVASFTFFLFALFVIALGLGCLE TAANSYSSALGSTETAETRLNLSQSFNGLGQFTGPLLGGLLFFGGDEAGGGGGLDAVRMT YVGIACVVILLVILFAKAELPDLRSSEEADEAAMMKHSLLAHKEFVAGVAAQFLYVAAQV GIGAFFINLYIECWPGGTAQQGAFFLSIAMLCLLIGRFVSTGIMTKIAPAKLLAAYGAVC VILTGVVFTAVEYVSVIALIAVFFFISIMFPTIFAMGTKNLGTRKKLGGSYMIMAIVGGA IMPYFMGLIADYTHTAAAFLLPLCCFVCVVLYGAAYGRLVKE >gi|316921996|gb|ADCP01000133.1| GENE 96 105655 - 106548 1077 297 aa, chain + ## HITS:1 COG:yjhH KEGG:ns NR:ns ## COG: yjhH COG0329 # Protein_GI_number: 16132119 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Escherichia coli K12 # 3 
265 17 281 319 148 29.0 1e-35 MPENKKKYTGAWCPSITPFDADGKLDLAALEKHFSRLSEAGTDGVLLMGSIGEFTALSMD ERVSLIHAARTMTHLPLIANVSGTCMEDMTRLADAAYSEGYEAVMALPHYYFAQTPRQLE AYFDTLGGRFEGEWLIYNFPARTGCDVDAALIAKLAARFPRFMGVKDTVDCASHTRAIVR AAAEVREDFSVLCGFDEYFIPNLMNGGDGVLSGLNNVVPELFTQAKTAFRSGQFEDLCAA HREIGRLSSIYTIGDDFVTTIKAAVAAKFGGLEPVSRGYGGALDDTQLGAVKRLFGI >gi|316921996|gb|ADCP01000133.1| GENE 97 106658 - 107443 860 261 aa, chain + ## HITS:1 COG:APE2502 KEGG:ns NR:ns ## COG: APE2502 COG2159 # Protein_GI_number: 14602106 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Aeropyrum pernix # 88 227 134 280 327 62 33.0 6e-10 MFTDIHTHAFHPKIARHAVEHLNEAYQLDCVGEGTIADLLLREKRSGIGRCVVLCAATTP GQVVPANDYAVTLQREHPEVTAFGTLHPGYEAWESQLERLKAAGIRGLKLHPDFQHFWLD DPRLLPMFEAAQDDFIFLFHIGDNVPPEKNPSCPYKLAALLDRFPRLRCIAGHLGGYHQW EHSLKALVGRDVWLDTSSCTPFIQPDLLKAILRKHPQDRILFGSDYPLYDPGDAMMHLQE KAELGDAALDRYCSNADALFA >gi|316921996|gb|ADCP01000133.1| GENE 98 107498 - 109135 2084 545 aa, chain - ## HITS:1 COG:mlr0093_1 KEGG:ns NR:ns ## COG: mlr0093_1 COG0303 # Protein_GI_number: 13470396 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Mesorhizobium loti # 205 531 5 322 330 99 26.0 2e-20 MSIGSRTHEEFMEEARAFHGYPAPGLIIGGYMVELAKRHMPDGVLYDAVSETAHCLPDAV QLLTPCTFGNGWLRVLPFGIYAVTLYDKATGEGVRVELDNDKLEPYDAIRSWFLKERPKK EQDTERLQAQIKEAGESILSFRKVRIRQDMLGHRSFGAITRCPLCGSHYPASYGGICRSC QGQSPYEDGPGFALSQQPRMPAPVPIPVEEAVGKHALHDMTQIIPGKEKGAAFVAGQELS AGDICRLQQMGKNRVYVQENTPHPEGWVHEDDAARGFARLMPGDGVEVEAAPREGKVNFR ATRDGMLLVDTERLERFNLVPDVMCCTRHNYSVLTAGTRLAGSRAIPLFLSRPGFLKALS VLEDGPLFKVVPMRKAKIGILVTGTEVFQGLIEDRFAPIITQKAQQHHCEVVKTLFAPDD ADLIVRGVRDLLDAGADFIVTTAGMSVDPDDLTRKGLTEAGLTDTLYGVPALPGTMTLIG RIGGAQIIGVPACALFFKTTVFDIILPRMLAGVPITRLDLAKIGNGGLCMECKVCTFPKC PFGKV >gi|316921996|gb|ADCP01000133.1| GENE 99 109159 - 109836 947 225 aa, chain - ## HITS:1 COG:no KEGG:DvMF_0370 NR:ns ## KEGG: DvMF_0370 # Name: not_defined # Def: 
outer membrane lipoprotein carrier protein LolA # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 28 224 36 232 233 243 56.0 4e-63 MRFPILFACAAALLLISAGASHAAAPAITGEMQKKYESMQSFTAEFTQRLVHQESGSEET RQGTLAFQKPLRVRWETKAPHAELLVITDKDVWDYLPDEELAYQYSPEVVHDSRSIIQVI TGQSRLDKDFTVEAEPDDNGLAVLRLYPKDPSPQLVEALLWVDKGTKLIKKAQILDFYGN TNEVALTSLTPDASIKAGTFQFTPPQGITVENLMNQDAPERPLLQ >gi|316921996|gb|ADCP01000133.1| GENE 100 110077 - 112932 2747 951 aa, chain - ## HITS:1 COG:AGc5003 KEGG:ns NR:ns ## COG: AGc5003 COG1674 # Protein_GI_number: 15890008 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 441 944 362 888 902 458 47.0 1e-128 MQTKRRAPEEVAITDGRKIARELFGLFLIFWGLLVLLSLVSFDQNDPSINHAVSNPAIVK NYAGLFGAYLSGLLVDIFGFAALVWPLVFLAWGAGCVSTWFTMPWWRWFGFTLLAACLIS LGAAWNLGIGDVRGGGMLGMSLYTKFTTLFSPVGSSLIWLFLLFISLEMAFGIAWIALIR KGWALLREQLSGTPFSLDDLPERLSKVSMPKVALPKIGLPHKDKKAEGETAADVIALFDI HDEEERPAPKREEPLSMDPPFSLTPEETPAAPKKDAGDPLKLWEPPSEHDTFDFQPEEPR AGTGTFAPQVNIHLASPTDAAAKAVNLLAGENPLLALTIPETQPAPARPQAELQVQPVQP YAAEQQAPVQPAQPHAEEPQQTISYAPEQPRDIFTFAPQQDAAQPVAQWQQAPQPVQAVA QPATPAAPAQPSVQAAPATSQAPAQTAQPRIDAPVGTAPAAAKPILRKRSYPMPSLDLLQ QPQQSDSLPSREVLEEQSAGLMNCLAEFNIQGELVRVTPGPVITLFEIRPAPGVRVGRFT NLTDDLARSLKAEAIRIQAPVPGCDTVGVEIPNLNRSTVNFRELIQSEAFQSAPSLLTMA LGKDIEGRPAVRDLATMPHVLVAGTTGSGKSVCLNSVLVSFLYKASPDEVKLMMIDPKRV EMAMYADLPHLVHPVVTETSLAKTALEWAVAEMDGRYDCLAKFGVKNIKDYNKKLASFGD ERPQEYADLKPMPYLVIVIDELADLMLTAGKDVEGCLVRLAQLARAAGIHLIVATQRPSV DVVTGLIKANFPCRVSFQVANKYDSRTILDTAGAEQLLGKGDMLFKPTGGKLQRLHGPFV TDDEVQAVADHWRRQCAPQYEVDFTEWGTSLAENAKASSAPASGPGSSDEESLYAEAVAF VQEQGRMSISLLQRRFRIGFNKAARFVERMEEEGILPPASRANKARTVRMD >gi|316921996|gb|ADCP01000133.1| GENE 101 113007 - 113564 860 185 aa, chain - ## HITS:1 COG:all5058 KEGG:ns NR:ns ## COG: all5058 COG0231 # Protein_GI_number: 17232550 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 
Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Nostoc sp. PCC 7120 # 1 185 1 185 185 185 47.0 4e-47 MYSTTDFRKGLKIEIDGTPFEIIEFQHFKPGKGGAMVRTKLRNILNGRVLDNTFRSGEKV ERPNLESRDMQFLYHEGEQLVFMDMTTYDQMHMDAEATDGKANYLKDGQECRVLLYNEKP LDIEIPASLVLEVTETEPGAKGDTVSNVTKPATLETGVVIQVPIFVNIGDRVKVDTRTNG YLGRE >gi|316921996|gb|ADCP01000133.1| GENE 102 113563 - 113793 57 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLPPKTVLNIKTYQGSAPRPPLVPYPEQSDGCRDCFSRCGRIALPLQLDIVEHQAALVNI HRRFPYDERDSPHTDP >gi|316921996|gb|ADCP01000133.1| GENE 103 113759 - 114394 802 211 aa, chain + ## HITS:1 COG:TM1466 KEGG:ns NR:ns ## COG: TM1466 COG0218 # Protein_GI_number: 15644215 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Thermotoga maritima # 33 193 23 184 195 131 45.0 1e-30 MMKETPPTLTLEATTFNKPQLQEQLSVLEQRQEAQIALAGRSNVGKSSLVNALARRKQLA KISATPGKTRSINFYRVAPDGFSLVDLPGYGYAKCSQEERKSWAKLIEYYLTNTPTLKAL ALLLDCRIPPQALDRDLADFARGHGIPLLPVLTKADKCTQVERSKRQQEWSRLLNGIMPL PVSSKDMRGVDALWSELRRFAGTGESVSEEE >gi|316921996|gb|ADCP01000133.1| GENE 104 114746 - 115765 938 339 aa, chain - ## HITS:1 COG:SA1328 KEGG:ns NR:ns ## COG: SA1328 COG4974 # Protein_GI_number: 15927078 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Staphylococcus aureus N315 # 15 326 5 295 295 203 34.0 4e-52 MPTPAPAMPPLPQDVEQFLTWMTDQKGFSPATISAYRTDLLQFEEWLHREQHSLARPGEL EKYHFQDYSAHLFHEGQARSSIGRKLSALRSLFRYLMKMKKIDKNPAKLVRNPKRELRHP TALNVDQMFTLLDEASVESGGAAGLDGSRQAEQTVTDREHAGHTRDLALAELLYGSGLRI SEALDLDVDDVDPASGFIRVIGKGSKERIAPLSDTSVRALFLWLRVRALIAPPAEKALFV GNRGKRLNRRQAARILDEIRKSAGLPQHLSPHTLRHTFATHMLENGADMRSVQELLGHAS LSTTQRYTHITLDHLMRVYDKAHPRSSVRGKGGEEGEDV >gi|316921996|gb|ADCP01000133.1| GENE 105 115802 - 117091 1199 429 aa, chain - ## HITS:1 COG:PM1599 KEGG:ns NR:ns ## COG: PM1599 COG0758 # Protein_GI_number: 15603464 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and 
vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Pasteurella multocida # 126 353 79 307 373 198 48.0 2e-50 MRKTDLFPENDSPGCELREPERPIDAPYAVDIPAVPTALGALDAASRAEFWAALALSHTD GLGARSRKRLLDAFGSAFEAVRRANDWREAGLREDPIREFRSEKWREPARKEWDATRKLT GTVLLWTDSRYPALLKELPDAPIRLYCQGDMSLLGNPCVSIVGTRTCSREGIRAAQAIAS GLAASGITVVSGLAIGIDKQAHLAALDLPGGTIAVLGAGSDVCYPKENDGLRRSIIQRGL LISEYAPGTLPDPRHFPIRNRIISGLSLGVVVVEAALRSGSLITARLALEQNRAVYAVPG GIGSKYAEGCQKLIRDGAQPIFSFSDILSDLSAQLKTLLPQPDSAPRPYSLDVPQEPFQS APAAPPPLRQPDRSTLEGRILTLLAQSPRAIDDVCQSLGCDPGEAGSTLIILEVRGLIRQ RPDLRYALT >gi|316921996|gb|ADCP01000133.1| GENE 106 117235 - 117711 437 158 aa, chain + ## HITS:1 COG:PH0574 KEGG:ns NR:ns ## COG: PH0574 COG2872 # Protein_GI_number: 14590471 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain # Organism: Pyrococcus horikoshii # 10 154 9 150 157 62 28.0 4e-10 MSKEYDPFMHTAEHVLNQTMVRMFGCGRSFSSHLNADKSKCDYHFPRPLEDAEASELERR INEVLAQHMPVRTEMLPREEAGKLVDLGRLPAHLEGEEELMVRIVTVGDYDICPCIGEHV ENTSEVGAFRLVSHDFTPPQGDEEAGRLRIRFKLKKNA >gi|316921996|gb|ADCP01000133.1| GENE 107 117800 - 118708 1228 302 aa, chain + ## HITS:1 COG:no KEGG:Ddes_1119 NR:ns ## KEGG: Ddes_1119 # Name: not_defined # Def: diaminopimelate dehydrogenase # Organism: D.desulfuricans_ATCC27774 # Pathway: Lysine biosynthesis [PATH:dds00300] # 2 302 10 307 307 350 62.0 5e-95 MGLGNIGRYAIEALEIQPDFTCVGVIRRKESLGKDAHDLRGVPEYASLADLEAVSGKPDV VLLCAPSRKIPEVAGDLLGKGYNTVDSFDIHDRIVETIGLLEAQAVKGKAVAITAAGWDP GTDSVMRALFEAMAPVGVTFTNFGRGRSMGHSVAARAIAGVADATSITIPLGGGRHSRLV YVVLEEGATFEAVKAAIQADPYFSHDPLDVRQVTKDELPRVADASHGVLMERVGASGLTA NQHFTFDMRIDNPALTAQVLVSSARAAVRLAQSGRSGAYTLIDVPPVLLLPGERIDNIGR LV >gi|316921996|gb|ADCP01000133.1| GENE 108 119303 - 120466 897 387 aa, chain + ## HITS:1 COG:PA1222 KEGG:ns NR:ns ## COG: PA1222 COG2821 # Protein_GI_number: 15596419 # Func_class: M Cell wall/membrane/envelope biogenesis # 
Function: Membrane-bound lytic murein transglycosylase # Organism: Pseudomonas aeruginosa # 146 387 120 378 385 158 38.0 2e-38 MSRRFSLFGACLLCLLLSACGGKSTRPDVVQPDVPEAVRVTPLETPGVFPVSEAEAQRLS RGLAPARQGMRSWKDMRFAVEQSLAYVRAKPSSRVAVNYPALQVTYGEMEKGLERLLRLL PFLDKAPEVLASDFRWVRIGPDFGFTGYYEPEIPASHVRKGRFQYPIYKSPPDLRKKRPY HTRHAIDCKGALKGRGLELAWVEDPVDIFMLQIQGSGRLRFEDGSVRPVLYDGQNGHKYV ALGRVMVDRGLLKREEVSMFSIREWLAAHPDQVTDLLDTNPSYVFFKLGKPGTSGSKGSM GRRITPWVSVATDQSVLPNGLLTLMNVTLPDEGGEHSVPFNALTLPQDTGGAIKRNRVDL FCGNGETATHTASYLDNRGAVFLLLPR >gi|316921996|gb|ADCP01000133.1| GENE 109 120602 - 122050 1601 482 aa, chain + ## HITS:1 COG:PA3548 KEGG:ns NR:ns ## COG: PA3548 COG1696 # Protein_GI_number: 15598744 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Pseudomonas aeruginosa # 26 440 24 427 520 236 36.0 8e-62 MNFVTLEFATFFLFVLVAGWLLRERGGEYRAFLLVCNLFFYALAGMAFVPLLLAVAVLNW GAVHLMSRFSGQPRKRKGVIALDVALHVALLAFFKYYEFLILGLESLASAAGLDTGILRH ALPSMEILFPVGLSFYTFQGLSYAIDHYRCPDEPPRAFLDVLLFVSFFPTILAGPIMRAR QFLPQLGHRTWDSRDLQEGFALILSGLFKKVVIASYLSEHIVRDVFESPEFYSSWTILAA VYAYSIQIFCDFSGYSDMAIGVGRLMGFRLPENFRSPYLALNIQDFWRRWHITLSLWLRD YLYIPLGGSRRGNRYLNLIITMALGGLWHGSNIRFLVWGVMHGVGLAVVHAFHELKKRFW PDGFTPSPALHWCGVGAAWLLTFHFVSFLWVFFRAEDMERSLEILRRVFVFGQPGEGFPL LVIPAVLIGLALQFFGARLFAAFVDAQERLPWAVQAVVLALVGGFILKMGPDGVMPFIYF QF >gi|316921996|gb|ADCP01000133.1| GENE 110 122069 - 124057 1559 662 aa, chain + ## HITS:1 COG:PM1451 KEGG:ns NR:ns ## COG: PM1451 COG0627 # Protein_GI_number: 15603316 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Pasteurella multocida # 429 655 39 258 269 117 28.0 1e-25 MKNPFRSERGTLTSGRACAVYALAFVLMLFLDATRFSSFLDRAGIDFPPAAVVGQTVKDV AEATGLAAFSEAETRLVTAFNTDKTLGTVEGPSVAAAPSTAPQRPVADALADFSREAASP AQPESPVEEVPASVPPAAVPEQAVRTPGAQPETGIPPPPVRAETERKPLVLLVGDSMMME GFGPVLQRTLRKRPDLEVVREGKYSTGLSRQDYFDWPAQLEKLVGKYNPDMVVICMGAND PQDIIDENRKRHHADSESWKTIYRSRAERLLAVATAKGAKAVWVGLPVMGKEPYSTRVRR 
LSELQKEACETYHAAFVDTVKVLADAQGNYTTFKVDDKGRHVRLRYKDMVHVTEDGGAML SAAVEPVVEKELLLGRNKAAERPAPQALPSSASSSPLPAESPLPAVAETSAEQGGIPFTV DSMFRGGKIPCYAFLPADRKPGERFPVVYLLHGAFEDAGVWNTRAGALLSKLATRERLVF IAPSCGRTGWYADSPYLKKSRIESFFARELMPYVERAFPVLPKRGVMGMSMGGHGSFVLA LRHPGSFASVSSMSGVMDITRHPDQWKIRDVLGPMNANKALWQSYSAEELLKRSKAAGMP AMLITTGQQDAYVVPENRAFRDTLRRGGFSYQYREAPGLHDWTYWLDELPLHVAFHAGVL HR >gi|316921996|gb|ADCP01000133.1| GENE 111 124231 - 124893 650 220 aa, chain - ## HITS:1 COG:SP0104 KEGG:ns NR:ns ## COG: SP0104 COG0546 # Protein_GI_number: 15900047 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Streptococcus pneumoniae TIGR4 # 9 211 5 202 210 100 32.0 2e-21 MQLSAFPFSTILFDLDGTLVDSAPDVLEAIAAVLRDAGDPVPRMELSIIGPPLEGIFREV CPDADEAKIARHVAAFRDLYFGGTFPQSTPYPGIFALLERLRAKGCRLFVATHKPEAAAQ RMLTLKGFMPFLEGVGCTDSLPDRQLCKKDIIRLLMERHALDPASMVMVGDTALDIRGGN EQGIATIAALYGYGDKAKLLAERPRYLTDDPEWRTLRKAE >gi|316921996|gb|ADCP01000133.1| GENE 112 125428 - 126441 1245 337 aa, chain - ## HITS:1 COG:YPO4037 KEGG:ns NR:ns ## COG: YPO4037 COG4213 # Protein_GI_number: 16124157 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, periplasmic component # Organism: Yersinia pestis # 2 335 3 331 331 191 34.0 2e-48 MLKKWLSCICAAILLLGLGGCEDNSRKVKIGVSIGAGGAARWQKDMAFMQERAKALGADI ELRLNLPDSSKTQAEDCLEMLSDGIDVLIITPNNTRKVDDVLMYAKQKNTKVVSYARAVM GGNVDLFVGYDCYKIGQNMGQHLTEKVYHGDIIVLKGDVNDFNTPFLYYGAMKYIKPLIE NGKLNMVLDAYVPKWSPAEAKKLVKEAVAANGNRIDAIFASNDRLAGAAAEALEELHVTN HAVITGMDAELAAIKRILAGTQDATIYMDLRELAYAAVDEAYNLATKKKVNVNSELGNDG SNKINAFLINGKVVTRENINKVLIEPGHFTKEDIYSE >gi|316921996|gb|ADCP01000133.1| GENE 113 126435 - 128612 1722 725 aa, chain - ## HITS:1 COG:alr1230_2 KEGG:ns NR:ns ## COG: alr1230_2 COG2200 # Protein_GI_number: 17228725 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Nostoc sp. 
PCC 7120 # 476 717 15 256 266 170 36.0 1e-41 MVFINLALSVSVVFLMRYVQNLLNSDVKINLVEIVTQNKDAITSRLMMNVNSLDITANKL SDRLKAEESSGKLKTQDLIEQYAKENNTDDLFTANRDGMALFDGGRKLDISGRHYFRLAI DGIPNISDKLISRINGDEVFVISVPLSYNGEIVGTIQKVITFEEMYKICSLSIFSSQGYM YVINREGYVILHSAHPNCEQKSDNYFRDLYGAGNPKASEQMKRDIRNNRNGFIETTIAGR EIFSAYTPIEKIHDWYLITSVPNNAVSPNGNTVISIFYFILFVIVVIFTSSLTYFLWYKN KQRAQLEKIAFVDTVTLGDTYNKFLVDAQGILTQCPHKKFHIIKFDIDNFKYINNFYGFE FGDRILRKINESISQQLNAHELIARIYSDHFVILLENAAEGRLNALLSSIENEEITLYFS AGIYSVTDSTESINLMVDKAGTAARSIKGVLNKKFAYYTNKFEQITIHNEQLKRAVKQAL KNDEFIPYYQPKVDINSGILVGGEALVRWKTREGKFIFPNEFIPMCEQTGLIVELDMIMY EKVLKFLKSFLEQGLACVPISVNFSRLHLMDSDFLSKIVKKQCEYGVPAHLIEIELTESA IFDNIDTIYAFTRKMHANGFAIAMDDFGSGYSSLNMLKDIPIDVLKIDKGFLSEAADNNR RNIIFSSIADMARKLNIKVVVEGVEHLENVRLMKECGCSIAQGYYFAKPMDEESFRNVFK EGRIC >gi|316921996|gb|ADCP01000133.1| GENE 114 129434 - 130171 253 245 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 4 240 2 238 242 102 30 2e-20 MSEQRPVAIVTGAASNIGLACAQRFAATHTVIMADIADATQQAASLPHAVAVRVNVGEFD SCLNLVAEARKYGRIDAVVHSAGITRPACSILDMPVEEWENVIKVNLTGAFFLAKACIPP MLEAGQGAMVLFSSRAAKTGFAALGSNGAKTKAHYCASKAGVISLVKSLAMELAPYGIRV NGVAPGPVQGTMIPKESWPIIAEKVPLNRLGTPEEMAEGAWFLCSPQAAFITGHILDING GTLMD >gi|316921996|gb|ADCP01000133.1| GENE 115 130393 - 133380 2821 995 aa, chain + ## HITS:1 COG:ECs3353 KEGG:ns NR:ns ## COG: ECs3353 COG3604 # Protein_GI_number: 15832607 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli O157:H7 # 652 993 302 657 663 250 41.0 1e-65 MDFSPFSLLVVKVFSLITGGLGAEACAQILKVPLEELEPIIAELKEGGLLQSAPDRPARI MITSRGVAELQMRNAWREVCVEIADVASALDEGQNLEACTLLLGFINTLLNSSQSSGAVA CYGLVLRLLGRWSPAGAASGQKQAFMKLALAACDISMYLSKSHRQAREIAARALETAKEQ GDARMTALLHLVESSLENMAAECSAQRMHELHAQCETSLLELDKAGARGQAEYFVGMFHF WHGHFREALAAFEVALQYPQLWECRFQTEMFSLYTSSSAVYLGRFHQAVGILEAAKRAAE 
LGQNRFKILWWEGQLAMVLLYMRRHEEALELIDHVLSMVNPETETKIAVWGMRGLAFYHY TQGNIKASHAAMQRTMDIALRYGFRRPIYSYPWMFDVLAAYEQAGLPPVRGLDLDEEIQR AIESYNEHLRAGALRAKATRLFAESGDKEAAGSLLAESLRGFTAIGNPTEIAVTQRRIAR LHDLPARAAEKGLAPESGASPYEPESTARIGGTLLENCRIALESIRNLTDFSQYINQMTH IAGCELGAERAALLEVSSGGLLTCRAACNISAVELNTGKISSKVSQIYAVQQNKPLLLDE ERHVFLSLPLETAKARWVLWLESEYALETLRGISNADQAAFTLLFESELRKIEAGKPKPA RLAAGEQEQAQAEPYRDTPEILLHGSAIMRRLLSHGRQIGATDAPVLILGETGVGKELLA HYIHDCSGRTGPFVPVHPASIPDGLFESEFFGHEKGAFTGAIRQKIGLAEMAHQGTLFID EAGDIPAPMQIKLLRVFQDHRFMRVGGEEERHSDFRLICATNKDLWAEVKSGHFREDLYY RLSVVPLTIPPLRSRKEDIRLLVRFFLDRYSRRYHRDIQQPSERELEALLQYDWPGNIRE LKSVLERTVILHQGGRLSFDLSSHAESKADGTPRQEPCEELFKNLPTLDELQSRYIQHVL KITNGRITGNRGALRILGMKRSTLYLRLKQYNIRF >gi|316921996|gb|ADCP01000133.1| GENE 116 133599 - 134615 842 338 aa, chain + ## HITS:1 COG:mlr6793 KEGG:ns NR:ns ## COG: mlr6793 COG1250 # Protein_GI_number: 13475669 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Mesorhizobium loti # 2 307 1 306 309 162 32.0 8e-40 MLFRNVVIFGAGTMGHAIALRFARGGHAVTLTDVDRDALDRAERSIRRLAGTQSEQLGGE APESVMARLRFAEDGFGAAAGAELVVEVISERVDIKRRFYEELAGVVSPSALIASNTSVL DIFEHAPSVLHPQLIMAHYFVPAHIIPLVEIVGHPSNPGQLVPQFAACLSALGMKPVVLK RFARGFIVNRIQRAINQELFQLLDEGVADAAALDDAVSVCLGARLAVMGYLSRLDFTGLD LVLSNYRQVPMGLATDETPPPLLERMVGEGRLGFKSGKGFYGYSGVTSEEVLRHRDARLL AVFRSLETINARFPSPTLEPANGGAWQNGPSSQKDEDT >gi|316921996|gb|ADCP01000133.1| GENE 117 134612 - 135058 513 148 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKIARENLISGLVVIALGLVFLSQAFGLEQANATEGIHPMDYPKALTWLFIAVGCCIVLV PSHRRSDSDIPLFSARTAAVSGILVAYALLLDHMGFGVTSFLAACGIGYAMGWRKLPSLL LSNLCGTAVIWSLCWYVLKLPLPMGILF >gi|316921996|gb|ADCP01000133.1| GENE 118 135134 - 136624 1680 496 aa, chain + ## HITS:1 COG:BH2009 KEGG:ns NR:ns ## COG: BH2009 COG3333 # Protein_GI_number: 15614572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 20 454 22 454 504 301 38.0 1e-81 
MLFIDSLLAALSPFTLAMNFVGVMLGMIVGALPGLGSVVAITICLPFTFGMTSVSAIALL LGVYCGSICGGSIAAVLINTPGTPQSAATAFDGYPMARAGKPGKAIGWALAASIFGGVFS CVILTFAAPQIAVFALQFGPLETCALILMGLTCISSVSANNQFKGLAMGVLGLLLACVGM SPFSAESRFTFDIFALNSGIDLVAVIVGVFALSEVLDRVERMRREARVENGTSCRVQLPS LGEWRGRMSGLVKSSLIGTFVGILPGTGAATAAFLSYGEARRSSPRRGNIGKGEPDGIIA AESSNNAVTGGALVPSLALGIPGDPVTAIMLATLTIHGVTPGVRLMTENPEMVYATFAVL MLSNLLMYPSCVITTRMFSFLLRIPEQLLMGFIAVLCILGSYGSRGNLFDVFVTVFMGIA AYFMRRMEFPLPPLVIGLVLGQQFEMSIGQMMLFKGDDSWLAFVSASPIAMALLGMAFLL LVVPQVQNYRALRAKR >gi|316921996|gb|ADCP01000133.1| GENE 119 136681 - 137631 1252 316 aa, chain + ## HITS:1 COG:FN2103 KEGG:ns NR:ns ## COG: FN2103 COG3181 # Protein_GI_number: 19705393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 27 301 16 295 308 139 28.0 5e-33 MIRQMMKKTLLFLCVFMLGATGALAAYPDHPITLVVPFGAGGGTDIPARLLAGMMEKRLG QPIVVQNVSGAGGTLGVAQVAAAKPDGYTLGYVPTGTMCLQPHVMKLPYGTDAFDFIGTS VSQPVVLMTAKNAPWKNINEFIEMAKKNPNKYVVGITATGNMTHVPMLQFAEHFGLKLRF IPYRSGGEIFKDIMAGRTHLHADAPASLSTFDVYGLVQFADERSENLSDIPTTKELGMDA RFSHWQGVIAPRGLPKDVLRTLETAMSEVVNSEAFQQEALRLKTKAHWMSSEAFRKLYDE ELNTYGEILKYSLPQK >gi|316921996|gb|ADCP01000133.1| GENE 120 137750 - 138076 289 108 aa, chain + ## HITS:1 COG:MA3162 KEGG:ns NR:ns ## COG: MA3162 COG5561 # Protein_GI_number: 20091980 # Func_class: S Function unknown # Function: Predicted metal-binding protein # Organism: Methanosarcina acetivorans str.C2A # 2 108 12 116 117 61 32.0 5e-10 MKVGIIRCQQTEDLCPGTGDFAAVEKKSGAFETLGDCTIVGFVSCGGCPGSKAIARAQAL MDRGAQAIALATCITKGNPSGFPCPNRDSILQALRKKCKEVSVLEYTH >gi|316921996|gb|ADCP01000133.1| GENE 121 138186 - 139883 1529 565 aa, chain + ## HITS:1 COG:SMa0958 KEGG:ns NR:ns ## COG: SMa0958 COG0028 # Protein_GI_number: 16262972 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, 
phosphonopyruvate decarboxylase] # Organism: Sinorhizobium meliloti # 10 562 9 563 570 491 49.0 1e-138 MASCTEKHNTVGHSIGRMLLRHGIHEFFGQGLPQGLSLPFEELGLRQIAYRTENGGGYMA DGYARVSKRPGVMIAQNGPAATLVVAPLAEALKSSIPLVVMLQEIPLKNEDRNAFQEFDH VRLFSACSKWTRRVHTADRVEDYVDQAFAVACSGRPGPVVLLFPNDMLGMPAEPEKRDIP LGHYPLDPVMPPLDAVRGVAERLLSSRNPVIVAGGGVHLSSACAALAGLQEHFSLPVATT VMGKGGVDERHPLSLGVLANATGPGSAGRHQRPILEEADFILLVGTRTDQNATDSWTLYP SDAVFAHIDMDSGEVGRNYEAIRLVGDARLTLEALADVMKTMDGGKRRTARAALEARIAA GRAAHDAEMASFLSNAGEPLHPAQVMAALQAVLTPESIVVADASFSSIWATNNLVSLRSG MRFITPRGMAGLGWGLPMALGAAVAEPEAPIYCITGDGGFAHVWSEMEVAARSNLKVTTI VLNNGILGYVKCSEKLRYGKNSTSVDIFPIDHVALAQACGLQGIRVEHAEELLPALEQAR LSETSTLIEVMCSGDCPPITDFVGK >gi|316921996|gb|ADCP01000133.1| GENE 122 140017 - 142842 2225 941 aa, chain + ## HITS:1 COG:mll4880 KEGG:ns NR:ns ## COG: mll4880 COG1529 # Protein_GI_number: 13474083 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Mesorhizobium loti # 158 907 1 764 774 243 28.0 1e-63 MSIQRIVLNINGVDREILCEPEKDSLADVVRRLGLTGTKVGCGTGHCGACSLLLDGEVTR SCVKKMKQVKAYAKVETIEGIGSAERLHPLQRAFMTYASVQCGFCSPGFIMSAKGLLMRN PDPTREEVREWFTRHGNVCRCTGYKPIVDAVMEAAAVMRGEKAMEDITFTPPEGGRLYGS DFPKPTALSRVLGTCDFGADISGKMPEGTLHLAVVLAKREHARIRALDTAEAQAMPGVVN VVTAKDVKGTNRLVAPQGTVHSLCDGLDRPVICDGVVRRYGDVVAVVAATGRDKARAAAE RVRVEYEPLPAVMTFMEAAAEGAFPIHENTPNIYIEQPVYKGEDTQGLLEHAAHTVEGSF GTSRQPHLTVEPDVVLAYPQDGGVVIQCKGQYLYGNIAQMAPAIGLPKEKVRIIGNPAGG SFGYSMSPGNTALAAACALALDAPVSLVLSYAEHQHTTGKRSPVYANVRLACDEAGKLTA MDFLAGIDHGAYSEMAGALTTKVCRFFGYPYAIPNIRGLVRTAFTNNNFGTAFRAFGSPQ TYTASEQIVDMLAGRIGMDPFEFRYINVAREGDTCTTSVPYREYPMRAMMDMLRPHYRKA VEKARAESTPEHRRGVGIAWGGYHVSKVPDRAEIDLELNQDGSVTHYSTWADVGQGADTG SLIHVHEALRPLRLRPEAIRLVRNDTGCCPDTGSASGSRSHHVVGMATLDAASKLLAAMR KEDGTYRTWREMRDAGVPTRYRGVHTAEWSDIDPDTGHGYGAIAQNYVLFMAEVAVEAAT GRTRVLGATIVADVGVIGSRQAVLGQAWGGFSHSIGFALSENYEDMKKHATMRGAGVPRC NDVPDTFNVLFHETYRENGPHGSTGCAEGFQSAGHVSILNAIADAVGVRVATLPVTPEKL 
KAAMTAKAAGVPYGQEPWDLGCSLYGRLAFLKARHAGKGKG >gi|316921996|gb|ADCP01000133.1| GENE 123 142878 - 143492 521 204 aa, chain - ## HITS:1 COG:no KEGG:Sterm_3311 NR:ns ## KEGG: Sterm_3311 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 4 198 8 202 205 157 46.0 2e-37 MRDELMSEIIKLEWDMFSHVSNVGGPASCQMRPDTFKIMRKSQAATWSDELLASYLEDLK TATREGRNIMTEKYARMMESTFPEEYRKLAASLPPVDKETLQKIEEIVAINVGWKAELFD RYPRLSGKGRPLRTSEDSAMETSFETYLRGELKTYSARTITLLHELTLRQQQDGVNGAAL NLLNQVQQYGYATLEQAEGHRAGG >gi|316921996|gb|ADCP01000133.1| GENE 124 143716 - 144660 938 314 aa, chain - ## HITS:1 COG:RSc0886 KEGG:ns NR:ns ## COG: RSc0886 COG2958 # Protein_GI_number: 17545605 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 3 314 6 311 315 255 46.0 6e-68 MALSQSQKIIAFLEKHRGGSFTARQIAEAIVQQYASDYAKKRNNPRFADDKAFLSQIIAE IGSQKDTLKRLRPTLVWQDRPKPRRYCIPLEETQEALDTAQGETCERDEESCACEERALL EQELYPLLTSYLWSELHLYSMRIREGRSHNTRGNGGNQWLHPDIVAMEPVDKAWGKHVKA CGHASGSTCVRLWSFEVKKMLTVANIRKCFFQAVSNSSWAHEGYLVATAIADDRVEQELR MLSALHGIGVILLDPENPSESEILLPAQKKPDADWQSIDRLAKENGDFHEFIELVSVYFQ TGTLREKDWDNKTA >gi|316921996|gb|ADCP01000133.1| GENE 125 144832 - 145347 174 171 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 57 166 317 426 435 71 31 2e-11 MNGITSSGAKGINALGPRYMGKYHPLMEIGPRYLLIQGTHRELTEGNGPPFVLDVAPMLV LFAIVMTFCILGCFLPSMPLLLICVPIFVPIAKVFGWNLIWFGVIITVLDNMASITPPFG ISLFVMKEVAGVTLGAMYRSAVPFVIALFVCLLLIVLFPSLATYLPTLMNG >gi|316921996|gb|ADCP01000133.1| GENE 126 146021 - 147544 1988 507 aa, chain - ## HITS:1 COG:TM1058 KEGG:ns NR:ns ## COG: TM1058 COG0069 # Protein_GI_number: 15643816 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Thermotoga maritima # 26 500 2 488 508 333 37.0 5e-91 MSYSPTLGSVFNRTKMRDPSHVCSFSGMCAMCTADCIGSCEIGLSAVRGMETVYPTTTGE 
NQIASEKVYPVDYSHFNINGRSFGAMGASLDADRATIYHVGLEREIGTLNPVKLALPIIL PALIKLNWPDYFGGAAMAGVNAVIGEGAVSKDPTLAYENGKVVHAPKLREMLDAFNKYDR GYGQIILQANYDDDAQGVPEYAIRECGAKAIEFKFGQSAKGTQPANLVPTLEEALKKCEQ GFLVLPDPRDPGVQEAYGRKACPAFHVYGRLPIWSDDYLIGRIAQLRDMGMANVYFKMAG FDPEDLEHILRLASKAQVDMVTFDGAGGGSGYSPCKMMNEWCLPTVVMESILYGILDKLA AEGMELPHVAITGGFATEDQVYKALALGAPYISAVGLCRSSMAAAMSAKKIGDLIEAGKV PPELARFGTTKEELFSDLPELRGLYGSAADGFSTGAVGVYSYLNRIAYGLRHFAALNRKF DVKHIGRRDVFPLTRDAKELLDGTWLR >gi|316921996|gb|ADCP01000133.1| GENE 127 147626 - 149206 1896 526 aa, chain - ## HITS:1 COG:PAB0090 KEGG:ns NR:ns ## COG: PAB0090 COG3653 # Protein_GI_number: 14520359 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: N-acyl-D-aspartate/D-glutamate deacylase # Organism: Pyrococcus abyssi # 3 525 5 523 526 419 42.0 1e-117 MLDLLLTNARIVDGTGNPWRKGCVGVRDGRIVAVGSAHDMPEARTVLDIGGNILCPGFID SHSHSDLRIIDEPLALPKIMQGITTENVVLDSMSVAPISDKNKEPWSTSISGLDGVAQKP WGWNGFASYLNALDAAKPSVNLSSYVGLGTVRLDVMGMEDRPPTAEELRLMEESVARCME EGARGISAGLIYTPNKYQSTEELVALAKVAARYDGILDVHMRNEADHMAEALEEVIHIGR ETGIRILVTHFKLRGRRNWGNARRHLDTIDRARGEGIDVGIAQYPYTANSTFMHVVVPPW YHSRGTDGLLKALAEEREQVKKDMLTTEGWENFSQVMGWENIYVSSVVSEKNLWCEGKSA VALGEALGKSPEDAVLDLLIEENLAVGLLGHGMYEEDVMEGMKHPTMCLITDGLLSGGKP HPRTYAAFPRFIARYVREKRLLTLEEAVRKMTSSTALKLRMKRKGFIMPGMDADMVVFGE NRIRDVNTFENPRVYPEGIDYVLVNGEIAVDHGVHTGARPGKTIRD >gi|316921996|gb|ADCP01000133.1| GENE 128 149222 - 149959 804 245 aa, chain - ## HITS:1 COG:TM1058 KEGG:ns NR:ns ## COG: TM1058 COG0069 # Protein_GI_number: 15643816 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Thermotoga maritima # 24 243 297 506 508 173 39.0 3e-43 MTSYGCHPFSRAGLSLHLRPQRRVRVASELDFELLTIDGSGGGTGMSPNDMLDNWGVPSV LLHAKAHDYAFLRAAAGKKVVDLAVGGGLAKPSQAFKALALGAPYVKAVCMSRSFMIPAF LGCNIEGALHPERRERVHGAWSALPKTVLDIGDTPETIFAGYHALKERLGADEMAHIPYG AIAMWTMCDRLGAGLQHHMAGARKFGVEHIDRTDICAANRETAQETGIPFITEQDDELAR RIVLG >gi|316921996|gb|ADCP01000133.1| GENE 129 150080 - 152743 1702 887 aa, 
chain - ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 341 589 20 266 280 189 44.0 3e-47 MKSYSGIPYTTGKQAADDVKTQLVGFIKIPKWLLGLCLITVFYILFVCFSMYKIGDSTTQ LYEYPYTVSRKAQEMKTALYGIRISLPSLLATPDITAAQINDLLQKQRAMQDKSLEEITL KFRGNEKELAYLKRCMAELREARKNMVNNLLGNLDFQYINAYYAREVLPHFEKVDKLLGN LGDAAEHRGNAILAEMDKLRTFSIIATLLMGISIILLILHTRKLEWDKYKESAYREKLLN LLAANIDEIFFISREDKSFEYVSSNSERVIGVPPEKFIHDYRKLYSLLACKDSDWLRAVL DDTSSNMKERDIVLGEEGRRFNIRVYPIFLNGVLSQRIIVLFDQTKEFAYRQALSDALEN ARNANTAKSNFLSHMSHEIRTPMNAIIGMTVIALTRLDDRSRMEDCLTKIALSSRHLLGL INDVLDMSKIEGGKLTIAHEPFNFKISLQGVINLIQPQALERGLDFEVSLSGVDEEELLG DALRLNQILINILSNALKFTPVGGSIRLEVHQLHKKNNNVQFRFVIRDTGIGMSQEFIKR LYTPFEQATSSTASKFGGTGLGMAITKNLVSLLGGTIFVKSEEGRGTEFTVELPFGLSGR QLEKGKGELEPLKVLVVDDDYDTCEHACLLLDKMGLRTRWVLSGAEAVKVVQESHVSGDG YDVCFIDWKMPDMDGMETVRRIRNEVGPETLIIIISAYDWGAIEEKARAIGVNGFIAKPF FASNLYNTLTSLTRRTAPKREIEVSAPPESETETAHQHYDFTGKHILLVEDNEFNREVAQ EFLEMTGATVESAENGSEGVALFTASETGQYDIILMDVQMPVMDGYEATRAIRASVHPDA NSIPILAMTANAFNEDVAAAVAAGMNGHIAKPIDVTALYRLLASHFK >gi|316921996|gb|ADCP01000133.1| GENE 130 152821 - 153171 235 116 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302864281|gb|EFL87212.1| ## NR: gi|302864281|gb|EFL87212.1| probable sensor/response regulator hybrid protein [Desulfovibrio sp. 
3_1_syn3] # 1 114 4 117 119 143 68.0 3e-33 MDQTRKARLEAGGIMVDEALERFMGNEAMLERYLQKFLSEKSYAMLRDSLASNDWEAAGR AAHTLKSICGTIGCEAMQELVILQERHIRAGEWKEAVGMMPEISNSYENICGVIRA >gi|316921996|gb|ADCP01000133.1| GENE 131 153183 - 154271 282 362 aa, chain - ## HITS:1 COG:slr2100 KEGG:ns NR:ns ## COG: slr2100 COG3437 # Protein_GI_number: 16330586 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and an HD-GYP domain # Organism: Synechocystis # 9 352 12 360 368 208 34.0 2e-53 MHKIVVPNTILIVDDDEMNRDVLGNIFSASHSIEMAENGKECLNKILECGQKFCAVLLDV VMPVMGGIEVLKKLNRDGVVDHIPVFLITGETDTRIIKRAYELGVMDVISKPISSYMVQR RVNSVIELFTARKRLSSVVGQQKDQLLKQAKRILRLNMGMIESLSTAIEFRSGESGEHIR KIHDITKLFLENSPLGRDFSTEEIEHISLAAIMHDVGKISIPDAILSKPGRLTPEEFEIM KTHTTQGGQLLERIPQMRELPFFTYAYDIAKYHHERWDGRGYPEGRKGDDIPLWAQIVSI ADVYDALVSPRCYKKAFSFEVALGMIVSGECGVFNPNILACFREIEGKLRTLYGSGPERL YE >gi|316921996|gb|ADCP01000133.1| GENE 132 156574 - 156813 130 79 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPYPCCQTLHNLLAMVFKPFFDRFMRICNGTVVCNLNKSRTLAVSTKFFKSRLTYGVYFC GFFSRDIRFSENSLCFCEL >gi|316921996|gb|ADCP01000133.1| GENE 133 156892 - 157383 293 163 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302861414|gb|EFL84351.1| ## NR: gi|302861414|gb|EFL84351.1| conserved hypothetical protein [Desulfovibrio sp. 
3_1_syn3] # 11 163 7 157 157 124 42.0 3e-27 MRRPINPVIPYPHEAIQHTRCVLALSMITVALSLLKPETLPQLGDLGRQVKKVDRWIERC SDDVQRRLSAGAKRDLDRRFHILAEHVDSALAETDDAKKWSLWASGVWAGLTFLEDARNT CPVYFRGLHWHNLLKTLTTLCNALEKVDPKIAEIGTRVYELAA >gi|316921996|gb|ADCP01000133.1| GENE 134 157419 - 157850 303 143 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDFKPFDRVLVRDSDDAAWVPGFYSHASGRIDVPYTSILSHSYAQCIPYEGNEHLAGTA DAPASWSESKYMPVCGEIVAVRDSGMKFWVPRIAVGIDGDGRYLCRKLPCTNESDLCSWE HARAWDDELVAVVPIPDEPREPK >gi|316921996|gb|ADCP01000133.1| GENE 135 158238 - 158555 60 105 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNTTVQKAINDGMKATGITQTELAIVLRSRARASEIINGKRDLRRTELLVLSYLLHIPLE TLMPPLSEEDKLHIDSLIAWERKLQETRYIKKRTSKRSKKACPKN >gi|316921996|gb|ADCP01000133.1| GENE 136 159231 - 159452 135 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MITTAELARIRAAAIGDMLGDPGALDEMGPAATIFRLCRELELATKRAVAMSEVAAAAWE AAREAARKDELQT >gi|316921996|gb|ADCP01000133.1| GENE 137 159449 - 159754 146 101 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTAQEWLDELERLREAATPGPWKHDKWIIKSLEARKIAQTINLDNSAYIVAACNAVPMLV EMLRIAVSLIEAQNAQRARKGERIPDIDVMRLLFTMTEPKE >gi|316921996|gb|ADCP01000133.1| GENE 138 159732 - 160013 88 93 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPEPQIVIVVNSLTEAHKLERELVRDRQNACGYRPRQKNECRFCKHVGRYSSYTYQTTYF CDLHNFCVAARGICNDFETNIPGERTNDSAGVA >gi|316921996|gb|ADCP01000133.1| GENE 139 160219 - 160551 218 110 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLTEEERKWLEERKDVCFRCKKRRQTCSPLLRESCQKAGYPIRDLAMVIPPDYRDAAEFE ARVSRYIAENASELDFSSNNWQFFDGFKAKSLAWYILREARIAVEEKMDQ >gi|316921996|gb|ADCP01000133.1| GENE 140 160640 - 160816 185 58 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNTVGELISELESFPPETPVVHHDDEEGNTVSKVSIGFMTDGDGDPVCVILFPGEEIE >gi|316921996|gb|ADCP01000133.1| GENE 141 160880 - 161131 130 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPDTDALPDIRLKCPDLASIIPGRRFLYRAKVGGERQTVTVTASCAPYPRDFGKGRKAMY VTVYGYEGKWTVPASKLRIAEKA >gi|316921996|gb|ADCP01000133.1| GENE 142 161095 - 161361 119 88 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MNSSFAKFMNSKFMGMVYALIMFFFCIHLAYRSGEMSGILSEKQKKDIVFVIVDSENSTH RVIWSDGTLKEFNGVTPCPTPTPSQTSD >gi|316921996|gb|ADCP01000133.1| GENE 143 161354 - 161770 178 138 aa, chain - ## HITS:1 COG:no KEGG:LF82_p096 NR:ns ## KEGG: LF82_p096 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_LF82 # Pathway: not_defined # 25 132 11 112 114 74 35.0 1e-12 MSVDLYGDYTAEVLSLPQQWREQMWLANGETGISSKTIHAVLTGTVDNTVFNTSPSWFRY DVPHDPSDFRRCYLLLKFIPEWKQRLHEVAERFPKWKPFVEQWDELTRLYEDERDREDGK APLLYKLMKELKNKGNHE >gi|316921996|gb|ADCP01000133.1| GENE 144 161784 - 162017 98 77 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIVWHVTTAKKLKRYKDSGGILPPVRAWDSLPAAERFSKQTGRKVILRLKFPNTAERLPG HRGEAFVLNEKYRLTSI >gi|316921996|gb|ADCP01000133.1| GENE 145 162542 - 163201 400 219 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLDLNMESEQKNRNYGPVPAGSKVMVRISVETPKYGLQDTPWVAQAKSGLLGLWCKFTVV GGLYDGVEWYDNLWLPAGYQNIRLNEGQTTTCNRSGAQIRAIIEAHRGINPKATDDRSVR GRQLAEWTDMQDMEFPAKLGISKEPYEKDGKKYWNNYISSIITPDKEEYTQIKSGKEIIT DGPVTGEAEKGKNQTPPEANTYGQPFPAGPSEYDDMPFN >gi|316921996|gb|ADCP01000133.1| GENE 146 163194 - 164030 505 278 aa, chain - ## HITS:1 COG:no KEGG:Mmc1_2921 NR:ns ## KEGG: Mmc1_2921 # Name: not_defined # Def: hypothetical protein # Organism: Magnetococcus_MC1 # Pathway: not_defined # 39 253 15 233 252 191 44.0 3e-47 MAQTTQEQAVQAASFLGCDASELVNPFDLSAIVSSKGFQPMKIVLYGVPGIGKTTFAGTF PSPILLRTENGAAALDIPTFPNLITSLQDLDAAIAALRGIHQFKTLIIDSLDWMEPLVWQ YVCTKEGKENIEDFGYGKGYVKVDDVWRAIQAKLEKLRTLRDMNIVTIAHAVPVTIDPPD SDPYQRYSLKLHKRGAALWMEWAEMILFLNYKARVTKREGEKAKATGSGDRVIYTAERPA YQAKSRWPLEPEIFIGNDPTWAAFHEQLSTATEGAYHA >gi|316921996|gb|ADCP01000133.1| GENE 147 164044 - 164334 123 96 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYCGATLTLETATIEHIVPLAGNGLDDLRNMTLACAECNHAAGHLSARQKVELALKRRGD VMRPEADMYDAERESVDRTIPDEPEYHGPGCSGALL >gi|316921996|gb|ADCP01000133.1| GENE 148 164631 - 164816 106 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAEKDGGGGMISNVYMKHAKVSLERPKKVDRARVAAVIIFMAVMLGICILPQILYGMEAM R >gi|316921996|gb|ADCP01000133.1| GENE 149 164786 - 
164968 107 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNMTMTISHADDFTRSHPVLTGKRKGTTLRERLERIRDKSRKASVREAAEQWLRKTEVAA >gi|316921996|gb|ADCP01000133.1| GENE 150 165422 - 165664 335 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTTTAPYLVIGTSPFNGMDTVDECDSLEEAREFANEHGGKVIKASEYRPKGLLEDWLNNE PEDFGVRPGIDFPATLHRAG >gi|316921996|gb|ADCP01000133.1| GENE 151 165742 - 165927 122 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNAQEMERIEQHWTNGDNQGRIWSINGYGEVCAEYPGNWKYFSSVKGAERWMKKHGYELA A >gi|316921996|gb|ADCP01000133.1| GENE 152 166092 - 166379 87 95 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLTKDGMVIFDDAIRTQLAARQKELRLSDKALGQMAFPFMGNPLGKVRSILIAQGAGEDK KPQNLRMADLVNLCQALGLNLHDVIRMGLKQADGK >gi|316921996|gb|ADCP01000133.1| GENE 153 167523 - 167729 202 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKEYTQRGGIGLLGALGLMFVGLKLTGYIDWSWWWVTLPFWGGLAISVTLLVVSVLGLL FELKKGKV >gi|316921996|gb|ADCP01000133.1| GENE 154 167726 - 167887 65 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLNWRVAGAIACGLCAVGIDFTNMLMSVVSDLFFVSGVFVMLWPLLKKKEDSA >gi|316921996|gb|ADCP01000133.1| GENE 155 167880 - 168332 620 150 aa, chain - ## HITS:1 COG:Z2379 KEGG:ns NR:ns ## COG: Z2379 COG3793 # Protein_GI_number: 15801792 # Func_class: P Inorganic ion transport and metabolism # Function: Tellurite resistance protein # Organism: Escherichia coli O157:H7 EDL933 # 7 145 25 163 164 119 46.0 2e-27 MGFFSKMFGKNVQKGKAELAKVENRDLMQAIVGGALLVAYADGECEDAELAKLDKTINAL PELQHFGSEISETINMFRMQFETGFRIGRQKAMKEIEDLKASPDEKLLCFNVMVTIAESD GEIEPEEVKVLKEVASMLGINLRDYGLENA >gi|316921996|gb|ADCP01000133.1| GENE 156 168686 - 168922 56 78 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLAMPNLANAPQIFGSLYARNFYARAPLAGHVHLCESIPRNNDSSHFNLADGVGFEPTDA FTSAVFKTAAFILSAIRP Prediction of potential genes in microbial genomes Time: Fri May 13 04:24:15 2011 Seq name: gi|316921914|gb|ADCP01000134.1| Bilophila wadsworthia 3_1_6 cont1.134, whole genome shotgun sequence Length of sequence - 78669 bp Number of predicted genes - 107, with homology - 59 Number of transcription 
units - 48, operones - 24 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 102 86 ## - Prom 194 - 253 5.3 + Prom 568 - 627 1.9 2 2 Tu 1 . + CDS 654 - 998 74 ## 3 3 Op 1 . + CDS 1245 - 1688 387 ## DvMF_2842 hypothetical protein 4 3 Op 2 . + CDS 1691 - 2176 -62 ## 5 4 Op 1 . + CDS 2480 - 3526 199 ## ABSDF2498 hypothetical protein 6 4 Op 2 . + CDS 3417 - 3842 255 ## + Prom 3993 - 4052 4.8 7 5 Op 1 . + CDS 4251 - 4496 190 ## COG4570 Holliday junction resolvase 8 5 Op 2 . + CDS 4499 - 4696 206 ## + Term 4805 - 4843 2.2 + Prom 4782 - 4841 2.4 9 6 Op 1 . + CDS 4881 - 5102 141 ## LM5578_1345 hypothetical protein 10 6 Op 2 . + CDS 5090 - 5524 275 ## COG0756 dUTPase 11 6 Op 3 . + CDS 5527 - 6066 209 ## 12 6 Op 4 . + CDS 6115 - 6498 267 ## RB2501_01256 hypothetical protein 13 6 Op 5 . + CDS 6498 - 6788 284 ## 14 6 Op 6 . + CDS 6790 - 6990 58 ## + Term 7072 - 7122 -0.7 15 7 Tu 1 . + CDS 7129 - 7455 175 ## 16 8 Tu 1 . + CDS 7515 - 7931 191 ## + Term 7940 - 7971 -1.0 17 9 Op 1 . - CDS 8137 - 8850 500 ## 18 9 Op 2 . - CDS 8852 - 9559 360 ## Srot_2112 hypothetical protein - Prom 9689 - 9748 4.3 + Prom 9609 - 9668 4.4 19 10 Tu 1 . + CDS 9828 - 10022 129 ## 20 11 Op 1 . + CDS 10127 - 11389 345 ## Mmc1_1690 hypothetical protein 21 11 Op 2 . + CDS 11400 - 12857 836 ## BP3383 hypothetical protein + Term 12865 - 12898 2.0 22 12 Op 1 . + CDS 12943 - 13251 89 ## 23 12 Op 2 . + CDS 13255 - 13566 75 ## + Term 13641 - 13673 -0.2 + Prom 13758 - 13817 4.3 24 13 Tu 1 . + CDS 13867 - 14985 441 ## Msip34_1630 phage head morphogenesis protein, SPP1 gp7 family + Term 15094 - 15148 6.2 25 14 Op 1 . + CDS 15487 - 16314 675 ## MCR_0943 hypothetical protein 26 14 Op 2 . + CDS 16324 - 17352 679 ## Rru_A2587 hypothetical protein 27 14 Op 3 . + CDS 17364 - 17594 180 ## 28 15 Op 1 . + CDS 17726 - 18094 285 ## gi|153806874|ref|ZP_01959542.1| hypothetical protein BACCAC_01149 29 15 Op 2 . + CDS 18113 - 18508 158 ## 30 15 Op 3 . 
+ CDS 18515 - 18739 135 ## 31 15 Op 4 . + CDS 18742 - 19308 204 ## Msip34_1624 hypothetical protein 32 15 Op 5 . + CDS 19310 - 19756 357 ## 33 15 Op 6 . + CDS 19753 - 20148 107 ## 34 15 Op 7 . + CDS 20148 - 20660 291 ## TM1040_0810 hypothetical protein 35 15 Op 8 . + CDS 20657 - 21121 247 ## 36 15 Op 9 . + CDS 21125 - 22273 283 ## DVU1494 hypothetical protein 37 15 Op 10 . + CDS 22273 - 22638 136 ## 38 16 Tu 1 . - CDS 22837 - 23370 -280 ## + Prom 23269 - 23328 3.4 39 17 Op 1 . + CDS 23436 - 24272 682 ## 40 17 Op 2 . + CDS 24286 - 24645 306 ## 41 18 Op 1 . + CDS 24747 - 24983 115 ## 42 18 Op 2 . + CDS 24986 - 25528 260 ## 43 18 Op 3 . + CDS 25525 - 25758 132 ## 44 19 Tu 1 . + CDS 26149 - 26394 91 ## + Term 26395 - 26425 1.0 45 20 Tu 1 . + CDS 26575 - 27351 548 ## COG3645 Uncharacterized phage-encoded protein + Term 27373 - 27425 6.9 - Term 27365 - 27409 2.6 46 21 Tu 1 . - CDS 27531 - 28031 264 ## - Prom 28060 - 28119 7.0 47 22 Op 1 . - CDS 28136 - 28306 127 ## - Prom 28328 - 28387 5.0 - Term 28350 - 28383 1.1 48 22 Op 2 . - CDS 28403 - 29593 219 ## ACIAD2481 hypothetical protein - Prom 29751 - 29810 4.0 - Term 30017 - 30048 3.2 49 23 Op 1 9/0.000 - CDS 30068 - 30448 111 ## COG1598 Uncharacterized conserved protein 50 23 Op 2 . - CDS 30484 - 30687 123 ## COG1724 Predicted periplasmic or secreted lipoprotein - Prom 30890 - 30949 5.4 - Term 30933 - 30974 6.1 51 24 Tu 1 . - CDS 30977 - 31774 297 ## COG3646 Uncharacterized phage-encoded protein - Prom 31894 - 31953 7.4 - Term 31918 - 31963 14.6 52 25 Op 1 . - CDS 31985 - 32242 351 ## 53 25 Op 2 . - CDS 32245 - 33186 337 ## COG3617 Prophage antirepressor - Prom 33354 - 33413 5.0 - Term 33363 - 33400 1.0 54 26 Op 1 . - CDS 33513 - 34022 161 ## 55 26 Op 2 . - CDS 34024 - 34332 134 ## - Prom 34498 - 34557 3.1 - Term 34505 - 34556 6.1 56 27 Tu 1 . - CDS 34561 - 34869 86 ## - Prom 34946 - 35005 4.2 + Prom 34905 - 34964 3.6 57 28 Tu 1 . + CDS 34993 - 35136 98 ## + Prom 35458 - 35517 10.6 58 29 Tu 1 . 
+ CDS 35662 - 35907 79 ## + Term 35931 - 35966 5.5 59 30 Tu 1 . - CDS 37047 - 37541 -646 ## - Prom 37568 - 37627 4.6 + Prom 37253 - 37312 1.9 60 31 Op 1 . + CDS 37558 - 38730 679 ## COG5281 Phage-related minor tail protein 61 31 Op 2 . + CDS 38748 - 39470 165 ## + Prom 39475 - 39534 2.2 62 32 Tu 1 . + CDS 39726 - 39929 174 ## 63 33 Tu 1 . + CDS 40444 - 40863 330 ## DVU2154 tail assembly protein, putative + Term 40878 - 40916 2.1 + Prom 40949 - 41008 1.8 64 34 Tu 1 . + CDS 41048 - 45079 738 ## DVU2153 tail fiber protein, putative + Prom 45345 - 45404 4.1 65 35 Op 1 . + CDS 45465 - 45977 -184 ## 66 35 Op 2 . + CDS 45989 - 46519 98 ## 67 36 Tu 1 . - CDS 46298 - 46585 175 ## 68 37 Tu 1 . + CDS 46572 - 46802 91 ## + Term 46816 - 46855 5.0 - TRNA 46923 - 46999 88.8 # Arg CCT 0 0 - Term 46874 - 46911 5.1 69 38 Op 1 . - CDS 47042 - 48370 1369 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains 70 38 Op 2 6/0.000 - CDS 48391 - 49431 1147 ## COG4792 Type III secretory pathway, component EscU 71 38 Op 3 8/0.000 - CDS 49428 - 50249 986 ## COG4791 Type III secretory pathway, component EscT 72 38 Op 4 . - CDS 50366 - 50629 401 ## COG4794 Type III secretory pathway, component EscS 73 39 Op 1 . - CDS 50779 - 52866 2756 ## COG4789 Type III secretory pathway, component EscV 74 39 Op 2 . - CDS 52863 - 53270 489 ## Dvul_3007 tetratricopeptide TPR_4 75 39 Op 3 . - CDS 53272 - 53640 442 ## DvMF_2417 hypothetical protein 76 39 Op 4 . - CDS 53683 - 54057 468 ## LI0546 hypothetical protein 77 39 Op 5 . - CDS 54057 - 55181 1280 ## DvMF_2419 type III secretion regulator YopN/LcrE/InvE/MxiC + Prom 55496 - 55555 2.9 78 40 Op 1 . + CDS 55576 - 55866 486 ## 79 40 Op 2 . + CDS 55890 - 56450 539 ## Dvul_3002 hypothetical protein 80 40 Op 3 . + CDS 56476 - 57018 600 ## Dvul_3002 hypothetical protein 81 41 Op 1 . + CDS 57181 - 57678 601 ## COG0457 FOG: TPR repeat 82 41 Op 2 . 
+ CDS 57686 - 58696 947 ## Dvul_3000 type III secretion system target, YopB family + Term 58769 - 58799 1.0 83 42 Op 1 . + CDS 58890 - 59864 910 ## DVUA0111 type III secretion system protein IpaC family 84 42 Op 2 . + CDS 59873 - 60211 267 ## 85 42 Op 3 . + CDS 60222 - 60476 216 ## 86 42 Op 4 . + CDS 60527 - 60970 206 ## Dvul_2998 TIR chaperone family protein 87 42 Op 5 . + CDS 60886 - 62844 1719 ## COG1450 Type II secretory pathway, component PulD 88 42 Op 6 . + CDS 62889 - 64325 713 ## DvMF_2427 type III secretion apparatus protein, YscD/HrpQ family 89 42 Op 7 . + CDS 64318 - 64521 265 ## 90 42 Op 8 . + CDS 64550 - 64810 397 ## DvMF_2429 hypothetical protein + Term 64837 - 64867 -0.7 91 43 Op 1 . + CDS 64996 - 65370 428 ## Dvul_2993 hypothetical protein 92 43 Op 2 . + CDS 65397 - 65975 398 ## DvMF_2431 hypothetical protein 93 43 Op 3 . + CDS 66036 - 66812 658 ## COG4669 Type III secretory pathway, lipoprotein EscJ 94 43 Op 4 . + CDS 66891 - 67496 297 ## LI0542 hypothetical protein 95 43 Op 5 13/0.000 + CDS 67475 - 68095 665 ## COG1317 Flagellar biosynthesis/type III secretory pathway protein 96 43 Op 6 . + CDS 68339 - 69655 1385 ## COG1157 Flagellar biosynthesis/type III secretory pathway ATPase 97 43 Op 7 . + CDS 69658 - 70149 645 ## DvMF_2436 type III secretion YscO fmaily protein + Prom 70240 - 70299 1.8 98 44 Op 1 . + CDS 70388 - 71062 480 ## DVUA0120 type III secretion protein, putative 99 44 Op 2 . + CDS 71160 - 72221 972 ## DVUA0121 type III secretion system protein YopQ family 100 44 Op 3 . + CDS 72224 - 72874 653 ## COG4790 Type III secretory pathway, component EscR 101 44 Op 4 8/0.000 + CDS 72898 - 73233 251 ## COG1366 Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) 102 44 Op 5 . + CDS 73244 - 73660 538 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 103 44 Op 6 . 
+ CDS 73693 - 74343 687 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) - Term 74446 - 74496 10.9 104 45 Tu 1 . - CDS 74593 - 75666 1000 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain 105 46 Tu 1 . + CDS 75559 - 75873 112 ## + Term 75916 - 75960 2.1 - Term 75720 - 75768 9.0 106 47 Tu 1 . - CDS 75870 - 76688 931 ## Ddes_2150 split soret cytochrome c precursor - Prom 76867 - 76926 3.2 - Term 76966 - 76999 1.3 107 48 Tu 1 . - CDS 77025 - 78413 1931 ## COG0165 Argininosuccinate lyase Predicted protein(s) >gi|316921914|gb|ADCP01000134.1| GENE 1 3 - 102 86 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFKHLSSFAIKHSKRSSHTSRERERERERERER >gi|316921914|gb|ADCP01000134.1| GENE 2 654 - 998 74 114 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTSYCQLTKLYATPYDKIIYFYVVMWQNAVMKSLIINELRVFFAKHPQILPYHLAISAEV APSIVHNAIKGKREDMFSSTADRLRAAMRRLECELRPTATAETPNERKEVSCNG >gi|316921914|gb|ADCP01000134.1| GENE 3 1245 - 1688 387 147 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2842 NR:ns ## KEGG: DvMF_2842 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 9 82 5 78 98 78 50.0 9e-14 MNTQDYNTLTEVIEAMIDEGKKPIKAIAAEISKPYPTLKRELNPADDGAKLGADVLLGIM ASCGSIAPLEWLADRLGYVVKPKEWAEPDKPTWEGESVDDTICCGKMVMLMQEKAHPSIV SKAAEEWKDEIDQTNTRYRLDYNQARQ >gi|316921914|gb|ADCP01000134.1| GENE 4 1691 - 2176 -62 161 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQKCQLCGRWIRGKKADHVCPPPLDPRTCPICGQVFTPLKDSQICCSIKCGRIRQKRESY ERVRQFREKEKEDRETRRCAICEEEFIPRSHNQLVCEKQSCRAEYKKRYDAARYKGDFTP APWYGQFCMPDPYQGKRLYFDGLHSVRGSMPGRSADPVLGF >gi|316921914|gb|ADCP01000134.1| GENE 5 2480 - 3526 199 348 aa, chain + ## HITS:1 COG:no KEGG:ABSDF2498 NR:ns ## KEGG: ABSDF2498 # Name: not_defined # Def: hypothetical protein # Organism: A.baumannii_SDF # Pathway: not_defined # 1 157 1 149 309 99 39.0 2e-19 MSIDATRWAWSLQGIRPTQKLVLLSLADRAGENHVCWPSLQRLAFDTGLDVKTIRTCLID 
LAQAGIVSRREVRGRGYEYTLVGVEGRENQTILTDYRKTASKKPFSENATPTSFGMGIDN HQDFNDNFLDTPTSLGTPTQTGTGTDLGTPPLPVSVPHPYQFRYPTPTKTGTRIYQEPIK NRSENLEESVQTHTRTRSQKPKPQKLTFGEYANVKLTAEEHGKLIAAYGEDKTADAIAFL DIHLGARAGKDPYKSHYLALRKWVFDAVEERKAKKQAAPSGRTQAPMTARQAETAKRGEW AKQILKFDEVMKNGELATVGFGTEQGVCALSATDAGTGRVRAVGQDLE >gi|316921914|gb|ADCP01000134.1| GENE 6 3417 - 3842 255 141 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MASLQLLALELSKAFVLYRQPMPEREEFELLVKTWNEVLANVGDNEFRGAMRRVEAKSSF FPVPADIMRQVEEARKQIPTVSREALPETALTFDERCEEGADWCAKILANLRGKMDARKQ GRPDMPLNEQLANLRALGVEQ >gi|316921914|gb|ADCP01000134.1| GENE 7 4251 - 4496 190 81 aa, chain + ## HITS:1 COG:ECs1777 KEGG:ns NR:ns ## COG: ECs1777 COG4570 # Protein_GI_number: 15831031 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvase # Organism: Escherichia coli O157:H7 # 3 80 40 117 119 75 47.0 3e-14 MQIVGDEKKALKIDSRVKVNVVVCPPDRRKRDIDGYLKALLDSLTHAGVWLDDEQVDSIY ITRGEVVKGGKAVVEILPMEV >gi|316921914|gb|ADCP01000134.1| GENE 8 4499 - 4696 206 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAKNEHRLLCGWKTIIAYTGVSRLLMIRYAYPVHDCDRAANHGYGVCAYTDELDAHREVI KHGKA >gi|316921914|gb|ADCP01000134.1| GENE 9 4881 - 5102 141 73 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1345 NR:ns ## KEGG: LM5578_1345 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 4 69 111 179 179 61 47.0 9e-09 MIKNTEQPSYYSRFPIQPIDFISANKIDFLVGNVIKYVCRYDAKNGVEDLKKAKVYLEKL IERTEQEAKEWTA >gi|316921914|gb|ADCP01000134.1| GENE 10 5090 - 5524 275 144 aa, chain + ## HITS:1 COG:Cgl1859 KEGG:ns NR:ns ## COG: Cgl1859 COG0756 # Protein_GI_number: 19553109 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Corynebacterium glutamicum # 2 144 3 147 149 87 40.0 1e-17 MDSVTIKFKRLHPDAVTPKQGSEWAAGFDITAISRKWLPDEACYEYGTGLAIEVPKGFAA LLFPRSSIFRVPLQLSNSVGVIDADYRGEIKAKFRRTDGGEPLYQPGDRIGQLVIIPVPS VQYIEAKELSPSKRGTNGYGSTGR >gi|316921914|gb|ADCP01000134.1| GENE 11 5527 - 6066 209 179 aa, chain + ## HITS:0 
COG:no KEGG:no NR:no MRKRAKLDFERRGGLTAIDDVAFARILPIIHQYMPRIHLSMSAIEIRAHLDAILYLSIFY CPWRKIKNYRSIYRFYRRLRSRGALTIIQVKIGRSTPLEIIRPRKEARTESRKRKCPPCP KCHEDAMVCYHTKKKYDDDGNIRMIVRYSKCSACEHTSVFIETKNNKWWNNPNFVPTCC >gi|316921914|gb|ADCP01000134.1| GENE 12 6115 - 6498 267 127 aa, chain + ## HITS:1 COG:no KEGG:RB2501_01256 NR:ns ## KEGG: RB2501_01256 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 5 118 1 113 117 117 48.0 2e-25 MQISLRHFSPNEFRCKDGCGGGIEHMNQDLLMMLDEVRDRAGIPLVLSSAYRCPAHNQAV GGVDDSAHTRGYAVDIKCINSHTRFLILQAALEVGFRRIELAPTWVHLDNDPNKPQDVAF YQHGGKY >gi|316921914|gb|ADCP01000134.1| GENE 13 6498 - 6788 284 96 aa, chain + ## HITS:0 COG:no KEGG:no NR:no METTVIDFILSTLAQLSAQYPDAAWIITALSVVMTVCGLCAVATVWMPVPKETTGAYAAV YRWVHAFAAHFGQNRGAVADGKSETVKAEVKAVTGK >gi|316921914|gb|ADCP01000134.1| GENE 14 6790 - 6990 58 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRAVLEFLSSLAGLLKLWLRQRYGERREADRTAVRDDAGGEWVRSMGGTDRRDKPTPLDA GGRRDG >gi|316921914|gb|ADCP01000134.1| GENE 15 7129 - 7455 175 108 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLEKIIADFSRAIDLLSGGITWIILGAVGGVAVEWETARAYHREMRVSDILASWVIGICF GAMTWVCAANSSEGIRMVYTVSSAMFGHGCGPLIKRNIKGIIENLGKK >gi|316921914|gb|ADCP01000134.1| GENE 16 7515 - 7931 191 138 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDTYALQRVILKMDLLPGERYVGMVLALHLNQKTGTIQVRQKTLIDETGYSRNTVQKALQ RLIASGVFISQQTGRAAVLALGNNTGNMDTPNTGHQLPQKLGYRRKRKGAPFDLDTSLST RIEELNKRDEKRFQREHK >gi|316921914|gb|ADCP01000134.1| GENE 17 8137 - 8850 500 237 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLNKYDKPATMADRNTRHVTMKNPLDGLKYSGNVSADALKELEIVLKAFRASDAGENKG PYGIDSRFFFTIAFESFAQRESFRKALSLTRHGEQYWTGEAFMGSLELLKEKKGTSLSEY KRENPFAKREAPVAAKAEPTNAEKYSKLRAEVKRTQEKMQGYTEDRTWIAVCFPSEKDME AARKKLALPEGKFIHVEDVCNAVEKKFGISLDVPVVPFGLRAVAKPDKALLALVEDY >gi|316921914|gb|ADCP01000134.1| GENE 18 8852 - 9559 360 235 aa, chain - ## HITS:1 COG:no KEGG:Srot_2112 NR:ns ## KEGG: Srot_2112 # Name: not_defined # Def: hypothetical protein # Organism: S.rotundus # Pathway: not_defined # 
21 227 25 238 246 132 35.0 1e-29 MITTFDEMIAHVQEKTYENNVLLSFSGGKDAWGTWIAIRDHFNVTPFYYYIVPGLEIIDE YLSRCEKRVGKIRQYPHPMLYDMLTSCTAQAPQRCWTIEHLELPRFTKDDLHRCVENDCG FPEKSCYVALGLRAADSIMRGSYFKKHGPVDDKRKVFSPIWDWNKARLLEELKTDGVKLS KEYEFFGRTFDGPVLLYSWGLKKHSPRDYARLLEWFPMLEAEVWRYERNLANGGN >gi|316921914|gb|ADCP01000134.1| GENE 19 9828 - 10022 129 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIYGGDFEAIKYYLERKGKCRGYGGQDRFGKGPDDEVGMDKPESGVLVTPGLLSEDAWET AAQK >gi|316921914|gb|ADCP01000134.1| GENE 20 10127 - 11389 345 420 aa, chain + ## HITS:1 COG:no KEGG:Mmc1_1690 NR:ns ## KEGG: Mmc1_1690 # Name: not_defined # Def: hypothetical protein # Organism: Magnetococcus_MC1 # Pathway: not_defined # 11 407 28 408 445 340 45.0 6e-92 MDYIQFVGLGFGPSWKGIIFRQTYKQLEEIVSKSKRFYPAIVPGAQWRGSAMEWVFPKGE TLKLRHAKRPADMENYQGHEYPFVGFDELCNLPSQEVYEKAKGFCRSSDPNVPKMIRCTA NPLGPGHLWVKRYFIDPAPALTPIVESHGGKRVFIPATIYDNKNLVEADPDYLRRLESIA DDSLRRAWLNGDWNVVAGSFFGDVWSASRNIVKPFNIPRNWYCFRSFDWGSSHPFSVGWW AIADGTQAPDGRYYPRGAFIRFAEWYGAKRGSNGMVVPNEGLRMSSKHVAQGIRERERKM SDAMGIRINPGPADPAIYASTDGPSVADNMASEGIKWVRADNSRVTGWQQMRERIWQEGE EPMLYVFNTCTELIRTLPAAPRDEHIPDDIDTEFEDHALDECRYAVMFKKREVFFGSSMG >gi|316921914|gb|ADCP01000134.1| GENE 21 11400 - 12857 836 485 aa, chain + ## HITS:1 COG:no KEGG:BP3383 NR:ns ## KEGG: BP3383 # Name: not_defined # Def: hypothetical protein # Organism: B.pertussis # Pathway: not_defined # 40 456 37 451 472 86 24.0 2e-15 MADNTEYSKKHPDYSESYQRRGLALDLYEGGRRVEENPAYLIRHPYETQKQYDIRFQRAT YRNFAAPIVDVFASFINEGRPQRVLPAKLEDMQDNVDRLGTKANTFFADVTRLAAAGGIR FVQVDAEPQAGITQAEAEEAGRRQWPYFISLDPTDVWDWEIGADGLDWVVIHGAGMEGNA PFTKGTRYETLTVWTRTEWTRYRRETDSTAKAGTSLGWKEYDRGINPSGLVPIVPFTFED TAGSIMSGVPATDDVLSLVLRIYRRDSELDKMLFDRAVPLLNVNGLSKEDWADFVVGSSN ALMSTNPGGITAQYVESTGTSFSAQTEFLTRDENSVREIALRMIRPQSGVGESAESKQID RQQLDTQLANFARRCANAEAQCWKIAARWMRLDDSDISTPYTENYDVEAAGDAIVSALVS LNSQAIISKQTIRDTSAVKKMMPEGWKPEEEETRLQQELGSTRGASGTLRLPNILGGTGT QGMNI >gi|316921914|gb|ADCP01000134.1| GENE 22 12943 - 13251 89 102 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MKELTKVEREPYRKAIEDAVMNLCVACSNALAQGLDVDINIQHLTDFDPLKGPIRGITSN RIITANHKIPIEEIVSPYSPPSVKIDIEGCNRDDIDLREDAK >gi|316921914|gb|ADCP01000134.1| GENE 23 13255 - 13566 75 103 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNDIDLDELERLANGAVPGPWQVEESGGGVFSSNEHLPPKRAYGVYGDIPNFVCDFNDGE YHEYRDIEEQKNTAAYIAAMNPETTKALIARIRELEDENKKLR >gi|316921914|gb|ADCP01000134.1| GENE 24 13867 - 14985 441 372 aa, chain + ## HITS:1 COG:no KEGG:Msip34_1630 NR:ns ## KEGG: Msip34_1630 # Name: not_defined # Def: phage head morphogenesis protein, SPP1 gp7 family # Organism: Methylovorus_SIP3-4 # Pathway: not_defined # 19 342 28 324 622 105 28.0 3e-21 MTPQDLAELYLLSRSLWWRHELGLLSDEAVSELIKVLRSARAGIVAQIEAEAQGLASISE WTRERNDQISAWIDDVLAGTSASVTSYISEASVGVALASVATYNSILSFDGKAKAVKLVE GLTREQVKQFFQDQPLGGKLLSDWVGNAFSTGARDSILDAIREGVVMGEGYRKLVKRVMT AADTGFSITQREATTLVRTYVQSANTGAQEAVYEQNEGIIKGYKRVETLDNRTCRICALA DGTVYGKNEKRPELPAHPNCRGVYVPILKTFREIGLNIDELEEVARPWTIREQGPIGTGG RKILNFGTTKENFGGWWESLSAEDKLKTSVGPVRGRLLQEGLVRWEDLSDKRTGLPYTLE QLGFDEQGNPLR >gi|316921914|gb|ADCP01000134.1| GENE 25 15487 - 16314 675 275 aa, chain + ## HITS:1 COG:no KEGG:MCR_0943 NR:ns ## KEGG: MCR_0943 # Name: not_defined # Def: hypothetical protein # Organism: M.catarrhalis # Pathway: not_defined # 1 228 1 213 244 155 41.0 2e-36 MKLKLDDNGNAVLQEGIPVWVADDGKEIPYNVPDLVGKLSAVNAESAGRRKELDDLNAKF KLFEGLDPEKAKAALETVANLEAGKLIDAGKVEEFKAQVDQGWKVKLEDKDKAHVKEIEK LTEALDSKNAAIRNLVVRGAFEASTFLREKTTLPADMAYAAFGKHFEVKEENGELKAVAS LDGQPIFSRSNPGTFASPEEAIEALIEKYRYKERILRDTMPGGSGATKPSFVSAAKNPWA KESWNVGEQMKLFSVNPARAKALMQEAGVPIPAGL >gi|316921914|gb|ADCP01000134.1| GENE 26 16324 - 17352 679 342 aa, chain + ## HITS:1 COG:no KEGG:Rru_A2587 NR:ns ## KEGG: Rru_A2587 # Name: not_defined # Def: hypothetical protein # Organism: R.rubrum # Pathway: not_defined # 10 342 5 337 337 197 38.0 6e-49 MAVQSGSQVSTRLSDVPIVPEIATAAIILRSLNSNAFVNSGVMIRDPEADAFLTNNLGGK TFAPRYLGPLADDEPNISSDDPSEKSTPKKITGGKNKAVRQSLNQSWSSMDLTNTYLGLD ITTAITNQIGDYWNTQENKRLLASLKGIIAADLATGSPVMTVDVTGKTGADALFNAEAFI 
DAQTTMGDMASSLTAVAVHSTVYATMKKLNLIDFIPASEGRVEIPTYQGLTVIQDDAMTY VPAVTGDSPSPAKYYTYLFGRGAVALGVGTPKTPFAIHRDEAAGNGGGEEIVHSRLEWII HPQGFSFGLEETPTLAQLETASNWTRQYERKRIALAALITQG >gi|316921914|gb|ADCP01000134.1| GENE 27 17364 - 17594 180 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAKEKESPEQPVEEKIPPQQSKESGKAERMSERTPNQQNMYISELLRKKAQGQSMDALRG ENQMLKAKLKRLESGQ >gi|316921914|gb|ADCP01000134.1| GENE 28 17726 - 18094 285 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|153806874|ref|ZP_01959542.1| ## NR: gi|153806874|ref|ZP_01959542.1| hypothetical protein BACCAC_01149 [Bacteroides caccae ATCC 43185] # 6 122 3 120 120 79 38.0 5e-14 MSYETVELNVTKLSENKYRDGNKVYIVANLIERAKELEPFDLPLIALNASSNVWHPVSCA YHLARKMKRVQQADLNCPIILDESGFVMDGWHRIAKALFEGRETIKAVRFKETPPCDYVE EE >gi|316921914|gb|ADCP01000134.1| GENE 29 18113 - 18508 158 131 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIYETNEKHFSVFKEECIKWIKFFGITEWEIFFVHETYEDGCRAACYTKHLHKIAHLALN KRQEVCEPITDESVKKSAFHEVCELLLSSITFIALDEEIPHHERQGLTESETHEVIRRME NSIFNHVELEG >gi|316921914|gb|ADCP01000134.1| GENE 30 18515 - 18739 135 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEDKNLAPGTVEINIGGCSDKGHPVPVKSLTVNGRSIPFLMDSDFILRIRGGGPAEIQTT LYAERFITAPWSKE >gi|316921914|gb|ADCP01000134.1| GENE 31 18742 - 19308 204 188 aa, chain + ## HITS:1 COG:no KEGG:Msip34_1624 NR:ns ## KEGG: Msip34_1624 # Name: not_defined # Def: hypothetical protein # Organism: Methylovorus_SIP3-4 # Pathway: not_defined # 1 175 1 161 167 65 33.0 1e-09 MPLIVEDGTMPEGANTYASVADADAYLLSRGVSDWAAPPSSDLEPDPQLSAKEAALIRSA DYLNGLKWKGEKIEYDWPMAWPRAGVPTGVKNIHGVMQFVACDTVPSAVKRACIELAALF IAGEDPLAPIERGGRVASETVGPISTSYFDDAASETLYPAVSGLVWAFLREIPGQSGSIS GFAETGRA >gi|316921914|gb|ADCP01000134.1| GENE 32 19310 - 19756 357 148 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGSCIYHPELSQVLSHTHIPVEGGPKYWTVYLWTNGKALQKARDCHGDKCSTAVKGLTCP ESFTLRRDGTVDASDELGEIHFAVGEWDEEIVSHECQHASQQCRRVLQINPDASVDEEER LCYLTGKLTKAVYRWLWRLDPDMWWVYR >gi|316921914|gb|ADCP01000134.1| GENE 33 19753 - 20148 107 131 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MNPAALAKKADKQFRKNGQAMTFIQENESEQPNPDTGQPEVDTIPTEFYGLWDTISLQEI GTLIQVGDAVIWAGGLAIPKPDNTDWIDVDGEAWNIVDVEPTKPGSIPIVYKIYVRSAGS SKGYKRLGRKA >gi|316921914|gb|ADCP01000134.1| GENE 34 20148 - 20660 291 170 aa, chain + ## HITS:1 COG:no KEGG:TM1040_0810 NR:ns ## KEGG: TM1040_0810 # Name: not_defined # Def: hypothetical protein # Organism: Silicibacter_TM1040 # Pathway: not_defined # 49 165 20 125 133 77 42.0 2e-13 MPVEAYEYRIQEIRRKIKELDSVMTDDVNKFEKILQEQVRLTIEGEALLIVKKVISEVFV RIVLRTPVDTGRARASWQFGVGTAPSGVAPDKEYPELKDKEISETQVRAAVASALEEISV APASVWFISNNLEYIEALEAGWSKKQAPAGMVSLTLREMTRQLEQELGKA >gi|316921914|gb|ADCP01000134.1| GENE 35 20657 - 21121 247 154 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIRLDLIRKAFDNAMLAQMKELLPGTPTPFVWLGLNVSSTPDLSKLHIITGFNPGESSTQ ELGGPGSLAIRDGVYIITQSLPQGMNVDEAWRIASGLESYFWQFIRETPFYTGECCVWLD VPSTANRGTDPDQGRYLISTTIPWYTAYSGGAKE >gi|316921914|gb|ADCP01000134.1| GENE 36 21125 - 22273 283 382 aa, chain + ## HITS:1 COG:no KEGG:DVU1494 NR:ns ## KEGG: DVU1494 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 14 376 14 385 387 144 28.0 4e-33 MSQYCPTVQKSNVQRAFAMIETVPGELEKPTASGFVAPAGRGSISQAPTYTDSPELSNTL DVVSQSRDAMPPGDWSLPMVARLAANAGAPQGDAMFHAALGKLDTSVSGKRAYKLDTCRP TLSLWLQNDETVQFMSGCTVESLEWAVDRDGLTIFTFSGRGRRAGIVGVGELAEAPTSAT VKLETGQSMAFSIGGYITNRTKQDGTEYQITAIDTATDTIVLDSAPAEWVAGEEIGPWLP VADSIGKEVENNSVILLIDDVEGKMRPSTFTASLPTQFLEEIGDQYPGESADNKRSITMD MSVYFRRAEAVRFGQALEGKTLSVKLKAKNENGSIEVAMPRVRISSPTIGEDDAVLTLDS SGTALGLTGEDSFTITITKAGV >gi|316921914|gb|ADCP01000134.1| GENE 37 22273 - 22638 136 121 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSLTLLTKKFASCTFRLDLTADGSAYFVCKPIVGSKQNEIAKKVMAEYAFDAQIAAFKIL PALLEVHIVGWEGLQDVSGFPIPYSKEMLLELCEHDYEFMEMQLNRIRRIAREGRLEEEK N >gi|316921914|gb|ADCP01000134.1| GENE 38 22837 - 23370 -280 177 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDNYLPIHRIWIGVSDKNKTDCPELWSTSLPGYQAQSEPRRFDISISNAVRLQDNALDTS HQDAFRNNGENRYRKQDKSGGDAAYDNHTPNRTPKLDRDRGYPIRGPRFPYRAVIRRWYA 
GSRGSYQLPLIFWTSFSLLFRVFSPVLPETACPTGKYAKEVLLNQRNGHIRRGGLHI >gi|316921914|gb|ADCP01000134.1| GENE 39 23436 - 24272 682 278 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGMDGLVPTFSVNGDGNGYNGGGWGDAVGAFAGALIGSWWGNGWGGGFGNGGGNGGCCGS APSAPSVVIAGGAGGCGSTAELDALTGLQASVNSLGLQSLQGQNTTNMAMCQGFSGVVAA DNANTASLMNATTQGFAGLNTAIMTGDQGIQQSLCQGFNGLNTAVLVSSKDAALQNCQST NQLTNAIDSCCCTTQRTIAAEGSATRALVSQLDRERLLTQICDLKSENSSLKNQNFTTAA ISALGSQLRTEMQQNTLNIIGHMAVIPRPTSGTATAAA >gi|316921914|gb|ADCP01000134.1| GENE 40 24286 - 24645 306 119 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAMRYMMGSERGDGYSRSGSRSSHGGEYGGGRYEARRLPPRNRYGEFRRRRSRREYNDGM GRYEFGDDYENRHYREDDWDRYGYDERREPYFEDEERERRRERHRRMMDEYGDDGDGYD >gi|316921914|gb|ADCP01000134.1| GENE 41 24747 - 24983 115 78 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDNPPPTWPPYLAKGEYWGIVEMEYGEAKKAVEGFKNGSMSICDAIKELGHVASAAAQAR AHLMKLSEMQDLPEKVKV >gi|316921914|gb|ADCP01000134.1| GENE 42 24986 - 25528 260 180 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSNKLPLPARFSDDAMSYSLVYIPGLPADTDPRYSVEPNVTSMRPIGRNVQSVNITGKGK MIFTWVLMEQPAKPGTTTPARQYLFFRLSDKPIPLDLLKPGDMICYAATPKHVTTDPAQC FVDSLFGQVSADAQAERPRDTWASELFQPPQEQEETETAPQSEMRISEPNGQENHQEEGE >gi|316921914|gb|ADCP01000134.1| GENE 43 25525 - 25758 132 77 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSEETTVDRVFRRIEQDVERILDREPSSRGVLNQVIQEGAPMFMAYKMRGRVPSVDALER VATKLLAIVVATEEARR >gi|316921914|gb|ADCP01000134.1| GENE 44 26149 - 26394 91 81 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQGRYQANGGGMTQDERKLVEVLLSIDAALPHDFPLPEDAPRPAWCHLFMLHFGTEKGEC CGICPELSEVMVQALDRAYEK >gi|316921914|gb|ADCP01000134.1| GENE 45 26575 - 27351 548 258 aa, chain + ## HITS:1 COG:SA1801_2 KEGG:ns NR:ns ## COG: SA1801_2 COG3645 # Protein_GI_number: 15927569 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Staphylococcus aureus N315 # 140 250 9 120 126 122 56.0 5e-28 MSEMQIFEKAEFGKVRVVERDGQPWFVAKDVCECLELTDVSKTISLLDDDEKGTNSIRTP GGEQQMLVVSEPGLYSLILRSRKPEAKAFKRWVTHDVIPSIRKRGLYATPQTVEAMLADP 
DTAIKLLTSLKEERAKSAALAAKVEQDAPKVLFADSVAASRSSILIGDLAKLLVQNGVKI GQNRLFVYLREKGFLIQSGSRKNTPTQRSMEMGLFEVKEYLVHNPDGSTRTRFTTKVTGK GQLYFVKKFLSANTPVAA >gi|316921914|gb|ADCP01000134.1| GENE 46 27531 - 28031 264 166 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLYHGSMIEHLAISNSGTGLGYNFGAMFFARTYGTAKGYGNYVYQCEIDIKDIFLNEDL PYLEDGAAGTALREVMAERGIDEKYFDLCWYAVVEEKIGYQDEDWVNLLNMDDDDASWEA QAMRIAFARKLGFNGCDMNDECGSIALLPEYIALNPASPEDDEDNE >gi|316921914|gb|ADCP01000134.1| GENE 47 28136 - 28306 127 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPLEKKKTVGRPPRADKTQRVTIYLPQALAARVKLEAETRKQTLSAFFETLLEKNI >gi|316921914|gb|ADCP01000134.1| GENE 48 28403 - 29593 219 396 aa, chain - ## HITS:1 COG:no KEGG:ACIAD2481 NR:ns ## KEGG: ACIAD2481 # Name: not_defined # Def: hypothetical protein # Organism: Acinetobacter_ADP1 # Pathway: not_defined # 141 392 3 288 294 126 31.0 2e-27 MYNALENGTVEEQEEIEGELLRVRKIPFIKGGGKSIDRRIRQLLLPQKSAATGYVSVSPL TAIGLSALLFGPSGLVSKHNDSLNHPDALRIRRAHLAFGGANPHNLGYFSARWMMQYPIL LSAPDCEEGMPERRNPSKRGRYLILPNLRVQCANILTNQILVNGPPISAAWGMGHALERE MGRRIEGVCLVMHYVEPLGEREYGAFEPSQKRGAAFTFEKSRNGSDYTKGTINLSLQPGV CAHMRVSVVYELSRDLTSLPRAVEAFLATGRFAGGLITSYGKPDLHDDRDTLLECLPVGR VIVDRRDLMASGNPLENLVNAIGYRHKQEWLSATNIGYSAITDFGLRGGARDGHLHAFAE PLIGIVEYVNTLDRAQSYFWHDRWLDDSFLLEGGQD >gi|316921914|gb|ADCP01000134.1| GENE 49 30068 - 30448 111 126 aa, chain - ## HITS:1 COG:SMc04269 KEGG:ns NR:ns ## COG: SMc04269 COG1598 # Protein_GI_number: 15965793 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 3 122 4 124 153 71 39.0 3e-13 MKYIAIIHKDPESAYGVTLPDFPGCFSGADTLDEIPGNIQEAVELWAEGEDITPPVPSSF ESVAQLDSAKGGMLMLVDVNFDFLDQNIVPVNISMPVYMRNLIDREAKARGLTRSAFLVK AAQAYA >gi|316921914|gb|ADCP01000134.1| GENE 50 30484 - 30687 123 67 aa, chain - ## HITS:1 COG:CC3184 KEGG:ns NR:ns ## COG: CC3184 COG1724 # Protein_GI_number: 16127414 # Func_class: N Cell motility # Function: Predicted periplasmic or secreted lipoprotein # Organism: Caulobacter vibrioides # 8 67 1 61 62 60 54.0 
8e-10 MLRIKNDMNSREVLKRLEKEGFVRVSQRGSHMKLRHEDGRVVIVPHPKKDLPPGTLHNIE EQSKIKF >gi|316921914|gb|ADCP01000134.1| GENE 51 30977 - 31774 297 265 aa, chain - ## HITS:1 COG:PM1774 KEGG:ns NR:ns ## COG: PM1774 COG3646 # Protein_GI_number: 15603639 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Pasteurella multocida # 21 116 40 129 239 80 45.0 2e-15 MVETGNGLMIQIVNGKPSVTSLQVAHTFEKEHKNVLRDIEQVLSQVSENFGKLNFELSEY LTQNNLGFEVRKPMYLLSKDGFTMLTMGYAGEKAMRFKEAYIARFNEMERALQVNAPTNL PEALRAYANMLEENELIRRQRDEAIRTKAEIGCRREATAMATASAAVRKVNTLEDKLGFG LHYKQVKGIPWLLDEFDESRAMFQQVGKKLKSLSTRMGYEVKVVPCPEYPEGVKAYHVDV VNAFLAELRQDLNMLGKYRKQRRAA >gi|316921914|gb|ADCP01000134.1| GENE 52 31985 - 32242 351 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MERTYVTTSLEVARVANRKHGQVLRDIDRLRVILPDTKAPEFEAARRTDAKGNSLRYYLL TPYALALLDVGQGKSALRWKGEKLL >gi|316921914|gb|ADCP01000134.1| GENE 53 32245 - 33186 337 313 aa, chain - ## HITS:1 COG:YPO2093 KEGG:ns NR:ns ## COG: YPO2093 COG3617 # Protein_GI_number: 16122332 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Yersinia pestis # 2 131 12 137 187 108 42.0 1e-23 MFSPVQHQSTLWVRAAELARALGYAQENSVSRIYRSNADEFTPDMTQVIEITAQSRNSSS QKIVDGRCRIFSLRGCHLLAMFARTPVAKEFRKWCLDVIEKYGEQFPIDHPVTLNDAPIS PEQRAELKLIVDSKAGMVPKAVQRRAYKEIWTRFNRHFHIAEYKQIPCSRMDEARDFLLS MQVNAGKIEALPQAALPSPCAAHPVIDEKPFLEFIEEIQAAHEEVDRILSRLHHSVFTLS WKVAEALERRADIRLYLVPDMLSENTLRGYLHQSVYHDVKKALTAPGRQLEQYSNPGYSL LGIVRQLNGNGRA >gi|316921914|gb|ADCP01000134.1| GENE 54 33513 - 34022 161 169 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKTYMIITNPKNEDYILIKKYIEKNKKINSWFSMWKFVIFTSDMDIKEVQDDFTKSFDGM LFFVSEIKPKNICGSMPSDLWDFIIKTNEHNDKTEDIENKQTLENKTAYALNNKNIIAAN FKKELYNKDQHIMMLTMELQRCHKENKIYQEKLQLHKEMTEKFLKRQPK >gi|316921914|gb|ADCP01000134.1| GENE 55 34024 - 34332 134 102 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNPILWLAGIGCFFSLVGLFSGYFLFNEKIVTWFFMGCFGFCLLVSFLIAIYYAVYHPER LRSEDYQIKKEAIDMIRYELGNKEEVTEYAARIISTNPMLDK >gi|316921914|gb|ADCP01000134.1| GENE 56 34561 - 
34869 86 102 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHNGQEQGINAAPTANFNGGMGVTATTGPMMTNITTSPAIYQLFFARLDNLLGKSPNWIT CKEANKLFTDGNLDPFCTVANDRTPDGKSAVQRNRELEQANQ >gi|316921914|gb|ADCP01000134.1| GENE 57 34993 - 35136 98 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRNIVVIISKPENSIRSKQYFFRCFSLVYTGSHLDLPFSFGSDFLWG >gi|316921914|gb|ADCP01000134.1| GENE 58 35662 - 35907 79 81 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLINIRYDRKISIKDAMESSEEFARITARLIAKKDGEINKYISVFPGYMTQGETGKSILI TYKGYAEYSPYSDSISWINEQ >gi|316921914|gb|ADCP01000134.1| GENE 59 37047 - 37541 -646 164 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAFLAISSSFLTRSIPKSSPEPRIFSISARTPSSARFADSILLFSESVEVVLVPVPPPLE RAVFLADSAASFLNFSTRLVYSLKSASGLETISRPFVISFQSIEKDSLIFFDQFLPISPA LLSEEEAPSKSNVITLYRASPKRFICSLPTITVSTKAIIPSTAI >gi|316921914|gb|ADCP01000134.1| GENE 60 37558 - 38730 679 390 aa, chain + ## HITS:1 COG:STM1041 KEGG:ns NR:ns ## COG: STM1041 COG5281 # Protein_GI_number: 16764401 # Func_class: S Function unknown # Function: Phage-related minor tail protein # Organism: Salmonella typhimurium LT2 # 19 324 645 940 995 103 30.0 5e-22 MSEMNALQNEYAQAATAQHYQKIADAIRDVDIQIANLNGDTAKARILEVNKQVEELTRRL TELGEAPDRVTKKAQELRAALNLQSQIKDAQAAADFYRELAQLSGNYGRSQEYVDQLLSL QADNLIRNVGITRELADEWLRLQKIQNSREAWAGAYRATQEYFSEATNLAQGFENLTTNA FSSMEDGIVRFTMTAKLSVKDMVNSITSDLVRMMARSSLTGPLAGALGSGLSNLFSDWST NIKFNNLADSIVAGATPSAHGNILSGAGISSYSNSIVSEPTLFGFDRLTPFARGGIMGEA GPEAVMPLVRTSGGNLGVRAEGTGKQIVNVTVINNAGAQVETSQRENEDGSLDVEIVIDQ MVTRSMVKRSGGASNVLRKGYGIGMIPISR >gi|316921914|gb|ADCP01000134.1| GENE 61 38748 - 39470 165 240 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSDFSYQPSSGPLSGIDFERQTTQFFQQVSAAAGVASTAASAAQTTANEALERAQASNLV DVKTTTEAGGVITVKDVAIGGDLGDLASARGIFDTLTKGSVDCNTLTDQGVYAISLVETV NGPGFSAKLIVFNGKNSKITNQMALAIGAGTSGAVRVAYRARNSEEIWSTWSEGILSGRI GDGLTVNNGIISVPEYEGATASAAGTSGLVPPAAAGQATYVLCGDGEWRDIATLVAAAQA >gi|316921914|gb|ADCP01000134.1| GENE 62 39726 - 39929 174 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MTLDQWAQFRQFWKVGLNGGVIPFNYFDPDLNEFFDVRFDPSASEDFSVKERGPLHREVS MTWEVLP >gi|316921914|gb|ADCP01000134.1| GENE 63 40444 - 40863 330 139 aa, chain + ## HITS:1 COG:no KEGG:DVU2154 NR:ns ## KEGG: DVU2154 # Name: not_defined # Def: tail assembly protein, putative # Organism: D.vulgaris # Pathway: not_defined # 7 111 11 113 141 70 39.0 2e-11 MSRPDNWLDRYIGLPWKIGGRELHGGIDCWGLVRLVMRDEAGIDMPSWVDDPDAHGETCR SRSRAFDRHLDHFFRVPAGEEQLFDIVTFFIGPALWHVGVLVQLPHTMLHIEGPEGSQCE DWTSRADLKKQFGGFWRVR >gi|316921914|gb|ADCP01000134.1| GENE 64 41048 - 45079 738 1343 aa, chain + ## HITS:1 COG:no KEGG:DVU2153 NR:ns ## KEGG: DVU2153 # Name: not_defined # Def: tail fiber protein, putative # Organism: D.vulgaris # Pathway: not_defined # 79 1342 150 1342 1346 454 29.0 1e-125 MWLNGVSIPVWAHDKIRVQHGDRIELIVCPRGSGFNSFLKLVGAVVGIAIAAWAPFTGFA GAAFVNGLFAAGATLGISLLADAIAPIRPPKYDTKESLSASPTYNLNASSNAMGFAQPIR RLYGRHLVPPTRIMDDYTEISGSDEYLYFLGVIGIGAYQLEDPQIGDTDLDLYKDVFVQY CYGHKGEAIPNIFPSVVYQESLSIQLLSTDSLGNQLDSGNWWVTRSTRAGVDEIGMDIQA PQGLYELDDKANRKAISRSVQIRYRTKGGVDTPVGEWHGLTSQVTAQSVSFPSAIWPAYT ETVGDRDAYPYTVHHPAEPYSAPSLVYIDQSGVIGVANWKVKKHEIFGTVSVSRREFDNP TLSSSFVTLIARVTITGDSISSFTPEPILNGTGITITKSGPLTLSFSAGTFSTAYTFVTL TGHSAEPDKSVRKGIKWRVPYSATGYDIQIRRHEQESTSSRVVDALYWSAIRAADTTQKP YAGDVPIAMMGMKIRASNQLSGSLPLVTLIATSELADWNREAKNWNTVRPTRNPASAFLD VLRGTANLRPRKDEEIDFDSIQRWHEWCDDNGYLFDGVYDTQSKVWDALSDIAAAGRGSP CLIDGKFGVVWQHEQPGQPVGHITVRNSRDFKSSREYKELIHGVKIKFVSEELRYGQDEL TVYAPGYDEHTATLFESLELFGTTTPDLAAMRGRWYLMQALLLIEQYSVTMPLEYLRLKK GDKIRVLRPEVLYGIGSAALAGWTLDDAGNVTAIRWDAELQMTPGQRYAAIVRLGDGSEW TVTGTSDDGRTMRIDDPLPVPVPAPRKADLVMVGTVNDVGRMCLVTSIKPGSNLTATVGL CDYVEELYQKDGQIPDYTPSITLPGTGPLRKAKTPIITSIRSDESVLTYSSGRGSQPRIY LEWQFTDDSTTAQVRYRQTGTDVWSYVSALQHRATECYISGVKEGWDIVEGAKIQNGITY DVQVRAVNTIGWASDWASISGHAVVGRTTPPLPPDAVGLDGYILTILMSDKPLDVVGFEV FIAMDDTDTFSMSLKVTSPYTADGKFDLKPWAGHARRYYVRSIDEVGLTSDLVSGIIDFG DVKPDNILVEYSQKDQNWPGTLVNGYIGEKDRLYASDETFVFASTTEDFVFADTAKSLVF PTSSGAPLVYTFSLSIPRDMVGARVLIVPDIVQGTVQSIEWRKYTIPYLFSSGDAYIFPA 
QDMDFVFPTIVPAEWEATPTNYISPGNEVLDVRVTFTPGESLAIVDDIRIVLDVPDVVFP VNNIRVPASGIRLPIPQKTFRTILGVTFGVEAVSGLTAEKTVTLDKNGARDESGYLLEGP LVVGIDASGNYAEIQIDAVISGY >gi|316921914|gb|ADCP01000134.1| GENE 65 45465 - 45977 -184 170 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSCSIRLKLISTESITYWQELPTAVKISDSVALEDSAIAASSLAAKTAYDRGSAGIAAAE TAQASVVTLRSDVPAIAAKAAAPSQKYIDYAAPGNGVSFVAPETGYMNIRMENAPCSIYA GTIGIAISPPYDGANAEGFIPVEKGENVVCYYSGSMTRFRFVYANGNAPA >gi|316921914|gb|ADCP01000134.1| GENE 66 45989 - 46519 98 176 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKHYVTTDSTGRITACSPWPFPGASLCQCEVVSYPDGKFYRSDALPQVYAIPGSAEQYVG ECPAGGVMMSGPRPEDVRDVSGVVTETWVASQQGSWEKISTASSRIREERDRRLEESTWI VERHRDQLASGEATTLTEGQYQAWLGYRQALRDIPQQLGFPWNGPDDPACPWPVEP >gi|316921914|gb|ADCP01000134.1| GENE 67 46298 - 46585 175 95 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGSGMAFLLFSKESSHAYDFCKLWFYRPRASRIIRTIPREAQLLRDVSQGLTVAEPCLIL AFRECRGLAACELIAVAFDNPCAFFKAAVTLFTYA >gi|316921914|gb|ADCP01000134.1| GENE 68 46572 - 46802 91 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPLPMPEEEENLRCPHGERVQALMLCSLDCAVSGAHPDRKAKADGRFIITPWWCLHKCRW LPEHRDEIRFVAKKPD >gi|316921914|gb|ADCP01000134.1| GENE 69 47042 - 48370 1369 442 aa, chain - ## HITS:1 COG:aq_1792 KEGG:ns NR:ns ## COG: aq_1792 COG3604 # Protein_GI_number: 15606847 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Aquifex aeolicus # 80 435 165 497 497 252 42.0 1e-66 MKQEPLSEQAFLDTLINLCDDLAWGRPASEDRLFALTKEGAGPKNYVRLAEAFGMMLVKV ESREYHRSQLIADLKTRNAELEEAHRLLAERNTHLMRTIQENYQTKKIVGQCEAMRKVIQ LALTIARRPINTLILGPTGAGKEVIAKAIHFNSPRREGPFVAVNCTAIPDSLFESEMFGI EKGIATGVNARKGLIEEASGGTLFLDELADMTLPNQAKLLRVLEEREVLRVGSSKPVSVD IKLIAATNANLEEAVRKGNFREDLYYRINVAEIRVPPLRDRGDDILLLAQLFLERHCAYM GRPRLSLSPAVCRQLLQYGWPGNVRELNNEMERAASLTIGNRVELSDLSTRIVPDEVRCH LLETEPLISRDGADTPREETPTAQNYNLQEMERKLILDVLNKTGGNKSKAAELLGITREG LRKKLLRMGINDKELPHDDGKA >gi|316921914|gb|ADCP01000134.1| GENE 70 48391 - 49431 1147 346 
aa, chain - ## HITS:1 COG:YPCD1.47 KEGG:ns NR:ns ## COG: YPCD1.47 COG4792 # Protein_GI_number: 16082735 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, component EscU # Organism: Yersinia pestis # 3 346 4 349 354 239 36.0 7e-63 MSDKTEQPTPKRLRESREKGDVCKSQDVPSALTVLALSVYIVVMGGHLLESLLAMMELPM KLLSTPYAEAAPRAAEATFHIAVSIVAPLVGLVMAVALVANLGQVGILFAFKGAMPKLEN VSPSKWFQKVFSMKNLVEFLKNIIKVTVLGVTVWVVMRDHLPTLFAIQRGTIWTMWEVLG MAVKDLLLMAAGVFCVIAAVDYLFQKWQYTKNHMMSKDEVKREYKEMEGDPQVKGKRKQL HQEMLSQNALGNVRKAKVLVTNPTHYAVALDYEKDRTPLPVILAKGEGFLAQRMIRVAQE EGIPIMRNVPLARSLFENGTENAYIPKDLIGPVAEVLRWVQSLQRQ >gi|316921914|gb|ADCP01000134.1| GENE 71 49428 - 50249 986 273 aa, chain - ## HITS:1 COG:YPCD1.46 KEGG:ns NR:ns ## COG: YPCD1.46 COG4791 # Protein_GI_number: 16082734 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, component EscT # Organism: Yersinia pestis # 32 255 26 242 261 125 38.0 9e-29 MDYQALFNELHLFEQLLALLVGMPRLFMLVQVAPFMGGNILTGQLRTTVVFACYLILHPA IVASLPETQGLSPSTFALYGAILVKETLIGMLLGYLSGMLFWTIQCAGFFIDNQRGASMA EGADPLSGEQTSPLGSFFFQSAVYLFFSTGAFLALLGVVYASYEIWPVTQLIPLSVFKDI NLPLFFAGRVSWLLLMMLLLSGPIVVACLLTDVSLGLINRFASQLNVYVLAMPIKSGVAA FLMLFYFMMLLSNATGLFDGVKADFEQLRRLLP >gi|316921914|gb|ADCP01000134.1| GENE 72 50366 - 50629 401 87 aa, chain - ## HITS:1 COG:CPn0824 KEGG:ns NR:ns ## COG: CPn0824 COG4794 # Protein_GI_number: 15618733 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, component EscS # Organism: Chlamydophila pneumoniae CWL029 # 2 84 10 92 95 72 50.0 2e-13 MESAAIDYAMKALYLVLMLSMPPIIVASLIGILLSLIQAITQLQEQTLTFGVKLLAVVLT LFIMGGWLSGEILRYADDIFSRFYMIR >gi|316921914|gb|ADCP01000134.1| GENE 73 50779 - 52866 2756 695 aa, chain - ## HITS:1 COG:YPCD1.34c KEGG:ns NR:ns ## COG: YPCD1.34c COG4789 # Protein_GI_number: 16082722 # Func_class: U Intracellular trafficking, secretion, and vesicular transport 
# Function: Type III secretory pathway, component EscV # Organism: Yersinia pestis # 14 695 12 704 704 694 52.0 0 MSLFTTARSAVGVVTRHNDINIVFLLVTVIALMIIPLPTPVVDTLIGVNMALSFTILMMT MYVRTVLDFSVFPTLLLFTTLFRVGLNITTTRLILLQADAGEIIFTFGEFALGGNFVVGA VVFLILTIVQFLVIAKGAERVAEVGARFTLDAMPGKQMSIDADMRAGVIDMEEAQRRRER ISQESQMYGAMDGAMKFVKGDSIAGMIIAVVNIVGGTIIGITQNGMTAGDALHTYGILTI GDGLVSQIPSLLISISAGILITRTGDSDDNVGSQIGLQIFAQPKALLMAGGLVFLFALIP GFPKPQLFTLAALLGGLGYVLKRVNETPQAPDAKEELSKSLTPTARPKSRPGAARDEFAP TVPIILDISEDMGASLDYASLNVELANLRRALYFDLGVPFPGINIRPNPGLPELSYVLNV NEIPMSRGKLEKGMVLARDTSENLSMLGVEFKLGERFLPDVEPLWVPESKAASLERVGIS IMNHARILAYHLSLLLARHASSFLGMQESKYLLDKMEERAPDLVREATRLLPTQRIAEIF QRLVQEQISIRDLRSILEALIEWSPKEKDTVMLTEYVRSALKRQISYMYSKGQNMLPAIL LDPSVEETIRKAIRQTSAGAFLALDPDTTQRFLRAVTESAGKYTANTQKPVLMASMDIRR YVRRLIEGEHYGLPVVSYQEVTPEISIQPVSRIRL >gi|316921914|gb|ADCP01000134.1| GENE 74 52863 - 53270 489 135 aa, chain - ## HITS:1 COG:no KEGG:Dvul_3007 NR:ns ## KEGG: Dvul_3007 # Name: not_defined # Def: tetratricopeptide TPR_4 # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 116 1 117 119 89 49.0 4e-17 MALTQEQRNTLSILGYLYYRMGRLDNAATVFAALDKLAPEGMDAISRRAAATLAAIETDR GNAEKALQLLHRVMDGQTLSTRHAALHLLRARALWQQGRKDEARAAVNEYLYLAGNGPSA QALAEPPFNSMGKRV >gi|316921914|gb|ADCP01000134.1| GENE 75 53272 - 53640 442 122 aa, chain - ## HITS:1 COG:no KEGG:DvMF_2417 NR:ns ## KEGG: DvMF_2417 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 3 122 4 122 122 132 59.0 6e-30 MPSINLLDPSLGIQTIMESPAATRGGIPEARPLASNVLREAGLEELYSPCNAHHLVEQAL CPDAGDGDILRPEVFSGNLAGCLEALKDSDDPAVRALADGELAPLLQNKELLNAYMGLMI GG >gi|316921914|gb|ADCP01000134.1| GENE 76 53683 - 54057 468 124 aa, chain - ## HITS:1 COG:no KEGG:LI0546 NR:ns ## KEGG: LI0546 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 121 1 119 120 114 46.0 2e-24 MIEREIAEFGRRMGMPAFSLSPAGLAALDVERLGRLHLELAENAGERELLVYLARAVPEY 
DVKAPRRVLERCHYRHADPMPLSGGVHKGNIVLLTRFPERQATAAGIERAVQFLSGVMDT VAKA >gi|316921914|gb|ADCP01000134.1| GENE 77 54057 - 55181 1280 374 aa, chain - ## HITS:1 COG:no KEGG:DvMF_2419 NR:ns ## KEGG: DvMF_2419 # Name: not_defined # Def: type III secretion regulator YopN/LcrE/InvE/MxiC # Organism: D.vulgaris_Miyazaki_F # Pathway: Bacterial secretion system [PATH:dvm03070] # 1 370 1 374 379 397 59.0 1e-109 MSIDFSVTTMSQMAGGATASAASSLATGSLMGNAAAQVEDPMSLLADAAEELTFAADTTD EYELEDRKERERAESAYAERVKLYQDLMHEAGKSQNIDRLKDSLRAREGREKASREALYR FPDPSDAYAALSEALDAFSDDPSVDPSVIEDIRQGLAELEAEHGPQIRSGIQGALAAAGY PELDSADGLRDLYRQTVCDFTDVNAAFAHIHEKYGDVGFGKAMDFLFNALGNDLATDVPS METTHLESVHATLEQVRLLQSTHVQCERLLQRWQDVHGVQCGLAPMELLGDLVDLRKEHF LGAMQIDRIASKAKAPDIEREVLFLQELLNMARNLPVQLFDGEQGRMKVIDAVQESVDAA IRREDEYLASLGDA >gi|316921914|gb|ADCP01000134.1| GENE 78 55576 - 55866 486 96 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAFTEKDMQAQEEQLNALKDELSRLNQKFDAQLKGMGLTEDDLKQQEAMTPEVEALVAKA KEEAKRAGEARKAQASLDNRPSGKAPGAGRRGVVRL >gi|316921914|gb|ADCP01000134.1| GENE 79 55890 - 56450 539 186 aa, chain + ## HITS:1 COG:no KEGG:Dvul_3002 NR:ns ## KEGG: Dvul_3002 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 20 186 18 169 169 119 44.0 4e-26 MSQGIGGVGGSGSVSGVGSLGTCTSVNFIFAKLQMELAASAKDSALGYIKQIESAQAEQK EVADLLQQARQAQNEGENGTGVGQKSINGVTVTGKDCYPMSKELATFMEQHELTFPNTDK DYILGKDEWDVAIQSLQAYQETIGTDIQTLMVYVQDFMGQYNSYTQGANSAIQSGMQTLT SVARGQ >gi|316921914|gb|ADCP01000134.1| GENE 80 56476 - 57018 600 180 aa, chain + ## HITS:1 COG:no KEGG:Dvul_3002 NR:ns ## KEGG: Dvul_3002 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 180 1 169 169 142 50.0 7e-33 MDPVSGNSGSGSVSSVGSLGTCTSVNFIFAKLQMELAASAKDSALGYIKQIEGAQAEQKE VADMLQRCRELQNQAKDSGGCTEMPADVREFMDKNNLTYDLTTGGVSKPTKETADSLHNK DEWDVAIQSLQAYQETIGTDIQTKMVYVQDFMGQYNSYTQGANSAIQSGMQTLTAVARGQ >gi|316921914|gb|ADCP01000134.1| GENE 81 57181 - 57678 601 165 aa, chain + ## HITS:1 COG:YPCD1.30c KEGG:ns NR:ns ## 
COG: YPCD1.30c COG0457 # Protein_GI_number: 16082718 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Yersinia pestis # 12 135 8 130 168 83 38.0 1e-16 MTEQKRAAVEYTQEELETMVKAVLNGATIGDVCNVSQDMLESLYSLGYNLYTSGNYKDAE TVFSGLCLYDHNDPRFWMGLAGSRQANGKYQEAVDAYGLCSAMGALASPVPLLQAGMCYL KMGDREKAQGAFVVALSMGEEGNPEHDAARGKASAMLAILEQAEK >gi|316921914|gb|ADCP01000134.1| GENE 82 57686 - 58696 947 336 aa, chain + ## HITS:1 COG:no KEGG:Dvul_3000 NR:ns ## KEGG: Dvul_3000 # Name: not_defined # Def: type III secretion system target, YopB family # Organism: D.vulgaris_DP4 # Pathway: not_defined # 37 336 33 335 335 216 46.0 8e-55 MPNSIDFQVGSKFMELAGGVEVNNLQEAIKKALQEITLPDLPSSSPNASSSSSDLPRLAP PTAGGLSLETLAQMISNETRTQATKDGVASIDAKGKERAEINEKKLEEIMDRLDSMKSKG ILDVFKKIFSWVGVIVGAIASVATIVAGAATGNPLLIAGGAVMLTMSINSAVSMATDGKV SISAGIAAGLKACGVPEDIAGYVGMACELAITIVGIGLTMGGSFGSAATTSAQTLSKVAD IALKSSNIASSVVQMGSGATNIAGSVYDYKISTSYADTKELEAILERIQQAQDMELDFLK GVMERAEKMLEDVNNIIEGCTESQTAILTNVAPTMA >gi|316921914|gb|ADCP01000134.1| GENE 83 58890 - 59864 910 324 aa, chain + ## HITS:1 COG:no KEGG:DVUA0111 NR:ns ## KEGG: DVUA0111 # Name: not_defined # Def: type III secretion system protein IpaC family # Organism: D.vulgaris # Pathway: not_defined # 1 324 25 298 298 134 35.0 4e-30 MSGIQGLSSDQQQVYNQLINDAGTVNVSRATVDKELQVAMDAGLSFQEAAAQVRNDLPTL TPPKSSSGALGAWVGVASPMASINALITEMSAEQRKENREVMKAQTDSIAMSMEQQADEI RKKAVAQLVCGVVSGAINIGMGAVQFGMGMKQLNMGKQQGALAKEQGVLGKQQTDIANKQ ADLKSLRSEGAKLSGDELKQFNAEYSSRKASLKADMTDVKTQMKTLDAKSGALGAQSQSY GVMGQSMSQLGQGVGGIVGSVGDFIGAQYDAAMKEMEADQERMRASRDALKSINDSLSEL IQKSISTQDAIQQNMNQTRSRILG >gi|316921914|gb|ADCP01000134.1| GENE 84 59873 - 60211 267 112 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAINGISTQQVSGISPNAGVSVNGGASIESVSNAISAAANKGITPAGPTVTERDSRGSTY GFTPYHDHNGNKMVYLESRSASGVVSHMPLKFEQLESLCSRRGIAVPSLQYN >gi|316921914|gb|ADCP01000134.1| GENE 85 60222 - 60476 216 84 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVTNSEPQDRLLEELSACSAALSAIAEKLEAEGRSLEALTDAEQREIEALGKRVERCDRQ 
LDECLKDTPPCGKGPMPHVWGIRV >gi|316921914|gb|ADCP01000134.1| GENE 86 60527 - 60970 206 147 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2998 NR:ns ## KEGG: Dvul_2998 # Name: not_defined # Def: TIR chaperone family protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 126 1 127 151 102 46.0 6e-21 MSLLQESVDALTAAAGWKKASPGADGVFRFRLEGDLDMEFFSPDGRTGILRADLGPLPAA EQDADDLLRKCASRMVAASRSRRTVLSVDEGRFSLHLVVPLAESVASMPGYAKDFLNDLA WWKRQVQGGGASAFSFFSGMQAWNMGR >gi|316921914|gb|ADCP01000134.1| GENE 87 60886 - 62844 1719 652 aa, chain + ## HITS:1 COG:YPCD1.52 KEGG:ns NR:ns ## COG: YPCD1.52 COG1450 # Protein_GI_number: 16082740 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulD # Organism: Yersinia pestis # 64 575 35 509 607 252 31.0 1e-66 MVEKTGAGRWRVGVFFLLRYASLEYGALMHCRGGVSLLMVFALVFGVLLVSCPAFAVQGG FSRPYTHYADDEDLTVILTNFARSQGLGASFSPGVVGKVSGRFDAVPPETFLKGMQAAFG VTWYRLGSTLYFYSESELSRTFITPRAMTAERLYQMLRQSAVFAPQLPATLAPGGAMIVV SGPPTYLAQISAAVTAFEEAQITNFVMKVFPLKYAWAEDMSVNSMDKTVTIPGVASILRA MVTGSPNSATRVTQDTATVDKLSGTGLIAQGRMTPVAPDAPAPAAQQGGQAAPQNATQVN IMADPRVNAVLVNDAEYRMPYYAKVIADLDRPVELVEIHAAIVDIDSDFKRDLGVTYQGA NSRDKGWSTGGEFSGTGDKFSPLPTMGTPSGTGMNLSTIYTHGADFFLARIQALEAEGEA RMLGRPSVLTVDNVQATLENTSTYYIQVEGYQAVDLFKVEAGTVLRVTPHIIRNERGQTS IKLAVSVQDDQNDSKDAPVSDNSVPPIKQTKINTQAIVGAGQSLLIGGYYYEQKSTDASG IPILMHIPVLGNLFKTTSKGTKRMERLILITPKVIHLDEIPTTPPRVDEPTFHRSPTQAD YEERVPAAPQSGCSRKRPELETVTPATAIPLSPPAARQSPGVQAQARPAGAL >gi|316921914|gb|ADCP01000134.1| GENE 88 62889 - 64325 713 478 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2427 NR:ns ## KEGG: DvMF_2427 # Name: not_defined # Def: type III secretion apparatus protein, YscD/HrpQ family # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 478 17 522 522 257 40.0 7e-67 MGAELVLPAGNQLVGSDDSCDIILQDSSVASRHANIVIGFSGDGAPSVRVTPLDGEILLN GAPLPSDGADIPARTPFFLGLTCMAWSEPGEPADAWRQVASGLQERRSVQERPEPAEPQR EEPVMDLGDPMILLQEGTSVGANGVGKTLWERLRSLPPQRLALSFFLLLGLCGLTFSYAG 
RTPDAETRVKEIQTIIDENGFRTLVVSPQDEGVGVQGVLRDDAERADLLRLAQGVQYPVY LDVTLRGDRVDAVTSAFASRGFSLSVQEKTDNPDDGLDLAGYMKDGLVEEQAFSAVREDV PELRDQAVWSRLDRTIRHADAVEAVLLPLLKEADLAFVQVRFLPGKVELAGRFDVAQRIR LDGVLDKARAELGVPVVFDVVASFEKKRPQAEVRREAAAPQQAEAGDVPAPQADDMPEIR VTGVTLTPMRFISLATGQRVFEGGLLPGGYVLESIGVKELKLRKDGRIIVYRLRGSNE >gi|316921914|gb|ADCP01000134.1| GENE 89 64318 - 64521 265 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSDAMVSVLDALDDPRALQGIRDEVLEMDLAVKRHMDRGLTPDEWAVAQAVREATQAALH SMDNMAK >gi|316921914|gb|ADCP01000134.1| GENE 90 64550 - 64810 397 86 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2429 NR:ns ## KEGG: DvMF_2429 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: Bacterial secretion system [PATH:dvm03070] # 3 86 1 84 84 86 60.0 2e-16 MNINGTSGTPGLNVNELFNSGLDSISSKGKDLQAKMTEMLGKDEVSPEDMMALQFEVGQY NAMLESLSSVTKSMTDMMKSLAQRTG >gi|316921914|gb|ADCP01000134.1| GENE 91 64996 - 65370 428 124 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2993 NR:ns ## KEGG: Dvul_2993 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 121 1 121 121 130 62.0 2e-29 MLSDTQISRLLDVANAACHGGNVVDARAMYAGVLALRPGFASALMGKALSHIVVDDFDEA ERLLKDEVLAAMPEDEDAQALLGLCYTLARRQGEAEAVLAPLAAGEGPRAELAAGLLEKL RQEP >gi|316921914|gb|ADCP01000134.1| GENE 92 65397 - 65975 398 192 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2431 NR:ns ## KEGG: DvMF_2431 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 192 1 257 257 85 40.0 1e-15 MDISGLGASKLLEALEKLGVSKGAELRSLGPQGEPPAELVRAFEDALRAPDGQGGPSVQG VAEGSREHGPFEEAGQDIPEVARTDEAIPPNPSEDPVYGISEGGRIDGIGGTMRIESPSG PQAVEGRTDAVQTDGMQELARLLERVGGGNASATELYQLQFLVGMLRVQATGGAKLSQQV NQGFESLLKQQG >gi|316921914|gb|ADCP01000134.1| GENE 93 66036 - 66812 658 258 aa, chain + ## HITS:1 COG:mlr6339 KEGG:ns NR:ns ## COG: mlr6339 COG4669 # Protein_GI_number: 13475299 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, 
lipoprotein EscJ # Organism: Mesorhizobium loti # 7 250 12 251 279 141 38.0 1e-33 MRWLPFLLVLSLCAALLGCQVEVYRGLTEAQVNTMLSTLLKRGIRAEKTAAGKAGFTLSV DEDQLVQSLEILKENNLPRADYENLGKVFSGQGMISSASEEQARMAYAISQELSDTFSRI DGVLTARVHVVLGGTDQATDTRTLPSAAVFLRHTPDSPVVNLVAKIRELTSKAVPDLDYE RVSVMLVPVREQVSVPMEPTPKFLGLPLAPQNGPPYALIGAGIVFLAALAGLALLALSLR DALRKRKEAANASEDQSS >gi|316921914|gb|ADCP01000134.1| GENE 94 66891 - 67496 297 201 aa, chain + ## HITS:1 COG:no KEGG:LI0542 NR:ns ## KEGG: LI0542 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 22 201 47 234 234 145 40.0 1e-33 MLAFNGRLGSVVPDCPEVWRVPGLNTAIVAAELSRRGSPVLAWPVAPPEEGFWDFTDESR RLAFLSAEELERLGKVFGVCVHAAELARIITREPVLALREALGEPLYRYGIQRGQYQLGS VRQFFLSRDVREPLLERMQRHGRLAIAICRAPWPAALKERAAENIEDAPPSVSPAVQRAV WFGLKKLLLKEVAPQWAPCFD >gi|316921914|gb|ADCP01000134.1| GENE 95 67475 - 68095 665 206 aa, chain + ## HITS:1 COG:PA1725 KEGG:ns NR:ns ## COG: PA1725 COG1317 # Protein_GI_number: 15596922 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway protein # Organism: Pseudomonas aeruginosa # 1 205 1 204 214 118 39.0 7e-27 MGSLFRLTGDTVTPSAGTRVLKASEAAVLLEANAVLDAARERVADMERKAGEAYERRREE GYRDGVEEGRLEHAEKVMETVLSSVEYIEGIEATLVNVVAVAVRKVIGEIDENERIVRIV RNALVTVRNQQHVTIRVAPADEKAVREGLASMLASVPGGASFLDVVPDARLERGACLLES ELGVVDASLETQLKALENALRSKIAS >gi|316921914|gb|ADCP01000134.1| GENE 96 68339 - 69655 1385 438 aa, chain + ## HITS:1 COG:YPCD1.40 KEGG:ns NR:ns ## COG: YPCD1.40 COG1157 # Protein_GI_number: 16082728 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway ATPase # Organism: Yersinia pestis # 1 433 2 434 439 577 68.0 1e-164 MAFEYIGALLEEAVQSTASVEVRGRVEQVVGTIIRAVVPGVKVGELCILRNPWEDWTLRA EVVGFVKQVALLTPLGDLQGISPATEVIPTGEIHSVPVGEDLLGRVLNGLGDPIDGGPPL KPRTHYPVYADPPNPMLRKIIDRPISLGLRVLDGVLTCGEGQRMGIFAAAGGGKSTLLSS 
IIKGCSADVCVLALIGERGREVREFIENDLGPEGRKKAVLVVSTSDRSSMERLKAAYTAT AVAEYFRDQHKSVLLMMDSVTRFGRAQREIGLAAGEPPTRRGFPPSVFSTLPKLMERAGN SDKGSITALYTVLVEGDDMTEPIADETRSILDGHIILSRKLAAANHYPAIDVQASVSRVM NAIVTKEHKKAAQALRKVLAKFAEVELLVQIGEYKKGSDKEADDALARINAVNAFLRQGL DEKSTFEETLQALYKVVA >gi|316921914|gb|ADCP01000134.1| GENE 97 69658 - 70149 645 163 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2436 NR:ns ## KEGG: DvMF_2436 # Name: not_defined # Def: type III secretion YscO fmaily protein # Organism: D.vulgaris_Miyazaki_F # Pathway: Bacterial secretion system [PATH:dvm03070] # 4 155 5 156 167 116 53.0 3e-25 MEKYPLAPLLKVREYREDAAKNALSAAERAVVEAQEAVERCRGELERYKVWRQEEVERRY DAIMGKGLSLKELDVFKAGLGALADGELKLEEAIAQALENVKKRQEDVRKAREAARQAQH ETAKIVTHRDIWLVEAKREAERLEDLEMEEFKPLPPQGTEGEL >gi|316921914|gb|ADCP01000134.1| GENE 98 70388 - 71062 480 224 aa, chain + ## HITS:1 COG:no KEGG:DVUA0120 NR:ns ## KEGG: DVUA0120 # Name: not_defined # Def: type III secretion protein, putative # Organism: D.vulgaris # Pathway: not_defined # 1 214 7 219 230 113 44.0 5e-24 MADFMKVDMSRLGQAEGRSADSGMAAPSGADVDTFQQAMRRRPLDERQGSGGEESDGGRG ASHAPDEDTSAEAEALSSMFSRMAGPLDTLFAGRTAQTAETAAPSGDAPDLGALAEKLVE RILVSGPDGGHEIRLTLGKDVLPGTEIRLQRGSDGVLSVHLATDNAASFQTLVGAQESLK ARLESFEKDVRVEVVSERGGAEAENGDGRRQSRGEYVAPDERDA >gi|316921914|gb|ADCP01000134.1| GENE 99 71160 - 72221 972 353 aa, chain + ## HITS:1 COG:no KEGG:DVUA0121 NR:ns ## KEGG: DVUA0121 # Name: not_defined # Def: type III secretion system protein YopQ family # Organism: D.vulgaris # Pathway: Bacterial secretion system [PATH:dvu03070] # 3 348 42 422 484 168 38.0 3e-40 MEILPFRAPALSPLEAHVFNLLCTRAQPWPIDVAGTPCALEAGGLAAPVPPVCAFDVVCG ERTWRVELGSLRLLALHPAVADVPPNAALPGALQLAVLDLLAAPLAELAQTFLHVPLAVR NVRMIEAAEPAVCALPLTLHVPVGAGASAEAYPIPVRLCLPDRESALELVERLDALPRRR ALGLAGDVPVPVSLEAGRMRLSVNELSSLEPDDVLLPETYPAREGRVTLRLCATSRSLVF ACSLAEGCATILSVLNPEEGPMSDENNTAAGAAPSEGVDTGELEVTLTFELERRLMTVRD VETLAPGYTFAFGGDALAPVTLYANGKSVGKGRLVDLNGTLGVQVVSLGKVGE >gi|316921914|gb|ADCP01000134.1| GENE 100 72224 - 72874 653 216 aa, chain + 
## HITS:1 COG:PA1693 KEGG:ns NR:ns ## COG: PA1693 COG4790 # Protein_GI_number: 15596890 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, component EscR # Organism: Pseudomonas aeruginosa # 20 215 21 215 217 196 56.0 2e-50 MTGVNPLYFIVGMALLGLAPFFLMMVTSYVKIVVVTSLVRNALGVQQVPPAMVMNGLAII LSLFIMAPMMMSTMDIAGKLQISAEPSPKEVIGIVDKLSPPLRKFLSDNTEDGVLRAFMS TAKRIWPKEQHDQISKDNMLILVPAFTISELTKAFQVGFLLYLPFIAIDLIISNILLAMG MMMVSPMTISLPFKLLLFVTLDGWLKISQGLLLSYK >gi|316921914|gb|ADCP01000134.1| GENE 101 72898 - 73233 251 111 aa, chain + ## HITS:1 COG:CT424 KEGG:ns NR:ns ## COG: CT424 COG1366 # Protein_GI_number: 15605151 # Func_class: T Signal transduction mechanisms # Function: Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) # Organism: Chlamydia trachomatis # 7 111 8 112 116 67 33.0 7e-12 MQVVTSEQGCVGIIALTGRIDATTASSFETSCRELLDSGAKKVVVDLGGVEYISSAGLRV ILTMVKASKAAAATLAFCSMQSMVAEVFKISGFSSMLPIYATRDEAVSALS >gi|316921914|gb|ADCP01000134.1| GENE 102 73244 - 73660 538 138 aa, chain + ## HITS:1 COG:CPn0670 KEGG:ns NR:ns ## COG: CPn0670 COG2172 # Protein_GI_number: 15618580 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Chlamydophila pneumoniae CWL029 # 7 132 11 136 144 62 34.0 2e-10 MATLSVPASLEQLANVNAFIHETIPSSYRPLIPQVELAAEELLVNVFSYAYPDVPGEAQV ACREVRLDNVPYFCFKVQDWGAPFDPFLEAPVPDVSLGVDERPVGGLGIHLIKSVVAHYT YAYYEQSNIIELYFALPE >gi|316921914|gb|ADCP01000134.1| GENE 103 73693 - 74343 687 216 aa, chain + ## HITS:1 COG:BH2852 KEGG:ns NR:ns ## COG: BH2852 COG0741 # Protein_GI_number: 15615415 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Bacillus halodurans # 94 208 110 225 232 138 60.0 1e-32 MFVSGTASAGVILHEELSTDERPAQRLRPVHPGGQSESSLPPRPSIPTENLLDQMDIYAL 
SPHAAALSLPRMVSPQLVLRAHAGKGTLPPPAVWRELVAKASGRYGLDPRLIAAVIKVES NFETIAESEKGAQGLMQLMPGTQQMLGVIDPFDPEANVDAGSRYLRQQIDRFGRLDLALA AYNAGPGNVLRYGGIPPFAETQAYVSKVLSLLDADN >gi|316921914|gb|ADCP01000134.1| GENE 104 74593 - 75666 1000 357 aa, chain - ## HITS:1 COG:FN0563 KEGG:ns NR:ns ## COG: FN0563 COG0482 # Protein_GI_number: 19703898 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Fusobacterium nucleatum # 8 295 6 296 333 193 41.0 3e-49 MNTSTPHAFALFSGGLDSILAARLIMEQGLVVRCLHFVTPFFGKPQLIPHWEKVYGLEVE AVDVGEAFVRLLRERPAHGFGKVMNPCVDCKILMMRKARELMKKWGASFLISGEVLGQRP MSQRRDTLNVIRRDAEVKESLLRPLSAQLLDPTAPEISGLVDRNKLLGIFGRGRKSQMAL ADQMGLKEIPTPAGGCKLAEKENSRRYWPVLTRLKEPTVEDFELSNIGRQYWFGDHWLSM GRNEADNSALERLVRPGDAILRVRDFPSPFALARQLTPWDNETLQDAASLVASYSPKGVR AAEASPDGTIAVRVQINGESTFVNVAPKRTPALPWGEPEFVDVRKAIKAENRPTFDE >gi|316921914|gb|ADCP01000134.1| GENE 105 75559 - 75873 112 104 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQATDNQPLFHDQAGSEDAVQPAGKQGKCMRSGGIHKAILLRRIHEHKRLLCWRTREKGM ASAFAGSHMPFFMGASSDAPNEVWENRGTHTGSLAPLRQRKSWL >gi|316921914|gb|ADCP01000134.1| GENE 106 75870 - 76688 931 272 aa, chain - ## HITS:1 COG:no KEGG:Ddes_2150 NR:ns ## KEGG: Ddes_2150 # Name: not_defined # Def: split soret cytochrome c precursor # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 15 272 22 279 279 365 66.0 1e-100 MELTRRAMLGALGGLAVGGTVASVVGTSIAADPKTKRFEQINGDFGWKPHKLDPKECAAV AYDGYWHKGLGCAYGAFYAIVGLMGEKYGAPYNQFPFAMLEVGKGGISDWGTICGALLGP ASAYALFWGRKERNAMVSELYRWYETAKLPIFNPGEAAKGVKGDLPTSVADSVLCHVSLS RWCFENKVEANSKARSERCGRLTADVTYKAVEIMNAKIDGTFKPALAAPQSVTTCGECHA KGKEADNMKSVMDCTPCHSGNEHLMNKFKDHP >gi|316921914|gb|ADCP01000134.1| GENE 107 77025 - 78413 1931 462 aa, chain - ## HITS:1 COG:BH3186 KEGG:ns NR:ns ## COG: BH3186 COG0165 # Protein_GI_number: 15615748 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate lyase # Organism: Bacillus 
halodurans # 5 458 3 456 458 454 49.0 1e-127 MAKEKLWGGRFHEGTAASVEAYTQSISYDWALYKQDIAGSCAHARMLARQGVISAEEADQ LVAGLAAVRAEIEDGTFQWKKELEDVHMNIESRLTELVGDVGKKLHTGRSRNDQVALDFR LFVSDRLREWQRLARELAAVYMARAEEHQDTILPGCTHLQPAQPVSLAHHLLAYAWMLRR DVQRMADCEKRTRISPLGAAALAGTTYPLDPASVARELDMYGTFDNSMDAVSDRDFVIEA LFCASTAMMHLSRCCEEIILWSNPAFGFVKLPDAYATGSSIMPQKKNPDVAEIMRGKVGR VYGDLTAMLTILKGLPLTYNRDLQEDKEPFLDADHTLSTSLELMAGMMKALRFDEVRMAR ALTAGFLNATELADYLVTKGIPFREAHHITGSAVALAEGKNLSLEALPLDALQGLCDRIG PDVFDVLEYRAAVARRNAPGGTGPLSVARQLEKMKAWLHETR Prediction of potential genes in microbial genomes Time: Fri May 13 04:34:34 2011 Seq name: gi|316921909|gb|ADCP01000135.1| Bilophila wadsworthia 3_1_6 cont1.135, whole genome shotgun sequence Length of sequence - 6021 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 69 - 1277 1825 ## COG0137 Argininosuccinate synthase 2 1 Op 2 . - CDS 1344 - 1973 929 ## COG0491 Zn-dependent hydrolases, including glyoxylases - Prom 2114 - 2173 3.5 - Term 2308 - 2358 17.1 3 2 Tu 1 . - CDS 2381 - 4648 3035 ## COG0550 Topoisomerase IA 4 3 Tu 1 . 
- CDS 5029 - 5832 1021 ## COG0790 FOG: TPR repeat, SEL1 subfamily Predicted protein(s) >gi|316921909|gb|ADCP01000135.1| GENE 1 69 - 1277 1825 402 aa, chain - ## HITS:1 COG:mlr4366 KEGG:ns NR:ns ## COG: mlr4366 COG0137 # Protein_GI_number: 13473685 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Mesorhizobium loti # 1 394 1 398 407 488 60.0 1e-137 MAQQRNIKKVVLAYSGGLDTSVILKWLQVNYGCEVITMTADVGQEDDLNGVDEKALRTGA TKAYIEDLREEFAKDFIFPMLRSGALYEGRYLLGTSVARPLITKRLVEIARAEGADALAH GATGKGNDQVRFELSAAALAPDLRVIAPWREWDLMSRTALNFFAEQHNIPISSGAKHYSM DRNMLHCSFEGGELEDPWEEPLEASHIMAVPFEKAPDEAEYVTITFEHGDPVAVNGEALS PAQIMVKLNELGRKHGIGRVDMVENRFVGIKSRGVYETPGGTIIYVAHKDLEGITMERET MHIRDMMMPSYAGAIYNGFWYSPEREAMQAFIDKSQERVSGTVRLKLYKGNAWPVGRESK NTLYCADLATFEECATYDHKDAAGFIKLNSLRIRGYRKDLIK >gi|316921909|gb|ADCP01000135.1| GENE 2 1344 - 1973 929 209 aa, chain - ## HITS:1 COG:BS_yqgX KEGG:ns NR:ns ## COG: BS_yqgX COG0491 # Protein_GI_number: 16079535 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Bacillus subtilis # 1 207 1 206 211 129 37.0 4e-30 MPIETFPIGPLETNCHIVISGSDAVAVDPGGDLNSGLGEVLALLKEQNLTLRAILCTHFH FDHLYGVAALHEATGAPVYGPAGDASLLGTELGGSGAWGFPPVTTFKWEDLKEGPHTFGS IKCEVLTTPGHTPGSLSIYFPELNSIMTGDLLFYHSVGRTDFPGSSQEALRKSLHKKIFI LPQETAVYPGHGPNTNVGEELANNPYLWL >gi|316921909|gb|ADCP01000135.1| GENE 3 2381 - 4648 3035 755 aa, chain - ## HITS:1 COG:BH2467_1 KEGG:ns NR:ns ## COG: BH2467_1 COG0550 # Protein_GI_number: 15615030 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Bacillus halodurans # 1 548 1 550 550 540 50.0 1e-153 MGKSLIIVESPAKVKTIKKFLGPGYLVQASVGHIRDLPKSNIGVDEANDFAPEYEIIQGK EKVVQSLRDAASKVDTVFLAPDPDREGEAIAWHVAELLKDKNPNIKRIQFNEITSRAVHE ALDHPRDLNVNLFDAQQARRVLDRLVGYKISPLLWKNVKRGISAGRVQSVALRLIVEREE ERQAFVPEEYWLFKAILEGSIPPPFKAELWKVEGKKPVIPNAETADALEKAMKGAPFTVS AIEEKERSRAPMPPFITSTLQQAANQRLSYPAKRTMNIAQRLYEGLELGDRGTTALITYM 
RTDSVRIADEAQKAAADFIENRFGKDYLAPGGKRNFKTKSDAQDAHEAIRPVDVTLTPED VKPYLAPDQYQVYRLIWARFVASQMAAARFHDTTVTIDNGPVQWRSKGERMLFPGFLAVM PRGKDEEGVELPALTKGETLKLNSLTKEQKFTQPSPRFTEASLVRELEELGIGRPSTYAS IISTLQDRDYVTLEEKHFAPTDLGRVVCDRLREHFKTLMDVGFTAHMEELLDKVAEGQQD WVALLRNFNNDFDPTLLEAAKSMGKAKTDTVTDIPCPECGKPLAVKFGKTGQFLACTGYP ACRYTSNYTRDEQGQIHLQEKVKPEFEKVGTCPECGKDLVLKRSRTGSRFIACSGYPDCK HAEPYSTGVPCPREGCNGVLVEKSSKRGKIFYSCSNYPQCDYALWDWPIAEPCPECGSPL LVLKNTKAKGKFIACPEKTCKYTRPLEGGSEDDAE >gi|316921909|gb|ADCP01000135.1| GENE 4 5029 - 5832 1021 267 aa, chain - ## HITS:1 COG:jhp1045 KEGG:ns NR:ns ## COG: jhp1045 COG0790 # Protein_GI_number: 15612110 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Helicobacter pylori J99 # 84 259 34 215 256 93 34.0 5e-19 MINTKQVVPALVAVTLLSLAPAAEALAIEEHAQAAKPAAVAADKPAAAKPAEAKADTKTD AKADVKTQAPAATDPTKETPAPSVEQAQKAYDNGSYEEAFIIVEPLAKLEHPDAEYLLGQ MYELGRGVKKDTEQAVTLFTSAANQGYAAAQAKLGQLHMEGKKDYASAMSWFQKAADQGY ALAYSAIGDLYAQGYGVGQDKGKALDYYKKAATAGDADACLHLGQMYEQGKDVKADPAQA AIWYKKGADLGSAECAAALKRLNDQKA Prediction of potential genes in microbial genomes Time: Fri May 13 04:34:39 2011 Seq name: gi|316921903|gb|ADCP01000136.1| Bilophila wadsworthia 3_1_6 cont1.136, whole genome shotgun sequence Length of sequence - 6338 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 101 - 138 -0.3 1 1 Tu 1 . - CDS 281 - 1657 1932 ## COG2848 Uncharacterized conserved protein 2 2 Tu 1 . - CDS 1777 - 2451 686 ## Ddes_0621 amino acid-binding ACT domain protein - Prom 2496 - 2555 4.0 + Prom 2540 - 2599 2.6 3 3 Tu 1 . + CDS 2622 - 4397 2350 ## COG2759 Formyltetrahydrofolate synthetase + Term 4412 - 4455 9.1 + Prom 4421 - 4480 2.1 4 4 Op 1 . + CDS 4662 - 5207 485 ## PROTEIN SUPPORTED gi|116512426|ref|YP_809642.1| spermidine acetyltransferase 5 4 Op 2 . 
+ CDS 5234 - 6124 897 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase + Term 6199 - 6271 15.1 Predicted protein(s) >gi|316921903|gb|ADCP01000136.1| GENE 1 281 - 1657 1932 458 aa, chain - ## HITS:1 COG:MA1691 KEGG:ns NR:ns ## COG: MA1691 COG2848 # Protein_GI_number: 20090543 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 3 458 15 464 464 513 59.0 1e-145 MLTDREVLSTLNMLRNEHLDVRTVTLGISLFDCASHDFDVFAYRVRAKIAKYAAKLVATC NEVGDKFGIPVVNKRISVSPIGVVGASFSRDQMVRACQVLDESAKDAGVDFLGGFGALVE KGITPGERNLIDALPEALATTDRICSSINVASTRSGINMDAVALLGQRILDVAEATRDRD GLGCAKLVVFANIPQDVPFMAGAYLGVGEPDVVINVGVSGPGVVKKAIDRAMESHKPGEF TLGEVAEVIKRTAYKVTRVGEIIGKEVAQRLDLPFGVADLSLAPTPAVGDSVGEIFQSVG LSSIGAPGTTAVLAMLNDAVKKGGAFASSHVGGLSGAFIPVSEDSSIEAATRSGALSLEK LEAMTSVCSVGLDMIAIPGDTPAATISGIIADEMAIGMINSKTTAVRVIPVPGKGVGEKA SFGGLLGEADIIPVPGGDASTFISLGGRIPAPIHSLKN >gi|316921903|gb|ADCP01000136.1| GENE 2 1777 - 2451 686 224 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0621 NR:ns ## KEGG: Ddes_0621 # Name: not_defined # Def: amino acid-binding ACT domain protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 41 221 3 184 187 155 46.0 7e-37 MLTAGEGWQKRLTECFRLISLLLSPYFGKAKDLPREVSVSKIVVSVIGLDCPGVVYAVSS TLSALECNIEEVSQTILKNQFAAIFVANKQESLDNGTIHTQLSKAIESRGMHLSVTIRDF EEGDTSGSAESEPFVVTVDGEDRFEIIAAFSKIFADQKVNIENLKALMPEEEKRALLVFE ISLPMEIDRNALRRTLKDKARTLGLQLSMQHRDIFEALHRVQPV >gi|316921903|gb|ADCP01000136.1| GENE 3 2622 - 4397 2350 591 aa, chain + ## HITS:1 COG:SPBC2G2.08_2 KEGG:ns NR:ns ## COG: SPBC2G2.08_2 COG2759 # Protein_GI_number: 19113229 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Schizosaccharomyces pombe # 9 590 1 629 630 461 40.0 1e-129 MVLDPKKHADWEIAQDAETRMKTVQQLGRDMGLLDKELLPYGHVMGKVDYRAVLERLGDK PNGKYVDVTAITPTPLGEGKSTTTIGLVQGLGRRNKRASAAIRQPSGGPTMGVKGSAAGG GLSQCIPLTQYSLGFTGDINAVMNAHNLAMVALTSRMQHERNYNDEKLLKLSGMPRLNID 
PTNVNMGWVMDFCCQSLRNIIIGMDGTNGRSDGFMMRSRFDIAVSSEVMAILAIAKDLKD LRQRMGKIIVAYDRDGKPVTTADLEVDGAMTAWMVEAINPNLIQTIEGQPVFVHAGPFAN IAIGQSSVIADRIGLKLNEYHVTESGFGADIGYEKFWNLKCHYSGLTPDAAVVVATVRAL KSHGGAPVPVPGRPLPEEYRTENVGFVEAGCCNLLHHINTVKRSGVSPVVCINAFVTDTK AEIAKVRELCEAAGARVALSTHWEHGGEGALELADAVMDACNDKTEFKPLYSWDMPFKER IELVAREVYGADGVDFSAEASRKLADIQKRDDANELGLCMVKTHLSLSDDPGVKGAPKGW KLRIRDVLTFGGAGFVVPVSGSITLMPGTGSNPSFRRVDVDVDTGRVKGIF >gi|316921903|gb|ADCP01000136.1| GENE 4 4662 - 5207 485 181 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116512426|ref|YP_809642.1| spermidine acetyltransferase [Lactococcus lactis subsp. cremoris SK11] # 4 174 7 177 212 191 51 1e-48 MVADPSIRLRPLEREDLPFIHQLDNNASVMRYWFEEPYEAFVELTDLYDKHIHDQSERRF IVEHDRAKVGLVELVEIDYVHRRAEFQIIIAPAHQGLGYAAKAVLLVMDYAFTVLNLYKL YLVVDTENKKAIHVYKKLGFEVEGELKHEFFSNGEYRNALRMCTFQTDYLMKKKEQAAQW A >gi|316921903|gb|ADCP01000136.1| GENE 5 5234 - 6124 897 296 aa, chain + ## HITS:1 COG:BMEII0510 KEGG:ns NR:ns ## COG: BMEII0510 COG0190 # Protein_GI_number: 17988855 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Brucella melitensis # 1 290 21 308 319 281 53.0 8e-76 MADIISGTEMRAAILDELRQEVEAIREKHGTVPGLVTILVGENPASISYVTLKVKTALSL GFHEIQDNQPADVSEEALLALIDKYNRDPSIHGILVQLPLPKHIDENKVIYAIDPDKDVD GFHPVNLGRMVIGGDAVRFLPCTPAGIQEMLVRSGVEMAGAEVVVVGRSNIVGKPISVML GQKGPGANATVTMVHTRTKDLAEHCRRADILIVAAGVPGLVKPDWIKPGATVIDVGVNRV GFNEKTGKPILSGDVDFAEASKVAGKITPVPGGVGPMTIAMLMKNTVRSAWYHLGR Prediction of potential genes in microbial genomes Time: Fri May 13 04:34:48 2011 Seq name: gi|316921899|gb|ADCP01000137.1| Bilophila wadsworthia 3_1_6 cont1.137, whole genome shotgun sequence Length of sequence - 3309 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 269 - 952 733 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family 2 1 Op 2 . - CDS 975 - 2336 487 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 - Prom 2492 - 2551 2.0 3 2 Tu 1 . + CDS 2493 - 3263 754 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) Predicted protein(s) >gi|316921899|gb|ADCP01000137.1| GENE 1 269 - 952 733 227 aa, chain - ## HITS:1 COG:AF1550 KEGG:ns NR:ns ## COG: AF1550 COG1387 # Protein_GI_number: 11499145 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Archaeoglobus fulgidus # 1 201 1 194 213 133 36.0 3e-31 MFDFHIHVNCSGGRDGLLPSEAMRLAKCAGFRAVGLIVRADPSTLPILLPTLKTLVKTCS LYADIEAFAGVELVHVPPALLPDAVGQAREQGAALILAHGESIPRQLADVVETGTNLAAI NAGVDILAHPGLITVEDAQLAAEKGVLLELNTAPRHCLANGHIVRMAERFGCELLLNSDA SSSADFESPDVTQALRKAAALGAGLDAEGLSRLHVTERKLVQKLLRL >gi|316921899|gb|ADCP01000137.1| GENE 2 975 - 2336 487 453 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 
168] # 1 404 1 394 451 192 31 4e-49 MAQGTFHIITFGCQMNVNDSFWLARSLQRRGFVESPLEKAEVVILNTCSVRDKPEQKVYA ALGRIKYETRRVPGSFAVVAGCVAQQIGAGFFERFPQVRLVVGGDGLAMAPDAIERLHGD PSLRLNLTDFSEIYPERDPALPAQGEGEIPPVAFVNIMQGCDNFCTYCIVPFTRGRQKSR DAGAILDECRALIDNGAKEITLLGQNVNSYGLDKHASGDTSFARLLRKVSELPGLARLRF VTPHPKDLSPEVIAMFGEVPNLCPRLHLPLQAGSDRVLARMNRKYDMARYMTLVEGLRAA RPDIALSTDLIVGFPGETEEQFQETLDAVRAVNFMSSFSFCYSDRPGTAASRHTDKVEPA EKLRRLERLQALQEDLSSDWLKARVGCKTDILLEGASRKQDGEDAATESWQGRDPWGDAV NVSLPAGIGKAGLIVPVTIVTAKKHSLIGELRG >gi|316921899|gb|ADCP01000137.1| GENE 3 2493 - 3263 754 256 aa, chain + ## HITS:1 COG:MA0585 KEGG:ns NR:ns ## COG: MA0585 COG0179 # Protein_GI_number: 20089474 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Methanosarcina acetivorans str.C2A # 3 245 2 243 244 222 48.0 4e-58 MRIIRVEHGGTSYYGILEDGLVFRLHQQPGHMEPIPLSEAKVLPLVTPSKVVCVGLNYKA HAEELGYPLPNMPSFFLKPPSSIIGNGEPIVLPSGMGRIEHEAELAMVVGKVCRNLTPEE AEESLFGFTCANDVTAREVQKVDPLIGHCKSYDTFCPIGPWIETDLEDINDLSIRCLVNG EVRQSANTGDMIFRPLDLLCFLSRVMTLMPGDVILTGTPPGISPIQADDVVQVSIEGIGT LSNPVEGPGPEEKILQ Prediction of potential genes in microbial genomes Time: Fri May 13 04:34:58 2011 Seq name: gi|316921887|gb|ADCP01000138.1| Bilophila wadsworthia 3_1_6 cont1.138, whole genome shotgun sequence Length of sequence - 16144 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 9, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 15/0.000 - CDS 9 - 1268 1610 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 2 1 Op 2 . - CDS 1284 - 2249 1137 ## COG0540 Aspartate carbamoyltransferase, catalytic chain - Prom 2465 - 2524 3.3 + Prom 2429 - 2488 3.3 3 2 Tu 1 . + CDS 2585 - 3730 660 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases + Term 3811 - 3850 -0.7 + Prom 3758 - 3817 3.4 4 3 Tu 1 . 
+ CDS 3895 - 4986 869 ## COG0006 Xaa-Pro aminopeptidase + Term 5084 - 5129 -0.6 5 4 Tu 1 . + CDS 5569 - 6054 54 ## COG0438 Glycosyltransferase + Prom 6950 - 7009 5.2 6 5 Tu 1 . + CDS 7091 - 7357 109 ## - Term 7141 - 7184 -0.9 7 6 Tu 1 . - CDS 7301 - 8266 611 ## DVU1213 rhomboid family protein - Prom 8382 - 8441 2.2 - Term 8422 - 8450 -0.2 8 7 Tu 1 . - CDS 8454 - 8663 303 ## PROTEIN SUPPORTED gi|220904652|ref|YP_002479964.1| ribosomal protein L28 9 8 Tu 1 . + CDS 9039 - 12278 4394 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) + Prom 12710 - 12769 2.7 10 9 Op 1 . + CDS 12808 - 14145 1405 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase 11 9 Op 2 . + CDS 14164 - 15789 2017 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases + Term 15977 - 16020 9.9 Predicted protein(s) >gi|316921887|gb|ADCP01000138.1| GENE 1 9 - 1268 1610 419 aa, chain - ## HITS:1 COG:aq_806 KEGG:ns NR:ns ## COG: aq_806 COG0044 # Protein_GI_number: 15606174 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Aquifex aeolicus # 1 408 1 409 422 306 41.0 4e-83 MSSLLIKNARYLGDSVDLLVVDGKVQSLSASINAPADAEIVDAAGKILFPSLIDVHTHLR EPGYEWKETVATGLSSAAHGGFGAILCMANTDPVNDDAAVTEQILESARRSWPNGPRVHP IGAATVGLKGEELTRMSELKDAGCVAISNDGRPVGNTEIFRRMLEYAADLDLIVIDHCED PYLAAKSHMNEGATSGVLGVKGQPDVAETIQASRSILLADYLGVPVHIAHVSAKRTVDII AWGKSRGVKVTAETTPHYLTLDDTAVNGYNTLAKVNPPLRTPADVAAMREAVKTGVIDML ATDHAPHAAHEKETPFDEAPNGFTGMDLALTITYGLVREGVLTEADLIRLWATEPARVFN LPVNTFEPGAPADFFLFDPELEWQVTPETLYSKSHNTPWLGKSLKGRVSAHWIGGVKIV >gi|316921887|gb|ADCP01000138.1| GENE 2 1284 - 2249 1137 321 aa, chain - ## HITS:1 COG:Cgl1574 KEGG:ns NR:ns ## COG: Cgl1574 COG0540 # Protein_GI_number: 19552824 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Corynebacterium glutamicum # 11 316 2 309 312 276 47.0 3e-74 MQQETFHWPHKDLLDVDQLTQEELFHLLDTAANLHEINSRPVKKVPILKGKSVVLFFAEP 
STRTKTSFDMAGKRLSADTFGLAKSGSSLQKGESMKDTALTLQAMNPDVIVIRHSSSGAA QFLAERLHCGVVNAGDGWHAHPTQALLDAFSLRQVWGHDRNAFKGKNLLILGDCAHSRVC RSNVLLLTKLGVNVSLCAPRTLLPAGVDNWPVTVHTDSKKAVRDMDAVMCLRLQLERQQA GLLPDLREYSKRFCLTTAHMELARPGAHIMHPGPILRGLDVSDALADSSQSLILDQVSAG VSVRMAVLYLLATRPDIKDNL >gi|316921887|gb|ADCP01000138.1| GENE 3 2585 - 3730 660 381 aa, chain + ## HITS:1 COG:all4956 KEGG:ns NR:ns ## COG: all4956 COG0635 # Protein_GI_number: 17232448 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Nostoc sp. PCC 7120 # 4 375 14 394 396 243 35.0 3e-64 MLLYIHVPFCRRKCRYCAFYSNPLARSGDEMAVYLSALYAELGQWGRRLGRVPVESVFFG GGTPSLLDPDQLEGVLDCVGRTFEVLPGAEISMEANPDSLHTAEKASGFLAAGVNRISLG VQSLHDGMLETLGRLHRANAAREAFRAIREAGCANLSLDLMWGLPGQTLEQWLEDARAAI ALGPEHISAYGLTLEPGTPLAESCGDAELPSEDVQCAMYLEGIRLFEEAGLHHYEVSNFA RDGFRCRHNLGYWEGRDYLGVGPAATSTIGGERWTNPEGEGWLEQVREGRRCPEWEPLDR ATRALELMMLRLRTVDGLPLDAYESLAGRSFLGDHGPFAHRLCAEGLARMENGVFRLTDE GMLVSNSILGELFEEEPKSAE >gi|316921887|gb|ADCP01000138.1| GENE 4 3895 - 4986 869 363 aa, chain + ## HITS:1 COG:BS_yqhT KEGG:ns NR:ns ## COG: BS_yqhT COG0006 # Protein_GI_number: 16079502 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Bacillus subtilis # 11 363 4 343 353 202 34.0 1e-51 MSEQMCEQRRERLRALMREQGIEALLISHAANRYYLSGFELHDVQLNESAGRLIVMADGK DWILTDSRYLDAARRLWEPERVFIYGADAPEDIAKLLKGLVPGKTIGFEARAVTLEFYEK FAETLAGSGCRLSKADGLVERLRVIKDTEEIRRMEISCRLNQQMMEWLPGVLVPGRSEAA VSWDIEMFFRHHGATELAFTSIVGHGPNAALPHYLPSKDALLEAENLVLVDVGCRLEDYC SDQTRTFWVGEKPTERFKKTLEAVQEAQHKAIRAIHPGVLACDVYKAARGHFESLGVAEA FTHGLGHGVGLETHEGPSLNGRNKTPLEPGMIVTVEPGLYFPEWGGIRWEHMVLVTEDGC RVL >gi|316921887|gb|ADCP01000138.1| GENE 5 5569 - 6054 54 161 aa, chain + ## HITS:1 COG:TM0622 KEGG:ns NR:ns ## COG: TM0622 COG0438 # Protein_GI_number: 15643387 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 4 121 217 333 388 63 33.0 
2e-10 MFYALKKVREQGVDFRCIVVGTGPDEQLVKEQCSSMGLDDITTFTGHVSNVQEQLGKGDI YVLPSYCEALGIALEEAMAQGLACIARRSGGVPEIWPTDQGSLLVQAHDDGSNMAEALYS LLTLPGEALLGIKKDFYAHAVQGFHVEEQARKVEEWLFNIY >gi|316921887|gb|ADCP01000138.1| GENE 6 7091 - 7357 109 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYGIVGTTLFIWSVGLALKKTLAQKDLFFQVVIIFYSVFGLFEFSLDREDGIVLLFFPMG LVYGREIAASLQRQPPAQHDQPCGAQAE >gi|316921887|gb|ADCP01000138.1| GENE 7 7301 - 8266 611 321 aa, chain - ## HITS:1 COG:no KEGG:DVU1213 NR:ns ## KEGG: DVU1213 # Name: not_defined # Def: rhomboid family protein # Organism: D.vulgaris # Pathway: not_defined # 16 321 42 333 337 238 50.0 3e-61 MRPYSVPRFWRMIDDGEKRPTLPPPHLFDLWITILASRNIPYLLTGRGNKLRLYVPALLE RQARSELEAVLAESRKPRPVYIEPPTHNNAHWVLSVLLLLILWHGVRMHWGFLAHLPGLP DLPSEEWSRLGSLDVFRVKTFGEWYRCVTALTLHADSQHLFGNVLFGAPFLILLCRRVGL GLGLFLILLAGSFGNALNAWYRPAGHISLGFSTALFGTVGVLSGFMALQGWGSRTQSDTG KLSWRRGILLLAAGTGILAMLGTEGDKTDYAAHLFGLLSGFIVGGTAGWISRRTALPSPV INTLLGLSAAGLVVLCWRLAL >gi|316921887|gb|ADCP01000138.1| GENE 8 8454 - 8663 303 69 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|220904652|ref|YP_002479964.1| ribosomal protein L28 [Desulfovibrio desulfuricans subsp. desulfuricans str. ATCC 27774] # 1 67 1 67 69 121 83 4e-27 MSKQCEFCGKKPQTGNYVSHSNIKTKRVFNPNLQNVRHQFANGEVRTVSVCTRCLRSGVV TKPVVRKAD >gi|316921887|gb|ADCP01000138.1| GENE 9 9039 - 12278 4394 1079 aa, chain + ## HITS:1 COG:alr3809 KEGG:ns NR:ns ## COG: alr3809 COG0458 # Protein_GI_number: 17231301 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Nostoc sp. 
PCC 7120 # 1 1075 1 1100 1104 1301 60.0 0 MPKRTDLKKIMVIGSGPIVIGQACEFDYSGTQGIKALKEEGYEVVLVNSNPATIMTDPGL ADRTYIEPIEPETVAAIVRKERPCAILPTLGGQTGLNTALALAKSGVLAECGTELIGARA EVIEKAESRELFRQAMENIGLKVPQSAIAHNMDDVRRIGSSMPFPMIIRPAFTLGGTGGG IAYNMEDLEEIAGDGLTASPVSEVMIEQSVIGWKEFEMEVMRDTADNCVIVCSIENVDAM GVHTGDSITVAPAQTLTDREYQKMRDASLAIMREIGVETGGSNVQFGVNPANGELVIIEM NPRVSRSSALASKATGFPIAKIAAKLAVGYTLDELRNDITRETSASFEPTIDYCVVKIPR FTFEKFPGAKDELTTSMKSVGETMAIGRTFKESLQKGLRSLEIGAIGLGSNFRAALPSRE ELMGKLRTPNSRRMFAIRQAMLVGFTLEELHEITLIDPWFLRQIKEIVDMEGQIRDFALA NSMTPDNPEMVAVLRKAKEFGFSDRQLAEMWKKPEGDIRALRRETGTVPTYYLVDTCAAE FEAYTPYFYSTYETGQEVVPADRKKVIILGGGPNRIGQGIEFDYCCCHASFALKDAGVQA IMVNSNPETVSTDYDTSDRLYFEPLTFEDVMHIVEKERPDGVIVQFGGQTPLNLAVPLLR AGVPILGTSPDAIDRAEDRERFQALLQKLSLLQPANGTATTLEESREIAHRIGYPIVIRP SYVLGGRAMMIVYDDEEMAEYFALHVGKQKLEHPVLIDKFLENAIEVDVDALSDGEDVYV AGVMEHIEEAGIHSGDSSCVLPPYSLPAETVAEIERQTVALAKELRVVGLMNIQFAVKDG VVFILEVNPRASRTAPFVSKATGVPLPRLATQVMLGKTLKELDPWSMRRSGYVSVKESVF PFRRFPGVDILLGPEMRSTGEVMGIAPTFEEAFLKGQWGAGQKLPEGGKVFLSVNDRDKR FVAEVARQYVDLGFELLATSGTAGLLAGQGIPSTRVLKVYEGRPNIVDLIKNGEVSLVIN TASGKRTAHDSKAIRQATLHYGVPYSTTLSGAKAIARAIGAARTTEVRVKSLQEYYKGE >gi|316921887|gb|ADCP01000138.1| GENE 10 12808 - 14145 1405 445 aa, chain + ## HITS:1 COG:CAC1392 KEGG:ns NR:ns ## COG: CAC1392 COG0034 # Protein_GI_number: 15894671 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Clostridium acetobutylicum # 1 433 36 463 475 430 47.0 1e-120 MAYFGLYALQHRGQESTGIASYDGKHIRLHAGMGLVPDVYDEATLGQELQGNRAIGHVRY STSGVSVRKNAQPLVVRVCGIEIALAHNGNLTNASELRGELESEGAIFQTSSDSEIFVHL IARSLAKKKDLEEAILSACDRVRGAYSLIILSEGRIIALRDAHGFHPLAMGKVDDAYVFA SETCAFDLLEAEYLREVLPGEMVVVDGDKVVSRPLNNAGKVPVRQCIFELVYFARPDSIV FGEDVYQCRKHMGYEMAKEAPTDAELVMPFPDSGVYAAIGFAHAAGVPYEQAYIRNHYVG RTFIQPSQTMRNFGVRVKLNPVRSMIKNRRICIVDDSIVRGTTVRTRVVKLRELGAREVH FRVSCPPLRHPCFYGINFSSRGELVAANHPLEALPGLLNLDSLHYLTIEGLLNSVSEPDK YCLACFNGEYPVACGGQCSCCGSAE >gi|316921887|gb|ADCP01000138.1| GENE 11 14164 - 
15789 2017 541 aa, chain + ## HITS:1 COG:aq_356 KEGG:ns NR:ns ## COG: aq_356 COG0119 # Protein_GI_number: 15605865 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Aquifex aeolicus # 1 532 1 524 528 544 51.0 1e-154 MTQKIVLYDTTLRDGTQAEDVNLSTPDKIKIALRLDEFGIDYIEGGWPGSNPVDVAFFKE IANYHLKYARIAAFGSTHHPDNKVEEDPNLNALIASGATAGTIFGKSCERHAKEALRLDP KRNLDIIRNSVAYLKRSMPEVYFDAEHFFDGYKRNAEYSLAVLRAAHEAGALIINLCDTN GGTLPNEMQQIVSDLRVALPEVRLGIHTHNDCEMAVANTIVAVQAGAEMVQGTINGVGER CGNANLCSIIPVLQVKCGLTCLPEPADVRLRQLTKLSSYFAEVANMKPFNRQPFVGNSAF AHKGGVHVSAVNRCSSLYEHMEPELVGNGQRILITELGGRSNIVSLARRFGFHLDKDEPV VKGLFNELKKKASLGYDYASAEASVELLILRKLARRGVRDFFKLVQYHVSAVRDASHELP MEEATVMVEVEGAVEHTAATGNGPVNALDTALRKALLPFYPRLKEMKLLDFKVRVLSASD GTGGTASVVRVLIESGDADSTWVTVGVSHDIIEASWQGLVDSVVYKLYRDEEGQRKLRDL R Prediction of potential genes in microbial genomes Time: Fri May 13 04:35:41 2011 Seq name: gi|316921850|gb|ADCP01000139.1| Bilophila wadsworthia 3_1_6 cont1.139, whole genome shotgun sequence Length of sequence - 42850 bp Number of predicted genes - 38, with homology - 33 Number of transcription units - 27, operones - 9 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 17 - 604 651 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family - Prom 819 - 878 80.3 + TRNA 790 - 876 72.3 # Leu CAA 0 0 - Term 1012 - 1061 4.4 2 2 Tu 1 . - CDS 1117 - 1329 69 ## + Prom 1735 - 1794 2.2 3 3 Tu 1 . + CDS 1959 - 2453 389 ## Dvul_0265 hypothetical protein + Term 2459 - 2492 3.0 - Term 2446 - 2480 2.0 4 4 Tu 1 . - CDS 2508 - 3428 1100 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 3562 - 3621 4.1 5 5 Tu 1 . + CDS 3647 - 4198 398 ## COG0622 Predicted phosphoesterase + Term 4207 - 4254 9.3 + Prom 4273 - 4332 2.9 6 6 Op 1 . + CDS 4497 - 5861 1562 ## COG0534 Na+-driven multidrug efflux pump + Term 5882 - 5930 4.1 7 6 Op 2 . 
+ CDS 5954 - 6754 785 ## COG2979 Uncharacterized protein conserved in bacteria + Term 6837 - 6883 15.1 - Term 6886 - 6916 -0.6 8 7 Op 1 7/0.000 - CDS 7037 - 7879 655 ## COG0348 Polyferredoxin 9 7 Op 2 . - CDS 7876 - 8247 186 ## COG1145 Ferredoxin 10 8 Tu 1 . - CDS 8363 - 8686 433 ## - Term 8778 - 8813 0.2 11 9 Op 1 . - CDS 9003 - 11267 2940 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 12 9 Op 2 . - CDS 11279 - 11731 583 ## Ddes_0615 hypothetical protein 13 9 Op 3 . - CDS 11731 - 12309 515 ## COG3005 Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit - Prom 12385 - 12444 1.5 - Term 12493 - 12545 9.3 14 10 Tu 1 . - CDS 12742 - 14103 1742 ## COG1596 Periplasmic protein involved in polysaccharide export - Term 14415 - 14453 10.6 15 11 Tu 1 . - CDS 14474 - 14845 298 ## DvMF_1691 hypothetical protein 16 12 Op 1 16/0.000 - CDS 15311 - 15928 519 ## COG0311 Predicted glutamine amidotransferase involved in pyridoxine biosynthesis 17 12 Op 2 . - CDS 15933 - 16814 1256 ## COG0214 Pyridoxine biosynthesis enzyme - Prom 16968 - 17027 4.6 18 13 Tu 1 . - CDS 17183 - 19108 2410 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Prom 19356 - 19415 5.8 19 14 Tu 1 . + CDS 19497 - 20579 1212 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components + Term 20607 - 20651 12.1 + Prom 20991 - 21050 3.7 20 15 Tu 1 . + CDS 21143 - 22576 1541 ## COG0593 ATPase involved in DNA replication initiation + Term 22677 - 22715 2.1 - Term 22886 - 22942 12.4 21 16 Op 1 . - CDS 22966 - 24078 1064 ## COG1988 Predicted membrane-bound metal-dependent hydrolases - Term 24093 - 24139 13.1 22 16 Op 2 . - CDS 24341 - 25366 780 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation - Prom 25424 - 25483 5.4 + Prom 25541 - 25600 2.2 23 17 Op 1 . + CDS 25677 - 26714 1047 ## COG1304 L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases 24 17 Op 2 . 
+ CDS 26731 - 27603 1036 ## COG1284 Uncharacterized conserved protein + Term 27629 - 27661 0.8 25 18 Tu 1 . - CDS 27596 - 27784 63 ## 26 19 Op 1 . - CDS 27974 - 30988 4390 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain 27 19 Op 2 . - CDS 31046 - 32023 1023 ## COG0142 Geranylgeranyl pyrophosphate synthase 28 20 Tu 1 . + CDS 32181 - 32378 206 ## + Term 32475 - 32515 7.6 - Term 32463 - 32503 6.2 29 21 Tu 1 . - CDS 32526 - 33941 1971 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases - Prom 33961 - 34020 3.9 30 22 Tu 1 . - CDS 34061 - 34543 238 ## + Prom 34738 - 34797 3.5 31 23 Tu 1 . + CDS 34874 - 36298 479 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis + Term 36342 - 36375 4.5 - Term 36558 - 36597 10.0 32 24 Op 1 9/0.000 - CDS 36619 - 36978 305 ## COG1314 Preprotein translocase subunit SecG - Term 37054 - 37082 -1.0 33 24 Op 2 . - CDS 37170 - 37931 873 ## COG0149 Triosephosphate isomerase 34 24 Op 3 . - CDS 37954 - 38454 222 ## PROTEIN SUPPORTED gi|77918898|ref|YP_356713.1| ribosomal-protein-alanine acetyltransferase + Prom 38190 - 38249 1.6 35 25 Tu 1 . + CDS 38446 - 39024 191 ## DvMF_0346 NUDIX hydrolase 36 26 Op 1 . - CDS 38975 - 39976 835 ## COG1077 Actin-like ATPase involved in cell morphogenesis 37 26 Op 2 . - CDS 39981 - 41024 890 ## LI1136 hypothetical protein - Prom 41188 - 41247 4.0 + Prom 41585 - 41644 2.4 38 27 Tu 1 . + CDS 41669 - 42088 381 ## Ddes_0625 protein of unknown function DUF1090 + Term 42207 - 42246 3.4 + 5S_RRNA 42783 - 42841 93.0 # AE015927 [R:2797299..2798807] # 5S ribosomal RNA # Clostridium tetani E88 # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. 
Predicted protein(s) >gi|316921850|gb|ADCP01000139.1| GENE 1 17 - 604 651 195 aa, chain - ## HITS:1 COG:MA2295 KEGG:ns NR:ns ## COG: MA2295 COG1853 # Protein_GI_number: 20091133 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Methanosarcina acetivorans str.C2A # 1 185 1 185 188 191 48.0 5e-49 MFQSLGAKTFILPTPVFLVGTYDHTGRPNIMTAAWGGICASQPPSIAVSIRKSRWTYNAI LERKSFTINIPSRKLAAQSDFAGMHSGRDTDKFTALNLTPVPAEHVDAPGIAECPVIVEL TLSHTLELGSHTQFIGEIMDVKVDSACMREDGLPDPALIDPLLFAPLVQEYWAISQFEAR AFSAGHALTHSAVLG >gi|316921850|gb|ADCP01000139.1| GENE 2 1117 - 1329 69 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTCLSRIYNLDIRILHIHNVHTKRHVTRSRQRRVDCVEHFKTLFKPEVSPKLMWKNAAK VLGLETSEGG >gi|316921850|gb|ADCP01000139.1| GENE 3 1959 - 2453 389 164 aa, chain + ## HITS:1 COG:no KEGG:Dvul_0265 NR:ns ## KEGG: Dvul_0265 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 18 159 35 185 200 98 43.0 8e-20 MNQKHCLLVMAFAGVLSLAPLPAFSGNLSNFSDALEQLADMTRAADRAHDALNDAMQGGR DYSRYDRDWRNAESRLERARIRTMAHIAGVSESRIRSLRDDGYSWERIARKYKVDPRRFG YGQSRYDHDRDKWKGVPPGLAKKGGLPPGQAKKFKEHHDKKHRD >gi|316921850|gb|ADCP01000139.1| GENE 4 2508 - 3428 1100 306 aa, chain - ## HITS:1 COG:PM1698 KEGG:ns NR:ns ## COG: PM1698 COG0697 # Protein_GI_number: 15603563 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Pasteurella multocida # 10 290 8 291 301 131 33.0 2e-30 MPIFSHTVTGYLFALLATVVWSGNFVVARGLAGALSPVELSFWRWSIAFLTILPFAGRSL LRSLPLVRGTWGKVILMALLGITCFNTFIYQAGHTTDATNMSLLATASPIVMAAIAHLFL RERLSRFQFFGLCGALCGVIILVSRGRLGTLLGLRFAQGDLWMLLSVFLFAIYSLMLRCR PKAFPQKAFLALLIGIGVLGLIPPLLWQAADTGLSPLDGSILSALIYIGVGASVVSFLAW SLAIERIGMVRAGIIYNSIPLFASLEATLFLGESITLPQMIGGVLIIGGICYASFGDLYA ARRLLK >gi|316921850|gb|ADCP01000139.1| GENE 5 
3647 - 4198 398 183 aa, chain + ## HITS:1 COG:CAC2749 KEGG:ns NR:ns ## COG: CAC2749 COG0622 # Protein_GI_number: 15896006 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Clostridium acetobutylicum # 1 177 1 177 180 132 38.0 5e-31 MRLLVASDLHGSPEAAAFLRRRCEELLPDMLVLLGDYLYHGPRNPLPSSYGPPSVVSVFA DFDTPIVAVRGNCDAEVDLMLLPFAVEDSAVIAADGLRIVAQHGHHLPSCPPIPGVRPGD VVLSGHTHIPRGETVDGVHFWNPGSTTLPKGGFPASYGVFEAGAFRVFGLDGRLLLEHRP SAA >gi|316921850|gb|ADCP01000139.1| GENE 6 4497 - 5861 1562 454 aa, chain + ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 10 434 7 432 448 337 43.0 3e-92 MADTSATLLLRTEPIGKLLLRYSLPAIAAMVVFSLYNIIDSIFIGHGVGPLAISGLAITF PVMNLTFALVLLVGIGGASICSIRLGQQDIDGATRVLGNVLLLGLINGVTFGLLSQLVLD PVLTAFGASEHTLPYARDFMQIILYGLPVTCTMFGLNHIMRATGYPQKAMLSAVLTVGMN IILAPIFIFWLKWGIRGAAIATVLSQCVGMVWVLSHFRNPKSTVHFRQGTFRLRWKIVSS IFSIGMSPFLLNVCACLVTVLINIGLKRYGGDMAIGAFGILNRILILFVMLVMGLTQGMQ PIVGYNYGAQQFERVKQTLKYGVITGGLITTAGFLAGQFAPEIVSRMFTDDAGLIALSVE GMRLATLVFPLVGIQIVVGNFFQSIGKAKLSIFLSLTRQLLFLAPCLLILPRFFELKGIW ISLPVSDSLSFVTSMGVLYMFLREMRRVHHSREA >gi|316921850|gb|ADCP01000139.1| GENE 7 5954 - 6754 785 266 aa, chain + ## HITS:1 COG:PA3712 KEGG:ns NR:ns ## COG: PA3712 COG2979 # Protein_GI_number: 15598907 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 74 257 58 225 231 103 37.0 4e-22 MKDLFNQISNGLGGLGGLGDKLNGALGGQQKADAPSGQGTGNALGDMVKSAGGLGGLLGS AALGGLLGTLMSGKTARKVAKGALMVGGTAAVGALAWKFYQKWSGANGAAPQGQPAEPAP SAQPAPSGWPAELQQTPPPAVSADQTALLLLEAMVFAARADGHIDDKERANIHNAVESLF PGRDMAQVLDTLLNKPLDPASLASRIANHDEARDLYRLSAMIIDVDHFMERSYLDGLAAA LKIAPEEKAALDSEVEETKRTAIEPS >gi|316921850|gb|ADCP01000139.1| GENE 8 7037 - 7879 655 280 aa, chain - ## HITS:1 COG:STM2257 KEGG:ns NR:ns ## COG: STM2257 COG0348 # Protein_GI_number: 16765585 
# Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Salmonella typhimurium LT2 # 3 264 23 271 289 79 28.0 6e-15 MIRLLRLRRLVQVAVLLLLAALPWLNRDGFTGMKGSLFAFDFFGLPFADPLGAVQIGLSG IIPGWRLIAGALVVLAVALCLGRIFCSWFCPFGLLSELVHAIRGRRPQKPFKYGFWSRAV IALVGLALVPLLGFPLLNQLSLPGGLSLGFLTAAAPLSSKTIHTVLEHLLFFAVPFLPVI TVLAVELVTGERLWCRWVCPQSVLLALMARLPVAPRLRRTLSRCTCKGEPACRRACSLGL DPRNPRSASPLECTNCGACVLACSRVHGGGPAALALGKKD >gi|316921850|gb|ADCP01000139.1| GENE 9 7876 - 8247 186 123 aa, chain - ## HITS:1 COG:STM2258 KEGG:ns NR:ns ## COG: STM2258 COG1145 # Protein_GI_number: 16765586 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Salmonella typhimurium LT2 # 11 120 89 213 231 60 34.0 8e-10 MLETFGPERHTPEIRPSEIPCWLCMKCPPACPSGALRPVAAMKEANMGRAVIFKDRCLNW IESGTMCMTCYDRCPLRGEGMVLDMGYVPAVGESCVGCGMCEYVCPKNAVAVVPDVQKKG GKA >gi|316921850|gb|ADCP01000139.1| GENE 10 8363 - 8686 433 107 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVIAGITIQTTAEAYGTVRERLCASKDVADTQETGVPGMLAAVMETDAAYIEDALGALSG WDGVLNVGLVSISYEDELENKGYISCPEHKPRKCGPACFGETPNPLD >gi|316921850|gb|ADCP01000139.1| GENE 11 9003 - 11267 2940 754 aa, chain - ## HITS:1 COG:STM2259 KEGG:ns NR:ns ## COG: STM2259 COG0243 # Protein_GI_number: 16765587 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Salmonella typhimurium LT2 # 1 753 1 827 828 537 38.0 1e-152 MQMTRRDFLRSAAMSAALAAVAGPAGVARAAEANTPEKWVKGVCRYCGTGCGVLVGVKNG KAVAIKGDPNNHNQGFLCLKGALLIPVLNSPDRVTKPMLRRTKGGRLEPISWDEALDHMA KKFRTTIDTYGPDAVAWYGSGQCLTEESYLANKIFKAGFKTNNVDGNPRLCMASAVGGYT TTFGKDEPMGTYADIDRASAFFIIGSNTSEAHPVLFRRIARRKQNEPGVKIIVADPRRTN TSRIADLHIAFRPGTDLALMNGMAWVILHEELDNPRFYNKYAIFKTNDGKDATFDDYRAF LEDYTPDKVAKLCNIPEQQVWEAGRLFAESPATMSLWCMGINQRIRGVWANNLIHNLHLI TGQICRPGATSFSLTGQPNACGGVRDTGSLSHLLPAGRVVANKAHRNQMEAFWGIPQDSM SPNVGYHTIALFEALGKADVKAIIICETNPAHTLPNLNKVHKAMSNPDTFITVIEAFPDA VTLEYADLILPPAFWCERDGTYGCGERRYSLIEKAVEPPADCRPTVNTLIEFAKRAGVDP 
KLVNFRNSADVWDEWRRLSMKGPYNFGGMTRERMKRESGLIWPCPTETHPGTNLRYVRGE DPNIPADHPDKIWFYGNPSGKANIWMRPYKGAAEEPDQEYPFYLTSMRVIDHWHTATMTG KVPELLKANPAAFVEVNTEDASRLNIKNGDQVVVETRRDALTLPAHVNDTCKPGLIAVPF FDKKKLVNKLFLDATDPASREPEYKICAARIKKA >gi|316921850|gb|ADCP01000139.1| GENE 12 11279 - 11731 583 150 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0615 NR:ns ## KEGG: Ddes_0615 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 23 149 21 148 148 182 67.0 3e-45 MARLLKKPLIIALAAVILLVIGFFAYIQNAFTGTRCEAAKHLDADMIGDCYGCHLKVTPQ VAQDWYESKHGVTLVRCQVCHGQPDGKGAVPFKRVPGVEVCAACHGLAIDKMTALYGKRD DCSTCHPYHARPMHGKVYENRQATTKTVLE >gi|316921850|gb|ADCP01000139.1| GENE 13 11731 - 12309 515 192 aa, chain - ## HITS:1 COG:VC1951 KEGG:ns NR:ns ## COG: VC1951 COG3005 # Protein_GI_number: 15641953 # Func_class: C Energy production and conversion # Function: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit # Organism: Vibrio cholerae # 18 190 25 194 368 78 30.0 6e-15 MFFRSMKVWIAATVLVVLVLAGTAIGVHVTSSTSFCLSCHEMRVHQQELALSPHAQDARG NAIECVQCHIPSTNIVRMLSAKTWLGTKDLWVHATTGGSVTLNRREIQPEARRFMDDANC RACHEDLYRNAKNDGAISEYGRLAHDNYLDKNGSSRSGCAGCHRNIAHLPPEDRHYDANA AFASKLTFKEVR >gi|316921850|gb|ADCP01000139.1| GENE 14 12742 - 14103 1742 453 aa, chain - ## HITS:1 COG:Cj1444c KEGG:ns NR:ns ## COG: Cj1444c COG1596 # Protein_GI_number: 15792762 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Campylobacter jejuni # 4 453 104 552 552 268 36.0 1e-71 MDDILTVDPEGMITIPDVGRVPVAGLPIDAKEPLEQAIKSKLNAAGVVDVEMYIRPMDTQ PVSIFVTGFVPRPGNYTGTPSDTILAFLDKAGGIDSKRGSYRNIRVLRQGQEVGSFDLYP FALKGAMPHIRFQDGDTVVVGEKGPSVITSGEVRNTARFEFKPGELTGAKLIELADPQPR ATHVSLSGSRGGAPYNLYLPMNDFKALRLENGDQIRFLADRPGDTIMVEAQGAIRGASRF PVRRNTRLHEVMSYIAVEPGRANLEGLYIKRKSVAVRQKKAIEDALRRLEQNAYTATSAS SDEAQIRSKEAEMLSNFISRAKEVQPEGVVVVGTKGNIADLALEDGDVIVIPEKTDVVLI SGEVMMPQAIVWNKDKDMDDYIKGAGGFSNRADESNLIVVHPSGEVVPRAKEVLPGDQVL 
VLPRVESKNLQAVKDISQVLYQVAVACKVILDL >gi|316921850|gb|ADCP01000139.1| GENE 15 14474 - 14845 298 123 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1691 NR:ns ## KEGG: DvMF_1691 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 4 123 13 131 133 87 41.0 1e-16 MRFLQFCGLILLGGLVSCSWQSAPTPPPVEPQPLSPVAQFMVANATGATAVIEDETFGGE VRVTLEDAFLSAAGETCKRATLLSAQHEAEIVVICHNGQEGDGSTPGEWRLMPRVWGKGI STP >gi|316921850|gb|ADCP01000139.1| GENE 16 15311 - 15928 519 205 aa, chain - ## HITS:1 COG:SP1467 KEGG:ns NR:ns ## COG: SP1467 COG0311 # Protein_GI_number: 15901317 # Func_class: H Coenzyme transport and metabolism # Function: Predicted glutamine amidotransferase involved in pyridoxine biosynthesis # Organism: Streptococcus pneumoniae TIGR4 # 8 201 2 188 193 176 47.0 3e-44 MAATSAPRIGVLAIQGAFREHVRSLCLCGAETVEIRTREDLEGLSGLVFPGGESTVMGKF LIEWGMMDRVRELIRSGMPVFGTCAGLILLCSDILDHPGQPRIGLLNASVRRNAFGRQID SFTTPLELCPFPTASGNPTDQPPLEAVFIRAPLIEKIGPGVEVLARVNGFPVAVRQGNVL ATSFHPELTDDLRLHQWVVDMAGKN >gi|316921850|gb|ADCP01000139.1| GENE 17 15933 - 16814 1256 293 aa, chain - ## HITS:1 COG:BS_yaaD KEGG:ns NR:ns ## COG: BS_yaaD COG0214 # Protein_GI_number: 16077079 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxine biosynthesis enzyme # Organism: Bacillus subtilis # 2 293 3 294 294 405 73.0 1e-113 MEKGTFQLKSGLAEMLKGGVIMDVTTPEQAKIAQEAGACAVMALERVPADIRKAGGVARM ADPTIVLRIMDAVTIPVMAKARIGHIVEARILESLGVDYIDESEVLTPADDRFHIDKRDY TVPFVCGARNLGEALRRIAEGASMIRTKGEPGTGNVVEAVRHCRMVLGEVRALCALPEEE VANFAKENGAPLELVLKIRAEGRLPVVNFAAGGIATPADAALMMQLGLDGVFVGSGIFKS TDPAKRAKAIVAAVTHYNDYKILAEVSRDLGEAMPGLEISTIAPEQRMQERGW >gi|316921850|gb|ADCP01000139.1| GENE 18 17183 - 19108 2410 641 aa, chain - ## HITS:1 COG:YPO1419 KEGG:ns NR:ns ## COG: YPO1419 COG0488 # Protein_GI_number: 16121699 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Yersinia pestis # 1 628 1 632 637 533 47.0 1e-151 
MALLSLQGVTLNLGGKTLLDTADLHVEQGERVCLVGRNGVGKSSLMALLSGDLTPDGGTI VRTPGVTFGHMPQAVPDHWSGPVFGVVASGLGKEGEALAAAHLIATGREDQLSAAKLALA RDLLQSGEGWERHGDILEAINQLGLDPEADFATLSGGNKRRVALARALVASENLLLDEPT NHLDIKTIAWLEDFLVRRVKTFVFISHDRAFVRRLATRIVEVDRSQLYSYACTFDQFLER REERLDAEEKQAAAFDKKLAQEEAWIRQGIKARRTRNMGRVRALQAMRSERMARVSRQGV VSMLAQEAERSGKLVIEAKQAGFAYPDGYRVFNAFSTIIQRGDRIGLIGNNGVGKTTLLR LLLGELEPTEGSIRHGTRLEVSYFDQLRSSLDPEKSVMDNVANGNDTVTINGQQRHVASY LMDFLFESDRLRVPVHTLSGGERNRLLLAKLFTMPSNLLVLDEPTNDLDVETLELLEELL ASYSGTVLMVSHDREFLDNLVTTTLALEGDGQVREYVGGYTDWLRQRPEPVAAHTDKPKP KPLAPPAKIGPRKLTFKEQRELQMLKDELEALPGKMAALEDEQHTLEDRLNDPGFFARDP DGFNATAKRISELDDEQTAALQRWEDVEFRIAELEGKAEDK >gi|316921850|gb|ADCP01000139.1| GENE 19 19497 - 20579 1212 360 aa, chain + ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 10 338 8 303 332 92 25.0 2e-18 MKVRNPFAGLLALFFVVAWCLSGSAAFAKPLVPLRTAWLGEHETFLVWYAREKGWDKAEG LDLELLPFESGKNVIDEMQSSNWAIAGVGAMPALTASLSSRLYIVGIGNDESASNAIFTR ADSPILKMKGFNPNCPEVYGNPGSVRGKTFIVPKGTSAHYMLSRWLHVLGLNERDVNIVD MQPSEAMKAFADGQGDAIALWAPQTFEAEKLGLKTVAHSSDCNARQPILLVANMDYANKH KGDIVAFLRVYLRSVDMMKKTPAEDLADDYMRFYNAWTGKTMTREEAIRDIKDHPVYALD QQLDMFKQSYGTSELREWLHDIVSFQNETGELDRRDLARLERLHYVTDIYLKAVKAPAKK >gi|316921850|gb|ADCP01000139.1| GENE 20 21143 - 22576 1541 477 aa, chain + ## HITS:1 COG:VC0012 KEGG:ns NR:ns ## COG: VC0012 COG0593 # Protein_GI_number: 15640044 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Vibrio cholerae # 5 461 11 456 472 230 33.0 6e-60 MKTQWLKICEDLQNRLNPGTYKVWVAPLTADLEGEGKIRLSAPNGFVATWVRDRLLNDIS DAASAIFGRTMEISVVAGNPPAKPSRSVPGRPAVSVEGEPAPARRPRTAPAPRPVAAPSL LSSAAEQLSLPITMPVNQSVPHNWRYAFDSFVVGPTNDMAYAAARNMARSGAAVDTLFLS SGPGLGKTHLTQAVGQALCEASNRSNPKVEYLTAEEFSSCFVQALQSRTVDRFKGRFRDV 
DLLLLEDVHFLQGKEKMQDEVLSTIKSLQEKGSRVVLTSSFAPCELRNVDNSLVSRFCSG FLAGIEKPDASTRRRILQEKARQNNALLSDTVVDVLTERLTGDIRQLESCVHNLLLKAKL LGCTISVEMAQEILAQYSLDDPFVDVDSIIRKVCEGFGLSPEQLASRSRKQNLVVARNTI FYLARKHTELSLQDIGDKFSRRHSTVLKGIASVERELRRESPLGRQIAGTLALLERR >gi|316921850|gb|ADCP01000139.1| GENE 21 22966 - 24078 1064 370 aa, chain - ## HITS:1 COG:PM0419 KEGG:ns NR:ns ## COG: PM0419 COG1988 # Protein_GI_number: 15602284 # Func_class: R General function prediction only # Function: Predicted membrane-bound metal-dependent hydrolases # Organism: Pasteurella multocida # 36 205 37 196 345 61 29.0 3e-09 MEPITHALSGAVIGYALPGKRRWWLPAWAALVAASPDMDVFFTRTPLQYIEYHRGITHSF VGGLGLALLLALIPWLMNRWRVPKVPDDSPPFGWPLPLAWLTAYLLILHHIFLDCMNSYG TQVFLPFSDYRVRLNALFIVDPLLLLPLALGLIFWRKRRAVMIGLLLWTILYPLGSLATR VGLEAHLKDSHYTPDVFTQELLSEDLAPGQRGKGLWDDVRAVHLVPDAFTPFHWKLILDR GSVWDVAGYTVFTEEPQTFIAYAKPPQPLWAELAAKDRMFRAYERFALYPALEAEIPLEG PLEGYTEYVFSDLRFGSTIGWVDDIQVRRHGKPTTFRIMARVAPDGRVSAVRFITTTGAG GDSGWNPPVE >gi|316921850|gb|ADCP01000139.1| GENE 22 24341 - 25366 780 341 aa, chain - ## HITS:1 COG:NMB0352_1 KEGG:ns NR:ns ## COG: NMB0352_1 COG0794 # Protein_GI_number: 15676267 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Neisseria meningitidis MC58 # 18 218 11 205 207 201 54.0 2e-51 MTQSVRSHTGPAALIKLAQDVLNTEIAGLAAVRDRLNARTPEKAPLVLALGLLAACTGRV VVTGIGKSGLVGRKLAATFSSTGTPAFFLHPVEGAHGDLGSLRKDDVIIAISNSGETAEL NAILPALKSLGTSLIAMTGREDSTLGRLADVTLHSGVPREACPHGLAPTASTTAVLALGD ALAVCLMQLKSFTEKDFLRYHPGGSLGQRLKLNVSEVMRTEGLPQLSETSLLSEALRQLD QGKLGAVLLLDNEHRISGILTDGDVRRAVCRGTLNPEAPVSTVMTPSPRCGTQSDTVATL LELMESKAITVLPIADEERRLLGMVHMHDLLGQGSVAFSQH >gi|316921850|gb|ADCP01000139.1| GENE 23 25677 - 26714 1047 345 aa, chain + ## HITS:1 COG:mll4732 KEGG:ns NR:ns ## COG: mll4732 COG1304 # Protein_GI_number: 13473966 # Func_class: C Energy production and conversion # Function: L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases 
# Organism: Mesorhizobium loti # 37 339 33 351 352 120 31.0 3e-27 MDIKEVRANAKERMKGFCRVCPVCDGRACAGETPGMGGIGTGSAFKNNVAALAGVRLNMR LIHEVKKPVTETEVLGFRLRLPVLAAPIGGTSFNMGGALSEAEYARAIVSGCREAGIVGC VGDGAPDELHEAGNAAITAEGGWGIPFIKPWEGEELERKMCRAAATGTRVLGMDIDAAGL IALARMGRPVSPKTCAELAAIVDRSHELGMKFVLKGIMTVEDAIAAERAGCDGIVVSNHG GRALDHTPGTIEVLPAIAAQVKGRMAVLMDGGIRDGLDVLKALAFGADAVLIGRPFCLAA VGGGSEGVKLTADHLYNQLVRSMVLTGCPSVREAGRHLVSTAPLA >gi|316921850|gb|ADCP01000139.1| GENE 24 26731 - 27603 1036 290 aa, chain + ## HITS:1 COG:BH1678 KEGG:ns NR:ns ## COG: BH1678 COG1284 # Protein_GI_number: 15614241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 11 281 1 276 290 119 28.0 6e-27 MKVSLPQRNKLAQSMLWNLFLLTVGAVFFSLGAQAVAAHHGFLTGGIFGAGLLGWYTTGL FTPAIWYLLINIPLFFVSWFQIGRQFFFYSLYGSLATTIISELITFQVPIQDHFYAAITA GVLCGTGTGLMLRTLGSGGGLDLIGVILNKRWNIGIGRFNFCFNAVMFLASMATISLDMV IVSFIQVFVASATIEYALRMGSQRKMVYVVTDRGEALCEAIIAEGYGGATVLKGKGAYSG GDREIILTVTNNIMLRRLENLVFAIDDKALFIVENTFYVSGTRYPRKSII >gi|316921850|gb|ADCP01000139.1| GENE 25 27596 - 27784 63 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MASGKWFSSGAKQGDSGKAATRVTATIICHYKRLRRMEVWFGEEEGPLLKKGIFLPKAGS PK >gi|316921850|gb|ADCP01000139.1| GENE 26 27974 - 30988 4390 1004 aa, chain - ## HITS:1 COG:AF1940 KEGG:ns NR:ns ## COG: AF1940 COG0046 # Protein_GI_number: 11499524 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Archaeoglobus fulgidus # 180 971 10 732 765 378 34.0 1e-104 MLYRIEAGLFPHLDDTIGRKTAASIREALGIPVKSVRTIKVFTLEGLDAQQVNTLMAKSV LHDPVLQAVSLSPLPLPEEGTSWIIEVGYRPGVTDNEGRTARDTAALVLGIEDRSSLAVY TSIQYRLHGPLTEEQVTRIARDLLANELIQRFEIKSRAQWDADPGFAPKAARVTGASCDE VNIVPLSSMSDAQLMEVSQKNTLALSLEEMTKIKNWFSSPEVAAQRKELGLPADPTDAEL EVLAQTWSEHCKHKIFASKISYTDKEAGTSETINSLYKTCIMNTTKQIRESLGDRDFCRS VFKDNAGVIAFTPEYDACIKVETHNSPSALDPYGGALTGIVGVNRDPMGTGLGAELICNT DVFCFASPFHDEALPPRLLHPRRVFEGVREGVEHGGNKSGIPTVNGSIVFDERYLGKPLV 
YCGTIGLIPAEVPASKSSTDGKPRKGYEKKAQVGDIIVMTGGRIGKDGIHGATFSSEELH EGSPATAVQIGDPITQRKMYDFIMRARDLGLYNAITDNGAGGLSSSIGEMAEDTNGCRID LARAPLKYDGLRPWEILLSEAQERMTLAVPPAALDEFLALAKRMDVEATPLGEFTDDGVF HVTYNGKLVTHLDMEFMHDGVPQYQLNAVWEAPVHPDSRIEDGGVDQAELLKAMLGRLNI CSKEYLIRQYDHEVKGGSVVKPLIGVKRDGPSDAGVIRPILESEAGLVISHGICPKYSDY DAYWMMANAMDEGIRNAVAVGADPDRMAGVDNFCWCDPVQSEKTPDGEYKLAQLVRACKA LSELCVAYGVPCISGKDSMKNDYTGGGKKISIPPTVLFSVLGIMDDVKKAVTSDFKKPGD RLYIIGNTARELAGSEIADQLGIACNTVPRVDAKSALASYRKLHQAINSRLISACHDLSD GGLAVSLAEMCIGGRLGAHLALNRVPVIGALTLTEALYSESASRLLVSVAPDKAAEFEAL FGASAPFIGEVTSDARLTVASADSALFSAPVEALAHAFKATLDW >gi|316921850|gb|ADCP01000139.1| GENE 27 31046 - 32023 1023 325 aa, chain - ## HITS:1 COG:MK0774 KEGG:ns NR:ns ## COG: MK0774 COG0142 # Protein_GI_number: 20094211 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Methanopyrus kandleri AV19 # 41 324 43 323 324 158 34.0 2e-38 MITLRTFIEQEQPRINAALAEAAAELPVSIMPIAQYAMGNGGKRLRPLLTVLVARLLGYA DNDIYPLAAAMEMFHVATLLHDDVMDNAEMRRGSTATHKVFGVTETILAGDALLAKGNQI VAGFGDPRLTAATSDAIAQTANGEILEIANQGKNAPDLLIYNQIIAGKTAWMIRTSCNIG AVRANGTPEQVAGATDYGFNLGMAFQIVDDALDFAPSSLTGKPEGGDVREGKLTPPIFFY ANSLAPAERDAFFARFAEQSFTDEDVRHVIDTVRAQGFDDKTRGIADTYLRKAQENLDAL TAGLPASPYGEILTSFIGYVRNRDA >gi|316921850|gb|ADCP01000139.1| GENE 28 32181 - 32378 206 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEFNSKLFFSALGLAFILEALPYTLFPERMRQVLQTLGEEGASGLRRMGLFSLAAGLIVL WLTLG >gi|316921850|gb|ADCP01000139.1| GENE 29 32526 - 33941 1971 471 aa, chain - ## HITS:1 COG:NMA0753 KEGG:ns NR:ns ## COG: NMA0753 COG1055 # Protein_GI_number: 15793728 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # Organism: Neisseria meningitidis Z2491 # 16 471 11 469 473 468 57.0 1e-131 MKRLFAMFMALALTALVPLSAQASGGHDLNGAELSILWVVPFVCMLLSIAIGPLAVPHFW HHHFGKVALFWGLAFLVPCAMVFGLQLAVYQFVHVLFMEYVPFIILLFALFTIAGGVRLK GQLVGTPVVNAGLLALGTILASWMGTTGAAMLLIRPLLRANEHRKYRMHIVIFFIFLVAN 
IGGSLTPLGDPPLFLGFLKGVSFFWTTVHLFCKTVLLSVILLGIFFLLDTFLFSKEGKPK PATQSNEKLGLDGKVNLLLLIGVVISVLMSGIWKPESGFEIYGTHIELQNAARDVILLAL AGASLALTTKECRKLNNFTWEPILEVAKLFIGIFISMIPAIAILRAGTDGALAGVINMVF HEGQPVNAMFFWLTGILSSFLDNAPTYLVFFNTAGGDAVHLMTDWTETLSAISAGAVFMG AVTYIGNAPNFMVRSIAEDQGVKMPSFFGYMLWSVGILFPCFALITYLFFM >gi|316921850|gb|ADCP01000139.1| GENE 30 34061 - 34543 238 160 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGAWHLYLWSPLPGLVAIGITPVLAIFFFFRGLNLVSRTLPYWKTRRLIRKLGMRPTWWN TGAGYLLIDERQGSWIINGTAGMIVDIKRLHGHSDWQMHRLDLYTTDTPKPTASYGFGSA EEIREAAKIFQKAYAPQEKRDLPVTFADLRKKENKASEAH >gi|316921850|gb|ADCP01000139.1| GENE 31 34874 - 36298 479 474 aa, chain + ## HITS:1 COG:HI0872 KEGG:ns NR:ns ## COG: HI0872 COG2148 # Protein_GI_number: 16272813 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Haemophilus influenzae # 98 474 107 471 471 238 38.0 2e-62 MCSKRSLALLAADCLALFGTMFLIGVIRVLLGGEISLSQHVPLALFLLVVPAVNAFEGLY SGVPPALPEEMRLLGISTSIAYFSIAIFLFLGRGDLPSRTVYVWSWFLSLGTVPLVRCAL RSRFSREAWWRVPTVVFGAGELVPRVTEYLNKHTEAGLLLKAHFVSQKCTEAEKEQKNTP QDTYSGLEPLYSRSDLDTFVRLYPRTCAIVVMEKDAPLACRQELIDLASLLFSSVIIIPE DFSEGEIPFWVRPVEIGDVLCFKVRQNLLDPKRLSLKRAMDLFLSVVGGIAIFPVLVLIA LAIKLESRGPVFFRQNRIGRGGQTIHILKFRTMVRNAEEVLQTYLRENPDLREEWEADQK LRNDPRITKVGAWLRKTSLDELPQLWNVVWGEMSLVGPRPIVDDEIVKYGSAFISYTRVR PGMTGLWQVSGRNDLSYKQRVHLDRFYICNWSTWLDILILAKTFPVVLGRKGAY >gi|316921850|gb|ADCP01000139.1| GENE 32 36619 - 36978 305 119 aa, chain - ## HITS:1 COG:STM3293 KEGG:ns NR:ns ## COG: STM3293 COG1314 # Protein_GI_number: 16766589 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecG # Organism: Salmonella typhimurium LT2 # 1 112 1 109 110 58 36.0 3e-09 MQTLILSLHVIVCVTLVILVLLQAGKEGMGVIFGGGNASVFGSSGAGGLLAKMTAFMAVL FIVTSLSYNYVSSSHKAQESTILDVRLEDTPKPAAPAATVPATPAEKPAEAPAADKPAN >gi|316921850|gb|ADCP01000139.1| GENE 33 37170 - 37931 873 253 aa, chain - ## HITS:1 COG:aq_360 KEGG:ns 
NR:ns ## COG: aq_360 COG0149 # Protein_GI_number: 15605869 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Aquifex aeolicus # 1 248 1 244 247 216 45.0 3e-56 MRTIIAANWKMNKTRTEAKATAHELASILGGAPDDRDVVVFAPFTALEATREGLDGAPGI FVGAQNMYPAESGAFTGEISPAMLREAGCTWVLTGHSERRHVLGESPDFVGKKTAFALEQ GFSVILCIGETLAERKAGKLNSILWLQLERGLAGLGPNADYSRLAVAYEPVWAIGTGQTA SDEDIRMVHTLVRELLSSLLGSNGIGIQILYGGSVKPDNAKAILSLDNVDGLLVGGASLQ AQSFADIVRADLA >gi|316921850|gb|ADCP01000139.1| GENE 34 37954 - 38454 222 166 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|77918898|ref|YP_356713.1| ribosomal-protein-alanine acetyltransferase [Pelobacter carbinolicus DSM 2380] # 2 154 1 146 154 90 37 2e-17 MMHDSPKRAVRCVRLAIEDAPALSSLESRCFALPWTEQAFAEAFAGKTFHAFGIRETASP LSPRELAGYIAVYHTPDEVEILNIAVREDRRRHGYGRLLMATALQDAKETGILQGVLEVR ISNAPAIHLYESFGFRQAGKRRGYYQDTGEDALIYTLDMGQTRTNP >gi|316921850|gb|ADCP01000139.1| GENE 35 38446 - 39024 191 192 aa, chain + ## HITS:1 COG:no KEGG:DvMF_0346 NR:ns ## KEGG: DvMF_0346 # Name: not_defined # Def: NUDIX hydrolase # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 17 171 4 165 182 91 36.0 2e-17 MHHTIVPEPCENEEPLIEYVDERDRPLLIAPSSDAFPLRKKAVCAVLCDRRGRAFVRRRM SEEGMETWDISAETFVRSGEAREEAVERAVRETLGLSNVPSGVAVSAAFKPAGETVSLTL FIAELPGGLPPVSLPEGHFLDREELEGMTETFPDLFSPALLWAIQTGCLWKHRGPRVSTR SINPCTFRDRPV >gi|316921850|gb|ADCP01000139.1| GENE 36 38975 - 39976 835 333 aa, chain - ## HITS:1 COG:TM1544 KEGG:ns NR:ns ## COG: TM1544 COG1077 # Protein_GI_number: 15644292 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Thermotoga maritima # 1 322 1 323 336 289 48.0 5e-78 MLRKHIAVDLGTANTLVFVQSQGIVINEPSVVAVDTINERVIAVGKEAKDYIGKTPQRIS TIRPLRDGVIADFDAAQALIATFLKKVIGSWPIKPDVVICVPTNITDVERRAVAEAATKA GARNVKLIEEPVAAAVGAGLPVLDPVGSMVVDIGGGTTDIAVITLGSMACSTSLKLAGDA LTAAVQRFVREKYQIVVGENMAERIKIALGSVAPLPVPLSFEVSGKDLATASPRTIALSD 
TDTREAFQPITEKLVSAVMGILEVTPPELGADIMRQGMLLTGGGALLRGFADRITRATRI PVYVDSDPLTTVLRGAGITLDDPEKYRDLLIEC >gi|316921850|gb|ADCP01000139.1| GENE 37 39981 - 41024 890 347 aa, chain - ## HITS:1 COG:no KEGG:LI1136 NR:ns ## KEGG: LI1136 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 346 1 344 344 388 51.0 1e-106 MATLHDNDNTLLALVRTTFDAHSAVLFLPDQSGNYTVALSSTDNEPSVQEVTIAPGKGLV GWILRHKQPVIVNNLDMRHTFLGYYDENDEGAISAFMGCHIPEGGALCVDSVRPRAYTEE DQLLLHRFARHLARQVHSAGLACDAEDLRRYFTRLEQLSELSAQNPHWRDYLGAFLRLMA ESTEFEYVAFATAQEGGSTYTVEGENTPLLITEDGMPELPLTSGGLVSWVFRNEVPVHAE GADGSPSTPLFGKLPGVPNFQSVMCLPVHLNKVTCGVLCFAGLNPRSLSQNLRTFTKIAV TYLAQYLEMLYLRHRLKSLLPRAKVHRDGAMAYDPDTAPSAPMSEED >gi|316921850|gb|ADCP01000139.1| GENE 38 41669 - 42088 381 139 aa, chain + ## HITS:1 COG:no KEGG:Ddes_0625 NR:ns ## KEGG: Ddes_0625 # Name: not_defined # Def: protein of unknown function DUF1090 # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 33 137 28 132 136 77 48.0 1e-13 MGERLLKTAVLSATLGIALIAFSGMSLASDDFAGGCKRKEQAIETKMEYARKHGNQEQLR GLEEALSQVRRWCSDGTLRSKAELNILEKQDKVKERQAELDKAVTEGKGEKKIAKLQRKL GEAKEELAQAVAKRDALAQ Prediction of potential genes in microbial genomes Time: Fri May 13 04:37:48 2011 Seq name: gi|316921772|gb|ADCP01000140.1| Bilophila wadsworthia 3_1_6 cont1.140, whole genome shotgun sequence Length of sequence - 93892 bp Number of predicted genes - 83, with homology - 73 Number of transcription units - 39, operones - 15 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + 5S_RRNA 249 - 348 93.0 # CP001197 [R:1104568..1104682] # 5S ribosomal RNA # Desulfovibrio vulgaris str. 'Miyazaki F' # Bacteria; Proteobacteria; Deltaproteobacteria; Desulfovibrionales; Desulfovibrionaceae; Desulfovibrio. 1 1 Tu 1 . - CDS 623 - 1324 405 ## + Prom 1281 - 1340 5.9 2 2 Op 1 . + CDS 1511 - 1750 94 ## 3 2 Op 2 40/0.000 + CDS 1831 - 2868 1276 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit + Term 2928 - 2969 1.0 4 2 Op 3 . 
+ CDS 3051 - 5450 2865 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit 5 2 Op 4 . + CDS 5527 - 5877 214 ## COG0789 Predicted transcriptional regulators 6 2 Op 5 1/0.222 + CDS 5884 - 6555 707 ## COG0036 Pentose-5-phosphate-3-epimerase 7 2 Op 6 . + CDS 6573 - 8567 2483 ## COG0021 Transketolase + Term 8603 - 8638 1.9 + Prom 8614 - 8673 3.4 8 3 Op 1 33/0.000 + CDS 8820 - 10121 982 ## COG0336 tRNA-(guanine-N1)-methyltransferase + Term 10127 - 10173 -0.8 9 3 Op 2 3/0.000 + CDS 10298 - 10645 485 ## PROTEIN SUPPORTED gi|94986665|ref|YP_594598.1| 50S ribosomal protein L19 + Term 10646 - 10681 -0.5 + Prom 10688 - 10747 2.6 10 3 Op 3 1/0.222 + CDS 10779 - 11492 624 ## COG0164 Ribonuclease HII 11 3 Op 4 4/0.000 + CDS 11489 - 11890 196 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 12 3 Op 5 . + CDS 11832 - 12665 689 ## COG0313 Predicted methyltransferases 13 3 Op 6 . + CDS 12669 - 13130 465 ## COG0691 tmRNA-binding protein + Term 13158 - 13204 2.1 + TRNA 13354 - 13429 81.9 # Ala CGC 0 0 + Prom 13356 - 13415 78.1 14 4 Tu 1 . + CDS 13635 - 13877 196 ## + Term 13896 - 13955 25.7 - Term 13892 - 13935 9.3 15 5 Tu 1 . - CDS 13975 - 16449 2625 ## COG2199 FOG: GGDEF domain - Prom 16538 - 16597 3.1 - Term 16669 - 16705 9.6 16 6 Tu 1 . - CDS 16710 - 16880 184 ## - Prom 17030 - 17089 1.8 + Prom 17009 - 17068 2.7 17 7 Tu 1 . + CDS 17113 - 17562 454 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 17572 - 17609 6.1 - Term 17559 - 17597 2.5 18 8 Tu 1 . - CDS 17624 - 18373 872 ## COG1226 Kef-type K+ transport systems, predicted NAD-binding component - Prom 18475 - 18534 3.7 + Prom 18602 - 18661 2.5 19 9 Tu 1 . + CDS 18756 - 21380 2208 ## COG2199 FOG: GGDEF domain + Prom 21392 - 21451 2.6 20 10 Op 1 . 
+ CDS 21621 - 21932 211 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism 21 10 Op 2 26/0.000 + CDS 21996 - 22469 405 ## COG1585 Membrane protein implicated in regulation of membrane protease activity 22 10 Op 3 . + CDS 22494 - 23426 1199 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs + Term 23438 - 23479 4.8 + Prom 23532 - 23591 3.8 23 11 Tu 1 . + CDS 23646 - 24317 763 ## COG2846 Regulator of cell morphogenesis and NO signaling + TRNA 24535 - 24621 59.4 # Leu CAG 0 0 24 12 Op 1 . + CDS 24886 - 26085 417 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 25 12 Op 2 . + CDS 26100 - 26288 193 ## + Term 26386 - 26418 0.7 + Prom 26333 - 26392 2.7 26 13 Tu 1 . + CDS 26419 - 27396 283 ## gi|302862827|gb|EFL85759.1| hypothetical protein HMPREF0326_01462 27 14 Tu 1 . + CDS 27520 - 27735 143 ## XCV2219 hypothetical protein 28 15 Op 1 . + CDS 28363 - 28713 219 ## cce_5305 hypothetical protein 29 15 Op 2 . + CDS 28707 - 29921 396 ## DMR_07200 hypothetical protein 30 15 Op 3 . + CDS 29934 - 30305 337 ## 31 15 Op 4 . + CDS 30308 - 31012 662 ## Bxe_A2574 hypothetical protein 32 15 Op 5 . + CDS 31009 - 31611 487 ## gi|296163053|ref|ZP_06845827.1| conserved hypothetical protein + Term 31617 - 31650 6.1 - Term 31605 - 31638 6.1 33 16 Tu 1 . - CDS 31696 - 31986 347 ## BDP_1620 hypothetical protein - Prom 32034 - 32093 2.4 + Prom 32088 - 32147 3.5 34 17 Tu 1 . + CDS 32178 - 32561 232 ## COG1733 Predicted transcriptional regulators - Term 32416 - 32454 -0.5 35 18 Tu 1 . - CDS 32564 - 34039 473 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 34252 - 34311 11.5 + Prom 34224 - 34283 6.1 36 19 Op 1 11/0.000 + CDS 34442 - 36889 2857 ## COG1882 Pyruvate-formate lyase + Term 36906 - 36939 6.1 37 19 Op 2 . + CDS 37028 - 37891 609 ## COG1180 Pyruvate-formate lyase-activating enzyme 38 19 Op 3 . 
+ CDS 37912 - 38025 111 ## + Term 38073 - 38119 -0.9 + Prom 38109 - 38168 2.0 39 19 Op 4 . + CDS 38207 - 39460 917 ## COG0477 Permeases of the major facilitator superfamily + Term 39465 - 39512 13.2 40 20 Tu 1 . + CDS 39876 - 40328 114 ## COG0778 Nitroreductase + Term 40557 - 40588 -0.1 41 21 Tu 1 . - CDS 40834 - 40965 58 ## - Prom 41215 - 41274 6.8 + Prom 41506 - 41565 2.7 42 22 Op 1 5/0.000 + CDS 41592 - 44861 3056 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 43 22 Op 2 27/0.000 + CDS 44858 - 46276 1511 ## COG0286 Type I restriction-modification system methyltransferase subunit 44 22 Op 3 . + CDS 46276 - 47628 284 ## COG0732 Restriction endonuclease S subunits 45 22 Op 4 . + CDS 47687 - 48514 188 ## Smlt0297 hypothetical protein + Term 48695 - 48736 10.2 - Term 48683 - 48723 6.2 46 23 Tu 1 . - CDS 48742 - 48981 137 ## - Prom 49086 - 49145 2.5 + Prom 49090 - 49149 2.5 47 24 Op 1 . + CDS 49280 - 49723 470 ## COG1846 Transcriptional regulators 48 24 Op 2 4/0.000 + CDS 49775 - 50395 574 ## COG0655 Multimeric flavodoxin WrbA + Prom 50634 - 50693 2.1 49 24 Op 3 . + CDS 50735 - 51967 1202 ## COG0477 Permeases of the major facilitator superfamily + Term 52004 - 52048 -1.0 + Prom 52091 - 52150 4.9 50 25 Op 1 15/0.000 + CDS 52286 - 53104 302 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 51 25 Op 2 8/0.000 + CDS 53127 - 54092 1208 ## COG3221 ABC-type phosphate/phosphonate transport system, periplasmic component + Term 54102 - 54137 7.1 52 25 Op 3 . + CDS 54171 - 54962 1110 ## COG3639 ABC-type phosphate/phosphonate transport system, permease component 53 25 Op 4 . 
+ CDS 54962 - 55687 580 ## Dbac_0825 protein of unknown function DUF1045 54 25 Op 5 9/0.000 + CDS 55687 - 56127 459 ## COG3624 Uncharacterized enzyme of phosphonate metabolism 55 25 Op 6 8/0.000 + CDS 56127 - 56729 499 ## COG3625 Uncharacterized enzyme of phosphonate metabolism 56 25 Op 7 8/0.000 + CDS 56720 - 57829 1198 ## COG3626 Uncharacterized enzyme of phosphonate metabolism 57 25 Op 8 7/0.000 + CDS 57830 - 58714 1091 ## COG3627 Uncharacterized enzyme of phosphonate metabolism 58 25 Op 9 7/0.000 + CDS 58711 - 59496 359 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 59 25 Op 10 6/0.000 + CDS 59500 - 60189 214 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 60 25 Op 11 7/0.000 + CDS 60216 - 61355 1325 ## COG3454 Metal-dependent hydrolase involved in phosphonate metabolism 61 25 Op 12 . + CDS 61355 - 61927 334 ## COG3709 Uncharacterized component of phosphonate metabolism + Term 61974 - 62011 5.7 - Term 61967 - 61994 0.1 62 26 Tu 1 . - CDS 62110 - 63117 945 ## COG0500 SAM-dependent methyltransferases + Prom 63473 - 63532 6.1 63 27 Op 1 6/0.000 + CDS 63751 - 66684 2517 ## COG2200 FOG: EAL domain + Term 66749 - 66793 15.5 + Prom 66695 - 66754 3.3 64 27 Op 2 . + CDS 66951 - 68405 1115 ## COG2199 FOG: GGDEF domain + Term 68436 - 68460 -1.0 65 28 Tu 1 . + CDS 68595 - 69908 896 ## Ddes_0525 4Fe-4S ferredoxin iron-sulfur binding domain protein + Term 69947 - 69986 4.0 - Term 70050 - 70089 7.3 66 29 Op 1 . - CDS 70127 - 71542 2023 ## amb1862 hypothetical protein 67 29 Op 2 . - CDS 71656 - 73059 1811 ## Ddes_2094 sodium/sulphate symporter - Prom 73109 - 73168 2.5 + Prom 73113 - 73172 2.8 68 30 Tu 1 . + CDS 73329 - 74012 663 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 74079 - 74119 13.6 - Term 74067 - 74107 13.6 69 31 Op 1 . - CDS 74125 - 76071 1810 ## LI0080 hypothetical protein - Term 76140 - 76168 -0.3 70 31 Op 2 . 
- CDS 76174 - 76827 700 ## COG1802 Transcriptional regulators - Prom 77011 - 77070 2.1 - Term 77034 - 77081 6.3 71 32 Op 1 3/0.000 - CDS 77150 - 78247 1364 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain 72 32 Op 2 16/0.000 - CDS 78234 - 79325 962 ## COG0784 FOG: CheY-like receiver 73 32 Op 3 . - CDS 79346 - 82552 3163 ## COG0642 Signal transduction histidine kinase - Prom 82654 - 82713 6.1 + Prom 82613 - 82672 3.9 74 33 Tu 1 . + CDS 82723 - 83595 891 ## COG0010 Arginase/agmatinase/formimionoglutamate hydrolase, arginase family + Term 83732 - 83784 19.1 - Term 83628 - 83666 6.2 75 34 Op 1 . - CDS 83784 - 84581 789 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 76 34 Op 2 . - CDS 84591 - 85814 1439 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase 77 34 Op 3 . - CDS 85850 - 86389 747 ## COG2065 Pyrimidine operon attenuation protein/uracil phosphoribosyltransferase 78 35 Tu 1 . + CDS 86762 - 87100 370 ## DVU0359 HesB-like domain-containing protein + Term 87137 - 87183 2.2 + Prom 87196 - 87255 2.6 79 36 Tu 1 . + CDS 87330 - 87428 115 ## - Term 87673 - 87708 8.1 80 37 Tu 1 . - CDS 87889 - 88653 1148 ## COG0623 Enoyl-[acyl-carrier-protein] reductase (NADH) - Term 88719 - 88753 1.6 81 38 Op 1 . - CDS 88977 - 89873 1340 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase 82 38 Op 2 . - CDS 89917 - 91218 1627 ## COG0141 Histidinol dehydrogenase - Prom 91292 - 91351 1.7 + Prom 91373 - 91432 2.0 83 39 Tu 1 . 
+ CDS 91463 - 93829 2537 ## COG5009 Membrane carboxypeptidase/penicillin-binding protein Predicted protein(s) >gi|316921772|gb|ADCP01000140.1| GENE 1 623 - 1324 405 233 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAQASLNAAGVMKRTPAKLPSFPAPYSLPGSFLDELDEHPVVTFIQIKKEFNKEFSTWVA GCCQYLGSFIELNSGNFYVNLTQCHLNRRNIEACAKLLYLSGVMLDDKHSMKYFVQIARE ITTSFYPLDTSAIPAEISDPDDLTKHLEREYQRQHPEHWFEDWRQRFTHEMQEFFEKKDL IEKAASSNQQKPQYQPTAPSKKAPPEKKRKEKGCLGVCLILTVLAALGILFLH >gi|316921772|gb|ADCP01000140.1| GENE 2 1511 - 1750 94 79 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFYAGEYDSSSISENHFRDNRCLEKVRCLGMFSHTISLKMRNEEEWGEPELSGKTTGKHT SLCAIGASELRIYKAFSVC >gi|316921772|gb|ADCP01000140.1| GENE 3 1831 - 2868 1276 345 aa, chain + ## HITS:1 COG:PA2740 KEGG:ns NR:ns ## COG: PA2740 COG0016 # Protein_GI_number: 15597936 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Pseudomonas aeruginosa # 4 344 1 337 338 370 51.0 1e-102 MDLLTELESLIPELEKGLDQASSLQEMEALRVEFLGRKGRIARIMSRLPELEPAQRPLLG QTANRVKESLTAQFEAKKTALDNAEEAAALARFDAGLPGRLPWRGSLHPVTMVMEEICTI FKGLGYDIVSGPEVETDYYCFEALNIPPEHPARDMQDTLYVSDGIVLRTHTSPLQVRTML KQKPPVAVVAPGRVYRRDSDITHTPMFHQVEGLLVDKHVTMADLRGTLTAFMREVFGSDT RIRFRPSFFPFTEPSAEADISCCMCGGKGHVNGTTCRVCKGTGWLEILGCGMVDPEVFKA VGYDPEEVTGFAFGLGVERIAMLKYGIGDLRMFFENDVRFLGQFA >gi|316921772|gb|ADCP01000140.1| GENE 4 3051 - 5450 2865 799 aa, chain + ## HITS:1 COG:FN2122_2 KEGG:ns NR:ns ## COG: FN2122_2 COG0072 # Protein_GI_number: 19705412 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Fusobacterium nucleatum # 151 794 6 645 653 336 31.0 1e-91 MLLSLKWLREFVPVEAGAQEVGDRLTMLGLELEDIIHPYDAIKDIVVGHVLTKEAHPDSD HLAVCTVDAGQEEVLNIVCGAPNVAAGQKVPVALVGTTMPGGMVIKKAKLRGVPSFGMIC SERELGLSEDHNGIMVLPEGFKVGTRLIDALDLDTEILEIGITPNRGDCLSVLGLAREAA LAFKLPLTLPKVHVEAQGADWSSEWAVNIPNPDLCPFYQLRLVEGVTIRQSPMWMRHRLH AVGVRPISNIVDVTNYILMELGQPLHAFDRDKLEGGRTEISVARDGERIVTLDGQERVLT 
FADLLIRDGMKPVALAGVMGGLDTEISDASRNVMVESAVFRPETIRKTARRLALASEASY RFERGVDQINSRFAMDRAVSLMAELSGGVVRTGACTQEPKLWAAPHPRFRVQRTVDLLGM DVEPAFCADTLERLGCGLDRSDGADWKVTTPSWRSDLSREVDLIEEIARVKGMDTIPETL PAVSRPLDRFGQPESRYGFLSRIKAWGRGLGLNEAENYSFVGHKDLDHLGLSKEGRIDII NPLTEEQNVLRTEIAPSLLQNVRTNIAHGNMGVRLFEVANVFEADPASQTTAKESARLGM VMYGSLYDTAWPNAEIDAGYADIRGLVEHFAAFLNLSAPVFTRDENTHPFLAPCVRVTVD GKPVGVVGQVKAQLADAYHARKPIWLAELDLETLWDLHRAARIVFKALSVFPASSRDVTV IAPLTLSVAAVEKHIRDMRIAILEDVTLIDLYEPKDTEERNLTFRLTFRKADRTLKDAEV DKEREKVAQSLIKNLGVRI >gi|316921772|gb|ADCP01000140.1| GENE 5 5527 - 5877 214 116 aa, chain + ## HITS:1 COG:RSc1584 KEGG:ns NR:ns ## COG: RSc1584 COG0789 # Protein_GI_number: 17546303 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Ralstonia solanacearum # 4 74 19 89 146 65 45.0 2e-11 MTGRTYRIGEAARLLKLEGYVLRFWETEFEQLRPLRTPKGQRLYSDADLVVLRRIRTLLH EQGMTIEGARRVLERECAESGDGKARTSAEDMLRHVESELMALQALLGRQRLKPTR >gi|316921772|gb|ADCP01000140.1| GENE 6 5884 - 6555 707 223 aa, chain + ## HITS:1 COG:sll0807 KEGG:ns NR:ns ## COG: sll0807 COG0036 # Protein_GI_number: 16330729 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Synechocystis # 1 204 5 210 230 231 54.0 7e-61 MILSPSLLSADVGNLAAELSALEAAGLRWVHWDVMDGRFVPNITFGQHVIRQLRKRSGLF FDVHLMIVEPENYLSEFQAAGADMLVVHAEATRHLQRTLAEIRRLGMKAGVALNPATPLS ALEYVLEDVDMILLMSVNPGFGGQKFLPITYEKIRRLRRMLDERRLETLIQIDGGVTPDN TADLVRAGADVLVSGSAFFSVPPYEARRLAFESKAEEARSLTL >gi|316921772|gb|ADCP01000140.1| GENE 7 6573 - 8567 2483 664 aa, chain + ## HITS:1 COG:HI1023 KEGG:ns NR:ns ## COG: HI1023 COG0021 # Protein_GI_number: 16272957 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Haemophilus influenzae # 1 663 1 665 665 862 62.0 0 MRSRKELADAIRILSMDAVDKAKSGHPGAPMGMADMAEALWRGVLRHNPADPHWANRDRF VLSNGHASMLLYSVLHLTGYDLSMDDIKNFRQLGSRTPGHPEKGVTPGVETTTGPLGQGF ATAVGMAVAEKLLAAEFNRDGFPIVDHHTYVFLGDGCLMEGLSQEACSLAGTLGLGKLIV 
MYDDNGISIDGDVHQWFGDDTPARFEACGWHVISGVDGHDAAMLDAALVLARKETERPTL ICCKTTIGYGSPKKGGTASSHGSPMGADENAAVRAALGWKDEPFVIPQDIKDAWDARERG AALEGEWKALFAAYAKAYPDLAAEFERRMGGTLPADWESVSSSAIARFDADKPKVATRIA NRDVLNAIAPHLPELFGGSADLTGSVGTWHEKASRIGRNDWNGNYLSYGVREFVMGTVMN GLALHGGFIPYGGTFLVFSDYARNAIRLSALMKQRLVWVLTHDSIGVGEDGPTHQPVEHV SSLRLIPDLLVWRPCDAVESAVAWKVALESAQPSCMVLTRQGLTPQTRTEEQLEAVKRGA YILKDCEGTPEVILIATGSEVQLAVSAAEALAGKGRKARVVSMPCAELFDAQPAEYKENV LPRAVRARVAVEAASVDGWWKYVGLDGAVVGMSGFGESAPGDVLFKHFGFTVDHVVDVAE GLLK >gi|316921772|gb|ADCP01000140.1| GENE 8 8820 - 10121 982 433 aa, chain + ## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 237 1 237 246 225 45.0 1e-58 MRINILTLFPEWFASPLDTALLGKAREAGLVEFDILNPRDRTEDRHHIVDDRPYGGGPGM VMMLEPLVKTLRELETERGGAGRIIMLAAAGKPLTQSLARELAREETLTLVCGRYEGIDA RLMDILPVEQVSVGEAVLNGGEAAAMMLVEAATRLIPGFMGKEESGDDESFSAGLLEYPH FTRPEVFEDVPVPDVLRSGDHGRIAKWRREQSLRTTLRVRPDMLDGAPLTSDDMEYLRET VEAAGRLRLGRNLHCALVHYPVFLGDRKTGATSLTNLDVHDIARCSRTYGLGSFTVVTPL KDQQTILETLVRHWTEGPGGVSNPDRAEAFSLVRMASDVQGAIDLVEARTGQRPVLLGTS ARDNGVMTPQAVRDLLVDRPVLLLFGTGHGMAPEILDACDGILRPLRWMDAYNHLPVRGA VAITLDRVLGDCC >gi|316921772|gb|ADCP01000140.1| GENE 9 10298 - 10645 485 115 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|94986665|ref|YP_594598.1| 50S ribosomal protein L19 [Lawsonia intracellularis PHE/MN1-00] # 1 115 1 115 115 191 80 1e-47 MTIIEKIEREQMRLDIPAFKPGDAVKVHFRIVEGEKERIQVFQGNVIRMHRGTVGATVTV RKVSDGVGVERIFPLHSPFIDHIELVTEGSVRRSRLYYLRDLKGKAARIKPKKRY >gi|316921772|gb|ADCP01000140.1| GENE 10 10779 - 11492 624 237 aa, chain + ## HITS:1 COG:HI1059 KEGG:ns NR:ns ## COG: HI1059 COG0164 # Protein_GI_number: 16272990 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Haemophilus influenzae # 34 224 11 190 197 158 51.0 7e-39 MAGRKKSVSVQEELLLCPSAKPGDWFPGIPVSAVAGIDEAGRGCLAGPVVAAAVILPEGA 
VIPGLADSKVLSPTRRDGLAAEIGAVALAWGLGVVWPPEIDRINILQATLKAMCHAASVL KVRPKGLLIDGNQTIPEPLFRQRAACWPSSPRQKSIVDGDALVPVISAASIIAKTFRDML MDKLDHRYPGYGFAKHKGYGSKEHFAALRELGPCPMHRKTFRGVLPEQQTARQGSLL >gi|316921772|gb|ADCP01000140.1| GENE 11 11489 - 11890 196 133 aa, chain + ## HITS:1 COG:NMA0341 KEGG:ns NR:ns ## COG: NMA0341 COG0792 # Protein_GI_number: 15793351 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Neisseria meningitidis Z2491 # 1 101 1 101 115 75 35.0 3e-14 MTLRQEKGRAGEDAAADWLKKAGMRILARNWRSGSYELDLVCRDGDELVFVEVRLRGKGS LASPEASMTPAKCRSVVKAARAYLSASGEWDVPCRFDLVCVRDAGATFELDYYRHVFDIS EIMGSGDAAWQPW >gi|316921772|gb|ADCP01000140.1| GENE 12 11832 - 12665 689 277 aa, chain + ## HITS:1 COG:SP0938 KEGG:ns NR:ns ## COG: SP0938 COG0313 # Protein_GI_number: 15900818 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Streptococcus pneumoniae TIGR4 # 7 271 15 280 289 176 37.0 3e-44 MSLTSPKLWVVATPLGNPGDLSPRAREVLEGADMVLAEDTRRAGLLCQRCGVKAKRFMSF HDHNEESKLDEVLGLLNGGRTLALISDAGLPLVADPGYRLVRACRAAGIPVSVVPGPSAP VTALAGSGIAPQPFAFLGFLPRSRSDQEKTLAPFANLALTLIFFERKDRLSETLSAAHAV LGPRELCIARELTKTHEEYLLGRLEDGVPAGVELLGEITVVVGPAEAGGVTDREEVFRLI AEERELGGSPRDVARRVQTRTAGWTVKSIYALLSARR >gi|316921772|gb|ADCP01000140.1| GENE 13 12669 - 13130 465 153 aa, chain + ## HITS:1 COG:alr5070 KEGG:ns NR:ns ## COG: alr5070 COG0691 # Protein_GI_number: 17232562 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Nostoc sp. 
PCC 7120 # 1 152 1 153 155 160 56.0 8e-40 MSAKKSGGPLIGDNRKARHIYEYLETIEAGISLTGPEVKSLRAGQVNFTDSYVEFRRGEA WLIGLHIAPYTNAGYVEQNADRPRKLLMHAQEIASFAAKVEQKGLTVVPSKLYLKHGKIK VELALGRGRKLHDHRVELKRRAETRDMQRELRG >gi|316921772|gb|ADCP01000140.1| GENE 14 13635 - 13877 196 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGSTGRIFTRKKNKNGTWSPWSSVITDSQIGDGITNTNGIISVPEVISSGTQLLPLLGLL AGSLMRVCLHRLWTHTGCFL >gi|316921772|gb|ADCP01000140.1| GENE 15 13975 - 16449 2625 824 aa, chain - ## HITS:1 COG:PA0575_3 KEGG:ns NR:ns ## COG: PA0575_3 COG2199 # Protein_GI_number: 15595772 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 627 795 2 173 182 106 35.0 2e-22 MSLIARMIIPISVIVTLATSFIGWKAYSDARDSVAEATRILEISTLDAVGRELTTVMNLT RQHLLTVADRIAIIELLRNPDESPEGERSKLLTRHLRRAVLDFPSIDGLAVLNSRGTVIA SVNPGENATSRKTSSYFHKALQGAQTLDGPLVLPYTQEKAFIISTPIYDRGKVSGVLVGA INMEHLTRLIVDPVTLNGMGYVFVCTPGGSVIMHRDRNKMFADNVSEQRFIKEATADKDR FLEYRNPQGQIMYSAYTALPNGWIIVAAASKEAMLAGVRNLRSDIILSCAGALLLVTLLI YLILRKVTGVLQEGVEFAERVASGDLDRSLNIRREDELGKLARALNTMVQRLKASFSMAE ERAREAEEASARATETSQELEAVINGVRGGVARLELTDNLRVLWANAGFYALAGRSHAEY EQDVHDEALLVVHPDDRAHLLGAFRSSLASNIPVHTEYRILRRDGGVVWIYLQASLVGHR DGVPVFQGVFVDVSKQKNTMRALELEQQRNRIVAELTNDIIFEYDYDTDEMTCSERYETI FHKPRVIPNYRRHQDNLLNMLLPEDAKTLQEAYRRIEQGVDSYALTLRLLQPEAGYQWYS IRFRSIRDETGRAIKLIGQLTNIHHQKLEEDRLRHEASTDLLTQIYNKITTEHLVSLELQ EGRNHALIVVDLDKFKYANDTFGHLFGDSVIKAVAATLRATFRATDIVGRTGGDEFVVLA KDISLSDGLETLKAKCRRLSVELAHVSLGQDYVISASIGISLFPRDGKTYAELFAKADAA LYQIKNTGRGHFAVYGEVPETETKALPSGGKGMPLMPRGPRNDA >gi|316921772|gb|ADCP01000140.1| GENE 16 16710 - 16880 184 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METIFAILTVIVYILLALVGLVALTICVFLWMVSTRTRQDSKIFQERFDGWRKTRG >gi|316921772|gb|ADCP01000140.1| GENE 17 17113 - 17562 454 149 aa, chain + ## HITS:1 COG:MA2866 KEGG:ns NR:ns ## COG: MA2866 COG0589 # Protein_GI_number: 20091690 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # 
Organism: Methanosarcina acetivorans str.C2A # 4 142 7 144 152 71 30.0 4e-13 MEVKKILCAVDLEDTIAPSVEYAKMMASMTGASILVAYVIPAHTPYEDVYLSVNNQPNEV NNPNETIQKSMNALLAEQFAGMDATGVILVGRPSEELVKLAEEKGANLIVMGTHGRAGFD RLLFGSVAHEVVRAAPCPVMTVRPLMPKK >gi|316921772|gb|ADCP01000140.1| GENE 18 17624 - 18373 872 249 aa, chain - ## HITS:1 COG:slr0498 KEGG:ns NR:ns ## COG: slr0498 COG1226 # Protein_GI_number: 16331771 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, predicted NAD-binding component # Organism: Synechocystis # 100 226 100 226 234 70 32.0 3e-12 MPEHPHHIRSEAYKYRFDLLFASLLCTIVVNIFFPEDIYDGIAQTIYLPIQLIAGITLFD IRRKGYLLTLLLFLGGLLIVGRLLDSFTPLNLREYLVFAYVVFFGWVTLELFRQIYTAPL VDRESVLAALCGLLLIGYCGFFVFVAVEFHQPGSFSGLTPGGQGFRDLFYFSYVTILTIG YGDITPHTWVAKNATVLVALIAYMYSLVIVAMIVNQFAENRKLKQQEKARLSGSAQARRR EAPPSGEND >gi|316921772|gb|ADCP01000140.1| GENE 19 18756 - 21380 2208 874 aa, chain + ## HITS:1 COG:RSc0588_4 KEGG:ns NR:ns ## COG: RSc0588_4 COG2199 # Protein_GI_number: 17545307 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Ralstonia solanacearum # 701 874 2 176 182 116 41.0 3e-25 MVRAVRTFRGRNVYVLCLLGIFALSVLVALSFLEFRSENRERIQAVNNAYLEEIARQSNL RISIRMNAAIAYTAMFAELFSSVERLTGPDAAALLQNMAGHYDFHDVRVFTPDGKGRDLE GSVRVEPGEPWFQDVLRGNHDVRFIRNGHGPDNLLFYSPIIRDGEIVGAVAGVYDMSILS RIMDMSYFHERGYSNILTSSGESVTTSRHTSNIIGGSENGLEAFEANAVMLDGYTFEGHR TMLRDGTSGFIRFSVGGYGRTAYITPLGINDWRLVTIVPDAVTEDICDPINRGALSFSIK LFLAFVMSAMLISFIIFRTRREVEGVNQERDSMADISPGGIVKCTRDDFRMLYINQGCLE LAGYSREELEDRFQNVFLGVVYPEDRDRLTALMRECGSEPLQCEYRIVTRSGEVRWILHR MRLAVEQRGAACLYCSMTDITDAKEAELKLRMSNERFLIAISHTDDMLFEYVYATGRIMR IAGKGASLTEASLHEAMDEIGMDEESRKAMEGVLETLRSGEASASVVMRSGNGCSRWFRV TLTNLFDDEGRPYSALGTQEDITELKEAELRFAREERYREAMLSKTIASYTVNLTRDRLL SRYADGEELGVEEPDASAAARFRKAVEVSVHPEDQQGVLRFYATESLLQLYAAGQGETSL EYRTLRGDTFIWVSASINLLKDPASGDLLAFGYLTDITERRQREAELVYKSERDFLTKLY NRSAVEKRFSEYAASASGDGLLAFFSMDLDGFKNVNDTLGHLEGDTLLQLVARELESCFR 
VDDIVGRLGGDEFIALMYGVSTREIIERKAEELCARIRSIRMAHPRYSGVSVSIGVAIAP EHGDSFEELYAKADVALYHAKQGGRDRFVIYGGE >gi|316921772|gb|ADCP01000140.1| GENE 20 21621 - 21932 211 103 aa, chain + ## HITS:1 COG:MA0735 KEGG:ns NR:ns ## COG: MA0735 COG2050 # Protein_GI_number: 20089620 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Methanosarcina acetivorans str.C2A # 7 86 50 129 146 64 45.0 4e-11 MPPHGCQLNGMGAVHDGAVFILADVAFAALSNANGLYCTNAQTSISFVRSGRVGPLRGEA RVLRSGKLLSTYEVRVTDAEGGLVAVATITGYGVGARLPLGDA >gi|316921772|gb|ADCP01000140.1| GENE 21 21996 - 22469 405 157 aa, chain + ## HITS:1 COG:TM0865 KEGG:ns NR:ns ## COG: TM0865 COG1585 # Protein_GI_number: 15643628 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Thermotoga maritima # 1 134 1 128 140 61 29.0 7e-10 MSVAAMWFILGIILLAVELVSPVFVLFFFGLGAWAAGVTALFVDDLAIEVVVFGASSVVF LLSLRRLFVRSFRGKTQISSDAASVGLPNLHAGKMGTVTRPIPVNGVGEISVGGSFWRAV SPEAQPEGAQVRVLGHIPDDELTLEVVSGGENSRNPA >gi|316921772|gb|ADCP01000140.1| GENE 22 22494 - 23426 1199 310 aa, chain + ## HITS:1 COG:RSc1423 KEGG:ns NR:ns ## COG: RSc1423 COG0330 # Protein_GI_number: 17546142 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Ralstonia solanacearum # 15 309 7 305 308 288 51.0 1e-77 MLDLIGSSLTVFVFLALLVIFVLFKTALVVPNQQAVVVERLGKFHAVLFAGFHILIPFID AVAYRRSLKEDVLDVPKQTCITKDNVSVDIDGVLYLQVVNPEKSAYGISDYMFGSVQLAQ TALRSAIGKLELDRTFEERSTINQEVISALDAATAPWGIKVLRYEIRDITPPSGVMQAME KQMRAEREKRALIAQSEGEMQARINMAEGAKAAAIAESEGKLQAMKNQAEGDAVLIRAVA QATADGLATVADQMEKPGGTQAANLRVAENYLEQFGKLAKEGNTMILPTDLANISGLVSS LTSVIKQAKA >gi|316921772|gb|ADCP01000140.1| GENE 23 23646 - 24317 763 223 aa, chain + ## 
HITS:1 COG:CAC0071 KEGG:ns NR:ns ## COG: CAC0071 COG2846 # Protein_GI_number: 15893368 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Regulator of cell morphogenesis and NO signaling # Organism: Clostridium acetobutylicum # 1 223 1 233 239 154 38.0 9e-38 MQNKWHEHSVGNIVVNFPLSSAIFRAAGIDYCCGGGRLLGEVIKEQRLVDSNIYKALDEL AARTEKPAEPFSEMSAERLVEFILTKHHSYLWGVLPEAHELIVNVLKAHGRRHPELYDVY KLFGQLSHEMEPHLIREETELFPAIANSEAKEHCQALVHILEEEHEAAGTLLKKLRHATN DYSVPCDGCDTFRELYKKLTEIEEDTLQHYHLENNILFKKILA >gi|316921772|gb|ADCP01000140.1| GENE 24 24886 - 26085 417 399 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 49 394 60 403 406 165 30 9e-40 MKLSDAAVRNAKANGKVQKLSDGGGLYLHVTVTGSKLWRMAYRFEGKQKLLSFGAYPAVS LKDARHRRDDAKELLAKGIDPGEEKKQAREEKLAKEREERDTFEFVAREWFAKYEPTLSE KHAKKLRRYLENTIFPVIGGKPVTQLEPANFLQLVQPSERLGHHETAHKLMRLCGQVTRY ARITGRVKYDVAAGLTEALTPVQTTHFAAVTLPDDIGQLLRDIDAYVGYTSVVYCLKILP YVFTRPSELRLAHWSEFDFKNAIWIIPASRMKMRREHVAPLSKQVLALLKELHTYTGNGE LLFPSARALTTPISDAAPLAALRRMGYGKETMTLHGFRAMASTRLNELGFRADVIEAQLA HKEPDTVRLAYNRAEYMEERRQLMQKWANYLDELRSTKQ >gi|316921772|gb|ADCP01000140.1| GENE 25 26100 - 26288 193 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSDTIHSKKFPQSPEQDTFVTIGKIPEDEQITVADTGWSEETKERMRQAGYDDMDMIGAW YS >gi|316921772|gb|ADCP01000140.1| GENE 26 26419 - 27396 283 325 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302862827|gb|EFL85759.1| ## NR: gi|302862827|gb|EFL85759.1| hypothetical protein HMPREF0326_01462 [Desulfovibrio sp. 
3_1_syn3] # 1 227 1 224 311 101 28.0 6e-20 MSDQQKISMGRVRQAYDLTANEVEQLIQSLWPDAVFPVALWNRTHETFWRAFADARQAVY KMESPTLPQVDLSLKRFQTYLLAHDIMDFFRRLSLNIHVNVLKPGLTSILEESYEDMESI FSKEREFLFNSLDRLNTENIFEQYSPDSALFEKPVCSIPVSSYHFIAHIFVDTIMVDRDS AIRHLRHYPLRREVKGITRALVESVLRAAPPESAVADKAAVLPTCERSPQDGKLVFYVPG AIWEGRPDTAVRNAMKEKYPLAVIAYVLLYWCGPQKSEAGHKAPQGRKTHVGRLLAEKEY KDEKSYRNLMDMLLKEADAYSILKV >gi|316921772|gb|ADCP01000140.1| GENE 27 27520 - 27735 143 71 aa, chain + ## HITS:1 COG:no KEGG:XCV2219 NR:ns ## KEGG: XCV2219 # Name: not_defined # Def: hypothetical protein # Organism: X.campestris_vesicatoria # Pathway: not_defined # 8 67 6 79 82 78 52.0 9e-14 MSRSDTKIPSCGFLRLPQVIALIPISKSAWWEGCRTGRYPKPVKLGPRTTVWRVEDIKAF IESVGKDGSHE >gi|316921772|gb|ADCP01000140.1| GENE 28 28363 - 28713 219 116 aa, chain + ## HITS:1 COG:no KEGG:cce_5305 NR:ns ## KEGG: cce_5305 # Name: not_defined # Def: hypothetical protein # Organism: Cyanothece_ATCC51142 # Pathway: not_defined # 8 114 38 146 152 66 36.0 3e-10 MKRVKPVRTASVVVRVFPEERDILRFNAGIQGMSMSDYIRQTCLGIRLRRTSEEKRRLRE LARIGANINQLARWANTCKRAAEAVEVLAALANIERRIAEFASCASARKEPETEPC >gi|316921772|gb|ADCP01000140.1| GENE 29 28707 - 29921 396 404 aa, chain + ## HITS:1 COG:no KEGG:DMR_07200 NR:ns ## KEGG: DMR_07200 # Name: not_defined # Def: hypothetical protein # Organism: D.magneticus # Pathway: not_defined # 1 301 1 311 318 320 58.0 6e-86 MLMKVFPHGTGEGDKPSRYLVRPDYPGRDTAPPQVLRGNPAVTRALIDSIDRRWKFTSGV LSWHPDEKICEEQEEEVMDAFERVAFAGLEADQRNILWVRHTHAGHHELHFLIPRLELSS GKDFNACPPGWQKDFDVFRDLFNWREGWTRPDDPARARDELPKKADLFKARMARWGREIR ESDRDRAKEVIHAFLKEKVTQGLVRNREDILSALKEQGLSINREGRDYISVIAPNSGMKM RFRGGFYARDWTPKVAQEEESEEKKIGTARRMVARLQQDFERVIEKRAACNIKRYPAKWK QLPDEEALLLPRIQEDTLHGRNRTDADAEPETDGGELQRPADGLRHEDGRTGGQADADSD GTSHLEAIVQRCQRSVQQLADLVGDLEKRRIARERQNRPRRRMR >gi|316921772|gb|ADCP01000140.1| GENE 30 29934 - 30305 337 123 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGKFKRIPVKDINAMRKILDDLPDKNLGKTREEAAELLNANILKAIEKGYTVKEVAAIMA EGNVVIPAPVIRAKVLPAKAAPRKREPKPRPETAKPVAAPPVTSHAQEEIPDFYTPDKPD SEL 
>gi|316921772|gb|ADCP01000140.1| GENE 31 30308 - 31012 662 234 aa, chain + ## HITS:1 COG:no KEGG:Bxe_A2574 NR:ns ## KEGG: Bxe_A2574 # Name: not_defined # Def: hypothetical protein # Organism: B.xenovorans # Pathway: not_defined # 23 229 24 226 230 193 47.0 4e-48 MNIQNSVFYVGGSKGGVGKSLFSFALVDYLLNRNANVLLVDTDTDNPDVFKAHKDLALPN LLCRLNSLDDADGWADLLDTVQNYPDHAVVINAAARTKTSTASYGDIMKEALREMRRELV VFWIINRHRDSIELLHSFQEVFTDVPLHVCRNLYFGEARRFDMYNSSKAREAVEKNGMTL DFPAVGNRVADWLYSRRMSIRAAHPEMPFGTRAELQRWQGVCAKMFDSVMGEAS >gi|316921772|gb|ADCP01000140.1| GENE 32 31009 - 31611 487 200 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|296163053|ref|ZP_06845827.1| ## NR: gi|296163053|ref|ZP_06845827.1| conserved hypothetical protein [Burkholderia sp. Ch1-1] # 3 136 2 139 221 63 32.0 9e-09 MSMDTVIDLGKACGRELSEAEAARLREAGSRLNLREDDALWPLLAAMEYQRIYYEALPEK IAGASRAIMDGMAAAAEKETAAAQARLTDSVVKEAQNLASKIQYGKLVPMWLLALICLLA YGSLMLWAGFRIDTGRDMPLAAVFQMPSGWLICGLSLVAGVFLCALSGKTFLVGGKGWWK TAALAMGFIMTGAVILTFSL >gi|316921772|gb|ADCP01000140.1| GENE 33 31696 - 31986 347 96 aa, chain - ## HITS:1 COG:no KEGG:BDP_1620 NR:ns ## KEGG: BDP_1620 # Name: not_defined # Def: hypothetical protein # Organism: B.dentium # Pathway: not_defined # 1 96 1 96 96 142 75.0 3e-33 MEVKAYLEITMRIDNANRPAAAKVYTDYRAPFLDTIEGALTKELLIRDEDVQVLHGFDTV EHAAAYLNSELFNADVVKGLKPLWSAEPDVKIYTVA >gi|316921772|gb|ADCP01000140.1| GENE 34 32178 - 32561 232 127 aa, chain + ## HITS:1 COG:MA0333 KEGG:ns NR:ns ## COG: MA0333 COG1733 # Protein_GI_number: 20089231 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 7 106 8 106 127 98 47.0 3e-21 MYQAKLEKDIRCPLEYGLDVFGGRWKSRIICVLAHQDNLRYGKLREEMTNITDAVLAQNL KELIADGMVKRVQFNEIPPRVEYSLTEKGNSVVPILQNICRWSGAYHKEVSGLSIAQCRK CDYTAKK >gi|316921772|gb|ADCP01000140.1| GENE 35 32564 - 34039 473 491 aa, chain - ## HITS:1 COG:PA2005 KEGG:ns NR:ns ## COG: PA2005 COG3829 # Protein_GI_number: 15597201 # Func_class: K Transcription; T Signal transduction mechanisms # 
Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Pseudomonas aeruginosa # 17 483 23 469 481 258 36.0 2e-68 MTTLREEFFASVNVNIDILELLHSGIAIFNHRAKLMFANATYKKMYRLDDTCIGLDAMEF FLTAKQGILEVLKTGEANSCASVSINGLYGVTYRWPLRDREGRVAGCMTENISVSPMKDK IHEMQNIIDELENYNSFSTTLYPRQSTEIINFDSIVGESSSMRLLKEKGKRFARNNEPIL ILGENGTGKDLIAQAIHAASPRHARNFVAVNCAAIPHELMESELFGYEAGAFTGAKASGK QGQFELADGGTIFLDEIGEMPLTLQAKLLRVLENHEIKKLGAASPRYVDFRLVSATNRNL EQMVQQGRFREDLYYRLNLFDLVVPPLRERIADIPLLAYSILSGLLGPERGNSIRIAKEV LSLCSTHPWRGNVRELRNILTYALYSMKEYESELCLRHLPERFFHKVDNEFLSSTLMDEE VQGQKFVPSPAPPEKLSESRLEAERRSILEALKKCGGNKVKAAKLLGIARSCLYKKISIL DIKDYAVFNDN >gi|316921772|gb|ADCP01000140.1| GENE 36 34442 - 36889 2857 815 aa, chain + ## HITS:1 COG:SPy2049 KEGG:ns NR:ns ## COG: SPy2049 COG1882 # Protein_GI_number: 15675819 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Streptococcus pyogenes M1 GAS # 30 814 14 803 805 333 30.0 1e-90 MFTPEKTEAALAAEKAGHGAYAIDWTTADDRIRLCKGIVTDTNQQMDPERLSVLKAVYEE NPGEEPVLLRAKLLERLLLTKKLFLDENPVVGTLTSIRCGLYAYPEWSCDWIKDEMDMAK ICSLGEIVIPDETKDLMMGVYNQWKGRTVRDTADINYKKLYDVDPRPYIKSGMLYDVMNV SLGSGCANYNMILTKGARGVLEEVEQSMSKLKHQSGDYRKLAFLQGIKIVLEAFVKWAHR YADLCEETAAGESDAKKKAELLETAEICRWVPENPARNFREAIQSYWFAHLLVEIEQMGC ANSPGRFGQFMYPFYKKDIEEGRITRDEANKLTCFLFIRHTELGTYSGMAHQKALSGHTG QTLAIGGLTPDGKDASNDYEMVIMEVQARMQNIQPTLALWYTPMMEPAYLMKAVDVIRTG CGQPQFMNTAIGVARNLLHFGPDGVTIEEARNVANFGCVASGVAEAGSYIAGEDAICAAK FVELALYNGWDPTTKKQLGPQTGEANDFKDYEELYEAVRQQIKAGMMVQRRHSNLSTAAH EKIVPSIFRSALHDGCIETGMTEEAGGHKYPQGMAILSTVVDAGNALHAIKELVFDKKLF TIAKLREALEADWVGYEDMQKMCLDVPKYGNDDPESDAYVQRLFEDFKTIYTECGPNYFG KHCYLDAYSLSFHNLYGSIMNAFPNGRKKGVAFTDGSVSATPGTDTEGPTALIKSAAQAV DTTWYVSNHFNMKFLPSALADAKGARNLLDLIKTYFDWGGSHIQFNCVSHETLVDAKEHP QNHKELVVRVAGFSAYFTRLDGGVQDEIIKRTEYK >gi|316921772|gb|ADCP01000140.1| GENE 37 37028 - 37891 609 287 aa, chain + ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 2 256 15 266 302 215 43.0 9e-56 MSVHDGPGLRTTVFFKGCPLHCLWCSNPESQAMAPQLMFFQNLCSGCGHCVQVCTHGAAI EKGGRFLRDFSRCVGCGACAEQCPTTASSMSGREYDADAIMKVIRKDASFFSNSGGGVTF SGGECTMQGAFLMELVEACLNEGLHTCIDTCGQTDPKLFKCLLGKADLFLFDIKHMDSAQ HKRLTGMGNELIQRNLGAALETFPEKIRIRIPLMPGLNDSEENIAAVASRLKPYGVRHVD VLPCHFFGSSKYNALGLPQPDVREYVPDALHGVLERFAAHGLETEIV >gi|316921772|gb|ADCP01000140.1| GENE 38 37912 - 38025 111 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLTVKADKAVLEKLAALMRNEEPDSCVRLKEYVLGGG >gi|316921772|gb|ADCP01000140.1| GENE 39 38207 - 39460 917 417 aa, chain + ## HITS:1 COG:AF0907 KEGG:ns NR:ns ## COG: AF0907 COG0477 # Protein_GI_number: 11498512 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Archaeoglobus fulgidus # 9 415 18 418 419 223 35.0 4e-58 MDINAAYRRKRIILFLAISVGYLVVFFQRVAPAIVGPVMADELGLAATDLGIMASMYFWA QAAGCIPAGLFSDTFGPRKIIAVGLICASAGTAVFVMGNSVPTLALGRFIIGLSVSVVFV GAMKIFSDWFYPNELATCSGVLLAVGNLGALLSTRPLMWLIEQAGWRDAFWFVTAYTLFS GILAWMILRNTPKECGFPEINSSSQETVSMRSALKVVFQTHRFYLVTIASTLYYGTLMNV GGLWSGPYLQDVYGLSKDVASSIVMFFTLGMIMGCPVSGWLSDKVIRSRKKVLLLGIILH TLVYIPLAFFTASIGTPTLLYLLFTLFGFTGGFFVVCFACVKETTEPRYAATAVGGLNMG IAVGAAFFQYVCGLIIDSHGKVGSVYGSDAYASAFQLCMVMMIVGAIFMFFFKEKTN >gi|316921772|gb|ADCP01000140.1| GENE 40 39876 - 40328 114 150 aa, chain + ## HITS:1 COG:MA2742_2 KEGG:ns NR:ns ## COG: MA2742_2 COG0778 # Protein_GI_number: 20091565 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 1 146 25 181 184 80 31.0 1e-15 MAAYAPTAHNARQVSYIVINGRDKVEKLLHATVRLMEEHHFYPNHTKNVRNGHDTLFRGA PCLILIHAPERILSETDCATAACYLELAFPSFHLGSCWAGMLIEACAYGLPAEIQLPKGH KLYAALMVGKPAVAYKRIPFRTPPEIIRAS 
>gi|316921772|gb|ADCP01000140.1| GENE 41 40834 - 40965 58 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPNPGSLKIILSTLPIYEILFISYEPNILFYIKKNKTSTSGRK >gi|316921772|gb|ADCP01000140.1| GENE 42 41592 - 44861 3056 1089 aa, chain + ## HITS:1 COG:hsdR KEGG:ns NR:ns ## COG: hsdR COG4096 # Protein_GI_number: 16132171 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Escherichia coli K12 # 3 1084 23 1186 1188 638 36.0 0 MVSNFAFLHEKFPELEELGMLAERYLRSDPQSCLMKAGLLCEGMIRVMFALDDITLPEHC DAVERIDILQRREALPADVAAAFHLMRRVRNKAAHEGLRLPEAKILYFLQITHSLCEWFF QTYGDESYRQLDFVAPQDAEASGPSPSESKEEAGHENALTEQAVSRAAQASAVPEDIRRK RAVVAASKRHVSEAETRYLIDEQLRKVGWEADSETLRHSFGTRPQVGRNLAIAEWPTDSS VGNRGFADYALFAGTRLVGIIEAKASHKDIPSVLDHQCKDYARHIRKEHLAHTIGEWRGF RVPFLFAANGRPYIRQYEEKSGIWFQDVRNASTAPQALHGWMSPTGLEELLERDIAQGNQ RLEVMSADFLTDRTGLNLREYQLRAIEAAERAVKSGAQTVLLAMATGTGKTRTVLGMMYR FLKSGRFRRILFLVDRNALGTQALDVFKDVKLEELHPLNDIYAINGLTDKAIDKETRVHV ATVQGMVKRILYSEGSESMPSVTDYDLIIVDEAHRGYILDREMTDDEQLYRDQRDYQSKY REVIDYFQAVKIALTATPALQTTQIFGSPVFSYSYREAVIDGFLVDHDAPHQLSTTLSEQ GIHYKPGDTVAVYDPLTGEVTNSELLEDELDFGVDDFNKTVINENFNREILREIAKDIDP EDTSSGKTLIYAVNDHHADMIVRILKEIYAEELVDTDAVMKITGSIGNRKNIDAAIRRFK NESLPSIVVTVDLLTTGIDVPEITRLVFLRRVKSRILFEQMLGRATRLCPDIGKNHFEVY DAVGLYTALEPLTTMKPVVSNPAATFSQLLDGLETLADPPHILNQINQIIAKLQRRKRSM TDEVRQHFADMAGDAPDAFIERVQDLRPREAKALLLSKRDLFELIQREDAAGRRPVVLSD TPDALTSHTRGYGKENLRPGDYLESFARFVREQRNAIAALNTICTRPADLSRASLKSLRL ALGREGFTVRQLNSALSRMSNKDIAADIISLIRRYGINAELLGHEERIRRAVSRLKQAHS FSRVEEKWIDRMEKYLLNESVLSVHTFDESVIFRDKGGFAALNKLFRNQLENIVTELNTY LYDDGGLAA >gi|316921772|gb|ADCP01000140.1| GENE 43 44858 - 46276 1511 472 aa, chain + ## HITS:1 COG:hsdM KEGG:ns NR:ns ## COG: hsdM COG0286 # Protein_GI_number: 16132170 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Escherichia coli K12 # 1 463 1 507 529 448 48.0 1e-125 
MTTQEIVAKLWNLCNVLRDDGITYHQYVTELTYILFLKMAKETGTEAAIPETCRWDALAA KSGIELKHVYKQVLAELSENGTGRVREIYQGAVSNIDEPKNLEKIISSINALDWYSAQEE GLGNLYEGLLEKNANEKKSGAGQYFTPRVLIDVMVRLMKPQVGELCNDPACGTFGFMIAA DRYLKDQTDDYFDLDEDQAAFQKQRAFTGCELVHDTHRLALMNAMLHGIEGEILLADTLS TAGKAMKGYDLVLTNPPFGTKKGGERATRDDFAFATSNKQLNFLQHIYRSLKRGGRAAVV LPDNVLFADGDGGRIRADLMDKCTLHTVLRLPTGIFYAQGVKTNVLFFTRGQSDRGNTKE VWFYDLRTNMPSFGKTTPLKKEHFADFEAAFEAEDRRAVRDERWSVFSREEIKTKGDSLD LGLIRDESLLDYDDLPDPIESGEACIAQLEEAMDLLKSVVRELQGLTGREAN >gi|316921772|gb|ADCP01000140.1| GENE 44 46276 - 47628 284 450 aa, chain + ## HITS:1 COG:STM4524 KEGG:ns NR:ns ## COG: STM4524 COG0732 # Protein_GI_number: 16767768 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Salmonella typhimurium LT2 # 33 447 6 430 469 184 29.0 4e-46 MARPRKNAAKAQGTLLTPDEVVEIPVEEQPYPLPEGWKWVRLGKLYQINPRIIADDNTMS SFVPMEKIEPGMKGTFTFEILPWGKAKKGHTQFADGDVAFAKISPCFENGKSMLVRGLKN GIGAGTTELIILRQPSVLQKYTFYIICSSDFIQKGTHTYSGTVGQQRISMDFVRNYPVPL PPVDVQQRIVDCIESLFAKLDEAREKAEAVFDGFESRKAAILHKAFTGELTEKWRKEKNI SLESWDSCRLISVLKEKPRNGYSPKPVECKTNVKSMTLSATTSGFFRPEFFKYIDEEIPE NSHLWLSQGDILIQRANSLEKVGTSAIYTGGDHEFIYPDLIMKLQVRAPHSYKYIAYILS TQPTLSYFRSKATGTAGNMPKINQQIVSNTPIVVPSCEEQNEIVRILDGLLAKDQQARDA AESVLERIDLMKKSILAKAFRGELTGQVHK >gi|316921772|gb|ADCP01000140.1| GENE 45 47687 - 48514 188 275 aa, chain + ## HITS:1 COG:no KEGG:Smlt0297 NR:ns ## KEGG: Smlt0297 # Name: not_defined # Def: hypothetical protein # Organism: S.maltophilia # Pathway: not_defined # 129 273 108 253 256 132 42.0 2e-29 MSKVVITSINPGDSDYAFEEFVVLENNSSEKIYLDNWKIVWQEWPSQRKLHEYIFSNWKS KKRSFDQQEKIFIISGIGYNRFYHIGQLKHCQKAHWQIFTDTLKHICSVPFIKITLYDDK NVEIDHLFSRQFKDSAPLQPVIVIGHGGNSAWRELQEHLRDQQGFEVEAFETAPRASQTI PDVIVDLGSQANMAILVMTGEDELADGSRHPRLNVVQELGKFQERFGNNKTIILSEKGVT LPSNNSGIIYISFETGNIRACFGDIIAVIRREFKL >gi|316921772|gb|ADCP01000140.1| GENE 46 48742 - 48981 137 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSLRHIKSIERGLSDPTLSTLEKIASALSLRVKELFALEEIRRSPKGIGKDLLAQLENCK ESELLRFYACNKILLSKYK 
>gi|316921772|gb|ADCP01000140.1| GENE 47 49280 - 49723 470 147 aa, chain + ## HITS:1 COG:AGl3233 KEGG:ns NR:ns ## COG: AGl3233 COG1846 # Protein_GI_number: 15891736 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 141 6 133 133 89 39.0 2e-18 MESLTKEYASPVCVCMQLRKAGRVVTQSYDRCLRPVGIRGTQFALLKRIACMRRPFITDI GKVLCMDQTTATRNIGLLERAGFVAVGVHPDDARKKVVELTPEGQAKLEEAFPLWEEAQR EIREYLGDDRLGNLCSLLQVLSDFSSE >gi|316921772|gb|ADCP01000140.1| GENE 48 49775 - 50395 574 206 aa, chain + ## HITS:1 COG:CAC3341 KEGG:ns NR:ns ## COG: CAC3341 COG0655 # Protein_GI_number: 15896584 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 206 1 208 208 278 66.0 6e-75 MNVLLINGSPHKKGCTYTALSEVAGQLEKNGIGSNIVHIGSKPIRGCIACGACRETGLCV FDDDPVNECSGLLKAADGLIVGSPVYYAAPNGALCAFLDRLFYMKSGAYAYKPAAAVVSC RRGGASAAFDRLNKYFTISCMPIVSSQYWNSVHGNTPEEVRQDKEGLQIMRTLGDNMAWL LKCIEAAKGTVPYPVREPWTPTNFIR >gi|316921772|gb|ADCP01000140.1| GENE 49 50735 - 51967 1202 410 aa, chain + ## HITS:1 COG:BH0472 KEGG:ns NR:ns ## COG: BH0472 COG0477 # Protein_GI_number: 15613035 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus halodurans # 2 410 4 412 417 265 44.0 1e-70 MAEKERASRQVIVISLITAACLIGDSMLYIVLPICFAQAGLSSLWEVGIILSVNRLVRLP LNPMVGWLYRHISDRTGIFIATVLATITTFSYAFADGFAVWLLLRCLWGIAWTLLRLGGF YCILNVSSDDNRGYFMGLYNGLYRIGSLAGMLLGGIFADWLGFSVTCMLFGACTAVTIVL GFICVPRGNSGVPAGTQAEERSLTWLWKDGTVLWVMATGLVVALIYQGLYASTLSELIRI HFGSSVTLLGGVAVGAATLGGVLQAIRWGWEPWLAPGIGVLSDRRFGRRSMMVVSLAFGV VVFGLVALRLPLPLWLAVLIGIQLTGTSLTTIADSVASDTASVRGGRVFMMWYSFAVDLG AALGPILAYLLNDMWGMDTAYAGTAVLLLILAVKWYWSAPLLKPFVRRSA >gi|316921772|gb|ADCP01000140.1| GENE 50 52286 - 53104 302 272 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 236 1 226 245 120 30 2e-26 MIHIDTLSKTFSGGKGLDRVSLHIKPGEMVALIGSSGSGKSTLMRHIAGLMPGDADGGRI EVAQHLIQDRGVISRDIRTMRAEIGMVFQQFNLVDRMSVLRNVMLGALSRTALWRSLLGF FHEEDARLAYKALARVGIMEKAHQRTSNLSGGQQQRAAIARALVQRARVLLADEPIASLD PESSRNVMETLRDLNQKDGLTVLVTLHQVDYALSFCPRTIALKHGRVVYDGPSAELTPAF LKRLYGTECDSLFARQESDIRRNPPLTQVQVA >gi|316921772|gb|ADCP01000140.1| GENE 51 53127 - 54092 1208 321 aa, chain + ## HITS:1 COG:PA3383 KEGG:ns NR:ns ## COG: PA3383 COG3221 # Protein_GI_number: 15598579 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate/phosphonate transport system, periplasmic component # Organism: Pseudomonas aeruginosa # 1 319 1 318 334 367 58.0 1e-101 MFSRFLIAVAAAAVVLGGSVCAKAAEVLNFGIISTESSQNLKQLWDPFLADMEKATGFKV KAFFASDYAGIIEGMRFKKVDLCWIGNKGAIQMVDRSDGSVFAQTTAADGTDGYYSLLVV NKESPLKSVDDVLKNAAGLRFSNGDPNSTSGFLVPGYYVFAKNNVDPKKIFKNVVAANHE SNALAVANRQVDVATCNNEAIGRLEIAHPEKAAQIREIWRSPLIPSDPLVWRNSLPESTK KKIADFIFGYGVKGDVAHAKAILEKLQWGPFKPSTNAQLVPLRQLELFREKSKLQANAEM PAAEKAERIKKIDEALDKLSL >gi|316921772|gb|ADCP01000140.1| GENE 52 54171 - 54962 1110 263 aa, chain + ## HITS:1 COG:ECs5086 KEGG:ns NR:ns ## COG: ECs5086 COG3639 # Protein_GI_number: 15834340 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate/phosphonate transport system, permease component # Organism: Escherichia coli O157:H7 # 7 263 23 276 276 332 69.0 4e-91 MSSEFALSPPRKRPMELLRLVGWAVALAFLVWSWRGAEMRPMTLITESGNILTLIKDFFP PDFTDWRLYATEMVVTLQVAIWGTLLAVVCAVPLGILSSDNIVPWWVYHPVRRLMDAARA INEMVFAMLFVVAVGLGPFAGVLALWVHTTGVLAKLFSEAVEAIEPSPVEGVRSTGASFL EEVVYGVIPQVFPLWISYSLYRFESNVRSATVVGMVGAGGIGMVLWELIRSFAFEQTCAV MALIVLVVVVFDMLSQQLRKRVL >gi|316921772|gb|ADCP01000140.1| GENE 53 54962 - 55687 580 241 aa, chain + ## HITS:1 COG:no KEGG:Dbac_0825 NR:ns ## KEGG: Dbac_0825 # Name: not_defined # Def: protein of unknown function DUF1045 # Organism: D.baculatum # Pathway: not_defined # 3 231 4 230 234 147 39.0 
4e-34 MPARYAVYYAPSAGDVLHQAMTPLLGRDALGGLNVPQATPPGVDPVFWKAVTRVPAHYGL HATLKAPFELRHSGMDSQLLRSTGEVASRFLPFAIPSLSLAYLGKEEKGFYALVPSTKCS LLSFLERACVMDLDAFRAPLKTEDVARRGHLSLEERSNLYMWGYHRVLDSFQFHITLTDG IADAGLRALVGAGLRKALEGVLDAPLRIDALTLFKQENRNRPFSAVARLPFANVQTAQET A >gi|316921772|gb|ADCP01000140.1| GENE 54 55687 - 56127 459 146 aa, chain + ## HITS:1 COG:ECs5084 KEGG:ns NR:ns ## COG: ECs5084 COG3624 # Protein_GI_number: 15834338 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Escherichia coli O157:H7 # 6 146 8 149 150 110 45.0 1e-24 MSLTVRQERMRLLALASVERLEAGLSAITPPDYEVVRGPETGLVMIRGRVGNTGDAFNMG EALATRCVVSLKDGCLGYAWILGEDARRAELAALYDALWQRDGYAEPLDRELLPALEAER ARRVREDAAAVEPTRVDFFTLVRGED >gi|316921772|gb|ADCP01000140.1| GENE 55 56127 - 56729 499 200 aa, chain + ## HITS:1 COG:mlr3343 KEGG:ns NR:ns ## COG: mlr3343 COG3625 # Protein_GI_number: 13472900 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Mesorhizobium loti # 3 192 11 201 203 137 41.0 1e-32 MPAFSQPIFDMQRVFRVVLDAMSRPARLCRLPVFPQSVPDGLSPVLADLALTLCDGESPL WLSPGLRKDETTTWLRFHCGCPIVEEADKAAFALVADASELPPLSAFAQGVPAYPDRSAT VLLAGLAFDAGTRFRGSGPGICGESTFTCTLPEDFAERWRENASGFPLGVDMLLCGADGV AGLPRTTRLERLPQESASCM >gi|316921772|gb|ADCP01000140.1| GENE 56 56720 - 57829 1198 369 aa, chain + ## HITS:1 COG:AGc299 KEGG:ns NR:ns ## COG: AGc299 COG3626 # Protein_GI_number: 15887533 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 361 1 351 369 389 58.0 1e-108 MYVAVKGGEKAISAAHELLRREGRGDPSVPDLGIDQLIGQLRLAVDRVMAEGSLYDRDLA ALAIKQAQGDLLEAVFLVRAYRTTLPRFGSSLAVDTSAMRVLRRISATWKDLPGGQILGA TFDYTHRLLDRTLFRAEAGAAGSSPLPVEPDTDFPLGGPLPRALDALVAEGLLERPEPSE GDACRDITRHPLELPADAPADRGVRLQALARADEGFLLGMAYSTQRGFGAVHPFTGELRR GWVEVSFFSEELGFDVVIGEIELTECETVSRYTGTARPPCFTRGYGLVFGNNERKALSMS 
ITDRALRAKELDEPVVGPAQNPEFVLLHGDNVDASGFVQHIKLPHYVDFQADLSLIRGLR AKRQNDKDI >gi|316921772|gb|ADCP01000140.1| GENE 57 57830 - 58714 1091 294 aa, chain + ## HITS:1 COG:phnJ KEGG:ns NR:ns ## COG: phnJ COG3627 # Protein_GI_number: 16131924 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Escherichia coli K12 # 10 288 2 280 281 444 74.0 1e-124 MPQTDVCADAPLEGYAFAYLDETTKRMIRRGLLKAVAIPGYQVPFASREMPMPCGWGTGG IQVTASILGRDDVLKVIDQGSDDTTNAVSIRDFFRKTTGVATTTHTDEATIIQTRHRIPE QPLTESQIMVYQVPLPEPLRWLEPSEQETRTLHALEEYGIMNVKLYEDIVRHGTIATGFD YPVRVAGRYIMSPSPIPRFDNPKMHQCPALQLFGAGREKRLYAIPPYTDVESLDFEDYPF TVQSWDEACAICGSTDTYLDEVILDDQGNRMFVCSDTDCCARRAAERDGQGGAQ >gi|316921772|gb|ADCP01000140.1| GENE 58 58711 - 59496 359 261 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 17 260 30 267 329 142 34 5e-33 MKDWNAAPILSARNLTRRYGEHIGCRDVSFDLWPGEVLGIVGESGSGKSTLLNMLSGRME PTSGTIAYRDAAGRERNIHALPEGEKRRLLRSELGFVHQRPSDGLRMQVSAGANIGERLM AEGARHYGTLRGRALQWLERVEIDAGRIDDKPALYSGGMQQRLQIARNLVTSPRLVFMDE PTGGLDVSVQARLLDLLRHLVLDMDLAVILVTHDLAVARLLSHRLMVMYRGEVVETGLTD QVLDDPHHAYTQLLVSSILQG >gi|316921772|gb|ADCP01000140.1| GENE 59 59500 - 60189 214 229 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 17 224 7 210 245 87 31 3e-16 MNTPIIEVRNLSKTFVLHQQGGVAIEALSGISFSVDAGECVALHGPSGAGKSTLLRALYG NYLPTGGSIRVRCGGEDVDITSADPRTVIRLRRGCISYVSQFLRVIPRVSTLDLVAEPLL EAGEGQAAARSRAEGLLARLNLPERLWHIAPATFSGGEQQRVNIARGFIRPSPILLLDEP TASLDGANRQVVIDLIREARGNGAAIVGIFHDDVAREAVADRTIPVRRP >gi|316921772|gb|ADCP01000140.1| GENE 60 60216 - 61355 1325 379 aa, chain + ## HITS:1 COG:SMb21171 KEGG:ns NR:ns ## COG: SMb21171 COG3454 # Protein_GI_number: 16264585 # Func_class: P Inorganic ion transport and metabolism # Function: Metal-dependent hydrolase involved in phosphonate metabolism # Organism: Sinorhizobium meliloti # 3 378 4 378 
379 345 49.0 9e-95 MNETVFTNARLITPDAVVEGTLAVRDGRIVSLDNGASSLAERIDLEGDYLMPGLVELHTD NLEKHFIPRPKVLWPQGVPAFFAHDAQIVASGITTVFDALSVGEYHDKGRIAMLGKAVGA LNSCRSTGKLRSEHLLHLRCEVADPRMQDLFFPLSDTPALSLVSLMDHTPGQRQWRDTSS YRTYYSNIASWTDEEFEQVVERLQTARDGCAEDNAASVMAFCRERGLPMASHDDTLVEHV EEALGNGIAISEFPTTMEAARHASENGMLVLMGAPNVVRGGSHSGNISALEIAQAGYLGS LSSDYVPVSLLEAAFALSEGGYMSLPAAVGLVTANPAEAVGLADRGSLEVGRRADLVRVS MVEGQPVVRNVWREGAEVF >gi|316921772|gb|ADCP01000140.1| GENE 61 61355 - 61927 334 190 aa, chain + ## HITS:1 COG:PA3373 KEGG:ns NR:ns ## COG: PA3373 COG3709 # Protein_GI_number: 15598569 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized component of phosphonate metabolism # Organism: Pseudomonas aeruginosa # 4 170 3 170 185 127 48.0 1e-29 MRQGQLVYVMGASGSGKDSLLQALRPRLRGVPVAFARRYITRPLSAKGERHIAVSPERFA AMAASGEFMMDWRSHGLRYGIGKGVETTLAHGGVVLMNGSREYLPEALRRFPSLVPVLVE VDASVLRARLLARGREPGGDIEERIARASMPAEYPDGVLRIDNSGELAESTAAFYRMVAR LIEEKTDFRQ >gi|316921772|gb|ADCP01000140.1| GENE 62 62110 - 63117 945 335 aa, chain - ## HITS:1 COG:MT0593 KEGG:ns NR:ns ## COG: MT0593 COG0500 # Protein_GI_number: 15839965 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Mycobacterium tuberculosis CDC1551 # 1 318 1 329 340 101 25.0 2e-21 MKDWTPSALLHASGMYWESCAIHAAVHLDVFTPLSGGPMTVTELADRTGCVPQSLDRLLT ALCAMGLLIRQGETCELPPFSRDHLCRGNPGYLGHIVEHHRHLMPGWSKLDESVRENRPT RHRSSVDTEDTGERESFLMGMYNIANAQAERSVPHIDLSGRSHLLDLGGATGAYAIHFCR RNPGLSATVFDLPTTRPFAERVIADSGMADRISFAGGDFTTEDLPGGFDVAWLSQILHGA GFDESANIVRKAAQSLVPGGLLLIQEFILDDNRSGPQHPALFDLNMLVGTPDGQSYTRQE LAGFMRNAGLEDIRRLPLDLPQGCGIMAGTVPVGM >gi|316921772|gb|ADCP01000140.1| GENE 63 63751 - 66684 2517 977 aa, chain + ## HITS:1 COG:PA4367_3 KEGG:ns NR:ns ## COG: PA4367_3 COG2200 # Protein_GI_number: 15599563 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Pseudomonas aeruginosa # 706 970 2 266 267 191 35.0 
5e-48 MRRHGYGISVFGAIVLCCLFFCFPAQSDDNAVSIRVGYYEDGDYMSRSRSAEYIGFNIEF LQKLAKVGGMRYEMLDEGSWENAWHQLLEGHIDMIPAVFRTEEREKEVLFSNLPMGTLYV TLNVRVDDKRYGYEDFSSFQGMRVGIIRGSSDGERFQRYCEKFGVKPIIVPYAETQALLS ALEDGTLDGVAITHLGRSSTFRSVAQFSPEPVYIAVAKNRPDLLARLDAAMNAIVLRDPN YMMRLYDKYFAVSMAQQPVFTKEEEAFLKQADVVVAAYDPSWAPLEYTDTETGMFSGVTS DLLSTFSRYTGLRFRFEPIDQAEALEKVKKGLIDMVCSVTGDYLWDERNKMYTTRYYLRA PTMLVRTKQPHAIERIALQGGYWFSEHVAADHPGKEIIFYKNVRECFDALLKGDADAVYA NVYVTNYLLTERRYESLAATSMSKYTSEICFGISKCADPRLLSILDKCIQYTAPERMDEL VLKNTTKPRQITLLDFVAQHLIEVGCGMLAVFGVILLLLVYNLAIKTRSNHRIQALLYRD GLTNLDNMNKFYVECGKLLRKAPSGSYALLYGDINQFKTINDNLGFAVGDQILRAFGAIL QRNVKDGECCARASADHFILLVRYADWESLLVRIEEIGMELDEWRRMQDMAYRIVTVFGV YLVNETEGFDLHRMLDFANYARRNAKQTMKNIVLYDEKMRQEALLQRELEGRLEIALSQG EFDAYFQPKVDMRDGALIGSEALVRWNYPTRGLLMPGSFIPFFERNGSVVEVDLYIYELV CRTVREWLERGMTVNPVSCNFSSLHFDSPEFPELVAAIADRHNVPHALLELEITESAIMR NPQIVCAQVLRLKERGFMIAIDDFGSGYSSLGQLQQLMADVLKLDRSFVRRGMFGTRERI VIGNVIHMAGELGMQVICEGVENAKQAEMLIELGCYYAQGFYYAKPMPRDVFEGLLEKGK VLQDEKENKKGAWQRKP >gi|316921772|gb|ADCP01000140.1| GENE 64 66951 - 68405 1115 484 aa, chain + ## HITS:1 COG:STM2123_3 KEGG:ns NR:ns ## COG: STM2123_3 COG2199 # Protein_GI_number: 16765453 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Salmonella typhimurium LT2 # 285 463 4 175 194 110 39.0 7e-24 MSIRWVTSLLGLLLTVTIMVTAGLSWNVYSSFTKMVEADVQYLNLMQVSRELLQSSYDLT RKAREYVNTGDEASERAYYDILAQRSGRVPRKSSLAPGETISLPSLLERYGASRGDMEFL ATALRRSDDLSITEIEAMQASKGEALCPDGLHSGSDTIDVEYARRLLFNAAYQKRAGEII KLVESGITSILDRKSHQIEAYKRSVVYHMKELGASIAVVFLIALMWIWYGMKKVSQPLRE TTLFAHRVAEGEAESSIAVSGHDEIAELRAMLNVMLSSLQKNALMLRELSYIDPLTALWN RRRFREVLSDELRRVQEEKGKGFGLAFIDVDCFKGINDTFGHAAGDMVLREFAKLARGML PDSATFARLGGDEFVLLLPDADEGALFRQLEEIRVACTAKPLVHADREVRFSISAGGYRF TSSGAIDTPEENEKTISDLFRYADSALYMSKQAGRDRVTLWSPDCSFPCAESRWKEQSVS DDAC >gi|316921772|gb|ADCP01000140.1| GENE 65 68595 - 69908 896 437 aa, chain + ## HITS:1 COG:no KEGG:Ddes_0525 NR:ns ## KEGG: Ddes_0525 # Name: not_defined # Def: 4Fe-4S ferredoxin 
iron-sulfur binding domain protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 12 412 11 411 438 310 53.0 1e-82 MSIMRSILCLTPSSLAFVLGGAHFWRAGQYPFSVGCMLFALLVWRREAWIRQVTLVLLPL LAARWVWAAAQFVQIRMFMEQPWMRLACILLGVSLFTATAALLLEGKTGDAWYARGRKRA LMRTAAFFGTIAILTPLLFVAPQVFLTERFLPGWAPLHILLAGFWASWVASRLGDRDAAP RTRLFIWRLFSVAFFVQLALGLAGYGLFLMTGNLHLPVPGMILAAPLYRGGGLFMPILFG VSVLLAGAAWCSHLCYFGVWDTVAASGRKAVPPPRWMSRLRPVFFGLMLAVPAVLRLSGA PTGVAVALGLALGLLLLPVAVLLSRRYGSACYCLAVCPLGLVANWLGKIAPWRIRRTDAC MHCLACIRVCRYGALTPERLKEGRPGPGCTLCRDCLSVCRHGGLAVTLYGKTCGAAESSF VVLLSIMHTVFLAVARV >gi|316921772|gb|ADCP01000140.1| GENE 66 70127 - 71542 2023 471 aa, chain - ## HITS:1 COG:no KEGG:amb1862 NR:ns ## KEGG: amb1862 # Name: not_defined # Def: hypothetical protein # Organism: M.magneticum # Pathway: not_defined # 31 434 17 367 374 216 37.0 2e-54 MFRLTSLATALLSLFMLSAPASAYEAITGPLGLLAYDKAKAYDGYTLFTPHTKVSSWIVD KIQDPGSRKTYLIDMEGNIVHTWKHDHPAFYAELLPNGNLLRAEKIAGSPVNFGGWYGLL REYAWDGKVVWEYKVSNPRQIAHHGFDRLPNGNTAILIWENKTYDEALAKGRDPKDKALS RNGMPAPGQGPDGQPLQGIWPDAILEVTPKGEIAWEWHVWDNIGTGPDQIDINWHLPLSM GYFARADWTHWNSVRYNAKTDQYAVNSRDFGEIYIIDRKTKKIVWRYGNPATHGMGKAPS GYHDDGDQVLFGSHDVEWLPNGNISIFNNGTHRPSANRSSVMEINPATNQVVWQYETKDH NSFYSDFQSAAQKLPNGNWFITSTNNGHLFEVTPDKQVVWEYNNPISTDDQAYCVKRDDN PETQVHRAYRYGKDFPAFAGKDLKPQGKLAAGCPDWHLLLEFRQGPNFSSK >gi|316921772|gb|ADCP01000140.1| GENE 67 71656 - 73059 1811 467 aa, chain - ## HITS:1 COG:no KEGG:Ddes_2094 NR:ns ## KEGG: Ddes_2094 # Name: not_defined # Def: sodium/sulphate symporter # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 11 466 11 465 466 520 60.0 1e-146 MTLPMTGKNCSLALKWLISFAIPLIVYWAMPIDGQTVTHHMALFLSITTWAICIWAMDAM NEIAVGLILPVLYIVLCGVKQQVVYSPWLSEVPLIVIGGFALGKILQETGLGKRIGLTCV RCMGGSFTGTLIGITLAVSIVAPLVPAITGKAAIFCAIGISLCDALDFKAKGREATAVIL TCCLAVASTKLCYLTGGADLVLGMGLLQKASGISSTWMEYAMHNFLPGMLYTFMSVGLVM LILPSKVDRQALGATLQTKYQELGPVTSEQKRAAILMLVTLILLATDKLHGVAAGLVLIG VTFAAFLPGVRLLDGPKFSKVNFAPLFFIMGCMTIGSAGGFLKVTTWIANNTLPYLHGMT 
TTTAGLASYVLGALANFLLTPLAATTTMTSPVTELGIQMNMDPRILFYSFQYGLDNYLFP YEYAVILYFFSSGYMLFKDMLKVLAARMVLTGIFLFFLAIPYWKMVL >gi|316921772|gb|ADCP01000140.1| GENE 68 73329 - 74012 663 227 aa, chain + ## HITS:1 COG:Cgl0291 KEGG:ns NR:ns ## COG: Cgl0291 COG0664 # Protein_GI_number: 19551541 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Corynebacterium glutamicum # 11 196 14 198 227 70 32.0 3e-12 MEMNDAKLIEFTHIPPEAGVWRQVLDLGTRLRFRKGEEIMHGGEKGEYLYYVHQGEARLM RTTLDGREKILMSMRSGTVMGETPFFDEVPARSSIVAGTDCVVYAFSRDCVLHEILPNYP ALTLALLHSLASKVRVLCNQSVSLSLEELPPRICRFLHLRMRDDGGAPNPRVSPGLNQQE LANLLGVHRVTLNKALRELEREAVLGPYSRDEVYIVDMERFHELSLK >gi|316921772|gb|ADCP01000140.1| GENE 69 74125 - 76071 1810 648 aa, chain - ## HITS:1 COG:no KEGG:LI0080 NR:ns ## KEGG: LI0080 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 98 606 20 502 509 229 29.0 4e-58 MSTDTPVQKDEPNASAEQQSAPVQPVSEEPVAPVTETAATDTDAQPSPAPETETVTPPPA EPAPAKPSPVQEEKAAPAPKAAKVSRSPAPEAQPQTSSFASRCFNGLASVGPLVLILFWL IQSMPTFMGRELHALHNLGVGLFGVAASGLPSPELYPVYHWFLSALSLIPGIDSLSLAAY IPGMQPTAGLEQISGYPVQLLPIASALSALFLILLTWALARATGNDRRTAFASGLVLLTG FACMGLPRISGSDMLFTAILTLAGICLYRGWIKAFAPLWLFAGFALVALSALAGGLLGLV LPLLVSLVFLLWRGTFRRAGARDGALAFGLLLVLLLAWGTFIAFGDGGRELLKTLLENEY LAPVLEAWKLQGQDSWIIVALLAVLWLPWTLLLLFLPWGRIGTFFKGIVVNRKQRPGQGW LWCSAIVTLAVLALLGANMPVLLMPLLPPLAVLTAQGVLNLSARGSRGFFLLLAILLIAL GLLFAAANIYPLFLGELPAPLSALQPTPLALVAALVQTCGLVLLGIILWKALNRSFAGGA LLVLTFLVLVYTAPMTYYAAAGAPVAVATPVEPAPAQEDPATEKVPSETPVQPSGPDAPA TDEQAAPAPSTPVEAPAATEPQATPLPAPEEPAPAAETAPAAQPAPAN >gi|316921772|gb|ADCP01000140.1| GENE 70 76174 - 76827 700 217 aa, chain - ## HITS:1 COG:DR0815 KEGG:ns NR:ns ## COG: DR0815 COG1802 # Protein_GI_number: 15805841 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Deinococcus radiodurans # 13 209 43 235 267 88 32.0 1e-17 
MHTFKRQTYSGQVVEFIKRCILDGELAPGEQVKEVILSERLGISRAPIREALQILTQDGL ITSEPQKGKYIRKMTRQEIENEYEIGGILEGAGVAAALPLMNHVDLARLSGMIEHMARLA PRVGGLYEFTEVDDAFHSALLAHCANRRLVEMARSACTNISKFLFYNHWNVLFTPQEFVD RHRDIFEAVQTRNPATVEESLREHYRESGRRMAQFGS >gi|316921772|gb|ADCP01000140.1| GENE 71 77150 - 78247 1364 365 aa, chain - ## HITS:1 COG:slr2100 KEGG:ns NR:ns ## COG: slr2100 COG3437 # Protein_GI_number: 16330586 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and an HD-GYP domain # Organism: Synechocystis # 6 355 10 366 368 328 49.0 8e-90 MGYPKKPVVLVVDDTIANIRILDDLLRGEYTVRVATNGATALRLALSEPRPDIVLLDIMM PDMDGYEVCRRLKSDPLTSNIAVVFITAMGNEEHEAKGLDLGAVDFIAKPFQPRLVRARV SNHVALKKYNDELNSLVKERTKELYLSRSVTVECLASVVETRDNETGAHVRRTQHGVEIL AKRLRLMRPDIWDDDDDTIELFRVCAPLHDVGKVGIPDSILRKPGKLTEAEFNEMKKHTT LGYQTLSWAEKRMGGHNDFLRLGAVIAYTHHERWDGKGYPRGLKGEEIPRVGRLMALADV YDALISKRVYKPAFDHETARDIIVEGRGTQFDPLVVEAFLFEEDAFRSLIVTYPDDENPL LVSND >gi|316921772|gb|ADCP01000140.1| GENE 72 78234 - 79325 962 363 aa, chain - ## HITS:1 COG:alr2279_4 KEGG:ns NR:ns ## COG: alr2279_4 COG0784 # Protein_GI_number: 17229771 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Nostoc sp. 
PCC 7120 # 2 143 151 304 332 87 35.0 3e-17 MHQPEHTPRTLFVDDESINQQIGAEILLALGAEADTADNGLQALRLIRTRHYAIVFTDVE MPVMDGATLTHILRSDSRYGELPIIACTAISSPEEEEALLQKGFTAVLRKPFDVQAMQTA LNRHATLSGEAPAPSPEPRPRVEAGHAIREELEHLPGFSVRQGLDRVMGNLPLYVSLLFD FRRELAEADAEIRQARLRDDMPSIRASVHKLKGISGNVGATDLYTQLVELEQCLQGGSAE NVDNLLDTLYGYVTTTESVLAAAGQAIAAYPEAPLSIPAAGETLDVRKLQPALCECLLLI RQYNTAAREAFGSVRRFLGGQCGAETNEIQHAIDTFDFEKAESGLLHLAQLLDLELREPN GLS >gi|316921772|gb|ADCP01000140.1| GENE 73 79346 - 82552 3163 1068 aa, chain - ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 508 770 4 266 280 290 56.0 1e-77 MLGSFLFVLTFSFAAFSETELILSPDEKLWLAQHPELRLGVIASRLPFEALSSERKQEGI ASDYANWLEKTLGIRFVAHPVPDTGLHLPFGEGKLDVHLSAVDTPESRKTMLFTRPYLEL PLVLFMMNDAPFVNGLQDMKGKRVAAVGERLLDILKDTSPEIRAVEARTTREALEMLERR QVDAVFEILDAGTHVIRLNNIPQVIVAAVTPYRLPIRIGVRKDWPQLAAILDKALGGIPP TASQDFHNRWFNVQRSAEIAWSLFWEMTATVILIALGIVLTALFMLRKVRKEVLVRRNTE AMLNSLLENLPAAVCLFDSSCRCRKLNAMCASIFGFTEQTALDGVPGTDFPVASGVLSEK VLRDVLESGEPRMFPHSRTDADGKTAYYQTTLVPLRKEDEKTDAVLCLSADMTTQKLLEK KLSEQLAFAQKLLKTSPIGVLIAVEGFIRYSNPRARELMDLREEMDVARVYPQLREAAAM LPTLEDTLTDIALKSRDAQGQPHHLRVTYTRTEYEGAPGIICWLVDDTQNKELEKQLVLA KEEAETASRAKSDFLANMSHEIRTPMNGIIGMTHLTLQTELTSRQRDYLTQISTSANTLL RIVNDILDFSKIEAGKLEMEHADFQLEQVLEEVASIADLSAAEKGLELLSRISPDVPPVL EGDALRLSQILLNLVGNAIKFTQSGHVLVSVTQERHEGSKTRLRFSVSDTGIGMTSEQIG NLFESFTQADNSTTRRYGGTGLGLAICKRLVNLMGGEIHVRSTPGHGSDFIFTAEFGTVA VPERRLLLPSHPLHGERILIVDDNELSRNILSGMLREFQLDPVSAASVGEAVRLCRDAAR EGRPYRVALVDWLMPGMNGKDTAQAICAALPKRPVILFMATIHDRPDVLAHIREGKDIPM RLITKPVTPSSLISALLEAEQPQDRPDRAEPSAAPFPELVGARVLLAEDNAINRQVANEI LQSSGVSVEPAGDGLEAVEKLWEGGYDAVLMDIQMPGMDGLKATRILREYARFDNLPIIA MTAHAMNGDREKSLSVGMQDYIAKPIEPSVLLSTLARWIHRDGRSPRR >gi|316921772|gb|ADCP01000140.1| GENE 74 82723 - 83595 891 290 aa, chain + ## HITS:1 COG:sll0228 KEGG:ns NR:ns ## COG: sll0228 COG0010 # Protein_GI_number: 16329302 # 
Func_class: E Amino acid transport and metabolism # Function: Arginase/agmatinase/formimionoglutamate hydrolase, arginase family # Organism: Synechocystis # 8 290 13 303 306 192 39.0 7e-49 MPHETTHRFLASELPVAEPESCLFHVLPVPLEATVSYAGGTARGPEAILEASDQLELWDG ESVPAEAGIYTWPAVDISGSPETVLGRVEAAVGGILDCNGLPVLLGGEHTVTYGALAALK KRFGTFGIIQFDAHADLRDSYEGSPWSHACVMRRAVKDLGLPLAQFGVRAMCLDEVEARK EHGVTFHDAASIARSGLPESLLPPDFPRQVYLTFDVDGLDPSIMPATGTPVPGGLGWYQS LDIAERTLSGRDVLGFDVVELAPVPGMHASDFAAARLAYALMGIVARGRR >gi|316921772|gb|ADCP01000140.1| GENE 75 83784 - 84581 789 265 aa, chain - ## HITS:1 COG:PA0005 KEGG:ns NR:ns ## COG: PA0005 COG0204 # Protein_GI_number: 15595203 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Pseudomonas aeruginosa # 72 241 60 219 257 89 35.0 5e-18 MTTSSHLFEANWYLTSPRTPSFLSRVSPNLGFYAPMIWIVFKAGCNAMRNQYDGEAWAQS SEGIAHALESVGCRIEAAGMEHFRSLEGPCVFIGNHMSTLETFVLPSMIQPWKDVTFVVK ESLLKYPFFGPVLGSREPIVVGRSNPREDLVAVLEGGEARLKQGRSVIVFPQSTRSSVFD PAHFNTIGVKLAKRAGVPVIPVALKTDAWGNGNLLKDFGPVDVKKTIHFEFGAPMRIEGT GKAEHQQITEFIIDRLGVWGREIGE >gi|316921772|gb|ADCP01000140.1| GENE 76 84591 - 85814 1439 407 aa, chain - ## HITS:1 COG:CAC0326 KEGG:ns NR:ns ## COG: CAC0326 COG2256 # Protein_GI_number: 15893618 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Clostridium acetobutylicum # 6 403 16 436 443 308 39.0 1e-83 MSLSQPLPERLRPSELALFVGQSHLAERLTTLLEGPRLPSLLLFGPPGCGKSTLALLLAR ARGGNVLRLSAPEAGLQQLRRQLTGVDILVLDELHRFSKAQQDFFLPLLESGDLTMIATT TENPSFSVTRQLLSRLHVLRLRQLGRPELLELGKRGAAALSVTLNDEILDFLSTAAHGDA RTMLNLVEYASSLPEEKRNLPHVKATLPEIMARHDKDGDSHYEYASALIKSIRGSDPDAA LYYLACLLEGGEDPRFICRRLILSASEDVGLADPQALPLAVACQQAVEFVGMPEGYIPLA ETVVYLALARKSNSTYAAYANAAREVKVNGPQSVPLHLRNASTQLQREWGYGKDYKYPHN FPQAWVEQDYLPPSVRNRVFYQPKEYGEEPRLASWSKRLKRTKPENR >gi|316921772|gb|ADCP01000140.1| GENE 77 85850 - 86389 747 179 aa, chain - ## HITS:1 COG:DR1110 KEGG:ns NR:ns ## 
COG: DR1110 COG2065 # Protein_GI_number: 15806130 # Func_class: F Nucleotide transport and metabolism # Function: Pyrimidine operon attenuation protein/uracil phosphoribosyltransferase # Organism: Deinococcus radiodurans # 6 174 8 176 183 152 49.0 3e-37 MKTTVLLHADEIARVLDRLACQIMERHGDCEQTVLLGIQRRGVDLAVRLGKVLEDKLGRK LPFGTLDINLYRDDWTTMHARPTIGESNITTPLDNKNVILVDDVLFTGRTIRAALEAILD YGRPKTVELLVLVDRGHRELPIHGDYVGRKINTARSEKVNVLMEERDGRDEVVLESADA >gi|316921772|gb|ADCP01000140.1| GENE 78 86762 - 87100 370 112 aa, chain + ## HITS:1 COG:no KEGG:DVU0359 NR:ns ## KEGG: DVU0359 # Name: not_defined # Def: HesB-like domain-containing protein # Organism: D.vulgaris # Pathway: not_defined # 1 91 1 91 108 119 64.0 3e-26 MISLTEEARKELEAYFEGKEKTPIRVYLAPGGCSGPRLALALDEPGEDDETSEDNGLTLC INKELAEKVGAVTVGMTHMGFVVESEIPLPSGGNGGGCSSCCGGCGSSGGCH >gi|316921772|gb|ADCP01000140.1| GENE 79 87330 - 87428 115 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFTISETASHELEAYFADKEKTPIRVYLAAGG >gi|316921772|gb|ADCP01000140.1| GENE 80 87889 - 88653 1148 254 aa, chain - ## HITS:1 COG:AGc1374 KEGG:ns NR:ns ## COG: AGc1374 COG0623 # Protein_GI_number: 15888100 # Func_class: I Lipid transport and metabolism # Function: Enoyl-[acyl-carrier-protein] reductase (NADH) # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 254 7 259 272 280 53.0 2e-75 MLLKDKKCLIFGVANNRSIAYGIASCFKEHGARLAFSYAGEAIHKRVEPISEELGGEFIF PCDVTSDEDITAAAALVKEKWGGVDVLVHSVAFANREDLTKRFIETSRDGFATALNVSAY SLTALCHAFEPMLNPGASVLTMTYYGAQKIITNYNVMGVAKAALEASVRYLSVDLGAKDV RINAISAGPIKTLASSGISDFKQIFNHIEEHAPLRRNVTTEDVGKAAVFLASDLGSAVTG EIMFVDCGYNNLGI >gi|316921772|gb|ADCP01000140.1| GENE 81 88977 - 89873 1340 298 aa, chain - ## HITS:1 COG:NMB0757 KEGG:ns NR:ns ## COG: NMB0757 COG0152 # Protein_GI_number: 15676655 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Neisseria meningitidis MC58 # 18 292 12 285 287 304 54.0 1e-82 
MKVTTKTDIKEFPLLNRGKVRDIYDIDDSTLLIVTTDRMSAFDVIMNEPVPYKGVILNQI TLFWMKKFAHLVKNHLIESDVDKFPAALAPYREELEGRSVLVKKAKPLPVECIVRGYITG SGWKDYKATGSVCGYKLPEGLLESARLEQPLFTPSTKAELGEHDENISTARAAEILGKDI ADQIENIALALFKEGRAWAESRGIIIADTKFEFGMVDGELILIDEVLTPDSSRFWPAEGY KAGQSQPSFDKQYLRDWLSGQPWDKTPPPPALPKEVVEATQNKYLEAYRILTGSELAL >gi|316921772|gb|ADCP01000140.1| GENE 82 89917 - 91218 1627 433 aa, chain - ## HITS:1 COG:BS_hisD KEGG:ns NR:ns ## COG: BS_hisD COG0141 # Protein_GI_number: 16080544 # Func_class: E Amino acid transport and metabolism # Function: Histidinol dehydrogenase # Organism: Bacillus subtilis # 16 433 10 423 427 402 51.0 1e-112 MTCRLLSLSSEAMWPELLRLLENRDNPDQTVETDVRAMLDAVRKGGDSAVLEYVRRFDCP DMRPPLRVPAEDLEQAARDIPAESLDCIKQAADNIRAFHEAQKDRSWFLTRDDGTILGQK IEAVDRAGLYVPGGRGGNTPLISSLLMTAIPAQAAGVREIAVVTPPRADGTLNPHILAAA HLLGVHEIYRMGGAWAIAALAYGTETVAPVDVIAGPGNIWVTTAKRLVQGRVGIDMIAGP SEVLVLADDSANPEWLAADMLSQAEHDALASAICITDNEALAHAVIAELERQSATLPRAD IVRKSLADWSAVVLVPDMDAAIALANRVAPEHLEVLMREPWSVLPRIRHAGAVFLGPYAP EPLGDYFAGPNHVLPTLGTARFSSALSVQTFCKRTSIIAASPAFAAANADAVARLARMEQ LEAHARSMECRKQ >gi|316921772|gb|ADCP01000140.1| GENE 83 91463 - 93829 2537 788 aa, chain + ## HITS:1 COG:BMEI1055 KEGG:ns NR:ns ## COG: BMEI1055 COG5009 # Protein_GI_number: 17987338 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein # Organism: Brucella melitensis # 5 769 8 787 819 517 38.0 1e-146 MKRVFFLFGTIIGVLVALGLAAVVGVYFWAVRDLPGFTRIADYRPALTTTVLARDGSVIG SFYRENRFLISLDQMPKMLPKAFLAAEDAEFYEHEGVNPVAIFRAFLINLQSGTKRQGGS TITQQVIKRLLLTPERSYERKIKEAILAYRLEHYLSKDEILTIYLNQTFLGANAYGVEAA ARTYFGKNAKDLTLAECALIAGLPKAPSQYNPYRDPSAAIGRQHYVLRRLRELNWITEAE YDAALKQPLVFKQMADGMGRQGAWYLEEVRRQLIDLFSEENAKRLGIELPVYGEDAVYQM GFTVQTAMSPPAQMAADEALRSGLEDFTRRQGWQGPVEKIPVADLKERIEKSHFSPDMLA NGERVKGIVTRITDKGADIRLGKYSGFIDNAQLSWAKKTLRSKQKFLESGDVVWVAAVSD PKNPYNPDEAELNKPIRLSLQCVPELQGALISIEPETGDVVAMVGGYSFQESQFNRATQA FRQPGSSFKPIVYSAALDNGFTAASVILDAPVVQFLESGDVWRPSNYEKNFKGPLLLRTA 
LALSRNLCTIRVVQQMGVQKVIERAKDLELEPHFPEVLSVSLGAVAVTPLNMTQAYTAFA NGGMVSKARFILSVKNFWNETIYESQPDLRDAISPQNAYVMSYLLKEVVNAGTATKAKVL GRPVAGKTGTSNDWKDAWFIGFTPHLVTGMYVGYDQPRTMGRSGTGGSMALPIFVEYAKS AFQAYPPDDFEVPDGISFANVDQTSGHLVGSGGLRLPFYTGTEPGSSTSEEAISGVNTRG EDLLKQMF Prediction of potential genes in microbial genomes Time: Fri May 13 04:41:15 2011 Seq name: gi|316921753|gb|ADCP01000141.1| Bilophila wadsworthia 3_1_6 cont1.141, whole genome shotgun sequence Length of sequence - 20966 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 8, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 73 - 132 2.4 1 1 Tu 1 . + CDS 202 - 639 470 ## COG3238 Uncharacterized protein conserved in bacteria + Term 711 - 749 4.2 - Term 831 - 867 5.4 2 2 Op 1 31/0.000 - CDS 891 - 1919 1691 ## COG1294 Cytochrome bd-type quinol oxidase, subunit 2 3 2 Op 2 . - CDS 1955 - 3274 1691 ## COG1271 Cytochrome bd-type quinol oxidase, subunit 1 - Prom 3346 - 3405 3.3 4 3 Tu 1 1/0.000 - CDS 3532 - 4734 1841 ## COG0426 Uncharacterized flavoproteins - Term 4769 - 4809 11.1 5 4 Op 1 3/0.000 - CDS 4839 - 4997 235 ## COG1592 Rubrerythrin 6 4 Op 2 . - CDS 5118 - 5501 587 ## COG2033 Desulfoferrodoxin - Prom 5657 - 5716 4.4 - Term 5691 - 5735 3.5 7 5 Op 1 . - CDS 5774 - 6706 889 ## COG0470 ATPase involved in DNA replication 8 5 Op 2 . - CDS 6734 - 6913 75 ## - Prom 7080 - 7139 4.2 9 6 Tu 1 . 
- CDS 7195 - 10260 3508 ## LI0012 hypothetical protein - Prom 10294 - 10353 4.0 - Term 10417 - 10460 13.0 10 7 Op 1 37/0.000 - CDS 10487 - 11257 369 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc - Prom 11381 - 11440 8.0 11 7 Op 2 23/0.000 - CDS 11444 - 12616 1415 ## COG0133 Tryptophan synthase beta chain 12 7 Op 3 9/0.000 - CDS 12619 - 13251 525 ## COG0135 Phosphoribosylanthranilate isomerase - Prom 13366 - 13425 2.1 - Term 13272 - 13316 -0.4 13 7 Op 4 21/0.000 - CDS 13467 - 14252 769 ## COG0134 Indole-3-glycerol phosphate synthase 14 7 Op 5 10/0.000 - CDS 14242 - 15834 2118 ## COG0547 Anthranilate phosphoribosyltransferase 15 7 Op 6 . - CDS 15815 - 17257 1753 ## COG0147 Anthranilate/para-aminobenzoate synthases component I - Prom 17484 - 17543 3.8 + Prom 17722 - 17781 3.0 16 8 Op 1 8/0.000 + CDS 17819 - 19180 1607 ## COG4303 Ethanolamine ammonia-lyase, large subunit 17 8 Op 2 6/0.000 + CDS 19190 - 20164 685 ## COG4302 Ethanolamine ammonia-lyase, small subunit + Term 20190 - 20229 -0.4 + Prom 20195 - 20254 1.5 18 8 Op 3 . 
+ CDS 20286 - 20945 495 ## COG4816 Ethanolamine utilization protein Predicted protein(s) >gi|316921753|gb|ADCP01000141.1| GENE 1 202 - 639 470 145 aa, chain + ## HITS:1 COG:PM1890 KEGG:ns NR:ns ## COG: PM1890 COG3238 # Protein_GI_number: 15603755 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 4 143 3 146 156 68 36.0 3e-12 MEKIFYLGMMIAAGLLFGLQSPINAQLSRSVGVLEGSFLSFLGGTLVLGLAMLFFGKGHV GGVFGVPAWQLVGGLLGATVVFNTILCVPHIGVLPTLMAMILGNLIMGCIIDHFGWFGIP VTLFTWRRFLGVLLVLLGLLIAFKR >gi|316921753|gb|ADCP01000141.1| GENE 2 891 - 1919 1691 342 aa, chain - ## HITS:1 COG:BMEII0759 KEGG:ns NR:ns ## COG: BMEII0759 COG1294 # Protein_GI_number: 17989104 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 2 # Organism: Brucella melitensis # 20 342 1 355 355 231 40.0 2e-60 MLETTWFVLWGLLWAVYFVLDGFDFGMGTLLPFLAKNDEEKRIIYNAAGPYWDGNEVWLI TAGGVTFAAFPKAYAVMFSALYAPLLILLFALIFRAVSFEFRNKKDCPQWRKFWDACQFL GNFLPALLLGVAFANLFMGIPIDGNGVYHGNIIKLLNPYGLAGGIFFVLIFCMHGALWLA LKSEGDIHERAVGAAQVVWPLVLAVALLFLALTAFYTNLYANYAEHPTLLIFPILAVASL LTVRVMIWADNLLGAWFFSALFIITVTFFGVLGMYPGIIISSIDPAYSVTVFNGASSQLT LKIMLGVTLCCIPVVIAYQAWVYYTFAHKVTPESLQKDDHAY >gi|316921753|gb|ADCP01000141.1| GENE 3 1955 - 3274 1691 439 aa, chain - ## HITS:1 COG:MA1006 KEGG:ns NR:ns ## COG: MA1006 COG1271 # Protein_GI_number: 20089882 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 1 # Organism: Methanosarcina acetivorans str.C2A # 1 425 2 429 438 377 45.0 1e-104 MDVVMLSRLQFAVTVFFHFIFVPLTLGLVVLLAIMETLYVRTGNEVYKRMVKFWGKLFLI NFVLGVVTGITLEFQFGTNWSRYSEYVGDIFGSLLAIEATAAFFLESTFLAVWAFGWERV SKKVHLTAIWLVAIASNLSALWIILANGWMQNPVGYVIKNGRAELSSFFDVVTNPFAWSQ YFHTIFAAWMLGGFFVLGVSSWHLLRKNELDLFKRSVRIAAPFTLILVLLLGLQGHHHAQ IVAEMQPAKLAAMESHWETGTHVPMYLLTWPDQANRTNSIQALPIPSLLSLLAFNSPSAE VKGLDAFPADDIPPVLPTFLSFRFMVACAGLFVLLAIAAWWWRKDLENHPLLMKALIYVI 
PLPYLGIMAGWAVAEIGRQPWIVYGLMRTSDAVSPVPTSSVGLSLAAFIVIYTFLGILDI YLLRKYAKKGPEAAPEAQA >gi|316921753|gb|ADCP01000141.1| GENE 4 3532 - 4734 1841 400 aa, chain - ## HITS:1 COG:MA3743 KEGG:ns NR:ns ## COG: MA3743 COG0426 # Protein_GI_number: 20092541 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Methanosarcina acetivorans str.C2A # 4 399 13 405 405 331 40.0 2e-90 MKPVELAKDIFWIGAVDYNKRNFHGYSRSPRGTTYNSYLVRDEKNVVFDTVDNDYAGTMF CRLAKLIPLDKVDYIVVNHTEKDHAGALCQLVERCKPEKIITSTIGKRFMEAQFDTTGWP IEVVKTGDVINIGKRNIHFVETRMLHWPDSMVSYIPEDKLLISNDAFGQNIASSERFSDQ YDRCVLEKAIKEYYYNIVLPFSPQVLKTLELVAALKLDIEMIAPDHGLIWRGKDDCKYIL DTYRALAEQKPKQRAVIVYDSMWGSTGIMASAIASGLEDEGVPVRIIDIQKNHHSDVMTE LADCGAVIVGSATHNNNVLPGIADVLTYMKGLRPLNRVGAAFGSYGWSGESPKIIQEWLA SMNMDMPADPVKCLFVPKHEGLSQCVALGKTVAEALKAKC >gi|316921753|gb|ADCP01000141.1| GENE 5 4839 - 4997 235 52 aa, chain - ## HITS:1 COG:alr1174 KEGG:ns NR:ns ## COG: alr1174 COG1592 # Protein_GI_number: 17228669 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Nostoc sp. 
PCC 7120 # 2 49 181 228 237 79 64.0 1e-15 MQKYVCNVCGYEYDPAAGDPDSGIAPGTAFEDLPEDWVCPVCGASKSDFSAA >gi|316921753|gb|ADCP01000141.1| GENE 6 5118 - 5501 587 127 aa, chain - ## HITS:1 COG:AF0833 KEGG:ns NR:ns ## COG: AF0833 COG2033 # Protein_GI_number: 11498439 # Func_class: C Energy production and conversion # Function: Desulfoferrodoxin # Organism: Archaeoglobus fulgidus # 1 127 1 124 125 139 56.0 1e-33 MANQFEIYKCELCGNVIEVLFAGSGGPLSCCGQPMKLMADKAIDGAKEKHVPVIEANGAG CKVKVGSAPHVMEETHWIEWIEIATADGKRYKKFLKPGDAPEADFCIPVDKVVSAREYCN LHGLWKA >gi|316921753|gb|ADCP01000141.1| GENE 7 5774 - 6706 889 310 aa, chain - ## HITS:1 COG:L0280 KEGG:ns NR:ns ## COG: L0280 COG0470 # Protein_GI_number: 15672381 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication # Organism: Lactococcus lactis # 61 183 36 160 286 80 32.0 3e-15 MAVKTTRTATEEQLPTREEAEAAFALALDARQERVRAALNKLASDPPQVLILEGGSVAER FSVALWHAARLNCPEGRPPCLHCPACLQIGANLFHDLYVLDGREGSIKIETVRELRTILG EAPRGDGKRVVILAEAQSLGVEAANALLKSLEEPRPGVCFLLLAPQRERLLPTLVSRGWV VTLAWPEAGTPSTPELFQWEEALAEFMASGQGWLDKTSGKGAVDAALARRIVLSVQKAQA ALHAGRDGGPLGRRLAILPEAGHLHVNDLLAQCQESLDYMVSPPLVLNWLATRLHIVYRH ARLRGRKPTA >gi|316921753|gb|ADCP01000141.1| GENE 8 6734 - 6913 75 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRIAGRGAAPVLSPLLKSKKKFCNSYVKRDTRLSSASEPAVWCGEGVAGGPFPVGIRFP >gi|316921753|gb|ADCP01000141.1| GENE 9 7195 - 10260 3508 1021 aa, chain - ## HITS:1 COG:no KEGG:LI0012 NR:ns ## KEGG: LI0012 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 993 1 994 1011 1206 59.0 0 MLGTLFFDKQDRELLRMINETIDHGPSQDLEHKVFDANLHPHGILELTTTHEYRMAHAVI NLLGNLEAGGAPDRLMALRILRDEVLHSARTPFRYNTGRVLIQIMKEIIRSRQDELTQLK LVHDFRKVTSGNPRLVRQFLNTYHLLEMPEEWNQLTVDHHVHDANTKGRKNATHLIMDAW IKGIRYITVVYYNYVEPAAARELLQAAEIMGVDVRIGLEFRTPFRDRFVCFVWAPRGFSD PEAFLSFLAERPMVALMNEGRKASLWMQRHVMDTLQLWNAKHAPALAEELEIPVPLLEPE AFLAYVGTGQTSFLHLAEYAHKTLLKHLVQRVKALQEEALTATSERQSQIAQLIRRMDML 
TTEVIMETWLKPERNPELPSPDVPDNAPDTPELLRMPPHVLLDWLSCLRSGYRITLQLAE LTAEDVLELLWDCQGMITHLELFNLKEWQEGHLRHLTAINDLQIAINKGSVLHLKQILRG MIRTLEASSSEKDKERCAKFRIILRNIPSLQAPYKVAPLRIRIGTDSTSHSGVRHGMGLA MPETLPHGARRQIARGKQFRPIYLPVSLSLEFRETYAEQERYSSFMRWLGPRLRRIWGFG HFGLRCTKEWRAISGITQIGEEGNVITMGGIGGELNNGLRAERTEKTRPRWLGFSRLSTP ISNTLKVLAGFIPALVTFLYTQEWWVLAWFGAPLWFLITGLRNIPQAILGGGGMWSRSLL RWNDYVSWTRVCDSLLYTGLSVPLLEYFIRVLLLEDGLGLTVKDHQFLVFSIIAGANSIY ISLHNIYRGFPKEAIIGNLFRSLLAIPVSVFYNDVLALFLPFFTTADPLLILEPGAAIIS KTASDTVAAIIEGLADWRNNRRLRYWDYETKLQRLFDCYAKLELAFPDQDILSLLSRPEE FTRLTSTEARSLQVESIINALDLMYFWLYQPCAQQTLTSILRTMTREERVIVARSQGILS RVREVSQLFVDGLLGRNFARALSFYLDSYENYILTLNKRCAGFSNGHRPLLRRRRHISDC L >gi|316921753|gb|ADCP01000141.1| GENE 10 10487 - 11257 369 256 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. AzwK-3b] # 1 244 1 243 263 146 34 1e-34 MTFLEQKIRDAHTAGRTALIPFVTAGFPTLDTFWGIIEELDANGADVIEIGVPFSDPVAD GPVVEDASRRALEQGVSLNWIMDGLKQRAGNFKAGIVLMGYLNPFLQYGLERFAADAAQA GVSGCIVPDLPLDESAPVRTTLKAHGIDLIALVGQNTSEERMREYAAVSEGYVYVVSVLG TTGARTGLPPQVEETLRRARKAFNLPLALGFGLQHPDQLAAIPADIRPDAAVFGSALLKH LDKGNTAKAFLEVWTR >gi|316921753|gb|ADCP01000141.1| GENE 11 11444 - 12616 1415 390 aa, chain - ## HITS:1 COG:aq_706 KEGG:ns NR:ns ## COG: aq_706 COG0133 # Protein_GI_number: 15606106 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Aquifex aeolicus # 2 389 7 395 397 465 59.0 1e-131 MKKGYFGEFGGCFVPELLMPPLQELEEAMRDIMPSPTFKEELDDLLRNFAGRETPLTYCP TLSKRLGFSLWLKREDLLHTGAHKVNNTLGQALLAKHMGKTALVAETGAGQHGVATAAAA ARLGLECIVFMGAEDVERQSSNVRRMKLLGAEVRPVQSGTRTLKDAINEALRYWIAAQET THYCFGTAAGPHPFPTLVRNLQSVIGRETRAQMLEKTGRLPDVVVAAVGGGSNAIGMFHP FVDDASVRIIGVEAAGTGEPGCYNSAPINLGTPGVLHGQMTMLLQTKDGQIEPSHSVSAG LDYPGVGPEHAYLSAIGRVHYGLANDAEALDAFQTLCRTEGILPALESSHALAWVLRHHG ELPAGGHVVVNLSGRGDKDMDIVEQHLHIS >gi|316921753|gb|ADCP01000141.1| GENE 12 12619 - 13251 525 210 aa, chain - ## HITS:1 COG:all5288 
KEGG:ns NR:ns ## COG: all5288 COG0135 # Protein_GI_number: 17232780 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylanthranilate isomerase # Organism: Nostoc sp. PCC 7120 # 6 197 3 195 217 103 34.0 3e-22 MNDLIIKICGMREQADLDCAAALGVDLCGFIFAPQSPRAVTPEQAAALDSGDMLRVGVFV TDDMRFIEETARAARLDRIQLHGDQPMTCADRLSRTLGAERLIRVLWPERYPDLAALEET MNRHAASTGMFLLDAGMSGGGSGKRIGSERLRGLNAPRPWLLAGGLTPENVKETVAACVP DGVDFNSGLESEPGRKDPSRMRAALTALRG >gi|316921753|gb|ADCP01000141.1| GENE 13 13467 - 14252 769 261 aa, chain - ## HITS:1 COG:aq_1787 KEGG:ns NR:ns ## COG: aq_1787 COG0134 # Protein_GI_number: 15606843 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Aquifex aeolicus # 2 259 3 255 257 131 38.0 1e-30 MLLEKFKTAKQPEIQALRQLEYINMLPLPLRIERPSFLNALCAPNGNRPHVIAEFKRASP SRGDILMDFPPEEAAKQYAEAGASCLSVLTEETYFKGTLGYLKRMSAPGLPLLRKDFIFD RLQVAATAATPASALLLIVRLTPDAALLRSLREQAEAHGIDAVVEIFDEADLALARESGA RIIQVNARDLDTLQVDRSACLRLAERFHQAGDNEAWIAASGISASDHLRQAAGAGYQAVL VGTSLMDGGAPGAALKGLLER >gi|316921753|gb|ADCP01000141.1| GENE 14 14242 - 15834 2118 530 aa, chain - ## HITS:1 COG:HI1389 KEGG:ns NR:ns ## COG: HI1389 COG0547 # Protein_GI_number: 16273299 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Haemophilus influenzae # 202 501 4 302 333 210 38.0 5e-54 MFLLIDNYDSFTYNLVQAFYALGQDPVVVKNDDPRLLELAQDPSLNMVCLSPGPGHPKDA GYCMEVLRILDPKVPVLGVCLGHQALGLAAGAEVVVGPCIMHGKASEIVHDGSGLFSGVP NPMRVGRYHSLVVRSDVDEAHAKFTVTAHGPEGEIMALRYKDRPWVGVQFHPESILTPDG LRLLGNFPKAILPAGNDANAINVILDTLASGQDLTADMASAGFSALMDGTMTPSQAGSFL MGLRMKGESPLEMAHATRAALARAVRVDGIEGSYIEVVGTGGDGRSSFNCSTGTALTLAG MGYQVVKHGNRAVSSKCGSADALEGLGVNLDVNPAEVGKILHDQNFVFLFAQRFHPCFRN IAPIRNELGIRTLFNLLGPLINPSRPTHILLGVARPNLVKLMAETLAQSSVHRAAVVCGA GGYDEITPIGPNEMIILDDGKLEPLTLDPADYGIPTCTPEDLAVSSRDEAVAVLRELLAG RGPEAMRNMLALNVGMSIYLLEGKLPLATCIAKAREALASGVGGRVLHAA >gi|316921753|gb|ADCP01000141.1| GENE 15 15815 - 17257 1753 480 aa, chain - ## 
HITS:1 COG:aq_582 KEGG:ns NR:ns ## COG: aq_582 COG0147 # Protein_GI_number: 15606032 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Aquifex aeolicus # 21 467 26 485 494 296 39.0 7e-80 MTASKTPAAPDVTLRQSARWLPADMDTPISLFLGTCHEHHGILLESAEVDGRWGRYSILA CDFLLQATCRDGLLALDTNDDRLRPFAALNGMPFTEGLRTLMRGLHLEKPEGITLPPITR ALYGYFGYGMASFFNKKLENCLPQKDADASLVLPGTVLVFDHLYNRLCQISLGEHRDVNG ARTRECGAALPFIGEDAVTATPGREGYMDAVNKIKQMLRQGEAIQVVPSCRFSTPFEGDA FTLYRRMRRCNASPYMFFMDLPGITLFGSSPEVMVRCDEGKLQLSPIAGTRKRGADDLED ARLAAELQEDPKERAEHVMLVDLGRNDLGRIARPGTVNVERLMEVERFSHVMHLTSRITA RLQPDLDALDVLAATFPAGTVSGAPKVRAMEIIAETEQLPRGPYAGCIGWLGLDDDNVSL DTGITIRSMWVKDGKLHWQAGGGIVFDSDPVAEWNEVYNKSAIMRTVLNTTGEDHVPAHR >gi|316921753|gb|ADCP01000141.1| GENE 16 17819 - 19180 1607 453 aa, chain + ## HITS:1 COG:eutB KEGG:ns NR:ns ## COG: eutB COG4303 # Protein_GI_number: 16130366 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine ammonia-lyase, large subunit # Organism: Escherichia coli K12 # 1 453 15 467 467 662 68.0 0 MKLKTTLFGETYRFDSVRDVLNKAGELRSGDVLAGVAASFMQERVAAKQVLSELTLGDLR EHPVVPYEDDAVTRIIQDAVHTPVYESIRNWSVGEFREFLLDGRTSSAAIERVRKGLTSE MAAAVSKIMSNADLIFAAKKMPVVVRSNNTVGLPGHFSSRLQPNDTRDEIPSIVAQVYEG LSYGAGDAVIGINPVTDTVENTKAMLNALWEIIERHQIPTQNCVLAHVTTQMEAIRQGAN AGMIFQSIAGSEKGLREFGVTTGLLDEAYDLARHYCQATGPNVMYFETGQGSALSADAHY GCDQVTMEARCYGLARRYQPFMVNTVVGFIGPEYLYNHQQIIRAGLEDHFMGKLHGLPMG CDCCYTNHADTDQNSNENLMILLATAGVNFIISLPMGDDIMLNYQTNSFHDIATVRQLLN LRPAPEFEQWLERHGIMENGCLTSRAGDASIFF >gi|316921753|gb|ADCP01000141.1| GENE 17 19190 - 20164 685 324 aa, chain + ## HITS:1 COG:STM2457 KEGG:ns NR:ns ## COG: STM2457 COG4302 # Protein_GI_number: 16765777 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine ammonia-lyase, small subunit # Organism: Salmonella typhimurium LT2 # 1 316 1 298 298 292 50.0 6e-79 MQEQEIHDLVASVLDALKTQQAFPAPSVGQGGRLSAMQSEQRPLTGSGAEPELLAATVDA 
DTGASLEDLGGSAFRKPCGVKEPHNPDVLQEFMRSTGARIGGGACGTRPSTTAYLRFLAD HARSKGTVFREVPEEWLRRRGMLAVQTLVEDKDTYLTRPDLGRVLSEASLQTVRERYKPV PQVLIVLSDGLSTDAVLANADEIVPPLTNGLRQAGFTVGDPLFLRYGRVKAEDRLGEAIG CDVVLMLVGERPGLGQSESMSCYGVYRPTAATLESDRSVISNIHREGTPPVEAAAVIVDL VRQMMQHKASGIALNKALNPGFSG >gi|316921753|gb|ADCP01000141.1| GENE 18 20286 - 20945 495 219 aa, chain + ## HITS:1 COG:STM2456 KEGG:ns NR:ns ## COG: STM2456 COG4816 # Protein_GI_number: 16765776 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Salmonella typhimurium LT2 # 1 218 1 218 219 196 50.0 4e-50 MKVLQPVPVRVLASRIIPSVDPGLAAAFKLEPRHRALALFTSDSDDVSYIAIDEATKKAA VDVVYASSFYGGAAHASGPYSGEFIGILAAETPGDVEEGLRIAVDHAQNRVSFLTADPEG KLIFLSHLVSSCGTYVAEQAGIPVGTALAWLIAPPLEATFGLDAALKAADVRLVKHFPPP SETNFSGGWLAGDQAACQAACQAFTEAVLGVAAGPLERI Prediction of potential genes in microbial genomes Time: Fri May 13 04:42:13 2011 Seq name: gi|316921676|gb|ADCP01000142.1| Bilophila wadsworthia 3_1_6 cont1.142, whole genome shotgun sequence Length of sequence - 94360 bp Number of predicted genes - 79, with homology - 67 Number of transcription units - 42, operones - 18 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 115 - 174 5.0 1 1 Op 1 . + CDS 372 - 1103 716 ## COG4812 Ethanolamine utilization cobalamin adenosyltransferase 2 1 Op 2 . + CDS 1107 - 2522 1283 ## COG4819 Ethanolamine utilization protein, possible chaperonin protecting lyase from inhibition 3 1 Op 3 . + CDS 2541 - 2888 254 ## + Term 3014 - 3054 -0.9 - Term 2799 - 2843 -0.8 4 2 Tu 1 . - CDS 3010 - 3438 236 ## PROTEIN SUPPORTED gi|223038821|ref|ZP_03609113.1| 30S ribosomal protein S8 - Term 3541 - 3598 19.1 5 3 Op 1 . - CDS 3629 - 3874 361 ## 6 3 Op 2 3/0.000 - CDS 3849 - 5285 360 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 7 3 Op 3 13/0.000 - CDS 5285 - 7225 373 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 8 3 Op 4 . 
- CDS 7222 - 8430 1603 ## COG0845 Membrane-fusion protein 9 4 Op 1 . - CDS 8531 - 8923 428 ## 10 4 Op 2 . - CDS 8923 - 9276 353 ## 11 4 Op 3 . - CDS 9273 - 9863 685 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 9935 - 9994 2.2 - Term 10326 - 10365 8.2 12 5 Op 1 . - CDS 10489 - 11982 1961 ## COG0531 Amino acid transporters 13 5 Op 2 . - CDS 12103 - 12504 295 ## - Prom 12586 - 12645 3.1 - Term 12536 - 12577 1.0 14 6 Tu 1 . - CDS 12660 - 14000 1064 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 - Prom 14132 - 14191 4.0 + Prom 14095 - 14154 2.6 15 7 Tu 1 . + CDS 14245 - 15471 1085 ## COG2821 Membrane-bound lytic murein transglycosylase - Term 15580 - 15630 18.0 16 8 Op 1 . - CDS 15796 - 16287 704 ## DvMF_2147 zinc resistance-associated protein 17 8 Op 2 . - CDS 16484 - 16960 673 ## DvMF_2147 zinc resistance-associated protein - Prom 17061 - 17120 5.3 18 9 Op 1 13/0.000 + CDS 17353 - 19086 1560 ## COG0642 Signal transduction histidine kinase + Prom 19154 - 19213 2.4 19 9 Op 2 . + CDS 19282 - 20649 1308 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 20 10 Tu 1 . - CDS 20650 - 21249 705 ## Ddes_2135 hypothetical protein - Prom 21292 - 21351 4.4 - Term 21320 - 21384 28.1 21 11 Tu 1 . 
- CDS 21396 - 23816 2784 ## COG1042 Acyl-CoA synthetase (NDP forming) - Prom 23940 - 23999 2.2 - TRNA 24189 - 24264 80.5 # Lys TTT 0 0 - Term 24134 - 24185 8.4 22 12 Op 1 18/0.000 - CDS 24431 - 25171 196 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 23 12 Op 2 19/0.000 - CDS 25158 - 25955 248 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 24 12 Op 3 24/0.000 - CDS 25952 - 26995 1778 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 25 12 Op 4 20/0.000 - CDS 26992 - 27885 1268 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components - Term 27966 - 28011 11.6 26 12 Op 5 2/0.000 - CDS 28027 - 29166 1813 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component - Prom 29335 - 29394 5.8 - Term 29781 - 29815 6.0 27 13 Op 1 1/0.167 - CDS 29842 - 30711 1069 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 28 13 Op 2 . - CDS 30780 - 31451 763 ## COG3382 Uncharacterized conserved protein 29 13 Op 3 . - CDS 31524 - 32294 642 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 32363 - 32422 2.0 - Term 32365 - 32393 -0.2 30 14 Tu 1 . - CDS 32468 - 33103 613 ## gi|302862367|gb|EFL85300.1| conserved hypothetical protein + Prom 33222 - 33281 2.0 31 15 Tu 1 . + CDS 33484 - 33684 92 ## + Term 33738 - 33775 -0.2 32 16 Tu 1 . + CDS 33952 - 36654 3124 ## COG0474 Cation transport ATPase - Term 36650 - 36696 6.1 33 17 Op 1 . - CDS 36826 - 38436 1940 ## COG0248 Exopolyphosphatase 34 17 Op 2 11/0.000 - CDS 38415 - 39314 909 ## COG0248 Exopolyphosphatase 35 17 Op 3 . - CDS 39341 - 41488 2450 ## COG0855 Polyphosphate kinase - Prom 41544 - 41603 5.5 36 18 Tu 1 . + CDS 41821 - 43275 1404 ## COG2199 FOG: GGDEF domain + Prom 43594 - 43653 2.3 37 19 Op 1 4/0.000 + CDS 43673 - 45706 1591 ## COG3408 Glycogen debranching enzyme 38 19 Op 2 2/0.000 + CDS 45706 - 47046 1326 ## COG0438 Glycosyltransferase 39 19 Op 3 . 
+ CDS 47048 - 48277 1401 ## COG1449 Alpha-amylase/alpha-mannosidase 40 20 Tu 1 . + CDS 48340 - 52590 3985 ## COG0058 Glucan phosphorylase + Term 52774 - 52813 -0.9 41 21 Tu 1 . + CDS 52892 - 53155 72 ## + Term 53213 - 53258 3.1 + Prom 53352 - 53411 6.4 42 22 Op 1 . + CDS 53471 - 54700 979 ## Dalk_3013 hypothetical protein 43 22 Op 2 . + CDS 54697 - 55116 207 ## Dole_1327 hypothetical protein 44 22 Op 3 6/0.000 + CDS 55141 - 55587 512 ## COG0716 Flavodoxins + Prom 55591 - 55650 5.4 45 22 Op 4 . + CDS 55741 - 56202 377 ## COG0735 Fe2+/Zn2+ uptake regulation proteins + Prom 56260 - 56319 2.4 46 23 Op 1 . + CDS 56341 - 57735 1522 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 47 23 Op 2 . + CDS 57750 - 59879 2214 ## COG1629 Outer membrane receptor proteins, mostly Fe transport + Prom 59882 - 59941 2.9 48 24 Op 1 . + CDS 60017 - 60568 610 ## Hhal_0381 hypothetical protein 49 24 Op 2 34/0.000 + CDS 60546 - 61328 778 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 50 24 Op 3 . + CDS 61328 - 62731 221 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 51 25 Op 1 . + CDS 62877 - 63410 499 ## COG3721 Putative heme iron utilization protein 52 25 Op 2 . + CDS 63407 - 63919 738 ## COG0716 Flavodoxins 53 25 Op 3 30/0.000 + CDS 64003 - 64641 766 ## COG0811 Biopolymer transport proteins 54 25 Op 4 . + CDS 64634 - 65020 339 ## COG0848 Biopolymer transport protein 55 25 Op 5 . + CDS 65061 - 65831 616 ## + Term 65846 - 65894 -0.9 56 26 Tu 1 . + CDS 66149 - 66781 407 ## COG0500 SAM-dependent methyltransferases + Term 66826 - 66858 1.3 57 27 Op 1 . - CDS 66991 - 67548 762 ## COG1321 Mn-dependent transcriptional regulator 58 27 Op 2 . - CDS 67573 - 68946 1552 ## COG0534 Na+-driven multidrug efflux pump - Prom 69134 - 69193 2.2 + Prom 69124 - 69183 3.0 59 28 Op 1 22/0.000 + CDS 69343 - 69576 374 ## COG1918 Fe2+ transport system protein A 60 28 Op 2 . 
+ CDS 69573 - 71819 2669 ## COG0370 Fe2+ transport system protein B 61 29 Tu 1 . - CDS 71795 - 71887 58 ## 62 30 Tu 1 . + CDS 71874 - 72122 332 ## Dvul_0061 hypothetical protein + Term 72258 - 72291 -0.3 63 31 Op 1 . + CDS 72304 - 74406 2074 ## COG2217 Cation transport ATPase 64 31 Op 2 . + CDS 74403 - 74846 429 ## DvMF_1416 hypothetical protein + Term 74847 - 74873 -0.6 65 31 Op 3 . + CDS 75006 - 76403 1910 ## LI0461 hypothetical protein + Term 76435 - 76496 16.1 66 32 Tu 1 . - CDS 76410 - 76568 100 ## - Prom 76804 - 76863 4.1 + Prom 76832 - 76891 1.7 67 33 Tu 1 . + CDS 76949 - 77305 113 ## gi|302863703|gb|EFL86634.1| toxin-antitoxin system, antitoxin component, Xre family + Term 77361 - 77406 3.3 + Prom 77394 - 77453 5.7 68 34 Tu 1 . + CDS 77488 - 77766 202 ## gi|302863703|gb|EFL86634.1| toxin-antitoxin system, antitoxin component, Xre family + Term 77771 - 77821 -0.4 - Term 77837 - 77879 11.6 69 35 Tu 1 . - CDS 77919 - 80624 3439 ## COG3264 Small-conductance mechanosensitive channel - Prom 80847 - 80906 4.4 + Prom 80852 - 80911 5.8 70 36 Op 1 . + CDS 80976 - 81230 88 ## 71 36 Op 2 6/0.000 + CDS 81142 - 82248 1296 ## COG0067 Glutamate synthase domain 1 72 36 Op 3 21/0.000 + CDS 82236 - 83867 1723 ## COG0069 Glutamate synthase domain 2 73 36 Op 4 . + CDS 83879 - 86182 1830 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases + Term 86385 - 86409 -1.0 - Term 86189 - 86241 15.4 74 37 Tu 1 . - CDS 86424 - 86795 192 ## - Prom 86818 - 86877 6.8 + Prom 87081 - 87140 1.6 75 38 Tu 1 . + CDS 87160 - 87357 174 ## gi|302863398|gb|EFL86329.1| conserved domain protein + Term 87420 - 87459 3.6 - Term 87402 - 87453 11.2 76 39 Tu 1 . - CDS 87538 - 89160 1865 ## COG1283 Na+/phosphate symporter + TRNA 89419 - 89508 65.9 # Ser GGA 0 0 + Prom 89433 - 89492 80.4 77 40 Tu 1 . + CDS 89603 - 91312 476 ## COG0582 Integrase + Term 91468 - 91510 1.2 + Prom 92850 - 92909 9.3 78 41 Tu 1 . 
+ CDS 93105 - 93593 -8 ## Dvul_2616 AAA ATPase + Term 93650 - 93686 0.3 79 42 Tu 1 . - CDS 94007 - 94330 111 ## bglu_1g32480 integrase, catalytic region Predicted protein(s) >gi|316921676|gb|ADCP01000142.1| GENE 1 372 - 1103 716 243 aa, chain + ## HITS:1 COG:eutT KEGG:ns NR:ns ## COG: eutT COG4812 # Protein_GI_number: 16130384 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization cobalamin adenosyltransferase # Organism: Escherichia coli K12 # 6 227 16 247 267 128 38.0 8e-30 MARCGLGQGVELHLSPEERLTPSAQELVNNRKIRILIVSGDGRVARDDGKAVHPLATEER PFAKVEKTEAQTHLNGTEMVTKEHPRIWFRGKVDSAIAAAILVQTQLQPVLAPELQGLLA DLRSWLGWAMRCEVLDEAVPVLSMGPFSREAVRRLACHPEELGLEPLCPDAAEGSIFTLL NWLRAIIRETEVAAAFAFRKQDGTLERPDILELLDGLSNAVYAFMLLVKAGDRGLGAIQA ARG >gi|316921676|gb|ADCP01000142.1| GENE 2 1107 - 2522 1283 471 aa, chain + ## HITS:1 COG:STM2459 KEGG:ns NR:ns ## COG: STM2459 COG4819 # Protein_GI_number: 16765779 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin protecting lyase from inhibition # Organism: Salmonella typhimurium LT2 # 1 469 1 465 467 432 52.0 1e-121 MDSKQILSVGIDIGTSTSQLVFSRLELRNRAAACQIPRFEITDRRIVWQSPVAFTPLTDV DTLDEAALDALIRGWYAEAGVRPEDVETGAVIVTGESLKTVNARRTVMRLADSLGDFVVA SAGPHLESVIAGRGAGAAALSEARRGTVLNIDIGGGTSNYAVFRAGRVIDTACLNMGGRL AETDGHGWITRVRKPLLPVLEELYGSRAPETLGPDDIPAIADRMAGLIWQVLADERSPLA DRLLQTPPLKMWRYDAVTLSGGVGACCAEPEADPFRFRDLGPSLARAIVRHPGFAALPLL PPSQTVRATVIGAGSWMLSLSGATVWADDALLPMRNLPVVFPWLDWHAGLTPAAVETAIG DAMRRMDLGDADRFVVGLPAGMPVAYATVCLLVEALAGFWSRRPEGQPVLVALAEDMGKV LGMELRPRLSGRPLVVLDELKLADGDYLDMGKPLYQGGVVPVTIKSLAFSG >gi|316921676|gb|ADCP01000142.1| GENE 3 2541 - 2888 254 115 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNDNGFLYNLERAQAGPVDEEALARKMAARTARVARAHESVFTEANGWSGQFPGTPGYYY FADSDEGQYVIVEAIGEDGGLEIYYSGSEIAFPPEGMDGKWKPVPVPCECLRERG >gi|316921676|gb|ADCP01000142.1| GENE 4 3010 - 3438 236 142 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223038821|ref|ZP_03609113.1| 30S ribosomal 
protein S8 [Campylobacter rectus RM3267] # 5 122 1 118 118 95 39 9e-19 MGAFMPTRDELFAYVKATFGTEPDYPWPQEPEYAVLRHADNGKWYGIVMSVPLSKLGLPG KGNVEIINVKSAQVLDMLLYGDPGILPAYHMNKKHWITILLESGFAEKELRDLLMESYRL TQRRKKRVPSEQLSSRSEPLEK >gi|316921676|gb|ADCP01000142.1| GENE 5 3629 - 3874 361 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFQVKEINKAPFEDAERVFETSHPELRYAVASQRAVGFYESLKDYNENASLWVAWFEGEK PADLAAFIQQEIEKAEASVEE >gi|316921676|gb|ADCP01000142.1| GENE 6 3849 - 5285 360 478 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 5 449 4 457 460 143 25 4e-33 MNAIRLFAALLVLGLGGCANYPMGSLKTDLLPAEDVAASYSLNTAWWKGYHDADLDRIVA LALERNVDLALSAIKVNRALYQARLLSADLVPSFSANASAAASHNIDNGNAARSYQTEFG VSYEVDLWQKLRNAANAQEWEYKATMEDREAARLALVNSVADTYFELKYLDESIKVMDAS VKRYQELLRLIEAKYEFGKVASVEPLQAQQSLLSARNSLLDLQDRQNVARQTLRDLLNIR PGENPAIGNADLMSYPTVGVDLDVPVAALSARPDIRASEARLQSAFKTLESDRASWYPTI SIGSTLGTSSSTSSKVFDLPLLAGTVRVSFPFLQWNTIRWNIKISEADFESAKLDFTQAV TSALNEVDTAYFSYANAQRSLENTLSKHQKDVRIGEYYQTRYDLGAAELKDYLDALNTAD NSMLSALEAKYRVIRYENQIYKAMGGRYERILPPSGGKATQGHTSTNGETHVPSQRNQ >gi|316921676|gb|ADCP01000142.1| GENE 7 5285 - 7225 373 646 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 254 646 7 413 413 148 28 1e-34 MKIIEAEGINKFFGTGQNRVHVLKDMSFAIEEGDFVAIIGQSGSGKSTLMNILGCLDTQS SGVCRIGGVDIGTLDSDELAELRGKRIGFIFQRYNLLSTLSATENVALPAVYSGVDKKAR LDRAKSLLADLGLPDKLENKPNELSGGQQQRVSIARALMNGGSIILADEPTGALDSKSGE NVMDILTELNRRGHTIIVVTHDHHIAGFASRVIEIKDGSIISDTRSREFVPAPERELPRS PRASFLFRKDQFVEAFKMSVQAILAHKMRSVLTMLGIIIGIASVVSVVALGRGSQEKIIA NISSMGTNTINVYPGQGFGDRKSDKVKTLTVADSDVLGKQSYIDSATPNTTAAGTLVYRN ITVNAQLSGVGEQYFDVKGLKPAEGRFFNRDDVRANHSYVVIDDNTRKKLFPNGGDPVGQ VLLFNRQPLEIIAVAEKQDSVFGPTDTLNLWAPYTTVMNKITGQRHISSVIVKVKDNVMP LMAEKNLTALLTARHGKTDFFTVNTDSIKKTIESTTGTMALLISGIALISLVVGGIGVMN IMLVSVTERTREIGLRMAIGAKQGNIMEQFLIEAVLICVIGGVLGIVVSYLIGVVFDLLV SNFAMSYSAGSMLLALVCSSAIGIVFGFMPARNASRLNPIDALSRE 
>gi|316921676|gb|ADCP01000142.1| GENE 8 7222 - 8430 1603 402 aa, chain - ## HITS:1 COG:NMA0728 KEGG:ns NR:ns ## COG: NMA0728 COG0845 # Protein_GI_number: 15793704 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Neisseria meningitidis Z2491 # 32 400 32 389 392 318 50.0 8e-87 MKKKLFLIAALAVAAAVAAWYVLRMDQGDGMAYLTEPVTIGNITRTVNATGEVGAVQLVS VGAQVGGQIKKLHVVLGQDVKKGDLIAEIDSVPQLNQLETDKARLQSYESQLAAKKVALK IAKTKYDREMQLKKRDAASKESLEDAENAYALAKAEVTELESQIRQARIAVNTDEVNLGY TRITAPLDGTIVSVPVDEGQTVNANQTTPTIVQIADLDKMEIKIEISEGDITAVKPGMPI TYTILSNPETEYKATLTSIDPGLTTLTDGSYKTTSSTGSSASSSSSSSSNNAVYYYGKAL IDNKEGPLRIGMTTQNVITVADVKDALLVPVIAVQSRNGKKFVRVLGPGDKPERREVTTG IADGIHIQILSGLKEGDAVITAQVTQQERDAQVQQRHRGPRL >gi|316921676|gb|ADCP01000142.1| GENE 9 8531 - 8923 428 130 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRTRFKSLLALGLTFAAGCFVGIVIGSSNPRWFKSFRQPPGPDRAVEHLTEVLELTPEQQ EQVREVFIRLHPEFLKESERAKAFRRAFIIKHFNEMAPLLDERQKAVAQEFLEKQLSRKM PPPPKEGKAN >gi|316921676|gb|ADCP01000142.1| GENE 10 8923 - 9276 353 117 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNNLLQSLRGVLSALIGFQSPEDGLLEKARRRWNEGRTNRPSLLDEAWEERVMFAVRLSP MDDPDVPDWLAQRFRPLALANAVIILCALMLFWHAGDPGTALAEYAQSALAGYEVFF >gi|316921676|gb|ADCP01000142.1| GENE 11 9273 - 9863 685 196 aa, chain - ## HITS:1 COG:STM2640 KEGG:ns NR:ns ## COG: STM2640 COG1595 # Protein_GI_number: 16765960 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Salmonella typhimurium LT2 # 10 195 11 191 191 75 32.0 4e-14 MIPEAGEETLQAVLNGDAEAFRSILRSYGPRVERLVSAHVLPHEVPEVAQETFVRAYRSL RSYQGQGSFERWLATIALRCCYERLRERYRPEAPFSSLARGEEDEPEGLTDQASFRLYEE ENERRDLREELESALQSLEPRDRMLLTLTFLEGYSIAETAAMMDMTPINVRVRVHRSKAK LREALDPPLESLGRQS >gi|316921676|gb|ADCP01000142.1| GENE 12 10489 - 11982 1961 497 aa, chain - ## HITS:1 COG:SA0541 KEGG:ns NR:ns ## COG: SA0541 COG0531 # Protein_GI_number: 15926262 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters 
# Organism: Staphylococcus aureus N315 # 8 476 7 474 494 338 40.0 1e-92 MDHSTESQTDRKELKKTMSPAEVWALAVGAIIGWGCFVLPGTRFLPDAGPLGTCLAFLIG GGLLCTVALCYSILIKVYPVAGGSFTYAYVGFGTRAGFICGWALVLGYLCVIAANGTALA LLSRFVLPGVFDVGYLYTVAGWDVYAGELAMMSAFFIFFGYMNFRGMDMASSLQLILAFA LAGGVLVLILGSVATETSHVDNLFPLFAEGRSPVACVLSILAITPWLFVGFDTIPQTAEE FDFPPEKSRRLMINSIVCGALLYALVLIAVAIIIPYTDLLGQNHSWTTGAVANMAFGRFG GVILAIPVMAGIFTGMNGFFMATTRLLFSMGRGKFLHPWFVKVHPKHGTPTNAVLFTLGL TLIAPWFGRSALNWIVDMSAMGTALAYLFTCMTAYKYVANFPDIPEARWGKPVAIIGGLT SISCFAMLALPGSPAAIGIESWFILLVWVALGAAFYFNRASELNAIPHEQMQYLLLGTKD RPLLFEAAQSTSAGMNK >gi|316921676|gb|ADCP01000142.1| GENE 13 12103 - 12504 295 133 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTLLSRYWIRHFTKSLWWALLPLISVCLLYPNAIAQRHPVQPCLMHCQDGASLIPAQPDQ PSAYLETKSAHPLDRTSGKDGPLLSLNASSYGFSLALLQCAAFIPLPDLLLGLSEIRGML PLALAPPSVSISL >gi|316921676|gb|ADCP01000142.1| GENE 14 12660 - 14000 1064 446 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 3 442 2 439 456 414 49 1e-115 METLIQALKAINDVIWGPGMLILLIGTGLYLTVGLRFYTFRNIFSAFSMLWKGRTSSAGD KGELTPFNALMTALAGTIGTGSIVGVATAILSGGPGALFWMWCTAMVGMATKFSEILLAV RYREVTPAGNFVGGAMYFIKNGLGEKWKWLGGLFAIFAAITCFGTGNAVQTNAISGVLWG TFGVPTWITALILFLLIASVLLGGLRRIGSVAGKIVPFMAILYTLASLLIIILNIKEVPA MFLHVIEEAFTPTAAQGGFAGATAMMAVRFGMARGIFANEAGLGSGPIAHATATSTPLRQ ATIGMLDVFITTMIVCSMTGFAILVTGEWTAVGAQGASLTARAFETALPGVGSAIVAISL VMFAFTTILGWCVYGERSAIFLFGDRVLKPFRILYTIVVPIGALAQLDLVWILADTFNAL MAFPNLIGVLLLSPVVFKTVQESTRP >gi|316921676|gb|ADCP01000142.1| GENE 15 14245 - 15471 1085 408 aa, chain + ## HITS:1 COG:PM0928 KEGG:ns NR:ns ## COG: PM0928 COG2821 # Protein_GI_number: 15602793 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-bound lytic murein transglycosylase # Organism: Pasteurella multocida # 1 404 17 373 374 130 28.0 5e-30 MLKTFVLCLAVLLAACGAKPSPEPFPAPASTASSEIRRLPHPAVEALDTPSPLFRQAPAE QARAQIARMDKASMGLRSWMELAPALERSLAYARSWNPDERAAEHSGIRITWGEVVASLE 
KLKAILPRLDAQPELLEEGFQWLSVAPEVKFTGYYSPVMQASRTRKPGYEYPIYRLPEEL APDLAWCLPTHSCPEDAFLQVIKPEDPYYSRADIDLDGALKSRNLEMAWLEHPVETYDLM LEGSGILAFDDGTQQAALFAGLNGHSGQSMAGYLIRSGELPRNKASMKGIRQWWDNHPQK RRAFLNASSGYVFFRFGAEHPKGTAGCELTPWVSMAVDPRVLPLGGIVSYALPGQRQGIR RGLGFAHDTGGAIRLRRIDIYTGEGEDAHRQAMTIYNQGQVWLLLAKQ >gi|316921676|gb|ADCP01000142.1| GENE 16 15796 - 16287 704 163 aa, chain - ## HITS:1 COG:no KEGG:DvMF_2147 NR:ns ## KEGG: DvMF_2147 # Name: not_defined # Def: zinc resistance-associated protein # Organism: D.vulgaris_Miyazaki_F # Pathway: Two-component system [PATH:dvm02020] # 1 124 1 132 179 90 41.0 2e-17 MKTRALIAAFALSAIVLGSGVSAFAYNGYHNYGNHAGMYAVSPEKQGAYDAMMKDYYSRV SPIQDKLNAKGIELDALSRNPNAKPETISKLANEVAELRSQLRAERMTLGDKMEKELGIQ PCRGMMGNGMMGNGMGYHNGGRHGGGHGGHGGHGGGYGSNCNY >gi|316921676|gb|ADCP01000142.1| GENE 17 16484 - 16960 673 158 aa, chain - ## HITS:1 COG:no KEGG:DvMF_2147 NR:ns ## KEGG: DvMF_2147 # Name: not_defined # Def: zinc resistance-associated protein # Organism: D.vulgaris_Miyazaki_F # Pathway: Two-component system [PATH:dvm02020] # 1 124 1 132 179 91 42.0 7e-18 MKTRALIAAFALSAIVLGSGVSAFAYNGYHNYGNHAGMYAVSPENQGAYDAMMKDYYSRV SPIQDKLNAKGIELDALSRNPNAKPETISKLANEVAELRSQLRAERMTLGDKMEKELGVQ PGRGMMGNGMGYHNGGRHGGGHGGHGGHGGGYGSNCNY >gi|316921676|gb|ADCP01000142.1| GENE 18 17353 - 19086 1560 577 aa, chain + ## HITS:1 COG:hydH KEGG:ns NR:ns ## COG: hydH COG0642 # Protein_GI_number: 16131833 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 315 574 177 455 465 219 44.0 1e-56 MPRWLTLFPWFVMVCALLFSAGIVVFGLHNAKRESDFTSQLTLEKGAALISALEGALRTG MGYHWSDEVLSDLLNKVGEQPDIVSLVITNRNGTVLMAADTKLIGTSFLSPEVLAELDPT RQVKWNATELPDGMPVFQVYKQFSVPQGERRRHGGGHRMMRRGLDGDSCGMLEASIAEGT SLVIFVEYDLTPLEEAQAADEKHMAVMFGILVFVGLVGCITLFLIQSYRRSRKLVQETTA FSSEILRTLPVGIISTDMDDRITSINPAAQDITGLTRTAATGRGLHDLLPGVWAVLEGRT AQDAAREQEAWCTVGERRVPLAISASHIVTEEGEAIGTALIMRDLGEIRRLQSELRRRDR LVALGNMAAGIAHEVRNPLSAIKGLARFFMEASPEGSDESRMADIMTREVLRLDKVVGDL 
LDFARPDVLNLTDVGLNELVERARDMVRSDMDARKIRFEAELPEPPLSVRLDRDRMTQVL LNLFLNAVQAMPDGGRLMVRARMEGTELALDVADTGCGIAPERLADIFSPYFTTKASGTG LGLSIVHKIVEAHDGTIEVASTPGEGTVFKLRFPFVA >gi|316921676|gb|ADCP01000142.1| GENE 19 19282 - 20649 1308 455 aa, chain + ## HITS:1 COG:STM4174 KEGG:ns NR:ns ## COG: STM4174 COG2204 # Protein_GI_number: 16767428 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 4 449 3 441 441 529 62.0 1e-150 MAEKGSHTVLVVDDDPGHRGMLEALLGRWGLAVTSAGDGKEAVARVCERPFDVVLMDIRM PDMDGITALKEIRKYNPAVPVVLMTAYSAVESAVEAMKSGAVDYLIKPLDFDQLKSTLFK TLEHSEAVRSGERFPVPVEGIVGNSPAVRRMLEMIHTVAPSEATVLINGKSGTGKELVAR AIHNQSQRKDGPWVAVNCAALTESLLESELFGHEKGAFTGADKRRDGRFLQADGGTLFLD EIGEISLLMQVKLLRAIQQREIQRVGGDATLKVDVRIVAATNRNLLDEVAAGRFREDLYY RLNVVSIQVPSLQERSEDIPLLAEHFLKVFGERNRKDVKGFTPKAMDMLIKYPWPGNVRE LENAVERAVVLLFGSYVSERELPLAVTQAYEQEDDAEPSALSVPLEGATLEDVEREAILR MLDSVGDNKSEAAKRLGISRKTLHTKLKRYGQEQD >gi|316921676|gb|ADCP01000142.1| GENE 20 20650 - 21249 705 199 aa, chain - ## HITS:1 COG:no KEGG:Ddes_2135 NR:ns ## KEGG: Ddes_2135 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 196 1 188 191 159 47.0 6e-38 MSVTSALGNQAKLLWQFLGFFQKPYLRFLHAIIAVLVILQILSSFAMHMLSSGQLNPALS SWLASWYHILSGLLVVILSLLLAADSLTKRGFHYFFPYLWGDTEQLKKDIQASLRFKMVP PRPKGLATAVQGLGLGALLLVVLSGLIWFILWRNGSSFAGLALETHKNVTLLIELYLIGH GCMALLHFFVWQRNKARQE >gi|316921676|gb|ADCP01000142.1| GENE 21 21396 - 23816 2784 806 aa, chain - ## HITS:1 COG:PAB2112 KEGG:ns NR:ns ## COG: PAB2112 COG1042 # Protein_GI_number: 14520570 # Func_class: C Energy production and conversion # Function: Acyl-CoA synthetase (NDP forming) # Organism: Pyrococcus abyssi # 301 785 8 458 472 172 28.0 2e-42 MNLTVEYQPITELFRNAHTEGRHFLYEFEVYNLLSLSGSETPPKCSFIPRNAKPMEEEIM SLPGEKAVLKIISPTIVHKTEVGGVRIVPKTPDKVRSAVRRMLSEVPERYAEWIERCPAS APESYKGLAGTALQNAIAADLKGVLQVQFMPPDSASFGNELIVGLRRTREFGMVISAGLG 
GTDTELYAERFRKGQAIVAALTELTDGDAFFELFRKTVSYRKLAGLTRGQRRIVTDDQLI ECFESFIRMGNYFSPCNPDAPFIIEELEINPFTFTDYLMVPLDGMCRFSTPGERPAPRPI QKIANLLHPKSIGIIGVSSKRRNFGRIILENIIDSGFDRDKLVIIRDGENDASGVRCVPN LRALPDPLDLFIVAIGAEQVPPLVDEIIENNAAHSVMLIPGGLGETEESREMTERMIARI TEAHKNLAAGGDGGPAFLGANCMGVISRPGKFDTWFIPAAKMPDYKQYPRRRTAIVSQSG AFLLNRFSQTPEMSPSYLISMGNQTDLTLGDMMRHFMDSQEVDVIAVYAEGFKDLDGIQF AEAVREAIRRDKQVVFYKAGRTPEGKTATSGHTASLAGDYMVCETCIRQAGAIMARNFTE FQDLILLAETFTNATIRGKRLGAVSGAGFEAVGMADSLQSDEYSMSLGTYSETTRLAMKR LIASKGLEKLVPIKNPLDINPGADDEVHAYMAELLLNDENIDAVVIGLDPLSPVTHSLAE TDIDAFRMDAPDGILTLLSDVRSRTSKPMVAVMDGGEKFEPLRKALRERGIPVFPVCDRA VAALSLYLESRLAAEGLKHASYCLLR >gi|316921676|gb|ADCP01000142.1| GENE 22 24431 - 25171 196 246 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 1 236 6 230 309 80 26 4e-14 ATMLEIKDLHVHYGGIHAVQGVSMRIPQGRIVTLIGANGAGKSSVIRSISGLVKDTEGEV LFTPAARDGSGPRTEQLLGKSPEDIIRRGISVSPEGRRILPHLTVEENLMLGAYIRNDKE GIAQDIDWVYNLFPRLRERSWQKGGTLSGGEQQMLAVGRALMSRPDLVMLDEPSLGLAPL LVREIFDIILKINAEGKTVLLVEQNALAALSIAHYAYVLEVGRVVAEAPGQELLKDPKVK EAYLGG >gi|316921676|gb|ADCP01000142.1| GENE 23 25158 - 25955 248 265 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 19 251 31 254 329 100 26 4e-20 MTSFAQNGLLLDVQHVTMRFGGITAVNDLSMRIPTGSITGLIGPNGAGKTTAFNVISGFY NPQEGDILFKGHSVKGFPPHRICRAGMARTFQNIRLFGSETALENVMVGCQVRRKSQWWM PVFSLPSARREEEEIREKAKGLIHRLALDSYMDEKASSLPYGAQRRLEIARALATSPDFL LLDEPAAGMNPQESTELMRFIRHIRDEFDLTILLIEHDMKVVMGVCQYIWVLEYGQLIAE GDPDAIRNNPRVIEAYLGEDVSNHA >gi|316921676|gb|ADCP01000142.1| GENE 24 25952 - 26995 1778 347 aa, chain - ## HITS:1 COG:TM1137 KEGG:ns NR:ns ## COG: TM1137 COG4177 # Protein_GI_number: 15643894 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Thermotoga maritima # 3 347 9 355 359 295 47.0 8e-80 
MKTSMRNLLLNLAAIGLLALFLVWAETNLDGYKVQILNLIAVNAILALSLNLIYGFTGMF SLGHAGFMAIGAYVSALCVLPAAQKEMMWILEDIIWPFSVIHTPFWFSVVAGGFVAAIFG LFIAIPVLRLGGDYLGIATLGFAEIIRVLIVNLTPITNGSLGIKGIPFHASLLVNYGWLL VTLYCMVKLLGSNFGNIFKAIRDDEIAAKVMGIDTFRTKVLSFCLGAFFAGVGGALLGNL LTTIDPKMFTFLLTFNVLMIVVAGGLGSLTGSILGSAVITVLLEWLRAVEDPITLGGFAI PGIPGMRMVVFSLVLLCIILYRREGIMGMREITWDAIFDFFSRRKRA >gi|316921676|gb|ADCP01000142.1| GENE 25 26992 - 27885 1268 297 aa, chain - ## HITS:1 COG:Cj1017c KEGG:ns NR:ns ## COG: Cj1017c COG0559 # Protein_GI_number: 15792344 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Campylobacter jejuni # 1 295 1 295 298 266 50.0 3e-71 MSLDMFFQHFFNALTLGSLYGLIAIGYTMVYGILRLINFAHGDILMLGAYFVFYGTTTFF LPWGVAVVVAILAASAVGILVDRIAYRPLRDAPRISALISAIAVSFFIESLSVVVFSGLP RPVYQPEWLMTPINFGQLKLLPITLVVPVVSFVLVLGLLWIIHRTKPGLAMRAISKDIET TRLLGVRVDRIIALAFGLGSALAAASGIMWALRYPQINPYMGIFPGFKAFIAAVLGGIGS IQGAMIGGLLLGFIEIMTVAFMPSLSGYRDAFAFVVLVLVLLVRPTGLFGQRSEEKI >gi|316921676|gb|ADCP01000142.1| GENE 26 28027 - 29166 1813 379 aa, chain - ## HITS:1 COG:TM1135 KEGG:ns NR:ns ## COG: TM1135 COG0683 # Protein_GI_number: 15643892 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Thermotoga maritima # 29 379 21 367 370 268 43.0 2e-71 MKLGTLLSAFACMTLSMGFATPAKAEDPIKIGVYMPLTGQNAFAGQLELEGIQLAHKITP TVLGRPVELVIVDNKSDKVEAANAVKRLAERDNVTAIIGTYASSLSLAGGEVAERAKVPV LATSSTNPLVTQGKKYYFRACFIDPYQGAAAAAYAYKTLGYKKAAVLTDVVSDYAVGLSN FFKKSYKKLGGELVADMKYSSGDQDFTAQLTELISKKPDIVFMPAYFAEGAIIMKQAREL GATFVLMGGDAMDNPDTVKIAGKAAEGFMQTAFPYDMTMKDMNDQAKAFTEAWKKNFPNK DPNVNATLGYTCYDMIIDALKRAGKADREAVTKALAETKGLATPVGIMTLNANHDAEIPV GILKYENGKRIYLGEIVPE >gi|316921676|gb|ADCP01000142.1| GENE 27 29842 - 30711 1069 289 aa, chain - ## HITS:1 COG:PH0070 KEGG:ns NR:ns ## COG: PH0070 COG0697 # Protein_GI_number: 14590024 # Func_class: G Carbohydrate transport and metabolism; 
E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Pyrococcus horikoshii # 8 283 1 271 275 103 29.0 5e-22 MAKDPKQMTGYLYVLAAAVLWSLIGPFSKICMEEGLAPLEVAFWRALLGGICFFAQTGIC GGARIPVKHALFFCLFGVLSISVFFSSLQISIQLSGAAMAMVLLYTAPAWVAVFSRILFH ESFSSRKGIALGLALFGTALVCFSGGSLNAEPSVLGIVCGLISGLCYASHFPFYVWWQPR YSTATLYAYMLLGGALALFPFVDFAPTRSWTAWANLLALGVVTNYGAYLAYGRSLQSINQ VKAAIIGNIEPVLATFWVWLFWNENFSAYGWAGSALVISAVFLLTADRS >gi|316921676|gb|ADCP01000142.1| GENE 28 30780 - 31451 763 223 aa, chain - ## HITS:1 COG:BH1019 KEGG:ns NR:ns ## COG: BH1019 COG3382 # Protein_GI_number: 15613582 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 55 189 54 188 224 112 40.0 7e-25 MSALNVTIDPSIKGAWPEARLGCFIYTADVRAREDSLWTCLENEVAPYLKTMLETTPLAQ IPNLAESRKAYKAFGKDPGRFRVSSESLYRRVRQGKALYQINSVVDANNLVSLETGFSLG SYDTARIGADIVFRLGKAGEVYPGIGKDDIALENMPLLADGEGAFGSPTSDSTRAMITQE TRSCLTVVFSFSARSKLEEALALTVKRFGQYANPSHAESFIIE >gi|316921676|gb|ADCP01000142.1| GENE 29 31524 - 32294 642 256 aa, chain - ## HITS:1 COG:AGc2651 KEGG:ns NR:ns ## COG: AGc2651 COG2207 # Protein_GI_number: 15888763 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 237 15 247 252 150 39.0 3e-36 MSIRIDFLSYEGERNNERHPHAQFVLPIAGELEISIGGAEGRLTPSCAAFVAPGVPHSQL ATVSNAFLIVNCDLAECGVPAAEQLAEQIFLPVSEATRHLIGFAELSRDRFSQPGTTQCW MPLLLNSFLERPVCRPSRLAVLERLIDAHPGAPWTVGEMARRIGISPSRLHALSRSEWGL SPQAWLAERRIGHVRKWLAESDLPIADLALRAGYADQSALTRAMQRLTGLTPAAYRRQTR LEQKNRSPEQDSRTRP >gi|316921676|gb|ADCP01000142.1| GENE 30 32468 - 33103 613 211 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|302862367|gb|EFL85300.1| ## NR: gi|302862367|gb|EFL85300.1| conserved hypothetical protein [Desulfovibrio sp. 
3_1_syn3] # 3 207 2 203 212 208 50.0 2e-52 MTTATDTFRQHLQETQAIHFLPYVGDGYADARPRILMIGEANHGTAPDGADRTYTDGVVR RALQDAAEGNGNHWVRYIRNISAMLTGNAYGGSNAVWDTVAYGVFFQHMETETHRNRSRA TPEEIALGRGAFFAMLDVLKPDFVIVWGLTLFKQHWLSPESGITMLAPGLCAYVFTDRPD VRIWHCHHPSRDFSHVSEHQKWEAVRKLHIR >gi|316921676|gb|ADCP01000142.1| GENE 31 33484 - 33684 92 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPFFSPYSYAGGVLRALVRFFREFQLPASLLAPFGCLRMAGAFSLRPSFSRRTLFGLRLS PHERGR >gi|316921676|gb|ADCP01000142.1| GENE 32 33952 - 36654 3124 900 aa, chain + ## HITS:1 COG:L85514 KEGG:ns NR:ns ## COG: L85514 COG0474 # Protein_GI_number: 15673239 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Lactococcus lactis # 27 900 24 910 910 1042 60.0 0 MIKKQLETEKDLNVTQEVRGLAARPASARLLEAARQTPQEVFAAYETTPQGQPRPDIMRA LFGRNELARKKADSILKRLFKAFINPFTVVLIVLAVISFITDYVIVEPEDRDLTAVLIVG IMVFISGTLRFVQEVRSGNAAERLQAMVKTTIAVLRDGESGERPISELVVGDVIRLAAGD MIPADVRIVETKDLFVSQSSLTGESEPMEKWTAAQPQTGGNPLECNNLAFMGSTVVSGSA LALVIAVGKDTLFGALARRVAETRVRTNFEKGVNAVSWVLIRFMVGMVPVVLFLNGFTKG DWVQAALFALSVAVGLTPEMLPMIVSANLAKGAVAMSRKKVIVKHLNAIQNLGAMNILCT DKTGTLTQDRIVLEYPLDVHGNVDERVLRHAFLNSYHQTGLRNLMDEAIVEHAYETNMLP LWQDYRKVDEIPFDFTRRRMSVVVADKAGKTQIITKGAVEEMLSICSYAEYKGNVEPLTS ALSEEILATVRRYNEAGLRVIAVAHKTNPMVAGAFSVADESDMVLIGYLAFLDPPKDSAA AAVAALKEYGVAVKVLTGDNDAVTRSVCGQVGLRSHSLLLGSDVEAMDDAALRAAAERTD IFAKLTPQQKARIVTCLRENGHTVGFMGDGINDAAAMKASDVGISVDSAVDIARESADII LLEKDLMVLEQGAIEGRRIYANIIKYIKMTASSNFGNMFSVLAASAFLPFLPLAPLQILV LNLIYDISCTAMPWDNVDADFLKQPKTWDASSISRFMIWFGPASSVFDITTFVLLYTYIC PLVFGGAYETLDAGMQVAFVGLFQAGWFVESLWTQTLVLHMLRTPKVPFLRSRASWQVTG LTSLGILAGTCIPFTTVGGALDMMPLPGAFFPWLFATLAAYMLLTTTLKGIFIKKYGELL >gi|316921676|gb|ADCP01000142.1| GENE 33 36826 - 38436 1940 536 aa, chain - ## HITS:1 COG:MA0083 KEGG:ns NR:ns ## COG: MA0083 COG0248 # Protein_GI_number: 20088982 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Methanosarcina acetivorans str.C2A 
# 12 536 10 530 543 394 42.0 1e-109 MRMSGELERTEKAIGIIDLGSNSVRLLLVRVHADGSTTILNQVKHMVRLGEGGFTHNRLQ EETMARTIAVLRGLAEMCAVYETTEIIAIATAAVRDAVNGPEFIRRVQEKTGIDFTVVSG REEARLIYLGVSSGLEHSKSLRLFIDIGGGSTELIVGDSQSYVNLDSLKLGCVRLTNLFF EDNKGPVSAYTYAALQRYVRNEALHSFQRIAGFEIPEAVASSGTAQNLAEIAAALEQQDA ARTGKPSTASKHVLSYDGLRRAVKELCSRTLEERKSVPGINPNRADVIIAGAAILQTIME EQGFESVTISNRNLQNGILVDYLMRTHPQKAGSRLSAREESVLQLARFCRFEEKHSRHIA KLALELFDSARAIGLHDAPPVSRELLYYAALLHDIGIFISFANHNAHSHYLIRNTELLGF TEREIELIAALAFFHRKRPSKKIPLFLQLDEGIREEVRLLSLFLTLAERMDKSHRQIIRS ARFGRTPDGELELCLRASEVCPIEIEEIGRSHKLIKKTFHTHFTLCPCLENGAPIV >gi|316921676|gb|ADCP01000142.1| GENE 34 38415 - 39314 909 299 aa, chain - ## HITS:1 COG:MA2351 KEGG:ns NR:ns ## COG: MA2351 COG0248 # Protein_GI_number: 20091186 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Methanosarcina acetivorans str.C2A # 158 293 380 516 534 72 27.0 7e-13 MTIHTEDQLQEPQAEPKATPAKRPARTAKNVPAGESQPKPRPARKKPQRAQPEPVQESAP EPAAVAELVTAAPEAYDPAPALSAPCSEPCPETCSAPAAETTELAEETLTTAVMPESPVM ADSLTHGQRVAAIAATLFQDLAELHGLDDVWGHRLHLAAQLHDIGFAEGRKGHHKISMRL IEEDLSLNIHEDDRPWVALLARYHRKAWPSRRHARFDALKKSDRKELRKAASLLRIADAL DYTHTGVVGNLAVAVKKRKVIIAVQCSGDCSAEMERVIKKGDLFMHVFGRELECVCQGN >gi|316921676|gb|ADCP01000142.1| GENE 35 39341 - 41488 2450 715 aa, chain - ## HITS:1 COG:alr3593 KEGG:ns NR:ns ## COG: alr3593 COG0855 # Protein_GI_number: 17231085 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Nostoc sp. 
PCC 7120 # 25 708 16 711 736 619 46.0 1e-177 MVSKDSEGYPEQLCATEEISELACPNLYLNREINWLDFDAKVLDEATDAGLPLLEQLKFL SIFYNNLDEFFMVRVANIYRQYRSGAVSSSPDRMTPAKQLAEIRRKVLILVSRAQEHWRK RLAPQLHDKGVRLMRYADLSEKQRKFLDGYFRNEIYPILTPQAIDPGHPFPTISNTSLNF IIQLRSRDGVTRFARLKCPNNISRFVFIPRNKEAKTYASLGFNANVRDSDIILLEDLIAE YLGALFPGNTVVNAGLFRITRNTDVEIEEDEADDLLEAVKDLVEQRRFGDVVRLEIAHGT AKELSAFLTERLGMQPFQIYRVKGPLAFSELMALYGVDRPGLKESPFYGRTPSVFQEGDI YAHIQSRDVFLFHPYDSFTPVLDFIRRSSEDPGVIAIKQTLYRVGNASPIVKALIEARRS GKQVTAVVELKARFDEERNITWAEEMEKAGVNVVYGLVGMKIHAKLCLVVRREPEGVSRY VHIGTGNYNPSTAKMYTDMGLITANPNICADVTDLFNVMTGYAHREQYRELLVSPLSMRR SLLDMIQRETALHQQNGNGEIIFKCNQLVDKHVIRALYRASMEGVRVRLQVRGICCLRPG VKGISDNIEVTSIVGRFLEHTRVYWFNANGEGRLYIGSADIMPRNLDRRIEVLVPILDPG LRACLRNDILETHLRDNQQTWRLQEDGTYCRIKPGKGEEPVNAQEIMMRRSRECV >gi|316921676|gb|ADCP01000142.1| GENE 36 41821 - 43275 1404 484 aa, chain + ## HITS:1 COG:PA0575_3 KEGG:ns NR:ns ## COG: PA0575_3 COG2199 # Protein_GI_number: 15595772 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 301 466 1 169 182 129 41.0 1e-29 MQRLIRHGERRKFDTGVLREGGNCRSVLYALGVLLVCLLGSYLFLFRLEHVRVLQEREKA RYVVDSYATKIQYVVANAFSSAYLVEALLRQGEGRISEFYSYAEELVKMHPSTININLAP RGVVTKVYPFERNKKAIGHDLFANPERSREALLARDSGKATIAGPFDLIQGGYAMAVRLP VFMGEPEGDITAQEKFWGLICVTFAFPDVLDPVKLEYLDKQNYVYQLSRIHPDTGDRIVL LRSAETLIEPVERKVHLPNADWILSVTPKGGWNWSIRYASFGIATFISILFSCVVGLAVD LAHQKKRLELMSEHDPLTGLPNRRVLFRELEKAMEAKRPFALCYMDLDDFKQVNDTYGHD CGDQLLNGFAERLQSALSTAGTLIRLGGDEFIAILYGVSERCIAGERFRRMLEAIGSEPY HLEGIDLNPAVSMGLALYPSEADSLEALMRLADADMYEHKKRRRCRNGTETSPSADGASC GAGV >gi|316921676|gb|ADCP01000142.1| GENE 37 43673 - 45706 1591 677 aa, chain + ## HITS:1 COG:MA0905 KEGG:ns NR:ns ## COG: MA0905 COG3408 # Protein_GI_number: 20089784 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen debranching enzyme # Organism: Methanosarcina acetivorans str.C2A # 15 652 37 669 680 407 37.0 1e-113 MKFRFDKPTCQNIRKSLRKEWIETNGLGDYASSSLICCNTRKYHGLLVADLESPAGRHVL 
LSSLEESLLVGGRELCFSCRKHPGVFYPRGNEHLEEMDAGEWPVFRYRFGEVTLTRELML IPGHRLLLVRYEVRGTAPDTPPVVLRVKPLLAFRNAHDLTRANPALRPETLSAPSGICLR PYEALPPLFMQVEGGFIFNAAPDWCFNVEYLVERERGFAYQEDLFQPGQFEIRLVPGKPV FLTASTEALQTEHGHNVIGEIWRREVERRMGLAASSESLEGHLAREGERFLVRRPDGGLA VTAGYHWFGSWGRDTLIALPGLTFYAGRPEEGTDVLLGLGSAVRDGLIPNVFSADGNHAY NSADASLWYVWAVQQMLKALPDRKGLIRERCWPVIKNIIEHYGGGKVPFVAPDAEGFLSV GNPGTQLTWMDAVANGRPVTPRHGQPVEISALWYNALAFADSLAKAFGEPEWRRTKQLDA MRSVFFKRYWVTDERGDYLADVWRDGGVETCVRPNQLFALSLPYPVLDEERYASVLSRVR RCLLTPYGLRTLAPSAPGYRPLYEGGPAERDEAYHQGTVWPWLLGAYGEAVLRAAWDVPG SVRELLRTIRPLFAQHLGDAGIGSVSEIFDGDPPHLPNGCIAQAWSVAECLRLLHLLKEA APGVYAEWEANARKGGN >gi|316921676|gb|ADCP01000142.1| GENE 38 45706 - 47046 1326 446 aa, chain + ## HITS:1 COG:TVN0430 KEGG:ns NR:ns ## COG: TVN0430 COG0438 # Protein_GI_number: 13541261 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermoplasma volcanium # 164 441 95 369 369 190 36.0 5e-48 MRVLMFGWEFPPYKSGGLGTACYGMGRALAKKGTEILFVLPQTPLDAPVRVGERLSLCSA SGTRVERTARNLRINEEVRELWREKLHIEHVDSLLLPYDTPESYDERRRELERLQSVKRN ADVFSSTQETILRLHGGYGPDLMSEVYRYACAAVAIARKARFDVIHVHDWMTYPAGILVR KLTGKPLVAHIHALEHDRSGDNMNRDVAGIERAGLEAADRVVAVSHYTKRLVMRQYGIPG DKIEVVHNAVSRSEADRVYAVPERCAHEKRVLFMGRVTFQKGPDYFVEAAKLVLDRIPDV RFVMAGSGDMLPNMVRRVAQLRIGSHFHFTGFLKDEEVDHMYAVSDLYVMPSVSEPFGIA PLEAMAYDVPVLISRQSGVAEVVRNAIKVDFWDIREMANKICAILSYPFLAAEEVRNCRE ELKAIRWENVADRLNAIYAQLVSGGK >gi|316921676|gb|ADCP01000142.1| GENE 39 47048 - 48277 1401 409 aa, chain + ## HITS:1 COG:MA4052 KEGG:ns NR:ns ## COG: MA4052 COG1449 # Protein_GI_number: 20092845 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-amylase/alpha-mannosidase # Organism: Methanosarcina acetivorans str.C2A # 1 388 1 387 396 407 52.0 1e-113 MSALCFYFQVHQPYRLRHYTFFDIGVDSFYEDEDANCGIMLKVARKCYLPMNALLLKLIR KYDGRFKVSFSISGTALDQFEAYAPEVIQSFRELVATGCVELLSETYNHSLAFLYSPEEF REQVALHDERIEALFGVTPRVFRNTELIYNNDLARAVEAMGYKAVLAEGADHVLGWRSPN FVYRPAGCDRLKLLLKNYRLSDDIAFRFSNHQWPEFPLTADKFSEWAHAANASGDLINLF 
MDYETFGEHQWESTGIFAFMEALPEVMLRTPGFAFITPSEAAARFEPVASLDVPHFMSWA DAERDLTAWLGNDMQNDAIESVYRLEKAVKATGDPGVLRTWRRLQTSDHFYYMSTKWFSD GDVHSYFNPYGTPYDAYINYMNVLADFRLTLDAASPLDPGTKSAVLESA >gi|316921676|gb|ADCP01000142.1| GENE 40 48340 - 52590 3985 1416 aa, chain + ## HITS:1 COG:all1272 KEGG:ns NR:ns ## COG: all1272 COG0058 # Protein_GI_number: 17228767 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Nostoc sp. PCC 7120 # 568 1416 4 853 854 650 41.0 0 MQYDMSRTTLFEVSWEVCNKVGGIYAVVSSKALEAAALFGDNYYLLGPDLGNNAEFEETD EACWQELRAITARRNLACRFGRWDIPGRPKVILVGYRDRFNQGQLLFSLWNRYGIDSISG GWDYVEPVMFSTACGEVIEAACQTLAVPASGPVLAHFHEWMCGGGLLYLKERAPKLGTVF TTHATMLGRSMAGSGFDIYKQMNQINPKMEAGAYNITAKCSMETASAREADCFTTVSRIT ADEATVFLGRSPDVVTPNGLDMRVIPDYSAERDVPVGARAKLLGAAGRLLRRELAPDTRI FIISGRYEYHNKGVDVFLDALAGVNEALRQSQTNVLALCAVMGGHSGVNPDAVGGDPSKI SDQGPYWISSHHVYNQPQDPILNACKRLGLDNRPENHVQVIFDPALLDGNDGFLNMPYEE VLAACDLGVFPSWYEPWGYTPQESAAHAVPTVTTDLSGFGLWVRDTQGQEQGVTILHRQQ TSYEGTVAALRAVLLDYAALPSAQLDERRTAVRALSGACSWDRFFPHYIQAYTQALDKAV ERGALRDAPSSASLTRVLEATMSTTPTLHAFTAVTALPEPIGRLRELAHNLWWSWHPECH QLVSALNPAEWERSGHNPVAVIEKATKARLLIVAHDQSYLRLYKSTMEAFDAYMGVSAKD FGALSPERPAAYFSTEYGLSECLPIYSGGLGVLSGDHLKSTSDLNIPLVGVGLLYRSGYF RQQIDRDGRQIAQYPENDFATLPLELVKDEGGAPLEVLLQLPGRRLHAQIWMVRVGRVKL YLLDTDTPSNTADDRKITARLYEADRDCRLRQEILLGMGGVQLMRALGIRPSVYHMNEGH SAFLILERIRLLMQERGLSFAEAGELVRGSSLFTTHTPVDAGNERFGLERMEPYFLPYAQ SIGLPWPEFVRMGRFEGSERNVFEMTVLALNYSFRANGVSALHGYVSRHMWQEGWKGVPK AEIPIRYVTNGVHVPSYTGAPMRALLDNVLGPGWQEQSPDSSIWSKIDEIPDEGLWAVRQ LQKKQLLDYLRASLPEFFKKFAIPYEKQKEMAACLTPSSLVIGFARRFAPYKRATLLFAD LDRLARIVGNAERPVIFVFSGKAHPADTQGIDMLQEVIRHMLDPRFFGKIFFIEDYNLAV SRLMVQGCDVWLNTPRRPYEACGTSGQKVPVNAGVNLSISDGWWCEGYNGENGWTIGPAV TREYLCGEQSDYDDAGFLYALLEEKIVPLYFERNFEGMPHDWLLTVRQSMQSLIARFSSN RMVREYLNDYYIPAAQRYAELRDKHNALTKRLARWKQDVNARFSSLRIEQIRIEGLKGEE LMGTEPMQVQIGVLPGGMKPEELLVQLVAGPGDGNGFTDTPDVVDMARTEESTADRLVYA CAYSPSRSGPHVYGVRVLPVTPGLASPLETRLVLWG >gi|316921676|gb|ADCP01000142.1| GENE 41 52892 - 53155 72 87 aa, 
chain + ## HITS:0 COG:no KEGG:no NR:no MRDAFIARRDYILNDVFGERLPFLDDNPPILENLSSLEKRTSPHGTFNWSICKWWRVAAS SGISSVTARAAIFSRPSLGTLPRPGPR >gi|316921676|gb|ADCP01000142.1| GENE 42 53471 - 54700 979 409 aa, chain + ## HITS:1 COG:no KEGG:Dalk_3013 NR:ns ## KEGG: Dalk_3013 # Name: not_defined # Def: hypothetical protein # Organism: D.alkenivorans # Pathway: not_defined # 18 380 22 384 407 239 38.0 1e-61 MSWLSQTADAEPALDVASPVSRAVHGIWEADELFRCPLVGMGITLAEQKHLLKKMSLPAV ELTPFEMHELFVNAAATENPLSRKMSQLLRRKYEREAAPLRLLSEEDFLAQWKEAFSAGQ YVGFLWAAATRRLSDEARRTVYGAVHMAMHGIAEEQTRIRRFIVELERKNEQQANRIRLL KEREAASRGGMEQAEEGRRALAQENAALRDERNRLKAELDAMYEASTALWEIRGPELEAE NENLREALASLTQRFEGQRELVNSLLKQGRARFEQGNLPGMAADSCGTCCGRDCDESCPS FDLCSKRVLIVGGVERMESLYREFIEGGGGVLDYHNGSLQGGTRQLERSLRRADIILCPV NCNSHGACIKVKNLAKKHNKTFYMLPNGSLSTISRLLGSESARQEGERA >gi|316921676|gb|ADCP01000142.1| GENE 43 54697 - 55116 207 139 aa, chain + ## HITS:1 COG:no KEGG:Dole_1327 NR:ns ## KEGG: Dole_1327 # Name: not_defined # Def: hypothetical protein # Organism: D.oleovorans # Pathway: not_defined # 1 94 1 94 99 67 39.0 1e-10 MSTDFQMNYHHNKDNLHVKMQGIFDGNSAHELLNLFLREYRCGGRVFVDTAGVSEVLPFG SKVLQARLCQTPVPASQLFFKGENGFNMAPSGSRVLIVPPRKKKSGCCGKCAHCTCHGEH GHGHGHGGCACRGEHNHDH >gi|316921676|gb|ADCP01000142.1| GENE 44 55141 - 55587 512 148 aa, chain + ## HITS:1 COG:MA2699 KEGG:ns NR:ns ## COG: MA2699 COG0716 # Protein_GI_number: 20091523 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 1 148 1 145 147 108 41.0 4e-24 MSNVLIVYGSTTGNTADIAEYLGGRLRAAGHVADVRDAADVSPDGLCEGRDVALFGCSAW GTDEVELQTDFEPLFEAFDRIGVKGVKTACFASGDSSFEHFCGAVDVIEARLNELGGIQI LEGLKLDGNLSSNGSDVEDWANRLLAAL >gi|316921676|gb|ADCP01000142.1| GENE 45 55741 - 56202 377 153 aa, chain + ## HITS:1 COG:Cj0400 KEGG:ns NR:ns ## COG: Cj0400 COG0735 # Protein_GI_number: 15791767 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Campylobacter jejuni # 14 145 15 148 157 111 
40.0 5e-25 MSGGQVHVDPETVFTRFLRGKGCHNTQQRRAIVSVFFQEEGHLTIDDLYGRVQRQDPTVG QSTVYRTMKLLCEAGLAKEVSFGGGVVRYEQASDWHHDHLICECCGKSIEVVDPKIEALQ ERLVRQYGFTPTQHRLYLYGVCPECLAGQSPSK >gi|316921676|gb|ADCP01000142.1| GENE 46 56341 - 57735 1522 464 aa, chain + ## HITS:1 COG:ECs4383 KEGG:ns NR:ns ## COG: ECs4383 COG0635 # Protein_GI_number: 15833637 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Escherichia coli O157:H7 # 27 454 9 428 445 340 41.0 3e-93 MTEIRLEKSLPSEQERVGQPSVPAEGLERYFAREGVDPLPYAFEGKIAVHPGLGGSGVEK QDFEATLEPLLAEPRTGKTVAYIHVPFCETHCLYCGFYNKAYREDESRIYADALIQELRL WRGRPGQDAGPVHAVYMGGGTPTALQADDLRRILKEVRAVLPLANDCEITVEGRLSNFGP DKMEACFEGGANRFSLGVQSFDTEIRQAMGRVSDRDTLIRQLQLLQSYDQAAVIVDLIYG FPMQTMERWLADIATAQSLKLDGADCYQLNVYRNTPLAKAIESGRLPAGADVPMQSAMFA AGVKAMQKGFYRRLSISHWARTSRERNLYNLYVKGRAHCLAFGPGAGGNLDGFFYLNQSD YRAWQEEVRAGRKPIAMLFRPFPQGKLFKAIAEGMEQGWLDMPGLEAAYGIPLGEVWKPV LEQWERAGLVERDGAFIVLTLAGQFWQVNLSQLLLDYLKRALEA >gi|316921676|gb|ADCP01000142.1| GENE 47 57750 - 59879 2214 709 aa, chain + ## HITS:1 COG:ECs2792 KEGG:ns NR:ns ## COG: ECs2792 COG1629 # Protein_GI_number: 15832046 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Escherichia coli O157:H7 # 6 703 15 709 715 367 34.0 1e-101 MLARALPLTALGLAAWCPAYAVEEPVKADDVYVTATRVEKELQDVPMSVSVMTGEDIKRS PARTIGELLQDVPGVEIRNSGGQGFKRISIRGENPNRVLILIDGQKLVENKSMDGTPLLI DPSNVERVEVIKGPASVLYGSEAIGGVVNIITKKGGDKPIQGEASVAYNGASNGFAESLS AFGGMNGFKYRVSGSYSDQGNLRTPDGEAPNTSFRQKEGSAFLSYDFSDKFTVGGGLDSF KGAIHSGSMEPGYENFAVDVPKWQRDKVYAFAEAKNVTPWLPRVRFDAFWQKNEKEMTND VNTDPAITKMPLVVTNNADNRNKQLGSSLQMDWAIGDNHYLITGYDISYDTLKADTRASA STSVERILATGILASAPPFAQQIAAASMAKSIPYTSTAYHEGDMLTNALYAQMESTLPAD FTLSYGVRYTWVQSEMKHAEGSKTNSKGTVPYDVGTESSSNNSRPVFNVGLMWSGIPDLT LRATFAQGFRVPSLQEKYVMSAMGGGTILPNPGLKPETSNSYEIGARYVHDGLSVDVAAF YSDADDYIYNPTIDADTDTSRYINVSSAKTHGVELAASYDFECGLTPYASATWMRRKFDY 
GTLTTWKTGVPEWSGRAGVRFKHALSETVDFNADVYGRFSSNSVEKTESETTHYNNWQTA NVAFGFEFGDEKQYTVAAEVLNLFDKRYRQDDSILEPGLHANIKVGMRF >gi|316921676|gb|ADCP01000142.1| GENE 48 60017 - 60568 610 183 aa, chain + ## HITS:1 COG:no KEGG:Hhal_0381 NR:ns ## KEGG: Hhal_0381 # Name: not_defined # Def: hypothetical protein # Organism: H.halophila # Pathway: not_defined # 8 183 24 199 199 115 43.0 7e-25 MPSQVFNRHYWSARELVVLGVFSAAAKLSTLLVALVGGGMNPVSLLAKNLIFTTLLVVML YKVRKPGTLALFVLVNMLISMLLLGGSVTLLPAMLLAAGCAEAAMACSGGMRKPWAPVLG VAVYDLTSKVFSLCVSWLFMRESPALLYVIVPIVVIGYVGSLCGLFSGVRAVRELRHAGI VRN >gi|316921676|gb|ADCP01000142.1| GENE 49 60546 - 61328 778 260 aa, chain + ## HITS:1 COG:BH0166 KEGG:ns NR:ns ## COG: BH0166 COG0619 # Protein_GI_number: 15612729 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Bacillus halodurans # 116 237 133 254 265 63 30.0 3e-10 MQALSVTEGAVWGGGLHAADTRAKMFVSLLASVATVALSGVEPQMVLFGMSLVYALGMRQ FRILLIGYAVLVGMSLLAMGCAFGMSLLIPSMPAFSAVSLVVPFLRMATMLNVILPLAFS CRVQSLLTALKSLRLPFCLYLPAAVMIRFIPTFLHDAKQVSETLRIRGWRMTPWNAFRHP VLLIRLVFTPLLFRSLKTSEELGIAAELKGLGYGEGMRPYRKLVWKASDTWLLVAACLVA AAAVLCQHAVGWAPLEGGMR >gi|316921676|gb|ADCP01000142.1| GENE 50 61328 - 62731 221 467 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 273 452 137 326 398 89 35 5e-17 MIRFERVTYTYPFQTCPAVSDISFSVRPGELVLCTGASGCGKSTLMRLANGLCPHYFQGT LEGRVLIGGEPTDARPINTIAREVGTLFQDPEQQFFALNVEDELAFAHEWQGVSPEAIAA KIDEAARAFRLGPILSSSIHELSEGQKQKVGLASILSQGPRALILDEPSANLDPEATEDL ACKLAELKARGMAILVVDHRLYWLEGIADRVLVMREGRIAERGGFDMLYDDALRGRCGLR AARVDDPRRTLPDCAEELLEPGEGGLAVSGLRFAYSGKPALFDGVDVSIPAGVTALIGDN GTGKTTLARLLTGLNAAQEGRFAIAGQPVPASKLLSRVGIVLQNADHQLHMKTVRQELEV SLSLAGGGADDVPGLLSLFALEGLAERHPQSLSGGEKQRLVIACAFAKRPDVLILDEPTS GLDGRNMRRIADALDLLAGRGACVLVITHDLELMGLSCARALRLPLS >gi|316921676|gb|ADCP01000142.1| GENE 51 62877 - 63410 499 177 aa, chain + ## 
HITS:1 COG:YPO0285 KEGG:ns NR:ns ## COG: YPO0285 COG3721 # Protein_GI_number: 16120624 # Func_class: P Inorganic ion transport and metabolism # Function: Putative heme iron utilization protein # Organism: Yersinia pestis # 16 177 18 178 181 123 39.0 1e-28 MTASCCQTTETLGESIAEALREKKPVMLASLAERFGVSELEVARALPDEARAFAGKDAFD TVWQALASWENATFIMAHLGSVIEIKGKIPEGRHGHGYFNLSGGSGLGGHLKIDDLGHIC FLSLPFMGLESHSVQFFNAAGTVLFSVYVGRENRQLIPAARESFFALREAVGTKDPS >gi|316921676|gb|ADCP01000142.1| GENE 52 63407 - 63919 738 170 aa, chain + ## HITS:1 COG:FN0772 KEGG:ns NR:ns ## COG: FN0772 COG0716 # Protein_GI_number: 19704107 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 162 1 163 169 91 34.0 8e-19 MKALVVYSSRTGNTRKIAEAIAAVLPGCELYPVESAPAPEGYDLVAVGYWVDKGMPDAQA KAYLETVRDAKVALFGTLGAWPDSDHARDCIAQGEALVNAPERRNKVLGTYLCQGKVDPK IVAMMQKMASDVHPMTPERKARLEEAAKHPDEADCLRAQEAFKGFAEQVA >gi|316921676|gb|ADCP01000142.1| GENE 53 64003 - 64641 766 212 aa, chain + ## HITS:1 COG:MTH1022 KEGG:ns NR:ns ## COG: MTH1022 COG0811 # Protein_GI_number: 15679040 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Methanothermobacter thermautotrophicus # 66 173 81 186 279 75 37.0 1e-13 MNLASIYASIGPSGVALVFVGCAGLYLALRNLLYLHFVWRDFKRDFLDIEHQEGRCMRDV CDKSDNPLIAIIRDVVKTHGGHSEDIRAEVAYLFHRNFEQVTKGLCWLRLISVVSPLLGL LGTILGMVTVFRTIAENSAPNAAQLAAGIWEALITTIMGLCVAIPMLMFYYYLMLKFKGF HIEAVEHSYRALELCGGAKREARPETRRMCHA >gi|316921676|gb|ADCP01000142.1| GENE 54 64634 - 65020 339 128 aa, chain + ## HITS:1 COG:PA2982 KEGG:ns NR:ns ## COG: PA2982 COG0848 # Protein_GI_number: 15598178 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Pseudomonas aeruginosa # 6 83 13 91 146 58 40.0 2e-09 MHDFDEDASIDLTPVIDVIFMLLIFFIMTTTFSKPVLDIVLPASETAEESSRKNAELVIS VKADGTIHYQDRQLTKEALAAVLETRPEALLNLYVDEKAPFEAFVGVVDIARLKRGGHFV ISTQPSER 
>gi|316921676|gb|ADCP01000142.1| GENE 55 65061 - 65831 616 256 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGGMTGGLVLLALAVCLMTPSVKILTPPKAPSSIMLMRVIRPEPTPEAKPVQPDAMRKL LSPESEVTIPVPPPVVERPKPPQEAPKTVAPEPVKPKPVKKAPPVKKTRPVVRQPEKPVP SSAPVGQQTEIPPIAGGTGVSSVAAAAPSAPVGGTQERPDDKSKALGAILDALNRHKRYP KQARRIGAEGTVQLLVTISADGKVSACSLGKGSGFGVLDTATERLGEKLVGLDIPSVRGG KGFQVLVPVRYSLKDA >gi|316921676|gb|ADCP01000142.1| GENE 56 66149 - 66781 407 210 aa, chain + ## HITS:1 COG:CAC0567 KEGG:ns NR:ns ## COG: CAC0567 COG0500 # Protein_GI_number: 15893857 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 13 204 12 209 209 130 33.0 2e-30 MRHVLRTFFQNARKPGGLSGRIMLWAMNWGHASLAKWGRSHVSPEPDARVLDIGCGGGAN LAQFLKLCPQGSVCGIDFSAESVATSLRKNAGAVAAGRCEVRQGDVSRLPYADASFDLVT AFETVYFWPDVSAAFAEVFRVLGPSGVFLIVNEESGDSRWCSIVDGMRTYTADALAGWLC AVGFVDVRSDVRGEGGALCLVARKGPDSVA >gi|316921676|gb|ADCP01000142.1| GENE 57 66991 - 67548 762 185 aa, chain - ## HITS:1 COG:MA4338 KEGG:ns NR:ns ## COG: MA4338 COG1321 # Protein_GI_number: 20093126 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Methanosarcina acetivorans str.C2A # 15 127 9 122 224 77 37.0 2e-14 MPEERNDARQLSPVLENYLEIIFHQEIREGAARASSIAEAAGVSRSTVTSTLKALKSMGY VDYEPYSLVHLTEAGMHIGRDIAHRHIVIQEFFQHILQLEPNVSNDVACELEHVIPPDVI RRLGQFVLYLRSREDDWKNWQEDYKEIRLEHIQSNTDRDLVRNAPPMLAMRTQMMGQDEL RKKYR >gi|316921676|gb|ADCP01000142.1| GENE 58 67573 - 68946 1552 457 aa, chain - ## HITS:1 COG:PAB1439 KEGG:ns NR:ns ## COG: PAB1439 COG0534 # Protein_GI_number: 14521613 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Pyrococcus abyssi # 12 414 17 413 468 104 23.0 3e-22 MLINAAPNSALPTARELVSLTVPQLGMMLCHLAVSMTDLWVAGRLDASVQASLGIVSQVF TLLMLITSLAGSGCLTTISQALGAGLEQRARRYAALIAGLAGLTGTVIAATSLLCLPMTL RLMHVSPEMEPVVRTFIIAYSCQLPFYYTLIMLNSVFRAYKLVRLPLIAIAVVAVGNLIG 
SAGFGLGLCELPDYGYQGIAWSTFGSALLGLACNLYAIVRHGILTRASLAPWRWARRAMP YLFRVGGPSALGALAGHMGNLVILSLLTGLPGDMVPVLAGMTLGSRVESFLTFPTAALSM TVTILSGHLIGANRPEALFRFGQRLALAAALALGFGAGLLFVFRVPVAEMLSTQPEVVRQ AAQFLAFACAGIPLKTYSMLVNGAFAGAGATRVSCKVNCVTMWALQLPLAWTLGNAFGAT GIYAAMLCANAGSAFWYARLYAGKKWLEYGMRKRQNA >gi|316921676|gb|ADCP01000142.1| GENE 59 69343 - 69576 374 77 aa, chain + ## HITS:1 COG:L192240 KEGG:ns NR:ns ## COG: L192240 COG1918 # Protein_GI_number: 15672170 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Lactococcus lactis # 4 74 78 148 152 59 45.0 2e-09 MRRIGLRQMQAGQKGSIISVDADGELGRRIRDMGLIPGTDVEIVGRAPLRDPVALRLFGV TLALRNREADYITVEVA >gi|316921676|gb|ADCP01000142.1| GENE 60 69573 - 71819 2669 748 aa, chain + ## HITS:1 COG:MA3477 KEGG:ns NR:ns ## COG: MA3477 COG0370 # Protein_GI_number: 20092288 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Methanosarcina acetivorans str.C2A # 25 744 10 664 670 468 38.0 1e-131 MSQAHADCPHCRQDGSEAKHIALKEVRIALAGNPNCGKTTAFNEYTGTHQHVGNYPGVTV DKKEGYVSIDGVEVLMVDLPGTYSLTAYSQEEIVTRAVLAPEKAEEKPMAVVDLVDTSAL ERNLFLTVQIMEMGLPVVLACNMMDEARKAGIQVDTKELGKQLGIPVVATVARSGEGLKD ALREAVAFGRRGSAPLNISYGPDLDPVLTKMERLLAEKGLLTDRYPTRWIALKVLENDSE ILRQVERMPDVAAQLRADRETVADHLRDTLQTSPEAIIADYRYGFIRGVLRSGAVRRESA KDRLELSDKLDKVLTNTLFGPLIMLGVLYAMFQLTFVIGAYPQGWVEDGFGWLGETVSAL MPDGFLKSLIVSGVIDGLGGVMSFVPLIIIMFTLVAILEDFGYMARMAYMLDRVFRAFGL HGASVMPFIIAGGIAGGCAIPGVMATRTLRSPKEKLATLLTLPYMACGAKLPVFLLLVGV FFPENPARTMFLLTLAGWAAALLMARLMRSTIIRGESTPFVMELPPYRVPTLRSVVTHCW ERTWMYLKKAGTVLVAVSILIWASMTFPTLDPEQAAPLETQIAALQEKVDALPEGDEART PLEEELAAVQATLAEESLQHSWAGRLGMAIEPLTKPVGFDWRTDVALIGGIAAKEAIVST LGTAYSLGEQDPEEATSLAERLAQDPNWNKATALSLMLFVLLYSPCFVALVVIRQEAGSW GWVAFSMIFNTVFAYGVAAAAYRICMAL >gi|316921676|gb|ADCP01000142.1| GENE 61 71795 - 71887 58 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFSFMSFSFDVEFHYQLKVEKEAYKAMQIL >gi|316921676|gb|ADCP01000142.1| GENE 62 71874 - 72122 332 82 aa, 
chain + ## HITS:1 COG:no KEGG:Dvul_0061 NR:ns ## KEGG: Dvul_0061 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 80 1 80 91 92 62.0 5e-18 MNENMKYGLFFLGGLALGAIGAVAVSKGKLDLKPMAADLLSRGMDVKDAVLAKVETVKEN MDDMVAEARHAAEQRREAKEEA >gi|316921676|gb|ADCP01000142.1| GENE 63 72304 - 74406 2074 700 aa, chain + ## HITS:1 COG:SP2101 KEGG:ns NR:ns ## COG: SP2101 COG2217 # Protein_GI_number: 15901916 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pneumoniae TIGR4 # 97 683 97 682 687 371 34.0 1e-102 MLFFIVHDVPGRVRLRARYGFSLRKAQVLADRLDAVGGIEGVRVNPRTGSVLLLYANDEA KRAAFLLLAVAASEQPSGDVAVCGENLPEPAGRPSWGPFLRYIFVRPFMPVLVRVATAVA ASVPFILKGVAALCRGRLSVDVLDAAAIGISLLRRDFKTARLLTLLLGFGEALEAWTRKK SLDTLAQSLALDVDTVWVRREGREMHVAISELSEDDLVIVRTGSAIPVDGVVAEGEASVN QASMTGEPLGVLRSPGSSVYAGTVVEEGEIAIRPTGVGDGTRLRQIVRFIEDSEALKAGV QGKAERLADAVVPFSFLLAGLVWLVTRNPARAASVLLVDYSCALKLSTPLAVLAAMKEGV GRGVLVKGGRFLESLAAADAVVFDKTGTLTESRPRVAEVIPGEGYERDDILRTAACLEEH FPHPVARAVVRKAEQEGLHHQEEHTEVEYVVAHGIASRLHGKRVLFGSRHYIHHDEGVPV DAMREDIERLAREGRSILYLAVDGKLAGLIAIEDPLRPEAAPVIRKLLGRGIRVVMLTGD DERTAAAVAERLGISEYRSQVLPTDKAEVIRSLQAEGHTVAMLGDGINDSPALSAADVGV TLSDGADLAREVADVVLTECRLDGLVTAVDLSKAAMRRIRTDFGLIMTLNTVFLASGLAG FLQPGPSALLHNLTTVGVSLNAMRPMLKAGAELQLEEVSA >gi|316921676|gb|ADCP01000142.1| GENE 64 74403 - 74846 429 147 aa, chain + ## HITS:1 COG:no KEGG:DvMF_1416 NR:ns ## KEGG: DvMF_1416 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 8 147 2 143 145 122 52.0 5e-27 MSEIESCITSFVDGRVRLRHPSLKRAEDAEQVRGFLASLPGILRVTVNSRTGSLLLEYDP DQISREDLLALAGQWADFASAQDEAAAPRKRRFDRARAIRFTNRGMLATLGASLAFGLAG RERGHVVAGGLFLLFNLAHLYTYRKAL >gi|316921676|gb|ADCP01000142.1| GENE 65 75006 - 76403 1910 465 aa, chain + ## HITS:1 COG:no KEGG:LI0461 NR:ns ## KEGG: LI0461 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 465 1 498 498 389 45.0 
1e-106 MKKVAVLLLAAGLVFGAAHKGAQAADIKVSGEWDFNTEWNNIGFAKEKADDLFHARQRLR TQVDIIASESLKGTVFFEVGDTNWGNSSEGGALGTDGKVVEVRYSYVDWVVPQTDLRVRM GLQPFSLPNFVAGDPIMGSDDSDGAGITLSYQFNDMAGMSLFWMRAENDNTTIDRGVGNA MDFVGLSVPLTGEGWQLNPWGMYGNLGKNSLNEVGENGEFLIAGLLPYGQDGVSVVAKDS NNPAWWLGIGGELTTFDPLRLAFDFAYGKADWGKAANGQDLTRQGWLLSGIAEYKLDYVT PGLIAWYGSGDNSDTMDGSERLPTLSPGWGATTLGWDGAYGISDGAVLSNTPTGTWGVVA RLADISFFENLTHTLAVGFYTGTNNTRMVTSGPVGGVESGNVYLTTKDHAWEVNFDSQYK IYENLTLAAEMGFVRLDLDKDVWGSKLSGVDKTAYKVGLNLNYAF >gi|316921676|gb|ADCP01000142.1| GENE 66 76410 - 76568 100 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEELMVEIFRRASKTIKKTNCISSFIGKQKKAGTHFLAMSPGLASFLWNNFH >gi|316921676|gb|ADCP01000142.1| GENE 67 76949 - 77305 113 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302863703|gb|EFL86634.1| ## NR: gi|302863703|gb|EFL86634.1| toxin-antitoxin system, antitoxin component, Xre family [Desulfovibrio sp. 3_1_syn3] # 1 79 1 79 103 102 69.0 8e-21 MEENPLCPFGDVLVKAKKDRDITQYRLAKLTKRSARYISMLEHNAREQQLSTGLLLARAL RMDAGELVRDVDALMPEGWRSRMTKTLSRRGPGTARRYAELQRRNEAIGNAGHSLFAV >gi|316921676|gb|ADCP01000142.1| GENE 68 77488 - 77766 202 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302863703|gb|EFL86634.1| ## NR: gi|302863703|gb|EFL86634.1| toxin-antitoxin system, antitoxin component, Xre family [Desulfovibrio sp. 
3_1_syn3] # 1 82 1 85 103 62 44.0 9e-09 MRKRPPSKFGVALVQEYRKRGLTQYSLAQKLGRSTRYLNNLEHDRNEPRFTTILLLADAI GMDPGELVNAAAALSWAALTEEEGALGEMEEG >gi|316921676|gb|ADCP01000142.1| GENE 69 77919 - 80624 3439 901 aa, chain - ## HITS:1 COG:AGc3014 KEGG:ns NR:ns ## COG: AGc3014 COG3264 # Protein_GI_number: 15888945 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 588 889 513 818 881 191 37.0 7e-48 MSLQRFFTRSTFGRFALFMMAGLLCASLSARPLLAAPDKDAAASDQKTSAKEEKAKTDKT ADASSEKAADRDALTDKEKAKLEAEEKAKQEAEEKARLEAEAKAKAEADAKAKQEAEERA RQDAEAATAKLQAEAEAAQQKSESTASDDAVWEEAFKQNQDELNAMEIEIDGLAGGLKAI INPLRKSLPGMEEAAQRLFSLAGTHKLDPIMLESIDQRGVLQASGLRQQMQPVLTAQNVV KDKIAALEQMKRSLPPLPEKGSKEKLNPDQQQTWKNISRIENKLIKMDQRLTLELSSATD LLGKIESMHTGVTGYLPDLWQTYYFSVPSRFFDPVAWEGITERWEKTLQNMNLRMSIEFP RNPSAWKGVGLRFLNVLLLGCVVIFFVNRRMVRAEKKGTMPTAARSGILRSLLWQFTGLA LVGASFGSQGELFRGLLVLGSILVIGGEMTLAWGMRCFILDKKMGLTPLWPMYLASSLGI LISYPNLPGGILSLSWIAVGLVSLYILKELHARELPGIENNLLHVHRLTVWLSMAVAAIG WPRMAILVLILVDCLAINTQFIIGLLQILNRSSDVQDSEQHAQRSVLAGLVAAFIAPIVL LLILCGMVLWVIAMPGGALLLWHYLGTGVQVGSASFNMVHLILIVSVFYITRAAITASRS FLNRVASQSYKIDKTLIPPMQTGITYGLWTLFALFTLKALGFGLENLAVIAGGLSVGIGF GMQTIVNNFLSGLILIFSRTLHEGDVIDVGNLQGVVRKISVRATTVETFDNAIIFVPNSE FVSNRLINWTRNGRIVRREVAIGVAYGTDPKLVERLLVEVAQSVPQIVRRPQPFVLFMDF ADSTLNFVLRYWSDIGTATDAASAIRHRIVEVFAEHQVEIAFPQLDVHVIPQTPSLTRAE G >gi|316921676|gb|ADCP01000142.1| GENE 70 80976 - 81230 88 84 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYKSACRRAENGPPKFFPLLRVDADDRRTHGKRGMARRHEVPPRVAGNPYEDGGSYVQNR SDQKQGAGRTGESVVFDASPTGRA >gi|316921676|gb|ADCP01000142.1| GENE 71 81142 - 82248 1296 368 aa, chain + ## HITS:1 COG:AF0952 KEGG:ns NR:ns ## COG: AF0952 COG0067 # Protein_GI_number: 11498557 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 1 # Organism: Archaeoglobus fulgidus # 119 342 125 348 378 79 27.0 2e-14 MCRIGAIKSKVPVEPAKALYLMLPQQEGHDNSGYAMVMQDLYGVFEGHKDKPLLSLACTP 
RGVRMVDAYMAEQGFEQVLEWLPDVDRRPGLDIKAMPCYVFRNYEYPEAYRGRPQAEREE LLLDTRLALRRMLEKDGQGFVYSFWPDVLTLKEIGDPRDIAAYFRLWDEEGPLRAKSIVT QCRQNTNYDIVRYAAHPFFLQGYTLCANGENTFYTKNTEYQKSLHRGYVGFESDSQNFLY TLHYVLRELKWPLKYFKHVITPLPFAEAERRGDRKVLSLIRESLAHLEINGPNTIIAVLP DGSMMTCCDSKKLRPVVVGGDGTTMAISSEVCGLNAVLPDRDASRDIYPNEREVVLINND LAVQRWNQ >gi|316921676|gb|ADCP01000142.1| GENE 72 82236 - 83867 1723 543 aa, chain + ## HITS:1 COG:MA4218_2 KEGG:ns NR:ns ## COG: MA4218_2 COG0069 # Protein_GI_number: 20093008 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Methanosarcina acetivorans str.C2A # 138 524 31 391 415 168 34.0 2e-41 MESVTTQDIGVNDLRWIIDYRPERCTMCGSCVASCTLNALEVAVMRQDLTVSRASQPEPV REHLARPVIRQKASLTGSCVGCGMCEKVCPNQAIRAVRNPDSRYALLARTNGPTKRGGRT NLNAQRTLDHIVVGRISQMTDPALDSERHTFDILAPFGRVLLPGELPLRAEGGELRLEGK TPPLHWIYPVIFSDMSIGALSTRAWEAIALATAYLNEQCGLPVRMSSGEGGMPVKLLESE RLKYMILQIASGHFGWNRIIKAMPRMKADPAGILIKIGQGAKPGDGGLLPAAKVAPHIQA IRGVPRTTLHSPPNHQGLYSIEESVQKMHLSLNAAFGFRVPVAIKCAASATSVSVYNNLL RDPYKICGGFFIDGIQGGTGAANEVSLDHTGHPVVSKLRDCYLAAVRQGLQGQIPLWAGG GIGMTGTAAADAFKMICLGANGVFLGKILIQLLGCVGNEQGRCNACSTGRCPTGICTQDP RLVHRLDVDRGAQNIVDYMLALDGELRKLMAPIGNSSLPMGRSDALVTTDRAVADKLGIQ YVC >gi|316921676|gb|ADCP01000142.1| GENE 73 83879 - 86182 1830 767 aa, chain + ## HITS:1 COG:PAB1738 KEGG:ns NR:ns ## COG: PAB1738 COG0493 # Protein_GI_number: 14521152 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Pyrococcus abyssi # 305 662 50 470 474 143 31.0 1e-33 MLTIKTLQGTHRMSTQDLLLAIEEAVGNGETSFEIEASGQHDIGGPLWNREGKALRFHVT NPGQRVGSMCLDNTEILVDGPAPADVGWLNAGGRIVVRGDAGDTAGHCAAAGVIHIGGRA GARSGSLMKHDPLYAPPELWVLKNVGSFSFEFMGGGKAVVCGYDCEGLPSVLGERPCVGM VGGIVYVRGAFSEDVADDLAVSGLESDDIAYLDAGLETFLSAVGRPELSAVLSDWSEWRK IYPLTLGEYSSRPDPMPMKAFRAKEWIQGGIFSDVCRDDFVVNATVARGLYRQRVPSWDN AACAAPCEFRCPASIPTQLRYNLLRAGKVEEAYKLVLDYTPFPGSVCGGVCPNPCMEGCT 
RGGIDEAVQIGALGRCSIDVSLPRPTGPTGKKVAVIGGGVAGLSAAWRLARKGHEVTVYE ADDRMGGKLEQVIPRARLPHEILEKELKRIEDMGVRFVIGSRVDADGFQRLRRESDAVIV ATGGHIPRVFPWPGHERIVAGIDFLKAINKGENPRVGRNVIVIGCGNAGMDAAAGAYAMG AESVTCIDVQKPAAFAHEIAHIEALGGKLLWPVMTKEITDGGLITADGTLISGDMVIITI GESPDLGYLPEGVRKFRDWVIPGADMSLLENVFAAGDVIKPGLLADAIGTGIKAAEAVDA SLRGVAYAPVEKKPVPSGQLSTAYFTRCPHGELPAANRDFDRCVSCGTCRDCRMCLESCP EGAISRETLAGGAYRYVSDPDRCIGCGICSGVCPCGIWTMHPNVPLE >gi|316921676|gb|ADCP01000142.1| GENE 74 86424 - 86795 192 123 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTLLLFRVNAKGYVIDAENGMLEFPGGGIKVQNWTSHFNPLYLFQEFMRHKIALAEIREI HPPLNMRQTVYGNGRSRASVPSRIELNGDFGAVSLSFRSKEKRDELYLALVRIHHIGGLI PTG >gi|316921676|gb|ADCP01000142.1| GENE 75 87160 - 87357 174 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302863398|gb|EFL86329.1| ## NR: gi|302863398|gb|EFL86329.1| conserved domain protein [Desulfovibrio sp. 3_1_syn3] # 1 57 24 80 89 65 54.0 1e-09 MTKTALADFSDLQDCYVRGIIKGKRNPTITAIYSICEALGLSPLEFFKRVTDELNELNKP RKNLQ >gi|316921676|gb|ADCP01000142.1| GENE 76 87538 - 89160 1865 540 aa, chain - ## HITS:1 COG:BH1407 KEGG:ns NR:ns ## COG: BH1407 COG1283 # Protein_GI_number: 15613970 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Bacillus halodurans # 6 529 9 539 543 327 38.0 3e-89 MSVQSIIQLLGGVGLFLFAIKLISEALQLIAGDRLRQLIGTLTKTPIMGVLVGACVTVLI QSSSATTVMTVSFVDAGLMTLTQAIGVIMGANIGTTITGQILAIKVQDYAYLFIIIGVLL SFFGRSKVQKYAGNGLLGFGLLFVGMQTMESSMSFLRNEKELFLMFSHNPLMGVLAGTLL TLLVQSSAATVGLTIALGVQGLLPLHAAIPIILGDNIGTTITAVLASIGTDRTAKQACAA HVLFNVIGVCIFLTILPLYQELIAMTATGIAHQIANAHTLFNVFNTIIFLPFVKPFAALI RRLLPDKAHKVVEGAQYLDPKLIEATPGIAVEAVKNECAYMGFLVIHLLDSVQEVFFNDK KDLIPKIEETEAKIDQLHKAVKAYAEDIMQAGISDDAARVLTLYVASSGNIERIGDYGKK LLEYYAYRQNRPKDFSQQAMTELSGMYAEAHHAIVTALDGFINDDPDKAREVAPIAGKLR GLEVELRNRHIQRLDKQECDSETGLVYVDILGIIEHIGYHSNNVAKATISSCTAEHKAKA >gi|316921676|gb|ADCP01000142.1| GENE 77 89603 - 91312 476 569 aa, chain + ## HITS:1 COG:SMc02297 KEGG:ns NR:ns ## COG: SMc02297 COG0582 # Protein_GI_number: 15964353 # 
Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Sinorhizobium meliloti # 186 558 146 494 513 69 25.0 1e-11 MPSVGVSSHSRYRSYLTTHRGILYFRILVPNHLQSLIGKREIRRSLNGLDSRTARTKALR LSLAAQHFFALADDLAHERIHSALEKGIQSFGITKENIKSVGTFLFDSTLKKDLSPASLT LLLPSFLREQEGIASSTKIFSEKIEAPEANGAGMEDFCFLKEKTPSVESTEPVRRRIKLE EKGRLPNLREAFDAYVKAKTLTWSAASAKDIPPQVRQFVEIVRELEHGRDIRVDELSREH IRSYFDTLKHLPCRLCGQRQFTGKGWLQLADMGRSGQIERLLSVKTMEVRQTNVRSFVNW CELEYRGAVQAKYVNSGFPKVLSDKDIRRKGVKREAFTQDELKALFGDMGKYVQATEGVP SRFWAPLIALYSGMRLEEICQLHLSDIVKVDGVLCFSINEESGGSGYVKHVKSSAGIRKV PVHLHLWDELGLEKFVASRWAKTSKEKYASILLFPDLQERVNAVNHATVKLGSALTHWFT RYRRSVGVGGQHGEPSTKAFHSFRHTVIEYLHKEARVDLSMLQAVVGHEMVDMGVTENYA GDWAVKTLLTDIIQKLNWLSFFREIALRG >gi|316921676|gb|ADCP01000142.1| GENE 78 93105 - 93593 -8 162 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2616 NR:ns ## KEGG: Dvul_2616 # Name: not_defined # Def: AAA ATPase # Organism: D.vulgaris_DP4 # Pathway: not_defined # 2 160 516 672 674 85 36.0 5e-16 MQNIFQLEKGQDLPELRSIAAVQTALEEAGPTLKMTLTKCKVFPGLEKKFGVYHLPEGGR WELVAGESFSETMIIDDSSSAAEMTAETLEFNAAEGNSLELTPDEERVHNSLKKLGKWVK RSELEKHCGFKADKLRLLLKSLIEKKFVTKTGQGKSTHYQAS >gi|316921676|gb|ADCP01000142.1| GENE 79 94007 - 94330 111 107 aa, chain - ## HITS:1 COG:no KEGG:bglu_1g32480 NR:ns ## KEGG: bglu_1g32480 # Name: not_defined # Def: integrase, catalytic region # Organism: B.glumae # Pathway: not_defined # 2 107 241 346 348 186 79.0 2e-46 MAKVEAHDYELYLGVNGIEHTKTKARHPQTNGICERFHKTILNEFYQVAFRRKLYQSLEE LQADLDTWIDSYNTQRTHQGKMCCGRTPMQTLLDGKSLWAEKVGQLN Prediction of potential genes in microbial genomes Time: Fri May 13 04:45:46 2011 Seq name: gi|316921643|gb|ADCP01000143.1| Bilophila wadsworthia 3_1_6 cont1.143, whole genome shotgun sequence Length of sequence - 36622 bp Number of predicted genes - 44, with homology - 30 Number of transcription units - 29, operones - 8 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 352 169 ## Nmul_A1248 integrase catalytic subunit - Prom 395 - 454 2.4 2 2 Tu 1 . - CDS 491 - 625 59 ## - Prom 802 - 861 3.8 3 3 Tu 1 . - CDS 951 - 1409 69 ## - Prom 1495 - 1554 1.9 4 4 Tu 1 . - CDS 1779 - 2408 109 ## Dvul_2576 hypothetical protein - Prom 2447 - 2506 5.7 - Term 2766 - 2807 2.0 5 5 Tu 1 . - CDS 2912 - 3412 88 ## Dde_2218 Lon-A peptidase (EC:3.4.21.53) + Prom 3203 - 3262 1.6 6 6 Tu 1 . + CDS 3308 - 3544 162 ## 7 7 Tu 1 . - CDS 3559 - 4212 413 ## HM1_0635 hypothetical protein - Prom 4331 - 4390 2.8 - Term 4342 - 4383 5.2 8 8 Tu 1 . - CDS 4392 - 4850 -180 ## Daro_2247 helix-hairpin-helix DNA-binding motif-containing protein - Prom 4917 - 4976 3.2 + Prom 4758 - 4817 5.5 9 9 Tu 1 . + CDS 4907 - 5242 96 ## gi|212703136|ref|ZP_03311264.1| hypothetical protein DESPIG_01175 + Term 5248 - 5297 1.8 10 10 Tu 1 . - CDS 5420 - 5716 70 ## - Prom 5926 - 5985 10.2 + Prom 5573 - 5632 4.4 11 11 Op 1 . + CDS 5681 - 5941 78 ## 12 11 Op 2 . + CDS 5992 - 6222 95 ## 13 11 Op 3 . + CDS 6295 - 6429 106 ## + Term 6542 - 6601 22.5 + Prom 7211 - 7270 6.0 14 12 Tu 1 . + CDS 7296 - 11222 2155 ## COG1002 Type II restriction enzyme, methylase subunits + Term 11431 - 11459 -0.1 15 13 Tu 1 . + CDS 11573 - 12070 169 ## gi|302343546|ref|YP_003808075.1| protein of unknown function DUF362 + Term 12120 - 12163 1.3 + Prom 12084 - 12143 9.4 16 14 Op 1 . + CDS 12171 - 12866 216 ## DVU2032 ERF family protein + Term 12873 - 12907 4.2 17 14 Op 2 . + CDS 12930 - 14099 362 ## LI0183 hypothetical protein 18 14 Op 3 . + CDS 14168 - 14377 176 ## gi|302864079|gb|EFL87010.1| hypothetical protein HMPREF0326_00784 + Term 14527 - 14560 2.5 + Prom 14756 - 14815 3.9 19 15 Op 1 . + CDS 14935 - 15924 708 ## COG0714 MoxR-like ATPases 20 15 Op 2 . + CDS 15978 - 16247 111 ## 21 15 Op 3 . + CDS 16258 - 17232 988 ## LI0186 hypothetical protein 22 15 Op 4 . + CDS 17237 - 17443 107 ## DVU2043 hypothetical protein + Prom 18016 - 18075 2.6 23 16 Tu 1 . 
+ CDS 18251 - 19999 384 ## COG3344 Retron-type reverse transcriptase 24 17 Tu 1 . + CDS 20480 - 21517 416 ## Dde_2920 von Willebrand factor, type A + Term 21591 - 21629 0.3 25 18 Tu 1 . + CDS 21657 - 21725 60 ## + Term 21741 - 21771 1.9 + Prom 21807 - 21866 1.9 26 19 Op 1 . + CDS 21893 - 22138 185 ## gi|301796091|emb|CBL44295.1| hypothetical protein 27 19 Op 2 . + CDS 22135 - 22512 196 ## COG3654 Prophage maintenance system killer protein 28 19 Op 3 . + CDS 22517 - 23449 377 ## SAR11_1169 hypothetical protein + Term 23509 - 23548 0.2 + Prom 23646 - 23705 3.4 29 20 Tu 1 . + CDS 23805 - 24122 180 ## COG3177 Uncharacterized conserved protein + Term 24138 - 24191 17.0 + Prom 24153 - 24212 8.4 30 21 Tu 1 . + CDS 24359 - 25177 727 ## COG3645 Uncharacterized phage-encoded protein + Term 25282 - 25332 14.0 - Term 25205 - 25243 7.5 31 22 Tu 1 . - CDS 25322 - 25405 63 ## - Prom 25580 - 25639 2.3 32 23 Tu 1 . + CDS 25609 - 26304 659 ## COG3617 Prophage antirepressor + Term 26310 - 26348 7.3 - Term 26298 - 26336 7.3 33 24 Op 1 . - CDS 26342 - 26575 91 ## 34 24 Op 2 . - CDS 26572 - 27351 392 ## ECO111_p2-013 putative head processing protein - Prom 27469 - 27528 5.2 + Prom 27247 - 27306 1.7 35 25 Op 1 . + CDS 27502 - 27702 259 ## 36 25 Op 2 . + CDS 27798 - 28052 82 ## Emin_0938 hypothetical protein + Term 28094 - 28119 -0.8 - Term 28240 - 28272 6.3 37 26 Tu 1 . - CDS 28376 - 29686 989 ## COG1757 Na+/H+ antiporter - Term 29760 - 29806 14.5 38 27 Op 1 . - CDS 29812 - 30567 741 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 39 27 Op 2 . - CDS 30587 - 31759 977 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 31779 - 31838 8.1 40 28 Op 1 1/0.000 - CDS 31863 - 32474 683 ## COG1802 Transcriptional regulators 41 28 Op 2 1/0.000 - CDS 32542 - 33489 1104 ## COG3181 Uncharacterized protein conserved in bacteria 42 28 Op 3 . - CDS 33526 - 35055 1318 ## COG3333 Uncharacterized protein conserved in bacteria 43 28 Op 4 . 
- CDS 35096 - 35551 551 ## - Prom 35659 - 35718 5.5 44 29 Tu 1 . + CDS 36344 - 36620 279 ## Predicted protein(s) >gi|316921643|gb|ADCP01000143.1| GENE 1 1 - 352 169 117 aa, chain - ## HITS:1 COG:no KEGG:Nmul_A1248 NR:ns ## KEGG: Nmul_A1248 # Name: not_defined # Def: integrase catalytic subunit # Organism: N.multiformis # Pathway: not_defined # 1 116 1 116 347 187 72.0 1e-46 MESFNQNVIKHKTGLLNLAAELGNISKACKMMGFSRDTFYRYQAARDAGGVEALFEVSRR KPNLKNRVEEAIEVAVTAFAVDFPAYGQTRASNELRKQGIFVSPSGVRSIWMRHDLA >gi|316921643|gb|ADCP01000143.1| GENE 2 491 - 625 59 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLKEGLDPFRSIFDLPIPDSSAFLIQTEKPNWTVRTDRLFVAQ >gi|316921643|gb|ADCP01000143.1| GENE 3 951 - 1409 69 152 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLKISDAINVQVLQLPNDKNGRTAIVSANIIMTDGRVISDVCSESPRSPFENEDTLELAK NKIINSIKQKAEPYIKNHQQPSRNAIQDDKSRFKGGGDKPASRSQLSLIRNKAQEQGKNP EELAASRFGKRLQDLKGWEADSLIKELLTKTR >gi|316921643|gb|ADCP01000143.1| GENE 4 1779 - 2408 109 209 aa, chain - ## HITS:1 COG:no KEGG:Dvul_2576 NR:ns ## KEGG: Dvul_2576 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 29 191 24 181 198 86 34.0 5e-16 MMTKVTFEDVFDGLQILPPRDDTEGKFIDILEKQKKLMEFALANHSKVMQIRFDLHYPSD GFILPSPEHISDFSYNLSRRLKRMVIATHRVDPLYLWVREIHDSSFPHYHFLLLINANAI KNKFTIFNIANETWKNTLKINEDGLVHYCLNKRRDPEYDNGIVIDKNKEDFSIKRDVAFR SGSYLSKTFSKDGLDKHAWRYGYSRLPRA >gi|316921643|gb|ADCP01000143.1| GENE 5 2912 - 3412 88 166 aa, chain - ## HITS:1 COG:no KEGG:Dde_2218 NR:ns ## KEGG: Dde_2218 # Name: not_defined # Def: Lon-A peptidase (EC:3.4.21.53) # Organism: D.desulfuricans # Pathway: not_defined # 1 163 60 214 819 85 36.0 9e-16 MVTQRNPVVLDIRSRSELFEVGTVAKILEVVEGPQPDTLRVLFEGLYRARFIPYGGCDLK HVSRKVTSIADVYPFEERSHPVSEQRISEFLSALNAYIVKSEKPAPKVIERIINREITFS QASPGIMADTVMQYIRVDYRKKQELLELADAVERMDAVYELLQDNC >gi|316921643|gb|ADCP01000143.1| GENE 6 3308 - 3544 162 78 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRPLYHFQYLGDGTNFKKLTSRTDIQYNGITLRHHKNISVMIADRSSYCLDGKFTANEKW YDPFGKQNNFPQRHNGDF 
>gi|316921643|gb|ADCP01000143.1| GENE 7 3559 - 4212 413 217 aa, chain - ## HITS:1 COG:no KEGG:HM1_0635 NR:ns ## KEGG: HM1_0635 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 4 189 11 198 204 202 55.0 8e-51 MATRLPAIKVTRKSGKECFQGMEGERTLKDFWAWAFSDLVSNTERGKLAEYIVATAMGCD EGTSPTWGSFDLLSPEGIKIEVKASAYIQSWEQKGFSQIGFSIAESLYWDGVDYAKEKKR QADVYVFCVLKHKEQDTINPLDLEQWDFYPISTAILNKAVKGQKTICLARVAELCGSGPF TYQMLKNAVIEVWQSGVAGDTIKQKSLQPNSGDGIEK >gi|316921643|gb|ADCP01000143.1| GENE 8 4392 - 4850 -180 152 aa, chain - ## HITS:1 COG:no KEGG:Daro_2247 NR:ns ## KEGG: Daro_2247 # Name: not_defined # Def: helix-hairpin-helix DNA-binding motif-containing protein # Organism: D.aromatica # Pathway: not_defined # 37 113 3 81 119 62 40.0 5e-09 MYSRIMIKMSTYRHFKLFNRTKQRPYRFLLKKGVENVPLEKTQKLAAIVGANITARRKLK GWTQAEFAEKMGMGPDSLSRIERGTVAPRFPRLEEMARLLECSPADLFRSPDEILQEISG KTQIKVQAIPLAPKEEVIRLAEKIILLMQLER >gi|316921643|gb|ADCP01000143.1| GENE 9 4907 - 5242 96 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212703136|ref|ZP_03311264.1| ## NR: gi|212703136|ref|ZP_03311264.1| hypothetical protein DESPIG_01175 [Desulfovibrio piger ATCC 29098] # 10 107 1 102 106 81 52.0 2e-14 MSESNERNPMQILYVILCLLFAIFCALIKYKGYSVGLFARAGIFILCFIGSYIGALIGDF LRRLALPDSFFTSGGMIDILKTKLFWAIGPQVVGIFIGGIVGTSIILSFCK >gi|316921643|gb|ADCP01000143.1| GENE 10 5420 - 5716 70 98 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSIATITLTCAHIVFYYISFYICFIFIIFNNFSFCIHFRTSTFISKINPELLYSVGNFNI ISFFTKLLFYYVWIGKHDNLFIRIKPERCIAKIFTLCS >gi|316921643|gb|ADCP01000143.1| GENE 11 5681 - 5941 78 86 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCTGEGDGCYAQCQETTNMNFITLKSDGTGTMDGYRANDFPFTWESIKGKFIIFHYDGHD EEYIIDTTKEHGTLLIGKTECLKKIR >gi|316921643|gb|ADCP01000143.1| GENE 12 5992 - 6222 95 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNFISGWFGIGIILLICYTPVYIFFFNSAEDAVKEMTLADIIYEDIRLTTDSNNMIKEK KKLLMNLQKNMRNVQE >gi|316921643|gb|ADCP01000143.1| GENE 13 6295 - 6429 106 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MNLNTTQNMSLNLSTKTIQLNYLLLLKSHLFLISKQQKLMHYYH >gi|316921643|gb|ADCP01000143.1| GENE 14 7296 - 11222 2155 1308 aa, chain + ## HITS:1 COG:Cj0690c KEGG:ns NR:ns ## COG: Cj0690c COG1002 # Protein_GI_number: 15792039 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Campylobacter jejuni # 1 1303 1 1247 1250 867 41.0 0 MSSNKPFASYVVNNLYDNNFFTEDLPGLLADAGSAEESDKWFRVVKRIFADNPPLQLNES QLESGIIKPVLGVLGWFVLPQDTKVIQGKNIRPDWTLFASKKDKDKYLSIPPHQRRDVVE GIITFAEIKSADKELDTRKASRKDNPYLQLVEYLLLTRIPFGFLTNGVEWWLVDNEKISA EKRFLRVDLAQIIADDNADAFHCFYYLFHRTTFVPSKPEEKSVFITISQTDAERRNASEE DLRKVIYGADGTESLFEVTGRALFAAAGKKADPAVLRQVYENSLYFVFRLLFIAYFEDRH WGLLKKHSHYPDLSLRKLDEDLQLAAPDSFTGWNHLQTLFRTLNVGNLNLSIPLLNGGLF DDARAALLAKPKVMDNATLRKVLNALFVFGTTEIPLRRDFKALSVTHLGTIYESLLEFEF RITPEDLMYVVYKETKGKDKGKLSEGFFDVYDTGVLEKNKDVLILNKRPFPKNTLYLVGS QNSRKASASFYTPASLSLPLVRRAIDHQIACLPKDKSVLDLRILDNACGSGHLLIESLNY LTSRALDRMEDDALLATTLSDETQRISEAMDSLGLQEDMKPDEFMILKRILLKKVLYGVD IQVFAIELAHLSLWIETFVFGTPLSLIEHHVKVGNALIGTEIKTFQDAIGKEGRQLSMLN LTVKDHFEKLYTVYKKLNAIQDTTADDIAASKKLFKQEIGPALKEMNLLLDLCNYRDMLM AEGREAEAGKVRVWEQAPALLKGQLPELQATIDTYRAKYGFFNWEIEFPEAFADANGKRG FHIIVGNPPWDKTKFEDPMFFSQYRSNYRNLPNSKKKELQDDLLSKPDIRQRYESQRAHT LAVNEYYKLFYPLNKGAGDGNLFRFFVERNLRLLTPRGTLNYVLPTALLTEDGSATLRKA IFEDYSIVAFDGFENNKGIFPDVHRSYKFGLLQIERVRNPEQKARVRFMLTDPAMLESEK GIFEYGLDDIRATSPEFMAYMEVKNGRADLELLTRIYKKFPQLNSDWLDFRNELHSTADK SIFLEIKNDESLPLYTGSMFWQYNTCFEKPMYWLNEEQFDSHLKDKEISRIIDSCCFYLP HQEGKTKEKTVLSALGLHKRKELAQFIVPERCYFRIGYRKVARDTDERTMICSVLPKNVG AQDSIYLTIPKKYIFDMESKSIYVLETPIDRIFFAQALLNSLSFDWVLRFSIAINVNKTY LMRQPMPQPTDEELAENPVYREMILNSLKLSLHYNPEGFFDLKMLYGLKDTDIPTTSKQV DMLKIRNDVLVAGIYGVTKPEMEHMLKGFNVLARKKPEYVKALLDAME >gi|316921643|gb|ADCP01000143.1| GENE 15 11573 - 12070 169 165 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302343546|ref|YP_003808075.1| ## NR: gi|302343546|ref|YP_003808075.1| protein of unknown function DUF362 [Desulfarculus baarsii DSM 2075] # 1 155 78 232 309 62 29.0 8e-09 
MRYLPPIGCGRNTDPQIVQELILQSLNAGAEVVRVLDTPWATPERCAKASGLRRVCDAFD ANMLVWPTRTSEYVFCPLPGTSFSGLHIVKAALEADAIIFAPHALPDVDCSESQFMLHML MGLVKERYSLSIAQQHEDVTRHILSLLSTKLLVIDKQMETVCPSN >gi|316921643|gb|ADCP01000143.1| GENE 16 12171 - 12866 216 231 aa, chain + ## HITS:1 COG:no KEGG:DVU2032 NR:ns ## KEGG: DVU2032 # Name: not_defined # Def: ERF family protein # Organism: D.vulgaris # Pathway: not_defined # 3 230 5 236 237 244 55.0 2e-63 MQEYNSAEISEIAKALINVQRQLQPATKDANNPFTKSKYATLNSVMDSCRDALLSNGIWL CQYPVPAEPGYLGLVTKLTHAESGQWQSSLAVVPLPKADPQGVGISMTYMRRYALSAMLG IVTEEDTDGELPQNRSKTVIPQKTPIKGSQIKKDVLQRKTQADSDSSALNRTSEGLLKLP DLEGITYQVVPAQNGQQCIVARGDTVPQKDSLMAAGFKWNPQRKIWWKYAA >gi|316921643|gb|ADCP01000143.1| GENE 17 12930 - 14099 362 389 aa, chain + ## HITS:1 COG:no KEGG:LI0183 NR:ns ## KEGG: LI0183 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 10 385 7 385 400 285 39.0 1e-75 MDGANKNVGAEGLLKILRCGLERSAEQRTAIQLGDRSQYVGMSDIAKMLDCPRAALAGKL YIPEYRSTDEALKRKITFHRGHWFERGVHQALIGYGLSPLSQLEIEIRYGDVPIKAHLDF ILVTLQPQPTVRILEVKSTARLPATLSESYAMQIGGQTALLKTYWNHPVFSIIQDTGEVL YHRTLPEICKELLDVSLPDDASACDIQGWVLCLSMCDAKAFGPFLPEDMDVAQCLDMASE FWETMNDLKENRLNLNAIRTAQGLAPLCPSCFWKEDCPHFKGSSHPEWEDTLVQFIKLKT QKKSIEEEIGELESRLKVAYQLSHTVRGEWINTGNHTFRVIPQNGRVTLDRKRLHEELET LLGEQDAQMFMAKCEKQGEPFERLYAVEV >gi|316921643|gb|ADCP01000143.1| GENE 18 14168 - 14377 176 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302864079|gb|EFL87010.1| ## NR: gi|302864079|gb|EFL87010.1| hypothetical protein HMPREF0326_00784 [Desulfovibrio sp. 
3_1_syn3] # 1 49 30 78 89 66 61.0 6e-10 MPLSLEEMTAVAIRNQHLILAEELKKGVPLNYRDELGRNILEYPDGHIEITQDVIPPYIQ NQMKPCNHD >gi|316921643|gb|ADCP01000143.1| GENE 19 14935 - 15924 708 329 aa, chain + ## HITS:1 COG:YPMT1.88 KEGG:ns NR:ns ## COG: YPMT1.88 COG0714 # Protein_GI_number: 16082881 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Yersinia pestis # 68 252 170 355 411 115 38.0 1e-25 MESQNIMELEALTPTEMDAGLVFSGVPTGNTIWGFQNPCLYTPAIDPHYSFHESARDAAV WFMNKLDPLYLYGPTGSGKTSLIKQLAARLNYPIFEVTGHGRLEFADLCGHISLHEGSMR YEYGPLSLAMRYGGLFLINEVDLLSPEVAVGLNGVLEGAPLCLPENGGEVVHPHEMFRIA CTGNTNGGGDDTGLYQGTGRMNLAWLDRFMLCEVNYPDAVVEKALLVKLHPALPEHIVVK MVDFANEIRKQFIGASDSCTDTIEVTLSTRTLLRWADLTLRFQPLAHQGIQPLSYALDRA LGFRASRPTRAMLHELLQRMFPMDCHLGE >gi|316921643|gb|ADCP01000143.1| GENE 20 15978 - 16247 111 89 aa, chain + ## HITS:0 COG:no KEGG:no NR:no METDSGYLDTLPQLPQVILKNEASSRFWRGEAFSHALELLRGSSGRSGRSLTIPADDCVD GNPVQELLERSLQKLAEGYHIELLTPLTN >gi|316921643|gb|ADCP01000143.1| GENE 21 16258 - 17232 988 324 aa, chain + ## HITS:1 COG:no KEGG:LI0186 NR:ns ## KEGG: LI0186 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 1 324 5 331 331 363 56.0 3e-99 MDTPILSDIKVLDNLLALNLNVNLWSARKKMVLEDFGGAELPPEDLASLGSKRIADPNSL KVFVTLKARAFNYLDRHGIRFMSGWAIPEDKAGDIIRELIGIRDEFLKEKNAFLADYDQS IENWINKHTKWASIIRESTVGSEYVRSRMGFSWQLYRVAPLMEHAVPEAVAESGLNEEVE NLAKTLFGEIARSADETWNRVYAGKTEVTHKALSPLRTLQQKLSGLTFINPHVSPVVDII QMAFSRIPKKGNITGADLVMLQGLVCLLRDPNALIAHAERVIEGYGPASVLDAVKAPVMQ ATPKPELPEVLPVSHANLPNVGLW >gi|316921643|gb|ADCP01000143.1| GENE 22 17237 - 17443 107 68 aa, chain + ## HITS:1 COG:no KEGG:DVU2043 NR:ns ## KEGG: DVU2043 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 65 1 64 533 79 53.0 3e-14 MLKPKDIMDCLPLLASVLGNQYGVTVEIGGSEAYTDGKTIHLPALPLDSEPELITMIKGY CDHGATCC >gi|316921643|gb|ADCP01000143.1| GENE 23 18251 - 19999 384 582 aa, chain + ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # 
Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 1 302 55 322 470 164 36.0 4e-40 MELIRKEENILLAYRNIKKNKGSHTKGVDGKTITYLSGLHPEELVTMVRNKLDNYFPQRV RRVEIPKLDGRTRPLGIPTIKDRLVQQCILQVLEPVCEAKFFKHSYGFRPLRSAKHAIAR AYFLAQVAHLHYVVDIDIKGFFDNIDHGKLLKQLWTLGIEDKSLLRVISKMLKAEIDGIG VPAKGTPQGGILSPLLANVVLNELDWWIASQWERHPTRDTYQPGKKDGSMSSMYYALKKT RLKEVYIVRYADDFKIFCRHHNHAKRIFAAVQKWLLERLSLEISPEKSKITNLRKGYSEF LGIKMKVRPKGKMIGKDSKKDRFVATSHLADKARSKILHEVRRHTKLMQKPVANEGHVFV NDYNAYVMGIHGYYCCATLCSKDFSEIAFRSRSALKNRLTPRRRKPKEPLPAYIEKRYGK SRRIRFVYDKPLLPISYVRHVKCFQFKEQSIFVEEDRQFVHAKQKAVSAETLRHLLTHPV QGRSVEYNDNRISLFVGQYGRCFITGELLAAWQVHCHHKKPKSLGGGDEYKNLVIVEKDI HRLIHATSEATIAACLQLLNLKSSQLMKVNRLREQAHLAPIL >gi|316921643|gb|ADCP01000143.1| GENE 24 20480 - 21517 416 345 aa, chain + ## HITS:1 COG:no KEGG:Dde_2920 NR:ns ## KEGG: Dde_2920 # Name: not_defined # Def: von Willebrand factor, type A # Organism: D.desulfuricans # Pathway: not_defined # 1 341 182 543 547 208 38.0 2e-52 MRDSCKDTGDCIRYALELEQAILAWSLEQKSITEPEKEDSGAAQDESVASSNASLDSLEK AVGALSSDELPRGFGESLAERIELDSPKDRQKQLRVAVTGQKRLAELPPGQLSRIERQTA GLGFRLQGLIQAQKWLPAMPGVRGRFSSSLCHKLAVGNPRVFVKNGFRESPGTAVHILLD SSGSMTDSGLMLANAVCYAVGKALQGIPGVNLGITAFPGANRAKRGATVAPVLRHGEKLT ARFPEIAYGMTPMAESLWWVMQQMCLLRENRKIILIITDGEPDSIPAAQEAFKQAQKKGF ECYGLGIMCSSIATLLPHTSRVIETLPQLAPALFKLLEGALRRNA >gi|316921643|gb|ADCP01000143.1| GENE 25 21657 - 21725 60 22 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSWSAWIAVAIAVLEVIREESD >gi|316921643|gb|ADCP01000143.1| GENE 26 21893 - 22138 185 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301796091|emb|CBL44295.1| ## NR: gi|301796091|emb|CBL44295.1| hypothetical protein [gamma proteobacterium HdN1] # 1 81 1 81 81 64 44.0 2e-09 MTQSTAIKGQKVKFATQLSSDVLAELRQMAHEEGRQIQSILDEALREYIENKRNAKPRTH VMTALFSSMTEHDALYEALSK >gi|316921643|gb|ADCP01000143.1| GENE 27 22135 - 22512 196 125 aa, chain + ## HITS:1 COG:alr9029 KEGG:ns NR:ns ## COG: alr9029 
COG3654 # Protein_GI_number: 17227494 # Func_class: R General function prediction only # Function: Prophage maintenance system killer protein # Organism: Nostoc sp. PCC 7120 # 5 89 4 93 128 78 42.0 3e-15 MSRDYLTTADVLGLHAILLERYGGASGIRDMGAVESSVYRPQCGYYADIVEEACALMESL LINQPFVDGNKRTAFAAFDVFLRINGKHLKMDSVRLYTLLMHWIGLPPSMRLQNMIHDIR PCVSE >gi|316921643|gb|ADCP01000143.1| GENE 28 22517 - 23449 377 310 aa, chain + ## HITS:1 COG:no KEGG:SAR11_1169 NR:ns ## KEGG: SAR11_1169 # Name: not_defined # Def: hypothetical protein # Organism: P.ubique # Pathway: not_defined # 180 308 172 298 299 147 54.0 4e-34 MTHTKYYRLMLGKGSRHAEQCLAERFIGADYGIERDLSRELPDSWRQFNAMFIPVYQEAH PEKTKVAAGLACGMLWTVCKGMSDGDIVLCPDGTGCYRVGEISGPYHYESGQVLPHRRPV RWLDIAIDRSEMSSALRNSAGSIGAVCNISDYAEEIKALLAVHQPNPIQVQDPDIENPVA FVLEKHLEDFLVANWVQTELGRRYDIFEDDGEIIGQQFATDTGNMDILAISKDRRELLVV ELKKGQAADKVVGQILRYMGYATQELAEEGQTVKGIIIAQEDDLRLRRALSVTPNVSFYR YQVSFKLIPS >gi|316921643|gb|ADCP01000143.1| GENE 29 23805 - 24122 180 105 aa, chain + ## HITS:1 COG:alr0502 KEGG:ns NR:ns ## COG: alr0502 COG3177 # Protein_GI_number: 17227998 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 1 82 20 102 151 80 49.0 8e-16 MEAFFRWYGAARGALHPVEFAARVHADFVNIHPFKDGNGRTARLIMNFELMRAGFPTVIV PVDARPDYYRNLDIAATQGDYLPFVMQIAELAQKSFAPYWALLGE >gi|316921643|gb|ADCP01000143.1| GENE 30 24359 - 25177 727 272 aa, chain + ## HITS:1 COG:SPy0946_2 KEGG:ns NR:ns ## COG: SPy0946_2 COG3645 # Protein_GI_number: 15674964 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Streptococcus pyogenes M1 GAS # 135 256 1 122 127 105 42.0 8e-23 MEMPQIFENKEFGKVRVMEYNGAPWFVASDVAKALGYERPADAVNIHCKKANKITQYCDS PDRVKTPPINLNIIPESDVYRLVMRSNLPGAERFQDWVVEEVLPAIRKTGGYGTPQTENE ILSRAITIAANRIGLLSQEVAMLQEQIALDAPKVELAEAIMETEECVSVNQFAKILKQNG LDIGANRLYRDLRKDGYLIRRKGVNWNMPKQRMMDKGYFRVVERSTDTEDDYQFVTVTTM LTGMGQYHLLAHFLNKYGVRRQLSLFEGRARA >gi|316921643|gb|ADCP01000143.1| GENE 31 25322 - 25405 63 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRFLRDVLVGFLASFLAAVAVHLLNF >gi|316921643|gb|ADCP01000143.1| GENE 32 25609 - 26304 659 231 aa, chain + ## HITS:1 COG:Z1818_1 KEGG:ns NR:ns ## COG: Z1818_1 COG3617 # Protein_GI_number: 15801289 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Escherichia coli O157:H7 EDL933 # 11 112 15 123 188 116 54.0 4e-26 MEMPQIFENKEFGKIRVVEHSGTPWFVGKDVCDCLEIGNSRDAAASLDDDEKGVALIDTP GGKQEMSIISEPGLYFLVLRSRKPEAKAFKRWIVHEVLPAIRKHGGYLTPKKLEEALLNP DVLIRLATQLKEEREARVQAEARVAILSHVRKTYTTTEIAKELGMRSAVALNRLLCERHI QFKQNGTYVLYAEYAEHGYVHIKQEILENDKIVYHRRWTQLGREWLLDMLG >gi|316921643|gb|ADCP01000143.1| GENE 33 26342 - 26575 91 77 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIDALASSFPCFIPEDVQHQMMEGDFSRAKMIFEQASRVDFSQYPIEEHNPKTISPGSQT QEPPLYGSGAFGFMIDI >gi|316921643|gb|ADCP01000143.1| GENE 34 26572 - 27351 392 259 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p2-013 NR:ns ## KEGG: ECO111_p2-013 # Name: not_defined # Def: putative head processing protein # Organism: E.coli_O111_H- # Pathway: not_defined # 8 233 13 240 339 129 33.0 8e-29 MPAQRLNCSFSLLDEGRQYTGNHRKYIIENAREICNSPATKEKIRLREALGFYGHGRRIL 
AGKMNIGEVEAVTLPDGGKAIVSNIPSNVTVAFDVSPEGVVSHSQEVLDTETGKIVSTLH ASRVGGFSWACPGVDGGRGKPTRLSGFSGFDYVLNPGFSSNRGYILEGAADTAGKEQMIL ECLAATVKDDRKAEESLAGWKLDTQARLAALEDAIFESTARQAELTERLRRSEETASTSK AHASPPFWKPQRRRNGLQG >gi|316921643|gb|ADCP01000143.1| GENE 35 27502 - 27702 259 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MARDKRKEGAAEPVKALSGLYAADPTASTPLLSALAEARGISLPDLVERVLAKADAFAVA SGSIIG >gi|316921643|gb|ADCP01000143.1| GENE 36 27798 - 28052 82 84 aa, chain + ## HITS:1 COG:no KEGG:Emin_0938 NR:ns ## KEGG: Emin_0938 # Name: not_defined # Def: hypothetical protein # Organism: E.minutum # Pathway: not_defined # 1 76 1 81 81 68 43.0 1e-10 MTYGKRTLIAVDQLINTLLGGWPDETLSSRCYRWARDGVRAWPCKLVDGLFFWQREHCKS SYESEREGRQSPPELRRKAPVKNL >gi|316921643|gb|ADCP01000143.1| GENE 37 28376 - 29686 989 436 aa, chain - ## HITS:1 COG:CAC0744 KEGG:ns NR:ns ## COG: CAC0744 COG1757 # Protein_GI_number: 15894031 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Clostridium acetobutylicum # 7 425 7 426 448 246 34.0 8e-65 MLLAAVFVLFFLALCGSVVYGVFILYPLLWGLLSVGFVAWSRGYAIGTILRMGYTGAKQS LIVVEVFFHIGALTAAWRAAGTIGFLVHHGLQMLSPGAFLVGCFLLPLAFSVLLGTAIGT VGVIGIALMVIAKAGGIDPALAAGAIVAGAYFGDRCSPMSSSAHLVASITRTDVYVNLKN MAKSSALPLLLALAGYAWLSLLHPLDATEEQFSARIAELFRLHPITALPALLVLAGGLLR IRIKNAMLYSTLTAIVIALSVQDMPAIDMLRALVVGYEAPADNPIGYLFTGGGWISIINS ILIVFFSSAFAGIFEGAKLLVEVERGVTFLFHTVGGYVTNLIVSTTVSAAACNQTLAILL SAQIQNRLYTGEQARTILALRLENTVILISALIPWNIALSLPLSILDAGPDSIPYMLFLW LMPLIGGLNFPKPQTA >gi|316921643|gb|ADCP01000143.1| GENE 38 29812 - 30567 741 251 aa, chain - ## HITS:1 COG:STM3249 KEGG:ns NR:ns ## COG: STM3249 COG3836 # Protein_GI_number: 16766547 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Salmonella typhimurium LT2 # 4 241 10 246 256 163 38.0 2e-40 MEMLKDRLYAQERVIGTWCNLGSPLSAELAGITGYDWVLLDQEHGPGDNITLLHQLHALG KYPTVPIVRIAWRDRILTKRALDLGAGGIMFPYIQNAEEAREAVAFMQYAPLGERGLAAG 
TRCAGFGAFFGEYRENTVRSLLCVAQIESREAVENAAAIAAVPGVDVLFVGPLDLSAGMD MPRQFSDPAFVETLRRVVAAASLHGKASGILVQTPEHIRLVRELGFRFIAAGGDSNAIRQ AFSANLVLARA >gi|316921643|gb|ADCP01000143.1| GENE 39 30587 - 31759 977 390 aa, chain - ## HITS:1 COG:PH1371 KEGG:ns NR:ns ## COG: PH1371 COG0436 # Protein_GI_number: 14591174 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Pyrococcus horikoshii # 2 387 1 383 389 308 41.0 2e-83 MMQEADTLQNIPCSQLWSIANQATTLGEQGYDVIRLELGRPDFDTPEHIKEAAKKALDEG KVHYVSHFGIQPLREAIADTWEHDTGQRIDPNANILITAGGTEGLYLAYAAFLNHGDEIL IGDPGWVTYFHAPLLIGVTIKRFSLIKEGRFCLDLEQIRSRITPRTRMILLNSPSNPVGG VFTRQEIEALAALAREKGLIVVADEVYHRILFGDAKHYSIAALDGMADSTITINSFSKTY SMTGWRLGYVIASSERIGAMLRVHQQLGATCCSFGQYGAVAALRGSQQCVEDMRAAYERR TDIVLQRLSSMTQLSCVPPRGGLYTYIDVSGTGLDGTTFADRFLQEEKVATMPGAAFGET GAPYIRLSIASSETNLTEAMNRIQRFLGTL >gi|316921643|gb|ADCP01000143.1| GENE 40 31863 - 32474 683 203 aa, chain - ## HITS:1 COG:BMEI0106 KEGG:ns NR:ns ## COG: BMEI0106 COG1802 # Protein_GI_number: 17986390 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Brucella melitensis # 6 156 21 171 227 75 30.0 7e-14 MLRDVVYDYLWTQIREKQFKPGEFVNLNLISQKLGVSRTPLREALIVLEAEGFVQILPKR GIYIKPLTLQDTYNLYEAIGALESSIIMSVWSKIDENFIRTLERINGEIAACTDYARHYE LNHEFHFAYINLSENTEIKNYLGRLYQRIYDFSGINYGQKFQSNNCAGHAEFIRLLREGN PVSSADYLRDVHWKFRVPEVFHA >gi|316921643|gb|ADCP01000143.1| GENE 41 32542 - 33489 1104 315 aa, chain - ## HITS:1 COG:BH2007 KEGG:ns NR:ns ## COG: BH2007 COG3181 # Protein_GI_number: 15614570 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 20 302 35 323 339 136 28.0 5e-32 MKRFICTLFGLLAGMALLLPSLSWAKYPEKPITFICPYGAGGSTDIQARVLAGSLEKKLN QPIIVKNITGAGGLPGTQAAIDARPDGYTFGYIPLGPLVMQPHLRGLPSQVDQFAYVARI INAPYVLFVAANAPWNSFEAMMKDILANPKKYRYASSGAGTQPHIAMEDLFFQYGASVQH VAFHNDADAMQSMAGGHVQISTAPMSVVRQYGVKPLLVYDLKKISEIPEVPTSAQLGKSI 
VYTHWHVLTAPKGTPQEYIDAMSRAIAELSSDPDYLARLDKLCMKPAYMGPKDTEKMVRE EYDHYGKIIARVIKK >gi|316921643|gb|ADCP01000143.1| GENE 42 33526 - 35055 1318 509 aa, chain - ## HITS:1 COG:BH2009 KEGG:ns NR:ns ## COG: BH2009 COG3333 # Protein_GI_number: 15614572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 457 1 454 504 277 36.0 5e-74 MVDSILAAFQMSINAETLLIVFIGAFYGTIIGALPGLGTVVALTMCLPFTLHMENVPAIA LLLAVYCGSVFGGSISAVLINTPGTPQSAATGMDGYPMAKRGQAGQALGWVTVASVTGGL ASCVVLIVATPQMAALSIRYGGPLEICALICLGLACIATLSQGNQIKGLMMGVFGLLLAT VGNEPVSGMMRFTFGNRGMETGIDMLPLVVGVFPLAEVFYRIYEERSSVKAVPIDCRTIV FPRLSEWKGRFVGLLRSSGIGILIGILPGTGPTAATFISYASAKRSSKNGANFGKGEPDG LIAAESANNAVTGGALVPTLALGIPGDATLALLLATFAIHNLTPGVRLMVDFPDVVYASF ITLVIANLMLIPSAILTVRLFGTLMRIPSPLFIGFILLLSLLGAFISRNLSFDLSVAIVM GLVGFAMRLWDFPAAAMLIGFVLGPQFEYRLGQVFLFKGELSWLEYFSQNPVGTGLLVIT AFILLSPIYSALRRKGGDAATPDAASSEE >gi|316921643|gb|ADCP01000143.1| GENE 43 35096 - 35551 551 151 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDVSRKGIVVSVGFILAAVFLLYDSYTTKEVYLNEGALYPMTYPRILLGIWIVLSGLHVF ARRVAVDIPSLLKALPTILLITAVLVGFTILLPLLGFPVASFLLLCAVFLILHYRYPIRL TLIAAGASFFLWFIFQKLIALPLPSGTLLFF >gi|316921643|gb|ADCP01000143.1| GENE 44 36344 - 36620 279 92 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSRPALIDKLCQAGKWLFKPDLVPVDDVTLKRGSTAISVAIDALVRSGGGLAVDAKTGRL YVDFSLVPDDQMQAIVLAMVQQGGGLAVDGTG Prediction of potential genes in microbial genomes Time: Fri May 13 04:49:07 2011 Seq name: gi|316921640|gb|ADCP01000144.1| Bilophila wadsworthia 3_1_6 cont1.144, whole genome shotgun sequence Length of sequence - 2808 bp Number of predicted genes - 4, with homology - 2 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 132 - 1046 239 ## gi|212702826|ref|ZP_03310954.1| hypothetical protein DESPIG_00858 2 1 Op 2 . + CDS 1080 - 1316 170 ## 3 2 Op 1 . - CDS 1211 - 1480 139 ## 4 2 Op 2 . 
- CDS 1489 - 2703 605 ## gi|212702826|ref|ZP_03310954.1| hypothetical protein DESPIG_00858 Predicted protein(s) >gi|316921640|gb|ADCP01000144.1| GENE 1 132 - 1046 239 304 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212702826|ref|ZP_03310954.1| ## NR: gi|212702826|ref|ZP_03310954.1| hypothetical protein DESPIG_00858 [Desulfovibrio piger ATCC 29098] # 35 302 58 331 426 198 41.0 3e-49 MSRPALIDKLCQAGKWLFKPDLVPVDDVTLKRGSTAISVAIDALVRSGGGLAVDAKTGRL YVDFSLVPDDQMQAIVLAMVQQGGGLAVDGTGKLYVDFASMPTDKFESMLKSIRVPVWLS KNLIFYVDAGTGADTLDDGRGLSVSKPFKTIRAAVEYIANNYNLGKYIATVYIGAGIYRE DINLPKYNSTTGYIYLRGIDEDRGQVVINGCIYAGTSVGVYYFRSVTVRNRAGESSIGSK NFFAVSARPGAELQLYNLGIDLSSAAPPLGDKYGIVAMGGTITIRDLNEDDIGLNISSGP TAIA >gi|316921640|gb|ADCP01000144.1| GENE 2 1080 - 1316 170 78 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLADISISGAVGATLALSGLAIFDVQIMVERDAPKFVGYVTGRRYSVNENSIAKTHGRGP DFIPGDSEGLVATGGQYS >gi|316921640|gb|ADCP01000144.1| GENE 3 1211 - 1480 139 89 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDIRLYKLADGRLWSVEMYDLRIIRGDFNETGSVNVSETAIFYTIYQPHYLLLTRYEYCP PVATNPSLSPGIKSGPRPCVLAMEFSFTE >gi|316921640|gb|ADCP01000144.1| GENE 4 1489 - 2703 605 404 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212702826|ref|ZP_03310954.1| ## NR: gi|212702826|ref|ZP_03310954.1| hypothetical protein DESPIG_00858 [Desulfovibrio piger ATCC 29098] # 35 237 58 267 426 145 40.0 6e-33 MSRPALIDKLCQAGKWLFKPDLVPVDDVTLKRGSTAISVAIDALVRSGGGLAVDAKTGRL YVDFSLVPDDQMQAIVLAMVQQGGGLAVDGTGKLYVDFASMPTDKFESMLKSIRVPVWLT GNKPFYVNINTGSDTLDEKRGESAGKPFKTLQACLNYVCDNFNVSRYACTINVAPGTYFY FGIADFARTTGYIMITGESAATTIVEATNRSAGSFSAGSWRLKNIKFREVAADVPAGGAW VEGNALSVMGTAVVSLDEGFESEIIAGASVSVVSTGAFRAISLADSSIVTVYKARITAID NTLRQIVKGRMSGLTGYGSSVCNLRGVSAGAVCLEINGDFSTSITANGSYIVRNTVDLPV VTGTAQGRRYSAINGGRIVTSGGPDYFPGTTAGTVETSTFSWYK Prediction of potential genes in microbial genomes Time: Fri May 13 04:50:02 2011 Seq name: gi|316921633|gb|ADCP01000145.1| Bilophila wadsworthia 3_1_6 cont1.145, whole genome shotgun sequence Length of sequence - 3574 bp Number of predicted 
genes - 7, with homology - 3 Number of transcription units - 4, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 301 300 ## - Prom 428 - 487 4.0 + Prom 413 - 472 5.0 2 2 Tu 1 . + CDS 576 - 929 359 ## 3 3 Tu 1 . - CDS 890 - 1486 226 ## - Term 2006 - 2052 1.8 4 4 Op 1 . - CDS 2123 - 2305 206 ## Ppha_2830 helix-turn-helix protein, CopG family 5 4 Op 2 . - CDS 2383 - 2640 384 ## 6 4 Op 3 . - CDS 2699 - 3166 525 ## Dde_1737 phage-encoded protein-like 7 4 Op 4 . - CDS 3185 - 3574 60 ## COG3293 Transposase and inactivated derivatives Predicted protein(s) >gi|316921633|gb|ADCP01000145.1| GENE 1 1 - 301 300 100 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRPALIDKLCQAGKWLFKPDLVPVDDVTLKRGSTAISVAIDALVRSGGGLAVDAKTGRL YVDFSLVPDDQMQAIVLAMVQQGGGLAVDGTGKLYVDFAS >gi|316921633|gb|ADCP01000145.1| GENE 2 576 - 929 359 117 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MALLKGGKGGGSSTTNISSEPDKAYNARMAKVTERQQALSDQFFNWWQTVQAPVEQANAD AQLGLIPAQTDIVREGLEGFKGIQTDFYNASRPTSTETYAGRAATYANMALSKATAA >gi|316921633|gb|ADCP01000145.1| GENE 3 890 - 1486 226 198 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFSGLFEEIYSFPLSQYLAVVAYRGNEIFLPMHSDRNATYGADTSVFVMNDGAMTGLRFL RPGPGQQFLERIASCASLRRENEASIPLLPHSDSEHPTSKARPLRSLRHHALEPGHKALL HRRRIRHEFHSLPSKPVGHRPGQFFQVHLLALHPARVRGPLYGQIAAGRLRRAACRTDAH PAHVPVHAAVALDSAMFA >gi|316921633|gb|ADCP01000145.1| GENE 4 2123 - 2305 206 60 aa, chain - ## HITS:1 COG:no KEGG:Ppha_2830 NR:ns ## KEGG: Ppha_2830 # Name: not_defined # Def: helix-turn-helix protein, CopG family # Organism: P.phaeoclathratiforme # Pathway: not_defined # 6 60 10 64 79 71 58.0 8e-12 MKANDFDDGGSVMPFADMPQAERPNLKTKRVNVDFPAWMVEALDKEAAHLGVSRQALVKV >gi|316921633|gb|ADCP01000145.1| GENE 5 2383 - 2640 384 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLLFDDGKKLEKAVGEEAAKTIVEVLERFDESQRSASASKGDLRETELRLLKEIQQAKAE TIKWIAGIITAQTVAIIAAIIALMK >gi|316921633|gb|ADCP01000145.1| GENE 6 2699 - 3166 525 155 aa, chain - ## HITS:1 COG:no KEGG:Dde_1737 NR:ns ## KEGG: Dde_1737 # 
Name: not_defined # Def: phage-encoded protein-like # Organism: D.desulfuricans # Pathway: not_defined # 1 120 67 177 179 64 39.0 1e-09 MHIIFRDGFTLLAMGYTGPEAMRFKLAYIEAFNRMEAELAKRNRPALPAAPRFDEAAMLE LAAEIREAQQHYYRTFGHLCSRLISMSIPVFTALENRVYKQAPDRPFSGVRIGAQWERYF TERMTAALHSLDDRLPDEKNPAMLLLEYARAMSAR >gi|316921633|gb|ADCP01000145.1| GENE 7 3185 - 3574 60 129 aa, chain - ## HITS:1 COG:BMEI1402 KEGG:ns NR:ns ## COG: BMEI1402 COG3293 # Protein_GI_number: 17987685 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Brucella melitensis # 1 124 2 125 146 169 62.0 1e-42 DSKLHAVCDGHGRPVLLLLTEGQVSDCKGAAILQHLLPEDGIFLADGRYDVAWLRESLKA RGMTVCIPPRKTRNAAFPFDKTLYKRRHLIENTFSKLKDWRRIATRYDRCAHTFFSAICL AVCVIFYLH Prediction of potential genes in microbial genomes Time: Fri May 13 04:50:55 2011 Seq name: gi|316921626|gb|ADCP01000146.1| Bilophila wadsworthia 3_1_6 cont1.146, whole genome shotgun sequence Length of sequence - 6808 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 465 - 1613 1716 ## Ddes_1722 metallo-beta-lactamase family protein 2 1 Op 2 . - CDS 1624 - 3027 1996 ## Ddes_1721 amino acid transporter, AAT family 3 1 Op 3 . - CDS 3078 - 3317 158 ## 4 1 Op 4 . - CDS 3323 - 3970 696 ## COG2188 Transcriptional regulators - Prom 4071 - 4130 1.6 5 2 Tu 1 . + CDS 4380 - 5561 1459 ## Ddes_1718 hypothetical protein + Term 5629 - 5694 8.2 - Term 5628 - 5666 8.1 6 3 Tu 1 . 
- CDS 5713 - 6579 784 ## COG0384 Predicted epimerase, PhzC/PhzF homolog Predicted protein(s) >gi|316921626|gb|ADCP01000146.1| GENE 1 465 - 1613 1716 382 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1722 NR:ns ## KEGG: Ddes_1722 # Name: not_defined # Def: metallo-beta-lactamase family protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 382 1 382 382 696 81.0 0 MDRREFMKGAALGVGAGVLGSMGLYSYSPLRRAFLPEVKRGTADIGVCKSVKVTNISETS WFDNGIFMNDVTGAGGLLVDQYTFNWAPFGNGKGVGKGTFEEGLKRIKHLLPHKIDEAWA IAKENCVHADNAGGYSCLVEIESMEGETTRYLFDTGWNYDWMDTCFKREGIDVMLADNKI DAFIQTHEHMDHYWGFPVVTKYNPNIHVYTPNTFYPAGKEYLKACGHVGKWTEVPKGLHT LQPGVALYQFDCPIIFKVFGEMSLYCNVKDVGLVSITGCCHQGIILFADTAYKELAYEKD QFYGLYGGLHISPFDDWDPKYDDLVIGLKKWNLQQVGCNHCTGLITAQKFVDAGYPVVKG TARFRSKTTNYLGNGDVIKFPA >gi|316921626|gb|ADCP01000146.1| GENE 2 1624 - 3027 1996 467 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1721 NR:ns ## KEGG: Ddes_1721 # Name: not_defined # Def: amino acid transporter, AAT family # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 5 467 4 467 467 706 82.0 0 MSGQQQIPYLEERKLIKRWSGPWPMLANMAFTLLIFAVTWWVFQDPRGIMRFYTPYVGYN YCRWWLIILIWMAYIFDFWPFRRDWVRSAHPLQKGLVLALVSVGIMIAMIHGFFEGVLGN LAFAYFNPAQLQKLGLTDFYSTEYAAQACMMFAVIASWISPAWLVALEGQPWAGLSQPVR GFSIWLGTFCLSLLIYFMTMHNHMGILYYPWQYFTAICPPYWEHFAETVSANFHVAWIMC CTVVVWFMEGIWERFPFTMIKTPWLRRLALFFGIIAISWALCMFFWYMQELVWGDAIRGH RRDAAPDWRWLHVGETAIFFLVPALFLQFYCGNWPNRFSTPINVLVRSLLVTLGGIAIYC LYYKYAHFVLGTQKGFSHPQQFPMIPMIWLINIWLINWWFMDGWPGWRREFKTAEDFAAE ERHLAANAAWNPAMLPGVIIGLVVGVIAYFAIVALLPLASATFTLVQ >gi|316921626|gb|ADCP01000146.1| GENE 3 3078 - 3317 158 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRRAFLGFRFSEEDAGTRDVPGGVPPETLLPPEFSPAMLRAEAERLGLDPDRIGAGELA SAVMRAMYARAPEGAGRTG >gi|316921626|gb|ADCP01000146.1| GENE 4 3323 - 3970 696 215 aa, chain - ## HITS:1 COG:SP1446 KEGG:ns NR:ns ## COG: SP1446 COG2188 # Protein_GI_number: 15901296 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 7 68 11 72 235 63 
50.0 4e-10 MDKQECIERQLESALLEGRWGVFERLPSERGLAEAFGVNRATVRAALRALAGRGILETRR GSGTVVRALPGDARHAAGTFADYLAAFRILMPPLVAASLPAVSPSVILELERLLPSAGAS LRNGDMKTFIQAQIRFFSILIRVLGNPRLEDAASRVLPDGLALARLLQECTLPQCENVFA QLARLLSALRHADAGAAAKAVEGYATILLALQEAR >gi|316921626|gb|ADCP01000146.1| GENE 5 4380 - 5561 1459 393 aa, chain + ## HITS:1 COG:no KEGG:Ddes_1718 NR:ns ## KEGG: Ddes_1718 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 392 1 392 394 377 51.0 1e-103 MQQFSPWIAGFPGLIWSIDIPHSRLKVFNRWSIPGLDVPRLLKDARYRKKAVRREDHALL MEFWDMLTARRPAAVTFRLTNGNGTYILQGWPGENDTLYYGFLKEAFLPAAYSADGRPGT CQMDIGGIGYPVFALDIPTRGLLIINDAARSLFAAASRDPNTLGLADIAPGGLAEKLLEA GNKALANDAWAGTLTFSNAQRGLFSAKVRLTPCGGAGEGRVVRVALLNIPENRKSAACPP SEEDAQTAPRPLREGLESLFAPYAGELDGLLFSDIQSHKGQVVVYGAGPAFRELRWGAEH AYEGTIAQDIERFGLRSLTVEDTLDSIKSIDWVLFAPHGIRSYFAKPFYTEQGLHAILIL ASIRPGSFGADAETRFASLMGPFERLIGAWRKG >gi|316921626|gb|ADCP01000146.1| GENE 6 5713 - 6579 784 288 aa, chain - ## HITS:1 COG:BS_yfhB KEGG:ns NR:ns ## COG: BS_yfhB COG0384 # Protein_GI_number: 16077914 # Func_class: R General function prediction only # Function: Predicted epimerase, PhzC/PhzF homolog # Organism: Bacillus subtilis # 10 287 10 292 293 140 32.0 4e-33 MPNCTFHIVDAFTAVPFTGNPCAVVTDADGIEPADMLRIARETNAPETAFVLASDKADVR VRYFMPRGEIPFAGHPTIATGHLLRELGVLKPGTARFEFAIGVLPVDIRPDRVIMTQPPA VPDTCADAGTTAQALGLQASDLREGLPCQLMRGGVSFLMVPVRELAALRRINMDRPALKA VLAPLGVSAAYVFAPEGVEPETDVHARLIDPDNAGEDPFTGSAAGCMASYMHAHGLCSGT RIRLEQGHILDRPGTGELELILASDSGKLDAVRLGGTAVSSASGTLRW Prediction of potential genes in microbial genomes Time: Fri May 13 04:51:31 2011 Seq name: gi|316921605|gb|ADCP01000147.1| Bilophila wadsworthia 3_1_6 cont1.147, whole genome shotgun sequence Length of sequence - 26046 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 9, operones - 6 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 60 - 497 401 ## COG0251 Putative translation initiation inhibitor, yjgF family 2 1 Op 2 . - CDS 499 - 1656 1531 ## COG0006 Xaa-Pro aminopeptidase - Term 1697 - 1721 -0.3 3 2 Op 1 . - CDS 1817 - 3001 1247 ## RPB_4421 peptidase M24 4 2 Op 2 34/0.000 - CDS 3077 - 4096 1333 ## COG0765 ABC-type amino acid transport system, permease component 5 2 Op 3 34/0.000 - CDS 4105 - 4848 585 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 6 2 Op 4 31/0.000 - CDS 4824 - 5810 1140 ## COG0765 ABC-type amino acid transport system, permease component 7 2 Op 5 . - CDS 5859 - 6668 1153 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 6832 - 6891 6.4 + Prom 7860 - 7919 6.0 8 3 Op 1 3/0.000 + CDS 8019 - 9566 1574 ## COG1042 Acyl-CoA synthetase (NDP forming) 9 3 Op 2 2/0.000 + CDS 9404 - 10213 870 ## COG1042 Acyl-CoA synthetase (NDP forming) 10 3 Op 3 11/0.000 + CDS 10233 - 12005 2353 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits 11 3 Op 4 . + CDS 12019 - 12606 948 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 12 4 Tu 1 . + CDS 12786 - 14366 2058 ## COG3653 N-acyl-D-aspartate/D-glutamate deacylase + Prom 14559 - 14618 4.0 13 5 Op 1 . + CDS 14669 - 15421 916 ## COG1414 Transcriptional regulator 14 5 Op 2 . + CDS 15679 - 17052 1566 ## COG0044 Dihydroorotase and related cyclic amidohydrolases + Term 17077 - 17124 16.5 - Term 17221 - 17254 -0.5 15 6 Tu 1 . - CDS 17280 - 17756 499 ## COG1246 N-acetylglutamate synthase and related acetyltransferases - Prom 17968 - 18027 2.4 + Prom 18251 - 18310 1.9 16 7 Op 1 . + CDS 18365 - 19633 1287 ## COG2206 HD-GYP domain + Term 19710 - 19742 0.4 17 7 Op 2 . + CDS 19792 - 21768 2127 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Term 21811 - 21868 9.8 18 8 Tu 1 . 
+ CDS 22099 - 23424 1909 ## COG2610 H+/gluconate symporter and related permeases 19 9 Op 1 1/0.000 + CDS 23557 - 24546 1537 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 20 9 Op 2 6/0.000 + CDS 24667 - 25311 831 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 21 9 Op 3 . + CDS 25316 - 26045 762 ## COG3395 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|316921605|gb|ADCP01000147.1| GENE 1 60 - 497 401 145 aa, chain - ## HITS:1 COG:aq_364 KEGG:ns NR:ns ## COG: aq_364 COG0251 # Protein_GI_number: 15605872 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Aquifex aeolicus # 11 140 12 117 125 63 31.0 1e-10 MGTVTRIGRGPAGLPFSRMTLKDGRFFLSGQVAQNPETGRIDHPGCYEQSLTVLGNIRGI LAEAGFAPEQVAKMTVFLTDMGQKDAFDRAFERVFRRVPEEGDAPGSPEGSGTGEGGAGL SCAPACTCVGVAALPNPAAVVERGT >gi|316921605|gb|ADCP01000147.1| GENE 2 499 - 1656 1531 385 aa, chain - ## HITS:1 COG:PAB1637 KEGG:ns NR:ns ## COG: PAB1637 COG0006 # Protein_GI_number: 14521304 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Pyrococcus abyssi # 128 364 117 343 351 79 26.0 2e-14 MNTTVLEKIQSELERSGCDAFLLSGGDNMRYAAQVHLPFTDEHPGLAFVLFAKGRLPLVL SPSLWARSWERGIISDVRAYPGSLDATHAAAELAAFLRPLGLASIGLDMNRASVALREAL GEWALTDISAAISAIRQIKSAEEAAFLEDLAYRADHAILGAAHHSLACDARAEKGQAELI RMHALERGFEAVGYNGISQSATGAHAKDYWAESPKFGIGQGKKTARDEMNRLDLQASLNG YWAFGSRLMTMGWPLPEQTKAYEGLVAMRKAALAAVRPGATCADLYAAMLDAAERAGVTP VTGVHWGHGIGVASQEAPWIAPGDATVLKEGMVLVLNPHVQGPGGTILYSRDTVLLDFNG CKILGWYKDWREPYVASDSYYSGGG >gi|316921605|gb|ADCP01000147.1| GENE 3 1817 - 3001 1247 394 aa, chain - ## HITS:1 COG:no KEGG:RPB_4421 NR:ns ## KEGG: RPB_4421 # Name: not_defined # Def: peptidase M24 # Organism: R.palustris_HaA2 # Pathway: not_defined # 1 371 5 385 399 78 25.0 4e-13 MQASIQSRLAGAMREQGMRSLLLADPENFGYVAGPMLPYVKQTPDRAVLALFEPGTDEAA 
CTCWLYPDLAQPAEAQLAKEPSAVCVAVAEDAVGADCLIRRLKESRPDLDGVVGVDFSQT AASVFDALRAAFPGCSFKDAGPMLRGLRLVKLPEEIDRIEMACRQVDTGIIGAINHMEGA HSGSAYRKGFYTVPEFIERVRIHAYEGGTSMSGNMAAVAGEAIGANYMPQRGFLPSEGLC RIEYACCHGGYWGVTARMIHIGEDMDPRVRSACADNLSLKMLALDMLRPGVACSDIHGAV VKEAAKRGIPLSSASGIGHGVGRGEIEGPYLAADDPTVLCENMVIALDVSTLGPEGELLR SIDIYAVEKDGPRLLSWHRNWDMPYMITGFRSAH >gi|316921605|gb|ADCP01000147.1| GENE 4 3077 - 4096 1333 339 aa, chain - ## HITS:1 COG:mll5422 KEGG:ns NR:ns ## COG: mll5422 COG0765 # Protein_GI_number: 13474522 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Mesorhizobium loti # 120 334 7 220 224 147 40.0 2e-35 MSRDPHPAEAGGVPRPRRRARIAGPWYHSPKAALSGLAVLIAWGLLLPGATPVGGGMSTL GSLSLFGYALFSIVGGIVFLSNRNLGRVLSSCAAVGLLLLWGWLFVRYSGADWSRLAYVF FNFEILGESGMKALAGGFMTTLEVGIIATLCAFWLGVFLTLVRFFENKVMTGFVKVYVDV FRALPTIVLVSLIHYGAPFIGIYLPLMVSGILTLTLNHSAFFSEILRSGVSAVGHGQLEA ARSLGLGTFKTFRLIVFPQAFKTAMPPLTGQFVALLKETVICSVVGIPELLREALVVQSW TANPTPLIIATIGYLLLLVPLTRLSRYLEVRMSTVRQAC >gi|316921605|gb|ADCP01000147.1| GENE 5 4105 - 4848 585 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 8 246 1 242 245 229 48 1e-59 MDNTASCMIEVCDLHKHYKDVHVLRGISTTVAKGEVCAVIGPSGSGKSTFLRCLNRLEEA SSGEIFINAQEITDPHLNLNKMRENIGMVFQGFNLFPHMTAIQNIMLAPVNVRGLSHKEA EAYGMELLSRVGLADKHASYPDMLSGGQQQRVAIARALAMRPEVMLFDEPTSALDPEMIG EVLEVMIDLAKQGMTMVVVSHEMSFIREVADKVLVLADGVIIEEGSPQEVFTNPRHERCR SFLSKIL >gi|316921605|gb|ADCP01000147.1| GENE 6 4824 - 5810 1140 328 aa, chain - ## HITS:1 COG:mll5422 KEGG:ns NR:ns ## COG: mll5422 COG0765 # Protein_GI_number: 13474522 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Mesorhizobium loti # 108 319 7 218 224 149 41.0 1e-35 MQAENTHSRQKRQRLWYQRDGFLWGLVVALIAAAWGLLPPAGTSLGIGGWLGMGLYMLLA AVPMIVTTMDKRRPQLAGTAAALGFLGLLGLVFYRYSGAQWDMLGMMFFNRELIVETWPI 
LLGGLENTLLIFICSAVLSVTIAPIIAIIRTLNNPIFNCAIDAFVDVFRSLPILVLVIFI YYALPFLEIHSTPFAAGVCGLFLSALAFMIEYFRAGIESVSRGHVEAARSLGLGVIQTMR FVILPLAIRVVLPPLTGHLVGLLKATAFVSVVGMPELLKRAMEIVEWKANPTPLVTITIM YLILLLPLMYGSTLLERRLARWTTPHRA >gi|316921605|gb|ADCP01000147.1| GENE 7 5859 - 6668 1153 269 aa, chain - ## HITS:1 COG:mll5423 KEGG:ns NR:ns ## COG: mll5423 COG0834 # Protein_GI_number: 13474523 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Mesorhizobium loti # 34 265 30 263 265 123 32.0 4e-28 MNRLHVLVLLAAMACTGLVMGTVKAEAADSGKELRAAVNATAPPFAFMENGKLVGFEVDL HAEVAKRMGKQLVMTNIPFAGLAPGLQAKKWDAAGNTWINKERLVMMDFSDPWVDAQLAF VTRKNDAITSPEQLKGKTVGAEAGTAEQRWLDANQEKYGPYTIRTYDRSIDILMDLMNKR LDAYLREQTRALYQIKNYPMLGVAFPVGDVFVQGVAFRKGDPLRDEYNTILNSMKRDGTL AALYEKWLGTKPGPDSSTVHVYTTPWQPK >gi|316921605|gb|ADCP01000147.1| GENE 8 8019 - 9566 1574 515 aa, chain + ## HITS:1 COG:MJ0590 KEGG:ns NR:ns ## COG: MJ0590 COG1042 # Protein_GI_number: 15668770 # Func_class: C Energy production and conversion # Function: Acyl-CoA synthetase (NDP forming) # Organism: Methanococcus jannaschii # 6 430 6 418 704 220 32.0 6e-57 MKKSPWKLDPLFTPRSIAVIGASPNGGAGSIVLRNMQRLGYTGTVYPINPKYKEIFGYPC YASLRDLPAPADCAAVLLGSRSLLPMLEEAHAAGVKGVWGFASGFAETGGEGRAMQRRIH DFCHESGLLFCGPNCVGYANITDKTCMYSAPLPQAFRPGDIGVIAQSGAVLLALGNANRM AGFSRLISSGNEAALGLADYMDYLVDDDATHVIALFIETIRDPEAFAAACSRAAEKGKPV IALKVGRSELACRVAATHTGAIAGSDRILDAFFRRANVMRVNTLDDLLETAVLFSGLRGL SAGSPKVGMATVSGGEMGMLADVCADSGLVFPPVSEAGKERLRKVLPPYAPIANPLDAWG SGDLREAYPASLGILAAEEDVETLIVSQDIPGNMAPEQVEQFSDVARAAVAVRAASGKPV IVVSNISGGIDPAIGDILAQGNVPVVQGSGAGIPAIRRWIDWCAARAAKTASPYPRKDSP TPSILPPNWLRNSTPVRAYCPTRFPPACCAISGSP >gi|316921605|gb|ADCP01000147.1| GENE 9 9404 - 10213 870 269 aa, chain + ## HITS:1 COG:PH1788 KEGG:ns NR:ns ## COG: PH1788 COG1042 # Protein_GI_number: 14591546 # Func_class: C Energy production and conversion # Function: Acyl-CoA synthetase (NDP 
forming) # Organism: Pyrococcus horikoshii # 43 261 25 236 238 142 37.0 8e-34 MRRARGENGISLPPEGFPHSFHTPPELASELDASSGVLPYALSARLLRHFGIAIDHEELA VSLEEGMAAAGRIGYPVALKAASPDIAHKTETGLVKLNLRTGAEFAAAWRELEGILALHH ADARREGMLVQGMVRSAGGKSGDVVETIVGVNRDGGFGSAVMVGLGGVFVELLRDVSLEL APLSPEGARAMIGRLKAAKLLTGFRGRAASDVEALSDVLVRVAAMCCALGDRLVSLDLNP VMVLPEGEGVRIVDIVMQVAPARHCGETD >gi|316921605|gb|ADCP01000147.1| GENE 10 10233 - 12005 2353 590 aa, chain + ## HITS:1 COG:CAC2001 KEGG:ns NR:ns ## COG: CAC2001 COG4231 # Protein_GI_number: 15895271 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Clostridium acetobutylicum # 1 583 1 573 584 489 42.0 1e-138 MKKLLSGNEALARGALEARIEIACGYPGTPSSEILENVSKYKEIYSEWSVNEKVAMDAAA GAAYSGRRALVTTKQVGMNVMSDSLFYTAYTGAEAALVVVTADDPGLFSSQNEQDNRHYA KLGKFPMLEPCDSQECKDFMIEAVDMSERFDTPVVIRTTMRTSHSKSVVELGAPGSYGRE VGPFPRNIEKYNSMCTWARKRHYILEQRLLDIEEWSNTWPGNRIEWGDREYGFIAGGIIY EYVKQVFPEASILKLGMCYPLPKKLIREFADGVRNVIVVEELDPFIEEQVRAMGIPARGK DIFSICGELLPEDIAESCRKAGILKDTAPAFSDTSESLPSRSPLLCSGCPHRSTFYNLSQ MKVPVAGDIGCYNLGTLPPFNAQHTMGAMGASVGVLHGMGLSGLPEPAVCTIGDGTFFHA GVAPLLNMVHNKGKGTVIIMDNRTTAMTGHQDNPGIEATLSLGTVAPVDIAGLCRACGVE KVLTADAFDLAAVRKALEECTAYDGVSVLITRGDCVFVSRSPKPARVVDADKCIACGKCI QSGCPSVVLSDEKHPRTGKRKARIEPVTCVGCGICAQICPVHAISGPEQA >gi|316921605|gb|ADCP01000147.1| GENE 11 12019 - 12606 948 195 aa, chain + ## HITS:1 COG:CAC2000 KEGG:ns NR:ns ## COG: CAC2000 COG1014 # Protein_GI_number: 15895270 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Clostridium acetobutylicum # 3 190 2 187 192 130 34.0 1e-30 MKKTIRFLITGVGGQGTILASDILAEVGLRLGHDAKKSDILGLAVRGGSVVSHIIWGENV RAPMIDRGTADYYLSFEWLEGLRRLVYTNKDTVVLANDWRIDPVSVSSGQADYPDEDGIR AAIRERCAGLREIPATPLAVEMGNARVFNSVMMGGLAQLVGGDADVWRSVIADRVAPKVR DLNVRAFDAGLNHTF >gi|316921605|gb|ADCP01000147.1| GENE 12 12786 - 14366 2058 526 aa, chain + ## HITS:1 
COG:PAB0090 KEGG:ns NR:ns ## COG: PAB0090 COG3653 # Protein_GI_number: 14520359 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: N-acyl-D-aspartate/D-glutamate deacylase # Organism: Pyrococcus abyssi # 2 525 4 523 526 428 42.0 1e-119 MFDILFRNAKVIDGTGNPWFYGDVGVEGGTVAAVLPPGSTARARRVADVDGKALCPGFVD GHSHSELPLLAGEGMDCKIMQGVTTENLGLDGMSLAPIKDGDKADWKKHLSGLAGTLDVP WTWNSFAEYLDCLQAAKPPLNVSSYVGLGTVRLSVMGMADRPAEDHEIRAMRDLVARCMD EGARGISAGLIYPPSRYQSLEETVAVAKAAAAHGGIYDVHMRNEADAIAESVEEVIAISE RSGIPAMITHFKIRGKKNWGRSAALIRRIDEARAAGIDVTMAQYPYTAGSTFLHVVIPPW YHSRGVDGLLRALREERGAIKHDLDTRMDWENFSQSVGWEKIFVSSVVSEANQPCVGKSI SQIAAERGHADPADAAFDLLVEEQLAVGMISFGLDEADITAIMRHPAVSFITDGLMSGRP HPRAYATYPRILGRYVREQGVLSLEEAVRKMTSLPARKIRLRNKGVIAEGYDADLVVFDP DTVIDKNSYDDPRVHPAGIAHVLVNGVFVVEDAALTGARPGRVIRD >gi|316921605|gb|ADCP01000147.1| GENE 13 14669 - 15421 916 250 aa, chain + ## HITS:1 COG:YPO1714 KEGG:ns NR:ns ## COG: YPO1714 COG1414 # Protein_GI_number: 16121974 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Yersinia pestis # 2 207 13 218 263 95 29.0 7e-20 MLTSSQKVFDVLEFLCANGPHKALEVSKALSLQKSSVHRFLNSLIEFGYVRKDDQTGLFS VTLKVVQLGAMVSSKIDMVETGRSYMRELVATFDQATVSFATFMDKQVLVLRREYPRNCV TRIDLSQQLPAYCTGLGKALLSTKSEEEIDEYIATVPRTAYTLHTLTDGARLKQDLLAAK EKGYAEDISEISDSLHCVAVPILTPHGGLWAISLSGHCSVIREYGVENIVKHLKKAVWDM TSGYSAAPVA >gi|316921605|gb|ADCP01000147.1| GENE 14 15679 - 17052 1566 457 aa, chain + ## HITS:1 COG:BMEI1644 KEGG:ns NR:ns ## COG: BMEI1644 COG0044 # Protein_GI_number: 17987927 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Brucella melitensis # 2 451 6 454 489 224 31.0 4e-58 MSTTLITNGIVVDAGTMRPLDILISGERVESLLPRGSEVRADRVIDASGRYVLPGIIDAH NHPVYADRIDRLSRAALSGGVTTVIPYIGAVAAWGKQGGLVQAIDDFIREGETGSCIDFG LHCTLTANVMEEADEAIPELVARGVISFKAFTSYRKRGMKLEDDQILHLMELVAREKALL AFHAENDAILEHLEAKAVAEGREHPRDYPATHPNISEAEAIFRVLSLASVAGCNIYLPHV 
TCKESLEAIRLFRKWGTVPTLFAETCPHYLTLDDSQLALRGNLAKMSPPLRKEADREALW EAIRNGEIQVVASDAAGHATAANEPLFAETFRAPHGAPGVDTLFCVTWNEGIAKGRVAIP DLVRLLCENPAKCFGLYPRKGTLLPGSDADVVIVDPGDDWIIPDRNEHMAVDYSLFAGMG CLGRHRTVLLRGAVAYEDGKILDDARRGVFLRGSLPA >gi|316921605|gb|ADCP01000147.1| GENE 15 17280 - 17756 499 158 aa, chain - ## HITS:1 COG:aq_1359 KEGG:ns NR:ns ## COG: aq_1359 COG1246 # Protein_GI_number: 15606556 # Func_class: E Amino acid transport and metabolism # Function: N-acetylglutamate synthase and related acetyltransferases # Organism: Aquifex aeolicus # 4 153 14 162 181 145 46.0 2e-35 MQPVLRPARMCDAKKIHGLLMECSAEGLLLPRSLSNLYGQIRDFRILETSGGEFIGCAAL AVIWENLAEVRSLAIAKPWRGNGFGNILVDECCKDARRIGIDRVFALTYQVRFFERNGFG VVSKDLLPQRVWIVCANCPRFPDCDEVAVMRHLDAGGA >gi|316921605|gb|ADCP01000147.1| GENE 16 18365 - 19633 1287 422 aa, chain + ## HITS:1 COG:VCA0681 KEGG:ns NR:ns ## COG: VCA0681 COG2206 # Protein_GI_number: 15601438 # Func_class: T Signal transduction mechanisms # Function: HD-GYP domain # Organism: Vibrio cholerae # 1 380 15 407 431 203 33.0 6e-52 MQLRIKLYDLILAFANALDQVHPALTGHHRRVGFLLDRLAERLGLPSEERERLFLAGIMH DVGVIPLKTSAEDLVFERERYLHPQAGCLFLQNCPTLAEEAERVRFHHMYWEKACDRGLA AREGSLINIADRVDVDLRAKNDFREAVEGADRRVCERRPGIYSPDHIEAMQDILHDEETL NDLAGAHIRLPAFVRRRYGDQILTPQETIRFSTLFGHIIDSCSPFTATHSTGVAHMAVAL GRLTGMGQDDLDTLFVAGMLHDIGKLGIPLALLEKPGQLTDEEFPKVKRHADLSRLWLDA VPGFERVSVWGGGHHERLDGKGYPLGLKGEEIPLPSRIMAIADVFTALTEDRPYRKGMTP PEALRIVKGMVKQGHLDGDLSRLLCDHADELDAVRASSQAEARRGFQALHDACAEPDERL RD >gi|316921605|gb|ADCP01000147.1| GENE 17 19792 - 21768 2127 658 aa, chain + ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 67 651 49 626 627 337 35.0 6e-92 MYNPRDPQTNAQERMPLPQIRIVSISPYPEMERIVQDVIAAHPQRGRIDNRIITATVDRL DLSELHDCDAIIARGYSARRLKAQGLGVPVIDIAISGYDVIRSIQECQRRFQTRRIGFVG 
FYNAFNGIEQFSGMFGCDIRVYIPECAAEIRHALELAKQDGCDVIVGGYSVQQDAGLLGL PAILLRTGRETYAQSLEEAIRTVTLVQQERIRTETYKIITESVKDGLLYVDASGIIRLDN PAACLMRDAPLKDVPLETAFPCMMDSFSRAMRDCAEIQGEVRKVRGVSITFDCTPVAVNG IAAGVVISFQNVDKVQQLEGHIRKKLSNKGLTAKHHLSDIVHCSRLIDETIEDACRYAAA SSNILIVGETGTGKELFAQGIHNASQRRNGPFVAINCAALPENLLESELFGYVEGAFTGT TRGGKMGLFELAHNGTLFLDEISEIPLNMQSKLLRVLQEHEVRRIGDDRVTAVDVRIISA TNVNLHKLVAAKRFRQDLLYRLDVLKITIPPLRRRGEDVIELFHFFLKKFCWKNREALPE VAPDSLHLLREYPFAGNARELRNVVERAAVLRRNRDVLTSADLNRALHPEDIEEVPPLLA ATPAPFPPEAPQASGPDNEKDRLLAALRACGGNRTRAARELGMDRTTLWRKLQKYIQK >gi|316921605|gb|ADCP01000147.1| GENE 18 22099 - 23424 1909 441 aa, chain + ## HITS:1 COG:FN0225 KEGG:ns NR:ns ## COG: FN0225 COG2610 # Protein_GI_number: 19703570 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Fusobacterium nucleatum # 6 428 5 440 452 179 31.0 9e-45 MSSAYVLVAFAVCILLMILFISKWKIHPVTSILVATIILAISIGTPWDKIEGIINKGMAD TLKSIAIVIVLGCILGKVLEETGAAVSITKATVKLFGEKRVLWAILLASAILGIPIWADT VVILLIPIVSNLALQTNKSMMSYGTALYMGALVTASLVPPTPGPVAAAALLHLPLSDAIM WGVLISIPSVFAAGLYCLTLKEPVAPQDEYIKSAEASKDKPLPSVGTSLAPIILPLVLIF LNTGINAMFPDTTVAKVFKFVGSPLAALLAGCLFSLVLTGAEWRSKKVMNDWVESALRSS AMPIMVTAMGGALAAFIKNAGVANAVAETVVQFSIPGIIIPILIAALIHVITGSNALGVM TAAALVEPMLGALGISPLAAFLACGTGALMFKHANSSGFWVTVTMSNMDIRQGIRAVGGG STVAGVTGAAITVILHFAGII >gi|316921605|gb|ADCP01000147.1| GENE 19 23557 - 24546 1537 329 aa, chain + ## HITS:1 COG:lin2813 KEGG:ns NR:ns ## COG: lin2813 COG1063 # Protein_GI_number: 16801874 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Listeria innocua # 23 315 19 330 350 125 29.0 8e-29 MDTFVTKAARLNGPRDIELIERELVCGEDDIIVKNHLMGICGSDKNFYRGYLPPKTAEFR QDPKFPFLLGHESGGTVVAVGSRVADYKVGDKVMAFGWNNNFAEYFAAKSFQLQPVPYGL DMDIASLGEPISCAMYSGLNSGVQLGDTVVVMGGGFAGQIIAQCARRKGAYRVIVVDVLE 
GKLALARRLGADITLNPGQDDVIEAVKDLTNGLGADVVVEAAGTAESFNTASEIIKHNGK FVFYSWVTTPVTLNISRWHDDGLEFINTCLVHHTWQQRYVWCPEALRPVAQGLVDVKPLI TDEFKLDDIKAGFDLADKDDAAIKIVFRP >gi|316921605|gb|ADCP01000147.1| GENE 20 24667 - 25311 831 214 aa, chain + ## HITS:1 COG:HI1012 KEGG:ns NR:ns ## COG: HI1012 COG0235 # Protein_GI_number: 16272947 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Haemophilus influenzae # 11 208 7 205 210 170 43.0 2e-42 MSTCPDCETGRNLLAGACRSVFQRGLTGGSAGNMSLRVEGGILATPTGIPFGELTPGNLS LLDETGALVSGPKATKEAPFHLAWYKANPDHLAVVHLHSPWAVAVSCLADLDPGSPLPPL TPYQLMKLGHLAVAPYAKPGSAELCAGVARLAPGRTALLMANHGSITGGASLTKALDAAE ELEATCRLYLTLRGLPVRCLTPEQAAELLPPKGA >gi|316921605|gb|ADCP01000147.1| GENE 21 25316 - 26045 762 243 aa, chain + ## HITS:1 COG:HI1011 KEGG:ns NR:ns ## COG: HI1011 COG3395 # Protein_GI_number: 16272946 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Haemophilus influenzae # 4 226 1 219 413 230 51.0 2e-60 MTAVLGCIADDFTGGTDLAETLSRNGMRTIQLLGVPDTALASEAAAQADALVVALKIRTV PVEEAVSQALAALEALRAAGCRQFFWKYCSTFDSTPKGNIGPVAEALLQALGGDRTVVCP SFPENKRTVYQGYLFVGDRLLHETSMRSHPLTPMTDANLIRLMDAQSSRKSGLVPLETVR EGVEAVKARMDELAAQGTPWLVCDALSDADLRAIGAACAGHALVHGRQRRGPRARRQLRH RRG Prediction of potential genes in microbial genomes Time: Fri May 13 04:51:42 2011 Seq name: gi|316921597|gb|ADCP01000148.1| Bilophila wadsworthia 3_1_6 cont1.148, whole genome shotgun sequence Length of sequence - 8119 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 416 471 ## COG3395 Uncharacterized protein conserved in bacteria + Term 452 - 479 -0.8 2 1 Op 2 . + CDS 498 - 1898 1912 ## COG4091 Predicted homoserine dehydrogenase + Term 1956 - 2008 12.5 3 2 Op 1 . + CDS 2693 - 6352 4237 ## DVUA0145 hypothetical protein 4 2 Op 2 . 
+ CDS 6426 - 6857 561 ## Dvul_2964 TIR chaperone family protein 5 3 Tu 1 . - CDS 7095 - 7235 138 ## 6 4 Op 1 . - CDS 7578 - 7838 403 ## Ddes_2350 hypothetical protein 7 4 Op 2 . - CDS 7835 - 8119 331 ## Desal_3542 protein of unknown function DUF6 transmembrane Predicted protein(s) >gi|316921597|gb|ADCP01000148.1| GENE 1 3 - 416 471 137 aa, chain + ## HITS:1 COG:SMb20670 KEGG:ns NR:ns ## COG: SMb20670 COG3395 # Protein_GI_number: 16265125 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 6 133 301 424 425 112 52.0 2e-25 TSPRTVAFARRHCPEGPCIVYSTASPDRVAATQARFGREAAGAMIERALADIAAALVETG IRRILVAGGETSGAVVSRLGVRALRIGETIAPGVPWTETLLDPANPGAPHLALALKSGNF GGEDFFMQALGLLANAN >gi|316921597|gb|ADCP01000148.1| GENE 2 498 - 1898 1912 466 aa, chain + ## HITS:1 COG:AGl3299 KEGG:ns NR:ns ## COG: AGl3299 COG4091 # Protein_GI_number: 15891768 # Func_class: E Amino acid transport and metabolism # Function: Predicted homoserine dehydrogenase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 462 52 481 489 157 29.0 4e-38 MFYESLFNPSIHRPVKACLTGAGEFGASFLFQAQGMPLLEVPAVCTRTVQRAVDAYENAG IPASKVRVCESAEDAKKAYADGFNVVVDSFEYIADLPLDVLVESSGAPEASAAAAELAIE RGMHVVMVSKETDSVVGSILARKAAERGLVYTTGDGDQPSLLIGLISWGRVLGMNIVGAG KSSEYDFIYDPARDMVLCPGQKQEIATPGMGSLWQFGDRPAEEVVGKRRAMLTSLSHRSV PDLCEMAVVANATGMMPDRPLFHAPVARIDEMPDLFCPKADGGLFDAPGRLDIFNCFRRP DEASMAGGVFIVVECKDKKSWQVLKAKGHPCSRNDRYAALYHPAHLLGVETGTSVLAAAL MKHATGGGNLRPLCDLCGRAVRDLPKGTVLNMGGHHHTIDGVEGVLNPAAAIGPDAALPF YLLSHRTLCRDVPAGKLITLGDLDMPEESTLLRLRREQDKAFGLVK >gi|316921597|gb|ADCP01000148.1| GENE 3 2693 - 6352 4237 1219 aa, chain + ## HITS:1 COG:no KEGG:DVUA0145 NR:ns ## KEGG: DVUA0145 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 1 1218 68 1306 1308 1394 64.0 0 MSQLHGINGAQQPQATGISPSVGKLGLHSVQLGTNPPVRLDAIKGNKIPFAGFRTATKVV NAKTGARENAALALRSLASPDGKLDAKALLNAAKSMQTHLNRLGTLGEIRGTMDDAVIAA 
FAPEVESLSNTELLNAYQQFLSPEMSLLKRALQAEMSANPRNEDVMAAAANLFSLEALVT KEASNRIIIAQGLAQPGQIPPLSAQYGAGIEGMGAARPHEAPADMSAVSMHVLMDVAIDS SARRERVGGLVADMASRRNLGNIDARQFGDVLRSAGLTINVDLGFLFGMNGPKPLLKAGG AWEHIFHSIEAAPDEASRQAAIDVKGAGYIQKRDNVERGLFPELSEDRPAVANERPTYAA LNLLRQRTGAAPTYGTVALHLKPEVARRATYTVDDTFVALRLRYTEAGRQAVLDLLPGSP GISEAHKLDLMTEGSELRRRLDAIFDGMAAKGEFRADLFKNEFQLFGLEDDENSALAGLF IKVFKDTQSTRKAMASFDSLETLLPELGDMDAVSLARAALDRQQHGMGRVASECNYIEAQ VHGPIVFARDVAEIVINKEFGLDQLPQAQKAWFNAVVAVLGGKQPAAADMDAFSAEQRAE LAAIREQLGGAVIPVRIEEQIPELDLKNTVRSEERAFYAAHLDQARIDAKLHDVQQDDAG LQAFISQMLSIRPGGAAVSRILGTVPLVAGGDAQNVREAFAAYVEQYRHVPLRGQHTEDD VLQNAMWQAVSDVMGKGRLDSLAAIEELTADPAQRATLRDFVMGHPPMSGQAFRALASAA LQGAGVLNGLAPAEDEPLDDEAMLTRFGGAAASFRRSFDAMPEEERDAAGEGRLLQAFGG LAFSLMRDASPEVSDRVAERLNGPAMRGLSGVLLRLGDAERGFPQDAGFRDALAFNAFQS GLRAALGGRAETPATFAGELSLIPQADRDRLRAALPGLADTLDASFPARPAFPPAQAGKL AATPAQHRDFLLSMLPIYHDHERPGAFDHGAAYHGRGHICRAFIFASTMAGLMEEMGHTV DRTALLCGIAGHDAGRERNGADTPEQEAESARLALEKMHERFGADTFGDDYEREFTAAIV GHASPTLESMLLNAADSLDIGRVAEFDFKYFPFLRGGEQEGPKALVPEYQNLRQALHEEA DLLARMTDPLTQTRDLRMKLIQAGEAEDMVHVQRAASEAVAGQLALDAEEDFLAFVEGKI RAHPDMFPLLTRYYLDPLA >gi|316921597|gb|ADCP01000148.1| GENE 4 6426 - 6857 561 143 aa, chain + ## HITS:1 COG:no KEGG:Dvul_2964 NR:ns ## KEGG: Dvul_2964 # Name: not_defined # Def: TIR chaperone family protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 126 1 126 153 184 66.0 1e-45 MELASLIEDLGKACGLELTLDEGRCTLRFDGEHDVTIERDGNAVILHGIAGDGGCLSNPD NLRILLTASFLGAETDGGALSVWERTGEIVLWKRFDAFTDYPDFEAVINAFLAQLIHWKA RIDALPSVAVSASATFTPGGIMV >gi|316921597|gb|ADCP01000148.1| GENE 5 7095 - 7235 138 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGKKVLIRMPLLPGLNDDPAGLEAAFAMLREEAAQGADIAGVELLP >gi|316921597|gb|ADCP01000148.1| GENE 6 7578 - 7838 403 86 aa, chain - ## HITS:1 COG:no KEGG:Ddes_2350 NR:ns ## KEGG: Ddes_2350 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 5 86 9 90 91 67 47.0 1e-10 
MKHPIRLTIAALLRDGKPRTARDVFEALRPQYPGERQLSPDAVDGHLQALKAVGIVRIAE ETLEAGTLTQRYALSEYGKGRVERFL >gi|316921597|gb|ADCP01000148.1| GENE 7 7835 - 8119 331 94 aa, chain - ## HITS:1 COG:no KEGG:Desal_3542 NR:ns ## KEGG: Desal_3542 # Name: not_defined # Def: protein of unknown function DUF6 transmembrane # Organism: D.salexigens # Pathway: not_defined # 12 93 253 334 334 102 63.0 3e-21 GPRRFDARRGCFALAGVVGCISYRCWYKAMNMAGVSRAMALNITYALWGVLFGALFTDVE ITRSLVIGAAAIFAGMFLVIGNPRDIINLRNVSS Prediction of potential genes in microbial genomes Time: Fri May 13 04:52:46 2011 Seq name: gi|316921551|gb|ADCP01000149.1| Bilophila wadsworthia 3_1_6 cont1.149, whole genome shotgun sequence Length of sequence - 59626 bp Number of predicted genes - 47, with homology - 42 Number of transcription units - 28, operones - 9 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 736 1085 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 2 2 Tu 1 . - CDS 927 - 1319 532 ## COG0509 Glycine cleavage system H protein (lipoate-binding) - Prom 1342 - 1401 2.1 3 3 Op 1 . - CDS 1511 - 2935 671 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 4 3 Op 2 . - CDS 2946 - 3383 725 ## Ddes_2346 hypothetical protein + Prom 3590 - 3649 2.5 5 4 Tu 1 . + CDS 3702 - 5111 1696 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains + Term 5200 - 5240 5.0 - Term 5186 - 5228 -0.9 6 5 Tu 1 . - CDS 5360 - 5917 548 ## - Term 6519 - 6550 -0.5 7 6 Tu 1 . - CDS 6656 - 7210 862 ## Ddes_2344 hypothetical protein - Term 7396 - 7430 4.0 8 7 Tu 1 . - CDS 7465 - 9351 2838 ## Ddes_2343 type I phosphodiesterase/nucleotide pyrophosphatase - Prom 9591 - 9650 2.4 + Prom 9524 - 9583 5.4 9 8 Tu 1 . + CDS 9770 - 11329 2166 ## COG1292 Choline-glycine betaine transporter + Term 11358 - 11403 15.8 10 9 Tu 1 . 
+ CDS 11690 - 12874 1745 ## COG0477 Permeases of the major facilitator superfamily + Term 12946 - 12988 10.4 - Term 12934 - 12976 12.0 11 10 Tu 1 . - CDS 13152 - 14768 2108 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold - Prom 14865 - 14924 2.5 12 11 Tu 1 . + CDS 15219 - 16574 1684 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases + Term 16761 - 16801 9.6 + Prom 16866 - 16925 3.4 13 12 Op 1 . + CDS 16946 - 17593 902 ## COG1802 Transcriptional regulators 14 12 Op 2 . + CDS 17590 - 18690 1526 ## COG3191 L-aminopeptidase/D-esterase + Term 18692 - 18744 6.1 + Prom 18741 - 18800 2.1 15 13 Tu 1 . + CDS 18868 - 19944 1709 ## COG4521 ABC-type taurine transport system, periplasmic component + Prom 19991 - 20050 1.9 16 14 Tu 1 . + CDS 20163 - 21074 1163 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 21562 - 21621 3.9 17 15 Op 1 19/0.000 + CDS 21656 - 22882 1739 ## COG1566 Multidrug resistance efflux pump 18 15 Op 2 11/0.000 + CDS 22879 - 24405 2094 ## COG0477 Permeases of the major facilitator superfamily 19 15 Op 3 1/0.000 + CDS 24414 - 25016 534 ## COG1309 Transcriptional regulator 20 15 Op 4 2/0.000 + CDS 26307 - 27098 793 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 21 15 Op 5 . + CDS 27107 - 28714 1606 ## COG2199 FOG: GGDEF domain + Term 28824 - 28860 -0.4 - Term 28666 - 28694 -0.6 22 16 Tu 1 . - CDS 28764 - 29093 354 ## COG0776 Bacterial nucleoid DNA-binding protein - Prom 29172 - 29231 2.5 23 17 Tu 1 . - CDS 29497 - 30468 654 ## DVUA0003 hypothetical protein 24 18 Tu 1 . - CDS 31241 - 32029 944 ## COG1192 ATPases involved in chromosome partitioning - Prom 32214 - 32273 8.7 - Term 34207 - 34249 4.2 25 19 Op 1 . - CDS 34273 - 35361 1051 ## COG0535 Predicted Fe-S oxidoreductases 26 19 Op 2 . - CDS 35371 - 36261 857 ## THEYE_A1412 hypothetical protein - Prom 36364 - 36423 3.4 - Term 36305 - 36329 -0.3 27 20 Tu 1 . 
- CDS 36579 - 38327 1781 ## THEYE_A1413 hypothetical protein 28 21 Op 1 2/0.000 + CDS 38853 - 39416 840 ## COG0450 Peroxiredoxin 29 21 Op 2 . + CDS 39416 - 41089 1622 ## COG0492 Thioredoxin reductase + Term 41220 - 41260 0.2 + Prom 41566 - 41625 6.4 30 22 Tu 1 . + CDS 41666 - 41920 83 ## + Term 42037 - 42073 0.5 - Term 41781 - 41822 10.2 31 23 Op 1 . - CDS 41849 - 44521 2909 ## xccb100_3061 hypothetical protein 32 23 Op 2 . - CDS 44537 - 45142 823 ## - Prom 45260 - 45319 1.5 33 24 Tu 1 . + CDS 45371 - 45574 109 ## 34 25 Op 1 8/0.000 - CDS 45738 - 46913 1488 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 35 25 Op 2 . - CDS 46903 - 47208 428 ## COG1396 Predicted transcriptional regulators 36 25 Op 3 . - CDS 47198 - 47407 144 ## 37 25 Op 4 . - CDS 47445 - 48419 1139 ## COG3713 Outer membrane protein V 38 25 Op 5 13/0.000 - CDS 48416 - 49804 1794 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains - Prom 49832 - 49891 1.6 39 25 Op 6 . - CDS 49960 - 51648 1785 ## COG0642 Signal transduction histidine kinase - Prom 51706 - 51765 1.7 40 26 Tu 1 . + CDS 51960 - 52700 893 ## COG4359 Uncharacterized conserved protein, possibly involved in methylthioadenosine recycling + Term 52715 - 52755 0.9 + Prom 52726 - 52785 2.1 41 27 Op 1 . + CDS 52823 - 54208 2053 ## COG0160 4-aminobutyrate aminotransferase and related aminotransferases + Term 54210 - 54259 8.1 42 27 Op 2 . + CDS 54261 - 55223 1027 ## COG1647 Esterase/lipase 43 27 Op 3 . + CDS 55220 - 55582 458 ## Smlt1419 putative transmembrane protein 44 27 Op 4 . + CDS 55579 - 55989 617 ## CtCNB1_0145 quaternary ammonium compound-resistance protein + Term 56073 - 56120 0.9 45 28 Op 1 . + CDS 56165 - 57358 1139 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 46 28 Op 2 . + CDS 57351 - 58439 1345 ## Daci_0136 hypothetical protein 47 28 Op 3 . 
+ CDS 58432 - 59385 868 ## Daci_0135 hypothetical protein + Term 59505 - 59541 -0.8 Predicted protein(s) >gi|316921551|gb|ADCP01000149.1| GENE 1 1 - 736 1085 245 aa, chain - ## HITS:1 COG:FN1236 KEGG:ns NR:ns ## COG: FN1236 COG0697 # Protein_GI_number: 19704571 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 11 228 3 222 322 96 29.0 5e-20 MTNLASLREQKELRYAKKGLALALMSGMIWSSDGLILGKGLAEEPFGSPALWLFAPLLAA GLHDFCAACLSLAINGAQGKGREVIRTLRSKAGRACIWGALLGAPLGMGGYLMALSMAGP AYVLPITSLYPAIAALLALVFLKEHVSLRAWGGLALCVIGAIAIGYTPPEGAGGGLFYLG IAFAFLAAFGWAAEGVCVTSGMDFIEPAVALNVYQIVSSLLYAGIIVPLALWHLSASAPG CDIPG >gi|316921551|gb|ADCP01000149.1| GENE 2 927 - 1319 532 130 aa, chain - ## HITS:1 COG:BH3484 KEGG:ns NR:ns ## COG: BH3484 COG0509 # Protein_GI_number: 15616046 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Bacillus halodurans # 8 128 3 125 128 117 51.0 6e-27 MKSNEELNLPESLRYTDEHVWVRMEGGEAVIGISDFAQDQLGEIAFVDLPAVGVRFDAGS DFGTVESLKSVNQLYMPIPGEVVAVNEELESTPTLVNVSPYGEGWMIRIRPDASAESLLD AAGYRALLGL >gi|316921551|gb|ADCP01000149.1| GENE 3 1511 - 2935 671 474 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 5 465 6 447 458 263 35 2e-69 MPIRLTVIGGGPGGYTAAFAAARAGMSVTLVESGNLGGTCLNNGCIPTKTLKASADALEL ALRLSQFGITGQGAPAVDPAAVLARKEKVCSTLRGGLEKACASLGVRLLKGRGRLVHAGL VEASTAEGPVSVVGDRVILATGSGALELPGLPVDHTHILTSDDALALDRVPASIAIVGGG VIGCELAFIYQAFGSKVTVVEGQNRLLPLPSVDADMSALLQREMKKRRIGCELGRTLKDV RVEGGMVRAMLGASPFIKEPTPAQMKETPIEAETVLVTVGRAPNTAGLGLAEAGVAVDGR GWIRADEHMRTSLPGVYAVGDALGPSRIMLAHVAAAEGLCAVRDCLGHDGRMDYSAVPSG IFTSPEIGCVGLAEAQASEAGRDVRTATFQMRELGKAQAMSELPGMFKIVSDGATGKVLG VHIAGAHATDLIAEAGLALRLGASVRDIAATVHAHPTLAEGLYEAALSLVEAGG >gi|316921551|gb|ADCP01000149.1| 
GENE 4 2946 - 3383 725 145 aa, chain - ## HITS:1 COG:no KEGG:Ddes_2346 NR:ns ## KEGG: Ddes_2346 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 145 1 145 145 208 72.0 6e-53 MSHELSVSIARRGAKVDLDMESAAFGVISIDGEQIPAEERPGTAKKLLASSALYCYCAAL DKALETRSAKYRRIEGKATVRTGTDDKGRGRVTGIDLDVTVFMDQEYEFIFDRVEKIMKQ GCLITGSLEAAFPVKYNLNLECDED >gi|316921551|gb|ADCP01000149.1| GENE 5 3702 - 5111 1696 469 aa, chain + ## HITS:1 COG:XF2545 KEGG:ns NR:ns ## COG: XF2545 COG2204 # Protein_GI_number: 15839134 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Xylella fastidiosa 9a5c # 169 467 168 496 497 244 42.0 3e-64 MELFASAPGLAVEESYVRLLNHLPGLAYRCRVGRTDPIVSERLEYTLEFVSKGSYDLLGI PAEDMVRNNNNTIERMMHPDDLQKTRRNMYDSIVAHQPYQVMYRVSLPSGRLKWIWDQGE GVFDADGELCYLEGLMMDISEQKFQELELKEENRQLKASVSNLYGLGNIVGKSEAMRRVY GLILKAAETDTNVIIYGETGSGKDLVAKAIHEYSARKGNYVPVNCGAIPEQLLESEFFGH TKGAFSGAHANKEGYIGAAHNGTLFLDEIGELPIHLQVKLLRAIENKMYTPVGSNTPKVS SFRLVAATNQDLSKMVLEKKMRSDFYYRVHVLSITLPPLRERHGDIPLLVEAYMERKGIS CALPLQVRLAMEHYDWPGNVRELHNFLDRYTAFGDVALESLGDAGKTELLFPALHHEGLT LETATLQLEERLIRQALERCRWRRGDAAEALGLNLRTLQRKMKRLGMGR >gi|316921551|gb|ADCP01000149.1| GENE 6 5360 - 5917 548 185 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNADPLTYAPFLLLLLLGALVEIRYWIRNRARMTPKARLKRGLFLAAVPVLCAALWLGWE RAAFWLEEHSAPPYAQIEVPVSDLRDNQDGLRTFTERLWETVRSGTHAEAQARFHDALFF LGLCELSSGSGGKGTVDEPLSTRDVKIAGVTALFRTALANGHFVLSDILDRYAEAKREHP ELLSK >gi|316921551|gb|ADCP01000149.1| GENE 7 6656 - 7210 862 184 aa, chain - ## HITS:1 COG:no KEGG:Ddes_2344 NR:ns ## KEGG: Ddes_2344 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 184 1 183 183 183 64.0 2e-45 MAEKKALLVLADALDLNGSGGALDKLKKKAAVLSHADAAGLKDLAVALGGVCAGASGIEA AFEADAALVIVEGADALAPALEAADRRTLVVVVSASGTAFYGLAVNPKAGIVGRAVNAQD IAVTIATIADLPVDEDCTGAIIYQVMKNPNLKLEEIKKLKEALVRMESVIQRDNREPWDK HDCA 
>gi|316921551|gb|ADCP01000149.1| GENE 8 7465 - 9351 2838 628 aa, chain - ## HITS:1 COG:no KEGG:Ddes_2343 NR:ns ## KEGG: Ddes_2343 # Name: not_defined # Def: type I phosphodiesterase/nucleotide pyrophosphatase # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 627 1 627 628 1197 88.0 0 MADKAKRAALIGYDCLIPKRLEAMLAQGGLEHFRAFMNEGSFIPEGYNLPTVTPPSWATI CTGAYPRTHGVEDYYYYHEGRSLDYKETTQAFGSDIVTAETIWDAWDKNGKKCIVVNYPM SWPSRMKNGVMIMGQGLSPAETRWPLHGNEHKEFLASESVISTEFYPMGVQGTFDDAKGW KNLPECDEPLEMVVNMAFKECVEPVEGQTWYCLAWESGDDGYDRIALCPEKDYSKAFFTI RLGEWSGPVQHDFTIKADGRTEKGVFRCKLMQLSDDAEEFKLYISGIAGRIGFIAPPEAA DQIDFSKHITANDIGLVAFLHGIIDTDTVCELVEFHSAWLWNTVESLLKANPDWDLFYMH SHPIDWFYHGWLSELDSKDPEIRARAEKMERHIYEVEDRLLGRLMDIMGDDTLMCVCSDH GATPMGPILNTAHALKEAGLCSYEPKKSENYWDIYEETEGFNYVLDVSKSLAVPQRYMFV YVNLKGKYPGGIVEPEDYEKVRGRIIDALLDYKHPETSERPVLLAVRKEDAHVFGMGGAQ AGDVVYVLKPEYMAEHGYGLPTGESGCGSLKNLLMFRGPNIRKGYRYTRPRWLADIVPTL CYMTGGPVPEDAEGAPIYQIMEDPNMVK >gi|316921551|gb|ADCP01000149.1| GENE 9 9770 - 11329 2166 519 aa, chain + ## HITS:1 COG:SA2408 KEGG:ns NR:ns ## COG: SA2408 COG1292 # Protein_GI_number: 15928201 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Staphylococcus aureus N315 # 16 507 8 507 540 320 34.0 4e-87 MNQATEKGNGIDLRPEWGIFIPSVLVIILISIPAVLYPKAAEEVVSAIYQPFAANFGTLY LWITVGLIILCVYFACSRYGDIKFGDPDEKPEFSLSSWIAMIFCSGVAGAVMFWSIIEPL WDIVQPPQYAAPMSTQAYDWALAYLLLHWGPNAWCTYFITALPIAYMFHIRRKPFLRISS AADMIIGKQKDGLLGRCVDVFFILGLLFCTAVTMCISLPTVEAALARVFGITPSFGLEIA ILFVSALLAGVTVYLGLKKGIKRLSDINVVIALAMVAYGALVGPTSSLFDIFTNAVGKML GNFWSMTFWTNPFSEGSFPRDWTIFYALFWAGYGPFMGLFIARISRGRTVREVIGWGMFG TVAGGFMIHGVFGSYTLWAQYHGIIDAVSILKTQGAPAAMMAVIDTLPFSKVVMLIYCAF STIFLATTVNSGCYVVAATATRRMPVDADPHRYHCTFWAVAQGLLALGLLAMGGLEVAKI FGNFSGALMALPSLLLTACWLKIIREDGKDLLHTYRAKD >gi|316921551|gb|ADCP01000149.1| GENE 10 11690 - 12874 1745 394 aa, chain + ## HITS:1 COG:ECs3121 KEGG:ns NR:ns ## COG: ECs3121 COG0477 # Protein_GI_number: 15832375 # Func_class: G Carbohydrate 
transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 8 380 2 377 396 134 24.0 4e-31 MRRRLCPKEALFTPSFIAVCSANLLFFIASFMMVPVLPIYLLDGLHASKSVVGIILSAYM IGALVMRPLSGFLADQFPRKILFLVCGILFAAQFEGYLLFNALALVGIVRALHGMSFGAL STSAATLAVDVIPIAKLGTGIGLYGMMGSLAMALGPMIGMLVLEAGSYDAVFITAMGCAA GGILLGLLVKKQRVTASRGEKISLDRFFLKKGTYAFIGLVLMAFVYGLLVNYLSVFARER HVMANPGYFFLLMSLGLILSRLFAGGMIDKGYIGRLILGGKGTILLASLLFLFVPTETVF FGSAVAFGLGFGMMSPSYQTLFINLAEPTRRGTANATYLIAWDVGIGAAVLLGGLIAEIS SFDDAFILGLIMLFFSAVLYMKVIGPQYEKNRLR >gi|316921551|gb|ADCP01000149.1| GENE 11 13152 - 14768 2108 538 aa, chain - ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 4 534 14 548 553 222 30.0 1e-57 MRCIEKTLYTNGRIVTMDAAGTVAEAVLVAGNVIEAVGSKDGLAALAGPGCRTVDLKGAS LYPGFIDTHSHASLYAMWKTHCRCPGIAHLEDVYPLIRKQAERVRDGDGIVVYNFDDTDI PERRGPTKRELDAVMPDRPVLVFHISGHTCYANSKALERIGVNPDAPFDDVNVFLGKDGL PNGYIAEEMAFKAMGELLPALDQERHKALVRECVAEYNAQGFTSAHDSAVGIGNLSAETV YRTYNQLYESGELNMRVYLATMEQPFRRLEPTGLLDGPGNRFVQYSAVKTLADGSIQAGT AAIPEGYFFDPSLRPGIIGTQDYWDEMVYHWHSRGRQLSIHCNGVGAIETIITAVERAQA RCPRKDARHLIIHCQMATDEHVRRMKEAGILPSFYGLHVWNWGDRHRDIFLGPDRAARLD PAGSAVREGLPFSLHADTPVLPQMTMLSIHTAVNRETKGGAVLGPDQRISTLEAVRAYTS YAALFSHSEAWRGTIEPGKVADFVIPSEDILEAPAGRLKGIAFAAAIVDNRVVHGQLP >gi|316921551|gb|ADCP01000149.1| GENE 12 15219 - 16574 1684 451 aa, chain + ## HITS:1 COG:BH0746 KEGG:ns NR:ns ## COG: BH0746 COG0402 # Protein_GI_number: 15613309 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Bacillus halodurans # 4 441 5 439 445 370 43.0 1e-102 
MRTLLANATVITMNPARDVLDTDILIENGVIADMGPGLAGRPENAGAERVDLSGRIVIPG LIQSHMHVTQSLFRGLADEMELMDWLQRRIWPLEGAHTPETNAAAARLAAAEGLRSGVTA FIDMGTAHCQDAIFETMRDVGMRGLFGKCMLDLGGTDVPAALMEDTETCLRESERLMNRW HMSAGGRLRYAFAPRFVPSCTETLLTRTRDMARANGVRLHTHASENKGECAYVESLVHMR NLRYLHSIGYTGEDVILAHCIWIDDDEIRILADTGTHAVHCPSSNMKLASGIARIDEMLA AGCRVALGLDGAHNHMDALVELRQAGILQKVRTNRPSALSPLQALEMATLGGARALGQED ELGSLEPGKKADLAVINPDRLNMAPRIGRDPVAQVVYQATHENVEATMVDGVFLYRDGKY ATLDLGECLRDAESACARILRSPDVAPLFAH >gi|316921551|gb|ADCP01000149.1| GENE 13 16946 - 17593 902 215 aa, chain + ## HITS:1 COG:SMb20323 KEGG:ns NR:ns ## COG: SMb20323 COG1802 # Protein_GI_number: 16264057 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 11 200 10 200 219 73 27.0 2e-13 MITLNTNPGLLRTAVYERILSDLQQGAIAPGETISLNKLAETLNISKTPLRDALLELQAE GFVTLYPQRGVMVNALSEEEKREIYEVCGVLDGQVIRNVFPRVTPEHVARLKDLNARMYP GTQDISNDEYNRINLAFHETYLELETSSLLKKLLRINRLRLFQFSLRDWGKAFCAVNHAE HGRIIELFESGTAEELSRYVTQVHWSFNWQQGTES >gi|316921551|gb|ADCP01000149.1| GENE 14 17590 - 18690 1526 366 aa, chain + ## HITS:1 COG:PA1486 KEGG:ns NR:ns ## COG: PA1486 COG3191 # Protein_GI_number: 15596683 # Func_class: E Amino acid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: L-aminopeptidase/D-esterase # Organism: Pseudomonas aeruginosa # 15 361 16 359 366 270 49.0 3e-72 MRIRDIARIDGFPCGEYNAVTDVKGVRVGHSTVIKDGAGPVEGIARTGVTVVLPAGDIVE NAMYAGVFSLNGNGELSGCHWIEESGLLTTPIALTNTHSLGIVRDAMIDYYARLRKTDGS SYWMLPVVGETWDGWLSDINGMHVRPEHLCAALDSAASGPVAEGCVGGGTGMILHEFKGG IGTSSRVLPEELGGYTVGVLIQGNYGKREDLLINGVPVGRHIPTSRIPGKEQALQPTPKE DGSIIIVVATDAPLTPDQCKRLARRAGLGLGRTGGLAFDGSGDIFIAFSTANRLGSNAGG GPREHTVRTLSHGCMDPLFRATVEATHEAIVNALCAATDTDGILGRRMYALPLDEVRGIM RRYESL >gi|316921551|gb|ADCP01000149.1| GENE 15 18868 - 19944 1709 358 aa, chain + ## HITS:1 COG:PA3938 KEGG:ns NR:ns ## COG: PA3938 COG4521 # Protein_GI_number: 15599133 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type taurine 
transport system, periplasmic component # Organism: Pseudomonas aeruginosa # 9 251 19 234 337 73 24.0 7e-13 MWKRFLGIMAAALCVLNLTVGSAFSADLVKVPTAWLDEHETFLMWYAKEKGWDKEAGLDI EMKLFNSGPDILNALPAGEWRFAAVGALPAMLGNLRYGTSIIAQANNEAALCTSVVVRAD SPIAKVKGWNKDYPEVYGSPETVRGKTFLATTLTSSHYALSTWLNVLGLKDSDVVIKNMD QSQVVGAFENNIGDGIAIWAPHTFIVQEKGGVVFAGDIVHCKKSNPIVLIADTKYAEAHP DVAAKFLSVYMRAVDLMKNTPPKQFVEEYQKFYLEFVGKEYNYNQALLDLETHPLSNIDE QLAIFDDSKGPSQAQLHQSDIAAFFSGVGRITADEAAKVADGKYATDKYLKLLKEQQK >gi|316921551|gb|ADCP01000149.1| GENE 16 20163 - 21074 1163 303 aa, chain + ## HITS:1 COG:BS_ywfM KEGG:ns NR:ns ## COG: BS_ywfM COG0697 # Protein_GI_number: 16080815 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 10 295 8 293 296 163 40.0 4e-40 MSTPFARGPLLILLGALCFSTTGTVQALSPEGATPYVIGALRMLVGGLALLLWCAFRGEL PRFWRWPMRCVVPSTLALLGFQFFFFRGVLEAGVAVGTVVAIGFSPIVVALLGLIFLREK PAKAWYPATGLALIGLILLNADAVGGASFAGMAFPLLAGFSYACYFVFSKPLAQNAAPGS VMMVLCLLSGLCMLPVFWIYPAAWLATPVGALVALHLGVVTTAIAFSLTLAGLKVTPAAT ASTLSLAEPLSAACLGIFFLHEPLSLSAAVGIALIFGSVLILVACPSGQPPRESAARLRP QGR >gi|316921551|gb|ADCP01000149.1| GENE 17 21656 - 22882 1739 408 aa, chain + ## HITS:1 COG:YPO3267 KEGG:ns NR:ns ## COG: YPO3267 COG1566 # Protein_GI_number: 16123424 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Yersinia pestis # 15 397 8 389 390 365 49.0 1e-100 MTTDENLSMTPDNDKAPQRRKRPRKGPIVGFLLLLFLLAGIGAGAWWYLFLRFEESTDDA YVAGNVLRIMPQVSGKVVAVDVDDNDMVAAGQTLVRLDPVDARLAYERSLVALASAVRDN CRLQAQLRETEAAIRMRRVDLRQKADNLTRREALGKAKAIGAEELKHARDDRENAAVALA AAEEERNALAAVLLDTGLAEQPAVRQAAANVRDCWLALQRTTVLSPAAGQIAKRGVQAGE VVSPGTPLMTVIPLHRLWIDANFKEVQLHKMRIGQPAVIRVDMYGGDVTYHGRVTGFSAG TGSAFSLLPPQNATGNWIKIVQRVPVRIEIDPADLREHPLLVGLSAVVKVDTSDTSGRLL AASARTEPVPGDLVSQAPAVDFAPVEATISGIIRDNALPGVCPVQGKP >gi|316921551|gb|ADCP01000149.1| GENE 18 22879 - 24405 2094 
508 aa, chain + ## HITS:1 COG:YPO3268 KEGG:ns NR:ns ## COG: YPO3268 COG0477 # Protein_GI_number: 16123425 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 10 502 5 499 511 561 57.0 1e-159 MSGRAQPLPPLQGGMLVIATIGISVVTAMTVLDSTIANVALPTIAGNLGVASSQGTWVIT FFGVANAIAIPLTGWLARRMGEVRLYFWSTFLFVVASFLCGISSSLGMLIICRILQGAAA GPLIPLSQSLLLACYPREKRGIALSLWSMTIVLAPILGPILGGYICDNYTWNWIFFVNVP VGAVCLVALRIPMAGRETATAKNPIDTVGLILLIIGVGCLQLMLDEGKDKDWFASGYIVI LGVLTVVGLVLLVAWELTEEHPVVNLSLFRHRNFTVGTICISLGFLFYFAGVVMMPMMLQ TRLGYTSMWAGLSLAPIGIFPVFLSPLIGRFAQRIDMRVIVTMSFLVFSAAFYLRTLFSP DVDFAFVLWPQVVQGIGVAMFFMPLTSIISGLDAKDIANASSLSNCTRVLAGSIGSSLAT TMWERQEALHHVRLTESINPFNPAAQHGLNQLTQMGLSPEQAKVWVANEITRQGFLLGFN ELFWLASIAFIGLAGLVWLSTPINKKIR >gi|316921551|gb|ADCP01000149.1| GENE 19 24414 - 25016 534 200 aa, chain + ## HITS:1 COG:AGl2069 KEGG:ns NR:ns ## COG: AGl2069 COG1309 # Protein_GI_number: 15891152 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 197 94 289 294 87 31.0 2e-17 MAQQTKCRKKTEARREAILDAALEEFAAKGYAGARMEDIARRAGVAKGTLYLHFGDKEGL FNGLAESAFAPMHALAKEILDDKTPTLRERLLRFCGPMLDEQGCSRTARIIRLVWAEGLH NPKLVAPFYHGLVAPILELHRENLRLAAHEHPHPALQRFPQLLVSPIMQGLLWQGMFKDK APLDVKELYIAYLDIILPEK >gi|316921551|gb|ADCP01000149.1| GENE 20 26307 - 27098 793 263 aa, chain + ## HITS:1 COG:mll5202 KEGG:ns NR:ns ## COG: mll5202 COG0834 # Protein_GI_number: 13474337 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Mesorhizobium loti # 38 263 27 258 265 127 31.0 2e-29 MRITIPSFSSLRPVLLAGFLLALLFAGFPPRSGAEEILRIGVEDDYAPFSFPAEGTQAGF DPEIAEALCKAMRRSCTVEPLAFGTLLEKMRRGELDMIVAGLAKNEQRLAYMDFTNSYYH 
SRSIYLGLPGSVAVSAEGLKGKRVGVQDHTQQEIFLRSHWVNVAEIIRFPTYGQLIDAFC AGQLDAILVDGLSGYEFLQSERGQPFAILDDPLPPDEDLAYAHIGVRKNNSELIEAINEA IVHIKLNGEYDRIVRKYFPFSIY >gi|316921551|gb|ADCP01000149.1| GENE 21 27107 - 28714 1606 535 aa, chain + ## HITS:1 COG:RSp0400_2 KEGG:ns NR:ns ## COG: RSp0400_2 COG2199 # Protein_GI_number: 17548621 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Ralstonia solanacearum # 369 529 13 179 206 126 44.0 1e-28 MYCKPSRNAEPKRKRQLGINLFPVLVSVPLLALLAFVSFIFTNDLQRIKNLALSTSNSHL PQILAKQQMLLNVENMRRNIALVYSASDPTISRNAGITAQALIAESVFEETSVDFSSKMD EIRPQLLRLIELRQQSYLAEDRLHHAEIFFSNVLGRLLMRAGMEYSYPLRHSARHLDKDF SRNTANKQGLQESLRPLMDMCAKAEQAAPPERTLLKDCGEFRTNWDGIVSNWTQLARGHV EAKQLWNTLDGKLRELSDAVSTHEAEWIYRSMSHINEEVSRIKELFPIMGTALLAGLLAT LLLIHLQILRPLSLLSQALRTIRSGHSISELPRVRIRELQDVLDVLPDISRHMDELNTRS GQLAQERDQYADLSFRDALTGVYNRRALEETMERIPARTPLALLMLDLDFFKNYNDRFGH QAGDAALKAVAQATQKALLRDSDRIFRYGGEEFTIILAGAGQYSAMTVAQRILEEVAALD LVHPDSRTGKLTVSIGIAHRSFGEDADICEVLTHADQALYEAKTSGRNRIVLFEG >gi|316921551|gb|ADCP01000149.1| GENE 22 28764 - 29093 354 109 aa, chain - ## HITS:1 COG:RSc2521 KEGG:ns NR:ns ## COG: RSc2521 COG0776 # Protein_GI_number: 17547240 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Ralstonia solanacearum # 5 93 2 90 94 58 35.0 2e-09 MASKSTKKDLVQHIAKALEVSPVEAERVMLVVMRYVRRQLLEDTSVILPGIGTLAVTETA ARLGRNPHTGQPVPIPPRKTIKLKVSKDLKEAMEPSETVRTGGADFDAQ >gi|316921551|gb|ADCP01000149.1| GENE 23 29497 - 30468 654 323 aa, chain - ## HITS:1 COG:no KEGG:DVUA0003 NR:ns ## KEGG: DVUA0003 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 145 304 958 1116 1131 114 39.0 6e-24 MTLRSPTSDALNGRSRRTAKVDVQAGARCKPKRVGEGQISLLRPLAPVHGANPTKLFKWL YGIGKKETIDLTLPIIEETIGIDARTCRRIIATWAETGVCEKTVHRRGMRIRLLMDEKEA LGAPRAASPSVSASPADELERALPELCPRLMEVGFTAKHLRRIKELLAVQNIDPGSLTDA LRYAEYELEHGQMVDSKGQPVSRPVDWVFRCLSKDGTYRIPTGYVSPAELRRREREEAVR 
REREALEAERRLLEEEAALALEKEVEAMFRQMLDEPDGPFAAQILEKVPAVLRSRGGLEN PLVQKACRMQISYMLQDAVGAGQ >gi|316921551|gb|ADCP01000149.1| GENE 24 31241 - 32029 944 262 aa, chain - ## HITS:1 COG:BB0431 KEGG:ns NR:ns ## COG: BB0431 COG1192 # Protein_GI_number: 15594776 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Borrelia burgdorferi # 3 257 2 245 250 117 29.0 2e-26 MGKIISVVNNKGGVGKTTITVNLAHALAMQGKKVLIVDVDSQCNSTSFFNLAPGCASLYE LLASVYDEEAPEIKPESCIYPTEIGCSILPNVEEMAFIEAEFYKNESYIVALRERLREYA TKEFDITLIDCPPSMGAFVYMAMIASDFIIVPIRAGSRFSLDGITKTINAINQIRRTKLN EGLVLLKFLYNMADLRRLADKHSLTILNNRYPGQVLTEYISEATMLRSSEMLSETVFQSS PRSKVAAKFRSVAREILAALGM >gi|316921551|gb|ADCP01000149.1| GENE 25 34273 - 35361 1051 362 aa, chain - ## HITS:1 COG:AF1615 KEGG:ns NR:ns ## COG: AF1615 COG0535 # Protein_GI_number: 11499207 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Archaeoglobus fulgidus # 36 358 4 319 321 115 29.0 1e-25 MAKDGGAADPLRNPRFQEPSVWQLFRESLFGKPRLLDCIQVEVTSVCPGRCVYCPHTTQA GYWRSRHMEAATFARLWPLMQESGRVHLQGWGEPFLHPRFMDFAALARKAGCRVSTTTCG LRMDETLAGQIVGSGIDIVAFSLVGTDEASNAPRAGVPFSRVREAVRTLQRVRKAKMGVH LELHFAYLMLASQMEAVEGLPDLMDELDVHAAVISTLDYIAAPGLEAEAFSPHEAGKLDA ARDILERVAEETAKRGRGLYYSLPSPRPDDGCGLFPCRENAARTLYVDAEGRMSPCIYLN LPVDAPASSPEAVNRRVFGSAVEEDPLGIWQSGPFRMFRDALQSPLPDIPCRTCPKRFET GN >gi|316921551|gb|ADCP01000149.1| GENE 26 35371 - 36261 857 296 aa, chain - ## HITS:1 COG:no KEGG:THEYE_A1412 NR:ns ## KEGG: THEYE_A1412 # Name: not_defined # Def: hypothetical protein # Organism: T.yellowstonii # Pathway: not_defined # 13 294 14 280 282 112 29.0 1e-23 MPPSRWAAFAAEAAVPEQLVPYVRAVSDLEPLECGGFLALRSGPALVLVGHDAGLKQGGA EQERSRPEAARLDAAVGEASSLRGIESLTVLAPFRPDAAPEWAVSTRPDAYWGIPLPTDA ADMAYGQKLRNQLRRARRDILIAREGWKPDHAELVEQYIRSRPFAPGTRHLFRHIGPYVE AVPDALLFAARDCEGALQGFAVGDYTALGTVFYLFAFRAPESPPGTADALLDALAAEGIR RGHTLLNLGLGINPGIVFFKRKWNATILRPHVETSWAVQRPQEAGLLGSLKKLFGM 
>gi|316921551|gb|ADCP01000149.1| GENE 27 36579 - 38327 1781 582 aa, chain - ## HITS:1 COG:no KEGG:THEYE_A1413 NR:ns ## KEGG: THEYE_A1413 # Name: not_defined # Def: hypothetical protein # Organism: T.yellowstonii # Pathway: not_defined # 297 561 2 271 273 164 31.0 7e-39 MPDLASVLEQARVPEHSAPFMQAMSQGTAFLEGEYLFLTADDWLMAIGYPLSGEYAHDSF ERALAAALRRVPKRLEGVDCWAIGPDLPARLAGHVTDRDTFYILPCGAPVPARLRRPVEI AAEKLTVREGRDFTPAHRRLWAEFMNRAVLPPNVRGLFARTGAVLDTPGLRLLDAVDAEG NLAASLLLDDAPRRFCSYLIGAHSRTHYTPHAADLLFAHMLRSAREADKEYVHLGLGVND GIRRFKTKWGGRPALPYVMASWRETPRETGSETISSLVRILLESPSDLSKRQIFDSLPQQ RPFAMLWELTKNGRTSWIGGTAHFFCYSFEAAFRNLFERVDTVLFEGPLDADSLDEVARI GKTPDGRHPPLIDLLEEPEIRLLERVVQGPTGKLARFFNAANPNPADVRGLLATTRHWYA FFSLWTAYLERQGWNQSVDLCAWHTALEMGKSVIGMESLEEQVASLESVPLGRVTAFFRG CRSWKGYARRNIHSYLDGDLEGMLGTSTEFPSRTEQIIDGRNQRFRERMRPFLEEGRAAV FVGSAHMLQLRDMLAEDGFTVRQAYPTWRHRLRAAIRGRNGG >gi|316921551|gb|ADCP01000149.1| GENE 28 38853 - 39416 840 187 aa, chain + ## HITS:1 COG:FN1983 KEGG:ns NR:ns ## COG: FN1983 COG0450 # Protein_GI_number: 19705279 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Fusobacterium nucleatum # 1 187 1 188 188 256 65.0 2e-68 MSLINKEVSEFSVQAFQDNAFKTVTKADILGKWSVFFFYPADFTFVCPTELEDLAEKYED FKSIGCEIYSVSCDSHFVHKAWHDASKTIQKIKYPMLADPTGALARDFDVMIESEGMAER GSFIVNPEGRIVAYEVIAGNVGRNADELFRRVQASQFVASHDDQVCPAKWRPGAETLKPS LDLVGLL >gi|316921551|gb|ADCP01000149.1| GENE 29 39416 - 41089 1622 557 aa, chain + ## HITS:1 COG:FN1984_1 KEGG:ns NR:ns ## COG: FN1984_1 COG0492 # Protein_GI_number: 19705280 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin reductase # Organism: Fusobacterium nucleatum # 9 321 4 316 332 296 48.0 9e-80 MDAGRAADLYDAVVVGGGPAGLAAALYLARARYRVLVVEKDTFGGQITITAEVVNYPGVE KTEGHSLTETMRRQALHFGAEFLLAEAQGIDVDGDFRIVRTSRGAFRCFAVLLATGAHPR KVGFEGEETFRGRGVAYCATCDGGFFTDRDVFVVGGGFAAAEEAMFLTRYARSVTMLVRR STLSCAESIAEQVLAHESVRVRFNTVLEAVEGDTALRRAVFRDTVTGRLETYAPPEGETF 
GVFVFAGYEPASRLAEGLAELTGQGNIVTDREQRTRTEGVYAAGDVCDKRLRQVVTAVSD GAVAATSIERYAADMQRKTGLRPQRPATAPASSGASKASAPSGNARETEGGFLTAEQREQ LAGVFARMERPLILKAEPDSRPVSEDLRRMLMELAALTDKLTVEWTPPSDGPERPCVRVL RADGTDTGIAFHGVPGGHEFNSFVVGLYNAAGPGQSIDPALAEAIAAIDRPLDLQIVVAL SCTMCPELVIAAQKIAASNPLVTAKVYDVNHFPELRERYKIMSVPCLIINKAKVAFGKKT LGQLLELLAEPEAGQTG >gi|316921551|gb|ADCP01000149.1| GENE 30 41666 - 41920 83 84 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIPCTAGPAISFKCRRKRIGLPPVSGISPVPFHAWRMRHRKKRRRIPKEMRRPEAFRGTL GSEDGAELRQKVPGEETVLDALPE >gi|316921551|gb|ADCP01000149.1| GENE 31 41849 - 44521 2909 890 aa, chain - ## HITS:1 COG:no KEGG:xccb100_3061 NR:ns ## KEGG: xccb100_3061 # Name: not_defined # Def: hypothetical protein # Organism: X.campestris_B100 # Pathway: not_defined # 538 856 89 354 370 73 24.0 5e-11 MSSTIFTATSGTSISVDPHADVEGAGQQNPAVPHDAQNGAPAPGNPPNEDLVQAEARLRQ MLDTALDRSGGLRGIGGDLYDAIKTEIPRPKVFHGGFRIHSAMLKAANACDVAAANLRRI PIADFHANPLPDAAFAALEAFVGAQNELYTQIGAFQKASGVSAALLDSLAQATQFRASEA LNFAATMQLAAMPAERRQQIAPGVADPQAPTAGKAMQDMAHFMHGSGDVAGAFRADAARL FGSIDALEGRKGQMPLAEFRDTATRLRADLEALRGRIDGAIHPNADPGGGPVLIPDRTLF APAEAMLARSAERLDALASADPRAGVKSAVRAILPPVDQSLFPSASALPGGLGTPLATGL ARYNAKVSEIMTRLDAGALSPERLAHEMEDAVRILRAPETRRACGTLLTMMAIAEHPTIG AKEIGGYYQHFCGERIPRGMADLLRRSVGQNGFVLVPSEMKRLAAQIENTPLLTRDGILV SEFQEAAGMLRSTGANPQARGEYIQAALDHHVDMNTILEASLRGIMADQLELRAGDAILS DTRKLGQGAANTVHLCTYRGRDGEDMKLVFKPEVGARRGLDHLCASGLGYRNGARVMQLN VAASRVADAIGCGGTIARSSIGSHDGQLGLFMEAAPGKTFFDIARGKPVCRMPDGKELNF PETCRVLRSNGKLDAMRANLMRELSKMEWADVLSGQVDRHGDNYLIDINPQTGAVKITGI DNDASFGTRKAGMTVVDLSRPTPRQQDFLSKLRREGYTIPPDGRIDLSKLPDRLLSETRQ QFGFNQLFRPVFIDRDTFDKLTAIREDDYRAMLAPCMDNEAVDAAVSRLKDAQRHAARLE REHRVVEDWAAPGIKETYARHKVGAKGSFGERIQNGFFTRDFLTKFSAIL >gi|316921551|gb|ADCP01000149.1| GENE 32 44537 - 45142 823 201 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDATIQGLRLPEGFSGMLEDNGLPLPPLPLEAINELEQTDEWFFATDGREPLSGLAGLYL EDLDAVFDSPEAPQGLREGPVRYLGCGLLGGGLQSWQFRYLLDAGRLRLGLSLPWGGAWN DPETERLSLEAACSFIEACQACLPQSGVLNVVLDAERCAWRLDTPEGDVQQQGEGLYALL 
DLMQGLDSGLVGMSNVNWMRV >gi|316921551|gb|ADCP01000149.1| GENE 33 45371 - 45574 109 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRGNIRTGGYALPGGDGLRRAIRGGAAHGLEIASIDRGGANVFPVCPGYAPPNAAGLAAS STGAPAC >gi|316921551|gb|ADCP01000149.1| GENE 34 45738 - 46913 1488 391 aa, chain - ## HITS:1 COG:CC2770 KEGG:ns NR:ns ## COG: CC2770 COG3550 # Protein_GI_number: 16127002 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Caulobacter vibrioides # 19 391 19 416 435 163 31.0 5e-40 MRLEVYLNARGERRFVGLLEERGGDILFEYDKRFLDSGIELSPFLLPLRPGVASDPKRTF DGLFGVFNDSLPDGWGLLLLDRLLRRSGLPLERITPLSRLSLVGSGGMGALEYEPVTPDP DPLPERIDLDGVAEEAWRTLNEDPLPVETLRTLMALNGSSCGARPKIMVRVSGDRRLIVP DAVAEPGCDPWLIKFPARHDDPAIGRMEYACSLAAKRAGIEMPDTALFPGRDGAAYFGVR RFDREDGGKVHVHTACGLLHASHRYASLDYENLFRLGKSLTRNPEDVEKLVRLMAFNVRL GNQDDHSKNFSFLLDAKGRWRLAPAYDLTPSRGVGGEQTCMVNGKGRDITDKDMIAAASV VDVPARTVKAILEQVGEAVAALPAILAEVSG >gi|316921551|gb|ADCP01000149.1| GENE 35 46903 - 47208 428 101 aa, chain - ## HITS:1 COG:FN1997 KEGG:ns NR:ns ## COG: FN1997 COG1396 # Protein_GI_number: 19705293 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 6 86 20 100 106 61 35.0 5e-10 MLNDYLKSPRDVRLELAERAKAERLRQNLTQEALAHAAGISLSTLRRFEKTGDIALASFV EIAFALRRMDGFDALFPEPPVHSLFEAAPRERKRARRPDAS >gi|316921551|gb|ADCP01000149.1| GENE 36 47198 - 47407 144 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLLLFLTFGLFFRGLGKLVQPDALPGPRMLLFLAVSAPFAALRNPRKALTTIHKRTILSH KRSLPLYVE >gi|316921551|gb|ADCP01000149.1| GENE 37 47445 - 48419 1139 324 aa, chain - ## HITS:1 COG:yeaF KEGG:ns NR:ns ## COG: yeaF COG3713 # Protein_GI_number: 16129736 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein V # Organism: Escherichia coli K12 # 99 324 23 248 248 152 35.0 1e-36 MTPFPNGTGPANTDTGPPAPHAAAGTGGTSAIGGAHGRPCLSGAAFRRFTLPALLGVLLL AAPLHAAYGTESGTPPSGDMEWKGEASTPSRRCPTLPFDGELTLGAGGSVSSSPYKGYGP 
DWLPFPMITYEGERFFIRGYTAGVKIINLPYLELSAFAGYDSTSFEASESSNRRLRRLED RSPSAQAGMELRLLSPYGMFHVSGAGDVLSHSNGFNGDIGFIQSIEFGPLELLPAVGAYW SDARYNSYYYGVTRKEARKSGLGAYAPGSGFAPYVSFAIDYSLTEQWELFCQGEVTFLSG AVKDSPMVGETHTQSLTLGLTYTF >gi|316921551|gb|ADCP01000149.1| GENE 38 48416 - 49804 1794 462 aa, chain - ## HITS:1 COG:STM4174 KEGG:ns NR:ns ## COG: STM4174 COG2204 # Protein_GI_number: 16767428 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 4 450 8 441 441 370 46.0 1e-102 MYTILIVDDEANLLEVLAVALENMGYGTVTAETAEEALAVLEEREVHLVLSDLRLPGLSG RELMEKVKAANPDLPVVIMTAYASLKEAVEIIKEGAFDYTVKPFELDALEATIASALRYY ALGRDNRDLREQLQTASAGQFVGQSPAYLALLRNVREVSASSANVLITGESGTGKEVVAK AIHYGGPRAKAPLVTINCAAIPEQLLESELFGHARGAFTGATSNRKGRFAQADGGTLFLD EIGDMPLALQAKILRCIQEKAIEPVGSNTTQKVDVRIIAATNQNLVEAIEKGKFRRDLYY RLNVYPIIMPSLRERKEDIPLLAEHFALRAAGEMGKVPVTFTEAALEILKNHTWPGNIRE LENCIERLTIIASGTAITPERLASCAVPGGGDASRPSAPGSAFPLDLDRQLAETERDLIL RALEETGGVQVKAAELLNISERSMWHRIKKLGINIINKKEMI >gi|316921551|gb|ADCP01000149.1| GENE 39 49960 - 51648 1785 562 aa, chain - ## HITS:1 COG:BH1920 KEGG:ns NR:ns ## COG: BH1920 COG0642 # Protein_GI_number: 15614483 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 344 560 317 536 548 139 33.0 1e-32 MKISLVAKLVLAFVLVAVIPMLVASKITTELIADVVNRNIERWLGEATTYMRHSVEETHE RLKAVSRLLDTRFDGKTTLNKREAAALSFMDIDALWLRDAEGGLLYATLPEGRIAEPLYP GALFSWVLMADGTRRVAVTSERAFTADDGKVRTLQLASWFSIDYSDSSASGPVILRIFLP EGDTFRQVYSSASDKRFHIPRSALEEIGKGASTVFIPEPDWTDDTPGAHSLIAVARGENG EVQALFVTSALLLPFRGWLASPVLFWGFFIFGTLLSAGIGYVLARRLISPLRQLNEGARD IAAGKLDCQLPVRGNDEIAELTAGFNVMARQLEIMRHESIKSARQERSRMLGEIALGFAH EIRNPLVVIKTSAEVVLASLAVKSREVRLLGFVVEEVARINNLISEFLAFAKPSPLKLEY FPLSPLIQEIAELSSAEMDKRGIRYSFSNEAEDDRVLGERSQIRQVLLNLGLNAMDAMPQ GGTLSVRLYAEGGQIRIELKDTGCGIPPNLLSTIHMPFISTKKNGLGLGLSKAYAIIEEH GGSISCSSTVGEGTVFTVCLNR 
>gi|316921551|gb|ADCP01000149.1| GENE 40 51960 - 52700 893 246 aa, chain + ## HITS:1 COG:BS_ykrX KEGG:ns NR:ns ## COG: BS_ykrX COG4359 # Protein_GI_number: 16078424 # Func_class: E Amino acid transport and metabolism # Function: Uncharacterized conserved protein, possibly involved in methylthioadenosine recycling # Organism: Bacillus subtilis # 26 225 8 210 235 79 26.0 6e-15 MHTDHPFTEKQDVFQLPDFAAGPYSVICDFDGTVTPFDVTDAILERFARPAWKTIEDEWV RGAISARQCMERQIPLIEAPLERLDAFLDTVPVTGGFVEFVRYGRSKGIPLGIVSDGMDY PIKRILNRHGLRHVPVVANRMVYREGAYRLEFPYGREGCASGVCKCGVAEAVSGDGKTLL IGDGLSDCCLARSASFTLARQGKALHRRCAAEGYPYFTYLDFFDILEAFGAPLPKPIAAP AAAPAA >gi|316921551|gb|ADCP01000149.1| GENE 41 52823 - 54208 2053 461 aa, chain + ## HITS:1 COG:CAC0368 KEGG:ns NR:ns ## COG: CAC0368 COG0160 # Protein_GI_number: 15893659 # Func_class: E Amino acid transport and metabolism # Function: 4-aminobutyrate aminotransferase and related aminotransferases # Organism: Clostridium acetobutylicum # 39 453 31 420 428 223 31.0 5e-58 MKTTKADEKALLEAESRYCSYGDTVHYLEPAKLFAGCNGSFLYDYEDKPYLDLQMWYSAV NFGYANPRLNDALKRQIDTLPQVASQYLHKEKIELAAMLCESIRDAWGEEGRVHFNVGGS QAIEDSLKLVRNACGGKSLMIAFEGGYHGRTLGASSITSSYRYRRRYGHFGERALFVPFP YCFRCPYGKRRDDCGMFCADQFERLFETEYYGVWDPKAGEAEYAAFYVECIQGTGGYVIP PDGYYPRLKKVLDERKIMLVDDEIQMGFYRTGKLWGLQNFGITPDVVVFGKAMTNGLNPL AGIWAKERYINPQVFPCGSTHSTFASNPLGTAVALETMRMLAEEDYETRVREMGAYFLEG LRELKSRYAIIGDVDGLGLALRAEICAADGFTPDKAMMDRIVDEGIKGDLMVDGKRYGLV LDVGGYYKNVITLAPSLNITREEVDLAMRLLDEILRRVTKN >gi|316921551|gb|ADCP01000149.1| GENE 42 54261 - 55223 1027 320 aa, chain + ## HITS:1 COG:BS_yvaK KEGG:ns NR:ns ## COG: BS_yvaK COG1647 # Protein_GI_number: 16080415 # Func_class: R General function prediction only # Function: Esterase/lipase # Organism: Bacillus subtilis # 9 257 10 235 248 101 27.0 2e-21 MQSLERLGPFFPGGRIGFLLVHGLAGTPAEMKILGKRLNRYGFTVLCPQLAGHCASEEEL ITTCWSDWSRSVEDAFDALSRHMDAVFVGGLSAGAVLSLRHAQRYPGRARGLALYSTTLR YDGWTIPKLSFLLPLILKTPYFGKRYRFEEAFPYGIMDDKLRGRILAQMQSGDAAAAGFT 
ATPGASLKQLWGLVAAVKRDLPSIKTPTLIVHAGNDDIASARSNALYVRDHIGGPTELLL LDRSYHMVTIDQERNVVGDATARFAWNLLSDAERERLAAVAREPVPAAPETSAQTPDAAA PSPASSVAEGMEKAAEGAGR >gi|316921551|gb|ADCP01000149.1| GENE 43 55220 - 55582 458 120 aa, chain + ## HITS:1 COG:no KEGG:Smlt1419 NR:ns ## KEGG: Smlt1419 # Name: not_defined # Def: putative transmembrane protein # Organism: S.maltophilia # Pathway: not_defined # 1 119 1 118 119 128 63.0 5e-29 MTALIVFLWIANMLVDTGGQLFFKAAASEHTAGAGGLAHWKRMASRPWLWFGICCYVLEF FVWMAFLSQVELSVGVMLGSFNIVVIMLAGRLFFKEKLTRWRVTGISLITLGVAIVGVGG >gi|316921551|gb|ADCP01000149.1| GENE 44 55579 - 55989 617 136 aa, chain + ## HITS:1 COG:no KEGG:CtCNB1_0145 NR:ns ## KEGG: CtCNB1_0145 # Name: not_defined # Def: quaternary ammonium compound-resistance protein # Organism: C.testosteroni # Pathway: not_defined # 1 127 1 127 128 157 59.0 8e-38 MKRFYIIGFLILMSFDTLSQISFKYASTQALPLELSPDWLMRLFLNPWLYGAIAGYLGAF VTWMTLLRYAPVGPSFAASHLELISVTLFSVWLFNEPLNAYKILGGVLILLGVLCLAKDE KDDEDEAEAAPVRAGE >gi|316921551|gb|ADCP01000149.1| GENE 45 56165 - 57358 1139 397 aa, chain + ## HITS:1 COG:MTH334 KEGG:ns NR:ns ## COG: MTH334 COG0399 # Protein_GI_number: 15678362 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Methanothermobacter thermautotrophicus # 39 203 50 212 363 87 30.0 4e-17 MRELPPTAGLPMHMGDCFRAPEKPFTAGLRDWLGIPEPIMACSGTAAFVVALRTLARRNP GRSRVVVPAYTCPLVPLAVRLAGLRAVACDTLPGGFGIDPGALGRCCDETALAVVPAHLG GRASDISAVSAVAHDYGAVVIEDAAQALGASFGGRSVGLSGDIGFFSLAVGKGLTTYEGG VLFSRDPDLHAELQGTASRVLRPGLLWNARRVLELLGYAVAYNPALLPLAYGRPLRRGLD RGDEIAAVGDAFTLDDIPLHRMDPLRLRVAANALERLPAFLEAGRDRAVRRAGMLERAGA EVLRDDPGGEGIWPFFMVLMPDRDRRDRALGALWRSGLGVSKLFVRALPDYPYLADAVEG GPCPAARDLAGRMLTVTNTHWLDDAAFARLAGEICHA >gi|316921551|gb|ADCP01000149.1| GENE 46 57351 - 58439 1345 362 aa, chain + ## HITS:1 COG:no KEGG:Daci_0136 NR:ns ## KEGG: Daci_0136 # Name: not_defined # Def: hypothetical protein # Organism: D.acidovorans 
# Pathway: not_defined # 1 346 1 350 357 347 55.0 6e-94 MPDARHANLLEPEALVELFLRHPPQGFAAASEADLPVFGTDFDLLTTLEPAILVKIRRLP LFGLWSRLLRFPARFAGTTATEYAPLPKGLEPGALLDGFRERCAAGQSLLIVKDVPEVSP LLGAGDNEAAMRLARIAPDKGFIVVEGQALAYVPIDFSSTDEYLSRLSKSRRKNLRRKLK SRERLDIEAVPLGDARFGSLDVLEELYGLYLGVYAQSEIHFDLLTRDFFAGLLQSRELGG VVFCYRHDGELVGYNICLEHGGMLIDKYIGLRYPQARELDLYFVSWFVNLEHALERGLRF YVAGWTDPEVKAGLGARFTFTRHLVWVRNPILRRILYMFRHWFESDAKAVGAKAADKDTD HA >gi|316921551|gb|ADCP01000149.1| GENE 47 58432 - 59385 868 317 aa, chain + ## HITS:1 COG:no KEGG:Daci_0135 NR:ns ## KEGG: Daci_0135 # Name: not_defined # Def: hypothetical protein # Organism: D.acidovorans # Pathway: not_defined # 13 294 10 295 303 270 51.0 5e-71 MHDPRESFRPSPPVILDFDGSVLPVAEGERRIPLGPWQEAIRFGCTRRAFSALEAHLEGV LPVDCGCAFMGSGDFHHVTLIPLRRLCRRLPPASLDVVVFDNHPDNMRYPFGIHCGSWVS HAALQPSVRRVHVIGITSGDIGLAHAWENRLTPFLRRKLTYWSVGTDAAWLRLIGRAECC RSFGSADELVRGLLPALADAGNIYLSIDKDVFAEDVVKTNWDQGVFRLSHTEAVLAACAG RVIGADVCGDVSGYEYASPFKRFLSRLDGQEPCDPQALRGWQEGQRAVNAALLESLGKVL REPEASTRVSPILRRFF Prediction of potential genes in microbial genomes Time: Fri May 13 04:55:08 2011 Seq name: gi|316921535|gb|ADCP01000150.1| Bilophila wadsworthia 3_1_6 cont1.150, whole genome shotgun sequence Length of sequence - 29214 bp Number of predicted genes - 19, with homology - 16 Number of transcription units - 10, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 98 - 157 7.0 1 1 Op 1 . + CDS 189 - 1166 520 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 2 1 Op 2 . 
+ CDS 1175 - 1522 102 ## 3 1 Op 3 26/0.000 + CDS 1562 - 2296 562 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 4 1 Op 4 2/0.000 + CDS 2293 - 2958 368 ## COG1134 ABC-type polysaccharide/polyol phosphate transport system, ATPase component 5 1 Op 5 4/0.000 + CDS 2942 - 4123 743 ## COG3524 Capsule polysaccharide export protein 6 1 Op 6 3/0.000 + CDS 4127 - 5950 848 ## COG1596 Periplasmic protein involved in polysaccharide export 7 1 Op 7 26/0.000 + CDS 5952 - 6884 250 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 7255 - 7314 2.5 8 1 Op 8 . + CDS 7474 - 10953 172 ## COG0438 Glycosyltransferase + Term 11100 - 11155 5.1 + Prom 11106 - 11165 5.4 9 2 Tu 1 . + CDS 11410 - 11694 79 ## + Prom 11715 - 11774 6.6 10 3 Tu 1 . + CDS 11799 - 11918 70 ## gi|291544023|emb|CBL17132.1| nucleotide sugar dehydrogenase 11 4 Tu 1 . + CDS 12057 - 12512 -7 ## COG1004 Predicted UDP-glucose 6-dehydrogenase + Term 12561 - 12603 -1.0 12 5 Tu 1 . + CDS 12635 - 12853 72 ## COG1004 Predicted UDP-glucose 6-dehydrogenase + Prom 12890 - 12949 2.6 13 6 Op 1 5/0.000 + CDS 13011 - 15080 820 ## COG3563 Capsule polysaccharide export protein 14 6 Op 2 . + CDS 15083 - 16282 768 ## COG3562 Capsule polysaccharide export protein 15 7 Tu 1 . + CDS 16526 - 16693 66 ## + Prom 16791 - 16850 3.3 16 8 Tu 1 . + CDS 16925 - 17686 574 ## COG2186 Transcriptional regulators + Term 17842 - 17886 -0.3 17 9 Op 1 1/0.000 + CDS 17946 - 19154 1298 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases + Term 19248 - 19301 -0.4 18 9 Op 2 . + CDS 19560 - 20975 1552 ## COG0471 Di- and tricarboxylate transporters + Term 21009 - 21053 15.4 19 10 Tu 1 . 
- CDS 21161 - 29212 11582 ## COG2931 RTX toxins and related Ca2+-binding proteins Predicted protein(s) >gi|316921535|gb|ADCP01000150.1| GENE 1 189 - 1166 520 325 aa, chain + ## HITS:1 COG:RSc0413_1 KEGG:ns NR:ns ## COG: RSc0413_1 COG0794 # Protein_GI_number: 17545132 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Ralstonia solanacearum # 16 210 19 214 216 205 55.0 9e-53 MVSFFREGGAGSFVPFAGDTLKKQAESLWHMAEDLDDSFDEVIRCLLSISGRIAVTGMGK SGHVARKVAATLASTGSPAYFIHPAEASHGDLGMVSRHDAVIAFSNSGETAELSDIILFA SRHGVPITGVTKRGDSFLARHADHVLLLPNEPEACPIGCAPTTSTTLQMALGDAIALSLL KARGFGAEDFHRFHPGGRLGRKLMTVREIMHVGEALPLASLDSPMTEILCIMTGKGFGCV GIMEKGILVGIITDGDLRRHMDGGLLGLTAERVMSRNPITVDEHCLAAKALGIMQSSKIT SLYVVRSGEPVGILNVHDCLRAGVN >gi|316921535|gb|ADCP01000150.1| GENE 2 1175 - 1522 102 115 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLRCLMVGCLTVVCGCAVQQGGRGEKSGVAAQETRYAGFASQLGEQMEGTIASVEKTPFG GPAEVSVGRPYVSGLRMLCRRATLTDPHREMTVAVCRRPSEEWTLVEPIFEPLPR >gi|316921535|gb|ADCP01000150.1| GENE 3 1562 - 2296 562 244 aa, chain + ## HITS:1 COG:SMb20833 KEGG:ns NR:ns ## COG: SMb20833 COG1682 # Protein_GI_number: 16264324 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Sinorhizobium meliloti # 2 240 16 254 259 92 30.0 5e-19 MALILREVHTLYGSSRLGYLWAVIQTMFGIGIFWGIREIAGARAPHGMSVLMFLLAGFGL WATFSETLTKCMSAVSGNKALLTFPQVTPLDLMLSRTVVVWGTQLSSGCVITGIAAFCGT SLYVSDWTGLFAAFLLTPLLGLGCGMLCASLAVFWPTLEKIVPMILRILFFASGIFFSVS MFPKNIADILLLNPVMQLIELLRQSLSRGYVAPTYDYLYIVAFCVVSLCLGGLLERYARK RAEQ >gi|316921535|gb|ADCP01000150.1| GENE 4 2293 - 2958 368 221 aa, chain + ## HITS:1 COG:PM0781 KEGG:ns NR:ns ## COG: PM0781 COG1134 # Protein_GI_number: 15602646 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate 
transport system, ATPase component # Organism: Pasteurella multocida # 1 208 1 208 219 223 53.0 2e-58 MIALHNIAKSYPLPRGKKVVLDDVNITFRPGENMGILGLNGAGKSTLMRIIGGAELPDSG KVVRKSRVSWPIGFAGGFHGSLTGRENLRFTSRIYGSDIRKVTEFVEDFSELGPYMDMPV RTYSSGMKSKLAFGLSMAIGFDFYLIDEAYSVGDAAFQAKGERVFQERKASSTLIVVSHS VSTIRKNCDNAAVLFGGKLMCFANLEDALTRYQEICDAQKR >gi|316921535|gb|ADCP01000150.1| GENE 5 2942 - 4123 743 393 aa, chain + ## HITS:1 COG:Cj1445c KEGG:ns NR:ns ## COG: Cj1445c COG3524 # Protein_GI_number: 15792763 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Capsule polysaccharide export protein # Organism: Campylobacter jejuni # 41 390 22 370 372 160 28.0 5e-39 MLKSDNPPPLTLAVQSGRRSFSSARACWGRLPRKGKIFACVVLLPTVAVFLYYLLWASPM YVSQTRFAIRSADSSGGGLDIASALLRSSSSTGADAHIVVEYIQSLDIIHDIDKELGVDL HFSDKGHDVFSRLTNNPTQDEQLRYWKWAVIPALDQDTGIITLETKAYSPEMAQKIAAAV LARSEALVNTMNLRAREDAVSLAQSEVQRAEARVRKAQEAMRNFRDAHTLLDPRVTAAGL QGVVTELEGEAVKLRAQLAEAQSFMRSSAPATKALQTRLKAVESQLDQEKQRLAGLRSQE GNLNAVVGEYEDLTIEAEFAQKQLVTAMSALESARVHEVAKSRYVVAFQQPTLPDESLYP RPFLFTAYVFVGALLLLGLGSLITASIREHAGF >gi|316921535|gb|ADCP01000150.1| GENE 6 4127 - 5950 848 607 aa, chain + ## HITS:1 COG:Cj1444c KEGG:ns NR:ns ## COG: Cj1444c COG1596 # Protein_GI_number: 15792762 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Campylobacter jejuni # 119 607 62 552 552 324 36.0 3e-88 MNHMMHWKTYLAFATGSLLLLSLPPYAAGAQNTVQQTGGSAYGQPLGRSSNPVPASSGGI PGVPSSMPAQQAPGTYRLPSQGSVWGSRESSPVPTVLDAPPAWAQSPLRGPASATLPPFG ANLFQGNFASTYSESVNADYVILPGDRIVVRIWGAKTYDDVLMVDQQGNVFLPEVGPIRV AGLRQGQLLGAVRSSLASVFTDNVNIYVNLQSSQPVAVYVAGFVNHPGRYAGGPMDSVMS YLDRAGGITPERGSYRHIKVMRGKSLIGTVDLYDFALRGEMPSIRLKDGDVILVDERGSS VAALGLLRQQARYEFMGTATGAHLLDLATPLNSASHVSISGIRNRAPFNVYIPIAEFAQF QLADGDTVDFVADKRGRTIMAAVTGAIQGASRFPVRKDTTLKSLLQYVEIEPAIADTSAI YIRRQSVAAQQKAIIADSLRRLEQSALTSTSSSVDEANIRVREAELIQDFVRRAATLEPD GVVVVSRGGTVKDLLLEDNDVIVIPQKTDVVHVSGEVLIPKAITYEKGMSLHEYLESAGG 
LSDRADKDNILVAKQNGEVGKVQNMGIAPGDRILVMPRFDSKNMQLAKDFTQILYQIAVA TKVAVGL >gi|316921535|gb|ADCP01000150.1| GENE 7 5952 - 6884 250 310 aa, chain + ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 3 209 2 213 344 108 35.0 9e-24 MKIPAISVIIPLYNCAPWLRECLNSVVLQTLKDIEILIVDDKSTDNSVSIAEEYAAQDRR IRIFRSTENLGPGPSRNTAIAQACGEYIAFMDGDDFYPSLDVLEVLYHNALEHQAKVCGG SLRYVNADGSTREKQLSEYVFTEERFYTFREYQHEGGYYRFLYNNELLQINRLEFPPLRR FQDALFCVKTMLATEVFYAMPMISYAYRKEHKQVKWTEQSINDYFIGILELLNISKLYKL DVLHYCMAKNLYDHIYLFMKIRKYNILKILCNIDYSLIKYQNTVCKVKISRLKFILKLIQ YKYFLRISVA >gi|316921535|gb|ADCP01000150.1| GENE 8 7474 - 10953 172 1159 aa, chain + ## HITS:1 COG:Cj1432c_1 KEGG:ns NR:ns ## COG: Cj1432c_1 COG0438 # Protein_GI_number: 15792750 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Campylobacter jejuni # 528 867 3 343 367 239 37.0 3e-62 MQREIELEKQSSIQNSYIFALYERLIHAGKTVVFTTDMYLPKDTLKEMLESSGYHDFCDI YVSNVYQLRKGDGSLQKKLIEKYPLKKIIHIGDNKTADVDKSEKSGMAALWYPDCRLQKR EVFLNNLSGSIYRAIVNNTLNTGLWEHGLHYTHGFRVGGILTAGYCEHINEVAQQKGAEK ILFCARDCYIIQKVYNAFYRKVDNSYIEISRYAVMNLSPERYANDILDRFIFRYWDENKN AKTLEQLLHDTGYNFLVPYLEDNDLDRFIYCSSASKELFAEFFLSHIDVLKENSKVSREA ATAYFGGLIGTAKSILIVDVGWSGTCISALEYFIHDAISPDINVSGTLVCSSNTKNMCNQ ILGKYIVPYVCGPCRNNDFNNFMMPSGKKSVREIDMLHMPLEYMFTADTASLVEYFKDND ATVTFVRDVNTPKNINEIREMQNGIYAFNEKYVEYTKHLLIDVTVSPYVAFIALKESITD KNYSYSIYKNFLYDACTAPGNAQRRGVPFSSFFPDKITPAIISPGKGKRILFVSPELTYT GTPRSLLRVCRLARELGYEPIVWSAKDGPFREEFLKENIALQIVQAQDISQNQYMDIVSS CELVFCNTIVTDDYVRILSRYLPTVWFIREATNIPDFCRNKPERLALLQSYDKIFCVSEY AATALSQYTNQQIHIIHNCVEDESQWAENYRPGSGPTVKFVQFGTMEYRKGYDILIAAYK SMPIEYQRLCEMYFAGGFINSGTPYCDYLFSEMDNVPSLHYLGIIEGAENKIRTLSQMDV VVVASRDESCSLVALEGAMLSKPLILTESVGAKYIVDSENGLIVTSGDVEAMKKALMQMI DNKNALHVMGIASRKIYESRAGMSIHKAAFERLFNELSAKGKDGAKLAVISDFDETEVVI 
SLTSHPGRMACIHTCLESLIKQDYRNRKIILWLSLEQFVHKEKDLPAALLNLNGQNAFEI RWVADDIKPHKKYYYAYQEFRKTPIIVVDDDVVYDPSMVRALVASYKKHPHCISCHRANL MLLRADKTFRAYQTWPMGYTLLSDTPSYQLLPTGVGGVLYPPNSLPEYTFNTDVIKRYCL LCDDLWLKMMTVLNEYPTVVVKNHRPYKLIDGSQDVALWKENVKGSNNDITLHAICEYIY KSIPQGSVCMERIRKDRFC >gi|316921535|gb|ADCP01000150.1| GENE 9 11410 - 11694 79 94 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNYTFYGLCSFLDLFGLHVFKLRPGIDSNTLISRTFNKFLLEREVDFEQCFFTEESILNY AIETYFRKKKFTIQQINLQKLKIAGQFSFAVKNR >gi|316921535|gb|ADCP01000150.1| GENE 10 11799 - 11918 70 39 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291544023|emb|CBL17132.1| ## NR: gi|291544023|emb|CBL17132.1| nucleotide sugar dehydrogenase [Ruminococcus sp. 18P13] # 1 39 1 39 394 64 76.0 2e-09 MNIAVAGTDYVGLSLAVLLAQHNSVKAMDIVPAKVELIN >gi|316921535|gb|ADCP01000150.1| GENE 11 12057 - 12512 -7 151 aa, chain + ## HITS:1 COG:ECs2829 KEGG:ns NR:ns ## COG: ECs2829 COG1004 # Protein_GI_number: 15832083 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Escherichia coli O157:H7 # 4 150 90 235 388 170 59.0 9e-43 MERNYFDTSSIDAVLKLVEKVNPEATIVIKSTVPVGYLEAQRGAYPTLPNMLFSPKFLRG VKALYDNLYPSRIVVGVPDCCQPALRARAEQFAALLQEGALEDETPTLIVHSTEAECIKL FANTYLAMRVAYFNELDTYAELHGAGLCTDY >gi|316921535|gb|ADCP01000150.1| GENE 12 12635 - 12853 72 72 aa, chain + ## HITS:1 COG:STM2080 KEGG:ns NR:ns ## COG: STM2080 COG1004 # Protein_GI_number: 16765410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Salmonella typhimurium LT2 # 1 70 277 346 388 91 55.0 4e-19 MEANNTHINFIVSQIIARQPRIVGIYRLTMKTGSDNFRASAIQGVMGRLSNSGVKLVIYE PTLTQDIFMDWT >gi|316921535|gb|ADCP01000150.1| GENE 13 13011 - 15080 820 689 aa, chain + ## HITS:1 COG:NMB0082 KEGG:ns NR:ns ## COG: NMB0082 COG3563 # Protein_GI_number: 15676015 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Capsule polysaccharide export protein # Organism: Neisseria meningitidis MC58 # 21 649 31 658 704 527 
45.0 1e-149 MNETNTIFEQECMFLVFSCTLSRISHLSVFLGTPVWLLRKRREGESSAVIGWGHKPTADK ARDYAAKHNLPYIALEDGFLRSLDLGCKGAQPLSLVVDHTGIYYDATEPSDLENLLNGSG GDSEGQLQSAQHAIEAIKRHDLSKYNHAPRAAERLLGDASCPRVLILDQTKGDTSVTLGR ADAESFTAMLDDAQMRFPTARLFVKTHPDVLAGKKQGYLTEAAKRRGIAIIAQDVSPLSL LAQADVVYTVTSQMGFEALLLGKEVHCFGMPFYAGWGATHDRLTCPRRAKRRTAEEIFAA AYMLYARYVNPVTARRCDIHEAIRILAAQRFQNERNKGFHACVGFSRWKRPHARAFLQST TGTIRFFSDWWKAIKWAQANGGDVVVWASKCTIGLESSCQTMGVRLIRMEDGFIRSVGLG SDFNWPYSLVLDEKGIYYDPSRPSGLEDILNTLPEHPERAELCSRASALRGFIVEKGITK YNTGVDAVTRGDFSAKGRLLLVPGQVEDDASVRLGGCGLFSNVDLLKAVRESHPDAFIIY KPHPDVESRNRKGRIPDGEALRYADRVVRNVRMDVLLGIVDEVHTLTSLTGFEALLRGIE VHAYGGPFYAGWGLTHDRIDFPRRKAGLSLEELIAGTLILYPSYYDWQTKGFCTPEDICY RLTQPKGQMRGKFFMRTFAALREKIRNVF >gi|316921535|gb|ADCP01000150.1| GENE 14 15083 - 16282 768 399 aa, chain + ## HITS:1 COG:SMc02269 KEGG:ns NR:ns ## COG: SMc02269 COG3562 # Protein_GI_number: 15964326 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Capsule polysaccharide export protein # Organism: Sinorhizobium meliloti # 2 398 13 406 441 250 34.0 3e-66 MRTYLFLQGPHGSFFRELGLALQRAGHRVLRVNICGGDVVDWHSREAVWYQGRTTDWSGW IGDLMRREGVTDLLAFGDWRPLHREAILVAKLRNIRVWAFEEGYLRPHYITMEEGGVNGN SMLPHSPALLRELAESHPAPPSPVAVQNPLSARVWKAIAYYAGTIFLRPFFPCFRTHRPQ SAVSEVWGWFLRVLTRRTWERRSAESVREIYRSRAPYFLFPLQLDSDSQVRRYSPFSGMK EAIAHVLTSFAHSAPKSMHLLIRNHPLDNGLINYRRYIRRFSHACGLEGRVHFVESGRAA LMMDRSEGMVVLNSTIGISGLQRGIPVYCVGTSIYAVRGLASSQEEQDLDDFWSNPQKPC PETLRHFEKVLKCRTLINGNFYTEEGVRLAIQGVMERIA >gi|316921535|gb|ADCP01000150.1| GENE 15 16526 - 16693 66 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGRTGFLRHERFPGIGMRTTRLRRDRMAERKGCPPLSQEGILCFKVLTGDYVCNS >gi|316921535|gb|ADCP01000150.1| GENE 16 16925 - 17686 574 253 aa, chain + ## HITS:1 COG:CAC2546 KEGG:ns NR:ns ## COG: CAC2546 COG2186 # Protein_GI_number: 15895808 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 25 239 9 226 231 97 30.0 2e-20 MTHSQETPDQSQKAEPLPETREPLKAYQRLSEKIQDAIEKSGLKVGDKLPPERALAETFG 
VSRNSVREAIRTLSEQGILKSRHGDGTYICGSDLAPLTAALLNAVDTEGKSFDHIMEFRL VIEPSIARIATARHSAEQLHQLKIIVCDQQLKLLHGENDGELDAQFHTLLAAATGNPVFV DIVSHLNELFSHNRADDVRLKRRMEEAINEHLMIIDALERKDADACSDAMARHLHAVAEK HFLAQSRERGSAC >gi|316921535|gb|ADCP01000150.1| GENE 17 17946 - 19154 1298 402 aa, chain + ## HITS:1 COG:BH0352 KEGG:ns NR:ns ## COG: BH0352 COG0624 # Protein_GI_number: 15612915 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Bacillus halodurans # 18 388 6 366 374 213 35.0 4e-55 MDLTLWKDKLDAYLDPQEGDMLRLLERIVNMDSFTTDAQDVDQLGTVLTDWLREAGFQTF MMPKTEAPADEPWQADLGHVFMAKTHGREAEPGIVFLGHMDTVFPKGTAQARPFKIEGDR ATGPGVADMKAGLVANLFAARALKRLGLIDCPMTLMFSPDEELGSPSASRTLAQVLPGAR AAINSEPGGPGGLVTVSRKGSGHMFLKVQGKASHAGRCYADGASAILEIAHKTLAIDTFL DLDRGLTVNTGLISGGVSANSVAPWAESRIHLTYRTLEDGQKVVAGIRDAVSRTVIPGTS ASISGGLRLYPFERCEAGDKLFGLVKDAGELLGMSIKGQHYDSASEAGFSSSLGVPTICN MGPEGEHIHSLNEYLVPSSIVKRCKLIALTALQASRVFQPGK >gi|316921535|gb|ADCP01000150.1| GENE 18 19560 - 20975 1552 471 aa, chain + ## HITS:1 COG:MTH788 KEGG:ns NR:ns ## COG: MTH788 COG0471 # Protein_GI_number: 15678812 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Methanothermobacter thermautotrophicus # 28 400 6 367 443 102 25.0 2e-21 MILSPNAIQRVRQAFSFLFIPLGVWIAFITPPDGLSPQTMKALGITVWAVGWWITQIVPE YVTGLLMCVFWAVTKCVSFQTAFATFSTSGWWIMVGAFALGAVAGKTGLLKRISLCVLHL FPASFAGQVWGLIASGTVISPLIPSINAKATLSTPIALSISDELDMERKKAGACGLFGAC YVGFVLAGHMFMSGSFTHYVLVGMLPPEYRGVTWLDWLLWSLPWGLVMLAGMGLYIIWAY KPSKSVRLTKDNVAEQIRNLGPLSREERITLWTLIITLLLWMTEAWHGISSGEVAVTAMC VLLMSKVMDRKDFKNGIDWPSVVYVGSILNLAAVIRALHVDRWLGVALKPYLLSVVGSPA SLIVAIILGMCVIRLLIVSMSSAAAIFVLILPPLLIPLGMNPWIVCMVAFAGGDIWYLKY MNAFYLCADLGTEGKMANHRSMIKLSAAYMVICTLGFIVSIPFWRMFGLLQ >gi|316921535|gb|ADCP01000150.1| GENE 19 21161 - 29212 11582 2683 aa, chain - ## HITS:1 COG:all7128_2 KEGG:ns NR:ns ## COG: all7128_2 COG2931 # Protein_GI_number: 17233144 # 
Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: RTX toxins and related Ca2+-binding proteins # Organism: Nostoc sp. PCC 7120 # 31 520 1391 1847 2231 90 27.0 4e-17 TLNADGSYTYTLNNALPAVQTLGVGETLTDTFTYTVSDGHGGTASNTLTLTINGTNDTPS VGPTFGFVTEDTLTVLNGALSTPTDPDTHDMPVFVAKTAETGLYGSLTLRADGTYTYTLN NSMNAVQGLGQGETLRDIFTYTVSDGHGGTASNVLTVIINGTNDAPAVAAAVAAVTEDAK LTATGKLTTPTDIDNSADGIASDTLSFIPQIARAGAYGTLTLNADGTYVYTLNSTLGSVQ ALGAGETLTDVLTYTVTDGKGGTASNTLTVTITGTNDAPVVAAATATVAEDTRTTASGTL PAPQDIDAHDTVSFIPKTGEAGRYGTLTLNADGTYTYVLNNNLAAVQGLGVGDTLSEVFL YTVKDSHGAVGNNTLTITIQGTNDAPTASPGGGSVKEDSVLTASGRLAATDPDNTLDGAG SDALAYTPKVAQAGLYGTLTLNADGTWLYALNNSLGAVQGLTTGETLTDTFTYTVTDNHG AASTGTLTITVNGANDPPLTTPATVQVAEDAATVASGTLPAPSDPDNTQDGVISDILSFV PATVQGIYGTLVMDAAGRYTYTLNNSLAVVQRLGVGETLQEVFSYAVQDQHGGISTNTLT VTVNGTNDAPTVAAAAASVTEDAQITASGSLPAPRDTDTHDTVAFIPQTATAGTYGTLTV NADGTYLYTLNNNLPAVQALQAGQTLTDTFTYTVTDNRGGTGSNTLTVTISGLQETGELS GPGNVTEDVRLATSGTLPVPADPLGILSGLLNGLTYSATTLNGLYGTFTLNANGSYTYTL NNNLGAVQGLTTGETLLDVLPYKVTNILGAALNGSYAVTIHGTNDTPVAPSVSVSVQEDT LLLSSGIIAATDPDNTRDGIVSDLLSFTPQTTAGLYGTLTLNAAGAYTYTLNNSLSVVQA LGQGESLSETFNYTVKDQHGGITVGTLTVNISGTNDLPVPTVQTASLTEDVKAVALGSLR PVPDPDVNDIVTFVPQSSTAGLYGTLTLNANGSYVYVLNNSSAAVQGLGPGESVTDTFAY AVSDGHGGTAASTLTLTINGTNDAPTVGAASASIGEEAAGPANALSGLLPTPYDADIRDA VSFVPQTNRAGTWGTLTLRADGTYTYALTDSAALHSLGAGQIAYDTFAYTVADNHGASVT NTLTITVIGVNDAPSVASATASVQEDTLLTASGILPAPTDPDAGDRPQFVMQLSTQGQYG TLFLSPAGAYTYTLNNGLPAVHALGEGDTLTDTFTYTVTDGHGGTATNTLTVTINGTNDA PTVDAADAAITEGVAQTAGTLPTPQDPDAHDVPVFVPQTGAAGLYGSLTLDASGAYVYTL NNNLAVVKQLGLGETLTDTFTYTVTDNHGGTATNTLTVTVNGINTPPLTGTVTAGSVTED TSTVLTGIMTAPPDPNPHDTVVFLPQTGTAGAYGTLTLAASGRYTYVLNNTLPAVQGLGV GETLTDTLAATVSDGKGGLTQTTLAITINGTNDLPSVAPSSGAITEDTAILAGTLSLPTD PDIHDTPAFLPQLGTAGAYGTLTLNADGTYSYALNNSLSAVQGLGVGETLTDSFAYTVSD GHGGTASNTLTITISGANDAPAALAATASVTEDTALSASGVLPRPSDPDFHDTVAFIPLA ARAGAYGTLTLAADGHYTYVLNNTLPAVQGLGVGETLTDTITYQVIDNHGATGSATLTVT INGTNDNPIVTPLTASILETAASVSGALPAPTDPDIHDAPAYVPFAARAGIYGSLTLNAD 
GSYTYVLNQSQPAVQGLGQGETLTDTFAYSVVDGHGGAASGTLTVTVNGINNPPTVAASA AVIAEHTPYVTGVLPSPDDPDIHDTVSFTPATLTGRYGTLSLDAHGGYVYTLNAHSPDVS GLGQGESLTDTITYGIADDKGGTGTGTLTVTINGTNDAPVLGPQAASVSVGGTVTAAGTL IAGDPDAHDTVSFTPFTAQAGLYGTLTLNADGTYAYTLNSGLSAVRQLSIGDTLQDVFLC QAQDQYGLTGSGTLTVTINGGNEAPTVAAATASVTEDLALTATGTLPAPQDINIHDAIGF VPLSAQAGTYGTLTLTAGGRYIYVLNNALPSVQALGTGETLTDTFTYQAIDNHGAIGSNT LTVTINGTNDAPTMSLAAAAFARTDLTLTGVLPPPHDPDAHDTASYQPLTAQAGLYGTLT LNADGTYTYVLNNTLDTVRELGAGESLQDVFSCTVIDSHGATGAGVFTITINGSNDAPTA ADATASVTAAVSGDQVLATGILPHPADPNIHDVLSFVPLAAEAGSYGTLTLAADGSWTYT LNNAADAVRQLGAGATLTDTITYQVADNHGATGNATLSVTIYGVNDLPAVAAATASVTED TAAAASGILPAPTDPDSGDSVSFVPIAGGAGLFGTLNLDAEGHYTYTLNNSLAVVQGLGE GQTLTDTFTYTVTDAHGGTGSNTLTVTINGVNDSPSVAAATADVTEDVQTLVTGTLPAPH DQDNYGVPNDILSFTPQIAEPGAYGSLTLNADGTYTYILNNALPAVQNLNAGDTLTDTFT YTVTDNHGGTGSNTLTVTIHGLDEPSGGGTYAAPQTASLSVTAETGEAVPPGTGATSTLP DYLQPEGDSLASVLNHYASSGSGQGHETAATDGTHPEDGAAPDASPLPSSEDGGAPTVSA PIETAVHDAQPQQYSEPYTPPMESSTETLQQNVSQELARNGGI Prediction of potential genes in microbial genomes Time: Fri May 13 04:55:45 2011 Seq name: gi|316921531|gb|ADCP01000151.1| Bilophila wadsworthia 3_1_6 cont1.151, whole genome shotgun sequence Length of sequence - 2617 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 1529 1575 ## COG0523 Putative GTPases (G3E family) 2 1 Op 2 . + CDS 1541 - 2314 647 ## COG0811 Biopolymer transport proteins 3 1 Op 3 . + CDS 2308 - 2617 364 ## DVU2807 biopolymer ExbD/TolR family transporter Predicted protein(s) >gi|316921531|gb|ADCP01000151.1| GENE 1 3 - 1529 1575 508 aa, chain + ## HITS:1 COG:all1751 KEGG:ns NR:ns ## COG: all1751 COG0523 # Protein_GI_number: 17229243 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Nostoc sp. 
PCC 7120 # 206 461 14 268 323 119 32.0 1e-26 GEDADGGALRGTFAAYAFPLPDDPLLEVFPLEALRLAVDPGYGDRIRALSEHAETLSPFR IGTVSLRAAVDGSGLSLTLSAPARLRMAALDGVAAFSGGQRWLVEPGGWETDVPAFRLLL RPFAALSATAESLLGDPARVSLDGEAAPIPGFDAAGNIRHIPGGTGRELRLTAAFGSAGA QGRPLARLPLRHKPLEEESGLPVLHILTGFLGSGKTTFLRQWLDFLHGRERYTGVIQNEF GEIGLDAALLRGETQVEALDEGCVCCSLADSLRPGLLRLIGDMPAEQFILETTGLANPAN VMEALSELRDIVQPGLVITVADALDLCRSEGDIAGIRRAQAARADVIILNKADTVEPAAL EALAERLRALNRQALILPARHGAVAFAELDAFYADWADRRGTPLPSHRPALPRFGETVTH ADEGFVSAALRLSAPLDEAALLALLDSAGPGLCRAKGVVDLRLDGGVTVPATVQYAAGRL GFEPAPEGEERYLVFIGTDISLPEAPSS >gi|316921531|gb|ADCP01000151.1| GENE 2 1541 - 2314 647 257 aa, chain + ## HITS:1 COG:RSc2528 KEGG:ns NR:ns ## COG: RSc2528 COG0811 # Protein_GI_number: 17547247 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Ralstonia solanacearum # 3 174 4 182 206 88 38.0 2e-17 MGLIELGGWMMWPLLAVSVAALAIVVERALVFMGCPFPDSRFPGLVLEAMRTGDVRPLAA RLEAVPSLRDFAALLGSPLPNREAALRLAGETVLERLEARLSLLSVLARLAPLMGLLGTI LGMITTFSRIAEARSGVDMSLLAGGIWQALLTTAAGLCIAIPALFFLSCFQGKVRRVADA LNKAGNAALLADGGPGITVRAGAEPLQPGDIPRGGGTAEVDSCPESPEPRGSAPEGREAL QCRDNAGEGLPDGGRSC >gi|316921531|gb|ADCP01000151.1| GENE 3 2308 - 2617 364 103 aa, chain + ## HITS:1 COG:no KEGG:DVU2807 NR:ns ## KEGG: DVU2807 # Name: not_defined # Def: biopolymer ExbD/TolR family transporter # Organism: D.vulgaris # Pathway: not_defined # 12 103 3 94 136 93 52.0 3e-18 MLNLRPKAARDDEPDLTPLIDMVFILLIFFILAASFAVRGIDLDLPPAHSGQALSGRVVE IRLLRDGSFLCEGIPVARADIRAKLQSLVRDFRARPGQLVLKA Prediction of potential genes in microbial genomes Time: Fri May 13 04:56:07 2011 Seq name: gi|316921506|gb|ADCP01000152.1| Bilophila wadsworthia 3_1_6 cont1.152, whole genome shotgun sequence Length of sequence - 29666 bp Number of predicted genes - 28, with homology - 23 Number of transcription units - 12, operones - 5 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 97 - 609 492 ## DVU2808 TonB domain-containing protein + Term 759 - 789 2.6 - Term 520 - 562 1.1 2 2 Tu 1 . - CDS 801 - 1358 501 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 3 3 Op 1 . + CDS 1397 - 1720 72 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains 4 3 Op 2 . + CDS 1735 - 2133 680 ## 5 3 Op 3 . + CDS 2266 - 3486 1692 ## HNE_2486 thiol-disulfide oxidoreductase domain-containing protein + Term 3540 - 3577 9.4 - Term 3528 - 3565 9.4 6 4 Tu 1 . - CDS 3615 - 4895 1462 ## COG3681 Uncharacterized conserved protein - Prom 5043 - 5102 2.1 7 5 Op 1 . - CDS 5522 - 5758 60 ## 8 5 Op 2 . - CDS 5810 - 7027 1384 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases + Prom 7277 - 7336 2.6 9 6 Tu 1 . + CDS 7464 - 7634 263 ## + Term 7816 - 7847 0.0 10 7 Tu 1 . + CDS 7906 - 8817 523 ## gi|237733076|ref|ZP_04563557.1| predicted protein + Term 8868 - 8901 6.1 - Term 8856 - 8889 6.1 11 8 Op 1 . - CDS 8936 - 9052 201 ## 12 8 Op 2 3/0.000 - CDS 9070 - 10497 2005 ## COG0591 Na+/proline symporter 13 8 Op 3 3/0.000 - CDS 10541 - 11923 1403 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 14 8 Op 4 7/0.000 - CDS 11920 - 12222 376 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 15 8 Op 5 3/0.000 - CDS 12255 - 13430 1419 ## COG0665 Glycine/D-amino acid oxidases (deaminating) 16 8 Op 6 . - CDS 13427 - 13810 632 ## COG0251 Putative translation initiation inhibitor, yjgF family 17 8 Op 7 . - CDS 13849 - 14574 781 ## HRM2_38490 hypothetical protein 18 8 Op 8 . - CDS 14667 - 15119 -166 ## 19 8 Op 9 . - CDS 15168 - 15914 842 ## COG1414 Transcriptional regulator + Prom 16493 - 16552 6.5 20 9 Op 1 . + CDS 16573 - 17799 1445 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 21 9 Op 2 . + CDS 17832 - 19199 1841 ## COG0786 Na+/glutamate symporter + Term 19222 - 19259 9.4 - Term 19210 - 19247 9.4 22 10 Tu 1 . 
- CDS 19291 - 19788 606 ## COG1846 Transcriptional regulators - Prom 19824 - 19883 1.6 + Prom 20193 - 20252 3.3 23 11 Tu 1 . + CDS 20280 - 21338 717 ## COG4974 Site-specific recombinase XerD + Term 21468 - 21511 7.4 24 12 Op 1 13/0.000 + CDS 21855 - 24149 2386 ## COG2274 ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain 25 12 Op 2 . + CDS 24146 - 25489 1560 ## COG0845 Membrane-fusion protein 26 12 Op 3 . + CDS 25546 - 26565 780 ## RALTA_B1440 metalloprotease, hemolysin-type calcium-binding region 27 12 Op 4 3/0.000 + CDS 26582 - 28858 1203 ## COG2931 RTX toxins and related Ca2+-binding proteins 28 12 Op 5 . + CDS 28305 - 29664 1837 ## COG2931 RTX toxins and related Ca2+-binding proteins Predicted protein(s) >gi|316921506|gb|ADCP01000152.1| GENE 1 97 - 609 492 170 aa, chain + ## HITS:1 COG:no KEGG:DVU2808 NR:ns ## KEGG: DVU2808 # Name: not_defined # Def: TonB domain-containing protein # Organism: D.vulgaris # Pathway: not_defined # 1 169 1 169 169 80 37.0 2e-14 MTEKECLAAGLAVSLFAHFALLIAPRPPEPEPSFTTVVQMDMASPAVATSREKGIGIAAA SPRDMDKADAADRKRQAFLRYLDDIDEAVHARRLDGGETGLIGVAEYMFTVRPDGTFTDP VLRASSGSPQLDASARRAVLAASGKVRRPAIIGTEPIPVILHVKYQYGLR >gi|316921506|gb|ADCP01000152.1| GENE 2 801 - 1358 501 185 aa, chain - ## HITS:1 COG:PM1535 KEGG:ns NR:ns ## COG: PM1535 COG3836 # Protein_GI_number: 15603400 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Pasteurella multocida # 15 184 83 248 266 103 33.0 2e-22 MRVRDSSRAAVLHALDLGARGIIIPDVQSVEEARRLVEYAKYYPLGARGFAFSRSAKYGF LPELKQIGDYFTATNQRTILMPQCETAGALEHIEEIAALDGIDGIFVGPYDLSVALGAPA RFATPRFAEALGRVIAACRASGKLAFIYANTMPEARAYFAQGYQGAAIGTDTAFLVKAIQ GMLQA >gi|316921506|gb|ADCP01000152.1| GENE 3 1397 - 1720 72 107 aa, chain + ## HITS:1 COG:aq_091m KEGG:ns NR:ns ## COG: aq_091m COG3829 # Protein_GI_number: 15607134 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator 
containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Aquifex aeolicus # 4 81 412 489 530 66 44.0 1e-11 MELLANHFLHKFSAALGKRVQAINAEARDLLYRYPWPGNVRELENAIERAVNVCGGQELM PCDFPSAMLEHEAVSSPGRKNGWLWAQETETVCKGAPAPGRIRRMYL >gi|316921506|gb|ADCP01000152.1| GENE 4 1735 - 2133 680 132 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLTYLKHMALAAAAALCAAPAAQAADQPQEWELINPTGEIEKVAVEPAKRITALEGKTIA LRWNGKNNGDLVLDRLAELLAKKYPTAKVVKTYRDMADQNLNKISATQDESMRIVKAVAS VRPDIVIASQAD >gi|316921506|gb|ADCP01000152.1| GENE 5 2266 - 3486 1692 406 aa, chain + ## HITS:1 COG:no KEGG:HNE_2486 NR:ns ## KEGG: HNE_2486 # Name: not_defined # Def: thiol-disulfide oxidoreductase domain-containing protein # Organism: H.neptunium # Pathway: not_defined # 63 381 186 499 529 134 32.0 5e-30 MCFVTVPHPLGMISKEEVNAKVDAAFDDIVKAATAWTPSKEKQDAQVKPYPAKRFKFSGT YADVNDMFQKRKWSLSLPIVPPTVDKVAAMLKGTKRNPAEVLWVVPPRQGMLTVELVAAL GVMAGAKPEHMPLLIATVEAMADPVAAWRGPTTTTAATVPVFFISGPIIEKLKLNPGTGT AGGENPVTNALGYFVNLVGDVVGGSVPPNFDKSTHGSSADLVAMVFTENAKENPWKTTYA EEAGFKPGDSIVTHFSAYLGNANIDHDSKTGQRLLTTLSTGLLGSASGLASCLADYDATY AINNKVSFAFIVLCPEHAATIAKDFPDQRAARDFMRETAAMPYKFYSQETCVPGKDFGPY DENTFIPRFKKTESIKFFVSGGPGKQSQIWVPFPQVFKPVSKKIAE >gi|316921506|gb|ADCP01000152.1| GENE 6 3615 - 4895 1462 426 aa, chain - ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 # Protein_GI_number: 19704482 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 16 417 5 411 411 200 32.0 4e-51 MDLYAFFQSEVKPTIGCTEPGAVAYAAAVAARHLPTQPEGIRLEMSVAIYKNGRSVGIPG TNGLRGLTLACVLGALGGNPDKGLMALEDIPADTVEQAQAFIDAGKMAEEVVQGTPSMWA RVTLSGGGHTVSCTVARKHDHVERLVADGAVLVDQPLEAQSGGTDWTDELTAMHFEELWN LALGIDDAIVRQMLEGARMNMAILDHAGTEQAGIGSALAKHNGHGSLSDKIKEVAGAASD LRMSGGDVAVMSSAGSGNHGIVAVIPVAMTARELGADDRKLAEALALSHLVCGYIKAYTG RLTPTCGCAVAAGAGAAAGIVRLHGGTPRQAELAAITLVGTLLGMICDGAKESCGLKVSN AASEAWSAAMLALENRGIRNTQGIISPDIHELGITLREFNEKIFSAADSVMIGLMTRQNR SAESRA >gi|316921506|gb|ADCP01000152.1| GENE 7 5522 - 5758 60 78 aa, chain - ## HITS:0 
COG:no KEGG:no NR:no MRRQSVARLMPGSLAAAFFPSVCRRFQDGSLEAEHRFAFPDDAGETAAAGCSGRASTLRA RVRANPDTRLIGSPSQPS >gi|316921506|gb|ADCP01000152.1| GENE 8 5810 - 7027 1384 405 aa, chain - ## HITS:1 COG:BH0352 KEGG:ns NR:ns ## COG: BH0352 COG0624 # Protein_GI_number: 15612915 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Bacillus halodurans # 16 390 4 366 374 225 38.0 1e-58 MELESATGKLNAYFDGRQPEIMGMIERLVNMDSFSEDGEDVNKVGETVSGWMREAGFHTE KIAKPAIDPDESWMEKLGNVFSARTHEREAGPGIVFLGHMDTVFPAGTAAARPFRVEGGR AYGPGVADMKAGVVANMFAARALKDLGLIDVPMTLMFSPDEELGAPTATRVYRERISGAR AVICAEPGFPDGGVTTERRGSGHFHMRISGISAHAGRCYEDGASAILELAHKIVALDAFV DAQAQTIVNTGLISGGNSANAVAPWADARIHITFNTVDAAERLVENVRAVAARTFVPRTT TRISGGIRLHPLEYTADVETLFGMAERACAAMGGYTIRRNRALGASEAGFTASVLGIPSI CSMGPEGAELHSPSEYLSVDTVLPRCKMIALTAIQAARAFPSTRV >gi|316921506|gb|ADCP01000152.1| GENE 9 7464 - 7634 263 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCLAEIAEASGLPKSSAYPASKAGLIMFTKCLAVEYAAEGIRANALCPGQILTYMK >gi|316921506|gb|ADCP01000152.1| GENE 10 7906 - 8817 523 303 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237733076|ref|ZP_04563557.1| ## NR: gi|237733076|ref|ZP_04563557.1| predicted protein [Mollicutes bacterium D7] # 176 294 6 123 124 117 43.0 9e-25 MKPHETAFPCETPRLSFRMAGPSCTEGMVTYWPHSAAVTEAGACGGPCQGMHGGGRKAGS QSQEKGLFRFAILSKETGNAVGTLECSSVTEGDDSAKTMRIGLAGQHDSESYLEEALRFA VLTLIPAHALRGLRVIVPHAHERVSLLKQYGFEPSEEGGPALFQRADRTYFDAGKGMALC GLACCVCSENPTCAGCRNEGCKDRSCCQPFNCCKQKKLNGCWECPAFPCDNPMFNKQRVR AFAAFVLEHGEAALIRALQKNEADGVLYHYPGRLVGDYDLPENGSAIRAMLLRGLEAAQD ARS >gi|316921506|gb|ADCP01000152.1| GENE 11 8936 - 9052 201 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWLQLTKYLFYLSFFLNIVCGVWVFFGIIQWLRDLNKQ >gi|316921506|gb|ADCP01000152.1| GENE 12 9070 - 10497 2005 475 aa, chain - ## HITS:1 COG:PA0287 KEGG:ns NR:ns ## COG: PA0287 COG0591 # Protein_GI_number: 15595484 # Func_class: E Amino acid transport and metabolism; R General function 
prediction only # Function: Na+/proline symporter # Organism: Pseudomonas aeruginosa # 3 425 2 416 461 140 27.0 6e-33 MHIVDYFVIAGYFALVMYLGYYSMKKVSNFDDYAVAGRSMPLPIFFAAIAATLCGGGATI GRISFMHTTGIIVFFALTGVVINQIFSGLFISGRVHNAGKNVYSLGDLYGIYYGRSGRLL SSVFGFLFCVGAFGVQILAMGAILQTATGISLIPAALIASIVTLAYTWSGGILAVTLTDA VQYVIIVIGVTLCAYLAIDHLGGFDAMMSILYSNPRFETNMKPLANWSLVQFLGLFFSFL LGEFCAPYYIQRYASTKSAKDSKNGLLIFGVHWIFFMATTAAIGLASMAIQPDVKPDLAF TNLIRDILPPGITGLVFGALLAAVMSTGAAMINTSAVIYTRDIYNKFINTAATQAQLLRQ SRLSTLVVGGISIGVAIIFQDVFGLMIYMFKLWPSAILPPLMCGLLWGKISPYAGAPAVI AGGLSFFLWSDKVLGEPFGIPANFIGIGMNCLVLFFIHQKMKGHKPEGPFLPDLN >gi|316921506|gb|ADCP01000152.1| GENE 13 10541 - 11923 1403 460 aa, chain - ## HITS:1 COG:SMa2225 KEGG:ns NR:ns ## COG: SMa2225 COG0446 # Protein_GI_number: 16263653 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Sinorhizobium meliloti # 5 458 4 457 458 247 37.0 5e-65 MSRQWDALVIGAGPAGLRAASVIAESGLEVVLVDEQAFPGGQIYRHVTAPGAEERFAEAA DYRDGTELVRRFRRSGAEYLPETTVWFLEQDRILASRGEEVLDLRARNVIIAVGAMERPV PFRGWTLPGVMNAGAGDILLKTAGLLPETPVVLAGSGPLLYLVASHLIRKGHPLAAVLDQ TPPSNMLKAAPHLPAGLLGLPLLLRGLSMLNDVRKSGTPVYRDVSGIEAQGADRLERVSF FSKGTAHTLEGATLLYHGGVIPRTHMANALDLPHHWDARQQCWTVACDGCGRTAREGVYV AGDCSRVRGAAVAGIAGELAGLAVAERQEGIGPEEFRRRSAPLLRSLAVQTASKAFLDAW FSPRKDLYAVPDDVAVCRCENVSAADIREAVRDGLTDINEVKLRTRAGMGNCQGRTCGPA LAAIAAETAGRAIPDMGRLHVRSPLRPVPLSALLRLPETK >gi|316921506|gb|ADCP01000152.1| GENE 14 11920 - 12222 376 100 aa, chain - ## HITS:1 COG:PAB0212 KEGG:ns NR:ns ## COG: PAB0212 COG0446 # Protein_GI_number: 14520528 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Pyrococcus abyssi # 14 99 22 106 481 57 33.0 6e-09 MSASQAAAEDAALVNVTFEGKPCAVPAGKTVAAALLGDAHAGYTSIHPLTGERRAPFCMM GTCFECLVEIDGQPNRQACLTIVREGMDIKRQYPDKEQER >gi|316921506|gb|ADCP01000152.1| GENE 15 12255 - 13430 1419 391 aa, chain - ## HITS:1 COG:AGpT61 KEGG:ns NR:ns ## COG: AGpT61 COG0665 # 
Protein_GI_number: 16119833 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 372 1 369 372 199 37.0 8e-51 MNASFDVVIVGGGVTGAAVGYGLAKRGVKTCLVDAVPTFSRASRANMGLIWCQSKALGCP QFVRWGFASSRAYGKLAAELKELSGIDTGYAPTGGIIPCLGEAELEARAQFIEKLRAETE DGTYPASVLSRKELEGMLPKVPFGPEVSGGIWSDMDGYVDPLHLMFAFRKSFVRLGGTLF AGERVSEVKPSGKGYTVVCGGRTLECGKAVLAAGLGVRKLAAQLGTDIPVFPNKSQVMLL ERIPADVLPIPLLGIARTFGGTVMIGAAHENMGMDRRLTPEVLAANAQWAVRVWPELERK RILRFWTGLRVWPKDAYPIYDRIPGHENAFVFAMHSAVSLAAILEQALPDYVMGKPLPPD GAIFGLSRFAGGGSGFQGSAASSPHAEAPNK >gi|316921506|gb|ADCP01000152.1| GENE 16 13427 - 13810 632 127 aa, chain - ## HITS:1 COG:SMb21139 KEGG:ns NR:ns ## COG: SMb21139 COG0251 # Protein_GI_number: 16264466 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Sinorhizobium meliloti # 4 124 3 125 127 125 52.0 2e-29 MGGIERFGLPKQGGVPASQAVKADGWLFVAGQVPKNAEGEFVVGGIAEQARQTLDNLMKT LRQAGYGPEDVVKVNVWLDDPRDFAIFNRIYSQYFSIEHAPARVTVQGHMMNDFKVEMDC VAWKEPR >gi|316921506|gb|ADCP01000152.1| GENE 17 13849 - 14574 781 241 aa, chain - ## HITS:1 COG:no KEGG:HRM2_38490 NR:ns ## KEGG: HRM2_38490 # Name: not_defined # Def: hypothetical protein # Organism: D.autotrophicum # Pathway: not_defined # 9 237 13 239 246 180 42.0 4e-44 MRIPSGYLYYDTPIGILCLDTLFPKPPGQLRNPLTFDFPVVCRVLRGVGAKEILSSTSAQ LETLFVDAARELERDGVKAIAGSCGFMALFQKAVASAVSVPVLMSSLVQIPLIHTLHGPG CRIGVLTAHSGSLTPEHFRQAGVPDAQIDALAIAGMENFPVFRKTILEGAAPVMDTDAVG AEIGAAAAAMTDGQPLDALLLECTDLSVFARTVQEAVSVPVYDINSLIAYAAFCVRRNQW V >gi|316921506|gb|ADCP01000152.1| GENE 18 14667 - 15119 -166 150 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLEGNPHPRRLRVPLPSGGGHYRPGPGARSPQGPQRPVSALFGSPPSGFRFVPPIPFPV PLSSRSVPLSGRFRLSGRGLLAACRHSASALLLEARRIGSFLFEHWWGIIKITIQVSGLE GGKREKCFIGNGFMILFSSFPAAARLDEGL >gi|316921506|gb|ADCP01000152.1| GENE 19 15168 - 15914 842 248 aa, chain - ## HITS:1 COG:mll9009 KEGG:ns 
NR:ns ## COG: mll9009 COG1414 # Protein_GI_number: 13488053 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Mesorhizobium loti # 3 225 5 231 269 95 30.0 6e-20 MDSNQKTIKGAQSVYRVLNILEEVILCGDRMITPKELSASLDIPVATVHRLLTVLRQKNF VACDPATKKYHIGDSCLISPRMDVAAFIRARFLPLAERISRRFGYSTILYARSGYDAVCV ERLDGWHPIQVFLNKTGDRRPLGMGSATLSILAAMPEDEAEMILTHNEQEIRFVLKTDFS ALRGFLAEARRKGYASAQGMLLEGTIGVSHVLRMGNEAVGSIAIDAVRSEQWDKDQPKII QELQSALS >gi|316921506|gb|ADCP01000152.1| GENE 20 16573 - 17799 1445 408 aa, chain + ## HITS:1 COG:SMc02420 KEGG:ns NR:ns ## COG: SMc02420 COG0402 # Protein_GI_number: 15966349 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Sinorhizobium meliloti # 9 404 4 397 397 204 34.0 2e-52 MEVRMAAQLLLQHVKTPSGVCADILVDAGRIVRIAPDLPPSLADTRLDAEGHMAAPGFVN AHIHPAQTFVGGPWVDYDYADTVPGRAKAEAEYLAKHGRMNSLRNCYVQFREAIRRGTTH VRGHIDVGVFGLAELEDAARAREAFRDVLDIEYVAMPSNGILNMKTDPEQTIAAALRAGA SSIGGADPCDRDRDPVRSVELTLRLAAEHGAKVDLHLHEFGSMGLFSLDLLFKKIREYDL RDRVTISHAWCLANLPDTRYQPIAEAFQELGIGIITHAPGHVPFPSLKQARRWQVRYGVG TDNMRNLWGPYGMNDMRERVMLLAYLSDFRRREDLELAYDAGTYGSASVLGLPDYGLSEG CWADMVVFPVENRACALLEGPMPRFVIKRGVLVARDGELTDAVPPCPE >gi|316921506|gb|ADCP01000152.1| GENE 21 17832 - 19199 1841 455 aa, chain + ## HITS:1 COG:Cgl2722 KEGG:ns NR:ns ## COG: Cgl2722 COG0786 # Protein_GI_number: 19553972 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Corynebacterium glutamicum # 8 401 7 387 449 99 24.0 2e-20 MKPFFDMSLLIAFGFLSLLLLLGVLLRAKIKFLQNFMVPACLTGGVIGFFILNTTGFPQV QPELYPALAYHFFTISFVCIGLRGMSKVDQSKGSATKEMVRGSIWQAQMFHMGLCSQLIL ATGIVYLLNALTGQHYLESIGFLAAQGFAAGPGQAISTGLVWEKFGHSGMGQLGMTFAGV GFIMAFAIGVPLVRWGVRKGLNTYPVGSVPEEVKTGLLAPENRPCGLRQVTQNSNVDSLA LQLALVLGAWMLAYYAVEAVATRVPSSIGGTMWGLFFLINMVSGMLIRFAMRKLGIVHLI DGDSMTRLTGWMMEFLLLSTLIGIKFAVVKDYILPIIVLSVVLAVFTLFFILYFGRRVPG 
YSFERTVMMFGTCTGTIPTGLILLRMVDSELKTTVSVEAGMWNMAVFIFFYVNFIFHGYV VYGWGMPATLGLFALTFLANYIILRVFKLIGPKRF >gi|316921506|gb|ADCP01000152.1| GENE 22 19291 - 19788 606 165 aa, chain - ## HITS:1 COG:AGc466 KEGG:ns NR:ns ## COG: AGc466 COG1846 # Protein_GI_number: 15887622 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 161 7 168 168 94 34.0 1e-19 MPKDYTQRRNERWETVFPGIGLESILGRILRIARYHNHKSTETMKPFGIKNGEADVLSAL LHAGPPYTLNPKDITKQAYRTPGAITNVMDHLEEKGLIRRNADRENRRNVLVELTPEGLE MARESFLAQTDMEKRLLTMLSPEEKEQLRTLLKRILLDLEERGEV >gi|316921506|gb|ADCP01000152.1| GENE 23 20280 - 21338 717 352 aa, chain + ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 49 331 3 284 297 88 27.0 2e-17 MATLPEVISNGELPVLDDWEPLSRKRILRYFDRSLEYADVSAYVSAQGELLPDERIILVF LNAVYTQSLATVELYCRYVVELLNYARKPSFSVTARDVESYIRQCRMKGLKPRSVNTIIG ALKSYFKRLADTGAIALNPTAFIKKRKDGAGISLPGNLTHSLSESEMLLLFDRLAEHGAP RRDILLLKTLFMTGLRGEEAVSLCWKDVTVWQGQRYFNVLGKGSKERRIYLPDEIDEGLN EYGKLTGTSPNQPIFGNLRKPSRRIGRHALYHMVKKWLTTLMNRPDVSPHWFRHSCFTYL ASKGVRLESIQALAGHASIDTTMLYNEAAQLMVPAGTAFNKSHSYEPIKRIE >gi|316921506|gb|ADCP01000152.1| GENE 24 21855 - 24149 2386 764 aa, chain + ## HITS:1 COG:VCA1084 KEGG:ns NR:ns ## COG: VCA1084 COG2274 # Protein_GI_number: 15601834 # Func_class: V Defense mechanisms # Function: ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain # Organism: Vibrio cholerae # 40 727 5 695 704 536 43.0 1e-152 MEALPYPALATDAPQEVPVSPPQTGEMRLAPSETDRIPHLIHSLTTLSRLHGHGVSPLLL MSGAADLAASPAACLRAAQRAGLEGRIIPIDDVSFISPLALPCIPLLKHNRSCVLTKLGE HEAEIILPEDGENPRTIPLDLLQADFAGHVLFARPAPKLERRTESEHFFHEKRWFWGVLS HYLPIYKHVIVASVIINIIGVAGSLFTMNVYDRVVPNNATATLWVLTSGILLAYFCDFLL RNLRGYFVDVAGRNADVVISSKLVDKVLSMRFDKKPESTGALVNNLREFEALRDFFSSGT 
LLTFVDLPFLILFLALTAFIGGPLVFVPLSAVPVMLIGGLWIQWTARRASEHGFRQNMQK NALLVEMVNGLETIKGGMAENRMRRQWEQVVGASALASATSRRYTTLATTLSSSLVQLVT VGMIAWGVYLIAEGRLTMGALVGCNILVGRAMSPLMQLSSMLSRWQQSRMALKALDTLMD TPSENDGEDSPVNAATLDPLLNLDDVSFSYPGSRRTSLEDISLNIRPGERVGIIGRVGSG KSTVGRLLIGLYPPDKGAVRFGGVDIRQLNTADLRGRIGYLPQDVVLFYGSIRDNIALDD PAVQERLVSRAAWLAGVTEFVRLHPAGFGAQVGERGMNLSGGQRQSIALARALLHDPDVL ILDEPTSNMDTATENMVRERLQSALKDKTLVLITHRMSMLQLVKRVVIIEEGRIVADGPK EQVLRGLAAAPQAPRPDSRPPEKRTEPTAGSAPLSSHPLPGDAS >gi|316921506|gb|ADCP01000152.1| GENE 25 24146 - 25489 1560 447 aa, chain + ## HITS:1 COG:VCA1080 KEGG:ns NR:ns ## COG: VCA1080 COG0845 # Protein_GI_number: 15601830 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 14 446 30 481 481 259 35.0 6e-69 MNNQEKQKRLKEKEQIAFMDEVDAALHRRPRIGARGLSLGTAVAVILFLIWANHSEIDEV SRGQGQVISSQHTQVIENLEGGILQEMLVYEGQIVDKGTPLARLSNETAESLYRDALGKS RENTIAIIRLRAELEDHEPVFPADLAATSPQIIEDQLLFYATRKKQRTAELEVLHSQYVQ RAQEIEEEKNRKQQAERELGLALQQVGMTQGLVARNLYSKVDFLNQQQKVAALQGDISVL KASIPKAEAAAQEALQKYELRKAEMNSQIVEEVNKRRAELASLTESLAAGSDRVTRTELK SPVRGTIKQIMLKTVGGVAKPGEPIMEIVPLDDTLLVETRIRPGDIAFIRPNQKAMVKVT AYDFSIYGGLQGEVEQISADTIEDRKGDLFYVAKIRTTGNAIIHRGEQLPIIPGMVCTVD ILTGKKTILDFLLKPILKAKQDALRER >gi|316921506|gb|ADCP01000152.1| GENE 26 25546 - 26565 780 339 aa, chain + ## HITS:1 COG:no KEGG:RALTA_B1440 NR:ns ## KEGG: RALTA_B1440 # Name: rtxA # Def: metalloprotease, hemolysin-type calcium-binding region # Organism: C.taiwanensis # Pathway: not_defined # 199 330 294 429 1648 76 36.0 1e-12 MAEYTLRMTSQQNHPPVPHLVPGDRAVLDFPMRGVFAGRQGQDLVFSRTDGGKLVLPGVF AAHTLPGTLPAPGEFPGEAPENLMGKASSLLDTADKPGGTLHLIVHGRDMSLEEFLAALG KENMPESGMAPSARVRFHEYADAELMHGIGGLGGLELSLSRTGYETAPRYDTLPTRQDKE NDQLPARGGLPDTPVPPAHHFPTVDSFAYALAEDGVPDTVSGNALAGAHPGDGNNIFAWV TPPSAARYGSISLNPDGTFTYTIDNSLPIVQELGVGDQVVEHFIYTYTDAAGEKATGSLD ISIIGTNDVPTVAASTASVSDSGSAGVPSGVSGITARAP >gi|316921506|gb|ADCP01000152.1| GENE 27 26582 - 28858 1203 758 aa, chain + ## HITS:1 COG:all7128_2 KEGG:ns 
NR:ns ## COG: all7128_2 COG2931 # Protein_GI_number: 17233144 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: RTX toxins and related Ca2+-binding proteins # Organism: Nostoc sp. PCC 7120 # 48 527 1391 1848 2231 100 26.0 1e-20 MVSFLPQNGTPGAYGTFTLHTDGTWSYDLDNNLPAVRALAPGYSLTETFTYTVSDNHGGT SSNTLTVTINGTNDAPAIAAAATSVTEDTQLTTSGTLPAPSDPDRGDTIAFVPQNGTLGH YGSLTLGADGSYTYTLNNNLYAVQSLGVGETLTDTFTYTVTDSHGAIGSNTLTVTIHGTN DAPAVAAAAASVTEDTQITTSGTLPTPSDTDVHDTVSFLTQRGAPGTYGTFTLSADGSYT YALNNNLPAVQALGVGETLTDTFAYAVTDGHGGIGINTLTVTINGTNDTPTVTAAASSVT EDSQTTASGILPTPQDTDTHDSVSFLAQNGTPGTYGAFTLNADGSYTYILNNSLPAVQSL GVGETLTDTFTYTVTDNHGAIGSNTLTVTISGTNDVPAVAAAAASVTEDTQISASGTLPQ PQDPDLRDTVSFIPKTNETGTYGSLNLNADGSYTYTLDNTSPYVRELGAGETATDTFTYT VSDGHGGTASNTLTVTISGTNDAPTVAAAAASVTEDTQISASGTLPQPQDPDLHDTVAFT PKAGEAGTYGSLTLNADGSYTYTLNNTSPLVQGLGAGGNRCRHLHLHGQRRARRYGLQYA DRNHQRYQRRPDGGSRHRLRCRRHADQRVRYPAPAAGRGHPRHSGLPPPDQYRRALRFPD PQRRRDVYLHAQQRLAAGAEPRHGRIRHRRLYLHRQRRARRHGLQYPDRHHQRHRGSARA DPGHSQRQGRRPAYGQRRGSPALRRGYTGHPDLHGQGQ >gi|316921506|gb|ADCP01000152.1| GENE 28 28305 - 29664 1837 453 aa, chain + ## HITS:1 COG:all7128_2 KEGG:ns NR:ns ## COG: all7128_2 COG2931 # Protein_GI_number: 17233144 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: RTX toxins and related Ca2+-binding proteins # Organism: Nostoc sp. 
PCC 7120 # 1 392 1383 1760 2231 85 30.0 2e-16 MRAETVADTFTYTASDGHGGMASNTLTVTISGTNDAPTVAAATASVAEDTQISASGTLPQ PQDADIHDTVAFLPQTNTAGLYGSLTLSADGTYTYTLNNASPLVQSLGTGESVTDAFTFT VSDGHGGTASNTLTVTINGTEDLPVLTPATASVREDVQLTASGVVPPPFDADIRDTLTFT AKANEAGRFGTLTLNADGTYTYTLNNSLSAVQALGVGETLTDVFQYTVTDSQGGSSSSTL TVTINGTNDIPTVAAAIASVTEDTKITASGILPTPQDTDIHDTLAFTPKVAEAGLYGKLT LAANGSYTYTLNNSLPAVQRLGVGETATDTFTYTVSDGHGGTSTNTLTVTINGTNDAPTI GASTASVTEDTLLTATGSVPVPRDPDVHDVLSIQPMSNVAGTYGTLTLNADGTYTYTLNN SLPAVQALGAGERLLDVFTYTVRDNHGATGSNT Prediction of potential genes in microbial genomes Time: Fri May 13 04:57:26 2011 Seq name: gi|316921504|gb|ADCP01000153.1| Bilophila wadsworthia 3_1_6 cont1.153, whole genome shotgun sequence Length of sequence - 533 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 532 593 ## Slit_1599 outer membrane adhesin like proteiin Predicted protein(s) >gi|316921504|gb|ADCP01000153.1| GENE 1 1 - 532 593 177 aa, chain + ## HITS:1 COG:no KEGG:Slit_1599 NR:ns ## KEGG: Slit_1599 # Name: not_defined # Def: outer membrane adhesin like proteiin # Organism: S.lithotrophicus # Pathway: not_defined # 3 177 3120 3295 3778 108 45.0 1e-22 TGTLTVTISGTNDAPTATAAAASVTEDLALTANGVLPPPRDVDIHDTLSFLPKAAEPGLY GTLTLRADGSYTYTLNNALPAVQALGVGETLTDTFTCTVSDGHGGTGSSTLTITVNGTDD APSVAAAAASVTEDTGLTASGTLPAPTDPDAHDTPVFIPKTAEAGLYGSLTLNADGT Prediction of potential genes in microbial genomes Time: Fri May 13 04:57:35 2011 Seq name: gi|316921498|gb|ADCP01000154.1| Bilophila wadsworthia 3_1_6 cont1.154, whole genome shotgun sequence Length of sequence - 7098 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 6 - 65 2.7 1 1 Tu 1 . 
+ CDS 193 - 678 -272 ## + Term 714 - 757 0.4 + 5S_RRNA 194 - 293 93.0 # CP001197 [R:1104568..1104682] # 5S ribosomal RNA # Desulfovibrio vulgaris str. 'Miyazaki F' # Bacteria; Proteobacteria; Deltaproteobacteria; Desulfovibrionales; Desulfovibrionaceae; Desulfovibrio. + Prom 1201 - 1260 6.2 2 2 Tu 1 . + CDS 1448 - 1924 437 ## gi|212704542|ref|ZP_03312670.1| hypothetical protein DESPIG_02601 + Term 1948 - 1985 5.1 - Term 1936 - 1973 5.1 3 3 Tu 1 . - CDS 1999 - 2883 638 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 3066 - 3125 7.7 + Prom 3076 - 3135 4.9 4 4 Tu 1 . + CDS 3205 - 4092 1160 ## COG2421 Predicted acetamidase/formamidase + Term 4096 - 4146 6.4 + Prom 4118 - 4177 2.0 5 5 Op 1 . + CDS 4380 - 5747 1867 ## COG0531 Amino acid transporters 6 5 Op 2 . + CDS 5839 - 6726 599 ## COG2421 Predicted acetamidase/formamidase Predicted protein(s) >gi|316921498|gb|ADCP01000154.1| GENE 1 193 - 678 -272 161 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVFMEERVHPIPFRTRQLSSPSSMILRTLTWESRTTPRLYFQRAVLSRTALCFARTHPHH GPPLPKPQARRKPLLLFLPLPPPFGVLPPSFFFLQLYPPGSLHNRLIRFLSTFCIHLSRG PENLWVRRRDVLIEMHGSHAVSMKKGREKGRDALGNSSPAT >gi|316921498|gb|ADCP01000154.1| GENE 2 1448 - 1924 437 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212704542|ref|ZP_03312670.1| ## NR: gi|212704542|ref|ZP_03312670.1| hypothetical protein DESPIG_02601 [Desulfovibrio piger ATCC 29098] # 2 158 10 174 176 94 34.0 2e-18 MISDAELLKLVMDRYPEQPAQFLIEQFTVMKAGLIQANNKLAGMDSEALSESLCCAEDAQ KEETVSEPSAPKKKYTRRNLVVKPEEAISEQEIACCICGKTFQNLTTKHIQSHDLTVEEY KKLCGYGPEQKLISSKLLNKLQANVLKAQQAREKKKAE >gi|316921498|gb|ADCP01000154.1| GENE 3 1999 - 2883 638 294 aa, chain - ## HITS:1 COG:BS_ybfH KEGG:ns NR:ns ## COG: BS_ybfH COG0697 # Protein_GI_number: 16077290 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 1 286 3 289 306 
212 43.0 6e-55 MQSKIGCYIIAVAVITVWSTTFVSTKILLRSLSPEEIMFYRHVLAYLVLLAAYPHRHPSG GFREELLFAGAGIFGSTLYFLTENYALKYSMASNVGLLLATTPMLTAIVARFLTTGEPFN RQLAAGFCIAFLGVFLVIFNGHFILKLHPIGDFLAIAAALSWAFYSTLVKKIGNRYNGIY ITRKIFFYAIVTMLPVLAFGDFRYDFSRLWQPTTYLNLIFLGVIASSLCFLIWSKIIWKI GPVAVNNFIYLMPLITMLASWLLLDEHLTPIAIAGGLVILAGVYVSTKAAHRKK >gi|316921498|gb|ADCP01000154.1| GENE 4 3205 - 4092 1160 295 aa, chain + ## HITS:1 COG:PAB0614 KEGG:ns NR:ns ## COG: PAB0614 COG2421 # Protein_GI_number: 14521124 # Func_class: C Energy production and conversion # Function: Predicted acetamidase/formamidase # Organism: Pyrococcus abyssi # 6 288 11 288 298 211 44.0 1e-54 MTELTQYVYAFDKANEPIARAKDGDEFSFRTLDCYSGQVCTEEDIVGESFNFSRTNPATG PLYVEGAMPGDVLVVDVLSVEVADKGAVTTIPNIGPLYDRCVDRTRILPVKDGKTELGGL SIPINPMIGVMGVAPAGEPIACGYAGKHGGNMDSKKLTAGSRVYLPVQVEGALLQMGDIH AVMGDCELCGTGLEIGGVITVRVSVLKGRKLDWPVIETADAWHVVSARKDYTSALIDASA QMQELVCAAYGLDPTDAYLYLSLEGDVCINQGCQPCPVEIVLRISVPKRSDKPLL >gi|316921498|gb|ADCP01000154.1| GENE 5 4380 - 5747 1867 455 aa, chain + ## HITS:1 COG:PA5510 KEGG:ns NR:ns ## COG: PA5510 COG0531 # Protein_GI_number: 15600703 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Pseudomonas aeruginosa # 7 454 2 449 449 370 44.0 1e-102 MAEEKGSVEQFGYKQELHRTLSLWDLTVYGLLFMVIIAPHSIFGYVNRDAQGMAPLVYLV GFCAMFFTALSYVQMAKRFPIAGSVYSYVQRGVNPHVGFLAGWLILLDYVFVPSLLYIFV GNWCQSLIPQIPGWVWILIFIAVNTAINIKGIETTARVDMFFFVVEIGIVLIIVIGGLKY VLGGGGAGELTMTPIYQADKVNLTFIATACTIAALNFTGFDGISTLAEEVSEPEKNVGRA ILIALVVIGTVFFLQTYIVCLIEPDYNKFNASTALFDACAVAIGDWFRIVLLVVNILAVG IANTLNAQAACSRVLFSMGRDRLLPGALAKVHPKFRTPYVAILLIGVVSLACTIIFTDEQ LTKLVNFGAISSFMMLNLAVIWFFFVKEQRRSGMDLVNYLVYPLAGTLILLYVWSGFDHL TQILGFSWLAVGIVFGYIKSKGYKEVPDAFKKSMM >gi|316921498|gb|ADCP01000154.1| GENE 6 5839 - 6726 599 295 aa, chain + ## HITS:1 COG:PAB0614 KEGG:ns NR:ns ## COG: PAB0614 COG2421 # Protein_GI_number: 14521124 # Func_class: C Energy production and conversion # Function: Predicted acetamidase/formamidase # Organism: Pyrococcus 
abyssi # 1 283 5 287 298 228 47.0 8e-60 MRITRERHVYSLGVSEPVATVTAPCSLTVETCDCFNGQVTEDGQPKARLNFSHVNPATGP IVVEGAEPGDVLRVHIRAIRPEKTGALMTAPGAGALPDRVKGDTRICPIADGHFTFMGVE LPLNPMIGVIGVAPAGESVPCGTPGDHGGNMDTIGIREGAILRLPVFVKGAYLGLGDLHA AMGDGEVSVTGLEVFGEVDLDLSLEKGASLPCPLLHDGDEVSFLASAETVDGAIHASTGY MHDFLLAKTSLNADEALMLMSLCGHARISQIVDPLKTARFAMPVSVLAKFGVVVE Prediction of potential genes in microbial genomes Time: Fri May 13 04:58:29 2011 Seq name: gi|316921439|gb|ADCP01000155.1| Bilophila wadsworthia 3_1_6 cont1.155, whole genome shotgun sequence Length of sequence - 58210 bp Number of predicted genes - 65, with homology - 42 Number of transcription units - 21, operones - 9 average op.length - 5.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 49 - 1029 905 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D + Term 1065 - 1103 7.1 + Prom 1185 - 1244 3.1 2 2 Tu 1 . + CDS 1323 - 2459 1220 ## COG0845 Membrane-fusion protein + Term 2605 - 2652 -0.6 + Prom 2699 - 2758 2.6 3 3 Op 1 . + CDS 2786 - 5920 3303 ## COG0841 Cation/multidrug efflux pump 4 3 Op 2 . + CDS 6008 - 6457 482 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 6483 - 6518 5.1 + Prom 6696 - 6755 3.1 5 4 Op 1 1/0.000 + CDS 6820 - 7590 1009 ## COG5266 ABC-type Co2+ transport system, periplasmic component + Term 7792 - 7857 9.5 6 4 Op 2 . + CDS 7880 - 8485 650 ## COG0310 ABC-type Co2+ transport system, permease component 7 4 Op 3 . + CDS 8490 - 9131 687 ## Dbac_3137 hypothetical protein 8 4 Op 4 34/0.000 + CDS 9131 - 9898 879 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 9 4 Op 5 . + CDS 9891 - 10610 399 ## PROTEIN SUPPORTED gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 10 4 Op 6 . 
+ CDS 10669 - 10932 133 ## sce3952 alcohol dehydrogenase (EC:1.1.1.-) - Term 10924 - 10971 -0.4 11 5 Op 1 1/0.000 - CDS 10972 - 12015 566 ## COG0582 Integrase - Prom 12134 - 12193 1.8 - Term 12143 - 12180 0.0 12 5 Op 2 . - CDS 12208 - 13143 355 ## COG3617 Prophage antirepressor - Prom 13379 - 13438 3.2 - Term 13203 - 13243 5.5 13 6 Op 1 . - CDS 13493 - 13864 94 ## 14 6 Op 2 . - CDS 13867 - 14097 135 ## gi|298528620|ref|ZP_07016024.1| hypothetical protein Dthio_PD3630 15 6 Op 3 . - CDS 14112 - 14681 242 ## PTH_2463 hypothetical protein - Prom 14730 - 14789 2.4 - Term 14712 - 14752 6.6 16 7 Tu 1 . - CDS 14795 - 15241 224 ## - Prom 15263 - 15322 3.1 - Term 15267 - 15294 0.1 17 8 Tu 1 . - CDS 15354 - 15512 149 ## - Prom 15546 - 15605 2.6 18 9 Tu 1 . + CDS 16046 - 16249 75 ## - Term 16201 - 16257 2.6 19 10 Op 1 . - CDS 16293 - 16514 117 ## 20 10 Op 2 . - CDS 16518 - 17153 304 ## - Prom 17188 - 17247 6.4 - Term 17800 - 17840 -1.0 21 11 Tu 1 . - CDS 17937 - 19742 1432 ## AZC_3575 phage-related DNA maturase - Prom 19889 - 19948 2.3 22 12 Tu 1 . + CDS 19750 - 20295 -139 ## - Term 20291 - 20341 4.1 23 13 Tu 1 . - CDS 20368 - 20712 254 ## 24 14 Op 1 . - CDS 21150 - 21431 204 ## 25 14 Op 2 . - CDS 21432 - 21674 129 ## Emin_0938 hypothetical protein 26 14 Op 3 . - CDS 21671 - 22237 479 ## 27 14 Op 4 . - CDS 22250 - 23788 379 ## BALH_3339 triple helix repeat-containing collagen 28 14 Op 5 . - CDS 23797 - 24819 467 ## Bcer98_1704 YVTN beta-propeller repeat-containing protein 29 15 Op 1 . - CDS 24935 - 28483 2925 ## AZC_3580 hypothetical protein 30 15 Op 2 . - CDS 28480 - 30603 2125 ## 31 15 Op 3 . - CDS 30606 - 31205 812 ## Dde_2804 hypothetical protein 32 15 Op 4 . - CDS 31213 - 33597 2608 ## AZC_3583 tail tubular protein B 33 15 Op 5 . - CDS 33599 - 34234 733 ## PP_2283 tail tubular protein A - Term 34244 - 34281 8.5 34 16 Op 1 . - CDS 34294 - 35259 1494 ## Rru_A3313 minor capsid protein 10 35 16 Op 2 . - CDS 35295 - 35504 155 ## 36 16 Op 3 . 
- CDS 35585 - 35812 302 ## 37 16 Op 4 . - CDS 35834 - 36430 648 ## COG3772 Phage-related lysozyme (muraminidase) 38 16 Op 5 . - CDS 36360 - 37187 797 ## PP_2281 capsid assembly protein, putative 39 16 Op 6 . - CDS 37215 - 37388 262 ## 40 16 Op 7 . - CDS 37413 - 39077 2072 ## PP_2279 head-to-tail joining protein 41 16 Op 8 . - CDS 39046 - 39291 278 ## 42 16 Op 9 . - CDS 39303 - 39857 720 ## DvMF_1107 hypothetical protein 43 16 Op 10 . - CDS 39948 - 40559 577 ## COG0602 Organic radical activating enzymes 44 16 Op 11 . - CDS 40519 - 40737 217 ## 45 16 Op 12 . - CDS 40747 - 40881 105 ## 46 16 Op 13 . - CDS 40874 - 42835 2074 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase 47 16 Op 14 . - CDS 42832 - 43326 729 ## Rxyl_0488 kinase-like protein 48 16 Op 15 . - CDS 43357 - 44112 989 ## COG3646 Uncharacterized phage-encoded protein 49 16 Op 16 . - CDS 44143 - 44568 463 ## Namu_3760 MazG nucleotide pyrophosphohydrolase 50 16 Op 17 . - CDS 44549 - 44794 326 ## 51 16 Op 18 . - CDS 44761 - 45576 828 ## COG0258 5'-3' exonuclease (including N-terminal domain of PolI) 52 16 Op 19 . - CDS 45557 - 46144 616 ## EUBELI_00925 hypothetical protein 53 16 Op 20 . - CDS 46122 - 46598 327 ## mma_2215 phage N-6-adenine methyltransferase 54 16 Op 21 . - CDS 46607 - 47011 545 ## - Prom 47031 - 47090 4.4 + Prom 46990 - 47049 5.9 55 17 Tu 1 . + CDS 47249 - 47329 57 ## - Term 47238 - 47276 4.0 56 18 Op 1 . - CDS 47441 - 49321 1835 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 57 18 Op 2 . - CDS 49299 - 50108 1084 ## COG3617 Prophage antirepressor 58 18 Op 3 . - CDS 50123 - 50506 474 ## 59 18 Op 4 . - CDS 50508 - 52304 1740 ## COG0305 Replicative DNA helicase 60 18 Op 5 . - CDS 52273 - 52782 339 ## PP_2268 phage endodeoxyribonuclease I 61 18 Op 6 . - CDS 52760 - 53449 1028 ## PP_2267 phage single-stranded DNA-binding protein, putative 62 18 Op 7 . - CDS 53460 - 53633 216 ## 63 19 Tu 1 . 
- CDS 53948 - 56596 2868 ## COG5108 Mitochondrial DNA-directed RNA polymerase - Prom 56621 - 56680 1.8 64 20 Tu 1 . + CDS 56543 - 56731 110 ## + Term 56909 - 56959 10.6 + Prom 56980 - 57039 3.4 65 21 Tu 1 . + CDS 57166 - 58002 641 ## COG1454 Alcohol dehydrogenase, class IV + Term 58064 - 58109 4.1 Predicted protein(s) >gi|316921439|gb|ADCP01000155.1| GENE 1 49 - 1029 905 326 aa, chain + ## HITS:1 COG:CAC1714 KEGG:ns NR:ns ## COG: CAC1714 COG0252 # Protein_GI_number: 15894991 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Clostridium acetobutylicum # 4 324 2 323 331 299 47.0 5e-81 MDNKKVIVLTTGGTIAMKLDPAHGIVPAVSGEDLVASVPGLREACPIEVMEFSNIPSPHM TPKRMFELGNKVEELLRHEDILGVVITHGTDTLEETAYLLDLVHNSEKPVCLTGAMRSAA EISPDGPVNLLCAVRTAASSEARGKGVLVVMNEEIHAAREVVKSHSANTETFVSPFWGPL GYVDPDRVVFRRNTLGRQSIRPPELVEDVHLIKLASGSDSLLIDFLVERNVRGLVLEAFG RGNVPPAALPGIRRAVEKGIPVVITTRTIAGRVLDVYGYEGGAKTVLEAGGILGGETSGP KARLKLMLALGVASGRDELAVFFDTP >gi|316921439|gb|ADCP01000155.1| GENE 2 1323 - 2459 1220 378 aa, chain + ## HITS:1 COG:YPO3132 KEGG:ns NR:ns ## COG: YPO3132 COG0845 # Protein_GI_number: 16123294 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Yersinia pestis # 10 375 12 377 395 333 49.0 3e-91 MSYPRALTRAVLCMLLSLTLLACEEGGKGAPGSSGPREVVIIKLEPRREVYTTALAGRIA SFQVAEVRPQVGGILQQRLFTEGADVKTGQALYQIDPATYEAALDSAQAALMKAEANVTP ARLKAERFRELLAIKAVSKQEYDDAQAAFKQAEADVAVNRAAVKTARINLEYTKVRSPIS GRIGKSAFTPGALVTANQAQALTSVRQLDPVYVDITQSSQDLLRLRAQFTNGELRSAAEE APVRLKLENGAMYPHEGRLQFTDVSVDESTGMVSLRALFPNPEHILLPGMYVRAVIAEGV DENALLVPQRALRRDPKGQASVLLVDGGGKVDVRLVDVGRTVGDSWQVLSGLKPGDRVIV EGGQNVRPGMSVKIRGEG >gi|316921439|gb|ADCP01000155.1| GENE 3 2786 - 5920 3303 1044 aa, chain + ## HITS:1 COG:STM0475 KEGG:ns NR:ns ## COG: STM0475 COG0841 # Protein_GI_number: 16763855 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux 
pump # Organism: Salmonella typhimurium LT2 # 1 1013 1 1019 1049 1241 60.0 0 MARFFIDRPVFAWVIAILIMMAGGISIFRLPVEQYPRIAPPVVTITAKYSGASAQTLEDT VAQVIEQKLNGIDGLLYINSTSDAAGQVSIRLTFDPDTNPDVAQMQVQNKLQLATSSLPE EVTRQGITVTKVADSFLQMYAFVSSDDSMSAADLCDFVGSTILDPLSRVDGVGEVSLFGA PYAMRIWLNPSKLLSYSLTPSDVINAVKAQNKQVSLGEVGGKPIRDGQQMNVTIKAQKQL TSVPEFERILLRVNPDGSAVRLRDVARVELGQESYTSSARYNGKPAAGVGIKLASDANAL NTSNAVAAFIEDMRPYFPHGVEVVSPYDTVPFIKISIIEVVKTLLEAIVLVFAVIYLFLQ NFRATIIPSLAVPVVLLGTFGVMAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVARVM EEDGLPPREATIKTMGQITGALIGVAAVLSAVFVPMAFFGGTVGAIYRQFSLTIVSAMIL SVVVAVVLTPVLCSTFLKPGHMASQHGFFGWFNRSFDRATTAYRGAVGRIIKVGGRMMVI YLAMLVCAGWILWRMPTSFLPNEDQGILTVEIQLGPGSTETETLKIVEQVERYFLENEKE NVHGMMLTLGRASGGKGQSTARGNIRLRDWSERKDPERRAQAIIDRANQAFSSIINARVF VSAPPAIRSLGNATGFDFELQDQAGLGHEALIAARDQLMELARRSPLLRNVRTFGQDDSP QLEVDIDQEKAGAFGLPLDAINTDLSAAWGGKYVNDFVDRSRVKKVYVQADAPFRMKPED FNRWYFRNDKGEMVPFTSIGEARWTYGPMQLERYNGVSAVRIQGRAAAGMSSGTAMLEME RLMGELPEGIGYQWTGMSFQERLSGSQAPFLYALSILVVFLCLAALYESWSIPLSVILVV PLGVLGALVATSARGLSNDVYFQIGLLATIGLAAKNAILIVEFARELFQQGASLADAAME AARLRLRPILMTSLAFLIGVLPLAISTGAGSGSQNAIGTGVMGGTFAATVLGIFFVPVFF VLVFRLFNRKAREGRGTVVPKEKR >gi|316921439|gb|ADCP01000155.1| GENE 4 6008 - 6457 482 149 aa, chain + ## HITS:1 COG:MA3284 KEGG:ns NR:ns ## COG: MA3284 COG0589 # Protein_GI_number: 20092099 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Methanosarcina acetivorans str.C2A # 4 146 7 150 152 77 34.0 8e-15 MELKKILCAVDLKDSVNPAVEYAKMLSELSGASISAIYVVSSRSAYENLQVPVEDIAKGM RSIWSRARGDMDAFVEKQFPGMDVNGLIYEGRPAEKIVEIAKELGVDMIVMGTHAREGLD RLFFGSVANEVVKSAKCPVMTIRPSKKPE >gi|316921439|gb|ADCP01000155.1| GENE 5 6820 - 7590 1009 256 aa, chain + ## HITS:1 COG:BMEI0641 KEGG:ns NR:ns ## COG: BMEI0641 COG5266 # Protein_GI_number: 17986924 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, periplasmic component # Organism: Brucella melitensis # 3 252 2 250 252 212 
48.0 6e-55 MWKTFLMSCIMMLSVSGVALAHFGMVIPSAGTVTDKKDADVTLDLSFTHPMEMQGMPLAK PKAVQVIANGKTEDLKPSLKAKKIVDHGGWTAQYAIKRPGVYQFVMEPQPYWEPAEDCYI VHYTKAYVAAFGEEEGWDEPAGLKTEIVPLTRPFGNYAGNVFQGQVLLNGKPVPGADVEV ELYNKDKKYEAPNEYMVTQVVKADANGVFTYAVPFAGWWGFAALNTADEKLDHDGTPKNV ELGAVLWAEFVDPVKK >gi|316921439|gb|ADCP01000155.1| GENE 6 7880 - 8485 650 201 aa, chain + ## HITS:1 COG:HI1621 KEGG:ns NR:ns ## COG: HI1621 COG0310 # Protein_GI_number: 16273510 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, permease component # Organism: Haemophilus influenzae # 1 196 1 199 206 125 45.0 7e-29 MHISEGVLSPAVLGAGAVLAAAGIVIGLRKLDYDRLMTVAILAAAFFVGSLIHVPIGPSS VHLILNGLLGMLLGWAAFPSIFVALMLQAILFQYGGITVLGVNTFNMAFPAVVCYYAFRP MLLKSARQRIVGAFCCGALSVAGAGLLTALSLSFSAEGFLRTAQILFLAHIPVMIVEGVI TALAVSFIAKVRPEILQFSKE >gi|316921439|gb|ADCP01000155.1| GENE 7 8490 - 9131 687 213 aa, chain + ## HITS:1 COG:no KEGG:Dbac_3137 NR:ns ## KEGG: Dbac_3137 # Name: not_defined # Def: hypothetical protein # Organism: D.baculatum # Pathway: not_defined # 21 213 18 200 202 147 50.0 4e-34 MRMFRLWCMALSLVFAGASFASAHRVNIFAFVDGDAVQVECGFNRSQKVKQGTVEVFDAT TGARLLQGTTDDNGVFRFPVTAELREAGHDLNIRIIAGEGHQNDWTVAADELASSGTPKA VAVAAAEVPATPASLVAGQAAPSAAAPVAAVSGGATPAEIERIVDAALDAKLSPIKRMLA EQTEAGPNLRDIIGGIGWIFGLIGVAAYFRRRP >gi|316921439|gb|ADCP01000155.1| GENE 8 9131 - 9898 879 255 aa, chain + ## HITS:1 COG:BMEI0637 KEGG:ns NR:ns ## COG: BMEI0637 COG0619 # Protein_GI_number: 17986920 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Brucella melitensis # 18 244 22 248 254 129 36.0 7e-30 MFDEPFAHGRSLVHAVDPRFRLVAAFVCAVCLAVVRTPEAAWFGFGIAALLLALSRPPMR PVLRRLAVVNVFIAFLWLTVPLTSGGDVIAAWGPLEVSRAGVLLTLLVTIKSNAIVMTFL ALVATMDSPTIGYALERLRFPSKLVFLFLFTYRYLHVIADEWHKLHGAARLRGFAPKTNM HTYRTFGNMLGMVFVHSFDRSVRVYEAMILRGFSGRFQSVTAFRATSRDAVFAVAAFACM VCLVAFDLYLEFPRG >gi|316921439|gb|ADCP01000155.1| GENE 9 9891 - 10610 399 
239 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 [Haemophilus influenzae PittAA] # 18 210 13 205 205 158 41 8e-38 MAEPIFALEHISFTYPGGRQVFRDMDFALYPDRKIGLYGPNGSGKTTFFRLIMGLAAPQE GRILFHGRPLETEKDFRELRCKVGLVLQHAEDQLFCPTVLEDVAFGPLNLGCTQDEARDR AASTLERLGLAGFENRLTHRLSGGEKKLVSLATVLVMEPEALLLDEPTNGLDPEARQRII GVLKTLPTARIIISHDWDFLAETSSEYLTVEGQHFTDAAPSHAHAHMHVHPLGNKPHEH >gi|316921439|gb|ADCP01000155.1| GENE 10 10669 - 10932 133 87 aa, chain + ## HITS:1 COG:no KEGG:sce3952 NR:ns ## KEGG: sce3952 # Name: gbd # Def: alcohol dehydrogenase (EC:1.1.1.-) # Organism: S.cellulosum # Pathway: not_defined # 4 71 9 76 384 65 50.0 4e-10 MWNWFSPKGIVFGEGKAASTGLFARQWGRGRALVITGPFLRRSGVVEPVLQSLREEGLLG AVFSDIPSEPIYISMWNNKYYSLRKCE >gi|316921439|gb|ADCP01000155.1| GENE 11 10972 - 12015 566 347 aa, chain - ## HITS:1 COG:mlr7741 KEGG:ns NR:ns ## COG: mlr7741 COG0582 # Protein_GI_number: 13476425 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 82 345 52 311 359 149 33.0 7e-36 MKQVIDNKKRKSLPKGIRHLASGKFIADVCINGKRKTKTVDTLEEAIIERQKMLNNDVPT SFTKDPDDTSGWTLKQAYDRTKSLYWEDSAWSYKVASNFKSISTYFGETMPIGKITLDDI DDYVNTLLTKGNSNGTINRKLAILSRIFRTAVERGKLEVVPKIPLRKEAHHRVRFLTQEE ENAFVKAFIQNGYNTHAEVFLILLYTGFRLGECWRIECRDINLELGTITAWKTKNGHPRT IPIVDKIRPILERKIFEAGENGKLFPTADNRWFESAWNRIKRLLGLEKDVQLVPHALRHT CASRLAQRGVSMMVIKEWMGHSNIKTTMRYTHLSPKDLQEAAKILSL >gi|316921439|gb|ADCP01000155.1| GENE 12 12208 - 13143 355 311 aa, chain - ## HITS:1 COG:lin2418_1 KEGG:ns NR:ns ## COG: lin2418_1 COG3617 # Protein_GI_number: 16801480 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Listeria innocua # 66 169 6 106 128 92 47.0 1e-18 MGYTVDKKTGKIILVHDVTVSSKAVKQPETQQQPQRQQEQQTAATLPVIPATSMSSARAS VPGTFVFPVTRQKVRTVWHEGNVWFVAKDVAECLGFTHPQSAIIDHCNHAKVLKGGETPL LTSSPRGINIIPESDVYRLVMRSKLPAAEQFQTWVCEEVLPSIRKTGGYGRVVPASPTAT KPQTEDQLILEAMQVLLSRTETLKAELAEAKPKADYYDTLVDDRDLLTFTEAGKLFGMSA 
RSLAAFLRDAKGTPHHWLFKGFDGANIPYQPIIDRGLMKVKHRTSSLNGMPCTQGYFTPK GIEALRKLLKA >gi|316921439|gb|ADCP01000155.1| GENE 13 13493 - 13864 94 123 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSVQHSLFIDAYIGCLLWAGTDDNGEPLDTHYNIYNLSAESRSRIVRDCHYFMLVASTSD IDLSGLEAQAGHDFWLTRNGHGTGFWDRPEIYGEENARILSIMSHCFGNCDPIITDDEYI ELV >gi|316921439|gb|ADCP01000155.1| GENE 14 13867 - 14097 135 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298528620|ref|ZP_07016024.1| ## NR: gi|298528620|ref|ZP_07016024.1| hypothetical protein Dthio_PD3630 [Desulfonatronospira thiodismutans ASO3-1] # 8 73 7 74 77 80 57.0 4e-14 MCTFYNIDVTDMTIRQINRLFRQHDTSTLWPICGRFNATERAIRRLQRTAEYTYTDGLEY ALALDSEISRIVNGEV >gi|316921439|gb|ADCP01000155.1| GENE 15 14112 - 14681 242 189 aa, chain - ## HITS:1 COG:no KEGG:PTH_2463 NR:ns ## KEGG: PTH_2463 # Name: not_defined # Def: hypothetical protein # Organism: P.thermopropionicum # Pathway: not_defined # 76 153 149 230 237 65 43.0 1e-09 MDAFYTENIVTEKYGYNMTINLYQDIDAPNPFDEFDTLGTLETFTSATEYRERIEQLDKE RSIFVEGSTLRTEYVIYASRASIRTAYNVKRLTKKTLAKAFDCLMSERAIYENWLNGGVT GYIVTDDETGEEIDSCWGFYDDDNDEHALEEAREAAENYRRPIPAWAKNWKLLPGLTAEQ VGNPFLRVA >gi|316921439|gb|ADCP01000155.1| GENE 16 14795 - 15241 224 148 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTALLSEKELEHKADNATALFSVDPSDTFYWREILNHAKDEGFSSDVRDWQRLPGEEKED AIIGLYEKLLTSYDEGILSIDTVVVKDVLLSTGGPASGIEFRLIDCGASYEFQSARYWYQ DWFTPRQYSPIPNDIGERMFEHFGFEYK >gi|316921439|gb|ADCP01000155.1| GENE 17 15354 - 15512 149 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNLTQREAELLAYAFGVSLPKPESEVHLLTMQSGRTWWITFWDGVYKVQEEL >gi|316921439|gb|ADCP01000155.1| GENE 18 16046 - 16249 75 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPGKVLEAKTFVSVLLFVSVQFLYSWLLPAVQGSKEAKGGIVKDILLCNRVDRIKRKGIR LRACSLA >gi|316921439|gb|ADCP01000155.1| GENE 19 16293 - 16514 117 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPALPGYYTTKEAAEKLGYKTPDSLRQYCAQGKIPGAQNIGRMWFIPVEWVERKENEPID PKGNRGLARESKK >gi|316921439|gb|ADCP01000155.1| GENE 20 16518 - 17153 304 211 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MNKEKIKNIGIQVTVALAVIFCLKVYHYIKNKTESKNIVAYSNENKTQVQEKHLYLADEI KVSGNTKEEQIDSFCNNFSVAFNNTPIEKRYELFINIIHISFCRIAMLNIDVDNLEPIEI PTEGMKAIYYIQEAMQKENQPFKLATIYQISYGLKDKYPSLWELKLLLSKNESLKKQFLD QQYKKAITDIELGLNLNELALIKEKIRLAQG >gi|316921439|gb|ADCP01000155.1| GENE 21 17937 - 19742 1432 601 aa, chain - ## HITS:1 COG:no KEGG:AZC_3575 NR:ns ## KEGG: AZC_3575 # Name: not_defined # Def: phage-related DNA maturase # Organism: A.caulinodans # Pathway: not_defined # 2 517 17 529 569 668 64.0 0 MPEKLTDFRVFLTLVWRHLNLPDPTPIQLDIALYLQHGPRRKIIEAFRGVGKSWITAAYV VWRLRQNPNLKFMVLSASKDRADNFTTFCLRLINEIPILQCLIPRADQRCSKLSFDVGPA RADHAPSVTSKGIFSQITGGRADEIISDDVEVPNNSFTQAMRDKLSEAVKEFDAILKPGG TITYLGTPQTEQSLYNQLPDRGYAIRIWPARYPSEDQLINYGNERLAPFILRWLEGEPTL VGRTTDPRRFSDDDLLERELSYGRSGFQLQFMLDTRLSDMEKYPLKLGDLIVMSCSATDA PEKPIWAAGTTNILNDVPCVGLNGDSRYYGPAFLHGTWLPYTGSVMAIDPAGRGKDETAV CVVKMLNGYLYVTAMRAYQEGYSEATLSSIVQLAKQQAVNHVIIEANFGDGMFTKLISPY FTKAHPCRIEEVKHSKQKEARIIDTLEPVMNQHKLVIDKNLILWDYNLSTKNLPPETALK YQLMYQMSRITRDRGSLAHDDRLDSLAMAVGYWVEQMGQDVDKRMLLRQDHLMLEEMKAW EGNAKGVNVKIGITNNPALSEMLFTFHGTVVKGNDDTGGHQHYKRLRNWTGGGRGTRKIL R >gi|316921439|gb|ADCP01000155.1| GENE 22 19750 - 20295 -139 181 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEVGVFVTKPPVVCAVEEKMPRKKKPDYWVGKTSFGAVSSFERGVAGSLFEMLRSGFHPF SGIGGNVWAFPHSEGVERTRSDGPLIASAGRHFTGNTSAALLDAKCPAGFQWAVGCSIGR TDTHSVGPERAFFMLAGFDYFEIVFLHFPIHAEDTVFPNSFEGGCHTLFLQPLAFWMVKH A >gi|316921439|gb|ADCP01000155.1| GENE 23 20368 - 20712 254 114 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDALYWLRKKQATPLRGIFLPHYRIHGIDIIFMKNSDNRASESALAELHGVVAKLLTSRL QSGDASTADINAAIKFLKDNGIDCAGSANPDVQDLVANLPTFEDVSKDEVSLLN >gi|316921439|gb|ADCP01000155.1| GENE 24 21150 - 21431 204 93 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MATPCAHEADISLLNTAIVEIKDTLKDLKELLLSNAVLSEQVSHFKENITSIDIRLRKLE LDVAQGKGANRWVERVVWCLTSAALGYYLKGSV >gi|316921439|gb|ADCP01000155.1| GENE 25 21432 - 21674 129 80 aa, chain - ## HITS:1 COG:no KEGG:Emin_0938 NR:ns ## KEGG: Emin_0938 # Name: not_defined # Def: hypothetical protein # Organism: 
E.minutum # Pathway: not_defined # 1 76 1 81 81 65 43.0 8e-10 MTYGKHILIGLDQFLNTLFMGWPDETLSSRCWRWEQAGIRAWPRKLVDTLFFWQPNHCRS AYESERKRLQCPPELRNAGG >gi|316921439|gb|ADCP01000155.1| GENE 26 21671 - 22237 479 188 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDYGQIIHRTFDDSYVITKNSMPYHVPNEGEFAEEWAEVRAYAEAHPECVVVEQPYVPPV PTLEELKAAKKARIDAETSAAILAGFDYAVDGVTYHFSYDAFDQQNFADTANVCIMKQSG AQGLPDSVMWNAYTVPGGELERLTFDASGFLALYAGGAMRHKNGTMQRGGERKAAVEAAT TAEEVEAA >gi|316921439|gb|ADCP01000155.1| GENE 27 22250 - 23788 379 512 aa, chain - ## HITS:1 COG:no KEGG:BALH_3339 NR:ns ## KEGG: BALH_3339 # Name: not_defined # Def: triple helix repeat-containing collagen # Organism: B.thuringiensis_AlHakam # Pathway: not_defined # 45 180 674 807 1269 106 46.0 3e-21 MQIGKVRPTYKGEWDAGQAYETLDWVLYRGIAYQAIKDVPINREPDAATDYWVATGMKGD KGDKGETGERGPAGVDGKDGAPGIQGPKGDKGNQGIQGPKGDTGATGPQGPQGTAPEHKW AGTKLAFQNPDGSWADPVNLIGAQGVQGPEGPIGKQGIQGPVGPQGPAGPQGVAGPKGTS LNLKGAWAADVAYVCTTVQIDVVTHNGSSYACKKSHTSTSSILPTNTTYWTLIAQRGEPR ELSDSVISSSSNIAASSKAVKTAYDRAETKLSLSGGVMSGRIGGIAGTYNADVTARCVNS ALEIRENGKVKNTQSDIAYAPAIGFHWADVVAGTLVLRSDGIFSFLKQNGSRAVVDCDVP YATAANKLRREGGVDTNWYWSGQGGQPGWLWGGNDGVNMYVYNPANFSVNYANSANYANS SNYANSAGSAPANGGTASAVSGVVVLSLNRDHACVLPSGGTWVYFYFTSYQTDRDGYSSS NHSCGTAAGGSTIASGTGYYSYSLKGIAIRIS >gi|316921439|gb|ADCP01000155.1| GENE 28 23797 - 24819 467 340 aa, chain - ## HITS:1 COG:no KEGG:Bcer98_1704 NR:ns ## KEGG: Bcer98_1704 # Name: not_defined # Def: YVTN beta-propeller repeat-containing protein # Organism: B.cereus_NVH # Pathway: not_defined # 174 293 219 347 689 138 55.0 2e-31 MSYSYVTYTGDGTTQDYIVPFPYLKISDVKVSLDEAEQNALAYSWHTSGTIRFVTSPPNG ASIRIQRITDKVTPAVDFRDGSTLTEADLDLAVTQLLYIAQEAYDALDGETAVAAKDKAE KILKEVEEVFAKTQIEINYFRKMWIDVQESPTAPGRGEYDFTSGKMTLYVPAGPVGPQGP MGQEGPQGLPGAQGEQGPRGIQGPQGIQGERGPEGQQGPMGPQGIQGPKGETGERGPQGP QGIQGATGAQGPRGETGPVGPMGPEGPRGIQGERGPQGPEGPKGATGDKGPIGDSPLPLT FGNFSVNTDGYLQFEYNGGPVDSSMFNLNPETGILEVIIA >gi|316921439|gb|ADCP01000155.1| GENE 29 24935 - 28483 2925 1182 aa, chain - ## HITS:1 COG:no 
KEGG:AZC_3580 NR:ns ## KEGG: AZC_3580 # Name: not_defined # Def: hypothetical protein # Organism: A.caulinodans # Pathway: not_defined # 118 1163 129 1210 1225 333 27.0 2e-89 MNNTIEVTLNGLEGGEEALSSMHGLGEAPLNATDANTMSPSAVAENAPAPSPDTSPKDDD LSFFDYVGDVIKGIANGPVNSVNETIDLAGTILNGGEEVDVAKATKGSGWLTDMSNFGET QTSAGKFAEDISTFVSGFVTGGKLLEGIKVLQGAGKGAIAARGAAKSFYSTVTSFDGHEE MLSNMIQEHPALQNVVTEALAVSKDDNEIVGRIKHGLEDLGIGMAFEGAISLYGGWRLAQ ATSKSAKEKIVAETARQLEQLRGDKEMPHDLAAGTAGKSEPPLPSAAGGKETTPPASEHT LPESQNTDALTPAKAEEHPLKPSEAIKADTIKEHILDVVTSTKSREEVVESLSKDYNIRT HLIRDENGLRILDDINEQISPATLKGQGVETFDAVLKDAERLKYYGMDRIQKVVELAASG DIPLNKAKRTLTLLKDGTEFCSRELYRIAEKMEVNPAAVSPQEMQDFVYLKENLDNLYLA ERNLTTEGGRLLSFMRNEGGIFSDEKMFKWYASPTGGTTEQIASELAKKGYTPDTIKKMA RDIRLNKDNLGAVAQAAHSVKPGSWFNVFNEFRINNMLSGPFTLAANAATNGLKTLLMPA EKYLAGTIMRDDAVQREALDTFSGLFRYWNDSFRLAKKAWKVEDNILDRMGGKMETNSAA MTYENIRNLMLKDAPKGTELSPLQENIARAMGLVGPYLRIPSRLLMSTDEFFKQLNYRSS LSASLLREGREAGIKDAGELARYVEEQLALAFKKDGSAIRGRYADDAVKDSIQYARESTW TQDLGRNTLGGGIQNLANTHPVLRIAIPFIKTPTNLFRDFVAHTPGVAQMTKTYREAIKA GGEQAALAQSKMAMGALMWTGAVMMAHSGQITGSPPKDNKLRQALEATGWQPYSIKVGDK YLSYRRLDPAGMFLGIAADLAVAGQYLNKDQYDDAVSMAVAALSNNVTSKTYMQGISELI DFINDPNEKAIQYFGRMGATLVPFASAARFARQQADDPMREMRDFMDYTMNTIPGWSSTL PARRNWVTGTTINYNLIPSNANDTVLDELNRMAEGIYGPPAKKLHGVELSTAQYSRLNEL HGTTTIGGKTLHESLGELFASQQYDIDRNTIGDPPDKERGPRAVAINRIIHAYRQKAQDE LLSEDDSLRHEVRKSDYQRLASKRGTMTENNQQELLDALLTY >gi|316921439|gb|ADCP01000155.1| GENE 30 28480 - 30603 2125 707 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTAREREAKTIKKDIGGNASLTPAINAQGLSSFSYARHGEVGYDRYAGAGLRQLAANLS SIEPSIAHAHMKLLDRRIAEDKSAASLFAVENPELTKNMEAWRQASEKDERILNMNPYVK KYIKQEILKTSALGFDAALKDAYVTSGMVNERDPEKILKWGQDFRKQYTEQAGIKGEGKD MDQLDIAEHYTAYTTTSLDNLLGKHNRDMESQNANLLEQQMFQNISDTLAGKMNPLTGGY NVHIPAERQSYVTDAAQVIMGKAEEMKKLGYSQDRVLGMLGKAVLMGNHSAAVAEGLAKS LTININGKPVSLLSQPGIAKGIEALKDKEIERAWQAESRSHTREEWARQRAIRNAMSAGT AYGSQNDDLTRETVVDKLHLCTDETYPEFVRNARAAAQGRYLKPENQIDLGRLKYGIITG TDGLAAVEEGIRTGRIPPSEASVYQNLALSQKAGENTNLSSSIQDIGKTFLSAITGASVE 
EAGAMYMAYSTGRKAPVGVIAEAMSQLPGITTEFESFINEQRAKKGKEDAALTQSEMLLY KQQFIAEKLPTSISTLKERYAVEKAATSENAADKKAFSNMMQDRVPTIYDEEQNKWGYTA YNPIKAKAYTSSFNALNTLFPDQIPEQDYRGMHSVQDMLAYAQSHTPGGLSWQNTFFIAV GATPDSLGVTTIQQAIDYIPKHFEQMGYKVKPRPVVLIPNDGGQQNQ >gi|316921439|gb|ADCP01000155.1| GENE 31 30606 - 31205 812 199 aa, chain - ## HITS:1 COG:no KEGG:Dde_2804 NR:ns ## KEGG: Dde_2804 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 25 188 51 214 234 100 38.0 2e-20 MGFDPMTLAVAQFVIGAASSVASGVSASQQAKAQAQYQEEQAAEYARVNELNNKAAAQEY VEQSAAERMAQMQEQDKASRDAQEVQKEALQKKGEMLASTNASGLALDFLMADYERQEAT RKDMIRENYEMSSAKSDLNVNAYKDRAQNRVNGQQNYISPGSSYSTGMNVLGTALGIGGA GATAYDRYWTAKNKLDGVK >gi|316921439|gb|ADCP01000155.1| GENE 32 31213 - 33597 2608 794 aa, chain - ## HITS:1 COG:no KEGG:AZC_3583 NR:ns ## KEGG: AZC_3583 # Name: not_defined # Def: tail tubular protein B # Organism: A.caulinodans # Pathway: not_defined # 4 794 3 784 785 560 39.0 1e-158 MGKLVSSTIPNLISGVSQQPWNVRLPTQAEEQVNCQSSVTDFLKRRPATRHLARIRDTPA ANGIASHHINRDETEQYIVTADASGINVFDLEGNAKTVSVTGTGAAYLAAATAPNRDLRF LTINDYTFVLNRRVAVKTLPDLSPKRQPEAIVFIKQASYNTTYELILNGTTHAFTTEDGI APADEPADKLSSLDICKAIADQIPKDAFSVQTSNSTIWIRRHDGGDFTVKVQDSRSNTHT SVCKGKVQRFSDLPTVAPRGFVTEIIGDASSSFDNYFCVFEPSDAGDAFGSGTWKETVKP GIPCKLDPATLPHALIRQADGTFTFGPLEWGERICGDEDSAPFPSFVGRTLNGLFFYRNR LSFLSGENVVMSEVGEFFNFFLTTVTTLVDSDVVDVAASHTKSSILHHAVTFSGGLLLFS DQSQFVLEHDTVLSNATVSIKPVTEFEASMKAAPVSSGKTVFFATDKGEWGGVREYITLP DNSDQNDASDITAHVPRYVRGNVSRLECSTNEDMLLVLSEEMRTSLWLYKYFWNGSEKIQ SAWSRWDMCGEVLSAAILNTGVYLIMQYGDGVYLEKMDITPGYKDEGETFEYCLDRKITE RDVTLGAYDAINKTTAITLPYDIPAGYTPVVVTRTGGPDAPGNLLRRVDVTGPRTITVEG PDAHGRKLFIGIPYESSYTFSTFAIREGDSKGNAVTTGRLQLRRLTLNCSNTGFLHMHVT PKFRPTSTYTFTGRELGHGTNIIGAIPLYTGTISFPILSLNTQVEVKVGSDSFLPFALVN ASWEGFYNTRNARV >gi|316921439|gb|ADCP01000155.1| GENE 33 33599 - 34234 733 211 aa, chain - ## HITS:1 COG:no KEGG:PP_2283 NR:ns ## KEGG: PP_2283 # Name: not_defined # Def: tail tubular protein A # Organism: P.putida # Pathway: not_defined # 9 187 
7 187 195 127 41.0 3e-28 MSTTSPTPTTELEAVNTMLSGIGEAPVNSLSEVTADVSLARHILNEVSREIQLEGFQWNT EDDYPLTPDIHGLIKLHPSIVRVHFREPSDRELTIRGNQVYDRINHTFTFPQGTAIFCTV TLLLPFEQLPEAARRYTTLKALRIFQERVVGSQVLSQYQQADEARARVQLMGEERRQDRP NLLMGTYPPVGTWRVRDAVMRRNNTTRRLGF >gi|316921439|gb|ADCP01000155.1| GENE 34 34294 - 35259 1494 321 aa, chain - ## HITS:1 COG:no KEGG:Rru_A3313 NR:ns ## KEGG: Rru_A3313 # Name: not_defined # Def: minor capsid protein 10 # Organism: R.rubrum # Pathway: not_defined # 4 318 3 309 309 256 46.0 6e-67 MAENLTLSRPGAQNLGNDPAKMFRDVFTGEVITAFDEHNIMKDWHRMRTITHGKSASFAV MGRANARYHDPGVAILGSNKIAANERTINVDNLLIADVAIYDLEDAMNHYDVRREYSKQL GVALAKRFDETTMRVAVLAARSSGIIDDEPGGSVIKGGATLATDGEKIAEAVFACSQTFD EKDVPEQERCLILRPAQFYLLNQTTKVLNRDWLGAGSYSDGKLDKIAGIKILMSNHLPKA NITAAVDGEKNTYYGDFTNTLGLCMQSNAIATVKLKDLTVQQSGHDFNIVYQSTLMVAKY AMGHGVLNPSYAIELSTAAKA >gi|316921439|gb|ADCP01000155.1| GENE 35 35295 - 35504 155 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLKTVLLLSALTLLPLGCASSAARTNVPLPPVPTTPGAIVTPDGLVCLPPDEAGALLLWM EYAESNGSL >gi|316921439|gb|ADCP01000155.1| GENE 36 35585 - 35812 302 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDFLNQLNPEVLFPALTLCASGLANLLVLLLPLPKEGGSILYRAFHTFINWVALNVGKAK NAATSTDVAKPRRNA >gi|316921439|gb|ADCP01000155.1| GENE 37 35834 - 36430 648 198 aa, chain - ## HITS:1 COG:RSc3192 KEGG:ns NR:ns ## COG: RSc3192 COG3772 # Protein_GI_number: 17547911 # Func_class: R General function prediction only # Function: Phage-related lysozyme (muraminidase) # Organism: Ralstonia solanacearum # 66 189 31 150 153 111 46.0 9e-25 MVGTLLILALWSARWPGLESSAADAVSPQQQQQQEERAIIPSLDTLLAHPITDMIRTEWE GFSPTPYLCPAGYWTIGYGHLCDKDHSPITREQGGRYLAEDLLDALRDVERLAPNLKDEP DHRAIACASWIMNLGKGNFASSTMLKRIREGKWEAAAKEMKRWDKVTVGGKKKPFRALTR RRLTEAHLFLTGEVKTFL >gi|316921439|gb|ADCP01000155.1| GENE 38 36360 - 37187 797 275 aa, chain - ## HITS:1 COG:no KEGG:PP_2281 NR:ns ## KEGG: PP_2281 # Name: not_defined # Def: capsid assembly protein, putative # Organism: P.putida # Pathway: not_defined # 82 273 74 259 259 128 46.0 2e-28 
MEDASKNLTVEVPVTETGPDAPTTTTTTATDSPRRYAGEFDTVEELEAKYQELLKTTTTA APSGEPGDGEGGDPNSNDPDGSPTDDKEKEFASASRDDAEKALSGKGLDIAEFEREFDAT GGLSEESYAKLEQAGLGKDVVDSYIAGRTALLNSFISEVKGLAGGEDGYRAVTEWADKGG LTDAEKESYNRVMNSGDKALIKLAVSGLVAKYREEEGAAPELVTGKASSARREPSDTFDS TEQVIAAMKDPRYGRDPAYTRAVERKVARSRVFGG >gi|316921439|gb|ADCP01000155.1| GENE 39 37215 - 37388 262 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MANVNSAAVNNNGTKKAPEAPQLTLSHPGEATAPAVPPKPGTPIPVGHSGTLVRIDN >gi|316921439|gb|ADCP01000155.1| GENE 40 37413 - 39077 2072 554 aa, chain - ## HITS:1 COG:no KEGG:PP_2279 NR:ns ## KEGG: PP_2279 # Name: not_defined # Def: head-to-tail joining protein # Organism: P.putida # Pathway: not_defined # 15 495 8 481 524 415 46.0 1e-114 MAPADSAFPDNGSLPTKGPAETRYTELSQDRAPYLDRARRCAELTIPYLIPPDDLAQGQE LPSLYQSVGANGVTNLASKLLLTMLPPNEPCFRLRVNNLVVEREEENADKEFRTKIEKAL SRIEQAVLADIEASGDRPVVAEGNQHLIVAGNVLYHDDPKKGLRLFPLSRYVVERDPMGT PVEIVVEETVNLDTLPEDVAERIREAADTLGQPSIKGDDRKDVNIYTHLKRGPKKWSVYQ ECRGVKLPGSEGSYKLEACPWLPVRMYSIAGENYGRSFVELQLGDLGSLESLCQSLVEGS AVSAKVVGLVNPNGVTDPKALAESANGDMIEGNADDVAFLQVQKGADFQVVAAQIQRLEQ RLKTAFLMMDGVRRDAERVTAEEIRVIAQELETGLGGVYTLISQEFQLPYIASRMATMTR QKRIPELPKGTVTPSIVTGFEAIGRGNDKQKLLEFLKAGTELMGESFLGLLNPQNAVTRL ASAMGISTEGLVKDEEELAQERQAAQQQAQGQMMMEKLGPEALRQIGGMAQAGNAEALQG MQQGLEQQMQQQQP >gi|316921439|gb|ADCP01000155.1| GENE 41 39046 - 39291 278 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGGIFDKPSKPKVVEAPAPTVAATPPPPEETAEAPVINEGNKRKNQADSKRKGTSALRID LNLGGGNMGGAGGTSGLSIPR >gi|316921439|gb|ADCP01000155.1| GENE 42 39303 - 39857 720 184 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1107 NR:ns ## KEGG: DvMF_1107 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 17 166 31 191 210 95 39.0 7e-19 MHGPPISALCRAFHNVPSHPPTLTDAHLVFLWHRIREQGLDRFLFYDGGVNSLARFRAIV TNESVWAYAGFSHTTGDPLALALLDRFLGRTAYLHFTFFKGEGFARHLEIGRAFMGLIFE NGTLSCLMALTPAAFRHSWKFGLDLGFTRLGTIPGACGVLDRKTGTIRYRDGMLMKLDNP KPQP >gi|316921439|gb|ADCP01000155.1| GENE 43 39948 - 40559 577 203 aa, chain - ## 
HITS:1 COG:CAC0481 KEGG:ns NR:ns ## COG: CAC0481 COG0602 # Protein_GI_number: 15893772 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Clostridium acetobutylicum # 18 155 23 152 153 63 28.0 2e-10 MLTIIGSEFNPTHNAFEIYVSGCKRHCPGCHNPEAQAFGKGKSARLWMNENRYKFMTGTF SRVWILGGDLMDQAPYEAHEFIRDLRKAMKPGVELWLWTGHEFDEVPLRILGEFDVVKTG AYREDLPGHTVAYDGHDGEPRPLVLASNNQQLHRITAPCPQRDPNSLPMLPNRMKALLSD AFPDSLGSSWKDSTPLSPNVVPA >gi|316921439|gb|ADCP01000155.1| GENE 44 40519 - 40737 217 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METLMMIFFCLVCLGSLFALGFCFVTLYIIVTTLRVLREKISAALDETLPDILEKKRNAH VDHHRLGIQPDA >gi|316921439|gb|ADCP01000155.1| GENE 45 40747 - 40881 105 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHEFLFYALSSFVGFTTGFGLLYFFTEWWEKRQQKKYKTSIWKK >gi|316921439|gb|ADCP01000155.1| GENE 46 40874 - 42835 2074 653 aa, chain - ## HITS:1 COG:TM0385 KEGG:ns NR:ns ## COG: TM0385 COG1328 # Protein_GI_number: 15643151 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Thermotoga maritima # 7 650 6 643 651 350 34.0 6e-96 MIAYSQSYDPEFCELMERLFEEYPEELFRIEGIHPDQLDLHQVSRDFFKRRPIETATADH SIDGNANVSGKDVITFNYDVPKALMKMNSLYNLWNVLREKGGKKEADIIIESEIMGLIYI NDAWDVGRPYCFNYSTYDIALEGLPMGGRLEVDPPKTLFAFLRQVEQFTVYAANSTLGAT GLADLLIVASRYVDKIFENDGMDGHIRVAGHDPLGDLRTEDIWRYVHESLTSLIYTLNWE FRGNQSPFTNVSVYDRHFLEQLAPTYVIEGKAPRMGTIELVQDIFLAAYNETLSRTPITF PVVTACFSVEQGSDDTRTIKDQWFLEKIAKANLKYGFINIYCGESSTLSSCCRLRSSVSD LGYANSFGAGSTKIGSLGVVTLNLPKLAGASVDFEEFLRFIAPVVRMAATINAAKRDFIK DRIARGALPLYTLGFMDLSRQYSTCGFTGLYEALAMLGHDMRTDEGLEAAERVLAAINAE NAKMAKKYGTPHNMEQVPGESSAVKLARKDALLGYELHEGDAPALYSNQFLPLWTEGVDV LDRIRVQGRLDSLCTGGAICHLNIGSEIADAAVMKALIVHAAESGVVYFAVNYQINRCAD GHMTVGHNAAACPICKKPITDVYTRVVGFLTNTKHWNKTRREHDWPERKFAHA >gi|316921439|gb|ADCP01000155.1| GENE 47 42832 - 43326 729 164 aa, chain - ## HITS:1 COG:no KEGG:Rxyl_0488 NR:ns ## KEGG: Rxyl_0488 # Name: not_defined # Def: 
kinase-like protein # Organism: R.xylanophilus # Pathway: not_defined # 2 127 12 137 155 75 38.0 5e-13 MPTLYLMCGLPGSGKSTYVNRHLVPKGVQVVCPDDLRLTYGHAFYGPLEPLIHAQTAQIV RALMHRGLDIVVDECHVRAEHLRKWNGLIKAFGYDVKLIRVVASVEECKARRAAEDPDFP LEVIDRMNVTLLENWMLIKAAYRDRTFTILPTATATAADGGDEA >gi|316921439|gb|ADCP01000155.1| GENE 48 43357 - 44112 989 251 aa, chain - ## HITS:1 COG:PM1774 KEGG:ns NR:ns ## COG: PM1774 COG3646 # Protein_GI_number: 15603639 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Pasteurella multocida # 27 117 40 125 239 68 43.0 8e-12 MSQSVNINTASVSILALNGKKQPAVTSLQVAEVFGKEHYNVLADIERVMPQLSENFVKLN FQVYAYPVETGIGTRVAKAYLLSKDGFTILTMGYTGKEAMAFKEAYIARFNEMEEALRHT SVPAGLPDFTNPAVAARAWADERERADAVTLALEEAKPKAEFVDRYVEAEGLRTFTQAAK TLKIKRADLISLLLEQHLFRDRRGNIQPYAGYVKSGLYVTKETLLEPTKRTVINTYLTPK GFEAVARIVNQ >gi|316921439|gb|ADCP01000155.1| GENE 49 44143 - 44568 463 141 aa, chain - ## HITS:1 COG:no KEGG:Namu_3760 NR:ns ## KEGG: Namu_3760 # Name: not_defined # Def: MazG nucleotide pyrophosphohydrolase # Organism: N.multipartita # Pathway: not_defined # 34 133 41 133 146 82 47.0 7e-15 MQRQHDNDHLAMLAEFMTAMEQPVSQGWEDLEGSILGVNLITEEARELGDALLALSWSDE PDTREAFIKELTDLLYVCYWTAAKIGIDVNEAFRRVHASNMSKLGPDGKPVKREDGKVLK GPGYHAPDLSGVVRNVPITLI >gi|316921439|gb|ADCP01000155.1| GENE 50 44549 - 44794 326 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDAAQYLQDQIRPPYYTRFRIEPLAFITMNHLDFLQGNIIKYVCRYDGKNGAEDLMKARR YLDELITRTQTEEGRNAAATR >gi|316921439|gb|ADCP01000155.1| GENE 51 44761 - 45576 828 271 aa, chain - ## HITS:1 COG:RC1206_1 KEGG:ns NR:ns ## COG: RC1206_1 COG0258 # Protein_GI_number: 15893129 # Func_class: L Replication, recombination and repair # Function: 5'-3' exonuclease (including N-terminal domain of PolI) # Organism: Rickettsia conorii # 11 269 8 266 313 69 25.0 7e-12 MKQPAHNAPYLLIDADVLAYRAAAGAEKVICFEEDSCFPLCSLADAQAAFFGQLYAILDQ LGTSDYALCFSDDHNGGFRRRLFPGYKANRDGKPRPVALKFLREALMRDDNPDASVYIKP GLEADDCLGILATLPSFMRGRQKVIVSVDKDMKSIPGFFYDMGKPDLGIQPVSREDADLW 
HMTQTLIGDAADGYPGCPKIGPMTAKKLFDGIPRDYGHLWPVVVSAFEKVGFGEAEAVTQ ARLARILRAEDWDFNTKEVKLWTPPSTCKTR >gi|316921439|gb|ADCP01000155.1| GENE 52 45557 - 46144 616 195 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00925 NR:ns ## KEGG: EUBELI_00925 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 44 172 49 192 206 105 41.0 1e-21 MKTTILSEYGFHEALLGMGLSHGKTSGITSLWDIRDDASLKERALKLAGLGKGHDKFLRM IVVTLDITAPLYWWKQFDTYKVGTVAQSESTMHTLMKNPLTPEMFEGGLFPDLVRSLNVV GKQDGFETLNRCLPQSFLQRRIVQANYAVLANIIVQRTGHKLPEWKTFIESVLWGVQRPE LLRKAAGISHETASA >gi|316921439|gb|ADCP01000155.1| GENE 53 46122 - 46598 327 158 aa, chain - ## HITS:1 COG:no KEGG:mma_2215 NR:ns ## KEGG: mma_2215 # Name: not_defined # Def: phage N-6-adenine methyltransferase # Organism: M.massiliensis # Pathway: not_defined # 1 142 1 140 145 166 58.0 2e-40 MNPALFSSAKEDWETPREFFERLDGEFHFDLDVCAFPHNAKCPTYFTKEDDGLARDWGNR VCWMNPPYGKAIKAWMTKALDASRRGATVVCLVPSRTDTAWWHDTVIAGGAEVRFARGRL RFVGAEHPAPFPSAVVIFRPPPSPSQQKETNDENNDPQ >gi|316921439|gb|ADCP01000155.1| GENE 54 46607 - 47011 545 134 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTTTTYSLLPVTRIATTPRAHADVTLGFSKRGDLSLIFSPAFLAEHAALGIGSKVLIAY SAEAKKLLIAPPSAEQKEHLRTVRKRVGFPGAGCVVVSARNLPEGMPHPEKQREPVLWEP EENGAVALDLSRLA >gi|316921439|gb|ADCP01000155.1| GENE 55 47249 - 47329 57 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDIQDTLGILIAVLEIWRILDAFLNK >gi|316921439|gb|ADCP01000155.1| GENE 56 47441 - 49321 1835 626 aa, chain - ## HITS:1 COG:polA_2 KEGG:ns NR:ns ## COG: polA_2 COG0749 # Protein_GI_number: 16131704 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Escherichia coli K12 # 16 626 53 640 640 107 24.0 9e-23 MSAPRKREPYLGFGWSWIPTPETPALLFDIETDGLLDETSTIHCICAKDFLTGESFSFGP GEIEDGLGLLCQSRLLVAHNGLCFDIPAIQKLHPSLLLPRLFDTLTASRLIWTNLKDLDF TQLRKKSCRFPPKLAGSHSLDAWGQRLGVMKGDYGKTTENAWSRWSEDMQRYCEQDVEVL EALYRHILDQRYSPEALALEHEFQAVIFHQERTGVWFDERAAQSLYAELAAKRNDAVTAL 
QEVFPQKRIEEIFIPKANNRTRGYVKGVPFTKVRYETFNPASRQQIADRLMEKHGWKPSE FTDTGQPKVDEDVLASLPFPECKPLVDYLELLKIIGMLAEGKNGWLKLVGPDGRIHGRVI TNGAVTGRCTHNSPNLAQIPARGQYGKQCRALFAAPPPLVQVGADASGLELRMLAHYLAA YDGGAYAKVLLEGDIHTANQHAAGLETRDNAKTFIYAFLYGAGDEKLGSIVAPLASSAVQ TKRGRALKDRFFRSLPAIKRLIDDVQGVLTGPGKRPYLIGIDGRHLHIRSSHSALNTLLQ SAGAVLMKLATVIFHMEAQRRGLRLGEDYAQVLHVHDEAQFNTTPEKADALGKLFVESIE LAGRHFGMRCPTTGEYKVGANWAETH >gi|316921439|gb|ADCP01000155.1| GENE 57 49299 - 50108 1084 269 aa, chain - ## HITS:1 COG:lin2418_1 KEGG:ns NR:ns ## COG: lin2418_1 COG3617 # Protein_GI_number: 16801480 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Listeria innocua # 16 123 14 113 128 102 50.0 1e-21 MEAQIQVFRNGAFGSVRVVEHKGEPWFVASDVAKALGYANPQEATREHCKKVNKITQPSK SLTSVKRPPTFINIIPESDVYRLVMRSNLPGAVEFQDWVCEEVIPTIRKSGGYLATKPDD TPETILARAVLIAQDTLKRVEAERDEALRTKAWIGSRREATAMATAASATRRAKALEAQL GLAGDYLAVKGIKWLPEIFDLTKGGTYSQIGKYLKRLSLTENYMVKTAPDSLHESVGLYH KDIIALFRKQLAEIPSLMRKYRHVCAPQA >gi|316921439|gb|ADCP01000155.1| GENE 58 50123 - 50506 474 127 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MATVTKEQRTYAVSRIREAGMKKVGIYKERSCLLREERKLTDADKRELVYAGVVPLRPDL STYDIRNCFDFSAFENKTEYDEEKLRAFSEKTEKEIAKAIDAIMLGDAADIMKVIADFEK KMNGNNK >gi|316921439|gb|ADCP01000155.1| GENE 59 50508 - 52304 1740 598 aa, chain - ## HITS:1 COG:CAC3715 KEGG:ns NR:ns ## COG: CAC3715 COG0305 # Protein_GI_number: 15896946 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Clostridium acetobutylicum # 287 545 185 436 442 84 28.0 7e-16 MNNRKNDKSSERHPYEDSTFSHHEPCPQCQRNGGDWNGDNLARYSDGHGYCHVCGYYETA QGGSGRMDRQKDPVPFDPVPVDAFMALKARGITQETCEHFGYGIGKAGGKYCHIAPLYDH EGILVAQHLRFEGKEFRWRGSASEAVLFGQTLWRRGGRKVIVTEGEIDCLSISQLQGNKW PVVSLPNGSSSGAKYIRASLEWLESFDEVVFAFDMDEPGQKAAKECALLLSPGKAKIARL PMKDANECLVAGKGKELIDALWGAVPYRPDGIRSGAELWEDIKKPPPAGYEIPYPGLNGK LGGVRLGELVLFTAGSGIGKSTIVNEIAYHLMMAHGLTLGVMALEENPARNARRYLGIHL NKPLHLPAAHASVPEADLKAAFDAVMGNGKWYIYDHFGSSDIDTLLSKLRYLAVGLGCKA 
IVLDHISIVVSGLDESEGESERKIIDKLMTRLRSLIEETGILVLAVVHLKRPDKGKSYNE GRPVSLTDLRGSGSLEQVSDVVVSLERDQQGDEPDEATIRVLKNRPLGITGLAGTVRYDR ETGRLLPCDDGDGTGGATYGFQKEETPTVSSPQQSLPPWDTAEGDGNTREQQQQEKEF >gi|316921439|gb|ADCP01000155.1| GENE 60 52273 - 52782 339 169 aa, chain - ## HITS:1 COG:no KEGG:PP_2268 NR:ns ## KEGG: PP_2268 # Name: not_defined # Def: phage endodeoxyribonuclease I # Organism: P.putida # Pathway: not_defined # 24 139 4 122 141 76 38.0 3e-13 MPTFRSSSRSRAAQFARWNEVRRRTGYRSQFEADIAAALEGTGAAYESSRLPYSVTTTAT YTPDWLLPDQCILIEAKGELGKADRDKMQLIRQQYPHLDIRFVLQTPNAKLTKTVTQADW CEKNGFPWCKGPGIPDGWLKHRPGVRSRRAFAAATGTTPHEQQKKRQKQ >gi|316921439|gb|ADCP01000155.1| GENE 61 52760 - 53449 1028 229 aa, chain - ## HITS:1 COG:no KEGG:PP_2267 NR:ns ## KEGG: PP_2267 # Name: not_defined # Def: phage single-stranded DNA-binding protein, putative # Organism: P.putida # Pathway: not_defined # 14 192 2 182 226 99 35.0 1e-19 MATTPEKKKRATYTTPKGESLFAHLVNVDYGTEQYPDEKGSFNVTLALDADAAAKLDSLI AHEVDTARAEAEEKFDGLKPQTKKKFGSVNFNEVGPEEYDREGNPTGRRLFRFKTGAFYE NRQGVKVQRKVPLFDSMQQPVKLSDEPGNGSVIRVAFCCAPYFVEGQGMGGLSLYLNAVQ IIRLNTSGERSASDYGFGAEEGGFTSEGMDDDVASTNAATTPDADVPQF >gi|316921439|gb|ADCP01000155.1| GENE 62 53460 - 53633 216 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGLTCAFLMAAESAGLPVPDLMGMARNCMNDGEGRRPEFRAVDDFINNEILNEQHTR >gi|316921439|gb|ADCP01000155.1| GENE 63 53948 - 56596 2868 882 aa, chain - ## HITS:1 COG:AGc2186 KEGG:ns NR:ns ## COG: AGc2186 COG5108 # Protein_GI_number: 15888521 # Func_class: K Transcription # Function: Mitochondrial DNA-directed RNA polymerase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 26 882 25 863 863 557 37.0 1e-158 MRDARFAPENAHRNHQHHSLSLSPLFARQIELEKEMADRGRDRYYTTVARNTERNAEANT SYGKMLLKRGIEPMARAISEFMESAKQGGAGRRHKAVGMLEGMDVQVVAFIALRKVLDTF SQSMPYQRVSILVGREVEMEKKLTELKEQDPDRYRMTQRYIAGSKARKYRRTVLNYAFGK STTVEYEPWAEADCLHLGQKLIELAMESCGIFHLVQVPSQKVKGKTMFQSTHYSLEPTPS CREWVERHKDHASMMYPDYLPTLIEPKPWEGAGGGGYYSDALPRLSLVKTGNLGYLEALD 
KRIQAGEMDTVINAVNALQNTGWSINPTVFEVASYLWDETDGGVAGLPPRDGYRLPPCPT CGADLTDTAAARVRHTCLDALPLEDFNEWKKEAFRIRELNISLFGQRIGVAKTLTMAARF KDEPRFFFPYQLDFRGRIYAVPSYLTPQGTDLAKGLLRFAEGKPLGTMQAVRWLAIHGSN CFGNDKVSLDDRHSWVLQHQQEILECAEDPFSHAWWHEADEPFCFLAFCLEWAGYVREGL DFVSHIPVAMDGTCNGLQIFSLILRDKVGGSAVNLLPAAKPQDIYQIVADKVIGKLKTDA ADPDKDAIVTTKEGKAFYNPAKSAAILLDMGINRKTTKRQVMVLPYGGTKESCREYSEEW VKDKIRNGYPQLPEDVSIRGLSYYLSTHVWDAIGETVIAARDAMKYLQDMAGVMNKLDLP IKWTTPAGFPVVQKYMDKKERRIKTKIGDSIVKLNISEESTTDLDKPKQKSAISPNYVHS LDAAALMKTVNGCLAQNIGNFAMIHDSYGTHAADSVALAATLRRTFVEMFGGEGVNPLEQ WKNEVLASVPEPMLAELPDLPILPATGDLDVGEVAHSPFFFA >gi|316921439|gb|ADCP01000155.1| GENE 64 56543 - 56731 110 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVLVVAVGVFGRKTGVSHKIILYFKVFMNFKIWAQKKAPSEWADLSLARRRDGFAWGRVL FC >gi|316921439|gb|ADCP01000155.1| GENE 65 57166 - 58002 641 278 aa, chain + ## HITS:1 COG:SPAC5H10.06c KEGG:ns NR:ns ## COG: SPAC5H10.06c COG1454 # Protein_GI_number: 19113731 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Schizosaccharomyces pombe # 1 277 150 421 422 179 39.0 7e-45 MSLLDANGGAVRDYAGVGRVPHPGVPIIAIPTTAGSGSEVSGTTIITEEAAHEKLLVISM YLVPEIAILDPLLTYGMPPALTAATGVDALTHAIESCVSRKATKLTIDFSLRAAGRLARF LPVAFASPDDAEARRECLLGSLEAGIAFTNSSVALVHGMARPLGAKFGIPHGVSNAVLLE AIMRFSLPGAPERYARLAEAMGAVPAGTPLETARAGLGIVSRLVRELHIPRLRDLLSREA LDAQLDAMVREAVASGSPANNPRQASPEEIASLYGEAF Prediction of potential genes in microbial genomes Time: Fri May 13 05:04:06 2011 Seq name: gi|316921407|gb|ADCP01000156.1| Bilophila wadsworthia 3_1_6 cont1.156, whole genome shotgun sequence Length of sequence - 34694 bp Number of predicted genes - 31, with homology - 28 Number of transcription units - 17, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 44 - 427 628 ## COG0509 Glycine cleavage system H protein (lipoate-binding) 2 1 Op 2 . 
- CDS 459 - 1826 517 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 - Prom 1999 - 2058 1.6 3 2 Tu 1 . - CDS 2184 - 3008 1089 ## Ddes_1415 hypothetical protein - Prom 3156 - 3215 4.8 - Term 3083 - 3129 -0.3 4 3 Tu 1 . - CDS 3223 - 3930 706 ## Desal_0411 transcriptional regulator, Crp/Fnr family - Prom 3991 - 4050 5.8 + Prom 3926 - 3985 5.2 5 4 Tu 1 . + CDS 4214 - 5656 1997 ## Dred_0607 hypothetical protein + Term 5680 - 5734 7.4 6 5 Tu 1 . + CDS 5847 - 7310 1795 ## Ddes_1413 hypothetical protein + Term 7328 - 7370 4.2 - Term 7371 - 7433 1.7 7 6 Op 1 . - CDS 7468 - 7800 388 ## COG1733 Predicted transcriptional regulators 8 6 Op 2 . - CDS 7801 - 8076 95 ## + Prom 7658 - 7717 3.5 9 7 Tu 1 . + CDS 7936 - 8493 634 ## COG0655 Multimeric flavodoxin WrbA + Term 8539 - 8577 4.2 - TRNA 8652 - 8728 81.5 # Pro GGG 0 0 - Term 8590 - 8627 7.1 10 8 Tu 1 . - CDS 8865 - 9887 1268 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 9929 - 9988 4.4 + Prom 9962 - 10021 2.8 11 9 Op 1 19/0.000 + CDS 10166 - 10951 769 ## COG0413 Ketopantoate hydroxymethyltransferase 12 9 Op 2 . + CDS 10968 - 11840 947 ## COG0414 Panthothenate synthetase 13 10 Op 1 . - CDS 12293 - 14032 1821 ## COG0642 Signal transduction histidine kinase 14 10 Op 2 15/0.000 - CDS 14040 - 15167 1449 ## COG2205 Osmosensitive K+ channel histidine kinase 15 10 Op 3 18/0.000 - CDS 15210 - 15767 764 ## COG2156 K+-transporting ATPase, c chain 16 10 Op 4 20/0.000 - CDS 15874 - 17904 3015 ## COG2216 High-affinity K+ transport system, ATPase chain B 17 10 Op 5 . - CDS 18082 - 19788 2063 ## COG2060 K+-transporting ATPase, A chain 18 10 Op 6 . - CDS 19798 - 19950 138 ## - Prom 19979 - 20038 4.9 - Term 20365 - 20394 1.2 19 11 Tu 1 . - CDS 20431 - 20733 471 ## - Term 20796 - 20832 1.0 20 12 Tu 1 . - CDS 20874 - 21743 1039 ## COG1284 Uncharacterized conserved protein - Prom 21823 - 21882 2.3 + Prom 21782 - 21841 4.1 21 13 Tu 1 . 
+ CDS 21861 - 22709 876 ## COG0225 Peptide methionine sulfoxide reductase - TRNA 22835 - 22910 78.9 # Ala GGC 0 0 - Term 23062 - 23094 -0.4 22 14 Op 1 . - CDS 23202 - 23939 815 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase 23 14 Op 2 . - CDS 23936 - 24574 800 ## LI0587 hypothetical protein 24 14 Op 3 . - CDS 24574 - 25161 630 ## COG0131 Imidazoleglycerol-phosphate dehydratase 25 14 Op 4 . - CDS 25158 - 26309 1218 ## COG0805 Sec-independent protein secretion pathway component TatC 26 14 Op 5 . - CDS 26312 - 26647 363 ## DvMF_3124 twin-arginine translocation protein, TatB subunit 27 14 Op 6 13/0.000 - CDS 26660 - 28201 2085 ## COG0519 GMP synthase, PP-ATPase domain/subunit - Term 28250 - 28276 -0.7 28 14 Op 7 . - CDS 28314 - 29774 1946 ## COG0516 IMP dehydrogenase/GMP reductase - Prom 29851 - 29910 3.3 + Prom 29967 - 30026 4.5 29 15 Tu 1 . + CDS 30072 - 30494 597 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins 30 16 Tu 1 . + CDS 30745 - 32238 1711 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 31 17 Tu 1 . 
- CDS 32123 - 34639 821 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 Predicted protein(s) >gi|316921407|gb|ADCP01000156.1| GENE 1 44 - 427 628 127 aa, chain - ## HITS:1 COG:lin2519 KEGG:ns NR:ns ## COG: lin2519 COG0509 # Protein_GI_number: 16801581 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Listeria innocua # 3 122 1 120 125 114 49.0 3e-26 MSLTYPEDRSYHSEHLWAKAEPDGTMLVGITDYAQDQLGGVIFVDLPTVGEHFKQGVSCA SIESVKITSDAIIPVSGEIVAVNEALADAPELLNDSPYDKGWLVRVKPDDADEGGRISAA EYAALVG >gi|316921407|gb|ADCP01000156.1| GENE 2 459 - 1826 517 455 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 3 448 4 450 458 203 32 1e-51 MHYDVIIIGGGPGGTTAAKELAAGGKKVAIIEDKHWGGTCLNCGCIPTKMLLGATAPLGL LKAQERLRTMKGSIDIDYKALQTRVGRFLKGSSQTLAKSLAAAGITLYEGRGVCAGKGQA IVRSESGEEMLTTDNIILSCGSSSASFPGLAPDGDAVLDSTGVLNLPEVPESLIIVGAGA IGLELGDFFGMMGSKITIVEAAPHIAPTEDADIAKEMDRVQSKAGRTCITGVMAKSLVTK DGQAELTLGDGRVLTASKALVAVGRTPNTAGLDCEKAGCTLKRRGFVDVNDHLEAAEGVY AIGDVNGLTLLAHAADHQGAYVARRILGHEKGVYVPGPVPSCIYGSTEIMRVGQTAKGLL AAGKSVSVSQVPLTLNPIAQAAGASGGFVKVVWAGDAIAGIAAIGHGVSHLVTVAQLLMV GGYTPERLHEVMIGHPTLDEIVPAAIRAPRVAVTE >gi|316921407|gb|ADCP01000156.1| GENE 3 2184 - 3008 1089 274 aa, chain - ## HITS:1 COG:no KEGG:Ddes_1415 NR:ns ## KEGG: Ddes_1415 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 274 1 274 274 329 63.0 8e-89 MSITLYTAPDCLRCKIVKGFLAEHQIAYETVDFKADAQTFNTFYRANRKAIYRNPEGVEF PLFDDGTVIKQGSGEVIAYLLSGHALEGAVTRSDLLHGWISGLYLSQCPAAEEDHFVELV QQLAKGGLTVCIQTDGRNPALLERLIGLNALAKVQLNIPGPASVYEALYGSAPTKDELAK TIELVKAFPDHEIRFLATPLSGPEGARWPNRDEAGEAARMVFEACGDHQLPYAIAAVTPE MPQGMQGLEPVADPMMLKYRSASREFLFKADIAK >gi|316921407|gb|ADCP01000156.1| GENE 4 3223 - 3930 706 235 aa, chain - ## HITS:1 COG:no KEGG:Desal_0411 NR:ns ## KEGG: Desal_0411 # Name: not_defined # Def: 
transcriptional regulator, Crp/Fnr family # Organism: D.salexigens # Pathway: not_defined # 9 224 8 223 235 108 30.0 1e-22 MTAEIQMPQQHIAIISHRGLHEPWREIVHMGTHLVFPRSHYNHQATTDCFYFIDKGRVRL TYTNAAGEDHVVMFFGKGCIFNETSVITGDEATMACFRVLERTEAYRFPSSLIHDAEFAR QYPHLILNMLRGTATKFSNLFSVSFLIKHGTPLARVCRFLRQLSRSHNNAKAFPLDMTQL ELASVLDIHRVSLFRCLQQLRELGILVRFSRHEIELSELDQLEKLLEEQEEAEWG >gi|316921407|gb|ADCP01000156.1| GENE 5 4214 - 5656 1997 480 aa, chain + ## HITS:1 COG:no KEGG:Dred_0607 NR:ns ## KEGG: Dred_0607 # Name: not_defined # Def: hypothetical protein # Organism: D.reducens # Pathway: not_defined # 4 480 5 480 481 713 70.0 0 MSATIFPTGVTIYNPEKCWSGYTILQAAERGALLIDMNGNEVRMWEGLQGFPNKMLPGGY VMGSLGERNPKYGLQDQTDLVQVDWDGNIVWKFDKAEYIEDPGEEPQWMARQHHDYQREG NTVGYYVPGSDPKTNSGNTLILAHKNVHVPAISDKMLLDDVIYEVDWDGDIVWEWKVSDH FEELGFDAIARNLMYRDPNYYTGFGNSHIAGDWVHTNSMSVLGPNKWYDAGDKRFHPDNI IIDCRDANIILIIEKATGNIVWKIGPYFDQTPELRKLGWIIGQHHAHMIPRGLPGEGNIL IYDNGGWAGYYAPNPGSPTGIRGALRDYSRVLEIDPVTLEIVWQMTPKEAGYLMPLDANR FYSPFISGMQRLPNGNTMITEGSHGRIFEVTPEHELVWEYVCPYWGTALPMNMVYRSYRL PYEWVPQLEKPAESPIERLDVSTFRVPGAAAPGVKSAVKIKEAKGYFGLSAMCVAADIEE >gi|316921407|gb|ADCP01000156.1| GENE 6 5847 - 7310 1795 487 aa, chain + ## HITS:1 COG:no KEGG:Ddes_1413 NR:ns ## KEGG: Ddes_1413 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 7 487 22 510 510 423 48.0 1e-117 MLCGAAATAGAVEIKAKGTWEILSEWSNVAPLNGPNDRFATLQRLRTQVEIISSNALKGV VFFEIGDTNWGKASEGGALGTDAKNIEVRYSYVDWQVPNTDLRVRMGLQPIILPGFVASG FSPVFGHDVAGIMLSQKWSDEVGTTFFWARPFNGNFGGWNGNGSQANGTPRDTPGDAMDL LSLIVPIKTDVMKVSPWGMFGFMGTKSLMGNIKGGDPATWAPRAGLYPILGSGYTFETFP FRRNTETHENAWWLGITAETNSFSPLTVAGDFAYGSVKMGEIPGYEGFADGPGTFKLDRA GWYAAVRADYALDWATPGIIAWYGSGDDGNPYNGSERLPQYNTPWAVSALGFGGGWTDIA TWKVLGHNPGGLWGVVLHLKDISFMEDLKHTLRAGYYHGTNNSAMPKAANMASYPSRIDG PFAYLTTSDDAWELNADTRYKVYENLELAVEAAYVRLNLDEGTWGKKIVNEVDKDSYRVS IGLKYSF >gi|316921407|gb|ADCP01000156.1| GENE 7 7468 - 7800 388 110 aa, chain - ## HITS:1 COG:PM1471 KEGG:ns NR:ns ## COG: PM1471 
COG1733 # Protein_GI_number: 15603336 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Pasteurella multocida # 7 110 13 117 119 140 63.0 6e-34 MSLECITSAKLDETGFGYTLSLISGKYKMTILYCLSEFKVVRYNELKRFIGTISHKTLSL SLKELEADGLVSRKEYPQIPPKVEYSLSKRGQSLIPILDALCEWGEQHRQ >gi|316921407|gb|ADCP01000156.1| GENE 8 7801 - 8076 95 91 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDVHLAHVEAHDSVSGLLRAFRKRFNQCGRVAAGTGTPVQNDDFLGHDCLLWFGGSDYNI PQGGTCLQGILKCTQKMKYAHLCEMLTQRKA >gi|316921407|gb|ADCP01000156.1| GENE 9 7936 - 8493 634 185 aa, chain + ## HITS:1 COG:CAC3334 KEGG:ns NR:ns ## COG: CAC3334 COG0655 # Protein_GI_number: 15896577 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 4 161 2 155 178 108 34.0 7e-24 MAKKIVILNGSPRPRGNTAALIEAFTKGAEQAGNGVVRFDVCKMNIHPCLGCCGGGKNPE SPCVQKDDMDAIYPHYKDADLVVLASPMYYWSLSGQLKCAFDRLFAVMELSPSYENPVKD CVLLMAAEGDTESNFEPVRHYYHALLERLGWKDCGIVYAGGNMHVGDIAGKPQLEEAEAL GASIR >gi|316921407|gb|ADCP01000156.1| GENE 10 8865 - 9887 1268 340 aa, chain - ## HITS:1 COG:VNG0065G KEGG:ns NR:ns ## COG: VNG0065G COG0451 # Protein_GI_number: 15789399 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Halobacterium sp. 
NRC-1 # 17 336 3 308 309 243 43.0 5e-64 MSNYTELLQRIEQEPSTWLVTGVAGFIGSNLLQTLLSHGQRVVGLDNFLTGYQHNLDMVR ELVTPEQWERFRFIEGDIRDVETCLQACQGVEHVLHEAALGSVPRSIENPIMTNGCNIDG FLNMLVAARDCGVKSFVYAASSSTYGDEPNLPKQEDRIGKPLSPYAVTKYVNELYAEVFA RSYGFKTIGLRYFNVFGQRQDPYGAYAAVIPQWFAALTKGDTVYVNGDGETSRDFCYIDN VVQANLLASYAADDARDLVYNVAFGQRTTLNELFGLIRDEVVRHKPEAAEAKPAYRDFRA GDVRHSLADITRARTLLGYEPEYDVRQGLRLAGDWYDAHL >gi|316921407|gb|ADCP01000156.1| GENE 11 10166 - 10951 769 261 aa, chain + ## HITS:1 COG:CAC2914 KEGG:ns NR:ns ## COG: CAC2914 COG0413 # Protein_GI_number: 15896167 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate hydroxymethyltransferase # Organism: Clostridium acetobutylicum # 1 247 20 268 276 281 53.0 1e-75 MVTAYDTPSARIVDRAGVDMVLVGDSLGMVMLGRKDTLSVTLDEMIHHCRAVVQGIHHAL VVADMPFMTYEPGPDTALSNAARLVRESGVRAVKLEGAFLPQIRALVGAGIPVMGHIGLT PQRSAQLGGFKVQGKRAEAARQLLDEAAALEDAGCFAIVLEAIPAPLAAAITARVSVPTI GIGAGADCDGQVLVLHDMLGLYSEFTPRFVKRYAELGTLMEQAVRDYAEEVRSGAFPTPA HGFSMDEEERARLEDRLRETE >gi|316921407|gb|ADCP01000156.1| GENE 12 10968 - 11840 947 290 aa, chain + ## HITS:1 COG:TM1077 KEGG:ns NR:ns ## COG: TM1077 COG0414 # Protein_GI_number: 15643835 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate synthetase # Organism: Thermotoga maritima # 7 286 1 279 280 273 50.0 4e-73 MEMEKTMQIIHSPEALHSVCAEWRRQGLKTALVPTMGYYHAGHASLISYARSVADKVIVS LFVNPTQFGPNEDLEAYPRDFEKDSALVRELGGDVLYAPEPENMYAPDHATWVEVPALAN TLCGLSRPIHFRGVCTVVLKLFMLAQPSLAVFGEKDWQQLAIIKKMVSDLNVPVEVVGRP IVREADGLALSSRNVYLTPEERAQAPQIRKSLEMAAGLAAGGERDAARIIEAVRGYLAEH LPTGEEDYISIVDPAVLTKVDRIEDMALCALAIRLGKARLIDNILLKVGE >gi|316921407|gb|ADCP01000156.1| GENE 13 12293 - 14032 1821 579 aa, chain - ## HITS:1 COG:PA5484 KEGG:ns NR:ns ## COG: PA5484 COG0642 # Protein_GI_number: 15600677 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 37 577 39 595 595 148 26.0 4e-35 MIVLPTMHKRIRRSFLRLVLLFGALGVLMVTGITIAGRVPSELIRMNYDSIAYAQEMVRA 
MNGIRFPELYRDTDTLGWEKRFADTLEQASGNITEEAERKVITDLQASWDVYRQTPDDAN YRGLHARIDALVQVNERGMFLRLSKNARFRDIMMAVTAAAFLLGTFWAFLLADSVAERLA HPLRRVAELFRDRPQLGRKLHLPDPQTLEVRLLFDELSRLWDRLGELDELNVHKLVVEKR KLEVILESVEDGVLVLSAAGDVMLVSHRMLSLLGLAKDDVQGKPWRDLSTSAPNYMALRD GLRDDMQGKQEITLRVGGEERIYAGRRRSLLSGKGAVTGQVFLLSDVTEKRRRDGLRSEM MDWISHELKTPMQSLGLAADLMARRPGLDEEMSMLVETVGQDAARLRTVARQFMEIARMS PSALQLVPDTVDLVERVREWLIPFQLVARESGERLLLSMPETDIPVTIDTERFAWVLSNL VSNALRVGSVGSTVRIVITQEEYDAVLRVEDDGPGIPPELEARLFEPFSHGRTAGTREGL VGLGLAITRDIVEAHGGVIRYARNPGGGAIFTVLLPLAK >gi|316921407|gb|ADCP01000156.1| GENE 14 14040 - 15167 1449 375 aa, chain - ## HITS:1 COG:pli0050 KEGG:ns NR:ns ## COG: pli0050 COG2205 # Protein_GI_number: 18450332 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 8 347 11 347 888 248 41.0 2e-65 MDGFAELIERKRQGSLKVYLGYAAGVGKTYAMLQEGLRLKQQHVDVVIGYLEPHDRPETM ALAEHLETVPPREWRVGDAVFHEMDVEAILARKPQVVLVDELAHTNADGSKNDKRYSDVL EVLAQGINVISTINVQHLESVAARVEEATGIAVRERIPDTVLRRADQVVNVDVTKEELRE RLRQGKIYAPQQAERALSSFFTYENLSFLRELCLREASGDQVRKIEAQELLKPALAGYAV EAVMVALSSWPTDAESLIRRGVRMANQLGSPCYVVYVRRPQESPTRIDAGIQRQLQHNLQ LATRLGAEVVQLEGVDIAETLVNFASERNVRHAIFGKSRLSPLRERLRGSFLLDFLYEAV GVDVHVANVTPKYIK >gi|316921407|gb|ADCP01000156.1| GENE 15 15210 - 15767 764 185 aa, chain - ## HITS:1 COG:AGl2092 KEGG:ns NR:ns ## COG: AGl2092 COG2156 # Protein_GI_number: 15891163 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, c chain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 182 1 184 188 177 53.0 1e-44 MFKLIRQSVLSTLCLAVLLCAAYPMLVTGAADAFFPKEAAGSLILRDGAIVGSALIGQPF SSAKYFHPRPSAAGYNAGSSSGSNYGPTSKALAERVRASEQELRAERPEGVLPVDMLTAS GSGLDPHISPDAALMQVRRVAEARGLPPEAVEKLVRGHVEGPEWGVWGEPKVNVLRLNLA LDELR >gi|316921407|gb|ADCP01000156.1| GENE 16 15874 - 17904 3015 676 aa, chain - ## HITS:1 COG:AGl2090 KEGG:ns NR:ns ## COG: AGl2090 COG2216 # Protein_GI_number: 15891162 # Func_class: P Inorganic ion transport and 
metabolism # Function: High-affinity K+ transport system, ATPase chain B # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 12 675 38 717 718 756 64.0 0 MRHHSQETAPSLYVQAAKDAFRKLDPRVLVRNPVMFVVAVGCVFTTASFFSRVVTGADAG FSGQISLWLWFTVLFANFAEALAEGRGKAQADTLRSARTDTVAHRLLSDGTVETVSASEL APGDRVVVEAGDMIPADGTVVEGVAAVDESAITGESAPVIRESGGDRSAVTGGTTVLSDR IVVNVTQEPGKSFLDRMIALVEGAERQKTPNEIALNILLAGLTLIFMLVVVTLKPLGLYH GINIDMVVLVALLVCLIPTTISGLLAAIGIAGMDRLLQRNVLALSGRAVEAAGDVDVLLL DKTGTITLGNRMASAFLPLPGVNIDEMMQAACLASSGDETPEGRSIVELGNTLGVSLNGV PQDVVSVPFTAETRMSGVDAAGEQLRKGATDSVLQWVHSLGLPERREELKALVDPVAREG GTPLVVASSTRGVLGVIQLKDVVKPGMRDRFDRIRAMGIKTVMITGDNRLTAQAIAAEAG VDDFLAEAKPEDKLQLILDLQKEGRLVAMSGDGTNDAPALAKSDVGLVMNTGTQAAREAG NMIDLDSDPTKLIEIVEIGKQLLITRGALTTFSLANDVAKYFAIIPALFMAALPELGVLN VMHLSTPQSAILSAVIFNALIIILLIPLALKGVRYRPRSTASVLRGNLLVYGLGGVIAPF IGIKLIDIVISAVGLV >gi|316921407|gb|ADCP01000156.1| GENE 17 18082 - 19788 2063 568 aa, chain - ## HITS:1 COG:mll3133 KEGG:ns NR:ns ## COG: mll3133 COG2060 # Protein_GI_number: 13472738 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, A chain # Organism: Mesorhizobium loti # 1 568 1 567 567 536 52.0 1e-152 MILRDLLQLVLFIGLLAALSPLLGRYVHQVFSGRRTFLHPVLGPVERGVYSLSGIHPDEN QTWSRYAVSLIGFSLAGFLLTFGVLLFQDKLPLNPQGFPGLSWDLALNTAVSFLTNTNWQ SYGGETTMSHFSQMVALTYQNFVSAAVGIAVCAAVVRGIARNESSSIGNFWSDLVRITLY VLLPLSIIGALIMLWQGMVQNFDAGVTATTLEGGKQFIATGPVASQIAIKLLGTNGGGFF NVNAAHPFENPTALASFLQSFYILIIPSALVFFLGDAVGSRRHAWTVWAVMLALFIGGVL WIYFSELHGTHLLQQIAGGAVPNMEGKEVRFGVFGTSLFATATTDASCGAVNAMHDSLTP LGGMLVLFNMLLGEIIYGGVGSGLYGMILFILLTVFIAGLMVGRTPDYLGKRIEGREITW CMVALLACATPILCFSAVAAVSSWGTGALNNAGAHGLSEILYAFTSGAQNNGSAFAGLSA NVPVWNVLLALAMFIGRFGVMLSMLVVAGSLAARKKRPISDSSFPVEGPTFGLLLLGIIF IVGALTYLPALSLGPIVEQFQMMRGMMF >gi|316921407|gb|ADCP01000156.1| GENE 18 19798 - 19950 138 50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAVEILRVGNTDVRVTFKELVMDLFDLVGLACAAGLFAYMLYALFKPERF >gi|316921407|gb|ADCP01000156.1| GENE 19 20431 - 20733 471 100 aa, chain - ## HITS:0 
COG:no KEGG:no NR:no MSEEQQVRLNVHTEKMGTSYANAFQVRYSQDDVLVSFGVSLTDLSAEPGVKGVVTADMLE RIAMTPRTAKRLTLTLIQSLREYEARFGEIAVEEKAPEEE >gi|316921407|gb|ADCP01000156.1| GENE 20 20874 - 21743 1039 289 aa, chain - ## HITS:1 COG:CAC0848 KEGG:ns NR:ns ## COG: CAC0848 COG1284 # Protein_GI_number: 15894135 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 18 283 21 286 292 149 33.0 5e-36 MVSTAVRTFARGIPWNLFLLTLGCGLFALGVKAVAIPHMLISGGVFGTALLINYMTETLS PAVWNVLLNIPIFIAGWLFVGRRFVLYSLYGLAVVSVATQYLDMTINITDPILASIAAGC ICGLGLGIVLHSIGCDGGLTIISIALHKRYGISVGQFSLLYNITLFVLAFSMYNPDSMLY SLIMIFVYSKVMDYVSTIFNQRKMVLIISTQHAAIREDILTHLHRGATLLSGSGGFTGEQ RPVLLTVVQNYQIRLLEELIFRHDKQAFVIIENTLNVLGKGFSCVKEYK >gi|316921407|gb|ADCP01000156.1| GENE 21 21861 - 22709 876 282 aa, chain + ## HITS:1 COG:MA1431 KEGG:ns NR:ns ## COG: MA1431 COG0225 # Protein_GI_number: 20090291 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptide methionine sulfoxide reductase # Organism: Methanosarcina acetivorans str.C2A # 115 279 20 187 188 174 52.0 1e-43 MRPLTPDEEQVIAHGATEPPFSGRYWNHKADGTYACRRCGATLYRSGDKFDSGCGWPSFD DAVPGSVLRRPDPDGHRTEIVCAACGAHLGHVFEGERFTPKNTRHCVNSLSLEFIPEKTS ECTEEAIFAGGCFWGVEDAFQSVPGVCDAESGYTGGTVPNPTYEQVCTGGTGHAEAVRVT YDPAKVSFEELARLFFEIHDPTQINRQGPDIGTQYRSAIFYKDERQKATALSLMEKLREH GYAVATELLPASTFYPAEAYHQDFTARTGRGGCHLRVFRFGV >gi|316921407|gb|ADCP01000156.1| GENE 22 23202 - 23939 815 245 aa, chain - ## HITS:1 COG:NMB0629 KEGG:ns NR:ns ## COG: NMB0629 COG0106 # Protein_GI_number: 15676531 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Neisseria meningitidis MC58 # 1 236 1 237 245 208 47.0 9e-54 MILFPAVDIKGGQAVRLRRGRADDSTVFSDDPVAAALQWQEQGAKFLHLVDLDGAFEGVS PNTDLVRRICEALSIPVQLGGGIRDEETAHRWLDAGVARLIIGTLALEDPARFAALCHAC PGRIGVSLDAENGRLKTRGWVGDTPYTVDDVVPRLAEDGAAFLVYTDIERDGMQTGVNVP 
SLTHLARTSKVPVIAAGGVATLDDVKALYPLSVSANLEGAISGRAIYEGTLDLREAMSWI AAQER >gi|316921407|gb|ADCP01000156.1| GENE 23 23936 - 24574 800 212 aa, chain - ## HITS:1 COG:no KEGG:LI0587 NR:ns ## KEGG: LI0587 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 13 212 10 214 214 200 49.0 2e-50 MSRLTRRFFCLAVLAVSISLLAACQKTTPSTPDLPDLKVGVVGVEQPKGTTDLLAGFIPE DRVLASDQAVATFNEELMKLLKTTTHRSYVFIPKAGGADPRERNGALAHWAKIGKDMGVD LLIVPQILDWRERAGSSAGVTTSAAVNMDFYLIDVREPGGALVSRSHFKEKQVGLSDNLM NFDTFLKRGAKWLTAQELAMEGMQKMIKEFGL >gi|316921407|gb|ADCP01000156.1| GENE 24 24574 - 25161 630 195 aa, chain - ## HITS:1 COG:BS_hisB KEGG:ns NR:ns ## COG: BS_hisB COG0131 # Protein_GI_number: 16080543 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate dehydratase # Organism: Bacillus subtilis # 2 194 1 194 194 158 45.0 8e-39 MIRTGHVARTTAETDIDLTLCLEGSGKTSVATGFGLLDHMLTLTAFWAGFDLSLTCKGDT HIDAHHSAEDIALCLGQAFALAVGDRKGIARVGFARVPMDEALAEVTLDISGRPWLEWRG DELLPPVLAGEERDVWREFYKSFAAAARMNVHVAFLYGRNGHHLLESAAKGLGLALAQAA SLTGNVVRSTKGSLD >gi|316921407|gb|ADCP01000156.1| GENE 25 25158 - 26309 1218 383 aa, chain - ## HITS:1 COG:NMA0803 KEGG:ns NR:ns ## COG: NMA0803 COG0805 # Protein_GI_number: 15793777 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway component TatC # Organism: Neisseria meningitidis Z2491 # 131 362 3 235 256 197 44.0 3e-50 MTEREDKTPVAPSGAAEEQEILLDSPIVESEDEPGAVDAPGVSGSKEPDEAAVESTPVVE EDAVEPVPPVEEAIAGNPEAAASADADASAGLPEIPEGASSAWPQEETLPAPAAGGEPPA PVDPPSGEEEEPEEERPMTLLEHLGELRKRLVRGFLAILIGFFACYGFAQQLFYYLSLPL LKVMPADSKFIYTGVAEGFFVDMKVAFVAGVFVACPFLFYQIWAFIAPGLYEEEKKYIIP LALSSALFFILGGVFCYFGVFPFAFEFFMSYSTDNIVAMLSIDEYLSFALKMVLAFGLIF EMPLFSFFLARMGLITAQKMREVRKYAILAIFVVAAILTPPDVFSQLMMAGPMIVLYEVS IWVAVLVGKKKAEREKAAEEGGA >gi|316921407|gb|ADCP01000156.1| GENE 26 26312 - 26647 363 111 aa, chain - ## HITS:1 COG:no KEGG:DvMF_3124 NR:ns ## KEGG: DvMF_3124 # Name: 
not_defined # Def: twin-arginine translocation protein, TatB subunit # Organism: D.vulgaris_Miyazaki_F # Pathway: Protein export [PATH:dvm03060]; Bacterial secretion system [PATH:dvm03070] # 1 92 1 90 96 112 72.0 3e-24 MFGIGGTELLVILVVALIVLGPKSVPQIARTLGKAMGEFRKVSTEFQRTLNTEIELEEHE KRKQEAEKELFGTEQKATAATKPAEAASPAAKPAAAATAAAPTETDASVKG >gi|316921407|gb|ADCP01000156.1| GENE 27 26660 - 28201 2085 513 aa, chain - ## HITS:1 COG:CAC2700_2 KEGG:ns NR:ns ## COG: CAC2700_2 COG0519 # Protein_GI_number: 15895957 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Clostridium acetobutylicum # 196 513 1 316 316 436 64.0 1e-122 MPSKVIIIDYGSQVTQLIARRVREAGVYSEIHPCVVTAAQVKAMQPSAIILSGGPASVGE KDAPALDKGFLDLGVPVLGICYGMQLLAFNLGGGLEQSETREYGPADLRFVRSCALWDGI DLTKLSRVWMSHGDTVKTPPAGFLVTGSTDTLEVAAMADEARKIYAVQFHPEVHHSVDGD TMLRNFLFKIAKITPDWTMSSFVERVINEVRETVGDGHVVCALSGGVDSTVVAVLLHKAI GKNLHCIFVDNGVLRSGEGDEVVTYLREHFDLNLKLVNAQDRFLDLLAGVEDPEKKRKII GHTFIDVFDEEARAIGNVEWLAQGTLYPDVIESVSHKGPSAVIKSHHNVGGLPEKMNMKL IEPLRELFKDEVRQVGAELGMPESVIWRQPFPGPGLAIRVIGEVTRERLDILRKADTIVQ QELRDAGWYRKIWQGFAVLLPLKTVGVMGDGRTYEHVIALRCVDSVDAMTADWARLPYDL LERISRRIINEVKGVNRVVYDVSSKPPSTIEWE >gi|316921407|gb|ADCP01000156.1| GENE 28 28314 - 29774 1946 486 aa, chain - ## HITS:1 COG:aq_2023_3 KEGG:ns NR:ns ## COG: aq_2023_3 COG0516 # Protein_GI_number: 15607007 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Aquifex aeolicus # 202 484 1 284 284 366 66.0 1e-101 MSKILSKGLTFDDVLLVPAYSEVTPDVVDVTAWLTPSISLNIPLVSAAMDTVTESAMAIS MARAGGIGIVHKNMSIERQKLEIEKVKKSESGMIIDPITVEPDDTVEHALDLMHAYRVSG LPVVREKKLVGILTNRDVRFVEDLAGTKVRDVMTSENLITVPTGTTLEEAKHHLHTHRIE KLLVVDEKGELAGLLTMKDIDKVQKYPNACKDSKGRLRVGAAIGVGPEGEKRAAALIAAG VDVLVLDSAHGHSKNILDAVSAIKGAFPDCQLVAGNVATYEGARAMLKAGADTVKVGIGP GSICTTRVVAGVGVPQVTAIMECSRAAREMDRCCIGDGGIKFSGDVVKALSAGANSVMVG SLLAGTEESPGETILYQGRTYKIYRGMGSIDAMKEGSSDRYFQQKSKKLVPEGIVGRVPF 
RGLASETIYQLVGGVRSGMGYCGAATIQELFEKSQMVEISAAGLRESHVHDVIITKEAPN YRVDNA >gi|316921407|gb|ADCP01000156.1| GENE 29 30072 - 30494 597 140 aa, chain + ## HITS:1 COG:RSc1359 KEGG:ns NR:ns ## COG: RSc1359 COG0589 # Protein_GI_number: 17546078 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Ralstonia solanacearum # 3 140 26 168 168 73 35.0 8e-14 MAYNRILLPVDGSEHSMHAVAHAVSLVREGGEIVVATVLPPIPNVIGGDARKEAEEAVKT DASLITQPVMDVIAKDNIACREKIVLHTSTAEGIIETAEDMKCDLIVMGSRGRSDIEGLL LGSVTHKVLTLATVPVLVVR >gi|316921407|gb|ADCP01000156.1| GENE 30 30745 - 32238 1711 497 aa, chain + ## HITS:1 COG:mlr2852 KEGG:ns NR:ns ## COG: mlr2852 COG1502 # Protein_GI_number: 13472525 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Mesorhizobium loti # 25 497 12 487 487 246 34.0 5e-65 MLYISLLLALLAVACLIFVPNTVILLSFVAIMHGFAVYSACHALLHKRDSRAALGWIAII LFMPPVGLPLYWLFGIARIDSQAVRLMEKAAQKVMGGLVDLHGKSLTEKPEGFIDKDEVP HAWHYLVNPGSSITGRCMLEGNDLTVLHNGEAAYPVMLKAISEAKERVFLSSFIFGYDEV GKSFAAALRAAAERGCDVRLLMDGVGSFHLWPSWRKRLGPDVKLAYFLPPRLIPPQFSIN LRTHRKVLICDSETGFTGGMNISQNHLANLKRRSRVQDVHFRCLGPIAQQLEIAFLMDWS FATGDHANFTVHPVKKQGSSLCRILMDGPGTPEDTILDLVCAQISAARHSVRIMSPYFLP PHRLHGELVSAVLRGVDVSVILPGENNHRLVDWAMRHQMPMLADRGVKIYRQPPPFAHTK LLLIDGEYTLLGSANLDPRSLNLNFELVMEVLDLRLAEQLSAFYDEVRSRSVSVPTTYPA LPYRLRNAASWIFSPYL >gi|316921407|gb|ADCP01000156.1| GENE 31 32123 - 34639 821 838 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 61 798 18 701 730 320 31 6e-87 MKKHPRQRGYFGDTVRKGASGPARPGKKNARRTDDEQTFVSVPGGLDAEEVLGLMGKIRH PVRLDDLIRFLDLSRRDKKPLENLLDALQAEGRVIRLRGGKWVEASQARIVTGVLSIQRS GAGFVTPDAVQDSPEDAPKPVDGDKRNRVQSRLQPDIFIHPGFLGDAWHGDRVEVALQPA PQHRGRKMRNPEGRIIRVLERRQKELAVHVTRNQTQRGVLCRPADPRLDFLLDVDVSALK 
AAPKIGELLLVTPEEKVEDGLWRGTARVSLGLEEDAMVQERLTKLNHEIPLDFPPNVLAE AAELEKRAADLGGLKPLGSVEDALPIDRVSKASTSKRQDLRGVPFVTIDGEDARDFDDAV YVRRLPGAPDGAVWELWVAIADVSHFVLPGSRLDREARDRGNSYYFPTSVEPMLPEVLSN GLCSLRPDEERLVMAARVGFDDRGIPRTSAFFPGVIRSRGRLTYEGVQEALDHGGELAGR HPYLKDAEALAHLLLERRQARGSLDFDLPEAQFIVNRSTGEVEGIKRRVRLFAHRLIEEF MLAANEAVARFLTEKGTPFPYRIHPAPDPDRLSTLFRTLASTDLAQSDLLTLKRGETPSP SRLRDILVQANGTPQEYLVGRLVLRSMMQARYSPEAGEHFGLASPCYCHFTSPIRRYADL LVHRALSFTLGATPGPILAGHKLLMAADQCNARERAATEAEREIARRLGCLLLRERTGET FTGVVSGVTEFGFFVEFDEMPVEGMVRLTTFRDDWFEYDPDRQELIGVGTGRRFRLGQAV TVRLIDVHIGRLEVNLELEGTTDTAKRSSSRRSAGGREAPGRSSGRKRNGFVPRHKRR Prediction of potential genes in microbial genomes Time: Fri May 13 05:05:10 2011 Seq name: gi|316921403|gb|ADCP01000157.1| Bilophila wadsworthia 3_1_6 cont1.157, whole genome shotgun sequence Length of sequence - 2469 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 170 - 559 435 ## COG1733 Predicted transcriptional regulators - Prom 590 - 649 3.7 + Prom 529 - 588 2.4 2 2 Tu 1 . + CDS 711 - 1367 807 ## COG0655 Multimeric flavodoxin WrbA - Term 1272 - 1304 -1.0 3 3 Tu 1 . 
- CDS 1364 - 2419 775 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase Predicted protein(s) >gi|316921403|gb|ADCP01000157.1| GENE 1 170 - 559 435 129 aa, chain - ## HITS:1 COG:CAC0195 KEGG:ns NR:ns ## COG: CAC0195 COG1733 # Protein_GI_number: 15893488 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 23 118 16 111 115 114 52.0 5e-26 MGGKRRTGYSTELPGANVYTVPCPIIHALNLIGGKWKLPVLWHLFDAESVRYNELKRSVV GITNIMLTQCLRELEAAGLVSRNDFGEVPPRVEYSLTGRGRELLPALRELYKWGEKQLAR TAEADPPSS >gi|316921403|gb|ADCP01000157.1| GENE 2 711 - 1367 807 218 aa, chain + ## HITS:1 COG:MA1860 KEGG:ns NR:ns ## COG: MA1860 COG0655 # Protein_GI_number: 20090710 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 216 1 214 222 229 52.0 2e-60 MTYYAINGSPRKTHNTATILNKALEGVKAADPQAQTELIHLYSHDFSGCVSCFECKRLDG KSYGKCAVRDGLTPILERLADADGIIFGSPVYFHSITGKMRMFFERLLFQYFVYDADYSA LSPKRMPTAFVYTMNVPYQVMLESHYTESFQSMESFIQRVFSKPETLYVNDTYQFSDYSK YKADRFSETDKARHREQQFPIDCQKAFELGKALVKRNG >gi|316921403|gb|ADCP01000157.1| GENE 3 1364 - 2419 775 351 aa, chain - ## HITS:1 COG:PA2981 KEGG:ns NR:ns ## COG: PA2981 COG1663 # Protein_GI_number: 15598177 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # Organism: Pseudomonas aeruginosa # 13 341 20 330 332 165 37.0 1e-40 MNPLSWQKRLRPLLTPVSLLYRRILAARRAKWENEGSPAFRPVCQCFSVGNIAWGGTGKT PVVDWLLGWSEARGLRAVVLSRGYKAQPPELPLHVSPSRTPGEAGDEPLMLALEHPSAAV MVDPDRRRSGRWAEERLSPDLFVLDDGFQHVKVRRDLDIVLLRPDDLGDGWGRVIPAGAW REGPEALERADVFMIKCSPEAWESLRPACERRLAGFRRPLFSFSLRPGSLRKIGTGECRD ADAFAGKPYALATGVGDPAQVWETVARFMGREPAEYIVYPDHHAYTPEDAAKLEGLGMSV ICTSKDAVKLRELRLADAWALRVEAAFGPALWTEAAFPHWLDAWAKRKGLF Prediction of potential genes in microbial genomes Time: Fri May 13 05:05:17 2011 Seq name: gi|316921394|gb|ADCP01000158.1| Bilophila wadsworthia 3_1_6 cont1.158, whole genome shotgun sequence Length of sequence - 12605 bp 
Number of predicted genes - 9, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + 5S_RRNA 90 - 189 93.0 # CP001197 [R:1104568..1104682] # 5S ribosomal RNA # Desulfovibrio vulgaris str. 'Miyazaki F' # Bacteria; Proteobacteria; Deltaproteobacteria; Desulfovibrionales; Desulfovibrionaceae; Desulfovibrio. - Term 587 - 628 6.1 1 1 Op 1 . - CDS 651 - 1499 461 ## DVU2621 hypothetical protein 2 1 Op 2 . - CDS 1542 - 2186 745 ## COG0352 Thiamine monophosphate synthase 3 1 Op 3 . - CDS 2198 - 3934 1649 ## COG1001 Adenine deaminase - Term 4325 - 4367 -0.8 4 2 Op 1 . - CDS 4382 - 5926 654 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 5 2 Op 2 . - CDS 5937 - 6113 124 ## - Prom 6178 - 6237 2.8 + Prom 6106 - 6165 4.3 6 3 Tu 1 . + CDS 6237 - 7685 1368 ## LI0461 hypothetical protein + Term 7723 - 7762 7.5 + Prom 7706 - 7765 1.8 7 4 Tu 1 . + CDS 7787 - 9235 1671 ## LI0461 hypothetical protein + Term 9273 - 9310 6.2 + Prom 9392 - 9451 3.1 8 5 Tu 1 . + CDS 9519 - 10952 1657 ## COG1966 Carbon starvation protein, predicted membrane protein + Term 10962 - 10999 6.1 + Prom 11036 - 11095 4.1 9 6 Tu 1 . 
+ CDS 11160 - 12566 1283 ## COG0534 Na+-driven multidrug efflux pump Predicted protein(s) >gi|316921394|gb|ADCP01000158.1| GENE 1 651 - 1499 461 282 aa, chain - ## HITS:1 COG:no KEGG:DVU2621 NR:ns ## KEGG: DVU2621 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 75 260 68 212 270 108 39.0 2e-22 MLRDGGRAHSCSALQEALDLRLIRSYVFDMNGIFHTPKRFALLLLAAFALANGPVWAAPT PVADKGGDKKEEAVWGEMPAGAKTTYIGIHGGTVPVSLLTAGDGSPMIALVGLTGSDFLN FLRTNDRLQEEPLFADNTGIYAPGAHPSDISDSPVQPTAADVPTPHLTGNATQPITGNAT QPMAPSEQAVQQTADNQPTELPATDPAIPSESNATLLLAGSSSGDIPVIYATGRNFERLS LVPANWAPFGLTEKALSLEGEVKVLAPPKMMPKRHTIKKKRH >gi|316921394|gb|ADCP01000158.1| GENE 2 1542 - 2186 745 214 aa, chain - ## HITS:1 COG:BH1431 KEGG:ns NR:ns ## COG: BH1431 COG0352 # Protein_GI_number: 15613994 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Bacillus halodurans # 14 201 6 193 211 160 43.0 1e-39 MPRLLPGTSPETDLYALTDDELSLGRPAVEVAKALLDSGIKILQYREKEKKAGQMLKECL ELRRLTREAGACFIVNDHVDIAVLCEADGVHVGQEDLPVEAVRRLVGPDMIIGLSTHTPD QARAAVASGADYIGVGPIYPTQTKKDVCAPVTLDYLDWVVANITLPFVAIGGIKRHNIAD VIAHGARCCAIVSEFVSAPDIPARVAEVRSAMKK >gi|316921394|gb|ADCP01000158.1| GENE 3 2198 - 3934 1649 578 aa, chain - ## HITS:1 COG:CAC0887 KEGG:ns NR:ns ## COG: CAC0887 COG1001 # Protein_GI_number: 15894174 # Func_class: F Nucleotide transport and metabolism # Function: Adenine deaminase # Organism: Clostridium acetobutylicum # 15 573 11 570 570 516 46.0 1e-146 MYDIDDLKNLVDMAAGGRSPADLLITNAKLVDVLTGEIRETDVSVGGGKILGFSHTEAVK VVDARGAYLLPGLIDAHIHIESSMVSPARFAGLVLPHGTTSVVADPHEIANVHGLEGIRY MLENGRHLPLNIFIALPSCVPATPFEDSGAVLSAEELEELIDDPKVTGIGEMMNFPGVVS GDYDVLAKIQLGTSHGKIVDGHAPGLLGRDLDAYLVTGITNTHECATLEEMRENLRRGSY ILIREGSAAKNLRTLLPGVTPGNARRCAFCCDDRHIEDIVSDGHMDNHLRLAVGMGMDPV QAITMCTLNAAECFGLRNKGAIAPGRDADFILVDDLKAFRVRKVFTAGRLIAEDGRVLIP LDDAAAGAPSHSIHLKPLAEDALSLPVRTGKARVIGIEPASLVTKSLVREVKTDDAGCFQ SALNPGLTKIAVIERHKRTGKRGIGILEGYGLTGGAIATTISHDSHNIVVAGDNDPDMLL 
AVRELADMGGGIVSVHGGAIRRLPLPIAGLMTSADPQEVNAVLHDMLVAARAELGIPEDV EPFMTLSFMALPVIPELKLTARGLFNVNTFSFVGVEAD >gi|316921394|gb|ADCP01000158.1| GENE 4 4382 - 5926 654 514 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 218 511 21 319 336 256 43 6e-68 MGFFSSIKRLWSSNKTAETPETPQVPAEEALAAQPAETEDGASHEESSVSDAPQSPVEVV EKLPETAAAEQQQADLEMDQLAAGQSLEPDGQAPAPAAAAQPAPQPDAPKEPSGWRDTLL LTLREAEPRLSVWLSIALEDLGPDARTGDELWERVRFLLTALGAPENEADAFVADFAAWA ARMEYEEVADFRSELQYRLALALDLEDEEDERNRIFLKLHQGLERTREQLGKGLSTLFAS HGDLSADFWDELEELFIMADMGVEAATELCKRLKSRARAASATTPEELRPLLAAELEDIF RIPRRVVAVNPPEVVLIIGVNGVGKTTTIAKLAYRARMQGKKVLLAAADTFRAAAVEQLG IWAERTGSGFHSKGQDADPAAVAWEAMDVAIRDQYDILFIDTAGRLHTKSNLMDELHKIR EVIARKHPGAPHRSILIVDATTGQNALQQTKIFKEAAGIDELILTKLDGTAKGGVAVAVS MQYGIPITYVGLGEKMEDLRPFNGDDFAKALLEQ >gi|316921394|gb|ADCP01000158.1| GENE 5 5937 - 6113 124 58 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MILTPLKQAKKAKSLYRPSRPTWFTPSEKQGLTGRHKYANSRMVRLLNAPASHPQTSF >gi|316921394|gb|ADCP01000158.1| GENE 6 6237 - 7685 1368 482 aa, chain + ## HITS:1 COG:no KEGG:LI0461 NR:ns ## KEGG: LI0461 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 2 482 3 498 498 369 42.0 1e-100 MKRIVTLLLAAGLVLGAAAGSQAADIKAKGNWTFTWQLGDNNLFEKNGDKFTAKQRLRTQ IDVIASESLKGVLFLEMGDQNWGSSKDGASLGTDGKIVKVRYSYVDWVIPQTDAKVRMGL QNFSLPGFISNNPILGGGSADGAGITISGQFTENVGASLFWLRAENDNTDGYRGNPSSNA MDFVGLTVPMTFDGVKVTPWAMYGALGRDSFTNGDGVNKYDPVTGELISGPGSVANGLLP TGVDGAMLKGADLDRHGNAWWVGVASELTYFDPFRFALDAAYGSADMGSIGGFDVERSGW FASILGEYKMDYFTPGILFWYASGDDSNWANGSERMPVVEGSWTASSYGFDDNFGRDACD MIGLTNDGKMGVYLQAKDISFMEDLTHIFRVGFIKGTNNTEMARQGIASPTGTSGRELYL TTADKAWEVNFDTQYKIYKDLTLAVEIGYINLDLDKGVWGKGTVDSYRENNVRGAVTLQY TF >gi|316921394|gb|ADCP01000158.1| GENE 7 7787 - 9235 1671 482 aa, chain + ## HITS:1 COG:no KEGG:LI0461 NR:ns ## KEGG: LI0461 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 2 482 3 498 498 
407 45.0 1e-112 MKRIVTLLLAAGLVLGAAAGSQAADIKAKGLWEFTWELGTPSFEKHTGQDTFGARQRLRT QIDVVASESLKGVMFFEIGDQNWGSSKDGASLGTDGKMVKVRYSYVDWVIPQTDAMVRMG LQNFSLPGFVSSSPILGGGSADGAGITLSGQFNENVGASLFWLRAENDNRTGGYGIDETN PYSDAMDFIGLTVPLTFDGVKVTPWAMVGMIGRDSLDTDKDAISQKGGLIPIQADISPDA LKDRHNTAWWVGVASELTYFDPFRIAVDAAYGSVDMGSYKTLDGRTIRDFDVKRAGWFAS ILGEYKMDYFTPGILFWYASGDDSNAYNGSERMPSIEGSWTASSYGFDDNYGRDSSDMIG LTNDGTMGVYLQAKDISFVEDLTHIFRVGFVKGTNNTEMAQYVGTPYAGDYRALSNNNLY LTTADKAWEVNFDTQYKIYQDLTLCVELGWINLDLDKKVWGLDSDEYRDNAFRGAVTLQY AF >gi|316921394|gb|ADCP01000158.1| GENE 8 9519 - 10952 1657 477 aa, chain + ## HITS:1 COG:VC0687 KEGG:ns NR:ns ## COG: VC0687 COG1966 # Protein_GI_number: 15640706 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Vibrio cholerae # 5 473 2 480 494 464 54.0 1e-130 MPSWMYFVLALAGLLAGYCVYGAIVEKVFGPDPNRPTPACKMPDGVDYVEMSPTKVFLIQ LLNIAGLGPVFGPILGALYGPIALLWVVIGCVFAGAVHDYFSGMLSVRYNGKSVPDIVGY NLGDLIKKLMRVFAVVLLILVGVVFVAGPAGLLSQLTGMHSMLFVAIIFAYYFLATILPV DKIIGRVYPFFAVLLVFMAVGLLGALAFKGYTFYSNIEWTMHSPSGLPAWPLVFITIACG ACSGFHATQSPLMARCINNEKHGRKIFYGAMIAEGVIGLIWVTLGMSFYSDTAALAAALG PKGNAALVVNNISVELLGVFGGALAVLGVVVLPVTSGDTAFRAARLTIADAIGYDQRPHK NRLMIAVPLFVVGVALNFIPFGMLWRYFGFANQALAAIMLWAAAAYLLHRGKFHWIASVP ACFMTAVSAVYICFEKTMGFGMSYELSNIVGIAIALICFALLLTAGRKVQPTDDLSV >gi|316921394|gb|ADCP01000158.1| GENE 9 11160 - 12566 1283 468 aa, chain + ## HITS:1 COG:YPO2392 KEGG:ns NR:ns ## COG: YPO2392 COG0534 # Protein_GI_number: 16122615 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Yersinia pestis # 6 455 2 446 457 389 48.0 1e-108 MSTKRRAYLTEMKTQLTLAVPALLAQIAMISMGFVDMVMTGRVGAVDMAAVALAGSLWVP LVLFGQGLLLAVTPCVAQLRGAGSMRSGEHMQGVGHVMRQGIWMALFISVPLILIVYLVS FHLADMGVEGALGLMTGQYLRAIIWGAPAYLLFVALRCGMEGMALMRPAMFAGFMGLLIN IPCNYILIFGKLGLPALGGPGTGVATAIVYWVMFLTMVVYVCRHTYLRELMTWRVWEIPV WATLKKLAGIGFPGALAMLFEVTLFAAVALLIAPLGAIQVAGHQVALNFSSLLFMVPLSI 
GTAATIRTGFALGRKSVEAVRVASRSSWLLACMAACFTALVTILWRHQIAAVYNTDPVVL GLASHLLLFAATYQITDAVQVVSVGILRGYNDTRAILCVTLISYWVFALPVGYTLGRTHL WGDPLGPQGFWTAFIVGVSIAAVLLVIRVRVLERRLLSGFDRIRVMEG Prediction of potential genes in microbial genomes Time: Fri May 13 05:05:57 2011 Seq name: gi|316921376|gb|ADCP01000159.1| Bilophila wadsworthia 3_1_6 cont1.159, whole genome shotgun sequence Length of sequence - 15840 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 11, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 250 - 1575 1136 ## COG1559 Predicted periplasmic solute-binding protein 2 1 Op 2 . - CDS 1572 - 2000 458 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) + Prom 1931 - 1990 3.7 3 2 Tu 1 . + CDS 2216 - 2797 605 ## COG0778 Nitroreductase + Term 2963 - 2996 4.4 + Prom 2799 - 2858 1.7 4 3 Tu 1 . + CDS 3019 - 3219 211 ## DvMF_0801 thiamine biosynthesis protein ThiS + Term 3252 - 3288 7.2 + Prom 3337 - 3396 2.2 5 4 Op 1 5/0.000 + CDS 3417 - 4220 936 ## COG2022 Uncharacterized enzyme of thiazole biosynthesis 6 4 Op 2 . + CDS 4217 - 5386 1199 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes + Term 5427 - 5466 3.7 7 5 Tu 1 . + CDS 5476 - 6099 470 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 + Term 6234 - 6271 2.3 - Term 6222 - 6259 6.1 8 6 Tu 1 . - CDS 6406 - 7362 982 ## COG4395 Uncharacterized protein conserved in bacteria - Prom 7418 - 7477 1.7 - Term 7442 - 7488 5.4 9 7 Op 1 . - CDS 7509 - 8174 680 ## COG4845 Chloramphenicol O-acetyltransferase 10 7 Op 2 . - CDS 8178 - 9065 955 ## COG5266 ABC-type Co2+ transport system, periplasmic component 11 7 Op 3 . - CDS 9090 - 9935 792 ## COG1040 Predicted amidophosphoribosyltransferases 12 7 Op 4 . 
- CDS 9928 - 10563 455 ## COG0655 Multimeric flavodoxin WrbA - Prom 10625 - 10684 2.5 - Term 10660 - 10700 10.2 13 8 Tu 1 . - CDS 10708 - 11610 1248 ## COG0078 Ornithine carbamoyltransferase - Prom 11683 - 11742 2.2 14 9 Tu 1 . + CDS 11721 - 12506 795 ## COG2122 Uncharacterized conserved protein + Prom 12614 - 12673 2.8 15 10 Tu 1 . + CDS 12714 - 13196 512 ## COG0456 Acetyltransferases 16 11 Op 1 . + CDS 13311 - 14765 1587 ## COG0477 Permeases of the major facilitator superfamily 17 11 Op 2 . + CDS 14778 - 15794 1175 ## COG0309 Hydrogenase maturation factor Predicted protein(s) >gi|316921376|gb|ADCP01000159.1| GENE 1 250 - 1575 1136 441 aa, chain - ## HITS:1 COG:RSc1783 KEGG:ns NR:ns ## COG: RSc1783 COG1559 # Protein_GI_number: 17546502 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Ralstonia solanacearum # 106 434 66 372 377 174 37.0 3e-43 MNESPEETKNTPADPGAPRPEETGIPSGVSDTRNQQAPPDQQHSDASSPEAENWQPTDKK PGEASPLDGEEPETPLPASAPETPKPKRLWPHFLLGFMLLVLLACGGAGWLAYDFLKSPG TDPAVAPAQDVEVTVNPGTTFRTLTPELVRLGAVRNADKFILLLRWMNYRDIPHALKPGR FRINTGWTPQQVIDQLVNGSPLLDRVTIPEGLAWWEVGKRLEEAQMVRFEDFDKLVHDPA FLRHWGIPFDSAEGFLFPDTYLIMRPLELNEATAKSVVGRLIDNFWRRTAPLWPGGKRPG PSGRDEVRRLVTLASIVERETAVPSERPRVAGVYANRLRLNMLLQADPTTAYGLGEGFDG NLRRKHLDDEGNPYNTYKHPGLPPGPICSPGLACLKAAANPEQHDYIYFVARGEDGSHVF STNLAAHNKAVREYWAKRRGK >gi|316921376|gb|ADCP01000159.1| GENE 2 1572 - 2000 458 142 aa, chain - ## HITS:1 COG:sll1547 KEGG:ns NR:ns ## COG: sll1547 COG0816 # Protein_GI_number: 16332240 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. 
subtilis) # Organism: Synechocystis # 4 133 8 137 152 79 36.0 3e-15 MKYLAIDYGQKRTGIAVSDTGGSMAFPRKTILMRTRAAFFEELLALIEAEATDAIVIGLP INLDGEESLTTRQVRNFSKSLARRTTLPLFWMEEALSSYEAERDLRDAGRSAAQGRAVLD QQAAVRILQSFLDQPEAKRKTI >gi|316921376|gb|ADCP01000159.1| GENE 3 2216 - 2797 605 193 aa, chain + ## HITS:1 COG:FN1880 KEGG:ns NR:ns ## COG: FN1880 COG0778 # Protein_GI_number: 19705185 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Fusobacterium nucleatum # 4 193 3 192 192 97 31.0 2e-20 MDFLELVTGARTCRRFREAEGMPAGMLDWLIECARVTPCGGNAQALRFAVAESPEACAAV FPALKWAGMLTDWDGPEAGERPTGYVAILGEAGTRAKLNAIDAGIAAQTIQLAAYTRDLG CCIFLSFDPRKIREVLGIPENLEPLLVLALGFQKEVRRVETVGADGSVKYWRDAQGVHHV PKRPLEDLLIIKK >gi|316921376|gb|ADCP01000159.1| GENE 4 3019 - 3219 211 66 aa, chain + ## HITS:1 COG:no KEGG:DvMF_0801 NR:ns ## KEGG: DvMF_0801 # Name: not_defined # Def: thiamine biosynthesis protein ThiS # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 66 1 66 66 66 54.0 2e-10 MNITVNGEITVCNEGATLLDLILRLGLSPEKIVVERNRDIVPAGTFAETALQDGDTLELL QFVGGG >gi|316921376|gb|ADCP01000159.1| GENE 5 3417 - 4220 936 267 aa, chain + ## HITS:1 COG:CAC2922 KEGG:ns NR:ns ## COG: CAC2922 COG2022 # Protein_GI_number: 15896175 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of thiazole biosynthesis # Organism: Clostridium acetobutylicum # 6 258 2 254 255 311 67.0 6e-85 MKTSSDALIIGGVELHSRLFIGTGKYGSDSLIPAVAEASGAEVITVALRRVDLGNSKDNV LSHIPGTMRLLPNTSGARTADEAVRIARLSRAAGCGDWIKIEVISDTRHLLPDGYETAKA TETLAKEGFTVLPYMNPDLYVARDLVNAGAAAVMPLGAPIGTNRGLQTKEMVRILIEEIS LPIVVDAGIGKPSQACEAMEMGAAACLVNTAIATAGDPVRMGAAFGEAVRAGRNAFLAKA GPVLQGGASASSPLTGFLGGHDEGAAS >gi|316921376|gb|ADCP01000159.1| GENE 6 4217 - 5386 1199 389 aa, chain + ## HITS:1 COG:CAC2921 KEGG:ns NR:ns ## COG: CAC2921 COG1060 # Protein_GI_number: 15896174 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized 
enzymes # Organism: Clostridium acetobutylicum # 42 377 36 365 368 363 50.0 1e-100 MSRLFSDVLPEWPLDEVEAGLASATETDVLAALELAENGRRLNARDFMALVSPAASPYLE TMARIARDVTVRQFGRSIQLFTPLYLANYCTNQCVYCGFNTKNHIHRSMLTMDEVEAEGK VIAATGLRNILLLTGDAPKLTGPAYIAEAARRLRPYFPSIGVEVYSMSEDDYRMLVDAGV DSFTMFQETYNEELYLKLHPAGPKRDFRFRLNAPDRAARAGMRSVNVGALLGLDQWRRDA FYTGLHADWIQATYPGVDIAVSAPRMRPHEGSFNDIHPASERDLVQYIQALRLYLPAAGI TLSTRERPFLRDRLIGLGITKISAGVSTAVGGHAHAAQDVAEEHDTAQFEIADDRDVDQM IAAIEAQGYQPVCKSWEPLDEAYACGKSA >gi|316921376|gb|ADCP01000159.1| GENE 7 5476 - 6099 470 207 aa, chain + ## HITS:1 COG:CAC2923 KEGG:ns NR:ns ## COG: CAC2923 COG0476 # Protein_GI_number: 15896176 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Clostridium acetobutylicum # 29 203 86 263 266 135 41.0 5e-32 MADGVIGWDTVLEGHLAAEQCKVLASARIGMAGAGGLGSNCTVFLARTGIRDFVIADPDV VALSNLNRQHFSPRHLGLPKVEALGAVLRELNPSVILRLEQRALDGPSACGLFAGCDIVV EAVDDPEVKKNLVEALLLAGHRVVSASGMAGWGGVPMTARKLGSRLVVVGDHVSGIGPGM PPLAPRVVMAAAMEADAVLEMLLGECR >gi|316921376|gb|ADCP01000159.1| GENE 8 6406 - 7362 982 318 aa, chain - ## HITS:1 COG:PA5528 KEGG:ns NR:ns ## COG: PA5528 COG4395 # Protein_GI_number: 15600721 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 9 314 5 283 284 80 27.0 4e-15 MPKSFYACLSAVLVLVMLAFAYGEADAARIGGGRSFGGRPSMSQPYTKPLPSNPSSSFNQ QTNRQSQQMANSAAPSRGLFGGMGGMLGGLLAGSLLGSLLFGGGFNGGGFLDIILIGGLL FLAFKFFSRRRAATQGAGQPNAGGFDSLRSTPGSDSTHQRTAAPQSGKGGFDWEALTTPS SGQQPLYQDEPKKPAGFDEEEFLRGAKAAYTRLNNAWDKRDLSDIAQFSTPAFQKEIQQQ AAEDKTPGNTEIMLVNASLLSVDTVGDEQIAQVYFNVLLREDPSQEAPIDVREIWHFVRP ASGDGTWKLDGIQQVENV >gi|316921376|gb|ADCP01000159.1| GENE 9 7509 - 8174 680 221 aa, chain - ## HITS:1 COG:CAP0060 KEGG:ns NR:ns ## COG: CAP0060 COG4845 # Protein_GI_number: 15004764 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Clostridium 
acetobutylicum # 3 220 2 220 220 149 34.0 3e-36 MRDASFHLLDRAAWPRSEHFDYYIRTVKCRYDLTAHLTVTALRERGKALGLRFYPLLLYI AARAINANREFRMGFNEHGAPGYWDFINPSYTIFHDDDKTFSDVWSEYDESFPVFYQNVL HDMETYKDIKGVRAKPGRPANCCPMSALPWLSFTGMAHVQATEPPFLPPIITFGKYFAQG PDTLLPIAVSVHHAAADGYHTSKLINDMQALAASSDQWMRP >gi|316921376|gb|ADCP01000159.1| GENE 10 8178 - 9065 955 295 aa, chain - ## HITS:1 COG:BMEI0641 KEGG:ns NR:ns ## COG: BMEI0641 COG5266 # Protein_GI_number: 17986924 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, periplasmic component # Organism: Brucella melitensis # 13 286 12 250 252 103 28.0 6e-22 MFGRTLMTLGLAMLLAAPAHAGFGLVLPDSPIVDDPEKTSLNLVICAIDPATGAGIPIER PQVFTALRYSGDNMDRSEHLSVLDEVVAYGARAWTTNISLPHPGVYQLVMQTKAGWVPEQ DKFVQYVTKVQIPAYGSAEGWDKPANISFEISPLSRPFGLCSGMAFSGQALFDGTPIPGA LIDVARLDPAIKPNVAQPEDPNAPPQDDKAQPKAKKQDKQPTPQPISAFGATQQVKADGQ GVFTFVCPLPGWWAFSTSMAGDPLQDPEGKQKPLEIKTIFWVYMNDCPKTAPSKK >gi|316921376|gb|ADCP01000159.1| GENE 11 9090 - 9935 792 281 aa, chain - ## HITS:1 COG:mll3453 KEGG:ns NR:ns ## COG: mll3453 COG1040 # Protein_GI_number: 13472985 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Mesorhizobium loti # 55 258 20 226 240 104 37.0 2e-22 MNERSFRFLQPLRDLLHATSLDERRCILCHIPFSPLNRGELPADADNEQRMQIPLCPSCR ALLRPRKGGYCPSCGELQPFSDSEPALCETCQKTPPPWEHFRFYGAYSGALKTLLLRGKF GADPTVLHLLGRFLALSCKDLPKPDAIVPVPLHSTRLRERGFNQCQELARPLSDALGVPL VPDLLLRQHPTRHQVGLSEAERVANLKSAFLSLPEVRGKRILLVDDTYTTGTTLRRAALA LLDPHAGAAAVDVAVVARTPREKHDFQYNWSSPHESLPKNS >gi|316921376|gb|ADCP01000159.1| GENE 12 9928 - 10563 455 211 aa, chain - ## HITS:1 COG:MJ1083 KEGG:ns NR:ns ## COG: MJ1083 COG0655 # Protein_GI_number: 15669271 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanococcus jannaschii # 23 120 6 101 193 68 36.0 9e-12 MESPFPSRAELDPPEQTPHVCLLACSPHPDGSTDAAARLLEQALLSRGATVDRIRLREYA IHPCIGCGYCSAHPGHCVFDGDDVADLFRRMEHADSLILNIPVYFYGPPALLKGFIDRAQ 
IFWEQRREGTRTPRRPAHAVLCAARTRGDRLFEANLLILRCFLDTLGFSLHEPLLLRGVE RPTDLLHSQAIAQIERFGLEAAEQATRFYHE >gi|316921376|gb|ADCP01000159.1| GENE 13 10708 - 11610 1248 300 aa, chain - ## HITS:1 COG:BMEI1620 KEGG:ns NR:ns ## COG: BMEI1620 COG0078 # Protein_GI_number: 17987903 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Brucella melitensis # 3 300 19 318 323 291 50.0 1e-78 MPKHFMRIRDLGHSGAWEIIRRAKEMKETNYRGHCMDGKVAALIFEKASTRTRVSFDVAV RQLGGTTIFMTPAESQLGRSEPLRDTARVLSRYVDCLIVRTFGQEKLDELAEYGSLPVVN ALTDQGHPCQLMADLLTIYERTPDFSKVTVTWVGDGNNMANSWIEAAMYFPFQLNMAIAP GYEPDQQLLALALQSGAKIFLTHDPKLAVDGANYIYADVWASMGQEEEQKKREEAFKGFC VDSELMSLAAPGVKFLHCLPAHRGEEVTDEVMESEASIVWDEAENRLHAQKAILEWVFSD >gi|316921376|gb|ADCP01000159.1| GENE 14 11721 - 12506 795 261 aa, chain + ## HITS:1 COG:AF0649 KEGG:ns NR:ns ## COG: AF0649 COG2122 # Protein_GI_number: 11498257 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Archaeoglobus fulgidus # 40 212 13 175 233 114 41.0 2e-25 MGRTVKGIHLDPHRGYRERCAPRDGEHGFQLVIGETDLRVTAVSPLPEGFKDALAARVRT LRGELEAWIVLHPEFRHSLVPVPLSCSAPPPEIVRRMTEASAIAGVGPFAAVAGTIAQMA AESLVDRSPDLIIENGGDIFMYSRRDRVVGLLPDPESGVLIGLNVAASACPLALCASSAT IGHSLSFGQGELVLVRSLDGALADALATALCNRLQGPGDVKAVVGYARRFVKHGLTGIFA QCGGAIGVWGDMELTAVEQDS >gi|316921376|gb|ADCP01000159.1| GENE 15 12714 - 13196 512 160 aa, chain + ## HITS:1 COG:MA0320 KEGG:ns NR:ns ## COG: MA0320 COG0456 # Protein_GI_number: 20089218 # Func_class: R General function prediction only # Function: Acetyltransferases # Organism: Methanosarcina acetivorans str.C2A # 40 158 9 127 130 101 41.0 5e-22 MHTITQLAPDDIPAVAEVIRQAFATVAEAFDLTEENCPTNGAFLRDAALAEEQERGTVLY GLSDEDGLSGCMALKRKDEATFSLEKLAVAPWSRNRGYGGALVAHAVEEVRRAGGGTICI GAMYENRKLVRWYERQGFSITGTRKFAHLPFVVCFMEKSV >gi|316921376|gb|ADCP01000159.1| GENE 16 13311 - 14765 1587 484 aa, chain + ## HITS:1 COG:MA1657 KEGG:ns NR:ns ## COG: MA1657 COG0477 # Protein_GI_number: 20090510 # Func_class: G Carbohydrate transport and 
metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Methanosarcina acetivorans str.C2A # 6 465 22 480 502 279 38.0 9e-75 MRTEHPNASCNVERPVVSLRTVIIAVGAAQFMLPFMLSGAGPLLPAIGRDLHASAMQLSL INAAYTLSLAIFHLVAGRVGDMVGRKRLFLSGLSLFVVMSALLPFVTNIWLFLCCRFVQA MGTAIMNTCALAILVSCAPPEQLGRVLGLTSIGVYSGLSLGPGLSGLLGTALGWQFLFFS VVPIGIIAWLLMYCNVRCDWKDAPDDPFDWRGSLFYTLAVSLLSIGAIWLFSGAWAGVLF ALGVVFLGLFLWEERRSHHPILDVAFLVKNKAFALSTLASFINYSSIFGVTFYFSLYLQG VHGLTLLETGLLLSAQPAVQVFISPLGGRMADRHGADIIATIGIAVCGVGLLLASSLDGA SSLWVVTLVQLVIGSGIALFASPNTSAIMTSVDEAHMGQASGLVGTARTLGTLSSMVIIS LTMNAYLGDEALGPNNIPEFLEAMHINFALFGILNLLGIFCSLGRMGGRVRRFSHRLFHP WNHH >gi|316921376|gb|ADCP01000159.1| GENE 17 14778 - 15794 1175 338 aa, chain + ## HITS:1 COG:aq_1019 KEGG:ns NR:ns ## COG: aq_1019 COG0309 # Protein_GI_number: 15606316 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Aquifex aeolicus # 3 338 2 332 332 314 48.0 2e-85 MTKTLLLDCGSGGRASQRLINELFLEQFDNPILRTLDDAAKLELSGPIAMSTDGYTVTPL FFPGGNIGTLAVHGTVNDVSMLGARPRYLTCGFILEEGLDLDLLRKVVESMAEAARSAGV LIVSGDTKVVPRGACDKLFITTAGVGEILADPSPSGSGARPGDAILVSGAVGDHGLAVLG QRDSLSFLQGVESDSAALNKMVERLMLEVGDIHVLRDPTRGGLATTLNEIAGQSGVVCSI EEASIPIHDAVRSGCDVLGLDPLYLANEGKMLCFVPEAKAQAALDVLRSEPQGHEAARIG TVLEAVAPSRPGQVVLQTPIGGRRLLSMLEGDQLPRIC Prediction of potential genes in microbial genomes Time: Fri May 13 05:06:09 2011 Seq name: gi|316921367|gb|ADCP01000160.1| Bilophila wadsworthia 3_1_6 cont1.160, whole genome shotgun sequence Length of sequence - 11003 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 7, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 229 - 288 3.0 1 1 Tu 1 . 
+ CDS 378 - 2936 3531 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) + Term 3091 - 3129 7.4 2 2 Tu 1 . + CDS 3291 - 4802 1316 ## LI0629 hypothetical protein + Term 5003 - 5037 3.1 - Term 4794 - 4846 15.2 3 3 Tu 1 . - CDS 4875 - 7067 1789 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) + Prom 6906 - 6965 2.1 4 4 Tu 1 . + CDS 7066 - 7557 759 ## COG3193 Uncharacterized protein, possibly involved in utilization of glycolate and propanediol + Term 7700 - 7739 10.7 - Term 7686 - 7727 11.1 5 5 Op 1 . - CDS 7753 - 8319 641 ## COG1896 Predicted hydrolases of HD superfamily 6 5 Op 2 . - CDS 8345 - 8791 625 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Term 8874 - 8932 10.5 7 6 Tu 1 . - CDS 8958 - 9875 1138 ## COG0774 UDP-3-O-acyl-N-acetylglucosamine deacetylase - Prom 9936 - 9995 3.8 - Term 9947 - 9998 6.1 8 7 Tu 1 . - CDS 10007 - 10954 442 ## PROTEIN SUPPORTED gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase Predicted protein(s) >gi|316921367|gb|ADCP01000160.1| GENE 1 378 - 2936 3531 852 aa, chain + ## HITS:1 COG:secA KEGG:ns NR:ns ## COG: secA COG0653 # Protein_GI_number: 16128091 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Escherichia coli K12 # 1 848 1 898 901 850 53.0 0 MIGFLVRKIFGSKNDRFLKKLRPLVAKINALEPEMQALADEELPQRLAVYREQVQNGEKD LDAVLPEVFALVREASTRVLGMRHYDVQLLGAMALHNGKIAEMKTGEGKTLVATLAVILN SLEGKGVHVVTVNDYLAKRDAEWMGRLYNFLGLSVGVIVAGLSDEQRKEAYGADITYGTN NEFGFDYLRDNMKFYAEQLVQRGHHYAIVDEVDSILIDEARTPLIISGASDESTDLYQKV DEVVRTLEKEKHYTVDEKGKTASLTDEGVLYVEEQLGIENLYDTANITAQHHVLQSLKAH TVFRRDVDYIVKDDQVVIVDEFTGRLMAGRRFSDGLHQALEAKEHVTVAAENQTLASITF QNYFRMYDKLSGMTGTADTEAVEFAQIYGLEVSTIPPNRPMVRKDMPDLIYRTRREKMQA IIQAIKELHATGQPVLVGTISIETSELISQLLKREGVPHSVLNAKHHAQEAEIVAQAGQA GKVTIATNMAGRGTDIKLGEGVVELGGLHILGTERHESRRIDNQLRGRSGRQGDPGSSRF YLSLEDDLMRLFGSDRLSGLMQKLGMQEGEPIENNMVSRAIENAQKRVEGHHFEIRKTLL 
DYDNVMNQQRTVIYSLRRDLMQEPDLEPILNEYLSDLLDDMYAGLEVSKAARDIEDEKPV RARLSEVMNIDRVLPGDAPLPTREEAQELVLSIMAQLREEAGPLYADLLRYFLLEELDRG WKEHLRNMDFLRDGIGLRGYGQRDPKLEYKREGFNMFQELLVHIREGAFRALTRVRVEQR PTEVAEEVVAAPEPEPMFQHKEQPQQLSYSNEPEDLLGAPAQAKAENKPGRNDPCPCGSG KKYKKCCGANEK >gi|316921367|gb|ADCP01000160.1| GENE 2 3291 - 4802 1316 503 aa, chain + ## HITS:1 COG:no KEGG:LI0629 NR:ns ## KEGG: LI0629 # Name: not_defined # Def: hypothetical protein # Organism: L.intracellularis # Pathway: not_defined # 18 503 24 513 514 327 40.0 9e-88 MRNRLVLLAALMWCLSGVAAAAPAMDIVLYPSSAQVQVDDTLPVKNGAVAFVVPAGTDLE SLVISLDKGEVISRRAIPVLVADSETVAALRQDLAEARAHAAGLEGEAAAVAARISLWSK GALAQEISMAEMEKLDAAMPERLKALYVQAAALEPQVKEAREGVARLEQALSEYGDAATG TQVEAQVGDLSGNVRVRYSYMLSGCGWSPAYRFDAEPEKGLVRFTQQAEIRQATGQDWKG VHLTLASADPGSGLQPGSLPNWTLRKVEHMPRALAASPVMEQAAMDAAPNMMMKAAPPVD VREMATFAAWDLGTRTVPAGRPVLFELARGDWKASFVRLARPGYGGKAAWLMAEVRLPEA VDWPHGPAQYSVDGLPVGAGTFALSGDHEDMFFGRDSRVTVDMKQDLRQSGSKGFVGKRQ TRDWKWTIEVTNSHTQPVAVRVEDPEPQIGDSAIEVKVVAKPAPVVKDHVNTWNLTVPAS GKSVIDYTVEASAPEDMRLIYGR >gi|316921367|gb|ADCP01000160.1| GENE 3 4875 - 7067 1789 730 aa, chain - ## HITS:1 COG:NMA1655 KEGG:ns NR:ns ## COG: NMA1655 COG0323 # Protein_GI_number: 15794549 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Neisseria meningitidis Z2491 # 58 728 4 656 658 301 33.0 3e-81 MKTSICCGMGKAPHPTERNPAFRFALRYPLAPFSARHYIKWMNETSPHDGSPAGRRGIHL LSPELRNQIAAGEVVERPASVVKELVENSLDAGATLIEVTLENGGQSLIRVRDNGAGIPA AELELAVTRHATSKITSMDDLWHISSFGFRGEALPSIASVSSFRMESAFRASEDAEPDAA FIQMEHGVLAQQGPSGLHKGTVVEIRDLFATIPARLKFLKAPATEQKRAQELLTRLALTH TDAGFIFLAGTREVLRFPAGQSLQERLSVIWPDSVTETLLPFDRTTNDIRVHGLVSPPGQ AQPRGDRMLLYVNGRAVNDKVILRAVREAYKGRLLSKEYPQIVLFMDLPPEEVDVNVHPA KIEVRFRDERSVFGAVLRAVEEAVVRNLPTGDLDAAPSGEEARHAPAYEPKPLGFWGEAD RERIMRPRQQPLIPPDHEETPSAVPPASERPASEPDSVPEPLLYGEEGLPWDKVPDNHAA PSFHEAPQAFAAPQPPTAGAAPLPETAPHAAVQPPAANRARPSEPEQPEIEQLGEGQVRV GPYVYMGQIGGTYLLLRDIRNGRDNASLLILDQHAAHERVLVSRIEAGGFSGMSQPLVLP 
LEYTLHPAERERIQEFHESLSALGFELALREQGAGTVLEVRAVPPLLERAAAGEFIREVL AGRKEDLHSLWATMACKAAIKAGDALAPDEAVNLIAQWLMTENRQFCPHGRPCVLQWGTS DLDKMFKRMG >gi|316921367|gb|ADCP01000160.1| GENE 4 7066 - 7557 759 163 aa, chain + ## HITS:1 COG:PA1329 KEGG:ns NR:ns ## COG: PA1329 COG3193 # Protein_GI_number: 15596526 # Func_class: R General function prediction only # Function: Uncharacterized protein, possibly involved in utilization of glycolate and propanediol # Organism: Pseudomonas aeruginosa # 37 160 14 137 143 91 44.0 7e-19 MRRLLLSGLMVLAFSVTAFAGVDVSKLQLPGDITQAQADTVIKGALAKAKEQGVPMNIAV VDAGGNLKAFTRMDGAFLGSIDISIGKARTARLFNMPTSALGAASQPGKELYGIEVTNNG LVIFGGGELLKNKDGVIVGAVGVSGGSVAEDTNVAKAGVAAFK >gi|316921367|gb|ADCP01000160.1| GENE 5 7753 - 8319 641 188 aa, chain - ## HITS:1 COG:PA1878 KEGG:ns NR:ns ## COG: PA1878 COG1896 # Protein_GI_number: 15597075 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: Pseudomonas aeruginosa # 1 184 3 186 192 141 46.0 8e-34 MDFIEQYLRFIKEAERLKAVTRTAWTHDGRRESTAEHSWRLALFAGLAAGRLPGLNREKV LMMSLIHDMGELYGGDISAALCPDPQEKTDEESRAVRKAFSLLPEREAESLLALWREYNA NATPEARLVKALDKAETIIQHNQGRNPEDFDYRFNLGYGKEYFDGSGFLRELRTLLDEET AAKLHGTD >gi|316921367|gb|ADCP01000160.1| GENE 6 8345 - 8791 625 148 aa, chain - ## HITS:1 COG:CAC3445 KEGG:ns NR:ns ## COG: CAC3445 COG0454 # Protein_GI_number: 15896686 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 5 147 4 147 147 111 42.0 4e-25 MATFQTITTDKKQYLPLLLLADPCEAMIDRYLEAGDMHVLSIGDAPVCVAVVIPYSDTAC ELKNLATDPQFQKQGYATRLMETLFKFYAARFRAMYVGTAGPGVAFYARFGFTHTHTVTG FFTDNYPEPIYEDGVLLTDMLYLKKELA >gi|316921367|gb|ADCP01000160.1| GENE 7 8958 - 9875 1138 305 aa, chain - ## HITS:1 COG:PM0148 KEGG:ns NR:ns ## COG: PM0148 COG0774 # Protein_GI_number: 15602013 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-acyl-N-acetylglucosamine deacetylase # 
Organism: Pasteurella multocida # 1 268 2 273 305 261 50.0 1e-69 MNQTTIQRVVTCSGIGLHSGNKVYLALRPASANTGIIFDIHTPDGIRRVSPAPKAVATTV LATTLGSNGATVSTVEHLLASVLGLGIDNLIISVEGGEIPIMDGSAAAFLELFAEAGIRE LPAPRKVMRISRPLELRDGEKYIRALPYAGFRVDCTIDFPHPTIGRERFSMEVTPENFTR VANARTFGFFKDVEYLHSHGLARGGSLESVVVLDDNGVMNPEGLRYKDEFVRHKVLDFIG DMAMLGMPIQGQFEICCSGHQLNNAFLRKAESERALQLIDLSLEEAQNRKTREKRPAYEG SLALA >gi|316921367|gb|ADCP01000160.1| GENE 8 10007 - 10954 442 315 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase [Acidobacterium capsulatum ATCC 51196] # 12 312 25 290 294 174 38 2e-43 MPELSRADWVREATRTLHEAAVDSPRLSAELILQHVCGISRVELATRPETFLTSDQLSRM TGLLRRRADGEPAAYLLGQREFYGRDFRVTPATLIPRPETEHLIEAALKGCDGPASFADL GTGSGCIAVTLCAERPDWRGLMVDLSGRALAVACQNAVRHDVRQRLQPVRADFTRPLLRP ESLDLLVSNPPYVGKTEYEGLSAEVRDFEPVTALVPNFVDSDKIPSHDHHHDHGGSHSHV PSTPPDKPEGLEHLIAVAQEAFVALKPGGLLLMEHGYAQGAAIKVLLESHKWENVLILKD LSGHDRVASARKPAA Prediction of potential genes in microbial genomes Time: Fri May 13 05:06:21 2011 Seq name: gi|316921357|gb|ADCP01000161.1| Bilophila wadsworthia 3_1_6 cont1.161, whole genome shotgun sequence Length of sequence - 8599 bp Number of predicted genes - 10, with homology - 8 Number of transcription units - 5, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 191 - 757 638 ## COG1335 Amidases related to nicotinamidase - Term 1061 - 1101 12.7 2 2 Op 1 41/0.000 - CDS 1142 - 2785 1782 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 3 2 Op 2 . - CDS 2852 - 3139 478 ## COG0234 Co-chaperonin GroES (HSP10) + Prom 2744 - 2803 3.2 4 3 Tu 1 . + CDS 2894 - 3349 71 ## + Prom 3382 - 3441 9.7 5 4 Op 1 32/0.000 + CDS 3647 - 3955 399 ## PROTEIN SUPPORTED gi|46579340|ref|YP_010148.1| 50S ribosomal protein L21 6 4 Op 2 . + CDS 3981 - 4262 374 ## PROTEIN SUPPORTED gi|218886474|ref|YP_002435795.1| 50S ribosomal protein L27 + Term 4287 - 4324 6.9 7 5 Op 1 . 
+ CDS 4485 - 4880 279 ## 8 5 Op 2 7/0.000 + CDS 4906 - 6012 769 ## PROTEIN SUPPORTED gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27 + Term 6183 - 6221 7.0 + Prom 6146 - 6205 1.9 9 5 Op 3 . + CDS 6359 - 7507 1481 ## COG0263 Glutamate 5-kinase 10 5 Op 4 . + CDS 7568 - 8539 1298 ## COG0451 Nucleoside-diphosphate-sugar epimerases Predicted protein(s) >gi|316921357|gb|ADCP01000161.1| GENE 1 191 - 757 638 188 aa, chain - ## HITS:1 COG:SSO2455 KEGG:ns NR:ns ## COG: SSO2455 COG1335 # Protein_GI_number: 15899200 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Sulfolobus solfataricus # 2 160 27 177 205 108 40.0 4e-24 MNIALVIIDMQNDFVLPEAPLCVKGAQATVPTIQKLLDRARAEGWRVIHVIRQHRRDGSD VEIGRAPLFTQGAGICVPGTKGAEIVDELAPLPNETILRKLRFSAFFQTELDMLLRRLKI DTLLIAGTQYPNCVRGTATDAMSHDYNTIVVTDACSAQTDEIAATNIRDMKNMGITCVTL AELPSILA >gi|316921357|gb|ADCP01000161.1| GENE 2 1142 - 2785 1782 547 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 1 546 1 546 547 691 65 0.0 MASKEILFDAKAREKLSRGVDKLANAVKVTLGPKGRNVVIEKSFGAPVITKDGVSVAKEI ELEDKFENMGAQMVREVASKTNDIAGDGTTTATVLAQAIYREGVKLVAAGRNPMAIKRGV DKAVEAVSKELGNIAKPTRDQKEIAQVGTISANSDATIGNIIAEAMAKVGKEGVITVEEA KGLETTLDVVEGMKFDRGYLSPYFVTNAEKMVCELDNPYILCNEKKISSMKDMLPVLEQV AKVNRPLVIIAEDIEGEALATLVVNKLRGALNVVAVKAPGFGERRKAMLEDIAILTGGEA VFEERGVKLESLPLSSLGTAKRVVIDKENTTIVDGAGKSDEIKARVKQIRAQIEETTSDY DREKLQERLAKLVGGVAVIHVGAATETEMKEKKDRVEDALNATRAAVEEGIIPGGGTAFV RTIKVLDDIKPADDDELAGLNIVRRSLEEPLRLIAGNAGHEGSVVVEKVREGKDGFGFNA ATGEYEDLIKAGVIDPKKVARIALQNAASVASLLLTTECAIAEKPEPKKDMPAMPDMGGM GGMGGMY >gi|316921357|gb|ADCP01000161.1| GENE 3 2852 - 3139 478 95 aa, chain - ## HITS:1 COG:mlr2393 KEGG:ns NR:ns ## COG: mlr2393 COG0234 # Protein_GI_number: 13472183 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Mesorhizobium loti # 1 94 1 
94 104 112 54.0 2e-25 MSLKPLNDRVLVKRLESEERTASGLYIPDTAKEKPSKGEVVAVGPGKHADDGKLVPMAVK VGDMVLFNKYAGTEVKIDGAEHLVMREDDILAIIA >gi|316921357|gb|ADCP01000161.1| GENE 4 2894 - 3349 71 151 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLGAVDLDFGAGILVEEDHVAHLHSHGNELAVVGVLAGAHGDHFTLGGLFLGGVGDVQAA GGTFLGFEALHKNTVIQRFQTHDDLSSKKGIFCCNQPRHHRTACGESLQALRNGGRVPDV AEEAARFKSGKTLAGGVSLASGPWCSGLSWH >gi|316921357|gb|ADCP01000161.1| GENE 5 3647 - 3955 399 102 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|46579340|ref|YP_010148.1| 50S ribosomal protein L21 [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] # 1 102 1 102 102 158 70 2e-38 MYAIIETGGKQFRVEEGTKILVDRMAADVNSEVSLDKVLMVGGAELKVGAPYVENAKVTA TVLDHVRGDKILVFKKWRRNDSRKLQGHRQDYTAILVKAIEA >gi|316921357|gb|ADCP01000161.1| GENE 6 3981 - 4262 374 93 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|218886474|ref|YP_002435795.1| 50S ribosomal protein L27 [Desulfovibrio vulgaris str. 'Miyazaki F'] # 1 89 1 89 89 148 77 1e-35 MAHKKAGGSSRNGRDSNAQRRGVKRFGGQPVLAGNILVRQLGTVIHPGENVGMGSDYTLF AKIDGTVKYETYVRKRKIQKRVHILPFEAPAAE >gi|316921357|gb|ADCP01000161.1| GENE 7 4485 - 4880 279 131 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNKVFLFALQLGVLAPFWRSMQALLWANSGRASEGALWNIVYCGGPLLIACALARWRILK GKETFSLIQAVCGVLLALLVSTVFVWVGASERGIPPVSAYGLAVDGVIYAAGLALAVFGC KKRLSRLGGDI >gi|316921357|gb|ADCP01000161.1| GENE 8 4906 - 6012 769 368 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27 [Roseobacter sp. 
AzwK-3b] # 1 327 1 319 345 300 50 2e-81 MRFVDEAVISVKAGKGGNGCVSFRREKFVPRGGPDGGDGGDGGSIILRADSRLLSLYDFR IMRHYEAQNGQGGMGSQRYGRKGEDLVLNLPVGTLVFEQTPEGEHMLTDLAEAGDEYLVV RGGRGGKGNEHFKSSTMRAPRFAQPGEPGEERNLRLELKILADAGIIGLPNAGKSTFISR ISAARPKIAAYPFTTLTPNLGVVIDEYDPDQRMVVADIPGLIEGASEGQGLGHRFLKHVE RTRFLVHILSIEDVNPEENPWAGFDLVNDELNAFDEDLGLRRQLQVINKIDLRTPEEVDA LRAIAARDGREIFFMSADQGEGVEELLAAMWSLRAEMEAHAPLLHYQEMVLEDEEFPDIE VVYTRETE >gi|316921357|gb|ADCP01000161.1| GENE 9 6359 - 7507 1481 382 aa, chain + ## HITS:1 COG:DR1827 KEGG:ns NR:ns ## COG: DR1827 COG0263 # Protein_GI_number: 15806827 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Deinococcus radiodurans # 17 378 3 362 363 258 44.0 1e-68 MNWQEERDATLCEARCVVVKVGSAVLSTGSELNLPVLDNLVNQLAVLSRQGRRVVLVSSG AVAAGRTALKKLPGEPAAVGLSGKQAAAAVGQGRLMHLYDNAFAERGILTAQVLLTRDDL KNRTRFLNIRNTFSELLNWGVIPVVNENDTVSVNELKFSDNDNLASLLINPVEADLFVNL TTIGGVLDANPLETPDATIMPCIEDVRALNLDTLCGGKTSVGTGGMYSKLLSAHRVAQLG VPTAILPGCEPDVIPRLFAGETIGTWVRPEQRTVSRRKYWLAYQADPQGTLYLDTGAADA VRNHGKSLLPGGITEVHGSFEAGALVRLVYDRETVGVGLSNYNAADLLRIRGLKRHEVAA ILGDAHYPEVVHRDNLLLDAAL >gi|316921357|gb|ADCP01000161.1| GENE 10 7568 - 8539 1298 323 aa, chain + ## HITS:1 COG:FN1703 KEGG:ns NR:ns ## COG: FN1703 COG0451 # Protein_GI_number: 19705024 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Fusobacterium nucleatum # 1 323 1 332 332 307 46.0 2e-83 MYIVTGGAGFIGSAMIWRLNEAGIDDILVVDNLASSEKWRNLVNRRYRDYVHRDEFRRMV LEGRAPQKVEAIIHMGACSSTTERNADFLMDNNLHYSQMVCRYALEQGARLINASSAATY GDGSHGFSDSLLTTLSLRPLNMYGYSKQMFDLWAYRENLLKSIASVKFFNVYGPNEYHKG EMKSVACKAFTELRETGSLRLFKSDRPQYPDGGQMRDFVYVKDCVNVMFWLLEHPEANGI LNIGSGKARTWNDLANSVYSAMDKIPDINYIDMPAELKGRYQYYTQADMAWLERLGCDIQ FHSLEDGVSDYIKNYLMNPDPYL Prediction of potential genes in microbial genomes Time: Fri May 13 05:07:02 2011 Seq name: gi|316921322|gb|ADCP01000162.1| Bilophila wadsworthia 3_1_6 cont1.162, whole 
genome shotgun sequence Length of sequence - 41597 bp Number of predicted genes - 36, with homology - 30 Number of transcription units - 19, operones - 7 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 23 - 1291 1293 ## COG2067 Long-chain fatty acid transport protein - Prom 1394 - 1453 2.3 + Prom 1373 - 1432 6.0 2 2 Tu 1 . + CDS 1483 - 2460 843 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Term 2489 - 2522 5.4 - Term 2475 - 2512 9.4 3 3 Op 1 . - CDS 2528 - 4798 1035 ## 4 3 Op 2 . - CDS 4807 - 5286 514 ## 5 3 Op 3 . - CDS 5276 - 5635 439 ## 6 3 Op 4 . - CDS 5632 - 6105 446 ## 7 3 Op 5 . - CDS 6114 - 6662 528 ## 8 4 Tu 1 . + CDS 6749 - 7495 487 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 9 5 Tu 1 . - CDS 7485 - 8297 1145 ## COG1989 Type II secretory pathway, prepilin signal peptidase PulO and related peptidases 10 6 Tu 1 . + CDS 8523 - 9587 1286 ## COG2805 Tfp pilus assembly protein, pilus retraction ATPase PilT 11 7 Op 1 . + CDS 9695 - 11341 1218 ## COG3267 Type II secretory pathway, component ExeA (predicted ATPase) 12 7 Op 2 . + CDS 11356 - 12762 1467 ## COG0836 Mannose-1-phosphate guanylyltransferase 13 7 Op 3 . + CDS 12765 - 14459 1912 ## COG1226 Kef-type K+ transport systems, predicted NAD-binding component + Prom 14584 - 14643 2.7 14 8 Tu 1 . + CDS 14680 - 16029 1386 ## COG1066 Predicted ATP-dependent serine protease 15 9 Op 1 19/0.000 + CDS 16191 - 16517 276 ## COG2127 Uncharacterized conserved protein 16 9 Op 2 . + CDS 16514 - 18787 1447 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 18982 - 19031 13.6 - Term 18972 - 19015 12.0 17 10 Tu 1 . - CDS 19100 - 19858 1051 ## COG0730 Predicted permeases - Prom 19979 - 20038 2.3 18 11 Tu 1 . + CDS 19964 - 20638 541 ## COG2360 Leu/Phe-tRNA-protein transferase + Term 20652 - 20686 2.1 - Term 20634 - 20678 8.1 19 12 Tu 1 . 
- CDS 20737 - 21735 823 ## COG0429 Predicted hydrolase of the alpha/beta-hydrolase fold + Prom 22191 - 22250 6.9 20 13 Tu 1 . + CDS 22440 - 22823 302 ## Dalk_4820 hypothetical protein + Term 22836 - 22882 11.2 - Term 22826 - 22868 9.5 21 14 Tu 1 . - CDS 22878 - 24167 1254 ## COG0123 Deacetylases, including yeast histone deacetylase and acetoin utilization protein - Term 24328 - 24363 6.5 22 15 Op 1 . - CDS 24525 - 25655 1452 ## COG0505 Carbamoylphosphate synthase small subunit 23 15 Op 2 . - CDS 25728 - 26396 721 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase - Term 26762 - 26804 3.2 24 16 Op 1 24/0.000 - CDS 26830 - 28065 1317 ## COG1459 Type II secretory pathway, component PulF 25 16 Op 2 . - CDS 28215 - 29921 2165 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 26 16 Op 3 . - CDS 29925 - 31550 2067 ## COG4796 Type II secretory pathway, component HofQ 27 16 Op 4 . - CDS 31605 - 32096 494 ## Dvul_1790 hypothetical protein 28 16 Op 5 . - CDS 32112 - 32657 653 ## DVU1275 hypothetical protein 29 16 Op 6 . - CDS 32659 - 34740 1615 ## DvMF_1024 hypothetical protein - Prom 34767 - 34826 3.8 - Term 35020 - 35059 10.0 30 17 Op 1 26/0.000 - CDS 35125 - 37362 179 ## PROTEIN SUPPORTED gi|222153157|ref|YP_002562334.1| 30S ribosomal protein S1 - Term 35020 - 35059 10.0 31 17 Op 2 26/0.000 - CDS 35125 - 37362 1615 ## PROTEIN SUPPORTED gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase - Term 37463 - 37495 7.0 32 17 Op 3 14/0.000 - CDS 37507 - 37776 397 ## PROTEIN SUPPORTED gi|220903341|ref|YP_002478653.1| ribosomal protein S15 33 17 Op 4 . - CDS 37802 - 38731 596 ## COG0130 Pseudouridine synthase + Prom 39071 - 39130 3.1 34 18 Op 1 . + CDS 39157 - 39651 409 ## COG0802 Predicted ATPase or kinase 35 18 Op 2 . + CDS 39703 - 40929 1397 ## COG0527 Aspartokinases + Term 40949 - 40992 9.0 - Term 40854 - 40904 3.2 36 19 Tu 1 . 
- CDS 40933 - 41460 -394 ## + 5S_RRNA 41483 - 41541 93.0 # AE015927 [R:2797299..2798807] # 5S ribosomal RNA # Clostridium tetani E88 # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. Predicted protein(s) >gi|316921322|gb|ADCP01000162.1| GENE 1 23 - 1291 1293 422 aa, chain - ## HITS:1 COG:PA1288 KEGG:ns NR:ns ## COG: PA1288 COG2067 # Protein_GI_number: 15596485 # Func_class: I Lipid transport and metabolism # Function: Long-chain fatty acid transport protein # Organism: Pseudomonas aeruginosa # 25 386 23 387 424 134 27.0 4e-31 MKRIASLLLAACLLAAPQLTGKAQAEGFALSDFGARGTALAGGMVARADDPSAVAWNPAG ITQLPGTQIMVGMTAIQPSGTVDTVDRFGIGRSTDVDKHTWANPHAYLTHQFSDNLWGGI GIFSRFGLGNSYPTNWPGRENLKYVSLKTVSLNPNLAFKINDNLSLAVGLEFMYATMLMK KDGDLGKMTGSPLLSGRYDQQMLTGSSIAPGFNLAAHYKFNDEWAAGLTYRSRVKQAVRG HVEFENKNPVPGLVNLPNSDLHGNLNLPDTISFGVMWKPMETLSFEAGAVYTVWSNYRSL NIHMDNPQYGTAYSPKNWRDTWGFNVSAEYKALDWLTLRGGYVFETSPMKDSTCDYMTPS NGRHRITAGVGFNWDQWTVDLAYGYLIIKELSYDKSTADGVLKGRSHNGSSHIAAVSVGY KF >gi|316921322|gb|ADCP01000162.1| GENE 2 1483 - 2460 843 325 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 324 5 327 329 329 50 2e-89 MSSASSPLLDFRDVSRSFTVRQGVLGRVRGEVRAVDGVSLSVRRGETLGLVGESGCGKST LARMAVRLLPPTGGEILFDGESVLPGVSGRNDTWYSRRLQMIFQDPFSSLNPRLRIGTSI AEPLMAIGAPSAERQERVGEILRRVGLQPGHAGRYPHEFSGGQRQRVAIARALVTRPELV VCDEAVSALDASVQAQVLNLLKEVQADFGLTYFFISHDLSVVGYMSDRVAVMYLGRIVEQ ADRNTLFGSAAHPYTKALLAAAPVHDPAKRHIRAFLPGELPSPFSPPAGCAFHPRCPQAM PRCREEAPELREIAPDHLARCHLYG >gi|316921322|gb|ADCP01000162.1| GENE 3 2528 - 4798 1035 756 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQTSQRGSVLLTIIGGIVILALLGMVAVSLMTGSVMTGLDAKETVQASYLAESGKEIVRV QTAQQSGGEMMSIARQLEESSRQHNGKGIEIDGKGFVKLTLYPSWFRWYEDKLHLTDSGW YGGVPNGKREMLVLNQSGAMRYAYENGFGSETPGSTAADVYLIGRVQGENGSITYDTTKR TLTVKTTADDLKWFPEYGGLIGLVRQQEKDSQLPTSTQVSKLRFTYNSKKTDGDVHTFTD FIPLSEGTISEDTFKDKDIVLGQYFRVVSEAQTANGARAALVWHTNGRSSLRFAGSGGSE 
EGGNATENIIGGNLSSEEDLKELFGQHLTEHNLNKGYTFTKYGPDLTALSVQGFQPSMPN HFLLKTTRKTSTDYWFAGITDKTIPIDSFQKEHGVMIQISTMIHPPGSSKKPHFFGGLLF RTHMLQGKSALTPHGVSGLGMGIVYGSIEIENSEKEGYYEVDDCTINPALLPGFEWFSKD SNDPEEPDGSFRIAKTDWEQYSRIFVQGVPSAGRKMTIKPTLILWAYDDEELNDGNVSEP PDSLRCLAAASLGPNDVYSPLSTPLDAFYFTRIVTQVREHEGDNKIRAWIASQRPPKYGV TDYTKYDPETNNFLYSYDIAHTLELGKILDDIRNGFKVSEGALWPVLDFSDGVEQNLMTD RFTLIGWDYVNPEFSRFETEKVDGRPNIKTTILTQFATGAAYNRAGIFQGTYTEDDDKNN IPSTYNLNFRNFAVGIPGTGGSGTDDVPGLTPGIVQ >gi|316921322|gb|ADCP01000162.1| GENE 4 4807 - 5286 514 159 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRSDARAPERGFTLLEIVCTLVILGVLGSLVFSGFGTAITGYTQMREVGTSDMQAELALI RLRKELAGAADVPESPFKDSPSVTFTKTDGTQTTISCEDTGNKLLIDGQILLSDVASCRF TTNGAGPSYIGIVLTQQLAERQVTWTLSVAPRNTPLAKD >gi|316921322|gb|ADCP01000162.1| GENE 5 5276 - 5635 439 119 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNARGFTLIEIIVTIILMAIMGFMAAQLIATTLRGSAESAGQIKDLTEATSTLEECIAYL NTEAMQQKAASDLIKDKGLEALLEAQDKRELWHPDGCPSCPDNLLITVKRGSVELSRAF >gi|316921322|gb|ADCP01000162.1| GENE 6 5632 - 6105 446 157 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQPPFRNGFTLIEIVSVLVVLGVLAYLGISAFRGNDEITAYAERDRLLSQLVYARAQGMA MGGGQCVTIDTNKVSFFMKGKSALPTLLKDYTPSASIQSTPPATFCFDAAGSVCGEDSLG NPNDTGILYCTASASDQTFSFGGGITLTLFAETGFVQ >gi|316921322|gb|ADCP01000162.1| GENE 7 6114 - 6662 528 182 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKRKRFSLAWLALLALAGGVELLIVDGTLSGGAPLSRMASSVRETLDQTPTPANAASQQ MYIWKDASGVTHISETPPPKPPTGKVEEYQFSPQPPAAQPEPVTPLDIPSPEHQPSAAEL EKATAEARKMLQQQVEQLQKERGQLEKQLYRARATGDGYAQIRFRTLLEQNREALEKLIP KQ >gi|316921322|gb|ADCP01000162.1| GENE 8 6749 - 7495 487 248 aa, chain + ## HITS:1 COG:BS_yjbJ KEGG:ns NR:ns ## COG: BS_yjbJ COG0741 # Protein_GI_number: 16078222 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Bacillus subtilis # 131 245 65 179 181 154 60.0 1e-37 MVAQAVPKDERPRLSWLGLPVWAGLILVCCLLCATGVRAAGPSKLKVVVRPPIKKTLPQR 
PHAARPDAPSAQPADPLPAESVAIYRHVGEDGTVFLTNRPDGDGRYQFFGRFSAERLMRA VGPDGVARLAERYAGQHGVAPRLVMAIIKVESGFDAKAVSSAGASGLMQLMPGTQRHLGV RDAFNPDENVEGGVRYFRSMLDRYGGNISLALAAYNAGPANVDKYGGIPPFEETRNYVRK VLALYATP >gi|316921322|gb|ADCP01000162.1| GENE 9 7485 - 8297 1145 270 aa, chain - ## HITS:1 COG:aq_1601 KEGG:ns NR:ns ## COG: aq_1601 COG1989 # Protein_GI_number: 15606720 # Func_class: N Cell motility; O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, prepilin signal peptidase PulO and related peptidases # Organism: Aquifex aeolicus # 18 265 9 252 254 150 38.0 3e-36 MYQTIPPELLHPEMFIPAATLLGLLLGSFYNVCIHRYVSGESILFPPSHCPHCKQRLRFW ELIPVLSYLLLRGQCARCHKPIHIRYPLVELLSGLTSGLLAWRFGPTLAFPVYLVFTGML IVASGIDLECFILPDGITLGGTVLAVPAAVFALGMDWTDALLGGLVGGGTFLAVLLVFKR LRGVDGMGFGDVKLMLMLGVLCGPLGLPLITLVAGVSALAAFLLIACLMPREAPLREMPI PFGPFLSLGAFVHILAGQEILDWWIRFITG >gi|316921322|gb|ADCP01000162.1| GENE 10 8523 - 9587 1286 354 aa, chain + ## HITS:1 COG:PA0395 KEGG:ns NR:ns ## COG: PA0395 COG2805 # Protein_GI_number: 15595592 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein, pilus retraction ATPase PilT # Organism: Pseudomonas aeruginosa # 16 348 15 342 344 356 54.0 3e-98 MAKIDNLFHLLHANHGSDLHLSAGNPPLLRIYGDLARVDAPVVTTEELLAVLREMVPMER VQAFEQTGDLDFAYDIPGLARYRANFFRQNCGISAVFREIPQRISTVEELGLPIMLRELA LLPKGLVLVTGPTGSGKSTTLAALVDYANTKRRDHIITIEDPIEFVHQSKSCLVNQREIG RDSRSFGTALRSALREDPDIILVGEMRDLETISLALEAAETGHLVFATLHTISAPKTIDR IIEVFPPEEQAQIRTSLSESIQAIIAQTLFKRADKKGRVPALEIMFGIPSVRNMIRESKT FQLPSVLQTNRNVGMQTMDDDIERLLMQRIIAPEAALVHAQDKGRLREAIGKLK >gi|316921322|gb|ADCP01000162.1| GENE 11 9695 - 11341 1218 548 aa, chain + ## HITS:1 COG:VC0403 KEGG:ns NR:ns ## COG: VC0403 COG3267 # Protein_GI_number: 15640430 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component ExeA 
(predicted ATPase) # Organism: Vibrio cholerae # 4 250 2 240 281 103 32.0 9e-22 MSSYLELLNLTREPFSNSPDPDAYYRTPTHEDCLNRLEIAIRLRRGLNVVLGEVGTGKST LCRCLLRSLNEQSGIDVFLLLDAGFEDADEFVRHLCELFAGQRPPEGVARRECISVIQNR VFDKALEQNRNLVLFIDEGQKLSPAALEVLRELLNFETNTEKLLQIVIFGQPELAQIIEG MPNFKDRINEYLYLKPLSMRESIRLVRHRLRLAGGQRAERLFSLGSLIALHRASRGKPRQ LMRLGHQMLLALLVGNRTTVTGRMVWAQVARNGSDVQNRWGRWLVVALALICVAGGYRWM PEAVKDDAHARLEAVVSGVLGWLPEAEKADTAGTAGKGRAAGDDAVRRAASREAVPQPLP PDEAPPAVPEKAESSTAEAAGGQGAEAEKSAVPETLGTFRLQPGEALAELVRALYGSGSA LELVLGKNPGYRNDIGAGPQDLTLPAVLYAPPPSMTKGVLLSLGAFGTLEEAYTAWRGYG KRSPSVALTPIWRPGEGLSFHILAPRGFASERAAWSWLSRFSPPAEASVRLLPPFDGACL VFHKFTVK >gi|316921322|gb|ADCP01000162.1| GENE 12 11356 - 12762 1467 468 aa, chain + ## HITS:1 COG:YPO3099_1 KEGG:ns NR:ns ## COG: YPO3099_1 COG0836 # Protein_GI_number: 16123273 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Yersinia pestis # 1 353 1 354 354 385 55.0 1e-106 MSTLRFIVLCGGSGTRLWPLSRALYPKQFMDLGGHTLFGDTVDRAAALPGSADPLVVCNE EHRFYVAAALQQKGVSGTILLEPKGRNTAPAIALAAFAALSGGGDPLLLVLPSDHVLKPQ DVFAEAVARARACAESGRIVTFGITPDAPETGFGYIRRGGALPSGGYAVARFVEKPDLAR AEAMLADGGYLWNSGMFLFRASIYLEELALHAPGIHAACKAAWEGHHADRDFIRPDADAF LSSPADSIDYAVMEKTDRAAVVPLTADWSDLGSWEAFYEAAPHDGDGNVRVGDVYAEGAE NCYLHASNRMVAALGVSDLVVVETADSVLVADRARTQDVKKIVESLKKEGRGEAESHPLV YRPWGSYETLARGERFQVKRIIVKPGGQLSLQKHHHRAEHWVVVEGTAEITVGDKVLLYH EDQSTYIPLGTVHRLKNPGMIPLVIIEIQSGTYLGEDDIVRLEDSYGR >gi|316921322|gb|ADCP01000162.1| GENE 13 12765 - 14459 1912 564 aa, chain + ## HITS:1 COG:VNG0615C KEGG:ns NR:ns ## COG: VNG0615C COG1226 # Protein_GI_number: 15789819 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, predicted NAD-binding component # Organism: Halobacterium sp. 
NRC-1 # 21 562 5 544 545 199 29.0 1e-50 MKLLLTLLPALLTQTTGRRNQRIVIGFLLFTTVLVALFSIVFHQIMAYEGRDYSYITGVY WTLTVMSTLGFGDITFTSDIGKLFSIIVLVSGIILIMIVMPFTFIRFVYQPWIEEYNNKR KPRSLPSDTSGHTVLVGDNDISLSVARKLRQHNYPYVILVPDGQHALELYDNRYDVVTGE FDDANTYRNLRADKAALVAALEGDLRNTNIASTVREVAPSTLLAASAENPEAMNILMLAG CDHVYSFTEMLGRSLARRVYGTRAQSNIIARFGTLCLAEAPVIHTEFMGQTLRECGFRER FGLNVAGLWEGNSYMSARPDSRIDEASILLLAGTADQLEAYDRKAERSSKANRTPVLILG GGKVGEAAADALERRGLPFCLVEKNPRLVPPDDPRYILGNAGELAVLQRAHIMETPSVIV TTHDDDLNIYLTIYCRKLRPDVQILSRSTLDRNVPSLYNAGANLVMSHASMAASTIINLL SPGRVTILTEGLNIFRVTAPPALVGLSLRDSRIREKTDCNVVALKSQGVLRVPPDPNAPM KSDTVLVLIGTADAERRFMEHFPS >gi|316921322|gb|ADCP01000162.1| GENE 14 14680 - 16029 1386 449 aa, chain + ## HITS:1 COG:aq_552 KEGG:ns NR:ns ## COG: aq_552 COG1066 # Protein_GI_number: 15606010 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Aquifex aeolicus # 4 430 3 422 444 345 44.0 1e-94 MASKIKETYVCAECGAKSPLWRGQCLVCKAWNSLELVAAPVEGARKRPIGSLGSVKAMPL AHVEDHGHEPFGTGLEALDRVLGRGLVPGSALLMGGEPGIGKSTLLLQMAGAVASSGKLV LYASGEESLPQIKDRANRLGMLHDNLLALSTSRVEDVLPLLEGDGAPDLLIVDSVQTFTS ESAEGLPGNVSQVRAVATEIVDACKRGTTTVVLVGHVTKDGALAGPRLLEHMVDTVISLE GDRRQMFRLLRVLKNRFGPNQELLVFQMVRQGLEVVEDPATYFLGARDASLSGTAVVMAV DGQRPLAVEVQALVTRSYLSIPRRTGLGFDVNRLHLILAVLEKRLRLNFGQVDIYAKVGG GMKIQEPGMDLALVAAMLSSFYDVPLPERAVLWGEVDLNGQIRPVAAHDIRLSQARRLGY KPILFPSQGEGDGIATVVELQDRLFRRKN >gi|316921322|gb|ADCP01000162.1| GENE 15 16191 - 16517 276 108 aa, chain + ## HITS:1 COG:CAC1823 KEGG:ns NR:ns ## COG: CAC1823 COG2127 # Protein_GI_number: 15895099 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 21 106 16 101 101 101 55.0 4e-22 MSQNPFHAAPGGQPDVIMDVEVAEPKMYQVLLHNDDYTTMEFVVNILMTVFHKTADQATN IMLAVHKRGKGIAGVYPHEIAETKVDKTHFLAREAGYPLRCTLEEVGA >gi|316921322|gb|ADCP01000162.1| GENE 16 16514 - 18787 1447 757 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal 
protein S8 [Bacillus selenitireducens MLS10] # 4 757 8 814 815 561 39 1e-159 MIGKKLQKALGLAVEELKRRRHEYLTLEHVLYGIASEPSGRKLIERCGGSAPVLRQALDH FFKTYMESLDEPTKDVYQTLAVQRVLDSALAHVKSAGRDEDIGVEVGDVLAAILEEEEYS WAAYCLKKQGITRLAVLESISHNDEEGQDESSSESGEGEGGAESAKDALARYTVDLTARA REGKLDPLVGREMELSRSIEILARRRKNNPLYVGDPGTGKTAIAEGLALRIVSGNVPPEF KDTKIYSLDLGAVLAGARYRGDFEGRIKAVVGALQKIPGAILFIDEIHTIVGAGSTSGGS MDASNLLKPILAEGKLRCIGSTTYDEFRNHFEKDRALARRFQKVDIKEPSLDECVDILKG LQPHYEKHHNVRYSSSSLRAAVELSARHVQDRLLPDKAIDVMDEAGASVRLRPGFKSGSS VSRQDVERIVARMAGIPARTVSGKERDRLKTLKDDLGSVLFGQDEAVDIVTRAILRSRAG LGRTDRPAGSFLFYGPTGVGKTELAKQLAERMGVAFLRFDMSEYMEKHAVSRLIGAPPGY VGFDQGGLLTEAVRKTPYAVVLMDEIEKAHPDIFNVLLQVMDYGTLTDNTGRKADFKNVV LIMTSNAGVRDMDASPMGFIEAPAGKVAQSAAQRGRKAVEAMFSPEFRNRLDALVPFNAL TPDLMGMIVDKCIAEMGKGLADKKVNLTLTPEARSWLAKEGYDAKLGARPLQRLLREALE DPLAGEVLFGRLVKGGTVIVDEPEDGGDKLVLRFEEK >gi|316921322|gb|ADCP01000162.1| GENE 17 19100 - 19858 1051 252 aa, chain - ## HITS:1 COG:NMB0432 KEGG:ns NR:ns ## COG: NMB0432 COG0730 # Protein_GI_number: 15676344 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Neisseria meningitidis MC58 # 2 245 4 247 262 134 38.0 2e-31 MLLTVSLLCIVTFIAGLIDSIAGGGGLLRLPALLIAGVSPQQALGTNKFTGAIGTGAAML NFARKGLILWKLAFIGIPCALLGSAAGSKCILAFDPQTAGKILIALLPVAAMITLIPRRS KSCKDSFTSKDVYLLTPLICFSVGFYDGFFGPGAGSFFIIAFNVCLRMNLIQASALTKVF NLSSNLGSLFVFLYHGDVLFLYAIPMTVADVLGNLAGSQLAIRTGPTIVRRFLLLSLIIL FVSLIWKYYFAG >gi|316921322|gb|ADCP01000162.1| GENE 18 19964 - 20638 541 224 aa, chain + ## HITS:1 COG:XF1446 KEGG:ns NR:ns ## COG: XF1446 COG2360 # Protein_GI_number: 15838047 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Leu/Phe-tRNA-protein transferase # Organism: Xylella fastidiosa 9a5c # 3 222 6 223 243 206 47.0 3e-53 MIYRLHPDYPELFPDPEGADPEGLVAVGGDLSVRRLLAAYGAGIFPWYGEGQPLLWWSPD PRCVLFPEKFRIPHTVRKEIRKCGFSVTVNQAFCDVMTGCAATPRPDQDGTWIMPEMVDA YASLHELGFAHSVEVWEHDAAGNTLVGGLYGVGLGRAFFGESMFHVRPHASKLALVSLME 
WLKARHCQLVDCQMATDHIMRYGAECIPRHDFLQQLRKALCRGG >gi|316921322|gb|ADCP01000162.1| GENE 19 20737 - 21735 823 332 aa, chain - ## HITS:1 COG:PA0368 KEGG:ns NR:ns ## COG: PA0368 COG0429 # Protein_GI_number: 15595565 # Func_class: R General function prediction only # Function: Predicted hydrolase of the alpha/beta-hydrolase fold # Organism: Pseudomonas aeruginosa # 8 321 6 326 332 201 39.0 1e-51 MPVVNHSSYNSPIWLRNGHLQTIWPVLFRNPPLPSLWRERLETPDGDFIDIDHIPACAGI RSGRVAILSHGLEGNSTRRYMLGMAEALNRRGWDVVARNFRGCSGEMNHTLPLYHGGETD DLHLVVQYCVSLGYGSIVLVGFSMGGNQTLKYLGERDRTIPSQVSAAVAVSVPCDMEGAA EVLSLPSRAPYMAYFLRTLRRKVEEKHSRFPDRIDIDGLDRIRTFSEFDDRYTAPLHGFD SARHYWRESGCLRFLEHIDVPFLLINASDDPFLSPDCYPNRIAEANAAMTLEMPPWGGHV GFVTPGQDLYWSEKRAADFLEYICGSVHEVIQ >gi|316921322|gb|ADCP01000162.1| GENE 20 22440 - 22823 302 127 aa, chain + ## HITS:1 COG:no KEGG:Dalk_4820 NR:ns ## KEGG: Dalk_4820 # Name: not_defined # Def: hypothetical protein # Organism: D.alkenivorans # Pathway: not_defined # 2 88 6 91 135 72 54.0 4e-12 MQKTKKGQLGFTLIEIISVLVILGILAAVAVPKYYDLQKEAADKAAAAVAAEAIARYNMK FSKELLGGGKTCSDALNAAKIEAFGKADDAADYGSDWKVTYVSNKVTAMNEANKGTYAID FEEPFCN >gi|316921322|gb|ADCP01000162.1| GENE 21 22878 - 24167 1254 429 aa, chain - ## HITS:1 COG:MA2888 KEGG:ns NR:ns ## COG: MA2888 COG0123 # Protein_GI_number: 20091710 # Func_class: B Chromatin structure and dynamics; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Deacetylases, including yeast histone deacetylase and acetoin utilization protein # Organism: Methanosarcina acetivorans str.C2A # 2 403 100 504 546 472 54.0 1e-133 MSKRTLGVVFFPAFDWAISPTHPEREERLLYTQDQLREEGLFDLPGITEYKPGVATYEDI ERVHFCLPDVGRVCSASHLASAGGAIRAGELVLSGERERAFALVRPPGHHAMRVTHGNRG FCNVNMEAVMIENLRRRFGPLRVAIVDTDCHHGDGTQDIYWNDPDTLFISMHQDGRTLYP GTGFLPECGGPGALGRTVNIPLPPGTGDEGYLYVTKNVVLPLLEAFKPDLVINSAGQDNH YTDPLTNMQLSAHGYAAMNALLNPHIAVLEGGYSIRGALPYVNLGICLALAGLPFEHVHE PDHDAKALKQRPQVTEYISRLCDDVLNQYHNPPSRPSEGHRDGEWWRRERDIYYDTDGLS EHQNEGIRLCPDCPGLTCIETSSDRVDKSLCLLLPRNACPHCRDLAHRLVETTGKKSGRY 
AHVLCIGGQ >gi|316921322|gb|ADCP01000162.1| GENE 22 24525 - 25655 1452 376 aa, chain - ## HITS:1 COG:alr1155 KEGG:ns NR:ns ## COG: alr1155 COG0505 # Protein_GI_number: 17228650 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Nostoc sp. PCC 7120 # 1 371 7 377 388 370 51.0 1e-102 MKALLALEDGFVLEGRSFTGPCEGGGEVIFNTGMTGYQEILTDPSYYGQMVCMTWPLIGN YGISAEDMESGKVHAAALIVKECCKTPSNWRSVCSLPDFLQQHGVPGIEGIDTRALTRHL RMHGAMRGMVSTSITDPAELVMQAKSLPSMEGQNLVTRVMPATPWKWTGNGGVPAELNAD GSYNWPGEGPRLVAYDFGIKWNILRLLTEQGFDVLVVPPSFTAEQVKASGAQAVFLSNGP GDPATLTDEIVTIKKLVESYPVAGICLGHQLLGHALGGTTRKLKFGHHGCNHPVKDLSTG HVEISSQNHGFCVDIDNISDVVVTHVNLNDGTLEGFAHKTKPILAVQHHPEACPGPIDSQ YFFARFRGMVRDAVGF >gi|316921322|gb|ADCP01000162.1| GENE 23 25728 - 26396 721 222 aa, chain - ## HITS:1 COG:Cj0813 KEGG:ns NR:ns ## COG: Cj0813 COG1212 # Protein_GI_number: 15792151 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Campylobacter jejuni # 7 214 29 238 239 151 38.0 1e-36 MFWHVYAHAKRASVLRNIVLATDDERIAEAALEWEIPCVMTRRDHASGTDRVFEAASKLG VEPHAVIVNIQGDEPALDPAVIEQLVRPFLDGTVQVSTLATPISPERAASPNQVKVVTAA NGDALYFSRSRIPFDREGGDGEILGHIGLYAFRMGALERFVSLPQSPLEKREKLEQLRFL ENGVPIRVVRVEGYEAHGVDTPEDLETIRELLAEHTCKSCVR >gi|316921322|gb|ADCP01000162.1| GENE 24 26830 - 28065 1317 411 aa, chain - ## HITS:1 COG:DR1863 KEGG:ns NR:ns ## COG: DR1863 COG1459 # Protein_GI_number: 15806863 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Deinococcus radiodurans # 8 403 4 400 406 265 37.0 1e-70 MADQKHTYAYRAINDEGVTLTGDIEAESAEQARQALLGRGLMPTEVTRRDGPLKSKRFQK FFQKPVTYKQIILFSKQFRTLFNAGVSITQLLGILQTQTENPTLKAATADIAQQVSTGGT LYNAFRSHPDIFSPLYCSMIRAGEFSGSMGDVLDRLTYLLEHENKIRNDVKSALRYPKIV LITLAGAFFFLLNWVVPSFAKLFVNAKIELPWPTRMALAMNTALSDYWYLLLAAAVGMFF 
GVRWWIRTSRGRYLWARTLLTLPLVGKVVQMSIMARFAAIFSILQRSGVSILDSLDILSE TINNAALEREFSTIKEKLRSGQGIAEPLSTARYFTPLTINMVAVGEGAGNLETMLEDLAA HYDEEVEYAVGEMTEAIGPILIVVLAAVVGFFALAIFMPMWDMTKLASGGR >gi|316921322|gb|ADCP01000162.1| GENE 25 28215 - 29921 2165 568 aa, chain - ## HITS:1 COG:DR1964 KEGG:ns NR:ns ## COG: DR1964 COG2804 # Protein_GI_number: 15806962 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Deinococcus radiodurans # 7 563 295 841 846 404 40.0 1e-112 MHTRMRLGELLLASGLLSADQLEAALQGQRSSGMKLGEYLIKQGICRESDVVDAVCRQTG IERYTPSRFPLNLSLSDRLPADVAQRTNAVPLLIRGDVLVVAMMDPLDIDALDRIEIVTD REVEPVMCLRQEFTQLYAALYGHFNTMDGVMESFTALPDQPMASDDLLIASETPKDELGQ PDEAPVVRLVNSILTQAVRESASDIHISPEKDSIQIRFRIDGKLRKTPSPPKSVGASIVS RIKILANMDISITRVPQDGRFTMMVDRREINVRVSTLPTIYGENVVMRLLDMSTNHVYTL DKLGMSDKDCKTISQTIERPYGMILSTGPTGSGKSTSLYSILQLLNKPDVNAITLEDPVE YRIDGIRQVQLNVRAGMTFASGLRSILRQDPDVIMVGEIRDSETAQIAVQAALTGHLVLS TLHTNDAPGAVSRLMEMHIEPYLVASVLLCSFAQRLVRKVCPHCAEPYDPPRALLGLFGI KDAEGATFRRGKGCYHCGNTGYLGRIGIFEVMPVTPEIQEAIVRRAHAQEISAIAQRQGV MNTLAQDAARKVRAGITTVEEALRAAMV >gi|316921322|gb|ADCP01000162.1| GENE 26 29925 - 31550 2067 541 aa, chain - ## HITS:1 COG:PA5040 KEGG:ns NR:ns ## COG: PA5040 COG4796 # Protein_GI_number: 15600233 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component HofQ # Organism: Pseudomonas aeruginosa # 79 506 284 705 714 169 30.0 2e-41 MRQRPVSILCLLALFCLLAACSPKQDKKQEFMDHWKQLSQDSQGYSPAPSDLRPEPRVIM RHTEKQEQAASRPLPTIPVTLKLHNVDVGVALRSLAAAAGVNVMLSPGVSGTVSLNVQKS PWRDVFQGLLRANGLQYRWQGNILQVLTAVEKQKEINLQTLDNQLAQQELISRQNSPLTV SVVSVRYAEAAALQQSLTKFLTAANGQGGQAAVIEVDEHSNALIIQATEQDQQRIIRLVD NLDRPRPQVHLKAYIVEATKETARELGVQWGGVWRSGSFNNGNHAWIGSGASGTQGQDPV TGGLTGTHGSGLGGAPFGLDYSGIASNDMGSLGFLFGKVGGSMLEVQLNLMEKDGNINIL SSPSITTLDNKMAYTENGEKVPYVSTSNMGDREVKFEDAVLRLEMTPNVIDGDNLKLKVL 
VKKDEVDTSRSVDGNPFIIKKQTETTLIMRNGETVVISGLTKEKGTDINAGVPGLKDIPG GKYVFGHESKGKTMEEVLIFITPEILPTRELPPLPSAPSMSESPSLRNGALPAPSAGAAR R >gi|316921322|gb|ADCP01000162.1| GENE 27 31605 - 32096 494 163 aa, chain - ## HITS:1 COG:no KEGG:Dvul_1790 NR:ns ## KEGG: Dvul_1790 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_DP4 # Pathway: not_defined # 1 163 1 164 164 72 33.0 7e-12 MTLREKILLAAMGAAVMGGGIQFGLSTLLPSPAPSGTDSSLQQARAVAGEVTARLQALPL SDGQRYILASSLRPAPRNPFTLPANGLPDAQQSASANASGLVYSGYVQAGKEALAIIGGL EYGVGDTLPNTGDVIRFIGSDGVRLYSPSRNAEWTLPYSGDDI >gi|316921322|gb|ADCP01000162.1| GENE 28 32112 - 32657 653 181 aa, chain - ## HITS:1 COG:no KEGG:DVU1275 NR:ns ## KEGG: DVU1275 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris # Pathway: not_defined # 10 169 14 174 186 73 33.0 3e-12 MRLPIIPAWFPLRALALQGVMALCVLLFCFAPLWYVTWKLEADIKGLETRIQVQETLQPL MEGLNARREETRRLLGATKTLPTPDTLPLIVTSLQEMVSLSGMQSSHFVPAAETVVEKNR IRLDGTLSGPPDNFRRLVLLLSEQPWLSGMEFLNVTPSGALPAYSLGIWATFTQQANTPR H >gi|316921322|gb|ADCP01000162.1| GENE 29 32659 - 34740 1615 693 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1024 NR:ns ## KEGG: DvMF_1024 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 6 693 5 654 654 289 32.0 3e-76 MSLSASQASATQKLLEHIRSTPQKDDASASFAPPLQIPPAAPLRPLFGKERSVVGIDVGK GRLCLVQRRAGSTPILEAIQSFSIPETMDWETPGLSDRIREALSAFLPHGGRDADIWVRL PDGQDELRHYRIPKVSPKDRDAIANMAAIREKPFDEAENIFDYRVDAEVLDKGVPRLPVT AMIANKGAVNTVRHTLAAAGVSPAGISSGNIYAQNLFASGWLSSPWEHFAFADIGEDSTR IEIFSGTNIALSRTIKTGLRSLVTALQESYGGRRKKTPPPPVVPPMPRHLPMEALGGELA GSHSERQQAVSSFTSPSDPFPLILQPESLPLELSLPQLSPMDMGMASAPVAPSVPAASLS SLAEKIPDDEISYEEGLHLLCENRRRTPEEEERLLLRLSQPLGRLARQLERTADHFRNAM GMPNIQGVIVFAPGGCMALALKKFETSLGLPCRPLRFDGQTTPGAATDLEKALARSADES LLQAIGLSLSAPAYTPNAIMTYKDQKSRERQMRITFLSIGLTTVVLMLLLAFCGKLYMGY LDEKAHKAELESRIASWPVLYTPDQLRNNLADTQKWQVQARTLAKRRIAAALMTELSTIT PEAIHITGMRVTFKDVNPGKQEARRGTKKPSEPEESAVAVLTGSVAGDMLQRESQLAEFL SQLEHSPMVLSLVVEKQQTDTEILSFVATLRLV 
>gi|316921322|gb|ADCP01000162.1| GENE 30 35125 - 37362 179 745 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|222153157|ref|YP_002562334.1| 30S ribosomal protein S1 [Streptococcus uberis 0140J] # 623 745 273 380 405 73 34 2e-12 MSFRKAPTRVSATIGGKEIILETGRVATQAHGSVTIQCGGTVVLVTVCSQPLDFDRGFFP LTVEYSERMYAAGRIPGSFFRREIGRPSERETLVSRLIDRPIRPLFPKGLPEDVQVLASV ISSDQENESDVLSITAASAAVMLSPLPFAEAVAGGRIGRINGQFILNPTVKEMAESDLNI VFAASADALVMVEGEATFVPEDVIIDALEWGRKEIQPLVEAQVKLRELAGKAKMAFTPQE DDAALLARIEELAAGAGLEAAMRVPEKMARKDARKLVKEKIVEALKADPTYGEDEKALSS VGDIIGHIEKKVVRKRILDEGTRIDGRDTKTVRPIEIQPGILPRAHGSALFTRGETQSLC VTTLGSSTDNQRMDSLTGDVTKTFMLHYNFPPFSVGEVKPVRVSRREIGHGALAEKALKP IIPAGDGFPFTVRVVAETLESNGSSSMAAVCGGCLSLMDAGVPISDPVAGVAMGLIKEGD NFIVLTDILGDEDALGDMDFKIAGTAEGVTAVQMDIKITGLTTEIMRKAMRQAHEARLHI LGEMKKAIDGPRAELSKYAPQHAEVFVNPEVIRMIIGPGGKNIKAITAATGASVDIEDSG RISIFAPTAESMEQAKELVQYYDQRPDLGKNYMGKVRKVLEIGAIVEIMPNVEALVHISQ LDTSRVAQASDVAHLGEDMLVKVIEINGDRIRASRKAVLLEEQGIEWKPEDTARPARAPR GEGDRDHRGDRGDRGERRERRPRRD >gi|316921322|gb|ADCP01000162.1| GENE 31 35125 - 37362 1615 745 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase [Brucella abortus bv. 1 str. 
9-941] # 9 698 6 694 714 626 49 1e-179 MSFRKAPTRVSATIGGKEIILETGRVATQAHGSVTIQCGGTVVLVTVCSQPLDFDRGFFP LTVEYSERMYAAGRIPGSFFRREIGRPSERETLVSRLIDRPIRPLFPKGLPEDVQVLASV ISSDQENESDVLSITAASAAVMLSPLPFAEAVAGGRIGRINGQFILNPTVKEMAESDLNI VFAASADALVMVEGEATFVPEDVIIDALEWGRKEIQPLVEAQVKLRELAGKAKMAFTPQE DDAALLARIEELAAGAGLEAAMRVPEKMARKDARKLVKEKIVEALKADPTYGEDEKALSS VGDIIGHIEKKVVRKRILDEGTRIDGRDTKTVRPIEIQPGILPRAHGSALFTRGETQSLC VTTLGSSTDNQRMDSLTGDVTKTFMLHYNFPPFSVGEVKPVRVSRREIGHGALAEKALKP IIPAGDGFPFTVRVVAETLESNGSSSMAAVCGGCLSLMDAGVPISDPVAGVAMGLIKEGD NFIVLTDILGDEDALGDMDFKIAGTAEGVTAVQMDIKITGLTTEIMRKAMRQAHEARLHI LGEMKKAIDGPRAELSKYAPQHAEVFVNPEVIRMIIGPGGKNIKAITAATGASVDIEDSG RISIFAPTAESMEQAKELVQYYDQRPDLGKNYMGKVRKVLEIGAIVEIMPNVEALVHISQ LDTSRVAQASDVAHLGEDMLVKVIEINGDRIRASRKAVLLEEQGIEWKPEDTARPARAPR GEGDRDHRGDRGDRGERRERRPRRD >gi|316921322|gb|ADCP01000162.1| GENE 32 37507 - 37776 397 89 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|220903341|ref|YP_002478653.1| ribosomal protein S15 [Desulfovibrio desulfuricans subsp. desulfuricans str. 
ATCC 27774] # 1 89 1 89 89 157 85 1e-37 MVMDAAQKKTVIEAHAKHEGDTGSPEVQVALLTARIEGLTGHFKEHKKDFHSRRGLLKLV GQRRNLLNYLKSKDIQRYRALIEKLGLRK >gi|316921322|gb|ADCP01000162.1| GENE 33 37802 - 38731 596 309 aa, chain - ## HITS:1 COG:PA4742 KEGG:ns NR:ns ## COG: PA4742 COG0130 # Protein_GI_number: 15599936 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Pseudomonas aeruginosa # 6 240 13 251 304 161 41.0 2e-39 MEQQHGLLVVDKPRGLSSAQCTNRFKRLGQKKIGHAGTLDPMAQGVLLVLLGHATKISGY LMAGGVKAYQGTVRLGQTTDTWDADGQITAEAPWDHVTAEAVADVIAGWVGTSEQPVPPY SAAKHQGQPLYKLSREGKETPLKIKTIEISRAEVLRVELPYVTFRVICSSGTYIRSLAHS LGTRLGCGAVLTELIREYSHPFGLDRACPLDTLLAEPETLPERVIPVTAALPHWPTITVT GQEEADTKNGKILVWTEARRARSRAQDGTDIPCVPGANAVLLSPDGQPLALAEAVSGGPA WKVLRGLWN >gi|316921322|gb|ADCP01000162.1| GENE 34 39157 - 39651 409 164 aa, chain + ## HITS:1 COG:SMc02757_1 KEGG:ns NR:ns ## COG: SMc02757_1 COG0802 # Protein_GI_number: 15963790 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Sinorhizobium meliloti # 5 143 8 138 151 83 42.0 1e-16 MDISLPDAESTVEFGRQLGRALNEQYAEGGEQVHIILFYGDLGSGKTTFTRGFIEALPGG ENAEVSSPSFTLCNSYPTTPSVIHCDLYRSEGALPDEVDEALDTESGLVLVEWAERIAAE NLPPKRLDILFQVCKNNRLVTLSPYGKAAHCVLQKLARLRDSGE >gi|316921322|gb|ADCP01000162.1| GENE 35 39703 - 40929 1397 408 aa, chain + ## HITS:1 COG:PA0904 KEGG:ns NR:ns ## COG: PA0904 COG0527 # Protein_GI_number: 15596101 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Pseudomonas aeruginosa # 1 405 1 407 412 408 53.0 1e-113 MRILVQKFGGTSVADIDRLKMVRGKVKAALDQGYKVVVVLSAKSGKTNKLLDLSTRWAAE PDLAEVDSLLSTGEQASIALFSMLLKDSGIKARSMLGWQIPIITNDEFGRARILSIDASR IHKELEAHDVLVVAGFQGSTEDGRITTLGRGGSDTSAVALAAALDSCECEIYTDVDGVYT TDPNLCSTARKLDRVTYEEMLEMASMGAKVLQIRSVEFAKKYNVPVHVRSTFSDDPGTLV AQEDAMMEAVLVSGIAYDKDQARVTVCGVPDRPGVSAALFAPLSENGIMVDMIIQTASRE GVTDMTFTVSRKDLEKAIQLMKEIVARIGGTGVEHDPYVAKVSVIGVGMRNHTGVATKAF AALQQENINIQMISTSEIKISCLIDEKYTELAVRTLHDAFELHKPDHA 
>gi|316921322|gb|ADCP01000162.1| GENE 36 40933 - 41460 -394 175 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPISLWAKTQSFLFPTRSSLVKEPEAVASAAAFPSAAKRELTPPGPACQQLFFRTPKFFF GARREGFSPAQKRELIPTPSRCQPLSSAFFKKPLPCCHSDDTTTAGCFQHAAEAFAYPAP DPALCQPFFEIFSKTFSSQYPTFLFYCTMRICVSRDDTHEKSPERGLPGPSHQFA Prediction of potential genes in microbial genomes Time: Fri May 13 05:09:06 2011 Seq name: gi|316921284|gb|ADCP01000163.1| Bilophila wadsworthia 3_1_6 cont1.163, whole genome shotgun sequence Length of sequence - 39755 bp Number of predicted genes - 41, with homology - 37 Number of transcription units - 19, operones - 11 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 7 - 38 1.5 1 1 Tu 1 . - CDS 119 - 1258 1822 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component - Prom 1355 - 1414 4.4 2 2 Op 1 . - CDS 1514 - 2317 770 ## DVU3125 putative lipoprotein - Prom 2352 - 2411 1.9 3 2 Op 2 . - CDS 2413 - 3363 995 ## Sfum_1523 bile acid:sodium symporter - Term 3387 - 3420 5.2 4 2 Op 3 . - CDS 3445 - 3897 482 ## COG0735 Fe2+/Zn2+ uptake regulation proteins - Prom 3982 - 4041 6.1 + Prom 3939 - 3998 3.6 5 3 Tu 1 . + CDS 4088 - 4390 140 ## Ddes_0521 hypothetical protein + Term 4595 - 4633 10.2 + TRNA 4508 - 4584 83.2 # Met CAT 0 0 - Term 4950 - 5005 3.4 6 4 Tu 1 . - CDS 5018 - 6331 1610 ## COG0477 Permeases of the major facilitator superfamily - Prom 6417 - 6476 5.5 + Prom 6734 - 6793 5.5 7 5 Op 1 23/0.000 + CDS 6939 - 8009 1172 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 8 5 Op 2 22/0.000 + CDS 8015 - 8761 993 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 9 5 Op 3 . + CDS 8767 - 9315 624 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 10 5 Op 4 . 
+ CDS 9319 - 9534 361 ## HRM2_10680 2-oxoisovalerate ferredoxin oxidoreductase, delta subunit (EC:1.2.7.7) + Term 9546 - 9579 3.1 11 6 Op 1 . + CDS 9630 - 11780 2503 ## COG1042 Acyl-CoA synthetase (NDP forming) 12 6 Op 2 3/0.000 + CDS 11862 - 13184 1648 ## COG0665 Glycine/D-amino acid oxidases (deaminating) + Term 13290 - 13333 0.5 13 6 Op 3 . + CDS 13403 - 13780 508 ## COG0251 Putative translation initiation inhibitor, yjgF family + Term 13786 - 13811 -0.5 14 7 Op 1 . + CDS 14022 - 15578 1291 ## COG1292 Choline-glycine betaine transporter 15 7 Op 2 . + CDS 15618 - 16439 489 ## COG1414 Transcriptional regulator 16 7 Op 3 . + CDS 16461 - 17027 186 ## COG0163 3-polyprenyl-4-hydroxybenzoate decarboxylase + Term 17204 - 17249 2.2 - Term 17188 - 17240 14.7 17 8 Op 1 . - CDS 17263 - 17592 380 ## Ddes_0118 glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.2) 18 8 Op 2 . - CDS 17608 - 17739 240 ## Amico_1396 glycine reductase 19 8 Op 3 . - CDS 17808 - 18125 506 ## COG0526 Thiol-disulfide isomerase and thioredoxins 20 8 Op 4 . - CDS 18197 - 19363 1379 ## Ddes_0116 glycine reductase (EC:1.21.4.2) 21 8 Op 5 . - CDS 19372 - 20907 1962 ## Ddes_0115 betaine reductase (EC:1.21.4.4) 22 8 Op 6 . - CDS 20933 - 21847 554 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 23 9 Tu 1 . + CDS 21907 - 22623 181 ## + Term 22754 - 22791 -0.9 24 10 Op 1 . - CDS 22324 - 23370 1452 ## Ddes_0113 selenoprotein B, glycine/betaine/sarcosine/D-proline reductase family (EC:1.21.4.3) 25 10 Op 2 . - CDS 23409 - 24695 1881 ## Ddes_0112 sarcosine reductase (EC:1.21.4.3) 26 10 Op 3 . - CDS 24697 - 24996 156 ## 27 10 Op 4 . - CDS 25090 - 25176 76 ## - Term 25469 - 25493 -1.0 28 11 Tu 1 . - CDS 25659 - 25922 101 ## - Prom 25980 - 26039 1.8 + Prom 25879 - 25938 3.5 29 12 Op 1 . + CDS 25974 - 26705 741 ## CJE0463 hypothetical protein 30 12 Op 2 . 
+ CDS 26720 - 28429 2068 ## COG2303 Choline dehydrogenase and related flavoproteins + Term 28476 - 28517 10.4 - Term 28469 - 28500 2.4 31 13 Tu 1 . - CDS 28509 - 28772 384 ## Dde_2645 hypothetical protein - Prom 28991 - 29050 80.3 + TRNA 28973 - 29048 81.3 # Val GAC 0 0 + Prom 28973 - 29032 78.9 32 14 Op 1 16/0.000 + CDS 29149 - 31089 2048 ## COG0441 Threonyl-tRNA synthetase 33 14 Op 2 . + CDS 31124 - 31660 431 ## PROTEIN SUPPORTED gi|163801060|ref|ZP_02194960.1| 50S ribosomal protein L35 + Prom 31667 - 31726 3.2 34 15 Op 1 . + CDS 31809 - 32006 278 ## PROTEIN SUPPORTED gi|218886084|ref|YP_002435405.1| 50S ribosomal protein L35 35 15 Op 2 . + CDS 32094 - 32447 511 ## PROTEIN SUPPORTED gi|94986658|ref|YP_594591.1| 50S ribosomal protein L20 + Term 32564 - 32604 8.1 + Prom 32780 - 32839 6.3 36 16 Tu 1 . + CDS 32979 - 34430 2047 ## Ddes_2093 hypothetical protein + Term 34448 - 34505 14.0 37 17 Op 1 1/0.000 + CDS 34794 - 35621 629 ## COG0778 Nitroreductase + Term 35638 - 35685 12.2 38 17 Op 2 . + CDS 35782 - 36699 176 ## PROTEIN SUPPORTED gi|90020671|ref|YP_526498.1| ribosomal protein S6 + Term 36773 - 36825 17.4 - Term 36762 - 36812 19.2 39 18 Op 1 . - CDS 36837 - 38210 439 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 40 18 Op 2 . - CDS 38254 - 38667 366 ## COG0607 Rhodanese-related sulfurtransferase - Prom 38727 - 38786 5.3 + Prom 38686 - 38745 6.4 41 19 Tu 1 . 
+ CDS 38797 - 39396 312 ## COG1309 Transcriptional regulator + Term 39480 - 39524 11.2 Predicted protein(s) >gi|316921284|gb|ADCP01000163.1| GENE 1 119 - 1258 1822 379 aa, chain - ## HITS:1 COG:TM1135 KEGG:ns NR:ns ## COG: TM1135 COG0683 # Protein_GI_number: 15643892 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Thermotoga maritima # 29 364 21 354 370 278 46.0 2e-74 MKLGTLLSMAAMAVMGMGLALPAQAADPIKIGVYLPLTGQNAFGGQLELEGIQLAHKLKP TVLDRPVELVVVDNKSDKVEAANAVKRLTEHEKVLAIIGTYGSSLAMAGGEVAERAKVPV IGTSCTNPLVTQGKKYYFRACFIDPYQGAAAATYAYKNLGYKKAAVLTDVANDYAVGLSN FFKKTFKKIGGELVADMKYSSGDQDFTAQLTELISKKPDIVFMPAYFAEGAIIMKQAREL GATFVLMGADAMDNPDTLKIGGKAVEGFLHTTFPYDVNMENMSAEAKTFTDAWKKNFPNK DPNVNSALGYNCYNIILDALTRAGKADTEALTAALAATKNLPTALGVLSINETHDAEMPV GIIKYMDGKRKYIGDAIAE >gi|316921284|gb|ADCP01000163.1| GENE 2 1514 - 2317 770 267 aa, chain - ## HITS:1 COG:no KEGG:DVU3125 NR:ns ## KEGG: DVU3125 # Name: not_defined # Def: putative lipoprotein # Organism: D.vulgaris # Pathway: not_defined # 129 266 70 204 206 91 39.0 2e-17 MKNIPLLSRAPRFSHMLLAVLASVVFGGCSPYQKAAPQPEPTVPLTGMVTYRERMALPPD AELTVTLHRKTSDGRMLVAEERASTEGRNVPLPFSIACPAADSSSMSYELEAAISSGGTT LFATTRPLTVHPGDADIMVLTHRVMDTAPVTDGLAGTRWKLVELNGKPAEVYDNQPEPHL LFNPEGAQGQISGSDGCNSLIGSYTLQGDRIGFSQLGSTMMLCPKGDAQARALAQALAGA TSVSHSGDTLDLWSGKTRVARFKATAL >gi|316921284|gb|ADCP01000163.1| GENE 3 2413 - 3363 995 316 aa, chain - ## HITS:1 COG:no KEGG:Sfum_1523 NR:ns ## KEGG: Sfum_1523 # Name: not_defined # Def: bile acid:sodium symporter # Organism: S.fumaroxidans # Pathway: not_defined # 4 315 22 322 322 134 32.0 6e-30 MRSRDLVMIVVSFLAMLAGSFLPGLAEPLAPFPRLCLIVLLYLGFLSVGTEALFTHTRLI PGTVSGLVLIRLFALPLLSFFLFKLLMPQFALGALLVGAASIGVVAPIFSIMVNADTALV LAGNLLSSLLLPLTLPMLLYIVDSFMTLSGFGPMNLPAHLSLSGMTLSLCVTIIVPFAGA FLTRKAPRVTEYILKHQFPVSVSTIALSTLAIFSNYSGVLHQSPSLVVKALGAACLLGAV MMVGGLFLPRSMPPQRKLAFLISYGTMNNVLMLIVSLEFFSASESIMAAMYLLPLNALLV YYRALSRSWGLEQAAG 
>gi|316921284|gb|ADCP01000163.1| GENE 4 3445 - 3897 482 150 aa, chain - ## HITS:1 COG:Cj0400 KEGG:ns NR:ns ## COG: Cj0400 COG0735 # Protein_GI_number: 15791767 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Campylobacter jejuni # 15 150 15 152 157 119 40.0 2e-27 MFSEDRPVGNPAQIFRSFLKNKGLRNTPQRQQIMEVFFNESGHLTTEEIYDRVRKEDPSL GQATVYRTMKLLCEAGLAREVRFGDGLARYEHAADAHHDHLICENCGRNIEVVDSQIEEL QDALARKHGFKPTFHRLYLYGICPDCQRNR >gi|316921284|gb|ADCP01000163.1| GENE 5 4088 - 4390 140 100 aa, chain + ## HITS:1 COG:no KEGG:Ddes_0521 NR:ns ## KEGG: Ddes_0521 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 88 29 116 127 71 43.0 7e-12 MKKKKRRPIRLPVPEYSDKVLVSVPPQHVGMFRFLLEGYDNIASFTVLDRAEALLKVFFS PHQHTEARSALEGIAAVLPITVRPWPAAACCAEPSESGMA >gi|316921284|gb|ADCP01000163.1| GENE 6 5018 - 6331 1610 437 aa, chain - ## HITS:1 COG:YPO2459 KEGG:ns NR:ns ## COG: YPO2459 COG0477 # Protein_GI_number: 16122680 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 4 433 6 430 452 243 35.0 5e-64 MSGARQISFEDAPFSAIHKKITAGTFMGQISDGYTLGIVGISLNYAMEPLGLTSFWMGLI GAGSMFGIFFGSLLAGITADKIGRRPLYASLMLLTAIVSVLQFFLSDPLLIAIVRFVLGM LIGADYTVGIPLLSEWAPAKKRAGILGWLLVFWTIGYCISYIVGFFMDGFVAAFGNDGWR LVLCTSVVPSLIALVIRFGTPESPLWHIAKGRPQEALANIHAYLGNQYGLPDRGEEKPAS TSWFALFAPEQWRNTVVSGVFFFAQVLPFFAISIFLPLVLTKMNIQNPNASGVLYNVFTL IGVLVGLWIYGIATRRAYLLWTFYVSAAILTVMILWTSMSPLVALIMITAFALVLAASIV PEFSYPAELFPTELRGSGVGLTIAISRFGAGGGTFLLPIVSEQFGIHMALWCCVITLLFG GIICHMWAPETSGRGKK >gi|316921284|gb|ADCP01000163.1| GENE 7 6939 - 8009 1172 356 aa, chain + ## HITS:1 COG:TM1759 KEGG:ns NR:ns ## COG: TM1759 COG0674 # Protein_GI_number: 15644505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin 
oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Thermotoga maritima # 4 339 5 340 356 306 45.0 3e-83 MTQQTMFLKNTETFAESLARCGVKFHFAYPITPATDVMKRMAVILPQYGGKMLQMESELA VSSALAGAACTGMLCATSSSGPGMTLMQEAISFMCGAELPCLMLDSMRVGPGDGDILGAQ SDYYMSTRGGGHGDYHVPVLAPSDGQEIVDLVPEGIRLAYTYRTPVLFVFDGVTAQTTET AVLPEYHDYSAEYDTSSWAYTGTKDHPKRALITGSYSHADGYVMNEHLRAKYETIRAKEQ QWKESNTEDAELIVVAFGIHGRMCKDLVANLRAQGKKVGSIRPVTLWPFPDKAFENLPST LKKILVVEMNHGQMVDDVRLAVNGRVPVHFFGKTGGDMPMYTLAEMTAEVNRLLEE >gi|316921284|gb|ADCP01000163.1| GENE 8 8015 - 8761 993 248 aa, chain + ## HITS:1 COG:MA2909_1 KEGG:ns NR:ns ## COG: MA2909_1 COG1013 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Methanosarcina acetivorans str.C2A # 18 248 35 265 296 212 41.0 6e-55 MQNLSAKYGKTLNLDKLTSYCPGCGHGIVTRLVAEAIETLGIRERAIAIVGIGCGGFSHH YMDVDAIEATHGRSPSFAVGYKLCRPDNIVFTYCGDGDSCAIGLGDLMHAANKGMPITSI MVNNSVFGMTGGQMSPTTLDGQVTATTLKGRDVTWQGYPLLVPEMMREMPGVKYLARESV ATPKHIMHAKKSIQKAFECQVKGLGYSFVEVIVPCPTGLKKSVQDSYKWCSEEMLGYFKP QVFKSEME >gi|316921284|gb|ADCP01000163.1| GENE 9 8767 - 9315 624 182 aa, chain + ## HITS:1 COG:PAB0345 KEGG:ns NR:ns ## COG: PAB0345 COG1014 # Protein_GI_number: 14520722 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Pyrococcus abyssi # 7 176 5 166 170 84 41.0 2e-16 MLVKCMFSGSGGQGSALMAKLVCEGAMRENLKVVMTQTYGIEQRGGDSTAFIIISDEAIG SPIVENDATIAVALSQSIYEQCLHGVAPGGMLFTNASLVENPGRAEGFTQILLPVSEIAV EVGSVRSVNMVMLGAVIAKTKLLKRETIMAVLEETMGRKKPELLEFNIKAFNAGYEAAGK GE >gi|316921284|gb|ADCP01000163.1| GENE 10 9319 - 9534 361 71 aa, chain + ## HITS:1 COG:no KEGG:HRM2_10680 NR:ns ## KEGG: HRM2_10680 # Name: not_defined # Def: 2-oxoisovalerate ferredoxin oxidoreductase, delta subunit (EC:1.2.7.7) # Organism: D.autotrophicum # 
Pathway: Citrate cycle (TCA cycle) [PATH:dat00020]; Metabolic pathways [PATH:dat01100] # 7 70 5 68 78 65 48.0 8e-10 MSKTKTHVIDARRCKSCGLCVDACPKGVLAIGTEINGQGYNYIERAHPEKCVLCNICGVV CPDVAVGVVEE >gi|316921284|gb|ADCP01000163.1| GENE 11 9630 - 11780 2503 716 aa, chain + ## HITS:1 COG:Ta1153 KEGG:ns NR:ns ## COG: Ta1153 COG1042 # Protein_GI_number: 16082167 # Func_class: C Energy production and conversion # Function: Acyl-CoA synthetase (NDP forming) # Organism: Thermoplasma acidophilum # 3 713 7 698 698 433 37.0 1e-121 MSDLIPLFRPKSVALIGASSDTKKYGYWTAKSLIDNKFQGEIHLISRSGGEILGQPTYPD ILSVPGEVDLAIIAIAPKHILPIIEQCVEKGVKTGIVVSTGFGETGPEGKEIERRMLEIA RKGNMRIMGPNCMGMYSSAVSLNASIIDLAPGPMSLVLQSGNFGIDLNFNAKKRGMGYSC WATIGNQMDVRFNDFVRYIEQDDHTKVMLLYMEGLRVENEEDGRKFLEAARRTSVSKPIA AIKIGRSAAGARAAASHTGSLAGSERIFDAALAQAGVVRVNTPNELLDAAEAFSKCKPAH GKRIAVLTDGGGHGVMATDFCEANSLEAPVLSEATQEKLRAILKPHCPIKNPVDLAGTPE GDMWVFDLCLEVLLDDPDVDGIVIVGLYGGYADLSEEFRVLEMDVAKSMAERIRKGDKPV VMHSIYQPQQPDCLKYLSEQGVPVFGAVDEAVRTMGVLVEYSERRAALLEEAGSVPPELP AGRREKAEAIFARVRSEGRVNLVETEAREVLRAYGVDLAPHYLAATADEAAEMWEKIGGK AVMKIVSPDILHKTDAGGVALNIESAGAAREAFERLVANGRNYKADANIFGVMLTPMLKG GVECIIGSSWDSTFGPTVMFGLGGIFVEILKDVSFRVAPVNMPSCRRMVREINGLAMLQG ARGSKPCDLEALAEAACLISHMVDELRDIAEVDLNPVFAWEKGLAVADARIVLQAR >gi|316921284|gb|ADCP01000163.1| GENE 12 11862 - 13184 1648 440 aa, chain + ## HITS:1 COG:PA4186 KEGG:ns NR:ns ## COG: PA4186 COG0665 # Protein_GI_number: 15599381 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Pseudomonas aeruginosa # 38 439 37 438 439 252 33.0 1e-66 MRDVKRFPTTTGLSWLEMSSFKDHLFKGHEKIGKEYDYVIVGGGYGGYGCASRLAELQPE ARIAVFEAIKIGNGDSGKNAGFIIDVPHNFGDQGNSTFEDNEMYYKLNTFIIGRMRKTIE DSGIKVDWDPCGKYLCCSETKSFKLIETESEELDQMKVHYEIYEGEELAKRIGTRYYKKA LYTPGTLLVNPADVLRGLYTTLPENVDVFEECPVMRIEEGTRARVTLLNGKEINAKTVIV TGGPFIEEFGIVKRVFCPVLSYGAFTRQLNDKEMKYFEGVKPWGCTAGHPAGTTVRFTTD NRLYVRNGFSYATHLTTSHQRIRRAVPKLRRAFENRFPEIKHVNFEFIYGGMINMTMNFR 
PLMTQKHPSVYASASGEGAGVAKTCLMGHYIAEWINGIDSEELRFLRRIATPSHIPPEPL TTIGATARLMWEEFNAKSEI >gi|316921284|gb|ADCP01000163.1| GENE 13 13403 - 13780 508 125 aa, chain + ## HITS:1 COG:PH0854 KEGG:ns NR:ns ## COG: PH0854 COG0251 # Protein_GI_number: 14590714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pyrococcus horikoshii # 2 125 14 137 137 129 47.0 2e-30 MEFIHAPKACASVGPYSHCVKAGSTYYLCGQVPFHPVTGEVVGNGMKEQAVQTLANMKAV LEAAGLAVTDIAKTTVFVTNMDDFAAFNEVYAAFMGEHRPARVCVEVSNIAHGCLLEMDA ICFKE >gi|316921284|gb|ADCP01000163.1| GENE 14 14022 - 15578 1291 518 aa, chain + ## HITS:1 COG:lin2197 KEGG:ns NR:ns ## COG: lin2197 COG1292 # Protein_GI_number: 16801262 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Listeria innocua # 12 508 2 493 507 201 28.0 2e-51 MQDELSARTEGKRLDTIPFAIGFLSVFAFIVSAIAFPEPFIRFMDLLQDKIVKDLGWASV FFSFLVMLFTFGIVCSPIGSIRIGGRDAKPDFSFFRWFAISLCSGIGIGILFWGIGEPIY HLMQPPQNLGIAPKSHEAALFAISQSAMHWSIAQYCIYAICGVAFALMAFNEHLPLSIIS GLDLVMPRKHYSLAKNIVHAACLFSICCTVISSIGALIMMISSCVSYLTGLERSFAMNAI VALVATAFFVGSSTTGLKRGMNFLAAQNTRMFFIILFYIFLFGPTLFILNMGTEAFGYML TNFIRNSTLVSTEFMSDRWADSWIIVYMGAFFAYGPPIGLYLARLGKGRTVRQFLLMNVF APSMFVYLWINTFGSLAIYYQWKNLVDVWSFVQTQGLESTVIGILQRFPFSMALIVFFVI VTMISFVTLVDPMTSVLATISTKGISAEEEAPKFLKVLWGGNMGGVALAVITLCGISALR GMFVFGGVLMMLLTIILCWCIVKEGQNILARNKREDTP >gi|316921284|gb|ADCP01000163.1| GENE 15 15618 - 16439 489 273 aa, chain + ## HITS:1 COG:YPO1714 KEGG:ns NR:ns ## COG: YPO1714 COG1414 # Protein_GI_number: 16121974 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Yersinia pestis # 1 253 1 257 263 126 30.0 4e-29 MALADKYYTIGVLGKSFGVIELMAHQAKWELRDLAKASGLPKGTLQRILLTLCELGFVSQ DGKGGAYSLTLKFFKLGQRIASNNSLVEKARPACRQLMEKVNETVNLCVAQNFDMVVMAQ QVSWQILRLDSIIGSSFPIYPSASGKVHCAFLEESELLKFLNELREARPELTTDDINRFC TELAAIRREGVGFDCEEIFTGVRCVAAPIFDYTGNLVATIGCSVPTVRITEESSALLIRE 
VIQTAAAISMSIGAPRREFMPTTRTMLHCPVRG >gi|316921284|gb|ADCP01000163.1| GENE 16 16461 - 17027 186 188 aa, chain + ## HITS:1 COG:PAE1481 KEGG:ns NR:ns ## COG: PAE1481 COG0163 # Protein_GI_number: 18312661 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase # Organism: Pyrobaculum aerophilum # 1 185 3 186 189 159 47.0 3e-39 MHRLVLGISGASGMPLAQAVLENLATVSDLEVHLIISAGAERVLQAECGIPSAALARYAH AVHESSDMAAGPSSGSWHHDGMIICPCSMSSLASIASGAGSNLIHRAADVTLKERRPLVI VARETPLNLIHLKNMQAVTEAGAVVMPFTPAFYTRDTSLEAMMRHFTGRLLDQFRIEHHL CKRWRDDR >gi|316921284|gb|ADCP01000163.1| GENE 17 17263 - 17592 380 109 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0118 NR:ns ## KEGG: Ddes_0118 # Name: not_defined # Def: glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.2) # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 109 50 158 158 182 87.0 5e-45 MDLENQQRVKDAAEKYGAENVVVILGSSDPEGAEIYAETVTLGDPTFAGPLAGVPLGLCV CHVLEPEIKAEADAAKWDEQIGMMEMVLNVDALADAVRKMREANSKYPL >gi|316921284|gb|ADCP01000163.1| GENE 18 17608 - 17739 240 43 aa, chain - ## HITS:1 COG:no KEGG:Amico_1396 NR:ns ## KEGG: Amico_1396 # Name: not_defined # Def: glycine reductase # Organism: A.colombiense # Pathway: not_defined # 1 43 1 43 43 65 79.0 5e-10 MGKLSGKKLLLLGERDGVPGPAMADVFANSGAEVLFSATECFV >gi|316921284|gb|ADCP01000163.1| GENE 19 17808 - 18125 506 105 aa, chain - ## HITS:1 COG:alr0052 KEGG:ns NR:ns ## COG: alr0052 COG0526 # Protein_GI_number: 17227548 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Nostoc sp. 
PCC 7120 # 4 89 7 92 107 77 39.0 5e-15 MIIVDKETFEAEVQQSSMPCVVDLWGPQCGPCLALMPEVEKLAEAYEGKLKFCKLNVAEN RRLVISLRVMAVPTILFYKGGECVARISGDAVSIEAIKAEADKLV >gi|316921284|gb|ADCP01000163.1| GENE 20 18197 - 19363 1379 388 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0116 NR:ns ## KEGG: Ddes_0116 # Name: not_defined # Def: glycine reductase (EC:1.21.4.2) # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 388 1 388 388 536 78.0 1e-151 MAMQECKRAILGKALEDLVARARSGKEPCRIGLMASGGEHSDAEFLVAASAAMSADPALT VVGVGPKPSGILPQGMDWIETGCEGPELASGMENALSQGRIHGAVALHYPFPLGVTTVGR VLTPGTGKPLFMASCTGMSAAHRQEAMLRNAILGVAVAKALGITCPSVGVLNLDAAPQVL RALNRMAEKGYPLNLGQSVRGDGGSLLRGNDLLCGAVDVCVADTLTGNVLMKVFSAFTSG GSYETSGWGYGPSVGEGWDKVVSIVSRASGAPVIANALAYTAAAVRGRLPAVVAEEIRLA KAAGMDDELAAFSKAAAAPQDEVQAPPAVPTDEEIHGIDVLDLELAVRCLWKENIYAEAA MGCTGPVVKLAGSSLDKARAVLTASGYL >gi|316921284|gb|ADCP01000163.1| GENE 21 19372 - 20907 1962 511 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0115 NR:ns ## KEGG: Ddes_0115 # Name: not_defined # Def: betaine reductase (EC:1.21.4.4) # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 511 1 510 511 818 80.0 0 MATAGIKAAAYCLNFAPELALRYGGTPAQERKSKPDSEFLRALPEHAQTYEEAQAYAPNK TYIGAMSIDELEKAPAPWIDNLGTPERFGTFGEIMPEDEFLGLMDICDVFDLIWLEEGFA ASVAEKLAHNPVIGEKQLARLEKGRPESDILDVVEKQHALPLYSEGRLAGCCRRAHDTDE NLEAGIMLENLSCKASGVLALLHLIKNAGLSPSDIDFVVECSEEAVGDAMQRGGGNMAKA LAEIAECGNASGFDVRGFCAGPVASVITAASMAACGARANVAVVSGGSVPKLYMNAREHV KKDVKALENCIGSFALLITPDDGQTPVIRLDSLGKHTVGAGAAPQAITSALTFEPLQKAG LKMTDVDKYAPELHNAEITLPAGAGNVPEANYKMIAALSVMKGQIERADIPKFVAERGMP GFVPTQGHIPSGVPYIGHALEALKAGTIKRAMIIGKGSLFLGRLTNLADGASFIMEGPGA GTEPAQGVSQSDVTEMLLVALSDVAANLQKG >gi|316921284|gb|ADCP01000163.1| GENE 22 20933 - 21847 554 304 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 5 302 3 303 306 218 42 6e-56 MEKRELVIIGAGPAGLSAAIYGKRAGLDTLVLEKGRPGGQVLITSKIENYPGILDGTGSG LADAFRTHAEHFKAEFRTASVQKLEVRDGEKIITLKDGSEIAAGAVIVASGASFRKQGCP 
GESTYTGMGVSYCAVCDAAFFEELEVAVIGGGNTAVEEACYLTGFASKVYLVHRRDQFRA DKMVVDHALSNPKIVPVMDSVLESIEGSDIVEKIVVRNVKTNEKREIPLSGVFIFIGTLP NDEYVHDLLQKDEGGWIVTDASLQTSVPGIFAAGDVRDTSLRQVVTAAGDGARAAMSAYA YLQR >gi|316921284|gb|ADCP01000163.1| GENE 23 21907 - 22623 181 238 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPKRCPGVRALSVKRQYRTQGMKANPLRAEAAGQAHTPCPSFQVPGTPPGHMGLKHRLPV HVGRQGIQGLIHELAAQLLFFFGPQFRIAQRVRDGHCGHDAVGPDAQRNGNDGTHMHDRD ISRIFDALGERCTATRACASGGGEDNPVHMRGLELRAYFRAELAGVGYGRAVADGTVEHM VELADFAFLFQIAQHVDGQDAVRVRIGIGLIVAAVGGHVFALAEVGDAVKVVCAVVLG >gi|316921284|gb|ADCP01000163.1| GENE 24 22324 - 23370 1452 348 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0113 NR:ns ## KEGG: Ddes_0113 # Name: not_defined # Def: selenoprotein B, glycine/betaine/sarcosine/D-proline reductase family (EC:1.21.4.3) # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 348 1 348 434 600 87.0 1e-170 MAYKLVHYINQFFAGIGGEDKADVSPEVRDGIVGPGMAFKAAFNGEAEIVATFICGDNYC ANHLDDVAAHMVETVKAYGADGLIAGPAFNAGRYGTACGAVCAAVAKELQLPVVSGMYRE SPGVDLYRKEVTIVETADSARGMGKAVPAMAAVMLKLLKGEAIADPEAAGVFPKGIRKNM FYEQPGAERAVQMLIKKLNGEAFRTEYPMPVFDRVEPRPAIADISKATIAIVTSGGIVPF GNPDHIAASSAQNYGAYDLDGVTDLRKGEYMTAHGGYDQTYANADPDRVLPIDVLRDLEK EGKIGKLYHMFYSTVGNGTSVANARKFGTEIGSQLKAAHVDGVILTST >gi|316921284|gb|ADCP01000163.1| GENE 25 23409 - 24695 1881 428 aa, chain - ## HITS:1 COG:no KEGG:Ddes_0112 NR:ns ## KEGG: Ddes_0112 # Name: not_defined # Def: sarcosine reductase (EC:1.21.4.3) # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 428 1 428 428 781 87.0 0 MKLELHRISVGKLAFGAQTGVSGGVLTINKDELASLLLQDDRLSAVAVDVAHPGENVRIM PVKDAIEPRCKLEGPGEVFPGWIGDVENAGEGKTLVLGGMAVLTTGRVVAPQEGIVDMSG PGADYTPFSRTCNVVLSFDTAEDLEPHQREACYRIAGLKAAHYLASACKDAAADSVEAYD FPPLAEAMHQHPGLPKVAYMYMLQSQGLLHDTWVYGVDAKRILPTMISPTELMDGAIISG NCVSACDKNNTYVHLNNPVVKSLYEHHGKDLNFVGVIITNENVTLADKKRSSSYAVKLAR MLGVDAVVISEEGFGNPDADLIMNCRKSEQAGIKTVLITDEYAGRDGSSQSLADSCPEGD ACVTAGNANEIIVLPPMDKVIGDQAPAETIAGGFFGSVREDGSLEVELQAILGATNELGF NRIGGRTL >gi|316921284|gb|ADCP01000163.1| GENE 26 24697 - 
24996 156 99 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPYRSLALTPRVRSGEHPVGIDPTSLLLIERAQARYYDVDPARMPASLSGREMQDCAFVD VELLRATIESLGGSIESISNPNGSGMAYPPRLLPDHKEM >gi|316921284|gb|ADCP01000163.1| GENE 27 25090 - 25176 76 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLSPILLTNNPRFQDLSLTGMDITFIPE >gi|316921284|gb|ADCP01000163.1| GENE 28 25659 - 25922 101 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIFRCCMNPYATGILHDNPLRIDALYGCPHVLPKEHTFNFITPRYLRIEKKNLMEPLYLR GHYLSSDEERKNIKKNGGLSLWGLRPL >gi|316921284|gb|ADCP01000163.1| GENE 29 25974 - 26705 741 243 aa, chain + ## HITS:1 COG:no KEGG:CJE0463 NR:ns ## KEGG: CJE0463 # Name: not_defined # Def: hypothetical protein # Organism: C.jejuni_RM1221 # Pathway: Pentose phosphate pathway [PATH:cjr00030]; Metabolic pathways [PATH:cjr01100] # 7 243 6 241 242 230 51.0 4e-59 MPDEKDISRRRFLQYTGLVTGAAAALGLSTVSMAAEQGGNAHSGHAAATANPLNRARMFF TNELDFSTLSEAAERIFPKDEFGPGAKELAVPYFIDNQLAGAYGYNAREYIAGPHFKGAP TQGYQTPLVRRDLFKQGILALNSAAQERFKKDFPQLSGAEQDQILIDCEAGKLPTQGFTS DYFFSLLKNAVLAGAYADPLYNGNNNMDGWRMKEYPGAQMAYTYLMTSETFEKVPPVSLS SME >gi|316921284|gb|ADCP01000163.1| GENE 30 26720 - 28429 2068 569 aa, chain + ## HITS:1 COG:Cj0415 KEGG:ns NR:ns ## COG: Cj0415 COG2303 # Protein_GI_number: 15791782 # Func_class: E Amino acid transport and metabolism # Function: Choline dehydrogenase and related flavoproteins # Organism: Campylobacter jejuni # 1 568 1 572 573 714 61.0 0 MAKELKKVNVVTVGVGFTGGIALAECAKAGLSVVGLERGDRRGVEDFQDIHDEWRYAVNY GLMQDLSKETITFRNTEEMRALPMRKLGSFLLGDGLGGAGVHWNGMNFRFSPYDFQIKTM TDERYGKNKLGKEYILQDYPLTYDEMEPYYTAFEMALGVAGEPGYFGGKRSKPYAMPPLV KTPVLSKFETAAKQTGCHPYMIPAAIASEPYTTPDGVTHNPCMYCGFCERFGCEYDAKAE PNNTFIPVAEKTGNCEIRCNANVVEILKKGNKVTGVRYIDTLTLEEFIQPADVVVLSSYV MNNAKLLMVSKIGKQYDPKTGQGTLGRGYCYQITPGATLFFEEPMNLFAGAGALGMCYDD FNADNFDHSDLKFIHGGVISLTQTGKRPIESNATPPGTRAWGSEFKKAAAYNFNRSLGIG AQGASIAYKANYLSLDKTYKDAYGLPLLRMTYNFTDQDRALFDYITKKIEEVAKAMNPKA MVSKPSPKDYNIVPYQTTHNTGGTITGKSPEDSVVNSYLQHWDAENLFVVGAGNFPHNGG CNPTGTVGALGYRCAEGILKYSKKGGSLV 
>gi|316921284|gb|ADCP01000163.1| GENE 31 28509 - 28772 384 87 aa, chain - ## HITS:1 COG:no KEGG:Dde_2645 NR:ns ## KEGG: Dde_2645 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 7 82 4 80 90 78 64.0 7e-14 MSAHYPTDKGVAARVEELLREQLLELGEDPASLAPHLIMQNMQCEVYPDESMVYIWKDIP ILRVAPERTDTGVMWRMFTRDEGEPLQ >gi|316921284|gb|ADCP01000163.1| GENE 32 29149 - 31089 2048 646 aa, chain + ## HITS:1 COG:CAC2362 KEGG:ns NR:ns ## COG: CAC2362 COG0441 # Protein_GI_number: 15895629 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 6 634 8 632 637 670 49.0 0 MQVSVEGKMVEVSDGASCADALKEGLSGKRFKAAIACKAGDTLLDLTAPVPTTTETLTLT PVTADTPEGLGLIRHSAAHVMAAAVKKLFPAAKVTIGPSIENGFYYDFDVETPFSPDQFE AIEAEMQRIIDAAVPFERMDISKADAVKLFNDMGEPYKVELIEGLEDGTITLYRNGDFVD LCRGPHIPNAGFVKAFKLLSVAGAYWRGDEKNRMLSRIYATAFADAKDLKEYLNRIEEAK RRDHRKLGKELDLFAFHEDVAAGMVFWLPKGMLLRTILEDFLRREHIKRGYQLVQGPQVL RRELWEQSGHYANYRENMYFTEIEGDMYGIKPMNCVSHMLIYNAHLRSYRELPQRYFELG VVHRHEKSGVLHGLLRVRQFTQDDAHIICAPEQLEDEIIGVIALVRDLMALFGFQYRVVI STKPAKAIGSDEAWELATNALIKAVERANLSYTINAGDGAFYGPKIDVKVTDAIGREWQL STIQCDFTLPERFELEYVGQDGERHRPVMVHRAILGSLERFIGVLTEHYAGAFPTWLAPV QAKILTVTDAQNAFAEEACEALRKQGIRAEVDTRNEKLGFKVREAQLAKIPYILVVGDKE VEARSVNVRLRTGENLGVKSLDEVAALVQEDCAEPFKRGGMSYRFS >gi|316921284|gb|ADCP01000163.1| GENE 33 31124 - 31660 431 178 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163801060|ref|ZP_02194960.1| 50S ribosomal protein L35 [Vibrio campbellii AND4] # 4 166 2 166 166 170 52 1e-41 MRRNEQIRARELRVIGPEGEQLGILGRNEAIAMAKEHGLDLVEVAATADPPVCRVMDYGK FKYETQKKKQEAKKRQTVVQIKEIKVRPKTDDHDFETKVRHIKRFLEDGDRVKVTVFFRG REIVHKDRGLSILERVIADTKDVGKVEQEPRAEGRTLQMLLTPVAKKSGAQDAPAQED >gi|316921284|gb|ADCP01000163.1| GENE 34 31809 - 32006 278 65 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|218886084|ref|YP_002435405.1| 50S ribosomal protein L35 [Desulfovibrio vulgaris str. 
'Miyazaki F'] # 1 65 1 65 65 111 83 6e-24 MPKMKTRRCAAKRFSTTGTGKFKRRRQNLRHILTKKDAKRKMRLGQGALVDATNVKAVRR MMPYA >gi|316921284|gb|ADCP01000163.1| GENE 35 32094 - 32447 511 117 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|94986658|ref|YP_594591.1| 50S ribosomal protein L20 [Lawsonia intracellularis PHE/MN1-00] # 1 116 1 116 117 201 83 6e-51 MRVKRGLAAHRRHKKYLDMAKGFRGGRSRLYRTAREAVERSLAYAFVGRKQRKREFRKLW ILRINAGARENGLSYSKLMFGLAQAGVALNRKVLADLAVRQKEDFAKLVELAKTKLA >gi|316921284|gb|ADCP01000163.1| GENE 36 32979 - 34430 2047 483 aa, chain + ## HITS:1 COG:no KEGG:Ddes_2093 NR:ns ## KEGG: Ddes_2093 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans_ATCC27774 # Pathway: not_defined # 1 480 1 480 483 858 87.0 0 MGHPTIYPTGVTVYDPERSWNGFTIFQAQEVGAVLMDMNGREVNVWKGVHGMPNKIFPGG YLLTSRGRRSGKYSVQDGLDVIQVDWDGNIVWKFDRNEFIDDPGIPGRWMARYHHDFQRE GSTTGYYAPGMEPKTDSGNTLVLAHRNARNPKISDKQLLDDVILEVDWDGDIVWEWNCNE HFDEMGFREGPKNTLARNPNYRPTQPEGMGDWMHINSMSVLGPNKWYDAGDERFHPDNII VDGREANIIFIISKATGKITWKLGPDYDNSPEAKAIGWIIGQHHAHMIPHTLPGGGNILV FDNGGWGGYDVPNPGAPTGVKAALRDYSRVLEIDPVAMKIVWQYTPTEAGFLAPMDSNRF YSPFISGMQRLPNGNTLITEGSDGRVFEVTPDHKIVWEFVSPYKGKFVPMNMTYRAYRVP YEWVPQVAKPVETAIEPLDVSTFRVPGAAAFGDRAKEVSVEGCVPYEGSNALCVASVEDP EDK >gi|316921284|gb|ADCP01000163.1| GENE 37 34794 - 35621 629 275 aa, chain + ## HITS:1 COG:CAC3483_2 KEGG:ns NR:ns ## COG: CAC3483_2 COG0778 # Protein_GI_number: 15896720 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 83 263 1 173 185 82 31.0 6e-16 MIDFKVDESLCVSCGACVKDCLHQALRMDTYPVMVDEGHCIRCQHCLAVCPTGAVSIMGA AASDCTPLAGNIPEPRQLDTLFKGRRSVRHYKRENVSPGLLQELLDSAAYAPTGSNAQNL LVSVVDDIAAMDALREAVYLRLDELAETGAMPDCQRRAFFLSAGKLWKAGGWDGIFRSAP HCVIVANAKNATCVEQDPLIYLSYFELMAQARGIGTLWCGLLYWCLRDVLPDFLPRLGIP DTHQLGYAMLFGYPSINYRRTVETRSALVRHIGWN >gi|316921284|gb|ADCP01000163.1| GENE 38 35782 - 36699 176 305 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020671|ref|YP_526498.1| ribosomal protein S6 [Saccharophagus degradans 2-40] 
# 7 283 5 276 293 72 26 4e-12 MRGEEMKTGAVSSQWYAFATVVLWSSAFVFTKVALLSFSSTALGFLRCAIASVVLYAVLR CKGVRMLSWRELPRFTLSGALGFSIYLFIFNKGSETLTAATGCILIATAPIITALMASVV FRERLTRLAWIALWLAFLGVLVLMLWNGSVSINAGVFWMLGAALAISAYNVIQRLYANGY TSLQITAYSFFTATVMLAGFLPESIRQIESAPLSHVIAVIFLGVFPSALAYLLWAKALSF AATTSDVTKFMFLTPLLSFVLGYVVISELPGVETWIGGALILSGLALFHISGQRAERARR AAMRH >gi|316921284|gb|ADCP01000163.1| GENE 39 36837 - 38210 439 457 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 4 454 3 449 458 173 26 1e-42 MQQRYNAIIIGFGKGGKTLAAYLADQGQSVAIVERSDKMYGGTCINIACIPTKTLVYEAH KSLCRGERSFEEKAADYRRAIARKNEVTGFLRGKNYAMLADRDNVDVITGTASFVSPHEV EVKTADKTLLLSGERIFINTGGETVIPPIEGVRENPRVYTSTTMLELEDLPRRLIILGGG YIALEFASFYAEFGSRVTILERGSRFLAREDADVADSVRKALENKGVTIITGASASAVRD AGDEAEVRFILDGTEQALPADAILLATGRRPLTAGLNLEAAGVKTTEQGAIAVDERLQTS VPHIWALGDVKGGPQFTYISLDDFRIVRDALYGEGKRVASDRDPAYTVFMDPPLGRVGLT EQAARDKNLDIKVAVLPAAAIPRARLMGETTGMLKAVVDAKTGTILGCALHCADAGEMIN VVETAIRAGKGYTFLRDMIYTHPSMTEALNDLLSKIK >gi|316921284|gb|ADCP01000163.1| GENE 40 38254 - 38667 366 137 aa, chain - ## HITS:1 COG:slr1184 KEGG:ns NR:ns ## COG: slr1184 COG0607 # Protein_GI_number: 16332292 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Synechocystis # 51 105 78 132 164 57 43.0 5e-09 MHSLRSKWFAVLLAYIFMFAAYGHSPAAELPQTGALAPAEALQLMDTLGDKLTVIDVRTE QEYAQGHVPGALLLPIQTLREHMGQVPADGSVLLLCRTGRRAETAYDMIREAYPQKKNLW FLKGIPVYGNDGSFTFK >gi|316921284|gb|ADCP01000163.1| GENE 41 38797 - 39396 312 199 aa, chain + ## HITS:1 COG:ycfQ KEGG:ns NR:ns ## COG: ycfQ COG1309 # Protein_GI_number: 16129074 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 2 120 40 160 236 67 31.0 2e-11 MGRPPSYDHTMLLREIIDCLWERGYAATPISVIVQETGVNAASLYARFGSKKGIMLAALD LYAKETIDALEALLAATQPGAGQVRAILEHAMGSFEDPRARGCFLVNTVTNISTDTPDFA MAVADYMGQIRGRIKDALKRAPGLRPEVTPEDAALYVQVQVWGLKVMARMRPDKAAGKVV VRQTLTALFTDEAVASLEA 
Prediction of potential genes in microbial genomes
Time: Fri May 13 05:11:18 2011
Seq name: gi|316921252|gb|ADCP01000164.1| Bilophila wadsworthia 3_1_6 cont1.164, whole genome shotgun sequence
Length of sequence - 44546 bp
Number of predicted genes - 33, with homology - 28
Number of transcription units - 16, operones - 8 average op.length - 3.1

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . + CDS 199 - 6777 3088 ## Ent638_0501 outer membrane autotransporter
    + Term 6793 - 6850 21.9
    - Term 6849 - 6893 11.1
2 2 Op 1 . - CDS 6916 - 8064 1302 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase
3 2 Op 2 8/0.000 - CDS 8093 - 10009 2815 ## COG4666 TRAP-type uncharacterized transport system, fused permease components
4 2 Op 3 . - CDS 10088 - 11053 1414 ## COG2358 TRAP-type uncharacterized transport system, periplasmic component
    - Prom 11153 - 11212 5.9
    - Term 11248 - 11280 1.1
5 3 Tu 1 . - CDS 11308 - 11772 -104 ##
    - Term 11973 - 12009 2.2
6 4 Tu 1 . - CDS 12144 - 12524 204 ## COG3653 N-acyl-D-aspartate/D-glutamate deacylase
    - Term 12540 - 12586 9.1
7 5 Op 1 11/0.000 - CDS 12639 - 13223 836 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit
8 5 Op 2 2/0.000 - CDS 13234 - 15009 2305 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits
9 5 Op 3 . - CDS 15027 - 17180 1915 ## COG1042 Acyl-CoA synthetase (NDP forming)
    - Prom 17225 - 17284 6.0
    + Prom 17232 - 17291 6.5
10 6 Op 1 . + CDS 17485 - 18309 988 ## COG2043 Uncharacterized protein conserved in archaea
    + Term 18351 - 18418 2.1
    + Prom 18415 - 18474 4.7
11 6 Op 2 . + CDS 18504 - 19136 724 ## COG1802 Transcriptional regulators
    + Term 19197 - 19241 15.5
    - Term 19353 - 19391 -0.5
12 7 Tu 1 . - CDS 19418 - 19729 307 ## Dde_1821 cytochrome c-553
    - Prom 19961 - 20020 3.1
    + Prom 19817 - 19876 2.4
13 8 Op 1 . + CDS 20089 - 21234 774 ## DvMF_2530 hypothetical protein
14 8 Op 2 . + CDS 21236 - 22591 1376 ## DvMF_2531 cytochrome bd-type quinol oxidase subunit 1-like protein
15 8 Op 3 . + CDS 22636 - 24432 1221 ## DvMF_2532 cytochrome c class III
    + Term 24534 - 24584 -0.9
    - Term 24631 - 24679 13.6
16 9 Op 1 . - CDS 24740 - 25111 358 ##
17 9 Op 2 . - CDS 25108 - 25878 579 ## Elen_1342 5'-nucleotidase domain protein
18 9 Op 3 . - CDS 25875 - 26219 89 ##
19 10 Tu 1 . + CDS 26218 - 26487 97 ##
    - Term 26237 - 26266 -0.2
20 11 Op 1 . - CDS 26417 - 26974 438 ## COG2200 FOG: EAL domain
21 11 Op 2 . - CDS 26988 - 27122 81 ##
22 11 Op 3 . - CDS 27212 - 30295 2844 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains
    - Prom 30369 - 30428 4.6
    + Prom 30299 - 30358 2.7
23 12 Op 1 . + CDS 30438 - 32030 1997 ## COG0786 Na+/glutamate symporter
24 12 Op 2 1/0.000 + CDS 32112 - 32900 169 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
25 12 Op 3 . + CDS 32929 - 33798 771 ## COG0726 Predicted xylanase/chitin deacetylase
26 12 Op 4 1/0.000 + CDS 33813 - 34736 928 ## COG0726 Predicted xylanase/chitin deacetylase
27 12 Op 5 2/0.000 + CDS 34793 - 35623 938 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases)
    + Term 35709 - 35755 14.0
    + Prom 35929 - 35988 2.0
28 13 Tu 1 . + CDS 36081 - 36722 579 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein)
    + Term 36972 - 37009 2.2
    + Prom 36783 - 36842 2.1
29 14 Tu 1 . + CDS 37053 - 38648 1410 ## COG2199 FOG: GGDEF domain
    + Term 38722 - 38765 1.8
    + Prom 39321 - 39380 3.3
30 15 Op 1 5/0.000 + CDS 39504 - 40085 651 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing
31 15 Op 2 16/0.000 + CDS 40134 - 42548 2931 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing
32 15 Op 3 . + CDS 42560 - 43204 781 ## COG0437 Fe-S-cluster-containing hydrogenase components 1
    + Prom 43307 - 43366 2.3
33 16 Tu 1 .
+ CDS 43562 - 44389 681 ## COG3058 Uncharacterized protein involved in formate dehydrogenase formation + Term 44414 - 44442 3.0 Predicted protein(s) >gi|316921252|gb|ADCP01000164.1| GENE 1 199 - 6777 3088 2192 aa, chain + ## HITS:1 COG:no KEGG:Ent638_0501 NR:ns ## KEGG: Ent638_0501 # Name: not_defined # Def: outer membrane autotransporter # Organism: Enterobacter_638 # Pathway: not_defined # 1374 2192 597 1371 1371 380 34.0 1e-103 MTLSRGAIGNLVNRYRAVLRKCRMMNVFGSLAVAGMLVAGNAGFAGAEELSGDISPISLS GDTRNIIGVGDISLRSTEPALRYLINVSGQGQLDISMSNGSPMAVGNADGIYLKDYSEYD QYASAFHVAGSGSSGSFVGTGTFSMVGGGKLLGVCAFLSESKGTLTLSGDITGEAEAVMN GSNGYASFAAAAAGGNLVFGGDRTTLRAKASTGNNANGAFVKYGGMIDFASKSVLIESKN TDSSSVGINCADGTVKTSADTDLDIVVEGNKATTGIQLTASSSDVQLAGNLDLTATQTGQ DSFASVLGISNDSGKMVVSGPTSLRLATNAPFDAKGITASGKADMSFLGDVEIAVTGSAS GSALYTTYRYDYSTQAGICPVISLGTDGKAVALNSSGYGINNQGGSVSLTGQRINITGST GVFVEGGGNENVFADVRFDGPTTINADKAIVTSIKAGEQVGASVTFAYNPTPINVPVTKE SADSKVRGSVTGSSGTINKENAGSLAFYGDISNFSGVFNQKGGTTFLSEGAAGYFGKAQL AVTGGALVAPTLSFQKTGKLTLAGGTLETGTGQIFTSALNADGDMKDPGAVKLSDSNWKF DSGVIAFDDAKYNIVYAQTAAGLLGAGNVAADNVSGSGSAKEITFTGTLVELPPGDPDSF ETLQKAVLDTGIDSIKLGSDIVLSKRLQGTTPVARSLAIDGNGHTISGAYPGLWFKGMDS GTVSIQNIAFDGLKTSSGDRYEGPVSFGPAIFFDMGYFADNWKSTAKLIIGDGVQFRNTE SVGDGAGGAVRTAHGIVEIGNNVGFINCTGGSGGGLYSESFTTIGDNVVFEGNKGRRGGA LNVVDDYEDYLTDPDYASGARRVKYVHIGKNALFKNNSVELLGSGGNGGAIEVQSGELSI DDGATFTGNTSKSTGGAIAVCDWSPQLPAKAVLGSATFTQNSAGSYGGAIINEGDVRFNG PVSFVENTAGKIGGAVCNLNTLNMAAESTFSKNTAGVGGGFYNEGIASLGKASFIENAAA DGGGAVFNVHQLTFADGAVFSGNSATDGGAVYNDFSEDKDGNAVSAGSLAFNGGARFIGN TASGLGGAIYNTRSITLNPGAGQEIVFSGNTDSTGSNAIFMGDGSSLDITGDGKVVFNDA LSSQSATPALKKTGSGELLLNASMDGFLGTAAFEDGRTEIAQKWLIKNTVTITGGRLKMP AFSFVAQGEGNNVAGGKLILAGGILETGTGQIFINGLNAEGDNKDPGAVKLSGDNWSFDS GLIAFNDALYNLTYAQAAADSLGANNVEAGEGAGSTSAKEITFIGTGYVDPVPDPVPPVE PDPDPVPPVDPGPDPVPPVEPDPTPPAPEPKPDEDRTGKVSVDELNNNHANNIVLGNVTI TTGTSSDSGKDFVVGASAKDDKTDSISGSIGGKNIDLGSSGRNISVVGGHYLTLVGGSSD TPLVTAGGNPVNVHVGGTGEGASGVLNLGTPVMDSGGTLSGNIEIAAASTVNVRAGTHVI 
TGEANETTGTADVAGMNNNGGTINIAEGAHLQSTIRQADGQTNVSGKLTSASVELSGGAL NVSGAVDSSSVTASKGDIKVAGTLAADVLTTSSDVQLNIGDQGSAGRVVARQAQLQGGRV FLDPAWKGNDALADASHGVFTFVNNEIDGLLTAGQNSLLVLGDTDTGWALDAFGRSSLQW GQNGVTAALAIRKPQTLNGTLGAVMVDGTKTSAPVLIPNTATFADGSLLLVDGSGLNGAA ALTSQGGTLNVDAGAKLLIDNITQGEYAITSGFSVSNVQGWNGDNLSTPDNLIGLVLGKD ADGSMKVQATARRSSDVFRGLSLVNTMDAIWGRGLNDTESGNMGIRFLSRAVNENHLPKA DTVHTVDGAAQIAVAGGVQGMAVAAADAPVRAIQDHASLSHMTTTREGAIRKDGLNLWIN ALYGAEHARNLGAGSLDGGYNADFGGIVFGGDYAFGDFRVGMALNAGSGTARSRGDFNAT KNDFDFWGMNLYGSWSRDQFNIVGDLGYSANKNEVKQDLPTTMQLGQLRADVDTGVLTAG LRGEYRFETDWADVTPHVGVRYYNLRTDGFTSRIDDHDVFRVGRDTQEIWTFPIGVSFSR DFETSSGWKVKPRADLSVAPAAGDLKAKTKVRVPGVAASDTIKARMMDSVSFDGTLGLEV QKDNISFGLDYGIRASEHKAGHGVNVSFTYKF >gi|316921252|gb|ADCP01000164.1| GENE 2 6916 - 8064 1302 382 aa, chain - ## HITS:1 COG:PH1371 KEGG:ns NR:ns ## COG: PH1371 COG0436 # Protein_GI_number: 14591174 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Pyrococcus horikoshii # 1 381 1 386 389 307 43.0 2e-83 MQLALRTEHIPFSSLCAFFDAAEARQRAGQDIINMVMGRPDFLPAPHINEAVKKAVDDGQ VHYTSNYGLLELRQEVARKLKGENGLGYAPETEIVITAGVSEALHLSMATLLNPGDEALI PAPAFLSYGSCVHMASGVPVFVPTEQAKGFQPEPDVLERHITPKTKLILINSPQNPTGTV YPRETLQGIADLAIRHDLIVVSDEINEKIIFDGEEHISIASLPGMRERTVVLNGFSKAYA MTGSRIGYAAGPERLIAPIYCAHQYSAMCPCTYAQWGAVAALRGPQGHIADMVKELDRRR LMLLDRLAAMPGVSFVRPKGAFYIMVSIPGLGTPMQAAEFLLDAGLAIVPWDEEHLRISY ANSYENLSIAMDRMEKALRGRF >gi|316921252|gb|ADCP01000164.1| GENE 3 8093 - 10009 2815 638 aa, chain - ## HITS:1 COG:BH2945 KEGG:ns NR:ns ## COG: BH2945 COG4666 # Protein_GI_number: 15615507 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, fused permease components # Organism: Bacillus halodurans # 7 623 17 639 656 332 36.0 2e-90 MNETPLHEPLIAEEGKEKERVFTGTPERIVNICLASISALIIYWTVYVSADVMWKHGLYI MTVFCMTLVIYPFEKKPKVNHVSLVDWLLIIGSIAGSAYAIWEYLDRFMRLGMLTEMDIF FGCLMLFLGLEVGRRVIGWSLTLVSIVLILYSLYGFVIPGHFGHGGFDLEAVVTQVYAGM 
EGYYGLSAKMMIQYVAPFILMGAFLEKSGAGDFFIRLAFALTRNTVGGPAKAAVIGSALL GSISGSAVANVSSTGVLTIPMMKKVGYRPHVAGAIEAAASTGGQIMPPIMGAVAFLMAEF TQIPYLTIVTVATGPIILYYITLMAFIHFEAKKHNIGQMKDHHIPSARTVCSEGWHFFVA IMVIIVIMAFGYSPGMSALGGIITLIVIHSIKSRKVDFKMLYEAMVLGGRYSLGIGSLVG CIGIILSLVGLTGVGLKLSWLFTTLANGSPLVAILLVGLISMILGMGLASGPAYIVTAIA VGPALADMGFPLMTAHFIMMWFSIDSEITPPVGLASIVGAGIANADPMKTMFTAFKYAKA LYILPILFYYRPALLLQSSLFDIALTFVTVMFGLIAFAAVWENYLMRRTTLLERILLLGG ALLLFIPGHIWDFAGIGLFIIVYCLQRYYCPAGFRQTA >gi|316921252|gb|ADCP01000164.1| GENE 4 10088 - 11053 1414 321 aa, chain - ## HITS:1 COG:AF0635 KEGG:ns NR:ns ## COG: AF0635 COG2358 # Protein_GI_number: 11498243 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, periplasmic component # Organism: Archaeoglobus fulgidus # 24 320 37 330 330 164 32.0 1e-40 MLKRTLAALIVALTAFCGGTAQAEPLKLTFSTGSVGGGFFAVGSGIAGFASQKIPGISIT AISAAGVVESINRLEQGKADFAMLNTQDPPLAWEGKAPYKKQYRNMRGMGILYMQAAQPY TLKSSGIRTFKDLKGKTLSVGAPGGTMHLDFFRWVEANGLDPKKDFGKVLFLPASEAMEA VKTGQVDVAVELSSIPSPQISELSILRPVHILEFLPGARADMMQKYPQYLPTTIEKGAYN GIDQPIETVGTGAMFACRADLPDDLVYEIVKTVYSQEGVDYLGNVIAALKSMSPNLAVSY KPIPLHPGAERYFREIGLIKD >gi|316921252|gb|ADCP01000164.1| GENE 5 11308 - 11772 -104 154 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRTVLVKNGIIVDSGALRALDILIAGDRAPHGTPGVETPFGVTWDEGISRGAHRRHGFRQ DHEPESGENIRTLPAKRNTAARFRRRYGSCRSGMAPDRTRAQSPSRGGFQHLYGQTMPRR SRDRPPPGEIIRKRGEITGALPGRFLSAGPILKG >gi|316921252|gb|ADCP01000164.1| GENE 6 12144 - 12524 204 126 aa, chain - ## HITS:1 COG:PAB0090 KEGG:ns NR:ns ## COG: PAB0090 COG3653 # Protein_GI_number: 14520359 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: N-acyl-D-aspartate/D-glutamate deacylase # Organism: Pyrococcus abyssi # 2 98 4 98 526 62 35.0 2e-10 MFDTLFCNVRIVDGTGTPWFWGSLGVQDGFVAAVLPRTTKQLSRHTIALEGGILCPGFVD GHSHSERPLLSGEPTDCKTMQGVATEKLGLDGMSPAPMRPIPIFWGSTRANWACCRRKRP SARWLR >gi|316921252|gb|ADCP01000164.1| GENE 7 12639 - 13223 836 194 aa, chain - ## HITS:1 COG:CAC2000 
KEGG:ns NR:ns ## COG: CAC2000 COG1014 # Protein_GI_number: 15895270 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Clostridium acetobutylicum # 1 190 1 188 192 130 32.0 2e-30 MQTTRFLITGVGGQGTILASDVLAEVGLRLGYDAKKSDILGLAVRGGSVVSHIIWSGKVR APMIDEGSADYYISFEWLEGLRRMAYTNPDTVILANDWRIDPVAVSSGQAEYPAVEGIRN TMRERCAALHVLPATPMAVDMGNARVFNSIIMGKLSLLVGGDGDVWRSAVADTVAPKVRD LNVRAFDAGRSFGL >gi|316921252|gb|ADCP01000164.1| GENE 8 13234 - 15009 2305 591 aa, chain - ## HITS:1 COG:CAC2001 KEGG:ns NR:ns ## COG: CAC2001 COG4231 # Protein_GI_number: 15895271 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Clostridium acetobutylicum # 1 584 1 573 584 494 43.0 1e-139 MKELLSGNEALARGALEAGIEIACGYPGTPSSEILENVAKYKEIYSEWSVNEKVAMDAAA GAAYSGRRSMVTTKQVGMNVLSDSLFYTAYTGAEAALVVITADDPGLFSSQNEQDNRHYA KLGKFPMLEPCDSQECKDFMGEAVAISERFDTPVVIRTTMRTSHSKSVVELGEPASYGKQ VGPFPRNMEKYNCMCTWARERHYVLEQRLLDLEAFSNEWPGNAIHWGDREYGFIAGGILY EYVRQVFPGASILKIGMCYPLPKKLIREFAEGVKNVIVVEELDPFMEEQIRAMGIDARGK SIFPICGELLPEDIAECCRKAGILKDAPSARANEAILDLPIRSPLLCSGCPHRSTFYHLS QMKLPVAGDIGCYNLGTLPPFNAQHTMGSMGASVGVLHGMGLSGLPEPAVCTIGDGTFFH AGVAPLLNMVHNKGKGTVIIMDNRTTAMTGHQDNPGLSKTLSNGTIEPVDIAGLCRACGV EKVVTANAFDMKEVRKGLEECTAYDGVSVLITRGDCVFVSRSPKPARVVDADKCIACGKC IQSGCPSVVLSDAVHPKTGKRKARIEPVTCVGCGVCSQICPVQAISGPENA >gi|316921252|gb|ADCP01000164.1| GENE 9 15027 - 17180 1915 717 aa, chain - ## HITS:1 COG:AF1211 KEGG:ns NR:ns ## COG: AF1211 COG1042 # Protein_GI_number: 11498810 # Func_class: C Energy production and conversion # Function: Acyl-CoA synthetase (NDP forming) # Organism: Archaeoglobus fulgidus # 11 697 4 674 685 336 33.0 9e-92 MKRSPWKLSPLFSPHSIAVIGASPKGGAGSIVIRNLQRLGFAGTIHPVNPKYADVLGYPC HPSLETIPGPVDCAAVLLGDKAILPILKTAHARGVKGVWAFASGFAETGEQGAAMQREIR DFCRETGLLFCGPNCVGYANITDGVGMYSAPLPRAFRKGSIGVIAQSGAVLLALGNSPRE 
AGFSRLISSGNEAALGLADYMDYLVDDPKTAVIALFVETIRDPEGVADACRRARGAGKPV IALKVGRSELACRVAATHTGAIAGSDRTLDAFFRRWHVIRVNTLDELLETSILFSGLRGA PTTSRRVGMSTVSGGEMGMLADICSDYGLEFPPLSEEGKAGLRSVLPPYAPLANPLDAWG SGDLKEAYPASLSILAREPAVDLLIVSQDMPSNMADEQIAQFSDVAAAAVRARADSGKPV VVVSNISGGIEPSIRKILDDGGVPALQGSTEGVGAVAAWLDWNRPLPEETEAPPPLPADL LAELDPCGGIVPYALSTRVLAHFGITALCERLTQTPEEAKTAAEAIGYPVALKGISPDIT HKTETGLVKLNIADAGALRQAWDELERSMNLHHPDARREGMLIQSMVTGDVVETIAGVNR DPAFGSAVVVGLGGIFVELLRDVSLELAPLSPARAKAMINRLQAAKLLQGFRGKQPADIA ALEETLVRLGRMAHALGKRLVSLDLNPLMVLPEGQGVRIVDIVMQARPSGDSPKHLP >gi|316921252|gb|ADCP01000164.1| GENE 10 17485 - 18309 988 274 aa, chain + ## HITS:1 COG:MTH526 KEGG:ns NR:ns ## COG: MTH526 COG2043 # Protein_GI_number: 15678554 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in archaea # Organism: Methanothermobacter thermautotrophicus # 8 234 10 227 228 85 29.0 1e-16 MPPKLPDFANISERLRKALRLEHRIVVIGLSDTPPANLPHYEGEPLKACQMLDTVRFEGK SFYTVQNDHYECKNAIRWLGFDESYEGHFSGEWATGDYPDNGRALFRAPAFSRRMYEESP KVRVGTVKCAYYMPLEKANEGPARGDEVAIFVLNPRQAMYLARGTLYSRGGICYGMTGPG TCQSVIAGPFCTRQPMYSLGCFGARQFMKITGNEMWFGVPIEQLALLADDVELLLERRPD LKAQMDEPFDQVHVVTQHELDVQKAKGKLITKNK >gi|316921252|gb|ADCP01000164.1| GENE 11 18504 - 19136 724 210 aa, chain + ## HITS:1 COG:AGl2035 KEGG:ns NR:ns ## COG: AGl2035 COG1802 # Protein_GI_number: 15891135 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 203 23 224 238 82 29.0 6e-16 MNRDEFYTAFKHDIITSKLKPGELLKEKQIMQDYGIGRLPMREIMVRLQQENLIETIPRM GTVVTRLDIKHMRDVVELRLELECVVAKLAAERITDEQLGKLRELVEKIHSCADGKSDAA AAWDAQFHQLLYEAAGNAELTRVVNDLLQTMMRLWYYLELWDKDFLSDADGAPNILKALE KRDPDLAQKAMRVHINYSVSRAVTGKFDLA >gi|316921252|gb|ADCP01000164.1| GENE 12 19418 - 19729 307 103 aa, chain - ## HITS:1 COG:no KEGG:Dde_1821 NR:ns ## KEGG: Dde_1821 # Name: not_defined # Def: cytochrome c-553 # Organism: D.desulfuricans # Pathway: not_defined # 1 99 1 99 103 82 46.0 3e-15 
MKKVLIVFSAVMLLSSFGVAFADGDAANLFKTHCQGCHGADGGRVPASGIEPIKGQSAAD LLKKLEGYKDGSFGAQRKLVMENVVKQLSDEQLKSLADYASTL >gi|316921252|gb|ADCP01000164.1| GENE 13 20089 - 21234 774 381 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2530 NR:ns ## KEGG: DvMF_2530 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 21 362 10 370 381 220 41.0 6e-56 MALGMFLWAAAAYASPEAPRLLSPPMALPVPSGFLKVLLLLTFFVHLVLVNVLLGSVILS VIDRRASASDRKGGVAFMPKVLALAVNFGVAPFLFLQVLYGHFLYPSIVLMAVWWMFVAL FAMLAYYGLYVSDGAVRPARRTPILFLSALLLLMTAFLLSNASTLMLRPDFWFRWFSEPH GHLLNTSDPTLFPRYLHILLASLAVGGLTMAWRARWSKRDPEVDREEAERRFRRGLDWFF YVSLAQVPVGLLFLFTLPPDVRGLFLGGDALSTAALILAVSGLCIALILVRQGVLKLAST AALGVILVMVCVRGMVRDAMLQPYSGAASPAPTALAMPHGQTAALSLFLVATVLVVAVLV WLGRVLFHALNRSESSEIREG >gi|316921252|gb|ADCP01000164.1| GENE 14 21236 - 22591 1376 451 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2531 NR:ns ## KEGG: DvMF_2531 # Name: not_defined # Def: cytochrome bd-type quinol oxidase subunit 1-like protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 1 435 1 425 435 414 51.0 1e-114 MELPVWHFVGIGSGLIIGIVSVLHVFVAQFAVGGGIYLVWMERKAYRDGAPEILQWLERH THFFLLLTMVFGGLSGVGIWFTMSVVNPGATSMLIHNFVFFWAAEWGLFLLEVVSLLAYY YTYPWSRSGRMSPETHMRIGIVYACSGFLSLVLINGIITFMLTPGQGLATGNVWLGFFNP TYWPSVVVRLGICLILAGMFALFTAPRIASAEARHIAVRVSGLWIILPFFLLLGGSVWYF MALPPDRQEAVLSRTSDIHPFLITYGWMLPVVFLAGVVAFVRAERLRKPLSIVILCSGLL LVGSFEWVRETARRPWVAEGYMYSNGMSVTQDARAKAGGIASVSGWVRTLDAVESGSLKL EAPLADGLSKGSMLFALQCNVCHGLGGPRIDIIPRVRRLTRYGLEAQLQGQGTRLGYMPP FAGNLADRQALADYLERMGAQRQTIPSPEAQ >gi|316921252|gb|ADCP01000164.1| GENE 15 22636 - 24432 1221 598 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2532 NR:ns ## KEGG: DvMF_2532 # Name: not_defined # Def: cytochrome c class III # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 4 581 16 643 643 400 41.0 1e-109 MAAWLAALALFLIWLGWVLFSTSPIEGKAEKPSFPGVAQSDPGTVLDSLPAKGSLATPNM PFTASQEYVLTGAGLQGFYQTSAPNDVWTLSYPEPQLNAQVIKRGPPVPEIVTQGVDVTW ELGPQAGLAKDSTTRQGRMNVVEDSFFAASIPVSAVNADGVLNPYPVITLRAKDEKTGKL 
LAESAAVLAVSPGFGCAQCHANAGTAILEVHDRHQSTHFMEQHAKGEVIACRSCHVGLKG GKAGEGKSGPELSVSAAIHGWHATYLADRGADACKTCHVDLGRTGDDPKDAPRRLFARDF HVDRGLSCVRCHGFMEDHSLALLKAEQEAGLPQAAKLMSGITAREVPLAKIKGRLPWVQE PDCTSCHNFSEKPNLLTASAFNKWTPRSEGLSGLFSRRRDDMLMVRCIVCHGAPHAVYPA RNPLADNLDNLPPLQYQQQAATLGSYGNCALCHGQPMDFSAHHPLVQWSEREIHVPSGAR QTMPPARFSHQAHTPLINCTICHHTGYVDGKSLLCTSSGCHDGLTATLRTDKNAKPTLNP FYFYNAFHGTYPSCVACHTESLAAGKPAGPTDCKACHQAPSPLWAHEAEAGGSGAAAQ >gi|316921252|gb|ADCP01000164.1| GENE 16 24740 - 25111 358 123 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKHGESGVLQVVDMTGENLLRTLEYAIPADNDIRGWFYYFPGLKMTYAPSAMPGKRIIKI TDAEGKPLEAQRVYTAAVMNQTFPEDAIQSLTDTGISIQALLTETIQSRKTITPGKDGRF IIR >gi|316921252|gb|ADCP01000164.1| GENE 17 25108 - 25878 579 256 aa, chain - ## HITS:1 COG:no KEGG:Elen_1342 NR:ns ## KEGG: Elen_1342 # Name: not_defined # Def: 5'-nucleotidase domain protein # Organism: E.lenta # Pathway: not_defined # 1 250 248 499 618 66 27.0 8e-10 MIDTGVWKKSDLSLTYAERETMLFNRQCAMVEDSVLMAQRGGYALTGSTDQFALMPFFSP GTPCDWDRLYMVCYIGLNKHLGEPQNKKKYDPVMQLMDYISTPEGQLALAADTGAMYSSV KKVPAPDIPEIADMLPALSHGRYAIFPEPKNAQDALRESLAGMLAGALTQDDVIRMVDHQ NKNPPPPKACTVIGEATTDFTLIETGNFLTDAMRAKAGTDVALFLDNGKDGKYNGKGASA RFRPQTTNDVSGPSPI >gi|316921252|gb|ADCP01000164.1| GENE 18 25875 - 26219 89 114 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLNSIARDGKLYCLPCSAQVRDIVYNTILFEEKGWKVPYDFNRFIALCRTIEARGIRSIH LSFGNSELLDTAFGATATETASALRPMPNGWPTTKGAGAASGSISAPPSIRFKP >gi|316921252|gb|ADCP01000164.1| GENE 19 26218 - 26487 97 89 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVLYLLVKDILERSSRRRDVLNQVEGGIAAPEVVHSATETGRMDGVDDLAEFAHVLEHH VFRELDFNVLRRRAAPLANFDVALQEKGI >gi|316921252|gb|ADCP01000164.1| GENE 20 26417 - 26974 438 185 aa, chain - ## HITS:1 COG:RSp1097_2 KEGG:ns NR:ns ## COG: RSp1097_2 COG2200 # Protein_GI_number: 17549318 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Ralstonia solanacearum # 50 166 1 118 272 82 36.0 4e-16 MQGAKSVWLSPIVEFGVYVINNDSLPSPHMIGRAKALKETSTELRGRLRYAVYDDAIRRQ 
LFREKRLKNIMGAALQNRDFQVYLQPKYRTESETIGGAEALVRWMGAEGMIYPDEFISLF ERNGFIIQLDLWVFEEVCLRIRAWLDAGLPPVKVPVNCSRVQLKSPFPGALHRNLPEVRH AAAIH >gi|316921252|gb|ADCP01000164.1| GENE 21 26988 - 27122 81 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLRCLHGGIMSDLSPHECMERLFADTFCILAVYANEEQTLKRFV >gi|316921252|gb|ADCP01000164.1| GENE 22 27212 - 30295 2844 1027 aa, chain - ## HITS:1 COG:hyfR KEGG:ns NR:ns ## COG: hyfR COG3604 # Protein_GI_number: 16130416 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli K12 # 635 1027 258 660 663 259 39.0 2e-68 MYAGTVSPSDPILAALTLSLRLSAAPLSERHLSLITGQPEENIRQILLDLEGFGFAARTE AAWTCGTVPPSLLGAMAATVRAKIPESRMPHYLGFLLPDAPLEECELVVRYLDEELNQQA PETLLCLEFVIRYLRDWGETHADAAFQKSSQYAEMVLIVQSQCLFLNSQMQMAVKLTPIA YALTQQSGNERFRTLVAIFGCYLKVFSDEPLPQDISRYVERLGELPDLGDKEMQDCLPLF RGILHYVRGEYPHVLKYYEQKLEVYGWKYRRFAMLLASCASQSAFYLRQYHLSLGINESS RRTAALAGDRMLSMFWMLHLAFAMLRVGDMDAALLNLDCLFMAFDARQYNKTAVSTVRGI ALYHYLNGRLRSAHALLTGQAARGVSPNAPHVPFEDPLNLDMLYALEQAGFPPISRYPLA LIVKKLKEGTNRQLRGAALRIEALRLRDRGGDPKREQTLLRESLDCISGTGDRREEALSA NELANVLERIGDAASAAALRDQAEACAGFRVDRSVSYQQMVLLLTRHSQGNRLPALGDGH YDCPQDSCVERCHRAFNACSNETCLTSALHRLVAIAQKELKSERGALFRRDEDDRLVCVA AVNLTDMELKSEQMRPCLEWLDGFSDRPAAPGRGEQGLCLPLDIGESGLWLLYLDSTFTD GPFAHLHQPELHTLSYLFASEVRSALRLKKVRDEESRHQKERFQSVVLQEDRNIAPLFGT GLGELLEQVRHVSVTDAPVLILGETGVGKEVMARQIHLMSRRSGPFIAVHPASTPEHLFE SEFFGHERGAFTGAIKQKLGFFEMADEGTLFIDEVGDIPPAIQTKLLRVLQEQRFIRVGG TREIYSSFRLITATNKNLWQEVQEGRFREDLLYRISVVPLMLPPLRERRQDIIPLVQNFM EHFCRRYNQLPFALNQEETAVLRSYDWPGNIRELKNVIERAVILNRLPALNVPHSPPHRK TSEETPNRRGFVVDDIPTLDELERRYLEYIFNIASGCVYGEKGMTELLNLKRSTLYTKLK KHGIMLR >gi|316921252|gb|ADCP01000164.1| GENE 23 30438 - 32030 1997 530 aa, chain + ## HITS:1 COG:Cgl2722 KEGG:ns NR:ns ## COG: Cgl2722 COG0786 # Protein_GI_number: 19553972 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Corynebacterium 
glutamicum # 67 485 1 395 449 78 24.0 4e-14 MLALCLLVYKCLLNEKRLSGVFLPVLRPLPPRVTGFSLVAAGRTKGLETRARLPMPGSAP QLSGCSMESSFLPYLMAVMWMSFLLLLSVWLRAKVTFLQKYLVPAGIIAGTLGFALINLG WIGYPSPEGWVPLKVGDFGMISFHLFSFGFGIIGLGCFSTHTKGRSMTLIKGALWIDLLF WLFYGLQATVGYGITELYAKITGSDLTAATGFLAGFGFAFGPGQALSVGMSWQNDYGLPD CVSMGLAYAAAGFMVANFVGVPLANWGLRKGYATYGTKELSSDFLKGLRPEDKRVPACQL TTHSGNVDTFALHFAVAATVYGLGWIVCYVLKYFVLPPNYQTASFGFIYLYALFAGMIVR LVINHTAANAFYSDDAQNRILGTTIDYMIISALMAVSAATVMKYFIPFILVIVVCTLITL LGVLYLGRRVGSFGLERLLVVFGLVTGTAASGLALLRIVDSDFKTPAAAEVGLNNVYALI PLFPFLLLSVTMPGGFGISGMLIMHVVMIAVCAAILFAGHRMKIWGPRQF >gi|316921252|gb|ADCP01000164.1| GENE 24 32112 - 32900 169 262 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 86 256 77 238 242 69 27 3e-11 MYTDLKGKTAFVMGAGKARGIGFHIAGALARQGVSVCLADRDGCVFERARELEAEGVVAR AFTADVCSAAQVEAMRDAVGRLFPSIDILIQCAGVFPDPVKLTEDSIETWLRTQDINVHG TFRLLHAFLPLMRARGGSVVAVASGAGKRPLPGYSGYSVSKAGLIMLLKSVAVEYAADGI RANSICPGPVESEMVDSRVAAESERLGVPAETLREAIRKTIPLGRMASVDDIVRTALFLA SDVSSYLTGQSLNLSGGMITEV >gi|316921252|gb|ADCP01000164.1| GENE 25 32929 - 33798 771 289 aa, chain + ## HITS:1 COG:SMb21100 KEGG:ns NR:ns ## COG: SMb21100 COG0726 # Protein_GI_number: 16264427 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Sinorhizobium meliloti # 20 280 17 287 292 92 28.0 6e-19 MQEAPLMLKNKTARRIKDIPGFVFPGGVRVIINFTVDFDAMIFRKFLREPKLWCAQGEFG GRTGLWRLLDVFGEFDVRTTLFLPGQTGLLYPEVLRRAVREGHEIANHMWDHHIPPTLEE EAVHIDRTDTLIKGLTGQYPAGTRSEHDLAALRDHAYTYVSYTPQGEFPFYVYYENIGKW MLNLPISFIHDDAMFFYFGWFGSRNEQQRIQSPEAFLQTLLEAYAAARETTGYMNIVIHP HLCGRLCRLEMLRRFFRRTREDGDVLFATSAWLADYILERFPAEGPASA >gi|316921252|gb|ADCP01000164.1| GENE 26 33813 - 34736 928 307 aa, chain + ## HITS:1 COG:PA1517 KEGG:ns NR:ns ## COG: PA1517 COG0726 # Protein_GI_number: 15596714 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: 
Pseudomonas aeruginosa # 76 295 100 306 308 64 27.0 2e-10 MAWKKDFTLSDEMSPSGIVWPAGKAAVSVVVDYSVPAGAEGIDEAAIAYARTVWGNAVSG GWLIDYLNAYGVKATFAVPLAMARAFPESVRRAHADGHEIAAGSFAKEDVAGLSPDEERE RIERTLSGLAELTGTRPEGWFTLPRTGDDYPGGSVSDATGELLLEAGCGYFGNSMADDIP HYWITDYDRRHTLLMLPYYYAMDTQFFMFFPGVGKGSGLVQMRALWENWAAELEGVRAWG RQSTFVIQPYLMQFGAARAVLDRLMRAVTGAPDLWSATSGACAAYWKQAYPADRTLRFEK AAWLDEE >gi|316921252|gb|ADCP01000164.1| GENE 27 34793 - 35623 938 276 aa, chain + ## HITS:1 COG:BS_yjmF KEGG:ns NR:ns ## COG: BS_yjmF COG1028 # Protein_GI_number: 16078300 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Bacillus subtilis # 8 271 7 273 278 160 38.0 2e-39 MDIHALFSLKGKTAVVTGASGALGGAAVRAMAYAGANVAACYNSGRERLDTLISEIGDVG VEIKPYKVNSFSQDEIRQHAEDVMRDFSGIDVVINTAGGNVKGAYYTDEQSLFDLDAQPQ FDTVTLNLFGGCFWPCLAYGAKMLDNPGGGSIINFSSISAFTAIRGHIAYAAAKAGVSNF TQSLAAHLARDFNPKLRVNAVAPGFFPNNNPAQMLFNPDGSYRAKAQRGVDATPMHRMGH PNELIGTLVWLASDASSYVTGITVTVDGGYLLDSPA >gi|316921252|gb|ADCP01000164.1| GENE 28 36081 - 36722 579 213 aa, chain + ## HITS:1 COG:NMB0357 KEGG:ns NR:ns ## COG: NMB0357 COG0744 # Protein_GI_number: 15676272 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Neisseria meningitidis MC58 # 21 211 35 222 233 151 41.0 1e-36 MLDIGRYLVWPPVGWLETEQPRTTSFMEYRQEQWQNGPKAKDGKAMRIRQQWVPLKRIAP ALRQAVVTSEDDLFWKHDGFNFSQMYDALKRNWDKGRMAAGGSTISQQLAKNLWFTPERS ILRKIKEAIMTWRLELALDKERILELYLNVAEWGNGVYGAEAAARHYFGKSAASLSRGEA ARLAVMLPSPLRRTPSSAIVKRLSSRLLKRMPR >gi|316921252|gb|ADCP01000164.1| GENE 29 37053 - 38648 1410 531 aa, chain + ## HITS:1 COG:RSc0588_4 KEGG:ns NR:ns ## COG: RSc0588_4 COG2199 # Protein_GI_number: 17545307 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Ralstonia solanacearum # 359 
531 2 180 182 120 40.0 8e-27 MYRCKLDIRIFSEDPLLLADVRNIAPLERFEHEVSGYRSFSPEAVRGSDIIVLDLPVAER PEAVRALCKPGAILVFCMEPEAFAVLRTPSLEAADDIWVKPFHRDFGAVRFKKILAGIKH RKDSRLTQTYLDTIIDSIPDLIWFKDVKGSHLKVNNGFCHAVGKKKEDVQGRGHYYIWDL KKEEYEQGEYICLESDEIVLEERRTCLFDEMVKSKQGMRQFKTYKSPLFDDDGTILGTVG IAHDVTDLANMGAELEIFLRNMPFAILISGNDGRIINVNAKFEEYFAAKEKNIVGKPYEE WKHVIQKSLCKTYGEGHFEIRLHGDGEERILEFHEEPIFDVFRNRVGQFCFCRDVTIERT FEHQIWISANTDALTGLYNRRFFYEYMNENRKENQLSLLYVDLDDFKKVNDAHGHHIGDG ALELVARLMREAFPGDFIVRLGGDEFLICLVGERSLAFLEEKANRLLQSLLEAFQVSDYL RVMSASIGIASSADPGTKLDDLVRQSDIAMYAAKQSGKSRCCVYSSGLIKK >gi|316921252|gb|ADCP01000164.1| GENE 30 39504 - 40085 651 193 aa, chain + ## HITS:1 COG:ECs4820 KEGG:ns NR:ns ## COG: ECs4820 COG0243 # Protein_GI_number: 15834074 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli O157:H7 # 1 193 3 195 1016 145 39.0 3e-35 MERRSFLKLSCGLGAGILLSSLGVSVLPIEAFAQELTKYDRIKTAKQSTSICCYCAVGCG LLCSTDTKTGKIINIEGDPEHPINEGSLCAKGAGIYQTTAANDRRLDKVLYRAPYGDKWE EKDWDFAIDRIARHLKEERDAGFVKKNAKGQTVNRVESVAHLGSSNVDSEECWSMAVFAR SYGLVYIDHQARV >gi|316921252|gb|ADCP01000164.1| GENE 31 40134 - 42548 2931 804 aa, chain + ## HITS:1 COG:STM1570 KEGG:ns NR:ns ## COG: STM1570 COG0243 # Protein_GI_number: 16764914 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Salmonella typhimurium LT2 # 1 801 213 1014 1015 716 46.0 0 MTNHWIDLQHSDCILIQGSNAAENHPISFKWVLKAKDKGAEVIHVDPRFTRTSARSSMYA PLRSGTDIAFLGGMIKYILDHDLYFKDYVLSYTNAAFLVNPKFSFNDGLFSGYDAQKHAY DKSSWSFQKDGKGLIKRDDTLKNPHCVFQLMKKHYDRYDLKKVSSITGTPEADLLAVYKA FAATGKPDKAGTIMYALGQCHHSVAVQNIRTMTIVQLLLGNIGICGGGINALRGEPNVQG STDHALLSHYLPGYLKAPKASWQTLEQYIAGTTPQTANPQSLNWMSNTGKYATSLMRAFY PEGGTPENGFGYDYLPKLDDGQDASVMSMIDAMYAGKIKGLTCVGQNPACSLPNSNKVRK ALQNLDWMVHVNIFDNETASFWKGPGLDPKKVKTECFLLPVTASVEKEGSQANSGRWMQW KYAAAEGPGDAISTGDVFWRVMSKLKELYAKDGGTFPEPILAANTDFVDGKGRYDPERVA 
KYINGYFLKDVTINGVEYKKGECVPGFPLLQADGSTSCGNWICSGSFTRAGENLMKRRKK VDPTGLGLYPEWSYSWPVNRRVIYNRASVDPQGKPWNPKKALVQWKDGQWIGDVPDGPAP PLAMQGGKLPFIMQPEGLGALYGPGLAEGPFPEHYEPMESPFKKNIMSAQRVNPALHSFG KDMNPIANASPDFPLVMTTYSCTEHWCTGAFTRWQPWLVEAQPQAYVEISEELAKEKGIA NGEKVRISSARGTVDAVAMVTVRLTPFKCGGKTVHQVGMTFNYGWLQPKECGDTANLLSP MVGDANSMTPEYKAFMVNIDKIKA >gi|316921252|gb|ADCP01000164.1| GENE 32 42560 - 43204 781 214 aa, chain + ## HITS:1 COG:fdnH KEGG:ns NR:ns ## COG: fdnH COG0437 # Protein_GI_number: 16129434 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli K12 # 8 208 33 237 294 147 38.0 2e-35 MSDGKTILVDVSRCTGCRGCQVACKQWNELPATDTVQTGSYQNPPDMNGDTYKIVRFREG RHENGKPYWNFFTDMCRHCVNPPCVLAADEGTMIHDEATGAVVYTEKTAENDFDVLLDAC PYRIPRKNEKTGAIVKCTMCIDRISAGGIPACVQSCPTGAMRFGERAAMLELAHKRVEEL KKEFPKAHAVDPDDVRVIYVITDDPDKYAEHVGA >gi|316921252|gb|ADCP01000164.1| GENE 33 43562 - 44389 681 275 aa, chain + ## HITS:1 COG:aq_1051 KEGG:ns NR:ns ## COG: aq_1051 COG3058 # Protein_GI_number: 15606334 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Uncharacterized protein involved in formate dehydrogenase formation # Organism: Aquifex aeolicus # 83 269 100 274 283 64 27.0 3e-10 MYDDTALRRLLIELAASHPEHAAVIRAFTPLALARGKLLDAYALSRKTKPLDGFPMFRFE ELPVCSEKSGAIASAVLEAVAEGFPGAREQVEAVRDALKPGSVRRLCKASLAHDPEPLAL FAEKNSLSPDVLDMVIAQTVKILMARTANSLPEAPFDPARKTCPYCGGKPELSVVHEKEG HRSLFCTDCGRHWRFQRTACPSCGCDKPDNLRLHFAENTPDERAVSCKNCGHYILEADIR KRDLALDGAAIVCLGMGYLDALMQEQGLLPLGESA Prediction of potential genes in microbial genomes Time: Fri May 13 05:12:59 2011 Seq name: gi|316921238|gb|ADCP01000165.1| Bilophila wadsworthia 3_1_6 cont1.165, whole genome shotgun sequence Length of sequence - 15707 bp Number of predicted genes - 20, with homology - 11 Number of transcription units - 10, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 55 - 402 176 ## COG3293 Transposase and inactivated derivatives - Prom 581 - 640 6.2 + Prom 531 - 590 2.4 2 2 Tu 1 . + CDS 613 - 735 67 ## BHWA1_01309 hypothetical protein 3 3 Op 1 . + CDS 872 - 1099 136 ## Moth_0617 hypothetical protein + Term 1147 - 1186 7.5 + Prom 1106 - 1165 2.0 4 3 Op 2 . + CDS 1191 - 1409 112 ## + Term 1426 - 1468 9.5 5 4 Tu 1 . + CDS 1479 - 1676 92 ## + Term 1724 - 1764 2.2 - Term 2512 - 2555 3.0 6 5 Tu 1 . - CDS 2589 - 3149 7 ## - Prom 3199 - 3258 2.4 7 6 Tu 1 . + CDS 3223 - 3465 168 ## + Term 3511 - 3550 10.2 - Term 3499 - 3537 6.2 8 7 Op 1 . - CDS 3551 - 4420 726 ## COG3646 Uncharacterized phage-encoded protein 9 7 Op 2 . - CDS 4440 - 4880 296 ## 10 7 Op 3 . - CDS 4895 - 5344 273 ## 11 7 Op 4 . - CDS 5345 - 6088 793 ## Bamb_6608 conjugal transfer protein TraL 12 7 Op 5 . - CDS 6100 - 6495 92 ## - Prom 6636 - 6695 7.5 13 8 Op 1 . + CDS 7030 - 7296 71 ## 14 8 Op 2 . + CDS 7300 - 9078 501 ## Reut_D6526 conjugal transfer relaxase TraI + Term 9249 - 9291 -0.9 + Prom 10179 - 10238 8.2 15 9 Op 1 1/0.000 + CDS 10380 - 11000 324 ## COG3505 Type IV secretory pathway, VirD4 components 16 9 Op 2 . + CDS 11056 - 11475 238 ## COG4959 Type IV secretory pathway, protease TraF 17 9 Op 3 . + CDS 11479 - 13314 525 ## COG4227 Antirestriction protein 18 9 Op 4 . + CDS 13337 - 13900 287 ## COG4643 Uncharacterized protein conserved in bacteria 19 9 Op 5 . + CDS 13909 - 14151 118 ## + Term 14153 - 14198 8.0 20 10 Tu 1 . 
- CDS 14660 - 15439 592 ## COG1192 ATPases involved in chromosome partitioning - Prom 15483 - 15542 3.1 Predicted protein(s) >gi|316921238|gb|ADCP01000165.1| GENE 1 55 - 402 176 115 aa, chain - ## HITS:1 COG:BMEI1403 KEGG:ns NR:ns ## COG: BMEI1403 COG3293 # Protein_GI_number: 17987686 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Brucella melitensis # 1 115 1 122 122 147 58.0 5e-36 MRTLFYLSESQMERLSPFFPRSHGIPRADDRHVVSGILYVIKHGLQWKDAPAEYGPYKTL YNRFVRWSRLGVFSRIFTELANQTPFDGSLMIDSTHLKAHRTAASLRKKGAPRVS >gi|316921238|gb|ADCP01000165.1| GENE 2 613 - 735 67 40 aa, chain + ## HITS:1 COG:no KEGG:BHWA1_01309 NR:ns ## KEGG: BHWA1_01309 # Name: not_defined # Def: hypothetical protein # Organism: B.hyodysenteriae # Pathway: not_defined # 1 39 1 39 84 73 82.0 3e-12 MPAIARCYGIIIKMYFLAGEHNPPHFHAIYGEYVGVIDLI >gi|316921238|gb|ADCP01000165.1| GENE 3 872 - 1099 136 75 aa, chain + ## HITS:1 COG:no KEGG:Moth_0617 NR:ns ## KEGG: Moth_0617 # Name: not_defined # Def: hypothetical protein # Organism: M.thermoacetica # Pathway: not_defined # 1 75 1 75 90 82 52.0 4e-15 MFYRIHSVNPLPNKHLLVCFWDNTVKQYNMEPLIRSIPAFQILEEPVLFNQVRVDAGGYG VTWNDNLDLSCNELW >gi|316921238|gb|ADCP01000165.1| GENE 4 1191 - 1409 112 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNGTLMLYLDQYGNHFYARTVRELREKVGSSGSRIAKMYVENGADGEPRHVGYVIAGHWL KMFAPIELPVNL >gi|316921238|gb|ADCP01000165.1| GENE 5 1479 - 1676 92 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKSLDEYKDFSEHECPFCQGIDINMKIYQYDSVEYLEILYICLHCYKKWWVQYQLCGIEA NTSTL >gi|316921238|gb|ADCP01000165.1| GENE 6 2589 - 3149 7 186 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYRDGALEELIKDLEEERASKRLRELDPELLLKPDIWIVTPEEGACEKCKSFAKKLFLKK PQRPHPNCKCKIQRILPEKTAQKALIKNTPIVTDLLRGDTSPLFLFFSATSVINVHIKNL NPVRCADVNIDSYNELGKESKSSYLLPLSSVVFTFSYFREAPVDWKLVLVTNFDDAQLLC TVKNIE >gi|316921238|gb|ADCP01000165.1| GENE 7 3223 - 3465 168 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MSAYQKSVWLGDLMRYRNLEKLWVGIRADPSTVSVAVPRTTETVGIGRAKRISRDWWLFE SESGEYGLVATKDTVKWILD >gi|316921238|gb|ADCP01000165.1| GENE 8 3551 - 4420 726 289 aa, chain - ## HITS:1 COG:PM1774 KEGG:ns NR:ns ## COG: PM1774 COG3646 # Protein_GI_number: 15603639 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Pasteurella multocida # 26 116 35 125 239 75 40.0 1e-13 MEIEQATTQEVFEPTIEDMIQTENGIPTTTSLVIAQAFEKEHKDVLRAIYNMECSPEFNE RNFAPVGYKDAKGEIRPAYRLTRDGFAFLAMGFTGKKAAAWKERFLEAFNAMEAALLRQR QRQETAPKEPEQSPPPRTWEKPEFFRSARKLTKAQTESLLGVLNMECLLQNRQPEDALKE LLTFFHLSSLEDMRQSDYRHAVFSVMKRMLLISGKTSEDAQASSQYAAAINGLINFWNHS SDFTKSDIQNYVQNKCGRSLQEGISSDSDGLKVLFTLWGGISHYDLRLN >gi|316921238|gb|ADCP01000165.1| GENE 9 4440 - 4880 296 146 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEDYSKIFQSHIIRFYHEWGYSRQQFAEIIGISYGRTYSLLSEGGDIGLTLNTMQAISDS LGLPLPLLLKPLESEEWQSVLALWKWNPPYHSYIPDGYTLVRNVVLPEHKAVIVNEWGKM TAKELKKLKKQKNSLDLKEESDTPNE >gi|316921238|gb|ADCP01000165.1| GENE 10 4895 - 5344 273 149 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDSLEEYIALIARENHVILDRKDSICILHTILAKFERDLLESQRKILDDFSGKIETLCLE LDLAEKARSEKALSFARSSAEQLMKSIPTTMKATVEETSFKKLLAQTALATLKDQMSVIV HIPRKVWIGGSVLALLNLLSLFLFLINVI >gi|316921238|gb|ADCP01000165.1| GENE 11 5345 - 6088 793 247 aa, chain - ## HITS:1 COG:no KEGG:Bamb_6608 NR:ns ## KEGG: Bamb_6608 # Name: not_defined # Def: conjugal transfer protein TraL # Organism: B.cepacia # Pathway: not_defined # 1 247 1 241 241 211 42.0 2e-53 MAHVHFFLNPKGGTGKSLVSYIVAQYLISKGVKTTCIDCDPVTPTLIAYEALHAVRVNIM VENNIDKTKFDRFVELIAQGEENEHFVIDNGTSSFVPLLDYMVTNAITPLLVSMGHTVTF HVIIVGGDNLTGTVDGFKQIVTQFAPEARIIVWLNPFFGTIERGGKSFEDFDVYRENKTH VSAVLYYPDFPKDTFGKSFALLQKDRLTFAEVCDEEQAPQDYDLMTRHRIGMIRQRVSTM LDAARVL >gi|316921238|gb|ADCP01000165.1| GENE 12 6100 - 6495 92 131 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTASLSEILREETLRKAKCSDRVSRCRPLVLSLWAEINTSLQDGWSLRSVWKALKDTGKY PSSYESFRANVRALQGTGGKVESPIPSGQKAVTPEKKPETTPASQTSTALKDFSVKGVDW NPAKKDDPTLF >gi|316921238|gb|ADCP01000165.1| GENE 13 7030 - 
7296 71 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSVSEYLRNVGKGTPVKSRVDVQAIKALAKVNADLGRAGGLLKMLLKNEERLIGYSGEQL RQMTNNTVGEINRLQSSLALISKRILRD >gi|316921238|gb|ADCP01000165.1| GENE 14 7300 - 9078 501 592 aa, chain + ## HITS:1 COG:no KEGG:Reut_D6526 NR:ns ## KEGG: Reut_D6526 # Name: not_defined # Def: conjugal transfer relaxase TraI # Organism: R.eutropha # Pathway: not_defined # 1 356 1 382 746 251 40.0 5e-65 MIARRLRMNTPSKSSITRLIAYLTNSQGSSNRVQDVRITGCESEDATWASMEMLAVQQQN TRAKSDKTYHCMISFQEGEHPSPEEIAEIEQRFCESLGYGSHQRVSVLHGDTEHLHLHIA INKIHPEKLTIHEPFYDHKELARVCTEVERQYGLGVDNHTFRTQARPNAAINMERAGDIE SLTGWIQRTCLADLKDARDWKTFGAVLAKHGLTLKARGNGLVFASGELHVKASSIDRSLS KTSLEKRLGRFEGISPSSSMKSQKTYERKPLSAGASRLWSEYQREEQERQNKRERLRQER MEQREEAVKDFRLRNLLIKNMTSGILNKLILHRLSRNRLKKQLQKKQWIDWLRAKAGQSH PAGRHALREKQGASQRRADTALRVLEFGRARYRFNPEERESFFVRVERQSGSRAVIWGTD LERAVAESGMAIGDRISLHKIGAEAVSVHQRQEDGQGHVEKARIAGQRNHWRIDVLVKRI SEEALAYLQSRYEGHSSYLRGIERYPLEPETITRKGTRIFAEGVKENGGRLFVSRQATDA ELRCIMDFAERHYKDVTIHAPDPLRARMKDPITQHPEQKRTQEQSQDEGYGR >gi|316921238|gb|ADCP01000165.1| GENE 15 10380 - 11000 324 206 aa, chain + ## HITS:1 COG:mlr9747 KEGG:ns NR:ns ## COG: mlr9747 COG3505 # Protein_GI_number: 13488573 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Mesorhizobium loti # 1 167 424 578 679 119 35.0 4e-27 MLLDEFPAWGRLESVEESLAYLAGFGIYFYIIAQDMQQLIKTYGENQTITANTHIQIGFA PNNPKSIKYLSEMTGVTTVVKPQVTTSGSRIGVFQKQVSTTYQEVRRQLMTEDEVMRMPK AKMDGDNMIEGGDMLIFLAGTPAIYGKQMPYFLDPILKARTGVEPPADTDRIIEPVQEQE VNLEQVPNTCEGIFDEEQPDPSFFEA >gi|316921238|gb|ADCP01000165.1| GENE 16 11056 - 11475 238 139 aa, chain + ## HITS:1 COG:XF2058 KEGG:ns NR:ns ## COG: XF2058 COG4959 # Protein_GI_number: 15838650 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, protease TraF # Organism: Xylella fastidiosa 9a5c # 2 138 33 178 
178 107 36.0 5e-24 MVNTTRSFPLGVYFKTHGPVHKGDLVLFCPEENMVARYGRARGLISYGVCPGRYGYLIKR VMATEGDEVNLSDAGVSVNGICLPNSQRQKAFPAFSGSLVLHNQVLLMSEHPLSFDSRYF GFVRVEKICTPLKPFILWE >gi|316921238|gb|ADCP01000165.1| GENE 17 11479 - 13314 525 611 aa, chain + ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 11 295 223 511 522 311 55.0 2e-84 MTKADDHYVQVAETLIEKLEQGTAPWQKPWDSGGYGSLPMNPTTGNRYRGGNALQLMMRE YGDPRWMTYRQAQQAEAQVRKGERGTPIIYWKVEEERSVRGEDGKPVYDKDGNSLKETVQ LERPRSFISYVFNAEQIDGLPPLVADKERVWADVQRAENILEASGVKLVHRAQGHAFYRL GEDAIYLPEKKQFSSEEGYYSTALHELSHWTGNESRLNREFGPFGSIAYAKEELRAEIGN MMISGELGIGQSVDDHAAYVQSWIKILRDDPREVFRAAADAEKIMDYVMEFEHTQELGQE HQSGLEKNSLGMPEQGVELHADMQEHQAEATRSQAAASSNNERVYLNVPFKQKEEAKALG VQWDRQKRSWYVPAATDTAAFAKWLKKEETVQTPQKEASEQRRDPQRLYLAVPYLERKAA KTAGAKWDSIAQSWYAGPQADMEQLKRWLPENVKNEQLPAMTPREEFAEVLRSVGCIVEG EHPLMDGAAHRIRTEGDRAGASSGFYVAYMDGRPAGYVKNNRTGEEVRWKTQGAFISQQD KAAFQAECARKQQERAKELLLEHEKTAERVAGQLSWMKQDVPTPYLDRKGLAPRRGVFFG HRWKNDLHPCC >gi|316921238|gb|ADCP01000165.1| GENE 18 13337 - 13900 287 187 aa, chain + ## HITS:1 COG:XF2061_2 KEGG:ns NR:ns ## COG: XF2061_2 COG4643 # Protein_GI_number: 15838653 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 1 164 663 820 849 182 60.0 2e-46 MQYIQEDGTKRFAKNSRKEGCFHPVGGMDALRTAPAIVIAEGYATAGSISDAIGHATVAA FDSGNLMAVATALKDKYPDKAVIIAGDDDLHLLNHPKVRANPGREKAEKAAQAVGGKAVF PVFAPGEREKDMAGFTEFNDLGQKSTLGMAAVARQLKPAIEKAISEKSAELERNKQLVQS HSEGMSR >gi|316921238|gb|ADCP01000165.1| GENE 19 13909 - 14151 118 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSEKTELSAREVEEAVRAAYTRPDEATTLRESVEAKCALLSVPGLEVEFDPEEAEQCGAF MENALSEEEAADASQDVISC >gi|316921238|gb|ADCP01000165.1| GENE 20 14660 - 15439 592 259 aa, chain - ## HITS:1 COG:RSc3326 KEGG:ns NR:ns ## COG: RSc3326 COG1192 # Protein_GI_number: 17548043 # Func_class: D 
Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Ralstonia solanacearum # 3 250 4 252 261 209 49.0 4e-54 MYICAIANQKGGVGKTTTTHNLGVQLARNAKKRVLLLDLDAQGNLSDACGLEPQTLERTV FDVLAGNVPIAGAKSTLETGLDILPANIRLAEAELAFAGRMGRENLLKKALTSVAGEYDY VLIDCPPSLGLLTVNALNAANGLLVPVQVEYYALAGLALIRQTAELVRDLNPDLAILGLV LTFFDARKTLNKDVAAALADEWGDALFSTRIRDNVSLAEAPSNGQDVFSYKRGSYGAKDY AAFAAEFLERTEGCYGTRG Prediction of potential genes in microbial genomes Time: Fri May 13 05:14:49 2011 Seq name: gi|316921237|gb|ADCP01000166.1| Bilophila wadsworthia 3_1_6 cont1.166, whole genome shotgun sequence Length of sequence - 5652 bp Number of predicted genes - 3, with homology - 0 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + SSU_RRNA 441 - 1982 99.0 # AY858543 [D:1..1542] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. + TRNA 2063 - 2139 94.3 # Ile GAT 0 0 + TRNA 2147 - 2222 90.1 # Ala TGC 0 0 - Term 2047 - 2115 25.2 1 1 Op 1 . - CDS 2212 - 2457 79 ## - Term 2495 - 2530 1.8 2 1 Op 2 . - CDS 2584 - 2838 89 ## - Prom 3016 - 3075 3.1 3 2 Tu 1 . + CDS 3530 - 3679 56 ## + 5S_RRNA 4633 - 5189 84.0 # AF142677 [R:48033..48709] # 5S ribosomal RNA # Bacillus megaterium # Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus. + 5S_RRNA 5365 - 5464 93.0 # CP001197 [R:1104568..1104682] # 5S ribosomal RNA # Desulfovibrio vulgaris str. 'Miyazaki F' # Bacteria; Proteobacteria; Deltaproteobacteria; Desulfovibrionales; Desulfovibrionaceae; Desulfovibrio. 
Predicted protein(s) >gi|316921237|gb|ADCP01000166.1| GENE 1 2212 - 2457 79 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLIAAYRVFHRLSAPRHSPVALDILFLKNFTIPFPFPYSIVNEQHPPVFPTGATRPCLDD CREIFASMTNHGPYVSKLLAS >gi|316921237|gb|ADCP01000166.1| GENE 2 2584 - 2838 89 84 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVPLDSHEVPRASQYSGTGCVLFGFEYGAITRCGGAFQLTSSAYSRIAYAGPTTPNGRNH PVWALPISLAATLGVSFDFSSSGY >gi|316921237|gb|ADCP01000166.1| GENE 3 3530 - 3679 56 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MALRRKCNGAKHGTEALAIPFGDWVGERSLRAEGVSEGMLDCSEVIMLT Prediction of potential genes in microbial genomes Time: Fri May 13 05:15:11 2011 Seq name: gi|316921235|gb|ADCP01000167.1| Bilophila wadsworthia 3_1_6 cont1.167, whole genome shotgun sequence Length of sequence - 1790 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 128 - 187 2.5 1 1 Tu 1 . + CDS 360 - 1763 1835 ## COG2509 Uncharacterized FAD-dependent dehydrogenases Predicted protein(s) >gi|316921235|gb|ADCP01000167.1| GENE 1 360 - 1763 1835 467 aa, chain + ## HITS:1 COG:PAB1091 KEGG:ns NR:ns ## COG: PAB1091 COG2509 # Protein_GI_number: 14521862 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Pyrococcus abyssi # 1 461 1 460 484 417 46.0 1e-116 MSNTFDVIIVGGGPAGLFAAWWLADHSDLSVCIVERGNMVKKRGCPLGKAKKCMKCDPCH ILSGMGGGGLFSDGKLNFIHKLGKTDLTQFMPRSEAESLIEETEAIFDRFGMTAPVFPSD MENAKSIRKEAKKHGIDLLLIRQKHLGSDCLPNHIDGMCEALRERGVSIRTGEDVRHVIV EDSEVRGLITDKGELRCRAAILAPGRVGADWMGDMAKKYGMELQQRGIEVGVRVETHNDV MSDLTNVIYDPTFFVQTQRYDDQTRTFCTNPAGFITLENYQDFVCVNGHAYRNRKSDNTN FAFLSKVILTDPVSDSHGYGVAIGRLASLIGGGKPILQRLGDLRRGRRSTWQRIHNGYVE PTLTEVTPGDIAMALPGRIVTNLVEGLEQLNLVVPGIADDSTLLYAPEIKFFSTQVATSS DLETSVSNLFVAGDGPGVSGNIVSAAATGIIPAKAIVRKFGKGEGKE Prediction of potential genes in microbial genomes Time: Fri May 13 05:15:13 2011 Seq name: gi|316921232|gb|ADCP01000168.1| 
Bilophila wadsworthia 3_1_6 cont1.168, whole genome shotgun sequence Length of sequence - 1565 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 73 - 132 2.0 1 1 Tu 1 . + CDS 182 - 358 123 ## 2 2 Tu 1 . - CDS 419 - 1459 501 ## COG2801 Transposase and inactivated derivatives - Prom 1502 - 1561 2.4 Predicted protein(s) >gi|316921232|gb|ADCP01000168.1| GENE 1 182 - 358 123 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNIKKSDERRYAQDEGSVKSYTFSLSGESEPRRCTFQEMMRIIKERCPNGKMKEILLA >gi|316921232|gb|ADCP01000168.1| GENE 2 419 - 1459 501 346 aa, chain - ## HITS:1 COG:CC0624 KEGG:ns NR:ns ## COG: CC0624 COG2801 # Protein_GI_number: 16124877 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Caulobacter vibrioides # 25 318 22 284 326 70 25.0 4e-12 MESFNQNVIKHKTGLLNLAAELGNISKACKMMGFSRDTFYRYQAARDAGGVEALFEVSRR KPNLKNRVEEAIEVAVTAFAVDFPAYGQTRASNELRKQGIFVSPSGVRSIWMRHDLASMK QRLRALEKISAKQGMVLTEAQVQALERKKHDDEACGEIETHHPGYLGSQDTFYVGTIKGV GRIYQQTFVDTYSKWAAAKLYTTKTPITGADLLNDRVLPFFSSMEMGLIRMLTDRGTEYC GKVEAHDYELYLGVNGIEHTKTKARHPQTNGICERFHKTILNEFYQVAFRRKLYQSLEEL QADLDTWIDSYNTQRTHQGKMCCGRTPMQTLLDGKSLWAEKVGQLN Prediction of potential genes in microbial genomes Time: Fri May 13 05:15:19 2011 Seq name: gi|316921230|gb|ADCP01000169.1| Bilophila wadsworthia 3_1_6 cont1.169, whole genome shotgun sequence Length of sequence - 1297 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 14 - 73 3.7 1 1 Tu 1 . 
+ CDS 290 - 1279 1008 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family Predicted protein(s) >gi|316921230|gb|ADCP01000169.1| GENE 1 290 - 1279 1008 329 aa, chain + ## HITS:1 COG:BS_yurF KEGG:ns NR:ns ## COG: BS_yurF COG1975 # Protein_GI_number: 16080304 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Bacillus subtilis # 11 326 12 325 330 111 27.0 3e-24 MKLENDIAQHLEQGEFLALATIIDRSGSAPRHAGAQMLVTRDLSVLGTIGGGQVESDVLA ACLPVRKGGTARLMHFDMTGFTPDADMICGGIVDILVERITPEQLPFFRQAAACRSRAAF GVWLVDITDPASPQRSFHTDAPALPAPVLTQVRSNSAACIDLDGRRVYVEPLIHQGVVVL CGGGHVSLATGRLAHEVGFEVIAVDDREEYASPTRFPFARAVHVLPEFKGLAEACGIGPE HYIVIATRGHSHDRGCLSQAMHTTAHYVGMIGSKRKRDGIYSFLRSEGFSEDDFGRVHSP IGLEIGAETPEEIAVSIVAELIAARRGLV Prediction of potential genes in microbial genomes Time: Fri May 13 05:15:21 2011 Seq name: gi|316921228|gb|ADCP01000170.1| Bilophila wadsworthia 3_1_6 cont1.170, whole genome shotgun sequence Length of sequence - 1280 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 84 - 127 7.0 1 1 Tu 1 . 
- CDS 293 - 1147 1248 ## COG3956 Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain Predicted protein(s) >gi|316921228|gb|ADCP01000170.1| GENE 1 293 - 1147 1248 284 aa, chain - ## HITS:1 COG:BS_yabN KEGG:ns NR:ns ## COG: BS_yabN COG3956 # Protein_GI_number: 16077126 # Func_class: R General function prediction only # Function: Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain # Organism: Bacillus subtilis # 24 276 238 487 489 163 37.0 3e-40 MSDTTTAPAAADAATQAALERLNTVIDRLIAPDGCPWDSTQTPESLTEYVIEECHELVDA IRSGKTADMVEELGDVAFLILMIGKLMARAGGPSLADALEVEAAKMVRRHPHVFSDTTYE NQSEQLKDWDRIKREEKAAADAEGPKGTYDSLPRGLPPLTKAYRIHSKADRVGFTWPEDE DVEKQVEAEWLELLDAMASDDAEAIEHEFGDHMFTLVELGRRKGIKAAPALNAATERFLT RFERMEALARERNLDFVSLSLDDKDDLWNEVKAAEKPGTEKTED Prediction of potential genes in microbial genomes Time: Fri May 13 05:15:22 2011 Seq name: gi|316921226|gb|ADCP01000171.1| Bilophila wadsworthia 3_1_6 cont1.171, whole genome shotgun sequence Length of sequence - 1275 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 1273 1297 ## Predicted protein(s) >gi|316921226|gb|ADCP01000171.1| GENE 1 1 - 1273 1297 424 aa, chain - ## HITS:0 COG:no KEGG:no NR:no DDAPVLGTLDTTQTTVADGEAALTGTLSFTPGADAEGAQVTVEVEGQTFTGTKANGEWTF TGGSDGSSFQLNGTAFTYTRPSSNTTDGRNDTIILKVTVTDGDGDTAEQSVAVNTVAAPL FEGAPSGGSSVVTTDEGNIPGMGSQHETSATQPFGAATDGSFKMELHGADATVSIGGTEL KVENGKLYHNGVEVTADAAVSVPGGAHGTLTVTGMDADGTVHYTYTLTAPVDATGNASNR PGEGDAGRGEAVHADAFDVSITTTGGTATGQITVDALDDAPVLSTLDTTQTTVADGEAAL TGTLSFTPGADAEGAQVTVEVEGQTFTGTKANGEWTFAGGSDGSSFQLNGTAFTYTRPSS NTTDGRNDTITLQVTVTDGDGDIAQQSVAVNTVAAPLFEGAPSGGSERGDHGRGQHPRHG LAAR Prediction of potential genes in microbial genomes Time: Fri May 13 05:15:47 2011 Seq name: gi|316921225|gb|ADCP01000172.1| Bilophila wadsworthia 3_1_6 cont1.172, whole genome shotgun sequence Length of sequence - 1158 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri May 13 05:15:48 2011 Seq name: gi|316921223|gb|ADCP01000173.1| Bilophila wadsworthia 3_1_6 cont1.173, whole genome shotgun sequence Length of sequence - 1036 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 207 - 623 313 ## Bamb_6596 antirestriction protein Predicted protein(s) >gi|316921223|gb|ADCP01000173.1| GENE 1 207 - 623 313 138 aa, chain - ## HITS:1 COG:no KEGG:Bamb_6596 NR:ns ## KEGG: Bamb_6596 # Name: not_defined # Def: antirestriction protein # Organism: B.cepacia # Pathway: not_defined # 3 138 2 142 142 112 43.0 3e-24 MENLEMEIITATKVADEDRLNFWFRYVGMAKMLAFERHVYCWMRRLCPHYDGGYWEFYDL SNGGFYIAPADEKKMWLTWPGNYFNDEMSADAAGIVVTLYALNDFAEQISPAFGEKHRQL YDYIESHPEAQAIYAAID Prediction of potential genes in microbial genomes Time: Fri May 13 05:15:53 2011 Seq name: gi|316921221|gb|ADCP01000174.1| Bilophila wadsworthia 3_1_6 cont1.174, whole genome shotgun sequence Length of sequence - 993 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 89 - 992 464 ## ETAE_2441 type VI secretion system protein EvpM Predicted protein(s) >gi|316921221|gb|ADCP01000174.1| GENE 1 89 - 992 464 301 aa, chain + ## HITS:1 COG:no KEGG:ETAE_2441 NR:ns ## KEGG: ETAE_2441 # Name: evpM # Def: type VI secretion system protein EvpM # Organism: E.tarda # Pathway: not_defined # 1 295 1 279 462 111 29.0 4e-23 MQPQRPVYWYSGQFLEPQHFQQADAFHASERASLLQSAQPCFWGVGNLDVDEAALAEGRI LVRSGIMRFQDGTEVVIAAVPEDGNAILPSRGFREAWTDPHMPFTVFAGLPPLKPYGNVA GIPSCMRDGGRLIGCDADTLPEAPGRYLCPNSDDSIADRYALPLATESRRDMPVRTLYLY PRLFWENETTDHPDWLFLPLLRLTDEGSGPRPDPAYAPPSLYVEANPVLRGLARGLEARL AAAIRRLEPARRRGLDEPATSRQLLLVNAARTLSELRHALAWPGTAPHELFGLLRNGFAA A Prediction of potential genes in microbial genomes Time: Fri May 13 05:16:00 2011 Seq name: gi|316921220|gb|ADCP01000175.1| Bilophila wadsworthia 3_1_6 cont1.175, whole genome shotgun sequence Length of sequence - 886 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 850 224 ## gi|302862878|gb|EFL85810.1| conserved hypothetical protein Predicted protein(s) >gi|316921220|gb|ADCP01000175.1| GENE 1 2 - 850 224 282 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302862878|gb|EFL85810.1| ## NR: gi|302862878|gb|EFL85810.1| conserved hypothetical protein [Desulfovibrio sp. 3_1_syn3] # 80 195 217 332 415 104 46.0 6e-21 AVTYGTARGHGLSGKRRRLSNARTPTAHKTTHKTGGTDALTPADIGAVPTTVQVIAGTGL SGGGSLTANRTLTVKYGTGAGTACQGNDARVTASEAFRLSMIGVPRYWRSTTLPAGHVWA NGDLALFADWPELKKIYDAGGFAGMLLAYNANSATIAANLGKWRPNAANPTGLYVPNLSE QFFRAWVQGIGKAGGYNTPGVPNILGSWSGWNIMSIAQGGANGVFQAHWEPNGTVALETK TAGVWDNLIIDASNSNPVYGSSQTVMPASIDVPCIIYLGLST Prediction of potential genes in microbial genomes Time: Fri May 13 05:16:16 2011 Seq name: gi|316921219|gb|ADCP01000176.1| Bilophila wadsworthia 3_1_6 cont1.176, whole genome shotgun sequence Length of sequence - 809 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 321 - 356 1.1 1 1 Tu 1 . 
- CDS 403 - 750 176 ## COG3293 Transposase and inactivated derivatives Predicted protein(s) >gi|316921219|gb|ADCP01000176.1| GENE 1 403 - 750 176 115 aa, chain - ## HITS:1 COG:BMEI1403 KEGG:ns NR:ns ## COG: BMEI1403 COG3293 # Protein_GI_number: 17987686 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Brucella melitensis # 1 115 1 122 122 147 58.0 5e-36 MRTLFYLSESQMERLSPFFPRSHGIPRADDRHVVSGILYVIKHGLQWKDAPAEYGPYKTL YNRFVRWSRLGVFSRIFTELANQTPFDGSLMIDSTHLKAHRTAASLRKKGAPRVS Prediction of potential genes in microbial genomes Time: Fri May 13 05:16:17 2011 Seq name: gi|316921218|gb|ADCP01000177.1| Bilophila wadsworthia 3_1_6 cont1.177, whole genome shotgun sequence Length of sequence - 802 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 240 185 ## Nmul_A0420 transposase + Term 425 - 459 0.7 Predicted protein(s) >gi|316921218|gb|ADCP01000177.1| GENE 1 1 - 240 185 79 aa, chain + ## HITS:1 COG:no KEGG:Nmul_A0420 NR:ns ## KEGG: Nmul_A0420 # Name: not_defined # Def: transposase # Organism: N.multiformis # Pathway: not_defined # 38 72 26 60 122 62 77.0 3e-09 KGRKTGYVGDTEVSHAYPVLSFRKSVGAYKIVLYLFAPRVDDRRVVSGIIYAIKHGLQWK DASDEYGPHKMLTIALQVD Prediction of potential genes in microbial genomes Time: Fri May 13 05:16:20 2011 Seq name: gi|316921217|gb|ADCP01000178.1| Bilophila wadsworthia 3_1_6 cont1.178, whole genome shotgun sequence Length of sequence - 767 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 362 89 ## gi|302861318|gb|EFL84257.1| putative tail fiber protein + Term 501 - 549 10.7 Predicted protein(s) >gi|316921217|gb|ADCP01000178.1| GENE 1 3 - 362 89 119 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302861318|gb|EFL84257.1| ## NR: gi|302861318|gb|EFL84257.1| putative tail fiber protein [Desulfovibrio sp. 3_1_syn3] # 2 118 94 196 198 66 41.0 5e-10 VALIGWNVPNYQGVFLRGYGGQTSYHYGAVGHWSAGLGELQGDGIREIWGELSYLPRSRD GEVGQSGSLAFWNEGRNQWMNDAGKAPSGAMNFYASRSTPVVGEVRPVNRAVRYLIRAR Prediction of potential genes in microbial genomes Time: Fri May 13 05:16:31 2011 Seq name: gi|316921216|gb|ADCP01000179.1| Bilophila wadsworthia 3_1_6 cont1.179, whole genome shotgun sequence Length of sequence - 738 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 183 128 ## Predicted protein(s) >gi|316921216|gb|ADCP01000179.1| GENE 1 3 - 183 128 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPPDLSSQQGYPPDMQLFPPDLNKKVVLSEDARERERERERERERERERERERERRRRTT Prediction of potential genes in microbial genomes Time: Fri May 13 05:16:38 2011 Seq name: gi|316921215|gb|ADCP01000180.1| Bilophila wadsworthia 3_1_6 cont1.180, whole genome shotgun sequence Length of sequence - 632 bp Number of predicted genes - 2, with homology - 0 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 59 - 277 149 ## 2 2 Tu 1 . 
+ CDS 393 - 611 149 ## Predicted protein(s) >gi|316921215|gb|ADCP01000180.1| GENE 1 59 - 277 149 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGGHGDNYPATVTLEGNVSLHDTGYADSIIVYMADIGTPVAHRRYYINQRQAQETRPANI AQPSIIYLGIPA >gi|316921215|gb|ADCP01000180.1| GENE 2 393 - 611 149 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGGHGDNYPATVTLEGNVSLHDTGYADSIIVYMADIGTPVAHRRYYINQRQAQETRPANI AQPSIIYLGIPA Prediction of potential genes in microbial genomes Time: Fri May 13 05:16:52 2011 Seq name: gi|316921214|gb|ADCP01000181.1| Bilophila wadsworthia 3_1_6 cont1.181, whole genome shotgun sequence Length of sequence - 615 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri May 13 05:16:53 2011 Seq name: gi|316921212|gb|ADCP01000182.1| Bilophila wadsworthia 3_1_6 cont1.182, whole genome shotgun sequence Length of sequence - 563 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 561 670 ## COG1492 Cobyric acid synthase Predicted protein(s) >gi|316921212|gb|ADCP01000182.1| GENE 1 3 - 561 670 186 aa, chain - ## HITS:1 COG:CAC1374 KEGG:ns NR:ns ## COG: CAC1374 COG1492 # Protein_GI_number: 15894653 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Clostridium acetobutylicum # 4 183 226 401 491 145 40.0 5e-35 IRDLNIPEEDMAGFSWGHTDCGEKKAGSLDIAVVMLRHVSNYTDFAPLAAEPDVRLRPVR RAEEWGDPDVVMLPGSKSVVPDLDDLRRSGLADNILGHAERGKWIFGICGGLQILGRAIL DPHGIESAAPEVPGLGLMDLRSTFAADKTLVRVARAETPLGVPSGGYEIHHGLTDHGPSA LPLFLR Prediction of potential genes in microbial genomes Time: Fri May 13 05:16:53 2011 Seq name: gi|316921211|gb|ADCP01000183.1| Bilophila wadsworthia 3_1_6 cont1.183, whole genome shotgun sequence Length of sequence - 542 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 349 329 ## - Prom 476 - 535 4.0 Predicted protein(s) >gi|316921211|gb|ADCP01000183.1| GENE 1 1 - 349 329 116 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRPALIDKLCQAGKWLFKPDLVPVDDVTLKRGSTAISVAIDALVRSGGGLAVDAKTGRL YVDFSLVPDDQMQAIVLAMVQQGGGLAVDGTGKLYVDFASMPTDKFESMLKSIRVP Prediction of potential genes in microbial genomes Time: Fri May 13 05:17:03 2011 Seq name: gi|316921210|gb|ADCP01000184.1| Bilophila wadsworthia 3_1_6 cont1.184, whole genome shotgun sequence Length of sequence - 530 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 88 - 297 120 ## Predicted protein(s) >gi|316921210|gb|ADCP01000184.1| GENE 1 88 - 297 120 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSIAQGGANGVFQAHWEPNGTVALETKTAGVWDNLIIDASNSNPVYGSSQTVMPASIDVP CIIYLGLST Prediction of potential genes in microbial genomes Time: Fri May 13 05:17:10 2011 Seq name: gi|316921209|gb|ADCP01000185.1| Bilophila wadsworthia 3_1_6 cont1.185, whole genome shotgun sequence Length of sequence - 500 bp Number of predicted genes - 0