Prediction of potential genes in microbial genomes Time: Thu May 12 16:52:32 2011 Seq name: gi|319979536|gb|AEUH01000001.1| Actinomyces sp. oral taxon 178 str. F0338 contig00001, whole genome shotgun sequence Length of sequence - 20646 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 5, operones - 3 average op.length - 5.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 277 - 3609 3021 ## COG0647 Predicted sugar phosphatases of the HAD superfamily 2 1 Op 2 . + CDS 3606 - 3755 94 ## 3 1 Op 3 5/0.000 + CDS 3763 - 4596 932 ## COG1189 Predicted rRNA methylase 4 1 Op 4 17/0.000 + CDS 4589 - 5422 972 ## COG0061 Predicted sugar kinase 5 1 Op 5 4/0.000 + CDS 5419 - 7089 1911 ## COG0497 ATPase involved in DNA repair 6 1 Op 6 . + CDS 7086 - 8252 1351 ## COG4825 Uncharacterized membrane-anchored protein conserved in bacteria 7 1 Op 7 . + CDS 8249 - 9151 1221 ## HMPREF0573_11888 hypothetical protein 8 1 Op 8 . + CDS 9184 - 9873 759 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 9913 - 9951 1.0 9 2 Op 1 3/0.000 + CDS 10031 - 10726 1017 ## COG2345 Predicted transcriptional regulator 10 2 Op 2 12/0.000 + CDS 10767 - 12212 2278 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 11 2 Op 3 41/0.000 + CDS 12242 - 13405 1714 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 12 2 Op 4 4/0.000 + CDS 13423 - 14172 1292 ## COG0396 ABC-type transport system involved in Fe-S cluster assembly, ATPase component 13 2 Op 5 19/0.000 + CDS 14175 - 15452 1520 ## COG0520 Selenocysteine lyase 14 2 Op 6 4/0.000 + CDS 15484 - 15972 752 ## COG0822 NifU homolog involved in Fe-S cluster formation 15 2 Op 7 . + CDS 15981 - 16400 738 ## COG2151 Predicted metal-sulfur cluster biosynthetic enzyme 16 3 Op 1 . - CDS 16390 - 17046 353 ## Ajs_4079 hypothetical protein 17 3 Op 2 . 
- CDS 17049 - 18392 883 ## Arch_0273 hypothetical protein + Prom 18361 - 18420 2.8 18 4 Tu 1 . + CDS 18449 - 19462 873 ## COG4127 Uncharacterized conserved protein 19 5 Tu 1 . - CDS 19476 - 20645 1494 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II Predicted protein(s) >gi|319979536|gb|AEUH01000001.1| GENE 1 277 - 3609 3021 1110 aa, chain + ## HITS:1 COG:Cgl1376 KEGG:ns NR:ns ## COG: Cgl1376 COG0647 # Protein_GI_number: 19552626 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Corynebacterium glutamicum # 607 859 7 257 328 208 46.0 6e-53 MADETRGRDENGRGGYQRRQPGWRDRSEGGHRQRDDRGRGPGSYGGREDRYQRGEYRQNG EGYGHGDDDRGGDRYQRGRGGYRGQGGYQSGQRQDRGYRRDDRDGYRSGGDRRQRDDRFQ GGGGYQREDRFQRDGRGERRYDNADRGRGFRRDDRDGRGYDRGRDDQRGPKRFDRDGYRS GGDRRQRDDRFQRDDRFQRDGRGERRYDNADRGRGFRRDDRDGRGYDRGRDDQRGPKRFD RDDRGDHRYSGGYQNGQRQDRGYRRDDRDGYRSGGDRRQRDDRFQRDGEHQGGGRYQRDG RGERRYDNADRGRGFRRDDRDGLGYDRGRDDQRGPKRFDRDDRSGRGRLDDRSNQGYSST SEFASRDGGPAIPAGVSPEELDPQALVALETLSGPNRDIVARHLVMAGQLIDLDPQEAYK HAQAAVARAGRVDVVREAAALTAYASGRYEEALREVRAVRRMRGDDSLRAVEADSERGLG HPEKAVEIVDAASTVGMELSEQVELVLVSSGARADLGQSDVGLVIVDDALARLGDGADET LVRRLMEVKADRLRELGRDDEADETLAAMPEEIEAPDIVDVSLYQDADVDSKRSPLRGTE APLADLYDVALLDLDGTAWAGDQTIDHAADAVLASRERGMKSAFVTNNAMRTPQQVADKL NAMGFEATPDMVMTSAMDAAANMAEELEEGAKVFMIGGEGLRQALAENGFTVVASADDEP VAVVQGLDKQVDWSTLSEGAFAIQRGAAYYATNLDATLPEERGQALGNGALVRAIRHATG KRPVAAGKPEASIYQRGARRVGGERPLAVGDRLETDIMGAVNARVPAMHVLTGVHGAQDV LRAPRGQRPSFLARDMRGLLEAHPGPKHHRDGTWTCGFSQVAKATRSGALTLDDIELVDG QAVTIDSYRALAAAAWEYADEFGEPHCPRITVVDNDDPTGVVAPPEPQEDSSSETVGATA SEEAPADSQEDSGSADGASPQDADSAAVVGAEPEADYDGAVGDGAEALSGAVGAGADSAE SAPEQDPAALDRDSEEPAEEAGWPPADSDSTTGAPAGPDADADAAADPEDVAAAADSLPD PADQIPEFLPGEEELEALLAETSGMDEDGR >gi|319979536|gb|AEUH01000001.1| GENE 2 3606 - 3755 94 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTTLSNQVDAQLADFDQLGAQARVDVLTAIDARLRQGLDSPTAPPAPGR >gi|319979536|gb|AEUH01000001.1| GENE 3 3763 - 4596 932 277 aa, chain + ## 
HITS:1 COG:Cgl1378 KEGG:ns NR:ns ## COG: Cgl1378 COG1189 # Protein_GI_number: 19552628 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase # Organism: Corynebacterium glutamicum # 4 270 5 272 273 248 54.0 1e-65 MRLRRLDSELVRRGIARSRGHAQELIESGRVRLDGEVVLKPARQMNPAQAVVVAQGDDEG YVSRGAHKLAGALDALGDAAPVIAGRRCLDAGASTGGFTDVLLRRGASSVVAVDVGYGQL AWKLQTDPRVHVLDRTNVRTLDPRAVAPAPQVVVGDMSFISLTLVIPALVAAAAPDADFL LMVKPQFEIGKDRLGRGGVVRDPAHHVETVEKVARCALAEGLAIAAVQASPLPGPSGNIE YFIHMRKGHPTPIDPGTLTDHIRDAVRRGPAGGALHG >gi|319979536|gb|AEUH01000001.1| GENE 4 4589 - 5422 972 277 aa, chain + ## HITS:1 COG:Cgl1379 KEGG:ns NR:ns ## COG: Cgl1379 COG0061 # Protein_GI_number: 19552629 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Corynebacterium glutamicum # 4 274 12 306 320 196 41.0 3e-50 MADRVMLVRHVARPEAIRAAESVRTELEALGIEVVTEGAAADIDLVLAMGGDGTFLAAAS HARQRDVPLLGVNAGHMGFLTQLSKRGVGEVAARIAEGDYRVESRMTLDVRVDRPDGTAA SDWALNEAVVMHTDVAHPVHFALIVDGQEVSTYGADGMIVSTPTGSTAYSFSAGGPVVWP DTEAVIVAPLAAHGLFTRPLVLGPSSCLQIVVLHDMWTAPEMWCDGLRREEVPAGSTVTA RVGSRPVRLVRVDDTPFSARLVTKFNLPVGGWRGTDA >gi|319979536|gb|AEUH01000001.1| GENE 5 5419 - 7089 1911 556 aa, chain + ## HITS:1 COG:Cgl1380 KEGG:ns NR:ns ## COG: Cgl1380 COG0497 # Protein_GI_number: 19552630 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Corynebacterium glutamicum # 1 555 1 578 593 290 38.0 4e-78 MITSIDIRNLGVIAEAHADFGPGLTVVTGETGAGKTMVLSSLLLLLGGRADAALVRQGAA RLDVDGVFEVDEGTAERAEEAGGVVEDGELIVGRTVPARGRSRARLGGRPVPASAIAGIV GSMVTIHGQSDQIRLTSQNAQREALDRFGAAAHQELVASYREAFHAAVAAKKRLDAALAD RDGREEEIEDLAAATARIASLGLAVGEEEELARESARLTNVQDLRAHCEAAQGALRGGDS VPGAIDLARQALAELEGAGRYDESVVGMCQRLQSQILEVEALADDVSAYSRTLEPDPAAL ARVHERRAAIKDALRGRAGDIEGLLAWNEAALERLEELSSPDSDPEALERALAAAQERVL ELGGRLSEQRRRLADSLSAQVDEELRALSMPDASLSIALTPSKPTSHGLETIGFLLRPHP SAPPRPLGSGASGGELSRVMLALELILGRDGSSSTFVFDEIDAGIGGQTATEVGARLARL 
AENHQVIVVTHLAQVAAFASHHLVIAKEDGTTSVRAVTGADREAELTRMMGGDPHSPTAR RNAIEVLGSAVSQSQG >gi|319979536|gb|AEUH01000001.1| GENE 6 7086 - 8252 1351 388 aa, chain + ## HITS:1 COG:ML1361 KEGG:ns NR:ns ## COG: ML1361 COG4825 # Protein_GI_number: 15827708 # Func_class: S Function unknown # Function: Uncharacterized membrane-anchored protein conserved in bacteria # Organism: Mycobacterium leprae # 2 354 9 365 393 179 35.0 1e-44 MRDTSTPDDIQGRVRVDERPRALATRLEAGEIAVIDRPDLDRQSALALAARQPAAVLNAA PSATGRHKVLGAAALLDAGIPLIDDLGQDIMTLREGERIRIVGDRVLRDGSVVASGRRVA GEDAAAEDDTGLATQVEAFTASIDDYLALDGDLLIRGEGSPQLAGIVDSRPVLLVVDGPR LAEDLAVLGPWRKEAAPIVVAVDAGADAALAHRITPHVVVGDATLMGEKAIRKAKRVVVR VGSDGIAPGRERLDRMGVPYETVTMSGSAQDAACVVATHAGASAIVTAGMERGLDDFLDQ GRAAMAPAFFSRLVAGDLLIAPQAVAATHKPRPRGWALVLLAVVALVLMGAALWSTPWGN DAFHRLFGLATHLSTGPGYAQALIGGVL >gi|319979536|gb|AEUH01000001.1| GENE 7 8249 - 9151 1221 300 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0573_11888 NR:ns ## KEGG: HMPREF0573_11888 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 1 297 1 301 326 107 30.0 6e-22 MINFRYHVVSVIGIFVALAVGIVLGAGPLQARINSAMGPGEQSQAASEQAAELSAQASAE AAGLKELATARLGQSLAGKSVVVLTLPGARSEDVTSVRETLTGAGAQVVGAITLSDNWDS QAMSQYRTTLSATLASHLSNPAAATASADAVIGYSIAQVVSSTDSESNLLSQILTDKTTP IMTIDEDPKGAGQALVAIGPRPDAQGSKSTAAPAVERSADAWAGLGQAVGATSGVVLGDA SAKGSLVAQLRAHGVAVTTVDSVGTTLGAVDTALALASPSASARAYGVGAGAQSAVPSGS >gi|319979536|gb|AEUH01000001.1| GENE 8 9184 - 9873 759 229 aa, chain + ## HITS:1 COG:Cgl1384 KEGG:ns NR:ns ## COG: Cgl1384 COG0494 # Protein_GI_number: 19552634 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Corynebacterium glutamicum # 56 227 34 200 223 127 44.0 1e-29 MVRAARVRADRRPQSNDGVTMNESHRIADSRAPRPVESQTVLVRGLVVDFCEDQVVVEQG KDPVRRQYTRHPGAVGVVALRGPAGQEEVLLLRQYRHPVRAELWEIPAGLLDVEDEEPVV AAQRELAEEADLKADRWDALVDYFTSPGGSTEPIRVFLARDLAPTGTAFARSDEEAGIEA 
AWVGLDEAVAAVLDGRIHNPNSVAGLLAAHAARAGRWASLRPAASPWFR >gi|319979536|gb|AEUH01000001.1| GENE 9 10031 - 10726 1017 231 aa, chain + ## HITS:1 COG:ML0592 KEGG:ns NR:ns ## COG: ML0592 COG2345 # Protein_GI_number: 15827238 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Mycobacterium leprae # 6 216 39 247 254 149 40.0 5e-36 MADAQDANTRQQVLDLVVEKGPVTASAIARILGLTTAAVRRHITILMESGEIAEHEPGTV AKRGRGRPARHYIATERAHMHLADGYSDLAVKALRHLGQVGGEEAVDSFAAARSREIERR YAPIVRDAGKDPRVRAQALADALTQDGYAATVREIANGTFAVQLCQGHCPIQHVAGDFPQ LCDAETQAFSRLLDVHVQRLATLAGGEHVCTTHIPVGMPTIRPGARAALRK >gi|319979536|gb|AEUH01000001.1| GENE 10 10767 - 12212 2278 481 aa, chain + ## HITS:1 COG:Cgl1527 KEGG:ns NR:ns ## COG: Cgl1527 COG0719 # Protein_GI_number: 19552777 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Corynebacterium glutamicum # 1 481 1 481 481 736 72.0 0 MTQAPPLGAEERRMTDDEIIGSIGKYEFGWHDSDDYSKGVPYGIDESIVRHISDVKGEPQ WMRERRLKALELFDRKPMPAWGPDLSGVDFDAFKYYVRPTDRQVNDWEDLPEEIRDTYDR LGIPEAEKARLVSGVAAQYESEVVYQQIQEDLERQGVIFTDTDTGLREHPEIFEEYFGKC VPAGDNKFSALNTAAWSGGSFVYVPKGVHVNIPLQAYFRINTQAMGQFERTLIIADEGSY VHYVEGCTAPIYDENSLHSGVIEIFVKKDARVRYTTIQNWSTNVLNLVTQRAMVDEGGTM EWVDGNMGAAITMKYPACFLMGEHARGETLSIGFAGPGQQQDTGAKMVHMAPHTSSSIVS KSVSRGGGRTSYRGLVKVNARARHSKSNVLCDALLVDKVSRTDTYPYVDVRTDDVEMGHE ATVSKVSADQLFYLMSRGLAENEAMATIVRGFVEPIAKELPMEYALELNRLIELQMEGSV G >gi|319979536|gb|AEUH01000001.1| GENE 11 12242 - 13405 1714 387 aa, chain + ## HITS:1 COG:ML0594 KEGG:ns NR:ns ## COG: ML0594 COG0719 # Protein_GI_number: 15827240 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Mycobacterium leprae # 25 381 20 382 392 343 52.0 4e-94 MNKAHSHGGGAESTSHSSRADRPTSFSLDGIPVPAGREEDWRFTPLRRIDALFDPANYDR 
GDAPISVDAPDGVTVETADRSDARLGRVLAPGDRTAVVAWNGFDQATVVEIPREAEPASP VRIAVTGVEGTRCQHLVVRAGAMSRATVIVSHTGSPGARVNQGIEVEAGDGADLTVVSLQ EWDDTVVHASNQRLALGRDAKLTHIVVTFGGDLVRVCSDTDFRGPGSELTMLGLYFVDAG QHLEHRVFVDHAQPNCYSRVTYKGALQGKDAHSVWIGDCLIREAADDTDTYELNRNLVLT EGARADSVPNLEIENGEIKGAGHASATGRFDDEQLFYLMSRGVPESEARRLVVRGFFAEL VNQIGVPEVVEHLMTTVEAELAKSRNN >gi|319979536|gb|AEUH01000001.1| GENE 12 13423 - 14172 1292 249 aa, chain + ## HITS:1 COG:Cgl1525 KEGG:ns NR:ns ## COG: Cgl1525 COG0396 # Protein_GI_number: 19552775 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, ATPase component # Organism: Corynebacterium glutamicum # 1 249 1 252 252 336 65.0 3e-92 MSTLKIKDLHVSVETQDGPKEILKGVNLTIESGEIHAIMGPNGSGKSTMAYALAGHPDYE ITRGEAWLDDQLITEMSVDERAKAGLFLAMQYPVEVAGVSVSNFLRTAKTAIDGRAPALR TWVKDVRGAMEGLRMDPDFAERDVNVGFSGGEKKRLEILQMELLKPSFAVLDETDSGLDV DALRIVSEGVNRVHGDTGCGVMLITHYTRILRYIKPSHVHVFVDGRVAAQGGPELADQLE ETGYDTYLK >gi|319979536|gb|AEUH01000001.1| GENE 13 14175 - 15452 1520 425 aa, chain + ## HITS:1 COG:MT1511 KEGG:ns NR:ns ## COG: MT1511 COG0520 # Protein_GI_number: 15840924 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Mycobacterium tuberculosis CDC1551 # 1 425 1 415 417 426 54.0 1e-119 MTAKDFSAAELEAIRSDFPILSRHGRGGAPIAYLDASATSQKPRQVIEAEADFYRLHNGA VHRGTHLLGDESTDAFERARSVLAGFIGGRADEVVWTKNATEAINLVALAIGHASCGRGG AAAECLRIGQGDRIVITRAEHHANIVPWQELCARTGAELAWLDLLEDGRVDLGTLDAITP NTRLVALTHVSNVTGAVSPVDAVVAAAREAGALVLLDTCQSSAHMPIDVARLGADFAVFS SHKMLGPTGAGALWGRGELLEAMPPVLTGGSMIEWVSMEGSTFMAPPERFEAGSQPVAQI AGWARALEYLDALGMDRVEAHEHALTRLMLEGISSVEGVRVLGPGADSDRIGVVAFAVDG VHPHDVGQVMDAYDVAVRVGHHCAIPLHTFFGVRSSARASVALTTTADEIERMVGALGRV RGFFG >gi|319979536|gb|AEUH01000001.1| GENE 14 15484 - 15972 752 162 aa, chain + ## HITS:1 COG:ML0597 KEGG:ns NR:ns ## COG: ML0597 COG0822 # Protein_GI_number: 15827243 # Func_class: C Energy production and conversion # Function: NifU homolog 
involved in Fe-S cluster formation # Organism: Mycobacterium leprae # 4 142 5 143 165 107 42.0 7e-24 MGSLDQLYQQVILDHARERHGAGDPHRHAATSHQVNPTCGDEVTLGVTLEDGALASLDWD GDGCSISRASLSMLTDLAVGKSVEEVGALYGAMEAMMHSRNLGVDDEVLDRLGDAAALES TSQFANRVKCALLGWYALRDAIAKSGYDISATSGTTEQGEPR >gi|319979536|gb|AEUH01000001.1| GENE 15 15981 - 16400 738 139 aa, chain + ## HITS:1 COG:MT1513 KEGG:ns NR:ns ## COG: MT1513 COG2151 # Protein_GI_number: 15840926 # Func_class: R General function prediction only # Function: Predicted metal-sulfur cluster biosynthetic enzyme # Organism: Mycobacterium tuberculosis CDC1551 # 44 139 15 115 115 117 62.0 8e-27 MAQSEPVEQRFAAPAQPAAPEGAEVRQIDGTDIVEQGTLAGESVLEALKDVIDPELGINI VDLGLVYGVFIAPDNAVRLDMTLTSAACPLTDVIERQAQMILASVTDQTQINWVWMPPWG PDRITPDGREQLRAIGFNV >gi|319979536|gb|AEUH01000001.1| GENE 16 16390 - 17046 353 218 aa, chain - ## HITS:1 COG:no KEGG:Ajs_4079 NR:ns ## KEGG: Ajs_4079 # Name: not_defined # Def: hypothetical protein # Organism: Acidovorax_JS42 # Pathway: not_defined # 1 200 1 195 200 91 32.0 2e-17 MLAPLPAEVYVEGTTDVPIIHSLLEAAQWQVDASGIQVARGVKNIRKRMSSHAQAAQYYP RILFVDGDHHCPKELRVEMEGLSQITPVPTGLIIRVVDVCVESWVLADRDGLATFCGLSP SAIPDPVALGRKGSHKEVLLNVLSRARIKDVREAMVRTTKGKLSFGPLYGRRLADFATRH WSAIRAAGHNDSLARALDRLTQLHDSLEGGCGRPPLRR >gi|319979536|gb|AEUH01000001.1| GENE 17 17049 - 18392 883 447 aa, chain - ## HITS:1 COG:no KEGG:Arch_0273 NR:ns ## KEGG: Arch_0273 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 44 433 1 365 382 133 28.0 1e-29 MNRKRMAVHAPAGERATLEGGTAPPAPVRSTAQPRTRRRKATAVKLTHVSLTNWRNFGHI EFDLDSRLFVVGPNSSGKTNLLGALRFLGDIARRGLRAANEDWGGPKHCFRTGSSEAGFS ITASSDEHTVEYALSLRTGRLPSDLHTHDGATDGVIEFPVEFPAERIAVINQERLRIDQR TIAIDSTHVTSPTSIPVLLGKDYSLLDPGSAASTGNEQPVFEQVRQCVSGVRYIHPNPKK MKERLEGRFEPDDHGTGFFQLAGRFPATALDAVVARIRPIMAATVPEVPHLSYRRLAGEE VVFYSDDPQSASSYSTHDQFSEGTLRLLGILFDLATLPRSTTLVLLEEPETFLQPSVVRS LPSFLAEVAYSKQVQMVITTHSPELLSDETIGADQVLLLRTTEKGTTGELLSESEDPRIR AAVEAQFPPSEVVELTARHEIPSNAVM >gi|319979536|gb|AEUH01000001.1| 
GENE 18 18449 - 19462 873 337 aa, chain + ## HITS:1 COG:STM4490 KEGG:ns NR:ns ## COG: STM4490 COG4127 # Protein_GI_number: 16767734 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 15 337 1 329 329 207 37.0 2e-53 MDARATGGHGETGWMVRAGSAGVYASKWREEGIIGIGWDFGGADISAMTRAQLSEAYAKA HPSRDAFAARHPARQAHHFAHDLTVGSTVVTYDPGSRHYYIGQVSGPCENALDDEGTTYT RRVQWSAEAPRDLLTKASRNSLGSLTTLFTISGEVMADLARASGARDPEPADGEVDDTTD EEARSASYDDGIERIKDRVLALGWEQVEQLTAGLLRAMGYFARVTAKGPDGGRDVEASPD ALGLESPHIVAEVKHRKDPIGAPAIRSFIGGLRSGDRGLYVSTGGFSKDAVREAERANYP IRLIDLDDFVRLYIEVYDRADEEARAILPLIRIWWPA >gi|319979536|gb|AEUH01000001.1| GENE 19 19476 - 20645 1494 389 aa, chain - ## HITS:1 COG:Cgl0395 KEGG:ns NR:ns ## COG: Cgl0395 COG0318 # Protein_GI_number: 19551645 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Corynebacterium glutamicum # 27 389 207 568 568 287 41.0 2e-77 GQLRGEAPAGSRSWDRAVAQAAPLPASYPRPGAEDVAVILHTSGTNGVPKSAPLTHRNIG VNVHQCVFWVWRLHEGAETFFSLLPYFHAFGLTFFLCAAVRKAATQVLLPKFDAAMALDA HRRRPVSFFVGVPPMFDRILAEARRTRTDLTSIRYSVAGAMPLSTELAQRWEEATGGMIV EGYGLSETSPVLTGAPLSDRRRHGVLGVPFPSTELRLVDLEDPSRDVEDGQPGEILVRGP QVFSGYLDAPEETAAVFTGDGWLRTGDIGVNHDGFITMADRKKELILSGGFNVYPSQVED AIRSMPGVRDVAVVGVPASGASEQVAAAIIMEDGVAPLTLDEVRTWAEKSIAHYALPRQL VFIAELPRNQIGKILRRKAAQMVKERLGR Prediction of potential genes in microbial genomes Time: Thu May 12 16:52:53 2011 Seq name: gi|319979529|gb|AEUH01000002.1| Actinomyces sp. oral taxon 178 str. F0338 contig00002, whole genome shotgun sequence Length of sequence - 6097 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 534 619 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 2 2 Op 1 . 
- CDS 693 - 1586 1248 ## COG2321 Predicted metalloprotease 3 2 Op 2 . - CDS 1583 - 2914 1664 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 4 2 Op 3 . - CDS 2911 - 3591 910 ## COG1738 Uncharacterized conserved protein 5 3 Tu 1 . + CDS 3696 - 5309 2375 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 6 4 Tu 1 . - CDS 5356 - 6054 821 ## Predicted protein(s) >gi|319979529|gb|AEUH01000002.1| GENE 1 3 - 534 619 177 aa, chain - ## HITS:1 COG:Cgl0395 KEGG:ns NR:ns ## COG: Cgl0395 COG0318 # Protein_GI_number: 19551645 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Corynebacterium glutamicum # 16 176 18 182 568 108 35.0 5e-24 MTSITEQGRAFYDAVPHEVPDWEGSLFSLLEDAARLYPDRAALDYFGAAITYRQVLEQVE RAASALVGAGVGRGDVVAVALPNCPQAFVVFYACMRIGAVAAQHNPLAPDPEVRGQLGRH KGRVAVVWEKCAHVYAEAGVATVFTVDISAHMPASQRLLLRLPVRRARESRGQLRGE >gi|319979529|gb|AEUH01000002.1| GENE 2 693 - 1586 1248 297 aa, chain - ## HITS:1 COG:MT2651 KEGG:ns NR:ns ## COG: MT2651 COG2321 # Protein_GI_number: 15842113 # Func_class: R General function prediction only # Function: Predicted metalloprotease # Organism: Mycobacterium tuberculosis CDC1551 # 66 295 67 290 293 223 51.0 4e-58 MSFNDNIQLDPSRVRTSAGRRAAGIGGGSVLGAIIVFAVAHFTGIDLSQFLGSSQPGPAP AGQTVDVSGCTSGADANARVECRMVATASSLDEVWRAQLASQGGGTDYQLPDFQIFTGSV STACGNATSAVGPFYCPGDSTVYLDLGFFEQMVSDYGASGAALAQEYVVAHEWGHHIQNL RGVFRDHNTQERGEQGAGVRSELQADCYAGVWMHWASQTADPNTGEPFLRVPSAEEIDGA LTTAQAIGDDRLQQRYQGSTNSESWTHGSAAQRSQWLRTGLDSGSIAACDTWSAARV >gi|319979529|gb|AEUH01000002.1| GENE 3 1583 - 2914 1664 443 aa, chain - ## HITS:1 COG:Cgl0233 KEGG:ns NR:ns ## COG: Cgl0233 COG0343 # Protein_GI_number: 19551483 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Corynebacterium glutamicum # 25 431 3 412 418 486 60.0 1e-137 
MSADWLAAPPVPFSLPAAPPSGWRDRGFQVGARLASGSGLGRTGTIHTAHGIIRTPAFIP VGTKANVKALTPEMVEALGAQAVLANAYHLYLRPGSDVVDEAGGFARFMNWRGPTYTDSG GFQVLSLGAGFKKVLSQEFSGAADSDDPRLSAQAVRASNAVVDDDGVVFSSHIDGTKHRF TPEASMRIQHQLGADIMFAFDELTSLLHPRSYQVESLERTHAWARRCLVEHARLTEERRA MPYQQLWGVVQGAQWEDLRRRAARTMADMEADGQRFDGFGVGGALEKERLGTIVSWVCEE LGEDRPRHLLGISEPEDLFAGVEAGADTFDCVNPSRVARNAAVYTPTGRYNITNARFKRD FSPLAPGCGCYTCTNHTRAYVHHLFKAKEILSSTLATIHNEWFTLRLVDAMRDSIERGCF EDFRDDMLGRYRSGGGREGRSAQ >gi|319979529|gb|AEUH01000002.1| GENE 4 2911 - 3591 910 226 aa, chain - ## HITS:1 COG:Cgl0234 KEGG:ns NR:ns ## COG: Cgl0234 COG1738 # Protein_GI_number: 19551484 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 10 219 39 246 250 187 50.0 1e-47 MSTPTAPSRHPRVYDMIAVAFVAALLISNVAATKVTALDWGPVHLVFDGGAVLFPLTYIL GDVLSEVYGFGRARRVIVMGFAASIAASAVFWIVQAAPVGPGYENQAAFEAVLGFVPRVV AASVIGYLAGQLVNALVLVGIRSRWGARRLWARLIGSTLVGEAVDTALFCTIAFAGVIEG GDFVNYVVTGYVYKVAVEVVLLPLTYRVIGWVRGLEGLEGSEVFPA >gi|319979529|gb|AEUH01000002.1| GENE 5 3696 - 5309 2375 537 aa, chain + ## HITS:1 COG:ML1816 KEGG:ns NR:ns ## COG: ML1816 COG0488 # Protein_GI_number: 15827973 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Mycobacterium leprae # 1 536 1 544 545 563 57.0 1e-160 MINVQDFSLRIGARELVRDASFRVDKGMRIGLVGRNGAGKTTTMRLLAGEADRGGAAEHT GTISTTGTIGYLAQDTHVGDQTVLARDRIVSVRGIDQIIARIRRAEQEMSTTEGARQQRA MERYVRLDQEFTNQGGWAANAEAAQIAHSLGLPDRVLGQSLDTLSGGQRRRVELARVLFS GADVLLLDEPTNHLDHDSVLWLRDWIKTFPGGVMVISHDAALLGETVNQVFYLDANRARI DIYRLGWRAYLEQREEDERRRRKEREGALRKAEQLRAQGEKMRAKATKAVAAQQMLRRAD ELVARAGEAEVREKVARLRFPEPAPCGRVPLTAESLSKSYGSLEVFTGVDLAIDRASKVV VLGLNGAGKTTLLRLLAGVEEPDTGRVVAGHGLKIGYYAQEHETLDVSGTVRENMAGAAP GLDDTRVRNILGQFLFQGDDVDKPVGVLSGGEKTRLALATLVVSGANVLLLDEPTNNLDP ASREEILAALHDYEGAVVLVTHDPGAVTALDPQRVLLLPDADEDLWDESYLDLVTLT >gi|319979529|gb|AEUH01000002.1| GENE 6 5356 - 6054 821 232 aa, chain - ## HITS:0 COG:no KEGG:no 
NR:no MAGAACSVGIALAPAPASYAAEGAAVFEHYSFFGSGTISWTASVTDADGALTAQLCDEFG AHARAALGVVHKPSTVFTPSDGAAPSSTCVVSVSASLLAIRGAERTMSLPSAPTAPIADA IGAQVRTVEATVYEGRIVEASEGAQVDHDSERNGWETAVWTDTDEDLELTFDTSTGSTSS GASAGAPGQRDDEGSTGPGFYALVGAIVVLTIGALVVAFHAWARRTRGPRAR Prediction of potential genes in microbial genomes Time: Thu May 12 16:53:07 2011 Seq name: gi|319979523|gb|AEUH01000003.1| Actinomyces sp. oral taxon 178 str. F0338 contig00003, whole genome shotgun sequence Length of sequence - 6159 bp Number of predicted genes - 8, with homology - 2 Number of transcription units - 6, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 957 383 ## 2 2 Tu 1 . + CDS 974 - 1267 176 ## + Term 1446 - 1503 3.1 3 3 Op 1 . - CDS 1135 - 2370 1017 ## 4 3 Op 2 . - CDS 2410 - 3567 783 ## 5 3 Op 3 . - CDS 3572 - 5392 2101 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 5496 - 5555 59.8 6 4 Tu 1 . + CDS 5410 - 5508 116 ## + Term 5578 - 5613 6.1 + TRNA 5496 - 5569 78.6 # Pro GGG 0 0 - Term 5484 - 5551 30.2 7 5 Tu 1 . - CDS 5557 - 5631 96 ## + Prom 5510 - 5569 61.5 8 6 Tu 1 . 
+ CDS 5660 - 6158 555 ## gi|154509084|ref|ZP_02044726.1| hypothetical protein ACTODO_01601 Predicted protein(s) >gi|319979523|gb|AEUH01000003.1| GENE 1 3 - 957 383 318 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGIAAVAVVLALSAVAVPGARPSCAAGGHATEYYRVTGSGPLYYSATIIADGAVLIRRVC NEFGDHIRSVFGLPASQQVVTFTRALSKGSPSMCAVDEFRVPDHAPFLSGNGPTYTITMP LHVTEPIAGATGLTMRMVRAWVGDAVITSVSGGGLVDHDPDSTGWEKAVWYQDFEDITLD YDVSTGTTSTGAVHPPSNNDPPEPEPADDPQSPQSLAGASPSAGASTPASGRTGAAKRGY NRSGLRSPSLSDQLCSDCDAKRRIASATVLIFVLVVLLRLIAKGAGWPRRPGEARKARRP ARSRAPVRPRPAEPEDPA >gi|319979523|gb|AEUH01000003.1| GENE 2 974 - 1267 176 97 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGASIGRRAFGTPRSYQLVGPGARTDHGSAKCPGNRSDGVGAQAAEGALGVLVGQAGPLG GAKRESDRGAAASDGPADPGAGSGESCGSGAAVLELS >gi|319979523|gb|AEUH01000003.1| GENE 3 1135 - 2370 1017 411 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGLGKPAGIAAAVLLALSAAAVPGARSASAAEAGASEEYRVSGSGVLTYSARITASEDYL SKQVCDDFAGHLRTVFGLDKDPKASYTGATYDGGTSECAVQPFDILHPTDSVITENGSER TIAMSPSTTGPVLEAIGFTTRSTTATVSNAIITSASDGCAIDHDSSSSGQETATCAHTAD DSYVTYDVATGATSSGAVRGAPTPVPPDPEANVKPQRTGTPGAPAATTTTGPITAPVTVG TPGQRTGAPTPTGRSSDSDDGNAFVAIFFLILVVLVLFWIISMGTMRSQWKKQSRRPARS RATHTRRPVDPALPYDTSRTAASEPHDSPTFPYDTSRTAASEPHDSPTFPYDSSRTAAPE PHDSPTFPYDSSRTAAPEPHDSPDPAPGSAGPSEAAAPRSDSRFAPPSGPA >gi|319979523|gb|AEUH01000003.1| GENE 4 2410 - 3567 783 385 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGYQNAPLVMEAHVGTRRTLAIMGAATAALALGAASAPSAHAAMENRVAEQYMIHQSGRV SWTVIIADYQQALTQEQCEELGTSAQTVFSTNEKASISFTQSSGSGVSNRCTATLRAELN KTALMQTAGTTRTLTLRTAIPDSVMAVLGTTTRNVSATVFDAGIVESSDGAEIKHDSEGN GWETAEWKNVSGDLSITYDSALGNRNGRAGASPTPGSRSTPSPRSTPSAPASPSNNAAGQ NPPGNSDDNTALIVAAIVVGVIVIAAVAGAVVIVMRSRRAHTGPGGYAGVPYPDGQGPGG LPGQGHGRTPGYGQAQDYGQSSGSAVPAGQSFPFGDAAVPQSPPGASAGAQTGPSAGGSA SSALGLPPVPAPPKQRSPFAPDEDS >gi|319979523|gb|AEUH01000003.1| GENE 5 3572 - 5392 2101 606 aa, chain - ## HITS:1 COG:ZuidAm KEGG:ns NR:ns ## COG: ZuidAm COG3250 # Protein_GI_number: 15804986 # Func_class: G Carbohydrate transport and metabolism # Function: 
Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli O157:H7 EDL933 # 95 473 69 449 604 82 23.0 2e-15 MSDDATRPAGPGLVTPWGEDLGDAPLPEYPRPMLVRPQWRSLNGWWRYAVVGRRAAQGAE SPVDAWQGRIRVPFAVETRASGAARPLGPDELLCYEQDIVVPGQWRGGRVAVVFEAVDHE CRVFADGEPIGSHTGGYCPFRVELPDTGRARVRLRVEVADATDTTDQQRGKQSLEPGGIW YTATSGIWQTVWMEPLPDRAITRVLTRTRPDLATVDVRILAEGGPRPVTITISDQEGAVV RASGTTGAPIAVRIPSARPWSPGDPHLYSLVADTGADRVESYFGVRTVAVSAPEGGGAAR VLLNGAPVLVNAPLWQGYWPESGLTAPSGAAVEHDLLALKGMGFNGVRVHVKVESRRFYA LADRIGMMVVQDGVSGGRAPASIRQSGLIQATGFTWPDTGRALLRRTGRATAASRRAFLE EWSRTVRLLQAHPSVVIWVPFNEGWGQFGARRVLELTRRIDPTRPVDAASGWFDQGCGDF RSRHRYVLALVAPPANDGRAFYLSEFGGHNLPVEGHLHSSGQPYGYSFHRDRAALEAALV RLYERELIPLAARGLAACTYTQVSDVESETNGLMTYDRRVTKVDPLVMRRLNHALEEGFQ APATTW >gi|319979523|gb|AEUH01000003.1| GENE 6 5410 - 5508 116 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTWIASADWRAVGMLGYTIQGRRERRARYLAS >gi|319979523|gb|AEUH01000003.1| GENE 7 5557 - 5631 96 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPWIAPGTHKTAEHVALRFQQLAS >gi|319979523|gb|AEUH01000003.1| GENE 8 5660 - 6158 555 166 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509084|ref|ZP_02044726.1| ## NR: gi|154509084|ref|ZP_02044726.1| hypothetical protein ACTODO_01601 [Actinomyces odontolyticus ATCC 17982] # 18 150 2 134 141 145 63.0 9e-34 MRLSLSYGPRPRADAEGNPMSKKSALAPLVPAVLSLLLVAGGTTVFAACDPKPDGSWMQC HSCQNTVAAGAGGLALLFGASAFVKNRSLRLALQALGVVGAAVTFFIPGGICPMCMMRTM RCYTVFQPFVRIMSVLVAAGGVGALVTSMRGRGASALAPSARAGGE Prediction of potential genes in microbial genomes Time: Thu May 12 16:54:21 2011 Seq name: gi|319979519|gb|AEUH01000004.1| Actinomyces sp. oral taxon 178 str. F0338 contig00004, whole genome shotgun sequence Length of sequence - 1843 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 834 1005 ## Sterm_0502 hypothetical protein 2 1 Op 2 . 
+ CDS 831 - 1842 1246 ## COG0577 ABC-type antimicrobial peptide transport system, permease component Predicted protein(s) >gi|319979519|gb|AEUH01000004.1| GENE 1 1 - 834 1005 277 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0502 NR:ns ## KEGG: Sterm_0502 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 76 269 4 200 201 93 29.0 8e-18 LLIAAVLCALLAALAAVSEHRRSTSYGQAVALAEAGDAARAYEILSGLGDYRDAQERARA LAERDPALPYRRAAKGDSVVFGSYEQDGDPANGPEPIHWTVVDRLEDRILVLGAECLEGR QYHHVPFEDASWQDSDLRAWMNGDFHETAFTPAERSLIVPADNANDPQSITGAGGGAPTT DRVFALSETESAIYLGDEASRDSLGVAAATDHAKGTGLPVDENGSCDWWLRSPGTYGFAA QFVDATGAPSASGANVDAVYGARPALWIRTAGAGGAQ >gi|319979519|gb|AEUH01000004.1| GENE 2 831 - 1842 1246 337 aa, chain + ## HITS:1 COG:FN1349 KEGG:ns NR:ns ## COG: FN1349 COG0577 # Protein_GI_number: 19704684 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 11 332 10 332 401 103 25.0 5e-22 MTALPLTLRGVALRSVRAHPVRTAVLLFAVAVQAACALIGLTLVEGVGDGLSLAEQRLGA DIAVYPTGCLSKVDKARLLMLGTPVDCNRKRSSLGRLAYNDDIGAVTHQLYIADTLPSGE PLWIVGFEPETDFVLSPWLEGGPGWALARGEAAVGSAVAASSEVALFQRDHPVAARLMET GSALDNAVFVTMETLGDVIRDSVAAGVGAYASVDPGADYSAALVRLGDRDRRQAVTDWIN LYVRKTTAVKSEASLAGAASGIRGQLVATAAAAVGVWLLVVAALAVVQAVLMNERRGELG VWRVVGASPARVARLMAREALLVHAAGACAGTAVWAA Prediction of potential genes in microbial genomes Time: Thu May 12 16:54:29 2011 Seq name: gi|319979510|gb|AEUH01000005.1| Actinomyces sp. oral taxon 178 str. F0338 contig00005, whole genome shotgun sequence Length of sequence - 8607 bp Number of predicted genes - 8, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 133 179 ## 2 1 Op 2 36/0.000 + CDS 145 - 1353 1541 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 3 1 Op 3 . 
+ CDS 1355 - 2077 189 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 4 1 Op 4 . + CDS 2108 - 2635 448 ## Gobs_1817 integral membrane protein 5 1 Op 5 . + CDS 2632 - 3054 390 ## 6 2 Op 1 . - CDS 3081 - 5015 2569 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) 7 2 Op 2 . - CDS 5047 - 6903 2496 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) - Term 6961 - 7011 10.5 8 3 Tu 1 . - CDS 7027 - 8451 2280 ## COG0174 Glutamine synthetase Predicted protein(s) >gi|319979510|gb|AEUH01000005.1| GENE 1 2 - 133 179 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no GPARALVNAAVVAAASLIAGAVGTRAALRRAQRAADGRMLLSV >gi|319979510|gb|AEUH01000005.1| GENE 2 145 - 1353 1541 402 aa, chain + ## HITS:1 COG:FN1349 KEGG:ns NR:ns ## COG: FN1349 COG0577 # Protein_GI_number: 19704684 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 7 336 10 339 401 170 29.0 6e-42 MVSLAQLPWRNLRGYPARTAALLVFSSLMAMVLFGGTMVVEGVRQGLRTVESRLGADILV TPADARSEFDAQDFLVRAEPSYFYMDEGALDRVAAVPGVEAASPQLFLASARASCCSGRY QVIAFDPGSDFTVQPWIDDTDRDVRLEPMDVVVGSNVTVYDEDDFKIFDQSLRVVAQFAP TGSALDNAVYTDFDTARVLIESSLSKGLNKYTDLDPSSVISSVLVRVRAGEDPASVAAAI EEQVPGVAAVTSTAMVGTIARTLDDASRTVIALIAIAWGVGLVMVVLVFVMMIHERRREF ATLSAVGAGKRLVSRVIAAEALGVNGIGGLVGVAVSGALLVSFEGFVRQALGSGFVVPSW TTALLLALVSLAATAAVAVVASLASVGYLRTTSASALLKEGE >gi|319979510|gb|AEUH01000005.1| GENE 3 1355 - 2077 189 240 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 24 220 295 498 563 77 30 3e-14 MDVRARSLSKDFPRGRGGRVFTAVEPTDLDIGAGELVVITGRSGSGKSTLLAMLAGILSP TAGTVEVDGTDLHSLGEEALARFRNGSIGLVPQGHAALRSLTVLENVLLPSVLYPGRGPG GRGAEELLDAVGMAQLKDARPNELSGGELRRMAIARALLMDPGVVLADEPTAGLDAASAA TALELLRGAADSGAAVLVVTHDREAEGAADRILTMDGGRLGGPEPQGAGAPRAEERQYTD >gi|319979510|gb|AEUH01000005.1| GENE 4 2108 - 2635 448 175 aa, chain + ## HITS:1 COG:no KEGG:Gobs_1817 NR:ns ## KEGG: Gobs_1817 # Name: not_defined # Def: integral membrane protein # Organism: G.obscurus # Pathway: not_defined # 13 171 14 171 189 78 45.0 1e-13 MDGQLDGLRRWLVPLGLAASFAANDGEELATMVASSRRAVGALPIGGRLRDRALRVDQRH VNAAIAMMGALCAAAVWDGIRTRGRGWLYQDFQWAFGLHGIGHIAASLATRGYTTGVATS PTVVLPQLWCAARALRRAGVPRTARPLRAAALVGGWLVLSHAVGAAVSAAGRRGA >gi|319979510|gb|AEUH01000005.1| GENE 5 2632 - 3054 390 140 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNANDRAWALTAALIGVHQGEELLLPMTEWLDRVGSSGWAGLDAHMRSSPLAGRDPWARA GAVAAQGAALCVLYLATRRSGRATRAATGALTLGWAAAFCMHIAVSARTRSFMPGTATSV VPGLPGALLVLRSIRATRSR >gi|319979510|gb|AEUH01000005.1| GENE 6 3081 - 5015 2569 644 aa, chain - ## HITS:1 COG:ML0887 KEGG:ns NR:ns ## COG: ML0887 COG1022 # Protein_GI_number: 15827409 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Mycobacterium leprae # 16 628 9 597 600 421 40.0 1e-117 MTVRRLADGSWESVARREASEDMNIPKMLHRRAREHPGQVAVERRSPVGDWRPVTIDEFL AEADTMARGLVGLGLEAGEHLAILAPTSYEWALLDIAALSCGAITVPIYETDSASQIAHI LADADVKIVITATSQQADLVESVRTDGVRMVLALDRGAERLLSQTALDVPLDRVRSRTDG VGLRDEATVIYTSGTTGMPKGVVLTHANFIETMLQAYDILPVLINDPRSRSLLFLPVAHV LARFVMYCLLSGQGVTAFSPDTRNLVDDIATFKPTMLLVVPRVLEKVYNAASAKAGGGFK GRLFSWAANQARALSRSTSYADTPLPESEVAGPLPDTTTVPDASGTPSPGPSLGLRLRGR LADALVLKKVRAVLGPNLHTIICGGAPLAVDLANFYRGLGVTLLQGYGLSETTGPITVEL PHDFPPDSVGFPWPGNRLKLAPDGELLAQGISVTKGYHNLPGATAEAFVDGWFRTGDLAS IDDRGHVRITGRKKELIVTAGGKNVSPEVLEESLSTHPLIANIVVVGEGRPYIGALIALD TEMLPDWLRRHGLPVVDAAQAGELPEVRESLERAIARANTRVSRAESIRRYRIVNAAFTV ENGYMTPSLKLKRRRVLADYAHEVDALYSSGTDASAGGDGAAKD >gi|319979510|gb|AEUH01000005.1| GENE 
7 5047 - 6903 2496 618 aa, chain - ## HITS:1 COG:ML0887 KEGG:ns NR:ns ## COG: ML0887 COG1022 # Protein_GI_number: 15827409 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Mycobacterium leprae # 9 597 4 597 600 452 43.0 1e-127 MKRMRDGSYTVPPAFTVRPGDSLSTMLLARADAHPDQVVVEQRTAVGSARPITASELVRQ VDDAARGLIGMGVDPGDAVAILAPTCYEWMILDLALASVGAVSVPIYESDSAQQITHILA DAHVTLVFTATAQQAELVSLSAPEHCPVHSFDRGAMRLLAKRARPIPVEEAHRRRAQVTS SSTATIIYTSGTTGAPKGVALTHANFVGTCLSARQILGSVIDSPSTRLLLFLPVAHVLAR LVMHVVLAGQGVLGFSPSIKNLLPDIQSFKPSVLLVVPRVLEKVYNSASAKAGGGFKGRL FAWSAKQARAYGASGRRRPSPLRRARRRVADALVLKRIRAVLGPNMRYIVSGGAPLASDL AQFYTGLGLTLLQGYGLSETTGPIAVQRIGDNPVGTVGQPMPGNFIKTAKDGEVLVRGVS VMPGYYGLPDQTRAVMPDGKWFHTGDLGSIDRRGHLTITGRKKEIIVTAGGKNVSPAVLE DSLSTHPLIAHVIVVGDQRPFVGALIALDAEMLPAWLRKHGLPVCSPTEAASLPQVRESL DRAIERANRAVSRAESIREYRIIDAVFTVENGYVTPSMKLRRSKVLADYSHEVDELYGGP AAPARERGLLRLLRRKKH >gi|319979510|gb|AEUH01000005.1| GENE 8 7027 - 8451 2280 474 aa, chain - ## HITS:1 COG:ML0925 KEGG:ns NR:ns ## COG: ML0925 COG0174 # Protein_GI_number: 15827444 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Mycobacterium leprae # 4 474 5 478 478 618 62.0 1e-177 MFTSSDEVASFIDKESIELVDVRFCDVPGVQQHFTIPVGEFLGGAIDDGLMFDGSSVRGF TAIHESDMKLVPDLSSAFVDPFRARATLVVDFSIVDPFTDEPFSRDPRQVAAKAEAHLRS TGIADQCFVGAEAEFYLFDDVRYQVSPNSTFFSVDSPEAHWNTGRAEEGGNRGYKVPLKG GYFPVPPTDRYADVRDEMVRLLEGVGLTVERAHHEVGSGGQQEINYRFATLRAAADDMMK FKYVVKNAALAFGHSATFMPKPLFGDNGSGMHTHMSLWKDGEPLFYDERGYGALSDTARW FIGGLLEHAPALLAFTNPSVNSFRRLVPGFEAPINLVYSARNRSACIRIPVTGTSPKAKR VEYRVPDPSANPYLAFSACLMAGIDGIRRRSEPAAPIDKDLYELPPAEYRDIAKLPSSLE AALEALREDHDFLTEGDVFTQDLIDTWLDYKEANEVAPMRAYPHPYEYQLYYDL Prediction of potential genes in microbial genomes Time: Thu May 12 16:54:49 2011 Seq name: gi|319979494|gb|AEUH01000006.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00006, whole genome shotgun sequence Length of sequence - 16907 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 7, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 31 - 2067 2763 ## COG0322 Nuclease subunit of the excinuclease complex 2 1 Op 2 . + CDS 2081 - 3337 1452 ## COG0027 Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) 3 1 Op 3 12/0.000 + CDS 3339 - 4319 1351 ## COG1660 Predicted P-loop-containing kinase 4 1 Op 4 12/0.000 + CDS 4319 - 5323 1269 ## COG0391 Uncharacterized conserved protein 5 1 Op 5 . + CDS 5361 - 6344 1260 ## COG1481 Uncharacterized protein conserved in bacteria 6 2 Tu 1 . - CDS 6393 - 6593 193 ## 7 3 Op 1 26/0.000 + CDS 6556 - 7563 1699 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase 8 3 Op 2 13/0.000 + CDS 7672 - 8862 2092 ## COG0126 3-phosphoglycerate kinase 9 3 Op 3 9/0.000 + CDS 8865 - 9641 1207 ## COG0149 Triosephosphate isomerase + Term 9676 - 9726 9.3 10 3 Op 4 . + CDS 9783 - 10028 417 ## COG1314 Preprotein translocase subunit SecG + Term 10054 - 10088 7.0 11 4 Tu 1 . + CDS 10110 - 11126 1439 ## COG1434 Uncharacterized conserved protein 12 5 Op 1 4/0.000 - CDS 11169 - 11897 835 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 13 5 Op 2 5/0.000 - CDS 11899 - 12837 1143 ## COG3429 Glucose-6-P dehydrogenase subunit 14 5 Op 3 . - CDS 12834 - 14360 2023 ## COG0364 Glucose-6-phosphate 1-dehydrogenase 15 6 Tu 1 . - CDS 14465 - 16573 3003 ## COG0021 Transketolase 16 7 Tu 1 . 
+ CDS 16673 - 16907 251 ## COG4974 Site-specific recombinase XerD Predicted protein(s) >gi|319979494|gb|AEUH01000006.1| GENE 1 31 - 2067 2763 678 aa, chain + ## HITS:1 COG:ML0562 KEGG:ns NR:ns ## COG: ML0562 COG0322 # Protein_GI_number: 15827213 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Mycobacterium leprae # 1 663 1 638 647 664 54.0 0 MADPSTYRPPAGDIPTSPGVYRFSDANGRVIYVGKAKNLRNRLANYFQDLANLHPRTQQM VTTAARVQWTVVGNEVEALTLEFTWIKEFNPRFNVMFKDDKSYPYLAVTMGEAYPRLHAV RGARKPGARYFGPFVQAWSIRETIDQLVRVFPVRTCSPGVFRRARAQGRPCLLGYIDKCS APCVGRISEEDHRAMALELCSFMQGRAGPVIADLEQQMRSASAALDFETAARLRDDVAAL RAVLERNAVVLADGTDADVFALATDELDAAVHVFHVRGGRVRGTRGWVVERSDDADEPAL IARLLEQVYSQATPDDPPAPGERAQKAEAVSVDDVAHTPTSAIPREVLVSTAPTDRATIE EWLTGLRGGRVRVRVPQRGEKAHLMGTVMENARQGLALHHSKRAGDITARAQALEELAAQ LDLPGAPLRIECYDVSHTMGTLQVASMVVFEDGAPRKDAYRSFNIRGADGGGAPDDTAAM DEVLTRRFSRLLAEESGEQGEDEEGVPLESGPVDATGRPRRFSYRPDLVVVDGGPAQAAA ARAALDGVGADVPVIGLAKRLEEVWAPGEEFPIILPRTSPALYMLQHLRDESHRFAITKH RKRRSKAQTRSALDKIPGLGPSRQTALLKHFGSVRRLRAASAEQIAQVSGIGPVLAAAIR DSLSESGTADTPGPGPAS >gi|319979494|gb|AEUH01000006.1| GENE 2 2081 - 3337 1452 418 aa, chain + ## HITS:1 COG:MJ1486 KEGG:ns NR:ns ## COG: MJ1486 COG0027 # Protein_GI_number: 15669679 # Func_class: F Nucleotide transport and metabolism # Function: Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) # Organism: Methanococcus jannaschii # 12 401 17 393 393 356 51.0 4e-98 MIAYAPSSLPARVLLLGAGELGKELTISLKRLGCLVVACDSYVGAPAMQVADEARIFDMT DPRALAEVLEAVEVDLIVPEVEAIATELLLAAEDAGRARVVPNAHAVRTTMDRQAIRALA DCLPDVRVGAHRFASSADQLRAALDELRLPVFVKPTMSSSGHGQTLVRTRADARRAWDTA ARGARAATGRVIVEERIDFDYEITLLTVRWWSHREGRVRTSFCEPIGHRQCDGDYVESWQ PADMSPAALASAQRMAAAMTGALAEAGPGPGLGLFGAEFFVRGDRVWFSELSPRPHDTGM VTMATQDLNEFDLHARAILGLPVDASLRCPGASAVIKSTAPAPAPRYVGVGDALDLGADV RIFGKPVSRAGRRVGVVTVRGATVDEARATARKAAAGVRIENYPSLQRDPRGREMASL >gi|319979494|gb|AEUH01000006.1| GENE 3 3339 - 4319 1351 326 aa, chain + ## HITS:1 
COG:ML0563 KEGG:ns NR:ns ## COG: ML0563 COG1660 # Protein_GI_number: 15827214 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Mycobacterium leprae # 43 326 15 298 298 278 53.0 1e-74 MSERDRAQSPTDLGEAENPPTFPRGIDLRDETAPKREPAPTNEVLIITGYSGAGRTGAAR ALEDLDWYVVDNLPPTMLPALVGMMSNDPTAGVHRLAVGVDVRSRTFFTSLATTLEQLKA SGIAYRVIFLEATREALVKRYESNRRPHPLQGSGTLIDGIAAEERLLAPLRATADQVIDT SGMSVHDLTRHIRDYVAGEAARPLQVTVESFGFKHGLPLDADHVVDVRFLKNPYWVDELR HLTGRDQAVADYVLDQPGARDFALGYADLLAPMLDGYLVELKPFVTIAVGCTGGKHRSVA CAELIAQRLRERGHTVRARHRDIGRD >gi|319979494|gb|AEUH01000006.1| GENE 4 4319 - 5323 1269 334 aa, chain + ## HITS:1 COG:Cgl1552 KEGG:ns NR:ns ## COG: Cgl1552 COG0391 # Protein_GI_number: 19552802 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 19 321 23 337 341 238 46.0 1e-62 MAYLDAAGWVQRGDNGQSIVALGGGHGLSATLRALRHITRQLTAVVTVADDGGSSGRLRK EMPILPPGDLRMALASLCEESEWGLTWRDVMQLRLRTTGPLDGHALGNLLISGLWQLLDD PVEGLDWVGRLLGAQGRVLPMSTTPIDIEADMDDDGTRYVVSGQSKVAIAPGTVEHVRIT PAAPDVPAAVTEAISEADWVVLGPGSWYTSVIPHLLVPGVHRALATTDAHRALVLNLARQ RGETDRMSTADHVRVLRDYAPDLKLDVVIADPTACDDVDDLIRAAQDLGARVVLRQVRTG DGAPHHDPLRLAAALRDAFDGFLGEVGQPEMWLP >gi|319979494|gb|AEUH01000006.1| GENE 5 5361 - 6344 1260 327 aa, chain + ## HITS:1 COG:Cgl1551 KEGG:ns NR:ns ## COG: Cgl1551 COG1481 # Protein_GI_number: 19552801 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Corynebacterium glutamicum # 1 326 1 326 327 321 60.0 2e-87 MSLTSDMKDELARTSVATQSEMAAEVCSVLRFAGGLHLVGGRILIEAELDSPVAARRLRA FLQALYNAQSSVVVVSGGSLRRGKRYVVRVVHGADDLARLTGLVDSMRRPVRGLPPFLVG AGRAEAAAVWRGAFLARGSLMEPGRSSSLEITCPGPEAALAMVGCARKLGASARSKEVRG ADRVSVRDSEAIGALISAMGAKETFGVWQERRERREARGSANRLANFDDANLRRSARAAV AAGARVERAFEILGDDIPEHLLEAGRLRLEYKQASLEELGKHTDPPLTKDAVAGRIRRLL TMADKAAHERGIPDTEAALTLEMLEED >gi|319979494|gb|AEUH01000006.1| GENE 6 6393 - 6593 193 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MRPKPLMPTRVVTMFLLLVPGPGTVSVATGPAGLQARPADHCGSSTANLHPGEAGSTSHF ANRPVA >gi|319979494|gb|AEUH01000006.1| GENE 7 6556 - 7563 1699 335 aa, chain + ## HITS:1 COG:Cgl1550 KEGG:ns NR:ns ## COG: Cgl1550 COG0057 # Protein_GI_number: 19552800 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Corynebacterium glutamicum # 1 334 1 334 334 432 67.0 1e-121 MTTRVGINGFGRIGRNFFRAFLEQGADLEIVAVNDLTDNKTLAHLLKYDSILGRFGGEVS FDDDGIIVDGKHIKVLAERNPADLPWGDLGVDVVVESTGLFTDGEKAKAHIEGGAKKVVI SAPAKNVDGTFVMGVNEGDYDNATMNIVSNASCTTNCLAPLAKVLHESFGIERGIMTTIH SYTGDQRVLDAPHKDLRRARAAALNMIPTKTGAAQAVALVLPALKGKFDGLAVRVPTPTG SLTDLTFVAEKEVSVEAVKAAVKAAAEGELKGVLEYTEDPIVSTDIQGNPHTSIFDATET KVIGNLVKVLSWYDNEWGYSNALVRLTALVGSKLA >gi|319979494|gb|AEUH01000006.1| GENE 8 7672 - 8862 2092 396 aa, chain + ## HITS:1 COG:Cgl1549 KEGG:ns NR:ns ## COG: Cgl1549 COG0126 # Protein_GI_number: 19552799 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Corynebacterium glutamicum # 4 394 3 403 405 394 58.0 1e-109 MRTIDTLGDLRGKRVLVRSDFNVPLKDGAITDDGRIRAALPTLKTLADAGAKVVVLAHLG RPKGKVDPAFSLAPVAARLAELSGLKVTLASDTVGDSARQTVASLGEGEVALLENVRFDA RETSKVDAEREELAREYAKLGDAFVSDGFGVVHRKQASVYDIAKLVPSAAGLLVLKEIES LSKVTEDPERPYGVVLGGSKVSDKLGVIANLLKKADLLLIGGGMVFTFLAAKGYSVGKSL LEEDQIDTVKGYIAEAAERGVDLVLPVDVVVAPEFAADSPATVVGVDAIPADQMGLDIGP ESGKLFADKLAAAKTVAWNGPMGVFEFEAFSAGTRAVAEALSKGTMFSVIGGGDSAAAVR LLGFDESTFSHISTGGGASLELLEGKVLPGIAVLED >gi|319979494|gb|AEUH01000006.1| GENE 9 8865 - 9641 1207 258 aa, chain + ## HITS:1 COG:Cgl1548 KEGG:ns NR:ns ## COG: Cgl1548 COG0149 # Protein_GI_number: 19552798 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Corynebacterium glutamicum # 1 248 1 246 259 288 60.0 7e-78 MARTPLMAGNWKMNLDHLEANHLVQGLAMELSDRDHDYSKCEVVVIPPFTDLRTVQTVVD ADKLGVKYGAQDVSVHDNGAYTGEISTAMLNKLGCSYVVAGHSERREYHSESDELVGQKA 
RKAFDAGMTPILCCGEALEIRKAGTYVDFVLAQIRAALKGWDGADVAKIVIAYEPIWAIG TGETASAADAQEVCGAIRAALREDYGDSVADATRILYGGSAKPGNIKELMAQPDIDGGLV GGASLKADSFAQMATFYA >gi|319979494|gb|AEUH01000006.1| GENE 10 9783 - 10028 417 81 aa, chain + ## HITS:1 COG:Rv1440 KEGG:ns NR:ns ## COG: Rv1440 COG1314 # Protein_GI_number: 15608578 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecG # Organism: Mycobacterium tuberculosis H37Rv # 1 79 38 116 117 59 45.0 1e-09 MSILNNVLIVMIFITSALLTLTVLMHKGQGGGLSDMFGGGISSTAGSSGVAERNLNRITL GVLLVWVVCIIGYALLTKFAA >gi|319979494|gb|AEUH01000006.1| GENE 11 10110 - 11126 1439 338 aa, chain + ## HITS:1 COG:lin1003 KEGG:ns NR:ns ## COG: lin1003 COG1434 # Protein_GI_number: 16800072 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 73 305 80 315 344 131 36.0 2e-30 MDAIAAWVPTAFWAVVFLWSYHREPRQFRNAFFFLFLVITALFTVAVMSQDMWVILPIGL LVVFSPFAFIAFLFANAWVVARREGVSLATLLPLLFGVAVAGWFAALPLSVGLHAPEWVL GAAMLLTAWGAWFFMSFTALLLYSTFYRLLPRKRVYDYIIIHGAGLNGEEPTPLLRGRIE TAIRLWDRQGRRAVLVPSGGQGPDEVVSEAEAMHRFLRTNGVGEESILDEDRSTTTMENL VFAKQLIEERQRGFPYRCAMVTSDYHVFRTATYARAAGIRGDGVGAKTAFYYFPTAFIRE FIAFTRKHWLPYALIAALMLIPVIHRGLIAFGHEMGLF >gi|319979494|gb|AEUH01000006.1| GENE 12 11169 - 11897 835 242 aa, chain - ## HITS:1 COG:MT1492 KEGG:ns NR:ns ## COG: MT1492 COG0363 # Protein_GI_number: 15840904 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Mycobacterium tuberculosis CDC1551 # 64 238 60 242 247 102 40.0 7e-22 MREVAELVVAPSEAGIASVLGPRFLEDLSRLVLRAPAGRRVDVGLSGGFVTQRLLPGLVG PSAVDWSRVRVWMVDERYVAAGDPLRNDDEAWTGFLRHCPGVELVRMPSADQYGADGLEE AAAAFARTWERLMGARSFDTALIGMGPDGHICSLFPHHRALDSPGPVVRVPDSPKPPPQR ITISMAVMRSCRSLWLAAPGAAKAGAIAAALGGAPVEDYPVGAVLSPTARVYLDGPAARL VR >gi|319979494|gb|AEUH01000006.1| GENE 13 11899 - 12837 1143 312 aa, chain - ## HITS:1 COG:Rv1446c KEGG:ns NR:ns ## COG: Rv1446c 
COG3429 # Protein_GI_number: 15608584 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-P dehydrogenase subunit # Organism: Mycobacterium tuberculosis H37Rv # 1 295 1 290 303 181 39.0 2e-45 MIITLKNTTSAQVASRIVELREERGSAALGRVLTLLICVPDLIDVDNAIEISDAVSREHP CRVIVVVDPPATGSEALLNAQIRVGDSAGLSDIVILEPRGEAATSIDSLVMPLLQPDTPV VTYWPVDPPANPGEHPLGRIATRRITDSRATACPMAALRALSGVYTPGDTDLGWAGVTLW RALLAAIAEDFDRLPASVLVRGHATHPSSYLVAAWLHHQLGVPVSRETDPEAQTVTQVSF LFDDGTEVSLSRPASSSVARLARPGLDDAPASLPRRSVQDCLMEELRRLDPDAYYGRLLT QEVPLIPLGAEQ >gi|319979494|gb|AEUH01000006.1| GENE 14 12834 - 14360 2023 508 aa, chain - ## HITS:1 COG:MT1494 KEGG:ns NR:ns ## COG: MT1494 COG0364 # Protein_GI_number: 15840906 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate 1-dehydrogenase # Organism: Mycobacterium tuberculosis CDC1551 # 8 508 14 514 514 677 65.0 0 MERRAPTLHDPADRRLPRISPPCGLVIFGITGDLARKKLLPAVYDLANRGLLHPAFALTG FARRDWSASDFEDYVRASIEKHARTGLNERTWSQMRSGLRFVSGTFDDPAAYQKLADTVA ELDRSRGTGGNHAFYLSIPPSWFPVVAQHLAETGLNRRTDREWRRVIIEKPFGHDLASAR ELSGVISQIFDESDVFRIDHYLGKETVQNIMAMRFANTMFEPLWKANYVDSVQITMAEDI GIGTRAGYYDGIGAARDVIQNHLLQLMALTAMEEPVRFTPGEIRTEKEKVLSAVRLPEDL AASTARGQYAAGWQGGQRVRGYLEEDGIPADSTTETFAAIKLFVDTRRWAGVPFYLRAGK RLGKRVTEIAVVFKRSAHVPFPTTDLAESGQNVLVVRVQPDEGLTLKFGAKVPGAEMQVR DVTMDFAYGHAFTEDSPEAYERLILDALVGSAPLFPHQREVEWSWRILDPVLDYWAGQGQ PEQYAPGAWGPPSAHRMLADDGRMWRLP >gi|319979494|gb|AEUH01000006.1| GENE 15 14465 - 16573 3003 702 aa, chain - ## HITS:1 COG:Cgl1536 KEGG:ns NR:ns ## COG: Cgl1536 COG0021 # Protein_GI_number: 19552786 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Corynebacterium glutamicum # 13 697 21 698 700 758 60.0 0 MGPVRLEADVTLNWDHIDDRAVKTAKLLAADAVEQAGSGHPGAAISLAPAAYVLYHKQMR FDPSDPRWLGRDRFILSAGHSSLTQYVQLYLAGAGLELDDIKALRTAGSLTPAHPEFGHT KGVEITTGPLGSGLASAVGFAMNSRRVHGLLDPATPLGESVFDHDVYVIAGDGCLQEGVS AEASSLAGTQKLGNLTVLWDDNHISIEDNTSIAFSEDVLARYEAYGWHVQRVEWLSADGS 
YSEDVEALSAAIDAARAVTDKPSIIAVRTIIGWPTPGKQNTGGIHGAKLGAEALSGLKEA LGADPDAMFAVDDEAVAAVRARAAERAAAFRRDWDERFEAWRAANPDGAALLDRLQAGKL PEGWEAALPVFEEGKAVATRSASGQVLCAIASVLPELWGGSADLAGSNNTLMKGEPSFLP ESASSKAFSGNEFGRNLHFGVREFAMGCIMNGIAADGVNRPYGGTFFVFSDYMRGAVRLA ALMDLPVTYVWTHDSIGVGEDGPTHQPIEHLAAYRAIPNLAVVRPADAAETAAAWKAVLE QSHPAALVLSRQNLPNPRRGEGALAPADSLARGAYVLADTEGTPDVVLLASGSEVPVALE ARGLLAAEGIAARVVSVPCLDWFEAQDEEYQRSVLPPAVRARVSVEAGIALPWYRWLGDA GVPVSIEHFGASASGALLFKEYGIDADHVAAAARKSLERARA >gi|319979494|gb|AEUH01000006.1| GENE 16 16673 - 16907 251 78 aa, chain + ## HITS:1 COG:MT1740 KEGG:ns NR:ns ## COG: MT1740 COG4974 # Protein_GI_number: 15841158 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Mycobacterium tuberculosis CDC1551 # 1 78 1 78 311 58 43.0 4e-09 MDTIEDTASDWLDHLRVERGASAHTVSNYRRDIRRYARDLGARGITDIAGVRAADIEAHL ASLASGGLTGAPAAPASV Prediction of potential genes in microbial genomes Time: Thu May 12 16:54:59 2011 Seq name: gi|319979481|gb|AEUH01000007.1| Actinomyces sp. oral taxon 178 str. F0338 contig00007, whole genome shotgun sequence Length of sequence - 11716 bp Number of predicted genes - 12, with homology - 10 Number of transcription units - 6, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 2 - 751 774 ## COG4974 Site-specific recombinase XerD + Term 821 - 846 -0.5 2 1 Op 2 3/0.000 + CDS 988 - 1833 1139 ## COG1192 ATPases involved in chromosome partitioning 3 1 Op 3 21/0.000 + CDS 1817 - 2779 962 ## COG1354 Uncharacterized conserved protein 4 1 Op 4 12/0.000 + CDS 2787 - 3374 621 ## COG1386 Predicted transcriptional regulator containing the HTH domain 5 1 Op 5 . + CDS 3371 - 4120 1199 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 6 1 Op 6 . + CDS 4156 - 5295 1304 ## COG0287 Prephenate dehydrogenase 7 2 Op 1 . + CDS 5596 - 6516 794 ## SSA_0961 ketopantoate reductase PanE/ApbA 8 2 Op 2 . 
+ CDS 6791 - 7528 1056 ## COG0283 Cytidylate kinase + Term 7590 - 7634 13.0 9 3 Tu 1 . - CDS 7610 - 7870 158 ## 10 4 Tu 1 . + CDS 7833 - 8048 328 ## 11 5 Tu 1 . + CDS 8375 - 10366 1274 ## Ndas_3629 hypothetical protein + Term 10533 - 10576 13.7 - Term 10517 - 10565 14.6 12 6 Tu 1 . - CDS 10605 - 11714 860 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 Predicted protein(s) >gi|319979481|gb|AEUH01000007.1| GENE 1 2 - 751 774 249 aa, chain + ## HITS:1 COG:Cgl1385 KEGG:ns NR:ns ## COG: Cgl1385 COG4974 # Protein_GI_number: 19552635 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Corynebacterium glutamicum # 1 231 76 304 304 223 57.0 4e-58 SVARASAAIRGLHAYALRQGRVGADAAAEVRAPKQGSHLPKALSVDQVSRLLDAAHSAPG AAGLRDAALLELLYATGARVSEAVGLAVDDIDLDDEAPVVRLFGKGRKERLVPLGSYAKD ALGAYLVRGRPELAARGRGSHAVFLNKRGAAMSRQSAWETIRRAASAAGLEAGVSPHTLR HSFATHLLEGGASVRDVQELLGHASVQTTQIYTRVTVAALREVYWTAHPRARGRAPGEGA GERRRAPRT >gi|319979481|gb|AEUH01000007.1| GENE 2 988 - 1833 1139 281 aa, chain + ## HITS:1 COG:Cgl1387 KEGG:ns NR:ns ## COG: Cgl1387 COG1192 # Protein_GI_number: 19552637 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Corynebacterium glutamicum # 13 277 23 287 290 315 61.0 6e-86 MYEPPAEQELQPDYPVPEPLSGHGPARIIAMCNQKGGVGKTTTTINLGAALAEYGRRVLV VDFDPQGAASVGLGINTLDMDQTIYTLLMDPRADAAAAICTTRTPNLDIIPANIDLSAAE VQLVNEVARESALARVLRRVEADYDVILVDCQPSLGLLAVNALTAAHGVIVPVEAEFFAL RGVALLVETIETVRDRINPRLKIDGIVATMVDLRTLHAREVLERLHEAFGDLVFTTRIGR TIKFPDASVATEPITSYAPGHPGAEAYRRLAREVVARGDTA >gi|319979481|gb|AEUH01000007.1| GENE 3 1817 - 2779 962 320 aa, chain + ## HITS:1 COG:MT1750 KEGG:ns NR:ns ## COG: MT1750 COG1354 # Protein_GI_number: 15841169 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis CDC1551 # 9 255 12 265 278 192 51.0 9e-49 
MATPPEADGTAPEGDRDDRFQVRLDVFEGPFDLLLQLIARKRLDISEVALARVTDEFIAH MRAFPDLSRATEFLVVAATLLDMKAAHLLPRLDDGPDAAAEDLEARDLLFSRLLQYRAFK TAAAQVAACLERFGAYTPRAAPLEARFAALLPELVWTTTPGDLARMAADALTRTEPSVEV AHLHDPVVPVAEQARIVAERLAAQGSATFESLVADACARAVVVARFLALLELYRRGAVDF AQDQPLGELTMVWTGGGEACAGIDDTYEGGPAWGDGDGGTAAVPGGETTAGDAAMASGET APPTEAASDEEDGGRRSSGG >gi|319979481|gb|AEUH01000007.1| GENE 4 2787 - 3374 621 195 aa, chain + ## HITS:1 COG:ML1369 KEGG:ns NR:ns ## COG: ML1369 COG1386 # Protein_GI_number: 15827714 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing the HTH domain # Organism: Mycobacterium leprae # 23 195 32 205 231 141 46.0 8e-34 MTGGAAVEDERTAGGGGGLRAPIEAVMMVAAEPVPASDIADALGVDQEEADGALRALARS YREEGRGFELREAAGGWRVYSSPRFADVVGRFVVGTAQARLSQAALETLAIIAYRQPITR ARVSRVRGVGVDAVVRTLMARGLIAEVGETESGARLYGTTSEFLEKMGLGSLEDLVPLAP YLPAAEELDDLEDQL >gi|319979481|gb|AEUH01000007.1| GENE 5 3371 - 4120 1199 249 aa, chain + ## HITS:1 COG:MT1751.1 KEGG:ns NR:ns ## COG: MT1751.1 COG1187 # Protein_GI_number: 15841171 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Mycobacterium tuberculosis CDC1551 # 10 249 13 254 254 239 56.0 3e-63 MRKDLYTAGGVRLQKVLAQAGVASRRAAEQMIADGRVSVDGQVVRGQGMRVDPTAQVIHV DGERLILDETKHVVLAINKPVGVVSTMSDPEGRPTVADIVAGYPERLYHVGRLDIDTSGL LLLTNDGELANRLTHPSYEISKTYVARLHGEVRPGVKRRLMAGIELEDGPIAVDGFRVVD TYGDITTVEIVVHEGRNRLVRRMMDAVGFPVRELVRTGFGPIRLDHLQPGTSRRVKGNAL TALYGAVGL >gi|319979481|gb|AEUH01000007.1| GENE 6 4156 - 5295 1304 379 aa, chain + ## HITS:1 COG:BH1666 KEGG:ns NR:ns ## COG: BH1666 COG0287 # Protein_GI_number: 15614229 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydrogenase # Organism: Bacillus halodurans # 1 374 1 361 366 114 27.0 4e-25 MATRGPVLVIGSGLLGASLGLALRAGGVEVLLEDASPTSLRLAQDIGAGRPLSDWAADTG EQREAGPHLVVVATPPDVADRCVVHALRAYPRAFVTDVASVKDAVMSDALAELGRQGMAE 
AGARYVGSHPMAGRERSGAGAADADLFYGRPWVVVAHEGSSPEAVLAVRALASDVGGVPM EMTAPMHDRSVALVSHVPQLVSSMLAARLGDAPPEALSLAGQGLRDTARIAASDPRLWTA ILAGNAGPVAEVLRGIQRDLDDLVTHLDAAARLGPLRGGSVGAINRVMEAGNRGVARIPG KHGGAPRRYAELEVLIPDSPGEMGRLFSELGGAGVSIEDFVLEHSAGQQVGVGRIMIDPS AMERAVEVLEARAWRLIPH >gi|319979481|gb|AEUH01000007.1| GENE 7 5596 - 6516 794 306 aa, chain + ## HITS:1 COG:no KEGG:SSA_0961 NR:ns ## KEGG: SSA_0961 # Name: not_defined # Def: ketopantoate reductase PanE/ApbA # Organism: S.sanguinis # Pathway: not_defined # 1 296 11 297 312 150 31.0 5e-35 MGITHGWLLSQHHDVSWLVRADRADFYREPFALRVHDLRPGHRDTTATISLKLVTSIAPD LYDAVLVMVPGGSLADVLPVLDGVGTDVPVVLMLNHWNLAGALGGSSEPRANRLLGFPSQ VGGGRQGHRISVTVFPRGTVLEAGTASKKAALDKAEALLASGGLTIRRQRRMPDWLAVHS MQQALTAAPLIEAGSYRAFAADRAAITRMVVAFREGLDVCRARGIPTWRLWPAPLFKLPK PLVARLLQGMFQQAETEAMVVGHMGHGLDEWIEGLASIRADAQRLGVPAPEIDRQWAVIR SRRSGA >gi|319979481|gb|AEUH01000007.1| GENE 8 6791 - 7528 1056 245 aa, chain + ## HITS:1 COG:SPy0803 KEGG:ns NR:ns ## COG: SPy0803 COG0283 # Protein_GI_number: 15674845 # Func_class: F Nucleotide transport and metabolism # Function: Cytidylate kinase # Organism: Streptococcus pyogenes M1 GAS # 14 228 3 211 226 116 35.0 3e-26 MNDDARRAAIGRLGITVAIDGPAGSGKSTVSKAVASRAGIGYLDTGAMYRALTWFALESG VDFGAEGSAELVASLADRMRLRLDSDPHDPHVWVGEAEVTAAIREPRIALAIRHVSTNLR VRAWMAQEQRRRMMEARAQGSGMIAEGRDITTVVCPDADVRVLLLADEEARLRRRTLELY GDCAPEHLEAVREQVQGRDRADSAVTEFMKPAPGVRVVDSTGLDVEGVTGAVLALVDEDL RVRGA >gi|319979481|gb|AEUH01000007.1| GENE 9 7610 - 7870 158 86 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLNPEASNSMDATALPSYAHNGRNYATRLRTQGRNQADVCGACCLIRGPCVRSREAVGRP QMMRPGAGRQAPCSGSLIGVGNRALG >gi|319979481|gb|AEUH01000007.1| GENE 10 7833 - 8048 328 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MASIELEASGFSKQLISINTEVLDHLRRAGARLTVRLEIQAESDQGFDDAVRRTVGENSA NLGFGGYGFEE >gi|319979481|gb|AEUH01000007.1| GENE 11 8375 - 10366 1274 663 aa, chain + ## HITS:1 COG:no KEGG:Ndas_3629 NR:ns ## KEGG: Ndas_3629 # Name: not_defined # Def: hypothetical protein # Organism: N.dassonvillei # Pathway: 
not_defined # 1 663 1 602 602 137 26.0 2e-30 MTIGYRSIIEIDDPRGALAVADEQFRAWLRSKKLDQRTAVARDEWDGPGIYDIGEGTTLT VINEADSEGGYEAQLLELVENSKGRGRWRTRFYAMHRAHETRYSSVLWIETRGLGDKGEE KVPNPPRLVRQLLSDVRTAHNHGVPLLVEPEPVRRPSEIERLIAHIRNPERTVSLIVAAP VGHGIDMKEKWRDALSSLTKESQGCASFFVLEDDTFRLFNERMKKGLRLPKGCLRTYCPR VDPDDPVDLRRHRILTTQRMVSEFDDRRRRFSPRLARTISTQPLINAWQAPLPLELVVAE RLLEKRRLSLTRVVPKAHDPESQPLIVTVPSPKGGTQAPQRPRARSAEAVASTERCPATL SWKDKLRAFLARFTGGRAVHSETELVTAIDEAEAAFARLDAEKRDALDEAGRRRVECETL RRTAEENLELAQGVDDELTELKGKYDDAQHDVARLEEEQRVLEKKVRRLERQVRNPGSHA SEAAAEDWIENPPDSVTGIVDRLTGDSQGYDLVRKYVELSDLDKVLEGAFRVDEVDASRF ATAFWKYILVLMDYMRACEAGDFKGGVHKYLEESRFPYHTCPQSRHKPTESDTVRTGKRM RAERTFRVPADVDPRGRIEMWAHFAPTGGGQTAPRMHYYADTRNTHKVYIGYIGPHLTNT KTN >gi|319979481|gb|AEUH01000007.1| GENE 12 10605 - 11714 860 369 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 37 350 11 324 329 335 54 7e-92 GAGAAESAGADSAERAGAYGAQGPAAPPTGPAPRGTLLDVKEASREYESAGSGFFKRDKG VVSAVDRVSITVRKGETYGLVGESGCGKSTVGRLIAGLEPPSGGAIELDGRDLAALKGRD AVRIHRDVQMMFQDSYAAMDPRMRIDQILAEPMSIQRTGDAQQIAERIMEILEQVGLTEE ILDRYPHEFSGGQLQRIGFARSLTLAPDLIVADEPVSALDVSVQAQVLNLMKDLQEELGL SYLFISHDLAVVQYMADRIGVMYLGRIVEEGPAEEVVANPRHPYTKALIDSIPVPDPAFE HADDAIKLTGEPPSAINPPEGCRFRPRCPFATDECLAQPPLSGGGHRVACHHPLAWAAAG AVAEEAPVG Prediction of potential genes in microbial genomes Time: Thu May 12 16:55:26 2011 Seq name: gi|319979465|gb|AEUH01000008.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00008, whole genome shotgun sequence Length of sequence - 16012 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 4, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 44/0.000 - CDS 3 - 1154 533 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 2 1 Op 2 49/0.000 - CDS 1157 - 2062 1206 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 3 1 Op 3 38/0.000 - CDS 2059 - 3018 1653 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 4 1 Op 4 . - CDS 3113 - 4774 2881 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 5009 - 5068 2.0 5 2 Op 1 . + CDS 5147 - 6295 1498 ## COG1940 Transcriptional regulator/sugar kinase 6 2 Op 2 . + CDS 6292 - 7749 1878 ## COG5476 Uncharacterized conserved protein 7 2 Op 3 1/0.333 + CDS 7746 - 8678 363 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 8 2 Op 4 . + CDS 8675 - 9373 605 ## COG3142 Uncharacterized protein involved in copper resistance 9 3 Tu 1 . - CDS 9526 - 10383 1164 ## COG2887 RecB family exonuclease 10 4 Op 1 1/0.333 + CDS 10515 - 11573 1249 ## COG2519 tRNA(1-methyladenosine) methyltransferase and related methyltransferases 11 4 Op 2 . + CDS 11566 - 13113 2270 ## COG0464 ATPases of the AAA+ class 12 4 Op 3 . + CDS 13110 - 14624 1927 ## Cfla_2003 protein of unknown function DUF245 domain protein 13 4 Op 4 . + CDS 14647 - 14823 354 ## 14 4 Op 5 . 
+ CDS 14825 - 16010 1752 ## Jden_1206 protein of unknown function DUF245 domain protein Predicted protein(s) >gi|319979465|gb|AEUH01000008.1| GENE 1 3 - 1154 533 384 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 3 319 7 324 329 209 35 7e-54 MSTPTPLLEIKDLHTDIEIRSGVVRALSGVDLVVNAGETLGVVGESGSGKTMTALSLMGL LPQGGRVSSGSMLLEGEDLTSMPPASVRKLRGTKVGMIFQDPLTSLNPTMKIGLQVCEPL RVHEKMPKKEALARAVEILKRVGMPRPESVINSYPHQLSGGMRQRVMIAMALVCQPRILI ADEPTTALDVTTQMQILDLIDELRDEYQMGVILITHDLGVVAGHTDRVSVMYAGRIVETA PTRTLFTEPRHRYTSSLMAALPERALAERTRLFSIPGAPPSLTDLPVGCRFAARCLWATD QCRAAYPGLGGEGAHTYACFHPVLEGDESPAALQARLDAERAADEAGAGSADRAGAGGAE GTRLGSADQAGAGAGCAGRQCWQC >gi|319979465|gb|AEUH01000008.1| GENE 2 1157 - 2062 1206 301 aa, chain - ## HITS:1 COG:BS_appC KEGG:ns NR:ns ## COG: BS_appC COG1173 # Protein_GI_number: 16078205 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 31 295 33 298 303 204 41.0 2e-52 MSAVPAQAPAGAVPRQRAKRASTKLQGLQVFMENRLAVFGVVLLALLVLFSFAGPLLWRT DQIHTDLMNSVLPPSAEHPLGTDKVGYDQLGRLMEGGKTSIVVGIFAGTFATAIGTAYGA IAGFVGGWVDALMMRIVDSMMSIPVLFLFMLIATMIPPTVPVLIAIMAALSWLGTSRLIR GEALSLRTREYVMAMRGMGGSMPRAIRTHIVRNTIGTVIVNATFQVADAILYVAYLSFLG LGAPPPATDWGAMLSNGQSDVYSGNWWLIFPPGIAIILLVLSFNFIGDGLRDAFEVRLRK R >gi|319979465|gb|AEUH01000008.1| GENE 3 2059 - 3018 1653 319 aa, chain - ## HITS:1 COG:CAC0177 KEGG:ns NR:ns ## COG: CAC0177 COG0601 # Protein_GI_number: 15893470 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Clostridium acetobutylicum # 1 314 1 320 322 216 37.0 6e-56 MLKYLLKRLGQAVVVVFLVTIVTFALLQSQPGGAARAALGKDATQEQLDAFDHENGYDRP IVEQYVTYIGKIAHGDFGYSYQHNQSVKTLLAVRLPRTIFLSLLSTVLALLIAVPLGVFQ 
AVKRNKAPDYVITIGCLLAYSTPIFFAGLLLIVLFSQVWPILPGEAPQGQDLAVMWEQWD HLVLPVCALSIGTIAAYARYVRSSMVDNLNEQYVRTARAKGLSEVRVVFVHTLRNAMFPV ITMIGLYIPAMFCGALVIETLFNFNGMGYLYWQATGRRDYPILLGVTLIVALATVVGALL ADFLYAAADPRIRLAGRSK >gi|319979465|gb|AEUH01000008.1| GENE 4 3113 - 4774 2881 553 aa, chain - ## HITS:1 COG:CAC0176 KEGG:ns NR:ns ## COG: CAC0176 COG0747 # Protein_GI_number: 15893469 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Clostridium acetobutylicum # 23 521 73 538 569 93 26.0 1e-18 MASRTPNWIFPVSAKGYTQGENGNFIQAMYRPLFAYKSTSDTPYRINLPKSLGTVPEVSE DGLTYTIALRQDAKWSDGTPVTTRDIEFWWNLVTNNKEEWASYKEGFFPDGATLDVKDEH TFSITTEEKFAPDWFIDNQINKVALLPQHAWDKDSADAAVSDMDRTPEGAQKVFAFLTAE AKNLSTYATNPLWQTVNGPWKLSTFTPDQGLELVPNENYWGEDKPKIDKLIYKAFTGDDA EFNTVRSGGIDFGYIPAAQYGQKSAVEAKGYTVFLWPGNSITYLALNFAPQASGSKFINQ KYIRQAMQQLIDQKTLSDKVWNGTASPTCGPVPMSPEKVGTMEGCAYQFDPAAAQKLLED HGWKIVPDGASTCENPGTGDNQCGEGIEAGDSMNFKLNSQSGFASTHQMFEEVISQFRKL GIGIDMQELPDSVGASQVCDDETPCTWDLSFFGSQTSWGYPIYASGERLFATKAPVNLGQ YSSEKADELITASTKSSDANALAEYNDYLAEDLPVLWMPNPYYQITAVKSGLDLGDIDAT GDTWPEDWSWKQQ >gi|319979465|gb|AEUH01000008.1| GENE 5 5147 - 6295 1498 382 aa, chain + ## HITS:1 COG:VC2007 KEGG:ns NR:ns ## COG: VC2007 COG1940 # Protein_GI_number: 15642009 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Vibrio cholerae # 21 368 31 385 405 107 27.0 5e-23 MAAQSVEDAIIALIGSGQARTRADIVKRTGLAASTISAAVSRLVDSGAINETEESVSTGG RRARMLAPADAGGVGALIELGAHHALLALTDPDTGITEPTSAPINIADGPRRVLTALRDA ITRLEEAHGRTATRIAVAVPGPVDAPRSRVIRPARMPGWDGVDFAALIRQETGMAASIEN DARAGAMGELVYRRREGAGADGANPLIYVKAGSAIGGALLIDGSPVEGADGLAGDISHIP VPAAASRPCKCGNVGCLETIASADAIRADLAASGVSFDSNASLLNAARDGIPEVATAIRS AGILLGESLAHIVSFLNPRAVIIGGALSAVDAHVAGVRQALHQSCLPSIMDSLVIESSRT GRAAALWGLTTSTPTLPKENRP >gi|319979465|gb|AEUH01000008.1| GENE 6 6292 - 7749 1878 485 aa, chain + ## HITS:1 COG:AGc4702 KEGG:ns NR:ns ## COG: AGc4702 COG5476 # 
Protein_GI_number: 15889852 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 472 2 463 477 324 43.0 3e-88 MTRPRVAVAGIHIESSTFTPYASTADDFIVTRGADLLDRFYWIDEDWARQIEWIPVLHAR ALPGGVVQRGAYDAWKAEILQGLAAAGPLDGLFFDIHGAMSVQGMDDAEGDLITAIRGVI GPDPMVSASMDLHGNVSQDLFEGCDLLTCYRLAPHEDALESRRRAAHNLGRRLLDGAPKP VKALVHVPILLPGEKTSTRIEPAKSLYKMIDPVEAMDGIVDAAIWIGFAWADQPRCKGAV VVTGDDADLVKEQAERIGRYFWDVRERFEFVAPVATMEECLAAAEEGPKPFFISDSGDNP GAGGADDVTVALAALLAWKPVQEASLDVVHASIIDPDAAGVAWEAGVGAEVDTQVGGRID TREPGPIRVHATVEALADDPVGGRTALLRTGGLRFIVTTKRNQYTMFSQFALLGVDITTA DVVVVKIGYLEPDLYNTQKGWLMALTPGGVDQDLVRLGHHRIDRPMFPFDPDMADPGLRA QVVEP >gi|319979465|gb|AEUH01000008.1| GENE 7 7746 - 8678 363 310 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 11 296 4 313 323 144 33 4e-34 MSGGDGSPGPVFAGIDIGGTSIKWMVVDEAGDVLDEGAEPTDREAVASQVGGTGQRLARS HPGLAGFGLICPGLVDEKTGTVVYAANLELRGAWLARAVEEATGVPAALMHDGRAAGLAE GLLGAGRGASSFLMMPIGTGISVALMLGDVLWSGAAFSAGEVGHAPVFPGGEPCRCGSRG CLEVYASAKGIARRYEQATGEDIGAKAVEAGIGSDPVASEVWGTAVRALALSLTHMTLTV DVERIIIGGGLSHAGEHLLAPLRQEFASMLTFRDAPEIVRARLGGAGGRWGAAVLAARVG GSTSYERWQP >gi|319979465|gb|AEUH01000008.1| GENE 8 8675 - 9373 605 232 aa, chain + ## HITS:1 COG:PM0526 KEGG:ns NR:ns ## COG: PM0526 COG3142 # Protein_GI_number: 15602391 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Pasteurella multocida # 4 208 3 202 244 120 34.0 2e-27 MTVVEICVEDALGARRAHDGGADRIEICRDLSCGGLTPAFDEVAAALEVAPTGGVQVLVR PRPGDFVHTREEVDRIASDIVTLSALGRGAPVRLGFVVGVLTRDGQIDVDAAARLRDEAG GAPLTFHRGFDQVEDQDRGLDVLMELGYDRVLTTGGDPAVARPGALARLVARAGEDIIIL VSGGLRAHNVAGVVAASGAREVHMRAPGSDGTDEAEVRRITTALRGTGGTGR >gi|319979465|gb|AEUH01000008.1| GENE 9 9526 - 10383 1164 285 aa, chain - ## HITS:1 COG:MT2179 KEGG:ns NR:ns ## COG: MT2179 COG2887 # Protein_GI_number: 15841611 # Func_class: L 
Replication, recombination and repair # Function: RecB family exonuclease # Organism: Mycobacterium tuberculosis CDC1551 # 5 272 1 271 278 177 39.0 2e-44 MTEVIAPRPGPRSWEPALSASRAKEYERCPLQYRLHVIDGYREPATRATAMGTLIHAVLE DLYGLEAPGRTEEAAQEMVGDRFEALLRRDPTVGALFDDDGDRQRWLGEVRAVLGQYFAI EDPKWIAPWAREKNVEASTRAGVRLRGFIDRIDRAPDGRLRVVDYKTGKAPSPRFTEEAL FQMRFYALLLMRTAVLPARMQLVYLKSGRVLTLDPFPGDIERFEERVEELWGRIEADVRG DGFAPRKNPLCNWCGVRSLCPVFGGQAPPMPAEHAQWVLGTRKGA >gi|319979465|gb|AEUH01000008.1| GENE 10 10515 - 11573 1249 352 aa, chain + ## HITS:1 COG:Cgl1462 KEGG:ns NR:ns ## COG: Cgl1462 COG2519 # Protein_GI_number: 19552712 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA(1-methyladenosine) methyltransferase and related methyltransferases # Organism: Corynebacterium glutamicum # 21 291 5 278 278 257 49.0 2e-68 MDTQRNTGGARAPLGQSGRRGALAHGDRVQVRDPKGRYHQVVLVAGGRFQSNRGGFDHDD VIGRPDGQVVTTEEGRQFQILRPLRADYVMAMPRGAAVVYPKDAGVITHMGDVFPGATVV EAGAGSGALSMALLDAVGEGGRLVSVERREDFAQIAAANVDLWFGRRHPAWDLRVGDVAD VLDSLEEASVDRVVLDMLAPWENIGPLTRALVPGGVLTCYVATVTQMSRLVEDLRASGRF TDPVAWEDMRREWHLDGLAVRPAHRMVAHTGFLVVARLLAPGVLPQERAKRPAKAAEGKG GAWDQEEGWTPEGVGQRVNSDKKVRKVRRGLAAQAATWVDGGSRDGEGARDD >gi|319979465|gb|AEUH01000008.1| GENE 11 11566 - 13113 2270 515 aa, chain + ## HITS:1 COG:ML1316 KEGG:ns NR:ns ## COG: ML1316 COG0464 # Protein_GI_number: 15827684 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Mycobacterium leprae # 6 504 55 579 609 469 50.0 1e-132 MTDGADTARLQRQVVSLTEKNARLTEALTRTRSELVRIKAELADVNRPPQSLATFLRADE RERQIEVFTGGRVMRVAASPRLDVAGLSHGQRVRLDDQMVAVAASDYPRSGTVVSVLEMV GDDRVLVSSEGGAEQMLVLAGPLRHGNLRPGDSLVADTRTGTALERIVREDVEQLLAPEV PDTSYDDIGGLDAQIQQVRDAVELPFTHPELFRDYGLRPPKGILLYGPPGSGKTLIAKAV ANSLSRGRDGAQTYFLSIKGPELLNKFVGETERQIRAIFARARALAASDTPVVVFFDEME ALFRTRGTGVSSDVETMIVPQLLAEMDGVESLDNVVIIGASNRADMIDPAVLRPGRLDVR IRVDRPDRAGALDIFSKYLTPSVPIRRDEVERHGGVGAAVERMREAAVERLYVDDETTAL 
FVATLASGGTRRIYLSDLVSGALIAGVVERAKKHAIKDALGGGAPGLGMDHLMEGLAEEM RESLELAATASPQDWARTSGLGAEIVGVKPIGARE >gi|319979465|gb|AEUH01000008.1| GENE 12 13110 - 14624 1927 504 aa, chain + ## HITS:1 COG:no KEGG:Cfla_2003 NR:ns ## KEGG: Cfla_2003 # Name: not_defined # Def: protein of unknown function DUF245 domain protein # Organism: C.flavigena # Pathway: not_defined # 1 500 1 507 535 440 53.0 1e-122 MSGGRVIGTETEYGVYRPGDPWANPIALSAAVVDAYAQVSRERTPARGAPPVRWDYTGED PLNDLRGMRMSRAAADPSLLTDDPYHLAPSGGHERVARPTPEELALPAATTAVLANGARL YVDHAHPEYSAPEALGAVDAVLWDRAGEVVARRAMEAVEASGRGPVVLVKNNTDGKGAAY GAHENYQVPRATDFDALVRALTPFMVTRPVVCGAGRVGIGQRSEAPGFQMSQRADFVENE VGLETTFNRPIINTRDEPHADPARWRRLHVIGGDANMFDYSALLRLGTTSLVLWAIEQGT DLRWDSLVLDDPVQETWNVSHHPDLDYRISTAGGGSYTAAELQQVYLDLVLDAFDEAGAQ PSGDDRLVLEQWQSVLDRMRADLFSVAAEVEWVCKYQLLLRQRERACLQWSDPRLAAIDL QWADLRPSHGLVHRLDRAGAVKRLFAPERVEAAADEPPANTRARARGEAVANRPDLVKAS WTSLVFDPGQGDLLRYPIADARGA >gi|319979465|gb|AEUH01000008.1| GENE 13 14647 - 14823 354 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSQSQIHAGSPAPEGGEEQAGQVRARTQSVDELLDQIDSVLETNAEAFVQGFVQKGGQ >gi|319979465|gb|AEUH01000008.1| GENE 14 14825 - 16010 1752 395 aa, chain + ## HITS:1 COG:no KEGG:Jden_1206 NR:ns ## KEGG: Jden_1206 # Name: not_defined # Def: protein of unknown function DUF245 domain protein # Organism: J.denitrificans # Pathway: not_defined # 1 394 1 395 459 364 51.0 3e-99 MRRRVVGIETEHGLLAAPASGDGPCMDAEHAARQLFEPLLRRGRSSNLFLRNGGRLYLDV GSHPEYATAECDRLEDLLEQDRAGSLMLADLASQADEAMAALGEDLRVHLFRNNLDSQGN SYGCHENYMLHRRRDFRQVADALVSFFISRLVLVGNGWINLSGARPRLEFSQRANQMWDA VSSATTRSRPIINTRDEPLADSGSYRRMHVIVGDTNVAEPTTALKIGMTWMLLDAVEDGL RIEDLALADPMRAIRQINADLSGAAPIELASGARTTPVALQREIRSRVLSAIGPDSLDEA HRYVADLWGRGLDAIESGDWSGVDTELDIAIKRRLLDSYTARTGADYADPRVARLELGYH DITAQGLRDRMEAAGLMKRLTSPGGASRALTAPPA Prediction of potential genes in microbial genomes Time: Thu May 12 16:55:46 2011 Seq name: gi|319979454|gb|AEUH01000009.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00009, whole genome shotgun sequence Length of sequence - 7985 bp Number of predicted genes - 10, with homology - 8 Number of transcription units - 3, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 150 - 1190 1279 ## Bcav_2236 peptidylprolyl isomerase FKBP-type 2 1 Op 2 7/0.000 + CDS 1237 - 2460 1410 ## COG0082 Chorismate synthase 3 1 Op 3 . + CDS 2445 - 4043 2134 ## COG0337 3-dehydroquinate synthetase 4 1 Op 4 . + CDS 4040 - 4603 500 ## Cfla_1834 shikimate kinase (EC:2.7.1.71) 5 1 Op 5 6/0.000 + CDS 4631 - 5194 779 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) 6 1 Op 6 . + CDS 5194 - 6165 768 ## COG0781 Transcription termination factor 7 2 Op 1 . + CDS 6463 - 6912 342 ## gi|154508769|ref|ZP_02044411.1| hypothetical protein ACTODO_01278 8 2 Op 2 . + CDS 6887 - 6982 122 ## 9 3 Op 1 . + CDS 7107 - 7595 377 ## BcerKBAB4_0933 hypothetical protein 10 3 Op 2 . + CDS 7611 - 7706 141 ## + Term 7709 - 7749 3.0 Predicted protein(s) >gi|319979454|gb|AEUH01000009.1| GENE 1 150 - 1190 1279 346 aa, chain + ## HITS:1 COG:no KEGG:Bcav_2236 NR:ns ## KEGG: Bcav_2236 # Name: not_defined # Def: peptidylprolyl isomerase FKBP-type # Organism: B.cavernae # Pathway: not_defined # 80 343 29 290 308 105 32.0 2e-21 MSGHRDWTRGSTRARRGASARARRLGERSQGGSRASLYAWIALALVLVVAGAACAWWALS PRPSDSPAAQSASMPRVTDQVTVSGRVGATPTITIAGELSVTGIQATVVSQGTGRTITEG SPVLVSITAFDGHSGEMLSVSGRPQLTLGLVGSDTIPDELGRIVVGKNEGSRLVVVRKLG EQNKAANAVSDVEVDIIDVLPSIAQGTAIDASSGPLSVEMHPEGPVIRHAEAPSGTITTQ TLVKGDGVQVHAGDRVVAQFTVVGWTDGVVRASTWETGIPEVVNLKTAMKGLSETLVDQK VGSRLAITVPPDLAEGDDTLCIVIDVLGTEPPVDEPQSQPQSGPVQ >gi|319979454|gb|AEUH01000009.1| GENE 2 1237 - 2460 1410 407 aa, chain + ## HITS:1 COG:ML0516 KEGG:ns NR:ns ## COG: ML0516 COG0082 # Protein_GI_number: 15827178 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Mycobacterium leprae # 1 405 1 395 407 394 57.0 1e-109 
MLEWMTAGESHGPALVATIEGVPQGIRLTTAVLRAALARRRLGHGRGARQRFEEDEATIL AGVRHGLTTGAPIAVQIANTEWPKWRVVMSADPVDPAELLKDAGTGDEREIARNRPLTRP RPGHADLPGMVSYDLDDARPVLERASARETAARVALGAIAEQLLEQVAGVRLVSHVVGVG AEAAAPGPLPVPEDAASLDASPMRTLDEEAGARFVGAVDAAKRAGDTIGGVVEVVAWGVP IGLGSHVSAHRRLDARIAGALMSIQSVKGVEIGDGFAQAALPGSAAHDEIVRDGRGRPTR ASNHAGGIEGGTSNGQPVVARAAFKPISTVPRALRSIDLATGEEAVALHQRSDACQVVPG AVIAQAEVALVLADALFEHAGGNSVDEMRRNLESYLNRVEERLCPND >gi|319979454|gb|AEUH01000009.1| GENE 3 2445 - 4043 2134 532 aa, chain + ## HITS:1 COG:ML0518 KEGG:ns NR:ns ## COG: ML0518 COG0337 # Protein_GI_number: 15827180 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Mycobacterium leprae # 173 511 6 341 361 303 54.0 8e-82 MSQRLTLPVVLVGMPGAGKSHVGRGLAGALGVPHVDTDALIEEEEGAAVSAIFAEQGEVA FREKEARAVERALGMRAVVSLGGGAVATARVRELLRGVSVVHIDVDHEELVRRTAGKGHR PLLRVDPEGTLALLRHQRARLYEEVASVRVHSDARPVQRVIDQIAAMIDTGAAPRVVEVG GAAPSRVVIGRDLASGHVAAGFDTLTQKVLLVHARPVAGRAQQLAEDLRKRGYEVETASH PDAEEGKRLEVVASLWDTAGRMRLGRKDAVVAMGGGATTDMAGFAAATWLRGVRLVNVPT TLLAMVDAAVGGKTGINTPQGKNLVGAFHSASRVVCDLAALDTLGARDLAAGMAEVVKCG FIRDTQILRLVSEAPAQELADPASPLLAELVRRSVAVKAGVVGADPHESGLREILNYGHT LAHAIERTQGYTWRHGDAVAVGCVFAAHLARARGMLDEAETAEHEELFGEAGLPTRFDGA PLDQLVAAMRSDKKVRRGVLRFVLLDGIGNPQVHVVEPGELEAAARSTGIPL >gi|319979454|gb|AEUH01000009.1| GENE 4 4040 - 4603 500 187 aa, chain + ## HITS:1 COG:no KEGG:Cfla_1834 NR:ns ## KEGG: Cfla_1834 # Name: not_defined # Def: shikimate kinase (EC:2.7.1.71) # Organism: C.flavigena # Pathway: Phenylalanine, tyrosine and tryptophan biosynthesis [PATH:cfl00400]; Metabolic pathways [PATH:cfl01100]; Biosynthesis of secondary metabolites [PATH:cfl01110] # 17 178 11 173 187 91 39.0 1e-17 MRHVGPEPGGTRAGCPRLVLIGPSGSGKSTVGALLAARLGVALHETDDEAAASLGSTMAA LVVRRDPRLAGARRDRAVAALAGPEGVTVLGPSMPADPGVAAALARAREEGATVVWLEAG ISAVSRRMGLGAPRSVGLGAPRAALRAMMEEARAHYAAVADSSVETDSLTPAQVADSVLR ACKLADG >gi|319979454|gb|AEUH01000009.1| GENE 5 4631 - 5194 779 187 aa, chain + ## HITS:1 COG:MT2609 KEGG:ns NR:ns ## COG: 
MT2609 COG0231 # Protein_GI_number: 15842068 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Mycobacterium tuberculosis CDC1551 # 1 185 1 185 187 259 68.0 1e-69 MATTNDLKNGLVMVIDGQLWQVVEFQHVKPGKGPAFVRTKIRNVMSGKTVDKTFNAGIKI ETATVDRRDMTYLYQDGTDYVFMDQSTYDQITVPGEVVGDARNFMVENQDVIVSQHDGTV LFVELPATVVLTITHTEPGLQGDRSSAGTKPATLETGYEIQVPLFMEEGTRVKVDTRDGS YSGRVTD >gi|319979454|gb|AEUH01000009.1| GENE 6 5194 - 6165 768 323 aa, chain + ## HITS:1 COG:ML0523 KEGG:ns NR:ns ## COG: ML0523 COG0781 # Protein_GI_number: 15827185 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Mycobacterium leprae # 10 141 8 136 190 85 40.0 1e-16 MSRQHRFTSRTKARKRAADVVFEADQRGMGRDPEALRDLLRERRVITAAQTPLPEYSIQI IAGVADSLRRIDDLIEAHARVPGLDRVAAVDLAVMRVAVWEMLANSDDVSPIVAIDEAIS IVRSISTDASPRFVNAVLDAIRKDIASSWARRGGDDGDFDEAGAGRAEGVAGDDGAGAGE AAGTAGADGSDGADRPSDGAAGRSGGGEVPAGGDGGADTAGVGGQGSADRTGGADIGTDP DWASGGPSGGGIAHVDGGVGTDGGATGLDAAEGSGPHSADPGALGADRLEELPVPTRALP EGAEPVESAELVDDELDELLEEY >gi|319979454|gb|AEUH01000009.1| GENE 7 6463 - 6912 342 149 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508769|ref|ZP_02044411.1| ## NR: gi|154508769|ref|ZP_02044411.1| hypothetical protein ACTODO_01278 [Actinomyces odontolyticus ATCC 17982] # 98 149 1 52 52 92 88.0 9e-18 MSARFPAIRGALNILASSHFVYRRSPLRWLVRVEPEVPGEDSEWRLYSASDTPEFIEGEG NFVIISFNEAGVIEPMIMPMYFQPVGTELRLVDLPQQMRKAWFDENSRDENGTLRELALD DAFWDQFNRDYDAWYAEHPLPGAPGRPQP >gi|319979454|gb|AEUH01000009.1| GENE 8 6887 - 6982 122 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLGAPSHEMTGAENLTHNAVQLTHRTKTTA >gi|319979454|gb|AEUH01000009.1| GENE 9 7107 - 7595 377 162 aa, chain + ## HITS:1 COG:no KEGG:BcerKBAB4_0933 NR:ns ## KEGG: BcerKBAB4_0933 # Name: not_defined # Def: hypothetical protein # Organism: B.weihenstephanensis # Pathway: not_defined # 27 109 13 95 110 69 36.0 4e-11 
MLQCANWPPLRREMPKRAAAPPGIPFALACVASVKILRRESSVKWAMREESENVNDSGWR LYSEDDTPEFLESPSSMRIVNFNTVGDLFPIIDLLYFQPVGSEYMLVKDAKDDSLHWYDY NTLEGGKLSPLVVDDAFWGRYWEQWEAESKRVHQLFYSDERP >gi|319979454|gb|AEUH01000009.1| GENE 10 7611 - 7706 141 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAARQPGIMLAHMTEGGRRWFGSNSGSGLVF Prediction of potential genes in microbial genomes Time: Thu May 12 16:56:16 2011 Seq name: gi|319979443|gb|AEUH01000010.1| Actinomyces sp. oral taxon 178 str. F0338 contig00010, whole genome shotgun sequence Length of sequence - 7005 bp Number of predicted genes - 13, with homology - 8 Number of transcription units - 9, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 147 243 ## gi|154508770|ref|ZP_02044412.1| hypothetical protein ACTODO_01279 2 2 Op 1 . + CDS 651 - 830 192 ## 3 2 Op 2 . + CDS 827 - 1360 636 ## 4 3 Op 1 . + CDS 1484 - 1981 607 ## Sked_12140 hypothetical protein 5 3 Op 2 . + CDS 2006 - 2071 133 ## 6 4 Tu 1 . + CDS 2485 - 2601 140 ## 7 5 Tu 1 . - CDS 2629 - 2919 172 ## 8 6 Op 1 . + CDS 2786 - 3163 361 ## gi|154508768|ref|ZP_02044410.1| hypothetical protein ACTODO_01277 9 6 Op 2 . + CDS 3222 - 3437 162 ## gi|154508769|ref|ZP_02044411.1| hypothetical protein ACTODO_01278 10 7 Tu 1 . + CDS 3556 - 4056 584 ## Sked_12140 hypothetical protein 11 8 Tu 1 . - CDS 4193 - 4486 344 ## HMPREF0675_3286 hypothetical protein - Prom 4534 - 4593 3.2 + Prom 4587 - 4646 1.7 12 9 Op 1 . + CDS 4716 - 6167 2202 ## COG2271 Sugar phosphate permease 13 9 Op 2 . 
+ CDS 6177 - 6956 1131 ## COG0483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family Predicted protein(s) >gi|319979443|gb|AEUH01000010.1| GENE 1 1 - 147 243 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508770|ref|ZP_02044412.1| ## NR: gi|154508770|ref|ZP_02044412.1| hypothetical protein ACTODO_01279 [Actinomyces odontolyticus ATCC 17982] # 11 48 65 102 102 63 81.0 5e-09 GWEKDEDSFTIRVHPEEVFTGEQAAPVFHDYIVDGALPDPALLRPLDI >gi|319979443|gb|AEUH01000010.1| GENE 2 651 - 830 192 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWERITSGQVTLSNVVCKTRTADLSLIERTTGQFTLEARTIEELDTELNPLPTTNGTTP >gi|319979443|gb|AEUH01000010.1| GENE 3 827 - 1360 636 177 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTTEFEAVTPERAVQIMRAWAFHPWPMSVQDGIDVYTSFGFSPHPQEPQLFTSDTSPDEA NSFFTSEGDQIDSVRTHLSNVVPKEERPAYRSQVRAAYAGFVAAFTGVLGRPKSVKDKNG VFSSQWFLGNGVGVWVGGNDGLIALSLESPEMAGIHQDDLRRGIVDYSPANDPLLEG >gi|319979443|gb|AEUH01000010.1| GENE 4 1484 - 1981 607 165 aa, chain + ## HITS:1 COG:no KEGG:Sked_12140 NR:ns ## KEGG: Sked_12140 # Name: not_defined # Def: hypothetical protein # Organism: S.keddieii # Pathway: not_defined # 43 165 31 148 150 70 36.0 2e-11 MRRFATSVLHADVGEVERFFLPQTPESTADYVCKRLRAPRYFDGAFAFAMVVWALPEGAA YPEDVPEGNPARSTYIQCAGSTRAMAVEIRLTHPDGSYQHWVVAREPVADPDRWVHIEWE KGEDSFTIHVHPEEVFTGDQAAPVFRDYIVGGILPPAELLRPLDI >gi|319979443|gb|AEUH01000010.1| GENE 5 2006 - 2071 133 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGPEFRAPAVERFMEIVRAWI >gi|319979443|gb|AEUH01000010.1| GENE 6 2485 - 2601 140 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDDVRKPLTAPQHLDRAFAYAMAVRALPSSALLRPLAI >gi|319979443|gb|AEUH01000010.1| GENE 7 2629 - 2919 172 96 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAPAWMTGTVSDTSSELVTVAGSREPMRSRMFPTINVRMPFCAITDPHQYYYYATRCRPS ALQVAVPPARVMGQTRRAHARTIPDEFPPPAEARNP >gi|319979443|gb|AEUH01000010.1| GENE 8 2786 - 3163 361 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508768|ref|ZP_02044410.1| ## NR: gi|154508768|ref|ZP_02044410.1| hypothetical protein 
ACTODO_01277 [Actinomyces odontolyticus ATCC 17982] # 11 125 1 115 115 203 98.0 3e-51 MIAQKGIRTLMVGNIRERIGSLLPATVTSSELVSDTVPVIHAGAMGLQIWGAFVGQSAGE AMWDEMDPQDLAVRLKAFEMNRLLAVDGGQGWLHLDFTNGWIRVVADTWEPWSIILPGIW WTGDI >gi|319979443|gb|AEUH01000010.1| GENE 9 3222 - 3437 162 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508769|ref|ZP_02044411.1| ## NR: gi|154508769|ref|ZP_02044411.1| hypothetical protein ACTODO_01278 [Actinomyces odontolyticus ATCC 17982] # 20 71 1 52 52 84 86.0 3e-15 MPCASGQWTGLRLVDLPQQMREAWFDENSRGENGTLRELALDDAFWDQFNRGYDAWYAEH PLPGAPGRHQP >gi|319979443|gb|AEUH01000010.1| GENE 10 3556 - 4056 584 166 aa, chain + ## HITS:1 COG:no KEGG:Sked_12140 NR:ns ## KEGG: Sked_12140 # Name: not_defined # Def: hypothetical protein # Organism: S.keddieii # Pathway: not_defined # 43 166 31 148 150 68 35.0 7e-11 MRRFARTDLLAEAGPVEEFSAESTPEAVADGITEVLTAPRCFDGSHMYAMVVWELPPGAE RPEDVAADDPARSTYIQCAGSTGVMTVEIRVTGQDGSYKHYVVAREPVADPESWMGIEWD TGEGGPAAVRVHPEEVFTGEQAAPVFRAYVVDGRLPAPELLRPLDI >gi|319979443|gb|AEUH01000010.1| GENE 11 4193 - 4486 344 97 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0675_3286 NR:ns ## KEGG: HMPREF0675_3286 # Name: not_defined # Def: hypothetical protein # Organism: P.acnes_SK137 # Pathway: not_defined # 4 96 5 95 95 71 43.0 1e-11 MTIEELPLTEAPSGCGCGGHEAHPPMIDAAAIPHRIRHAAVLGVAQSMRGGEAFVIRAPH LPTPLLAQIEQLPGQWAFEVLTDGPEHWDVKATRTAL >gi|319979443|gb|AEUH01000010.1| GENE 12 4716 - 6167 2202 483 aa, chain + ## HITS:1 COG:SA0214 KEGG:ns NR:ns ## COG: SA0214 COG2271 # Protein_GI_number: 15925925 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Staphylococcus aureus N315 # 8 459 4 455 459 436 48.0 1e-122 MEFLSRAFDIRPAPHAGLPLEVQRKKWVYEFLKTYAVLIIAYGGFYLLRTNFKSAQPFLV EQTGLTTSDLGYIGFGFSLTYGFGGLILGFFIDGKNTKKVVSALLIGSGVASILIGALLA SMNSPYGWMLLLWSLNGLLQAPGGPCCNSTMNRWTPRVLRGRFIGWWNASHNLGAMIAGV LALWGANTLFAGSVIGMFIVPAVVAIPIGVWGYFYGKDDPVELGWDKPETIFGEPEAKAD TVSENMDKGRILVDYVLKNPAVWFLCVANVAAYCVRIGIDNWNVLYTRQELGFSDYLAVN 
TTMALELGGLAGSLLWGYFSDKMGGRRALSAAIGMCLVVVPIFVYSHATAPGVVYAALFF IGFLVFGPVTLIGICVIGFAPKSATVVVNAVPRAFGYVFGDSLAKVLIGRIADPTKDGVT ILGFHLHGWGATFNVLFFSAAVGLICLVLVALFEERNLRGDRAYAASNAARTGTEADSTE SEK >gi|319979443|gb|AEUH01000010.1| GENE 13 6177 - 6956 1131 259 aa, chain + ## HITS:1 COG:lin1054 KEGG:ns NR:ns ## COG: lin1054 COG0483 # Protein_GI_number: 16800123 # Func_class: G Carbohydrate transport and metabolism # Function: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family # Organism: Listeria innocua # 30 251 28 252 257 170 39.0 3e-42 MSEPTTTQLLQIAEATVREAMALALDPGISLDVRTKTNRNDLVTAVDRRIEEVVTARLAE ATGYPVLGEEGHTVDSFAGRVWVLDPIDGTMNYVSTHRDYAVSLALCEDGAPVVGVVADV VGSHVYTAVRGEGARCDGEALAPVLDADDYTDAIIITDIKEILALPRLARALVDSRGHRR YGSAALECVEVAASRAGAFVHMWVSPWDIAAASLICEEAGARVTRLDGTPMDVRHKGSIL VGAPRVHASLLKRLMTDAQ Prediction of potential genes in microbial genomes Time: Thu May 12 16:57:09 2011 Seq name: gi|319979428|gb|AEUH01000011.1| Actinomyces sp. oral taxon 178 str. F0338 contig00011, whole genome shotgun sequence Length of sequence - 13758 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 2, operones - 1 average op.length - 13.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 587 526 ## COG1609 Transcriptional regulators 2 2 Op 1 8/0.000 + CDS 770 - 1330 662 ## COG2065 Pyrimidine operon attenuation protein/uracil phosphoribosyltransferase 3 2 Op 2 15/0.000 + CDS 1330 - 2322 1217 ## COG0540 Aspartate carbamoyltransferase, catalytic chain 4 2 Op 3 7/0.000 + CDS 2315 - 3634 1979 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 5 2 Op 4 24/0.000 + CDS 3631 - 4833 1481 ## COG0505 Carbamoylphosphate synthase small subunit 6 2 Op 5 3/0.000 + CDS 4833 - 8132 4674 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) 7 2 Op 6 . + CDS 8162 - 8998 960 ## COG0284 Orotidine-5'-phosphate decarboxylase 8 2 Op 7 . 
+ CDS 9075 - 9404 459 ## Jden_1319 hypothetical protein 9 2 Op 8 25/0.000 + CDS 9432 - 9992 540 ## COG0194 Guanylate kinase 10 2 Op 9 10/0.000 + CDS 10028 - 10294 433 ## COG1758 DNA-directed RNA polymerase, subunit K/omega 11 2 Op 10 4/0.000 + CDS 10296 - 11522 1582 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase 12 2 Op 11 . + CDS 11519 - 12706 1673 ## COG0192 S-adenosylmethionine synthetase 13 2 Op 12 . + CDS 12703 - 13287 829 ## COG4243 Predicted membrane protein 14 2 Op 13 . + CDS 13324 - 13756 479 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase Predicted protein(s) >gi|319979428|gb|AEUH01000011.1| GENE 1 2 - 587 526 195 aa, chain - ## HITS:1 COG:Cgl1332 KEGG:ns NR:ns ## COG: Cgl1332 COG1609 # Protein_GI_number: 19552582 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Corynebacterium glutamicum # 11 192 8 196 346 79 32.0 3e-15 MEKQPFGRRPPTLRDIAVRAGASVSTVSRALRGDQRISARTRQRIVEIAQRLGYRADVAG SLLRASKPRVVGLLCDLSQELHVAYRHEVLQRAEQAGFRVVVESVEGRCPPGAALRRLRE FRIQALVVVDPRCLGDAGDPGAPVVVIGQERPFADADLVTSDNGAGMGEAVEWLAGLGHR GITYVDGPPGASARA >gi|319979428|gb|AEUH01000011.1| GENE 2 770 - 1330 662 186 aa, chain + ## HITS:1 COG:Cgl1575 KEGG:ns NR:ns ## COG: Cgl1575 COG2065 # Protein_GI_number: 19552825 # Func_class: F Nucleotide transport and metabolism # Function: Pyrimidine operon attenuation protein/uracil phosphoribosyltransferase # Organism: Corynebacterium glutamicum # 14 183 10 183 192 168 53.0 5e-42 MGSPQPDRATRGKQVLGPDDIARSLTRIAYEIIERNDGADGLVIAGIPTRGATLARRLVE RIGQVGGARTQYASIDTTMFRDDLRAQPLRPPHRTEIPESGIDGRPVVLVDDVLYTGRTV KAALDALGMTGRPSRVQLAVLVDRGHRELPIRPDYVGKNLPTSRSETVTILLDETDGADA VLLGVR >gi|319979428|gb|AEUH01000011.1| GENE 3 1330 - 2322 1217 330 aa, chain + ## HITS:1 COG:Cgl1574 KEGG:ns NR:ns ## COG: Cgl1574 COG0540 # Protein_GI_number: 19552824 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Corynebacterium glutamicum 
# 3 323 1 306 312 322 56.0 5e-88 MGLDHLISIRDLSRDEAVLLLDTAESMAATQSRAVKKLPTLAGRTVVNLFFEDSTRTRIS FETAAKRLSADVINFAAGGSSLSKGESLKDTALTLQAMGADAVVIRHPSSGAAHRLAHAG WMGLPVLNAGDGTHQHPTQALLDAMTLRRHYRPDGPDGGGATPAPGSGLDGAHVVIVGDV LHSRVARSGVDLLTLLGARVTLVAPPTLLPVGVEAWGCAVSYDFDAALADQPDAVMMLRV QRERMSARGGGFFPSVGAYHAEYGLTPLRFRSLRPDAVVMHPGPMNRGLEICAKAADSDQ SVVVEQVANGVCVRMAALYLLLAPEGKHRD >gi|319979428|gb|AEUH01000011.1| GENE 4 2315 - 3634 1979 439 aa, chain + ## HITS:1 COG:Cgl1573 KEGG:ns NR:ns ## COG: Cgl1573 COG0044 # Protein_GI_number: 19552823 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Corynebacterium glutamicum # 8 431 22 442 447 383 48.0 1e-106 MTSTRRDILISGADVVGSGRADIAVKDGTIVAVGPGAADALTDPRRIGADGLIALPGLVD MHTHLRQPGGEDAETVLTGTRAAAVGGYTAVHAMANTSPVADTASVVDQVLRLGEEAGWV QVRPVGAVTEGLGGARLAALSAMARSRARVRVFSDDGKCVSDPVLMRRALEYVKGFGGVI AQHSQDPALTDGSQMNESRLSGELGLAGWPAVAEEAVIARDILLAKHVGSRLHVCHLSTA GSVDLIRWGKRMGVAVTAEATPHHLLLTEDLASTYNPLYKVNPPLRSAEDVAAVREGLAD GTIDCVGTDHAPHPLEAKDCEWQAGAFGMTGLETALPILISTMVETGRMTWADLARVMSA APAAIGRVEGQGQHIRAGSRANITLVDPAERRTVDPQQQWTRSTNCPYTGMELPGRVRYT ILAGAITVDDAAPVAKEDR >gi|319979428|gb|AEUH01000011.1| GENE 5 3631 - 4833 1481 400 aa, chain + ## HITS:1 COG:ML0535 KEGG:ns NR:ns ## COG: ML0535 COG0505 # Protein_GI_number: 15827191 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Mycobacterium leprae # 6 391 4 375 375 403 57.0 1e-112 MSGDTALLVLEDGTVYRGRAWGARGRALGEAVFSTGMTGYQETLTDPSYHRQIVVMTAPH IGNTGVNDEDPESARIWVAGFVVRDAARRPSNWRSRRGLDEELSSQGVVAIADVDTRAVT RHIRERGAMRAGVFSGDALPAGADHLGPEAVAALVRIVAESPGMSGAALAGEVSTGGAYV VEPLGEFEGAEPVARVVAVDLGIKSRTPHHLAARGVQVVVVPSSYSFAQIADLRPDGVFF SNGPGDPSTAEHEISVLRSVLDARIPYFGICFGHQLFGRALGYGTYKLDYGHRGINQPVK DVETGRVQITAHNHGFAVDAPVGGPSISPYDSGRYGRVVVSHVGLNDGVVEGLAALDIPA FSVQYHPEAAAGPHDGEGLFDRFITLMAARRGAASAKEGR >gi|319979428|gb|AEUH01000011.1| GENE 6 4833 - 8132 
4674 1099 aa, chain + ## HITS:1 COG:Cgl1571 KEGG:ns NR:ns ## COG: Cgl1571 COG0458 # Protein_GI_number: 19552821 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Corynebacterium glutamicum # 1 1091 1 1108 1113 1284 62.0 0 MPRRTDLRSVLVIGSGPIVIGQACEFDYSGTQACRVLKEEGLRVVLVNSNPATIMTDPGI ADATYVEPITPEVVASIIAKERPDALLPTLGGQTALNTAVALAEEGVLERYGVELIGASI DAIRAGEDREAFKAIVERSGAEVARSFIAHTMEECHAAAEQLGYPLVVRPSFTMGGLGSG FARTPEDLERIAGQGLAASITTEVLLEESILGWKEYELELMRDKADNVVVVCSIENVDPV GVHTGDSVTVAPALTLTDRELQRLRDIGIAVIREVGVDTGGCNIQFAVHPGTGRVIVIEM NPRVSRSSALASKATGFPIAKIAARLATGYTLDEIPNDITGSTPASFEPALDYVVVKVPR FTFEKFPGADPSLTTSMKAVGEAMAIGRCFTEALNKALRSIDKPGAQFHWDGPDPSPEQT RALVERAGTATEERLVQVQQAIRGGAGIDELHASTGIDPWFLDQMVLLDEVARETIGAPA LTGDVLARAKRHGFSDRQIASMRRLSEDTVREVRGAFGIRPVYKTVDTCAAEFAARTPYH YSSYDLETEVAPRSREAVIILGAGPNRIGQGIEFDYSCVHAAMALRGDYETVMVNCNPET VSTDYDVSDRLYFEPLTLEDVLEVYRAECQAGPVKGMVVQLGGQTPLSLAADLERAGVPV LGTTPHAIDLAEDRGEFGKVLEAAGCPAPAHDVAHSPAQVLPVARSIGFPVLVRPSFVLG GRGMAIINDEEALGAYLDRPDIDVLFSRGPLLIDRFLDAAIEIDVDALYDGEELFMGAIM EHIEEAGVHSGDSSCVLPPMTLSARELGRITASTEAIARGVGVRGLINIQFAMLSDTLYV IEANPRASRTVPFASKATGVQLAKAAALIQVGRTIASLRAQGLLPGRDARTIPDGGSIAV KAAVLPFKRFRTAAGEVVDTVLGPEMRSTGEVMGIDRDFPTAFAKSQLGAFTELPTRGTV FISIADTDKRAIVLPAARMAELGFTVLATSGTASVLRRNGIAAQVVRKSSEGRGAEGEPT VVDLINEGRIDLVVNTPQRSANRHDGYRIRAAAAAHDRPAVTTLQAFNAVVQAIEVRARG AYSVRSLQEWDSHRSKEAR >gi|319979428|gb|AEUH01000011.1| GENE 7 8162 - 8998 960 278 aa, chain + ## HITS:1 COG:Cgl1570 KEGG:ns NR:ns ## COG: Cgl1570 COG0284 # Protein_GI_number: 19552820 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Corynebacterium glutamicum # 1 278 1 274 278 214 46.0 2e-55 MGFGDRLYEAMGAHGPVCVGIDPHAALLARWGLDDDARGLREFSLRVVEALGGRAAALKP QSAFFERHGSQGVAVLEEVVAACRELGTLCVVDAKRGDIGSTMEGYADAYLSDASPLAGD AVTLSPYLGAGSLAPALRLAERTGRGVFVLALTSNPEGADVQHCRSREGVSVAARIVSQL 
ADFNASCDKVHLGPAGVVVGATVGSAVADLGIDLAALRGPVLAPGVGAQGAGPAQVRDVF RGARSAVLASSSRAILESGPDLRGLRSAYKRAVTELAG >gi|319979428|gb|AEUH01000011.1| GENE 8 9075 - 9404 459 109 aa, chain + ## HITS:1 COG:no KEGG:Jden_1319 NR:ns ## KEGG: Jden_1319 # Name: not_defined # Def: hypothetical protein # Organism: J.denitrificans # Pathway: not_defined # 1 109 1 109 109 91 55.0 9e-18 MEEVIRMALPDLTPEQRAQALEKATQARRRRAEVKNALKARSMNLSEVLELAERDDAVAK MKVVTLLESLPRVGTNTAAVLMDQYKIAASRRVRGLGPIQRKQLVERFG >gi|319979428|gb|AEUH01000011.1| GENE 9 9432 - 9992 540 186 aa, chain + ## HITS:1 COG:Rv1389 KEGG:ns NR:ns ## COG: Rv1389 COG0194 # Protein_GI_number: 15608528 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Mycobacterium tuberculosis H37Rv # 7 183 23 200 208 178 52.0 7e-45 MTEAPALTVLAGPTAVGKGTVVAALKERHPDLRVSVSATTRPPRPGEIDGAHYHFVSDEE FDSMVESGDMLEWALVHGRNKYGTPRRPVDRAIAQGHPVLLEIDLDGARQVRRTRPDARF VFLAPPSWEELERRLVSRGTEDGAEQRRRLETARVEMRASAEFDHIVVNDSVERAVSELA ALMGLE >gi|319979428|gb|AEUH01000011.1| GENE 10 10028 - 10294 433 88 aa, chain + ## HITS:1 COG:Cgl1567 KEGG:ns NR:ns ## COG: Cgl1567 COG1758 # Protein_GI_number: 19552817 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, subunit K/omega # Organism: Corynebacterium glutamicum # 8 83 18 93 95 92 61.0 2e-19 MSGIVAQPVGITSPPIDDLLEHVDSKYALVIFAARRARQINLYNLQLAQNMIQFVGPVVE TSPDEKPLAIALREIDEGMLTLEASPGV >gi|319979428|gb|AEUH01000011.1| GENE 11 10296 - 11522 1582 408 aa, chain + ## HITS:1 COG:Cgl1566 KEGG:ns NR:ns ## COG: Cgl1566 COG0452 # Protein_GI_number: 19552816 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Corynebacterium glutamicum # 4 392 13 407 420 294 50.0 2e-79 MATVVVGVTGGVAAFKAVRVVRELMRAGHDVRVAATPASLNFVGPSTWAGLTGAPAVVDV FGAGAEHVELARAADLVLVVPATADALARIRAGMADDMVALTVLASTAPVVIAPAMHSAM WTNPATASNVAELRRRGYTVIDPEEGALGSGDTGVGRLPEPAAIASRALAVLAGRGRCGG PLAGAHVLVTAGGTHEPIDPVRYLGNSSSGRQGTAIARAAAAAGARVTLVAANIDDALVR 
GAGAGAEVVPVSTAVQMRDAVAARLEGADALVMAAAVADFRPKRALASKIKKDPDGDGAP VIELVRNPDILAEAARAPHRPRVVVGFAAETGTDEEVAAFGRAKARRKGADLMAVNRVGQ GEGFGDVPNRIEVLDADGRSVGAASGTKDDVAEYLVGLIASRLATLSA >gi|319979428|gb|AEUH01000011.1| GENE 12 11519 - 12706 1673 395 aa, chain + ## HITS:1 COG:TM1658 KEGG:ns NR:ns ## COG: TM1658 COG0192 # Protein_GI_number: 15644406 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Thermotoga maritima # 3 394 2 393 395 486 62.0 1e-137 MSRQLFTSESVTEGHPDKVCDRISDAILDALLEKDPRARVAVETMAATGLIHIAGEVTTD AYVEIPEIVRREILQIGYDSSRVGFDGASCGISISLDGQSPDIAGGVDEALEVRGGAAGD PRDRQGAGDQGIMFGYACTDTADLMPAPIWLAHRLAERLAHVRRNGTVPGLRPDGKTQVT LAYEGDRPVGIDTVVVSAQHDEELSQADLRDAIGEHVVAPVLAASGLGLDAGSMELLVNP SGRFVLGGPAGDAGLTGRKIIVDTYGGMARHGGGAFSGKDPSKVDRSAAYAARWIAKNVV AASLARRCEIQLAYAIGRANPVGLYVDSFGTGAVPDGAIADAIRQVFDLRPLGMIEDLDL LRPIYRQTAAYGHFGRGGFPWERTDRADDLRAALA >gi|319979428|gb|AEUH01000011.1| GENE 13 12703 - 13287 829 194 aa, chain + ## HITS:1 COG:ML1666 KEGG:ns NR:ns ## COG: ML1666 COG4243 # Protein_GI_number: 15827882 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mycobacterium leprae # 16 180 32 201 214 80 30.0 2e-15 MSATAWTRRTCAEMAVSGAVGLVASFVLSIEAWQLAMDSTTRFGCDVSSVLSCSTVAQTW QARILGFPNAFLGILFETAVLAVSVALFAGVRFPRWYMVCVNIMYTVALAFAYWLFLQSY FVIHVLCPWCLLITITTTLVFGGITRINIREGVLGLPQSARHFVEKGFDWSIWGLLVFII CAMVAARYGAGLIG >gi|319979428|gb|AEUH01000011.1| GENE 14 13324 - 13756 479 144 aa, chain + ## HITS:1 COG:MT1446 KEGG:ns NR:ns ## COG: MT1446 COG1198 # Protein_GI_number: 15840860 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Mycobacterium tuberculosis CDC1551 # 28 138 2 113 654 61 39.0 5e-10 MTGTQAAFDGFEEAPARTGEIAGVLVDVDLAHLDRPLDYAVPEELSGGAQVGRLVRVTLA GSRANGWIVSRRRGPIGARMAAVERVVSDLPVTTPAQLELASRIALRFLATSSQVLSVAV PARHARAERAVIADLSAPRAPAPT Prediction of potential genes in microbial genomes Time: Thu May 12 
16:57:12 2011 Seq name: gi|319979426|gb|AEUH01000012.1| Actinomyces sp. oral taxon 178 str. F0338 contig00012, whole genome shotgun sequence Length of sequence - 924 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 924 680 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase Predicted protein(s) >gi|319979426|gb|AEUH01000012.1| GENE 1 3 - 924 680 307 aa, chain + ## HITS:1 COG:Rv1402 KEGG:ns NR:ns ## COG: Rv1402 COG1198 # Protein_GI_number: 15608540 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Mycobacterium tuberculosis H37Rv # 1 307 143 455 655 187 45.0 3e-47 KAVWTALPGRHAAQIADLVRATRRGGRSALVVAPTAEEAEWLAGAIEAATGERAALMGAS VGPEERYAAHLRCLNGLDRIVVGTRSAVWAPVRDLGLVVVWGDGDDRLREQRAPRCDALD VAVQRCVVDGCALVVGSFSRSVKAHALVRSGWAVGVEAVRDAVRAATPRVRLYGSREADR TGEGRVVRFPSQALRLVRRACQGGAVLIQVASAGYVPVVSCQRCRTVARCPSCHGPLGLG AEGSMRCGWCGRAPSSWRCPHCSGTRLRALRVGADRTAEEVARALPEASVLESSAAHRVT RRLPARP Prediction of potential genes in microbial genomes Time: Thu May 12 16:57:13 2011 Seq name: gi|319979423|gb|AEUH01000013.1| Actinomyces sp. oral taxon 178 str. F0338 contig00013, whole genome shotgun sequence Length of sequence - 1381 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 + CDS 61 - 528 530 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase 2 1 Op 2 . 
+ CDS 561 - 1380 881 ## COG0223 Methionyl-tRNA formyltransferase Predicted protein(s) >gi|319979423|gb|AEUH01000013.1| GENE 1 61 - 528 530 155 aa, chain + ## HITS:1 COG:ML0548 KEGG:ns NR:ns ## COG: ML0548 COG1198 # Protein_GI_number: 15827201 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Mycobacterium leprae # 1 126 486 621 651 59 32.0 2e-09 MAGRPELWAPEEAVRRWFNAFALVRPGGDALVVGDLPDDMGQALVRWAPSDYADRLLDER EALGFFPAMTLVALDGDRAQVAQVADQCAREAGAEVVGTVPAPRRSKDSMDVRAIVRVPR DRGLRLLEVLTGVRQARASKKSPPVMMSVNPPELF >gi|319979423|gb|AEUH01000013.1| GENE 2 561 - 1380 881 273 aa, chain + ## HITS:1 COG:Cgl1562 KEGG:ns NR:ns ## COG: Cgl1562 COG0223 # Protein_GI_number: 19552812 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Corynebacterium glutamicum # 1 270 1 278 315 177 42.0 1e-44 MRIIFAGTPEAAVPTLTALIASTHDVALVVTRPPARSGRGRAMRPSAVAQCARGAGLAVL ETASLKGEEAAGAIGAVGADLGVVVAYGGLVPPDVLAMPAHGWVNLHFSDLPRWRGAAPV QWAVLSGDPMTASCVFALEEGLDTGPVYSREPFTIGHETSGELLDRMAAAGGAQVLACVD SLAAGTARATAQSDRGATRARRLTAADGYVSFDEDAPATDRRVRAVTPNPGAWTLGPGGG RIKLGPVTAAPDRGLRAGQVEAGKRDVHVGCAQ Prediction of potential genes in microbial genomes Time: Thu May 12 16:57:17 2011 Seq name: gi|319979413|gb|AEUH01000014.1| Actinomyces sp. oral taxon 178 str. F0338 contig00014, whole genome shotgun sequence Length of sequence - 8545 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 + CDS 72 - 1547 1619 ## COG0144 tRNA and rRNA cytosine-C5-methylases 2 1 Op 2 . + CDS 1558 - 2220 742 ## COG0036 Pentose-5-phosphate-3-epimerase 3 2 Op 1 . - CDS 2285 - 3058 634 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 4 2 Op 2 . 
- CDS 3137 - 4327 162 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 - Prom 4477 - 4536 3.1 5 3 Tu 1 . + CDS 4158 - 4478 108 ## 6 4 Op 1 21/0.000 + CDS 5019 - 5861 961 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 7 4 Op 2 . + CDS 5874 - 6899 1504 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 8 4 Op 3 . + CDS 6950 - 7897 1057 ## COG2378 Predicted transcriptional regulator 9 4 Op 4 . + CDS 7890 - 8544 844 ## CMM_1688 DeoR family transcriptional regulator Predicted protein(s) >gi|319979413|gb|AEUH01000014.1| GENE 1 72 - 1547 1619 491 aa, chain + ## HITS:1 COG:Cgl1561 KEGG:ns NR:ns ## COG: Cgl1561 COG0144 # Protein_GI_number: 19552811 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Corynebacterium glutamicum # 17 490 74 510 511 290 43.0 6e-78 MSQRRYADGYRKLRRADAPRRVAFEVLSEVTSSGAFANIVLPKALRRIRREEPFDDRDAA FTSELVHGTLRARGRLDWALARHVDRPLGDLDPRVLDLLRMGAHQLLDMRVPDHAAVSAT VDVAREHLTDGPVRMVNAVLRALAREEVGRIDEEIARIEDGDARLAVAHSHPEWMVRALR AALAAHGYDEAELEDALAADNEAPIVSLAARPGLIGVEELCEEAEDVLSARVATGLVSKC AVLLASGDPARLPSIRQGLAGVQDEGSQLAAQICAGAPLVGQGDSSWLDLCAGPGGKAAL LGALAAGRGAHVVANEIHPHRARLVERTTRALGTVEVVSGDGRTFGGQGTQWPLGSFDRV LVDAPCTGMGSMRRRPESRWRRAPGDVDGLVELQERLLDRAGALTRPGGVMTYVTCSPHR RETRDQVERLLGAGGWELLDAVALADSVAVEPLAVPDRAGRVDGGGAGRTLQLWGHRHGT DSMFVAALRRL >gi|319979413|gb|AEUH01000014.1| GENE 2 1558 - 2220 742 220 aa, chain + ## HITS:1 COG:MT1452 KEGG:ns NR:ns ## COG: MT1452 COG0036 # Protein_GI_number: 15840866 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Mycobacterium tuberculosis CDC1551 # 5 217 10 222 229 217 53.0 1e-56 MNATISPSILNCDIADLRGELEKISGADVAHVDVMDNHFVPNLSWGLPVAEAVVKTGILP VDAHLMIEDPDRWAPQYAEAGCQSVTFHAEAAQAPVRLARELRRIGSKVGLGLRPATDIA PFVDLLGEFDMILVMTVEPGFGGQSFLESMLPKIRRTRAAVNASGLEVSIQVDGGVSRST 
IELAAEAGADNFVAGSAVFRADDALEEVEVLRRLASAHCH >gi|319979413|gb|AEUH01000014.1| GENE 3 2285 - 3058 634 257 aa, chain - ## HITS:1 COG:SMb20579 KEGG:ns NR:ns ## COG: SMb20579 COG0596 # Protein_GI_number: 16265239 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Sinorhizobium meliloti # 16 248 23 258 268 62 26.0 1e-09 MATALALHEWGDPTAPPLLLVHGLTESATAWPDAVARWSPRYHVLAVDQRGHGASPRWDD QTLARAPRTLQEDLEGVLSSFPEPPVVVAHSLGALVSLRVGAARPDLVRALVLEDAARPT GDWEPDAWFVEHQERFLDAFADGGASERERMRRASSWSEAEIEGWAGCKAQVDRRYIREG TYLGEADLLSAVNRLEVPALYLAPRNGDMAPAPSEVVNPLVRLVLLDGVGHCVRRDDPEA YHSLADPFIEEASAAER >gi|319979413|gb|AEUH01000014.1| GENE 4 3137 - 4327 162 396 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 288 391 246 344 347 67 36 5e-11 MTAAVLAATAALVAAGCSGGLHQSAGAAQPAGPSQAPSSQAPGTAVPGYKPGEIPPIPLF SIPPIDVFAQNADKAVISSVSSAVASVPGVTVAPASCDAAGYKHSGNTTFSGDGSSLTQS GDLSASNNGDGSGYISEGPIWVSYDGQGGGTYRNDDDDISISVNPDGSGHYYSPSVDINR WADGSGNYSNKDTGDDIQIDPNGSSYFWNDKTGTRYFNYGNGSGYYTDDTGLYIINNGDG TATVNGDTTVPADPLPPVQPVGAFPPVAGLAPSQSCGTAITLDDSVLFDFGESTVRADAA DTLAKLASVLTDAGAPTAHVYGHTDSVGDDASNQTLSEQRAKAVVDELKKNGATTALDWK GFGETQPIAPNTNDDGSDNPAGRQANRRVEIYIPTF >gi|319979413|gb|AEUH01000014.1| GENE 5 4158 - 4478 108 106 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGGISPGLYPGTAVPGAWEEGAWLGPAGCAAPADWCNPPEHPAATRAAVAARTAAVIGFF LIMCGPPELRAHRGRRGAPLIGTPPPEPRALAVMGRSRQECLSRAA >gi|319979413|gb|AEUH01000014.1| GENE 6 5019 - 5861 961 280 aa, chain + ## HITS:1 COG:PA3512 KEGG:ns NR:ns ## COG: PA3512 COG0600 # Protein_GI_number: 15598708 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Pseudomonas aeruginosa # 50 272 22 244 258 105 30.0 1e-22 MGGVRRAPGVLFAVRTRMPERDTVARGSKPRGALAVVAPLAFLAICLLGWWATTSATGIE AWRLPSPAAVAARAVQLAHNPQTWSRAAVTGGEALAGCAIGTAVALPLAYVVYRSRLVSA 
AVEPFLGATQALPAIAIAPIVVLWTGYGPLSVALLCALMVFFPILVSSVVGLRHIDPELL EAAALDGASGATMALTMELPLAAHSILGGLRNGFTLSVTGAVVGEMVMGGTGLGQVLVQM RTNVDTAGMFVVIALLCLMATVLYAVIYRVERSRRYDTTR >gi|319979413|gb|AEUH01000014.1| GENE 7 5874 - 6899 1504 341 aa, chain + ## HITS:1 COG:DR0268 KEGG:ns NR:ns ## COG: DR0268 COG0715 # Protein_GI_number: 15805299 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Deinococcus radiodurans # 40 290 27 269 321 111 32.0 2e-24 MTLSRILRAAAALAVTALGLSACGSQQAAPASGGNDRSTSVVVGLTYIPNVQFAPAYVAV SEGIFTDNGVDASVRHHGSDEGLFTALLAGEEDVVIASGDEAAVASTQGMDLVSIGSYYR DYPGTVIVRDDSPIQSLADLKGATIGIPGEYGSNWYATLAALQGAGLTLADVTVSSIGYT QQAALTQGDVDAVVGFSNNDLVQMRLAGLDVRSIGLPDDAPLIGASIITTRAWAASHPDQ ARGVVASLGAAMEAIHQRPDAAIEATMAQTGTDSGAEAAARAVLEATDPLWVDESGSANT AQDLERWGMMADFLRQINAVTTDIDVSSVVTNDYAAAPRSS >gi|319979413|gb|AEUH01000014.1| GENE 8 6950 - 7897 1057 315 aa, chain + ## HITS:1 COG:ML1329 KEGG:ns NR:ns ## COG: ML1329 COG2378 # Protein_GI_number: 15827690 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Mycobacterium leprae # 43 302 43 326 331 83 31.0 6e-16 MVETLSAQARLLELFFEVLGAERGRSKARIRRMDAYRGLGERAFESQFQRDKDALRQIGV DLRVGAGPDGDTYSISPGSFARVGARLNPVQAVLVGLALQAWSPGDGVGPSAGPKVAAAS DGSGAPDLLRMDLTGVRAAADFATGIRERRVVSFDYEGARGTEERSVEPWRIVVRGRALY LWGFDLDRGADRLFRISRVGSAVSFLGQAGDADEPPTDLGDPFADRRVAPVLAVLGDPGR LARHLDGPVGEEPPSSPWRLWRGREADVGRWISRVLARAEDVVVVEPVELRDAVLRRLEA AARLDGDRGAEGGDA >gi|319979413|gb|AEUH01000014.1| GENE 9 7890 - 8544 844 218 aa, chain + ## HITS:1 COG:no KEGG:CMM_1688 NR:ns ## KEGG: CMM_1688 # Name: not_defined # Def: DeoR family transcriptional regulator # Organism: C.michiganensis # Pathway: not_defined # 12 216 17 218 329 92 35.0 8e-18 MPEGTSQSVVRLLSLVAWLWDHPGVGVDEAAAHFGRSKRQMLRDARYLAEVGDSLPGASL DLDWERLEEDGALVIRSALGADAPLRLSAQEATAILIGLQALSDVVGDEYRRQIPGAAMA VRALASEDGGGEVTLRSRGDDQDQDAALRGLFDALGRAIAGRERVSFTYTNGQGGVSSRR 
VSPWALERSATGWILRGWCHGAGAERSFRLDGIEGLEA Prediction of potential genes in microbial genomes Time: Thu May 12 16:57:28 2011 Seq name: gi|319979411|gb|AEUH01000015.1| Actinomyces sp. oral taxon 178 str. F0338 contig00015, whole genome shotgun sequence Length of sequence - 750 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 750 793 ## COG4581 Superfamily II RNA helicase Predicted protein(s) >gi|319979411|gb|AEUH01000015.1| GENE 1 3 - 750 793 249 aa, chain + ## HITS:1 COG:ML1333 KEGG:ns NR:ns ## COG: ML1333 COG4581 # Protein_GI_number: 15827694 # Func_class: L Replication, recombination and repair # Function: Superfamily II RNA helicase # Organism: Mycobacterium leprae # 11 249 2 252 920 239 52.0 4e-63 RYAAFKEQARSDASERARWAAALPFTPDAFQVEALDAVEAGAAAVLVAAPTGAGKTVVGD FGAHMARRGSMRAYYTTPIKALSNQKYLELSDMFGADQVGLATGDTSINPGAPIVVMTTE VLRNMIYAGAGLSDLAVVVLDEVHYLSDRMRGPVWEEVIIHLPRHVQIIALSATVSNAEE FGAWMGEVRGGCAVVVSEERPVPLYQHMVVGDEILDLYTPSGALNPRLVHLTAPRSQRGP GAAGRRGAG Prediction of potential genes in microbial genomes Time: Thu May 12 16:57:30 2011 Seq name: gi|319979408|gb|AEUH01000016.1| Actinomyces sp. oral taxon 178 str. F0338 contig00016, whole genome shotgun sequence Length of sequence - 3457 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 1943 1909 ## COG4581 Superfamily II RNA helicase 2 1 Op 2 . 
+ CDS 2003 - 3455 1624 ## COG0815 Apolipoprotein N-acyltransferase Predicted protein(s) >gi|319979408|gb|AEUH01000016.1| GENE 1 3 - 1943 1909 646 aa, chain + ## HITS:1 COG:MT2153 KEGG:ns NR:ns ## COG: MT2153 COG4581 # Protein_GI_number: 15841583 # Func_class: L Replication, recombination and repair # Function: Superfamily II RNA helicase # Organism: Mycobacterium tuberculosis CDC1551 # 1 639 248 902 906 392 42.0 1e-108 GGGARGGRRRESRPSTLVALDRAGLLPAITFVFSRAGCEDAVRQVVASGIALTSRATARR IGEYVDGVVRAIAPEDYQVLGVDAWREALVRGVAAHHAGMLPLMKEAVEHLFSQGLISMV YATETLALGINMPARTVVIESLQKWNGSSRAPLSAGEYTQLSGRAGRRGIDTEGHAVVSH RGGTAPEEVAALASKRTYPLVSAFRPTYNMVVNLLEHSTVEQARELLESSFAQFQADRAV VSLASRLRDARARAGALRGDLACQYGDAAEYCALRDRISRAEKEGARRRRAAAGAGARAV IDSARPGDVLAFRAGRRVRHVVVAASARAADGRTALRAVTMDAKWRTLTAHDFAGGVRAV GRMALPGAGGPRGRKGLVRLAADLVRLVRSGALAPFDGARGDGDDADELRARMRAHPVHR CPHREEHARAGAAWARADREAASLARAVESRTHSVVKQFDRVRRVLESLGFLDGGRVTAR GQQLRRVYGERDLVIAEALASGAWEGLDAPGLAAIVSACVYESRGDGASALPAGLGPAPR RAWEETSRVAARVARAEAAAGVDASPPPDPALMAACSAWAHGSTLATALADSGIAGGDFV RWTRQVIDALGQIESVDPAGQVGASARRARSLLARGVVAWSGVEER >gi|319979408|gb|AEUH01000016.1| GENE 2 2003 - 3455 1624 484 aa, chain + ## HITS:1 COG:Cgl1445 KEGG:ns NR:ns ## COG: Cgl1445 COG0815 # Protein_GI_number: 19552695 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Apolipoprotein N-acyltransferase # Organism: Corynebacterium glutamicum # 1 478 7 490 545 164 27.0 4e-40 MAVSAAAGSALYLAFPPVGWWWAAVPALALLVACVDRARAGRALLNTTAFGVAFWAPLIP WVVVSTRTPLAWAALVVSQVLFLWLWSLSVSLMRVWSWARSPVGQALGCALAWTGVEQAR SAFPWSGFPWGTVALPQVDSPLGRLAPYGGVALVSFAVMAVAVLVRRAFSLAGAGRERWW WRPAMLAGAAAVCVAPLAIDLPSAQEAGALTVGVVQGNISLPGSRAYSHEGEVTGNHASQ TRRMLESAGGGAVDLVVWGEGSADRDPAGSAVVARDLEDVSDEAGAPVLMGYSVLADDDH VWNWLALWYPGSGLDAQSRYAKQVPVPFGEFIPFRPLIASLATEAARQNHDMAAGGEPGL MTARLADGREVPIAVGICFEGAYESVIGEGVALGGQVIITPSNNYLFESSAESAQQAQLL RMRAMEYSRSAVQASTTGVSAVIRPDGSVQAATPVMSAASMVETVPLRTSLTPAARMGAL PARA Prediction of potential genes in microbial genomes Time: Thu May 12 16:57:31 2011 Seq name: 
gi|319979403|gb|AEUH01000017.1| Actinomyces sp. oral taxon 178 str. F0338 contig00017, whole genome shotgun sequence Length of sequence - 3884 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 118 - 906 1078 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Term 844 - 885 8.5 2 2 Tu 1 . - CDS 918 - 1283 502 ## Sked_18450 hypothetical protein 3 3 Op 1 . + CDS 1482 - 2234 973 ## SCO2335 integral membrane protein 4 3 Op 2 . + CDS 2267 - 3772 2105 ## COG0443 Molecular chaperone + Term 3794 - 3840 17.9 Predicted protein(s) >gi|319979403|gb|AEUH01000017.1| GENE 1 118 - 906 1078 262 aa, chain + ## HITS:1 COG:Cgl1444 KEGG:ns NR:ns ## COG: Cgl1444 COG0463 # Protein_GI_number: 19552694 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Corynebacterium glutamicum # 19 259 11 248 270 212 48.0 6e-55 MSHPSHLWRAPGNDGSGLVIVIPTYNERRTLPVVVERVQAALPAAHLLVVDDSSPDGTGE WADSLAQGDGRIHCLHRPSKSGLATAYVEGMEWALQRGYGFVAQMDADGSHRPEDLPKLV ARMGGPDRPDLVIGSRWVRGGRVNGWSRKRILLSKAGNRYVRFCLGTPVRDATAGMRLHR ASFLRDSGVLTRVSTTGFGFQVEMTQAEGARGARIAEAPITFDERMSGESKLSGAIFVEE LVMVTKGGLSRLAGAARRLVGR >gi|319979403|gb|AEUH01000017.1| GENE 2 918 - 1283 502 121 aa, chain - ## HITS:1 COG:no KEGG:Sked_18450 NR:ns ## KEGG: Sked_18450 # Name: not_defined # Def: hypothetical protein # Organism: S.keddieii # Pathway: not_defined # 1 113 1 111 114 117 56.0 1e-25 MADRALRGMQIGAKSLESEDGVVFADRFIVRYLCPKGHEFEVTLSTEATAPPTWECRCGR SADLLGETEEDDDAKPAKPQRTHWDMLLERRSMAELDQVLAEQLAAYREGRLRPEGHYRK G >gi|319979403|gb|AEUH01000017.1| GENE 3 1482 - 2234 973 250 aa, chain + ## HITS:1 COG:no KEGG:SCO2335 NR:ns ## KEGG: SCO2335 # Name: SCC53.26c # Def: integral membrane protein # Organism: S.coelicolor # Pathway: not_defined # 44 219 532 709 761 118 40.0 2e-25 
MTDASTRPPAPIARRWRPAVLEALVLVGCCCAYSVACFLAPVTREAAMAHAAQVASFEAR TGIDIELAANQWLVARPLIAQAASLQYAVSFFAFTAAALAIMWWRRPDRYHQARNALFVM TGGALVTYWTYPLAPPRLVEGNGIVDAVARHTSAYSSLFGTLANPYGAMPSMHTGWAVWV AAVLGAFVWKRAWQRALLALHPLVTIASIIATGNHYVVDAVAGVAYFLIACALVAIAGLL RNVRLDRPKS >gi|319979403|gb|AEUH01000017.1| GENE 4 2267 - 3772 2105 501 aa, chain + ## HITS:1 COG:Cgl2327 KEGG:ns NR:ns ## COG: Cgl2327 COG0443 # Protein_GI_number: 19553577 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Corynebacterium glutamicum # 1 487 1 473 484 422 47.0 1e-117 MKCGVDFGTTTTTVAVADRGNYPVVSFEDGADTRDYVPSVVAFDDGKLVFGFEADRAARA GAPHIRSFKRFLADPDVTASTSIRLGYRDVELMDVLTRFLAHVADCVRTGSSVSRVRKSE PLEVAVGVPAHAWSGQRFLTLEAFKNAGWDVTTMLNEPSAAGFEYTHRHAGTLNSKRTSV LVYDLGGGTFDASVVEAAGRRHEVLGSRGLNLVGGDDFDTILANLLATAAGTDSARLGEE RWAALLDDARAAKEALIPQSRQITVAIDDQQVTIPVTAFYEAATPLVASTIDALEPLLTP VDGVGQLGPDVAGLYVVGGGSQLPLISRVLRERFGRRVHRSPHTAASTAIGLAIGADPDS GFTVREQLSRGVGVFRELSSGSTVSFDTLLTPDMGRTPGQSAGLSRQYLAAHNVGFFRFV EYTSADEAGVPRGDIHPCGDVYFPFDRALQDSDADLSSVEVVRTEEGPLIQEHYWVDENG IVTVEIRDTETGYTVRKELGR Prediction of potential genes in microbial genomes Time: Thu May 12 16:57:42 2011 Seq name: gi|319979393|gb|AEUH01000018.1| Actinomyces sp. oral taxon 178 str. F0338 contig00018, whole genome shotgun sequence Length of sequence - 9986 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 113 - 676 764 ## COG0789 Predicted transcriptional regulators 2 2 Op 1 4/0.000 - CDS 794 - 1564 962 ## COG0789 Predicted transcriptional regulators 3 2 Op 2 . - CDS 1573 - 2007 695 ## COG1716 FOG: FHA domain - Prom 2052 - 2111 1.9 4 3 Op 1 5/0.000 - CDS 2123 - 2911 870 ## COG3879 Uncharacterized protein conserved in bacteria 5 3 Op 2 5/0.000 - CDS 2908 - 3240 452 ## COG3856 Uncharacterized conserved protein (small basic protein) 6 3 Op 3 . 
- CDS 3237 - 4079 1015 ## COG3879 Uncharacterized protein conserved in bacteria 7 3 Op 4 . - CDS 4076 - 6646 3542 ## COG0308 Aminopeptidase N 8 4 Op 1 . + CDS 6728 - 8308 2268 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase + Term 8323 - 8359 5.1 9 4 Op 2 . + CDS 8368 - 9879 1871 ## COG1160 Predicted GTPases Predicted protein(s) >gi|319979393|gb|AEUH01000018.1| GENE 1 113 - 676 764 187 aa, chain - ## HITS:1 COG:Cgl1409 KEGG:ns NR:ns ## COG: Cgl1409 COG0789 # Protein_GI_number: 19552659 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Corynebacterium glutamicum # 6 184 1 191 191 199 57.0 2e-51 MSAEPLHADAARQGMLFGDPLEGLDTDDMGYRGPIACSAAGITYRQLDYWARTGLLQPSI RAARGSGSQRLYSFKDILVLKIVKRLLDAGVSLQQVRVAVGTLQNRGVDDFASITLMSDG ASVYECTSTDEVIDLLQGGQGVFGIAVGRVWREVEGSLAELPIERPDDGAFVDELAQRRR QRMSSAS >gi|319979393|gb|AEUH01000018.1| GENE 2 794 - 1564 962 256 aa, chain - ## HITS:1 COG:Cgl1407 KEGG:ns NR:ns ## COG: Cgl1407 COG0789 # Protein_GI_number: 19552657 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Corynebacterium glutamicum # 33 248 28 243 252 129 37.0 4e-30 MANAARAFEPVDDEAPPGRGAGPWPQGASRAPSFSIGRVVESLRAEFPAVTLSKVRFLED QGLVRPARTGAGYRKYSEADVQRIRFVLTEQRDSFTPLRVIGEKLAALDAGHDPGPSRKA QVVASEGRVVSTQGRRFIPASDLCDLTGVDVDTVTRYARLGLITPDLAGYFPSRCVQVVA LLARIESLGVDARVLRTVRTGADRNADIIDQAVSSTRARGRSADKERASARSLELGEAVA DLHRELLRVSLSQLAE >gi|319979393|gb|AEUH01000018.1| GENE 3 1573 - 2007 695 144 aa, chain - ## HITS:1 COG:MT1875 KEGG:ns NR:ns ## COG: MT1875 COG1716 # Protein_GI_number: 15841297 # Func_class: T Signal transduction mechanisms # Function: FOG: FHA domain # Organism: Mycobacterium tuberculosis CDC1551 # 38 144 47 153 162 125 61.0 3e-29 MEPQQDFDPTATSVFGITPVPDDIEEAPVVLSARDREAVNALPAGSALLIVQRGPNTGAR FLLDPNVTNAGRSPKADIFLDDVTVSRKHCQFIADNGGHIVRDSGSLNGTYVNRERVDQA RLSAGDEVQIGKYRLTYQPSPQEA >gi|319979393|gb|AEUH01000018.1| GENE 4 2123 - 2911 870 262 aa, chain - ## HITS:1 COG:MT1873 KEGG:ns NR:ns ## COG: MT1873 
COG3879 # Protein_GI_number: 15841295 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mycobacterium tuberculosis CDC1551 # 50 259 74 288 292 106 33.0 4e-23 MTMDEPTPAPTAEEPAGPHGPHHDHLAHRLRERAHRAFWAVPRGPHAVLLVICMVLGFAL VTQVRSQRQDPIDSLSEQDLVVLLDELTTQERGLRTRRADLQDRLSALQNAASQQEARDE AAQNARVQAEINAGTVAVHGPGVVMRISDPAGALTTAQFVMTLGELRNAGAEAIELNGNR LSTRSSFTSADGRISVDGTAIDPPYEWKVIGSSQTISTALEIQAGSASQMRAKGAVVSID PVDDVVIDSLAVPLSPQYASFG >gi|319979393|gb|AEUH01000018.1| GENE 5 2908 - 3240 452 110 aa, chain - ## HITS:1 COG:MT1872 KEGG:ns NR:ns ## COG: MT1872 COG3856 # Protein_GI_number: 15841294 # Func_class: S Function unknown # Function: Uncharacterized conserved protein (small basic protein) # Organism: Mycobacterium tuberculosis CDC1551 # 1 110 12 121 121 94 58.0 3e-20 MIAVIGLAAGIVLGLIVDPAVPTWLTPFLPVAVVAGLDALFGAARAWLEGSFHDRVFVMS FFWNVVVACLLVFLGSQLGVGSAMTTAVVVVLGIRIFSNTASIRRLIFNA >gi|319979393|gb|AEUH01000018.1| GENE 6 3237 - 4079 1015 280 aa, chain - ## HITS:1 COG:MT1871 KEGG:ns NR:ns ## COG: MT1871 COG3879 # Protein_GI_number: 15841293 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mycobacterium tuberculosis CDC1551 # 132 272 155 305 309 80 35.0 4e-15 MSGGGDRGEDRAPGGGTAPRDPAASMSLLTSLLNNPLDAGYVAHGPRESGAARTVMGRAL VGAAALALGFASAITVRSLRSVHTDAVKDQLRGQVAAQQSAVADLQGQIQSLSSSVASYA GQSGAAGDAPALTLENSAQPVSGDGLVVTLADRAGRTGRGSGLVRDQDIAMVVNALWAAG AEAVSVNGQRVGPGTFIRTAGPTILVNITPVASPYSVAAIGDANAMSVALVRGATGDYLS SAQSVNGITVRTETASGLAMPALEQLPRKYARPNDQGGAQ >gi|319979393|gb|AEUH01000018.1| GENE 7 4076 - 6646 3542 856 aa, chain - ## HITS:1 COG:ML1486 KEGG:ns NR:ns ## COG: ML1486 COG0308 # Protein_GI_number: 15827782 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase N # Organism: Mycobacterium leprae # 4 762 6 772 862 454 38.0 1e-127 MQILSRNEASDRSGNIELSSVDVTVDVSGAPELAREDYPVTSRLALTTAAPRLFIDVAGR VDAVRIDGEDRAFTAEEDRVWIDGVPTGAPVEIEVDARCRYSRTGEGLHRYQDPEDGLVY 
LYTQFEPNDAHRAWPCIDQPDVKPRWTFHVTAPAGWVVSSNGPLDASEELPGAPDSVRHD FGTTPPLSSYITALVAGPWAVVDGGHWQGGAADGGHADLRLRLMCRKALEPFIDSDDILE VTRAGLDFYHARYGTTYPWGTYDQVFVPEYNLGAMENPGCVTFNETYLSRTAPTLSERQR RANTILHEMCHMWFGDLVTPAWWDGLWLKESFAENQGATAAAAATRYRGEWASFAVNRKA WAYEQDQMPTTHPIAADIPDVAAAKTNFDGITYAKGAAVLKQLVAWVGEDAFYAGARRYF QDHAFGATRLGDLLAALEAASGRSLEQWKGAWLETTGPSVLSASWSTGPFGEVTDFTLHQ SGGVLRPHRLVVSTWKFGGGRLEATHSFDVRIEGESAPIDPEGALAHPGGAAEVDMVVVN DQDLTYAVSRLDPHSTDVALAYAGTCPEAITRAVVWAALWNALRDGLVDPRRFVQSALVA VESEAEPAVRDRLLALAHSAIRDYLPGGARQGLRELLSAATIRYARETGDADAARAFTRA FITEFEAYGPGAFTELVRGYAAGDDIDLAWRARCALAARGLVDADGITAWRDADGSGEAQ RHAARALASLPDADSRAAAWESVFSGALSNDILSATLAGLAASSWEGDAGTGAAIDRMEE FWQSHTIGMSLRYVRGVLAVGLDIDRPGTVAQTLDALRAWLDSHEGAPAQLRRVVVEHCD SYERSRRVQEAWKEDR >gi|319979393|gb|AEUH01000018.1| GENE 8 6728 - 8308 2268 526 aa, chain + ## HITS:1 COG:slr0528 KEGG:ns NR:ns ## COG: slr0528 COG0769 # Protein_GI_number: 16332016 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Synechocystis # 41 518 31 500 505 273 38.0 6e-73 MSSVTDIRPSTPAVPLTSVLPGSAVRAAYGPDGAETAWADLGTRIAGVTVSSDDCEPQWL FVAIPGLSQHGIRFARAAKEAGAVAVLTDAAGASRAVSEGLGLPVVVVDDPRSHVAALAA AVYGHPARDLVTLAVTGTNGKTTTSYLMRAALAAVHPRAALCGTVETRVGPVSFEAERTT AESPVVQRFLALARECGEGAAVIETSAHALSLHRVDGVVFDVAAFTNLQHDHLDYYGDME GYFNAKALLFTPEHAKRAVVCVDDEWGRRLAASAPIPVTTVSALTDAPAHWRVRAAHPDP SIGRTVFTLVDPDGTGHEVRMPILGEVNVQNTAVALVSAVAAGVPLDGALRALEAAPQIP GRMEKVNPVPGNQPLVVVDYAHTPEALEWTLRSTRELTAGRLHIVFGTDGDRDASKREHL AGIAAAGADVLWVTDENPRTEDPQSIRDYLLRGIASQRPSMADVTEVVTCRRDAVRRAIL AAQPGDTVIITGKGAEWYQEVEGIKHRYNDVPVAREVLECDLRSHV >gi|319979393|gb|AEUH01000018.1| GENE 9 8368 - 9879 1871 503 aa, chain + ## HITS:1 COG:MT1753 KEGG:ns NR:ns ## COG: MT1753 COG1160 # Protein_GI_number: 15841173 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Mycobacterium tuberculosis CDC1551 # 43 501 8 458 463 502 63.0 1e-142 MSDHYPIHGTEADGAPAAPGEGGGQDDEARARAMRASLDQFELDEEDLALLDSPPSEDDD 
AGTRPGLPVLAVVGRPNVGKSTLVNRVLGRREAVVQDTPGVTRDRVSYPAEWAGRDFTLV DTGGWEVDVKGLDRSVAEQAETAVDLADAVVLVLDATVGVTASDERIVTMLRAKRKPVVL AANKVDSAVQEADAAYLWGLGMGEPHPVSALHGRGVGDLLDAVAAVLPDESAHAPAPPSG GPRRVALVGRPNVGKSSLLNALAGSSRVVVNELAGTTRDPVDELIELDGRPWWFVDTAGI RRKAHRTTGADYYASIRTQAAIEKAEVALVLIDGSEPLTEQDVRVVQQVIDAGRALVVVT NKWDLVDEERQRALRGELERDLVQIQWAPRINLAARTGWHTNRIVRALDTALEGWQTRIP TGHLNAFLGQLVAAHPHPLRGGKQPRILFGTQASSKPPRFVLFTTGFLDPGYRRFIERRL REAFGFAGTPIRISVRVRERRRR Prediction of potential genes in microbial genomes Time: Thu May 12 16:57:44 2011 Seq name: gi|319979387|gb|AEUH01000019.1| Actinomyces sp. oral taxon 178 str. F0338 contig00019, whole genome shotgun sequence Length of sequence - 5069 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 24 - 1013 1347 ## PFREUD_15790 acetyltransferase 2 1 Op 2 . - CDS 1064 - 1747 815 ## COG2761 Predicted dithiol-disulfide isomerase involved in polyketide biosynthesis 3 2 Tu 1 . + CDS 1722 - 1883 65 ## - TRNA 1821 - 1910 53.3 # Leu GAG 0 0 - Term 1939 - 1979 1.1 4 3 Tu 1 . - CDS 2005 - 2508 815 ## COG1846 Transcriptional regulators 5 4 Tu 1 . + CDS 2614 - 3273 705 ## COG0546 Predicted phosphatases 6 5 Tu 1 . 
- CDS 3270 - 5069 2106 ## COG0587 DNA polymerase III, alpha subunit Predicted protein(s) >gi|319979387|gb|AEUH01000019.1| GENE 1 24 - 1013 1347 329 aa, chain - ## HITS:1 COG:no KEGG:PFREUD_15790 NR:ns ## KEGG: PFREUD_15790 # Name: rimL # Def: acetyltransferase # Organism: P.freudenreichii # Pathway: not_defined # 6 323 21 360 365 198 40.0 2e-49 MGEEFIDPMTAVLQDRAYSRFTTIPWPYERSMAEANVAATAGSWERGGCDWAIVGEGTGE FLGRIELRPAALIPDCLELGYFTAKDHWGQGVMTRAVALALDTAFTAMGATRVQWYGDEG NWGSWKAVWRNGLRREGVQRRAGRRLWAAGVLATDPREPDTPWDGPGAGAPAALDPGRPM GLVRQFHQTYSVPDRLAEHAAPTLDYERLGMRVALVSEEYAELIGAVYGPAARSLVEEAA RAAEGADEGVRDVVEAADALADLVYVAYGMAVESGIDLDRVLAEVQASNLSKLMPDGTVR LRGDGKVLKGPGFFPPDVRRALGLGGGAG >gi|319979387|gb|AEUH01000019.1| GENE 2 1064 - 1747 815 227 aa, chain - ## HITS:1 COG:DR0659 KEGG:ns NR:ns ## COG: DR0659 COG2761 # Protein_GI_number: 15805686 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Predicted dithiol-disulfide isomerase involved in polyketide biosynthesis # Organism: Deinococcus radiodurans # 1 220 12 220 252 97 30.0 3e-20 MRIDMWMDTACPWCYLGLRHLRAALSAFPHGDQVEVVLRAHVLEPDLEGPVDTPRAAHLA RTTGMEPEAVAEEDERLRALGRAEGVVFDFESLVVAPTSRAHRVIAAAGEADIDAGATTG PDTAHLKAAEAIMRAHFEMGLDVSDPDVLIGCAQDIGLGAAVAAGALADGRWASQVYSDH QMGVGMGVDALPTYILGSRFVVQDRLSATAMANVLTTAWAHSREDGR >gi|319979387|gb|AEUH01000019.1| GENE 3 1722 - 1883 65 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSIHMSMRIRTPSGPRPLLYGKAPEPSGPGALFSCAEGDLNPHPRKMRTSTSS >gi|319979387|gb|AEUH01000019.1| GENE 4 2005 - 2508 815 167 aa, chain - ## HITS:1 COG:Cgl2461 KEGG:ns NR:ns ## COG: Cgl2461 COG1846 # Protein_GI_number: 19553711 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Corynebacterium glutamicum # 35 155 35 155 164 66 33.0 2e-11 MDDRPGLNPITDPGNNRAAAWRVFFEASGRLQGILETRLKRTFGVTMPDYNILLALWEAP KHRLRMGELADKVVYSPSRVTYLVSNLSRDGWVERVPSAIDRRGYDACLTTQGVETVLAA TELHQQTVREYLLDGMTDDDIGAIVKVFSTLDSRLKGFSARERKRRR >gi|319979387|gb|AEUH01000019.1| GENE 5 2614 - 3273 705 219 aa, 
chain + ## HITS:1 COG:MT2292 KEGG:ns NR:ns ## COG: MT2292 COG0546 # Protein_GI_number: 15841725 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Mycobacterium tuberculosis CDC1551 # 2 205 76 277 291 113 37.0 3e-25 MGTTFSCALFDVDGTVVDSAPAVVGVFEQTLRSFDLPVPERERLRAYVGPPLRYSFGDLG YAPPLLERLVATYRDLYSHCFLEPRPFDGIEGLFAALRSAGVALATATSKQAPMALAQFE HLGFMEYFDVVAGATPDPESTKTTVIHEALERLAGLGADVSCPVMVGDSVWDVRGAHEAG LPVIGVGWGYATEGGLDKADWRVETMEELAALCLGGGPV >gi|319979387|gb|AEUH01000019.1| GENE 6 3270 - 5069 2106 599 aa, chain - ## HITS:1 COG:Cgl0620 KEGG:ns NR:ns ## COG: Cgl0620 COG0587 # Protein_GI_number: 19551870 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Corynebacterium glutamicum # 3 599 508 1055 1055 484 46.0 1e-136 VAGLPRHMGIHPGGMVLTDQPVSSVCPISWGSVPGRSVLQWDKEDCEDAGLVKFDLLGLG MLSALRLAFDRLAGAPEGEPWGRRSAPLLGRRGLPIGLHTLPQEDPGVYDLLCAADTVGV FQVESRAQMSTLPRLKPRCFYDIVVEVALIRPGPIQGGSVTPFLRRRAGEEEVSYLHPLL EPALAKTLGVPLFQEQLMRIAADAAGFTPAQADRLRRALGAKRGVERVEGLRPALMEGMG RRGIDAATGAAIVEQLKGFADFGFPESHAFSFAHIVYASAWLKLRAPEHFYAALLACQPM GFYSPSSLVHDARRHGVRVAPPDVNHSRVGAVVEEADEGEGGVPRAPEPVPQAVPLDVDP RLAVRIGLGSIAGLGAAAERTATARGEGPYSSVGDLARRARLGEKDLERLAHAGALASLG ASRREGMWSAGALGAPRGPVGADGGWQPALPGIDPAPAPDLPAMDAVEELRADTESTGLT TGDHPFALVRAHLDPGALPVSALHGCDDGAIVSVAGLITHRQRPPTAAGTVFLTLEDESG MVNVTCAAGMWAAHRAAALRARAVEVRGRLERRDGAVGVRAHSLREVAAPVAARSRDFR Prediction of potential genes in microbial genomes Time: Thu May 12 16:57:55 2011 Seq name: gi|319979384|gb|AEUH01000020.1| Actinomyces sp. oral taxon 178 str. F0338 contig00020, whole genome shotgun sequence Length of sequence - 1797 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 656 618 ## COG0587 DNA polymerase III, alpha subunit 2 2 Tu 1 . 
+ CDS 729 - 1797 994 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases Predicted protein(s) >gi|319979384|gb|AEUH01000020.1| GENE 1 2 - 656 618 218 aa, chain - ## HITS:1 COG:MT3480 KEGG:ns NR:ns ## COG: MT3480 COG0587 # Protein_GI_number: 15842966 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Mycobacterium tuberculosis CDC1551 # 3 218 60 263 1098 100 38.0 2e-21 MSGTRPRYAELHAHSAFTFLDGTCEPAEMVRAAQGLGLEALAVLDVDGMHSAVRTTMAAR EAGLPIVHGSELTIDASSLRPALPGAGGPGWGLRPGAHDPGVRLPVLAASPQGYSALVSA MSDHALSRPGRRDSSHRLDDLSGRARDWLVLTGSGRGPLRRALRSHGTAGARRVLDSLVD LFGRDAVVVEVQRRAGDGPEEADALAGLARTRGLRLVA >gi|319979384|gb|AEUH01000020.1| GENE 2 729 - 1797 994 356 aa, chain + ## HITS:1 COG:PAB2382 KEGG:ns NR:ns ## COG: PAB2382 COG1063 # Protein_GI_number: 14521586 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Pyrococcus abyssi # 22 336 24 336 348 145 33.0 8e-35 MATMRAAALGSPGSLALTTRPVPEPGPGEVLLAVEATTICGTDLRIESGAKTRGVTPGVV LGHEVAGRVAALGGGLGAGAPAVGAQVGLAPEWVCGSCGPCACGRANVCENMRLFGTTVD GGLADYLLVPAPAVAQCVVAVERETDPALLALAEPLSCCLRAHARLGIGAGDTVAVLGTG PIGLIHCALAARAGARVIASGRPARLEPALRFGAAHATAATGAELVAEARRLTGGRGADA VIVAVGAPELAGVSLELAAVGARVSYFAGFSAGASATIDPNLVHYRELAVLGSANATHAD YAEAVRLLSSGALDLGGLVTHRFALGDVHAAFDAVRRRRGLKAAVLPGMGRGRRPL Prediction of potential genes in microbial genomes Time: Thu May 12 16:57:57 2011 Seq name: gi|319979380|gb|AEUH01000021.1| Actinomyces sp. oral taxon 178 str. F0338 contig00021, whole genome shotgun sequence Length of sequence - 5080 bp Number of predicted genes - 6, with homology - 2 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 40 - 525 296 ## PROTEIN SUPPORTED gi|229210357|ref|ZP_04336754.1| acetyltransferase, ribosomal protein N-acetylase 2 1 Op 2 . 
- CDS 575 - 3721 3672 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 3 2 Tu 1 . + CDS 4020 - 4214 333 ## 4 3 Op 1 . - CDS 4148 - 4369 203 ## 5 3 Op 2 . - CDS 4370 - 4537 135 ## 6 4 Tu 1 . - CDS 4766 - 5080 266 ## Predicted protein(s) >gi|319979380|gb|AEUH01000021.1| GENE 1 40 - 525 296 161 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229210357|ref|ZP_04336754.1| acetyltransferase, ribosomal protein N-acetylase [Leptotrichia buccalis DSM 1135] # 4 153 4 153 155 118 36 8e-27 MDDVEVLEGVDGDGAEALCGWVNARGAAFLEQWAGPALSFPLTRGRVAALEGLHSIRCGG AFVGVVQWMSTDGGEAHIGRFIIDPARTGRGLGRRALEAFLRIVFADERVSTVSLSVYED NAGARALYERMGFRVRGAREGARRALRMELARGGAEPTLPE >gi|319979380|gb|AEUH01000021.1| GENE 2 575 - 3721 3672 1048 aa, chain - ## HITS:1 COG:SP1523 KEGG:ns NR:ns ## COG: SP1523 COG0553 # Protein_GI_number: 15901369 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Streptococcus pneumoniae TIGR4 # 485 1047 485 1025 1032 337 36.0 8e-92 MPSSSFARSIASLDDDDIIACTGAAVWARGLAAYRAGKVVETKWEDGDRLVGRVRDGALT YKTRIVPQPARPGISCACPIGVDCAHGVATVIACREDLRGRALAEEAPWKRALATMIGTA ASGGEPLGLLVDAHDPSAPIWLKPLRRGARTAWTDKRAGWPDLTNTRWESVTEGINPTHL SLMREAYRRSHEGGAWRSRGEVSLESLGDGAFDWLRRLQRAGVALWAGMDPLTPLELEHT TWDVDLDASLGPDALTLRAVARDGDRVERLPRIAVEPRLLVSKGGTALARVNGVSLLDAL PEGGAVDVPVADLAEFRASWLPGLRRRAHVVSLDSTFADVEGARAAIVATVRLEGGDRVA VRWWAEYDFGAAPSRVPLAGAMASDPGLASRAERIDALGAPLSGDGLWRPVPSTARIPAW RAPAFLDAAVGALADAGVVWDIADEVRRIRVDDDGMRVTAVVDDSGTDWFGLRLEVTVAG AAIAMEDVLAALARGEDHVLVDGTWVALDGERVERLRALLAEAAVLTDSDAGSPRISAAQ AGLWGALSESADHVRASERWRRAVGLLLDGTDGPSGAGLAVSPALTARLRPYQRAGHDWL TARAGAGLGGVLADDMGLGKTVQLLSAVHSLRDADPEAPPVLVVAPTSVLGAWEEEARRF APDLDARVVAGTARRRGTSVAEECAGAHLVITSYTIARLEAGQWAAQEFSGLVVDEAQTV KNPRTAVHAALAAVRAPWCFAVTGTPVENSVADLWSVLALACPGLLPRWEVFNERIRKPV ESGRDPAALDRLRRLIAPFVLRRTKEEVAPDLPDKVETLARVELGEEHRHIYEQHLTRER ARALSLMEDFPRNRMDVLASITRLRQLALDPALVDGAYAHVGSAKTEYLVGQLVQIVPRG HQALVFSQFTSFLARIRRALERRGITVAQLDGATRGRARVIERFRSGAASVFLISLKAGG 
TGLTLTEADYVYVMDPWWNPAAEAQAVDRAHRIGQSKKVNVYRLVADDTIEAKVVELQDR KRRLVSKVVDGAAAGASLGAEDLRALLE >gi|319979380|gb|AEUH01000021.1| GENE 3 4020 - 4214 333 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGRSGVEHRSFCKRCQAFKGIPRPPGSCGGCSTSEVEFSKGYRQFASVITLAKCGIAVKA EALL >gi|319979380|gb|AEUH01000021.1| GENE 4 4148 - 4369 203 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVRAVARGEPFRAPRAQALTRTSAAEPERSALSSQQYRPVGEYLDLCAVGRPHSNASAFT AIPHFASVITLAN >gi|319979380|gb|AEUH01000021.1| GENE 5 4370 - 4537 135 55 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPTPIGSFLPQCDSPDPEFDGLVAPARWFPAPMRLTGPTTAPPLAISLSTRICAH >gi|319979380|gb|AEUH01000021.1| GENE 6 4766 - 5080 266 104 aa, chain - ## HITS:0 COG:no KEGG:no NR:no AAALYDQPRPALLLGAAPGRAPIGVTARGLLSAEPALLVLDPQGRARTVPVRPLAGPWAV AGRWWDAEGSARAYLRVALEGEGGAEDLLLVFRAGAWAVEGAYG Prediction of potential genes in microbial genomes Time: Thu May 12 16:58:17 2011 Seq name: gi|319979377|gb|AEUH01000022.1| Actinomyces sp. oral taxon 178 str. F0338 contig00022, whole genome shotgun sequence Length of sequence - 1526 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1118 1160 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 2 1 Op 2 . 
- CDS 1120 - 1449 283 ## gi|154508743|ref|ZP_02044385.1| hypothetical protein ACTODO_01251 Predicted protein(s) >gi|319979377|gb|AEUH01000022.1| GENE 1 2 - 1118 1160 372 aa, chain - ## HITS:1 COG:Rv3394c KEGG:ns NR:ns ## COG: Rv3394c COG0389 # Protein_GI_number: 15610530 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Mycobacterium tuberculosis H37Rv # 2 356 6 364 527 152 37.0 1e-36 MRRIALWVPDWPVNSLAAGLEAGEPGAVVASGRVEAATAAARSRGVRVGMRARDAFSLCP ELVCLPRDVEREGRAFEAVLSAFDQVAAGVECLRPGLAWANAAGPARWAGGEEGAARGLV EAVEERVGVECFAGIADGPLGAVCAARQGRIVPAGGTDRFLSGVPLGPVVTGLAPPGLRD RVEEAVALLRRLGIRVCADLLGMGATAVSSRFGAAGEFLWRAASGGELVLAGAPRPLGDI GTGVDIDEGGENVDAVLAPALRCAHDLADLLARRGLLADALLVEASAGGGGELTRRWRGV DLMSPDEIAERVRWNLLGWAEGAARPEGPIRSLRLTAVEPYPAAALTPLWGADARQRGVK RLAARLQGLAGE >gi|319979377|gb|AEUH01000022.1| GENE 2 1120 - 1449 283 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508743|ref|ZP_02044385.1| ## NR: gi|154508743|ref|ZP_02044385.1| hypothetical protein ACTODO_01251 [Actinomyces odontolyticus ATCC 17982] # 1 99 51 149 160 95 49.0 9e-19 MRAVLGSCPDGGWAAFVGVADIGWEWAHQEGLDLDRVLVVAVPGHVSAALVCSLCLDAVD VLCVGRAALSPEEQRRLAARARARGHRIITERPWPGVSRPLATGLREAV Prediction of potential genes in microbial genomes Time: Thu May 12 16:58:25 2011 Seq name: gi|319979370|gb|AEUH01000023.1| Actinomyces sp. oral taxon 178 str. F0338 contig00023, whole genome shotgun sequence Length of sequence - 3710 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 264 81 ## 2 1 Op 2 . + CDS 255 - 782 576 ## COG1881 Phospholipid-binding protein 3 1 Op 3 . + CDS 789 - 1250 609 ## COG0219 Predicted rRNA methylase (SpoU class) 4 2 Tu 1 . 
- CDS 1350 - 1832 598 ## gi|293192914|ref|ZP_06609758.1| hypothetical protein HMPREF0970_02113 5 3 Op 1 30/0.000 + CDS 2074 - 3447 751 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 6 3 Op 2 . + CDS 3474 - 3708 421 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes Predicted protein(s) >gi|319979370|gb|AEUH01000023.1| GENE 1 1 - 264 81 87 aa, chain + ## HITS:0 COG:no KEGG:no NR:no DGGREDGGLTSPVALSAAASALRAAASLSMADAPTSPSFRRTGVRFGHFRGGRPRLRPGA SGGARAAPVRQWRPRPIPGNPMEEPWI >gi|319979370|gb|AEUH01000023.1| GENE 2 255 - 782 576 175 aa, chain + ## HITS:1 COG:Cgl1487 KEGG:ns NR:ns ## COG: Cgl1487 COG1881 # Protein_GI_number: 19552737 # Func_class: R General function prediction only # Function: Phospholipid-binding protein # Organism: Corynebacterium glutamicum # 12 167 14 169 177 130 46.0 2e-30 MDLTTRPSARAPYEGMGVPSFSLTSDTLADGGTMPASATAQGGSRSPHLAWSGAPDGTRS FMVTCFDPDAPVPAGWWHWAVLDLPASASRLEEGAGRSDLELDGPAFHLRGDTGDASYFG AAPPPGDRPHRYVFSVHALDVDTLGLDDDATPTMAAFTALGHTLARADLTVTFQA >gi|319979370|gb|AEUH01000023.1| GENE 3 789 - 1250 609 153 aa, chain + ## HITS:1 COG:Cgl0628 KEGG:ns NR:ns ## COG: Cgl0628 COG0219 # Protein_GI_number: 19551878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Corynebacterium glutamicum # 2 152 6 156 157 174 56.0 6e-44 MLHIVFWEPEIPGNTGAAIRLSACTGSVLHLVRPLGFDMDDAKLRRAGLDYHDLAHVRVH DSLDGALAGIPGRVWALTGRAERAYTDVAYGDGDGLLLGRESVGLSEEAMSHPGVAGRVR ITMVPGVRSLNLANSAAIVLYEAWRQLGFPGGA >gi|319979370|gb|AEUH01000023.1| GENE 4 1350 - 1832 598 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192914|ref|ZP_06609758.1| ## NR: gi|293192914|ref|ZP_06609758.1| hypothetical protein HMPREF0970_02113 [Actinomyces odontolyticus F0309] # 13 156 34 176 182 200 72.0 3e-50 MPDQPDSPGSHGDQRPPSFGLGRIVIALFWLLGAWILVVAVVDLFHHNADEPWGPPILAV LAGAVYLAAATALTHNGRRMRIVGWTSIGVTIAAPLVLWVAGLGVPELNGPRSAWTGLGS 
DFYYAPLAVSLIGLVWMWRSNPRRIVEIAESIERPSRWQR >gi|319979370|gb|AEUH01000023.1| GENE 5 2074 - 3447 751 457 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 5 455 3 458 458 293 36 1e-79 MSEPQYDIVVLGAGSGGYATALRAAQLGMRVALIDGDKVGGTCLHRGCIPTKAYLHAAET ADAVRESARFGVRSTFGGIDMAQVAKYRDSVIGGLYKGLQGLLSSRGVEVVKGWGRLVSP DTVEVGGRALVGRNVVLATGSYSRSLPGLDIGGRVITSDQALQMDWAPQSAVVLGGGVIG LEFASVWRSFGAEVTIIEALPHLANNEDEAVSKQLERAYRRRGIKFHTNTRFAGAVQDEG GVRVVTEDGKSFDADVLLVAVGRGPVTEGLGYEQAGIRMDRGFVLTDERLRTGAGSVYAV GDIVPGPQLAHRGFLQGIFVAEEIAGLGPRMQADANIPRVTFCEPEIASVGLTEKQARQE YGDAVRTVEYNLAGNGKSTILGTSGMIKLVSVQDGPIVGFHGIGSRIGEQIGEGGLMVNW EAFPSDVASIIHAHPSQNESIGEAAMALAGKPLHVHN >gi|319979370|gb|AEUH01000023.1| GENE 6 3474 - 3708 421 78 aa, chain + ## HITS:1 COG:MT2272 KEGG:ns NR:ns ## COG: MT2272 COG0508 # Protein_GI_number: 15841706 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 1 78 1 78 553 101 70.0 4e-22 MATAVTMPALGESVTEGTVTTWLKSVGDRVEVDEPIVEVSTDKVDSEVPSPVSGVLLEIL VPEDETVEVGARIALIGD Prediction of potential genes in microbial genomes Time: Thu May 12 16:58:40 2011 Seq name: gi|319979367|gb|AEUH01000024.1| Actinomyces sp. oral taxon 178 str. F0338 contig00024, whole genome shotgun sequence Length of sequence - 1169 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 767 1138 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes + Term 1014 - 1049 0.4 2 2 Tu 1 . 
- CDS 813 - 1169 509 ## gi|154508737|ref|ZP_02044379.1| hypothetical protein ACTODO_01245 Predicted protein(s) >gi|319979367|gb|AEUH01000024.1| GENE 1 3 - 767 1138 254 aa, chain + ## HITS:1 COG:ML0861 KEGG:ns NR:ns ## COG: ML0861 COG0508 # Protein_GI_number: 15827386 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Mycobacterium leprae # 17 254 291 530 530 255 57.0 4e-68 SGAPAPSAAPSREPSPLRGTTEKMSRLRQTIARRMVESLQTAAQLTTVIEVDVTRVAALR ARSKDAFAAAHGTRLTFLPFFVKAATEALRYHPKINATIDGAQVTYFDHEHIGIAVDTPR GLLVPVIKDAGAKTLAGIAESINDLAARTRESKVGPDELSGSTFTITNTGSGGALFDTPV LNMPETAIMGVGTIVKRPMVVKGADGADAIAIRSMVYLSLSYDHRLVDGADASRYLMDVK RRLEEGDFEADLAL >gi|319979367|gb|AEUH01000024.1| GENE 2 813 - 1169 509 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508737|ref|ZP_02044379.1| ## NR: gi|154508737|ref|ZP_02044379.1| hypothetical protein ACTODO_01245 [Actinomyces odontolyticus ATCC 17982] # 1 118 327 444 444 66 34.0 5e-10 VRARDAAIVAGNASALDGVVGGALADADRKRIESLAEQGITVEKLETQVRVKEVVACTDS LIRARAALTQTALRACANGTCTDASAQPAQEVVLDIERASGLVVAAQGASGQSGAEGR Prediction of potential genes in microbial genomes Time: Thu May 12 16:58:53 2011 Seq name: gi|319979346|gb|AEUH01000025.1| Actinomyces sp. oral taxon 178 str. F0338 contig00025, whole genome shotgun sequence Length of sequence - 20513 bp Number of predicted genes - 21, with homology - 15 Number of transcription units - 10, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 920 837 ## COG0515 Serine/threonine protein kinase 2 2 Tu 1 . + CDS 1024 - 1716 925 ## Arch_0733 hypothetical protein 3 3 Op 1 31/0.000 + CDS 1984 - 2952 1725 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 4 3 Op 2 34/0.000 + CDS 2970 - 3908 1420 ## COG0765 ABC-type amino acid transport system, permease component 5 3 Op 3 . 
+ CDS 3905 - 4687 616 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 6 4 Op 1 . + CDS 4815 - 5750 1142 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 7 4 Op 2 . + CDS 5821 - 6579 680 ## COG2375 Siderophore-interacting protein + Prom 6603 - 6662 2.0 8 5 Tu 1 . + CDS 6871 - 7383 772 ## Arch_0990 membrane-flanked domain protein 9 6 Tu 1 . + CDS 7786 - 7983 168 ## - Term 7723 - 7768 8.7 10 7 Op 1 . - CDS 7975 - 8217 79 ## 11 7 Op 2 . - CDS 8217 - 9338 707 ## 12 7 Op 3 . - CDS 9386 - 12253 4489 ## COG0178 Excinuclease ATPase subunit 13 8 Tu 1 . + CDS 12310 - 12591 348 ## COG2350 Uncharacterized protein conserved in bacteria + Term 12784 - 12830 1.1 14 9 Op 1 . - CDS 12673 - 13350 862 ## gi|293190640|ref|ZP_06608931.1| conserved hypothetical protein 15 9 Op 2 . - CDS 13347 - 14000 832 ## gi|293190640|ref|ZP_06608931.1| conserved hypothetical protein 16 9 Op 3 . - CDS 14024 - 15772 1478 ## SACE_1016 hypothetical protein 17 9 Op 4 . - CDS 15789 - 16106 331 ## 18 9 Op 5 . - CDS 16151 - 17284 1091 ## COG1404 Subtilisin-like serine proteases 19 9 Op 6 . - CDS 17311 - 18717 959 ## 20 9 Op 7 . - CDS 18776 - 19186 287 ## - Prom 19232 - 19291 1.9 21 10 Tu 1 . 
- CDS 19327 - 20466 1511 ## COG0474 Cation transport ATPase Predicted protein(s) >gi|319979346|gb|AEUH01000025.1| GENE 1 2 - 920 837 306 aa, chain - ## HITS:1 COG:SP1732_1 KEGG:ns NR:ns ## COG: SP1732_1 COG0515 # Protein_GI_number: 15901564 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Streptococcus pneumoniae TIGR4 # 31 140 53 154 343 62 34.0 8e-10 MEIGGYELGTIIHMGANGPLWLTETSKGPGVVAIRGEREARGSVDRWRAWAGIDSPHVVR LLDVVRHEDGQWAVVQEYVEGRPLDIALASHALRSPRTRRRVWEGVAAGVSALHEAGIVH GDISPANIIVDPQGRAVIIDIIDPPRPGAGTRGWSLGAPEGREGDEACADRIGELLGVDP SDADSGGTDADSPVLVDGGVVDPAQIVADLRAAALREDTLAPGRTQEGAEGGAEQGGAPG RSRRRVAAMLALAVVLFAVGGVLVWRSGAFNGAPQTRTGQSAEGAGAGAAPEEQPPSDPC AAGAVE >gi|319979346|gb|AEUH01000025.1| GENE 2 1024 - 1716 925 230 aa, chain + ## HITS:1 COG:no KEGG:Arch_0733 NR:ns ## KEGG: Arch_0733 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 3 230 9 236 236 190 44.0 4e-47 MSEKKPPAKRRWYNNMADAYRVSARSYPWIPWVLVGAAISVIGLFLIIAALLHLNWIAWL FVGLLAALTTDMAILSFLVPRALYSQIEDTPGSSKAVLQQIKRGWIIPEEPVAVTREQDL VWRIVGRPGVVLISEGPSSRVRPLLNAEAKRVTRILQNVPVHQIQVGKDEGQVALAKLQA AINKLKNALTSEEVPQVSSRLAALKANNPPIPKGIDPQRARPNRRALRGK >gi|319979346|gb|AEUH01000025.1| GENE 3 1984 - 2952 1725 322 aa, chain + ## HITS:1 COG:Cgl1300 KEGG:ns NR:ns ## COG: Cgl1300 COG0834 # Protein_GI_number: 19552550 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Corynebacterium glutamicum # 64 319 77 331 334 143 36.0 4e-34 MHTPHKPATAALLALTTASALVLAACSDPGATTQSASDPNGAAASGGAATTDVTPFDVSQ IPTVDEIAALVPDEVKTRGTLRNGASADYAPGEFRASDGQTVVGYDADLTRALAKVMGLK DGTTSHAEFPTIIPSIGTKFDVGISSFTINSEREQQTNMIAYVQVGSAYGVAQGNPKGFD PAQPCGKTIGVQTGTAQEEYINKLSQECVSAGNEAITVMPHDVQTSVATKVVGGQYDATL 
ADSTVIGYTAALSQGKIEQIGDVIESAPQGIAVNKQDAALTEAVQKAAQYLMDHGYLQQI LAPYGAEGAALTTAELNPQVND >gi|319979346|gb|AEUH01000025.1| GENE 4 2970 - 3908 1420 312 aa, chain + ## HITS:1 COG:Cgl1299 KEGG:ns NR:ns ## COG: Cgl1299 COG0765 # Protein_GI_number: 19552549 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Corynebacterium glutamicum # 11 297 18 299 316 238 47.0 1e-62 MPHVKDGVELLDAKPVPRPGRWVGAGVVAVIVAMAVHGLVTNPNYQWSTVWAWLFSQSIM SGVLFTIILTVLAMLIGTALAITMAIMRQSINPVLKWVATAYIWFFRGTPIYTQLIFWSL LPTLYPKLSLGIPFGPEFVTFETAAYFTPFWMAFVGLGLNEGAYLAEIMRAGLLSVDKGQ WEAATALGMPRSLIFRRIVLPQAMRVIVPPIGNETISMLKTTSLVSAIPFTLELTFVARQ KGQAMFAPVPLLIAAAIWYLLITTLLMWIQSYIEKHFAKGFDRRETTGASPAQSDSDDEP SPSQKRFLDVTP >gi|319979346|gb|AEUH01000025.1| GENE 5 3905 - 4687 616 260 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 10 259 1 242 245 241 50 2e-63 MTPDPSPQPMVSVRGVHKFFGDLHVLRGIDLDIASGEVCVILGPSGSGKSTLLRCLNQLE RISAGRVYVDGTLLGYREDASGQLHDLKDREIAAQRARIGMVFQRFNLFPHKTALENVME APVHVAKQPADQARARALELLDRVGLADRADHYPAELSGGQQQRVAIARALAMDPEIMLF DEPTSALDPELVGEVLAVMQDLAASGMTMAVVTHEIGFAREVADHVVFMDGGAIVEAGTP AQVIDNPSSDRTREFFSKVL >gi|319979346|gb|AEUH01000025.1| GENE 6 4815 - 5750 1142 311 aa, chain + ## HITS:1 COG:Cgl1480 KEGG:ns NR:ns ## COG: Cgl1480 COG0667 # Protein_GI_number: 19552730 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Corynebacterium glutamicum # 7 308 3 311 312 154 35.0 2e-37 MVTTFCGRSGLRVSSIGLGTLTWGRDTDAAEACEMLARFVDAGGNLVECSPMHGEGRATD VLAEAVAAVGRHRVVLAWRGASRPDADSWRPAGGRGDMLRSLDDALQRLGSDFVDVWLVS PEAGVPLEETLSAAEHAHRTGRARYIGMASSRTWDIAEAVTRGAMPGGAPITAVEAPLSL VDTRLAGLVAELEDRGIGFFAASPLAGGALTGKYRHSTPPDSRAASPHLRHLVEPHFTAA ARGVIEATVRAAEGLDRTAADVACAWASGYPGVSSAIIGPRTTRQLDAVLEREAPLPVPI RQVLNEVAGLR >gi|319979346|gb|AEUH01000025.1| GENE 7 5821 - 6579 680 252 aa, chain + ## 
HITS:1 COG:Rv1348_1 KEGG:ns NR:ns ## COG: Rv1348_1 COG2375 # Protein_GI_number: 15608488 # Func_class: P Inorganic ion transport and metabolism # Function: Siderophore-interacting protein # Organism: Mycobacterium tuberculosis H37Rv # 4 249 3 249 430 210 43.0 2e-54 MAKRGFQGAVLKVLRAREYPLAVLERSPLTPSCVRIVFSAPELLGAVPLSAASWLRLWVP DPERPDVEHQRGYTIAAHDARAGTIAVDFVQHEPSGPASTWAARAEAGDRIAAMSMGSTE VSFAGQDQPEGYLMIGDAASVPAIRSLVAEAPAHVPVRVYLEEHSDSDRDLPFASHPRLE VRWVPREGPESLARAVDPGDYRGWYAWATPEFRSLRALKLRLKEWGLAKGRLHAQAYWIE GRGLGRSRGEKR >gi|319979346|gb|AEUH01000025.1| GENE 8 6871 - 7383 772 170 aa, chain + ## HITS:1 COG:no KEGG:Arch_0990 NR:ns ## KEGG: Arch_0990 # Name: not_defined # Def: membrane-flanked domain protein # Organism: A.haemolyticum # Pathway: not_defined # 1 157 1 157 166 119 49.0 3e-26 MALSKKVLSRDEVVVRHMHTHIKVLLWRIIAWLLLIAAAAAGTVFIPADWQPWAAIALWA LVLVVSFPLLFIPWLLWYMHTYTVTTKRVITRRGVFRRTGHDLPLTRISDVQLVKDFSDR FFGCGTLALQTSSDDPLLLHDVPKVEMVQVEISNLLFEDVQGAIDADPDS >gi|319979346|gb|AEUH01000025.1| GENE 9 7786 - 7983 168 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPPSVPVPPAHALPQWVVSCPNAVRCAREGRYADEAESVLTAMPPPSQRYHTVQALSHLQ SVVSL >gi|319979346|gb|AEUH01000025.1| GENE 10 7975 - 8217 79 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGRRPLLQRSKKGGVARAASGRLVGLSQSAIAQRPRSSSSPRPYTGFCHTQTAKGHFSIG KRTRQRQRSGSFASEVEFSQ >gi|319979346|gb|AEUH01000025.1| GENE 11 8217 - 9338 707 373 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDIVGALGSLFPVLNSHPWAMAALIGVVVVAVAMWLAPRIIVATESLINIRSRLKDPDD WASGDSFLRTSDPVLWRQNLVFNRLRLDYLNSARVHSQIRRKETPISMLFLLQLIALCAA LFSVRASPTFFQPVGSEAWRQFIPPFLLNAIILVSIALYAVAEIAFLLVLYRWIKDDEKS KKTDTLSHYLRNLGFGEQLSSHIATCWYHFKVEHRDPPIALDEHRVRTRTGVLLSFRVEH REDDGVLLKDEYVGAMNDRRLVKCPTCGKRVWSYSRSWLFFKWLRWSKGSVKSSGNWTAR DVTLALGALSHLFSAQNVEKYLRRHESCVGHGGSAEAGAGGPEGAGGDDRAAGNGWFSRA LRRQRQRRQPLSR >gi|319979346|gb|AEUH01000025.1| GENE 12 9386 - 12253 4489 955 aa, chain - ## HITS:1 COG:ML1392 KEGG:ns NR:ns ## COG: ML1392 COG0178 # Protein_GI_number: 15827727 # Func_class: L Replication, 
recombination and repair # Function: Excinuclease ATPase subunit # Organism: Mycobacterium leprae # 1 946 1 954 969 1332 69.0 0 MHDQLVIQGAREHNLKDVSLTLPRDAMIVFTGLSGSGKSSLAFDTIFAEGQRRYVESLSS YARQFLGQMDKPDVDFIEGLSPAVSIDQKSTSRNPRSTVGTITEIHDYLRLLYARAGIAH CPQCGARIQAQTPQQIVDRVRTMDDGTRFQVLSPVVRGRKGEYQDLLDELKADGYTRAII DGEAVRLENAPKLEKKLKHTIEVVVDRLVVRSGIRQRLTDSVETALRLSDGLVVIDCVDL DEADPARRRRFSEKRACPNDHPLSLDEIEPRTFSFNAPYGACPDCTGLGFRNEVDPELVV PDEELSLAEGAVGVWSTNFKYHKRLLESLAAELGFDMSTPWKDLPKKAKDAVLYGKDYEV QVKFRNRWGRERSYTTGFEGAVAYIERKKEETESEWVTEKLDGYMRQVPCPSCHGARLKP EVLAVTVGGLSIAQLSDLSIADARAHLAELELEDRSASIAAPILTEIAARLDFLVDVGLT YLTLSRAAGTLSGGEAQRIRLATQIGSGLVGVLYVLDEPSIGLHQRDNRKLIATLERLRD LGNTLIVVEHDEETMEAADWIVDIGPGAGETGGWVVHSGTMDSLLVNKKSLTGQYLSGRR AIALPAARRSIDRRRLITVKGARENNLRGIDVSFPIGVFTAVTGVSGSGKSTLVNQILYR SLAARLNGARTIPGRHRTISGISGLDKVVHVDQSPIGRTPRSNPATYTGVWDQIRRLFAE TQEAKVRGYGPGRFSFNVKGGRCESCKGDGTLKIEMNFLPDVYVPCEVCHGARYNRETLE VEYKGKTVADVLSMSVAEAAEFFAPISRIARHLNTLVEVGLGYVRLGQSATTLSGGEAQR VKLASELQRRSTGRTVYVLDEPTTGLHSEDIRKLLLVLQSLADKGNTVIVIEHNLDVIKS ADWVIDMGPEGGSGGGQVVAEGTPEQVSEVPESHTGRFLVDVLRKGKVRQIAEGA >gi|319979346|gb|AEUH01000025.1| GENE 13 12310 - 12591 348 93 aa, chain + ## HITS:1 COG:Cgl1920 KEGG:ns NR:ns ## COG: Cgl1920 COG2350 # Protein_GI_number: 19553170 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Corynebacterium glutamicum # 1 93 1 95 95 76 38.0 1e-14 MAYFAVEYVYDPSKTELMDQVRPRHRAYLGALRDEGKNLGSGPLSGDAPGALLILKADAE ADVLAMLDADPFHEAGVVSTRTVREWNPVIRHF >gi|319979346|gb|AEUH01000025.1| GENE 14 12673 - 13350 862 225 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190640|ref|ZP_06608931.1| ## NR: gi|293190640|ref|ZP_06608931.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 10 220 4 222 222 67 27.0 5e-10 MKRGRLRALAVAWCACVAVAVCGCGRGTDIFGRPDPLLPFDKRAPIGVYLSDAEPLIERF VDALAKQNGGALGYDGSAYINGCGETGDEHGIRVRSPGLKILMTDAMDLRALSEQILEPA GFTPPDTFAPDEERSLVWGEDRNGTLLSVYYVPGEMVRFYYHSGCVPFGDMSEDDLKAKV 
YGRDLTQTFPDLVLYQSFDTDGNPQGPPDQQSGTSAQSTQSGGGQ >gi|319979346|gb|AEUH01000025.1| GENE 15 13347 - 14000 832 217 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190640|ref|ZP_06608931.1| ## NR: gi|293190640|ref|ZP_06608931.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 2 212 4 222 222 68 27.0 3e-10 MAVVWCACVAVAVCGCGRGTDIFGRPNPVLPFDRRAPIGVYLSDTEPLIERFVDALAKQN GGALGYDGSAYFNGCGETGDEHGIRVASPIFKVLMTDAMDLRALSEQILEPAGFTPPDTF APDEERSLVWGEDRNGTLLSVYYVPGEMVRFYYHSGCVPFGDMSEDDLKAKVYGRDLTQT FPDLVLYQSFDADGNPQGPPDQQSGTSAQSTQSGGEQ >gi|319979346|gb|AEUH01000025.1| GENE 16 14024 - 15772 1478 582 aa, chain - ## HITS:1 COG:no KEGG:SACE_1016 NR:ns ## KEGG: SACE_1016 # Name: not_defined # Def: hypothetical protein # Organism: S.erythraea # Pathway: not_defined # 1 538 1 503 566 160 29.0 1e-37 MVAWSDLEQWRSDYLDESANAGGDGRRRTRTQVEELAGHFDRFRSEGQAADAMRGAGTAL QDDLDHLVQVASEYMLACEEAADGVRRVEAKVRSAQEISAEIGYPIKEDGSLDPFVREYE NWTITVPGTMPTTQTTREQSYASIVKELKYGELQQYIADALRIAEEVDAALEKRLSALAN GTFDGGGAREGRDSKSPGLPDDADPSWSPAEVSAWWHALSDAERQECIERDPATYGNLDG IDMASRDKANRLVLHGYTDSGGNHVPGLIEKAEAAVAAAQDKINNAGYQSPRESDLGAAL RLDLKNAQHDLEELRRLDAQLQRTGADGAPTSLLVLDPSGERLKAAVAVGDVDNAKNVAT FVPGMGTNVHDSIESYVDTAMRLQENTAAVSSSPRADTAVVAWLGYDAPQNDVSVASTEK AEAGAPRLNNFLTGIKSWRWEGGGDLHQTVVSHSYGSTTAGLAMKDIGAGVVDDFIYTGS PGAGTSTVGSLGVDPSHVWVSAVEHLDWVKGIPPDQWESFGRDPVDLEGIQHLSGDASGA SSYHANYFGWIFANHSSYFDAPAPGQHNEALDDICRVIGGAK >gi|319979346|gb|AEUH01000025.1| GENE 17 15789 - 16106 331 105 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTDLNIIAGDLEALRDGADAVAEALGSAPFGDAAGYVASGMPGSRSAQVVLQACQSIDDA WAALAQGLEDYAFNIGETISAFATTEDANSVVFQNLAAASQADAY >gi|319979346|gb|AEUH01000025.1| GENE 18 16151 - 17284 1091 377 aa, chain - ## HITS:1 COG:VC0157 KEGG:ns NR:ns ## COG: VC0157 COG1404 # Protein_GI_number: 15640187 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Vibrio cholerae # 16 289 127 398 535 79 25.0 1e-14 
MRIAVCAAALVAGTGVVGAAPALAVEEITTQDYVSVLGIDKAVARGLTGTGVRIGVIDGP ADTGLPELEGANVRVRKMCDFEGSDTHKTHGSAMVSILSARGYGVARGAEVVNYSFPRRW SGKGDDDTVNDTKGCLGIGMEETLDAAVAEGMQIISISMGGGDLSESLRDALVRALAKNV IVVVSTGNEGREDPEDSLASLNGVVGVGSSDNQGLIWRYSNFGKGLTVLAPGENLKVHDF ASGEVVSVTGTSVSTPIVAGALAVAMQQWPQAAPNQILQSLVGTATNGRGGFPLLSTGGL DRTDPTGYPDANPLMDKFAGEEPSAASVADYTDGLASRDSFFKADEAYVYRGVDPEVVDK HPGQSALGTSPRYHRQD >gi|319979346|gb|AEUH01000025.1| GENE 19 17311 - 18717 959 468 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGEAGKTKGGALKQDQFFRRMQAVADLRYDDSADEEAYRSALARIDECDGAVLRYAQENG YHSTMVNDAIARWAYEFRCRLQAARARLQEGFATIGGARDTMRSLRDDFSQNVASELLTP EEESLRESSKGQMVVVPAGGHYMYTPSECYFETLEVQRRTEREDYCKGLLDQLNDQVGQH GSSVQETATKQGTAWSEHFPATPEKPSRPAPADQSAGGAVAADGHSGAAPDGPGGPAAPQ GGPAPGFDLAAEGFQRPGLPQQAPDSAVVDDLDGVDLVGRPINMTVTPNGLVGGYVPPSA VNASDPRWDPTYRMPTSVRQVSAATAGALGAGLLGPGRSLPSARSVLGGRVAGAPGRASG IGGTLPAGGKATSVGTGAPASAPGGQPMMAPIAAGVPVSGAAEERQKADGRRSDNGENPE EEDQEKEVVPLPYFDSEPKAEPVWDPAHGPGSADDGVEFSFDLEEWEL >gi|319979346|gb|AEUH01000025.1| GENE 20 18776 - 19186 287 136 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNDRGGVDPEVEGQALSRAAVGTTDLDEAGEKGVQALASEQTELKWGTETGPQTARSAFK ALLDAVDAELLECRTALEQTLEDSKRAVAGYEATDEDIAARLAAAFAGESYGAGAQVDNS TPADGYDDYSPGGDGE >gi|319979346|gb|AEUH01000025.1| GENE 21 19327 - 20466 1511 379 aa, chain - ## HITS:1 COG:Rv0908 KEGG:ns NR:ns ## COG: Rv0908 COG0474 # Protein_GI_number: 15608048 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Mycobacterium tuberculosis H37Rv # 20 356 436 780 797 251 45.0 2e-66 MSAPAALVEDRLPEARLAAAVITLEEDLRQDAAETLDYFRQQGVHVRVISGDNPTTVGAL AAQAGLTAPGGGEPRLMDARELPEDTASEEFARAIASHDVFGRVTPEQKRAMVGALQGRG HCVAMTGDGVNDALALKDADLGIAMGNGSQATKAIAQIVLVDSKFSVLPGVVAEGRRIIA NMERVSSLFLAKTTYAAILVIITALVGWRYPFLPRQFSYIDSLTIGIPAFFLALWPNRRR YVPGFLRRTLSLAMPTGAVLAAAALTAFGIVEGPPSSRESTAAILALMVGAIWLLVITSR PLTPVKCALLTAVAVATVGGVALAPVRDFFQMVWPTPWEWAVIVGVGLCAGALIEAGQRI FYRTAYRRSLEEGGATAQD Prediction of potential genes in microbial genomes Time: Thu May 
12 17:00:37 2011 Seq name: gi|319979330|gb|AEUH01000026.1| Actinomyces sp. oral taxon 178 str. F0338 contig00026, whole genome shotgun sequence Length of sequence - 16387 bp Number of predicted genes - 14, with homology - 12 Number of transcription units - 9, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1217 1521 ## COG0474 Cation transport ATPase 2 1 Op 2 1/0.000 - CDS 1210 - 2208 1451 ## COG0861 Membrane protein TerC, possibly involved in tellurium resistance 3 1 Op 3 . - CDS 2205 - 3257 1482 ## COG0861 Membrane protein TerC, possibly involved in tellurium resistance 4 2 Op 1 . - CDS 3461 - 5533 3301 ## COG0556 Helicase subunit of the DNA excision repair complex 5 2 Op 2 . - CDS 5543 - 6031 558 ## COG0237 Dephospho-CoA kinase 6 3 Tu 1 . + CDS 6260 - 7672 1620 ## COG1114 Branched-chain amino acid permeases - Term 8102 - 8144 16.3 7 4 Op 1 2/0.000 - CDS 8174 - 9700 2150 ## PROTEIN SUPPORTED gi|227495256|ref|ZP_03925572.1| 30S ribosomal protein S1 8 4 Op 2 . - CDS 9783 - 12494 3491 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 9 5 Tu 1 . + CDS 12472 - 12963 610 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism 10 6 Tu 1 . - CDS 13014 - 13604 886 ## COG3707 Response regulator with putative antiterminator output domain - Prom 13685 - 13744 78.9 + TRNA 13668 - 13744 58.9 # Leu CAA 0 0 + Prom 13669 - 13728 78.0 11 7 Tu 1 . + CDS 13748 - 13972 85 ## + Term 14013 - 14059 10.0 - Term 14003 - 14040 9.2 12 8 Tu 1 . - CDS 14055 - 14270 241 ## gi|154508855|ref|ZP_02044497.1| hypothetical protein ACTODO_01366 - Term 14509 - 14543 3.1 13 9 Op 1 . - CDS 14643 - 16085 1648 ## COG0469 Pyruvate kinase 14 9 Op 2 . 
- CDS 16176 - 16385 140 ## Predicted protein(s) >gi|319979330|gb|AEUH01000026.1| GENE 1 2 - 1217 1521 405 aa, chain - ## HITS:1 COG:MT0931 KEGG:ns NR:ns ## COG: MT0931 COG0474 # Protein_GI_number: 15840327 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Mycobacterium tuberculosis CDC1551 # 1 405 1 403 797 262 45.0 9e-70 MSNAVTTTAGLTAAQVAQRVSLGQVNKVREQTSRSLAAIVRENVFTLFNAIIVGASVIVL LFGHIQDGIFGGVMVINAVIGIVSELRAKRTLDALAIVDAPRARVVRDSTAQQIPAAEVV LDDVVSLGLGDQVPVDGEALEVNGLEVDESLLTGESRPVKKQAGDRLLAGTSVVAGAGVM RATAVGADVYAQGISAAARQFTRTVSEIQASINRVLQVVSFLLVPVVVLTFWSQNRVAGA DGGDWRNALVLSVASIIGMIPQGLVLLTSMNFALGSATLARRGVLVQELPAVEVLARVDA LCVDKTGTLTTGGIRVRSVEVLAGDRDQVLGALATASADRTNATATAIATHLEAEGAPAP LGPQAWGVPFSSARKWSAWGDGRTAWILGAPEIVIDRASEGGARA >gi|319979330|gb|AEUH01000026.1| GENE 2 1210 - 2208 1451 332 aa, chain - ## HITS:1 COG:Cgl1921 KEGG:ns NR:ns ## COG: Cgl1921 COG0861 # Protein_GI_number: 19553171 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane protein TerC, possibly involved in tellurium resistance # Organism: Corynebacterium glutamicum # 1 327 1 329 369 302 46.0 8e-82 MTVHPWAWAVLAVIALALFAFDFAGHARRPHPPSSGEALRWTLFYAGLAVLFGIGVWLAA GAAYAQEFYTGWAMEWSLSVDNLFVFILIMKSFRVPREYQQKALLFGIVIALVLRLAFIL AGAALVSRFSAVFLVFGLWLEWTAFTQIREALRGRDGDGEEYRENGFVRRVRRVLPVTDG YTGNRFLHRHGGRTSITPLFLVVMALGSADLMFAFDSIPAIFGVTSEPFLVFSCTVFALM GLRQLFFLVDSLLSRLVYLGFGLGAILAFIGVKLILEALRGGALPFVAGGAPLTALPEIS SALSLCVVVSVLAVTVVASLVASSTKEGTTRE >gi|319979330|gb|AEUH01000026.1| GENE 3 2205 - 3257 1482 350 aa, chain - ## HITS:1 COG:Cgl1921 KEGG:ns NR:ns ## COG: Cgl1921 COG0861 # Protein_GI_number: 19553171 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane protein TerC, possibly involved in tellurium resistance # Organism: Corynebacterium glutamicum # 1 321 1 322 369 335 54.0 1e-91 MEVHAIGWAALAAIIITMITIDIVGHVRTPHEPTIREAAWWSLGYIGFALAFGGLVWAVW GADFGQQYFAGYVTEKALSVDNLFVFVLIITAFRVPRKNQQEVLLAGIVIALVLRLAFIL 
VGKAMIESWSWVFYPFGLWLLWTAFSQIREGAEDPDAHLDEEYRPPSIVKWVSRVVPVTD GFIGARMLYRHGGRTYVTPLLLCVIAIGTADVMFAVDSIPAIYSLTVEPYLVFAANAFSL LGLRQLYFLIDGLLDRVVYLHYGLAAILGFIGFKLVNHALHTNELPFINGGEHWTAVPEP SIPFSLGFIVVVIVITVAASLAVSAGRRARARQAEAERAREGHRGGDAAL >gi|319979330|gb|AEUH01000026.1| GENE 4 3461 - 5533 3301 690 aa, chain - ## HITS:1 COG:ML1387 KEGG:ns NR:ns ## COG: ML1387 COG0556 # Protein_GI_number: 15827723 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Mycobacterium leprae # 13 689 7 695 698 922 71.0 0 MSEKPVLRQERPFEVVSEYSPSGDQPKAIKELASRIRDGEQDVVLMGATGTGKSATTAWL IEELQRPALVLEPNKTLAAQLAAEFRELLPGNAVEYFVSYYDYYQPEAYVPQTDTFIEKD SSINDEVERLRHSATNSLLTRRDTVVVSSVSCIYGLGTPEEYVARMIELERGMRIDRDAL LRRFVAMQYVRNDMAFTRGTFRVRGDTIEIIPVYEELAIRIEMFGDEIDSLAVLHPLTGD VIDEVDRTYLFPASHYVAGEERMKRAIGTIEEEMEERVAWFEGQGKLLEAQRLRMRTTFD LEMLREIGSCAGVENYSRHIDGRGPGTPPHTLLDYFPDDFLLIIDESHVTVPQIGAMFEG DMSRKRTLVDHGFRLPSAMDNRPLKWDEFTARIGQTVYLSATPGPYELDRSDGVVEQIIR PTGLVDPLVTVKPTEGQIDDLLEQVRARVEKDERVLVTTLTKKMAEELTTYLAERGVRVE YLHSDVDTLRRVELLRELRKGAFDVLVGINLLREGLDLPEVSLVSILDADKEGFLRSTRS LIQTIGRAARNVSGEVHMYADTVTDSMREALSETMRRRDLQIAYNREHGIDPKPLRKRIA DVTDMLAREQIDTESLLEGGYRREKTAAERTAAGARGAAGRAQGDLASLIEELSEQMMTA AAHLQFELAARLRDEIEDLKKELRAMKRAQ >gi|319979330|gb|AEUH01000026.1| GENE 5 5543 - 6031 558 162 aa, chain - ## HITS:1 COG:MT1667_1 KEGG:ns NR:ns ## COG: MT1667_1 COG0237 # Protein_GI_number: 15841086 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Mycobacterium tuberculosis CDC1551 # 2 137 46 182 208 109 53.0 2e-24 MLARIRERFGEDVVGADGALDRAALAARVFGDDEAVAALGAITHPAIDERAWAVLRAAPR GAVAVYDVPLLVEGGGAGLFDAVVVVDAPVEERLERLAARGVGRADALRRMASQASDEQR RAVASVWIDNSGTADELRAVAAAVLAQWLAPTGPSAGAGAMA >gi|319979330|gb|AEUH01000026.1| GENE 6 6260 - 7672 1620 470 aa, chain + ## HITS:1 COG:Cgl2257 KEGG:ns NR:ns ## COG: Cgl2257 COG1114 # Protein_GI_number: 19553507 # Func_class: E Amino acid transport and metabolism # Function: 
Branched-chain amino acid permeases # Organism: Corynebacterium glutamicum # 41 462 6 409 426 320 52.0 3e-87 MAPAPAHARIDADIDRSRGPRTVPTHTNTNTTAAPNRTGLLIATSLTLFSMFFGAGNLIF PPIMGANAGTSFPWAMTGFLIGAVALPVLSIITIALSGTDLRDLASRGGRVFGIVFPGLV YLSIGAFYALPRTGAVSFSTAIAPLTGWSSSLASTLFNVVFFGVALFLSWNPRSITDSLG KVLTPLLLVLLVLLVFLSLANLPASTSAPTAAYASAPLTSGLLDGYMTMDSIAALAFGIV VVTSLERTGGGIGAKMVRRTSVTALIAGALLALVYVGLGLIGHVMPGGQEYTDGASLLAD AAQMTIGWPGQVVFGLIVLTACMTTAVGLIAATSEFFHRLLPVVPYRAWMAAFTVISFVL AAAGLNSVLAVAVPIITFLYPIAITIVFTTILTRPLRLTTPGLWAFRASAWAAAIWSGAS ALAAAAPLGQLRALLSVSPWQDLQLGWIVPTALAFAVGLAADLAVARRDA >gi|319979330|gb|AEUH01000026.1| GENE 7 8174 - 9700 2150 508 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227495256|ref|ZP_03925572.1| 30S ribosomal protein S1 [Actinomyces coleocanis DSM 15436] # 30 507 1 481 481 832 87 0.0 MPGVDCTRSRPRSLELSTQLPSPLRRNYYMTTNTPQVAINDIGSTEDLLAAIDETIKYFN DGDIVEGTVVKVDHDEVLLDIGYKTEGVILSRELSIKHDVNPDDVVAVGDQVEALVLQKE DKEGRLLLSKKRAQYERAWGQIEKVKEEDGVVTGTVIEVVKGGLILDIGLRGFLPASLVE MRRVRDLAPYIGRELDAKIIELDKNRNNVVLSRRAWLEQTQSEVRTNFLHTLQKGQVRTG TVSSIVNFGAFVDLGGVDGLVHVSELSWKHIDHPSEVVEVGQEVTVEVLDVDMDRERVSL SLKATQEDPWQAFARTHAIGQVVPGKVTKLVPFGAFVRVEDGIEGLVHISELAQRHIELP DQVVKVNEEVFVKVIDIDLDRRRISLSLKQANEGVDPNSEEFDPSLYGMAAEYDEDGNYK YPEGFDPETQEWIEGFEAQREAWENQYTEAQARWEAHKAQVRRALEEDADSAPSIQEQAS YSTPVDSEGTLASDEALAALREKLTNNN >gi|319979330|gb|AEUH01000026.1| GENE 8 9783 - 12494 3491 903 aa, chain - ## HITS:1 COG:ML1381_2 KEGG:ns NR:ns ## COG: ML1381_2 COG0749 # Protein_GI_number: 15827719 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Mycobacterium leprae # 365 903 40 578 578 489 53.0 1e-138 MSDTVLIIDGHSMAYRAFYALPADKFVTTTGRCTNAVYGFVSMLTRLLEQERPTHVAVAF DLSRHSFRTDEYPEYKGTRDATPEEFKGQVELIGQVLGAMGITSLTKERYEADDILATVA RRGAEAGMRVLVVSGDRDSFQTVGERVTVLYPGTGPGDLRRMDPAAVEAKYGVPPHRYPE VAALVGEASDNLPGVPGVGPKTAAQWIARFDGLDNLLARADEVGGKRGQALRDHADEVRR NRRLNRLVDDLDLGVGIDDLRRGPTDFAEIQSLFDALEFGSLRRRVRQVAHVGRAGAEGA 
RGADAPAGGNGGPDVAVGACQEDTDIAQWARDNAPVAVLVEGDPRAVSGRVDRLALAGAR RALVIDPALLTPAQVDQVGGVLAGAVAVVHDWKGTAHALRAQGWTLGEECFDVMLAAYLV APDQRSYAAADLVGRVLGLDVGGGAPDDALFDLAAEGEGPGVAGARLGRLCAALHPLRAE LGEALRGSGEAGLLSDVEMPTAAVLAGMERVGIGVDSARLGALSDELGGDVERAREGALA VLGREVNLSSPKQLQEVLFEHFGLPRTRRTKTGYTTNAEALQDLYAKTAERGGDGHEFLG HLLAHRDRIKLKQMVDSLIASVADDGRIHTTFSQVAAATGRLASADPNLQNIPARSADGM RIRGGFVAGPGFEALMSADYSQIEMRIMAHLSQDEGLIGAFNSGEDLHRTMASMVFGTPV AEVTAQQRSRIKATSYGLAYGLSKYGLSKQLGIPVPEAAALRERYFERFGGVRDYLESLV DQARSRGYTETMYHRRRYLPDLGSDNRQRREMAERAALNAPIQGSAADIMKLAMIGVVGA LGRAGLRSRVLVQIHDELLVEVAPGERERVEEAVRQEMAGVASLRVPLDVSVGVGPSWQE AAH >gi|319979330|gb|AEUH01000026.1| GENE 9 12472 - 12963 610 163 aa, chain + ## HITS:1 COG:BS_yuxO KEGG:ns NR:ns ## COG: BS_yuxO COG2050 # Protein_GI_number: 16080218 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Bacillus subtilis # 39 162 6 126 126 77 40.0 1e-14 MISTVSLTHDTLDAMAPSDQTKTPSDAARHWVGASWPGTLMELTGMEVLEHTARRTVVSM PVAGNLQNGGILHGGASAALAETAGSFAACAHADDLHPGGGAYAVGTELSASHVAAGRGG RVTAVATAAHLGRTSTVHTVEVRDEDGRLISCARITNRILMRR >gi|319979330|gb|AEUH01000026.1| GENE 10 13014 - 13604 886 196 aa, chain - ## HITS:1 COG:ML1286 KEGG:ns NR:ns ## COG: ML1286 COG3707 # Protein_GI_number: 15827661 # Func_class: T Signal transduction mechanisms # Function: Response regulator with putative antiterminator output domain # Organism: Mycobacterium leprae # 8 195 13 200 205 199 60.0 2e-51 MTEEQTEPRRVLVAEDEGLIRLDIVETLTSAGFDVVGEAADGEEAVQLALDLEPDLCVMD VKMPKMDGITAAEKILQELSCAVVMLTAFSQTELVERARDAGAMAYVVKPFSPADLIPAV EIALSRHAEIESLEDQVADLTDRFETRKRVDRAKGLLMKNMGMSEPEAFRWIQKTSMDRR LSMREVADAIINQVDD >gi|319979330|gb|AEUH01000026.1| GENE 11 13748 - 13972 85 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVAVASWYRFTGRVAGIDCARHATTEESSAGSGCPLGTRRALVGPGGDGPQCPGASTRSP CADAGNGSALRRRR >gi|319979330|gb|AEUH01000026.1| GENE 12 14055 - 14270 241 71 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|154508855|ref|ZP_02044497.1| ## NR: gi|154508855|ref|ZP_02044497.1| hypothetical protein ACTODO_01366 [Actinomyces odontolyticus ATCC 17982] # 1 71 1 71 71 112 80.0 6e-24 MNTDIRTVSVHDTLFGRVANNLEVAQLSHAVAPWFADFHDSRIAKAIADLDEPELRGKAA EYLGLEVTPVA >gi|319979330|gb|AEUH01000026.1| GENE 13 14643 - 16085 1648 480 aa, chain - ## HITS:1 COG:Cgl2036 KEGG:ns NR:ns ## COG: Cgl2036 COG0469 # Protein_GI_number: 19553286 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Corynebacterium glutamicum # 2 471 5 471 477 462 55.0 1e-130 MRRAKIVCTIGPATDSPEQIQALVDAGMDVARINRSHGNAEEHEAVIARVRTASQTSGRA IAVLVDLQGPKIRLETFADGPQELAVGDTFTITTRDVPGTKELVGTTFKGLPGDCAPGDR LLIDDGNVAVRVVEVTDTDVTTRVEVPGTVSDHKGLNLPGVAVSVPALSEKDKEDLRWGI RQDADFIALSFVRSAADIEDVHAIMDEEGKRIPVIAKIEKPQAVDALEGIVDAFDGIMVA RGDLGVEMPLEAVPLVQKRAIELARIAGKPVIVATQVMDSMIKNPRPTRAEASDCANAIL DGADAVMLSGETSVGAFPIETVRTMARIIEATEEEGGERIATIPGYYASDRAAVICEAAG KIAEHLEARYLVTFTQSGRSARLMSRMRRAIPMLAFTPLESTRRQLALSWGIRAYRVPEV RHTDDMVWQVDQVAQTSRLAEIGDQLVLIAGMPPGTPGSSNMLRIHNIGDEADYSIGGTR >gi|319979330|gb|AEUH01000026.1| GENE 14 16176 - 16385 140 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no DGDGAPAAPEGGPRADSADPLEGEALTAADGAGTPEGTAAVGSGAPRADSPGSALTGGND GEDAGPDRD Prediction of potential genes in microbial genomes Time: Thu May 12 17:00:59 2011 Seq name: gi|319979315|gb|AEUH01000027.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00027, whole genome shotgun sequence Length of sequence - 16199 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 6, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 - CDS 1 - 784 1059 ## COG0682 Prolipoprotein diacylglyceryltransferase 2 1 Op 2 37/0.000 - CDS 781 - 1596 361 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc 3 1 Op 3 13/0.000 - CDS 1593 - 2882 1533 ## COG0133 Tryptophan synthase beta chain 4 1 Op 4 2/0.000 - CDS 2887 - 3732 988 ## COG0134 Indole-3-glycerol phosphate synthase 5 1 Op 5 . - CDS 3824 - 5452 1951 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 6 1 Op 6 . - CDS 5478 - 6170 929 ## gi|154508863|ref|ZP_02044505.1| hypothetical protein ACTODO_01374 7 1 Op 7 4/0.000 - CDS 6177 - 7025 677 ## COG0169 Shikimate 5-dehydrogenase 8 1 Op 8 . - CDS 7028 - 8704 1949 ## COG1559 Predicted periplasmic solute-binding protein - Term 8895 - 8944 2.4 9 2 Tu 1 . - CDS 9186 - 11942 3271 ## COG0013 Alanyl-tRNA synthetase - Term 12064 - 12106 15.3 10 3 Op 1 . - CDS 12117 - 12740 874 ## PROTEIN SUPPORTED gi|227496183|ref|ZP_03926489.1| ribosomal protein S4 11 3 Op 2 . - CDS 12866 - 14209 1446 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase - Term 14211 - 14243 4.2 12 4 Op 1 . - CDS 14289 - 15749 1588 ## gi|293192676|ref|ZP_06609630.1| Rhs element Vgr protein 13 4 Op 2 . - CDS 14816 - 15271 162 ## PROTEIN SUPPORTED gi|241662874|ref|YP_002981234.1| 60S ribosomal protein L19 + Prom 15731 - 15790 1.8 14 5 Tu 1 . + CDS 15821 - 15961 138 ## 15 6 Tu 1 . 
- CDS 15933 - 16199 110 ## Predicted protein(s) >gi|319979315|gb|AEUH01000027.1| GENE 1 1 - 784 1059 261 aa, chain - ## HITS:1 COG:Cgl2037 KEGG:ns NR:ns ## COG: Cgl2037 COG0682 # Protein_GI_number: 19553287 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Corynebacterium glutamicum # 7 261 4 268 316 229 46.0 5e-60 MTALVQAGIPSPSVGVWYVGPVPLRAYGIIIAVGMVVALAWASRRYGRRGGDPDLLFDLA LWAIPLGIVGARLYHVITSPEQYFGPGGDPWQVWQIWRGGLGIWGAVALGAVGAWIGARR AGARLGPVADSLAPALLVGQAIGRWGNWFNQELFGAPTSAPWGLRIDALHMPPGYPPGTL FHPTFLYECLWNLVGAALIVWLEKRLRFKAGQVFALYLMVYTAGRVWIEALRIDDAHRIL GVRLNVWTSLLVFAAGAVSFL >gi|319979315|gb|AEUH01000027.1| GENE 2 781 - 1596 361 271 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. AzwK-3b] # 15 260 10 258 263 143 36 7e-34 MTTARAPHSAVRIDQANARGDAALIAYLPVGFPTVEASVRAGKALADAGVDAIELGFPYS DPGMDGPTIQRATVAALERGTHIEDLFHAVDELTAHGVATMVMTYWNPVEWWGVERFAAD LAGVGGSGLITPDLPPEEGAQWEAAADANGLERVYLSAPSSPARRLALIAAHSRGWVYAA SSMGVTGARASVGAHVRDVVERTRAAGAERVCVGLGVSTGAQAREIGAYADGVIVGSALV RTLFSEPFERALVELGALADELVGGVKGARR >gi|319979315|gb|AEUH01000027.1| GENE 3 1593 - 2882 1533 429 aa, chain - ## HITS:1 COG:MT1647 KEGG:ns NR:ns ## COG: MT1647 COG0133 # Protein_GI_number: 15841065 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Mycobacterium tuberculosis CDC1551 # 8 403 28 417 422 469 65.0 1e-132 MSLNTAQGPYFGGFGGRFVAESLMGPLEEVEAAWRRLWPDPGFQRSLRSLFADYAGRPSL LTEAPRFASDLGGVRVLLKREDLNHTGSHKINNVLGQALLTRELGKTRVIAETGAGQHGV ATATAAALMGLECTVYMGKEDTERQALNVARMQMLGAEVVAVSAGSATLKDAINEAFRDW VTTAATTNYVFGTAAGPHPFPELVRDLQRVIGDEARAQLLAAEGRLPDVVCACVGGGSNA IGAFTAFLDDPGVALLGCEAGGDGFGTGRHAASIIADQQGVLHGARTFVLQERDGQTKPS HSISAGLDYPGVGPEHAWLAETGRASYTPVPDDEAMDAFARLSRTEGIIPAIESAHALAG ARAWARAKAEAEGPFDGAGAPIALVVLSGRGDKDMGTASRWFGYGRAKEQIIDPSAGPSG VRAVTKEAR >gi|319979315|gb|AEUH01000027.1| GENE 4 2887 - 3732 
988 281 aa, chain - ## HITS:1 COG:MT1646 KEGG:ns NR:ns ## COG: MT1646 COG0134 # Protein_GI_number: 15841064 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 2 265 15 276 282 248 58.0 1e-65 MSVLEGIIAGVMEDLRVRESAVPMEEVKELALRAPDAKDAVSALRGSDGAVTIISEVKRS SPSKGELAAIPDPASLASTYERGGASVVSVLTEGRRFGGSLADLDAVRAAVDIPVLRKDF IVTPYQIHEARAHGADLVLLIVAALEKPALVSLLERVRSLGMEALVEAHSRLEVLRALEA GASVIGVNARDLTTLEVDRHTIEQVIDVIPEDVVAVAESGVSNAHDVFEYAKWGADAVLV GEALVTSGNPLESIRDMVSAGAHPALLTDRKARVRQALQEG >gi|319979315|gb|AEUH01000027.1| GENE 5 3824 - 5452 1951 542 aa, chain - ## HITS:1 COG:ML1269 KEGG:ns NR:ns ## COG: ML1269 COG0147 # Protein_GI_number: 15827651 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Mycobacterium leprae # 42 536 11 503 529 404 49.0 1e-112 MRHGGTAGGVDDTMIGVTSPSKSTSDNPPGVDLQWGQTWPDRQEFRRLAATRRVVPVVRR VLADELSAVGVYRQLAHGSHGSFILESAEHGGAWGRWSFVGASSAGAIVSSEGRARWVGA RPEGALAEGTFLEVAHSALAELHAPPIPGLPPLTGALVGSLGWGIIPEWEPTLVASSPRE SDIPDAALCLATEVAAIDHRTGSVYLMAIAWNLNGTDEGVDGAYDSALARVDAMTRQLAA PISPAVLAADPGAERPAVRQRTPRGAFEASVDAAKRAIEDGEAFQIVVSQRLDVRTGAAG VDVYRVLRTINPSPYMYYLALPDGDGGQFEVVGSSPETLVRTQGRRVWTYPIAGSRPRGA DGAQDRALAEELLQDPKELSEHVMLVDLARNDLSKVCDPATVEVSTLMEVKRFSHIQHIS STVTGVLRPDADALDCLVAAFPAGTLSGAPKPRAIRLIDELEPAARGVYGGVVGYFDLGG EADLAIAIRTAALRGGTASVQAGAGLVADSVPSLEYEESRNKAAAALEAVTTASTLRPWS SL >gi|319979315|gb|AEUH01000027.1| GENE 6 5478 - 6170 929 230 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508863|ref|ZP_02044505.1| ## NR: gi|154508863|ref|ZP_02044505.1| hypothetical protein ACTODO_01374 [Actinomyces odontolyticus ATCC 17982] # 1 230 1 230 230 257 68.0 4e-67 MIVDWLPEKAVANFSIVSGFLTWIVILFWLANSGWRIQRRYFVEATRTPLGRNGIVWSSA LMGAVFFVAGERRPGLPAILAVAMVGVLSAYVDALTHRLPNGYTAAMAVGVCAGLVGGAL VSPFWKERLTGSALGVVIWLAPVWLLNRLPGGMGAGDVKLAPVLGAMVGSVGVEAAAAGL 
ALAFVSAGVAALWKLVVGSAGTKSRVPMGPWMIGAALFATVAWGVIPDWL >gi|319979315|gb|AEUH01000027.1| GENE 7 6177 - 7025 677 282 aa, chain - ## HITS:1 COG:SA1424 KEGG:ns NR:ns ## COG: SA1424 COG0169 # Protein_GI_number: 15927176 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Staphylococcus aureus N315 # 5 272 4 261 268 112 32.0 7e-25 MPWAAVIGSPIAHSLSPVIHRAAWECLGLAGDWEYRSAEVDRGGLGAFIAGLDPQCRGLS VTMPCKQAVMPLMDVVDPLAAAVGAVNTVVPGAGVLTGFNTDVHGITTAIAEARAARGLG PARSACVLGARATASSALAALGALGVTRTTVVARRFSGPGSVVAAAARMGVGIDQVLIGD TTRAGAALAADIVVSTLPAGAADPLAALVRPGGHQCLLDVVYAPRDTALRRAFEAGGAVI AEGTEMLIHQGAQQVRLMTGRDPDTGVMRAALEAEVASREGR >gi|319979315|gb|AEUH01000027.1| GENE 8 7028 - 8704 1949 558 aa, chain - ## HITS:1 COG:MT2630 KEGG:ns NR:ns ## COG: MT2630 COG1559 # Protein_GI_number: 15842091 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Mycobacterium tuberculosis CDC1551 # 191 543 22 403 417 126 28.0 1e-28 MTSSTPPSYPFRSRREIHRNEPRTYSDADLHPDSDEEATARTGRATDPAAGHPGRHRPSG AGRAGAAVPGSRTKAAARAGARRGAVLGTATAAERASVPHRSAVAQESDDAPGQGDAPAS IESDFPWPSSTRSRRSTPKKRRAAASEGKAEPRREHPGPSAHTAEIPAVGRSDADGADQE EAFEEETTGMRSRRLARERQAAKKRRFWRRVRSFFVLVIVLAMVGGAGYLAVRQLRSSAN QTAQDDFPGPGTEAVSVTIEENSTGRDIGKTLVDAGVVKSVGAFIRQFEKSKAATSIRPG TYSMRLQMSAAEALAALLDETNRTDNTITVIPGTTIWQVKAKIADIMGVSEDEVQRALDD AEAIGLPAEANGKAEGWLLPGTYEVDPEDTPTTVVKRMVAGTVAELAEMGVADADRETVL IKASIVDGEGYIKRYQPMIARVIENRLADPDGETRGRLEMDSTVQYGVGKSGGVPDATAI ADDNPYNTRLHAGLPPTPIGQPSRDAISAVVNPAQGTWLYFTTVNLDTGETLFADNFAEQ MENQKKFQDYCASHEGKC >gi|319979315|gb|AEUH01000027.1| GENE 9 9186 - 11942 3271 918 aa, chain - ## HITS:1 COG:Cgl1594 KEGG:ns NR:ns ## COG: Cgl1594 COG0013 # Protein_GI_number: 19552844 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Corynebacterium glutamicum # 20 918 1 888 888 751 48.0 0 MRPRGRCGGWGGCCLRWVLMRTSEIRSRWLEFFAARDHEIRPSVPLISPEPSILFTVAGM 
VPFIPYILGTEPAPWPRAASVQKCVRTNDIDNVGKTTRHGTFFQMNGNFSFGDYFKEGAI DFAWHLLTDPRDQGGYGLDGDRLWMTIWEEDQVSFDYWTREIGLPAERIQRLPFAEISWS TGQPGPAGSCCEIHYDRGAQYGPGGGPVVDTGGDRFLEIWNLVFDEFVRGEGAGHDFELV GKLDRTAIDTGAGLERLAFIMQDKPNMYEIDEVFPVIAAAEAMSGKTYGLGAAGPEAGEA YADDVRLRVVADHVRSSLMLIGDGVRPGNDGRGYVLRRLIRRAVRSMRLLGVHDAAMPVL LTASKDAMRASYPELEADWGTISEVAYAEEDAFRRTLSAGTTILDTAVAQARSAGSPIPG ASAFSLHDTYGFPIDLTLEMAAEQGVKVDEDGFRALMEEQKERARADARAKKTGHTDVRV FQEIEKRLGGGSEFLGYTESGCDASVVALLVDGHDAPAAQAGADVEVVLDRTPFYAEMGG QLADHGTIRAEGGAVIEVNDVQAPIRGLFVHRGTVVEGAVAVGERAFAQIDTRRRLAIAR AHTATHMVYAGLRAVVGQDASQAGSENSPSRLRFDFRHSSALGASQVDDIEALVNEKLAE DLPVTTEVMGIEEARASGAIALFGEKYGSRVRVVTIGDGFDKELCGGTHVPSTGHLGRVT VLGEGSIGSGVRRIDALVGDGAYEYQAKEHAIVSRLASLVGGRPEDLPERVEALLARLKE SEKELEKSRIDQALSQAAGLAARAKTIGRLTGVVAAVGAVPSADALRTLALDVRDRIGGS GAAVVALAGLVGGKPSLVVATNDGARARKAKAGALVRAVGSFLGGGGGGRDDIAQGGGTK PEGVGRALEELRRQIEGL >gi|319979315|gb|AEUH01000027.1| GENE 10 12117 - 12740 874 207 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227496183|ref|ZP_03926489.1| ribosomal protein S4 [Actinomyces urogenitalis DSM 15434] # 1 207 1 207 207 341 82 2e-93 MAKNRSRKQVRESRALGLALTPKAVRYFEKRPYGPGEHGRSRRRQDSDYAVRLKEKQRLR AQYGIREAQLRRAFEEARRTAGLTGENLVELLEMRIDALVLRAGIARTIQQARQFVVHRH ILVDGKIVDRPSFRVNPGQTLQVKPKSQTTVPFEIAAQGIHRDVLPQVPDYLDVDLERLK ATLVRRPKRAEVPVTCDVQMVVEHYSR >gi|319979315|gb|AEUH01000027.1| GENE 11 12866 - 14209 1446 447 aa, chain - ## HITS:1 COG:ML0510 KEGG:ns NR:ns ## COG: ML0510 COG2256 # Protein_GI_number: 15827173 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Mycobacterium leprae # 18 447 48 471 473 415 57.0 1e-116 MDLFAAQGIDDAGVPAYSPDAPLAVRMRPARIEEVVGQDHLLGEGAPLRRLLEPASGASV AVSSVVLWGPPGTGKTTLAYLVARTTSRRFEELSAVSSGVRDIRDVVGGARRRLAAGGGE TVLFVDEVHRFSKSQQDALLPAVENRWVVLVAATTENPSFSVIPPLLSRSLLLTLRPLDG AAVAGLIERALADDRGLAGAFTITDEARDALVRLAGSDARKSLTLLEAAAGAAAGAGGAS IGLPQVEAAANRALVRWDQDQHYDVASAFIKSMRGSDVDAALHYLARMIEAGEDPRFIAR 
RVMIAASEEVGMAAPEVLTTCVSAAQAVAMVGMPEARIVLAQAVIAVATAPKSNASYTGI DRALADLRAGKGGPVPAHLRDAHYPGAAGLGHGEGYVYAHDAPHHVAAQQYLPDDLVGVR YYEPTRNGNEALITKRLEAIRNLLEMR >gi|319979315|gb|AEUH01000027.1| GENE 12 14289 - 15749 1588 486 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192676|ref|ZP_06609630.1| ## NR: gi|293192676|ref|ZP_06609630.1| Rhs element Vgr protein [Actinomyces odontolyticus F0309] # 59 478 38 420 430 211 40.0 7e-53 MKHARGAAVSAQRQIANQLRFLTVGLGAVLVLVVAVMLPATASDRWDNPFMPLELPSTEV RVTVIPGSDFTPTESEAKGAYVGVLLDEGAVTRYEPQTNPDTGGPLARDDFYAVGRVSPD GDFAVSVDLPNGTNNPQRASWYDGDVHKLTIYVYYAGDALQSKTAHYLRIVQAPEPGPTT EPEPSEAPTTPAPTAPQPTATPTTPAPTTSAPAPTQEPTTDPQSHATTASPRPQSPAPDP VPSSEPQPEPSTQTPDPYTPPQEPTSEKPPVPSEPSHAPVAPDPTPSTAGPAQSDGSADP DPTPSATSNTVKPDGEPVLPVASESDLTDANTGGVSASFASGKVTVNVPSDKAPAGDWVS ANILPGRQAQWLQVSDDNQASMDLPSLPNGEYKIVIATRDHTLVGWAQFKVATTSASGAG DGAAVDTSASMNAHAQMLENSLRTDAADGLNGYLLGAGACMLILGGLVVLQVMSGPTLRG MRKSGS >gi|319979315|gb|AEUH01000027.1| GENE 13 14816 - 15271 162 151 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|241662874|ref|YP_002981234.1| 60S ribosomal protein L19 [Ralstonia pickettii 12D] # 1 134 40 179 186 67 29 8e-11 CSRRPRTTCGSSRRPSPGPPRSRSPPRPRPPRRPPLRSPRPPRPLRRPPPRPRPPRRSRP RTRSPMRPQPVRVPSPRPPIRCPAPSRSRSPPRRPPIPTRPRRSRLRRSRPSPPSRRTPP SRLTRRPALRAPPSRTAARIRIPRPVRPPTR >gi|319979315|gb|AEUH01000027.1| GENE 14 15821 - 15961 138 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTRNKLHQRANIFHKVCRETLIRPRRPTAPREKVGTAPNAGGPSCG >gi|319979315|gb|AEUH01000027.1| GENE 15 15933 - 16199 110 88 aa, chain - ## HITS:0 COG:no KEGG:no NR:no ASGGGSSAAPSQGTAPATRPTTARPSASPSTEPTRVSADAAVPPNAGGGSSEESASGLNS WVMVGGAGLLLLGVTVAFTFIRRMAHPH Prediction of potential genes in microbial genomes Time: Thu May 12 17:02:09 2011 Seq name: gi|319979313|gb|AEUH01000028.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00028, whole genome shotgun sequence Length of sequence - 2935 bp Number of predicted genes - 3, with homology - 1 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 788 201 ## 2 2 Tu 1 . + CDS 976 - 1296 98 ## - Term 964 - 1038 6.0 3 3 Tu 1 . - CDS 1203 - 2933 2362 ## COG5263 FOG: Glucan-binding domain (YG repeat) Predicted protein(s) >gi|319979313|gb|AEUH01000028.1| GENE 1 3 - 788 201 261 aa, chain + ## HITS:0 COG:no KEGG:no NR:no AGSGVGGAPSAAPSSTARAPPLDSSDGSLEMVGAEPGPVGVCEGEDAEGDGAGAEGAGLD DTDGDGAGVEGAGLGDADGDVPEGEGDGDDGVSAGTGFSGLFAQFSGRSGVSAPTTETEK VTRAPLAPSLIVTVCLLPASQSSGTAIAAVKAPWASSSTTSHTCRSSARERCAHGGSCLS AGRNCLPSSNATETNARAEATTVQPVPDNTAFWPRTHWEGTVIREAPPDTIRHPAWPAPS TGEPHKSTLLVKLPLWASARA >gi|319979313|gb|AEUH01000028.1| GENE 2 976 - 1296 98 106 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPRTRDTPVREVVRKPGYQHAVSPAIPPPLFDKCDAFYCGALRFPDPSVTTGRPRGRSRG QGRRAAPSPVVRGSTGQWTQTPESSNVLGTPLTQWLPVAIAPEGRK >gi|319979313|gb|AEUH01000028.1| GENE 3 1203 - 2933 2362 576 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 372 560 524 710 744 181 46.0 3e-45 PKPVPDANARVSAVPGTDAAKAGSIAGVETQLLGGLYQTAYSPRHGSIYVTSSIGRPGPD FGGSAIMKLNADTLKLEAYINPAIDDTYTGRDGKTVPNMPYAAYGVDVDDAHNTVWITQT RQNTIAVYNADTLELVKQFDKGIVSHSRDVRIDAARGRAYVNAARSNTVHVFDTTTLAQL DDIVLAPTPEEFASMGLELDSEGGVLYVGSTSTNKVARVDLAASAVDFIPLPDNVQQATG VARDPGTGRLYVVDQGSGTLTVLTKDGQVLSNAPTPPPAPEGDEQPSSGALYATFDAVNR LLYVTNRNSGTITVHDADGALVQTLDSGPMPNEARADGRGNVYAVTKGGARDGSSKLDYI QKYHVVAAEPDPEGQWIQDSVGWWYRHADGTFTADGTEVIDGSTYRFDAAGYMVTGWARA DGKWFYYAPSGAQASGWAFVDRVWYYLDPATGAMATGWLQLDGSWYYLEASGAMAMGWAQ IDGAWYYLRGSGAMATGWYRINAVWYHFADGGQMTIGWINDGGTWYFLHPSGAMATGWLN LGGDWYYFLPSGAMATGSHWVNGVPRTFDDSGVWVH Prediction of potential genes in microbial genomes Time: 
Thu May 12 17:02:31 2011 Seq name: gi|319979309|gb|AEUH01000029.1| Actinomyces sp. oral taxon 178 str. F0338 contig00029, whole genome shotgun sequence Length of sequence - 2342 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 198 - 986 244 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 2 1 Op 2 8/0.000 - CDS 983 - 2014 1196 ## COG4779 ABC-type enterobactin transport system, permease component 3 1 Op 3 . - CDS 2053 - 2340 255 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component Predicted protein(s) >gi|319979309|gb|AEUH01000029.1| GENE 1 198 - 986 244 262 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 23 228 16 223 245 98 32 4e-21 MKAIEPTLRARGIVVGYANRPPVLDHVDIDVLDGELTVIVGPNACGKSTLLRSLARVLAL RGGRVLLDGQDIRAMNQKGVARRLGLLPQSPQAPDDMLVEELVGRGRYPYQSLVRQWSEA DDDAVERAMADAGVAGLAARRLSELSGGQRQRVWIAMALAQCTRILLLDEPTTYLDLNHQ LEVLNMAGRLHREGRTVVAVLHDLHLAFRYATHLLVMKDGRLVAQGHPGRIVTAELIEDV FSVACRIIDDPETGRPIVIPLP >gi|319979309|gb|AEUH01000029.1| GENE 2 983 - 2014 1196 343 aa, chain - ## HITS:1 COG:AGpA451 KEGG:ns NR:ns ## COG: AGpA451 COG4779 # Protein_GI_number: 16119542 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type enterobactin transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 340 40 368 368 177 41.0 4e-44 MARAAGLSIRAHRRSALVTTAAVVAVLALAAHSLALGDYGLSAAESYRRLLGDAGPGDDF LGVYFVQSVRLPRTIAGIGVGAALGLSGRLFQVVSSNALGSPDIVGFTAGSASGALVAII VLGSTPVATAIGAVIGGVVSGALILALAGGTRLAGARVVLVGIGVSAALRALNSLLLVKA PLEAAQRAQMWSAGSLSGVTTVRTVALVAALAACLPALTWLSRPLALIPLGDDAATALGA RTGRTRLVAVCTGIVLVSMAIAVAGPVAFVALAAPHIAHRLTGAPGSLLAPSAAVGALLV LGSDIVAQRLIAPGELAVGVVTGCAGGVYLLVLLAHEYRRNRI >gi|319979309|gb|AEUH01000029.1| GENE 3 2053 - 2340 255 95 aa, chain - ## HITS:1 
COG:all2586 KEGG:ns NR:ns ## COG: all2586 COG0609 # Protein_GI_number: 17230078 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Nostoc sp. PCC 7120 # 5 91 253 340 343 64 46.0 4e-11 TCGLVAVGALAGAATAVCGPLGFVGLAVPLIARRLVGSGQSALSALSAVLGGAWVLIADV VARVLVPEEVPVGVVLALIGAPFFVALARGRGASA Prediction of potential genes in microbial genomes Time: Thu May 12 17:02:33 2011 Seq name: gi|319979304|gb|AEUH01000030.1| Actinomyces sp. oral taxon 178 str. F0338 contig00030, whole genome shotgun sequence Length of sequence - 3945 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 33/0.000 - CDS 1 - 643 708 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 2 1 Op 2 . - CDS 776 - 1798 1351 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 3 1 Op 3 . - CDS 1782 - 1844 157 ## 4 1 Op 4 . - CDS 1901 - 3130 895 ## COG2382 Enterochelin esterase and related enzymes 5 1 Op 5 . 
- CDS 3127 - 3744 959 ## COG0009 Putative translation factor (SUA5) Predicted protein(s) >gi|319979304|gb|AEUH01000030.1| GENE 1 1 - 643 708 214 aa, chain - ## HITS:1 COG:Cgl0493 KEGG:ns NR:ns ## COG: Cgl0493 COG0609 # Protein_GI_number: 19551743 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Corynebacterium glutamicum # 9 214 40 246 348 113 42.0 3e-25 MLFLSMATSLAIGSRVLSPAQVWQALTGTGTPIDTEVVVRMRVPRTIAGLVVGCALGLAG AVMQALTRNPLADPGILGVNAGAALGVVSACAATGVSTQARNATAALAGALAASCAVHLL AGRGGPGSRARLALAGIALTAALSSLTQAVVLADQFAFNEFRHWVSGSLEGVRFSSLAWA GAPLAVGAVIAALLGPALGALALGDEAAAGLGVR >gi|319979304|gb|AEUH01000030.1| GENE 2 776 - 1798 1351 340 aa, chain - ## HITS:1 COG:BS_yfiY KEGG:ns NR:ns ## COG: BS_yfiY COG0614 # Protein_GI_number: 16077911 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Bacillus subtilis # 43 340 37 323 325 128 30.0 2e-29 MGIARKLGATAVALAMTLSLAACGRGPDGPASAGGAAESGGATRTVVDANGEQVTVPGRP SRVVTLSEPTTDNALALGVTPIGAVAGRGQDTVANYLADRAGSVAVLGSVGKPNVEAIGA AKPDLILVDGTSIKNDPDALASLKETAPVVYTGDAGGDWRANFQVTADALNLKDQGEAKL AEYDAHVSAVSAGLAAAGYLDQTYSVVRWQGDTAGLILKELPAGRALSDLGMKRPANQDR EGPGHSEPVSLENIDQIDADWIFFGTLGGSSVNNPSAGGSAGVEASRQALEEARQTPGFA GLGAEQAGHVVPVDGSVWTSTGGYILMDTIVSDIESTFLK >gi|319979304|gb|AEUH01000030.1| GENE 3 1782 - 1844 157 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPLVTMGVSRTEWRSSGNRS >gi|319979304|gb|AEUH01000030.1| GENE 4 1901 - 3130 895 409 aa, chain - ## HITS:1 COG:ECs0624 KEGG:ns NR:ns ## COG: ECs0624 COG2382 # Protein_GI_number: 15829878 # Func_class: P Inorganic ion transport and metabolism # Function: Enterochelin esterase and related enzymes # Organism: Escherichia coli O157:H7 # 58 403 1 362 374 71 26.0 3e-12 MTDGWNRRLDPTPLEGGAPIRLDLPCAVAGADGPGLEGLARAGTPLVGEPCRVGGVEMVP ATFLWRSARPGTACVALHLSSLTDNHREDLRPALMEPVGATGWWALCCLLPSDGVLSYQV 
VESASPIPSGTGRTRDGWLAFHMAGLADPHCPGAIANGRAMPASLWHGPSARAHPDWARA HGPDAAGSGVEVVEAVRSRGGGCGARSVELVRGVGRGGAGQEPGVLVLFDAANWRANGVV RALSGRTGQWDLVLVDTGRSLRREAALTDPGRVEALVGAIAGVVGAERPMVVCGQSYGGL ACAHLALTRPDLVAVGVCQSGSYWVGGPQRGRGEGSLLRGLTGGSLVPAADARVVVQVGS HEAGMVGLARSFAAAAGEALVGYREYRGGHDYAWWRYGLSDALDRISAM >gi|319979304|gb|AEUH01000030.1| GENE 5 3127 - 3744 959 205 aa, chain - ## HITS:1 COG:YPO2212 KEGG:ns NR:ns ## COG: YPO2212 COG0009 # Protein_GI_number: 16122440 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Yersinia pestis # 3 204 4 205 206 204 48.0 7e-53 MSYVEMHPVNPQKRFVDKVVGLLREGGVVALPTDSGYAIACRLGNKAGMDTIRDVRRLDD KHNFSLLCSSFAQLGELVILDNHEFRTIKALTPGPYTFILRGTKEVPRMTLNKKKHTVGV RLPDHAITQAVVSALGEPLLCSTLILPGETEPLTDGREVDESIGSRVDLVVVGPVGDAEP TTVVDFTSGTAEVVRRGAGDVSLFE Prediction of potential genes in microbial genomes Time: Thu May 12 17:02:39 2011 Seq name: gi|319979300|gb|AEUH01000031.1| Actinomyces sp. oral taxon 178 str. F0338 contig00031, whole genome shotgun sequence Length of sequence - 6364 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 34 - 1875 2693 ## COG0173 Aspartyl-tRNA synthetase + Prom 1874 - 1933 1.5 2 2 Tu 1 . + CDS 2006 - 3313 1763 ## COG0477 Permeases of the major facilitator superfamily 3 3 Tu 1 . + CDS 3460 - 3912 -155 ## + Term 4073 - 4109 -0.5 4 4 Tu 1 . 
- CDS 3962 - 6364 2704 ## COG0513 Superfamily II DNA and RNA helicases Predicted protein(s) >gi|319979300|gb|AEUH01000031.1| GENE 1 34 - 1875 2693 613 aa, chain - ## HITS:1 COG:Cgl1597 KEGG:ns NR:ns ## COG: Cgl1597 COG0173 # Protein_GI_number: 19552847 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Corynebacterium glutamicum # 16 606 1 593 608 865 72.0 0 MPPPVSRMAPTKGLHVLRTHTIGSLGKDLIGQTVTLTGWVDRRRDHGGVAFVDLRDASGI AQVVVRDESVAHELRSEFVLKVTGEVVARPEGNENPNLPTGAIEVMGDDIEILNPSAPLP FQVSSHAEDSGNVGEEVRLKYRYLDLRREPEQRALRLRSMVSRAARDTLYAHDFVEIETP TLTRSTPEGARDFIVPARLAPGSWYALPQSPQLFKQMLMVAGMERYFQIARCYRDEDFRA DRQPEFTQLDIEMSFVDQDDVIAVAEDVLRNVWALIGYDLATPIPRMTYKDAMERYGSDK PDLRFGLELTELTDYFKDTPFRVFQAPYVGAVVMPGGGSQPRRTFDKWQEWAKARGAKGL AYVTVAEDGALGGPVAKNISEAEREGLAEATGAKPGDCIFFAAGKPTPSRELLGAARLEI GKRCGLVDPDAWAFTWVVDAPLFKPTGDAEAEGDVALGHSAWTAVHHAFTSPKPEWVDSF DEDPGNALAYAYDIVCNGNEIGGGSIRIHRRDVQNRVFAVMGIGEEEAQAQFGFLLDAFK FGAPPHGGVAFGWDRIVALLTKSESIRDVIAFPKSGGGYDPLTDAPAPITPAQRKEAGVD AEPRKRGEEAPAS >gi|319979300|gb|AEUH01000031.1| GENE 2 2006 - 3313 1763 435 aa, chain + ## HITS:1 COG:DR2098 KEGG:ns NR:ns ## COG: DR2098 COG0477 # Protein_GI_number: 15807092 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Deinococcus radiodurans # 3 377 31 403 462 101 30.0 3e-21 MVGSVLLITLVAFEELASTSIMPSVVADLGAERWFSLASGAALAAQLAATVVAGLACDWR GPRFVMAWGLALFTSGLVVSALAGTISVFVVGRLMMGLGGGLLIVPLYVLIGSVADAAHR PAFFASFSLAWVVPSLVGPVIAGYVVRAFGWHPVFGFVPLLTAVAVLPLVGILRGLTHPP SPRPPKLWAMARTGAVAGTGAALIQLAGALDPAWRSVVVAVFAAGIAMCAWSLPRLLPRG VFALRRGIGAGVMARLLAMGVQAGAGVFIPLLLQRIHGWPEHTATLWVSAGSLAWAAGAV IQSRVKSPGGRSRLPLAGTVLLAAGVASLAALLTPVVPLWVALAGWMAAGLGTGLFHSSL SVLALEITPAAKHGKVASWLQVADSAGPAIELASVSVLMGVWADSGATGALAYAPAYVVA VVLACAAVAASRRIA >gi|319979300|gb|AEUH01000031.1| GENE 3 3460 - 
3912 -155 150 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRPQSPGCAHKDPDLRNTRRTQGLDSALVCDAVLPDSGPCVRHREGRAVTGHAPRPAGPE GGGRTDARYWRRRALFAPARVISPSRCQRLPRDADNPLAVSTTPSRCRQRASAHRAPESE GAPFRAGRQSGPRGCPTGRRRVPLPRWGQP >gi|319979300|gb|AEUH01000031.1| GENE 4 3962 - 6364 2704 800 aa, chain - ## HITS:1 COG:XF0252 KEGG:ns NR:ns ## COG: XF0252 COG0513 # Protein_GI_number: 15836857 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Xylella fastidiosa 9a5c # 52 619 17 572 614 440 47.0 1e-123 VDAPAAVDAGAPTGERADGARAAQDGGADEGPEEEGSAAEDGGSGEEDTVSFADLGLPDE LLQAVTDMGFVTPTPIQAEAIPPLLDLRDVVGIAQTGTGKTAAFGLPLLAIVDADEKAVQ ALVLAPTRELAMQSAAAIEDFAARTAELVVVPVYGGSPYGPQIGALKRGAQVVVGTPGRV IDLIEKGALDLSGVRMLVLDEADEMLRMGFAEDVETIASSAPDDRLTALFSATMPPAIEK VAREHLKDPVKVAVSEESSTVDTIHQTYAVVPYKHKIGALSRVLATRAQHIAAGQEDADA AIVFVRTRVDVEEVSLELSGRGFRAAGISGDVAQTERERMVERLKSGSLDVLVATDVAAR GLDVERISLVVNFDVPREPEAYVHRIGRTGRAGRQGRSLTFFTPREHTRLRRIEKLTGTP MEEVAIPSPAAVSEFRARRLLEGVGARVERGRLGMYRELLEQMAASLDVEDVAAALIAQA VGDEGPAPRAESDRRGRGRARREEGLDESGEFVGASFEAGRDKDRPLKGGRDGARRRGGG RAVAGPGTRYRVEVGRKDRVKPGSIVGAIAGEGGIDGRDIGHIDIFPTFSLVDITADLSA EQLSRISKGYVSGRQLRIRVDEGPGRRDRFERGGFERPSRGERRSDRDDRSEREDRGGRD ERFDRPSRGDRRSDRDDRLGEDRRGGFERSGRGDRFDRGDRGGRDERFDREDRRPGRGDR FDRSDRFEADGRGGRESQRGGWHRSVRTERWERDIKKRRARDDFGARESGRGFGHKDRRR DEGRGRNDDARRFGKRRYED Prediction of potential genes in microbial genomes Time: Thu May 12 17:02:57 2011 Seq name: gi|319979268|gb|AEUH01000032.1| Actinomyces sp. oral taxon 178 str. F0338 contig00032, whole genome shotgun sequence Length of sequence - 33165 bp Number of predicted genes - 32, with homology - 28 Number of transcription units - 14, operones - 8 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 184 83 ## 2 2 Tu 1 . 
+ CDS 460 - 1464 1259 ## SSA_1656 nisin resistance protein, putative 3 3 Op 1 4/0.000 - CDS 1543 - 2886 1708 ## COG0124 Histidyl-tRNA synthetase 4 3 Op 2 . - CDS 2915 - 3631 944 ## COG0491 Zn-dependent hydrolases, including glyoxylases 5 3 Op 3 . - CDS 3635 - 4969 1912 ## HMPREF0573_11739 ATPase involved in DNA repair - Term 5083 - 5129 5.2 6 4 Op 1 9/0.000 - CDS 5138 - 7438 3190 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 7 4 Op 2 . - CDS 7511 - 8050 766 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 8 4 Op 3 31/0.000 - CDS 8047 - 9123 1608 ## COG0341 Preprotein translocase subunit SecF 9 4 Op 4 . - CDS 9125 - 11035 2842 ## COG0342 Preprotein translocase subunit SecD 10 4 Op 5 . - CDS 11091 - 11528 585 ## gi|293192654|ref|ZP_06609608.1| putative preprotein translocase, YajC subunit - Prom 11572 - 11631 1.9 - Term 11555 - 11597 -1.0 11 5 Op 1 29/0.000 - CDS 11644 - 12657 1346 ## COG2255 Holliday junction resolvasome, helicase subunit 12 5 Op 2 14/0.000 - CDS 12686 - 13285 851 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 13 5 Op 3 . - CDS 13343 - 13963 761 ## COG0817 Holliday junction resolvasome, endonuclease subunit 14 5 Op 4 16/0.000 - CDS 13963 - 14568 637 ## COG0311 Predicted glutamine amidotransferase involved in pyridoxine biosynthesis 15 5 Op 5 . - CDS 14562 - 15464 1402 ## COG0214 Pyridoxine biosynthesis enzyme 16 5 Op 6 . - CDS 15467 - 16228 1345 ## COG0217 Uncharacterized conserved protein 17 6 Tu 1 . - CDS 16381 - 17127 760 ## COG2062 Phosphohistidine phosphatase SixA + Prom 16823 - 16882 1.6 18 7 Tu 1 . + CDS 17114 - 17197 142 ## 19 8 Tu 1 . - CDS 17258 - 18580 902 ## gi|256804209|ref|ZP_05533833.1| TPR repeat-containing protein 20 9 Tu 1 . + CDS 18698 - 18853 149 ## 21 10 Op 1 . 
+ CDS 19637 - 20227 424 ## Cfla_3211 hypothetical protein 22 10 Op 2 40/0.000 + CDS 20259 - 20975 601 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 23 10 Op 3 1/0.000 + CDS 21044 - 22081 1094 ## COG0642 Signal transduction histidine kinase 24 10 Op 4 . + CDS 22262 - 23623 1584 ## COG2132 Putative multicopper oxidases 25 11 Op 1 . + CDS 23811 - 24107 306 ## 26 11 Op 2 . + CDS 24153 - 26630 1888 ## COG2217 Cation transport ATPase 27 12 Op 1 9/0.000 - CDS 26656 - 27471 1013 ## COG1484 DNA replication protein 28 12 Op 2 . - CDS 27468 - 29135 1682 ## COG4584 Transposase and inactivated derivatives 29 13 Op 1 . - CDS 29251 - 30621 690 ## gi|256804209|ref|ZP_05533833.1| TPR repeat-containing protein 30 13 Op 2 . - CDS 30606 - 31970 937 ## RHA1_ro05343 hypothetical protein - Term 32090 - 32130 2.1 31 14 Op 1 . - CDS 32173 - 32853 779 ## COG0406 Fructose-2,6-bisphosphatase 32 14 Op 2 . - CDS 32850 - 33164 118 ## Cfla_2118 (glutamate--ammonia-ligase) adenylyltransferase (EC:2.7.7.42) Predicted protein(s) >gi|319979268|gb|AEUH01000032.1| GENE 1 1 - 184 83 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPDTPMTSDAIDGPGGAAAPFDATTAPADPRGADTVTAAPHEGADNDGAGPVRPADSAPA L >gi|319979268|gb|AEUH01000032.1| GENE 2 460 - 1464 1259 334 aa, chain + ## HITS:1 COG:no KEGG:SSA_1656 NR:ns ## KEGG: SSA_1656 # Name: not_defined # Def: nisin resistance protein, putative # Organism: S.sanguinis # Pathway: not_defined # 11 330 15 330 336 191 36.0 5e-47 MSDEKPLDEEPPKKRRRSARAIGAGVVVAVLVVTGAIGWALHAYGPHFGIWFPPPSARDY GRTALGLLDDGLYADTPQWAGARADAATRIDAATTHDEVDAVLGDAIGVAGGKHSFLADA DSDEQGADSYEAPTWSMSGCVLTVALPGYMGTPEQGSSYANSIADALSQDDLCGVVVDLR DNDGGDMGPMIAGLSPLLPDGVVATFTIGDRRTPVALSDGRVSGGGTPTSVGQRAKLAVP VAVLTSERTASSGEQALLAFRGMANVRTFGQPTAGYASVNTAIPLYTGRTMVLTVGTTTA RTGEEFGDDPIAPDEIREADEAPTAAQEWIANNQ >gi|319979268|gb|AEUH01000032.1| GENE 3 1543 - 2886 1708 447 aa, chain - ## HITS:1 COG:all5012 KEGG:ns NR:ns ## COG: all5012 COG0124 # Protein_GI_number: 
17232504 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Nostoc sp. PCC 7120 # 7 434 13 454 462 274 39.0 2e-73 MARTSLSGFPEWLPDGRVVEQYVLDALRRVFELHGFCGIETRAVETLEQLEAKGETSKEI YVLDRLQALKAEQGGRRPGKERRMGLHFDLTVPFARYVVENANELDFPFKRYQIQKVWRG ERPQEGRFREFTQADIDVVGNGQLPFHFEVDLPLVMAEALNSLPIPPVRVLVNNRKVVQG ACECLGVGDVDAALRGLDKLDKIGAEGVAGELAASGIGAAQARALLEMARIRSASSSEVR SRVAELGLSGPLLEEGLGELCALLDAAGSRMPGALVADLRIARGLDYYTGSVYESEVEGH EDLGSICSGGRYDSLASDSKRSYPGVGLSIGVSRLVSRMISAPLVRATRPVPTAVVVAVA SEEERERSEAVARALRARGIPTDVAPSAAKFGKQIKFADRRAIPFVWFPGSDGADTVKDI RSGEQVEADSATWAPPAQDLAPGVVPV >gi|319979268|gb|AEUH01000032.1| GENE 4 2915 - 3631 944 238 aa, chain - ## HITS:1 COG:Rv2581c KEGG:ns NR:ns ## COG: Rv2581c COG0491 # Protein_GI_number: 15609718 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Mycobacterium tuberculosis H37Rv # 7 231 7 219 224 93 31.0 3e-19 MRLHVIPSPYYAANALVLVPAQGSAALVVDPSAGVQHLLREVLDLEGVGVGGVLATHGHP DHVWDCAAVAGWTDGPQLPVYLPGPDVYRMDDPAAHVPLPAPGFAGEWVKPERVEPVPAG AFEIVEGVRLRMLPAPGHSEGSALFLGECELDIRVDNTSFYHSDGPVPWALSGDVIFRGS VGRTDLPGGDETQMRHSLRTVSNAIDPATVLIPGHGPATTMAEEIAANEYLIRARRIG >gi|319979268|gb|AEUH01000032.1| GENE 5 3635 - 4969 1912 444 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_11739 NR:ns ## KEGG: HMPREF0573_11739 # Name: not_defined # Def: ATPase involved in DNA repair # Organism: M.curtisii # Pathway: not_defined # 36 444 97 503 504 346 53.0 1e-93 MTTTPIPSEENNPVDTAATVPVGGAPSAGTPPEVPAHPFARIDAEGTVYVKDGDSERAIG AYPEGIPAQPYALYERRYADLEASVRLFEDRLAQLAPREIDTTLASIREQIAEPNVVGDL AALRRRVDRVAEAAQARKETAREERKAAKAQALDQRTGVVERAEAIVAQDPARTHWKNSG QALRELLEEWKTLQRRGPRLDKSTEDELWKRFSSARSQFDRLRRQYFSQLDQAQSEAKRI KERLIAQAEALQASTDWGRTTAAYRDLMNQWKAAPRASRREDDALWARFRAAQQVFFDAR RAKDEATDAEYRQNLAAKEEILVDAEAILPVTDLEKAKAQLRRIQDRWEEVGRVPSSDLH RVEGRLRAVEAAVREAEEREWQRTNPETRARAAGVLGQLEGQIADLEAELARAEASGDKK RAESVRDALTTKRAWLDQISSTIA >gi|319979268|gb|AEUH01000032.1| GENE 6 5138 
- 7438 3190 766 aa, chain - ## HITS:1 COG:MT2660 KEGG:ns NR:ns ## COG: MT2660 COG0317 # Protein_GI_number: 15842122 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Mycobacterium tuberculosis CDC1551 # 31 765 58 790 793 820 57.0 0 MCMSGDDNSGQGAALPAAGSRVRSGLLWFGARSRATIPQIEPLVRALRANHPKADIAVVE RAYRTAEACHSGQTRKSGEPYITHPVAVATILAEMGMTTPTLVAALLHDTVEDTDYTLDQ LRAEYGDAIADLVDGVTKLDKVHYGAAAQSETLRKMLVAMSRDIRVLLIKLGDRLHNART WRFVPAASAAKKAQETLEIYAPLAHRLGMNMVKWELEELSFKTLYPQIYEEIDHLVAERA PQREEYLRDVIDQIEEDLRRSGTKGTVSGRPKSHYSIYQKMIVRGKAFDEVFDLVAVRVL VQSIKDCYAVLGSLHGRWNPIPGRFKDYIAMPKFNLYQSLHTTVVGPGGKPVEIQIRTFD MHERAEHGVAAHWKYKQNPNAQGETDKMGADEQANWLRALVEMERETGDPEEFLDSLRYE IAGDEVYVFTPKGEVVTLPARATPVDFAYAVHTEVGHRTVGAKVNGRLVPLDTRLESGET VEVVTSKSDKAGPSRDWLAFVASPRARSKIKAWFSKERREEAIESGKEALARAMRKQNLP LQRLMSHESVLSVATSFGYPDVSGLYAALGEGHVSAQSVVTRLVDNLGGGDGTEETLSEA VTPGQPKIRQDPAREGAIVVAGMSPNEIWVKLAKCCTPVPGDEIVGFITRGQGVSIHRST CANAVRLGELQPQRFVDVSWSGEGVGAPFRVQIEVRALDRGGLLSDLTRVLSDYGVNILS ASLNTSGDQVAGGRFSFELADLGHLNAVLAALRRVDGVFEAVRLSD >gi|319979268|gb|AEUH01000032.1| GENE 7 7511 - 8050 766 179 aa, chain - ## HITS:1 COG:mll2701 KEGG:ns NR:ns ## COG: mll2701 COG0503 # Protein_GI_number: 13472413 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Mesorhizobium loti # 20 173 16 170 181 151 50.0 6e-37 MIGELGECIAELVRDNLVEVPDFPEPGILFRDITELMANGPAFKELVELIGSRYEGRIDA VAGLESRGFILGAPVATELGLGMLTIRKAGKLPGHVIGVDYSLEYGSARMEIHPESVVPG SRVLVIDDVLATGGTAGAAVELIRQCGADVAAVAVLLELTDLKGRERLRGITVETALKV >gi|319979268|gb|AEUH01000032.1| GENE 8 8047 - 9123 1608 358 aa, chain - ## HITS:1 COG:MT2663 KEGG:ns NR:ns ## COG: MT2663 COG0341 # Protein_GI_number: 15842125 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecF # Organism: Mycobacterium tuberculosis CDC1551 # 11 
333 51 381 442 200 42.0 4e-51 MMSFAQWGNALYSGDKSYRVVPRRKAFVGVAAAVIVVCFALAGLLGLNRSIEFTGGSQFD VTNVADQSESTATRAVTGTGLVSSAKVTRVGSTGVRIQTDSLDTAQTATVRSALAGAYGV DSADVASSSIGPSWGSAVTTKAAQSLAIFMALVALLMTVYFRSWRMAASALLALLHDIAV TVGVFAIAQAEVSPATVIGFLTILGYSLYDTVVVFDKVRELTADVYDQKHYTFAEFVNLA VNQTMVRSINTSVVALLPVGSILLIGSVLLGTGTLTDISLALFIGMIAGTYSSIFIASPL LVTFQEMSRRTQEHNKAVAKQRASAEGPDAEPARVKVAPIKPGRHLGQAAQPRRRRQR >gi|319979268|gb|AEUH01000032.1| GENE 9 9125 - 11035 2842 636 aa, chain - ## HITS:1 COG:ML0487 KEGG:ns NR:ns ## COG: ML0487 COG0342 # Protein_GI_number: 15827164 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Mycobacterium leprae # 2 547 38 563 597 216 33.0 1e-55 MRSLVVLLVAMVAAGAALVIGHVTKGVSYVPDLALDLQGGTQLILTPKATAEGQREITEN DITQAISIIRNRIDASGVAESEITSMGNSNITVSIPGTPSQETLDLIRSSSQMNFRPVLR MGAATGVRQDASQTQANDPSGSQSGSDSQSGAGQSAQEPDQQDAQSGQQVSTAMDPATAL ATADQDKDGRLSDAPASTPPNASDLGWVTEQVMYDFLTLDCSAAADASRTEGAADKPYAA CDSNGQIKYILGPVTVPGSDLKAASAGQVRNSTGQTTGEWGVDLQFNDAGTQAFSESSTI LYGFHSQDAQGSSFYRGSPDRNHFAVVLDGTVITAPSMNAVIPNGQAQITGNFTATSAAS LANQLSFGSLPLNFEVESEQQISATLGSDHLEKGLWAAVIGFALVILYLIWQYRGLAVIS AGSLIIATVITYQVIALLSWLMGYRLSLAGVAGLIMAIGVTTDSFIVYFERVRDEVREGR PLRAAVEEGWDRAKRTIVVSDAVNLVAAVVLYLLAVGGVQGFAFTLGVTTVIDLAVIFLF THPMMELLIRTRFFGQGHKLSGLDPEHLGAKNSLVYAGRGRVVSRGSAASGGADGADESK SIAQRRREARLAARASADEAGADAVRTGAAEQEGER >gi|319979268|gb|AEUH01000032.1| GENE 10 11091 - 11528 585 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192654|ref|ZP_06609608.1| ## NR: gi|293192654|ref|ZP_06609608.1| putative preprotein translocase, YajC subunit [Actinomyces odontolyticus F0309] # 1 125 1 124 135 135 61.0 9e-31 MDQNLLLIAMLVLAVGVMWYSGRSRKQQMAKMEEEKREQLRTVTPGAWVHTRVGFWGRFV DLDGDIVVLETTDGHEMYWDRQMIGEIGGEPPLEGAGAQEGAEDDDEAPEEDEAVLGLET SADQSDGPEDAGAATGAADEADDKN >gi|319979268|gb|AEUH01000032.1| GENE 11 11644 - 12657 1346 337 aa, chain - ## HITS:1 COG:Cgl1620 KEGG:ns NR:ns ## COG: Cgl1620 COG2255 # Protein_GI_number: 19552870 # 
Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Corynebacterium glutamicum # 12 331 39 358 363 407 68.0 1e-113 MRVVAPDAGEIERAAEAALRPKRLAEFVGQRVVRGQLQLVLDAARGRGASPDHVLLAGPP GLGKTTLAMIIAAEMGTSLRLTSGPAIQHAGDLAAILSGLQEGDILFIDEIHRLARTAEE MLYLAMEDFRVDVVVGKGPGATSIPLTLPRFTAVGATTRSGLLPAPLRDRFGFTAHLEFY ETDELEQVVARSASLLGAPLGEGAAHEIASRSRGTPRIANRLLRRVVDYAQVHGDGAATL GAARAALALFEVDPLGLDRLDRAVLEAVCKRFAGGPVGLTTLSVTIGEEAETVETVAEPY LVREGFLVRTNRGRMATPRAWEHLGLAPPDAGAVLFR >gi|319979268|gb|AEUH01000032.1| GENE 12 12686 - 13285 851 199 aa, chain - ## HITS:1 COG:Cgl1621 KEGG:ns NR:ns ## COG: Cgl1621 COG0632 # Protein_GI_number: 19552871 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Corynebacterium glutamicum # 1 196 1 205 206 130 44.0 2e-30 MIALLRGVVESIGLDQVVVSAGGVGFGVRVTPAHAQSLTRGDEAVVHTAMVVREDSLTLY GFASADERDVFERLMGVTGIGPKTALAALAVLRPDDLRRAVRDQDIATLQKIPGVGRKSA QRMALEVGGKLGAPAQLPEQGGPAAAPAGEVEAEVRAALVGLGWSEAQAAKAVDALSGQG LGASDMLRSALVSLGGGRG >gi|319979268|gb|AEUH01000032.1| GENE 13 13343 - 13963 761 206 aa, chain - ## HITS:1 COG:Cgl1622 KEGG:ns NR:ns ## COG: Cgl1622 COG0817 # Protein_GI_number: 19552872 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Corynebacterium glutamicum # 1 203 1 200 213 129 41.0 5e-30 MGIDPGLTRCGLGVIDVDPSRRATLVHVGVARSDKDLATHFRLTAIADAIDAVIERHRPE VVAIERVFAQDNLQSVTTTMQVMGAAMACVGRAGLPLAVHTPSEVKSAVSGNGNAGKAQV QTMVARILGLAEAPRPADAADSLAIAICHAWRGTGLLGAPADSSIAVSLSGRLTARTEMT PAQRAWAQAQAAQRRTGAVDPARRRG >gi|319979268|gb|AEUH01000032.1| GENE 14 13963 - 14568 637 201 aa, chain - ## HITS:1 COG:ML0474 KEGG:ns NR:ns ## COG: ML0474 COG0311 # Protein_GI_number: 15827156 # Func_class: H Coenzyme transport and metabolism # Function: Predicted glutamine amidotransferase involved in pyridoxine biosynthesis # Organism: Mycobacterium leprae # 4 196 31 221 
223 184 55.0 7e-47 MVTIGVLALQGDVAEHVRALEASGARARVVRREPELDGVDGLVVPGGESTTMSKLLVSFG LFDPLARRIGAGMPVYGSCAGMIALASTIVDGRDDQLCFGALDMVVRRNAFGRQIDSHEE DLRVEGIAGGPLRAVFIRAPWVESVGPGVRVIARSGNREDGPIVAVRKGAVMATSFHPEI GGDHRFHALFVDAVRGSGQCA >gi|319979268|gb|AEUH01000032.1| GENE 15 14562 - 15464 1402 300 aa, chain - ## HITS:1 COG:Rv2606c KEGG:ns NR:ns ## COG: Rv2606c COG0214 # Protein_GI_number: 15609743 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxine biosynthesis enzyme # Organism: Mycobacterium tuberculosis H37Rv # 3 300 2 299 299 459 84.0 1e-129 MSDTTAQREVGTPRVKHGMAQMLKGGVIMDVVTPDQAKIAEDAGAVAVMALERVPADIRA QGGVARMSDPDLIDGIIDAVSIPVMAKARIGHFVEAQVLQSLGVDYIDESEVLTPADYEH HIDKWAFTVPFVCGATNLGEALRRINEGAAMIRSKGEAGTGDVSNATTHMRKIRDEIRRL TSLPADELYVAAKELQAPYELVAEVAREGRLPVVLFTAGGIATPADAAMMMQLGAEGVFV GSGIFKSGDPAKRAAAIVKATTFHDDPAVIAEVSRGLGEAMVGINVDDVPVGHRLAERGW >gi|319979268|gb|AEUH01000032.1| GENE 16 15467 - 16228 1345 253 aa, chain - ## HITS:1 COG:MT2678 KEGG:ns NR:ns ## COG: MT2678 COG0217 # Protein_GI_number: 15842143 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis CDC1551 # 1 251 1 250 251 327 70.0 2e-89 MSGHSKWATTKHKKAAIDAKRGKLFARLIKNIEVAARTGGGDPAGNPTLYDAIQKAKKNS VPSDNIDRAVKRGSGAEGGGANYETIMYEGYGPSGVAFLVECLTDNRNRAASDVRLGFTR HGGSLADPGSVSYLFSRRGVVEVPPADGIDEDAIMEAVIEAGAEEVEDTGDGFVVETEPQ DVVAVRTALQEAGIDYDSAEVQFVASTEVELDLEGALKVNKLIDALEDSDDVQNIYTNMS LSDEVRAQLEEAE >gi|319979268|gb|AEUH01000032.1| GENE 17 16381 - 17127 760 248 aa, chain - ## HITS:1 COG:mlr0774 KEGG:ns NR:ns ## COG: mlr0774 COG2062 # Protein_GI_number: 13470938 # Func_class: T Signal transduction mechanisms # Function: Phosphohistidine phosphatase SixA # Organism: Mesorhizobium loti # 86 224 1 144 167 72 35.0 1e-12 MRWSKCSGQLNHVFVRPVWMAGLRQMIASVARRVHPRAEHTPCSGPFANAMTLSSWWNRS QAGTARRPRRHRPTVGGAEPLSFSSMPTLVLIRHAQAGHGFRDFDRPLTRAGRGQADRLG AALASRIGSVDIAVHSAARRTTQTFERVRARLAVGEHWADKGLYYADTEDAVALARAFDP 
GAASALIVGHEPTMSATGHYLAREGDRDLIGWGIPTATAIVLAFDGPWEGLGEAGCRVRD VLLNDPAL >gi|319979268|gb|AEUH01000032.1| GENE 18 17114 - 17197 142 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDHLKRYFTYRSFPYEVKPFPFEDGPG >gi|319979268|gb|AEUH01000032.1| GENE 19 17258 - 18580 902 440 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|256804209|ref|ZP_05533833.1| ## NR: gi|256804209|ref|ZP_05533833.1| TPR repeat-containing protein [Streptomyces viridochromogenes DSM 40736] # 23 283 1394 1672 1823 77 28.0 3e-12 MAATKRLADFAERNPEQQARWANSLSVRLSAVGDWEGALGAAREAVGLYRALAVKYPASH TSDLAASLHVLAASLSEAGQREESLAVFIEDFDRFPPATRAYLLLTRATWRNGGQEEDLK AAAHEANAADAPTFLGPVRRMIAHAVADSELELEGLPPWATVTINSTMRGRLNAWLDCSD RSEQADLVETTYSSPSDSERVALSAAAELYVDVPPLRELTALVDGIAEDGLEAVVTRLRC LHRSQVLAADWYNAHMNGRGAQYLHEHMMRGSDAPQRDSGSDEGSKEEPKEWEKGLYDPS VRAEVLQVLASGLPEAAADEMRRILALTELPDPGTAYAARTSDEGAEDALKEFLIARNWR AMITVLEVRPGMGDSPYGRVAAVLAAAARDDPGGRMRELMDYASEKLDAVDRRLVEALVD VALHTKDCPPAFADLRDWFK >gi|319979268|gb|AEUH01000032.1| GENE 20 18698 - 18853 149 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSATSRAQPHRYGRALKASPARRRTLVGSRGAARVSATHTFCDCCPSNNAR >gi|319979268|gb|AEUH01000032.1| GENE 21 19637 - 20227 424 196 aa, chain + ## HITS:1 COG:no KEGG:Cfla_3211 NR:ns ## KEGG: Cfla_3211 # Name: not_defined # Def: hypothetical protein # Organism: C.flavigena # Pathway: not_defined # 2 175 53 222 222 149 56.0 4e-35 MTHYMGLLAVRQPWNLLLFMALPVVLAETLAITELVMLLARRGERPAPGWVTKVGHWAGL LAGPVWVFIGLHLLITAVVPLTVNGGWRGPADVIAVLSYLAGGVPMLVISLLEAELLGRD ETGRLRLHVAMVAVFLVVAHVAMIFGMLDPAVMGYSPKAPASVQHDMGSMSGMNHDMGGM TDMDHSTMMPSSAPSN >gi|319979268|gb|AEUH01000032.1| GENE 22 20259 - 20975 601 238 aa, chain + ## HITS:1 COG:Cgl2904 KEGG:ns NR:ns ## COG: Cgl2904 COG0745 # Protein_GI_number: 19554154 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Corynebacterium glutamicum # 12 236 15 239 240 233 56.0 2e-61 
MNSASVAVVPSVLVVDDEPVLSGTVKNYLERAGMSAQTCGDGLAALDLVRATAPDVVVLD LGLPGMDGVEVCRQLRTFSDCYVLMLTARADEVDKLIGLSVGADDYMTKPFSPRELVARI QVLLRRPRAGSTGATAAVVHRIGALTLDPSSRRVELDASPVELTRTEFDLLQALAEHPGW VLNRRQLTDAVWGEDWVGDDHLVDVHIAHLRKKLGDDPSQPRFVQTVRGVGYRMGKGQ >gi|319979268|gb|AEUH01000032.1| GENE 23 21044 - 22081 1094 345 aa, chain + ## HITS:1 COG:Cgl2903 KEGG:ns NR:ns ## COG: Cgl2903 COG0642 # Protein_GI_number: 19554153 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Corynebacterium glutamicum # 9 341 30 364 399 234 45.0 2e-61 MAATTVLTALLVGPAWFRMHMLEAGHAQPDVVEHAQQAFHDAGLVSLAVGLGTAMLAALG VATVLNRNLSRGVEALVDGAQHVAAGHYDQPVMMTSASPELAQVADAFNHMASQIASTEQ TRRRMLTDLGHELRTPLAATQVTLEGLQDGVVDFTPANIEILIRQNQRLAALAADISEVS RAEEGRIPLSLSHQDLDQIVTASVAAARHAYDATGVTLHAQTVPGLRVDVDAARIGQVLD NLLRNALQHTVGGNRVEVAMRHEHDRAVVRVTDNGVGIPTEALTHVFERFYRVEDTRTRD VDSGTGVGLAISRAIARAHGGDLTAHSDGPGHGATFALTLPTVTS >gi|319979268|gb|AEUH01000032.1| GENE 24 22262 - 23623 1584 453 aa, chain + ## HITS:1 COG:Cgl2906 KEGG:ns NR:ns ## COG: Cgl2906 COG2132 # Protein_GI_number: 19554156 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Putative multicopper oxidases # Organism: Corynebacterium glutamicum # 7 453 44 492 493 437 52.0 1e-122 MTIPSQTPLVAAPGTKTVNHVLTPRPVTLDLGGVTARTWAYDETLDAPVLRARAGDLLQV RVENKLPTSTSVHWHGIALRQPSDGVPGVTQQPIEAGSSFTYQFVAPDPGTYFFHPHTGV QIDRGLYRPLVIDDPAEPGRYDHEWIITLDDWADGVGTSPDDILAAFKAQNGTVSSGMNH DMDGMDHGMSPLGDAGDVTYPHYLVNGRVPAAPRTLTAKPGQKVRLRVVNASSDTIFKLA LQGHRLTVTHTDGFPVTSTQASAVYLAMGERLDATITLGDGVFVLQAAPEGKKGTPARAI VRTGSGNVPAPDTRIAELDGKALLTTQLKPADSARLPDRKPDTTLDVALNGQMKPYAWGL NGKRFGEDTPLALSRGQRVRLRMTNMTMMAHPMHIHGHTWALPGSDGLRKDTVLIRPMET VEADLQADNPGTWMLHCHNIYHAELGMMTTLRY >gi|319979268|gb|AEUH01000032.1| GENE 25 23811 - 24107 306 98 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYTSPDQAPRSFGEAPRQPLTGPMTDQDPTSTTHQHGKSHHWMHLLMCAPMLLVVAGYLW AGRASLAAGGVLAALVPAISCMLMMWLMMRMMDHGSQH >gi|319979268|gb|AEUH01000032.1| GENE 26 24153 
- 26630 1888 825 aa, chain + ## HITS:1 COG:SMa1087 KEGG:ns NR:ns ## COG: SMa1087 COG2217 # Protein_GI_number: 16263042 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Sinorhizobium meliloti # 102 825 16 732 733 746 57.0 0 MNHPAATATHDHTTRYTCPMHPEVTSDQPGRCPECGMFLVDADAAPATHQHGHGRQNGHA QQHDTHDKGTWLCPMHPEVTSDHPGRCPKCGMFLQPADGGDAPAAQHHDHQHPTGEPAVV AGAVAAGDWTCPMHPEVRSDGPGDCPICGMALERVSAGLDDGPNQELVDMRRRFRVAAVL SVPLVAMVMVPMVLGRALPGSVAPWLELALSTPVVWWAGWPFFVRGAKSVVSRHLNMFTL VSLGVGAAWLYSVVAVLAPGLFPSGMRAMDGRVGTYFEAAAVIITLVLLGQVLELRARDQ TSGAIRTLLDLSPATAHRIGADGTETDVPAAELQLGDRCRVRPGEKVPADGTVVDGHAYV DESMITGEPVPVDKSPGDRVIGGTIIQGGSLVVEATGLGADSTLARIVDLVSQAQRSRAP IQGLVDKISAVFVPVVIAVALATFGLWLAIGPQPRLPFAIVAAVSVLIIACPCALGLATP MSIMVGVGRGASEGVLVKNAEALERLQKVDTLVVDKTGTLTQGRPSLVDQQGVDGHEDAR TLLLAAAVEAGSEHPLARAVVDAARQTGRTVPAASDFAAHPGGGVSATVDGQHVLVGSPA FLGSQHVDTHALDAVVEAYRRRGATAIVVAVDGRPASVLAIADPLKATTAGAIEDLRRRG MKVVMLTGDNATTARAIADELRIDQVVADVLPDQKHGHVQALQAQGHTVVMAGDGVNDAP ALAVADVGVAMGTGTDVAIESADVTLLGGDLAALVKARDLSVDTMRNIRQNLVFAFVYNV VGIPLAAGALYPAFGWLLSPVIAAAAMALSSVSVITNSLRLRRHR >gi|319979268|gb|AEUH01000032.1| GENE 27 26656 - 27471 1013 271 aa, chain - ## HITS:1 COG:AGl50 KEGG:ns NR:ns ## COG: AGl50 COG1484 # Protein_GI_number: 15890129 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 18 268 1 251 252 176 38.0 6e-44 MTTTSTSRPESLRRRQGLTEQAAQAAIDQACRRLRLPTIRAVVDDAVTAATKEQLTYQGF LAELLLAEVDDRDRRSTLRRIKSAGFPREKWLADFDFTANPNINPATINELATGDWIRRG DPLCLIGDSGTGKSHLLIALGTAAAEQGYRVRYTLATRLVNELVEAADEKQLTKTINRYG RVDLLVIDELGYLELDRRGAELLFQVLTEREEKNAIAIASNQSFSAWTDTFTDPRLCAAI VDRLTYNATIIETGTNSYRLAHTRARASVMG >gi|319979268|gb|AEUH01000032.1| GENE 28 27468 - 29135 1682 555 aa, chain - ## HITS:1 COG:AGl49 KEGG:ns NR:ns ## COG: AGl49 COG4584 # Protein_GI_number: 15890128 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated 
derivatives # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 473 1 450 498 185 32.0 3e-46 MGSRVQLYADIRHDARVDGLSIRELARKHGVHRRTVRQALAAAEPPPRKKPVRTAPRLDP YKPAIDEMLTYDLTAPRKQRHTATRILARLRDEHGATDLSYSTVRDYVRVRRAQIDLEAG RRVEAMVPQDHAPGAEAEVDFGEVYVILDGVKTKCHMFVYRLSHSGKAIHRVYPTGGQEA FLEGHIEAFHALGGIPTRHIRYDNLTSAVVQVIHGGDRLRDENERWVLFRSHYGFDAFYC QPGIDGAHEKGGVEGEVGWFRRNHLTPMPEVATLDELNDKIRAWEHDDNTRRITGHANTI GQDHHAELPHLAPLPADDFDPGLILHPRVDRSALITVRMVKYSVPAHLIGQRVRVSLRAS CVVVFEGRTVLATHPRLGTRGVTRVELDHYLEVLRHKPGAFPGSTALAQARAAGAFTAAH DAFWAAARKTSGDVAGTQALIDVLLFHRSLPSDAAIAGITSALSVGAISPDVVAVEARRH ATTHTPAPASQTRGGTVVNLPPRRSTNPHQVIAHLPEDTRPLPTVTAYDELLCRRQPDPA PAPAETHHHTPTGTP >gi|319979268|gb|AEUH01000032.1| GENE 29 29251 - 30621 690 456 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|256804209|ref|ZP_05533833.1| ## NR: gi|256804209|ref|ZP_05533833.1| TPR repeat-containing protein [Streptomyces viridochromogenes DSM 40736] # 6 404 9 400 1823 168 33.0 7e-40 MRERVKWDSTHNPVGLVDMVHAGWVQRLLLAADWSGFPGPEPDLVKNGQTRVGAQAKKIF EKLVDLGIEYVHEPPDSVPGAQWIRPVTEVLGFRQATCVDLCVTFCCAALDAGIYPLIVT LTTANGKQRHSIVVVPLGRTWSTGCDAVIESGFSREPLAVDGCALAGVVAEYADDPTGTW LAIDVQQAMMPKGDWGTALSRGADYLQEWKWDVCVDVGGQRSHKADDAVPPGGNLERILA PARTPLPQDFTPLQLIKARHAVVSFEERSEYRKLRQWATTPARTSTDTANGAGADIAVAV VTGKGGSGKTRMAVELCGDLSSTGWYTGFLRTTTDVTDQELAALEDLATELMVVVDYAEE AQRGRLAEVFRALLVRRAPTRIVLTARGADAWWDEFREEVEQDGLELSNTLVVSNLGKAR QEEDQGLLNRIYIRAVRGFSARLYHSWLWQLGSAPL >gi|319979268|gb|AEUH01000032.1| GENE 30 30606 - 31970 937 454 aa, chain - ## HITS:1 COG:no KEGG:RHA1_ro05343 NR:ns ## KEGG: RHA1_ro05343 # Name: not_defined # Def: hypothetical protein # Organism: Rhodococcus_RHA1 # Pathway: not_defined # 97 434 105 445 466 89 28.0 3e-16 MIPGIGGSELADEAGRVVYGSGVRRLVGSALDPGALDIGNDLRPVGLIGPCSVVFKQLVT GYDGLIRGLGRALGLSEAQVATAGPDLASADAALVAFPYDFRRPVERIAHDLDREVRRRA QGRRVVLVAHSMGGLVAAWWWAFLSEGVEVADIITLGTPYRGAAKALNVLVNGVRVGGHE LSGLTGVLRTWDSVFDLLPHYQVVEDGGGGPYPYQLPSTVTEAVPDFSARALKAYRANRD MHRALVEKAGSGGNPFTSYYSQGHATLGRAIVDAASGQLEVAKGNPQELPPSWDGGDGTV 
PVFSTIPDSLEDDVNRRRRLVGKHQDLVEEKPIFDHVSEHLRDRLPPAAQGGARDEGGAY VQLDLDDVLPIGESQAIRMRVVDSRDQILEVSGVGGNVGGQRFRAERREDGWWSARLPAL EEGVHQLTVIATGVPGADRILFNTRVGAASCVSE >gi|319979268|gb|AEUH01000032.1| GENE 31 32173 - 32853 779 226 aa, chain - ## HITS:1 COG:Cgl2835 KEGG:ns NR:ns ## COG: Cgl2835 COG0406 # Protein_GI_number: 19554085 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Corynebacterium glutamicum # 2 219 4 219 223 79 31.0 5e-15 MRIVLVRHGQTAANTAGALDTVRPGLPLTAEGREQAERLAARWESEVCGPPDVIAVSGLT RTRQTAAPLAREYGLVPVVHPGIRELRSGDVEMASDVCSQITYVRTVLRWCAGDLAARMP GGESGREALARSVGAVHSIALGARAEHGPGAVVVFVVHGGLTRLLSTALADNLDETLVNT HFVGNTGTIVMEWAPDFAPATADDLVGGLHALTWDDRPVGDYEGAL >gi|319979268|gb|AEUH01000032.1| GENE 32 32850 - 33164 118 104 aa, chain - ## HITS:1 COG:no KEGG:Cfla_2118 NR:ns ## KEGG: Cfla_2118 # Name: not_defined # Def: (glutamate--ammonia-ligase) adenylyltransferase (EC:2.7.7.42) # Organism: C.flavigena # Pathway: not_defined # 1 100 913 1012 1013 84 53.0 2e-15 LRVTGTRAAVGAARGAGLLSDEDAATLLGAWELASRIRAGNALASGRMTGDKLDILGRDA RDLAPLARILGYPRGNEGALEERWRQSARRARAVMEHVFWESDS Prediction of potential genes in microbial genomes Time: Thu May 12 17:04:21 2011 Seq name: gi|319979266|gb|AEUH01000033.1| Actinomyces sp. oral taxon 178 str. F0338 contig00033, whole genome shotgun sequence Length of sequence - 2566 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 2566 2918 ## COG1391 Glutamine synthetase adenylyltransferase Predicted protein(s) >gi|319979266|gb|AEUH01000033.1| GENE 1 1 - 2566 2918 855 aa, chain - ## HITS:1 COG:Cgl2179 KEGG:ns NR:ns ## COG: Cgl2179 COG1391 # Protein_GI_number: 19553429 # Func_class: O Posttranslational modification, protein turnover, chaperones; T Signal transduction mechanisms # Function: Glutamine synthetase adenylyltransferase # Organism: Corynebacterium glutamicum # 8 854 57 945 1045 501 39.0 1e-141 ARCARRCADPDQMLLLATRLFEAAPAAVAEAADGPGERLERLCAVLGASAWLGEYLVARP GALGALWEPVGDARAEVLGAVGARPVGPVAGRLVASGGAGADDLRRAYRRVQLGIAADDL TSADAPAAVPGVGRRLAELADASLEAALALARRDADPGADVPLAVIAMGKTGAQELNYIS DVDVVYATEAADGLTEQEAASRAARLASLAAAACSGPGAEAPLWTVDANLRPEGRDGALV RTIGSYVAYWRKWAQTWEFQALLKARAAAGDADLGRRFEEAAAPFVWSAATREGFVDAAR RMRARVEESVGAARAAQEIKLGRGGLRDVEFTVQLLQLVHGRTDGALRVRGTTEAIGALA RGGYIGRADAAELCDCYAFLRAVEHRAQLPRMRRTHLVPAREAELRALGRALDPGRWPGA QGIREEIARVRSRVRALHEDVFYRPIVAATAGLSARDAVLGDEGARDRLAAIGYAAPAAA LSHIGALTKGTSRRATIQRHLLPVLISWLADGADPDMGLLNFRALSEQIGDSHWYLALLR DSGAAASRLMSMLPNSRWIAEALSTRPEAVAWLDDDAELEARPRSALVKEALALVERHPG AEEAADRVRAVRSRELTRAAAAQLVRGVDPASSAVSDATDAALLGALRIAERDEEERWGR ARAHVLLVAMGRYGGRESSFASDADVVAVHRAAPGAGEEEAAASALAVVGRVRELLGAPG PQMGVSVDLGLRPEGRNGPMSRSLGAYADYFRSWASPWERQALLRARPIGSGPLADAYLD VIDPVRYGPAPDGAALREIRLLKARMEAERLPRGADPALHVKLGPGGLADVEWTIQLLQL RHARGCAALRVTGTR Prediction of potential genes in microbial genomes Time: Thu May 12 17:04:35 2011 Seq name: gi|319979234|gb|AEUH01000034.1| Actinomyces sp. oral taxon 178 str. F0338 contig00034, whole genome shotgun sequence Length of sequence - 38514 bp Number of predicted genes - 34, with homology - 29 Number of transcription units - 16, operones - 7 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 103 - 1440 1830 ## COG0174 Glutamine synthetase 2 1 Op 2 26/0.000 - CDS 1519 - 2469 1512 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 3 1 Op 3 . 
- CDS 2480 - 2923 645 ## COG1585 Membrane protein implicated in regulation of membrane protease activity 4 1 Op 4 . - CDS 2941 - 3735 195 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 5 1 Op 5 . - CDS 3745 - 4968 1706 ## COG0438 Glycosyltransferase 6 2 Op 1 . + CDS 5068 - 6375 1771 ## COG0448 ADP-glucose pyrophosphorylase 7 2 Op 2 . + CDS 6372 - 7055 572 ## COG0560 Phosphoserine phosphatase 8 3 Op 1 . - CDS 7069 - 7608 734 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 9 3 Op 2 . - CDS 7605 - 8891 1625 ## Bcav_1985 membrane protein 10 3 Op 3 . - CDS 8893 - 9516 948 ## Cfla_1788 hypothetical protein 11 3 Op 4 4/0.000 - CDS 9513 - 10685 1601 ## COG0438 Glycosyltransferase 12 3 Op 5 3/0.200 - CDS 10682 - 11608 1233 ## COG1560 Lauroyl/myristoyl acyltransferase 13 3 Op 6 3/0.200 - CDS 11605 - 12234 913 ## COG0558 Phosphatidylglycerophosphate synthase 14 3 Op 7 4/0.000 - CDS 12238 - 12840 736 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 15 3 Op 8 . - CDS 12843 - 14909 3088 ## COG0441 Threonyl-tRNA synthetase 16 3 Op 9 . - CDS 14939 - 15964 1570 ## Balac_1169 hypothetical protein 17 4 Tu 1 . - CDS 16199 - 17449 1932 ## COG3919 Predicted ATP-grasp enzyme 18 5 Tu 1 . + CDS 17608 - 17772 329 ## 19 6 Op 1 . + CDS 17904 - 18533 532 ## 20 6 Op 2 . + CDS 18559 - 20286 2131 ## COG0366 Glycosidases + Term 20312 - 20365 19.3 21 7 Tu 1 . - CDS 20379 - 22013 2194 ## gi|293192615|ref|ZP_06609569.1| putative LPXTG-motif protein cell wall anchor domain protein - Prom 22060 - 22119 1.8 22 8 Op 1 . + CDS 21871 - 22143 173 ## 23 8 Op 2 . + CDS 22152 - 22781 -189 ## - TRNA 22553 - 22627 85.0 # Val CAC 0 0 - TRNA 22629 - 22699 61.1 # Cys GCA 0 0 - TRNA 22735 - 22807 88.7 # Gly GCC 0 0 - TRNA 22870 - 22941 78.8 # Val GAC 0 0 - TRNA 22943 - 23015 88.7 # Gly GCC 0 0 24 9 Op 1 . 
+ CDS 23182 - 25293 3178 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases + Term 25301 - 25339 -0.9 25 9 Op 2 . + CDS 25341 - 25913 665 ## Bcav_1965 hypothetical protein 26 9 Op 3 . + CDS 25915 - 27153 1367 ## COG0349 Ribonuclease D - Term 27464 - 27511 -0.2 27 10 Tu 1 . - CDS 27599 - 29497 2253 ## COG1154 Deoxyxylulose-5-phosphate synthase 28 11 Tu 1 . + CDS 29564 - 31594 2423 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 29 12 Tu 1 . + CDS 31729 - 33441 1863 ## COG0038 Chloride channel protein EriC 30 13 Tu 1 . - CDS 33428 - 34696 1370 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 31 14 Tu 1 . - CDS 34908 - 36890 2980 ## COG0531 Amino acid transporters 32 15 Tu 1 . - CDS 37000 - 37062 80 ## 33 16 Op 1 1/0.600 + CDS 37057 - 37722 897 ## COG0569 K+ transport systems, NAD-binding component 34 16 Op 2 . + CDS 37727 - 38416 835 ## COG0569 K+ transport systems, NAD-binding component Predicted protein(s) >gi|319979234|gb|AEUH01000034.1| GENE 1 103 - 1440 1830 445 aa, chain - ## HITS:1 COG:MT2280 KEGG:ns NR:ns ## COG: MT2280 COG0174 # Protein_GI_number: 15841714 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Mycobacterium tuberculosis CDC1551 # 1 444 1 446 446 521 58.0 1e-148 MDRSQEHVLQEVAEKDIRFIRLWFTDVAGVLKSVSTDPGELEDAFHEGIGFDGSAVQGLT RVCESDMLLRPDASTFQMLPSVGGDDDRVARMFCDVHTPDGRQAASDPRGVLARQVARAQ ELGFTVMVHPEIEFYLLRQPVSVERMVPVDNAGYFDHVVRGTSNSFRRRAVRILEDMGIP VEFSHHEGGPGQNEIDLRAVDPVRAADNIMTARTIIEEVALGEDLMATFMPKPFTDHPGS GMHTHLSLFEGDENAFHAPAGRYQLSETGRRFIAGLLAHAGEIAAITNQHVNSYKRLWGG GEAPSYVCWGHLNRSALVRVPLYKPNKRRAARIEYRAPDPSANPYLALACLIAAGLDGIA KRMELAPEAEDNVWDLSDRERQVMGIHALPNSLSQAVAAMRSSELVAAALGEEVFDFVIR NKLNEWREYRRQITVQELRQFLKVH >gi|319979234|gb|AEUH01000034.1| GENE 2 1519 - 2469 1512 316 aa, chain - ## HITS:1 COG:MT1533.2 KEGG:ns NR:ns ## COG: MT1533.2 COG0330 # Protein_GI_number: 15840949 # Func_class: O Posttranslational modification, 
protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Mycobacterium tuberculosis CDC1551 # 13 311 11 309 381 310 54.0 2e-84 MTSYGLPVTAFVLALLLIFIVVALVRSVRIVPQSQAYVIERLGRFQAVFYGGFHLLVPFV DRVASRIDLREQVANFPPQSVITADQAMVSIDSVIYYQITDPRNATYEVANFIQAIEQLT ATTLRNLIGSLDLEQTQTSRDSINKQLRGVLDEATGTWGIRVTRVELKSIEPPPRVLAAM EQQITAERTKRATILSAEAEREAQIKRAEGAKQAAVLAASAQQEAQVLQARGEKDAQILR AEGARQSQILRAQGEAEAIAAVFSAINAGGATPALLSYKYLEMLPKIADGQASKVWVLPS DLTGALDAISKGFSGR >gi|319979234|gb|AEUH01000034.1| GENE 3 2480 - 2923 645 147 aa, chain - ## HITS:1 COG:Cgl1497 KEGG:ns NR:ns ## COG: Cgl1497 COG1585 # Protein_GI_number: 19552747 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Corynebacterium glutamicum # 8 140 5 138 142 60 35.0 1e-09 MDMDPIWLWVVAAVVCGIIEVLSVSFVFLMLGVGALAGAVVAGAGGGAYLQISVFAAVSV VLLLAVRPFLKGRLYQSSPDVRTNTDALIGAGGYALSTITVRDGRARLRGAEWSARTQAG TIEEGAPVVIIGIDGATAIVAPSTEPQ >gi|319979234|gb|AEUH01000034.1| GENE 4 2941 - 3735 195 264 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 223 1 214 311 79 27 2e-14 MTNVLSFDRATVVRGERRILEDVTWRTRDGEHWVVLGPNGAGKTTLVRAACARVPLTSGS IALDGAPVDRIDPAELGTRISLASSAVASRIRGAQSALDTVRSAAWGVHVAHHEHYEGQD EERARDLMAAFGVSHLADHPFGSLSEGEAQRVQLARALMSDPEVLILDEPTAGLDLGARE TLVSALDEIIAGKRSPQVVLVTHQIEEIPTGITHCAVMRGGALIAQGPISDTLTGVVLSE AFSLPLLAGMADGRWWARAASSQD >gi|319979234|gb|AEUH01000034.1| GENE 5 3745 - 4968 1706 407 aa, chain - ## HITS:1 COG:MT1250 KEGG:ns NR:ns ## COG: MT1250 COG0438 # Protein_GI_number: 15840656 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Mycobacterium tuberculosis CDC1551 # 1 406 1 385 387 366 50.0 1e-101 MRVDLLTREYPPHVYGGAGVHVLELAKVLRPLVDDLRVHAFDGPRQPGTEGADPGVTGFD 
NLPQLEGANPALATLGVDLQIASACTGADLVHSHTWYANFAGHVASLLDDIPHVVSAHSL EPLRPWKAEQLGGGYRLSSFVEKSAYESAAAIVAVSRGMREDILRCYPRVEPDTVHVIHN GIDLAKWHAPEGAQGEELQARVLAEHGIDPSKRTVVFVGRITRQKGLPYFLRAARELPDD VQLVLCAGAPDTKEIASEVDGLVAQLKEKRSGVVLITEMLPQPEVAAILDAADVFITPSV YEPLGIVNLEAMALGLPVVGTATGGIPDVIVDGETGYLVPIDQKTDGTGTPLDPEAFEQA MAERLIKILDDPAMARRMGQAGLERARAHFSWEAIGAKTVELYKRLV >gi|319979234|gb|AEUH01000034.1| GENE 6 5068 - 6375 1771 435 aa, chain + ## HITS:1 COG:Cgl1094 KEGG:ns NR:ns ## COG: Cgl1094 COG0448 # Protein_GI_number: 19552344 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Corynebacterium glutamicum # 22 415 4 392 409 439 58.0 1e-123 MYYHVRTPTDAKGLRINRHRGGHMAKNHVLAIVLAGGEGKRLMPLTEVRAKPAVPFGGHF RLIDFALSNIVNSGYLKIVVLTQYKSHSLDRHIAKTWYTSPLLGNFIAPVPAQQRRGPHW YLGSADAIYQSLNIVDDEQPEYIVIIGADNIYRMDFSQMVQHHIDSGLPATVAGIRQPIE LAPALGVIDGEGGRIKKFLEKPKHAEGLPDDPTKVLASMGNYVFTTKDLVAALREDAKDP DSKHDMGGNIIPWFVERGECGVYDFQDNDVPGSTDRDRDYWRDVGTLDAYYEANMDLISV HPVFNLYNRDWPTMTLINGSLPPAKFVYADEGKRVGRAIDSFVSPGVIVSGGSVERSILS PGTYVHSWAEVSDSVVMDGCRVGRHTKVVKTILDKNVVVEEGAVVGVDLAHDRERGFTVT ESGITVVPKGTVVTK >gi|319979234|gb|AEUH01000034.1| GENE 7 6372 - 7055 572 227 aa, chain + ## HITS:1 COG:L0085 KEGG:ns NR:ns ## COG: L0085 COG0560 # Protein_GI_number: 15672587 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Lactococcus lactis # 20 217 5 202 216 192 48.0 3e-49 MSPTATGPGLPQFFADGGPGLVVTDVDSTLIEQEVIEELAAEAGAREAVARITARAMNGE LDFAHSLRERVAALAGLPVSVCARVAERVTVTRGARELIGAVHRAGGRFGVVSGGFVEVV EPLARSLGIDFHAANRLEASDGVLTGRVVGRIVTAEVKTACLRRWAAQCSVPLARTVAIG DGANDVPMMREAGVGIAFCAKPAVRRLVAHQLNEPRLDALIAPLGLA >gi|319979234|gb|AEUH01000034.1| GENE 8 7069 - 7608 734 179 aa, chain - ## HITS:1 COG:Rv2609c_2 KEGG:ns NR:ns ## COG: Rv2609c_2 COG0494 # Protein_GI_number: 15609746 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage 
repair enzymes # Organism: Mycobacterium tuberculosis H37Rv # 21 155 1 142 172 87 38.0 1e-17 MSADGPGDWPRDSEGYPHRQAARVVLFDARGRLLLAIGHDADDPERQWWFTIGGGIEEGE DPAAGAVREVREETGIRLGVEDLVGPVLYRTAEFDFAAVTARQDEWFFVARTECAEVSRE GWTDLEKEVLDGLKWWDLDELEALDGAAEVYPRQLVGFAREWRDGWDGRLVSLTGAREP >gi|319979234|gb|AEUH01000034.1| GENE 9 7605 - 8891 1625 428 aa, chain - ## HITS:1 COG:no KEGG:Bcav_1985 NR:ns ## KEGG: Bcav_1985 # Name: not_defined # Def: membrane protein # Organism: B.cavernae # Pathway: not_defined # 42 408 101 461 470 180 35.0 8e-44 MTEAAVKWGGPGEDYRLSRINRTGSRAEIHVDAPSPDNAAPIWQTPKAVRVVPWFEIVMV AIGVAGIVYFMVYAMDADGLTTAVMTLVAVVPLLIVMSAMVFIDRWEPEPVKMKIIAFLW GGGVATVSSMIINTALMTNTALVIGDVQKAQAIAATFVAPVVEETFKGVGVLVIILARRT SINSLLDGVVYGGIVAAGFMFVEDIQYFVRYGSSGTGSLVTIFVMRGLLSPFLHSMATSL TGFAMAWGAIRAKRAWSRILVFLAGWGAAMLVHGLWNSMGSDGQLATLLKMYLFIQVPLF GTWLTALIRASNREAETIKKGLVPYVRTGWILPAEVSMVTDRGKRKAARKWVARGGAPAR KAMRKFMNELASLGLDQSLMTRVGPDPARIEEDRRLLTQAAEHRAEFLRLTAIASEQTAV AGAVGRAQ >gi|319979234|gb|AEUH01000034.1| GENE 10 8893 - 9516 948 207 aa, chain - ## HITS:1 COG:no KEGG:Cfla_1788 NR:ns ## KEGG: Cfla_1788 # Name: not_defined # Def: hypothetical protein # Organism: C.flavigena # Pathway: not_defined # 23 198 21 200 225 77 38.0 3e-13 MMHFQTHLVVIAVVVVVLTAVVAYLCVMGARRLDRLHQRILANRDALSRLLVRRASETLL LAREDALGRRAGACLTEVANASVADSADQLSCDGLDRRSGAAPAADPVDVRRCLEKASAL SRAIRDTLDDDTRRQLAAREPARARLEALDATCYRIQLARSAHNTDVTQVRSLRGTALVR LFHLAGHAPEPEPIDFDDDTRYDGRGY >gi|319979234|gb|AEUH01000034.1| GENE 11 9513 - 10685 1601 390 aa, chain - ## HITS:1 COG:Rv2610c KEGG:ns NR:ns ## COG: Rv2610c COG0438 # Protein_GI_number: 15609747 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Mycobacterium tuberculosis H37Rv # 1 369 1 372 378 297 48.0 2e-80 MRIGIVNPYSWDVPGGVGFHIRDLALKLRSRGHDVRVLSPSSAPGLPEWVTSAGSSVSVP FNGSVANISISPAAFARTRRWLADGDFDVVHVHEPVVPSVSMAATILSGAPLVATFHAAL TRSLSRSIASAPMRPYMERIAVRIAVSSEARRTLIEHHGGDAVIIPNGVDTASFRGAEPI 
EPWRARDGAPVVVFLGRLDEPRKGLPVFAAAIPRVLEAVPGARFLIAGRGEADAIRSGLA RFGDRVQFLGGITDPEKESLLAGASAYVAPQTGGESFGIVLVEAMAAGTAVVASGIEAFR AVLGDGRFGVLFETGSAASLADELIALLGDPQRLEGIARAGEAASLQYDWEVVADKVYEV YKLAIDTGSPAVSGSRSARNLIRGRGEEGR >gi|319979234|gb|AEUH01000034.1| GENE 12 10682 - 11608 1233 308 aa, chain - ## HITS:1 COG:Cgl1628 KEGG:ns NR:ns ## COG: Cgl1628 COG1560 # Protein_GI_number: 19552878 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Corynebacterium glutamicum # 33 298 47 306 321 126 35.0 6e-29 MRFSALSVAFALAPRLPLRWVLSAARAGARRAARRGGANVDQLRSNLRRLTGGEPDQELV VRAVESHIRNYAEQLVLGSPKGAPLLERVVFEDFDRIEEASEDGPVVLALGHSGSWDRAG AWVCAHGRTVVTVAERVEPPSLFDSFVRLREGLGMEIIGVGKGESVFDTLVERVRGRSVL VPLLADRDISGAGIEVDFAGHRALVAAGPAALAHRLSRPLYAACVSYADEDAPVASVRVE LAGPITAAPGGGANEVEALTQAWVDAFTGMLADKPEDWHMMQKVYTEDLDPERLARARAA HRSRGAGQ >gi|319979234|gb|AEUH01000034.1| GENE 13 11605 - 12234 913 209 aa, chain - ## HITS:1 COG:MT2687 KEGG:ns NR:ns ## COG: MT2687 COG0558 # Protein_GI_number: 15842152 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 13 204 15 207 217 111 38.0 1e-24 MLGNHGRSIPRAVFTPLARLLAKAGVTPNMVTAFGTAATIAVAFGVVAQGWIWQGGTALA VIMFGDSVDGTLARMTTGGTRFGAFFDSTLDRLGDGAVFASLTYYAVFHMEEGTARTWAV IAGLASIVGAAAVPYARARAESVGVTAKLGIAERTDRLLIAMGSAMIMDLGASAWVFVVG LTWVALASFITVGQRVWYTARHIDEAEGA >gi|319979234|gb|AEUH01000034.1| GENE 14 12238 - 12840 736 200 aa, chain - ## HITS:1 COG:ML0455 KEGG:ns NR:ns ## COG: ML0455 COG0537 # Protein_GI_number: 15827146 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Mycobacterium leprae # 30 199 24 193 206 176 49.0 3e-44 MSEQNPGSTGAGPLGAQAPLPTEDSRSLAGVPDAFNRFWTPYRMAYIHGEGKPADGSRDE CPFCAAPGKEDADGLVVHRGSECFVVMNLFPYNSGHLLVCPYRHVSDYTELTGRERVELG 
ELTATAMRVLRGVSGPHGFNLGMNQGEVAGAGIAAHLHQHIVPRWSGDANFLPIIARTKA VPELLEDARSALAAAWDQED >gi|319979234|gb|AEUH01000034.1| GENE 15 12843 - 14909 3088 688 aa, chain - ## HITS:1 COG:MT2689 KEGG:ns NR:ns ## COG: MT2689 COG0441 # Protein_GI_number: 15842154 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Mycobacterium tuberculosis CDC1551 # 47 663 45 663 692 723 60.0 0 MIPSAGTPAPSNTPTRGATVPTFDLVIDGQATTIEAGTTGTAWYAGDRSVVAIKVDGAPR DLDTELEAGTRVEAIALDSPDGLDILRHSATHVLAQAVQDVFPDVDLGIGPFITDGFYYD FGNIDAVTPELLRDLEKRMKRIVKEGQRFVRREISEDEARLELADQPYKLELVTTKGKGA EGASVEVGGGGLTMYDNVRRDGTVAWKDLCRGPHLPSTKLIGNGFALTKSSAAYWKGDQN GDQLQRIYGTAWASKDDLVAYQERLKEAERRDHRRLGAELDLFSFPEEIGPGLPVFHPKG GVLKRVMEDYVREAHIANGFQYVGTPHISKAELFHTSGHLPYYADGMFPPMDDEGQAYYL KAMNCPMHNLIFRSRGRSYRELPLRFFEFGTVYRNEKSGVLQGLTRVRSITQDDSHSYCT PEQAPGEVRHLLTFVLTLLKDFGMEDFYLEVSTRDEDGAKKDKFIGSDEQWADATRILAD IAAECAQENGLRVVDDPGGAAFYGPKISVQAKDAIRRTWQMSTIQYDFNQPHRFGLEYTA ADGARHEPVMIHSAKFGAIERFIGVLTEHYAGAFPAWLAPVQVRLVPVASAFDDYVEEVA AKLRERGIRVETDLSDDRFGKKIRNAAKEKVPFTLIAGGEDAEAGAVSFRLRDGSQHNGI PVDRAVELIARHVASRANHDAVEGIDRA >gi|319979234|gb|AEUH01000034.1| GENE 16 14939 - 15964 1570 341 aa, chain - ## HITS:1 COG:no KEGG:Balac_1169 NR:ns ## KEGG: Balac_1169 # Name: not_defined # Def: hypothetical protein # Organism: B.animalis_lactis_Bl-04 # Pathway: not_defined # 5 339 6 331 349 185 32.0 2e-45 MPETQLIPIGVEEKTAAASGFQDSALPVEQACVWEAFEASQGHSLWGRYKWCEDGKLIAF LTLYNYSLRGVRFLWAKWGPVWLREATPEREEALRADLLREIRSRDRSVAFVRLHAWYQH PDLHMPMQTISYDRTVVIDTSGKTEEAILDTMPKSGKRSIRSGLKKGKAEGITFHEDTGR ALEVIDEYYAVMEETAERDGFRPHPKQVYLDLLSSLGPEHARIFSMRDSEGAVLCWDLCL IQGIRAQAEYGASTDKARKLRQPPALDFLAAAALASEGVRGFDLMGAHSARCPELFRVGK YKIAFASHFTDVPGGWEMPVKRTTYRSLRAAMSVKRRMRRS >gi|319979234|gb|AEUH01000034.1| GENE 17 16199 - 17449 1932 416 aa, chain - ## HITS:1 COG:L93420 KEGG:ns NR:ns ## COG: L93420 COG3919 # Protein_GI_number: 15674209 # Func_class: R General function prediction only # Function: Predicted ATP-grasp enzyme 
# Organism: Lactococcus lactis # 3 408 1 406 408 192 28.0 8e-49 MAMSPRTDILPVIIGGDFGVYGIGRCFNEAFGCRCLCVGSLPTESITGSNFFDVRRIPAH ASDAQLMDALMGIARDHPSKRLVLMANHDIFSAFVARNHEELGRHYALPFPSLEAMAALT DKARFTRACEKAGIPTPRTVVVDFSGADDGAWAAPAIDIPFPVVAKAANGEPYDVLEFEG KRKIWFIDSPEELDGLWRTLRSAGFRDSFLVQELIPGDNTQMRSITAYVDSHGETTLIGS ARVLLEDHAPTMIGNPVAMITEEFPELWEGAVELLAGSGYRGFANFDVKIDPRDGRAVFF EVNPRIGRNNWYMAAAGANPVVPMVADLIDGQRCEQVRATREILYTMVPDSLLLRYIVDP ALKRRVKGIIRDGRRFDLLLNPAEKNLRRNLAVWLQKQNHRRKFARYYPEPTQTSY >gi|319979234|gb|AEUH01000034.1| GENE 18 17608 - 17772 329 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPNPYDEDRTSSLVRKKSELTPEQYAHQRKMLIVSGAIMVVCIAAIVGILNLIG >gi|319979234|gb|AEUH01000034.1| GENE 19 17904 - 18533 532 209 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTNPYTPNAGPHGGGYGGQPNTGPQGGYGGQMPPTGPVPPAPGAPGQPRTPAGPAPLPPP TSNTPLIVGLVVGALVVVLVIVGVLWFTRWRVPRTPAPAYTPGPVSTTRGPRPTATNRAP ASSVIETQVTEECHNAVKQTVRNPLFTRDSATYGYTDSSGDQHWTVSGRVTGTNTAGKIG TFQWTCTATYLDKTGLVDAWSSVDTTPVR >gi|319979234|gb|AEUH01000034.1| GENE 20 18559 - 20286 2131 575 aa, chain + ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 17 572 2 557 561 528 48.0 1e-149 MRFNTALIPSQRYEPPSGQWWLSGIVYQVYPRSFQDSDSDGVGDLRGVRSRLDYLQRLGV DAIWLSPVYASPQEDNGYDISDYRAIDPLFGTMEDFIALLDDVHARGMRLIMDLVVNHTS TQHEWFRESRSPSSPKRDWYYWRPARPGHCPGAPGSEPNNWESFFSGPAWCFDEESGEYY LHLFAPGQADLNWENPRVRRAVYDMMNWWLDLGVDGFRVDAIDMIAKEEGLPDGGPAHPP FGVGYERFAGRPRLHDFLQEMHREVLAGRPAVFTVGETSSASPDSALLFCDPARREFNSL IQFEHVNLGTEAGKFSPRALRDGELVGSLCRWQDGVGERGWNCLYLDSHDQPRSASRFGD PEHWRASATALATMLQLQRGTPFVYQGQELGMTNAGFTSIDQYRDVESLNYYAQALAAGA DERTVLSGLARMSRDNARTPMQWDTSPNAGFTTGTPWIGLSRSWASGEARATAEAQVDDP DSIYSYYRALAELRHRLPVVALGSFNRTATGDARVFAYERRLEGESALVVVVNLSSQTIA PRPGVVPAAAPLVLSNGDGEGPLGPWEARVHLLAD >gi|319979234|gb|AEUH01000034.1| GENE 21 20379 - 22013 2194 544 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192615|ref|ZP_06609569.1| ## 
NR: gi|293192615|ref|ZP_06609569.1| putative LPXTG-motif protein cell wall anchor domain protein [Actinomyces odontolyticus F0309] # 1 542 1 540 542 375 41.0 1e-102 MATWSSRRWVALAAALAVFATIIGLGQVVLGQRAAHADDTCPAPEKDITSAVSVDYDNSD LVDDRGKSVRTFVNWQTVGVKMPWSTSAPVKAGDYFTYDATVTDKANGAKPISPTAAKNF PVQSSDGTVVGCGTWGTDGKVTVVFTKSAEQAASWTGTVSSWGQMRYDRDSITTPTYLIG GKKEIKTERVGSLTPPGNATSFRKDGWMSIQYSHNEDQAITYRLIVPGGESGVRGATIVD SAVGGNWDFTCDKVQSFVPTHSYLVEGTATGAQMEKDTTAGSFAKGVGVTCEGSKVTLKL PDIPAGKFGVVMLPAHVAGASADNPLSGTFENSAVLSVPGKEDQKATRFMRYGALGDAEA HQKFSVTKKLEGTADASLEYTLNITVSNEADPTVDRTYTATLKAGETFTSADSLPLGTKV TIKEGDLPGTVTWDTARSGIFEAADGVTLSADAREATFTLAQDRVFSLTLVNATTPAPGE STTPTPQESTTTPAATATTTTPAPAAQPKPKLANTGAGTIAIGALAGLLAIAGVATLVAR RRQK >gi|319979234|gb|AEUH01000034.1| GENE 22 21871 - 22143 173 90 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSFSGAGQVSSAWAARCPRTTCPNPMIVANTASAAANATQRLELQVAILHTPVVLFPPYF RPSSNKHRPNFSYSVGNQGFPALAGHVGTR >gi|319979234|gb|AEUH01000034.1| GENE 23 22152 - 22781 -189 209 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRAQTPGILGNLCPDSDLSDLRDTLRPVHLLDRGAQHAVTRGAGRPPQRQRRAGPVARTR CAPRGSVDRRPTGRARRADAARTAMGAPGRRPTAQGTRPGWSAPTRRAAPGRPWAAGGND EAPTPLRADAEDLRGGRYWDRTSDLFRVKEARYPCANRPRWVRDSNPCVRLCRPLPRLSA NPPRYARERWIPVCERMTGLEPATLTLAR >gi|319979234|gb|AEUH01000034.1| GENE 24 23182 - 25293 3178 703 aa, chain + ## HITS:1 COG:Cgl2054 KEGG:ns NR:ns ## COG: Cgl2054 COG1523 # Protein_GI_number: 19553304 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Corynebacterium glutamicum # 5 698 13 712 836 916 62.0 0 MNTRPGHPYPLGATFDGTGTNFAIFSSVATSVTLCLLDDDLNETPIPMTEVDAWVWHVYV PRVGAGQRYGYRIEGPWDPDLGHRCDVSKLLLDPYAKAIDGQLKDSPSLLSYDPENPALL QPQDSARATMHSVVVNPFFDWGGDHRPGHDYSETIIYEAHVKGMTMTHPEVPAEIRGTYA GMAHPAIIAHLKKLGVTAVELMPIHQYTNDTTLQAKGLSNYWGYNTIGYFAPHNAYCSSR EPGAQVAEFKAMVKALHAADIEVILDVVYNHTAEGNHMGPTLSLRGIDNAAYYRLVDGDR RHYFDTTGTGNSLLMSSPQVLQLIMDSLRYWVSEMHVDGFRFDLASTLARQFAEVDRLSA 
FFDLIHQDPVVSQVKLIAEPWDVGADGYQVGGFPPLWSEWNGRYRDTVRDFWRGEFSSLP DFASRLAGSSDLYESTGRKPRASINFVIAHDGFTLADLVSYNTKHNEANLEGGADGANDN RSWNCGAEGPTDDEEILSLRRRQQRNFLTTLIFSQGVPMIAHGDELGRTQQGNNNTYCQD NELSWIDWDLDDEQQALLEFTSKLIHLRRDHPVMRRRRFLNGPAVRGGESDLGEIEWFTP SGKHMKEEEWSQPWARATMVFYNGDAIGEPDANGRRIKDDDFLLLLNAAPESIDFTLPDT KYGQLWHTAIDTGGDDDSSEFHSGDTVTVGPRTAFILRNPRGL >gi|319979234|gb|AEUH01000034.1| GENE 25 25341 - 25913 665 190 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1965 NR:ns ## KEGG: Bcav_1965 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 13 168 16 171 198 145 50.0 6e-34 MRSVSDVNEIVAPDEFVQALLSLREAPGLPRILLEEVAPPSRLAPFTAAVAMRTIDEDDV GQPLGNGRLVVLHDPDGQVGWNSTFRLVAQLRAQIDPEMGADPLLAEALWGWTQDCLDDA GTGYHDLTGTVTRELSEAFGGLVLRGSNLFVEIRASWSPNTQFIGEHMVGWAALIRRTAG VVPSLFLEGI >gi|319979234|gb|AEUH01000034.1| GENE 26 25915 - 27153 1367 412 aa, chain + ## HITS:1 COG:Cgl1855 KEGG:ns NR:ns ## COG: Cgl1855 COG0349 # Protein_GI_number: 19553105 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonuclease D # Organism: Corynebacterium glutamicum # 20 410 5 400 403 215 35.0 1e-55 MLVDNRGGLPQGDTEDPVLIAQPRGGVPAVVDTPADLERAASRLRSGSAPVALDVERALG FRYGSDPYLVQIRREDVGSFLIDSHALPDLSPLNDGVGSTWLLHDCSQDLPNLRAVGLRC PELFDTEIAARLVGMRRFGLASVAESVLGLALVKDHQAADWSVRPLPKDWLRYAALDVEL LTELHRRLSTRLHDLGRWEWATQEFAHALSVPDPRPDPERWRSVKGAGALKTPRQLAYLR ELWSAREGIARDLDLSPGRLVRNSALIRAASRPPANRRALLSIGEFRSPVARQYTDQWMR ALTRARTMDEARLPRRHRRPAPGELPDARTLKRVDEEAAARLGQIRLAVAGVASALDLDP EVVLAPRVQRYIAWAPLGRAGADGVGERMSDYGARPWQIELTAAPVRAALGL >gi|319979234|gb|AEUH01000034.1| GENE 27 27599 - 29497 2253 632 aa, chain - ## HITS:1 COG:Cgl1856 KEGG:ns NR:ns ## COG: Cgl1856 COG1154 # Protein_GI_number: 19553106 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Corynebacterium glutamicum # 9 615 2 623 636 625 53.0 1e-179 MPRTAPRPALLASINGPQDLRGLSERELIELAAQIREFLIENVSRTGGHLGPNLGVVELT 
IGIHRVFRSPVDTIVFDTGHQCYVHKLLTGRHDFTGLRAEGGLSGYPSRAESPHDVVESS HASSSLAWADGISWEKHRRGDRSWTVAVIGDGALTGGMAWEALNSISASPDRRVMVVVND NGRSYAPTIGGLARHLDALRTSQGYERALAWGKQQLLRGGGPGRVAYDALHGLKSGIKDA LIPQAMFEDLGLKYLGPVDGHSVSAVEEALARGASLDAPVLVHVITQKGRGYVPAEEDIA DRFHAVGPIHPETGLPVAPARFGWTSVFADEIVRIARERDDVVGVTAAMMRPVGLGPLHE AFPGRVIDVGIAEQEAAACAAGMAYQGAHPVVALYATFLNRAFDQVLMDVALHRAPVTFV LDRAGVTGADGPSHNGVWDIAMCQMVPGLRMAAPRDEDTLREELREAVGTDDAPTVVRYR KGQLPAPMPALRRSGPVDVLFDEAGASTVLIVATGATVPDALRAARALAADGVGARVVDP RWIIPVPGELAGLAQGADAVVSVEDGGLNGGFGWALRDALAPTPVVCLGVPKEFPAHGDR DDMIARFGYDAAGIERAARVALGRLAPQKAGV >gi|319979234|gb|AEUH01000034.1| GENE 28 29564 - 31594 2423 676 aa, chain + ## HITS:1 COG:slr0825 KEGG:ns NR:ns ## COG: slr0825 COG1506 # Protein_GI_number: 16331709 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Synechocystis # 17 662 11 631 637 459 39.0 1e-128 MSALIPQEATMSPTQTPYGLWKSPITGDSFTARSVTLSQVRVDGPDTYWVEGHPKENGRS TLLRHRASGETTEVLPLIDGARLPDIRTRVHEYGGKAYAVHDGVIVFSDGADGRVYRFDV NNPRSGIQPLTTLSKVRYGDFWIADVRGLVYAVAEDHRGEGEPVNSLVAIPLDGSAARND ANIIPVFSGTDFVEAPAVSPDGTKLAWITWDHPEMPWTRSRLHVASIGFEGTLGASRVLV DSPGVCVYEPRWTRDGDLIHVDDSSGWANLYRTQGFAWVEGEDPDAWTGRLRTRALHPGR RAFSHPHWSLGLHSFDNFDNDFLVCSWAEDLTWHIGTVRIDNGLAEEWATGWWPIGNVAS ADGRVVFLADSATHTPAIVEVKDGRTQVIRPSSEAQVPAELVSAAQVISWGTADGERAHG FYYPPVNPDYTAPEGELPPLIVNVHGGPTSSARPGLSIPFQYWTSRGFAVLDVNYRGSTG FGRAYRERLSGNWGVMDVQDCVDGARYLIGLGLVDPGRVAIRGASAGGFTALSALAASDV FTAGASFSGITDLRKLNEVAHKFESSYPTHLLGSSDPADPVWAERSPINHIDRITAPLLI LQGADDHVVPPSQAHAMYEALRERGNAVAMRIYEGEGHGFRSAAHIKDAWQTELAFYRTV WGIAQSSPIHVEIANL >gi|319979234|gb|AEUH01000034.1| GENE 29 31729 - 33441 1863 570 aa, chain + ## HITS:1 COG:VNG1544G_1 KEGG:ns NR:ns ## COG: VNG1544G_1 COG0038 # Protein_GI_number: 15790525 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Halobacterium sp. 
NRC-1 # 43 416 245 634 654 183 38.0 6e-46 MLFNALISLTGRLATGYPEYTEHIGAAHGAWGWAPWVFLLVSPAVAGLVYGPLIARFAPS AKGHGIPEVMLAVRRKGGAIPGRVAVVKLVASALTIGSGGSAGREGPIVQVGAALGSSIA SALRMPTARVVLLAACGSGAGIAATFHAPLAGAVFALEVILTQFTAEAFGFVVIAAVASS VVARAFQGDEVVVSVGQALSFTSLADIGWVALVGLVAGLAGLGFSKLLYWLEDVIDAFWA LTRLPQWSRPGVLGLVLGGGLVAFPYMFGSGYPLEERAIGGEYTVAFLLALMVGRAVFTS FTIGMGGSGGVFAPTLFIGAMAGAAFGQVVAPLASSGAGVFAVIGMGAAFAGAARAPMTA VLIIVEMTAQFSLILPMMLAVVIATGASRFLTRSTIYTEKLRRRGDVLDDPVEGTLLGTK PVVAWMAPVPELLHASAGIDEAVAALRRTRESVLPVVRGGAFVGLVTSLGLAERHQSDGR SPVALGDLELVDVSVDSLASPSRVLEALRASGLQALPVLNRDRRVIGWVSERDLVDRMYR DQRRAIEAREQSSWGSRFQERHPHRRPKRG >gi|319979234|gb|AEUH01000034.1| GENE 30 33428 - 34696 1370 422 aa, chain - ## HITS:1 COG:Rv2689c KEGG:ns NR:ns ## COG: Rv2689c COG2265 # Protein_GI_number: 15609826 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Mycobacterium tuberculosis H37Rv # 5 421 11 403 405 188 35.0 2e-47 MGEVLELRIGEPAHGGACVARDASGRVVFVRHTLPGERVRARVTSVRNTLAWADAIEILD ASADRVSPVWPQAGPSGVGGGELSHVAPPAQRRWKEHVIAGQIRRVGGEALAEAVDAIGG VRVAPAPGDGEPGDRLAHRRNRIEFVIGTDGAPGMHVYRGKRLIPLDSMPLAAPAIAGLG LFDGDSPWKRVWAPGERVRALSPTPGSAYVVCASGVYGADARRTGSAALAWPVEIGGEEH VYGVRPTGFWQTHVRGAQVLTEEVLDAARAETGGAVLELYSGAGLFTAPLARAVGEGGRL ASLEGDEGAVADAAENLAPFPWAQTFIGGIDAHGVAELAGGLGRAPDVVVADPPRAGAGR QVCEAMAALGAPRLVLVSCDPAAGARDLRALAGAGYRLESLRAWDLFPHTHHVEIVAALT RA >gi|319979234|gb|AEUH01000034.1| GENE 31 34908 - 36890 2980 660 aa, chain - ## HITS:1 COG:MT2764 KEGG:ns NR:ns ## COG: MT2764 COG0531 # Protein_GI_number: 15842228 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Mycobacterium tuberculosis CDC1551 # 4 648 8 646 657 568 52.0 1e-161 MSRAKRVLIGRPMRTDAMGHQLLPKRIALPVFASDALSSVAYAPDEILLTLALSGSLAAL QSIWVGVLVALVLAVVVLSYRQTVHAYPSGGGDYEVVTTNLGRTWGLVVASALLVDYILT VAVSISSGAAYITTAIPALDGHQVTVAVVLVVVLATLNLRGTKEAGGAFAVPTYLYMAAI 
GLMVAAGFAQWATGHLGQAPTAGYELVTAPGHADGIVGLGGVFLLMRAFSSGCAALTGVE AISNGVPVFRRPKSKNAATTLAMLGSIAAAMMLSILLLARATGVKIVDDPALQLAQGGAP VPEGTPVAPAISQIASAVFGPGSALFLLVTVVTGFILVLAANTAFNGFPTLASVLSRDSF LPHQMVRRGDRLSYSNGIALLSVAACVLIIGFEAQTTRLIQLYVVGVFISFTLSQLGMIR HWNAQLRKRQSGSERGKVLRSRAVNVVGFMMTGLVLAIVLVTKFAHGAWITLLMIAAVLA FQLTINHHYTTVRRQLRVDDWGAKRTLPTRVRAIVLISSLSRPAMRAVAVARASNPTSLE LVSVVEGEEQGETIRRQWHESELPVPLTLLSAPYRDIHSVILQYVRSRRQANPSEMLVVY MPQFLVSHFWENFVHNQSALRLRRSLLNVPGVVITMVPWKLGEDEAVEGRRAINDPFQRD >gi|319979234|gb|AEUH01000034.1| GENE 32 37000 - 37062 80 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHIPKPTPASRAVQARGEGT >gi|319979234|gb|AEUH01000034.1| GENE 33 37057 - 37722 897 221 aa, chain + ## HITS:1 COG:MT2765 KEGG:ns NR:ns ## COG: MT2765 COG0569 # Protein_GI_number: 15842229 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Mycobacterium tuberculosis CDC1551 # 1 203 1 204 227 177 50.0 1e-44 MHFVIMGCGRVGARLATTLDSMGHSVSVIDVDSKAFNRLPSDFSGRRVTGVGMDRAAMRQ AGIRDAYAFAAVSSGDNSNIIAARIAREVFDVEHVVARIYDPTRAYLYERLGIPTVASVQ RTAESVLRRVLPPDASLTWVHPTGSVALVNATPTPAWYGVAFPTVEELTGCRIAFTSRLG AVQPASGDLVVQEHDQLYFAISGTDTTRLRDLMSNPPRLED >gi|319979234|gb|AEUH01000034.1| GENE 34 37727 - 38416 835 229 aa, chain + ## HITS:1 COG:MT2766 KEGG:ns NR:ns ## COG: MT2766 COG0569 # Protein_GI_number: 15842230 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Mycobacterium tuberculosis CDC1551 # 13 221 13 218 220 189 49.0 5e-48 MKIVIAGAGSVGRSVALELLEHGHEITLIDHHPEKLRIASVADADWVLADACSPDALRDA GILDADVMVAATGDDKANLVISLLAKTEFAVPRVVARLNNPKNEWLFDQSWGVDVSVSTP RIMTSLVEEAVSVGIPVRLFSFNTAEVSMHAIILPQDSPVVARRVTSVLLPANTVLAALL RDGRPLTPSSDDVFEAGDELLVLIPDADSDATAELRRMVNPEPAQPPQD Prediction of potential genes in microbial genomes Time: Thu May 12 17:05:50 2011 Seq name: gi|319979232|gb|AEUH01000035.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00035, whole genome shotgun sequence Length of sequence - 670 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 16 - 669 707 ## Arch_0774 hypothetical protein Predicted protein(s) >gi|319979232|gb|AEUH01000035.1| GENE 1 16 - 669 707 217 aa, chain - ## HITS:1 COG:no KEGG:Arch_0774 NR:ns ## KEGG: Arch_0774 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 4 201 8 203 228 135 41.0 1e-30 PAWIQQATGDSFSWSQALGGPRGVVESVAPGLVFVVVYALTRSLVWTLAAALGVALMACA ARLLARQPLTQAASGVLGVGIGVLVASLSGRAEDYFVWGIATNAVSALALAVSLVVRRPL VGLAVGLLYGVTARWREPRMRPLARRCAALTWLWFGVFALRVAAQAPLWALGLVAPLGIV KIVLGLPLFAAAGWATWMGLRPHAASLRASGGQPDGQ Prediction of potential genes in microbial genomes Time: Thu May 12 17:05:57 2011 Seq name: gi|319979224|gb|AEUH01000036.1| Actinomyces sp. oral taxon 178 str. F0338 contig00036, whole genome shotgun sequence Length of sequence - 7163 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 75 - 434 411 ## Krad_1559 nucleic acid binding OB-fold tRNA/helicase-type 2 1 Op 2 . - CDS 454 - 1095 884 ## Xcel_1921 hypothetical protein 3 1 Op 3 . - CDS 1104 - 1406 456 ## Jden_1411 hypothetical protein 4 2 Tu 1 . + CDS 1533 - 2897 1183 ## Cfla_1698 DNA-binding protein 5 3 Op 1 . - CDS 2914 - 4056 1238 ## COG1524 Uncharacterized proteins of the AP superfamily 6 3 Op 2 . - CDS 4049 - 4768 645 ## RSal33209_2411 hypothetical protein 7 4 Tu 1 . 
+ CDS 4656 - 7121 3284 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit Predicted protein(s) >gi|319979224|gb|AEUH01000036.1| GENE 1 75 - 434 411 119 aa, chain - ## HITS:1 COG:no KEGG:Krad_1559 NR:ns ## KEGG: Krad_1559 # Name: not_defined # Def: nucleic acid binding OB-fold tRNA/helicase-type # Organism: K.radiotolerans # Pathway: not_defined # 3 118 15 131 133 73 40.0 2e-12 MQWLRALLDTDQRQIDEADDEARSRRARGVEPIAGVADRERARLFGTVLSMTYPPASGPQ VLAARLYDGTASIELRWPGRSEIPGLHVGAHIEAEGTVGKQGGAAVIINPLYRVISMEA >gi|319979224|gb|AEUH01000036.1| GENE 2 454 - 1095 884 213 aa, chain - ## HITS:1 COG:no KEGG:Xcel_1921 NR:ns ## KEGG: Xcel_1921 # Name: not_defined # Def: hypothetical protein # Organism: X.cellulosilytica # Pathway: not_defined # 34 204 47 217 249 121 40.0 1e-26 MFERKKRHADQQEAPQAPALPTLEDDDPIWEQPGPRNAGEVDTSDGYVDMGSILFPAVPG MQLRTQLADDGSTVLQILVVLGNSGVQMSVAAAPRSGGVWDEVREEIRSGFASQGASVTD VESRYGDELLVDMPMRMPDGRSGTSRMRIIGREGDRWFARIDVLGPAAASPEAGAHIEKV IDRIVVRRDKHPRTRLELLPLHVPGGAAGVKRA >gi|319979224|gb|AEUH01000036.1| GENE 3 1104 - 1406 456 100 aa, chain - ## HITS:1 COG:no KEGG:Jden_1411 NR:ns ## KEGG: Jden_1411 # Name: not_defined # Def: hypothetical protein # Organism: J.denitrificans # Pathway: not_defined # 1 99 1 98 98 118 70.0 7e-26 MATDYDAPRKNEDDVSEDSIEELKSRRNDAGSAEIEEDETEAAENFELPGADLSKEELTV HVVPRQDDEFTCSQCFLVHHSSQLAYTDAAGQPVCTECAG >gi|319979224|gb|AEUH01000036.1| GENE 4 1533 - 2897 1183 454 aa, chain + ## HITS:1 COG:no KEGG:Cfla_1698 NR:ns ## KEGG: Cfla_1698 # Name: not_defined # Def: DNA-binding protein # Organism: C.flavigena # Pathway: not_defined # 13 252 1 247 379 142 43.0 2e-32 MRRFDYHIQEVSMIELELLGTSGDGGSLVFTDADGQRYSVLITDELRGATRRDRPRMEVA AARPSLAPREIQALLRSGATAQEIASRHGMQVATVARFEAPVQAEKEYALSRALSVRIGD EAGGPAMGDLVVDRLAARGVDPDSLRWSAQREAHEPWQIILTFVQGAAEHGAHWRLLPSG EVEAVDQEAQWLTETVAPAPAATSIFASMTRASAVDANEEEIRKREALVDQLNAARGKRQ HIEIDFDDDEVDEEAEYLAAISEEEYREPREPPAITGPISAQIYSLAQARTRSAPPAPPT PVLEGTGIPSAVPQASGEIPVIGHREARTKGSDGGEGAEPGKTTPDAGEVPSASGQGVVP 
SGSGSSRASVARRPPSAPRGGEPGAPSARAGAPPLPDAEAKPDEDATKKTRSVAGGRESA APESARPTATGAKRTRRGRRPMPSWDEILFGTRT >gi|319979224|gb|AEUH01000036.1| GENE 5 2914 - 4056 1238 380 aa, chain - ## HITS:1 COG:TM1581 KEGG:ns NR:ns ## COG: TM1581 COG1524 # Protein_GI_number: 15644329 # Func_class: R General function prediction only # Function: Uncharacterized proteins of the AP superfamily # Organism: Thermotoga maritima # 54 377 50 376 381 74 25.0 3e-13 MAEALPPLDGADLPAPGGPRITDIVGAGLASLDPGLPGADEAARAALGIDAGARQLLLVL ADGLGTVLLEDHFGHAPTLRAFRSSIRSLHTVVPSTTAAAITAFGTGAQPGATRMVGFCV AHGGGTMNLLAFEGGPPAEQWQSVPTHFQRLAGAGVDSAVVSPASFAGSGLTRAALRGAR HVAANALEQRCEAALRELRAGTPVVYLYWADIDHWGHSRGVGSHEWTGALEAFDAGLAGL LRRLPGGVRALLTADHGMVNIDPPSLRDLADSPALAQDVRLIAGETRAVHVHAEPGRARG VEERWRAELGEGAWILNRAQMGALIGQGPGADAVGDLLVLARGRGGVVDSRVQSAAMIAM PGVHGSLSSEEMRIPLVRLA >gi|319979224|gb|AEUH01000036.1| GENE 6 4049 - 4768 645 239 aa, chain - ## HITS:1 COG:no KEGG:RSal33209_2411 NR:ns ## KEGG: RSal33209_2411 # Name: not_defined # Def: hypothetical protein # Organism: R.salmoninarum # Pathway: not_defined # 68 233 24 198 238 105 37.0 2e-21 MTETAYSRNEPLISAETSISEMFSSASGTSTTWSFFAIGSIIAAGRPPVSTTTPRPGTMR GVDTWKRIDRTGFYPQVVKRALRRALGGEAPVASLCQVDAAFDRGSVFRHLTVATLTRDC VVQLHVDELEDGGASVATSIHRSADIAGYSTMEILDDPQRARGLSEITVAVDLRGARRIE LEPAHCDDPDCAADHGYTAQSFPDDLSIRVSAAADGEDVLAEAEEFVEALAALMGVRGG >gi|319979224|gb|AEUH01000036.1| GENE 7 4656 - 7121 3284 821 aa, chain + ## HITS:1 COG:SA0006 KEGG:ns NR:ns ## COG: SA0006 COG0188 # Protein_GI_number: 15925711 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Staphylococcus aureus N315 # 7 767 1 743 889 506 39.0 1e-142 MAKKDQVVDVPEADENISEIDVSAEMRGSFLEYAVSVIYARALPDARDGLKPVQRRILYQ MDQMGLRPDRGHVKSQRVVGEVMGKLHPHGDSAIYEALVRLAQPFNLRLPLVDGHGNFGS LDDGPAAARYTEARMAPAAVDLVTGLGEDTVDFVPNYDNQFTQPEVLPAAFPQLLVNGAS GIAVGMATNIAPHNLAETIAGAVHLLDHPDATVADLMRFVPGPDLPEGGVIVGLEGVKEA 
YETGRGAFKTRAKVSVERVTARKMGLIVTQLPYMVGPEKVIEKIKENVGAGRLKGISSVQ NLTDRIHGLRLVIEVKNGFNPEAVLQQLYQRTPLEDSFSINAVALVGGQPQVMGLKQILR VFVDHRLDVTTRRSRFRLGKCEERLHLVEGLLIAILDIDDVIAIIRSSDDADVARERLMT AFDLSQAQADYILELRLRRLTKFSRIELEAERDELAARIAELREVLESPELLRALVKREL AEVSERLGTPRRTVLLASSGAPVVGGAGASSLAALPSVPRSGKGLELQIPDDPCQVVLSA TGALARVEGVDPLEPGPRAAFDGWRAQLPSSVRSQVAVVTEDGVAHRIDVVDLPALPRFD TGMSLAGAVPASELLHSDSRPVGLLDPDGSAVSAMGTARGTVKRLRPDVLQRDEWEVIAL DPGDRLIGFAPCPDEADVVFISGDAQLLRTPAERIRPQGRTAGGVAGMRLSPGVEAIGFW VVDAPAESVVVTVAAAEGALPGTGQTTVKVTPFDAYPSKGRGGQGVRAQRFLRGEDRLDV AWVGAGPRASTSDGAPVDLPAPDGRRDGSGSPITLPIATIG Prediction of potential genes in microbial genomes Time: Thu May 12 17:06:26 2011 Seq name: gi|319979194|gb|AEUH01000037.1| Actinomyces sp. oral taxon 178 str. F0338 contig00037, whole genome shotgun sequence Length of sequence - 37870 bp Number of predicted genes - 32, with homology - 22 Number of transcription units - 18, operones - 10 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 83 - 1033 1277 ## COG0524 Sugar kinases, ribokinase family 2 1 Op 2 . - CDS 1026 - 2594 2366 ## COG0477 Permeases of the major facilitator superfamily 3 2 Op 1 . - CDS 2831 - 3856 1385 ## Bcav_1925 GCN5-related protein N-acetyltransferase 4 2 Op 2 . - CDS 3891 - 4910 984 ## Sked_15920 acetyltransferase (GNAT) family protein 5 3 Tu 1 . - CDS 5163 - 7301 3212 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 6 4 Tu 1 . + CDS 7527 - 7742 190 ## Sros_7085 hypothetical protein - Term 8022 - 8060 4.2 7 5 Tu 1 . - CDS 8064 - 8921 660 ## 8 6 Tu 1 . + CDS 9611 - 9685 69 ## 9 7 Op 1 . + CDS 9861 - 10391 636 ## gi|296933977|ref|ZP_06905210.1| NTP pyrophosphohydrolase including oxidative damage repair enzyme 10 7 Op 2 . + CDS 10469 - 11272 791 ## Namu_1232 hypothetical protein + Term 11276 - 11337 4.3 11 8 Op 1 . + CDS 11656 - 12699 1329 ## gi|293189518|ref|ZP_06608238.1| conserved hypothetical protein 12 8 Op 2 . 
+ CDS 12734 - 15121 1758 ## sll7067 hypothetical protein 13 8 Op 3 . + CDS 15118 - 17538 1333 ## gi|293189516|ref|ZP_06608236.1| conserved hypothetical protein 14 8 Op 4 . + CDS 17535 - 20447 901 ## gi|293189515|ref|ZP_06608235.1| CRISPR-associated RAMP protein 15 8 Op 5 . + CDS 20444 - 21013 315 ## gi|293189514|ref|ZP_06608234.1| hypothetical protein HMPREF0970_00549 16 8 Op 6 . + CDS 21025 - 23256 1204 ## Psta_1142 protein of unknown function DUF324 - Term 22749 - 22789 1.9 17 9 Op 1 . - CDS 23008 - 23529 565 ## Cagg_3452 CRISPR-associated protein Cas2 18 9 Op 2 . - CDS 23545 - 25344 1621 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair - Prom 25379 - 25438 2.5 19 10 Op 1 . + CDS 25899 - 26120 213 ## 20 10 Op 2 . + CDS 26054 - 26269 149 ## 21 11 Op 1 . + CDS 26679 - 27566 725 ## 22 11 Op 2 . + CDS 27487 - 27879 182 ## + Term 27918 - 27963 6.5 + Prom 28062 - 28121 2.9 23 12 Op 1 . + CDS 28175 - 28741 686 ## 24 12 Op 2 . + CDS 28788 - 31262 1500 ## Psta_1142 protein of unknown function DUF324 25 13 Tu 1 . + CDS 31422 - 31598 179 ## 26 14 Op 1 . + CDS 32066 - 32761 684 ## gi|293190865|ref|ZP_06609027.1| hypothetical protein HMPREF0970_01360 27 14 Op 2 . + CDS 32762 - 33784 715 ## gi|293190866|ref|ZP_06609028.1| conserved hypothetical protein 28 15 Tu 1 . + CDS 33965 - 34360 213 ## 29 16 Tu 1 . + CDS 34462 - 34782 395 ## 30 17 Op 1 36/0.000 + CDS 35220 - 36320 1550 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 31 17 Op 2 . + CDS 36322 - 37005 282 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Term 36896 - 36938 5.0 32 18 Tu 1 . 
- CDS 37176 - 37760 846 ## COG1196 Chromosome segregation ATPases Predicted protein(s) >gi|319979194|gb|AEUH01000037.1| GENE 1 83 - 1033 1277 316 aa, chain - ## HITS:1 COG:mll7580 KEGG:ns NR:ns ## COG: mll7580 COG0524 # Protein_GI_number: 13476296 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Mesorhizobium loti # 4 305 2 307 307 103 29.0 5e-22 MTRILNIGEALIDEINRPNTAPVEVVGGSMLNVAAGLTRLGHESELATWFGRDERGAKVR AHAEEAGVTLVEGSDAAEFTTVAHATIDEKARATYEFEVLWDVPAIPDQDGVGHVHTGSY AATFQPGAAKVLAAVKRQAIHGTVSYDPNIRPALLGTPEETRPQVEAIVALSDVVKASDE DLEWLYPDRPVEEVIREWSQAGPALVLCTRGPWGVYIKTAAERDMLVADPLDVELVDTVG AGDSLMAGLISGLVDAGLLGSAEAKQRLREAHWADLMPAIHRGIITSGITVYHEGAYSPT KAEVAAILAVSKNLQG >gi|319979194|gb|AEUH01000037.1| GENE 2 1026 - 2594 2366 522 aa, chain - ## HITS:1 COG:STM4418 KEGG:ns NR:ns ## COG: STM4418 COG0477 # Protein_GI_number: 16767664 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 52 514 13 456 477 257 32.0 5e-68 MVSLTLCLPSQENPPRPRGASMRGIPRVPSLFDTEACMTTPSASQSKTSPLVIRSAIVAS LGGLLFGFDTAVISGAEEKLKALYALSSFGEGMIVAIATIGTILGAIVAGRLADHFGRKP VLFWIGILFGVGALATALAPTPDLVAGAGGAMTASSSMPITFFMVFRFLGGVGVGMSSVV APIYTAEIAPARVRGRLVGLVQFNIVFGILLAYASNAIIREIAHEDVAWRWMLGVMAVPA VFFLLFLMAVPETPRWLFAHRREDEARAISARLTNSAEESEEQMKEIADQLAEDRAAGHV PFFTRRYRKVILMAFCIAMFNQLSGINAILYYAPKVMKLAGGEAIFGAAFPYIASVVVGL MNLIATMAALTVIDKLGRRQLMIVGSIGYLVSLGFLAGMMFAYEAGAVTGSAAVWLVLIG LLGFIASHAFGQGSVIWVFISEIFPNRVRARGQSLGSLTHWTFAFITTYAFPVLTDKLGG GFAFGIFFLCMVGQLFWVLKVMPETKGIPLEEMEEKLGLSDD >gi|319979194|gb|AEUH01000037.1| GENE 3 2831 - 3856 1385 341 aa, chain - ## HITS:1 COG:no KEGG:Bcav_1925 NR:ns ## KEGG: Bcav_1925 # Name: not_defined # Def: GCN5-related protein N-acetyltransferase # Organism: B.cavernae # Pathway: not_defined # 5 341 14 352 352 196 
37.0 1e-48 MSDTAPRAGADACADVPMPPSHHGIVWEPLAPAHHPALATLFARMEARDNPPYRTSPGEV AEMLGGATQWSGIAGFATRGLAAGRMIAYAQVTIRHPGRVECVCSGGVDPDFRRIGLGGA IVDWQEGAARAMIGAAGSGGPAQIVCHVEAGQEDLEAQLQSHGFHWSRTYYEMRADLSSL PERPSLGAYLSLEAWAPQWEEPARQAANLLNEVEWGRPPLTEEQWFQGRMSFVPEWSFLL VDRRGDRPRVGGFLIASRYEQDWAALGWREGYIDQMGVLGAYRQSRAVDALIIASMRAQA GDGMERTGTGLGSANHSGALAVYDYLGFRTVGQTRLYAREV >gi|319979194|gb|AEUH01000037.1| GENE 4 3891 - 4910 984 339 aa, chain - ## HITS:1 COG:no KEGG:Sked_15920 NR:ns ## KEGG: Sked_15920 # Name: not_defined # Def: acetyltransferase (GNAT) family protein # Organism: S.keddieii # Pathway: not_defined # 17 338 3 324 333 169 34.0 1e-40 MTGLAQRLSPPGSVPYPGNHLGLRWRPLISRDAPAVYDLVRAVESADDAIHRTGAEGIAD MVEGRGGEDWVDAIVGLDRNGSISAVGCVRVVRGVSDSAIAIVSAYVHPHWRGRGLGRAL LHWQDGRAREMLVETYGPDSAIPASIANFVDAHMTDRRRLYIAAGFYAKRTLQVMYRDLE GSESAPRVRGGYTVVPWDRALRDKARELHMEVSRDLFWSWNRARWWDDAVASMERRWSFL ALDPGGAVVGYIVTERPAQRWVSTGRPEAGVSLVGVLPQHRGRSLCSALIGHSVAAASAS GMPRIGIEVDTRNPSNAHAIYEHLGFVDDAAEVFYVIDQ >gi|319979194|gb|AEUH01000037.1| GENE 5 5163 - 7301 3212 712 aa, chain - ## HITS:1 COG:BS_gyrB KEGG:ns NR:ns ## COG: BS_gyrB COG0187 # Protein_GI_number: 16077074 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Bacillus subtilis # 13 698 2 626 638 503 41.0 1e-142 MRTLFPTMRPVSQTKASDYSARHLSVLEGLEAVRKRPGMYIGSTDHKGLMHCLWEIIDNA VDEALEGHCDTIDVRLHPDHSASVSDNGRGVPVDEVPGIGLSGVEVVYTKLHAGGKFGGG SYGAAGGLHGVGASVVNALSARMDVQVDRGSKTYEMCFRRGEPGEFDDSRQRTPNSPFTP FTDQSRLKTVGKVKRTKTGTRVRFWADFQIFPATEGFSWESLLARARQTAFLVPGLTINC VDERGEEERSESFRFDGGVVDFANWLATDGPVTDTWHIMGEGAYTETVQVLDADTGHLTA SELERVCSVDIALRWGIGYDTSVRSFVNIIATPKGGTHVAGFEQAVLRVLRAQIDKNSRK LKVGAKDGRPEKDDVLAGMTAVVTVRFPEPQFEGQTKEILGTPQIRSVVSKVVGDWLGAK LASPKRDDKQQASQLMEKVVGEMKARVSARIHKEISRKKNALETSSLPAKLADCRLADVA ETELFIVEGDSALGTAKAARNSHYQALFPIRGKILNVQKASTADMLANAECANIIQVIGA GSGRTFDLESARYGKVILMTDADVDGAHIRTLLLTLFFRYMRPMVEAGRIYAAVPPLHRI 
EVAGKGRRKREYVYTYSEDELHRRLAALRKSGRTWKEPIQRYKGLGEMDADQLAETTMDR AHRSLRRITLADEEALRRAEDVFELLMGSTVGPRREFIVEESAGLDRDRIDA >gi|319979194|gb|AEUH01000037.1| GENE 6 7527 - 7742 190 71 aa, chain + ## HITS:1 COG:no KEGG:Sros_7085 NR:ns ## KEGG: Sros_7085 # Name: not_defined # Def: hypothetical protein # Organism: S.roseum # Pathway: not_defined # 9 65 4 60 76 67 57.0 2e-10 MNASTLERTAVPEPTLTAADRCDACGARAWVRVTMPTGGRLFFCAHHANAHLAALVGSGA DILDERQLMTA >gi|319979194|gb|AEUH01000037.1| GENE 7 8064 - 8921 660 285 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSSPKHLSDSDPDPAASARPQSANPADAQEGAQPVEGGEDLQASRRRTGAVLLLPLVIAA VLILIVYAVLHIADRPVRGSGDAHGGTGSAAQSNGPGATDGASGGGGRSSGMGGRVSPGT ALGGSDSLTDRTVEGATGSVVYAISAIDHARNRSDTHPLSDLMTPECEACSIEIGHIDEV PTGRTRTDGMQSRGVECPVVDRTSDSGGELQAVGLRCVYLSHADRWTTNDGRDTTIDATQ DVTFDLVWQRGQWRLAGMTVNHVGSIDAYYPALDELKAQAGQGQS >gi|319979194|gb|AEUH01000037.1| GENE 8 9611 - 9685 69 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCATPCFLIPSLVCDTKKNARPGR >gi|319979194|gb|AEUH01000037.1| GENE 9 9861 - 10391 636 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|296933977|ref|ZP_06905210.1| ## NR: gi|296933977|ref|ZP_06905210.1| NTP pyrophosphohydrolase including oxidative damage repair enzyme [Rothia dentocariosa ATCC 17931] # 1 162 1 159 161 97 39.0 4e-19 MTEYTIRTVHSTMGKFDAYDFDQTPDSIAEQVEQHLLNPDFLDGEGWFALSVQPAPPDAG LRPPDQYPPPTRYLLAAGRAHEMALELYLTHPDGSTGTYVVARERVRDPDERVALKWRMG PHAINLVHVHPQEVFTGEQAVPFFRDFIIEDRAPDFSLLRCIRGRRMPRPARPHCR >gi|319979194|gb|AEUH01000037.1| GENE 10 10469 - 11272 791 267 aa, chain + ## HITS:1 COG:no KEGG:Namu_1232 NR:ns ## KEGG: Namu_1232 # Name: not_defined # Def: hypothetical protein # Organism: N.multipartita # Pathway: not_defined # 29 261 8 225 241 63 26.0 8e-09 MPTTVFVRFDSPGEVPASPSRLHAALATVLDLPRSISPGRASSFPTLAHRPHHERPGAKP YCLGELTTGPGTFGMEVRFLDDRLVETFDAWLAWGGVLRVGGARESATLIAFEGQVVAET TWQELAEQSEDTTWEIRLVTPTVFSSKGAHVRGIPPVSLATSLQERWHQWDPGTAPPRID RAAMAQVLTTQDCTTEARVSLGMPRGDRRGRLASRAITAYEGALRISGVEGAPQTADFSR LMALSAYTNVGSHAGYGMGVIDAVAVG 
>gi|319979194|gb|AEUH01000037.1| GENE 11 11656 - 12699 1329 347 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189518|ref|ZP_06608238.1| ## NR: gi|293189518|ref|ZP_06608238.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 81 347 5 219 221 102 31.0 3e-20 MKRENCIQWVTLAITVVLTVIFGTAYMLFRNEVWVSIADDLVFPAACTILGGLTIALVGR TTASRLLNSKERTGGGTEARSDRPRNCTYTGILILTFGLLAVFILTYIFYSDQSWGGIVR SLFFPLCYTAIGTVIVAWACTSIADKFLDHYNDRVKEGLANRRLFFKGDAKMIESTRGDL KYSGLINDENYMSNTTSITESDLVILCIGQNYPGSHERGTTDTGKGGPEADDTDSGRTSE EPKGKNKRDTNNEADPRTETSKKSESVRNDSNDGTQWEDTARKKVEGVLAEVGQDSRRPA LIVYTDGWLDRETKKLIDARPFTALVNTRGRIVSDIHSLLTTLPPRS >gi|319979194|gb|AEUH01000037.1| GENE 12 12734 - 15121 1758 795 aa, chain + ## HITS:1 COG:no KEGG:sll7067 NR:ns ## KEGG: sll7067 # Name: not_defined # Def: hypothetical protein # Organism: Synechocystis # Pathway: not_defined # 5 555 3 440 558 117 25.0 2e-24 MNNVLVMLQTNSNQAYIYSSPRLREQIGASFQITLLSQWVEEEAKKLLKVPEKRSLPGSF WVSNSSGKVIVRLTGDDGDPKALGSLPGSFWVSKSSGKVIVRLTGDDGEPNALARSLIGA VTRRAFEKAPGLDVTGVFIPSEDEVGTHNGMTDDEAVTQDSLDALDREFNAYSLNRPPAA ARFPQLPFLERAKDSALPAVVPLSDQDSNAIIEYAKRLEITPPDPKTDSKTKAGIWAQTD RDQKTDEEGITLYSLPSRVKRAWARSARCSQIQQVARYWKSGETALPADPDDLERLFQDG ESDDFGSRNTKSSAPLSTIGVIHIDGNGVGTVTRNLHSAFRAVMESGGEAPERDKKQDQR TAQLRHQLESYRKPNPVTSSEAHSFQWFIMEINYRLAAVVNLAIAKSWKKVAELANDSTV PVVPILVGGDDITVYAAGDYAIPFAEAYVRHYEELTDADPYLRQLAAVVNEKGPNTGPGP LTASAGVAIVGRNFPFHIAYDLAEKLVSRGKKLGKKKGEVQCSTLDFHILRDTTVLDPDD TLTEYRDRTQRPFLIGHYSEERITGTAHAEPGAADGIAGPSANGGQWKQILRAVAAFNGQ DPDQPGVSLTDPFPRTRANRIIKLLAEEYSQPEDRRTPKDCTGYTLKKGGKTAPKSLVDE CAPSADGCADKIEPCPHPLCRALREWTNAVENTRSADALQQELRNLPKHDRLTNDSSAQS GGNAAQSIPNLQYLKWLLDLLDLAENLPAGYVQSYSALHETGLGEDAASASDEPEAPRPN DDPRSTPSPDEGASK >gi|319979194|gb|AEUH01000037.1| GENE 13 15118 - 17538 1333 806 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189516|ref|ZP_06608236.1| ## NR: gi|293189516|ref|ZP_06608236.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 9 796 1 679 681 590 49.0 1e-166 
MTDEPDGHVTIVFTSDWGVSTGVGEAGRTHSTLEKQGDLPVVRGTVVAGVLREQAMIAAR ALDAAVPAPSTGRKDSDGSGGDSEGGPWSRFAKWLFGDEGNNPRHVIFSDALPVRYSQRN GKLNQEPYSLPVHDAVSLSIDPTTGAARDQMLRFTERAAPGVLAGTYRFTDEAGAEHREY AHFLLGVGGLMVRGIGSGRSGGDGECTVIAGLECATTLAESKEPTEGNSALEEVAEDQKD EPEYSSDSLRKLAEKLADELREALFAHLNGPSFAERPPSLLVDGTKTKASPLVSVAASRH SSEATAALHPASWFEATLDIVLDSPVVSYEVPFSNEVRSLDFLRGTVLIPWLHRQLRDLP TRVDGTPLEEGDEDLIRSAVVSGDLRVSDALPVYQGTPGLPVPLVFEQEKVPEDEVPADN EKDPLQPVTLFNRHIETGKQHCGEHTVPSRGAYVFLDEKQRPADTNGAATDDGAGLCGAA PASFAGRIGKPALIGRQSTAIDPHTGAAWKSRLFLVRALPAGLRLRAKAVISRRLFEALA GAPVSDADEKAPTLDLEIRGKHAYLGSRRLTGTFGQATCSLGRFTPIGVEEARNTTPCQS SEDAETTQSSSASADKTITTSLWFTSDVLARSEGLGPGGTIADLQLAFNRTGAHVTVVVP NDEEDPEDSSSDQKPSSPCHDGPKKARSKHDDRRDDDRKRNRKQVKTAIRHRRVDSWSAA DGGPRATRMAIIAGSVVQVKVPRSDVPELHVLGRIGVGELTPQGYGRFEVNHPALREPVL PLVTTKSKCFTAPEKSEEPAQEGEGR >gi|319979194|gb|AEUH01000037.1| GENE 14 17535 - 20447 901 970 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189515|ref|ZP_06608235.1| ## NR: gi|293189515|ref|ZP_06608235.1| CRISPR-associated RAMP protein [Actinomyces odontolyticus F0309] # 1 950 1 918 937 781 51.0 0 MTIYRYEMTIPVKVVSALHSGGVDKAPEREYVDEDGRTAQPNEFVRNGLGEAVLPGRSIK GAVRAAFEEHAVALGFADPTGGVDESALKSLWGGELRRNVGASKQQRGLGSDADLPLRAS ALTFHHAVVWTDGPLPHRMSTAIDRARGSAADGALFSYEYLPVGTTFEIRISAEAQDGDD AIPTTPPRDEAGAPSTDNAQSKTGGDDSVKPAAPEQVEKALRAIVTLFKDGVISLGGRTG SGWGRVELAASELNYTKDQIPQAPRKPGGSVKSVLKRLVGAGTAKVPLRTDEARKASEPL AFTLEWTSPSGVFVGIPMPKVDPKKKDEKYTTPNVPVRNWHVDDGFARTHGEDTYPDNAH ADEAALLLPGSSVRGALRSRCLRIAHDLVDAEDVMRALFNEDGKSKGVHQQIAAEPNLVR YLFGSTEHRGAVRVLDCEGRVEGTTVSNGNPEEIKALELTRNAIDRVTGSAAHSALFSEL LYPKADWSPIRIEVDQRQLDRNILLDLRGRGTITDTDDDRAREDPSVVTRSHASLLLLAL AVSDLCEGTLPLGGGTGGGLGAVQVSEVAVAIPAGRNMVAKIPFTAPEDAADINSVREAR EGFAQSFLHAVQPLFALPDAPTKDGLSAEDSVLAVLLPPLCDNKGGSANNKTVGRKRPTL VTIEWGSPTGVFVGDPRIEKDAGDAKEKIEPEGKTKKNILLPLRSKTGIQRDSEKSKDPL LLPGTSIRGALRSRCSRIARTVLAAQSKGQELASFTKDGRPIDVHEQLAADPNLVRYMFG TTEYKGAISVYDCETVPGKLGSPIEVPHIAIDRWTGGVVEGALFCELIYPNAQWQPIELE 
IDPERLRRNVRIDSLDTMIDDTEVRNRARASWCLLCLALAELCAGTLPLGGKTTRGLGQV EVTGITIARADETVINPTTGAPLNDAHDILKYLKGKPDGRANYEGWSDQLLGNTRSDERN SGESAEGGRR >gi|319979194|gb|AEUH01000037.1| GENE 15 20444 - 21013 315 189 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189514|ref|ZP_06608234.1| ## NR: gi|293189514|ref|ZP_06608234.1| hypothetical protein HMPREF0970_00549 [Actinomyces odontolyticus F0309] # 38 187 41 194 197 68 31.0 2e-10 MSTPQYSDSLAPFEEAFYSVEAHEETLENALDRAKHTFTNCSVAGIVWTTTGMKRLPRSQ TKFDNVAELIGLIEPGITYELRIWEAIERDKDEESDKERCDDNVAPLARELRWTNGLGAS EIVVSTGGRSHEPSLPCMVHTVRYQTHPAEGAPPGSSRSIAALEIFTGEDRNGNLVYADQ LMTGAWKQS >gi|319979194|gb|AEUH01000037.1| GENE 16 21025 - 23256 1204 743 aa, chain + ## HITS:1 COG:no KEGG:Psta_1142 NR:ns ## KEGG: Psta_1142 # Name: not_defined # Def: protein of unknown function DUF324 # Organism: P.staleyi # Pathway: not_defined # 28 621 620 1160 1272 100 24.0 2e-19 MFHASANRIPVIRKSALGLSQYQHYTYLKDETPTPLDTIDLESWSGSIDIVVRTHTPVVY GNQNAINHTIALPSNPQGEVYMAPTMIKGLISSAYEKVTSSRLRVFGDHSKPLTYRMDPA ESTHLVPVRIHNDGTTTKAIPLYGSTKTKSHITRHNDTEIPILFTATLLNMETDGIRFRT GSGINQKRLVDLTKPTDTSVPYRKVVFDARLVDNGAYAHWLITALYANRGRQLDRIELFD YTKSDLTFVPGESLDMITGYVYATTTTEDRQWGKSTFTVRSGNGIETAKKKSERVFFAKP DVNNNEGIDVSDDVINRYLLTIDSYIQEYREAKVKRVANRFIKELDLGSRIRMEGALAYA VIDETDKSNPVLDTLVPISIGRTSYKRSPYDIARANYSSPAETAVEASPADRLFGYVAIP RDREDDSDQSPCDALRGRVRFGLIDTSKVKMDTGVVPIRPLLSPRPSSARRYLTRAFGKE AGKNVNTPERPIKRSQYYSDDPQQALGACVFPTDRSALSRRNNLGFPIKALRLPEDSDNV STTIQSWIAPGSELRTTIRFEGLTRSELSILLWLLQPENLSPESDSDSVRTDDGNLAEQN IGFFQIGMGKPLGLGLVTIEIPDDGFRAVRTESPADNREADSLVHDYMSLTGCLGVSITT TSISSFPPPHKFTSSPWVRAFQRSCFGYTDNHPVRHMSLTENRHNNMTEADGDNAGYPKE GAGVEPKSLWTSDGGDPIRVVKQ >gi|319979194|gb|AEUH01000037.1| GENE 17 23008 - 23529 565 173 aa, chain - ## HITS:1 COG:no KEGG:Cagg_3452 NR:ns ## KEGG: Cagg_3452 # Name: not_defined # Def: CRISPR-associated protein Cas2 # Organism: C.aggregans # Pathway: not_defined # 1 84 1 85 93 72 41.0 6e-12 MIYIVAYDISSNRRRTRVAKALQSWGYHIQESVFQLRLDAAGLNAIRARLAALINEAEDV 
VHIYPLCSTCAERAEILGAAVALDDVGLCRLLLFHYSDGITTVGSPQRLGFNACALFWIA CIVAISFSHIVMPILSERHMANRVIVSITKARPLKCSHPWRRRELMWRRKTGY >gi|319979194|gb|AEUH01000037.1| GENE 18 23545 - 25344 1621 599 aa, chain - ## HITS:1 COG:alr0381 KEGG:ns NR:ns ## COG: alr0381 COG1518 # Protein_GI_number: 17227877 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Nostoc sp. PCC 7120 # 302 560 74 335 374 160 38.0 9e-39 MNTDDPNRIIALIAAAETLANAWRTMLASNTADLDPATVQTASADLDATLAHASSLIHAL AAGSGEPTPTRLERRLIACAITHTIGSRLAQETSPAAFPISGSATALSERITGMRADGRV KVFRCCLAPMTVRRIIQNASDTLPSLLAVPITTLLSAIAHTRHEPQLVDALRNAALAPID NAMTDAGHGYARSGQWILICEPDDARLDAAVALLVATLAARGLLLDGTRTQTTSFDEEFC HLGTDYTATAPAPRTTSATPEASADRIVYVGKDGTRIHVAKGRLLVDSASGVPQMSLPQR SVTRIVLTGNVGLSAGARSWALRNGVDVVCLSRRGAYQGQLVNTAATGNAARLLAQASYT TDETAKLPLARAIVNAKIRNQIHVLNRIARRDPALHLADTAARMRAWREETAHAADTDEL MGIEGAASAAYFDALAMCAPPSVDFNGRSRRPPLDVPNAALSYAYAILLGECVGALHAAG LEPSLGVLHAPTDKRPSLALDLMEEFRPLLVDQTVMALLRTRRLRSEHGAPGPQGEGVWL SADGKKTVVDAYEATAQRSVTGALPGFSGSWRRHIHHEAQLMARAIAEPYYQWTGIIWR >gi|319979194|gb|AEUH01000037.1| GENE 19 25899 - 26120 213 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVKPQSGPHFLGQGAPRTTRWIKTRRDSPSSGLVTHPVRERPAPSGALRLEDCRVVVFGD VGQSGSTQHHQVH >gi|319979194|gb|AEUH01000037.1| GENE 20 26054 - 26269 149 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSRCRIRGCWAVREHPAPPGALRGGAEPHRIYVVPGQGTPSTIRCIMTAQGLEQAQPHAP SLGSTQHHQVH >gi|319979194|gb|AEUH01000037.1| GENE 21 26679 - 27566 725 295 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSQGAPSTIRCIKTRSTPRSTGTGSDVRGPCTTRCIKTSAPSSGPRPLGTRQGAPSTIRC IKTPGARPPGRTRTPVREHPAPSGALRLTDPGQRVTARRSGSTSTTRCIKTSRRIDFQYR SRTVREHPAPSGALRLDGIGHRQEHRERDQGAPSTTRCIKTLLEASRSSRYRMSQGAPST TRCIKTPSTTPTIPLNMPVREHPAPPGALRLAEGTLDDLAGRGVRGHPAPPGVLRHRPRR LVRDRLERVREHPAPPGALRPRQMVRRRRIADRQGMPSAIRCIKTSNPSSTGSRT >gi|319979194|gb|AEUH01000037.1| GENE 22 27487 - 27879 182 130 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MPTVRECPAPSGVLRHRILLQLVHELSVREHPAPPGALRPDRLVGFTEGGLAVREHPAPS GALRRRGAKARQLPHPGVREHPTPSGALTTLLHRSRSRSSTQVPGGGGRPLLRRCLVRAR RQRSSGPLLP >gi|319979194|gb|AEUH01000037.1| GENE 23 28175 - 28741 686 188 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPVAPSTALSLAALLAALVGPVVAYACAKKWTRRNIAELVTGDPGLVDRINRHTWALSDG AIVVVGPPDSQQAHDVHQALEDTGLFKKGAIAHIPPQDLAGAARADLIILTEDALSAQAD GPGRARLLDDVLGSKRGVHAGLIGYAPAGNFTDSEFQAIGSEPITSVTRTRGRLVNDALS MLTTLSRL >gi|319979194|gb|AEUH01000037.1| GENE 24 28788 - 31262 1500 824 aa, chain + ## HITS:1 COG:no KEGG:Psta_1142 NR:ns ## KEGG: Psta_1142 # Name: not_defined # Def: protein of unknown function DUF324 # Organism: P.staleyi # Pathway: not_defined # 51 684 622 1170 1272 95 25.0 1e-17 MTENEGQAPSTQHLGPSTQSRTWHHAYNGVPTVRASRTGVGKNPRAVFFDDAPPPPHDHI PPQSLSGWIRLELTVATPTVPGHQDPSGRIVVASLNGNNWGAQSWDDATIPATSLKGVLS SACEAVTESRLRVFREHDHVLTYRRSAREGQTLYPVFLTKEKNGTWSARVMLGKNPRPTT PAQWDTPPSCVCAAALPDSEQSAVKIMLSKKRYYLGGRRRKKGKQRGDEQAAIRTLIKLR TATPHLKRVSFTWEEENFYEGTRAIVSKIGRTLYAHSKSVIESGTDEGYVVRTTPERAAK GKRNRGPVRLISTKYNEFVFFDSEENRTLLPVSDDVVSRLVSVIHSYACNVFELKRRESR AGRNPSSGTQRSTDPSTRLVEEFVRRRLSGVRAHDTEPPFTRQDVLDYLAALASQDPGIP LFAVIVGDDSGTAEHGGKVMVLGPSQVGRATPQNGVPPHRLAEDSDVLPAHDIAEASAAD RLWGFVADEPDDDDTPSARGRITLTRAEPDTSTGKEHLLRAEAGGWVPPILAGPKPATGA PYLRDREGRHVGERLTRGRLFEPGQTLIRKVYPTHRFLIHDYRSGKEELPPPAREARSGE ATQGSEVVVGSYLLPGAKLHATLRFTALTDTELAVLLWLLTPENLVPTSEREKGGTGVFH IGLGKPLGWGAVEVRATELVVADGKRLARGYADLSGCLGLSAADSPVERGTAAPTAAITS TAESTATTPLADYRERLRGLLPGFEASAAVRSFARSAFGWRDAPDEDNVSYPVAKGSGSA GISATALFFKKREDNRVKAAFPDARGRKDCLKSQFNLPTLDDDH >gi|319979194|gb|AEUH01000037.1| GENE 25 31422 - 31598 179 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAPPRGHTAKPLVVPPTPAPATPPTGQAHLKPTSQPIKRHGIASVLRNPERSTAQPDV >gi|319979194|gb|AEUH01000037.1| GENE 26 32066 - 32761 684 231 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293190865|ref|ZP_06609027.1| ## NR: gi|293190865|ref|ZP_06609027.1| hypothetical protein HMPREF0970_01360 [Actinomyces odontolyticus F0309] # 65 220 93 253 266 68 29.0 4e-10 
MRREALARDAELRARGIDPYKGTGVDEGPRRRLSARALVALIVLVVAAVSVGAYVVFLRG EPDYGMSHGYQVQSDGSLKRPSTPVHQPDAPAELLRFTDDASEIAATHYFEVVAYAWNTG DTQYLRAFSSPDCQFCKKTAEEIERLYTEGGWGSGAAYSNMTTEMLGKSEDAATFGDQSY GVRVRYHERTPDLYHDHSFQQSRESDNNLIVIVHWDGQRWLIRDIGKEGDE >gi|319979194|gb|AEUH01000037.1| GENE 27 32762 - 33784 715 340 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293190866|ref|ZP_06609028.1| ## NR: gi|293190866|ref|ZP_06609028.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 176 340 21 198 198 119 43.0 4e-25 MLKPIPLRNRARMAFALAYVFVIANSTSASADTPESNTDFSVGADENSIVISGNQFSQQT TPSTESGPSSSVVGGSGGGSGSGSAAGGGVPAVPAASGGSGASGGGSSGGGRAAGGPLEM VCTGEREGIPGSAARPGDAGAASAHCQYVAGAAPTTPETADEEPADDGSGGEGEAPPSTE AIVRTALARVAVSGAGLSWQPREKSYTNAGVPTIVYAATPSQTHTTALFGREVSITLTAS QYSYDFGDGTAPLVTSRAGEPWRRGNKEARLTHHYEQVTRGGERRTITLTTTWDATTTNP FTGQTLTLPSIITTTEQSTPFPVSHLRIDLTDTADEQDGH >gi|319979194|gb|AEUH01000037.1| GENE 28 33965 - 34360 213 131 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFPDPGGIDNVYGPAGRMHSPNPSGRGPSRRAAGAARRLRSFRPLGPSVPGPGGIDNVYG PTGRTHHPTTAGLDTKRREPGSKLLPSRREPAHRAPRHHSPSTGPSTIQRPTLILTDNHN GYTQTIIRVLR >gi|319979194|gb|AEUH01000037.1| GENE 29 34462 - 34782 395 106 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLKPMSLPHYVPAVLAFACAFVLADVSLASAVTSAPTAPAALAAPSAPTGDSAQSTPPGR SPMTMVCTSEREGIPGSAARPGDPSVASAHCQYVPAPTPTGGSAGQ >gi|319979194|gb|AEUH01000037.1| GENE 30 35220 - 36320 1550 366 aa, chain + ## HITS:1 COG:lin2726 KEGG:ns NR:ns ## COG: lin2726 COG0577 # Protein_GI_number: 16801787 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Listeria innocua # 1 362 1 359 362 166 28.0 8e-41 MKIAWNEITYQPKKFVLIEILIALLMFMVVFLSGLTNGLGRSVSAQIDNFGPLNYILSKD SDKNIPFSSITPEDTEEIEKIDGIAYSGLFIQRATIAQTEGSSTLDITYFAIENNGQEIL TPKTSGTGARISDLKENEVILDSSFQDEGIQLGDEVIDKASKQKLTVIAFAQDAKYGYSE IGFVNSETYTGMRRKADPNFQWRAQTLVTPDSVTSSDLTGDLVVADREQIIDNIPGYKAQ NLTLRMITWVLLIASSAILGVFFYILTLQKLKQFGVLKAIGMPMSRITYIQLSQLTIVSL 
IGVAIGLGLAALVNPLLPPTVPAFITPQDNIAISASFILTSLLCGALSLLKIKKVDPIEV IGGNGE >gi|319979194|gb|AEUH01000037.1| GENE 31 36322 - 37005 282 227 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 224 1 219 245 113 33 2e-24 MAVIELENIQKSYADGNQMHHVLSHLDLSVDSKEFVAILGPSGSGKSTLLAVAGLLLSAD AGRIRIGGQDLTDLSQKQWTRKRLELLGFIFQDHQLLSYMRIGEQLELVSRLKGEQDRKK RREEVQSLLVDLGIEACYHQYPNQMSGGQKQRTAIARAFIGNPQLILADEPTASLDPDKG QEIAELICNEVKSKNKSAVMVTHDRSILSYVDTVYELKHGQLLEAEG >gi|319979194|gb|AEUH01000037.1| GENE 32 37176 - 37760 846 194 aa, chain - ## HITS:1 COG:ML1629 KEGG:ns NR:ns ## COG: ML1629 COG1196 # Protein_GI_number: 15827858 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Mycobacterium leprae # 1 192 998 1190 1203 231 63.0 9e-61 MNPLALEEHEALAARHKFLVDQVQDLKSSKADLLRIVEEVDRRVEEAFTSAFADTREQFA HTFSVLFPGGTGDMVLTDPQDMLATGIEIEARPAGKKVKRLSLLSGGERSLAAIAFLVAI FKARPSPFYVMDEVEAALDDVNLSRLLTIFKELQQASQLIVVTHQKRTMEIADALYGVTM RDGVTTVVSQRLGQ Prediction of potential genes in microbial genomes Time: Thu May 12 17:10:32 2011 Seq name: gi|319979190|gb|AEUH01000038.1| Actinomyces sp. oral taxon 178 str. F0338 contig00038, whole genome shotgun sequence Length of sequence - 2364 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1343 1499 ## COG1196 Chromosome segregation ATPases 2 1 Op 2 . - CDS 1365 - 1835 792 ## COG0105 Nucleoside diphosphate kinase 3 1 Op 3 . 
- CDS 1852 - 2364 474 ## Teth514_1190 phosphotransferase system, phosphocarrier protein HPr Predicted protein(s) >gi|319979190|gb|AEUH01000038.1| GENE 1 2 - 1343 1499 447 aa, chain - ## HITS:1 COG:ML1629 KEGG:ns NR:ns ## COG: ML1629 COG1196 # Protein_GI_number: 15827858 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Mycobacterium leprae # 1 442 1 442 1203 341 52.0 2e-93 MYLKSLTLRGFKSFASATTLRLEPGITCVVGPNGSGKSNVVDALAWVMGEQGARAMRGGN MADVIFAGAGSRPALGRAQADLTIDNSDGLLDIEYSEVTISRTLFRGGGSEYQINGAPAR LLDVQELLSDTGMGRQMHVIVGQGQLDAILSSTPEERRGFIEEAAGVLKHRRRKERALKK LADMDANLVRVLDLTNEIHRQLGPLARQARTARRAHVIQARVRDARARLLADDLVAARDR LDAHSASEEATARRRAQLEGELASAREGLAALEDQERQWGPRVEEAGRAWSSLTTITERL RGTHMAAGQKVALRRQPEPEPSGEDPQALDEAAARAGEEDARLLAQVEEARTALAARTRE RGQCEDADEAASRRLALVNARVADHRERAARLAGDIATAVSRHESAQGEAERAHRAADAA IARRDAAEAELAGMEDPSGSADGGDAG >gi|319979190|gb|AEUH01000038.1| GENE 2 1365 - 1835 792 156 aa, chain - ## HITS:1 COG:MT2521 KEGG:ns NR:ns ## COG: MT2521 COG0105 # Protein_GI_number: 15841969 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside diphosphate kinase # Organism: Mycobacterium tuberculosis CDC1551 # 19 153 3 134 136 151 60.0 5e-37 MTAKSVLLDAQQLLDPDLEHTLIIVKPDGFARGFTGEIIRRIELKGYTIKGLKLMVASKE LLAEHYREHKDKPFFPGLLEFMSSGPVVAIIVEGQRVVEGMRNLMGATDPTTAAAGTIRG DLGRAWNTPHMENLIHGSDSAESANREITLWFPEIY >gi|319979190|gb|AEUH01000038.1| GENE 3 1852 - 2364 474 170 aa, chain - ## HITS:1 COG:no KEGG:Teth514_1190 NR:ns ## KEGG: Teth514_1190 # Name: not_defined # Def: phosphotransferase system, phosphocarrier protein HPr # Organism: Thermoanaerobacter_X514 # Pathway: not_defined # 97 169 8 84 88 65 48.0 1e-09 ATGTAPDEAAGAGTTADDRGAGVVGRTGTAGASAASGGAMGATAADAFSGGSAGAAPASS DGAPVGLAPVSSAAGTTGTAPEEPDDGEERAEADAVVADPVGLHARPAALFVRLAGTFES RITVNGASGASVLEIMSLGVAQGDTVHLTAVGEDAHAAIVALTDMLEQRN Prediction of potential genes in microbial genomes Time: Thu May 12 17:10:39 2011 Seq name: 
gi|319979183|gb|AEUH01000039.1| Actinomyces sp. oral taxon 178 str. F0338 contig00039, whole genome shotgun sequence Length of sequence - 6289 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 - CDS 2 - 380 405 ## COG2376 Dihydroxyacetone kinase 2 1 Op 2 . - CDS 531 - 1523 1269 ## COG2376 Dihydroxyacetone kinase 3 2 Tu 1 . - CDS 1635 - 3392 2181 ## COG0285 Folylpolyglutamate synthase 4 3 Tu 1 . + CDS 3920 - 5344 390 ## 5 4 Tu 1 . + CDS 5677 - 6289 492 ## gi|293190866|ref|ZP_06609028.1| conserved hypothetical protein Predicted protein(s) >gi|319979183|gb|AEUH01000039.1| GENE 1 2 - 380 405 126 aa, chain - ## HITS:1 COG:lin2844 KEGG:ns NR:ns ## COG: lin2844 COG2376 # Protein_GI_number: 16801904 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Listeria innocua # 25 126 7 108 198 89 46.0 2e-18 MARGGHRVRLAPLWEEEMSELGAAWALEWMKRTRERVGEQRERLIDLDRQIGDGDHGENL DRGFGAVVDALGAQDPQTVADVLKLVAKTLMSTVGGAAGPLYGTAFLRGAKAAGAGGLDG AGAAGV >gi|319979183|gb|AEUH01000039.1| GENE 2 531 - 1523 1269 330 aa, chain - ## HITS:1 COG:ycgT KEGG:ns NR:ns ## COG: ycgT COG2376 # Protein_GI_number: 16129163 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Escherichia coli K12 # 1 328 11 361 366 352 53.0 5e-97 MKKLVNDVHSVVRETLEGFALAHADLVDVHLDPDYVTRRTPKDEGKVGLVSGGGSGHEPL HAGFVGLGMLDAAVPGAVFTSPTPDPILEATKAADRGAGVLHIVKNYTGDVLNFETAAEM ADMEDIRVATVVVNDDVAVEDSLYTAGRRGVAGTVLVEKIAGASAERGDDLDEVAAIATR VNEQTRSMGLALGPCTVPHAGKPSFELGEDEIELGIGIHGEPGYRRGSMESADALVEELY RRVRDDLGLGAGERVITLVNGMGGTPGSELYICHRRLAQLLEADGVVIERALVGDYVTSL EMPGVSVTLTRVDDELLGLFDAPVRTPAWK >gi|319979183|gb|AEUH01000039.1| GENE 3 1635 - 3392 2181 585 aa, chain - ## HITS:1 COG:ML1471 KEGG:ns NR:ns ## COG: ML1471 COG0285 # Protein_GI_number: 15827773 # Func_class: H Coenzyme transport and metabolism # Function: 
Folylpolyglutamate synthase # Organism: Mycobacterium leprae # 132 585 19 485 485 363 47.0 1e-100 MAGEAFFGHGSDDEARGIPADLVPYLEAADAPEEGADQAEEPNAAAADDSARDAALRALV ENSLLVGPDPTILAEITGDFDEAFADDADSADDADWSDDADSAPGVDDSARASRTASGAR SNATSPHDAALAAAADQVQLDAQVRRIYESIVERAPEHDIDPTLDRVKMVLDILGDPQNA YPSVHVTGTNGKTSTARMIDAVLTAFGMKVGRFTSPHLIDVRERISLEGHPISREGFIAA WQDVESYIRMVDERNQEQGGPRLSFFEVFTVMAFAAFADYPVDAAVVEVGMGGTWDSTNV ISAGVSVITPIGVDHARWLGSTVEQIAREKAGVIKPGQIVVVARQSEEAMAVLEERARQV DAVLRVEGRDFEVVDRQMGVGGQLVTVRTPAAVYEDVFVPLLGEHQAHNAAVALAAVEAF MGGRALDGRIVESGMMAASSPGRLQVVRRSPTIIVDAAHNPAGAGTLRDALEESFGFSRV VGVYSAMGDKDVEGVLSEVEPFMDHIVVTRMDGGRAADTGALARVAADVFGPDRVDAVEG LADAVDRAAALAEAGAGPSDRSGILVFGSVHLAGDMLALAGKLPE >gi|319979183|gb|AEUH01000039.1| GENE 4 3920 - 5344 390 474 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCALRCCVIQSLVCGSDRDRAHGTVGVPAAALRMSRTMRGSALGTTGAPGRLSEPGSTCS EWDGYAQTIVEVWRWGHVGRYHRMQVKQATAVARVDTRMWYDMGVEDTPLSDGVTRSGGR DGRERPINGVDSSESIPLEVSPSSGAHTLSDFDGNAPSDGAHTFLSEDRPGSGIAGDRQG PGPETDNAAVGDPPAEGPVDDPPDNDPVDDLPVDDPVDDLPTVDDGEGMSPVVREYTERL RREALARDAELRARGIDPYKGTGVDEEPRRRLGARALAVLAVLVVAAVSVGAYVVFLRGE PDYGMSHGYQVQSDGSLKRPSAPVHQPDAPAELLRFTDDASEIAATHYFEVVAYAWNTGD TQYLRAFSSPDCQFCQRTADDIDRLYGGGGWASGAKFTDVVPHPLGRYSDIENYGEDTYG VRVSFHQLTPDLYAHNAFQASEERDDEVTILVHWDGQRWSVRELGRDQDAEGSN >gi|319979183|gb|AEUH01000039.1| GENE 5 5677 - 6289 492 204 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293190866|ref|ZP_06609028.1| ## NR: gi|293190866|ref|ZP_06609028.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 55 203 20 181 198 103 44.0 5e-21 MVCTGEREGIPGSAARPGDAGAASSHCQYVPGTAPTTPETPNEEPADEEGGGEGEAPSTE TIVRTALARVPVSGAGLSWQPRKKSYTNAGVPTIVYAASPTQTHTTALFGHEVSITLTAS QYSYDFGDSTPPLVTARAGEPWRRGNKDARLTHHYEQVTRGGDRRVITLTTKWDATTTNP FTGQTLTLPSIITTTERSSPFPVS Prediction of potential genes in microbial genomes Time: Thu May 12 17:11:23 2011 Seq name: gi|319979153|gb|AEUH01000040.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00040, whole genome shotgun sequence Length of sequence - 38012 bp Number of predicted genes - 29, with homology - 27 Number of transcription units - 12, operones - 7 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 393 107 ## 2 2 Tu 1 . + CDS 727 - 4374 4106 ## COG5263 FOG: Glucan-binding domain (YG repeat) 3 3 Op 1 45/0.000 - CDS 4531 - 5331 1191 ## COG0842 ABC-type multidrug transport system, permease component 4 3 Op 2 . - CDS 5361 - 6302 1251 ## COG1131 ABC-type multidrug transport system, ATPase component 5 4 Tu 1 . - CDS 6476 - 9745 4736 ## COG0060 Isoleucyl-tRNA synthetase - Prom 9945 - 10004 1.8 6 5 Op 1 . + CDS 9816 - 10748 983 ## COG0701 Predicted permeases 7 5 Op 2 . + CDS 10745 - 10999 400 ## COG0701 Predicted permeases 8 5 Op 3 . + CDS 10996 - 12006 1162 ## gi|154508973|ref|ZP_02044615.1| hypothetical protein ACTODO_01489 9 6 Op 1 . + CDS 12326 - 12472 210 ## SCO7547 sulfatase 10 6 Op 2 4/0.000 + CDS 12539 - 13192 884 ## COG3119 Arylsulfatase A and related enzymes 11 6 Op 3 . + CDS 13201 - 14481 1673 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) 12 7 Op 1 . - CDS 14586 - 15347 1063 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity 13 7 Op 2 8/0.000 - CDS 15422 - 16714 1824 ## COG0247 Fe-S oxidoreductase 14 7 Op 3 9/0.000 - CDS 16744 - 17994 1672 ## COG3075 Anaerobic glycerol-3-phosphate dehydrogenase 15 7 Op 4 . - CDS 17991 - 19577 2252 ## COG0578 Glycerol-3-phosphate dehydrogenase + Prom 19723 - 19782 3.8 16 8 Op 1 . + CDS 19858 - 21915 2908 ## COG3590 Predicted metalloendopeptidase 17 8 Op 2 . + CDS 22038 - 24596 3912 ## COG0308 Aminopeptidase N + Term 24650 - 24697 12.0 - Term 24732 - 24783 16.3 18 9 Op 1 . - CDS 24801 - 27185 2866 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 27295 - 27354 2.2 - Term 27281 - 27319 1.1 19 9 Op 2 . 
- CDS 27428 - 29251 2125 ## gi|154508990|ref|ZP_02044632.1| hypothetical protein ACTODO_01506 - Prom 29283 - 29342 2.1 20 10 Tu 1 . + CDS 29175 - 29870 154 ## + Term 30041 - 30096 0.1 - TRNA 29528 - 29603 90.7 # Ala GGC 0 0 21 11 Tu 1 . + CDS 30226 - 31281 1186 ## COG1404 Subtilisin-like serine proteases 22 12 Op 1 6/0.000 - CDS 31343 - 32071 813 ## COG0406 Fructose-2,6-bisphosphatase 23 12 Op 2 14/0.000 - CDS 32073 - 32501 404 ## COG0799 Uncharacterized homolog of plant Iojap protein 24 12 Op 3 6/0.000 - CDS 32476 - 33165 717 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase 25 12 Op 4 22/0.000 - CDS 33162 - 34439 1639 ## COG0014 Gamma-glutamyl phosphate reductase 26 12 Op 5 7/0.000 - CDS 34473 - 35573 1290 ## COG0263 Glutamate 5-kinase 27 12 Op 6 14/0.000 - CDS 35570 - 37102 1969 ## COG0536 Predicted GTPase 28 12 Op 7 32/0.000 - CDS 37178 - 37438 371 ## PROTEIN SUPPORTED gi|227493582|ref|ZP_03923898.1| ribosomal protein L27 29 12 Op 8 . - CDS 37472 - 37792 376 ## PROTEIN SUPPORTED gi|229244194|ref|ZP_04368313.1| LSU ribosomal protein L21P - Prom 37870 - 37929 1.9 Predicted protein(s) >gi|319979153|gb|AEUH01000040.1| GENE 1 1 - 393 107 130 aa, chain + ## HITS:0 COG:no KEGG:no NR:no QPGHFQRFTYIEKATFEITVRWDGNRWLISEVVPTPTTPRPDWLPPVAQTPAPTSQPPTT TPPSQPPAPTPPAPQAPDGTPSNGNAPSQSTGPDTTAPGTTGSPSTDRSADEEEPTGAPE PTNPADAYAR >gi|319979153|gb|AEUH01000040.1| GENE 2 727 - 4374 4106 1215 aa, chain + ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1074 1211 464 599 621 144 50.0 1e-33 MPTRRTNALAALLASSSLLLASAVALPAQSFPPPGGDDQGQGSPATSQAAADIALTSKAD YENGTGPGPADEAHPDASQSGASQPDAPQSDASQPDGAEGHAPEEGVRIIVQFADGASES DCDELVDRIGEAVAASVPAAAGGPAITRARDYRNVFTGVAIDAPAAALPVVQGVDGVKSA FIEREGHIEGDESEQPGGPSGNGGPAHEAGADGSGSASAAHSPSPAHSPSPAGTPTSGDA ASNGDGAPSGAPASGAAPSPAATPSQDAAAGSGNVEGGADSLAAEGIDPSNRSAHLMMRM 
DHVSHKGEGRVIAFLDTGLEVAHPAFSGAVDASKTALKRADVEQALPRLGEGKDGHYVND KIPFAYDYADDDADVAPSSGPGGFHGTHVAGIAAANADRIRGTAPGAQIIVAKVARSGNG SLPDSAVLAALDDMAVLRPDVVNLSIGWSAGMDNAADSLYSTVYASLQGAGVTVNAAAGN SYSAGRGNRSGKNLPYASDPDSSVMDEPATYSSAVAVASVDNAPANGAYRASDFSAWGVR PDLRLKPEIASPGGGVVSAVPGGAYDQASGTSMATPQMAGISAIVLERVSTDPLFSGMSA AERTGVAQSLIMGTAHPLVDADQGTGAFYSPRKQGAGLVDALAATTSPVYPTVDGAAEPS RPKADLGDGTAGWSFTITVHNLSDSAKSYALSSQALSEAIEGGFFTLHSTDWRGKGVSVS YSGAAVAGSGEGAALTVPASGRASVTVSVAPGAAFASYANANAPKGTFIDGFVRLAAQNG SGADLSVPYLGFYGSWGAADVFDAKASDAAVSPAHIYPSAFVDSRTGRPLGANPLAPRNT ETVPDPGRYVVSRAASSLATRRAEPRTGLLRSVHTLTSTYANEAGATVREYTNYQNYKSV RNANGTVSRAESYHLAPVFDSEDQVGAGLPDGKYTLTIAATTSGPSPTRHAISYDFALDT TAPRVTVKGVIGEGAGAKVAFDVTDASPLAAFDFHDPSNGTWYYRELVNDDGTVNPDGSH TYHFEVSASALQAAWEAQRGKGAAPSQPYVLAWDWGVNPSDKAVVRFPGTTSGAWTHDSH GWWYRLPDGSWPSSTSMVIDGETYRFDASGYMRTGWVGEAGSWYYHLPSGAMAKGWAHDS GSWYYLSPGTGAMTTGWLKQGSTWYYLTASGAMATGWLKVGGTWYYLAPSGAMATGWTNI DGTWYYFSSSGAWTG >gi|319979153|gb|AEUH01000040.1| GENE 3 4531 - 5331 1191 266 aa, chain - ## HITS:1 COG:DRA0008 KEGG:ns NR:ns ## COG: DRA0008 COG0842 # Protein_GI_number: 15807680 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Deinococcus radiodurans # 18 266 12 261 263 63 23.0 4e-10 MSSSTAFTDEAISRRAPSSGFKGFGVLTWFTFWKFITNPITISFSLILPILMYLIFGSGQ SYAQNWGVHGNAAASVLANMTTYGVVLVASSMGANAALDRTMGISRLFALTPMRSAANIL ARLIAAVGAGAIVVAIVYAVGAFTGAQMEASAWALTPVIMLPASILAAAIGLAVAFTVRS DGAYAASSAVILLSAVCAGMFMPLSMMGPVFSAIAPFTPLYGITNMVQVPLQGLDSFHWA DPVNFVVWTVLFVGIAVWGQRRDTNR >gi|319979153|gb|AEUH01000040.1| GENE 4 5361 - 6302 1251 313 aa, chain - ## HITS:1 COG:BS_yvfR KEGG:ns NR:ns ## COG: BS_yvfR COG1131 # Protein_GI_number: 16080462 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Bacillus subtilis # 20 311 5 293 301 160 34.0 4e-39 MSTEHATAVHPPRAKRRARAVSVEEVTKDYGKVRALNALSLDIPRGQIMALLGKNGAGKS TLIDIILGLQAPTSGSARVFGLAPRDAIRRSLVGVVLQTGSLPVDYTVAEALRLFGSTHD 
AHVDYGTILEETQLAHMSGRVIRKLSGGEQQHVRLALALLPDPHLLILDEPTAGMDATAR REFWDVMRTQAERGRTIVFATHYLTEAQDFAERTVIIKDGSVIKDAPTDELRRMNRSFHL TVDVDRGAGPRLVEQLRAAPEAGAWKISTEGGRIVVDGDDTDPAARILLSHPEAHGLEIT ASSLEDVFTSLTA >gi|319979153|gb|AEUH01000040.1| GENE 5 6476 - 9745 4736 1089 aa, chain - ## HITS:1 COG:MT1587 KEGG:ns NR:ns ## COG: MT1587 COG0060 # Protein_GI_number: 15841003 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Mycobacterium tuberculosis CDC1551 # 18 1088 13 1040 1041 1203 58.0 0 MTKHGFYPLHRDDEVDPSPSFPAMEGATLAFWAADDTFRKSIARREGGSNEFVFYDGPPF ANGLPHYGHLLTGYVKDVVGRYQTMLGHRVERRFGWDTHGLPAELEAQRQLGIDDVTEIT REGGVGIAAFNEACRSSVLRYTKEWEDYVTRQARWVDFEHDYKTLDMPYTESVIWAFKQL YDKGLAYQGHRVLPYCWNDRTPLSNHELRMDDEVYQDRTDNTVTVGLRLTEPLLADSERP ELALVWTTTPWTLPSNSAVAVGPGIDYVVVEADPGAASPVAGERVIIARDLLAAHAKQLG EAPRVLASFKGSDLVGRRYHPIYDYFDDGAHRAEGAVPGPNAWTVVAADYVTTTDGTGLV HQAPAFGEDDMWTCMEYGIGMVLPVDDGGVFTSEVPDYQGMHIFDANRHIVADLRDQSGP IARRDPRVRAVLVAEASYTHSYPHCWRCRKPLMYKAVSSWFVRVTAIRDRMVELNQQINW TPEHIKDGVFGKWLAGARDWSISRNRFWGAPIPVWVSDDPAYPRTDVYGSIAELEADFGV EVVDFHRPFIDSLTRPNPDDPTGRSTMRRIPDVFDCWFESGSMPFAQVHYPFENQEWFES HYPGDFIVEYIGQTRGWFYTLHVLATALFDRPAFRNCVSHGIVLGDDGLKMSKSLRNYPD VSQVFDRYGSDAMRWFLMSSPVVRGGNLMVKESAIRDTVRQVLLPIWNTYYFFTLYAGAA DKGAGYRARRLDLSSAPAVEGLAQMDRYLLAHTRGLAEDVRARLDAYDVAGACDAVRDHL DVLTNWYVRTQRQRFWDEDGAAFDALYTALVVLMEVAAPLLPLLAEEVWRGLTGGESVHL QDFPVIDEAVAAPALVAAMDEVRSIVSAAHALRKTHQLRVRQPLASLRVVSEGYADLEPF ADLIASEVNVKSVVFAAPEESGLGVRTELGLNPRAFDPAFRKLTSQLFKAQKSGDWEILD DGSCRFPSVLVDGSPLVLTDGMFSVSTSVDAAEGQVADVLPSGTFVVLDTELTPELEAEG YARDVVRAVQEERKNAGLHIADRIDLMLSVPAARVADVEAWRDMIAAETLALSVAVEAGD KLAVGVAKH >gi|319979153|gb|AEUH01000040.1| GENE 6 9816 - 10748 983 310 aa, chain + ## HITS:1 COG:BS_ycgR KEGG:ns NR:ns ## COG: BS_ycgR COG0701 # Protein_GI_number: 16077394 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Bacillus subtilis # 100 299 8 204 294 102 34.0 1e-21 
MPRWRGLFVGAVSGPPVRFYWGTGAAPSGSLFFRMLPGDAPLSGPDHSNADSTHRFAPGA GYAGGVTIERARPAALAAYSAVAVVAGAVVLAGALGGQGVRMWAIGVGIFFQAAPFLALG VLVSEVIEEFVSPERLAHLFPRGPASIATALAAALVLPVCDCSAVPVFRSLLRKGVPTSA AVTLMLASPSINPIVVWSTWYAFGSWRIVVLRALLAVAVTVVVGVVMGRVGGPVLVEDEA ECAAGCGCHPGGGGAGAGGRLALVASHSATGFGRILPYLLVGVALSSAAQVLVSPTQSLD GRPRPWRWAR >gi|319979153|gb|AEUH01000040.1| GENE 7 10745 - 10999 400 84 aa, chain + ## HITS:1 COG:BH1518 KEGG:ns NR:ns ## COG: BH1518 COG0701 # Protein_GI_number: 15614081 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Bacillus halodurans # 1 73 258 330 337 61 43.0 5e-10 MMAAGFVFSLCSSSDAIIARSLSALAPAGALLGFMVYGPMMDVKNVLLLSSSFDKRFVAK LFVVVTVASGAAVGGAWALGLVAL >gi|319979153|gb|AEUH01000040.1| GENE 8 10996 - 12006 1162 336 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508973|ref|ZP_02044615.1| ## NR: gi|154508973|ref|ZP_02044615.1| hypothetical protein ACTODO_01489 [Actinomyces odontolyticus ATCC 17982] # 43 336 6 287 287 127 36.0 1e-27 MSGEEQEGQDGQDRAFLAIAGDFEASQRRRAARRAEEEERRPRLVAWAEAAVLVVLAASL ALAVVSGRYLKVVPPRAVWGIGFLVVCCLVWAWARARSARRREAPAREPGAWVRVIGLWA AAVCIALPINPTPSQLLGSGAPRSQTSAHSQLAPGGAPPADEAGAAAAQSAGPDGGSPTS SPTAGTGSHAPAPASSASAAGRAGTASSGPPLDVTTQTFYSTLRELSANGAAYEGREITV TGFVVAPQTAASQQGLPRLAATDSFALARMTIWCCAADAYALGFAVEAPAGTQVEPDQWV RVHGRLHSRGGKALALEADSVEAVDAPDQEFVFEER >gi|319979153|gb|AEUH01000040.1| GENE 9 12326 - 12472 210 48 aa, chain + ## HITS:1 COG:no KEGG:SCO7547 NR:ns ## KEGG: SCO7547 # Name: SC5F1.01, SC8G12.23 # Def: sulfatase # Organism: S.coelicolor # Pathway: not_defined # 1 48 247 294 533 72 64.0 4e-12 MHFFGSHLPYFLPDEWFDLIDPADVELPESFGDPLIGKPPIQLNYATY >gi|319979153|gb|AEUH01000040.1| GENE 10 12539 - 13192 884 217 aa, chain + ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 1 198 269 465 467 117 32.0 2e-26 
MIDFEIGRIMDAARELGVLDDTAVFFCADHGEFTGVHRLNDKGPMMYDDIYNIPFIARIP GVSAVGRSTAFVSLIDLPATVLDVARLDTSLVEDGRSIVDLTRGGAVEGWRQDIVCEFHG HHFPLQQRMLRTRDFKLVINPESVNELYDLRRDPAEMTNVYASPVYDEIRRELATNLYRQ LRERGDHSFAKWMAAMTDFDIPLANTARFDLDGVGAE >gi|319979153|gb|AEUH01000040.1| GENE 11 13201 - 14481 1673 426 aa, chain + ## HITS:1 COG:MA2647 KEGG:ns NR:ns ## COG: MA2647 COG0641 # Protein_GI_number: 20091470 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Methanosarcina acetivorans str.C2A # 1 422 1 411 446 373 44.0 1e-103 MTSPLPFSVVVKPTGAACNLDCQYCFFLSKELLYSARSQMMGEDTLEEYVRAHLRASPDG EVAMLWQGGEPALRGLPFFRAAVDACERHRRPAQRVRHCLQTNGTLIDDEWARFLADNGF LVGVSIDGPAPLHDAYRLNRGGRGTHSMVMRGWEALERAGADVNVLCAVHAANQDHGAEV YRYFRDGLGARFLQFIPIVERVPAAALAQAEAGWRSGGAALLYRQEGDRVTSRSASPASY GAFLCEVFDQWLPRDVGGVFVQDFDSALSALFGAPTVCAHAPRCGANMAMEFNGDVYACD HWVEPDWLIGNIADSSFAQLAFSPRMREFAEKKPDLDPGCRSCPHLPLCWGGCPKDRFVP TASGARRNYLCEGYRAFYSHASPALAAMARLLRSGRPASDIMDPGTAALLGAALPAVADD SCAPAR >gi|319979153|gb|AEUH01000040.1| GENE 12 14586 - 15347 1063 253 aa, chain - ## HITS:1 COG:MT0502 KEGG:ns NR:ns ## COG: MT0502 COG4221 # Protein_GI_number: 15839874 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Mycobacterium tuberculosis CDC1551 # 5 240 9 241 251 192 55.0 5e-49 MSTPRRAVVTGASTGIGEATVRLLVDHGWGVVAVARRRERLKALADQIGCEYWAADLTDE AQVEGMASHVLEGGPVDALVNNAGGAIGVDHVAQADAGRWSAMFDRNVLTALHCTRAFLP AMRKRGGDLVFLTSTAAHDTYPGGGGYVAAKHAERVIANTLRQELVGEPVRVIEIAPGMV RTEEFSLNRLGSREAADRVYAGVAEPLVAADVAEAIVWALERPRHVNIDSMIVRPVAQAT NTVVARSADGERR >gi|319979153|gb|AEUH01000040.1| GENE 13 15422 - 16714 1824 430 aa, chain - ## HITS:1 COG:YPO3824 KEGG:ns NR:ns ## COG: YPO3824 COG0247 # Protein_GI_number: 16123959 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Yersinia pestis # 33 424 12 413 415 244 33.0 2e-64 MFPAFNGGLSSAGVEEEDVISLGDGTVGGLRGSLDACVKCTICETNCPVMRVTDLFGGPK 
YSGPQAERFRQKGQIVDKSIDYCSSCGTCSLVCPQGVKVTEIIHHRRTDMKRAVGIPVRD RLIGRTSLIGTMMTPVAPIANWALDVKPIRMVMEKVIGVHRSAPMPRAYGRTFEGWFKKH KPLPSSGERGQVIFFHGCAGQYFEVETSIHSVLVLEHLGYEVLVPKHGCCGLALQSNGLY SDARKYVSRLTDDLRKVNRDAPIVSASGSCAGMLRHEAHDILEMDTPELEDVGSRTWDIC EFLLDLHDRGELDTDFQRIDITIPYHAPCQLKSQGLGLPAMEVMGLIPGVKVVESQQPCC GIAGTYGMKKEKYAIAQAVGAPVFDFIKQVNAELAACDTETCRWQLRTATGANVVHPIWL IHKAYGLPNG >gi|319979153|gb|AEUH01000040.1| GENE 14 16744 - 17994 1672 416 aa, chain - ## HITS:1 COG:YPO3825 KEGG:ns NR:ns ## COG: YPO3825 COG3075 # Protein_GI_number: 16123960 # Func_class: E Amino acid transport and metabolism # Function: Anaerobic glycerol-3-phosphate dehydrogenase # Organism: Yersinia pestis # 3 412 4 417 424 120 31.0 6e-27 MRDAVVIGAGIAGLAAAIRLARSGASVTLVAKGVGGLQLGTGTIDVLGYSPERVERPLEA IGPHVASRESHPYSHFTPGFVGEAVNWLRDTAGPSLLVGDASRNVAIPTGVGALRPTNLY QPSMAAGVPAPGASYAIVGLKRFKDFYPALVAENLSRQRGPDGQRLQARAISVDFEVRAG EADSTGTNHARALDEADARQRLADAIGPLLSAGEAVGLPAVLGLADPGAAADLAERLGHG VFEIPVQPPSVPGMRLNTALSEIVKYEGRLIMGSAVTGFKRRAGRIVSVDISTAGRPTTI EAGAFVLAAGGFESGALAVDSHGQVSETVFGLPVMGAQGQLLHADFWGEDQPLFLAGIAV DDQMRPVGPDGSAVYSNLHAVGGNLAGATRWREKSGEGIALASAVRAADTIVEGLR >gi|319979153|gb|AEUH01000040.1| GENE 15 17991 - 19577 2252 528 aa, chain - ## HITS:1 COG:VCA0747 KEGG:ns NR:ns ## COG: VCA0747 COG0578 # Protein_GI_number: 15601502 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Vibrio cholerae # 4 520 7 537 547 349 38.0 6e-96 MRTIHTDVVVIGGGSTGAGVVRDVAMRGFSAVLVDRADIAQGTSGRFHGLLHSGGRYIAS DPESATECAEENAIVKRINANAVEATGGLFVTTAQDEEDYADQFLARASAARVPAREITV SEALAREPRLDPRIKRAFEVEDGTVDGWQMVWGAIRSAEAYGAKILTYHRVVSIDREGDR VSAVVVHDERGGEDLRVECGFVVNCGGPWAGQIAAMGDCHGVDVVPGAGIMIAMNHRLTG TVLNRCVWPEDGDILVPDHTVCIIGTTDIKAEDPDRLAITADEVQQMLDSGEALVPGFRR SRALHAWAGARPLIKDSRVAATDTRHMARGMSVIDHHSRDGVYGMLTIAGGKLTTYRLMA ERVVDIMCSEMGEQRPCKTAGEAVPPAEDGRLYTIGHRLDNVESRNDGPSHEQIICECEM ATRAMLERVMDTLPKSQLDDVRRQMRLGMGPCQGGFCSQRAAGIAHERGDIDAERANGLL RLFLKNRWIGLWPILYGKQVRQAALDNWIHEGTLDVEHLPVPVEEVVR 
>gi|319979153|gb|AEUH01000040.1| GENE 16 19858 - 21915 2908 685 aa, chain + ## HITS:1 COG:MT0208 KEGG:ns NR:ns ## COG: MT0208 COG3590 # Protein_GI_number: 15839578 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted metalloendopeptidase # Organism: Mycobacterium tuberculosis CDC1551 # 11 685 10 662 663 578 49.0 1e-164 MDHDIRDRSGEAANADPSVRPQDDFYRHVNGGWLARHKIPADRPIDGAFHALRDQSEAHC RAIAEAAAAGALDDPDAGRIATLWNQFLDEGAIEAAGASPIAPDLEAIAQAPTRADLAAL TGRLMREGAGGLVSAYVGTNPHDSSQYMVSLVQSGLGLPDEAYYREDDYEPVRRAYVAHT ARMLELAGAPDPEGAAGRIMAFETALAAHHRDSVSDRDPRLSDNPTPWEDLVAANPGFDW RGWADGARLPTASLVVNVDQPEFLRGACALWADTDLGTLKEWAAASVIDARAPVLSSAFV EENFDFHGRTLAGTEQLRPRWKRALGLIESCLGEAMGRAWVARHFPPASKERMDDLVARL LDAYGRSIRALDWMGDATKGRALDKLAAFNPKIGYPPKWKDYSGLDIDPGAPLVDNLRAA ASHETEREWARLGTPVDRDRWYMTPQTVNAYYNPTQNEIVFPAAILQPPFFDPGADDAVN FGAIGAVIGHEIGHGFDDQGSRFDERGNLENWWTDQDRERFEERTRALIDQYNALTPADL LPGAHTPDEAGGADGAERPDGAGGTGAPHVNGALTIGENIGDLGGLTIAWKAWAAALAER GLTPATAPAVDGLSGPERFFQSWARVWRTAARPRFARQMLAVDPHSPAEFRCNQVLKNFD EFAAAYGVAEGDGMWLDPSERVRIW >gi|319979153|gb|AEUH01000040.1| GENE 17 22038 - 24596 3912 852 aa, chain + ## HITS:1 COG:MT2542 KEGG:ns NR:ns ## COG: MT2542 COG0308 # Protein_GI_number: 15841991 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase N # Organism: Mycobacterium tuberculosis CDC1551 # 5 852 5 859 861 657 45.0 0 MPGLNLTRDEALARAGVISEVIRYDIELDLTRGDTDFGSRTRVEFTAAPGGAAFADLVST NVRSIKLNDRDLDPGAHHQDSRISLDDLAEHNVLEVDADCQYMHTGEGLHRFVDPADGKA YTYSQFEVPDARRVYTTFEQPDLKSTFTLTVKAPKGWKVFSNAPTPSPEEDGDSWTYRFA TTEKMSTYITAVVAGPYEGVTDTLTSSDGRTIDLGVYCRASVLEHLDADAIIDITRKGFE FFEDAYGIAYPFTKYDQVFVPEYNAGAMENAGCVTFRDAYVFRTRPTEAQLESRANTILH ELAHMWFGDLVTMKWWNDLWLNESFAEFMSHLALAENTPYTEGWTGFMVRKDWGLKQDQL PTTHPITAQIRDLADVEVNFDGITYAKGASVLRQLVSYVGRDAFFAGLHEYLTAHSYANA TLADLLGELEKASGRDLAAWSKVWLEEAGVTLLRPSVETDEEGRITRLSIEQEAFSEGAS LRPHRLAVAGYSLDGESLQRVFHEELDVDGASTDVPSAEGVARPDFILVNDGDLAYAKIR LDEDSLAFAVANITRFTDSLTRGVVMAAAWDMTRDGQMKARDYLNLALTAVPAETNMQLL 
TLTLRHIDEAVRTFVAPDARAEAAETVGRRLLLLARTARSGSDAQRMLVAAAARNASNAE QFEAIKALYDGSATLEGLELDVDLQWSLLIALVRGGVAGDAEIDAREQEDDTMTGRQNAA AARAARDDAAVKEQVWEQVLGDKSIPNDTRWAMVSGFWAQARTTPSLYEPYVERYFAALA QVWEENTFHTAEDLTTLLFPSDLAGYAPGVDVVRAGHEWIDANPGAPAGAVRIIRERIDV CERQMANQVADA >gi|319979153|gb|AEUH01000040.1| GENE 18 24801 - 27185 2866 794 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 586 778 521 710 744 176 48.0 1e-43 MTANAAHAGTDEGGSQVSSAVFQWAMNDESGSAGYAPGTCNFFSAMVSGNSGGGQWPNNV QSGSDGEFLVGGPDQDVLWRPKDGNVSILKPNADGQLVAQTWANKCLTRTGARTNTRGAV SESIINIEKGSGTVDPTTNSASIKWVGSWTVVYYSGMTYWSATNPVLEVTNGQGAIKARV SGFGASMQDPDAALKPLQAREVTLATLKNVAVTKTGITVTPEYKGVEVKNPEGASPQTRT GENWGSWPADFVAFQGETGQHSYWYSTGSSGDARKPALPVTITWPQIDDAPAPQPGPTQE PTPAPTADPSPAPTAEPTPAPTADPSPAPTAEPTPTPTGKPTTGPSAKPSIEVSPYQDID PTVKTVVTVKGKDFTSGSIQNGVYVVLMDEKIWKPGEYMNAKGDEGAITSAWVTPDQLEA GKGSFTVKLTIPENTLDRTHTYYIGTMAAHGLALSDRSLDHAEKITFKAIDYDPNAVFPV DSVTAAAVADSGRTNMKVDWVYTGATPSRGWEISLACVRDCSDPLFGTKSHHQHDNSLRS DVFGDVPDGVYVASVRNAGEISGEFKASDWVSSAEVRVGAERDPNPGEGKGQWIQDSFGW WYRHSDGSYTTNGTEQIDGVTYRFNEAGYMVTGWTKVGGQWFYYAPSGAQASGWTLVGGS YYYMDPGTGAMATGWQQVDGAWYYLGASGAMATGWQRIGGAWYHLGASGAMDTGWTAIGG SWYYFAPSGQMATGWVQDSGKWYYLDPSTGAMATGWQQVGGAWYYLFASGEMATGSHWID GVLRVFDNSGAWVR >gi|319979153|gb|AEUH01000040.1| GENE 19 27428 - 29251 2125 607 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508990|ref|ZP_02044632.1| ## NR: gi|154508990|ref|ZP_02044632.1| hypothetical protein ACTODO_01506 [Actinomyces odontolyticus ATCC 17982] # 1 568 1 574 655 333 48.0 3e-89 MTFQLSRTHMSIVAAATTVAASFVVIPALAENGAPPADAVCPALLAAHGRIDSEYNPALS AYADAVAAETAARNTANGGTTATEAQLRQTLADAQAALEAAKAAPAPTAQPTATQPAAEP SRGIGWVDWSKVDHSTQAQVIEALIVQKMNAYRANAGLPPLVVSDTLTTDSRAWSKHMAV DNDFNHDPEFRYLGSQSLDGASTNYAGENIAYNSMRNFGAQLGARGNEYKTPDQAASDLF EQWKNSQGHDKNMLAENNVVGVGVHIGDLADGGLTVWGTQKFYAISDKGSIDLSRFHTTG 
DTASAYGFDGAAFNADNTDYVSGLAGDGVKQKQAAWTAIVGDLPAQALPVAPSSLPAKDG ASPSPAPSTTPDPAPTADGSAVAQAQAAVDAAQKALDDFLADPQSAYSQAVENLKAATAA LDGIAESIGSQTGVRPQPDGGVVVSTPNTNSSYYPVEAYRKKLAACLATTGQQQPVGPAQ SGATGAPAAPASVAQQSNAPAGPAPSEQSSAAPSPAASQQSSPAAGSDGKGGQSVSTEAG GAGPGAANGTNNGGGAPGAADKGAATGARAANLASTGSMTLVLGIGALIVLGIGVGAMLV AKRRKGK >gi|319979153|gb|AEUH01000040.1| GENE 20 29175 - 29870 154 231 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTTKEAATVVAAATIDMCVRDSWKVIGNRPNTQRTGHSNLAIRFGKLQTPLPALCWRSAP FQRSRRPPPSRWSHMAIPKRYTRSRGRSLPGTGQCWCARAERPRRTTKGSNASALEPFGG ARGIRTPDLFHAMEARYQLRHSPVFRSPPLGDSTTLADRRRGAQIKNSAARHTLFLAVGR SPVYRFLRRSGRERPLARGLASPIQRAVVWAMSTSGAINRLWLRSCRCHGG >gi|319979153|gb|AEUH01000040.1| GENE 21 30226 - 31281 1186 351 aa, chain + ## HITS:1 COG:alr0996 KEGG:ns NR:ns ## COG: alr0996 COG1404 # Protein_GI_number: 17228491 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. PCC 7120 # 70 257 173 364 488 71 32.0 2e-12 MAWGVEEIGTQDYVDALGLGEGIASGSLGAGVGIGVIDGPADTSVPELAGADVTVRPMCD YTASDMSRSHGTAVVSVLAARGYGVARGARIINYAIPSDSDAKTPDCAGAGVAHAIGAAV ADGVRVVSISLSSDVWVDNAAREAVAMAVAHGVVAVVATGNEGAVDPPDSYASMNGTVGA GASDGQGRIKDYSNRGRGLTVLAPGDIRWHNLAARRIEKATGTSYATPVVAGLVAVAMGE WPGATSNQVLQALVATATAGPTGQPLVSPGGLNRTDPGQYPDENPLMDKFPGTEPSPATV ADYRDGLLATASVFDNDPSYTYRGTDPDVVRSHPDRTALGTSPRYHAPHDQ >gi|319979153|gb|AEUH01000040.1| GENE 22 31343 - 32071 813 242 aa, chain - ## HITS:1 COG:Cgl2299 KEGG:ns NR:ns ## COG: Cgl2299 COG0406 # Protein_GI_number: 19553549 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Corynebacterium glutamicum # 4 210 3 186 236 86 32.0 4e-17 MGGRRIVLVRHGQTDFNVERRFQGRADQPLNERGMAQARAAAGVLATRLSGPGEQVGVNA DSGARTGGGARLICSPLLRARMTARILADVFEAVGRPLDGPFVDERLTERDFGRMEGHTF AELVELYPAQVAQWRACGECPEAGVEPSGAVGRRMRDAVLEAEGQCRDGQTLLVVSHGSA IARGITSLLGLDPDAFDGLRGVDNCHWSELVPIAHAKRRGTVSLGWRLAAHNVGVREDVL GA >gi|319979153|gb|AEUH01000040.1| GENE 23 32073 - 32501 
404 142 aa, chain - ## HITS:1 COG:MT2493 KEGG:ns NR:ns ## COG: MT2493 COG0799 # Protein_GI_number: 15841939 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Mycobacterium tuberculosis CDC1551 # 25 124 25 123 126 87 44.0 1e-17 MSLPAQTTDLAALVAAAAHERGGADPVLVDVRSRMDLADAFVVVSAPTPRQVSAVAEEVM DRVARAVRLRPAHIEGRGGGTWILIDYSDLVVHVMGQQDREYYALERLWGDCPATRLEDG AEAARREAAARSLAHAAAEGGA >gi|319979153|gb|AEUH01000040.1| GENE 24 32476 - 33165 717 229 aa, chain - ## HITS:1 COG:Cgl2301 KEGG:ns NR:ns ## COG: Cgl2301 COG1057 # Protein_GI_number: 19553551 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Corynebacterium glutamicum # 16 219 3 209 218 263 66.0 2e-70 MSGTETASDGAAAPRTPASRRRAIGIMGGTFDPIHHGHLVAASEVMDAFDLDQVVFVPAS MQPFKEGRRVTPAEHRYLMTVIATASNPRFAVSRVDIDRGGTTYTVDTLADLRAQYPDAD FAFITGADALAHIAQWKDSDALFEQAHFVGVTRPGHVLADQGLPGSAVSLVEVPAMAISS TDCRARVAQGKPVWYLVPDGVVQYISKHGLYRPGAPVEDHVHVPSRPDH >gi|319979153|gb|AEUH01000040.1| GENE 25 33162 - 34439 1639 425 aa, chain - ## HITS:1 COG:lin1227 KEGG:ns NR:ns ## COG: lin1227 COG0014 # Protein_GI_number: 16800296 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: Listeria innocua # 12 424 9 414 415 382 51.0 1e-106 MDENVDIGSVTSAARSASRVLAGASPARKNAALEAIASALVERAGEILAGNASDMERGRA NGMAEGLLDRLELTSERLGAIAASVREVAALPDPVGEVVRGTTNELGLRITQVRVPMGVV GMIYEARPNVTVDAAVLALKAGSAVVLRGGSAAEASNAALVGAMRGALESVGLPADLVQT VDPWGRAGANAMMVARGGIDVLIPRGGAGLINAVVSHSTVPVIETGTGNCHVYVDAGADE EAALRIVMNAKIQRVGVCNAAETLLVHRDAAPTFLPAVLRELTGAGVLVHGDPAALECAA GAGVDPAMLREATEEDWATEYLAMELAVRVVDSIEEAIDHIRAYSSGHTEAICTPSLASA NAFRAGVDSAAIAVNASTRFTDGGQLGLGAEVGISTQKLHARGPMGLAELTTTKWIIEGD GTIRP >gi|319979153|gb|AEUH01000040.1| GENE 26 34473 - 35573 1290 366 aa, chain - ## HITS:1 COG:MT2515 KEGG:ns NR:ns ## COG: MT2515 COG0263 # Protein_GI_number: 15841961 # Func_class: E Amino acid transport and metabolism # Function: 
Glutamate 5-kinase # Organism: Mycobacterium tuberculosis CDC1551 # 3 362 8 365 376 256 49.0 4e-68 MRALSEARRVVVKIGSSSLTRPDGGLDLNRIDAVARVVARWRRGGRQAVIVSSGAVAAGL DPLGFASKPSDLASVQAAAAMGQGLLVARWTAAFHTHHVDAAQVLLTTEDVMARDHYTNA TAALGRLLSLGVVPIVNENDAVTTRELRFGDNDRVAAIIAQMVQADALVLLTDVDGLYTA PPSRPGARLIERVDSTDDLMSVLVTGAGSKVGTGGMASKVQAATMASASGIGVLLANADK VEEALSGKGTGTWFAPTHERPASRRLWIAHAAPSRGELVVDAGAARAITVGKKSLLAAGV VSVAGHFDAGDVVDIASPTGLVARGIVSYDADSLAAIAGRSAPELEGTGWEHVRPVVHRD DLAPLL >gi|319979153|gb|AEUH01000040.1| GENE 27 35570 - 37102 1969 510 aa, chain - ## HITS:1 COG:Cgl2306 KEGG:ns NR:ns ## COG: Cgl2306 COG0536 # Protein_GI_number: 19553556 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Corynebacterium glutamicum # 1 476 1 475 501 451 56.0 1e-126 MADFVDRVTLHAAGGNGGNGVASIKREKFKPLAGPDGGDGGDGGSVVLEVSDQETTLLTY HRSPHQRAQNGTQGMGDFRQGKNGADIVLPVPDGTVVKSTSGELLADLTGAGARFVVAQG GRGGLGNAALASKARKAPGFALLGEPGEERDVVLELKSVADAALVGFPSSGKSSLIAAMS SARPKIADYPFTTLVPNLGVVAAGDVRYTMADVPGLIPGASAGKGLGLDFLRHIERCAVI VHVLDTAVYEAERTPVDDLRTIEAELGAYQGDLGGLEGHVPLMERPRVIVLNKVDVPDGR DLAEIVRPELEATGWPVFEVSAVSHEGLRALSFALAGIVERERARRPALEAARPVIRPRA VGQRTAVSVRRTRKDGEEIFQVRGDKPERWVRQTDFANDEAVGYLADRLAAAGVEDELVK AGAVAGDAVVIGELDGGVVFDWEPTLTTGPELLGSRGTDSRIDQSARRTNKERRQAYHQA MDAKAAAREELWTERAAGHWTDAADGGGDR >gi|319979153|gb|AEUH01000040.1| GENE 28 37178 - 37438 371 86 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227493582|ref|ZP_03923898.1| ribosomal protein L27 [Mobiluncus curtisii ATCC 43063] # 1 86 1 86 86 147 83 9e-35 MAHKKGLGSSRNGRDSNAQRLGVKRFGGQRVNAGEILVRQRGTHFHPGDGVGRGKDDTLF ALRAGEVEFGRKRDRRVVSVVESASA >gi|319979153|gb|AEUH01000040.1| GENE 29 37472 - 37792 376 106 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229244194|ref|ZP_04368313.1| LSU ribosomal protein L21P [Cellulomonas flavigena DSM 20109] # 1 105 1 105 106 149 73 2e-35 MSSNVVYAIVKAGGRQEKVSVGDVVVVDKLAEGIGSSVSLKPLMLVDGDAITVDAAKLGG VTVKAEVVDSAKGPKISIIKYKNKTGYRKRQGHRQKMSVVKITEIA Prediction of potential genes in microbial genomes 
Time: Thu May 12 17:12:48 2011 Seq name: gi|319979151|gb|AEUH01000041.1| Actinomyces sp. oral taxon 178 str. F0338 contig00041, whole genome shotgun sequence Length of sequence - 2191 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 2004 2535 ## COG1530 Ribonucleases G and E Predicted protein(s) >gi|319979151|gb|AEUH01000041.1| GENE 1 3 - 2004 2535 667 aa, chain - ## HITS:1 COG:ML1468 KEGG:ns NR:ns ## COG: ML1468 COG1530 # Protein_GI_number: 15827770 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Mycobacterium leprae # 151 647 246 770 924 546 61.0 1e-155 MTPELDSPAPVATAASLLFQAPVFQEPAGAGGPDEAPSRRSADAGDEPKEASPRRRRSSR TAPEDGAGADAKGVPDDDGAAGAPPKKRRTRKRSSAEAPSADGEGTGPQEDGEAPAPKRR RRAKKEDSAADPQEQDAHEAEGEGGRRRRRRRPRSSDEAEAGGDLTDQVKALKGSTRLEA KKQRRREGRREGRRRHSITESEFLARRESVKRTMVIRDHEGLDQIAILEDDLLVEHYVAR RTSKSAVGNIYLGRVQNVLPSMEAAFVDIGRGRNAVLYAGEVNWDAVGLDGKPRRIESAL KSGDSVLVQVTKDPIGHKGARLTSQITLAGRYLVLIPNGSMMGISRKLPDKERARLKKIL KQVVPSGSGVIVRTAAEGASEEQIRADVARLTKQWEDVRAKQKSTRSAPALMRGEPELAV RVVRDIFNEDFTSLVIQGRGTYATVKEYVDELSPELSDRVEQWVGADDVFHAHRIDEQLA KGMDRKVWLPSGGTLIIDRTEAMTVIDVNTGKYIGAGGTLEETVTRNNLEAAEEIVRQLR LRDIGGIVIIDFVDMVLESNRDLVLRRLVECLGRDRTRHQVTEITSLGLVQMTRKRVGEG LVEAFSTQCGACEGRGFIVHSHPVEHPGASDKNGKGSKRSKKSAEGRRAGGEQAADHGRA KEALSAI Prediction of potential genes in microbial genomes Time: Thu May 12 17:12:50 2011 Seq name: gi|319979147|gb|AEUH01000042.1| Actinomyces sp. oral taxon 178 str. F0338 contig00042, whole genome shotgun sequence Length of sequence - 2905 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 31/0.000 + CDS 221 - 1771 2336 ## COG1271 Cytochrome bd-type quinol oxidase, subunit 1 2 1 Op 2 . 
+ CDS 1789 - 2905 1650 ## COG1294 Cytochrome bd-type quinol oxidase, subunit 2 Predicted protein(s) >gi|319979147|gb|AEUH01000042.1| GENE 1 221 - 1771 2336 516 aa, chain + ## HITS:1 COG:MT1659 KEGG:ns NR:ns ## COG: MT1659 COG1271 # Protein_GI_number: 15841078 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 1 # Organism: Mycobacterium tuberculosis CDC1551 # 8 513 1 479 485 400 46.0 1e-111 MTLAQTALDAVTIGRWQFAITTVYHFVLVPLTIGLSLLVAIMQTIWHRTGKEHWLSATRF FGKLLLINFALGVATGIVQEFQFGMNWSEYSRFVGDIFGAPLAVEALLAFFLESTFLGLW IFGWGRLSKRAHLASIWCVALGTMFSAAWILAANAWMQHPVGARFNPATGRAELDGVSGF LELITSGVYLSEFAHVITSAWLVAGSFVAGISIWWMVRSAREGSQQAADHARGVWRPIAR FGLVAVLIGGLGTAASGHIQGQEMVEVQPMKMAAAEGICVDTEGAAFTIAQFGSCPLGDD SQPLQIIRVPGVASFMSHNSFTAASEGVADIQDRMVALLNADPDFTAKYGDASAYDFRPP QMAVFWSFRLMIGLGAASFALAAWGLWATRRGRAPGSLRLSQLALANIPMPFAAASFGWI FTEMGRQPWVVAPNLGALSSGSPLGSVALMTQMGVSPNVPAWQMLTTLILFTVLYGVLGF IWYLLMKRYALEGIRTAPARAAAASAESADDLAFGY >gi|319979147|gb|AEUH01000042.1| GENE 2 1789 - 2905 1650 372 aa, chain + ## HITS:1 COG:lin2865 KEGG:ns NR:ns ## COG: lin2865 COG1294 # Protein_GI_number: 16801925 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 2 # Organism: Listeria innocua # 1 355 1 336 337 226 39.0 4e-59 MTLSILWFILIAVLWTVYLVLEGFDFGVGMLLPLVAKDDRERTQTVRTIGPHWDGNEVWL LTAGGATFAAFPEWYATMFSGMYLALFLVLVCLIVRVVALEWRSKIAAEKWRRTWDWIHT ASAWLVPLLLGVAFANFVQGMMIEVVDQRSGMVVDPTAVASSLATATHQLTGGFFSLLTP YTVLGGAVLVSVCLAHGAQFLALKTEGEVRERANAIAAPATIGATALAVVWMMWGQFAFT TNVLAWLPLVVAALAVIVSTVLSQQTLRHERRSFAASCTAIAATVAWVFAAMAPAVQKSS INPAYSLTIDQASSTQATLTVMAVVAACLVPVVAAYVAWSYWVFRARVGPDDVSTRTGLP LGRIRLGGNFLV Prediction of potential genes in microbial genomes Time: Thu May 12 17:12:51 2011 Seq name: gi|319979145|gb|AEUH01000043.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00043, whole genome shotgun sequence Length of sequence - 1385 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 83 - 1385 1403 ## COG4988 ABC-type transport system involved in cytochrome bd biosynthesis, ATPase and permease components Predicted protein(s) >gi|319979145|gb|AEUH01000043.1| GENE 1 83 - 1385 1403 434 aa, chain + ## HITS:1 COG:MT1657 KEGG:ns NR:ns ## COG: MT1657 COG4988 # Protein_GI_number: 15841076 # Func_class: C Energy production and conversion; O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome bd biosynthesis, ATPase and permease components # Organism: Mycobacterium tuberculosis CDC1551 # 34 431 16 408 527 256 43.0 7e-68 MRPLDPRLLRYARSVRPHIALAVALGTLTALLVIAQALLVSAAVSPVIAGRSGPSGAVWP LVGVAAVAAARAGVVWLREATGHRAAARAIRQLRARVLDAALELGPRWRATRSSQTTTLL TRGLDDLVPYFVKFLPQLFLVVTVTPAALATMLVLDAWSALIAVVVVPLIPVFMVLIGRF TQSASRDKLESMKRLGAQLLDLMAGLPTLRGLGRQDSPREHLRRLGASNTEATMGTLRVA FLSGAVLEFLTTLSVALVAVEVGMRLVYGGIDLFRGLAVIMLAPEVFEPLRQVGAQFHAS ANGVAAAEAAFAVIEAQPPRPRGTRAAPPAGAHPIRFEGVSVEARGVWAPASLTATIEPG AVTALVGPSGAGKSTAVACLLAELVPARGRIVVDAPGGPVDLADADPASWWDSIAWVPQS PALVPGTLLENAGG Prediction of potential genes in microbial genomes Time: Thu May 12 17:12:55 2011 Seq name: gi|319979134|gb|AEUH01000044.1| Actinomyces sp. oral taxon 178 str. F0338 contig00044, whole genome shotgun sequence Length of sequence - 11274 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 3, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1003 1311 ## COG4987 ABC-type transport system involved in cytochrome bd biosynthesis, fused ATPase and permease components 2 1 Op 2 . + CDS 1000 - 2652 2241 ## COG4585 Signal transduction histidine kinase 3 2 Tu 1 . 
- CDS 2765 - 6346 4904 ## COG5263 FOG: Glucan-binding domain (YG repeat) 4 3 Op 1 . - CDS 6907 - 7554 984 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 5 3 Op 2 4/0.000 - CDS 7598 - 8317 1050 ## COG0571 dsRNA-specific ribonuclease 6 3 Op 3 4/0.000 - CDS 8317 - 8877 728 ## COG1399 Predicted metal-binding, possibly nucleic acid-binding protein 7 3 Op 4 3/0.000 - CDS 8870 - 9580 1049 ## COG3599 Cell division initiation protein 8 3 Op 5 14/0.000 - CDS 9599 - 10078 261 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 9 3 Op 6 1/0.000 - CDS 10075 - 10665 193 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 10 3 Op 7 . - CDS 10731 - 11273 696 ## COG1200 RecG-like helicase Predicted protein(s) >gi|319979134|gb|AEUH01000044.1| GENE 1 2 - 1003 1311 333 aa, chain + ## HITS:1 COG:STM0956 KEGG:ns NR:ns ## COG: STM0956 COG4987 # Protein_GI_number: 16764317 # Func_class: C Energy production and conversion; O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome bd biosynthesis, fused ATPase and permease components # Organism: Salmonella typhimurium LT2 # 4 319 251 561 573 144 32.0 2e-34 AVDVLAIGAAVVGAALIGTAQTASGHLPAVMLAVLVLTPLSSFEGVAELAPAAAQLVRSA GAAQRICEVLGPRAPAPTTRVPGPGPEGPVLEARDLAVGWPGGPVVAEGISLALRPGRTL AVVGPSGIGKTTLLMTLAGMLEPKAGTVRINGADAWSSPRSDVTRNVCATAEDAHVFATT VLENLRAANPALEPEGALRLLRAVGLGDWVAALPDGVDTRIGSGATTVSGGERRRLLMAR ALASPAPLMLVDEASEHLDPATADALVGTLFDQGAGRGTLLVTHRLAGAARADSVLVLGP GPGPAARVVDNAPYSEIIARRPDLAWAAGQEEA >gi|319979134|gb|AEUH01000044.1| GENE 2 1000 - 2652 2241 550 aa, chain + ## HITS:1 COG:MT3218_2 KEGG:ns NR:ns ## COG: MT3218_2 COG4585 # Protein_GI_number: 15842705 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mycobacterium tuberculosis CDC1551 # 324 548 4 199 199 88 32.0 4e-17 
MSPDRTDMRLLRAALDLTRSLDLRDALQSFVTQACMLTACPHGALAVLDTWGATTLKLEH HDHGPSPSVPEGLVRAIPPHSHLLVNSPEDFDDFPLPTDTAPFLGVSVLVDEQVYGRLYL TGKPGGFVASDAAVLTALAPAAGIAVENAHLYADARRTERWISASQSLTTTMLEGADEEE ALELIARTVRNVSRADTAIIMLPSVGDTWAAEITDGKNADNLLGLVFPPNGRAMSVLNEG TGMIVDSMARAQTMRLPQLAAFGPALYAPLRSRGVSSGVLILLRQIGAPEFDSSELSLAE SLASQATLALELASARHAEDVAALLDERDRIGRDLHDFAIQQLFATGMRLDAAKQAVASG AADPAAITNVLDQSLASIDEAVRQIRAIVHNLRERDQAVGLVERIRRESSLTRAALGFAP SLLITLDSVAINEDEDQELALIEDFDGRVDAHLGDDVVAVVREGLSNIARHAKATAAAVS VDVSGRACGGSVTVSIADDGQGIDPDRTRSSGLANMGERARRHRGSFSTGAGIGGRGTAI TWRAPLAQRQ >gi|319979134|gb|AEUH01000044.1| GENE 3 2765 - 6346 4904 1193 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 955 1190 469 704 744 196 43.0 2e-49 MKLTRTLVGASVAAALSLTGLGHAAFADPPSTPDPSAQSAATAAGTATGIPLAVPLSATQ KVADWQSLQFGLFMHWGVYSTYEGMYQGRPQRMGYPEQIKAWEKIPDSDYKAQAATMTAD KWDAANVCQTAKNTGMKYVMITTKHHDGFAMWDTKTTDYNVVKATAFGRDPMKELSEECG KIGIKLAFYYSIIDWSHQEPEPYANRNKIDDFMMTSVIKPQLTELLTDYGPIAELWFDMG DPTAEQSQQMAAWVHELQPATMVNSRVWNNAGDFEVGGDNQVGTSFKMEPWESILSIFPQ CWSYCTPSFRANRSEENLPVLINRTVDNLVNSVASNGQFDFNVGPKGDGSFDPFDQKVLD GVGQWMGRHPDAITGARPTWLPTADWGRTTTKGNDLFMFPASWEDGKTLTLTGVGSTVAS VGVDGTGAGLSYTQEGSTVKVTLSGASPDEVQPVVKVSFASAPQYAPEHTLSPAGGESAL DKTGLYLRRGPQGGAAAVDTYVSEKEGKSYEEVTLSFQGELDAETTYKVAYGDKSIEATG AQLLAGPVGEGWKLPAGAVVPVRLALARPSYYGDSMSLRGPLSVTLTASEKPLEGSAPVF TKHPQSVEARDGERVALTAVAASRPAATYQWYRRAPGASEASAVDGATGSVHSFTATMDD NGAAYYAVATNPTGSATSNEAVLTVSARPDNLALNKPATQSSTGWGGEASRAVDGNTDGV WDNGSLSHTGKEDNPWWQVDLGSAQPIGQVKVWNRSADDKCGADSCDKRLHDFWVFASKK PLEPGATPDSIASDPDARVVRVEGVGGYPSAVDFEGFEAQYIRVTQPGTNVEFALAEVEV SPVKAVAPTVDAITVASTPEGAVQSSGDGAFRTATAPKSTLVSLSAKTTGTPEPGLKWQF RSNDSEEWGDIEDETGDELTVLADEESVGQYRLCAENTAGSACSGIVQLVLAADPAPDPT PDPTPDPGPAPDPTPDPTPDPGPAPDPTPDPGPSPDPTPDPAPDPGVDWSKGHWRSDGTG 
WWWAMPDGSYPKDTALTIDGSVYRFGPDGYMRTGWVNEGGSWYFHAPSGAQARGWVHDRG SWYYMGDDGAMATGWVASGGAWYYLTASGAMKTGWLNDGGNWYYLTPGGAMATGWLDDGG TWYYLTASGTMVTGWVNDGGTWYYLTGSGAMATGWLQIGGRWHNFAPNGAWIG >gi|319979134|gb|AEUH01000044.1| GENE 4 6907 - 7554 984 215 aa, chain - ## HITS:1 COG:MT3219 KEGG:ns NR:ns ## COG: MT3219 COG2197 # Protein_GI_number: 15842706 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Mycobacterium tuberculosis CDC1551 # 10 204 3 203 217 151 41.0 9e-37 MSDISATRTRVMIVDDHEIVRRGIAEIVDRADGLEVVAEAGSVQDALRRAELVRPDVILV DLQLPDGTGIDIMDSLGRTHPEVLPIVLTSFDDDEALAESLDKGARAYVLKTVRGAEITD VIKAVADGRVLLDERTLTRRRTDHEDPTADLTPSERKVLDLIGDGLSNREIGEKLGVAEK TVKNHITSLLSKMGLQRRTQVAAWVAGQRAAAWRN >gi|319979134|gb|AEUH01000044.1| GENE 5 7598 - 8317 1050 239 aa, chain - ## HITS:1 COG:Cgl2022 KEGG:ns NR:ns ## COG: Cgl2022 COG0571 # Protein_GI_number: 19553272 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Corynebacterium glutamicum # 14 236 22 242 247 171 46.0 9e-43 MARAKRLPVPPRTDTSSLVEAWGAPIDGALLSLALVHRSYANEAGGIANNERLEFLGDSV LSVVIAERLFHDYPDVAESDLSRMRAATVSQAPLAAAARRIGLGDYVYLGKGESLHGGRD KDSILSDTFEALIGATYLTHGLQAARRVVLERLSFLLEDAQERGHHQDWKTLLVEYSQEH GLGEVSYEVEGEGPDHRRVFTARAFIAGIDGPVGSGRATSKKHAENAAAQNAMGALDPQ >gi|319979134|gb|AEUH01000044.1| GENE 6 8317 - 8877 728 186 aa, chain - ## HITS:1 COG:Rv2926c KEGG:ns NR:ns ## COG: Rv2926c COG1399 # Protein_GI_number: 15610063 # Func_class: R General function prediction only # Function: Predicted metal-binding, possibly nucleic acid-binding protein # Organism: Mycobacterium tuberculosis H37Rv # 4 182 27 199 207 95 35.0 4e-20 MTDSPLVLSLADLPRAAGSVRDRSVEWAAPADLGTPSMGVEEGTPIPVGVELTSIDDGVL VRLTTAVDLVGECVRCLDPVRVHHDVSSAEVYVEPGSAAAAEADGGPGADALNEIGPRDT IDIEPQLRDAIVTLVDARPLCRPDCPGLCDVCGRKWDELPPDHSHFQVDPRLAPLAALLG DEDGQR >gi|319979134|gb|AEUH01000044.1| GENE 7 8870 - 9580 1049 236 aa, chain - ## HITS:1 COG:Cgl2024 
KEGG:ns NR:ns ## COG: Cgl2024 COG3599 # Protein_GI_number: 19553274 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division initiation protein # Organism: Corynebacterium glutamicum # 25 236 4 235 263 73 33.0 3e-13 MSQDHDANANEQGWDGQSVYGPSTVIAALDQIEDLVESARSIPLSASIMVNKAEILDLLD QAREALPEDLVAADAVVADADAVLVRADSAAEQAIAEANTRASSTLEAANTKADQIVSAA REEAERTTSRADAEAEATLAQARADAEAALADAQAQADRLVSTENIVRMAEDRAREIVAD ARREEANLREGADDYVAQSLGELAGLISDLQRRTDAGRRTIAERRGVDVTDVELHD >gi|319979134|gb|AEUH01000044.1| GENE 8 9599 - 10078 261 159 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 5 150 7 153 164 105 38 2e-22 MIRALFPGSFDPFTIGHLDIAERAAAQVGELVVGVGANPAKNPMFTVEQRVAMASAALAH LGNVRVAALQGATMDAARGLGAALIVKGVRGADDAAHEAVQAAFNLEAGGIDTWWIPTRP ALSHVSSSAVRELFRLGKDAHRYVPPAISRFMTDNRGGS >gi|319979134|gb|AEUH01000044.1| GENE 9 10075 - 10665 193 196 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 177 12 187 199 79 29 1e-14 MTRIVAGSAKGRVLAVPAKGTRPTSERVREALFSRLDHWGALEGAHVLDLYAGTGALALE ALSRGAASADLVEKSAAAARLAARNASACSLPARVHTADARAYLGARSGPELGGEAGLVL IDPPYSAPESEVAAVLALLGPWIAPDCVVVVERPKRAPAPALPSFLVLEDTRAWGETAAH FAAPPAPGGPDEGGSR >gi|319979134|gb|AEUH01000044.1| GENE 10 10731 - 11273 696 180 aa, chain - ## HITS:1 COG:aq_2053 KEGG:ns NR:ns ## COG: aq_2053 COG1200 # Protein_GI_number: 15607024 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Aquifex aeolicus # 3 156 612 768 792 139 48.0 2e-33 GELTGRTPGARKAAVMADFASGVTPVLVATTVVEVGVDVPEATLMVIIDAQQFGLSTLHQ LRGRVGRSSRESLCVAVHRHGPTEAGERRLRAFASTTDGFELAEADLRLRKEGDVLGAGQ SGTATHLEHLSVRRDGAIIRDARAAAERLIDQDPSLAAHPQVLAALREQAPRDLTWIHRS Prediction of potential genes in microbial genomes Time: Thu May 12 17:12:56 2011 Seq name: gi|319979132|gb|AEUH01000045.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00045, whole genome shotgun sequence Length of sequence - 754 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 753 888 ## COG1200 RecG-like helicase Predicted protein(s) >gi|319979132|gb|AEUH01000045.1| GENE 1 3 - 753 888 250 aa, chain - ## HITS:1 COG:Cgl1294 KEGG:ns NR:ns ## COG: Cgl1294 COG1200 # Protein_GI_number: 19552544 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Corynebacterium glutamicum # 9 249 277 517 707 227 52.0 2e-59 GVRDGLEGALPFALTGSQRAAIGAIDADLARPSPMQRLLQGDVGSGKTVVALAALARVVD NGRQGALVAPTEVLAEQHFASITALLGPLGGAVGVRLLTGSTPEAARRAVLEEMASPGPL IVVGTHALLQDSVGFADLALAVIDEQHRFGVAQRAALRAARADGRLVHELVMTATPIPRT IAMTVFGDLDETRMEGLPAGRAPVATHLVDASNGPWIERLWRRAREEVDSGGRVYVVCPR IDEDDAVADA Prediction of potential genes in microbial genomes Time: Thu May 12 17:12:57 2011 Seq name: gi|319979129|gb|AEUH01000046.1| Actinomyces sp. oral taxon 178 str. F0338 contig00046, whole genome shotgun sequence Length of sequence - 726 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 237 265 ## gi|154509013|ref|ZP_02044655.1| hypothetical protein ACTODO_01530 + Prom 334 - 393 2.0 2 2 Tu 1 . 
+ CDS 474 - 665 288 ## PROTEIN SUPPORTED gi|227495859|ref|ZP_03926170.1| 50S ribosomal protein L28 Predicted protein(s) >gi|319979129|gb|AEUH01000046.1| GENE 1 3 - 237 265 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154509013|ref|ZP_02044655.1| ## NR: gi|154509013|ref|ZP_02044655.1| hypothetical protein ACTODO_01530 [Actinomyces odontolyticus ATCC 17982] # 1 78 1 78 537 66 58.0 5e-10 MEPQASVLAAAMASAAEEARRLAPFLNDLDGWDGSDCDTGSNAAATLAAMARALAALDPG APLSTAVEACARTAVRQG >gi|319979129|gb|AEUH01000046.1| GENE 2 474 - 665 288 63 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227495859|ref|ZP_03926170.1| 50S ribosomal protein L28 [Actinomyces urogenitalis DSM 15434] # 1 63 1 63 63 115 84 9e-27 MAAVCDVCGKGPGFGKSVSHSHVRTNRRWNPNIQRVRALVDGTPKRLNVCTKCLKSDRVV RAI Prediction of potential genes in microbial genomes Time: Thu May 12 17:13:03 2011 Seq name: gi|319979128|gb|AEUH01000047.1| Actinomyces sp. oral taxon 178 str. F0338 contig00047, whole genome shotgun sequence Length of sequence - 1404 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 9 - 1403 1579 ## gi|293192358|ref|ZP_06609469.1| hypothetical protein HMPREF0970_01814 Predicted protein(s) >gi|319979128|gb|AEUH01000047.1| GENE 1 9 - 1403 1579 464 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192358|ref|ZP_06609469.1| ## NR: gi|293192358|ref|ZP_06609469.1| hypothetical protein HMPREF0970_01814 [Actinomyces odontolyticus F0309] # 2 464 75 536 537 438 59.0 1e-121 AVRQGVGHVGVLLGGVFASWGAVLADEARPTPLVVRRMLCARPAPGCRIDASPAVEEVLA GARAELAALGDTLPDVADIISLCSSQAQIGLVEATSEQTGRIDAGAAVLALLLACLDSAT RADPAILESYAVMLADLAALGGGRAPSPAPPLAGRRFTVDIVMQGEDKDRAALASRLGAL GARLSTVGHTDLFGLGEWRFHVDTSAPLAALPRTGRTLRFQVRDARPDEMIGIDELADEG ITHRGVRLLERRPVRRVERATVIACTRAPGLVEELARAGAVVFLDPAPEDAPGLAVAAAS SSTGVALIAPCDEASAALAARAASLLPAGPAQAPSVLSAPTSDDLGVLAVARACAPLFVP QPGGAAAGPAMASILRDQARGALARATTCALPPAWEHSDLDAALAELERIGPTRVRLLLG SGDDGPYLVASLRQLLSAKDAARAVDLETWDGGQTGPTLLQGVA Prediction of potential genes in microbial genomes Time: Thu May 12 17:13:25 2011 Seq name: gi|319979122|gb|AEUH01000048.1| Actinomyces sp. oral taxon 178 str. F0338 contig00048, whole genome shotgun sequence Length of sequence - 4547 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 168 - 1046 1145 ## COG1131 ABC-type multidrug transport system, ATPase component 2 1 Op 2 . + CDS 1043 - 1876 780 ## Kfla_1934 ABC-2 type transporter 3 1 Op 3 . + CDS 1879 - 2103 243 ## COG1476 Predicted transcriptional regulators 4 2 Tu 1 . - CDS 2278 - 3021 718 ## Kfla_5466 transcriptional regulator, TetR family 5 3 Tu 1 . 
+ CDS 3139 - 4546 1661 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|319979122|gb|AEUH01000048.1| GENE 1 168 - 1046 1145 292 aa, chain + ## HITS:1 COG:PH0913 KEGG:ns NR:ns ## COG: PH0913 COG1131 # Protein_GI_number: 14590767 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Pyrococcus horikoshii # 14 291 7 308 324 139 32.0 7e-33 MRFYSGGFAMAPVITTESLEFSYGEAPVLRGLDIRIGAGEVVCLLGPNGVGKTTLVENLL GSLAPTNGRVRVFGTDPRQAGAGFWAKVGLVQQSWTDHAKWRVKDQLEWIRSVQLTAAER VSTVAEVLDAVGLSDKADSRLSRLSGGQRRTIDFAAALLASPELLILDEPTTGLDPVSKA RLHDLILARVDEDATIVMTTHDLSEAERLASRVLIMNEGRIIADGTVTALRELLDRGSEI TWIEDGARHVHTTHSPERFVRGLDLDAITGLTITRPTLEETYLSLVNKEPRP >gi|319979122|gb|AEUH01000048.1| GENE 2 1043 - 1876 780 277 aa, chain + ## HITS:1 COG:no KEGG:Kfla_1934 NR:ns ## KEGG: Kfla_1934 # Name: not_defined # Def: ABC-2 type transporter # Organism: K.flavida # Pathway: ABC transporters [PATH:kfl02010] # 50 265 48 266 277 85 36.0 2e-15 MSIRTIQAVLRAASRQASADLRASLAGTCAALLISVSAIVLIGRSTEAAAGAHVAFGPMF LASSIGTVSCFITLQIAGEAYTDRIGGALLRVRILPHGPLAWTIGKTMSAITQTIAFQAA VLLGGAVFADLPLSAPQVLTCLPLIMLSAVATAPLGFFAGALARGVYSFMLSFLPVLALI GTSGFFMPMDRLPSWVQAVQVALPTYWSGHLTRWALVGDPSWEVGASFCPVLALAVLAAW TVLGFAGASFVVRRSFRKETIGSLARVQSAIRSQTGL >gi|319979122|gb|AEUH01000048.1| GENE 3 1879 - 2103 243 74 aa, chain + ## HITS:1 COG:DR2259 KEGG:ns NR:ns ## COG: DR2259 COG1476 # Protein_GI_number: 15807250 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Deinococcus radiodurans # 6 73 11 71 82 67 51.0 5e-12 MSDDVVHNRIAVLRADRRVSRRELAEALGVHYQTVGYLERGEYAPSLHLALRIARYFEVP VESVFSLEEFPRLR >gi|319979122|gb|AEUH01000048.1| GENE 4 2278 - 3021 718 247 aa, chain - ## HITS:1 COG:no KEGG:Kfla_5466 NR:ns ## KEGG: Kfla_5466 # Name: not_defined # Def: transcriptional regulator, TetR family # Organism: K.flavida # Pathway: not_defined # 28 236 24 231 238 128 38.0 2e-28 
MARTDRGAPPKWVQWLWEVDVPGAEAGRGTGKGLDIGKIVRAGVALADDEGLEGVSVRKL AQRLGVSTMAAYRHIGSRDEVVAAMVDVAFGPPPALPGPPEGWEDGLRCWALAVHARYVA HPWLLDAPVDGMPTGPHRLRWMEAVLQVLAVAGLGLQERLNAALLVDGHVRTVAALKRSL ASAAHGVGRGPSRKWLLARLEAEGMASMARVLEAGALDDGQGYELDYGLDRIIAGIGAGS RPGASRS >gi|319979122|gb|AEUH01000048.1| GENE 5 3139 - 4546 1661 469 aa, chain + ## HITS:1 COG:MT1289 KEGG:ns NR:ns ## COG: MT1289 COG0477 # Protein_GI_number: 15840696 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Mycobacterium tuberculosis CDC1551 # 14 426 15 417 579 191 34.0 2e-48 MKNEERSKDRGATRSRTAATAVLMVATFMDLMDSTITNVALPTIGKDLGATPEQLEWTLA GYIIAFATLLITGGRLGDVFGHRRIFVIGMVGFTAASLGAALSQSGELLVAARVLQGGFA GIMMPQVLSSVQVMYAPEERAPVLGLIGALSSLGAVGGLVLGGWLVTIDLFGMGWRSVFL VNVPVGVVIVAAALALVPRSRSEHPLKPDPAGALLGGLGVFLVVFPLTDGRAAGWAWWIW AMLAAAPFAVGAFAWQQRRTLKAHGSPLLPLPLFRDRGFASGQLVQVLSSIGNGGYVLVL LYYVQSALGFSALAAGLTLLPFALGSMVATPLAILATKRMGKWAVLAGGMVQAGALTWVL WTISANGPSLTGWDLTAPLALTGAGMMTLIMPLTTIALESVPTQDAGAASGTLTTFSQVG MVLGVALAGTVFFGVLQDAAEARDAITTALWVPIAAYALAGLAASAAMP Prediction of potential genes in microbial genomes Time: Thu May 12 17:13:34 2011 Seq name: gi|319979118|gb|AEUH01000049.1| Actinomyces sp. oral taxon 178 str. F0338 contig00049, whole genome shotgun sequence Length of sequence - 2393 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 125 - 781 719 ## SCAB_22791 putative TetR family transcriptional regulator - Prom 837 - 896 2.5 - Term 1220 - 1263 13.0 2 2 Tu 1 . 
- CDS 1290 - 2393 1266 ## KRH_00490 hypothetical protein Predicted protein(s) >gi|319979118|gb|AEUH01000049.1| GENE 1 125 - 781 719 218 aa, chain - ## HITS:1 COG:no KEGG:SCAB_22791 NR:ns ## KEGG: SCAB_22791 # Name: not_defined # Def: putative TetR family transcriptional regulator # Organism: S.scabiei # Pathway: not_defined # 10 187 4 178 197 102 36.0 1e-20 MTGRAKSAGARRPGRPKAGSEDKRARILSEAMALFATRGYAGTSLADIAGASDISKAGLL HHFSSKDHLLAEVLAERDRHSMRHLPTQRESVADTLDTWLAILGENAEDPEGVALYTAMS GAVIDRQHRAHAWFSDHLDFAISFLQDALEAGKATGEVRSEAPSGLIARHLVALSDGMQL QWLCDRADGVEDADMTAVVSYAVGEVKRRWLVGGPGGA >gi|319979118|gb|AEUH01000049.1| GENE 2 1290 - 2393 1266 367 aa, chain - ## HITS:1 COG:no KEGG:KRH_00490 NR:ns ## KEGG: KRH_00490 # Name: not_defined # Def: hypothetical protein # Organism: K.rhizophila # Pathway: not_defined # 32 264 616 863 960 75 25.0 5e-12 VPQSGGGAQGQPSAPPRPGGMSPLVPAQADKPADPAARKKLRRVIVALCALVVVVIAGVV ALAVLNGQRSPEARTRQYLQLLADGKAEAATAMVDPGIANEERVFMTDAVMQAASARIEI TEVKESQESSKNKGDVRYVTATLSLDGQRFTHDFALTQGKKEFGVLDNWQITEAFMLKVE VKAKGIPAVSIGDATKTLAKDDNSITVYPGVYTVSAANTGEYVTAPPVTVSAADDFSSRT ITFEATYSDALKAAALDAAVAKTKSCGSVENAGNLDEDCPNSVRSKTLTVLNVKEVPTQI VGDKYRQGSFAAEYAVITKQEGSGLFKESAPRDQKYTVRLEVRVDENNTIRMGADGKPQF TFSWSEY Prediction of potential genes in microbial genomes Time: Thu May 12 17:13:47 2011 Seq name: gi|319979107|gb|AEUH01000050.1| Actinomyces sp. oral taxon 178 str. F0338 contig00050, whole genome shotgun sequence Length of sequence - 8317 bp Number of predicted genes - 11, with homology - 8 Number of transcription units - 3, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 129 - 2126 1928 ## 2 1 Op 2 . - CDS 2188 - 3201 1040 ## COG0611 Thiamine monophosphate kinase 3 1 Op 3 4/0.000 - CDS 3189 - 4052 189 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 4 1 Op 4 . - CDS 4085 - 4465 565 ## PROTEIN SUPPORTED gi|227497643|ref|ZP_03927861.1| 50S ribosomal protein L20 5 1 Op 5 . 
- CDS 4491 - 4685 268 ## PROTEIN SUPPORTED gi|227383679|ref|ZP_03867101.1| LSU ribosomal protein L35P 6 1 Op 6 . - CDS 4725 - 5393 826 ## COG0290 Translation initiation factor 3 (IF-3) 7 2 Tu 1 . + CDS 5362 - 5571 96 ## - Term 5594 - 5625 -1.0 8 3 Op 1 . - CDS 5643 - 6101 551 ## 9 3 Op 2 . - CDS 6107 - 6337 281 ## COG1476 Predicted transcriptional regulators 10 3 Op 3 . - CDS 6351 - 7556 1411 ## Cfla_0874 major facilitator superfamily MFS_1 11 3 Op 4 . - CDS 7602 - 8312 684 ## gi|293192344|ref|ZP_06609455.1| conserved hypothetical protein Predicted protein(s) >gi|319979107|gb|AEUH01000050.1| GENE 1 129 - 2126 1928 665 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDPQNPPQAPYPGGQGGQAGGYGPAPQQPYAQAGAAQSAQGQAYPGQGQPYPGQQQAQG PYGAAPQQPQRAQTPVGHQMQTLFSPAGLTTLCLELAVVFGVGLAASLIGYISLASIVSG GSFMLIPVGIGAFTGAGWSVSTTFLGASYGMMAIAVPWTVIVVEAVALRFVLNKRVWEDG TIRESVPTGVRSLVEGLLVALVMTLLTAFFSAGDGSGGLRATSFFTFLVVLLVVGLSSFT ARQKAAQVSILPKALAPMLVEVRAAARVFLPVFGALTLVVILINAISSQRFAMVPLLLLL LPNLVIAVIGLAFFGGFALSGSVGGFGAGGAAMAWDIGNGWGSLIILVGVLTIVAASLSV GVQRQRTSAPVWSRTWQLPVAVLVIMLVGLLLFNINATGTAGGATGMAGWSPLVIALAAG VVSVLAEVTPKQFGTSSPGLLSLCAGRQAAAAWLASPEAYVAPAANQYAQPGQQQYGQPG QYPQGGQPQQYAQPGQQQYGQPGQYPQQGQYPQGGQPQQYAQPGQYAQQDQAPQGGQQYG QPGQYPQGGQQQYPQQGQYPAGGQPAQYPQGGQPQQYAQPGQYPQGGGASRVESQLEGSS ADVAPTADIAQVGMPPESQVPPTAPTPEVPQQQYYQASQGYGADPAQAQMPPAPGSGENT DGQQR >gi|319979107|gb|AEUH01000050.1| GENE 2 2188 - 3201 1040 337 aa, chain - ## HITS:1 COG:Cgl1291 KEGG:ns NR:ns ## COG: Cgl1291 COG0611 # Protein_GI_number: 19552541 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate kinase # Organism: Corynebacterium glutamicum # 31 301 39 308 329 156 40.0 5e-38 MARLRDMDESQLIRAFRALLPVGGRTLVGIGDDCAQIAAPEGSFLVTTDVIVEDHHFHSW WSTPYQIGARAAAQNLSDIAGIGGRASALVVSVSVSRDADADWLLDVVRGLGDRAREAGA GVVGGDLSGGDRLVLSVTAFGYCQGRPVLRDGARPGDTVGVAGTLGWSYAGLDLLRGGVV RPGREPEALAPFVRTYCAPRPPLEAGPAAASAGASAMMDLSDGLAMDGGRMARASGVVIE LDRRALGREAEALVGAAAACGKDPLDWVVGGGEDQGILAAFPPGTALPDGFRAVGAARAP 
RGGEEPHIQLDGAEVFGPWDHFAGSGADESDQIPGIG >gi|319979107|gb|AEUH01000050.1| GENE 3 3189 - 4052 189 287 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 111 262 100 246 255 77 32 3e-14 MLDNPRSDRVRRVAGLSGRSARSRASAVLVEGPQAVRELLVHRPASVRDVYISAAARLAH PGIAGAARSATRWVHTVTDEVAAALSADCQGVCAVASADAIVREWPADGGFYVIIAQGRD PGNVGTIIRTADAMGARAVLTVAGTADVTSPKVVRASAGSVFHLPVIPHPAFGAAVDALH ARGARVLGTSGGPRAVPLPAVEEEGALAGPHAWALGNEARGLDADEVAACDLLVAIPMTG LAESLNVASAAAVCLYASQRARGTGRPDGAVDGRGGARAGTVVRWHD >gi|319979107|gb|AEUH01000050.1| GENE 4 4085 - 4465 565 126 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227497643|ref|ZP_03927861.1| 50S ribosomal protein L20 [Actinomyces urogenitalis DSM 15434] # 1 126 1 126 126 222 88 8e-58 MARVKRSVNAKKKRRTVLEQASGYRGQRSRMYRKAKEQVTHSFVYAYRDRKARKGDFRRL WIQRINAASRAEGLTYNRFIQGLNLAGVEVDRRMLAELAVNEPAAFTALVKVAKEALPAD VNAPRA >gi|319979107|gb|AEUH01000050.1| GENE 5 4491 - 4685 268 64 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227383679|ref|ZP_03867101.1| LSU ribosomal protein L35P [Jonesia denitrificans DSM 20603] # 1 64 1 64 64 107 79 2e-23 MPKNKTHSGAKKRFRTTGSGKIMREQAGARHLLEHKSSRKTRRLATDQVLETADVKRVKR LLGK >gi|319979107|gb|AEUH01000050.1| GENE 6 4725 - 5393 826 222 aa, chain - ## HITS:1 COG:Rv1641 KEGG:ns NR:ns ## COG: Rv1641 COG0290 # Protein_GI_number: 15608779 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 3 (IF-3) # Organism: Mycobacterium tuberculosis H37Rv # 12 178 3 169 201 211 71.0 6e-55 MRNNPWKEPPISETRINERIRVPEVRLVGPGGEQVGVVRVEDALRLAEEAGLDLVEVAPN AKPPVAKLMDYGKYKYEAAQKARDARRNQANTQLKEIRFRLKIDDHDFAVKKGHVERFLS AGDKVKVMIMFRGREQSRPEAGVRLLRRLADEIGDLATVESMPRQDGRNMTMVLAPTKRK AEALNEQRKRREAERAERRGKRAERASKDQARAASFRAEDKD >gi|319979107|gb|AEUH01000050.1| GENE 7 5362 - 5571 96 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGGSFHGLFRITKNLRYAARTEVPTTDGAPGPAVEPDPLTGPVGRVPAVPSSGAPGGDPG GIPPLAFRA >gi|319979107|gb|AEUH01000050.1| GENE 8 5643 - 6101 
551 152 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSIMAVGDNGYDERQVLLRGKISNACLAGVLATVLVNGAVNDAFGPWGPALLQASVVASV WMTVFVVWAMGTGAYFGSGGPASRARFGYLLAAVGAVVAAVQGWRLWGSPEFLERAPTMP VCAFLVVVGVVTFIAERREARGQADARGAEDV >gi|319979107|gb|AEUH01000050.1| GENE 9 6107 - 6337 281 76 aa, chain - ## HITS:1 COG:SPy1934 KEGG:ns NR:ns ## COG: SPy1934 COG1476 # Protein_GI_number: 15675737 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 15 76 2 63 68 94 72.0 5e-20 MSYIIDILEGGAMAKNLRMKAARAGRGLSQASLAEQVGVTRQTISAVESGDYNPTIALCV RICRALGVTLDDLFWE >gi|319979107|gb|AEUH01000050.1| GENE 10 6351 - 7556 1411 401 aa, chain - ## HITS:1 COG:no KEGG:Cfla_0874 NR:ns ## KEGG: Cfla_0874 # Name: not_defined # Def: major facilitator superfamily MFS_1 # Organism: C.flavigena # Pathway: not_defined # 1 396 4 401 423 309 62.0 1e-82 MDWTTTRRSSHVGYVVQAIVNNLAPLLFVVFSSRYGIGLAQLGALASLNFGVQLATDMVA AAVVDRVGYRIPMVAAHVLSAAGLVALGVLPLLAAHTFAALCAAVVVYAVGGGLLEVVVS PVVEHIPDDSRPKAAGMALLHSFYCWGQLGVVLVTTGALAVVGEQWWPALPIAWACVPVA NGIVLARAPMPETVPGAHRTPLRELGRSGAFLLALALMATGGAAELTMSQWSSFFAQQGA GVAKQVGDVFGPGLFALLMGAARAWHATAGARFDLRRLLMASGTGACACYLLAALAPWAW LSLLGCAGTGLFVALMWPGAFSLTAARFPAGGAAMFALLALGGDLGAAVGPSAASLVASA SSALGAPAALLSLPGGGLRTGLLVCALVPALFAMGVARWED >gi|319979107|gb|AEUH01000050.1| GENE 11 7602 - 8312 684 236 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192344|ref|ZP_06609455.1| ## NR: gi|293192344|ref|ZP_06609455.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 1 231 22 254 255 206 56.0 7e-52 MPARDRSDAGQAMERTSAALALGEGADNGAARLEALAAALADERVVVPVAVEPGPGTDGH HAEGRNSALTGPVAFERAPTPAGPAIAVYSSAAALSAHRPGARPMGMDFRTVALAALVET GGRAVMDPGGASVLLPRPAVAALAQGDQWLPPWRDGALRALLLERARRACPRVADVRLSY AGEGATRVTAVIDACGGGPDLRGAVGEALAEIGRTPRLVAAADRVELVPVIAEPHQ Prediction of potential genes in microbial genomes Time: Thu May 12 17:14:55 2011 Seq name: gi|319979094|gb|AEUH01000051.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00051, whole genome shotgun sequence Length of sequence - 11538 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 5, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 112 - 861 870 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase 2 1 Op 2 . - CDS 840 - 1487 737 ## HMPREF0573_11850 hypothetical protein - Prom 1508 - 1567 1.6 3 2 Tu 1 . + CDS 1580 - 1813 193 ## Cfla_1607 RNA-binding S4 domain protein 4 3 Op 1 2/0.000 - CDS 1826 - 5365 5698 ## COG0587 DNA polymerase III, alpha subunit 5 3 Op 2 15/0.000 - CDS 5437 - 6351 265 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 6 3 Op 3 . - CDS 6344 - 6943 681 ## COG0597 Lipoprotein signal peptidase 7 3 Op 4 . - CDS 7008 - 7577 960 ## COG3599 Cell division initiation protein - Prom 7669 - 7728 4.2 8 4 Tu 1 . + CDS 7543 - 7920 89 ## 9 5 Op 1 . - CDS 7761 - 8075 361 ## HMPREF0573_11863 transmembrane protein 10 5 Op 2 . - CDS 8084 - 8542 656 ## COG1799 Uncharacterized protein conserved in bacteria 11 5 Op 3 5/0.000 - CDS 8627 - 9394 769 ## COG1496 Uncharacterized conserved protein 12 5 Op 4 6/0.000 - CDS 9394 - 10677 1617 ## COG0206 Cell division GTPase - Term 10772 - 10802 2.6 13 5 Op 5 . 
- CDS 10803 - 11537 180 ## PROTEIN SUPPORTED gi|88856514|ref|ZP_01131171.1| 50S ribosomal protein L6 Predicted protein(s) >gi|319979094|gb|AEUH01000051.1| GENE 1 112 - 861 870 249 aa, chain - ## HITS:1 COG:Cgl2043 KEGG:ns NR:ns ## COG: Cgl2043 COG0106 # Protein_GI_number: 19553293 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Corynebacterium glutamicum # 12 246 3 244 246 176 45.0 4e-44 MVQEEEVSARGFTVLPAVDVTGGAAVRLSRGVVDSGASWGPPAQVVRAFAGAGAAWVHLV DLDRAYGRGDNDSVIRETIRGAGVEVELSGGVRDAQSLRNALEMGPARVNIATQALADMD FVCGAVRDLGPRAAVCLDVEGERLRARGGGRGGGLWEAIDALNRAGARLLVVTDVHRDGM MAGSNIELLQRVSGATDAGILASGGVDSLDDLRALAAVDGVEGAIVGKALYEGAFTLEEA LEAVREARQ >gi|319979094|gb|AEUH01000051.1| GENE 2 840 - 1487 737 215 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_11850 NR:ns ## KEGG: HMPREF0573_11850 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 9 196 24 210 261 72 32.0 1e-11 MGVHDDDIDAEFASLVARMGDPAEAGGPPGPARGPDDTPVDESLHLDGGRLSVALVLAPI AHPEVLHSLLALSDVRESVVRLKPWTAVWLRVETTPTDEEELDVLLTGTRPMPDAVDRVA RAVSGLSKYGAVALMSWLVEGDGVEPGVSGRISAQRYVAGEPEETIPAGLLMGSMPQATE DLLLGRTTPADYPDSVSAEGARRGGGPFGWFKRKK >gi|319979094|gb|AEUH01000051.1| GENE 3 1580 - 1813 193 77 aa, chain + ## HITS:1 COG:no KEGG:Cfla_1607 NR:ns ## KEGG: Cfla_1607 # Name: not_defined # Def: RNA-binding S4 domain protein # Organism: C.flavigena # Pathway: not_defined # 13 73 14 74 76 67 63.0 2e-10 MTDTPEIPVRGQIRLGQFLKLAALAEDGAQARGLVQGGDVRVNGATEVRRGRRLSPGDLV EVDAPWGVCAARVGSAD >gi|319979094|gb|AEUH01000051.1| GENE 4 1826 - 5365 5698 1179 aa, chain - ## HITS:1 COG:MT1598 KEGG:ns NR:ns ## COG: MT1598 COG0587 # Protein_GI_number: 15841014 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Mycobacterium tuberculosis CDC1551 # 2 1174 6 1180 1184 1320 56.0 0 MASDPFVHLHNHSEYSMLDGAAKTQAMAQEAARLGQPAIGLTDHGYLFGAYDFYTNCVSA 
GVKPIIGLEAYVTPGTSRFDRHKVTWGEPWQRSDDVSAGGSYNHLTLIAYSTQGMHNLFR LGSYASTDGQFGKWPRADKELLAKFHEGLIVFTGCPSGAVQTRLRLGQWDEAVAEAAELR DIFGPENFYVEVMDHGIDIERRVHSQVLEIARLMNAPLVATNDSHYVKKEDRKIQDALLC INSGARINDPDRFKFDGDGYFIRSSQEMRELWRELPGACDATLEIAERCDVRFTTTKEGA NYMPDFPVPEGEDKTSWFVKEVERGLNDRFPGGIPDDVREQAQYEEDVIISMGFPGYFLT VADYINWAKSQGIRVGPGRGSGAGSMVAYAMKITELNPLEHGLIFERFLNPERISMPDFD VDFDERRRDEVIAYVKRKYGEDRISQVVTYGTIKTKQALKDSARILGKDFKMGEQLTKAL PPAIMGKDISVAGIFDQGDKRYGEAAEFRKFYEDNPDTHEVVQYALGLEGLTRQWGVHAC AVIMSSHTLTDIIPIMKRPQDGAIITQFDYPTCETLGLLKMDFLGLRNLTVVSDALENIA INGKEDPDLDHIRFDDAKTYELLGRGDTLGVFQLDGGGMRDLLKLMKPDNFEDISAVGAL YRPGPMGANSHTNYALRKNGRQEITPIHPELAEPLEEILGTTFGLIVYQEQVMQIAQKLA GYTLGQADILRKAMGKKKKEVLDQQFKGFRQGMVDNGYSEESIKALWDVVVPFSAYAFNK AHSAAYGLVSYWTAFLKANYPTEYMAALLTSTKDNKDRRALYLAECRHMDITVLPPDVNA SMGNFAPDGEAIRFGLSAVQNVGGPVVEAIVGAREEKGRFASFQDFLDKVPQVVCNKRTI QSLIRAGAFDSMGHTRRALLARCDEAVDAVIDVKRNEAIGQFDLFGGLDDGAGGSSLAID IPDLPEFDRKQKLAAEREMLGLYVSDHPLRGLEGALARHQDYEIAQVVGSEGAMADRVVK IAGLVSGVTTKVTKQGNAWAIATIEDLSGSVEVLFFPRAYESISTYLAQDAIVGIEGKVN VRDEGMAVYGQSMTLLDLSSESDAPLDLRLPASRCTAKLLTDLKGVLESYPGASPVRLHV REPGRETVVELDPRYRVAQTNAFFSHIKAVVGAGGVIGR >gi|319979094|gb|AEUH01000051.1| GENE 5 5437 - 6351 265 304 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 42 289 41 279 285 106 32 7e-23 MGEGRLLPVPDALVGERVDAALSRMLGMSRSKCADLAAQGAVTLNGRTAGKSDRLGAADI IGIDMPDPAVKAVPVTGMDILYDDEDIVVVDKPVGVAAHTGPGWEGPTVLGNLAAAGYRI TTHGPPERQGIVHRLDVGTSGAMMVAKSELAYSVMKRAFKERTVTKVYHAVVAGHLDPAS GTIDAPIGRHPSREWRMAVIDGGKRAITHYDTLEMMAGACLAEIHLETGRTHQIRVHMAA VGHPCVGDAFYGADPGQAERLGLVRQWLHAVELSFAHPRTGAPTTVRSPYPSDLAAALEV LRHP >gi|319979094|gb|AEUH01000051.1| GENE 6 6344 - 6943 681 199 aa, chain - ## HITS:1 COG:Cgl2088 KEGG:ns NR:ns ## COG: Cgl2088 COG0597 # Protein_GI_number: 19553338 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal 
peptidase # Organism: Corynebacterium glutamicum # 10 167 30 191 193 92 36.0 5e-19 MPSRAPNRNAVCLAVLLLAVAADQITKWWARSALADGRSIALVGRFLRLELVRNPGAAFS VGSGRTWMFTVLAVVILGAMAWLYRRSADTATRASIALLAGGAVGNLIDRLFQPPSFGQG HVIDFIGYGDWFVGNVADIWIVVAAAALVLSLAREGGGGAAAHAGKDSGGGAAEQADGST ERGGAPEAPPAATAGGADG >gi|319979094|gb|AEUH01000051.1| GENE 7 7008 - 7577 960 189 aa, chain - ## HITS:1 COG:MT2204 KEGG:ns NR:ns ## COG: MT2204 COG3599 # Protein_GI_number: 15841637 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division initiation protein # Organism: Mycobacterium tuberculosis CDC1551 # 4 187 3 238 260 67 29.0 1e-11 MALLTTEDVLNKKFQYVKFREGYDQVEVDEFLDEVVSTIYALQMENQDLKEKLEAAERRV AELSDGEFKAPAQSAAPAVPEPAAVLAAGAPDAESATSMLALAQRVHDEYVRDGEEQSAK IIADANAKRDEIIADAQKQHETILTQLDQERELLENKINGLRTFESEYRSNLRGHLESLL AEVGSDGDN >gi|319979094|gb|AEUH01000051.1| GENE 8 7543 - 7920 89 125 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLRTSSVVSRAISSPFGRMGWWHRGSYCSAPGRPNGGRPLTHHVNNIRNRHTSNTHHPRG AVFQSGRRCGRARHSATTAKAAANRWMTMITAHRTTNPMSTLIPEPNRRGGMKRRRKRSG GSVSV >gi|319979094|gb|AEUH01000051.1| GENE 9 7761 - 8075 361 104 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_11863 NR:ns ## KEGG: HMPREF0573_11863 # Name: not_defined # Def: transmembrane protein # Organism: M.curtisii # Pathway: not_defined # 12 98 9 94 99 80 51.0 2e-14 MSVAVAWAAHSVYVIANLYLLVLLVRVGLDWARFFARSWRPSGAALVLANAVYTLTDPPL RFLRRFIPPLRLGSGMSVDIGFVVLWAVIIVIQRFAAAFAVVAL >gi|319979094|gb|AEUH01000051.1| GENE 10 8084 - 8542 656 152 aa, chain - ## HITS:1 COG:ML0920 KEGG:ns NR:ns ## COG: ML0920 COG1799 # Protein_GI_number: 15827440 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mycobacterium leprae # 62 139 119 196 210 78 52.0 4e-15 MSESFGARARKFMGWYSPDAEDDEFDDYDDMEEVAPVADITPATRPSLAPVRRPEPIPSE ELSRILPVHATAFSDARVIGEAFREGTPVILDLTDVSYEEARRLVDFAAGLTFGLRGVLE SVTDRVFLLSPKNVEIPGEAMGARRGALYNQG >gi|319979094|gb|AEUH01000051.1| GENE 11 8627 - 9394 769 255 aa, chain - ## HITS:1 
COG:Cgl2104 KEGG:ns NR:ns ## COG: Cgl2104 COG1496 # Protein_GI_number: 19553354 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 11 252 13 240 246 119 38.0 7e-27 MRSLAPLGGARVFLTEADPGTGPLVGNVSLGVGDDPEAVHARRRYLSARLGAPIVWMDQT HSSDVAVVGLGEMGPIMRAGGAYAPLDSRCEGEYGPVPADGVVIDARGWEGAPALAVMTA DCLPVVLSADGGRVLAAVHAGRRGFLSGILVAAVRAMSGLCAEAPAALIGPAICGQCYEV PGSMRDEAEAEAEGIGSRTRWGSAGLDLPGAARRQLEAMGCRVAVDPRCTLEDASLFSFR RDSACGRQALVVAAA >gi|319979094|gb|AEUH01000051.1| GENE 12 9394 - 10677 1617 427 aa, chain - ## HITS:1 COG:MT2209 KEGG:ns NR:ns ## COG: MT2209 COG0206 # Protein_GI_number: 15841642 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Mycobacterium tuberculosis CDC1551 # 1 313 23 335 401 370 74.0 1e-102 MASPQNHLAVIKVVGVGGGGVNAVNRMIEVGLKGVEFIAVNTDAQALLMSDAETKLDIGR ELTHGLGAGADPSVGRKAAEDHVEEITAALDGADMVFVTAGEGGGTGTGAAPVVAKIARQ GGALTVGVVTRPFSFEGNRRAAQAETGVETLRGEVDTLIVIPNDRLLEISELNISVLDAF KAADQVLLSGVQGITELITTPGLINVDFNDVKSVMKDAGSALMGIGAATGEDRATRAVET AISSPLLEASIDGAHGVLMFFQGGSDLGLREIYDSSQLVREAAHPEANIIFGNVIDDSLG DEIRVTVIAAGFDDVVAPPVSHTATIPAVPAIQKAPGVSSLRPAPASDGGRESGTLGDIP AVNLRRNAQHRAETPAVRSAPTTRPVPVEVPAVAEYAEEPSEPASFEVPRVFEDPAEKEL DIPDFLR >gi|319979094|gb|AEUH01000051.1| GENE 13 10803 - 11537 180 244 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|88856514|ref|ZP_01131171.1| 50S ribosomal protein L6 [marine actinobacterium PHSC20C1] # 11 242 24 262 266 73 28 5e-13 PAGSATDIDERRRERARAKRRLALTRTASVLGALAAVALVVWGVFFSPLFALSASSVRVS GVEGTSVDAARIASAVAAFEGTPITRLDTGSVREAVMADVAVKDAVVSRRWPSGLMIEIT ARRGAMYEAGGSGYALVDSEGVAFATADAPPAGLPLVSLPEGDRQQAAADVLEAWDALDG GVRQEVESLASDGAQVTIALRGSRTVKWGTRGQAVPKAQVLAALLAQRSASTYDVSVPAR PVTS Prediction of potential genes in microbial genomes Time: Thu May 12 17:15:15 2011 Seq name: gi|319979079|gb|AEUH01000052.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00052, whole genome shotgun sequence Length of sequence - 15263 bp Number of predicted genes - 16, with homology - 13 Number of transcription units - 4, operones - 3 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 26/0.000 - CDS 3 - 1371 1674 ## COG0773 UDP-N-acetylmuramate-alanine ligase 2 1 Op 2 31/0.000 - CDS 1371 - 2489 1280 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 3 1 Op 3 25/0.000 - CDS 2517 - 3800 1585 ## COG0772 Bacterial cell division membrane protein 4 1 Op 4 28/0.000 - CDS 3797 - 5251 1570 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 5 1 Op 5 28/0.000 - CDS 5248 - 6333 1751 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 6 1 Op 6 . - CDS 6330 - 7736 1889 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 7 1 Op 7 . - CDS 7765 - 9495 2322 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 8 1 Op 8 . - CDS 9510 - 9902 416 ## gi|154509048|ref|ZP_02044690.1| hypothetical protein ACTODO_01565 9 1 Op 9 29/0.000 - CDS 9899 - 10939 1264 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 10 1 Op 10 . - CDS 11050 - 11481 548 ## COG2001 Uncharacterized protein conserved in bacteria - Prom 11565 - 11624 2.7 11 2 Op 1 . + CDS 11557 - 11685 199 ## 12 2 Op 2 . + CDS 11746 - 12204 8 ## 13 2 Op 3 . + CDS 12287 - 12919 -43 ## 14 3 Op 1 . - CDS 13151 - 13546 570 ## HMPREF0573_11572 membrane protein 15 3 Op 2 . - CDS 13584 - 14834 1212 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 16 4 Tu 1 . 
+ CDS 14873 - 15263 184 ## COG0421 Spermidine synthase Predicted protein(s) >gi|319979079|gb|AEUH01000052.1| GENE 1 3 - 1371 1674 456 aa, chain - ## HITS:1 COG:Cgl2107 KEGG:ns NR:ns ## COG: Cgl2107 COG0773 # Protein_GI_number: 19553357 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Corynebacterium glutamicum # 9 456 16 481 486 321 44.0 2e-87 MSGLDIAGRYHFIGIGGAGMSVVAELLASRGARVSGSDRADSPVLERLRGLGITVFAGHD PAHVPADAVVVVSTAIRESNPELAAARGRGQRVVHRSEALALAATGMRFVGVAGAHGKTT TSGMLAVALRSVGADPSVAVGGVVPQFGSGAHLGAGDVFVAEADESDGSFLNYSPSVEVV TNVEPDHLDRYGSREAFEAVFAEFASRLVPGGALVACAEDPGASRLASHARERGITTITY GRPGRCAQRPDAAIEDVDSTASGSTARVEWDGQAAVLRLAVAGEHNVLNATAALLAGAAL GLGLPEMAAGLASFTGTARRFEERGRVGSRRLFDDYAHHPTEVAAALRQARVVAGAGGVT VVFQPHLYSRTRIFAARFAEALAAADHVVVTDVYAAREDPEPGVDSTLITSALPGSLHVP DMHEAARTGAGLVDEGGILITMGAGSITSCGADVLE >gi|319979079|gb|AEUH01000052.1| GENE 2 1371 - 2489 1280 372 aa, chain - ## HITS:1 COG:Cgl2108 KEGG:ns NR:ns ## COG: Cgl2108 COG0707 # Protein_GI_number: 19553358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Corynebacterium glutamicum # 2 364 8 354 363 209 41.0 1e-53 MVKVVMAGGGTAGHVNPLLATAAQLRDLGCEVSVLGTAQGLEADLVPAAGFPLVEIPRVP LPRRPSLEFFSLPSRWRSARRLCEEALEGADALVGFGGYVSTPAYSAAHRTGVPVVVHEQ NARPGIANKVGARRARVVALTFDSSPLRAKRGRTVTTGLPLRAAIASLAQRRRDEEGARR SREEAASRLGIDPGAHTLLVTGGSLGALHINEQMTAAAAALPDGAQVVHLTGRGKDAPVR EAVRAAGVGERWLVIDYLSQMEDALAVADLVVCRSGAGTVAEMEALGLPCVYVPLPIGNG EQRLNAADHVAAGGAQLFADKDFTADVVRNTVFPLLGSARLDGMASASRQLGRAGAAAEL AGIVIDVARGES >gi|319979079|gb|AEUH01000052.1| GENE 3 2517 - 3800 1585 427 aa, chain - ## HITS:1 COG:BH2566 KEGG:ns NR:ns ## COG: BH2566 COG0772 # Protein_GI_number: 15615129 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus halodurans # 43 397 10 362 366 157 34.0 3e-38 
MSARGAQRGTGAERRWRIGARELPRIRFGSGQVHEDNPAVTYYLIVLPTLVLVFLGMVMG FSAQTVTNIARGNNPYLEYVKPAIIIVVAVIVAMAASRVPRNWWYVAAIPMFAVSVVFQA LVLSPLGMAQGGNANWVRLPGGFTAQPSELLKLALIILLAQLLERHQGRLGDLRTMAKCA AVPMLIALGAVMLGRDMGTAMVVFAAAVGALFIAGLPKKWFAVLGALAGAAAVWLVMSNP TRIRRILAVLPGTAGERDLSAPEQIDHSLWALGSGGLTGLGVGASREKWNYLQAAHTDFI LAIIGEELGLGGTLLVLVCVGALVTGMMRVCRNAQDPFVAVAAGGTATWIGAQTVINVLS VTGMGPVIGVPLPLVSYGGSSFLFTALAVGVVASFARAGAGMSMVGRPDEASAGRDPRVA PKKRSAR >gi|319979079|gb|AEUH01000052.1| GENE 4 3797 - 5251 1570 484 aa, chain - ## HITS:1 COG:Rv2155c KEGG:ns NR:ns ## COG: Rv2155c COG0771 # Protein_GI_number: 15609292 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Mycobacterium tuberculosis H37Rv # 6 474 6 482 486 223 41.0 9e-58 MSALPAPASPIAVVGWGRSGQGAAGALASRGFDVRAFDAKDGGTLHFDEAAGVPLSVEAD AGQLADLVVGMRPAVVVVSPGVPPRSPVFARARAAGIEVWGEVELAWRLQEAGPRAGRPW LCVTGTNGKTTTVGMLGEILRASGADAAEVGNIGTPITRAIDSAAEVFAVELSSFQLHTA HSVSPLASICLNVDADHLDWHGSAEAYAVDKARVYQNTVRACVYPASDRRVEAMVENADV VEGARAIGLALGAPLVSQLGIVEGLLVDRAFVEDRAHHALALAHVSDLSAAYGPHPSAAA LSDALAAAALARAYGVEAEAVEEGLRSFRPAGHRRAVIGHAADLTWIDDSKATNAHAARA SLAGLPPRSAVWIVGGDAKGQDFHDLIRAVEPVLRGVVVIGRDRSALAAALDECAPGVPR VEVDGHEDWMFSVVNEAVALSVPGNTVVLAPACASWDQFHDYARRGEAFADAVRRLAAQW GGRP >gi|319979079|gb|AEUH01000052.1| GENE 5 5248 - 6333 1751 361 aa, chain - ## HITS:1 COG:Cgl2111 KEGG:ns NR:ns ## COG: Cgl2111 COG0472 # Protein_GI_number: 19553361 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Corynebacterium glutamicum # 1 358 1 361 366 304 50.0 2e-82 MIGLIAAFVIAMAISISGTPLLIRYLITHQYGQFIRQDGPTQHLTKRGTPTMGGLVIIIA AVVAWLAGSLITGVGPSWSGVLLVFLFVGLGAIGLLDDGIKIMRQRSLGLHPSGKIAGQV AVASLFAIGTLVGPNAYGQVPGTLSISVARPTALTLGFAGFALGVVLYLLWTNLIVAAWS NATNLTDGLDGLATGASIFVFGAYTFITYFQRIQDCTGGTVNVTNCYTTRDPLDLAIFCA ALIGALAGFLWWNASPAQIFMGDTGALALGGAVAGLSVLTQTQLLAVVIGGLFVLVVLSD 
VIQIGVFKATGKRVFRMAPLHHHFELKGWKEVTIVIRFWLIAALFAVAGAGAFYAEWVSA R >gi|319979079|gb|AEUH01000052.1| GENE 6 6330 - 7736 1889 468 aa, chain - ## HITS:1 COG:Cgl2112 KEGG:ns NR:ns ## COG: Cgl2112 COG0770 # Protein_GI_number: 19553362 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Corynebacterium glutamicum # 20 461 24 489 514 261 43.0 2e-69 MQRSATWIAEATRGVVHGPDVSANGPVVTDSREAAPGGVYVARRGENADGHDYVASAARA GAVCALVERVVDLSDCPPITQVVVGDATGALGDLARAHLEALRAGGAIDVVAVTGSVGKT TTKDLLLQVLSADAPTVAPRLSFNNEVGLPLTVLTADESTRHLVLEMGASGPGHIAYLTR IAPPDVAIELVVGHAHMGGFGSVAGVAEAKAELIEGSRQGATAVLNADDPNVAAMAPRAK GPVVRFSPSGVGADVVAEDVRVDAAGRASFTLAAPEGREPVSLSLVGAHHVANALAAAAG ARALGLGLPLIARALSGARALSPHRMDVRGLRIGAVPLTLIDDSYNANIDSMRAAVAALE AIGKGRRTIAVLSEMLELGEDSPATHRAVGAMLSGAGVDALIGLGRDAHYYSEGAASVPD RVLVEDPGAALAAVLDRLTDDCVVLVKGSYNSYSWQVADGLLEEATTR >gi|319979079|gb|AEUH01000052.1| GENE 7 7765 - 9495 2322 576 aa, chain - ## HITS:1 COG:Cgl2114 KEGG:ns NR:ns ## COG: Cgl2114 COG0768 # Protein_GI_number: 19553364 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Corynebacterium glutamicum # 24 565 34 641 651 224 30.0 3e-58 MSLTYSRSRGILYTIVVFLVACSMRLVYLQIIAGPTLAAEGQSIRTHSSEVAAKRGSITD ATGVVLADSILTYDIAVNQINIRAYVHEDDNGDEVGRGPAEAARQLAPLLGMDEAELGGR LLGDSTYVYLKRNVDAVTYRQIRALDIHGIEWEAVYQRSYPNGNVAAPLIGTVNAEGQGS SGLESRFDDLLTGRPGEEAFETAPNGAIMPGGKRTTVEPVNGGSLQTTIHADLQHQIQDM LDARVSRHQAEWGTVVIEDISTGQILVMADSDSTPPDNAKPQPVAGVQYAFEPGSVGKLA TIAAGLDFGTITPTSVFDVPYSLDYADAGGPITDYHQHGTEALTATGILAESSNTGTVLI GETLSDAQRRDMMERMGFGAGTGIELAGESPGLVGEQWQGRDHYVTMFGQAYMVTALQEV SMLAAIGNGGTRLSPHLVKSWTNADGTVEAPEADEPVQVMGASTAAQLLSMMESVVEDKN GTGAAAKVEGYRLGVKTGTADIVVNGQEGIVSTTAGIIPADAPRLAISVVLYNPKVDVIS SDSSAPLFGDVARTAVNNLGIPASSTPAELYPSKPQ >gi|319979079|gb|AEUH01000052.1| GENE 8 9510 - 9902 416 130 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154509048|ref|ZP_02044690.1| ## NR: 
gi|154509048|ref|ZP_02044690.1| hypothetical protein ACTODO_01565 [Actinomyces odontolyticus ATCC 17982] # 1 130 1 130 130 138 70.0 1e-31 MSAATATTRPAPRPELRRDEARELRIVEGKRPRRSVLMGTIAVVAVAIAAVVVSMILNTR MAQTAFEIREQQLQLNELEAQSWALRAQLDEAASPSALEAAARANGMVPAGPTGFITLGT GTVEGGKPAQ >gi|319979079|gb|AEUH01000052.1| GENE 9 9899 - 10939 1264 346 aa, chain - ## HITS:1 COG:ML0906 KEGG:ns NR:ns ## COG: ML0906 COG0275 # Protein_GI_number: 15827426 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Mycobacterium leprae # 31 337 42 355 372 286 53.0 4e-77 MDEAQGASHASSRHTEHSYNICNTRNAAELHTPVLLDECLDMLAPAIEGPGSVLIDATLG MGGHTEGALRRFPGLVVIGIDRDPEAVALASKRLEGFGPRFLAVRATYDSIGQVVADHAP RGEADGVLMDLGVSSLQLDDAERGFSYSQDAPLDMRMDPTRGVSAAELLATAPQEEITRI LRAYGEERFAPRIARLVVARRGAQPLTRTGQLVDIVREAIPAPARRTGGNPAKRTFQALR VAVNDELAILERALPAALASLRVGGRLVVESYQSLEDRIVKNVLRGGSTVSAPPGLPVVP DEARPRLALLTRGAGRAPASEQEKNPRSASVRLRGAELIRPWKEHQ >gi|319979079|gb|AEUH01000052.1| GENE 10 11050 - 11481 548 143 aa, chain - ## HITS:1 COG:MT2224 KEGG:ns NR:ns ## COG: MT2224 COG2001 # Protein_GI_number: 15841658 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mycobacterium tuberculosis CDC1551 # 1 143 1 143 143 138 44.0 3e-33 MFLGTYEPKLDDKGRMFLPARFREDMEGGIVLTRGQEHCVYAFPAAEFENMTAELRRAPL SSKQARDWIRVMLSGAYKEVPDKQGRISVPADLRKYAGLDRELTVIGAGSRAEIWNSSAW REYLAVQEEVFSSTAEEVIPGMF >gi|319979079|gb|AEUH01000052.1| GENE 11 11557 - 11685 199 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSPYFTLETRVFPRKILGGAKWGKLSHTFPPGASRESRRPQR >gi|319979079|gb|AEUH01000052.1| GENE 12 11746 - 12204 8 152 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGREVGQAVEEGGEEGGGRPLALRALVPGLRTVGRFGVRQRAGPELVAVAEAAGGPGRGA SVDVSTCRGPRARRAESPTRRTPASCPIAHRYRSAAERCSNHPERHRAPAARTNKPPPAS PAPATGHRLERLAQHALHLHLHLHHLGTARTT >gi|319979079|gb|AEUH01000052.1| GENE 13 12287 - 12919 -43 210 aa, chain + ## HITS:0 COG:no 
KEGG:no NR:no MPQPRTDPATRRTKCHNRGRIPQRGARNRTNTYNAPHTQPLPTTTRAKNRNQSTRPTTPK RCKAAPTTAGTVALGAPPLRATPGCAHKDQDHPTQHRASPNDWTHWREVEGAPSRRPYVG RLSRQPIPCARTAAPTPLGPGGGRLGKGWRAPPRAAPVAECGGPSTCRLASGAAFPGAAI RIGAGDPHRSRKPPIGVGELATPMTCCPPQ >gi|319979079|gb|AEUH01000052.1| GENE 14 13151 - 13546 570 131 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_11572 NR:ns ## KEGG: HMPREF0573_11572 # Name: not_defined # Def: membrane protein # Organism: M.curtisii # Pathway: not_defined # 1 131 3 141 141 66 34.0 4e-10 MALTDYEKKVLEQMEAELAAESPGLAKQMSAPAPQERGPLAPRRIAAGGTVLVIGLLVLI GAVSLGYSAASLALGVVGFALMTTGILYALSRPKGTGPARQAKTPKREEKRGWDSFIEDQ ERRWDDRRDND >gi|319979079|gb|AEUH01000052.1| GENE 15 13584 - 14834 1212 416 aa, chain - ## HITS:1 COG:BMEII0656 KEGG:ns NR:ns ## COG: BMEII0656 COG0389 # Protein_GI_number: 17989001 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Brucella melitensis # 23 391 49 417 445 216 37.0 7e-56 MSNAPRDTRGARWWGDDDSAAPILHVDMDSFFAQVELRENPSLAGAPLIVGGHGNRGVVT SATYGARARGVRAGMPIGRARALCPAANVVPGRHGLYRRYSQAVMGVLSSITPVVEQVSI DEAFLDVSGARRRLGPPVAIAQRIRAEIRGRVGLPASVGIAATKSVAKIASSNAKPDGLL LVPEAATVDFLRGLPVGALWGVGGKTGAVLQREGIDTVGQLADLPLARLARLVGTASAHH LHDLAWGIDTRRVGGGGEEKSVSTERTFDDNVHDRVGIERFIVAASHDCARRLRAADMVG WGVAIKMRDGAFHTITRSTALAAPTDVGREIAQAAGALFARLPIPSGGVRLFGVRVDRLQ ARSSGVATPLDGDDRPAKSERAMDRIREKYGDASLRPATLLESTGQTGSSPGAETH >gi|319979079|gb|AEUH01000052.1| GENE 16 14873 - 15263 184 130 aa, chain + ## HITS:1 COG:Cgl1208 KEGG:ns NR:ns ## COG: Cgl1208 COG0421 # Protein_GI_number: 19552458 # Func_class: E Amino acid transport and metabolism # Function: Spermidine synthase # Organism: Corynebacterium glutamicum # 32 130 47 148 314 74 44.0 4e-14 MARASSTSTVFFPTDLGHAEIRWDGPRATLLLDGVESSAVDASDPSYLEFEYMQHMSCAL GAALPPPGPVRALHLGGAACALACAWEAARPGSRQVCVEVDALLAERVREHFPIPRSPRV RIRVGDGRAV Prediction of potential genes in microbial genomes Time: Thu May 12 17:15:50 2011 Seq name: gi|319979069|gb|AEUH01000053.1| 
Actinomyces sp. oral taxon 178 str. F0338 contig00053, whole genome shotgun sequence Length of sequence - 10607 bp Number of predicted genes - 9, with homology - 7 Number of transcription units - 7, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 25 - 1116 1428 ## HMPREF0573_11570 hypothetical protein 2 2 Tu 1 . + CDS 1261 - 3597 2287 ## COG3973 Superfamily I DNA and RNA helicases 3 3 Op 1 1/0.000 + CDS 3771 - 4979 1730 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 4 3 Op 2 . + CDS 5146 - 6987 2298 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II + Term 7143 - 7184 3.9 5 4 Tu 1 . - CDS 7134 - 7664 637 ## COG1327 Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains 6 5 Tu 1 . - CDS 7864 - 8307 423 ## gi|293192311|ref|ZP_06609422.1| putative LysM domain protein 7 6 Tu 1 . + CDS 8500 - 9177 930 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) + Term 9282 - 9328 6.0 8 7 Op 1 . - CDS 9242 - 10438 1036 ## - Term 10482 - 10524 2.5 9 7 Op 2 . 
- CDS 10536 - 10607 77 ## Predicted protein(s) >gi|319979069|gb|AEUH01000053.1| GENE 1 25 - 1116 1428 363 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_11570 NR:ns ## KEGG: HMPREF0573_11570 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 1 331 1 310 349 201 37.0 5e-50 MDLTAHDLYADGPAVGAKARILLIHFDGAIDAGGAGRMAVAQMLRSLHNERVATFDPDAL IDYRSHRPVTTVENWVSTQVRVPEIVLDLVEDDMGHPILVLHGAEPDARWESFARAVGHI AERSGVEITFSLHGVPSGVPHTRPTPVHVQATDSSLLPDQTQMANYMQFPSPLSTFIQVR LAQRGIGGIALLGAVPYYMSDTGYPAAASALLRSLSKFADLSLPVGDLEQGAAQDQEAIE KLVEDNPEISHTVGALEEHYDAWSGQGGAIPLSSLGKRATATAAEDAKSAKDIGDVIEAY LAQVTRAKDVAVGEGAPHARGAAAASATEESARPDTLEDVLARVEARRRGENGGGTRAPR HRA >gi|319979069|gb|AEUH01000053.1| GENE 2 1261 - 3597 2287 778 aa, chain + ## HITS:1 COG:Cgl1339 KEGG:ns NR:ns ## COG: Cgl1339 COG3973 # Protein_GI_number: 19552589 # Func_class: R General function prediction only # Function: Superfamily I DNA and RNA helicases # Organism: Corynebacterium glutamicum # 32 765 24 753 755 311 35.0 4e-84 MRDNGGPGIDWQEDALHTQTADAPRVRSPLPEQDFVDGAYRRLDALRAAYRERQARAHSA HGAGNAQAWTEREALSQHLGGMAARLEGVEERLVFGRLDMKDRTVRHVGRISMSTDDGSP LLIDWRAPAAQPFYQATPVEPGGVVRRRHITTRQRTVTALEDELLDASDPHGQDLELAGE GALMSALNAAREGRMSDIVATIQAEQDEIIRSPHRGLVVVQGGPGTGKTAVALHRVAYLL YAQRERLERSGVLLVGPSRIFLRYIEAVLPSLGETGVVSRTMGSLVPGVSATAVEDPALA RLKGLPAWAGILREAVRRLARLPERDQELRVWNRRVVLRRSDVDRAKRHAKRSGRPHNVA REGFARELMDVLAVRLAKEAGGADSEGRVGKDEKSAWLAEIRDSVDARRAINLAWMPTSA TTLLGRLYARPEVLAEANRRAGSPLRPDELRALARPRCRAWTVSDVALIDELEELLGPMP DPGAARARQDGAADVARAQAAIESQGLGGGIVTAQMLAESASAQEGWAPLAERAANDRTW AYGHVVVDEAQELTAMEWRALLRRCPSRSFTVVGDLDQGRGARRPASWEKALGPAARALE KEYVLTVSYRTPRALTELAQAVVARAGSPVMYPMRAVRDVPDCYSVERVDAAGASEGPLP PRERDPLWGVSQGAVRRAASRLDASDGPGAGRIAVIVGARRARAWGADADGDSALQDRVG VLSAGAAKGLEFDSVVLVEPCEILADGVGDLFVALTRATHDVRVVHSRPLPAGMEEWA >gi|319979069|gb|AEUH01000053.1| GENE 3 3771 - 4979 1730 402 aa, chain + ## HITS:1 COG:VC2481 KEGG:ns NR:ns ## COG: VC2481 COG0111 # Protein_GI_number: 15642477 # Func_class: H Coenzyme 
transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Vibrio cholerae # 3 395 12 404 409 373 51.0 1e-103 MTRALLLENPHPVADSVFAAHGIEVVRHAGALDEDELIGALDSIDYLGIRSKTHVGARVL AARPGLRAIGAFCIGTNQVDLEAATRLGIGVFNAPYSNTRSVVELAIGEIIDLSRRVTVQ NSRLHRGVWDKSADGAHEVRGHTLGIIGYGNIGTQLSVLAEAMGMNVIFYDSAERLALGN ATQMPTMESVLREADVVSIHVDGQEANTDLIGRREFAMMKTGALFLNLSRGFIVDVDALH EALCSGHLAGAALDVFPHEPKKNGDPFTSDLALLDNVILTPHIGGSTEEAQYDIGRFVAA KISDYEANGSTDMSVNLPNLSLGAGPASLSRIRLIHANVPGVLARVNQVLADAGANVDGQ ILSTSGATGYVLTDVSSPLDGAALARLEALDSTIRLIVTPLR >gi|319979069|gb|AEUH01000053.1| GENE 4 5146 - 6987 2298 613 aa, chain + ## HITS:1 COG:Cgl0395 KEGG:ns NR:ns ## COG: Cgl0395 COG0318 # Protein_GI_number: 19551645 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Corynebacterium glutamicum # 18 557 20 567 568 388 38.0 1e-107 MDLTTQLRSLYAPGVAATIEVPETTIPAMVEEVAGRYPQRAAIDFFSRQMTYAELVVQMR KAAGALAAAGVRPGDRVALVMPNCPQHAVAVLGAMALGAVVVEHNPLAPAAELRGEYEAH GARTTIAWTKSLEKLSFLDPGHTVFAMDLIRALPRASQFLVRLPLRAARERREQLAARVP AWARRWDQEVARAEAWQGACPAASGDIALLIHTGGTTGVPKAAALTHRNLIANVEQSIDW VPVLHEGAEVFYCVLPLFHAFGFTIGFLAGLRMGATIAMFPKFDQAMILTSQKRLPCTFF LGVPPMYQRLLATAKQMGSDLSSIHFSLSGAMPLSRELADAWEEATGGLMIEGYGMTEAS PIILGSPLASTRARGALGIAYPSTEVRIVDPEDPSRDVADGEVGELLARGPQVFPGYWNQ PEETEACFVDGWLRTGDLVRLRDGFIYMADRRKEMINSSGFNVYPSQVEEAVRTMPGVRD VAIVGIPSSSSGEDVVAAIVLEAGASATLADIREWAEKSIAHYALPRQLVVMSELPRSQL GKVMRKKVREQITGMTGGAIQAAKGAVEGARGAVEGARDAVGEAARGAARGATDALRRIT SPEDRQQRGQQRD >gi|319979069|gb|AEUH01000053.1| GENE 5 7134 - 7664 637 176 aa, chain - ## HITS:1 COG:MT2791 KEGG:ns NR:ns ## COG: MT2791 COG1327 # Protein_GI_number: 15842256 # Func_class: K Transcription # Function: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains # Organism: Mycobacterium tuberculosis CDC1551 # 1 145 1 145 154 172 
62.0 4e-43 MHCPFCHNADSRVVDTRIADDGASIRRRRECGACKKRFTTLETSSFQVVKRSGVVEPFSR NKVVSGVKKACQGRPVTDDQLALLAQQVEENLRQTGVSNVSTNEVGKAILPFLRDLDVVA YLRFASVYQAFDTLDDFAAAIEELRERQAGTAPKPRRPRGRKQKKEPEQGPTLLDS >gi|319979069|gb|AEUH01000053.1| GENE 6 7864 - 8307 423 147 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192311|ref|ZP_06609422.1| ## NR: gi|293192311|ref|ZP_06609422.1| putative LysM domain protein [Actinomyces odontolyticus F0309] # 21 147 29 154 154 97 56.0 2e-19 MSAQTMIMAAPAPGAPVGGGARVIDIRSAPSARRRLVAAGLGDPAARRAPARPRSARPAL TLRSFVVGSLATVALLLGSAGLGALMRPAAYDGPTYTHAVTAGESVWSLAQSVRTSRPLD EVVADIERLNGVDGALRVGQVVVLPAR >gi|319979069|gb|AEUH01000053.1| GENE 7 8500 - 9177 930 225 aa, chain + ## HITS:1 COG:ML1003 KEGG:ns NR:ns ## COG: ML1003 COG1974 # Protein_GI_number: 15827479 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Mycobacterium leprae # 6 225 23 235 235 205 54.0 6e-53 MPANELSHRQAEILRVINDKLTRDGFPPSVREIALRVGLASPSTIKHHLDSLEAGGYLER HAGLPRALDLTDRARAVLGVGNPPTGAGVVTVEVPVGHADSDSGAVPLVGRIAAGSPITA EQYVEDVFALPTRLTGRGDLFMLEVSGQSMVDAGILDGDYVVVRAQADAQSGDFVAAMID GEATVKEFSRSGGHVWLLPHNEDYAPIPGDGATVLGKVVTVIRSL >gi|319979069|gb|AEUH01000053.1| GENE 8 9242 - 10438 1036 398 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEYNIIAGGPVGAADVEAVFGRWGRVCALGGGAVDVVIGDCGFVIAPEASSSVNPAFVRW VAHARGFGGSARVYSTEIDESYLESEGEETEVSRQVEAGLARLAERVGGAYAPFFQGDAI SYTDPVGGEYDFAGGPSQWDRKRKPGSPALHLSWYGRPGPAAAPLEQWAAALERHLPQCC PLRRPGVEPVVGGGALCLRYGPPWGDVVLHEPGAGAVSDGAAGGGDCGRPGLYTLSCTVP VSAFERDDRPCLDELREFVALAAEDLGAEVATCELVYGYDCGSGVARPTRKASAKRVVVV DEAGRLLGLPAVTPWWAWLGPAYSGLLAEWFQAKAKNGASCLIKYPGFRGILISANGKAW RYERPNCYDWIPKEYMPRKRRFSKAWEPAPRIPHWDGE >gi|319979069|gb|AEUH01000053.1| GENE 9 10536 - 10607 77 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no VPRKRLFGRGWEPARTRPQWGRG Prediction of potential genes in microbial genomes Time: Thu May 12 17:16:33 2011 Seq name: gi|319979042|gb|AEUH01000054.1| Actinomyces sp. 
oral taxon 178 str. F0338 contig00054, whole genome shotgun sequence Length of sequence - 25097 bp Number of predicted genes - 32, with homology - 20 Number of transcription units - 22, operones - 7 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 498 173 ## - Term 572 - 620 2.1 2 2 Op 1 . - CDS 655 - 1410 707 ## gi|228991904|ref|ZP_04151840.1| hypothetical protein bpmyx0001_26490 3 2 Op 2 . - CDS 1445 - 1990 582 ## 4 3 Tu 1 . + CDS 1944 - 2273 300 ## - Term 2096 - 2124 2.1 5 4 Tu 1 . - CDS 2310 - 2834 583 ## Bcer98_0736 hypothetical protein - Term 2898 - 2930 1.3 6 5 Tu 1 . - CDS 3129 - 4517 954 ## Bcer98_0735 hypothetical protein 7 6 Tu 1 . - CDS 5027 - 6049 1433 ## COG0039 Malate/lactate dehydrogenases 8 7 Tu 1 . + CDS 6032 - 6103 79 ## 9 8 Op 1 . - CDS 6087 - 7937 2259 ## COG1199 Rad3-related DNA helicases 10 8 Op 2 . - CDS 8020 - 9522 627 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 11 9 Tu 1 . + CDS 9614 - 10225 809 ## COG2813 16S RNA G1207 methylase RsmC 12 10 Tu 1 . + CDS 10506 - 10679 64 ## + Term 10792 - 10838 7.1 13 11 Op 1 . - CDS 10957 - 11958 866 ## gi|293190866|ref|ZP_06609028.1| conserved hypothetical protein 14 11 Op 2 . - CDS 11939 - 12637 512 ## gi|293190865|ref|ZP_06609027.1| hypothetical protein HMPREF0970_01360 + Prom 12112 - 12171 2.8 15 12 Tu 1 . + CDS 12191 - 13009 276 ## - Term 13016 - 13052 1.1 16 13 Tu 1 . - CDS 13159 - 13596 445 ## 17 14 Tu 1 . - CDS 14582 - 14734 203 ## 18 15 Tu 1 . - CDS 14997 - 15074 85 ## 19 16 Tu 1 . + CDS 15065 - 15190 137 ## 20 17 Op 1 4/0.000 - CDS 15568 - 16542 1101 ## COG0253 Diaminopimelate epimerase 21 17 Op 2 . - CDS 16535 - 17524 1074 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 22 18 Op 1 . + CDS 17584 - 18027 664 ## gi|293192302|ref|ZP_06609413.1| conserved hypothetical protein 23 18 Op 2 . + CDS 18020 - 18721 737 ## gi|154509069|ref|ZP_02044711.1| hypothetical protein ACTODO_01586 24 19 Tu 1 . 
- CDS 18945 - 19136 202 ## 25 20 Tu 1 . - CDS 19254 - 20780 454 ## PROTEIN SUPPORTED gi|228000795|ref|ZP_04047796.1| SSU ribosomal protein S12P methylthiotransferase - Term 20872 - 20911 -0.9 26 21 Op 1 . - CDS 20918 - 21400 277 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 27 21 Op 2 14/0.000 - CDS 21397 - 22053 627 ## COG2137 Uncharacterized protein conserved in bacteria 28 21 Op 3 . - CDS 22057 - 23142 1744 ## COG0468 RecA/RadA recombinase 29 22 Op 1 . - CDS 23352 - 23588 208 ## gi|293192295|ref|ZP_06609406.1| conserved hypothetical protein 30 22 Op 2 2/0.000 - CDS 23591 - 24070 661 ## COG1396 Predicted transcriptional regulators 31 22 Op 3 . - CDS 24083 - 24841 1005 ## COG0558 Phosphatidylglycerophosphate synthase 32 22 Op 4 . - CDS 24847 - 25095 131 ## Predicted protein(s) >gi|319979042|gb|AEUH01000054.1| GENE 1 3 - 498 173 165 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRVGECVFELRGYPSIDDGAFVRWLAVRRGRAGFAALHGAEVDGAYLDSDRYGGLGERVR AGFAALAERVDGVVVVLDEGAGEYADPVSGEVYTFAQAGPPAEGGRAALALSWFGIVGTR ESDPVLSWIDVCRQRWPALSPSVYRDEVRRPLTDGLIDALSRPGN >gi|319979042|gb|AEUH01000054.1| GENE 2 655 - 1410 707 251 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|228991904|ref|ZP_04151840.1| ## NR: gi|228991904|ref|ZP_04151840.1| hypothetical protein bpmyx0001_26490 [Bacillus pseudomycoides DSM 12442] # 97 249 89 234 235 65 25.0 3e-09 MALTPSRRRRERAERAQLEDLNRLMSYGGAVGFVEGPPDPDSVCPAWEHYGLYDPKMYGW SYVREMGISTGGALWRTVGLLGRCGRSMRVAKWGDRYVVVYQLVNSAGETVYYFGGDPMR PSWPMSGEEGYGRSVRAVWSRAVEVWPLYSHFHDGFLSCAQPSTGVYQVGDINEFSTGEY ADLGLDDEGLRDRARGCYPFYRSPGGDIVTLDITGQCPGRADLWHPRADPECGIDFWDTI DTLLTTAIDPE >gi|319979042|gb|AEUH01000054.1| GENE 3 1445 - 1990 582 181 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWGERYIVVYTLVNSTGTTLDYYGGNPFQPTWPGEGEPRRIGPSHQWASYDFNVHAVWDR VPAKLRSFYEDVHDGLSWCNWGPDDIFELRAVNEFSSGVFPDFDMLDNDALCERAKGCYV FFEHTSGSLLTLDVVGESPGRADYWSSGGAFEFGIDFWKALDEFYTGPMDPEYYAALEDA R >gi|319979042|gb|AEUH01000054.1| GENE 4 1944 - 2273 300 109 aa, 
chain + ## HITS:0 COG:no KEGG:no NR:no MEFTSVYTTMYRSPHNANSASLPHRLKKAPVSGSTSPTKRSIRSAITPTRSAGDKAGNAR HSSGKGASTGGDSSTKRTSPPPDIARFNRSTLVNDRLLKRENKAIRCLP >gi|319979042|gb|AEUH01000054.1| GENE 5 2310 - 2834 583 174 aa, chain - ## HITS:1 COG:no KEGG:Bcer98_0736 NR:ns ## KEGG: Bcer98_0736 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_NVH # Pathway: not_defined # 1 167 74 231 232 71 27.0 1e-11 MDAELAVHGGEYFVVYTFFNADGKEVSYVGGNPLRPSWPAPGEDGYADNVRRVWESVPES VRAFYERTHDGFGMFPDTDRLYRLEHVRPVYGVLDVGYEEPEVAERARHCYMFYEDASGT PFTLDILADRVERAGDLWWIDSKREYHVDFWGVLDQLLEAVIVPLEDRGPRRDR >gi|319979042|gb|AEUH01000054.1| GENE 6 3129 - 4517 954 462 aa, chain - ## HITS:1 COG:no KEGG:Bcer98_0735 NR:ns ## KEGG: Bcer98_0735 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_NVH # Pathway: not_defined # 281 441 69 211 219 81 32.0 6e-14 MEARDDILSKVEVAEGRYRSAGQALVEYAGALERAQTDSLNALVAAKSAQQDVDEAVARA GRMRESAGEYPEQGDGADDRARYERAAAAADGDAEVARGRVRAQRQVILNAMGERDAAAV KAMNAIDGAGDDGLGDSWWDDWGSKVASWVAAVCDMISAITGVLGLLVCWIPVIGQALAG VLFAISAIAGVVAAIGHALRWWYGEESLVAALVSVVFAVLGGLGLAGMRGTMAGVRSSFA NLCALGEKGHEGLVGGIKALGGLRGMAAGYGHNLWMSIKNGWKYLKNFFSKRTPVRFRDG PDIPRPKPTPNDDIKDAFVRGPDGERLAPTDWTLPGNTDNLALHPDHIVPYKRIQRMEGF DRLTEAQRDQVVNWSENFHAISARANQSRQNKSFVQWEGFLGEKGKRGPSGFVYPVEPLV RTEMSRIERLMEERIQYMIDQMLPPKPPAGGGWPYVPFFPGR >gi|319979042|gb|AEUH01000054.1| GENE 7 5027 - 6049 1433 340 aa, chain - ## HITS:1 COG:CAC0267 KEGG:ns NR:ns ## COG: CAC0267 COG0039 # Protein_GI_number: 15893559 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Clostridium acetobutylicum # 27 333 3 307 313 263 43.0 3e-70 MRGKPHLEGAIVESKAQTLYPSKSSGRPSKIAIIGAGAVGTAVAYACAMRGDARSIVLQD INKPKVEAEALDIAHGIQFTPCGSVEGSDDVEIVRGADLIIVTAGAKQQPGQSRLELAGS TVNLMKKIVPNLVGVAPDARFMFITNPVDVVTYVALKLTGLPRNQVFGSGTVLDTSRLRY LVSRETGVATQNIHAYVAGEHGDSEVALWSSAEIGNVPLSQWGPTLSGRVFDAELRKSIA TDVVQSAYKIIEGKGATNYAIGLSASNIAGAVLRDEQRVLTVSTLLEDWEGISDVCMAAP TLVGRDGAGRVLNPPLTLNERDGLTASAERLRKVARDLGF 
>gi|319979042|gb|AEUH01000054.1| GENE 8 6032 - 6103 79 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLPSHFCRLSRSPGEMFTTFAH >gi|319979042|gb|AEUH01000054.1| GENE 9 6087 - 7937 2259 616 aa, chain - ## HITS:1 COG:Cgl2468 KEGG:ns NR:ns ## COG: Cgl2468 COG1199 # Protein_GI_number: 19553718 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Corynebacterium glutamicum # 1 612 41 662 665 441 43.0 1e-123 MAAEAAMAIEERRHLMVQAGTGTGKSIGYLTPLLTHCALNGVRGLVSTATLALQRQILVK DAPAVVDAVAARTGVRLDVRVLKGWSNYVCLHRLQGGHPAEGTLFEDRDPGAGTEPTGEL GRQIVRLREWARASDTGDRDDLEPGVPDRVWHYASVAKPECLGRHCPLVDDCFAQRARDA AAEADVVITNHSLFGINATEGSSLFGEIDAVVVDEAHELADRVRSQAARDITPARVARAA RSLRSALSLDVSDLEEAGSGLGAALAPLPDGLLEKRPAALVDAMAVLDDAVRRAQQEVGE AEADQAAKLLARGAVEELAGALDEWGRDPDHSIAYVSRTEPGAERLTVGPLDVAGPLGAV GFGERPAILTSATLALGGSFDFMAAECGLALSPAPWHGIDVGSPFDPAAQAIRYVARRLP VPGRDGPPPAVLDELVELAQASGGGVLALFASRRGAQAGAEALRGRTDFEVLVQGEGTLS ALIDRFRDERDSCLVGTLSLWQGVDVVGPSCRLVVIDKIPFPRPDDPVTKARCMDVERRG GSGFRAVSLTHAALLMAQGAGRLLRSDTDRGVVAVLDPRLGTKPYGRFILASMPPMWPTE DPQVVRGALRRLNARK >gi|319979042|gb|AEUH01000054.1| GENE 10 8020 - 9522 627 500 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. 
AzwK-3b] # 90 481 23 413 425 246 39 1e-64 MTPTQSGDDEARDARIDEVVARILSRQAQALSSTAGQDRSGDAGAMEREARAATRRVDTS ASDRADISEVEYRQVRLERVVLVGLRTTQSEEEAENSLRELAALAETAGSRVMDGVIQRR SLPDPATYLGSGKAKELADIVRAHEADTVIVDEELAPSQRRGLEDVVDAKVVDRTALILD IFAQHAKSREGKAQVELAQLEYLLPRLRGWGESMSRQAGGRVAGGAGIGSRGPGETKIEL DRRRIRARMAKLKAEIARMEPARRTQRGARRRGGVPSVAIAGYTNAGKSTLLNRLTDAGV LVEDALFATLDPTVRRARTADGREYTLTDTVGFVRNLPTQLVEAFRSTLEEVGAADVLLH VVDAAHPDPVSQVEAVRAVLSGIEGADRVPELIALNKADLATPEQLAVLRTAFPGSVALS AKTGYGVGTLRAALEDLLPRPSVLIDAVIPYSAGSLVHRVHEEGEVEREEYAAEGTRLVA RVDEALAAAVRAATAPRADE >gi|319979042|gb|AEUH01000054.1| GENE 11 9614 - 10225 809 203 aa, chain + ## HITS:1 COG:BH0124 KEGG:ns NR:ns ## COG: BH0124 COG2813 # Protein_GI_number: 15612687 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S RNA G1207 methylase RsmC # Organism: Bacillus halodurans # 1 199 1 195 201 106 32.0 3e-23 MNEQYFTDQAPASDAEARQLRVSARGHELTMRVSPKVFSSSRLDLGTRQLLAEAPELPGR GTLLDLGCGWGPLAVVMGLESPGATVWAVDVNTRALDLTERNAEANGAANVSAMNAGEAL ERARSEGVRFDAIWSNPPVRVGKEAMRRMLSDWLSLLAPGGAAYLVVQRNLGADSLVAWL VSQGMAARRYASKKGYRIIEVTA >gi|319979042|gb|AEUH01000054.1| GENE 12 10506 - 10679 64 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFVRWGHRHCLYPCGPGTLGPGRAGGFVGLGKSSESGSLLFASSPAVGGLCVRPVGP >gi|319979042|gb|AEUH01000054.1| GENE 13 10957 - 11958 866 333 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190866|ref|ZP_06609028.1| ## NR: gi|293190866|ref|ZP_06609028.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 168 333 20 198 198 123 44.0 1e-26 MLKMRCKRRTRALGCVSTSVLTIVMMMPAAYADSDPEYSVTGYSDTQNNPGIEFRSTQRS TVDYSSRPPSSVGGGSGSGSAGGGVPAVPAVSGGSGVSGSGSSSGRAAGGPLEMVCTGER EGIPGSAARPGDAGAVSSHCQYVAGTAPTTPETADEEPADDGSGGEGEPPSTETIVRTAL ARVPVSGAGLSWQPRKKSYTNAGVPTIVYAATPSQTHTTALFGHEVSITLAASQYSYDFG DGTPPLVTSRAGEPWRRGNKEARLTHHYEQVTRGGERRTITLTTTWDATTTNPFTGQTLT LPSIITTTEQSTPFPVSHLRIDLTDTADEQDGH >gi|319979042|gb|AEUH01000054.1| GENE 14 11939 - 12637 512 232 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|293190865|ref|ZP_06609027.1| ## NR: gi|293190865|ref|ZP_06609027.1| hypothetical protein HMPREF0970_01360 [Actinomyces odontolyticus F0309] # 65 230 93 265 266 80 31.0 6e-14 MRREALARDAELRARGIDPYKGTPVDEEPRRRLGTRALVVLAVLVVAVVSVGAYVVFLRG EPDYGMSHGYQVQSDGSLKRPPVTDKAPEQPKEMNKGDEAGAAAAARYYLNASSYAWNTG DTNPLKSISDEECAFCRDQRSKIEEFYAHGYWAAGAHSSITNTQAIEREDSNYEGDVYLV QFRIDERLAEGYTSKGFQESQSRDTVIIFRVGWDGSNWRVLEGRAANAEDAM >gi|319979042|gb|AEUH01000054.1| GENE 15 12191 - 13009 276 272 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSVKLLNLRSLIPTERTFLVAYRFERICVTSVPRVRTGVQVITGCGRRSCLVPFVHLLGL LRRLVGDGRALETPVGLDLVAVAHPVIRLPPQEHHIRPHRHHCDHQNGQHHQGTRAQPPT RLLVHGSPLVRVDPPRPQLRIPRQRLTTQPLHILAHHRRHPTTINGGPVVGRIVGGAVVG QVIDRIISGRIVDGAVVGQVIDRIISGRIVDGAVVGQAIGRTISGQLADRIVGQPATDRG AVSARAVVRPGGTGSGAAAKPTVGRITNRSAV >gi|319979042|gb|AEUH01000054.1| GENE 16 13159 - 13596 445 145 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPDPLVGEYSDAGAEAMASYYFETVAYAWNVGDPRAFTDMVAAGCQACANMANDITSVYA NGGWAAKARASDFQAAAVGRVTDEQAHGDETYAVDVSFNERTPDVYTNGSLVPGAERVRA ARVLVAWDGYFWRVVEVEDQPQEES >gi|319979042|gb|AEUH01000054.1| GENE 17 14582 - 14734 203 50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVKVVFFSRFSVREHPAPSGALRPARTRGMRTGTLSVREHPAPSGALRLR >gi|319979042|gb|AEUH01000054.1| GENE 18 14997 - 15074 85 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYATTLWVLASAFRWEVVGKRQTGS >gi|319979042|gb|AEUH01000054.1| GENE 19 15065 - 15190 137 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRTPPYYLPYSPISALRMYLHLLTAEPAWLPLNLPETHTRT >gi|319979042|gb|AEUH01000054.1| GENE 20 15568 - 16542 1101 324 aa, chain - ## HITS:1 COG:Cgl1898 KEGG:ns NR:ns ## COG: Cgl1898 COG0253 # Protein_GI_number: 19553148 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Corynebacterium glutamicum # 12 306 7 265 277 131 38.0 2e-30 MRDFPLPRRARLAKAHGCGNSFVVATDADDSHDPSAEEVRALCSQAFGIGADGFIRCVRS GGAWFMDYRNADGSKAEMCGNGVRVFVDHLRREGLVRLAPGESLEVLTRGGTRTVELVAE AGGGAAEGCGEGGGAAGCGAAEDGAQYRVDMGPASSPARETVKVSVPGIPGVLSGIWVDM 
PNPHTVVAVDSLAALEGAQLPAVDAARVAPQMRPAYEPEPREGTNLELVVDLTREGDEVG HLRMRVLERGVGETMACGTGCCAAAVATALRRGPGAPASWIVDVPGGRVRVDIDGVIDWD GPGLTGAPVFLTGPATRVAEIVIP >gi|319979042|gb|AEUH01000054.1| GENE 21 16535 - 17524 1074 329 aa, chain - ## HITS:1 COG:Cgl1899 KEGG:ns NR:ns ## COG: Cgl1899 COG0324 # Protein_GI_number: 19553149 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Corynebacterium glutamicum # 13 315 5 300 301 219 45.0 6e-57 MDQPDPRPSDLVIAVVGPTASGKSDVALSVAQRAPARLGARGTGELVSADALQLYRGMDI GTAKTPVEERRGIAHHQIDVLEVRDEASVAAYQRHARADVEGIHARGGVAVVAGGSGLYQ RALLDVIDFPGTDPRVRARLEAEAEGPLGARGLHERLARLDPDSARRIDPRNARRIIRAL EVIEVTGRPYSAHMPRHEFHRPAVMVALRRDWEELDRRIAARTRAMFDAGLVEEVRALEG AGLRGARTASRATGYAQALAVIDGLMGVDEAIDSVALATRQLARRQVKWLRPDPRVVWLD AAGDAEDVASHVLDIAEARAAGGVALHHA >gi|319979042|gb|AEUH01000054.1| GENE 22 17584 - 18027 664 147 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293192302|ref|ZP_06609413.1| ## NR: gi|293192302|ref|ZP_06609413.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 1 147 1 148 148 186 69.0 6e-46 MSTQDTPPVTLPRVCDAVAATGLIPFVAEQQVAVILPSRTVRIAVPQDNPGQGVADYPRS FDAAHADQVAEAVRNLNASTYLPKVVSTPTEQGTITVHMQHTFNWVAGASDAQLNAEVSQ FLMATIAMMNQLDIAFPDQWAKEADHA >gi|319979042|gb|AEUH01000054.1| GENE 23 18020 - 18721 737 233 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509069|ref|ZP_02044711.1| ## NR: gi|154509069|ref|ZP_02044711.1| hypothetical protein ACTODO_01586 [Actinomyces odontolyticus ATCC 17982] # 1 169 1 167 176 157 55.0 5e-37 MPEWKSVALDGDEPADSPGAGGQDGPDEGGPSPDEFALKWGTEVQPVTIERIAAVLEGDG LPVAISEFAAATQVEEGNFQIHREPADCPWAQVELRLLVQGAELADLDSIANDWNAAHLQ PTVFPVPEAGQPLLVAASRFFVGEGMSDRQIHAMIRRGVVVGLSLARELAQAGGDGTGAP SDEGSESVPASDQEGAGSEADPAAFGDGAEPAPVPTADEEGAASTAPAPEDGA >gi|319979042|gb|AEUH01000054.1| GENE 24 18945 - 19136 202 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRPSPQGYTHKDPDLRNTRRTQGRNQALVCGAVFPNSVPCVRHLEGAKGAGPRGTAAPRP RPL >gi|319979042|gb|AEUH01000054.1| GENE 25 
19254 - 20780 454 508 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|228000795|ref|ZP_04047796.1| SSU ribosomal protein S12P methylthiotransferase [Brachyspira murdochii DSM 12563] # 15 465 6 437 440 179 27 2e-44 MNAHNPPAPPRTYAVRTLGCQMNEHDSERMAGLLEQAGLLPVDQVPEAAARATDAGDMGA DVVVINTCSVRENAATRLFGNLGQLAAVKRGRPGMQIAVGGCLAQQMREGIVERAPWVDA VFGTHNIDVLPALLRRAEHNRAAAVEIEESLKVFPSTLPTRRESVYAAWVSISVGCNNTC TFCIVPRLRGKERDRRPGEILAEVEAVASQGAIEVTLLGQNVNSYGVGFGERRAFADLLR AVGGVEGIERVRFTSPHPAAFTDDVIDAMATTPTVMPSLHMPLQSGSDRVLRQMRRSYRR ERFMGILERVRAAVPAAAITTDIIVGFPGETEEDFAQTLQVVEEARFASAFTFLYSPRPG TPAADREDQVPADVALERYKRLVALQERICAEDNAALVGNGVEVLVSEGDGRKDGATRRI TGRARDNRLVHVGLPASLSGADRPRPGDMVRASVTHGAPHHLIADSGLEGGLFEVRRTRA GDAWQAGRDAAGPGTGVALGMPTVRAAG >gi|319979042|gb|AEUH01000054.1| GENE 26 20918 - 21400 277 160 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 6 155 752 902 904 111 47 5e-24 MRTEGVAALVGELAARGLSIATAESLTGGALVARLVDIPGASRVVRGGVCTYATDTKASV LSVSRERLELTGPVDAEVAKQMASGARALFGADIALSTTGVAGPGSADGHEAGTVHVACA APGGALHRLVHIPGDRGAVRAGAVDAALALLREALDGGLL >gi|319979042|gb|AEUH01000054.1| GENE 27 21397 - 22053 627 218 aa, chain - ## HITS:1 COG:MT2805 KEGG:ns NR:ns ## COG: MT2805 COG2137 # Protein_GI_number: 15842275 # Func_class: R General function prediction only # Function: Uncharacterized protein conserved in bacteria # Organism: Mycobacterium tuberculosis CDC1551 # 50 198 19 171 178 73 39.0 3e-13 MAPHQDQDGAPGAPERTGQGEARERRPSPAERARLMRERNAALEGPRAVEAAREVALRQL DTRARSRRELLDAIASRGFDDGVGQEVVSRLEAVGLVDDRAFARALVRERFAARGRTGPA LVAELRRKGLDGDSIDEAMSSISGDDEYDRARRLVEGRAPSVRGVPRRAAYRRLAGMLAR KGYGPDVSDRAVREALDALGSADGEEEHWQDGEAGWGA >gi|319979042|gb|AEUH01000054.1| GENE 28 22057 - 23142 1744 361 aa, chain - ## HITS:1 COG:Cgl1910 KEGG:ns NR:ns ## COG: Cgl1910 COG0468 # Protein_GI_number: 19553160 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Corynebacterium 
glutamicum # 14 359 16 368 376 481 73.0 1e-135 MAARKSTASRTVENDRNKALQVALSQIDKQFGKGSVMRLGDDSRPPVQVIPTGSLALDVA LGIGGLPRGRVIEIYGPESSGKTTVALHAVANAQKAGGNAAFIDAEHALDPVYARALGVD TDSLLVSQPDTGEQALEIADMLIRSGGIDIIVIDSVAALVPKAEIEGEMGDSHVGLQARL MSQALRKITGALSATGTTAIFINQLREKIGVFFGSPETTTGGKALKFYASVRIDVRRIET LKEAGAPVGNRTRAKVVKNKMAPPFKQAEFDIVYGKGISREGSIIDMGVDTGIVRKSGSW FTYGDDQLGQGKENVRQFLADNPALAGEIEQKILVALGIAEAPPEEASQAPSVDPDEDAG F >gi|319979042|gb|AEUH01000054.1| GENE 29 23352 - 23588 208 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192295|ref|ZP_06609406.1| ## NR: gi|293192295|ref|ZP_06609406.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 1 72 1 72 74 102 79.0 6e-21 MKHSEFYEAVEATFGSALGRSYVTDLHLAELGATAAEALGAGADPDSVWEALTRETGRED ARWIHRTDPVRRRGDAGR >gi|319979042|gb|AEUH01000054.1| GENE 30 23591 - 24070 661 159 aa, chain - ## HITS:1 COG:MT2816 KEGG:ns NR:ns ## COG: MT2816 COG1396 # Protein_GI_number: 15842285 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Mycobacterium tuberculosis CDC1551 # 64 137 3 76 112 77 64.0 7e-15 MAQLTDEGRRGAECAHDRGRCTNMSISRSHPDRRRSGPMRGTDMRSVPMSRYSKGMEKKP RETALLRTELGDVLRDIRQRQGRTLREVSSEAQVSLGYLSEVERGQKEASSELLDAISHA LGVPLWFVLHEVSDRMAVVDGAVVPDAVPDDIMPVALIS >gi|319979042|gb|AEUH01000054.1| GENE 31 24083 - 24841 1005 252 aa, chain - ## HITS:1 COG:Cgl1919 KEGG:ns NR:ns ## COG: Cgl1919 COG0558 # Protein_GI_number: 19553169 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Corynebacterium glutamicum # 14 204 24 204 208 133 42.0 3e-31 MERGDVNDNVSRVQTRRVSNLNVANALTVTRIVLVPVFIYLALRPSYLERLLAFVVFAIA AITDKLDGHLARSRGLITDFGRIVDPIADKALTLSAFALLSWQAVLPWWVTIIIAVRELG ITWLRAAFLRRGVVVAAAYAGKVKTLLQILALGTLLIPWDYLPALDPAYTWLAEALVSLG LVLTGAALAVTVYSGIMYIVDGMRISKELDTQADGADQAPGEEEGASDEDGGPSEQDEDG HSGQGAAHGVAS >gi|319979042|gb|AEUH01000054.1| GENE 32 24847 - 25095 131 82 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
APAAAGPAPGAVAESGAQGGRCADETPGGAGNGGSGGPAGGAADSPKRGTDPYAGTSFGG SAPDWVDEEPAGDEDAWQLTGR Prediction of potential genes in microbial genomes Time: Thu May 12 17:19:07 2011 Seq name: gi|319979030|gb|AEUH01000055.1| Actinomyces sp. oral taxon 178 str. F0338 contig00055, whole genome shotgun sequence Length of sequence - 12867 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 3, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 727 885 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 2 1 Op 2 . - CDS 809 - 1765 971 ## COG0524 Sugar kinases, ribokinase family 3 1 Op 3 9/0.000 - CDS 1829 - 3574 2469 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily 4 1 Op 4 . - CDS 3601 - 4509 1170 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 5 2 Op 1 . - CDS 5524 - 6147 312 ## gi|293192224|ref|ZP_06609393.1| acetyltransferase, GNAT family 6 2 Op 2 . - CDS 6219 - 6959 914 ## COG0289 Dihydrodipicolinate reductase 7 2 Op 3 4/0.000 - CDS 6972 - 8312 1583 ## COG0612 Predicted Zn-dependent peptidases 8 2 Op 4 26/0.000 - CDS 8419 - 10776 1339 ## PROTEIN SUPPORTED gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase - Term 10940 - 10979 11.2 9 3 Op 1 . - CDS 10987 - 11280 376 ## PROTEIN SUPPORTED gi|227429968|ref|ZP_03913007.1| SSU ribosomal protein S15P 10 3 Op 2 . - CDS 11332 - 11406 110 ## 11 3 Op 3 12/0.000 - CDS 11417 - 12406 397 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 12 3 Op 4 . 
- CDS 12403 - 12867 415 ## COG0130 Pseudouridine synthase Predicted protein(s) >gi|319979030|gb|AEUH01000055.1| GENE 1 1 - 727 885 242 aa, chain - ## HITS:1 COG:ML0977 KEGG:ns NR:ns ## COG: ML0977 COG1674 # Protein_GI_number: 15827463 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Mycobacterium leprae # 65 217 134 284 886 69 35.0 4e-12 MANQSNRSASPKKKPARPRSPKGAQPAPASSASTAPGLLARVFSAIGRGAVRCAAAWRAT DAQLKRDALAFVLVAIAIVVALREWFQISGEAGDFIHHSAAGLVGIFSVVLPLVLVAFSV ELVRARSGKSAVPHHVAGGVGVAFSLTGLVHVSRGNPSIDPFAAVEQAGGITGWFIARPL SLLLSVWGAAAILVLLLLYSVLLATRTRVAEVPARLREARAALARKHGADPAPGAPGDGS QG >gi|319979030|gb|AEUH01000055.1| GENE 2 809 - 1765 971 318 aa, chain - ## HITS:1 COG:ECs2903 KEGG:ns NR:ns ## COG: ECs2903 COG0524 # Protein_GI_number: 15832157 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 22 301 36 314 321 95 28.0 2e-19 MAGRFISTHSVALVLPMHIAYMPGRGGSVHADAASSRPGGGFTTLSAAAAMGVPAAMASP LGTGPNSFTVRQQLVEAGVSVLTPELVGDIGVVIQLIVEDGSMTSVVTAGVESEPSRAVL DRIILQPGDTIHVAASDLTNPHSAKVLSEWGSELPESVTLVVSISPAVEQVPVEAWRNLL GRADIVTMNVREATTLTSILAGYRGGTTIRDLMRPGAATVRRLGSMGCELQEARDAPFVH IPAFQTTTADTAGVGDTHIAEMCAGLLMGYELADACLMANAGAAITLSHPSALPVPTLQQ VEAVLESGVVSTVVPLRP >gi|319979030|gb|AEUH01000055.1| GENE 3 1829 - 3574 2469 581 aa, chain - ## HITS:1 COG:MT2822 KEGG:ns NR:ns ## COG: MT2822 COG0595 # Protein_GI_number: 15842290 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Mycobacterium tuberculosis CDC1551 # 25 580 4 557 558 530 50.0 1e-150 MNESPRDKGAPRHGRLERMNPLYENLSEPAPLEKGALRVIPLGGLGEVGRNMNVLEFEGK LLVVDCGVLFPEETQPGVDLILPDFSWIEERMDDVVGLVLTHGHEDHIGAVPYLLKLRGD IPIYGSDLTLAFVAPKLREHRLSDPGLNVVAEGDRLAVGPFDLEFVSVTHSIPDALAVFV RTSAAKILITGDFKMDQLPLDRRLTDLRAFARMGEEGVDLFMVDSTNALVPGFITPEREI 
GPVLDQVFGEATGQIVVASFASHVHRVQQVINAAHHHGRRVALVGRSMERNMRIAEEKGH LAIPEGVVVDLKEIDQLPPSQRVYMATGSQGEPMAALSRMSVGSHKTVTIGAGDMVVLAS SLIPGNENSVYRVINDLTRLGARVVSKENAKVHVSGHASAGELIYCYNIIQPKNVMPIHG EVRHLVGNGQLAVKTGISPESVVLAEDGVTVDVKDGTARVSGVVPCEYVYVDGRSIGEIS EDELETRRTLGAEGFISIFAVVEHDSGTVLAGPTIRAIGMAEDDSVFDEILPDVASALKD AAAPGGQDPYVLQQAMRRVIGRWVARRLRRRPMIVPVVTEQ >gi|319979030|gb|AEUH01000055.1| GENE 4 3601 - 4509 1170 302 aa, chain - ## HITS:1 COG:MT2823 KEGG:ns NR:ns ## COG: MT2823 COG0329 # Protein_GI_number: 15842291 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Mycobacterium tuberculosis CDC1551 # 13 302 12 299 300 258 50.0 1e-68 MVKGMTMIPPRLFGSLATAMVTPMKADGSLDCESAARLARALVDDGCDTILLSGTTGESP TTHQPEKNDLTDAVREAVGDRAFILCGACSNDTAHAVRIAEGAQEHGADGLLVVSPYYNR PSQEGLRAHVLEVANATDLPVMLYDIPGRTGIAFSDGTLDALASHPRIKGVKDATGDVET GVDRMHRTGLEYYSGDDALNFAWMTGGASGFVSVVSHVAAAQYRRMIELLDSDQVWEARE LSYRLRPLVRAIMGSGQGAVMAKYALWLQGVIDAPAVRLPLVGVSDSEVEALRAALVAQG AL >gi|319979030|gb|AEUH01000055.1| GENE 5 5524 - 6147 312 207 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192224|ref|ZP_06609393.1| ## NR: gi|293192224|ref|ZP_06609393.1| acetyltransferase, GNAT family [Actinomyces odontolyticus F0309] # 3 204 17 177 180 143 48.0 5e-33 MPGDERAIAALQWGAWRALLTGEELAAQGLSEERLRAGWEAALSSPRPASAALVVALHGN SVVGFALAGPDDEAGADRGQSGAAGGPPGAVRGQSGAAGGPPGAVRGQAGAAGGPPGAAP VQTGGVPAAVPTQIYELVVEPRFCRSGHGSRMLAAVADLVGGAMRVWIDARDEARQRFFS SAGFAPAGAGRTIGDGHTQHLWWARAE >gi|319979030|gb|AEUH01000055.1| GENE 6 6219 - 6959 914 246 aa, chain - ## HITS:1 COG:MT2843 KEGG:ns NR:ns ## COG: MT2843 COG0289 # Protein_GI_number: 15842311 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Mycobacterium tuberculosis CDC1551 # 3 246 2 244 245 227 55.0 2e-59 MKRVAVIGASGRMGTAVVEAVGACADMEVVARLDAGDAISRESLGGAEVAVEFTSPASSE ANVHALLDAGVDAVVGTTGWDDDAYARVRAHAEDAGRSVLIAPNFAIGAVLAMRFAQIAA 
PFFESAEVVEMHHPDKVDAPSGTAIATARGIARARGAAGLGDVPDATESDPHGARGARID GVPVHAVRLRGLTASEEVLLGNPGEQLVIRTDSFDRVSFMPGVLLAVRRVSSRPGLTVGI ESVMGL >gi|319979030|gb|AEUH01000055.1| GENE 7 6972 - 8312 1583 446 aa, chain - ## HITS:1 COG:ML0855 KEGG:ns NR:ns ## COG: ML0855 COG0612 # Protein_GI_number: 15827381 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Mycobacterium leprae # 26 435 3 408 424 291 41.0 2e-78 MPPIALPLTPDAASLSFDDGDTRIHRSIGSTGVRVLTQRVPAAQSVSASLWVPVGSRDEE PARAGSTHFLEHLLFKGTARRSALDIAVAFDSVGGESNAETGREHTAYWARVRDADLGTA IDVLVDMVTGSVLDPADFDTERGVILDELAMGDDNPVEVVHDAFQLAVHGDTPIGRPVGG TADAIRAVGRDDVWEHYQSNYGCPSLIVVASGNVDHDELVERVDGALAASQWSTAPRPPR PRRPTAAPEGAGSAGEGVVERRRDVGQAHVVLGCEGLRATDEETPVMHVLLSVLGGSMSS RLFQEIREKRGLAYTTYAFASPYSDTGSFGMYAGTSPGSVPEVEAIMRAQLEDLATQGPS DEEMARVRGQVRGGLVLGLEDNWSRMMRLGRSEIMGRYRVVEETLRDIESVDAQQVRSLA SRLAARLRARAVVLPRQWEPATTHTH >gi|319979030|gb|AEUH01000055.1| GENE 8 8419 - 10776 1339 785 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase [Brucella abortus bv. 1 str. 
9-941] # 19 747 11 710 714 520 44 1e-147 MMEGSDIVAAEAVIDNGRFGKRTIRFETGRLARQAAGSALAYLDDATTVLSATTVGKNPK DQFDFFPLTVDVEERAYAAGRIPGSFFRREGRPGTEAILAARLIDRPLRPGFVKGLRNEV QVVETVLTIDPDDAYDVLAINAASMSTQIAGLPFTGPIGGTRLALVDGQWVAFPRWSELE RSVFNIVVAGRIVTKDDGSEDVAIMMVEAGGGKNEWELIQAGATAPTEEVVADGLEAAKP FIKALCRAQLEVAAKASKETAEFPLFLDYTDEEYSAVEECVGGKLAEALLTEGKLARDGA VDAVKEEMLAALADRFPEGEKNLKAAFRSLEKATIRARTLRDEIRMDGRTPRQIRSLSAE VEVLPRVHGSALFQRGETQILGVTTLAMLRMEQQIDNLSPVNSKRYMHQYNFAPFSTGEV GRVGSPKRREIGHGDLAERALVPVLPTREEFPYAIRQVSETMGSNGSSSMGSVCASTLSL LQAGVPLRAPVAGIAMGLMTGEVDGQPKAVTLTDILGAEDGFGDMDFKVAGTRDFITALQ LDTKLDGIDSQLLRAALGQARDARLAILDLINQAIDGPDEMSPNAPRIITVKVPVDKIGE VIGPKGKMINQIQDDTGADITIEDDGTVYIASSDGASAEAARTTINQIANPQMPEVGERF VGTVVKTTSFGAFVSLTPGKDGLLHISQVRRLVGGKRIESVDDVLQVGQQVEVEINEIGD RGKLSLHAVIDGEAGDLEAPAESERRERAERRDRSERRERRPRTRTRRRREDESEATREE GASDE >gi|319979030|gb|AEUH01000055.1| GENE 9 10987 - 11280 376 97 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227429968|ref|ZP_03913007.1| SSU ribosomal protein S15P [Xylanimonas cellulosilytica DSM 15894] # 9 97 1 89 89 149 82 1e-35 MQRIEEKPVPLSKDVKDQIVAEYATHEGDTGSPEVQIALLTQRIKDLTEHFKTHKHDHHS RRGLLLLVGRRRRLLGYLADIDIERYRSLIERLGLRR >gi|319979030|gb|AEUH01000055.1| GENE 10 11332 - 11406 110 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSLCARWCPSSQPRGRNSHTCYSY >gi|319979030|gb|AEUH01000055.1| GENE 11 11417 - 12406 397 329 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 18 308 20 307 317 157 32 4e-38 MRIWHSLAEIPGAQRSVVTIGNFDGMHNGHKKVVSTCVERARRRGVDAVAITFDPHPLQV HKPEAGVQLISPLRDRLDAMAAAGLDAVLVAHYDVNLYSMEPDEFIQEYVVDRMGAVEVV VGEDFRFGRANAGTIDTLRDLGRGLGFDVVMVTDIEAPEGRRWSSSWVRSLLAEGDVAGA ARVLGHLHRIRGTVEHGFKRGRALGFPTANLASGIEGVVPGDGVYAGWLVRNVPGTLSAE FLPAAISVGTNPQFGATERTVEAHVLGRSDLNLYGERIAVTFVSRIRPMLAFSSLEELLA QMDDDLRQTASVLGIGPAGRVDPDSVTAL >gi|319979030|gb|AEUH01000055.1| GENE 12 12403 - 12867 415 154 aa, chain - ## HITS:1 COG:ML1546 KEGG:ns NR:ns ## COG: ML1546 COG0130 # 
Protein_GI_number: 15827813 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Mycobacterium leprae # 2 139 175 301 320 83 43.0 2e-16 AVVDVDVVVECSSGTYIRALARDAGTALGCGAHLVRLRRTRVGRFGIGDALTLEQFDAGA CGDPERPTVPLIPLGEAARAMFPVIGLDAREAAAFAHGQAPARAALPDGEGPLAAAAPDG RVVGLVGRSGDRLRALAVFAGPGAEGTAEEGAAQ Prediction of potential genes in microbial genomes Time: Thu May 12 17:19:27 2011 Seq name: gi|319979026|gb|AEUH01000056.1| Actinomyces sp. oral taxon 178 str. F0338 contig00056, whole genome shotgun sequence Length of sequence - 2911 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 26/0.000 - CDS 2 - 398 442 ## COG0130 Pseudouridine synthase 2 1 Op 2 32/0.000 - CDS 407 - 856 620 ## COG0858 Ribosome-binding factor A 3 1 Op 3 . - CDS 980 - 2911 2838 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) Predicted protein(s) >gi|319979026|gb|AEUH01000056.1| GENE 1 2 - 398 442 132 aa, chain - ## HITS:1 COG:Cgl1933 KEGG:ns NR:ns ## COG: Cgl1933 COG0130 # Protein_GI_number: 19553183 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Corynebacterium glutamicum # 8 132 6 130 297 140 60.0 6e-34 MARRPDNPRPGLVVIDKPQGITSHDAVSRVRRTARTRKAGHAGTLDPMATGVLVVGIGKA TKLLTWVSGDTKSYEATIRFGASTRTDDAEGETTSAPGCSHLDATALGAAFERLRGDIMQ VPSAVSAIKVGG >gi|319979026|gb|AEUH01000056.1| GENE 2 407 - 856 620 149 aa, chain - ## HITS:1 COG:Cgl1938 KEGG:ns NR:ns ## COG: Cgl1938 COG0858 # Protein_GI_number: 19553188 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Corynebacterium glutamicum # 1 136 1 140 149 101 47.0 6e-22 MADEARVRKVQERVQQTVASMIGRRVKDPRLEFVTITDARVTGDLQHASVFYTVYGDDAA RAGAARAFESARGLFRSQVGKVLGLRLTPTLEFIPDALPESAAALEDALAAAKAKDATIA ERAQGASYAGEADPYRHDDEDGEDPRGED >gi|319979026|gb|AEUH01000056.1| 
GENE 3 980 - 2911 2838 643 aa, chain - ## HITS:1 COG:ML1556 KEGG:ns NR:ns ## COG: ML1556 COG0532 # Protein_GI_number: 15827818 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Mycobacterium leprae # 5 642 289 923 924 775 67.0 0 GATPGAFGRGGGRNQRGRKSKRAKRQEYEQQNAPVIGGVSIPRGGGAQIRIRQGASLADL AEKINVNPAALVTVLFALGEMATATQSLDQDTFEALGAELGYDVKVVSPEDEDRELLESF DIDLEAEELEDLENMAPRPPVVTVMGHVDHGKTKLLDAIRHTDVVDTEFGGITQHIGAYQ VKVDHEGRKRPITFIDTPGHEAFTAMRARGAQVTDIAILVVAADDGVMPQTVEAINHAQA ADVPIVVAVNKIDKDGANPDKIRSQLTEYGLVAEEYGGDVMFVDISAKQRKGIHELLEAV LLTADAALTLEANPTTDARGVAIEANLDKGRGAVATMLVQRGTLRVGDSLVVGSSHGRVR AMFDDTGKDVSEAGPSTPVAVLGLTSVPRAGDSFLVAPDDRTARQIADKREAAERQALLA KRRKRVSLEDFDKVLAEGEVDTLNLVIKGDVSGAVEALEDSLLRIDVGDEVALRIIHRGV GAITQNDVNLATVDNAVIIGFNVRPAERVAELADAEGVEIKYYNVIYSAIDDIEAALKGM LKPVYEEVALGSAEIRQVFRSGKFGNIAGSIVRSGTIRRGAKARLVRDGVVVADNLEIAS LRREKDDVTEVREGYECGITLGYKDIAEGDVIETWEMREKARD Prediction of potential genes in microbial genomes Time: Thu May 12 17:19:29 2011 Seq name: gi|319979020|gb|AEUH01000057.1| Actinomyces sp. oral taxon 178 str. F0338 contig00057, whole genome shotgun sequence Length of sequence - 5010 bp Number of predicted genes - 7, with homology - 5 Number of transcription units - 5, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 24 - 572 253 ## 2 2 Op 1 . - CDS 477 - 827 192 ## RER_26310 hypothetical protein 3 2 Op 2 32/0.000 - CDS 873 - 1901 577 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 4 2 Op 3 . - CDS 1909 - 2394 371 ## COG0779 Uncharacterized protein conserved in bacteria 5 3 Tu 1 . + CDS 2466 - 3524 1119 ## gi|154509104|ref|ZP_02044746.1| hypothetical protein ACTODO_01621 6 4 Tu 1 . - CDS 3726 - 4778 1608 ## COG0473 Isocitrate/isopropylmalate dehydrogenase 7 5 Tu 1 . 
+ CDS 4891 - 5008 63 ## Predicted protein(s) >gi|319979020|gb|AEUH01000057.1| GENE 1 24 - 572 253 182 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGAAPGLGAFAADGFGATGALGAAAGLGAAGFAPPLGAAGFFGAGFAFAVSLPFAPPLGA AGFFGAGFASAFGFAAANFSRMDRTTGGSRDDEEDLTNSPASFRRPINSLLVIPSSLASS WTRNFATFLLSGSASGQEARYSSCWQLIAGYSSGAHQLPTRFRFSIVGVRRLTRDRLQLR PH >gi|319979020|gb|AEUH01000057.1| GENE 2 477 - 827 192 116 aa, chain - ## HITS:1 COG:no KEGG:RER_26310 NR:ns ## KEGG: RER_26310 # Name: not_defined # Def: hypothetical protein # Organism: R.erythropolis # Pathway: not_defined # 13 91 4 86 105 72 48.0 5e-12 MSQPSAPGGAASGGPVRTCIGCGERGARADLVRVVVPPGGPATVDLRRDQPGRGAWIHPD PACVQAARGRRALERVMRARTGTGSSVWAQLEAIASQAANADDRETEAGWKLMGTR >gi|319979020|gb|AEUH01000057.1| GENE 3 873 - 1901 577 342 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 11 325 17 350 537 226 39 2e-59 MEIDMTALRMVETEKGVSLDTLVDAIEEALLKAYHNIPGAISEARIEIDKKTGRVTVMAV DEDEDGNPIGEFDDTPKNFGRIAQATARSVIMARLRDADDQRVFGDFAGREGQIITGTVQ QSRDGHLTRVQISDDFEAVLPDSEKVPGEAYRHGDRIRAFVVGVERTERGPRVTLSRTHP GLVECLFEREVPEIQQGLVRIRAIAREAGHRTKIAVAATREGINAKGACIGPVGARVRAV MTELGGEKIDIVDYSDDPARFVANALSPARVTRVVVHSVDNRTATAIVPDFQLSLAIGKE GQNARLAARLTNFHIDIHSDTETGDEATASRVSTPDSVTSRS >gi|319979020|gb|AEUH01000057.1| GENE 4 1909 - 2394 371 161 aa, chain - ## HITS:1 COG:Cj0138 KEGG:ns NR:ns ## COG: Cj0138 COG0779 # Protein_GI_number: 15791526 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Campylobacter jejuni # 53 156 34 139 140 69 34.0 3e-12 MAHPARDARLEELLGPVVAREDLELDTVVRTRSGGTPLVRVTVDTPLVEGAGEQARVDSD TLAAVSRAVSAALDRADPIDGEYLLEVSTPGAERELTEPRHWKRQVGLPVRVKLRDGGHV SGRVRAADDVSATLETDGGTTTIEYANMKKARARVELGSKD >gi|319979020|gb|AEUH01000057.1| GENE 5 2466 - 3524 1119 352 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509104|ref|ZP_02044746.1| ## NR: gi|154509104|ref|ZP_02044746.1| hypothetical protein ACTODO_01621 [Actinomyces odontolyticus ATCC 
17982] # 51 352 41 340 340 192 51.0 2e-47 MWDDSTVPVLISETTRPAPAGAPRRGALARAGAACAVVAAVVSGCSSSSLPQLSGFAQVR DLSARTEAAAELRARALSTAAVACASCQQALGSLADESSTRLTALGGLWDPWGGRPPEGA EEPAPVSDAPVDVEGFVAWLAATAERDLAAAADPTTTESGDDARTLAAIALGRYRSAQSL ASVYGISVTTGAARAESANQRLNGLSGAGVQTWGVQDFRTSPLPALPFGRADQGLASSPA LSAAVASWDCVAQMLPRNRVSGEPLADADERAEELMVRADQALSAGVADARTVRCSLPSS SPADLAVDLVAADVSLLASDSAPVRAVGVNTALTDITRWGSVAAFGAVIGVQ >gi|319979020|gb|AEUH01000057.1| GENE 6 3726 - 4778 1608 350 aa, chain - ## HITS:1 COG:Cgl1256 KEGG:ns NR:ns ## COG: Cgl1256 COG0473 # Protein_GI_number: 19552506 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Corynebacterium glutamicum # 4 346 1 334 340 415 66.0 1e-116 MTTLNIAVIPGDGIGQEVVPEALKALARALEGTGTDIATTHFDLGARRWHRTGETLTEAD LARIKAHDAILLGAVGDPSVPSGVLERGLLLRLRFALDHYVNLRPSKYYDGVPTPLADPG DIDFVVVREGTEGLYAGNGGALRVGTPHEIATEVSVNTAYGVERVVRYAFALARKRRKRL TLVHKHNVLVNAGHLWRRIVESVGREYPDVETGYCHIDAATIHMVTDPGRFDVIVTDNLF GDILTDEAGAVTGGIGLSASGNLNPSGEFPSMFEPVHGSAPDIAGQGKADPTAAIASVAL MLDFLGHGGPAGRVEAAIARDMAARAEANRAGEPLARTTEQTGDAITALI >gi|319979020|gb|AEUH01000057.1| GENE 7 4891 - 5008 63 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRPSPGYAHKDPYLRNTRRTQGRDQALVCGPVFPNSLPC Prediction of potential genes in microbial genomes Time: Thu May 12 17:19:59 2011 Seq name: gi|319979018|gb|AEUH01000058.1| Actinomyces sp. oral taxon 178 str. F0338 contig00058, whole genome shotgun sequence Length of sequence - 847 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 316 121 ## 2 2 Tu 1 . 
- CDS 193 - 846 488 ## COG0438 Glycosyltransferase Predicted protein(s) >gi|319979018|gb|AEUH01000058.1| GENE 1 2 - 316 121 104 aa, chain + ## HITS:0 COG:no KEGG:no NR:no KDPYLRNTRRTQGRDQALVCGPVFPNSLPCVRHRGRRGPGPRGEERLGQGIRVCPAPALF WDAPYAMAAYSSMSAPTQRRMSYWAPTPLYREALARRRASSSSQ >gi|319979018|gb|AEUH01000058.1| GENE 2 193 - 846 488 217 aa, chain - ## HITS:1 COG:HP0421 KEGG:ns NR:ns ## COG: HP0421 COG0438 # Protein_GI_number: 15645049 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Helicobacter pylori 26695 # 37 206 201 364 389 113 35.0 3e-25 GTGTAEPAEGAPPAGAGAPPSEDSPPDGTARPGAGGTRYRVLVVGRFSREKDQGTVLRAM RHSANAHRIDLVFAGRGPTGARLRRDAARLVRSGVLTRPPRFAFLDADGLAAQARAADLY VHSATIEVEGLSCLEILRHGVVPVIARSPHSATAQFARDPRCLYRAGDARGLARAIDYWL EDDARRRASASRYRGVGAQYDIRRCVGALIDEYAAIA Prediction of potential genes in microbial genomes Time: Thu May 12 17:20:08 2011 Seq name: gi|319979010|gb|AEUH01000059.1| Actinomyces sp. oral taxon 178 str. F0338 contig00059, whole genome shotgun sequence Length of sequence - 8198 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 700 600 ## COG0438 Glycosyltransferase 2 2 Op 1 15/0.000 - CDS 1015 - 2040 1784 ## COG0059 Ketol-acid reductoisomerase 3 2 Op 2 32/0.000 - CDS 2043 - 2555 828 ## COG0440 Acetolactate synthase, small (regulatory) subunit 4 2 Op 3 . - CDS 2555 - 4405 2509 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 5 3 Tu 1 . + CDS 4673 - 6265 2266 ## COG2985 Predicted permease + Term 6351 - 6398 16.1 - Term 6341 - 6382 13.2 6 4 Op 1 . - CDS 6460 - 6696 211 ## 7 4 Op 2 . 
- CDS 6693 - 8063 1907 ## COG2966 Uncharacterized conserved protein Predicted protein(s) >gi|319979010|gb|AEUH01000059.1| GENE 1 1 - 700 600 233 aa, chain - ## HITS:1 COG:jhp0963 KEGG:ns NR:ns ## COG: jhp0963 COG0438 # Protein_GI_number: 15612028 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Helicobacter pylori J99 # 14 197 3 183 389 91 31.0 1e-18 MGEGPPAPARRLRILFVINNLYTTGNGLAASARRTIALLRDAGHDVRVLSSGSADECRRA GLPAPDYPLPAARVPLVHRIIRSQGYAFAHAERKAVRAGVQWADVVHLEEPFGLQARAAA AARAAGTPCLGTYHIHPENITATIGLADLRPLNEAVLASWVRRVYRHCAVVQCPSENVRE RLAPLGIGARLVTISNGVPAPAPAVAAARTAADADSGAPPPPTGTGTAEPAEG >gi|319979010|gb|AEUH01000059.1| GENE 2 1015 - 2040 1784 341 aa, chain - ## HITS:1 COG:SA1861 KEGG:ns NR:ns ## COG: SA1861 COG0059 # Protein_GI_number: 15927631 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Staphylococcus aureus N315 # 1 322 1 321 334 389 63.0 1e-108 MAEVIYDDGADLGLIASKKVAVIGYGSQGHAHALNLKDSGVDVIVGLRPGSSSWEKAEAA GLEVADIPEATAAADVVMILTPDQVQRFVYAEQIAPNLKEGAALFFSHGFNIRYGYIDPG EGHDVCMVAPKGPGHTVRRQYLDGRGVPAIVAVEQDASGEAWALALSYAKGIGATRAGVI KTTFTEETETDLFGEQAVLCGGVSHLIQYGFETLVEAGYQPEIAYFEVCHEMKLIVDLIN EGGITKQRWSCSDTAEYGDYVSGPRVIGPDVKEAMKGVLADIQDGTFAKRFIADQDNGGL EFKALRAEEEAHPIEKTGRELRAKFGWSQVDADYTEGSAAR >gi|319979010|gb|AEUH01000059.1| GENE 3 2043 - 2555 828 170 aa, chain - ## HITS:1 COG:MT3082 KEGG:ns NR:ns ## COG: MT3082 COG0440 # Protein_GI_number: 15842560 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Mycobacterium tuberculosis CDC1551 # 5 163 6 164 168 169 59.0 3e-42 MANRHTLAVLVENKPGVLTRVAALFARRAFNIKSLAVGETEHPEISRMTIIVDADEAPLE QVVKQLNKLVNVLKVVEMEPEESVERRLLLMKVAAGDATRTGVLQIVELFRAHVVDVQHD SVVIESVGSLSKLEALLAALEPFGVSELVQSGAVAIGRGSRSITDQLKEK >gi|319979010|gb|AEUH01000059.1| GENE 4 2555 - 4405 2509 616 aa, chain - ## HITS:1 COG:ML1696 KEGG:ns NR:ns ## COG: ML1696 
COG0028 # Protein_GI_number: 15827903 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Mycobacterium leprae # 29 615 32 620 625 684 59.0 0 MDGSHASSPSRKRRGFFHRNDQISDMRGPAVTTTAQPIPMTGAEAIVASLEALGVTDVFG MPGGAILPTYDPLMSSKLNHVLVRHEQGAGHAAEGYALATGRVGCALVTSGPGATNTLTA LADAHMDSVPIVVISGQVGASLIGTDAFQEADVVGASMPVTKHSFLVTDASQIPARIAEA FHLAATGRPGPVLVDIAKSAQVAQMEFSWPPALDLPGYRVAGKPNMKQVRAAARAICDAQ APVLYVGGGVVRSGSSGPLARLVEASDAPVVTTLTARGAFPDSDRHNLGMPGMHGTVAAV GALQRADLIVALGTRFDDRVTGKLDAFAPRARVVHIDIDAAEISKNRHADIPIVADLAKA LPVLADEVARASSEGKSDITGWWRYIDRLRKRYPIGWSEPEDGRIAPQRVLARLSALAGP DAVYVTGVGQHQMWAAQFIAYERPRSFLSSSGAGTMGYCVPAAMGAQVGVPDRVVWAIDG DGSFQMTNQELATCSINNIPIKVALINNGVLGMVRQWQSLFFKKRYSNTTLNPADGDQIP NFEMLATAYGMAARTVRSADEVDEAIEWAMGINDRPVLIDFRVSPDAMVWPMVAAGVSND EIRYAKGMAPNWEVED >gi|319979010|gb|AEUH01000059.1| GENE 5 4673 - 6265 2266 530 aa, chain + ## HITS:1 COG:Cgl2162 KEGG:ns NR:ns ## COG: Cgl2162 COG2985 # Protein_GI_number: 19553412 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Corynebacterium glutamicum # 8 527 4 534 539 223 32.0 6e-58 METVFEILAKQPVLFLFLLVGIGMAFGHIKIRGIGLGAAAVLFFAIVIAAWAQSYGVELR VTPQLGTLGLTLFTFAIGINSGASFFHNLKTAIGPILTMVALYAVAAIGGLYVGKALGME VPLIAGTFAGALTNTPALAAAGDASGNAPMATVGYSIAYLFGVIGMLAASMAALSYGKND KDAPSPLSNRTIRVEREDHPFVGDIYERLGEKITFSRLRRGETGPITRPSMSDTLDPGDL VTVVGPRELVARAAVELGHASSHSLMQDRSYLDFRRMTISNPRVAGRTVASLGLTKEFSA TISRVRRGDVDMVAEPGLVLQEGDRVRVVAPTSRMNDITKFFGDSSRGLTDLNPIAFGLG MALGIAIGELPILTPSGDYFSIGSAAGTLIVGLIFGRIGRIGPIATALPFTTCQVLSELG LLIFLAQAGATAGGQILEAFTSGAWIRIFLLGAMMTSIMAIGLYVTMRWVFRMGGTRLAG LLGGAQTQPAVLAFANGRTNADPRVALGYALVYPVAMVGKIVVAQILGGA >gi|319979010|gb|AEUH01000059.1| GENE 6 6460 - 6696 211 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MRFTLEMSAKPLGLSCSYDMTAQEISAVGRNIAITVAAITLSQIAVALILRRVRARGSRC RCEPPEERSVNGGEGRGS >gi|319979010|gb|AEUH01000059.1| GENE 7 6693 - 8063 1907 456 aa, chain - ## HITS:1 COG:YPO0484 KEGG:ns NR:ns ## COG: YPO0484 COG2966 # Protein_GI_number: 16120814 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Yersinia pestis # 28 282 13 261 266 77 23.0 5e-14 MAEAEPTKPHHPSPEGAGDILNQRIAAHDRLASQSQVVLRLGQMLLSFGASAYRVKKSMA DLARAVGISDHRAQVTYTEIIATAYANGTFRTELAEQRLMGVNADKIDRLNNYVVSLRGR SVRVEDVSDELDAVARIPALYDWFSNALASGLACAAFAFLNGGGWVECSAVAVASFFGQA LRRQMLHRHMNHFGVWMACGALASLIYILLVAPAQHFLGVEATHQAGFISALLFLVPGFP LVTGLIDLVRQDFQGGIGRLVYVFMLVASAGVAVWAVSAVFGWSVAPQYTIELVPWLHFL LRFLTSFVAAYGFAILFNSPHRVCFAAALIGAVINTGRIALVLLAGMPVPAAVGLAALAA GLLAVAVAKRTRFSRVTLSVPAVVIMIPGVPLYRALTNLNNQQIDDALSALFTVLFTIVA IGMGLALSRMLTDKNWLMESQERVPNLWDYEKEEDA Prediction of potential genes in microbial genomes Time: Thu May 12 17:20:18 2011 Seq name: gi|319979009|gb|AEUH01000060.1| Actinomyces sp. oral taxon 178 str. F0338 contig00060, whole genome shotgun sequence Length of sequence - 5751 bp Number of predicted genes - 4, with homology - 0 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + 5S_RRNA 86 - 180 85.0 # Z50065 [R:1..121] # 5S ribosomal RNA # Saccharomonospora cyanea # Bacteria; Actinobacteria; Actinobacteridae; Actinomycetales; Pseudonocardineae; Pseudonocardiaceae; Saccharomonospora. 1 1 Tu 1 . - CDS 143 - 484 182 ## - Prom 517 - 576 80.4 - LSU_RRNA 504 - 1536 90.0 # AM420293 [D:1406414..1407578] # 23S ribosomal RNA # Saccharopolyspora erythraea NRRL 2338 # Bacteria; Actinobacteria; Actinobacteridae; Actinomycetales; Pseudonocardineae; Pseudonocardiaceae; Saccharopolyspora. 2 2 Tu 1 . - CDS 1778 - 1855 80 ## 3 3 Tu 1 . + CDS 3363 - 3845 131 ## - SSU_RRNA 4056 - 5553 99.0 # AF287750 [D:1..1496] # 16S ribosomal RNA # Actinomyces sp. 
oral strain B27SC # Bacteria; Actinobacteria; Actinobacteridae; Actinomycetales; Actinomycineae; Actinomycetaceae; Actinomyces. - Term 5592 - 5648 11.1 4 4 Tu 1 . - CDS 5664 - 5735 59 ## Predicted protein(s) >gi|319979009|gb|AEUH01000060.1| GENE 1 143 - 484 182 113 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRGPRQNSGSIGQRWKRREALASRLVLMGRHHTNPPTTKPGGQRHHDNPRPQAHNGPTGP TRGRRRNHYTNHDQPTPHPTHHTAGQQEGGTDDTILWWSQHRGNAQQHTEPGS >gi|319979009|gb|AEUH01000060.1| GENE 2 1778 - 1855 80 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRRWLCVFKAVARPPGKSGGHVGEG >gi|319979009|gb|AEUH01000060.1| GENE 3 3363 - 3845 131 160 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVAQHNPGRVPPFGHPRIKARSPTPRGISQAATSFISSWCQGIHRMPHKTTHNKKHAHTP ATGRARGPHAPPTRPTANAGQNKKGGRAQTTKMLATTIQITNNPPQPPQPPTKEGSRRKG RTQHTPHQNRRRQHPPTGGRDPPPREEAAAPQRGGHGCWT >gi|319979009|gb|AEUH01000060.1| GENE 4 5664 - 5735 59 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFVNSIVCLFVYSLFCFCACFCF Prediction of potential genes in microbial genomes Time: Thu May 12 17:20:43 2011 Seq name: gi|319978999|gb|AEUH01000061.1| Actinomyces sp. oral taxon 178 str. F0338 contig00061, whole genome shotgun sequence Length of sequence - 13464 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 260 - 306 4.2 1 1 Tu 1 . - CDS 389 - 1816 2027 ## COG0281 Malic enzyme - Term 2173 - 2227 8.2 2 2 Op 1 21/0.000 - CDS 2255 - 3751 2011 ## COG0064 Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) 3 2 Op 2 31/0.000 - CDS 3751 - 5256 1945 ## COG0154 Asp-tRNAAsn/Glu-tRNAGln amidotransferase A subunit and related amidases 4 2 Op 3 . - CDS 5256 - 5552 496 ## COG0721 Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit 5 3 Tu 1 . + CDS 5551 - 6837 1770 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes 6 4 Op 1 . 
- CDS 6901 - 10653 537 ## BRADO6395 hypothetical protein 7 4 Op 2 . - CDS 10746 - 13076 3101 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) Predicted protein(s) >gi|319978999|gb|AEUH01000061.1| GENE 1 389 - 1816 2027 475 aa, chain - ## HITS:1 COG:alr4596 KEGG:ns NR:ns ## COG: alr4596 COG0281 # Protein_GI_number: 17232088 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Nostoc sp. PCC 7120 # 6 444 9 441 463 408 52.0 1e-113 MRTSPSYTASYRIEVDEATTTAVQIVDTVAATGAEVKGLDVADSDRGRIVIDLTCDMRDS AHRAQVRDALAALPGVKTESVADQTFMSHIGGKIEIHSKVPLRNRDDLSRAYTPGVARVC TAIHDMPQKAHLLTMKANTVAVVSDGTAVLGLGDIGPEAALPVMEGKAVLFKEFGGVDAW PVVLDTKDTEEIIAIVKAIAPGYGGINLEDISAPRCFEIEERLRSELDIPVFHDDQHGTA IVVLAALLNALKIVGKRIEDVRIVVSGVGAAGNAIIRLLLAQGAKDIVGYGRTGALAADQ TEGMHPTRKALAEATNPRLVHGSLKDGLAGADVFIGVSSGNILTPDDVATMADDAIVFAL ANPTPEVDPIGAAAHAAVVATGRSDYPNQINNVLAFPGLFRGLLDAKVKEITTEVLRVAA IAIASVISEDELSPSYIIPGAFDKRVAPAVSRAVRRAVRDLPTVAIPVQPDLLLG >gi|319978999|gb|AEUH01000061.1| GENE 2 2255 - 3751 2011 498 aa, chain - ## HITS:1 COG:MT3089 KEGG:ns NR:ns ## COG: MT3089 COG0064 # Protein_GI_number: 15842567 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) # Organism: Mycobacterium tuberculosis CDC1551 # 3 497 14 507 509 594 64.0 1e-169 MSELMDYDEAASRFDPVLGIEVHVELGTATKMFDAAPNAFGGEPNTFVTPVSLGLPGSLP VVNRRAVEYAIKIGLALNCEIAQYCRFARKNYFYPDLTKAFQTSQSDEPIAHDGHVDVEL EDGTMFRVDIERAHMEEDAGKNTHIGGADGRIQGADYSLVDYNRAGVPLVEIVTRPIEGA GERAPEVAAAYVQALRDIFRALGVSEARMERGNVRADINVSLRPTPDSPLGTRTETKNVN SFRGIAAAVRYEMQRQGAILAGGGSVLQETRHFHEEDGSTSSGREKSDSEDYRYFPEPDL VPIRPDRAWVEELRASLPELPIAKRRRLRAEWGYADKEMRDVINAGALELIEATVGAGCD PASARKWWMGEISRRAKEREVSLDEAGVSPEQVAELQGLVDSGRINDKLARQALEGVLAG EGSPAQVVEARGLEVVSDDGALTAAVQEALDANPDVVEKIKGGKVKAAGALVGAVMKATR GQADAARVRELIMEMVGA >gi|319978999|gb|AEUH01000061.1| GENE 3 3751 - 5256 1945 501 aa, chain - ## HITS:1 COG:Rv3011c KEGG:ns NR:ns ## COG: Rv3011c COG0154 # 
Protein_GI_number: 15610148 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase A subunit and related amidases # Organism: Mycobacterium tuberculosis H37Rv # 1 484 1 485 494 500 60.0 1e-141 MNDLLRKSALELADMLAAGRITSVELTTACLDRIEAVDPRVRAFLHVDREGALATAADVD ARRAAGEDLHRLAGVPIALKDNLVTRGVPTTCASRILGDWRPPYDATVTRKIKEARLPIV GKTNMDEFAMGSSTEHSAFGATRNPWDLERIPGGSGGGSAAAVAAYMVPLALGSDTGGSI RQPGFVTGTVGAKPTYGAVSRYGLIAMASSLDQIGPVTRTVADAAALTELVGGFDPADST SLDEPVPALTGAVRSVAGQGSVEGLKIGVVKELSGDGYEKDVINAFNATVEQLRGTGAEV VEVSCPNLEYALAAYYLIMPAEVSSNLARFDGMRYGIRVEPTEGPVTAERVMAATRGAGF GDEVKRRIILGTHVLSAGYYDAYYGSAQKVRTLVQRDFDAVFERVDVLVSPTAPTTAFKF GEKTDDPMAMYLNDVATIPANLAGIPAMNVPNGVSGEGLPIGIQVLAPARGDLKMYEAAS LIEALSEDVAGRCPAADWEEK >gi|319978999|gb|AEUH01000061.1| GENE 4 5256 - 5552 496 98 aa, chain - ## HITS:1 COG:MT3092 KEGG:ns NR:ns ## COG: MT3092 COG0721 # Protein_GI_number: 15842570 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit # Organism: Mycobacterium tuberculosis CDC1551 # 1 98 1 98 99 79 50.0 2e-15 MSRISTDEVARVAGLAHIALTEEEIARFAGELDVIADAVSKVSEVATPDVPATSHPIPLT NVWREDEAGPTVDRDEVLAQAPASEEGMFLVPQILGEE >gi|319978999|gb|AEUH01000061.1| GENE 5 5551 - 6837 1770 428 aa, chain + ## HITS:1 COG:L187016 KEGG:ns NR:ns ## COG: L187016 COG1502 # Protein_GI_number: 15672942 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Lactococcus lactis # 45 420 91 473 481 197 30.0 5e-50 MCGILTLMGFSQRRQLTKQTWRILRRVGVGIALAQAAAVVTVHAIDRMRVQRIPGGVHGF PALEPSDTRIGGTSARTYTEGTSLYNDMLEAIDGAQHHIFFETFIWRSDEWGARFKEALL AAARRGVDVFCVWDGFGVLNQDPRFYRFPDIPHLYVRRFPALRSGFFTLNIRKTGQDHRK ILVVDGQVGFVGGYNIGDPFAYEWRDTHVRVTGDAVWELENGFVDFWNHFRPRTAPTLPD RGARQWSADVTAAFNLPDRLLYPIRGMYIDALERATDRALITTAYFIPDKEILASLIAAA RRGVRVRVLIPEYSNHIMADWVARPYYGELLREGVEIWLYQHAMVHSKTMTVDGHWSTIG 
TANIDHLSMRGNYEVCMQFHSRELAARMEQIFDNDLTTARRLTIEEWEHRSWLVRVGEYL LHPFEFLV >gi|319978999|gb|AEUH01000061.1| GENE 6 6901 - 10653 537 1250 aa, chain - ## HITS:1 COG:no KEGG:BRADO6395 NR:ns ## KEGG: BRADO6395 # Name: not_defined # Def: hypothetical protein # Organism: Bradyrhizobium_ORS278 # Pathway: not_defined # 14 1244 99 1347 1352 372 27.0 1e-101 MYGASMESNSLLASASDLDGWADTDDAKGAFPELMRRLLAQTPGVSNINIRAHEGTAAPG WDGTATSERSAYLPAGELRFEFGTNQDSKRKANEDYEKRAKEVAGTTDEIFVFATPRNWA GAAAWAKERQEEGVFASVEAFDAHRLEGWLQSKPAVHYWISERLKKPVSGAQTLMAWWER LRRNCTIEVPPGFHAAGRHKEADKLLDLLATNDNVSSIQSAWRDDAIAFCYSVLLDKNTD ALGRTMVVSDAQTWHQLATQISTLILIPIFDNPDIGLALDNGHGVIHPVIEMGSSRDDRD VVRLPKVDRWGGATLLTQTGSDFLTADKLTALARRSMCGFYRRISRDKSRQYPEWAKGRD AVRLLAPLVLVGAWEVGHPGDLDLLSWFVDLDGNEINARLVELVDRYPLDPPFARSGGQW RLVDAMDAAALLLSKLSSSHIERWEKLVNDVLLATDPSEGKSATEVLNALIDGLRPPFSS TLTYHVARGIALAAATSDRKAGFLANVGGCVDRIVKGLVGKRFEDNGGSGFGSLSQSLPF LAEASPRRFLEAIEHWLEQQDAAAQSGLTQADSSPECLLSAAPFLMRALQCLTWSPDYFG RAVRALVGLGDLTVVEADRRDYVEAVTVAVAGWSPFGAGGWQEKTEIVTWMLEKYSSFGW PLVEGLVTQTIGFVHPPKPLYRDWGAEKRSFVAIAHADYRHVVLGAAVRLAGFDAERWLH LLRALKRLPVQDWTASIVAFRKVADQGQWNGDQSLEACSLLRSMVNRYQSSTDALRVWRD EQLRPLVELFAQLEPADDPRRFAWLFDYERYIMIDELTSEDEGFSEVLQKERDSALASVI QGGDAQIRALVEASKNVNYIGEFLARANPELDPCILSWLDGESQTLHKAASAYVWSSAKQ RGMPWVRSILEKGFLRAEGRRRLVASLPCDKAFWRGVEDLDAALSRTYWENVSVFQIKEE DRDEVFDILLEYGCAAQAVALLSRMIDVEQKPKTSQAVRALSLWCESMGRGEKTVSSYEA EELLTWLESEARDHPDLPTLEFRLLACGHDLTPSDALYRILGREPGCFAALVKGLYGSHE GDEGQETRAYRFGCWQVLNHWRCLPELSDDGTIDGERLTRWVEESRTLLDVDGLGDIADT QIGEVLASSPDGNDGMWPAEEVRDLLEDAKSRDLEQGIAVGRYNRRGGTSRGILDGGAQE WCLALRYREHSRKMSARWPRAAAVLNSLAEEYELEAEREDEKAEERADQD >gi|319978999|gb|AEUH01000061.1| GENE 7 10746 - 13076 3101 776 aa, chain - ## HITS:1 COG:MT3094 KEGG:ns NR:ns ## COG: MT3094 COG0272 # Protein_GI_number: 15842572 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Mycobacterium tuberculosis CDC1551 # 13 773 4 677 679 573 45.0 1e-163 
MSEHESFDAIRSRYDQLVAAIEEARRAYYDRDAPTLADAEYDRLYRELEDIERAHPQLRG ADSPTLSVGGSADSAFAPVRHLQQMTSLDDVFSLEELAGWEQRMHEETGEAELPMTAEVK VDGLAVNLLYVDGALRRAATRGDGHVGEDITANVRTIASIPARLDTAAPPRRVEIRGEVY FPVAEFASFNEARVAAGERAFVNPRNAAAGSLRQKDSSETAKRPLAMVAHGVGFVEAGGA FTPPTTQHGWYQLLREWGMPVSPYTRLLTGRGAIEEFIAETGEARHSLAHEIDGVVVKID DLALQRSLGSTSRAPRWAAAYKFPPEEVHTRLLDIRVQVGRTGRVTPYGVMESVLVAGSN VARATLHNAQEVKRKGVLIGDLVVLRKAGDVIPEIVAPVEEARNGSEREFSMPAACPSCG TALVEEKEGDVDLRCPNKAACPAQITERIAHIGARSALDIEGLGDEAALALTQPENNRDQ VAAALVAGHAVVLEDGTRLELADAEDLPHAEQLATADAMLPEPQAPVLTAESALFDLRAE DVRDVMVWRPTTVKGEPTGDWAQVRYFWTKAHRAKKVKGETVYERVGTTASKMLTQMLDE MERAKSQPLARVLVALSIRHVGPTAARALAARFGSMDALRAADVEELAAVDGVGGVIAQS LADWFEVDWHREILRRWSEAGVRMEDEAPEEVPAVLEGLTIVVSGAMPGYDREGAKEAIT ARGGKASGSVSKKTSLVVAGPGAGSKAAKAESLGVPVISEAQFADLLEGGLAAVGL Prediction of potential genes in microbial genomes Time: Thu May 12 17:20:57 2011 Seq name: gi|319978997|gb|AEUH01000062.1| Actinomyces sp. oral taxon 178 str. F0338 contig00062, whole genome shotgun sequence Length of sequence - 1992 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 1992 1752 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins Predicted protein(s) >gi|319978997|gb|AEUH01000062.1| GENE 1 3 - 1992 1752 663 aa, chain + ## HITS:1 COG:BH0975_2 KEGG:ns NR:ns ## COG: BH0975_2 COG1674 # Protein_GI_number: 15613538 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Bacillus halodurans # 135 401 340 601 1217 144 35.0 6e-34 TRLGARIAVAGADAPAGKGPGARSLGFVGPGAHECALWCAGQIAAQTGGARVCTDQGTWT VGAPHAVIHITRSEACPHCAAPGAPGAQDAPVVHCGVGRAFEDLPPWCEQVFTARSAPVS PRWWWQVAASDDGAALPDVVEWEALGEAASGADGTDGDSRGDSDGLTDDGLAAPVGVGTD GEVVLDLVADGPHALVAGCTGSGKSEALLTWLLSMCSHHPPERVRLVLIDYKGGATFAPL SGLAHTECVLTDIDPGATARALRGIGALLADRERQLARIGVPDLRQWAQRHRADPARVAP PPARIVIAIDEFRVLVDSHPETMGVIMRLAAQGRSLGLHLVAATQRPAGAVSASMRANID IRVALRCLTAADSMDVLGDDTAARIPRTPGRAVVTGRGPLQFARTGDARALVERINGRFR SCGRAPLWAPPIPASLTWEDVDARCPGRRALGLFDSVDSSGPAPVVWDGGPIEVQAPRSE ARLAAAWVRSLAARAASLAGLPVHHCSQEPVAWAVSRVAPEDAAACARLLEGVCAHGPCV LAIDDLPAVLTSVSSAMGQLRCDEAWKGVLRGALAAGVVLVVASPGRFTLPSASMGVFST RVLRAQSVDAALHAGIDSRSVVRLGAAEALVEAGGQEAALASVPLDAREPPGGGVRGGGW RVR Prediction of potential genes in microbial genomes Time: Thu May 12 17:20:57 2011 Seq name: gi|319978994|gb|AEUH01000063.1| Actinomyces sp. oral taxon 178 str. F0338 contig00063, whole genome shotgun sequence Length of sequence - 787 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 350 - 577 268 ## + Prom 406 - 465 1.8 2 2 Tu 1 . 
+ CDS 519 - 767 449 ## HMPREF0573_10011 WhiB family regulatory protein Predicted protein(s) >gi|319978994|gb|AEUH01000063.1| GENE 1 350 - 577 268 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGKNNSGSTVRHAALLRQSISQQLLYQVSSRRASSPVTSLVSGTRRNGARVLGHTFRQST MGVLTQALVGPLERR >gi|319978994|gb|AEUH01000063.1| GENE 2 519 - 767 449 82 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0573_10011 NR:ns ## KEGG: HMPREF0573_10011 # Name: not_defined # Def: WhiB family regulatory protein # Organism: M.curtisii # Pathway: not_defined # 1 72 1 72 87 117 77.0 9e-26 MDWRSKAACLTVDPELFFPIGNTGPAIAQAAEAKAVCRTCEVQAVCLQWALDNNQDSGVW GGMSEEERRSLRRRAARARRAS Prediction of potential genes in microbial genomes Time: Thu May 12 17:21:12 2011 Seq name: gi|319978964|gb|AEUH01000064.1| Actinomyces sp. oral taxon 178 str. F0338 contig00064, whole genome shotgun sequence Length of sequence - 27613 bp Number of predicted genes - 31, with homology - 22 Number of transcription units - 23, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 83 - 1543 1852 ## COG3920 Signal transduction histidine kinase 2 2 Tu 1 . + CDS 1645 - 2124 749 ## Kfla_1665 hypothetical protein 3 3 Op 1 . + CDS 2240 - 2674 659 ## COG1765 Predicted redox protein, regulator of disulfide bond formation 4 3 Op 2 16/0.000 + CDS 2676 - 3473 1013 ## COG0207 Thymidylate synthase 5 3 Op 3 . + CDS 3533 - 4003 431 ## COG0262 Dihydrofolate reductase 6 4 Tu 1 . - CDS 4098 - 4814 771 ## COG2755 Lysophospholipase L1 and related esterases 7 5 Op 1 . + CDS 4717 - 5028 66 ## 8 5 Op 2 . + CDS 5037 - 6455 1289 ## COG0477 Permeases of the major facilitator superfamily 9 6 Tu 1 . - CDS 6604 - 7197 469 ## COG1309 Transcriptional regulator + Prom 7193 - 7252 6.2 10 7 Op 1 . + CDS 7282 - 7662 405 ## Lxx16480 hypothetical protein 11 7 Op 2 . + CDS 7918 - 9327 2156 ## COG0114 Fumarase + Term 9345 - 9390 19.2 12 8 Tu 1 . - CDS 9403 - 10104 692 ## - Prom 10155 - 10214 2.1 13 9 Op 1 . + CDS 10025 - 10321 116 ## 14 9 Op 2 . 
+ CDS 10314 - 11207 910 ## COG0583 Transcriptional regulator 15 10 Tu 1 . + CDS 11987 - 12829 914 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 16 11 Tu 1 . + CDS 13038 - 13460 548 ## gi|293189221|ref|ZP_06607944.1| putative protein CrcB-like protein 17 12 Tu 1 . + CDS 13646 - 14239 465 ## gi|293189221|ref|ZP_06607944.1| putative protein CrcB-like protein + Term 14420 - 14468 4.4 18 13 Tu 1 . - CDS 14258 - 15229 880 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 19 14 Tu 1 . + CDS 15541 - 15780 241 ## cauri_0024 hypothetical protein - Term 15810 - 15838 1.3 20 15 Tu 1 . - CDS 15956 - 16279 88 ## 21 16 Op 1 . + CDS 16239 - 16427 105 ## 22 16 Op 2 . + CDS 16468 - 17007 443 ## gi|293189221|ref|ZP_06607944.1| putative protein CrcB-like protein 23 17 Tu 1 . + CDS 17178 - 18782 2120 ## COG0025 NhaP-type Na+/H+ and K+/H+ antiporters - Term 18720 - 18747 -0.8 24 18 Tu 1 . - CDS 18810 - 19427 758 ## 25 19 Tu 1 . - CDS 19522 - 20082 391 ## 26 20 Tu 1 . - CDS 20301 - 21836 2265 ## COG0833 Amino acid transporters 27 21 Op 1 . - CDS 22026 - 22805 300 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 28 21 Op 2 . - CDS 22815 - 25124 2247 ## Cphy_1444 hypothetical protein 29 21 Op 3 . - CDS 25106 - 25786 819 ## COG1309 Transcriptional regulator - Prom 25852 - 25911 1.8 30 22 Tu 1 . + CDS 26079 - 26168 86 ## - Term 26027 - 26077 14.3 31 23 Tu 1 . 
- CDS 26159 - 27613 995 ## Predicted protein(s) >gi|319978964|gb|AEUH01000064.1| GENE 1 83 - 1543 1852 486 aa, chain - ## HITS:1 COG:ML0803 KEGG:ns NR:ns ## COG: ML0803 COG3920 # Protein_GI_number: 15827350 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mycobacterium leprae # 11 485 9 494 500 263 37.0 8e-70 MRSLMQLAEEASSTPLPDKDIDWLHLLLADWQVLADLAAADLVLWLPSDDGRFIAAAHCR PATSTTVHVDDIIGLHLPAAREIELRRAMQTGQVIRSQAARWAGTYSMMETCVPVCHDGT IIAVVTREANLSSPRLSLGFEGWTVAAADTLCQMMARGEYPYDSTPQVSAHGVPRVLDGA LLLDAEGRVQQATPNAVSCLRRLGIRSHVTGKVLAQEVTNVIGDDSLIEESMAVVVMGRA SWRVEITARASTITMRALPLVNGRKRLGAVVLTRDVSEMHRHRQELMTKDATIREIHHRV KNNLQTVSALLRLQSRRSREDGVKAALAEAERRVQAIATVHAALSHNVNESVDFDEVART VLRMSGAVASTDHSVKVATSGRFGVIQADQAQALATVLNELVANSVEHGLADCDGRIEVR AERRGDTMTVTVADDGAGFEPGTPMTGLGTQIVKQMVSGELKGNIEWALREGGGTVVTIV MHLDRP >gi|319978964|gb|AEUH01000064.1| GENE 2 1645 - 2124 749 159 aa, chain + ## HITS:1 COG:no KEGG:Kfla_1665 NR:ns ## KEGG: Kfla_1665 # Name: not_defined # Def: hypothetical protein # Organism: K.flavida # Pathway: not_defined # 1 154 5 164 169 63 29.0 3e-09 MEFTQSATYPLDVDSVIRMSTDPAFLDMRFGRFMAANPEVTVEGDTITYVGTVNPELIPA IARPFTKNGITLTFTEAWTRTEEGATAHTEVAVGGAPVSAAVDSTLTPDGDGCARALNGT ITVNVPLVGGRVEKEAVARVGHAFDREHQLAAEWAAKQG >gi|319978964|gb|AEUH01000064.1| GENE 3 2240 - 2674 659 144 aa, chain + ## HITS:1 COG:MT2992 KEGG:ns NR:ns ## COG: MT2992 COG1765 # Protein_GI_number: 15842468 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Mycobacterium tuberculosis CDC1551 # 11 123 4 118 137 90 42.0 1e-18 MSDSTKPQPSLYLERTGVREYVARNQDGAEVRVGHGPGCFSPGDLLKLAIAGCNAMSSDT RLVSRLGDDFAQFVGVSADYDEDADRFTHVEVELVQDLSALDEQEQADLIRRADAAIKRN CTIEHSVVDRALPSHHAFTSERID >gi|319978964|gb|AEUH01000064.1| GENE 4 2676 - 3473 1013 265 aa, chain + ## HITS:1 COG:Cgl0821 KEGG:ns NR:ns ## COG: Cgl0821 COG0207 # Protein_GI_number: 19552071 # Func_class: F 
Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Corynebacterium glutamicum # 6 265 7 266 266 405 72.0 1e-113 MVDRQYEDLLARVMAEGVAKGDRTGTGTRSVFGAQLRYDLSRGFPLITTKRVHLKSVVGE LLWFLSGSSNVGWLQEHGIRIWNEWADEGGELGPVYGVQWRSWNAGGGTRIDQISNVLET LRTDPDSRRMVVSAWNVGDLPQMALEPCHAFFQLYAAHGRLSLQLYQRSADLFLGVPFNI ASYSLLTHMFAQQAGLQVGDFIWTGGDCHIYSNHTEQVAEQLSREPYPFPRLELAKAPSM FDYSFDDVSVEGYQHHPTIKAPVAV >gi|319978964|gb|AEUH01000064.1| GENE 5 3533 - 4003 431 156 aa, chain + ## HITS:1 COG:BMEI0609 KEGG:ns NR:ns ## COG: BMEI0609 COG0262 # Protein_GI_number: 17986892 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Brucella melitensis # 1 103 24 126 172 108 51.0 4e-24 MLWHVPADFAHFKAATMGCPIIMGRHSWEALGGPLPGRANIVITGSREYRADGADLVHSL DEALERARSHAEASGAATIWIIGGASVYEQAMPLVDELVVTDIDCDATAGHDGPVVYAPG IDASAWDRDEERSDAQWREPSGGSRWRVTTWVRARR >gi|319978964|gb|AEUH01000064.1| GENE 6 4098 - 4814 771 238 aa, chain - ## HITS:1 COG:PA3750 KEGG:ns NR:ns ## COG: PA3750 COG2755 # Protein_GI_number: 15598945 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Pseudomonas aeruginosa # 39 224 47 232 249 67 33.0 2e-11 MNPVKRTILLAQAAWAARTVRLAPEPEGERSGIARERLGATLSEYLRLVAVGDSLVAGSG ASSQATALTPRIADRVAAATGAPVIWETHARLGSTMRRVRHRFLDEVEGRPDILFVCAGS NDIMARRTRAEWADDLEAVLGRSLEMAAHVVLCSAGQPHNSPRLPAMLRRELAKRIDAQT ADSKSICARLGVDYADVAHADLVEGFWASDGFHPSEAGYEQAAGMVVGAMGFLPRLRR >gi|319978964|gb|AEUH01000064.1| GENE 7 4717 - 5028 66 103 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPERSPSGSGASRTVLAAQAAWASRIVLFTGFTVADPNTRRAPGVRGPLPRRTVEGGAPR PGRSERRHLLPFGARVATGHPGTTERMGTPAPPTSTDSSGGGA >gi|319978964|gb|AEUH01000064.1| GENE 8 5037 - 6455 1289 472 aa, chain + ## HITS:1 COG:CC0751 KEGG:ns NR:ns ## COG: CC0751 COG0477 # Protein_GI_number: 16125004 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction 
only # Function: Permeases of the major facilitator superfamily # Organism: Caulobacter vibrioides # 23 442 29 407 450 70 26.0 1e-11 MPHPDHPSSRSRSPVLADPTLRVLVAVALFTYTAQNMLNVSIAPLSRALDLPEWIVGAAV SLAALAVALLSQFWGRRSIAWGRRRVLLCALFLALVAGALFSTAVWLRAGGGIGAAWAAG AIMAARGPFFGASVAAIPPTGQALVAEVTHDEASRVRGASAFSGAINMSVMIGSVVSSLL GAWWIFAPVHATPWFVVVALVIAWAGVPHDGAGVDQGTPSPGAPSESGATGCGSPSAGAP SESGATRCGSPSAGARAVPLPSRAPRPARPLPPRVSWTDPRIVPWVLGVIGLYFAAGVTQ ILMGFLIQDRVGAAPDRAVSLTGLMLLLSAAGAMTAQLLVVPRLQWPPRRLLRAGVSVGV AAVAVLTAVSALVALAVAMLFMGVATGLTSTGFTAGASLAVRDDEQGGVAGLVSATGAIT WIFAPVSATALYGWQPLAPFALALALVGASCAAAWVHPGLRKPSAPLLDGGS >gi|319978964|gb|AEUH01000064.1| GENE 9 6604 - 7197 469 197 aa, chain - ## HITS:1 COG:CC1190 KEGG:ns NR:ns ## COG: CC1190 COG1309 # Protein_GI_number: 16125441 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Caulobacter vibrioides # 5 168 18 184 211 67 28.0 1e-11 MNPRDSVMTRAAIVDAASKLLEDSGPEAVTLRAVGEAAGVSRSAPYRHYADKAALMRALA ERALRQIARRIRHGAEHHQGEWQRLRAGCQAYIDYAVERPHHYQLVFGDTPIAEPDPGLE EAADNAMAAVGELVAQAQDAGLLRPGPTREIATIVWVLLHGLAALQITGHLHEPRTIDGD EHLADLLNLALEQLRPS >gi|319978964|gb|AEUH01000064.1| GENE 10 7282 - 7662 405 126 aa, chain + ## HITS:1 COG:no KEGG:Lxx16480 NR:ns ## KEGG: Lxx16480 # Name: not_defined # Def: hypothetical protein # Organism: L.xyli # Pathway: not_defined # 43 126 2 85 85 119 70.0 3e-26 MTDSIQANADFDAAYLRTWTEPDAGKRLALIEQVWAPGGSLHISSPGLSLTGTTDIAAHI ERVHNDLIANKGLTFSYDQRAESGDALLLRWSMTAPNGDVVGRGVDTVFRNTDGKVVNAY MFMGVN >gi|319978964|gb|AEUH01000064.1| GENE 11 7918 - 9327 2156 469 aa, chain + ## HITS:1 COG:DR2627 KEGG:ns NR:ns ## COG: DR2627 COG0114 # Protein_GI_number: 15807607 # Func_class: C Energy production and conversion # Function: Fumarase # Organism: Deinococcus radiodurans # 5 463 4 463 464 674 74.0 0 MADNTRTESDSMGTVEVANDRYWGAQTERSLHNFDIGRNTFVWGRPMVRALGILKKSAAL ANAELGELPRDIADLIARAGDEVISGKLDDHFPLVVFQTGSGTQSNMNANEVISNRAIEL AGGALGSKKPVHPNDHVNRGQSSNDTFPTAMHIAVVSELHDMYPRVRQLRDTLDGKAKEF 
EDVIMVGRTHLQDATPIRLGQVISGWVAQIDFALDGIEYADSRARELAIGGTAVGTGLNA HPKFGGLTAEKISEETGIAFKQADNLFAALGAHDALVLVSGALRVLADALMKIANDVRWY ASGPRNGIGELLIPENEPGSSIMPGKVNPTQCEAMTMVATQVFGNDATVGFAGSQGNFQL NVFKPVMAWNVLESIRLLGDACVSFDTHCAYGIEPNRERIQHNLDINLMQVTALNRHIGY DKASKIAKNAHHKGLSLRESALELGFLTEEQFDAWVVPADMTHPSAADE >gi|319978964|gb|AEUH01000064.1| GENE 12 9403 - 10104 692 233 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDDTEAKSPVLPGQRSMWWPKWDSYIAWAPLEMGYRALWGCAQSLEPQERTWVRRQPTI FQRGALVERILLHALFDEARADGRITSKRGIAEVGDGAPMASAMFTVPGWHAAASYCQAL VCVAISDKRVGVDCDVLAAAERIALLPSIFTVAERMHLNEHPGDVVNLWTAKESLLKLKR ARIDIDPADPAYDVINPRGARIISWNPTPRTVASLSVEDGVDEAPIPLVLPGA >gi|319978964|gb|AEUH01000064.1| GENE 13 10025 - 10321 116 98 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYESHFGHHMERWPGSTGLLASVSSDMAMTLLPGRVQEPGPVPFHTFNNFVAFLSPYLES MRRINTFDESMTHLPPPRPHALGVLGAPTAPSYTDWYE >gi|319978964|gb|AEUH01000064.1| GENE 14 10314 - 11207 910 297 aa, chain + ## HITS:1 COG:Cgl0222 KEGG:ns NR:ns ## COG: Cgl0222 COG0583 # Protein_GI_number: 19551472 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Corynebacterium glutamicum # 1 252 1 248 254 141 37.0 2e-33 MNDRLSLEGLRCAAAVHRCGSFSGAARECGLTQPSVSAAVRRIEGFMGHPLFERSAQGVA LTPFGEAVIGLIEDAAAAADRVAAASRKGKAAPVHLAIGASTLVPRSLIEGLRRAARAHP RLSADSLALSEYDLLDLQLQLETGGIDAALVPAVMPATRYRKREIGSEAMRLVDSPGQPH SGAGPVTAAEASARTFVLMAPMCGLSLFTEAFFRSAGLPFKEHPDRAASYRELEDRVALG LGSAILPESKVTRQDLVGHDLVGPDGNPVVMRYRAYWRDDCPMAQLIEEVTRSLGAE >gi|319978964|gb|AEUH01000064.1| GENE 15 11987 - 12829 914 280 aa, chain + ## HITS:1 COG:SA0314 KEGG:ns NR:ns ## COG: SA0314 COG2110 # Protein_GI_number: 15926027 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Staphylococcus aureus N315 # 49 278 40 261 266 180 45.0 2e-45 MTTPATGTGTGKADSAARSRLVRRAVEILCSEQGTEPPADLDPGAQRALLRALMNTREPS PLDPEYLEVEGAILDAEARARGTASWEDAEASPIHPRLALWRGDITRLAVDAIVNAANSA 
LLGCRVPGHACIDNAIHSAAGLQLREACARIMALRRAAGLGPEPTGGAEITPGFHLPARH VLHTVGPIVSGRLTDAHRAALASSYRSCLGLAASHGLRTVALCCVSTGVFGFPQDEAARI AVSTTAAFLDSAAPGASRMRVVFDVFGARDEELYRRELGL >gi|319978964|gb|AEUH01000064.1| GENE 16 13038 - 13460 548 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189221|ref|ZP_06607944.1| ## NR: gi|293189221|ref|ZP_06607944.1| putative protein CrcB-like protein [Actinomyces odontolyticus F0309] # 1 139 1 138 175 128 51.0 1e-28 MALLKDPRGILTFTFRYLALHVAFLIGLPISREMDVLLFYFFGFMGFFTFIQVLVSISQK FLVHGGVLKFVAVLCAIATDVPILWYFGWVLKGLFPGGNDWSFVLNWMPFHFGAAFAAWG LRELIHMLSSQRTNHPGLPR >gi|319978964|gb|AEUH01000064.1| GENE 17 13646 - 14239 465 197 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189221|ref|ZP_06607944.1| ## NR: gi|293189221|ref|ZP_06607944.1| putative protein CrcB-like protein [Actinomyces odontolyticus F0309] # 1 195 1 171 175 149 47.0 1e-34 MARVKDPRGILTFIFRYLVLHVVFLLGLPYSLGVGLALFYFFGFMGFFTFIQVLASISQK FLDHAGAWKAVVIMGAMASDVAVLWWLGRMLGGLFSGSDWEFVVSWLPLHFLAAFLAWGL REIIHAVKTGGTAEPRNPSGPIIPSRPIDPRASVDPGGPADPGGPADPGGSTSPSSSEDA DAPTGGPASIDANTDDR >gi|319978964|gb|AEUH01000064.1| GENE 18 14258 - 15229 880 323 aa, chain - ## HITS:1 COG:XF2725 KEGG:ns NR:ns ## COG: XF2725 COG0610 # Protein_GI_number: 15839314 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Xylella fastidiosa 9a5c # 133 320 818 1006 1007 176 47.0 6e-44 MCAKRAPLKDNAQIPQERTLSEAQQSDAEAFFTQMRSVLPVQGVQVFREAHPLPREEEST PAEDSPVFRCVLSKRGIDTRARVVGGEFTVLQGSRVTPEVTTPRRYQRVHVLHSQLVADG TIRVEGGHGEPARDIEDYRSAYLDLYQEMRQHDAADKEPINGDLVFEIELVKQVEVNVDY VLMLVEQHRASRGDGNDVEIPVEITRALKSSPSLRDKRDLIEDFYRRVSLSGDVSTEFAR YVAERREAELEGIIERENLRENASEFAHRALAEGFVSEEGTGLSSILPPMSRFARGGGRA EKEARVLDALRAWVSRFMNLGGW >gi|319978964|gb|AEUH01000064.1| GENE 19 15541 - 15780 241 79 aa, chain + ## HITS:1 COG:no KEGG:cauri_0024 NR:ns ## KEGG: cauri_0024 # Name: not_defined # Def: hypothetical protein # Organism: C.aurimucosum # Pathway: 
not_defined # 1 75 4 78 81 80 54.0 1e-14 MNGHPVSEAQIDDWVAEAEHGYDVEALRRRGRPRRGDQPARTISVRLTDDEIDGIDRYAD KNGGTRSTVIRDALRAFIS >gi|319978964|gb|AEUH01000064.1| GENE 20 15956 - 16279 88 107 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAATQTMMSSSRLMGVADLHGNGLRLSTVCRLPSARLAEGRSDAKRTRGGGCAGWMRGPV PRIRGRGWGRVPWDRAEAARAAGSGRAYSSRRRYVLRAFIAWMISSV >gi|319978964|gb|AEUH01000064.1| GENE 21 16239 - 16427 105 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSLELLIIVCVAAIVATAAAGRRYRNSNTGSRGPAVLAAKDGRRAAALGQTISRTPAYRD GP >gi|319978964|gb|AEUH01000064.1| GENE 22 16468 - 17007 443 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189221|ref|ZP_06607944.1| ## NR: gi|293189221|ref|ZP_06607944.1| putative protein CrcB-like protein [Actinomyces odontolyticus F0309] # 1 179 1 175 175 154 51.0 2e-36 MARAKDPRGILTFIIRYLLLHVAFLIGLLLSVSGDLVLFYFFGFMGFFTFIQVLASISQK FLDHAGAWKAVVIMGAMASDVAALWWLGRILYGLLIGSDWELVARWLPFHFIAAIFAWGL REIIHAVNSGGTTKPRAPGSPVDPSGPADPGGSTSPSGSEDANTPTDGPTDTDPNTDSQ >gi|319978964|gb|AEUH01000064.1| GENE 23 17178 - 18782 2120 534 aa, chain + ## HITS:1 COG:MT2345 KEGG:ns NR:ns ## COG: MT2345 COG0025 # Protein_GI_number: 15841778 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters # Organism: Mycobacterium tuberculosis CDC1551 # 21 534 34 542 542 173 31.0 7e-43 MSLELLIIVCVAAIVATALTNRFRLIGPVMLIVAGLVASFAPGAEGAGLPSEVVLTVFLP VLLYWEALSVSLNGMRRALRGIILSATVLVVITSAIIMAAGLAMGLSAGAALLVGACLGP TDATAVAALGKGINRFGRTVLQAESLLNDGTALAVFAIALRVASGGTGVSAASVASEFAI SIGAGLGVGVVCGTAAVLLWAKWNASRTDAMLSNLIALTVPFTTYFLAEELQGSGVLAVV ACGIVYSRYNGNANDSAIRLVGIPFWSVLTYLMNTVLFIMVGLSLPDIVRRLPHQELVRG LVLIPIIYSAMVAARFVGHHAIIFSIRALDRRPQQRERRTNIRGRLVSTVAGFRGAISLA MALSIPGSFGGTAYEERDLVVLITAGVTLLSLVVQGIALPHVVRWANARPTAVQRREAEE LDEETRVLAGIMRDIVEELPSMAQRIGVEDGELVNAIRESYGRREERMLQAASEEEFSFF DEDETALRLECIDFARARVLEARNAGRLDGATASLAIKRLDTEGALLAGPIEME >gi|319978964|gb|AEUH01000064.1| GENE 24 18810 - 19427 758 205 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MVKKLRMMTAALACVALGACAPGGTAQSGTGSQSATGGASGTAQPPATGAPQSGAATASP SLPPPDVEGLPCTAEELQASLGEKTTNGGTTTVTISFWNGSSQECTISGFPQVSFMSSQT NETVGLEAIRDGEDPGNIFKVLPQESAHAILTLDDPAAAGCSTASADLVTVVLPGDTAQV TLSHAGLSVCSDAASAVVGPLEPGA >gi|319978964|gb|AEUH01000064.1| GENE 25 19522 - 20082 391 186 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDCSRTVRFLTIGVAALIGLAACSPLTPKPASGGAAPAATPAANGRTAEAGGPCASGSIS VNAQEDASSGAATRKVDLVVTNTGSAPCSLTGYPGVVFAASGVQIGRPVVESSRVAEATT VSLGPGASAVATLLIIDPSALTSCKEEDFTEIWVSLAAEDDTIGVDFSGKTCSNTTNANI DPYQAR >gi|319978964|gb|AEUH01000064.1| GENE 26 20301 - 21836 2265 511 aa, chain - ## HITS:1 COG:PA4628 KEGG:ns NR:ns ## COG: PA4628 COG0833 # Protein_GI_number: 15599824 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Pseudomonas aeruginosa # 29 497 11 482 487 531 57.0 1e-150 MSENTPSAAGDTGPSIDEVIASKEGRSTQLKRKLSGRHMRMIAIGGAIGTGLFVASGATV SKAGPGGALVSYAAVGIMVLLLMQSLGEMSAHQPIAGSFQTYATKFISPSFGFAMGWNFW FNWAITVAADLVASGLVMSYWFPTVPSWVWAGSFLAFLVMLNALSARVYGEAEFWFATIK VVTVVVFLIAGVAMIFGVMGSAYPGLTNWTLEDGPFHNGFLGVVAVFMIAGFSFQGTELV GIAAGESENPHKDVPKSIRAVFWRIMIFYIGAIAVIGTLLPFTDPNLLAASETNVAQSPF TLVFARLGVGGAAAIMNAVILTSILSAGNSGLYAASRMLHSMALVGQAPKVFSYVSPHGV PVNAMVVTALMGGCAFLTALVGEGVAYTWLVNIVGLSGIVMWLGIAAAHWRFRRAYVRQG YSLDDLPYRSPFFPAGPILAFSMCIIIMIGQVYEAFVQGTYMEVLPSYIGLPLFLIVWLG FHLVKRDRVVALDDMDVKGVRVDDMSRRRRQ >gi|319978964|gb|AEUH01000064.1| GENE 27 22026 - 22805 300 259 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 17 231 2 219 245 120 33 1e-26 MTTATSRNATGGATPALSARRLNKSYGTPVLAGLDLDIADGEFVAIMGPSGSGKSTLMHC LSGMDRPDSGTVTARGREITGLGEKELASLRLTEFGFVFQQAHLMATLNLLDNIVLPGFL AELRPREEVVGRGRALMEEMGIGECASSGVTEVSGGQLQRAGICRALINDPAVLFADEPT GALNYSAATSILDLFGRVHASGATMVVVTHDARVAARADRVLVLVDGGIVEDLRLGPYEE ETAQARLETVAAALHRHSV >gi|319978964|gb|AEUH01000064.1| GENE 28 22815 - 25124 2247 769 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1444 NR:ns ## KEGG: Cphy_1444 # Name: not_defined # Def: 
hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 765 1 775 778 442 34.0 1e-122 MLLRLLKADLARGRAVAAVLVGLVTLAVALASASASLIIDTVSAADRLSERARVPDLVQM HTGEADSRAIAQWADSRADVEDHMVMRTLPVPRAQLSIGGAVQASSYLEPAFVTAPEHID LLLNENGDRVLPGPGQVYLPVHYKAAGLAQAGDMVTVDTGEWRIDLRVAGFLRDAQMNAA MVPSKRLVVNPEDFSSLDEHIGEVEYLIEFDLADGASAAAVSEQYKAAGLPSTGIAVTAA QIQLMNALSTMLIAAVALVVAGVLAAVAVLALRYTVLAALEADLPQVAVLKAIGAPQAGI RRLYSLKYAVIASTGGCAGYALSHPLASSLLAPTTVYLGTPPTTAGGAVVPVLVAAGLVA VVVGSAWLVLRRMGRISAVEALRSGTSGGAGRPRLRWRLSRARLLGPHAWLGVREALRPA NALLMGVLALCAFTAVLPMNVSTTLGDPRSATYLGVGQADLRVDVRSGAADIEEIADDFA ADPRIAKQVVIMRKDYTMRNKDGAWAPVLIDIGDHSAFPVNYMSGRAPLNDQEIALSHSQ AAEAGAAVGATVTVKGSQGERDLAVVGVYQDITNNGLTAKAVFDDATPALWQLMYADVAD GEDAGEAAEDLAREHPGAQIVRMGEYASQLFGATQSQVDVLAAMALAVSLGLAFLITALY SVLVLARERGRIAVLRALGSTVRSIAGQYFMRFGLVAVIGVALGVVMAATAGERAVGLVL ANRGAPDLHLLADPLLMGVVVPAAVLCATAVAVGLALRPLPSMTLTDTD >gi|319978964|gb|AEUH01000064.1| GENE 29 25106 - 25786 819 226 aa, chain - ## HITS:1 COG:CAC3606 KEGG:ns NR:ns ## COG: CAC3606 COG1309 # Protein_GI_number: 15896840 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 13 201 10 198 202 99 29.0 5e-21 MAPKPRSTKAAPSRRAEILDTAQRLFIAKGFQNTSVEDIIAEIGIAKGTLYYHFSSKDEI LRAIIGRTTQRAASAARAVAEGPGGAIGKFAAVVAASRVDQPERELAEELHASGNAQFHI LTIVEMVRALAPVLTDVVEQGIVEGVFSTPHPRETVEILLTSAGMLLDEGIFTGDHDELA RRTRGLVHAAETLLGCEPGTFASMASGGPQPIPHTDAEAPQCCSDS >gi|319978964|gb|AEUH01000064.1| GENE 30 26079 - 26168 86 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGRSAAPGRDGAPGTPIRMALARISAPYW >gi|319978964|gb|AEUH01000064.1| GENE 31 26159 - 27613 995 484 aa, chain - ## HITS:0 COG:no KEGG:no NR:no AVQGQWGQPGGQVQPVQGQWGQPGGQVQPVQGQWGQPGARIQPVAEPQPAWGAQPGGQAG ADVGAHSDAQAQTPGEPWAAQEPQAEAPAAEPPAKPEPAWGTQSDAQVQAGAGAQSEDQA QTGAGAQSEDQVQTGAGAQSDTQAQAPGEPWAAQEPQAEAPAAEPPAKPEPAWGTQSDAQ AQADAGAHSGAQPQAPSGPGAAQDPWAGASGALPHSEPQPVWGAQPDGHAQASQDSPRPQ EPQASFAAPQGQREPWQPQSAEARAPQFGEGQWNPQGANPQAPSYSGQYSSDWSAGPQGQ 
GTRGTTADKGGSGRRLIAFLGLLVASAVAIGSVFLPYLAHEDVKLPLFGVLQGTVLMIGT VPVVVVALASWFLKKRWALIASVVLGGVAAVDIAVVAIWVVTARNGLAVFTLSYGWYIGL FSMLLMVVATVLQALGLKKPQGGAAGSQPGAGGPAQSAQTGGESRLYQPGGTNPYQAGTT NPYQ Prediction of potential genes in microbial genomes Time: Thu May 12 17:23:23 2011 Seq name: gi|319978960|gb|AEUH01000065.1| Actinomyces sp. oral taxon 178 str. F0338 contig00065, whole genome shotgun sequence Length of sequence - 1436 bp Number of predicted genes - 3, with homology - 1 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 333 247 ## 2 2 Op 1 . + CDS 444 - 695 366 ## 3 2 Op 2 . + CDS 692 - 1384 303 ## PROTEIN SUPPORTED gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 Predicted protein(s) >gi|319978960|gb|AEUH01000065.1| GENE 1 3 - 333 247 110 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MASVASAQAYSAAPPGRERSYTIMIDPSRLQASDMANPGLTANDLAIIAQARPDLHPFVV RHPSVYPALVAWIQAQHPGGQVQPVQGQWGQPGGQVQPVQGQWGQPGGQV >gi|319978960|gb|AEUH01000065.1| GENE 2 444 - 695 366 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNAPDPALRTPAMTRILHNPNARFALSLIPATIVLALVVEAVQPGAPMWLVGGVAGAMGY LLRRLTQPRAPKHDRMAEEGTAS >gi|319978960|gb|AEUH01000065.1| GENE 3 692 - 1384 303 230 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 [Bacillus selenitireducens MLS10] # 11 202 18 212 236 121 37 4e-28 MNPYLTEIAWMVLALVLSTLIGLERQIRGKSAGIRTQAIVGLTAALMVLVSKYGFQDVLA DGITRFDPSRVASQIVSGIGFLGAGIILTRHGAIRGLTTAATIWETAAVGMACGAGLWWL AVAGTALHFLIVLVMSPVIQRITKRARGSKTVVAVRYWPGHGMVAELLQQVGALGWQVHR LSSHADGKDSPTVARLVVSTAQPLATSALVASLVDLEGVLGVEVVEDDEE Prediction of potential genes in microbial genomes Time: Thu May 12 17:23:55 2011 Seq name: gi|319978909|gb|AEUH01000066.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00066, whole genome shotgun sequence Length of sequence - 60883 bp Number of predicted genes - 57, with homology - 42 Number of transcription units - 35, operones - 11 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 551 436 ## COG2039 Pyrrolidone-carboxylate peptidase (N-terminal pyroglutamyl peptidase) - TRNA 826 - 902 89.8 # Met CAT 0 0 2 2 Tu 1 . - CDS 1052 - 1894 602 ## gi|154507574|ref|ZP_02043216.1| hypothetical protein ACTODO_00054 3 3 Tu 1 . - CDS 2047 - 5262 4285 ## COG1615 Uncharacterized conserved protein 4 4 Tu 1 . + CDS 5213 - 5770 643 ## Jden_0724 hypothetical protein + Term 5867 - 5901 -0.9 + Prom 5824 - 5883 2.2 5 5 Tu 1 . + CDS 5917 - 6723 866 ## COG0345 Pyrroline-5-carboxylate reductase 6 6 Tu 1 . + CDS 6859 - 8217 1934 ## COG2056 Predicted permease 7 7 Op 1 . - CDS 8314 - 9399 874 ## COG0456 Acetyltransferases 8 7 Op 2 13/0.000 - CDS 9396 - 11009 687 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 9 7 Op 3 49/0.000 - CDS 11006 - 11872 1140 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 10 7 Op 4 38/0.000 - CDS 11869 - 12891 1111 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 11 7 Op 5 1/0.000 - CDS 12902 - 14479 2183 ## COG0747 ABC-type dipeptide transport system, periplasmic component 12 7 Op 6 . - CDS 14634 - 15992 1726 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 13 8 Tu 1 . + CDS 16183 - 18912 3754 ## COG0525 Valyl-tRNA synthetase 14 9 Tu 1 . - CDS 18863 - 19135 241 ## + Prom 18968 - 19027 1.9 15 10 Op 1 . + CDS 19059 - 20429 983 ## SGR_2674 hypothetical protein 16 10 Op 2 . + CDS 20422 - 21048 475 ## xccb100_4194 hypothetical protein 17 11 Tu 1 . - CDS 20806 - 21282 65 ## 18 12 Tu 1 . + CDS 21368 - 21553 147 ## + Term 21704 - 21741 -0.4 19 13 Tu 1 . - CDS 21533 - 22060 723 ## + Prom 22029 - 22088 2.2 20 14 Tu 1 . 
+ CDS 22263 - 22778 -200 ## 21 15 Tu 1 . - CDS 23327 - 24790 1986 ## COG2211 Na+/melibiose symporter and related transporters - Prom 24826 - 24885 4.5 22 16 Tu 1 . + CDS 24789 - 28052 3007 ## COG3250 Beta-galactosidase/beta-glucuronidase 23 17 Tu 1 . - CDS 28111 - 29193 1182 ## COG1609 Transcriptional regulators 24 18 Tu 1 . + CDS 29296 - 31236 2563 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) 25 19 Tu 1 . - CDS 31337 - 31651 159 ## 26 20 Tu 1 . + CDS 31539 - 31922 206 ## 27 21 Op 1 . - CDS 32148 - 32690 461 ## 28 21 Op 2 . - CDS 32785 - 33819 1243 ## 29 22 Op 1 36/0.000 - CDS 34161 - 35363 1653 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 30 22 Op 2 . - CDS 35356 - 36051 324 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 31 22 Op 3 . - CDS 36048 - 37166 1475 ## Ndas_4786 peptidoglycan-binding domain 1 protein 32 22 Op 4 . - CDS 37208 - 37687 663 ## gi|154509326|ref|ZP_02044968.1| hypothetical protein ACTODO_01851 - Term 37843 - 37885 -0.3 33 23 Op 1 . - CDS 37901 - 38239 493 ## COG1605 Chorismate mutase 34 23 Op 2 . - CDS 38266 - 39459 262 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 35 24 Tu 1 . + CDS 39460 - 39612 57 ## + Term 39643 - 39669 -1.0 36 25 Op 1 . - CDS 39669 - 39812 233 ## 37 25 Op 2 . - CDS 39766 - 39867 151 ## 38 25 Op 3 5/0.000 - CDS 39882 - 40526 971 ## COG0740 Protease subunit of ATP-dependent Clp proteases 39 25 Op 4 . - CDS 40526 - 41137 889 ## COG0740 Protease subunit of ATP-dependent Clp proteases 40 26 Op 1 . - CDS 41403 - 41999 734 ## gi|269219679|ref|ZP_06163533.1| conserved hypothetical protein 41 26 Op 2 . - CDS 42030 - 42491 522 ## Ksed_23090 transcriptional regulator 42 27 Tu 1 . - CDS 42660 - 44012 1914 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) 43 28 Tu 1 . + CDS 44176 - 45588 1794 ## COG1236 Predicted exonuclease of the beta-lactamase fold involved in RNA processing 44 29 Op 1 . 
- CDS 45632 - 46303 890 ## COG3548 Predicted integral membrane protein 45 29 Op 2 . - CDS 46454 - 46759 256 ## gi|154509336|ref|ZP_02044978.1| hypothetical protein ACTODO_01861 - Prom 46978 - 47037 75.6 - TRNA 46786 - 46859 84.9 # Pro TGG 0 0 + TRNA 46961 - 47034 80.4 # Gly TCC 0 0 46 30 Op 1 . - CDS 47132 - 47566 409 ## COG2020 Putative protein-S-isoprenylcysteine methyltransferase 47 30 Op 2 . - CDS 47631 - 48677 1549 ## COG1064 Zn-dependent alcohol dehydrogenases - Term 48733 - 48759 -1.0 48 31 Tu 1 . - CDS 48909 - 49520 884 ## COG1309 Transcriptional regulator + Prom 49499 - 49558 2.2 49 32 Tu 1 . + CDS 49590 - 50837 1594 ## COG0366 Glycosidases 50 33 Op 1 . - CDS 51054 - 51476 371 ## 51 33 Op 2 . - CDS 51535 - 54492 3086 ## Kfla_6006 hypothetical protein 52 33 Op 3 . - CDS 54489 - 56294 1917 ## COG0326 Molecular chaperone, HSP90 family 53 33 Op 4 . - CDS 56297 - 56770 545 ## 54 33 Op 5 . - CDS 56770 - 57216 438 ## - Prom 57240 - 57299 1.6 - TRNA 57382 - 57457 82.1 # His GTG 0 0 - Term 57508 - 57561 2.3 55 34 Tu 1 . - CDS 57581 - 58498 1254 ## Arch_0447 hypothetical protein 56 35 Op 1 4/0.000 - CDS 58623 - 59759 1294 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 57 35 Op 2 . 
- CDS 59790 - 60824 1404 ## COG0240 Glycerol-3-phosphate dehydrogenase Predicted protein(s) >gi|319978909|gb|AEUH01000066.1| GENE 1 2 - 551 436 183 aa, chain - ## HITS:1 COG:BH3631 KEGG:ns NR:ns ## COG: BH3631 COG2039 # Protein_GI_number: 15616193 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyrrolidone-carboxylate peptidase (N-terminal pyroglutamyl peptidase) # Organism: Bacillus halodurans # 5 168 5 174 201 108 37.0 5e-24 MDADVLLTGFGPFGPHAVNPSQLLIERLSARRPPRIATALLPVGFEAGPRLVRGLVERIR PRAVLCVGLAASRRPLCVERVAVNERTGTDNDGLSCDHEPIDPSGPDGLFASIDWRALLT GLRSAGFEAEESWSAGTFVCNSVFYAALAATAAHGGTAGFLHIPAETDTVRLAEALEAWA AGA >gi|319978909|gb|AEUH01000066.1| GENE 2 1052 - 1894 602 280 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154507574|ref|ZP_02043216.1| ## NR: gi|154507574|ref|ZP_02043216.1| hypothetical protein ACTODO_00054 [Actinomyces odontolyticus ATCC 17982] # 67 204 20 158 168 73 32.0 1e-11 MTAPQGPYSADLNSGGAQQSSVPQGAPYQFAPPQGAPRQFDPPQGAPYQYGLAAPGAPVQ QGARPGAFASPRARACAGTILVVEACLLLNQFVPWNFEDLSLFGATLLHWAAWLWVALRA VLIVLAAVFLARRGSAVRYALVGAYLVVALLLAPLNQVYFGGDWYQVFLPPPAELDAYFF IRAMVLWVLNMVGIACAICLAVPDGRPSGAVPAHGAMAPYGSTGGQFPPTGQGAYPVPAQ GFGQHPGGGQGFQSPSDGQYPGGAQGFQSPGGRQYPPRTP >gi|319978909|gb|AEUH01000066.1| GENE 3 2047 - 5262 4285 1071 aa, chain - ## HITS:1 COG:MT0070 KEGG:ns NR:ns ## COG: MT0070 COG1615 # Protein_GI_number: 15839443 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis CDC1551 # 62 982 2 908 979 502 34.0 1e-141 MSTTTLASATSGAVGVMDPFMHPSSHGQRDSRRRPDCPAPLDWAELHVDREDNAVSSFPF AQGGRPQRRPSSRGSAPRVGPLGITIAALAAIAGIVYAASIVWTEVLWFNQMRATRVLLT QWGAHIGLFLVGFGAMTAVVYLSMAHAYKHREASTRGEAAASLRAYQKALAPVRKLAFWG VALFFGFTAGARMASNWQTLLQFVNGSSFDRVDPEFGLDISFFVFTLPALKVLVSFLMSA AVICLVASLVVSYLYGTVRLAPRPHASKPARLQTGMLAAAVSLLIGANYWLGRYELLTST SSSMNIDGAMYSDINATLPAHSILAVISVLVAVLFVVAAYRGTWRLPGSGIAVTVVAALV LGMAYPALIQQFRVRPNQRTLEAPYIQRNIDATLEAYGLDDVDYQTNYDAATTASAGQFD 
NDALTGQVRLLDPKVITKTVQQLQQSRLYYGFNSGMKVDRYNVGGERRDTVVSVRELNLA GLSAEQQTWVNQHTVYTHGYGAIAAYGNQLTSDGLPSYWEQSVPSVGEMGDYEERVYFSP NAPDYSIVGAPEGSEPQELDYPDDTAVGGQVSTTFTGNGGPSVGNVWNKLLYAIKFGSTD LFFSSQTNEASQILYDRDPLQRVAKVAPYLTLEQSAYPAVVDMDDDALTPKRLVWVVDAY TTTNQYPYSQHESLSDATVDSQATQVGQYVQANKINYMRNSVKAVVDAYDGSVRLFRWDD KDPILKTWEKIYPGSVEPIASVSGDLMSHLRYPEDMFKVQRNLLATYHVTDAAGFYAGGD RWRLAEETSTIGASQAQAAASQAPASTSLQPPYYLTMQMPGQESAEFSLTSVFVPGGGGE GRREAMAGFLAVDSETGSQPGVVREGYGKLRLLALPSSTTVPGPGQVQNKYDSDEKIATQ LNLLNQADSTVIRGNLLTLPVGGGLLYVQPVYVQGTGSASYPVLRSVLTAFGDKVGFAST LEESLKQLISDDGSSTRPSPGGGGGGDGSQSGAQSGSTAQPRTPTVAQALADASAAMRDA ESAMSRGDWSAYGEAQNRLKDALNRAEAAQAAAGAPAPSPSAAPGGGQSGS >gi|319978909|gb|AEUH01000066.1| GENE 4 5213 - 5770 643 185 aa, chain + ## HITS:1 COG:no KEGG:Jden_0724 NR:ns ## KEGG: Jden_0724 # Name: not_defined # Def: hypothetical protein # Organism: J.denitrificans # Pathway: not_defined # 9 182 15 190 193 129 46.0 6e-29 MTPTAPEVALANVVVDIERGAARAGWDQAPTLYALVPTARLLGDPLLPDEVAAHLRSGWD GSDDHLSAVVQEDMADDDLEEVLGHLAWPPEVAGAALTVERVVVPPEVEAQAPEDPEAAI EFVSNHPSRTEVRLAVGVLRSGESWCALRMRPFDSDDKVGQGSVLVPALVDALRVSLEPD EAPAS >gi|319978909|gb|AEUH01000066.1| GENE 5 5917 - 6723 866 268 aa, chain + ## HITS:1 COG:SPy0112 KEGG:ns NR:ns ## COG: SPy0112 COG0345 # Protein_GI_number: 15674334 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Streptococcus pyogenes M1 GAS # 1 261 1 250 256 156 37.0 4e-38 MKFGFVGTGAMVSAIVRGAAAAGFNASDFVLSNRTPQDAKALADEVGATTARSNGSLAHQ VDVLVLGVKPAVQPGVIEQIAHIVRERPEMCVVSLAAGRTLDAILEDFGACVPLVRVMPN VAAQIGQSMSALCSVGAADDQVAAVRRLMDAVGRTVVIDEKLFPVFQSLASCSPAWLFQV IDSLARAGVKHGLPKDIAVRVVAQAMAGSASLVLAASEEGRVPAQLIDQVCSPGGTTIAG LLAAEEGGLSASLVGAVDASIHADSHLG >gi|319978909|gb|AEUH01000066.1| GENE 6 6859 - 8217 1934 452 aa, chain + ## HITS:1 COG:NMB2064 KEGG:ns NR:ns ## COG: NMB2064 COG2056 # Protein_GI_number: 15677886 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Neisseria meningitidis 
MC58 # 17 452 15 462 462 413 55.0 1e-115 MIFNAVVISVLVMLVLSIARVHVVISLFIASLVGGLVSGMGLSATMVAFQEGLAGGAKIA LSYALLGAFAMAVAHSGLPRLLANWIIVRMKNADESNQARIARNAKWGILCGVIAMGIMS QNLIPIHIAFIPLLIPPLIGVMNRLGLDRRAVACAITFGIVTTYMFLPVGFGRIFLHDIL YKNIIEAGLDVSGVNPTAAMAIPAASMLVGCLIALLISYRGRREYEDRSDSFGRPADEPI DMKRVWAALAALVLCFAIQTWLTLIESEADPLLMGAIVGLCVMLSMRVVPWQQANDIFST GMKMMAMIGLIMISAQGYASVLKASGQIEPLVTGTAAMFSGSKALAAFVMLLVGLIITMG IGSSFSTLPIISAIYVPLCVTLGFSPIATVAIIGTAGALGDAGSPASDSTLGPTTGLNVD GQHDHMRDTVIPTFIHYNIPLLAGGWIAAMVL >gi|319978909|gb|AEUH01000066.1| GENE 7 8314 - 9399 874 361 aa, chain - ## HITS:1 COG:mlr7776 KEGG:ns NR:ns ## COG: mlr7776 COG0456 # Protein_GI_number: 13476450 # Func_class: R General function prediction only # Function: Acetyltransferases # Organism: Mesorhizobium loti # 153 325 181 359 396 87 32.0 3e-17 MRSAPVPVGEGAGAALAGLGVPGPSARRWSRVGGIARATVWISGGAAVHEVHRPLSAHLR IAGWAPLVGSGTADAAGALAGIIRAIVDEAGRRGLPMVKAQTQGEEDPLAGALVAEGFTR MPGGGDPLSGAPPAEFAHERTVGWIRWLAPGPPVAPAPAYERQKTEFTCGPACALMALGR GGTAPPRGLDAEMEIWREATYTVGVGHFGLAGAIARRGARVHVITSSPGPVVGVSRAHMA TGHVREAIHRKHVDQARALGVTWEYREPAPQDLARALAAGRRVVVLVDLASLNGETMPHW ILAWGAVGDHVLVHDPWTDEQFGESWVETDTLALRGQDLWDAGVWTEEEGNRAVLVVGHS A >gi|319978909|gb|AEUH01000066.1| GENE 8 9396 - 11009 687 537 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 20 531 9 524 563 269 35 3e-71 MTARAERIGTAGSTGTGTGLRARAIRVANPSDPDRPIVADVSVDAAPGRTLVIVGESGAG KSMLARAICGLAPAQFTTSGTVELAGRTIDLGVAADSLRRLRGGGIVWLPQDPFTSLSPL HRCGVQVIAHRQGPRSERAERALARLAEVGLPARAARAYPHQLSGGMRQRVAIAAALDAA PGVLIADEPTTALDVTTQKEILTLLASLRDARGMALVLVTHDLALARDYGDDIVVMRDGA IVEAGQARRVLARPATDYARQLLDAQLRLDGPSPRPAGAVPATTVADERPVVALRDVVKE FRSGGRHAEGHLALAGVSLDILPGQSVGIVGESGSGKTTLARIMVGLEAPDSGTVDVVRA PGAGRGAPPPVGIVFQNPYSALNPARTVGQTLAEALAVGGQGGAQVPDLLGAVGLSAAHA RRLPAALSGGQRQRVAIARALAARPELLICDEAVSALDVSVQATIIDLLRRLQAEQGFAL VFITHDLAVARQMSDRIIVMKDGAIVEEGATEALLAAPSHPYTASLIDAVPGRGAPA >gi|319978909|gb|AEUH01000066.1| GENE 9 11006 - 11872 1140 288 aa, chain - ## HITS:1 COG:SMa0469 KEGG:ns NR:ns ## COG: SMa0469 COG1173 # Protein_GI_number: 16262698 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Sinorhizobium meliloti # 54 278 19 242 249 155 47.0 7e-38 MTGADTAPARPAPRGPGHRLRGLPGSVAVALVVLGVVAVWAALPGFLAPGALDQDLGAAA LPAGSPGHLLGTDKLGRDILALTIAGARSTVVGPVVVAAGSMLVGMAAGISAAWFGSWWD ALVSRGVEVLLSLPVTLLAIVVAGVVGGSYWVSVAVLTVLFAPSDVRMIRAAALAQMGAP YIESARVLRLPTPRILALHIAPNVAPIAWANLFVNIAFALVSLSGLSYLGFGVSAQAADW GRQLADARTFLSLNPAAALAPGTAIIVVAAAFNIAGDWWTSSSAKGGQ >gi|319978909|gb|AEUH01000066.1| GENE 10 11869 - 12891 1111 340 aa, chain - ## HITS:1 COG:SMa0467 KEGG:ns NR:ns ## COG: SMa0467 COG0601 # Protein_GI_number: 16262697 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Sinorhizobium meliloti # 5 316 26 338 338 218 45.0 1e-56 MTMRYLLGRIAGFAAVLAVLSVIVFSLTHMVPGDPVKMLVGTRAVTPGVRAAITAKYHLD EPLWSQYLTWLGRALHGDFGDSVRNSAPVTDILAGRVALTGQLTALAFAIAVAVGVPLGV VAARRAGTGTDRLIVGASVVGVSAPGFALGMVLLYVFALMGGLFPIYGEGSGFADRLWHL ALPAVALAVGTCAMIVRVTRAAVITELASDYVLFARSRGLSERRTTLMYLKGAALPIVTS 
AGLILGSLFGSTVLVEEAFSLPGIGQLLADSITYKDIPVVQAIVLIVAALIMTTTLVVDI VARILDPRPARGRAQSRRALSAQGPRAIRGSRPGRRGRAR >gi|319978909|gb|AEUH01000066.1| GENE 11 12902 - 14479 2183 525 aa, chain - ## HITS:1 COG:SMa0466 KEGG:ns NR:ns ## COG: SMa0466 COG0747 # Protein_GI_number: 16262696 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Sinorhizobium meliloti # 2 517 6 526 534 228 30.0 2e-59 MTRKPAAAAAIAGLALMCSLAACAPRSAAPSQAAAEPGTLTVGLPGSLSTLDVAHETGII NYYVAQVSSEGLLGVDAQGELVPAVASSHTTADGRTWVFQIRDDARFQDGNPVTIDDVLF SIDVARDPDRSPSSSVYWPAGTDVSQTGDSEITITLPAPAQNFGWTVTANGGLWITEKSY YEAAGAYGSAKDLIMGTGPYRAVEFQPDSKAVFVKSGTWWGGDTSAQKIEFDFFSDENSR LLAQRAGQIDVSVQIPADQVGQFEAIDGTTVHTESDRSYVGLTFDEAVEPFNDIHVRRAV AHAVDRASIVASVLSGRGEAATGIEPPDQLGSQIGEDAARALLATLPNTEFDLDKAREEL AASHVPSGFEAELTYPASIPEMGNAALAIADNLKKIGVTLTVTSKPLEEWISQVGTGEYG LYYMSYTPTTGDAGEIPGWLLGEANPARYSNPEVASLIAQSNAQADPAARAQDLVSATAI AQENVIYSPIWWGKSTSAFSSRVAPKDFGPYFFMTPWTASLDLVQ >gi|319978909|gb|AEUH01000066.1| GENE 12 14634 - 15992 1726 452 aa, chain - ## HITS:1 COG:Cgl2768 KEGG:ns NR:ns ## COG: Cgl2768 COG0624 # Protein_GI_number: 19554018 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Corynebacterium glutamicum # 11 450 7 441 441 453 54.0 1e-127 MTDLSPLGARTVELLGELVRNACVNDLTPASGHEHRSADTLEALFEGLPVRMRRIEPAPG RTTLVVALDGSDPAAEPLTLIGHTDVVPADDSQWAHEPFAARIDSDGVMWGRGTVDMLHL TAAMAVVTQDLARRVAEGGARPAGTLTFVAAADEEARGGLGVPWIGEAEPQAIPWRNCIS EMGGGHIRGARGSDSIAVVVGEKGAAQRRLHVDGDAGHGSVPLGRHSAVEVLARVVQRLA CARWPAGGGGEWEGFVSAFEFDPATRAALLAADYDGDYHEFGDLAAYAHAISRLTVAQTA VRAGGPINVLPSSAHIDLDIRTLPGQDDDFVDAALRRALGEVGEHVRIERLLVEGATASP TGTRLYRAIERALRAQHPGSRVVPVLFPGGSDLRVARRLGGVGYGFGSFGRDVALGDLYS RLHAHDEHIRLSDVDLTVRALAAVAADFLGAR >gi|319978909|gb|AEUH01000066.1| GENE 13 16183 - 18912 3754 909 aa, chain + ## HITS:1 COG:RC1053 KEGG:ns NR:ns ## COG: RC1053 COG0525 # 
Protein_GI_number: 15892976 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Rickettsia conorii # 27 846 14 769 812 600 39.0 1e-171 MADRPGQLPATARAPRVPDRAAPEGLEHKWGSAWEAEGTYAFDRTAAREQVYSIDTPPPT VSGSLHIGHVFSYTHTDVVARFQRMRGRAVFYPMGWDDNGLPTERRVQNYFGVRVDATLP YDPDFEPPHVGGEGKSIKARDQKPISRRNFVELCERLTVEDEAQFEALWRYLGLSVDWKQ NYQTIGRRARKVAQAAFLRNLARGEAYQAEAPGLWDVTFQTAVAQAELESREYPGFYHRI AFHLTDASAAAEAARAGTPVEDGADVCVETTRPELLAACVALIAHPDDERYKPLFGKTVA SPVYGVEVPVLAHPAAEKDKGAGIAMCCTFGDTTDIDWWRDLGLPLRAILRKDGRIVADT PEWITTEAGRTAFAEISGKTTFSAREAVVARLRGSGELKGEPTKTVRQTNFFEKGDKPLE IVTSRQWYIRNGGRPWTRPASGADLNDELVERGRQLEFHPDFMRVRYENWVRGLKGDWLV SRQRFFGVPFPLWYRVGADGQIDYDAVITPSEDSLPVDPSTDAPEGFAEEQRGEPGGFVG ELDIMDTWATSSLSPQLACGWLHDDDLFARTYPMDLRPQGQDIIRTWLFSTVVRADLEFG ALPWRHAGLSGWILDADHKKMSKSKGNVVTPMGLLEKYGSDAVRYWAASARLGLDAAFDE QQMKIGRRLAIKVLNASKFALTMGGEGAQLDLDPALVTVDLDRSVLAELAGVVAEATDAL AAYDHARALEVTESFFWTFCDDYLELVKERAYNREGAWDGASAASARAALALVIDTVVRL LAPYLPFVTEEVWSWYREGSVHTAPWPTGAPLASAAGDPSVLAAASGALIALRRVKSEAK VSPRTPFLSVVLEAPAASLGALESVKGDLEAASKAVGALTVAASGDSEAADAVVKSFELG EAPAKRKKG >gi|319978909|gb|AEUH01000066.1| GENE 14 18863 - 19135 241 90 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQVNGPKSTETFSRFLNPETSKRVSMSLPSTDDARIHLSYITLRPECDGYGSAARPCPGP QGPGHTPASQRGYRLTPSCAWPGPRPVRKT >gi|319978909|gb|AEUH01000066.1| GENE 15 19059 - 20429 983 456 aa, chain + ## HITS:1 COG:no KEGG:SGR_2674 NR:ns ## KEGG: SGR_2674 # Name: not_defined # Def: hypothetical protein # Organism: S.griseus # Pathway: not_defined # 1 447 1 460 472 298 40.0 4e-79 MLTRFEVSGFKNLENVSVDFGPFTCIAGPNSVGKSNLFDAIEFLSLLSDHNFLEACTLIR PTPQQRSDISSLFSVHVTQGSENLRLAAEMILPSPVSDEFNQQVESERTFVRYEVRLAVR QGGPLSTGLQLKLVNEQLSPVQYPKDHLRFPESTAYVKRFVHGAHRFSPLSTTEENGNTI IQIQNGNKGRPRKIAADNAQRTVLSATASAENPIILAAQTEMRSWRFVALEPSAMRAPDD MVEARPISATGAHIPSSLFRQELQEGNGSNVMNRVRDAIGALVDIRRLRVVQDRARDTLE LRAKIGATPELPARSLSDGTLRFLALSAMKVSRDYFGMLCMEEPENGIHPSKIADMHRLL RGISDPTEHSARQVIVNTHSPYFVQATKDDDILCAVGKSIPDGQGGILRSVGFHPLPGTW 
RARTKPEDSTGARPVSRGEIAAFLENPEKWNGDEDE >gi|319978909|gb|AEUH01000066.1| GENE 16 20422 - 21048 475 208 aa, chain + ## HITS:1 COG:no KEGG:xccb100_4194 NR:ns ## KEGG: xccb100_4194 # Name: not_defined # Def: hypothetical protein # Organism: X.campestris_B100 # Pathway: not_defined # 8 177 7 185 214 67 35.0 2e-10 MNEWYCYLLCEGSSDAALTNVLELLLTRLTLEPASVSARLDLKGSVQDKLVALQKYDAGL YDLVFVHRDADNAGLQARQEEIRHAVDSVQDAAVTPVVPVVPVTMTETWALASLFGDPDC RAWLRRKASVSEKTLETYRDPKDLLRRLLSREDPDGRLLPNGVFSVQRARVLAELDAADG SALSRLTAWTALNKALADAVARARPWLS >gi|319978909|gb|AEUH01000066.1| GENE 17 20806 - 21282 65 158 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTRRCGRGATSAEDGFEFVKCVPGLGVGPVRPDRGGGIVAQPLFDSFDGQAFTVLPEGFE GVSVELGENQDSSQWRPPLGEPWAGAGDGVRECLVQGGPRCEAGERGTVRCVKFGKDSGS LNREHSIGQKSAVGILATQKTTEKILGVPVGFKGLLGY >gi|319978909|gb|AEUH01000066.1| GENE 18 21368 - 21553 147 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHALHAYLRQLPDFPMRGRARPDIHGAESDTIQTRALGVCQYFCVSGFEVRGMPPLGFSY S >gi|319978909|gb|AEUH01000066.1| GENE 19 21533 - 22060 723 175 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVNITLGIPGIFTLAGDPANVFRQLSGRSHPVAIVKRDAQGLSFDTGLTDYSGAPLTVWS LHCENHQLWYLDRRGRGTVSLRSVATGLVVSGRSALTEGTGLTMDEWASKPWQTWRIRPT STKRGFAIVSADDSFRSEPLAVDVTDNAGLGSFVQIQRWRDVPSQHFLLLRVREP >gi|319978909|gb|AEUH01000066.1| GENE 20 22263 - 22778 -200 171 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRSRAAACTPRQISGRLKARGGRDGAGGPRGLIVCLEAASAELRAQTVKLARHRGGHRSR RAPPPRSTVLVALPLPRVGTRRRRRLAPLVVQMISQLSGQPAFQGHSSSKAGSRPSVPVI TGPTGIDPLEQAVQRPTGTQPLRHTHVRQHEPSHHQSSRIDPFKSRGYTDN >gi|319978909|gb|AEUH01000066.1| GENE 21 23327 - 24790 1986 487 aa, chain - ## HITS:1 COG:melB KEGG:ns NR:ns ## COG: melB COG2211 # Protein_GI_number: 16131946 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 14 413 1 386 469 113 25.0 1e-24 MSTAPGGSRARPLLVQRICYAFGNLGQAAFYNALSTYFVTYYAAQTLFAQYERTEAEALI GVITVLVFVIRIAEIFLDPLLGNLVDNTTTRFGRFHPWQVIGGVVSSVLLFAIFTGLFGL 
VNTNKMVFIVVFVIVFIVLDIFYSLRDISYWGMIPALSSSSHERGVFTALGTFTGSIGYN GVTAIVVPVVAFFSSVFAGVTGESQAGWTGFGALIAVLGIVTCLAVAFGTTEQDSPLRAH AAESSNPVQAFVAIAKNDQLLWVSLSYVLYSVANVATTGFMIYLFKFVLEMPNSYSVVGL IAFGIGLATTPLYPFINKRVPRRILYLAGMAFMAAAYVLFIFFSNNVLAVFAALVLFYIP STFIQMTAILTLTDSVEYGQLKTGRRNEAVTLSVRPMLDKIAGALSNSIVGFVAIAAAMT NDADPASLTAGNITTFKTAAFYVPLAIIVASAAVFAAKVTLSEKKHAEIVKQLETTLGGE DADAAGK >gi|319978909|gb|AEUH01000066.1| GENE 22 24789 - 28052 3007 1087 aa, chain + ## HITS:1 COG:BH2723 KEGG:ns NR:ns ## COG: BH2723 COG3250 # Protein_GI_number: 15615286 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Bacillus halodurans # 75 1076 6 1003 1014 747 41.0 0 MFFSFEVGETHLPTEVYPFTSGCKLSQPHSRPTAPQTLPPPPYTCRIKGNPIYLISSAAI IQEVDPMPPADPTPSAQWLTDPTVFAVNRRPQRSQHRTDAPRQSLNGPWEVLDTDARAID VADPESPASALRSASDRIATIPVPSTLETSGGWRPAYVNIQMPWDGHADPEAPSVPDTNR VALYRRLFTLDEDLRGAALEGGTVLLSFGGFATALYVWCDGAFVGYCEDGYTASEFDITR LLGGAPADTIHELVVACYEYSSASWLEGQDSWRFHGLFRGVCLTALPPTHVDSVRIDADF DHHSEVGRLRVKADVVDPLGGASIRAELFDHSGAPVWSTETPRSAATDLVSDAIPRVAPW SAESPTLYRLRLEVIDRDGRTCETVALNVGFRRFGIDNGIMTLNGRRIVFKGVNRHEFHP RKGRALDEADMLADIRFCKLHNINAIRTSHYPNDPRFYDLCDEYGIYVIDETNLETHGTW NVPGDVATPDTAIPGSRTQWQDACVDRLERMIVRDYSHPSVLIWSLGNESYGGEVFRTMS QRARELDPARPVHYEGITWDREYDDVSDIESRMYAHPDEIEQYLASSPAKPYISCEYMHA MGNSVGGLGLYTDLERYPHYQGGFIWDLIDQALWHPATPASAGEFLAYGGDFGDRPCDWE FSADGLLFADRTPSPAVREVKQLYSDFSLVPSTTGVDVTNRSLFTSSSAYDLVCSLLVDG EAVWQITVPSDVRPGDRATVPCPWPIDEYSAPGREIVLEASLVHLHRTAWCPAGHEVCFG QSVLPLAAPPSPPDAADRNRATYTVGRWNIGMRTGGEEALLSRTAGGMVLWAKDGDPLVL RPPRLTTFRPLTDNDRGASHGFKRAVWTVAGRYARCSDVSVREEDHSVEVAYCYELAAVG VPVSVVYRLNDDGTIRFTASYPGAPDLPTLPCFGVEWALPPSLSRLRFYGTGPDETYIDR VSGGRLGIWSRSLSEDLTPYVVPQECGNHMGAQWAEASDEQGNGIRVVSAVGNPFEVSLL PYSSAQIEEAAHWWELPEPGRRPATHLRLLAAQMGVGGDDSWGSEVHPEYLVPSEQPHVL DVVLSIL >gi|319978909|gb|AEUH01000066.1| GENE 23 28111 - 29193 1182 360 aa, chain - ## HITS:1 COG:BH2018 KEGG:ns NR:ns ## COG: BH2018 COG1609 # Protein_GI_number: 15614581 # Func_class: K 
Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 23 353 1 328 330 186 35.0 6e-47 MREEALSGANTAEELFSKEGSEMVSLSDIAAATGYSKATVSRVLNGDPTFSVREDTRRRI IEVGNEMGYALKGRRVGIPQNVAILDNVDPERGLHDAYFFDVREALKVKALEHMMSLASF PDCDSLIGAADRFSGFISVGPSPLKGADLEKLHAALPHGVFIDINPAPTLFHSVRPDLQQ TVLDAIDALRDQGRQRISFVGGDGHIMGMHFHEEDIRTFAFVNWTERLGLSADARVFAEG SFTVENGRRQMERMLADSTGPAPDAVVVAADPLAVGVLQCLVAAGLRVPDDVAVVSINNL EVCEYTAPPLSSYAINREELAESAILLLSDSIIGRYEQRHHVLISTELVVRSSFVPATAV >gi|319978909|gb|AEUH01000066.1| GENE 24 29296 - 31236 2563 646 aa, chain + ## HITS:1 COG:MT2242 KEGG:ns NR:ns ## COG: MT2242 COG1022 # Protein_GI_number: 15841677 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Mycobacterium tuberculosis CDC1551 # 86 640 46 600 603 458 46.0 1e-128 MRATDEGADMSDSPAPASSDDGAQELEAVPVPAPTSGGATQALEWHAPTLVRIDETCTIP RLLQDRVGRSPRRPLILRKLGMGDAWRPVSARDFHEEVQQVAAGLIARGLEPGQRVAIMS RTRYEWTLLDFACWSAGLVPVPIYETSSIEQVAHVLSDADVDLVVTETVSLAEIVRTAAE RTGRHHVVVLSLDSEAIETLVADGAGVPRATVAARTGALTKDDISTIVYTSGTTGTPKGT VLTHENFTNLCLNSHAWMPEIAAGEDSRLLLFLPLAHVFARFLQVFQISGNGVMAHVSNI KNLLPDLASFRPSYLLVVPRVLEKIYNSADAKASKGAKRRIFRWAAHVAIEYSKALDTPE GPSRSLRAQRALADKLVFQTILKLVGGNARYIVSGGAPLATWLGHFYRGLGIPVLEGYGL TETVGPVSVNTPRLSKIGTVGPALPPMSFRVSDEGEILLKGPSVFQGYNNDPEATAACFT EDGWFKTGDLGSLDRDGYVRITGRAKEIIVTAGGKNVAPAALENPMRSHPLISQVLVVGD QRPFVAALITLDAEMLPVWLESHGLPVMSVAEAATNVEVLASLQKAVDGANEHVSRAESI RKFKVLTSDFTEANGLLTPSLKVKRAVAVRRLAPVIDELYGGPVEE >gi|319978909|gb|AEUH01000066.1| GENE 25 31337 - 31651 159 104 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLSLGEITGIGPKPSNSDAKWNTSPPRRRNTAKPRRILDHYGPRTATHHITATAKDTTD PQQDARCSTLVASGRLRAGRVAIPVSPAALAAGARRGSPFGYSV >gi|319978909|gb|AEUH01000066.1| GENE 26 31539 - 31922 206 127 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLGFAVFRRRGGEVFHFASEFDGFGPIPVISPRLRRILEQPPGDPPEKPETQTHFGPPG GTTAPFAPPGGSAGTGRGGGRRRRTDTTLAPLLSPPSRARTQEPRIRQHPHRRVAHHRPC THTRTPN >gi|319978909|gb|AEUH01000066.1| GENE 27 32148 - 32690 
461 180 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEWISVLLSTAAAAVSLAVVAATSGLSASSVIIYCVCAAVCLAEIGIMAVLCVTIDSTGF HVRSVFGRRTVPWPASRTGLFPVLVTVRRRLPRATVCVVAPDGRRVTMRSLAWSRMSKEE AVSLAVLHCWRIWNWGAAKGYTRETGAYTPLRDPRMQQERSQIEWAFGILPTTGAPHSTT >gi|319978909|gb|AEUH01000066.1| GENE 28 32785 - 33819 1243 344 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTSPYTQPDPNADSRGQGSGTPGYGGSAQPYAAPGGSAQQYGAPGGSAQPYAAPGGSAQQ YGVPGGSAQPYAAPGGSAQQYGVPGGSAQPYAASGGSAQQYGAPGGTGAGWESSGGQSAQ GAQPYAQGAPAGGAPGQGGGAYRYQRDDTTPIPLTADSIVMPSAKGAAWVAFILSVPFIA GSIWNVIYLGGEGTVLGIVVLAFFGAFLLFGGFLFRSSSMTIDATGFHAKNMVRSADVPW PASRTGFYVRVERKRRGASALMSEVSARVITPDGRAMPLTGIAWAGIDVAEMEVKGITQC WRLWNWAVAKGFTRETGQYVELSGLGIQQAMRSKQEHLYGIVRG >gi|319978909|gb|AEUH01000066.1| GENE 29 34161 - 35363 1653 400 aa, chain - ## HITS:1 COG:Cj0607_2 KEGG:ns NR:ns ## COG: Cj0607_2 COG0577 # Protein_GI_number: 15791967 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Campylobacter jejuni # 19 400 4 394 394 132 30.0 9e-31 MSEDSKPMYSKLRWSDVARLGGTGIRSRPTRAVLSALGIAIGIAAMVGVIGVSTSSQAQL AEQLRALGTNMLTAQAGEDLSGSSTRLPDDAVGRMRLIDGVEKATSTATLSGVSVYRSRL ADKNATGGILTMAAEQSLLDVVAGSVARGTWLNEATATYPATVLGAKAAQRLGVVEPGTQ VWIGHTSFTVVGILDPVTLAPELDNSALIGSSIAEQQFSFDGKPTTVYERSTEDRVEDVR ALLGPTLSPKNPTSVKVSRPSDALAAKNAADQTFTALLVGVGSIALLVGGIGVANTMIIS VLERRREIGLRRSLGAMRGHILVQFLAEALLLAFLGGAAGCVLGIGVTFGMSAANGWPFT LPWYVVVAGLSVTVGIGAVAGLYPAVRASRTPPTAALNAQ >gi|319978909|gb|AEUH01000066.1| GENE 30 35356 - 36051 324 231 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 226 1 221 311 129 35 4e-29 MSRILSLHDVRRTYGQPPVAACAGVSLSIDRGEFVAIVGPSGSGKSTLLNLIGTLDRPTS GTVEIDGIDVSRLSDRRLSGLRAHLIGFVFQQFHLSEGMTAIDNVADGLLYLGVPRAERR QKAERALIRVGLSHRLHHKPHQMSGGERQRVAIARAVVGDPPLLLADEPTGNLDSVSGAS IVELLHELHEQGTTIVVITHDNDLAMKLPRQVAIKDGRIVHDSAAKEVAHV >gi|319978909|gb|AEUH01000066.1| GENE 31 36048 - 37166 1475 372 aa, chain - ## 
HITS:1 COG:no KEGG:Ndas_4786 NR:ns ## KEGG: Ndas_4786 # Name: not_defined # Def: peptidoglycan-binding domain 1 protein # Organism: N.dassonvillei # Pathway: not_defined # 22 371 45 376 376 156 33.0 1e-36 MLVVGACAALAVALAGVGTAYALGKLPGGEQNQSEGNVFTGSTGVVTRGTLEGETTAVGT LRYADQYKFRGAFEGVITKLPTPGTTLHQGDMIQQVGDEPTYFMHGATPAWRAFEPDMSN GDDVTQLETALKELGYFTGEPNTHYDWLTRAAIQKWQKDKGLTQNGILPLGRVVFAPEDL RVGTMIARLGDRATLETDLFNVSSTRQVISANLKLADQKLGVVGNSVKIRLPGGETTTGT ISTVEPPTDKGSASGDQKGSGSGSGSGSGSDSASDKERVIPITVTPDDPSATEGLQEASV SLGLVSEKRENVLSVPLGALIALTPDQFGVEVVNDDNTIRKVPVKTGLFAGDRVEVTSDE LSENQRVVVPDR >gi|319978909|gb|AEUH01000066.1| GENE 32 37208 - 37687 663 159 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154509326|ref|ZP_02044968.1| ## NR: gi|154509326|ref|ZP_02044968.1| hypothetical protein ACTODO_01851 [Actinomyces odontolyticus ATCC 17982] # 1 156 1 156 158 136 52.0 4e-31 MKKHRPARAFAAIGLAVAVGVLGACSNGNSGAAGNKPSASASSTVKSADDANLIFAQCMR GKGFDVPDTGLTPDNLSDTSEAFNNALNECQQEIGPALGEENDLTKDADAQQQLVKGAEC LRGLGYDVPDPEPGKGLNVKDVPQDALQKCFNNQGAQTK >gi|319978909|gb|AEUH01000066.1| GENE 33 37901 - 38239 493 112 aa, chain - ## HITS:1 COG:AGc4890A KEGG:ns NR:ns ## COG: AGc4890A COG1605 # Protein_GI_number: 15889952 # Func_class: E Amino acid transport and metabolism # Function: Chorismate mutase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 14 110 4 100 105 119 63.0 1e-27 MGDEAAGPADRGVTGIPEALAEARRTIDNIDAALIHILAERFRCTQRVGHIKAVHDLPPA DPEREARQVARLRALAVRSGLDPDFAEKFLSFMVREVIRHHEDIKAEYEGAG >gi|319978909|gb|AEUH01000066.1| GENE 34 38266 - 39459 262 397 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 135 377 258 448 466 105 33 6e-22 MRKLIGGSGAYICDECIELCNEIIEEELGKQPSSSAASPLPKPREINEFLNSWVIGQTRA KRALSVAVYNHYKRVRSREAGHEEDMLGTKSNILLLGPTGTGKTHLARSLARLLEVPFAI VDATALTEAGYVGEDVENILLRLIQEADGDIKKAERGIIYIDEIDKIGRKAENASITRDV SGEGVQQALLKIIEGTVASVPPSGGRKHPHQQFLEIDTSGILFIAAGAFAGIEEIVKARL 
GQRSTGFGSELKSASEMGDLYEAVNAEDLHKFGMIPEFIGRLPILTSTKELTKEDLVRVL TEPNNSLVRQYQHLFELDGIELDFTRGALLAIAAQANERKTGARGLSSIMERTLSDLMFE LPSRDDVARVVITREHVNGVGAAELYTDRSSERVRSA >gi|319978909|gb|AEUH01000066.1| GENE 35 39460 - 39612 57 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLLALPAERAFQQFCAISESSHGTSKVAGIPPWDGAGTPAVFQSFKLLQR >gi|319978909|gb|AEUH01000066.1| GENE 36 39669 - 39812 233 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWRRSWKLAKGLRKTNLAVSRLFLALPDGAVRSQRRIGVETDHAMAG >gi|319978909|gb|AEUH01000066.1| GENE 37 39766 - 39867 151 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCSVSIGLHGPFPFDIRKFVAPVMETGEGFEKN >gi|319978909|gb|AEUH01000066.1| GENE 38 39882 - 40526 971 214 aa, chain - ## HITS:1 COG:Cgl2359 KEGG:ns NR:ns ## COG: Cgl2359 COG0740 # Protein_GI_number: 19553609 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Corynebacterium glutamicum # 13 212 6 205 208 256 70.0 3e-68 MTTRPYFEAAARQMPQARYVLPDFEERTAYGFRRQNPYTKLFEDRIVFLGVQVDDASADD VMAQLLVLESQDPDSLITMYINSPGGSFTALTAIYDTMQYIKPQIQTVCLGQAASAAAVL LAAGSPGKRLALPNARVLIHQPAMEGMQGQASDIQIVADEIDRMRAWLEDTLAEHSGTPV EQVRRDIERDKILTAPQAAEYGLIDQVLESRKAQ >gi|319978909|gb|AEUH01000066.1| GENE 39 40526 - 41137 889 203 aa, chain - ## HITS:1 COG:CAC2640 KEGG:ns NR:ns ## COG: CAC2640 COG0740 # Protein_GI_number: 15895898 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Clostridium acetobutylicum # 20 190 20 190 193 228 61.0 8e-60 MSVHMAGPGAESIMGLGDSVYQRLLKERIIWLGGEVRDENANTICAQLLLLAAEDPDRDI YLYINSPGGSVTAGMAVYDTMQYIKPDVVTVGMGLAASMGQFLLTAGAPGKRYITPHTRV LLHQPLGGAGGSATEIRINADLILGMKKELAEITAQRTGKSVEQIEADADRDHWFTAREA LEYGFVDQLIEGPQEIGTRGGQF >gi|319978909|gb|AEUH01000066.1| GENE 40 41403 - 41999 734 198 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|269219679|ref|ZP_06163533.1| ## NR: gi|269219679|ref|ZP_06163533.1| conserved hypothetical protein [Actinomyces sp. oral taxon 848 str. F0332] # 1 196 1 195 201 97 41.0 4e-19 MATVLACAALALGVARTAMFVALHLVPSDYDIVHHAVSDYAVGPTRRLAATMTWTSAVFW LVLAAAVALAPANDGPGLAAWMLALAAVFVLLPLLPTDVEGSAPTLVGRLHMLAAIAWFA IAYSCMGGFIRFFAPSAPGGLMALLVAVSWVAAVGLVALVSALVVRPLRRVAFGISERVF LLAVHVFYIAVAVGLMLV >gi|319978909|gb|AEUH01000066.1| GENE 41 42030 - 42491 522 153 aa, chain - ## HITS:1 COG:no KEGG:Ksed_23090 NR:ns ## KEGG: Ksed_23090 # Name: not_defined # Def: transcriptional regulator # Organism: K.sedentarius # Pathway: not_defined # 4 131 6 133 138 64 41.0 1e-09 MDLAYELHDLVLTLDRWAERRLRSVGLNYNRYVALVVVCEHPGVSGRQLAGPLRVSEAAA SGIVRSLLEAGLVEDTAQKGDGNVRRLRATPAGVGLRARCSDLLGSALDTAARGIGVDPE SLALSIRALHDQVRSPRGGMDGAGPARSVQQIH >gi|319978909|gb|AEUH01000066.1| GENE 42 42660 - 44012 1914 450 aa, chain - ## HITS:1 COG:ML1481 KEGG:ns NR:ns ## COG: ML1481 COG0544 # Protein_GI_number: 15827779 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Mycobacterium leprae # 1 450 1 469 469 280 41.0 3e-75 MKSSVETLEPTRVRLTVEVPFEELKSEMDKAYKDIAGQVNIPGFRKGHVPPRIIDQRFGR AAVIEQVVNEVLPGHYSSAVASNDLRPMAQPEVEVTEIPNTTGPQGGQLVFTAEVAVVPP FELPAIDESIEVVVDSAEVADEDVDKELEELRGRFATLKTVDRAAAAGDFTTIDLVARIG GEEVDSVSDVSYEIGSGDMLDGQDEALSGASAGDEVAFTAVLKGGDHEGEEAEVSLTVKS VKVRELPTADDDFAQMVSEFDTIGELTEDLREQAAQSKESQQALQARDRLIDLLLERVEI QLPESAVEHELSHRVEAAGESADQDALRAEVVREMRSELLSEAVAKEFDVQVSQQELIDY AIQMSQNYGVDINQLFSSPQQMSNVVADLGRSKALIEALKRVTVKDSSGAEVDLSKFFGE DPAAAENAEIVGGADSDEAEAPADAEDAAE >gi|319978909|gb|AEUH01000066.1| GENE 43 44176 - 45588 1794 470 aa, chain + ## HITS:1 COG:RSc0201 KEGG:ns NR:ns ## COG: RSc0201 COG1236 # Protein_GI_number: 17544920 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted exonuclease of the beta-lactamase fold involved in RNA processing # Organism: 
Ralstonia solanacearum # 5 444 3 428 452 354 45.0 2e-97 MATTLTFLGGARTVTGSKYLLSIDPPGGAPDRPGAGPRRILVDCGLFQGMKKLRRANWAP FPVDPASIDDVLLTHAHMDHTGYLPRLVKHGFRGSIWGTDATQALTEIVLRDSAYLQERD ADFANEHGFSRHEKALPLYRIGDVAKTLPLMRSVEFHRPLDLGDGIEATWHRAGHILGSA SIRVATPDGSVLFSGDLGRTTHPVLRSREIPPGADVALIESTYGDREHIEPASPHEAFAA AIRRTVSLGGSVLVPAFAVDRTEVVLAALDGMAAEGRIPPVPVYVDSPMALRSLEIYRAP SQACELRDGARLRLPHLRLTQLRAVEESKRLNTLGAPAIIISASGMATGGRVLHHLARML PDPRNCVVLTGYQAVGTRGRSLADGAQSLKLLGMEVGVRASVVRDEEFSVHADASELVDW ARGLSPAPGRIFCVHGEPDSASALAGALRSRLGLDAAVASPGQTITVDRS >gi|319978909|gb|AEUH01000066.1| GENE 44 45632 - 46303 890 223 aa, chain - ## HITS:1 COG:mlr1523 KEGG:ns NR:ns ## COG: mlr1523 COG3548 # Protein_GI_number: 13471525 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Mesorhizobium loti # 25 213 56 235 246 88 30.0 9e-18 MKFLKRSSRSAAAKGSGGAKGFAQIGSGRLEAFSDGVMAIAITLLSLDIKLPEYSGDLLD DLAALWPTYVGFVLSFMLIGQVWLNHHAVFHILRAVDQRVLVFNLLLLLDVVFLPFATSV LTNSLERGSQARVGAVFYGGVMVVGGLLFNALWHSAIRNPSNLRHEIPRSVVRRMSLRFA LGPVLYAFACVLGAINAWLSIGTYVCLIVFYMMDSMAAPRNGA >gi|319978909|gb|AEUH01000066.1| GENE 45 46454 - 46759 256 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154509336|ref|ZP_02044978.1| ## NR: gi|154509336|ref|ZP_02044978.1| hypothetical protein ACTODO_01861 [Actinomyces odontolyticus ATCC 17982] # 1 92 13 104 105 82 63.0 9e-15 MRSYRSIMAVGAVRAGHDPREVEAAARSAVRLESWDIAVVSGQPRATARFAAADDDEARA SHAAILTGVRRVAEVPGAVLAAVVHGRSRPIASAPADAGRN >gi|319978909|gb|AEUH01000066.1| GENE 46 47132 - 47566 409 144 aa, chain - ## HITS:1 COG:PM2013 KEGG:ns NR:ns ## COG: PM2013 COG2020 # Protein_GI_number: 15603878 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Putative protein-S-isoprenylcysteine methyltransferase # Organism: Pasteurella multocida # 19 139 28 145 145 63 36.0 1e-10 MFLTVGALAVQALVPFPDAGHPWWRWAAAGAFAFASGVLLVGSSMRFYRAGVSADPVGVD HPRALVSTGVYRVTRNPMYVGMACALAAHASIREAWAWLPAACFVVLVDRAQIPVEERAL RNAFPDEYAAYTRRVPRWLGVPRA 
>gi|319978909|gb|AEUH01000066.1| GENE 47 47631 - 48677 1549 348 aa, chain - ## HITS:1 COG:MT3130 KEGG:ns NR:ns ## COG: MT3130 COG1064 # Protein_GI_number: 15842609 # Func_class: R General function prediction only # Function: Zn-dependent alcohol dehydrogenases # Organism: Mycobacterium tuberculosis CDC1551 # 1 348 1 346 346 363 53.0 1e-100 MPQVLAYACDDATTRFHKTTIDLREPGPGEIYFDVKYAGICHSDIHTARGEWGPVAYPLV PGHEFVGVVARVGEGATRFAVGDRVGVGCMVGSCGQCEMCGAGYEQWCTGTPGTLWTYRA DADGNPTTGGYSQGFTVSEDFACRIPDQIPFDAAAPLLCAGITMYSPLRRYGAGPGTRVA IVGMGGLGHVGVKLAAAMGAEVSVISHGRSKEADARRFGACAFHATSEDGTVEALRGSFD LIICTVSANNLDYEGLMGTLRPYGVFVDVGLPEEPVRLPLRAFVNGAKSFAGSQIGGIRQ TQEMLDFCAAHGVAPQVEIIGGEDITGAYDRVVDSKVRYRYVIDTATF >gi|319978909|gb|AEUH01000066.1| GENE 48 48909 - 49520 884 203 aa, chain - ## HITS:1 COG:Cgl0937 KEGG:ns NR:ns ## COG: Cgl0937 COG1309 # Protein_GI_number: 19552187 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Corynebacterium glutamicum # 9 185 3 178 217 62 30.0 4e-10 MVRQDRTPRRRLDPDSRREAILAAAAAAFADRPFADVTIASIAADAQASSSLVYRYFAGK EELYAQVVGLAIEELLDKQAAALDALDEGVPVRDRIGAATLVYLDHIASHPDAWAAPLRG SRGEPQAAAELRVRVRADYVERLAGLLAPSEQARHEYALWGYYGFVDAACLRWVDKGCPP DERWALVDAALGCLGGALGDWAA >gi|319978909|gb|AEUH01000066.1| GENE 49 49590 - 50837 1594 415 aa, chain + ## HITS:1 COG:Cgl0864 KEGG:ns NR:ns ## COG: Cgl0864 COG0366 # Protein_GI_number: 19552114 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Corynebacterium glutamicum # 1 377 1 352 386 402 52.0 1e-112 MDWKHHAIWWHVYPLGFCGAPIHEADPHPGPRLRALMGWLDYAVGLGCSGLLLGPVFASA THGYDTLDHYRIDPRLGSDADFDDLVGACRERGLRILLDGVFSHVGSGHPLLRRALAEGP GSGAAAMFDIDWGAPGGPAPRVFEGHASLARLDHASSAAAEYTADVMRHWLDRGADGWRL DAAYSISTDFWARVLPGVRAAHPDAFFLAEVIHGDYAGFTAAAGVDTVTQYELWKAVWSS IKDRNLFELDWALQRHNAFLSSFTPNTFIGNHDVTRIASAVGPDGAVTALAVLMTVGGIP SIYSGDEQGFTGVKEERAGGDDAVRPAFPASPDGLAPWGWGVYRAHQDLIGLRRRNPWLT TATTSAVALENQRYTYRVRAADSSACLDVSIDLTSTPRATIRDGGGTVLWEQAGH >gi|319978909|gb|AEUH01000066.1| GENE 50 51054 - 
51476 371 140 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRFFQQLSVMPRRQGRLVLVVFLFDFMVFGHGLLLMLNSPERGDLVLLGSVVVAAAVLVG IAAVVRLVRWVGRGCPQVPPTVELLGLAKGLGVGAVVMLVVSLAQLLTGNPVAVQWYFLL IGCFLGAFWAFFDITAENGD >gi|319978909|gb|AEUH01000066.1| GENE 51 51535 - 54492 3086 985 aa, chain - ## HITS:1 COG:no KEGG:Kfla_6006 NR:ns ## KEGG: Kfla_6006 # Name: not_defined # Def: hypothetical protein # Organism: K.flavida # Pathway: not_defined # 5 352 3 332 883 137 30.0 3e-30 MTTADDIRHRLAQTGAMGWGPARSAQVAEAAAWSLDVDDLGLRADAHLALADAYAYGNEG WKAMAPFAWCLAAFEEHPELLTGRRLFLFEWRYKWIVECAADNPRVPVEQVRRLMDGLEG FYRAHGASMHTVHGAAVYVDLSLGDMGAVRRDIAAWRAMGRDRLSDCVACDPERQVIVAS AEEDWERAVASAVPVLSGELGCSHQPANIQSSALLALLASGRPKAAWDAHVRSYRFQRLN PNFTNGLARHMQYLALSGHVGRAVRVLRESLPFASRADSAFSLLLLLRGAAVVLREAVAA GRGDEELGVEVAGDSPWCPAEALDAAATLSQAQARVRQWALDVAGLFDERNGNGFITSWT ARELDRPAFSAAADAFADAGIGLLEAGEQLPEGERLAEDGTVVPAEPPAPANERLAGDDD APPYPAVDTTRPAPLAGAEDALERYAAAFLDPGDGAEGSFVVDQACARGLGPQLEAMDGR DIERFMLLASMRRNVGDHDGASQWLREARGAVAGALDRASGDPGAGADGYCACVLRALGV LIDASLLREARPSGGVVLRDERVEEVRRLADTALACAAEDCGPAQSGAHAAAVSSALSVI AGLASSVGDHDCAGSALEGMEALVGRAEHRRDAEDLVRLNRARVLLERGRTYQACAAAYE VVRGHDPCPPVTAMRARKVIADASDAMAEFREAIAQLREIVNVCLAVGMPVRAASALRYL GEVLLRDERRLECAEVVESALGLVEGCGRTPLEYALRETLCRALMGLGEHQGVFDNSREL AQRDIEGGDRGAAASHLRRAAEAAEALEDRDGAVRAYTEAACLHDGDGLADRVLRGRCLR DAARTAIRGLAQEEARLRLDEARALMEQSRAAIAPLGDDGDYSSAYEMGAWHDQYAALLA ASGEFGEAVEHCEAACAGYMVKNDLEAMARPLHCMLWCLERTGDTDGCRAVIARIRQVYA APRWKDQRPLRYAAVVEQRLEGGEL >gi|319978909|gb|AEUH01000066.1| GENE 52 54489 - 56294 1917 601 aa, chain - ## HITS:1 COG:Cgl0094 KEGG:ns NR:ns ## COG: Cgl0094 COG0326 # Protein_GI_number: 19551344 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Corynebacterium glutamicum # 4 593 9 604 608 266 33.0 7e-71 MAQFHVDLRGMVDLLSRNLYSGPRVYMRELLQNGVDAIAARRQLEPDCPARIVLTVGDGG LACSDTGIGLTEEEAASLLSTIGASSKRDELGLARADFLGQFGIGLLSCFMVSPDITVVS 
RSARAGADGAIEWTGSADGTYCVRPAPAHEARGLPPGPGTRVVLRPFPGEAWMEPATVRA LVDEFASLLDADIEVRGGGAVTRHGRAVPVWRMGGRDRAQWCQENLGFAPIDVIDFDIAA AGLRGAIAISPQEGRVGDARHTVYAKGMLVSRSAYGLAPPWAYFAWVVADSAHLRLTASR EQLVDDEALEDARAAVGERVRAWLDDLARHAPTRFAAFVSAHAMGLRSVATTDGFVLGLV VEHVPFETTQGPRTLAQLAGLGVPVRYTRTVDQYRALADVAASQGVVLVNAGYAYEEEVL SAFLASPALGGSGADIALVDPARLLDSFSPCSPTEEAEAAGLMLAARRAIDPRDCEVVMR RFEPATMPALYLPDPDLAGRQVARASKEAASGVWAQLLGIDDPFARTRPPRMVLNRANPL IARLAALDEGGAAAAQVLRGLYVQCLLQGRQALGPKERAWAGQVLGALIDAALDHPRKDG P >gi|319978909|gb|AEUH01000066.1| GENE 53 56297 - 56770 545 157 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAWWTDPDNAATPPVTMDRVAAWLERIGWEYERRRGGDYDEIHTGAGRTQLLIEIPNPRL LSVRGRETYGTGHGPGAQERVAAAATQVNESCWVPSLFALPSQGDATTVATRALVNIEAG ANDEQLHDYLGLMTTTTVERFKALRELLGIDDEEEAG >gi|319978909|gb|AEUH01000066.1| GENE 54 56770 - 57216 438 148 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSAQPQVTTARIADWLWSQDLGYTIDDDRNIVLRFEAFTIYIFVGAEFSDLLPVRGYWRP ELADTDLERVDAHIHQYTLGGVVVKLGISRRAAPPPRHLFAQTTAFIGKGMSNEQLADYL NMAVSVIGGALQEAAEALPDLAQRQGAR >gi|319978909|gb|AEUH01000066.1| GENE 55 57581 - 58498 1254 305 aa, chain - ## HITS:1 COG:no KEGG:Arch_0447 NR:ns ## KEGG: Arch_0447 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 3 302 22 326 329 319 53.0 1e-85 MTSKPKVTLVTCASMPNLFHGEEGLPDELAQRGCDPRIAVWNDPDVDWDDAGMVVVRSVS DYAADRSAFLEWARGLRRVQNSADVLDWNTDKHYLRDLAARGLPTIPTSWLEPEHGLSKH QIHTRFPAFGEFVIKPAVSSGVRDIGRYTTLSISQRQAAMRQVQSLLREGRSVMVQRYME EIEVHGEISLVFFNGLVSHAVEKRAALHPASVTDPTVHEAVVTAEPADSVAWKWGEEIRR VLHDYVRERNGHDDLLLFNRVDLVPDGQGSFMVMEVSLVDADLYLGTTPRALGNFADAIS ARAHW >gi|319978909|gb|AEUH01000066.1| GENE 56 58623 - 59759 1294 378 aa, chain - ## HITS:1 COG:MT3059 KEGG:ns NR:ns ## COG: MT3059 COG1181 # Protein_GI_number: 15842537 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 13 377 17 372 373 303 49.0 5e-82 
MTDFERPRVLIVFGGRSSEHEVSCATAAGILRAIDRDKWDVVPLGITKDGQWVPASDDPA LLEFKDGKGQSVEAGATRVALTPGGGSLVELSYDGDPADPGARVVGARDLGRVDIVLPLL HGPYGEDGTIQGLLEMAEVRYVGCGVASSAVSMDKHLTKVVLAGAGIDVGRWELITPLQW EADQGACLERAAALGFPVFVKPCRAGSSVGISKVESREGLAVAVKEAQNHDPRVIVEAGV VGREVECGVLGGRSDWRSATAPLGEIEVPDGEFYDYEHKYVDDVVGLVCPAEVAPAYEER VRETALRAFDALECEGLARVDFFLDEDSGTVLVNEVNTMPGFTPISMFPQMWAAAGVDYP SLIDALLEEAAARTVGLR >gi|319978909|gb|AEUH01000066.1| GENE 57 59790 - 60824 1404 344 aa, chain - ## HITS:1 COG:Cgl1288 KEGG:ns NR:ns ## COG: Cgl1288 COG0240 # Protein_GI_number: 19552538 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Corynebacterium glutamicum # 10 337 4 332 332 271 51.0 1e-72 MSEENGYERVAVLGTGAWGTTFARVVAHAGAAGVTVWGRNAAVVEFINDGENPAYLPGIE LPGSVRATTDLRGAVSGADLVVVAVPVKAVRPTMEAAACAIDPGAGVLSLAKGIELSSFK CVDEIIAHSAGVDEDRIAVLSGPNLSREIAEENPTATVIASENLELAKRVASTCHTPYFR PYVSTDVIGTEIAGATKNVIAVAIGAAQGMGLGINTRSTLQTRGLAEMTRLGTALGADPA TFAGLAGIGDLIATCSSRLSRNFSLGHRMGEGMGLEEALERSPGVVEGVASARPILQLAR GAGVDMPITEAVVRVVEGDATIEDMGHMLLSRPRKMDGWKITLV Prediction of potential genes in microbial genomes Time: Thu May 12 17:27:03 2011 Seq name: gi|319978902|gb|AEUH01000067.1| Actinomyces sp. oral taxon 178 str. F0338 contig00067, whole genome shotgun sequence Length of sequence - 12946 bp Number of predicted genes - 10, with homology - 8 Number of transcription units - 8, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 46 - 843 1114 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 2 2 Op 1 . - CDS 1066 - 1956 863 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase 3 2 Op 2 . - CDS 1987 - 3309 1995 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase 4 3 Op 1 30/0.000 - CDS 3548 - 4186 1007 ## COG0066 3-isopropylmalate dehydratase small subunit 5 3 Op 2 . - CDS 4206 - 5606 1752 ## COG0065 3-isopropylmalate dehydratase large subunit 6 4 Tu 1 . 
+ CDS 5625 - 6524 866 ## COG1414 Transcriptional regulator + Term 6623 - 6676 12.2 - Term 6608 - 6663 7.2 7 5 Tu 1 . - CDS 6676 - 7194 -116 ## - Term 7229 - 7271 4.0 8 6 Tu 1 . - CDS 7513 - 7599 62 ## - Term 8147 - 8202 16.1 9 7 Tu 1 . - CDS 8219 - 12316 5682 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases 10 8 Tu 1 . + CDS 12547 - 12946 436 ## Bcav_2778 transcriptional regulator, TetR family Predicted protein(s) >gi|319978902|gb|AEUH01000067.1| GENE 1 46 - 843 1114 265 aa, chain - ## HITS:1 COG:ML0892 KEGG:ns NR:ns ## COG: ML0892 COG0204 # Protein_GI_number: 15827414 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Mycobacterium leprae # 7 195 3 189 244 91 35.0 2e-18 MSRVTGFYRFAKCVLTPIMTPWVRFDAEGTENLPAEGAFLLVSNHLSNVDPLCLCWYFMK RDTAVRFLAKRSMFKVPFFGWIFKGMGLIPVDRDQNPAAVLAPTREALAAGEVVGIYPEG TLTRDPGEWPMAFKSGAARLALDTGVPVIPVSQWGAQRIMAPYNSKRINLKPRRPLTYRF GAPVDLSDLMSGAGSADHAAVDEATRRIALAVRQGVAGIRGAEAPEEIWDPKTEAGPWWE HEQRKRAKKERKRAKKERKGAREGE >gi|319978902|gb|AEUH01000067.1| GENE 2 1066 - 1956 863 296 aa, chain - ## HITS:1 COG:Rv2252 KEGG:ns NR:ns ## COG: Rv2252 COG1597 # Protein_GI_number: 15609389 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Mycobacterium tuberculosis H37Rv # 4 295 17 304 309 88 28.0 1e-17 MRLLVSSMSAGGRALRVGPSVVAALRGGGWDVEVVVTTAGDDPASMVAPGADAVGALGGD GFLAAVAQGCHDHGAVLVPMPGGRGNDLCRALGVGADPLQRASGLAPLGSDRGAIGGSVR AIDGMWVRGDGAGRRLALGIVSLGLDARANLLANRSVLSNGPLAYAYGAFAALATHRPAP IRARVDGEERDLSGWIASVSNSGRFGGGIALVPGADMCDGVLEVCHVGPIPTRSALAALA RVVAGRGAHDPRIRVSSARLVEFLEPQGMTAMADGDVVATVPFTVEAAPGAVRVLV >gi|319978902|gb|AEUH01000067.1| GENE 3 1987 - 3309 1995 440 aa, chain - ## HITS:1 COG:CPn0571 KEGG:ns NR:ns ## COG: CPn0571 COG0766 # Protein_GI_number: 15618482 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Chlamydophila pneumoniae CWL029 # 4 440 3 438 458 337 42.0 2e-92 MSSVLRVEGGHPLRGEITVRGAKNFVPKAMVASLLADTPSELYNVPLIRDVDVVSDLLGL HGVTADFDQDRGTMMLDPTNVKIASRSDIDTLAGASRIPILFCGPLLHQLGEAFIPELGG CAIGGRPIDFHLDTLRKFGAVVDKRPDGVRIRRPASGLHGAIIELPFPSVGATEQTLLTA VRNEGITTLRGAAIEPEIMDLINVLQKMGAIISVDTDRTIRIEGVGKLRGYRHTALPDRI EAASWAAAALATHGDIYVRGALQPDMTTFLNTFRKVGGAFDIDADGIRFYHPGERLHSIA LETAVHPGFMTDWQQPLVVALTQADGLSIVHETVYENRFGFASALRQMGATIQVYRECLG GTPCRFGQRNFFHSAVISGPTPLHAADIEVPDLRGGFSHLIAALAARGTSTVSGIDVISR GYEHFTRKLHQLDARVSYVD >gi|319978902|gb|AEUH01000067.1| GENE 4 3548 - 4186 1007 212 aa, chain - ## HITS:1 COG:Cgl1285 KEGG:ns NR:ns ## COG: Cgl1285 COG0066 # Protein_GI_number: 19552535 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Corynebacterium glutamicum # 1 194 1 195 197 290 72.0 2e-78 MEKFTTHTGVGVPLRRSNVDTDQIIPAVYLKRVTRTGFEDALFAAWRSDPGFVLNQAPYR NGSVLVAGPDFGTGSSREHAVWALKDYGFRVVLAPKFADIFRGNSGKQGLVTGIISQEDC ELLWKLLETEPGTEITVSLEDKTFRAGAVSGTFQIDDYVRWTLMEGLDDIALTLRHEDEI AAYEAGRPGFKPATLPARTLPKEEVASARGSG >gi|319978902|gb|AEUH01000067.1| GENE 5 4206 - 5606 1752 466 aa, chain - ## HITS:1 COG:MT3066 KEGG:ns NR:ns ## COG: MT3066 COG0065 # Protein_GI_number: 15842544 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Mycobacterium tuberculosis CDC1551 # 4 466 10 472 473 704 75.0 0 MSGTMAEKVWRDHIVSKGADGAPDLLYIDLHLVHEVTSPQAFEGLRLAGRPVRRPDLTIA TEDHNTPTVDIDRPIADITSRTQIDTLRENAREFGIRLHSLGDADQGIVHVVGPQLGLTQ PGMTVVCGDSHTSTHGAFGALAFGIGTSQVEHVLATQTLPMAPFKTMAITVNGSLPPGST AKDIILAIIAKIGTGGGQGYVLEYRGKAIRELSMEGRMTICNMSIEAGARAGMIAPDDTT FAYIKGRPHAPEGEDWDRAVEYWRSLASDDDAVFDAEVVLEAADIEPFVTWGTNPGQGVP LSAAVPSPEDFADENKRAAAKRALEYMDLQAGTPMRDIAVNTVFIGSCTNGRIEDLRAAA GVLKGRRKADGVRVMVVPGSARVRLQAEREGLDKVFTDFGAEWRNAGCSMCLGMNPDTMA PGERSASTSNRNFEGRQGKGSRTHLVSPVVAAATAVRGTLSSPADL >gi|319978902|gb|AEUH01000067.1| GENE 6 5625 - 
6524 866 299 aa, chain + ## HITS:1 COG:Rv2989 KEGG:ns NR:ns ## COG: Rv2989 COG1414 # Protein_GI_number: 15610126 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Mycobacterium tuberculosis H37Rv # 62 273 5 215 233 195 57.0 1e-49 MLDRGGPFSPPIPHPSAYLISRTDSLTERDSSGQNPYCFTFRRVAISLIETENRCMEEQT SSGVGVLDKAALVLSALEPGPATLAQLVSTTGLARPTAHRLAVALEFHRMVARDMQGRFI LGPRLQELSTAAGEDRLLAASMPVLQALRDHTKESSQLFRRQGDYRVCVAASEREMGLRD SIPVGATLSMSAGSAAQVLLAWEEPDRLHRGLYGASFNATMLSEVRRRGWAQSVGEREPG VASVSAPVRGPSGRVLAALSISGPIERMGRQPGRQHGPSVMAAANRLSEFLHSVDEADF >gi|319978902|gb|AEUH01000067.1| GENE 7 6676 - 7194 -116 172 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSARYIPLAVSTTPSQCRQPAPVSATPSQCRQRASVSATPSQCRQRAPTPVPGECPSPAS DCSSEGLCPPRAWGIPSEGGEHHGMPRMPPCPGDALFFVAVGGTKRKMPPRVWGLPVPAP RQGNARTRQGWPVLAGDEDAQAWERPHARQGWSPVRLPPGCPPPLGVRAQEP >gi|319978902|gb|AEUH01000067.1| GENE 8 7513 - 7599 62 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLSPCLSGTVRQIAYPDATESVPFWDC >gi|319978902|gb|AEUH01000067.1| GENE 9 8219 - 12316 5682 1365 aa, chain - ## HITS:1 COG:DR0405 KEGG:ns NR:ns ## COG: DR0405 COG1523 # Protein_GI_number: 15805432 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Deinococcus radiodurans # 308 1171 78 910 910 661 47.0 0 MDEGVFVRTLIRWKHRRKHRASISAALAAAASLALGGLSSLGAPALAADTPPTGAQSGAP TETAPDLVTIPGSHNTAMGCASQWQPDCSSAALARDPESGLYTASFDLPKGDWEYKVAIG GTWDVSYGANGTPGGANIAYSTTEAATVTFYYDPATHRVWNTADGPVITLPGSLQSELGC TRGDNGGDWAPDCLAPLMVPRGDGTWTYSTDRLPAGSYELKVAHGLSWNENYGVDGAPGG PNYSFNTKAGNLVTFTYTLATHKLDIDVADAPLKGVGEQRAYWVGEKTLAWPVSLLPDGV SREQVVSGEAGLAYTLVTAPEGGAGVDEGAVSGGDSHPLTVTGDLPSQVTDAHPNLRGYL ALGIGDSLTSEQVREALRGQVAVAQGRSGGALSAFTGVQLAPVLDEVYAATARDGALGVT WADGAPSFALWAPTARKVSLLTWNTSDPTGSAAEAADEAVRTEAQAGADGTWRVANADGA ITKGAQFQWEVEVYVPSTGKVETNTVTDPYSVALTTDSTRSVAIDLSDGALAPAQWASTP APKVRNDASRAIYELHVRDFSAKDQSVPEEMRGTYKAFTVSGSAGMTHLGELAGAGMNTV 
HLLPTFDIATIPEKRAEQKTPQIPEGAGPASEEQQAAVSAVADEDAFNWGYDPLHWMAPE GSYATDANQNGGGRTVEFREMVGALHATGLQVVLDQVYNHTAASGQDAKSVLDRIVPGYY QRLNASGGVENSTCCSNVATENAMSERLMIDSLVMWARHYHVDGFRFDLMGHHSRQTMER AKAALSQLTLEADGVDGSSLYLYGEGWNFGEVANNALFTQATQGQLDGTGIGAFNDRLRD AVHGGGPFDEDHRVYQGFGSGAFSDPNGLDTRSEAERQADYQHRVDLVKIGLTGNLKGYA MTTYDGRAVTGEQLDYNGQGAGYASQPAESVNYVDAHDNETLFDLLTYKLPTSVPMGDRV RMNTLSLATVALGQSPSFWSSGTEILRSKSLDRDSFNSGDHFNSIDWTGQDNGFGAGLPV ASKNGDKWPIMRPLLEDASLKPAPADIASAKGQALDLLRLRASTPLFALGSAELIGQKVT FPDSGAAPGSLTMLVDDTVGADADPALDGVLTVFNASDKPLTQTVGALAGRPFQLSDVQA GGSDEVVKGATFDAATGTLTVPARTVAVFTQATGGVGPQPLTGQWMKDGSRWWYRYSDGT YPSSTRLTIDGADYAFDAQGWMVTGWDSTDGQWRYYGSSGAAVSGWLRDGGSWYYLDPAT KAMATGWLELGGTWYHLAPSGAMSTGWARVDGQWYHFGPSGAMDRGWLYVRGSWFYLTDS GAMAIGWVPVGGSWYYMHASGVMATGTQVIDGTTYRFSSSGRWIG >gi|319978902|gb|AEUH01000067.1| GENE 10 12547 - 12946 436 133 aa, chain + ## HITS:1 COG:no KEGG:Bcav_2778 NR:ns ## KEGG: Bcav_2778 # Name: not_defined # Def: transcriptional regulator, TetR family # Organism: B.cavernae # Pathway: not_defined # 11 122 12 124 207 62 39.0 4e-09 MARPRKPIDIDRRGALMREARSSFARLGYAGTSLSSVLRASGFPRSSFYYFFDGKEALLD EAFADGLSRLAECAPPPDPGALTAETFWPRLFGFVDALAEAGTDPDLAATARLFHMPDAP ASARRASFERGAR Prediction of potential genes in microbial genomes Time: Thu May 12 17:27:20 2011 Seq name: gi|319978896|gb|AEUH01000068.1| Actinomyces sp. oral taxon 178 str. F0338 contig00068, whole genome shotgun sequence Length of sequence - 5504 bp Number of predicted genes - 7, with homology - 5 Number of transcription units - 7, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 168 178 ## gi|269219897|ref|ZP_06163751.1| transcriptional regulator, TetR family 2 2 Tu 1 . + CDS 294 - 1163 1046 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases - TRNA 1278 - 1350 54.8 # Glu CTC 0 0 - TRNA 1383 - 1455 54.8 # Glu CTC 0 0 - TRNA 1479 - 1553 64.3 # Gln CTG 0 0 - Term 1619 - 1655 -0.8 3 3 Tu 1 . 
- CDS 1657 - 3168 2278 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 4 4 Tu 1 . + CDS 3167 - 4471 1639 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities - Term 4376 - 4415 1.7 5 5 Tu 1 . - CDS 4468 - 4632 204 ## 6 6 Tu 1 . - CDS 4746 - 4808 75 ## 7 7 Tu 1 . - CDS 4925 - 5503 788 ## COG0005 Purine nucleoside phosphorylase Predicted protein(s) >gi|319978896|gb|AEUH01000068.1| GENE 1 1 - 168 178 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|269219897|ref|ZP_06163751.1| ## NR: gi|269219897|ref|ZP_06163751.1| transcriptional regulator, TetR family [Actinomyces sp. oral taxon 848 str. F0332] # 1 52 147 198 199 62 61.0 7e-09 GTLDEELPEELHAELVWAVAAALDRWMASRADGHADSRALVRAVMARLLGAPGAP >gi|319978896|gb|AEUH01000068.1| GENE 2 294 - 1163 1046 289 aa, chain + ## HITS:1 COG:all5305 KEGG:ns NR:ns ## COG: all5305 COG0702 # Protein_GI_number: 17232797 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Nostoc sp. 
PCC 7120 # 6 234 4 227 291 76 27.0 6e-14 MSTITIAGATGYLGRHLVAEFHRRGHTTTAIVRDAERARSAGPWGAPSLDGLVDHWIVGD VTDPRTTAGAAAGSDHVVSALGVTRQNADPWTIDNLANKAVLASALRHGANSFTYVNALG AERCPTRLTRAKTAFARALAGSGITAQIINPPAYFSDMMALLSMARHGLVAVMRREARIN PVHGADLARYIADRVESGDAGQWDVGGPDTLTWEQWARTAFHVLGRRARVVTAPRWLVGP AVRATAMASPRKGDTARFAVWNMLHDCVAPATGTHRLADFYAEYARGPR >gi|319978896|gb|AEUH01000068.1| GENE 3 1657 - 3168 2278 503 aa, chain - ## HITS:1 COG:Cgl1263 KEGG:ns NR:ns ## COG: Cgl1263 COG0008 # Protein_GI_number: 19552513 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Corynebacterium glutamicum # 10 500 3 485 493 605 61.0 1e-173 MSETTATGADVRVRFCPSPTGTPHVGMVRTCLFNWAYARHTGGTFVFRIEDTDAARDSQE SFDQIIESLQWLGLDWDEGVGKGGPYGPYRQSERMGLYRDVAARLLEAGYAYESYSTPAE VEERHRAKGEDPKLGYDGFDRDLSAEQVAAYRAQGRQPVLRLRMPDEDITFTDLVRGEIT FKAGSVPDYVIVRANGHPLYTLVNPVDDALMRITHVLRGEDLLSSTPRQIALYRALEAIG VAGAMPRFGHLPYVMGEGNKKLSKRDPESNLLLHKEAGMVPEGLNNYLALLGWSIAPDRD VFSMREMVGAFDISDVNPNPARFDHKKAVAINAEHIRLLDGADFRDRLVPFLHRDGLVSA DSFGALAPREQEVLSEAAPLVQTRMQVLGEASGMLGFLFVSDSGLEVDAKAVAKLKDNAG EVLDEAIAFVEGLGTEEFATDRLEAGLRQRIVDGMGIKPRLAFGPLRVAVTGRQVSPPLF ESMEILGRESSLARLRALREALA >gi|319978896|gb|AEUH01000068.1| GENE 4 3167 - 4471 1639 434 aa, chain + ## HITS:1 COG:Cgl2256 KEGG:ns NR:ns ## COG: Cgl2256 COG1168 # Protein_GI_number: 19553506 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Corynebacterium glutamicum # 57 416 14 353 368 199 33.0 1e-50 MIAQSYSSAAVAKAHQSVRRCAGEGAAIGLSRPTPPIRGAALTVRTFTDAQLYRTGSLKW TGVTRSDGSPTIGAWVAEMDFGTAPEVAAAMKGAIDDGLLGYQPPWLEGAVAEATAAFQR RRFGWEVEPGQVRLAASVLPALEATIRHLARPGSPVVVPTPAYMPFLTIPPRLGHPVIEV PSLRGTRAGGHGWALDLEGIRAGLEAGAGLVVLCNPWNPTGRVLRKEELRALADVVGQYD ALVFSDEIHSPLVLDEDVPFVPYASLNGSCAAHTVTATAASKGWNIAGLPSAQVILPDEA LRARWDAFRDEVAHDATVIGTVGAIAAYEKGGAWQREVKDLIRANVDLVESRLSGSPIDF 
VRPEATYLTWWGFEGVDLGRRSPSAVLRDEAGVAANEGLTLGADYYKWARINLACAPDVA RRIVDGVLALDALR >gi|319978896|gb|AEUH01000068.1| GENE 5 4468 - 4632 204 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVQNASESARNRRNRPDSVELRRILEQRALVPLERGEAQTHFGPPGDRCGKGPR >gi|319978896|gb|AEUH01000068.1| GENE 6 4746 - 4808 75 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MASPARLCDPFHPQREEDAP >gi|319978896|gb|AEUH01000068.1| GENE 7 4925 - 5503 788 192 aa, chain - ## HITS:1 COG:MT3406 KEGG:ns NR:ns ## COG: MT3406 COG0005 # Protein_GI_number: 15842898 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Mycobacterium tuberculosis CDC1551 # 12 185 82 258 268 105 43.0 6e-23 EVRVYDHASGPVLVATGRTHLYEGHGPRPVTALARAAAGAGVERAVLTNANGCLRDWQLG DVMAVTDHMNLSGASPFDGPLFLDVSAVWDPELTAALRGPCQREGTYAILRGPEYQTPAE TRALAGMGVDCVGMSTVMEALALHALGVRVAGMSVVSDLSFAAAPTDPGLVVEAAARAGE TVRVGIEAALAA Prediction of potential genes in microbial genomes Time: Thu May 12 17:27:34 2011 Seq name: gi|319978891|gb|AEUH01000069.1| Actinomyces sp. oral taxon 178 str. F0338 contig00069, whole genome shotgun sequence Length of sequence - 4184 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 157 105 ## 2 1 Op 2 . - CDS 154 - 1656 2309 ## COG1457 Purine-cytosine permease and related proteins 3 2 Op 1 . - CDS 1796 - 2563 1092 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) 4 2 Op 2 . 
- CDS 2573 - 4078 682 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 Predicted protein(s) >gi|319978891|gb|AEUH01000069.1| GENE 1 1 - 157 105 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTAPGGADPRFVSALLDEEATAAASALLSLVGGEPRALVVLGSGLAPALDAA >gi|319978891|gb|AEUH01000069.1| GENE 2 154 - 1656 2309 500 aa, chain - ## HITS:1 COG:PA2073 KEGG:ns NR:ns ## COG: PA2073 COG1457 # Protein_GI_number: 15597269 # Func_class: F Nucleotide transport and metabolism # Function: Purine-cytosine permease and related proteins # Organism: Pseudomonas aeruginosa # 18 413 12 399 476 89 25.0 1e-17 MSDSSAPTPGNAQGGLAVEMAGLDVIPESDRKGRPSDLFMPWFAANISVLGISWGSWVLG FGLSFPQAVVVSVIGVSLSFLVCGLVALAGKRGSAPTLVLSRAAFGYNGNRVSALISWIL TVGWETFLCIMAVQATATVMGALGFENHLVAQILALVLVVVLAAGSGVLGFEAIMRVQTW ITWATAVLTLVYLVLAAPTIDMGALMARPAGPVAAVLGAGCMLLVGFGFGWINAAADYSR YLPRTASTKGVVGWTTFGAALPTVVLVFFGVLLVGSNFDALSEAIGDDPIGALTTILPTW FLIPFALVAILGLIGGIVMDIYSSGLSLLATGVPVSRPVATAIDGTIMTVGTIVVVFFAD SFLTPFTAFLTTLGVVIAAWGGVMLADIALRRKDYDEPSLFVSDGRYGSVNWGAVATLAV ASFVGWGLVVNTNASWLSWQGYLLGPLGGREGDWAGANIGILLALVVGFAGYWATSAGRV RAQEAQPSRTYAQGAPEGER >gi|319978891|gb|AEUH01000069.1| GENE 3 1796 - 2563 1092 255 aa, chain - ## HITS:1 COG:ML1689 KEGG:ns NR:ns ## COG: ML1689 COG0179 # Protein_GI_number: 15827898 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Mycobacterium leprae # 33 254 11 238 242 187 47.0 1e-47 MRIVRFSDGSSPCYGALDDGSTRIVGLKGDPLFAAVEPSGRVFELDEVRLLSPVIPRSKV VGVAGNYSDEPVPDAGRTRPPLFLKANTSVIGPDDPIAIPVWSDDVVFEGELAVVIKSLA KDVPVEEAHQVVLGYTVANDATARDAVDGGPWARGKSFDTACPLGPWITVDPGLDPTDLA IRSTVNGAPAQDGSSADMIWNVFELVSYASMAFTLLPGDVVITGAPPGCGPVKAGDRVEI TIEGIGTLSNPVVRA >gi|319978891|gb|AEUH01000069.1| GENE 4 2573 - 4078 682 501 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 21 456 20 440 
456 267 36 1e-71 MEEFAAVEKQIADWITTHFTIVILLGVGLLLTIVSRGVQVRMLPEMVRTVFGSRQGAKGG ISSFQAFAISLAARVGVGNVFGVAAALLLGGPGAIFWMWVVALVGMATAFFEATLAQIFK VRHPDGAFRGGPAYYMKRGMRNRAMANVFAVITVLTCGVVITSVQSNAIAGTLTEAFGEG ARQPLEGAGGFSAAQLTVAGLVFVFSAMVIFGGIRTVARVTEWMAPIMALVYVVMVAVIC VLNIGQFGTVLGQIFTSAFSPEPLVGGLGGGILAAMVNGTQRGLFSNEAGQGTAPNAAAT ATVSHPVRQGLIQSLGVFIDTIVVCTATAFVILIAGAEVWGGRGVNPSNLTTLAVSRELG AWTIVPMAVLVFVLAYSSIIAAYVYSDTNMSFVFGGRAWATWVVRVVCVASATAGALLSL DVVWNAVDIAMAVMTITNLIALVILMRWGLGALRDYQDQRRAGIADPVFVGEGNPLLPRD VPGRVWKRPKKSRARKPSSAG Prediction of potential genes in microbial genomes Time: Thu May 12 17:27:44 2011 Seq name: gi|319978873|gb|AEUH01000070.1| Actinomyces sp. oral taxon 178 str. F0338 contig00070, whole genome shotgun sequence Length of sequence - 16687 bp Number of predicted genes - 17, with homology - 15 Number of transcription units - 10, operones - 6 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 37 - 2790 3612 ## COG4581 Superfamily II RNA helicase 2 1 Op 2 . + CDS 2813 - 3202 401 ## Jden_1632 protein of unknown function DUF1696 + Term 3434 - 3473 2.2 - Term 3990 - 4033 3.5 3 2 Op 1 . - CDS 4123 - 4830 885 ## Ksed_00260 hypothetical protein 4 2 Op 2 . - CDS 4874 - 5308 451 ## COG0346 Lactoylglutathione lyase and related lyases 5 3 Op 1 . - CDS 5519 - 6028 686 ## COG2606 Uncharacterized conserved protein 6 3 Op 2 . - CDS 6047 - 6502 613 ## COG3304 Predicted membrane protein 7 4 Op 1 . + CDS 6647 - 7384 791 ## COG0325 Predicted enzyme with a TIM-barrel fold 8 4 Op 2 . + CDS 7417 - 8037 990 ## COG2095 Multiple antibiotic transporter - Term 8132 - 8161 -0.9 9 5 Tu 1 . - CDS 8253 - 9302 1413 ## COG0095 Lipoate-protein ligase A 10 6 Op 1 . - CDS 9446 - 10060 560 ## Sked_12140 hypothetical protein 11 6 Op 2 . - CDS 10078 - 10632 468 ## Cfla_0433 aminoglycoside-2''-adenylyltransferase 12 6 Op 3 . - CDS 10690 - 12444 2514 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases 13 7 Tu 1 . 
+ CDS 12640 - 14301 2461 ## COG1620 L-lactate permease 14 8 Tu 1 . + CDS 14409 - 14939 600 ## gi|293189534|ref|ZP_06608254.1| hypothetical protein HMPREF0970_00569 15 9 Op 1 . + CDS 15071 - 15400 380 ## 16 9 Op 2 . + CDS 15427 - 15717 271 ## + Prom 15722 - 15781 4.0 17 10 Tu 1 . + CDS 15823 - 16668 1252 ## SGR_894 hypothetical protein Predicted protein(s) >gi|319978873|gb|AEUH01000070.1| GENE 1 37 - 2790 3612 917 aa, chain + ## HITS:1 COG:Cgl1878 KEGG:ns NR:ns ## COG: Cgl1878 COG4581 # Protein_GI_number: 19553128 # Func_class: L Replication, recombination and repair # Function: Superfamily II RNA helicase # Organism: Corynebacterium glutamicum # 2 907 7 840 850 756 46.0 0 MLNALLDDLEDSGSLTDDDALYAAFAGWAEGTGRPLYPHQDEALIEILAGNHVIAATPTG SGKSMIALAAHFVSMAHGGRSYYTAPLKALVSEKFFDLVALFGADNVGMVTGDVSLNATA PIICCTAEILANQSLREGPGLDCDMVVMDEFHFYGDRQRGWAWQVPLLELTGAQFVAMSA TLGDTSRFEEAWRERTGRPVALVDDAERPVPLEFEYVVDRLPDTVERLLGEGRWPVYIVH FSQRDAVATAKSFDRNSLVPKDRKAQIGAALAGVHFGRGFGQTLRSLLVQGVGVHHAGML PRYRRLVERLTQRGLLPIVCGTDTLGVGINVPIRTVLMTSLIKFDGARMRHLSAREFHQI AGRAGRAGFDTVGFVRVLAPEHEVDAARERARLSAAQEAARDAREAKRAAKKAAKKRKGP ADGSVSWTRSTFDRLVGAAPEQLRSRFEMTHAMVLNVLAGAQGAGRDPGEHLVWLARSND DPPTASNPHLRRLGDIYSSMRRAGVVSHESSARAAAQGRPRLRAATDLPDDFALNQPLSP FALAALELLDPSSPEFALDVVSVVEAVLEDPRPLLFAQEKAARGEAVAAMKAQGMEYEER MEALEEVTWPRPLAELLEAAFHTYVAANPWVGALEISPKSVVREMVENAMTFTELVSRYD VGRSEGVVLRYLTDAYRALRQIVPESMQTDEVRSIVEWLAALIRAVDSSLLDEWEALSQG RSWDQAGDADASAGAELAFGADEDGTVAFSANRHAFRTAVRAAMFARVELMSRDDVDGLA RLDGAGAPGALAGAGPWSADDWDRALERYWAEHDWIGIDAAARSASMCALNEAPAREDVL AVLPPSASSDADRRLAGRAEALADLVDEALPGSYWLAEQTLADPDGDHDWRIAALVDVAA SDRAGAVDLVIVTVAAR >gi|319978873|gb|AEUH01000070.1| GENE 2 2813 - 3202 401 129 aa, chain + ## HITS:1 COG:no KEGG:Jden_1632 NR:ns ## KEGG: Jden_1632 # Name: not_defined # Def: protein of unknown function DUF1696 # Organism: J.denitrificans # Pathway: not_defined # 8 128 4 124 124 179 68.0 4e-44 MQEQTGADAAAIARWAFFNEIPIPEDVGGILVPGETPYAAFATTRDSAVFTSHRLFVRDA 
QGITGKKVEIYSLPYRDIIMWSSENAGTIDMTAEVKLWTRAGLVRVNLGRNIDVRRIDYL IANCVLNSH >gi|319978873|gb|AEUH01000070.1| GENE 3 4123 - 4830 885 235 aa, chain - ## HITS:1 COG:no KEGG:Ksed_00260 NR:ns ## KEGG: Ksed_00260 # Name: not_defined # Def: hypothetical protein # Organism: K.sedentarius # Pathway: not_defined # 1 224 1 219 242 143 43.0 4e-33 MTSTSTPSLSRRGRAAYAVVAVLAWAGVAATVSITTAGGYPPPAFYEPGLFAGAPEGWAG APQRLVECLSYFTELSNIVVAVIATRLARGSGPVGRWTRAIHLCATMMITVTAIVYAVLI APSETLRGIAVITNPLQHVVVPAAMVAAAFVFGPRGGITWGTVVRALAIPVAWVAYTMAR GALVHQYPYAFLNVTRIGYGQALAVIGAILAGAMAFLGLFAAVDWGLRRAARARA >gi|319978873|gb|AEUH01000070.1| GENE 4 4874 - 5308 451 144 aa, chain - ## HITS:1 COG:DR1341 KEGG:ns NR:ns ## COG: DR1341 COG0346 # Protein_GI_number: 15806358 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Deinococcus radiodurans # 3 138 2 137 144 169 62.0 1e-42 MTSGPALVPELAVTDYPASKRFWCGLVGFSVEYERPEEGFGYLVLGGAHVMLDQIDAGRT WATADLDPPLGRGINLEVQVADLDAARRRIRDAQWPVFVETEEKWYRAGGIEIGVRQFLV QDPDGYLLRLQQEIGERPAPAARA >gi|319978873|gb|AEUH01000070.1| GENE 5 5519 - 6028 686 169 aa, chain - ## HITS:1 COG:HI1434 KEGG:ns NR:ns ## COG: HI1434 COG2606 # Protein_GI_number: 16273339 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Haemophilus influenzae # 14 162 2 151 158 137 50.0 1e-32 MARKHGKGQGGGPTRAIEELTAAGAPFTVHEYEHDPAARAFGEETVEKLGIDPAQAFKTL MVRLEPTGEFAVGCVPALARLSMKLIARAAGAKHAEMADPAVAQRRTGYVVGGISPLGQT TRHRTFIDESCLDHEAVVVSGGRRGLSVELSPLDLAELAGAEVVPLCAQ >gi|319978873|gb|AEUH01000070.1| GENE 6 6047 - 6502 613 151 aa, chain - ## HITS:1 COG:Cgl0825 KEGG:ns NR:ns ## COG: Cgl0825 COG3304 # Protein_GI_number: 19552075 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Corynebacterium glutamicum # 23 144 1 122 136 139 64.0 2e-33 MAPGAGAWHTDLFGPRIERLTAMRIILNVIWLVFGGIELFVLYVLAGLVSMLFIITIPAS VACFRIAAYVLWPFGRRVVPLPGAGAGSALMNLVWFVVAGLWLSIGHVFTALLQAITIIG LPLAIANLKMIPVTCFPFGTAIVDDRFASVL 
>gi|319978873|gb|AEUH01000070.1| GENE 7 6647 - 7384 791 245 aa, chain + ## HITS:1 COG:BH2550 KEGG:ns NR:ns ## COG: BH2550 COG0325 # Protein_GI_number: 15615113 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Bacillus halodurans # 1 228 1 224 228 113 34.0 4e-25 MTIYEAISAARRAIDEAARAAGRTDRVALELAAKTRTPGECYEAAACLARLGGPVLVGHN RVQEARATAEAVRRVDGARIHLIGPLQSNKINQALACVDAVETVSSAELARRIDARATRP LPVFVQVNVSGEATKSGCAPDAVAPVVDAVSECANLRLAGFMTVGLNSTDEAPVRRAYAR LRSIRDAAAARTGIGAASLELSMGMSRDMAWAIAEGATIVRLGTAVFGARLVRLPAPAGA GWNAH >gi|319978873|gb|AEUH01000070.1| GENE 8 7417 - 8037 990 206 aa, chain + ## HITS:1 COG:aq_540 KEGG:ns NR:ns ## COG: aq_540 COG2095 # Protein_GI_number: 15606002 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Aquifex aeolicus # 8 203 13 211 214 93 30.0 3e-19 MTTVLAGFTAALLALAPITNPIGALAAFAGLTAGGDPGSVRSQAWKTGGYVFGILTAFAV SGSVIMRFFGFDLPALQIAGGIVVAHSGFSMLENKRRTTTDEEQHAAAKEDVSFSPMALP LIAGPGSIGVVIALAARDPSPAFHVGIVLGVLVASAVIAALLRYATPWIDKLGATGVGAI VRIMGFLILVIGVELIIHGIRSLQLF >gi|319978873|gb|AEUH01000070.1| GENE 9 8253 - 9302 1413 349 aa, chain - ## HITS:1 COG:Cgl1050 KEGG:ns NR:ns ## COG: Cgl1050 COG0095 # Protein_GI_number: 19552300 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Corynebacterium glutamicum # 2 348 4 351 352 342 51.0 8e-94 MHGEYKVPGGKLVVVDLEVDQDRLIDVNVSGDFFLDPDDALTRITGALEGAPASSSAKEL SALVAAALHGGDTLMGVNPEAVGIAVRRALGAALSWDDIDFEVIHGPVVDPMVNVALDET LVEDVAAGRRRPFMRLWEWNGPQVVIGSFQSYQNEVQPDGVERYGITVSRRVTGGGAMFM EPGNCITYSLVIPTALVDGMSFEQAYPYLDQWVMEVLEKLGIKAKYVPLNDIASEAGKIG GAAQKRWANGYMVHHVTMAYDIDAVKMNEVLRIGMEKIRDKGTRSAVKRVDPMRSQTGLP REEIMRVFFDHFKDKYHATEGAITEEDLEVARARCETKFARPEWVHRIP >gi|319978873|gb|AEUH01000070.1| GENE 10 9446 - 10060 560 204 aa, chain - ## HITS:1 COG:no KEGG:Sked_12140 NR:ns ## KEGG: Sked_12140 # Name: not_defined # Def: hypothetical protein # Organism: 
S.keddieii # Pathway: not_defined # 64 204 10 148 150 68 32.0 1e-10 MSAPARPALQARAAAAPWRSRRRPIRWDDWTGHCAGGEMRRFVCSELIAGIGEVERFDAF QEPGAFADWVRATLESPRRFDGSRRYALILWALPPGAGDPEDVPLDHYSRRNYIQCGGSA EAMTVEIRVTGADGAYEHYRVAREPVRDPGAWVEVEWEFGGGEPFTAELHPEEVFTGCQA SPVFHDYIVNGGLPPAELLRRLDI >gi|319978873|gb|AEUH01000070.1| GENE 11 10078 - 10632 468 184 aa, chain - ## HITS:1 COG:no KEGG:Cfla_0433 NR:ns ## KEGG: Cfla_0433 # Name: not_defined # Def: aminoglycoside-2''-adenylyltransferase # Organism: C.flavigena # Pathway: not_defined # 13 156 1 143 157 87 38.0 3e-16 MRANGNDGADGGMHADEVIAVVDWLEQHGAVHVITGGWAVDALVGRATRRHRDLDVIVEA GACDQLSQWLRGRGYQVVVDWLPIRVELRKGDLGVDLHPMDVDERGDGLQAGFGSQVFAH RAADRTLGSISGRSVVVATASRLMELHEGYDPRPEDLHDMALLRRLLEEGRAGSSAGPVG HPPV >gi|319978873|gb|AEUH01000070.1| GENE 12 10690 - 12444 2514 584 aa, chain - ## HITS:1 COG:ML2324 KEGG:ns NR:ns ## COG: ML2324 COG0119 # Protein_GI_number: 15828249 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Mycobacterium leprae # 12 583 41 605 607 622 56.0 1e-178 MKTTASVPTPTGSAMPAAKYRAFLDVNPVALPDRTWPDKRITRAPRWMSTDLRDGNQALI EPMDPARKRRMFDLLVGLGYKEIEIGFPAASQTDFDFVRSLVDDDAVPGDVTVSVLTQSR SELIERTVEAIVGFPRATVHLYNATAPVFREVVFRADRAATVDLAVTGTREVIAQAEKRL GDDTVFGFEYSPEIFVDTERDFALEICEAVMDVWQPGADREIILNLPSTVERATPNVFAD QIEYMSRNLSRREHIALSAHPHNDRGTGIASAELAVMAGADRIEGCLLGQGERTGNVDLV TLGLNLFSQGIDPGVDLSDVDAIRRTVEYCTQMDTPPRTPYVGDLVYTSFSGSHQDAIKK GFAARAVKVAAAGGDENAVPWELPYLPIDPHDVGRSYEAVVRVNSQSGKGGVAYLMSTAH ALELPRRLQVEFSRIVQRHTDTYGGEVDADTLWAVFADEYLPTSAAPGLQPWGRFELRSS QVGTIDDEHVELTTVLIDSGAPVTVNAKGTGPIDAFVSALHEFDLDVRILDYSEHAMSQG RDAKAAAYVEAAIDGQIVWGVGIDSSISRASYKAVISAVNRALR >gi|319978873|gb|AEUH01000070.1| GENE 13 12640 - 14301 2461 553 aa, chain + ## HITS:1 COG:BS_yvfH KEGG:ns NR:ns ## COG: BS_yvfH COG1620 # Protein_GI_number: 16080472 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Bacillus subtilis # 8 541 5 553 563 256 33.0 6e-68 
MSAPALFQAFTPSTTAVAGNVFATALVGVAPLILFFVLMGVFKVATHWCALISLATSVAI AVLAFRMPVGMALLAGTQGAAMGFMPIIYIIIAAVWLYNLTEKSGRSADLKAVFNTIGRG DQRAQALIVAFCFCGLLEGLAGFGAPVAITCAMLVTLGVPKIKAAVVTVVGNAINVGFGA MAIPTTTAGRLGGGDPTVVATDMGHLTWVFCLFIPFLLLFILDGGRGVRQLWPLGLVAGA AAGAGHFLTPSVSYELTAVVASLLGFAACYVFLLLWGPTTPEECATSADESDKPTGSRIG LALMPYVLVVVIIAITKLAKPVAEFFSSTDVKIPWPGVYGHLLAADGSPSTAAVYSLSIL SNPGTWIFVTGIIVAIVYGKNSSGGRFATSVPEMFAVLPRTIYSLRMAILTIASVMALAF VMNFSGQTTSIGAALATTGAAFAFLSPILGWIGTAVAGSATSAGALFANLQSTAAAGAGL DPHILLAANTIGGGIGKIVSPQNLAIAATSVNSEGSDAEILKKAAPYSIGLLLVLCTLVF IASQGWLGAYMPS >gi|319978873|gb|AEUH01000070.1| GENE 14 14409 - 14939 600 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189534|ref|ZP_06608254.1| ## NR: gi|293189534|ref|ZP_06608254.1| hypothetical protein HMPREF0970_00569 [Actinomyces odontolyticus F0309] # 28 174 40 187 189 131 47.0 2e-29 MRLPIRFLTALASSASVAALASGCAQGDPRFAPFTSKTDAEPTIAARTDSFNERMPNLGA TEGHYAAATTSSGRGLPSPDQQYWFTGVATVPAETIAVLQPGATGNADKLPGIHPELYQY VPKNCQFATIAADHANSVLDTAKTDFNSDFGSFSVTGLAASADCNLVVVTGEGVFG >gi|319978873|gb|AEUH01000070.1| GENE 15 15071 - 15400 380 109 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPDSDASAPIPAPRPGASLMGPVSLVAGLVPVCFYLLGPLLGRWALFDDAPLAFTVVLAA LPLAFVLGVAALSSDVRRGRRWAWAGAGGLVLSAALPLIVSAIRFLGAH >gi|319978873|gb|AEUH01000070.1| GENE 16 15427 - 15717 271 96 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPSPDRAPAQGAASASRPAVAALALGAVAFALSVLAVLWVGPVVLPWVALPVSWAAAVVG VAVVISMARGRRRGIWMPALGAAMGLAFPVLVSIAP >gi|319978873|gb|AEUH01000070.1| GENE 17 15823 - 16668 1252 281 aa, chain + ## HITS:1 COG:no KEGG:SGR_894 NR:ns ## KEGG: SGR_894 # Name: not_defined # Def: hypothetical protein # Organism: S.griseus # Pathway: not_defined # 1 281 1 285 286 241 49.0 2e-62 MEWIPCGSGHELSIQDGAVVARNSKGRILKSVPAAAKKTPEWEDLDAVLVFLHQHDAEAG DAVQRWFLRSLPVPRALLAAVWPDEAWRSWLTDLVVSTPDGCLAGFLRAADADGLGVVDL DGESVRIDAERVLIPHPAIIPDLDDLREFAAELSLSQRFDQPFREVHRPDPANAERTQLT DWSGGAFDQLRFATGRAASSGFKVSGGYAVCPVYEDGALITARYWIGADAPDYETETGDL HWIRSGDPVPIGQVGPVAYSEGVRMATRIYAGRKPDKTEGA Prediction of 
potential genes in microbial genomes Time: Thu May 12 17:28:24 2011 Seq name: gi|319978871|gb|AEUH01000071.1| Actinomyces sp. oral taxon 178 str. F0338 contig00071, whole genome shotgun sequence Length of sequence - 3985 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 3985 3682 ## Amir_2509 hypothetical protein Predicted protein(s) >gi|319978871|gb|AEUH01000071.1| GENE 1 1 - 3985 3682 1328 aa, chain + ## HITS:1 COG:no KEGG:Amir_2509 NR:ns ## KEGG: Amir_2509 # Name: not_defined # Def: hypothetical protein # Organism: A.mirum # Pathway: not_defined # 5 1177 62 1289 1632 207 27.0 2e-51 ALASLGFARAGDPAPIGHTRTRTTGFPEWPILTDPANARIAVSVVPGLRKAARTAAASPD RSVHWDLRGLASTLASSAPHFLPAFYEEAARIYVGLSDRVQALAYWKSAREAEEAHALPI DEDHRHQVFLEFSLAGIVSVAELSAECSRLLASRGAERALEDFFELTVERVRGGLPPYAD LGGDLRQLSEAAGADPIDTELRLLRQILSSPIMSKAPLAFWRAFSGALKALVEQDPGARR TLARTFPKSIQPDRWVAVLDSLGVTDYLLGGDADCRAWVEDYAEVIEEQRDNHFFSGRMV FSPALCRLVRRMPRLEGATVALRNDSQIEPEFIDALADAGARVFFRPRYTQFADEPDPDL DLGPWAKNPKRIDLTALARSEYVRLVEKSLLREGAARLDLFLSHAGTRAMLSAVAEPHLG PHPSPARIEAEARRLKPLFSPFSAPSFRPQFEALMGHLDEAALLADHLRDGLVTEYTWPA LEEAIAELGPDVRFHESFPAVGVSSNGRVVWVDGGRRVAEAHPVPPQGTRVFHEDYVLVG SQTMCATSDIADNVTIAWVDGPSAASSTKHYRIHSRQMETTTLPVPGGRLVGDRLIRPYD TTNPFPKRGPVLRDGDRYWCVRGGRLRGIDPGSGALADAPLPDALMALIAPYLAEGWELR PDGVMWCPTTDTTRDSHCSTSAGRHVGVLLERHGAFLHVGASGEAVETRKPQGVWARMER PGGGHWLLTGTDCTRLLREADRMPLMDTADEVGGTHPFHSVPRAAWHHFRVRDEEASRRL RGATTERVRALVEAVPLVEPSEPSWARTRAGALRDAAKVLTKRARRAASELLGSQDPVLV DSVVWAAEAVKRIVAEARAALEEGRRALSDPRTQRTFPHWEASCDALAWMPGLATDLAHS SITTRRSVIAGLAKCLAGEQDRAGSGQDSLRVLLADPTAFLFCAARPGASESSVGASAAA FGDLVDCGLYSPTGCLFTAAQRPRDESDRVIGPHGKPAVRLDTMWADSGRTHVDVFYSPS GTAPKRLNGWPTRVLSRAPGVEGGRIVEAFKALLSDGAPPWDPEAASRLAEGTGWERPAA AVFLAGLPLPNPRKARRLDPDLRELLGVTVKEAEAAMAFLASLGEDLLLRLAGAAAHDPV RFVREGADVDAVVEAWRREDDGSYRLPGGVLAAADRTGVPYGERALIALGSEPTSSSLDI 
WLWAAALAHRDDPLAGWLAEAYEAMKCACATTPVVWKVSDFRCEFDNRWSQFCATLTAAE SKAAATRHSSAREPWSIEPAEDGQLVTWDPRHVHDWRTERERALDLPWGVRHDLGWRLDI LSGVYGAV Prediction of potential genes in microbial genomes Time: Thu May 12 17:28:39 2011 Seq name: gi|319978869|gb|AEUH01000072.1| Actinomyces sp. oral taxon 178 str. F0338 contig00072, whole genome shotgun sequence Length of sequence - 3892 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 3892 4213 ## Amir_2509 hypothetical protein Predicted protein(s) >gi|319978869|gb|AEUH01000072.1| GENE 1 1 - 3892 4213 1297 aa, chain + ## HITS:1 COG:no KEGG:Amir_2509 NR:ns ## KEGG: Amir_2509 # Name: not_defined # Def: hypothetical protein # Organism: A.mirum # Pathway: not_defined # 5 1235 62 1371 1632 301 29.0 8e-80 TLASLGFARAGDPVPIGHTRTRATGFPEWPILTDPANARHALNLVGDLRRIEKTARAKPG NAKKSIDELARMLDESAPHFLPTFLEEAGRIFLRNDNRAYATQYFGKAREAERAHNLDID EERHRQVFLEFALAGAVSAKELSAEGRALLERYGAGAALDSFLALNVERVKGGMPPYGGL AADVRRLAQEAGADLLDTQVRLLRQILPSPSIERASAAFWKQFRRALVELARSDEEAKGV VARLAPPAVAPDDWVGLLEKLGITGDLVDGRLDCREWVGRYTRMLQGTWRAPYPRALCLL IRRMKGLAGKTVELSNALYCLEPEILDAIVEAGARPAFDTGTRFYSSLRLDRWAESPDRP DLDFLADSDMAPLVVRGIESTMRSHLDTLIAHSGTRRMLGVWARPRLGGEPTTAQVLAEI KRLSRLLTPSGARAFPDQLAALAAHLDGPRLLARTLRDGLVTELAWPELEAAAAESGDEA PTLHESFPAVGVASKGRIVWVDGDQRVAEAAFSPKKGKRAQWWDYLLVKGETAYLRRHNN GTSDLTWSNNPGAPIDLGYAYWERPSWSSATLPVPGGRLVGERVVRVGDSEHPFTAPCPL LSEDGHYWRVDGGRVNEVDPRTGAVGRESLPGPLADLIAPYLRDGWELRPHAVSWCPATP TTERSLMSTAQGRHAWVALVRDGEFLHLDADGHACRAGSWSVMGRIRRPGGGHWLVSNGV LSREDGSPLLRTAGANGLAHLLHRVPRIGWHQFQVRDAAASGRLRNATEEQAQALLAAVP AASSADPAEPANPPQDPAEVLTDAARAAARAFLGSGEDALVDAVVWAAAGVKRILQLGAV PDAAALGAPRSFPGAEASCAAVRWATGGWGRATDRDAIGNVVRRLAGENVAFTYVEDSLP VVLAEPLALLFSAARPLAPRESVEATAAAFGDVADQGLYSPSCCVFDVPGLRQPLQGGAP APLVDTPDGPAVMLGRSAFNVPSDTSKRFYSPSGGVPPSVHGTTPQVAMRGCGIDADTII GAFRTLLSDGAPPWDPEAARRLAEGTGWSPAAAALFMAGLPGVQPVSAGPLAKDLRALLG 
LKAKEAREGQEFLMGLGSAALRRLVAAAAHDPVRFVREGADVDAVVEAWRREDDGSYRLP GDLLDKAGRAFRYGGRAALLGLGGSPDERNLDAWLWAAALAHRDDPLAGWLADAHEAMKR ACAATPLTWSVYDRGAALRPLLGLPPAAKDTPRGRVDSAGAWTLAADGYSTSVTWDPAAC ADLQAEARLVRALPYDDGHYERIRARLDILSGVYGAV Prediction of potential genes in microbial genomes Time: Thu May 12 17:28:57 2011 Seq name: gi|319978866|gb|AEUH01000073.1| Actinomyces sp. oral taxon 178 str. F0338 contig00073, whole genome shotgun sequence Length of sequence - 1488 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 4 - 1119 1378 ## COG0714 MoxR-like ATPases 2 1 Op 2 . + CDS 1134 - 1488 324 ## SCO6688 hypothetical protein Predicted protein(s) >gi|319978866|gb|AEUH01000073.1| GENE 1 4 - 1119 1378 371 aa, chain + ## HITS:1 COG:ECs2927 KEGG:ns NR:ns ## COG: ECs2927 COG0714 # Protein_GI_number: 15832181 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Escherichia coli O157:H7 # 10 369 25 383 384 282 46.0 8e-76 MMGGPMDTPPEPGAQQVPPPESRYADELARLAEVGGPRPPGWLLTPSAVVSFVCGNARLG VSRKFVGDPALVERCVITLAGERALLLVGEPGTAKSMLSELLAAAICGTSGLTVQGSAGT TEDQLRYGWNYAQLLARGPSEDALVPSPVLAAMRSGRVARVEEITRCLPEVQDALVSVLS ERRLAVPELGEYAVHAAPGFCVIATANLRDRGVTEMSAALKRRFNFEAVAPIASAEEEVA LVMEKTERALAAVGAPASADRVVVEALVTAFRDLRGGVSQEGWSVERPSTVLSTAEAVAV SGSIALAGAFFPHAKDGVRLIPGQLLGAVRKDDSDDGARLLAYWDSVVKRRAAQGMRTWE QLWELRGALEQ >gi|319978866|gb|AEUH01000073.1| GENE 2 1134 - 1488 324 118 aa, chain + ## HITS:1 COG:no KEGG:SCO6688 NR:ns ## KEGG: SCO6688 # Name: SC6G3.04 # Def: hypothetical protein # Organism: S.coelicolor # Pathway: not_defined # 6 117 10 127 1171 110 56.0 2e-23 MSAGGEVAALAASRAPFLIGVRHHSPALAVAVPRLLDGFGPDVLAVELPAQARAWVDWLA HPETVAPVALAFSRGEGMSFYPFADFSPELAALRWARGAGAEVACVDLPVGAGPGDGG Prediction of potential genes in microbial genomes Time: Thu May 12 17:29:01 2011 Seq name: gi|319978863|gb|AEUH01000074.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00074, whole genome shotgun sequence Length of sequence - 2571 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 2296 2448 ## SCAB_83221 hypothetical protein 2 1 Op 2 . + CDS 2293 - 2569 252 ## Predicted protein(s) >gi|319978863|gb|AEUH01000074.1| GENE 1 2 - 2296 2448 764 aa, chain + ## HITS:1 COG:no KEGG:SCAB_83221 NR:ns ## KEGG: SCAB_83221 # Name: not_defined # Def: hypothetical protein # Organism: S.scabiei # Pathway: not_defined # 2 761 413 1235 1239 452 45.0 1e-125 VLGRGRAVADALEQVLVGTRTGRPAPGVPASPLREAVASELASAGLPTAGGKRLSLTPGR GGADLARSVLLRRLCAGGIAYGRPLAAAATRGAEALAEPWDVSWSAATDASIELASSRGL TPEQVARTALLTREAQDPANVIALLREAAACACPSAVERALAAAQAMASTVGFADAVALG TALADVSRCHIPASSLLPARCRERSEELGALLASAALREIRGIEGSDDVADASALAAFAS VGAPQGLSLDHALRRLAAAGSPLMQGAAAGLLIDQEGAASRIASWLDAPTARARSALRRR IAGLLAAAGPRFDTAPAALALVDRVGSMPDDTFTALLPALRGGFDVLDEEARERLLVDLA PSLGQGRALVLSPDDTVAVARHDAGARARLEALGLADLVFTPPQRWRLVLGLRRGDEGPR ELRLASALDELYGRPRADALDEQSRSAGIGPSQLGVRQWEQEIEALFGRGEVEEIFGEAA QRGRGDVLERLDPDSVRPSVELLTTALSLVGALPEARLAKLRPLVARLTRELAEELAVRM RPALSALASAAPTRRPNGRLDLPRTLRANLRHVAAVDGRPQVVPVHPVFHATRARQVDWR LVLLVDVSGSMSQSVVYSALTAAVLAQSGCLSVDFLAFSSEVIDFSGRVDDPLSLLLEVE IGGGTDIAGAMRVARSRLRVPSRTLVVVVSDFEEFGSVDALVGEVEAMRASGAVLLGCAA LNDSGEGVYNAGVAARVAAAGMRVAALSPLDLARWVGAVVREAP >gi|319978863|gb|AEUH01000074.1| GENE 2 2293 - 2569 252 92 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTRISPALLARIVNALAPRVRKRADAYLADGVEVGPTTQVAGASVRLAGAGVLTEDDQIV CDCLLAPRCAHRAVVALSLEVAGDGDGPTPGE Prediction of potential genes in microbial genomes Time: Thu May 12 17:29:29 2011 Seq name: gi|319978829|gb|AEUH01000075.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00075, whole genome shotgun sequence Length of sequence - 34810 bp Number of predicted genes - 36, with homology - 18 Number of transcription units - 10, operones - 5 average op.length - 6.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 10 - 540 444 ## 2 1 Op 2 . + CDS 563 - 1309 621 ## COG3863 Uncharacterized distant relative of cell wall-associated hydrolases 3 1 Op 3 . + CDS 1319 - 2623 1335 ## gi|154508772|ref|ZP_02044414.1| hypothetical protein ACTODO_01281 4 2 Tu 1 . + CDS 2738 - 2827 116 ## + Term 2859 - 2898 6.1 5 3 Tu 1 . + CDS 2982 - 4364 1509 ## 6 4 Op 1 17/0.000 + CDS 4633 - 5430 972 ## COG0247 Fe-S oxidoreductase 7 4 Op 2 13/0.000 + CDS 5433 - 7061 2264 ## COG1139 Uncharacterized conserved protein containing a ferredoxin-like domain 8 4 Op 3 . + CDS 7063 - 7725 784 ## COG1556 Uncharacterized conserved protein 9 5 Tu 1 . + CDS 7993 - 9444 1744 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 10 6 Tu 1 . - CDS 9628 - 12204 3039 ## COG0178 Excinuclease ATPase subunit - Term 13261 - 13321 -0.9 11 7 Op 1 9/0.000 - CDS 13385 - 15496 3087 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 12 7 Op 2 . - CDS 15531 - 17204 2598 ## COG0366 Glycosidases 13 7 Op 3 . - CDS 17171 - 17248 74 ## 14 7 Op 4 1/0.000 - CDS 17303 - 18493 1794 ## COG1159 GTPase 15 7 Op 5 13/0.000 - CDS 18493 - 19854 1691 ## COG1253 Hemolysins and related proteins containing CBS domains 16 7 Op 6 17/0.000 - CDS 19851 - 20300 702 ## COG0319 Predicted metal-dependent hydrolase 17 7 Op 7 . - CDS 20297 - 21328 1523 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase 18 8 Op 1 3/0.000 - CDS 21437 - 22237 818 ## COG1309 Transcriptional regulator 19 8 Op 2 45/0.000 - CDS 22234 - 22989 1007 ## COG0842 ABC-type multidrug transport system, permease component 20 8 Op 3 . - CDS 22986 - 23792 234 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 21 9 Op 1 . 
+ CDS 23890 - 24246 353 ## 22 9 Op 2 . + CDS 24239 - 24556 404 ## 23 9 Op 3 . + CDS 24553 - 26343 1641 ## Bcav_1045 hypothetical protein 24 9 Op 4 . + CDS 26390 - 26914 810 ## 25 9 Op 5 . + CDS 26931 - 27461 344 ## 26 9 Op 6 . + CDS 27478 - 28011 490 ## 27 9 Op 7 . + CDS 28028 - 28558 530 ## 28 9 Op 8 . + CDS 28575 - 29102 593 ## 29 9 Op 9 . + CDS 29168 - 29782 738 ## 30 9 Op 10 . + CDS 29789 - 30319 585 ## 31 9 Op 11 . + CDS 30336 - 30863 380 ## 32 9 Op 12 . + CDS 30879 - 31409 169 ## 33 9 Op 13 . + CDS 31426 - 31956 195 ## 34 9 Op 14 . + CDS 31973 - 32506 311 ## 35 9 Op 15 . + CDS 32523 - 33053 332 ## 36 10 Tu 1 . + CDS 33253 - 34810 1733 ## COG1643 HrpA-like helicases Predicted protein(s) >gi|319978829|gb|AEUH01000075.1| GENE 1 10 - 540 444 176 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVAGRSLALRGAARALGAGLATELFGLAVGARVRCLVLGGELLGMTAREGAIHVPDDLGG VWWPGLDRVTRSWAGTLPEGRGAPRPGDEPGASGPSQVRGVVGRWCQRVLDAGPSALASP ALERDRAWAVAAGAPFAARLLGGMEAATHQGSRRFDGTWEADAPALLAAWLAASQY >gi|319978829|gb|AEUH01000075.1| GENE 2 563 - 1309 621 248 aa, chain + ## HITS:1 COG:BS_yycO KEGG:ns NR:ns ## COG: BS_yycO COG3863 # Protein_GI_number: 16081080 # Func_class: S Function unknown # Function: Uncharacterized distant relative of cell wall-associated hydrolases # Organism: Bacillus subtilis # 19 244 21 240 245 77 31.0 3e-14 MTRQFRTLSISALSTALAFALSAPAVATPEDAVTQILQITPGVSRDELVHAASDWASENG TTPEEALRTTLDQLLEEQNSNASTHTSSSSAELETSSPNAATGSTSTLPESWYKGDVYYT PTGAPWAHVGIYGGTWWVVEASGPGYKSDWYRHTSITIRSGAKQVYYHVSQATQDAAADY AYNNLRGYSYNLNFAFNKSKNPWMLNCSQLVWYAYKYGAGVDIDSNGGPGVYPHDIVEGS LAYTYANL >gi|319978829|gb|AEUH01000075.1| GENE 3 1319 - 2623 1335 434 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508772|ref|ZP_02044414.1| ## NR: gi|154508772|ref|ZP_02044414.1| hypothetical protein ACTODO_01281 [Actinomyces odontolyticus ATCC 17982] # 45 431 43 439 442 118 24.0 8e-25 MLSQSPRALSAPSAGIAALALVCACALPGCSSSTAFPEVGQDRATTANSVLALRIGNSPG VGADAYSTGVGRLLLVDGAGTGNSVDVGAMIDAGLVWNASGLHYGTSGAEVAVSDSGTTS 
VPRGKDERHELSRYSIDDGAASLAFYIDGDQQDAVAVTADGGSAFAPIPGMHPFNSVCDG HLYSMTSTRFPGNGSLPTQLDASGLQADTRPDGQGPADMLIRVDPHDDPQATIVGIAPMD EGLDTPQNEAPCYDGKIYVPTFHKTYPGANPDNGQDPTAGEPVLQVWDTTNGTRDLVPMA GDDGSPIAFSPGHLDWLVGHLSGSVYTFVTSYGEVYSTDLGTGTTRPLFTVPLTDPDSQA SRFTVDAQNVFTLDVPADSEAPLTLRRDPLDGGGGEVLLTVDASPVRKGTFLTGGPLMVQ AFALRPAYMEDFAR >gi|319978829|gb|AEUH01000075.1| GENE 4 2738 - 2827 116 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSASQLPKRTGSAQHEEGIGDECFTGWAA >gi|319978829|gb|AEUH01000075.1| GENE 5 2982 - 4364 1509 460 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPQPRAAWSTSRKLFVLVMFLIMLAFNLYLLHSINSSSSSDWRQSDPWWQKSDIVETDSV PWEVAGAVVRPDTAVVPSGQWLNGLQQAWGLDFPAADGGAEQNVRVSSNGSVLVTLERAN LWQGTVKGWDVGGGEPRQLWSHTVQDLDSPGIGDERKSTWVGSTLFVGHYAVNADTGQIV SLTWMRQEAEAHSDKDLRVTADLMLVACDSSTGGCSGRRADGTVVWEASTGLAGTDMWGS AASTGHDWIWLGRTSTSNVFLNAVTGQVKTSSFEPAQGCWRAPAADGWLVACEGDSQIRA LAADGTPVEAFDAAHWPVSTRSKHECSDASVPVWGQEPTLAQATAYYRDSDASATRGVLT PADCGHIEYQAPDGATTALDLTGDDEQRAFTLGDSGGELQDQLSVSDDGRVLAIGNSVLI DLTTGQYMDFADSAPDRPALIAPALILTKTNTNLTATAPR >gi|319978829|gb|AEUH01000075.1| GENE 6 4633 - 5430 972 265 aa, chain + ## HITS:1 COG:DR1907 KEGG:ns NR:ns ## COG: DR1907 COG0247 # Protein_GI_number: 15806907 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Deinococcus radiodurans # 1 251 1 246 247 260 51.0 2e-69 MRIALFATCLADVMFPQAAQATVTLLERLGHEVVFPEGQVCCGQMHANTGYFEDAAKITR NHVKAFEPVLDGQWDAIVVPSGSCTGAARHEQRVVAEHVGDHELAKRSAQIAAHTYDLTE LITDVLGMEDVGAYFPHTVTYHPTCHSLRIARVGDRPYRLLRAVEALTLVDLPDAEVCCG FGGTFSIKNAQTSTSMLADKVANVMSTRAEVLVMGDYSCLMHVGGGLSRLQSGIRPMHLA EILASTKDQPFLGNISFAPESAQVI >gi|319978829|gb|AEUH01000075.1| GENE 7 5433 - 7061 2264 542 aa, chain + ## HITS:1 COG:DR1908 KEGG:ns NR:ns ## COG: DR1908 COG1139 # Protein_GI_number: 15807655 # Func_class: C Energy production and conversion # Function: Uncharacterized conserved protein containing a ferredoxin-like domain # Organism: Deinococcus radiodurans # 41 372 13 343 347 386 55.0 1e-107 
MTATFIGMPAVADSASSDVHTGGWRTTVDIPEDTLRWGTTFPEGAKKTLANTQMRRNLGH ATRTIRTKRGQRVEEMPDWEDLRNAAEAVKFEVESRLPELLEEFERNVTARGGIVHWARD KHEANRIIAGIIKSKGVDEVVKVKSMATQETNLNEYLKDQGISARETDLAEMIVQLADDM PSHIVVPAIHRNRSEVRGIFLDRMEDAPEDLSDDPTELTGVARAHLRKKFLHAKVAVSGT NMGIAETGTVSIFESEGNGRMCLTLPDTLITLMGIEKLVPRFQDVEIFSQLLPRSATGER MNPYTSMWTGVTPGDGPQEFHLVLMDNGRTKTLADPIGRQALACIRCGSCMNICPVYQHT SGHAYGSVYPGPIGAILTPQLTQGMAHDDPVHTLPFASSLCGACGDVCPVKIEIPTILIH LRARTVDEKHAVLPDVWDLAMGSATPVMSNAKLWTAAGMAVKATRLVGGKKGRIGALPFP ASLWTGARDLPTAPKETFRQWWKRTHPDSDASSTRWDAPARGIPLASEHGMTTRTNDDTT ED >gi|319978829|gb|AEUH01000075.1| GENE 8 7063 - 7725 784 220 aa, chain + ## HITS:1 COG:DR1909 KEGG:ns NR:ns ## COG: DR1909 COG1556 # Protein_GI_number: 15806908 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 5 218 9 210 212 108 39.0 1e-23 MAPMDAKTAILARARDAIGRSQQGRPVRPIPRDYIRSTADAPGSDAVVGEMIEKLEDYSA KVVLAPTDAKVADAIDSFLEGMGSVVVPTGLDAPFKKAAARSGRTVREDSREKAIPTLEL DAIDAVVTRSRVGISISGTIVLDGEPDQGRRAISLVPDTHVVVLERSAIVPTVPQAVDVL GENPLRPMTWIAGGSATSDIELVRVNGVHGPRNLRVVIAH >gi|319978829|gb|AEUH01000075.1| GENE 9 7993 - 9444 1744 483 aa, chain + ## HITS:1 COG:Cgl0110 KEGG:ns NR:ns ## COG: Cgl0110 COG0246 # Protein_GI_number: 19551360 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Corynebacterium glutamicum # 11 482 24 498 503 468 50.0 1e-132 MTTLSQSDLAPDYDRSRITTGIVHMGVGGFHRAHQAAYLDDLMRRGEALDWGICGVGLMP QDTRMRDALRGQDYLYSLTLKHPGGRAESRVIGSIHDYLFAPDDPGAVLGVMTAPTTRIV SLTVTEGGYNVDDATGLFRTESEGAARDAADPHHPTTSFGYIVEALRRRRDAGVPPFTVM SCDNLPGNGKVARTAVVSQARMSDPVFAQWIDDNVAFPNCMVDRITPQTTPADIESVRER LGVEDAWPVVCEPFTQWVLEDRFTDGRPPFEDVGVQMVSDVVPYELMKLRLLNASHQGLA HWGRLLGIEYAHDAAADPDIAAWVRAYLEREARPTLRPVPGIDLDGYVDTLFERFTNEAI ADTLFRLAQDASNRMPKFVLGTVRDNLASGGPIRLGAAMVAAWALGDEGVDEAGAPIAVD DPAGARLLELAAAQKAGDDRAFVSDTAVFGDLAGQERFVEEFASQLRALRAEGARARMRA LVG >gi|319978829|gb|AEUH01000075.1| GENE 10 9628 - 12204 3039 858 aa, chain 
- ## HITS:1 COG:DRA0188 KEGG:ns NR:ns ## COG: DRA0188 COG0178 # Protein_GI_number: 15807854 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Deinococcus radiodurans # 2 852 83 920 922 495 38.0 1e-139 MPHRTTPSSIQVRGARVHNLKDLDVDVPLNSFVAIAGVSGSGKSSLALGTLYAEGSRRYL ESLATYTRRRISQAARAPVDRVAHVPAALALRQRPGVPDVRSTFGTATELLNHLRLLFSR AGSYLCPSGHRVPPSRNVALEVPLVCPQCGESFHGLGAEEMAFNSTGACPVCEGTGVQRV VDRAALVPDESLTIDQGAVSPWGSLMWKLMKDVAREMGVRTNVPFRDLSEEERAIVYEGP AEKRHIVVSTKTGGAELDFTYFNAVATVENALAKAKDEKALARVGKYLITRTCPACDGTR LGERVRSTLIGGIDLGQASHLTLSGLREWAARVPGLVPDEVAPMARVIVDEMAEPMERLA DLGLSYLSLDRASSTLSNGERQRVQLARAVRNRTTGVLYVLDEPTIGLHPSNVDGVLDIV RELIEDGNSVVVVDHDTRVLAAADHLIEIGPGAGADGGRVVYQGPVPGASAAPASRIGPY LAGRRRVERRAGWDRERAGTDGHGEAGADECGGPGSDGHGEAGADGRGAIHLETSRIHTV HPLGVDIPVGALTAVTGVSGSGKTTMVLESLVPALRAALAGEALPAHVRDVDAAGVTRVN LIDATPIGVNVRSTVATYSGVLDELRRAFARTEKARAAGLKAGAFSYNTGSLRCPTCDGT GQITLDVQFLPDVDIECTDCRGSRYSAQAHGIRRDGLSLPDIMAMSVDDARGAVQGLKKA GALLGTLSGLGLGYLALGEATPALSGGEAQRLKLASEMGRRQEGALFVFDEPTIGLHPDD VQVLLDVLQRLIDRGATVVVIEHDLDVIANSDWVIDMGPGGGADGGRIVAQGAPRDVAAR PDSVTGRHLRAARPGILG >gi|319978829|gb|AEUH01000075.1| GENE 11 13385 - 15496 3087 703 aa, chain - ## HITS:1 COG:L37906_2 KEGG:ns NR:ns ## COG: L37906_2 COG1263 # Protein_GI_number: 15672409 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Lactococcus lactis # 314 686 3 394 406 271 41.0 4e-72 MSESPQQRASLLAPMPGDVIAQADIPDPAFAGGAMGVGFGVVPEENTVVAPVSGRITMVA DTGHAVGFETADGLQVLLHLGVDTVELKGAPFRLVVAKGDTVQAGGPVGTMDLAAVEAAG KSTVAITVITNSKKRVESLDVRTGRAQAGDPVADALLKAAAPAAPAQAEAAPASSGTSAA SSGTAPEAAGAAVEADDGYEGLTGFALQAVRIIDGIGGADNVASVIHCITRVRFYLKDDA KADDAAVADIDGVIDVAKAGGQYQVVIGATVGDVYQEIVKRLPQGAAGDDEAAADPVEKP TTPLGWVKYGFSSLIGVITGSMIPVVGVLAGSGILKGLLGLLVQVGWLVKDSDTNIILNA MADSMFFFLPIIIGFTAAKRLGADPIIVAIIGGVLAYPSIVQLAKHDAPGYHVLHTLLGV 
NFNAELFGIPISLPLDNAYAYSIFPIIVGAWLASKIEPLLKKWIPAVVHSIFAPLIEIFV ISTLLFTVFGPIIMFVSGGVAAGLNWVLAINYAFAGLVIGAFYQCLVIFGLHWAVIPVIA QQLAVHGQSSTINAIVSATMIAQGGGALAFWIKARSAKIRSLAGPATISAFCGVTEPAMY GLNLKYGRAFLTASVGGAAGGLVTGLLNINMYGFAGAFTGFASFVDPTGKDTSSAVNFWI ASFVALAVAFVCTYFFGFKEDDFGQGRTVEKVRLGNREAAKKE >gi|319978829|gb|AEUH01000075.1| GENE 12 15531 - 17204 2598 557 aa, chain - ## HITS:1 COG:SPy2096 KEGG:ns NR:ns ## COG: SPy2096 COG0366 # Protein_GI_number: 15675852 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Streptococcus pyogenes M1 GAS # 11 554 1 539 542 668 58.0 0 MANARALEGTMTGFRSSVVYQIYPKSFYDSNGDGVGDLRGVIEKIPYIASLGVDHVWFNP FFPSPGRDNGYDISDYCAIDPAMGTMEDFEELVEALGAHGIGVMLDMVLNHTSTEHEWFR KALAGDERYQRYYYLRDPKPDGSLPTNWVSKFGGPAWEPFGDTGKYYLHLFDRTQADLDW HNPDVREEAAKVVNFWRSKGVRGFRFDVINLIGKPEALLDAPEGTDDRAMYTDGPDVHTY LQWLSEQSFGQDADSITVGEMSSTSIQACIGYSNPANHELSMVFNFHHLKVDYEDGQKWT LKDFDLAALKRLLGEWALGMQEGGGWNALFWNNHDQPRALDRFGNVDEYRYESATMLASA IHLLRGTPFVYMGEEIGMSDPAYTSIDDYVDVEARNAYRALVEGGTDPARAFAVVHSKAR DNARTPMQWDASDNAGFTRGTPWLRPVNHAEVNVESAERGRILPYYRRLIEARHTHPVIA EGSYIPYNAPDEVFAYVREHDGQRLLVLNNFRAHDVGVDVLGGFAQGRVLIGNYDDAVVR SRITLRPYESLSIITGA >gi|319978829|gb|AEUH01000075.1| GENE 13 17171 - 17248 74 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTCLPKCKLSHTKQRWRMREHWRAR >gi|319978829|gb|AEUH01000075.1| GENE 14 17303 - 18493 1794 396 aa, chain - ## HITS:1 COG:MT2433 KEGG:ns NR:ns ## COG: MT2433 COG1159 # Protein_GI_number: 15841876 # Func_class: R General function prediction only # Function: GTPase # Organism: Mycobacterium tuberculosis CDC1551 # 98 396 3 300 300 330 59.0 3e-90 MRFPSDDELMAGPNALGDARIEVPPGAADPEAEARAQIGGDSGLGDAEDPDAGNADDGLS DLEDEADADLGAFEALASVASLRQDAAAAIGVPDYPEDFRAGFVSIVGRPNVGKSTLTNA LVGQKVAITSGRPETTRHNIRGIVHGDGYQLVLVDTPGYHRPRTLLGKRLNDMVREALAE VDAVLFCLPADQRIGPGDQFIARELRGVKRPVIAVATKCDAVGRERVMKHLLAIERLGEW SAIVPVSSVEGKGIDHLRDVLAQTVPASPPLYPEGDVTDESRDTLIAEFIREAALEGVRD ELPHSLAVQVEEIIERERREGDTRPPLVDVHVNVYVERDSQKAIIIGRRGARLKEIGTEA 
RRHIEELLGKRVYLDLHVRTAKDWQSDPKMLGRLGF >gi|319978829|gb|AEUH01000075.1| GENE 15 18493 - 19854 1691 453 aa, chain - ## HITS:1 COG:Rv2366c KEGG:ns NR:ns ## COG: Rv2366c COG1253 # Protein_GI_number: 15609503 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Mycobacterium tuberculosis H37Rv # 36 440 19 426 435 223 38.0 8e-58 MTPPLPVQAAGASSIGAVPDLPLAVLAAFALVVACLAQSVEQGVSRLSAAAVEDLAEQGR ARAGVLAGLVAHRRRTLLVLRSARTLWQVVFAVSVTLILVDKGIAWWAVALIAVASVSAL QFLFVSLVPSQWAARRPEGIALAGAPTAARLVRLSRLVDPLLGRVRASRPAPEPTEAQAR AELVSDLREIVDEVGEPESIEEEDREMVRSVLDLGQTLVREVMVPRTDMVTADADLGARK ALRLFVRSGFSRLPVVGDDVDDVRGVLYFKDVVARLEAHGPDLDLTAEQMMRPAEFTVEM KPADDLLRQMQAEHSHMAIVIDEYGGVAGLVTLEDIIEEVVGEVTDEHDRHVIEPEALGG GQWRVPARYPISELGDLLGVEVEDEDVDSVGGLLAKAIGRVPLPGARGELAGVRMRAEEA RGRRRQVGTIVCSTVGGSGAPGAPPQVGTRKED >gi|319978829|gb|AEUH01000075.1| GENE 16 19851 - 20300 702 149 aa, chain - ## HITS:1 COG:MT2436 KEGG:ns NR:ns ## COG: MT2436 COG0319 # Protein_GI_number: 15841879 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Mycobacterium tuberculosis CDC1551 # 3 149 4 152 177 142 53.0 3e-34 MTEVLNETDYDIDCAEFAALADYVLTQMHVASDAELNILFIGAGPMEELHVKWLDLPGPT DVMSFPMDELKPGTADNPTPPGMLGDICICPEVATRQAADSGHTAAEEMLLLATHGILHL LGYDHAEDAERAEMFALQRKLLLTFLASR >gi|319978829|gb|AEUH01000075.1| GENE 17 20297 - 21328 1523 343 aa, chain - ## HITS:1 COG:Cgl2236 KEGG:ns NR:ns ## COG: Cgl2236 COG1702 # Protein_GI_number: 19553486 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Corynebacterium glutamicum # 27 338 26 339 339 342 57.0 6e-94 MSDPTVESTLVQTRRAMIPARLEPIMVLGVADQVLRALERGFPHVRFLVTGREISMAGVQ ADIDLAANLVEELIGLAGRGQALDADAVEQAISLLGRVTRGEAPSEEILSARGRSIRPKT PGQKAYTDAIDRSTVVFGIGPAGTGKTYLAMAQAVRRLLSGEARRIVLTRPAVEAGENLG FLPGSLTDKIDPYLRPLYDALGDMLDPEALPKLMAAGTIEVAPLAYMRGRTLNDSFIILD 
EAQNTTLGQMKMFLTRLGFGSKVVVTGDASQVDLPGGQPSGLAVIQTILEGVDGIEFCHL TSADVVRHELVGHIIDAYQRWEDGAARRRSARSRARHPREDRS >gi|319978829|gb|AEUH01000075.1| GENE 18 21437 - 22237 818 266 aa, chain - ## HITS:1 COG:MT1725 KEGG:ns NR:ns ## COG: MT1725 COG1309 # Protein_GI_number: 15841142 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Mycobacterium tuberculosis CDC1551 # 75 262 20 202 207 141 43.0 1e-33 MTRRTPAPRGDDGERKGKPRQRARAGAPAGARSTAGNAGAAADSKGKAARAGAAAEGKAW ASSAGGAKGGKGRATRERILAAARQAFASGGYAAVSVRSIASQAGVDQSLVHHYFGTKKD LFLSAMDVPFDPMEHMGGALEKLSDGTLEDLGEAVVRGILAVFASPHGRDYVAVMRARMA PGGPVESVAGFLTEEVYSLAAWLLDDPPGTGALRTDLAASQVIGLIVARYVLRLQPLADL DDEQVVAHYAPVLQRYMTGPLPPGAG >gi|319978829|gb|AEUH01000075.1| GENE 19 22234 - 22989 1007 251 aa, chain - ## HITS:1 COG:Cgl1421 KEGG:ns NR:ns ## COG: Cgl1421 COG0842 # Protein_GI_number: 19552671 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Corynebacterium glutamicum # 1 251 1 247 247 174 42.0 1e-43 MNPATWWATTTRVLRQLVADRRTIGIVLVVPTALLVLLHFVFLDVPVAPGAEPAFPVLAV NMYGIFPMVAMFLVTSITMQRERVTGALERLWTTRLHRADLLFGYAAAFTVIGVLQSLVM WSVGAFFLGVSTEAGAPWVVDVTGAGAFVGIALGLVASAFARSEFQAVQFMPVFIIPQLF LCGFFVPTAYLPSPLERIAAVLPMTHTVAALEEVRLQSAPGQGYWKALAVMLAWGIGALV VASLTMPRRTR >gi|319978829|gb|AEUH01000075.1| GENE 20 22986 - 23792 234 268 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 28 252 2 239 245 94 28 7e-19 MINSSSDEARDTAPTVPERPGAEPPPAISVERLRVSRGRAPVLHDVSFSIRCGEIVGLLG PSGCGKTTLMRALVGVQRIDAGTAAVLGCPAGSKRLRARIGYSTQQASAYPDLTVRRNAV YFAALAGAGRGAADAAIRSVGLADRAGQVVSTLSGGQASRASIACALVGDPEVMVLDEPT VGLDPLTRESLWDVFRELAGQGRTLLVSSHVMDEAMRCDRLILMRRGRVLGVMTPAELLE ATGQDTPDRAFLALVREDADDGAEGSVR >gi|319978829|gb|AEUH01000075.1| GENE 21 23890 - 24246 353 118 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGAFEDFDQMIADAQRRVSENARAGARFRRASEGMVGRGRALGGEVAVEVDASGALVRC 
AFGRGAQRVSLSRLGAGVVEAYTAARADALAGVRGAAVEAFGADNPILAPLMGGGDGE >gi|319978829|gb|AEUH01000075.1| GENE 22 24239 - 24556 404 105 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSEELRVNTDMLRSYAARVDATADQVSQAQWAVGSLNTIGALGVLCAPLLGPAVGFVEAG ARSCIDATGQGLGRMYELLGRTAQVYDAVEAQIAEDLRAIMERLP >gi|319978829|gb|AEUH01000075.1| GENE 23 24553 - 26343 1641 596 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1045 NR:ns ## KEGG: Bcav_1045 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 36 304 14 273 412 135 36.0 7e-30 MSGGEALTTADVPGGAVGGSGNPGSTLVAERQDARSGFEGLGIIEDGHALVEAIQGGSWM GTALGVAFTGLDVASAVSDPIGTLISMGLGWALEWVWPFNELFNALAGDAAQVEANAQTW ENVAMALGAAGAQLEEDTATCLSDMAGDGVEAIRTSAGGVAEHVQAASQWAQAMADGLRM ASGIVQVVHDVVRQAISDLIGTICSVAIEEACTLGLATALVVEQITTRVAALSTHVFEAI GHLKTVFSSFHGLLSELGRLWNTLTGAISSMFHRGAAALTPHGPVAGATAAIGIGGAPHG VHGPHASAQPTPPPHAKDPVKDEPLVFPERKPWKDTKREKPDESHNGAKWGEGRDQAQLN TLRDARVETKQHIKDLEAELRAHGVDPKDATINTIEATIARLKKEHRPVFERETIIRLAD DLSHARGRYSALGEEVGIVGAESKLESQGMELLPNTGGAGAGNGSGDIYATALDKEGNHQ AFHVVEAKGYSSKLGTRLVDGTPFKQGSPTYVRDIMLNDTQLHDALARNAALREAILKRE IPVIADVYRTRHPYMSCVTLQQTKAVPLDDDFIKKLEKILKGHPAYAPFPPSKPTP >gi|319978829|gb|AEUH01000075.1| GENE 24 26390 - 26914 810 174 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTTLNAYPTDRFIEIVKAWVNHPWPITSDEGQAIYESLGLRGYAPRPNFFWSDFSTDDE PDSYYVVTQDQISTIDLEVAKLTLPENEVGNDSTLCDTYTQYCAAIDSAFDGGATKHVIH DKAIEWTFYNDVRIRIENIDFALSFIIRSPRMTQLRRKEEEMGLTSYDEILEDD >gi|319978829|gb|AEUH01000075.1| GENE 25 26931 - 27461 344 176 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTQRLAGFEAYPTDRFIEIVEAWVNHPWPIASDEGQAIYKSLGLRGYAPRPNFFWSDFST DKEPDSYYVVTRDQVSVIDMEMARLLPDGEASDDSPLYEIYTRYCNAIDTVFNGNATKHP VHDKATEWTFHNDVRIRIANIDIAISLVIHSPRMTQLRREEEEMGLTSYDDILEDD >gi|319978829|gb|AEUH01000075.1| GENE 26 27478 - 28011 490 177 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTQNPARFEAYPTDRFVEIVETWVNHPWPITTEEAKEIYESIGYRPYIPDPSLFASDLSK DGEPDSYYTQSDGCINTVNITVTRQGEKGTERHNADKIEKIYKQYCNAIDTLVSKRLILR HQQDSATEWFIDNNVNIRVGNPGHRVTFSVSSPAMTQLRREEQEMGLTSYDEILEDD 
>gi|319978829|gb|AEUH01000075.1| GENE 27 28028 - 28558 530 176 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTQRPTRFEAYPTDRFIEIVKTWIDRPWPITSKDGRIIFEDLGYSVDSEDPGMFASPFAE QEPDSYFIVRNGLIDDIDVAVSTICTRESMKENAAAMERTYDTYCKAVSGAYPEALSRRD RTHNAVEWFFKNDAQIRLSNIGITISFYIHSPRMTQLRREEQEMGLTSYGEILEDD >gi|319978829|gb|AEUH01000075.1| GENE 28 28575 - 29102 593 175 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTQRPTRFEAYPTDRFIEIIKAWVNHFWPMTSAEAQELYESFGYRAYAPKPELFISDFSQ IEPDSYVTTHNDRVGDARISMSHPCDTVTPQNTDTISRTFSDYCTAIEDKLKDMISHRKS ERNAVRWTLRNGVDIRLAKLSRNISVFIGSPRMTQLRREEGEMGLTRYDEILEDD >gi|319978829|gb|AEUH01000075.1| GENE 29 29168 - 29782 738 204 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDTTNVTRTRIRRTLRQMANNRSPITEQRRIEHIADALAHEQGEYTRLGEEVGIVGAESS LEREGMVILPGIDGPNEGNHSGDIYAVAYDEDSRPRSLHVVAAKGYSHRLRTRPVDGASA TQGSPEYACHLMLTDRCLHAALAKDPVLRRGILDGSVEVITDVYRTPYPYMSSVIHLSAI PVPLDRAYASTLQGLVRQHPDYTE >gi|319978829|gb|AEUH01000075.1| GENE 30 29789 - 30319 585 176 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNSTTFQSYPVDHFISIVQAWVHQAWPITTAHGRRVYESLGMRPHAADPGLFMSHFAVGG EPDAYFTEHEGLINTVRMQIGHPCPPGTEGQNGPVIAQYFTAYYGSIEQAFGPSTLMRRR KDDGAIDWILDNNVGISLGPLSHSIILFIDSPTMTQLLREEEENGLTNYNDILEDD >gi|319978829|gb|AEUH01000075.1| GENE 31 30336 - 30863 380 175 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPQHPTAFEAYPVDRFSEIIEAWITHPWPMTPIQAKELYESLGYRVHTPKPRLFFSEFAQ EEPDSFFTAQHNRVGNVRISVSRPCAARTQQDENHISHVFTEYCVIIDSSLRDSISQRKT EESSIRWTLDKDIEIRLAKLPRNISLRIASPRMTQLLREEREMGLTCYDEVLEDD >gi|319978829|gb|AEUH01000075.1| GENE 32 30879 - 31409 169 176 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPQHPVSFEAYPVDRFIDIVKAWIEHPWPITAAEARQVYESLGYLVDPEDPEMFASPFAD GEADSYYVTVKGLVSTVDIAVARPCNVGEEKESSAAIASAYDAFRQAINSHYKNMISATR SEDGSTEWTFTSGVELDVSVWNRSVSARIHSPYMTQLRREEKEMGLTSYDDILEDD >gi|319978829|gb|AEUH01000075.1| GENE 33 31426 - 31956 195 176 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSQHPVSFEAYPVDRFIDIVKAWIEHSRPITAAEARNIYESLGYSVDPGDPEMFASPFAG GEADSYYVTVKGLVSTVDIAVARPCNVGEEKESSAAIADAYATFCQAINSHYKNVISATR 
SEDGSTEWTFTSDVELEVSVWNRSVSACIHSPYMTQLRREEQEMGLASYDDILEDD >gi|319978829|gb|AEUH01000075.1| GENE 34 31973 - 32506 311 177 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTQRPARFEAYPAERFIEIVKAWVDHHWPISAEEGRRLFEDLGYAVDSEEPEMFLSEFAS NGEADSYFTARNGTVGSVDISVTKYYNADDATVDAKAVAERYAQYCEAVDRAFRGTIVRR REKPRATEWIMRHGVLLWIGNPGSIITLYIDSPRMTRLLLEEEAMGLTDYDDILEDD >gi|319978829|gb|AEUH01000075.1| GENE 35 32523 - 33053 332 176 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPQQPVTFKAYPVERFIQIVKAWATHRWPMTPDEGRQLYESLGYKTDDTDRDMFSSPFAK GEPDSFFTHIGNAIADVVIAVGERYTVEEEDSCIEAVTRAYEEYCGAIDASFIGAVDSRQ KASDAMEWFLNNEVQIRIDNVGIMVGLTLHSPHMTRLRREEEAMGLTDYDDFFGDD >gi|319978829|gb|AEUH01000075.1| GENE 36 33253 - 34810 1733 519 aa, chain + ## HITS:1 COG:AGc86 KEGG:ns NR:ns ## COG: AGc86 COG1643 # Protein_GI_number: 15887410 # Func_class: L Replication, recombination and repair # Function: HrpA-like helicases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 475 41 496 866 254 41.0 3e-67 MRTLEDLVSDPPDLPVAASLPAITRAARPGACAVITAPPGTGKTTLVPAALAAALTQPYR ERPGRILVTQPRRVAVRAAARRIARLLGERVGGQVGYTVRGDSVASRSTRVEMVTPGVLL RRLQRDPELPVVAGVLLDEFHERDLDTDLALAFLLDARAALREDLFIALTSATLEASRTA GLLEEATGTAPSLIDIPGAIHPLDVRWAPPPRGAEPLGAVGGRVGVRREFLAHVARTVES AYAGSDGDLLVFLPGVAEIERVRSLISAPGADVLPLHGRLSAAQQDAALSPGSRRRVVLS TSIAESSLTVPGVRVVVDASLAREPRLDAARGISQLATVPASRARLEQRAGRAARLGPGT AVRVMDPVDFSRRPAQSAPGIATDDLTGARLQAAVWGAPDATGLALLDAPPAGAWNAAGA RLRALGAIDEAGAATPFGRALASLPLDPPLGRALLAASPVVGAARAARFTALLSEDVRAD GADLARLERSLGSPPAHLRATASRVAEQEARLVALASAV Prediction of potential genes in microbial genomes Time: Thu May 12 17:32:15 2011 Seq name: gi|319978825|gb|AEUH01000076.1| Actinomyces sp. oral taxon 178 str. F0338 contig00076, whole genome shotgun sequence Length of sequence - 2942 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1004 1164 ## COG1643 HrpA-like helicases + Term 1057 - 1083 -1.0 2 2 Tu 1 . 
- CDS 1026 - 1283 335 ## gi|154509442|ref|ZP_02045084.1| hypothetical protein ACTODO_01973 3 3 Op 1 . + CDS 1295 - 1411 120 ## 4 3 Op 2 . + CDS 1408 - 2808 480 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 Predicted protein(s) >gi|319978825|gb|AEUH01000076.1| GENE 1 3 - 1004 1164 333 aa, chain + ## HITS:1 COG:Cgl0142 KEGG:ns NR:ns ## COG: Cgl0142 COG1643 # Protein_GI_number: 19551392 # Func_class: L Replication, recombination and repair # Function: HrpA-like helicases # Organism: Corynebacterium glutamicum # 19 331 417 716 717 218 43.0 1e-56 PAAATGAPRTREDALALTVALARPEWIARKRPGSSAYVLAGGVGALLPAGSPLEGQEWLA VAGIDRASGDKQARILAAVPIGPDDALAAGAPLLATSTEAALESGRIRAHRVRRLGAIEL STQPAGALPADEALAMVRAFAQKNGLGAFGWGQRATALRARLAALRAAIGEPWPDVSDDG LLAASDAWLSPHAQRLASGAPLSGVDMAEALRALLPWPEASRLDELAPERITAPGGAQRA IDWSTGSPVLTLRVQQAFGWTDTPVLADGRLPLVLHLTSPAGRPVAVTSDLASFWAGPYA QVRAQMRGRYPKHPWPEDPLSARPTDRAKPRRR >gi|319978825|gb|AEUH01000076.1| GENE 2 1026 - 1283 335 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154509442|ref|ZP_02045084.1| ## NR: gi|154509442|ref|ZP_02045084.1| hypothetical protein ACTODO_01973 [Actinomyces odontolyticus ATCC 17982] # 1 81 1 81 84 106 70.0 4e-22 MRALDIGDFRENMGDVVARVSSGEVVTLMEGGVPVAELGPVRMSTLSQLRDLGLERPSIA DVGGYRMPSKNLPDGRSVVEKSLGD >gi|319978825|gb|AEUH01000076.1| GENE 3 1295 - 1411 120 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHIPQPRGQWCSNRWNTRATSGVETRETRKQTTAMEAA >gi|319978825|gb|AEUH01000076.1| GENE 4 1408 - 2808 480 466 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 10 458 4 444 458 189 29 2e-48 MTTAPATENFDLLVVGGGKAGKSLAMDRAKRGWKVAMVERRFVGGTCINVACIPTKTLVG SARRLAEARTDADFGVVGTEGARIDLASLRAHKEGVVGAMVAAHEKMFAAPGIDFIRGSA RFVGERTVNITTDDGTIRTISAPRVLINLGTRPARPAIPGLWESGAWTSEDILRLESLPQ SLAIIGAGYIGVEFASMMAAFGVAVTLVSSSPRILPREDADAAAELEAALEAQGVTIVRG ARAESASRDGSTTTLALSDGTTVVAEAVLAAVGRTPNTDGIDLELAGVDVDERGFIAVDS 
HLRTTAEGTWAAGDAAGTPMFTHASWSDFRIIRNQLDGVGLDDPTTSTAGRNIPYAVFST PELARIGLNEAEAAAQGLDVRVAKIPTAAVPRAKTLRSPAGFLKAVVDARTHRILGATLI GEHASETITAVQVAMAGGLTYEQLRFLPIAHPTMGEGLQILFDSLD Prediction of potential genes in microbial genomes Time: Thu May 12 17:32:26 2011 Seq name: gi|319978820|gb|AEUH01000077.1| Actinomyces sp. oral taxon 178 str. F0338 contig00077, whole genome shotgun sequence Length of sequence - 6386 bp Number of predicted genes - 6, with homology - 3 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 63 - 2519 3601 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 2592 - 2651 3.5 2 2 Tu 1 . + CDS 2374 - 2646 218 ## 3 3 Tu 1 . - CDS 2682 - 2795 66 ## 4 4 Tu 1 . + CDS 2748 - 2852 164 ## - Term 2819 - 2859 2.0 5 5 Op 1 4/0.000 - CDS 3047 - 4462 1785 ## COG0019 Diaminopimelate decarboxylase 6 5 Op 2 . - CDS 4465 - 6144 2393 ## COG0018 Arginyl-tRNA synthetase Predicted protein(s) >gi|319978820|gb|AEUH01000077.1| GENE 1 63 - 2519 3601 818 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 656 814 524 683 744 151 44.0 4e-36 MSRTPVRVMLSAAVTLATALSVVAVSSPTAHAEGTPGITSAARERDEVMNYAVNLPVGAS ESDFATAVSKAGDVGGVVLAQYPVFNSFFVQSVKAAFAPDLGAALVAAGISYDSIGPTRY ATVSGAEVKVDQNQGTPGENQAVAEAVANHDDSLSPNSQLNDFAPEGDADFTPDEGDANA WGLYAVGSVQAQQVDVPRAKVTVGVLDTGIDADHPDLKDQVDRTRSVGCASNGIADQDYS AWKDDHYHGTHVAGTIAAAHNGIGVDGVAPEATLVAIKTSNANGSFYPEYVTCAFTWAAD HGVDVTNSSYYMDPWAFWLPNDPTQAAGLEAASRAVHYAHQKGVVNVAAEGNSNDDHDNP TVDTSSPNDVEGAAFSRDVTGGVDAPAMLNDDVISVSALVLPDGQDASTGVLDRAYFSNY GVKSVDVAAPGVRVWSTVPTRIRESGYAYLSGTSMASPHAAGVVALLKEIHPGYTADQLV ALLKQQAGYSFDRLTVPADGKEYRGAGLVNALAAVTRDQEKPVVSAVEYSTDNETWQPLE GATLSGTVYVRATVTGSVTSASLDVAGLATVSKTVDEGADSVVVESGLIDLANLHLSGSD ATPVNATVTAKGVNNDAAADDDVTQTVGFFATAVKTGRWINSGKGWSYQYSDGTHPSSEV 
VEIDGVAYRFDADGYLAYGWDKVDGNWYYYGANGRASGWTRVNSLWYYLDPSTGQMQIGW TKVGDSLYYLEATGVMTTGWKSIDGNWYLFDASGAMTTGWQASNGSWYYLNEDGTMATGW KGVDGYWYYLKPSGQMVTGTQWVDGRWQNFDSSGRWIG >gi|319978820|gb|AEUH01000077.1| GENE 2 2374 - 2646 218 90 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTSSRSRAAEVIPGVPSACAVGLLTATTDNAVASVTAADSMTLTGVRDIVSSIHSTDGTV PRAAEKPLHSGSRPFIMSGITGKSPAISTF >gi|319978820|gb|AEUH01000077.1| GENE 3 2682 - 2795 66 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDPDGNTGVICLPLHCHRFPDSASNAVVAREWAGCG >gi|319978820|gb|AEUH01000077.1| GENE 4 2748 - 2852 164 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MERETYHSRIPVGIRQSTDHLYHIRHKKQRATAF >gi|319978820|gb|AEUH01000077.1| GENE 5 3047 - 4462 1785 471 aa, chain - ## HITS:1 COG:MT1332 KEGG:ns NR:ns ## COG: MT1332 COG0019 # Protein_GI_number: 15840743 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Mycobacterium tuberculosis CDC1551 # 24 466 9 445 447 352 45.0 7e-97 MPGSNGADALPVGSLVCPGPAERPDLWPWSAERRPDGALEVGGVEMADLAGRLGTPLFVV DLADLAARARVWVSAMAEEFWDGYGMAGGDAFYASKAFLSAEVARLVAAEGMGIDTASLG ELTLALRAGVDPGRIGLHGNNKGDAEIELALQAGIHRIFIDSAHEVGAVERIAARLGVEA PVMVRLKSGVHAGGNEYIATAHEDQKFGVSVADGQALDVVRRIREAPHLRFLGLHSHIGS QIFGTAAFEEAARVVVDFAARIRDELGVETGAIDLGGGVGIAYTGQDPVPDSPAAVARAL AGVVRERCESLRLPVPHVSTEPGRSVVGPAALTLYTVGVVKDVSLPGGGVRRYVAVDGGM SDNIRPALYGASYTAALAGREGAEGLARCRVVGKHCESGDIVVRDVDLPADVGAGDVIAV PATGAYGYSMASNYNMLTKPGVVGVGAGEPHWLVRPQGLDHLLALDAGLEG >gi|319978820|gb|AEUH01000077.1| GENE 6 4465 - 6144 2393 559 aa, chain - ## HITS:1 COG:MT1331 KEGG:ns NR:ns ## COG: MT1331 COG0018 # Protein_GI_number: 15840742 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Mycobacterium tuberculosis CDC1551 # 1 559 1 550 550 582 55.0 1e-166 MTPEELASLIRAVLLDAAADGRVALGPSDIPDPVRVERPRNRDNGDWSTNAAMQLAKRAG TAPRDLAALIADALADAEGVESVEVAGPGFINIRLAASSAGQLAATIVEAGDSYGTNASL AGQDINLEYVSANPTGPVHLGGARWAAVGDSLARLMTASGARVTREYYFNDHGSQIDRFA 
RSLYARALGREAPEDGYGGQYIADIADRVRADARAAGLPDPVGLPEEEAVEAFRARGVEL MFAEIKARLSAFRSEFDVFFHEDSLHDSGAVSAAIETLRGRGEVFDRDGAVWLRTTSYGD DKDRVLIKSDGQAAYFAADVAYYLDKRRRGADAAVYLLGADHHGYIGRMMAMCAAFGDEP GVNMQILIGQMVNLVKDGAPVRMSKRAGTIVTLDDLVDAVGVDAARYALVRVSMDSNLDI DLDLLTQHTNDNPVYYVQYAHARTRNVARNAADHGVGRDAGFDPAALDTAADAELLGALA QFPAQIAQAAQLREPHRVARYLEALAGTYHAWYGQCRVTPRGDDPVEAGHVARLWLNDAV SQVLRTGLGLLGVAAPERM Prediction of potential genes in microbial genomes Time: Thu May 12 17:32:40 2011 Seq name: gi|319978813|gb|AEUH01000078.1| Actinomyces sp. oral taxon 178 str. F0338 contig00078, whole genome shotgun sequence Length of sequence - 4098 bp Number of predicted genes - 6, with homology - 3 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 103 - 1359 1808 ## COG1914 Mn2+ and Fe2+ transporters of the NRAMP family - Term 1744 - 1784 2.3 2 2 Tu 1 . - CDS 2029 - 2430 583 ## COG0698 Ribose 5-phosphate isomerase RpiB - Prom 2598 - 2657 75.3 3 3 Tu 1 . + CDS 2389 - 2508 60 ## + TRNA 2583 - 2655 79.5 # Arg CCG 0 0 4 4 Op 1 . - CDS 2659 - 2874 423 ## COG1476 Predicted transcriptional regulators 5 4 Op 2 . - CDS 2874 - 3251 481 ## 6 5 Tu 1 . 
- CDS 3355 - 4098 894 ## Predicted protein(s) >gi|319978813|gb|AEUH01000078.1| GENE 1 103 - 1359 1808 418 aa, chain + ## HITS:1 COG:Rv0924c KEGG:ns NR:ns ## COG: Rv0924c COG1914 # Protein_GI_number: 15608064 # Func_class: P Inorganic ion transport and metabolism # Function: Mn2+ and Fe2+ transporters of the NRAMP family # Organism: Mycobacterium tuberculosis H37Rv # 16 416 35 428 428 362 56.0 1e-100 MAAPQDPRHKHAIIPLLGPAFVAAVAYVDPGNVAANITAGARYGYLLVWVLVASNLAAMV VQYLSAKLGLVTGESLPSLLGGRLPTAGRIAFWAQAEIVAAATDLAEVIGGALALHLLFG VPLVWGGVIIGAVSMALLLIQGGGRQRSFENAIIALLVVITLGFLAGLVASPPDWGSVAW GMVPRFQGADSVLIAASMLGATIMPHAVYLHSSLVNDHEGGTIPTADEHHSEGHLKRLLR ATRIDIYWALGVAGVVNTGLLLLAASALQGQQGTDTIEGAHAAIASALGPTVGTVFAVGL LASGLASTSVGAYAGSEIMNGLLHIRVPLWARRLVSLVPALAILAAGAEPTWALVVSQVA LSFGIPFAIIPLVSLTRDRRVMGEHANSPVLAAAGALIAAAIVVLNLLLIYLTATGQG >gi|319978813|gb|AEUH01000078.1| GENE 2 2029 - 2430 583 133 aa, chain - ## HITS:1 COG:ML1484 KEGG:ns NR:ns ## COG: ML1484 COG0698 # Protein_GI_number: 15827780 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Mycobacterium leprae # 1 133 22 154 162 158 58.0 2e-39 MAHLRGAGHDVVDHGALSYEPLDDYPPFCIEAAEAVVAEPGSLGVVIGGSGNGEQMAANR VAGVRAALVWNDATARLAREHNDANVIAVGARQHTAEEALALVDAFLETPFSGDARHQRR IDLMSEYEASHRA >gi|319978813|gb|AEUH01000078.1| GENE 3 2389 - 2508 60 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVDDVVARPAQVGHDLFLQFVSGVVGADVDSHMTLPRVM >gi|319978813|gb|AEUH01000078.1| GENE 4 2659 - 2874 423 71 aa, chain - ## HITS:1 COG:PA4077 KEGG:ns NR:ns ## COG: PA4077 COG1476 # Protein_GI_number: 15599272 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Pseudomonas aeruginosa # 1 67 1 66 68 80 64.0 7e-16 MRNNVKELRSEQRWTQAEFGASLGVSRQTVIAIENGKYDPSLGLAFKIASVFGKCIEDIF YPDPDGQGGGR >gi|319978813|gb|AEUH01000078.1| GENE 5 2874 - 3251 481 125 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEPAQRRYLIRFGVGMAVYVALLAAALALAGVLPEPAKPWSILLVVPAIGIIVWAVCAYW 
AEADEFPRKLLIESAGLAFTVSAPLLATAGILDLVGAVRVPFIAAFVVLMSAWGLCMAVL QRRYR >gi|319978813|gb|AEUH01000078.1| GENE 6 3355 - 4098 894 247 aa, chain - ## HITS:0 COG:no KEGG:no NR:no VAGATAGAQWASPAPAVPTTAPAPTVPLGRLALAALARGGAILLAGAALAASGSLIWISP ATVLIDVLTLAMLALLMRREGRSLADLYRPFLLRDIGWGLLAFIGTVVVWFVASFIGNLI AYHGAPPMGAMSARLPMWAGLLCFAVMPLTIALAEEGLYRGYLQPRIGVRFGAGMALVVP ALAFSLQHLGFALSSPEAAVAKLVTTLIAGLFFGALMMAWKRTMPTVIAHWMLDLLFLGL PILMMAL Prediction of potential genes in microbial genomes Time: Thu May 12 17:33:07 2011 Seq name: gi|319978805|gb|AEUH01000079.1| Actinomyces sp. oral taxon 178 str. F0338 contig00079, whole genome shotgun sequence Length of sequence - 7062 bp Number of predicted genes - 9, with homology - 6 Number of transcription units - 7, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 66 66 ## + Term 163 - 213 -0.7 + Prom 202 - 261 2.1 2 2 Tu 1 . + CDS 378 - 1796 2007 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases 3 3 Tu 1 . + CDS 2035 - 2160 65 ## - Term 2229 - 2283 7.2 4 4 Op 1 . - CDS 2288 - 3079 945 ## COG0584 Glycerophosphoryl diester phosphodiesterase 5 4 Op 2 . - CDS 3076 - 3828 683 ## COG0739 Membrane proteins related to metalloendopeptidases 6 5 Tu 1 . + CDS 3710 - 4015 113 ## 7 6 Op 1 7/0.000 - CDS 4163 - 5194 1433 ## COG1253 Hemolysins and related proteins containing CBS domains 8 6 Op 2 . - CDS 5191 - 6495 1783 ## COG1253 Hemolysins and related proteins containing CBS domains 9 7 Tu 1 . 
+ CDS 6720 - 7062 97 ## gi|293189468|ref|ZP_06608188.1| oxoglutarate dehydrogenase (succinyl-transferring), E1 component Predicted protein(s) >gi|319978805|gb|AEUH01000079.1| GENE 1 1 - 66 66 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no CAAAVASGAALSAVGAAARAP >gi|319978805|gb|AEUH01000079.1| GENE 2 378 - 1796 2007 472 aa, chain + ## HITS:1 COG:NMA0753 KEGG:ns NR:ns ## COG: NMA0753 COG1055 # Protein_GI_number: 15793728 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # Organism: Neisseria meningitidis Z2491 # 6 430 30 467 473 252 41.0 1e-66 MTYQWWSLLPFVAMLGCIALLPIIPATAHWWERASSQLIIALALGAPTTVWMFLGAGSSP LIATGVEYAQFIVLLFALFTVSGGIHLDGDIKATPRNNTAFLAVGGVLASFVGTTGAAML LIRPLLATNKERVHRVHTVLFTIFVVANCGGLLTPLGDPPLFLGYLHGVPFTWTLGLWKE WFFVLALLLLTYYSIDRIRYASEDTSAIESDDRDIKPLRLRGGLNLLLFAVIIAAVAFIP SWDGEALAEGHVGAWTELVPWREIVMIGAAAASYFLSDREVRFTSNAFTWGPISEVAILF IGIFATMAPALRFLEQIAPSLPLTRMTFFLFTGGLSSVLDNAPTYMTFFEMAKALGGDGT LIAGVPEYYLVPISLGSVFCGAITYIGNGPNFMVKSVAEADGVAMPSFGGYITHAFTLLV PVLAAMALVFVGTSWVERGVGIALAAGLAFNSLWRIRRAGKVEGAADFSGER >gi|319978805|gb|AEUH01000079.1| GENE 3 2035 - 2160 65 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTRYLPQWPPAPHHGPTNAYPNDPLPTPMTARTRPTGAAPR >gi|319978805|gb|AEUH01000079.1| GENE 4 2288 - 3079 945 263 aa, chain - ## HITS:1 COG:Rv0317c KEGG:ns NR:ns ## COG: Rv0317c COG0584 # Protein_GI_number: 15607458 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Mycobacterium tuberculosis H37Rv # 9 257 8 254 256 135 34.0 6e-32 MIPVVSRRGPIVLAHRGGGAEAPENTMAAFEHARSLGVRHIETDAHLTADGRVVLSHDDA VDRCFDGTGLIAQLDWREISRLRHREAPGEKMPLLAEVLEAFPDMYFNIDAKVPGVVEPL IEVIEEHRAASRTLVASFKESRLRAVRTRGSIPTSLGTEAVVRLVGAAKTATDPSRWCVP GPSQGVVAAQVPAGLGPLRVVDRRFVAAAHTLGLAVHVWTIDTAEEVLELLDCGIDGIVT ERPTMVRDLLDSRGLWEEPQAPA >gi|319978805|gb|AEUH01000079.1| GENE 5 3076 - 3828 683 250 aa, chain - ## HITS:1 COG:TP0706 KEGG:ns NR:ns ## COG: TP0706 COG0739 # 
Protein_GI_number: 15639693 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Treponema pallidum # 82 240 159 308 308 92 40.0 1e-18 MGFFGHIARATLIVGLAGATIAVPMTGRIGSATALTMPAMAMGAPPDRQSWAAGQVLPRA SMLEGSLTAASRARVRAPVSMVGCASSAGADGTRSVNITTDTVYWPLAEGSFTITSPFMM RVSPVSGQLLQHEGIDMAAPLDTPVVAVYQGTVTEVAENSRSGAYVQIRHQRKDGTVFYS AYLHQYMNRITVKVGDQVTAGQTIGAVGNNGWSTGPHLHFEIHNEKDEPVDPEAWMAGVR AIYPGQEACQ >gi|319978805|gb|AEUH01000079.1| GENE 6 3710 - 4015 113 101 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGIVSAVADPMRPVIGTAMVAPARPTIRVARAMCPKNPTVLRGLPGPGWDSATCPPLGL PPLRGVEASRCRWARLGSGHCNSVPLTRVPRSLSVGQSAYS >gi|319978805|gb|AEUH01000079.1| GENE 7 4163 - 5194 1433 343 aa, chain - ## HITS:1 COG:Cgl1413 KEGG:ns NR:ns ## COG: Cgl1413 COG1253 # Protein_GI_number: 19552663 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Corynebacterium glutamicum # 1 339 1 346 354 221 38.0 1e-57 MSAGLGIAVTVLLLGVNAFFVAGEFAVTSTRRSQIEPLAEQGRPGSRHALHALRHVSLML AICQLGVTVASTTLGVVAEPAIAHLVEAPLARAGLPGASAHVVGFAAALAVVLFAHVVFG EMVPKNISLASSVRMLLVLAPVLVGIGRVIRPVITAMDAFANWFVRLAGYEPRAEIASTF TVEEVATIVETSQAEGVLDDELGLLSGALEFSEETAGSTMVPLDGLVVLPASATPADVEE QVARTGYSRFPILDGGGAISGYVHLKDVLYAEGAQRDEPITPWRIRRLETVGASDEVEDA LRRMQRTGVHCALVESEGEAAGVLFLEDILEQLVGEVRDSLQR >gi|319978805|gb|AEUH01000079.1| GENE 8 5191 - 6495 1783 434 aa, chain - ## HITS:1 COG:Rv1842c KEGG:ns NR:ns ## COG: Rv1842c COG1253 # Protein_GI_number: 15608979 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Mycobacterium tuberculosis H37Rv # 6 424 14 440 455 285 42.0 1e-76 MLVLGLVLTAGTFVFVSAEFSLVAIDQAVVEKRAEEGQRGAARVLRATKTLSTQLSGAQV GITLTTILLGYTTQSTIASLLESALGSAGVAWGLATGIAAFAAAAFINVFSMLFGELVPK NLALAHPMDTARAVVPFQMAFTTVFAPVIWVLGGTANWVLRRMGIEPREEISSARSAGEL AALVEHSAEEGTFDTSTASLFTNSIRMSRLCAADVMTDRGRVRTLPEGASAADVIALAAS 
TGHSRFPVIGEDSDDVVGLVSLRRAVAVPHERRAEVPVVSSSLLAPAPSVPETAPIGPLM VQLRDEGLQMAVVVDEYGGVSGIVTLEDVIEEIVGEVSDEHDQRRLGIRPRPDGTLLVPG TLRPDELKARTGIVLPDDGPYDTLGGLIMNELGDIPAVGQRLQVDGVGLEVAQMQGRRVT QIALTPPPEPEEAG >gi|319978805|gb|AEUH01000079.1| GENE 9 6720 - 7062 97 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189468|ref|ZP_06608188.1| ## NR: gi|293189468|ref|ZP_06608188.1| oxoglutarate dehydrogenase (succinyl-transferring), E1 component [Actinomyces odontolyticus F0309] # 21 114 11 103 1304 83 53.0 4e-15 MKTLTGLIQLTPPPPVPGPEGQNQPEETVPSSTQHPDPQWYIAKMRDLYARDPGQLDESW RAYFSTESAPPQLRSPLPAMPTAPGAGAAPVLPEAGAGIGAARTTPAPGDGTTP Prediction of potential genes in microbial genomes Time: Thu May 12 17:33:30 2011 Seq name: gi|319978801|gb|AEUH01000080.1| Actinomyces sp. oral taxon 178 str. F0338 contig00080, whole genome shotgun sequence Length of sequence - 5410 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 3686 4953 ## COG0567 2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, and related enzymes 2 1 Op 2 . + CDS 3742 - 4344 690 ## COG2755 Lysophospholipase L1 and related esterases 3 2 Tu 1 . 
- CDS 4469 - 5410 1064 ## COG1162 Predicted GTPases Predicted protein(s) >gi|319978801|gb|AEUH01000080.1| GENE 1 3 - 3686 4953 1227 aa, chain + ## HITS:1 COG:Rv1248c_3 KEGG:ns NR:ns ## COG: Rv1248c_3 COG0567 # Protein_GI_number: 15608388 # Func_class: C Energy production and conversion # Function: 2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, and related enzymes # Organism: Mycobacterium tuberculosis H37Rv # 332 1226 1 894 894 856 49.0 0 GTTPPAPASSAPPAPTGQLPAPGPEPEGGPLSVTGPATGLEEPAPQEGIANEESVTRADL PPAPPAAVAEATSPYTRQQHGRAAFTRGAGAATSDQTSVLKSAARATAKHMEASLSVPTA TSQRQIPAKLLIENRALINSHLARSVGGKVSFTHLIGYALVEALCEMPGLNVRYALEGGK PAVEHMAHIGFGLAIDVADAEGNHSLKVPVVRDADTLTFAEFVAAYQDLVSRARRGALAP ADFAGATVTLTNPGTLGTTTSVPRLMAGQGIIIGVGATDYPAEFRGVSPKRLAAMGIGKT MFFSSTYDHRVIQGADSGRLLALIDQKLSGRDGFYERVFTSMHVPTRPYKWEADYEYDPA REKGKAARIAELIHAYRSRGHLAADTDPLSYRVRHHPDLELSSYGLSVWDLDRPFPGGGP GGGQTLPLREILAQLRDTYTRTVGIEYMHIQDPVQRQWVQSRIEKPYSAPDDAEQARILD TLIRAEAFEEFLQTKFTGQKRFSLEGGESLIPLLDELLSSAALAGIHEVAIGMAHRGRLN VLANIAGKSYSQIFGEFEGNYVANPAQGSGDVKYHLGTWGVYSVDDGMATKVYMAANPSH LEAADGVLEGIVRAKQDALGDPDLPIIPVLIHGDAAFVGQGVVQETLNMSQVEGYKTGGT IHVIVDNQIGFTTGPASGRSTRYPTDLAKGLQIPILHVNADDPEAVVRCARLAFQYRCRF HKDVIIDMVCYRRRGHNEGDDPSMTQPVMYSLINRIPSTRSVYVRNLVGRGRLTEEQARA SIAKYEAELSRILDETRSGGHEGRPAPDPSLTAGVGGAGEAEDSGWTMPESQLPGQGMMI GWTSATSAPRIRRIGKAHTRFPQGFTPHPKIRRLCERRREMAMGERPIDWGFAEILAMGT LLMEGTNVRLSGEDVARATFVQRHAVLHDHTDGREFTPLRFLRPDQADFGVWNSPLSEYG VLAFDYGYSLESPETLTIWEAQFGDFANGAQTVIDEFICSAEQKWGQRSALVMLLPHGYE GQGPDHSSARIERYLQLAAQDNMWIVQPSTPANHFHMLRTQAYKRPRKPLIAFTPKQLLR LAAAASSVEEFTSGSFRPVIGEARAGIDPASVERVLVCTGRVYYDLVRERAARRDPATAV VRLEQLYPLPAQALAEELAKYPGAEVVWVQDEPRNQGAWPHLALNLFAQWPRPVRLVSRP ESATTAAGRPSLHKEQASLLLRAAFGD >gi|319978801|gb|AEUH01000080.1| GENE 2 3742 - 4344 690 200 aa, chain + ## HITS:1 COG:alr1529 KEGG:ns NR:ns ## COG: alr1529 COG2755 # Protein_GI_number: 17229021 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Nostoc 
sp. PCC 7120 # 1 192 9 204 206 75 30.0 6e-14 MRLCVIGDELVAGTGDSRSLGWIGRVCARSRFKVPATVMALAMPGETTKEMGMRWEAEVS PRLADDQPRGLVIGVGPADVPAGVSTARSRLNLANITDRAAVLGIPCFVVGPPPLAGVDQ QALRALSRSCAEVCARRQIPFVDTFTPLVSHDQWFEDMAGSMARNEAGMTLPGQTGYALI AWIVQHQGWYEWTGASLPED >gi|319978801|gb|AEUH01000080.1| GENE 3 4469 - 5410 1064 313 aa, chain - ## HITS:1 COG:ML0791 KEGG:ns NR:ns ## COG: ML0791 COG1162 # Protein_GI_number: 15827343 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Mycobacterium leprae # 10 260 57 304 327 219 50.0 7e-57 GGPRAGGAGRVVGVRAKEIRRGGVIMGDRVRLAGDLSGAPGALARIVRVEERSTVLRRSL EDAPDARGEKAIVANASLMCVVIALADPPPRTGMIDRCLVAAYEAGLRPVLVLTKSDLAD PGPLVGAYADFDLDVVVTGGGGASGADRLAEALSGEFSVLVGHSGVGKSTLINALSPDAD RAVGRVNGVTGRGRHTSTSSEAFELPGGGWIVDTPGVRSFGLGHVGADDVLAVFPDVRDA ASWCLPLCTHAQSEPSCALDAFARGEAPFGAGSGGGADPGSGVLARRASRVASARRLLDA VATAEAASRAARG Prediction of potential genes in microbial genomes Time: Thu May 12 17:33:31 2011 Seq name: gi|319978798|gb|AEUH01000081.1| Actinomyces sp. oral taxon 178 str. F0338 contig00081, whole genome shotgun sequence Length of sequence - 1482 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 152 60 ## gi|293189454|ref|ZP_06608174.1| putative ribosome small subunit-dependent GTPase RsgA 2 1 Op 2 . 
- CDS 152 - 1453 1503 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase Predicted protein(s) >gi|319978798|gb|AEUH01000081.1| GENE 1 2 - 152 60 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293189454|ref|ZP_06608174.1| ## NR: gi|293189454|ref|ZP_06608174.1| putative ribosome small subunit-dependent GTPase RsgA [Actinomyces odontolyticus F0309] # 1 49 1 49 361 79 85.0 5e-14 MGRRDTGTDDPRVRVRAPKRSRPRTKDRPDWSSAPVGSVVGIDRGRYQVV >gi|319978798|gb|AEUH01000081.1| GENE 2 152 - 1453 1503 433 aa, chain - ## HITS:1 COG:Cgl0740 KEGG:ns NR:ns ## COG: Cgl0740 COG0128 # Protein_GI_number: 19551990 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Corynebacterium glutamicum # 1 423 11 425 430 333 49.0 3e-91 MTVWEAPTASGPIDAVVALPGSKSQTARALYLAAVSDAPTTIRGALDARDTRLFVGALEQ MGASFAPQGGALRVTPMGERPRPAAIDCGLAGTVMRFLPPLAALSRGTTRFDGDAAARAR PLAPLLGALEGMGARVHHEGAPGRLPFTIRGPLRTPLGAQCAVDASASSQFLSALLLVAP LIGDPLFVSAPGRVVSMPHVEMTVRALARAGIDIEEVDEAGPVRTWHVFPGRPSPGDTDI EPDLSNAGPFLAAAMVTGGRVRVPGWPRATSQPGDAWRAILGHMGARVELGDDGLTVEGP GAGNYPGIDADMSAVGELTPALAAICASASSPSRLSGIAHLRGHETDRVAALATELNRLG GDAQDGGDCLAITPAPLRPGTVETYEDHRMAAFGAVLGLTTAGVGVRDIECTSKTLPGFA RMWAGALTGGREP Prediction of potential genes in microbial genomes Time: Thu May 12 17:33:36 2011 Seq name: gi|319978793|gb|AEUH01000082.1| Actinomyces sp. oral taxon 178 str. F0338 contig00082, whole genome shotgun sequence Length of sequence - 1882 bp Number of predicted genes - 5, with homology - 3 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 286 192 ## HMPREF0573_10389 hypothetical protein 2 2 Op 1 . + CDS 494 - 1105 769 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 3 2 Op 2 . + CDS 1122 - 1427 400 ## gi|154509480|ref|ZP_02045122.1| hypothetical protein ACTODO_02012 4 3 Tu 1 . + CDS 1639 - 1713 95 ## + Term 1728 - 1767 5.8 5 4 Tu 1 . 
- CDS 1801 - 1875 73 ## Predicted protein(s) >gi|319978793|gb|AEUH01000082.1| GENE 1 1 - 286 192 95 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10389 NR:ns ## KEGG: HMPREF0573_10389 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 2 92 20 107 165 62 40.0 4e-09 MVARPLLAAPFIADGIDAVRDPRAHVARIAPARPLIDRAASVAGVEADDGRLVAATRVLG AVTAVAGVGFALGRAPRVCAAVLAVVGAPVALAPR >gi|319978793|gb|AEUH01000082.1| GENE 2 494 - 1105 769 203 aa, chain + ## HITS:1 COG:Cgl0743 KEGG:ns NR:ns ## COG: Cgl0743 COG1595 # Protein_GI_number: 19551993 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Corynebacterium glutamicum # 20 200 14 194 206 211 60.0 5e-55 MDTTTPTPAGAAPPAEQAELRERFMDEAMPLMDQLFGAALGMTRNRADAEDLVQETYLKA YSKFHQYKPGTNIKAWLYRILTNTYITQYRKAQRSPKRASTDTVEDWQLAEAASHDARGL VSAEVEALEALPSEQLRDALESLSEEHRMVIVMADVESMTYKEIAQALGIPIGTVMSRLN RARRNLRAKLGEVAAEYGIGGAR >gi|319978793|gb|AEUH01000082.1| GENE 3 1122 - 1427 400 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509480|ref|ZP_02045122.1| ## NR: gi|154509480|ref|ZP_02045122.1| hypothetical protein ACTODO_02012 [Actinomyces odontolyticus ATCC 17982] # 7 100 6 99 99 117 67.0 2e-25 MNGPAPTCREVLEQIYALIDCEECDRRGALIDGGDIDGPDARLRALMLAHAASCAQCSDA LEAERHVRALLRRCYGTAQAPAALRARVTASITRISVAYRG >gi|319978793|gb|AEUH01000082.1| GENE 4 1639 - 1713 95 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSKRGRKRRDRRKNGANHGKRPNA >gi|319978793|gb|AEUH01000082.1| GENE 5 1801 - 1875 73 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRRARGRHTWEARARQAAQALTGA Prediction of potential genes in microbial genomes Time: Thu May 12 17:33:51 2011 Seq name: gi|319978791|gb|AEUH01000083.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00083, whole genome shotgun sequence Length of sequence - 612 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 598 607 ## Bfae_20990 glycosyltransferase Predicted protein(s) >gi|319978791|gb|AEUH01000083.1| GENE 1 1 - 598 607 199 aa, chain - ## HITS:1 COG:no KEGG:Bfae_20990 NR:ns ## KEGG: Bfae_20990 # Name: not_defined # Def: glycosyltransferase # Organism: B.faecium # Pathway: not_defined # 1 199 108 295 755 79 35.0 9e-14 MFPGATVARGLARLWGRPYAITEHRPSTLDSPAFGARCRPIAAAVAGAGALTTVSDGFAR RLGEHYGTGEWEAVPLPVPALFFDEPLGGADSAGSADGTGSAGGAGGAGGATAAGVVRFV HVSHLDGNKRPGPMCDAFLEAFPHGGARLDVVGGSPAQVGVLRSAVGDRPGAASIRFHGR MDRRGTAREMARADVFVLA Prediction of potential genes in microbial genomes Time: Thu May 12 17:33:58 2011 Seq name: gi|319978788|gb|AEUH01000084.1| Actinomyces sp. oral taxon 178 str. F0338 contig00084, whole genome shotgun sequence Length of sequence - 4157 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1288 1676 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 2 2 Op 1 1/0.000 + CDS 1422 - 3041 1785 ## COG0367 Asparagine synthase (glutamine-hydrolyzing) 3 2 Op 2 . 
+ CDS 3038 - 4147 1035 ## COG0438 Glycosyltransferase Predicted protein(s) >gi|319978788|gb|AEUH01000084.1| GENE 1 1 - 1288 1676 429 aa, chain - ## HITS:1 COG:MA4450 KEGG:ns NR:ns ## COG: MA4450 COG2244 # Protein_GI_number: 20093236 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Methanosarcina acetivorans str.C2A # 5 362 16 378 494 132 26.0 1e-30 MLAAVMTLLSGTLAAQVIGFVLQIGIARTYSATDKGLFGIYGSVASLVVTVAAARFDLSV VLPRDDDDARVLVRLAGRCVVVSSLLTSLACVAAASWVSSRYGSAELAAWLCGSGVTVFA LAQAANLQYWLTRKGRFGDIARSSVVRSLAVAGLQLVCGWAAGGGLNALIGATVVGQLIA LAYVWGRGRDARAPGGPGAPTMGEMARRYRRMALLGGPNVLVDAVRNTGINLLIGSAAVA SLGQFQLAWAVLQVPVALIVGSIGQVFLRTLSRTEPGRMGSLVAATMRRAVLGAAVPFGA LYALAPWLFPVVFGAQWDQAGDFARSLVPWLAMMVVSSPVSNLFVVTDNQHRMLAFAVVY CAVPLAWLWLSPLDLAATVRVLGAGMAALLVVMVAMAWSCAREFDRRGGPAQGADEPDRG GEEPDHGGQ >gi|319978788|gb|AEUH01000084.1| GENE 2 1422 - 3041 1785 539 aa, chain + ## HITS:1 COG:MA0051 KEGG:ns NR:ns ## COG: MA0051 COG0367 # Protein_GI_number: 20088950 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Methanosarcina acetivorans str.C2A # 49 413 65 436 590 131 27.0 3e-30 MLPTISLDASWRRSSAPRAGGGAGSRTWAWSPVGDLPPSLFSFPSDPDLARLPGQWALVS RAGGQVRLAVDPMRSRVLLFAFHNGRWVVADSPDALRALVPWRLDRAGAAQLRHLGFSLG ARTLVEGVRSVQAAHTVVLRDDGSWDQRPYMRYRHGPDLVEDPDEFAGIFGGALRRCVSR LVAAAGRSQLVVPLSGGLDSRLLATALVEAGAPRLAAFSYGVPGCGEEAVARGVADSLGI PFTAVHLDPDRMAERWRRADGFRRATWGATSLPHVQDWYALGELRCRRLIDDGAVIVPGH TIVGNAHDCAALARRPSPAQAARLITLYHGSLQGAPGALRRDPDMALALRETARDAGVGA GASEAALQSWVEWFNLRERQAKYINMSMRTYEYFGYGWAAPMLDADMWRAWLRGGPGLTR DRGWYAGFVDALYSSVSASPPRGYYSAPGNDHGIPPGMKRAALRLMGATGADRALARARS VRAQLRHPMGFEAFSRQVPQPLLALRYMTGATALGTWARLFTDNRWGGPGRIVPEGEGA >gi|319978788|gb|AEUH01000084.1| GENE 3 3038 - 4147 1035 369 aa, chain + ## HITS:1 COG:alr5202 KEGG:ns NR:ns ## COG: alr5202 COG0438 # Protein_GI_number: 17232694 # Func_class: M Cell wall/membrane/envelope biogenesis 
# Function: Glycosyltransferase # Organism: Nostoc sp. PCC 7120 # 84 368 124 409 429 91 25.0 3e-18 MRTAITKGTLRIPPTYFALAHADAMPDIEWRAFTLVADIADPAVAVPVEQATPLAGRLGA RARERLKWGRLGTMARAVEAWGPDVVHQQQATWSLPAVRASRRTGAPMVTTLHGGDAYRA GAARGAAGGAWNERNRRAAFAQSEALLAVSRFLAGVAVASGAPADKVEVHYQGVDTAFWT PGDPSRRADAEPTVLFVGALTALKGVMDLVDVSAALVERFPHRLVVVGEGPLEERVRAAA GPHVRMTGALPREWVRDLVRSAAVLVCPTRSSEGRQEAAGLVLLEAQACAVPVIAGRVGG TPEMLADSATGFLTADGDRDDLAAALGRVLAMPEEERAAMGAAAREWVVGNRSLRGATAR LREVYASLG Prediction of potential genes in microbial genomes Time: Thu May 12 17:33:59 2011 Seq name: gi|319978785|gb|AEUH01000085.1| Actinomyces sp. oral taxon 178 str. F0338 contig00085, whole genome shotgun sequence Length of sequence - 2222 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 - CDS 8 - 1036 1263 ## COG0673 Predicted dehydrogenases and related proteins 2 1 Op 2 . - CDS 1045 - 2127 1483 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 3 1 Op 3 . 
- CDS 2139 - 2222 117 ## Predicted protein(s) >gi|319978785|gb|AEUH01000085.1| GENE 1 8 - 1036 1263 342 aa, chain - ## HITS:1 COG:MK0248 KEGG:ns NR:ns ## COG: MK0248 COG0673 # Protein_GI_number: 20093688 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Methanopyrus kandleri AV19 # 6 332 3 315 317 138 32.0 2e-32 MGRAPVRVGILGLGSMGRHHVRNARATPGFEVVALADPAGDPHGVAGPLEVLPDVHALIR AGIDAAIVAAPTVHHEEAALALARAGVHALVEKPLAASAAAGARIRDAFCAAGLVGAVGY VERCNPALIEMRRRIADGQLGQIYQIATRRQSPFPARICDVGVVKDLATHDVDLAAWVAG SPYATISAMTTARSGREHEDMMVASGTLANGILVNHIVNWLTPFKDRTTIVTGERGALVA DTAMGDLTFHENGDQPLQWDQIAAFRGVSEGTVIRYALTKREPLAVEHAHFRDAVLGKGC QHVSMDEGLEALRVTEAILDSARTGRAVRLAQPPGPARPSGA >gi|319978785|gb|AEUH01000085.1| GENE 2 1045 - 2127 1483 360 aa, chain - ## HITS:1 COG:MTH1188 KEGG:ns NR:ns ## COG: MTH1188 COG0399 # Protein_GI_number: 15679199 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Methanothermobacter thermautotrophicus # 1 352 2 355 360 323 52.0 3e-88 MPAARPLIGEEEIAAVAAVMRSGMIAQGPEVAAFEEEFARAMAPGARAVAVNSGTSALHL GLMAAGIGPGDEVVVPSFTFAATANAVAITGATPVFADIDLRTYTLAPASVQEAVTPRTR AIMPVHLYGHPAPMDQILAIAEANGLMVFEDAAQAHGASLHGRRVGCFGEFGAFSFYPTK NMTCGEGGIVTTSDPRIERQVRLLRNQGMERRYANEVVGLNNRMTDISAAIGRVQLRKAA DWTRRRQANAAFLDAELRGVVTPHVAPGAVHVYHQYTIRVEDRDRFAAALAQEHGVGSGV YYPVPCHRLESLSRFAPARALGATDEAARSVLSLPVHPSLLTRDLERVVTGVNALARAGS >gi|319978785|gb|AEUH01000085.1| GENE 3 2139 - 2222 117 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no GRFTCPATGRGYRLDGAGALAPEGEGE Prediction of potential genes in microbial genomes Time: Thu May 12 17:34:04 2011 Seq name: gi|319978783|gb|AEUH01000086.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00086, whole genome shotgun sequence Length of sequence - 538 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 508 640 ## COG0110 Acetyltransferase (isoleucine patch superfamily) Predicted protein(s) >gi|319978783|gb|AEUH01000086.1| GENE 1 1 - 508 640 169 aa, chain - ## HITS:1 COG:PAB0773 KEGG:ns NR:ns ## COG: PAB0773 COG0110 # Protein_GI_number: 14521366 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Pyrococcus abyssi # 14 169 11 158 205 143 48.0 2e-34 MRHTEREVFVASAPTADIDPSATIGEGTRVWHLAQIREGAAIGRDCVIGRGAYIGAGVRV GDGCKIQNHALVYEPAGLGSGVFVGPAAVLTNDRHPRAVNPDGSPKGAGDWTRVGVDVGR GASIGARAVCVAPVSIGPWAMVAAGAVVTRDVPAYALVAGVPARRIGWV Prediction of potential genes in microbial genomes Time: Thu May 12 17:34:06 2011 Seq name: gi|319978776|gb|AEUH01000087.1| Actinomyces sp. oral taxon 178 str. F0338 contig00087, whole genome shotgun sequence Length of sequence - 6094 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 681 931 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 2 1 Op 2 . + CDS 681 - 1160 373 ## Xcel_2593 hypothetical protein 3 1 Op 3 1/0.000 + CDS 1153 - 2514 1858 ## COG1316 Transcriptional regulator 4 1 Op 4 . + CDS 2511 - 3872 1344 ## COG1316 Transcriptional regulator - Term 4182 - 4243 6.1 5 2 Tu 1 . - CDS 4313 - 5308 1503 ## COG1087 UDP-glucose 4-epimerase 6 3 Tu 1 . 
- CDS 5504 - 6058 733 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase Predicted protein(s) >gi|319978776|gb|AEUH01000087.1| GENE 1 1 - 681 931 226 aa, chain + ## HITS:1 COG:Cgl0325 KEGG:ns NR:ns ## COG: Cgl0325 COG0463 # Protein_GI_number: 19551575 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Corynebacterium glutamicum # 6 220 11 228 236 179 46.0 3e-45 MTVNDDTWVVIPLFNEAPVIAGVVEGLRERFPHVLCVDDASVDGGGETARRAGAWVAAHP VNLGQGAALQTGFDWALQRGARFVVTFDADGQHQARDAEAMVATARERDLAFVLGSRFLG AAPLGMGAARRAVVRAAALATRWRSGLRVTDAHNGLRVIRADALRRVRLTHNRMAHASQI VSQLGATGLPWAEAPVTIRYSEYSRSKGQPLLNSVNILVDLALEDA >gi|319978776|gb|AEUH01000087.1| GENE 2 681 - 1160 373 159 aa, chain + ## HITS:1 COG:no KEGG:Xcel_2593 NR:ns ## KEGG: Xcel_2593 # Name: not_defined # Def: hypothetical protein # Organism: X.cellulosilytica # Pathway: not_defined # 5 110 1 106 127 68 41.0 6e-11 MRYFLIKALLLGALCLMTWALVRPVRSQSSLAVRRLATALLMVCAAAAVLFPQVANTVAR AIGVERGVNLLTYGLIVAFFAQSVTAYRRDTAVQRQLTALARQVALASARPPYAGAGAPP PGAGTPCGCEGGAGGVQRAATEADIPRNGPCNDMVATDE >gi|319978776|gb|AEUH01000087.1| GENE 3 1153 - 2514 1858 453 aa, chain + ## HITS:1 COG:MT0844 KEGG:ns NR:ns ## COG: MT0844 COG1316 # Protein_GI_number: 15840235 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Mycobacterium tuberculosis CDC1551 # 58 433 205 582 684 113 28.0 7e-25 MSSLPMRTIQHSATNFGRFGLIRSALAVALSGVLFVTSAGAFLYRALAGQVQDRVIDINS FVTNSSDTQTPDSFEGRAVNVLVLGIDSREGKNSDLGAGDSEDVGGLRNDSTMVVHISAD RTRLQVLSIPRDTLVDIPSCRHADGSSSDPQTEVMFNTSMFTGADGGSEASDVAPGVACV KATVEQMSGMAVDAFMVVDFAGFISMVDALGGVWFNIPEDIADDDAGLYIDQGCWKLGGT HALAYMRARYSLEDGTDISRIGRQQQLISAIMRELQSKNYVTDLPSLVSFLQAAIATVNI SSNLADPMADATLLVNLMSKLDRANMQFVTLPVVAPPWDENRRITAEPLASKVWQAIEDD QALPVGTEYTDGNGAHLTVPDPNAGSSSAPGAGDSSQGQGASPSAPGQGAAPSNPGQGSE DSTSSQTDVAQANNSAEAEAKNQVDTCPPADKR >gi|319978776|gb|AEUH01000087.1| GENE 4 2511 - 3872 1344 453 aa, chain + ## HITS:1 COG:Cgl2840 KEGG:ns NR:ns ## COG: 
Cgl2840 COG1316 # Protein_GI_number: 19554090 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Corynebacterium glutamicum # 147 424 1 284 288 186 40.0 1e-46 MSTHPPSFTPDPKRHRPGADDARVVGAPQDPHSAPAPPPPGVAPQMWRISEDQQGPERPL LPRTLDQPKRTGGEGSRIPQARAMPPSYQPSSASNPPSYAPASGPAPSYAPGQDAPSGPR GPRDGGPSGPAAPAKPRRRRRPFRTAMRAVGIVLVVLLAWGAFLLWDANSNLGRVDALSG KADTSGTTYLLAGSDSRADGAVQDGFNESERADSIMLVNVAQNGQTVALSIPRDTYAEIP GYGWDKINASYSYGGAALLVETVENLTGLAVDHFVQIGMGAVPDMVDAVGGVELCYDNTV SDPYSGLNWEAGCHTVDGTTALAFSRMRYADPEGDIGRTKRQRQVISKVVSTAMSPSTLI NPMRTLSVERAGSKSFTVDDSSSNVFTVTSLVMALRSATSSEMMGVPPIESLDFTTDAGA SAVLLRDTTAPDFFAKLRTGSLTTSDFNQVDGF >gi|319978776|gb|AEUH01000087.1| GENE 5 4313 - 5308 1503 331 aa, chain - ## HITS:1 COG:L0024 KEGG:ns NR:ns ## COG: L0024 COG1087 # Protein_GI_number: 15673961 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Lactococcus lactis # 1 321 1 321 326 251 42.0 2e-66 MSILVCGGAGYIGAHVVRLLSQRGDKVVVVDDLSTGSADRIGDATLVTLDVASDQAQSVL SNVMVDEDVTAVIHFSARKQVGESVKRPVWYYQQNVGGLANVLAAMNDAGIDQMIFSSSA AVYGMPPVELVPETVDCRPINPYGETKLIGEWMMADCERAWDLKWIGLRYFNVAGAGWPD LADPAIMNLIPMVLDRLERGESAKIFGTDYDTPDGTCVRDYIHVLDLAEAHIAALDVLAE GRQPEHHTYNVGTGLGTSVREIIDGLRRVIGWDFPVEEQGRRAGDPPKLIGDPLSIGVDL GWKANNGVEEIIESAWEGWQAGKRPITVPGR >gi|319978776|gb|AEUH01000087.1| GENE 6 5504 - 6058 733 184 aa, chain - ## HITS:1 COG:Cgl0694 KEGG:ns NR:ns ## COG: Cgl0694 COG0041 # Protein_GI_number: 19551944 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Corynebacterium glutamicum # 16 180 3 163 165 169 63.0 3e-42 MSTSPDVRLTATGEDPLVGVVMGSDSDWPTMEGAVAALAEFSIACEVGVVSAHRMPEDMV AYGRSASERGLRVIIAGAGGAAHLPGMLAALTELPVIGVPVALKHLDGVDSLHSIVQMPA GVPVATVSIGGARNAGLLAARILGAGEGERAAALRARMRGFQGELRAMATAKGAALAERV SSQR Prediction of potential genes in microbial genomes Time: Thu May 12 17:34:16 2011 Seq name: gi|319978757|gb|AEUH01000088.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00088, whole genome shotgun sequence Length of sequence - 18780 bp Number of predicted genes - 21, with homology - 18 Number of transcription units - 13, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1117 1313 ## COG0026 Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) 2 1 Op 2 40/0.000 - CDS 1141 - 2463 1725 ## COG0642 Signal transduction histidine kinase 3 1 Op 3 . - CDS 2479 - 3159 959 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 4 1 Op 4 . - CDS 3254 - 4285 1502 ## COG2114 Adenylate cyclase, family 3 (some proteins contain HAMP domain) - Prom 4385 - 4444 4.4 + Prom 4346 - 4405 2.5 5 2 Op 1 . + CDS 4514 - 5206 807 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation 6 2 Op 2 . + CDS 5246 - 6196 1290 ## COG2171 Tetrahydrodipicolinate N-succinyltransferase 7 3 Op 1 4/0.000 - CDS 6441 - 7574 1297 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 8 3 Op 2 . - CDS 7576 - 7920 481 ## COG1146 Ferredoxin + Prom 7916 - 7975 2.2 9 4 Tu 1 . + CDS 8125 - 8397 204 ## gi|227494786|ref|ZP_03925102.1| conserved hypothetical protein + Term 8586 - 8622 1.3 10 5 Tu 1 . + CDS 8706 - 8936 205 ## + Term 9104 - 9146 6.3 11 6 Tu 1 . - CDS 8896 - 8991 292 ## - Term 9086 - 9132 8.0 12 7 Op 1 . - CDS 9270 - 9512 336 ## CLH_2446 glyoxalase family protein - Term 9537 - 9572 -0.5 13 7 Op 2 . - CDS 9785 - 11704 2879 ## COG1217 Predicted membrane GTPase involved in stress response - Prom 11898 - 11957 3.7 14 8 Tu 1 . + CDS 11930 - 12769 1061 ## COG0676 Uncharacterized enzymes related to aldose 1-epimerase 15 9 Tu 1 . + CDS 12909 - 13481 621 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 16 10 Tu 1 . + CDS 13682 - 14437 760 ## gi|293189351|ref|ZP_06608074.1| hypothetical protein HMPREF0970_00387 + Term 14543 - 14592 2.6 17 11 Tu 1 . - CDS 14434 - 14697 216 ## 18 12 Op 1 . 
+ CDS 14636 - 15472 949 ## COG0789 Predicted transcriptional regulators 19 12 Op 2 . + CDS 15435 - 16370 746 ## COG1131 ABC-type multidrug transport system, ATPase component 20 12 Op 3 . + CDS 16367 - 17131 602 ## Clos_2750 hypothetical protein 21 13 Tu 1 . - CDS 17163 - 18656 2194 ## COG0174 Glutamine synthetase Predicted protein(s) >gi|319978757|gb|AEUH01000088.1| GENE 1 1 - 1117 1313 372 aa, chain - ## HITS:1 COG:Cgl0691 KEGG:ns NR:ns ## COG: Cgl0691 COG0026 # Protein_GI_number: 19551941 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) # Organism: Corynebacterium glutamicum # 7 372 2 376 387 296 49.0 3e-80 MDNEHRPTVAVIGGGQLARMMQESAIALGIALRALVEAPGGSGGQAIVDAPVGDPRDLAA VLALIEGADVLTFEHEHIPAAVLEACAPLVAIEPPAAALLHAQDKLAMRERLTAMGVPCP RWARVETRADLEAFGAAVGWPLVVKTPRGGYDGHGVKAVHQARDADPWLGEGPLLAEEIV PFTREVAALLARRPSGEIRSWPVASTVQEDGVCSAVTAPAVGIRPATADLARRIGEDIAE RLGVTGVLAVEMFVVGEGDQERVLVNELAMRPHNTGHWTIDGAVTSQFEQHLRAVLDLPL GATDPVRPGWSAVMVNLLGSRLDEPARALGAAMEAGGPRARVHLYGKEVRPGRKLGHVTV VGPSAHEAEALA >gi|319978757|gb|AEUH01000088.1| GENE 2 1141 - 2463 1725 440 aa, chain - ## HITS:1 COG:RSp1043 KEGG:ns NR:ns ## COG: RSp1043 COG0642 # Protein_GI_number: 17549264 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 128 416 166 463 466 95 28.0 2e-19 MRERAVRIIVSVVAVVSLIMGVPGAFFASVLVWNNDQRALDTQGQAVLHAIERRISSDEP VTPQLLDSLVADPGSGSPVAYRVKVPGHALMVNQVRPSQPAMTTTVSSPAGISVQLTASG SRAVKRIAVTCAVFAAGMAVSMLTGWAMGRRLSRRLSAPLIYLAAQAEQIGSGQVRARVE SSGIEEIDLVSEELARTGERMAGRLAAERQFAADASHQLRTPLTALSMRIEEIELVSSQE EVRAEARSCLEQVERMTRVVTDLLDANARSNGQTEAIHILEVFNTAREEWEDRFESEGRP LVFLDEAEMPVLAEAGKLGQVLATLIENSLRYGAGTTTVRARKGTSKRGIVIEVSDEGEG VPDEIAPNVFEKGVSGHGSTGIGLALARDLAQAMGGRLELAQNQPPVFTVSLAAIPKSLD PDRVMPEGALMSMGRRSRRF >gi|319978757|gb|AEUH01000088.1| GENE 3 2479 - 3159 959 226 aa, chain - ## HITS:1 COG:all4503 KEGG:ns NR:ns ## COG: all4503 COG0745 # 
Protein_GI_number: 17231995 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Nostoc sp. PCC 7120 # 2 221 22 252 253 182 44.0 4e-46 MTRVLLVEDDPAISEPLARALGREGYEVLASSTGREALTNVEGADLVVLDLGLPDMDGLD VAREIRADGGRVPILILTARTDEVDMVVGLDAGADDYVTKPFRLAELLARVRALLRRHTA EPAESELRAQDIRVDVAAHRAFSGSTELQLTAKEFDLLRVLLREAGSVVERDELMREVWG SDPTGSTKTLDMHVSWLRRKLGDEASDPHYITTVRGMGFRFETTRR >gi|319978757|gb|AEUH01000088.1| GENE 4 3254 - 4285 1502 343 aa, chain - ## HITS:1 COG:MT2268 KEGG:ns NR:ns ## COG: MT2268 COG2114 # Protein_GI_number: 15841703 # Func_class: T Signal transduction mechanisms # Function: Adenylate cyclase, family 3 (some proteins contain HAMP domain) # Organism: Mycobacterium tuberculosis CDC1551 # 24 300 66 336 388 61 28.0 2e-09 MGARDVPSSAFPDELTAPSYVPRLLGGPLLYSAHDIAERTGTPIVRVNEFWRALGFPTPD PGAVVFSERDLDVFSKWHDLVESGAIDTATSRSLLRANSHLADRLALWQFEALVEDAMRR HGLDDTTARMYVLDHMRSRVEVFESMFIHSWQRQLEALLARLDKEVAQRGHEDRRSRFPL NRSLGFVDMVSYTSSSTILGDALVGLIERFEEESRNAVVEAGGRVVKMIGDAVLYIADDL PTGLRVATSLIERLNADDEILPVRASFVRGDVFSRSGDVFGPTVNLASRLVDIAPVGKIL TDPTTAAAIAAGKGGHGYELEEFPTADLRGFGPVSPYVLSAVD >gi|319978757|gb|AEUH01000088.1| GENE 5 4514 - 5206 807 230 aa, chain + ## HITS:1 COG:MT3381 KEGG:ns NR:ns ## COG: MT3381 COG0424 # Protein_GI_number: 15842873 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Mycobacterium tuberculosis CDC1551 # 2 226 3 210 222 132 41.0 6e-31 MRFILASRSPARRATLESAGVRPTVIVSEVDEDAVLDALPGGRAFSGGRTTPADEVMALA RAKQEAVTRILCEEGSSFQEAPSEALVVVGCDSMLELDGRMLGKPHTPDVARARIREMRG RTAALWTGHSMALLSPARPGGARDVAGADTASASTLVHFGDITDREIDAYVATGEPLGVA GSFTVDGLGGPFITGVTGDYHSVVGISLPLVRAMAAGLGVFWPDLWEPAG >gi|319978757|gb|AEUH01000088.1| GENE 6 5246 - 6196 1290 316 aa, chain + ## HITS:1 COG:ML1058 KEGG:ns NR:ns ## COG: ML1058 COG2171 # Protein_GI_number: 
15827515 # Func_class: E Amino acid transport and metabolism # Function: Tetrahydrodipicolinate N-succinyltransferase # Organism: Mycobacterium leprae # 8 315 9 317 317 318 59.0 9e-87 MDTNTVWGVGLATVTTSGQTLDTWYPAPRLGEGHDDELATRLALLEKVDDARGVRTRVVT CAQRLDEAPDSTEGAYLRLHALSHRLVRPNTVNLDGIFSLLPTVVWTDAGPCAVEGFEQT RLRLRARLGRPVVVVGIDKFPPMTNYVVPSGVRIADAVRVRLGAHLSEGTTVMHAGFVNF NAGTLGASMVEGRVSQGVVIGDGSDVGGGASTMGTLSGGGRQRVRLGRRCLLGANSGLGI PLGDDCVVEAGLYVTAGAKVALIDPASPGQERIVAARELAGESNILFRRNSVSGRVEAIV RDGHGIELNSALHASN >gi|319978757|gb|AEUH01000088.1| GENE 7 6441 - 7574 1297 377 aa, chain - ## HITS:1 COG:Rv1178 KEGG:ns NR:ns ## COG: Rv1178 COG0436 # Protein_GI_number: 15608318 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Mycobacterium tuberculosis H37Rv # 10 372 5 360 362 314 55.0 2e-85 MAELPRPLPLPAFPWNSLAPYKKQAAAHPDGICDLSVGTPVDPTPALIREALAGACDAPG YPAVVGTAEVREAILAWGARRNMVDVGEAGVIPTIGSKEAVAWIPALLGAGPGDTVLVPE VAYPTYDVGARLAGAVPVPVDPLRPGSWPDAAMVFLNSPGNPDGHVMDADQLRAAIAWAR SRGAVIVSDECYAALPWAEPHVSGGVPSLLDADVCGGDPSGLVVLYSLSKQSNLAGYRAA LVYGDPRLVAALVEARKHSGMMVPAPVQHAMAVALADEAHVRRQRDVYGARRARLLDAVG SAGLVNDPRSVAGLYLWLGCPGVDDDWELVRRLADLGILVAPGAFYGDAARGRVRMALTA TDERIGAAASRIASHWG >gi|319978757|gb|AEUH01000088.1| GENE 8 7576 - 7920 481 114 aa, chain - ## HITS:1 COG:Cgl1078 KEGG:ns NR:ns ## COG: Cgl1078 COG1146 # Protein_GI_number: 19552328 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Corynebacterium glutamicum # 1 104 1 104 105 157 78.0 5e-39 MTYVIAQPCVDVKDRACVDECPVDCIYEGERSLYIHPEECVDCGACEPVCPTEAIFYEDD LPDEWSDYLRANADFFSELGSPGGAQRTGVQEYDDPMIAALPPQNEEWKAENGY >gi|319978757|gb|AEUH01000088.1| GENE 9 8125 - 8397 204 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|227494786|ref|ZP_03925102.1| ## NR: gi|227494786|ref|ZP_03925102.1| conserved hypothetical protein [Actinomyces coleocanis DSM 15436] # 1 62 26 87 96 67 58.0 3e-10 MRWRPGERVVLRYRLDDGLHDALGTVVEAAIDHVSIETRRGLVRVEATTMVTGKSVPSPR 
APGPRAGARERGPGTGAQQRGPDTAPGTPP >gi|319978757|gb|AEUH01000088.1| GENE 10 8706 - 8936 205 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPDGPRARPAGSGRADGASARRLGLFCPWSSTRPGRSGVQRAGHSGCLASEVGFSKGYRH FASVISLAKCGSAVKA >gi|319978757|gb|AEUH01000088.1| GENE 11 8896 - 8991 292 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGCFHSDTDTTGHTFAAIPTLSQRYHTLQAI >gi|319978757|gb|AEUH01000088.1| GENE 12 9270 - 9512 336 80 aa, chain - ## HITS:1 COG:no KEGG:CLH_2446 NR:ns ## KEGG: CLH_2446 # Name: not_defined # Def: glyoxalase family protein # Organism: C.botulinum_E3 # Pathway: not_defined # 1 73 57 129 129 92 56.0 3e-18 MTGRRYDYGKGLNGHCGIALYVDTFEEVDASFRRVVDNGAVPVMEPATEPWGQRTCCIAD PEGNLIEIGSRDKPYEQKDA >gi|319978757|gb|AEUH01000088.1| GENE 13 9785 - 11704 2879 639 aa, chain - ## HITS:1 COG:ML1498 KEGG:ns NR:ns ## COG: ML1498 COG1217 # Protein_GI_number: 15827789 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Mycobacterium leprae # 7 624 3 614 628 725 64.0 0 MSTRQDLRNVAIVAHVDHGKTTLVDAMLWQSGAFSERDTVEKTGERVMDSGDLEREKGIT ILAKNTAVHYNGPAARALGVDGGVTINVIDTPGHADFGGEVERGLSMVDGVVLMVDASEG PLPQTRFVLRKALAAQLPVIIVVNKVDRPDSRIDEVVAETTDLLLNLGEDLMNDGADIDV DSLMDVPVVYACAKAGKSSLDNPGDGQLPAEDLEPLFETILSRIPGPAYDESAPLQAHVT NLDASPFLGRLALLRIHNGTLRKGADVGLARHDGTISRVHISELLATEGLERVSTASAGP GDIVAVAGIEDIMIGESLVDLDDPRPLPLITVDDPAISMTIGINTSPMAGRVKGAKVTAR QVKDRLDRELVGNVSIKVLPTDRPDAWEVQGRGELALAILVEQMRREGFELTVGKPQVVT KEIDGVLSEPMERMTIDIPEEHLGAVTQLMAARKGRMETMANHGSGWIRLEFVVPARGLI GFRTRFLTETRGTGIASSIADGYAPWQGPIEQRLTGSLVADRAGQATPYAMTNLQERGTF IVEPSSEVYEGQVVGENPRGEDMDVNICREKKQTNTRSATADVYESLTPSRKLTLEESLE FAASDECVEVTPEAVRVRKVVLDAQERFKIAARERRANR >gi|319978757|gb|AEUH01000088.1| GENE 14 11930 - 12769 1061 279 aa, chain + ## HITS:1 COG:VC2001 KEGG:ns NR:ns ## COG: VC2001 COG0676 # Protein_GI_number: 15642003 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized enzymes related to aldose 1-epimerase # Organism: Vibrio cholerae # 22 269 
42 287 296 167 37.0 2e-41 MTEIDIHTFGRHANAGHASPFGAHVLDWQPNSDHPLLWLSPRAVTDGTAPIRGGIPVCAP WFAAPDQSPAMPPPGAPAHGLARTRRWELEERAPQSLTYSFTHRRADAGGVFPHDFTARV VIQGSEELTVALTLVNDDDHPFAVEAALHAYLAVGDVKDITIDGLDGAPYYDAAKGRNAV QKGALSLVGPTDRIYVSTSPIHVMDPRWGRRITVDKSGSGTTVVWNPWREGAARIADMGD DDWQRFVCVETANARQRAITLWPGHPRRLTATYGVDQMD >gi|319978757|gb|AEUH01000088.1| GENE 15 12909 - 13481 621 190 aa, chain + ## HITS:1 COG:CC1900 KEGG:ns NR:ns ## COG: CC1900 COG0204 # Protein_GI_number: 16126143 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Caulobacter vibrioides # 3 181 8 185 196 110 38.0 1e-24 MGLKSAVATAYLAVSRWKLVCEPLPPKVVIIGAPHTSNWDGVFMAVSLWKAGRPFQFLVK DSLAKAPVLGAFIRAIGGVSVQRGHANGLVGQVAERFRASESFTLCMTPKGTRSPREHWK SGFYRIALEAGVPIQLGFVDASTRTFGWGPTYELTGDMRADMDAIRAFYADKTGVHPERA SVPRLRGEDG >gi|319978757|gb|AEUH01000088.1| GENE 16 13682 - 14437 760 251 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189351|ref|ZP_06608074.1| ## NR: gi|293189351|ref|ZP_06608074.1| hypothetical protein HMPREF0970_00387 [Actinomyces odontolyticus F0309] # 1 142 39 179 296 92 42.0 2e-17 MTPAVSAQHRVFLTAEGVALSQKQCQEMGNGTNADVSRTARHDGQEGCEFHWSLADAGAE VFTIDDTGKFAFRSRTDELLAGFDTPDARATVASVVLTVHGAHVHETSGQPQIDSHDEST KNDATTVTWSQPDGAVEARGTIDPGASPTPPPPPATGTAPPSASAAQSGLARPQAAGHAD APDSSSGIYPVLAGFLTPLAVVAVFLIVGRWRANRAGAASEGTAAAEALPPPADPTESAE SARFPYGKPLG >gi|319978757|gb|AEUH01000088.1| GENE 17 14434 - 14697 216 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRTVTPAAVAISPLVYRFPLIAPSFRWAHDNSLRNVMNKRTNRSWNMTWLMWGGASGARP RSRRGGSRRAPAGLRPQGADPRPGAAL >gi|319978757|gb|AEUH01000088.1| GENE 18 14636 - 15472 949 278 aa, chain + ## HITS:1 COG:BS_mta KEGG:ns NR:ns ## COG: BS_mta COG0789 # Protein_GI_number: 16080713 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 5 106 2 101 257 73 37.0 3e-13 MSGKRYTSGEIATAAGVTVRTIQHYDNIGLLPSTGRTDGGRRYYTQDDLVRLEQIVFYKS LDFPLDQIKERLLLEPGKTELLTMLEEQRLLLLQRMEHLHTSFATIGIMSEMIESDRQPP 
LALLLRFLSALPGDDVFSRAPQLLTEEQREALSPRFQDVEPVQAFYHRWKEALIEAAVLV HENAPPDSAAAQDLARRWWNAILSLTGGDMDLIEQLSQLGLEDQMGTGDDELMNSASRYV EAAFGLFSASGGPCPEPDGSSAGGDDDDRAERPDETVR >gi|319978757|gb|AEUH01000088.1| GENE 19 15435 - 16370 746 311 aa, chain + ## HITS:1 COG:Cgl2690 KEGG:ns NR:ns ## COG: Cgl2690 COG1131 # Protein_GI_number: 19553940 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Corynebacterium glutamicum # 1 297 1 301 310 252 47.0 7e-67 MIELSGLTKRYADKVAVDNISFRVRPGAVTGFLGPNGAGKTTTMRMILGLAAPTSGRVEI DGRPYAALDAPLKKVGAMVDASAIDPRLTPAQYLDILTTASGTGGARIRGVLAAVGLAEA ADKRISEFSYGMRQRAGIAAALIGDPETVLMDEPFNGLDVDGIHWLRALLKDLAGQGKAV LVSSHLLSEVEEIADRIVVLARGALVADMPMAELRGRSAGSYVRVQSDDASALRRALVTQ GARVEALGHGALRVRGSSARRVGDTAFEEGLRVHELVAHQPSLEQLFAELVEGKTEYGGA APGRESGAARR >gi|319978757|gb|AEUH01000088.1| GENE 20 16367 - 17131 602 254 aa, chain + ## HITS:1 COG:no KEGG:Clos_2750 NR:ns ## KEGG: Clos_2750 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 5 254 14 265 265 144 31.0 2e-33 MTRSINAEWKKIWFPAHSRLYLLLALVAAVAMGLVFASTTQVTRGEALSQLDPMSIISAN ILGVDVVAVLLLLFAAVQVGREFRERTAQSYLAVSPRRSTYFIAKSLAFGLVSLAVGAAV ALAALLDGLLLVSAVGKQAPALAEACRLAAGSALMPLFYVLLAVSATFCTRSTAAGAAIP FTVLFLPAIAELLPGALRGALVPLLPASAIHTLSGQARIGSAEHTGALVALAVLAAWVLL PAVFAAWRFQRTDV >gi|319978757|gb|AEUH01000088.1| GENE 21 17163 - 18656 2194 497 aa, chain - ## HITS:1 COG:MA3382 KEGG:ns NR:ns ## COG: MA3382 COG0174 # Protein_GI_number: 20092196 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Methanosarcina acetivorans str.C2A # 2 497 1 506 506 459 44.0 1e-129 MLSTDTVSLSPNPLVQALGKPAEEFTKADVMGYAEDHRIPMLNLRYVGGDGRLKTLNFAI QSKAHLDRVLTLGERVDGSSLFSFVEATSSDLYVVPRLSSAFLNPFSAIPTLELMCNFYD VDGAPLASAPQEVLRKAQRMLEAETGCRLEALGELEYYLFSEQERLFPVEPQRGYHEAAP FSKWEEVRTESLLHLASMGCSVKYAHAEVGNFVSGGLQMIQQEIEFLPVDVCDAADQMVL AKWVVRQVAHSHGIEVSFSPKIAVGQAGSGMHFHTRLVEDGVNRFSQGHGLTDDALKVIG 
GYLSHAASLTAFGNTAPTSFLRLVPHQEAPTAVCWGDRNRSVLVRVPLGWQNISDSMFRD ANPAEAPIGAVPNDAQTVELRSPDGSAAIHLLLAGITVAARLGLTEPGMLEYAKERYVSG DASRIEGLDQLPASCFETAGRLLAQRADYEAGGVFPSGLIDAWAHQLVALGDEHLREDLA ESRVSVEDLVEQYFHIG Prediction of potential genes in microbial genomes Time: Thu May 12 17:34:57 2011 Seq name: gi|319978753|gb|AEUH01000089.1| Actinomyces sp. oral taxon 178 str. F0338 contig00089, whole genome shotgun sequence Length of sequence - 3017 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 + CDS 409 - 1236 1165 ## COG0457 FOG: TPR repeat 2 1 Op 2 13/0.000 + CDS 1245 - 2084 1056 ## COG0457 FOG: TPR repeat 3 1 Op 3 . + CDS 2097 - 2945 1014 ## COG0457 FOG: TPR repeat Predicted protein(s) >gi|319978753|gb|AEUH01000089.1| GENE 1 409 - 1236 1165 275 aa, chain + ## HITS:1 COG:FN0819 KEGG:ns NR:ns ## COG: FN0819 COG0457 # Protein_GI_number: 19704154 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 5 115 2 103 665 69 37.0 5e-12 MSDARAQLLDRIEQLHLDDEHQQIIALIEAQNDFTSDYDLASLLARAYNNYAQPHMDTYH DLLRRAVDLLRGVETEGLSDPKWHYRIGYALYFLDREDEALIYLRQAQALDPTDTAVTDL IDSCHRSLTARTELIPITTQSIADYFDDRGWNYNLDDNTLLTGFTEGVYRLRKETDTDDL SLWGALRTDAPMDLRPRLVETCNDWNNSTRWPKTHVVTLDDGTVRICAEQYLTTHFGMTR AQLSMAVARFIDTSEQFFSHIVERFPSLARPPRED >gi|319978753|gb|AEUH01000089.1| GENE 2 1245 - 2084 1056 279 aa, chain + ## HITS:1 COG:FN0819 KEGG:ns NR:ns ## COG: FN0819 COG0457 # Protein_GI_number: 19704154 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 12 135 5 119 665 67 33.0 3e-11 MAAIDEEERLRLLAEVAQLHDAGQHQQIIARIERVNGHREDYVMAGLLARAYVNYADSSK SSFNQLHQTAVDLLAPFEQEGADDPMWNFRMGAALYWLDRLDEAMPRLRRACELDPADKA FSQLAEKCAREITDKTTVIPITTKTITEHLDSREATYAFHEDGAVCMMFSVGHYWFSAAP EDNNVSLLARIETPASLDLRSALVEVCNDWNSRTRWPKTSVYVNDDGKVFVYAEMHLTLN 
GGMTRATLSTNIDRFVSTSQQFFTHVGERFPSLAYSPEE >gi|319978753|gb|AEUH01000089.1| GENE 3 2097 - 2945 1014 282 aa, chain + ## HITS:1 COG:FN0819 KEGG:ns NR:ns ## COG: FN0819 COG0457 # Protein_GI_number: 19704154 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 8 134 2 119 665 66 31.0 6e-11 MFGVFQPRQRLFAKISRLNADGEYRRVIALIESHRGYENDYELVGLLAQAYIDYAHPSMD AFNDLLQSAVNLLASTQAQGAGDPLWNQRMGVALYWLDRNEEAVPYLRHSLELNPSDSTT STFLQRCEGEITERTIVAPPTVQQIARYFDREEWKYALKEDRGVLITDFGRGHYWLSCDS DGDDVQLRGALLVVPDEDLRGPLMDACNEWNSLMRWPKAYVSDLDGQLRIYAEMYVTCRH GLTFTNLCLNVKRFITTAEDFFEDITTKFPALLQDPPSDGGS Prediction of potential genes in microbial genomes Time: Thu May 12 17:35:01 2011 Seq name: gi|319978743|gb|AEUH01000090.1| Actinomyces sp. oral taxon 178 str. F0338 contig00090, whole genome shotgun sequence Length of sequence - 10503 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 3, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1516 1807 ## COG1070 Sugar (pentulose and hexulose) kinases 2 1 Op 2 . - CDS 1522 - 2703 1818 ## COG4952 Predicted sugar isomerase 3 1 Op 3 . - CDS 2754 - 3119 474 ## COG3254 Uncharacterized conserved protein 4 2 Op 1 16/0.000 - CDS 3307 - 4374 1703 ## COG1879 ABC-type sugar transport system, periplasmic component 5 2 Op 2 11/0.000 - CDS 4439 - 5491 1417 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 6 2 Op 3 21/0.000 - CDS 5488 - 6534 1259 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 7 2 Op 4 . - CDS 6531 - 8045 188 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 8 2 Op 5 . - CDS 8114 - 8338 158 ## 9 3 Op 1 . + CDS 8235 - 9497 1491 ## COG1609 Transcriptional regulators 10 3 Op 2 . 
+ CDS 9568 - 10467 920 ## Caci_2113 PfkB domain protein Predicted protein(s) >gi|319978743|gb|AEUH01000090.1| GENE 1 1 - 1516 1807 505 aa, chain - ## HITS:1 COG:BS_yulC KEGG:ns NR:ns ## COG: BS_yulC COG1070 # Protein_GI_number: 16080172 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus subtilis # 7 490 5 464 485 302 38.0 9e-82 MSTTVAAVDLGASSGRVLRGVLDGGRLSVRECARFANGPVRLPLRSAPPAGGGTGEEYAW DVLALWSGILDGLRVAAGLGPVDAVGIDSWAVDYALLDADSRLLGNPASYRSPRCAPAAA AFLAELDAQWLYERNGLQFQPFNTVFQLVADTREARASLARRALLIPDLLAYWLTGEAVT EVTNASTTGLLDPGARAWDPGIAQALLEGFGVEARFPRLVEAGAVVGPITVPGLGLRSPS GEATPLVAVGTHDTASAVVGVPADPDGGPFAFVSSGTWSLVGVELGAPVRTPEARLANFT NELGVDSTVRFLKNIMGMWVQQECLRQWRERGEGGLDWPSLDAQTEAAQPLRSLFDINAP EFLAPGGMIERISARLEAAGEPVPTTRGELLRAVTESLVVAYRRALREAAELSGVAPATI HIVGGGSKNALLCQSTADATGIPVVAGPVEATALGNMLVQLRAVGAVSGGLGDMRRLVAA SAPLARYAPRPGAAPQWERAEERVA >gi|319978743|gb|AEUH01000090.1| GENE 2 1522 - 2703 1818 393 aa, chain - ## HITS:1 COG:TM1071 KEGG:ns NR:ns ## COG: TM1071 COG4952 # Protein_GI_number: 15643829 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar isomerase # Organism: Thermotoga maritima # 16 383 11 382 383 384 49.0 1e-106 MPPSLADLTPTQARLLREQTIELPSWAFGNSGTRFKVFTTPGAPRDPFEKIDDAAQVHKY TGITPRVSLHIPWDSVDDFARLREHAAERGIELGTINSNVFQDDDYKFGSLTNSDEAVRR KAVDAHLRCIDVMDRTGSTTLKVWLGDGTNYPGQDSIAARQDRLADSLGRIYGALGEDQR LLLEYKFFEPAFYHTDVPDWGTALAHVSALGERAVVCLDTGHHAPGTNIEFIVVQLLRAG RLGAFDFNSRFYADDDLVVGAADPFQLFRIMHEIVSSGALEATSGVNFMLDQCHNLEQKI PGEILSALNVQEATAKALLVDRAALDAAQRSHDVIGANQVLMDAYSTDVRPLLRELRESQ GLDPDPLAAFARSGYLARAEAERVGGAQASWGA >gi|319978743|gb|AEUH01000090.1| GENE 3 2754 - 3119 474 121 aa, chain - ## HITS:1 COG:Ta0744 KEGG:ns NR:ns ## COG: Ta0744 COG3254 # Protein_GI_number: 16081816 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermoplasma acidophilum # 21 121 2 101 101 67 37.0 6e-12 
MTTVPALDETLATTSQSSPSRACFLLRVRPEKLAEYADVHQRVWEEMRSALSRAGWRNYS LFLDPSSGLVVGYYEADDARAASEAIAATGVNARWQREMAQYFEPGGGGEAQTLHQYFYL P >gi|319978743|gb|AEUH01000090.1| GENE 4 3307 - 4374 1703 355 aa, chain - ## HITS:1 COG:AGl2685 KEGG:ns NR:ns ## COG: AGl2685 COG1879 # Protein_GI_number: 15891450 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 353 18 349 353 196 37.0 8e-50 MSLTKRGATALAALVLAASTAVAACGGGSGSQASGGSGATPAGGAYDVSSKTITFIPKQL NNPFSDVMLGGGKDAAAELGFAEAHVVGPLEASSSSQVSFINSEVQAGTSVIVLAANDPD AVCPALKDARGAGSKVITFDSDASADCRDLFINQVESKQVALTMLEMVSEQTGGSGKVAI LSATANATNQNTWIQYMEDEIASNAKYKDISIVAKVYGDDDDTKSFQEAQGLLQAHPDLD AIVSPTTVGIAATARYLSTSDYKGKVALTGLGLPNEMRSFVKDGTVKEFALWDPAQLGYV AAYAGAALESGAVKGEVGETFTAGTLGERTIEEGKTVVVGDPVRFNADNIDNYDF >gi|319978743|gb|AEUH01000090.1| GENE 5 4439 - 5491 1417 350 aa, chain - ## HITS:1 COG:mll5703 KEGG:ns NR:ns ## COG: mll5703 COG1172 # Protein_GI_number: 13474746 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Mesorhizobium loti # 27 337 19 331 331 168 35.0 2e-41 MSRTPNTPAEGSGRRAPSALRRIAASRDAKMTLVLALVVVAALVLVPRFGQPRTLTFLTL DVAATLLMALPMTLVMINADIDLSVASTAGLVSAVMGVLVQSGTGLWAVVALCLLIGVAC GLFNAVMTAYVGLPALAVTIGTLALYRGLALVVIGDQSISAFPEWATEAVTGSFGKTGIP YMIVPLALAVALFAVLLHMTPYGRGLFALGHSKKAAEFVGIDTRRARLIALVLSGVMAAL AGVYWTLRYSAAKADNVEGLELVVIAAVVFGGVSVFGGKGSVWGSVCGVFTVGVLNYALR LNRIPDVVLVMVTGLLLIASVVAPSIASAWSQWRSHRTPITPNTTGAQGA >gi|319978743|gb|AEUH01000090.1| GENE 6 5488 - 6534 1259 348 aa, chain - ## HITS:1 COG:SMc03814 KEGG:ns NR:ns ## COG: SMc03814 COG1172 # Protein_GI_number: 15966950 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Sinorhizobium meliloti # 20 336 20 335 335 148 34.0 1e-35 
MTTTAPSNDAPAPPRAPAARSAWRSLASLREAPVLVALALLVLATWALNPRFLTPQGIKD LFLNATIAMLMAAGQSLIIQSEGVDLSVGSILGGSAFLTGVLFVSAPGVPIIAVFLAGTA LGAALGAVNGLLVTRARVPAMVITLGTLYVFRGALNWWAGSTQYFAGDRPQAFGDLGVAT VMGFPLLTLLAVAVVAAISLFQRYARSGRDLYAIGSDRAAAAVYGIPVGSRVLMAFIVNG AMVGLGGVLYASRFNSVGATTGSGMELDIVAACVVGGVAMTGGVGTAYGAAIGALLLNTM TSALTAIGVDKFWQRAVVGALILVAIVIDRVSTVRRRESLRKEGGASA >gi|319978743|gb|AEUH01000090.1| GENE 7 6531 - 8045 188 504 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 270 478 10 217 245 77 28 5e-14 MTRDTVEPIVSLQHAVKRFGSFTAMSDGCIDLFRGQIHALAGENGAGKSTLVKVLAGIHR PNGGRLLLDGEPVAFKSTAESKAAGISVIYQEPTLFPDLSVAENIYVGRQPRGRFGMIDH AAMRADAQRLFDRLEVAIDPAVPAEGLSIADQQIIEIAKAISLDAQVLIMDEPTAALSGH EVERLFAIARSLRDRGAALMFISHRMDEIYDLCDRITIMRDGRYVSTRVLAETDRSQLVK DMVGREVDQLFPKLEAQIGEPVLRVDGLTRYGAFEDVSFELRSGEILALAGLVGAGRSEV ARAVMGIDRADAGSATAFGEPLKLGDTAAAIRAGLAFVPEDRRKQGLVMDLSVARNTTLT LRDKLAKWGIIDARAEARAAQQWTRTLEVKAASQHSPASSLSGGNQQKVVLAKWLATDPR ILIVDEPTRGIDIGTKAEVHRLLSTLARRGVGILMISSELPEVMGMADRVLVMCEGRVTA ELERAEATAEAIMTAATDHTKAAS >gi|319978743|gb|AEUH01000090.1| GENE 8 8114 - 8338 158 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTPGAFPPGRAGPGRRGAFGGSWSERWVAWVRGVIGSASVARGVRIEDESFILVHSRPLV PRYSAGRGGVLWPS >gi|319978743|gb|AEUH01000090.1| GENE 9 8235 - 9497 1491 420 aa, chain + ## HITS:1 COG:PA1949 KEGG:ns NR:ns ## COG: PA1949 COG1609 # Protein_GI_number: 15597145 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Pseudomonas aeruginosa # 34 418 4 333 337 165 31.0 1e-40 MTPRTQATQRSDQDPPKAPRRPGPARPGGKAPGVKDVATLAGVSVGSVSNVINGRGSVSP EVRKRVEDAIARLGYVPNPTAQALRRGTSPLVAVAVFDLNNPFFMEAASGMERCLREAGF VMTLSSTHASVAEEAQLLRTMARQAVRGVLLTPADSELEAAHELVAQGIPVVLFDTPDTS SDMSSIAVNDRAGASLAIEHLLALGHRRIIFVNGPAHVRQARQRLLGVRDAIAHWEKLTG TGSVALDVVSVADFTARAGREFAGEWLAARGLGAGAGDAQGGDGAPGGADGDGRGPGRPG SNGAQDGRAGRTGPDGDDGHPTAVFCANDLIAFGVMSSLRDAGIRIPADVSLVGFDDIAL ASQTSVPLTTIRQPMEELGMAAVELLLADPRANEEDSPAIEHRSFDPQLVIRESTSAPRR 
>gi|319978743|gb|AEUH01000090.1| GENE 10 9568 - 10467 920 299 aa, chain + ## HITS:1 COG:no KEGG:Caci_2113 NR:ns ## KEGG: Caci_2113 # Name: not_defined # Def: PfkB domain protein # Organism: C.acidiphila # Pathway: not_defined # 4 293 14 303 304 232 46.0 1e-59 MESLDILVIGGVGVDTVVRVPALPLPQADSLMVPPVRTLLGHTGNGVVRGAHALGMRAAV VDVIGDDPEGALIRSSFEHDGIPASFATNTAGTRRSVNLVAPDGRRLSLYDARADGFEPD PALWRDWLGRTRHVHVSIMDWARRALPDALGSGATTSTDLHDWDGRSDYHRDFAYSADTV FVSATALADEAGVVADVFSCGRAGAVVVMDGERGARIWSRAAPDSPIRVAAAVLPDRPAV DSNGAGDSFVAAFLHARLMGRPLAEAGRAGAVGGAWACGSAGTHTSFVNRQELEAALRL Prediction of potential genes in microbial genomes Time: Thu May 12 17:35:24 2011 Seq name: gi|319978712|gb|AEUH01000091.1| Actinomyces sp. oral taxon 178 str. F0338 contig00091, whole genome shotgun sequence Length of sequence - 40115 bp Number of predicted genes - 37, with homology - 25 Number of transcription units - 23, operones - 9 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1636 1569 ## COG0366 Glycosidases 2 2 Tu 1 . + CDS 1749 - 3059 1673 ## RoseRS_1490 nitrilase/cyanide hydratase and apolipoprotein N-acyltransferase + Term 3125 - 3158 2.6 3 3 Op 1 . - CDS 3281 - 5647 1945 ## AAur_0434 putative integral membrane protein 4 3 Op 2 . - CDS 5647 - 6285 731 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 5 4 Tu 1 . + CDS 6348 - 7316 1134 ## Xcel_1632 transcriptional regulator, TetR family 6 5 Op 1 . - CDS 7487 - 9859 2252 ## COG1511 Predicted membrane protein 7 5 Op 2 . - CDS 9868 - 12570 3275 ## COG2409 Predicted drug exporters of the RND superfamily + Prom 12546 - 12605 4.5 8 6 Tu 1 . + CDS 12755 - 13651 859 ## Jden_0483 transcriptional regulator, TetR family 9 7 Tu 1 . - CDS 13573 - 14454 977 ## COG0657 Esterase/lipase 10 8 Tu 1 . - CDS 14592 - 15896 1225 ## gi|225022927|ref|ZP_03712119.1| hypothetical protein CORMATOL_02973 11 9 Op 1 . + CDS 15822 - 16166 131 ## 12 9 Op 2 . 
+ CDS 16163 - 16594 687 ## gi|269217589|ref|ZP_06161443.1| hypothetical protein HMPREF0972_00213 13 10 Op 1 . + CDS 16840 - 17547 558 ## gi|229491754|ref|ZP_04385575.1| hypothetical protein RHOER0001_1829 14 10 Op 2 . + CDS 17561 - 18778 1024 ## CMM_0308 hypothetical protein 15 11 Tu 1 . - CDS 18751 - 20256 1389 ## gi|252126807|ref|ZP_04836849.1| hypothetical protein CORMA0001_1225 16 12 Tu 1 . + CDS 20272 - 20574 145 ## + Term 20688 - 20719 -0.1 17 13 Op 1 . + CDS 20760 - 20861 170 ## 18 13 Op 2 . + CDS 20852 - 20977 120 ## 19 14 Op 1 19/0.000 - CDS 21079 - 21843 883 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 20 14 Op 2 . - CDS 21840 - 23054 1034 ## COG4585 Signal transduction histidine kinase 21 15 Op 1 . + CDS 23236 - 24423 1449 ## COG1131 ABC-type multidrug transport system, ATPase component 22 15 Op 2 . + CDS 24425 - 25567 1506 ## Ccur_10720 hypothetical protein 23 15 Op 3 . + CDS 25554 - 26738 1429 ## Ccur_10730 ABC-2 type transporter 24 16 Tu 1 . - CDS 26769 - 29990 2484 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits - Term 30056 - 30080 -1.0 25 17 Tu 1 . - CDS 30156 - 30527 178 ## COG1487 Predicted nucleic acid-binding protein, contains PIN domain - Prom 30728 - 30787 1.8 - Term 30730 - 30777 1.7 26 18 Tu 1 . - CDS 30853 - 32298 653 ## DIP0075 hypothetical protein 27 19 Op 1 . + CDS 32425 - 33801 799 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 28 19 Op 2 . + CDS 33803 - 34324 278 ## 29 19 Op 3 . + CDS 34361 - 34480 56 ## 30 19 Op 4 . + CDS 34489 - 34707 205 ## 31 19 Op 5 . + CDS 34704 - 37394 2370 ## COG3378 Predicted ATPase 32 19 Op 6 . + CDS 37411 - 37746 450 ## 33 20 Tu 1 . - CDS 37835 - 37960 58 ## - Prom 38013 - 38072 3.7 + Prom 37787 - 37846 2.9 34 21 Op 1 . + CDS 37909 - 38040 254 ## 35 21 Op 2 . + CDS 38049 - 38213 243 ## 36 22 Tu 1 . - CDS 38231 - 38776 425 ## 37 23 Tu 1 . 
- CDS 38890 - 39990 1857 ## COG0012 Predicted GTPase, probable translation factor Predicted protein(s) >gi|319978712|gb|AEUH01000091.1| GENE 1 1 - 1636 1569 545 aa, chain - ## HITS:1 COG:L77437 KEGG:ns NR:ns ## COG: L77437 COG0366 # Protein_GI_number: 15673233 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Lactococcus lactis # 52 516 4 470 491 432 42.0 1e-121 MPPARPDTPGSARLSPRPPTWHHRRVSENDSESAPAAPASSAGRPQEHNPLLLQAFAWDL PADSSHWRLLEANAPLAAEYGVTSVWLPPAYKGKYGAEDVGYGVYDLYDLGEFDQKGSVA TKYGTKDEYLRAVDALHRAGIGVIADIVLNHRMGGDATEDVRATPMDPADRCRALGEAEW ITAWTRYTFPGRGSAYSDFVWDWTRFHGCDWDERRRQRGVWLFEGKHWNDNVNAELGNYD YLMGCDVHVTDPEVSAELDRWGRWYVRTTGVDGFRLDAVKHVGSDFFSRWLAELRRSTGR ELPAVGEYWSGDVAELERYLEQVPHMSLFDVPLHFRLHDASVSDGNMDLSRIFERTLVGS DPARAVTFVENHDTQPGQSLASTVAPWFKPSAYALILLHEAGLPCVFWGDVFGTPESGGL PAVSELPALVLARRFLAYGPQHNAFDDPDVVGFAREGDSAHPSSGCAVVFSDRLAGVKRL VVGERHAGSTWECVIGGHPPVVVGQDGAVEIPVNDGGSACTPPPAQCRCCGPEGGSSGPG AGRTE >gi|319978712|gb|AEUH01000091.1| GENE 2 1749 - 3059 1673 436 aa, chain + ## HITS:1 COG:no KEGG:RoseRS_1490 NR:ns ## KEGG: RoseRS_1490 # Name: not_defined # Def: nitrilase/cyanide hydratase and apolipoprotein N-acyltransferase # Organism: Roseiflexus_RS-1 # Pathway: not_defined # 42 424 45 456 509 77 26.0 1e-12 MYSCDMEYETTAPPLRRGALIGAASGLVLAAATFCPHWSGQLLAPLALAPVFWLLARVRA RHAYAVGACFALAWLVPTTYWYYSFMSVGVAVGASIGWALLQANLFHLAALRDRIGVPGV WLAFSAAWVVLTWARMRLPVTEDWWIPHLGYGVWRNPGLVWLGGFGGEAALEGAVLLGGA VVAWALLRSSARAAALAFGGVALAVVCLDAVAWNLPARPIPPALALQQMTRGGVDAPATE EDVDSLMAATLAAAGAAPAQGTTVVWPENSVPASAHARIAAFASDNGVNIVYHTAEADGG AVYKKTVVIDASTGREALSNYKKHLAPDEEVGEPRDTANTADLSGHRATSYICYDLHYPD VVERLRGAQAAYVPLNDATYGYLQKQFHAADIALHAAQAGTTVVVAATDGPTMVVNSNGV VVEELVGTAPGHVAVP >gi|319978712|gb|AEUH01000091.1| GENE 3 3281 - 5647 1945 788 aa, chain - ## HITS:1 COG:no KEGG:AAur_0434 NR:ns ## KEGG: AAur_0434 # Name: not_defined # Def: putative integral membrane protein # Organism: A.aurescens # Pathway: not_defined # 497 774 40 311 330 87 32.0 2e-15 
MSPRHLGPYALQLRRVLVLTAAFVAVFITIISGPQLIPPCDSPPEMAQWAVYSPNLGLPL AAGLFALVAAATAVATRARARTWADTGWFALVRLYLVCYLLLVVACAALPSASLSSPIPS EYLRALVSATNPVLALYPLAALCCAAVLEERHTVAYLCAIAPLQILALMEADPGNGLPQN AIGPLYFLLFSMLNAGTLHWAMRIAHDLDRTRLNTREAQRALANERARGRARARSNGLIH DYVLSALVLAFNRSVEEGEVRAAADSALSALTPGPSSSGTVEAEQLLGALAAQVRPREPD WTITCRWPQEAPALPAEAGAALIDAGHEALNNVRLHAGSDASDPGRPAQCRVALVFEGGA VRVTITDTGRGFDPDRPTGLRHGIRDSIVARMEAVGGRASLASCPGGGTRVVLEWSDPRD KQAKRGPVLAALARRAGRGRNRKEPSRTAWDSQVRAAAESRGARMLAVASVAVHAYITAV EIRRGAYYHPAPVVLSLALLAAAGAALVAQWPDRRPPEWVGWGSTAVVGLSNLLALTQIR VGPHWPGWSAWSAGASMFLVLLLLLRQRTAEAVAGCAALIAATAIWVAVQGREPALVFTF SLGHVVAFVFWLLLTSQSGAATTAIEEALEAEGEARLAREEQLTLNAVMAAGLADVSERA RGTLEAIRDREMTPGLAMEARLLEAELRDEIRAPFFTGTAVVEAARRSRARGVEVLLLDD RGGEGPLGTGEGADDRLRALIVERAACALDEARAGRVVVRVGPTGRRWAATILSDERLCV VKADGSVE >gi|319978712|gb|AEUH01000091.1| GENE 4 5647 - 6285 731 212 aa, chain - ## HITS:1 COG:XF0972 KEGG:ns NR:ns ## COG: XF0972 COG2197 # Protein_GI_number: 15837574 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Xylella fastidiosa 9a5c # 8 184 6 190 210 64 28.0 2e-10 MKDTRAQIVVIDDHRVVGLGVKAAFEAQCADASISWYPTVPDAPWEHGQVAVLDLRLADG SAPDENIADINGRGVPVVVYTSGDDPYLVRRAIAGGALSIVRKSAPPEDLVEAVLAASDG ATCPTPDWAAALDADEDFVASRLSDLEARILAHYASGERSGSVARALNVSKNTVNTYIAR IREKYRESGRPAESRVDLFRRAAEDGLLSYFD >gi|319978712|gb|AEUH01000091.1| GENE 5 6348 - 7316 1134 322 aa, chain + ## HITS:1 COG:no KEGG:Xcel_1632 NR:ns ## KEGG: Xcel_1632 # Name: not_defined # Def: transcriptional regulator, TetR family # Organism: X.cellulosilytica # Pathway: not_defined # 63 292 2 213 239 102 37.0 2e-20 MFASLIAAASPRRRSAVPRGRAEGPRTWFTGTSFTMDGMTNRRTAPTLEDHGGAGSAPEA APTANDDIRDAVRALNEAVTRFSRAVGSASGDAFGTSKSAVAASLDQAARDLASASRAVS GVSGKQDGRRRRSEETRARLLEAGRSIFAAQGYDGASVSDIAQKAGYTKGAFYSSFPSKE ALFLEIVDRLDTGMRSVPGAGSLFALAGTDPHEWVTGIQALPADDVLLHLESWLYAMRHE DTREKLSESWHRWVAGIALETARFYGRAEPTQEDFDTAFGVIAVGIFGRVSASASNEAGP 
MVERVSQRLISTPGEQRDSADD >gi|319978712|gb|AEUH01000091.1| GENE 6 7487 - 9859 2252 790 aa, chain - ## HITS:1 COG:lin2460 KEGG:ns NR:ns ## COG: lin2460 COG1511 # Protein_GI_number: 16801522 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 8 761 21 888 927 145 23.0 4e-34 MRKHRILYAFLLLLPAVLSLAALAGADNADGGASRVRAAIVNDDQIATTADGTQIPGGRQ IIAELTKDPAPAGASLTWEVVSAQSAQSGLDDGAYQAVVTIPADFSQGLASVIEGESTAA PVVTVRTGNGADARAGAINEALVRAAGTQFGGTLTIAYISRAMTGLSQVSDSLSAAADGA RKVADGQGEAMSGAAQLASAAGQLDSGLGQLASGAGQLDSGAHQLSANTAAASAGAGQLA SGTAQLEGGLGALDSGANDLSGGAARLDAGITAYVQGAGALDANGQTLASGAQSLAGSID SYIAGVAGVHAGIGQIRGQLGQFDTSGLAGMQTQLQGAAAGAGTLAEGASQIEGLVGACQ AGSAEACAGITAAAQQLNQGAAGLRSQLESGASQVGGAAAQLGGGLAQLDGGLEQLSSGM DAGLVGSSADQLSAGAHALAEGVGQMASGLSALNSASPALSDASARVAQGSTALSGAAAS ASSASTSLSAGAAELASGTGQLAAGAEDLSGATGVLAAGSSELAQGGGRMSGAAAQLGQG LGQLQSGADSLADQLGAAAQQVPSYTQDQADESARAIGAPVGVDASGQAADAQALWAPTV VALVAWVGALVGVVGAGALSSRSIDSPSSPAQLAWASLSKPLVVSVLAGGAVAGVLALAG VVPTHPLGAVVVATGVFAAFTVVNQALIGVFGRNGGLIASGAGALAQITTLSAVAPPDSL GGAFGALRPIMPLSAADTVMRWAVLGVGSPTRASWALALWTGASLLVVVAAIAKRRRTSL AELRVEAAVA >gi|319978712|gb|AEUH01000091.1| GENE 7 9868 - 12570 3275 900 aa, chain - ## HITS:1 COG:BS_ydfJ KEGG:ns NR:ns ## COG: BS_ydfJ COG2409 # Protein_GI_number: 16077610 # Func_class: R General function prediction only # Function: Predicted drug exporters of the RND superfamily # Organism: Bacillus subtilis # 15 736 1 711 724 431 37.0 1e-120 MNQVQECIECEGVLMSSLLYRLGRWCATRAGRTLLAWTAILVVLGACVGALGVNLTASFA INDVESMRGLAVLAERVPQAAGVSERVLFTSADAPIEQHRVAIDAFVQGVRQIDGVALVG DPFAEGATAVSPDSRHALVEIQTDTSVGTVVSGPTGRAEEVAGQIASLRERAAAQDPGLI SMTSDSLELETGIALSATELFGVLIAAIVLLGTFKSLTTAGAPIVSALIGVGTGMLAILV AAALVDVNSVTPVLAVMIGLAVGIDYALFIIARSREYLARGIAPAEAAARANATAGSAVV FAGATVVVALCGLAVARIPFLAVMGFSSAGVVVIAVLVAITATPALLGLLGERARPRRRR AARDDHPVASRWIAATTRVPALTVLAVIAVLGAAAVPVTGLRLGLPDNGYHAVGSEARTT YDAIADAYGEGYNAPIIVMGDISQSRDPVGAVDSLAERLRAMRGVADIPLATPNQDGTLA 
LVRIRPEKGQTDASTTALVSAIRSDAASIEDGLGVRALMVTGSTAVAIDIAAQLGDAMLP FGTVVVGLSMLLLMIVFRSIAVPVTATLGYLLSLGAGLGAVGLVFGWGWMAGPLGVSRVG IVISFLPVIAMGVLFGLAMDYQVFLVSRMREEWIRTGDARESVRKGFIGSSQVITAAAAI MIGVFAAFIGSESIQIKPIAVALTVGVLADALLVRMTLIPALMALLGRAAWWLPRWLDAL LPIVDVEGEGLDRSLEHAAWVEAHGAAALRLEGVTVSEGRDTAFRDLSVVVRPGALAVVR SDDAVARRAFAALVGARLRPTRGRVVVVGHVLPDGTSSVQRQTAALKAHDEALPDHARII VVDDPGTRRWGRVARLLDEGRTVVVTGPTGLVVPVPLRAATDTPLTTSAPAVGAGPAPNA >gi|319978712|gb|AEUH01000091.1| GENE 8 12755 - 13651 859 298 aa, chain + ## HITS:1 COG:no KEGG:Jden_0483 NR:ns ## KEGG: Jden_0483 # Name: not_defined # Def: transcriptional regulator, TetR family # Organism: J.denitrificans # Pathway: not_defined # 9 205 31 226 226 86 29.0 1e-15 MASDPTTLRKRENTRARLLDAAEEIMITKGIGAARIDDVVRAAGFSRGAFYSNYSSMDAL MIDLINSRSERITARATEAFDAIDGQPNVDAVMRAIETFRPEAQRMNIMAMEFDLYRMRH PEFAKELGTKSLGEHSTLDSFIQGLAARLLERMGRRPTIPIPTISRLLGVFYMDSLIDRS GAGDDATPFMRTVIEAVLVAFSQPVSPGDACTAGVRNGDSSCPADDLASLLTSFGPVATP PDPRSASGARPGTTPLPRGPGAEGETGPAGTGPQSSDETKAATESSAARASGDETSAT >gi|319978712|gb|AEUH01000091.1| GENE 9 13573 - 14454 977 293 aa, chain - ## HITS:1 COG:mlr1247 KEGG:ns NR:ns ## COG: mlr1247 COG0657 # Protein_GI_number: 13471310 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Mesorhizobium loti # 17 289 19 293 317 139 35.0 6e-33 MSLEMALTRGVFRLVPRMLDSEELTRRSMALPRRGARVPCWMWRKHRISDTIVAGMRVTR VEPRATGSGTVVVYLHGGAYVSPISWAHWLLIDALVERTGAAAVVPAYPRAPEHTAAEAF APLGVLMDQVLGAPGRVVLAGDSAGGGLALSLAIQRRDAGVRAPDGLVLFSPWVDVSMSN PRIAPMARLDPMLGIDGLVWCGKEWAGGLGTADPRVSPLHDHLRGLPPTRVYQGDRDILL PDVELFASKARARGADIRLRVEPGGIHDYVALVSSPEARAALDSVAAFVSSLD >gi|319978712|gb|AEUH01000091.1| GENE 10 14592 - 15896 1225 434 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225022927|ref|ZP_03712119.1| ## NR: gi|225022927|ref|ZP_03712119.1| hypothetical protein CORMATOL_02973 [Corynebacterium matruchotii ATCC 33806] # 1 433 1 385 386 127 26.0 1e-27 MRPEMLVCDQISRMVGAPLDEEGIHRFLRRLAQVLEAEPISMHGPGLRFRWVVDDRTLEV 
EAQRARGDWPFELSVRGMDTEYAIDIEEYRTFKWADPADYPFYWSVDICQIPGMWVFYPG AYPVGTWDSFSDLIAPTLDELPADIAMTPPEWRRPFRWRMTAPQLGDVFFTALPEGVEVM VESTGEALLVPRSILERWSGSHPVGMGMAIAGLAHGAPFMSVGFAFCERDEAHHFYAEAP IGPEWEHEGISADDFDEGEKTWEPLSVGELRRLIARPPVEQEEEPIEIRRAPFRAGLGAP EVLMIVGDIRRGRKAARVFKKHGARRARGGDGPVFEADGWSANPKRDDGWRVSLVEPPAA RVRFDDREVVEYARGIGEALAQRYGPPFGCEASTAGTLMQLFAVDGFGVRLYAGYSRVEV EIGQFKPMAEYEYG >gi|319978712|gb|AEUH01000091.1| GENE 11 15822 - 16166 131 114 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDPLLVQRRADHPRDLIAYQHFRAHTGVLLRVGPTQSYAARAPRHGGRLVLLGGPGRARA PAAAHPNARHTEREGKEGFTRAPYTVNLRGKDQTPAGEATEPPPSANDSRRGIL >gi|319978712|gb|AEUH01000091.1| GENE 12 16163 - 16594 687 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|269217589|ref|ZP_06161443.1| ## NR: gi|269217589|ref|ZP_06161443.1| hypothetical protein HMPREF0972_00213 [Actinomyces sp. oral taxon 848 str. F0332] # 1 139 1 143 150 95 37.0 1e-18 MTRPNTADSRDTNVKYIAAVAKILHANAEATAEGWTEVGLLVNKAPDSQKVSTYAALERG SEHFDDWLTPSAWDELGKVLAAWQFDLTQQGHPKWTAMFLGGVRDGGLAWLPDYSPTFGK WRIHRKHSNLPAIQEGLEKAARL >gi|319978712|gb|AEUH01000091.1| GENE 13 16840 - 17547 558 235 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|229491754|ref|ZP_04385575.1| ## NR: gi|229491754|ref|ZP_04385575.1| hypothetical protein RHOER0001_1829 [Rhodococcus erythropolis SK121] # 1 174 1 157 161 72 32.0 3e-11 MGFSYSLDRFEGGRVVSGDRSGLDLFLRRRRLRLGFFFKGDDDCLLTDEAGELLIYEDGV PRDLVLHGVDESSGRTFGGYVGHSHLTPDECDLVFDLCASTGLLVLNHQGNPTFIIPAGN HALAQLPTDVLAEAESNGGREIAFVDNGAQLLAAMADGMERFIAYRDQVLSGRGPSFGPA STATAAPGAPGTDAGADGPRPERKPHCPPESARRDSQPPASHGKSADKLIKRPKL >gi|319978712|gb|AEUH01000091.1| GENE 14 17561 - 18778 1024 405 aa, chain + ## HITS:1 COG:no KEGG:CMM_0308 NR:ns ## KEGG: CMM_0308 # Name: not_defined # Def: hypothetical protein # Organism: C.michiganensis # Pathway: not_defined # 1 404 1 410 413 456 63.0 1e-127 MRTPLDSPFSPGSDTVPHVWAGRVEQLSDWRDVVRPRRRAGIHERGRTILGEAGSGKSAL VRKIAHSAAAEGDWTTPQLRIPSGTDPVKRVASALLRLASDAGLPSARERRIADLLGRVE SIAASGFGLTVRAHAGPEPYTALTELLVEIGRAAIHADAVVVIHIDEVQNIIDEAARSQL 
LIALGDALAHEEEIFLPGGPRASRALPIAVYLTGLPEFADMAGARTGATFARRFRTTILE PISDNDLLVALHPFVTEGWQVPGGERVYMEASAQRAIVELSRGEPFLFQLAGERSWYAGT DDLITAEQVASGWRGAAHEAETHVQRILERLPARERAFLEAMAQLPPPERTLTNIARALD YRRASDAGPTAQRLDLIRRVIRRGRPHYTFRNRAVGAYLESDWPE >gi|319978712|gb|AEUH01000091.1| GENE 15 18751 - 20256 1389 501 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|252126807|ref|ZP_04836849.1| ## NR: gi|252126807|ref|ZP_04836849.1| hypothetical protein CORMA0001_1225 [Corynebacterium matruchotii ATCC 14266] # 2 469 69 518 548 116 27.0 4e-24 MLGTAGVLAQTASAVAAVGNLQGLVRLTPDTVAAMNGGLKPLVSGGDNLGVLTDAGGRFA RQVRWKPVGPQAAGGTVLASLGPAFMMVALQLQLGRLEKTAKETRGIARKILKGMNEDRQ DELAGIARVVDQAFEEAKRIRRVTPGVWEEVQGQQAKINDYREKYRRFVEGHVNEARALR GSSADEVRAHVMENGPQILIDLECYMTVERINCVYRALRAARLRDEGRDDPKEAELSEFV AADAREDYENAIGMMTGLVDALNRGFHLAAAIEKGLLDKTVVRGINDVVKGLFAGERKKT REPGDGAQNAIQKLAEGVGRFAAVIGLASPRDPEPHARVMGKRTDEEAKELLEVLRWTLA RDEELVGFAHATDARQQTVIAITDQRVVEAKLADLRSRGEFWREIPNDEIRYVRRRGGKE GRAELDVITKEENLTWSFGPADESDQAGLDAITGMLAERMRLPEEEMRELTGSRSDGAPR IGAPAPPPLPEGGHSGQSDSK >gi|319978712|gb|AEUH01000091.1| GENE 16 20272 - 20574 145 100 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVNAERQFDQIACDFVGQRAERLADYERDLRENLNESGARRLRDQFALPLCRRSVRLAIR FSIRSARNPFTRFPRRFPTRPACNPFTRFPRRFPTRFARR >gi|319978712|gb|AEUH01000091.1| GENE 17 20760 - 20861 170 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVVWCSTGRLVFQNASESRVRTEARQARVPKCV >gi|319978712|gb|AEUH01000091.1| GENE 18 20852 - 20977 120 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLSLPESAQIGPKPLNSDAFWTNPRLSRPNAAKPRRILDQ >gi|319978712|gb|AEUH01000091.1| GENE 19 21079 - 21843 883 254 aa, chain - ## HITS:1 COG:CAC1455 KEGG:ns NR:ns ## COG: CAC1455 COG2197 # Protein_GI_number: 15894734 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Clostridium acetobutylicum # 1 253 2 216 225 147 34.0 2e-35 MRVLIVDDDAIVVQSLATILSAEDGIDVVGTCLSGAEAVQEFRRLRPDVLLMDIRMPGAD 
GLSAAEEILEGDPQARIVFLTTFSDDEYIVRALRMGARGYLIKQDVAQIAPALRSVMVGV CVLEGDVLERGASMGMRALPAPAPGSGAGTSGGGSGTSGSGPGKNGGRPGAHGDPGEPAG AVGVPDPRSTVFASLTDREYEVVEAVAAGLDNAETAERLFMSEGTVRNHISSILAKTGLR NRTQVAVRYYRAGR >gi|319978712|gb|AEUH01000091.1| GENE 20 21840 - 23054 1034 404 aa, chain - ## HITS:1 COG:BS_yfiJ KEGG:ns NR:ns ## COG: BS_yfiJ COG4585 # Protein_GI_number: 16077896 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 142 313 162 339 400 70 33.0 7e-12 MRVLADKGALLCGCLLMWRADTATVVWLLLAVTISGLAVVAEHWPWFAAVPAAYLLAEAL AAAPAVGAPLAVYDLARASAGRPPWQRSAAAAACAPIVVAAVGRGEPTAVYTAALGAVAA LLALRTRQGEATRQNLNAARDDLRERVLALQDANARMLQMQDDELRAAALSERTRIAREI HDGVGHLLTRLLLRVKALQITHREHPGVVTDLAALDDGLDEALDSMRRSVHALSDQGEDL AVSLNLLGSRSAIATVDVDCSLDAEPPRTVSRCITAVVREALTNAARHGGAASARVTVID YPAFWQVTVDNDGAVPAGDAEPDGGSPGSGASPGSAPLDRGSGLGTPSVRRSGLGAPGDG AIRPGLGLRAMTDRVEALGGTVRITPRPRFTVFATIPKDKDGAV >gi|319978712|gb|AEUH01000091.1| GENE 21 23236 - 24423 1449 395 aa, chain + ## HITS:1 COG:CAC0241 KEGG:ns NR:ns ## COG: CAC0241 COG1131 # Protein_GI_number: 15893533 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Clostridium acetobutylicum # 82 392 3 310 310 269 44.0 7e-72 MDSENGAEGAQRNEAGGDAEPARGTRTSTDRQRRAGTAGTAHGNDNPAGTARGSDNPAGT AHGSDGRASAPSGSDGRADAAVEVESLVKRYGELIAVDGLSLRIGRGEVVGLLGPNGSGK TTTINCLLQLLSYDKGAIRVFGQPMSPTAYGLKRRIGVVPQEVAVFDELTVAENVDAFCA LYVRDRGLRRRMVSEAIAFVGLEKFAAFKPKKLSGGLLRRLNIACGIAHRPDLIILDEPT VAVDPQSRSAILDGIKRLNQEGATVIYTSHYMEEVEQLCDRITIMDRGRQVASGTSDELK AMIGTGERIRVEVVDPLPLGEEALEALRRLERVRSVAYEAPTALIECAAGPHNLSDVMGA LAGVGATCGRVVSEPPTLNEVFLEITGRALRDEAA >gi|319978712|gb|AEUH01000091.1| GENE 22 24425 - 25567 1506 380 aa, chain + ## HITS:1 COG:no KEGG:Ccur_10720 NR:ns ## KEGG: Ccur_10720 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: ABC transporters [PATH:ccu02010] # 1 380 1 384 384 343 48.0 7e-93 
MVTVFRYQVLRLLRDRILLVWTLGFPVVLSLIFMAQFSNLEKAYEATPMSFGVVQDDAYW AAPGLDAVVERVSSEDADPRLIMKVSHSTASEAEAAAKRGDTNGYLAVEDGGPVLHVSQK GNEAETTRVLRVVMDSYTQAAAEHRALAEAGAPPEQVAGAGADRSFTRPVSVTQNPVKPA TRYYFALLAFASGMGTTVTMVAVQGIMASSPLGARRILAGLPRWQVLTATLTASWACVLV CLLIAFAFIAAVVGVDFGPHVGLCLVAIGAASLMSSAAGAALGSAGGAGIGVVSAITCLL SLFTGLYGPGAQSVANAVELNAPLLAQANPLWQASHCFYALLYYDSLEPFARSCAVLAGM TGLFLAIALIRARRMTHEHL >gi|319978712|gb|AEUH01000091.1| GENE 23 25554 - 26738 1429 394 aa, chain + ## HITS:1 COG:no KEGG:Ccur_10730 NR:ns ## KEGG: Ccur_10730 # Name: not_defined # Def: ABC-2 type transporter # Organism: C.curtum # Pathway: ABC transporters [PATH:ccu02010] # 1 394 1 393 393 350 50.0 6e-95 MSTFRTSLRIVAAHRAYVLVYLVLLSMLGLLTGLARSEGSSHQVKQATASVAVIDRDGST ISRGIKDYIESVGEAKPLEDSTRAIQDATAQNRISYILIIPAGYGEALQRAARGGGGPPQ MDTVIGYESASGALMDVRADSYVGKVADYLSALTDDPARAVALAEETMRHCAPTERIAQD ATPLPHSLLVYARFSLYPLLAFAVVAIATLMASLGRRAVRSRLTAAPVGSGAHSLGVLGA CLAVGAIGWLWVFGLGVAVFGRASLAASAPLLGVVGAALGAYTLVAVSLGFLIGQVGLGE NAAHAIANIGGMALSFLGGTWVPIEWLPDAVARAARLTPGYWVDQALSGAYAATSTSAEA LLPLLADCGICALFALAILSVAMGVRRTRARASL >gi|319978712|gb|AEUH01000091.1| GENE 24 26769 - 29990 2484 1073 aa, chain - ## HITS:1 COG:FN1445 KEGG:ns NR:ns ## COG: FN1445 COG1112 # Protein_GI_number: 19704777 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Fusobacterium nucleatum # 256 1063 74 847 849 358 31.0 2e-98 MCETDGFKDAPEAVADSVAEPGGGARRVAQASVDSFVSPLRGAREVDEACHGELWPSGSA PAGGFPATYRVGGPSTSVGDGTVTVKGGEMGAIDARAMHVLVHNQGESDFKDVTGRVSGL EERGDRVIVTYSSGRIYRYQAAKVRIYDQVVRRDLARGDEVYVRGSLWGTVETVYSLGLR EDPSRVRYTLVYRRRNGTVGLSRRGEGEVRMRLLTPERRNALEIMDYVRDEVRGQARRVG AWQEGTGRSTKWRAEKNSARAAATLANVWERLGRVPPGSALEAYLAGTSSADTGAEQDTI MPFASNLEQRRALKAALSHQLSVIDGPPGTGKTQTILNLVATLVAQGKTVGIVSGVNSAV DNVVEKLTEEGYGFLVANVGKREKVEAFLREQGRLGELWEEWEATAGLGADGTPVAVDPD DLHEREARLVQLWQDGRDIPRIRAELAATRKERQLLEVKIRESRRPLPDVSGLPVVRRGP EALADLLALARSAPKPHRGVLGLVERIRRYFAYGSLKGVRLDDADTQAALQLQWYRAYEE 
ETASRIERAEERSRNAGAQEEAREYRSQSRAVLDRAIRRRFTAVPRTLLTSTKYLSAQTR SLLGTYPVVASTCHSIRGNLDESVLLDWVIIDESSQVVIPEGIAALSKARNAVIVGDEQQ LGPVLTGWEESRREPPDPRYDVRCLSLLRSVKALDEVAGIPRTLLKEHYRCRPAIIEFCN RMYYDGQLIAMTDPADSQEPPMRVVRAAPGNHARRLRGGRYSQREIDIIAQLEDLRLLDR GVTTEDKDESGDYVLGIVTPYREQANRVGARVRQESEGGGHARWLAETAHKFQGRGAGTI VFSTVLNSDDTDGGDPRFYDDSRITNVAVSRAKDRVVVVTAHGGVRHSRNVTALLDYITM LDPEQVVESDIVSVFDVLYAHYSDAVRQYARSKWHGPSRFPSENIIDTLLGNVLAETHYA RLGYQFQVHLRDVLPHTRRLNEEQRQFVWHGSALDFGVFSTVTHKLVLGIEVDGFRYHEG DPAQRRRDALKDQVLRAYGVPVLRLPTNGSDEEARIRAALDAVEQDPFPPRFG >gi|319978712|gb|AEUH01000091.1| GENE 25 30156 - 30527 178 123 aa, chain - ## HITS:1 COG:MT1144 KEGG:ns NR:ns ## COG: MT1144 COG1487 # Protein_GI_number: 15840551 # Func_class: R General function prediction only # Function: Predicted nucleic acid-binding protein, contains PIN domain # Organism: Mycobacterium tuberculosis CDC1551 # 3 118 2 117 124 101 53.0 3e-22 MRILVDTNVWIDHLRAAEPVLVDLLERDQVCVHQSVITELALGNLKDRSILLKALERLMM VRNVDDQGVRHLVEERRLWGRGLSAVDAALLASAIVTPGVSLWTRDKRLRHAASDMGVLA DFD >gi|319978712|gb|AEUH01000091.1| GENE 26 30853 - 32298 653 481 aa, chain - ## HITS:1 COG:no KEGG:DIP0075 NR:ns ## KEGG: DIP0075 # Name: not_defined # Def: hypothetical protein # Organism: C.diphtheriae # Pathway: not_defined # 2 480 15 493 493 648 69.0 0 MAGLWWLPDEPGKQIPGILRYDGEGRSSLSLIGAFEDRIFSTLAPGVMDELEGTRTWDVI HGVAEQREVTLLGCVPTRSERTMLARVESPDLQTVSATIAVIGAHLGGEDDAIFAAAEVS VEDLTLWAASSVFSGSFGSRDGKLDGTGTVSVKPVNAETVTVDGTEYSLVHTHTLPFFDR RKGRTVGRMRDAASIRIRQAEPFTLRAGLDAASLIQDLIALATHRAAGVIWLQLEVAGTE TVLPNGQPLPRRRANVLYSPVAPGKHDEKGVDPDRVFFTCRELPFEEVMQRWCEAHGRLQ AATNMILGLRYAPARFVENNLLMAVGAAEALHRGLDIDEKPIPEAEFKAIRNAMLDQVPE EHRGRFKAAIRNSPTLRDRLFALAARPDQDAIMQLVPDVDHWAKRTARARNDLAHEGRTP NHSIDELDAIVGATTAVVILNILHELGLSAERQRQIVREHPQLRTVCQKASEWLVAPESN S >gi|319978712|gb|AEUH01000091.1| GENE 27 32425 - 33801 799 458 aa, chain + ## HITS:1 COG:Rv1586c KEGG:ns NR:ns ## COG: Rv1586c COG1961 # Protein_GI_number: 15608724 # Func_class: L Replication, recombination and repair # 
Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Mycobacterium tuberculosis H37Rv # 1 456 1 469 469 213 33.0 5e-55 MRRDFPRRAAVYLRVSLDATGEHLAVDRQRADCLRIIRERGWTLTQEYVDNSLSASKRTV RRPAYDRMVQDYDAGLFDALVCWDLDRLTRQPRQLEDWIERAEERGLVITTANGEADLST DGGRMYARIKASVARSEVERKSARQKAANAQRARLGRPPLGTRLMGYTAKGVLIPEEAKV VRGIFDSFLAGESLKGIARGLQAAGVPTRHGGRWHSSSIRTILLNERYAGIMEYMGEVLP NVPVTWEPIVSEDAYFLAQSRLSDPLRKTAKEGTHRKHLGSSLFRCAECGHGMHAFSNVR YRCPRCEFSRSRTQVDAYVLAVVRARLAQPDVADLLAVDHEPEVKELTAQIQRLRDRLER INQDYDEGIIDGLRYRTAAERVRAELTATEGKRAALSGGLSGTSSVLGAPDPVEAFDAAS LMIQRRIIDALLDVQLKPGARGSRTFNPDSVILTWKEN >gi|319978712|gb|AEUH01000091.1| GENE 28 33803 - 34324 278 173 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTEAYSVKLVHRHAGRRNDTVLDTFVPDEAGGWTPVPRLHSDPWPHGSRPHSRHEIALEG GSVATRRVLVGETVEQGTEVRAVDSRTARNERDAARRWAQEDPEYRALLAEDPREARKVA RRVAWQINRRCPGCGAVLSYSFREDTLYRAFDGLRAEGRRKVDVEALARALDR >gi|319978712|gb|AEUH01000091.1| GENE 29 34361 - 34480 56 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVLRRPGYRRAVTVTPDAPSSFRAILMAFLNGSAVCCA >gi|319978712|gb|AEUH01000091.1| GENE 30 34489 - 34707 205 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGTARQRAAARYASLTRSRSEDDPTLLAARQDLHAAELEDAINRALASAPPLGAEQRARL AAMLSAGKAVAA >gi|319978712|gb|AEUH01000091.1| GENE 31 34704 - 37394 2370 896 aa, chain + ## HITS:1 COG:XF2505 KEGG:ns NR:ns ## COG: XF2505 COG3378 # Protein_GI_number: 15839095 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Xylella fastidiosa 9a5c # 477 822 430 782 819 159 32.0 2e-38 MTPNAKNGGYSGPMPEEHRNSPLPDNHDVDGRHTPNFSPDGPLPALLVAEADRIDRKRWT PREYTRDAFRERFQPGNVPVRPDKNGPAYVGGSFLDANAPRGKRNMRERWVITLDADEVT SEALPRLVEAVRELGYESIIHATHSSTPTAPRCRVVFPLAAPLPAATYPEVANALMAALE VPGMEWDASCNQAERAMYWPSTPDRATYWAEFTEGRAMDAAEFMAERTPPAPSEDAGARA GTPGSKRDPGSLPGIIGAFNRVYDIPAVIDAYSLPYTPAGPGRWTYNSAESTGGLSLVEG QKDLAISRHANTDPACIPDSRGSFHAVNAFDLAALHLFGNLDSEADRAKGPCEREASQAA MEQRAAQDPTVLREYAGTDFKEPDTDSVRTVIAKTAEDTDSALADWCEPRLAGRLEYVSG 
AGFRLYSPSRGTWETDNSKDNARTIKLVEDMLRERHMGIVKSGDSKRTQRDRKNLNKGKI EAVTRLLQGRLLHGPDEYDANPDILNVGNGVVDLRTGTLLPHAPSYRCTQYTPVPYLPEA SHEDWKAALAALPAESRPWFRRWVGQAATGYTPREDNTMLGIAAGIGANGKTTLLTAVRL ALGTYAGTAAMDLLLSSPRNTNNTKMALYGKRFVLIEELPEGRLDGVQVKAVTGTELIRG NLKFKDEFEWRATHSLIVTTNNLPTVSEFSEGLWRRPFVLEFPYRFTSNPQAEADRPGDA GLTQRLLAPESPALPAVLAWVVRGAVEFYEHGQQVGALPAPISSATASWQEQSDVLLAFA SECLVKDAGAVVPGSHLYAAFEAWLSEQGAASWSQRTFRGRIASHGYFRDVKEGTHRWKK LTLSTWGAEAHPTADKVRGYRGIRFRTSEDERRAREAELAGRGLAVVAGTADMDLI >gi|319978712|gb|AEUH01000091.1| GENE 32 37411 - 37746 450 111 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTRYATVQEQDQACAAILVRNLYGYVKCEGRRWYLWDDDNGGWKRTTVGYALCNRIVREV ERLIVQAVMEDRYEDARDWCRYLDPTDIGTRLTPHMARIYRENQALPRGQG >gi|319978712|gb|AEUH01000091.1| GENE 33 37835 - 37960 58 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MESRGKDRVRFCSNGYAKRRPDLMPSVSQNLGRKVNQIVKN >gi|319978712|gb|AEUH01000091.1| GENE 34 37909 - 38040 254 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRTRSSKSVPGPFLAIPTKSRFLSQNGYAFYNLSRGKILKIEI >gi|319978712|gb|AEUH01000091.1| GENE 35 38049 - 38213 243 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRRTRSTSKTHFSLELHRKNRVRFYFCHVSARQWAQLYAFSPATSSTRNETAPT >gi|319978712|gb|AEUH01000091.1| GENE 36 38231 - 38776 425 181 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSVATFRMQREKLPVTVRFGFTARGYVDRSGVRWVQAWMQNTGVRAYAHEVYRHDWQPFD LFAERKPDAARLGVPDSLPEHLRMLSRCRLRREDGVFKPQYLPPGQYVGFWVSCPLGIDR VKLATTVSVRWWSQVRFISSEWLEVPSAEELIASYRETRAYLEQQARGPEGPTAPAEGTE D >gi|319978712|gb|AEUH01000091.1| GENE 37 38890 - 39990 1857 366 aa, chain - ## HITS:1 COG:Cgl1003 KEGG:ns NR:ns ## COG: Cgl1003 COG0012 # Protein_GI_number: 19552253 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Corynebacterium glutamicum # 1 357 1 358 361 434 67.0 1e-121 MSLTIGIAGLPNVGKSTLFNALTRATVLAANYPFATIEPNVGVVPLPDPRLDKLAEMFHS QRTVHATVSFVDIAGIVRGASEGEGLGNQFLANIREADAICQVTRAFNDPDVVHVDGKVD PASDIETIQTELVLADMQTLEKQIPRLEKEVRGKKTEPVVLETARAALAVLEKGELLSGP 
AGADLDQEALASFQLMTTKPFIFVFNMDADGMEDEDLKDELRALVAPAEAVFLDAQFEAE LVELDEADAREMLAESGQDESGLDQLARVGFDTLGLQTFLTAGEKESRAWTIHKGDTAPK AAGVIHTDFEKGFIKAEIVSYEDLVALGSIAEARAHGRVRMEGKDYVMQDGDVVEFRSGL TSGGKK Prediction of potential genes in microbial genomes Time: Thu May 12 17:38:13 2011 Seq name: gi|319978688|gb|AEUH01000092.1| Actinomyces sp. oral taxon 178 str. F0338 contig00092, whole genome shotgun sequence Length of sequence - 17885 bp Number of predicted genes - 23, with homology - 19 Number of transcription units - 16, operones - 5 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 112 - 1425 1059 ## gi|252126994|ref|ZP_04836937.1| hypothetical protein CORMA0001_0072 2 2 Tu 1 . - CDS 1649 - 1999 167 ## gi|154507871|ref|ZP_02043513.1| hypothetical protein ACTODO_00353 3 3 Tu 1 . + CDS 2578 - 2946 351 ## COG2826 Transposase and inactivated derivatives, IS30 family + Term 3167 - 3211 13.9 4 4 Tu 1 . - CDS 2960 - 3475 332 ## 5 5 Tu 1 . + CDS 3351 - 3677 265 ## 6 6 Op 1 . - CDS 3549 - 3815 98 ## 7 6 Op 2 . - CDS 3826 - 4896 1331 ## COG0761 Penicillin tolerance protein 8 7 Op 1 14/0.000 + CDS 5085 - 6326 1062 ## COG1570 Exonuclease VII, large subunit 9 7 Op 2 . + CDS 6398 - 6619 256 ## COG1722 Exonuclease VII small subunit 10 8 Tu 1 . + CDS 6889 - 7821 1323 ## COG0524 Sugar kinases, ribokinase family + Term 7895 - 7936 7.7 11 9 Tu 1 . - CDS 7908 - 8669 1020 ## COG0020 Undecaprenyl pyrophosphate synthase 12 10 Op 1 . + CDS 8653 - 8820 134 ## 13 10 Op 2 . + CDS 8820 - 9527 955 ## COG1272 Predicted membrane protein, hemolysin III homolog 14 10 Op 3 . + CDS 9537 - 10076 895 ## COG0782 Transcription elongation factor 15 11 Tu 1 . - CDS 10208 - 11656 1851 ## COG0628 Predicted permease 16 12 Tu 1 . + CDS 11798 - 12415 600 ## COG0225 Peptide methionine sulfoxide reductase + Term 12539 - 12577 -0.3 + Prom 12565 - 12624 3.8 17 13 Tu 1 . + CDS 12659 - 13225 800 ## SSA_1670 TetR/AcrR family transcriptional regulator 18 14 Tu 1 . 
+ CDS 13334 - 13663 496 ## gi|269219791|ref|ZP_06163645.1| conserved hypothetical protein 19 15 Op 1 . - CDS 13730 - 14128 565 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 20 15 Op 2 . - CDS 14145 - 15125 1562 ## COG4760 Predicted membrane protein - TRNA 15425 - 15498 66.1 # Leu TAA 0 0 21 16 Op 1 4/0.000 - CDS 15546 - 16529 1167 ## COG0248 Exopolyphosphatase 22 16 Op 2 . - CDS 16557 - 17072 458 ## COG1507 Uncharacterized conserved protein 23 16 Op 3 . - CDS 17069 - 17851 704 ## Bcav_1019 septum formation initiator Predicted protein(s) >gi|319978688|gb|AEUH01000092.1| GENE 1 112 - 1425 1059 437 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|252126994|ref|ZP_04836937.1| ## NR: gi|252126994|ref|ZP_04836937.1| hypothetical protein CORMA0001_0072 [Corynebacterium matruchotii ATCC 14266] # 1 240 1 249 386 98 29.0 7e-19 MDPIEELSDRVAALAGRDLALSEVHQFILDAAELLAPAVPVVTGNGVWVRWGLGERTVVV APHRFRSMLTLAVHFFNSEYTETHDYHAFKWGMADDMPFRWSMVLGEHTTSVFDWWRQCG LVGYNWDYFDRQFDSVLDSLPEDLELMPPQWRREVVYRWDMSVSGLGAVTLRATHEGIEI SSAATGESVMFPPGRTQGMGAVLAGLAGGAPLKKVPMLESSGFDAGPITLDGSEPEDVLR EIEMIEENNNEGIRPDTDDNRRPALTFADLRARLGEPEEETVSRAYARAEHAVLPMRWGL SLGQLHAIVRQWSAGAQMDRVLMELGAVPGTYLNDEALVGKDWVAVTGRVSSEWEIVVSP AEEHAMTDNRQLAAAAWQLSQEFQDAYGSPFAGWTSSSFGFSRFFRIGDRGLAINTFLGL RVVFGSFEKLAFRSLYG >gi|319978688|gb|AEUH01000092.1| GENE 2 1649 - 1999 167 116 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154507871|ref|ZP_02043513.1| ## NR: gi|154507871|ref|ZP_02043513.1| hypothetical protein ACTODO_00353 [Actinomyces odontolyticus ATCC 17982] # 2 95 173 264 398 63 40.0 6e-09 MRGVGNVRGLPVEDLWTTSVLIAAMRPPLEAFVAVSMALRRLSRFDRRDLAGSRAREEEA RRILLELVAQGEDAGAYGMGRARRVVLAADAGCESGTAVDAPAGLGQRGAAHPRPR >gi|319978688|gb|AEUH01000092.1| GENE 3 2578 - 2946 351 122 aa, chain + ## HITS:1 COG:tra8_g1 KEGG:ns NR:ns ## COG: tra8_g1 COG2826 # Protein_GI_number: 16128241 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, IS30 family # Organism: Escherichia coli K12 # 26 112 
287 373 383 86 51.0 1e-17 MGGKAPSKRAVDVSLLNRAGPGPPHRRPLTVDRGHEFALGPALQEPLGAPIYFCQARCPW QRGTNENTNGLLREYFPKRTPLDGISPTRTQQVHNQLNQRPRKRLGLKTPEEAHTHRTLH FL >gi|319978688|gb|AEUH01000092.1| GENE 4 2960 - 3475 332 171 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWSKMRLSLTESARIRPKPSNPDAFRTTPARSPSKTVRLRRILDHPRPDSQWSPQGPGRR KRSGRAGARIGVGYGALGQATWRAFPHWGRQEGIGVGKLPTPMGGFLPPMRLVGKRRRRG ARSGGYTAWPPGAWAREPPAVGCFHSDTDTTGHTFAAMPPFSQRYYTLQAL >gi|319978688|gb|AEUH01000092.1| GENE 5 3351 - 3677 265 108 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLSLTVFDGLRAGVVRNASGFDGFGRIRADSVRLRRILDHMPAKPPFPPETQTHFGPPN RWGTGPLGHRAGTGGGVEAAARGKHPHRAGTGGGVEAVDATLSGGAPA >gi|319978688|gb|AEUH01000092.1| GENE 6 3549 - 3815 98 88 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRPGTMGWGLHRCVRGGAMGLGTSRGARRVVVGAREAAGGAAAQARSRGSATTEGSVNGL NPAPGARPVRVLSACGGLNPAPGARPVA >gi|319978688|gb|AEUH01000092.1| GENE 7 3826 - 4896 1331 356 aa, chain - ## HITS:1 COG:ML1938 KEGG:ns NR:ns ## COG: ML1938 COG0761 # Protein_GI_number: 15828043 # Func_class: I Lipid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Penicillin tolerance protein # Organism: Mycobacterium leprae # 26 332 27 333 335 380 64.0 1e-105 MHGTPDLRATALTSLPETTGAGGGGRILLAAPRGYCAGVDRAVDVVEKALELYGAPVYVR KEIVHNKFVVESLSKRGAVFVQETDQVPEGARVVFSAHGVSPEVHEAARVRRLATIDATC PLVTKVHREAVRFAKQDYDIILVGHQGHEEVEGTFGEAPEHIQVVGTPEEVDRVVVRDPD RVVWISQTTLSVDETRLTVDRLRQRFPNLIDPPSDDICYATQNRQGAVKAIAPKVDVMVV VGSANSSNSVRLVDVALESGVPSAHRIDKAEELRGEWFAGATTVGLTSGASVPEILVRDV IEWLQAHGFPTVEVVRTETESTTFALPRDLRADLKAAGMAPARPHAPREVLPDHTR >gi|319978688|gb|AEUH01000092.1| GENE 8 5085 - 6326 1062 413 aa, chain + ## HITS:1 COG:Cgl0996 KEGG:ns NR:ns ## COG: Cgl0996 COG1570 # Protein_GI_number: 19552246 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Corynebacterium glutamicum # 1 356 18 373 417 284 46.0 2e-76 MRRLTENIKKYVDRMSPLWVEGQVVEFKRRPGAKMHFLTLRDLQTDTSMTVTAWAGVMDQ 
AGDGLQEGARVVTRVKPVFWERTGRLNLQAAEIHLQGVGSLLAQIEALRQRLAAEGLFNE SRKKPLPFLPRVIGLVCGRAAKAEDDVVVNASDRWPGARFEIREVAVQGDRCVAEVGAAL AELDAAPGVDVIVIARGGGAVEDLLPFSDEALVRAVAAARTPVVSAIGHEGDCPLLDLVA DYRASTPTDAARRIVPDRAHEREGVAQAVARMRGAVANRIAAERSSLALVASRPVLTGPG AIVDGYRAALERLTSSMNQAVERRVNTEEQQLTRMRATLRALSPQATLDRGYALLKTPSG ALITSSSDVKKGDLIEGILASGRMVAQVVGATPPAPRPGSAPPGGSDPAGAPW >gi|319978688|gb|AEUH01000092.1| GENE 9 6398 - 6619 256 73 aa, chain + ## HITS:1 COG:ML1941 KEGG:ns NR:ns ## COG: ML1941 COG1722 # Protein_GI_number: 15828046 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII small subunit # Organism: Mycobacterium leprae # 12 64 24 76 87 58 60.0 2e-09 MTGEKLPDPASLGYEAARDELVQIVRALEGGQAPLEDTLALWERGEALAARCSAILDGAQ ARLAKAGESGEPQ >gi|319978688|gb|AEUH01000092.1| GENE 10 6889 - 7821 1323 310 aa, chain + ## HITS:1 COG:mll7580 KEGG:ns NR:ns ## COG: mll7580 COG0524 # Protein_GI_number: 13476296 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Mesorhizobium loti # 9 310 2 306 307 166 36.0 5e-41 MDLSTTPHVLAIGEALIDVVITPDEPDFPQEIPGGSPANVALTLGRLGRPVALSTWIGAD ERGRLIEFHLRDSDVEITPESRGASRTSTALARLDAQGQASYTFDLEWAPASPLPVSDTA VVVQAGSISSVIEPGAAVVLEALERSRAHALVVFDPNARPSIMGEPARARATIERFVAVS DIVKVSDEDVEWLTEGAPLEDTIARWLGMGPSVVVVTRGKHGSLAVTASGVRLSKTPNDV PVVDTVGAGDSFTGGIIDALWELGLTGAGARDALRGIDEASVAAVLDRASAVSDVTVSRA GANPPWKHEL >gi|319978688|gb|AEUH01000092.1| GENE 11 7908 - 8669 1020 253 aa, chain - ## HITS:1 COG:ML2467 KEGG:ns NR:ns ## COG: ML2467 COG0020 # Protein_GI_number: 15828335 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Mycobacterium leprae # 7 253 12 262 262 243 52.0 3e-64 MAQWNMLYDIYERRLRAELATAPIPAHIGVILDGNRRWAKSLGAPASQGHRAGAGKISEF LGWANDAGVSIVTLWLLSTDNLSRDADELGELLRIICEAVEALAKQGAWRLAIVGDLSLL PDGVAADLSRSIEATAGREGMKVNIAVGYGGRHEITEAVRAMMREAAAAGRSLAEVAESF TDQDIASHLYTKDQPDPDLVIRTSGEQRLSGFMLWQSVHSEYYFCEVYWPDFRRVDFLRA LRSYAQRERRLGR 
>gi|319978688|gb|AEUH01000092.1| GENE 12 8653 - 8820 134 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFHWAMASSFLRWRTGGLVEYAQRQVYPLTALDASGRGGAGMTPNLRRSNLRYRK >gi|319978688|gb|AEUH01000092.1| GENE 13 8820 - 9527 955 235 aa, chain + ## HITS:1 COG:MT1117 KEGG:ns NR:ns ## COG: MT1117 COG1272 # Protein_GI_number: 15840522 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Mycobacterium tuberculosis CDC1551 # 25 233 30 242 242 156 42.0 4e-38 MVPARGGSMDIASLKPTQWVQGQLVVIKPKLRGWLHLGATPLSLAASTVLVCLAPTPATR WGSAVYLVASLFLFGVSAAYHRFYWAPKWETMWSRLDHSNIFLLIAGTYTPLTIALLPRH DATVLLGVVWGGAVVGILVNLFWPSAPRWLKTSIYVALGWVAVWYLPQLWRSGGPAVVWL VFVGGVLYTIGAVVYGTKKPDPSPTWFGFHEIFHAFTVAAWACHCVAVYLAVLNS >gi|319978688|gb|AEUH01000092.1| GENE 14 9537 - 10076 895 179 aa, chain + ## HITS:1 COG:MT1111 KEGG:ns NR:ns ## COG: MT1111 COG0782 # Protein_GI_number: 15840516 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Mycobacterium tuberculosis CDC1551 # 21 177 1 162 164 124 45.0 9e-29 MVRNRVGALGAGPLLSLEVPMADTQWLTQAQYDLLANELKERIEVKRPEIARLIDAARQE GDLRENGGYHAARNEQSMNETRIQALQEMLEHAEVGETPADDGIVEPGMVVTALVAGREQ TFLLGSRDAGGDLGIQVFSTQAPLGAAVIGHKAGDTIEFEAPTGRMIQVEILSAKPFKG >gi|319978688|gb|AEUH01000092.1| GENE 15 10208 - 11656 1851 482 aa, chain - ## HITS:1 COG:MT0215 KEGG:ns NR:ns ## COG: MT0215 COG0628 # Protein_GI_number: 15839585 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Mycobacterium tuberculosis CDC1551 # 48 402 8 357 367 200 36.0 5e-51 MSENHGSARWSIGDFVARITALRPSRQEARPEPPRVEREDEVEKDVVASVPRTLRVGAAV SWRVLLVVAVVAGVVWLASRLQEVLLPVAIALAIAVFLQPLVEWLRRRLHFPPALAATVG LLLFFVVFIAALSQATNQIVEQVPLLVNQATSGVKQLVDWLEHGPIKVDTTAINNFVNQM RTELIEWVNSNKQSLATGALSITSSLLSMVTSGLTMLFCLFFFLKDGRSIWLWVVRLLPA PARVPLHESAIRGWVTLGSYVRTQIQVAAIDAVGISLGAFFLGMPMVVPIAVITFFAAFV PIIGALASGAIAVLVALVYKGATSAIIMLVIILVVQQVESNLLQPFMMSSAVSLHPVAVM 
LVITAAGSVGGIAGAVFGVPIAAFINATVLYLHGYDPMPQLATQADRPGGPPGMLDQMIA DTHVGKPDTRALARQQVAEAAAEAAEAAAEAEPVVAQAPEAVVEEYPNPAEVEALGGAEE AD >gi|319978688|gb|AEUH01000092.1| GENE 16 11798 - 12415 600 205 aa, chain + ## HITS:1 COG:BMEII0230 KEGG:ns NR:ns ## COG: BMEII0230 COG0225 # Protein_GI_number: 17988574 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptide methionine sulfoxide reductase # Organism: Brucella melitensis # 17 203 33 213 218 180 49.0 2e-45 MELFNALPSSPVLSPRHAVLGTPIDAAPVDGQQVVHFAAGCFWGVEKAMWSVPGVIATAT GYMGGRCPNPSYEQVCAHGTGHAEAVRVVFDAVAAPFGQLVALFFEIHDPTQADGQGNDL GDQYRSAIWTTTADQFEIASRTRDAFQRELAAHGFGAITTSIAPAPASDSPGRFWFAEER HQQYLAKNPGGYECHARTGLSCPAL >gi|319978688|gb|AEUH01000092.1| GENE 17 12659 - 13225 800 188 aa, chain + ## HITS:1 COG:no KEGG:SSA_1670 NR:ns ## KEGG: SSA_1670 # Name: not_defined # Def: TetR/AcrR family transcriptional regulator # Organism: S.sanguinis # Pathway: not_defined # 3 188 17 202 203 180 46.0 3e-44 MADKREALKRAAHQVFSQKGYKAASIAQIAARAHVAVGSFYNYFPSKEAIFLEVYIEENN RARQRMIEAIDWDADPLAITRALFDQVRTGVRDNRILAEWGNPAVSGTLHAHFQSEARDE DYPFHQFLITTFATRLADAGFDEAEAARLLNVYRLVSFMDTHITENDFPGFDEALETLIT HFVKGVFA >gi|319978688|gb|AEUH01000092.1| GENE 18 13334 - 13663 496 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|269219791|ref|ZP_06163645.1| ## NR: gi|269219791|ref|ZP_06163645.1| conserved hypothetical protein [Actinomyces sp. oral taxon 848 str. 
F0332] # 3 107 7 113 115 75 42.0 1e-12 MSLALSGRSILGIAIAVIGVLTMAWGATQAAHDFEGFKTLELGGVIVLATGLCLVSEVPT ALQIGGIWVAAAASAAYIFTLPNWEFPLRMMSAVPVVALAVWLTTLLAD >gi|319978688|gb|AEUH01000092.1| GENE 19 13730 - 14128 565 132 aa, chain - ## HITS:1 COG:DR1839 KEGG:ns NR:ns ## COG: DR1839 COG0545 # Protein_GI_number: 15806839 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Deinococcus radiodurans # 23 131 45 151 152 92 46.0 1e-19 MENITVTGEFGRKPSLAFEGAPSPDLLVEVLHQGDGQVVEAGDTIHCHYYGIVFAGDSDF DNSFDRGGALSFQIGVGMVIPGWDQGLVGKRVGDRVLLSIPAELGYGARGVPQAGIPGGA TLVFVTDILGVA >gi|319978688|gb|AEUH01000092.1| GENE 20 14145 - 15125 1562 326 aa, chain - ## HITS:1 COG:MT1102 KEGG:ns NR:ns ## COG: MT1102 COG4760 # Protein_GI_number: 15840507 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mycobacterium tuberculosis CDC1551 # 73 326 20 278 278 126 36.0 5e-29 MANPVLDNLSKKWAPGQTPAGYPTMPGYQVSGQSAQNPYGGQTNPYGQPQGAYGQQTNPY GQPQGAYGVPQGAPYGIDASQMGAYEAAMNAPSADAVDRGRMTYDDVIVKTGITFGVIVV GAVLGWMTFTISMAAAVVLSGVATIVAFGLAMANTFMRITNPALVLAYAAFEGLALGAIS AAFETRYPGIVIQAVVATLVVFGVTLALFASGRIRNSPKLARFTLIALVSFVVFRLLTVL LEFSGAVDTTAVGRMTVMGIPLGVVVGVVAVLIGTFCLIQDFDRVKVGVEYGAPARAAWT CAFGLAVTLVWMYLEILRLLSYLRND >gi|319978688|gb|AEUH01000092.1| GENE 21 15546 - 16529 1167 327 aa, chain - ## HITS:1 COG:MT1054 KEGG:ns NR:ns ## COG: MT1054 COG0248 # Protein_GI_number: 15840454 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Mycobacterium tuberculosis CDC1551 # 7 327 3 318 319 189 43.0 4e-48 MTELSDVTRVAAIDCGTNSIRLLVADVVRDQGGARLTDLTRQMRIVRLGRGVDRTGVLEG AAIERAVEATVEYQRMIEALGASRVRFVATSATRDAANSDEFTREIRRVIGVDPEVVPGT EEASLSFSGAVCGLGGSVDAPMLVVDIGGGSTELVLGDARVEQAVSVNMGAVRVTEKFLS GVDPAGGVPPQDEARVVAWVDEQLDVAEKSVDLGRVRTLVGVAGTVTTLTAQALGLTSYQ PERIHGAALSEEEIGAAVRFMVDQPVALKAALGFMPEGREDVIAGGALIWSRIVRRVLAR 
AEQAGRPIARVRTSEHDILDGIALAMA >gi|319978688|gb|AEUH01000092.1| GENE 22 16557 - 17072 458 171 aa, chain - ## HITS:1 COG:MT1053 KEGG:ns NR:ns ## COG: MT1053 COG1507 # Protein_GI_number: 15840453 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis CDC1551 # 14 159 5 148 163 143 58.0 1e-34 MTRSIPGTSPATEADLEALRDQLGRLPRGVVGIAARCACGRPTVVATAPRLEDSSPFPTT FYLTHPRAVKACSTLEAERLMDEYNRALREDQALAQAYARAHEDYLARRGLLGEVPEIAG VSAGGMPKRVKCLHALVGHALAAGGGVNPIGDMALAEMARRGLWSPARCSC >gi|319978688|gb|AEUH01000092.1| GENE 23 17069 - 17851 704 260 aa, chain - ## HITS:1 COG:no KEGG:Bcav_1019 NR:ns ## KEGG: Bcav_1019 # Name: not_defined # Def: septum formation initiator # Organism: B.cavernae # Pathway: not_defined # 117 224 1 118 136 73 36.0 9e-12 MPRPSRPRSAPRPSAPREGEGRTPREIYAERSRKRREAAAAASASSTGSKERGRSKRRGP ASPRSAGRRPGAAVERPTSQPAAAGGKTKPRGRASGGKRTAPEGGAPGRVSFGGLDVSVR LLALSMVAVFILMMLVPSVYAWWQQERELADIRAQVAAAEQRNADMRKQLDLWSDPNYIS TQARERLGFVRPGETQYTVVDPGPEYQDSAQVNAAPAQGPARPWVQQVAILLGRADQPPQ TAAAQPQAPAQDGDDAESGR Prediction of potential genes in microbial genomes Time: Thu May 12 17:39:16 2011 Seq name: gi|319978680|gb|AEUH01000093.1| Actinomyces sp. oral taxon 178 str. F0338 contig00093, whole genome shotgun sequence Length of sequence - 7532 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 119 - 1399 1973 ## COG0148 Enolase 2 2 Op 1 . - CDS 1606 - 2229 646 ## gi|154507722|ref|ZP_02043364.1| hypothetical protein ACTODO_00204 3 2 Op 2 7/0.000 - CDS 2256 - 5840 4772 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) 4 2 Op 3 . - CDS 5866 - 6468 728 ## COG0193 Peptidyl-tRNA hydrolase 5 2 Op 4 . - CDS 6455 - 7309 1210 ## HMPREF0573_11764 hypothetical protein 6 3 Tu 1 . 
- CDS 7439 - 7531 118 ## Predicted protein(s) >gi|319978680|gb|AEUH01000093.1| GENE 1 119 - 1399 1973 426 aa, chain - ## HITS:1 COG:Cgl0949 KEGG:ns NR:ns ## COG: Cgl0949 COG0148 # Protein_GI_number: 19552199 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Corynebacterium glutamicum # 1 424 1 423 425 558 72.0 1e-159 MALIEGIGAREILDSRGNPTVEVEVLLEDGVLARASVPSGASTGAFEAVERRDGDKGRYL GKGVQDAVDAVEDEIAPELIGLEATDQREIDSTMIDLDGTDNKGKLGANAILGVSLAVAK ASAKSADLPLFQYLGGPNAHVLPVPMMNILNGGSHADSNVDIQEFMISPLGAPTFREALR WGAEVYHTLKSVVKERGLSTGLGDEGGFAPNLDSNAEALDLIVEAIEKAGFKPGQDVALA LDVASSEFFKDGLYTFEGEGRSTDYMVEYYEKLISTYPLVSIEDPLSEDEWDAWKALTAE IGGRVQLVGDDLFVTNPARLAKGIELGAANALLVKVNQIGSLTETLDAVEEAHRNGYRTM TSHRSGETEDTTIADLAVATNSGQIKTGAPARSERVAKYNQLLRIEELLGDSAVYAGRGA FPRFAH >gi|319978680|gb|AEUH01000093.1| GENE 2 1606 - 2229 646 207 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154507722|ref|ZP_02043364.1| ## NR: gi|154507722|ref|ZP_02043364.1| hypothetical protein ACTODO_00204 [Actinomyces odontolyticus ATCC 17982] # 8 190 1 184 198 90 35.0 4e-17 MNTPSRRILVALRTKIASVAAVVAAAVGMGACSAHPGQAIVTTQGSYSISDIETATAQIG EIIGQKAELNRIRAVFYTAPQLDALAARLGVGVTDAQIDQRVAAFSAAAGQDKYDKPLAE STKTVVRATIISEMLNQAVSQDPSLREQALSFLQQQNAQLDPQVNPRFPALDSQMNAFDY PRLGDVVSGSGSASDARQRTGQSAPTG >gi|319978680|gb|AEUH01000093.1| GENE 3 2256 - 5840 4772 1194 aa, chain - ## HITS:1 COG:MT1048 KEGG:ns NR:ns ## COG: MT1048 COG1197 # Protein_GI_number: 15840448 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Mycobacterium tuberculosis CDC1551 # 3 1188 13 1202 1234 1036 52.0 0 MLLTGLLPLLASDPAVAAAADSIAAGRSGALVAPSGLRAPIAAHIASRATTPVVVLTATG REAEVMTAALASWTAGAAAFPAWETLPHERLSPQVDTMASRIAVLRRLAHPVQGDAFAGP LSVLVVPVRSFLQPIIRGLADLEPVRIAVGDVVDLPEATARLSDLGYQRVDMVEARGQMS VRGGILDVFPPQEPHPLRVELWGDEVEEIRAFSIQDQRTLGAAPHGLWAVPCRELLLSAP VRARAREAADRLPGAAEMLSLAAEGIPAPGIESLAPVLASGMDRLVDLLPPGTPVLASDP 
ERIRARAADLVATTDEFLRAAWSAAAGGAQVPLEASDASFLDLDDLWGAGGPWWELTSLP PAELAEAAADGEDAGAGSDGGGSALVASPVLMRAGAREVRPYRGDFAAAAADLKSLAAQD WRVVVTTEGPGPGRRIRTILAEAGCPVALSDSVDSDPGPGLVTITTAQGTAGFVAPDLRV AVLTEGDMTGRAGATTRDMRRMPSRRRKGVDPLTLHPGDYIVHEQHGIGRFVELVSRTVG RADAAATRDYLVIEYAPSKRGQPGDRLFVPTDALDQISKYTGSDEPSLTKMGGADWAKTK ARAKKAVNEVAKELIRLYAVRQQTKGHAFGPDTPWQRELEDAFPYVETPDQLVTIDEVKA DMEKPVPMDRLLTGDVGYGKTEIAVRAAFKAIQDGYQAAVLVPTTLLVTQHLETFSERYA GFPVEIGTLSRFSTPKQTEEVKAGLASGRIDLVIGTHALLTGAVAFKKLGLVVIDEEQRF GVEHKETLKALRTDVDVLSMSATPIPRTLEMAVSGIREMSILQTPPEERQPVLTFVGAHT DAQVSAAIRRELLRDGQVFYVHNRVDSINSVAARVQSLVPEARVRVAHGKLGEHQLEAVI QDFWNHEFDVLVCTTIVETGLDISNANTLIVDRADVFGLSQLHQLRGRVGRGRERAYAYF FYPGDKALTQTAHERLKTIAANTDLGAGLAVAQRDLEIRGAGNLLGGAQSGHVEGVGFDL YVRMVSDAVAAYRGEAPAQKADVRLDLAVDAHIPEDYVRGERLRLEVYAKIAAVSSPEQE ADVREELADRYGPLPAQVDLLFAVARLREVLRRAGIGEAVTQGKYLRVSPVQLRDSQAMR LKRLHPGAVIKAAVRQVLVPFPLTQRIGGAPLTDGPLLEWVETLVTRILTPFRE >gi|319978680|gb|AEUH01000093.1| GENE 4 5866 - 6468 728 200 aa, chain - ## HITS:1 COG:Cgl0914 KEGG:ns NR:ns ## COG: Cgl0914 COG0193 # Protein_GI_number: 19552164 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Corynebacterium glutamicum # 2 189 3 180 180 155 43.0 6e-38 MNTTDSPWLVVGLGNPGAQYASTRHNVGHLTIDVLAARAGAVLKTHRSRTRVADVRLGVG PGGVPGPRAVLARSETYMNTTGGPIRRLADFLGIGAQRILVVHDDLDLPPHELRLKAGGG EGGHNGLKSLTQALGTRDYHRLRIGIGRPPGRMDPADYVLAPIPAKERPDWDVTLEEAAD AVEGVVAEGFAPAQQALHTR >gi|319978680|gb|AEUH01000093.1| GENE 5 6455 - 7309 1210 284 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_11764 NR:ns ## KEGG: HMPREF0573_11764 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 63 248 128 307 363 125 41.0 2e-27 MRISKRAWQVAGAAAGGASLFGGGIALGRLLRLDSQRGDYRKAWEDHTLATMDRLRANEG EKPYVIVALGDSSVQGLGASRVTESYPALLASAIQQTLGREVALLNLSLSGATVESVELT QIPQMRGMGLIDGDIEPDLVTLSIGGNDVMAEDMAPGQFAERLGRVIAALPPDSLVSSIP SFGIMPQEKRATEMSEYLARTVEESTARMVDVRALTREYSLPTYTFAYHAADFFHPNSAA 
YATWAQLFADEWAASRRQAAPVVADAPQWGMLSARVAQSEYEHD >gi|319978680|gb|AEUH01000093.1| GENE 6 7439 - 7531 118 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no EAIAAAYGARGWEPPRFLGALPSGPAGRTV Prediction of potential genes in microbial genomes Time: Thu May 12 17:39:36 2011 Seq name: gi|319978673|gb|AEUH01000094.1| Actinomyces sp. oral taxon 178 str. F0338 contig00094, whole genome shotgun sequence Length of sequence - 6377 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1057 1087 ## COG0153 Galactokinase 2 2 Tu 1 . + CDS 1188 - 1730 538 ## Noca_3556 2',5' RNA ligase 3 3 Op 1 . - CDS 1779 - 2606 1041 ## COG0708 Exonuclease III 4 3 Op 2 2/0.000 - CDS 2623 - 3516 1074 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase 5 3 Op 3 . - CDS 3513 - 4802 1868 ## COG0112 Glycine/serine hydroxymethyltransferase 6 4 Tu 1 . - CDS 5097 - 5189 87 ## - Prom 5212 - 5271 3.0 7 5 Tu 1 . 
+ CDS 5511 - 6158 772 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis Predicted protein(s) >gi|319978673|gb|AEUH01000094.1| GENE 1 1 - 1057 1087 352 aa, chain - ## HITS:1 COG:STM0774 KEGG:ns NR:ns ## COG: STM0774 COG0153 # Protein_GI_number: 16764138 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Salmonella typhimurium LT2 # 20 351 8 321 382 171 34.0 2e-42 MARIIRAEAVSPEEGAFNARELFIQAFGYEPEGVWSAPGRVNVIGEHVDYNGGPCLPIAL PHRAYLALAPRADRAVRLISPQTADAIDDLDLDSIGPWGGPRQVPSHWTAYIAGVAWALE RAGHGPLPGFDAALWSCVPVGGGLSSSAALECATAVAVDEVCGLGLAGSDEGRRLLVDAA RAAENEIAGANTGGLDQTASMRCRAGHALALDCRDMSAEPIPFDLASSGLELLVIDTRAK HSLADGQYGQRRADCEEAARLLGVGQLVEVSDLDAAMSALEGHERLARRTRHVVSEIART RAFIELMGEGPLEGERLDVAALLLDDSHDSLRDDYEVSCAELDVAVAAARAA >gi|319978673|gb|AEUH01000094.1| GENE 2 1188 - 1730 538 180 aa, chain + ## HITS:1 COG:no KEGG:Noca_3556 NR:ns ## KEGG: Noca_3556 # Name: not_defined # Def: 2',5' RNA ligase # Organism: Nocardioides_JS614 # Pathway: not_defined # 13 160 4 151 179 137 47.0 2e-31 MFLPQRLPGQDWLGVVIAIPEPWVSELTDLRLRLGDLQGSRVPAHITLMPPVPVATESRG EVIEHLSAIARRSRPFRVALRGADCFRPLSPVAFLNLAEGGRSCSDLAESIRSGPLDYEP RFPYHPHVTLAQGVGDPVLDLAIDIGSSFEASWVVPGFRLDRLDPSGAYSSMAIFDFEGI >gi|319978673|gb|AEUH01000094.1| GENE 3 1779 - 2606 1041 275 aa, chain - ## HITS:1 COG:Cgl0651 KEGG:ns NR:ns ## COG: Cgl0651 COG0708 # Protein_GI_number: 19551901 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Corynebacterium glutamicum # 1 273 1 303 304 159 36.0 4e-39 MTLTVTTVNLNGIRAAHKRGFLDWLEEAAPTALLMQEVRAPEEISRGILPGQWDSVWVPC RIKGRAGVGIAVHRDRGALVGPPRTALDGAESDADSGRWLEVLVEADGAPSPVRLVSAYF HSGEKDTPKQEAKMAHLPRIGARMAELLASAASGGEQAVVCGDFNVVRSRADIKNWTSNH NKRAGVLDEEIAFLNRWVEEGWHDVVRDLAGQVQGPYSWWTWRGQAFDNDAGWRLDYQFA TPGLARAARSFTIGRAPSYDQRFSDHAPVGVVYGL >gi|319978673|gb|AEUH01000094.1| GENE 4 2623 - 3516 1074 297 aa, chain - ## HITS:1 COG:MT3464 KEGG:ns NR:ns ## COG: MT3464 COG0190 # Protein_GI_number: 15842952 # 
Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Mycobacterium tuberculosis CDC1551 # 11 294 1 276 277 295 61.0 5e-80 MRAPWEGPARVLDGKATAAAIKQELRARVAALRQRGCAVGLGTVLVGEDPGSLAYVAGKH RDCAEVGIDSIRIDLPADAGQGRIEEAVAQLNADPACTGYIVQLPLPAGIDTNAVLEMID PDKDADGLHPTNLGRLVLRGSGPIDSPLPCTPRAVIELVERHGIGLAGADVCVVGRGVTV GRTIGLLLGRKDVNATVDVCHTGTKDLADHVRRADVVVAAAGAAGIITAQMVRPGAVVLD VGVSRTVVDGKPRLAGDVADGVDRVASWLSPNPGGVGPMTRALLVTNIVEAAERAAR >gi|319978673|gb|AEUH01000094.1| GENE 5 3513 - 4802 1868 429 aa, chain - ## HITS:1 COG:Cgl0969 KEGG:ns NR:ns ## COG: Cgl0969 COG0112 # Protein_GI_number: 19552219 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Corynebacterium glutamicum # 2 428 8 429 434 601 72.0 1e-172 MDSLNDMALAQLDPEIQAVLDNELQRQRGTLEMIASENFVPRAVLQAQGSVLTNKYAEGY PGRRYYGGCEFVDVAESLAIERAKQVFGCDYANVQPHAGAQANAAALMAMADVGDPVLGL SLAHGGHLTHGMRLNFSGKHYRAAAYEVSRETMRIEPDMVREAALRERPAVIIAGWSAYP RHLDFQAFREIADEVGAALWVDMAHFAGLVAAGLHPSPVPHADVVTTTVHKTLGGPRSGM ILSSRGDKWGKKLNSAVFPGQQGGPLMHVIAAKAIAMKVAQTDEFKDRQRRTLEGARILA ERLGADDAVSAGVKLVTGGTDVHLVLVDLVDSQINGQQAEDLLHEVGITVNRNAVPFDPR PPAVTSGLRIGTPALASRGFDAQDFEEVADIIGTALAQGASGSSVDLEPLRARVKRLTDK HPLYAGLEQ >gi|319978673|gb|AEUH01000094.1| GENE 6 5097 - 5189 87 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVSSEGVILHIVRQVVRGCAQKVIADRLTR >gi|319978673|gb|AEUH01000094.1| GENE 7 5511 - 6158 772 215 aa, chain + ## HITS:1 COG:all4420 KEGG:ns NR:ns ## COG: all4420 COG2148 # Protein_GI_number: 17231912 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. 
PCC 7120 # 13 215 246 445 445 196 47.0 3e-50 MVVPSAARRSALDLASKRVIDIVLATVGVIILTPLWLVVALLIRISDRGPAFFKQTRVGK NGQTFTMYKFRTMRVDAEQVKASLEAANRADVGAGNSVMFKMRDDPRVTRVGRVLRKTSI DELPQLFNVIKGDMSLVGPRPPLPSEVATYEPHVMGKFSVRPGITGLWQISGRSNLSWEE TVQLDLDYAATRTVGLDMWILLQTVPALLRQEGAY Prediction of potential genes in microbial genomes Time: Thu May 12 17:39:48 2011 Seq name: gi|319978660|gb|AEUH01000095.1| Actinomyces sp. oral taxon 178 str. F0338 contig00095, whole genome shotgun sequence Length of sequence - 20991 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 6, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 6 - 878 835 ## COG3764 Sortase (surface protein transpeptidase) 2 1 Op 2 . - CDS 931 - 2328 2067 ## Apar_1262 LPXTG-motif cell wall anchor domain protein 3 1 Op 3 . - CDS 2399 - 4759 2901 ## Ccur_00820 putative collagen-binding protein 4 2 Op 1 . + CDS 4722 - 4811 203 ## 5 2 Op 2 . + CDS 4804 - 4986 223 ## - Term 4914 - 4968 -0.0 6 3 Op 1 . - CDS 4983 - 5708 954 ## COG1922 Teichoic acid biosynthesis proteins 7 3 Op 2 . - CDS 5713 - 9162 4295 ## cauri_0414 hypothetical protein - Term 9206 - 9252 13.4 8 4 Op 1 . - CDS 9282 - 12731 4821 ## cauri_0414 hypothetical protein - Prom 12772 - 12831 1.8 9 4 Op 2 . - CDS 12863 - 14350 1935 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 10 5 Tu 1 . + CDS 14447 - 15085 911 ## cauri_0412 putative secreted protein + Term 15118 - 15149 2.2 11 6 Op 1 . - CDS 15024 - 16499 1665 ## cauri_0411 hypothetical protein 12 6 Op 2 . - CDS 16501 - 17787 1516 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 13 6 Op 3 . - CDS 17784 - 19295 1979 ## cauri_0409 hypothetical protein 14 6 Op 4 . - CDS 19292 - 20188 823 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 15 6 Op 5 . 
- CDS 20190 - 20939 1015 ## cauri_0407 choline/ethanolaminephosphotransferase (EC:2.7.8.2) Predicted protein(s) >gi|319978660|gb|AEUH01000095.1| GENE 1 6 - 878 835 290 aa, chain - ## HITS:1 COG:SP0466 KEGG:ns NR:ns ## COG: SP0466 COG3764 # Protein_GI_number: 15900382 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase (surface protein transpeptidase) # Organism: Streptococcus pneumoniae TIGR4 # 24 236 1 209 279 154 40.0 2e-37 MTASAPGRGRARTLAALVVVALGIALVSYPFVSDWLNRRAQNAATGAQGEAVASAAPDAL AAQREAAVSYNERLLRGSVRVTDPFDEAGLPPGNAEYDSVLDVAGDSVMAELVIPAINVD LPVSHYTSDESLSKGAGHLANTSVPIGGPSTHSVLAAHTGLPTARMFDRLDELRAGDWFV IRVLGEEHFYEVVSTEVVEPGETSSLAVQQGRDLVTLVTCTPYGVNSHRLLVHAERTSAR GQWDSGGAPPGPVIADVRPGMWGAAAAGAACAAAIVGAVAGARSLVRRRR >gi|319978660|gb|AEUH01000095.1| GENE 2 931 - 2328 2067 465 aa, chain - ## HITS:1 COG:no KEGG:Apar_1262 NR:ns ## KEGG: Apar_1262 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: A.parvulum # Pathway: not_defined # 6 464 11 473 475 437 61.0 1e-121 MTRTPKTLCALAAVTTAIVLAPAAGAAPTSPTDVPDSPLRVSGLLPGDTVSAFRIADADI DAANNLTYTMAPGLPSAYDTIEEIAAVASDGTAFSQGSAAQNAAAAIAAALTAPEASAVA PDSAADLTLGSGYYLVRVTSTSGQTRVYQNMIVDVTPSVVNGAYAARDLEPIQVKTTDVS INKGVGEAGADSTDAYSVGDSVPFTVTTAVPSYPADSPNATFTISDAPSAGLAIETATIS INGAAAATGADYTLTASENGYTITYAKDYVLAHPGAPVVITYKAKLTADAYSRTADDLTG NTATVAFNPSPYDAGTATPSDSATVKTYGFVFKKVTPAGQPLPGAAFAVTLANGQTLTST SDANGYVYFEDLAAGDYTAVETHVPSGYQKAPDVSFQLSAATAASDNPATAAVENNYLVS AADVTDPELVALPITGASGIFLIVGTGTLLLVLGSAMVLRSRKRP >gi|319978660|gb|AEUH01000095.1| GENE 3 2399 - 4759 2901 786 aa, chain - ## HITS:1 COG:no KEGG:Ccur_00820 NR:ns ## KEGG: Ccur_00820 # Name: not_defined # Def: putative collagen-binding protein # Organism: C.curtum # Pathway: not_defined # 23 776 12 752 764 724 52.0 0 MKVGSISENGVTMRSTHMPATRSRRRALARLVPALLAALLACSLAPLRADAADIRRVGDI LSAIGGTVYIHDNPDWSKGQYWADGAWHSFADLDPADKNTQDGPIAELDNGWKIYWAHSF NGNGGEYMIWRYPDNGQLGVNGSRWPGFKIRFNDIGYTAAGDPIDAVLDFEYVFAWQYDG 
SAAHPIREPSWLTPFEITRGYGPLTAASSQDQNAAPIGIDLEFDTALVHAGTDTPIDDSN EMDVLYWDIDQPVHQGPDGAVVNNYSSDWREGVHFVSGYKDPSTIGDTSELVVSDSNTWF RSGGNDSSPTPTDRSSVIGRTGPRFRTGWRGYACSTGIGYDTKVRVYPQWPAPVKSEPVQ IVQRGGTATFDVTEVFPYVADSNKADSLTLTDTLDPALDASRASVTVLKGPAGDVDVTDN WTITTSGQTVTATAKNTGHGYAEGRHVFRITAPVRQDADLSARTTEDVDGTRYWPLPNQA SMTVNNEAKPSNTVKALVPYEAEGEVRLQATKALTGAALQAGQFTFELRGANGDVISTAA NEADGTVAFPEIPYTAADIGKTHSYTIVERDGGAPGYVYDTHAEAVTVTVSDAGGGVLNA TASYDADGAVFSNEFRHALGVVKRSASDDALGGAVFTLYEDDGDGAHTAADPVATVYSDP QMTTEILGAEATTGADGTAVYHGLRASTTYWLEETRAPVGYNLDPRAHAITVSASGEVTT TDAAGAPAPLPLVDGTATITIVDEPLPVLPATSGPGAVLIVFAGSFLTAAGAGILLTRRS KRRASM >gi|319978660|gb|AEUH01000095.1| GENE 4 4722 - 4811 203 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVTPFSEMEPTFTYTEKVYSSKITDKRDA >gi|319978660|gb|AEUH01000095.1| GENE 5 4804 - 4986 223 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLNTHVGAADGTSAVKALIPHRSHDTPPRAPRRRRGDGYDKNGALGPTAAGAGALPRRRG >gi|319978660|gb|AEUH01000095.1| GENE 6 4983 - 5708 954 241 aa, chain - ## HITS:1 COG:BMEII1130 KEGG:ns NR:ns ## COG: BMEII1130 COG1922 # Protein_GI_number: 17989475 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Teichoic acid biosynthesis proteins # Organism: Brucella melitensis # 79 217 95 238 251 60 30.0 2e-09 MNGAQREAANPDAEHHHEIARMVLVEGRATTWLNHWSLLHADFDQLRKMDAVGVDGTLLQ MTLGRMGHPVVRSSADLVLPHVFNQMEAGARVALIGAAPGVAERAAARLSAFEVMAVDGY AGLAGLRRSHARLVDFDPRLVVVGLGAGLQELVAAKARDWLPRASVCTAGGWIDQFAASD QYFPDWVHRWRLGWAWRIAHEPRRLLGRYTVEAVDFLALAPRLVARLEDMGTFGEYGLVA R >gi|319978660|gb|AEUH01000095.1| GENE 7 5713 - 9162 4295 1149 aa, chain - ## HITS:1 COG:no KEGG:cauri_0414 NR:ns ## KEGG: cauri_0414 # Name: not_defined # Def: hypothetical protein # Organism: C.aurimucosum # Pathway: not_defined # 27 1147 54 1155 1212 755 41.0 0 MRNSHRIPLVVIAAATVVLNGLGHVPSQSAPPAPVRDGLSEAGAAASCWEIKQNDPNAPD GTYWLQTPTMNAPGRFFCDQTSDGGGWVLIGRGTDGWETWSQGKGAASALAKRERRPADG TVQLSHEAVNGLLGSTPVSELPDGMRVVRAYNGRGTSWQTIKMRLPKMADFVWPFKSAHP 
VVYSIDGGEWTTGEPIWSKFGDDLAWGMVDMTATATSKYRVGFGYGPGALLNADTSDTSF FRKAGFTVEPYAEVYLRPQLRSDDAYARIADSGTPAQAVAPTVSEYAAPMKWGVTGNLNG SYAEGNAQVQAFEQVGSTMYVGGNFTGVARAGSTKAQTSGLAGFDASSGEWNGQSFAFNN QVKDLVELPGARLLAVGDFTRVNGEEHVGTVVIDTATGAIDSSWTLKLRNALRSGTVSVK AARVTGQHIYLGGLFTHASSNGGRSAYARGAVRLSLSGEPDRSWNPEFNGSVVAFDVDEA GGYFYAGGHFTRAQRGAAPYAARVSTAVGAALDSSWTFEPSYFDGMYQQGVALTGGRVYF GGSQHSLFGYDTATMTRTSGSIAMQNGGDMQAVTSAGNGVVYASCHCSDFVFQDAYRYYE LGTSWTRADEIRWAGAWDGATGRQLGWTPYRLASARSTGAWALEADDSGSLWVGGDFTRS FTSKSANQWAGGFARFAPRQAPPAAPANLASSGAGDGKAALTWDAVAGAASYEVLRDDRT VATVRGTSASVPMGGENRFFVRAVGASGSRGASTPVLRVAADGAQAPAEPGSSLVGDAAQ WRYLWKDSAPDQAWASASYDDSAWTTGTAPIGYGGKGLNTVLKPGAPASRPITLYGRTAF TIKDLSKTGGVEVSFVADDGAVAYINGVEIGRQRLDDGDVSYGTRASSALSTDAARADRQ TVQVPASALVEGVNVLAVEEHVNYRNAPSLTLSATVTAIPAGAYVPPVPSPPRRGAPPAQ SGGEPQDPPADEPLKPLDASHVNFGKALFSGEQWSYWTDAKAPDAAWASTADLSKWQHGA SPLGWGDAGAGTDFGAAKRPTTAYFARDVKFGPMPENGTLTLHVRADDGAVIYVNGTEVA RVRMPSGTVTPRTKASNGVTVSEAADAMVEVVVPGKLLTSEITRIAVEEHSAGASDPSLT FDMYVTLER >gi|319978660|gb|AEUH01000095.1| GENE 8 9282 - 12731 4821 1149 aa, chain - ## HITS:1 COG:no KEGG:cauri_0414 NR:ns ## KEGG: cauri_0414 # Name: not_defined # Def: hypothetical protein # Organism: C.aurimucosum # Pathway: not_defined # 21 1146 39 1155 1212 785 42.0 0 MSRRLLGALLAPAALLTSFALAPMAVADPGGGSSTGGAAAVHDGLSAQTAAASCWDIKTQ NPAAADGAYWLQTPTMDTPGQYYCDQTTDGGGWVLIGRGREGWEPWTQGKGDRSSLTARG RTAADMAVAQLSTKEVNQLIGNGSVSGLEDGVRVLRSLDPQGSSFQALDLSFPQMTDFVW FFKMGHPVNATFNHGPVVSSRAYSRFGYDDAWNANNTYPSWANGFTIGFGFGPGAQNYPG DTASAGSFFRKIRQTVFPYSEVYLRPRIASDSSDFTRIPDEGAAASTVTRAVSEFAASTQ WGVTGNLNGSYAEGNIQVQAFEQVGTTMFVGGNFTGVRSGADGAITESRGLAAFDVNTGA FTGMAFNFNNQVKDLLTLPSGKLLAVGDFTYVNGEEHVGTVVLDPATGQIDPDWNLTIMN AIGSHRVSVRTARFYNGQVYLGGTFTHLRLGDMAKVYARNAGRVGLDGRPDRSWNPEFNA SVLASDINEQNGAYYAGGHFTRAHGDRAWYAARLSTEAGAAVDPSFDFKPSTVTAGKYQQ TITSVGDRVYIGGAEHSLFSYDAATNQRLSASIMMSNGGDLQASAASPNGVIYASCHCSD AAYQDADMWTIESNWTRADDVKWVGAWDTATGKNMHWTPFEISSQRKTGAWALTVDSNGN LWAGGDFTRTHTSVSRTQWNGGFARFDARDTTPPEAPAQVRAAASTEATVTLSWTPVADA 
DAYEILRDDRTVAQVKEPTVELARAGENRFFVRAVDAEGNRSATSPVYVPPAFGEHDPSD PVLVDDGATWSYRYEESAPDEAWASPGYDDSAWPRGAAPLGRGDDRIATDIQTAPHKTWP ITSYFRTHFTVADPSAVSGVNVDFVADDGAIAYVNGAEVGRQRLGTGGVSYETRADAAPT YDAAQADRTTVFVPAERLVAGDNVLAVETHVNYRKAKSVSMQARVERVERKPGGSDSSPA PSPSPGATSSPVVDATGVTDGEIISSGTQWSYWASQSEPTPYWASPSGDISDWSSGASPI GWGDPDAGTPLSIEKGDRAITNYFAADVDLGPITGDAKITLKVRADDGAVVYVNGFQLTT VRMSEGAITHSTFANQAVSAKSAASNMVTVEVPAWRLVNGINRIGVEEHVNYRGTPSMTF DLTASLKRW >gi|319978660|gb|AEUH01000095.1| GENE 9 12863 - 14350 1935 495 aa, chain - ## HITS:1 COG:PAB0783 KEGG:ns NR:ns ## COG: PAB0783 COG2244 # Protein_GI_number: 14521379 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Pyrococcus abyssi # 2 436 3 442 511 88 21.0 3e-17 MSDAHQSRRNLARGGIVGFVGAAASAVLGFVFTIVLSRALGTTGAGVVTQATGVFAIVMA LAKVGLDSTAIYLMPRLSVDSPQEIRASLNYMASVAVGVSLSAVLVLEAVAPMIWSAEVA GAVRAAAWFVPIGALSLVASAALRALGNMREYVLVQNLLLPGLRPLLVAVAAAVGGSYAV VSAAWALPFVLVLVCAWALLARHMPEAGGSRWPTRQRRRRIVSFALPRTLTAGLEQALQW LDVLLVGVLAGNAASGIYGGAIRFIQAGLVVDTALRVVVSPQFSKLLHQRKMEELASLYS TASVWLVLFATPVYVLMAVFSPALMRILGEGFEAGAGVLVALCAGSVVTFMAGNIHSLLI MSGRSGWAAVNKVVVLGLNVVGNVVFIPRFGMLAAAVVWAVCMVVDAAMASVQVARFIGV RPDLGEVALPVVGVLVCVGAPAGAVALVAGRDSLVGLAAGCAAGALGFLALCWAGRSPLR LVGLGAFARARRSGE >gi|319978660|gb|AEUH01000095.1| GENE 10 14447 - 15085 911 212 aa, chain + ## HITS:1 COG:no KEGG:cauri_0412 NR:ns ## KEGG: cauri_0412 # Name: not_defined # Def: putative secreted protein # Organism: C.aurimucosum # Pathway: not_defined # 86 212 124 258 258 67 37.0 3e-10 MSVKKLTGKWLVAAWVGLAVILVAAVCVVFALVGRSVTTSQSAPGASGAESDPAAVPQSV PKTIQPQSAGPANTAGAGMAGDDEASQQVDFVLASMWQAYSDPSTAIATDLSSILTESAL EEFDAQAQEWNTDGTRVSGTPRIEDAHVTASDGTTATVRACVDSTGVTVTNDAGSPLTDD TSLMRALTDFSFVNDGGSWKLSGVSFPDDPTC >gi|319978660|gb|AEUH01000095.1| GENE 11 15024 - 16499 1665 491 aa, chain - ## HITS:1 COG:no KEGG:cauri_0411 NR:ns ## KEGG: cauri_0411 # Name: not_defined # Def: hypothetical protein # Organism: C.aurimucosum # 
Pathway: not_defined # 24 433 5 412 436 307 46.0 8e-82 MAEIAASRRGIWARPGPAFIGALSEDALQLPAWPFIVAYSGYFLWWAIGAGDLMWPVFAI IMIVFMLGRRGLRFPPGTVIWLFFLVWVLASMSMLDSVGRLIGAFYRFALQMSPLVFGIY AFNARARLTPKALMGTLWGFLASAVVGGYLAMAFPKLRFNTLMYYVVPKALHSNDFVSEF VRRGTTQWNPGSWVLSDPRPSAPFIYANTWGNVFSLIFPLAVIFAVISWREYARHRYLHA LVCAASVVPAAATLNRGMYIGLVVIAAWVGFQRVRDGRWRTVLVALFLGGVGAIAWLFTP ASQSLVERLQSSSTTVDRFTNYVETVASLKESPLLGFGAPRPASVPWLPSLGTQGQFWTV LFSYGIVGALLFLAFFARILPSVWKAKDIYGSVLGGIVIATLVEQFYYGMNTGLMVSVLA VALLARHLEEGTEPLTDYDVAGPAAKTPRGMWKNPVRRAEGEVFSERMLDRVSTSDRRGR RPRIVSRNRRR >gi|319978660|gb|AEUH01000095.1| GENE 12 16501 - 17787 1516 428 aa, chain - ## HITS:1 COG:alr3176 KEGG:ns NR:ns ## COG: alr3176 COG0463 # Protein_GI_number: 17230668 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. PCC 7120 # 83 311 7 220 313 87 29.0 6e-17 MSAADPHGSSIWHVDRTARVRELVSGDPGMERALVLVRENGRNCGAVAIEGTAAPESDPA VAAVPSAPRRGWREGAGATVPATVVICTTGRSDLLAASVKAVLAQDHRDYRVVVVDNAPT TGLARTALKGIDDERLTMVNASRPGLSRARNRGILAATGEVVAFTDDDAIVDPHWLTALV DPFTTSALVGATTGIVLPLELAHAPQRWFESRGGFPKDFTPRVWAKGRIPDGVRGLGEAG EGGPLYPVATARVGAGVCMALRRDVLMEVGPFDPSLGAGTPTRGGEDLDMFARVLAHGDV IVHTPDALVHHRHRVDHEGLDAQVRGNGSGMSALLTKAVLARPSTALTLASRAPAVAARL RPGSARVAGTDDDVPSGLTRSEVKGFLEGPVRYLRSRHANRLRPRKRPSSGRADGAEGAP EGPGDGRG >gi|319978660|gb|AEUH01000095.1| GENE 13 17784 - 19295 1979 503 aa, chain - ## HITS:1 COG:no KEGG:cauri_0409 NR:ns ## KEGG: cauri_0409 # Name: not_defined # Def: hypothetical protein # Organism: C.aurimucosum # Pathway: not_defined # 10 294 13 260 381 80 29.0 1e-13 MILTKPPTGGQEILRHAFARRWKSIVAATLGGLAVGAGAVFALPVDHTATVTMTITPPAA TPTPVAKTSINSTDMVTELGIAKSANVLDAAATAIGGTDAKTLLAGMEVSGDTNGTIVRI EYSARSRDEAVAAADAIAAAYLKERTALAEARADEMLAGITARVDELNTELSQIQAQQAA QEPDDEDTGLGNNARGAQRNQPVQQTSSADTARIAQLRAEIDKLMDQAVQLAPYHVTAGR VLTAAGQSEDATSPSVSRTLLVTTAVGLFIGLVVVFVKETRARSLISPSQLADLTALPVW SREKGAENEWESPTRMLAMTIDREHWVDLIIDPTDPQARTLHSTLTRSLAETRVPSPRLI 
DMTQPLAGLLDEVRPSKHVLIAVRGGDALGPIHALLDELALINREVNAMLYLGEAEEAAS APRPEPAVHSAYGGPAEVVPPGRHAGMTSSDDPEETLIQAPIGDVPRHSDKAVINAELAE IQAERGTYEIPTQTVTRRVEGKR >gi|319978660|gb|AEUH01000095.1| GENE 14 19292 - 20188 823 298 aa, chain - ## HITS:1 COG:CAC2174 KEGG:ns NR:ns ## COG: CAC2174 COG0463 # Protein_GI_number: 15895443 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 2 224 5 235 336 78 24.0 1e-14 MPRLCVLLPARNAAGTIGRAVASTLRAMPSDSELVVGDDSSTDATAERAQEAAAGDPRLR VLPIAPGEGGVARVLGQLMAATDSSLVGRMDADDVSLRGRFRRCGAAIGRGDDMVFTQIV ELRGRRPVPRAPYAIGPDEMPWHLLLTNPVCHPTMVATRDCLDRVGGYRNVPAEDYDLWL RVAADGGAIRRLAAWGLVYRIHPTQVTASQRWRAESWNNELQARAFADLAQRLTGRPLPR LVTIPQMPASGASAALDDFEEAFRAGLGALPTRAAHRLERRLNGRIRWVRARMQGDDQ >gi|319978660|gb|AEUH01000095.1| GENE 15 20190 - 20939 1015 249 aa, chain - ## HITS:1 COG:no KEGG:cauri_0407 NR:ns ## KEGG: cauri_0407 # Name: not_defined # Def: choline/ethanolaminephosphotransferase (EC:2.7.8.2) # Organism: C.aurimucosum # Pathway: not_defined # 9 246 2 239 253 236 60.0 6e-61 MSGWQPPSGTWAQRFGQYRRELGAAQKPGDGVPAYMRWVNRSGARAVAAAAAAWGWTPNF VSLVSVCLSAAGMALLVALPPAWWTGIPVGVLLAAGFLFDSADGQVSRVTHASSKTGEWI DHVADAFRSPAIHYCAAVALMWHQPEQWWLALVALVYGWVTSGQFMSQILAEQFVRAAGR KQTRGGTLRSFILLPTDPGTLCWSFILWGLGTPFAVLYTALAAIACAHSAISLRRRYRDL RALDAAERR Prediction of potential genes in microbial genomes Time: Thu May 12 17:41:06 2011 Seq name: gi|319978653|gb|AEUH01000096.1| Actinomyces sp. oral taxon 178 str. F0338 contig00096, whole genome shotgun sequence Length of sequence - 7932 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 595 482 ## COG0615 Cytidylyltransferase 2 2 Tu 1 . 
- CDS 735 - 1589 1257 ## COG0788 Formyltetrahydrofolate hydrolase - Term 1668 - 1722 14.8 3 3 Op 1 1/0.000 - CDS 1917 - 2888 1014 ## COG2390 Transcriptional regulator, contains sigma factor-related N-terminal domain 4 3 Op 2 . - CDS 3154 - 4815 2332 ## COG1069 Ribulose kinase 5 3 Op 3 . - CDS 4817 - 6187 1966 ## COG0371 Glycerol dehydrogenase and related enzymes 6 3 Op 4 . - CDS 6221 - 7783 2283 ## COG1069 Ribulose kinase Predicted protein(s) >gi|319978653|gb|AEUH01000096.1| GENE 1 1 - 595 482 198 aa, chain - ## HITS:1 COG:aq_1368 KEGG:ns NR:ns ## COG: aq_1368 COG0615 # Protein_GI_number: 15606564 # Func_class: M Cell wall/membrane/envelope biogenesis; I Lipid transport and metabolism # Function: Cytidylyltransferase # Organism: Aquifex aeolicus # 17 139 12 127 168 83 37.0 3e-16 MDQTRTTTKPVTGYVPGGFDMFHQGHLNILRAARERCDRLVVGVTSDEALIRMKGRAPVI PLKERCDLVSSLRFVDAVVVDLDQDKRLAWRLQPFDVLFKGDDWKGTPKGAKLEAEMAEV GARVVYLPYTSSTSSTKLRRFIAPEDFSDDEAAAAADEASTASAQVPVAPGGAGAASAQV PVAPGGAARQAPRPPVRH >gi|319978653|gb|AEUH01000096.1| GENE 2 735 - 1589 1257 284 aa, chain - ## HITS:1 COG:RSc1873 KEGG:ns NR:ns ## COG: RSc1873 COG0788 # Protein_GI_number: 17546592 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate hydrolase # Organism: Ralstonia solanacearum # 7 282 6 287 288 310 54.0 2e-84 MTENAQLVLTLSCPDRPGIVHAVTGVIGAAGGNVIQSQQFGDPGTGTFFMRVEVDSPAGR APVEEGLADAARQFSADYRVDELGRRLRTIIMVSREGHCLTDLLYRQRTQGLPIEVVAVV GNHPDLAPVAQFYGVPFLNIPITKDTKARAEEQLLDLVASEKVELVVLARYMQILSDGVC RAMEGRVINIHHSFLPSFKGARPYAQAHERGVKLIGATAHYVTADLDEGPIIEQDVTRVS HADSTADMVALGQDVERRVLAQAVRFHAEHRVLMNGTRTVVFAK >gi|319978653|gb|AEUH01000096.1| GENE 3 1917 - 2888 1014 323 aa, chain - ## HITS:1 COG:SMb20851 KEGG:ns NR:ns ## COG: SMb20851 COG2390 # Protein_GI_number: 16264892 # Func_class: K Transcription # Function: Transcriptional regulator, contains sigma factor-related N-terminal domain # Organism: Sinorhizobium meliloti # 5 309 14 319 326 171 36.0 1e-42 
MGGTMDHDREAQMLRAAQLYYYENLTQGAIADRMNCTRWTVGRLLDEARACGIVAISINH PRSRVPTLEKRLVEAFGLGEAIVVRQQSTPAGTLELVAAAAADYITALRPQPESMGIAWG RTLTAVARAMPEDWAHGVDVYQTYGGLTRSNDDVVADSIGLMARRARGVGHMLPAPAIVS DVDLGRRLRNEPSVARTLAVAPRSDVLVFSPGVLEEESVLVRSGFLTERGMERLRAMGAV TDIFSRFLDMSGEPVSQELEERTIAIPLDAVRQAKRSVAVASSLVKAVPMLVAMRSRLAT AAVVDEELAKEILLLGRLGAEGH >gi|319978653|gb|AEUH01000096.1| GENE 4 3154 - 4815 2332 553 aa, chain - ## HITS:1 COG:BS_araB KEGG:ns NR:ns ## COG: BS_araB COG1069 # Protein_GI_number: 16079931 # Func_class: C Energy production and conversion # Function: Ribulose kinase # Organism: Bacillus subtilis # 4 545 2 549 560 494 46.0 1e-139 MTAAHTIGLDFGTLSVRAAVVRVDDGEVVADAVREYATPVMERALASTGQALPPDYALQV PGDYLVAMEEAVRGAMADSGVDPADVIGVGLDCTSASVVVTDADGRPMCEREEFAGEPQA YLKLWKHHGATDQARRIVSLARERGEAWLPRYGGTLSPEMLLPKALELFERAPQLYAQTA EILDIVDWLTWRLTGTLAYAAGDSGYKRMYQDGAYPSSEFLEALAPGFGGVFSEKMSHPI VPLGSRVGPLSAEWARAFGLPEGIAVAAGNIDAHVQCVSVGAIAPGCLTGILGTSSCWIL PSAELREVPGVFGVVDGGVSEGAWAYEAGQSAVGDIFAWFVDNHVPQSYFDEAAGAGESI HELLSRKAAALRSGESGLVALDWWNGNRSTLVDADLSGLIIGQRLTTRVEETYRALLEST AFGARMIIENFEAHGVGVEEIRIAGGLLRNPFLMQMYADVTKRPLRVARTLQAGGHGSAI FAAVAAGAYPSVAEAAEAMAGLADTVYLPDPAESEVYDRLFAVYSHLYDYFGRETEIMHD LQELRAERGWGQG >gi|319978653|gb|AEUH01000096.1| GENE 5 4817 - 6187 1966 456 aa, chain - ## HITS:1 COG:BH1862 KEGG:ns NR:ns ## COG: BH1862 COG0371 # Protein_GI_number: 15614425 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Bacillus halodurans # 41 346 51 360 399 150 33.0 7e-36 MTSLIDRALATATDTKEIAFGTGVLDQTGPMFARLFPGAKVLVVADGNTFAAAGGPVVDS LRAAGVEFAEEPYVFPGTPTLYAGYDNVEVLREHIRPLEDAVVCSIASGTLNDLAKLASG ELGRPYMNVCTAPSVDGYAAFGASIAKDGFKITRNCPAPRGLVADMRVIAAAPARLLYTG YGDLIEKVPAGADWIVADELGVEPIDDYVWSLVQGPLRDALKDPERIGAGDEAAVEGLAE GNIMSGLAMQAAQSSRPASGAGHQFSHTWEMEGHGLDWEPPLSHGMKVGVGTVASLAIWE EALRIDMDALDVEAVVAAAPTDEEVAARVRELLVPKIADEAVGHAVGKNLQGEELRERLR LIQKVWPRVRERVADQLMTPGEAAQRLDAVGGPSHPEDIGIDMARFRATHYKAQMIRSRY TILDVLTDTGMLDAVVERLFSPEGYWGRRPHTEREA 
>gi|319978653|gb|AEUH01000096.1| GENE 6 6221 - 7783 2283 520 aa, chain - ## HITS:1 COG:SMb20852 KEGG:ns NR:ns ## COG: SMb20852 COG1069 # Protein_GI_number: 16264893 # Func_class: C Energy production and conversion # Function: Ribulose kinase # Organism: Sinorhizobium meliloti # 11 515 4 497 509 249 32.0 8e-66 MVAPGYEGPFLLGIDYGTESCRAAIFDLRGNPISFAATPYKTHHPKPGWAEQSPEDWWNA LVASVKRVMDETGIPARHIAGISYDATTMTVVAIDKNGHELGNAIMWMDVRATEQSARAE TIEHWARYYNGGGTMPATAEWYPFKAAWLRANEPDRYKSAHRLVDAPDWLTHKLTGEWTV NINSAALRMYYNRDEGGWPVEFYEHIGCGDVFDKIPDRVVDLGAPIGGLLPSVAQELGLL PGTPVAQGCPDAWAGQIGLGVVQPGKTAIITGSSHVITGQSATPLHGKGFFGGYTDGVMP GQYTCEGGLVSSGSVLKWFKDNFCRDLTSAAERLGMNPYKILDERVSDMPPGSGGLIINE YFQGNRTPYTDSKARGIMWGLNLSTTPEQVYHAIEEAVCYGTAHVLKAFKDAGFASTELV ACGGATKSRDWMQMHADVTGVPITLTEVGDAVVLGSCILAAVGSGQYQSIPDAARNMVHE TERIEPRADVHEEYQFYFNKYMETYPRMQGLSHELVDHLA Prediction of potential genes in microbial genomes Time: Thu May 12 17:41:08 2011 Seq name: gi|319978644|gb|AEUH01000097.1| Actinomyces sp. oral taxon 178 str. F0338 contig00097, whole genome shotgun sequence Length of sequence - 4649 bp Number of predicted genes - 8, with homology - 4 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 841 1082 ## COG0647 Predicted sugar phosphatases of the HAD superfamily 2 1 Op 2 . - CDS 891 - 1853 1366 ## PPA2323 hypothetical protein 3 2 Tu 1 . - CDS 1992 - 2570 719 ## COG0698 Ribose 5-phosphate isomerase RpiB 4 3 Tu 1 . + CDS 2795 - 3277 390 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 5 4 Op 1 . - CDS 3457 - 3717 400 ## 6 4 Op 2 . - CDS 3724 - 4017 250 ## 7 4 Op 3 . - CDS 4017 - 4280 340 ## 8 4 Op 4 . 
- CDS 4316 - 4648 342 ## Predicted protein(s) >gi|319978644|gb|AEUH01000097.1| GENE 1 1 - 841 1082 280 aa, chain - ## HITS:1 COG:VNG0719G KEGG:ns NR:ns ## COG: VNG0719G COG0647 # Protein_GI_number: 15789896 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Halobacterium sp. NRC-1 # 7 270 22 280 288 166 37.0 5e-41 MSATTMAAWAERLYDAYIFDMDGTIYLGDHLLPGARRLVEELRARSIPVRYLTNNPTKDP SQYAEKLSRLGLPTPVEDVIGTVATTTEWILENKPGQVVYPIAEPPLIDAFARAGIPMSE DPARIDLVVASYDRTFDYAKLQIAFDAIWFHRRASLIGTNPDRFCPFPGGMGQPDCAAVV AAIEACTGTRAETVLGKPNPQMARTALRGLDIDLERAVMVGDRLMTDIRLATTAGMASAM PLTGESTREEAAALPPQERPTYVLERVDHLLPRGIWDQRG >gi|319978644|gb|AEUH01000097.1| GENE 2 891 - 1853 1366 320 aa, chain - ## HITS:1 COG:no KEGG:PPA2323 NR:ns ## KEGG: PPA2323 # Name: not_defined # Def: hypothetical protein # Organism: P.acnes # Pathway: not_defined # 5 320 22 381 381 206 42.0 1e-51 MTPDSVFTFASSMDLFWLVLALAGGAFGAMIGANFAFAFTGVSILVGFAVAATTGSTIYL DYVAFGPVFGPHIAFAGGVGAASYAARKGLLPGGARDINSPLAGLGRPDVLVVGALYGVG GYVLHKLIVLIPWFGGHTDSVALTVVTSGIVARIMFGKTPVFHAPTPPEGTTRWLDWQEK PLQLLTVSGFASLLASGTAAMLVGHIAPASANPQVIINNAQVVPFAFSSLCIFFVAMGMR WPVTHHMTITAGLAAVVFFQITGSGLTAVAIGTVFGIIAGFGGELLARLFYAHGDTHIDP PAGIIWIMNTAVVSTAALFS >gi|319978644|gb|AEUH01000097.1| GENE 3 1992 - 2570 719 192 aa, chain - ## HITS:1 COG:PM1645 KEGG:ns NR:ns ## COG: PM1645 COG0698 # Protein_GI_number: 15603510 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Pasteurella multocida # 31 181 2 149 151 114 43.0 9e-26 MDPSLPAAVGAGKTSPRPARTMDVETNVGFRIVVAADSAGVDYKEILKKDLEADPRVDEV IDAGLAPGEDVDYPHVAVNAARMIADGKADRGLFVCGTGMGVAMAANKVPGVRASVAHDS FSVERLVLSNNAQVLAFGQRVIGIELARRLAKEWLGYVFDESSHSAPKVAALQDYDADPS RPVPGALTAHRA >gi|319978644|gb|AEUH01000097.1| GENE 4 2795 - 3277 390 160 aa, chain + ## HITS:1 COG:SP1464 KEGG:ns NR:ns ## COG: SP1464 COG0454 # Protein_GI_number: 15901314 # Func_class: K Transcription; R General function 
prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Streptococcus pneumoniae TIGR4 # 1 155 1 159 164 70 34.0 1e-12 MRIRPAGLDDMDFIESAYDHARAFMRRNGNADQWPAHYPSRIDAQDDIARGRCFLVEDDE GPLAVFAFGPGPESDYASIDGAWRAETPYHVIHRLASVRGTGVARAAFAFCAERSPHLRC DTHADNVPMRRALESFGFQPCGTVMVCGRFPREAYDWVAA >gi|319978644|gb|AEUH01000097.1| GENE 5 3457 - 3717 400 86 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSAVTEIIARLLVISEEAEHILAALDGADADLNSAGHYAEQAGRVSPQPCGEAAAALAAE AKRDIASARGALSDLRAAAAGYAGTL >gi|319978644|gb|AEUH01000097.1| GENE 6 3724 - 4017 250 97 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDDRSQEEERRRRLAEARGCLSDSAHYSARAVSRLDLLASQLRDLMADIDATVAGSLTGA DARLYGAANEVHRLVRRACEAASTASRSAERDASELG >gi|319978644|gb|AEUH01000097.1| GENE 7 4017 - 4280 340 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MASMLGQLRGELADFQSDATRTGRELEIYLRRFTVQQGRINALIGGSTRRVDAELINTLE QAHRQLTHAIMALDVVAKSTGEYADSL >gi|319978644|gb|AEUH01000097.1| GENE 8 4316 - 4648 342 110 aa, chain - ## HITS:0 COG:no KEGG:no NR:no AARDAAARQEFDQLRQAAEQAERARALMEVQPQTVPEDQLIPVGELPARAAGVFSAAHSL VEREAMLRQIEEADARRRASKAPANQGRTIMLVCGALGLCYILLGAIGAV Prediction of potential genes in microbial genomes Time: Thu May 12 17:41:35 2011 Seq name: gi|319978640|gb|AEUH01000098.1| Actinomyces sp. oral taxon 178 str. F0338 contig00098, whole genome shotgun sequence Length of sequence - 4908 bp Number of predicted genes - 4, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 120 122 ## 2 1 Op 2 . - CDS 117 - 2777 3191 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 3 1 Op 3 . - CDS 2774 - 3397 685 ## 4 2 Tu 1 . 
- CDS 3658 - 4647 1629 ## COG0039 Malate/lactate dehydrogenases Predicted protein(s) >gi|319978640|gb|AEUH01000098.1| GENE 1 3 - 120 122 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSIVTDYNAQLALFAGSLEQAQDLGARAEAEAAAERERA >gi|319978640|gb|AEUH01000098.1| GENE 2 117 - 2777 3191 886 aa, chain - ## HITS:1 COG:HP0066 KEGG:ns NR:ns ## COG: HP0066 COG1674 # Protein_GI_number: 15644696 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Helicobacter pylori 26695 # 377 589 370 569 831 147 40.0 7e-35 MTLAEERRTIAQEAALVSGVARAHEARAHAELTRRLADSRAALDSQLASIDAAGARSAAR LQEEGAAARASSSQQLRALYDSARNAKRGCEPAPLTRIGTLEGGAPLAVGLTEGPGVLVS SAHGAAPAHDLALALAVSLVEDIPLQHLRIHVYDPRNSLTMAPLGRLREARAESFPAPLI TERDLENALDDLLRHASATAELLAGEGERTLSGLWSRLAVPQGEFRVLVLLDYPSSLSDR AREQLRALASSPGRGVCVIAAGQGPGAAPDGDGALAGALTDVRVDRDRVAIRAGGRWAVG APLAADPAHVGSVVSAAVASSNEVEGPTYPLSEFIEGGEPMWSRSSADGLSAVIGRTRGD ALTIRLSSQNPAMPNMLVGGAVGQGKSNLLLDIVYALAYHYGPDELRMLLLDFKEGVEFR RFAANDEGRDWLPHARLVALESNAVFGASVLSYLTDEIRARANTFKEAGVGSYDAYRAQG GSMPRLLVVADEFQMLFEGNDDVARDAVRALEQIARQGRSAGIHLVLASQTLSGIRALAN KEQAIFGQFASRLSLKNKAQESETILSRGNRAAADLTYRGEVVLNENFGEDPSRNIRGTC AWAQGDYVLDLQRRMFAAAPHGPAPTVFNPYAPVAWERHDSVPFARGRSAATDGIDLGRR IGVDENPVWHPVMGPRPIGLAIVGEPSLEVDGLIAAAASSLARVRPGVPLTFVVGSYGDA PVPIRLIASHLEGRGVPVRVVTGEGASVLLREGLPADQAVVLVDADQLIDFTDPIEGGDV PRFGEPPSIRSRLADVLHDTSAAGHPAVVTAWSSYAALEGVVGRDLRGVSGVALVGQDRR TLHDVSGDFSLEPPEESPRFVYVRPGAANQSFIGVPFSLPTPKEER >gi|319978640|gb|AEUH01000098.1| GENE 3 2774 - 3397 685 207 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSPTPPSDFEAISALAAYTTDLTGTRKRPWLAPGAPGQGLVRPTQLVASEGRRDLADTAA RLEAGAPLGAHKADGSTSPVAAAVAAALAFFASLYIPKYCLDLNHAMTRVGEGPSVFALV LFVLAMALPVTIFSRQISSRPSHPLWLACLIGAVAVHVIVAALAALGASNDALKGVGEPI ARTGVLLPVILVLVLAFVTAALNGMEE >gi|319978640|gb|AEUH01000098.1| GENE 4 3658 - 4647 1629 329 aa, chain - ## HITS:1 COG:DR0325 KEGG:ns NR:ns ## COG: DR0325 COG0039 # Protein_GI_number: 15805354 # Func_class: C 
Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Deinococcus radiodurans # 1 329 3 330 330 356 60.0 4e-98 MATPRIVTVTGAAGNIGYALLFRIASGQLFGPDVPVKLNLLEIPQAVKAAEGTAMELDDC AFPTLAGVEIFDDASKAFQGTNVAYLVGAMPRRAGMERADLLEANAGIFGPQGKAINDGA AQDVRVLVVGNPANTNATIAQNAAPDVPASRFTAMMRLDHNRAVAQLAHKTGAANADIKD VVVWGNHSADQYPDVSFAKVAGKPATELVDEEWLSSYYRPTVAKRGAAIIEARGASSAAS AANAAIDHMHSWIHGTPAGEWVTAGVMSDGTHYGVPAGLNFGFPVTSDGGEWQVVDGLEI SEATRAGIDHNIKALQEEYDAVKALGFIK Prediction of potential genes in microbial genomes Time: Thu May 12 17:41:50 2011 Seq name: gi|319978636|gb|AEUH01000099.1| Actinomyces sp. oral taxon 178 str. F0338 contig00099, whole genome shotgun sequence Length of sequence - 4289 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 904 1131 ## COG1230 Co/Zn/Cd efflux system component 2 2 Tu 1 . + CDS 1111 - 3180 3295 ## SACE_3182 hypothetical protein + Prom 3225 - 3284 1.8 3 3 Tu 1 . 
+ CDS 3340 - 4288 948 ## Caci_4620 hypothetical protein Predicted protein(s) >gi|319978636|gb|AEUH01000099.1| GENE 1 2 - 904 1131 300 aa, chain + ## HITS:1 COG:PA0397 KEGG:ns NR:ns ## COG: PA0397 COG1230 # Protein_GI_number: 15595594 # Func_class: P Inorganic ion transport and metabolism # Function: Co/Zn/Cd efflux system component # Organism: Pseudomonas aeruginosa # 1 300 3 299 299 182 41.0 5e-46 GGHSHDHARASRTRLLMALGVTAGVMAAELVAAHVSGSLSLAADAGHMAVDSSGLVVALV AAHLMTRPRDDRHTWGWARSEVLAAALQAGMLAIICAVVAWEAVWRLVSPGAVEPVPMLV VGAVGLIANGASLLILMGGRGSSLNMRAAFLEVANDALGSVAVIAAAACALATGWSGADA VASLLIAALMAPRALGLLRKSVAILMERTPHRIELAQVREHMLAVPGVTRVHDLHVTTVA TGLVAVTAHVEVSPAVDAHGRDLIVHSLGECAAHHFPVEIAHSTFQLECAEHAAHEHLAH >gi|319978636|gb|AEUH01000099.1| GENE 2 1111 - 3180 3295 689 aa, chain + ## HITS:1 COG:no KEGG:SACE_3182 NR:ns ## KEGG: SACE_3182 # Name: not_defined # Def: hypothetical protein # Organism: S.erythraea # Pathway: not_defined # 39 514 7 444 573 127 27.0 1e-27 MVGQHLPVGPCWFMTTGYPESELMRYLNKTIGAAGALALAAAGLLSGPALADSPTTIHVS QATGADTNDGSAERPFATLGAALKAAPSGATVEVASGTYREGEITSFKSLTITAGKGQQV SLNGADVVADWNDNGDGTYSSARSDFVRFSHVGTVNANPAVEGMAAYPEQVFVDGKELTQ VAERSQVGPGTFWVNDPDPVTLVNPKNNRQGYNVKPHTGVGYVLGDNPSGHTVEVVQHHR ALTLGGEGTVFNGFTVEKYSPLQQWDYKDPEIGTLTGGAMFFVGGKNVSITNNTFQYSAM GTALGLANADGSTVSGNTIAHNGGVGFGINRSTNVSVEHNTWSMNNQAGFIVDNCGAYCT IGDTKITHSDGIRYAFNTHDYSQAGYNHNDPNVDSPRRVIGVWFDEGVMNSQIVGNHFIN VGKSAIFDEVSSHNIIASNIVESSFQGIVLSGTDSDKIFNNTIVDTLSPMVIREDTRYNG CNARNEAGECTAPESWSIEKGLTWNATDNQIYNNVLTKSANSKEGDPWRYSMLLRFAGGV NGDGTAVYGPQEAQGLDYNTYYRTSKDTEWFTIHWQWAKDGGGAGAFNAPTLAEFTSHPQ AGTDSAPGRDAHGREAHGQDIIAPRAQQNLFVRLPAQENQFGAADLHPAPGGPLEKSGTA LDPAVAAAMRLNPDNPVDKGALVNAAWGD >gi|319978636|gb|AEUH01000099.1| GENE 3 3340 - 4288 948 316 aa, chain + ## HITS:1 COG:no KEGG:Caci_4620 NR:ns ## KEGG: Caci_4620 # Name: not_defined # Def: hypothetical protein # Organism: C.acidiphila # Pathway: not_defined # 6 316 9 345 515 79 27.0 1e-13 MERPSDWSVVGYASDPVPGDPLVVRQGALDYQRIADSIASCAQALRSLDAGGSRGSQAVA 
ALLETRDDLLDKVGVAEGRYREAGGALEEYAGALDRAQSDSLNALAAAKSAQADAMEART RAERMRASAEEYPADGDGGDDRARYMRLAGAADADEAAAQGRVAAQKEIIDRAVSERDIA AERAIEIINGACGDGLADSWWDDWGKAITQWIAKICEIISGIAGVLALLVSWVPIVGQAL AAFLGALSAVTGVIAALANTVLAMAGEQTWLDAVISVGFAALGCVGLGGLRGVAASARAM AGARGAWAAAGGLRGV Prediction of potential genes in microbial genomes Time: Thu May 12 17:42:16 2011 Seq name: gi|319978605|gb|AEUH01000100.1| Actinomyces sp. oral taxon 178 str. F0338 contig00100, whole genome shotgun sequence Length of sequence - 34903 bp Number of predicted genes - 37, with homology - 24 Number of transcription units - 19, operones - 9 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 22 - 462 429 ## gi|115377389|ref|ZP_01464594.1| hypothetical protein STIAU_5665 2 1 Op 2 . + CDS 470 - 1180 769 ## 3 2 Tu 1 . + CDS 1293 - 1964 682 ## 4 3 Tu 1 . - CDS 1975 - 2247 142 ## 5 4 Tu 1 . + CDS 2114 - 2785 903 ## - Term 2744 - 2808 2.5 6 5 Tu 1 . - CDS 2894 - 2974 96 ## - Prom 3113 - 3172 1.7 - Term 3205 - 3246 2.4 7 6 Op 1 12/0.000 - CDS 3326 - 3622 353 ## COG2440 Ferredoxin-like protein 8 6 Op 2 9/0.000 - CDS 3651 - 4949 1758 ## COG0644 Dehydrogenases (flavoproteins) 9 6 Op 3 29/0.000 - CDS 4965 - 5837 1197 ## COG2025 Electron transfer flavoprotein, alpha subunit 10 6 Op 4 . - CDS 5855 - 6616 964 ## COG2086 Electron transfer flavoprotein, beta subunit 11 7 Tu 1 . + CDS 6739 - 7347 573 ## COG1309 Transcriptional regulator 12 8 Op 1 2/0.500 - CDS 7344 - 8006 846 ## COG0491 Zn-dependent hydrolases, including glyoxylases 13 8 Op 2 1/0.500 - CDS 8009 - 9346 2150 ## COG0477 Permeases of the major facilitator superfamily 14 9 Op 1 . - CDS 9471 - 10346 1593 ## COG2030 Acyl dehydratase 15 9 Op 2 4/0.000 - CDS 10375 - 11946 1566 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 16 9 Op 3 8/0.000 - CDS 11953 - 13173 1804 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase 17 9 Op 4 . 
- CDS 13202 - 14335 1934 ## COG1960 Acyl-CoA dehydrogenases 18 9 Op 5 1/0.500 - CDS 14368 - 15336 1357 ## COG0657 Esterase/lipase 19 9 Op 6 . - CDS 15345 - 16661 2129 ## COG0477 Permeases of the major facilitator superfamily 20 10 Tu 1 . + CDS 16567 - 16794 161 ## + Term 16876 - 16930 4.4 21 11 Tu 1 . - CDS 16801 - 16998 85 ## - Prom 17189 - 17248 1.8 22 12 Tu 1 . - CDS 17505 - 18377 1012 ## COG4377 Predicted membrane protein 23 13 Op 1 . + CDS 18295 - 18477 91 ## + Term 18491 - 18525 1.0 24 13 Op 2 . + CDS 18564 - 19991 1715 ## COG4222 Uncharacterized protein conserved in bacteria 25 14 Tu 1 . + CDS 20172 - 20774 786 ## - Term 21242 - 21283 15.5 26 15 Op 1 . - CDS 21289 - 22968 2710 ## COG0166 Glucose-6-phosphate isomerase - Term 22998 - 23026 2.1 27 15 Op 2 . - CDS 23059 - 25605 3163 ## COG0205 6-phosphofructokinase 28 15 Op 3 . - CDS 25639 - 25896 350 ## + Prom 25952 - 26011 1.9 29 16 Op 1 21/0.000 + CDS 26048 - 27547 2045 ## COG0280 Phosphotransacetylase 30 16 Op 2 . + CDS 27630 - 28829 1610 ## COG0282 Acetate kinase + Term 28980 - 29019 7.0 31 17 Tu 1 . - CDS 29217 - 29663 329 ## - Term 29674 - 29718 7.3 32 18 Op 1 . - CDS 29796 - 30641 962 ## AAur_3480 hypothetical protein 33 18 Op 2 . - CDS 30694 - 31704 1492 ## Kfla_0428 band 7 protein - Prom 31744 - 31803 3.1 34 19 Op 1 . + CDS 31703 - 31798 132 ## 35 19 Op 2 2/0.500 + CDS 31808 - 32563 836 ## COG1051 ADP-ribose pyrophosphatase 36 19 Op 3 . + CDS 32585 - 34669 2966 ## COG0171 NAD synthase 37 19 Op 4 . 
+ CDS 34747 - 34903 189 ## Predicted protein(s) >gi|319978605|gb|AEUH01000100.1| GENE 1 22 - 462 429 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|115377389|ref|ZP_01464594.1| ## NR: gi|115377389|ref|ZP_01464594.1| hypothetical protein STIAU_5665 [Stigmatella aurantiaca DW4/3-1] # 33 128 363 458 478 65 38.0 1e-09 MLASAVAALKFKGSGRTDLPKGLYEGRGKNHGSAAARAYEEQITGYPVEYSIYVEGDLSE VEFDGFRDGVLLEAKGPSIANILSHEWGEHQIYEYLEDMQKQLDAMARGGLDMPIHWYFA EEEAMNIMLDTDDIPDGISLFFEPPK >gi|319978605|gb|AEUH01000100.1| GENE 2 470 - 1180 769 236 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKEPTVIMKWFAKVRLGSWASDATATAERVMPWLDSLAQVSPLLAEWKLLGKSRYECLV ARPLTLDTLRIRLWEGRSKVIFPGRTIPGYQASPAFAGDIDESASVLRISGGIYIGPDHP GINTLNIKFGALEVEELEGLADPLIDATVGAFDPVYLSICDLDVSVDREWDKFLPGWKIY LPHTAPPTRAQQAQQAATTTRPLDHGTVYTIATPTTYPTRTTDWTKDPKPDTDPDA >gi|319978605|gb|AEUH01000100.1| GENE 3 1293 - 1964 682 223 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYWCSEVCLGSWASDATVTAERVMPWLGSLAQVSPLLAEWKLLGKSRYECLVARPLTLDT LRVRLWEGRYGTRRPAYGAAPAFAGDIDEGASVLRLSGGIYIGPDHPGINTLDIKFGAPT AEDLEDFADPLIDTTVKALAPVYLSICDLDVSVDRDWDMFLPGWKIYLPHTAPPTRIQQA QQAATTTRPLDHGTVYTIATPTTYPTRTTDWTKDPKPDTDPDA >gi|319978605|gb|AEUH01000100.1| GENE 4 1975 - 2247 142 90 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPISFQSASRGLTWARESSHGITRSAVAVASDAHGPKRIFAFQSIIGLAYIGWVCCSGRA QGAAEGCAASGMVRCGGGHVPCSGVVDRVC >gi|319978605|gb|AEUH01000100.1| GENE 5 2114 - 2785 903 223 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDWNAKIRLGPWASDATATAERVMPWLDSLAQVSPLLADWKLMGNSRYKCLVARPLTLDT LRVRLWEGRYSASSPGYVAYPAFSGDIDEGPAILSINGGINIGRGYPRISILNIKFADID PDNLEDFADAFIDTTVAAFDPVYLSICDLDVSVDREWDMFLPGWKIYLPHTAPPTRIQQA QQAATTTRHLHTGTVYTIATPTTYPPHTTDWTKDPKPHTDPNT >gi|319978605|gb|AEUH01000100.1| GENE 6 2894 - 2974 96 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPLRHLAFRCTTWADVGPGCPGVSGV >gi|319978605|gb|AEUH01000100.1| GENE 7 3326 - 3622 353 98 aa, chain - ## HITS:1 COG:ECs0047 KEGG:ns NR:ns ## COG: ECs0047 COG2440 # Protein_GI_number: 15829301 # Func_class: C Energy production and conversion 
# Function: Ferredoxin-like protein # Organism: Escherichia coli O157:H7 # 8 98 6 95 95 79 42.0 2e-15 MAGFVVESVPARLAGNTYNLDEEESHIEVDQELARRLGAGPLLERVCPAHVYSVEADGTI GVEYAACLECGTCLAMAPKGVLKWHYPRGSFGIMFREG >gi|319978605|gb|AEUH01000100.1| GENE 8 3651 - 4949 1758 432 aa, chain - ## HITS:1 COG:ydiS KEGG:ns NR:ns ## COG: ydiS COG0644 # Protein_GI_number: 16129655 # Func_class: C Energy production and conversion # Function: Dehydrogenases (flavoproteins) # Organism: Escherichia coli K12 # 1 432 1 429 429 328 42.0 9e-90 MSEEVDFDVIVIGGGVAGAVCAYTLAGQGREVLLVERGAEPGSKNLSGGVFYCRIMEKVF PDFVDAAPVERRITRNCVSLINEGSYVNIDYWDQRLSEPANAVTVLRAKLDAWLLEQCEE AGVTVMPGVKVDSLIVEGEQIVGVRAGEDELRSHVVVAADGVNSFIAQQAGIRAKEPMKH LAVGVKSVIGLPRKVLEDRFNVRGDEGVAYAMVGDCTKGVAGGGFLYTNQESVSIGVVMR LDDLEKSGLSSSDVHDHMLGHPAIAPLLEDGTLLEYGCHLTIEDGPAMASHDLTRPGLII VGDAAGFTLNTGLTIRGMDLAAGSAIAAAETIRRAFNKEDFGQEAMDAYLRLLDSGFVGK DMATYAKAPAFLERPRMYTDYGKLAAEVFYGIFNHDLKPRRHMRKVGFDVLKASGLKLTH IAGDVLAGVRAL >gi|319978605|gb|AEUH01000100.1| GENE 9 4965 - 5837 1197 290 aa, chain - ## HITS:1 COG:STM1353 KEGG:ns NR:ns ## COG: STM1353 COG2025 # Protein_GI_number: 16764704 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Salmonella typhimurium LT2 # 1 290 4 310 311 151 35.0 1e-36 MTNTWILTTDERIANLVEVAAAVGGTTTVIAVAPSAPAVGGVDSVIHIPTAENVPAEAYA GVVAGLLADAKGDVILAANRPAERSFAGAAAAALALPVIVGATSVSAEGAEVARYGGLTN ETVAFKGAVLVLEGGAAVEADGPAAQVHEGTPMGAEVVDVQPSGSGPANLAAARRIVSCG RGFKAEEDLALAQALADALGAEIACSRPLAEGTAWMTKDRYVGVSGMRVSPDVYVALGIS GQVQHTSGMAGSKIVVAVNSDAEAPIFQISDYGIVGDIYDVVPALTAALS >gi|319978605|gb|AEUH01000100.1| GENE 10 5855 - 6616 964 253 aa, chain - ## HITS:1 COG:STM0075 KEGG:ns NR:ns ## COG: STM0075 COG2086 # Protein_GI_number: 16763465 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Salmonella typhimurium LT2 # 1 209 1 211 256 89 35.0 7e-18 MSIIVAYKYAADPQDTTVDASGAVDWSRAKAAISEYDPVAVQVGRALADSLGTDVVGVSV 
GGKAVASSMAKKGALSRGMDRALVVADDATADWNATTVASALAGLVGKVEGASLVLTGDS SIDESAKIVPTLIAGFLGWPCFQEVVAVEAAGDGWKLTQSTAGGTRTIRVDGPVVAAVAT DAAPVKVPGMKDILAAGKKPLEKVAVADLPTVDTALEITGRSKPVAKARRNETFTGDDAA AQLVAALRNAGAL >gi|319978605|gb|AEUH01000100.1| GENE 11 6739 - 7347 573 202 aa, chain + ## HITS:1 COG:MT3256 KEGG:ns NR:ns ## COG: MT3256 COG1309 # Protein_GI_number: 15842744 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Mycobacterium tuberculosis CDC1551 # 12 180 35 203 229 59 30.0 3e-09 MTRFAVLSPAKAGRPRDPQLEERVFRAALDLYGEAGWAGFNLTKIAAEAGVGKSSLYSRW SDRDELLHRAFAALVVCPGPRGDSPREILANEADFRLREYLGPNRSAVRRLFVEAGNAEN PVIWQIYEDLFVTPLSAIHERLWEFKLSGALPRSTSVVRLLDAIEGSVLMRTFCLPDDVI GCFLEKVPDYVSDLVDDQLHHH >gi|319978605|gb|AEUH01000100.1| GENE 12 7344 - 8006 846 220 aa, chain - ## HITS:1 COG:CAC2272 KEGG:ns NR:ns ## COG: CAC2272 COG0491 # Protein_GI_number: 15895540 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Clostridium acetobutylicum # 4 197 3 193 199 112 34.0 5e-25 MFTIKQLTVGSWRAVCYALTNEEGTIVIDPGAEPDRLERWLGDAEVIGIVLTHCHSDHIG AVNELVDRYGCWVACGADDVDGVADVHRSGFDEEGVDYTVDHVDRSLAEGDTITWGGDAL RVLHTPGHTPGSICLLSQAQQVLFSGDTLFAGGIGSTAFVLGDYPDMVRSCARLAQMDPA LTVHPGHGRATELGVERPMLAHIARSALRGPGQSSNWREG >gi|319978605|gb|AEUH01000100.1| GENE 13 8009 - 9346 2150 445 aa, chain - ## HITS:1 COG:yaaU KEGG:ns NR:ns ## COG: yaaU COG0477 # Protein_GI_number: 16128039 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 9 445 6 440 443 413 49.0 1e-115 MADDNNVIDLEDLELTPLLKRVIVFSSGGPFLEGYVLSIIGVAMLKMGTELNLDAHWQGM LGVASLVGLFLGASVGGWLTDMIGRRKMFVIDLVVIAVLSLLCALVQEPVSLLALRFLVG VAVGADYPIATSMIAEFSPRKYRAGAMGVIAAAWYLGANVAALVGFALYNTPNGWRYMLA SSAIPCVLILVGRWDIPESPRWLVSKGRAEEAQAIVHKTLGANVRLPVAEEAAKTSVMKI 
LRGVYLRRIIFIGVIWLCQAIPMFAFYTFGQQIIGGAIGMDNERYALLVELLIGTFFMLG TFPAMYLCERIGRRPLIIACFAGMTVALAVVGFFPGAGVGLVLGCLIAYALLAGGPGNLE WLYPNELFPTEVRATAMGFAMSLSRIGTVVSLYILPAFMAAHGFAKTMLAGAAISVLGTV VSVIWAPETRGYTLAETGSVDFKGR >gi|319978605|gb|AEUH01000100.1| GENE 14 9471 - 10346 1593 291 aa, chain - ## HITS:1 COG:MT3496 KEGG:ns NR:ns ## COG: MT3496 COG2030 # Protein_GI_number: 15842982 # Func_class: I Lipid transport and metabolism # Function: Acyl dehydratase # Organism: Mycobacterium tuberculosis CDC1551 # 1 290 1 287 290 110 32.0 5e-24 MAINTDMVGQTFGPFVRDYTFRDLELFALGCGAGIDGKDGLEYLNEHDERDPKLKVLPMF GAMLIVDSEVTRTIDYGYNYAGSLHWGFDIRFHQPITKMSDHLETKVRLEGLYDRGEGRG LLAQHIGDTYDSDGNLLFTNESWDCLIYDGGWGGPKPPKDIVEMPDRPADVETTETIPEN QALIYRLSGDYHPQHIDWDYAAENGEPRPILHAISYAGVVMRHAINAFVPGEPERITRFK TRITSPVHPGSTLTTRLWKVKEGELRFALVDADVDQTGAKPHLNWGIIEYR >gi|319978605|gb|AEUH01000100.1| GENE 15 10375 - 11946 1566 523 aa, chain - ## HITS:1 COG:STM0071 KEGG:ns NR:ns ## COG: STM0071 COG0318 # Protein_GI_number: 16763461 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Salmonella typhimurium LT2 # 15 517 11 513 517 397 39.0 1e-110 MTHSEFAPESTTVPQLWERQVEDRGDHEFLVFEDAHSHEVRRFTYREFDAAVNKCANALA ARGVGAGSRVVLYLDNCVEFVECTLALAKMGAVSVPVDATSAPFELGHILGICRAGTLIT RACSADEVLPHVPGTVTDVIVVGGDGAVGGDGIGLPRLRELASREADSFASSPRVRPSDL CEILFTSGTTSEPKGVMITHANFVFSGNFVNWELDMNPEDRYMTSMVAARVNYQLSALAP VLTAGATLVMLSRYRATRFWKQARAHRATLVQGMAMIVATMLRQSVDPGERDHQVREMHY FLPLSTQDKEAFEKRFGVSILNNYGSTECLIGAITDPPHGQRRWPSIGRAGPGYEVRIAD CQGNPLPEGEVGEIMLRGVPGVSLMAGYWDDPGATAEVLDEDGWFHTHDYAYIEDGWVYF MDRRVDLIKRSGESVSSAEVECTIEDLPGVREVAVIGVPDGVRGQAVKAFIVPEPGSGLT AATVIDHCAGRLAAFKVPSSVEFVDGLPRGSYGKVRKHILANY >gi|319978605|gb|AEUH01000100.1| GENE 16 11953 - 13173 1804 406 aa, chain - ## HITS:1 COG:STM0072 KEGG:ns NR:ns ## COG: STM0072 COG1804 # Protein_GI_number: 16763462 # Func_class: C Energy production and conversion # 
Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Salmonella typhimurium LT2 # 8 404 7 405 405 337 44.0 3e-92 MALPTQKPSFGVLDGVKVVYSAVEIAAPTAAAIMAEWGADVTWIENVWTGDSMRDTAWVK EMERRNMRSISLNPFTDDGKEALRAIVKDADIFIEAGKGPMYARKGITDEFLWEVNPKLV IVHVSGFGQWGDEEQISSAAYDLTVAAYAGIVAQNGSAEQPMNISPYLGDYINSLMIISS ALAALHRVGVTGEGESIDMAMYETLLRVGTYYMMDYMNADILYPRPGARHQNLCGIGIYE CEDGFIGLCLYGVAQNKYLLETIGLGHLWGTEEYPEGTSALWLNGPKADLIQDTLEEYLK TQSKYDVQRDFVAHRIAAQVVNEFPDILEAEHVKQRECFIDSEKADGTPVKVLNTFPKFR RNPGAFWRPMPALGEDTRDVLKRAGYSDEAIEALIESGTVKAGDED >gi|319978605|gb|AEUH01000100.1| GENE 17 13202 - 14335 1934 377 aa, chain - ## HITS:1 COG:ECs0042 KEGG:ns NR:ns ## COG: ECs0042 COG1960 # Protein_GI_number: 15829296 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Escherichia coli O157:H7 # 1 377 1 380 380 492 62.0 1e-139 MDFSLNEDQQLMVDAFTELMRSRNWDAYFHECDEKHEYPIEWTEAICELGFDRILLPEEY DGLGADWVTLTAAYEALGREGGPTYVLYQLPCWDTVLREGTEEQKEKILSFVGTGKQMLN YAMTEPSAGSSWDDMRTTYTRKDGKVYLNGHKTFITSSLYAPYLVVMARDSENMDTYTEW FVDMSLPGITKEPLGKLGLRMDSCCDVYFDNVELEEKDLFGKEGNGFRRGVADFDLERFL VALTNYGTAYCAFEDAAKYANQRIQGGQAIARHQLTQLKFAEMKVRLTNMRNMLYEIAWK HDNGLLSRGDCSMAKYYCSEASFGVVDRALQVLAGVGITGDHRIQRFYRDLRVDRISGGT DEMMILATGRSALYPYR >gi|319978605|gb|AEUH01000100.1| GENE 18 14368 - 15336 1357 322 aa, chain - ## HITS:1 COG:ECs0529 KEGG:ns NR:ns ## COG: ECs0529 COG0657 # Protein_GI_number: 15829783 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Escherichia coli O157:H7 # 35 316 36 316 319 236 42.0 3e-62 MGHKYDVPRLWTEQMGAVVAKQDELAAGAYVTGQSLEEMRKAYRTERAFWNEGGPSVASS TDRQVPTRYGQVRVRHYRPDAQGPLPLIVFVHGGGWVIGDVDTHDRITRALCRLTGAAVV SVDYTLAPDARFPQQIHECRDAVVHIREHAGEWGVDPGDVSFAGDSAGANMAAATMLMLR DEGLLPTARAMLLFYGAYGLKDSMSMRLLGGAWDGLTEADYAYYLGQYFADPADADDPYF NILGADFSAGVPPCYIAAVGLDPLRDDSRTLAEILRLCGVPHLYNEVEGVIHGFLHHSKM LDQTMEVLTEAARFHTEHPLHR >gi|319978605|gb|AEUH01000100.1| GENE 19 15345 - 16661 2129 438 aa, chain - ## HITS:1 COG:ECs3631 KEGG:ns 
NR:ns ## COG: ECs3631 COG0477 # Protein_GI_number: 15832885 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 16 434 5 415 425 128 25.0 3e-29 MLSLLTKERRAKDLTSFHRVFLLVLVSIGSSIVYTPAYLQYVFDKPLGKALIASGVATQD SVATTMGALLSAYSWTALVCYLPSGIIADRVRVRTLAWVGFGSTALLAYWYAFFPSYTAL IGLFIAMGITTILIWWGIRFKLVRLISEEEEYSRNIGISYGLYGLVGLLLGFLNAWIISL LAGQGDVIPMRAVLIVLGSVIMVLAVLSFFLIPKFEGEFGSGDEGFNFKQLGQVLSNPVV WLAAATLFFVYFFYTGVNQTTGYMNDAMHLNEDTVLMVATVRTYGVSLISAPVFGAVATK LGSPSKVIGTGSLVVLVGLVAFAFLPAQASFAVAAVGMAILLAFIANGVFGICSSQLTEG KVSVKVFGTATGLLSVIGFLPDTFSYIWFGSIRDAHQDNASVAYNQIFLILAGAALIAAV CAFALVVVARKRNAKKDA >gi|319978605|gb|AEUH01000100.1| GENE 20 16567 - 16794 161 75 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MELPMETRTSRKTRWKLVRSFARRSLVSRLSTATAFHDSRDRGIDLSQAPAAVSLISIGI PKEPSVPDIPEALLF >gi|319978605|gb|AEUH01000100.1| GENE 21 16801 - 16998 85 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDPDRPEWADWGGIGPMARGGGFWTNRVHNPRSKSWDQGQDKAGGDPVVLIVQRPLVLLM ALLRG >gi|319978605|gb|AEUH01000100.1| GENE 22 17505 - 18377 1012 290 aa, chain - ## HITS:1 COG:BS_yhfC KEGG:ns NR:ns ## COG: BS_yhfC COG4377 # Protein_GI_number: 16078082 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 19 256 1 234 258 98 33.0 1e-20 MRRDAFCPPLLEKDCEAVMVPTASLVCMGAAALAAFALPVVLIAVGRRRWRFSARSCVVG ALVFVVFALVLESAAHVAVFTALPALQRSAALYALYGALAAGVFEELGRVSGFALLRRID RGPDGVERALGAGVGHGGIEAVAVAGFGMVTSIVLSVTVVNAGAADSFLSQLPEALRGGY AQRLDALAATPAPLYLLGVGERVIAIALHIALSVLVWMAFSGRIGRWWILGAVLAHALCD VGAALYQGGAISVFAAEAWALLTTCAVVALVRRLYASTRTGGAGEAEPAP >gi|319978605|gb|AEUH01000100.1| GENE 23 18295 - 18477 91 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQTSDAVGTMTASQSFSSSGGQNASRRISRLYPLRRPEGVMSRCPSARARESCHRLPTAS >gi|319978605|gb|AEUH01000100.1| GENE 24 18564 - 19991 1715 475 aa, chain + ## 
HITS:1 COG:alr4238_1 KEGG:ns NR:ns ## COG: alr4238_1 COG4222 # Protein_GI_number: 17231730 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 40 407 9 389 760 367 54.0 1e-101 MKPRSVAAAACLALVLGAASAGPALAEAAQPPTTEALTVATFNASLNRNADGELLADLAT GGNEQASNVAETIQRVDPDILLINEFDYDAEGKAVDLFRTNYLEVAHNGAAPVSYPYAWS GPVNTGVPSGFDLDKDGTTTGPGDAWGFGKFPGQYGFTVYSKYPIKTDQIRTFRNFLWKD MPGALLPSNDDGTGWYSDEVLQRFPLSSKTHADLPVDVNGTTVHVLAAHPTPPSFDGPEQ RNKKRNFDEIRVWADYIANRADYLYDDNGVRGGLATDANFVILGDYNSDPLDGDSYPGAI DQLLTSPRVVDTMPTSAGGPAEAELQGGANLAHKTDPKYDTGDFTDDPRPGNLRIDYVLP NTGTQVDEAHVFWPTRDDELFRLTGVHPFPTSDHRMVWTRLRFPGSEPSAPGQPEPGDQD QAPPSGQGQAPSESDKSARLANTGASAGPLGAALLAAGTGAGLLVAQRRARRRRA >gi|319978605|gb|AEUH01000100.1| GENE 25 20172 - 20774 786 200 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPWLDSLARVSPLLADWKLLGNSYYECLVARPLTLDALRFRLWAGRYGADAPEYAASPAF SGDIDEGPAILSINTGVYVGPGHPGIHMLTTRFADIAPEVLEDFADAFIDMTVRALDPVH LSLCDLDVSADREWDMFIPGWKIYLPHTASPTRIQQAQQAATTTRHLATGTVHTIATPTT YPTRTTDWTKDPKPHTDPNA >gi|319978605|gb|AEUH01000100.1| GENE 26 21289 - 22968 2710 559 aa, chain - ## HITS:1 COG:MT0972 KEGG:ns NR:ns ## COG: MT0972 COG0166 # Protein_GI_number: 15840369 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Mycobacterium tuberculosis CDC1551 # 1 559 1 551 553 647 59.0 0 MGDIPAIDPTKAPAWDTLDQLAEEFDPDLRKFFADDPGRAERFTLEAGDLHVDLSKNLVC PTLVGHLLALAEQTGVADLRDRMFAGEHINVTEDRAVLHTALRRPASDSLVVDGQDVVAD VSAELAKIYSFANRVRSGQWVGVTGKPVRTVVNVGIGGSDLGPVMAYEALKPYVQEGLEC RFISNIDPTDAGETTKDLDPETTLVIVASKTFTTLETITNAKVVRAWLLGALRSRGIVTD AASEAAAIAKHFVAVSTALDKVAAFGIDPANAFGFWNWVGGRYSVDSAVGTSLAIAIGPE GFADFLDGFHKMDRHFVEAPPERNVPLLMGMLNIWYSNFLGADTHAVLPYSQYLHRFPAY LQQLTMESNGKSVRRDGAPVTYETGEVFWGEPGTNGQHAFYQLIHQGTRMVPADFIAFAN PTWALGDGDADMHELFLSNFFAQTKALAFGKTSEEVRAEGTPEAIVPARVFTGNRPTTSI MAPALTPSVLGQLIALYEHITFVEGAVWGIDSFDQWGVELGKVLAKQILPAIEGDAGALD AQDPSTRALIEYYRSKRTR >gi|319978605|gb|AEUH01000100.1| GENE 27 23059 - 
25605 3163 848 aa, chain - ## HITS:1 COG:SPBC16H5.02 KEGG:ns NR:ns ## COG: SPBC16H5.02 COG0205 # Protein_GI_number: 19112738 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Schizosaccharomyces pombe # 102 822 196 928 942 573 44.0 1e-163 MSDAAASSTVPLCATIHLTGSASKILSFGFAIALPQELADSAARNAGEAAPSASVRADQG TLRPAQAHPGRIQWAGERPFGPATARMERDDMDAFTQQTPTRIGILTSGGDAQGMNAAVR AVVRTALARGATPYAIMEGWQGAVDGGSAIKEMRWSDVSSILAEGGTVIGTARCAAFREY AGRHTAARNLLEHGIDHLVVVGGDGSLSGTDEFRREWAQHVQELAAEGAITEETARAHPA LVVVGLVGSIDNDMVGTDMTIGADTALHRIVDAIDQLTSTAASHQRAFVIEVMGRHCGYL PLMAAVAGGADYVFTPEDPAGPGWEDELARHLHFGREAGRRESIVLVAEGAKDREGNELT TQHIADTIKERTGEDARVTILGHVQRGGTPSAYDRWQSTLLGYAAVQEVLASTGEDEPCI LGVRRGRITRIPLMKAVRDTRAVKDLIAAGDFEAAQVSRGASFRAMVGVNQILSTPPQLA AGGDGGGKRVAILHAGGLAPGMNTAARVSVRLGIAKGWTMLGVDGSWSGLADDRVRELSW GDVEGWAFKGGAELGTKRDVPPVEQYYALGRAIERNSIDALIVIGGLNAYLGVHAMTGER DRYPAFKIPMILIPASIDNNLPGCELAIGTDTAINNATWAIDRIKESAAASKRCFIAEIM GRRCGYLTLMTALATGAEYMYINEDAPSLERIAADSQRMVASFKGGRRLFLTLVNESTSE FYDREFLADVFNAEAQGLYDVRHSALGHLQQGGAPTPFDRLLATRLVNRALGHLEGQFER GDTNATYIGQIGGGIEARLVKNMFDDLDIVNRRPYDQWWRDLLPVQRIVSLANPGIEAAP ISIDDPES >gi|319978605|gb|AEUH01000100.1| GENE 28 25639 - 25896 350 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLGALTGCVIVGGIGSVVGLTVLDHPHRAVWALVATLVVVAGARLAIPGRPWFTSRYRGA DAAVLLAVAGAIAYLSSYTSTMAVH >gi|319978605|gb|AEUH01000100.1| GENE 29 26048 - 27547 2045 499 aa, chain + ## HITS:1 COG:Cgl2696 KEGG:ns NR:ns ## COG: Cgl2696 COG0280 # Protein_GI_number: 19553946 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Corynebacterium glutamicum # 80 499 14 459 461 393 53.0 1e-109 MTSRYIVVGSGSHEAAALVLDQLATAFGASSSVARVPAFAGTDADSAALGAASALPDAVG AVRERTADVVLIESSSACANRSFDAPGWDFSLAASVGAGVVLAPDTEGVGAELLAQEAAT AVSRAADHQAAVVALALPAALVGRVDSPVPVLPLPVDGGGLDALARAAAPSAVTPLAFQA DLVERARADRKRIVLPEPDDDRVLRAAAQVLAAGIADITFVGDADYVAKRAGELGLDLGA AQVVSTGDPAYLERYAEEFARLRAKKGVTLEQAREKVQDVSYFGTMMVHMGDADGMVSGA 
AHTTAHTIVPSFQIIKTAPGVSVVSSIFLMLMKDRVWAFGDCAVNPNPTPEQLADIAVTS ARTAAQFGVDPRVAMLSYSTGSSGSGPDVDAVVEATRLAREKAPELAIEGPIQFDAAVDS AVAEKKLPGSEVAGRATVFVFPSLEAGNIGYKAVQRSSGAVAVGPVLQGLNKPVNDLSRG ALVEDIVNTVALTAVQAQG >gi|319978605|gb|AEUH01000100.1| GENE 30 27630 - 28829 1610 399 aa, chain + ## HITS:1 COG:Cgl2695 KEGG:ns NR:ns ## COG: Cgl2695 COG0282 # Protein_GI_number: 19553945 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Corynebacterium glutamicum # 7 399 5 397 397 387 50.0 1e-107 MPSQTVLVINSGSSSIKYQLVDPETGDALAKGLVERIGDPMGVITHTHGGAVTEEEMPVP DHTVGMREVLRLFDTEGPTLAEAGIVAVGHRIVQGGRHFDGPALITDGVRDLIEELCPLA PLHNPAHLKGIDVARELMPGVPHVAVFDTAFFQQLPDRSALYALETETAEKYSVRRYGAH GTSHQFVSQEIAKMLGRDDLKQIVLHLGNGASASAVRCGKPIDTSMGLTPLEGLMMGTRT GDIDPAVVFHLERVAGMSVGEVDTLFNKKSGMKGMTGESDMRSVWAMIDNDDDPVAQKRA RTAMDVYVNRLVKYVGSYTAELGGLDVITFTAGIGENDVHVRRELAEALSPFGVKIDVEA NKARSGEPRVISAPDSTVELVVFPTNEELAIARQALAFA >gi|319978605|gb|AEUH01000100.1| GENE 31 29217 - 29663 329 148 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLVAYLDEVAEAGAFVSKAHKNYNGSPAFGSSCVFSAPVEDAYSSTGRRRRWALLRNGGG KTPTIDDSPYEWVPTAFEELMRRKPTIESKLYLSHSRGGVRDLHNSDIFARERPALAMSI SQRISPEEKARMEAAKAGMLVVARASGL >gi|319978605|gb|AEUH01000100.1| GENE 32 29796 - 30641 962 281 aa, chain - ## HITS:1 COG:no KEGG:AAur_3480 NR:ns ## KEGG: AAur_3480 # Name: not_defined # Def: hypothetical protein # Organism: A.aurescens # Pathway: not_defined # 1 276 19 299 300 278 54.0 2e-73 MDRYSTRGQVEFVLRSRGRTLEAVERAHEAHVSALARVRAGIPRGWASADVERDSLSRFL FAPEDVIVVVGPDGLVANTAKYVADQIVIGVDSAPGSNAGVLVRCTPDQGVSVCRRLDEG ERVGVDHLTMVRATVDDSRSLTALNEVFIGHPGHQSARYELALPRRAERQSSSGVVVSTG TGATGWGASLKRGRGMGDLPGPTSQSLAWFVREAWPSPCTGTECTEGILGEGEELGLSVA SESLVLFGDGMEADRLVLTWGQTVRVCRAPRALALVDPEGL >gi|319978605|gb|AEUH01000100.1| GENE 33 30694 - 31704 1492 336 aa, chain - ## HITS:1 COG:no KEGG:Kfla_0428 NR:ns ## KEGG: Kfla_0428 # Name: not_defined # Def: band 7 protein # Organism: K.flavida # Pathway: not_defined # 1 326 1 324 331 186 38.0 7e-46 
MGTLHHHPLVIRYQGGPGDHVIQIRAGRTIRSGVGQSFWLRAGRSALAEVPIADRAHSFL VQTPSADQQNVNAQVSITYRIEDPEAAAKHYDFGLYPRGASADPQGLWQIDELVTRLASS SLASAIAAMPLSEAISGSLDEVGAALVLAFERDEQLRATGVGVVDARLMALRPDEGVESS LRAPLLERLQAEADRALYERRALAVARESQISENELQSKLDLARKRADLVDQEGRNSRRE AEEKAAADAIAVEAEARRIETLAKANEDDWKRVGGAKIANEAAYVRALAEAGPDVVRAFA LKEMAHNMPSIGSVTITPDMLTDVLSAFAGSAKRGE >gi|319978605|gb|AEUH01000100.1| GENE 34 31703 - 31798 132 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAFSLLGWFPLVYVIPTITSRLFNVNSTQTG >gi|319978605|gb|AEUH01000100.1| GENE 35 31808 - 32563 836 251 aa, chain + ## HITS:1 COG:DR0192 KEGG:ns NR:ns ## COG: DR0192 COG1051 # Protein_GI_number: 15805228 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Deinococcus radiodurans # 9 223 15 225 225 126 43.0 3e-29 MITPLAFPIAVDVVALTVIDRALHCLVVTRGIDPYRGRPALPGGFVRENEQTLQAAEREL EEETGITPPGHLEQLRSYGPQGRDPRGPVLSVAHLLLAPRFSPARPGGDAAGAHWRPVDA LLAPGSPLAFDHAAILADGVERARSKIEYSPLATSFCGPEFTITQLRTAYEAIWGAPLDP RNFHRKATKTASFIEATGRTAREGAGRPAALYRLTPGVDETAFVLDPPLRRPAGPAPAPP TTAPPPPIARQ >gi|319978605|gb|AEUH01000100.1| GENE 36 32585 - 34669 2966 694 aa, chain + ## HITS:1 COG:ML1463_2 KEGG:ns NR:ns ## COG: ML1463_2 COG0171 # Protein_GI_number: 15827765 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Mycobacterium leprae # 335 688 2 356 359 522 69.0 1e-148 MNFHSLYDQGFARVAAVTLPVHPARPADNAREIIDAARQLSERGVALAVFPELCVSGYAL DDLLLQDTLLDNVEKALASIVGASAGLLPLLVVGAPLRKDNALYNCAIAIHRGRVLAIIP KSHLPNYREFYEKRYFVTMPPRACERIEAPWGGIEEFSGAPVWVPFGQVLLSAADVPGLT IGIEICEDMWVPVTPATELALAGATVLANLSASPITVGRGADRELMVRSVSARCSAAYVY TAAGMGESSTDLAWDGETMVYEAGDRLAIGERFQEGAHMTIADVDLERLRTERKRQNSFT DNAQRYFAGDERMTPQEVEFTLNPPRTDLGLQRPVDRFPFVPDDPSRLEQDCYEAYNIQV AGLVQRLRAIGDPKVVIGVSGGLDSTHALVVASRAMDLLGRPRTDILCYTLPGFATSERT KKNATLLCRYLGTSFQEIDIRPAATQMLADIGHPYGEGEAAYDVTFENVQAGLRTDYLFR LANHLGGIVLGTGDLSELALGWCTYGVGDQMSHYAVNTGVPKTLMQHLIRWVVASKQFDD HVGEVLLSILNTEISPELVPAKPGEKMQSTQDKIGPYNLQDFTLYHVLRRGARPSKIAFL 
AEKAWSDASVGDWPAGFPEEDKAAYSLAEIVKWERLFLQRFFSQQFKRSALPNGPKVMAG GSLSPRGDWRMPADVSGADWVAELDGAVEGLFEA >gi|319978605|gb|AEUH01000100.1| GENE 37 34747 - 34903 189 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGTAAQWAFVLVGVGLAALGAFELAVGVCARAGRASGRLALPLPVEAGSARW Prediction of potential genes in microbial genomes Time: Thu May 12 17:43:58 2011 Seq name: gi|319978595|gb|AEUH01000101.1| Actinomyces sp. oral taxon 178 str. F0338 contig00101, whole genome shotgun sequence Length of sequence - 8166 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 221 183 ## gi|154507777|ref|ZP_02043419.1| hypothetical protein ACTODO_00259 2 2 Op 1 . - CDS 291 - 1055 944 ## Ksed_23700 hypothetical protein 3 2 Op 2 . - CDS 1061 - 1840 685 ## Ksed_23690 hypothetical protein 4 2 Op 3 . - CDS 1840 - 2805 1227 ## COG1131 ABC-type multidrug transport system, ATPase component 5 3 Op 1 . + CDS 2924 - 4168 806 ## Ksed_23670 signal transduction histidine kinase 6 3 Op 2 . + CDS 4165 - 4896 684 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 7 4 Op 1 . - CDS 5025 - 5396 515 ## HMPREF0675_4545 cupin domain protein 8 4 Op 2 1/0.000 - CDS 5393 - 6214 928 ## COG0500 SAM-dependent methyltransferases - Term 6545 - 6592 18.3 9 4 Op 3 . 
- CDS 6605 - 8164 2003 ## COG1109 Phosphomannomutase Predicted protein(s) >gi|319978595|gb|AEUH01000101.1| GENE 1 3 - 221 183 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154507777|ref|ZP_02043419.1| ## NR: gi|154507777|ref|ZP_02043419.1| hypothetical protein ACTODO_00259 [Actinomyces odontolyticus ATCC 17982] # 1 71 49 117 118 62 57.0 1e-08 GSARWERAHRAAWPVLFAGGVLGASQGTALAATALMWPQGALSVCAVLAVSGLVVEAGLW RVAGAAARSRLD >gi|319978595|gb|AEUH01000101.1| GENE 2 291 - 1055 944 254 aa, chain - ## HITS:1 COG:no KEGG:Ksed_23700 NR:ns ## KEGG: Ksed_23700 # Name: not_defined # Def: hypothetical protein # Organism: K.sedentarius # Pathway: not_defined # 3 242 1 239 248 156 45.0 8e-37 MGLVAAEFLKLRRSRVGVFAVLLPLLATVAGAVNYWVNRDALNGGWDSLASQVTLFYGLM FFNLGVALMAASVWRPEHRDASWNIVLTSGHRPWAVVLAKSVVLVAPVAVAQVVLLALTW ALGLTMGMPGWPGAAFALSCALAVVIAVPLILAQSLLSMLMKSFAAPVAVCLLLSGIGFG SIAASTGVVGALSYALPQGLASRALFLGSTAVTVTGGLSLQSVAPLLAGCAVLSVLLVCA TVLIAPRRTIEAHT >gi|319978595|gb|AEUH01000101.1| GENE 3 1061 - 1840 685 259 aa, chain - ## HITS:1 COG:no KEGG:Ksed_23690 NR:ns ## KEGG: Ksed_23690 # Name: not_defined # Def: hypothetical protein # Organism: K.sedentarius # Pathway: not_defined # 11 259 5 250 250 198 46.0 2e-49 MVAMARRLLLVRLEAAKMRRLHTIPVVVVAVVAVVALSCMNLFRPEAPVVGADPDAWPWA SLLLNAAMMNALVHPVLVAVIASRQTDIENTGAGWTLNSTTGVGPGALCRAKTALLVAVL ALAVAAETAATIGLARARGYTVPLEWGQWTQYAVLLWLVDGVFVGAHVWLAAKWDNQLIC VGVGLLGAFTAMYMFLAPSAIARMVPWGYYAVITNTRLVGRQAGGVHYGSVDLPWVVGFL LLALAALAWATHRMDRVER >gi|319978595|gb|AEUH01000101.1| GENE 4 1840 - 2805 1227 321 aa, chain - ## HITS:1 COG:BH0445 KEGG:ns NR:ns ## COG: BH0445 COG1131 # Protein_GI_number: 15613008 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Bacillus halodurans # 26 321 7 306 306 212 39.0 9e-55 MTAHPPLRMAPRTSPRPPAYDGPGLVVATSDLTKSFAGRTVVDSLDLRVPAGCVYGFLGP NGSGKSTTMKMLLGLLRPTRGRISVFGTPLTGANASAMMPRIGSMIERPPGYGHLTGAEN MAVVARMLGLSREQTERALALVRLTEHKNRLQRTYSLGMKQRLGIAMALARDPQLLVLDE 
PTNGLDPAGIEEIRELLIHLASEGVSVMVSSHLLDEIDKMASVLGILSAGRLVFQGTRAE LFARSLPDLFVSTPQAREALALGLPGRAVAGGIMVPGLDQGGAARMIEALVRHGITLYEV RRADQSLEDVFMALTGGAGAL >gi|319978595|gb|AEUH01000101.1| GENE 5 2924 - 4168 806 414 aa, chain + ## HITS:1 COG:no KEGG:Ksed_23670 NR:ns ## KEGG: Ksed_23670 # Name: not_defined # Def: signal transduction histidine kinase # Organism: K.sedentarius # Pathway: not_defined # 4 408 1 395 404 118 33.0 4e-25 MKQLPPPSLPPAPPRPRPVPHWTDALLGLSVAVLSSSGLVTAFPGGAAVFRVLAVLCGAA IAFRRVFPLASVLVIGAVLVVHVLLIEDLTLVALVGALVAVWTTQSRLAPPWRWALLVFF CSGAVAAVSVRAVRAYEPTDLRGFAVLGTATALVLAVAALGGARSRSIRDRRHLAAERLA MLEEQQHTLARLAVAHERNRLAGDLHDLLGHTLTAISAQAEGARCVLAADPARADEALAT IARISREGVDQVHDMVALLRGDGVPSPSAVPGAAGTTAEGDGRAATADAVGAGGGLWERV SALVVACAAPVRPDLDVSAPPAMAPEQEDAMVRNCREALTNAMRHRAPGPISVAGRSMGP QVSLTVENPIAGVPSARTGGLGVVSMASRAHGAGLRCGVGPRGGVWRVEMEAGV >gi|319978595|gb|AEUH01000101.1| GENE 6 4165 - 4896 684 243 aa, chain + ## HITS:1 COG:BS_yxjL KEGG:ns NR:ns ## COG: BS_yxjL COG2197 # Protein_GI_number: 16080942 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Bacillus subtilis # 5 230 6 217 218 142 37.0 4e-34 MSAPIRIGLADDEPLFAQGLAMILGSQQDFAVRWRAVHGQDALVRARRDPVDVLLMDVQM PVMDGIEATRRLVDEGTASRIVILTTFNTDDSVLRGIDAGASGFLLKTTPPAGLFDAVRT VHSGDSVISPGPTSTLLRALRAPGAPPLAAVPTPADAAADGRGILERAGLTGKEVEILRL IARGLTNQEICDAQWVSMATVKTHVGHLLAKTGARDRVQLVLLALRAGLMDADMVLSPRP SHG >gi|319978595|gb|AEUH01000101.1| GENE 7 5025 - 5396 515 123 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0675_4545 NR:ns ## KEGG: HMPREF0675_4545 # Name: not_defined # Def: cupin domain protein # Organism: P.acnes_SK137 # Pathway: not_defined # 19 115 17 113 115 119 62.0 3e-26 MSLSADSSTERTTPVMTDLQDVAAMVAVNEGATVSRTVMHAEGVRLVLFSFDTDEYLSEH TAAMPVLLFALEGSLEIEADGRVVVLKPGDVIHFGTRLPHAVRALEPSKLALYMLDKREK PQQ >gi|319978595|gb|AEUH01000101.1| GENE 8 5393 - 6214 928 273 aa, chain - ## HITS:1 COG:MT2697 KEGG:ns NR:ns ## COG: MT2697 
COG0500 # Protein_GI_number: 15842162 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Mycobacterium tuberculosis CDC1551 # 1 272 1 272 273 263 57.0 3e-70 MSDESSEASTPLPFSQRPVDKAPGHWVLARAGKRVLRPGGAALTRSMLSRAGLAGADVVE FAPGLGVTAGWILEGGPASYTGVEQDPDAAARVGRIVRGRGRCVNADARRTGLDSGCADV VVGEAMLTMQSDRVKAEIVSEAARLLRPGGRYAIHELALAPDDIDEAVATDLRKALARAI NVNARPSTVKAWKEHLEAAGLVVQHVGTAPMALLSPGRVVADEGVGGAARIAWNLARDKD LRARVLTMRRTFKKYEKNMRGVAIVARKPKEDE >gi|319978595|gb|AEUH01000101.1| GENE 9 6605 - 8164 2003 519 aa, chain - ## HITS:1 COG:MT3407 KEGG:ns NR:ns ## COG: MT3407 COG1109 # Protein_GI_number: 15842899 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Mycobacterium tuberculosis CDC1551 # 1 509 26 526 534 358 47.0 2e-98 EVAAAMAGPLEFGTAGLRGRMGPGESRMNLAVVIRATAGLCAFLTGTIDRAPRVVIGCDA RHGSVDFALAAARVASAAGCHVLKLPPMNPTPLTAFSMRHLDADAGIMVTASHNPAMDNG YKVYLGGAVVAGAGQGVQIVPPYDAEIAAAIASAPPADQVAQDDARIEPVDPRDAYVAAA SALARGSAAEKAALRITLTAMHGVGAALTTRVLAEAGFSDVRLVAEQAEPDPDFSTVPFP NPEEPGALDLAMERASEEGSDVIIAVDPDADRCAVAVPDPGSARGWRQLTGDETGSLLGD YLAGRAPRGAVLANSIVSSRMLERIAAAHGLDYSPALTGFKWIARVPGLFFGYEEAIGFC PNPEVVRDKDGIATSVVVASLFAALKAQGRTAADELERLARAHGLHMTAPLTFRVDNLGL IAEGMERLRSAPPSTLAGSPVESVIDLSEGYQGLAGTDALLVATEAGDRVVARPSGTEPK LKCYLEVVLPVEDGAPVPWDEARARLERIKEEFGAAIGI Prediction of potential genes in microbial genomes Time: Thu May 12 17:44:21 2011 Seq name: gi|319978590|gb|AEUH01000102.1| Actinomyces sp. oral taxon 178 str. F0338 contig00102, whole genome shotgun sequence Length of sequence - 2695 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 106 64 ## 2 1 Op 2 1/0.000 - CDS 117 - 770 919 ## COG0274 Deoxyribose-phosphate aldolase 3 1 Op 3 . 
- CDS 819 - 1526 1143 ## COG0813 Purine-nucleoside phosphorylase - Prom 1587 - 1646 2.0 4 2 Tu 1 . - CDS 1659 - 2651 1171 ## COG2390 Transcriptional regulator, contains sigma factor-related N-terminal domain Predicted protein(s) >gi|319978590|gb|AEUH01000102.1| GENE 1 1 - 106 64 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MELLDRAQQWADHDPDGRTAAALSASVAAARGGDA >gi|319978590|gb|AEUH01000102.1| GENE 2 117 - 770 919 217 aa, chain - ## HITS:1 COG:BH1352 KEGG:ns NR:ns ## COG: BH1352 COG0274 # Protein_GI_number: 15613915 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Bacillus halodurans # 6 214 5 217 224 170 50.0 2e-42 MKRTDVARMIDHTILKPEATSADVARIVAEGAELGTYSVCVSPSMLPLSAPPGLKVACVV GFPSGAVKPEVKAFEAARAVADGADEVDMVINIALVKEGRADELEAEIRAVREAVPAPGI LKAIIESAALTDEEIVMACGAAARAGADFVKTSTGFHPAGGASAHAVALMRATVGDALGV KASGGIRDAATALAMIEAGASRLGVSATVAILAGLDD >gi|319978590|gb|AEUH01000102.1| GENE 3 819 - 1526 1143 235 aa, chain - ## HITS:1 COG:SA1940 KEGG:ns NR:ns ## COG: SA1940 COG0813 # Protein_GI_number: 15927712 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Staphylococcus aureus N315 # 2 233 3 235 236 318 64.0 4e-87 MKSTAHIDPQAPIAPTVLMPGDPLRAKFIAENYLEDAEQFNSVRNMLGFTGTYQGTPVSV MGSGMGIPSISLYAWELIHVFDCKRLIRVGTCGALQEGINLYDVVIAQAACSNSAFMDQY NLPGTYAPIGSYRLIEAVRELAKSKGVTSHVGNILSSDTFYNADPTFNDRWKKMGIMAIE METAGLYATAAEAGVEALGIFTVSDSIVTGAVTTPAERQTAFTQMMELSLSLAGI >gi|319978590|gb|AEUH01000102.1| GENE 4 1659 - 2651 1171 330 aa, chain - ## HITS:1 COG:lin2104 KEGG:ns NR:ns ## COG: lin2104 COG2390 # Protein_GI_number: 16801170 # Func_class: K Transcription # Function: Transcriptional regulator, contains sigma factor-related N-terminal domain # Organism: Listeria innocua # 1 311 1 311 315 224 37.0 1e-58 MDKRDEQAVTAVKLYFERGLSQAEVATAMGLSRPTVAKLLQRGKEAGFVTIAIHDPRETS SELARRLEERFGLAEARVVHMDVPGGTDLLDELGRAGADLVVELVHDGMSVGISWGRTMS AVASHLRHTARSDVTVVQLKGGSSYSELATNDFEIMRAFCDALNATAMYLPLPAIFQDVR 
TLSVVKRDPHIARILQAGRRTDAVVFTVGSVGRQSLILNLGHLNDQEVEELVERSAGDAC SRFYTREGACAVPAIDKRTAGISLEDLASRPIRILVAGGQHKAVALGTALRMGLASHLVT DQRLATALLEEASEGAGRDPGAGPGADGGV Prediction of potential genes in microbial genomes Time: Thu May 12 17:44:27 2011 Seq name: gi|319978586|gb|AEUH01000103.1| Actinomyces sp. oral taxon 178 str. F0338 contig00103, whole genome shotgun sequence Length of sequence - 3131 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 12 - 1295 1612 ## COG0213 Thymidine phosphorylase 2 1 Op 2 3/0.000 - CDS 1365 - 1784 529 ## COG0295 Cytidine deaminase 3 1 Op 3 . - CDS 1808 - 3082 1838 ## COG1079 Uncharacterized ABC-type transport system, permease component Predicted protein(s) >gi|319978586|gb|AEUH01000103.1| GENE 1 12 - 1295 1612 427 aa, chain - ## HITS:1 COG:MT3415 KEGG:ns NR:ns ## COG: MT3415 COG0213 # Protein_GI_number: 15842906 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine phosphorylase # Organism: Mycobacterium tuberculosis CDC1551 # 4 426 6 426 427 432 60.0 1e-121 MEQFDAVDVIRTKRDGGALSHDQIRWVVDAYTRGVVKDEQMAALAMAVFLRGMGREEISQ WTRAMIESGERMDFSGIGRPTADKHSTGGVGDKITLPLAPLVATFGVAVPQLSGRGLGHT GGTLDKLESIPGWRASLSNDEVLHQLGEGCGAVICAAGSGLAPADKKLYALRDITSTVDC IPLIASSIMSKKIAEGTDSLVLDVKVGSGAFMKDIDDARELARTMVDLGSDAGVRTRALL TDMSTPLGLKVGNALEVAESVEVLAGGGPADVVDLTVALAREMLESAGVRDADVEAALAD GRAMDTWRAMIREQGGDPDAPLPVSKHTHTVSAEADGVLEALDALAVGVASWRLGAGRAV KDDPVQAGAGIELHAKPGQRVSKGQPLLTLHTDDEWRIPRALESLEGGIGIGDSAPEERG VVLERVE >gi|319978586|gb|AEUH01000103.1| GENE 2 1365 - 1784 529 139 aa, chain - ## HITS:1 COG:ML2174 KEGG:ns NR:ns ## COG: ML2174 COG0295 # Protein_GI_number: 15828164 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Mycobacterium leprae # 1 127 1 127 134 151 58.0 4e-37 MVDVDWQRLSALAVEAMERAYCPYSGFPVGAAGLTADGRLVSGCNVENAGYGCTLCAECS 
MVSELVRTGGGRLVAVACVNGNKEPVAPCGRCRQVIYEHGGDECVVLMPAGAMTMREVLP GAFGPDDLGEVAGAAAPRG >gi|319978586|gb|AEUH01000103.1| GENE 3 1808 - 3082 1838 424 aa, chain - ## HITS:1 COG:alr5368 KEGG:ns NR:ns ## COG: alr5368 COG1079 # Protein_GI_number: 17232860 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Nostoc sp. PCC 7120 # 126 422 11 304 312 205 39.0 2e-52 MSEARTMEDIQVKRSWKMPVVYALAALLLGLFALTARGDVTLRLNDKSQSLDIPDIVAAG TPILCVLFVINAVIAAWSAYSTFQRRTNTRALETTMMGAAGAATVLGFLVFAGSGSNGAV TLTSTLVSTVAISTPLIFGALSGVVSEHVGVVNIAIEGDLLVGAFAGVMAASFFRTPYAG VVAAPIAGALLGSLLALFSVKYGVDQIIVGVVLNVLALGLTTFFYGTLMKDAPQDLNTNQ FALSAVKIPLLAEIPIIGPVFFNQTILVYLMYAAVVALTVFLYRSRWGLRLRACGEHPRA ADTVGINVNRTRATNAILGSAFAGLGGAFFTLGSGLSFTDNISAGNGYIALAAMILGKWH PLGAMGAAIMFGFAQAVARLLPNIEPSIPSDLVSMIPYVVTIIAVAGFVGKSRAPAAENV PYVK Prediction of potential genes in microbial genomes Time: Thu May 12 17:44:32 2011 Seq name: gi|319978574|gb|AEUH01000104.1| Actinomyces sp. oral taxon 178 str. F0338 contig00104, whole genome shotgun sequence Length of sequence - 14320 bp Number of predicted genes - 13, with homology - 6 Number of transcription units - 9, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 24/0.000 - CDS 2 - 1181 1775 ## COG4603 ABC-type uncharacterized transport system, permease component 2 1 Op 2 15/0.000 - CDS 1178 - 2752 173 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 3 1 Op 3 . - CDS 2854 - 3945 1867 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 4 2 Tu 1 . + CDS 3842 - 4063 196 ## - Term 4120 - 4187 21.0 5 3 Op 1 1/0.000 - CDS 4193 - 6490 3092 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases 6 3 Op 2 . - CDS 6601 - 9093 2650 ## COG2374 Predicted extracellular nuclease 7 4 Tu 1 . - CDS 9931 - 10116 106 ## 8 5 Op 1 . + CDS 10605 - 10808 218 ## 9 5 Op 2 . 
+ CDS 10747 - 11142 575 ## 10 6 Tu 1 . + CDS 11845 - 12411 736 ## 11 7 Tu 1 . + CDS 12875 - 12940 75 ## 12 8 Tu 1 . + CDS 13105 - 13962 590 ## gi|293190865|ref|ZP_06609027.1| hypothetical protein HMPREF0970_01360 - Term 13939 - 13978 0.7 13 9 Tu 1 . - CDS 14128 - 14319 59 ## Predicted protein(s) >gi|319978574|gb|AEUH01000104.1| GENE 1 2 - 1181 1775 393 aa, chain - ## HITS:1 COG:TM0104 KEGG:ns NR:ns ## COG: TM0104 COG4603 # Protein_GI_number: 15642879 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Thermotoga maritima # 22 364 8 341 344 188 37.0 2e-47 MNKTAHQSKAVVVLRHLLGTRWFVAILAVLIAFALGAVLIVMAGASVSEAYYAMFRGAVF DPNAASFQRQIKPLTDSLFYSIPLIIGGLGLALGFRAGLFNIGGQGQVVFGALAAVWVGF SLRLPPVVHTAVALGAAMLAGALYAGIAGVLKARTGANEVIVTIMLNSIAGLALGYILSQ KAWQVAGSNQPQTPKVAETAAFTRLLPAPFKLHGGIVIALLAVCVFWWLIERSTLGFQLR AVGANPDAARTAGISVGRVTAVTMAISGAFMGLAGANEALGTIGYVSRDVAGSIGFDAIT VALLGRNKPLGTLGAGLLFGAFKAGGYTMQAKGVPIDMILILQSVIVLLIAAPALVRWLF RLPKENKTGLRAQLAALGADPAEASAGTVAVGP >gi|319978574|gb|AEUH01000104.1| GENE 2 1178 - 2752 173 524 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 252 498 127 366 398 71 30 4e-12 MKLELKGITKRFGPLVANNNINLTIEEGHIHALLGENGAGKSTLMNVLYGLHEPDEGEIL IDGESVRFNGPGDAVAAGIGMVHQHFMLIPVFTVAESIALGFEPVGRAGLIDTAKARKTV EEVSARFGFDLDPDALIEDLPVGAQQRVEIVKALSREAKVLILDEPTAVLTPQETDELMA IMRQLAESGTSIVFITHKLREVRAVADDITVIRRGAVVGTASPQSSESELANKMVGRSVM MKVEKAPASPSGGGLVFDDVSLLSPSGVPVLDHLSFGVERGEIVAVAGVQGNGQTELAEA VLGLQTPDSGTISLEGADITRTNPRASLDAGIGFIPEDRTTDGIIASFSIADNLVLDQFG EAPFSKGPMLKLGVIDDNARAKQQEYDIRLTDVSDPISSLSGGNQQKVVVAREMSKELNL LVANQPTRGVDVGSIEFIHKRIVGVRDQGTPVLLISSELDEVLSLADRVVVMYRGRIMGI VPAGTPRDVLGLMMAGVPLDEAMSASEEHSDGAGEGRTMEGGTR >gi|319978574|gb|AEUH01000104.1| GENE 3 2854 - 3945 1867 363 aa, chain - ## HITS:1 COG:TM0102 KEGG:ns NR:ns ## COG: TM0102 COG1744 # Protein_GI_number: 15642877 # Func_class: R General 
function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Thermotoga maritima # 40 356 19 340 359 132 30.0 9e-31 MKTLTRILAITGAASIALAGCSGGGGTTDNGGGSTEGKDFKACAVSDAGGWDDKSFNESA YEGLKAAEKNLGVKINTAESSSDADFQPNVDSMISDGCNLIIGVGFKLEQALHHSAEENK DLHFALVDSGFVDDQNQSVNLENGRPLVFNTAEAAFLAGYAAAGMTKTGKVATFGGIQIP SVSVFMDGFADGVDAYNKAKGTSVQLLGWDKASQNGSFTQSFDDQALGKQQAQQFIDQGA DIIMPVAGPVGLGAAAAALADGNTRIIGVDTDWYEANPDYKSIVLTSVMKEIGAAVEQAI KDSVGGNFKSDAYVGSLENGGVSLAPFHDFDSQVPQELKDELTSLTEQIKKGEVKVESQN APK >gi|319978574|gb|AEUH01000104.1| GENE 4 3842 - 4063 196 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEPPPLSVVPPPPEHPARAIDAAPVIARILVRVFIIRPSKWWFVLERVVARATGPAPQLR GQSPVVNFCKTRN >gi|319978574|gb|AEUH01000104.1| GENE 5 4193 - 6490 3092 765 aa, chain - ## HITS:1 COG:Cgl0328 KEGG:ns NR:ns ## COG: Cgl0328 COG0737 # Protein_GI_number: 19551578 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Corynebacterium glutamicum # 75 661 34 605 694 299 36.0 1e-80 MHTTTRRAVVAGCAVFALFAAPMAALAEPAGPGAPQSGSAQSGAADAPQSEPAAAQGAQG TAESTGGAGQAEGVVTLDLYNLTDVHGHIEQVTRKGTVTEAGLPAMNCYLKQASGTNPNS SFTLLGDNIGASPYTSGALNDNPTIAALNTMNPLASTIGNHELDMGQAVFKQRVDGSNPD EYVQVKFPYLGANIEGMGTWGSENTPYLGDYKVWASPSGVKVAFIGAIAQDVPYKLSPGT TDGLTFTDPIAKIDALAKKIKDSGEAQIVIAMLDDDVKNNYPKVGANVDGLMGGDTHVPY EFDKVDSVEKLNSANPMLAGIASGSYTDNLGLIRIAYDTTSHKVVSADSRLIPAADVAKC GADPATQAVVDKAVADSKEAGQRVVAKGYTQTFARGVFTTPDGATEPGSNRGIESSLGDF VADAMRETITTPDGKPVDIGMINAGGLRADLVPNSDGTITYAQSYAVLPFSNELGYVTLK GADVKDALEQQWKTDLNSQNSRPLLKLGLSDNVQYTYDPARPYGSRITSMTVNGEPLDPE RTYTVGSVNFLLEGGDSFDALTRGGAATTNGNLDRDKFNEFLGAHSGAAPRSLKASVGIT LPAEPVADGEAVDVALRGLSFSEGPSVTSKVRVSLGGDSVEGGVDNSLVDAHASDEAAVI TTDGAGQATLSVTAVGQCEGKAAGEVVKAPVTVDTDFGRVVEAGAGLTVDVKCAGGAPGP SQSQAPGANRSGALAKTGTSAGLLTLVAVVLAGAGCAVLRARRKR >gi|319978574|gb|AEUH01000104.1| GENE 6 6601 - 9093 2650 830 aa, chain - ## HITS:1 COG:Cgl2538_1 KEGG:ns NR:ns ## 
COG: Cgl2538_1 COG2374 # Protein_GI_number: 19553788 # Func_class: R General function prediction only # Function: Predicted extracellular nuclease # Organism: Corynebacterium glutamicum # 105 756 187 816 816 410 42.0 1e-114 MRHKRKRTALAALAALSLIGLPTAAHAVAGAEQAGVAPIGVPAAAAPGVVPVGGPGAGLA APTAVGADSAATTPVGADNAARDEEDARAQNGQDGQDARSADGDAPGAAAEGAQSPDDSQ SPEDAQGADGAGDQAESAGAGATTPIPDIQKTGEGDDSALVGKTVTAVGVVTSAHPKGED GLASTLDGFTIQTPGTGGLQGAERQSSDGLFAYVGKADVAMPTIGQCVRVTGKVSEYPAT GPKTPAQTQSLTQLAVRSVETIDGCDPVKPVPLTAVPAPAQMEALESMLVEPQGTWTITD NYQANQYGTLSLTPGSEPLRQATDAVAPGDEARAMEADNAARTIALDDGTSTNLQRGAAT GVPYAYLANGSPARVGYHVGFTGPVVLDSRHSAFVFQPTRMVAAHPDRSPVSISGERPGV PQVEGDVRVATFNVLNYFSDLGQDEPGCQGYPDREGAFVTAKKCKVRGAWSRAAFENQQT KIVQAINAIGADVVALEEIENPVAAGAGQDRDGALKALVDALNAHAGEQVWAHVPSPAAV PADEDVIRVAFIYRRATIEPVGESKILDDPAFTGLARQPLAQEFAPVVSAPRTGKNFVVV ANHFKSKGSVPQGMEEGNKDTGDGQGNSNAIRVAQARALAAFAEQFAGTPTLLVGDFNSY SKEDPLKALTDAGWVHESADSAASYVYGGRSGSLDHVFSNAAADPLVRGVASWALNAQES VAFEYSRANSNAHLAFEADNPYRSSDHNPEIIGLDVIDEGPGQDPAPDPSAPDPSAPPQP GPGGPTPAPGAGPKHRGGLASTGAQSWVAIVAGALLAAGAALVTRARGRR >gi|319978574|gb|AEUH01000104.1| GENE 7 9931 - 10116 106 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNTGDPNRIIELIAATETLANAWQTMLASNTDGLAPAAVQTASTDLILHLGKVALRVQGC R >gi|319978574|gb|AEUH01000104.1| GENE 8 10605 - 10808 218 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQTRLSPCQGAPSTIRCIKTHNNSSSPPRIAIHVREHPAPSGALRLRIARGRRDSWGSGS TQHHQVH >gi|319978574|gb|AEUH01000104.1| GENE 9 10747 - 11142 575 131 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRGVVETHGGQGAPSTIKCIKTSGARGSSGGAVRRVREHPAPSGALRLRKATFCLRCCRL VREHPAPSSALRHVVRFVMSYFGSRVREHPAPSGALRHLGRGDPGVGDFDVREHPAPPGA LRPADDCGHRS >gi|319978574|gb|AEUH01000104.1| GENE 10 11845 - 12411 736 188 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPAALNTVLLLAALLAALVGPFAAYMCAKRWTRRSIAELVTGDPGLVDHINRHTWALSDG AIAVVGPPDSQQAHDVHQALEDTGLFKKGAIAHIPPQDLAGAARADLIILTEDALSAQTD GDGRARLLDDVLNRKRGIHAGLIGYAPTGDLTDREFQAIGSEPITSVTRTRGRLVNDAIS MLTTLSRL >gi|319978574|gb|AEUH01000104.1| GENE 11 12875 - 12940 
75 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MESSIEGVPGSVSSGSWSGGA >gi|319978574|gb|AEUH01000104.1| GENE 12 13105 - 13962 590 285 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293190865|ref|ZP_06609027.1| ## NR: gi|293190865|ref|ZP_06609027.1| hypothetical protein HMPREF0970_01360 [Actinomyces odontolyticus F0309] # 117 280 93 262 266 72 31.0 3e-11 MDNTEPSVGDQSAEGASNGGRLADGPVDDGPVDDLPADDGGGMSPVVREYVERLRREALA RDAELRARGIDPYKGTPVDEEPRRRLGARALAALAVLVAAVSVGAYVVFFRGEPDYGMSH GYQVQSDGSLKRPPVTDKAPTQPAEMSKGDEAGAAAAARYYLNLGSYAWNTGDTGPLKSI SDADCVYCRSEYTHIDEFYAHGYWAAKGYTDVIETQVIEQLPDEKYGPDAYAVQFRIDEH VAEGYTSNGFQQAEVYSTIVQLHVRWDGTAWHIMEGAAKDAKGNA >gi|319978574|gb|AEUH01000104.1| GENE 13 14128 - 14319 59 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no DPGIPSRSPVHTISIGRPAARPPGTDPPPGASDAAGTAGALSAEPPADPPPEPPPTDDGG RLE Prediction of potential genes in microbial genomes Time: Thu May 12 17:45:23 2011 Seq name: gi|319978571|gb|AEUH01000105.1| Actinomyces sp. oral taxon 178 str. F0338 contig00105, whole genome shotgun sequence Length of sequence - 2695 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 547 - 591 1.0 1 1 Tu 1 . - CDS 597 - 1721 1332 ## COG0516 IMP dehydrogenase/GMP reductase 2 2 Tu 1 . 
+ CDS 1838 - 2554 860 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases Predicted protein(s) >gi|319978571|gb|AEUH01000105.1| GENE 1 597 - 1721 1332 374 aa, chain - ## HITS:1 COG:slr1722 KEGG:ns NR:ns ## COG: slr1722 COG0516 # Protein_GI_number: 16330504 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Synechocystis # 5 370 3 368 387 337 48.0 2e-92 MSHEVEIGRGKRGRRAYTLDDVALVPSRRTRDPEDVNVGWQIDAYHVDIPLMAAPMDSVM SPATAIRFGRLGGIGVLDLDGLWTRYEDPEPILAEIARLGADESIARLQQIYSEPIKPSL ITARLRQLREAGVVVAAKLSPQRTKKYWRYVVDAGVDLFIIRGSTVSAEHVSSHHEPLNL KRFIYELDVPVIVGGVCTDTAALHLMRTGAAGVLVGFGGGAAHSTRRSLGVHAPMATAIA DVAAARRDYMDESGGRYVHVIADGGIGRSGDLSRAIACGADAVMLGAALARAEEAPGRGW HWGSEATHPDMPRGQRVHVGTTGTLEQILYGPSTRADGSLNFVGALKRTMASTGYSEVKD LQKAQVVVSPYSAQ >gi|319978571|gb|AEUH01000105.1| GENE 2 1838 - 2554 860 238 aa, chain + ## HITS:1 COG:Cgl1214 KEGG:ns NR:ns ## COG: Cgl1214 COG0847 # Protein_GI_number: 19552464 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Corynebacterium glutamicum # 9 231 20 231 237 154 43.0 1e-37 MTWTSDPWLGFDTETTGTSPFKDRLVTAALVLRVEGRDDQVATWLADPGVEIPEQASAVH GITTEYAREHGRPVEEVLEEVAGCLTEHWFRGFPVVAFNASYDITLVDRELSRHGLATFA ERLDGEPMLVVDPLVLDRKLDRFRKGKKTLTDMAPVYGVEASPDAHTAEVDVAMTLNVLA GIARKFPELEDHDAASLTGYQARAHREWAEDFEKYLRRQGREASIERDWPLIMPRTRD Prediction of potential genes in microbial genomes Time: Thu May 12 17:45:27 2011 Seq name: gi|319978559|gb|AEUH01000106.1| Actinomyces sp. oral taxon 178 str. F0338 contig00106, whole genome shotgun sequence Length of sequence - 10005 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 9, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 22 - 711 1085 ## COG0176 Transaldolase - Term 776 - 820 1.0 2 2 Op 1 . 
- CDS 888 - 2051 1334 ## COG1819 Glycosyl transferases, related to UDP-glucuronosyltransferase 3 2 Op 2 . - CDS 2044 - 2745 755 ## COG1309 Transcriptional regulator 4 3 Tu 1 . - CDS 2879 - 4402 2086 ## COG0516 IMP dehydrogenase/GMP reductase 5 4 Tu 1 . - CDS 4612 - 5880 1266 ## COG2942 N-acyl-D-glucosamine 2-epimerase 6 5 Tu 1 . - CDS 6044 - 7054 1011 ## Bcav_3056 transcriptional regulator, MerR family 7 6 Tu 1 . + CDS 6963 - 7565 453 ## Krad_0737 transcription factor WhiB + Term 7586 - 7627 11.1 8 7 Tu 1 . + CDS 7730 - 8533 994 ## COG1651 Protein-disulfide isomerase 9 8 Tu 1 . - CDS 8671 - 8967 540 ## COG0234 Co-chaperonin GroES (HSP10) - Prom 9062 - 9121 1.7 10 9 Tu 1 . + CDS 9193 - 10003 853 ## COG0500 SAM-dependent methyltransferases Predicted protein(s) >gi|319978559|gb|AEUH01000106.1| GENE 1 22 - 711 1085 229 aa, chain - ## HITS:1 COG:SPy2048 KEGG:ns NR:ns ## COG: SPy2048 COG0176 # Protein_GI_number: 15675818 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Streptococcus pyogenes M1 GAS # 1 212 1 213 222 145 36.0 7e-35 MLILIDHANVDAIAQTLTYLPIDGVTTNPTILFREGKEPLEVLHQIKEILPDGAQLHVQI VSERADDMVAEARALRGEIGGNLFIKLPVTREGFAAIPRIVREGMQVTATGIHSSMQGFM AAKAGARYVAPYVNRMDNYGIDGVRVASEIHRTLRSYGLGADVLAASFKNSEQVLELVRQ GVGAITAAPDVLAALIKNPATDAAIADQNRDFAYLIGERRTWLDLIKEK >gi|319978559|gb|AEUH01000106.1| GENE 2 888 - 2051 1334 387 aa, chain - ## HITS:1 COG:BS_yjiC KEGG:ns NR:ns ## COG: BS_yjiC COG1819 # Protein_GI_number: 16078287 # Func_class: G Carbohydrate transport and metabolism; C Energy production and conversion # Function: Glycosyl transferases, related to UDP-glucuronosyltransferase # Organism: Bacillus subtilis # 9 369 6 369 392 187 32.0 3e-47 MNEHATRSVLMVNLPFSGHTNPTLGLARALTDMGHRVTYVHSPTWRAAVERTGAAFVPYD DYPSRPRPSLEQVRCWDAAYRTALRIGGDYDCLVYEMLFTSGKALADRLGIVPVRLFSTF ALNERVLEDFGRSGGWYLTSIFRLPRLRALVSKRLSRRFGWRYDDLVDEITRNGPDLNIT YTTRSFQLYAEDFPAPRYAFVGAAVGGRAPGGFDAPAGSGPLVYVSLGTRLNTSARFFRS 
CVEAFRGTGARVIMSIGDSVRPGRLGPLPPTIAVHRSVPQLEVLSRASLFITHGGMNSVN EALYYGVPMVVVPMGNDQPTVARRVVELGLGEALDARSATPAALRGAALRVMSDQGCRAR FDGFQKETRAAGGNAEAARLIIARLAR >gi|319978559|gb|AEUH01000106.1| GENE 3 2044 - 2745 755 233 aa, chain - ## HITS:1 COG:PA2885 KEGG:ns NR:ns ## COG: PA2885 COG1309 # Protein_GI_number: 15598081 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 21 113 6 97 198 65 38.0 6e-11 MPFPAACGTLKTDRRFVLGGAVNQAEKSERARESICAAARRLFAQKGYEATTMRDIVADS GMSKGAIYHHFRSKQEVLRSVVEDECRYLDEYFAGLAAQSQVPVRDRMTALARHLASAGP QSSLGRADWVGEVPAALLGSLRNSLTVLAPHLERMLRQGVESGEIDCPFPGEVAGVLVLL VDVWIDPLIAADSYERMRQRVDFVALFLERFGAPVLGDEAVAIMEEGVRRFYE >gi|319978559|gb|AEUH01000106.1| GENE 4 2879 - 4402 2086 507 aa, chain - ## HITS:1 COG:Cgl0587_3 KEGG:ns NR:ns ## COG: Cgl0587_3 COG0516 # Protein_GI_number: 19551837 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Corynebacterium glutamicum # 212 502 1 292 292 350 61.0 3e-96 MGELATHDPFGLTGLTYDDVLLLPELTDVVPSSVDTTSRLTKNISLRVPLLSAAMDTVTE ARMAIAMARQGGIGILHRNLSIEEQAAQVRQVKRSESGMVEDPVTVGPDATIDDLDRLCG HYRVSGLPVVSEDGALLGIITNRDLRFVPESSWSRLHVRECMTPRDRLVVGQVGISREHA KHLLAEHRVEKLPIVDEDDRLTGLITVKDFVKTEQYPNATKDSRGRLVVGAAVGYWGDTW ERATALAEAGADVLVVDTANGGARLALDMIRRIKADPAFEGIEVIGGNVATTEGAQALID AGADAVKVGVGPGSICTTRVVAGVGVPQITAIHLAARACGPAGVPLIADGGLQYSGDIGK ALVAGADTVMLGSLLAGCEESPGEVVFTNGKQFKRYRGMGSLGAMSSRGRKSYSKDRYFQ ADVSSDDKIVPEGIEGQVPYTGSLASVVYQLVGGLHQTMFYLGASTVAQIKANGRFVRIT SAGLRESHPHDVQMTTEAPNYHSSSAK >gi|319978559|gb|AEUH01000106.1| GENE 5 4612 - 5880 1266 422 aa, chain - ## HITS:1 COG:yihS KEGG:ns NR:ns ## COG: yihS COG2942 # Protein_GI_number: 16131720 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Escherichia coli K12 # 16 411 20 387 418 129 26.0 1e-29 MNAPWNDAALAGAFGQDFDALVAFASRAKAPIGYGFLRADGSVDVGRPASLWINARMTAT FALAQERGTEGAAHLVQHGLEALSGPFASPGGGWYGALVPATESGPAPLPGARTDQGSLA 
ALIVAACGAARAGHDRGRELLARALADQEAHWWDPGSGMPRHSWDPGYSRPGALRRLGDA LSTADAYMSAWRVCGDRAWLDRARSIMRFTGEAAARSGWVLPDLFDAHWRPLPSGAALDA DEARAGSLVASDGGAVAARTWDLVVPGDSLPGLGLRAVRLIARLARILDGLGEPPSPWMY DTAVAVYDRSQSDGWAQGGGFLLSVDGAGRPSVESRMWWVACEALRAAHGLGQWASTRGD GATAARLATDYARTVAWTDERLRAGQGRWIHELAPDGSVSTRVWEGRPDAYAAAAALLRL DV >gi|319978559|gb|AEUH01000106.1| GENE 6 6044 - 7054 1011 336 aa, chain - ## HITS:1 COG:no KEGG:Bcav_3056 NR:ns ## KEGG: Bcav_3056 # Name: not_defined # Def: transcriptional regulator, MerR family # Organism: B.cavernae # Pathway: not_defined # 43 319 24 300 344 103 34.0 9e-21 MGTRSRFEWRAPAASVRIGAGTVTDSRGGAMPRAVLSAQSVPEAALSVTEVSKRLGVSAS TLRTWERRYGVGPEERAAGAHRRYLPDDVERLRAMIGLVRNGMPTGQAAAAVHADRARPD RGAPVRPAELRVMAESGEAPALQEALESAVSEYGLVHTWSRFVAPALQQIRSGSDGEVPG GSPSHVLMCAFEHVLQAVHRSRPPRPAQAPARIVVMSDAIHELGAQVVGVALCWYGFDVR ILSSERYGGVCAAERFTAHSASHEVDLVVVMGRGGTCEEFVTTIADRSDVNILLVGADTP EILSQNVQRVRTPAACVEEAIAMLAPGAEPGAVGRL >gi|319978559|gb|AEUH01000106.1| GENE 7 6963 - 7565 453 200 aa, chain + ## HITS:1 COG:no KEGG:Krad_0737 NR:ns ## KEGG: Krad_0737 # Name: not_defined # Def: transcription factor WhiB # Organism: K.radiotolerans # Pathway: not_defined # 103 197 1 94 100 116 57.0 6e-25 MAPPRLSVTVPAPILTLAAGARHSKRERVPNAPNYSPSCSPLIRSEIAILFRLFSGRAGY AENWLIICEGSLNNYRASVYPFHVHARRDSPCAPTKTPRRHPLTDVSRLPGPMIDAWEWQ YRAACRDLDTELFFHPEGERGSTRRRRAANAKAICASCPVIEQCRAYALASQEPYGIWGG MTEEERREEIRASGRTRRAS >gi|319978559|gb|AEUH01000106.1| GENE 8 7730 - 8533 994 267 aa, chain + ## HITS:1 COG:Cgl0018 KEGG:ns NR:ns ## COG: Cgl0018 COG1651 # Protein_GI_number: 19551268 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Protein-disulfide isomerase # Organism: Corynebacterium glutamicum # 23 262 15 249 254 93 32.0 3e-19 MSDTDKETTTTAPQGRAPAGSPLVIVLVAVIAVLAVVVGYLVWERGDSPESGASSAPSAP ASESASPEPTSAPTVTDPKILELVRSLPKRDPADAQAKGRTDAAVVMVLYSDFACPYCTR LAQQVEPQLADLVEDGTLRIEWRDLAQISETSPLAAQAGLAAAEQGRFWEFHDAVYAAAD PSDHPAYTTDSLVEFAKAAGVPDIDQFTATMNDAHTAEKVAKAADDAHGMGISSTPFMII 
GNAVIPGYRDAAFVRQTVIDQAAESTQ >gi|319978559|gb|AEUH01000106.1| GENE 9 8671 - 8967 540 98 aa, chain - ## HITS:1 COG:Cgl0581 KEGG:ns NR:ns ## COG: Cgl0581 COG0234 # Protein_GI_number: 19551831 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Corynebacterium glutamicum # 2 98 8 104 104 129 67.0 2e-30 MSISIKPLEDRIVIRQVEAEQTTASGLVIPDTAKEKPQEGEVIAVGPGRVDDNGNRVPVD VKVGDTVIYSRYGGTEVKYEGQEYQILSSRDVLAVVER >gi|319978559|gb|AEUH01000106.1| GENE 10 9193 - 10003 853 270 aa, chain + ## HITS:1 COG:Cgl1200 KEGG:ns NR:ns ## COG: Cgl1200 COG0500 # Protein_GI_number: 19552450 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Corynebacterium glutamicum # 55 259 52 244 387 64 33.0 2e-10 MNSALAPVLTSPGWDILRSLGERTSPGTDLGLLGASLRRAGTPADVVSAVLTQLELRSAA RAKFGPFADSMVFTRDGLEQATRLVVAAHHARRFREAGASFVADLGCGVGGDALAIAGLG LRVLAVDRDQDASAAAAINLRHFDGANVLRGDVAELDMEDLRDQGIDAVFADPARRTGAQ GGSRRIADPEQWSPPLSAVWRWRSGFERVGVKLAPGIAHSALPADCHAQWVSVDGNLVEA SVWTPALAPEGPGRSALVITGAGARVLADP Prediction of potential genes in microbial genomes Time: Thu May 12 17:45:38 2011 Seq name: gi|319978550|gb|AEUH01000107.1| Actinomyces sp. oral taxon 178 str. F0338 contig00107, whole genome shotgun sequence Length of sequence - 9384 bp Number of predicted genes - 9, with homology - 6 Number of transcription units - 7, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 387 500 ## COG0500 SAM-dependent methyltransferases 2 2 Tu 1 . + CDS 684 - 842 141 ## - Term 780 - 811 -0.1 3 3 Tu 1 . - CDS 956 - 1333 272 ## + Prom 1102 - 1161 3.7 4 4 Op 1 . + CDS 1191 - 1523 282 ## COG4859 Uncharacterized protein conserved in bacteria 5 4 Op 2 . + CDS 1553 - 2086 297 ## 6 5 Tu 1 . + CDS 2252 - 3433 1771 ## COG2170 Uncharacterized conserved protein 7 6 Op 1 . 
+ CDS 3584 - 6241 3266 ## COG2898 Uncharacterized conserved protein 8 6 Op 2 . + CDS 6273 - 8132 1987 ## KRH_00150 hypothetical protein 9 7 Tu 1 . - CDS 8259 - 9314 644 ## PROTEIN SUPPORTED gi|229232313|ref|ZP_04356740.1| (SSU ribosomal protein S18P)-alanine acetyltransferase Predicted protein(s) >gi|319978550|gb|AEUH01000107.1| GENE 1 1 - 387 500 128 aa, chain + ## HITS:1 COG:Cgl1200 KEGG:ns NR:ns ## COG: Cgl1200 COG0500 # Protein_GI_number: 19552450 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Corynebacterium glutamicum # 6 122 269 380 387 57 35.0 9e-09 PSGELGAVLAEPDPAVIRSGGLAHLADSIGARLIDPSIAYLTADDIAASPLLSRFEVVAR TALRAKAVSAALRPLGVGSVEVKKRGADIDPAALRKRLDLTPGGPGATVFATRVGGRHCA IIARRLAD >gi|319978550|gb|AEUH01000107.1| GENE 2 684 - 842 141 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTARIAGALKSARTLGHNLTHHTKTTKKEKTVNRRKPARQPCGRRGRSAHTP >gi|319978550|gb|AEUH01000107.1| GENE 3 956 - 1333 272 125 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPENTLHPLSIGFRGSCRTIQRTGATPFITFFDTTHPPARGMQSKVPIASPWTLRRLFRS LYHSFEFVFTVSLTVLCLVVVWIAALFCEPGRAELPVDVLKPRVVVLVTRLVARVPDAGA WFSRR >gi|319978550|gb|AEUH01000107.1| GENE 4 1191 - 1523 282 110 aa, chain + ## HITS:1 COG:FN0232 KEGG:ns NR:ns ## COG: FN0232 COG4859 # Protein_GI_number: 19703577 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 20 84 1 65 65 67 44.0 5e-12 MGTFDCIPRAGGCVVSKNVINGVAPVRWMVRHEPLNPMDNGWRVFSGIDTDEFIADPHNM AVVDYNTLCDIEPACIPVFALPAGSDIQLIIDDEGRRRWFDSNSGDELLF >gi|319978550|gb|AEUH01000107.1| GENE 5 1553 - 2086 297 177 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTEFKAVAPQRAVQIMRAWALHSWPMSVQDGIEVYTGLGFSGDSGSPEMFTSDVSPDEP DSYFASLDGPITSLEIAVTNVLPAEAEEEHPSRARGFYCSCLRAFEEQLGAPIREQESRA RRNAQWFLTNGTGVRLSGTNRMVTLFLESPEMADIHQDNLRRGITDYSPANDPLLEG >gi|319978550|gb|AEUH01000107.1| GENE 6 2252 - 3433 1771 393 aa, chain + ## HITS:1 
COG:Rv0433 KEGG:ns NR:ns ## COG: Rv0433 COG2170 # Protein_GI_number: 15607574 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis H37Rv # 3 373 10 374 376 329 45.0 5e-90 MPLPFGRSARSTLGVEWELQLVDQDSHDLRQSADAVIDQATVDGELFTGIHREMLLNTIE VTSKPRATVADCLADITACVDFLSPIARRLRIDLAAAGTHPFARPGRQRVTNSRRYAQLV ERTQYWGRQMLLYGVHVHVGVEDRDKVLPIQAALSTHLGHLQAIAASSPYWAGVDTGYAS NRAMVFQQLPTAGIPRHFDTWEALESYTEDMIRTGVIKGFDEVRWDIRPSPAFGTVENRV FDAATNILEVGVFAALTHVLVEHFSRMYDAGEALPVLPDWFIAENKWRSARYGMDAELIL DPQGTTEPARDTFARLAEELAPIASDLGCAREFEGIGTILGVGAAYERQRAAFAAARPGE GLDAVVSLMRAEMEAGRPLSLAEFAALQAPDLG >gi|319978550|gb|AEUH01000107.1| GENE 7 3584 - 6241 3266 885 aa, chain + ## HITS:1 COG:Cgl0941 KEGG:ns NR:ns ## COG: Cgl0941 COG2898 # Protein_GI_number: 19552191 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 314 857 262 784 804 268 34.0 3e-71 MPINEGAQPRRANEEQGKPGTNEEQRGPTANGEQRQPREQAPSGAARPGGAARRWAATAG QWSATALRLGARWLHSVPATRAVAAALLVGVGVCLAWSGAFPLGSASADPSHWWSLVTAP FVFPRTSMETVLLIVLALIVLGGMAEHRVGSRRWLAAALVAPAAAVATTWLLAPPIQSFL PEWGASLDRGSIWGAGPMIVGLAGPVIESLGRAWRWRARFLLIGVLALSAGVIGSMDTFA RLWAGVYGLVIGRIAYRGRRQEDASTHDIVIHRQIVSLTVVCWTVAAALTIFSTTLRGPL AEMRWSVAPGWWMLGRASVGSTLLALMPVLLQLVLAYGLRRGRRFAMIGTLVLQGCLALS SAVGALVMEIAATRELRYLADDASASVITNTTQFLLVPVTLNLLLCAVVAWSRKAFTLRS RPGTVRALAARWATMMCVVAAISIGLGMLASDSYLPLVLDQGTADIEIPRASVLAILHDY LLALLPTSTVSIFEPSLEPFTLLAEVPVVWTPLVAWALSLALVARTLVTPAHTRQGSASR LVELIRAGGGGTLAWMMTWKGNSVWIREDDRAGVAYRPGGGTALTVTDPVCPSSQISATI EEFADFASRSGLTPALYSVHEPVAGAARELGWTVMQVAEESLLDLPGLAFKGKAYQDVRT AMNHAKREGVEAVWTTWEDCPQGRRDQIESISRAWSADKALPEMGFTLGGLDELRDPSTR LLLAIDSDGTVQAVTSWLPVHRQGEVVGLTLDVMRRRKDGWRPAIDFLIARAALSAQEEG LEVLSLSGAPLARSQEDDSGFGPMLDVLASILEPLYGFASLHSFKRKFKPRREALYLAVP DASSLGTVGAAIGHAYVPNMTAAQTARLAGSVAGGVAAAAIKDNQ >gi|319978550|gb|AEUH01000107.1| GENE 8 6273 - 8132 1987 619 aa, chain + ## HITS:1 COG:no KEGG:KRH_00150 NR:ns ## KEGG: KRH_00150 # Name: 
not_defined # Def: hypothetical protein # Organism: K.rhizophila # Pathway: not_defined # 122 393 109 373 384 100 29.0 2e-19 MGFLLSVDLTTHRAMVTALLLTIAALIASALLLPLTRRGRAGNPAGAAPQRRAHPWRTAG RWSARALAMLLPVVMILATGAMAFNRSLKIVTTPRDLFGIIAANLSGPQTGEAQSADGQS DGAQSEHERTQSGDPALLTDFQPTEESGMLKTTWTGPISGITQPVYAITPKGYRPDDGKK YGVIMTLHGYPGDPEGTMWGAQVSEALQSAIDQGLIPPTIVIGPEVNVDDSEHDCADLPG RPPVFTWVTKDVPAMIKANFPNVSTERAAWMIAGFSSGAYCAVWTAMRAPEVFGSAASLS GYNTQIEGGMKSQGQQYLADNTLSTMLANRTPDGLRIYAMAAQDDAVGGAPAAVAMANAV KAPDSVTPDTPETGGHAGPLWREHIPTMLAWWGSDNSVARAVGTPPTAEAAQSSSGFGQI VTAQTVTHRTRAFGLNGAGSLVVVGLLALLSALLCALRVPRWVAADGSGSPSPSAESAGP DGAGAGTADRATDRAPTVESEEATARTRAGTTTAGTDGQPASTGTADGTTTGRSEADTGP SQGLGGALASLRGRLAGARARASRLPRGLRRAWFFTTRLVTVSATMGVVALLIGMLGNAT GGFYTSWRTALFSLIEAFM >gi|319978550|gb|AEUH01000107.1| GENE 9 8259 - 9314 644 351 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229232313|ref|ZP_04356740.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Cryptobacterium curtum DSM 15641] # 2 326 515 841 860 252 42 6e-67 MSEPLVLGIESTCDETGVGLVRGYDLLADRTATSMDQYARYGGIIPEIASRAHLESFLPA LRSALSEAGAGLSDVDAVAVAAGPGLVGSLTVGICAAKALASSLGKPLYGVNHVIGHLAV DELAEGPLPERFIGLVVSGGHSSVLLVDDIATGVRELGGTLDDAAGEAFDKVGRLLGLPY PGGPHVDRLSREGDRRAIRFPRGLAAGKDKERHKYDFSFSGLKTAVARYVEGAQARGEQV SAPDVCAAFSEAVNDSLTAKAVRAAVDTGCSTIVVGGGFSANSRLRELLVARADEAGVAV RMPPLRFCTDNGAQIAALGSELVRAGLPASDWGFSPESGMPLTRTYAAPVG Prediction of potential genes in microbial genomes Time: Thu May 12 17:46:08 2011 Seq name: gi|319978543|gb|AEUH01000108.1| Actinomyces sp. oral taxon 178 str. F0338 contig00108, whole genome shotgun sequence Length of sequence - 5496 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 248 215 ## gi|293190205|ref|ZP_06608701.1| aromatic amino acid transport protein AroP 2 2 Tu 1 . 
- CDS 383 - 1018 667 ## gi|281355583|ref|ZP_06242077.1| hypothetical protein Vvad_PD3692 3 3 Op 1 36/0.000 - CDS 1286 - 2029 1012 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit 4 3 Op 2 . - CDS 2026 - 4005 3003 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 5 3 Op 3 . - CDS 4022 - 4735 1020 ## HMPREF0573_10283 succinate dehydrogenase subunit C (EC:1.3.99.1) 6 4 Tu 1 . - CDS 4937 - 5494 170 ## PROTEIN SUPPORTED gi|228002792|ref|ZP_04049785.1| (SSU ribosomal protein S18P)-alanine acetyltransferase Predicted protein(s) >gi|319978543|gb|AEUH01000108.1| GENE 1 2 - 248 215 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190205|ref|ZP_06608701.1| ## NR: gi|293190205|ref|ZP_06608701.1| aromatic amino acid transport protein AroP [Actinomyces odontolyticus F0309] # 5 76 5 76 78 75 61.0 1e-12 MIEYDNGPGTRTRRHWVSAAIGITAMGVFALLSLALLLGVLFGPGSVKSLADKAFPWSLV ALGVAGVAMLVANARSGAGGGP >gi|319978543|gb|AEUH01000108.1| GENE 2 383 - 1018 667 211 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|281355583|ref|ZP_06242077.1| ## NR: gi|281355583|ref|ZP_06242077.1| hypothetical protein Vvad_PD3692 [Victivallis vadensis ATCC BAA-548] # 52 173 4 116 151 67 37.0 5e-10 MDSQQWPGGQGPDQSGYGQGRAYGRPQEPRMGQRPASQGQSPYERDQERRYDVQTWRKAA YYGGAVVIVLGFILFFVPFVSTFSFFGSMRFGRAAPGDGFFGAFRTFPLAFVGVLLIMAG QGLRSLGRKGLAGSGVVLSPQGEARDAEPWQRSKGAQMQDALEEVPMIRDAMARGGGAEP QIRVRCTKCGYLETEDARFCSGCGAPMTPAP >gi|319978543|gb|AEUH01000108.1| GENE 3 1286 - 2029 1012 247 aa, chain - ## HITS:1 COG:Cgl0368 KEGG:ns NR:ns ## COG: Cgl0368 COG0479 # Protein_GI_number: 19551618 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Corynebacterium glutamicum # 1 246 1 246 249 248 49.0 8e-66 MNLTLRIWRQNGPEDKGAIHEYQVKGISEESSFLEMLDVLNEELFERGEEPIAFDSDCRE GICGQCGVVINGIAHGPEVTTTCQLHMRSFNDGDVITIEPWRAQGFPIIKDLVVNRSALD RIIQAGGYISVNTGGAAAADATPVPKQDAERAFEAAACIGCGACVAACPNASAMLFTSAK 
VTHLGLLPQGKPENYRRVVSMLNQMDEEGFGSCSNVGECAAVCPKQIPLDVIANLNRQLG KAALKGV >gi|319978543|gb|AEUH01000108.1| GENE 4 2026 - 4005 3003 659 aa, chain - ## HITS:1 COG:Cgl0367 KEGG:ns NR:ns ## COG: Cgl0367 COG1053 # Protein_GI_number: 19551617 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Corynebacterium glutamicum # 13 659 24 673 673 616 50.0 1e-176 MTDTLIHGLYREGGKIADTKAPHDVPISERWARRQFEAALVNPANRRKLRVIIVGSGLAG GAASASLGEMGYHVDVFFYQDSARRAHSIAAQGGINAAKNYRNDNDSVYRLFYDTVKGGD YRAREDNVYRLAEVSGNIIDQCVAQGVPFAREYGGQLDNRSFGGVQVSRTFYARGQTGQQ LLIGAYQAMERQVAAGTVTEYPRHEMVELIVDEGKARGIVSRDMSTGELRTHIADAVVLA TGGYGNVFFLSTNAMGCNGTAIWRAYRKGAYFGNPCFTQIHPTCIPQHGDQQSKLTLMSE SLRNDGRIWVPKRAEDCEKDPREIPEEDRDYYLERIYPSFGNLVPRDIASRQAKNMCDEG RGVGPKIDGVARGVYLDFADAIQRLGKAAVSAKYGNLFDMYERITGDNPYEVPMRIYPAV HYTMGGLWVDYDLESNLPGLYVAGEANFSDHGANRLGASALMQGLSDGYFVLPNTINDYL AHCFHLPELNEDSPSVKEALEAAQGRIDTLMSVGGTRSVDSFHKELGKIMWEYCGMERTE AGLKKAIGMIRELRADYWRDVRVPGKSIGELNQSLEKAGRVADFMELGELMCIDALHREE SCGGHFRAESQTPEGEALRHDDKFLYVAAWEFAGDDQPPVLHKEDLIYNDIELKQRSYK >gi|319978543|gb|AEUH01000108.1| GENE 5 4022 - 4735 1020 237 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10283 NR:ns ## KEGG: HMPREF0573_10283 # Name: sdhC # Def: succinate dehydrogenase subunit C (EC:1.3.99.1) # Organism: M.curtisii # Pathway: Citrate cycle (TCA cycle) [PATH:mcu00020]; Oxidative phosphorylation [PATH:mcu00190]; Butanoate metabolism [PATH:mcu00650]; Metabolic pathways [PATH:mcu01100]; Biosynthesis of secondary metabolites [PATH:mcu01110] # 1 235 1 256 260 170 41.0 4e-41 MATTTVATKKRRAWTTSVFIKQLMAVSGFVFVFFLLFHSYGNLKMFLGEQVYNDYAHWLH EEAFVPIFPHGGFLWVFRAAMVLMLIIHLYAAAYTWLRSRRARSTRYVVVKSVSDSYAAR TMRLTGVVLIFGIVFHLLHFTTASLPRQLANFGSETTAPYSRMVAAFQSPWLVLFYAVFV GVVCFHVSHGFWSMFQSLGWVRPATRKPLVVLSGLVGVIIFVMFLAPPTAIATGIIH >gi|319978543|gb|AEUH01000108.1| GENE 6 4937 - 5494 170 185 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|228002792|ref|ZP_04049785.1| (SSU ribosomal protein 
S18P)-alanine acetyltransferase [Anaerococcus prevotii DSM 20548] # 37 171 7 141 146 70 35 4e-12 APAAGTDCAPGAPGERGGAPGSACGEVPAVDIVPARVGDLRELVRLEGLLFPEDPWTEGM LREELASASSHYLIARADGAACGYGGVRALGDQGDIMTIGVVPGARGRSVGSRLLDGLIA WARRAGADELFLDVRASNDAAIGLYLSRGFEAVGRRRRYFRNPVEDALVMRLAALAGGPP GRPSN Prediction of potential genes in microbial genomes Time: Thu May 12 17:46:28 2011 Seq name: gi|319978540|gb|AEUH01000109.1| Actinomyces sp. oral taxon 178 str. F0338 contig00109, whole genome shotgun sequence Length of sequence - 2504 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 2 - 1079 1078 ## COG0787 Alanine racemase 2 1 Op 2 . - CDS 1076 - 2473 1172 ## COG0063 Predicted sugar kinase Predicted protein(s) >gi|319978540|gb|AEUH01000109.1| GENE 1 2 - 1079 1078 359 aa, chain - ## HITS:1 COG:MT3532 KEGG:ns NR:ns ## COG: MT3532 COG0787 # Protein_GI_number: 15843019 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Mycobacterium tuberculosis CDC1551 # 14 348 36 353 408 211 42.0 2e-54 MTDTHAPRADAYPGEALIDHGAIARNLRVLRAAAPAAKQMAVVKANAYGHGLLPVALTAL ASGSDWLGVAQLPEAVELREGLDRAGVDRASAPVLAWITTPGSDFAGAVRAGIDLSVSWT WVLDAISAAARAVGRPARVHVKIDTGMSRAGSTLADLPALASALAAAQADGLVEVVGAWS HLSRGDDPSEGGLASTASHVGIFERGLRVLGEAGVRPAIRHLAATAGILWHPETHYDMVR AGIGLYGLSPDPAVADSRELGLRPAMTLRAPLTSVKVIERGAPVSYGGTWRAPTRRWAGL VPLGYGDGILRSASNAAPVVVRGPGGDVPARCAGRVCMDQFVIDLGPAGGAAGTPGSRS >gi|319978540|gb|AEUH01000109.1| GENE 2 1076 - 2473 1172 465 aa, chain - ## HITS:1 COG:ML0373_2 KEGG:ns NR:ns ## COG: ML0373_2 COG0063 # Protein_GI_number: 15827103 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Mycobacterium leprae # 90 297 6 204 289 87 39.0 7e-17 MRLAGARVWIDGIVGTGVRGPLEGALADAVRALAALKRSAGPAVVAVDVPSGLTDARGEV PGPLLPAEVTVTMGAHKEALVLPPAAGYAGRIERVDLGLGPADAERTPPVALHTAASAAA 
HYPAPTSEDHKYTRGVVGVVAGSDTYPGAGVLAVRGALASGAGMVRLNSTRRVEDLVLAA EPGVVTAGGRIQAGLIGPGLDEARRAEALELAEFCLEAGLPLVIDAWVLDLVPGLVDRTR PAALTITPHYGEAARLLTALGSPTTRARVGDAPLSAAQSLHELTGATVVLKGAATIVYGA ARTTPADGGAPGRARACGPEQGDGPERGHDPGGARNREGSGEPREDTHDPGGARGPEAPS SPGEPRGRDSARAIVAPLGCGWASVAGSGDVLAGLVAGTHAGARARRERDGADVHDPVAE AAAAVWVHGEAARLAARARHAPGQPIQARQIADAIPPVIGGILGA Prediction of potential genes in microbial genomes Time: Thu May 12 17:46:31 2011 Seq name: gi|319978533|gb|AEUH01000110.1| Actinomyces sp. oral taxon 178 str. F0338 contig00110, whole genome shotgun sequence Length of sequence - 9367 bp Number of predicted genes - 6, with homology - 4 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 520 209 ## 2 2 Tu 1 . + CDS 581 - 907 175 ## 3 3 Op 1 . - CDS 1192 - 2091 1368 ## COG4975 Putative glucose uptake permease 4 3 Op 2 . - CDS 2184 - 4133 2313 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains 5 3 Op 3 . - CDS 4168 - 5130 1162 ## COG1072 Panthothenate kinase 6 4 Tu 1 . 
- CDS 5442 - 9251 5786 ## COG5263 FOG: Glucan-binding domain (YG repeat) Predicted protein(s) >gi|319978533|gb|AEUH01000110.1| GENE 1 1 - 520 209 173 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEGMSAAEPIPAHTFATVRRMEAPLIAEARAQGRPDALMERAATGVADAVRGLLGRPRDE LWPSAADPWRLGPGPLADPQAAPRGNAPASGADPAAVLSDCQAPSAALSSVAPHGPSLAP SPHSAAHDGAPEGAAPQVVALVGSGDNGGDALYACAILQDAGTACMALLLKPG >gi|319978533|gb|AEUH01000110.1| GENE 2 581 - 907 175 108 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPSMLAYPERSGARRRSLEGTDPAALGQATGHWGRQLCIGVGNRAIETAHWGRQRVIGVG NRAIETTHWSRQRVIGVGNRASETAHWSRQQAIGAGNRPPGQAICLPQ >gi|319978533|gb|AEUH01000110.1| GENE 3 1192 - 2091 1368 299 aa, chain - ## HITS:1 COG:lin0212 KEGG:ns NR:ns ## COG: lin0212 COG4975 # Protein_GI_number: 16799289 # Func_class: G Carbohydrate transport and metabolism # Function: Putative glucose uptake permease # Organism: Listeria innocua # 6 295 3 285 285 147 36.0 2e-35 MSPTTILIGVLPSVFFGVATTLMGKTGGSDRQRVMGTVLGGLLMAAVATPFLHPQWTPFN LAVSFGTGLLLGLGVCDQLRSYTVLGMSRTMPLSTGGQLVLMSVAGIALFGEWLHGGALP YGIAAIAVLIVGIWFLSRSEAGSDAAGLDWGRGAVLLTSSTLGLVAFPLIIKYFGIAPKE FLLPQAVGYTAYAAVFFAIQGRGGVDPEASLRHRRMFPSVFNGVLWGIAILLLQLNSNNL GAGTGFTLSQLGILISTPLGILWLHETRSRKELRWTGIGVFLVIAGAVLAGIAKSLDAA >gi|319978533|gb|AEUH01000110.1| GENE 4 2184 - 4133 2313 649 aa, chain - ## HITS:1 COG:Cgl2219 KEGG:ns NR:ns ## COG: Cgl2219 COG0449 # Protein_GI_number: 19553469 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Corynebacterium glutamicum # 13 649 3 625 625 721 57.0 0 MPPAPRCGPIESMCGIVGHIASSPSQRSCRVVMDGLARLEYRGYDSAGVALASPGRLEVV REVGKLANLRAAVDRAHPVDATAGIGHTRWATHGRPTVANAHPHCSPDGRFALVHNGIIE NADALRAQLIADGAVFASETDTEVVVHMIARAYDGASPASYQGGTAGLTGADADVARRLV VAMRAVTEQLEGTFTLLVVSTESPNAIVAARRSSPLVVGLGNGENFLGSDALAFVEHTRR AVEIEQDQIVVVAARGVTVVGTDGQVVAPKEYEVDFSSDRATKNGWPTFMEKEIHEQPEA VGATLADRVDSHGHLALDEVRIPVHVLRSVDKVIMIGCGTASYAGQVARYAIEHWCRIPV 
EVELSSEFRYRDPVVTEKTLVVAISQSGETMDTIQAIRHAREQGAKVVAVVNTPGSTISR ESDAVVLTHAGPEIAVASTKAFTAQVAACYILGLYLAQVRGNKYADEIADYMDKLAEVPA KIERVLERGEAVRQFARTMAGATSVIYLGRHVGFPVAMEGALKLKEICYIHAEGFAAGEL KHGPIALVDEGQPVIVIVPTPRRPELHNKVLANIEEVRARGARTIVVAEEGDSSVDAFAD VVFRVPPVPTLYAPLLTVVPLQIFACELAGVKGLDVDQPRNLAKSVTVE >gi|319978533|gb|AEUH01000110.1| GENE 5 4168 - 5130 1162 320 aa, chain - ## HITS:1 COG:Cgl0968 KEGG:ns NR:ns ## COG: Cgl0968 COG1072 # Protein_GI_number: 19552218 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate kinase # Organism: Corynebacterium glutamicum # 7 319 14 322 323 333 55.0 4e-91 MDSLDAPSPIPASPYADFDRAAWARLADRTPLPLTQDDIDRIASLGDPIDMQEVDAIYRP LSALLHLYVDGRRRTAAERHAFLGEPAPVSTPFVIGIAGSVAVGKSTVARLLQLLLQRWD STPRVDLITTDGFLHPNKVLEERGIIARKGFPESYDRAALIAFLSAVKAGAPSVSAPVYS HVVYDIIPGEEAVVSRPDILIVEGLNVLQPPRALPGSVSIAVSDYFDFSIYVDADESHIE QWYVDRFLKLRATAFAREDSFFRTYAGLSDAEAESTARMVWRAINLPNLRENIRPTRERA ELVFTKGADHRVESLKIRKE >gi|319978533|gb|AEUH01000110.1| GENE 6 5442 - 9251 5786 1269 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1006 1249 467 708 744 204 43.0 1e-51 MRRRPIVGALLAPALALGTLSAAQAALAEPAQPAPFHPEVLWATSEELVGESGLNGHIEA LTDDDTGRSGAAVTFWTTKWKDGGDSFPHALAMRNPNAQDGACGIGVTARPTYDSTMGND HLPAEYRIHAFEADPGNPASDADALAAWKESVRGGQWQGGTELAHASGALAKTNDEQYIT FPRTTAPVISLTGLSALVASKNDMALSDIKLLPCDSDGNAITTYPAQPGPGPEPRPEVPN VPEPLQGGGSDSLVTDFVIDQLPYGEPGAPVEFTPGTHTYSATGYYHSATVSARIRTAPG AEATINGAKPDGDGRVKNLDLAKGINAITATVTKDGQSATYVVNITKTDTDFRGNVLVPA TAALNGGSEADNRALVDGDRNTTVTVDPLVRSEHWDGSTTGFELTLDGQRYVHRVNGFGW PSLPGNAQGWHGGNSVAIAIQEADGGEWKTIVTHASLTRDARGLWYWDFNAYHLAHRIRV WMNTGTEPQTPANIATAVRFDDVEVWGLPEGSAPKAPGADNSADARYSGFDPGEGKWGVN RAQALALQYGVMMPAWVPSEGYGRGGFDANERGLTGGAFPMFYDLPLFNTAMMESLGKGA PWALAKAPFGKNGIASAGEPHDFLSEAQKPYASTLVDIQYGDEKQWDNTEGDYYQKWFEW 
SKTAYPGALVHSNQYDDYTWRRPQVLDYYVRAVKPDLLSWDTYYYSTTSGPGPDRVVTSL LNNQTWQAQREYALKGLTGDGSSPILYGQYLDYNWDANVSASQKAIVPSLGLATGQKWFG LFRMEYNGYDRSSIIDHDGAPTRSFYEFTRIFRSVRYIGEYTKAMDSTFVALKPGQYDGR DEDPVLSGYRYGNFASGDEAVAANQGAGLVDMSVTNTGSVNGGKPGDVVVGYFDQLEGLE AAKSAEIFGASTTVPKGFMVVNALTGTTKFPSMLLDPRTDDGSFAETAQDITLTVKKPSA NSRLMLVDPSTATTTEVALGDGETSTVTLKAVGGGDSRLLYWVTRDDPAPGPTPTPAPTA DPTPAPTPAPTSDPTAAPTADPTAEPTGAPSQDPTDPPAPGPKAGKWVQGFFGWWYRYDD GTYPVSTTLVVDGQTYRFDARGYMVTGWHKDNGAWYYYSPSGARASGWALVSGTWYYLDP GSGAMLIGWVEVSGAWYHLGASGAMDTGWLSEGGAWYLLGADGAMRTGWAQSGGHWYYLD PGTGAMRTGWVEVSGAWYHLGASGAMDTGWLSESGAWYYLDPGSGQMATGTRTIDGVSHR FAPSGAWLG Prediction of potential genes in microbial genomes Time: Thu May 12 17:46:48 2011 Seq name: gi|319978529|gb|AEUH01000111.1| Actinomyces sp. oral taxon 178 str. F0338 contig00111, whole genome shotgun sequence Length of sequence - 3512 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1418 1895 ## COG3669 Alpha-L-fucosidase 2 2 Op 1 38/0.000 - CDS 1573 - 2499 1421 ## COG0395 ABC-type sugar transport system, permease component 3 2 Op 2 . 
- CDS 2496 - 3416 1594 ## COG1175 ABC-type sugar transport systems, permease components Predicted protein(s) >gi|319978529|gb|AEUH01000111.1| GENE 1 2 - 1418 1895 472 aa, chain - ## HITS:1 COG:TM0306 KEGG:ns NR:ns ## COG: TM0306 COG3669 # Protein_GI_number: 15643075 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Thermotoga maritima # 43 372 23 362 449 114 29.0 4e-25 MTDRLTDEFAPDALEAAGALGGAEDPGYTYPDDPRTARALEHWRDLKVGVIIHWGIFSHI GQDGSWSLHRTRLGGFTDAPQGWQGTDAEYHSWYCDQARSFTGEDFDAAEWASACARAGM RYAVFTTKHHDGFAMYDTAYSNFKSTSEESGLGRDIVREVFDAFRGAGMETGVYFSKADW NHPGYWDRSRPITDRYHNYDIASHEAKWESFVTYTKNQIEELLTGYGDVNVLWLDAGWVR SPEEPIGMDAIAEAARRLQPSILVVDREVHGPNENYRTPEQEIPDSVLDHPWESCITMTR GWCSMRPDEAIKPMRRIVSNLLAIVSRGGNYLIGIGPDGSGRMSKWVRRGLDELGGFLEA NGPGIFATRPWEHGERVRGADDWSWWTTRAAPSAPGGPGTVYLFGMGPADGAGAGSLEER TFAAPGTSATWVEVPAPVAHARLLGSGEPVEVEAQGEGARLRVPATGLAYAV >gi|319978529|gb|AEUH01000111.1| GENE 2 1573 - 2499 1421 308 aa, chain - ## HITS:1 COG:all4824 KEGG:ns NR:ns ## COG: all4824 COG0395 # Protein_GI_number: 17232316 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 41 307 13 278 279 152 32.0 8e-37 MSAQAPAQKTPSRFAQWRKARRLERQSRTVASRQMRGSQLFIRYAVLVVVAILLVGPLVL PLMAAFKAPGESVFGQGATLLPQQWSLESFTRLFERTDILGSIGNSALVALLAVTSNVVL SCVGGYMLSRRGWSGRTIMYFVVLSAMIFPFESIMLSLFSMMVQANLYNTLTGVWMVAMI GPFQILMMRAAFMGIPDEIEDAALIDGAGEWRRFWEIFLPQVRGTLTVVGLTSFIAAWSD FIFPLLMLPDPKKQTLMLTLTAIQNSPQGTTYQLVLAGAVVAMTPVVIVFALSQKYFFRG IEDGGLKF >gi|319978529|gb|AEUH01000111.1| GENE 3 2496 - 3416 1594 306 aa, chain - ## HITS:1 COG:slr1202 KEGG:ns NR:ns ## COG: slr1202 COG1175 # Protein_GI_number: 16329975 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Synechocystis # 25 299 16 290 298 162 36.0 7e-40 MATAKRRAPGRSLNGAARFRPKWMPWLWITLPVLAIVSFYIYPFITTVYVSFTKTKPLGR VGRFVGFENFASVLSDGEFWESLRNSLLYAVCVVPLMVLLPLLLALLVKSHVPGIGFFRA LYYLPAISSLVVISLAWRYLLDQRGPVNNLLASWGLASEPVPFLSNQWLILFCAMLITLW QGLPYYMILYLSALANVDKSLYEAAELDGAGPVRRFFTVTVPGVRVMMFLVATLTTIGCL KIFTEVYLLGGAASPTKTLTMYIRDRIVDPTFGSLGQGDAASVCLFLLTFGFIIASQALQ RKAEDA Prediction of potential genes in microbial genomes Time: Thu May 12 17:46:55 2011 Seq name: gi|319978518|gb|AEUH01000112.1| Actinomyces sp. oral taxon 178 str. F0338 contig00112, whole genome shotgun sequence Length of sequence - 14370 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 7, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 13 - 1395 2265 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 1480 - 1539 2.8 2 2 Tu 1 . - CDS 1541 - 4444 3107 ## COG3250 Beta-galactosidase/beta-glucuronidase 3 3 Tu 1 . + CDS 4559 - 5584 1318 ## COG1609 Transcriptional regulators + Term 5716 - 5770 20.1 - Term 5696 - 5764 24.0 4 4 Op 1 . - CDS 5897 - 6799 876 ## ckrop_1874 hypothetical protein 5 4 Op 2 . - CDS 6805 - 8280 2451 ## COG0477 Permeases of the major facilitator superfamily - Prom 8364 - 8423 3.4 6 5 Tu 1 . 
+ CDS 8210 - 8377 141 ## + Prom 8418 - 8477 2.1 7 6 Tu 1 . + CDS 8558 - 10795 2967 ## COG4409 Neuraminidase (sialidase) 8 7 Op 1 4/0.000 - CDS 11060 - 12412 1905 ## COG1109 Phosphomannomutase - Term 12438 - 12477 1.7 9 7 Op 2 59/0.000 - CDS 12529 - 13020 699 ## PROTEIN SUPPORTED gi|227494577|ref|ZP_03924893.1| ribosomal protein S9 10 7 Op 3 7/0.000 - CDS 13061 - 13504 592 ## PROTEIN SUPPORTED gi|227497214|ref|ZP_03927462.1| ribosomal protein L13 - Prom 13643 - 13702 2.4 11 7 Op 4 . - CDS 13714 - 14370 465 ## COG0101 Pseudouridylate synthase Predicted protein(s) >gi|319978518|gb|AEUH01000112.1| GENE 1 13 - 1395 2265 460 aa, chain - ## HITS:1 COG:slr1897 KEGG:ns NR:ns ## COG: slr1897 COG1653 # Protein_GI_number: 16329947 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Synechocystis # 58 455 37 431 433 140 29.0 7e-33 MAAGGLALAMFATIGLTACGGGSGSPGADSGSAIPKPDVACDIPEGNVDGKAIDTSKVEG EITFQTQGLQADFAEFFEAKIAEFEAANPGTKIKWTDQGGGAEFDQAITAQATNCQMADV VNTPSSSILALSKANLLMDYDVKAPGIGDKFVKSIWDSTAMGANGHHTALPWYFGPYITT YNKDVFKRAGLDENKPPATMEEMFDYAEQVHANSPEDFGIYGSPQWYMTAQLHGMGVKLL NNDNTAFDFADNEAAIAYVTRFAELYASGAIPKDSLTGEPDPGKAYTDGNLAFGTPNASF LKSVKSNNSAVYENTGVGPFPTNEGIKPVFEGQYIGVSVTTKNAPLAMKFAEWITNAENE YAWTHDGGAIIFPAATEALDKLVANPPEIADDPVFKAAYEQAAQSAKDAEAYTDVFYMTG GVSDILVENVNKAIRGEADPKAALKAAQDAMNAKLSALTE >gi|319978518|gb|AEUH01000112.1| GENE 2 1541 - 4444 3107 967 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 23 594 36 580 1087 339 37.0 1e-92 MDARVTPQSYSPSAGALCPPRSWLTDEPTISLNGSWDFVHHPRLPDPFDDPEKPSPCPVL GPCAERIDVPSHWVMAPHGRWGRPAYTNVDFPFPVDPPHPPADNPTGDYRRAFDVPAQWV GDGQIRLRFDGVESQARVWVNGTWIGMFTGSRLAHELDITGAVRPGANEVVVRVSQFSSG SYLEDQDQWWLPGIFRDVELRHLGAASVHDVWLSTDYRDGEGRVRVEALAPGGTGAALDG 
VLRLVGAEGEGAVLPVRLGPGPGGRHRSGWVDCGPVRPWSPDRPVLYRAELALDGAAPRG LRIGFTRVEVVGGQLRANGAPLVFNGVNRHEIRHDRGRVFDEAWVREDLALMKSLGVNAI RTSHYPPHPRVLDLADEIGLWVVLEGDIETHGFEGADRAWADNPSDDPRWADAFDERTRR AFERDKNHPSIALWSLGNESGTGANLAAAARWIHERAEGALVHYEGDHAMEYADVYSRMY PALEEVAAALDDSDPRAPIAVPGHPSAEVDGEARARARRAPYLMCEYLHAMGTGPGGAAL YAAQMDHPRHAGGFVWEWRDHALDQRHGGALRLAYGGDFGEPIHDGTFVCDGLVDAHSRI YAGTWAWARAMAPGAPALRALAAKRARVEERDADGARALRGLAQWCAAHGEDGAEAPEGA FHAVDVDATGAWTALGGLPVRASVSLYRAPTDNDRGRGPLDYWGLDEEGVGPMGEGLNRP GVSRADRWEEARLAFVRRSPRSLLERSDGARLVHERWGAASRQFGVDARSESRPLRIGLA GPWGPFLDGERVVIDLEPYGPLPRRLPRMGLRLELPGGPWEASWVGEASIAYADMPGDDP DGYGAGATDALWDVCVRPQEGGHRPGLKALRLGRGSDALYFVPLGALGWSVCPWDERELA EAAHWGDLAESDRTFLWLDAVQDGIGSRSCGPDSRPRFAAAPAPRRIAFVCARAHEAGAD GPRVTVG >gi|319978518|gb|AEUH01000112.1| GENE 3 4559 - 5584 1318 341 aa, chain + ## HITS:1 COG:ECs4695 KEGG:ns NR:ns ## COG: ECs4695 COG1609 # Protein_GI_number: 15833949 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 5 318 3 312 330 82 27.0 1e-15 MERVTRADVAREAGLAPSTVSLILNGRGRQVKISEATIARVERIARELNYIPNAAARSLR SGNSHLIALLMAELPDDPFVPVVHTVLTTAMIEIQRRGYLLVPLFQSSDDPSDAEVIRSV MGDTQLAGVIRETTSDERFSSTLLWNMGIPVVGMSMIEADDSADGAALVRIDESAGVAAI IRRTDLPSGEAAHAVFIAGPNTNHSRQNPVLARFGDRARMISLPDWDEATAYQAAFTLLS QDPRLELVCVADDSQAPGVIAAARDLDIPVPERLSVLGFGNQEPRSGERLGLTTVEWPLR EMTRLAVQEVIAAIESPSPGAVGRVRTLPTSPVWRTSVRAR >gi|319978518|gb|AEUH01000112.1| GENE 4 5897 - 6799 876 300 aa, chain - ## HITS:1 COG:no KEGG:ckrop_1874 NR:ns ## KEGG: ckrop_1874 # Name: not_defined # Def: hypothetical protein # Organism: C.kroppenstedtii # Pathway: not_defined # 5 296 22 314 323 310 54.0 4e-83 MAVPFIVGAYASLPAPESEADYYALLAEQPWVSGVEIPYPGQLADHGGVLASRLAPHWDF NTVTAIPGTMQNVWKDGGFGLASPDGAGRQAAIEFTRALNRDLGQLCERAGRPLVARVQL HSAPTRRAQADAFKRSLEELAAWDWCGAALVVEHCDRFIPEQTPEKGFLSLESEIDIVGG LGIGIHINWGRSAVEGRRADTAYEHVREAGGRGVLDGVVFSGAGPEETQYGYAWIDGHLP AQADEPASLMSDTEIGRCARAALGSGCHYLGAKVCVPKDATLEERLAMLARIHRASGAAR >gi|319978518|gb|AEUH01000112.1| GENE 5 
6805 - 8280 2451 491 aa, chain - ## HITS:1 COG:STM3338 KEGG:ns NR:ns ## COG: STM3338 COG0477 # Protein_GI_number: 16766633 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 14 486 10 475 496 436 50.0 1e-122 MNTSPAPALPKKRWYQYLTKEDRKAFFAAWIGVLLDGYDFVLISFALPAITQAFDLSLVQ SASLISAAFISRWIGGLVLGAIGDRYGRKPAMILSIFMFAFGSIACALAPNFWVLFVLRL IIGFAMAGEYSASAAYVIESWPKHMRNKASGFLLSGYAFGVIAAAQVDKYFVTWVDSLHP GWGWRALFLTGIIPIVVAIYMRRTLPEAADWSEAKEKGHVEKNDMLAVLFGGSRKVVNYI AVVIGFIALLLIFTQTVGGTLAVSLLGAVSAVIFIYLIIQFDSKRWIIGIAIMLTIFASF MYTWPIQGLLPTYLRGVGMDQQVVANVVSFAGLGNAAGYIVAGFAGDRFGMRRWYAVSLL LSQAIVFPLFMQNGQYVALVAGLLFFQQMFGQGVSGLLPKWVSSYFPVDKRAAGLGFCYN VGALGGAVGPVLGAAIAERLSLGLALAILSFGFAAVVMVSIGANIPRLLQKLVGPEWVRP EDGNDEIIEEG >gi|319978518|gb|AEUH01000112.1| GENE 6 8210 - 8377 141 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRSSFVRYWYQRFLGSAGAGDVFIGIPSYSFIVESRGPFRARHPTEICQTSCGRC >gi|319978518|gb|AEUH01000112.1| GENE 7 8558 - 10795 2967 745 aa, chain + ## HITS:1 COG:PM1000 KEGG:ns NR:ns ## COG: PM1000 COG4409 # Protein_GI_number: 15602865 # Func_class: G Carbohydrate transport and metabolism # Function: Neuraminidase (sialidase) # Organism: Pasteurella multocida # 309 741 48 558 1080 200 29.0 8e-51 MQSDTTDPGISWVLQFRARIDGTLIRLHSSEAIGWELSIRAGALFLEGSTNSMAFALDME DTASLTDGTWHALAITAAGAGSRIFLDGYQCFSTTADLSPAASGENAELSASPSAGIDVR AFAAHDSVLSPTEILALSPAPTPLIEFAAARLSDYDVAELSELTSGTIFARYRVRGPGQH GTILAAAGAGREQLNLSVTEEGIEYRVLGRRGQWRVFTAHGHWDQGRWHDVVVRVGHGAV EIYVDGYLEAHLPGQVFFAGVDALDEVVIGQDASGSRLFGEVRNAALYSCVLNDAQIKKL SSVAPLDTQCLFDAGFHDSISYRIPSLITLDSGVVVAGADQRETIANDSPNSINFTIRRS FDGGRTWGDLQTVLSYPGHGATGASVIDSCVVQDRSTGRLIVLIDHFPGGIGQPNAEAGA GVDAKGRYILHDAQGAQYTWNEDGSVTDSEGGRTPYTVSERGDVTVTEGGERARGGNVFL ADGEDPDQTLLTARTCFLQMIHSDDDGATWSGPVNLNQDVKEEWMSFCGTSPGTGVQLRS 
GRLVVPVYYNGDHKRHFSAAVVYSDDGGQTWKRGASPNDGRVFDGRRIDSRTLDTESAAT HEATLIERADGSLLMLMRNQHPSGKVAAAVSADGGETWGEVRFVQEIDEIFCQPNAVPWP SEDCPERVVFANASQMRPYRGRGVLRLSEDGGRTWTASRTFNPAHYVYQCMAVLPDGALG LLWEREMQGLYFTRIPLEWFEAAKQ >gi|319978518|gb|AEUH01000112.1| GENE 8 11060 - 12412 1905 450 aa, chain - ## HITS:1 COG:CAC0484 KEGG:ns NR:ns ## COG: CAC0484 COG1109 # Protein_GI_number: 15893775 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 1 446 1 445 448 413 50.0 1e-115 MPRMFGTDGVRGLANVDITADLALRLGEAAARQFGCRPRPDGSKPRAIIGRDTRISGQFL DHALAAGLASAGMDVTRIGVVTTPTVAHLTATEDGVDLGVMISASHNPMPDNGIKFFAHG GHKLADAVEDQIEALVGTDWERPTGEGVGEIDADHAWARESYIRHLVGAVGTDLRGLRVV IDCANGGASEFGPAVFRELGADATVINASPDGRNINLNAGSTHPEALMAAVVGAGADFGV AYDGDADRCLAVDHTGALVDGDKIMGALAVDAHRRGALAKDTLVVTVMSNLGLLLAMREA GIATVQTGVGDRYVLEEMLASGYSLGGEQSGHIIATDHATTGDGILSSLLVSRMVKESGQ SLADLTSFVHRLPQTLINVAGVDRAGASSNGAVAEAVAAAEARLGESGRVLLRPSGTEPL VRVMVEAATQEEADAVAESLADVVKRNLAL >gi|319978518|gb|AEUH01000112.1| GENE 9 12529 - 13020 699 163 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494577|ref|ZP_03924893.1| ribosomal protein S9 [Actinomyces coleocanis DSM 15436] # 1 163 1 163 163 273 84 4e-73 MAETTASVELDEEIVTGSYTTETESAPAGAGQSITAPGAGLGRRKEAVARVRLVPGSGQW TINGRTLEDYFPNKLHQQLVKSPFVLLDIDGRFDVVARINGGGVSGQAGALRLGISRALN EIDRDANRPALKKAGFLTRDARVTERKKAGLKKARKASQFSKR >gi|319978518|gb|AEUH01000112.1| GENE 10 13061 - 13504 592 147 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227497214|ref|ZP_03927462.1| ribosomal protein L13 [Actinomyces urogenitalis DSM 15434] # 1 147 1 147 147 232 74 1e-60 MRTYSPKPGDADKKWFVIDATDVVLGRLAAHAANLLRGKHKPTFAPHMDMGDYVIIINAD KVALTGKKLEQKMAYRHSGRPGGLTATAYRDLMATAPERVIEKAVKGMLPHTSLGRAQFK KLHVYAGAEHPHAGQSPVPYELTQVAQ >gi|319978518|gb|AEUH01000112.1| GENE 11 13714 - 14370 465 218 aa, chain - ## HITS:1 COG:MT3562 KEGG:ns NR:ns ## COG: MT3562 COG0101 # Protein_GI_number: 15843050 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 
Pseudouridylate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 36 210 119 295 297 126 49.0 3e-29 ETACAAALVRRANALLARAWAQREQGRGERPPRGVSDALVLSASPVSADFDARFSATGRS YVYRIADRPTPLRRWDCVWDDRPLDLAAVNRAAEALLGEHDFLSFCRPREGATTIRALTR LEAVRAPDGLVEFHVGADAFCHSMVRSLVGALALVGRGARPAPWPRALLDARSRASAAPI APARGLTLERVDYPGPEEWALRARLARRRRDACCGAES Prediction of potential genes in microbial genomes Time: Thu May 12 17:49:01 2011 Seq name: gi|319978464|gb|AEUH01000113.1| Actinomyces sp. oral taxon 178 str. F0338 contig00113, whole genome shotgun sequence Length of sequence - 50562 bp Number of predicted genes - 60, with homology - 50 Number of transcription units - 20, operones - 9 average op.length - 5.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 237 244 ## COG0101 Pseudouridylate synthase 2 1 Op 2 . - CDS 279 - 1028 961 ## COG1940 Transcriptional regulator/sugar kinase 3 1 Op 3 50/0.000 - CDS 1122 - 1715 471 ## PROTEIN SUPPORTED gi|229249301|ref|ZP_04373341.1| LSU ribosomal protein L17P 4 1 Op 4 32/0.000 - CDS 1754 - 2749 1558 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 5 1 Op 5 48/0.000 - CDS 2883 - 3290 587 ## PROTEIN SUPPORTED gi|229821594|ref|YP_002883120.1| ribosomal protein S11 6 1 Op 6 . - CDS 3344 - 3718 569 ## PROTEIN SUPPORTED gi|227494570|ref|ZP_03924886.1| ribosomal protein S13 7 2 Op 1 . - CDS 3867 - 3980 195 ## PROTEIN SUPPORTED gi|21223105|ref|NP_628884.1| 50S ribosomal protein L36 8 2 Op 2 9/0.000 - CDS 4020 - 4241 334 ## PROTEIN SUPPORTED gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 9 2 Op 3 12/0.000 - CDS 4432 - 5352 1174 ## COG0024 Methionine aminopeptidase 10 2 Op 4 28/0.000 - CDS 5352 - 5930 802 ## COG0563 Adenylate kinase and related kinases 11 2 Op 5 53/0.000 - CDS 5927 - 7219 908 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 12 2 Op 6 . - CDS 7407 - 7874 600 ## PROTEIN SUPPORTED gi|227497200|ref|ZP_03927448.1| ribosomal protein L15 13 2 Op 7 . 
- CDS 7877 - 8062 168 ## PROTEIN SUPPORTED gi|227876178|ref|ZP_03994294.1| ribosomal protein L30 14 2 Op 8 56/0.000 - CDS 8062 - 8742 892 ## PROTEIN SUPPORTED gi|227494562|ref|ZP_03924878.1| 30S ribosomal protein S5 15 2 Op 9 46/0.000 - CDS 8771 - 9142 449 ## PROTEIN SUPPORTED gi|23466146|ref|NP_696749.1| 50S ribosomal protein L18 16 2 Op 10 55/0.000 - CDS 9142 - 9684 655 ## PROTEIN SUPPORTED gi|62424403|ref|ZP_00379549.1| COG0097: Ribosomal protein L6P/L9E 17 2 Op 11 50/0.000 - CDS 9700 - 10098 553 ## PROTEIN SUPPORTED gi|227494559|ref|ZP_03924875.1| ribosomal protein S8 18 2 Op 12 50/0.000 - CDS 10161 - 10346 301 ## PROTEIN SUPPORTED gi|227494558|ref|ZP_03924874.1| ribosomal protein S14 19 2 Op 13 48/0.000 - CDS 10350 - 10895 792 ## PROTEIN SUPPORTED gi|227494557|ref|ZP_03924873.1| ribosomal protein L5 20 2 Op 14 57/0.000 - CDS 10898 - 11332 514 ## PROTEIN SUPPORTED gi|227494556|ref|ZP_03924872.1| ribosomal protein L24 21 2 Op 15 50/0.000 - CDS 11336 - 11704 568 ## PROTEIN SUPPORTED gi|227497191|ref|ZP_03927439.1| 50S ribosomal protein L14 22 2 Op 16 50/0.000 - CDS 11793 - 12059 361 ## PROTEIN SUPPORTED gi|84494777|ref|ZP_00993896.1| 30S ribosomal protein S17 23 2 Op 17 50/0.000 - CDS 12065 - 12298 289 ## PROTEIN SUPPORTED gi|227497189|ref|ZP_03927437.1| 50S ribosomal protein L29 24 2 Op 18 50/0.000 - CDS 12298 - 12717 649 ## PROTEIN SUPPORTED gi|227492438|ref|ZP_03922754.1| ribosomal protein L16 25 2 Op 19 61/0.000 - CDS 12723 - 13535 1087 ## PROTEIN SUPPORTED gi|227497187|ref|ZP_03927435.1| possible ribosomal protein S3 26 2 Op 20 59/0.000 - CDS 13537 - 13908 478 ## PROTEIN SUPPORTED gi|227494550|ref|ZP_03924866.1| ribosomal protein L22 27 2 Op 21 60/0.000 - CDS 13937 - 14218 434 ## PROTEIN SUPPORTED gi|227494549|ref|ZP_03924865.1| ribosomal protein S19 28 2 Op 22 61/0.000 - CDS 14235 - 15071 1327 ## PROTEIN SUPPORTED gi|227494548|ref|ZP_03924864.1| ribosomal protein L2 29 2 Op 23 61/0.000 - CDS 15095 - 15400 191 ## PROTEIN SUPPORTED 
gi|146340059|ref|YP_001205107.1| 50S ribosomal protein L23 30 2 Op 24 58/0.000 - CDS 15397 - 16041 775 ## PROTEIN SUPPORTED gi|227497182|ref|ZP_03927430.1| ribosomal protein L4 31 2 Op 25 40/0.000 - CDS 16044 - 16709 920 ## PROTEIN SUPPORTED gi|227494545|ref|ZP_03924861.1| ribosomal protein L3 32 2 Op 26 . - CDS 16721 - 17029 506 ## PROTEIN SUPPORTED gi|227428357|ref|ZP_03911414.1| SSU ribosomal protein S10P 33 3 Tu 1 . + CDS 16800 - 17264 123 ## gi|252128086|ref|ZP_04837481.1| conserved hypothetical protein - Term 17266 - 17320 11.9 34 4 Op 1 30/0.000 - CDS 17508 - 18695 1341 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 35 4 Op 2 51/0.000 - CDS 18856 - 20979 3305 ## COG0480 Translation elongation factors (GTPases) 36 4 Op 3 56/0.000 - CDS 21066 - 21536 724 ## PROTEIN SUPPORTED gi|227497165|ref|ZP_03927413.1| ribosomal protein S7 37 4 Op 4 . - CDS 21536 - 21910 579 ## PROTEIN SUPPORTED gi|227494540|ref|ZP_03924856.1| ribosomal protein S12 - Prom 22132 - 22191 1.6 38 5 Tu 1 . + CDS 22286 - 22408 65 ## + Term 22430 - 22475 17.9 39 6 Tu 1 . - CDS 22578 - 25532 4382 ## COG1472 Beta-glucosidase-related glycosidases - Term 25982 - 26044 1.6 40 7 Op 1 58/0.000 - CDS 26208 - 30110 5935 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit 41 7 Op 2 . - CDS 30126 - 33602 857 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 42 8 Tu 1 . - CDS 33783 - 34304 423 ## 43 9 Op 1 . + CDS 34423 - 34494 77 ## 44 9 Op 2 . + CDS 34509 - 34688 88 ## - TRNA 35112 - 35188 96.7 # Met CAT 0 0 45 10 Op 1 . - CDS 35284 - 36159 878 ## COG3001 Fructosamine-3-kinase 46 10 Op 2 . - CDS 36254 - 38272 2780 ## COG1479 Uncharacterized conserved protein 47 11 Tu 1 . + CDS 38626 - 39723 615 ## COG0270 Site-specific DNA methylase - Term 39606 - 39640 3.2 48 12 Op 1 . - CDS 39709 - 41556 335 ## ROP_pROB02-00110 hypothetical protein 49 12 Op 2 . - CDS 41610 - 44339 1128 ## ROP_pROB02-00120 hypothetical protein 50 13 Tu 1 . 
+ CDS 44607 - 45062 -263 ## + Prom 45508 - 45567 2.4 51 14 Op 1 . + CDS 45643 - 45759 115 ## 52 14 Op 2 . + CDS 45784 - 46176 72 ## 53 14 Op 3 . + CDS 46142 - 46612 148 ## + Term 46776 - 46807 1.0 54 15 Tu 1 . - CDS 46860 - 47600 473 ## ROP_pROB02-00150 hypothetical protein 55 16 Tu 1 . - CDS 48005 - 48112 181 ## 56 17 Op 1 . + CDS 48142 - 48606 526 ## PFREUD_11190 hypothetical protein 57 17 Op 2 . + CDS 48599 - 48904 334 ## COG1846 Transcriptional regulators 58 18 Tu 1 . - CDS 48999 - 49595 565 ## Bsph_2619 hypothetical protein 59 19 Tu 1 . + CDS 49656 - 49841 121 ## + Term 50043 - 50093 15.7 - TRNA 49730 - 49805 91.2 # Ala CGC 0 0 60 20 Tu 1 . - CDS 50106 - 50561 520 ## gi|154507880|ref|ZP_02043522.1| hypothetical protein ACTODO_00364 Predicted protein(s) >gi|319978464|gb|AEUH01000113.1| GENE 1 3 - 237 244 78 aa, chain - ## HITS:1 COG:MT3562 KEGG:ns NR:ns ## COG: MT3562 COG0101 # Protein_GI_number: 15843050 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 1 78 22 99 297 84 58.0 7e-17 MRVRIDLAYDGTRFHGWAAQPGLRTVQGELEAVLSTLVRQPVATTVAGRTDAGVHARHQV AHVDLPGAAWRALAPRSA >gi|319978464|gb|AEUH01000113.1| GENE 2 279 - 1028 961 249 aa, chain - ## HITS:1 COG:all1371 KEGG:ns NR:ns ## COG: all1371 COG1940 # Protein_GI_number: 17228866 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Nostoc sp. 
PCC 7120 # 6 247 10 233 239 149 33.0 4e-36 MSHEMTLAIDCGGGGIKGSVVDEKGTMVAPARRVPTPYPLPPGLLVRTVCDLAAGLPRAT RVTMGMPGMIRHGVVVATPHYITKDGPRSRVLPELVEQWDHFDMGRAIAAALSLPTKVLN DAEVAGAGAVTGRGLEMIITLGTGLGNAVFDGGVLAPHVEVSQGFVRWGLTYDDYIGEHE RLRLGDTHWSRRVRRVVDGLRAMYMWDRLYMGGGNSKRITAANRAKMGDDIIIVPNDAGI IGGVRAWEL >gi|319978464|gb|AEUH01000113.1| GENE 3 1122 - 1715 471 197 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229249301|ref|ZP_04373341.1| LSU ribosomal protein L17P [Catenulispora acidiphila DSM 44928] # 1 177 1 176 176 186 56 3e-46 MPTPKKGARLGGSPAHQKLILSNLAKQLIEHRAVTTTEAKAKVLRPYMEKLITKAKRGDM HARRTVLKKVPSKDAVYTLFDEIVPAMDPERRGGYTRLVRVGNRKGDNAPLMRIELVMEK VEKKAVVSSAERTAAKAAEKAAAAEVAKEAGEAAGAAAERAEEAREEGDAAKAEEAERIA VEAGTAADAAEQVAEEE >gi|319978464|gb|AEUH01000113.1| GENE 4 1754 - 2749 1558 331 aa, chain - ## HITS:1 COG:MT3564 KEGG:ns NR:ns ## COG: MT3564 COG0202 # Protein_GI_number: 15843052 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Mycobacterium tuberculosis CDC1551 # 1 316 1 315 347 434 72.0 1e-121 MLIAERPILTEEKVSDLRSRFTLEPLEPGFGYTLGNSLRRTLLSSIPGAAVTSIRIDGVP HEFRTIEGVKEDVAEIILNIKQIVLSSENDEPVVMYLRKAGPGPVVAGDITPPAGVEIHN PDLHLATLNEKGKLEIELTVERGRGYVSAAQNKDPQAEISRIPIDSIYSPVVKVTYKVEA TRVEQRTDFDRLVIDVETKPAITPRDAIASAGKTLVELFGLMRELNEEAEGIEVGPSTTD ATLAADLALPIENLNLQSRSYNALRRRGILTVGELVAHSEADLLDIRNFGTKSIEEIKES LATLGMTLKDSVMPSFGEGDSPSIFSDDQSS >gi|319978464|gb|AEUH01000113.1| GENE 5 2883 - 3290 587 135 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229821594|ref|YP_002883120.1| ribosomal protein S11 [Beutenbergia cavernae DSM 12333] # 1 134 1 134 135 230 85 1e-59 MATKNARSGAGRKPRRKISKNVTHGHAYIKSTFNNTIVSLTDPTGAVVAWASSGQVGFKG SRKSTPFAAQLAAEAAARRAQEHGMKKVDVFVKGPGSGRETAIRSLQAAGLEIGSIQDVT PQAFNGTRPPKRRRG >gi|319978464|gb|AEUH01000113.1| GENE 6 3344 - 3718 569 124 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494570|ref|ZP_03924886.1| ribosomal protein S13 [Actinomyces coleocanis DSM 15436] # 1 122 1 122 122 223 91 1e-57 
MARIAGVDLPREKRLEIALTYIYGVGRTRAAETLAATGVSPDTRVKDATEEELVKLRDYI EANFKVEGDLRREVQADIRRKIEIGSYQGLRHRRGLPVHGQRTKTNARTRKGPKRTVAGK KKAK >gi|319978464|gb|AEUH01000113.1| GENE 7 3867 - 3980 195 37 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|21223105|ref|NP_628884.1| 50S ribosomal protein L36 [Streptomyces coelicolor A3(2)] # 1 37 1 37 37 79 94 3e-14 MKVKPSVKKICDKCKVIRRHGNVMVICDNPRHKQRQG >gi|319978464|gb|AEUH01000113.1| GENE 8 4020 - 4241 334 73 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 [Mycobacterium tuberculosis H37Rv] # 1 73 1 73 73 133 87 2e-30 MAKKDGVIEMEGTVVEALPNAMFRVELKNGHVVLGHIAGKMRQHYIRILPEDRVVVELSP YDLSRGRIVYRYK >gi|319978464|gb|AEUH01000113.1| GENE 9 4432 - 5352 1174 306 aa, chain - ## HITS:1 COG:BS_map KEGG:ns NR:ns ## COG: BS_map COG0024 # Protein_GI_number: 16077206 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus subtilis # 7 286 5 246 248 185 37.0 8e-47 MGRIELKTAQELRFMREAGLVVAGIHEKLREAVRPGVTTAELDAVSSDAIAAAGAASNFL GYYDYPATVCVSVNDVVVHGIPGGYRLRAGDLVSFDCGAWLTRRGRQWHGDAAFSVVVSD PWIDDAAFAAGERAAAEPAGGGQDPESAALRRRRELDTVTRECLWAALAALATGRRVSAV GAAVEDVVAARARELGWEAGIIEEYTGHGIGTKMHMEPEVLNFNARGRSPRLKKGMVLAV EPMLVSGGIATRTEDDGWTVRTEDGGDAAHWEHTIAIAEGGVCVLTARDGGAAGLAPYGV APISLD >gi|319978464|gb|AEUH01000113.1| GENE 10 5352 - 5930 802 192 aa, chain - ## HITS:1 COG:MT0757 KEGG:ns NR:ns ## COG: MT0757 COG0563 # Protein_GI_number: 15840140 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Mycobacterium tuberculosis CDC1551 # 4 181 3 174 181 170 52.0 1e-42 MTVVILLGPPGAGKGTQAARIAQRLSIPAISTGDIFRANMAEGTEIGKQAKAYMDRGEFV PDSVTNTMVRARLAAPDTADGFLLDGYPRSVEQAHTLRDMLLDLGKQIDVVVEIQVDEDE VVARMLKRAQEQHRSDDTEPVMRRRLEVYRQQTEPVATYYVDQDLLEAVDGKGTIDEVTA RINTVLDSVARR >gi|319978464|gb|AEUH01000113.1| GENE 11 5927 - 7219 908 430 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S 
ribosomal protein S11 [alpha proteobacterium BAL199] # 10 429 19 437 447 354 42 6e-97 MLSAFAQAFRTPDLRRKLLFTLFIMAVFRLGSFIPTPGVNSTAIQACIAQQNEASLLDLV NIFSGGALLQLSVFALGIMPYITASIIIQLLRVVIPRFDDLHKEGQSGQAKLTQYTRYLT IFLGVLQSTTTISLARSGQLFAGCNQEVIKDRSVLTFLMMIVVMMAGTGVIMWLGELVTE RGIGNGMSLLIFTSIAARLPSQMLGIGRAGKWGSVLVILVLLLVVALAVVYVEQAQRRIP VQYAKRMVGRRQYGGTTTYIPLKINMSGVIPVIFASSILALPPMVAQFGQPTDGWVQWIS GHFTQSSGFYLALYALMTLFFAFFYTAITFNPDEVADNMKRYGGFIPGYRAGRPTAEYLR YVINRITSAGAVYLVIIALLPSLLIIPLNLASGDMPFGGTTLLIIIGVGLQTVKEINTQL QQHHYEGFLL >gi|319978464|gb|AEUH01000113.1| GENE 12 7407 - 7874 600 155 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227497200|ref|ZP_03927448.1| ribosomal protein L15 [Actinomyces urogenitalis DSM 15434] # 1 155 1 157 157 235 73 3e-61 MADSTKKDAQIVKLHHLRPAPGAKTAKTRVGRGEGSKGKTAGRGTKGTKARYQVAAGFEG GQMPIHMRLPKLRGFKNPFRVEYQVVNVGKLGELFPEGGKVTVEDLIAKHAVRDSAPVKV LGGGELSVKLEVEADSWSSSAEAKITAAGGTISAR >gi|319978464|gb|AEUH01000113.1| GENE 13 7877 - 8062 168 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227876178|ref|ZP_03994294.1| ribosomal protein L30 [Mobiluncus mulieris ATCC 35243] # 1 61 1 61 61 69 57 4e-11 MAKKIKITQVKSSAHSRQNMKDTLRTLGLRKIGQSVVREESDAVLGAIRTVRHLVTAEEV D >gi|319978464|gb|AEUH01000113.1| GENE 14 8062 - 8742 892 226 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494562|ref|ZP_03924878.1| 30S ribosomal protein S5 [Actinomyces coleocanis DSM 15436] # 1 226 1 230 230 348 80 5e-95 MAAQQRDRNGQAGDGDRRERRSGRSDRDGRRNNENKNEYIERVVTINRVSKVVKGGRRFS FTALVVVGDGEGTVGVGYGKAKEVPQAISKGVEEAKKNFFRVPMIRRTITHLVEGRDTAG IVLLRPAAPGTGVIAGGPVRAVLEAAGVHDVLTKSLGSSNAINIVHATVDALKRLEQPEA VAARRGLPLERVAPASMLRARAEGEADKRAAKEAAENASAVAGEAK >gi|319978464|gb|AEUH01000113.1| GENE 15 8771 - 9142 449 123 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|23466146|ref|NP_696749.1| 50S ribosomal protein L18 [Bifidobacterium longum NCC2705] # 1 123 1 123 123 177 73 1e-43 MAYSIKGKGKAVARKRRHLRLRKKINGTPERPRMVVTRSNRHMVVQVIDDTIGHTLVSAS DIEADIAAAGGTKTERARKVGAAVAERAKAAGISAVVFDRGGNKYFGRVAAVAEAAREGG LEL 
>gi|319978464|gb|AEUH01000113.1| GENE 16 9142 - 9684 655 180 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|62424403|ref|ZP_00379549.1| COG0097: Ribosomal protein L6P/L9E [Brevibacterium linens BL2] # 1 180 1 178 178 256 70 1e-67 MSRIGKNPIAIPAGVDVAIDGQKVTVKGPKGTLSHVVAEPITAKVDGGQVVVSRPNDERT ARSLHGLSRTLIANMIIGVTEGYKKQLEITGTGYRVVAKGKDLEFSLGFSHTITVTPPDG IEFTIEPRSQTLFTVSGIDKQLVGETAARIRKLKKPEPYKGKGIHYVGETIRRKVGKAGK >gi|319978464|gb|AEUH01000113.1| GENE 17 9700 - 10098 553 132 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494559|ref|ZP_03924875.1| ribosomal protein S8 [Actinomyces coleocanis DSM 15436] # 1 132 1 132 132 217 81 9e-56 MTMTDPIADMLTRLRNANRAHHDSVSMPYSKLKNAIADILVEEGYIASTAVEDARVGKTL TISLKYGSHREAAIQGLQRVSKPGLRVYAKSTNLPKVRGGLGVAILSTSSGLLTDRQAAE KGVGGEVLAYVW >gi|319978464|gb|AEUH01000113.1| GENE 18 10161 - 10346 301 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494558|ref|ZP_03924874.1| ribosomal protein S14 [Actinomyces coleocanis DSM 15436] # 1 61 1 61 61 120 93 2e-26 MAKTSLKVKAARKPKFGVRAYTRCNRCGRPHSVYRKFGLCRVCLRELALRGELPGVTKSS W >gi|319978464|gb|AEUH01000113.1| GENE 19 10350 - 10895 792 181 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494557|ref|ZP_03924873.1| ribosomal protein L5 [Actinomyces coleocanis DSM 15436] # 1 181 1 181 181 309 83 2e-83 MAPRLKEKYASEIREALRAEFKHENIMQVGGLKKIVVNMGVGEAARDSKALEGAIRDLTA ITGQKPVTTRAKKSIAQFKLREGQAIGAHVTLRGDRMWEFLDRLLATALPRIRDFRGLSP KQFDGHGNYTFGLTEQSMFHEIDQDSIDRVRGMDITVVTSATTDEEGRALLRHLGFPFKE D >gi|319978464|gb|AEUH01000113.1| GENE 20 10898 - 11332 514 144 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494556|ref|ZP_03924872.1| ribosomal protein L24 [Actinomyces coleocanis DSM 15436] # 1 144 1 146 148 202 71 3e-51 MAAKIRKGDLVEVVRGRTSDEKALAKRNERRAAEGLEPLKPGDKGKQGRVIKVFPAEQKV LVEGVNLKTRHVRQGQTQGSAGGIETVEAPISLSKVALVDPDTKKRVRVGFREDTVERDG RTRTVRVRVTRGGAKRGIEQGKEI >gi|319978464|gb|AEUH01000113.1| GENE 21 11336 - 11704 568 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227497191|ref|ZP_03927439.1| 50S ribosomal protein L14 [Actinomyces 
urogenitalis DSM 15434] # 1 122 1 122 122 223 93 2e-57 MIQQESRLKVADNTGAKEILTIRVLGGSGRRYAGIGDTIVATVKDAIPGGNVKKGEVVKA VVVRTRKEIRRPDGSYIRFDENAAVIINNNGDPRGTRIFGPVGRELRDKKFMRIVSLAQE VI >gi|319978464|gb|AEUH01000113.1| GENE 22 11793 - 12059 361 88 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|84494777|ref|ZP_00993896.1| 30S ribosomal protein S17 [Janibacter sp. HTCC2649] # 1 88 10 97 97 143 79 2e-33 MAETTARNERKVRRGYVVSDKMDKTVVVRIEDRVKHALYGKVMRKNTKVKVHDEHNECGV GDLVLIMETRPLSATKRWRVVEILEKAK >gi|319978464|gb|AEUH01000113.1| GENE 23 12065 - 12298 289 77 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227497189|ref|ZP_03927437.1| 50S ribosomal protein L29 [Actinomyces urogenitalis DSM 15434] # 1 77 3 79 81 115 74 4e-25 MAEKGLTTADLDAMDDAQLSKELEKAKAELFNLRFAQAVGNLEDHGRMKTVRRDIARIYT IARERELGYRTAPSTEE >gi|319978464|gb|AEUH01000113.1| GENE 24 12298 - 12717 649 139 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227492438|ref|ZP_03922754.1| ribosomal protein L16 [Mobiluncus curtisii ATCC 43063] # 1 138 1 138 138 254 85 7e-67 MLIPRRTKYRKQHRPHRTGFATGGTELAFGDYGIQALEGAYITNRQIEAARIAVTRHIKR GGKVWINIFPDRPLTKKPAETRMGSGKGAPEWWVAPVKPGRILFELSGVPEELAREALSR AQHKLPIKSRFVVREGGDI >gi|319978464|gb|AEUH01000113.1| GENE 25 12723 - 13535 1087 270 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227497187|ref|ZP_03927435.1| possible ribosomal protein S3 [Actinomyces urogenitalis DSM 15434] # 1 270 1 269 269 423 80 1e-117 MGQKVNPTGFRLGITTDHRSRWFSDSTTKGQRYSDYVAEDVAIRKALTDNLERAGIAKVD IERTRDRVRVDLHTARPGIVIGRRGAEADRLRQSLEKLTGKQVQLNILEVKNPDLDAQLV AQGIAEQLAARVSFRRAMRKGIQSAQRAGAKGIRVQVSGRLGGAEMSRSEFYREGRVPLH TLRANIDYGFHEARTTFGRIGVKVWIYKGDITEREFARQQAEQGGRGGRRDGRRGGPRRG GRPNQETASQQAPREAEAPGAAATKTGSEA >gi|319978464|gb|AEUH01000113.1| GENE 26 13537 - 13908 478 123 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494550|ref|ZP_03924866.1| ribosomal protein L22 [Actinomyces coleocanis DSM 15436] # 1 121 1 121 122 188 78 5e-47 MEAKAQARYIRVTPQKSRRVVNEVRGKRALEAADILRFAPQAVATDVRKVLMSAIANSRV 
KAENAGETFNEADLLVLEAYVDEGPTMKRIRPRAQGRAGRILKRTSHITIVVGTKEDKKK GDR >gi|319978464|gb|AEUH01000113.1| GENE 27 13937 - 14218 434 93 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494549|ref|ZP_03924865.1| ribosomal protein S19 [Actinomyces coleocanis DSM 15436] # 1 93 1 93 93 171 88 6e-42 MPRSLKKGPFVDDHLLKKVDAANESGSKAVIKTWSRRSVITPDFLGLTIAVHDGRKHVPV FVTESMVGHKLGEFAPTRTFRSHDKDDRKGRRR >gi|319978464|gb|AEUH01000113.1| GENE 28 14235 - 15071 1327 278 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494548|ref|ZP_03924864.1| ribosomal protein L2 [Actinomyces coleocanis DSM 15436] # 1 278 1 278 278 515 88 1e-145 MGIRNYKPTTPGRRFSSVSDFVEITRDTPEKSLVRPLHKTGGRNNSGRVTSRHRGGGHKR QYRVIDFRRSDKDGVPAKVAHIEYDPNRTARIALLHYADGTKRYILAPAGLKQGDRIEAG PSADIKPGNSLQLRNIPLGTVVHAVELYPGGGAKIARSAGTSVQLVAKEGRFAQLRMPSG EIRNVEATCRATIGEVGNAEQSNINWGKAGRMRWKGKRPHVRGVVMNPVDHPHGGGEGRT SGGRHPVSPWGKPEGRTRRPKKASDRLIVRRRKTGKNR >gi|319978464|gb|AEUH01000113.1| GENE 29 15095 - 15400 191 101 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|146340059|ref|YP_001205107.1| 50S ribosomal protein L23 [Bradyrhizobium sp. 
ORS278] # 1 92 1 90 99 78 47 9e-14 MSTIEPGKNPRDIIFKPVTTEKSLALMDQGKYTFEVDPRANKTEIKIAIERIFGVKIGSI ATQNRKGKTYRTRGGIGKRKDVKRAIVTVREGTIDIFGDQA >gi|319978464|gb|AEUH01000113.1| GENE 30 15397 - 16041 775 214 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227497182|ref|ZP_03927430.1| ribosomal protein L4 [Actinomyces urogenitalis DSM 15434] # 1 212 1 211 214 303 72 2e-81 MSEVRSVDVLDLEGKKAATVDLPAEIFDVPTNIPLMHQVVVAQLAAARQGTHSTKTRGQV SGGGRKPFRQKGTGNARQGSIRAPQFTGGGVVHGPKPRDYEQRTPKKMKRGALRSALSDR ARADRIHVVTALFEGDKPSTKAALASLRAIVDDRQALVVLERGDQLTALSLRNVPEVHVL WVDQLNTYDVLDADDIVFTQAALESFLGTKEETK >gi|319978464|gb|AEUH01000113.1| GENE 31 16044 - 16709 920 221 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494545|ref|ZP_03924861.1| ribosomal protein L3 [Actinomyces coleocanis DSM 15436] # 1 221 1 221 221 358 78 3e-98 MTNKKQHAAAPIQALLGRKIGMTQVWDADGRLVPLTVVQVGTNVVTQVRTEEVDGYSAIQ LGFGEIDPRKVTKPLQGHFAKAGVAPRRHLAEVRTSEHDSYEVGQELDASTFEAGQSVDV SGNTKGKGFAGVMKRHGFAGVSASHGAHRNHRKPGSIGACATPGRIFKGLRMAGRMGGNR RTVQNLRIQAVDTEKGLLLISGAIPGPKNGVVLVRSAVKGA >gi|319978464|gb|AEUH01000113.1| GENE 32 16721 - 17029 506 102 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227428357|ref|ZP_03911414.1| SSU ribosomal protein S10P [Xylanimonas cellulosilytica DSM 15894] # 1 102 1 102 102 199 98 3e-50 MAGQKIRIRLKSYDHEVIDSSARKIVETVQRAGATVVGPVPLPTEKNVFVVIRSPHKYKD SREHFEMRTHKRLIDIIDPTPKAVDSLMRIDLPADVNIEIKL >gi|319978464|gb|AEUH01000113.1| GENE 33 16800 - 17264 123 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|252128086|ref|ZP_04837481.1| ## NR: gi|252128086|ref|ZP_04837481.1| conserved hypothetical protein [Corynebacterium matruchotii ATCC 14266] # 2 62 37 97 97 86 78.0 5e-16 MMSMSRLCVRISKCSRLSLYLWGDRITTKTFFSVGSGTGPTTVAPARCTVSTILRAELSM TSWSYDLSRMRIFCPAMPSYLFSYRFGAGTRTRACRTGHSAGGRGCPASCPKILGIMPGP GAMARADHPRVAGATVPFCQSSTHEDKPCWRSRM >gi|319978464|gb|AEUH01000113.1| GENE 34 17508 - 18695 1341 395 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 
395 1 407 407 521 64 1e-147 MAKAKFERTKPHVNIGTIGHVDHGKTTLTAAITKVLHDKYPELNEFTPFDQVDNAPEERD RGITINVSHVEYQTEARHYAHVDAPGHADYVKNMITGAAQMDGAILVVAATDGPMAQTRE HVLLARQVGVPTILIALNKADMVDDEEMLELVEEECRDLLESQDFDRDAPIIQVSALKAL EGDPEWTKKIEELMEAVDTYIPTPERDMDKPFLMPIEDVFTITGRGTVVTGRVERGKLPI NSEVEILGIREPQKTTVTGIEMFHKSMDEAWAGENCGLLLRGTKRDEVERGQVVAVPGSI TPHTEFKGQVYILKKEEGGRHNPFFSNYRPQFYFRTTDVTGVITLPEGTDMVMPGDTTEI TVELIQPIAMEPGLGFAIREGGRTVGSGRVTEILK >gi|319978464|gb|AEUH01000113.1| GENE 35 18856 - 20979 3305 707 aa, chain - ## HITS:1 COG:Cgl0488 KEGG:ns NR:ns ## COG: Cgl0488 COG0480 # Protein_GI_number: 19551738 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Corynebacterium glutamicum # 1 706 1 706 709 995 71.0 0 MALEVLTDLKKVRNIGIMAHIDAGKTTVTERILFYTGINYKIGETHDGASTTDWMEQEKE RGITITSAAVTSFWNGNQINIIDTPGHVDFTVEVERSLRVLDGAVAVFDGKEGVEPQSET VWRQADKYEVPRICFINKMDKMGADFYFSVSTIKDRLHATPIVMQLPMGAESDFVGNIDL VTMEALTYPAKDDKGNDTLGALVVRGPIPEEYQEKAEEYREALVEAAAEGTDELTELYLE NGELTVEQIKEGIRALTVAGRAFPVYCGTALRNMGVQPVLDAVIDFLPSPLDIGEVRGFK PGSDTEEETESRKPSESEPFAALAFKIAAHPFYGKLTFVRVYSGKVDAGAQVMNSTKGKK ERIGKIFQMHSNKENPVDQAHAGHIYAVIGLKDTTTGDTLCAMDKPIVLESMTFPAPVIQ VAVEPKSKADQEKMGVAIQKLAEEDPTFTVELDPETGQTIIGGMGELHLDIIVDRMRREF KVEANVGNPMVAYRETIRKKVDKYEYTHKKQTGGSGQFARVIIALEPLPAGSEEEYEFED KVTGGRVPREYIPSVDHGIRDAMQSGVLAGYPMVDIKATLLDGAYHEVDSSEMAFKVAGS MALKEAAKKASPTLLEPIMAVEVRTPEEYMGDVIGDLNSRRGAISSMDEQHGVRVVKAQV PLSEMFGYVGDLRSKTQGRAVYSMEFDSYAEVPRAVAEEVIGKSRGN >gi|319978464|gb|AEUH01000113.1| GENE 36 21066 - 21536 724 156 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227497165|ref|ZP_03927413.1| ribosomal protein S7 [Actinomyces urogenitalis DSM 15434] # 1 156 1 156 156 283 91 1e-75 MPRKGPAPKRPLVDDPVYGSKVVTQLVNRVLLDGKKTVAERIVYGALESVRERTEQEPVS VLKRALDNIRPALEVRSRRVGGATYQVPVEVRPARATTLALRWLVDFSRQRREKTMTERL ANEILDASNGLGAAVKRREDMHKMAESNRAFAHYRW >gi|319978464|gb|AEUH01000113.1| GENE 37 21536 - 21910 579 124 aa, chain - ## PROTEIN SUPPORTED ## NR: 
gi|227494540|ref|ZP_03924856.1| ribosomal protein S12 [Actinomyces coleocanis DSM 15436] # 1 124 1 124 124 227 91 9e-59 MPTIQQLVRKGRVSKRAKVKTRALKGSPQRRGVCTRVYTTTPKKPNSALRKVARVRLSSG VEVTAYIPGEGHNLQEHSIVLVRGGRVRDLPGVRYRIVRGSLDTQGVRGRQQARSRYGAK KEKK >gi|319978464|gb|AEUH01000113.1| GENE 38 22286 - 22408 65 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVRNASGFDGFGQIRAESGRLRRVLDHQTDHNAQNGTQYP >gi|319978464|gb|AEUH01000113.1| GENE 39 22578 - 25532 4382 984 aa, chain - ## HITS:1 COG:TM0025 KEGG:ns NR:ns ## COG: TM0025 COG1472 # Protein_GI_number: 15642800 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 103 526 334 705 721 165 31.0 4e-40 MLSINWDDVWKMVETIKIPLIVIGVALALAVVVSIAVIRMNRPARKLTRSTVWVAAFVAV VVAVVSMMYGGFRTVLDLASGKGALTDASKAQVEGLGNDISDEGMVLLKNTDGALPLQKG SAVNVWGWGSTNPIYGGTGSGSLSADNPTTSLLDGLANAGFSTNSDLSALYTGYRAERPE VGMFKADWSLPEPTQDMYSDDLLSGVRTFSDTSVVAIARSGGEGFDLPRDVNKEARENPY FSYTDNSTEYTDFQDGQGYLELTRPERDMIALAKQNSEKTIVVVNAANAFQLGELQDDPD IDAIVWAIPGGQVGFNALGRILDGEVNPSAKTPDTFPRIIKAGPAANNFGDFQYTNMTEF AQEDPFNKGTFTQPSFVNYTDSIYVGYRWYETAAAEGVIEYASEVVYPFGHGLSYTSFTQ EMSELRESADGVLSAEVTVTNTGQVAGKDVVQVYSNPPYTDGGIEKPAANLIAYEKTKLL QPGESQTVPVTWSRDSLASYDAKDAKAYVLEAGDYRISIRSNSHEILSEQSFPVAETTVY DSAEDTHDGDLVVATNHFDDAAGDVAYLSRAGHFANLAEATAAPTGFEMSAEHKAAFLAT SNYDPDSDEAGLSVRRPTTDARNGLVLGDLHGLAYDDPKWDQLLDQLSVDDMNSLISKGG YGSPAVSSVGKLRVSDVDGPASLNNNFTGVGSIGLPSAVSVAATFNKELARSFGDAIGTM AHDMQVSGWYAPATNTHRYAYAGRNFEYFSEDPLLAGAQVAEEIQGAQAKGVYAFIKHFA LNDQETNRTHMLSTWTNEQALREVYLRPFEIGVKQGGAHAVMSAFNYIGTTYAGAHPELL NTVLREEWGFQGMVLTDYFAGYGYQNADQIVRNGGDFMLATLDIDIAKVNTQSDAGLAQM RRASHNILYTVANSWMYENGQPEVERNTWEYITWVVAGALILALLGLEVVALRRYRARAA VGPESAPPTRDTDDGDGTRGGGAE >gi|319978464|gb|AEUH01000113.1| GENE 40 26208 - 30110 5935 1300 aa, chain - ## HITS:1 COG:MT0696 KEGG:ns NR:ns ## COG: MT0696 COG0086 # Protein_GI_number: 15840071 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # 
Organism: Mycobacterium tuberculosis CDC1551 # 1 1297 1 1313 1316 1611 62.0 0 MLDAKTFDSLKITLATGDDIATWSHGEVKKPETINYRTLKPERDGLFGEQIFGPTRDWEC ACGKYKRVRYKGIICEKCGVEVTRSKVRRERMGHIDLAAPVTHIWYFKGVPSRLGYVLNL APKDLEKVIYFAAYMITEVDEKGRHEDLAELRAELEVQKKQMENNRDAAINDFAEKLEAD IAQMEASGASAAERDRARKQGERDMAKVRRRFDADIDGLEDTWERFKNLKVGDLEGNERL YRSMVARYGTYFKGDMGASAIQKRLETFDLAAEVEALRQTIASDSGPRKARAIKRLKVIN AFVVTGNSPASMVLTKIPVIPPDLRPMVQLDGGRFATSDLNDLYRRVINRNTRLKRLLEL GAPEIIVNNEKRMLQESVDALFDNGRRGRPVAGPGNRPLKSISDMLKGKQGRFRQNLLGK RVDYSGRSVIVVGPQLQLHQCGLPKQMALELFKPFVMKRLVEKNYAQNVKAAKRKVERQR PEVWDVLDDVIREHPVLLNRAPTLHRLGIQAFEPQLIEGKAIQLHPLACGAFNADFDGDQ MAVHLPLGAEAQAEARILMLSTNNILKPSDGRPVAMPSQDMIIGLFHLTSTPDPSVPVEV DEAGNPVVPYFSSQAEAQMAFDAGNLDLNATARIRFSDGTVPPEGWEAPEGWQPGDDIIL ETSLGRAVFNEQLPTDYPFVNEVVGKKQLGDIVNTLTQRYPNVLVADCLDALKSAGFHWS TWSGITIAFSDIQASPRKREILAGYEAKAAEIVEQFETGIILEETRYEELVKLWLQCTEE VAQDMRDNFSERNTVYRMVNSGARGNWSQVQQIAGMRGLVSDPKQKLIEQPIKANYREGL TVLEYFIATHGARKGLVDTALRTAESGYLTRRLVDVSQDVIVREADCGTRAGLKIRIAQK NDAGQWEPSELIETTAYARNLARDAVNEAGEVVMPAGTDLGDDQLAELVAAGVGEITCRS VLTCESQVGTCAACYGRSLATGKQVDIGEAVGIIAAQSIGEPGTQLTMRTFHTGGAASAA DITQGLPRVQELFEARAPKVEAKMNEAAGRVHIDDEDPSVRRVVITRDDAKEDLVVEVSR RQKLLVAEGQHIDAGTPLTEGQLDPKEILRIMGRNVAQKMLVDEVQKVYRDQGVGIHAKH IEVIVRQMLRRVTILEPGDTSFMPGELVDRMAYLTQNRRVAAEGGQPASGRQMLMGITKA SLATDSWLSAASFQETTKVLTEAAMNGKSDSLVGLKENVILGKLIPAGTGLARYNDVIVE PTPEALANSNYSEVDFQDGQVSEDFLDTLGSIDFGMSFHE >gi|319978464|gb|AEUH01000113.1| GENE 41 30126 - 33602 857 1158 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 670 1117 906 1386 1392 334 43 5e-91 MAANSTAKQHTPRVSFAKLREPLQAPNLLGLQVESFDWLLGNDAWKARVEAAKENGRTDV PDVSGLEEIFHEISPIEDVAGTMSLSFRDHRLESPKYSMEEAKTKDYTYSAPMYVTAEFM NYETGEIKSQTVFMGDFPLMTPRGTFIINGTERVVVSQLVRSPGVYFEKLSDRSTDKDTF GAKVIPSRGAWLEFEIDKRDAVGVRIDRKRKQSVTHFLKAIGMTESEIRDSFADYPVLLE TLEKDTVNTQEEALLDIYRKLRPGEPANVDAAQTLLNNFYFDSRRYDLAKVGRYKINKKL AIEADPRASTLSLEDIVATVKYLLALHQGEASFAGTRGGEAVQVRVETDDIDNFANRRIR 
AVGELIQGQVRTGLARMERTVRERMTTQETDSITPQSLINIRPVVAAIKEFFGTSQLSQF MDQNNPLAGLTHKRRLSALGPGGLSRDRASMEVRDVHPSHYGRMCPIESPEGPNIGLIGS LATFARINPFGFIETPYRIVRDGKVTDEIRYMTADDEKAHFIAQASSPLTEDGTFAERDV LCRVAGEEPTLIARERVDYMDVSARQMVSVAAAFIPFLEHDDANRALMGANMQRQAVPLL TTEAPLVGTGMEERAAHDSGEMVLAAHSGVVSEVSADLIVVSTDAGGRDSYRLEKFERSN PGNCTNQRVIVDEGDRVEAGDVLADGPATSNGEVALGKNLLVAYMSWEGLNYEDAIILSR RVVEEDVLTSIHIEEYEVDARETKLGEEEITRDIPNVSEESLADLDERGIIRIGAEVTAG DVLVGKVTPKGETELTSEERLLRAIFGEKAREVRDTSLRVPHGEEGIVIGVREFNDDEDD LNPGVRQTVRVYVAQRRKITIGDKMAGRHGNKGVVSKILPVEDMPFLEDGTPVDIILNPL GVPGRMNVGQVLEYHLGWLAHQGWDASGAVEAGEEWTRQLPSDGIATAPGTRVATPVFDG LEADELAGLLRNVNPNRDGDVLTNPDGKARLFDGRSGEPFPYPISVGYTYMLKLHHLVDD KIHARSTGPYSMITQQPLGGKAQFGGQRFGEMEVWALEAYGAAYALQELLTIKSDDTTGR VKVYEAIVKGEDIPEPGIPESFRVLIQEMRSLCLNVEALDAAGNVIDLRDQDEDFNARED LGITLDARPNAAATIDAI >gi|319978464|gb|AEUH01000113.1| GENE 42 33783 - 34304 423 173 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEQARAGAPALVSRRPLAVARSRLYHQKMDDEAAPRPVLVFLVRALVAFVAACAVGLVLY WADADLFAVYAYGAFGFFTFVQVLVSIPAALLARPARVKALVVVSPVLVFVPIAIVVGLV AGRGGAGASPFLGGTAVLAAVVVAQLCAAVAAWLLREAFRAWRVRSALAEGEF >gi|319978464|gb|AEUH01000113.1| GENE 43 34423 - 34494 77 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYEGSPGYRHTLNRDEDGITCEK >gi|319978464|gb|AEUH01000113.1| GENE 44 34509 - 34688 88 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDVGCRTLRDFHRSGGGALLVVQPSGGLAAGRSSGSVVQNASEFRVGTGAWRACGPKCV >gi|319978464|gb|AEUH01000113.1| GENE 45 35284 - 36159 878 291 aa, chain - ## HITS:1 COG:Cgl2483 KEGG:ns NR:ns ## COG: Cgl2483 COG3001 # Protein_GI_number: 19553733 # Func_class: G Carbohydrate transport and metabolism # Function: Fructosamine-3-kinase # Organism: Corynebacterium glutamicum # 1 279 1 233 249 115 35.0 1e-25 MSAFRKTDGRPGRIAYEVAGLAWLAGAGPSGAAVVPVLAHGPTWLEEPRLRAAAPTAGAA EEFGRALARTHAAGAAHLGAAPPGHSGDGWMGQAPLSLPARSEPAPRGTPAAGKDTAAAR GSAYGTAAWSAPAESWGSFYARERIAPYLGAPAFTAAERALVERLCERLESGALDHDQPR LVAEAAASGQIGAARTHGDLWSGNVMWTPDGAVLIDPAAQGGHAEEDLAALAVFGCPHLE 
RITAAYHEASPLADGWRGRTGLHQMHIIMVHCFLFGRSYVPEATAIARRYA >gi|319978464|gb|AEUH01000113.1| GENE 46 36254 - 38272 2780 672 aa, chain - ## HITS:1 COG:HP0426 KEGG:ns NR:ns ## COG: HP0426 COG1479 # Protein_GI_number: 15645054 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori 26695 # 19 368 16 362 578 94 25.0 8e-19 MSIEVEQTTIGGKLGERGLLSGDRQYRIPDFQRPYVWESEQVETLMADLLEAWRREGEED YFLGSIVVVSSPGSIDEDIIDGQQRLTTICLLLAVIRYLILDETQRGEITELLKIPRRLI RGLEERPRLKMRNRDVDFFEGYIIDGDIDSLRDCDPAALATKGQRNMHANTLVILDALED VINEDNLWSFLQFLSERVSIVKVKTDTYQSAHRIFGVLNTRGVSLTAADIFKARVLAAVS PSSRDSYGDLWDRTLDEASPDSADQFFSHLLVLYTHDFASRALIDEFEAKVLSPYLREHT GKEFIDDLLVPTARAYREIASATDATMPGGVWLELLRQYPASDWKPAAMSVLRLDVPQEG KADLLKRLERMYGANYMAKVSPGKREKRLVRAMRALDDDVRPVEHADRIFSVDDDVRLRV VARLKSSLAKKADGKILLYRALMALDGRIPALPRATTAIQALPPAPIDGIDPKTPLEAWS MRLGGLALSAMKPRDVRAAGNWSAINDKLRSTRGIGKTVVASFPGADTLDDRALQERHGR LVELVADYWDIRRDSDHVDLTRMTEEELGRNLEGKSGGRTKLVRLADVVDSGIVAPGDVF VWRRPNIGDEYSVTVTDSGGLELSDGTSVASPSAAVRALTGTSAAAFDVFVRESDGKKMR ELWDTYRRRFSK >gi|319978464|gb|AEUH01000113.1| GENE 47 38626 - 39723 615 365 aa, chain + ## HITS:1 COG:RSc3438 KEGG:ns NR:ns ## COG: RSc3438 COG0270 # Protein_GI_number: 17548155 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Ralstonia solanacearum # 37 359 2 326 364 283 44.0 5e-76 MKGSLVVPAGEKTLEPPVRTATADATSAEAASALPRIVSLFSGAGGLDLGFQQAGFPLSF AVDLSPAAIETHRRNFQDTVAVEADLTELGSDGVLEHLAPILEPGESIGVIGGPPCQGFS RANTRSAANDPRNRLPLLYLQIVEALQTKYRVEFVLFENVLGIRDAKHLTTFNGILDKFN SIGLVATAEVYSALDYGVAQRRYRVIVSGFRDREVSRAFKPKKAETGNLTVRSVIEGLPD PVFFTHGLQRTAIPHHENHWTMRPMSKRFLQPEESGQTGRSFRRLKWDQPSPTVAYGHRE IHVHPSGRRRLSIYEAMLLQGFPDEFVLEGNLSAQVEQVSNAVPPPLARSLATAIKEALQ ELGER >gi|319978464|gb|AEUH01000113.1| GENE 48 39709 - 41556 335 615 aa, chain - ## HITS:1 COG:no KEGG:ROP_pROB02-00110 NR:ns ## KEGG: ROP_pROB02-00110 # Name: not_defined # Def: hypothetical protein # Organism: R.opacus # Pathway: not_defined # 1 612 20 
625 627 723 59.0 0 MSVLVDGTSVITAGLDHWYEPEGRDDPAMEEYQEHDWRLEARLRVNELRLPPDYRYQGHG DDQRNVKLTVPALRFPRWHFCMYCKRLERSTLTMREAVRCENKKHGNSKPRMSQVPFVTI CAAGHLDDFPFDAWVHRTANPSCKGPLRLRSIGGDGLEGQQVVCDRCGKKRSLGRITEAH DKNKEQHTFFSDNLSSPDDPYLCSGARPWLAEDEGACGLPMRGALRGAGNIYFPKVESSI YLPWKEGSVSAEMHDLMRHPAVSATMRTLHNIFGVDLEAGILRSQLRENVPPELFGPITD EELIAGYRDLVGSGDEGSESDEESDAEFLTGNDEWRYPEFQLIRETPIDDYLTATDPGLH TDLSSHLERVRSVDVLRETRALRGFTRVYDDVLKLSAGKALLRREPLPSSQDWLPAYVVK GEGIYLELDPAQLAAWEGRPEVQARAQRITDQYGKFTARRGGQGRAPTPRFVLLHTLGHL LIDELVFTCGYSSASLRERLYVSTNAEREMAGLLIYTAAGDSEGTMGGLVRMARPDNLRA VFVSAVSGARWCSTDPVCMEAGEKGQGPDSCNLAACHGCALLPETSCEEFNRFLDRGLVI GTFKDPTLGYFSTFS >gi|319978464|gb|AEUH01000113.1| GENE 49 41610 - 44339 1128 909 aa, chain - ## HITS:1 COG:no KEGG:ROP_pROB02-00120 NR:ns ## KEGG: ROP_pROB02-00120 # Name: not_defined # Def: hypothetical protein # Organism: R.opacus # Pathway: not_defined # 2 908 255 1161 1163 1061 62.0 0 MGFTVTAVDGLTIEPYPEVARSGRDEEEQSIDLLYRNKRTYAIGHGCAAEWDEGPGEAVC QVRAVALPAYEVVSLTPNVYLTDDKGKYLLDHEGRRQAVTVSMEELADGTPQGREQVETV LRLYSEWIAARKAEISDLPERFRQAANRHMRAAETALERMETGWDLVCSDQTAERAFRWA NKAMLYQQVRSTFPLREAEPRKGGGPRPKGAHPEPRIQQGKGMWRPFQIAFILASLPELV APSHPNRSLVDLIFFPTGGGKTEAYLGACAISLLARRLRDPDDTGTDTLMRYTLRLLTAQ QFLRAASLVCVLEDIRDRHSNALGSEPFSIGIWLGGSSTPNTWKQAVKGLERLQSNVQQE NLFLLLRCPWCGAEMGPKPRGKHRGRGQDVVGYAMAHGRKRVVLRCADRECRYSRRSGLP VHVVDEDIYEVRPSIVIGTVDKFAMMAWRPEARSLFGFNDQGERVSSPPNLIIQDELHLI SGPLGSMAGLYEAVIDELCTDRRNGEAVPPKIIASTATIRRYEDQIKGLFGRTDVALFPP HGLEEGQSFFAQPVLLEDGSRGPGRRYIGVMSASFGSTQTVQTRVAAATLQAAAKVPEAN RDGYWTNLNFFNSLRELGNTVSLLQSDVPDYLTRLKRRDGVIPRWPSRPMELTSRRRSDE IPKAIEQLQTRYSPGKDHQDAIDICLASNIIEVGIDIDRLGLMTIVGQPKSTAQYIQVSG RVGRRVDISPGLVITIYGAAKPRDRSHYERFLTYHQQLYAQVEPTSVTPFATPVLRRALH AAIVAYIRQTTPEDLLPYPFPSVEYHKAFALLRERARITDPDEMPAMERMAYKRARQWDG WERTIWEANPAPWGDPKQGLMRFAGTLPDLDSKAAIWDVPTSMRNVDAECRLEISLAYAH ADAEVEEEL >gi|319978464|gb|AEUH01000113.1| GENE 50 44607 - 45062 -263 151 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MDRDLLVVGTGNPDGEGTTGWSAALKGNRHRGRFEAVGVGQVEIVCVSRVMSQRPLDLDG RPIRGILMMGETRNSRKIDVLDSCPSPTGHNTAVKHTDAITGKRRSRVQDLLHCRTVVEL SAALLNCLESDRAARAVNWFPCWMIRINGAE >gi|319978464|gb|AEUH01000113.1| GENE 51 45643 - 45759 115 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFTLRSDQDWIDPRPHLVDSLNTRIAQIPFDTMVAMVS >gi|319978464|gb|AEUH01000113.1| GENE 52 45784 - 46176 72 130 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLLRGRNFCFLRPQAIERRSVQRAPLSGALPEEEVGSHPDCWHFCVRDTFTAGLAWVEGA NATAVVEPHRIATTQVQLTRRHFVILRFTTTTWTLTPDGLWQIFSLASTFVRGLAFDTFD GFPTEHTRLH >gi|319978464|gb|AEUH01000113.1| GENE 53 46142 - 46612 148 156 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MASPPNTRAFTRIAPSAYAAILERGMSRFLNHSGIKKLTSSVMGAVTSDAGKKTGGPITY IHSRPGRARTASLAVTTRRPSMSSHDISPSAAILLAATIMQLPTPTSLNRVISSEPIPES VRAIPLAQSTSSSSKSWFRIAPFSLRACSEIASTSS >gi|319978464|gb|AEUH01000113.1| GENE 54 46860 - 47600 473 246 aa, chain - ## HITS:1 COG:no KEGG:ROP_pROB02-00150 NR:ns ## KEGG: ROP_pROB02-00150 # Name: not_defined # Def: hypothetical protein # Organism: R.opacus # Pathway: not_defined # 83 242 67 223 227 74 31.0 3e-12 MLASEKPELPTPLDTLRKLERGKWVRAVNKRKVLAGAFLSLDESVEPPRARFAGREWTVD KVQALSEIEDWEPNRGTCSQDRPEPGSMARMAGLDSVWDAWLARPAADLAIVGTAEWLKE DSSACLTKEDDEDLERSSIGSVLLPRTPKVATWCTRILSAMSLPDRLPLLAALKAVILDG NGAIKWLPDIEAPVVICVLDRSRADEATEEVVLQLRNTCGEPLSLSKDLGWTPPGGVEAL GFTVAL >gi|319978464|gb|AEUH01000113.1| GENE 55 48005 - 48112 181 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MESLWGRGVSMASVTSSSAQWELELGYGARGLEDG >gi|319978464|gb|AEUH01000113.1| GENE 56 48142 - 48606 526 154 aa, chain + ## HITS:1 COG:no KEGG:PFREUD_11190 NR:ns ## KEGG: PFREUD_11190 # Name: not_defined # Def: hypothetical protein # Organism: P.freudenreichii # Pathway: not_defined # 6 154 25 173 173 145 66.0 4e-34 METHFEGGRDEAAGMLTALSADRQRLAQQVTVPWALMAAFGALGAWWVGSAASAAPGAHY EPPTSGWMALLGALVVSHLVQRETGIRFRSLGARANWMVAAIVVACLALFSVSLGLVSLQ LPWAVVATSLAAFALTTCLSGLALRSAVESARRG >gi|319978464|gb|AEUH01000113.1| GENE 57 48599 - 48904 334 101 aa, chain + ## HITS:1 COG:CC3387 KEGG:ns 
NR:ns ## COG: CC3387 COG1846 # Protein_GI_number: 16127617 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Caulobacter vibrioides # 4 101 3 96 98 78 43.0 3e-15 MAEARFDELVHAPLRLRICGLLRHVAGLSFARLRDSLGVSDATLSKHVKALVGAGYLESR KSASPMRDDSRRVVWLSLTGEGRRAFDGHVRALEEIVGAAP >gi|319978464|gb|AEUH01000113.1| GENE 58 48999 - 49595 565 198 aa, chain - ## HITS:1 COG:no KEGG:Bsph_2619 NR:ns ## KEGG: Bsph_2619 # Name: not_defined # Def: hypothetical protein # Organism: L.sphaericus # Pathway: not_defined # 7 191 1 185 185 177 47.0 2e-43 MGMLAVLRKCDKALAVVFAALIIVFIVAAFTVPEFWDWLWQRHHNQLSWYIRPLFLVPFC WFSYKRSWAGVMATVLLLLTSMGWFAPPTEASPQVQQFLQFEQQYLQGDWTPGKVLLTLL VPLSLAALAAALWRRNVWLGVAVLALIAVAKSLWSVVFAGEAGAKVLLPAAVGLVLCGGC LWLGTRFSRSRTADKRPR >gi|319978464|gb|AEUH01000113.1| GENE 59 49656 - 49841 121 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKLPRRAGGVLSVFTSRPVLEGPGGGARGIRTPDLLIANETRYQLRHSPKDGDQHSTSA H >gi|319978464|gb|AEUH01000113.1| GENE 60 50106 - 50561 520 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154507880|ref|ZP_02043522.1| ## NR: gi|154507880|ref|ZP_02043522.1| hypothetical protein ACTODO_00364 [Actinomyces odontolyticus ATCC 17982] # 72 151 245 324 324 95 70.0 1e-18 DAAAAGAGAAVGAGSGAAVPGGSDAVVPAARAGAVAGGGAPGLVFNTDVDDAEVVPVSEE ETARAARSLAQSRTWTVVPVPAPTYVMRGRISGRMVHADTDLRGIPKVAASVPARPVAAT YEGGKRSTEEVVADQAVALNLEAVLDSKRAQ Prediction of potential genes in microbial genomes Time: Thu May 12 17:50:46 2011 Seq name: gi|319978458|gb|AEUH01000114.1| Actinomyces sp. oral taxon 178 str. F0338 contig00114, whole genome shotgun sequence Length of sequence - 4645 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 2 - 986 391 ## gi|293190342|ref|ZP_06608774.1| conserved hypothetical protein 2 1 Op 2 3/1.000 - CDS 983 - 1675 306 ## PROTEIN SUPPORTED gi|227376194|ref|ZP_03859656.1| acetyltransferase, ribosomal protein N-acetylase 3 1 Op 3 3/1.000 - CDS 1675 - 2901 1430 ## COG0303 Molybdopterin biosynthesis enzyme 4 1 Op 4 . - CDS 2906 - 3850 1484 ## COG1210 UDP-glucose pyrophosphorylase 5 2 Tu 1 . + CDS 3669 - 4481 793 ## COG0212 5-formyltetrahydrofolate cyclo-ligase Predicted protein(s) >gi|319978458|gb|AEUH01000114.1| GENE 1 2 - 986 391 328 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190342|ref|ZP_06608774.1| ## NR: gi|293190342|ref|ZP_06608774.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 51 268 3 220 285 135 51.0 4e-30 MIAHIRGRCTRCMCCECGVCQLFEKLSGVQSSTRARPPRARPADLPTFGGVEIGGLVFAA VCVVLVAAVPALVARRSAITQSRELDRFSPGLRMIRAPEDKNEPRGRATGPLLAPSKETG RGGGTMERTGAAPRSTERVSPRAVRDIARLRAKRAARLASEAAASKRRMAASAVFALATV LLGVGVWAADLAWVWTAIPGAALVASLGASRFAAVRSQREGEREVELLRELRGGAPRASS PSAEAEGASSGSAFARSLRRASSSAGAPGASGNGEKRGSSAPGAPRVGDQRASGARAESE SPRARDAAPPRTRPLPGAPPPGPPWPTR >gi|319978458|gb|AEUH01000114.1| GENE 2 983 - 1675 306 230 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227376194|ref|ZP_03859656.1| acetyltransferase, ribosomal protein N-acetylase [Kribbella flavida DSM 17836] # 45 225 17 205 205 122 36 5e-28 MPLLRKVWGGSPARLVLRVADPVGRGFIGPASPSARTGNPRELVVRPALGRDHLRIDAVR RGDREWLAPWEATMPPEADEPVPAIAEYCSRIDREQREGRTLMLVVEADGKVVGQYSLSN VQRGAMSQGMLGYWLASSWSGRGLGALVAAMVIDLAIGELGLHRVEVCVRPENERSLALC RRLGLHEEGMRRRFMHIAGKWADHVAFSIDRESLPEGGLVVAKWGRGIDE >gi|319978458|gb|AEUH01000114.1| GENE 3 1675 - 2901 1430 408 aa, chain - ## HITS:1 COG:MT0454 KEGG:ns NR:ns ## COG: MT0454 COG0303 # Protein_GI_number: 15839826 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Mycobacterium tuberculosis CDC1551 # 1 403 1 400 405 223 42.0 5e-58 MRSVADFYQDCIAAVDQQPPLDVQLADAVSCVLAEDVRAPFNLPVADLAACDGYAVRASD 
CSRAALDAPVTLPVTEEIRAGVVDPAALVPGTAIRIASGAPVPAGADAVVALEFTDHGVA EVGVRSAPAVGENIRRRAEDVAEGQVVLRSGARVGARQVALLAGVGRSRVLVHPRPRVVV LSIGDELVEPGRTARPGTVFDANGHALSTAITDAGAQTFRVAAVPDERAKLRETIEDQLV RADLVLTTGGISYGSGDTVREVLGSLGTVRFDNVAAWPGHILGVGTVGGEDGGTATPIFC LPGDPVSAQVCFEVFVRPALRHMQGWKSLNRPVVRAAVDRTWYSPRGRREFVRVRLAGDP RSGYQARVMGSPTALLLSALAESNALAVVPEDVTNVRAGDRLQCLVLD >gi|319978458|gb|AEUH01000114.1| GENE 4 2906 - 3850 1484 314 aa, chain - ## HITS:1 COG:BH3652 KEGG:ns NR:ns ## COG: BH3652 COG1210 # Protein_GI_number: 15616214 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose pyrophosphorylase # Organism: Bacillus halodurans # 15 309 3 285 297 210 39.0 3e-54 MFATLAGMADKKQTVIHAVVPSAGRGTRFLPITKSVPKEMLPVVDRPSIEYIVREATDAG IEDVLFVTRAGKQSIEDYFDAEPGLEADLARAGKRAALDSVNEYKQYARVHSVRQGHPLG LGHAIAQAKQHVGDAPFAVLLPDDLMEPGSQLLRKMIQVRGALGGTVVALLRVTPEQATA YASTAVEPLPIPEGVDLAEGSLMRITAVTEKPPLDEVKSEYAVVGRYVLDPAVFTALEQI EPGAGGEYQLTDGYARMIDLPAEQGGGLYGVVIDERRFDTGDKLGYLEANVALALEDPAL GPALKEFLKTKVGD >gi|319978458|gb|AEUH01000114.1| GENE 5 3669 - 4481 793 270 aa, chain + ## HITS:1 COG:ML0181 KEGG:ns NR:ns ## COG: ML0181 COG0212 # Protein_GI_number: 15826993 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Mycobacterium leprae # 123 261 49 186 197 69 37.0 7e-12 MPASVASLTMYSMDGLSTTGSISLGTDFVIGRKRVPRPAEGTTAWITVCFLSAIPARVAN MGAKVDTITAKQELRSQIRATRLTRRRTGADIPGSLETERAGMVRSFDEAWARLGGPRLP ALFLPLPTEPDLTWVVEAAPECLLPVVMIAGVPIEDPEWGLHRAGEALAAPSQWHPSDGS HPRRLAGPEGIARADVVVIPALAVDESGTRLGQGGGWYDRTLPHRRDGVPVLACVFDDEV REAGALPREAHDVRVDGVITPSMFRLFDAV Prediction of potential genes in microbial genomes Time: Thu May 12 17:51:08 2011 Seq name: gi|319978445|gb|AEUH01000115.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00115, whole genome shotgun sequence Length of sequence - 15991 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 9, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 75 - 482 828 ## COG1970 Large-conductance mechanosensitive channel 2 2 Op 1 . - CDS 666 - 899 172 ## gi|154507886|ref|ZP_02043528.1| hypothetical protein ACTODO_00370 3 2 Op 2 . - CDS 892 - 5223 4968 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits - TRNA 5339 - 5411 69.2 # Arg CCT 0 0 - Term 5435 - 5476 8.2 4 3 Op 1 . - CDS 5504 - 6433 1035 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins - Term 6608 - 6653 14.6 5 3 Op 2 . - CDS 6669 - 7424 1038 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 7611 - 7670 1.7 - Term 7654 - 7703 1.6 6 4 Tu 1 . - CDS 7836 - 8642 921 ## COG1321 Mn-dependent transcriptional regulator 7 5 Tu 1 . + CDS 8682 - 9794 1306 ## COG1932 Phosphoserine aminotransferase 8 6 Tu 1 . - CDS 9935 - 11182 1390 ## COG0477 Permeases of the major facilitator superfamily 9 7 Op 1 . - CDS 11412 - 12863 2093 ## COG2252 Permeases 10 7 Op 2 . - CDS 12934 - 13701 824 ## Jden_2046 hypothetical protein 11 7 Op 3 . - CDS 13694 - 14071 462 ## COG1278 Cold shock proteins 12 8 Tu 1 . + CDS 14070 - 14171 81 ## 13 9 Tu 1 . 
+ CDS 14278 - 15991 1916 ## Ksed_07010 protein of unknown function (DUF477) Predicted protein(s) >gi|319978445|gb|AEUH01000115.1| GENE 1 75 - 482 828 135 aa, chain + ## HITS:1 COG:Cgl0854 KEGG:ns NR:ns ## COG: Cgl0854 COG1970 # Protein_GI_number: 19552104 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Corynebacterium glutamicum # 1 128 1 129 135 94 46.0 3e-20 MIKGFKDFISRGNAVDLAVGVIIGAAFKNIVDALVDGIINPLIAAVIGKPDFSDAFILTL NGTNVKFGILITAVINFLLMAFAIYLCIVVPMNKLAALRTAKEKAEKDAAPKISDEVQLL TEIRDALKSSGSTAV >gi|319978445|gb|AEUH01000115.1| GENE 2 666 - 899 172 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154507886|ref|ZP_02043528.1| ## NR: gi|154507886|ref|ZP_02043528.1| hypothetical protein ACTODO_00370 [Actinomyces odontolyticus ATCC 17982] # 13 77 9 73 73 84 87.0 3e-15 MSERSAPGRAPGRRRVARHVSDVDRRLIEQGIPPSWEDLVRPEDTRPGSMADSPDRGDGA NDRRLLEDVPPHAQPRA >gi|319978445|gb|AEUH01000115.1| GENE 3 892 - 5223 4968 1443 aa, chain - ## HITS:1 COG:TM0618 KEGG:ns NR:ns ## COG: TM0618 COG1112 # Protein_GI_number: 15643384 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Thermotoga maritima # 944 1203 1023 1279 1289 109 27.0 5e-23 MATSFFSRRNPRGADRQAPPRETPVQAPPAASALAAAVERWRAEVVEVTDAEGRGVPRLT ITQAHPGGLARLYTEAPTRLSSLIREKASLARAMERARAMMARSLQLSTRHGVGRVHLSI GQVQWHRDGHTVRSAALLKPVRLDDVGGDVMITLEPGALMDPDLEAALREHGEHCDAEAI LDGARSSHGFSASVALSLLRERVGVLDGVEVRDELVLGIFEHPATPLLRDLEDLDRLADS PLVRALAGDEDAVGSLAVSLPPPNPRDRDPWKERGIGDLTPAQQDAVEAVATGDSLVVDA PVDSDATSVIAAILADAAASGRSVLHVGASPSHTIRARARLMELGVDEIFADFDGRSRDS YGLADKVRTAMEDTSPVMDQREVDEMRTRLRGARASLLAYTDALRRPYKNFGVSAAHALR VLTDLTEEEDAPSTRVRLDEDTLYEIALDQGERARAVLHTASERGYLRDDPDSAWKGAVV SSEAEVADVLERVGRLAESLPRLRVHMSAVAGETGARDAHTLEQWELQLAMFEGVAEAVD IFQPQIFERSAADMVIATASKQWRSDHGITMKRRERNRLVKQARDYVRPGIHVPDLHRAL IRVQERRDAWRKICGDDGWPVVPAQLSECEELTARVRADLEAIAPVFAAEQPNLVGTHVA 
KLTELVQEWAADKAGARELPGRVALWKEIDGLGLRALAEDFAARGVGAPMIDAELDLAWW ASLLSLMLAADPSLGGLEPSRVSALLAEGRDLDRRQVASLVPQAISSLRRLRTSALATRA AQHEELRDAVTAEDHVPSDADLICSHPLVRRLIPVVLTSPALVPECVRPGHHVDLAVIDG ADSVPLGELIPVIARARQVVVVADVAGGAEGGAIAGLARVLPSVRLEQRPDRLNDQVALL LARYGLGHSGVPVPWTASNAPVGAVWCAGTGMPALRGNAIETTTTEVETVVDLVLAHGVE SPERSLAVIALSERHAARIRSAVDRLRANEPGLAAFFDSASVQPFVVVGPGQAQGIVRDR VIISVGFAKTPHGRVIHDFGVLSTDDGARVMADVLRCVRGELTLVSSLHSADIDRSRIHR EGVHMLVDLLEIAEGHSGEGADAWPVLEGEPDRLLIDLADKLYERGYQAIANIGIPGGVR IPLAVGHPDVPDRLLVAVMTDDEAYCDEPSVRVRDRVRPWMLEDQGWRVAMALSMPLFID PDREADSIGALVDALVADSRAHDALPVLEVPRPATARAESAAEPAPDAAYGAAGPVDGPW GDAAAPEAEDAPARGASPRSVPVLDEELDEASKRRRNVDEVTTGMLLAIQKKERASEEDR GPRPAIAKGLPLAAYSDDQLDEMAAWVRSDGVERSDIETAEELRLALGITRRGFQSDAVL GNVVRRTKPVSAPAAGAAASGVPGGQDRGAADGGGLPVDGDYDDGSDDGAGTTGAADDRD ADE >gi|319978445|gb|AEUH01000115.1| GENE 4 5504 - 6433 1035 309 aa, chain - ## HITS:1 COG:Cgl2878 KEGG:ns NR:ns ## COG: Cgl2878 COG0589 # Protein_GI_number: 19554128 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Corynebacterium glutamicum # 1 291 1 293 301 109 31.0 6e-24 MGSTEVIVVGVDGSTESVAAVTWASHRSARTGARIHCVCTYALASYSAAALDGGYAVLDD EALQKGAQQVVDEAVAHATKHGAKNAGGSIQPGDPAGVLIDFSKEADLVVVGSRGSGGFA DRLLGTVSSALPAHAKCPVVVVPRHTSGKKFTPVERIVVGIDGTEAPSSALRRSVEEAQT WGARLTAISAVPIAVGPAVMAWTPTGIDHSALLAQVREGMDKAIAEAVGDRGINVARHAL DGSPASLLIEFSTAVDLVVVGTRGRGGLAGVLMGSTAQTVLGHSTCPVMVVPSTHLGKAG QIRPSWERR >gi|319978445|gb|AEUH01000115.1| GENE 5 6669 - 7424 1038 251 aa, chain - ## HITS:1 COG:CAC2663 KEGG:ns NR:ns ## COG: CAC2663 COG0791 # Protein_GI_number: 15895921 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Clostridium acetobutylicum # 144 244 142 247 255 97 46.0 3e-20 MKTAKHTARHRIERRPLAPVTNLADAARQLDPSRAVAFASVTGVALTAVAAGIANAAPVQ GQSDTQSANQNATTVAIDDVTTVEVPDIAWSADQDSATAEAPVQTPPPAAEEETRDTTTA 
AASRSEARDDVSESASHSRQAAVNAAGSDIASVALSLTGIPYVYGGETLAGLDCSGLVKY AYAAAGINLPHSSSAQTAGGTIVSNPQPGDIVSYPGHVAIYIGNGQMVEATVPGALSKIS EVRAGATYVRY >gi|319978445|gb|AEUH01000115.1| GENE 6 7836 - 8642 921 268 aa, chain - ## HITS:1 COG:ML1013 KEGG:ns NR:ns ## COG: ML1013 COG1321 # Protein_GI_number: 15827486 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Mycobacterium leprae # 46 231 3 197 230 195 53.0 8e-50 MANCGWRGAGGAPRVQRYGAVGPALGGGVLFCSKSPRSLKEPAVADLIDTTEMYLKTILE MEEDSVTPLRARIVDRLGHSGPTVSQTVARMERDGLLHVMGDRRLELTREGRARAVEVLR KHRLAERLLLDVIGLEWDAVHAEACRWEHVMSDRVEDRLAELLHEPVVDPYGNPVPGSGL EVAGESVELAVRNPEERAMRIVRIGEPIQADEGLLTSFAELDLTPGARVALSVHGDLVRI SALDEAGAVIGYLTIPVALSPHLFVRGD >gi|319978445|gb|AEUH01000115.1| GENE 7 8682 - 9794 1306 370 aa, chain + ## HITS:1 COG:ML2136 KEGG:ns NR:ns ## COG: ML2136 COG1932 # Protein_GI_number: 15828146 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Mycobacterium leprae # 7 368 11 375 376 327 51.0 3e-89 MQALPPIPPGLLPSDGRFGAGPSRVRTEQLDALSAPLVMGTSHRQAPVRGVVASLRAGLA DLFGPPDGYEVVLGNGGATAFWSAACASLVIDRAAFAVFGEFGRKFAAEAAGTPHLAAAI VDEAPDGRLATVAPREGVDAYAYPHNETSTGVVSPLYRPAADALTLVDATSIAGAAPVDL ADVDAYYFSLQKAFGSDGGLWVAVLSPAAVARTAVCAGMPGRWVPSILDLGAAIVNSRKD QTLNTPAVATLVMAADQVRWMNESGGLEGMSAHVRASSDLVYQWAEDSLVARPFVEDARL RSPVVATIEFDPAVDAAALAAHLRSHGVVDVNPYRGVGENQLRIATWPSIPTRDVEALLA CIDYCLERAL >gi|319978445|gb|AEUH01000115.1| GENE 8 9935 - 11182 1390 415 aa, chain - ## HITS:1 COG:DR1324 KEGG:ns NR:ns ## COG: DR1324 COG0477 # Protein_GI_number: 15806342 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Deinococcus radiodurans # 5 393 13 413 430 135 32.0 1e-31 MASPFASMRFFNYRLWFGGSLVSSTAVWLQRVAQDWYVLTVLTDHDSSQVGIVTALQFLP 
ILGLSGFAGALADRVKGRRILQSSLLGVCLVSLCMGVLILTGVCNVWHMYALAVASGVII AIDTPARQAFVGELVPRTYMANAVALNATAFHAARLIGPAVAGLLIEWYGVGPVSLATAV LFLIPILTMVLMRAGELVQRPFVPRAPGQVREAFAYVRGRTDIRVILLLIFVVSALGMNF QMTSALMATEVFGKSAGQFGVLSSFMAVGSILGAVAAAARAPRIRTILCGAALYGLAEIG LGLAPTYWWFAALSVPTGLFMLTTNTSANAYTQTHTDEEKRGRVMSLYSLVFLGATPVGS PLIGWVGQVLGARWSILVGGMASLGIAVVCGLWAVAHWGVRLRFEGRRPRIERSR >gi|319978445|gb|AEUH01000115.1| GENE 9 11412 - 12863 2093 483 aa, chain - ## HITS:1 COG:Cgl0801 KEGG:ns NR:ns ## COG: Cgl0801 COG2252 # Protein_GI_number: 19552051 # Func_class: R General function prediction only # Function: Permeases # Organism: Corynebacterium glutamicum # 39 481 1 453 453 331 48.0 2e-90 MSTQNRTAPSRNAVAQALDSFFQISARGSTLGREVRGGLVTFFAMAYILVVNPAILGNAV PDDGSITPQGIAAGTALVAGAVTILMGAVANYPLALAAGLGLNAVVAFTLVLGSGLSYGE AMGVIAWEGLLILLLVLTGFREAVFRAVPAALKTAITVGLGLFVSLIGLVNAGIVRTGAT PVQFGVSGSLDGWPALVFVFGLFLMIVLYVRGVKGAVLISIVSATVLAVLVQAIAHLDRI SESNPTGWGQTVPELKGSPVAVPVFDTLGKVDMLGGFGKLGAVSVVLLVFSLMLADFFDT MGTMVAVGAEGGLLDQDGNPPRTRQILVIDSLAAVAGGLGGVSSNTSYVESASGVAEGAR TGLASVVTGVLFLLSTFLAPLVELVPTEAASTALVFVGFLMMTQVTDIDWKSPEVAIPSF MTIALMPFGYSVSVGIGVGFVTHSLVQLATGRARRVHPLLWVVSALFVVYFLIGPIQRLL GVA >gi|319978445|gb|AEUH01000115.1| GENE 10 12934 - 13701 824 255 aa, chain - ## HITS:1 COG:no KEGG:Jden_2046 NR:ns ## KEGG: Jden_2046 # Name: not_defined # Def: hypothetical protein # Organism: J.denitrificans # Pathway: not_defined # 14 248 16 254 259 223 51.0 4e-57 MPRTTPRTRTPKTDAILADSVALAREAAEAVAHPRPVGDHVGFKMEADRLGTHYFASTDP GYVGWCWAVTLARVPRGRTATVCEVGMAPREGALLAPRWVPWEERLRPSDVSRDDVLAYR DEDARLEPGLEDTSDDADLPIVRDLGLGRPRVLSPQGRWDAANRWYASERGPKSGRRPKN TCSNCGFLIKMSGEMRMAFGVCANEWATDDGSVVSLDHTCGSHSETDLPKGATAWPVRPP RVNDSAIDVERAPEA >gi|319978445|gb|AEUH01000115.1| GENE 11 13694 - 14071 462 125 aa, chain - ## HITS:1 COG:ML2147 KEGG:ns NR:ns ## COG: ML2147 COG1278 # Protein_GI_number: 15828153 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Mycobacterium leprae # 1 125 1 136 136 105 44.0 2e-23 
MPTGRVRFFDPERGFGFITGDDGSQVFLHSSALPAGGAHPRAGARVEYSVADGRKGPQAL SVRVVAEAPSVAHAKRRKPRAMVPVVEDLIKLLDASSASLRRGKYPDNADKIAKVLRAVA EDFDA >gi|319978445|gb|AEUH01000115.1| GENE 12 14070 - 14171 81 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MASPLLSFVGRAARGPSHHRAGGTGALRSARRL >gi|319978445|gb|AEUH01000115.1| GENE 13 14278 - 15991 1916 571 aa, chain + ## HITS:1 COG:no KEGG:Ksed_07010 NR:ns ## KEGG: Ksed_07010 # Name: not_defined # Def: protein of unknown function (DUF477) # Organism: K.sedentarius # Pathway: not_defined # 26 564 69 602 716 147 30.0 2e-33 MKHRFRAAAVGVCVLLMALSGATAFAASPVTISDTVTDPDGWLSADQSSRIASAASSARA KGLDIYFVAVPDFSGKNMGDWCKQSIKQSGLSHTSIAYVIAYEERKHASCGNVGESTISD TDLTGARAAAESELRSADPLTSEATTTAALAFYSYAESHASASAAKTSSSRTSSAASSST RGGFMWVVGLLLVLVGLIAVVVLVRKASGRGQKIVAGATARDAERARQLVDEANRQLLSA DEQVRSADDELAFAQAQFGALRTQEYGASLQAARAAVSQCFALQQQMNTAASDAHRASLA TRIMDALGTAMNDLTEKQRAFASMRDERADLPRQIREAREQLAEFEGRLADTKAELASVS ALYPGQMVASLLDNPDQARALLDSAQGALDSAEEAAASSPDAAANALDTARRAMAMASHQ MDAVFAAKTDLSAIEDTLTRAIASITSDLSDVTRLGADPAAFAPLVADARKAVDDAQRAR SGTGDPLGALEALRTAEATLDAALEPLRSAEEALERKNGAASDRIAEAEALVEQADRHVQ GRRGAVDLDTRSQLSQAHSALSKARAAAEPT Prediction of potential genes in microbial genomes Time: Thu May 12 17:51:30 2011 Seq name: gi|319978440|gb|AEUH01000116.1| Actinomyces sp. oral taxon 178 str. F0338 contig00116, whole genome shotgun sequence Length of sequence - 4436 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 51 - 623 697 ## Sked_31470 hypothetical protein 2 2 Op 1 18/0.000 - CDS 760 - 2280 2296 ## COG0554 Glycerol kinase 3 2 Op 2 . - CDS 2319 - 3104 1168 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) - Prom 3267 - 3326 2.5 4 3 Tu 1 . 
+ CDS 3416 - 4354 1209 ## COG2390 Transcriptional regulator, contains sigma factor-related N-terminal domain Predicted protein(s) >gi|319978440|gb|AEUH01000116.1| GENE 1 51 - 623 697 190 aa, chain - ## HITS:1 COG:no KEGG:Sked_31470 NR:ns ## KEGG: Sked_31470 # Name: not_defined # Def: hypothetical protein # Organism: S.keddieii # Pathway: not_defined # 2 190 5 192 192 225 60.0 8e-58 METSMEETTYLVVDGENIDATLGLSVLDRRPNPEERPRWDRVLEGAHNQWGQRAKGLFFL NGSSGYLPMGFVQALTAMDYAVIPLSGPEDMKVVDVGLQRTMDAIVKLERGSVILASHDA DFVPQIEALLDAGRRVGVMCFREFLSSQLHDLVDRGLEIIDLEYDVHAFQVRLPRLRIID IDDFDPLAFL >gi|319978440|gb|AEUH01000116.1| GENE 2 760 - 2280 2296 506 aa, chain - ## HITS:1 COG:MT3798 KEGG:ns NR:ns ## COG: MT3798 COG0554 # Protein_GI_number: 15843314 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Mycobacterium tuberculosis CDC1551 # 6 502 16 515 517 579 59.0 1e-165 MTEEKFVLAIDQGTTSTRAIIFNHSGEIIAVGQKEFTQIFPNPGWVEHDPLEIWDTTRAV VAEALGEAEINRHQLAAVGITNQRETTVVWDKNTGEPVYNAIVWQDVRTSEMVKELGGDE GPDRFRQICGLGLSPYFSGSKIKWILDNVEGARERAEAGDLLFGNTDSWVLWNMTGGVNG GVHCTDVTNASRTMLMDIRTLTWREDVCGIFGIPMSMLPEIKSSSEIYGYGRKNGLLVDT PISGILGDQQAATFGQACFEKGMAKNTYGTGCFMLMNTGTEPVFSDNGLLTTVAYKIGDQ QAVYALEGSIAVAGSLVQWLRDNLGMIVKSSDIGELAATVEDNGGVYFVPAFSGLFAPYW RSDARGAIVGLTRYVNKGHIARAVEESTAFQSAEVLDAMNADSGVPLKELKVDGGMTHDD LVMQFQADLCGVDVVRPKVIETTALGAAYAAGMAVGFWKGTDDVAANWQEGKRWTPAMEP AERDRTYRLWKKAVTRTFDWVDDDVK >gi|319978440|gb|AEUH01000116.1| GENE 3 2319 - 3104 1168 261 aa, chain - ## HITS:1 COG:lin1574 KEGG:ns NR:ns ## COG: lin1574 COG0580 # Protein_GI_number: 16800642 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Listeria innocua # 15 253 5 234 272 187 49.0 2e-47 MFTTLLPMAGAAAPSTGQLFMSEFMGTAVLLLLGGGVVATNLLPKSKGKGGGWLMINMGW GLAVFAGVYVAYLSGGHLNPAVTIAKAVGHMFDPAVELAPGVAVTAVNITVYIIAQFVGA FVGAVLVWLSFKRHFDEQCEPAFKLGVFSTGPEIRSYGWNCVTEAIGTAVLIIWIYVSGY 
TTTAIGPLGVALIIVVIGNSLGGPTGYAINPARDLGPRVAHAILPIKGKGGSDWAYSWVP VVGPIVGAIIATVIVFASGLA >gi|319978440|gb|AEUH01000116.1| GENE 4 3416 - 4354 1209 312 aa, chain + ## HITS:1 COG:DR1935 KEGG:ns NR:ns ## COG: DR1935 COG2390 # Protein_GI_number: 15806934 # Func_class: K Transcription # Function: Transcriptional regulator, contains sigma factor-related N-terminal domain # Organism: Deinococcus radiodurans # 1 299 45 345 348 196 40.0 3e-50 MYYLQGHTMDTIARTLDISRSSVSRLLTYARETGLVRISLAASPSLKGTLAGQISELFDV QVSVVPTYEAASEVDRLHNVALAAADLLMGMLRPGAVLGIAWGNTTAEITRCMPSIYVPG STVVQLNGAATATESGMPYADAIIARAARAIGATMVNFPVPAFFDYVETKRQLWREQSIQ TVLKTIDRCTVALFGVGSMSSRLPSHVYAGGFLDPDEIAAVQKDGVVGDVCTVLIREDGS TDMALNERASGPRPEALRTIPRRLCVVSGASKALPLLGALRTGAATDLVLDDRAARELLE LVHTRATAPTAV Prediction of potential genes in microbial genomes Time: Thu May 12 17:51:35 2011 Seq name: gi|319978436|gb|AEUH01000117.1| Actinomyces sp. oral taxon 178 str. F0338 contig00117, whole genome shotgun sequence Length of sequence - 4021 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 81 - 1676 2330 ## COG1190 Lysyl-tRNA synthetase (class II) 2 1 Op 2 . - CDS 1739 - 3046 1342 ## BPSS1278 oxidoreductase 3 2 Tu 1 . 
- CDS 3276 - 3953 607 ## COG0524 Sugar kinases, ribokinase family Predicted protein(s) >gi|319978436|gb|AEUH01000117.1| GENE 1 81 - 1676 2330 531 aa, chain - ## HITS:1 COG:MT3705 KEGG:ns NR:ns ## COG: MT3705 COG1190 # Protein_GI_number: 15843211 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Mycobacterium tuberculosis CDC1551 # 5 528 7 501 505 511 54.0 1e-144 MNAPADDTPEQVKVRSQKRARLMEEGTPAYPVALPITSTIAEVRQRYAHLEAGEETEDLV GIAGRVVLMRNGGKLCFATLMDGAGEKIQVMFSAASVGADSLAAYKNDVDLGDHLFVHGR VISSRRGELSVFATPAEPTEQARAAYDAAPTALPAPDASWAIASKALRPLPKTWTTQDGK EITLSEDQRIRRRELDLITRPAARDMIRVRAAVNRSIRENFHRRGYIELETPMLQVVHGG AAARPFTTHMNALDMDLYLRIATEIYLKRAVVGGVDRVFEMNRNFRNEGMDSSHSPEFTS LEAYEAYSDYNGMADLTRDLIQQAARDAFDLPEGGEVVRLADGTEYDLSGQWDRIDLYGS TSEALGEEVTVETPRETLVKHAERIGLEVDDYAVSGKIVEDIFEELVASKLWAPTFVYDF PEDTSPLTRYHRSRPGLTEKWDLYVRGFETGTAYSELADPVVQRERFEAQALAAANGDPE AMVMDEDFLVAMEQGFPPCGGMGMGIDRLLMVLTGQGIRETIPFPLVKRLG >gi|319978436|gb|AEUH01000117.1| GENE 2 1739 - 3046 1342 435 aa, chain - ## HITS:1 COG:no KEGG:BPSS1278 NR:ns ## KEGG: BPSS1278 # Name: not_defined # Def: oxidoreductase # Organism: B.pseudomallei # Pathway: Porphyrin and chlorophyll metabolism [PATH:bps00860]; Metabolic pathways [PATH:bps01100]; Biosynthesis of secondary metabolites [PATH:bps01110] # 1 428 1 427 432 114 33.0 8e-24 MRTVVIGAGIAGLVCADALARRGEDVEIYEATSRAGGRIETASVADCGVEVGANFLSSTY RVLPRLAKRLGVPLRPIRSRAGIITDSAVKAYKPSGPAMVRAGVVGWRDAARAGWALAGQ ARRLVRADPADPAQWADIDVPAAPWCAERFGRAFTDAVIGSAFRGYYFQDLATTSASAAL AMVAYGARPFTTLTADPGLEAVPRALASGLTIHYDSPVARLERGARGAALVLADGTSVEA DRVVLAVPGPRAAALLDDPSPVEAELIATPYSPGLLVVVALSRRLADHELGGAYGLLASP GRSGDLAALCASSRAGHAAPGRDAVTVMFDGEAAGECIKRGTSDEELVGRAVESLCALAP SLEAAVDQASSRVVRIPAAMPTCRVGRASLVRRYRGEQSGPVVLAGDYLAFPWSDSAALT GLWAANRVRLAAERG >gi|319978436|gb|AEUH01000117.1| GENE 3 3276 - 3953 607 225 aa, chain - ## HITS:1 COG:SMb21462 KEGG:ns NR:ns ## COG: SMb21462 COG0524 # Protein_GI_number: 16265036 # Func_class: G Carbohydrate 
transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Sinorhizobium meliloti # 18 202 102 292 311 78 35.0 1e-14 MLAQEGVRLDALEAVERQSVTVSLALDGDRAMTSFGTEEFPPLRGPAPAALLTDLRGLAA NAEAVGRWRREGTWVLADCGWDPTGRWDQGDLGALALADVFTPNEAEALAYTRAGTVEEA AARLAGMCPMCVVTRGAAGALAVAPGLRAAVPAPPVAADDTTGAGDVFSAALAWALLRGR GVRGALVFAVHAAALSTTRRGGASAAPALDEVEGFLAECGAAPGE Prediction of potential genes in microbial genomes Time: Thu May 12 17:51:45 2011 Seq name: gi|319978422|gb|AEUH01000118.1| Actinomyces sp. oral taxon 178 str. F0338 contig00118, whole genome shotgun sequence Length of sequence - 10225 bp Number of predicted genes - 16, with homology - 9 Number of transcription units - 9, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 206 141 ## gi|227498097|ref|ZP_03928270.1| ribokinase family domain protein 2 2 Op 1 . + CDS 330 - 1181 1024 ## RER_46830 MerR family transcriptional regulator 3 2 Op 2 4/0.000 + CDS 1174 - 2088 1393 ## COG1131 ABC-type multidrug transport system, ATPase component 4 2 Op 3 . + CDS 2085 - 3686 1849 ## COG3559 Putative exporter of polyketide antibiotics 5 3 Tu 1 . - CDS 3659 - 3922 126 ## 6 4 Op 1 . + CDS 3956 - 4378 287 ## 7 4 Op 2 . + CDS 4394 - 4711 335 ## 8 4 Op 3 . + CDS 4798 - 6435 1968 ## Bcav_1045 hypothetical protein 9 4 Op 4 . + CDS 6484 - 7008 607 ## 10 5 Op 1 . - CDS 7112 - 7186 81 ## 11 5 Op 2 . - CDS 7208 - 7378 329 ## Ksed_05360 transposase family protein 12 6 Tu 1 . + CDS 7754 - 7930 57 ## 13 7 Op 1 . - CDS 7927 - 8364 443 ## Arcpr_1769 hypothetical protein 14 7 Op 2 . - CDS 8427 - 8804 189 ## VC0395_0570 hypothetical protein - Prom 8848 - 8907 1.9 15 8 Tu 1 . - CDS 8997 - 9572 209 ## - Prom 9777 - 9836 2.6 + Prom 9771 - 9830 4.1 16 9 Tu 1 . 
+ CDS 9878 - 10223 334 ## Ndas_0985 transcriptional regulator, AbrB family Predicted protein(s) >gi|319978422|gb|AEUH01000118.1| GENE 1 2 - 206 141 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|227498097|ref|ZP_03928270.1| ## NR: gi|227498097|ref|ZP_03928270.1| ribokinase family domain protein [Actinomyces urogenitalis DSM 15434] # 9 68 13 72 320 75 66.0 1e-12 MLESQGNEQVLVAGPLFLDVVMGPLGHAPVPGQEQWVPGCALAPGGAANQAVALARLGAR AALASYAG >gi|319978422|gb|AEUH01000118.1| GENE 2 330 - 1181 1024 283 aa, chain + ## HITS:1 COG:no KEGG:RER_46830 NR:ns ## KEGG: RER_46830 # Name: not_defined # Def: MerR family transcriptional regulator # Organism: R.erythropolis # Pathway: not_defined # 1 182 6 192 263 97 39.0 6e-19 MADLAGTTTRTVRYYHRLGLLPVPPVIAGRRDYGIEHLARLLRIRWLAESGIPLAKIAQM LPEEPAQGRSAIEADLLATRAQIDARIEVLHHQRARIDGLVARVRSGEAISPLPVVLERF YDHLEGLVEDPATLPIIHTDRRMVLALAISGLIPASLGPFIEGLSDEDHRAVVRMFTTFA TLDRSRYPGAYGDEERERVIEDLEEAEWAVLERNRATALALLRDLPSGGPGHLLWKRVAR LSKIGYPEPDQRRVIDDLVRRLQADPEFGPVLEERTGKEWSIV >gi|319978422|gb|AEUH01000118.1| GENE 3 1174 - 2088 1393 304 aa, chain + ## HITS:1 COG:Rv1218c KEGG:ns NR:ns ## COG: Rv1218c COG1131 # Protein_GI_number: 15608358 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Mycobacterium tuberculosis H37Rv # 5 297 10 297 311 283 54.0 4e-76 MSDLIEAEGLIKCFGTVKALDGLDLGVAKGQIHGFLGPNGAGKSTTIRVLLGMYRTDGGR ARVLGMDPAREASAINRRLAYVPGSVSLWPSLTGGQVLDTLAGLRGSRDHAREEELTERF DLDPTKKVRTYSKGNAQKVALVAALSAPVDLLVLDEPTSGLDPLMERVFTDVVREAAEGG ATVLLSSHILAEVQDLCSHVTIIKEGRTVESGDLSRLRSLAETAIRIGVGESDSRRPRLV EALAGLGAPVRPTGTRVETRVPSSLVPEVLGVVAELGADDVQVEPASLEDLFMSHYSGDE GARS >gi|319978422|gb|AEUH01000118.1| GENE 4 2085 - 3686 1849 533 aa, chain + ## HITS:1 COG:Cgl2980 KEGG:ns NR:ns ## COG: Cgl2980 COG3559 # Protein_GI_number: 19554230 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative exporter of polyketide antibiotics # Organism: Corynebacterium glutamicum # 23 531 4 517 518 153 26.0 8e-37 
MSAFTGLRRCLGLATRRYRWQALAWMVPLWLFLLGHPIALQRTYPSFEERTALLSQMRDA PGVRLLFGPLPATGGVGEFASWEDGGYLLWFVAIMAIMLTTALARRDEQDGHVEVVLGAG AGRWAPFASATAWAMGAMALTGAGLAASLIGVDAAVGETPLRGALVFGGVAIAQGWAFAG VALVASQLVRDASAARGLCFTVFGAAFAIRVLADETGAAWLRWLSPLAWRDVAGPFGAER VWALAVFALVAAALVALAALLHSRRELLGAVLADRSVSTRRWRVRGPLGLTARLGVRRLV AWAFALALTAALFGAMSGGLSDLIANNPASAAYIDKMAPEMRPVVQYTTLLTVVMVALVA TAVVQRVLGLAASEERGLSEAVLACGVPRTRALLAAVADAVGAGVVLLVVSGAVLAAATA TQVGEDHAAGRALVSTLTQLPGVVAAAGIAALLVGAAPRWRSLAWAVIAWSSFARLFGGL VDMPKWAQDLSVLGHVADPWGTTHWVPLVLQLVVGLVCCGAGLAVYRRRDIPA >gi|319978422|gb|AEUH01000118.1| GENE 5 3659 - 3922 126 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MICCEDLVWFRSLLVEGGYSGRGLHCRESPRTWRRRVGRDLTRAGPQPRGAEGLGFHQPA RSRTGARSHCPRALRAGAAQAGMSLLR >gi|319978422|gb|AEUH01000118.1| GENE 6 3956 - 4378 287 140 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTHKGYRYRVDSFDEAIARVEDEVAGAQELAGRAQEFLDSLADVCGVGVDRDERVRAEVD ASGTVVAVEIEEDPELAEAVLEALTAARADAGRGLAGAAGRAYGEDSEVASRLAQEYGLD DDDDGGRGAYGRRGGPRGWR >gi|319978422|gb|AEUH01000118.1| GENE 7 4394 - 4711 335 105 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDWDVSFEPDRFRAAAEGARRVGDGVEAAISAVGQASSLSGALGMLVGPLVAPVFAVVEV AAHACLHAGAEGASDIARKLEESAQVMEACEQQIADDFKRIGGRR >gi|319978422|gb|AEUH01000118.1| GENE 8 4798 - 6435 1968 545 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1045 NR:ns ## KEGG: Bcav_1045 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 29 318 4 288 412 161 34.0 6e-38 MPRPTETDPLPSPSGGGSGSSSDFSGGGNPFVASKEDSTTPLAGSGVLDDIEGIATGLKE GSWLDVGLSTVAMGADVLGAVVDPLGTLIAWGAGWLIEHFNPMKGWLEQLAGDADQVKAN AQTWGNVAKGMDSMAEALEMDASELMADARGAAARGYSAASGDVSKSLRTAGKAAAAMSG AMGVLATVVKVVHDLVRDAIAAIIGTLASAIIEAVATVGLAIPAIIAQVQIKVGAKAAEM GAKITGLLKSAKSLFKQLTSLKGLLELLKSLLKKGKQGLKGLKDLGKKILKGKKGTDLKE LVKNRKRMWTPEKAQKRARKLIEKGKKANARKKKYIEKIKNRGLSTSDVSKANIGDTVEA LRAKGQSVLKTKMLERDARQLSKARFAEKRASEELGQLGAEYATNSFNPPRHVIPGVGGS GAGNRRFDVLSIDERLENLDFIESKGVKEGSSPHMGQATDVLPNGQKVKLKQGTPEYLNH IARQDKEFLQLMRDNPEVWRRIKKGDVQLNNVVSVTDTSGLGSTTYKSTPFTLEQRTIDH 
IDGQL >gi|319978422|gb|AEUH01000118.1| GENE 9 6484 - 7008 607 174 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSHEFQAVPAGRAVEIVRAWALHTWPMTVDEGVGVYTGLGFRCAGSDPEMFTSDLSPDED VSYFNSEHGSVFGVRVQLSNLVPEAGWAASLPRSREAFESYAAAFSQVFGAPYSKDSTES SFKAQWFLDNGAGVGLNGNQALITASVESPEDAEYHLSELEDEKNGIPLDFDYF >gi|319978422|gb|AEUH01000118.1| GENE 10 7112 - 7186 81 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRFDDAFWAQFDQDYAAWVDRLQK >gi|319978422|gb|AEUH01000118.1| GENE 11 7208 - 7378 329 56 aa, chain - ## HITS:1 COG:no KEGG:Ksed_05360 NR:ns ## KEGG: Ksed_05360 # Name: not_defined # Def: transposase family protein # Organism: K.sedentarius # Pathway: not_defined # 1 54 401 454 454 90 79.0 1e-17 MAYFNTGGASNGPTEAINGIIELGRRIARGYRNPTKYQLRMLLIAGGLDASTHTQL >gi|319978422|gb|AEUH01000118.1| GENE 12 7754 - 7930 57 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRISPADPRRPCIVNVFRCSSLTLGSHKHRIHSEKVLRAARHVSSFGQATTNEFCIAF >gi|319978422|gb|AEUH01000118.1| GENE 13 7927 - 8364 443 145 aa, chain - ## HITS:1 COG:no KEGG:Arcpr_1769 NR:ns ## KEGG: Arcpr_1769 # Name: not_defined # Def: hypothetical protein # Organism: A.profundus # Pathway: not_defined # 3 131 23 139 157 74 33.0 1e-12 MSVEMADRISARRAVANAFFLTVNTTLAAVIGLHKPEDGSALLPVAVCVAGIAVSVCWWY LLRNYRKLNEAKFVVINKIEADYLPLTPFLDEWAILSNEGNSKGKMARVRARLRQLGNVE RVVPVIFGLLYLMLLVGRMPLCPTT >gi|319978422|gb|AEUH01000118.1| GENE 14 8427 - 8804 189 125 aa, chain - ## HITS:1 COG:no KEGG:VC0395_0570 NR:ns ## KEGG: VC0395_0570 # Name: not_defined # Def: hypothetical protein # Organism: V.cholerae_O395 # Pathway: not_defined # 4 89 5 92 111 85 44.0 6e-16 MAQRAFISFDYDNDARLKDLLVGQAKHPDTPFEIADWSIKTASPTWKVEARRRIKGAGLM IVLCGKSTHTAVGVAEELRIAKEENIPYFLLAGHDGARKPTTATSSDKLYKWTWDNLKAL VKGSR >gi|319978422|gb|AEUH01000118.1| GENE 15 8997 - 9572 209 191 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDAPAALIESVPGQAHDVEGIHDRPRAGEFFGGRALETGESIHRDNLNTLAPDARLGGQP GLEDLFGAARDHVQEPGGTTAIADGSQVQDDGDVFVAVGSMTPHVFIHADDTHAFEPGWI VDERALAFAQDSGISGIPGHSQGLGDARHRSSPVIRQANTVWLFSTRWPDTSNPRPSRRV NALRSGRSKVV 
>gi|319978422|gb|AEUH01000118.1| GENE 16 9878 - 10223 334 115 aa, chain + ## HITS:1 COG:no KEGG:Ndas_0985 NR:ns ## KEGG: Ndas_0985 # Name: not_defined # Def: transcriptional regulator, AbrB family # Organism: N.dassonvillei # Pathway: not_defined # 2 93 3 95 95 71 47.0 7e-12 MAPRNPAQFGPGGRGFFGTVTVGTRGQISIPAQARKSLGLEPGDQLVVLTDPAQGLALVP LSLLLSQHAGTNPLAALVRDTMAGGEVPPSPPGPEADDGASGADEPGSAATAPAE Prediction of potential genes in microbial genomes Time: Thu May 12 17:52:58 2011 Seq name: gi|319978411|gb|AEUH01000119.1| Actinomyces sp. oral taxon 178 str. F0338 contig00119, whole genome shotgun sequence Length of sequence - 10342 bp Number of predicted genes - 12, with homology - 8 Number of transcription units - 6, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 79 58 ## 2 1 Op 2 45/0.000 + CDS 123 - 1046 1502 ## COG1131 ABC-type multidrug transport system, ATPase component 3 1 Op 3 . + CDS 1050 - 1799 1074 ## COG0842 ABC-type multidrug transport system, permease component 4 2 Op 1 . + CDS 2097 - 2528 584 ## gi|154507761|ref|ZP_02043403.1| hypothetical protein ACTODO_00243 5 2 Op 2 . + CDS 2559 - 3944 1733 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases + Term 3945 - 3980 1.3 6 3 Tu 1 . + CDS 4145 - 4693 754 ## + Term 4837 - 4891 -0.9 7 4 Tu 1 . + CDS 5167 - 6561 755 ## gi|262281720|ref|ZP_06059489.1| PAAR repeat-containing protein + Term 6612 - 6656 -0.3 8 5 Op 1 . + CDS 6702 - 7373 806 ## 9 5 Op 2 . + CDS 7423 - 8196 826 ## BAA_2267 hypothetical protein 10 6 Op 1 . + CDS 8531 - 8605 118 ## 11 6 Op 2 . + CDS 8605 - 9471 939 ## gi|228991904|ref|ZP_04151840.1| hypothetical protein bpmyx0001_26490 12 6 Op 3 . 
+ CDS 9471 - 10250 766 ## gi|228991904|ref|ZP_04151840.1| hypothetical protein bpmyx0001_26490 + Term 10261 - 10309 0.7 Predicted protein(s) >gi|319978411|gb|AEUH01000119.1| GENE 1 2 - 79 58 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no AVAPAPGPDAAPPTGTGPADAAPAS >gi|319978411|gb|AEUH01000119.1| GENE 2 123 - 1046 1502 307 aa, chain + ## HITS:1 COG:TM0389 KEGG:ns NR:ns ## COG: TM0389 COG1131 # Protein_GI_number: 15643155 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Thermotoga maritima # 6 305 15 301 301 216 40.0 6e-56 MTSPAIRAADVTKRYGSLTAVDGVSLEVSHGEIFGIIGPNGAGKTTFMECLEGLRLPDSG DIDVLSEDPAHPTKRWRERIGVQLQTAALPPKITVDEALALFASMYSDPADPSALLHELG IASKGRSYVDKLSGGQRQRVFIALALLGKPEVVFLDELTTALDPQARLAMWDVVRSIRQG GATVVMTTHYMEEAEALCDRVAVIDHGRLIALDTVPGLIGSLGKGTKVQLRTSRTIQPSA LDGVEGISDVAVAGTAVSMLWAGSGIPQAAVSAIEATGATVSNIRTSSPGLEDVFLALTG RDMRQEA >gi|319978411|gb|AEUH01000119.1| GENE 3 1050 - 1799 1074 249 aa, chain + ## HITS:1 COG:RSc0165 KEGG:ns NR:ns ## COG: RSc0165 COG0842 # Protein_GI_number: 17544884 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Ralstonia solanacearum # 63 247 194 377 381 60 27.0 3e-09 MRTFVQLCRVTAMRALREPVTVIFAVLFAPAFIACMGLIFGNSPAPEFGGKGFLEANFTA FPGIVIAITALIIVPVDLVAQRGAGVLRRFRATPLNPALYLAADVVSRMVLGLVSFTAMY AIAALGFGVRPASAGAFVSALVATMLGLAAFLAPGYLIAGRFRNVGAAQGLGNILMYPLI FTSGAAVPLAILPPGVLSVAKYSPMTQLTYLTQGLWAGEGWGQHWVAAVVLVVFGAVCGV AAARLFRWE >gi|319978411|gb|AEUH01000119.1| GENE 4 2097 - 2528 584 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154507761|ref|ZP_02043403.1| ## NR: gi|154507761|ref|ZP_02043403.1| hypothetical protein ACTODO_00243 [Actinomyces odontolyticus ATCC 17982] # 1 140 40 179 182 233 86.0 4e-60 MLCGVDLGFSQADFFRSNASATAYVLGLESVQVLAGALCLGLIYPWGERVPHWVPCLGGR EIPRILPLVVGGIGNALLYYINATLVIRFGAVWLGLAEGQTPADGMNHWQVAVLVAAYVP MLLLWAPALTVGLVGYGRRRAPH >gi|319978411|gb|AEUH01000119.1| GENE 5 2559 - 3944 1733 461 aa, chain + ## HITS:1 
COG:SMb21463 KEGG:ns NR:ns ## COG: SMb21463 COG1486 # Protein_GI_number: 16265037 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Sinorhizobium meliloti # 1 451 1 451 461 191 35.0 4e-48 MKLAIVGGGGFRVPQIIEVLARARAGDGPHAQLVVDHVALHDTAPSRLRAMKAVLTALDY PCSPAVSSHTDLDEALAGADFVFSAMRVGGTRGRVLDERCALEEGLLGQETVGVGGYAYA LRTLGPALDLARAVRRAAPGAWVINFTNPAGIITQAMRSVLGRRVVGICDTPISLVRRTC AALGLDPSAAERNGTGTDFDYVGLNHLGWLRSLRVDGVDLLPGLLADDARLDALEEARTI GADWLRAIGAVPNEYLFYYYCQREALAGVARSRTRGEFLDAQQGEFYEAVAADPARAGAL WEAAHARREATYMAEARSEDERDGRREEDIAGGGYQRVALDLMTALSTGEPARMILDVGN ADSGSAAIASLADEAVVEVPCDVDGDGIHPRPVAPLTGAPLGLVQSVKACEDLVIEAVRR RDESLAWRALALHPLVDSVRAARRVLDSYIARNPLVAAVFD >gi|319978411|gb|AEUH01000119.1| GENE 6 4145 - 4693 754 182 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MALFARTTTLSANFPGTYEQVYLAAFAAVYALGYTPDMADPNMRMICYSTPTTAFTWGVN VTIGLVPQQGQVVVEITAQSLWPMPTLGQEGRNKKIINAAFEQINAALAQGAPQPGQQPQ AGVDGAQPYGGAAFQPGQQPQAGADGAQPYGGAAFQPGQQPNAGSAQPYGGAAPRPGQQP GA >gi|319978411|gb|AEUH01000119.1| GENE 7 5167 - 6561 755 464 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262281720|ref|ZP_06059489.1| ## NR: gi|262281720|ref|ZP_06059489.1| PAAR repeat-containing protein [Streptococcus sp. 
2_1_36FAA] # 348 460 9 115 120 71 38.0 1e-10 MEARDDILSKVEVAEGRYRAAGNALVEYAGALERAQTDSLNALVAARSAQQDVDEAVARA GRMRESAGECPGQGDGADDRARYERAATAADGDAEVARGRVRAQRQVVLNAMGERDAAAV KAMNAIDGAGDDGLGDSWWDDWGSKVASWIATICDLVAQILGVLALLVAFIPLIGQALSA VLLTLSAIAGVVGAIAHIALAVSGDETFLAALVSVAFAALGCIGFGGMRLLTKHIISLSG LSRVRNLISATFGGKGIASTFENLREAFKDLARVRALPNNALGGVRGMARAYFRNVIASV RNLARYARVRFQYKGTPPRYTEGKPFPRRPKTPPKDLSKAWDDIEPRPRLDRVLPGNPDN LTIEADHIVPARWIEKMDGYSRLTGPQRQEVMNWMENLQAITKRANRSRQNSLYVEEWLG YNPLGDGVFEFPADRLFLTQMQRLEPLLATRIQYMIDQIVAGGL >gi|319978411|gb|AEUH01000119.1| GENE 8 6702 - 7373 806 223 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPALSPAERVGVMADRMDRFVGDVLPETVALIRRCGRDAELALWGERYIVVYTLVNSTGT TLDYYGGNPYQPTWPDEEDPRRIAPSYGIAVYDFNVRAVWDRVPPKLRSFYENVHDGLCW CNWAPDSFYELRAVNEFRSGDFSEFDMSDNDALCDRAQGCYPFFCGSSGSLLTVDLVGKC PGRVDYWSSGGAFEFDIDLWQALDEFYTGPADPEYYAALKDDQ >gi|319978411|gb|AEUH01000119.1| GENE 9 7423 - 8196 826 257 aa, chain + ## HITS:1 COG:no KEGG:BAA_2267 NR:ns ## KEGG: BAA_2267 # Name: not_defined # Def: hypothetical protein # Organism: B.anthracis_A0248 # Pathway: not_defined # 61 241 50 223 232 71 24.0 2e-11 MAILPWRRRRQLAEQAKLDELNRLMSYGGAVSFVEGPPYSVFIPSPWEGIARSTSAIGGR SVSSMMRYTVGDKFPRTVDLLERCGHDAQIVKWGDRYVVVYTLVNSAGETVHYFGGNPMR PSWPMSSEEGYGRNVRAVWSRALEVWPFYRNFQDGFVSCAQPSTGIYQVGDINEFSTGQF TTLGIDDEGLRARAKGCYPFYRSPSGDIVTLDITGQCPGRADLWHPHADPELDIDFWDTI DTLLTTTIDPHHNTQDS >gi|319978411|gb|AEUH01000119.1| GENE 10 8531 - 8605 118 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLWIIPFVEQRIQYMIDQIVAGGV >gi|319978411|gb|AEUH01000119.1| GENE 11 8605 - 9471 939 288 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|228991904|ref|ZP_04151840.1| ## NR: gi|228991904|ref|ZP_04151840.1| hypothetical protein bpmyx0001_26490 [Bacillus pseudomycoides DSM 12442] # 55 252 49 226 235 65 24.0 4e-09 MGLFSRNKGHGLTKLEELNTHTRGHGPLRFVEEAPPAGAPLPEEWRALPALSPAERVGVM ADRMDHYVGDVLPKTVDFIRRCGRDAELAMWEDGERYLVVYTFTDPEGGMSYYCAGNPYQ PTWPGEDEPKRTWPSCMSVYDFNVHAVWDRVPPKLRSFYENVHDGFLYSTGGSLQLYELM 
DVNEFRSGDFPDFDMSDNDALCDRAKGCYIFYHHTSGSLLTVDLVGKCPGRVDYWSTGGD FKFDIDLWQMLDAQLGDDLDPIDYPLIHHTRQPPTTSEVDTGNDEGGR >gi|319978411|gb|AEUH01000119.1| GENE 12 9471 - 10250 766 259 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|228991904|ref|ZP_04151840.1| ## NR: gi|228991904|ref|ZP_04151840.1| hypothetical protein bpmyx0001_26490 [Bacillus pseudomycoides DSM 12442] # 60 246 52 231 235 66 22.0 2e-09 MAILPWRRRRQVAEQAKLDELNRLMSYGGTVSFVERLPYPAPVHCPWSSPHLGASEVTGK VIADQLRLAVSGTMPKAVNLLERCGYDAQIVKWGERYVFIYTLINSTGQTLYYYGGDAMY PSWPRKRDAGYERNIHSVWDLVLDIWPLYSHFHDGFVSCAQPSTGIYQVGDVNEFSTGQF TSLGIDDEGLRDRARGCYPFYRSPGGDIVTLDITGQCPGRADLWHPRADPELDIDFWDTI DTLLTTTIDPEYYEAGADQ Prediction of potential genes in microbial genomes Time: Thu May 12 17:54:18 2011 Seq name: gi|319978403|gb|AEUH01000120.1| Actinomyces sp. oral taxon 178 str. F0338 contig00120, whole genome shotgun sequence Length of sequence - 6297 bp Number of predicted genes - 7, with homology - 2 Number of transcription units - 5, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 463 384 ## Bcer98_0735 hypothetical protein 2 1 Op 2 . + CDS 463 - 1332 628 ## gi|228998017|ref|ZP_04157618.1| hypothetical protein bmyco0003_25860 3 1 Op 3 . + CDS 1332 - 2105 652 ## + Term 2189 - 2251 5.7 4 2 Tu 1 . + CDS 2294 - 3493 1164 ## - Term 3483 - 3520 -0.5 5 3 Tu 1 . - CDS 3556 - 3708 95 ## 6 4 Tu 1 . + CDS 3676 - 5019 1337 ## 7 5 Tu 1 . 
- CDS 5120 - 6295 1245 ## Predicted protein(s) >gi|319978403|gb|AEUH01000120.1| GENE 1 2 - 463 384 153 aa, chain + ## HITS:1 COG:no KEGG:Bcer98_0735 NR:ns ## KEGG: Bcer98_0735 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_NVH # Pathway: not_defined # 14 148 89 212 219 81 39.0 9e-15 SGPDVPRETILGKTPSDPIKKSLTKGPDGKPLEPTDWTLPGNDTNLRLEADHIVPYERIQ MMEGFDKLSRAQQSQIANWPENFWAISMRANRSRGDSYFADWGGHLGRLKSERGPSGFAH PVEPLVCTEMSRIERILEQRIQYMIDKLVAGGL >gi|319978403|gb|AEUH01000120.1| GENE 2 463 - 1332 628 289 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|228998017|ref|ZP_04157618.1| ## NR: gi|228998017|ref|ZP_04157618.1| hypothetical protein bmyco0003_25860 [Bacillus mycoides Rock3-17] # 55 254 49 227 235 65 23.0 5e-09 MGLFSRSKGCGLTKLEELNTYTYKHGSLRFVEEAPPEGAPLPEEWRALPALSPAERVGVM ADRMDRFVGDVLPKTVDFIRRCGRDAELAVWEDGGRYLVIYTFTEPEGGTFYCCAGNPRQ PTWPAESVPKRSADSYGGMSVYDFNVHAVWDRVPPKLRSFYENVHDGFLHDTGGSLLLHE LVDVNEFRSGGFVDFDMLDNDALCDRAQGCYIFCDPSNSYLTVDVVGTCPGRADYWDSDG VFRFGIDFWEIVDDCLGGLLDLLDHPGGGGTRQPLTTSEVDEVDDEGGR >gi|319978403|gb|AEUH01000120.1| GENE 3 1332 - 2105 652 257 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAILPWRRRRQLAEQAKLDELNRLMSYGGAVSFVEGPPYPDQACPSWENYRLYDPTSRGG SMATEMGIATAGALWRTGDLLHRCTRSARVAKWGDRYVVVYTLVNSDGQTAYYFGGDPMR PSWPLSSEAGYGRNVRAVWSRALEVWPLYRCILDGFVSCAQPFTGVYQVGDINEFSTGEY AGLGLDDEGLRARARGCYPFYRGPGGDIVTLDITGQCPGRADLWHPRADPELDIDFWDTI DTLLTTTIDPHHNTQDN >gi|319978403|gb|AEUH01000120.1| GENE 4 2294 - 3493 1164 399 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEYNIIAGGPVGIEDVEAVFGRWGRVCPLGGGAVDVVIGDCGFVIAPEASLSVNPAFVRW VAHARGFGGSARVYSTEIDESYLESEGEETEVSRQVEAGLARLAERVGGVYAPFFQGDAI DYTDPVGGYYSFASGPIPSARKREPGSPALHLSWYGRPGPQAAPLEQWADSLERHLPQCC PLRRPAVEPVVGGGALCMRYGVPWGDVVLHEPGAGAVSDGTAGGGDCGHPGLYTLSCTVP VNAFERDDRPCLDELREFVALAAEDLGAEVATCELVYGYDCGSGVAWPTRKASAKRVVVV DGAGRLLGLPAIPAWLVWLGPAYSGLLAEWVEAKAKNGISCLIRYPGFQGVLLDLEGSAE RYDKPSHNGWWFPEEYLPRKRRFSKAWEPAPRIPHWDGE >gi|319978403|gb|AEUH01000120.1| GENE 5 3556 - 3708 95 50 aa, chain - ## HITS:0 COG:no KEGG:no 
NR:no MDGGVVVGSAHNPILTVRFGAGALRDGGGSREGHRTGGQLPPAGGLGGSI >gi|319978403|gb|AEUH01000120.1| GENE 6 3676 - 5019 1337 447 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSGTDHDSPVHQILLPDLPVDIGPFVGRIEVDTTTAGTGEAGRASAAEGRIEVDADLVFS LAASEDQAGLGSRVGDYRILPVAPLQSLGSRILEAAHEDPADALEALAEEPVALPVAFDD DGVLRLIAWRADEDAKAELNIYSSAAAVLNDFAATGPLFIFRHGASLVDYVGRRTDLISS VAVDPVPGGADRVVFPSELFALIRSDAEAATEKELGGPARTSAPEQDGGPRPAPTGFTLN LSGDWARIDLTADNAERERWIHRLVKDRTRQLSDAGALLRQEMRAWLDDASSQAKANGGL EFAFSTARIRSVTLALSVVTYWRTIMTDNPGGAFATMRAHMEESSAPDDELTVVDSGGDR VLRAIRNRFGAPELGGSDVPMLFLDYWIEVPGDPLALAHIAFSTPHVAIRETITALCDAI ALQGAWTFPGERGTGRGESVEACPDDE >gi|319978403|gb|AEUH01000120.1| GENE 7 5120 - 6295 1245 391 aa, chain - ## HITS:0 COG:no KEGG:no NR:no DGPPGADDERWVPSAAPADADTGRAAGTIHDEMTIVSSRKRKRPAAVEDEAPGVDLAEPE GDETILKPGFHHYVVGPEPTDTGTAEAKPKRWILWCAVAAVCAIAVGAGAWMWLGGRSGT ADEDPYAGLSIGDARTTGDFSATGLRVDTSATWDKDVKRTILTLTYSVAPRVTVSGEVLV VLPGVHDGKCAAPSDAKSALQPIKASTDGMSVPCGYRIVLAPVAYGQQQVVSLPVDLDLV GEDGSAPADYSQWLAAVDSATASALSTMTGTRFPLQRVTGISVKPESVSLDGTEATPVPY TVTATWNGKADGTTETDLVSSETRDGAEIQALLDLTGGQGLDGLTLTTCSSARVSGTRVL AEQPTDSCDLEAQIGGLSSPKASFQARMRNS Prediction of potential genes in microbial genomes Time: Thu May 12 17:55:42 2011 Seq name: gi|319978395|gb|AEUH01000121.1| Actinomyces sp. oral taxon 178 str. F0338 contig00121, whole genome shotgun sequence Length of sequence - 11034 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 806 661 ## COG0515 Serine/threonine protein kinase 2 1 Op 2 . - CDS 803 - 2059 1208 ## RMDY18_19220 molecular chaperone, HSP90 family 3 1 Op 3 . - CDS 2056 - 5184 3508 ## RMDY18_19210 chitinase 4 2 Op 1 . + CDS 5178 - 5270 181 ## 5 2 Op 2 . + CDS 5286 - 9662 4795 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 6 3 Tu 1 . - CDS 9722 - 10321 564 ## Bcer98_0736 hypothetical protein 7 4 Tu 1 . 
- CDS 10544 - 11032 461 ## Bcer98_0735 hypothetical protein Predicted protein(s) >gi|319978395|gb|AEUH01000121.1| GENE 1 2 - 806 661 268 aa, chain - ## HITS:1 COG:Cgl2127_1 KEGG:ns NR:ns ## COG: Cgl2127_1 COG0515 # Protein_GI_number: 19553377 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Corynebacterium glutamicum # 6 213 7 211 418 148 43.0 1e-35 MTRAGGAPLGSRYTLEEPISAGAMGEVWRARDNRSGGQVAAKLLKEHHEDDPEVIARFIN ERKLLLAVDHPGVVGARDMVVEGDTLAIIMDLVEGPDLAALLREHGPLAEDRAAAVGAGV LGALAAAHRAGIVHRDVKPSNVLITHPDAPTAAGVRLVDFGIAALAAQEPDSERIGTPAY MAPEVDAGGRATPAADVYSAGVVLYEMLSGEPPRAAAGADCGRAPRLPIDDALWDLVAHM LAEDPDARPSAAACLRALRRLAPPGRDA >gi|319978395|gb|AEUH01000121.1| GENE 2 803 - 2059 1208 418 aa, chain - ## HITS:1 COG:no KEGG:RMDY18_19220 NR:ns ## KEGG: RMDY18_19220 # Name: not_defined # Def: molecular chaperone, HSP90 family # Organism: R.mucilaginosa # Pathway: not_defined # 9 400 37 455 474 83 27.0 2e-14 MKRPRLPIALLVLGVAAGTAACRPGGGSSQSDPAAQPAALVDAALHHSLPADAPLSIGVL IGPTEGRGSEFRPLIEGANLAAYRFGLGGTPVDLRVALDDGTPSGATDAVRSLLDGGVGA ILVESQSDGHLDAALAEASAADCAVVMAYSQAGADGVWSLAPSATGVRAAIDQTLRDARL SKPYVVTGAQRQGPVEGIASGSAADPRAQATAVKERLDNREIDSVVVAASADEAAAFVTA LEAALDDPQTLVFLTPEAMTPAFGDAVTASGASEGRLVTVGRAPAETTALTAGEAGDAAS AFYAAMRMAASDPQCENIYKDDACATSIRSADATSHDAVVALVRAAEAAQSSKPVRVRAA LSALTVSTKQGLAVGTLDFRSYEAVPASERRVLRATASDTGLAPQSGGAVPTLHWFGQ >gi|319978395|gb|AEUH01000121.1| GENE 3 2056 - 5184 3508 1042 aa, chain - ## HITS:1 COG:no KEGG:RMDY18_19210 NR:ns ## KEGG: RMDY18_19210 # Name: not_defined # Def: chitinase # Organism: R.mucilaginosa # Pathway: not_defined # 70 609 60 586 1007 117 24.0 2e-24 MAIGQDCVCRITVADRCGEGRWNGGDMLERLGEAFKDARRRTAAALCLVMVACLAVLQPT GPAVAGAPTLQHVSVTLDGSSAVTNVGLASMTKNPSGAVTEYSDTIEPQDAVQNLPVRIR TTWTHDGQVGTDLALLAGKSGRFVIQWFVENLTAEPTEVSYESNGMRYSQTELVGVPITI AAHAEVSGGAVVTAADEQGTVSDGSVLALDSATTSVQWAALLAPPALSPTTSFTLVVDSS 
DFQVPQLSLSTIAGITTDPSLAGLLRGALDPGTARGASEQQVLDTVGKATTSLGEAREFV DKVHSALRSDVSTIADSTYADLRSSSNQVASHLRTTGSQLEAIGASAESAVGGATEGVRG SLRSLVDSFASLLGTNSDPELTASAVEGCSLTLPGLAEGEERTVSSTFALVNAQMEALGA VLTPSEENDSCRDAIVSGLRASIGDPAVFDGGEQARVCEEADASAASLTCALHLLDSDVD QRLQRLNTLAEAAVTGYEGLGTNQLMDVLRGRTGLASQLGRLSERVKALQDADPNAGAAR PVDELAADNEAAINAVKSATASLGSASSWLDSVDGVRGALATALNGDGADQGLIARLDAL ANSTEGSASVGSWFLSSNTPAAIDSIVGQLDAQGKRCNASWAQGLTSDSSADAIVAALGG LDQQDCPAAQLARSTASMVEGYAATVGAIQSLRASAAAARDNAQQMNGQLTALNDAVAKL RASIGSGEELANALYALYDERVSADHPDPTGVLVEVRALLADIRGGVPDQRAQIRQIATI VNGIWPDSSVMPLTNPLECPADDAGTAPGASGQAVVWLANRSYCANTDLGAALGSLKSGI ASTSDSSHQLIAQGKDRATGALDGAVNGIDALGSQLQSGIAAQREQSDGATRRMIDEAGA RSDERLAQALGDLDTSMNATLTGLRDGLGDAAGQSTTVASSLEHQFEVLLLNLGQPGTNS RLGLIGKLRGITTDIGATGDVLDHSSASVGAVAHSRAAALRRANLRAAACSLADSRLQEY TPFNGGAGPTTTIITYTIGVSK >gi|319978395|gb|AEUH01000121.1| GENE 4 5178 - 5270 181 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPNTMEAYIFGGWHSPSLKPHDGPTDPMVP >gi|319978395|gb|AEUH01000121.1| GENE 5 5286 - 9662 4795 1458 aa, chain + ## HITS:1 COG:CAC3709 KEGG:ns NR:ns ## COG: CAC3709 COG1674 # Protein_GI_number: 15896940 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Clostridium acetobutylicum # 202 1188 207 1205 1498 310 26.0 1e-83 MRFKFELSANGGTVPILVTTEATATIGDLAGAIESGRTGIRAAERPPLTVEVIDEAGARL LVLPPDSLLLDTPLRSGQRIRLVPGSAAAGPDAEGGARLLVVCGPDSGKSFPLHPGSNIL GRSTECDVVLTDREVSRRHARITIDPGARIDDLGSANGITVDGEHIDSADLAGRIPVQAG TTVFVVDAPEQSGGPDGPGTAFNRSPHVRIDYEGRELQAPEPPGPPSSGRRISPLMALAP LLMGVVSFLIFRNVMTIFFFAMSPLIMVGTALESKISGKRAHKADVEKFQTALDDLRMSV ANALDEEHDRRREEVPALERAIEAAYGRTPLLWYRHPDDDSFLRVTLGYGRAPSRNTLKM PPRRGADEETWSLLLGVKESAGHVDDVPITTSLRECDNLGVAGPREERLPVARGYVAQLT CLHSPAELVIAALASQESSAAWTYLTWLPHVDSPYSPIPGPHLAREPRTASALVAQLEEI VASRRDARANLPSIVVLVENDAPVERGRLVSIAEDGPEVGVHLVWMSDAQNQLPAACRAY LVVRGTSATAGFTTQRQTTPIASLELLPQVTATNLALELSPIEDAGAPVLDQSGIPRAVS 
YLALAGVGLADDPQVTVDRWREDGSLPGAGALSTRGPRTLKALVGQGPLGDFSLDLREQG PHALVGGTSGAGKSEFLQSWVLGMAAAHSPRRVTFLFVDYKGGSAFADCVNLPHCVGLVT DLSPHLVRRALTSFRAELTFREHLLNAKNAKDLLSLEATNDPECPPSLVIVVDEFAALVQ EVPEFIDGMIDIAQRGRSLGLHLILATQRPAGVIKGNLRANTALRVALRMADEIDSTDVI DSPLASEFDPRIPGRGAVRTGPGRISLFQTGYAGGRTSDAPTAARIDIETMAFGPGIPWE IPRPATTRDDEDSGPTDITRIVASITNAARACRLPAPRRPWLPELEARFDMDQVLATSGQ DDPNALPLGVIDDPAHQSQHTVHYAPDTDGNLAVYGAGGSGKSGVLRALAYAASALSDRA RTDIYCLDFSTAGLPMLSPLPNVGAVIDGSDTERVTRLLRRLVELLDDRATRYAAANADS ISEYRRSTGETDEARVLLLVDGLSAFRDAYETGTGAALKAYGNFTRLLAEGRGAGIHVIL TADRPGALPSSLAANVRTRLVLRQADENGYMALDVPKNVLVEAPPGRAIFSGGANELQVA IPSGSSSAPSQAAAFVKLAERMRDAGVAPAEGVERLPDLIARSEVPASVGGMPVLGVAEE DLAPLPFSPLGALSIGGMPGSGRTSAIESVVQAVHRWRPSLPLYFIGPRRSRVHDLDLWR ASARGIDEAQAVLAPVKEIAENAAADDDAPDVVLVVEALSELVGTPAESALLDVVKKLRR NGHLLIAEQETSGWSSGWPLIAEVRNARHGIVMQPNPMDGDVLFKVDFPRLKRSDYPLGR GVYVHSGKVRVVQLPMPE >gi|319978395|gb|AEUH01000121.1| GENE 6 9722 - 10321 564 199 aa, chain - ## HITS:1 COG:no KEGG:Bcer98_0736 NR:ns ## KEGG: Bcer98_0736 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_NVH # Pathway: not_defined # 1 192 50 231 232 79 26.0 7e-14 MVSVMGETVGEHLPRTLRGLRERCVDAELAVHGGEYFVVYTFIDGEGGEECFTGGNPLRP SWPGPGQDGYADNVRRVWESVPESVRAFYERTHDGFGLYPHTLFGLYELKRVRPVYGVLD VGYEEPEVAERARHCYMFYEDASGAPFTLDILADRVERAGDLWWIDSTREYHVDFWGVLD QLLEAVIVPLEDRGPRTGR >gi|319978395|gb|AEUH01000121.1| GENE 7 10544 - 11032 461 162 aa, chain - ## HITS:1 COG:no KEGG:Bcer98_0735 NR:ns ## KEGG: Bcer98_0735 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_NVH # Pathway: not_defined # 6 141 86 211 219 80 36.0 1e-14 GPDVPRPETPSKDIRKQFALGPDEKPLEPTDWTLPGNKRNLGLQPDHIVPYQRIQRMEGF DRLTEAQQRQVVNWSENFHAISARANLSRQEKSFVQWEGFLGEKGKRGPSGFVYPVEPLV RTEMSRIERLMEERIQYMIDQMLPPKPPAGGGWPYIPFFPGR Prediction of potential genes in microbial genomes Time: Thu May 12 17:56:15 2011 Seq name: gi|319978382|gb|AEUH01000122.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00122, whole genome shotgun sequence Length of sequence - 11295 bp Number of predicted genes - 13, with homology - 9 Number of transcription units - 6, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 110 - 1441 916 ## 2 2 Tu 1 . - CDS 1688 - 1996 409 ## - Term 2018 - 2050 2.1 3 3 Tu 1 . - CDS 2112 - 2411 573 ## - Term 2468 - 2530 16.7 4 4 Op 1 1/0.000 - CDS 2538 - 3251 721 ## COG2120 Uncharacterized proteins, LmbE homologs 5 4 Op 2 14/0.000 - CDS 3248 - 4573 2083 ## COG1653 ABC-type sugar transport system, periplasmic component 6 4 Op 3 38/0.000 - CDS 4601 - 5440 1258 ## COG0395 ABC-type sugar transport system, permease component 7 4 Op 4 2/0.000 - CDS 5427 - 6308 1211 ## COG1175 ABC-type sugar transport systems, permease components 8 4 Op 5 . - CDS 6567 - 7958 1555 ## COG0477 Permeases of the major facilitator superfamily 9 4 Op 6 . - CDS 7970 - 8080 75 ## 10 5 Tu 1 . - CDS 8296 - 8943 698 ## Snas_3270 transcriptional regulator, TetR family 11 6 Op 1 12/0.000 - CDS 9044 - 9466 561 ## COG0853 Aspartate 1-decarboxylase 12 6 Op 2 3/0.000 - CDS 9475 - 10401 1018 ## COG0414 Panthothenate synthetase 13 6 Op 3 . 
- CDS 10456 - 11295 881 ## COG5495 Uncharacterized conserved protein Predicted protein(s) >gi|319978382|gb|AEUH01000122.1| GENE 1 110 - 1441 916 443 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METRDDILSKVEVAEGRYRSAGQALVEYAGALERAQTDSLNALVAARSAQQDVDEAVARA GRMRESAGEYPERGDGADDRARYERAATAADGDAEVARGRVRAQRQVILNAMSERDTAAV KAMNVIDGASDDGLGDSWWDDWGSKIVKWVAKICDIVSAITGLLGLLVCWIPVIGQALAG VLFAISAITGVVAAIAHVALAATGEESWTEALISVAFAALGCLGLGAMKGVGRFMCKAMA SRVVQRVGRASAAYMARLAGMTKGAMGTLEGMRRLRVLRFAGRIGMRTLLRPGAWARIKA ANNANNAVGRLGEEILLINSTAPKVNFRMGRYVAGATHGREPDLFFRFPGFKNPILGDVK YWQRLGYTPQLRDFSRIAAQDGAGGVFHIFSPEYYTRLSGPLQALARGAGNGGNFVFHSL ENVLGVRAVAPVIAGAGLGHVTR >gi|319978382|gb|AEUH01000122.1| GENE 2 1688 - 1996 409 102 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSGETMIIEVDTIAALGSSAGTIAAEFEGANAESDTIAAAVGHSGLSKSVHDFAHGWDDK RKKMTDALKAMSQAATAVADTWKDFDEQGADALRGEGEQSGG >gi|319978382|gb|AEUH01000122.1| GENE 3 2112 - 2411 573 99 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MANAYGYSPEEMRSIGSQLVATKAEISEKIQAALAAVNGLIGSGFTTQAASGAYSEQFNQ LSQGLTQVNDGLEPLGNFLTQYADAVVEMDTQMSSSLRG >gi|319978382|gb|AEUH01000122.1| GENE 4 2538 - 3251 721 237 aa, chain - ## HITS:1 COG:PAB1341 KEGG:ns NR:ns ## COG: PAB1341 COG2120 # Protein_GI_number: 14521743 # Func_class: S Function unknown # Function: Uncharacterized proteins, LmbE homologs # Organism: Pyrococcus abyssi # 9 230 34 250 267 90 32.0 2e-18 MSTQAFASVLGVYAHPDDADVDAGATIARLARQGARVTLVVVTDGGAGGFEADGQSSMGG RRRAEQCEAARALGVSEVVFLEGYADGHVKEDPRLTRDLVRQIRLWRPQLVLSLSPEYNW DSIYANHPDHRAVGTALVDAVYPAARNPFDYRDLLGGGLEAHAVAEVWFQGGPAVNHVVP VEECDLEAKVRAVRCHDSQFPDMDGIEAHIRAAAARAGNGLSGAPLGEGFFRWVVAG >gi|319978382|gb|AEUH01000122.1| GENE 5 3248 - 4573 2083 441 aa, chain - ## HITS:1 COG:ML1770 KEGG:ns NR:ns ## COG: ML1770 COG1653 # Protein_GI_number: 15827946 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mycobacterium leprae # 8 420 17 425 446 279 40.0 7e-75 
MKRRITGASAAIAAALALAACSGGQPTGQSGAGGSLTFRLWDEQAAKAYEDALPAFEAAT GIDVSVEVVPWNDYFTGVRNEIAAGAGPDVFWSNTNNYSEYAKGGKLVNMNEVFSDSERA SWLDTAVSQYTVDGQIYGAPVITDPSVALYYNKELLDRAGVGVEELDGLAWDPAASSDSL RDIAKRLTLDSSGRNAADPGFDPNTVVQYGYNAALDGQAMLFPYLGSNGASLQDADGRFT FASPEGEAAIGYLVDLVNKDHVAPSAADTNDNGDFSRDQFLQGKMALFQSGAYNLANVQE GAGFTWGLAPQPKGPKGAVTVGNSVVAVANAADASKSDAQKKLLEWLAGPEGGRALGAVG VGFPANAEAQDSWSQYWSGKGVDVTVMATKPTGTITAPFGAKLGAAMDAYNKELKEVFLG RVGVPEGVQAAQDAANKAVDE >gi|319978382|gb|AEUH01000122.1| GENE 6 4601 - 5440 1258 279 aa, chain - ## HITS:1 COG:MT2380 KEGG:ns NR:ns ## COG: MT2380 COG0395 # Protein_GI_number: 15841822 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mycobacterium tuberculosis CDC1551 # 15 279 11 274 274 238 51.0 8e-63 MSSSSTRRGPLAALGIYLGLSAAALIMAAPLLFSFMTAFKSPKDFASHSPFALPSPFSLE SFGAVLGGRIAFGGAIATTLAVVVVMVAGQLVSSVMAAYAFARIDFPGRDLVFWLFLATM MIPATVLVVPLYLMLARMGLNNTFWGIVIPFVFASPYAVFLLRQSFRQIPQELIDAATLD GAGTWRILWSLVLPMSKPILATLTLITVVSHWNSFMWPRIIAAQRPKVITVATAALQSQY NANWTYVMAATTIALVPLIALFIAFQRQIVGSIALTGLK >gi|319978382|gb|AEUH01000122.1| GENE 7 5427 - 6308 1211 293 aa, chain - ## HITS:1 COG:ML1768 KEGG:ns NR:ns ## COG: ML1768 COG1175 # Protein_GI_number: 15827944 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mycobacterium leprae # 10 292 45 327 328 275 51.0 7e-74 MAIRSRRRLRDTLTGYALLAPSLLGVVAFMAVPILVVVWLSVHKWDLIGEPSYVGAPQIA SVLSDPGFLSSLGITALLVVLVVPAQIVLGLLLANLLTKGVRGTTVFRALVVIPWIAPPL ALGVVWSWIFAPTGGLLSAIAGQRLEILVSPTWALPAAAFVVIWGNVGYTALFFIAGLLS IPKELVEAATVDGASSSQIFWRIKMPLLRPTFFFVSVTSVISVFNLFDQIYALTKGGPDG ATEVLAYKIYQEAFETGNLGRAAVMAVVMMLILMAITLAQNLYFRSRTTYEFV >gi|319978382|gb|AEUH01000122.1| GENE 8 6567 - 7958 1555 463 aa, chain - ## HITS:1 COG:DR2169 KEGG:ns NR:ns ## COG: DR2169 COG0477 # Protein_GI_number: 15807163 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P 
Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Deinococcus radiodurans # 70 426 45 404 441 136 35.0 8e-32 MVRTHGAGSRHPGPVEEVTLMEAAEENGAPPDARAGETAGAGAGAAAGTAAADGARSGAD GAGRQAQVGVVLAAVFITYVGQSTLNPVIAPLSRLVGLKEWQIGLTISVAALMVVLTSQA WGKRSQSMGRKPVLVIALAVAAVSMGAFALVSHLGVTGALSGAPLFALFVIVRGVVFGAA LAAVLPTAQACIADATTTEESRVRGMAGVGAAQGLASIAGALIGGALSGLSLLVSVDAVP VFLLIGLLVVAAALRKESRTALIAKPARVSPRDPRVWPYLVAGFGMFSALGLVQVVTGFL VQDRLGLDADTTGLITGGALLAAGVGMVLAQSVIVPLSKWAPPVLLRAGALTAAIGFGLL LVDAGLAALIASVAIIGVGIGTAMPGYTAGPTLRMSRDEQGGLAGLIGATNGLTYVVAPT LGTFLYGVAPALPIVIGAAMLVGVLVFVCAHSGFRRPTGQGAG >gi|319978382|gb|AEUH01000122.1| GENE 9 7970 - 8080 75 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIRCLPHCGVACPDADLPDGAVRPGSPPIAIRTLFE >gi|319978382|gb|AEUH01000122.1| GENE 10 8296 - 8943 698 215 aa, chain - ## HITS:1 COG:no KEGG:Snas_3270 NR:ns ## KEGG: Snas_3270 # Name: not_defined # Def: transcriptional regulator, TetR family # Organism: S.nassauensis # Pathway: not_defined # 1 204 26 225 225 168 40.0 1e-40 MSHGTGIQGWSMRDLAARLGVSASVVHHHVGNREALTARVVGRVLDLVGLPTEAMEWREW FRRALLPARPVLSAYPGTARWLLMHSAALPELGPVVDSGIASVQRAGFGGNSALVFACVV NAAMMSIATADDRALHGDQVPADHALLMEGLSEAAAASPGIALLSADMIAQFAGTPRERD IAFDRYYRFVVEAMMDGLELRLLGTVHRPPNSQVL >gi|319978382|gb|AEUH01000122.1| GENE 11 9044 - 9466 561 140 aa, chain - ## HITS:1 COG:MT3706.1 KEGG:ns NR:ns ## COG: MT3706.1 COG0853 # Protein_GI_number: 15843213 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate 1-decarboxylase # Organism: Mycobacterium tuberculosis CDC1551 # 1 128 2 129 139 147 60.0 7e-36 MREMVVGKIHRATVTGADLHYVGSITVDEDLLDAADIVPGQKVDIADIDNGARLATYTIA GPRGSGVVQLNGAAAHLVSVGDLVIIMAYAHVPESQARTMAPSVVFVDADNRVVEAGDDP GAVPEGSARARELGLRSSGA >gi|319978382|gb|AEUH01000122.1| GENE 12 9475 - 10401 1018 308 aa, chain - ## HITS:1 COG:aq_2132 KEGG:ns NR:ns ## COG: aq_2132 COG0414 # Protein_GI_number: 15607078 # Func_class: H Coenzyme transport and 
metabolism # Function: Panthothenate synthetase # Organism: Aquifex aeolicus # 22 299 22 280 282 209 42.0 6e-54 MTTRPRLVRTRTELADALGALGGARALVMTMGALHEGHLQLVREARALADHVVVSIFVNP TQFAPGEDFDAYPRTLDADMEALAGVGADLVWAPGPADVYPAPVRATIDPGPVARVLEGA SRPTHFAGVALVCTKVVNLVRPDVALYGLKDAQQLAVLRTVFEGLDVPVRLHPVPIVRDH DGVALSSRNRYLAADERARARALPEALSLAVAAARGGAGAALAVEAARERIEAEGGIAID YIAVVDDGTFDVLAGTGSAAPQVAANPGPATIVEGGLRACRVLVAARVGGTRLIDNMELP LVCEEAGA >gi|319978382|gb|AEUH01000122.1| GENE 13 10456 - 11295 881 279 aa, chain - ## HITS:1 COG:Rv3603c KEGG:ns NR:ns ## COG: Rv3603c COG5495 # Protein_GI_number: 15610739 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis H37Rv # 2 223 20 257 303 128 42.0 1e-29 VGHVGPVIASALRAAGHCIVAVSASSDASRERADAMLPGVPVEEVPVVVERSEVVVLAVP DDQIRPLVTGLARSGAWQTGQIVIHLSGAHGVSVLDGAASQGAIPLAIHPAMTFTGTSVD VRRLVGAPFAVTGAAPFLPIAQALVVEMGGEPVVVDEERRPLYHAGLTHGANSIAGIANQ TMRILGVAGVEDPARYARPLLEAALDRALTEGVAGISGPVPRADAGTVAAHVRALGASAG LRGELDTYVSMTRALVALLQDARLLDERRATELLAALLQ Prediction of potential genes in microbial genomes Time: Thu May 12 17:56:54 2011 Seq name: gi|319978370|gb|AEUH01000123.1| Actinomyces sp. oral taxon 178 str. F0338 contig00123, whole genome shotgun sequence Length of sequence - 12157 bp Number of predicted genes - 11, with homology - 9 Number of transcription units - 7, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 33 - 1697 1922 ## COG3428 Predicted membrane protein 2 1 Op 2 . - CDS 1694 - 2179 632 ## COG3402 Uncharacterized conserved protein + Prom 2180 - 2239 1.9 3 2 Op 1 40/0.000 + CDS 2281 - 2985 1097 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 4 2 Op 2 . + CDS 2982 - 4514 1780 ## COG0642 Signal transduction histidine kinase 5 2 Op 3 . + CDS 4550 - 5506 1115 ## COG4339 Uncharacterized protein conserved in bacteria 6 3 Tu 1 . 
- CDS 5709 - 5777 62 ## - Prom 5806 - 5865 2.4 7 4 Tu 1 . + CDS 5867 - 6157 368 ## gi|154507919|ref|ZP_02043561.1| hypothetical protein ACTODO_00404 + Term 6260 - 6306 0.2 - Term 6471 - 6522 17.1 8 5 Op 1 . - CDS 6538 - 8163 1608 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 9 5 Op 2 . - CDS 8369 - 10279 2122 ## - Prom 10340 - 10399 1.6 10 6 Tu 1 . - CDS 10422 - 11009 559 ## Bcav_0815 hypothetical protein 11 7 Tu 1 . + CDS 11147 - 11848 490 ## COG0692 Uracil DNA glycosylase + Term 11922 - 11964 -0.9 - TRNA 11918 - 11993 96.6 # Thr TGT 0 0 Predicted protein(s) >gi|319978370|gb|AEUH01000123.1| GENE 1 33 - 1697 1922 554 aa, chain - ## HITS:1 COG:Cgl0622 KEGG:ns NR:ns ## COG: Cgl0622 COG3428 # Protein_GI_number: 19551872 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Corynebacterium glutamicum # 16 531 5 464 471 157 28.0 6e-38 MSAESRSITADGVAADAWHRVHPVSPIINAWQVIAALVAIITYRGVSAYSSGAPSGWEIL NGIAEGFHLRGVLMAAFAVVIAVLVVSGLYSWLQWRATAYAVDGGAVWFRSGIVFRTNRH ARLDRIQSVDIHLPLLGRILGLGRLSIEVAGGAGSSFTIGFLRATELDELRAHILALAAG LEVGSAGEGAQPGAPARPAAESPGRVSGALEAEAARGAARRGAFTSSAPIAPENVLYEVG AGPLIGSLLLTVPMMVLLVVLVAVVAASAWAIAARGAVAAPSLFAVVPLVVASGSVLWGR FNAEFAFTAAASPDGIRIRRGLTDSRNQTIPPGRIHAIEIRQPLLWRLTGWYRVTMTQAG NSVKMGKENNGGNNELVSARVLLPVGSRAQAELAVWMVVQDLGVPDPAAFVDSVFAPTHA GADAHFTRVPHRARLVDPLVRRRRAYALTGSLFVIRDGWLTRRCALIPLARIQSTHILQG PVERRLDVATVRADLVPGVVSHTARHVDRRGAQLLWKRLEDASRVRREAEPPEKWMRRAL AARDHAAAPQGEGA >gi|319978370|gb|AEUH01000123.1| GENE 2 1694 - 2179 632 161 aa, chain - ## HITS:1 COG:Cgl6021 KEGG:ns NR:ns ## COG: Cgl6021 COG3402 # Protein_GI_number: 19551871 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 60 161 48 149 149 102 45.0 4e-22 MADLMAPDSISFTPVSSALAKVRLATCAVVLAALAAGIAVAALASDEAGLWRWEAAPVCL LVWLCWLVPRQVRAIGFAEGPDEFIIRRGVMFRSMTMVPYGRIQYVDIAQGPVARFFGIA QIKLSTASATTDATLDGVPAADAARLRDALAKRGSAELMGL >gi|319978370|gb|AEUH01000123.1| GENE 3 2281 - 2985 
1097 234 aa, chain + ## HITS:1 COG:MT0782 KEGG:ns NR:ns ## COG: MT0782 COG0745 # Protein_GI_number: 15840172 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Mycobacterium tuberculosis CDC1551 # 6 231 12 237 240 262 60.0 5e-70 MTAPAPEAKLLVVDDEPNIRDLLASSLRFAGFDVSTAEDGSSAFHEAQAVRPDLIVLDVM LPDMDGFTVTRRLRDAGLTTPVLFLTARDDMRDKIQGLTVGGDDYVTKPFGLEEVVARIR AILRRTMGGDDEDGKLRVGDLVIDEDAHEVKRAGVDIDLSPTEFKLLRYLVINAGRVVSK MQILDHVWEYDWDGEVAIVESYISYLRRKLAVEGASGELIHTKRGVGYILRAED >gi|319978370|gb|AEUH01000123.1| GENE 4 2982 - 4514 1780 510 aa, chain + ## HITS:1 COG:Rv0758 KEGG:ns NR:ns ## COG: Rv0758 COG0642 # Protein_GI_number: 15607898 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mycobacterium tuberculosis H37Rv # 19 490 10 467 485 232 38.0 2e-60 MSERIGLADRIERAWTNRPLSAQLVVLITGLLAVGLTMSGTVMIGLLQRHLISQVDAQLT SDAALRLFNQTGAAASNGMASTVPTAYYLRLHHPSIDPDQVTFYPETTAASGTPILPELL GPGEARAGTGEYTQAVTVASSKEGHPWRVVATTIEVKGATPSTGVLTIALPLTDVQHTLR TTAAYFSIAAVVIVFAGGWVGHWLVRRSLSPLRSIESTAGKIAAGDLTQRVRPGPPSTEV GSLALSLNSMLTQVEQSFEARQASEKKIRRFVSDASHELRTPLAAISGYCELYSMGGVPA ERVEEVMGRISSESTRMAVLVEDLLTLARLDEGRPLEFTDIDLVKMADNAVFDLQALDST RTVGLSSLEGRRAPMSLVVSADRDRIAQVFTNLIGNIVRYTPAGSPVEIALGTAADYAVV EFRDHGPGIADRDRSRVFERFYRSDSSRNRRSGGSGLGLAIVSGILGAHHGNAALTKTKG GGLTVRIELPLPRPQNEAMVTGDADSAATV >gi|319978370|gb|AEUH01000123.1| GENE 5 4550 - 5506 1115 318 aa, chain + ## HITS:1 COG:all0539 KEGG:ns NR:ns ## COG: all0539 COG4339 # Protein_GI_number: 17228035 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 36 222 33 209 212 85 29.0 9e-17 MGAHAPQWLLGSFVEAMQQIGATAPTDELAAEARDLAERWSAEGRALHDLHRLIKVLTHI DEISLSAHDPDLLRIAAWYHGAFLNRALEVRLGGFEAQFASARCIDHAHNRLTELGVSEE VVARIDELIAFLTRHRAPRADFDAQVLVDADLAVLACSPQDYKKFREKLRLELHDLDDVQ FLKARCALVKRLLGYDRIYQSPLGNTWENAARANLEVELARLEDAKAKLCPGPDDEADAD DADDEPYGADRVTTTGTLVIKRRTLKKNPCPPPPEEATGTGILPVKAPEPAPDADEPEAT SSLESAVESLDLPATPAD >gi|319978370|gb|AEUH01000123.1| GENE 6 5709 - 5777 62 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWTGASGCACLWICGDVDRRSR >gi|319978370|gb|AEUH01000123.1| GENE 7 5867 - 6157 368 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154507919|ref|ZP_02043561.1| ## NR: gi|154507919|ref|ZP_02043561.1| hypothetical protein ACTODO_00404 [Actinomyces odontolyticus ATCC 17982] # 1 96 20 115 116 65 57.0 9e-10 MSTYMVDSAQVAASATQITATAGQIRSEVQAMMAQLVALGESWSGGAQATFQGAVAQWQG AQAQVEAALDAISSQLQTAAALYSDAEARSTALFAG >gi|319978370|gb|AEUH01000123.1| GENE 8 6538 - 8163 1608 541 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 3 540 4 543 547 624 60 1e-178 MTKIIKFDEDARRGMENGLNLLADTVKVTLGPKGRNVVLDKKWGAPTITKDGVSVAKEID LEDPFERIGAELVKEVAKKTDDVAGDGTTTATVLAQALVHEGLRNVAAGSNPIALKRGID QAVDAIVKQLHADAKPVETSEQIAATASISANDTEIGRLIAEAFEKVGSEGVVTVEETNS FDTTLETTEGMRFDKGYLSAYFVTDQERQEAVLEDAYVLLMDSKISNVKDIVPVLEKVMQ TGKPLAIIAEDIEGEALATLVVNKIRGTFKSVAVKAPGFGDRRKAMLQDMAILTGGQVIS ETVGLSLENADLELLGRARKIVVSKDETTIVEGTGDKDMLDARVRQIRQEIENTDSDYDR EKLQERLAKLAGGVAVIKSGAATEVELKERKHRIEDAVRNARAASEEGLVAGGGVALIQA AEKALASLELVGDEATGVNIVRLAVVSPLKQIAENAGLEGGVVADRVAGMPDGHGLNAAT GEYGDLMAEGISDPVKVTRSALQNAASIAGMFLTTEAVVADKPEPPAPAGGADDGGMGGM Y >gi|319978370|gb|AEUH01000123.1| GENE 9 8369 - 10279 2122 636 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDPSMYTAAMAADPTITPETMQDIAIKRPDLRGVLATNPSLYPDLRNWLAQAMAQAAGQQ GPGGQYAPATQFSSNGPAGNGTAQAGPAPQEPGAAQGWSGGQGGYAPDGSAQAGGQGGYA PGGFAQGAGQAGYPQGGFAQGAGQGGYAPGGFAQAGGQAGYPQGAGAQGAYGAGGPGMVG APYPAPKNRSKAILWIAIAAVVAVVAAGAGIFAFARMSGGPKDYDLATGWNNGGQQAWTL 
NLSGLFSTGSSVATAFQPVVAVKGNRMVLLGVENGTFTAKGFDVSGDQPEQKWAQPVQGG EKALSSYEDPTVRWIGKDHVLVEMGRASVPVLIAAKDGEQTPLTWYDASDENSAVVDAPQ GPGEVVSVQCSVSEGDKWKSSTETHPTCTLYSKTGEQTTAKIAEGATSKFWILSLRDKAL YPLQTDVEPSRTAGWLASVIVGDKDPLEGKGFAKDTSTFDYYDVQGQLKGTVDVGTGMPM PCGNGTRSSNNFSKLFYDALEGKLTDTVLVLEYSDSLRFTGMHIGADTTNRFGEDMQKLA NLGSRANASQTGCVASGDGKALAIVALGVEVEAGILEPVLAGDGITAIIDPAAHTLKPVS EIPGVSAPPASYALLARSDLIVTVDDSTVTAFKPAK >gi|319978370|gb|AEUH01000123.1| GENE 10 10422 - 11009 559 195 aa, chain - ## HITS:1 COG:no KEGG:Bcav_0815 NR:ns ## KEGG: Bcav_0815 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 5 195 8 184 189 94 38.0 2e-18 MSSQYPRDEFDRAGEDMPIGMHRPQPSRWKNVWPFLVILIVVPALGWAAATFLANRQNGQ PVGTVTTGSPTTPAQQSGTAAPSSAPTPSQEPTPSPSPTTTTEAPPTPDHNAAIQVLNGT GTQGLAASNTQKLNAAGYAGTSAANATGWDTQVSTVYYEDPRMEATAKDVAATLGIDNVQ RMDGIGSPDVVVVLR >gi|319978370|gb|AEUH01000123.1| GENE 11 11147 - 11848 490 233 aa, chain + ## HITS:1 COG:ML1675 KEGG:ns NR:ns ## COG: ML1675 COG0692 # Protein_GI_number: 15827886 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Mycobacterium leprae # 7 233 4 227 227 261 63.0 1e-69 MDAYNPRPLSELIDPGWARALAPVEDRVHRIGEELARAAASGTGYLPAGTDVLRAFTYPF DQVKVLIVGQDPYPTPGHAMGLSFSVAPGTPLPRSLVNIFTELCDDLGVPRPSSGDLSPW SEQGVCLLNRVLTVRPGSPGSHRGMGWEEVTACAIDALVSRHDGAGSPAPLVAILWGRDA QALEARLGRTPCVKSPHPSPLSARRGFFGSKPFSTANSLLEAQGASPVDWRLP Prediction of potential genes in microbial genomes Time: Thu May 12 17:57:46 2011 Seq name: gi|319978367|gb|AEUH01000124.1| Actinomyces sp. oral taxon 178 str. F0338 contig00124, whole genome shotgun sequence Length of sequence - 2186 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 39 - 953 1242 ## Arch_1641 DsbA oxidoreductase 2 1 Op 2 . 
- CDS 1007 - 2032 994 ## gi|293190401|ref|ZP_06608833.1| tyrosine protein kinase:Serine/threonine protein kinase Predicted protein(s) >gi|319978367|gb|AEUH01000124.1| GENE 1 39 - 953 1242 304 aa, chain - ## HITS:1 COG:no KEGG:Arch_1641 NR:ns ## KEGG: Arch_1641 # Name: not_defined # Def: DsbA oxidoreductase # Organism: A.haemolyticum # Pathway: not_defined # 4 229 16 233 282 71 24.0 4e-11 MGNKSAKKDAVRLKAQQMRQAQERADRRTRIIVISVVTVVVLAVVASVAYVILRQRAIIE EARNVDPASVLGDYADGRPVVVGPNGVGQADPSLPTLTEYFDYSCHACADTDAAIGAQLT QWAEQGRYNIEIQSVTTVGMEYQKAATSASLVVAQKDPDHWTAFHHALLAYFRTQFQASN GTVVQDLEASWRQVKTIASETGVPQGVVDTFPLNVSDDYLKASTAAWQGANVAGRGSSLG TPEFVKNHARMIPLTSAAELQQSIDQAFTPGADDSAPQSAAQSSGGAGPAAQSGAQQEDA AAEE >gi|319978367|gb|AEUH01000124.1| GENE 2 1007 - 2032 994 341 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190401|ref|ZP_06608833.1| ## NR: gi|293190401|ref|ZP_06608833.1| tyrosine protein kinase:Serine/threonine protein kinase [Actinomyces odontolyticus F0309] # 69 339 382 643 643 132 39.0 3e-29 MDSDGDYDDGATEVYPPLDSDGDYGDGATEVYPPWGSGAGGGPGGPGGYDGDATQVYPPG AHAGAPAFPVPPQPYAPPSGAGSGYAPMPAHVPYQTGPWAPIAGQGPYPAAPQATGWAGA VPGSAQPVPPWGAAPRDPGPPKTPWLTGAAVMAAFAVAPLAMGLGGTLLVGAVLTAVSLG GALAGWVDGRRSRHGGPRSSDGPAAVLMAPLLMVRCLAQLAAGVLLGGAVPYLLWGFVSY SIEGRALWEWPVRMATEADPVVSADPWLTDPDRAAIVWALAACALVLCWLTPFTRDLKKG FALGCDRWLRPPWARLGVVLVCAAIMIGTWTLATGGWAHTA Prediction of potential genes in microbial genomes Time: Thu May 12 17:58:20 2011 Seq name: gi|319978342|gb|AEUH01000125.1| Actinomyces sp. oral taxon 178 str. F0338 contig00125, whole genome shotgun sequence Length of sequence - 24071 bp Number of predicted genes - 21, with homology - 18 Number of transcription units - 15, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 909 860 ## COG0515 Serine/threonine protein kinase 2 2 Tu 1 . + CDS 1178 - 2308 1989 ## COG3839 ABC-type sugar transport systems, ATPase components + Term 2337 - 2383 16.1 3 3 Op 1 . 
+ CDS 2491 - 3756 1792 ## Bcav_0782 hypothetical protein 4 3 Op 2 . + CDS 3819 - 4571 892 ## COG0584 Glycerophosphoryl diester phosphodiesterase 5 4 Op 1 . - CDS 4656 - 5855 1507 ## gi|154507928|ref|ZP_02043570.1| hypothetical protein ACTODO_00414 6 4 Op 2 . - CDS 5861 - 5929 71 ## + Prom 6171 - 6230 1.7 7 5 Tu 1 . + CDS 6283 - 7755 1925 ## COG2268 Uncharacterized protein conserved in bacteria 8 6 Tu 1 . - CDS 7905 - 8570 424 ## SYO3AOP1_1421 hypothetical protein 9 7 Tu 1 . - CDS 8727 - 9755 1820 ## COG0191 Fructose/tagatose bisphosphate aldolase 10 8 Tu 1 . - CDS 9859 - 10050 113 ## 11 9 Tu 1 . - CDS 10213 - 10854 657 ## COG0566 rRNA methylases - Prom 10879 - 10938 1.7 12 10 Op 1 . + CDS 10889 - 11713 1294 ## COG0647 Predicted sugar phosphatases of the HAD superfamily 13 10 Op 2 . + CDS 11747 - 12316 818 ## COG0461 Orotate phosphoribosyltransferase + Term 12439 - 12484 14.1 14 11 Tu 1 . + CDS 12595 - 12996 166 ## 15 12 Op 1 . - CDS 13410 - 14273 809 ## COG0300 Short-chain dehydrogenases of various substrate specificities 16 12 Op 2 . - CDS 14313 - 17138 3060 ## RMDY18_19280 hypothetical protein - Prom 17168 - 17227 2.0 17 13 Tu 1 . + CDS 17132 - 18058 1278 ## COG0708 Exonuclease III 18 14 Tu 1 . + CDS 18160 - 19590 1847 ## COG1376 Uncharacterized protein conserved in bacteria + Term 19647 - 19688 3.0 19 15 Op 1 . - CDS 19607 - 20419 859 ## COG1680 Beta-lactamase class C and other penicillin binding proteins 20 15 Op 2 . - CDS 20428 - 21243 1072 ## HMPREF0573_10538 TetR family transcriptional regulator 21 15 Op 3 . 
- CDS 21274 - 23886 1750 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 Predicted protein(s) >gi|319978342|gb|AEUH01000125.1| GENE 1 3 - 909 860 302 aa, chain - ## HITS:1 COG:MYPU_6850 KEGG:ns NR:ns ## COG: MYPU_6850 COG0515 # Protein_GI_number: 15829156 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Mycoplasma pulmonis # 1 225 11 237 320 122 33.0 1e-27 MIRRLGTGGAGTVWLAEDGGGTAVALKALHPALASSDAARTRLRRETRTVNAVRSPFVAH IVDAETDASQPFVVSEYVAGPTLAEVVQSGPVPMRGVAALSYHLASTIAAVHQADVIHRD IKPSNIICSQRGPVLIDFGIAMGAEDEHLTSTGLVSGTAGYTAPELLAGRPPSRHSDWWA WCATLLSCATGRPPFGSGDMKATILRVMQGEPDLAGLSPRIAAALGAGLGTEPEGRPSPS LVVADLMGAVGWAPGELDYVSVNWAELLNTGERTVPLSSDPQEIAAPPQWDEGSARPPVG AA >gi|319978342|gb|AEUH01000125.1| GENE 2 1178 - 2308 1989 376 aa, chain + ## HITS:1 COG:Cgl2410 KEGG:ns NR:ns ## COG: Cgl2410 COG3839 # Protein_GI_number: 19553660 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Corynebacterium glutamicum # 1 375 1 376 376 496 70.0 1e-140 MATVTFDKATRVYPGSDKPAVDQLDLEIKDGEFLVLVGPSGCGKSTSLRMLAGLEDVNSG RILIGDKDVTDVPPKNRDIAMVFQNYALYPHMSVRENMGFALKIAGTPKDEINKRVEEAA RILDLEPYLDRKPKALSGGQRQRVAMGRAIVRKPQVFLMDEPLSNLDAKLRVQTRTQIAS LQRELGVTTVYVTHDQTEALTMGDRIAVLAGGLLQQVGTPQEMYERPANEFVAGFIGSPA MNLGTFTVDGEWAKLGPARVPLSKATRAALTDEDGGKIKIGFRPEGLDVVPEGTEGTIPV EVDFVEELGSDAYVYGHLAGAEEDESLGSGAEGTGKQLIVRVPPRTAPDRGGVVHVRIRK GQQHNFSASTGERLPE >gi|319978342|gb|AEUH01000125.1| GENE 3 2491 - 3756 1792 421 aa, chain + ## HITS:1 COG:no KEGG:Bcav_0782 NR:ns ## KEGG: Bcav_0782 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 1 403 1 402 415 504 62.0 1e-141 MPETMEITAANVDPALLDLPWDTPLQEWPDSILAALPRGISRHIVRFVNLSGRILAVKEI GPSVAHHEYDILRDLARLNAPSVIPTAVVTGRKSPTGEELNAALVTEHLTFSLPYRALFS 
LRMRPDTAARLIDALALLLVRLHLLGFYWGDVSLSNTLFRRDAGAFSAYLVDAETGELHP RLTDGQREYDVDLARTNIIGELMDLQAGGYFPDDADPIEIGDRICGQYNLLWNELTGEET ISRGQRRYKVSERIRRLNDLGFDVAELRMASDASGEHLSIQPKVVDAGHHHRKLMRLTGM DVGENQARRLLADIDSWRATTGRTAMPIELAAHEWLTDEFTPVVTAVPPALAAKLEPAQI YHEFLEHRWYLAQQAGHDLPRDEVIRSYIDTVLPGKPDEAVLLDPGEADAPASAASNPDL W >gi|319978342|gb|AEUH01000125.1| GENE 4 3819 - 4571 892 250 aa, chain + ## HITS:1 COG:BS_glpQ KEGG:ns NR:ns ## COG: BS_glpQ COG0584 # Protein_GI_number: 16077282 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Bacillus subtilis # 8 242 41 286 293 135 33.0 9e-32 MTPLPRIIAHRGAKSLAPENTIAAFAKAMEVGARWFEFDVGAIGDGSLIVMHDDTLDRTT TGSGRYDGLAFSDIRKLDAGRWFSSTYRFERVPEAADAIEFGNTAQMGMHLEVKPCRRNP LLRERLVEALAVAVGAAADPAHFVVSSFDHDLLAAFHEARPDVALGWLVERGQGPSSWRG GAEALGCAAVHPPLEGLTEAEVAGMRAAGFDVNVWTVNDVECAQRLAQWGVTGVFTDVPQ DFPADALARL >gi|319978342|gb|AEUH01000125.1| GENE 5 4656 - 5855 1507 399 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154507928|ref|ZP_02043570.1| ## NR: gi|154507928|ref|ZP_02043570.1| hypothetical protein ACTODO_00414 [Actinomyces odontolyticus ATCC 17982] # 1 399 1 399 401 529 72.0 1e-148 MTQEWTRVEGMIIGDELDPNAVREELHERGVLAQLEWFASTPHMLSLTVATDGERIATVP AGSEQVEAGPALEELAEDIAGLFKAEVRIGSVTADHLPQGDSPLGRAASDDGAPSADGEG APTRIVEIGRTPASSVPLLAALEGVDLGDLELDDGHRALLAELPHGKEGWNFGELPLVTL SVSDSEFQVLLVTDDHIEHIVSHNWGMETVIVPGARAKASELPDEVIDLVGDRPDLEAIA AAVPGADPVALWATATTRGEESVWKVVKALGLPAGVAGFLLGATEASEVEGVTVHLARGI SNAIGRSVDIMLGEPESAVKPLWNSYESVAVQRPWLIHAAVGAEAIIGTGLVVTAARASS PRSGWTRFAGVVGALMLVDSIAELSLAKHVAKRYLRRQG >gi|319978342|gb|AEUH01000125.1| GENE 6 5861 - 5929 71 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGAMRHNRINRGAMRAPQQRKG >gi|319978342|gb|AEUH01000125.1| GENE 7 6283 - 7755 1925 490 aa, chain + ## HITS:1 COG:BS_yuaG KEGG:ns NR:ns ## COG: BS_yuaG COG2268 # Protein_GI_number: 16080153 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 475 1 498 
509 175 31.0 2e-43 MSIPLIAALVIGGVVLVALFLWASFVSASPGEIKVISGPRGQRVLHGKTGWKVPLLERVD SMTASMISVDAQTTDFVPTNDYINVRVDAAVKVRIATDDPTLFRAATRNFLYKETREISE EVRDTLEGHLRAIIGQMRLTDIITDRAAFSERVQENAKLDLEEMGLEIVAFNIQNVMDQN GVIDNLGIDNTEQIRKTAAIAKANAQKEVAQATAVAEKEANDAQVASQLEIAQKQTDLAK RQAALKVEADTEKAKADAAYEIQSQIQRRDIERETAQADIVKQEQQAVIKEKEVVVARQA LQAEVNAKADADRYAAEKKADAALYTRQRDAEAEAFERTKKADADKQAMLAEAQGIEARG RAEAAAIGAKLTAEAEGLEKKAVAMTKMNQAAVLEMYFRALPEVARAVAEPLANVDSITM YGEGNSAHMVGDITKSITQINAGLGDSLGLNLQQLFSALVGAKLVSPTIADAVEEGVRGA AEPQDPSPSA >gi|319978342|gb|AEUH01000125.1| GENE 8 7905 - 8570 424 221 aa, chain - ## HITS:1 COG:no KEGG:SYO3AOP1_1421 NR:ns ## KEGG: SYO3AOP1_1421 # Name: not_defined # Def: hypothetical protein # Organism: Sulfurihydrogenibium_YO3AOP1 # Pathway: not_defined # 27 218 204 377 558 83 30.0 7e-15 MKRALAAIGAIAIVLVALGGCAAPRAKKPVLYLYPERVVSLGVGLSYDGVVSDSYPVAVG GVGADGRALASWSVTAGPDGVLRDGGGRAYPYLFWEGPTRADLSQGSGFVVARDDVVGFL EEKLALLGLDEREAADFITYWAPRMRVNEYTLVSFDSGAYRDAARYTFTADDGSVVEPDT FIRVFMTISAASASAVVPEQELVPAPARTGFTAVEWGGAEL >gi|319978342|gb|AEUH01000125.1| GENE 9 8727 - 9755 1820 342 aa, chain - ## HITS:1 COG:Cgl2712 KEGG:ns NR:ns ## COG: Cgl2712 COG0191 # Protein_GI_number: 19553962 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Corynebacterium glutamicum # 1 338 1 339 344 427 66.0 1e-119 MPIATPEVYAEMLNRAKEGKFAYPAINVTSSQTVTAALQGFAEAESDGIIQVSVGGAEYG SGSTIKNRVTGSLALAAYAYEVAKNYGITVALHTDHCAKQNIDSWVRPLLEIEAGQVAEG KLPTFQSHMWDGSAVPLDENLEIAKEMLALSVKANTILEIEIGVVGGEEDGVVGAINEKL YTTTEDALKTVEALGLGENGRYLTALTFGNVHGVYKPGHVKLRPEILGTIQEEVAAKVGW KNNRPFDLVMHGGSGSTADEIALAVANGVIKMNVDTDTQYAFTRPVVDHMMRHYDGVLKI DGEVGVKKQYDPRSWGKAAEAGMAARVVEACQRLGSVGTQMR >gi|319978342|gb|AEUH01000125.1| GENE 10 9859 - 10050 113 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTARVPQHLPPGASQNAPLGAAVACLLAVVVVLGGLLAWGLASGGRLAPAPATAVVETAS SSE >gi|319978342|gb|AEUH01000125.1| GENE 11 10213 - 10854 657 213 aa, chain - ## HITS:1 COG:Cgl2714 KEGG:ns NR:ns ## 
COG: Cgl2714 COG0566 # Protein_GI_number: 19553964 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Corynebacterium glutamicum # 17 213 2 200 206 184 52.0 2e-46 MNDRENEGQVAGPATRGVPPWPGGPGQWPEGARYDPRLLAEGDSRNVEDRYRYWTMEAIR ADVASRALPFEVAVENLGHDFNIGSIVRTANALGARRVHIVGRRRWNRRGAMVTDRYIEV GHMEGAGALAGHCRREGLALVGFDNVPGSVPLEGLALPASACLLFGEESGGLSEGALAAC AAVVAITQRGSTRSMNVGHAAAIAMWAWCSQHG >gi|319978342|gb|AEUH01000125.1| GENE 12 10889 - 11713 1294 274 aa, chain + ## HITS:1 COG:Cgl2203 KEGG:ns NR:ns ## COG: Cgl2203 COG0647 # Protein_GI_number: 19553453 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Corynebacterium glutamicum # 19 267 6 254 275 309 61.0 3e-84 MTASANHTKPLTDLPPVSAWLSDMDGVLVKEERALPGASQFLDALRSKGYPFLVLTNNSV FTNRDLSARLAHSGLDIPEDNIWTSANATAAFLQQQSPNSTAYVIGEAGLTTAIHSAGYV MTETDPEYVVLGEVRSYDFHALTRAIRLIEGGAKFIATNPDVSGPSDEGTLPACGSIAAM ITAATGKKPYFVGKPNPVMIRAGLNTIGAHSEHAAMVGDRMDTDIRAGVEAGLRTHLVLS GSTSVDEIENYPYRPFGIHEGIGELIELVGAAIG >gi|319978342|gb|AEUH01000125.1| GENE 13 11747 - 12316 818 189 aa, chain + ## HITS:1 COG:MT0395 KEGG:ns NR:ns ## COG: MT0395 COG0461 # Protein_GI_number: 15839766 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Mycobacterium tuberculosis CDC1551 # 7 188 6 178 179 176 62.0 3e-44 MTNDSHKARLAALVRGLSVVHEKVTLASGRESDFYVDMRRATLHHEAAPLIGHVMLDLLE ESGFSVGEDYGAVGGLTMGADPVAAAMLHAAASRGLDLDAFVVRKAAKDHGMRRRIEGPD VTGRRVVVLEDTSTTGGSPLEAALALREAGAEVVAVAVVVDRATGAAERIRAEGLPYFYA LGLGDIGLA >gi|319978342|gb|AEUH01000125.1| GENE 14 12595 - 12996 166 133 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFAYSRPCVRHRGGKGAKEGRGAEPRGKERLGFQGTAPAPRPHTFGYTPLGAPRWAARNA PEPRRAKVRGGAPAPASAIGTGGRRANGRRRTCQRQHSHLGGEGGGGAPRVVQNASGFRA FPGDSRVAVPKCV >gi|319978342|gb|AEUH01000125.1| GENE 15 13410 - 14273 809 287 aa, chain - ## HITS:1 COG:PA0658 KEGG:ns NR:ns ## COG: PA0658 COG0300 # Protein_GI_number: 
15595855 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Pseudomonas aeruginosa # 4 279 5 250 266 124 34.0 2e-28 MGTALVTGATSGLGEEFCWQLAAGGHDLVLVARREQVLEALADKLRNVAGVGAQVIAADL STPEGVAAVSRRLDVGGQDGPCGTDGDPGGARGTEGACRGTRGSAGGTGEGPERPVDLLV NNAGFGLGRAFVDNSVANEERGLDVMVRAVMVLSHHAARSMRSRGRGAVLNIGSVASRTG GGTYSAHKAWVVAFTEGLSEELAGTGVSATVVCPGPVATSFFANAGIDLGGRLAVATPER VVADALQAVRAGRVQTTPTLTYKAAMAAMKVAPRALVLRAMRFLPHM >gi|319978342|gb|AEUH01000125.1| GENE 16 14313 - 17138 3060 941 aa, chain - ## HITS:1 COG:no KEGG:RMDY18_19280 NR:ns ## KEGG: RMDY18_19280 # Name: not_defined # Def: hypothetical protein # Organism: R.mucilaginosa # Pathway: not_defined # 25 823 69 912 1014 252 31.0 5e-65 MTRTSPAGAIALVLALAALPLAAPPASAGPDQDKDWIVSGQHVDAPVPVWHADTKSFSLN TITTPMERTALWLPKAWTGDAASGEAKSQLVIPKGRADLAFLGGEGTVLNAAPQNPGPGN TPIWAGLGADASAEWAAPDEFDGGTYTLDLVNVDGPGRMEMFIDNGDSVYRFLSSHDTAY RSVYNPRHTHLYTTFTQPGRYTANFMVTARAADGTALYTSPITPLVWQVGGEDPKEGRIK DVRSAYSAARAERSDGSDARPSLTLAPYAGREHPGDEHLTEMSFSTGDPSDTGRVWLTVN GYFLTELPVEGGAATARELLGNADASVQAVYIPDEAGAAARWISAPQAYSQRDQQAVTVT DGAAQIAPEDNPDPSPVWNPASIDVVDGAVSVRYTRAEGAKAHTVAIRARDPQLRAAYKI EFFEEKGDFTPWCAVEGTLGAGGADVKEQDLGVCKDQPMYMRVTLRPHPRSNATLTSAGV EDLTVSDEFGTEVSLKLRGAPEPAPSPSPQPTAAPDPAPTAAPSPSPSPSPSPSPQPTPA PDPAPTAPPTAPPVPAPDPAPSDLLTVPVTLSRGHLDLRVTQGADGNGMPSYAMAVKDDT RTAARASVLRSLPSVTLAVHPNAYYVRPQSLSDPGYDVLGAVGAGSYVLPQTQNSDIVWP GFSTEGVDYTGLPDGVDIGVRLLDGPAGAYAAFFQSGSLGGKPTVHFDSRDPSKSAIHTT SSTHMHGNWVFSAPGTYHIAVGASSGGRVLAAPQALTVTVEGTRRPGGATPSPAPPLPAP TVPGAPPSQDDRARPGTPPSQDDRAPRAQEPSGGQGSGAGGTAPVKSGIKPASSTTRRVS RAASAPAAARRTAATGAAGATGAQSGAQSGAQSGVDGPQSGVDAAHASSGSGSNPGGPAA AAAAVGHGGGGSWSPYWLLLLMIPAALAGGGAAHLIGRARR >gi|319978342|gb|AEUH01000125.1| GENE 17 17132 - 18058 1278 308 aa, chain + ## HITS:1 COG:MT0442 KEGG:ns NR:ns ## COG: MT0442 COG0708 # Protein_GI_number: 15839814 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Mycobacterium 
tuberculosis CDC1551 # 35 307 24 288 291 270 54.0 3e-72 MSWDAFSNHIVNENHYLYRGPRAQTARAQPYPGAMRIATWNVNSIRARAARATAFLERSG TDVLAMQETKCTPAQFPAEAFEEAGYHVAVHGLNQWNGVAVASRHPILRVHDHFPGQPAF GDPAVVEARALGVVIDVSGALEGAPGSPAPAELTVWSLYVPNGRELDHPHYAYKLEWLAA LRAQARGWLDADPAALVALVGDWNVAPQDQDVWDMAAFEGATHVSAPEREAFAAFAEDGW IEATRARTTNYTYWDYQRLRFPKNEGMRIDFAYASPALEARITSARIDRDERKGKGASDH VPVILDVR >gi|319978342|gb|AEUH01000125.1| GENE 18 18160 - 19590 1847 476 aa, chain + ## HITS:1 COG:sll0670 KEGG:ns NR:ns ## COG: sll0670 COG1376 # Protein_GI_number: 16331947 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 354 476 54 169 169 91 36.0 4e-18 MGRPSKKTLWIAGAATALVLVTGVGAAAYAAHYADIGLPRASVAGVSVSGLTHDQIAEAV DARASQAVLTATVSGTTTELQLADLGVTVDAQASADQALAANASIMARLTAPFHTTQTPL VYSLDEDALARTGSDLALAAGPAAVESGVSPDADQGAFVATRGHAGKGIDTAALKEAATA LAETLAPSSAEFTATDVEPTIAYADAVSAADAANRLISPEVSVTDGIDTYTAEVGDKLKW VSVTARNGALQTPTIDRAGVSAWVEATAESTNVGAKEAIDNVDPQGNVLVRSFPGSDGLE VDNADEVTSGIVSALESSSDYEGDFDYKKTTAPTKTMPAMPGYESYAYPAHEGEKWVDIN LASSTLTAYEGQTVVHGPVDINHGGVGHETVVGTFHVYQKHAAQDMGCTPEWPYCARGVP WVSYFEGSYAMHGAPWVARFGIGSDASSHGCINMPVEEAQWMYDWDELGTAVVTHY >gi|319978342|gb|AEUH01000125.1| GENE 19 19607 - 20419 859 270 aa, chain - ## HITS:1 COG:Cgl2206 KEGG:ns NR:ns ## COG: Cgl2206 COG1680 # Protein_GI_number: 19553456 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Corynebacterium glutamicum # 3 256 10 263 269 213 45.0 4e-55 MDFPFDVALVLRVGGDVVASWGDVSRARPLASVTKPIVAWSALVAVSQGALALDSPGGPE GATLRHLLAHASGLPLDSREPVAAPGKRRIYSNAGFDVIGGMVEEATGMPLRRWVARSVF GPLGMASADVPGSPAHSGVCGAADLSLFAAELTRPSLVPADLGAEAAAVQFPALAGVVPG YGRHSPCQWGLGVEIRGEKSPHWTAPTASPATFGHFGQSGSFVWVDRARGASAVFLGDRP FGQWHKDNWAALNERLLSMADGRAGGPAAR >gi|319978342|gb|AEUH01000125.1| GENE 20 20428 - 21243 1072 271 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10538 NR:ns ## KEGG: HMPREF0573_10538 # Name: not_defined # Def: TetR family transcriptional 
regulator # Organism: M.curtisii # Pathway: not_defined # 18 221 2 197 215 111 35.0 3e-23 MASRPRRNPKMAGDDMTAQRKPRGAYSVGQATRESILSAAMVLIAERGYNGFSLRDLGRR VGISHPAVVYHFPSKEAILRAAIQRHEDMNALFRVSINEDVEGGFEEGGITAESFVDWAV GEMRFAMMPEADNAIALDCVLWAEASSESHPAHVHYKYRTAQMEEALTSMIIAFIEESRV DIGTKPRTLAKILIRYWYGSVVSARYADEPIDAREFVSDFLAVCVQLLHLPAHYVLRLGA SVPEEVAEVYARILRKISETNEMSEAPEATA >gi|319978342|gb|AEUH01000125.1| GENE 21 21274 - 23886 1750 870 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 858 2 804 815 678 45 0.0 MSTEFTTKAQEAIASALQSAAAAGNPQVEPTHLLAALLEQPEGIAASLLEKTGADRARIG ARTRNALVALPRTQGAAAQPRTSRALMAVVADAGDRAAKAGDQYVSTEHLLIATAASDSE AGRILTEGGVDAGDLAAALAQLRPEPLTSPDPEGSFEALAKYGRDLTEVAREGKLDPVIG RDNEIRRVVQVLSRRTKNNPVLIGEPGVGKTAVVEGLAQRIVAGDVPESLRGKRLIALDM ASMVAGAKYRGEFEERLKAVLAEIQRSDGEVITFIDELHTVVGAGGGSEGAMDAGNMLKP MLARGELRMVGATTLDEYRENIEKDPALERRFQQVFVGEPSVEDTVAILRGIAPKYEAHH KVTISDGALVAAAQLSNRYITGRQLPDKAIDLIDEAASRLRMELDSSPVEIDELRRSVDR MRMEESYLTESDPDGKDEATGERLAKLRADLADKQEELNALVSRWEAEKAGHNRVGDLRV QLDNLRTQLDLAVREGRWDEAGRLQNGEIPDVERQIQAEEERAALAEASAAAQEPMIAEK VGPAEIAEVVEAWTGIPTGRLLQTETEKLLHMEAEIGKRLIGQKDAVKAVSDAVRRSRAG ISDPNRPTGSFLFLGPTGVGKTELAKSLAEFLFDDERAMIRIDMSEYSEKHAVARLVGAP PGYVGYEQGGQLTEAVRRRPYSVVLLDEVEKAHPEIFDILLQVLDDGRLTDGQGRTVDFR NTILILTSNLGSQFLADPGMDADRSREAVMEAVRAAFRPEFLNRLDEIVTFDALSNEDIG QIVDLMVATMARRLASRRISLTVSEPAKGWITRVGYDPAYGARPLRRLIQREVGDRLATL LLAGGVVDGDTVTVDVNDDLDGLTMNVVEP Prediction of potential genes in microbial genomes Time: Thu May 12 17:59:34 2011 Seq name: gi|319978335|gb|AEUH01000126.1| Actinomyces sp. oral taxon 178 str. F0338 contig00126, whole genome shotgun sequence Length of sequence - 6702 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 624 888 ## Cfla_1554 hypothetical protein 2 1 Op 2 . 
- CDS 636 - 1100 535 ## Cfla_1554 hypothetical protein - Prom 1146 - 1205 1.6 - Term 1145 - 1200 -0.8 3 2 Tu 1 . - CDS 1368 - 2849 1585 ## COG2311 Predicted membrane protein 4 3 Tu 1 . - CDS 2943 - 3656 624 ## gi|154507944|ref|ZP_02043586.1| hypothetical protein ACTODO_00430 5 4 Tu 1 . + CDS 3830 - 6694 3822 ## COG0474 Cation transport ATPase Predicted protein(s) >gi|319978335|gb|AEUH01000126.1| GENE 1 3 - 624 888 207 aa, chain - ## HITS:1 COG:no KEGG:Cfla_1554 NR:ns ## KEGG: Cfla_1554 # Name: not_defined # Def: hypothetical protein # Organism: C.flavigena # Pathway: not_defined # 58 202 29 169 173 103 43.0 6e-21 MGWFWRKDADGGSDPNDSERPDGAQGPIDPWGYEPSASGGGVRTGNSEFEAHRFDMVRPI SQERVGLLFDGEGWDWEIDQDGDLRGMWEGQLFCFRFLGDSQEVLSIIAFMSETIPAKYE ADLLLFLEGWHREYLWPKAYFHREAKGGLRLVAEVNSDYEYGATDAQLVQQCMCALATTL QLFRAAGECFGPPGGGSAPPGDGGGGR >gi|319978335|gb|AEUH01000126.1| GENE 2 636 - 1100 535 154 aa, chain - ## HITS:1 COG:no KEGG:Cfla_1554 NR:ns ## KEGG: Cfla_1554 # Name: not_defined # Def: hypothetical protein # Organism: C.flavigena # Pathway: not_defined # 18 146 29 156 173 82 34.0 4e-15 MRQEATSARPAASADRVRPVTYDRLERLLDSQGLAWSVDDDGELTGTWGGSPYFLILAGR DRQVLQIVAIWKEPVPISALEAVRQSILQWHRTRPWPKCSHRIDDDGTVRVVAETVVGWQ CGATDAHLVRQIANAIALAQDFFSELARDIDIGS >gi|319978335|gb|AEUH01000126.1| GENE 3 1368 - 2849 1585 493 aa, chain - ## HITS:1 COG:Cgl1866 KEGG:ns NR:ns ## COG: Cgl1866 COG2311 # Protein_GI_number: 19553116 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Corynebacterium glutamicum # 9 422 16 424 445 88 26.0 3e-17 MSSLSFTGGRSVRYPAPDVARGFMLLLIALANVPYWLRYFPASPDGPSSADRWWILIRTA LVDRRGYPLFALLFGFGVATMVRRRIERDVEAAHQSVDPQVRASWAPHVAAQWEALVRQE ATDDAARLVRRRGWWMILFGFVHGIVFAGDIIGAYGIVAVLFAGVVARKRNVWMAVWGSV IALVSACSLTGVGFWKAGLGDLGSVVHPHASLSVYYVPNSIVQWAMAALITVLISMVVPA FMIGARLGQTDILSRPDRHRGLLWAVAGAGGLIGVVGALPYGLGASGMALPVPVWSVVLF HVSGIAGACAWLALFALFAGVGEPRGIRRVLAAVGKRSMTAYLSQTILFVVVLGGMGLAG VRAVPDAWGAVIAAFVWVAALVLCFGLETVGFPRGPFEVLLRRAVARSAGPRAGVPVPPP 
GAGVVEAPPGLAFPPPGARMAPPQGPQPPAAGTQGGPQSPEGAQSGPPGSSEGEGPAGPG AERGVAGGGTGGE >gi|319978335|gb|AEUH01000126.1| GENE 4 2943 - 3656 624 237 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154507944|ref|ZP_02043586.1| ## NR: gi|154507944|ref|ZP_02043586.1| hypothetical protein ACTODO_00430 [Actinomyces odontolyticus ATCC 17982] # 19 234 21 246 247 125 37.0 3e-27 MSTPTPDDGNRAPFAGQQPDQPATPWPQYGQQAAPEQSGSAYGQQAVPAAYAAPPQLPSR APGVLCIVLGLVLMIIIAPVVFFSTAVSSITDEFSNAAAGVQRLSNGGGVTLESQDGYFL QVVPPDTATACSLTDSGGQSYAMQKQDSSGSLFSAEGLRPGSYTVECAGASDTATFIGLP FSGDILQNSARSSLIWSTVVGVTGVGVLILGIVLVVRANKRRRDIQVDIMMSGVQRQ >gi|319978335|gb|AEUH01000126.1| GENE 5 3830 - 6694 3822 954 aa, chain + ## HITS:1 COG:sll0672 KEGG:ns NR:ns ## COG: sll0672 COG0474 # Protein_GI_number: 16331945 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Synechocystis # 20 950 30 939 945 604 41.0 1e-172 MPTPHAPCADAPTSPYSHPADAVAADLGTDASTGLSARSAADRLAADGPNELPSKPPVPA WRRFLGQFNDPLVYLLLAAITISTTAWILEGAEGAPVDSIVILAVVTLNAVLGFVQENKA ADAVAALSAMTEATSTVLRDGSLLVVPSSELVVGDVLVLGEGDQVGADARLLSAAALRVV ESSLTGEADAVVKTPGAVGADTDLADRTCMVHRGTSVAQGTGRAVVVATGADTQMGAIAR MLDSVEEQDTPLQKEIHAVSKMLGIVVVVIAVVVVGTLLLLAADRTPDTIVHSLLLGVSL AVAAVPEGLPAILSVVLALGVQRMATRKAVVKKLTSVETLGSASVICSDKTGTLTRSEMT VQEVVTASGTCVVTGIGYAPDGEVAPDADRDGRPDADPLEGPLRDEVTVVLSGGALASDA ELTVTEDVWSVVGDPTEGAFLVAEKKLGTHGHREGRFGRIGEVPFTSERKMMSVLLTDTA HGTVIMSKGAPDVLMEHCARVRVGDADTELTAQARARFVEHIADMSGRALRTLGVAYRVL TDEEAARARRAADSNGGADLSGLERDLVLAGVVGIIDPPRPEAAAAVAEAHRAGVRVLMI TGDHPATAGRIAADLGIVERGAPVLTGRDLEGMDDGALSGAVAATSVYARVAPEHKMRIV HALKSQGHTVSMTGDGVNDAPALRAADIGVAMGVTGTQVTKEAATMVLADDNFATIVDAV REGRRIFDNIKKFLRFLLSSNMGEVLTVFGGVVLSGAIGLSGHSESGVVLPLLATQILWI NLVTDSGPALAMGVDPSVEDVMARPPRKPTDRVVDGAMWSGVLLVGAVMAASTLATLDIF LPGGLIETPLSTDGLGTARTAAFSTLVMAQLFNTLNSRSETVSAFSHLFVNKWLWGAIGL TLVLQVAVVEAPFLQAAFSTTSLDPVHWAVVVVMASLVLWVDELRKLVARARAR Prediction of potential genes in microbial genomes Time: Thu May 12 17:59:51 2011 Seq name: 
gi|319978332|gb|AEUH01000127.1| Actinomyces sp. oral taxon 178 str. F0338 contig00127, whole genome shotgun sequence Length of sequence - 1191 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 - CDS 2 - 716 915 ## COG3442 Predicted glutamine amidotransferase 2 1 Op 2 . - CDS 713 - 1189 526 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase Predicted protein(s) >gi|319978332|gb|AEUH01000127.1| GENE 1 2 - 716 915 238 aa, chain - ## HITS:1 COG:ML2327 KEGG:ns NR:ns ## COG: ML2327 COG3442 # Protein_GI_number: 15828251 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferase # Organism: Mycobacterium leprae # 6 238 1 225 230 229 56.0 2e-60 MSDSALTIGVLMPEVLGTYGDTGNAMVLRERARRRGIDASIITVGLTDAIPRSLDVYTLG GGEDTAQALAAEKFRADPGLSGALEDGRQLLAVCASLQVLGMWYRDARDQRVEGMGVLDI TTDPQGHRAIGELVTVPLVDGLTEPLTGFENHGGGTVLGPEARPLGRVVAGVGNGTPLGH EAAAEAFDGVVQGSIIATYMHGPVLARNPQLADLLLSRAVGGPLEPLEVPGTAALRAE >gi|319978332|gb|AEUH01000127.1| GENE 2 713 - 1189 526 158 aa, chain - ## HITS:1 COG:Cgl0247 KEGG:ns NR:ns ## COG: Cgl0247 COG0769 # Protein_GI_number: 19551497 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Corynebacterium glutamicum # 1 158 269 423 423 153 54.0 2e-37 AVAAAVAMGVDAEAAARAVSGVDEVAGRYSTHDVDGRLARLMLAKNPAGWQEAMTMIDPR VDQVVIAVNGQVPDGQDLSWLWDVDFAALDAQGRRVVACGERGADLAVRLEYAGIHCQLA PLPMDALALCRPGKVEMLLNYTAMRDFKTVLGEKGARR Prediction of potential genes in microbial genomes Time: Thu May 12 17:59:52 2011 Seq name: gi|319978330|gb|AEUH01000128.1| Actinomyces sp. oral taxon 178 str. F0338 contig00128, whole genome shotgun sequence Length of sequence - 932 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 892 982 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase Predicted protein(s) >gi|319978330|gb|AEUH01000128.1| GENE 1 1 - 892 982 297 aa, chain - ## HITS:1 COG:MT3815 KEGG:ns NR:ns ## COG: MT3815 COG0769 # Protein_GI_number: 15843333 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Mycobacterium tuberculosis CDC1551 # 30 294 19 271 413 223 54.0 3e-58 MRYRVHMTQTPRFPLRSRVALAAGAGARTASRLLGRGSGGMIGGEVALRLSPTVLADLAA RIPSAVITGTNGKSTTTRMVRAALQTRGSVASNTNGDNMTSGVITAFMNARTASAAALEV DEMHVPSVADAVRPRVFVYLNLSRDQLDRVGEIGAVERRLREGAAAHPGAVVVANCDDPL IVSAACSNDNVVWVAAGGGWGGDSAAYPRGGRVVREEGSWHLIPSRPGEELPALASRPEP DWWLTDVDLRPEGPRATLNGPNGTAVPLALALPGRANLGNAAQAVAAAVAMGVDAEA Prediction of potential genes in microbial genomes Time: Thu May 12 17:59:53 2011 Seq name: gi|319978326|gb|AEUH01000129.1| Actinomyces sp. oral taxon 178 str. F0338 contig00129, whole genome shotgun sequence Length of sequence - 2042 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 2 - 347 361 ## COG0789 Predicted transcriptional regulators 2 1 Op 2 5/0.000 - CDS 351 - 1379 1338 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 3 1 Op 3 . 
- CDS 1428 - 2042 706 ## COG0576 Molecular chaperone GrpE (heat shock protein) Predicted protein(s) >gi|319978326|gb|AEUH01000129.1| GENE 1 2 - 347 361 115 aa, chain - ## HITS:1 COG:MT0368 KEGG:ns NR:ns ## COG: MT0368 COG0789 # Protein_GI_number: 15839739 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Mycobacterium tuberculosis CDC1551 # 3 115 2 114 126 101 54.0 3e-22 MAGRSGQGGESVTFVISVAAELAGMHPQTLRQYDRLGLVTPARTKGRGRRYTKQDIQRLR DVQRMSQDEGINLEGIRRILELERRVEALEAERARMRTQLAEAEMRRNRVFVASP >gi|319978326|gb|AEUH01000129.1| GENE 2 351 - 1379 1338 342 aa, chain - ## HITS:1 COG:mll4755 KEGG:ns NR:ns ## COG: mll4755 COG0484 # Protein_GI_number: 13473985 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Mesorhizobium loti # 7 322 1 345 376 171 35.0 1e-42 MGEQDWLTKDFYKVLGVDKGADKKTITKAYRKLARAYHPDQNPGDAAAEAKFKEIGEAYA ILSNDEDRKRYDAIRAMGGGGARFAAGSSGGFEDLFGAFNARATGYGGGGARFQTGGADL DDLLSGLFGGGGAQGSGAQFGAGAFGAHGGQGRGPQSRPQPTKGGNRKANLAISFRQAIE GATLSIKVAGKTIKVRIPAGIKDGQKIRLAGHGKPGAHGGPAGDLEVSVAVKPHPVFSRE GDDVVVTVPVSFAEAALGAKADVPMLDGSTRSVRIPAGSDADTAIRLKGKGCALKKGPGD LLVRIRIDVPTDLTREQKRAVEGLAEALGPVSRESWLEAAKE >gi|319978326|gb|AEUH01000129.1| GENE 3 1428 - 2042 706 204 aa, chain - ## HITS:1 COG:ML2495 KEGG:ns NR:ns ## COG: ML2495 COG0576 # Protein_GI_number: 15828349 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Mycobacterium leprae # 8 194 16 182 229 82 35.0 4e-16 APASDAGRGTDGETAPEGSAPASDPEGGSTASGSGEEDAVDDMSAFEEQLGDSELANALE RVAAVEEQLTRANADLYNLNQEYAGYVRRSKEAAPAHRTAGQDEVIESLIGVLDDISAAR DHGDLDGGPFAAIATKLEDTLRNRFALERYGEAGEDFDPALHEALMATTDAGVEHPVIGK VLQPGYRRGERVIRATKVLVNNPE Prediction of potential genes in microbial genomes Time: Thu May 12 18:00:02 2011 Seq name: gi|319978305|gb|AEUH01000130.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00130, whole genome shotgun sequence Length of sequence - 24070 bp Number of predicted genes - 22, with homology - 17 Number of transcription units - 13, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 80 - 1918 2909 ## COG0443 Molecular chaperone 2 2 Tu 1 . - CDS 2228 - 4561 3078 ## DIP0278 putative surface-anchored membrane protein 3 3 Op 1 . + CDS 4796 - 4918 110 ## 4 3 Op 2 . + CDS 4990 - 7245 3296 ## COG3669 Alpha-L-fucosidase + Term 7260 - 7318 7.2 5 4 Tu 1 . + CDS 7495 - 8670 1371 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 6 5 Tu 1 . + CDS 9043 - 9426 481 ## COG0346 Lactoylglutathione lyase and related lyases 7 6 Tu 1 . + CDS 9545 - 10042 766 ## Arch_1317 GCN5-related N-acetyltransferase 8 7 Tu 1 . - CDS 10044 - 10889 1164 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 9 8 Tu 1 . + CDS 10990 - 12285 1678 ## COG1362 Aspartyl aminopeptidase 10 9 Op 1 . - CDS 12472 - 12903 567 ## COG1764 Predicted redox protein, regulator of disulfide bond formation 11 9 Op 2 . - CDS 12954 - 13484 580 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 12 9 Op 3 . - CDS 13493 - 15067 1952 ## COG2985 Predicted permease 13 10 Op 1 . + CDS 15178 - 15621 428 ## 14 10 Op 2 . + CDS 15618 - 16382 781 ## 15 11 Op 1 . - CDS 16626 - 19334 3586 ## CE0931 hypothetical protein 16 11 Op 2 1/0.000 - CDS 19334 - 20068 257 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 17 11 Op 3 . - CDS 20065 - 20661 666 ## COG1695 Predicted transcriptional regulators 18 11 Op 4 . - CDS 20751 - 20870 271 ## + Prom 20736 - 20795 2.2 19 12 Op 1 . + CDS 20833 - 21048 220 ## 20 12 Op 2 . + CDS 21072 - 21950 1090 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 21 12 Op 3 . + CDS 21994 - 23271 1732 ## COG1085 Galactose-1-phosphate uridylyltransferase 22 13 Tu 1 . 
- CDS 23314 - 24069 841 ## gi|154507977|ref|ZP_02043619.1| hypothetical protein ACTODO_00463 Predicted protein(s) >gi|319978305|gb|AEUH01000130.1| GENE 1 80 - 1918 2909 612 aa, chain - ## HITS:1 COG:ML2496 KEGG:ns NR:ns ## COG: ML2496 COG0443 # Protein_GI_number: 15828350 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Mycobacterium leprae # 1 596 1 597 620 737 68.0 0 MARAVGIDLGTTNSAIAVLEGGEPTIIPNAEGSRTTPSVVAFSKTGEVLVGEIAKRQAVT NVDRTISSVKRHMGTDWTVAIDGKEYTAQEISARILAKLKADAEAYLGEPVTEAVITVPA YFNDAQRQATKDAGQIAGLKVERIVNEPTAAALAYGLEKGKEDELILVFDLGGGTFDVSL LEIGKDDDGFSTIQVRATSGDNHLGGDDWDQRIVTWLIDQVKAKTGADLSKDPVALQRLK EAAEQAKKELSSATSTNISLQYLSMTADGPIHLDESLTRAKFEEMTSDLLDRTKRPFHDV IREAGISVGDIDHVVLVGGSTRMPAVSAVVKELAGGREPNKGVNPDEVVAVGAALQAGVI LGDRKDVLLIDVTPLSLGIETEGGFMTKLIDRNTAIPTKASEVFSTAQNGQPAVLIQVYQ GEREFARDNKLLGTFELSGIAPAPRGVPKIEVTFDIDANGIVHVSAKDRGTGKEQSVTIT GGSALPKEEIDRMVREAEEHAAEDKARREVAETRNTAEQVVYQTEKLLKDEADKISEETR AAVQADVDAVKEALKGDDVEAVKSSMAKLNETSLKVGQEIYQAQQAAGAAGGEAKSEDDV VDAEIVDEGESK >gi|319978305|gb|AEUH01000130.1| GENE 2 2228 - 4561 3078 777 aa, chain - ## HITS:1 COG:no KEGG:DIP0278 NR:ns ## KEGG: DIP0278 # Name: not_defined # Def: putative surface-anchored membrane protein # Organism: C.diphtheriae # Pathway: not_defined # 421 665 487 740 1080 77 29.0 3e-12 MTMFRTRWARAVVGASALALAATGMGALGVYSAAPAEAAANPDIEVSIKPLVLSDKDGVE IPGAGAQVEDPVRMRVEWDATNADPQPGDSFTVGLPATGAAQPDITHYYRFREVGRSDPL MVGDVQVGDCVTAADVMTCTFNDAIAAQTDVKGTMSQMLMAQKETPLTKLPFSANGVEAQ VPNPNDEPIVQRLWTEKNVGKYANSLKRDSTQINWHITGSGRRLSERSGMPAGTYANKAV FNDVLGADGQALLADDSTWYVRVDPKSAPNQYPTVARVGQPGINTAYGNYALERTISEDG KSATVTLTRTDGDFDPAINYEIVYSSKIDGPVVPEKQYTNSATLVGDTEGPITSAISYHD PITYTVEMKPGYGAFGVKKFVLGAQVGADAGQIPAGTAVTVNVAYELPTGWDPNIHPEWT APAGGANPYTMEVPVEQGVAGAAFPKGTKVTLTESLDSAQLPAALRWQGDPVFSAGSQTA VSELALSIVENTASAVSLTNTVAPAPGRISVTKQVVGGIPAGKTFSFDYVCNDPAGTSGS ILDVADGETKESGDIPAGAECLVTERDATVDGFILEPAVIDPVPIEAGQVAAVLMTNTYT 
PAPGNIAVTKKVVGDIPADKTFSFDYVCDDPAGTSGSILDVAAGDTKVSRDIPAGATCTV TERDASVDGFALQTSAIGPVVVEAGGVAVVEVTNTYTPAPGPAPSESPSPSPAPSESPSP SGAPSDGPGGPGAPGGPGAAEDSALARTGADGLVASLAALGLLVAGGGLIATRRARG >gi|319978305|gb|AEUH01000130.1| GENE 3 4796 - 4918 110 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLLVTVGALYHGLWEWRPSLRFRYTVTASKPEHPVNSTET >gi|319978305|gb|AEUH01000130.1| GENE 4 4990 - 7245 3296 751 aa, chain + ## HITS:1 COG:TM0306 KEGG:ns NR:ns ## COG: TM0306 COG3669 # Protein_GI_number: 15643075 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Thermotoga maritima # 28 372 20 347 449 117 28.0 9e-26 MADPTPVPAADPAYSIPAAIPASGVDKVASWQDMKFAMFIHWGVYSHYAGWYKGEKQEVG YPEQIKAWGHQQTWDRRIPLQGIPRDEYLATAQAFDAADFDATAWCQQAKDTGMRMLLIT SKHHDGFAMWDTATTDYNFTKQSPSHRDPLLELSQACKQVGIKFGLYFSNIDWEKQPENP WMNDNTLDEEGYMDYIHAQLKELLGGKYGDIAELWYDMGKPNPAQSDQLRAWAHELQPGI MINSRVGNDRADFEVGWDNELQTEQTQGPWESAVSIFPKCWGYCEWDDAAPKFKDTGYPD YSEEDWDHMVDVDNETRLRKDPDGVAKKTTETLTNLFNTAALGGQFLFNVGPKFDGSYNQ WDASVLRAIGEWNAAHPGVLTNSRPAYFPIEDWGKTMVDDSHIYMGVEKWEEGAAITLRG AGANTIESVTLDGTGQALPHEVKGNDLVVTLPARPDDHLPVIAVKTATAPVYVPVGLTAV EQGTTTIADARLEKFKAPTPKTGETQVIASLTSGDKRAEGVSLAFEATGLTDDYAAYKVT VGDQEVRGLTTADLAKGVGSFTLEAGRTYRATLSYDDPYYPMKGFGKDAKVSSVTVTAAT VKAPASIAVSPASVTAGGTVTITGKDFTPGGSVALTLHSDPVGIGTATAGSDGAFTAEAT VPAGTAGGDHQVVAVDSGTGESASAPLAVVVPVPDPSGQSSDQSGAVPSADGASPSGLAR TGSSALPVVLGAALAAGAGTLLTRRSRRARS >gi|319978305|gb|AEUH01000130.1| GENE 5 7495 - 8670 1371 391 aa, chain + ## HITS:1 COG:ybjF KEGG:ns NR:ns ## COG: ybjF COG2265 # Protein_GI_number: 16128827 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Escherichia coli K12 # 1 384 1 372 375 281 43.0 1e-75 MRCEHWESGACHSCALIEAPYPDQLADKARSVRAALASVPGAPGIRWLSPVASEPRGFRT SAKLVVGGTARRPSLGALGPDRRGVDLPGCPIQHPAVNRASVGLKRFIRALSLTPYDVPT RRGELKNILITVGADERLMVRFVLRSRDRVADIRRALPHLRQLVPSVHVVTANIHPAHEA 
RVEGPEEIVLTRRRTLPLDAGGLRLELGPRSFTQTNAAVAGRLYRQAAQWASLPLPGGAA PSSLWDLYCGVGGFALHASHAGVPAVTGVEVSEAAIASAISRARALGLTRDQARFVADDA TAWARAQSASGAPDVVVVNPPRRGIGLELASWLDGCTAPRVIYSSCNPTTLAKDLAAMPS LRTTHGRVFDMFPHTAHVEAAVLLERTGGSR >gi|319978305|gb|AEUH01000130.1| GENE 6 9043 - 9426 481 127 aa, chain + ## HITS:1 COG:FN1050 KEGG:ns NR:ns ## COG: FN1050 COG0346 # Protein_GI_number: 19704385 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Fusobacterium nucleatum # 1 127 1 127 127 191 64.0 3e-49 MRIEHVAMYVNDLEAARDFFVRYLGGRSNDGYHNTTTGFRSYFISFDDGPRLEVMTRPRM ADDRKDPNRTGYAHIAFSVGSRKAVDELTAALKAAGYTVVSGPRTTGDGYYESCVVAVEG NQVEITV >gi|319978305|gb|AEUH01000130.1| GENE 7 9545 - 10042 766 165 aa, chain + ## HITS:1 COG:no KEGG:Arch_1317 NR:ns ## KEGG: Arch_1317 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: A.haemolyticum # Pathway: not_defined # 15 160 1 146 148 172 63.0 4e-42 MGPPDKMDATVRRTMSVELLTTPTPELVEAMGRLLPQLSRSASPLGADDVERLLSQGGVH LFVFRPDSADASGERPVLGMLSLAVFEIPTGVRAWIEDVVVDSGARGHGAGLALVEAALE HAKGIGARTVDLTSRPSREAANRLYRRAGFVQRETNVYRVALDGQ >gi|319978305|gb|AEUH01000130.1| GENE 8 10044 - 10889 1164 281 aa, chain - ## HITS:1 COG:Cgl0512 KEGG:ns NR:ns ## COG: Cgl0512 COG0656 # Protein_GI_number: 19551762 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Corynebacterium glutamicum # 17 281 5 269 269 291 51.0 8e-79 MNELTVPSYVLNDGSDLPAVGFGTYGLRGSGGVDAMVSAVRTGYRLLDSAFNYENEGILG EAARRCGVPREELRLVSKLPGRCQHYDRAVATVEESLLRAGLDYWDMYLIHWPNPGRGLY VEAFGALVDLRGRGLIRSVGVCNFLPEHIDRLRDEVGALPSVNQIELHPYFPQADALAYH ASVGVRTQSWSPLARKLKAVEDPAITALADEAGVSPARLILRWHYQLGTIPLPKSATPSR QAANLDVFGFSLTEEQMGRISALGRPDGRQADQDPARYEEF >gi|319978305|gb|AEUH01000130.1| GENE 9 10990 - 12285 1678 431 aa, chain + ## HITS:1 COG:Cgl1463 KEGG:ns NR:ns ## COG: Cgl1463 COG1362 # Protein_GI_number: 19552713 # Func_class: E Amino acid transport and metabolism # Function: 
Aspartyl aminopeptidase # Organism: Corynebacterium glutamicum # 15 427 6 417 420 414 52.0 1e-115 MATLELDDVHANAVDYQHFLLDSPSPYHAADLVAQRLVDAGFTLQDEREAWDASPGGHVM VRGGAVAAWMVPPHVGDSAGFRVVGAHTDSPALSVKPSVQSTTPDGWGMVDVEIYGGMMW NSWLDRELTIAGRLITTSGRAVLARTGPIARIPQLAIHLDRSVNDSLSLDPQRHLHPVWT VDSSAGLLDHIARASGLSGADEVASFDLILTPSQGPSFFGEDGQFVAASRQDNLSSVHPA LVAMERLAGAAPEAGDVVVLACFDHEEVGSQTRTGAGGPVLETVLRRTAQALGRTPDGHE RMLASSSCVSADAAHSVHPNYADRHDPRTRPVMGRGPVLKINSKQRYATDAEGVALWARA CAAAGVASQDFVSNNAMPCGTTIGPITAARLGIPTVDVGVPLLSMHSAREMSHIGDLHAL SRALEAYWCGA >gi|319978305|gb|AEUH01000130.1| GENE 10 12472 - 12903 567 143 aa, chain - ## HITS:1 COG:mll4282 KEGG:ns NR:ns ## COG: mll4282 COG1764 # Protein_GI_number: 13473621 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Mesorhizobium loti # 9 142 4 137 137 111 45.0 5e-25 MNTPTRIAYSVTAQSTGGRAGRATTADPELALDLRAPVELGGPGGGSNPEQLFAMGYGAC FQGAMGLVAKDFGLDISESVVRTTIGIGPEGSSFALTAVIEVLVPGADTARVQALADKAH ELCPYSKATRGNVPVEVKAVSAL >gi|319978305|gb|AEUH01000130.1| GENE 11 12954 - 13484 580 176 aa, chain - ## HITS:1 COG:Cgl2168 KEGG:ns NR:ns ## COG: Cgl2168 COG0494 # Protein_GI_number: 19553418 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Corynebacterium glutamicum # 1 164 1 158 178 115 44.0 4e-26 MATPDFVLELRRHVGHAPLWMPGTTAVVLRRAEGSPAPGWSSARVDPREVEVLCVRRADN GAWTPVTGIVDPGEEPAVAAAREVLEETTVVARPTRLLSVEVVGPVTYVNGDVTTYLDVA FACEWAEGRPEPADGENTEARFVRADRLPPMNTRFTRTIARALSGEPSAVFLTEPV >gi|319978305|gb|AEUH01000130.1| GENE 12 13493 - 15067 1952 524 aa, chain - ## HITS:1 COG:Cgl2162 KEGG:ns NR:ns ## COG: Cgl2162 COG2985 # Protein_GI_number: 19553412 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Corynebacterium glutamicum # 2 515 1 529 539 222 33.0 2e-57 
MIALLQTHPLLALFLVVGLGAALGAIRFGPLRFGAAGALFVGLLLSASVPETAGEDMRIV QQIGLAFFVYTVGISAGATFFQDLRKQTGLLASCAAVCLAGAVLALAGGRVLGLGEGLTT GLFTGALTAAPALDAATRVTGDPQAAVGYAFGYPIGVVVGILVVTMTVTRRWAGERDTPS LAGGNLETVTVLVADTINMRQIAAWREQRIRMSYLRRGERTRVVAPGEDLLAGDHVLMVG DAASVEEAARDVGEILGHRLEDDRSDVAFERILVSNPDVAGRPVSQLNVAKRFGAAITRV RRGDLDLLARDDLDLQLGDHLAVVVPIEELDAISDWLGDSERRVAEVDAMAFGIGMVLGM LLGVVSFPMPGGSSFQLGSAAGPLIVGMVLGALRRTGPLVWTLPAAANLTIRQIGLMLFL AALGLNAGPQMAALLQGGEGGRAALLALVIVAVCCAAQALAGRFLGLSSARAAGGVAGFL GQPAVLQAADARVVDERVEAAYATLFAFSIIVKILLVPLITTFM >gi|319978305|gb|AEUH01000130.1| GENE 13 15178 - 15621 428 147 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTPVPGWYSDPAGSGRLRWWSGTEWTEQFQDAPATAAPSAEPLATGAVPQAVDSAAAPS AQQEPFGAATAADGAGRKDPAPSKGMLIACIAAWAVAVVLCVAMTISLGAFNKAGRTVEE SQQGVTDAQQAVDDANSLDLSGNGENE >gi|319978305|gb|AEUH01000130.1| GENE 14 15618 - 16382 781 254 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTNSTGSPYTPGAQDPQASPYGTDPRPVGAQPPAAPAPKKSRGLAIATIVGAALSVVLL VVTIVLVVGARNRNDEARNLDAQASQLEHQAVEIKAQKVCDEITRDNRSTAASLFRAYPD AHQDIASAMQRLCGDKYKTAKAVAGIPTTDKTVFDSSEFECVLDAGGSTVTITATLAVTQ NDKFDGIGSVDLWVKAGFVENKLPFAPFTHSEVVATPTTVSVGSSSTVTATVSYDPSYGQ YCGVQIDSWWPTDQ >gi|319978305|gb|AEUH01000130.1| GENE 15 16626 - 19334 3586 902 aa, chain - ## HITS:1 COG:no KEGG:CE0931 NR:ns ## KEGG: CE0931 # Name: not_defined # Def: hypothetical protein # Organism: C.efficiens # Pathway: not_defined # 8 893 1 844 852 122 25.0 5e-26 MARLRQRLSRWGVALRIATRDARQNGGRTSAAIILIALPIIVAIGAVAVWDVTTSQRYVA SEWLGRDTGVQAVATHYSSGPIHQDATNSRAVAADSAEASVDASTLSSWVPDQDSLTPVD SLYQLTLTSSGASTTVATATQTPRLDVPVLAELGRSGPLDAGKIVISQDLATQLGVGVGD SVELGVTVEQSRLTKKLTGTAVVDALVPGERRAVAGAGTLGVDPYAVAGKTFTSWYVTGP VPIGWDKVEELNASGFRVTSRAVLASPPPASQLSPDIAAYAQPSRSVSPWQYLLIVAGIL LVLTEMILLVSPLFTVTQRSVMRTAAMIVANGGDNTDGRRLTIAHGLTVGLYAAMFSVLG SVGVLLGIGHWSRLGIGIIPWWTLVLSLLLPLMLSIMASVSPAHTSAHINTSAVIGGRTW EPSRLVRRRLAYPVVLLAALPILGIAAWLGQIGLLILGIALLETGLIGSIPYLFMRWRSP NRRSSMSMRLALRDAVRNGHRTFPAMASILTTVFVACALLITLNSSNEAGWNATAHVGAR 
GRVFVKDADMTESVLRTRQVHTTARDAIASVNPVVSEVTLTGMAWNSAPGGPTRSVGAIS PIDGQLIPQITGEGRPIHEVDPVYIVDDGTYLRESGIVSGDEMVRAVTTLTAGGVLVPEP SLIDGTGQVRLSSLDLSEIAKARAAGQTSNLPFASVIQTVAFKATPFTDFNVVVLSPQAA SQLGLAARPLGQLLTVQNPIGPFSASAFTTQIAREAPGASVTVLVPSVRSVLLPYIAAFI AVVAAGATVALVVALSASDMRPDLDTLDAIGAGPSMRRHVTTWQGLVLALNCIPAAVLSG LIVGVLAVLAVARSQVFPELSTLTPVIPWSALIGMLVGMPALCGVVAIVLTPRGQKRLRR ID >gi|319978305|gb|AEUH01000130.1| GENE 16 19334 - 20068 257 244 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 1 239 1 234 309 103 31 1e-21 MSGGPNPVRMSGVGHDFGANDVHVRALDGINLEVEPGELLAVMGPSGSGKSTLLSIAGGL ERPNRGSVFINGIDLATLDGVRRAQVRRRQIGYVFQDYNLIPSLSALENVEMPLQLDGVP NAKARALAMEALDEVGVGDLADRFPERLSGGQRQRIAIARGLVGNRSVILADEPTGALDS NTGEQIMMVLRDHIDAGAAGILVTHEARIAAWADRTVFLRDGHIVDRSRTDSIRDLLRDP TARD >gi|319978305|gb|AEUH01000130.1| GENE 17 20065 - 20661 666 198 aa, chain - ## HITS:1 COG:Cgl0834 KEGG:ns NR:ns ## COG: Cgl0834 COG1695 # Protein_GI_number: 19552084 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Corynebacterium glutamicum # 1 166 1 166 169 106 39.0 2e-23 MSVRNALLALVAQQPAGVYRLKQMFEEQTCGAWPLNIGQVYQTMQRLERDGQVVSHAETN AGRDSEVFELTELGREVLDAWWATPVPRENPERDELVMKLAVSAADPTVDVEAMIQTQRR STLSALRDVTRLKASADEGELAWKLILERHIFDLEAELNWLDHIESGAVSEAARRAAFAA AKGRARTWANVPEPATAR >gi|319978305|gb|AEUH01000130.1| GENE 18 20751 - 20870 271 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDFAARVSVGGTESAIFSDFWRKLAFNHPVGVLLSIVK >gi|319978305|gb|AEUH01000130.1| GENE 19 20833 - 21048 220 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPPTETRAAKSDISVRHFYGADHFFEHPSATTPIPTGRQTRAPARNAHHMGNNRPSAPIR TQHHPPGPLCG >gi|319978305|gb|AEUH01000130.1| GENE 20 21072 - 21950 1090 292 aa, chain + ## HITS:1 COG:CAC1958 KEGG:ns NR:ns ## COG: CAC1958 COG0656 # Protein_GI_number: 15895230 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Clostridium 
acetobutylicum # 21 289 10 272 274 218 40.0 2e-56 MNSPTAPTPAPPQPPTPSLTLPTGRAIPQLGFGTYKIAPEDAYDAVRAALDTGYRHIDTA QMYGNEAEVGRAIEASGIPREELFVTSKLDNGNHRPEAAAASIARSLEDLRLASLDLFLV HWPLPTLYDGDVALPWPALEDAYRAGECLAIGLSNYETAHMRAVMDRAAVKPHAIQVESH PYMPNSTILRFAQRHRIVFEAWSPLARGRAARDPLLASIGARYGAEAAQVALRWAIQRGH VVFPKSVRPDRMAANFDVFGFELAPADMDAINALNEGEKGRTGSHPATMDRM >gi|319978305|gb|AEUH01000130.1| GENE 21 21994 - 23271 1732 425 aa, chain + ## HITS:1 COG:Cgl2030 KEGG:ns NR:ns ## COG: Cgl2030 COG1085 # Protein_GI_number: 19553280 # Func_class: C Energy production and conversion # Function: Galactose-1-phosphate uridylyltransferase # Organism: Corynebacterium glutamicum # 1 418 15 431 442 421 50.0 1e-117 MADGTVKQQNLLTGTEVWTVPGRGHRPLSLPPAVRTPIDHGADGHHCAFCSGRYLDTPPE KSRLVRDADGSWRELTGLSADALSETTAEFRRIPNLFEIVSYNYWHLNHGHMPSEAARRR MAEYLASSAGYDHIMSVVRARMLASGTTPEEFDAISETKKLQAANGFFSGGHDLIVARRH FVDGATDTSQVAGSGSLTPEEHYQYTAYTARTMEDLYHLDPAVRYVAAFQNWLKPAGASF DHLHKQLVAIDDLAVQTEAELDRLRSQPDIYDIIFKVGATRKLVIAQNEHAVAFAGFGHR FPGIAVWPLHTPRNPWEVGQEQMRAISDLLHAAHAATGASVPCNEEWYHRPPAVATPMRW RILLKWRISTLAGFEGGTRIYLNTIDPWRVVEKTVPRLLELREAGAIAPMAIGEECKVSP GMLDD >gi|319978305|gb|AEUH01000130.1| GENE 22 23314 - 24069 841 251 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154507977|ref|ZP_02043619.1| ## NR: gi|154507977|ref|ZP_02043619.1| hypothetical protein ACTODO_00463 [Actinomyces odontolyticus ATCC 17982] # 8 248 142 380 430 150 53.0 8e-35 VAVPQVRTAAGTEVAGAEGAVAYGRDDVCVVADRAWMSANMLAVPSSIQELASASYAGLL AVPDPASTSVGRAFVQAASSQVGQGLGAFATSLSPRVGASTGETLAQWSAADRVTASYWS SLAGAPASTPSAGLYPLVVAPASLASAALTNTGAESYGAPVAPTCIARTLYAAGVPSGGA LSDGAASLIAWLQGGAAQRALAEAGAAAPLDSGAGEGTRASWASQGSGREADETTADPQS VAAAVSAWGAR Prediction of potential genes in microbial genomes Time: Thu May 12 18:01:12 2011 Seq name: gi|319978300|gb|AEUH01000131.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00131, whole genome shotgun sequence Length of sequence - 2341 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 398 407 ## gi|293190477|ref|ZP_06608864.1| hypothetical protein HMPREF0970_01195 2 2 Tu 1 . + CDS 516 - 779 343 ## gi|293190478|ref|ZP_06608865.1| conserved hypothetical protein 3 3 Tu 1 . - CDS 870 - 1259 303 ## Ksed_06030 hypothetical protein 4 4 Tu 1 . + CDS 1301 - 2339 1190 ## gi|293190480|ref|ZP_06608867.1| cytochrome C oxidase subunit IV Predicted protein(s) >gi|319978300|gb|AEUH01000131.1| GENE 1 2 - 398 407 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190477|ref|ZP_06608864.1| ## NR: gi|293190477|ref|ZP_06608864.1| hypothetical protein HMPREF0970_01195 [Actinomyces odontolyticus F0309] # 21 131 13 126 430 83 55.0 4e-15 MHDTHPRGNMTNARATRVVLLAVSALALAACTPAVSANAPADPFVDPIVPQGGSAQSADQ GTGSASAPRKGSGAVGVALVGGAQLPEQVVSAFTKDTGFTVGTAAYDAVGDVPAQGTDVV LGLDGADALAAA >gi|319978300|gb|AEUH01000131.1| GENE 2 516 - 779 343 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293190478|ref|ZP_06608865.1| ## NR: gi|293190478|ref|ZP_06608865.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 7 82 8 83 92 65 65.0 1e-09 MEGLGWKLTNAAVLAGAAFVASTAFEYGWKAATGRPVPAEDDEGTALATLVAFAAASAAV AAVAQRYAYKGARKWIAPRLKGSPVTA >gi|319978300|gb|AEUH01000131.1| GENE 3 870 - 1259 303 129 aa, chain - ## HITS:1 COG:no KEGG:Ksed_06030 NR:ns ## KEGG: Ksed_06030 # Name: not_defined # Def: hypothetical protein # Organism: K.sedentarius # Pathway: not_defined # 3 127 4 138 138 123 61.0 3e-27 MARRRIDQRDGMAALGEWAAQGEAAPRRQVATAVRFTLEEVGALHPGRAVEVRVPPAGAV QILAGTTHRRGTPPAVIETDAHTWLALATGRLRWDEAVGSGRVQASGERADLSAYLPVID LPAVRAATR >gi|319978300|gb|AEUH01000131.1| GENE 4 1301 - 2339 1190 346 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293190480|ref|ZP_06608867.1| ## NR: gi|293190480|ref|ZP_06608867.1| cytochrome C oxidase subunit IV 
[Actinomyces odontolyticus F0309] # 26 322 5 301 302 413 68.0 1e-114 MPRRPGLGHNGLAPSADERLHVRVNSNLITLRSILDLDIARRFEWDAALVVELSGVDRPG DLSPRVAEDPGVLYDIASQGIAPGSAIERALSHELHDAIQRRCRLWMAGIPTGHLDRLRS ELGEGIVHVAGAPDEGLTPVALAPLELLKAWSQGSDRQRAFIRKSMAGLDTLTTSSTATW AARQVRAPIIERSFFLRLCRNPKFIAYVVVLAYSLIRAVPVMYVPHFKGNWRVLWAIDII TAIPYTWGIIEMVSGQRLWHRVIGAITTAVTFFAPYVYFLMYGRHAPLSVWLVVAAIFFS AIFLEVYRTMRDSAVRRGLARRCPRPRRRAGAHQPVRRAAPQDSSA Prediction of potential genes in microbial genomes Time: Thu May 12 18:01:43 2011 Seq name: gi|319978296|gb|AEUH01000132.1| Actinomyces sp. oral taxon 178 str. F0338 contig00132, whole genome shotgun sequence Length of sequence - 4405 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 - CDS 11 - 1771 208 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 2 1 Op 2 . - CDS 1774 - 3648 2172 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 3 1 Op 3 . 
- CDS 3645 - 4403 880 ## Arch_1575 ABC transporter related protein Predicted protein(s) >gi|319978296|gb|AEUH01000132.1| GENE 1 11 - 1771 208 586 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 344 556 16 226 245 84 31 1e-16 MWGKIIRVVNARQMRMIRNWFIASAVLQGITLSLLIPFLRAFLGAAPDLGMWTALVVGAG LCALVVDTVAMFVSFRVSVYDVCDSMIDRIASRVLALPLGWFDAKREAQVASATSREVNT ISHVVSIVLPRLTNAFVVPAVMMAATAFYDWRLAAIMAAAVPPLWWVLRLTGSATERANT TESGAATAAAGRLIEFSRLQTVLRATGATKRGWTVLDDALEAESSAVLSSLVVRSRPAQG FSLIITLTYAALVAAGMALVMGGELDPVVYLALMAVCARISGPIGEAALIATEANNAGVA FDRVLDILESSPLPEPDPSQARTPSGASVAFEHVSFSYEQGSPVLRDVSFEAEPGAVTAI VGPSGAGKSTVLRLAARFWDAGAGAVRVGGVDVRDMTSAELMAMTSMVFQDVYLFDTTIR ENLRIARPDATDGELARAARRARLDTVIEALPDGWDTRVGPGGLALSGGERQRVAIARAF IKDAPILLLDEITSALDGENESAITHVVSELSRGRTVLVVAHRVSTIMRADRVIVLSPDG AGGGARVAQTGAPDELARGSGLFREFIDASQEASRWHLRGGGQARR >gi|319978296|gb|AEUH01000132.1| GENE 2 1774 - 3648 2172 624 aa, chain - ## HITS:1 COG:SA2217 KEGG:ns NR:ns ## COG: SA2217 COG1132 # Protein_GI_number: 15928007 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Staphylococcus aureus N315 # 28 577 28 574 587 300 31.0 4e-81 MSASENKNAGLKELIRPAVPAMFGAAALTALAAVLGIVPYAAMARLAELWLGEEGATPGE LWLLAGVAVAALFAGQTCYSLGLGVTHIAEARLRHRLRERVVTTIGNMPLGEVSALSAGA IRKMVADDTAAIHTMIAHLPGDTTHSLVGAAAGFAYLLCVDWRLALALAGVWVVLIGAIA LVSMRGYGSIVSEFSEQQTALAAATVEMVEGIAEIKNFQAADATRTRFTAAREAFSALSY TWTKRSGAAVALVSAIIRPSSVFATVAPLAALFVWQGWTTPARALPFFFVALALPSGLLT LTQMTQHAYEARQGARATAQLLASPLMPEGAYEGAAPQPGAVSFDGVSFGYSQDAPVLRD VSFSVEPGTVTAIVGPSGGGKTTIARLVGRFWDADSGTVRVGGIDVREATHAWLLSQLAI VFQDVALCHDTVGANIALGRPGATREQIEAAARAACVHDRIMALPEGYDTVIGEEGGFLS GGERQRVTLARAYLMDAPILVLDEATASADPRSERDIHRALSSLSEGRSVLVIAHRLSTI RDADQILVVDRGRIVERGTHDRLIALDGAYARLWRTQNPGQGGPGGPGGPGDPGDPDASN SPDGPDGPGGPGSGGRPGTSRKEK >gi|319978296|gb|AEUH01000132.1| GENE 3 3645 - 4403 880 252 aa, chain - ## HITS:1 COG:no KEGG:Arch_1575 NR:ns ## 
KEGG: Arch_1575 # Name: not_defined # Def: ABC transporter related protein # Organism: A.haemolyticum # Pathway: not_defined # 16 250 541 774 775 195 48.0 1e-48 APGAAAPHQRGRRQGGAFAPLNPLTMGLAALPLMVAVALGGTTRVNLVVMAAATLGVIAS RPPARRLVATALAPWLTAALLLFTLRYFATAPYRSVELYYVASPASGATTVGAVLALVLL GGAAAPAEAQIRALTTTLRLPYRLAAAGTAAVSFLRRFSRDFALLRTARALRGVGSGWGP LAPAVRWTASVVPLMVISVSHAERVALSMDARAFGAHARRTEMVDERWRARDWAVVALAC LAAAVVLYWRYR Prediction of potential genes in microbial genomes Time: Thu May 12 18:01:50 2011 Seq name: gi|319978286|gb|AEUH01000133.1| Actinomyces sp. oral taxon 178 str. F0338 contig00133, whole genome shotgun sequence Length of sequence - 8455 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 811 316 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 2 1 Op 2 . - CDS 786 - 1421 904 ## Arch_1576 ABC-type thiamin-related transport system, permease component 1, predicted - Prom 1469 - 1528 2.7 + Prom 1487 - 1546 2.2 3 2 Tu 1 . + CDS 1639 - 2379 1235 ## COG1183 Phosphatidylserine synthase 4 3 Op 1 . + CDS 2918 - 3004 207 ## 5 3 Op 2 . + CDS 3034 - 3921 830 ## COG4823 Abortive infection bacteriophage resistance protein 6 4 Op 1 9/0.000 - CDS 4315 - 6687 3424 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain 7 4 Op 2 23/0.000 - CDS 6757 - 7464 1058 ## COG0047 Phosphoribosylformylglycinamidine (FGAM) synthase, glutamine amidotransferase domain 8 4 Op 3 15/0.000 - CDS 7461 - 7742 313 ## COG1828 Phosphoribosylformylglycinamidine (FGAM) synthase, PurS component 9 4 Op 4 . 
- CDS 7806 - 8453 959 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase Predicted protein(s) >gi|319978286|gb|AEUH01000133.1| GENE 1 2 - 811 316 270 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 1 255 117 366 398 126 37 6e-29 SGSTEMPADPLVACESLSVHYIGGDAWVLDDVSFEQLAGRVTAVIGPSGCGKSTLVRACC GLVPHSIPSEYSGSVALRGQEVADAPADVLAGTVAYVGQNPDAAVVTRSVRSEVEFPLQN LCLGRDEITERADAALAAVGMAGLGGKDPWILSGGQRQRLAIAVALAMRTPLLVLDEPTS TIDAEGSQRFYDLVADLARQGTAVIVIDHDLDPILPWVDQVLALDARGRGIALGTPREVF VRHRAALEAVGVWMPRALRADAVPAAEGAR >gi|319978286|gb|AEUH01000133.1| GENE 2 786 - 1421 904 211 aa, chain - ## HITS:1 COG:no KEGG:Arch_1576 NR:ns ## KEGG: Arch_1576 # Name: not_defined # Def: ABC-type thiamin-related transport system, permease component 1, predicted # Organism: A.haemolyticum # Pathway: not_defined # 18 207 5 194 197 165 50.0 8e-40 MSETPNEALPGAEPGPTSRVVRTSARKGLVDSVLGTRNLMTIAALVVCNCIIFIPINYVS VMTAGTQRGVYLGVGLIGLWAVDFLLPVTIVRRPGAAIVAGLLYGLIGMVATPVGPAAIV GCLLGGAFVELPLLLTFYRFWDWRMFMACATCFGLLNSLLYVASMGALIGSDMNALLVAI GVASGVVGGLLVLGATRLLNMAGVGIDRDAR >gi|319978286|gb|AEUH01000133.1| GENE 3 1639 - 2379 1235 246 aa, chain + ## HITS:1 COG:PA3857 KEGG:ns NR:ns ## COG: PA3857 COG1183 # Protein_GI_number: 15599052 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Pseudomonas aeruginosa # 22 241 15 233 238 159 46.0 5e-39 MSEQTQQPPSTTAPGLGRTVAAWAVHALTISGLVWAALALQALLNDDIALMWLWFVIALI VDGVDGTLARKVGVREVIPWFDGAVVDNLVDYLTWTFLPALFMSMALPFGPAPVPLLMMI LIIVSSVFCYANDGEKSNDNYFVGFPAAWNCVAIVMYVLQTPAWVNIIATIVLAALTLVP IHYTHPARVKRFRAINIVSVVLWLFACGALVVLYPVQPLWLLIVFWVGGGWFMISGFIRT ATGEDK >gi|319978286|gb|AEUH01000133.1| GENE 4 2918 - 3004 207 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTGSARKGPGPVLFISEESNGAQTVQVN >gi|319978286|gb|AEUH01000133.1| GENE 5 3034 - 3921 830 295 aa, chain + ## HITS:1 COG:PM1783 KEGG:ns NR:ns ## COG: PM1783 COG4823 # 
Protein_GI_number: 15603648 # Func_class: V Defense mechanisms # Function: Abortive infection bacteriophage resistance protein # Organism: Pasteurella multocida # 1 265 22 275 316 122 29.0 6e-28 MIIDDASRARNALETIGYYKLSGYSYPFRRKREGTNEIADEFTEGTTFEQVLAVYIYDET LREATAHELSRTEIAFRALIGHELGKTAPHLHLMPHELGPKAWDKDRARPTEEYRRWIEH YQTQLSTSDEDSVLHYKSKYRGVLPLWVAVQVLDWGALRTLFSFATRNQQEAISEKIRIT AAELNSWLRCLNILRNICAHHARLFNRNFRKTPKLPNTPHPLAYLKEYVNNKAHPRWGNR CFLQLTLVQYLLHETDLGDARTLPRTLAAFPPVGRLSPSSMGAPQDWHSLPFWEN >gi|319978286|gb|AEUH01000133.1| GENE 6 4315 - 6687 3424 790 aa, chain - ## HITS:1 COG:Rv0803 KEGG:ns NR:ns ## COG: Rv0803 COG0046 # Protein_GI_number: 15607943 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Mycobacterium tuberculosis H37Rv # 15 790 3 753 754 838 60.0 0 MTAQATPTNPRELHDTVEDAAATPDKEMPWRELGLKEDEYQRIRAILGRRPTNAELAMYS VMWSEHCSYKSSKKHLSEQFGARTTDAMRRRLLVGMGQNAGVVDIGGGWAVTFKVESHNH PSFVEPYQGAATGVGGIVRDIISMGARPVAVMDQLRFGAVDHPDTARVVHGVVAGVGGYG NCLGLPNIGGETEFDPSYQGNPLVNALCVGTLRHEDIHLANATGEGNLVVLFGARTGGDG IGGASILASETFEDGMPAKRPSVQVGDPFMEKVLIECCLELFGAGVVQGIQDLGAAGISC ATSELASNGDSGMHVDLENVLLRDPTLTAGEILMSESQERMMAIVAPADRERFFTIIDKW DVEASVIGRLTSDGRLTIDHHGHRIVDVDPKTVAHEGPVYDRPYARPAWQDDLRAATSEP LRRPATRAELVEDLEAVLFDANQASKAWVTDQYDRYVQGNTALAQPDDAGVIRVDEATGL GVALSTDANGWYTKLDPAAGARQALAESYRNVCVVGAEPVAITDCLNFGSPEDTDAMWQL VTAMTALADGCVELGIPVTGGNVSLYNSSGTVKGRIGSSINPTPVVGMLGIVQDVARVNP SGFTEEGLAVVLLGTTREEFDGSAWARIAHSHLGGVPPEVSLPAEVALGNVLRALSAADG PDGRPLVRAAHDLSNGGLAQALVDSALRFGVGGSFDLASAARRDGVGDLEMLLSESQARA FVAVPEAALPAVFAAAEAEGVEAVRIGTTGGDLLAVAGVDLLAAEGSGGGWVADLAGLRE RSEAVLRDRF >gi|319978286|gb|AEUH01000133.1| GENE 7 6757 - 7464 1058 235 aa, chain - ## HITS:1 COG:Cgl2535 KEGG:ns NR:ns ## COG: Cgl2535 COG0047 # Protein_GI_number: 19553785 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, glutamine amidotransferase domain # Organism: 
Corynebacterium glutamicum # 3 235 4 223 223 246 59.0 4e-65 MTRIGVVSFPGTLDDRDAARAVRLAGAEPVALWHKDADLRGVDAVVVPGGFSYGDYLRCG AIAATAPVMAEIVRGAGRGLPVLGICNGFQILCESHLLPGALIRNDHQKFVCRDQELRVE NPGTAWTRGFSAGDLITIPLKNGEGNFQASPDELARLEGEGRVVFRYAGPNPNGSANGIA GVANDAENVVGLMPHPEHAAEAGFGPLAHDGAAAAHHGGTDGLRIFQSVLAQLAG >gi|319978286|gb|AEUH01000133.1| GENE 8 7461 - 7742 313 93 aa, chain - ## HITS:1 COG:Cgl2536 KEGG:ns NR:ns ## COG: Cgl2536 COG1828 # Protein_GI_number: 19553786 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, PurS component # Organism: Corynebacterium glutamicum # 1 79 1 78 81 69 54.0 1e-12 MGRIIVEVMPKPEILDPQGKAVGSALPRLGITGFQAVRQGKCFHLSVEGPVTGELLAEAR KAAEEVLSNPIIEDVVRVEAIDEADARYAGGAR >gi|319978286|gb|AEUH01000133.1| GENE 9 7806 - 8453 959 215 aa, chain - ## HITS:1 COG:Cgl2543 KEGG:ns NR:ns ## COG: Cgl2543 COG0152 # Protein_GI_number: 19553793 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Corynebacterium glutamicum # 2 210 79 286 297 245 56.0 4e-65 AVPDAVDARAVVCRRLDMVPIECVVRGYLTGSGLAEYRRTGAVCGIGLPGGLTEGSRLDT PVFTPAAKAGAGEHDENITFERAAAMVGRELADRLRDASLALYERARGIAAGRGVIIADT KFEFGLDPDTGRIVLADEILTPDSSRFWPADQWRPGRPTPSYDKQYVRDWLASAGSGWDR ASGENPPALPDSVVAATAARYAGAYRRLTGADPVL Prediction of potential genes in microbial genomes Time: Thu May 12 18:02:00 2011 Seq name: gi|319978278|gb|AEUH01000134.1| Actinomyces sp. oral taxon 178 str. F0338 contig00134, whole genome shotgun sequence Length of sequence - 6602 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 2, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 248 315 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase 2 1 Op 2 . - CDS 271 - 1596 1643 ## COG0151 Phosphoribosylamine-glycine ligase 3 1 Op 3 . 
- CDS 1681 - 2817 1372 ## gi|293190489|ref|ZP_06608876.1| putative tetratricopeptide repeat-containing domain protein 4 1 Op 4 . - CDS 2807 - 3865 1348 ## Bcav_3961 von Willebrand factor type A 5 1 Op 5 . - CDS 3867 - 4889 1526 ## Bcav_3960 von Willebrand factor type A 6 1 Op 6 . - CDS 4886 - 5443 690 ## gi|293190492|ref|ZP_06608879.1| conserved hypothetical protein 7 1 Op 7 . - CDS 5433 - 5525 124 ## 8 2 Tu 1 . - CDS 5710 - 6600 1265 ## COG3764 Sortase (surface protein transpeptidase) Predicted protein(s) >gi|319978278|gb|AEUH01000134.1| GENE 1 2 - 248 315 82 aa, chain - ## HITS:1 COG:MT0804 KEGG:ns NR:ns ## COG: MT0804 COG0152 # Protein_GI_number: 15840195 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Mycobacterium tuberculosis CDC1551 # 7 80 5 70 297 64 50.0 5e-11 MGEHVELTGWTHLASGKVRDIYAPDGEDPAGADALLMVTSDRISAYDHVLPTTIPGKGKI LNQMAIWWMGELRGIVDNHLLA >gi|319978278|gb|AEUH01000134.1| GENE 2 271 - 1596 1643 441 aa, chain - ## HITS:1 COG:ML2235 KEGG:ns NR:ns ## COG: ML2235 COG0151 # Protein_GI_number: 15828196 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Mycobacterium leprae # 1 433 1 412 422 332 50.0 1e-90 MKILLVGNGGREHAIARALARTSTTDHPVELVVQAGNPGLERLGDSRTLDPLDPDDVLRR ALGEQVDLVVVGPEAPLVAGIADAVLERGVPVFGPTRAAARLEASKSFAKQVMAAAGVAT AASRTCRTIDEAGAALDAMGSPYVVKDDALAAGKGVVVTGSRDEALAHAAACLARQGGAV VIEEYLDGPEVSLFCLSDGATALPLVPAQDFKRAGDGGAGPNTGGMGAYSPLPWLPSGAV DQIVHDIAQPVIDEMARRGTPFVGLLYCGLAMTSKGIRVVEFNVRFGDPETQVVLERLDS PLAPLLHAAATGSLAGAAPPVWSPDAAVTVVMASGGYPGPADTGHAITGIDAAERLDGVH VIHAGTSEEITDDPADVAAGCCGFEPTLALVNSGGRVLDVVARAATVERARQRAYEAVGL IRFQGEHHRTDIATWPAGLDG >gi|319978278|gb|AEUH01000134.1| GENE 3 1681 - 2817 1372 378 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190489|ref|ZP_06608876.1| ## NR: gi|293190489|ref|ZP_06608876.1| putative tetratricopeptide repeat-containing domain protein [Actinomyces odontolyticus F0309] # 
14 378 5 345 345 186 41.0 3e-45 MPSDGQFPPVGPPSADPDDVLVPFGAPEADTGGSSPKAKAAHGPAAKPRRSRKALRPLLW SLPFVLIGLAASTYLIGLTLWSRASLGHWTDSDYASAQAGYEGQKTLTKSGVEPWVANYN LGTTMLREGDTDGGIGLLRTAKEQVPTATEVEPGRIETYSYECQVRINLALGIEIQGDAK AASADWSGAATAYSEAKDIISPCQSPSSSQDQSGDGEGQSDQQSGGQGGGGQSDQQSGGQ GGGGGQQDPGEQADDSTDRLGDKEQQAKDKQGEGGQDQQDQGGQGQDQQDQGGQGQDQKD QGGQGQDQKDQGDQGQDQKDKQNKGGQGKSKNEDEGYDNENSSDRQKRQDLQKKNRGNEK DRQDRSDSKRSGNPNGAW >gi|319978278|gb|AEUH01000134.1| GENE 4 2807 - 3865 1348 352 aa, chain - ## HITS:1 COG:no KEGG:Bcav_3961 NR:ns ## KEGG: Bcav_3961 # Name: not_defined # Def: von Willebrand factor type A # Organism: B.cavernae # Pathway: not_defined # 13 340 11 328 399 205 41.0 2e-51 MVFRPIAHIAVIIAILVIGTAMVFLMWKRSARPHLNDAVRRALAVVVVALICAGPSVEGE ATQVSSSVEVYMVIDRTGSMVAEDWDGKKPRIDGVRQDVATILEKMAGSRFSIISWDSGV RTELPLTTDSTAVTSYMATFVQELSESSQGSSPDRPASHLATILEKNKQKHPQNLRTLFV FTDGETSNQDHWSANSPGEESDWDQVKPYIDGGLVIGYGSKEGGPMKVRRIGGDPTTDAQ SGSGGDSGQDEYIHDRSQAGDPVAISKIDEDKLQAIASRLGVDYIHSPSTADIEGKCSSL MDGASDVAETRNLLSTYRYVVWPLALVLGALMAWEGAALAIRARSLRRSHAI >gi|319978278|gb|AEUH01000134.1| GENE 5 3867 - 4889 1526 340 aa, chain - ## HITS:1 COG:no KEGG:Bcav_3960 NR:ns ## KEGG: Bcav_3960 # Name: not_defined # Def: von Willebrand factor type A # Organism: B.cavernae # Pathway: not_defined # 23 322 24 325 343 214 39.0 3e-54 MRFTWLVFVVIGVVLAATVIGWWLARDAGKRADKKGWVANTGYLRSLPKYQALVRRTRLV FASVVVCFFLTVVAVAISAGAPVDRQVKHDKLSTRDIVLCLDASGSMIPYDGKIGEAFRQ IVQHFSGERISLQLWDARSITKFPLTDDYTLADDVLKEMSEIMENGFQGKSASGVYVSNE LLEYLEDTFDSQGGSSSLAGDGLASCVLGFDHYDQERSRMILLATDNEILGDQIYTLKDA IDFAKRQNVVVTALYPSDGSTFLSSEGRELETLVKETGGEFYDASNPASVQGVIAAIEEQ QKVDLEGSGEMLETDKPASALAWTVLGMFALFGMIALGRV >gi|319978278|gb|AEUH01000134.1| GENE 6 4886 - 5443 690 185 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190492|ref|ZP_06608879.1| ## NR: gi|293190492|ref|ZP_06608879.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 1 185 1 170 170 202 65.0 1e-50 MSGDATDYLRDPVIYPHWMWVAGAAILLLVVGWVAYSLWSWWHSRERTVTDLQSISDARR 
SRYYEFIDQIAQRNASGELDERGTHLAIAGLMRALGTERSGRDLEVATVAEIRALVPTWP QLAEVLEACEEPSFVGDEQEGADQGRGERPDNPLLAAGWDMAPPPPPRRPSVDGILSLAS QAVGA >gi|319978278|gb|AEUH01000134.1| GENE 7 5433 - 5525 124 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MALRVADAPRTAPVRTALPARAGEGGGYVR >gi|319978278|gb|AEUH01000134.1| GENE 8 5710 - 6600 1265 296 aa, chain - ## HITS:1 COG:SPy0135 KEGG:ns NR:ns ## COG: SPy0135 COG3764 # Protein_GI_number: 15674348 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase (surface protein transpeptidase) # Organism: Streptococcus pyogenes M1 GAS # 42 242 4 199 251 161 41.0 2e-39 RAPAPRGRRWRFSPLSLIPSLLGLAGLLLFLYPSISAWIVQYNQSQVIAQYEGSVGRADP SADEQLALARRYNDALSAGAVLEANANVPTGDGTSGDDSLDYDSILTADGTGLMARLKVP AADIDLPIYHGTSDDTLLKGLGHLEGTSLPVGGQGQRTVITGHRGLAEARMFTDLDKVEP GDTFTFEVFGEVLTYSVIDKKVVEPEETEALRAEPGRDLATLVTCTPLGINTHRILITGE RVYPTPQKDVDAAGAAPEIPGFPWWALGLLGGVSLIGVYVWRSGYPSRSGNRANTE Prediction of potential genes in microbial genomes Time: Thu May 12 18:02:52 2011 Seq name: gi|319978261|gb|AEUH01000135.1| Actinomyces sp. oral taxon 178 str. F0338 contig00135, whole genome shotgun sequence Length of sequence - 23065 bp Number of predicted genes - 18, with homology - 15 Number of transcription units - 12, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 12 - 2897 3954 ## DIP2227 surface-anchored fimbrial subunit 2 2 Tu 1 . - CDS 3054 - 4673 2596 ## DIP2226 surface-anchored fimbrial subunit - Prom 4728 - 4787 3.5 3 3 Tu 1 . + CDS 4612 - 4782 181 ## - Term 4877 - 4922 17.2 4 4 Op 1 23/0.000 - CDS 4939 - 5832 842 ## COG1721 Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) 5 4 Op 2 . - CDS 5873 - 6862 1365 ## COG0714 MoxR-like ATPases 6 5 Tu 1 . - CDS 7021 - 10401 3617 ## DIP0278 putative surface-anchored membrane protein - Prom 10447 - 10506 3.1 - Term 10520 - 10571 19.4 7 6 Tu 1 . 
- CDS 10606 - 12441 2924 ## COG1274 Phosphoenolpyruvate carboxykinase (GTP) 8 7 Op 1 21/0.000 + CDS 12712 - 13344 1012 ## COG1392 Phosphate transport regulator (distant homolog of PhoU) 9 7 Op 2 . + CDS 13348 - 14409 1414 ## COG0306 Phosphate/sulphate permeases 10 7 Op 3 . + CDS 14412 - 14891 631 ## Ndas_0478 hypothetical protein - Term 14787 - 14821 4.4 11 8 Tu 1 . - CDS 15061 - 16551 759 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 12 9 Tu 1 . + CDS 16740 - 16874 150 ## + Term 16965 - 17001 4.6 13 10 Tu 1 . - CDS 17251 - 17841 621 ## HMPREF0573_10438 hypothetical protein - Prom 17924 - 17983 74.2 + TRNA 17907 - 17978 82.0 # Lys TTT 0 0 + Prom 17905 - 17964 75.0 14 11 Op 1 4/0.000 + CDS 18060 - 20225 1887 ## COG1397 ADP-ribosylglycohydrolase 15 11 Op 2 . + CDS 20233 - 21477 1587 ## COG0477 Permeases of the major facilitator superfamily 16 12 Op 1 . - CDS 21579 - 22445 975 ## COG5006 Predicted permease, DMT superfamily 17 12 Op 2 . - CDS 22446 - 22862 549 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase 18 12 Op 3 . 
- CDS 22917 - 23063 180 ## Predicted protein(s) >gi|319978261|gb|AEUH01000135.1| GENE 1 12 - 2897 3954 961 aa, chain - ## HITS:1 COG:no KEGG:DIP2227 NR:ns ## KEGG: DIP2227 # Name: not_defined # Def: surface-anchored fimbrial subunit # Organism: C.diphtheriae # Pathway: not_defined # 688 958 1099 1374 1375 134 36.0 2e-29 MSQHVRPQPRTRRLARFVVAFVVCAAIGATPVLANTEQPAQAAGEGVPSNSLPATFATGG SGRFKESIQWLQWADYDKDFKGKEKPNVPVLGVGEGPKDFVNHRDLGDAGSLVTTCTLSN LTHDTPAPGTPKQQTEGPLVATIPGTWAGDALDNLYNVGGPGSWSDGSEVWHPGLTYPAN YVNKNQMVIGLANGYAYNGGNAWDGKPWNQSTDHRPTGYNARVSVDYTCKAEIYGKDGSI TNVPVQGLVFADAEASSRRFGIKSWATSERQDEWVQATVKTTAGNKPPRWRVLDTMRSQK CISKNTGKQVTTDGILSNGSRTLRLMPSDDECVYQSGGSYKNPNGYGGPDAVMFMEGATS ATITMQGAGYSAVALGLVVATDYGDAPASYGVASSLFQPSWRGGEVRATTDLFKVSPQAE MYMEKGSPRLGERIDAEPMQSFSADAKGDDIVGEPNDEDGITVPADGIRTQPGATHTQAV KCEGNGKVAGWVDWNRNGVFDDAEKSDEVACSASGSATLSWTVPGDVVRSVDGEDGTGAE TFMRLRITADNNGDGQKPTGGTATGEVEDYRIAVRVPTLQLVKQVDDKYSSNEVGGLAAD QWALTGGANGFTLSGNGTTGGPTVVRTGNNDIAETTANPGGAGYESGQWSCQEAPGTLGE NYSSSVAGATVDGRAQVLVQNTDRVLCAITNTGKPGSLTWQKADSDGTTPLAGTSWRLTG PDVHGNATVEDCTAGPCPTGPYKDQDPVPGSFKVDGLKWGTYTIREASAPAGYELSNARL TFPTITPAALDAALDKAVVNDRKTGSVTWKKVDASDGASPLAGSEWTISGGALQDSVDIA DCQAASAAQCPKSPGATYYDADPAAGSFTVKGLPWSGTAYTLTEKKAPAGYRLDTTPHGF TIGKDALDHALAPISNQKQTVPGIPLTGGFGADAYLIGGIIAGLGAAGAGAAIRRRKNRQ S >gi|319978261|gb|AEUH01000135.1| GENE 2 3054 - 4673 2596 539 aa, chain - ## HITS:1 COG:no KEGG:DIP2226 NR:ns ## KEGG: DIP2226 # Name: not_defined # Def: surface-anchored fimbrial subunit # Organism: C.diphtheriae # Pathway: not_defined # 1 534 1 552 555 268 35.0 5e-70 MSRKHAPLGRRMIAAAGALSIGAAGVIATSAAALAADPAYGTIDQSAIGSIIVHKHLKND SGTMGKVDGTADSGGDGVGGVTFKAFKLSGLDLSKPTDWDKLSALAGNVPANACADPTAP TLGAGAPAVNATAAAEGTTDGAGVTTLANLPVAAYLVCETAAPADIVEKAKPFVVTVPFP NTTNDGDGKWLYDVHAYPKNQKIEIDKTIADQSVNGFGLGSVVEFPVSTTTPNLDAESQF TYFYLRDKLDSRLTEGKVTKITLGDDTLVENTDYEQTNTDNLVVVSFKRAGLAKLKANPG KKLVAVFAGKVSAIGDGTIKNKADLVQDTTYGAIPPNEPPSVPPFEPDNPPTSPEVSTDW GNVQLFKYDGDTENATRVGVRGAKFNVFNAADPYAADCSTATKTGDPIPVSINGGAPQTD 
FVSDVNGIVKIDGLYIGDSVGSAGAGSKDAAKRCYVVEEIEAPAGFVLPQTATTPIEVVK GAQALASYNAQIPNTKQSVPQLPLTGANGQLLMTIGGVSLALIAVGSTLVIRSRKRGEA >gi|319978261|gb|AEUH01000135.1| GENE 3 4612 - 4782 181 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLRAPAAAIIRRPRGACFLLTRLPLNTSALCVVTGLGDSVMSIEEPLGQYDIYRKM >gi|319978261|gb|AEUH01000135.1| GENE 4 4939 - 5832 842 297 aa, chain - ## HITS:1 COG:MT1527 KEGG:ns NR:ns ## COG: MT1527 COG1721 # Protein_GI_number: 15840941 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) # Organism: Mycobacterium tuberculosis CDC1551 # 15 240 18 242 303 87 32.0 3e-17 MPVRSREIHALSTRIQLPVMRKLLSVMDGQHSALRWGRGWEFMDLAPYQPGDDVREIDWA ATARTGTPTIKRHEATANVQVILVVDTGREMGALAPSGEPKEDVALTACEAIAWLSVARG DQIGLVAGDSRRVRFMPARTGNGHAETIMRRIAEDITLGSPRSDVPRLLRRALTTTRRRS LIVIVSDETQPLPSKEADDVIKSLSVRHQVMQISVADVDPAELDAEMTVIDVNEGVLPDF LRRDEQLAKEARNVAEQRRAAVARMLDARGVTQVRVGSSEEVPRALMRALERSSHRK >gi|319978261|gb|AEUH01000135.1| GENE 5 5873 - 6862 1365 329 aa, chain - ## HITS:1 COG:ML1810 KEGG:ns NR:ns ## COG: ML1810 COG0714 # Protein_GI_number: 15827968 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Mycobacterium leprae # 23 329 47 352 377 264 45.0 1e-70 MAAYLTDAELAQAKDLLNRIAVIYQAKVVGQEDLRRALTACLLADGHVLVESVPGLAKTT AAQTLASAVSGSFHRIQCTPDLMPNDIVGSQIWNQASGEMVTQLGPVHANLVLLDEINRS SAKTQSAMLEAMQEGQTTIGGVNHRLPRPFMVMATQNPIEEEGTYVLPEAQMDRFLLKEV ITYPTPVDELEVLERIDSGAMSAPVAASPISVEDVLVLQDLSRRVYVHNTLKAYIVDIIN TTRGAGPNPLPGWAKHVRVGASPRGGIALMQISQALALMDGRNYVVPDDIKKLRYAILRH RIIRTFDALADNVSVEGLIDAVFNAVPVP >gi|319978261|gb|AEUH01000135.1| GENE 6 7021 - 10401 3617 1126 aa, chain - ## HITS:1 COG:no KEGG:DIP0278 NR:ns ## KEGG: DIP0278 # Name: not_defined # Def: putative surface-anchored membrane protein # Organism: C.diphtheriae # Pathway: not_defined # 539 1070 452 988 1080 150 26.0 4e-34 MSTPQSGRGARRSRRGVRAALFAVLAVVLTALVPVGGAPVARAADNDSIQVVLKALRRTT 
PANADLPGGLRVSDTVALDFEWSLRPGAEASLKTGDSFTVSLPEPLRSRVPGHSSELRVL HGGESVAVGVCVAAPDSFTCALSDVLPSKIREGWRGVGGTASFQMSAAAATQDASLVFGV SGYPNGQALGLPGGGGIAALQRGPYSPEPLAVGMTGMSEYSKNTHYHLGFDTDRISEYYA AAGASMAFDGAAERTITLVDALGPGQRFSDPGEWVLRLKYSKGSYPQVADTVLAKGDGRA AASAEGGFAVKAVLGEAGADGQEARIELTGPWDPSTKYELEYWGSPVGSDYLEPGLTYSN RVKLVDDANGDGVEEVLVEKQVERSFVPSFSSTVKTAPAYGSFQVQKSVAGPGSALVPKD QEYTIKVDYELPLDASAYHPDEQAYPGGWAAYAPGALNPDGRSGTATMSVTQGAAAVFTG KEPSITFPKGTRVTLSEVGTDSQAPTGYRWSAPRFEVGGAATNTIEIGDGTLPVVQVTNR LEPTWGTFTVSKTLAEGSSAGAERAYTFAYSCDDPAASSGTITGVRPDGAPVASGALVRA GSSCTVREDAAEAAIDGHTLAPPAERTITISGAGQASANASFENAYAPDTGSFRVKRAIT GDRPAGAGPSTVAYSCTNGVAGELELAGDGDEARGPALPSGASCTIERAPRTVDGYAVAT SYSATRTTITKGSDELLTVTDDYRALKGAFTIAKSVDGDGAALVGDQEFSFEYTCAPTGG DPLTRTVSVKAGQSIQVEDVPAGTCTVKEAAVDVPNASSARSLTVNGSAEAVADGAATVR VADGASAVVASTSTYTLDAGSFSVTGRAAGGALESAPGAFAFDYACADASGRSVGAGTLR VGAGETRTAGPFPVGSSCGIAQEDARVGGAGLATSWTVDGAAAQGQRVVVGIGAKGSTAA VEAVNTYTRETARFAVVKNVEANGAPVPQSFTFTYVCGAGGESRVLTVPGDGTPVESEEL PVGTQCALAEQTDAAWVPGHSLSAQIADGGVVEVGPQGSVARVVATNTYTPIEAEQGDPG PDAGPSGGAGLALTGSSAPVLLGGSLALAVLGGSILVLRRRGAASD >gi|319978261|gb|AEUH01000135.1| GENE 7 10606 - 12441 2924 611 aa, chain - ## HITS:1 COG:Cgl2803 KEGG:ns NR:ns ## COG: Cgl2803 COG1274 # Protein_GI_number: 19554053 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (GTP) # Organism: Corynebacterium glutamicum # 8 611 9 610 610 721 58.0 0 MSVTEQDVRAAAPSNTPEDVLEWVAKIAELTTPDAVEFTDGSQEEWDRLTAQMVESGVFT KLNEEKRPNSFLARSLPSDVARVESRTFICCEKEEDAGPTNHWADPTEMKATLKEKFTGA MKGRTMYVIPFSMGPLGGPISQLGIEITDSPYVVCNMRIMTRMGTAAMDLIAAGKPWVPA VHSVGMPLAEGQEDTTWPCNDEKYITHFPETNEIWSFGSGYGGNALLGKKCFALRIASTM ARRDGWMAEHMLILRLTRESTGQQFHVTAAFPSACGKTNLAMLQPTIPGYKVETVGDDIA WMRPGPDGTLRAINPEAGFFGVAPGTSYKTNPMAMETGKANSIFTNVALTDDGDVWWEGI DGDVPAHLIDWHGKDWTPESGEKAAHPNSRFTVPASQCPIICPDWEAPEGVAVDAVLFGG RRATNVPLVAEQYEPAHGVFIGASVASEVTAAATDVKAGSLRHDPFAMLPFCGYNMADYW GHWIEMQKKVGDKFPKVYQVNWFRKDEDGNFMWPGYGDNARVLDWIVRRAAGEVEAIDGV 
TGRYPKFEDFNLDGLDIDEAVWAKLFAIDPDAWAAEMDDTEEYFSQFGDKLPAAITEQLA KFRERIAAAKA >gi|319978261|gb|AEUH01000135.1| GENE 8 12712 - 13344 1012 210 aa, chain + ## HITS:1 COG:CAC3094 KEGG:ns NR:ns ## COG: CAC3094 COG1392 # Protein_GI_number: 15896345 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate transport regulator (distant homolog of PhoU) # Organism: Clostridium acetobutylicum # 7 210 6 210 210 107 32.0 2e-23 MGIRSKPQGPDFFEYFSQQANHLVEGVELLTQIYAASPEERVALRDQLHAVEHRGDELNH EIIQRVNSSFITPFDREDLQSLASHLDDCLDFIDEAGDLLVVYGIADIPSSLDSLLDAQI DVIKNCADLTAENMPKLKSPVDMRDYWIEINRLENEGDLAYRRTLTALFDSGLDAVTIIK LKDIVQGLESCADAFEELANVIESLALKES >gi|319978261|gb|AEUH01000135.1| GENE 9 13348 - 14409 1414 353 aa, chain + ## HITS:1 COG:Cgl0455 KEGG:ns NR:ns ## COG: Cgl0455 COG0306 # Protein_GI_number: 19551705 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Corynebacterium glutamicum # 5 347 40 373 461 244 41.0 3e-64 MDLNLILVCVVIAVALFFSYTNGFHDAANAIATSVSTRAWTPRAALLMAAAMNVIGAMMG TAVAKTVGQGIISISRYAESTQAEMQHKGLVIILAALLGACVWNLITWWFGLPSSSSHAL IGGLVGAGIVSSTAVHWQGILDHVIIPMFASPIIGFVVGYFVMHCVLRLLRNAQYHRTMR RFRAAQRFSSAAMALGHGIQDAQKTMGIIIMALVAGGYGQTHQIYDPVTGDMTVPMWVKI SGALAISLGTMAGGGRIMRTIGRKIIDLDPARGFTAEAVSSSILYLCSYVVHAPISTTQV VSTAIMGVGCTRRFSAVRWGVAGNIVVAWFLTLPAAAAVSAVLYWVAAWIVGV >gi|319978261|gb|AEUH01000135.1| GENE 10 14412 - 14891 631 159 aa, chain + ## HITS:1 COG:no KEGG:Ndas_0478 NR:ns ## KEGG: Ndas_0478 # Name: not_defined # Def: hypothetical protein # Organism: N.dassonvillei # Pathway: not_defined # 17 150 5 153 163 86 34.0 4e-16 MSITTPDLQGNRRSFRAPLRALAACAAAASLSAMALAGCSTSSVMHMSIGQCIQDPESTQ VSTVETVDCSKPHDAEVFFLYEVEGTNYPGEDSLNSTAEQVCIAAFEAYVGKSFEESSLN ATWFVPTSKSWPQNDHEIVCMATDMQNGKLNQSVKNSGF >gi|319978261|gb|AEUH01000135.1| GENE 11 15061 - 16551 759 496 aa, chain - ## HITS:1 COG:Cgl1286_1 KEGG:ns NR:ns ## COG: Cgl1286_1 COG0494 # Protein_GI_number: 19552536 # Func_class: L Replication, recombination and repair; R General 
function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Corynebacterium glutamicum # 3 153 65 193 194 84 36.0 5e-16 MRSAGALVWRFADPGRVDVVGEAVDPADIEVLMVHRPRYRDWSWPKGKAEPGEPIVVAAV REVEEETGIAVILGVPLATQRYRLGSGHTKEVHYWVGAPVPAGSVAERLRAPVVPAPRGE IDQIRWARPDAAEAMLTRRGDRRLLSELASHARAGTLLTTAVALVRPGDAAGPGAPGRGG TAQGTAALVRANTDEEALVPPGASTAASGAGAAAPTAGAAAPGAALVRPAAGGEPATSGE GMLSRLGVRRAFDLVELMSAFGVDRAYAARADHARRTLAPWAAAGGGQVLVAEGEAPRGP TTAAGPSHGAAAGAEAARTEEERHGPAAVAAASGLVPPGRRRATGPPAQLAAAADDPRSP QCAEGGDADPQGASASGSVQRDAGASTGSGWAGALVRSLLGRRRGASLVCAPGEALERGI GALRGAMAPNAQGTLPQGLDRAQVVVAHVAHHDGGHAVIAVECHALRTKRTAVPAEGPGE GRPRAHNRSKRPAGPR >gi|319978261|gb|AEUH01000135.1| GENE 12 16740 - 16874 150 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQSPGCAHKDPDLRNTRRTQGLDSALVCDAMLPNSDPCVRCRVA >gi|319978261|gb|AEUH01000135.1| GENE 13 17251 - 17841 621 196 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10438 NR:ns ## KEGG: HMPREF0573_10438 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 8 196 18 211 211 107 38.0 3e-22 MLSASRAEAIIGDEDPAESAAVAHTAAWALMGVGDDEFDDEAVARLRAHVRGHGADAIAH VWARSPAFTLPGALWRAYLLSEWYHRDPGAVARRYEAGAASPRIDGLEAPVDVRPLPFVI EEVDSLLRGDLTDDDLDFVFAQAARACRVLAAADATWIDDPDDPLAHAVTTRSRALVRTA DELDAAAREAASGALD >gi|319978261|gb|AEUH01000135.1| GENE 14 18060 - 20225 1887 721 aa, chain + ## HITS:1 COG:mll1744 KEGG:ns NR:ns ## COG: mll1744 COG1397 # Protein_GI_number: 13471694 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ADP-ribosylglycohydrolase # Organism: Mesorhizobium loti # 6 344 8 332 344 166 42.0 1e-40 MSDEADRAHGALAGLALGDALGMPTQAMTADQITRTYGWVDALVPADASQPYAPGMPAGS VTDDTEQALLVADLLVSGGGGIDPHAFSRALLDWEDSMAARGSLDLLGPSTKAALERVRA GEDPLHVGGAGTTNGSAMRVAPVGIASSTRDPRFTDTVWESCRVTHATEQGFHAAALVAA AVSLGIDGAGVGSPSDSARASLERALALVEGLGRRGAWTPQPDVCERTRYALRFARSRDR APGTADDDRAFAGALRAHVGASVEAAQSIPAAFAIAWRYAADPWRGLCVAANLGGDTDTI 
GAIAGAVLGAALGARCWPAQEMERVEGVSGLRLRETADGLLRLRAHGSRPPAHGEPVAAP LEGRVVLLGQVVVDLALLAPRVPAPGGDVFAEDAGMHAGGGFNVLAAARRMGAQAVSLSG VGDGGFASIITAALERIGAACEGPRIAGTDSGYCVAITDADGERTFVSTRGAEARLPRGS WSAHAARLRSGDVVHVDGYALAHPANTAALREFLLAPLPAGLRAIVDVSPVVGDVDLDDL LALRALAPLWSMNEREAGILAGRLARASAAPPHGGAPQGEAAPPAGAAPGDAARLAAALA DALGSPVVVRCGAHGAWYAPGEPADPRWAVEPPASTDDRPRRRRPPSAVRIPTPQITAVD TNGAGDAHSGVLAASLAQGADILEALRLANCAGALAATRVGPATCPSRTEIEAASRALAH H >gi|319978261|gb|AEUH01000135.1| GENE 15 20233 - 21477 1587 414 aa, chain + ## HITS:1 COG:Cgl2686 KEGG:ns NR:ns ## COG: Cgl2686 COG0477 # Protein_GI_number: 19553936 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Corynebacterium glutamicum # 17 404 7 393 398 166 35.0 1e-40 MPRNADAPRASLIGGPLTPALLLTLAALNAVPPMATDMYSAAFPQITASLSTTSTMVGLT LTAFFIGYAAGQVLGGAVSDQLGRRGPVVVGCLAALVGSLVCVFTPNAGVLIAGRVLQGF GGGVASSVGRAVLVDVAHGRMLARAMSLLQAVVGFAPMVAPVLGGFIVTRAPWRTVFWAL AAFTLLMTVLAWCFVPESLPPGERQDGGVPRFFRGLAQVVRIRPFVGFMLTNALSSFCMF GYISNATYVLQESLGLSPFQYSLVFAFNALVPTLLALVNVRLIARFEPRALIAVGLALAA LGVVTLVVSATALGLALVPTCAGFMLIMSANAFIFGNAASLALGQSRELAGTASSVLGVA QSLANAVSAPLATSGGSSATPMIVVMVVGSVGAWLAFWVVARGGRGARTDGAAQ >gi|319978261|gb|AEUH01000135.1| GENE 16 21579 - 22445 975 288 aa, chain - ## HITS:1 COG:mlr1493 KEGG:ns NR:ns ## COG: mlr1493 COG5006 # Protein_GI_number: 13471502 # Func_class: R General function prediction only # Function: Predicted permease, DMT superfamily # Organism: Mesorhizobium loti # 2 274 17 289 298 126 35.0 6e-29 MALADRLPPQLFIVGSGLIQYVGAAVAVVAFAAVEPASVAWWRAATGALVLCAWRRPWRD GLSRRDLAASTLFGVVLVAMNSSFYEAIARLPLGTAVSIEFVGPVAVAVLRGRGWAPRVA ALLALAGVACIGGLGLDLSQGRVRAGLAWILAAALAWALYIVLGQRVASTRSGVTNLALG CFFGAVLFAPVLGPGALVALTDSGLLAAVVGVGLLSTVVPYSLEALALTRLPAATFALFT ALLPASSAVVGALMLRQIPAPAELAGLVLVSVAVWIASGKHHSARRGE >gi|319978261|gb|AEUH01000135.1| GENE 17 22446 - 22862 549 
138 aa, chain - ## HITS:1 COG:AGc1655 KEGG:ns NR:ns ## COG: AGc1655 COG0229 # Protein_GI_number: 15888246 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 14 136 12 134 135 155 57.0 3e-38 MSDFPSLDEASALTDAQWRSRLTPMQYHVLREGGTERPFTGELLDEHRQGAYHCAACGQR LFDADAKFESACGWPSFYEADEGATRDLVDTSHGMVRTEVRCSRCGSHLGHVFPDAPDQP TGMRYCMNSAALVFTPAQ >gi|319978261|gb|AEUH01000135.1| GENE 18 22917 - 23063 180 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no SAPAAPARPAAEPVERGERFYSGSPEADERNAAMAGRGYDDKKGKRRR Prediction of potential genes in microbial genomes Time: Thu May 12 18:03:46 2011 Seq name: gi|319978255|gb|AEUH01000136.1| Actinomyces sp. oral taxon 178 str. F0338 contig00136, whole genome shotgun sequence Length of sequence - 3633 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1780 1982 ## COG0668 Small-conductance mechanosensitive channel - Prom 2029 - 2088 79.3 + TRNA 1784 - 1856 61.4 # Glu TTC 0 0 + TRNA 1910 - 1986 88.0 # Asp GTC 0 0 + TRNA 2012 - 2087 79.1 # Phe GAA 0 0 2 2 Tu 1 . - CDS 2509 - 2664 150 ## 3 3 Tu 1 . 
+ CDS 3089 - 3632 -257 ## APA01_41110 hypothetical protein Predicted protein(s) >gi|319978255|gb|AEUH01000136.1| GENE 1 1 - 1780 1982 593 aa, chain - ## HITS:1 COG:MA1724 KEGG:ns NR:ns ## COG: MA1724 COG0668 # Protein_GI_number: 20090576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Methanosarcina acetivorans str.C2A # 192 348 70 224 379 77 28.0 5e-14 MAAGLAPEDEDYFRQNMPSRSKRPPVTPNTPCAFQEAVRHGAYASAMRTLPPQFAHVLAA SGDSGQSDGASEVVTHALDLASLLLGSAVGAFVGLVIAVLVHAIARGALAKSAIASALLG RTRSASYGSFLTWGGWIGLQVTLSNIDLMDWSNGTTVSVMSHLLLIAALACLTWVGYSAA WLVEDAAKLRQKQDKGRSRRFETQAQVIRRLMQVVISVVGLCAVLGTFEAARQAMTTILA SAGLVSVIAGLAAQQTLGNVFAGVLLAFTDAIRVGDVVVAGPKGENGSVEEITLSYVVVR VWDERRLITPCTYFTSTPFENWTRRAAAQLGTVELKLDWSAPVNLIRRKVETLLTRTDLW DGRTWTVQVTDSDQDTVTVRVLVSAKDSGTLSDLRTFLREHLIAWIVAEHPAARPARRLE PLRTVTVTHDPTGEKAASLAEELAGIAGTGVNEAGAALVAPPASAEAPAPVSDASKTPRD AAHAARMVAARRKHKLARRRAMAERQRELAHGGPSQGEETQVMSETAVGRVLSEGRVGVR PKSTDGPATAVHGIDRTRILEAVDGGGRGRPAPAPAGAPAPRPAPAPRPAPAA >gi|319978255|gb|AEUH01000136.1| GENE 2 2509 - 2664 150 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLSLAETGAEVHLHVDSHGPGVKDPDAETDGRIWGFKAPQDTNHTVSSQFE >gi|319978255|gb|AEUH01000136.1| GENE 3 3089 - 3632 -257 181 aa, chain + ## HITS:1 COG:no KEGG:APA01_41110 NR:ns ## KEGG: APA01_41110 # Name: not_defined # Def: hypothetical protein # Organism: A.pasteurianus # Pathway: not_defined # 1 170 22 191 276 93 40.0 4e-18 MRGKPAPASSVLSRAGLIPACAGKTRARAPPRRRRRAHPRVCGENQAAAHGRDAVGGSSP RVRGKRDMLRSLMPRFRLIPACAGKTLDGAQEVGETGAHPRVCGENVLVSGGEWVAAGSS PRVRGKHPLDAGGAASAGLIPACAGKTRRALWPWGWTTAHPRVCGENVALGHQHVLGLSR V Prediction of potential genes in microbial genomes Time: Thu May 12 18:03:55 2011 Seq name: gi|319978241|gb|AEUH01000137.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00137, whole genome shotgun sequence Length of sequence - 5640 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 6, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 548 -227 ## APA01_41110 hypothetical protein 2 2 Tu 1 . + CDS 858 - 1535 -438 ## APA01_41110 hypothetical protein 3 3 Tu 1 . + CDS 1873 - 2343 -281 ## GbCGDNIH1_1372 hypothetical protein 4 4 Op 1 . + CDS 2740 - 3312 -186 ## APA01_41110 hypothetical protein 5 4 Op 2 . + CDS 3336 - 3908 -423 ## APA01_41110 hypothetical protein 6 5 Tu 1 . + CDS 4142 - 4657 -351 ## APA01_41110 hypothetical protein 7 6 Tu 1 . + CDS 5265 - 5387 107 ## Predicted protein(s) >gi|319978241|gb|AEUH01000137.1| GENE 1 3 - 548 -227 181 aa, chain + ## HITS:1 COG:no KEGG:APA01_41110 NR:ns ## KEGG: APA01_41110 # Name: not_defined # Def: hypothetical protein # Organism: A.pasteurianus # Pathway: not_defined # 30 181 17 168 276 99 46.0 6e-20 PLQSQPHHHPTDHRITNPPQNPGLSILAVGSSPRVRGKRVRGRRGRGVGGLIPACAEKTP ARASATSDSPAHPRVCGENLAAAAAGASTLGSSPRVRGKPGQWHAEGQPGRLIPACAGKT KVCVVTTPPDRAHPRVCGENSACDGQHTPASGSSPRVRGKHHQIRRAGRVGGLIPACAGK T >gi|319978241|gb|AEUH01000137.1| GENE 2 858 - 1535 -438 225 aa, chain + ## HITS:1 COG:no KEGG:APA01_41110 NR:ns ## KEGG: APA01_41110 # Name: not_defined # Def: hypothetical protein # Organism: A.pasteurianus # Pathway: not_defined # 8 221 36 249 276 112 41.0 1e-23 MPRRCPRRGLIPACAGKTDFTGGGHWGSPAHPRVCGENTETFSVYQYAAGSSPRVRGKRR WALRGRHEVRLIPACAGKTSGTSSSPSATPAHPRACGENSVTAIWAMTAWGSSPRVRGKH VHEVGPGVGDRLIPACAGKTTRCWRTMIPAEAHPRACGENEYHHDLAVLGAGSSPRVRGK RGRACARVHVDGLIPARAGKTSAPRRRCRPRPAHPRACGENRGCT >gi|319978241|gb|AEUH01000137.1| GENE 3 1873 - 2343 -281 156 aa, chain + ## HITS:1 COG:no KEGG:GbCGDNIH1_1372 NR:ns ## KEGG: GbCGDNIH1_1372 # Name: not_defined # Def: hypothetical protein # Organism: G.bethesdensis # Pathway: not_defined # 16 147 90 221 268 62 40.0 3e-09 MRGKLDGGRPVLRGDRLIPARAGKTTVPGTVPKRKRAHPRACGENRRTMRARLASYGSSP 
RVRGKPAVAERPHGAPGLIPARAGKTPAFFFYEGQGGAHPRACGENDPKKTNEADFEGSS PRVRGKHAHGRIDSVPPRLIPARAGKTSSATSTSTT >gi|319978241|gb|AEUH01000137.1| GENE 4 2740 - 3312 -186 190 aa, chain + ## HITS:1 COG:no KEGG:APA01_41110 NR:ns ## KEGG: APA01_41110 # Name: not_defined # Def: hypothetical protein # Organism: A.pasteurianus # Pathway: not_defined # 12 163 17 168 276 82 43.0 6e-15 MVSTKGTARSEGSSPRVRGKRVGGGVDGPAVRLIPARAGKTAIEAGKALATAAHPRACGE NVQEVSLVAIPAGSSPRVRGKLHPLRRRWTTQRLIPARAGKTSCAATAPAPRGAHPRACG ENAADIALAVDPRGSSPRVRGKLTEYLLRCGPLGLIPARAGKTPCPVLSPLSGGLIPARA GKTRACPWAR >gi|319978241|gb|AEUH01000137.1| GENE 5 3336 - 3908 -423 190 aa, chain + ## HITS:1 COG:no KEGG:APA01_41110 NR:ns ## KEGG: APA01_41110 # Name: not_defined # Def: hypothetical protein # Organism: A.pasteurianus # Pathway: not_defined # 1 186 22 207 276 87 39.0 2e-16 MRGKRSSWFGGRGPPGLIPARAGKTGRQQECTRRRTAHPRACGENLSVRIPTPARSGSSP RVRGKLEGVAGEPQPGGLIPARAGKTFFTDAEDGVAGAHPRACGENTYSNIEQDWISGSS PRVRGKPGRGREVEAVAGLIPARAGKTALSPTRARRWWAHPRACGENSAPASAMLTAEGS SPRVRGKLGW >gi|319978241|gb|AEUH01000137.1| GENE 6 4142 - 4657 -351 171 aa, chain + ## HITS:1 COG:no KEGG:APA01_41110 NR:ns ## KEGG: APA01_41110 # Name: not_defined # Def: hypothetical protein # Organism: A.pasteurianus # Pathway: not_defined # 12 166 17 171 276 71 40.0 1e-11 MTCITIVAVPTGSSPRVRGKRVPRSSLMLATGLIPARAGKTSAPRNSPSKTRAHPRACGE NTRPPASLLRSPGSSPRVRGKPRIYPGRRARPRLIPARAGKTVVGRGVSRKDGAHPRACG ENRTAGLRRQAAAGSSPRVRGKPTQTKGPGGPNGLIPARAGKTHVISPKPF >gi|319978241|gb|AEUH01000137.1| GENE 7 5265 - 5387 107 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTKAHPRACGENNVIAGRGSGATGSSPRVRGKLNNLIRRT Prediction of potential genes in microbial genomes Time: Thu May 12 18:04:22 2011 Seq name: gi|319978218|gb|AEUH01000138.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00138, whole genome shotgun sequence Length of sequence - 17799 bp Number of predicted genes - 18, with homology - 14 Number of transcription units - 10, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 249 - 515 128 ## 2 2 Tu 1 . + CDS 814 - 1290 -403 ## APA01_41110 hypothetical protein 3 3 Tu 1 . + CDS 1471 - 1923 -151 ## Rru_A1342 hypothetical protein 4 4 Tu 1 . + CDS 2854 - 3321 -140 ## APA01_41110 hypothetical protein 5 5 Op 1 . + CDS 3890 - 4465 -323 ## APA01_41110 hypothetical protein 6 5 Op 2 . + CDS 4539 - 5003 -407 ## APA01_41110 hypothetical protein 7 6 Tu 1 . + CDS 5369 - 5569 133 ## + Term 5774 - 5820 0.5 - Term 5659 - 5716 3.8 8 7 Op 1 . - CDS 5812 - 6225 257 ## Afer_0982 CRISPR-associated protein Cas2 9 7 Op 2 . - CDS 6222 - 7190 1056 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 10 7 Op 3 . - CDS 7187 - 7945 754 ## Namu_3050 CRISPR-associated protein, CSE3 family 11 7 Op 4 . - CDS 7946 - 8560 473 ## Sare_2576 CRISPR-associated Cas5 family protein 12 8 Op 1 . - CDS 8671 - 9825 1356 ## Namu_3052 CRISPR-associated protein, CSE4 family 13 8 Op 2 . - CDS 9876 - 10565 486 ## Caci_3908 CRISPR-associated protein, CSE2 family 14 8 Op 3 . - CDS 10562 - 12256 1191 ## Tcur_2680 CRISPR-associated protein, CSE1 family 15 8 Op 4 . - CDS 12253 - 15270 2322 ## COG1203 Predicted helicases - Prom 15302 - 15361 3.8 - Term 15305 - 15364 9.7 16 9 Tu 1 . - CDS 15369 - 15572 230 ## - Prom 15729 - 15788 2.7 + Prom 15733 - 15792 2.2 17 10 Op 1 . + CDS 15821 - 16024 180 ## 18 10 Op 2 . 
+ CDS 16031 - 17158 1389 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component + Term 17361 - 17394 4.2 Predicted protein(s) >gi|319978218|gb|AEUH01000138.1| GENE 1 249 - 515 128 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQHPFVSLQGSSPRVRGKLRVPCLFEFLLRLIPARAGKTKQSKPHYVGAWAHPRACGENH VVVRAQARPLWLIPARAGKTGALATAHR >gi|319978218|gb|AEUH01000138.1| GENE 2 814 - 1290 -403 158 aa, chain + ## HITS:1 COG:no KEGG:APA01_41110 NR:ns ## KEGG: APA01_41110 # Name: not_defined # Def: hypothetical protein # Organism: A.pasteurianus # Pathway: not_defined # 4 157 16 169 276 70 43.0 1e-11 MRSGGSSPRVRGKRPGPQEVDVGPRLIPARAGKTRERSAGAHSGGAHPRACGENSCGNSR ARRQMGSSPRVRGKPRPGRCRCAPSGLIPARAGKTSGTFSSPSATPAHPRACGENCWRCQ PSDLHAGSSPRVRGKRHHHNRRQGPGRLIPARAGKTRR >gi|319978218|gb|AEUH01000138.1| GENE 3 1471 - 1923 -151 150 aa, chain + ## HITS:1 COG:no KEGG:Rru_A1342 NR:ns ## KEGG: Rru_A1342 # Name: not_defined # Def: hypothetical protein # Organism: R.rubrum # Pathway: not_defined # 15 139 2 126 238 66 42.0 3e-10 MVHRFPNARAHPRACGENVRAALGDHPRNGSSPRVRGKRVGGLAQEAGDRLIPARAGKTP PPERTPGSDGAHPRACGENFPMIRVSSLASGSSPRVRGKQGHKRAKDQGIRLIPARAGKT HCAAQAPRPKTAHPRACGENAGRGRRGGPG >gi|319978218|gb|AEUH01000138.1| GENE 4 2854 - 3321 -140 155 aa, chain + ## HITS:1 COG:no KEGG:APA01_41110 NR:ns ## KEGG: APA01_41110 # Name: not_defined # Def: hypothetical protein # Organism: A.pasteurianus # Pathway: not_defined # 17 147 99 229 276 65 41.0 4e-10 MRGKHLVRDLAGWSLGLIPARAGKTVRISYCPRPDAAHPRACGENSARQTGMSWLSGSSP RVRGKRRLLPTAARPGGLIPARAGKTSPAKTVPAGTAAHPRACGENRFDTDPLLHDDGSS PRVRGKPLAPVFDVAEARLIPARAGKTRRRCPVRA >gi|319978218|gb|AEUH01000138.1| GENE 5 3890 - 4465 -323 191 aa, chain + ## HITS:1 COG:no KEGG:APA01_41110 NR:ns ## KEGG: APA01_41110 # Name: not_defined # Def: hypothetical protein # Organism: A.pasteurianus # Pathway: not_defined # 1 191 83 273 276 87 38.0 3e-16 MRGKRQVAEGRTPSHGLIPARAGKTRGRRSAPPGPSAHPRACGENFLATTHGSKAGGSSP RVRGKLGVLGGEAEERGLIPARAGKTSDGDCVRSPRGAHPRACGENFDGPYVAQRLAGSS 
PRVRGKLSEWETRPYEDRLIPARAGKTRSHRRRRAGSRAHPRACGENEGERYAAVLEYGS SPRVRGKPSIP >gi|319978218|gb|AEUH01000138.1| GENE 6 4539 - 5003 -407 154 aa, chain + ## HITS:1 COG:no KEGG:APA01_41110 NR:ns ## KEGG: APA01_41110 # Name: not_defined # Def: hypothetical protein # Organism: A.pasteurianus # Pathway: not_defined # 4 154 99 249 276 77 42.0 1e-13 MLGLIPARAGKTRRRGQGPGGRRAHPRACGENVRPMRVLISAIGSSPRVRGKRAERVLGP DPNRLIPARAGKTPAGLGPSAPRPAHPRACGENPGRGDHTGILVGSSPRVRGKQHFSLLQ THICRLIPARAGKTAGHRRPTSYYRAHPRACGEN >gi|319978218|gb|AEUH01000138.1| GENE 7 5369 - 5569 133 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRRIKAFFARGLIPARAGKTSPATVSSLTLSAHPRACGENSAGTRSLEAVSGSSPRVRGK RVPSVW >gi|319978218|gb|AEUH01000138.1| GENE 8 5812 - 6225 257 137 aa, chain - ## HITS:1 COG:no KEGG:Afer_0982 NR:ns ## KEGG: Afer_0982 # Name: not_defined # Def: CRISPR-associated protein Cas2 # Organism: A.ferrooxidans_DSM10331 # Pathway: not_defined # 1 90 1 91 92 107 61.0 9e-23 MIVLVLSAVPEGLRGHVTRWLLEISPGVFVGNLGSRVRERLWEIVIKTMGSGRAIMVYRA RNEQGLEFLTWGDPWRPVDFEGIRLMMRPALAQGRGQYTDPADAPDTARRLRGRPGMSNF QKRRKARGFPRSDRPER >gi|319978218|gb|AEUH01000138.1| GENE 9 6222 - 7190 1056 322 aa, chain - ## HITS:1 COG:ygbT KEGG:ns NR:ns ## COG: ygbT COG1518 # Protein_GI_number: 16130662 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Escherichia coli K12 # 12 280 5 273 305 192 38.0 1e-48 MTPIPGARPPDPKDLVRVRDRLTFLYVERCVIHRDSNAITITDSRGVAHVPAAALSVLLM GPGVKITHSAISVLSESGSTAVWVGENGVRYYAHGNPPSRSSRLLEAQAKAVSDPALRLA VARQMYLMRFQGEDVAKLSLQQLRGREGARVRRLYRHNSQRTGIPWDRREYDPDDFEGGS VVNQALSAANSALYGVVHAVIVALGCSPGLGFVHSGTYRAFVYDIADLYKADLSIPVAFD VAALGGETPIGTEARRRVRDAVRECRLLEQAVRDIKALLALSGDPIDADDDDEALIVDDL HLWDDFEGTVAAGVSYEERSVT >gi|319978218|gb|AEUH01000138.1| GENE 10 7187 - 7945 754 252 aa, chain - ## HITS:1 COG:no KEGG:Namu_3050 NR:ns ## KEGG: Namu_3050 # Name: not_defined # Def: CRISPR-associated protein, CSE3 family # Organism: N.multipartita # Pathway: 
not_defined # 1 252 1 213 215 151 38.0 2e-35 MFLTRIFLNPGRRGCRELISSRQRLHAVVLNCFPPGALDAPEEGRLLWRLDGSGAARASR RAVGGRAHEALILYVSSPVPMDPSLIVETAGYATDEGVTVRDTGAFLEALSPGQRWGFRI TVNPTFRDTKQRNGKGKKKTLAHVTIAQQTQWLLDRTEANGFSILTSREIGGDLPVLEDE AGERVDGANLVVDCVGRGIEKFQHDGTYVTIQMATFQGVLEVRDPARLRACLVGGMGRSK AYGCGLMTLARP >gi|319978218|gb|AEUH01000138.1| GENE 11 7946 - 8560 473 204 aa, chain - ## HITS:1 COG:no KEGG:Sare_2576 NR:ns ## KEGG: Sare_2576 # Name: not_defined # Def: CRISPR-associated Cas5 family protein # Organism: S.arenicola # Pathway: not_defined # 1 204 36 238 238 145 44.0 1e-33 MLAAAQGRRRTDPIEDLAQLRMAVRIDQPGELLRDFHTAHRGKTAMPLSDRFYWSDAVFA AFVEGPDGLINALANAIRRPAFPLYLGRRACPPALPLFLAVESDPIWEVVRSTGWLASGF HRKRQRAKREVQVRVVADAGIVPGADESTPGRTLQDVPVSFDSERRLYTFRQVEEASVPL QNPEYDESAHGSALTGGHDPMEVL >gi|319978218|gb|AEUH01000138.1| GENE 12 8671 - 9825 1356 384 aa, chain - ## HITS:1 COG:no KEGG:Namu_3052 NR:ns ## KEGG: Namu_3052 # Name: not_defined # Def: CRISPR-associated protein, CSE4 family # Organism: N.multipartita # Pathway: not_defined # 5 380 4 371 378 301 49.0 3e-80 MSRFIDIHVLQTLPPSNPNRDDTGAPKTATFGGVRRMRISSQAIKRATREDFEKLAPEGN RGIRTKLVVELVRAAIVERAPQLAQSAAELAEMGLVEIGFKLTEPKRKKNEETTDQKLKE AGFLVFLSAKQIEHLAEALVSVADAEDRKKAFKALKPKTLVDTDHSISIALFGRMVAEPN SLNVNAACQVAHALGVGEVESEYDYYTAVDDEKKRDDEADEGAGMIGTIEFASATVYRYA TINVDLLEENLGSVDAADDAIRLFIDSFVRAMPTGKVTTFANRTLPDAVVVQVRDDQPIN LVGAFEEPVPAKTAGDGTKKGFAGPAVKRLVAFEKDMRALTGMTPKASLVAWTTHKAESI AELGQQVRLVDLGGAAVDALRGQQ >gi|319978218|gb|AEUH01000138.1| GENE 13 9876 - 10565 486 229 aa, chain - ## HITS:1 COG:no KEGG:Caci_3908 NR:ns ## KEGG: Caci_3908 # Name: not_defined # Def: CRISPR-associated protein, CSE2 family # Organism: C.acidiphila # Pathway: not_defined # 53 216 46 206 223 88 39.0 2e-16 MTGEEADARQADSASKPCPETAVLNRSWESAEKVRKWVEWVIPRRLNGREASAMAARAQL RHGAGKEVGAVPSIWAYTLESQRPELSDQRPSWGENAVHTALTLWARHQGSNAAPMHMVD AKETPRSFGSAVRALAEKGRGDKRPEETPVYRRLSSVVAAQTFGALSHHLRGIVDLLEAA EIPMDYGLLAADLFEWQIPRRRSAVTRRWGRQFARAPIPAESAEDEASE 
>gi|319978218|gb|AEUH01000138.1| GENE 14 10562 - 12256 1191 564 aa, chain - ## HITS:1 COG:no KEGG:Tcur_2680 NR:ns ## KEGG: Tcur_2680 # Name: not_defined # Def: CRISPR-associated protein, CSE1 family # Organism: T.curvata # Pathway: not_defined # 2 553 3 539 561 295 37.0 3e-78 MTDPDYNLLDEAWIPVRLLDGSVEKMGLLDVFRNTNEITDLACELPTQKIAIQRILLAVC YRVKPSHDDAEGEPLNADEDWLVQWRAGAPTDEALVYLERWRDRFFLFGGQRPFMQAPDL RTAKDAVSGLEKIIADIPNGEQFFTTRHTRAIARIPADEAALWLVHAHAYDPSGIRSGAV GDPDVKGGKGYPIGPSWCGQTGTVLLKGKNLDETLVLNLVPFDSGPLQGPDPAGDKGACT WEAEDIETAQRRDFSRDGDPDPAGISIPRLFTWHSRRIRLVGDRSGVTGVVLAQGDKLAP QNMQRFEPMSLWRYSMPQSKKFREHTYMPRKHEPGRALWRNLPGILPGLGVIEGFDKGKQ REFLPAQTLGFHEGIEQGPGAEYPVRVRVEAVGVTYGPQEATFEDLYHDELSFSTALLKE KGEELPHLIDGQVRCTEQVASRVGIFAANLARAAGQSGDGAGDGARDSAREQFFALVDDP FRRWLARLDGDSDVDAVRGEWHGEVRRIVRLLAKRLVDEAPPSATIGRETKGGFMSVGIA ENILWAAVNKTMPRTEDQLKEGEK >gi|319978218|gb|AEUH01000138.1| GENE 15 12253 - 15270 2322 1005 aa, chain - ## HITS:1 COG:STM2944 KEGG:ns NR:ns ## COG: STM2944 COG1203 # Protein_GI_number: 16766250 # Func_class: R General function prediction only # Function: Predicted helicases # Organism: Salmonella typhimurium LT2 # 167 720 135 658 887 213 32.0 1e-54 MARKVDGKVVEAPVVAAAPPPPEFPRAEPGLTAEGLSGLTRSFWAKSGKRNPADWLSVPQ HLMDAADVAGRLFDSYLSEHHRALLASVWGADQAKARASLVFLAGVHDVGKASLPFCCQH GPLADFVRAHGVRVPQRRDVPDRADLPHGLASRFAFVNAIAEGGGDRRRASKWATIIGVH HGRYPDGAQIKAARDAYNGELGRPQKEPRWEQSRQELIEWMARRSGFPLRPQIPSALPRL PIAVASAYASVVVIADWIASNEDYFPLRPQAGRERELIADEQWERVERGWGLADMPRPMT VPPTAGGSPESYYRHRFGWGPAITPTDAQIAAVRLAEEADPDLMIVEAPPGSGKTELAFA VAERMVRSRGLQGVFIGLPTQATTNAMYERATKWLTNLLGDSPQRLGIHLAHGKNDLNDA FVELLDRDRGYPVQVHDEEEERSGEDHSSLRASAWMAGRWRATLSPVVIGTIDQVLLTAL KSRHVLVRHLGLMGKAVILDEVHAADTYMGTYLKAALTWLGMYGIPVVLLSATLPAERRI ELAEAYRRGRCRREVDDSARLDGNIGYPVLTTVPRGGQEVDVHVVGGGGPEARRTILPLV AQSPQDLVPALDEALAEGGCAVVIRNTVKDAQATYDALAPHFGADGVTLLHSRFIASDRA ARDERMLRLFGKDSAARPRKHVVVATQVIEQSLDVDFDVMITDPAPMDLVLQRIGRLHRH PGRERPSGLREARCHVMVADTGSAPWAYSGGTDVVYERSHVLRALGILADRGRIGVERPG 
DYAELTQLAYSDEAVGPATWAGALQEAERKARNNANTAVDRARTWCLTGPRLPQWNADRL EESFVGNASTGDGAPKGRQAAAQAAVRDSEDQIPVLLVAVDPGMGCVPIKPPWQLDADGE TIPIDVSTWPSSGLVREMRTWSLSLPPWPFRENGKAIDEVVDAVACAIWDDEATRDWECL EHPLLRGELVLAMHKTDEGSTRLERDLLGCHLVYTQERGLEVRGR >gi|319978218|gb|AEUH01000138.1| GENE 16 15369 - 15572 230 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSYWISSPIGRVVVGNRETREYHRVANVNANCQLPEIVRSRRLVGFTPDSVAQAVAEGFD PCWYCSR >gi|319978218|gb|AEUH01000138.1| GENE 17 15821 - 16024 180 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTVQIQRLTPARTEMPGAATSVGSAGGRSSPVRPGENNARHRAAPSLAPPVSHESTRSLL TPPPARP >gi|319978218|gb|AEUH01000138.1| GENE 18 16031 - 17158 1389 375 aa, chain + ## HITS:1 COG:L119891_1 KEGG:ns NR:ns ## COG: L119891_1 COG1136 # Protein_GI_number: 15672696 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Lactococcus lactis # 61 256 21 223 290 73 28.0 5e-13 MGIRCQNGSMSNSEAQERASLRPARKPWWRSRKAHAAHGGGGRVLLGGLGLSYRDPSSGA PLLDGVDVAFRARRITAVLDPSGLRSRALFLVLAGLEDPDSGRVVSAKPRSARQRWAGRI GAVALLGARDTLDRDLTIRQNILAPLAATGSVADWDNLVGALELTGLRTKVDARPAELAP LERFKATVARAIVSGAEAFLVTEPPVLPAHDIEELVGLLRAVADAGCAVVVATRHPETAL RAHRVIMLANGTVSLDAAPPTLDAIDASLERSPEDPKNLLGPIPSASPLFYEAGAAGADD EDPGSVAWHSLNVVDPADPAPLSSVATYDEIDETATYTGAIPATTATGEDTDVDDLIVQA KRILSDLPGSIAPED Prediction of potential genes in microbial genomes Time: Thu May 12 18:05:22 2011 Seq name: gi|319978213|gb|AEUH01000139.1| Actinomyces sp. oral taxon 178 str. F0338 contig00139, whole genome shotgun sequence Length of sequence - 2548 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 508 580 ## Jden_2340 protein of unknown function DUF1469 2 1 Op 2 . + CDS 505 - 819 389 ## + Prom 902 - 961 3.8 3 2 Tu 1 . + CDS 1059 - 1313 344 ## gi|154508042|ref|ZP_02043684.1| hypothetical protein ACTODO_00532 - Term 1054 - 1098 5.1 4 3 Tu 1 . 
- CDS 1327 - 2460 701 ## PROTEIN SUPPORTED gi|126667548|ref|ZP_01738518.1| Ribosomal protein S7 Predicted protein(s) >gi|319978213|gb|AEUH01000139.1| GENE 1 2 - 508 580 168 aa, chain + ## HITS:1 COG:no KEGG:Jden_2340 NR:ns ## KEGG: Jden_2340 # Name: not_defined # Def: protein of unknown function DUF1469 # Organism: J.denitrificans # Pathway: not_defined # 23 167 6 157 157 94 38.0 1e-18 TRAYAPPAQAGPARQGPGPQAAPDPGPRAGQGAQPQTGPQARRKPSVGKLIADVTSQFSS IVRGEIELAKVQTATMFARIRTGLVLMAAAAVFALFLLGWILHTIEAALATVLPVWAASL IVVGLLAVVVAVLALLGSRALRKAQESKPDPKAGITEAVHIVKNGLTK >gi|319978213|gb|AEUH01000139.1| GENE 2 505 - 819 389 104 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTPTEPAAAPRTEAQIRADLEATRAALAASVDDLYEQLQPAAIVANTKAAAAEAIAGAR KSVRDTAAAAREGDSEAIKRIGIAVGATLVAIGLLTFAWSRRRR >gi|319978213|gb|AEUH01000139.1| GENE 3 1059 - 1313 344 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508042|ref|ZP_02043684.1| ## NR: gi|154508042|ref|ZP_02043684.1| hypothetical protein ACTODO_00532 [Actinomyces odontolyticus ATCC 17982] # 1 84 11 92 92 67 64.0 3e-10 MKVARKLKYFSPDTDLDALQRELAPEDMYEEQGGDGADTDVDEPDEEYDLYAKYADFATV ADEPDDIDPAQLDESFWTGKPSDK >gi|319978213|gb|AEUH01000139.1| GENE 4 1327 - 2460 701 377 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126667548|ref|ZP_01738518.1| Ribosomal protein S7 [Marinobacter sp. ELB17] # 6 347 8 337 354 274 43 5e-74 MTDTPLDYATAGVDTAAGDRAVELMKAAVARTHDDTVVGATGGFAGMVDASALLGMRRPL LATSTDGVGTKIAIARAMDAHGTIGQDLVGMVVDDIVVIGARPLLMTDYIACGRVVPERI AAIVEGIARACEATGTPLVGGETAEHPGVMEPDDYDIAGAATGAVDADRVLGPDRVADSD VVVAMASSGLHSNGYSLVRSVVARTGAPLEGRVGEFGRTLGEELLEPTRLYTRLCLDLVE RFGVGGDDRRGVHALSHVTGGGLAANLSRVLPSGSVASVDRGSWRVPAVFDWVRRGGSVS WEAMEDSLNLGVGMVAVVGEADAPAVVGAVRQAGVDAWVLGEVRSTAAGVGAGERLVRGA KGVDAGAVRLHGSYRVG Prediction of potential genes in microbial genomes Time: Thu May 12 18:05:37 2011 Seq name: gi|319978211|gb|AEUH01000140.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00140, whole genome shotgun sequence Length of sequence - 1035 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1035 1221 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase Predicted protein(s) >gi|319978211|gb|AEUH01000140.1| GENE 1 1 - 1035 1221 344 aa, chain - ## HITS:1 COG:MT0829 KEGG:ns NR:ns ## COG: MT0829 COG0034 # Protein_GI_number: 15840220 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Mycobacterium tuberculosis CDC1551 # 24 330 179 491 527 397 65.0 1e-110 EAAPNGAAQSAHPTPPAPPVDQEGACAPLVPAALKVLPRLKGAFSLVFMDEDALYAARDP HGYRPLVLGRLDSGWVVASETAALDLCGAALVREVEPGELVSIGASGVRSWRFAVSRAST CVFEYVYLARPDTTIGGRRIVAARHAMGAALARENPVDADLVMPTPDSGTPAAIGYAQES GIPFAQGLVKNAYVGRTFIEPTQSLRQLGIRLKLNPLREVIEGRRLVVIDDSIVRGNTQR ALVAMLREAGAAEVHVRISSPPVAWPCFFGIDFPTREELIASSMGVDQVRESIGADSLAY LSLEGMVESTGQGTSLCLGCFTGDYPEPVPPGAPLPSAPGAASC Prediction of potential genes in microbial genomes Time: Thu May 12 18:05:48 2011 Seq name: gi|319978185|gb|AEUH01000141.1| Actinomyces sp. oral taxon 178 str. F0338 contig00141, whole genome shotgun sequence Length of sequence - 28661 bp Number of predicted genes - 24, with homology - 22 Number of transcription units - 15, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 521 608 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase 2 2 Tu 1 . + CDS 876 - 3449 2896 ## COG0247 Fe-S oxidoreductase 3 3 Tu 1 . - CDS 3517 - 4749 1416 ## HMPREF0573_10558 hypothetical protein 4 4 Tu 1 . - CDS 6046 - 6627 561 ## COG0717 Deoxycytidine deaminase 5 5 Tu 1 . - CDS 6729 - 7325 705 ## Balac_1360 hypothetical protein 6 6 Tu 1 . - CDS 7444 - 8082 604 ## 7 7 Tu 1 . 
+ CDS 8256 - 9185 1001 ## COG1194 A/G-specific DNA glycosylase 8 8 Op 1 36/0.000 - CDS 9904 - 11268 177 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 9 8 Op 2 . - CDS 11265 - 12095 354 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 10 8 Op 3 . - CDS 12097 - 13155 1203 ## Cfla_2491 hypothetical protein 11 9 Tu 1 . - CDS 13342 - 14253 711 ## COG4823 Abortive infection bacteriophage resistance protein - Prom 14323 - 14382 2.4 12 10 Op 1 2/0.000 + CDS 14511 - 15086 576 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 13 10 Op 2 . + CDS 15215 - 16828 2169 ## COG0457 FOG: TPR repeat 14 11 Tu 1 . - CDS 17026 - 17640 407 ## COG0693 Putative intracellular protease/amidase - TRNA 17884 - 17957 59.0 # Gly CCC 0 0 15 12 Op 1 . + CDS 18098 - 18643 455 ## Krad_1131 NUDIX hydrolase 16 12 Op 2 . + CDS 18648 - 19286 830 ## Cfla_0170 transcriptional regulator, TetR family 17 12 Op 3 . + CDS 19340 - 19711 543 ## COG0526 Thiol-disulfide isomerase and thioredoxins 18 13 Op 1 13/0.000 - CDS 19814 - 20866 1294 ## COG0136 Aspartate-semialdehyde dehydrogenase 19 13 Op 2 . - CDS 21000 - 22316 1969 ## COG0527 Aspartokinases 20 14 Tu 1 . + CDS 22659 - 24353 2307 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 21 15 Op 1 21/0.000 - CDS 24373 - 25905 2153 ## COG0477 Permeases of the major facilitator superfamily 22 15 Op 2 1/0.000 - CDS 26184 - 27776 2286 ## COG0477 Permeases of the major facilitator superfamily 23 15 Op 3 . - CDS 27882 - 28475 877 ## COG0353 Recombinational DNA repair protein (RecF pathway) 24 15 Op 4 . 
- CDS 28476 - 28661 160 ## Predicted protein(s) >gi|319978185|gb|AEUH01000141.1| GENE 1 2 - 521 608 173 aa, chain - ## HITS:1 COG:ML2206 KEGG:ns NR:ns ## COG: ML2206 COG0034 # Protein_GI_number: 15828184 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Mycobacterium leprae # 1 136 63 199 556 147 56.0 7e-36 MWAPGEDVSRLTYFSLYALQHRGQQSAGIAASDGSKILVYKDQGLVNQVFSEQSLQGLRG HIAVGHVRYATTGADVWENAQPTLGPTPDGTVALAHNGNLTNTDELRALAASIAQEGEDF QHGATTDTSLVTALLGMAEGMPGPRPFVAAPSVLSPGGGAGAAGSDRRTAPAG >gi|319978185|gb|AEUH01000141.1| GENE 2 876 - 3449 2896 857 aa, chain + ## HITS:1 COG:ML2501 KEGG:ns NR:ns ## COG: ML2501 COG0247 # Protein_GI_number: 15828353 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Mycobacterium leprae # 9 842 7 725 880 483 38.0 1e-136 MADPTLSAIMWAFALLVTAIALVAFARGLAHMWRTVASGTPDPGRLRPVGRRLWGVVTTA LTHREFKGRPWVKAAHWLVMMSFPLLFLTLVSGYAQLRDQTWALPLLGHFIPWEWVTEAF AWAGLAGIVVLMAVRRAAGSGSPAEAALSNDDPDRAASAAAEAMGVEDPAVPSRLARPHP RDSSPQGLASRFLGSTRWQALFVEWVVLIVCACVVALRALESALLGATPGLEGHASALHF PLTAWLGSLFASAASSSAPALANAVVVVSAVKVITSMTWLMVVGVQTGMGVAWHRFVAVL NLYARRNADGTKSLGPADHMLVDGVPATSPDDFDELPEDAVLGVGTVEDFSWKARLDLYA CTECGRCQELCPAWNTQKPLSPKLLVMGLRDHMESASAKEIVTRSDGDAQSGEDSGTGAG PTGAGAAGEEDAVLLDKGVAPSPHSFDLVTALAASGATGPGGVLDVAGPLVPGVVSEEVL WDCTNCGACVEQCPVDIEHIDHILDLRRHQVLMESAFPRELARAFRGMESKANPYNQPAR KRMDWAKGLDFEVPVVGEDVEDASGVDYLFWVGCAGAYDDKAKATTAAIAELLHTAGVSF AVLGSGESCTGDPARRAGNEVLFQMLAAQAIETLTEASPKRIVVSCAHCFNTIAGEYPEL GGRFEVIHHTQLLNRLVREGLLTPVAASGPDGAPFDGADGAGAAGTGTAAPGSGAPLRVT YHDACFLGRHNRVYSPPRELVAALPGVELVEMPRNRDRAMCCGAGGAHAWFEETRGTRIA DARIAEAAQTGADVVATACPFCSQMLGSASGSSAGLVPGASDGQGAAGSGLPQVRDVAVM LLEAVRRGQDGRAPEGD >gi|319978185|gb|AEUH01000141.1| GENE 3 3517 - 4749 1416 410 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10558 NR:ns ## KEGG: HMPREF0573_10558 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 71 397 10 345 
345 112 30.0 2e-23 MKERLADAQPSLADALPTGTGDNPLTKTALGISTFTTTEGAFTIDSTGVITRSGSASPIA VSFDVQGPAAAIDVERVRDKHVLVLSDEVDPAELEALAVSMWDDAGWVGPGELRLSSSAT LRGAWSVDAAARRALGTPEALTQAWVLDCPRTRSEEVSESMGEWARAFPDGLPTGLEYRI LQALARMARRLAGALRLAPSGFVMEPDAESAVNLTVYSPRWIAPEDLLSALRDRGGFTGL ADARDLSPAGPSPQMEDAEAERIARLRQDLGPMREDIAARIERERQELAATGESQKVVEG YALVGPIGGRSEVMIEVHAVPSPPRVLRWEPWTASTIIEYQVRWLPGGSLDAPDATLSRS ARRERLRSASDIENAAGLIATMAGGNVIDEDGFLVGLEEIQEADDQEPQD >gi|319978185|gb|AEUH01000141.1| GENE 4 6046 - 6627 561 193 aa, chain - ## HITS:1 COG:ML2507 KEGG:ns NR:ns ## COG: ML2507 COG0717 # Protein_GI_number: 15828356 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidine deaminase # Organism: Mycobacterium leprae # 1 186 1 186 190 251 67.0 5e-67 MLLSDRDIKASLDSGRIRLDPLDRGLVQPASIDVRLDRLFRLFDNHRYAVIDPREDQSDL TRLVEVDPEKPFVLHPGEFVLGATYELVSLGADIAARLEGKSSLGRLGLLTHSTAGFIDP GFSGHVTLELSNTATMPILLYPGMKIGQLCFFDLSSPAEVPYGSEGLGSHYQGQRGPTPS RTHLRFTTTRIGR >gi|319978185|gb|AEUH01000141.1| GENE 5 6729 - 7325 705 198 aa, chain - ## HITS:1 COG:no KEGG:Balac_1360 NR:ns ## KEGG: Balac_1360 # Name: not_defined # Def: hypothetical protein # Organism: B.animalis_lactis_Bl-04 # Pathway: not_defined # 57 183 4 136 154 78 40.0 2e-13 MSSIDRRQGPGTAPRALGPASDSAHPSSSEPGGADGGAGARAQDHDWSRSYSAGGGPQYG QPDGYGQSRYGAPDWYTNPQYGQPDGYGQSQYGAPGWCASPQYGEPDGYGRFEYGQGGCH QPPRQLLIVVILALFAGLFGLHNFYLGHTNRGLVQLLVSVCTLGFAAPFVWLWAVVELIL IVTRSGSYAFDAFGRPLV >gi|319978185|gb|AEUH01000141.1| GENE 6 7444 - 8082 604 212 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTTPDPQAPASPSDPPVVEPYAPDGGSSSSAGGPGAPPQGGTMNPYAVPYMSPQGGATDP YASPQAGAADPCTEPDAQYRAAGQGYRAPAPAKSYASALLLHLFLSGVGAGDFYMGFKKT AFGKLVLNIAGIGMIVASIVMVGMIIADPSFGPYNAKLPTLDQVRALEGAYRLFHVGCWA LGLLVVWALISVIMVAARVGMYRADAKGVPLS >gi|319978185|gb|AEUH01000141.1| GENE 7 8256 - 9185 1001 309 aa, chain + ## HITS:1 COG:Cgl2617 KEGG:ns NR:ns ## COG: Cgl2617 COG1194 # Protein_GI_number: 19553867 # Func_class: L Replication, recombination and repair # Function: A/G-specific DNA glycosylase # 
Organism: Corynebacterium glutamicum # 12 302 5 286 293 204 41.0 2e-52 MIRTPEPAEAAALLDALTSWYDHNGRDVPMRADGVTPWGTLVFEVMSQQTPLVRVAPVWL RWMRLWPAPADLADAPTADVLVEWSTLGYPSRALRLQQCATRIRDAHGGAVPTDHAQLLD LPGVGGYTAAALASFQFHQRIAVLDVNIRRVASRVFDGIELPASSAPTKAERERAEAVLP EDGHECAAWNLALMEFGALVCTQRSPDCPACPIRERCRWAGRGYPRAAKRPRTQQWRGTD RQARGRVLAALRAAHSTAPRGRAGISRAEALRAATLPGAEEGQAQRALASLIADRLVAYD EASSRITLP >gi|319978185|gb|AEUH01000141.1| GENE 8 9904 - 11268 177 454 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 330 454 293 412 413 72 32 2e-12 MRALHQLFGALVEAWSEVKVQRARVVLSLVGVVAAVAAMSTVIALGDLIVQSSKEMTEAM DGRATTLHVTASKSGTDAGDSPATPAPAPSSSDTDPAAEDGQSPTAVPDPVGDAMATVAD RLSIPYWARVENGGVTLDEIWQSRQTGEFRGHPVVEPELGWQDPLVRAVDPAYDIIYRIK PLRGRWVEDADADQRLTPVVINATMWDCLGRAPIDERPIVLHSSDSDAQFRVVGVVSAKT PYEPPVMYAAYSAWANVRASASSGRSDEPDRSSVEMLVWVGPDQVDEARQIVPRALASVL GPGWEGTASGGDRDGMDSGSLGQIRTVIMVIGGIVVFLGALGLLNVAIVTVRQRVREIGI RRALGASAGRVFFAVFMESVVATFLAGVLGVAVAILVVRFLPLESMGIILQDKPAFPLGA ACDGLAISTGIGALCGIIPAFAAVRVKPIDAIRY >gi|319978185|gb|AEUH01000141.1| GENE 9 11265 - 12095 354 276 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 23 240 16 233 245 140 36 7e-33 MALVELRDVTRTVSLPDGEDLRILRGVTLDVEAGERVAIVGRSGTGKSTLLNIVGMLDRP TSGTYRVDGVDAVGLREGRRARMRGASFGFVFQQFNIFAARTAVENVEVPLLYEAGPGFW RRRRLATEMLERVGLGDRADSYPSQMSGGEQQRIAIARALVRRPKTILADEPTGALDPHT GGAVMDLLESVAEENDSALITITHDMAIAARSHRVFELFDGVLHRLGSAADFVGGSPQEP PPPPDDGPDGAGTSRTEAPGARGDGADGRDWAEEEQ >gi|319978185|gb|AEUH01000141.1| GENE 10 12097 - 13155 1203 352 aa, chain - ## HITS:1 COG:no KEGG:Cfla_2491 NR:ns ## KEGG: Cfla_2491 # Name: not_defined # Def: hypothetical protein # Organism: C.flavigena # Pathway: not_defined # 3 315 6 319 342 194 42.0 5e-48 MRVLDVVKVLIWAVIAVALVKFAFFPGSSAEEAQSLDPSGAYGQITTVVAKGSVKNTVTL TGTIEADEATQVRATLDGQVFRVLADDGAAVDEGDPLLEVRKEVPGEDTQTTDPEGNTTT 
QPGKPSFKSAVVRAPVSGTLRMSALVGQSFAVGDTVGTVSPGTFSAVASLDAEQLYRLQD PPSTATIAVKNGPAPFECSGLAIGAPQSPQGPQKEQDPSSSATPGGVRAKCAIPSNQKVF AGLKVTMDVTAGEATDVLVVPVSAVEGRYQQGTVYLPSDDGTQTKAQVELGLTDGRVIEV KSGLSEGEEILEFTPSARKEDAERGGWEPGPGGDHGNGFTGGQGAGDSGGRD >gi|319978185|gb|AEUH01000141.1| GENE 11 13342 - 14253 711 303 aa, chain - ## HITS:1 COG:PM1540 KEGG:ns NR:ns ## COG: PM1540 COG4823 # Protein_GI_number: 15603405 # Func_class: V Defense mechanisms # Function: Abortive infection bacteriophage resistance protein # Organism: Pasteurella multocida # 6 250 6 251 309 85 27.0 9e-17 MGASGKDWLDFDQQIDLLRERGLEVEDDEECIRFLQNVGYYRLSGYFRYWQKDPAHQGNT FKNGATFNQVRELYLTERRLADGLAGTLRKVEVALATRFAHAYGREVAEEGGLARGQGFT LPSNPETPPVDFFVCRDLDRSKEPFIEHYRDKIPGGVPSADAYDRLPVWVAVCVLSFGTL SRCLAASGESGVLNAVAESVGVKRRYFDSQVRSLVYLRNRIAHHARLWNHIVTNSPGMAP NIRRRMQQNRGAYAPQSVYEVCVVLSMLAEGLGVVDAGYLDRIDESLKSHDELNMGLRHP KKC >gi|319978185|gb|AEUH01000141.1| GENE 12 14511 - 15086 576 191 aa, chain + ## HITS:1 COG:CC2073 KEGG:ns NR:ns ## COG: CC2073 COG0454 # Protein_GI_number: 16126312 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Caulobacter vibrioides # 2 188 12 177 178 168 52.0 4e-42 MIRRATPDDANALSALASRAFSETFGHLYREDDLSAFLHEAYAPDVLRAELADPDRATWL LFPGSAPSGPDGPCGPDDADCPEGAGQAPVGYASARPAALPHAAVRPGDGEVQRLYVLSG HQGNGRGSALLATALHWLEREGPRPVWLGVWSGNPGAQRLYERHGFSRVGEYAFMVGSHA DHELIYRRPPL >gi|319978185|gb|AEUH01000141.1| GENE 13 15215 - 16828 2169 537 aa, chain + ## HITS:1 COG:all3838 KEGG:ns NR:ns ## COG: all3838 COG0457 # Protein_GI_number: 17231330 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. 
PCC 7120 # 24 421 286 687 710 155 37.0 2e-37 MPDSAADHSDPDSLFASGQEELDQGRLAEAETLLRQSLALYEAREGTELEQARCLYSMAT AIYYLGRLGEAEFHYRRALGLYASFEGAEGLQADCLKRIGTIAQSLGNHSQAEELYQQAL EYYRQYRGTEYIQATCLYRRACALARIGRPAEAEGLHREALAIYTRFEGTEHAQAECLND LANTVASVGRLDEAEDMYRQALRLYANLDDTAQRDKADCLYNLAHLVGAKGRCAEAAAFY REALALYEGVGGTEQDRADCLRGLGDAHRNLDRSKAEDFYRRALPLYQGIEGTARNQARC LNNLALVLVDGGRADEAVDLYRQALPLYTLPDQLGLPDAELRRAYCLYNLALATKELGRL SEAEDFFRQVLDLYGGLGVGDRDRAECLFDLADALRAMNRFVEAEDCFRQALGMFEGIEG AQRERANCLSSLGLTYTNLGRPSRAEAMFHKALGLYSAIGGAERDSASCCFGLGLALRAM GRLAESERSYRRAIGLFSAVPGTEEVRASCMGHLAGVVEAQGRAAEAEELRRQARSM >gi|319978185|gb|AEUH01000141.1| GENE 14 17026 - 17640 407 204 aa, chain - ## HITS:1 COG:CAC2826 KEGG:ns NR:ns ## COG: CAC2826 COG0693 # Protein_GI_number: 15896081 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Clostridium acetobutylicum # 6 198 3 197 201 161 43.0 8e-40 MAGARWTVAVVLFEGFELLDVFGPVELLSFLLSDYEITYHADEPGAVRSGQGARVLAPLP LEGAADIVVVPGGPGTRRLVGDRAFLDVFAPWAARAPVVASVCTGSAVLAAAGLLEGYRA TSNKRSFAWASGFGRDVEWVPRARWVHDRDRWTSSGVAAGMDMAHALIAHFSGPDAAAEA ARAAELEVRTDPHWDPFADPHHPG >gi|319978185|gb|AEUH01000141.1| GENE 15 18098 - 18643 455 181 aa, chain + ## HITS:1 COG:no KEGG:Krad_1131 NR:ns ## KEGG: Krad_1131 # Name: not_defined # Def: NUDIX hydrolase # Organism: K.radiotolerans # Pathway: not_defined # 4 163 2 129 132 90 42.0 2e-17 MNPRPVVAAAIVDSLRAPTRLLCAARAYPPQLRGRYELPGGKLEHGEAPLEGLAREIREE LSTAIRVGEQVRAPSPGAGDARRVGADGAGDGPSAGADSAASPGPSWWPILQGRVMGVWL AEVAPGSPAPTAGGSHCSLEWVPLDRVEALDWIGHDLDVVRAVVGACGEAPLATDGPGLS E >gi|319978185|gb|AEUH01000141.1| GENE 16 18648 - 19286 830 212 aa, chain + ## HITS:1 COG:no KEGG:Cfla_0170 NR:ns ## KEGG: Cfla_0170 # Name: not_defined # Def: transcriptional regulator, TetR family # Organism: C.flavigena # Pathway: not_defined # 1 194 1 194 201 164 47.0 2e-39 MPKIIGTTLADHRELTRRRLFNALGALLAEEPFDAITMSRIAQKAGVGRTAVYNHFADKE VLLLAYMRQVTTGFTEVLRRRLDEEPDPVRRLRVYLRAHMEMTDRYHLMSGVALRKHMSQ 
ENSSHLRDHAGVVGGVLLSILDDAVAEGAIPAQNTLGLVHLIHSAMAGQRLPRDPQERGA ALRMTEAFILRAVGVPEKKVAEVTSEGPRSLE >gi|319978185|gb|AEUH01000141.1| GENE 17 19340 - 19711 543 123 aa, chain + ## HITS:1 COG:Cgl2917 KEGG:ns NR:ns ## COG: Cgl2917 COG0526 # Protein_GI_number: 19554167 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Corynebacterium glutamicum # 1 116 1 116 124 144 57.0 4e-35 MATVALTQDNFEKTIAGPGIVLVDFWAPWCGPCRQFGPVFEAASEAHPDVVFGKIDTDQE KQLAMAAQIESIPTLMAFRDGVAVFRHSGALPQSALNDLIGQIGALDMDDVRAKIAAAAE QEV >gi|319978185|gb|AEUH01000141.1| GENE 18 19814 - 20866 1294 350 aa, chain - ## HITS:1 COG:MT3811 KEGG:ns NR:ns ## COG: MT3811 COG0136 # Protein_GI_number: 15843328 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Mycobacterium tuberculosis CDC1551 # 3 347 2 345 345 411 69.0 1e-115 MSGLNVAVVGATGQVGRVMRTLLEERDFPVGTIRFFSSARSAGTALPWKGEEVVVEDVAT ADYSGIDIAVFSAGATASREYAPRFAAAGAVVVDNSSAWRKDPDVPLVVSEVNPEDTADR PKGIIANPNCTTMAAMPVLGPLADAAGGLERLFVSSYQAVSGSGRAGVAELTGQIEAGVK QDPTGLALDGRAVFLPAPVVYAAPIAFDVVPLAGSIVDDGSGETDEEQKLRNESRRILHI PDLAVSGTCVRVPVFTGHTLTIHAEFSGPITPGRALEVLAGAPGVRLVDVPTPLEAAGRD EVLVGRVRQDQAVPGDRGLVLVVSGDNLRKGAALNAVQIAELVAAELLAR >gi|319978185|gb|AEUH01000141.1| GENE 19 21000 - 22316 1969 438 aa, chain - ## HITS:1 COG:ML2323 KEGG:ns NR:ns ## COG: ML2323 COG0527 # Protein_GI_number: 15828248 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Mycobacterium leprae # 1 438 1 421 421 423 55.0 1e-118 MALIVQKYGGSSVADTAAMKRVARRIADTHNAGHQVVVVVSAMGDTTDDLLDAAAELSSE PPEREMDILLSAGERISMALLAMAVGELGVEARAYTGAQAGIRTDSHFGAAQIIGMVPER VARAVRDGQVAIVAGFQGVSKDEDVTTLGRGGSDTTAVALAAALGADVCEIYTDVDGLFS ADPRIVPKAHRLRTLTQEETLEMAAHGAKILHLRAVEFARRYGVPLHVRSSFSEKNGTWI SDAPAHPDLKGLVPDAALAHPEEPTMEKPIISGIAHDRSQDKITVTNVPNNPGVAARVFA 
VVAEVGANIDMIVQNIPVSDPTKANITFTLPEDHARAALDALGEARAEIGYDELRYNPDI GKLSLIGVGMRTNPGVSARLFSALSNAGINIDLISTSEIRLSVVTRLEDLDRAVQAVHTA FGLDASETEAVVYGGTGR >gi|319978185|gb|AEUH01000141.1| GENE 20 22659 - 24353 2307 564 aa, chain + ## HITS:1 COG:Cgl1887 KEGG:ns NR:ns ## COG: Cgl1887 COG1080 # Protein_GI_number: 19553137 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Corynebacterium glutamicum # 1 558 7 561 568 449 50.0 1e-126 MTESLALYGTPVVHGVAYAPAVWVHRPPLPPQSAPALPAPHREDEVRRFMEAQSAVADSL FDKSLRATEQAADVLVVTASIATDRAWTRDVAERIRKGTPAVQAVMAATARFVVMFETAG GVMAERTTDLMDVRDRVIAHLQGMPEPGIPAVSEPFVLLADDLSPADTAGLDPSLCQAIV TRLGGPTSHTSIISRQLGIPCVVAARRLREIPEGEPVLVEAADGLLTTGVDPRVAHSLVE ADLQRRALVEEWRGPAATSDGRAVELLANVGDAAASEAAHEGGLAQGIGLARTELAFLKS TAEPGVTDQVRAYAQVIAPWRGSKVVVRTLDAGSDKPVAYATLPGEENPALGVRGIRTSG PYPSILANQLDAIAIAAMSAPETRVWVMAPMVSTIPEAQWFSAMVRQRRETYGVDLRAGI MVEVPSVAILIDQFLEAVDFVSIGTNDLTQYTMAADRASADLAEYADPWQPAPLSLVAQV ARAGARAGKPVGVCGEAGADPLLACVLVGMGVSSLSMASGAIPGVGAMLATVTMDQCEAA ARAVVGARDGGEGRYRARRALGLM >gi|319978185|gb|AEUH01000141.1| GENE 21 24373 - 25905 2153 510 aa, chain - ## HITS:1 COG:SMa0185 KEGG:ns NR:ns ## COG: SMa0185 COG0477 # Protein_GI_number: 16262550 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Sinorhizobium meliloti # 13 510 14 514 514 270 40.0 4e-72 MPVQSSSQAFTGRQWGILGVLTMAVLLLAIDNSILSLAVPSLSADLNPTANQILWIGDIY AFTIAGLLVTVGNIADRYGRKRVLLVGAAGFAAASLVCALAPTAEILILGRFLMGIFGAA IMPSTLSIVRNTFDDPGQRTRAIAIWSIGTTGGAALGPLLGGFLIEHFRWSSVFFINLPI MVLVLAVGVPMLRESYGNREASVDIASSVLSILTIVPIVYAIKQIAHGLFDTTLAIGLVL GLVSGVLFVRRQMSMEHPLLDLSLFRIPAFTGAVLGSAASVFALVGLVYFFSQYLQLVRG LSPLAAGLVEMPATVTGILAALAASRVLARLGRGHAIGFGGAFMTLGLGMVAVAASVSGV IMIMVGVGVLGFGAGMASTLTTDAIVSAAPRERSGAASAISETAYQLGAAFGIAILGSMQ 
SALYRWFVALPADFDQVHGSTAGDSLSGTLGVLDLADPADAAVAEAAKTAFTHAMQITTV FALVLIAASAAVVWRVIPSPRPDRGSPRDH >gi|319978185|gb|AEUH01000141.1| GENE 22 26184 - 27776 2286 530 aa, chain - ## HITS:1 COG:SMa0185 KEGG:ns NR:ns ## COG: SMa0185 COG0477 # Protein_GI_number: 16262550 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Sinorhizobium meliloti # 21 519 8 499 514 268 40.0 2e-71 MESDAPLNNSPASGGAPTARPEPFTPRQWGILGVLTMAVLLLAIDGSILSLAVPSLSADL DPTANQILWIGDIYSFAIAGLLVTVGNIADRYGRKRTLLVGAIGFAAASLLCALAPDANV LILGRFLMGIFGAAVMPSTLSIVRDTFDDPGQRTRAIAIWSIGTTGGAAIGPLLDGFLIE HFRWSSVFLINLPIMVLVLTAGVPMLRESYGNRRASVDIASSVLSILTVVPIVYVVKEAA HGGFGWPQALALVLGAASGWAFVRRQRRLDHPLLDLGLFRIPAFSGAVLTAGIAIFALVG LLYFFSQYLQLVRGMAPLTAGIVEMPATVTGILAALAAPRLLPRLGQGGAIALGTALLGV GMGIVAAAESVPGIFLILVGVGVLGFGSGLSMTLTTDAIVGAAPASRAGAASAISETAYE LGAACGIAVLGSVLSSYYRWFVALPDDFESTHGAGASDSLARTLEALGASGTAAADGVPP GGSDPALAEAVRLTFAHGMQATAVFAVLLCLAAALVAWRAIPSGPARTGE >gi|319978185|gb|AEUH01000141.1| GENE 23 27882 - 28475 877 197 aa, chain - ## HITS:1 COG:ML2329 KEGG:ns NR:ns ## COG: ML2329 COG0353 # Protein_GI_number: 15828252 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Mycobacterium leprae # 1 194 1 199 203 234 60.0 5e-62 MYEGALQDLIDEFGQLPGIGPKSAQRMAIHVLEADEDDVARLVDAINAVRTRVRHCEICG NVTEEPTCSICRDARRDPTKVCVVQEPKDIQAVEAARVFRGRYHVLGGVIDPIHGIGPEQ LRIAALMKRLGEGEIAEVILATNPNVEGEATAAYLARMLSTMGIETSRLAMGLPMGADLE YADAVTLGRALEGRLRV >gi|319978185|gb|AEUH01000141.1| GENE 24 28476 - 28661 160 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no AALRERALRAPSPGPRVGAADDDSASIDDEDISDSSVVGLAAVLEVLGGRVIEEKTTEGT Y Prediction of potential genes in microbial genomes Time: Thu May 12 18:06:26 2011 Seq name: gi|319978183|gb|AEUH01000142.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00142, whole genome shotgun sequence Length of sequence - 1059 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1058 364 ## gi|154508071|ref|ZP_02043713.1| hypothetical protein ACTODO_00562 Predicted protein(s) >gi|319978183|gb|AEUH01000142.1| GENE 1 2 - 1058 364 352 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508071|ref|ZP_02043713.1| ## NR: gi|154508071|ref|ZP_02043713.1| hypothetical protein ACTODO_00562 [Actinomyces odontolyticus ATCC 17982] # 22 266 525 804 967 145 48.0 5e-33 KAGSRADGADGGGQRGASSSADADMLRGRWTEVVERLASISRVTWSMVGGNGQLGAVDGS RVVVLFPVEAMVAAFTRGNRAGDVERAVREATGLTVTVSAQVGQAGGGQTVTGPSAQASQ SQGGARRWVSQPPPFDGAAPGAAERSDAGAPTATGAPSAMEEPAPRASDGWPEPAGVPGG EAPSGGGITGGRPAPTRARPDEGAPSGGGPAGGGTAADGAPQRSRRKRAFTVFTYPGDQE VPGNTPGALDGGSPAPAPGALDRESPAPAPGARDQGAPAPAPQWSGPARDGAPDAPPVSA RAPQPAPRDPSLPDEPAFVDGPDDDWADTTPSPEAPSGWSEAVAVPGGASVS Prediction of potential genes in microbial genomes Time: Thu May 12 18:06:55 2011 Seq name: gi|319978180|gb|AEUH01000143.1| Actinomyces sp. oral taxon 178 str. F0338 contig00143, whole genome shotgun sequence Length of sequence - 2731 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1232 1156 ## COG2812 DNA polymerase III, gamma/tau subunits 2 2 Tu 1 . 
+ CDS 1560 - 2159 260 ## Predicted protein(s) >gi|319978180|gb|AEUH01000143.1| GENE 1 2 - 1232 1156 410 aa, chain - ## HITS:1 COG:ML2335 KEGG:ns NR:ns ## COG: ML2335 COG2812 # Protein_GI_number: 15828258 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Mycobacterium leprae # 5 367 3 365 611 420 59.0 1e-117 MTTALYRRYRPDTFDEVIGQEHVTEPLKAALRANRVTHAYLFSGPRGCGKTTSARILARC LNCAQGPTDAPCGQCASCKELATGGSGSLDVVEIDAASHGGVDDARDLRERATFAPVRDR YKIFIIDEAHMVTNQGFNALLKLVEEPPEHVKFVFATTEPERVIGTIRSRTHHYPFRLVP PDVLGPYLKELCAQERVRVGDGVLPMVMRSGGGSVRDTLSVLDQLMAGAIDGEVSYATAV ALLGYTDSALLDESVDALAGGDGAGAYRVVERMVESGHDPRRFVEDLLQRLRDLLIIAVA GDGARDVLSDAPADQFERMQLQAKNWGPKGLSRAADLVDEALRSMTGATSPRLQLELLVG RILVPVAPAQPEGALEGAVGLVGGGSPHEAAPVGSDTAAGASGGGAQGKG >gi|319978180|gb|AEUH01000143.1| GENE 2 1560 - 2159 260 199 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAHLGTSMCIGVVLPQHIDVGEDEALALLAPRERRQKGARKRPGIGKATVRKMKGDDPSW WVLDVDFAALSLGVPLGLRPRAEASRAARRWRQARAVTAELAIGLGAAYAGVGEEAEVPV PAHVAADAERALAYAPQAWFFPANGPAWPHVGPLVEAGTALPPSATGLWVVASDPLTGRD DRTAQHRVASVLVRLAKGA Prediction of potential genes in microbial genomes Time: Thu May 12 18:07:09 2011 Seq name: gi|319978167|gb|AEUH01000144.1| Actinomyces sp. oral taxon 178 str. F0338 contig00144, whole genome shotgun sequence Length of sequence - 12495 bp Number of predicted genes - 12, with homology - 6 Number of transcription units - 6, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 346 - 377 -0.8 1 1 Tu 1 . - CDS 391 - 453 88 ## 2 2 Tu 1 . + CDS 437 - 1933 2014 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 1998 - 2057 11.4 - TRNA 2352 - 2444 60.1 # Ser GCT 0 0 - TRNA 2514 - 2600 57.5 # Ser TGA 0 0 3 3 Tu 1 . - CDS 2751 - 4091 1952 ## COG2942 N-acyl-D-glucosamine 2-epimerase 4 4 Tu 1 . + CDS 4481 - 6046 2257 ## COG0033 Phosphoglucomutase 5 5 Op 1 . + CDS 6473 - 7207 721 ## 6 5 Op 2 . + CDS 7225 - 7941 634 ## 7 6 Op 1 . 
+ CDS 8065 - 8358 126 ## gi|154508283|ref|ZP_02043925.1| hypothetical protein ACTODO_00779 8 6 Op 2 . + CDS 8355 - 9068 540 ## 9 6 Op 3 . + CDS 9065 - 9766 256 ## 10 6 Op 4 . + CDS 9833 - 10384 125 ## 11 6 Op 5 . + CDS 10436 - 11809 1453 ## Xcel_1639 hypothetical protein 12 6 Op 6 . + CDS 11806 - 12489 840 ## Xcel_1640 hypothetical protein Predicted protein(s) >gi|319978167|gb|AEUH01000144.1| GENE 1 391 - 453 88 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRILLMVAPHSNGIGQSYAL >gi|319978167|gb|AEUH01000144.1| GENE 2 437 - 1933 2014 498 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 44 350 437 721 744 182 32.0 1e-45 MSRIRIRFLRCIGAAAAALACAGLSPLAAMADSPDSGAASGGPAVSTAGASAPSSESAPQ SAPAAQNGDEARPAPQDPTAKPQDQTTDPQGPTTGPQGQGADPQAPDASDAGPADGAPAS SEEEGEPGPPTPATGKWVRHGAGWSYELPDGTLLKNGVFDVEGRRYAFTADGYVPVGWYK DPNGVWYMSTENGVRTGWYREGGTWYHFADNGAMDTGWLSTGGSSYFMTSSGAMATGWFQ VDGTWYCFESSGAMASAKWVWAGAWYYLTGSGAMASGWFIADGTWYYAGPDGAMRTGWVA DGGRWYYLHPSGALGTGWLQEGASWYYLDPDSGAMSTGWAVVSGSWNYFDRWKGFWVSGR ADFEADWNYAKTLYSPTNYLVVVDTGAPHCMTFYWADDSWQPLTDMPCSVGKPSTPTIKG TFTIKSRGPSFGHGYTAYWWTQFSGDYLFHSILYYEGTYTVKDGTLGGHVSHGCVRLRYE DAKWFHDTIPSGTYVTIY >gi|319978167|gb|AEUH01000144.1| GENE 3 2751 - 4091 1952 446 aa, chain - ## HITS:1 COG:ECs4803 KEGG:ns NR:ns ## COG: ECs4803 COG2942 # Protein_GI_number: 15834057 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Escherichia coli O157:H7 # 1 427 1 409 413 191 30.0 2e-48 MGWFDSLEHNRWLSTQMQALIAEAEGAIVPTGFAHLDAEGKPDPTRSIDLAVTGRMAYVF SLGVLMGLPGTRRYADHAVKALSTYFTDPVNGGMWVSIKPEAGADGHGVPWDEDGRVKSQ YHTVYALLGAAAAAVANRPGAHELLNSMLDEQKERWADDYGLVWDQYDEAFTAPVPVHTL GTLIHTIEAYMAAAEATTEPEWLDRAEKMAAFAHKIASKNGWRVHEYYDEEWNPSPGAGK LLNDGRRHYDGYVTGHSMQLARFALQVRSGLRSMGWKVPDYLLEMGTELFERARVDGWRR 
TDGCPGFATVVDDEGDPVPGEDEHQQWVVCEGVCATVAIRRAMLDDGARVSEVEHFEHCY RSFVDYIHDYLIGQPGRWVRRLGPRNESVQPAKSSRWDVYHAIQATLAIRVPLWPPTAPA LFRGLLDHPEEPAPDKKSWNFFGLRG >gi|319978167|gb|AEUH01000144.1| GENE 4 4481 - 6046 2257 521 aa, chain + ## HITS:1 COG:Cgl2489 KEGG:ns NR:ns ## COG: Cgl2489 COG0033 # Protein_GI_number: 19553739 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglucomutase # Organism: Corynebacterium glutamicum # 1 520 39 553 554 640 67.0 0 MVFGTSGHRGASLDGKFNEAHIVAITAAIVEYRASQGINGPLFIGRDTHALSEPAWRTAL EVLAGAGVDARIDSRGSYTPTPAVSVAILGANGAPGQLRTSGDGLADGIVVTPSHNPPRD GGFKYNPPHGGPADTDATSVIAARANELLESGQWREVPRVVGSDLLHHPSITPFDYLDFY VGQLDQIVDIEAIRAAGVRIGADPLGGASIDYWAAIGERYGLDLTVVNPKVDPAWPFMTL DWDGKIRMDCSSPYAMASLRQAMTPDAEGNTPYDIATGNDADSDRHGIVTPDGLMNPNHF LAVAIEYLFSHRPGWGEGCAVGKTLVSSALIDRVVASLGRELVEVPVGFKHFVPGLLSGT VGFGGEESAGASFLRKDGTVWSTDKDGIIMALLASEILAVTGKTPSQLHAEQVERFGASA YARIDAAASREEKAKLAALSADDVTATELAGDPITAKLVRAPGNDAPIGGLKVATEYAWF AARPSGTEDVYKIYAESFKGAEHLAEVQKAAKKLVSDALGA >gi|319978167|gb|AEUH01000144.1| GENE 5 6473 - 7207 721 244 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGKNHLCGTADDLRRAGARASLGASVVCALVCAVMFTIGSLQRGVLFALLGAMTLLITID DWHTLVVENSAPRRWRRVEGGYLLKTPGSLAVTHAVTGLGLVVLSPALLWFMALTRVVTP PVRVMLTPGLLTLFLKSAGAVVRPFLFRPFLPEIFVSPALLSVRCNSAHQWTAEWDERTE ITDSRGLSCVHMDNRVHSGWLIGMHGAPCGHSYLQRTVSLLARHPIERRALGTAEGAAVL AKIL >gi|319978167|gb|AEUH01000144.1| GENE 6 7225 - 7941 634 238 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFSNDWASSCLYQGSLLLLAFAFPFFLSDGAGPAARYDTCYRVGVTVWALIAFACNHVTR RMNVHPERYWRWGADGFHMSSPWALVVADWGATLLLTALGAAAFLPPAWAPGTGIHITVF GACAVPSGIGGLVFRLSQSHLRPRVLVNDTRITILPARGDAWTARWSDLPTVAPGDRRAA AALEEALEAQVMPAHFDRPTSRDAVIALIIQVARHPGERGRLASPEGIGALKEWTGRI >gi|319978167|gb|AEUH01000144.1| GENE 7 8065 - 8358 126 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508283|ref|ZP_02043925.1| ## NR: gi|154508283|ref|ZP_02043925.1| hypothetical protein ACTODO_00779 [Actinomyces odontolyticus ATCC 17982] # 19 97 368 440 440 65 41.0 1e-09 
MNGGRQRHQDGALWADHTAEVEVDGTRKTITIEAAHAYTVIAADDGGITISNPLGHNNEA DGSGNPSNNEPQVGGTFRMSWDDYRRCFMSYTYGRIP >gi|319978167|gb|AEUH01000144.1| GENE 8 8355 - 9068 540 237 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIPRKLIGYGAGVTCVSLFLLNDFLCTHLRLDSPPPGAQSCPSGIHIRKAGHSSHDPFGN IPIGSDSNGNYAWLNVADMEIVFGHQRAYARADFRGNGGHQESPAKLLNIGDTYTAPDIG AFTLEQVEPAGFVFGPAISRATFCFSPAPAFTVDPPILWRNKLIPDVEEPPHLWGGQDDA PPYYDAPDPSPGATTPQSGATPQSDTAPPSAPEQGAASTNPTADTDPPEPTGGEQIP >gi|319978167|gb|AEUH01000144.1| GENE 9 9065 - 9766 256 233 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNTRRLLGRGTCAACVSLFLLNDFLCTHLRLDSPPPGAQSCPSGIHITRVDTRSGEPRIG RTNGGDDVWLEVDSVAIQFGQQSSTLVVGVSGTTAQPKTVHHFTNAHESAAAPGVGTFSL EYVRPVGVVFGPKDPWAVFCFTPEPGFTVDPPILWRNKLIPDVEEPPHLWGGRDDAPYND PGPASGATAPPAPRNGAPPTTPQSDATPQSAAVPNAPAVGTAEGAGTTTTGGR >gi|319978167|gb|AEUH01000144.1| GENE 10 9833 - 10384 125 183 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLIAAIAASVSGGTLSYFTLYIGAVLPIPVMTGFQAHPPTWLRARHLWPPARPPFAHALV AQLCVLAWWMGGYTITALAIHLTRGVSRVTVTPYVIQLAVINAIGGAAWALLLAWIDAHR PADRDPGDGQEDTRAADRSGPLDDPGISPVDPFPRKEGPAPRTRFHQDPSSGGRSGKPHL PLE >gi|319978167|gb|AEUH01000144.1| GENE 11 10436 - 11809 1453 457 aa, chain + ## HITS:1 COG:no KEGG:Xcel_1639 NR:ns ## KEGG: Xcel_1639 # Name: not_defined # Def: hypothetical protein # Organism: X.cellulosilytica # Pathway: not_defined # 12 397 8 374 382 270 54.0 1e-70 MTRPQAPSENPRLPAGRLAFLLFAGICLLAGLDAALVRLAATAPVPSTDLGAVHGLLMVY GFLGTAICLERAVAVSTGSDRPTKWAYLAPLTTGAGGACAVVLALNAGLREAMAALPVPR LLAAHLAGFQSARMAPGLLITTGMVLLVCVYAHVWRRRQASYAVLVQMLGALIGLGGALA WWRGLEVAAIVPWWLMFLVVTIIGERLELARLNFLAGSTERRITAEACAVLLALPLTLFA PGAGYPVLGLALGAVAADTAWHDVARRTIRVPGVPRLAAASMLCGYGWALVPALMWVVAP PVFDGYGYDAAVHALTIGFVVSMVIAHAPVIIPAVAKRPVPYHPAMWIPFALLQASLAVR LLAGARGAAGAWRFGGALGVVGMLVFVLTTLTVTIRAARSATRAARAPSPSNEAGAAGGD HHPADAPSGTARAARDTAEAANGTTGTAPAPSSREDA >gi|319978167|gb|AEUH01000144.1| GENE 12 11806 - 12489 840 227 aa, chain + ## HITS:1 COG:no KEGG:Xcel_1640 NR:ns ## KEGG: Xcel_1640 # Name: not_defined # Def: hypothetical protein # Organism: 
X.cellulosilytica # Pathway: not_defined # 54 226 3 175 478 147 61.0 4e-34 MSPTIGPPPGSPAPPGGERPDPGATSAHAPSGATAKGGTIPIGSPDSPGSAGERPGAQGR PRVARTDRLTTVWLCLAVLAAGATTVFRTALPQPLWTTIHLVTLGVLTNGILQWSWYFAR ALLRLPPGDRHSGRDNTRRILALNAALVVLIASMWAEWAPGAVVGATGVGAVAAWHGLAL VVAARTRLASRFAVVLRYYAAAAAFLVLAAVLAGLVATAAFAASAPQ Prediction of potential genes in microbial genomes Time: Thu May 12 18:08:28 2011 Seq name: gi|319978161|gb|AEUH01000145.1| Actinomyces sp. oral taxon 178 str. F0338 contig00145, whole genome shotgun sequence Length of sequence - 4391 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 551 497 ## Xcel_1640 hypothetical protein 2 1 Op 2 . + CDS 544 - 1986 2059 ## COG2132 Putative multicopper oxidases 3 1 Op 3 . + CDS 2081 - 2563 717 ## KRH_01810 hypothetical protein 4 2 Op 1 . + CDS 2717 - 3913 1331 ## COG0477 Permeases of the major facilitator superfamily 5 2 Op 2 . 
+ CDS 3948 - 4389 446 ## COG4850 Uncharacterized conserved protein Predicted protein(s) >gi|319978161|gb|AEUH01000145.1| GENE 1 3 - 551 497 182 aa, chain + ## HITS:1 COG:no KEGG:Xcel_1640 NR:ns ## KEGG: Xcel_1640 # Name: not_defined # Def: hypothetical protein # Organism: X.cellulosilytica # Pathway: not_defined # 7 178 264 437 478 112 46.0 8e-24 VVGLAAAAGIAYPLARAAMRKGPSEYGSWSMCLGTAWILVGVAVVAQHAWATGTGEQLRA ANLPWLPIIGAGGLGQVFIGALTYLMPVVIGGGPRAVRVGVGALEAAAPMREAARNAALV LLAAATASQAPAAAPLATACWAVVLATYLADIALMARAGTAQARARTAPPSPSPSNTGGP RG >gi|319978161|gb|AEUH01000145.1| GENE 2 544 - 1986 2059 480 aa, chain + ## HITS:1 COG:NMB1623 KEGG:ns NR:ns ## COG: NMB1623 COG2132 # Protein_GI_number: 15677473 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Putative multicopper oxidases # Organism: Neisseria meningitidis MC58 # 200 478 59 345 390 137 33.0 3e-32 MADQKPRPKNPAGDADPLSRTAMGVMGLVCVAALALAVFLSNPSSPGAQTAAGTGTAAGG AAQVAPTGHTTEVAVGVSGFAFTPNRIEVPAGDRLVINFTNTGDQRHDLVLPNGAESGAL APGASATLDAGVVSADMEGWCSIPAHREHGMTLTITAVGAPSAAGGGDHSGHADHSGHAG HQAATGGPATAQELEEHAAAVEARDPVLPPADPATERHYTFTATEGTIDVTDTLTRAHWT FNGTSPGPILRGRIGDTFHITLVNNGTMSHSLDFHAGLVSPDQNMRSIDPGETLEYTFTA ANAGIWLYHCSTAPMSMHIANGMFGAVVIDPTDLGAVDHEYVMIANELYLGQDGQSADAS LLSALQPNAMAFNGTPFQYKAHPIQVKTGERVRVWVMDAGPNLPLTYHVVGTQFDTVWRE GAYVIRGGGSGGGWGQVLALGAAEGGFVEFTPLEAGHYTFVNHALSLAEKGQTGVFEVTD >gi|319978161|gb|AEUH01000145.1| GENE 3 2081 - 2563 717 160 aa, chain + ## HITS:1 COG:no KEGG:KRH_01810 NR:ns ## KEGG: KRH_01810 # Name: not_defined # Def: hypothetical protein # Organism: K.rhizophila # Pathway: not_defined # 5 159 8 162 164 139 47.0 3e-32 MGSVILRPLFATETDLLERATLGNINWCGQRFTMDDVRRTPEFAHYTRPDPLRGDFGIVA ERDEEALGVVWALFLPADHPGFGFVDETTPELSLWVDEAERRRGIGRSLMGAALTEARER GAAQISLSVEEGNFSKDLYTSLGFADVPGLEANGVMLLRL >gi|319978161|gb|AEUH01000145.1| GENE 4 2717 - 3913 1331 398 aa, chain + ## HITS:1 COG:PA2269 KEGG:ns NR:ns ## COG: PA2269 COG0477 # Protein_GI_number: 15597465 # Func_class: G Carbohydrate 
transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Pseudomonas aeruginosa # 1 398 1 396 401 205 39.0 2e-52 MLSPYLGVLRLERAWRFSAAGLVLRLPMSMVGISIILLVRAQYGSYALAGAVSAVNIIAT ACCAPALARLVDRRGQSRVMGPSWAVSSAAMAALLACAAVRAPEWLLFVSGAVAGATWGA PGALVRSRWAAVLDRADQLTTAYAYESAMDELVYILGPVLSTVLGTLVHPGAGVLLSVAF LAVGGALFLSGRDSEPEPTGGGEHGGRGSVIRMPAVAVMALTYIGMGTMLGANDVAVVAF AAERGAPAMSGVLLAVSSAASLLAGLVYGARSWRWPTWRLYKVCVVAIALGATAYLAASG LWTMALVMSVTGMAYAPTMTNVNMIVQKTVPPHRLTEGLTWMSTSLNMGASLGSALAGPA VDAGGSYGGYLVTVGSAWTMVLLMTVGVGALGKATAED >gi|319978161|gb|AEUH01000145.1| GENE 5 3948 - 4389 446 147 aa, chain + ## HITS:1 COG:Cgl2204 KEGG:ns NR:ns ## COG: Cgl2204 COG4850 # Protein_GI_number: 19553454 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 38 130 26 125 344 85 48.0 2e-17 MPPTPLFDDESPSLLLSLAAGTSELASVVLAHALAGCGFHPSIVGFGGIGSLRSARVLGR VLMDRDLLQRSWLSGRRGWRQFFNAQVPRQPVLVTVGRSRRLTFADRGGYVDLVVNGHGL PPGWHDATIQVLHQADVRALGLREGDA Prediction of potential genes in microbial genomes Time: Thu May 12 18:08:36 2011 Seq name: gi|319978157|gb|AEUH01000146.1| Actinomyces sp. oral taxon 178 str. F0338 contig00146, whole genome shotgun sequence Length of sequence - 3498 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 824 1137 ## COG4850 Uncharacterized conserved protein 2 2 Tu 1 . - CDS 883 - 1485 742 ## Arch_1654 hypothetical protein 3 3 Tu 1 . 
+ CDS 1656 - 3227 750 ## COG1960 Acyl-CoA dehydrogenases Predicted protein(s) >gi|319978157|gb|AEUH01000146.1| GENE 1 3 - 824 1137 273 aa, chain + ## HITS:1 COG:Cgl2204 KEGG:ns NR:ns ## COG: Cgl2204 COG4850 # Protein_GI_number: 19553454 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 22 242 122 335 344 172 41.0 4e-43 EGDALRLAAPAPGAPAHPDGHVRRRVRAGKPVGVTIRIVGDSEDFGIVSDVDDTVIVSML PRLLTAAKHAFVDRVSSREAVDGMAEFLTDAAQSHTTAPNPSHAPVLYLSTGAWNVVPAL RSFLERSHFPTGCFLMTDFGPSNTGWFRSGPEHKRRELRRLSRMLPHVRWLLVGDDGQHD PEIYAEFARECPHQVAGIAIRSLSEFEQFMSHGTFDAMVPDALWTVPEQIPVWYGSDGHA LLANVRGRGGLPRLASLIGTDEGSGPDGDGAGA >gi|319978157|gb|AEUH01000146.1| GENE 2 883 - 1485 742 200 aa, chain - ## HITS:1 COG:no KEGG:Arch_1654 NR:ns ## KEGG: Arch_1654 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 3 179 7 183 190 82 31.0 8e-15 MTPRQRYLMRRQRRQTAVFSITGLSLAVIALVGTLVLIGVIPIPFGNSFSASVKYAEEGD LVCPSVGAYPSPVEDVNVKVVNTTAHQGLATDATEMLISAGFTPQDPDNSTTEYGGKVRI FAGVSGVDDAYTVARYFPGAKVVLTDATDKAVTVELGNFYDSPIDAEEVAHTSRSKDPLV RPEGCLPVPPNGLQSGDASS >gi|319978157|gb|AEUH01000146.1| GENE 3 1656 - 3227 750 523 aa, chain + ## HITS:1 COG:AGpT133 KEGG:ns NR:ns ## COG: AGpT133 COG1960 # Protein_GI_number: 16119880 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 376 15 391 401 238 39.0 2e-62 MHFLSDDLLARIRSRAGEVDATNSFPAADLADLRGAGYLGAFVPAEFGGSGLGLTAIAAE QTRLAAAAPGTALGINMHQIIVGLGRFLVAHGNPRGEQILRDAAAGEVFGFGISEPGNDL VLFGSTTKASPAPGGGYSFEGTKIFTSLSPAWTRLLVFGRADLDEGPKSVFGLVHRDDPG YSIVDDWDTLGMRATQSMTTRLEGVAVPGDRILTVTDPGPSEDPVVFGIFAHFEILLAAT YQGVGERAVQVAAEHVKARRSVKNRTTYSNDPDIRWRIAEAALAMNAVGPQIRELARDVE AGADRGAAWMPQLSAVKNAAAEATLRAVEQAMRACGGSAYYNSHELSRLYRDALAGIFQP SDQESLHGAWANLVLGARRKLRERPAAGPRPSGRPDRAGRPSAWGAAAAAAPPFRCRGPR PGGRRPSARPRRRARRGAVPACWQHRGRRGGCAPGAATTPPPTSACRRPRPPRPSPSPWS PSRSSRGRRGCGRRTGPSRSRRPGRSGRAGPPRASSTSGTGAP 
Prediction of potential genes in microbial genomes Time: Thu May 12 18:08:41 2011 Seq name: gi|319978154|gb|AEUH01000147.1| Actinomyces sp. oral taxon 178 str. F0338 contig00147, whole genome shotgun sequence Length of sequence - 2588 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 78 - 1763 1262 ## gi|293191984|ref|ZP_06609344.1| conserved hypothetical protein 2 2 Tu 1 . - CDS 1884 - 2588 706 ## COG1180 Pyruvate-formate lyase-activating enzyme Predicted protein(s) >gi|319978154|gb|AEUH01000147.1| GENE 1 78 - 1763 1262 561 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293191984|ref|ZP_06609344.1| ## NR: gi|293191984|ref|ZP_06609344.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 1 156 1 156 414 212 80.0 6e-53 MADELHFDDPLRAAADPATDAEALRQLAYRYPETRAAVAAHPRAYRGLLDWLHQFGDPAV NAALEAREDYDGYIDSNGFLVMSGDVSGAVAGSQIRTAEHGMYVLGQGGVNSYSQVERTT PYSAVAGGASPKPEVQSREQVVAQVRRTSIFGNRTAGSGGQEAQAASASGVDAPTARTSS VETAVSRTPAAGTPAVDAPTARMPAADAGATRVMGGSPEGATAVFPKSGSPEGATAVFPK SGSPEGATAVFPKSASPAGAAATMPLPTADPRQQATRTMPGAVGRDSGGPYAAPGASPYR APSSTSQYAPSGGASASGYPAAGRRSSYSAASTATRAPAQQAPASGSDYDGGGDGAAPKK RKGGPSALGIIFSALVIVAIALLAVVIYVFTRGFETPNSSPTTPPAAQAAPTTASPTTPS PSPSTTEAIRYPAPANAQQLTAFATPSGNISCSFNSTGVSCVINSNDWAAGNYASCSGSS HGTLSISGDSAVQSCGTSGATAATGLTYGQYAANGDYACSSTADGVSCWNTKSGASFALA RGGWMTGSGGEIAPAQYSWNQ >gi|319978154|gb|AEUH01000147.1| GENE 2 1884 - 2588 706 234 aa, chain - ## HITS:1 COG:PA1919 KEGG:ns NR:ns ## COG: PA1919 COG1180 # Protein_GI_number: 15597115 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Pseudomonas aeruginosa # 4 234 5 232 232 142 42.0 4e-34 ADDLQIAGLVPMSTVDWPGELVASLFLQGCPWACPYCQNAAIIDPRVPGVVAWDAVERLL ARRRGLLDGVVFSGGEATRQIALAPAMRRVRELGFAVGLHTAGPYPARLGALLDEGLVDW VGLDIKATPANYPAVAGRPGSGGRAWEALRVLMDHPEVDHEVRLTVFPGGSDDGLEVARA 
VRQAGARSFALQQARQTGAPDGFVARAPGWDDQVRALDRDIGALGFDAYEFRGA Prediction of potential genes in microbial genomes Time: Thu May 12 18:09:13 2011 Seq name: gi|319978150|gb|AEUH01000148.1| Actinomyces sp. oral taxon 178 str. F0338 contig00148, whole genome shotgun sequence Length of sequence - 3734 bp Number of predicted genes - 5, with homology - 3 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 75 68 ## 2 2 Tu 1 . - CDS 57 - 1916 2986 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase 3 3 Tu 1 . + CDS 1831 - 2040 174 ## 4 4 Op 1 . - CDS 2202 - 2741 813 ## Sked_10640 DoxX protein 5 4 Op 2 . - CDS 2821 - 3732 1110 ## HMPREF0573_10842 hypothetical protein Predicted protein(s) >gi|319978150|gb|AEUH01000148.1| GENE 1 1 - 75 68 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no SAVRCAHSRPGPAQWAGGGHALAP >gi|319978150|gb|AEUH01000148.1| GENE 2 57 - 1916 2986 619 aa, chain - ## HITS:1 COG:RSp0963 KEGG:ns NR:ns ## COG: RSp0963 COG1328 # Protein_GI_number: 17549184 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Ralstonia solanacearum # 9 560 100 651 683 893 73.0 0 MTASAALTIDPVATVDEYLDRSDWRVNANANQGYSLGGLILNSAGKIIANYWLEKVYTPQ ISAPHREGDYHIHDLDMFAGYCAGWSLKRLIQEGFNGVTGAIASAPPRHFSSACGQIVNF LGTLQNEWAGAQAFSSFDTYMAPFVRLDAMDYDEVRQCMQELIYNLNVPSRWGSQCPFTN LTFDWTCPDDLADEHPMIGEEVVDFTYGDLQPEMDVINRAFIDVMSQGDADGRVFTFPIP TYNITPDFEWEGENVGALFDMTAKYGLPYFQNFINSDLDPHMIRSMCCRLQLDLRELLKR GNGLFGSAELTGSVGVVTLNMARLGYLHKGDEAALVARMDELIDLASASLEIKRATIQEQ MDRGLFPYSRRYLGTLDNHFSTIGVNGMNEMVRNFSDDAYDLTDPRGFDMCVRLLDHVRE RMVQLQESTGHLYNLEATPAEGTTYRFAKEDRKRYPGIIQAGTDEQPYYTNSSQLPVAYT DDPFQALEDQEVLQGKYTGGTVLHLYMGERISSGAACREMVRRSLTAFKVPYITITPTFS ICPVHGYLAGEHFTCDKCAAARPGREPQACEVWTRVMGYFRPVQSFNIGKKGEYNERQMF SEGAADAHGELVGAYGARA >gi|319978150|gb|AEUH01000148.1| GENE 3 1831 - 2040 174 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MALTRQSERSRYSSTVATGSMVKAALAVMFHLVLYVFRILAGGWGHLALVPGPTKSSGSN QQRTLYLVP >gi|319978150|gb|AEUH01000148.1| GENE 4 2202 - 2741 813 179 aa, chain - ## HITS:1 COG:no KEGG:Sked_10640 NR:ns ## KEGG: Sked_10640 # Name: not_defined # Def: DoxX protein # Organism: S.keddieii # Pathway: not_defined # 10 174 18 181 186 144 51.0 2e-33 MSDKTPRAAGQAVASAPGRFVLALLRISVGFVFLWAFLDKAFGLGFPTPSRRAWIHGGEP AQGYLRGALDATKPFTPFFQSLATPLADWLFMSGLLFVGAATVLGVASRLAAMAGVVMMV MMYLAEGTFVVGSANPVVDSHLVYALVLVLVALLGAGDTLGLGKWWKRLGPVKRVPLLV >gi|319978150|gb|AEUH01000148.1| GENE 5 2821 - 3732 1110 303 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10842 NR:ns ## KEGG: HMPREF0573_10842 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 27 222 16 216 285 114 41.0 6e-24 RAPGRPGGRAAAQDGADAWYRRKLARRIVALLACVALWLLLSYAAISTAAGQVLDTLLME ATMRATGRLVSFTSVVTGVVSVPAMVVAGVVVALVAVARKRPTLAGRALGMVIGANVTTQ LLKDMISRPDLGMTTGISNSLPSGHSTVAVTLSLALVAIAPQWLRAPSAWIGWAWTSLMG VSVMMAGWHRPADVVVAVLIAGAWALALSPIERRARHGKRVSKVMLWAVLAAGALAFSLT LVGLWGFSMSAGAPGSGYGFNDFLSVRPWRSLVLGMGACAWIVAVVGAVVREVDLLADDS RSA Prediction of potential genes in microbial genomes Time: Thu May 12 18:09:30 2011 Seq name: gi|319978147|gb|AEUH01000149.1| Actinomyces sp. oral taxon 178 str. F0338 contig00149, whole genome shotgun sequence Length of sequence - 1294 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 130 101 ## 2 2 Op 1 . + CDS 252 - 698 539 ## gi|293192028|ref|ZP_06609352.1| hypothetical protein HMPREF0970_01696 3 2 Op 2 . 
+ CDS 695 - 1195 750 ## Cfla_1554 hypothetical protein Predicted protein(s) >gi|319978147|gb|AEUH01000149.1| GENE 1 1 - 130 101 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGDPVPVPDPSLDDAYGSAPIDGGYGSASRPPRPARRARPAQP >gi|319978147|gb|AEUH01000149.1| GENE 2 252 - 698 539 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293192028|ref|ZP_06609352.1| ## NR: gi|293192028|ref|ZP_06609352.1| hypothetical protein HMPREF0970_01696 [Actinomyces odontolyticus F0309] # 15 148 1 134 138 209 72.0 5e-53 MATSTPVNTDRIRELLTSRNVRFGDYNDSELVYLAPNAAFFWNATNPQILQLRAQWRGIA STKAQFTALVEEVGRCNATRSGPKAYLAPLEDGIRYGLGAECNIIVMSGLTPAQLDTFFE TAMSMIMSFFTDVEAALPDFVDWNEEER >gi|319978147|gb|AEUH01000149.1| GENE 3 695 - 1195 750 166 aa, chain + ## HITS:1 COG:no KEGG:Cfla_1554 NR:ns ## KEGG: Cfla_1554 # Name: not_defined # Def: hypothetical protein # Organism: C.flavigena # Pathway: not_defined # 10 140 22 150 173 68 32.0 1e-10 MNTPFAVPDPEDDATPYPVTLDRVLASVRAMGYQLDVVEEGRSVGAIFDSIPFLVSFDSG GRFMSVRGAWMTGLDGAEAQHPMFAAADNWNREKYFPTVYTLVSEDGSLGVFADLTVDTK AGLTEAQLRDAVGSGISTGVSAIQYMKEAAAKTLGWSGPSGSDADG Prediction of potential genes in microbial genomes Time: Thu May 12 18:09:45 2011 Seq name: gi|319978145|gb|AEUH01000150.1| Actinomyces sp. oral taxon 178 str. F0338 contig00150, whole genome shotgun sequence Length of sequence - 992 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 14 - 992 931 ## COG1205 Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster Predicted protein(s) >gi|319978145|gb|AEUH01000150.1| GENE 1 14 - 992 931 326 aa, chain + ## HITS:1 COG:Cgl0307 KEGG:ns NR:ns ## COG: Cgl0307 COG1205 # Protein_GI_number: 19551557 # Func_class: R General function prediction only # Function: Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster # Organism: Corynebacterium glutamicum # 3 324 28 308 785 192 39.0 9e-49 MELPGRPERASDWPEWVHPLVRGAWARRGVERPWSHQREALDATASGADVVVATGTGSGK SLAAWTPLLSDLAGAGSTTRISAVHRRPTALYLSPTKALAADQCASLEALMEGGPRLASA STCDGDTPREAKEWARANADALLTNPDYLHHVMLPAHGRWTRVLASLRYIIIDELHHWRG VTGSHIALVVRRLLRACHRLGADPRVIMLSATVRDPALVGAAMTGRPATAVTRDGSPAGP RHLALWQGGLVDDGSGAEPSSSGALASPAPFSSPGDPASQGALADGDRVLGASVVRRSAG AEAAGLTARLVEEGARLLAFVRSRAG Prediction of potential genes in microbial genomes Time: Thu May 12 18:09:46 2011 Seq name: gi|319978141|gb|AEUH01000151.1| Actinomyces sp. oral taxon 178 str. F0338 contig00151, whole genome shotgun sequence Length of sequence - 2792 bp Number of predicted genes - 5, with homology - 2 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1464 1586 ## COG1205 Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster 2 2 Op 1 . + CDS 1691 - 1801 119 ## 3 2 Op 2 . + CDS 1792 - 1917 112 ## - Term 1863 - 1901 -0.1 4 3 Op 1 . - CDS 1936 - 2226 91 ## 5 3 Op 2 . 
- CDS 2285 - 2791 424 ## gi|293192035|ref|ZP_06609359.1| membrane-spanning protein Predicted protein(s) >gi|319978141|gb|AEUH01000151.1| GENE 1 1 - 1464 1586 487 aa, chain + ## HITS:1 COG:Cgl0307 KEGG:ns NR:ns ## COG: Cgl0307 COG1205 # Protein_GI_number: 19551557 # Func_class: R General function prediction only # Function: Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster # Organism: Corynebacterium glutamicum # 1 480 312 776 785 357 45.0 4e-98 EAVAAQVRDRLSSRGSPLAGRVGAYRGGYLPEERRALEAAIRSGRVRALATTSALELGLD ISGLDATVTAGWPGTRASLWQQIGRAGRAGRAGVSVLIASENPLDAYLVRHPEDILAEVE AAVIDPANPWVLAPHLCAAAAEAPLTEADSAYFGPGLADVVGALERDGLLRRRPAGWFWD ATRPERPSDLTDLRGGGGDVQIVDPAGVVIGTIDEASADAHVFPDAVYVHQGRTYHVLSL SSVTGAAGPVGWGRAPVGAPPLVAPVRPGDQRVAVVEEVRTPLRTRASTHTSVAIRGVEA SWTSPDGLLTWCFGPTDVSTRVTDYDLLRLPGLEFIRNTELAMPTRTLPTRSAWFQLERG AEAVLGIGAADLPGALHAAEHAMIAILPLIATCDRWDLGGLSTQSHDDTGRPTVFVHDAF RGGAGHTRSGYARAAQWIGAALEAVRECPCEDGCPRCVQSPKCGNGNEPLSKAGAVALLG FVLERCP >gi|319978141|gb|AEUH01000151.1| GENE 2 1691 - 1801 119 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRAVHGPAGEPRRVFQNASEFRAEMVARRALVPKCV >gi|319978141|gb|AEUH01000151.1| GENE 3 1792 - 1917 112 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLSLPDFARITPKRLNPDAFWTRPPQNHLRTAEPRRILDQ >gi|319978141|gb|AEUH01000151.1| GENE 4 1936 - 2226 91 96 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYTNKDAKPQGFRGKGSSSEDLHCVDGMLEGLGRKAGPFRFLLRPIPLGALLRQVVGSVE EASSAARTPVPPPVERGAGTGGFSCGPPLGAPKERA >gi|319978141|gb|AEUH01000151.1| GENE 5 2285 - 2791 424 168 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192035|ref|ZP_06609359.1| ## NR: gi|293192035|ref|ZP_06609359.1| membrane-spanning protein [Actinomyces odontolyticus F0309] # 52 167 9 124 125 72 52.0 6e-12 GASDRRGPGPWAASGHRSGAAFHSWDCATTDGTGGRPSATGPEHCPATAAAADERGSGSL YAVGVLALVVAVVVVVVGVGQAFAARTRLQAAADLSALAGAEAAAVAAWEDVGDGPCSAA ARVASANGASTEKCDLHGSDCRVELVRRVAIVGIPVQVRARARAGVEP Prediction of potential genes in microbial genomes Time: Thu May 12 18:10:09 2011 Seq 
name: gi|319978136|gb|AEUH01000152.1| Actinomyces sp. oral taxon 178 str. F0338 contig00152, whole genome shotgun sequence Length of sequence - 2421 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 103 - 594 322 ## gi|154508102|ref|ZP_02043744.1| hypothetical protein ACTODO_00595 2 1 Op 2 . - CDS 666 - 914 398 ## gi|293192037|ref|ZP_06609361.1| putative membrane protein - Term 1196 - 1248 1.3 3 2 Tu 1 . - CDS 1370 - 2335 1185 ## COG0604 NADPH:quinone reductase and related Zn-dependent oxidoreductases Predicted protein(s) >gi|319978136|gb|AEUH01000152.1| GENE 1 103 - 594 322 163 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508102|ref|ZP_02043744.1| ## NR: gi|154508102|ref|ZP_02043744.1| hypothetical protein ACTODO_00595 [Actinomyces odontolyticus ATCC 17982] # 51 163 5 112 112 91 55.0 2e-17 MDGMKGRSSWASGRTGFRTGAGTGRRASCKSGGQPADGTGRHRARPLRGAERGSVTVEHA IGLVAVVGMIGLVVCAAQAGITSTALCQAVRDGARAASIGGKDPRAVAAASFSAVSSAPA SFAVAYDEEWVEVSGSATYRGPIGWVGGTATCAARTLAEGTTP >gi|319978136|gb|AEUH01000152.1| GENE 2 666 - 914 398 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192037|ref|ZP_06609361.1| ## NR: gi|293192037|ref|ZP_06609361.1| putative membrane protein [Actinomyces odontolyticus F0309] # 1 82 1 90 90 72 58.0 9e-12 MPIRNRIECGLIEARSRLVETASRLVRPQPGDTDGEEGATTVEYAIGTIAAAGFAGLLIL ILKSDTVRSALEGLIQEALATR >gi|319978136|gb|AEUH01000152.1| GENE 3 1370 - 2335 1185 321 aa, chain - ## HITS:1 COG:lin0622 KEGG:ns NR:ns ## COG: lin0622 COG0604 # Protein_GI_number: 16799697 # Func_class: C Energy production and conversion; R General function prediction only # Function: NADPH:quinone reductase and related Zn-dependent oxidoreductases # Organism: Listeria innocua # 4 309 1 310 313 113 31.0 4e-25 MSTMRAVRFDEYGGTPVLEVREVPSPPPQPGQVRVKVAYAGINPGEAVIREGLMKEYFPA RFPGEGQGSDFSGVVVEAGPGVTGVHVGQGVIGMSDERSAQAEFVTIDQDRVVPLPDGVD 
PAVGACLYVAGTTAWALAEAIQPRQGEVIAVSAAAGGVGFLLSQLLRRAGARVVGIASDA SAQALEGIGATQVAYGDGLPERLRAAAPGGLSAFADCFGGGYVEAAIDCGVPTERIKTII DIAAARRFGVQMVVMAGVADPGRVVGELADLVAAGELRVPIRARYPLDQVRAAYEDLATR HGVGKIVLSVGGESAGAVPTA Prediction of potential genes in microbial genomes Time: Thu May 12 18:10:25 2011 Seq name: gi|319978130|gb|AEUH01000153.1| Actinomyces sp. oral taxon 178 str. F0338 contig00153, whole genome shotgun sequence Length of sequence - 4684 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 566 - 1363 619 ## COG1309 Transcriptional regulator 2 1 Op 2 . + CDS 1360 - 1806 573 ## 3 2 Op 1 . - CDS 2037 - 2243 145 ## Tfu_1438 hypothetical protein 4 2 Op 2 . - CDS 2247 - 3287 1484 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 5 2 Op 3 . - CDS 3350 - 4141 806 ## COG0500 SAM-dependent methyltransferases 6 2 Op 4 . - CDS 4186 - 4683 464 ## Bcav_0621 type II secretion system protein Predicted protein(s) >gi|319978130|gb|AEUH01000153.1| GENE 1 566 - 1363 619 265 aa, chain + ## HITS:1 COG:mlr4833 KEGG:ns NR:ns ## COG: mlr4833 COG1309 # Protein_GI_number: 13474047 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Mesorhizobium loti # 5 200 8 193 210 73 31.0 4e-13 MANDRSPSTRDRLLDAALERFSREGWGGTSIRDLARDAGVREGSVYKHFPSKQAIFDALV ERADARMAEVATALGVSVASPGAALPGYRGIGEEQLAAIAEGFFDAVLHDRELAALRRLF IVSQYRDPQIGRRLRHYWIEQPLAFQAGVFAGLINSGDFRDGLDPMATALAFFGPVLALL QLAESGDGAEERARALLRSHVRHFRTTHLHRTAGAEQTAGAEQETGEAEQAAETEQEAGK AKQEAGTEQAAPVEPSATAGEEAEP >gi|319978130|gb|AEUH01000153.1| GENE 2 1360 - 1806 573 148 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNRYRTVARVVFGVVFLGGALAHAYFASFSPQSYAPFADTALWPWLADLWRGSVMANIQW LSLVTAAFELVLGAALLAGGRAARLAALAALGFFAFILVLGYSWPAQDPWEDFLKNRAAT VLLASAIAPVLRAPGGSGPTKGRPSRTV >gi|319978130|gb|AEUH01000153.1| GENE 3 2037 - 2243 145 68 aa, chain - ## HITS:1 COG:no KEGG:Tfu_1438 NR:ns ## KEGG: 
Tfu_1438 # Name: not_defined # Def: hypothetical protein # Organism: T.fusca # Pathway: not_defined # 5 61 4 63 76 63 70.0 2e-09 MAGKERKQVLLRLDPAVHAAIAKWAADELRSVNAQIEVVLRRALDEAGRGVRAAPLRGRG RPRKDEER >gi|319978130|gb|AEUH01000153.1| GENE 4 2247 - 3287 1484 346 aa, chain - ## HITS:1 COG:Cgl2775 KEGG:ns NR:ns ## COG: Cgl2775 COG0330 # Protein_GI_number: 19554025 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Corynebacterium glutamicum # 26 345 5 324 325 315 59.0 6e-86 MSAQHPQPSQSEPSGGPARSARHATEEGSPGRVPNAPGSPSPRVDIEEKAARSVGSGAAF LVAFLLLLLLAGAVAVLVLGIMDASADPDHVHAGVVARIVAGALGIAAVVVVATGFDIVV PGQTSVRQFFGRYIGTVRRTGLVFVPPLTNGTKVSIKVHNFETTELKVNDLDGNPVNIAA IVVWQVADTARAVFAVEAYEAFIRVQAESALRHVATIHPYDESGPGKTSLRGGTDLVSAE LAAEVAERVALAGLEIVEVRISSLAYAPEIAQAMLQRQQAGAVIAAREQIVEGAVTMVDQ ALKRLEADDIVTMDEERRAQMVSNLLVVLCSDQRTQPVVNTGSLYA >gi|319978130|gb|AEUH01000153.1| GENE 5 3350 - 4141 806 263 aa, chain - ## HITS:1 COG:PA4075 KEGG:ns NR:ns ## COG: PA4075 COG0500 # Protein_GI_number: 15599270 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Pseudomonas aeruginosa # 15 262 28 273 278 89 34.0 7e-18 MEGLLGRWDSTHRTTRMRGWDFSVLEGSMTCGEPDWDFEAECRGALVALADAGGGTVLDM GTGGGERLALLLGSLDAGRRGRLDVVATEGWERNVPVARQRLEPLGVRVEEHDPERGDPL PLPDGGVRLVMNCHEALDARDVARVLAPGGLFVSEQVDGTDAPEVHDWFGTRPAYPDVRP AVVADALRGAGLVVEETAQWSGPMVFADVDALVAYWALVPWDVPEDFTVPGYAEVLERVH RESGGGPVRLTARRFRVRAHRPG >gi|319978130|gb|AEUH01000153.1| GENE 6 4186 - 4683 464 165 aa, chain - ## HITS:1 COG:no KEGG:Bcav_0621 NR:ns ## KEGG: Bcav_0621 # Name: not_defined # Def: type II secretion system protein # Organism: B.cavernae # Pathway: not_defined # 27 164 39 179 179 87 47.0 1e-16 GRPAAPRGGHAVSARLLARLRSRDGAVDEAIACDLVLAGLRCGASIPQALAALGGAVGSD GLVRISRELRLGAGWAAAWDPAPPGTELLREGLEAAWLQGVSPEAQLRRAAAQTRRHRIS 
DAKKAAEELGISLVAPLGALLLPAFVLLGLVPIVIHMFAGILGGV Prediction of potential genes in microbial genomes Time: Thu May 12 18:10:38 2011 Seq name: gi|319978128|gb|AEUH01000154.1| Actinomyces sp. oral taxon 178 str. F0338 contig00154, whole genome shotgun sequence Length of sequence - 1308 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 603 550 ## gi|154508107|ref|ZP_02043749.1| hypothetical protein ACTODO_00600 - Term 786 - 841 11.7 2 2 Tu 1 . - CDS 975 - 1307 304 ## gi|293190562|ref|ZP_06608903.1| conserved hypothetical protein Predicted protein(s) >gi|319978128|gb|AEUH01000154.1| GENE 1 3 - 603 550 200 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508107|ref|ZP_02043749.1| ## NR: gi|154508107|ref|ZP_02043749.1| hypothetical protein ACTODO_00600 [Actinomyces odontolyticus ATCC 17982] # 4 195 7 219 337 83 36.0 7e-15 MDTPQKPTVAMVGLTERERESAVAVASLAGVGVVPEGAAPDLVIAGEGAARAGGSSAPRV SVGPSGQLRLPKDEAVLMRVLASLALRRAAGGLRVLVAGWHGGVGTTALCRALAFRASCP LIDASGHAPGVLRPADGRAPGVRWADLDAAEAVFLPALASQLPSVHSSPVLAGDWRGCAD AADPRLGPVLAALEAAGAAR >gi|319978128|gb|AEUH01000154.1| GENE 2 975 - 1307 304 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190562|ref|ZP_06608903.1| ## NR: gi|293190562|ref|ZP_06608903.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 3 108 172 277 278 110 62.0 2e-23 RIDAPALARAGAVIAHFRQYEVFSPASAAVGAGLARWILVSNGVDPTGTAVVSAADALDP VGAARALAGWASGDEAGVGEWLVHVAQCVEYGARIGRDIAVHVQARTLGD Prediction of potential genes in microbial genomes Time: Thu May 12 18:10:56 2011 Seq name: gi|319978125|gb|AEUH01000155.1| Actinomyces sp. oral taxon 178 str. F0338 contig00155, whole genome shotgun sequence Length of sequence - 1518 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 52 - 978 867 ## gi|293190563|ref|ZP_06608904.1| putative potassium-transporting ATPase A chain 2 1 Op 2 . + CDS 1032 - 1518 306 ## gi|154508111|ref|ZP_02043753.1| hypothetical protein ACTODO_00604 Predicted protein(s) >gi|319978125|gb|AEUH01000155.1| GENE 1 52 - 978 867 308 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293190563|ref|ZP_06608904.1| ## NR: gi|293190563|ref|ZP_06608904.1| putative potassium-transporting ATPase A chain [Actinomyces odontolyticus F0309] # 1 306 1 307 309 265 55.0 3e-69 MPPLKHHLDSARLRSGVFAFVFAPIALVFMGSSMTDVQSLTAAGQPLASVEGLIGLALAS TLFALIAMNCDDSSAGMFVAFGWSLVIGAAQMAGYLQVPMLMKSPARPHDMLAAMSWSMY PVAMAAILAASAVTVKSVRVRAGETTRMPLARSHRSAFGASVALPAAAVVFAVYVYIAPN NSVRVAHEGLIGLVSSYEYHPAMALAGAGALAAMALSARWSITGTQLGAWAVLVLPSYLV VPVWASLTGKVVVPGRSPITQLLVAAPVLTALGLTVATTSLGTLWSRTKALRALSAAAEE DDQGPAGS >gi|319978125|gb|AEUH01000155.1| GENE 2 1032 - 1518 306 162 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508111|ref|ZP_02043753.1| ## NR: gi|154508111|ref|ZP_02043753.1| hypothetical protein ACTODO_00604 [Actinomyces odontolyticus ATCC 17982] # 29 141 21 126 595 63 46.0 5e-09 MSENESPDAEASQHRRPDDWALDRGRDHVLTGEEIELPGAEFGDAADIVGLSSVPVVTGE IEAVSIGARRSSNPNVGPLQPKIEDRIATGEIPVAPVPSASAIPVVPAPGAGAEAEPDAG PRDSGASGRDPSGAPGPEAGAASTTAREQEGAPEPAVEGAGA Prediction of potential genes in microbial genomes Time: Thu May 12 18:11:19 2011 Seq name: gi|319978122|gb|AEUH01000156.1| Actinomyces sp. oral taxon 178 str. F0338 contig00156, whole genome shotgun sequence Length of sequence - 1871 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1228 1255 ## Arch_1718 hypothetical protein 2 1 Op 2 . 
+ CDS 1225 - 1870 595 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) Predicted protein(s) >gi|319978122|gb|AEUH01000156.1| GENE 1 2 - 1228 1255 408 aa, chain + ## HITS:1 COG:no KEGG:Arch_1718 NR:ns ## KEGG: Arch_1718 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 217 398 164 345 345 73 31.0 2e-11 TADEPAAHVEEPPSAARRPPLETRRSRRARLGWDRPAPAARPAAEPADEEPATSAEEPAY PTAAFSDEIRYPTAFSAGTAYPSARADEDSARADDETGAAEDTAFDDGADYGSAYSPWPS STLPMAGSTDAQSSSWNDEMTRTYWHSPEDLEPEPVRAHPGIDETGVIAPTRRSLLSSVE EAEEEARSAPEEAEEEAEEPEESVRPASLDDAIFEGTTVKAELPNRSRAHWMGLVAFAIG VPLAWFLAADAGARMTLADNAPMVTGTASFLALGELLGALAFSVLLIASARQSSLGAWAV GGLFTVIGLPWVLVPGPTATAMFPVMSALQSSGAVGANLQHHLQASGYSGRLLLIGVVLI GLGYVSHTTRRIGRAEEARRAEVERVNPSGAHFTSRERRRAHRAEGKR >gi|319978122|gb|AEUH01000156.1| GENE 2 1225 - 1870 595 215 aa, chain + ## HITS:1 COG:Cgl0297 KEGG:ns NR:ns ## COG: Cgl0297 COG0596 # Protein_GI_number: 19551547 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Corynebacterium glutamicum # 2 152 38 185 331 101 37.0 1e-21 MKAPGGAAAIELPGPWTHRHITAGGASFHVADTGTAMDYALVLLHAFPEHWWSWRDVIPP LAGAGRRVVAMDIRGCGTSDLSRGASDLVQLAQDVIGVVRTLGISSFSVAGAGTGGAVAW MVGALSPSELRSLIALGAPHPLGIRPVIGRAPWSGGRLLQGRLALPTGRVRALRDGRLLD AVYRSWASPSSRERLDSQSGPFRAALARPFAPHTA Prediction of potential genes in microbial genomes Time: Thu May 12 18:11:31 2011 Seq name: gi|319978103|gb|AEUH01000157.1| Actinomyces sp. oral taxon 178 str. F0338 contig00157, whole genome shotgun sequence Length of sequence - 17277 bp Number of predicted genes - 19, with homology - 15 Number of transcription units - 16, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 288 85 ## gi|154508112|ref|ZP_02043754.1| hypothetical protein ACTODO_00605 + Prom 292 - 351 5.2 2 2 Tu 1 . 
+ CDS 456 - 1139 175 ## PROTEIN SUPPORTED gi|163788659|ref|ZP_02183104.1| 50S ribosomal protein L20 + Term 1249 - 1297 -0.3 3 3 Op 1 . - CDS 1283 - 1462 336 ## HMPREF0573_10749 hypothetical protein 4 3 Op 2 . - CDS 1467 - 1787 498 ## Arch_1726 transcription factor WhiB 5 4 Tu 1 . - CDS 1897 - 2022 78 ## 6 5 Op 1 2/0.000 + CDS 1988 - 4012 3249 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 7 5 Op 2 . + CDS 4009 - 4992 1025 ## COG1408 Predicted phosphohydrolases + Term 5021 - 5059 2.0 + TRNA 5067 - 5140 75.6 # Pro CGG 0 0 8 6 Tu 1 . - CDS 5497 - 6159 619 ## COG2910 Putative NADH-flavin reductase + Prom 6321 - 6380 2.0 9 7 Tu 1 . + CDS 6426 - 7031 566 ## COG1309 Transcriptional regulator + Term 7218 - 7273 2.0 10 8 Tu 1 . - CDS 7533 - 8471 1212 ## COG0385 Predicted Na+-dependent transporter 11 9 Tu 1 . + CDS 8620 - 9690 636 ## HMPREF0573_10033 membrane protein 12 10 Tu 1 . + CDS 9852 - 10046 130 ## - Term 9904 - 9933 3.5 13 11 Tu 1 . - CDS 9956 - 11011 990 ## - Prom 11256 - 11315 7.4 - Term 11471 - 11506 0.1 14 12 Tu 1 . - CDS 11750 - 12805 1357 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 15 13 Tu 1 . + CDS 12870 - 13250 590 ## COG1950 Predicted membrane protein + Term 13481 - 13515 -0.5 16 14 Op 1 . - CDS 13257 - 14564 1492 ## HMPREF0573_10723 hypothetical protein 17 14 Op 2 . - CDS 14561 - 14836 242 ## 18 15 Tu 1 . + CDS 15096 - 16556 1939 ## COG0015 Adenylosuccinate lyase + Term 16679 - 16728 2.2 19 16 Tu 1 . 
- CDS 16765 - 16992 374 ## gi|293190645|ref|ZP_06608936.1| putative FAD-dependent pyridine nucleotide-disulfide oxidoreductase Predicted protein(s) >gi|319978103|gb|AEUH01000157.1| GENE 1 1 - 288 85 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508112|ref|ZP_02043754.1| ## NR: gi|154508112|ref|ZP_02043754.1| hypothetical protein ACTODO_00605 [Actinomyces odontolyticus ATCC 17982] # 2 85 213 296 301 103 65.0 4e-21 PHTALRALRAARRPSREARRILRSPVAVPVLSIRGSDDGAWSPVDHAHDATFVDAAHSCV VVREAGHFLPEEAPSQVARAIAAHLDELSDTWDCD >gi|319978103|gb|AEUH01000157.1| GENE 2 456 - 1139 175 227 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788659|ref|ZP_02183104.1| 50S ribosomal protein L20 [Flavobacteriales bacterium ALC-1] # 35 195 34 197 223 72 26 3e-12 MIGDGNFISQVPLFGGLDDAQQVSLQQKMGHTTLRRGETLFDEGDLGDRLYIVTEGKVKL GHTSNDGRESLLAVLGPGEIIGELTLFDPGPRSTTATAVSPASLLFLEHEDLMHVLDTNP TLAKHMLRALAQRLRRTNESLSDLVFSDVPGRVAKALLDLADRFGTSTDKGVHVPHDLTQ EELAQLVGASRETVNKSLADFVSRGWIRLEGRAVTLLDVDRLARRAR >gi|319978103|gb|AEUH01000157.1| GENE 3 1283 - 1462 336 59 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10749 NR:ns ## KEGG: HMPREF0573_10749 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 1 51 1 51 52 80 74.0 3e-14 MTKWEYSTIPLLTHATKAILDQWGADGWELVQVVPGPTGSDSLVAYLKRPVQEPAPIQN >gi|319978103|gb|AEUH01000157.1| GENE 4 1467 - 1787 498 106 aa, chain - ## HITS:1 COG:no KEGG:Arch_1726 NR:ns ## KEGG: Arch_1726 # Name: not_defined # Def: transcription factor WhiB # Organism: A.haemolyticum # Pathway: not_defined # 3 95 13 105 106 101 61.0 1e-20 MSRALCAGYEPDALFVQGASQRQVRQRCLMCEVRIECLADALQSGANYGVWGGLTERERR AMLRHYPDIDDWLDWLRNSDDPLAQEIRLPKVPKVLVMVRGESAVG >gi|319978103|gb|AEUH01000157.1| GENE 5 1897 - 2022 78 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGLTARGFECDMLQGYVPRGCGGGEGDGSPFFGPDPSARPM >gi|319978103|gb|AEUH01000157.1| GENE 6 1988 - 4012 3249 674 aa, chain + ## HITS:1 COG:ML2308 KEGG:ns NR:ns ## COG: ML2308 COG0744 # Protein_GI_number: 15828240 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Mycobacterium leprae # 1 627 1 638 803 246 30.0 2e-64 MSHSNPRAVRPMQLFGLLLTFVSVSALMGVVGAGMLVPVVGPLAIVTKSAPTVFNDIPSD IQVVEPAEESQLLDASGGVIARFYDKQRIVVPSANIADVMKKAIVAIEDKRFYNHHGVDP TGMLRALVSNLGESGTQGASTITQQFVRNALAERGYLEGDVDQVAAATEQTTERKLREAK YALAIEKVMTKDEVLTGYLNIAPFGPITYGVEAASLRYFSKSASEINYLEASLLAGLVQS PVQYDPLSHPEAAQDRRDVILSVMLNQGVITQEEYDKGIATPLADMLHPTVTPEGCSGAD DSRAYFCDYVLSQFLEDTTFGETRADREHMLKTSGITIRTTLEPTNQDAAYSALTAAIPV GDASGVNDALVSLDPKTGNIVAMAQNTTYGVGAGETMSNYAADGKFQVGSTFKVFTLIEW FKEGHNANETVGSNNTTYGSNAFKCNGSPIYTDTYPVQDLEGKTGTMNVVRATGLSVNQA FVNMASRVDFCKIFQNAYDLGITEDGEVPTAYPANILGSSSASPLQVASAFGAIANSGTQ CSPQSISSVSDRDENVLKQYEPSCKSVLEPDIANKTASLLTASAGQYYTSTKLADGRPYA AKSGTTDYSSNTWLTGFTSSLVTSAWVGHGANSSAPVWDTTINGRFYHQIFGETFVGQNI WAPYMSQALAGTPR >gi|319978103|gb|AEUH01000157.1| GENE 7 4009 - 4992 1025 327 aa, chain + ## HITS:1 COG:MT3785 KEGG:ns NR:ns ## COG: MT3785 COG1408 # Protein_GI_number: 15843301 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Mycobacterium tuberculosis CDC1551 # 26 326 10 303 319 184 42.0 2e-46 MSGAVADPAPSRADAPLARAAARTARATAGAVGAVVVPGAAALAWGRVERRLPVVRRFEV PVARPVPEVTILQISDLHLFPGQEFLVDFLRRVAATERFDVVASTGDNFGGPDGADLVRE AYAPFLDRPGFFVLGSNDYYSPVRKNWGRYLTRSKGPRPRVVPDLPWTPMTRMMADAGWV DLSNRAGDLEVPGVGALSLLGTDDAHIHRDRYVDPAPSWEREGVLRLGATHSPYTRVVSA LTRRGADLIVAGHTHGGQIGLPGFGALITNCDIPRRYAKGLSQWHSQGRASWLHVSAGLG TSRYARVRVATRPEVSLLHVFPARGRL >gi|319978103|gb|AEUH01000157.1| GENE 8 5497 - 6159 619 220 aa, chain - ## HITS:1 COG:AGc3633 KEGG:ns NR:ns ## COG: AGc3633 COG2910 # Protein_GI_number: 15889289 # Func_class: R General function prediction only # Function: Putative NADH-flavin reductase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 201 1 189 203 76 34.0 5e-14 MKTILVLGATGRTGRSVLKHRPDGATMYAAVRRAGPGAPLPEGADAPRVVDIEDQASLSA ALEGVDTVINAIRLRGDIAATALVGLHQSIVAARDTSRDLHVVHVGGAGSLRVGGGRRFW 
QAPGFPAATLPRGIGHARLRDYLEQYERGCRWAYLIPPPRFDPDGPFRGRYARIAAGGGE REFVDDGISYEDFAVALIDAVVGAWRGVWLIGGDGAPESP >gi|319978103|gb|AEUH01000157.1| GENE 9 6426 - 7031 566 201 aa, chain + ## HITS:1 COG:SMc04134 KEGG:ns NR:ns ## COG: SMc04134 COG1309 # Protein_GI_number: 15963871 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Sinorhizobium meliloti # 4 158 25 187 225 66 31.0 3e-11 MAWDVEGTKRRIKEAATAEFARCGPDGTTVERIAKKAGVNKERVYNYFGGKRQLFAAVLR DELATVARAVPVSSFALEDIGEYAGRVYDYHRERPELIRLLRWEALVFDGEVPDEDLRGS FYQRKTAAVLRGQEVGAVTGDFDAAQLMVLVLSIAGWWAAVPQVARMLCGPLSEEEHARR RAAVVRAARRLAAPGPGAAGR >gi|319978103|gb|AEUH01000157.1| GENE 10 7533 - 8471 1212 312 aa, chain - ## HITS:1 COG:Cgl1229 KEGG:ns NR:ns ## COG: Cgl1229 COG0385 # Protein_GI_number: 19552479 # Func_class: R General function prediction only # Function: Predicted Na+-dependent transporter # Organism: Corynebacterium glutamicum # 3 311 12 318 335 261 53.0 9e-70 MSSEDRSARIAVTVFPLIVIAAFIAAMVLPSAFLPLAPGVNWALGVIMFGMGLTLTLPDF ALVVTRPLPVLVGVAAQYVIMPLVGFGLAWVFRLPDAVAVGVILVGCAPGGTASNVISYL AKADVALSVTMTSISTLLAPLMTPVLTTWLVGSRMEVDGGAMAMNILLMVLAPVLGGFLV RYLAHALVERILPALPWVSVLGICYVLLVVVSKSVSVIVSSGAIIIGVVVCHNLLGYLFG YLAGRLGGGSEKAARTTAIEVGMQNSGMAATLAATQFPATPETAVPAAVFSVWHNLSGAL LAMFFRRAQARD >gi|319978103|gb|AEUH01000157.1| GENE 11 8620 - 9690 636 356 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0573_10033 NR:ns ## KEGG: HMPREF0573_10033 # Name: not_defined # Def: membrane protein # Organism: M.curtisii # Pathway: not_defined # 93 355 7 290 290 109 33.0 2e-22 MSTSDSAGGPGAPSSVRQGTAPAPNDTSPQAQGTAPASSARSGAPHGEPREPLGSTRQAT PSGSARKAASSSTRRAAPSGARDRGRGGKGGRRRPSRLRALARNGWIPDQHGAWPMTVLP IVIGAALGGPTWEHLVLGFAWTTAFACFNAVEHWVKSPKRRRSAIIRAVVASGAVTAASG GAYLAFRPALAWWAVGFAPLVGVALWEVRAGRERSIPARLATIAASSLMLPVAFSVSASG GLPGSVPARVWVAAGVLGAQFAGTVPTVRSMVRGRGDRRWVVGSVLFHALCTAAVAGLAA AGLISAWAVPVWVAMTVRAWAMPAWSERRARPLAPKTIGPLEFVWVGLLAIALVLP >gi|319978103|gb|AEUH01000157.1| GENE 12 9852 - 10046 130 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MVGRTGTRDRRLPRCPAAHPETGANAPARTESKEPHLSPLFAEALSITSLIRHLAEMQIK PFTT >gi|319978103|gb|AEUH01000157.1| GENE 13 9956 - 11011 990 351 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRDVRGYRTTETSQVDNLMRVGEYLADDGFSYSRLKKLWIRSRRTRKDQLFLEALVDIN YWCYRRRECWLRHGVADESRFNGLPADERAPDLEQSRARAPERLEPDRDAALREFGRMIN HYEGALRCWKSLRQGAVVQSPVQLCADSVYRAHVRTRFALSNNPNSVRPHILLSTFVSNV RMAECARIFFFRIHSRPIGVGVILFIFVVLKFLAHEFNGYLAWNESTGAGAHVEGFRALE GFDASYPWPVVFILFIAFSITTYMSFRDYPPSFYGFDRMKHAYLSQEVIAFSFTVESVLL TVAALSPDDGANPLVKAMCFLLYVVNGLICISARWRIKEVIDSASAKRGDR >gi|319978103|gb|AEUH01000157.1| GENE 14 11750 - 12805 1357 351 aa, chain - ## HITS:1 COG:MT3881 KEGG:ns NR:ns ## COG: MT3881 COG0079 # Protein_GI_number: 15843395 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Mycobacterium tuberculosis CDC1551 # 3 343 2 346 353 210 43.0 3e-54 MTTLPIRPEVLALPRYIPGKALQGAQKISSNEMPWPPDQRVVEAAAAALGEANRYPDLTA APVREELARALGLSAGQVCVGTGSSAVLVAALSAVCAPGSEVVFPWRSFESYPIAAPTVH ASPVPVPLAPDSGPDLGALADAVTPATRAIIVCSPNNPTGVACPASDIVAMVERVPPSVL VIIDEAYIDFATDPAIAPCTDLIGAHSNVLVMRTFSKAHALAGLRIGYGLGDAGLIDAVQ ALLVPFGVSGPAQAAALASLRDGAPAREAVRAITCERDRIVPLLRRMGYDVPDSQANFYF LKGEGAGFVEACAAAGLVVRPFPEGVRVTVGTPEQNDALLAAAGGRTPAAR >gi|319978103|gb|AEUH01000157.1| GENE 15 12870 - 13250 590 126 aa, chain + ## HITS:1 COG:MT1954 KEGG:ns NR:ns ## COG: MT1954 COG1950 # Protein_GI_number: 15841374 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mycobacterium tuberculosis CDC1551 # 3 126 37 161 167 68 32.0 3e-12 MEFVLRLIATMAGLWVSIRLVPSLEIAETASMTESLLVLAAIALVFTVVNSLVKPLVSFL AFPLYVLTFGLFAIVVNSAMFALTGWLSTSLGFPFRTGGFWSCFFGAVITAVVSSLVVAV LDSDNR >gi|319978103|gb|AEUH01000157.1| GENE 16 13257 - 14564 1492 435 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10723 NR:ns ## KEGG: HMPREF0573_10723 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 190 357 376 552 618 106 38.0 2e-21 
MSPVHRVDPDDPRSLEPQVVVGALTVNTDDMLAARNVLAGLSTGIGAVWRTLEHAHHVAE AMSAERPATAASVMAFIDQADAQSRAVIGTLNDMRQAMESSAEYYAAADSEARCRWVGAA SARGLAGAAGKFGGSVTTGNYGVVVADEAGNHAFTANQWGAGQPQDRTAERAAATAAALG SAAHEYDVPDGARSSADVLRSLDGISRDEDAGRVVIEEHVSVVDGREVRSWTVVLQGTQN WDPVSSNPQDLQTNLQAVGGLRSEQLDAVKAAMEMAGIGQGEAVELVGHSQGGIIAAQLA CDAAFASKYAIANVTTAGSPIAGHAPRGTPVLALENDRDIVPGLDLSPNQLMENVTTATY HDAAYDRQLGIERSSAHDASGYGDQLEYSLNNGESAQGPIASYEERRNRALGLNSTTRTT VHSYTTRRVCEPPDL >gi|319978103|gb|AEUH01000157.1| GENE 17 14561 - 14836 242 91 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDAANREDLDAILCQALSGVEDARTQLHSAVPPGWRGGAASLYSGKVTRVGADLMWLSGT ISQAMDAARCEAQWAPPSTTDRQLGGFGGAR >gi|319978103|gb|AEUH01000157.1| GENE 18 15096 - 16556 1939 486 aa, chain + ## HITS:1 COG:RSc2720 KEGG:ns NR:ns ## COG: RSc2720 COG0015 # Protein_GI_number: 17547439 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Ralstonia solanacearum # 25 483 10 454 457 389 48.0 1e-108 MSADQNTLFSPSHGYQDLAAGGEPLSPLDGRYRAAVSPLANHLSEAGLNRARTHVEVEWL VHLLDGRVLPGAPSLTDAERDYLRALPLSFGRPQIKRLAQLEAVTRHDVKAVEYLIGEHL AAAPAALGEGTSLPGLREAVHIFCTSEDINNLAYALTIKAATEQVWLPAARALHGRLVAL ARAGADVPMLAHTHGQPATPVTMGKEMAVFAHRFGRQIARVAATEYLGKINGATGTWAAH VVGAPGADWPSVSRSFVEGLGLAWNPLTTQIESHDWQAELYADMARAGRVAHNLATDCWT YISMGYFHQNLAAQGSTGSSTMPHKVNPIRFENAEANLEVSAALLDSLAATLVTSRMQRD LTDSTTQRNIGVAIGHAVLAYDNLVRGLDGVDMDADRMAADLDANWAVLGEAVQQAMRAA AVGGATGMANPYERLKELTRGHEVGAAEMRDFIAGLGLPAEVEERLLALTPAAYVGLSSR LARWEH >gi|319978103|gb|AEUH01000157.1| GENE 19 16765 - 16992 374 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190645|ref|ZP_06608936.1| ## NR: gi|293190645|ref|ZP_06608936.1| putative FAD-dependent pyridine nucleotide-disulfide oxidoreductase [Actinomyces odontolyticus F0309] # 2 75 138 213 213 96 65.0 4e-19 MLDRVTRALWDNPEMAPVNVRARVLQSTGGELEEPGAQSECLADMTTIGFADEIARPDEL FARYGAPASDPAWRP Prediction of potential genes in microbial genomes Time: Thu May 12 18:12:26 2011 Seq name: gi|319978097|gb|AEUH01000158.1| Actinomyces sp. 
oral taxon 178 str. F0338 contig00158, whole genome shotgun sequence Length of sequence - 4236 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 315 169 ## gi|154508132|ref|ZP_02043774.1| hypothetical protein ACTODO_00626 2 1 Op 2 . - CDS 312 - 1829 1806 ## gi|154508133|ref|ZP_02043775.1| hypothetical protein ACTODO_00627 3 2 Op 1 . + CDS 1948 - 2253 524 ## COG1937 Uncharacterized protein conserved in bacteria 4 2 Op 2 . + CDS 2290 - 2535 435 ## Jden_2500 heavy metal transport/detoxification protein 5 3 Tu 1 . + CDS 2685 - 4236 1854 ## COG2217 Cation transport ATPase Predicted protein(s) >gi|319978097|gb|AEUH01000158.1| GENE 1 3 - 315 169 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508132|ref|ZP_02043774.1| ## NR: gi|154508132|ref|ZP_02043774.1| hypothetical protein ACTODO_00626 [Actinomyces odontolyticus ATCC 17982] # 2 79 3 80 213 97 69.0 2e-19 MNEQTPWREVRRWMRVSTPVNKAGAGGVARRLPQLAGPLVIGALGVGAVGAGIAWRALAT TRGGGRAQRFEDIVWSAIDEARAGPRATGSGPQAGGPDSAGGAH >gi|319978097|gb|AEUH01000158.1| GENE 2 312 - 1829 1806 505 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508133|ref|ZP_02043775.1| ## NR: gi|154508133|ref|ZP_02043775.1| hypothetical protein ACTODO_00627 [Actinomyces odontolyticus ATCC 17982] # 20 479 3 463 493 567 66.0 1e-160 MTHHEGREHMAAKRRARTPRRGRRRYHSLRRVPQAYPGIYLRLRLAPIIPGIAATIAVGA AADAQFLPEGARQWARARADAMGGAISEEAQKVFFDQPGLDTLVVRSLAHAARRLASLGR LWARAAIAVGRDRDGRLAASLPLIPRRGYDALMTGMLTVGTAVGVWRGGGVNTRIIRDSA ADPAMLEPPEGHPEAQIRRAAPPESLADLCADIDELYWAATTGAVIKICRVGTGGARRWL VSMVGTESMRFGSTHNPADIEVNIRLMLGLDSAMGVGLVAALHRAMAADSVPEEQWSSEP VLVCGHSQGGLVASVLASRDPAEAGVNVVGILSTGAPNRRVAVRPDVTIVTVAHDQDVVP SMDGSPDRSPDRRVTVGRTLVRPRKRPLYYAHSSATYTETVRLMERRAAVTPWDRLGKAV ASLRAMLPAPDEEARVTHHEIWQDLLEPTAVSTWNTVASLERADPRAVTYPIDYEGARPI SATARGIGAVWRALRNKALRGVKGL >gi|319978097|gb|AEUH01000158.1| GENE 3 1948 - 2253 524 101 aa, chain + ## HITS:1 
COG:Cgl0381 KEGG:ns NR:ns ## COG: Cgl0381 COG1937 # Protein_GI_number: 19551631 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Corynebacterium glutamicum # 1 101 1 100 100 97 52.0 5e-21 MSGTNTHGHPGYCADTARYLARLRRIEGQIRGLQRMVAQDEYCIDVLTQVSAVQSALDSV AIALLKDHMNHCVAAAARESDEAGAAKVEEAATAIARLIKS >gi|319978097|gb|AEUH01000158.1| GENE 4 2290 - 2535 435 81 aa, chain + ## HITS:1 COG:no KEGG:Jden_2500 NR:ns ## KEGG: Jden_2500 # Name: not_defined # Def: heavy metal transport/detoxification protein # Organism: J.denitrificans # Pathway: not_defined # 11 80 7 76 117 68 57.0 5e-11 MSTDIDRTIELSVKGMTCGHCVMSVTEELEEIPGVMNVEVILNPQGASKVTVLTDTPLDD AALADAVSEAGFELAGIARDF >gi|319978097|gb|AEUH01000158.1| GENE 5 2685 - 4236 1854 517 aa, chain + ## HITS:1 COG:Cgl0382 KEGG:ns NR:ns ## COG: Cgl0382 COG2217 # Protein_GI_number: 19551632 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Corynebacterium glutamicum # 4 510 81 564 755 380 47.0 1e-105 MTTTTTDPAAAATTRAATTRTGQDAPRIGEGARRTARELARRLTISAPLGLFAMAVSMVS AWQFPGWQWAVALASVPVVTWGAWPFHRAAFAAGRHGSTTMDTLVSLGVVASTLWSWWAL LLGGAGEVGMRMSMSLIPRASHSGHAEIYFEGACMIVVFLLTGRWMEARARYRAGDALRA LLELGVTEATLVEQGPDGGRIERTVPASSLAVGDLFLVRPGQKVATDGVVVSGASAIDAS LLTGESVPVDVGEGDAVTGATVNTWGALTVRATRVGADTALSQIGRMVAEAQAGKAPVQR LADRISGVFVPVVIAVSLLTAIGWLASGAALQVALTSAVAVLVVACPCALGLATPTALLV GSGRASQLGVVIRNAEALERTRRVDTALLDKTGTVTTGRMSLESALGAPGVADSELLRVA AGAEAASEHPLARAITAGAAERGIAPAATSTFTNQAGLGVCAVLDSEAGELALVGRASWL ESLGVSIDGALADRLRDAEASGASAVVAAVVEDWERA Prediction of potential genes in microbial genomes Time: Thu May 12 18:12:58 2011 Seq name: gi|319978082|gb|AEUH01000159.1| Actinomyces sp. oral taxon 178 str. F0338 contig00159, whole genome shotgun sequence Length of sequence - 13324 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 9, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 1036 1105 ## COG2217 Cation transport ATPase + Prom 1239 - 1298 2.9 2 2 Tu 1 . + CDS 1345 - 2535 993 ## BBR47_41260 hypothetical protein 3 3 Op 1 . - CDS 2699 - 3028 414 ## Sked_10860 predicted membrane protein 4 3 Op 2 1/0.000 - CDS 3016 - 4218 1036 ## COG3463 Predicted membrane protein 5 3 Op 3 . - CDS 4227 - 5063 743 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 6 3 Op 4 . - CDS 5136 - 6017 1264 ## Arch_1747 hypothetical protein 7 4 Tu 1 . + CDS 6112 - 6474 479 ## COG3189 Uncharacterized conserved protein 8 5 Tu 1 . - CDS 6583 - 8166 2372 ## COG1070 Sugar (pentulose and hexulose) kinases 9 6 Op 1 . - CDS 8274 - 8942 618 ## COG1309 Transcriptional regulator 10 6 Op 2 . - CDS 8976 - 9038 108 ## 11 7 Tu 1 . + CDS 9037 - 10221 1248 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 12 8 Op 1 . - CDS 10175 - 11098 1209 ## COG0561 Predicted hydrolases of the HAD superfamily 13 8 Op 2 . - CDS 11100 - 12374 1770 ## COG0172 Seryl-tRNA synthetase - Prom 12506 - 12565 2.7 + Prom 12526 - 12585 2.8 14 9 Tu 1 . 
+ CDS 12650 - 13322 518 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase Predicted protein(s) >gi|319978082|gb|AEUH01000159.1| GENE 1 2 - 1036 1105 344 aa, chain + ## HITS:1 COG:SA2344 KEGG:ns NR:ns ## COG: SA2344 COG2217 # Protein_GI_number: 15928136 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Staphylococcus aureus N315 # 163 341 613 791 802 192 56.0 7e-49 APGPAAGGARRSADAGTARSTDAGAAPGATTVTMSVGGMTCASCVRRVERKLGKLPGVSA SVNLATESARITLTRPYSDQELEGVVGAAGYQGAVTSRVEDGGPAGSAGTSRVDAGAAAS DAAARQSGARESAGSGDAPQPGPAPDAAAIALPEHLTGARVLGVLVVRDTVKPTSRAAIA QLRSLGIEPVLLTGDNEAAARHVASQVGIDRVIAGVLPDGKRAAVAELQERGAVVAMVGD GVNDAAALAQASLGIAMGSGTDVAIEASDITLAGSDLASAATAIRISRRTLRVIKENLFW AFFYNVAMVPLAIAGLLSPMLASAAMACSSVFVVLNSLRLRSAS >gi|319978082|gb|AEUH01000159.1| GENE 2 1345 - 2535 993 396 aa, chain + ## HITS:1 COG:no KEGG:BBR47_41260 NR:ns ## KEGG: BBR47_41260 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 15 376 29 390 435 85 24.0 4e-15 MKPLRTWLAVAGIVAIGDQAFFVVLSLFLLSSYGSGIAASFLTATSVARLALILVGGWAA DRWSPYRCIQVGLVLRIASFLIVGLDSLGGPSVPLLSCGVILFGAGDGLYLPSSGALLKR LTNHGDFSRTISWVQVITSVATGLAGATSGFSFEAVPLHTLFFSFALLALAAFALFHAIP ATPSAAPAADHALADPSPSSTTASLRTVMTANGGTVAVGLLTALLIDASLSGALNIVLPA RYLELGWGGGAFGAAVCLFALGGVLGGIIASRIPADRLAVRNAVINGGILVFAALTCALY FALPEPVSLALVAALGLCAGVAAPTLIGDIMNNTPPEFAGRVYSAINLVTYGSAPIAYAA CGFLMARFSTAAPFLVGAALLALGGAVRAATTPKPQ >gi|319978082|gb|AEUH01000159.1| GENE 3 2699 - 3028 414 109 aa, chain - ## HITS:1 COG:no KEGG:Sked_10860 NR:ns ## KEGG: Sked_10860 # Name: not_defined # Def: predicted membrane protein # Organism: S.keddieii # Pathway: not_defined # 15 109 433 524 525 65 40.0 5e-10 MWQLTSSGFGQDGPRAEAAHQAMAAVPAGSAVETDITLMARMVPDHEVYWVGSAKDMDPL PEYVVVDQRSYVWGGRDVTAVGWATDAHPGHSYELVFERNGFQVAHRTS >gi|319978082|gb|AEUH01000159.1| GENE 4 3016 - 4218 1036 400 aa, chain - ## HITS:1 COG:sll1352 KEGG:ns NR:ns ## COG: sll1352 COG3463 # Protein_GI_number: 
16330152 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Synechocystis # 9 232 15 229 472 63 27.0 8e-10 MGTIGRRARELAVPVVVCALTLGVYCLYSVGQWRMMASPSWDLAIFTEAAKSYSQGHWPI VPIKGPGFNLLGDHFHPILVLLGPVYRVFPSGLTLLIVQNALLAWSAWPVTRAATRLAGS AGGFVIGLAYGLGWGLQGAVGAQFHEIAFAVPMLAHGGVAFAQGRYRACMAWLAPLVLVK EDLGLTIMVAGLVLAWCRRDQGRRALAESLAFAAFGAVAFVITTQVLLPALNPAGTWAYS LDGSATGAGSSAGGAAAAKVWPSLLEILTFPSVKIATVLVCVLGAGVVGAASPWFALVVP TLAWRFMGSVDFYYEWSSWHYNAVLVPIASAALLDVLGRAAQWRDRRSADEDRAPADEAG APRGAPSASAPGGARASSAPASRLRCSRWPRPRHCCRCGS >gi|319978082|gb|AEUH01000159.1| GENE 5 4227 - 5063 743 278 aa, chain - ## HITS:1 COG:BH2769 KEGG:ns NR:ns ## COG: BH2769 COG0463 # Protein_GI_number: 15615332 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus halodurans # 15 229 3 212 222 76 29.0 5e-14 MNHARGAAGRKAPQRIAAVIPAHNVGHDVAATVRACRAIPGVDLLIVVDDGSADDTGRAA RAAGAVVVRHSVERGRASAIETGVKVAAMRDRADGPARHLLILSPDLGESAVEATALVEA VNSGLADCASSVPPGEGKAQNARAQRIARSLIRRKTGWDSHNPMSYQRCITREALKAAMP FLNRYALEVAMTIDVLCAGFSVVEIPCNFVHSGADKSLGSLNRSARMADNVMAVLRQAFH RGALPAGAGSARAQAIGQPYPRPARERGGEGDAGTTDS >gi|319978082|gb|AEUH01000159.1| GENE 6 5136 - 6017 1264 293 aa, chain - ## HITS:1 COG:no KEGG:Arch_1747 NR:ns ## KEGG: Arch_1747 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 1 293 1 285 285 221 46.0 3e-56 MGKASRRKKVTDPAKLAAYRPPIPFVARPYEGLAKEVELVAMREIIPCATLTARTDAAHG GVEFDFVTLLPDGHPAMIRPDGRILVALQTRFNSADLSHDAGAAVLAAIAAKDRGAEGVV PVDVREPSERLQDILDPQGFTSISLEDDFSYWFDPSEEVDDETRRALERNREEIIPTEAV PGVPGAYWCEMNRNFVRYVTAVDESALFTALARLAVDGGANVGEGSRFVGAFRACGIPIP VFELGEGVGAADAAPGVAAMAEALARALELTDALTADERRVRQGLVSRQVTIR >gi|319978082|gb|AEUH01000159.1| GENE 7 6112 - 6474 479 120 aa, chain + ## HITS:1 COG:ECs2501 KEGG:ns NR:ns ## COG: ECs2501 COG3189 # Protein_GI_number: 15831755 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia 
coli O157:H7 # 1 107 8 115 122 97 48.0 7e-21 MSVVLKRVYEDPSPDDGYRVLVDRLWPRGVRKDELDYDEWAKDVAPSADLRRAWHHGEIT DSAFRQEYAAQLDGNPGVAALAERARRGRVTLLFAAKDTARSHAHVLKAAIEAAGATTAG >gi|319978082|gb|AEUH01000159.1| GENE 8 6583 - 8166 2372 527 aa, chain - ## HITS:1 COG:CAC1344 KEGG:ns NR:ns ## COG: CAC1344 COG1070 # Protein_GI_number: 15894623 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 9 523 11 528 534 532 50.0 1e-151 MTTDARALIEDGDTALGIEFGSTRIKAVLIGPDHQVLASGSHGWESHLDDGLWSYSLDEV RAGLRSAYASLVDDVRDEYGLTPTTYGSIGVSGMMHGYLAFNAEGRQLAQFRTWRNTNTA EAAAELSELFGVNIPLRWSVAHLHQTICSGETHVMRVASINTLAGWVHEQLTGRHVLGVG DASGMFPIDPATGSYDARCLSAYANLVDRLVPWDLAKLLPTVLSAGDDAGALTEEGALLI DPTGALLPGIPMCPPEGDAGTGMVATNAVAPKSGNISCGTSVFLMVVLEKALDNLHLEID MVTTPDGSPVAMVHSNNGSSEIDAWVSLFVEFARLAGTEVPVSTAYDLLYRNAMTGDADG GGLLAYNTLSGEPLLGLESARPLFTHTPDAALTLANAFRVQLMTVFAPMRVGVDVLRGEG VVLERLFAHGGLFKTPGVAQQLLADSLGIPTTVGETAGEGGAWGIAVLARYAADSHGLSL PDYLASRVFTGARAVTCEPTAEGVAGYERFLESYRSGLGAVRAISHD >gi|319978082|gb|AEUH01000159.1| GENE 9 8274 - 8942 618 222 aa, chain - ## HITS:1 COG:lin2279 KEGG:ns NR:ns ## COG: lin2279 COG1309 # Protein_GI_number: 16801343 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 28 98 1 71 187 61 36.0 1e-09 MYGYSGDDRRGMYLRAAGTHGRDAEKDLDLRIRKTRRAIRSGLVTACRAKPYAHVSVTDI CAASLVSRTTFYAHYADKDALLAEVVSLLLEDIAPAIEGMWLGGEDGAALSRRLADFYAR NGRALTTLLAVRGDGGSDLRERLRQMFRSVFRQWAQGRIDEQALPLASDVYASVALILVE RGAARPLTEAELALIDRLRVLFTGGAGGAAGPLGRPAPAGLP >gi|319978082|gb|AEUH01000159.1| GENE 10 8976 - 9038 108 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLRGYAVVRNGETHNNGWSV >gi|319978082|gb|AEUH01000159.1| GENE 11 9037 - 10221 1248 394 aa, chain + ## HITS:1 COG:CAC3432 KEGG:ns NR:ns ## COG: CAC3432 COG0596 # Protein_GI_number: 15896673 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: 
Clostridium acetobutylicum # 1 380 2 368 371 177 30.0 2e-44 MFTHYTGNRQFDLQLNRSLGPVLDRAGVGRLAARVLPRIRSTGQITQFAGRLAARFESQG DTAAAWRLHSLAAFYLPVDDPRKRRFISAMSAAFDESHRGLALTRHAVPYGDGELTAMRW EANPADRARFPHAPTTLVMMNGFDGYAEEIMGFAERFPSRPFDIITFDGPGQGHTALAGM PLEPHWELPTNAVLDHFGLTSAAALGVSFGGYLVMRAAAHVPRISHVIAFDMMYRLLDGL TMPLPRPLRPLVAPVVERARPAWLIDAAMSIGPRASADIAWKLQQARALTGLSRPSDVLR ALGAYTMEPLAGRIHQPCLVMAGDADQYVPFERLADVRRVLAGAESVDVEVFHRAQDPGM AEHCQIGDPGRAFAVMGEWLSAPRPPRSAPGPRA >gi|319978082|gb|AEUH01000159.1| GENE 12 10175 - 11098 1209 307 aa, chain - ## HITS:1 COG:SP1245 KEGG:ns NR:ns ## COG: SP1245 COG0561 # Protein_GI_number: 15901106 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Streptococcus pneumoniae TIGR4 # 38 292 2 261 272 88 31.0 2e-17 MAEASFLTVPGSGEPTRPFAEVVDGARRALPPVFPHEADKVMVALDIDGTVLTPRGASHH VRAGIRELAAAGAQVVIASGRAPEDEMFDVLDELGFTDGWVVCNNGATVLRVSGGGVEVV RQEFLDPAPLIDSALEAMPEVVFYSAVPGAKRLLSAPFPPGELEQGSEIIPLERMRSTPT PKIVVRAPGMGREEFDRAIHSLSAADRYEVFVGWTSWADIGPLGATKASALEWLRSRLGV PRDGTVAVGDGTNDIAMIEWAFFGAAMGGATEEVRACADHVTAAVENDGAAAVMRVVLER CGADGAR >gi|319978082|gb|AEUH01000159.1| GENE 13 11100 - 12374 1770 424 aa, chain - ## HITS:1 COG:Cgl2831 KEGG:ns NR:ns ## COG: Cgl2831 COG0172 # Protein_GI_number: 19554081 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Corynebacterium glutamicum # 1 424 11 429 432 481 60.0 1e-136 MIDVRALRENPEPAKDSQRARGADPGLVDEIIEADAARRGALQEFESLRATQKEVSKSVG RASKEERPAILAKAKELAEQVKAAEAAANAADAEADRLARLLPNLVLDGVPRGGEDDYTV LRHEGPAPRDFAAEGFEPKDHLALGEGLDAIDTKRGAKVSGARFYYLKGVGARLELALMS MALDQAHRAGFVPMMTPTLVSPQIMGGTGFLGEHSDEIYYLPADDLYLTGTSEVALAGYH ADEILDLSDGPKRYLGWSTCYRREAGAAGKDTRGIIRVHQFNKAEMFSYCRPEDAAEEHA RFLAWEEEMLAKAELPYRVIDTAAGDLGTSAARKFDCEAWLPTQERYMEVTSTSNCTTFQ ARRLGIRERREDGTAPVATLNGTLATTRWLVALLENHQRPDGSIRVPEAMRPYLGGVEAL EPVR >gi|319978082|gb|AEUH01000159.1| GENE 14 12650 - 13322 518 224 aa, chain + ## HITS:1 COG:all1876 KEGG:ns NR:ns 
## COG: all1876 COG1597 # Protein_GI_number: 17229368 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Nostoc sp. PCC 7120 # 63 172 21 124 323 61 41.0 1e-09 MSIPKWLVAAVLGASAIGAGRAAVALTRRARRRSALSVPSPPADPAEGLLRPWIVMNPSK HEDPAAFRALVDRAAMDLGAGPPHWLETTREDPGAGQAVEAVSRGAAVVIAAGGDGTVRA VAAGMAGSGVRMGILPVGTGNVLAGNLGLPDDPAAAMAVALGEYHRNVDLTWVRLDGVEE ASPLPAEGGLVLAAHAARPLEAGPAPLSGSPGPATEPERPARPP Prediction of potential genes in microbial genomes Time: Thu May 12 18:13:17 2011 Seq name: gi|319978077|gb|AEUH01000160.1| Actinomyces sp. oral taxon 178 str. F0338 contig00160, whole genome shotgun sequence Length of sequence - 5081 bp Number of predicted genes - 6, with homology - 4 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 709 744 ## Bcav_0378 diacylglycerol kinase catalytic region + Term 743 - 792 4.1 2 2 Tu 1 . - CDS 739 - 1752 1324 ## COG0077 Prephenate dehydratase 3 3 Op 1 . - CDS 2001 - 4232 2949 ## Sked_08770 hypothetical protein 4 3 Op 2 . - CDS 4240 - 4932 169 ## PROTEIN SUPPORTED gi|154175107|ref|YP_001408238.1| ribosomal protein L22 5 4 Op 1 . + CDS 4877 - 4990 76 ## 6 4 Op 2 . 
+ CDS 5007 - 5079 155 ## Predicted protein(s) >gi|319978077|gb|AEUH01000160.1| GENE 1 2 - 709 744 235 aa, chain + ## HITS:1 COG:no KEGG:Bcav_0378 NR:ns ## KEGG: Bcav_0378 # Name: not_defined # Def: diacylglycerol kinase catalytic region # Organism: B.cavernae # Pathway: not_defined # 28 223 199 372 374 98 38.0 2e-19 DPGPGDGAPAPAPDGPGPDGAAPMPAPDEYACAVVAGMGFDGRTMAGTHAALKKQFGWIA YVLAGLGSIGAPRLSARLTLRSPEAWEPAPIDAGDGLVGTPRGDDEVTRVEARSVMFANC GELKFLLLAPDANLSDGLIDVIAVDAQAGLLGWADVTWKFLGQLVGLRPLNLPVSTGTVA FRQARGASVVAQAAQAVQVDGDALGTARAMRVRIQPDALDVAVPAERTWMELLPG >gi|319978077|gb|AEUH01000160.1| GENE 2 739 - 1752 1324 337 aa, chain - ## HITS:1 COG:Cgl2836 KEGG:ns NR:ns ## COG: Cgl2836 COG0077 # Protein_GI_number: 19554086 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Corynebacterium glutamicum # 17 287 8 281 315 202 43.0 1e-51 MTTFIPGRAPMGENMAIAFLGPFGTFTEQAVHQIAPAGAELIPMTSAPQALTAVRRGEAD RAVVPIENSIEGGVNATLDSLSHGEPLVIVAEMVVSVVFQLAVRPGTRPEDIRRIGTHPH AWAQCRNWLEETFPGVVHVPATSTAAAAQLLSGGDASFDAALCNAISVSTYGLEALHTDV ADNPGAVTRFVLVSRPGAVPAPTGADKTTIQVALPVNESGALLTLLEQFAVRGVDLSRIE SRPSGDGLGNYTFSIDIVGHIREERVQAALVGLHRYSPEVRFMGSYPRIDAVRPTILPGT ADDDFRAGRAWVADLLRGGDGTGKRGGAAPDWPLAPR >gi|319978077|gb|AEUH01000160.1| GENE 3 2001 - 4232 2949 743 aa, chain - ## HITS:1 COG:no KEGG:Sked_08770 NR:ns ## KEGG: Sked_08770 # Name: not_defined # Def: hypothetical protein # Organism: S.keddieii # Pathway: not_defined # 35 730 33 680 681 186 30.0 4e-45 MIGQWWFLAPTFLLMLVALVLPGVFWTRAGTRSALVAVGAAPAFTFGVVTALSVLYSRAS IRWEAATVLPVLGMCALLGALTWFGLHVRRASGGAWRPGMPLAEAIGTRSPLGAVQRTVR GATWGAVALGSALAALPMLAGADPADPVQQWDSTFHMNGVHAILMGGDASPFGGLHELYG GREVYYPTGWHAFVALFATPQTVVPASNVSSLVLMAVWVTGAAALVSVLTSSRTAILAAP VVAGLMPNMPADALTMYNQWPNATGTALLPGLMGLAVLVGRRFADDWRSRPPARALARRA GQTVFLMVGALGLIGAHPSAVFSLLALLAAPLLAALVGLARSCDWQDARGRARGIAWSLV AAAVVVVPLVVLASPKIRAMGKYPRQGVSWGEGLAHMFVPFPPFAQTMGMTWWTIVQFVL LVAGLVVIAGLARLAVPGREDTVPASPGGGRMPVWPLASYLVAGSLTALAYSPSSQLRTF 
LLAPWYMDSRRIMGLQSLMLSVLIAVGFAWIAELARGAVARFSGQGAGSGADQEGDGRGA DHGTADEDEEAALLPGLPQWTVAAVLAVAALVVSGFGAFDARAWAVDYVYDSDHLGKPGM ATTGELAMLRRMRSTTPEDALVVGDPIAGEAYVELLGGRKAVFPQLSAVNQDGDSQNVLT KHFNEIHTNPQVCEVVRKLGITHYYQEQDGQYYNFARSSRFPGLYNVDTSTGFELVDRGD TAVLYRITACGDVEPGGGPDAFK >gi|319978077|gb|AEUH01000160.1| GENE 4 4240 - 4932 169 230 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|154175107|ref|YP_001408238.1| ribosomal protein L22 [Campylobacter curvus 525.92] # 35 216 15 198 199 69 29 4e-12 MTWSTLAAASAPSALDTFIGWFKDPESLLVAMGPWVLWGTLLIVLIESGVLFPVLPGESL LFTAGLLHDKLDLHLPALILGTVAAAFVGAQIGYFLGAKWGRRFFKPDARVLKTAHLEKA EHYFTDYGGRSLVIGRFIPFVRTFIPLAAGIARYPYTRFVFFNTLGALLWGAGFLLVGAL LGNVPFIHDNLSVILAVIIAVSVLPVVIEVAGKRRKGAGGQHIAADAGAE >gi|319978077|gb|AEUH01000160.1| GENE 5 4877 - 4990 76 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNVSSAEGAEAAARVDQVTKGPSVGGIVSPLPYLSHR >gi|319978077|gb|AEUH01000160.1| GENE 6 5007 - 5079 155 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDKRRDSHPRTGSATGGASPSDER Prediction of potential genes in microbial genomes Time: Thu May 12 18:13:45 2011 Seq name: gi|319978065|gb|AEUH01000161.1| Actinomyces sp. oral taxon 178 str. F0338 contig00161, whole genome shotgun sequence Length of sequence - 22536 bp Number of predicted genes - 11, with homology - 9 Number of transcription units - 8, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 2509 3169 ## COG0392 Predicted integral membrane protein + Term 2565 - 2593 1.3 2 2 Op 1 . + CDS 2636 - 3664 1248 ## OB2992 hypothetical protein 3 2 Op 2 . + CDS 3668 - 3928 199 ## 4 3 Op 1 . - CDS 4007 - 7162 3953 ## COG3291 FOG: PKD repeat 5 3 Op 2 . - CDS 7256 - 7375 287 ## - Prom 7542 - 7601 1.8 - Term 7594 - 7645 23.2 6 4 Tu 1 . - CDS 7666 - 14934 9762 ## COG1404 Subtilisin-like serine proteases 7 5 Op 1 35/0.000 + CDS 15038 - 16771 228 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 8 5 Op 2 . 
+ CDS 16768 - 18807 2708 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 9 6 Tu 1 . + CDS 18969 - 19874 889 ## CKR_2073 hypothetical protein 10 7 Tu 1 . - CDS 20019 - 21740 2054 ## COG4425 Predicted membrane protein 11 8 Tu 1 . - CDS 21861 - 22535 858 ## COG0500 SAM-dependent methyltransferases Predicted protein(s) >gi|319978065|gb|AEUH01000161.1| GENE 1 2 - 2509 3169 835 aa, chain + ## HITS:1 COG:MT0613 KEGG:ns NR:ns ## COG: MT0613 COG0392 # Protein_GI_number: 15839986 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Mycobacterium tuberculosis CDC1551 # 403 835 86 522 522 164 29.0 9e-40 SAPRAARLRGPVYVADRTVTTTRRREDLVEVAVCALGIVAVWALGVVASATTRGVTMDVM QFQVIRRILLLPVNLIEGMIVLTTPIAVVSSLALRRRLRSITQALATSVGAAIGGMALSS LARLLPESLAAPLSVDRAIATQGTLSGVVIGVNILFVVLAALFTASGEAQSMRSVRWGWA GLWAIVFLSVMRSSVTLPGAFVSVLIGRAFGCAGRWGFGFGGQRARGREIVRALLDIGIV PTHIIRTDLDTTTEPLRTWSVSEDEDGALRSEATCPGNIDIAVTLRPDSGHNRHYQVWDA GGRGLELVVVEPGRELTGTLVELWNNFRLRGISRWVSPSAKANVERAALTSLQASRAGVR TQEHVGITAAGDSIIVVMEALEPTAPLTELGGALTDALLDEAWAQLAAAHTRGITHRNLA PNSVVVDTDGRVWLLDWDRGEVATTDLNRSIDVAQMLVLQALAAGPERALASGRRCVGDE ALAACAPLIQKPVLPTAVNQRLRRSDLLGELRGALVEDSEAEEAGTANIQRFQPRTVFTF AILFVAVFVVLGSLNFSDIISALTQANPWWILASALLACLIWVGASVPLMALSPERLRFS DTLIAQVAASIITIVAPAGVGPAALNLRYLRKKRVPTAMAVTTVTLMQLSQVLITVTLLL LVVVVSGASLSVSIPYGTILAVVGVVVAVVGAVLAVPRARAWVWSKVEPTWQQVYPRLLW IVGQPRRLALVVCGNLLMNIGYVGAFWSALTAMGGSLEFSNVALTYLTSNALGSIIPSPG GIGPVETALTAGLQVAGVSVSIGLPTAIIYRLVTFYGRIPFGWVAMKYMERKDLI >gi|319978065|gb|AEUH01000161.1| GENE 2 2636 - 3664 1248 342 aa, chain + ## HITS:1 COG:no KEGG:OB2992 NR:ns ## KEGG: OB2992 # Name: not_defined # Def: hypothetical protein # Organism: O.iheyensis # Pathway: not_defined # 62 342 48 321 323 201 38.0 3e-50 MTRSTRPLRAFAVALLATASSLAAAGMAQAAPAAPTDVVTYDAPGSESADWSTDNDLDAT GYWTPERMRNAIPVDSDPGAAPDPALLQPPSGSSLLSAPVAPDSPPVDLTEPVEPASGES PTMLLGNQVYVPITTGKVFYSVGSNDYVCSGSSINTPGKSMVATAGHCLHAGAGGTWHSR 
IAYAPAYSNGTTPYGLWHMKTGTAFNGWTDRSDARYDYGFFIVYPLNGRYLVDRVGGNGL SYNRGTTNKGVRIWGWPAAAPFDGSRPYHCDGDTFAYGTGGDMAMNCRMNGGSSGGPWMT STSSATVGRVFGVTSRITTSGPARIISRPFDSDVVRLYQSLE >gi|319978065|gb|AEUH01000161.1| GENE 3 3668 - 3928 199 86 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MERHRYGRVAALVSAVAVVGALVALYAWSAAGGRGPGSTDPSPSVSQSGPSSESTATHVP YDQSAQSYWTEDRMRNAEPAQMPSAP >gi|319978065|gb|AEUH01000161.1| GENE 4 4007 - 7162 3953 1051 aa, chain - ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 88 668 656 1313 1734 190 28.0 1e-47 MTPHMPRAALTIALTGALIGGAAFAAPAQALPSGRAQAAAPSAPASSAPASSTPPASALK PAVGATTDQNGFEIDANGVLLRYTGSASDITIPGKATAIADEALRGTTAERITVPATVRT IGDRAFADAQQLKALVFEDTADHPSQLTAIGAEAAANTPALTEVAVPSKAASLGEGAFAD SGITRLSLPDSLTSIPARLVKNSSLLTEAVVGNAVTEIGEAAFSAAHSLKGLSVRQADGS TAPGFPSSLVTIGQEAFNATGLGSVDLPASVRTIGDAAFNAIERLSHVGLNEGLVSIGSS AFRTTGITEIVIPDSVTTVGSSAFSECSALTSAHIGRGVEAGQLSDAFTSSRQLSGFTVA PDNASYEAVDGVLYSKDHTSLVVYPAGKGSGTTYTVLDETTRIESGAFNSAPLKGVVLPS SLRSIGDWAFSGSYLESLVLPKAFETFGECAFSGMPSLARVDLGGTKTIGEQAFLASHAL QEVDFRADLGRLTTISAYAFAETGLVSAVLPDSVTSLGDGAFTRSTELKEVHLGSGLTEL GARVFTGTTALASLSVDPSNPVLSLDGSVLYQQGNDGSHLVYAALAAPLPAYSVRPGTAQ IDEAAFKGHATLREVVVPEGVKAIGAQAFAGIHELTDVALPDSLEEVSDAFAGTQVETIE FGARIRTIEEAAFEGGLPVRMIVRGGKDGTFASPEEWVPNHTASAFFGEGMKRIAYTHGN FPQTLVVPSTLEELVLTTGLLTAEQLAEAHVYVAADRDSAAWSVAKTAVEKAGLDPASHL HRYVDPALSLSSPAIDRAGDASKVQVKAGETIEVMATMTGGVAVGREARAIEVGPGGAET VVSDWNSASDSDPSLDDADGGAISSWVRFTWVPSVDGASLRVESRDITFRVVTGVLGSAL PPAPEPGEGQWVQDAGGWWYRFADGTYPRGQALVIDGGTYRFDQSGHALTGWVRDGGYWF HHGASGAMSTGWLVDGGAWYYLIPSWGAMAIGWVEDGGEWYYLNAAGHMTTGWQKIEGSW FYLGDSGKMATGWRHIDGYWHFFEPTGELRH >gi|319978065|gb|AEUH01000161.1| GENE 5 7256 - 7375 287 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MESDEEVNRSIGQLIIRSNNLSVMNIHYSLCCSINRLKW >gi|319978065|gb|AEUH01000161.1| GENE 6 7666 - 14934 9762 2422 aa, chain - ## HITS:1 COG:SPy0416 KEGG:ns 
NR:ns ## COG: SPy0416 COG1404 # Protein_GI_number: 15674549 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Streptococcus pyogenes M1 GAS # 237 1066 120 1024 1647 209 26.0 7e-53 MALFPEKGCAYPISVNHGAVMKRHIPRATLTLTLSGALVASAVALASPTAAAPPPTPPAS HAARPSTPAPAEPSGAAAQSGTGSGDDSQSGRGLEAALSSMSELAGGGAKAPEGGLLADE DRTGGTTPVTVIVQLDEGNVGIPWYRRIFGLSSSTKHEVVKDRITSAVSGALPGVSTSGG TPITDVKDYSTVIDGFAIEIPRGALEAVRGVEGVKAAFVEELHQRDIETPAPDAVGPPAN ASSLTMTHADSTAYKGDGQVIAVIDTGLEVGHPAFSGDLDDSTVKLSEGDVSSLKSKLAV GKEGVYVSEKIPFAYDYADNDADVVPHSSADLSHGTHVSGIATGNGGTIRGTAPNAQLMA LKVARDSDGALPDSAILGALDDVGVLEPDSVNMSFGSDAGMSDEAASIYADAYKVLREKG ITLNVAAGNSYSSAHNNTSGANLPYASDPDSSTVDEPSTFQTSLSVASVDNAEGMPYFTA AGLRIAYQEGANASGGDMASFGDIESGTYRYVDAGEGKAADIAAMNEANPDGIEGAFALV RRGGTSDAGKPMTFEDKVKALAPYKPAAVVIYDNVDSDSLVRPAVQTTTIPVAFISKANG EAMLAAADHNLVFEKGARLAPSTDYVMSDFSSWGVSPDLALKPEISAPGGNIYSALPGDK YGYMSGTSMATPQMAGIAAQVRQRVASDERFSSFTDSEKHDVVTNLLMGTAHPLVDVHQG TGAYYSPRKQGAGIADALAATTATVYPTVDGARSPSRPKADLGDGTSGWTFGITLRNVGS EARSYALSSQALSELVNDSGLISEHSKNFRGEGITVSYSGEGVSGTSEGATITVPAGGTA RVGVAIAVEQSFADYVGANLPKGTFVDGFVEFASQTDGEPGLTVPFLGFYGNWGDANVFD TKWSDEGAQKAHIYKSALASTTTGMPLGVNPLQKVRLLDEVKPDPSRYIVSNSGWSVSPN SIQPVTGTLRSTAKMTYTYTNEAGEAVRSYTFDGAHKSLYNPRRMQVLYAEARIGSPVFD GRDEQGNLVPDGAYKLTISTDTDGPDSSHQELSYDFTMDTVAPEISNLATTGEGEEKTVS FDVTDSSPLAAVDFADPDSGGWFHRILVSDDGTVEADGTHRYHFDVRASDLQKSWTEQGG KGKLPATPLLRAWDFGINASKNHRVVVESIPMTSLTLDQSSLSLVPGQNAKLVASHEPAE ANMTDVVWTSSDQAVATVSDSGVVTGVGQGEADIVVASAADPAVSATAHVKVAPVSEETG IALSDSPIIIPTSGQASVSALLAPSLRGSSVTWALDVEGASIQPSADTLGATITAGDRSA RGTVTATVTTPAGDKTATAPVEVHQSDYDDFVIDADGVLTQYRGSGSTINVPDTVTAIAD EALRGVNAERINIPATVRTIGNRVFADARRLTTVVFEDTADHPSQLTELGIALVDNDYSL DEVLLPSKVVKLGDGAFSNSTIKRLSLPDSLTAVPADLAERSSQLNEVTIGDAVTEIGAA AFSENTSLGELRIRQADGSTKAGLPSALTTIGASAFVGTRFASLDLPASVRAIGDSAFMT ARLTHLGLNDGLVSVGKQAFSGTLLTEVELPDSVTTVGSGAFKAMPELTKAHLGPNVAAD QLVAGFTLSPKLSSITVDSANASYESVDGVLYTKDRTHLVAYPMAKNSGGSYTVVDGTVR 
IDDEAFQQAPLRQIAFPASLRSIGASAFANARLTSVALPDQFETLEAHAFQATADLASAD LGGTVTIGDSAFDSASLTSVDFRTDLNRLETVGSMAFSGAPVRVLVFPDSLKSVGAFAFE NNPHLEEVHIGAALTDLDTGFLTGSDALRTLTVSADNPVYSAENNVLYAKQQDGTHLVLS LPSNTFTEYSVRPGTVQIDAQAFRNNKALQRVVLPEGLKVVKAGSFNNASSLTEIVLPDS LEAFDGLYSTGVEFIEFGTRIRTISENAFKGNVPDRMIVRGGVGGSFTGSTDSTGARSTA AFFGEGMKRINYGYGGSFPQTLVVPSTLAELALGTYALSAEQLAASHVYVAADEGSAAWS VARAAVEKAGLDPASQLHRYVEPTLSLSSPAIDRAGDASKVQVKAGESVEVTATLTGGVP DGREARFVEVAPDGTETVVSDWAVMGAAQPSPAPDANAGAEPGAQAPAGADEAGAQSGSD QAQSGAGQAQSGADQAQSGSGQAAPADQAAPTQSAASDQAAGAADGPDADGDAPASSAAF TWVPSVDGASLRVESRDITFRVVTGVLGSALPPAPEPGEGQWVQDAGGWWYRFADGTYPR GQALVIDGETYRFDQSGYLLTGWVRDGGYWFHHGASGAMSTGWLVDGGAWYYLVPGWGAM AIGWVEDGGSWYYMDPSGRMSTGWVSDGGYWYYLGASGAMATGWVSDGGHWYYLDASGHM VTGWQQIDGKWYEFGPSGALKG >gi|319978065|gb|AEUH01000161.1| GENE 7 15038 - 16771 228 577 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 337 559 133 354 398 92 31 2e-18 MLIRLLRARLRPYTWLLVGVVVLQFAQVMASLYLPTLNADIIDKGVATGDTAYIWRTGAF MLAVSALQGVCSVLATYLAARASMGMGRDLRGAVFERVSGFSERETTRFGAGSLITRNTN DVQQVQMLVMTSCTIAVTAPIMAIGGIIMAVSKAPALSWLIAASVPLLLVVVALIVSRMV PLFRSYQDKLDGINRVMREQLAGIRVIRAFVRERAETDRFEEANSSITRVSERVGQLFVL LFPLVMLMLDVTIVGVIWFGGHRVGDGDVEIGTLIAFMAYLMQILMGIVMASFMTIMIPR AAVCAERISEVLATPPTITAPAGATTAFPSPGSVEMRDVGFTYPDADERVLEGVGFTAAP GTTTAIVGSTASGKSTMVRLLARLLDASEGQVIIGGTDVREADPEALWAQIGLVPQHPFL FAGTVASNLRLGREDATDEELWEALGVAQAEDFVSEMDGGLDAAIAQGGTNVSGGQRQRL AIARALVRRPPVLIFDDSFSALDVATDARLRASLGPATAGTTKIVIASRVSTIVDADQIL VLDAGRLVASGTHEQLLAASEVYKEIVTSQFGAEALS >gi|319978065|gb|AEUH01000161.1| GENE 8 16768 - 18807 2708 679 aa, chain + ## HITS:1 COG:Cgl0939 KEGG:ns NR:ns ## COG: Cgl0939 COG1132 # Protein_GI_number: 19552189 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Corynebacterium glutamicum # 32 674 15 642 656 689 55.0 0 MSTQDKTPTTQTSPEEEELAAEALASSGGWHDGPPPGKAKAFWPSFKRMIGLLTAHRNWM 
AAILVATAASVALAVIAPKVLGRATNVVFEGVVSQALPAGTTKEQAIEALRAQGMDDFAT MLSAMDITPGTGIDFQRLGGILLLVLGLYAGSAFLNWFQGYMLNRITARALYRLRRRVEQ KIHRLPLSHFDRMRRGDVLSRLTNDVDNVNNTLQQTLSTALSSILTVVGVLAMMFSISWK LALVALITVPLSGIVFGFIGPRSQKAFTAQWARTGTLNTRVEESFSGHALVRVYGRTDSV RRAFAEENEQLYRTALRAQFLSGIMMPIMLVIGNIVYVAIAVVGGAMVASGALLLGDVQA FIQYSQQFSQPLGQLGAMATAVQSGTASAERIFELLDAEEEGADAPARAPSDTDGPGAAQ DSPAPSDGATAPKGAGAIEMRHVRFSYSPGTELIRDLTLRVDPGNTVAVVGPTGAGKTTL VNLLMRFYELDSGRITIDGHDIAAMTRHEVRRRTGMVLQDPWLFAGTVRENIRYGRPGAS DAEVEEAARACFVDHIIRALPHGYDTVLEEDAANISAGERQLLTIARAFVANPSVLILDE ATSSVDTRTELLVQRAMNALRQGRTSFIIAHRLSTIRDADTILVMEDGDIVEQGGHAELL AKGGAYARLHAAQFASASD >gi|319978065|gb|AEUH01000161.1| GENE 9 18969 - 19874 889 301 aa, chain + ## HITS:1 COG:no KEGG:CKR_2073 NR:ns ## KEGG: CKR_2073 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 17 300 18 300 300 270 43.0 5e-71 MSIATLLDEHSIDLPSYSEVSIESGRAPITLSVWEAERDTSPTVVFVPGTMTHPLFYSPF LDALSRHGYHVIGVHPLSHGRSPRLIRRFTLDDMIGNVRDAVGYACARFPGAVGLMGSSQ GGVLTLLTAGAEHRIRAAFPHNVLFTPLRESLSVTRFPNALGVVYPAIRRIIAAVGRVLP GLQVPVGFYLDVDKVFSERQWRDAFFADPMRLESYPMSFLSSLFTTDMSALVDGSISCPV VAFVSTGDPLFTLEYSRLVYERLVAPEKRLVELPADHHLVLNEKAERVIPTVLAALDDYL K >gi|319978065|gb|AEUH01000161.1| GENE 10 20019 - 21740 2054 573 aa, chain - ## HITS:1 COG:Rv1069c KEGG:ns NR:ns ## COG: Rv1069c COG4425 # Protein_GI_number: 15608209 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mycobacterium tuberculosis H37Rv # 4 528 48 561 587 277 35.0 4e-74 MGLSFLGLIGALAMYAVSVSPSLMARSWWWHAVASGVLVSIGYILGLIVQSAGARLIKAA DLRISADESVEWASRAGVAVLFLAWWLYAVVQSYRRARRAAALVGMRAETPWGYLLGAVG AVIIAHLFVGLITGVNLVGRALINALADHMPRWAAALAGVAILGAIIYFLTSNVILRGGI GFFRRHAERMNTRTAKGIFQPSVPERSASPASPSTWESVGGQGRVFLGRGPSRADVEAVT GRAAKEPIRVYAGMPSAGAGIDEAAALVVAELERTGAFRRAAVLVAASTGSGWVDEWQVQ PFEYLTGGDCATASLQYSYVPSALNWLTGLEPAQEASRALFRAVREAIDAIDGAERPALF VCGESLGAFASQSVFDSPADALARVDGALWVGTPAFTPMHRVLTASRHRGSPEVAPVVDN 
GRNVRFVNAPEDLRQDLYGRELGRWSFPRIVYAQHASDPVVWWNPRLLWRQPDWLRERAG RDVSKNVEFTRGVTYLQIMADLPVAGTAPAGHGHTYNEELVPLWRALLGFDLEEGESVPV PRLAGLDGSWATAGALDRIGAAVRANLALSERQ >gi|319978065|gb|AEUH01000161.1| GENE 11 21861 - 22535 858 224 aa, chain - ## HITS:1 COG:Cgl1397 KEGG:ns NR:ns ## COG: Cgl1397 COG0500 # Protein_GI_number: 19552647 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Corynebacterium glutamicum # 7 199 59 251 271 132 40.0 4e-31 ADRPARGLAADIGAGTGKMSALLAEEGLEVRAVEPSAAMRAQARPHPLIAQVAATAEDTG LGAASCDLVVYAQSWHWVDPAAAGAEAVRILKPGAPLVIVFNQMDVTAQWVHRLCRIMRS GDVHRADRPPRPAGFAPPVLERFWWEDRMEPEQILELGTTRSSYLRADAARRRAMQDNLR WYLYEHLGHRADQEITIPYSTLVWTTRAPAARVHSGSTPAPEVP Prediction of potential genes in microbial genomes Time: Thu May 12 18:14:06 2011 Seq name: gi|319978060|gb|AEUH01000162.1| Actinomyces sp. oral taxon 178 str. F0338 contig00162, whole genome shotgun sequence Length of sequence - 3223 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 213 - 1721 1986 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 2 1 Op 2 1/0.000 - CDS 1746 - 2207 378 ## COG0563 Adenylate kinase and related kinases 3 1 Op 3 . 
- CDS 2297 - 3223 1033 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters Predicted protein(s) >gi|319978060|gb|AEUH01000162.1| GENE 1 213 - 1721 1986 502 aa, chain - ## HITS:1 COG:Cgl2685 KEGG:ns NR:ns ## COG: Cgl2685 COG1502 # Protein_GI_number: 19553935 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Corynebacterium glutamicum # 9 501 7 499 500 366 39.0 1e-101 MVPLPPPELPTWLIATIVGVDTVIRLVALGVVPRHRRASTSTAWLLIIFPVPFLGVPLYL VFGSWWAMGRALDDDPKASGIVKELLAAAPGWTPHAHEATAAIMRMNRETTRFGETTGSV GGLYTAPAEAFAAMADAVDRAAHHVHVLYYQTSWDDDTAVLYEALQRAAARGVVVRLLVD HHGLRSIPGWREFRDRLDGSGLQWHVMLPFNPFKGPVRRPDLRNHRKLLVVDSRTAFVGS HNLVAADYDTPAYAEAGIEFHDTTVSVHGSIARQVETVFATDWFYACGEVLGPADLDPED PGGDDLGGGEDAEHTGGAHASPMQIVPSGPAYRSEPGLRMFVDLIHCATRSVSVVSPYFI PDEALLSALTNAALKGVDVELFVSEESDQFLVGHAQRSYYEPLLKSGVRIHLYQAPTVLH SKYVMIDRRTVTIGSSNMDFRSFGLNYEVMLLAEDPGLAEMIAANDQRYRERSRELTLAE WIGKPWYAHYVDNVCRLGSALL >gi|319978060|gb|AEUH01000162.1| GENE 2 1746 - 2207 378 153 aa, chain - ## HITS:1 COG:Cgl1057 KEGG:ns NR:ns ## COG: Cgl1057 COG0563 # Protein_GI_number: 19552307 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Corynebacterium glutamicum # 1 138 4 139 177 81 41.0 6e-16 MLIDGLSGAGKSTLAAALAPPGGPWRVLGLDSYYPGWDGLEEGSRETARIARDLAAGRDT HYTPWDWEAGRPLAPVLLPAGIPTVIEGCGALTPASRASADLAVWVEAGGGAGQRRARAL ARDGDVYAPHWRRWAAQDGARDGARLADLVVRT >gi|319978060|gb|AEUH01000162.1| GENE 3 2297 - 3223 1033 308 aa, chain - ## HITS:1 COG:Cgl1056 KEGG:ns NR:ns ## COG: Cgl1056 COG0619 # Protein_GI_number: 19552306 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Corynebacterium glutamicum # 61 307 4 250 251 196 46.0 5e-50 GGADGAEDADEAGGADGAGGAGSSGAQASATAAEATDRADHPRTQAPRTPGREDGRRRRG 
LDRFNPVTRILALIVMTTPLLISVDPVSASVALGLELLVLPAARMRARSLALRCSPLAVA APLSALSMLLYASPGGATYWSMGPAAITDRSIWLAIGIALRVCAVALPAIVLLSNIDPTD MGDGLAQILHLPARPILAALAGARMTGLMAADWKALERARRIRGIGDGSRVRSFLRGSFS LLVFALRRSAKLSLTMEARGFGAPGPRTWARPSRMGAADAALMAVAVLIPAAAIAVAVLT SNLSLVGR Prediction of potential genes in microbial genomes Time: Thu May 12 18:14:06 2011 Seq name: gi|319978058|gb|AEUH01000163.1| Actinomyces sp. oral taxon 178 str. F0338 contig00163, whole genome shotgun sequence Length of sequence - 676 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 653 322 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P Predicted protein(s) >gi|319978058|gb|AEUH01000163.1| GENE 1 3 - 653 322 217 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 1 211 146 355 398 128 42 9e-31 ALDGVDLRIEPGERVLVLGPSGSGKSTLMAGLAGLLGGEDEGEAKGGLVVDGAAPADLRG RIGLVMQDPEAQVVLARVGDDVAFGMENLGVPREEIWPRVSAATGAVGLDAAPDHPTTAL SGGQKQRLALASVLAMGPRLLLLDEPTANLDPVGVAGVRGAVERVLDESGATLVVIEHRV DIWADLVDRVVVVLDGAIAADGPIDEVLASQARTLRE Prediction of potential genes in microbial genomes Time: Thu May 12 18:14:11 2011 Seq name: gi|319978045|gb|AEUH01000164.1| Actinomyces sp. oral taxon 178 str. F0338 contig00164, whole genome shotgun sequence Length of sequence - 11657 bp Number of predicted genes - 16, with homology - 11 Number of transcription units - 10, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 18 - 662 871 ## COG4721 Predicted membrane protein 2 2 Tu 1 . - CDS 1005 - 1271 70 ## gi|293192576|ref|ZP_06609530.1| ribbon-helix-helix protein, CopG family - Term 1343 - 1386 10.8 3 3 Op 1 . - CDS 1411 - 1884 590 ## Gbro_0247 hypothetical protein 4 3 Op 2 . 
- CDS 1945 - 2427 318 ## gi|154508175|ref|ZP_02043817.1| hypothetical protein ACTODO_00669 - Term 2957 - 2993 4.0 5 4 Tu 1 . - CDS 3156 - 3620 594 ## COG0251 Putative translation initiation inhibitor, yjgF family 6 5 Tu 1 . + CDS 3753 - 4424 954 ## 7 6 Tu 1 . - CDS 4428 - 4556 159 ## - Prom 4774 - 4833 1.7 8 7 Op 1 . + CDS 4734 - 4820 156 ## 9 7 Op 2 11/0.000 + CDS 4804 - 5433 454 ## COG1309 Transcriptional regulator 10 7 Op 3 . + CDS 5443 - 6708 1282 ## COG0477 Permeases of the major facilitator superfamily 11 8 Tu 1 . - CDS 6850 - 6984 67 ## 12 9 Op 1 . - CDS 7114 - 8280 826 ## COG1226 Kef-type K+ transport systems, predicted NAD-binding component 13 9 Op 2 . - CDS 8281 - 8394 215 ## 14 10 Op 1 . + CDS 8419 - 9051 852 ## Amir_3220 hypothetical protein 15 10 Op 2 . + CDS 9102 - 9893 790 ## Cfla_1573 VTC domain protein 16 10 Op 3 . + CDS 9913 - 11656 2197 ## Haur_2253 hypothetical protein Predicted protein(s) >gi|319978045|gb|AEUH01000164.1| GENE 1 18 - 662 871 214 aa, chain - ## HITS:1 COG:Cgl1054 KEGG:ns NR:ns ## COG: Cgl1054 COG4721 # Protein_GI_number: 19552304 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Corynebacterium glutamicum # 22 209 12 200 201 176 62.0 4e-44 MAAVLDERRHLSMAEPRTPAAPKMGWRVIDFVTAAVLAVACGLIFLVWNQVGGAGYELFG NLAPGLGGLATGIWMLGGPLGGFIIRKPGAALFVELLAASVSAALGSQWGITTLYSGLVQ GLGAELFFLLFAYRRYTIATAALAGAGAFCGAWAYEFVTGNYEKALSVNLVYLGTGVVSG ALLGGVLAWALTRALAATGALDRFAAGRQARQRV >gi|319978045|gb|AEUH01000164.1| GENE 2 1005 - 1271 70 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192576|ref|ZP_06609530.1| ## NR: gi|293192576|ref|ZP_06609530.1| ribbon-helix-helix protein, CopG family [Actinomyces odontolyticus F0309] # 3 85 8 90 95 77 51.0 3e-13 MTTYHLRGGGTATDEELEAEARMFEGGKYPGQWRPVPGRPPLFDEETAAVAVRLPVSQVE ALDDRAAASGSTRSEYLRALIAKDLETA >gi|319978045|gb|AEUH01000164.1| GENE 3 1411 - 1884 590 157 aa, chain - ## HITS:1 COG:no KEGG:Gbro_0247 NR:ns ## KEGG: Gbro_0247 # Name: not_defined # Def: hypothetical protein # Organism: 
G.bronchialis # Pathway: not_defined # 14 155 5 143 150 80 37.0 1e-14 MSIFVDEPVEGENDFDRVETDVVVDATVEKAWALVSEPGWWVNDGPLGDHDVTRGDDGVY RVNDPEAGQWLIEKADEDPMDVVAYRWYPLADDELPDGRTTRVEISLSEERGGVAIHVEE SGLSSVSDDEDVVRQAWEDEAGMWDQVLTAAKEYLES >gi|319978045|gb|AEUH01000164.1| GENE 4 1945 - 2427 318 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508175|ref|ZP_02043817.1| ## NR: gi|154508175|ref|ZP_02043817.1| hypothetical protein ACTODO_00669 [Actinomyces odontolyticus ATCC 17982] # 13 160 5 184 184 152 61.0 4e-36 MATNPEAAPAGAQTFKLVDASTQPAADCGCGGHEAPEAAPSGCGCGGHEAPEAAPASSSC GCGGHGGGHRHRHGHEHRGGAGVVHQSSDDELVIHSIPRVVRHAVLFAAVDSLPVGENIR IRAPHQPEPLFAHLRDSSSHYRVETLEAGPSWRYRVTRLA >gi|319978045|gb|AEUH01000164.1| GENE 5 3156 - 3620 594 154 aa, chain - ## HITS:1 COG:MT3779 KEGG:ns NR:ns ## COG: MT3779 COG0251 # Protein_GI_number: 15843295 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Mycobacterium tuberculosis CDC1551 # 6 146 5 145 151 108 53.0 4e-24 MTKPSAKLAELGIELPPVATPVAAYTPAVIDRGVVRTSGQIPVVGGEPVCVGAVGEGGVD PEDAKEAARVCALNALAAAAAAAGGVDRLAGVVKVVGFVSSRPDFYGQAGVVNGASELFG DVFGSAHARSAVGVAALPLGVSVEVEAEFALRQD >gi|319978045|gb|AEUH01000164.1| GENE 6 3753 - 4424 954 223 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDIVIQIGLICALLYVLGVLLHGRRSRRRLVFLRTLAFLTLTALGNALLLVLLAVGGGAL LSAVALIKDPNAFDSARNAAGILQSRFGDFRSVWLLLTLLVLVLLTAVIQLAVRRVLTLR FHLAADEEVYEIAEYFIQWFTIFLVVYQLYFAGIKEVFEAYLSRSLTNMAFDIALTPENL NIVMQPIAFSTWVVIAVEHMRKHAAHSAPPDAPPDPESGVGGD >gi|319978045|gb|AEUH01000164.1| GENE 7 4428 - 4556 159 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRGFLPQWDHCVRTAGFGPLGSDRWARTAGPGTALGAAALAS >gi|319978045|gb|AEUH01000164.1| GENE 8 4734 - 4820 156 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNSAIVPGSRNFYLSLTYELTCMYETNS >gi|319978045|gb|AEUH01000164.1| GENE 9 4804 - 5433 454 209 aa, chain + ## HITS:1 COG:PA3574 KEGG:ns NR:ns ## COG: PA3574 COG1309 # Protein_GI_number: 15598770 # Func_class: K Transcription # 
Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 1 62 1 62 212 58 53.0 7e-09 MRRTAEDAAKTRVALLEAALIAFEEKGWRGATFEHIAERAGVTRGALHHHFRDKRTLLAE ALEWGWSDYGQRLFDTVSPATGDTGSADGTVALDERAHRAVEELLARFVRLITGDRAFRA LASTTVLVAPQAFEHYDKGDALDDWRERIEGIVASRRGATNVAPRDIAGLVLVLLQGFTV AAVTRPDDLPQPDRLDATMAALVKGLLET >gi|319978045|gb|AEUH01000164.1| GENE 10 5443 - 6708 1282 421 aa, chain + ## HITS:1 COG:SPCC18.02 KEGG:ns NR:ns ## COG: SPCC18.02 COG0477 # Protein_GI_number: 19075881 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Schizosaccharomyces pombe # 38 384 24 424 448 85 23.0 2e-16 MTTNTTTNDPRPDGAAPDDGALPRATPARATAVSGAILFTDMLLQGLAIPVLPLLPAVVE RGAAATGILFASYAVAMVIATLFAGRMVDRRGSKGLLVAGVIVLAIATLLFATGGPYWLL LAARFAQGLAGGVAWVAALSLIAASTGFDERGQMMGIAMSTVTLGVLVGPPLAGFLVDAL GPASPFLVATAVALADLAALLALIPGSPRRTDDSAGPLAVLRVPGSASIVTTIAIGAAVL AAVEPVLPAHLGARASTTAIGILFGVAALAGIIANPIVGRFVASTSPRLLTGIGVVAAGA ALVVLGRSTGVWEAGVGMGLLGLSSALLLAPATTLISEQGFRSDPPTLGGSFALYNLAYA TGLAIGPLLTGFGVQQTGFATAMAIAAAVLAALGGSALTRLPTGWAAERASTTARADRTR A >gi|319978045|gb|AEUH01000164.1| GENE 11 6850 - 6984 67 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVSYPNGTAGPGRALGSDRWFGLLGSDRWARKGAERVAALGGAG >gi|319978045|gb|AEUH01000164.1| GENE 12 7114 - 8280 826 388 aa, chain - ## HITS:1 COG:VCA0194 KEGG:ns NR:ns ## COG: VCA0194 COG1226 # Protein_GI_number: 15600964 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, predicted NAD-binding component # Organism: Vibrio cholerae # 11 218 6 212 280 117 33.0 5e-26 MTRPTKDPDPRERLHKRWERLTEWPMIAIAVVFLAVYSAEVLTDPGEEGGHEAVLDCCWA LFAVDFAVRLATAPRRWHWLRHNILDFASVALPALRPLRLVRILAAFRVFQRRAETTFRG RLVLYTGGLSVLLVWMGALAVLQAERHAQGALITDVGRALWWSLVTVTTVGYGDISPVTP TGRVVATGFMLFGIALLGVVTGLFSSWIVERVRDDAEQAARGPGRAPGGVLGGAQAGACG 
TSWGRRADDRAQAGADGADGGGVGAAACGGAGGSLGCPDDGAGSPGDEGAAAEGGAAAGP GAASADGGAGVVGDGVGPDTAALAREVAQLRRAVARLTLHVQRLDGSAGRSGGAPAPGPS DRPQDGAEEGSGEWCGGQVPPGASTDVD >gi|319978045|gb|AEUH01000164.1| GENE 13 8281 - 8394 215 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVTINYSIHVWEIPGLFLGDFWDFPGIPSDREAMHNR >gi|319978045|gb|AEUH01000164.1| GENE 14 8419 - 9051 852 210 aa, chain + ## HITS:1 COG:no KEGG:Amir_3220 NR:ns ## KEGG: Amir_3220 # Name: not_defined # Def: hypothetical protein # Organism: A.mirum # Pathway: not_defined # 1 176 1 178 206 152 53.0 7e-36 MSTLQMIGIDLAAMLVLVLGLYFPRHRRSDLVAAFLGVNVGVLAVATVLANSTVSAGLGL GLFGVLSIIRLRSDQISQTEIAYYFAALSIGLLSGMSTQATPLLIGLIALILGALALGDS ALVFGRYTTRTVQLDSAIADQDALVAALEERLGASVVATRVIKLDLVNDLTLVDVRTKQS PGKSTTASPRRGARPHVGADPQSPAHAEAA >gi|319978045|gb|AEUH01000164.1| GENE 15 9102 - 9893 790 263 aa, chain + ## HITS:1 COG:no KEGG:Cfla_1573 NR:ns ## KEGG: Cfla_1573 # Name: not_defined # Def: VTC domain protein # Organism: C.flavigena # Pathway: not_defined # 13 259 51 290 294 155 47.0 1e-36 MPTTRTRVVPRALAPIGLGDLNRRAALLTRVDRKYVLGLEEARRVLALVDPASRLLSIDG LDVSAYSTTYYDTPGLDAFSMTARPRRRRFKVRTRLYLDSGQAFLEVKTRGPRGTTIKER QRIAPHEAGGPLTDAHREWMADRLERVGRPGAAREVEPVLRGSYSRSTLLLPGGRARATI DSDLAWAALDPRGRPARTLARPGLVIVETKGTATPSCLDRLLWCNGHRPTRISKYATAMA ALHEERPTNRWNRVLTRHFGLAA >gi|319978045|gb|AEUH01000164.1| GENE 16 9913 - 11656 2197 581 aa, chain + ## HITS:1 COG:no KEGG:Haur_2253 NR:ns ## KEGG: Haur_2253 # Name: not_defined # Def: hypothetical protein # Organism: H.aurantiacus # Pathway: not_defined # 56 578 52 576 617 231 38.0 7e-59 MKALKPAFTPARALGAVVLVSALALGACSSGSNGQASGAAATTAAPASATGAGDTTAPPT AAEALAANAKASHVEDSDWSASDAVDVALSGTGATSSSDGVSSADGTVTISKAGVYRLTG SLSGKVVVAAGDGDRVVLILDNATIASSSGAAIEATTGDDLVLSLVGSSTVTDGSYTADA EANAAVYADMDLTITGAGSLSVTSGASDAITSKDDLYVASGTITATASDDGLRGKDSLTI AGGEVAVASGGDALKSDQDSDETKGYVAITGGSVRAVADGDGIAAETDAIITGGSVDITT GGGAAAGEPASGSAKGIKAGTYVLTGGSPTVTVDSGDDAIHADGALRLSGGTVTASSGDD GAHAEVAAVLDGADLTVPRSTEALEGGLITVSAGKVDLTSTDDGVNASGSTTVEAGLAAK 
EESDSSTASPQPASPDGAGTMEDTGEQLTITGGVLTVNADGDGLDSNGSLTISGGTVTVY GPSSGANSALDSNGAMTITGGTLMAVSTNEMVETPTSTDGQGWVSAEVTGAAGDAVTVAD SSGTVLGSFTPPKAFRNVIFSAPGMANGSTYTVTAGSSSVE Prediction of potential genes in microbial genomes Time: Thu May 12 18:15:16 2011 Seq name: gi|319978036|gb|AEUH01000165.1| Actinomyces sp. oral taxon 178 str. F0338 contig00165, whole genome shotgun sequence Length of sequence - 11350 bp Number of predicted genes - 10, with homology - 8 Number of transcription units - 6, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 497 - 1954 1839 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen + Term 2192 - 2217 -0.5 - Term 1958 - 2005 -0.9 2 2 Tu 1 . - CDS 2197 - 3123 814 ## SGO_0828 hypothetical protein - Prom 3317 - 3376 1.6 3 3 Tu 1 . - CDS 3539 - 4132 377 ## 4 4 Tu 1 . - CDS 4364 - 4429 74 ## 5 5 Op 1 16/0.000 + CDS 4622 - 5659 1788 ## COG1879 ABC-type sugar transport system, periplasmic component 6 5 Op 2 21/0.000 + CDS 5821 - 7377 2341 ## COG1129 ABC-type sugar transport system, ATPase component 7 5 Op 3 11/0.000 + CDS 7377 - 8378 1394 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 8 5 Op 4 . + CDS 8391 - 9461 1621 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 9 5 Op 5 . + CDS 9458 - 10021 851 ## COG0698 Ribose 5-phosphate isomerase RpiB 10 6 Tu 1 . 
- CDS 10146 - 11177 1256 ## COG1609 Transcriptional regulators Predicted protein(s) >gi|319978036|gb|AEUH01000165.1| GENE 1 497 - 1954 1839 485 aa, chain + ## HITS:1 COG:Cgl0025 KEGG:ns NR:ns ## COG: Cgl0025 COG2865 # Protein_GI_number: 19551275 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Corynebacterium glutamicum # 7 476 5 492 563 153 29.0 8e-37 MIQTLEDLQLALAQMELHRGDTLSLEAKTFSEYSANALGPTLSAFANLPGGGSILLGVSE SDGVSVVGVDDAHALIQSAVSQARNGFSTEIRVEGGAFTIDGRTVAVLNVSEAPVNEKPC RWNKRKKSYIRQYDGDYAMSPQEEQQLLLRHGRPREDSKPVDGSSRADLNADLVAAFTSS VRRSTSRMADVDDDVILYRMNIVMGDGRLTTAGLYALGEYPQRLLPHLTLTAAIEGSGGA RAVNRRNFVGPIPEILDDALDWVMRNVDSAQVVTSDGVGTTNPSMPELAVREVVTNALVH RDLSEPASGKGIDLRLGDRELRLTNPGGLWGVTVDQLGTQNSKSAVNEFLYTICQNVESR HGRVIEALGSGIAAARDALREAGLPEPRFFDNGVSFTVLFPSSSLLPESDLAWLGSLGAS GLSHRQSRALVDMRHGTVFTNSAYRARFGVPADEARADLADLVRRGLAERSGERRWVKYR LPSRL >gi|319978036|gb|AEUH01000165.1| GENE 2 2197 - 3123 814 308 aa, chain - ## HITS:1 COG:no KEGG:SGO_0828 NR:ns ## KEGG: SGO_0828 # Name: not_defined # Def: hypothetical protein # Organism: S.gordonii # Pathway: not_defined # 1 306 1 306 308 366 54.0 1e-100 MDLYFGDVSLCYTNSLAMALDSYGFHFRPEYLEALMVMGNGASVVENDAEHPLVFFDNGH PDASISNCLRILGFEYEDFYVEEAEGPSADGAWERLQRMLANGPVVAGPLDMGYLTYNPN CRDLEGVDHFVCVSGLHEERVRLHDPAGFPCMEMGLEDFARAWAAEGIAYKRGAFSMWGD FTRTREPDAEEIYHETSLVMRKRYEQGESGVMRMYADSIRQNGMNRQQKQIHQFFSFRLA AARSIYMGRFLEEHDPERAELKERIADLFGQAHLNSMEDDSARLADTLARIAELDERFKA LCLQYERG >gi|319978036|gb|AEUH01000165.1| GENE 3 3539 - 4132 377 197 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGQVNSLPQCGLTCSNAANLPQCDPRRPGGCGTRAAGAVFTAVPPGPWASEPGRAALWAV STTVPVRRAQFSQRYRAFAALPHFASVITLAKRRYPFENSASLVKRPGRLPRGCGSPLKM RHCLQNDPCAVPTARIQRQSSGTAVGAAPCERPNWLAFCGGGPSGGRQTAGHLSGKIIPQ HMTPPGLPGQEDLEDPA >gi|319978036|gb|AEUH01000165.1| GENE 4 4364 - 4429 74 21 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQSLSSAGSQESGEGWMQTPP 
>gi|319978036|gb|AEUH01000165.1| GENE 5 4622 - 5659 1788 345 aa, chain + ## HITS:1 COG:AGl86 KEGG:ns NR:ns ## COG: AGl86 COG1879 # Protein_GI_number: 15890150 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 36 345 70 380 380 250 42.0 3e-66 MKSRIAASAIVFTSAIALSLGACSGGGGTQSGKAGAGGGNYTIAVVPKDSTNPWFVRMET GVQRYAQETGMNVFQKGPAETDATMQAQVIQDLIAQGVDALCVVPVDPGAIEPVLKQAMD AGITVVTHEGASQQNTMYDIEAFNNAEYGAFIMDNLAEAMGEEGTYTTMVGHVTNASHNE WADGAVAHQKEKYPNMTLLEAEPRVESQDNSETAYQTAKEVLKKYPEVKGILGTSSFDAP GTARAIDELGLKGKVFTAGTGLPNANKQILQDGSVASLTLWDPADAGYAMASLATKILNG EKIADGVNLGVTGYENMHFSPGSDKVLEGNGWIQINKDNVDSFGF >gi|319978036|gb|AEUH01000165.1| GENE 6 5821 - 7377 2341 518 aa, chain + ## HITS:1 COG:AGl85 KEGG:ns NR:ns ## COG: AGl85 COG1129 # Protein_GI_number: 15890149 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 493 40 534 538 477 49.0 1e-134 MEAPLLTVSSVSKSFKGVHALQGIDLELAPGEIHCLAGENGSGKSTLIKVISGVHAPDSG SITIDGITWDRLTPLEAIAAGVQVIYQDFSVFPNLSVMENLALTTEVAAGRALVNWKRFR AIAERAVATIGFKVDLDAKVGDLSVADKQLVAICRALISDARVIIMDEPTTALTKNEVAA LFRIILDLRARGIAILFVSHKLEEVFEIAERFTILRNGRKIITCPKEDLDRRSFAKHMTG KEFDETRFEPTALSDEPVLEVEGLSAPGAFEDVSFELRRGEILGITGLLGSGRTELALAL FGAHPTQAGTIRVQGDEVRLRSVNDAIDAGIAYVPEDRLTEGLFLSRSIGENIVISELSD FTNALGGVNKRAIAAEQARWVSDLEIATPDPGNAVNTLSGGNQQKVVLAKWLATKPDVLI LNGPTVGVDIGSKFTIHSILRELAADGMAVIIISDDISEVLTNCSRVLIMRGGQLREAVD PATTSEARLGELLAATDPQPGSSGPADNAGADQPEGGR >gi|319978036|gb|AEUH01000165.1| GENE 7 7377 - 8378 1394 333 aa, chain + ## HITS:1 COG:ECs0377 KEGG:ns NR:ns ## COG: ECs0377 COG1172 # Protein_GI_number: 15829631 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 329 1 320 323 172 35.0 
9e-43 MEALKKKALHANEFWVFLVVVALALIIQVRSGQFFTANNLVDLAGAMVVPGLFAVAAHLV LVSGGIDVSFPALASLAVYATTRVLVDNGFDGPVIVPFLIAAGLGALLGAFNGIFTSRLT VPTLIITLGTANVFNGFMQGALKSVQINTIPSSMQSFGQASLFVAQNRASGLQSAMPVSF LLLVAVVAAAFFLTRYTMFGRGILAIGGDEAAAARAGFSVAGTKFWLYVIVGIIAAMAGM VRTCSMGQMHPTNLLGMEMMVIAAVVLGGTAITGGKGSLTGVMLGTLLIVIVQNSMILTG IPTIWQNFALGLLIIVGTGISAVQVARSRGRTA >gi|319978036|gb|AEUH01000165.1| GENE 8 8391 - 9461 1621 356 aa, chain + ## HITS:1 COG:AGl81 KEGG:ns NR:ns ## COG: AGl81 COG1172 # Protein_GI_number: 15890147 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 20 333 28 339 346 194 38.0 2e-49 MNTKTRSTKTGALPAFLTRDPYITRLVLVLVLLLAFFAIVKSGPFFAVRTWQSMAVQFPE FGLMSLGVMFTMFTAGIDLSAVAIANVTSICAALALRAVLEGQPAATAPSMATATAVALG IALAVGALCGLVNGVLIAVAKIPPILATLGTLELFGGIAIVLTDGKPVSGVPEAFGAVFA SRVGGVVPMPLVVFLVCALVAAAILGLTGFGTRILLLGTNEKAARFSGMRVASLLIRTYI LSGVMAAMAGLVMLANYNSAKSDYGASYTLLTVLIVVLGGVNPNGGSGRLSGVLFAIVML QVLSSGLNMFPNISNFYRPLIWGAVLLFVICMNEGAFSVRRLIDASRRLLKKGTQQ >gi|319978036|gb|AEUH01000165.1| GENE 9 9458 - 10021 851 187 aa, chain + ## HITS:1 COG:CAC2880 KEGG:ns NR:ns ## COG: CAC2880 COG0698 # Protein_GI_number: 15896134 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Clostridium acetobutylicum # 9 134 3 128 152 99 38.0 4e-21 MKTVRSKNVVVGADFAGFPLKEAVKRHLEERGWTVTDLTPVLEETPMYHRVGFSLGAQIA EGRFERALAFCGTGMGIHIAASKVPHVHAAVCESVPSARRASAANNANLLAMGAFYVAPR TAMAMADAFLASELGSGYEDWEGFYAYHRIGYDECENFDYEAYKANGFEVVDPQTAELAE QPRGLAY >gi|319978036|gb|AEUH01000165.1| GENE 10 10146 - 11177 1256 343 aa, chain - ## HITS:1 COG:HI1635 KEGG:ns NR:ns ## COG: HI1635 COG1609 # Protein_GI_number: 16273524 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Haemophilus influenzae # 13 335 6 333 336 147 29.0 4e-35 MTPEPTPRPRLVDVAREAGVSVATASRALGKGSELIRADTRDHVRQVARRLGYRVNPVAR 
SLRLSTTGHIGMVVPSIGNPFFMELVAQVEHCLAERGLNLLLADARMSVAHEDHLLRSLE SGAVDGLVVAPCHETYSTPALERAAAHVPTVQLDRGVQLDVPMVGVDDPHGIRTLLDHLA ERGVERIALLSNTGSNVSSVTRVRAARAVAEELGMRLDPDDIIECSFSVDEAGAAVNRVL DRSGAVPDAFLCLNDLLAIGAITALRGRGIAVPERVQVTGFDDIQFAALMRPTITTLHQP LDAIADRGVSILMGDDDSTGRIHIEGTLVERESTRRRPGAVGE Prediction of potential genes in microbial genomes Time: Thu May 12 18:15:34 2011 Seq name: gi|319978029|gb|AEUH01000166.1| Actinomyces sp. oral taxon 178 str. F0338 contig00166, whole genome shotgun sequence Length of sequence - 4391 bp Number of predicted genes - 5, with homology - 2 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 252 271 ## 2 1 Op 2 . + CDS 267 - 1985 1569 ## Bcav_1045 hypothetical protein 3 2 Tu 1 . + CDS 2196 - 3077 951 ## 4 3 Tu 1 . - CDS 2985 - 3200 131 ## 5 4 Tu 1 . + CDS 3560 - 4391 826 ## COG1940 Transcriptional regulator/sugar kinase Predicted protein(s) >gi|319978029|gb|AEUH01000166.1| GENE 1 1 - 252 271 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no DGARCALGAADATGGGASGGLGILIGPLVGPLLAAGRLAARVCLEAAADKAAATADHLEE AAALMQATEDSIGDDLKRIGEQL >gi|319978029|gb|AEUH01000166.1| GENE 2 267 - 1985 1569 572 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1045 NR:ns ## KEGG: Bcav_1045 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 37 283 2 240 412 147 37.0 1e-33 MSGAEALDTRPSTGADTGAGASDALHGSASPSSSSSSLNPLVAGREESRSAVSGSGVVED IWGLAHGLSEGSWLETGLSSVSLVADAVGVGVDPLGTLIAWGAGWLIDHFGPLKSWMDQF LGDADSVRADAATWSNVAWAMGECADSLEQDERGLMGEQVGATARGYRASNADTISALRT ASGAADAMGKATSVLAEVVGLVHDLLRDAISAIVGTLASAIIEAIATFGLAIPLIIAQVQ VKVGAKATQMAAHITGVLKSARSLAKQLSSARGLLELLRSLLSRGKRAVLGIVEFFKDGR RKARELFERITHGSKPHWDEDGATARAQELVAAGRTAHDTKVALENKFKDLIKKYGLEED YIGKNGRLKVNKDNYATLADDIADQGATIDEIDQVLTDGSALTEARRAEKAASEELGAEF RENTGEPPRTVIPGIGGVGGGNRSFDLLAVDERVESIDFVEAKGGLHPKMGHARNILPDG SKGEVLEQGTGAYLNHIARQDKNFLQLLRDNPDLWERIKSGQVTLNNVVCKTPTADLSLI ERTTEAFTLEPETIRALDTELNPPPTTNGTTP 
>gi|319978029|gb|AEUH01000166.1| GENE 3 2196 - 3077 951 293 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGFVYRFQTGCHQMTARPLCMRVLALITIQLEKSSCAHYPHPTVPKTQIQDFQRSGFYIT VLVSDSAETCPSFQSHSNLPRKRNIISKSSRLRPMVRRSAPTTSNPGPTRTENGASPTTM TTGFTAVTPQRAVQIMKAWAYHPWPITFDEGTRIYTSLGFKGDPDKSQFFTSDSSPTETD SFYYGDENTIDGVRMRLSNCVPKNDWANSLPKSRLAFQDYIDAYTRTLGAPITEKDNGTD FSTRWIVNGSIAVMIGGNEAFINLIVESPQQTKHLLDELEAKANGEEIGADYF >gi|319978029|gb|AEUH01000166.1| GENE 4 2985 - 3200 131 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPSWTLIGQGWWAHAFMIWGVTASNSAVMARGAASLLGWHVLEVVGADLLTVGLGFELIK KVLGLLRGLDY >gi|319978029|gb|AEUH01000166.1| GENE 5 3560 - 4391 826 277 aa, chain + ## HITS:1 COG:BH1094 KEGG:ns NR:ns ## COG: BH1094 COG1940 # Protein_GI_number: 15613657 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Bacillus halodurans # 9 197 10 199 407 62 27.0 1e-09 MRKKDLHSLRKTNRRLVLGLLAEGAPLSRAEIAERTGLSPSTVSTIMSDLLAEGLVHDTG DTESTGGRGRRLLGINPACGVMAIAEIQRQRTTVSFHDMGLGRLGARTLEVPIKPGSGDR LLQAIGDCVSAFAHCSPTPGGLAGIGLLFDDQAIAPDVTVLYSTGFDEASISLRQALITR FKVPVVEESAQACSARELVAAHLDNARSWAHIALGPRITLSLTTEEGPAPIGSRGLADLT EAVLPGQAHPERVLAALARSPLPPRLRPAPHPGWPTA Prediction of potential genes in microbial genomes Time: Thu May 12 18:16:04 2011 Seq name: gi|319978025|gb|AEUH01000167.1| Actinomyces sp. oral taxon 178 str. F0338 contig00167, whole genome shotgun sequence Length of sequence - 4179 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 301 376 ## 2 1 Op 2 . + CDS 351 - 1664 2202 ## Apar_0160 hypothetical protein 3 2 Tu 1 . 
+ CDS 1923 - 4073 2767 ## COG0145 N-methylhydantoinase A/acetone carboxylase, beta subunit Predicted protein(s) >gi|319978025|gb|AEUH01000167.1| GENE 1 2 - 301 376 99 aa, chain + ## HITS:0 COG:no KEGG:no NR:no AGQAGADEAGAAPVPARPSWPHFLGRLSDVITTVVSLVPLDALLVSGEVASTTGFTALLA RTLALRLRGRAPCVIPLTHAVPAEPARMATALRATVLTS >gi|319978025|gb|AEUH01000167.1| GENE 2 351 - 1664 2202 437 aa, chain + ## HITS:1 COG:no KEGG:Apar_0160 NR:ns ## KEGG: Apar_0160 # Name: not_defined # Def: hypothetical protein # Organism: A.parvulum # Pathway: not_defined # 1 437 1 433 433 542 74.0 1e-153 MIQGILILLTFIVVAALMMAKKLPTLVALPLMAVVIGLIAGVPLIAKDDQGVATGLLDGV FEGGTVKMASAIIAVVFGAWLGQLMNRTGVTETIIKKSAELGGGRPLVVTLIMAVATALL FTTLSGLGSIIMVGSIVLPILISVGIPAASAACVFLMSFTTGLALNVANWQVFAKLFSLD IAVVQGFEFYLAAATGVATVVFILVEFARNGIKFAFSAPAPSPAGPSSDTRALKGVTGAL AMLTPLIPIVLVAALKLPVTVAFTIGIVWCLVFTARGFSRAMNTLAKTCYDGITDAGPAV ILMVGIGILFLSVTHPKVKEVLDPILLAVVPSGRWAYILFFIVLAPLALYRGPMNMFGLG SGIAALITGLGTMNPLAVMAAFLSAERIQAGADPTNTQNVWTANFCEVDVNTCTKRILPY LWAVSAIGVLLGAFLYF >gi|319978025|gb|AEUH01000167.1| GENE 3 1923 - 4073 2767 716 aa, chain + ## HITS:1 COG:aq_708 KEGG:ns NR:ns ## COG: aq_708 COG0145 # Protein_GI_number: 15606108 # Func_class: E Amino acid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: N-methylhydantoinase A/acetone carboxylase, beta subunit # Organism: Aquifex aeolicus # 1 501 1 478 660 129 27.0 2e-29 MPVRIGIDVGGTFTDAVAIDNTTFEVIGSAKVPTTHSAPEGVAAGIVQALHQVMESEGIA PDDVVFIAHGTTQATNALLEGDVATVGVLTLGSGLQGAKSRSDTTMGRIELAEGKELPST NQYVDASDDGLAERIDAALDGLRDQGAQAVVAAEAFSVDDPAHEALVVDRCAERGVPATA TNEISKLYGLKVRTRTAVVNASIMPKMLEAATMTERSIRQARIPSPLMVMRCDGGVMTVD EVRRRPILTILSGPAAGVAGALRYERLTDGVFFEVGGTSTDISCVKDGQVMVTYAEVGGH KTYLSSLDVRTVGIGGGSMVQIRGGRAVGTGPRSAHIAGIDYEVYADAGTITNPRLVAVR PLPGDPEYAVVECDGGARVCLTMAGAANIAGLVGDQDYARGDVEAARRAWAPLAKSMGCS VEEAAEAVLAFASQKNAAVASQLMRDYHLDPRTTVFVGGGGGAAAVVPHLARTMGHRHRL ARNAAVISTIGVAMAMVRDMVERSVVNPTDEDVIAIRREAELKAIQSGAAPGTVEVTVEV 
DSQRNMVRAVAVGATEMRAETAEDGALGDGELLATVAENLDLPAQELTIAARTDQMLAVT ARVRRKRLFGLLSSTKIPVRLIDAFGVIRLQKANARVWPATVASAATVVDEAIDELTVYN DGGANYPNLHLVVGGKTINLSGLSSEDQIRSLAGVELQGYDGAAPLIVIGTTRLED Prediction of potential genes in microbial genomes Time: Thu May 12 18:16:17 2011 Seq name: gi|319978021|gb|AEUH01000168.1| Actinomyces sp. oral taxon 178 str. F0338 contig00168, whole genome shotgun sequence Length of sequence - 2670 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 48 - 413 469 ## Apar_0162 hypothetical protein 2 1 Op 2 . + CDS 410 - 1897 1555 ## Apar_0163 hypothetical protein 3 2 Tu 1 . - CDS 2254 - 2670 108 ## Predicted protein(s) >gi|319978021|gb|AEUH01000168.1| GENE 1 48 - 413 469 121 aa, chain + ## HITS:1 COG:no KEGG:Apar_0162 NR:ns ## KEGG: Apar_0162 # Name: not_defined # Def: hypothetical protein # Organism: A.parvulum # Pathway: not_defined # 1 109 84 193 217 82 34.0 6e-15 MCADAGLRVVRGGAERAAGRLFFAEYLPKKRLVVVNTPSLRMWADHHGVGLGTAVDMALC HEYFHHLEWTRLGLTSLALDIPLLRVGPIRVGRVRFRSLSEIGAYGFTHECFHPSSEGVP Q >gi|319978021|gb|AEUH01000168.1| GENE 2 410 - 1897 1555 495 aa, chain + ## HITS:1 COG:no KEGG:Apar_0163 NR:ns ## KEGG: Apar_0163 # Name: not_defined # Def: hypothetical protein # Organism: A.parvulum # Pathway: not_defined # 1 485 1 448 459 533 54.0 1e-150 MTQRTVDYGGRALPVIGAYDVAVVGGGSAGSACAIRCAQLGLSVVVIDRYAMLGGSATNA TVCPMMPTHVAHRGVFAQIERELRASGEATRDGTTTMLWFAPERLGMVYERLLTGSGGEV LYDATLVDVLDAPGADSAATGARADGADTTGRPADKAGTGDPGTAPGPATRPAPRRLRHL VVMTAQGMVCVEAGVMVDASGDAVLARLAGVPTRSGDEAGANQVCSLRFTMGDIDVDAYR DYCLSLGDEFSPLKFGFFFESAMVVGRGFALEPLFRQGVAEGLLSEGDLRYYQAFSVPGK PGVMAFNCPHIPSLVSNTTARGRSAAIVEAHAAIDRLARFLRAKMPGFGSSFLVGVAPML GVRESWRIDGVYRLTEQDYARQARFDDGLVRGDWFIDVHSQSKGLFHQKGYQPGDYYEIP YRCFIAPQVSNLLVVGRCISTTFLMQASVRIIPTVIDMGDAAGCACALSAANGVLPLALS GVEVRRCADAASPLG >gi|319978021|gb|AEUH01000168.1| GENE 3 2254 - 2670 108 138 aa, chain - ## HITS:0 
COG:no KEGG:no NR:no ARGAMTPHAYRTTTVKVDLRWDGNRWLLAELVPLHASGADPSDPDASRPAPSHSGAPRSA PPPTSPRSAPSPTGAPAGVQTPTPPEPGSSAPGTGGPGGPGGPSAPEPTATSWSQSPPAP DDPRAEPDPTDPNTAGGR Prediction of potential genes in microbial genomes Time: Thu May 12 18:16:50 2011 Seq name: gi|319978001|gb|AEUH01000169.1| Actinomyces sp. oral taxon 178 str. F0338 contig00169, whole genome shotgun sequence Length of sequence - 25475 bp Number of predicted genes - 24, with homology - 18 Number of transcription units - 13, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 616 410 ## gi|293190866|ref|ZP_06609028.1| conserved hypothetical protein 2 2 Tu 1 . - CDS 939 - 1541 732 ## gi|293190865|ref|ZP_06609027.1| hypothetical protein HMPREF0970_01360 - Term 2146 - 2181 4.1 3 3 Tu 1 . - CDS 2206 - 2712 415 ## 4 4 Op 1 11/0.000 - CDS 3429 - 4634 1829 ## COG4214 ABC-type xylose transport system, permease component 5 4 Op 2 11/0.000 - CDS 4648 - 6189 2218 ## COG1129 ABC-type sugar transport system, ATPase component 6 4 Op 3 . - CDS 6252 - 7376 1746 ## COG4213 ABC-type xylose transport system, periplasmic component 7 5 Tu 1 . - CDS 7624 - 8739 1323 ## COG2115 Xylose isomerase - Prom 8785 - 8844 1.5 8 6 Op 1 . + CDS 8746 - 8823 155 ## 9 6 Op 2 . + CDS 8834 - 10114 1504 ## COG1940 Transcriptional regulator/sugar kinase 10 6 Op 3 1/0.000 + CDS 10184 - 11197 1160 ## COG0524 Sugar kinases, ribokinase family 11 6 Op 4 . + CDS 11199 - 12401 1304 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 12 7 Op 1 . - CDS 12427 - 14442 2353 ## Cfla_0468 protein of unknown function DUF1565 13 7 Op 2 38/0.000 - CDS 14504 - 15331 1451 ## COG0395 ABC-type sugar transport system, permease component 14 7 Op 3 35/0.000 - CDS 15328 - 16284 1289 ## COG1175 ABC-type sugar transport systems, permease components 15 7 Op 4 . - CDS 16297 - 17580 1920 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 17712 - 17771 5.4 16 8 Op 1 . 
+ CDS 17552 - 17617 115 ## 17 8 Op 2 . + CDS 17670 - 17774 258 ## 18 8 Op 3 . + CDS 17820 - 18884 1165 ## COG1609 Transcriptional regulators + Term 19041 - 19092 3.4 19 9 Op 1 11/0.000 - CDS 19142 - 20497 1694 ## COG1070 Sugar (pentulose and hexulose) kinases 20 9 Op 2 . - CDS 20595 - 21776 2052 ## COG2115 Xylose isomerase - Prom 21863 - 21922 2.8 21 10 Tu 1 . + CDS 22421 - 22621 97 ## 22 11 Tu 1 . - CDS 22810 - 23238 291 ## Bcav_1339 hypothetical protein 23 12 Tu 1 . + CDS 24062 - 25378 665 ## COG1940 Transcriptional regulator/sugar kinase 24 13 Tu 1 . - CDS 25407 - 25475 82 ## Predicted protein(s) >gi|319978001|gb|AEUH01000169.1| GENE 1 1 - 616 410 205 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190866|ref|ZP_06609028.1| ## NR: gi|293190866|ref|ZP_06609028.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 57 204 21 181 198 105 44.0 2e-21 MVCTGEREGIPGSAARPGDAGAASSHCQYVPGTAPTTPPADEGEPADEGGGGEGEAPPST ETIVRTALARVPVSGAGLSWQPRKKSYTNAGVPTIVYATAPSQTHTASLFGREVVITLTA SRYSYDFGDSTPPLVTARAGEPWRRGNKEARLTHHYEQVTRGGERRVIALTTTWDATTTN PFTGETLTLPAVVTTTEQSSPFPVF >gi|319978001|gb|AEUH01000169.1| GENE 2 939 - 1541 732 200 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190865|ref|ZP_06609027.1| ## NR: gi|293190865|ref|ZP_06609027.1| hypothetical protein HMPREF0970_01360 [Actinomyces odontolyticus F0309] # 32 198 93 265 266 81 32.0 2e-14 MGARALAVLVVLVAAVSVGAYVVFLRGEPDYGMSHGYQVQSDGSLKRPPVTDKAPEQPEE MSAGGETGAKATARYYLKARAYAWNTGDTGPLKSISEESCVHCQNAITHVEEFYAHGYWG TGGYNDVTETQIVRSLDENEFGPGAYAIQLRFDEHMPKGYTSNGYVDATVRDTIIKLHVH WDGSAWRVMEGAAENAEDVK >gi|319978001|gb|AEUH01000169.1| GENE 3 2206 - 2712 415 168 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSGDYYVQSDGSLRRPGGVGEFPVPDPLVGEYSDAGAEAMASYYFETVAYAWNVGDPRAF MDMVVVSCQACEDTAVRIREAYADGGWAAKARASDFQVSTPAQVDDQQTYGDETYAVDVS FSERTPDVYANGTLTPGVERVRGARVLVGWDGYFWRVVEVEDQPQDGS >gi|319978001|gb|AEUH01000169.1| GENE 4 3429 - 4634 1829 401 aa, chain - ## HITS:1 COG:BMEII0362 KEGG:ns NR:ns ## COG: BMEII0362 COG4214 # Protein_GI_number: 
17988707 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, permease component # Organism: Brucella melitensis # 1 397 1 396 396 276 44.0 5e-74 MSTQTAEAAAPSERTSWLKRIDLGQYGITLALVAVVVFFEVMTGGRLLRSNNVASLIQQN AYVIIMAIGMLMIIIAGHIDLSVGSLIAFIGGVLGIAMAQWNLPWGLAVVLALAIGVAVG AWQGFWVAFVGIPGFIVTLAGMLLFRGLAIVIVPQTIAPLPSGFVQIANAGILGWFAYIG VYDVFTIVLGVVLAAVFAVSQLRRRASLVKHGLVVEPLWLAVVKMVGVGVFVIAFAFILA GSAIGLPFVLVICAVLVLVYAFVLNKTTFGRYIYAVGGNREAAKLSGISVRSINFWIFVN MGLLAAIGAIVTTSRAGAAVSAAGQNYELDVIAACFIGGAAVWGGIGRISGTIIGALFMG VLNMGLSIMTVDSAWQQAIKGLVLIIAIAFDQLNKARSADG >gi|319978001|gb|AEUH01000169.1| GENE 5 4648 - 6189 2218 513 aa, chain - ## HITS:1 COG:BH3441 KEGG:ns NR:ns ## COG: BH3441 COG1129 # Protein_GI_number: 15616003 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Bacillus halodurans # 10 511 7 508 517 567 59.0 1e-161 MTEQPQSAVILKMEDISKDFPGVKALSHVTFDVRENEIHAICGENGAGKSTLMKVLSGVH PYGSYGGRILLDGHEAAYRSVKDSEKDGIVVIQQELALSPHLSIAENIFLGNERATRGII DWDRTNAEAKELMAEVGLDESPSTKVSDVGVGKQQLVEIAKALGKEVRVLILDEPTAALN DDDSQKLLDLIVGLKRSGISCVIISHKLSEIRAIADRVTVIRDGQTIKTHDVRAGGEITE DVLIRDMVGRAMENRFPDHEPHIGDEVLRVENWSVVNPVTGRRALHDVSFTARAGEIVGI AGLMGAGRTELAMSVFGRSWGRYESGDLYLRGERIEVASVKAAIKAGIAYASEDRKRYGL NLIASIKHNATSASMDRVSKAGVISDITEYQLTEAQRAKTRTKARSVDVVVGTLSGGNQQ KVVLSKWLLTEPDVLILDEPTRGIDIGAKYEIYTLINELADAGKAVVVISSELVELLGLC DRIYTLSYGRITGEVERADATQEILMSHMMQES >gi|319978001|gb|AEUH01000169.1| GENE 6 6252 - 7376 1746 374 aa, chain - ## HITS:1 COG:SMb20895 KEGG:ns NR:ns ## COG: SMb20895 COG4213 # Protein_GI_number: 16264937 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, periplasmic component # Organism: Sinorhizobium meliloti # 1 371 1 353 355 293 45.0 5e-79 MKLIRLFAAAATCAIALAGLGACSSERGGTGTGGTGGASASDTLVGIAMPTKSLERWNRD GSHLEEILKGKGYQTSLQYADNKVDQQITQIENMITQGANILVVASIDGTALGPVLQKAA 
NAGIKVIAYDRLINATENVDYYVTFDNYKVGQLQGEYIKQQVASMTPPVYLEPFAGSPDD NNAKYFFSGAWDVLLPLVEDGTLEVRSGKAPKANDEWQSIGILGWSSDDAQAEMENRINS FYSDGTKVDVVLSPNDSLALGIAQALEGSGYTPGSDYPLLTGQDCDKANIKNVIAGKQAM DVLKNTELEAEQTALMIDQIVSGQQVDINDTTTYDNGVKVVPTYLLAPRTVTKDTVKEVL VDSGYYTASDIGLE >gi|319978001|gb|AEUH01000169.1| GENE 7 7624 - 8739 1323 371 aa, chain - ## HITS:1 COG:xylA KEGG:ns NR:ns ## COG: xylA COG2115 # Protein_GI_number: 16131436 # Func_class: G Carbohydrate transport and metabolism # Function: Xylose isomerase # Organism: Escherichia coli K12 # 18 321 33 350 440 68 24.0 2e-11 MFVSTANRHAPMGTHSRVPGRRSEETIQVTRFKYTVGPWNVAEGTDVYGPPTRQPLAMRE KVARFAEMGFSAIQFHDDDAVPDIASKSVAQIEDEAHELRALLDSLGVGCEFVAPRLWFD PAFKDGAYTAPKKEDWERAMWRTERSIDIANILGADLVVLWLAREGTLCMESKPPVAMIR QLVRSLDRMLAYDDHVRIAIEPKPNEPIDRSYAGTAGHAIALGDMTADPSRCGILVESAH SVLAGLDPTHDMAFGLASGKLFSVHLNDQNGMKFDQDKIFGSESLRSAFNQIKLLVDNDY GSNGECIGLDVKAMRSTGDRFGFKHLENSLRVVRLMEDKAALYDQSVVDSLVAEDDYEGL EMYILELLLGV >gi|319978001|gb|AEUH01000169.1| GENE 8 8746 - 8823 155 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFYVETITAMRSVSTDFRTIILFMN >gi|319978001|gb|AEUH01000169.1| GENE 9 8834 - 10114 1504 426 aa, chain + ## HITS:1 COG:BH2758 KEGG:ns NR:ns ## COG: BH2758 COG1940 # Protein_GI_number: 15615321 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Bacillus halodurans # 23 380 10 359 386 111 25.0 2e-24 MAPLRSPLDYGFAGGGAVGHANVRHQNLALLSRTIYAADAPPTRAALASATGLGRATVSR LIQDLLDSGFAHEQDPGGATGRGRPGPPIAPAARTIAALGLEVNVHFVAGRALDLGGNTL AEFRLDSADVESDPAGTLRLLGRSAAAMATNLRATGTEVIGARLAIPGLLNAHDQRLLVA PNLGWSDLDPLAMLGREWEALGVPTLTRNDADLQSLTAAYQRPGKLLADGAFLYIFGDVG VGGAIMDRGAPIRGAHGWAGEVGHTTVHADGLACKCGSRGCLEAYAGQNALRRAAGLDPD APIDALTARLSDGDERATRAVEAAGEALGIAIAIAVNLLDIPRIVFGTSLGVLLPWLTPP ITRELRARVLGHASRGIVLEACPITVLPACTGGALEVLNTTIAQPAAWIDRHGPAPAPNG AQHETD >gi|319978001|gb|AEUH01000169.1| GENE 10 10184 - 11197 1160 337 aa, chain + ## HITS:1 COG:YPO1816 KEGG:ns NR:ns ## COG: YPO1816 COG0524 # 
Protein_GI_number: 16122068 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Yersinia pestis # 27 318 26 281 319 68 25.0 2e-11 MVSIQVAGHVCVDLTPRLRSREVIVPGELTEVGPIEVRMGGTISNCARAIGQLGAEVFLS GMVGDDDLGLISRRHLEDAHPGHVELIVHPTAATSYSVVVCVPGEDRSFWHHTGANDEYR GEGALQPRSILHFGYPTLCPAMCADSGRPIVDLFERAHAQEGATSLDLAYLARNSPLRAI DWDLLFARVLPHADVFCPSWDDIASCLPGIDPVFDRARIEERARAFIAQGAAMVLITAGK HGAHLATSDTHALGRLARLTGIDAEEWAGQSIWDPAILVERQTDQPLDSNGAGDTFKAAF LVALTRRARPRQAIRFASEVVGKKLLRRPLATELPGA >gi|319978001|gb|AEUH01000169.1| GENE 11 11199 - 12401 1304 400 aa, chain + ## HITS:1 COG:PM1062 KEGG:ns NR:ns ## COG: PM1062 COG0246 # Protein_GI_number: 15602927 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Pasteurella multocida # 7 338 6 337 385 182 33.0 1e-45 MPLLVAFGAGNVGRGFIGELFSAAGWDVAFLDVDPRLVERLGEDRSYIHETVSNSGVLRQ VVTRVTAAYSTDQSAVDALVAEADLVTTSVGVRVLGAIAPALAHALQARWAAGRGALDVL LCENLHDASSAFRRMLRDALPASARARLETDVGLAETSIGRMIPTTPPAPGQPPTLVRAE PYRILPYDATGLTAPEPRVQGLFPVRDIPFSFYTDRKLLIHNMGHCACAYLGELLGATFI WEAIAEPRVHSIVRAAMMESALALSVVYGAGIGALGLHVDDLLARFANRALGDTVERVGR DPERKLGAKDRFIGAYALARRAGTPVEYLSLALAVGARRLAGEDGWDEARALAHVDAHLF PGGSPARDLFGLQFDSLAHGLDWEGQQALIDASPLRGTVA >gi|319978001|gb|AEUH01000169.1| GENE 12 12427 - 14442 2353 671 aa, chain - ## HITS:1 COG:no KEGG:Cfla_0468 NR:ns ## KEGG: Cfla_0468 # Name: not_defined # Def: protein of unknown function DUF1565 # Organism: C.flavigena # Pathway: not_defined # 1 665 1 646 653 712 57.0 0 MSTIIHVATTGSDDAPGTEERPLRTIDRAARLAMPGDTVRVHEGTYRERVNPRRGGLSDR RRITYEAAPGERVVLKGSEAVTDWSDQGGGVWRTDIPNSVFGPFNPFAAPIEGDWVVEPY VFGDEKRHLGDVFIDGVQLYEAPCRDQVFAPSRRTRVRDSWTGTTVPVEDPDVTELVWFA EVGAGATTIWANFGRRAPASSLVEVSVRPTVFWPSEHHIDWITVRGFEMAHAATQWAPPT AHQQGLVGPNWAKGWIIEDNEIHHSKCVGVCLGKEGSSGDNYATLRRDKPGYQYQLESVF AARHIGWDKERIGSHVVRRNHIHDCGQAGVVGHLGCAFSRIEDNYIHNIALRREFWGHEI AGVKLHAPIDVTIARNVITDCSLGIWLDWETQGTRITRNVLAANCRDLFVEVSHGPYTVD 
HNVLASRASVEIASCGGAFVRNLIGGTVRLDPIMDRATPYHVPHSTQIAGFGFIPGGDDR WVGNLFFGGDADEAYAPDGWFGGRAHHGLEGYAPYPASWEQYMEGVGESATDHERYFGRK LPVYARSNVYLEGARPFDGEEGSAEIPGECALSVSVRAGVGEGLAGPDAALRLVVLRVAL PGDFSGFRLPLPQGADLERAYYADAEFEAPDGSPVDLAADLSGAVAPAGTMVPAGPLDSL AGGTQEVAVVR >gi|319978001|gb|AEUH01000169.1| GENE 13 14504 - 15331 1451 275 aa, chain - ## HITS:1 COG:TM0430 KEGG:ns NR:ns ## COG: TM0430 COG0395 # Protein_GI_number: 15643196 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Thermotoga maritima # 20 272 15 267 271 133 31.0 5e-31 MTTHKGFDRYVSGALGWLWLIVTIVPIYYIVITSLRSQSSFYTENPLLPTGSPTLKAYAD VFQNKFWMYFLNTAIVTVVSVAVILLVCVMISYYIARRRNTWSRRAHSMILLGLGIPMQA VIVPVYYLVVQLGLYDSLFGIIAPSIAFAIPVTVMIMVNSMRDIPEELFDSMAVDGAKDW QILWRLVVPLSAPAVTTAGVYQALQVWNGFLFPLILTQSGSVRVLTLSLWTYQGQYTSNI PAILAAIILSALPVLVAYAVGRKQMVAGLTAGFGK >gi|319978001|gb|AEUH01000169.1| GENE 14 15328 - 16284 1289 318 aa, chain - ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 37 300 11 280 292 139 30.0 6e-33 MSSRDRGAPAPAAAPALPAPRGGRGNQGSRGSGPSGLYALPALAFFALFAIIPLIGVLVL SFMNWDGLGAPAWAGFDNWSQTLADPLTRHAMVLTLIVMVVSWLVQTPLSILLGVFMGSR QRYREFLSVLYFLPLLFSAAALGIAFKALLDPNFGFSQALGIPFLKQDWLGSQALALPVL IAIISWAFIPFHSLIYQGGVRQISKSLYEAAELDGASTWRKFWSITIPQLKYTIITDTTL QLVGALTYFDLIYVMTGGGPGNATRILPLHMYIAGFKSFDMGQAAVLGTVLLVVGLALSL GLNRLSGASRMESQASGQ >gi|319978001|gb|AEUH01000169.1| GENE 15 16297 - 17580 1920 427 aa, chain - ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 2 420 6 428 438 137 26.0 6e-32 MMASIGAIAMGASLALSACGGAKPGGGADGSASAWAVTGGTHEQIWRTSFDKWNEAHADA 
QINVEWFANDAYKEKVRTAIGSDNAPTLIFGWGGATLKDYVSSGKVVDITDATQQTVAKV IPSIAEGGKVDGKVYGVPNVGTQPVVLYYNKQLFDQAGVAVPTTYDELLAAVDTFKAQGT TPIALAGASKWTNMMWLEYLLDRNGGSEVFNRIAEGQAGAWSDPAVETTLTQVQDLVRAG AFGDAFGSVVADNNADVALVHTGKAAMLLQGSWAYGTFLTDSPDFVKAGNLGFAAFPTVA GGVGDPSAVVGNQSNFWSVSASASPDAQTAAKDYLADLFTDEYVKSIVDGGDIPPTIDAQ KFIKGTEQEEFVGFGYDLVLDSSNFQLSWDQELDSATSQVLLDNIQQLFALSMTPQQFID SMNATIK >gi|319978001|gb|AEUH01000169.1| GENE 16 17552 - 17617 115 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAMAPMEAIIRILVLTGQSFQ >gi|319978001|gb|AEUH01000169.1| GENE 17 17670 - 17774 258 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPRNNETVSHISASMFQFRTKTNSVKACLHCLTR >gi|319978001|gb|AEUH01000169.1| GENE 18 17820 - 18884 1165 354 aa, chain + ## HITS:1 COG:VC2677 KEGG:ns NR:ns ## COG: VC2677 COG1609 # Protein_GI_number: 15642672 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 16 347 3 331 335 143 30.0 6e-34 MIDTGNTGPAPRQRVTIAMIAEAAGTSRATVSKALNGRTDVSAATRDSIAAIATELGYRP TGAAQAVQSIAFVTNEFDSIYTAHVLSGAISECAGQGYLLTSGHLGIGGAPDATRPLSDA WLSAVSRTHIGVIVATTHVDPSLVRGCGRLGLHLVAVDPKGPVEEGVTTIGATNWNGALT ATQHLIGLGHRRIAFSRGATDSMPAGEREQGYRSAMRMAGLDVPPALVGGDDYSFASGHS SALSFLRLPEFERPTAIMCASDLVALGAIEACRSLGVAVPRDCSVIGFDDSELAALSSPP LTSVRQPMRHMGEAAARAVIDQHEGRTSASHPMRLETRLVIRDSTCPAPGAGHA >gi|319978001|gb|AEUH01000169.1| GENE 19 19142 - 20497 1694 451 aa, chain - ## HITS:1 COG:Cgl0113 KEGG:ns NR:ns ## COG: Cgl0113 COG1070 # Protein_GI_number: 19551363 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Corynebacterium glutamicum # 7 448 4 417 460 340 49.0 2e-93 MSTRPYVAGVDTSTQSCKIVIVDPGTNRIVRQGRASHPDGTEVDPQAWWDAFLEAVDSAD GLDDVRALSVGGQQHGMVVLDAQGQVIRPALLWNDTRSAAAARDLIRAKGSDGRDGADSE GAAWWAGATGSVPVASLTVTKLRWLADNEPENAARTAAVCLPHDYLTWRIAGGFEAVGLK GLVTDRSDASGTGYVDRSGGQYRLDILADALRIDEETARWIVLPRIAGPWDTVGHGDPAR GWDAIALGPGAGDNAAAALGVGLTPGAALLSLGTSGVVSAVSDHPVSDPSGLVTGFSDAS 
GNWLPLACTLNASRIIDAMARVTGLDYQEFDEAALAVPDAAGLRLVPYFEGERTPNLPDA TASLEGMTLANSDRQHVARATVDGLLDLMRFALDAMRALGVPVERVLLVGGGAKSTAVRA LAPQALGAAVEVPEPGEYVALGAAKQAARLR >gi|319978001|gb|AEUH01000169.1| GENE 20 20595 - 21776 2052 393 aa, chain - ## HITS:1 COG:AGl774 KEGG:ns NR:ns ## COG: AGl774 COG2115 # Protein_GI_number: 15890502 # Func_class: G Carbohydrate transport and metabolism # Function: Xylose isomerase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 287 55 355 453 95 30.0 2e-19 MTRQPRPEDKFSFGLWTVGWNAVDPFGTATRPVLDPWEYTAKLAELGAWGITFHDNDVFD FAATDAERHDRVMKVKEAADAAGVVIEMVTTNTFTHPVFKDGGLTNNDRSIRRFGLRKVL RNVDLAAELGATTFVMWGGREGAEYDSSKDLNAAFDRYKEGLDTVAGYIKSKGYDLKIGL EPKPNEPRGDIFLPTVGHALALIAELDNGDVVGLNPETGHEQMAGLNYTHALAQALNAGK LFHIDLNGQSGLKYDQDKAFGHGDLVSAFFTVDLIENGFPCGGPRYEGPRHFDYKPSRTE GMEGVWESASANMEMYIRLAEKAKAFREDPATAELLKAASVYELGEPTLAEGESVEDFLA DASVYEDFDADAKGAREYHFVELYQQAMRHLIG >gi|319978001|gb|AEUH01000169.1| GENE 21 22421 - 22621 97 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLSLAVPRRCCSVLVQIASELPRIRQNPPFFPRLRRFLEQLAVKPAFPLETQTHFGPQP LEHPGR >gi|319978001|gb|AEUH01000169.1| GENE 22 22810 - 23238 291 142 aa, chain - ## HITS:1 COG:no KEGG:Bcav_1339 NR:ns ## KEGG: Bcav_1339 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 22 142 121 241 262 76 38.0 2e-13 MRQHLLEGMAALPLVRKMRANARWTLENADPGCESPGECRVLYALKRAGLEGLLTQVEVP TPGGMFFIDIAIPGLGIAIEFDGRVKYGASVREVHESLEAESRRQRLLELAGWTVIRVRW GDLKRIDEIIARVRLAIAARRR >gi|319978001|gb|AEUH01000169.1| GENE 23 24062 - 25378 665 438 aa, chain + ## HITS:1 COG:BMEII0106 KEGG:ns NR:ns ## COG: BMEII0106 COG1940 # Protein_GI_number: 17988450 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Brucella melitensis # 39 377 4 336 374 128 32.0 3e-29 MPDHAAAPASSGNGHTCFKAAHGERLTQGMRATNLKASNLSLVLRKILANPGEITRAAIS SSTGITRATISRLVDDLIGRGFVEELDPVGDSGRGRPANRLTPTAGRVAALGIEMNVSAL 
DIMLVDLTGRVLARKRVRGDFAGSDPIETMHRASRAARRVVDAGLPSGALFLGSGVGIPG LVSPTALALAPNLGWRDIPLEELLAPISCLDPVVVANEADLAAFAVAHSVPGAPSGPSSF IYVSGEVGVGSGIVVDHRPFRGVRGWSGEIGHICADPNGPMCKCGARGCLESYLGMYALA SRAGLDRGARAADILREAAVSERARSALDEAGAALGRCLAAVINATDIPVVILGGVVAEL APALVAPARKELGTRVLQAAWVPPMIRVWGESKSLTARGAALRVLQRLVDDPLSWLEEES GGEWPRARAEKPDQAALR >gi|319978001|gb|AEUH01000169.1| GENE 24 25407 - 25475 82 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no GGVLDIEAPAGDAPLVVTAPAV Prediction of potential genes in microbial genomes Time: Thu May 12 18:18:12 2011 Seq name: gi|319977933|gb|AEUH01000170.1| Actinomyces sp. oral taxon 178 str. F0338 contig00170, whole genome shotgun sequence Length of sequence - 80915 bp Number of predicted genes - 73, with homology - 56 Number of transcription units - 32, operones - 13 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 38/0.000 - CDS 1227 - 2084 1330 ## COG0395 ABC-type sugar transport system, permease component 3 1 Op 3 35/0.000 - CDS 2077 - 2997 1511 ## COG1175 ABC-type sugar transport systems, permease components 4 1 Op 4 . - CDS 3071 - 4327 2106 ## COG1653 ABC-type sugar transport system, periplasmic component 5 2 Tu 1 . + CDS 4327 - 4437 74 ## 6 3 Tu 1 . - CDS 4452 - 4595 262 ## - Prom 4616 - 4675 2.8 7 4 Op 1 . - CDS 4719 - 5750 598 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase 8 4 Op 2 4/0.250 - CDS 5737 - 6789 712 ## COG1609 Transcriptional regulators 9 4 Op 3 21/0.000 - CDS 6840 - 8195 1577 ## COG0477 Permeases of the major facilitator superfamily - Prom 8229 - 8288 1.9 - Term 8290 - 8326 -0.9 10 4 Op 4 . - CDS 8396 - 9703 410 ## COG0477 Permeases of the major facilitator superfamily 11 5 Tu 1 . + CDS 9989 - 10675 71 ## COG0640 Predicted transcriptional regulators 12 6 Tu 1 . - CDS 10703 - 10768 66 ## 13 7 Tu 1 . + CDS 10797 - 12185 1447 ## COG3177 Uncharacterized conserved protein + Prom 12220 - 12279 2.9 14 8 Tu 1 . + CDS 12311 - 12379 138 ## + Term 12523 - 12551 -0.1 15 9 Op 1 . 
- CDS 12417 - 12593 120 ## 16 9 Op 2 . - CDS 12532 - 12744 139 ## 17 10 Op 1 . + CDS 12938 - 13594 600 ## Pcryo_1611 hypothetical protein 18 10 Op 2 . + CDS 13591 - 14241 569 ## Pcryo_1610 hypothetical protein 19 10 Op 3 . + CDS 14245 - 17775 3699 ## RER_28960 hypothetical protein 20 10 Op 4 . + CDS 17784 - 21521 3687 ## COG1002 Type II restriction enzyme, methylase subunits 21 10 Op 5 . + CDS 21607 - 22761 1314 ## Arch_0273 hypothetical protein 22 10 Op 6 . + CDS 22758 - 23402 488 ## Ajs_4079 hypothetical protein 23 10 Op 7 . + CDS 23407 - 25941 2691 ## RER_28940 hypothetical protein 24 10 Op 8 . + CDS 25938 - 28034 2732 ## COG4930 Predicted ATP-dependent Lon-type protease 25 10 Op 9 . + CDS 28117 - 28599 257 ## + Term 28645 - 28686 12.0 - Term 28976 - 29014 -0.9 26 11 Op 1 42/0.000 - CDS 29037 - 29945 1243 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 27 11 Op 2 . - CDS 29960 - 30745 817 ## COG1121 ABC-type Mn/Zn transport systems, ATPase component 28 11 Op 3 . - CDS 30742 - 31941 1374 ## Sare_2619 hypothetical protein 29 11 Op 4 . - CDS 31938 - 33602 1743 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 30 11 Op 5 . - CDS 33592 - 35238 1298 ## PPA1500 hypothetical protein 31 12 Tu 1 . + CDS 35523 - 35588 139 ## 32 13 Op 1 . + CDS 35966 - 36544 792 ## 33 13 Op 2 . + CDS 36541 - 37044 553 ## 34 14 Op 1 . + CDS 37703 - 39001 1561 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs 35 14 Op 2 . + CDS 39029 - 40000 1242 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes + Term 40033 - 40068 8.1 - Term 40011 - 40064 12.3 36 15 Op 1 . 
- CDS 40069 - 40698 482 ## COG1011 Predicted hydrolase (HAD superfamily) 37 15 Op 2 25/0.000 - CDS 40843 - 42126 579 ## COG1475 Predicted transcriptional regulators - Term 42162 - 42193 1.3 38 15 Op 3 15/0.000 - CDS 42350 - 43231 795 ## COG1192 ATPases involved in chromosome partitioning 39 15 Op 4 3/0.250 - CDS 43331 - 44041 508 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division 40 15 Op 5 16/0.000 - CDS 44034 - 44537 742 ## COG1847 Predicted RNA-binding protein 41 15 Op 6 18/0.000 - CDS 44579 - 45922 1873 ## COG0706 Preprotein translocase subunit YidC 42 15 Op 7 . - CDS 45979 - 46326 309 ## COG0759 Uncharacterized conserved protein 43 15 Op 8 . - CDS 46323 - 46664 229 ## Tbis_3595 ribonuclease P protein component 44 15 Op 9 . - CDS 46695 - 46835 210 ## PROTEIN SUPPORTED gi|227494188|ref|ZP_03924504.1| ribosomal protein L34 - Prom 46941 - 47000 2.5 45 16 Op 1 . + CDS 46942 - 47136 203 ## 46 16 Op 2 16/0.000 + CDS 47139 - 48734 2589 ## COG0593 ATPase involved in DNA replication initiation 47 16 Op 3 18/0.000 + CDS 49124 - 50260 1618 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) 48 16 Op 4 5/0.000 + CDS 50268 - 51455 1349 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 49 16 Op 5 . + CDS 51448 - 52158 710 ## COG5512 Zn-ribbon-containing, possibly RNA-binding protein and truncated derivatives + Prom 52180 - 52239 1.8 50 17 Tu 1 . + CDS 52290 - 53294 572 ## 51 18 Tu 1 . + CDS 53699 - 54241 514 ## 52 19 Tu 1 . + CDS 54402 - 55298 904 ## - Term 55025 - 55076 1.7 53 20 Tu 1 . - CDS 55285 - 55410 62 ## 54 21 Op 1 24/0.000 + CDS 55435 - 57552 3269 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 55 21 Op 2 . + CDS 57618 - 60170 3825 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 56 22 Tu 1 . + CDS 60284 - 60682 766 ## Sked_00080 hypothetical protein 57 23 Op 1 . 
- CDS 60669 - 61568 771 ## BL0336 hypothetical protein 58 23 Op 2 . - CDS 61574 - 62632 1153 ## HMPREF0868_0365 hypothetical protein - Prom 62836 - 62895 79.6 + TRNA 62819 - 62895 90.0 # Ile GAT 0 0 + Prom 62821 - 62880 79.0 59 24 Tu 1 . + CDS 62929 - 63063 188 ## + Term 63091 - 63158 30.2 + TRNA 63073 - 63145 88.7 # Ala TGC 0 0 60 25 Tu 1 . - CDS 63134 - 63196 86 ## 61 26 Op 1 2/0.250 + CDS 63406 - 64596 1257 ## COG2814 Arabinose efflux permease 62 26 Op 2 . + CDS 64577 - 64897 234 ## COG0640 Predicted transcriptional regulators 63 27 Tu 1 . + CDS 65055 - 66008 1114 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases 64 28 Op 1 . - CDS 66181 - 66822 867 ## COG0671 Membrane-associated phospholipid phosphatase 65 28 Op 2 . - CDS 66844 - 67224 172 ## PROTEIN SUPPORTED gi|90021194|ref|YP_527021.1| ribosomal protein L20 66 28 Op 3 1/0.500 - CDS 67221 - 68246 689 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 67 28 Op 4 . - CDS 68287 - 71346 3367 ## COG0728 Uncharacterized membrane protein, putative virulence factor 68 28 Op 5 . - CDS 71343 - 74024 2620 ## Arch_1808 hypothetical protein 69 28 Op 6 . - CDS 74024 - 74467 603 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 70 29 Tu 1 . + CDS 74620 - 76218 1967 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase 71 30 Tu 1 . + CDS 76342 - 77808 1948 ## COG0362 6-phosphogluconate dehydrogenase 72 31 Tu 1 . + CDS 77963 - 80425 3124 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) + Term 80614 - 80668 -0.5 73 32 Tu 1 . 
- CDS 80659 - 80913 275 ## FRAAL5563 WD-40 repeat-containing protein Predicted protein(s) >gi|319977933|gb|AEUH01000170.1| GENE 1 1 - 1225 1521 408 aa, chain - ## HITS:1 COG:TM0306 KEGG:ns NR:ns ## COG: TM0306 COG3669 # Protein_GI_number: 15643075 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Thermotoga maritima # 1 366 4 359 449 325 48.0 1e-88 MDPRPAPAARARTWEDLRRPTPQWYRDAPLGFFVHWGPYSVPAWAEDHGELGAEDDWRAW FTHNSYAEWYYNTIRIPGSPAAERHRSLYGSLDYEAFLDMWDPSAFDPASWADLFKRAGG HYAVLTTKHHDGVTLWDAPGTGSRNTVRRGPRRDLVAGFADAVRGAGLRVGLYYSGGLDW HYRPHPPILSEEDCKDVCRPKDADYARYCYEHARDLIDRYRPDVLWNDIDWPDEGKDFGE HGLGRLLEHFYATRPEGVTNDRYGGVYADFLTSEYQHMGEAEGGAVWENCRGVGLSFGYN RAEGPDQYLSAAAALRHLIDVVSRGGRLLLNVGPRADGSLPDQQVQCLEGMAAWMDRHRD ELIATAPLGQVEVLDAAEGASSGGARRPLLGGRVGARPRPHRLRRPLR >gi|319977933|gb|AEUH01000170.1| GENE 2 1227 - 2084 1330 285 aa, chain - ## HITS:1 COG:lin0761 KEGG:ns NR:ns ## COG: lin0761 COG0395 # Protein_GI_number: 16799835 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 23 278 17 286 291 189 35.0 4e-48 MPETPSANRKPLSRRIGRAASAVAMTVIAVAFLFPFGWMIATSLKPTTEVFTSGISPVGS EVAWSNYSSALTTIPFGRVILNSFIVSLVGSLLVMTVSVLSAYAFARLKFKFRDHLFLLF LGTLVLPQEVLVIPLYIGMQRMGLINSYFALIVPFAFGAFGAFLIRQFLMSLPHEFEEAA RIDGCGDFQILTRILLPLLKAPILVVGVFAFIDYWSTFLWPLIAINDRNMATIPLGLQMF SGERGTDWGPMMAAVAMTTIPSIIIVVFLQKQLEKGVALGAFGGR >gi|319977933|gb|AEUH01000170.1| GENE 3 2077 - 2997 1511 306 aa, chain - ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 26 304 16 292 292 179 36.0 8e-45 MSTAAATQAPPKRARKRGDGLRALAYLAPGMSGFLLFILIPLVASLVISLFDWSLFGQAK FVGAGNYQRMLSGEDPAFWTILRNTVVFAISYTILNLVISLGLSYWLQHLSDFWSRLLRV IFFIPVVTPMAGNALIWRLLLDDQGVVNSALGKAGLGPVPWLNHPVLAMGSLLMMSLWQG 
LGYNIVVLTAGLNGLNTSVLEAAKIDGTTGWQRFFKVVFPMISPTVFFCTIMTVIGAFKV FAQPYMLTKGGPGEATNTIVLALYRNGFSFDKLGYASAMAWILFVIVMVLTALQFSQQKK WVNYDA >gi|319977933|gb|AEUH01000170.1| GENE 4 3071 - 4327 2106 418 aa, chain - ## HITS:1 COG:BH3690 KEGG:ns NR:ns ## COG: BH3690 COG1653 # Protein_GI_number: 15616252 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 14 392 8 398 420 97 24.0 5e-20 MRPHTPRPARVAALAAVALLAPLAITACSSGTAQSGSGGAQTTQMYTWVSTENDRAQWQA FVDAAKEKDPNFTLTFDGPSFNDYWTKVKTRMVASDAPCILTTQAARAQELSGILEPLDE YMKEAGIQAGDYNAAMMQGMTVDGHVLALPYDAEPDVLFYNKASFKAAGLDEPTTAYTTE QFLSDAKALTGDGKYGLAVKASLMANAAGDFAIAFGGDVSRDGQVTITEQKFVDGMQFAF DLVAKEGVAAAPNAADADGPSQGAFQNGTAAMIVDGPWMYGTFEKALGEDLGVAVLPTPT GQPRAVIQGSGFGISKSCPDKKAAFNVLKQLVTPEVIGAVAEKQGTVPSVESALDKWAAN KPEGNVAAVKTLLDNGTPLVTTNTWNQINTSFEQYSPEGFRGTRTAADILGDLAKTAG >gi|319977933|gb|AEUH01000170.1| GENE 5 4327 - 4437 74 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWFLSVGERSGFGTGGIVVLRRLVGRDSSLNVTVGR >gi|319977933|gb|AEUH01000170.1| GENE 6 4452 - 4595 262 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLFESVCSGTIVPAPIGEGANRWNPNGECVHDGWEDVHRQSLNILLS >gi|319977933|gb|AEUH01000170.1| GENE 7 4719 - 5750 598 343 aa, chain - ## HITS:1 COG:Cj0340 KEGG:ns NR:ns ## COG: Cj0340 COG1957 # Protein_GI_number: 15791708 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Campylobacter jejuni # 5 305 2 276 335 102 26.0 7e-22 MAPRRIILDCDPGNGVPGANVDDALALAYALRSPLVDVGAVWTVFGNTPPDLGYRCASEV IRVVGAGGSGAGGGPSSSPVLRCGPFQPASGDPQRWLRQRRESASSPAGLAAWGRRPERA SVREEADGDAAGEVPPARLGDLVDDLRCPEQTTLVGIGPLTNIARVLTGWGAAPGRVDRI VVMGGALAFGTMVDTNFAVDPRAARVVVRFGIPVTIVPLDTTRTTCLTRDRWRAIIRRAA EKGPMAVATASAFDAWVSPWLSYSEATRPVDGMWVHDLVALVALAHPELVDSSQHRVDVR DDGKLVECPDGVPVDIVEAVDNEAMISLWERTVFDAGAAAPPP >gi|319977933|gb|AEUH01000170.1| GENE 8 5737 - 6789 712 350 aa, chain - ## HITS:1 COG:L0143 KEGG:ns NR:ns ## COG: L0143 COG1609 
# Protein_GI_number: 15673630 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 9 303 10 293 332 159 33.0 9e-39 MSRKPTRDDVARLAGTSTAVVSYVVNGGPRRVREETRQRVLRAIGELGYRPNALAKSLSH GRTDLYALLVPDLANPYLAALAQALEHEFFARGKVLLVGDSHDDPDRETLILEAFLRQQI AGLVWYGVNQPLPLDLLEESHVPAVLLNQPTGPPQNGSRAVGLSGRPGIWSVSADERLEG RLATEHLVDHGRRTIAIVAGPNGRRNARERVRGWEKALRRADLPSSAPIHVPFTREGGRA SVDHVLELGADAVVASNEMQAIGLLNGLHLRGVRVPEDVAVIGINGTPLAEYASPSLSMV EMPAPAVADRIARALNGEGDAGGAAIAPYVVARASCGCGDVSERIEDGAP >gi|319977933|gb|AEUH01000170.1| GENE 9 6840 - 8195 1577 451 aa, chain - ## HITS:1 COG:Cj0339 KEGG:ns NR:ns ## COG: Cj0339 COG0477 # Protein_GI_number: 15791707 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Campylobacter jejuni # 1 451 1 453 453 516 59.0 1e-146 MTTQNEDLQASDLSFRTPEGRRSFWINIVSVWLGTTMEYVDFALYGLAAGLVFGDVFFPE QTPIISLLSSFATYAVGFLARPLGAIVLGHVGDRHGRKTIMVITVGLMGVSTTALGLLPT YHQVGWVAPALLVFLRLCQGFGAGAELSGGAVMLAEFSPVKHRGVVSSLIGVGSNTGTLL ASSVWLLVLMIPKDDLVVWGWRIPFLVSVLIALFAIFLRRSMQESPVFRAFQQKKAEEQE AVGRSGLDAKKGGWKAFFVMLGLRIGENGPSYIAQSFLVGYVVKALQMSKSVPTTAVMVA SVLGFAIIPLSGWLSDRFGRRITYRVFCALLVAYAFPAFALLQTRDPWVVGTVIVVGMGL GSLGIFGVQAAYGVELFGVQHRYSRMAVAKELGSILSGGTAPMVASALLAAFDSWIPLAA YFAATALIGFATTFVAPETRGRDLALMEDAI >gi|319977933|gb|AEUH01000170.1| GENE 10 8396 - 9703 410 435 aa, chain - ## HITS:1 COG:mll2542 KEGG:ns NR:ns ## COG: mll2542 COG0477 # Protein_GI_number: 13472297 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Mesorhizobium loti # 1 402 20 419 505 189 34.0 1e-47 MLDTTIVNVALPSIGAGLHVSFQGLQWVVNAYAIALAALLLGAGGMADRFGTRRVYEASL 
LLFAIASLGCALSPELWVLLASRFAQGVAGASLIPTSMVLAADGCRTPEERSVALGWWGS AGGLAAAMGPVLGGCLVSTLGWRCVFWVNVPICVLIRLVLTGSGSRRAGRSRRRAAPSDG AGQVLSILAIGLTSFSVIELGSSSTRILGSAGCAAAAASWWAFLRYESRSAKAFLETALF RVPAFATNCAIGWILNFALFGELFVLSMYLQQGLGYSSAHAGLVIFPQTCSALIAAPLGG RAAAAWGGRRATAVGLTAGGIGFLGIALFPLTGSQAALAPLSFLASFGMAFAMPPVSAEA IGSVAERRKGMASGVINTARQLGTVCGTAAMGAVYAIAGATMGVVVSVAIAALLFFIGAL ASWSVGSAGASARGA >gi|319977933|gb|AEUH01000170.1| GENE 11 9989 - 10675 71 228 aa, chain + ## HITS:1 COG:BMEI0150 KEGG:ns NR:ns ## COG: BMEI0150 COG0640 # Protein_GI_number: 17986434 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Brucella melitensis # 11 222 27 240 243 135 39.0 5e-32 MTFRTLPVAMPDISKISSAFADRTRAAVCGALMDGTAWTPTELAGFCSVSKSTMSEHLAV LKAVGVVGEVRQGRHRYVRLAGPEVASVIESLASIAGANFPSPKNYNAHRANTEFAAGRT CYNHLAGELGVRLLQELSAHGYISAHLQVTEDGDELLRSWGIPHPERLEGKPCLDTTHRV FHLAGGLGSTICTRLLALSWIERTHSNRCARLVPEGRAALGAAGLSLP >gi|319977933|gb|AEUH01000170.1| GENE 12 10703 - 10768 66 21 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEVGTRPLIGVARKGGGGWMG >gi|319977933|gb|AEUH01000170.1| GENE 13 10797 - 12185 1447 462 aa, chain + ## HITS:1 COG:NMB1759 KEGG:ns NR:ns ## COG: NMB1759 COG3177 # Protein_GI_number: 15677601 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis MC58 # 50 458 44 451 454 251 36.0 3e-66 MQDVQEAPMKTPAPPPMRLAELRDAVLKALLSGGAPPDARTLNVLLGGVEDREYHPWEWF THHEPPEPLTREQWWCAVRQDRARTARPTPFTMVDGTRFSFNLPDPLLRLIDDISSQATG QLELPDPVVNPATRDRYLVNSLYEEAITSCQLEGASTTRRDAKRMLREKRRPRDRSEQMI LNNCMALERVRELKDQALTPQMVLDIHAVVTEGTLDDPADAGRIQQPGEERIRIYGDETD DHVLHVPPPAEQLPERLRRLCDFANGAGDFADQYVPPLVRSIIVHFMMGYDHYFVDGNGR TARAVFQWSILRQGFFLAEFLSISRLLRQAPARYARSFLRVEQDEGDLTHFLLAQSRVIS RAITDLHEYLARKSSELGRAAALLRGTHLNNRQIAVVESFLRDPSGSVSAVDHQSTHRVS AQTAHNDLRGLEEAGFLARTKRGRRIVWFPAPDMAERVRDAH >gi|319977933|gb|AEUH01000170.1| GENE 14 12311 - 12379 138 22 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MALDVPSAFKGAGARRTERATA 
>gi|319977933|gb|AEUH01000170.1| GENE 15 12417 - 12593 120 58 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLGFAASGRLRIGMIQNPSELDGFDRIWADSGRLRRFLDHTPDRPDFRPRLRRILDH >gi|319977933|gb|AEUH01000170.1| GENE 16 12532 - 12744 139 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTTRVGAGRWFALRRVCSAGDALGALCGAMRVAHTGVARVGMVAVGRVVQNASGFRRVWA APHRDDPKSV >gi|319977933|gb|AEUH01000170.1| GENE 17 12938 - 13594 600 218 aa, chain + ## HITS:1 COG:no KEGG:Pcryo_1611 NR:ns ## KEGG: Pcryo_1611 # Name: not_defined # Def: hypothetical protein # Organism: P.cryohalolentis # Pathway: not_defined # 5 161 6 163 199 68 28.0 2e-10 MLPAYSSAQTVGGLVNPGTVEAAGLLAGGMSVRAVNDRLVADNLTRTQSSQNRRRLATTT TRHLSHLGPDALAHVAQARPGFELLLWLAIVSEARLFADCARDAVWEALVLAERPFRVAD FDEFWRLKALTEPSLAALSAANRKRQRSVFLRCLREAGFIDGAGAPLPLLVPPSVARLVN GAPEPWGWPLREAFPVPQAQFDVLDGEALEQYRMRRSG >gi|319977933|gb|AEUH01000170.1| GENE 18 13591 - 14241 569 216 aa, chain + ## HITS:1 COG:no KEGG:Pcryo_1610 NR:ns ## KEGG: Pcryo_1610 # Name: not_defined # Def: hypothetical protein # Organism: P.cryohalolentis # Pathway: not_defined # 12 197 13 198 203 138 35.0 1e-31 MRRTYARSLDDTFDIAYQVMTSERFIEMEGLGNEVPYFVLRYRPEWEPGVDKAVRRLTSR VRQHYPLTVVDVFDLAHRLWRESGYLETIIEAEGSMPEQDFRDGLKGLIDPEDLLAPAID REIARDADARAVLVTGVHHLFPLVRAHRLLNCLQPLASRRPLVLAFPGSYNQSPERASSL VLFDRVGEDNYYRAFDLLDQRYALDGAGGATERTGE >gi|319977933|gb|AEUH01000170.1| GENE 19 14245 - 17775 3699 1176 aa, chain + ## HITS:1 COG:no KEGG:RER_28960 NR:ns ## KEGG: RER_28960 # Name: not_defined # Def: hypothetical protein # Organism: R.erythropolis # Pathway: not_defined # 1 1174 1 1166 1168 570 31.0 1e-160 MDMTGIYAKDINRTIDGVIKADESAHIRSEVDEYVMTSEIRAGLDRILDEWNAPRPQSNG VWISGFFGSGKSHMLKMMSYLLGDVRGSGEMTRQQVAEVFTQKARGHEMLIAGIGRSQQI PALSMLFNIDAKDDKATSRPEAMIEVFYRAFFEARGLYTADLGVGALERDLVDRGLFNQF KECFRAVAGGEWEDRRAGTVFAEAPTARALAGVGIEVDKPIASYRASLNRSAEAFADEVA HWLDSREPGARIGFFIDEVGQFIGRNTHLMLNLQTITEQLFTKCSGRAWVFVTSQEKLDD VIGGLGEQQGNDFSKIQGRFTTQINLSTQEVEQVIARRLLAKSDAGRAVLEGQWSRSGAG 
FASMFTFPAWQKAYSNYADEEAFIAKYPFVDYQFDMFSQTMIGMSESGLFTGRHSSVGAR SLLGVCQQIAQDLQERPIGTITSFDQFYDGIAAMLKSDVKENLHRAQANSLTSDPFTISV LKALLLARYVKDFPCAPSNLVVLLRRDLGQDATALEARVKRSLDALVGQYFVQKLPEGTY TYMTDEEKEVKQQIGRMSVSEGEVTGHLMRSVRAVLGNAAASYRHPDTDLDLRITTIMDC MRPASTESLVLNCVTPLSQVDAEAARIRSTGGQDTLTVVLALSDETVEEVYMYLRTERYV RGATGSSAPSGRKRLIQTYGEQNNERDERIKAAVRMSFKGARFFTDGSEVEVPASMPPAE AVNRGMGVVVGRVYHRMGDALPLAKLKEGDVDRFLRDEDEGLLPEGGLQAANPLADQFAQ WVKQSRDLEGRSVFVQDAVDHFRAAPFGWGQNAVLAIVATAIRRGLVDAKLNERPLTRTE LSSGLRVSQERSRILLEPKRQVDQRALARFRAFAKEFVLSDGRLEDDNARAVAVDNAVQG WRAVLDRARRTPFDFGGEIDEAEATLEAVGAGPLGAEAYLAPGVQAVFDDVVEIRSDFLE PLQEFLTRHMPAVEDAVRAADSLAAYGDGVGEDGRRAIEELRALVSDEHLYNRTDEVAPL RERVDRARRVFVEERRSEALRAFEGVRAEVTASDEFGQADPTAREAALSRIDAECASIRG MTDPGAIVIAAQKVDALSAELFARLIASHERVDEAPGHSDEAPGGRAAPPQVVQLSALTR DLCRTTIRTPDDVDAYVEELRSALHRAVSNGNIIRR >gi|319977933|gb|AEUH01000170.1| GENE 20 17784 - 21521 3687 1245 aa, chain + ## HITS:1 COG:STM4495 KEGG:ns NR:ns ## COG: STM4495 COG1002 # Protein_GI_number: 16767739 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Salmonella typhimurium LT2 # 17 1213 43 1210 1225 533 31.0 1e-150 MDININALESLATKSRARLIGQATGAIQRLLDDPLTEAEDRRRIERAVAHHGGVEGAARE AAYSWFNRLTALRYMDAVGVSGTHMVVTPTGASSEPECLSQARVGAFDPRVDRAPASPDA DGRPGVAQRVGRLLFENRDVEAYVVLLSAYFDLWHPLIPDMFPAGDHWTNLVPPSDLLSA SSVRADIVEGIAVGESGGPGVEVIGWLYQFYIAERKQQINDAKVKITSKELAPVTQLFTP HWIVRYLVENTVGAQWLRAHPGSPLRDRFDYLVAPVPGQEDQGMGIGDPKDFRVIDPACG SGHMLTYAFDVLWQMYVEAGYPKRRIAAAILDNNLFGAEIDARAVQLASFALMMKAVEHD PGFLERKLREQDRKAAGPGAGRPRITHVHSVVLEDLSPTEIAQAVRGAFGSEDSERASVG LEVNRVIEDLRNADTFGSLIRVPAGTAEIFEAIARSIEEGRRDVQLGSADTQQWREAAGI CRVLEDSRYTTMVANPPYLVSKKFGDGLKEYARQSYPDSRQDLATMFMQHALTFTRKRAS IGTLTLNSWMLTSSFERFRIRLLKNASIQSIAMLGRGLAGIQLDFCATVFWNACPRSDRT IRFITASEPATSPSNPIRIQRLQEAAQNPKSSDNIEVTYSRITRMPRSAFLHSISDQILE GFKTPLAEVARPRQGLATADNERFLRLWFEVSSSRSYLTATSREDADASGARWFPHNKGG SFRKWWGNQDYLVNWEDDGREIQEYRDPYTGKQRSRPQNIDFYFKPCVSWSNVSSGTPSF 
RLYNENFIFSHVGQALFSDTEHHPAIMSFANSTVVTKILEAIAPGTHFEVGQIKTLPWIE PESFDPTPIERLIEIFRDDWDARETSWDFQRPPYLVGDSSLLEELFAAWYERSVSTAREA QRLESENNRFWAGVYGLEDEVEIDVPLSRVSLTYNPRFPPAPSKGAERSEEEYRWLHYQR SATELVSWLIGVIMGRYSLDAPGLILADQGSTLDDFQARVPTASFMPDADGIVPITDSLF HDNADLRLTDALSALLGADSLGQNLDFLARCLAVKKPTSNGADFQPPRPVVDSRAALLKY MKSGFLKDHEQAYSNRPVYWMFSSPKGSFKALMYLHRYTPDTVGHVLTEYASEVVAKLAD QIEVINRRLPDMGGSDRTRAQGKREKLRAQLHDVQGYIDSTLYPLSQRRIVLDLDDGVRV NRLKLAYGMREDLDELASPALQMPDDLKWMTDQKRKGNIWWSTQN >gi|319977933|gb|AEUH01000170.1| GENE 21 21607 - 22761 1314 384 aa, chain + ## HITS:1 COG:no KEGG:Arch_0273 NR:ns ## KEGG: Arch_0273 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 1 371 1 368 382 150 28.0 1e-34 MKLTHVELQNWRNFAHIEFDLNTRLFVVGPNASGKSNLLDALRFISDVANAGLYEASIKK RDSFRSIVHANAEECTLAFKFDSSALQGYELSLEQVHEPVIYDGEEVGEELADMPYVKAE NIFTADGNVHPSPYVSHILGHEVVFAHETTLQSATLSNSFKGIRDYFVGIHYIHPNPKKM LERGRYDEHATGFTQRVASRPDVVLQPAIARIRPILASIVPEIPRLSYVRIGANDDIIFY SDDPQKPGAFTHMRFSEGTLRILGILFELATLPKDTSLVLIEEPELFMQPSVVRSLPDFF AAVAADRNVQMIITTHSPDLIDNELIRPDQVLMLEPTDTGTKGTLLSESQDPRVRSFVET AFPLSEAVDSVNRRRIPLGIWDHR >gi|319977933|gb|AEUH01000170.1| GENE 22 22758 - 23402 488 214 aa, chain + ## HITS:1 COG:no KEGG:Ajs_4079 NR:ns ## KEGG: Ajs_4079 # Name: not_defined # Def: hypothetical protein # Organism: Acidovorax_JS42 # Pathway: not_defined # 19 198 19 196 200 78 27.0 2e-13 MTVSKHPQVFVEGNTDEPIVRALMTATGWVSEEYRIFCAKGSGNIIRSITKHAEAARQIP RILFLDSDNKCPVDMRKDLEKELTHIPADFVLRIVCTCIESWVLADCEGLASFCGVGIAA IPASQKLAPIHNHKNELLKVLRKSKSPKGREMTQGSGNDLQFSDDYTRHLADLMTDYWDA ERAAQNNDSLRRAIARLKDLRARLCTDAVPEVRQ >gi|319977933|gb|AEUH01000170.1| GENE 23 23407 - 25941 2691 844 aa, chain + ## HITS:1 COG:no KEGG:RER_28940 NR:ns ## KEGG: RER_28940 # Name: not_defined # Def: hypothetical protein # Organism: R.erythropolis # Pathway: not_defined # 5 793 3 785 831 345 30.0 4e-93 MSGIDASALADGLRELLTRARLVTWLDQGREYTDQVDEVAALVPDAQLITVDHNEFATKL RVLRDQPKDAFILYRPGDLPPLDTDLLADIRCGYPAFSADASSMLARELDLDDDLTPILR 
EYDGFFTRERIRRIKDRKIQINDKNKLLAVMSAVLLGIHDHSFSSIFVELANSDACEQAK IGKLKWGLADFFWEGARSIWKYQGDPELNNLMTWLFIEQRRGWDPALASAQRDFATWMGG VKNQDAVKVWATRISEDLGLAEELPSLDTADLERDLVTPGVDDQLIRRAVTSLIDGRSNA ADVRTQRSRRASSLWKEESASAWEAAEAGGRFLHLVHTHRDAAFRTAGEGFALYTDAANG LWQVDQAYRHYVRASRAAGDPDVFGELDAAVETIYLHDFLRPLSAWWDAAVRALPTWRIP AHASSNPGRQSDFYDLMVRNAKKTAVIVSDALRYEAGAELAERLIARGGATVEITPWYTL LPSVTALGMAALLPNRSLSTGVRADALEVRADGEPTGGLAARDAVLRAASGDRITAHSYQ TIAGMLRDDLRALGKGMAAIYVYHDHIDAVGDHAASESGTPEAVERALTDLAALVTRLRS ADFHRILITADHGFLYQARPLPDDTTAKQGTAPRAEKILLKKRRYQLGDAMESSDDFAVF PAADVGLASGPDVALPFGMRRSRIQGRGDQFVHGGASLQEISVPVIEVNTTAGLKAHDVE VELSYSTSRISMTRITPKVIQKDPIAERIRPIEITVGIIDADTGKPLSNVERLTLDSVEA TRLDRARTVELILLDGILQEYAGKTVLLAAHKVVNGMRASDAPVASLELEVRDTGFGGLG GFTL >gi|319977933|gb|AEUH01000170.1| GENE 24 25938 - 28034 2732 698 aa, chain + ## HITS:1 COG:STM4491 KEGG:ns NR:ns ## COG: STM4491 COG4930 # Protein_GI_number: 16767735 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent Lon-type protease # Organism: Salmonella typhimurium LT2 # 8 482 24 507 694 451 45.0 1e-126 MTRELTPVEAKASNAFAGYIVRKDLVHQVKGNAVVPSYVLEFLLAQYCATTNEEEIRSGV ESVRKILAAHYVNRGESELIKSRIREQGGYRIIDKVSVSLDEANDRYVASFENLGVANAL VSDQTVRDNDRLLTGGVWCMADMSYFPALEGAKTAPWNVSRLQAIQLPRFSFEEYADARG QFTTQEWIDLLVASVGLDPSKLSERAKMLQVLRLVPYVERNYNLVELGPKGTGKSHIYSE FSPHGILVSGGEVTLAKLFVNNATHRIGLVGFWDVVAFDEFAGRKRVEQNLVDTMKNYLA NKSFSRGDSPTTAQASMVFVGNTMRTVPAMLATSDLFEALPDAYHDTAYMDRIHYYLPGW EVDVIRPELFTSGYGFIVDYLAEALRHLRTLDFSGDLSGHFELDSHMSTRDRDAVRKTFS GLVKLIHPDRDGTPAELESLLRLAMEGRKRVKDQIIRLDETMRESGSHFLYTVSASGGER AIRTQEEIDYPEYFTADDDRPAGEPPTTGGTHPQPPVGGAAQETAPRPADRLDTLRSRAI PKREEVEARIGVSLPKMLLPYLVDAQKITIVDPYLRKRHQIANIHDILIHMVQAVGLAAP PRVHLATGRAEPDRATEQMQLLNDLCQTWGHFGVEIDYEFVADIHDRVIKTDTGWVIDLG RGLDVYERFQALSRFDPRFDLMELRRTKGFTFNAQRTE >gi|319977933|gb|AEUH01000170.1| GENE 25 28117 - 28599 257 160 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNQQNISGIGDGNTISQHQHIGDQTLYSLSLDGLAEEDALARRVTVTERYSRMRSALVAF 
VFALLGFGCLAYFYWNDGCPGLSKLVELAQESLLTVIFEYTLPLIVGATASAFFGKKITK PTDIEIRSKGRRKHIYSIAREKGYSKREWRRAQRRTRDRA >gi|319977933|gb|AEUH01000170.1| GENE 26 29037 - 29945 1243 302 aa, chain - ## HITS:1 COG:BH1390 KEGG:ns NR:ns ## COG: BH1390 COG1108 # Protein_GI_number: 15613953 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Bacillus halodurans # 22 292 15 285 292 166 39.0 3e-41 MRAAEFLSPVDFLRDLTNPVLAFLPRALLMAVLAAVVCGAVGVHVVLRGMAFIGDAVAHS VFPGLAIAFLTSGSLVLGGAIAGVATAVLVALLSHSRRLKEDSIIGVLFVGAFALGVVII ARAPGYAGSLQDFLFGSITGVPASDVPVAIGGGAVVLTLLCVFHRWIVAVTLDRETARVA GVRILATDLVLYTAVALAVVISVQTIGNVLVLALLVAPASIARLLTDRLFTMMWLSPALG ALAGIVGLYLSWSIDVPTGATIVLVLIGFFALAWVFAPRHGALARARAARRGVPRGVSAA RS >gi|319977933|gb|AEUH01000170.1| GENE 27 29960 - 30745 817 261 aa, chain - ## HITS:1 COG:BS_ytgB KEGG:ns NR:ns ## COG: BS_ytgB COG1121 # Protein_GI_number: 16080128 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn/Zn transport systems, ATPase component # Organism: Bacillus subtilis # 24 230 3 217 250 172 43.0 6e-43 MKQSHRGGARPTGGEGPTGEADSPLVVRGLSVNLGGRPVLRDVDLTVSAGELVGLIGPNG AGKTTLIRAILGLVPAAAGSIECAAAVGYVPQRHEFAWDFPISVRDAVLGAVSVRLGLLA RPRLAHHRAVERALAKVGMRELADRPVAELSGGQRQRVLVARALALEPSVLLLDEPFTGL DMPTQELLMDLFASLAASGSAVLMTTHDLVGARAQCDRLYLINGTVIGSGTPNQLRDADV WVRTFEVREDNPLLSALGVVA >gi|319977933|gb|AEUH01000170.1| GENE 28 30742 - 31941 1374 399 aa, chain - ## HITS:1 COG:no KEGG:Sare_2619 NR:ns ## KEGG: Sare_2619 # Name: not_defined # Def: hypothetical protein # Organism: S.arenicola # Pathway: not_defined # 15 264 9 259 309 221 49.0 6e-56 MRLRTRGAVASLAVGIIALALGTAPARAADDPALDQTVGADEEASTESAIIGTGHVDIGP RMVDGQWSVALRDDSGAHPVWRDPDRTVLRVSNAALMAAPTGSDYAFMGAQASEQYYVVP QTQNPDVVWLGWNTQDPGVVSAIDRGATMRIGSVSGPGRTWMFLQDGTFGKPRLLVDGQS GQAQDVWVDASTHVHANWVFTAPGVYTAALTFSARTTDGQQLSASTTLRFAVGSQTSADE AFAAAPAAAAEAGAAGSAGSAGSADSAGGGGSAADAGPAGDSGLADDGASANEARVPTGS 
SADGTGLSDQVMLIIALVVAASITVALGAVLIMRSRSIAAERRKAIAEADGAEGMEGAEE SDTTGGNDRAAGPGGADPAPASSTGERTSAPASSKEGTR >gi|319977933|gb|AEUH01000170.1| GENE 29 31938 - 33602 1743 554 aa, chain - ## HITS:1 COG:alr3576 KEGG:ns NR:ns ## COG: alr3576 COG0803 # Protein_GI_number: 17231068 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Nostoc sp. PCC 7120 # 375 547 137 315 325 113 36.0 1e-24 MPTRRRAAGLLLAAALSASACSSTAAGADAQQDRIRVVATTGILADLARNVAGDRASVTQ MVPDGADPHSWEPSLRTIRDVAYADVALSNYLLLEEHAIIRAMDANLPANAKSVSVAEEA AKQGATVLPLVEDRSLDTVWLGMRVIGSGRSLGADRSSEVDLVATGVEGPGAASAYLTTS FGEPQVGFASSDGFDAASGYQADTSVLPADAHQHMSWAFTQPGVYRIHFEARLRTSPGAA PTPLGSGTAVFAVGVGADEVAAQQGRAVLAAGHADITADLDRGMLTLAADKGAVGEGATP LPAAAPPTTGEGGGAGSTGGTGLSAMESYSLDDVVIDVPTRTLTQVPGTGYRFVGEPGSD VYMLPQAVLGKHVHGEIDPHLWHDAHNAAAYVRVIRDALAEADPASADEYQRNAAAYLAR LDGLDAEVQAEIARIPPERRRLVTTNDAFGYLARAYGLEVSGFVAPNPGVEPSVADRVRL QATLTDLRIPAVFLEPNLARTRSTLRTAAEDAGVRVCPLYGDTLDASAPTYIDMMRFNAH SLATCLGDTEQETP >gi|319977933|gb|AEUH01000170.1| GENE 30 33592 - 35238 1298 548 aa, chain - ## HITS:1 COG:no KEGG:PPA1500 NR:ns ## KEGG: PPA1500 # Name: not_defined # Def: hypothetical protein # Organism: P.acnes # Pathway: not_defined # 33 506 362 834 880 210 34.0 1e-52 MSGGSQAPATAGRDFRVVATVSRETLSLGLALDGLAVDAQHAVLGVDDSALLKGANAPMS AEGYEMGQAPGDGADALGWSLGALREAGYDRAQINVTASGPEGSQVLVYRRDPNGKATAV LHGGSFNIADADTEAYSILPLGEQTEGTLSWMFTAPGEYSLNMTITAWSSTTDAVVQSPA TLALRVAVGSEAIANALAPAGVPASNAPSQPATAQQQTAPASNTTASPSAQGDGRASGQS DARNAARASSSGEKCVATTITREATAEEQASLTSNQASPNTARTTLTFSVGPGASGNATE GHFDLGPAIENGQLVARVKDDRSAPAQWVDPASLTFALGDAARITAPDETSFVATPGSQV WLISSTQVPGVPWLGMNSQRDEIVNGTTGGVNFTLDSVDGPGRVAVFNSGSLGGGVGEHV FDGAGSTYTLPPNTHAHQNWLFTEPGTYTLTLSMRVTPTGADLAGTGGGTTTRLTPTGAT GANGRPVVSEVVGRTASGDECDLTLATTGADAQRILLLATALVFSGCVAVGASRRRGAQD RVGNRSAH >gi|319977933|gb|AEUH01000170.1| GENE 31 35523 - 35588 139 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRIVIIIGSGFPERPIDATTL 
>gi|319977933|gb|AEUH01000170.1| GENE 32 35966 - 36544 792 192 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDEILIRQREKTLRQFKAKGQPVHESEEFMDNALQYMDAHMQQASSKVELAFLAAEYSQN QAGRTCIPSVGGTKFRPWNFNLLSVDESVENLHFVAVRAKDWAFAQVSTSDGRNLAQGTP EYLNHIATVDPDFIQLLRENPDLFDGIASGRIRLHNDMCWVDDTGLDSIQYRSQPFVFTP ETLAVLRERIHS >gi|319977933|gb|AEUH01000170.1| GENE 33 36541 - 37044 553 167 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTEFIALRPERVDQIITAWAAVRWPLKERHAAHLFAQTGFMPDRHGTHAFRSELCPVADS FFISNGKHVQSAHVALSTIATPDQYGRSAPLTQSVYLTYAQIIDARTGAKREAEDDGTGP QARWRLANGSNIVLGANQALVDLTVNSPEQTAILDWQILWELQGRPE >gi|319977933|gb|AEUH01000170.1| GENE 34 37703 - 39001 1561 432 aa, chain + ## HITS:1 COG:PAB2227 KEGG:ns NR:ns ## COG: PAB2227 COG1167 # Protein_GI_number: 14520410 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Pyrococcus abyssi # 26 419 16 410 410 324 40.0 2e-88 MSTPTPSPQGKTPHHGTRLDPWFDAYAERAHNLRASEIRALFSVVSRPEVVSLAGGMPNL KDLPLDRLAASAERLITRHGAQAMQYGSGQGWEVLRERITEVMTYDGIIGADPDNVVVTT GSQQALDLVTELFVDPGDVVLAEAPSYVGALGVFRARQADVIHVPMDEHGLIPEALVETI RNARAAGKTIKFIYIVPNYHNPAGVTLSLERRPVIAEICMRERILIVEDNPYGLLGFHSD PLPALKSFSPEGVVYLGSFSKMFAPGFRIGWAYAPHAIRAKLVLASESAILCPSMVGQMA IADYLGDYDWYGQVKSFRAMYAERCKAMLDSLAEYIPDCRWTVPEGGFYTWVTLPDGLDA KAMLPRAVRARVAYVAGTAFYYDGQGNDHMRLSFCYPTPERIREGVRRLSTVVNAEKDIV DMFGTGDNPSSR >gi|319977933|gb|AEUH01000170.1| GENE 35 39029 - 40000 1242 323 aa, chain + ## HITS:1 COG:ddlB KEGG:ns NR:ns ## COG: ddlB COG1181 # Protein_GI_number: 16128085 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Escherichia coli K12 # 6 297 4 290 306 180 35.0 4e-45 MATPTKVLIIAGGLTHERDVSVRSGRRVANILTRAGFVVRITDVNDTLVSTIASFQPNVV WPLVHGSIGEDGSLQTLLESLGIPFVGSASVQSMLASNKPTAKALLASAGMPTPGWFSLP 
QALFRQVGATNVLSAIEHGVSFPVVIKPTDGGSALGLSTAHNAEELRTAMVDAFAYGQKL MVEQRIDGRDVAVSVVDLGDGPIALPPVEISTDGGRYDYDARYTTDETEYFVPARLESNA LADLRAAAIEAHTALGLRHLSRIDFVIDEDGTCWFIDANVAPGMTDTSLLPQAAEAADDY SFAAFCKEVVDFVASTGGNSTSE >gi|319977933|gb|AEUH01000170.1| GENE 36 40069 - 40698 482 209 aa, chain - ## HITS:1 COG:Cgl0120 KEGG:ns NR:ns ## COG: Cgl0120 COG1011 # Protein_GI_number: 19551370 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Corynebacterium glutamicum # 5 204 2 201 201 85 29.0 9e-17 MIASSSTRSVLFDVGGVLVDSHPDPLAVAALLGEETAAFARLVDQAIWAHRGEYDAGCSD RQFWDRVAGDCGLPEPSPATLKELVGLDTARMLDANPDVIELARDLRRGGARLGIVSNAP YSVSRSIKESEWGSCFDFFVFSCDYGVCKPQRGIYRDALVQSGSNVHDVVFVDDRKVNVR AAELMGMGGILWTNPLDARQRLREFGYLL >gi|319977933|gb|AEUH01000170.1| GENE 37 40843 - 42126 579 427 aa, chain - ## HITS:1 COG:ML2706 KEGG:ns NR:ns ## COG: ML2706 COG1475 # Protein_GI_number: 15828464 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Mycobacterium leprae # 149 424 59 330 335 241 50.0 2e-63 MADVQKADGVKRGRKHTGLGRGLGALIPPVQDAPSQAPAPRAGGSPLDVFFPEGNTRQRG GSAKELLQPKQRSSAQGKKRGPAKRAEMLRQATPVKDVAEDGAAPSRGTDRAPSAGADSV LPVERLVPDAPSDVSRETSEEVELLPVPGASFAEVPVDQIVPNAKQPREIFDEEDLAELA ASIKEVGVLQPVVVRPIPDGEPEGRLKEYLAEKPDAKYELIMGERRLRASQLAGETTIPA IIRQTDDSDLLRDALLENLHRAQLNPLEEASAYQQLMADFGATQEELAKRIARSRPQIAN TLRLLKLPPAVQKKVAAQVISAGHARALLALPSADEMERLADRIISEGLSVRTTEELVRL GKVKAGPKPRARKKPQVSALGESVVTALQDVYETRVTITEGRKKGKIVIEFAGSDDLQRI ADLILRH >gi|319977933|gb|AEUH01000170.1| GENE 38 42350 - 43231 795 293 aa, chain - ## HITS:1 COG:Cgl3035 KEGG:ns NR:ns ## COG: Cgl3035 COG1192 # Protein_GI_number: 19554285 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Corynebacterium glutamicum # 27 288 28 289 307 268 55.0 6e-72 MTTENNAPLSGQVERALRDLKMLDDATFGRPERTRTIAIANQKGGVGKTTSTVNMAAGFA LGGLNVVVIDADAQGNASSALGVEHDTGVLSTYDVIIGGASIEEVLRPCPDIEGVSVCPA 
TIDLSGAEIELVDVEGREFLLRNALREFLDSRDDIDVILIDCPPSLGLVTLNVMVAADEV LIPIQAEYYALEGLSQLWATVERIGADLNPGLRVSTMLLTMADRRTRLSEEVEAEVRAHF PEQTLRTVIPRNVRVSEAPSYGQTVVTYDARSVGAVAYRMAALEVSLRFPQSE >gi|319977933|gb|AEUH01000170.1| GENE 39 43331 - 44041 508 236 aa, chain - ## HITS:1 COG:Cgl3036 KEGG:ns NR:ns ## COG: Cgl3036 COG0357 # Protein_GI_number: 19554286 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Corynebacterium glutamicum # 30 236 3 209 209 125 36.0 6e-29 MSEAPLSPSDGVPARRDGGRAPRRAGDPFTPEAPTDEVRRFFGPAFADIAEFARMLEEEG ELRGLLGPRDMERIWSRHLVNSAAVLDFLPRRGQVLDIGSGAGLPGIVIAVCRPDLDVHL AEPMARRCEWLADVVDALGLDNTTIHQARAEELRGKGKADVVTARAVANMAKLVRMTSKL IAPGGALVALKGRRAPIEIEEAASELKRHHLMAQIHEVPSVMEDESTYIVKCTRTK >gi|319977933|gb|AEUH01000170.1| GENE 40 44034 - 44537 742 167 aa, chain - ## HITS:1 COG:MT4039 KEGG:ns NR:ns ## COG: MT4039 COG1847 # Protein_GI_number: 15843554 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein # Organism: Mycobacterium tuberculosis CDC1551 # 2 160 28 185 187 118 58.0 5e-27 MSDEKSEALTRLEEEGELAADYLEELLDIADLDGDIEIGVENGRASIEIIADSNDDLERL VGEDGEVLDALQELTRLAVQTETGERSRLMLDIASFRADRKAELQEITQEAIARVRSTGA EVKLEAMNPFERKVCHDVVADAGLVSESEGVEPHRRVVILPEDEDGE >gi|319977933|gb|AEUH01000170.1| GENE 41 44579 - 45922 1873 447 aa, chain - ## HITS:1 COG:BH1169 KEGG:ns NR:ns ## COG: BH1169 COG0706 # Protein_GI_number: 15613732 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Bacillus halodurans # 7 267 39 263 280 115 30.0 2e-25 MFDAIIHPFAVAVAWVWVWIHDFLVLIGMPSGSGIAWVLSIVLLTILVRVAIVPLLLKQI RSSRAMQAVQPEIRKIQEKYKGKKDMVSRQKMTEETQALYRKHKVSPFASCLPLLVQMPV LFAMYRAIYAVKDLAEGTYTYSGQAASSLGPITQTTASEINGSTVFGVPLSHTLRTETSN SSAAIVFVVAIILMVALQFVSMRLSFSRNMPDMGSNPMARSQRSMMYVMPVMFVFSGVFF QMGVVVYTVTSSFWALGQAYWTIKVMPTPGSPAYADLRAKREKDYQEWAKPYFVEYDRRR 
AELPTSATDPDVVAFNEKSLAEIRQRAKKQKIASDFPDFMSAGDIVTVYRNLASQEWTTL PDEQWMHGLRQALEKARSRKEAVARPTGKKKSREQRLREAAAQREASVPDKGSSAKAAQG AAISAAELERRRQERRRARRKKSKNKR >gi|319977933|gb|AEUH01000170.1| GENE 42 45979 - 46326 309 115 aa, chain - ## HITS:1 COG:PM1164 KEGG:ns NR:ns ## COG: PM1164 COG0759 # Protein_GI_number: 15603029 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pasteurella multocida # 16 78 16 78 86 81 55.0 4e-16 MNRATGALRAAAVWPIRLYQRFVSPLLGPRCRYAPTCSAYAVEAVTVHGIAKGTALALWR LVRCNPWSLGGVDCVPPRGRWRPDPWVPPDDWAGNDPTIEHPLPMGLSPGGGRER >gi|319977933|gb|AEUH01000170.1| GENE 43 46323 - 46664 229 113 aa, chain - ## HITS:1 COG:no KEGG:Tbis_3595 NR:ns ## KEGG: Tbis_3595 # Name: not_defined # Def: ribonuclease P protein component # Organism: T.bispora # Pathway: not_defined # 5 100 12 106 124 64 48.0 1e-09 MVDPADFRSTIRTGSRGGDPLVVVHARTDEGEENRLVGFVVPKREIKRANGRNRVKRQLR HIMRERVGSLPPGARVVVRASGRALGASSQELARRLDGALARAWRKWSGGGER >gi|319977933|gb|AEUH01000170.1| GENE 44 46695 - 46835 210 46 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494188|ref|ZP_03924504.1| ribosomal protein L34 [Actinomyces coleocanis DSM 15436] # 3 46 2 45 45 85 93 8e-16 MTTKRTFQPNNRRRSKTHGFRLRMSTRAGRAILANRRRKGRAKLSA >gi|319977933|gb|AEUH01000170.1| GENE 45 46942 - 47136 203 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTRLGARLDGVNAIIHCVTNRIHPQVWIPLWTAAVHSAHRAELSTEGAGRHLSQTRRDPP GVVT >gi|319977933|gb|AEUH01000170.1| GENE 46 47139 - 48734 2589 531 aa, chain + ## HITS:1 COG:MT0001 KEGG:ns NR:ns ## COG: MT0001 COG0593 # Protein_GI_number: 15839373 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Mycobacterium tuberculosis CDC1551 # 70 529 40 506 507 417 50.0 1e-116 MIPPKLSTDLSPTCGTRCGRDRLDYHQVIHRIPSLNDPTHNREALEESVATDTITSAWAL ALETVPTKELGPAATSMLRSARPLGDIEGTILLAVPNAFTKSWIEDRASNQLTAALTASL GRTVRIAITVDPSISAVTEEPKPEPPAASPLAPAAPASSPLARPAPAPQTGPALVSSAQN GGLTHLNPKYTFETFVIGPSNRFAHAAALAVSETPGTSFNPLFIYGDSGLGKTHLLHAIG 
HYALSLLPHLKVRYVNSEEFTNEFINAIRLNKTDNSQVEAFHRRYRELDILLIDDIQFIG DKEQTVEGFFHTFNALYENNKQIVLTSDVPPAQLNGFEDRMRSRFASGLLVDVQPPDLET RIAILQKKATTDSLEVTPDVLEYIASRISSNIRELEGALLRVIAFANLSKERIDLPLAEM LLKDFVSDPTDNEVTVPLIMGQCAHYFGITIEQMSSSDRSHTVVEARQIAMYLCRELTDL SLPKIGQAFGRDHTTVMHANKKITALMKEKRETFNHVSELTNRIKQKAKES >gi|319977933|gb|AEUH01000170.1| GENE 47 49124 - 50260 1618 378 aa, chain + ## HITS:1 COG:MT0002 KEGG:ns NR:ns ## COG: MT0002 COG0592 # Protein_GI_number: 15839374 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Mycobacterium tuberculosis CDC1551 # 1 376 13 400 402 299 41.0 7e-81 MKFTVARDVLAEAVSWTARALPVRPASPILAGVRIRAADDELTLSSFDYEVSANSKIPAT VDEAGEVLVSGRLLADISKSLPSKPVTVELEGQKVNLVCGSSHFTLAVMPLDEYPLLPAQ PAAIGYIDPQTLAQAVAQVSIAASRDDTLPLLAGVCTELTGSTITFLATDRYRLAMRELD WQPTDTSIEEVTVIKARILQDVAKSMTSGTKVEVGLSGQSQPGASSLIGFSASGRHTTST LMDGDYPAVRRLFPETTPIQAVCDRHALLEAVKRMSLVAERNTSVRLAFSEGQVVLEAGQ GENAQASEAIEATLHGDDISTAFNPQFLIEGLSVLDTDYVRFGFTHATKPAVITGQKKPD EGEVTQFRYLLMPIRFGI >gi|319977933|gb|AEUH01000170.1| GENE 48 50268 - 51455 1349 395 aa, chain + ## HITS:1 COG:ML0003 KEGG:ns NR:ns ## COG: ML0003 COG1195 # Protein_GI_number: 15826868 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Mycobacterium leprae # 1 385 1 372 385 301 48.0 1e-81 MRVSHLALDDFRSWKRGLVEFPPGATVLVGANGQGKTNLVEAIAYLSTFSSHRVGAESAL VRIPADPASTAPGGAVIRVRLVQAEGREQVVELEIVRGKANRARINRTQVRPRAILGLVR TVVFAPEDLALVRGEPAARRAFMDDLVIQRSPVMAGVKADFERVARQRAALMKSAQASAR RGASPDLSTLDVWDQQFAHLSARLSAARAQAVTSLAGPASRAYDEVSDSPRRLVLSFEAS VDRAIGTDPDDPASADPCDAPAQERRTLAALAAHRDKEVTRGVNLVGAHRDELSLVLGGM PVKGYASHGESWSVALALRLGAFELLSEDGDTPVLILDDVFAELDTARREGLAAMASRAE QAIITCAVEEDIPAGLDHRTIRIRMDAAEGSAADA >gi|319977933|gb|AEUH01000170.1| GENE 49 51448 - 52158 710 236 aa, chain + ## HITS:1 COG:Cgl0004 KEGG:ns NR:ns ## COG: Cgl0004 COG5512 # Protein_GI_number: 19551254 # Func_class: R General function prediction 
only # Function: Zn-ribbon-containing, possibly RNA-binding protein and truncated derivatives # Organism: Corynebacterium glutamicum # 73 236 11 178 178 116 43.0 4e-26 MPEGADEEFAARALERAQKVARARGYIRSTMPTWGIDEERPARTADGSSEYAAIEVDGAG TAGGADSSARRARMRAEAYGRAGAAPDAEGELGALRAALASHPDQWRKNPGMAPMRTQYR RASSLGSVLSRMIRRNQWDTPTRMGSIMAMWPAIVGEDIAAHATIETFEGHKLIVRCSST AWAKNLHLYLPMIERRIAEEVGPGVVQQVVIRGPVAPSWRKGPLSVKGRGPRDTYG >gi|319977933|gb|AEUH01000170.1| GENE 50 52290 - 53294 572 334 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTVPPSAGGPTPLGPSQDPRSANQDPGASQQGAPGASPLPYSDQPTSVLPPASAPYQGA PQGGGAPGSPYQGGYPPQSNPGAQQNGYAPTVQQGGYGAPDYGQGAPYGQPGPGNGGPSA PQKNKTAIIIGAIVAVLVVAGAGFAVYWFGFRGSSDPSAGSAPASPGAPSDGGAPSAGGS PSAAPTSSGGGSPNGGSASQGQPGSMTLDVSGAEVSIDSVERGPKDINGYDTVIVTLTTT NNASSTQPVVAAAPIAFSGREANQGNVFGLAVYPAGKEPDGLGLREDVPKDLGPGESATW KMAYAMPQTVDEVSIQLIGIDASTGGPWTVKLPK >gi|319977933|gb|AEUH01000170.1| GENE 51 53699 - 54241 514 180 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRPPAPLASGSGSNKGMWIAIAAVVGVLVIALGGLGIYKLATSGAAPSANAGTTVLGTAS MAGGSVKSEFSILPGPNDANGKRTIIVMQKVTNDSSQTIKVDEYSMFVMQKGELLEFSDY PSGKEPDQVDPGKWWEALRPGQETTVAFAYTLVDSSQVEIEPYEQNDLSIDYAKYYWQPN >gi|319977933|gb|AEUH01000170.1| GENE 52 54402 - 55298 904 298 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSYTPPTGGGPNPGPVPSGQTNPQDPYGALNGAPSQGSPAPGGAQAGSAQAAPASSADGG GTSGGGHFPQTGAPVPSGYQAAPYPGAPAQGGTGYGAMPAVNAPTLPAARTAAAPQNASP TTKAAGTIAGIAIAVVLIGLVVFAIYRVALVASPPDDPAPGYSSTSASQDEDPMEADLPV AKAHIAVDDVREGPEDINGRSTVVFTLTATNNSDAPLLKVDLHPEVTQNGADRSLTDYPK GGEPEGFDPNGIFGEIRPGATETWTVAYEVSNRTDLTVQVRDSNGSTSGLPEWILSFP >gi|319977933|gb|AEUH01000170.1| GENE 53 55285 - 55410 62 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIFQRFRPCDESYAAAPDREAAGSRPAAVARSGPEGAVTGN >gi|319977933|gb|AEUH01000170.1| GENE 54 55435 - 57552 3269 705 aa, chain + ## HITS:1 COG:Rv0005 KEGG:ns NR:ns ## COG: Rv0005 COG0187 # Protein_GI_number: 15607147 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B 
subunit # Organism: Mycobacterium tuberculosis H37Rv # 13 705 19 714 714 778 59.0 0 MSFPGRTGTFDRICKVKHRISVIRSMREGSHVADIDNKTDGAATYGASDITVLEGLEAVR KRPGMYIGSTGERGLHHLVYEVVDNSVDEALAGYADHIEVTIQSDGGIKVVDNGRGIPVD EHPTEHRPTVEVVMTILHAGGKFGGGGYAVSGGLHGVGISVVNALSTRVDTEIRRQGHVW RMSFANGGTPVTPLERGEETTETGTTQVFYPDPSIFETVEFDFETLRQRFQQMAFLNKGL RITLTDERENAIEEGDEIAGAPDEDPQEAGGRRSVTYCYEHGLRDYVEYLDSAKKTDTIN NEIIDFEAENDEGTMSVEIAMQWTGAYSSSVHTFANTINTTEGGMHEEGFRTALTSLVNR YGRDKGLIRDKDDNLTGDDVREGLTAVISIKLTEPQFEGQTKTKLGNTEARTFVAQQVYS KLTDWFDAHPADAKAIIAKGQAAQAARVAARKAREATRRKGVLESASMPGKLRDCSSRTP SECEIFIVEGDSAGGSAVGGRDPERQAILPIRGKILNVEKARLDRALSSDTIRSLITAFG TGIGEEFDIEKLRYGKIVIMADADVDGQHIATLLLTLLFRYMRPLIEGGHTFIAMPPLYR LKWTNSDHEFAYSDKERDELLAVGAAANKRLPKEGGIQRYKGLGEMNDHELWETTMDPEH RILKQVTLDEAADADETFTILMGDDVDRRRTFIQRNATDVRFLDI >gi|319977933|gb|AEUH01000170.1| GENE 55 57618 - 60170 3825 850 aa, chain + ## HITS:1 COG:MT0006 KEGG:ns NR:ns ## COG: MT0006 COG0188 # Protein_GI_number: 15839378 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Mycobacterium tuberculosis CDC1551 # 16 845 13 835 838 981 62.0 0 MSDERTTDARHISDTERVRQVDLQKEMQRSYLDYAMSVIVGRALPDVRDGLKPVHRRVLY AMYDGGYRPASSFSKSSRVVGEVMGNYHPHGDAAIYDALARLVQPWSLRYPLVAGQGNFG TPGNLGPAAPRYTECKMAPLAMEMVRDIDEECVDFQDNYDGRNQEPTILPARFPNLLANG SEGIAVGMATRIPPHNLRELARGVQWALDHSGASREELLEALLKLIPGPDFPTGATILGH KGIEDAYRTGRGSITQRAVVSVEEIQGRQCLVVTELPYQVNPDNLADKIAQLVRDGQIQG IADIRDETSGRTGQRLVIVLKRDAVAKVVLNNLYKRTPLQENFSANMLALVDGVPRTLSI DGFIRHWITHQIDVIVRRTRFRLRKAMERLHILEGYLKALDALDEVIALIRRSPTVDEAR AGLIDLLDVDAVQADAILALQLRRLAALERQKILDEHAQLKARVDDLRDILDKPERQRTI ISEELSEIVDKFGDERRTTILPFDGEMSMEDLIPEEDVVVTITRDGFAKRTRTDNYRSQK RGGKGVRGTQLRGGDVVEHFFVTTTHHWLLFFTNLGRVYRAKAYEIPEGGRDAKGQHVAN LLAFQPDERIAQVLAIRTYEDADHLVLATKSGLVKKSPLALYNSPRSGGIIAINLREDEA GNPDELVSAQTIMADQDLILVSRDGQAVRFAATDEQLRPMGRSTSGVRGMKFRGDDELLS MEVPVEGTDLLIVTESGYAKRTPVAEYPTKSRGTMGVRVGKLSDERGGLVGALVVKPDED 
VMVITESGKLVQVNASDVRPTSRNTMGVIFARPDADDRIIAITRNSDSTVGAEDQDDADA PADDQPASEQ >gi|319977933|gb|AEUH01000170.1| GENE 56 60284 - 60682 766 132 aa, chain + ## HITS:1 COG:no KEGG:Sked_00080 NR:ns ## KEGG: Sked_00080 # Name: not_defined # Def: hypothetical protein # Organism: S.keddieii # Pathway: not_defined # 2 132 143 275 275 125 48.0 4e-28 MSEQHAPTSNQSDAPRRVDLAIARIDAWTVMKVSFLLSVAFGIAIVIATIVLWFMLDAMH VFSTLESFLQEIGAGKFTELLEYVRLPRTIAYATIVGVINVVLMTAISTLGAMLYNVVAS LVGGIKVSLMDE >gi|319977933|gb|AEUH01000170.1| GENE 57 60669 - 61568 771 299 aa, chain - ## HITS:1 COG:no KEGG:BL0336 NR:ns ## KEGG: BL0336 # Name: not_defined # Def: hypothetical protein # Organism: B.longum # Pathway: not_defined # 66 272 142 353 500 125 33.0 2e-27 MGDVIDHARGADADPSAPPGPADALALCCLAQCDFGALGAVRGADGMRVADLGALALSRF LYRHSLHPRLDRRMLVAAASSPRFAPLICAHAVDRWSARPLIQFSALTLRTPGGPGSPVM VVFRGTDRSWQGWAEDAAMGLSFPLPGHRAAARYLAFAAERHPGPLFVMGHSKGGNLAEY ALASLLRARPRDAERVHLFSLDAPGFPAPLVRSGFFEANAAPASRVRIPGSWVSVLLDQP GPARFVRSGLPGPMGHDPYTWVVEGGDFVPAPAPGPVPRAVGAAVDRALGLRPIRITRP >gi|319977933|gb|AEUH01000170.1| GENE 58 61574 - 62632 1153 352 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0868_0365 NR:ns ## KEGG: HMPREF0868_0365 # Name: not_defined # Def: hypothetical protein # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 1 322 1 335 399 225 37.0 2e-57 MGTIIDDARTHLARLDAEPLREVDSLILSAAAYFDLSPMAGVRARRPGVRIGDLARAENY DRYFAHVLKPQDGHRLLTAMAANPRFRDVRVFSYTTETSIDEQKQFCAMSFALGPHLTYV AFRGTDASLVGWKEDFNMSFQYPVPAQRRAAAYLDHVAAHRRGQIIVGGHSKGGNLAEYA AAHARAPVRARLGAVYNHDGPGFPDDALDQFSPVRHIFRKTVPQSALVGLLMSSPRRARV VRATTNGILAHYPYTWRVEDGDFVDSPLNKDARGFDRRLTRWLAAHPRDEAELLIEALYG VATSSSTETVSEMVAAFAADLPFFARSFTSMGREAHAVLRGAILDLLPFPGR >gi|319977933|gb|AEUH01000170.1| GENE 59 62929 - 63063 188 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKWATCLVAVGSAVAVGLVVRQIVHDLSDNAELWTSVTDAVEA >gi|319977933|gb|AEUH01000170.1| GENE 60 63134 - 63196 86 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEKPRNHFPLRFRGNIKLAS >gi|319977933|gb|AEUH01000170.1| GENE 61 63406 - 64596 
1257 396 aa, chain + ## HITS:1 COG:BS_ybcL KEGG:ns NR:ns ## COG: BS_ybcL COG2814 # Protein_GI_number: 16077257 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Bacillus subtilis # 11 379 6 370 390 144 31.0 4e-34 MHGGSTASWGKIGVLALCAFLIGADGFILAAILPQIADRLNVPLAQAGLLITAFAWVYAV SAPLFGMVLGRWNRRSTLGAAMLVFALGNVLVATAPTYGWMLAARVVSALGSSMATPTAM ALATSLAPARHRGKAISVVSTGMTLATVVAVPLGALMGGRYGYRAVFWAMGGGAIAVLLA ILASIRAREATASKSVPSIRALLMGLADRQTVLTLAVTITIFIAGYACISYLSAIVGQAG MGGQFSLVLLVFGLGSLLGGVLGGWLCDRWPSVVLARAGMGVFAVSLAALGLAAASDRHP WAVHVCVAAWTVSAWVAVIAQQYRLIALAPERAQLNLSLNSSSIYIGQGFAGVLGAAVIS AQPAHVIPLVASAVVVLALAMTAPYETMRGDEPTKG >gi|319977933|gb|AEUH01000170.1| GENE 62 64577 - 64897 234 106 aa, chain + ## HITS:1 COG:alr1867 KEGG:ns NR:ns ## COG: alr1867 COG0640 # Protein_GI_number: 17229359 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Nostoc sp. PCC 7120 # 9 103 8 101 112 68 41.0 4e-12 MSRRRDDADTREAALARVFAALSDPLRLRIVRRIALEGELGCGQIDHGLSSSTVSHHCKV LREAGIVVSTPVGNQRVLTLRPEFERDYQGLLASIVRRGSAAPGGD >gi|319977933|gb|AEUH01000170.1| GENE 63 65055 - 66008 1114 317 aa, chain + ## HITS:1 COG:DR0011 KEGG:ns NR:ns ## COG: DR0011 COG0702 # Protein_GI_number: 15805052 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Deinococcus radiodurans # 40 311 7 282 293 100 27.0 3e-21 MALTARWNLTTGCLCENRRMSTNTQNAHDPARSSDARGLPPLGITGASGNVGGTVARVLA EAGIPMRLLANTPSRAPQLPGAAAVQCSYENTGASRAALAGVTTLFMVSAPESADRLARH KDFIDAAAEAGVRHIVYLSFMNAAPDCVFTLGRTHFHTEEHIKASGMAWTFLRDNFYIDF FVDLPDKEGVIRGPAGDGRVGAVARSDAGRVAAAVLRDPGRHVGRTLNVTGPSSLTLDEI AEILTGALGRPIRYERETVDEAYESRKKWPVEQWQYDAWVSTYTSIAAGQMDVVSTVVED LTGRPPLSVRDVAEGRG >gi|319977933|gb|AEUH01000170.1| GENE 64 66181 - 66822 867 213 aa, chain - ## HITS:1 COG:lin2125 KEGG:ns NR:ns ## COG: lin2125 COG0671 # Protein_GI_number: 16801191 # Func_class: I Lipid transport 
and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Listeria innocua # 11 198 13 203 231 73 32.0 2e-13 MFLRRLLSAPVIIGAILVVLIGIAGGLITNPTPIDSSVLTGFIDSRSPGVTALMNGVGFV FDPNAVAVMAVVAGLIVWRVVKKILHGVFVTGSVALSAVNAHIIKALDGRPRPPEVTRLA AETSPSFPSGHTTAITAFLAALVLVLTMTRWGRRLRHLLWIAALAFIVFIAVSRLYLGVH WLTDVLGGFGIGLGSTMILTPFLLGPNVLRRAL >gi|319977933|gb|AEUH01000170.1| GENE 65 66844 - 67224 172 126 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90021194|ref|YP_527021.1| ribosomal protein L20 [Saccharophagus degradans 2-40] # 4 123 4 131 134 70 36 2e-11 MRPCPCGSGSAFDRCCGPYLGGAPAPTAVALMRSRYTAFALGDEDHLFRTWHPRTRPRPP YWAPGTSWTGLRVVAVEGGGEADEEGIVTFEASWRDGATGRTGAQRERSRFERRRGLWVY VSGTDR >gi|319977933|gb|AEUH01000170.1| GENE 66 67221 - 68246 689 341 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 36 332 1 298 306 270 48 2e-71 MSFPVLTGLVHAEVKAPEGAATAPLPDSGVPEGATVHEVVVVGSGPAGYTAAIYTARAGL SPVVLAGALAAGGALMNTTEVENYPGFPTGIQGPDLMEGMREQAERFGADIRYEDAVSVS LAGDVKAVATDDEVLYARAVVLATGSEYRHMGVPGEDRLSGHGVSYCATCDGFFFKDKRL AVVGGGDSAMEEALFLTNFASEVVVVHRRDQLRASKVMAQRAFDNPKISFAWNSTVSEVL GDDSVTGLRLASTADSSESVLDVDGVFVAIGHLPRTGFLAGQVALDPAGYIVVDEPSTRT DVPGVFACGDAVDHTYRQAITAAGSGCRAALDAERWLAERA >gi|319977933|gb|AEUH01000170.1| GENE 67 68287 - 71346 3367 1019 aa, chain - ## HITS:1 COG:Cgl3028_1 KEGG:ns NR:ns ## COG: Cgl3028_1 COG0728 # Protein_GI_number: 19554278 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, putative virulence factor # Organism: Corynebacterium glutamicum # 5 414 76 472 640 185 33.0 5e-46 MSSKSRPSILRASALMASGTMVSRLLGFVRSAMLLAAIGAASGGVSAAFQTANTLPNTVF NLLASGVFDAVLVPQIVGALKRKHDGETYVNRLLTLAGTLLFVVTVVAMVAAPLLVIITA AGYDSEIRNLAILFALLCLPQLFFYGLYNLLGELLNARGIFGPYMWAPVVNNVVGIAGLG AFLAIWGSTDGRIDVGDLSSPQFWLLAGSATLGVITQALILLIPMRRAGIRFRPDFRFRG TSFGSASKVAGWTFATLGVSQVGVLSTNNIAALADAYAKDNAGEMVIAGINAYSTAFMIF 
MVPQSLITVSLATAIFTRMAEAVADGNDRGVAHNYALGVRTITSLTLLAAAILMAASVPM MQMVLYSTANQQVVMAYALVLASLMPGVASTGMVLMSQRVFFAYEDVKPVFLMGIGPTVL QALVGWGMYFTTGASWWVVGAALGETACRLMQGVIAVAWVGKRVPLVDKSAMLRSYFKYL VSAIVAGLVGFGMLWLVGVTTPLSPAPVRFLVAVLKVVLVGLSSTMAYMLVLRALSSTES ATTMRPLLRRVRVPEGVVNVLAASSATDPSVTGIMTRIKPVTREDLMAGAHNGGGSASSG PGPDGKRIPSFEEVVAPSFQGPVYPQAHPAPRSIDEDAYRSPLLPVSEDFPSPAPSPKSP ITHQIPLVRRRKPKEDAALSPDDGAPVVPAPSDAPEPAPLPGQPQVAQPHDTGAADPDPT VAVPVVARAAAPDRLPSEAPAAPQEESAAEDPSTQQTEVVPMDATLFESLEDYRAHEQAQ QPVAPVAPQWPGQPGDGAGQKSAQHGRTMDPTRPTLIFAAVVFLVGAVWSTWMVLTPATD LDLAQSLRSGVAQNTPQDAPQSALATTPPPTSAPTAPPVISSVSVLSWDNDNGDHPDRAI NMIDSNPATSWTSRWFDDNQFREGAEVTIVVKFQQVATVSSVTLTMDSQTSGGLISVRKV TDPSNPRGGTELATSALSPSTTITLPEPVSTDSIALVFRQMPKAQDGRNWAWVSELTVN >gi|319977933|gb|AEUH01000170.1| GENE 68 71343 - 74024 2620 893 aa, chain - ## HITS:1 COG:no KEGG:Arch_1808 NR:ns ## KEGG: Arch_1808 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 510 801 338 633 663 96 30.0 4e-18 MGTVTRGFLRAAAAAAAALCALAPLSGASAAHPAPMGTEAARTAGAVAQAVGAREERPSV LGSQSASSGLTVAVSSLNPRIITSEEELLIAGTVTNTTSSPLSEVALTITVGTTTPVSVA ALTNALSEDAGSSEVSTTGLGGIAAGASAVFEVRVPTSSLPLGAASGWGPRVLTATASSG GATGSDNAIIVWDSGETVAASRVNALVPWTSESASQGAREREAVLAIAGRTGATLAVDPL LIPRGGQPTEAPPQDGAGEEDPESGQSGQSDQSDQPAPTPSPSASASPAPTPSPTPTDPV ENTRYRANQSFTASLLRTAPELVALPAGDADLGALALAGDAGLQTKALQSIAAFPSTPRK AGWVAPAPAPSASPSAPTSPPGSAAPSPSPSGSAGPSAPASSAPTPSTGATETPGAQTGE TGIDAAEADEGPRIHTDVVWPSDTAFGTTQLSAYPGRLTIAPVGSLMPNDDVPFTSVARV RVDPSTGETIGVIPTTTVDGADVLAQQGDIAALLGWDNSQSEADQLDAEQAITAITAIIT RERPYSSRTLFTAAPRGTAVSASLTGKLDALLTNRWVEPVSFGDMADSEATDLERRTAGA GSLDAGTEAAVTSLTQALADLAPLAKATEEPDEVFSQVEPFLLPSIGAGLSPDSQVAAAS RYAAQVAGLLASIAVEPSDAVNLINKSAAFPVRIRNSLPWDVHVDVTLIPDDPRLQATPA TNRVVTANSATTVEVPVSAIGSGNIQVKYHVTTPSGAVLDNKQSVRVRMRAGWEDAFTMV VASLFGLLFAMGVVRSLRRRRARAALVGAGTASAESAGEGHGGDVRGPADHGSSGQCADD GAPADSAEDGAGPTDGGPAGAVPAEEDPGAPLGATGPGDPRSAEHDTEEKEQT >gi|319977933|gb|AEUH01000170.1| GENE 69 74024 - 74467 603 147 aa, chain - ## HITS:1 
COG:ML2698 KEGG:ns NR:ns ## COG: ML2698 COG0494 # Protein_GI_number: 15828458 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Mycobacterium leprae # 6 145 67 206 251 139 53.0 1e-33 MPVSAETSAGGIVIDVRGGVPYAALIARRNRAGRIEWCLPKGHLENGETPQQAALREVAE ETGIRGRIIRHLASIDYWFSGSDHRVHKVVHHYLMGYASGAISVAGDPDHEAEDAAWVPL RDVARQLAYPNERRIVAIALDLLYKDA >gi|319977933|gb|AEUH01000170.1| GENE 70 74620 - 76218 1967 532 aa, chain + ## HITS:1 COG:MT4026 KEGG:ns NR:ns ## COG: MT4026 COG0617 # Protein_GI_number: 15843539 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Mycobacterium tuberculosis CDC1551 # 19 499 3 480 480 477 54.0 1e-134 MSTDDLPTGESLFSAENGEITDLATLMANATRTFRGLPAAIMELGTVFESAGEEIALVGG PVRDAFLGVAPHDFDLTTSARPGRTEELLGQWGNAVWDIGKEFGTIGARRGDVVVEVTTY RADSYQIGSRKPEVTFGDTLEGDLTRRDFTVNAMAMRLPRMALVDPCGGLADLAAGNLRT PVSAEQSFDDDPLRIMRAARFSAQLGMDVDLDVMNAMEDMAHRLGIVSAERIRTELERLI VSPHPRRGIELLVHTGVASIVLPEVADLVSTVDEHRRHKDVYEHTLTVVEQAMDLETGPD GPVPAPDFILRFAALMHDVGKPATRRFEPNGAVTFHHHEVVGAKLTRKRMRALHFDNATT DAVCQLVALHLRFHGYGESAWTDSAVRRYVSDAGEQLERLHRLTRADCTTRNRRKANRLS AAYDDLEERISTLREREELDAIRPDLDGDQIMEILGLRPSRAVKMARDHLLELRMEHGPL GPEAAREALERWWASDAVRAQAAALQAEQAKWEEKAARKRARKAAAKAGRGD >gi|319977933|gb|AEUH01000170.1| GENE 71 76342 - 77808 1948 488 aa, chain + ## HITS:1 COG:Cgl1417 KEGG:ns NR:ns ## COG: Cgl1417 COG0362 # Protein_GI_number: 19552667 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconate dehydrogenase # Organism: Corynebacterium glutamicum # 11 478 8 476 484 580 63.0 1e-165 MTTFVPEASTADIGVYGLGVMGANLARNLARNGYTTAVFNRTQARTEKLISEHGGEATFV PSSTLEDFVASLRAPRVAIIMVQAGPATDAVMEQLAALMDEGDIIVDCGNSLFADTIRRE KWAAERGLHFVGAGVSGGEEGALWGPSIMPGGTRASYERLGPMFEAISATYDGVPCCTYI GANGAGHFVKMVHNGIEYADMQVIAEAYTLLREGLGAAPAQIADIFAAWNEGELNSYLIE 
ITAEVLRQVDASTGVPLVDLIVDAASQKGTGKWTVQTALDLAVPVTAIGEATFARGASCE PAQRAAGQKLAGNATPLVIETDEARAAFVEDVRRALFASKIVAYSQGFDEIGAGAAEYGW DIDKGALARIWRAGCIIRAAFLDDITRAYEADPALPLLLAAEPFATRFQECTPALRRIVS QAALAGVPIPVFASSLAYFDQIRAARLPAALIQGQRDFFGSHTYRRVDREGIFHTLWAEP GRPEEQWG >gi|319977933|gb|AEUH01000170.1| GENE 72 77963 - 80425 3124 820 aa, chain + ## HITS:1 COG:MT0056 KEGG:ns NR:ns ## COG: MT0056 COG0744 # Protein_GI_number: 15839427 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Mycobacterium tuberculosis CDC1551 # 92 693 144 754 820 265 35.0 4e-70 MADNNGTMNWDDIMGPGPASSDPAPRTPAKARGGGGPEPRRRGDRKGPASRRDAKGDADS GGKDGKGKKGGKKKRSLGRKIGLGIVFTLLGIVILGMATFLYLYATISVPKPDDLALAEK TTVYYADGTTEMGTLGDVNRQIIDASTLPDYVSKAVVASEDRTFYTNSGVNFKGILRALW NNLRGGARQGGSTLTQQYVERYFMGETTSYVGKLKEAVLAVKINRTQSKDEVIGNYLNTI YLGRGAYGIEAASQAYFGHPAKDMTLSEAAMIAGIIPAPSAWDPAISPDKAKERWERVLG LMVEDGWVSQADADAAVFPETIDPDSLNRESMQGTNGYLIAHIKQELVNSGAFFEDQITQ GGLRITTTIVKERQDEAVAAAQTMNSVNGWDPAHQHVALSSIDPNTGEILAEYGGPDYEQ RQQNAVTQDIAMAGSAFKPFALLANARQGGTVNDTYSGASPQTFPGLVEPVTNDGGYSFG NVTLVKATAYSMNTPYVALNRDIGPETTMQAAVDAGIPQDTLGLNDQLLNVLGSASPHNI DLATAYATIANGGQRVEPHIVRQVTNSGGSNKLYETTVAPTRVFEAEEVSSIMPALEAVT TGEGTADSVAGALRGFTTAGKTGTSSDQLSAQFVGFVPNMVTAVSMYQSDDDGNSVPLEN IGGLDQFHGGDWPVDVWTNYMQNATKSLTEKDFTWIVKSNRNSKNNAPSSVPAPTQEPTQ EASNPEPTYEPTRAPEPEPTQTATPGNGGGNGGNGGGNGGGGNGGGNGGGGNGGGNGGGG NGGGNGGGGNGGGNGGGGNGGGNGGGGGGGGANPGAAGGN >gi|319977933|gb|AEUH01000170.1| GENE 73 80659 - 80913 275 84 aa, chain - ## HITS:1 COG:no KEGG:FRAAL5563 NR:ns ## KEGG: FRAAL5563 # Name: not_defined # Def: WD-40 repeat-containing protein # Organism: F.alni # Pathway: not_defined # 1 77 1495 1570 1578 62 49.0 4e-09 TGTRDGTTRIWDATTGEPVRFFITALPDGECAVLTPDQTRVIGASPYAWRWLGRYAVHPD GALERIPVEIDGPLPSLGPGTQAD Prediction of potential genes in microbial genomes Time: Thu May 12 18:21:52 2011 Seq name: gi|319977916|gb|AEUH01000171.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00171, whole genome shotgun sequence Length of sequence - 16612 bp Number of predicted genes - 18, with homology - 11 Number of transcription units - 14, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 3085 2553 ## COG5635 Predicted NTPase (NACHT family) 2 1 Op 2 . - CDS 3085 - 3561 537 ## FRAAL5562 hypothetical protein - Prom 3595 - 3654 2.9 3 2 Tu 1 . - CDS 3719 - 7222 1622 ## COG2319 FOG: WD40 repeat 4 3 Tu 1 . - CDS 7345 - 7833 492 ## FRAAL5562 hypothetical protein - Prom 7867 - 7926 2.2 5 4 Op 1 27/0.000 - CDS 8209 - 8667 420 ## PROTEIN SUPPORTED gi|227384530|ref|ZP_03867939.1| LSU ribosomal protein L9P 6 4 Op 2 21/0.000 - CDS 8686 - 8928 344 ## PROTEIN SUPPORTED gi|227494167|ref|ZP_03924483.1| ribosomal protein S18 7 4 Op 3 24/0.000 - CDS 9042 - 9635 748 ## COG0629 Single-stranded DNA-binding protein 8 4 Op 4 . - CDS 9671 - 9961 358 ## PROTEIN SUPPORTED gi|229822655|ref|YP_002884181.1| ribosomal protein S6 9 5 Tu 1 . - CDS 10202 - 11785 1768 ## COG0534 Na+-driven multidrug efflux pump 10 6 Tu 1 . + CDS 12186 - 13538 2021 ## COG0305 Replicative DNA helicase 11 7 Tu 1 . - CDS 13535 - 13777 117 ## + Prom 13574 - 13633 2.9 12 8 Tu 1 . + CDS 13665 - 14066 704 ## 13 9 Tu 1 . - CDS 14195 - 14896 763 ## 14 10 Tu 1 . + CDS 14754 - 15014 77 ## 15 11 Tu 1 . - CDS 15092 - 15241 117 ## - Term 15467 - 15497 4.3 16 12 Tu 1 . - CDS 15551 - 15769 332 ## 17 13 Tu 1 . - CDS 16151 - 16360 360 ## COG1278 Cold shock proteins 18 14 Tu 1 . + CDS 16359 - 16565 121 ## Predicted protein(s) >gi|319977916|gb|AEUH01000171.1| GENE 1 1 - 3085 2553 1028 aa, chain - ## HITS:1 COG:alr3466 KEGG:ns NR:ns ## COG: alr3466 COG5635 # Protein_GI_number: 17230958 # Func_class: T Signal transduction mechanisms # Function: Predicted NTPase (NACHT family) # Organism: Nostoc sp. 
PCC 7120 # 376 1026 376 1003 1526 145 25.0 4e-34 MPKSPTAPQRPLSASAVRRILDGLDDLTQGNAKMARAICDLLDGDGKTTIEAVLRAVPSE KEPREYLRKFKTRFNQDAGGAITVVFDGRRGPDLYFTGTDPIAEDIAEYSEHQTYRHTSS EQMIDAYAVEEKDPRSDERPELRIYLSFLEQDGDRAVSCADLLERTIKVGLKDRYHVSFF RPDRVLTGEVIDETLKDRIGEADLAIVLVSYAYLDEGIEAQRVRNEHEHPIIVPLENISP SASFGQFDRHDIVSGPSFDEALDSRHPTRSPRAQEIANSVTDLVERGLAKDQILPPMPSA SELSSDVLAEELWTHAEQRHISSETISTRALTEEFDTTRADDAPTYSDRTADTPNVVERL TEWARGSDGENSRLCALLGDLGTGKTTSAILLTRRLLELRKAGEQVLLPIYFDLRDLSPT GLADFGLRTLFTRLLSVSTKSAVMVDDVLDAIRSEPTLVVFDGLDEVLVHLSPGDGQRLT RSLLEVLTLSGPGTETEPPPTRLLLSCRTQYFRTVKEEFAFFDGQGRERVRGKDYLVLTL LSFDEDQIREYLRRNIPDSDPDRLLEMIRSVHDLRALASQPVMLDMIRQILPAIEEDLRA VRRVRSVDLYERFVDQWLSRDDGKHSLIPEHKIQLMTHLAWQVWRSGARTWSARWMETWL LQFLHAHPEMELDYERRMPNQWKQDFRTATFLSRRGDDFAFAHSSLLEYFLARRLTDSLA ADSDDDALAAWDITQPSNETYTFFAELIDRLPDSDQRQALARLEHVGKRGTPEARANTFA YTLQSLERDYPHPRPTAINLSDTDLRGWTIGSEQTHVDLSGVPLTGARLDDAHIRHVLLD RVDAAGASMQRALFEHCSLTNANFTDANLAGTIFRHCDLEGSSLTEAHRHRTQFLHTTGT PQQLSGTVTAPLAGHHPAQICAETQIFGGHSGSISSAAWSPDGARILTGADDGTARVWDA RTAETTLELTHQSSVTTVAWSPDSARILTGTTDGTARAWDARTGKTILELTHTDWVQAVA WSPDGAHI >gi|319977916|gb|AEUH01000171.1| GENE 2 3085 - 3561 537 158 aa, chain - ## HITS:1 COG:no KEGG:FRAAL5562 NR:ns ## KEGG: FRAAL5562 # Name: not_defined # Def: hypothetical protein # Organism: F.alni # Pathway: not_defined # 16 119 17 123 157 72 42.0 4e-12 MSDSKLYILNAREDQELAGSFLALAEPRLKSVRDHAFAWWSPSSLLPGEERRTQIDDRLA ESDYVILLLSPDLLVYLRDEDAPKISDSYAKLLPVMLVDVPLDRSMELFGIDDRQIYCGD TVEPHSYVSLDSAYQRNRFVDGFITRMIRRINGKGGYQ >gi|319977916|gb|AEUH01000171.1| GENE 3 3719 - 7222 1622 1167 aa, chain - ## HITS:1 COG:MA2525 KEGG:ns NR:ns ## COG: MA2525 COG2319 # Protein_GI_number: 20091353 # Func_class: R General function prediction only # Function: FOG: WD40 repeat # Organism: Methanosarcina acetivorans str.C2A # 786 1102 754 1073 1233 209 33.0 2e-53 MSDEKPTLRIYISFLNKDGERAVSCADLLERTIKVGLKDRYQVKIFRPDRVLTGEVTDET LEDRIGEADLAIVLVSYEYLDEGIEAQRVQDMHKHPIIVRLVHILPDASISPFHWADFIN 
VQPYVAAKDENERQSTANHVANFVGRILTERETASRRAEPRLTPEEIAEELSNHVELELA AEAIPARANRGIIEHREDQDIPNENTLELSADITDAVLHLIAWANGKVDEGSRLCALLGD LGTGKTTSAILLTRRLLALRKTAETAPLPIYFDLSDLSPKGLTDFGLHTILEHLVAGSSK STITVADLCNVIRSESTLIIFDGLDEILIHLSRADGHRLTRSLIEALTLGEQCSTRVLLS CRTQYFRAYKEEVSYFEEKGRRHSARGVILTLLSFNEQQILEYLRRNIPKSNPSSLLKLI RSVHDLHALAAQPVILNMIRQEIPRIEEDLDVGRRVHSVDLYGRFINQWLNRDDGKHRLI PEHKIQLMTHLAWQVWNSGSRTWSARWMETWMLQFLHAHPEMELDYKERMPDQWKQDFRT ATFLARRGDDFAFAHSSLLEYFLAKRLTDSLVADSEDEAPAVWDIVQPSDNTYMFLGELI DRLPTADRRWAFTHLEHIGKHGTPDARTNVFAYTLQALERGYPHPRPRALNLSDTDLRDW KIGSKQTHVDLSGVPLTGARLDYAHIQHTRLDRADAAGASMQRALFEHCSLTNTDLTDAD LTGTIFRHCNLEEAPLDKARRHRTQLLHTSGTPQQLSDVLIAPLTRHGLLRILPEIQILG GQSDDVTAVAWSPDGAQLLTVSRDGTVRVWDATTGENTLTLTHTESVTAVAWSPDSTHIL TAGIDRHIRIWNLSRKTNIALIHPFGWFRYIEWSPDSTQVLAGTEEGTVGIWSTDNDHYT IKFEHDCYLQAALWSPDSKHILTGSGKSVHIWNTATGTIVLTLNHSSWVRSVAWSPDSKH IAVGLEQNATCVFTAATGATVLTLNHSSWVRSVAWSPDGSKIVTQTNSDICVWDMRTGKK THTFAHNGNLIMAEWSPDSTHLLMKLKHNTHIWSSITGNTTLTLTHTGRINSATWSPDGN HILTASDDGTARIWNAATGEPVRFFITALPGGECAVLTPDQTRVIGASPYAWRWLGRYAI HPDGTRERIPVEIDGPLPPLGPGTPTE >gi|319977916|gb|AEUH01000171.1| GENE 4 7345 - 7833 492 162 aa, chain - ## HITS:1 COG:no KEGG:FRAAL5562 NR:ns ## KEGG: FRAAL5562 # Name: not_defined # Def: hypothetical protein # Organism: F.alni # Pathway: not_defined # 6 126 9 127 157 109 46.0 3e-23 MSGIQLFLSWAHDDAEAKDSFLKVLRPRLKSARGHTFTWWKDSFILPGEEWKREILTQLA KADYIVQLISPSFLASKFIRDYEIPGVGEAPLKKTLPVMLVDVPLDGSMEFHQIDRRQIY CIQQGRSRYSYISRETDYQRNQFVDGFVERVLARVEGKTGYR >gi|319977916|gb|AEUH01000171.1| GENE 5 8209 - 8667 420 152 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227384530|ref|ZP_03867939.1| LSU ribosomal protein L9P [Jonesia denitrificans DSM 20603] # 1 150 1 149 150 166 57 1e-40 MATKKLILTQDVANLGHAGQVVEVKAGYARNYLIPRGYATAWTKGAQKQIDQMAAARRRH EIASIDDARAVRDTLQAALVEVSGKVGQSGRLFGAVSAAAIADAVKEQLGQTIDRRRVVI AAPIKAVGDYTVNVNLHPEVSATLKVRVSASK >gi|319977916|gb|AEUH01000171.1| GENE 6 8686 - 8928 344 80 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227494167|ref|ZP_03924483.1| 
ribosomal protein S18 [Actinomyces coleocanis DSM 15436] # 1 80 1 79 79 137 87 6e-32 MAKPQLRNKPIKKKANPLKSAKIEKIDYKDTALLRKFISDRGKIRARRVTGVSTQEQRKI ARAVKNAREMALLPYSSTGR >gi|319977916|gb|AEUH01000171.1| GENE 7 9042 - 9635 748 197 aa, chain - ## HITS:1 COG:MT0060 KEGG:ns NR:ns ## COG: MT0060 COG0629 # Protein_GI_number: 15839431 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Mycobacterium tuberculosis CDC1551 # 1 156 1 161 164 166 59.0 2e-41 MAGETVITIIGNLTADPELRWTQSGAAVADFTVASTPRTFDRNAGEWRDGETLFMRCSVW REAAENAAESLRKGMRVIVQGRLNQRSYETQQGERRTVVEMQVDEVGPSLRRARAQVTRT TPQGGGGYQADNRGGGGQGPQGGYNQGGGYGGGQGQQGGYGQGGSANYGAPAGGSADDPW RSPDQGGATSFGDEPPF >gi|319977916|gb|AEUH01000171.1| GENE 8 9671 - 9961 358 96 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229822655|ref|YP_002884181.1| ribosomal protein S6 [Beutenbergia cavernae DSM 12333] # 1 96 1 96 96 142 62 2e-33 MRKYELMVILDPSIDERTVAPTMEKYLAPVTAGGGSVENVDVWGKRRLAYDILKRSEGIY VVIDMTTTPEVALEINRRMGIDETILRTKLLRPDAH >gi|319977916|gb|AEUH01000171.1| GENE 9 10202 - 11785 1768 527 aa, chain - ## HITS:1 COG:Cgl1936 KEGG:ns NR:ns ## COG: Cgl1936 COG0534 # Protein_GI_number: 19553186 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Corynebacterium glutamicum # 90 476 16 398 435 162 32.0 2e-39 MRRWNMTGTTPSADNPQPGVETCRAPHGTEDPRPQRPAEEPHSRSGAEDPRPQLPAEEPH SRGGAEDPRPHRAADAPQRPTEEPRASIPRQIVALALPALGALVAQPLFTVIDSAMVGHL GTPELAGLGIASTVLNTVVGVFVFLAYSTTAIAGRALGAGRPDRAIRGGVEAMWLAAGLG LVAAAALSLGADPLLRGLGADALTLPHASAYLRWSSPGLVGMFVAYAATGTLRGLQDTRT PLIAATAGAAFNACANWALMYPLGMGVAGSGLGTALTQTLMAAFLVAVVVRGARRERVPL RPSTSGVLAAALDGAPLLVRTIALRAALLATLATVTAIGTQALAAHQIVWTLWTFAAYVL DALAIAAQALVGFAEGRGDRGGMAPLLRTLARWGTAFGALVGVVLGAASPWLPALFTADP AVRGPASWAIVVGALFQPLAGLVFLLDGILIGAGRGRFLAAASLVNLTLYAPLLWLIARS SSTGALAGSPAVALALVWAAYGAVYTGARAVTNTWGTWLGRAALVRG >gi|319977916|gb|AEUH01000171.1| GENE 10 12186 - 13538 2021 450 aa, chain + ## HITS:1 COG:Cgl2923 KEGG:ns NR:ns ## COG: 
Cgl2923 COG0305 # Protein_GI_number: 19554173 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Corynebacterium glutamicum # 8 440 74 506 510 461 56.0 1e-129 MADEEFDRIPPQDIDAEMSVLGSMMLTSEAVSVVSEILRPQDFYQPSHETIFDTIVELTG RAEPADAITVAGELKRQGLLSKVGGAAYLHSLIASVPTAANAGYYARIVRERAIMRRLVS AGTRVVQLGYADAGGDVDEIVDQAQAEVFAVSDGRDTSDYVSMNQMSSDILGALEDIERN QGKLRGVPTGFIDLDELTQGLHGGQMIIVAARPAMGKSTLALDFCRSASIKHGITSLIFS LEMSSQEIAMRMLSAESGVFLSKMQAGKMNDRDWQNVSATTSRISEAPLYVDDSANITIA EIRSKCRRLKQQENLGLVVVDYLQLMTIGRQVESRQQEVSAMSRNLKLLAKDLDIPVVAV AQLNRNSEGRTDRKPMMSDLRESGSLEQDADIIILLHRPDYYDENTERKGEADIIVAKHR AGPTATIGVLFQGDVARFANRTDRPEPGQG >gi|319977916|gb|AEUH01000171.1| GENE 11 13535 - 13777 117 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MARKACRSGRCLAASASTSAEVIGFAAICGVQLAGSTIGLFSRRYGRCVADYATFLTVRQ GHARAVTISWSERDGGAAAL >gi|319977916|gb|AEUH01000171.1| GENE 12 13665 - 14066 704 133 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVDPASCTPQMAANPMTSAEVLADAARHRPDLHAFLAINPALPPELRAWLALGADPGVRA ELGPHPAPPVVALVCAREMGLFDDGGEEWEEPFEMGVIEVEPLDMSLLGLAAAPAKPKGK RNVLITGNLVAVG >gi|319977916|gb|AEUH01000171.1| GENE 13 14195 - 14896 763 233 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAWAAGPLLIVALAGCSSATPQSGQTPAALRTTQLPQSAPQSSSTTSTTSPSPAVLATAV APSAPSQCSASATAPNGRCLSSASVSWLPGLPDPSALDWQDADEVAKAYVITYHTWDASK DENQEYAAVRAAIYETGDFSRTQALDPESAQASKEFLPLKTDGARTTATVERVTTEGADT NPQRPNGTWQRNIFYAREYPDGRFNTQRGVGWITLTRGEGGSWFVSDAKWTTV >gi|319977916|gb|AEUH01000171.1| GENE 14 14754 - 15014 77 86 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEVVEEDWGALWGSWVVRSAAGVCPLWGVAEEQPANATIRRGPAAQAMERDVIFASGDSC ALWAPRPGRRAFRVPILGGARVRVKT >gi|319977916|gb|AEUH01000171.1| GENE 15 15092 - 15241 117 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGQAINLPQWTVSYPNAGQRLPTPMHRFLPQCGTPLAPPAPHGSTRAPG >gi|319977916|gb|AEUH01000171.1| GENE 16 15551 - 15769 332 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEHIVKSETAPTMAQARVTASVLVSMIVVLAALGMFAAAQAGSTASVVLAVVIALTALIT APALFSLTRKAA >gi|319977916|gb|AEUH01000171.1| 
GENE 17 16151 - 16360 360 69 aa, chain - ## HITS:1 COG:VC1142 KEGG:ns NR:ns ## COG: VC1142 COG1278 # Protein_GI_number: 15641155 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Vibrio cholerae # 1 62 4 64 76 73 54.0 6e-14 MASGHVKWFNDAKGFGFIIPDDQTGDVFVHFSSIVGQSGRRTLQEGDKVDYVAVEGPRGL HAEEVSRVV >gi|319977916|gb|AEUH01000171.1| GENE 18 16359 - 16565 121 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGVSFRCPVPGARTLSGGGPGPRPRTGNGPGADVELHGTPDPPELPLTCVNMNHIRRSAG IWAGRVPH Prediction of potential genes in microbial genomes Time: Thu May 12 18:22:45 2011 Seq name: gi|319977898|gb|AEUH01000172.1| Actinomyces sp. oral taxon 178 str. F0338 contig00172, whole genome shotgun sequence Length of sequence - 14899 bp Number of predicted genes - 17, with homology - 15 Number of transcription units - 8, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 - CDS 180 - 998 943 ## COG1120 ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components 2 1 Op 2 33/0.000 - CDS 995 - 2023 1275 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 3 1 Op 3 . - CDS 2087 - 3205 1721 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 4 1 Op 4 . - CDS 3262 - 4086 719 ## COG2141 Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases 5 2 Tu 1 . + CDS 4125 - 5561 1086 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 6 3 Op 1 . + CDS 5726 - 5905 79 ## 7 3 Op 2 . + CDS 5955 - 6203 234 ## Mlut_06200 hypothetical protein 8 3 Op 3 . + CDS 6203 - 6562 153 ## 9 3 Op 4 . + CDS 6555 - 7967 1212 ## Mlut_06190 hypothetical protein 10 4 Tu 1 . + CDS 8339 - 8554 169 ## Mlut_06180 hypothetical protein 11 5 Op 1 . + CDS 9425 - 9853 309 ## Mlut_06160 hypothetical protein 12 5 Op 2 . 
+ CDS 9875 - 10330 264 ## Mlut_06160 hypothetical protein + Term 10368 - 10415 3.7 13 6 Op 1 5/0.000 + CDS 10419 - 10787 404 ## COG0640 Predicted transcriptional regulators 14 6 Op 2 . + CDS 10784 - 12685 2010 ## COG2217 Cation transport ATPase 15 7 Tu 1 . - CDS 12722 - 12970 357 ## COG2141 Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases - TRNA 13246 - 13332 57.3 # Leu CAG 0 0 - Term 13206 - 13242 6.3 16 8 Op 1 34/0.000 - CDS 13439 - 14209 643 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 17 8 Op 2 . - CDS 14206 - 14898 223 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P Predicted protein(s) >gi|319977898|gb|AEUH01000172.1| GENE 1 180 - 998 943 272 aa, chain - ## HITS:1 COG:FN0307 KEGG:ns NR:ns ## COG: FN0307 COG1120 # Protein_GI_number: 19703652 # Func_class: P Inorganic ion transport and metabolism; H Coenzyme transport and metabolism # Function: ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components # Organism: Fusobacterium nucleatum # 1 259 4 256 264 193 35.0 3e-49 MTFEVRGGTFAYPGAPAPILDDVGFTVPGRSLMAVLGPNGAGKTTLLRTMIGLQKWDRGR SLIDGRDIADMTPRALGRALSYVPQARNASAIGLSGAEMVMVGRAPHMRVLAQPGRAEEQ MALRALEDIGAAALARMPCRAMSGGQFQMVLIARALVAEPSVLVLDEPETGLDFRNQLIV LRLLERLVAERGLTVIMNTHYPAHALRVADQVLMISAAHRAVVGPTGEVMNERTLAEVFG VDVRIAELEHNGTTVETVVPVSLSAKWAPPSE >gi|319977898|gb|AEUH01000172.1| GENE 2 995 - 2023 1275 342 aa, chain - ## HITS:1 COG:FN0306 KEGG:ns NR:ns ## COG: FN0306 COG0609 # Protein_GI_number: 19703651 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Fusobacterium nucleatum # 19 337 13 329 333 243 42.0 5e-64 MHHLLVVSGRLRRWVVPALVVVIVACATGAIMVGRYEVSLADIVAAFRARLGAGPAVAPR IDSVVFNLRVPRVIVAVLIGAGLSVAGASFQSLFSNPLATPDTLGVTAGTCVGAVAALLL DGNLLGVQAMALAAGLVTVLFTTSIARARTGGFNVITLVLGGVIVSALANAVLSLLKLTA 
DPTSQLPEITYWLMGSLAAVSYGQIALGAPFIIGGAIVVLALRWQLNILALSDDEARAAG VNVPLLRALLVVASTAITASVVSMCGQVGWVGLLVPHIARMLCGSNNRALIPVSLLLGSA LMIAIDTLARTLTASEIPISILTAIIGAPFFIVLLRRTGGAS >gi|319977898|gb|AEUH01000172.1| GENE 3 2087 - 3205 1721 372 aa, chain - ## HITS:1 COG:FN0305 KEGG:ns NR:ns ## COG: FN0305 COG0614 # Protein_GI_number: 19703650 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Fusobacterium nucleatum # 40 361 8 321 336 121 25.0 2e-27 MKSPLKALGALAAAGALFLAGCGGTASTQSTGPQSAPGTRTIVDHAGNQVELPAQITRVA IDQIPIESTYLAYFDGKAPYLVAMSAARVKAMSDTIAAQMAPEMLQVQTDYYDNGELNAE SLASIGVDVVLYNAFNKEHGEMFRKAGIPAVGFTTTGNPSDTYADWMELLEDVFDQDGHM ADKIALGKDLVAGARERASRVAEADRQSTMVLMGAADGQLAVAGGRDGWFTDSWAESLNY TNVTADTDTSHTPVTFEQVLAWDPQVLLVTGKGMSNMTAKSVLDNTVEGVDFSTLSSVRS GRVYSTGLGMWNWFTPNPDSPVVANWLGKSLYPDQFADVDLVSITKDYYQRMYSFTLTDA QALAIVDPDSAS >gi|319977898|gb|AEUH01000172.1| GENE 4 3262 - 4086 719 274 aa, chain - ## HITS:1 COG:Cgl2726 KEGG:ns NR:ns ## COG: Cgl2726 COG2141 # Protein_GI_number: 19553976 # Func_class: C Energy production and conversion # Function: Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases # Organism: Corynebacterium glutamicum # 1 265 76 340 347 316 61.0 3e-86 MRYENPLHLAEEAAALDQIADGRVALGMSRSAPEIAQRGWEAFGYRAEDPRGADLARANF ERFLAAIEGAPMAQAAPLGEQYPTSCAPGTPLRILPHSEGLRQRIWWGSGTTASAQRAAR DGVNLMSSTLVLETGQDSFADLQAAQIAAYRRAWKEAGHSWTPRVSVSRSVFPLLSERDR RLYGLAASGDQVGELDQSPTTFGRAYAADPDALVRQLQQDAAVASADTLLLTIPNQLGVD ANLSIIENFARCIAPELGWVPSTRGPQEQDPLLL >gi|319977898|gb|AEUH01000172.1| GENE 5 4125 - 5561 1086 478 aa, chain + ## HITS:1 COG:Rv1586c KEGG:ns NR:ns ## COG: Rv1586c COG1961 # Protein_GI_number: 15608724 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Mycobacterium tuberculosis H37Rv # 14 476 10 469 469 127 26.0 5e-29 
MVELVSLRKTNEVAAYVRASMDRKGDRWTVETQLRKIRALAEAKDWKVVEVYDDNAVSAT KKRRAGTRWAEMLEDARAGKFSMVVAVDMDRLLRSTKDLNTLIDLGLRVVTVDGEIDLST ADGEFRATMLAALARFEARRKAERQIRSNERRRAEGIPTSAWKAFGWTRDGELIEEEAAA VRRAFDAFLGEPSLSIRRIREDLNSAGHFTARGSEFSVDAVRYLLANPLYCGYIKHYASG ELYPVQGEAFPPIVGEQTWRAAVAKLEDNVRRSARQGNQPKYLLSTIGLCGKCGATLVSG TNSRKQPTYRCGEQFHLTRQREPVDAMVTEAVLTRLSSVDVHDLVMPQEDDGPDREELLT ERNALVDRVKELSPLLRDIHQPVLEITAAINDVKARIDEIDAELLDRSVSAAAKLLADVE EPVGTAERREVVEAKWKVLDVDRRRMLVDELVTVTIEPITPGHVKFDPDLIRIEPRRD >gi|319977898|gb|AEUH01000172.1| GENE 6 5726 - 5905 79 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSEDFRCLSLSTFPAHPRSTALASIRAPLLLRRQTASIPSSPPLVLLVVWPVNAWLARP >gi|319977898|gb|AEUH01000172.1| GENE 7 5955 - 6203 234 82 aa, chain + ## HITS:1 COG:no KEGG:Mlut_06200 NR:ns ## KEGG: Mlut_06200 # Name: not_defined # Def: hypothetical protein # Organism: M.luteus # Pathway: not_defined # 1 82 1 82 82 135 91.0 4e-31 MSANSAAFDHVNGFRWRQGDPSLAESEARLYDLGVLRSVLEESVEIAVADARADGVTWAK IGDALGVTHQAVIKRYGRGGGR >gi|319977898|gb|AEUH01000172.1| GENE 8 6203 - 6562 153 119 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRADARNLSTAPADLTPPGGPGLAPAEGGLPVVVMTVRSRASHSKSFGSLIRSYGRGKGL SIAASCNNVPGQRSVPPALRRVASWDHFGSHLGVNRGSFRRRTLSEALRAVRGGGGRRG >gi|319977898|gb|AEUH01000172.1| GENE 9 6555 - 7967 1212 470 aa, chain + ## HITS:1 COG:no KEGG:Mlut_06190 NR:ns ## KEGG: Mlut_06190 # Name: not_defined # Def: hypothetical protein # Organism: M.luteus # Pathway: not_defined # 1 470 4 473 473 920 97.0 0 MAKQENNPTSIGLTQYLDPSYWTWAAEDPNGAALLQQGAEAILAYVVQRLEATGCEVVEA YGIVHDKDEREVWSDTEKALVIEPKPDHLHAVIKFASRAKSAPLDRLAFGIGVEPQYVEK PGRGRYAYDNMLSYLTHVKYADKHQYAPSEVATVRGPDYLGIDAQRRETWLKGRAHVKKK VVAENFEDMRERVLQGEFTRDQIMLTDELFDIYSRHQREIDDALSAYGQRRAYRAAAKLR AGEFSTHVVFVHGDAGIGKTRFATDFITEAINAANAHGERWQVYRAATGNPLDDWRGEEV LLLDDLRASAMDANDWLLLLDPYNASPAKARYKNKGEVAPRLIVITATIEPVEFFYYARQ KGNVDEALDQFIRRLASVVKVYRADDINRYLVQHIGKIEPYEWHQCSIPTAAHTPGMYGN AYHQNVGSRELTYGPETSAEHDAEGAVAELLGGLAVRSPDVPLALIGGAA >gi|319977898|gb|AEUH01000172.1| GENE 10 8339 - 8554 169 71 
aa, chain + ## HITS:1 COG:no KEGG:Mlut_06180 NR:ns ## KEGG: Mlut_06180 # Name: not_defined # Def: hypothetical protein # Organism: M.luteus # Pathway: not_defined # 1 71 126 196 196 88 90.0 8e-17 MAAHLLGVEARVQASADARELAAQIRPLAINVNDLDRRARAGESVALSAEVPELIESLRE VRALLGDRAAS >gi|319977898|gb|AEUH01000172.1| GENE 11 9425 - 9853 309 142 aa, chain + ## HITS:1 COG:no KEGG:Mlut_06160 NR:ns ## KEGG: Mlut_06160 # Name: not_defined # Def: hypothetical protein # Organism: M.luteus # Pathway: not_defined # 1 142 1 142 300 159 88.0 3e-38 MSAALDRLKQQNEPEQSQQQLGSPETMELLTSILAAVEAQNTRIATLTEQQKKLAGFVKV MDEETTRRLERITAPASTSSPSSDVSARIASIESRQNEIASTLGEFAQSLNGESLNAASR SLVAEAQKNRAATASAIEGLKA >gi|319977898|gb|AEUH01000172.1| GENE 12 9875 - 10330 264 151 aa, chain + ## HITS:1 COG:no KEGG:Mlut_06160 NR:ns ## KEGG: Mlut_06160 # Name: not_defined # Def: hypothetical protein # Organism: M.luteus # Pathway: not_defined # 1 150 151 300 300 186 88.0 2e-46 MSQVGGAVQRIEKRTEERVEKAVEQVAGEASATMTASLDASNARAERIIAATAKLEARQL WSAAAAMCLVLLPVAVVVAGLWMGIAGLITGVQWALDVDGSVWLGIGRWLVVGAGLAGAG YGLFASVSWVAGLVETWKGRGMPKWPRWRKR >gi|319977898|gb|AEUH01000172.1| GENE 13 10419 - 10787 404 122 aa, chain + ## HITS:1 COG:Cgl0882 KEGG:ns NR:ns ## COG: Cgl0882 COG0640 # Protein_GI_number: 19552132 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Corynebacterium glutamicum # 1 119 1 119 119 165 73.0 2e-41 MLTIASRLDVMNRLGRAMADPTRSRILMSLLDSPSHPAVLSRELGLTRSNVSNHLTCLRD CGIVVAEPEGRQTRYEIADPHLAAALTALVDVTLAVDESAPCIDPACSVPGCCAASSSGD AS >gi|319977898|gb|AEUH01000172.1| GENE 14 10784 - 12685 2010 633 aa, chain + ## HITS:1 COG:MT2048 KEGG:ns NR:ns ## COG: MT2048 COG2217 # Protein_GI_number: 15841474 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Mycobacterium tuberculosis CDC1551 # 16 632 140 756 771 484 56.0 1e-136 MSAACGCDEPTTSPADEAEEAERPWWRDPGLVVPILSGVAFGTGLALEWSGLHIAALVAF WAGLLLGAYTFVPGAIRNLVVKRKLGIALLMTISAVGAVILGYVEEAAALAFLYSIAEAL 
EDKAMDRARGGLRALLKLVPETATVLRDGTSASVPAREIAVGDVLLVRPGERVATDGLVR SGRSSLDTSAITGESIPVEVAPGEAVSAGAINSAGVLEVEATAAGTDNSLTTIVELVEQA QAEKGDRARIADRIARPLVPGVMVLAVLVGVLGSLLGDPETWITRALVVLVAASPCALAI AVPVTVVSAIGAATKFGVVIKSGAAFERLGGIRHLAVDKTGTLTRNRPEVTAIVAAHGFD DAQVLAWAAAVEQHSTHPLAAAIAAAGRGTPAAQDVAEEAGHGIGGLVDGRRVAVGSPRW IDAGPLKARVEDLEAEGQTCVLVTVDGFLAGAIGVRDELRPEVPEVVRTLRDQGVEVSML TGDNSRTAAALAKLAGIGDVHAELRPEDKARIVAGFSETSPTAMIGDGINDAPALAGATV GIAMGATGSDAAIESADVAFTGHDLGLIPQALRHARRGGRIINQNIVLSLAIITVLLPLA ITGVLGLAAVVLVHEVAEVVVIANGLRAARARK >gi|319977898|gb|AEUH01000172.1| GENE 15 12722 - 12970 357 82 aa, chain - ## HITS:1 COG:BMEI0017 KEGG:ns NR:ns ## COG: BMEI0017 COG2141 # Protein_GI_number: 17986301 # Func_class: C Energy production and conversion # Function: Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases # Organism: Brucella melitensis # 1 82 1 83 340 105 63.0 2e-23 MKSFGFLSFGHYAPAGRRGPSGRDALHQAIELSVAADELGVNGAYFRVHHFANQAAAPMP LLAAVAARTRRIEVGTGVIDLR >gi|319977898|gb|AEUH01000172.1| GENE 16 13439 - 14209 643 256 aa, chain - ## HITS:1 COG:ML0849 KEGG:ns NR:ns ## COG: ML0849 COG0619 # Protein_GI_number: 15827377 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Mycobacterium leprae # 1 221 6 244 283 73 30.0 3e-13 MSERRKPRRSAGFTLPRALPWPSFLSRVWPGTLIVCWTALTTFVITIPTWATLGAVAAVL VAATALLRVPAGALPRPPVWLWSGFIGGCAGAWLGEGVEVFLRASSVALIVVWGSSLLLW AVPTAAMAGALRVFLAPLRLVGAPVDEWALIMGEALRGMPMLRDQAGAVMDTVRLRLGDE MASMSMRGYARLAIDVVTASLSAASRGAAETGRAMSMRGGLRPPVGERVSLGWRDLVAAL VCAAAVSAIVAAKILL >gi|319977898|gb|AEUH01000172.1| GENE 17 14206 - 14898 223 230 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 31 212 147 326 398 90 35 6e-18 APALSAAAESLWADGVAHSYDVATPWENTVLRDVSLIVEAGEGVLITGDNGSGKTTLSRI 
LTGLMAPTWGKCTLGGRPMTSRVGDVALSMQFARLQLQRPTVRLDILSAAGMGPVIGTRQ GRHSAEADRVVDEAMASVGLDPALQSRSIDELSGGQMRRVALAGLLASRPRVLILDEPMA GLDQASRDLLIAVLEERRRAGLSVLVISHDLEGMDSLCQAHHHLSEGVLS Prediction of potential genes in microbial genomes Time: Thu May 12 18:23:19 2011 Seq name: gi|319977874|gb|AEUH01000173.1| Actinomyces sp. oral taxon 178 str. F0338 contig00173, whole genome shotgun sequence Length of sequence - 22287 bp Number of predicted genes - 23, with homology - 20 Number of transcription units - 14, operones - 6 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1541 1398 ## COG1122 ABC-type cobalt transport system, ATPase component 2 2 Op 1 1/0.000 + CDS 1623 - 3086 1940 ## COG0471 Di- and tricarboxylate transporters 3 2 Op 2 16/0.000 + CDS 3112 - 5235 2710 ## COG0642 Signal transduction histidine kinase 4 2 Op 3 . + CDS 5232 - 5702 552 ## COG0784 FOG: CheY-like receiver 5 3 Op 1 . + CDS 5968 - 7134 1275 ## Arch_0273 hypothetical protein 6 3 Op 2 . + CDS 7131 - 7748 223 ## Arch_0274 hypothetical protein 7 4 Op 1 . - CDS 7690 - 8160 302 ## COG1683 Uncharacterized conserved protein 8 4 Op 2 . - CDS 8175 - 8864 1005 ## gi|154508307|ref|ZP_02043949.1| hypothetical protein ACTODO_00804 9 4 Op 3 . - CDS 8893 - 9678 1190 ## COG0778 Nitroreductase 10 5 Tu 1 . + CDS 9756 - 10223 744 ## gi|154508310|ref|ZP_02043952.1| hypothetical protein ACTODO_00807 11 6 Op 1 9/0.000 - CDS 10394 - 10918 423 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 12 6 Op 2 . - CDS 10926 - 11219 318 ## COG4453 Uncharacterized protein conserved in bacteria 13 7 Tu 1 . + CDS 11409 - 12350 849 ## CMS_2136 putative hydrolase + Term 12553 - 12584 -1.0 14 8 Op 1 34/0.000 - CDS 12395 - 13264 540 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 15 8 Op 2 31/0.000 - CDS 13264 - 14142 1224 ## COG0765 ABC-type amino acid transport system, permease component 16 8 Op 3 . 
- CDS 14165 - 15031 1447 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 17 9 Tu 1 . - CDS 15257 - 15475 254 ## 18 10 Tu 1 . + CDS 15748 - 16014 407 ## 19 11 Tu 1 . - CDS 15928 - 16368 195 ## 20 12 Tu 1 . - CDS 16591 - 17181 833 ## COG1309 Transcriptional regulator + Prom 17182 - 17241 3.4 21 13 Tu 1 . + CDS 17267 - 19606 2534 ## COG1233 Phytoene dehydrogenase and related proteins - Term 19429 - 19470 2.1 22 14 Op 1 2/0.000 - CDS 19619 - 20908 1363 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 23 14 Op 2 . - CDS 21001 - 22287 1381 ## COG0499 S-adenosylhomocysteine hydrolase Predicted protein(s) >gi|319977874|gb|AEUH01000173.1| GENE 1 2 - 1541 1398 513 aa, chain - ## HITS:1 COG:ML0848 KEGG:ns NR:ns ## COG: ML0848 COG1122 # Protein_GI_number: 15827376 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, ATPase component # Organism: Mycobacterium leprae # 29 450 55 474 724 187 36.0 5e-47 MWTICHCPAPGCRSGGHSWDTRGMTDSARAARLAPVEVASCGALSGLAVTFGVASAATPV FHTLFQVATAIPLAMIALRMRRRAWAGAFAATLLMSLAVGGVAAVGRVFQSCVVGVIVGG LHRRRARAALVWLVAVGLGALWGAGTGVAFWVLEDLRRLALEALQKSLDGFLRLLAHVPG AQGPASALQALNQWFVDYWWLWLPFTRLVGVVALLVVAKWLLGRVLDRIDLDRGWDPLAP ALQGRRDEAGAAPLPLFLADAEFTYPGAGSPSLTGVTLALEEASLTGIVGPNGSGKSTLA LLLAGAEPTGGSRHGGGGLGRRGGIALVGQRSELQMLGETVAEDVLWGMDAHEREQVDLA GVLGLVGLGGLESASPHHLSGGQLQRLSLAGALARSPRLLISDESTAMIDREGRAQLMGV LASLPSRGIAVVHVTHDAGEVAGAERVIRVEGGRIVYDGPPSGCPTTSARGAEAVPQERP PAARSASGATAEEPSETGAAPSGDGPAAPPGTG >gi|319977874|gb|AEUH01000173.1| GENE 2 1623 - 3086 1940 487 aa, chain + ## HITS:1 COG:VC1314 KEGG:ns NR:ns ## COG: VC1314 COG0471 # Protein_GI_number: 15641326 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Vibrio cholerae # 57 476 46 477 487 284 41.0 3e-76 MAQRRKGGIAIDPRSAQNMSTRRGGPKKRTAPSPAKVVGIIGAITAITVCLLVQLPGLDA AGTRMFGIFLAAILLWVTEAVPLAGTAVLVIFLEVLLISDHALLPVADGAPPYTRFFGAL 
ANPVIILFLGGFMVADGAAKYKVDRALSAVLLKPFIGKPRLTVMGVMLITALMSMFMSNT ATTATMFAVMMPVITALPKGRARTGIALSIPVAANVGGMGTPVGTPPNAIALGALESAGI EVTFLQWMLAAVPLMLVVLVVSWLFISWRYIPADAEFDIDTTARFQTGRNAVIFYVVAGA TILAWMTEALHGVSANIVGFLPVVVLLITKVMSGDDLRALDWPVLWLVGGGIALGAGVSS TGLDKWMLGSIQWASIPGALLILVLALVGWATSNVISHSASANLLVPMGMGLAASVSTSA AEIAIVLALGCSLGMCLPISTPPNAIAYSTGTTPTREMVVVGVVVGLVGIVLLAFVAPLT WGALGVV >gi|319977874|gb|AEUH01000173.1| GENE 3 3112 - 5235 2710 707 aa, chain + ## HITS:1 COG:VC1156 KEGG:ns NR:ns ## COG: VC1156 COG0642 # Protein_GI_number: 15641169 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 261 692 21 418 431 213 34.0 9e-55 MARKDGGPPGLATGWTLGPDPSDARVRQRTDVRSGGHRLDAHVHVLIVGDDSDRIAVPLT RDLAWAFPNLCFDRVDAPGDLPGYQAGLGADDRVVLGVATSEVADIDALIDAAAALDALA PIQWIAVTDRAEHRDLARSTESDRLASVLKSPWTVPLLVGQSYSTMVRRLQADGCGAGEV LDLIGRPPSFAVQGPLLEGLGRSEHSVVLELLAGVERVLGARPRLVVPEGTQLVTQGKPV GAVHLVLDGKVSLHRDSPRGEVLAHLASSGPLIGLVSLARGESAFFTGVTTTETVLVRLT TEQLQIVLARDPAIGSTLTALAIRSLTRRLMRAEDLHLENAMLAEGLEQQKQALAATLED LRRTRAELVERARFAMLGELSAGIAHELNNPVTALVRAAEHLRADVDTALAASSSTSAQR DAMTRALTAPPRSTAAERALIKELTPIAGDRNGARRLVRAGVADADDARQLLAGGRSQVE AALAGARLGSSLRSVLAASTRVIELTQSLKGYARPDDADLKAVDVREGIDDVLRLTSHRV HGIKVARDYEDVPLIRAHPSKLQQVWTNLIVNAAEAIEDENEDVQAARISAETGYYSGPD PAPARGDAPARITIRVRPDGDGVAITVSDNGPGIAPEVVDKIFEPHFTTKAGRVRFGLGM GMSIVSSIVADHSGTLDVDSRPGATSMRIRLPATPLPDNQPQREEES >gi|319977874|gb|AEUH01000173.1| GENE 4 5232 - 5702 552 156 aa, chain + ## HITS:1 COG:VC1155 KEGG:ns NR:ns ## COG: VC1155 COG0784 # Protein_GI_number: 15641168 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Vibrio cholerae # 5 155 22 172 181 123 41.0 1e-28 MTLTILSLEDEADVRAAIERDLDQFWDTIRLEAADNVEDAWAVVEEIDEDGDELALVLSD HRLPGESGVDFLVALMADSRFASARTVLVTGQADQDDTIRAVNEAGLDYYIAKPWDPSKL REVVRDQLTEYVLASEVDPLPYLPVLDAVRVMEAIR >gi|319977874|gb|AEUH01000173.1| GENE 5 5968 - 7134 1275 388 aa, chain + ## HITS:1 COG:no KEGG:Arch_0273 NR:ns ## KEGG: 
Arch_0273 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 6 376 1 369 382 315 44.0 2e-84 MNGGGMQFTHVHIRNWRNFRDLRFNVGPRLFIVGPNASGKSNLLDVFRFLGDIVAPGGGV AHAVERRGGYKKIRSLFSRQHPSPTIDVTMRDGEEEWCYTLTLRQEGSGRRRVLIEEEKV EKNGRAVLSRPNEQDREDPELLTQTYLEQKVTNRGFRKIAEFFAQTKYFHPVPQVIRGLT GNEIRIADPYGSDFIRQISETPKRTQEARLKRVQAALQRAIPEFEKLRLEPDSMGVPHLE AAYKNWRPHAAWQNETQFSDGTLRLLGLLWTLIATPGEKSSVLLLEEPELSLHSALVSEL PSVLAEARRSSKGSVQIFLSSHATEVVDDGTVNEDEVLVLVPSGDGTKGGLLSEFEGVGR YLDIGIPFSETIRSIMGGAGTPRFVLGS >gi|319977874|gb|AEUH01000173.1| GENE 6 7131 - 7748 223 205 aa, chain + ## HITS:1 COG:no KEGG:Arch_0274 NR:ns ## KEGG: Arch_0274 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 12 194 8 188 194 80 33.0 3e-14 MRVPLTALVYTEGDTDRPVVSAVMKAAGWSGSEFEVFALGGAANLEKRLRQQVAQESPIP RIFIMDSDGKCPVELRRRLMAQGSAATVVLRICHYEIESWILADDQGFSRFFTIPLAKVE SPDGRNAKERMLRCVDRLGRSNTQDFARRSPRSSGAYAFGSRYLVVVGQLMDSEWNAERA ALRSDSLRRALQRLRNLRDRFARAG >gi|319977874|gb|AEUH01000173.1| GENE 7 7690 - 8160 302 156 aa, chain - ## HITS:1 COG:lin2433 KEGG:ns NR:ns ## COG: lin2433 COG1683 # Protein_GI_number: 16801495 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 138 4 141 147 125 48.0 3e-29 MSACLAGVPCRYDGGAKPDPAIVRAVAEGRALPACAEVAGGLPTPRPPAEIRGGDGADVL RGTARVVTEQGADVTAEFVDGARRVADEAVARGVRAAVLQPRSPSCGCGRVYDGSFSGGL VDGDGVLAAALRARGIGVTPHVRSGRGDSGGAEAPA >gi|319977874|gb|AEUH01000173.1| GENE 8 8175 - 8864 1005 229 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508307|ref|ZP_02043949.1| ## NR: gi|154508307|ref|ZP_02043949.1| hypothetical protein ACTODO_00804 [Actinomyces odontolyticus ATCC 17982] # 3 215 8 223 226 316 77.0 7e-85 MQYDDPMPRPGFDTYVFVERIAANAAALHPFPEHHVGLVREALADAGLEVSVLGPGAPEV GEAVYFQSEPMDDDALGLLADALTLRGIGAYAYALVGDSIGDGLGSIALFTRVGDVFPRA GRRILMTRMWISRSPVRKAVTWAFGSPADLDEANLLLSERFETEPVLDPRGMAAIEILHP EFAAGTAEPMALLDEIFQVLGAAGFEGITMCNDPGAGPVPDTDEAGAAG 
>gi|319977874|gb|AEUH01000173.1| GENE 9 8893 - 9678 1190 261 aa, chain - ## HITS:1 COG:lin0935 KEGG:ns NR:ns ## COG: lin0935 COG0778 # Protein_GI_number: 16800005 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Listeria innocua # 14 240 2 227 246 157 37.0 2e-38 MSKFTERSTAPIENATIVNQLSHRTFRRFTDERLTEEQVDTLIEVARRTATSSFLQQTTI LHITDQRVRDEIAAASGQPYVGGDKGDLFIFLVDMYRNKSLREEMGVDSRAASSMNLFIC AAEDAILAAQNTVTAAESMGLGTVFLGSILGDPRRLIAAMELPELTYPILGLLVGHSKQE PLYRPRLPHDVVFARNTYPRVDSFRQTLAEYGKEVSEYYEARGGSAQFDDFGQLVVASLG TGGAHVSPVLESLHEQGLCLF >gi|319977874|gb|AEUH01000173.1| GENE 10 9756 - 10223 744 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508310|ref|ZP_02043952.1| ## NR: gi|154508310|ref|ZP_02043952.1| hypothetical protein ACTODO_00807 [Actinomyces odontolyticus ATCC 17982] # 20 155 2 139 139 140 57.0 3e-32 MDCDIVLDRGAGATSSGLPFTMVGMRKRSILTALVTASLAVSLAACGETDTAHSYTTPKD TKTVEKTLETVNGAGISVPEKRDLKVKLPNGQSAVKWQVDVSDPSFAEVGKSENDVVTIH PLKPLGEDDAPVTVTLTSPDGAATSFTLKITEGAN >gi|319977874|gb|AEUH01000173.1| GENE 11 10394 - 10918 423 174 aa, chain - ## HITS:1 COG:MT0945 KEGG:ns NR:ns ## COG: MT0945 COG0454 # Protein_GI_number: 15840341 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Mycobacterium tuberculosis CDC1551 # 11 170 7 164 166 118 40.0 6e-27 MAVHVEVLQEPRRLQRGDERSAFRSGNAELDDWFHRFAWQNQDQGNTVVYVAVQGTEVLG YYALSTASVTRTELPSSLAPRRRPEPCPCILLARLAVDERAQGMGIGSGLLRDAFARSLE ASRIIGAAALLVHCADGNARGFYLANAEFEASPASDMQLMVSLRAMRRLLDSSA >gi|319977874|gb|AEUH01000173.1| GENE 12 10926 - 11219 318 97 aa, chain - ## HITS:1 COG:STM3652 KEGG:ns NR:ns ## COG: STM3652 COG4453 # Protein_GI_number: 16766938 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 8 84 13 89 95 58 37.0 3e-09 MPPAKTHRMDLRVTERQNLLIRQAAALTDRSVSDFVLTSATLEAQRALADQRVFPVSEDQ FAHFQEIVDRPVTDMPKLRKLLNRPSPFGKHYAIGSA 
>gi|319977874|gb|AEUH01000173.1| GENE 13 11409 - 12350 849 313 aa, chain + ## HITS:1 COG:no KEGG:CMS_2136 NR:ns ## KEGG: CMS_2136 # Name: not_defined # Def: putative hydrolase # Organism: C.michiganensis_sepedonicus # Pathway: not_defined # 15 301 16 305 312 126 36.0 1e-27 MSFAPMAPFASLPVPRGVECVSIPSPCGPLSAYHGPPATASSATAAPAPPVRGPRPGAAG SGRPKVLLVPGFTGSKEDFRLPISDLVDRGFEVLAYSQRGQADSAAPTGAGAYGLGGFAG DVTAIASAWGAGQRVHLLGHSFGGVVARAAAIARPGLFASLTLFSSGNTAKGADRPVPDP VPAGPAGRERVLAAVFPGTDFTRPGLGWEEFMRVRALATSTGNLMGIRAILADQRPRSAE LRATGLPIHVVYGAEDTAWPVSDYRREAREVGAVETPIPGAQHSAQTQRPTAWADAVSRF WLAVDSGHLVWQM >gi|319977874|gb|AEUH01000173.1| GENE 14 12395 - 13264 540 289 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 53 287 5 242 245 212 45 2e-54 MSTESNEEKAAPAAGAAGSSTRAADAPQSFATLEGDSPLSAYFNQSVPVFSAKRVVKRFG DNIVLRGLDLDVASHEVVVLLGASGSGKSTLLRCANLLERVDDGQIFLSGIDITDPHIDA DEVRGYIGVVFQHYNLFPHMSVLDNVTLAARKVKKWSTQRANERGMELLERIGLASKAKE YPDRLSGGQQQRVAIVRALATNPELLLLDEITSALDPVLVGEVLDLVLEIKDQGSTILMA THEIGFARNAADRVVFLEHGRIAEQGPPEQVIGDPQEASTRDFLSRILH >gi|319977874|gb|AEUH01000173.1| GENE 15 13264 - 14142 1224 292 aa, chain - ## HITS:1 COG:AF0232 KEGG:ns NR:ns ## COG: AF0232 COG0765 # Protein_GI_number: 11497848 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Archaeoglobus fulgidus # 72 282 12 218 224 120 35.0 2e-27 MSSHQPDPLELAGLPVSDVELGRRAMRRAKARRSILISVASTLFVAAAVVLAVGTSQGWP RTQETFFSGRHFAASLPQVAAGLWLNVRILVIAVLGVAVLATVLAAARTLGGPVFFPIRF LAAAYTDIFRGIPFLVVLYLVGFGIPALNPNDRIPTAVLGTIALVLTYSSYVAEVLRAGL ESIHPSQRYAARSLGLTHGQTLRMIIVPQAIRKVTPALMNDFISMQKDVGLISVLGAVDA IRAAQQYQAKTYNMTSYVVAGVLFILMSFPFIRLSDWYTAKLRKREQMGGTV >gi|319977874|gb|AEUH01000173.1| GENE 16 14165 - 15031 1447 288 aa, chain - ## HITS:1 COG:AF0231 KEGG:ns NR:ns ## COG: AF0231 COG0834 # Protein_GI_number: 11497847 # Func_class: E Amino acid transport and metabolism; T Signal transduction 
mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Archaeoglobus fulgidus # 32 280 20 264 264 87 28.0 2e-17 MRRSFRPLAALSAIAAAAIALTACSGSTPADSSTAPQSYAKGQLPTLTEGKLTVATSDPA YSPWILDNDPASGEGYESAVVYALAEELGYTQDNVVWTRATFESSITPGAKDWDINIQQF SITDERRNAVDFSTPYYTTSEAIITKEGSPAAGAASVADLKDVLIGVQAGTTSQQFVSEK LEAGLTQTPQQFNSSDDTVLALQTGRVDAIVVDLPTAFNMVATQIDNGVVVGQFADAADG ENYGIVLPKGSKLTASVSEAMDRLRDNGTLAQLQEKWLNEATNVPVLH >gi|319977874|gb|AEUH01000173.1| GENE 17 15257 - 15475 254 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGLDDKFQETAGKAKEAVGDATGNEELRSDGQKDRVAGSAGQIVDKAKEKLGDAKEAVTD KIEEVKDKLTGK >gi|319977874|gb|AEUH01000173.1| GENE 18 15748 - 16014 407 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSEVRVLMPEVLSLVLDAPGIPSEDTKDLRRFSEIDETAAFEVCVGLLIDYEIPLSEELL SRIHEFDDLLFDEDVEDLDTLRSSTVVE >gi|319977874|gb|AEUH01000173.1| GENE 19 15928 - 16368 195 146 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLSYPDSGGIGRFSPNPDAFWTNSPPHPRSHWQPRRILDHHDRTLSRQHQTRVFPDFPA PTARTRLVSWRLIPCRSGGSQPGRAESEPIRTHTVALETENQRGPHNLCGYKQENGRISL DHGGGAQGIQIFDVLVEEQVVELVDA >gi|319977874|gb|AEUH01000173.1| GENE 20 16591 - 17181 833 196 aa, chain - ## HITS:1 COG:CC2204 KEGG:ns NR:ns ## COG: CC2204 COG1309 # Protein_GI_number: 16126443 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Caulobacter vibrioides # 9 125 20 133 215 75 41.0 7e-14 MDPPNSDRGGLRAAILATSRELLDQGGPAALSMREVARRAGCTHQAPYHYFPNREAILAA LVAEGFTGLADALHRARTDNAGANASGIVVATGSAYIDFALSNPGVFRIMFRSDMYDPTA HPDLLAAGDRARAELSALARQVFGDAVTDEAEAALWAYVHGASCLLIDGPGALGPTTTEE KRRFASRLLHSGALRP >gi|319977874|gb|AEUH01000173.1| GENE 21 17267 - 19606 2534 779 aa, chain + ## HITS:1 COG:alr4631 KEGG:ns NR:ns ## COG: alr4631 COG1233 # Protein_GI_number: 17232123 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Phytoene dehydrogenase and related proteins # Organism: Nostoc sp. 
PCC 7120 # 1 498 16 508 533 161 27.0 4e-39 MYDAIIIGSGIGGLSTAGFLAGTAGKRVLVLEKHSVPGGLTHAFRRGGASWDVGVHYLGG LGEGEMFRRAFDYLTGGQVHWRPMPRHYDRFVYPGIDFRVPAGRRAYEDALIAAFPHEAR AVRRYFRDIGAAASWGTLAYAREMVPRSVEPLIRLAQWARRGPALDTTKAYLERRFRDPR LRVLLTSQWGDYGVEPARSAFVAHATVVAHYLEGAWFPEGGSARIARAIEDGIEARGGRI RVCQEVREILVENGRTVGVRVIDRSGPRPREAVHRAPVVVSGAGAGPTYNRLLPTDGPVG AATRGVRRAIDEVGPGLSAVSVYLELSELPDGVDGSNVWVSTTTDHDDIEGATADLLAGV PRAAFCSFPSIKAGEGRATAEILAFADPRAFEAWEGTAKGDRGADYARLKAVMADGLIAL ADSAVPGLRAAVRRVEVGTPLTVEHYTSHPAGAFYGIPATPERYRKRLTTPRTPIEGLYL TGQDAGFLGIAGALMAGMSAACQVLGPAGYNQIMRAVKEGPGQRPGAGATAAGGEGQDAG APGAAGGKRAAGGEGAAEPGEERPLPVGKRGASVVSSRAVTPSVVELVLDLEDDAPQWWP GQYVRLRVADHEWRDYSIASLEGRRLRLLIDTRTRGRGARFAVGAAPGARTLLEGPFGSF TATDSPRRRVFVATGTGLAPFLPVFAQDPRESDRLLFGCRTSAEDLTRVLDDPMPPVTRC ITREKVDGAFRGRVTAALAEFGGQAAECDFHVCGSSEMVADAMAVLRELGAGAVVTEAF >gi|319977874|gb|AEUH01000173.1| GENE 22 19619 - 20908 1363 429 aa, chain - ## HITS:1 COG:MA1276 KEGG:ns NR:ns ## COG: MA1276 COG0402 # Protein_GI_number: 20090140 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Methanosarcina acetivorans str.C2A # 18 424 28 410 442 108 27.0 2e-23 MPHDVTVWSAGLVIPITAPSVMDGAVAVRDGRIEHVGAREWVIGALRDRGLAFTERHFDG VLLPGLVNAHTHLQYTGMDGVGSGSYSGYPDWARAFDAVYDLGGLDWGADAARGADMLLR AGTTAAADVVTDPEAAPALSDAGLHGVAYWEVMGWSNEQWRQCGAEQVGRALDAMPTPPG AGISPHAPYSLDADPLLDLPDLARRRGMRIHIHLGESYSEAEWPETRTMELSDLWKSGHP SFTAMRSRGGGFSSTQFVDQLGVLGPDCHVAHGVYMRADDRRRLRARHTSVALCPRSNRV IGLDAPPVAAYLREGNPIAVGTDSLSSSPSLDVLEDVALLFDLARAQGYRDGDLARRLLE AATLGGAAALGLETGPDRVGQLQAGAIADMVVVDVPVTDVVGTIEAVARHGAGRQVETVV SGRVRWSAV >gi|319977874|gb|AEUH01000173.1| GENE 23 21001 - 22287 1381 428 aa, chain - ## HITS:1 COG:TM0172 KEGG:ns NR:ns ## COG: TM0172 COG0499 # Protein_GI_number: 15642946 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylhomocysteine hydrolase # Organism: Thermotoga maritima # 45 428 4 386 404 196 35.0 
6e-50 APGTGGSRWRPRLVVRAPIGGRASVTWDGEPLLGPHEAGRGADYGASRTAWARDRMPVTR RAVQDAAPLIAGRTVGLSLVLEPKTAALALMLHEAGARVRVFGHAAETRDDVAEALRAAG IAVFAESGATGEREEELARLFLAGGVEFLLDDGSHLIRMAHDPRRAPGALESLVGAAEET TSGLRALRAFPLRVPVVASNDARSKTLFDNAYATGQSCLLTILDLIDPAARGVALWEQRV AVVGYGDVGKGFARFAAACGAAVDVVETDPVRALQAQMDGHRVAALEEAAAASAVLVSAT GEPATITPAALAALPTGAVVAVAGGVEGEAGLAEAGAGSWPLTESGSRAVQLLHRPGAAP VRVLDRGACINCTAGEGNPIEVMDMSFGVQVAALRELASHGSSMPPGLRALPGPADILVA RRALEALA Prediction of potential genes in microbial genomes Time: Thu May 12 18:24:12 2011 Seq name: gi|319977864|gb|AEUH01000174.1| Actinomyces sp. oral taxon 178 str. F0338 contig00174, whole genome shotgun sequence Length of sequence - 8524 bp Number of predicted genes - 11, with homology - 7 Number of transcription units - 6, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 210 228 ## 2 2 Op 1 . + CDS 324 - 1028 293 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 3 2 Op 2 . + CDS 1025 - 2383 1679 ## Cfla_3126 protein of unknown function DUF214 + Term 2422 - 2465 15.4 4 3 Tu 1 . + CDS 2579 - 3283 1204 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 3304 - 3344 2.8 5 4 Op 1 . + CDS 3633 - 3710 137 ## 6 4 Op 2 1/0.000 + CDS 3701 - 4765 1483 ## COG5263 FOG: Glucan-binding domain (YG repeat) 7 4 Op 3 . + CDS 5198 - 6016 1043 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 6142 - 6173 1.6 8 5 Tu 1 . - CDS 6340 - 6456 67 ## 9 6 Op 1 . - CDS 6662 - 7165 608 ## 10 6 Op 2 . - CDS 7239 - 8096 1083 ## COG0561 Predicted hydrolases of the HAD superfamily 11 6 Op 3 . 
- CDS 8111 - 8524 426 ## gi|293193639|ref|ZP_06609861.1| putative mucin-associated surface protein Predicted protein(s) >gi|319977864|gb|AEUH01000174.1| GENE 1 3 - 210 228 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTTPDPRAWAQTAVRAYASATNMLVPGRRFTVLAPGAGPAHPRALVDLLERFGARVVDDP SSAHAAFEA >gi|319977864|gb|AEUH01000174.1| GENE 2 324 - 1028 293 234 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 30 233 17 222 245 117 37 3e-26 MSTNTPANGAITAQGLVKTYMRGQSPVHALAGINLSLPQGTQVAIMGPSGSGKTTLLHCL AGVLRPTAGRITVGGVETTSLSERAMSAMRLRSFGFVFQDGQLLPELPCEENIAMPLMLA GAAKAEAIAKARSILANLGLDGAGRARPGQLSGGQAQRVAIGRALATDPAVIFADEPTGA LDQATGAEVMGLLTSACSATGATLILVTHDAGVASRVPHTIHMRDGLIDMEASR >gi|319977864|gb|AEUH01000174.1| GENE 3 1025 - 2383 1679 452 aa, chain + ## HITS:1 COG:no KEGG:Cfla_3126 NR:ns ## KEGG: Cfla_3126 # Name: not_defined # Def: protein of unknown function DUF214 # Organism: C.flavigena # Pathway: not_defined # 14 446 14 443 449 191 36.0 7e-47 MNPVLAAASAILRGRDRATSVLTVIAFALPHAVFLAVLGGVGAFQARASRFDAAYSEQMG QGGGTMYTALAYFAATLLIVPILSMGAAAARLGMSRRAHDLAVLRLIGLGPGAAKLACVV ETAAHAGAGVVVGSVLYAVTLPAWGLIAFQGWPMGASEMWVGVLLLLGAALAIVVLAALS AWFGMRKVAITPLGVTRRQKADRVTPIGVVMGAVLLVVWMTAGSKLMDAGFAIGMAALFG FLAAIFLIVNLIGVWSVSLLGRLIAAVARRPQTVLAGRRLVDDPRSLWRSFGPVALVGFL VGVLYPILSVVGGMAGDDPVSRMVGDDMLRGMILTFCIALALGAISTAVNQAIRAIDSAE QARALFRMGSPLAFMDRARRIEIGVPALLMIPGSMLIGFLFIIPIAGQAGLWGTLGALLA TALGGVALILAASELSAPLRRAILEEARESNE >gi|319977864|gb|AEUH01000174.1| GENE 4 2579 - 3283 1204 234 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 60 224 528 689 744 132 39.0 7e-31 MALNAQGSTRRVLTTVTALALGVGLVALPGVASAADASSAAAAQATAGRTGPQADADGVW KQDSKGWWYRLPDGSYPAGTDMVINGVEYTFDAEGYMRTGWVRSAEGWRYYHPSGARAGE GWVQDGGTWYYMKGSPAVAHTGGWLEVGGAWYYMDSSGAMATGWVKVGEDWYFMQPSGAM 
ATGWLQLGDTWYYLQSNGVMAYGIVEVNGVPHAFEINGAWVGAWDGTESEDVAP >gi|319977864|gb|AEUH01000174.1| GENE 5 3633 - 3710 137 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSAIGLVVMWAPSYKDVEKGDPLWH >gi|319977864|gb|AEUH01000174.1| GENE 6 3701 - 4765 1483 354 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 99 329 525 733 744 171 39.0 1e-42 MALTTRGRVGRLIAVVTTLALGVGLGALSGVANAADASSAGIAQSGDLPPQDPAPVKGTW KQNAAGWWWYELPDGSYPAGVDMTIDGAEYTFDDWGYMRTGWVQEAGGWRYYLPSGARLT GGWVQVGGSWYYVEPSTKLMATGYTIINNTAYYLDDTTGAMVTGWKKIGEKWFYFDSDGA PKTGWLAGRRGAWYYLDGDGAMLTGWAYINFQTYYLDENTGVMAQGWKKIADKWYHFDTS GGISTGWVTSGQSWYYVVNGEMVKGWLFNNYHWYYLADNGVMATGWTKVGDEWFYLYSGG SMATGWLQQGSTWYYLRPSGVMAANDVVEIDGVPHIFGPDGAWQGVYKEKPPVE >gi|319977864|gb|AEUH01000174.1| GENE 7 5198 - 6016 1043 272 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 53 260 521 725 744 170 41.0 3e-42 MALTTRGRVGRLIAVVTTLALGVGLGALSGVANAADASSAGVAQSGDGPQSGPAPVKGTW KQNATGWWYEYPNGGYPTAVDIVIDGVEYCFDGKGYMRTGWVLADEGWRYYYPSGARAGV GWVLVDGEWFFMEGDRPHARSGGWFELDGLWYYLDARGAMVTGWNKIDGKWYYLDGDGAM LTGWAFIGYRWYYLDDGGAMATGWTKVGDEWFYLYSGGSMATGWLQEGSTWYYLRPSGVM AANDVVEIDGVAHIFDASGAWQGVYKEKPPVE >gi|319977864|gb|AEUH01000174.1| GENE 8 6340 - 6456 67 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVQSVLVATAPEGDRGTANRTRGGSGLYRQHPSRLTVI >gi|319977864|gb|AEUH01000174.1| GENE 9 6662 - 7165 608 167 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDPTNPYNTSGAQGPQQPQQFPAPDAAQAQQFPAPGAQAQQFPAPANQLPAVAPGQAAP AAPTGGFSTGGLGALGLALVPWILFTLWWTADPGSGLESALSGFVGKFSIVLGIAAAAYG FSAMSKDRKEGKAAWWLGIIGAIAGILYAVLFVYLLVTGMVTIGSRY >gi|319977864|gb|AEUH01000174.1| GENE 10 7239 - 8096 1083 285 aa, chain - ## HITS:1 
COG:Cgl0894 KEGG:ns NR:ns ## COG: Cgl0894 COG0561 # Protein_GI_number: 19552144 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Corynebacterium glutamicum # 12 284 15 276 277 164 36.0 1e-40 MPTDPGAYAPADVRLVVADMDGTLLDDRKRFPGGLWRMLDQLDARGATFAPASGRQVWTL LDMFPDRPGMTVIGENGAIVMRDGAEISSSPLDLPTLREAVRLVRAATGPGGVNGGLVMC GKRSAYVERVDEPFVAGVVPYYHRTRQVDSQLDVLDAIEAGELDDAIVKLAVHSLDPVSP LAEATLARFRSTHQYAVSGANWADLQIRGVDKGRAVRALQGALGVTRAQTAVFGDFHNDL SMLAEADLSFAVANADPDVVRAARFVAPSNNEGGVVSVVERLFSL >gi|319977864|gb|AEUH01000174.1| GENE 11 8111 - 8524 426 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293193639|ref|ZP_06609861.1| ## NR: gi|293193639|ref|ZP_06609861.1| putative mucin-associated surface protein [Actinomyces odontolyticus F0309] # 1 137 81 217 217 149 70.0 7e-35 VESAAARAKSVVEHAGPAIESARANLVEDYLPRASRAVNQAGSALSAPGPLSERARRAKE VSTVALTTPTTKVRKRRFLRAVGLVALAATAAGVGYVLWKRSQPIEDPWAEEYWADLETD AFVPDSEAEQASLKSGE Prediction of potential genes in microbial genomes Time: Thu May 12 18:24:50 2011 Seq name: gi|319977855|gb|AEUH01000175.1| Actinomyces sp. oral taxon 178 str. F0338 contig00175, whole genome shotgun sequence Length of sequence - 5016 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 377 173 ## gi|293193639|ref|ZP_06609861.1| putative mucin-associated surface protein 2 2 Op 1 1/0.000 + CDS 376 - 897 834 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family 3 2 Op 2 . + CDS 986 - 1867 994 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 4 2 Op 3 . + CDS 1940 - 2701 1020 ## Swol_1665 hypothetical protein 5 3 Tu 1 . - CDS 2904 - 3167 254 ## BLA_0071 putative septation inhibitor protein 6 4 Op 1 . + CDS 3334 - 4035 833 ## COG3879 Uncharacterized protein conserved in bacteria 7 4 Op 2 . 
+ CDS 4028 - 4789 923 ## COG3764 Sortase (surface protein transpeptidase) 8 4 Op 3 . + CDS 4805 - 4960 247 ## gi|154508344|ref|ZP_02043986.1| hypothetical protein ACTODO_00841 Predicted protein(s) >gi|319977855|gb|AEUH01000175.1| GENE 1 2 - 377 173 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293193639|ref|ZP_06609861.1| ## NR: gi|293193639|ref|ZP_06609861.1| putative mucin-associated surface protein [Actinomyces odontolyticus F0309] # 55 125 11 81 217 81 66.0 1e-14 MASIVALIGARDHAGATPLAQVPSDSSVSGLRITGIRSTMVPSGAPFPAGRRTPLDRFGD TMAPTPNFDVDQLREEAAQLRDTAAVYAGKVAVKAADLVGAGVDWAAPRAQAALARAIDR AAPAV >gi|319977855|gb|AEUH01000175.1| GENE 2 376 - 897 834 173 aa, chain + ## HITS:1 COG:Cgl0034 KEGG:ns NR:ns ## COG: Cgl0034 COG0652 # Protein_GI_number: 19551284 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Corynebacterium glutamicum # 3 171 24 188 190 179 61.0 2e-45 MKAIVHTSMGDIALDLFPDHAPNTVNNFVGLARGEREWTDPRTGQKTSRPLYDGTVFHRV IKDFMIQGGDPLGNGTGGPGYQFNDEIHPELAFNEPYLLAMANAGKRMGKGTNGSQFFIT VVPTPWLQGKHTIFGKVADQASKDVVDAIISVPTGANDRPASDVVIESVEIVD >gi|319977855|gb|AEUH01000175.1| GENE 3 986 - 1867 994 293 aa, chain + ## HITS:1 COG:MT0119 KEGG:ns NR:ns ## COG: MT0119 COG0705 # Protein_GI_number: 15839491 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Mycobacterium tuberculosis CDC1551 # 15 247 9 239 284 131 39.0 1e-30 MSMPRFGQRSDPRAAPDCPRHPGARSVDYCKRCNRPMCPSCVVPTEVRSICVDCAKSSRG RRSARLGSAAAFLRSSGAPVTLVLVGVCVLMYLLVLVVPPVQSLFMLVPAWVGPRPWIVV TSAFLHSGFLHVFFNMLTLYWVGSVVERAIGHWRYGAVCLISALGGSALVMLWCFVQPEA LFAATVGASGAVFGLFGAVFVLQRLSGSSTAPILILLGVNLVYGFANPGVSWQAHIGGFL AGAAATWALLRTSGRPGGTLAPRAKQIAVCVAMAAAMAALCAVSYWGLVALYA >gi|319977855|gb|AEUH01000175.1| GENE 4 1940 - 2701 1020 253 aa, chain + ## HITS:1 COG:no KEGG:Swol_1665 NR:ns ## KEGG: Swol_1665 # Name: not_defined # Def: 
hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 5 218 2 214 249 154 38.0 4e-36 MATPWDIYDQLIDQIPADITVQSARADGKWRRIGTSEGGAGMAFGMNVQSRPRTVAAAED LVGRPLRDAANLVKSWNFEDAGLGMAAINAYHSHTDRALAHGFTPLAENNWGRTFHAYAD AVAGKRVAIVGHFPFAPAALPDVAELIVLERALFDGDYPDSACEYLLPGVDWAFITGSAF VNKTMPRLLELTRGVHSVVLGPSAPASPIVLDHGASEVLAFASCHPALLEEGLAGRLPGG MFDAGMRVGLARG >gi|319977855|gb|AEUH01000175.1| GENE 5 2904 - 3167 254 87 aa, chain - ## HITS:1 COG:no KEGG:BLA_0071 NR:ns ## KEGG: BLA_0071 # Name: not_defined # Def: putative septation inhibitor protein # Organism: B.animalis_lactis # Pathway: not_defined # 7 87 124 203 203 83 37.0 2e-15 MAESKKRKKNGRDAADDTEIRPWTEGIPLSPVWWAPLFCGLLLIGLVWLMVYYISSGAYP VPGISWWNIVIGLGIMMVGFLMTLWWR >gi|319977855|gb|AEUH01000175.1| GENE 6 3334 - 4035 833 233 aa, chain + ## HITS:1 COG:MT0015 KEGG:ns NR:ns ## COG: MT0015 COG3879 # Protein_GI_number: 15839386 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mycobacterium tuberculosis CDC1551 # 40 232 59 261 262 130 39.0 3e-30 MSRRPPRAAAAILAVTIAAGALFSVSFLNNRRNPASGGALEDLVRARQDSVASLEAQNRD LQSQIDQYAGQSGARAASTTVPALAASPVVGPGVEITLTDAPADQAPEGASPNDLVIHQQ DIEDVMNALWSGGAEAMTVQGARITSRTVIRCIGNVILVDGTSYSPPYVVQAIGDPGALR ETVMANPRIVNYQAYVARYGLGWDMREKDQLRFPPAATDTTLNYAHVEAPTNG >gi|319977855|gb|AEUH01000175.1| GENE 7 4028 - 4789 923 253 aa, chain + ## HITS:1 COG:Cgl2874 KEGG:ns NR:ns ## COG: Cgl2874 COG3764 # Protein_GI_number: 19554124 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase (surface protein transpeptidase) # Organism: Corynebacterium glutamicum # 25 250 6 254 255 80 28.0 2e-15 MGEHIAHARPPKGRGRRIFWNTVGVIGELLITASFVIGLFAVWQLYWTTFQVQGQVAQTI TAYEEDHAPVARQAGEARTDAPPAFTREVSDGEVYGLVHVPSWDWMKIPLAEGTTSAVLD NGWAGHYGNTAQPGQLGNFSVAGHRRTYGNNFRWIDRLGEGDKVVAEVDDFYVVYSVQTW EIISADDPDQVRVIAPVIDDLTFNKEPTERWMTMTTCNPEYGNWERYIVHLKFQSWTPKS SGVPAELLDEPES >gi|319977855|gb|AEUH01000175.1| GENE 8 4805 - 4960 247 51 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|154508344|ref|ZP_02043986.1| ## NR: gi|154508344|ref|ZP_02043986.1| hypothetical protein ACTODO_00841 [Actinomyces odontolyticus ATCC 17982] # 1 51 1 51 53 77 88.0 4e-13 MYAWIFRHLPGPTWFKVIESLVLIGLVVAALFTWVYPWVQTYLELGDASVG Prediction of potential genes in microbial genomes Time: Thu May 12 18:25:09 2011 Seq name: gi|319977852|gb|AEUH01000176.1| Actinomyces sp. oral taxon 178 str. F0338 contig00176, whole genome shotgun sequence Length of sequence - 3910 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 155 - 793 908 ## COG0512 Anthranilate/para-aminobenzoate synthases component II 2 2 Op 1 . - CDS 856 - 2871 2503 ## COG0515 Serine/threonine protein kinase 3 2 Op 2 . - CDS 2844 - 3908 831 ## gi|154508349|ref|ZP_02043991.1| hypothetical protein ACTODO_00846 Predicted protein(s) >gi|319977852|gb|AEUH01000176.1| GENE 1 155 - 793 908 212 aa, chain + ## HITS:1 COG:ML0015 KEGG:ns NR:ns ## COG: ML0015 COG0512 # Protein_GI_number: 15826878 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Mycobacterium leprae # 3 191 2 193 232 207 52.0 8e-54 MTRILCIDNYDSFVYTIVGYLRHLGARVDVVRNDAVDPQWHDGAYDGVLVSPGPGDPAGA GATLRTIAECAERAVPMLGVCLGMQALGELYGGAVGHAPELMHGKTSVITHDGKGVFAGV PSPLTVTRYHSLAVDPATVPSALEVSASTSSGIVMGLRHRDLPLEGVQFHPESVMTQSGH LMFANWLAVCGDGGAPARAATMAPVVARRSSL >gi|319977852|gb|AEUH01000176.1| GENE 2 856 - 2871 2503 671 aa, chain - ## HITS:1 COG:Rv0014c_1 KEGG:ns NR:ns ## COG: Rv0014c_1 COG0515 # Protein_GI_number: 15607156 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Mycobacterium tuberculosis H37Rv # 9 456 7 425 428 291 42.0 3e-78 
MAEPVARRLAGRYEVRSLIGRGGMAEVHLGFDTRLSRVIAIKMLRRDLALDSIFQARFRR EAQSAASLNHPNIVAVYDTGEEMIEDAQGRAISIPFIVMEYVEGHTVKELISDGMPVPIN EAIEIVGGVLQALEYSHAAHLVHRDIKPGNIMLTNDGKVKVMDFGIARALTDSQATMTQT NAVVGTAQYLSPEQARGEQVDARSDLYSTGVVLFELLTGRPPFKGDSAVAVAYQHVGQMP PIPSSITADIPDALDRVVMKALAKDRDQRYASADAMLADLMRVSRGLGVSAPETSVWMGQ MPRQDQTAATIPLSAAASPTSTVIANPARQGQGAQGPAPTPANGANEEDEARARKRRAML IGSLIALALLVIGSSVYALANWKGKPATAAVPNVVGYTQAEAKAQVESAGFTWAIAADRV ASDSVAEGSVVSTNPPGGEQAEEGSTITATLSSGPDSVAVPDSLVGMSPDDAKRAVEAVG LKWEVSSDRVASEEVGEGKVAKVNPSSASKVKAGSTVTGYLSSGSDTVAVPDLSGMKQEQ ARSALENVGLKLGTVDFTDDSEQPKDQIVYSDPEAGSSAQKGAKVNVTISSGNNQVTIPT VVGTSAEDAKAALEGVGLKPNMTEVDSDKPKGQVVSIDPAEGQKVNKGTQVNVKVSKGKG NGTPTPTPTSH >gi|319977852|gb|AEUH01000176.1| GENE 3 2844 - 3908 831 354 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508349|ref|ZP_02043991.1| ## NR: gi|154508349|ref|ZP_02043991.1| hypothetical protein ACTODO_00846 [Actinomyces odontolyticus ATCC 17982] # 127 354 496 708 708 113 47.0 2e-23 AVAPAAVADAEPGVATARSAMPAQLRASVGVEAPTDPAVETKPSPRTRAKRAPGDQPPLT VPDLTSPRAPAPDTTTTGRHAAVTDTGTEHPPLTVPDLTSPRAPAPDTTTTGRHAAVTDT GMTRSEAEEALRLEALTRRLRERQKARDREDAAAEARRTREEERRAEEERQRAEAEAAAR QERERRERAEAEERRREEARRERRERRERSEDDGAVITRKSPSVARILGTVPPPRPAAKK KSGTPTAPVRPRRSVFEQRTEPDPDKPRWHALDARAAHSTQPPKATKYNRSVVSPPVPLG KRVLRAVIVALVVLALVVLAAVTIKHMIGSLLSAHAGTTPTEEVRTWQSPWPAV Prediction of potential genes in microbial genomes Time: Thu May 12 18:25:34 2011 Seq name: gi|319977843|gb|AEUH01000177.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00177, whole genome shotgun sequence Length of sequence - 9625 bp Number of predicted genes - 9, with homology - 7 Number of transcription units - 3, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 3 - 1146 1318 ## COG0515 Serine/threonine protein kinase 2 1 Op 2 19/0.000 - CDS 1161 - 2615 1901 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 3 1 Op 3 4/0.000 - CDS 2612 - 4003 2130 ## COG0772 Bacterial cell division membrane protein 4 1 Op 4 7/0.000 - CDS 4003 - 5328 1721 ## COG0631 Serine/threonine protein phosphatase 5 1 Op 5 5/0.000 - CDS 5330 - 5836 693 ## COG1716 FOG: FHA domain 6 1 Op 6 . - CDS 5833 - 6537 937 ## COG1716 FOG: FHA domain 7 2 Tu 1 . - CDS 6770 - 7729 1676 ## COG0208 Ribonucleotide reductase, beta subunit 8 3 Op 1 . + CDS 8250 - 8924 374 ## 9 3 Op 2 . + CDS 8971 - 9625 146 ## Predicted protein(s) >gi|319977843|gb|AEUH01000177.1| GENE 1 3 - 1146 1318 381 aa, chain - ## HITS:1 COG:ML0017 KEGG:ns NR:ns ## COG: ML0017 COG0515 # Protein_GI_number: 15826880 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Mycobacterium leprae # 1 312 1 321 437 195 43.0 2e-49 MTSGAGRVLGGRYTLLTPIAQGGMGEVWKARDRVSGHIVAAKVLRPELSGEELSLSRLRL EARNSMSISHPNIANVQDSGEEDGIGWIIMELVEGFPLTDYLRGGARLAPEYLMPVLVQV AMALGAASRAGVVHRDIKPANILVRPDGVVKLTDFGISRATGQVNLTAVGMVMGTAQYLP PEQAMGEVATPIGDLYALGVIAYEAATGKRPFTGETQVDIAFAHVNDPVPPLPDFVPEPL ADVILHLLEKDPAKRPESGSAAVRELAGAAKAMGIPVTPTPLPPPRSVSRSNAPRTPVQP SVPIVAPVRHKPRRTLPEELLRPPEGLVARPRPASDPQVRIFDEVDPAPPSSGPPQMLSQ QLPAAAPMPMRTPLPRPSARA >gi|319977843|gb|AEUH01000177.1| GENE 2 1161 - 2615 1901 484 aa, chain - ## HITS:1 COG:ML0018 KEGG:ns NR:ns ## COG: ML0018 COG0768 # Protein_GI_number: 15826881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 
# Organism: Mycobacterium leprae # 1 484 1 489 492 290 41.0 4e-78 MNNQIRRIFIVVLIMFALLGAGMTNTQFVAAPALNADERNQRTILHSAEIDRGPIIVNGS AVASSTKQDGSQRFERSYSQGPLYASVTGYFSSVISQSTGLESAAEEVLDGQSQSLFAQR LRNLFTGAGRQGGGVVLTLDDKMQQVAAERLGDRRGAVVALNAKTGAVLALYSSPSYDPN SLAVFDSQSVNDAYGALLADPNRPLVNRAIGGDRYAPGSTFKILTAVALLENGIATPDTR MDSPVSTTLPGSSTQVENIESSQCGDGKPTLTEAFARSCNTTFVLASKSLTNQQLSDVAK RFGFGTEQAIPLPVTPSVFPSDTDAAQLAMSSIGQYTVQATPLQMAEVAQAVANKGTMMA PYLIDQVVDADLQTRSTTEPAQLGKPVTPEIANQVTAMMRAVVSQPYGSGSSMAIPNVAV AAKTGTAETGNGARANAWAVGFAPADDPQIAFAVVVEGDEAEPTPHGGTVAGPIARALLE AGLQ >gi|319977843|gb|AEUH01000177.1| GENE 3 2612 - 4003 2130 463 aa, chain - ## HITS:1 COG:MT0020 KEGG:ns NR:ns ## COG: MT0020 COG0772 # Protein_GI_number: 15839391 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Mycobacterium tuberculosis CDC1551 # 8 445 14 450 469 320 43.0 5e-87 MATVSIAPARSRRLVEVLLLLLALAVGVGGYVLASLNYTGALPANLFTHVAVLVVLAVVA EVGVHFLAPYADPVILPVAVALTGLGLAMIYRLDLSYERLGGETVGMRQGVYVAASIVLA ALFLTAVRDHRRLRRYTYTFGALSLVLLLLPMIPGLGTETYGARVWIRLGPMSLQPGELV KITLALFFAGYLVTNRDNLAIGGRKLLGLRLPRGRDLGPIMVVWLIGIAILVLQRDLGTS LLFFSLFVATLYVATNRPSWLLIGFVLFVPAVAVAVKAFPHVANRFNVWLNALDPDVYSA TGGSYQVVQGLFGQASGGLMGSGWGRGYPQLVPLANSDFILSSFAEELGLTGMAAILVLY LVLIQRGLRAAVTVRDGFGKLLATGLSFSLAIQLFVVLGGITRLIPLTGLTAPFLAAGGS SMVSSWLTVALLIRVSDAARRPASTPAPWNSGVHEPMKAGDAQ >gi|319977843|gb|AEUH01000177.1| GENE 4 4003 - 5328 1721 441 aa, chain - ## HITS:1 COG:ML0020_1 KEGG:ns NR:ns ## COG: ML0020_1 COG0631 # Protein_GI_number: 15826883 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Mycobacterium leprae # 8 243 5 236 237 206 51.0 6e-53 MSAHAVQFHYAARSDVGLVRSNNQDSGYAGANLLVLADGMGGPAGGDIASSVALAHLVPL DTDSHPADSMLPLLREALMDAHEELTDRSSRDRDLEGLGTTCIALMRSGNRLAMVHIGDS RAYVLRGDTLTQVTTDHSFVQYLVDTGQITPEEAEHHPNRNVVLKILGDAQADVSPDETV REAVVGDRWMLCSDGLSGLVSPETIGQVMAEREDPGECAEELIDLALRGGGTDNITCVIA 
DIVPAGKFPPAPPEIVGAAAADRHAKSRGGKGAAARAAGLGSHAVGLDDEDDEAEGRGRS AWWTPVFVSLVTVLVAAAGWLSWAWSQSQYYAIGQDGYVVIHQGIPQSIGPWKLSHAIEV TEVSLAELTPVDQQRLTSPVMRSSRAAIDEYLAGLRRTAEDARAAQDAHTAQSGQPQSGG PQSGTPPQTDGAAQSGDGGAS >gi|319977843|gb|AEUH01000177.1| GENE 5 5330 - 5836 693 168 aa, chain - ## HITS:1 COG:MT0022 KEGG:ns NR:ns ## COG: MT0022 COG1716 # Protein_GI_number: 15839393 # Func_class: T Signal transduction mechanisms # Function: FOG: FHA domain # Organism: Mycobacterium tuberculosis CDC1551 # 5 166 4 153 155 95 35.0 5e-20 MTTDLAFTVFRIGFLVLLWLLVLAAVNTLRRDIYGTVVTPRGKGRSKADERRRQSKKKRK EGGKRPTAGAPQTPKDLLLTGGPLVGTMLPLGDAPIVIGRSPACTLVLEDEYASSRHAAL SPQADGWWIEDLSSRNGTFIDDERLTGPHQLKVGDVIRIGQTTLELVG >gi|319977843|gb|AEUH01000177.1| GENE 6 5833 - 6537 937 234 aa, chain - ## HITS:1 COG:MT1875 KEGG:ns NR:ns ## COG: MT1875 COG1716 # Protein_GI_number: 15841297 # Func_class: T Signal transduction mechanisms # Function: FOG: FHA domain # Organism: Mycobacterium tuberculosis CDC1551 # 145 231 67 153 162 68 40.0 1e-11 MSIFDRFESAVERGVNGAFSRVFRSGIKPVDITTAIRRAMDDHVQAVTAERIITANHFTV HASNADLDSLEASLDVLADEFAQQATEHAMANGYALLGPVVIEFVADPTASSGTLSVDAE KRRGPAAPATASSASPEHPIIDVDGDKWLLTEPITVIGRGSEADIVVSDSGVSRRHLELR ITPTGVIATDLGSTNGTFVEGHRIDAATLLDGNQIVIGRTKILFWTHPDQGGAL >gi|319977843|gb|AEUH01000177.1| GENE 7 6770 - 7729 1676 319 aa, chain - ## HITS:1 COG:ML1731 KEGG:ns NR:ns ## COG: ML1731 COG0208 # Protein_GI_number: 15827930 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # Organism: Mycobacterium leprae # 8 319 14 325 325 495 80.0 1e-140 MTGPVFEAINWNRIQDDKDLEVWDRLTGNFWLPEKVPLSNDLPSWQTLNKDEQTMTNRVF TGLTLLDTLQGTVGAVSLIPDARTPHEEAVYTNIAFMESVHAKSYSSIFSTLLSTEEINE SFRWSNENEALQRKAQIIKSYYDGDDAEKRKVASTMLESFLFYSGFYAPMYWSAHAKLTN TADLIRLIIRDEAVHGYYIGYKYQLAVGESTQERRDDLKDYTYSLLYELYENEEQYTEDL YDPLGLTEDVKKFLRYNANKALMNLGYEALFPHDATDVNPAILAALSPNADENHDFFSGS GSSYVMGEVVDTEDEDWDF >gi|319977843|gb|AEUH01000177.1| GENE 8 8250 - 8924 374 224 aa, chain + ## HITS:0 COG:no KEGG:no 
NR:no MSDERWIPGGEEEPVSPAEQMPQPVRPAAMRSFDEDPDAPGDPDALVIPEFSEEDADWRP PAPVVPVFDEADAGATDPDSTPKVVSFDPDAHVEAEDAQPYSDMPEPEGGSSDEAFRPGA DEAGAPGADSAGGHQGAPVDAGGPADDGAVGGAVEADAAAGSPDAAHGGEGPEAPGGAAN DREDPDGAASGPSRWAAPNGGSPSGDALSGVSPSGGARSGAAHT >gi|319977843|gb|AEUH01000177.1| GENE 9 8971 - 9625 146 218 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTNPLIDSLVAVVAERPEDLPLRVHLAELLIVDGRGPEAVPHLGAVLAADPANEHAAALM RRVLGMPSQPASTGGAPSTGAGETDPGPSVGAPPAIPRESSHGAPPAPSAPVGPADPADP GSSRFAPASSGGAPSEAPAPSDGAPTADPVRSDDAPSGAPAPSDGGHDPVPEEGPTAPQG ADHSPAGPAPGGPAPVDSAPTGTEDSGQPVSGPAPEAP Prediction of potential genes in microbial genomes Time: Thu May 12 18:26:12 2011 Seq name: gi|319977838|gb|AEUH01000178.1| Actinomyces sp. oral taxon 178 str. F0338 contig00178, whole genome shotgun sequence Length of sequence - 4398 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1199 1271 ## 2 2 Op 1 . - CDS 1300 - 1839 569 ## gi|293193562|ref|ZP_06609832.1| conserved hypothetical protein 3 2 Op 2 . - CDS 1859 - 3160 1906 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 4 3 Tu 1 . 
+ CDS 3486 - 4394 1276 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain Predicted protein(s) >gi|319977838|gb|AEUH01000178.1| GENE 1 3 - 1199 1271 398 aa, chain + ## HITS:0 COG:no KEGG:no NR:no ARAAPLRSPRRDRPGKSAEDPQTNAVQRPRGGTMPLDDAARAQIERAEALIELGREEQAV ELLATVPQTDPEAVVSVNCALAFAHLRLDQRAEARASAERAVQAMPNSTEALYWLTGTES DAARALEHSSRLVELEPQWGPYRALHARTLKRNGRSEDAEREAREAVRIAPEDVFVLNTS GSVLRGSHPDEALEAYGRVLEIDPGNTDAREGIARLKRVHEADESSRLYRGLLETNPERG HYEDQLHWMVFVNPLRLTIGTVLLDAAGWLAMTVPYALGWRTAAVVCGTVWSLMMILGLV SGFRENAAKIAEGMETSSGAVVAAAFRRRPFSSFTAMASIVVVGAAPAVWGAWTLFNPAP SWWSAVGVPVAFIVSAWITAPLVWFVAMVVRLLRQRRG >gi|319977838|gb|AEUH01000178.1| GENE 2 1300 - 1839 569 179 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293193562|ref|ZP_06609832.1| ## NR: gi|293193562|ref|ZP_06609832.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 28 176 2 150 158 147 54.0 2e-34 MTSPVLVENRSGRTCGCALANVRHAMEDPKKRLGAAVGVAAATAWYGLPDVVRCRGARAV IKAGLLAAVMWSATAQMPPAAPVPPYPDEESDCDGTPAPKGDDPLEGVTEAAPGELALLA GAGIGTVCLTVAVEKWLFRRGERRRAAGVRLAHTRQGLVIGALTGAASAVQFAVAAPAD >gi|319977838|gb|AEUH01000178.1| GENE 3 1859 - 3160 1906 433 aa, chain - ## HITS:1 COG:Cgl0575 KEGG:ns NR:ns ## COG: Cgl0575 COG0596 # Protein_GI_number: 19551825 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Corynebacterium glutamicum # 13 432 24 422 427 221 37.0 2e-57 MRTESYRQHGHTIHEHRLEVPLDHRRPGAPTIGIFAREVVREGGEDLPRAVFLQGGPGYP APRFGRFDSGWIARLLKDYRVVLLDQRGTGQSTRLDAQSLAELPSDRERADRLKLFRQDQ IVFDAEALRRELCGDAKWTTVGQSFGGFITTAYLSLAPEGVEASLITGGLPGLVHVDEIY ALTYERTAARNRAYFHRHPGDERTVREVAAHLADTEELLPTGERLSPARLRMIGMMLGGQ GRTDQLHYLLEGPWVSVRGQRRLSSQFLQAVAAAVEVSPVYALLQEGIYACAAPALAGGA TNWSADRLAEQIPGFAKDADPLDASEPYYLTGEHFMRRVYDEDPGLAPLAGAADLLASDT DWAPVYAPEVLARNTVPVAAAVYYDDMFVPRELSLDTAELMGARPYITNAFQHDGSAYSG GRVLSHLLDLIHD >gi|319977838|gb|AEUH01000178.1| GENE 4 3486 - 4394 1276 302 aa, chain + ## HITS:1 
COG:Cj0982c KEGG:ns NR:ns ## COG: Cj0982c COG0834 # Protein_GI_number: 15792309 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Campylobacter jejuni # 52 293 33 277 279 248 51.0 1e-65 MNTPRTKRPILRYAAAALAMAAALGASACSPHGQSAPGGGAAGAQVGDRTPEQIKEAGEI TIGIFSDKAPFGYIDADGKPAGYDVVYGDRIAADLGVTAKYVPVDAAARTEVLQSNKVDI TLANFTVTPERAEKVDFANPYFKVSLGVVSPASAEITDVSQLAGKTLIVTKGTTAEAYFE ANHPEVELQKYDQYSDAYQALEDGRGDAFSTDNTEVIAWAIAHPGFGVGIKSLGETSYIA AAVKKGNSALLDWLNNQLVDLGEEDFFHKDYELTLAPVYGDAASADDLVVEGGTDTPQSA AQ Prediction of potential genes in microbial genomes Time: Thu May 12 18:26:39 2011 Seq name: gi|319977833|gb|AEUH01000179.1| Actinomyces sp. oral taxon 178 str. F0338 contig00179, whole genome shotgun sequence Length of sequence - 3879 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 17/0.000 + CDS 112 - 771 802 ## COG0765 ABC-type amino acid transport system, permease component 2 1 Op 2 34/0.000 + CDS 752 - 1429 994 ## COG0765 ABC-type amino acid transport system, permease component 3 1 Op 3 . + CDS 1417 - 2244 534 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 2 Tu 1 . - CDS 2390 - 3661 1195 ## COG0328 Ribonuclease HI 5 3 Tu 1 . 
- CDS 3817 - 3879 95 ## Predicted protein(s) >gi|319977833|gb|AEUH01000179.1| GENE 1 112 - 771 802 219 aa, chain + ## HITS:1 COG:SP0711 KEGG:ns NR:ns ## COG: SP0711 COG0765 # Protein_GI_number: 15900609 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 23 219 3 199 206 205 57.0 5e-53 MNPEVLAEYAPLYAKAAVLTVKIAAIGIAGSLAVGLVCAAVTQFRIPVIGRIVTAYVELS RNTPLLVQLFFIYFGLPRVGVKWSGETCAVVGLVFLGGAYMAEALRAGLDSVAPIQWESA ASLGLSPAQVLRFVALPQALAVSVPPLAANTIFLIKETSVVSVVALPDLVYVAKDLIGIS YNTSEALVLLVASYLVVLLPVSLAARLLERRARRAGFGD >gi|319977833|gb|AEUH01000179.1| GENE 2 752 - 1429 994 225 aa, chain + ## HITS:1 COG:SP0710 KEGG:ns NR:ns ## COG: SP0710 COG0765 # Protein_GI_number: 15900608 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 1 225 1 225 225 264 60.0 9e-71 MQDSGINVLWEGRNALRILEGLAVTLGVSAVSVALSLVLGVGVGLAMRSRLAPLRWLMRA YLEFVRIMPQIVLLFLAYFGLTRLTGVNLDGITASVLVFTLWGAAEMGDLVRGALISIPR HQYESAEALGLDARQAMRFVILPQTVRRLAPPAVNLVTRMVKTTSLVVLIGVVEVLKVGQ QIIDANRFEYPTAALWVYGVVFVAYFLVCWPISLLSRHLEKTWQN >gi|319977833|gb|AEUH01000179.1| GENE 3 1417 - 2244 534 275 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 30 266 2 240 245 210 44 2e-54 MAELNENTGAGAPERASAPESAGPGAPPRLRLVDLDKTYPGGHHALRGVSLDVADGEVVV VIGPSGCGKSTLLRTINGLEPVDSGSIQLDGEELTAPGVRWQAVRQRVGMVFQSYELFPH LTVMENLVLAPVRVRGEARAAARQRARALLERVGLADRADSYPRQLSGGQRQRVAIVRSL MMEPEVLLLDEVTAALDPEMVREVLDVVLDLAREGMTMVIVTHEMDFARAIADRVVFMDG GRVVEVAPPRRFFARPGTERARRFLDIFTFEGANL >gi|319977833|gb|AEUH01000179.1| GENE 4 2390 - 3661 1195 423 aa, chain - ## HITS:1 COG:ECs0210 KEGG:ns NR:ns ## COG: ECs0210 COG0328 # Protein_GI_number: 15829464 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HI # Organism: Escherichia coli O157:H7 # 3 140 5 141 
155 110 45.0 6e-24 MTITVAVDGSSLGNPGPAGWAWVVDRDCWDAGGWPEGTNNIGELTALLEALEATAAAGLS DQGLHVLADSQYAINVASKWRIGWKKRGWTKADKKPIKNLALIQRIDRAMEGRTVTFEWV KGHAGHPLNELADDLARACAQTYQEGGTPAPGPGFRARGEDGADGAEQPDGSARTLGGVG TDGAEQPDGSARTLGGAGTEGAEQPGRSAGTPGGADPDSAGSTQESPGSGPEPADGVSTG AIPSHAAAGPRGGRGRKSTRGQRPFSPHPSVGGAPEPATAVGPEPTRAPDPGPGEPAAPG SAPSVPAGAGASAASGPSAPAQEAADPIDWEKRFLVAWTGGDEEALARITGPQTTRIWPG GQATTTLAGPVPPKASIGRFNVQRAGEAFLVRYTLRWEGGSSVESSVWLPGPVLVHHQST ERG >gi|319977833|gb|AEUH01000179.1| GENE 5 3817 - 3879 95 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no GWTARAEEFAFALGQVGVAQ Prediction of potential genes in microbial genomes Time: Thu May 12 18:26:43 2011 Seq name: gi|319977831|gb|AEUH01000180.1| Actinomyces sp. oral taxon 178 str. F0338 contig00180, whole genome shotgun sequence Length of sequence - 942 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 941 1187 ## COG1770 Protease II Predicted protein(s) >gi|319977831|gb|AEUH01000180.1| GENE 1 2 - 941 1187 313 aa, chain - ## HITS:1 COG:alr3911 KEGG:ns NR:ns ## COG: alr3911 COG1770 # Protein_GI_number: 17231403 # Func_class: E Amino acid transport and metabolism # Function: Protease II # Organism: Nostoc sp. PCC 7120 # 30 291 384 644 688 249 48.0 4e-66 AWEDAGRFVDAPGPVRTLSAEEGRYCDGTLRVEYQSQVLPPTTALVEPASGAVTPIKTQD APGWDPNEFVEERVWVAARDGAARIPVTLVHHRDARPDGTNPGWLTGYGAYEVSYDPEFD ALRLPLLRRCVFAIAHVRGGGEMGRAWYEDGKGLAKEHTFTDFIDVAEWLASSGWVDPSR LVAEGRSAGGLLMGAVVNKAPDRFRAVLAGVPFVDALTTILDASLPLTVGEWQEWGDPIT DPAVFAAMRAYTPYENVPEGVALPAVMATTSLNDTRVEFVEPAKWVQRLREASGAAGRGD GQGPAGGAPSSCA Prediction of potential genes in microbial genomes Time: Thu May 12 18:26:48 2011 Seq name: gi|319977820|gb|AEUH01000181.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00181, whole genome shotgun sequence Length of sequence - 10197 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1220 1252 ## COG1770 Protease II 2 2 Op 1 . + CDS 1395 - 1946 702 ## COG2246 Predicted membrane protein 3 2 Op 2 . + CDS 2038 - 3012 1353 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily 4 3 Op 1 . + CDS 3204 - 3824 659 ## gi|154508360|ref|ZP_02044002.1| hypothetical protein ACTODO_00857 5 3 Op 2 . + CDS 3830 - 4354 184 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 6 4 Op 1 40/0.000 - CDS 4512 - 5411 1016 ## COG0642 Signal transduction histidine kinase 7 4 Op 2 5/0.000 - CDS 5411 - 6160 914 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 8 4 Op 3 36/0.000 - CDS 6403 - 8943 166 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 9 4 Op 4 . - CDS 8960 - 9676 198 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 10 5 Tu 1 . 
- CDS 9800 - 10195 539 ## Slip_1168 ABC transporter, ATPase, predicted Predicted protein(s) >gi|319977820|gb|AEUH01000181.1| GENE 1 2 - 1220 1252 406 aa, chain - ## HITS:1 COG:MT0805 KEGG:ns NR:ns ## COG: MT0805 COG1770 # Protein_GI_number: 15840196 # Func_class: E Amino acid transport and metabolism # Function: Protease II # Organism: Mycobacterium tuberculosis CDC1551 # 3 313 9 319 718 186 39.0 6e-47 MRPPIAPKRYGFRVRDIHGQRFDDPWDWLRDADDPEVIALLEAENAWADSVTSATRPLAE AIVGEVRAHTALTDASVPVRRGDHWYFSRTHEGSDYATHHRVPADASSGAQAPPIPEPGR PLPGEQLLIDENREAAGHEFFRLADLVPSPDGRLIASARDTRGDERYTWVVQDAQTGEVV DRAVVGAGYGLAWSADSRWFVYTGLDEAWRSCEVWAHRVGTGAPQDLLLLAEPDGAFDLV VAPSGFPGHVVIHSAASTTGGAWLWLPDCPTRLPIPLVGAAPGALIGVESAGDHLLVVHT ATSAEGELAAAPLPPDGELASLAPEAGPGLSRAALGVRDPGSQPGGAPDLEPIAPPSTWV ALRSAGPGERIAHVEARASFLALTMRSGSLTQVEVRTRRAGAPRAE >gi|319977820|gb|AEUH01000181.1| GENE 2 1395 - 1946 702 183 aa, chain + ## HITS:1 COG:Rv3277 KEGG:ns NR:ns ## COG: Rv3277 COG2246 # Protein_GI_number: 15610413 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mycobacterium tuberculosis H37Rv # 51 182 73 209 272 73 34.0 2e-13 MGVIEGTPRRGSTIPHSTAPTGGADRAGRSPGAPAPARASLKGRLVAWIREFIQFGMVGA TAFVVDWGLFNLLQHGPLGLLAGHPNTAQFCAAATATLYAWVANRLWTYRGRTRARASRE ALLFFFANACGIGISQFCLLFTHHILGLASALADNVAVYVVGFALGTAFRFFFYHYVVFT GER >gi|319977820|gb|AEUH01000181.1| GENE 3 2038 - 3012 1353 324 aa, chain + ## HITS:1 COG:Cgl1119 KEGG:ns NR:ns ## COG: Cgl1119 COG4667 # Protein_GI_number: 19552369 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Corynebacterium glutamicum # 36 322 10 294 298 220 40.0 4e-57 MSDDRSVVPHGPEGAAPTGDPCPAPADSIATMTTIDDTAVVFEGGGMRNAYTAAVVSELI AEGINFPHVSGVSAGASHLCNFVSRDAARSHATFVDLVDDPEFGGVGHFRRGQGYFNAEY IYERICYPDGAMPFNLGAFLRNPARTNVAAFNASRGEVRWFTKDEISTLDTLGPVIRASS TLPILMPPTIIEGDTYVDGALGPNGGLPFDAPLRDGYRKLLVVLTRPRDYVKPPLTPSVS ALLRMAYRRFPSVHQGVALRASRYNTGRRCLFGLEERGRAYVFCPDNLWITNTESHRERL 
EATYRAGLVQIRREMPAIKAFLGL >gi|319977820|gb|AEUH01000181.1| GENE 4 3204 - 3824 659 206 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508360|ref|ZP_02044002.1| ## NR: gi|154508360|ref|ZP_02044002.1| hypothetical protein ACTODO_00857 [Actinomyces odontolyticus ATCC 17982] # 1 206 4 207 207 197 65.0 4e-49 MKAQRSGMARRALAALTALPLAFAMSACRIGLGAEGDNGRSASGPAPASSASNGAASPQS GLSSQATGASKSRPSQVLGTWEGSIDGDSYRVDVNSVVVKDGVTTLTVTFANTGATTIYG WYLAFGGPDMLADKVKLTDVQNNLVYSPGKGADGSCMCSEVDSAAMASGESRTVFTTFKG LPDGVNAVTVSVSNVAPFEGVPVTRE >gi|319977820|gb|AEUH01000181.1| GENE 5 3830 - 4354 184 174 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 67 171 240 345 347 75 34 2e-13 MTTMRRRARAVTAAALVLVAAVGAGPALAADAATVEFPDPVKVVFSVATVDGAVSMAEQQ NATVTTLSGDVNFEVDSDQLTARAKEVLDSLASEWSKAKPSTVTVTGHTDSVADDAHNLD LSKRRAKAVGDYLAGKVPGLSLTTDGKGEAEPIGDNETEEGRAQNRRVEIRAER >gi|319977820|gb|AEUH01000181.1| GENE 6 4512 - 5411 1016 299 aa, chain - ## HITS:1 COG:alr3159_2 KEGG:ns NR:ns ## COG: alr3159_2 COG0642 # Protein_GI_number: 17230651 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 83 291 4 223 225 108 30.0 1e-23 MTRRILWCLAPAAVACACALAWWLAGSRLGVWAFVPLPTVLASVGVLVSAAALAAALVSS ALSQARLAGAAEADARRDLEHRQFLARLDHELKNPLTAILAATAPGDDEASALVAAQARR MGVLVTDLRKLADLSSAPLERELVDLEEAARDAVEAARSSVGDARNFTLVFPTAPWPLPR VIGDADLLYSAIQNVVFNAVKYSGEGDSIEVRASQSENSVSIEVADTGIGIPDAERQSVW AELARGSNAAGRSGSGLGLPLVRLVIERHGGHVRLTSRQGVGTSVVLTLPLPDEAGAAR >gi|319977820|gb|AEUH01000181.1| GENE 7 5411 - 6160 914 249 aa, chain - ## HITS:1 COG:BS_phoP KEGG:ns NR:ns ## COG: BS_phoP COG0745 # Protein_GI_number: 16079963 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 21 244 5 233 240 167 38.0 2e-41 MTTRTGTPSPAEPAAGAAATVLLVDDEAAIIDGLAPFLRRSGFTVHTARDGEEGLAAHSR LHPDIVVSDIMMPHMDGREMVRRIRAGGTWTPVILLTQIDTSFERSAALDEGADDYLSKP FDPQELVSRIRAVLRRCLRTPRPLSAAERVTSGDLVLDRLARRAFLSDAELELTPKAMGL LDYLMTHPGELHTRERLLSVLWGIDFASSTRAVDHRIREIRAALGEDPADPVFIETVPSV GYRFIGRVG >gi|319977820|gb|AEUH01000181.1| GENE 8 6403 - 8943 166 846 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 725 846 297 413 413 68 31 2e-11 MTTNTMLVRANLRAHGRRFVSTGLAVAISTAFIAITMVVMGGLVSQLGAGVTDRYANVTT VVSFNPNQTGGASLSDLTDQARTRLHAVPGVRAVGTYTQWPAEAQANGARAFFFVSPLMD EPLRRPALDSGDYPSGPGQILLPDSTATLLGVKAGGTVTARPQFDGIPAQDLTVTGVYSA PSSNSGMPAAYTTEAGFTALMGSTPSGTLMVATDEAGDDNGNPPTAEQERWASTIGNSLS GLDGVDVTTAKAAMESDLEQAHLGGAAMTAMLMIFPAVAALVASIVVSSTFRVVLQQRRR ELALLRTLGATRTQVRRLVTLEALAIGALSSLIGTAAGTLLGAGALTVMDPRTGYAGALA ATDYIQLALVWAAATVFTAAVGLFPALSASRVPPIAALAPVNEAGAGARKSHTARLVIGA LIAVGALAGVWAASGTADTTTRFLAMFGLSLLAFFGLILAFSVVLPPLTRLLGAAWPGML ARMARENTMRNPGRTSATGSAIVIGVALVVTMMVGASSLRETLTTAVDDARPFDLGVASN SGGPLAQDVEAKVAATNGVAATAPEYAADGTAALPDGSPALPSSSAPGSANAQLVGQPDY TAVAHSHVEQLDDQTARVGVAALDGTRLTVCGSAGSCLSLTASYSDKADPDQVMVSASNL AAIAPDRSLVLVIAKLSDGADAQEVQSALLTLDPSLRVDGSALERQTYMRVIDQVLMAVI 
ALLGVSVIVSLVGVANTLSLSVVERTRENGLLRALGLTKRQMKRLLALEALCLSVTGALV GLGMGVLFGWLGVLSVPLDDVTPVLVLPWAQIGAVLVVAVLSALVASWLPGRRAARVSPA EALATE >gi|319977820|gb|AEUH01000181.1| GENE 9 8960 - 9676 198 238 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 11 222 11 213 309 80 26 4e-15 MENNIVHLEHVSKIYGSGDTEVRALDDVTVDFGEAEFTAIMGPSGSGKSTMMHILSGLDT ATSGAVFVGGRNISILRDTELTRLRRDRIGFIFQSFNLVPTLDARANIVLPMQLAGRKPD KDWFDLIVTSLGIEDRLGHRPSEMSGGQQQRVAVARALMSRPTVIVADEPTGNLDSHSSA EVLDLLRRAVDELEQSVIMVTHDTGAAEHADRVLVCRDGRIVADLRGADQYALADALR >gi|319977820|gb|AEUH01000181.1| GENE 10 9800 - 10195 539 131 aa, chain - ## HITS:1 COG:no KEGG:Slip_1168 NR:ns ## KEGG: Slip_1168 # Name: not_defined # Def: ABC transporter, ATPase, predicted # Organism: S.lipocalidus # Pathway: not_defined # 16 130 445 565 571 65 39.0 6e-10 DGASAQAATPSGTGAGGRPWPADRVPARKPAVDRPKTKALDSGFLIDRQTVDLSDVEQIV DQGQTEAVAWLVRGALEQFAGRASLRDVLARLERQLNSEGLDTITKFGARPGFVARPRMI DVGAAINRYRW Prediction of potential genes in microbial genomes Time: Thu May 12 18:27:11 2011 Seq name: gi|319977788|gb|AEUH01000182.1| Actinomyces sp. oral taxon 178 str. F0338 contig00182, whole genome shotgun sequence Length of sequence - 34578 bp Number of predicted genes - 33, with homology - 23 Number of transcription units - 12, operones - 7 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1376 1715 ## COG3044 Predicted ATPase of the ABC class 2 1 Op 2 . - CDS 1414 - 1962 728 ## - Term 2080 - 2115 1.6 3 2 Op 1 18/0.000 - CDS 2149 - 4305 3536 ## COG0209 Ribonucleotide reductase, alpha subunit 4 2 Op 2 11/0.000 - CDS 4275 - 4691 443 ## COG1780 Protein involved in ribonucleotide reduction - Term 4833 - 4869 9.0 5 2 Op 3 . - CDS 4876 - 5127 465 ## COG0695 Glutaredoxin and related proteins 6 3 Tu 1 . - CDS 5780 - 5926 142 ## 7 4 Op 1 . + CDS 5849 - 6412 311 ## 8 4 Op 2 . + CDS 6388 - 6513 172 ## 9 5 Tu 1 . 
- CDS 6696 - 8360 2292 ## COG1760 L-serine deaminase 10 6 Op 1 . + CDS 8505 - 9998 1571 ## SCO4697 integral membrane protein 11 6 Op 2 . + CDS 10029 - 11336 441 ## 12 6 Op 3 . + CDS 11338 - 12741 587 ## 13 6 Op 4 . + CDS 12743 - 14290 1222 ## 14 6 Op 5 . + CDS 14300 - 15580 1170 ## 15 6 Op 6 . + CDS 15633 - 16388 881 ## COG0775 Nucleoside phosphorylase 16 6 Op 7 . + CDS 16403 - 16900 587 ## COG0394 Protein-tyrosine-phosphatase - Term 17294 - 17345 7.1 17 7 Op 1 . - CDS 17403 - 18158 309 ## gi|269218276|ref|ZP_06162130.1| conserved hypothetical protein 18 7 Op 2 1/0.000 - CDS 18155 - 18706 416 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 19 7 Op 3 . - CDS 18802 - 20541 1857 ## COG0785 Cytochrome c biogenesis protein 20 8 Tu 1 . + CDS 20711 - 21025 405 ## 21 9 Op 1 . - CDS 21199 - 21963 956 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 22 9 Op 2 . - CDS 22032 - 22901 1360 ## Arch_1535 hypothetical protein 23 9 Op 3 . - CDS 22908 - 24152 1292 ## HMPREF0573_10073 hypothetical protein 24 9 Op 4 . - CDS 24200 - 25771 1838 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 25 9 Op 5 . - CDS 25784 - 28171 3402 ## COG3451 Type IV secretory pathway, VirB4 components 26 9 Op 6 . - CDS 28164 - 28580 546 ## Arch_1695 hypothetical protein 27 9 Op 7 . - CDS 28650 - 28862 285 ## Blon_1669 hypothetical protein 28 9 Op 8 . - CDS 28894 - 29832 680 ## - Prom 29913 - 29972 1.5 29 10 Tu 1 . + CDS 30219 - 32075 2019 ## COG3505 Type IV secretory pathway, VirD4 components 30 11 Tu 1 . + CDS 32207 - 33046 676 ## COG5184 Alpha-tubulin suppressor and related RCC1 domain-containing proteins 31 12 Op 1 . - CDS 33043 - 33747 865 ## COG1072 Panthothenate kinase 32 12 Op 2 . - CDS 33758 - 34294 773 ## COG2236 Predicted phosphoribosyltransferases 33 12 Op 3 . 
- CDS 34360 - 34578 168 ## gi|154508398|ref|ZP_02044040.1| hypothetical protein ACTODO_00895 Predicted protein(s) >gi|319977788|gb|AEUH01000182.1| GENE 1 2 - 1376 1715 458 aa, chain - ## HITS:1 COG:VCA0786 KEGG:ns NR:ns ## COG: VCA0786 COG3044 # Protein_GI_number: 15601541 # Func_class: R General function prediction only # Function: Predicted ATPase of the ABC class # Organism: Vibrio cholerae # 11 433 4 421 549 344 46.0 2e-94 MSLRTGTDRDLLKELSQMDGRPYGAYRSLAGTWDYGDFTVAIDRVQSDPYAPPSVLRAWT TPSGMGLPDEALASSQARLAAADYLARIFYDAAHARVGRDVQIPRPRQEVLQRSYALVLP DRVEVRFQVRLPARGRTILGRTAARLFDVDVPNIIMDCFDFVTDDEATNTKRRGLLTHIA AYEDYCALRGALDENGWVAFVADGAVLARRSGVSQLPLADAVPFISPAALRASVDLPHAG AVTGLAIPPGITLIAGGGYHGKSTLLSAIARGVYAHIPGDGRELVATTPNAMKIRAADGR SITDVDISPFIGGLPGGADTTAFSTANASGSTSQAAAISEAVELGSPLLLIDEDTSATNL LIRDARMRSLVTHEPITPLVDRARGLAEAGTSLIMVVGGSGDYLDVADRVLLMDEYLCHD ATERARAIVAEQPRPLPGDDDGAPVGGTAAPAPDEGGA >gi|319977788|gb|AEUH01000182.1| GENE 2 1414 - 1962 728 182 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDPYNPYAQPQQPAGPSYTPDPGYGQPQPGYGQPQPGYGQPQQGYGQSGYGQPQPMYGQ PQPMYGQPLSSFGPEYELRSRAGTVFGLGLASLICAFCNFIPYLSWLFLFASVGMGIAAW VMGQNLQNQVTAMGLPITSIPNAGTGKVLGIVGLGVDVLAVVLWVIVLMGFWSSFSSWGR FF >gi|319977788|gb|AEUH01000182.1| GENE 3 2149 - 4305 3536 718 aa, chain - ## HITS:1 COG:BMEII0930 KEGG:ns NR:ns ## COG: BMEII0930 COG0209 # Protein_GI_number: 17989275 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Brucella melitensis # 15 718 35 738 738 1154 79.0 0 MAENFTDTGVEEAPSPELDYHALNAQLNLYDADGRIQFDADRAAARQYFLQHVNQNTVFF HDLEEKLDYLVEEGYYEKHVLDQYAMEDIKALYKQAYAHKFRFPTFLGAFKYYTSYTLKT FDGKRYLERFEDRVVMVSMYLARGDVDLAASFVDEIMTGRFQPATPTFLNAGKAARGELV SCFLLRVEDNMESIARGINSALQLSKRGGGVALQLTNLREAGAPIKKIQNQSSGVVPVMK LLEDSFSYANQLGARQGAGAVYLHAHHPDIMAFLDTKRENADEKIRIKTLSLGVVIPDIT FELARKNEDMYLFSPYDVERVYGVPFSEIPVTEKYHEMVDDGRIHKKKINARRFFQTLAE IQFESGYPYIVFEDTVNRANPIKGRVTMSNLCSEILQVSEASTYNEDLSYAHVGKDISCN 
LGSLNIAKTMDSPDFSKTIETAIRGLTAVSDLSDIGAVPSIARGNAMSHAIGLGQMNLHG YLAREHVHYGSEEALDFTNIYFMTVLFEAVRASCAIARERGERFEGFEDSAYASGEFFRK YIDEAWEPATDKCRELLAASSIRVPTRDDWRALAADVARYGMYNQNLQAVPPTGSISYIN NSTSSIHPIVSRVEIRKEGKIGRVYYPAPYMTNENLEYYRDAYEIGPEKIIDTYAVATQH VDQGLSLTLFYPDTVTTRDLNKSYIYAWRKGIKTLYYMRLRQMALEGTEVEGCVSCML >gi|319977788|gb|AEUH01000182.1| GENE 4 4275 - 4691 443 138 aa, chain - ## HITS:1 COG:BMEII0931 KEGG:ns NR:ns ## COG: BMEII0931 COG1780 # Protein_GI_number: 17989276 # Func_class: F Nucleotide transport and metabolism # Function: Protein involved in ribonucleotide reduction # Organism: Brucella melitensis # 1 138 1 135 135 175 63.0 2e-44 MSIVVYFSSATGNTRRFVEKLGLPAARIPLRPKEPRLSVDDDYVLVVPTYGGGNTKGAVP KQVIRFLNDPANRAHCKGVISSGNTNFGRAYCLAGDIIAAKLGVPHMYKFELLGTPEDVS RVREGLEQFWQKTSPTQA >gi|319977788|gb|AEUH01000182.1| GENE 5 4876 - 5127 465 83 aa, chain - ## HITS:1 COG:AGc102 KEGG:ns NR:ns ## COG: AGc102 COG0695 # Protein_GI_number: 15887421 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin and related proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 69 11 79 83 105 65.0 3e-23 MTITVYSKPRCPQCDATYRALDKQGVDYTTIDVTQDAASLDYIKGLGYQQAPVVIAGEDH WSGFRPDRIKAAAAQAAPLALQA >gi|319977788|gb|AEUH01000182.1| GENE 6 5780 - 5926 142 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDSVRLRRILDHMPVKPPFPPETQTHFGPPNRWGTRPLGHRAGTRGRG >gi|319977788|gb|AEUH01000182.1| GENE 7 5849 - 6412 311 187 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLSFGRERGLDGHVVQNASESDRIHPDPAKTVKPRRISDHSRAEPVENSETQTHFGPPP PRFAVEPAGAGAPKAQRAGGRPHWGGQEGIGGGNVACLSALGWARGRWGRQIAYPNGWFP APMRLVGKRRRRGARSGGYTAGPPGARAREPRAMGCFHSDTDTTDPNFRSATTLCKFYNA CKVTVSL >gi|319977788|gb|AEUH01000182.1| GENE 8 6388 - 6513 172 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQSNGIPLKTPLRLQNDPSVRTTGPLREAQLRQEGVGRAFL >gi|319977788|gb|AEUH01000182.1| GENE 9 6696 - 8360 2292 554 aa, chain - ## HITS:1 COG:PA2443 KEGG:ns NR:ns ## COG: PA2443 COG1760 # Protein_GI_number: 15597639 # 
Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Pseudomonas aeruginosa # 39 554 3 458 458 418 52.0 1e-116 MDPHPVPLDSPLGQYHAGRPTAAAIVSGEIPATDAPHPLSVFDMFRISIGPSSSHTVGPM RAGLAFTTELAALAQPARISVDLYGSLGATGRGHSTDRAVLLGLAGHDPETVAIDVVDAL LPALAATGSLTLPGGTAVPLDLAADMRFAPRTVLPYHVNALTITASDHRGRTVLQRTYYS VGGGFVMVQTNDDASDPQVASLATSQSGVGIDVPAPHPFATAAQLLAACAESGLSIAELV RANEEAVRPRAVVDAHLDRIANTMFDCVDAGTSAQGILPGGLDVPRRARALAAKLARRSG SRAADGPRPWVSTPADPMRAMDWVNLFALAVNEENAAGHRVVTAPTNGAAGVIPAVLGYL VTHCPEAGASPGAGPDTGDGARAPLSRVRLAPIAEAEAGPSAARARDRRAAAHSFLLAAT AIGALIKTNASIAGAEVGCQGEVGSASAMAAAGLAQALGGSPEQVENAAEIAMEHSLGLT CDPVGGLVQVPCIERNAIAAVKAINAARMALWGDGRHTVSLDAVIETMRQTGNDMLSKYK ETSEGGLAVNVVEC >gi|319977788|gb|AEUH01000182.1| GENE 10 8505 - 9998 1571 497 aa, chain + ## HITS:1 COG:no KEGG:SCO4697 NR:ns ## KEGG: SCO4697 # Name: SCD31.22 # Def: integral membrane protein # Organism: S.coelicolor # Pathway: not_defined # 1 306 1 333 604 80 29.0 2e-13 MIGVERPSDWSVLGFDCDPVPGDPVAVRAGAVSWTALADQISHCAQSLRALEAGASRGAD SVAALLEARDEIVDQVGIMEARYRQAGAALEEYAVVLDRAQSESLQAWYAARDAQGELDA AVGRSESFTRSAQDAGVAGDDEEQARCTRLAGAAEADAACARGRIAAQQQIVTAAVEQRD AAAVRAMGLIAAAKDADGVADSWWDDWGRTVVKWMTALAEAVGAITGVLALVFGWIPILG QAVSGVFLTISVILGTVAAIGYTALWLAGDEELLTVLVSVALALVGLGALKGAARLGTAM CKGMLKHVEFLARAKGAASSVLGKVGLRALLRPGGVPAILRYRFRNAMVGQLGEESMLVG LWPKVRYAMSEYAQAVTRNTRVPDLTIRLFGNMRLGEVKFIRDFSPVKRLCEQMADYSAI AARDAGDGLVHIFRPEFYTDPKTVSKFMEWATGKDVGIHNLAFHSLEEYVGLRAATEINV TCQFGQGVNHYFSGWGG >gi|319977788|gb|AEUH01000182.1| GENE 11 10029 - 11336 441 435 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYGHRALGAQDVRDAFEPFGCRVRPLLDGTFQVGIGDFSFYIVDCPSPASRPGLVEWLAR VRGAAPYTHLYGAVIEDVHRYVDDFDEADALWDDIRGAFALLAERIGGVFVFLDYDPAQP TDYYDEETGQAYRFPDEVDSAAEPEGPSGPALDLDWYGLLSVTPHPAGRWIEACERYFPD LVPDYYDYERRPITAAARRYLLDPDSRQENPALIQSGFLGRMGARLPRLSSGGAGTANTN PNPVYGYGCRVLLRTVEERMSLREFRRFFTTMARELRSELATGQVLHGYASDGGEAVWVP GAEQPRSLELSRDYEFLGLPSVASWWVWLGPDYARVVGGWLERGAPGSWVVEGTGDGGVF 
VQTSTEPVACGAGAWFPEEFLPTVGPVPRRRLVDWLLGTRPRPQVGPARVMPTWTRPTAT AGTRPTADPGEPTED >gi|319977788|gb|AEUH01000182.1| GENE 12 11338 - 12741 587 467 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSWTVALDVYGHRALGAQDVRDAFEPFGCRVRPLLDGTFDVGVGGLGFRLMDCPSPASRP GLVEWLARVRGVPTYTHLYGAVIEDVHRYVDDFDEADALWDDIRGAFALLAERIGGVSVL LDYDPAQPTDYYDEETGQTYAFPPEPAGEPAPRTDEGAFLSLAWMGFPALTPHPMRRWME TCERLYPDLAPDRYGYEGRPITAAKRRYLFDTSSREYESLISNTGLLGDVHVNLPAVYSP DRTTPVPLHRTYSIACRVSLDKLETGMGVEGLRDFFVRMARELPAELATCELGQRHWPAM SQDREFLGLPSVGSWWVWLGPDYARVVGGWLERGAPGSWVVEGTGDGGVFVQTSTEPVAY GAGAGQWFPEEFLPTVGPVPRRRLVDWLLGTRPRPQVGPACVMPTWTRPTTDDADPGAVP DTRLGADAGPGADGTNGPVVGADADRAHNPTTGPGDAPGGPGEPTED >gi|319977788|gb|AEUH01000182.1| GENE 13 12743 - 14290 1222 515 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSYIVSFEVYGHRALGAQDVRDAFGPGAAVEERLDGSYTVVFDGQGFVLENWMSVLSSAG LVQWLARVRGVPAYAHMYSAVFDEKMIDGRYGPDIDGALRTAFAALASRADGVSVFLHEL AIDYTDEETGQEFTLRAEDDWAPRPSEHDALVLEWYGRMPAGPNPVANWILACERHHPDL VPTEYGRPLRPITGAAREYLLSAENNERYMTLREGTLLGDIDFRLPTLVERSGPGYLLSV NPQGAYSVSCALLWQRARERMGLGALRRFFTNMGRGLGAEFATGQFLTGYASVSGLAEWT PGAAQRHSPQLGRDDELLGLPSVASWWVWLGPDYARVVSGWLERGVPSSWVVEETGDGGV LVQTSAEPTGYEAGAGQWFPEEFLPTVGPVPRRRLVDWLLGTRPRAQVGPARVMPTWTRP TTDDADPGAVPRPGGGPEPTAGPDTGADGTTGPGADGTNGPGADADADRAHNPTTDPGAD GTSGHVASADTDRAHNPTTGPDTRPGADAPGPPRT >gi|319977788|gb|AEUH01000182.1| GENE 14 14300 - 15580 1170 426 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDDANSPASSARVLYAHLPGGAGFRADALTGEPEARLDWKSAWGDEDADQGTVFSLAHSG APGGGGTRVGDYRVVPVLPSEGFGSQELAEVTNDFNATVRALAENPVALPVAFDDYGRLR LVAHRREGEDFLRLSVYSTSAAVHADLERAGPLFIIRHGASVIRFLDEHSERILFVSVDP ASCLRPIEFPAGLPQYFATMPPPRRADRREEGTGWGAGAAGEGPTPTGFVLDLSDDWVRV DLTVGAEELRARVDDLIAEQMRQWPTTMGWLRERTRGWLQRTCLQAKSSGGVEFAVLLNS AKRTEPALTLVNYWLALPRTSDGPLVHVRDFLAETADDDDEIAVIETRARRLLRSVRVRP AARRGGAVPEQEMLLVDYWIEVPGDPASAARIVFTRPGADSRDGVVAQCDAIVRSARWDA PGAGPA >gi|319977788|gb|AEUH01000182.1| GENE 15 15633 - 16388 881 251 aa, chain + ## HITS:1 COG:SA1427 KEGG:ns NR:ns ## COG: SA1427 COG0775 # Protein_GI_number: 15927179 # 
Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Staphylococcus aureus N315 # 55 251 31 228 228 125 35.0 8e-29 MSENLSFPGRVRADAVIQCAMDMEAAPLLHLLEPMGDEETPRAVHAGAPGRHVQRFALGL IDGRTVLVVTSGIGEANAAAATARALVLVDAPIVIAAGTTGGLARDINVGDIAVGVSAVY GQADATAFGYALGQVPRMPVDYSSSERAAARCDALAGLVDHPVRLGRIVSSDSFCTEQVA EPMRQRFPDAMGADMETCAIAQVAWSCGVDWISLRAVSDLCGPGADQAFHMDGERAAAHS AEAVRAYLALL >gi|319977788|gb|AEUH01000182.1| GENE 16 16403 - 16900 587 165 aa, chain + ## HITS:1 COG:MT2293 KEGG:ns NR:ns ## COG: MT2293 COG0394 # Protein_GI_number: 15841726 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Mycobacterium tuberculosis CDC1551 # 1 163 5 153 163 101 42.0 5e-22 MRVLMVCTGNICRSTMAHQVLDEAVAAAGLSGAVRVDSAGVSDEEHGNPIDPRAARVLRA HGHGVPDHRARQVRASELADWDLVLAMTSSHYAALSRLRDRARPGGRAPEIRMFRDFDPQ CAHLPEGPGRDKDLPDPWYGDYGDFVDTLAVIERSTPVLVDYVRR >gi|319977788|gb|AEUH01000182.1| GENE 17 17403 - 18158 309 251 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|269218276|ref|ZP_06162130.1| ## NR: gi|269218276|ref|ZP_06162130.1| conserved hypothetical protein [Actinomyces sp. oral taxon 848 str. 
F0332] # 21 187 26 188 209 73 31.0 1e-11 MSDHIFDDDEIAAHLAASLAPIEPPAHVRDDLLAAIEEVEQEQWPQAPEEAPARGRRVIP LFARVFAAAAAAVALFAAGIGVGRWTTMTSMESTSHYAALNQAQDVRRSTDTMPDGHVVT LTWSPEMGMTAITTPSALRAPEGQVMQVWARHGSTVESLGVYERRGGDYSFVDIMPQPGS EIFLTFEPQGGSAQPTGEPLVVLRVGAPDPAPAPGGAPTPDSGPTSDGAPSPTTAPTWNG GRPSDGQSGAV >gi|319977788|gb|AEUH01000182.1| GENE 18 18155 - 18706 416 183 aa, chain - ## HITS:1 COG:MT0461 KEGG:ns NR:ns ## COG: MT0461 COG1595 # Protein_GI_number: 15839833 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mycobacterium tuberculosis CDC1551 # 14 181 15 185 187 99 39.0 3e-21 MTDTAPRDLGREDLQLVAEGDKRAFARLYDAWAPTLFALIRCVLKDRAQAEEVLQDTFLH VWRRAPSYDPGRGSVRAWLTTIARRRAIDRVRSAQAARDRELASPPDVDWDQTAEEAESR IAGGDVRRALGSLGEPHKTVILLSYFGGLSHSRIADATGLPLGTVKSTIRQALARLRTFL EER >gi|319977788|gb|AEUH01000182.1| GENE 19 18802 - 20541 1857 579 aa, chain - ## HITS:1 COG:Rv2874_1 KEGG:ns NR:ns ## COG: Rv2874_1 COG0785 # Protein_GI_number: 15610011 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis protein # Organism: Mycobacterium tuberculosis H37Rv # 3 292 116 401 401 267 51.0 3e-71 MTTLVLIGFLGGLITGISPCILPVLPVIFMAGGAASARPSPSFAVAPGTGISLGGTASAQ EDHRATRPAPRAPSKWRPYQVVAGLVLSFTAFTLLGSTLLTALHLPQDFIRWAGVVLLVL IGVGMIVPRFMHVLEKPFAAFSKAGKHTGANGFGLGIVLGAAYVPCAGPVLAAVSVAGST GRIGADTVALALSFAAGTAIPLLAFALAGRRLTERVSAFSRHQRRVRIAAGATVLALAAG IVTDLPAVLQRALPDYTSALQRSADPGLSRGGGQRSACVDGATTLADCGALPSLKPTAWL NTDGEAAPQGSGATLVDFWAYSCINCQRSIPGIERLYEAYRPYGLQVIGVHSPEYAFEKE TANVQSGGDKLGITYPIAVDSDLETWTDFDNHYWPAQYLADAHGQLRYTHFGEGGEATLE SLIRQLLADQGNALPDPLFTDDGVEGVAFHRTPETYLGHERAASFAGTEAYSAGTRDYAA PASLGADRFALQGRWTIGAQSISPADDEAELRLMWRGRRVDLVVSGEGDLTWEMDGQTHT QHVSGTPNSMTLVNRDTESSGLLTVTASKGLALYSFTFG >gi|319977788|gb|AEUH01000182.1| GENE 20 20711 - 21025 405 104 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSIRRAALAAALSVVFAAGLAACGNSANDKMENPSSPSMTQDTKMENSMSGDGMKGGMDD 
GKMDDGKMDDGKMDDGKMDDGKMDDGKMDDGKMDDGKMDDGKEG >gi|319977788|gb|AEUH01000182.1| GENE 21 21199 - 21963 956 254 aa, chain - ## HITS:1 COG:CAC0284 KEGG:ns NR:ns ## COG: CAC0284 COG0846 # Protein_GI_number: 15893576 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Clostridium acetobutylicum # 3 241 5 244 245 226 47.0 4e-59 MSDDASTLAQWIAEAHDIVFFGGAGVSTESGIPDFRGAKGFYHQEREIPLERVLSIDFFS ACPGAYYAWFAEETAREGVAPNAAHRFLAGLERAGKLKAVVTQNIDGLHQAAGSKRVLEL HGNWTRLECTGCGARSTIDDFDEARAGRVPHCPSCSAVVRPDIVFYGEALDPATLEGAVL AIAGADMLIVGGTSLAVYPAAGLIDYYQGGRLVLMNATPTPYDGRADLIIREPIGRVFAQ IQGHVRAGATVRPE >gi|319977788|gb|AEUH01000182.1| GENE 22 22032 - 22901 1360 289 aa, chain - ## HITS:1 COG:no KEGG:Arch_1535 NR:ns ## KEGG: Arch_1535 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 1 287 1 280 281 129 31.0 9e-29 MKDALKKLVEFLFNGHASSVDSVTKDILDFSADTSQVVSDISGFAVMPVALTVLAIVMMV ELNRKASHIEADHQTGVKLIAAVIFKYIILVMAVKNSGILLNAIRALVANVMGDPHLDPE GAGVSVNENGAAVSAFQASIDKASSVDQVGMLIFLLIPFLVTLAAKATLMIVVLLRFAEI YMLTAFSPLPFAFLGNDETKSFGTSFLRKYVEVCIHGVCIIVSIHIYKGIISVASSAENG SFLEVSPIGSDDPLAWLVGNYLPILITPVILMMVVLGSGKIAKAIAGNS >gi|319977788|gb|AEUH01000182.1| GENE 23 22908 - 24152 1292 414 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10073 NR:ns ## KEGG: HMPREF0573_10073 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 166 380 130 348 360 102 29.0 2e-20 MSIDPRVTVVELYELARNRSLWSYGFFDDLESHGSSWLQLRDWCLFARGVQNFAEIPDPP APPPELVEGSRSRGIFSRKRKPPRNPVIDGSAPPPPRKAPKKKQGFVAPTSTGGTLSSSV PLPAIIAGAVVIVLCLGLLVFQLVRNQKPATDPNAGGEAGTASDTQQSAAMSGTSATKAS NSTGATAGGAVLTAGGQAFSCIAEGEEVTCSGENSMGQLGTADLATARTFTFYLPTAARS LVAGDDFACASTSSEVWCWGDNRWAQTGHGNSEVTSPGPVVGVPGGAIKDIAAGRAHACA LTDQGVWCWGVNTAHQVKDSDDTYLSPTQVEGLDSLEITGIEASNFTTFVRTTRGVWAWG DNTRHQITSTDSAYLPPTQISEPTEGTGNPGGTGATQAPSQNGGPANPSNQRAQ >gi|319977788|gb|AEUH01000182.1| GENE 24 24200 - 25771 1838 523 aa, chain - ## HITS:1 COG:BS_yomI_3 KEGG:ns 
NR:ns ## COG: BS_yomI_3 COG0741 # Protein_GI_number: 16079194 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Bacillus subtilis # 243 387 103 229 240 80 37.0 6e-15 MKFQQQAGPTIKYAKTELKTRVVSVTQSSGSALREGAHGVPQTTGGSALHEGAHSAAHQA GGAAHANGFDAAAVHSDPVGAVGSNASSLIPKQSIRAVGDNLLVVGKKGAATAAKGGLVG GKLASGGSKSLLLKTRGLGDALSSLGGMTGKMSAAIKASKPMMLLGAINGTKAAGIAAHA AQAVSQFIIVASKAIGSLVATALMAAKNLSLLGIVMVVATFLSSFLGSLGIEIAQQERKV GRSVAANVPPEYAQYVNQAGSMCPEITAPLIAAQIEAESGWNPAAKSPVGAQGISQFMPG TWATQGGDYNGDGRADPLDPADAIPSQGHFMCSIVATLNPHVASGAVAGSIQEVALAGYN AGPGAVISNGGIPPYAETQNYVSKILALMVKYQAAQDVAAVGGSLGDALEWAKSIAMDDT NHYVLGSQGPTAWDCSGLTGAFMARLGVALPRTAREQSTAPGGVDVPYDQMQPGDLIFWD WGDGSWHTAIALGGGQMVSADSPESGINIEPVFPGVRNVRRFL >gi|319977788|gb|AEUH01000182.1| GENE 25 25784 - 28171 3402 795 aa, chain - ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 312 763 114 608 617 96 20.0 2e-19 MFKSKKDADKPKKKKKVSRRQILRAQDVLDYEALMDSGICVLDDQHFSITLRLSDVNYVI APPHQQESIIEQYAKFLNSFDSAHHLQVSVINRVMDQETVEEAAFMPYRPDSYNALRREF NRVIHDRLATGRNNTVTDKYFTITVEAEDYEDGRARLMRVATESISHLRSVGGCTADVVT GQARARIVHSFTRPGERFDFDYAQLVDPGKSSKDFLAPAVFDFSNARHVRIENEEVSYLR VLWLEELPAWLSDRLVKELTDLNMNLGVSLHISPIDQGEGLDLVKRRLADMSIQRANETR KLTKQGLPYDLMPHELEASYEAGVDLRHELETSNQKLFTTTLLVGVAASSIEELDSRCER VIQVGNRQSCRFSTARYMQEAAFNAFLPLGRCRLPLFRTLTTGTTAVMIPFTSQELFEED GVCYGVNALSKNLILASRSTLMNGNAFFLGTSGSGKSMYGKSEIHQVLLNRPEDEVIIID PDREYAPVGHELGATTVEIHAGSQHCVNPMDIVRDSTEGDLVRLKSEFVLSMCELLIGGT TGLSPSQRSIIDRCVTAIYNTFFSKRRAPMPTLATLTDMIRRQDDAEAPQLATSLELYST GSFSGFSQQTNVDTSNRFVIYDISQLGQSLRSFGMLIILEQIWNRVIQNRAKGIRTWLYI DEFHLLFSNDYAAAYFQSIFKRARKYGLNPTGLTQNIEELLMSERARLMLANCDMLALLN 
QTPTDAESLQELFQFSDEQRGYYSHVPAGQGLLKMGQVVIPFDNSIPVDTRMYKVFSTKP GEMIDAQQYRQPMSR >gi|319977788|gb|AEUH01000182.1| GENE 26 28164 - 28580 546 138 aa, chain - ## HITS:1 COG:no KEGG:Arch_1695 NR:ns ## KEGG: Arch_1695 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 1 96 1 96 121 88 44.0 7e-17 MAIEIQVPREITAYQAKVVMGMSWRQLACAAAMTILGFATIFVGYLADFVETSQYIVIAI IIPFAALGWYRPRGLPFEKYAGYIINHQRSRQLYVYGMISQVQPSADGGPVPRKAKRNRK KTKSRRKGSLPVERDSYV >gi|319977788|gb|AEUH01000182.1| GENE 27 28650 - 28862 285 70 aa, chain - ## HITS:1 COG:no KEGG:Blon_1669 NR:ns ## KEGG: Blon_1669 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 4 69 22 87 88 77 62.0 1e-13 MDANVVNEALHLLVRFATIGGGLWAIWGVVVLAGGLKDHNGPGIQSGVWQLVGGGMIVAA AQLFGQISLH >gi|319977788|gb|AEUH01000182.1| GENE 28 28894 - 29832 680 312 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRRTFIALAAVATIGAPALYSGGAQAQPATTGTMNGNGHTQALNATRGVLAPTSPMLEIH WAANDDQKPPTPDGPNDPNPPTPGPSEPGEPDIPAPNPGPTTPPAPENPPTPDDPNPGTP GQPPTPGNPDPAPGNPDPAPDPGVPTPDDPAQPSPGPTTPAPDPGPTTPNPGPTTPDPGP TTPNPGPTTPPAPGADPTNPPTPTPGRTPSKPGTQYRAADRGTPPLPAGQVPSALPPAPT GIGDISAVVGGTPAPAPEAAQAPAPDVNAQRPSASELAQTGAGSSALTSAAFGATGLTLL AMRRRARKRSRA >gi|319977788|gb|AEUH01000182.1| GENE 29 30219 - 32075 2019 618 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 380 613 315 551 591 102 28.0 2e-21 MDKSRVRTPVVVGVLCVLAWAIGDKISYQVRAIVRGGGTAGDVLDGFWKDLPVLTHFSFD GPDVLAGMIAAASVLLAVVYRIGARTNIRAHQEHGSARWGGPRDIRPMVDPDPARNLLFT RTECLSLDTRATRRNLNALVVGASGTGKTSRFVVPNLLQASTSYVVTDPGGRVKAATEGR LRANGYEIRTLDLVNPAAHGDSFNPFAHIDAGNAEVELLILVDALMANTAPMPGALHDAL SDNAERALLAALCAQAYASDPHGASLGAVADLLAQLDAADGDGRALFARGAPAEDAPPPG AATAAGPAQPPAVGTGAADGPLALGEPPRTAGRSEPDPLMPPVPPAAPPFPGGAHTAPVG 
PEWPLDDETAVRREQIREFARSQYAAYSQGPDSTRLAVRASLGSRLAPLRIVSIRRALSS DTIGTELIGQRKTAVFLVVPDSHRSLDFLVSLFYDLVVRVNSAAADRSPGGRLPVPVQCF MDDFATVGRIPGFEAKLGVMGKRGISACVIVQTYSQGRALYGDDWEGIVGNCDSKLFMGG DEETTLRWIVGRLGRETIDTVDSATVKGSVGAFSQSWRRLGRELLTADELAGLDPGRCVF TLRGLPPFLSDKLPAQDR >gi|319977788|gb|AEUH01000182.1| GENE 30 32207 - 33046 676 279 aa, chain + ## HITS:1 COG:CAC2722 KEGG:ns NR:ns ## COG: CAC2722 COG5184 # Protein_GI_number: 15895979 # Func_class: D Cell cycle control, cell division, chromosome partitioning; Z Cytoskeleton # Function: Alpha-tubulin suppressor and related RCC1 domain-containing proteins # Organism: Clostridium acetobutylicum # 1 260 88 353 370 65 27.0 8e-11 MGLRADGTVAAVGADSAGECRTLRWQDVVAVAAGGVHTARNTGSSHTLGLRADGTVFACG WNAQGQCETRSWRGVAAVAAGWRFSVGLRGDGTLVGTGRAAEGQLRFEEWRAVVDVSCGD WHTAAVLADGTARATGNNAAGQCDVGHWTRIRAVASGYLHTLGLTGEGTVRAAGRSRAWA GSEQWEGIAAVAAGSRHSVGLRTDGTVVAVGESESGQCDVGGWRSIVAIAAGPAHTVGLR ADGAVVATGSNSHGQLGVGSWRLGRAVAPLRARRPHGGP >gi|319977788|gb|AEUH01000182.1| GENE 31 33043 - 33747 865 234 aa, chain - ## HITS:1 COG:BMEI1883 KEGG:ns NR:ns ## COG: BMEI1883 COG1072 # Protein_GI_number: 17988166 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate kinase # Organism: Brucella melitensis # 39 234 12 208 213 119 36.0 5e-27 MASAPPAARDSSGVPHFPSPQDPASLESSARIARRVVEEVADRVGKGGRVRVLGLTGPPG TGKSTVAALVADLLPKAGIPLAGMAPMDGFHMSNRVLAEAGIADHKGAPDTFDVGGFVAL LERIQRAEATVLAPDYRRELHEPVAASLRVAPEGVAVTEGNYLGLDLPGWSQVRGLVDVL IYVDTPENEVLRRLVARHEAFGRDRAAAAHWVRTVDLANIRLVASTRPRADLVV >gi|319977788|gb|AEUH01000182.1| GENE 32 33758 - 34294 773 178 aa, chain - ## HITS:1 COG:Cgl1433 KEGG:ns NR:ns ## COG: Cgl1433 COG2236 # Protein_GI_number: 19552683 # Func_class: R General function prediction only # Function: Predicted phosphoribosyltransferases # Organism: Corynebacterium glutamicum # 15 178 4 156 158 149 48.0 4e-36 MAEASASPDATDAPDREVLTWQGFGDASRELATRIVDSGWVPDLIVAIARGGLIPAGAIA YAMDVKAMGTMNVEFYSGVGQTLAEPVLLPPLMDVSAMDGKRVLVVDDVADSGKTLKMVM 
DLIAEHGLTLDGGASVRVDARSAVVYKKPVSIIEPDYVWRHTSQWINFPWSTLPVIKP >gi|319977788|gb|AEUH01000182.1| GENE 33 34360 - 34578 168 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508398|ref|ZP_02044040.1| ## NR: gi|154508398|ref|ZP_02044040.1| hypothetical protein ACTODO_00895 [Actinomyces odontolyticus ATCC 17982] # 1 71 99 169 169 84 67.0 2e-15 ALPAPDAPWVGTCPACGAQRRLFRAPRRVASCGACARVFDADRILTWTRDGAPASPGGAY ARELARVRRSRR Prediction of potential genes in microbial genomes Time: Thu May 12 18:30:21 2011 Seq name: gi|319977785|gb|AEUH01000183.1| Actinomyces sp. oral taxon 178 str. F0338 contig00183, whole genome shotgun sequence Length of sequence - 1749 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 221 128 ## Noca_3805 protein of unknown function SprT 2 1 Op 2 . - CDS 295 - 1749 1422 ## COG1643 HrpA-like helicases Predicted protein(s) >gi|319977785|gb|AEUH01000183.1| GENE 1 2 - 221 128 73 aa, chain - ## HITS:1 COG:no KEGG:Noca_3805 NR:ns ## KEGG: Noca_3805 # Name: not_defined # Def: protein of unknown function SprT # Organism: Nocardioides_JS614 # Pathway: not_defined # 1 72 14 85 233 75 55.0 5e-13 MDGAGLGQWAFRFDRAKRRAGSCAHATRTISLSGPLTDIYDEATVRAVVLHEIAHALVGP SQGHGARWRETAR >gi|319977785|gb|AEUH01000183.1| GENE 2 295 - 1749 1422 484 aa, chain - ## HITS:1 COG:RSc1251 KEGG:ns NR:ns ## COG: RSc1251 COG1643 # Protein_GI_number: 17545970 # Func_class: L Replication, recombination and repair # Function: HrpA-like helicases # Organism: Ralstonia solanacearum # 28 470 949 1325 1331 92 27.0 2e-18 GSPGCRPGADGGLRAAPPPFATAFASAVRELRGVELSEADLAHAAEHLPDHLRMTFVVVD SRGRELGAGKDLAHLQHSLAGAADQAVRRAVRGAIAQAMEDAQAGRGRRGKTAEGTGRAT GGGRKGRSDGGRGTGADGRTPGTADGGGTGADGRPDASASASPSRAGGGSEAAALSALNE DSITAMPALPRAVASQADGLSLRAFPALVPQGSADDPRAGVRVMANAAEAAREHRLGLAR LLLQRVRLATARVTTRWTGREALMLAASPYRGTDALVADAQLASALSLVDQLCEPDSVRS 
PEDFERLAAAARGRHEDRVYEILGHVVRAMEAHSEAEGAVAAHPQASLREVVDDVRENTR RLVHAGFLADTPFGALPHLARYLRAGAVRIDRASQSAAALDRNLADMDRLGEAARRLEGA RRAAAGRPYDATTARLLSQAQWMAQELRVSLFAQRLGTPNKVSFKRLLGVIADAEAAVRP GGRP Prediction of potential genes in microbial genomes Time: Thu May 12 18:30:25 2011 Seq name: gi|319977781|gb|AEUH01000184.1| Actinomyces sp. oral taxon 178 str. F0338 contig00184, whole genome shotgun sequence Length of sequence - 6766 bp Number of predicted genes - 5, with homology - 3 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 3580 3254 ## COG1643 HrpA-like helicases 2 1 Op 2 . - CDS 3646 - 4803 1118 ## gi|227495836|ref|ZP_03926147.1| hypothetical protein HMPREF0058_0139 + Prom 4904 - 4963 4.7 3 2 Tu 1 . + CDS 5200 - 6348 898 ## COG0524 Sugar kinases, ribokinase family + Term 6535 - 6604 7.5 4 3 Tu 1 . - CDS 6141 - 6560 108 ## 5 4 Tu 1 . - CDS 6683 - 6766 75 ## Predicted protein(s) >gi|319977781|gb|AEUH01000184.1| GENE 1 1 - 3580 3254 1193 aa, chain - ## HITS:1 COG:hrpA KEGG:ns NR:ns ## COG: hrpA COG1643 # Protein_GI_number: 16129374 # Func_class: L Replication, recombination and repair # Function: HrpA-like helicases # Organism: Escherichia coli K12 # 353 1130 254 906 1281 652 45.0 0 MPGEHDNRTPARPKPARGAKGRRRARSGARPEHAHVHRPFTPQQLAQRAAAIPVVVFPDL PVSARRDEIAEAIRDHQVVIVSGETGSGKTTQLPKICLQLGRGVTGMIGHTQPRRLAARS VADRIAAELGQTVGRAPGQVVGYQVRFTDEVGPTTLVKLMTDGILLAEIQSDPMLERYDT LIIDEAHERSLNIDFILGYLARLLPARPDLKVIITSATIDSARFAEHFGRWDGPVGRGRP VEPAPVIEVSGRTFPVEIRYRPLAADLAPSRSSTPTGSDGAEAGAAGPGTRGSASASAGA GPGSGPTGSGTDGSGATPDGAGAGGAGAFEQLVLEDPDDDLPALGYGLGEDIDVETAICH AVDELSAEGDGDILVFLPGERDIRDTESALLDHLGARGVRAGEDRGALPGSIEILPLYAR LTAAEQHRVFEPHRLRRVVLATNVAETSLTVPGIRYVVDPGLARISRYSNRTKVQRLPIE PVSQASADQRAGRCGRVADGVAIRLYSQRDYEARPRFTEPEILRTSLASVILQMASLGLG PVEDFPFVDPPERRAVRDGVNLLVEIGALAPDGPSDRGPGRGERPGSAPRSAEGAGDRQG SRPSAAPSPQGTGSGRGAPRRGSAQPGHRLTRIGRDLARLPIDPRLGRMLLEADANGCAS 
EVLVIVAALSIQDVRERPAEHQAAADAAHARLADPHSDFVTYVNLWRYLNVQARDLSGSA FRRLCRAEFLHYLRFREWRDVVNQLRQMARPLGIDAGPVGEPPRRDVVEAAASGGAADAA ARAVVAFTQGTATVDADQIHRSLLVGLLSNLGSWDESKRDYEGARGTHFTIWPGSGVGGR PAWVMAAELVETSRLFARTVARIRPEWVEPAAGALVKRVHSEPYWSSSKGAAMVKEKVLL YGLTLSADRPVLLGGLGGTVIDEAGASSSGLLGAAPGAGALGAGPLTARALAREMFIRHA LVEGEWRERHAFQRANEELVEQAREVERRSRTHGLVADEEARFAFFDDLVPEDVVSAAHF NRWWKGERRSRPDLLTYPRALLLPRGAGEGDGFPDHWVGGDVRLPVSYEFSPGSPRDGVS VRVPVEALARVSDEGADWLVPGMVEELITGTIRALPKAKRRLLAPAPETAARIAQWLADN PSGASPANDRAVPTRADDREDPASLSAAMDRLARWGRRRARRAPTAPAPGPGG >gi|319977781|gb|AEUH01000184.1| GENE 2 3646 - 4803 1118 385 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|227495836|ref|ZP_03926147.1| ## NR: gi|227495836|ref|ZP_03926147.1| hypothetical protein HMPREF0058_0139 [Actinomyces urogenitalis DSM 15434] # 25 385 2 358 358 112 29.0 4e-23 MIAVAAAVIIALVAALVWSATRQGGEPSVPPTAAQSSAQSAAQSSSAQSSSATPSPTLTA TAPTADDLADALLAVPDGCTVDALKDAEGKVRFSEGTARANAGMYEVSITIDSAYQGSLG GAPTTAARLVCRGGGDTMVEELAFYDPSLALVAALDARKDIGAAAHAALVRPAFTSVEFS ADALSLRVSGVSMFGDQASCPTCAPSTEAVVAMRWNGSAFEAVSSWFDTANGAVAAPDQA EAQGFYDAVAAEDYAAAARHASQDDIARLRTEGIAGCAPDDLDTCGMRAVQFPVGGQVGA CGAIPDVREGSGEFTIPGDPLGLAFAQSWFTGQPYQQGDFVCGIDTSSSARPLSADGSRY PVWLVVRPTDDPHGFTVVFFGRAFS >gi|319977781|gb|AEUH01000184.1| GENE 3 5200 - 6348 898 382 aa, chain + ## HITS:1 COG:alr4681 KEGG:ns NR:ns ## COG: alr4681 COG0524 # Protein_GI_number: 17232173 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Nostoc sp. 
PCC 7120 # 2 349 4 283 286 120 29.0 6e-27 MNGLFAGLTTLDVIHALDHVPDPTVKVTSTDHVMAAGGPATNAAIAFAALERAAGRLRPS TGAAGAGNARAEDGDGAAAPGAGALEAGAEGSGRSRAVLLSALGAGAAAEFLRADLAEAG VRLVDATAASNSSGPAVSGIIEHPGGRMVASTNARVEADPALARRALESARPWDVVLVDG HNPGLAQSALTAGTHPATDPDDPFAELEARPAHLRVLDGGSWKDWFTPLLGLVDVAVVSA DFAPPLLRAPDGAQVAGFLRGFGITRTVRTQGPGPVQFWWDGRSGEVEVEPVDAASTLGA GDAFHGAFAWGCARYRRVGAPISDPRGLIRFASAVARVSVASFGTRRWLSSPELQRCVES FVESFEAGPAAEPGGPGGAPGR >gi|319977781|gb|AEUH01000184.1| GENE 4 6141 - 6560 108 139 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHRNLPQCDNAPAPAVPGCGEAAPLPTPMSHCLPQCGPVFSPNDPLPAPMRTRSPPKRPI AYPDTGPAPPLQRPGAPPGPPGSAAGPASKLSTKLSTHRCSSGLDSHLLVPKEATETRAT AEAKRMSPRGSLIGAPTLL >gi|319977781|gb|AEUH01000184.1| GENE 5 6683 - 6766 75 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no ATSGPVPSDRADQPRLMTGPTGRQPHR Prediction of potential genes in microbial genomes Time: Thu May 12 18:30:55 2011 Seq name: gi|319977778|gb|AEUH01000185.1| Actinomyces sp. oral taxon 178 str. F0338 contig00185, whole genome shotgun sequence Length of sequence - 1552 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 170 - 958 814 ## CA2559_12698 hypothetical protein 2 2 Op 1 . + CDS 1091 - 1222 93 ## 3 2 Op 2 . 
+ CDS 1213 - 1552 284 ## gi|154508402|ref|ZP_02044044.1| hypothetical protein ACTODO_00899 Predicted protein(s) >gi|319977778|gb|AEUH01000185.1| GENE 1 170 - 958 814 262 aa, chain - ## HITS:1 COG:no KEGG:CA2559_12698 NR:ns ## KEGG: CA2559_12698 # Name: not_defined # Def: hypothetical protein # Organism: C.atlanticus # Pathway: not_defined # 4 208 6 207 405 66 27.0 8e-10 MSDVTIETPRGVVLSGTMIDPVDSRDAAVLFSHTFLADRRSSAFFDDIARAFRGAGYATL AFDYSGHGRSGDEIITLSTMVEDLRAASGWLADQGHPRQIVHAHEFGATVALEARPPSAV TYILSSPALGPLSYDWTTIFSEVQLSDLERYGTTTIPDDSPSVRRHFTISKQTLADLSMA DAEKLLTGLEVPTLITHDAADEETGLLDLTRAAFPLLPDGSRVEITSDDGPVDGADPAPG ALPPKTTLRSACLEWARRWVPR >gi|319977778|gb|AEUH01000185.1| GENE 2 1091 - 1222 93 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRRSTARVPDRAHSHTRKDRRPLRAPLPLNPAEQLAREGNVWE >gi|319977778|gb|AEUH01000185.1| GENE 3 1213 - 1552 284 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508402|ref|ZP_02044044.1| ## NR: gi|154508402|ref|ZP_02044044.1| hypothetical protein ACTODO_00899 [Actinomyces odontolyticus ATCC 17982] # 1 89 1 89 98 112 76.0 8e-24 MGMKIGIVLGFGVGYVLGARAGRERYEQIRATAARLRRAPVVARPLDAAGQRVSDIVRAG GEHVTDKVADAVKERLFGAPASAAEAGDAAPAPKSGTAPASRGAQANGGAVPG Prediction of potential genes in microbial genomes Time: Thu May 12 18:31:11 2011 Seq name: gi|319977776|gb|AEUH01000186.1| Actinomyces sp. oral taxon 178 str. F0338 contig00186, whole genome shotgun sequence Length of sequence - 1322 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 203 - 1246 1384 ## COG1814 Uncharacterized membrane protein Predicted protein(s) >gi|319977776|gb|AEUH01000186.1| GENE 1 203 - 1246 1384 347 aa, chain + ## HITS:1 COG:Cgl2230 KEGG:ns NR:ns ## COG: Cgl2230 COG1814 # Protein_GI_number: 19553480 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Corynebacterium glutamicum # 2 343 24 354 357 305 52.0 1e-82 MEARTYRDLSERRTGEERAVLLALEDAERRHEEYWLARLGEHALPAPKPPLRTRAAAVLA HLFGTIFILAMAQRAEQRSSRDADDDVPAHMQADEHIHAEVIRSLAAKSRETLAGTFRAA VFGANDGLVSNLALVLGVAATGMAPGLVLTTGVAGLLAGALSMAAGEWVSVTSQRELLDA SIPDPSANRAVPDLDVDANELALVFRARGESPEEADAHAAQVFARISAPATGESGSIPVR AVFAGAQAEAGAHEQIGTPAKAALSSFAFFAVGALIPLIPYIAGLSGITAIVCAAAVVGC ALLATGGVVGVLSGQAPAPRALRQLAIGYGAAAVTYLLGLAFGTTAG Prediction of potential genes in microbial genomes Time: Thu May 12 18:31:14 2011 Seq name: gi|319977764|gb|AEUH01000187.1| Actinomyces sp. oral taxon 178 str. F0338 contig00187, whole genome shotgun sequence Length of sequence - 8222 bp Number of predicted genes - 10, with homology - 8 Number of transcription units - 5, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 106 - 1749 2171 ## COG1113 Gamma-aminobutyrate permease and related permeases 2 2 Tu 1 . + CDS 1968 - 2636 873 ## Arch_1061 domain of unknown function DUF1994 3 3 Tu 1 . + CDS 2861 - 3577 587 ## BLA_0008 hypothetical protein + Term 3610 - 3637 -0.1 4 4 Tu 1 . - CDS 3587 - 3919 183 ## 5 5 Op 1 . - CDS 4118 - 4321 132 ## gi|293194683|ref|ZP_06610051.1| toxin-antitoxin system protein 6 5 Op 2 19/0.000 - CDS 4369 - 5037 887 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 7 5 Op 3 6/0.000 - CDS 5034 - 6275 1474 ## COG4585 Signal transduction histidine kinase 8 5 Op 4 45/0.000 - CDS 6384 - 7163 1304 ## COG0842 ABC-type multidrug transport system, permease component 9 5 Op 5 . 
- CDS 7189 - 8082 1292 ## COG1131 ABC-type multidrug transport system, ATPase component 10 5 Op 6 . - CDS 8152 - 8220 70 ## Predicted protein(s) >gi|319977764|gb|AEUH01000187.1| GENE 1 106 - 1749 2171 547 aa, chain + ## HITS:1 COG:BS_gabP KEGG:ns NR:ns ## COG: BS_gabP COG1113 # Protein_GI_number: 16077698 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Bacillus subtilis # 22 350 12 338 469 247 41.0 4e-65 MTTTPDQRGATPVPAAPSTSRFKTRHLSMMALGLAIGAGFFLGTGQAIAEAGPAVLVSYA LAALIVVSVMFALAELASALPSTGSFSTYAEAGIGRWAGFTIGWLYWIMLIMVTGIEISG AAEIFVRWFPSTPRWLVALAVVVVLGAVNLMAAGRYGEVEAWLSMVKVVAIVAFLVVGVV LVGRSLVLGPLPGHEPIWTNVFGHGGFAPKHMGGIAVGLLAVITSFGGIEIITIAAAEAD DAQAAMSSAIRSVIMRILVFYVGSVVLMIALLPWDGQEMRTGAFAAILSMAGVPYVGTVM EVIIFMALISAYSANVYASSRMAYSLSARGMGWRWLLGAPQVSASAIIAADEARSAAAPG VSSADGPGAADEAGAARADAPAVPGGAARDGEEAAVAPQGPEEAGLAGLEEVLADEDHGD IARGRTPRRAVGLIIALSLISVALNWYLPETALSLLINAVGMVLLIVWTFILIAQMRLHR SLEASGRIAIRMPGWPWLGWVVLAGLAFVAGLMAWSETGRQQLVAMGGLTLIIVVVYFVR QLWSARR >gi|319977764|gb|AEUH01000187.1| GENE 2 1968 - 2636 873 222 aa, chain + ## HITS:1 COG:no KEGG:Arch_1061 NR:ns ## KEGG: Arch_1061 # Name: not_defined # Def: domain of unknown function DUF1994 # Organism: A.haemolyticum # Pathway: not_defined # 3 221 13 243 244 216 49.0 4e-55 MAVVVALLVWAFAPGVGWDLFDLGGQSGDGAAQSDGAWFQRLRALPVRGPSDEASVPEYS RDEFGQRWADTDHNGCDTRNDVLARDLARPVFKPGTHDCVVLSGTLAEPYTGSTVEFERG QDTSQLVQIDHVVALADAWRSGAWQWDGDERQEFANDMDNLLAVDGAANERKSAASADQW LPPNKAFRCEYVQRQIVVKTTYGLSVTQAERDAMAAVLGGCS >gi|319977764|gb|AEUH01000187.1| GENE 3 2861 - 3577 587 238 aa, chain + ## HITS:1 COG:no KEGG:BLA_0008 NR:ns ## KEGG: BLA_0008 # Name: not_defined # Def: hypothetical protein # Organism: B.animalis_lactis # Pathway: not_defined # 87 222 54 173 373 76 38.0 9e-13 MNTAIHWAQDCLLFGAVAVVVVGAWWLWGARAPLAQSAQPVRSARPARTVHPARSFRPAL SAGAGAGSAPAPWTWRGRRLLPLLWALYVPGLLIFTLLPLPRDPARACASIIHWDNYVPF GSLAAVVQQFQDGQTLRAVLYGLSIGLNVLLFVPLGALGEATRRGDGARGARSLWAWAGL 
GCALSCLIEFAQWTGLFGVVPCTYRVVDIDDVITNTLGTCLGAWLLPRLARKPWLMRG >gi|319977764|gb|AEUH01000187.1| GENE 4 3587 - 3919 183 110 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEALISVTVAALAVIDMAKGVDRSALAPGPGPGRLPPGHAHKDSDLRNTRRTQRLDPALV CGPVFPNSLPCVRHKGDRSKGGPTEKERLDSGGFAGRTADHCPHPFEGRG >gi|319977764|gb|AEUH01000187.1| GENE 5 4118 - 4321 132 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293194683|ref|ZP_06610051.1| ## NR: gi|293194683|ref|ZP_06610051.1| toxin-antitoxin system protein [Actinomyces odontolyticus F0309] # 1 67 1 67 67 84 82.0 2e-15 MAMTLRLDQDRARKLEELAQRSGTSKHDAVLHAIDEEFSRTTHHQRVNDALSRTVERWGD AIERLGQ >gi|319977764|gb|AEUH01000187.1| GENE 6 4369 - 5037 887 222 aa, chain - ## HITS:1 COG:BS_yvfU KEGG:ns NR:ns ## COG: BS_yvfU COG2197 # Protein_GI_number: 16080459 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Bacillus subtilis # 3 222 2 200 200 173 42.0 3e-43 MSIRVLVADDQAMIRGALASLLDLEPDIEVVAQASNGREAVEALAPAARGTGKGGAGEDG GGHDGSGAVDVAVLDIEMPIMDGITATETIRRRFPATRVLMVTTFGRPGYLQRALDAGAT GFMVKDAPVDQLADAVRRVAKGLRVVDPTLAVETLSRGTSPLTERESDVLRAVSTGGTIA DIAREMRLSQGTVRNHVSSAMLKTEARTRAEAVRIATESGWL >gi|319977764|gb|AEUH01000187.1| GENE 7 5034 - 6275 1474 413 aa, chain - ## HITS:1 COG:DRA0009 KEGG:ns NR:ns ## COG: DRA0009 COG4585 # Protein_GI_number: 15807681 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Deinococcus radiodurans # 165 367 97 300 336 85 33.0 2e-16 MTTTSTPRAAVRPISASALRDFMARHPTIHSLAWALPFLVFLAFPIGSALDEGLDTLRGA GQLTTSVLIGGAYLATWIFNPVPRQSERLTTRFTITHSVLLGAELAAFAFGHIIGVPGTF PMLSYQTSAWVLQSPRRIYPSGTAALVALTCVQASADGYPFYYGLYVLFPVIVTTMARHA IDRDNEERLVHQQALLVAKDRERLRLSGDLHDILGHSLTAIHIKAQLAARFLDAGRAEEA AAQVDDLIDMSQSALSDVRAIVAENRRLSPREELESARSLLEVARIDVIITNTGEPPAGI RSSLAGHVIREGITNAIAHAHPTRVWIALSPTGVSVTNDGYSASFSRSSSGSGTGLEGLR ERAATEGTVTWGPTGSTWCVALAFHGTGADNPPAAHEHPVIVDAPQPTAKGIL 
>gi|319977764|gb|AEUH01000187.1| GENE 8 6384 - 7163 1304 259 aa, chain - ## HITS:1 COG:DRA0008 KEGG:ns NR:ns ## COG: DRA0008 COG0842 # Protein_GI_number: 15807680 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Deinococcus radiodurans # 33 259 35 261 263 63 25.0 4e-10 MTTITISREPASGWNGAALLGWYTFWKNLTNPFSIGFAILLPIGMYFMFGTGQSYSDIWT VNGNVAATVLVSMTLYGVFLTVASLATNTALERTSGISRLYATTPLSPLANTCARICASM GIAVVVTAITYGVGAATGAKMDASAWIQTPLLILASSILASAQGLAVAFAVRSDGAFAAS SAVTVFSGFLSGMFIPINQMGSFFQTIAPYAPMYGIVNLTLLPVYGWETFKWSYVANLIA WILFFVLLAAWAQRRDTAR >gi|319977764|gb|AEUH01000187.1| GENE 9 7189 - 8082 1292 297 aa, chain - ## HITS:1 COG:DRA0007 KEGG:ns NR:ns ## COG: DRA0007 COG1131 # Protein_GI_number: 15807679 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Deinococcus radiodurans # 12 297 8 290 301 171 39.0 2e-42 MRTSTAISARGLTRRFGSVTAVDDISLTVPVGQIVALLGANGAGKTTLIDMLLGMTAPTK GTSELFGMPARDAMRRSLIGVVHQSGALPMDYTVKEAVGLFARTHPHHLPVDQILDETKL GPFAQRTIRKLSGGERQRVRLALALLPDPYLLVLDEPTAGMDATARREFWEVMRAQAAEG RTILFATHYLAEAQDFAQRTIILKAGRIIADAPTDELRQRGRTTHMSIAVPEESGPGLVE ELHNANPQWKASYEGGRVHATGRDMDDAARIALAHPGAHDISVTASTLEDVFTELTA >gi|319977764|gb|AEUH01000187.1| GENE 10 8152 - 8220 70 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no GTVKQMVKGTASAIPGPVHEGT Prediction of potential genes in microbial genomes Time: Thu May 12 18:31:38 2011 Seq name: gi|319977759|gb|AEUH01000188.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00188, whole genome shotgun sequence Length of sequence - 6035 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 - CDS 92 - 2008 833 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 2 1 Op 2 49/0.000 - CDS 2005 - 2973 1399 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 3 1 Op 3 21/0.000 - CDS 2966 - 3892 1600 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 4 1 Op 4 . - CDS 4181 - 5785 2849 ## COG4166 ABC-type oligopeptide transport system, periplasmic component 5 1 Op 5 . - CDS 5798 - 5884 172 ## - Prom 5904 - 5963 5.8 Predicted protein(s) >gi|319977759|gb|AEUH01000188.1| GENE 1 92 - 2008 833 638 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 37 630 2 524 563 325 36 5e-89 MSDNTNYTQPLTDEQMEAAVERAVQAGTASAPSDVHPDDANPLLKITDLEVAFTSSTGVV PAVRGANLTIYPGQTVAIVGESGSGKSTTAAAVIGLLPGTGKVTGGTIEFDGQDITHLST KEWVRLRGSGIGLVPQDPMTNLNPVLRIGTQVKEALKANNVVPSSEIGARVAALLEEAGL PDAERRAKQYPHEFSGGMRQRALIAIGMAAHPKLLIADEPTSALDVTVQRRILDHLGKLT SEMGTAVLFITHDLGLAAERAEQLVVMHRGRIVESGPALEILQHPQHPYTKRLVSAAPSL ASARIESAHKRGVQVTEEELTGAGMGSTSTEEIIRVEHLSKEFDIRGAKGEAKILKAVDD VSFVLRRGTTLAVVGESGSGKSTAANMVLQLLTPTSGKIFFDGQDTANMSEGELFRLRRR LQAVFQNPYGSLDPMYSIYRLIEEPLKIHGYGTPAYAKQEYDRAQATGREPEEWISKLMD ASSGAVMSSAAKQELNPKRLRIARVSELLEMVALPRSAMRRYPNELSGGQRQRVAVARAL ALNPEVIVLDEAVSALDVLVQNQILYLLNDLQAQLGLSYLFITHDLAVVRQIADDVIVME KGKLVEANTTDELFHNPVEDYTRELIEAVPGRNIQLNL >gi|319977759|gb|AEUH01000188.1| GENE 2 2005 - 2973 1399 322 aa, chain - ## HITS:1 COG:Cgl1946 KEGG:ns NR:ns ## COG: Cgl1946 COG1173 # Protein_GI_number: 19553196 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport 
systems, permease components # Organism: Corynebacterium glutamicum # 14 322 35 343 344 421 69.0 1e-117 MSDMIPSANITRTRANQEHYVSDIDETGLGAVDAVPDEGAPSSMWGEAWKRLRKRPIFWF AAIIIAATILISLFPGLFTGQDPAYCVLERHNGPAASGHPFGFDKQGCDIYARVIYGARA SVSVGILTTIAVVIIGATIGALAGYFGGWLDALLSRITDVFFAVPLLLAAIVFMQMFKES RSIMMVVTVLAAFGWTQIARITRGAVMTAKNEEFVTAARATGASRARILVSHIIPNSMAP IIVYATVALGTFIVSEASLSFMGIGLPSSVISWGADISAAQGSLRDAPMNLFYPGMALAL TVLSFIMMGDAVREALDPKARK >gi|319977759|gb|AEUH01000188.1| GENE 3 2966 - 3892 1600 308 aa, chain - ## HITS:1 COG:Cgl1945 KEGG:ns NR:ns ## COG: Cgl1945 COG0601 # Protein_GI_number: 19553195 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Corynebacterium glutamicum # 1 307 1 307 308 407 73.0 1e-113 MLRYIGRRLLQTIPVFFGATFLIFAMVYLMPGDPVAALGGDRGLSESAAAQIRAQYNLDK PFWMQYLLYLKGVFTLDFGKAFNGQPVIDLIAGAFPVTIRLALYAMAIETILGITLGVIA GVKRGGWFDSTVLVLSLALISVPTFVLGFVLQFFLGVKLGWLPTTASNSVTFESLTMPAM VLGGVSLAYVIRLTRQSVSQNVSADYVRTARAKGLPGGVVMTRHILRNSLIPVATFLGGD LGALMGGAIITEGIFNIHGVGGTLWNAIIKGEPQTVVSVTTVLVLVYIIANLLVDLLYAV LDPRIRYE >gi|319977759|gb|AEUH01000188.1| GENE 4 4181 - 5785 2849 534 aa, chain - ## HITS:1 COG:Cgl1944 KEGG:ns NR:ns ## COG: Cgl1944 COG4166 # Protein_GI_number: 19553194 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Corynebacterium glutamicum # 39 533 44 533 534 515 56.0 1e-146 MNLKKAWLAVPAAAALALTACSGGGGGSNASGGSSDGIVTVNGSEPQNTLLPGMTSENGG GRILTALFSGLVRYDNSGKTVLDVAESIEPNEDNTVWTIKLHNDRKFSDGTPVKASNFIG AWNLVTSEKQVQAAFFDFFQGTDDEGNGEITGLEASDEYTFTATLKQPTADFRDRLGYSA YAPLPDSTLADPDNGGEHPVGNGPYKVDGENAWEHNVQIKMVPNENYVGDTPAKNKGLTM VFYTSLDAAYQDLLSNNLDVLDAVPNAAFSTYESELKGRTANQPYAGIQCFTIPQWLPHF TGEEGRLRRQAISMAVDRDSITKTIFNGTRTAATDFTSPTLPTYSDKLSGNEVLAYNPDK AKELWAQADAITPWDGTFKLSYNSDGGHKEWVDATVNSIKNTLGIDAEGNPYPDFKSLLA 
AEDDQSITGAFRAGWMADYPNPYNFLQPLYQTKAASNKGDYSNTEFDSLMTQASSAATPE DSAKFLQQAQEILLTDLPSIPLWYPNTNGGWSANVDNVSFDWHGQAVYQDITKK >gi|319977759|gb|AEUH01000188.1| GENE 5 5798 - 5884 172 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDIHRFNHARDDTLITSPAVSHQTAPST Prediction of potential genes in microbial genomes Time: Thu May 12 18:31:46 2011 Seq name: gi|319977747|gb|AEUH01000189.1| Actinomyces sp. oral taxon 178 str. F0338 contig00189, whole genome shotgun sequence Length of sequence - 9885 bp Number of predicted genes - 15, with homology - 9 Number of transcription units - 7, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 - CDS 1 - 877 1191 ## COG1732 Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) 2 1 Op 2 3/0.000 - CDS 945 - 1664 1017 ## COG1174 ABC-type proline/glycine betaine transport systems, permease component 3 1 Op 3 24/0.000 - CDS 1661 - 2329 774 ## COG1174 ABC-type proline/glycine betaine transport systems, permease component 4 1 Op 4 . - CDS 2326 - 3156 299 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 5 1 Op 5 . - CDS 3153 - 3602 594 ## gi|293194669|ref|ZP_06610037.1| transcriptional regulator, MarR family 6 1 Op 6 . - CDS 3687 - 4256 154 ## 7 2 Tu 1 . + CDS 4153 - 4284 344 ## + Term 4462 - 4509 17.7 8 3 Tu 1 . - CDS 4204 - 4446 185 ## 9 4 Tu 1 . - CDS 4894 - 6255 696 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 10 5 Op 1 . + CDS 6383 - 6841 500 ## gi|154508425|ref|ZP_02044067.1| hypothetical protein ACTODO_00922 11 5 Op 2 . + CDS 6862 - 6951 135 ## + TRNA 6939 - 7023 62.5 # Ser GGA 0 0 12 5 Op 3 . + CDS 7027 - 7227 57 ## 13 6 Op 1 . - CDS 7214 - 7927 824 ## 14 6 Op 2 . - CDS 8371 - 8937 631 ## COG2606 Uncharacterized conserved protein - Prom 9139 - 9198 75.4 + TRNA 9122 - 9194 74.9 # Arg ACG 0 0 - Term 9105 - 9174 19.3 15 7 Tu 1 . 
- CDS 9342 - 9884 514 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases Predicted protein(s) >gi|319977747|gb|AEUH01000189.1| GENE 1 1 - 877 1191 292 aa, chain - ## HITS:1 COG:MT3866 KEGG:ns NR:ns ## COG: MT3866 COG1732 # Protein_GI_number: 15843379 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) # Organism: Mycobacterium tuberculosis CDC1551 # 7 292 41 332 343 147 35.0 2e-35 MRRKHAALAVVAAGALALSACGSASSLEGSAPSGEALTIGSQDYYSNEIIAEAYAQALEG AGFAVNRQMRIGQREVYMPEVAKGTIDVFPEYTGNLLQYLKADATETTSEDVYRALAEAM PEGLRALDQAGATDQDSYVVSEEFAQLHSLTSIGDLAGAGTVVLGGNSELETRPYGPEGL ASVYGVTATFTPIEDSGGRLAAKALRDGTVQVADIYSSNPVLAEGGLRVLQDPEGLFLAS HVVPIVSSKVDERAAAVINSVSAALTPEALIEMNRQSTAEQKSAADIARAWL >gi|319977747|gb|AEUH01000189.1| GENE 2 945 - 1664 1017 239 aa, chain - ## HITS:1 COG:lin1466 KEGG:ns NR:ns ## COG: lin1466 COG1174 # Protein_GI_number: 16800534 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, permease component # Organism: Listeria innocua # 26 216 18 209 218 120 35.0 2e-27 MSVLIEALQWLADPGHWGGETGIVWRTAQHLGITVLVVALAMALAVPVGTWIGHTGRGRW LVSATGAARAVPTLGVLTLTGLALGIGLAAPAVALLVLALPPMLAGTYSGIASTDRTTVD AARAIGLTEWQVVTQVEVPHAMAIIMAGVRGAVLQVIATATLAAYTADYGLGRFLYAGLK TRDYAQMIGGSLVVVVLALVVDAALGALQRRIDSRGAPDPDPSSAAAPSAQTDPGAPAH >gi|319977747|gb|AEUH01000189.1| GENE 3 1661 - 2329 774 222 aa, chain - ## HITS:1 COG:MT3864 KEGG:ns NR:ns ## COG: MT3864 COG1174 # Protein_GI_number: 15843377 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, permease component # Organism: Mycobacterium tuberculosis CDC1551 # 1 207 1 207 229 107 32.0 2e-23 MTWLMSNWGLVGELALTHLAIAVPAIVCSVLVAVPVGWCAARSKRVGPPVLAVLSAMYAV PSLPLLIIVPVVVGVSYRSPVNMVLILTLYGVAVLVRQSAEGFSAIERATLRSATACGFG TARRFWQVELPLAAPVIVAGTRVVVTSTVSLVTIGAFVGVRSLGTLFTDGFQRGLVAEVV 
VGLAATIALALGIDALVAGIGRAATPWTRAGAAADDEQGARA >gi|319977747|gb|AEUH01000189.1| GENE 4 2326 - 3156 299 276 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 237 1 239 245 119 32 7e-27 MIRFDAVSKTYPGGTVAVEDFSLRIPERTTTVFLGTSGCGKTTLMRMVNAMVAPTSGRVL VRGQDVAEQDPVALRRSIGYVLQEGGLFPHRTIADNIATVPLLEGAPRRQARARALELMA LVGLDRDMAGRYPAQLSGGQRQRVGVARALANEADILLMDEPFGALDPLVRADLQAELIR IREQLGTTILFVTHDIDEAFLLGDQIAVLRTGGRVAQVGTAQELLSHPADDFVADFVGAS RAARPVRIDHRRGGLVVDEAGTPVGTLREPDRGARP >gi|319977747|gb|AEUH01000189.1| GENE 5 3153 - 3602 594 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293194669|ref|ZP_06610037.1| ## NR: gi|293194669|ref|ZP_06610037.1| transcriptional regulator, MarR family [Actinomyces odontolyticus F0309] # 10 149 7 142 145 95 46.0 1e-18 MSEPEELDASLMASLVIAQIAAWNAIERALSASRVPLSYGRYLVLLTITGALGRARIQDV ASSQRITVGAASRLVDRLCADGLVQRHPSPTDRRVALLSVTDEGARRLRDAQGIVEAETR RVFSPLDPHQRASLSAMLTKVTLTAEATR >gi|319977747|gb|AEUH01000189.1| GENE 6 3687 - 4256 154 189 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPHFASVITLAKWRYPFENSASLENCPRTLPRAGTVLRIPPAALAKQPPHPPQGRDFPLK PRHHLQNCWRNCPAAKWHGSTPPGRGGGAHRLRDGRPANSSAGTDPHQGRDRTGAAPIGA GNRSLGQAIRLPQCAVPYPNAVHWAREGAPAPVEPQSTGCLPLKRRQGRPRPRVGKPPEP PAPTNLQIQ >gi|319977747|gb|AEUH01000189.1| GENE 7 4153 - 4284 344 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPALGRVRGQFSSEAEFSKGYRHFASVITLAKCGIAVKAEASL >gi|319977747|gb|AEUH01000189.1| GENE 8 4204 - 4446 185 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQLAAPGRALPEARRARPHLAQERLSPRASDSGATERIEPGPRAPEPGRSTGQFSQRRLR FYSDATLCKRYNTGKVAVSL >gi|319977747|gb|AEUH01000189.1| GENE 9 4894 - 6255 696 453 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 30 410 9 414 447 272 38 6e-73 MSAQKSSIFTWRTHGDGTSIQPGDVVLPGERLSWPRTIGIGAQHVVAMFGATFLVPLLTG FPPSTTLFFTAVGTLLFFLITAGRLPSYLGSSFALISPILAVSQTLGADYALGGIIATGA TLALVGAVVHFAGVRWIDAVMPPVVTGAIVALIGLNLAPAAWNWVKEGPVTAVVTIVAVC 
LVTVLFKGIIGRLSILIGVLVGYAAAIVQGEVDFTAVGSAAWIGLPQFHAPAFAVSTLGL FVPVVFVLVAENVGHVKSVSAMTGENLDGLTGRALVADGVSTMLAGAGGGSGTTTYAENI GVMAATRVYSTAAYVIAAFFAFALSLMPKFGALIATIPHGVLGGAGTVLYGMIGMLGVRI WVQNRVDFSNPVNLNTAAVSLIVAIADFTWTVSGMAFGGIALGSAAAILVYHGMRAVSKL RGTNLEAASPASAPSGSELEGEAYAKRHRAASD >gi|319977747|gb|AEUH01000189.1| GENE 10 6383 - 6841 500 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508425|ref|ZP_02044067.1| ## NR: gi|154508425|ref|ZP_02044067.1| hypothetical protein ACTODO_00922 [Actinomyces odontolyticus ATCC 17982] # 53 150 49 148 148 70 45.0 2e-11 MIRAELWDPWGVSIPNAPIAAPPDYPGSHEAGAAWDPYGPYAGAPGGGYAVLGEVFPGTE GVDAAAVFAMGGYTAPKPIANPVATVALWLSVLAMPLLGLFCPVTLVLGIVGLARSSGLP GQVGRHESAAALVLSGGALAWWIFMYAIVFRW >gi|319977747|gb|AEUH01000189.1| GENE 11 6862 - 6951 135 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPPAALPRGFAFGGRFRLLLFGTMCTLAS >gi|319977747|gb|AEUH01000189.1| GENE 12 7027 - 7227 57 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFSPGTATESGSGAFLRQHPVSTSSAGGAGPGRAGARRRFPSQVLARWRQPVPSSARGPS DPFRGG >gi|319977747|gb|AEUH01000189.1| GENE 13 7214 - 7927 824 237 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAAQDALADKSAAVQNARNKVTMLLDRLDRQKLSPEQLDYVDSVPASLEQICTAFAAEEP ECARRAAEEVQAVRDSVSGATAVGLILPPTLFISGIFIPPFPLSFALASVTGIVVLIVCY TALLGQTTRMQQVSARAWGPANAAINAIGWRNPVTGVNCGHLRNVEELFLATASDAARLT LIQEHQLETQAAQFNEMQRQHIVLEEQLRSAQIHRTTVALQAQRAMMHSSITPINRP >gi|319977747|gb|AEUH01000189.1| GENE 14 8371 - 8937 631 188 aa, chain - ## HITS:1 COG:PA2301 KEGG:ns NR:ns ## COG: PA2301 COG2606 # Protein_GI_number: 15597497 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pseudomonas aeruginosa # 13 184 3 175 180 70 34.0 2e-12 MTDHTDIPGIGTLRPVPALDRLDLVAPPVARALTALAAQDPALASSALVAPIDPDLADTE TMTRAFGMDLALSSNCILVAGKRAGEERVAACVVRATTSADVNHVVKRLLDVRKASFWPQ DRAVEASGMEYGGITPVGVPPQWRLLIDSRCASGWSCIGSGLRASKLFVPGELLAALPAA EVVDGLGA >gi|319977747|gb|AEUH01000189.1| GENE 15 9342 - 9884 514 180 aa, chain - ## HITS:1 COG:Cgl0237 KEGG:ns NR:ns ## COG: Cgl0237 COG0008 # 
Protein_GI_number: 19551487 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Corynebacterium glutamicum # 1 123 125 249 293 120 56.0 1e-27 RASRREALAALGRAPALRLAAPRAQWTVTDTLHGQYTGPVDQFVVRRADGVPAYNLASVI DDAFQGVDQVVRGDDLLAQAPGQAQLASLLGLAQPVYAHVPLAVSESGARLAKRDGAVTL ADLADLGWGPADAVGLIGESLGVRGARRAADIADALGDRGLEGIPARPWVVVPPTGRRPH Prediction of potential genes in microbial genomes Time: Thu May 12 18:32:42 2011 Seq name: gi|319977732|gb|AEUH01000190.1| Actinomyces sp. oral taxon 178 str. F0338 contig00190, whole genome shotgun sequence Length of sequence - 12096 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 9, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 379 273 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 2 2 Op 1 . + CDS 496 - 936 440 ## gi|293194565|ref|ZP_06610027.1| CrcB protein 3 2 Op 2 . + CDS 945 - 1319 315 ## + Term 1388 - 1422 2.3 4 3 Op 1 32/0.000 + CDS 1491 - 2510 1427 ## COG1135 ABC-type metal ion transport system, ATPase component 5 3 Op 2 22/0.000 + CDS 2507 - 3220 1135 ## COG2011 ABC-type metal ion transport system, permease component 6 3 Op 3 . + CDS 3340 - 4185 1445 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen + Term 4225 - 4291 26.5 7 4 Tu 1 . - CDS 4468 - 5043 913 ## COG3477 Predicted periplasmic/secreted protein 8 5 Tu 1 . - CDS 5719 - 6357 1031 ## COG0035 Uracil phosphoribosyltransferase 9 6 Op 1 . + CDS 6381 - 7754 973 ## COG0590 Cytosine/adenosine deaminases 10 6 Op 2 . + CDS 7754 - 8242 193 ## PROTEIN SUPPORTED gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 + Term 8364 - 8394 1.7 11 7 Tu 1 . - CDS 8340 - 8411 94 ## 12 8 Tu 1 . + CDS 8410 - 8865 461 ## CHU_2753 hypothetical protein + Term 9074 - 9107 1.5 13 9 Op 1 5/0.000 - CDS 9043 - 9519 758 ## COG0315 Molybdenum cofactor biosynthesis enzyme 14 9 Op 2 . 
- CDS 9516 - 10853 1218 ## COG0303 Molybdopterin biosynthesis enzyme 15 9 Op 3 . - CDS 10878 - 11918 1305 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases Predicted protein(s) >gi|319977732|gb|AEUH01000190.1| GENE 1 1 - 379 273 126 aa, chain - ## HITS:1 COG:Cgl0237 KEGG:ns NR:ns ## COG: Cgl0237 COG0008 # Protein_GI_number: 19551487 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Corynebacterium glutamicum # 5 125 2 120 293 152 62.0 2e-37 MTTPAGRFAPSPTGDLHLGNLRTAILAWAAARLSGRRFVMRIEDIDRERAGSAQRQLDDL EAIGVDWDGEPLVQSQRAHAHRAALEELASRGLVFECYCSRRDIREAASAPHVPPGHYPG TCARLS >gi|319977732|gb|AEUH01000190.1| GENE 2 496 - 936 440 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293194565|ref|ZP_06610027.1| ## NR: gi|293194565|ref|ZP_06610027.1| CrcB protein [Actinomyces odontolyticus F0309] # 8 140 7 134 154 88 52.0 1e-16 MDEHRMPPEWACVAAGGALGTLARAGLDSAAAARPVGGLFAPWSVLVANVVGALALGALA AFTAGRSAAFPERARALARLKLFAGTGFAGAFTTYGGYAAWTVSSAAQWSARAHGAALAL AQSAGLLAVGAVAAWAGHRLVSGAQR >gi|319977732|gb|AEUH01000190.1| GENE 3 945 - 1319 315 124 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWAALACAGALGALARWRLDGWARSATANAAPLLGKCGIVAVNVVGSAAAGLLLAWSASP AAHVASTGFLGGFTTFSTALVDAVDLWRGGRRASAVVLAVGTWAASVAAAAAAVWAASPG ALGV >gi|319977732|gb|AEUH01000190.1| GENE 4 1491 - 2510 1427 339 aa, chain + ## HITS:1 COG:BH3481 KEGG:ns NR:ns ## COG: BH3481 COG1135 # Protein_GI_number: 15616043 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Bacillus halodurans # 1 250 1 254 338 265 54.0 1e-70 MIHLTGVRKVYPVKGGNTVVALDGVNLHVARGSIHGIVGRSGAGKSTLIRCLTALERPTE GSVVVDGLDLATLSGKALRDARRRIGMVFQAANLLDARTASDNIGYPLKLAGVPKEQRRE RVEELLDLVGLAGRGGSYPSQLSGGQRQRVGIARALADDPAVLLCDEPTSALDTESTAQI LALLKQVRDIAGVTVVVITHEMAVVREICDSVTLLDHGAVAQTGTIAEVVSDPASPLARE LVPTPAVEHVPAGADGSPGPGASGTVLLDVVFTSHPGVPTGATVLHLASSMGADVTAGTF ESIGDIQVGRLALTVPSYHADSIIEQLRKNSIHAEVRDR 
>gi|319977732|gb|AEUH01000190.1| GENE 5 2507 - 3220 1135 237 aa, chain + ## HITS:1 COG:FN0659 KEGG:ns NR:ns ## COG: FN0659 COG2011 # Protein_GI_number: 19703994 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Fusobacterium nucleatum # 20 230 14 223 233 156 46.0 4e-38 MNAVAHLAVLPAGKGDPTWLDNPALTTQFLPAIWETLFMVGLSTLCSVIIGMGIGLLLVQ TGKNGLTPNKPVYQGVSVVVNVIRSLPFIIGIIMLIPLTKALVGKSSGSMATVVPLIILS APFFARLVETNLLAVDHGKIEAAQMMGASNRQIRWGVLVREALPSLVQSITTLAITLIGY SAMAGAVGGGGLGQMAISYGYNRWQDDVMISTVLAIIMIVQVIQMIGDAISRLVDHR >gi|319977732|gb|AEUH01000190.1| GENE 6 3340 - 4185 1445 281 aa, chain + ## HITS:1 COG:DR1359 KEGG:ns NR:ns ## COG: DR1359 COG1464 # Protein_GI_number: 15806376 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Deinococcus radiodurans # 39 281 15 256 256 191 44.0 1e-48 MRTLRTILATGLTAVVAVGLAACGGSSQNGGSNGSGSDSSGAVTLKVGATPAPHAKILTY INDNLAAEAGIKLDIVEYTDYNQPNRALNDGELDANFYQTVPYLENAEREFDYDFTAGAG IHLEPLAIFSNKHKSLSELPDGGTIAVISDASNQSRALELLAKQGLVQLPADGSDASVAN VTTLKDFTFKEVEGPQLVRSLDDFDFAVINGNFAQEGGLNIADQALVIESPENNPAVNVL VWKTANDKQEAIDKLEQLLHSDKVKQYIEQTWTDGSVIPAF >gi|319977732|gb|AEUH01000190.1| GENE 7 4468 - 5043 913 191 aa, chain - ## HITS:1 COG:SA0175 KEGG:ns NR:ns ## COG: SA0175 COG3477 # Protein_GI_number: 15925885 # Func_class: S Function unknown # Function: Predicted periplasmic/secreted protein # Organism: Staphylococcus aureus N315 # 16 175 2 160 164 181 52.0 9e-46 MCGEASPMALIQTDPTHRKYGVAALVGVLGGFVSAIIKFGWEVPLPPRTPERNATNPPQA LLELLGMSPQTTHASYIFNGNEGLPWMSFLVHFAFSIGFAVVYCVLAERFPQVKLWQGAA FGILVYVGFHVVFMPAIGIVPAPWNQPFAEHFSEFFGHIIWMWVIEVFRRDLRNRITRGP DAEVPVAEPAA >gi|319977732|gb|AEUH01000190.1| GENE 8 5719 - 6357 1031 212 aa, chain - ## HITS:1 COG:BS_upp KEGG:ns NR:ns ## COG: BS_upp COG0035 # Protein_GI_number: 16080742 # Func_class: F Nucleotide transport and metabolism # Function: Uracil 
phosphoribosyltransferase # Organism: Bacillus subtilis # 2 209 3 207 209 202 49.0 3e-52 MRIHVADHPLIAHKLTVLRDRQTPSATFRQLVDELVTLLAYEATREVAVTETAINTPVAP TVGRKLSEPRPIVVPVLRAGLGMLEGMTRLLPTAEVGFLGMRRDDDTLEIETYANRLPDD LSGRQCYVLDPMLATGHTMVAATDYLFERGARDVTCVCVLAAPEGIETLEKAVGDRADVT VVVAAVDECLNDKSYIVPGLGDAGDRLYGIVD >gi|319977732|gb|AEUH01000190.1| GENE 9 6381 - 7754 973 457 aa, chain + ## HITS:1 COG:PM0078 KEGG:ns NR:ns ## COG: PM0078 COG0590 # Protein_GI_number: 15601943 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Pasteurella multocida # 18 167 3 152 166 149 46.0 1e-35 MKFPIARIAATLGHVTLTDRYREAMGKALYLADRARETGDVPVGAVVLDTLGRVVGKGWN CREKDRDPAGHAEIVALRDAARTLKRWNLVGCTLVVSLEPCTMCAGAIVSARVDRVVFGA WDPKAGAAGSVRDVLRDSRLNHQVEVVGGVLGHEAAMQLRSFFAGKRPAAQMPVMSAPSR VVEPGPVDEPVVEASAPEEVSDARANGAPSVLQVPGAPPAPRRADSRRGDSRRAEGGEHR RADQSRSGAAHRASSPSSAPAALGPVTRSVPAIRPVRVQPVASVPAGLHTPAQPKPLSLG DDLPPRRASASASSVQARGEGNRPRPTAAAERPAPKASATGAPGAGKAEGVSTPPRPPAS AQRSRVVEPPRPPAAPQSRQEAPEPPSFTQRAVRSPRPTRTQDRPLPQAFQPERPQRPQR IEYELSSRQFSDADPVTAGIRVRRPVRKTRDASNARH >gi|319977732|gb|AEUH01000190.1| GENE 10 7754 - 8242 193 162 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 [Burkholderia pseudomallei 305] # 7 158 4 171 194 79 35 2e-14 MGLAARVITVSDRCARGEAEDRSGPLALSLLAGHGISADLALVPDDIPAIRAAIRSACEA GARLVLTTGGTGVSPRDNTPEATAPLLAVRIDGLADAVRRRGEASVPAACLSRGLVGVTE RGPGGVLVVNAPGSCGGVADAVAVVGPLAAHIIDQLDGGDHG >gi|319977732|gb|AEUH01000190.1| GENE 11 8340 - 8411 94 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRSSLNGGTKAPALSAASAIAGE >gi|319977732|gb|AEUH01000190.1| GENE 12 8410 - 8865 461 151 aa, chain + ## HITS:1 COG:no KEGG:CHU_2753 NR:ns ## KEGG: CHU_2753 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 1 147 44 195 196 144 47.0 1e-33 MTWIKPSFLWMMYRSGWGTKPGQERTLAVRITREGFEAALADACLSHFDPDVHSTRQAWS 
AAVAACPNRIQWDPERNASGDPLAHRSIQIGIGPARVGAYAHEWVVGISDVSDLVEALRL APARLGELGPRERPYPLPAGLAQRIGASGPF >gi|319977732|gb|AEUH01000190.1| GENE 13 9043 - 9519 758 158 aa, chain - ## HITS:1 COG:MT0887 KEGG:ns NR:ns ## COG: MT0887 COG0315 # Protein_GI_number: 15840277 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Mycobacterium tuberculosis CDC1551 # 4 155 15 166 167 127 51.0 7e-30 MTAFTHLDESGAARMVDVTAKQPTVREASASARVDVSPRVMGALRTGAVPKGDVLAVARI AGIGAAKRVPDLLPLAHTIGVHGCEVGLSLEEDHVAITATVRTADRTGVEMEALTSVTVA ALAVIDMVKGVDRSARIRDAKITRKSGGRSGEWVRPAD >gi|319977732|gb|AEUH01000190.1| GENE 14 9516 - 10853 1218 445 aa, chain - ## HITS:1 COG:Cgl1170 KEGG:ns NR:ns ## COG: Cgl1170 COG0303 # Protein_GI_number: 19552420 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Corynebacterium glutamicum # 14 356 14 346 403 171 38.0 3e-42 MRDPELTIPEHRDLVEALAQPLPAVEAPLESCAGAVLAEDVRAAVPVPPFTNSAMDGFAL RFDDIAGLAAPVALPVLGDIPAGDAVPRECRPGAAWRIMTGAPMPDGADTVVRVEATDHS PGVAEAPGAVTISRLPERGADVRHQGEAVEPGDLVVRAGAVLRGQELAAAASVGHGALRV HPRPRVLIVTTGAELAAAGDGLAHGQIPDSNGFLLRGLVEEAGGAVAAHLRTGDSPGQLR DALDRAPQADLVITAGGISQGAHEVVRRALGADAGFHRVAQQPGGPQGVGRTRVGSGEAP VICLPGNPVSVFVSFHVYVARALAVMARRLPKRRGITAPRTAPAVAAASWRSPRGKTQFI PLRAASADEAGAAGLSAPSAAPPVPPVAIPVVPVHALGSRSHLVASLPGADFIGVVPPAT TRVGPGDQLDVIDCSHRDAAAGGAR >gi|319977732|gb|AEUH01000190.1| GENE 15 10878 - 11918 1305 346 aa, chain - ## HITS:1 COG:RSc0624 KEGG:ns NR:ns ## COG: RSc0624 COG1063 # Protein_GI_number: 17545343 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Ralstonia solanacearum # 6 346 1 338 338 268 45.0 8e-72 MRAVVMEAPGRIVIEDIDAPKVLDPTDAVIELAATCICGSDLWPFRGAQPVDHQYMGHEY IGEVVETGPAVKKVAPGDFVVGSFCISCGECEICRAGYPSRCPVAAAAGDPFIGARSNGT QAEFARVPFADGTLVKTPAPPSPELIPHLLAASDVLGTGWFAADSAGAGPGTTIVVIGDG 
AVGLGAIIGAKQLGAERIIAMSRHPQRQALAREFGATDIVEERGEAGVARVREMTGGYGA HGAVEAVGNETAFQQALGCVRPGGHLGFVGVPHGVSLDMGQMFGAEVHIFGGPAPVREYL PQMIDLIYKGEIFPGKVFDKVLPLADAAEGYAAMDQRSATKVLLTV Prediction of potential genes in microbial genomes Time: Thu May 12 18:33:04 2011 Seq name: gi|319977728|gb|AEUH01000191.1| Actinomyces sp. oral taxon 178 str. F0338 contig00191, whole genome shotgun sequence Length of sequence - 1953 bp Number of predicted genes - 3, with homology - 1 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 253 340 ## 2 1 Op 2 . + CDS 268 - 1815 1305 ## SACE_4812 rhs protein + Term 1909 - 1946 -0.2 - Term 1619 - 1656 -0.1 3 2 Tu 1 . - CDS 1794 - 1952 59 ## Predicted protein(s) >gi|319977728|gb|AEUH01000191.1| GENE 1 2 - 253 340 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no GAAKTALGAADATGGGASGGLGILIGPLVGPLLAAGRLAARVCLEAAADKADASADHLEQ SAALMQAVEDSIGDDLKRIGEQL >gi|319977728|gb|AEUH01000191.1| GENE 2 268 - 1815 1305 515 aa, chain + ## HITS:1 COG:no KEGG:SACE_4812 NR:ns ## KEGG: SACE_4812 # Name: rhsA # Def: rhs protein # Organism: S.erythraea # Pathway: not_defined # 58 380 3 324 1544 140 31.0 1e-31 MSGAEALDTRPSTGADTGAGRPGPGTPGASDALHSGSPAGTASAPGGAPPSSASPSLNPL VAGREESRSAVSGSGVVEDIWGLAHGLSEGSWLETGLSSVSLVADAVGVGVDPLGTLIAW GAGWLIDHFGPLKSWMDELLGDADSVRADAATWSNVAWAMGECADTLEGDEKNLMGEQVG ATARGYRASNADTISALRTASGAADAMGKATSVLAEVVGVVHDLLRDAISAIVGTLASAI IEAIATFGLAIPLIIAQVQVKVGAKATEMAAHVTGALKSARTLAHNLTHLSGLLDMLRGM LTRTKNTAANAAHMLTGAKPAAAAAGGATPHGPITGTTAPASAHTGTGTGHTARHTPPPA PDTPTPDIPPTTPKPKRSKKPKTEHWKYQDRASDGTWAKGNGGASTYRESEKKGIQRYTA IAEKSGAPLKRVIDDRKYLASITGCDHGRVYDAIVQLPDETYCGLEIKSGSAKLTTRQAE FDALVKPDNPARVKLDDGTIVEITSTDEFHVERQE >gi|319977728|gb|AEUH01000191.1| GENE 3 1794 - 1952 59 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no LEPVQILRSGQARRNSFSKGLSVLRPSITVHSTRINVYRPQHISAHYSCLST Prediction of potential genes in microbial genomes Time: Thu May 12 18:33:26 2011 Seq name: gi|319977718|gb|AEUH01000192.1| 
Actinomyces sp. oral taxon 178 str. F0338 contig00192, whole genome shotgun sequence Length of sequence - 12045 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 147 208 ## gi|154508770|ref|ZP_02044412.1| hypothetical protein ACTODO_01279 2 1 Op 2 . + CDS 172 - 696 509 ## 3 2 Tu 1 . + CDS 1138 - 2202 1357 ## COG2896 Molybdenum cofactor biosynthesis enzyme 4 3 Op 1 10/0.000 + CDS 2431 - 3786 2307 ## COG2223 Nitrate/nitrite transporter 5 3 Op 2 13/0.000 + CDS 3878 - 7627 5651 ## COG5013 Nitrate reductase alpha subunit 6 3 Op 3 12/0.000 + CDS 7627 - 9177 2072 ## COG1140 Nitrate reductase beta subunit 7 3 Op 4 12/0.000 + CDS 9174 - 9863 822 ## COG2180 Nitrate reductase delta subunit 8 3 Op 5 . + CDS 9868 - 10641 1023 ## COG2181 Nitrate reductase gamma subunit 9 4 Tu 1 . + CDS 10907 - 11971 1091 ## PPA0506 putative regulator Predicted protein(s) >gi|319977718|gb|AEUH01000192.1| GENE 1 1 - 147 208 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508770|ref|ZP_02044412.1| ## NR: gi|154508770|ref|ZP_02044412.1| hypothetical protein ACTODO_01279 [Actinomyces odontolyticus ATCC 17982] # 1 48 54 102 102 63 69.0 5e-09 EWEKGEDSFTIHVHPEEVFTGEQAAPVFRDYIVGGILPPAELLRPLDI >gi|319977718|gb|AEUH01000192.1| GENE 2 172 - 696 509 174 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGPEFRAPAVERFMEIVRAWAYHPWPMSVQDGIDVYTNLGFKGDPQDPELFTSDMSPAKP DSYLVSTDGIIDESVLCLSAWLPEEYSESSRPRARAYYESFVRAFHEVLGDTRDSEEGED YSSTLWVANERVSVSLEGMPTLILLTVESPQQTEYTLDELEAEANGEEIGDDYF >gi|319977718|gb|AEUH01000192.1| GENE 3 1138 - 2202 1357 354 aa, chain + ## HITS:1 COG:Cgl1171 KEGG:ns NR:ns ## COG: Cgl1171 COG2896 # Protein_GI_number: 19552421 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Corynebacterium glutamicum # 23 354 2 337 337 376 58.0 1e-104 MPLELQSPGSARLGAPSTARPGLVDRFGRVARDLRVSLTDRCNLRCSYCMPPEGLDWLPS 
EATLTDGEVVRLVRIGVEELGIRQVRFTGGEPLLRRGLEGIIAGCAELTTDQGRAPDLAM TTNALGLARRARSLRAAGLGRVNISLDSLDPALFAAITRRDRLGDVLAGIDAAIEAGLAP VKVNTLVLRGVNEAGLPDLVDYCLERGIELRVIEQMPIGPPDTWDRRRILTVDEILHIMS ARHELSPLPREDPHAPAARWMVDGDPRRTVGVIASVSEPFCGACDRTRLTSDGQIRSCLF STEEFDLRARLRGGASDQELARAWADAMWAKPAAHGLDTDAFARPSRTMSRIGG >gi|319977718|gb|AEUH01000192.1| GENE 4 2431 - 3786 2307 451 aa, chain + ## HITS:1 COG:Cgl1164 KEGG:ns NR:ns ## COG: Cgl1164 COG2223 # Protein_GI_number: 19552414 # Func_class: P Inorganic ion transport and metabolism # Function: Nitrate/nitrite transporter # Organism: Corynebacterium glutamicum # 15 451 10 445 445 465 58.0 1e-131 MSANDDGAKTASRYVLKDWNPNDDSKWDSALAWRTLWITTYSMILAFCVWFLPSAIAPKL TLLGFDLDKSQLYWLTALPGLAAGILRLVYMFLPPLIGTRKLVGITSLLCILPMLGWFYA VQDNTTPYWVLLALAFMCGIGGGSFSGYMPSTGYFFPKRLSGTALGLQGGIGNLGMSVIQ LVGPVLMGFGLFGMTWLAPQTHVTGDHAGEQIWVYNAAVFFVPWCVLAAILAFIWLRDVP VKANIKQQLDIFSNPNTWYMTILYVMTFGLFSGFSAVFGLLINNQFGRESSLALPVLGAT FAFLGPLIGSLIRMSWGVFCDRMGGAIWTFISAVGMAITLAGVAWVLRNPTGWTDFYIFM ALMLTMFLFSGFGNAGTFKQMPMIMPARQAGGAIGFTSAVASLGPFLVGVALSAVGTGGA WLFFLGCAVFCVACAVLCWAMYARKGAPYPG >gi|319977718|gb|AEUH01000192.1| GENE 5 3878 - 7627 5651 1249 aa, chain + ## HITS:1 COG:Cgl1163 KEGG:ns NR:ns ## COG: Cgl1163 COG5013 # Protein_GI_number: 19552413 # Func_class: C Energy production and conversion # Function: Nitrate reductase alpha subunit # Organism: Corynebacterium glutamicum # 18 1249 17 1248 1248 1580 62.0 0 MAFPTYDGQSTPATGAALFALGSWLRRGKASPDARQLFLEGGREGDSFYRKRWSYDKIVR STHGVNCTGSCSWKIYVKDGVITWETQQTDYPSTGADMPEYEPRGCPRGAAFSWYEYSPT RIKYPYVRGVLLDMFREAKARLGDPVDAWADVVEDPEKARAYKSQRGRGGMVRVSWEEAM EIVASAYVHTIKQYGPDRIAGFSVIPAMSMISYGAGARFHELIGATMLSFYDWYADLPPA SPQVFGDQTDVPEAGDWFNSQYLIMWGTNLPLTRTPDAHFMAEARYHGQKVVVVSPDFAD NTKFADDWLRVQPGTDGALAQAMGHVILKEFHVGKREPMFLDYMTRYTDSPFLVEINEVG EGAHDGIDPKTLVPGKFLTADKMPEGTTGRTENNQFRPLVIEADGTVKDPGGTLADRFGE EGAGHWNLNLEGVTPALSIMETDQWEGVEIALPRFDLPAAPGQASIGGGYVLRGVPVRRV NGRLVTTVYDILLANYAVEREGLPGQWPTDYMDASTPGTPAWQEEFTSVPAGAAIKIGRE 
FAANAVETQGRSMILMGAGTNHYYHSDEIYRTFLALTEMCGTQGRNGGGWAHYVGQEKVR PIMGWGSFTFALDWARPPRQMISTGWYYMTTGQWRYDGAPASAMANPIKSSHLDGKQLVD TLVESVQRGWMPCYPTFSKGSTQLGREAAEAGVPPAQYVSQELRAGRLRFAIEDPDAHHN VPKILANWRTNLLGSSAKGTEFFLRHMLGTGNEVNAEELSEGNRPASVSWRDAHPGKLDL MWVADFRNTSTTLHSDVVLPAATWYEKHDLSSTDMHPYMHCFDEAVNPPWEARTDFEVFQ TLARLVARLAPDHLGTQTDVVAVPLGHDSPDAMTMVGGTVPEQTWTPGRTMPKLVPIERD YTQIGYKFDRMGPLLVKAGLASKGVAFNVEAEYEQLGDLNGRAPMDGNAGEGMPLCDTAI KAANMALRFSGTTNGALAAQGFRTLEKRVGNEMAFLAEGDEEKKITYQDTVLQPRSVITS PEWSGSEHGGRRYSAFVQNLECRKPWHTLTGRPQFYVDHDWLLDMGEALPVFRPPVDLAH LYGERPVGDHRPGQQGQAEVAVRYLTVHNKWAIHSQYYDNPYMLTLGRGGQTVWMSPADA EKIGVRDNEWIEAKNRNGTVTARAVVSHRIPEGTVFMHHAQDRQINTPLNEGSGKRGGTH NSLTRILIKPSHLAGGYAQFSYAFNYYGPTGNNRDEVTVIRRRSQEVQF >gi|319977718|gb|AEUH01000192.1| GENE 6 7627 - 9177 2072 516 aa, chain + ## HITS:1 COG:Cgl1162 KEGG:ns NR:ns ## COG: Cgl1162 COG1140 # Protein_GI_number: 19552412 # Func_class: C Energy production and conversion # Function: Nitrate reductase beta subunit # Organism: Corynebacterium glutamicum # 1 476 1 476 531 789 76.0 0 MKVMAQIAMVMNLDKCIGCHTCSVTCKQVWTNREGTEYIWFNNVETRPGVGYPRGWEDQE KWKGGWKRTATGRLVPRQGGRVASLAKIFANPTMPGMADYYEPWTYEYDKLLQAPAGSRA IPTARAKSRITGEHIDKPEWGPNWDDDLGGSMETLDQDPVLHQMSIQVAKTIEDAYMFYL PRICEHCLNPTCVSACPSGAMYKRTEDGIVLVDQDACRGWRMCVSACPYKKVYFNHHTGK AEKCTLCYPRLEVGEPTVCSETCVGRLRYLGVLLYDADRVGWAASQADERDLYRAQREIL LDPFDPQVVAQAQANGVPHSWITAAQQSPIWKLITEYEVALPLHPEFRTLPMVWYVPPLS PVVDQVTASGSDGEDHRVLLSAISQMRIPLEYLAGLFSAGDTAPVELSLRRLAAMRSYMR DVYVGREGDEAIPASVGMSGQKIQEMYRLLSIAKYDDRYVIPTAHPEMPRGIKELEGCPV SYDAEAFHGMAPATSNRPMGGSSPEGRTMLPLEVVR >gi|319977718|gb|AEUH01000192.1| GENE 7 9174 - 9863 822 229 aa, chain + ## HITS:1 COG:Cgl1161 KEGG:ns NR:ns ## COG: Cgl1161 COG2180 # Protein_GI_number: 19552411 # Func_class: C Energy production and conversion # Function: Nitrate reductase delta subunit # Organism: Corynebacterium glutamicum # 15 224 14 226 228 188 51.0 6e-48 MSTFVGIPRIPAAAPEREMPEADRRAVRMAVSLLLDYPGEDFLEKVGAVEAELAALPDEA 
SAPLGAFTRAARGAGVRAMQVHYVETFDQRRRCALGLTYYTHGDTRGRGQAILAFKEVLR RAGFEMRREELPDHLPVVLEFAAFDETGTAEALLRSNREGIEVIRTALRSAQSPYAGLLD ALVATLPEPDEKTLEGFHRLVSQGPPTELVGIGDLSATPSPASDTTMER >gi|319977718|gb|AEUH01000192.1| GENE 8 9868 - 10641 1023 257 aa, chain + ## HITS:1 COG:Cgl1160 KEGG:ns NR:ns ## COG: Cgl1160 COG2181 # Protein_GI_number: 19552410 # Func_class: C Energy production and conversion # Function: Nitrate reductase gamma subunit # Organism: Corynebacterium glutamicum # 23 250 8 235 259 255 56.0 6e-68 MFFVPASAPAGAPAALGLVDIVLWVVLPYVVFTLLAVGLVWRYKTDQYGWTSRSSQWNEP AILRWASPLFHFGVLFVFLGHVLGLAVPKSFTSAVGISEHAYHLVATIPGTIAGLMTVVG LVALVYRRVVVKSVRVATTRMDIVTYVMLSVPVALGAVATVVNQVLGGHDGYDYRETISV WFRSIFYLQPQAQLMVDVPLTFKLHVVAGLLLLGLWPFTRLVHAVSVPVGYIARPPVVYR ARDGRREAARTRGGMSD >gi|319977718|gb|AEUH01000192.1| GENE 9 10907 - 11971 1091 354 aa, chain + ## HITS:1 COG:no KEGG:PPA0506 NR:ns ## KEGG: PPA0506 # Name: not_defined # Def: putative regulator # Organism: P.acnes # Pathway: not_defined # 65 260 1 197 206 155 42.0 2e-36 MSAALRERARRAADRAYGRSRDEGRMMTNETARPRRLADTLSAMVSLTPAQHALLERISR HAQPVTVAQLAQESDLHVSSVRETLEGLLELGLIEREQLPAQGRGRPALGYSTSMPADPA FAAQMLGQFARSVFAWLRTDLEDPAVAARSIGHRWADAALSEMNVPEHNHREVAEGFSIA GHMGKVRLFLTAMGFGAQSRTDDPLAVVLTACPFAEADAPDGLAFELRRGIVERVFERTA TGVATWRLEQDDADPMVLVVHLEAAAGPRPKPLATTLRFFGGAAEAAGCDTREIPSDEAP ATLGALVGLLRESDPALAPVLDVSSYLVNERSATLDTPLAPGARVDVLPPFAGG Prediction of potential genes in microbial genomes Time: Thu May 12 18:33:45 2011 Seq name: gi|319977711|gb|AEUH01000193.1| Actinomyces sp. oral taxon 178 str. F0338 contig00193, whole genome shotgun sequence Length of sequence - 4134 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 75 - 1496 1818 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) 2 2 Op 1 1/0.000 + CDS 1626 - 2048 508 ## COG0314 Molybdopterin converting factor, large subunit 3 2 Op 2 . 
+ CDS 2045 - 2869 796 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 4 2 Op 3 . + CDS 2874 - 3500 434 ## HMPREF0675_3558 hypothetical protein 5 3 Tu 1 . - CDS 3529 - 4041 617 ## COG0521 Molybdopterin biosynthesis enzymes Predicted protein(s) >gi|319977711|gb|AEUH01000193.1| GENE 1 75 - 1496 1818 473 aa, chain + ## HITS:1 COG:SP0930_1 KEGG:ns NR:ns ## COG: SP0930_1 COG2333 # Protein_GI_number: 15900810 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # Organism: Streptococcus pneumoniae TIGR4 # 38 303 21 277 278 148 35.0 2e-35 MAGAGALALVGGALVGPAVAAPGDGAQSEAQSAVPQSGSSYSFEGASGKTRMFVLSSYNS DAILLESNGWFGIIDGGEDADAPDGSDPRYPARSGIAASTSSTTEWLLGYLDDQGVTDSN VAFYLGTHAHSDHIGNADEIIRRYRPKLIFSPEYSDQWITDENGLWDNQYVYDHLVEAAQ WARSEYGAQFIQELDGYSTHVCMGDMDVQMIPFDVDEVYKRQGTTDANLMGWGAKVSAFG RSAFLAADLMDTDSDWTTHNGFEGRIASEVGSVDILKAGHHGQESSNFEEFLGALSPTTI IQTGLAEDTPDRLTRHAIHGDGLWFPMGDIWDSVKVPALVCEFSAQGIAYDGVDNARRGH EYTSETPRAWWFHAGHTEATTGWWQGPSGNWYYFDSSASAVHSEWRLIDGYWYFFDESGA LASREEGQTQSASIEGTGSGRSASPLVWVVGAGLLVAAAGGAWWARRSRRARG >gi|319977711|gb|AEUH01000193.1| GENE 2 1626 - 2048 508 140 aa, chain + ## HITS:1 COG:Cgl0210 KEGG:ns NR:ns ## COG: Cgl0210 COG0314 # Protein_GI_number: 19551460 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin converting factor, large subunit # Organism: Corynebacterium glutamicum # 9 139 20 152 152 105 42.0 2e-23 MPARVFTGITEEPLDATALTNAARDPRCGAVAVFVGAVRNHDGGERVDAIEYSSHPSSPQ VLHSLADRLAERAGVHVVVAWHRVGRLEVGDDAMVVAVGAEHRAQAFTAVETLVEEVKAQ LPIWKKQQLTDGTHSWSGLS >gi|319977711|gb|AEUH01000193.1| GENE 3 2045 - 2869 796 274 aa, chain + ## HITS:1 COG:ML0817_1 KEGG:ns NR:ns ## COG: ML0817_1 COG0476 # Protein_GI_number: 15827360 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Mycobacterium leprae # 30 260 10 247 259 179 
42.0 4e-45 MTVGDPYSRDAAVDARSRCVAIPLRGAPAPLGGPAAQRYRRNWLVAGVGEAGQARIAAAR VLVVGAGGLGSPVLLYLTAAGVGTIGVCDSDRVEVSNLQRQLLHSEEDVGRAKPESAVRR LSALNSEVRFEQHPHVTREWLEAHGRDWDLVVECADTFSAKYMVADWCAEAGVPLVWGTV VAMAYQVSVFWSRPPAPVPPTSLRSLHPVEPAPGTTPASPEAGVLGPVVGQAGTTMATEA LKVVAGFGEPLLGRVLVVDAAKQRADVLTFAPWG >gi|319977711|gb|AEUH01000193.1| GENE 4 2874 - 3500 434 208 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0675_3558 NR:ns ## KEGG: HMPREF0675_3558 # Name: not_defined # Def: hypothetical protein # Organism: P.acnes_SK137 # Pathway: not_defined # 6 198 95 287 293 117 45.0 2e-25 MGTVHALVLAGGTGERLGGASKADLDVGSRLLDVVLAGLAPHVSGAVVVVAPPGVPVPDG VGRAMEEPPGGGPLAGIGAGLDFLTGRAGAAGDDRVVVCSVDSPGAGALADRLSAVVLAP HEAGAAIVGGDPEPYTQYLQAVYRVGPLARALDGARAALGGLHGVGVRRVLGGLVLRRVG APWSECRDVDTREDLQWWRARLRGGGPV >gi|319977711|gb|AEUH01000193.1| GENE 5 3529 - 4041 617 170 aa, chain - ## HITS:1 COG:Cgl1165 KEGG:ns NR:ns ## COG: Cgl1165 COG0521 # Protein_GI_number: 19552415 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzymes # Organism: Corynebacterium glutamicum # 14 166 10 162 162 90 35.0 1e-18 MPQPKEYDLSPIRATIITFSDRVLSGQREARGARACADALTAAGLGHVATAVVPEQRGAL ESQVRGAIAAGSRLVLVLGGSGFGVGNVAPEVVREVVEVEIPGIAEQIRAHGLKSTPLSG LSREVVGVSARDSSGALVVASPGSKGGALDTLAVLVPLLPAVFGQLDEER Prediction of potential genes in microbial genomes Time: Thu May 12 18:33:51 2011 Seq name: gi|319977708|gb|AEUH01000194.1| Actinomyces sp. oral taxon 178 str. F0338 contig00194, whole genome shotgun sequence Length of sequence - 1725 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 42 - 101 1.9 1 1 Op 1 23/0.000 + CDS 126 - 917 1435 ## COG0725 ABC-type molybdate transport system, periplasmic component 2 1 Op 2 . 
+ CDS 927 - 1725 1034 ## COG4149 ABC-type molybdate transport system, permease component Predicted protein(s) >gi|319977708|gb|AEUH01000194.1| GENE 1 126 - 917 1435 263 aa, chain + ## HITS:1 COG:MT1905 KEGG:ns NR:ns ## COG: MT1905 COG0725 # Protein_GI_number: 15841325 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, periplasmic component # Organism: Mycobacterium tuberculosis CDC1551 # 1 258 1 258 261 143 39.0 3e-34 MRRIRVTLAATAAALALAACGGGGGSAAQSDRTAPAPAQTAVLNVFAAASLKEAFGELEK TFEAATPGVDVAFNFAGSQDLVSQLAEGADADVLATANESTMAKAAEAGLVGEQTVFLSN TLTLVAPKGNPAGVTGLDSSLDNAKLVVCDTEAPCGKLTKELTGALGITLNPVSAEPNVK DVLGKVASGEADAGVVYRTDANAAEDQVEAIGIQGADQAVNKYPIAPVEATKNAELAQKW IALVLSPEGQSVFERAGFTPAAQ >gi|319977708|gb|AEUH01000194.1| GENE 2 927 - 1725 1034 266 aa, chain + ## HITS:1 COG:Rv1858 KEGG:ns NR:ns ## COG: Rv1858 COG4149 # Protein_GI_number: 15608995 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, permease component # Organism: Mycobacterium tuberculosis H37Rv # 12 237 7 233 264 160 53.0 2e-39 MGKRRAPAVSPVPVWTAAVGGLALAFLVLPLVFMAGRVPWGRLPWILASEESVAALGLSL RTCAAALAVDLVLGVPAAVLLSRDWRLVRFFRVLVAVPLSLPPVVAGIALLAAFGRKSPI GASMEAWGASIAFSTTAVVVAQVFVSLPFLIVTLEAALRARPEGLEEMASSLGASPSRVL WRVTLPMVLPGLARGSALALARCLGEFGATLTFAGSMQGVTRTMPLQIYLARESNTELAL ALGAVLLMAATVVVALTELPRRIRRR Prediction of potential genes in microbial genomes Time: Thu May 12 18:33:53 2011 Seq name: gi|319977704|gb|AEUH01000195.1| Actinomyces sp. oral taxon 178 str. F0338 contig00195, whole genome shotgun sequence Length of sequence - 5272 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1141 1172 ## COG1118 ABC-type sulfate/molybdate transport systems, ATPase component - Term 1189 - 1243 20.1 2 2 Tu 1 . 
- CDS 1277 - 2005 1035 ## COG1309 Transcriptional regulator + Prom 1851 - 1910 1.6 3 3 Tu 1 . + CDS 2087 - 5270 4049 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) Predicted protein(s) >gi|319977704|gb|AEUH01000195.1| GENE 1 2 - 1141 1172 379 aa, chain + ## HITS:1 COG:MT1907 KEGG:ns NR:ns ## COG: MT1907 COG1118 # Protein_GI_number: 15841327 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate/molybdate transport systems, ATPase component # Organism: Mycobacterium tuberculosis CDC1551 # 3 370 2 361 369 181 41.0 2e-45 GGSAVRVRGRVAARDWDVDLSLPAGAVTAVMGHNGAGKSTLAQVIAGSLALDEGAVRIGE RVVDSPGRFVGARDRGIAMVSQQPRIFGHMSVLANVAFPLRSRGVERAEAASRALDQLRR VGMEGVAGRRGDELSGGQAARVAIARALVFDPSVLVLDEPTAALDVEATAQVSLVLRERL AATGVTTVLVTHDAVEALELAAHMAVMDSGRVVEEGAPAAVLARPSSVFAARLAGFNIVS GRARRGDGLVGVRVGGGTLYGAGDAPDGPVALLFAPEAVALSRGPVDASPRSQLPGRVES VDASGGVITVGVLLGAEPPSGSDAPGPASGSGEPALVRARVTTAAWAELGLGVGDAVWSS IKATQVRAVPLSEPAAPAD >gi|319977704|gb|AEUH01000195.1| GENE 2 1277 - 2005 1035 242 aa, chain - ## HITS:1 COG:MT1725 KEGG:ns NR:ns ## COG: MT1725 COG1309 # Protein_GI_number: 15841142 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Mycobacterium tuberculosis CDC1551 # 45 230 8 194 207 90 34.0 2e-18 MDSTADAPRIPTGNPRGAAEGAPTSFTDAEINPSNMATHHVAKIQRGPSGPRGEVRRRIL EAARAAFTASGYDGTTMRQIARSAGCDSALITYYFSTKQQLFRACLDLPSDPASDVIALL APGPSGAAERLVDYALDLYEHHLTSDAMSALMRALATDAETSQRFRAYISTEVLGPVAAA IGAGTAIAQRIEIVLSMMHGVVTMRYIVRLEPLASMPREQVRAQLAPLIQPLIDSAFNAP TF >gi|319977704|gb|AEUH01000195.1| GENE 3 2087 - 5270 4049 1061 aa, chain + ## HITS:1 COG:CAC2401_1 KEGG:ns NR:ns ## COG: CAC2401_1 COG1924 # Protein_GI_number: 15895667 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Clostridium acetobutylicum # 26 726 6 663 663 716 50.0 0 MDPATDTDTGTGAAEREGAQARRVHQLGIDVGSTTVKAVVLDGNRVLYSDYRRHNADVRA 
ELGRLLADIEARHPGLEVVSAITGSGGLTTARAMGIPFVQEVIAGTEATRRLHPEVDVVI ELGGEDAKLTYLHPTPEQRMNGTCAGGTGAFIDQMATLLHTDASGLDALAAEHTQLYPIA SRCGVFAKSDIQPLINQGAAHEDLAASIFTAVATQTIAGLACGRPIRGNVMFLGGPLHFL PQLREAYKALLPKADSFVTPTGAQLYVAIGAALMAAKTAPPGGQSAQARPTGDEAGAGGV PSAADGGPSPQARPTGDEAGAGRARPRPLADLINALATAEVGAESPRMRPLFASEEERTE FERRHGAEVVPKAEPAEARGRCWLGIDAGSTTIKAVVIDSRGRIVFTHYASNEGDPVAAA VEIVRRVRTGLPEGAVIGRSCATGYGEGLVKAALTMDEGEIETMAHYRAAESVAPGVTSV IDIGGQDMKYLRIRDHAVDSISVNEACSSGCGSFLQTFAQTMGTDVRAFARAAMDSTNPV DLGTRCTVFMNSSVKQAQKEGADVRDISAGLSYSVVRNALYKVIKLKDPSDLGKRVVVQG GTFLNDSVLRAFELLTGRQVVRPDIAGLMGAYGAALTARMHDSGQGASSLATVEALEGFS VETTRKTCRLCQNHCQMTISTFSNGERHVSGNRCERGASLERVPKKSELPNLYDWKYKRI FGYRRLTAGKAFRGDMGIPRVLGMYENYPFWFTMLTKLGFRVMISGRSNHALFEEGMESI PSENVCYPAKLVHGHIEALLKKGVTNIFYPCVAFEQGSEGADNCFNCPVVATYPEVIRNN MERLSDPGVRFISPFVNFSNREYLPAHLVGAFKEHGYDIPLEEMRAALDAAWEEDAAVKA EIRQKGRESLEWMRAHGVRGIVLAGRPYHLDPEINHGIPEVIVGLGMAVLTEDSVIDARL ERPLRVRDQWTYHSRLYEAAARVGDEPDLEMVQLNSFGCGVDAITADQVQEILEGRGDVH TVLKIDEVSNLGAARIRLRSLEAAVSERCAPAPRTAEAEGA Prediction of potential genes in microbial genomes Time: Thu May 12 18:33:54 2011 Seq name: gi|319977701|gb|AEUH01000196.1| Actinomyces sp. oral taxon 178 str. F0338 contig00196, whole genome shotgun sequence Length of sequence - 2586 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1533 1880 ## COG3581 Uncharacterized protein conserved in bacteria - Term 1268 - 1299 0.9 2 2 Tu 1 . 
- CDS 1544 - 2584 1314 ## COG1404 Subtilisin-like serine proteases Predicted protein(s) >gi|319977701|gb|AEUH01000196.1| GENE 1 1 - 1533 1880 510 aa, chain + ## HITS:1 COG:CAC2401_3 KEGG:ns NR:ns ## COG: CAC2401_3 COG3581 # Protein_GI_number: 15895667 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 130 454 1 313 339 335 50.0 1e-91 SAPLVSAAVDTALHEDPASVAAREEQAGARAGHIEVRASFTKEMREAGYEILAPQMSPIH FRFLSPLLARAGLRVRVLERTSRQSTETGLKYVNNDSCYPAIVVIGQLVEEFISGRADPD RTAVGITQTGGMCRATNYAALLRKALRDAGYPQVPVIALSVQGLEDNPGFQLGLKDIHKA VQAFVIGDAIQSMLLRVRPYEAVQGSAMELYRRWDRIVREWIEDGRSQEFGGGRARRLTY AGLIRACVREFDALALRDIPRKPRVGLVGEILVKFHPDANNHAVDVIEAEGCEAELPGLM QFFHYSSATAEWDQANLGIGGRQRRVMPLVLWALERYEAPVRRAFAATGGKFEPHRRIGE MIPRSQDIARLGNQAGEGWYLTAEMVDMIEHGCPNIICAQPFACLPNHIVGKGMFRALRN RYPESNIVAVDYDPGASEVNQLNRIKLMLATAVQGRDGSGGDDGVLDLVGIDFEDAPGAG VGAGAGGAGGRRVVGLGMPGVPPRRAAALR >gi|319977701|gb|AEUH01000196.1| GENE 2 1544 - 2584 1314 346 aa, chain - ## HITS:1 COG:BS_aprE KEGG:ns NR:ns ## COG: BS_aprE COG1404 # Protein_GI_number: 16078094 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Bacillus subtilis # 18 252 124 361 381 83 32.0 7e-16 EITTQEYVAALGVDKEAAAGYTGAGVGIAVIDGPADTTVPELAGADITVKPMCDFTASPD GRTHATTVVSILAARGYGVARGARITNYVVPAPDDTNKSACSGIDTDDGINAAVADGARV ISISLGGGKISDAERQAITAALARGVVVVVGTGNDGAQDPPNFFSSVNGTVGVGASDSAG RIQPYSNYGRGLAVMAPGDRITEHDFATGQTIVSSGTSYATPIVAGFVAVAMQRWPQATG NQIVQSLIATATTGPTGQPLISPKGLDTTDPTQYPDTNPLLDKFPGTQPTAQTVADYADG LLATDTVFDNDPAYTYRGTDPATIRSHPDRTALGTSPRYHQPSGSD Prediction of potential genes in microbial genomes Time: Thu May 12 18:33:56 2011 Seq name: gi|319977697|gb|AEUH01000197.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00197, whole genome shotgun sequence Length of sequence - 3116 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1099 938 ## 2 2 Op 1 . + CDS 1596 - 1925 192 ## PROTEIN SUPPORTED gi|18309686|ref|NP_561620.1| 30S ribosomal protein 3 2 Op 2 . + CDS 1925 - 3031 1531 ## Bfae_24900 hypothetical protein Predicted protein(s) >gi|319977697|gb|AEUH01000197.1| GENE 1 1 - 1099 938 366 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGVHKAGAKHVGGGLKRRAHGSFFERLRAVSDLRYDDSADEAAYQVVLDEIGECERLVRA YESANGYHSEQVRSAISEWARAFLADLGATRDQLRDGHDVVVQARGVMRQARDLFQANVS DELLTGAERGWRAAADVVGAVATVVAPLGGALCLLGAETYFSSLAEERHREREEYCQNVI EQMNASLNEGAGRMQGVIDKGGEANDSRISTKHPAAAPLPQIVDPNLGGPGGAGSGAGAP GGADGSGGGLGGDGSTGAGGDGSAGGGVLAGRDWASEGFARPGAVQDPPPYARLSDVDGV GLVGRPVNQTVTPNGLVGGYAPPSAVHASDPRWDPSYRIPHDAIGVGRAASSGALGALAG GAAVSA >gi|319977697|gb|AEUH01000197.1| GENE 2 1596 - 1925 192 109 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|18309686|ref|NP_561620.1| 30S ribosomal protein [Clostridium perfringens str. 
13] # 2 107 3 109 110 78 36 6e-15 MISADAIRGYVDLIVLGLLRERPSYAYELAKTISQVSQGQYAIKQTTLYSALKRLEGQGV TESYQDVSESGKTRTYYRLSPTGQDYLDDKTAEWADTKGLVDRFVEGRE >gi|319977697|gb|AEUH01000197.1| GENE 3 1925 - 3031 1531 368 aa, chain + ## HITS:1 COG:no KEGG:Bfae_24900 NR:ns ## KEGG: Bfae_24900 # Name: not_defined # Def: hypothetical protein # Organism: B.faecium # Pathway: not_defined # 1 368 1 336 342 156 32.0 1e-36 MDTIDSFLDAMFAPYPASTRLADAKAELRAMMEDAYADAIASGMTHNEAVGRVITDFGNL QEIAPVLGIADDLTAAGGAPQQEAAPAPAGADGAEAPSRPRVTLPEAQELAEARRRGAAG LGLGVAMLVSTGIPLFAFQGLAEVGLVPFNDDTATFLALVADLPLVALGIIMLLSRSRLF ANVKHLVNLEFTTDPIVTAWAARLRLEHEDQRMRRLAIAVGLWICAALPLAVVQALPPIG VGMAVPAGADNGTLNGGDLSELALALSVALVALGLWVFLPANWAAQTQQTLIREGRQGHD YDEEDYRDPVIGIVASVYWPCVVAAYFVWGAVFDQWDKSWVLFPVAGLLFGVFAAVRMAT RRAKARPQ Prediction of potential genes in microbial genomes Time: Thu May 12 18:34:36 2011 Seq name: gi|319977691|gb|AEUH01000198.1| Actinomyces sp. oral taxon 178 str. F0338 contig00198, whole genome shotgun sequence Length of sequence - 6427 bp Number of predicted genes - 7, with homology - 4 Number of transcription units - 6, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 134 - 2146 2694 ## COG1297 Predicted membrane protein - Prom 2347 - 2406 80.3 + TRNA 2317 - 2406 55.8 # Ser CGA 0 0 + Prom 2319 - 2378 80.3 2 2 Op 1 . + CDS 2602 - 2718 72 ## 3 2 Op 2 . + CDS 2999 - 3118 72 ## 4 3 Tu 1 . - CDS 3349 - 3777 413 ## 5 4 Tu 1 . - CDS 4001 - 4684 442 ## PROTEIN SUPPORTED gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent 6 5 Tu 1 . - CDS 4867 - 5850 1058 ## COG2354 Uncharacterized protein conserved in bacteria - Prom 5935 - 5994 1.5 7 6 Tu 1 . 
+ CDS 6119 - 6425 209 ## Jden_2204 methyltransferase small Predicted protein(s) >gi|319977691|gb|AEUH01000198.1| GENE 1 134 - 2146 2694 670 aa, chain - ## HITS:1 COG:MT2465 KEGG:ns NR:ns ## COG: MT2465 COG1297 # Protein_GI_number: 15841909 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mycobacterium tuberculosis CDC1551 # 19 662 9 649 667 576 58.0 1e-164 MSAPQAGRPAPLEAPAPLKELTIRGIVIGGIITLVFTAANVYLGLKVGLTFATSIPAAVI SMAVLRLWSDHTVQENNIVQTIASAAGTLSAVIFVLPGLVIVGWWNGFPYWQTMLVCALG GSLGVMYSIPLRRALVTGSELPYPEGVAGAEILKVGDAHGAGEDNRRGLLAIVLGALASS LMSLLSNLKIAASGVSAAFRLGGTGTMLGSSLSMALIGVGHLVGMTVGVAMLVGMLLSYG VLLPYFASGSLDGGDGLEDALGAVFRSDVRFIGAGVMAVAAVWTLVKIMVPIVRGMRESV AASRARHGGGSVARTERDIPAGVVVASILASMLPIGGLLWFFASGTAIAHNTGALIAVSV VFILAAGLLVAAVCGYMAGLIGSSNSPVSGVGILVVVLTALAVLLVHGTGSSAEETTALV AYTLFTAAIVFCIATIANDNLQDLKTGQLVEATPWKQQVALVIGVVFGALVIPPVLGLMQ TSFGFQGTPGAGESALAAPQASLIASLAQGVLGGGIDWGLLGLGAVIGAVVIGIDEALRP LSRGRLALPPLAVGMGIYLPSSLTVLIPVGGLIGFFYNRWADGRPEPGRARRFGTLAATG LIVGESLFGVVFAGIAGATGSEAPLALPFIGDGYAPVGQVVGVAFFVGAIALLYRWTRRT SAEAEGADGA >gi|319977691|gb|AEUH01000198.1| GENE 2 2602 - 2718 72 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWGPLGVRRDAALMPFWRSFVLQSEPVGAAVELFCERS >gi|319977691|gb|AEUH01000198.1| GENE 3 2999 - 3118 72 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSQRGREKGIGAGNQVLGQVNCLPQCGVAYPNASLCRWL >gi|319977691|gb|AEUH01000198.1| GENE 4 3349 - 3777 413 142 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLYFLWAFGGDSDGGPDDVPALLADLANTDDEHVDVSFSKGDIAVSVSRSLWLSIEDVEA DTLPPRSFHAPDEAAVVEIARLLDEDRVDEILERYPWVDGYPQPDRDELLRTLTGGVLGS APEAPRGHPEGPGPCDAADLRG >gi|319977691|gb|AEUH01000198.1| GENE 5 4001 - 4684 442 227 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent [Streptococcus pneumoniae TIGR4] # 6 198 5 201 201 174 47 1e-43 MPVYTLPDLPYDYAALEPYISGTIMELHHDKHHANYVAGANAALEKLEAARESGDFAAIN LLEKNLAFNLGGHVNHSIFWTNMTGEGASRPEGELAAAIDEFFGSFEKFQAQFAAAALGL 
QGSGWAVLAWDTVGKRLVTLQLTDQQGNIPVATVPLLMIDMWEHAFYLDYRNVKADYVKA WWNVVNWANAEERFNGARLASRLVNAHEALAGLGAQAVDTIKGWWQS >gi|319977691|gb|AEUH01000198.1| GENE 6 4867 - 5850 1058 327 aa, chain - ## HITS:1 COG:Cgl1064 KEGG:ns NR:ns ## COG: Cgl1064 COG2354 # Protein_GI_number: 19552314 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Corynebacterium glutamicum # 1 326 1 312 318 216 46.0 3e-56 MPSGLVALLDDIATLAKATASSLDDIAAGAVKASSKAVGVVIDDTAVTPQYVQGLSPDRE LPIIGRIAKGSLINKLFIVLPVALALTWLAPWALPILLIAGGSYLCFEGGEKVLSALGWL PHHEAGAGSEEGPETAPQEVERRIVASATRTDLILSSEIMLVSMAGLAGESAPRRVAILI AVAFFMTFFVYGLVALLVKMDDFGLRLARAGAPEGAAADPMENRPGPVDNVGRVIVRAMP RVFRVIGVVGTVAMLWVGGHLVIESLAELGATGPHHVVSAAEHAVAGAGGFAQWCAESFL SGVFGLALGGFLALVWAVARRLVRRKR >gi|319977691|gb|AEUH01000198.1| GENE 7 6119 - 6425 209 102 aa, chain + ## HITS:1 COG:no KEGG:Jden_2204 NR:ns ## KEGG: Jden_2204 # Name: not_defined # Def: methyltransferase small # Organism: J.denitrificans # Pathway: not_defined # 5 101 21 116 536 69 41.0 4e-11 MTHSSTPVPVLDIDLIDSLRADLGASEWTVDRINAALSATATDAMMRGMRVPALLELAGS RDPAAVLTRFFMLGADEASDVLDAALPSLGARGLVRLGLAGP Prediction of potential genes in microbial genomes Time: Thu May 12 18:34:57 2011 Seq name: gi|319977681|gb|AEUH01000199.1| Actinomyces sp. oral taxon 178 str. F0338 contig00199, whole genome shotgun sequence Length of sequence - 12312 bp Number of predicted genes - 13, with homology - 9 Number of transcription units - 11, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1307 1406 ## COG2890 Methylase of polypeptide chain release factors 2 2 Tu 1 . - CDS 1448 - 3199 1479 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 3 3 Tu 1 . + CDS 3198 - 3356 177 ## 4 4 Tu 1 . - CDS 3319 - 3525 200 ## - Term 4013 - 4048 -0.3 5 5 Tu 1 . - CDS 4052 - 4384 462 ## COG1416 Uncharacterized conserved protein 6 6 Tu 1 . - CDS 4642 - 5067 286 ## BSU12030 hypothetical protein 7 7 Op 1 . 
- CDS 5245 - 5853 758 ## ROP_66700 putative TetR family transcriptional regulator 8 7 Op 2 . - CDS 5915 - 6475 835 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) 9 8 Tu 1 . - CDS 6666 - 6800 96 ## 10 9 Tu 1 . + CDS 6826 - 6954 120 ## + Term 7075 - 7116 2.0 11 10 Tu 1 . - CDS 7401 - 8438 1217 ## ECO103_3516 hypothetical protein - Prom 8574 - 8633 2.6 12 11 Op 1 . + CDS 8706 - 11468 3755 ## COG0550 Topoisomerase IA 13 11 Op 2 . + CDS 11544 - 12312 776 ## COG0125 Thymidylate kinase Predicted protein(s) >gi|319977681|gb|AEUH01000199.1| GENE 1 3 - 1307 1406 434 aa, chain + ## HITS:1 COG:Cgl1871 KEGG:ns NR:ns ## COG: Cgl1871 COG2890 # Protein_GI_number: 19553121 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methylase of polypeptide chain release factors # Organism: Corynebacterium glutamicum # 10 289 104 362 509 171 39.0 2e-42 EGAPPRLAALFDLRPHSASLPDPDDPGAQREHQWWVASDLGEDVTGRPLDEDHVLGIGGA SLSLLGQTIRERVGSALDLGCGCGTQALYLATHCGRVVATDLSARAGALTQFNAALNGAP IDVRVGSLFEPVSGESFDLIVSNPPFVITPDTVRAGAGFHEYRDGGMQRDELVGALIRSA PDHLAPGGTLQILANWEIPAGADPDGHWSPRVEDWLSGLGVDAWVVQRDALDPAQYAEMW IRDSGGHRMSHEAFERGYAAWLRDFTAAGVGAIGMGFLAVRRPEPGEGERAPGGPDQGAG GGQWLPGGGRAAFDLALEGRAPRGVDVAWALRSLRAPRTWDLVLTRAPDVREERHYVPGS PDPCLLVLHQGAGLGRSVPVSPAVSAVVGASDGALTVGQIVAAVAALTDREADDVWEEAR APLAELIRWGFLTF >gi|319977681|gb|AEUH01000199.1| GENE 2 1448 - 3199 1479 583 aa, chain - ## HITS:1 COG:Cgl2090 KEGG:ns NR:ns ## COG: Cgl2090 COG0488 # Protein_GI_number: 19553340 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Corynebacterium glutamicum # 9 573 13 544 550 239 30.0 1e-62 MSTVLFSHVSFSYGATPVLGNVSFACGPGDRLAVVGPNGVGKSTLLALAAGLLEPDSGTV SAPPVPPAAAFLTALPPAAAPPPTAAPPSAAPPPPSPTGSTPLPPSAEAHGARTVARSIE AATSGTRALAARFDALTERLAQGASPRDEAEYDRVLAAMTARDAWTIDARLDQTLEALGL GGVDRSRALASLSPGQRARLRLALVLVERPDALVLDEPTNHLDANGREHLARAIDDWQGP 
VLMTSHDRAFIERTATALLDLDPAPWRALAVADGEPADFGAYRVGGSYSDYLRDKAAARS RHAAVHAEQQAVKRRLAAHRRDSAVVGHARFKPRTETRMAQKYYADRAQAVSTRRINDDS RRLAALEAREVRKPRYDEAVFAFPRPSGSTADGPPPLARSGGIALSVRSASVEGRLASTS FEVRHGEHLLITGPNGSGKTTLLEWIARGAPVGAHGRVDTAPGAVLVPQEPPRPGDPLVP EDAWRLGVGEAGRGFVPPALWNRPLGGLSAGNQRRAQLAWAARAAPRVLVIDEPTNYLDL DALESLEESLRQWGGTLVASTHDEWLIAKWWGRLHRLEGAHGR >gi|319977681|gb|AEUH01000199.1| GENE 3 3198 - 3356 177 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAHLWEEGEGRRGVMSGGHAPIIPGTGRAVFQNASEFRSFSTAAGRSVPKCV >gi|319977681|gb|AEUH01000199.1| GENE 4 3319 - 3525 200 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRPNPRTAPVVVPVGAAARGRGKAATPGSPQRSQLTLVQNASESTQNRGKQPKTVTLRRI LEHSAPLP >gi|319977681|gb|AEUH01000199.1| GENE 5 4052 - 4384 462 110 aa, chain - ## HITS:1 COG:DR0054 KEGG:ns NR:ns ## COG: DR0054 COG1416 # Protein_GI_number: 15805095 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 6 109 7 110 111 79 48.0 1e-15 MSYDYILHVSDVARWPAALSNLGNLTQLGLAKGIAVIVNGTGVYALQGANDWTAQMEAAA RAGVDFFACARSLANHEFVEGTLPEWLGQVPAAIPAIREWTKDGATYIKS >gi|319977681|gb|AEUH01000199.1| GENE 6 4642 - 5067 286 141 aa, chain - ## HITS:1 COG:no KEGG:BSU12030 NR:ns ## KEGG: BSU12030 # Name: yjdF # Def: hypothetical protein # Organism: B.subtilis # Pathway: not_defined # 6 140 27 159 160 70 34.0 1e-11 MKTESTLYFDGRFWVVVIERHEEGRVRAVRIVFGARPSDAELYEFLLAHASALTRRLDEA AAVPVGPERQPKRPNPKRAQRQASRFARQARPSTASQAAIRADRERSAQRRAAGAKQKKR EAADERRRLEREKAKARHRGH >gi|319977681|gb|AEUH01000199.1| GENE 7 5245 - 5853 758 202 aa, chain - ## HITS:1 COG:no KEGG:ROP_66700 NR:ns ## KEGG: ROP_66700 # Name: not_defined # Def: putative TetR family transcriptional regulator # Organism: R.opacus # Pathway: not_defined # 3 181 4 184 199 105 40.0 1e-21 MPPPAPKRLPAPQRRRLMLEAAVHVFGEHGYAGATTDAIAREAGVSQAYVVRTFGSKESL YSEAAAYTFDAVREAFRSAPVPPGASSEQIRCALGDAYLELVADRSKLLMKMSLYTMGAH PVFGPLAREGFDSIYRVLRHERGLAADKAALFLARGMLINTVLGLRLWEGEGRENASELL SFLLDEDAEHVLAAAAAPRPEL 
>gi|319977681|gb|AEUH01000199.1| GENE 8 5915 - 6475 835 186 aa, chain - ## HITS:1 COG:PA0853 KEGG:ns NR:ns ## COG: PA0853 COG2249 # Protein_GI_number: 15596050 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Pseudomonas aeruginosa # 2 155 18 177 207 101 37.0 7e-22 MRILAIQGSPDPESFSAQLTRAYTTAAREHGHDIETIDLSAEDFDPVLRYGYRRHMDDES APKRYQEMIRRADHLTFLFPIWWSAEPAVLKGFLDRVLTPGFSYSYDPKPHGLLKGKTAS LIVTSRAPAALYRIYGGPLSRWKRMVLAFNGIRSKGALILGNMDTQKDTPERRAAFTARV HAHAQH >gi|319977681|gb|AEUH01000199.1| GENE 9 6666 - 6800 96 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIRFLPQCQSMPTPMLVRACPNDPFPTPMPAHAHPNAGRCYRTG >gi|319977681|gb|AEUH01000199.1| GENE 10 6826 - 6954 120 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGVFVPQCVFPDLTDPECLTGLWQRTLLRGLGRPAAPGRGLR >gi|319977681|gb|AEUH01000199.1| GENE 11 7401 - 8438 1217 345 aa, chain - ## HITS:1 COG:no KEGG:ECO103_3516 NR:ns ## KEGG: ECO103_3516 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 76 289 263 486 520 115 32.0 2e-24 MSSTTSMNLVGVGIAIVLIAAVIIGVLLWRAHLKRKFGPLPPVPDDVRAAGDAKEWASLN RHHMPVWTSDPAHFTPAAHHRLIAITAPYALCHHDPWELLDLSDPDDNRTMIERDWGISS RAELIEQLHSLLTDGHRSTFAAERDRWSDPQLAEADAARFRLDAATSQPHAEALWRVERM RNNERNIRNIDYTAWDLIRAAMLARNGAAFGWLTSEQAWDTLALIDWALRQQYSSWAQLW EAFHLTRWWWISEGGETERWNDLHDRNRGLALLSPGRPWAVVPWDMPVPGPQLLIVDDMI ALDGAEPMGPQAREYATGWERWIDDQIRARTTKRPGTHRFNNKLD >gi|319977681|gb|AEUH01000199.1| GENE 12 8706 - 11468 3755 920 aa, chain + ## HITS:1 COG:Cgl0309_1 KEGG:ns NR:ns ## COG: Cgl0309_1 COG0550 # Protein_GI_number: 19551559 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Corynebacterium glutamicum # 6 596 15 604 608 599 56.0 1e-170 MAASKLVIVESPAKAQTIEGYLGPDYHVTASIGHIRDLPQPKELPDTMKKGPYGRFAVNV EDGFKPYYMVNPDKKKKVAELRRLLKEADELYLATDEDREGEAIAWHLLQVLKPKVPVKR MVFHEITKEAIQRALDNTRDLDTDLVDAQETRRILDRLYGYEVSPLLWRKIRPSLSAGRV 
QSVATRLVVARERDRMAHVSAEYWSINTAFTCAGQAFSARVASVDGAAIATGSDFSEKGG LSAKAVKAGVLHLDEATARSYAQALTDAPASVVDSVTRKPYRRRPAAPFTTSTLQQEASR KLHWNASSTMRTAQSLYESGYITYMRTDSTALSGQAIHAAREQATQLYGAEAVAEAPRTY GTASKGAQEAHEAIRPAGDHFRTPGEVASSLSKQQLALYDLIWKRTVASQMADARGYTAT IRVLTRIDVDGERHGVVSSASGTVITAPGFRLAYQEGRDQGRYDAEKADSDSEKTLPDVA EGDAASLTRATPDGHETQPPGRYTEATLVKTMEELGIGRPSTYAATIQTIGDRGYVTHRG QYLVPTWLAFSVTRLLEENLANLVDYDFTASMEGDLDRIAAGEENGAEFLSGFFFGPDGS GETGGLRHDVASLGDDIDARAVNSIDLGRGVTLRVGRYGPYLEKADGARANVPPEVAPDE LDDQLVDQLFARAADDGRELGVDPDSGHTIIVKDGRYGPYVTEVLPEPEGDVGAKAKKNA AKPRTASLFKTMDISTVTLAEALQLLSLPREVGVDPATGEAITAQNGRYGPYLKKGTDSR TLASEDQLLTITLDEALAVYAQPKTRGRGVARPPLREFGEDPISGKKVTVKDGRFGPYVT DGETNVTVPRAETVEDLTAERAYELLADKRAKGPAPKRTRKTAAKKTAAKKTATKKTAKK ATAKKAAPKEKAPGTGAGGE >gi|319977681|gb|AEUH01000199.1| GENE 13 11544 - 12312 776 256 aa, chain + ## HITS:1 COG:RSc1784 KEGG:ns NR:ns ## COG: RSc1784 COG0125 # Protein_GI_number: 17546503 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate kinase # Organism: Ralstonia solanacearum # 13 208 3 202 205 159 47.0 6e-39 MHLTKGSSMSPRGVFITFEGVDGAGKTTQIRNVREWFEERGFEVVMTCEPGGTPLGAELR RLVQNGPDDIDPRTEALLYAADRAYHVATVVRPALARGAVVLGDRYIDSSLAYQGAARSL GVDEIAAMSAWATEGLSPRTTFLLDLPPEVGAGRRTDAPDRMERESADFHERVRREYLRL ADAEPDRIVVIDAVGTREEVFSEIRGVLVERFADSAPVSGAASASGGGPALESGAPVPDR ASDPDRGAPAQEDGTG Prediction of potential genes in microbial genomes Time: Thu May 12 18:35:34 2011 Seq name: gi|319977657|gb|AEUH01000200.1| Actinomyces sp. oral taxon 178 str. F0338 contig00200, whole genome shotgun sequence Length of sequence - 25845 bp Number of predicted genes - 27, with homology - 21 Number of transcription units - 18, operones - 7 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 167 - 1348 1327 ## COG0470 ATPase involved in DNA replication 2 1 Op 2 . + CDS 1369 - 2979 1852 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) + TRNA 3092 - 3167 84.5 # Thr CGT 0 0 3 2 Tu 1 . 
+ CDS 3458 - 3928 851 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) + Term 3950 - 3989 13.5 4 3 Tu 1 . + CDS 4034 - 5068 1144 ## COG0276 Protoheme ferro-lyase (ferrochelatase) 5 4 Tu 1 . - CDS 4984 - 5166 103 ## 6 5 Tu 1 . + CDS 5155 - 6387 1510 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases 7 6 Op 1 1/0.000 - CDS 6350 - 7648 1304 ## COG0038 Chloride channel protein EriC 8 6 Op 2 . - CDS 7645 - 8109 553 ## COG1846 Transcriptional regulators 9 6 Op 3 . - CDS 8168 - 8347 134 ## 10 7 Tu 1 . + CDS 8403 - 8771 537 ## SAV_4695 Lsr2-like protein + Term 8798 - 8833 9.5 11 8 Tu 1 . - CDS 8997 - 10205 170 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 - Term 10405 - 10447 5.5 12 9 Op 1 . - CDS 10486 - 11016 547 ## gi|154508488|ref|ZP_02044130.1| hypothetical protein ACTODO_00989 13 9 Op 2 . - CDS 11113 - 11226 114 ## 14 10 Tu 1 . + CDS 11225 - 11962 1101 ## COG0588 Phosphoglycerate mutase 1 - Term 11885 - 11940 1.3 15 11 Tu 1 . - CDS 12094 - 12555 637 ## + Prom 12677 - 12736 2.0 16 12 Op 1 5/0.000 + CDS 12766 - 13281 850 ## COG1329 Transcriptional regulators, similar to M. xanthus CarD 17 12 Op 2 . + CDS 13278 - 14006 285 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 + Prom 14010 - 14069 1.8 18 13 Op 1 . + CDS 14122 - 15474 2197 ## COG2270 Permeases of the major facilitator superfamily 19 13 Op 2 . + CDS 15478 - 16173 708 ## Tint_0107 restriction endonuclease BglII + Term 16183 - 16234 2.2 - Term 16066 - 16100 1.5 20 14 Op 1 . - CDS 16170 - 17006 1164 ## COG4725 Transcriptional activator, adenine-specific DNA methyltransferase 21 14 Op 2 . - CDS 17055 - 17567 540 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 22 15 Tu 1 . - CDS 17732 - 19327 1702 ## gi|154508501|ref|ZP_02044143.1| hypothetical protein ACTODO_01002 23 16 Tu 1 . - CDS 19458 - 21050 1416 ## gi|293194135|ref|ZP_06609936.1| conserved hypothetical protein 24 17 Op 1 . 
- CDS 21158 - 22741 1167 ## gi|154508501|ref|ZP_02044143.1| hypothetical protein ACTODO_01002 25 17 Op 2 . - CDS 22814 - 24250 769 ## gi|154508501|ref|ZP_02044143.1| hypothetical protein ACTODO_01002 26 17 Op 3 . - CDS 24247 - 25770 1111 ## 27 18 Tu 1 . + CDS 25717 - 25843 114 ## Predicted protein(s) >gi|319977657|gb|AEUH01000200.1| GENE 1 167 - 1348 1327 393 aa, chain + ## HITS:1 COG:MT3747 KEGG:ns NR:ns ## COG: MT3747 COG0470 # Protein_GI_number: 15843256 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication # Organism: Mycobacterium tuberculosis CDC1551 # 1 389 4 399 404 232 43.0 9e-61 MDGVFSGLIGQEAAVAVLREAAGSARATTAAGDGEGARAMSHAWLITGPPGSGRSVAATA FAAALQCTGEPVGCGQCPGCRTTLAKTNADVVFVATETSIITVDTARSLVQQAQSSPSQG KWRVLVVEDADRLGESGANALLKAIEEPPAHTVWLLCAPSPEDMIATIRSRCRCLGLRIP RAGAVADLLVDEGVADPETALEAARAAQSHIGLARALARDPQMRQRRREIITAPARVRSV GEAVIAAGRLLETARAQADAQVGERNAREKSELLRQLGMDEGERATKQSRALLRQLEEDQ KRRAKRALTDALDRALVDLLAIYRDVLMVQLDSRQELINTDLSDLVHGIAADSSPAQTMA RVDHIEQARRRLIANGNVLLVLEAMVVSLRPQA >gi|319977657|gb|AEUH01000200.1| GENE 2 1369 - 2979 1852 536 aa, chain + ## HITS:1 COG:MT2281 KEGG:ns NR:ns ## COG: MT2281 COG0596 # Protein_GI_number: 15841715 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Mycobacterium tuberculosis CDC1551 # 62 535 55 518 520 226 35.0 9e-59 MNTRQIWATVVAVVAVVAVVCVTGFYYGWFAPLQLEGAAGPVASWTPRGDGTDDYYSQKL VWGSCDDVTLNEGKGENTDDPTVYQCARVSAPLDWDAPSGESIELAVAVRRSGTANAPFL FVNPGGPGGAVVESLPYYAGSLLGKKVVRAYDIVALDPRGVGQSTPVRCLSDSERDEKNA GGDGSSDTADLAPDEIVAAAEQESADFAAGCERLTGELYKHVDTVSAAKDFDLVRALVGA PALNLLGYSYGTFLGATYAGLFPDKVGRFVLDGAVDPAVDVNEMSAMQMRGLDAAIQHWI DDCQAGPRCPLGKSHADGVSTLVSFIKALESQPMRTRDPQRPLTSNLALSAIYGAMYDTS WYQTLTSAVGQAISNADGSALLEIADLLNERNDDGTYSGNSLDALVVVNNLDYGPVGTVD EWAEAASALEAELPIMGPYGGYPSAGLAAWPTAHAERAPITAAGAAPIVVVGTTHDPATP YPMAQALASQLSSGVLVSVEGWDHTSYKRGGNQCAVDAVEKYLVDGTVPQSGLMCQ 
>gi|319977657|gb|AEUH01000200.1| GENE 3 3458 - 3928 851 156 aa, chain + ## HITS:1 COG:slr1894 KEGG:ns NR:ns ## COG: slr1894 COG0783 # Protein_GI_number: 16329942 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Synechocystis # 14 153 16 155 156 64 32.0 7e-11 MTVESTGFKASDVLAGELQKVLVDVTALSLVGKQLHWNITGEGFRSLHLYLDEVVDIARG ASDEFAERMRAVFAVPDARPDVVAQSNSLPAVPEALVTTSEGADLAVAAITATVGTMRDV HEKVDAEDSASADILNDYIRRLEQQAWFIRSQNGRD >gi|319977657|gb|AEUH01000200.1| GENE 4 4034 - 5068 1144 344 aa, chain + ## HITS:1 COG:YPO3117 KEGG:ns NR:ns ## COG: YPO3117 COG0276 # Protein_GI_number: 16123282 # Func_class: H Coenzyme transport and metabolism # Function: Protoheme ferro-lyase (ferrochelatase) # Organism: Yersinia pestis # 20 337 5 317 320 269 44.0 6e-72 MTDDRTNGAPAPGDGHGARRPAVVLANLGTPSAPTPSHVRRFLREFLSDRRVVETHPLLW RPVLEGVVLRVRPRVVAKKYAGIWTPQGSPLMRYTLRQAELLGRRMPDVDVQAAMRYGEP GLGAVLDALHARGTRRVAVLPAYPQYSATTVASLNDVAAQWLRRNRDGFDLRLVRSFPTA PAYIDALASALESHWGRCGRPDFAAGDAVVVSFHSIPEAMDRAGDPYRSECMGTVAALEA RLGLARGALTVAFQSVFGPAAWLGPATIDTVTRLGARGCARLDVICPGFMADCLETLEEI DQLNREAFTRAGGSGFHYVPWGNDSEGAVSALAEQARTAVAGWV >gi|319977657|gb|AEUH01000200.1| GENE 5 4984 - 5166 103 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGSHATRLTPGPPARGSVRSPVRCDKSADRPRAYTQPATAVRACSASADTAPSESLPHGT >gi|319977657|gb|AEUH01000200.1| GENE 6 5155 - 6387 1510 410 aa, chain + ## HITS:1 COG:Cgl2392 KEGG:ns NR:ns ## COG: Cgl2392 COG0626 # Protein_GI_number: 19553642 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Corynebacterium glutamicum # 19 407 10 383 386 350 47.0 3e-96 MTPHSDPSPCPRSAAPASFATRAVHVRFDPDPVTGDVVPPVHVSSTHVQNSPGDLKEGFE YGRCGNPTTNAFAGALAALEGAEHGFAFPSGMSAEDTLVRLLTRAGDHVIHSTDVYGGTH KLFSVIKPAEGVSSEAVDLTDAERAARAIGERRPALVWVETPSNPFLTVTDIEAVAELTH RAGGLLVVDNTFATPVLQRPLELGADAVVHSTTKYVGGHSDVVGGAVVLRDGLRLPDRIE 
PFFGDRDAAAEMAGLQMTVGAVESPRDAHLAHRGLKTLALRVERHCASAQRVAEFLQSHP KVAAVHYPGLPGHPGHALARRQNPLGVGGVLSFEVATEEQALVLCTRTRVFALAASLGAA ESLIEHPAVMTHSTRAGGVGGVPGTLLRLAVGLEDAADLIADLDQALARI >gi|319977657|gb|AEUH01000200.1| GENE 7 6350 - 7648 1304 432 aa, chain - ## HITS:1 COG:NMB0982 KEGG:ns NR:ns ## COG: NMB0982 COG0038 # Protein_GI_number: 15676874 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Neisseria meningitidis MC58 # 43 411 4 370 380 159 32.0 1e-38 MTPPPPEPRRPHHPALADTIRFACATALTGIIAGLVGLACIHILHSLTALVWDAHSGTLL EAVEAAPWWKRVAVLAASGAIGGISWTLLFRSNKDVVPVARCVQGESMPVVRALWHALTQ IGVVALGASVGREVAPREVAASLSAWMGDRLGLSPRDRRIIVACGAGAGLAAVYSIPLSG AVYTLEILLVGMSARACAAALAASGIAVLVSTGWARPEPFYTVPDLSPSLSLTVWAAVFG AVIGWLGWAFKGAVAAASGARPRGARLLWTLPVAFTAVGALSVWVPSVLGNGQASAQTQF DAAWAGGAGLAVAALVLAAKAVATLVTIRAGAWGGVLTPAVALGAGLGALSGLAWSQAWP GSPVAAHVFIGAAVFLGASMGAPFTGLVLVAEFTQQGAGILVPAIVATASATAASALARA LRSGRGPGRGPR >gi|319977657|gb|AEUH01000200.1| GENE 8 7645 - 8109 553 154 aa, chain - ## HITS:1 COG:Cgl1145 KEGG:ns NR:ns ## COG: Cgl1145 COG1846 # Protein_GI_number: 19552395 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Corynebacterium glutamicum # 3 145 2 145 162 64 27.0 8e-11 MPTTPAHFSRRAHAWETYFTTTARLTERIESALKREARLSMAEYSVLLMTSRAADSGVRP SVLAKQVVFSRSRLTHTLKRLEARGLVRREACKGDGRGGLVLLTDEGSRVFQSAALVQRS VIRELFLDGISEEEIRVLTTLFTRVATRLDGRPQ >gi|319977657|gb|AEUH01000200.1| GENE 9 8168 - 8347 134 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPRLRNAQCREYRIGLPGPAPFYSIPAAPSPSPLIPPSTAGRSAGRSVSQSRNHYEKLY >gi|319977657|gb|AEUH01000200.1| GENE 10 8403 - 8771 537 122 aa, chain + ## HITS:1 COG:no KEGG:SAV_4695 NR:ns ## KEGG: SAV_4695 # Name: not_defined # Def: Lsr2-like protein # Organism: S.avermitilis # Pathway: not_defined # 12 119 11 110 111 80 43.0 1e-14 MKTTKTITEVFDDFDGTPADQSVRFAFNGATYEIDLTRAHFEEFAEALQPWIKAGRRIGS GAPRSLKKRRKRTQREVEEAASISMANAKMRKWAGENGYSVSPKGRIPQKVVEAYLEANP NE >gi|319977657|gb|AEUH01000200.1| GENE 11 8997 - 
10205 170 402 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 294 397 246 344 347 70 37 1e-11 MSLRRSPLVALAAAAALAAGCSAGSPSSQSQGPAQSGTAPSPTTSATASAPAVPGYKPGE IPPIPLFTLPPIGVFAQNADKAVIDTATAIASVPGITVAPAACDGGSLLANGGNTVLYGD GSGYSRNGDQGVTNYGDGSGTITDGPITLVYYGDGSGNYTNSDTNEQIVVYPGGSGNYNS PALQIVSYGDGSGFYVNSETGDNITIYTGGSGNYSNSKTGVSIVNYGDGGGTYSDKTGLN IVNYGDGTALVNGKSIKADPVPAVGSIGSFPPVGSLKPVESCGTTITLQDGVLFDFGESA VRADAADTLAKLASVLADAKVPAAHVYGHTDSISDEAFNQTLSEQRAKAVVDELKKNGAT ASLDWQGFGETKPVAPNTNDDGTDNPAGRQANRRVEIFIPTF >gi|319977657|gb|AEUH01000200.1| GENE 12 10486 - 11016 547 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508488|ref|ZP_02044130.1| ## NR: gi|154508488|ref|ZP_02044130.1| hypothetical protein ACTODO_00989 [Actinomyces odontolyticus ATCC 17982] # 53 174 48 167 168 65 45.0 1e-09 MSIRKSPLAVLASISIAAAAGFALSACGSNNATPAQDAASSAAGAEAAGNAASSAAGAAG NAASSAAGAAGNAASSAADSAGNALSGDGVYTIDESNTTVALPSGTKTVVVNASNVDVTG GDVTEIKVNGSENDIHVNSAQRVSFSGSDNEIYYVQGPAPQVASDTGAENTVEQGR >gi|319977657|gb|AEUH01000200.1| GENE 13 11113 - 11226 114 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPRFSHIFTPAHTRRDGVHGTDPGPCPTIWNDPPGKP >gi|319977657|gb|AEUH01000200.1| GENE 14 11225 - 11962 1101 245 aa, chain + ## HITS:1 COG:ML2441 KEGG:ns NR:ns ## COG: ML2441 COG0588 # Protein_GI_number: 15828319 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Mycobacterium leprae # 2 241 6 245 247 326 66.0 3e-89 MTYKLVLLRHGESEWNAKNLFTGWVDVPLSDKGRAEATHGGELLKEAGVTPDLLFTSMLR RAIVTANLALDAADRHWIPVERDWRLNERHYGALQGKNKKEIRDEYGEEQFMQWRRSYDV PPPPIEAGSEFSQDADPRYAGEPIPATECLKDVLERLLPYWEGTIVPAIKTGKTVLIAAH GNSLRAIVKHLDDISDGDIAGVNIPTGIPLVYELDEETLKPVKKGGTYLDPEAEAKIAAV ANQGK >gi|319977657|gb|AEUH01000200.1| GENE 15 12094 - 12555 637 153 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLVAMVGGGLVGTLVANITGGTANGNTWSIAKDAHFAIATMLAVAGLICFAIGWRLNVVS AEARARAHADEVRARLVGSMSDGTLQVSPGVAPSSQDEAEAFVERTAAEQYAEARSALRN 
RHSVFYIPVQYIGALGLAGAVVVLVYAVINALV >gi|319977657|gb|AEUH01000200.1| GENE 16 12766 - 13281 850 171 aa, chain + ## HITS:1 COG:ML0320 KEGG:ns NR:ns ## COG: ML0320 COG1329 # Protein_GI_number: 15827084 # Func_class: K Transcription # Function: Transcriptional regulators, similar to M. xanthus CarD # Organism: Mycobacterium leprae # 1 159 4 162 165 186 62.0 2e-47 MSFEIGQTVVYPHHGAATIEEITTRSIRGAEKTYLKLRVNQGDLTIEVPADNVDLVGVRD IVDEDGLEEVLSVLRAPYVEEPTNWSRRFKANQEKIATGDIVKVAEVVRDLTRRDDLKKL STGEKRMLTKARGILTSELALARGIDKADAAARLDGILAEGRIDEAGLDAE >gi|319977657|gb|AEUH01000200.1| GENE 17 13278 - 14006 285 242 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 6 231 6 220 234 114 36 6e-25 MSRLVAVLTAAGSGSRLGCGGPKALVPLAHRPLLWWAARALVEAGATAIVVTAPAEAVAE FRSALEGVGAEVVVVAGSAASRQASVANGLGALPPLDADDVVLVHDAARPLTPPSMIRRV ADAVRSGCDAVIPTVPVADTVKVVVPLAGGLGLVEGTPDRSSLAAVQTPQGFTWHTLRDA HRAGAARAADEGAAATDDAGLVEALGIAVHTVAGDPAALKITRPEDLVVAEHLAARAGAG AR >gi|319977657|gb|AEUH01000200.1| GENE 18 14122 - 15474 2197 450 aa, chain + ## HITS:1 COG:Cgl2562 KEGG:ns NR:ns ## COG: Cgl2562 COG2270 # Protein_GI_number: 19553812 # Func_class: R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Corynebacterium glutamicum # 18 440 18 434 440 319 43.0 6e-87 MATSAPSAQQTGVINRSVIEWAAWDWGSAAFNAVATTFVFTTYLTSDGVFTDSGTANSWL SAGMTVAGLVIAVLAPITGQRADRRGKGGVWLGWFTGAVVVCLFAMYFVHPASALGPQGS LALGIVLLGLGNVFFEFASVNYNAMLNHLGTKENRGRISGFGWASGYIGGIVLLLVLYVC LIGNNLLGVPTEDHLNIRVAMLVAGLWFGGFAVPVILRPPLPENPRHDGSRESIVDSYRL LWRTVVTLRREAPHSLFFLIASAVFRDGLAGVFTFGAVLAKSAFGFSAGDVMLFAIASNV VAGLATAAFGFLDDRIGPKKVIIVSLSAMVLAGFGVFALHSRGAIVFWTLGLVLCIFVGP TQSASRSFLSRIIPEGREGEVFGLYATTGRAVSFLAPAMYWVSLAIGSAITPAGQDHTYW GILGIMLILLVGLALTIPVKADRATLDHME >gi|319977657|gb|AEUH01000200.1| GENE 19 15478 - 16173 708 231 aa, chain + ## HITS:1 COG:no KEGG:Tint_0107 NR:ns ## KEGG: Tint_0107 # Name: not_defined # Def: restriction endonuclease BglII 
# Organism: T.intermedia # Pathway: not_defined # 10 221 5 188 196 110 31.0 4e-23 MELTGSWERVLPQDVVRRFDFYETRQAGAILRAVDGEQFDEVVDVLRRFEIRRADLVRPG GQESRLAQRFNAAFRDRGWREARVDTEITLRLHVRPYRPAGEERATETTTSVENPGYLVD NFKGRVALDLEWNAKDGNLDRDIAAYRSLYDAGFIDVAVLVTRTQDELRSFARRVRLAEG MDEAEAKKMLATTTTTNIDKLLPRLRRGDAGGCPVLAVAITSRAFENAARD >gi|319977657|gb|AEUH01000200.1| GENE 20 16170 - 17006 1164 278 aa, chain - ## HITS:1 COG:all7280 KEGG:ns NR:ns ## COG: all7280 COG4725 # Protein_GI_number: 17233296 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Transcriptional activator, adenine-specific DNA methyltransferase # Organism: Nostoc sp. PCC 7120 # 10 189 3 192 210 116 34.0 4e-26 MPDAPLYAPLPTTEGGFQTVLADPPWRFANRTGKVAPEHRRLGRYETMELEDIRALPVGD VTAPNAHLYLWVPNALLPEGLAVMEAWGFRYVSNIIWAKRRKDGGPDGRGVGFYFRNVTE PILFGVKGSMRTLAPARSTVNMIETRKREHSRKPDEQYDLIEACSPGPYLEMFARYAREG WSAWGNEADDAVEPRGRASKGYSGGEIASLPSLSPNERMSPWLANRVAHILAEEYTQGTS IQELANRSGYSIARVRSLLKKSGVPFRDRGRRRTATPA >gi|319977657|gb|AEUH01000200.1| GENE 21 17055 - 17567 540 170 aa, chain - ## HITS:1 COG:MT3687 KEGG:ns NR:ns ## COG: MT3687 COG0245 # Protein_GI_number: 15843194 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 9 167 6 156 159 147 55.0 1e-35 MSDSPVPFRVGQALDVHAFASAESVAVRPGRVMMLACLEWPGEVPLEGHSDGDVAAHAVC DAVLAATGLGEMGSVFGVDRPEWAAASGEAFLREVARMAAAAGWRIGNATVQVVGNRPRM AARLPEAGARMSAVLGAPVSVSATTSDHLGFTGRGEGLAALASALVVRPA >gi|319977657|gb|AEUH01000200.1| GENE 22 17732 - 19327 1702 531 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508501|ref|ZP_02044143.1| ## NR: gi|154508501|ref|ZP_02044143.1| hypothetical protein ACTODO_01002 [Actinomyces odontolyticus ATCC 17982] # 46 528 56 493 496 100 23.0 4e-19 MSLVDKHPDQDPDRDGAQGAASRARRAGASPIVRKRPSELLRERRQRIRRRRLTTAGVVL ALALAAFYVWGPFHSPHKPELRSQGYTGTHTDTGRPSPKWANGTDIAWKLDHDTAGPGEY AVMKNGDQVILAHELNRDAGTGVISLDVSGTKPAIQWEATLPRDSSLRVETRPVLVGEDL 
FIGEHRLDPTTGAASPAPWAGRNRDGQDNPNLAVTIVITSRDGVVIACSRRICSGWEKED GEWTMRWERDSTPTPLEDYQDCPECGEGTWVILNDRRVFSWGPAPEAAPQGGRLLDPVAQ EGRIINTQTGEVRSLYLDKTAKDGQLELHSTADGWVVADSETDQATVFASDGRPLESFAI SVSAGTGDHKAVLPIGGQTPTSAQLRDFLTTGEAPWAEGLLRTRATTTDNGATICDTLSF TPTGSDRPSHESVIPWNYHIVQNEDGQCFIGAMNPTISKDHSIIRLKSDNALGDDFTIDM YGTSLFHTSGSFKFHGSTGYGTEASAVIRVYDDLVLVIHDEGVTAFTPRWS >gi|319977657|gb|AEUH01000200.1| GENE 23 19458 - 21050 1416 530 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293194135|ref|ZP_06609936.1| ## NR: gi|293194135|ref|ZP_06609936.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 6 530 30 517 517 96 23.0 4e-18 MSLVDKHPDQDPDRDGAQGADASRTRVGGASPIVRKRPSELLRERRQRIRRRRLTTAGVV LALALAAFYVWDPFHSPHKSPRKPELRSQGYTGTHTDTGRPSPDWTNGADIAWQLDQELE DSNDSAVVKNGDQVILAHELNKHAGTGVISLDVSGAEPAIQWETALPGDRSLQVETRLLL VGEDLFIGEHRLDPTTGADSPAPWAGRNRDGQDNPDQAVTIPLSSHDGVVIACSRRICSG WVKEDSEWTMRWERDSTPTPLGDYQDCPECGEGTWVVLNNWNTTAWGPAAEASPQWRWLF TPVAGMGTIINTQTGEVRALHDTKKENRFELYSTADGWVVADSKTDQATVFAPDGRPLES FAIADSVGAGNHKRVLPIGGQAPTSAQLRGFLTTGEAAWAEGLLRARATTTDDGATVCDT LSFTPTGSSHPTRETTTSGVFLRNRDTGECTTDATRSAISEDHSVIHTWNKHVRSDEIID MHGIGLFHASGSFIPTDATTALFARAVRVYDDLVLVFSHEGITAFTPRGS >gi|319977657|gb|AEUH01000200.1| GENE 24 21158 - 22741 1167 527 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508501|ref|ZP_02044143.1| ## NR: gi|154508501|ref|ZP_02044143.1| hypothetical protein ACTODO_01002 [Actinomyces odontolyticus ATCC 17982] # 46 524 56 493 496 96 25.0 5e-18 MSLVDKHPDQDPDRDGAQGAASRARRASASPIVRKRPSELLRERRQRIRRRRLTTAGVVL ALALAAFYVWGPFHSPHKPEPRSQGYTGTHTDTGRPSPDWTNGADIAWQLEGRVPLADEA AIVKNGDQVVLFHEWDGRKNIGVVSLDVSGTQPVTQWRGVLPKDVSLRPVIHPVLIGDDV FAGAFRLDLATGSRALAPWAERDQEANRPPATAITSGYGVVIACGEDTCSGWAKDGDQWT MKWERDSAPTPLREYEYCSECGGGTWVPLNDRDVVEWGRATDPFTEWGTSSIVVQYPKRI INTETGEVRSLCLDEAKADQFELYSTADGWVVADSKTDQATVFAPDGRPLESFLIVDSAG AGDHKAALPITGQAPTSAQLRDFLTTGEAPWAEGILRVLPTTEDNGTKTCGELSFTPTGS THPTRKTAIGTSGIAGPGKRDCAIRARRPAISEDHSAIFLRGDAYGGDDGHDAIVDMHGT SLFHTSGTTYPAAGDALLDGGRSVRVYDDLVLVFSYEGITAFTPRGS 
>gi|319977657|gb|AEUH01000200.1| GENE 25 22814 - 24250 769 478 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508501|ref|ZP_02044143.1| ## NR: gi|154508501|ref|ZP_02044143.1| hypothetical protein ACTODO_01002 [Actinomyces odontolyticus ATCC 17982] # 10 475 45 493 496 112 27.0 6e-23 MTHGSEQRTTAGAPAGASSARQRRRRTTALAALGAVVVLVVALALVRDHVFSSPAPVLLP SQPPRAQGYTGDHQDTGRPAPSWSGGVDTAWSIRTFDDNWRASKNAIRSAEFAASGDGRV FTAFEAEVGDGFQVASINVTGPAPVVEWMHFYPHHTDDAPLITTDTGLVIADLVIDSSTG ESTPAPWADESFMATPRAHVNGTLVVCGAGSCSGWRQDTGPGQWTRQWKASGDGLRPTSP LTPAGSGASTWLLVSSSSAGPVTILDPSTGELRDLPPEQPSGVWGDRQVYAASDGWAVVS AEAGQVLTYTPDGQASGAYPYNPDIDSGSHRFALPVGGRTPTAGDVAAFVTTGTAPWADG VLRVLGAQESRPCATYSFTPTAEGAPTRKATVNHVPYAKPVDGKCAFAYLPRQTRVSADG AVLFTASRYDGASLVDLGGSGAFGRAGEARLGPAHWLYDDLVVSADPYGIAAFAPKEP >gi|319977657|gb|AEUH01000200.1| GENE 26 24247 - 25770 1111 507 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDPLDDEPAPADDAPEAHPGGAQQDGGPRADEEPLSAIRQRHQRARRRRALAISGALVVV LVIGVLVYPRLSAKPPTGAQESGYENTQSVPGAPALGPDLDVLWRVPAPPSGDGAVNARI LPVDDLLVFVAKETAQGDIWVAALVPERDSARQPPSLLWASTIKDPEGLGDAPMTAYTSF LEWRDVWGRPNDSEESKVFFKDWAVDAGTGAVERAPWADPDAATPAHAVGTFNKTVIACG ADCSGWTYSSDSGWTAAWTQNAVRIPHSYRQAEAEKNPKDKNWPHTWILLDPDQDGALPI LDTATGEVRTPFPTGTSPDDVYAEKHGWTVVEEDSATVTSYSADGNRTQVGTLAPDARSA VPLDDSHYHNLTGMSGPEDPVARVTGEDCDSLTIGRPSDDPPAVRVPASTRIATRTGDSC VPAFALSSPLRSGAHEPVVALAERGGTDPAFLVLAAMRTDERPDGPRLLLPQDGSPPMVF TGTGSSTFLIGVDENGLIGFVPKVKKK >gi|319977657|gb|AEUH01000200.1| GENE 27 25717 - 25843 114 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLRGVVGGGGFVVEGIHTRKYAKVVAEVECFSSIQNLRALS Prediction of potential genes in microbial genomes Time: Thu May 12 18:37:52 2011 Seq name: gi|319977653|gb|AEUH01000201.1| Actinomyces sp. oral taxon 178 str. F0338 contig00201, whole genome shotgun sequence Length of sequence - 4846 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 140 - 1873 1558 ## gi|293194134|ref|ZP_06609935.1| putative Hsp70 nucleotide exchange factor FES1 2 2 Tu 1 . + CDS 1908 - 3401 1973 ## COG0498 Threonine synthase 3 3 Tu 1 . + CDS 3537 - 4845 1774 ## COG0215 Cysteinyl-tRNA synthetase Predicted protein(s) >gi|319977653|gb|AEUH01000201.1| GENE 1 140 - 1873 1558 577 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293194134|ref|ZP_06609935.1| ## NR: gi|293194134|ref|ZP_06609935.1| putative Hsp70 nucleotide exchange factor FES1 [Actinomyces odontolyticus F0309] # 112 577 33 496 496 153 29.0 4e-35 MWTGPPAGILYGMASSDDQWSPYAGGAPKDPSGSDGGGDARFPYGPGKDLAAPPFEPQPG YTARAGSAGKGPAPSAAGSASSAQASSEAAPFGPQPGYTARAELGGRGTGPAAPNRRRST GTAQSGAGARSAGSRARLGMGKVKAYAAAAIALAVVAAYVLGRSAPTAEPPREQGYTGWA QDSGHPAAAWARGVDVAWRIDAPGEGNTRPRGRVLRYGSTVIAVDSGEDLAARVTAIDAS GAQPVTLWTTNPRDTDLMSDASPLVVGDELVMPGAVIDLATGSASEAPWGRDDTLAYASG VLVTCTGKETCSGWTRSEAVWERQWRVIAEQPDYRTQLWDRSVLRAGPEDDTWLLISNDP QGEASVLRATTGEVRTIAPAPAKGAYAVRTAYGASDGWVLVDVNTKEAVAFSPEGEQSGT FPVTEPVGSGSTTRTVTAGDGRPTVDGVKRFLTTGRAPWSDGTVWMDGKDCTTVNFAPSE DRKPVAAAVEKEAGLANGSSGACTLVFSHAVAGEGDSVLFVQQGVDPDRKKALIDMSAGG VVGSEELSGMATPTWVYDDLIVGIDATGIVALTPKDA >gi|319977653|gb|AEUH01000201.1| GENE 2 1908 - 3401 1973 497 aa, chain + ## HITS:1 COG:Cgl2171 KEGG:ns NR:ns ## COG: Cgl2171 COG0498 # Protein_GI_number: 19553421 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Corynebacterium glutamicum # 1 490 1 471 481 561 60.0 1e-159 MRYTSTRQTTGSPRPFSDILLEGLAPDGGLYVPERYPRLSADDLEALRRVLDQQGYAALA ARILAMYIDDIPEEDIAAMASRAYSAPAFSDPAIVPVTALEGAGFDLRLAHLSLGPTAAF KDMAMQLLGELFEHELRRRGQWLTIVGATSGDTGSSAEYALRGRDGLSVVMLTPAGRMTA FQRAQMFSLLDDNIVNVAVDGVFDDCQDLVKAVNMDAAFKARWHLGAVNSINWARLLAQV VYYVATWLRATEGPGGGSPCTHPGTGAPERVSVVVPTGNFGNVCAAHIARQMGVPLDALV VATNENDVLDEFFRTGVYRPRSSDQTLATSSPSMDISKASNFERFVHDLLGRDGARTADL FGTALVRNGCFDLSGTPEFQAMRETYGFVSTSSTHADRLAEIARTEAESGYLIDPHTADG VHAARELASRGELAGTVVVMETALPVKFADTIMEATGHLPPVPGRFAGIEAGGQRVVPMA NSVDALKELIAQRAGGR >gi|319977653|gb|AEUH01000201.1| GENE 3 
3537 - 4845 1774 436 aa, chain + ## HITS:1 COG:MT3686 KEGG:ns NR:ns ## COG: MT3686 COG0215 # Protein_GI_number: 15843193 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Mycobacterium tuberculosis CDC1551 # 1 417 7 405 469 400 51.0 1e-111 MRLYDSKTASVRPLEPVVPGEVGIYLCGATVQGSPHVGHLRAAVSFDTLIRWLRRSGYEV TYVRNVTDIDDKILTKSAQAGWDWWAWAYRYEREFTSAYDALGVQAPTYEPRATGHIPDQ IDLVRRLVERGHAYDDGAGNVYFDVHSQPDYGSLTRQRLADMRTTEDEAQIDAAVEAGKR DPRDFALWKATKPGEPATASWDSPWGRGRPGWHLECSAMSRRYLGDEFDIHGGGIDLRFP HHENEQAQSHGAGWAFARMWVHNAWVTTKGEKMSKSLGNVLSLEALTRDHPAVAVRWALS TVHHRSAIEWGPETLDNAASAWARFSGFVSRAIEAVGEAGAEEVAIPADGLPAAFREAMD DDLNVAGALAVAHEHLKAGNVALGSGDAEAVRREQVLVRSMLDVLGLDPASPQWRGQAGI GGGAGAGERAALDALV Prediction of potential genes in microbial genomes Time: Thu May 12 18:38:33 2011 Seq name: gi|319977607|gb|AEUH01000202.1| Actinomyces sp. oral taxon 178 str. F0338 contig00202, whole genome shotgun sequence Length of sequence - 59382 bp Number of predicted genes - 49, with homology - 25 Number of transcription units - 24, operones - 13 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 130 216 ## 2 1 Op 2 . + CDS 175 - 1950 2035 ## COG0366 Glycosidases + Prom 1959 - 2018 1.9 3 2 Op 1 . + CDS 2147 - 2914 781 ## 4 2 Op 2 . + CDS 2930 - 3250 408 ## 5 2 Op 3 . + CDS 3269 - 4522 855 ## 6 2 Op 4 . + CDS 4512 - 5711 816 ## 7 3 Op 1 . - CDS 5708 - 5899 60 ## 8 3 Op 2 . - CDS 5984 - 6553 585 ## 9 4 Tu 1 . + CDS 6917 - 7459 967 ## COG3797 Uncharacterized protein conserved in bacteria 10 5 Tu 1 . - CDS 7532 - 8095 454 ## 11 6 Tu 1 . + CDS 8303 - 9280 979 ## COG0566 rRNA methylases 12 7 Tu 1 . - CDS 9277 - 9822 545 ## 13 8 Op 1 . - CDS 9954 - 10247 369 ## 14 8 Op 2 . - CDS 10247 - 10675 452 ## 15 8 Op 3 . - CDS 10672 - 11994 1513 ## gi|256667333|ref|ZP_05478286.1| hypothetical protein StAA4_09310 - Prom 12020 - 12079 3.6 + Prom 11997 - 12056 3.2 16 9 Tu 1 . 
+ CDS 12273 - 13556 1693 ## COG0104 Adenylosuccinate synthase + Term 13692 - 13745 0.1 17 10 Tu 1 . + CDS 13872 - 14717 818 ## gi|154508511|ref|ZP_02044153.1| hypothetical protein ACTODO_01012 + Term 14897 - 14926 1.9 18 11 Op 1 . + CDS 15261 - 15935 743 ## gi|154508512|ref|ZP_02044154.1| hypothetical protein ACTODO_01013 19 11 Op 2 . + CDS 15919 - 17121 1082 ## COG0354 Predicted aminomethyltransferase related to GcvT 20 11 Op 3 . + CDS 17121 - 17546 407 ## COG1490 D-Tyr-tRNAtyr deacylase + Term 17558 - 17608 8.6 21 12 Tu 1 . + CDS 17632 - 18090 -275 ## + Term 18335 - 18380 -0.5 22 13 Op 1 . - CDS 18145 - 18312 263 ## 23 13 Op 2 . - CDS 18233 - 18685 467 ## 24 14 Tu 1 . + CDS 19049 - 21517 2537 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 25 15 Op 1 . - CDS 21683 - 24397 2964 ## COG1404 Subtilisin-like serine proteases 26 15 Op 2 . - CDS 24401 - 25771 1568 ## gi|154508577|ref|ZP_02044219.1| hypothetical protein ACTODO_01078 - Prom 25843 - 25902 2.7 + Prom 25787 - 25846 1.9 27 16 Op 1 . + CDS 25965 - 26288 436 ## 28 16 Op 2 . + CDS 26338 - 26625 347 ## + Term 26659 - 26711 -0.2 29 17 Op 1 . + CDS 26818 - 27573 713 ## 30 17 Op 2 . + CDS 27595 - 27900 233 ## 31 17 Op 3 . + CDS 27912 - 28691 793 ## 32 17 Op 4 . + CDS 28697 - 29182 628 ## + Term 29205 - 29249 7.1 33 18 Op 1 . + CDS 29276 - 30772 1701 ## 34 18 Op 2 . + CDS 30863 - 31945 1329 ## COG1404 Subtilisin-like serine proteases - Term 31748 - 31790 -0.7 35 19 Tu 1 . - CDS 32022 - 33416 1064 ## + Prom 33401 - 33460 2.2 36 20 Op 1 . + CDS 33605 - 37879 5495 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 37 20 Op 2 . + CDS 37946 - 38485 778 ## gi|293191448|ref|ZP_06609190.1| conserved hypothetical protein 38 21 Tu 1 . - CDS 38404 - 39054 276 ## 39 22 Op 1 . 
+ CDS 38914 - 44940 6894 ## AAur_1424 fibronectin like domain-containing protein 40 22 Op 2 23/0.000 + CDS 44995 - 45960 1273 ## COG0714 MoxR-like ATPases 41 22 Op 3 5/0.000 + CDS 45967 - 47142 1450 ## COG1721 Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) 42 22 Op 4 . + CDS 47139 - 49733 3171 ## COG1305 Transglutaminase-like enzymes, putative cysteine proteases 43 22 Op 5 17/0.000 + CDS 49730 - 51259 1923 ## COG0515 Serine/threonine protein kinase 44 22 Op 6 . + CDS 51285 - 52157 804 ## COG0631 Serine/threonine protein phosphatase 45 22 Op 7 . + CDS 52157 - 53584 1402 ## gi|293191205|ref|ZP_06609138.1| putative FHA domain protein 46 22 Op 8 . + CDS 53748 - 56459 4250 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase + Term 56487 - 56533 18.6 + Prom 56664 - 56723 2.9 47 23 Op 1 5/0.000 + CDS 56800 - 58452 2390 ## COG0672 High-affinity Fe2+/Pb2+ permease 48 23 Op 2 . + CDS 58484 - 59128 1071 ## COG3470 Uncharacterized protein probably involved in high-affinity Fe2+ transport 49 24 Tu 1 . 
+ CDS 59233 - 59380 120 ## Predicted protein(s) >gi|319977607|gb|AEUH01000202.1| GENE 1 2 - 130 216 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no IAERAEARAAKDWSRADALRDRLAGAGVVVADGRDGATWSLS >gi|319977607|gb|AEUH01000202.1| GENE 2 175 - 1950 2035 591 aa, chain + ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 19 577 5 558 561 587 52.0 1e-167 MFSEELISSREAGLGHDDWWKGAVVYQIYPRSFQDSDGDGVGDLRGVRSRLDYLEALGVD VLWLSPVYRSPQADNGYDISDYCSIDPLFGTMDEFDALLGDAHARGMKIVMDLVVNHSSD EHPWFAASRSSKDDPKRDWYYWRPPRPGRAPGEPGAEPNNWGSFFSGSAWTLDEATGEYY LHLFHRKQPDLNWENPRVRAAVHAMMNWWLDRGVDGFRMDVINLISKAPGLPDGEVLPGR AWGEGFPLVSEGPRLHEFLAEMRREVFDGRGGAGADAPLAVGEAPGVRIDKARLYSGAAR RELDMVFQFDHVDLGLEAGKFRPRPLRPGELADCLSAWQEALADDGWNSLYLDNHDQPRA VSRFGDEEHWYASATALATALHLLRGTPFIYQGEELGMTNGVFGSVGDLRDVEALRYYDG AVAAGEDPGAVLAGLRAMGRDNARTPVQWDGGENAGFTSGAPWIGVTPNYREVNAAAQVG DPGSVHSYYRALIAIRHRLRVVALGSFERLDAGDPRVFAYRRALGEDRLLVVVNLSSDSV RPALADGGGLVQLLGNADEAKAPGDALGPWEARVYLGGRTSIGIAKIYCES >gi|319977607|gb|AEUH01000202.1| GENE 3 2147 - 2914 781 255 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQFMQTAEFGSLIGPLAKLGAISEIMAAAVAEIPDSVDVVDPTGRIGLTVRSDGGIENFQ IIGGWREAIGPARLGDAITALINTGRTRLLEAMNSSIDQNWGEAIDNGWLSGRGVGPEQD AAMRDAVEDSAAQIIQAAHAQGPVIPDQAFERAADLFERMDESRRLLDEAEDASRSVVDE ADVEDVACEFSNGAVSRVLVNPMWARKTPIVRIRQRVIEEISSPNGGQASIGLGQSESAQ VVLDMIQVLHAISSN >gi|319977607|gb|AEUH01000202.1| GENE 4 2930 - 3250 408 106 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTDVSSIEVDVSALRDDANEWMHQGDRLRSLAGVLSAVPDSGSNVPPDENLVIGAVNAVT ALLDSLVAQAADEFDEMARALLVSAQSYETTEEELSDGFKRFNDAF >gi|319977607|gb|AEUH01000202.1| GENE 5 3269 - 4522 855 417 aa, chain + ## HITS:0 COG:no KEGG:no NR:no METSTGPRDSYHDDGFTREGGGVARSGVGQSRSGGLGGDYGAERGIRARDGLSRSEGATA ARGSGYPVERGSRGSADDRPGSSTAVYDRTAAEGRYPNRADAGTAGGAPVEQSGQFDLYT SIVTVFQEISKNREGIRAAISRLEATWTAVSPMLDAAGLTAMKVLVSMSITALRQLSEGL 
IAMADALQVFLASHQGVVDVLKRLSEQFSPQLQAFVEDEVVGSMRTRKAGLKGATVDQYN KQADTQADAMVTLSEASTQLTRGLSNLAKAHSALMVGFGAATGGAALGIAGGTAVLTPPA TPAAPAVYMSVIAATVSQIGALGSAYVSARDDILGGIQKICDVPTLASGQWPKSTGDSGA GASATARRSSGSVERAAGASATARRSSGSVERAAGASATARRSSGSVRQTAGASNAR >gi|319977607|gb|AEUH01000202.1| GENE 6 4512 - 5711 816 399 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLDSQVVRASEPSYFDRLRAVADLRYDDSAEETAYNTVMEQISECERLVSEYEQTNGYHS QEVRDAVSEWAEAFRSDLTRTRDELAEAQAVVGQARQIMRQARDLFYQNVSPELLSDGER LFEQVISAGTSVATLVVPIGGYLAHVAADTFFDMLKEQRDGEREAYCKQIVDQMNEKLLT GSEAMRDAIDRQCCGGLETPMDGRARRQGVCADAPGSKWGAEGFQQPGLAQESPQRVPVK DLEGVCLLERPVNQSVTPNGLVGGYAPPSATDFSDSRWDASYRIPSSVSDVSRTASMGAF GLGGAKAAQALSAHQLLGAGANPSAAQGAQGMLGPTGAGGGAAGEKQEGRKEYTFSGFQG LFFDPDAESDSTPWDPAHGPGSGDDGTEFELRLEDWELP >gi|319977607|gb|AEUH01000202.1| GENE 7 5708 - 5899 60 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAPPGGRTTQPKPFICCRVRPQAVNGALTRDSAAAPGLGHGLRHSASRQGPLSHDDDLTP HNA >gi|319977607|gb|AEUH01000202.1| GENE 8 5984 - 6553 585 189 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDAGGTEAGMGYFRAVFGGNSYYEYKVRVGPEEAVRRAVDFWRSKKCKVGGDDIDRRLRK AGYTGTEMSGGSDLGILKDLSLLVSVVGWPLLLVPATRRAVPRPFTIGIVASLEGPETNG TTLFCFDANEDDSESVFSPREHTEHQMIELGRELARQGILREAPRRLTRRDLPKGHPLRD YDAFKLLWR >gi|319977607|gb|AEUH01000202.1| GENE 9 6917 - 7459 967 180 aa, chain + ## HITS:1 COG:SP0830 KEGG:ns NR:ns ## COG: SP0830 COG3797 # Protein_GI_number: 15900717 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 3 179 4 180 180 176 46.0 1e-44 MEYVALLRGINVGGKNKVVMSELRKQIAAEGYGNARTYINSGNLVFEADSPRDEVAQTVE SVLEGHYPFPIRLALLSGREYLAELSGLPEWWHGDAARRDALFYTRGLDRAHVRERIEAM PLGDEAVHFGENAVFWGKFNEAEFLKTAYHKHLLREDFYRQVTIRSGATVERIASMLSRD >gi|319977607|gb|AEUH01000202.1| GENE 10 7532 - 8095 454 187 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGFLRAEFGCTRYYEYQVGVGPEEAVRRAAAFWHALDCEVADYNMDSRLRELGYTGTEVL GGGGSEATFQNDLLELLPLGLVLLFSRKIRLSNPGPFQVGIAAPLDGPYPDRTTLVCFHA 
TWIVHEKGFISPREYTELQMIALGEELGRQGVLLENPHRFTQRTLPRDNPLRVFNWYRIR RAAVEGS >gi|319977607|gb|AEUH01000202.1| GENE 11 8303 - 9280 979 325 aa, chain + ## HITS:1 COG:Cgl2588 KEGG:ns NR:ns ## COG: Cgl2588 COG0566 # Protein_GI_number: 19553838 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Corynebacterium glutamicum # 1 322 1 309 313 256 48.0 3e-68 MAGDSGRPGAIRKKGTKKGAQVGTGGHSRRRLEGKGPTPKAEDRPYHPAHKRKARREARQ AQEAAIARARAKSSIRVPAGHELIAGRNPVAEAARAGVPIERVFVLDNVKDDRVEEVIRL ASSMGAPVFEVTRRDLDVATDGASHQGVAIEVRGYEYTPVDDLIAGSVQQVGHPLLVALD QVTDPHNLGAVLRSAGAFGGDGVIIPERRSAGVTTAAWKVSAGAAARVPVARATNLVRAL EDCKRAGFFVVGLDGGGDAPLRGLPLADGPLVLVAGAEGTGLSRLVRQTCDQIVSIPISS AVESLNAAVATGIALYEVASVRAGG >gi|319977607|gb|AEUH01000202.1| GENE 12 9277 - 9822 545 181 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAKGLPITRCAYCWYTLSCPPEEALQWGIDYWRSLGCRTTKRDVNARLLPLGWTGAEMRG GSRAGGCFGAAGEGCMEAFPPAALLFLLAARLLRTSYIINIAASPSSSQAGATDLVFFRT GPDRLNDSALYGPQGFVKARMQLFRDRIAPSGLLLSGPWLTTEQDLPDSHPLAPANWLRK G >gi|319977607|gb|AEUH01000202.1| GENE 13 9954 - 10247 369 97 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEEDLKVNGQGLQETIESLKSSLTEMQNSFDEIRNGHSQLGTSWKGEASDAALTKLSGLE DEGNSQTETLQNTIAALEAALEGYNKAEETISELWAL >gi|319977607|gb|AEUH01000202.1| GENE 14 10247 - 10675 452 142 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTTDRFGPGADDPLEEIYSLIESLKTMTDSLHEEFDDSEEDTDWRDEWAVSARHGDYGDD WRIIQRRIDREETTMDAVLMGEDQSPEAIRIRQTSEENLSRFEDDLDDEEKEHLDSLRDE LSGAAQDLQERIERLTQQLKGL >gi|319977607|gb|AEUH01000202.1| GENE 15 10672 - 11994 1513 440 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|256667333|ref|ZP_05478286.1| ## NR: gi|256667333|ref|ZP_05478286.1| hypothetical protein StAA4_09310 [Streptomyces sp. 
AA4] # 221 408 234 420 596 65 27.0 8e-09 MDLSVKYETLQLSQQLLERQATTHLTNIQSFMGEWCRMEPGQAAGIGTRIAASQRLKGTQ LGGSPVGKVINMSGKAVAKGVLFMLFIPINEAIVSIGEGAMDLLITCHQGAADKLNQTID TYADADKAAHEALMAILQSLGGSATPFEDPRDSPAQLGDARTKAGPHYGGPDPRVDQQLS QDAQEAGEYLSTLKDRASQRLSDARSSDRSVAESQDASSYLVPPEAPTSEMENLRWSAGA IAGSIDWAIEKLTGVSLLNDVVFKYFVGDWRLVNMAKSAWGEIGDALVAVGQNDSEVLPA LAEWTGKGSEVTNLFIAALSKATTSLSSATGVMSLLLTGFQLMLKESAKEVGESIRVIAN TVALMVAQSAIPVAGWVTAGATAIARADMVIKRIRKIYTIVNMIVDVIESFVRGRAQMLE VQNTMSNLAEAAVRGVAARV >gi|319977607|gb|AEUH01000202.1| GENE 16 12273 - 13556 1693 427 aa, chain + ## HITS:1 COG:Cgl2708 KEGG:ns NR:ns ## COG: Cgl2708 COG0104 # Protein_GI_number: 19553958 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Corynebacterium glutamicum # 1 427 1 428 430 531 60.0 1e-151 MPAVIVLGAQWGDEGKGKATDQIGQRTDYVVKFNGGNNAGHTVVVDGQKYTLHLLPAGIL SQGVTPVIASGVVVDLDVLFGEVEEMEARGVDCSRLLLSASAHIIPPYNRLMDQANERAR GNNLIGTTGRGIGPTYADKMNRIGLRVQDLFHPQELRAKVEAALAPKNTVLKAVDLPALD PAEVSDHLLSFADRVRPMVCDASLVVNDALDRGGTVLFEGGQATMLDIDHGTYPYVTSSN PTAGGALTGTGVGPTRIDRVVGVAKAYTTRVGEGPFPTELTDQVGEDLRARGGEFGATTG RPRRTGWFDAVVTRYAARVNGLTDICLTKLDVLSGYDTVPVCVAYEVDGARTEEMPLDQA SFASAVPVYEELPGWSEDISGCRSFDELPEAARAYVDRLEEVSRCRIQSIGVGPGREATI VRYPLLG >gi|319977607|gb|AEUH01000202.1| GENE 17 13872 - 14717 818 281 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508511|ref|ZP_02044153.1| ## NR: gi|154508511|ref|ZP_02044153.1| hypothetical protein ACTODO_01012 [Actinomyces odontolyticus ATCC 17982] # 5 281 1 280 280 193 52.0 9e-48 MDVPVPTRVSAYVVSGLALAALVVWTLPMSRWTIGPVTAAACVLLVAGWPRLSRVDTSPV VQIVVAVAGSTIPSAVAYWQNLDIAVALMGIALAVAVAATVLTAPAPRDRSVSPGDWSES ASGAVANTVTLLLFIMGGSMWVSLTVLERWSVTVPVVAFAALVVVWGNQVGRSKRVQACA TVALGLVAGLVAAWGAWYLGRTAGLLPAVFPTLAERYNEFDAFLVFGGVTGTGVGGAIVV VDGLLGTRTRDQALIAVAARGAAKFLIAVLPVYAMVRISGV >gi|319977607|gb|AEUH01000202.1| GENE 18 15261 - 15935 743 224 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508512|ref|ZP_02044154.1| ## NR: gi|154508512|ref|ZP_02044154.1| 
hypothetical protein ACTODO_01013 [Actinomyces odontolyticus ATCC 17982] # 1 193 1 201 220 251 70.0 3e-65 MMVIPADLPPLVAPVAWMLGTWRGWGTRAADGADAPVVEEVRGDVVGDQMRLVTRVYEGT ADRDVDPTWDAASGLSAIARGPLLVEETLYVSVAPSDAPLPPPGQHNPRAFTASGATTGG HASAWDGMSMGPRVRMVSEAIARVEGSERVVHYGRMFGLVAGELMWTQERTLEGADEAVV EFSGRLMRAEQAADTAEGADSPEQPDLAERAAARGEEGSDGLAD >gi|319977607|gb|AEUH01000202.1| GENE 19 15919 - 17121 1082 400 aa, chain + ## HITS:1 COG:ML2203 KEGG:ns NR:ns ## COG: ML2203 COG0354 # Protein_GI_number: 15828181 # Func_class: R General function prediction only # Function: Predicted aminomethyltransferase related to GcvT # Organism: Mycobacterium leprae # 24 374 14 358 373 155 38.0 9e-38 MASPTDPRTAPLPPAHQGGGAQDPVDGAWAAAPQHYGDPSGEQWALEGGRALVARPDLAV VDVSGEDRQTWLTSLSTQVVTGMAPGDSRELLVLSPEGRIEHWAGASDDGTTTHLIAEGM DAGALAGFLDSMRFALRVRVSVRDDLAVYASVRAGGNDAAAVGSLPGVEWTWEDPWPGVA PGGAAYYQGARHPGSRTPMMFHVVPRAMAGAFEGAWLEADGHRMAGMLAWEAMRVAAWRP RLGADTDARSIPPEVDWLRTAVHTDKGCYRGQETIARVVNLGRPPRRLAYLQLDGSRSEL PEPGTRIEVGGRTVGVVTSVARHADEGPVALALLARTVGPEQVFDIDGVAAAQEVIVPVD GKCSASPASRPGAELGGRQIRRSDGGSGALRGMGSALGSR >gi|319977607|gb|AEUH01000202.1| GENE 20 17121 - 17546 407 141 aa, chain + ## HITS:1 COG:Cgl1872 KEGG:ns NR:ns ## COG: Cgl1872 COG1490 # Protein_GI_number: 19553122 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Corynebacterium glutamicum # 1 140 1 143 144 149 60.0 2e-36 MRAVIQRVTRASVSVGSEVVGAIDGPGLMVLLGVARGDGPQQAAVVARKIAELRILDGEA SAQDAGAPVLVVSQFTLYGDTRKGRRPSWAHAAPGPEAEPLVEAVVADLRGRGLRVETGC FGAMMEVSLVNDGPFTVLVEA >gi|319977607|gb|AEUH01000202.1| GENE 21 17632 - 18090 -275 152 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIICCAQGSEGSPPKPRKGVRVSRWTAGSPCGGGFPRPCGRGNLHTPSSRRAGRLIPARA GRYLALGCFFGVVLAHPRAGGALPGARVLLRGGAGSSRAGGALPGARVLLRGGAGSSRAG GALLRAHRALPGAASRIGVGQSALGQVISLPQ >gi|319977607|gb|AEUH01000202.1| GENE 22 18145 - 18312 263 55 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQTTPNKPDFPLRTRHRLQNSPSPGPNHSSLRDKGPSGASTSPPTVLGKRGPALG 
>gi|319977607|gb|AEUH01000202.1| GENE 23 18233 - 18685 467 150 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLTVLGGALQPWSRATECAQERAVGAQIIDGDNQRRLQTASNTRPVGASFSQRYRDPRG RLRGLSRTAALFSKRYHTFTGIPHFASVITLAKWGIPLKNSASLTNHPAPAQRTRETLGF RGKLLANHPQQARLPFENTTSLAKQPEPRP >gi|319977607|gb|AEUH01000202.1| GENE 24 19049 - 21517 2537 822 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 809 2 803 815 981 59 0.0 MFERFTDRARRVVVLAQDEARGLKHNYIGTEHLLLGLITEGEGVAAKALETMGIKGDAVR KSVIEIIGEGEKPVEGHIPFTPRAKRVFELSLREALQLGHNYIGTEHLLLGLLKEGEGVA AQVLTKQGADLAQVRQTVIQMLSGYQRGDDEGRESVGAGVGGSGGPERSNSAILEQFGRN LTQAARENKLDPVIGRRTEMERVMQVLSRRTKNNPVLIGEPGVGKTAVVEGLAQAIVHGD VPETIKDRQIYSLDMGSLVAGSRYRGDFEERLKKVLKEIRTRGDIILFIDEIHTLVGAGA AEGSIDAAQMLKPMLARGELQTIGATTNDEYRKYIEKDAALERRFQPVKVDEPSVEDTIE ILKGLRDRYEAHHRVIITDEAIKSAAELADRYVSDRFLPDKAIDLVDEAGARLRIRRMTA PPELRELDERISELRRNKESAIDDQDFEKAAALRDQESKLGEERKAKEAAWKGGESDEIA EVTDHEIAEVLAMSTGIPVVRLTQTETSKLLRMEDELHKRVIGQDEAVKALAQSIRRTRS GLKDPNRPGGSFIFAGPTGVGKTELAKALAEFLFGDEDALVQLDMSEFSEKHTASRLFGA PPGYVGYDEGGQLTEKVRRKPFSVVLFDEVEKAHPDIFNSLLQILEEGRLTDSQGRKVDF KNTVIIMTTNLGTRDINKGVLTGFQSSEHQTHDYARMKSKVSEELKQHFRPEFLNRVDDT IVFPPLQKDEIKQIVDLMIAKLAKRMEAQDMHLQLSDGARELLADLGFDPVLGARPLRRA IQREIEDSLSERILFGEIQPGQVVTVGVEGEGKERKFTFNGH >gi|319977607|gb|AEUH01000202.1| GENE 25 21683 - 24397 2964 904 aa, chain - ## HITS:1 COG:alr3543 KEGG:ns NR:ns ## COG: alr3543 COG1404 # Protein_GI_number: 17231035 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. 
PCC 7120 # 474 776 106 389 615 108 27.0 6e-23 MASNKDLLNAQRYQRRRLTTVFSMGLPGGSEAEPTSMIGPIVVGSALSIVMVVVAALMGK FAPALPDNWDNGMLITVKDTGERYYTSKGVLLPLGNITTARLASAPGEMSTSSVSASALE GVSRGTPIGIIGAPDDVPASENLHSDQWTACAIGTVTRTWVATAPSALVENGTALVQSQG TTYLVVGSARHRIDDSALSGVLIALGLESYAPVEVDPAWIAVFADGTPIGPLAIDRAGTP VTGLPASVASPVIGSVLATQGDSRKYIVTASGTIAPLTEVTSALYSLGSPALSQATTVPV AELAALSIDTKGVGPADWPSTISAPNPEHAPCATLDLAASAPTARLSTAPLTSLAPAADS SADPDPKPTATGPGAGDLGVGNTDVTSGSTGADPAAKGPKVTVAGGSGALVRFTDGGALG ATVFVSDVGAAHPLGEAPDDTIARLGWTADDIVTLPAAWQRLIPAGVTLSNAQVWAAAPT ASSTPNGPVAGTTQMTTAPSSLTAPDAEGEACSADKPQLIPTQTSIINQLGLRNAWKISE GGGVTIGIVDSGVQATNAHFSGGALLTGMDLTGEADGRTDTYGHGTIVAGLIGARTIQGS GLTGVAPAAKLLPIRVYRDTEKETQDAGNGPRIDRMAQGITAAAKAGARIIVVAQSTPNG TPTLQAAVADATKAGALVVASAGTNSEDTITAPRYPAAYPEVLSVGAINPDGSHASNNAQ GETIDITAPGSNVLSAFSGGGDCIFSAETPATSYSAAYVGGIAALVASAHPKETPAQWKY RLTATALRPNPSEHSPLDGWGVVAPSAALALDLDTMIPGPPRPDGKTPIAPAVAPATPPD FSVNYGPHIQNVTLIITAAGSSAAMILTIVSWMRHPRPRPTTARPGKRGSGPRNAPRAPN ARPR >gi|319977607|gb|AEUH01000202.1| GENE 26 24401 - 25771 1568 456 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154508577|ref|ZP_02044219.1| ## NR: gi|154508577|ref|ZP_02044219.1| hypothetical protein ACTODO_01078 [Actinomyces odontolyticus ATCC 17982] # 10 453 3 444 445 117 29.0 2e-24 MDTATQAPLLIPITVAYGRRTVDVSVPSQMILAELAPTLAKAFGEAAPSLEHPITITTGQ QELAYGQSLAAQRIRPGATLTLAQEQDPAPARIYDDPVEAIGDQVESTVTPWTSASSVNL ATGVSTALLVLTALLVATSGQSIAAALVAGFAALIATGLASVVARDNAVGGLALAHTAPL LGGAAAVSFADATSPGGAWITGGIVIAIASIAAFALPGPLRISAIAPAVAGIALTLIGFS TLAFPTTPSGPAALITLVLAVILLLAPALVVSRPQTLDVAATSATQGGGLRAATVAPRFK AARVAALSVAAGAAAALPLTSALTVFGLSLQHGPAPSGAARAPMELQADPWGVALVATVG LVLTLTARNQRSRSEVLIHTATGAGLIAFACFGAAATLPGAVPVAAAAGLVASFLLLSFA VVSPRARPTLSRLAGALQLIGLVGAFPLVIMCWELF >gi|319977607|gb|AEUH01000202.1| GENE 27 25965 - 26288 436 107 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGMEVGAGAFDFAISTVNGSSDNLTRNLNAIRGVIEGARGIWKGAAAATFQQLMTEWDQ DVNNTIKALTEYVDKLNDMKRGYASTEEEAASRLKRTTSAGDYSGSF >gi|319977607|gb|AEUH01000202.1| GENE 28 26338 - 26625 347 95 aa, chain 
+ ## HITS:0 COG:no KEGG:no NR:no MNEDISADFPRMQDTSESITKGSQDIQQILDSMQSELQKVQWTGESANAYQVSQRKWSEG MRGLQSVLAAVGHEVQNAMLDYQANESRGAGIFNG >gi|319977607|gb|AEUH01000202.1| GENE 29 26818 - 27573 713 251 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSFPQTSELDGLVESFSRMNALADLMREASDHVPDRITVEDPTESLRLTMDRGGAIEDLQ IDGGWRDEIEPHELGGALTALIGEARTRLYEDADAYIAERADEFEERADGGAQSSAASAE IQARADELSRRSADQGQVLADQVFADLSDLSGRADRMLDDFDGAGAPGASEEEDEFADGE AFICEESGGLILRVSINPTWARVTPTVTIRNELVEVLTSCEGDALDGDSDSFLGDAEQSV VDLLGILNSLS >gi|319977607|gb|AEUH01000202.1| GENE 30 27595 - 27900 233 101 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MADKIKINVDALRGDANEWEHQASQIEPVSGHIPVIGPLRSAAFDPVVGACKAATEIVEV LNSLSLAAVAEFRQIADDLRLVADAYEAQEVEIGQHVKDAY >gi|319977607|gb|AEUH01000202.1| GENE 31 27912 - 28691 793 259 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSFPSPSSSPGSSASSTPLEEVLQKLMECGEKLTDLVEVSAKVIASAAKLLAKLELTALQ WALDKVHQIVTEFCSKAKDFLEKTKYTMSAFSIASGLRDFLDEKFTPAFKTLDQDVIDGM PTKRHDLKGDAYSSYSGLVGKQVKAASDMVNASTKLSSQLNSAIIAGYVFASGLLVAAAA AIFGFAKAFFAAGAVPAGTVAAPFIASEAGAFFIATVSAFVSIATSAWSSSANGMRAIKE IELVGSNKWPNSNLDYQEG >gi|319977607|gb|AEUH01000202.1| GENE 32 28697 - 29182 628 161 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGEESNTLVGVDPEQESGALNQTEGGVGALESAQSDGVSAMEREQTSLKWGYEEGPVAVS EKYKSVLQSTHTNLGTRITELREMMKGSQHSVTHYQEDDAEIAERFRRAARSDTVSGGTS GLPGAGNPAGSPAAPQPTTPGYQPPAAADGQTGTGFNGSGS >gi|319977607|gb|AEUH01000202.1| GENE 33 29276 - 30772 1701 498 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGNEDHQCRPGDDPAALMQGPTAFERLREVSELHYDDQADQDAYQKVLDQIAECGRLVTQ YEQTNGYHSQEIRSAITEWAQKYRSDLKRTEDQLTRGHQYATQARDVMRNARDTFLEKVS PELLSPAEEKMKAGINVGTTVMKYTIPISGYLVDVAADKYWEWLEGERKDKREQFCDEIL KKMNHSLNETAGKMESSIDQGKTTGDEYNDGDQPAIGGSMPGLGGADGGLPGGGLPGGGL PGGGPGAGDYDPGALGGGPGADGLGADGLGASGALDGSGADDPYAARGADWMSEGFNKPG VEQAAPPRAEIDDLDGVGLIDRPINQTVTPNGLVGGYAPPSSTDFSDPKWDPSYKIPSSV TDAGKAASAGALGAVGGMGAAGALKGLGGAAAGMSAKSLMAGGAGAVGAGLKGAGAAGAG LKGAGTGAGAAGGGMMGMGGAGAGAAGAGDDKKKKRRGLAGLFSDQSEQGGPVWDPAHGP GSADDGVVLEVDLDEWGL >gi|319977607|gb|AEUH01000202.1| GENE 34 
30863 - 31945 1329 360 aa, chain + ## HITS:1 COG:slr0535 KEGG:ns NR:ns ## COG: slr0535 COG1404 # Protein_GI_number: 16332024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Synechocystis # 35 246 136 359 613 76 32.0 7e-14 MGLCVAGAPGARALDEITTQEYVGPLGVDNEVASGYTGKGIKVAVIDGPADTTAPELTGA NVTVKPMCSFTASDAHRSHATAMVSILAARGYGVARGAEIINYSTPTAEDNDPTACATTS IDDAINAAVADGVRVISASVGKGDLAELARPAITAAVARGVVVVVATGNDGLQDPADSLS SINGTVGVGASDSNGNYQSFSNYGRGMTVMAPGNNTWVHDFAESGRIRGARGTSYAAPIV AGFVAVTMQRWPQATGNQVVQSLVHSATAGPTGQPLINPKGLDTTDPAQFPDENPLMEKF PGTEPSARTVSDYRDGVLADQSVFESDPSYVYRGVDAEVARSHADRSALGTSPRYHRKED >gi|319977607|gb|AEUH01000202.1| GENE 35 32022 - 33416 1064 464 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSELLVPFTLRLPDASLRDCAASQGTPLSEALDSVGVDPAWFQLCRISGGLVKPADRIGF EVAAGEVIMFVADPSAMLTQGRWASRAPNGGSGGEGDAAARNRAVPGGTLVGSALVIVLV ELFVLAGPFLWDWRPGLLSRLVTGALGVLYALVLAYVAPAWARTQCVATASAVFALSCLQ IVPADNPVGAAVAPVVMAWGGLLFCLAYRMRRVEDVEDAVAVPRAAWMVATACTTLMVLG GSLVARPAFLAVAAGTVLVWLSPQLTMRSSSFVLIDIKEVLTSALPGRQQDPPRPKAMNA RTSRAAYSSARVRSHVVIALGCAMCVLGGLMSGSHAGAGEWRGRAALATVLGAALSLAFA ARHQRSGGRLLMTGASVLLAALVAASPEVSAHVPGGSAVLLAGAAFLIAVLFPLRGAGGG RRKRMAPSPLVGRILDVLQALILIVVLPASVYASGLFDAIRQSM >gi|319977607|gb|AEUH01000202.1| GENE 36 33605 - 37879 5495 1424 aa, chain + ## HITS:1 COG:Rv3447c KEGG:ns NR:ns ## COG: Rv3447c COG1674 # Protein_GI_number: 15610583 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Mycobacterium tuberculosis H37Rv # 17 1421 3 1234 1236 728 35.0 0 MARLTKKTRQAEAVDESVPTRPQGELVLEPPPDREEAPNASNMLTSVVPMLGSVGMMAFM ALSQSNNPRMLFMAGAMVFAMISMVGTQMYRQVSAFRTKVADMRGEYLAYLAETRRAVRD MAKQHYQYLQHTLPAPDSLVLLAESGARVWERRTKTAGLLNVRLGESQQKLSKRLVAPDP GVLANPDVVCQSALARFVDVQSNLGLAPYGIYLDQFLTVELCGLRRHTEPQLRAMIAHLA VMVPPSDLRIAVLASEENIDQWDWVKWLPHVRSEESEDALGPARMVATTAEELDELMGPD 
VETRGEFRQESAGQGSRPHFLLVVDGAEYPSNWIPMSEKGVEGLTRVVIRTVWEDLEDSS TLRLILRVGKGDKVVNGRRTVEPMPLGQVVQQDITPALYRPDQMSIEEATAVARRLTRFT EGMAQAVTNKSEKKASSDLMDLLGLGDVRLFDPDKRWRHRKGRDYLRVPFGLTESGAPVI IDIKESAKNGMGPHGMLIGATGSGKSEVLRTMVLAMALTHDPVQLNFVLVDFKGGATFAG MDTMPHVSAMISNLEEESFLVARMEEALQGEMSRRQELLRAAGNFAKVEEYENARRAGKH DGPPLPALFVIIDEFSELLVAHPNFIKVFEAIGRLGRSLSVHLLFASQRIDTKAGDLMSH ISYRIGLKTFSAGESRSIIGSDVAFKLPPLPGSGYLLAGGEDLVRFRASYVAAPPPESAK AVDADSDEEEAIRHILPFSADRVEPDEAIVIEEGLLPVPHHGAQALRRGGEVPPPPPGAV GAPTGAEPVGAEEEEPDFSQYADMSQLDIAVKRMEGHGLAAHKIWLEPLDTPPTLDMLFG DLAVDPRFGLVSASWREKGPLRVPMGMVDLPRQQKQETLEYDFSGAKGHGIVVGSPMTGK STALRSLVMSLALTASPLEIQFYVMDFGGTFSSMRNLPHIAGIASRGDTERATRMLAEIE AILFDREIYFRKNGIDTMADYRRMRAAGKADDGYGDVFVILDGWVTLKEEFDGEDRRLGR MMERALNYGVHLIVGTGRWLDLRMDIQPLFGTKIELRVDDESASIIDRQAKKNVPLDRPG RGLDMDAHQMLLTLPRIDGERDPETLSEGVRKAIADIHAAARGMHAPRLRLLPERITVPE LLSAIPELGRPMRELEAEYTKALEAEAAAARANAPEGAEVPEAPIEKPPVPLHDKCLVLG VEERRLGPLVFDPLKEQNLFLVGDGESGRSSFIRLVAHEIMRTNTPKEAKLIVVDPRRSL LGEIPEEYLVAYLTMNDDIENELRAIASAIVEKRAPGKDVTVAQLRERSWWSGPELWLLA DDEDMLMGGMNNKLQPIDGLLSQARDVGLHVVAARRTGGGMMSDRFISKVREMGATGLIL SGSPDDGPVIGRVKAVPSRPGRAQVVTRNAGVYRAQLAWQPTSE >gi|319977607|gb|AEUH01000202.1| GENE 37 37946 - 38485 778 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293191448|ref|ZP_06609190.1| ## NR: gi|293191448|ref|ZP_06609190.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 1 179 1 180 180 75 32.0 1e-12 MGTHVNRLGGALVAAVLACALAACGQLGGGASSNKPLDQLQNLGAGTKADYQSRMDQASE ADKQKILTEANQVDAVVGAELVLDDPSAKGAKFQLKDDNTVVMDQSIAPLMSNATNWRVG DRSIDLCTDADCEYYSSWTVAPKTGSGGGGLSGYTLTLLIENEAEAANVNVTREFTLGK >gi|319977607|gb|AEUH01000202.1| GENE 38 38404 - 39054 276 216 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MADHLFGFVDDPDAAVVHIDVGDLHAPVQKTKDRHEQHHSDDEQRRHPALHASHSASPPF ITCRWSDFRPVSLVSHESLLHRELPGKRFGPGYTLINAGSHCHSTNPTTAHHAGAGAPPR RTFAPITSHNGSESNQIMNCLTRWDPRAGACCARPRRHQRPGPAAQGLGPGDEEAFGQGT GRPAPPARGPHLPSVNSLVTLTLAASASFSMSSVSV >gi|319977607|gb|AEUH01000202.1| GENE 39 38914 - 44940 6894 2008 aa, chain + 
## HITS:1 COG:no KEGG:AAur_1424 NR:ns ## KEGG: AAur_1424 # Name: not_defined # Def: fibronectin like domain-containing protein # Organism: A.aurescens # Pathway: not_defined # 17 1769 66 1832 2060 746 35.0 0 MPALFVITVVLFVSVLGLLYRGMEVTHVDVDDGGIWVIDKSKQMVGHLNYDARVLDGALR AESANFDIGQAGETVTMTDRATSTIAPINVTAVALGSATPLPSGAVATQGGDTLGVYNPA EGTFWVTSAKGPVAADLGDGAALDSGEGAGASTVAVDGTAFSLSASDGTLVTVAPKGTVQ SVNKSRIKGIDPAATLAMTAVGNHPVAFDTTGNILYLPNGQAYSLTDYGIAPGAVLQQPG ADADSVYLATPTSLVGVPLKGGEPSVVDTGASGSPAAPVWHMGCAYGAWSGSGAYLRSCA DPSKDTNDVVDTLKSAREVVFRTNRSAIVLNDVALGSVWLPDENMTIVDNWDEVEQQLAE SEEEEQTTDLIDEVEDPERKEQNTPPDAEDDEFGVRPGRSTLLTVLDNDSDSDGDVLTAR ATSSPGFGTVVSTRGGRALQIADVADSQTGSTSFTYEASDGQDTDSATVRVSVHPWSENA APEQRTVPTVKIGQGAQIQYNVLGDWRDPDGDQVFLQDAQAPEGLSVQFAEDGTLQIREL GAGAGPKTVDLTVSDGRAQGKGTLKVDVQEPGNQAPSANADFYVAREGETITLDPLANDT DPNGDPLRLVGISAPASVTALPDLEAGTIDFTATAQGTYQFSYTVTDGADQKIGIIRVDV IPADEASGPIAEDDLAVLPVGGSVLVAPLGNDTDPSGGVLVVQSVDVPADSGLEVALLDR HLLRVSSPAGIGESVSFTYTVSNGYGSATARVLVVPGSETNSSLPPVLQPDTVKVRVGDV GTASVLDNDRSPGGLSLSVEPTLNYEANPSVGTPFVTGNEVRIEAGGTPGTLSVVYSVID SSGNTAASTVTFEVVPDSEANNPPRPKALTAWAASNQTTRIPVPLAGVDPDGDSVWLVGT DQQPTKGTAVVRDSWLEYTPAEGASGTDVFTYVVEDRRGARGTARVRVGIAPPASLNQNP AAVPDTVLVRPGRQVSVNVLANDVDPDGDPLALDADGLTASDPRLAPTASGDFVTITTPE AEGTYLVSYGVSDGRGGSSRGQLTAYVSANAPLKAPIARDDSLSFDRLPSDGSAARVRVT ENDEDPDGSVSDLVVTTSDPGVVVDGQDLLITPQASRRLVVYTVTDSDSLSNSAVVSVPG LDSTPPTVRSEGMPIEMKAGESRTLALADYTTVRPGRSARLSEGGSIVSSNGIDQAAANG EGSVDVHAREGFSGSATVSLDVSDGSSNDSGALSARLSLTFIVRPASNQPPTLTPTPIRV GAGEDPVTQNLALAVTDPDGANPRDFAYALVSKPDSVSASVSGTALTVSAPAGTATGAAG SIAVSVDDGSGPVQASVPVEVVSTTKPKMQVSPIQLEVERDASVTVDVASRNVIGAIGPV QVVDSGPGVAAGSASTVTWSGTSVTITPDPAKAEFTYQYSITDAPGDSSRTVSNTITVKV RQTKPAAPVSVRAVSRGGAQGVGAAELLITDANLGADFVVKGYRVTDVNTGTGYDCEPQM RCLVSPLGPGPHSFTAVVRTNKGDSAPSAPSNTLDFSPDPSGVRRLQVTPGNGSLTANWS APDQQGSGIMNYEVAVTGSGGGTEQVPASSTWYTRSGLTPGESYTVSVVAIDTNSRRSAA VSTRVTLPSAPEPPRDLKAMVVGGTGDGTTSFNVTWKLGSHHSSGWANGTVSVDSRPYPV SPGATQKEVSVPSGGSATITVTVFNADGYSNAASISVSTLVKNPLPPTAPVLKPTGNTGE 
LQVVGLAKVPGNGYEARQLTLRYARSEAACAAGDEVEDGEVIGGLGASSTPVTLFFCQTA TAVDATKVVSSATSASGTPKAGNVPKFKVKAKADGTSIKATWNIPADADIVRAHASIKEG GVAPQFGSPPPSSAVFSGLAPLNRYTVVVTLTNASGAERTVEKEVWTEEDPQYIEETWER PASCWGTPCGAFRISATRADQFAPDATLTCYVYTYRDEEHERERRKLTLDAKGNWTVTGL PTQATSAGAFAGMKQHVTSCWVEDDDDD >gi|319977607|gb|AEUH01000202.1| GENE 40 44995 - 45960 1273 321 aa, chain + ## HITS:1 COG:PH0776 KEGG:ns NR:ns ## COG: PH0776 COG0714 # Protein_GI_number: 14590644 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Pyrococcus horikoshii # 18 319 10 313 314 274 42.0 2e-73 MTLSTEQAGEFAQVFKSLVDAISSAVLDKQQVVRLALTSLFSGGHLLLEDAPGTGKTALA RALSAVIDGSHSRIQFTPDLLPSDITGVNIYDQSNGRWVFRAGPIFASIVLADEINRASP KTQSALLEVMEEGQVTVDGVSRPAPTPFMVIATQNPVEQAGTYPLPEAQLDRFLMKASVG YPSRAAMVEVLAGSAAPDRSRLLRPVVGGDDIDHWSRVVADNFADTAVLDYVAALAEATR EDESALLGVSTRGAIGMVRCARVWAAAQGRTFVLPDDIKALAVPVWGHRIVVDPDAAFSG ATSQGIITRALTGVSAPSLGS >gi|319977607|gb|AEUH01000202.1| GENE 41 45967 - 47142 1450 391 aa, chain + ## HITS:1 COG:slr1927 KEGG:ns NR:ns ## COG: slr1927 COG1721 # Protein_GI_number: 16331174 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) # Organism: Synechocystis # 15 298 33 325 406 63 26.0 5e-10 MRGVVRALGRGLSIVTPTGWALLTLVALSVLVGLTLQWSEAATITAASLVVIVIAAVRVA WRPPHIVAIDITNEHIVAGQSTVGQLTVTNGSARSARTGIIEVPIGAGLGEFVVPPLRAR GKWEDIFLVGSKRRGVIEVGPARSVRTDPLGLVRRVREWGEPVVLHVHPRTVRVPFDATG FQVDIEGVVTAKLSSSDVSFHALRDYEPGDDQRSVHWASSARLGKLIVRQYEETHRSHHL VILDNTEGSWDQDPFETAVSAVASLGLANLRESRTVSLATASEWIPTTSAMRMLDALAEL GTAPRGDFLRRVREVIDDRPGASAVTIVVPASTTDEQAAHLSRLTPVDVPVSIVRIDPSA DRGRSAVGGGVLLDCPTLKDLPRLIIAGGLT >gi|319977607|gb|AEUH01000202.1| GENE 42 47139 - 49733 3171 864 aa, chain + ## HITS:1 COG:lin0469 KEGG:ns NR:ns ## COG: lin0469 COG1305 # Protein_GI_number: 16799545 # Func_class: E Amino acid transport and metabolism # Function: Transglutaminase-like enzymes, putative cysteine proteases # Organism: 
Listeria innocua # 349 673 258 584 721 76 26.0 3e-13 MSESIEITRKRVHSAAAEEVPPPPAPGENVAWAATAVPSGRYGSARTTRARLRSLTERLR RPSPALPGWSVLVLAVLAAGPALAFEPVFGNGQGLLAAGAGAVAGLAIGTAASRWHWDLL SVLAAVAAVYLLLGGYAALPQTLANGFVPTGQTLQVLVLGVWRSWKDLLTLTPPISAYSG PAVVPWITGLLAGTLSSLITIRLGRPVAGTLPMGIMGFVAIFFGPSGQEQPAWAVAAWWV VLVGWWAWAAQRQRLRLGQDVLAGRADSAPAPAGGAAGGRMRSTVYAGRRVAGALTIVAA GVAVAVPVTIALSPLPGRVVGRDLVDPPLDVRTYPSPLSAFRHYSTDLKDDALITVSELP DGQRVRIAAMDVYDGTTFGMSVAGESASEGYIPVGTAIPGRAGTATASISVTTDKLRGPW VPVLGRPENIVFTDGAAGAQREGLHFDQWANTALTTSPSPQMAYTVETSFADAATDDELE GLPAASFTNSDKNLPQGLESLALEATQNADGPLATARAIEHYLSGNGFYLQENTVQSRPG VRTDRLERMLKAADLIGDDEQYAALMALILHSKGINARVVMGAYPAEGADGHNRRGTVKL FGADLHVWVEVEFKDGMWGVFDPTPPRDKTPQTQVPKPKTVPRPQVLQPPDPPKPPVELP PNVRDQNADTDKPDSASLPWGLILGGTGLFLLLFGPVIGVVVYKALRRRRRRRAEPAAAL LGSWDEVVDLAADAGIWVEAGQTRQEAAWSLSRQWSGDQDIVEPNAFDLVAAPSGANAAV IAGWTPFEQEVPTAVAIARRADIANFAHNGATREHVNAAWKDFASLRAQVRRTTGPIQRA RRALSLRSVRRNAAQARRRRKEQG >gi|319977607|gb|AEUH01000202.1| GENE 43 49730 - 51259 1923 509 aa, chain + ## HITS:1 COG:Rv3080c_1 KEGG:ns NR:ns ## COG: Rv3080c_1 COG0515 # Protein_GI_number: 15610217 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Mycobacterium tuberculosis H37Rv # 15 286 25 296 340 131 33.0 3e-30 MSTAGMVVPGPAIPGFRFVRPLGMGGFADVYLYHQALPSRDVAIKIVRAQGDARGTEELH READAMTLLAGHPAVVELHGVGTTSDGRAYLVMEYCPVADVLSQVRAKPMALDRALDFMI QICGGVEVFHRQGYVHRDIKPSNIMLDAYGQPVLADFGVASPKGELEVGAFDGFSVLWAP PEQQDASNPATPQQDVWALASTTWTLVTGRSPFEDPIGDNSAASIAMRVQRGRIRSLGRA DAPPELEEVLRAALDIDPDKRTPTAQAFGEGLQSVQRAMGLPVTKMEVKESKTNAVFGLT SDAIDAKTRLRGAVRIDPEGTRMRALKYDFGDQVPAAPVADSWEVERVDGAEAAPESAKK QEEPERSKRTPFATILVLFLVGALVTAGLITAILTQQGTVNRVPGTETPSTGSETADPVG QPPPVVTGLAGDYADGTITWTWSAPENTSPETVEYSYAATGPDGDKNGTTEGTTVTLPAA SGRNCVDIATVAVVTRRASDPIRACVDVP >gi|319977607|gb|AEUH01000202.1| GENE 44 51285 - 52157 804 290 aa, chain + ## HITS:1 COG:CT259 KEGG:ns NR:ns ## 
COG: CT259 COG0631 # Protein_GI_number: 15604980 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Chlamydia trachomatis # 18 250 12 245 248 93 33.0 5e-19 MTAGEHNRGLVLQWGAATDIGPVRSENEDSWLAAPPFFAVADGMGGHLGGARASRTAVQT LCEYLSEERLGDRAADLVDLSEAVEEAGERVNALAGADDAQNAPGTTLTGVLALDTDEGP YWLSLNIGDSRVYVVGNGRISPITHDHSAVQEARDMAEELGTPQIIPPSNVVTKALGGGL TGSVSADYTLMPLCEGDYAVVCSDGIHGVLTDVQMCSIVTAGRGPQGVADHLVSAALSAG TRDNATAVVVYAASASRLTSDPHAMTRVIQPVRPASTRTTRRTPLRKGSN >gi|319977607|gb|AEUH01000202.1| GENE 45 52157 - 53584 1402 475 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293191205|ref|ZP_06609138.1| ## NR: gi|293191205|ref|ZP_06609138.1| putative FHA domain protein [Actinomyces odontolyticus F0309] # 1 475 1 441 441 453 60.0 1e-126 MLQIGRFLCRGQGAILVMGPLGAALVPLSLADGAADVVWGRADTQTWLAQVQADGDSALL LDCAGEQPYLTVLGAIQLRMRSAGQIAEQVWTDPIWAWPFNPGEWQTFEAAIVDGDEVTW MVPAADRVEAGEIRIGRSLDELEALASLPVDTMGFLIHSTETPTQRPVAVVPESPKDDEA YDVAAAEAKFLATLNKDERASAPAAESLAVVAEQVVATMPSDDSSATRISRERPRHAADS AEDGAAPGQGQEAPSEQGTGSEKGAGTGQGTGTDSAPPADSAPQGPAPAGADEEDSMART LVDQPPVALPPATRPPAARPPVATAPPPAEDSYEEIAPDVLAGMNAEATQARRAPHNVGP SAPVAFLIHGGTVPVEVSRDVIIGRDPDARALTGRPVATVLRVPSPATEISRSHCAVMMT APGAWSLMDLGSANGTILRHADGSFQDVTPMVTIALNDGDLIDVGEGTTIEFRVR >gi|319977607|gb|AEUH01000202.1| GENE 46 53748 - 56459 4250 903 aa, chain + ## HITS:1 COG:TM0272 KEGG:ns NR:ns ## COG: TM0272 COG0574 # Protein_GI_number: 15643042 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Thermotoga maritima # 3 887 4 867 881 857 53.0 0 MVKYVYDFSEGDKSMKDLLGGKGANLAEMTKLGLPVPPGFTITTEACRAYLQGSVVPESL STEVTTALRGVEDQMERHLGDPYNPLLVSVRSGAKFSMPGMMETVLNIGLNDQSVIGLAK VSGNERFAWDSYRRLIQMFGKTVLDIDGDLFSDALDSLKAERGVKGDTELTADDLRGLVE TFKGIVRDQTGQDFPQDPRTQMDMAIEAVFRSWNTERARIYRRRERIPHDLGTAVNICTM VFGNMGENSGTGVCFTRDPSSGHSGVYGDYLENAQGEDVVAGIRNTLALADLERINKPVY DELRSIMRKLETHYRDMCDIEFTIERGKLWMLQTRVGKRTAAAAFRIAVQLREEKLITRD 
EALGRVTGDQLTQLMFPQFDAKADKELVATGMAASPGAAVGRIVFDNAQAEAAAEAGVKC VLVRRETNPDDLPGMVAAEGVLTARGGKTSHAAVVARGMGKTCVCGAEALEIDSEAGTVT IGDLVLTGEDVIAIDGQTGEIFRGEVPVTDSPVTTYLAQGLDAGLAAAGDDEGTRELVGA VDFILSHADEVRRLRVRANADTPLDSKRAIEFGAEGIGLCRTEHMFLGERRPLVERAILS APGSAERQAAFDELERLQKQDFLEMLEVMDGKAMTVRLIDPPLHEFMPALVDLETRVAVG KATGNLDPADEAMLAEVRRMHEQNPMLGLRGVRLGIYLPGLFALQMRALCEAAAELVARG LRPEPEIMVPLVGSVRELQLVREEGEAIIAQVAAQRGADLSGITIGAMIELPRAAMTAED LAEEADFFSFGTNDLTQTVWGFSRDDVESVFFPRYIEAGIFGVSPFESIDVHGVGTLVAE GVRRARSTKPGIKLGVCGEHGGDPSSIHFFHRVGLDYVSCSPFRVPVARLEAGRAAVAER VEE >gi|319977607|gb|AEUH01000202.1| GENE 47 56800 - 58452 2390 550 aa, chain + ## HITS:1 COG:YPO1941 KEGG:ns NR:ns ## COG: YPO1941 COG0672 # Protein_GI_number: 16122187 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity Fe2+/Pb2+ permease # Organism: Yersinia pestis # 45 533 146 634 639 260 34.0 5e-69 MGALVTTTTKAAMRQSTKGLVVIVMAALAAVLWAAPARAANPQTWQEVSATIASLIDQAV ADYKAGDAAKGKEGVNDAYYKNYETTGMEKQVMARISGSRVSAVEMEFSLLKQAMGSGDS AGVDSHAATLKQCIAEDAAKLDGVAPASGSSGAGSADYSPGAWGQTAKQINGIIDQAVAD YKAGDAAKGKEGVNDAYYKNYETTGMEKQVMARISGSRVSAVEMEFSLLKKAMAEGDGAG VDSHAATLTNAIREDANTLDGYTGQAASDAASQTTPWLSAFVPALLVILREGMEAILVVA AVLAYLGKSGHKDKARVVWSGVAIALGLSALLAFLFNYFTSLAGANQELLEGVAALFAVA MLIWVSNWMIHKSSDKAWEKYIRDQTDASLTRGSLLGLAFISFLAVLREGAETILFYVPI VSSAGDRTGYVWAGLGVGLVVLVVVYLLIQFAALRIPLRPFFTITSLLLALMAFTFTGSG VKELQEADVLSLTYVNGFPTVDLLGIYPRVENLVAQALVLIIIVGLYVFGKRSLARASGG AEDAPDAKAD >gi|319977607|gb|AEUH01000202.1| GENE 48 58484 - 59128 1071 214 aa, chain + ## HITS:1 COG:TP0971 KEGG:ns NR:ns ## COG: TP0971 COG3470 # Protein_GI_number: 15639955 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein probably involved in high-affinity Fe2+ transport # Organism: Treponema pallidum # 5 207 1 199 204 148 44.0 6e-36 MKRSLARIGAAAAAIALAGTLAACSSNSNNSNNSNNGANPPASQSAADDGAKPGEDAGFE ELPLGDDVFVGPLKVGGVYFQPVDMEPAVSTPAKDSSMHMEADISAVADNDLGYGAGDFI PALTVDYQIADKSGTVVQEGTFMPMNASDGPHYGINLPKLEAGTYDVTFTIKSPETNGWL 
LHTDEKTGVKGRFWQEPLKAEFKDWQWDPTSVDW >gi|319977607|gb|AEUH01000202.1| GENE 49 59233 - 59380 120 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLQQLSEATLSLLLTMLAVGATIRLVPAAGRGRSGRRARALAVWGGIGA Prediction of potential genes in microbial genomes Time: Thu May 12 18:44:23 2011 Seq name: gi|319977603|gb|AEUH01000203.1| Actinomyces sp. oral taxon 178 str. F0338 contig00203, whole genome shotgun sequence Length of sequence - 3048 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 + CDS 2 - 1165 1764 ## COG4393 Predicted membrane protein 2 1 Op 2 10/0.000 + CDS 1171 - 2448 1833 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 3 1 Op 3 . + CDS 2454 - 3048 740 ## COG0577 ABC-type antimicrobial peptide transport system, permease component Predicted protein(s) >gi|319977603|gb|AEUH01000203.1| GENE 1 2 - 1165 1764 387 aa, chain + ## HITS:1 COG:FN1355 KEGG:ns NR:ns ## COG: FN1355 COG4393 # Protein_GI_number: 19704690 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 211 379 119 288 298 143 41.0 6e-34 AVWGGIGAGAAASLALTVVATIAPKSLNRQSIGLVTLPVAVVAAAALIALFLRRARLVRA SGADPADPLSGDGALRAVAALYIAFALFRALPTAMTQGVMMLSGGVELYSTTAVMAITGY ALAWVLCAVLGWLSHRLCSLSARVWPAVLVITSLGLAHILMIVRVLKAKRLISLPADVFS AVSWLINHEAVFTLAAAAALGAVTVITWWASRSIPREGANPAEGRLARARARLYKAMAGS AAIGYLCGALLITVGVAIGSQEVELSAPESYSVVDDHATVELSAVSDGHLHRYEYTTSTG VKVRFIVIQKSGSSFGVGLDACEICGPTGYYEKDGKVICKLCEVAMNIATIGFKGGCNPI PIEYEVGNGTLTVPVSALEAAAPVFAK >gi|319977603|gb|AEUH01000203.1| GENE 2 1171 - 2448 1833 425 aa, chain + ## HITS:1 COG:FN1354 KEGG:ns NR:ns ## COG: FN1354 COG0577 # Protein_GI_number: 19704689 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 1 425 3 428 428 427 51.0 1e-119 
MFFRMLKGALTRRGNRHAMIALTVALGACVATSMLSVMFNVGDKVNQELKSYGANIVVRP QGAAVLQDLYSTQGPAQQQAYLREDELVKIKTIFWTYNILDFAPLLPTTATGADGGAVTV TGTWFSHRLDLPTEESVQTGLDKLRGWWSIEGSWPDDSDEAGAVVGSAYASAHGVRVGDT ISLTGPSAARDVVVRGVFTAGDDSDHAVYAQLGLVQELLGRQGAVGSVEVSALTTPDNDL ARKAAKNPKSLSVSEKETWYCTAYVSSIAYQIEEVMTDSVARPVRQVAQSEGSILEKTQL LMVLVTALALVASALAIANLVTAGVMERSSEIGLMKAVGAKGKSIIGLFLTETIVVGLAG GVLGYGAGLALAQIIGYTVFGSAIAFAPVVVALVAVLVVLVVLGASIPAIRYLLRLNAAE VLHGR >gi|319977603|gb|AEUH01000203.1| GENE 3 2454 - 3048 740 198 aa, chain + ## HITS:1 COG:FN1353 KEGG:ns NR:ns ## COG: FN1353 COG0577 # Protein_GI_number: 19704688 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 1 178 1 180 400 149 38.0 4e-36 MSKRAMFWRMVASSVLRRRSRVLIAVLAVAIGATTLSGLATITVDVPAQMAREIRSYGAN LVVTGADGQAMDDEALAAVDQELPAAQLVGSASFDYETVTVNDQPYVVGGTDLAAVRQMS PFWFVDGEWPSGSAQVLLGEEVATTIDAKTGDRITINQLDGTASSNAAAASGSAASGKAN SSGAAASGGGKANSSGAA Prediction of potential genes in microbial genomes Time: Thu May 12 18:44:26 2011 Seq name: gi|319977595|gb|AEUH01000204.1| Actinomyces sp. oral taxon 178 str. F0338 contig00204, whole genome shotgun sequence Length of sequence - 7390 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 36/0.000 + CDS 3 - 941 165 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 2 1 Op 2 1/0.000 + CDS 950 - 1828 1163 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component 3 1 Op 3 . + CDS 1825 - 2478 604 ## COG4939 Major membrane immunogen, membrane-anchored lipoprotein 4 1 Op 4 . + CDS 2526 - 3980 280 ## gi|293191217|ref|ZP_06609150.1| thiamine biosynthesis lipoprotein ApbE + Term 4200 - 4266 14.1 - Term 4184 - 4232 4.3 5 2 Tu 1 . - CDS 4362 - 5651 323 ## Bcav_1339 hypothetical protein 6 3 Tu 1 . - CDS 5795 - 6088 696 ## 7 4 Tu 1 . 
- CDS 6501 - 7169 839 ## COG0247 Fe-S oxidoreductase Predicted protein(s) >gi|319977595|gb|AEUH01000204.1| GENE 1 3 - 941 165 312 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 191 312 293 413 413 68 35 2e-11 ANSSGAAASGGGKANSSGAAASGGGKANSSGAAASGGGKANSSGAPTAPGLGSQDGAQSA SSPLSAQTAQGAQASQGAQAAQGAQAAQGAQTAQSGADGQARSITVTVSGILKTGGNEDG YIYMSAGDMVELTGAWEPSIAQYSVALEGDQLTALVDSINASVPSVRAQTVKRLVQSDSG VIDMLRSLLGIITVIVLALTTIGVSTTMIAVVTERRNEIGLRKALGATSRSIMGEFMGEG VALGAIGGLVGAAAGYALAAAISWNVFHRAVAVHPLILIATVVSSVAVAVVACLPPVRRA LAVDPALVLRGE >gi|319977595|gb|AEUH01000204.1| GENE 2 950 - 1828 1163 292 aa, chain + ## HITS:1 COG:FN1352 KEGG:ns NR:ns ## COG: FN1352 COG1136 # Protein_GI_number: 19704687 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Fusobacterium nucleatum # 41 252 7 218 224 290 62.0 2e-78 MSSTPTTPQAPDPRTPPDSAGTAQDGPAQGLAGGGAAGDPLLRLSGVSKIYGDLHALDNV SLEIPRGQWVSIVGPSGSGKTTLMNIVGAMDTATKGTVALDGHDISDLTGDELTEIRCRT IGLVFQQFHLIGHLTALENVMVAQYYHSMPDEKEAMEALDRVGLADRATHLPRQLSGGEQ QRVCVARALINYPKLVLADEPTGNLDETNEEIVLDLFAQLHKDGTTLMVVTHDALVAEQG QREIRLDHGRVVKETWHDHGFTPRDSGLAVPAPSADGGAGQGPARSGKDKVR >gi|319977595|gb|AEUH01000204.1| GENE 3 1825 - 2478 604 217 aa, chain + ## HITS:1 COG:FN1351 KEGG:ns NR:ns ## COG: FN1351 COG4939 # Protein_GI_number: 19704686 # Func_class: S Function unknown # Function: Major membrane immunogen, membrane-anchored lipoprotein # Organism: Fusobacterium nucleatum # 52 182 14 139 140 65 33.0 7e-11 MSPRMNDGSRADGDRRRAVAAPGSPRVDASGPASRRWASAASRAVLGALGATILAACQPS TGLDMSKPLQDGTWSAQSNADDQGSVGTITITVEGGSITSTSYTTAMSDGSDKGGDYGKD SSGRVFNQDYYDKAQAAVASFEEYSAKLTETGDPAKVDVISGATVAHQQFVQAAIRAIAQ AQGVDGSGAADGVDIPGLGESTKDGGDLDKDLGGGNG >gi|319977595|gb|AEUH01000204.1| GENE 4 2526 - 3980 280 484 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293191217|ref|ZP_06609150.1| ## NR: gi|293191217|ref|ZP_06609150.1| thiamine biosynthesis 
lipoprotein ApbE [Actinomyces odontolyticus F0309] # 1 484 44 406 413 244 46.0 1e-62 MGTRVDIVGWGGDGMGVVDACVRLVARHEAMWSVFRPASEVSRLNAAARVVPLPAGGTGG EAPGCPGAVGDPGVGASGAVHCPGVSVSAATDQLLRDALALSDATGGAFNPLIGPLVGLW DVKGMREAFLAGDSLPSAPLAADIESALRAVDARLLRRVGPSRWALGGQDAPAGGGGWRA DDCWTLGEGSAGGAGRCADEGGRAASEEPSGAGALPSLDLGGIAKGRTADECRDLAVDMG ARGVLVSIGTSSIAAYATRPDGGPWRVGLRDPDSPPTAVASVIELPCEGLASLSTSGDNL GPLGPAARTGPVGRLGPGGRPGRSEEWGGGARGTDGAARGEDEAGVLGAGRDAQWTHGTR HGQANGGARETPRETEPGAERGPRGASCGTGRGVPGGVERGTSGGAPCPGGRLVDRLLAH HIIDPRTGRPASGGARQVSVLAGSGVLAEALSTAILVDPSCVRGEAVEQWARARGIDGRW RIAG >gi|319977595|gb|AEUH01000204.1| GENE 5 4362 - 5651 323 429 aa, chain - ## HITS:1 COG:no KEGG:Bcav_1339 NR:ns ## KEGG: Bcav_1339 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 204 334 105 235 262 70 38.0 1e-10 MWTPQLRDGHSSGMESNDIDVYRVPTTSAPPGADGSREVRIKYGFRAWICNAPADLPRWE IWEAIALARIAAVNGSLESTPTFVGSSALLLHGITGWVKNPDVTIHIDARRRTTWLPEVL FGDVRIPPVRVSCRRSRPLTGEPSLHGSGFLVESPQDALLRVVMGDEQLPAFVLACMAMR AWAPFNASDREGAREKEALIKARLAEILDARAPGRGTRRGSRILRAADAGCENPAEAALL WVVLTVSPLPVVTQHEIRLGGRTYYADIAVVDLKIIFEFDGLAKLGSSKAEIEAEKRRWV LRDEALRDDGWKVIRVCWADFEDFHALRQRIRRAFSSTRSLSPPPCAPLWAVPSEDCDGP DRRIHSSAPVSADTRRKAPQHRGGGSPQSSSTGGVDRGAHAAERRTPAVGLTQRTRTLLV DPDPPDWTE >gi|319977595|gb|AEUH01000204.1| GENE 6 5795 - 6088 696 97 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVVCPNAQEASTSSSLRGTIPCFIPPIPWRPPQTPRPVAPAPAPAPAPAPAPAPAPAPAP APAPAPAPAPAPAPAPAPAPAPAPAPAEHAPHAQSSP >gi|319977595|gb|AEUH01000204.1| GENE 7 6501 - 7169 839 222 aa, chain - ## HITS:1 COG:Cgl2262_2 KEGG:ns NR:ns ## COG: Cgl2262_2 COG0247 # Protein_GI_number: 19553512 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Corynebacterium glutamicum # 2 210 185 395 404 127 38.0 2e-29 MVEVLERNGFAPVVAPDVCCGLTWITTGQLGGAKKRLRKLLGALAPFAANGIPIVGVEPS CTAVLRDDLLDLLPDDPRSALVCHGTLTLAELLSTVPPAELDLPSLEGVTVVAQPHCHHY SVMGWGADQALLERLGARVRRIDGCCGLAGNFGMEAGHYDVSVGVAGLSLLPALDDEPGA 
VYLADGFSCRTQAEQLASRGGVHLATLLASGGAPRPGAGRPV Prediction of potential genes in microbial genomes Time: Thu May 12 18:45:24 2011 Seq name: gi|319977593|gb|AEUH01000205.1| Actinomyces sp. oral taxon 178 str. F0338 contig00205, whole genome shotgun sequence Length of sequence - 1150 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1101 1100 ## COG0247 Fe-S oxidoreductase Predicted protein(s) >gi|319977593|gb|AEUH01000205.1| GENE 1 3 - 1101 1100 366 aa, chain - ## HITS:1 COG:Cgl2262_2 KEGG:ns NR:ns ## COG: Cgl2262_2 COG0247 # Protein_GI_number: 19553512 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Corynebacterium glutamicum # 187 329 1 142 404 125 43.0 1e-28 MPPARLGAYLRDFTALMREFSIDGLLYGHFGDGCVHVRLGLPLADEAGVARSRRFLERAA RACARHGGSVSGEHGDGRARSELLRYMYSPDMLDLFAQVKALFDPDNLLNPGVLAAPMTP DRARERLAGRAAAAPASSAPAQGQDPQPGVDPLDAFLRRPRARPIPADGGFAFRGDHGDL TTAVHRCTGVGKCRAGAPGAFMCPSYKATGDEKDVTRGRARILQDAANGELVASIDSPDV LEALDLCLACKACSADCPAGVDMAKYRSEAFFRRYRGRARPLSHYTLGWLPRWTRLTARV PLLARAGNAALSVGWLRRLVFRAIGLDTRRGMPPLQAGTFSAWARRRGLGTGVPAEAGAA DPGDGG Prediction of potential genes in microbial genomes Time: Thu May 12 18:45:26 2011 Seq name: gi|319977588|gb|AEUH01000206.1| Actinomyces sp. oral taxon 178 str. F0338 contig00206, whole genome shotgun sequence Length of sequence - 4424 bp Number of predicted genes - 5, with homology - 3 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1076 1329 ## COG0277 FAD/FMN-containing dehydrogenases 2 1 Op 2 . - CDS 1146 - 1427 313 ## - Prom 1497 - 1556 3.2 3 2 Op 1 . + CDS 1587 - 1655 59 ## 4 2 Op 2 . + CDS 1688 - 3181 2027 ## HMPREF0573_10934 hypothetical protein - Term 3109 - 3156 3.6 5 3 Tu 1 . 
- CDS 3243 - 4367 1510 ## COG2267 Lysophospholipase Predicted protein(s) >gi|319977588|gb|AEUH01000206.1| GENE 1 2 - 1076 1329 358 aa, chain - ## HITS:1 COG:Cgl2262_1 KEGG:ns NR:ns ## COG: Cgl2262_1 COG0277 # Protein_GI_number: 19553512 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Corynebacterium glutamicum # 23 358 29 365 544 202 38.0 5e-52 MNGSKALAALAAAGIEADASTGTRARYSTDAGLYRIPPEVVVFPRDTDQVLRALEIARGL GVPITTRGGGTSCAGNSIGPGAVIDFSRHMNRVLSIDPRARTATVEPGCIGSTLQEAARA HGLRFGPDPSSQNRASIGGMVGNNACGPHATAWGRTSDNVVSMECVDGRGRRFTASSRHG SAIPEVPGLAALIGANLAPIRRELGSFGRQVSGYSLEHLTPEGGRNLAAMLTGSEGTLVT LLSVTVRLVPLPTAPVLVALGYPGMIEAADDVPTVLGHSPLAVEGMDRRLVDAVRRHKGP GAVPALPDGDGWLLCEVGGADETAEDSLARARALAGAAHTNAVVVYPPGDEAAALWRI >gi|319977588|gb|AEUH01000206.1| GENE 2 1146 - 1427 313 93 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTPPPAHPKDSGGARFPLSTSTPTGPIPVVRRRPVTGKSLSDHVLASGAEPPAYPDTALP DREPAFPDAPPPPREPSTATNAFPTPRRRSPRH >gi|319977588|gb|AEUH01000206.1| GENE 3 1587 - 1655 59 22 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLTWYASHMPVSPGKFEGAVKQ >gi|319977588|gb|AEUH01000206.1| GENE 4 1688 - 3181 2027 497 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0573_10934 NR:ns ## KEGG: HMPREF0573_10934 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 1 483 1 547 549 206 34.0 2e-51 MALFEFEDGHLVPAQFGYPVASELDPELVDAVCQQVLQIVSRPLFPVTWRDVTSDGEDAP PRLTALDVTGQVVSVEILQTLDSKTLIASLSRLAEIAALSWVDLANEYPSGPEGFRVGWN KFRESMPTAAGPGPRLIMVAAAIEPSVRPALDVLASSGVEVHAMSLRQMSNGRLFLDVDA VGPRLYGMHSPHLVGGGQDQIPQIGDRGADPAIAVLGESPYFGIEPAPPSAVPVGAQRSG ADQGPEYGDGALSAYASDAAGGADYGDYADEGAYPDGDDYEYDDYADGGAEEGAYPDGDG YDYGDGAAGGDGDPDPTVFSGPRTKPTPRLRAHGAVAYSTDEEPPAPFVPQSDEEGPLTD GGAMDAAVASSADVEADPPEVAAAREAGLPVLDRDEDGLRALGQILGEDVPLVARTDLRM PSDLALTGSGAIRCDGLTYPSIEILFNARGIGDYDGWNELYLGDRLGPTLAESLAEINRE IMREYADAPAYRGPARH >gi|319977588|gb|AEUH01000206.1| GENE 5 3243 - 4367 1510 374 aa, chain - ## HITS:1 COG:Rv2854 KEGG:ns NR:ns ## COG: Rv2854 COG2267 
# Protein_GI_number: 15609991 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Mycobacterium tuberculosis H37Rv # 21 373 4 330 346 156 35.0 8e-38 MSEYDVLAPWGADGPAPVGRWRQDLLGPAYQSRTIELLPDEEGDNVATLVRYRPRKDPGA LPGSLGFPRFVALYVHGRNDYFFQTELAQNTAASGAAFYALDLRKYGRSLRPWQSIGFTD DLTTYDEEIGEALDIIREDHRSAPLVLVGHSTGGLIATLWAHRHPGAVHALVLNSAWLEM QSLASLRSAMQPFIGQIAQRSPMWEVPTGGSDFYGRSLAGGWAASGFELPEELRGRPGDG EDDQGADSAPAPSDPAVSGWDYAREWKRPESYPVPAAWLDAIMAGHDAVEKEVRLECPVL SMMSTSSYTGEAWSPRVFSSDVVLDADVIALRSMSLSDLVTIARLPGKHDIFLSDPAVRA RAYSIMRGWLEAFA Prediction of potential genes in microbial genomes Time: Thu May 12 18:45:44 2011 Seq name: gi|319977585|gb|AEUH01000207.1| Actinomyces sp. oral taxon 178 str. F0338 contig00207, whole genome shotgun sequence Length of sequence - 2237 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1144 1362 ## COG2021 Homoserine acetyltransferase + Prom 1389 - 1448 3.5 2 2 Tu 1 . 
+ CDS 1499 - 2237 943 ## Arch_0246 NLP/P60 protein Predicted protein(s) >gi|319977585|gb|AEUH01000207.1| GENE 1 1 - 1144 1362 381 aa, chain - ## HITS:1 COG:Rv3341 KEGG:ns NR:ns ## COG: Rv3341 COG2021 # Protein_GI_number: 15610477 # Func_class: E Amino acid transport and metabolism # Function: Homoserine acetyltransferase # Organism: Mycobacterium tuberculosis H37Rv # 28 374 21 368 379 327 50.0 3e-89 MARHPLPPAPASGGWHDGDPAAFRSFADIGPLRLENGDTLPSVRIAYETWGRLDEAKSNA VLVLHALTGDAHVLGPTAPGQDTPGWWEAVVGPGKAIDTEHCFVVAANVLGGCQGSTGPS SPSPDGAPWGSRFPVVSTRDQVRAEARLADLLGVGAWSLVIGASLGGQRAIEWAVSRPDR VERLVVVASGARTTAEQAAWTHTQIRAIELDPNWRGGDYHALPVGPTGGLALARQIAHTT YRSPAELDERFGRIPQNDEDPLRGGRLAVESYLDHHGEKLARRFDAGSYVALCRTMLSHD IGRDRGGQSEALRRVRARTLVVAVDSDRLFHPDQAWRMAEGIDGALYREIHSAHGHDGFL IECDQLAALLGEFLNERAPCA >gi|319977585|gb|AEUH01000207.1| GENE 2 1499 - 2237 943 246 aa, chain + ## HITS:1 COG:no KEGG:Arch_0246 NR:ns ## KEGG: Arch_0246 # Name: not_defined # Def: NLP/P60 protein # Organism: A.haemolyticum # Pathway: not_defined # 16 246 10 241 509 83 32.0 8e-15 MKALPQQRGRIRLGLAALAAIAALTVPQVAQAAPSEDDIAKAQAAEEAAKLSVAEIEVRL AQVSAQAQSATQAAQMAGEDLNAANIALGQARATADQAQADADRAQAEFEEGKKQIASIA QTAYRDGNASLDALAPYLDSDGLRTVETKKSSIDSFSNSAETKMQNVAALEQVANVMRGA ADQALAAQQSATDEVQARTDAANAAASSAQAQQRTVEAQRSAYIEELAKKQNTTVDLIQQ REAALE Prediction of potential genes in microbial genomes Time: Thu May 12 18:45:49 2011 Seq name: gi|319977578|gb|AEUH01000208.1| Actinomyces sp. oral taxon 178 str. F0338 contig00208, whole genome shotgun sequence Length of sequence - 4711 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 488 185 ## PROTEIN SUPPORTED gi|145223395|ref|YP_001134073.1| NLP/P60 protein + Term 603 - 655 11.1 - Term 593 - 640 12.3 2 2 Op 1 . - CDS 706 - 1407 818 ## Cfla_2866 hypothetical protein 3 2 Op 2 . - CDS 1520 - 2014 851 ## COG0221 Inorganic pyrophosphatase 4 3 Tu 1 . 
+ CDS 1947 - 2084 59 ## 5 4 Op 1 . + CDS 2232 - 3467 1454 ## COG2027 D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) 6 4 Op 2 . + CDS 3479 - 4315 1111 ## gi|154508550|ref|ZP_02044192.1| hypothetical protein ACTODO_01051 7 4 Op 3 . + CDS 4371 - 4710 341 ## Arch_0250 tRNA(Ile)-lysidine synthetase Predicted protein(s) >gi|319977578|gb|AEUH01000208.1| GENE 1 3 - 488 185 161 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145223395|ref|YP_001134073.1| NLP/P60 protein [Mycobacterium gilvum PYR-GCK] # 17 150 210 339 348 75 33 6e-14 PSSSSDDDDGSDSGYTPPSYTPEIEDSADDDDDGGSWWGSGGAGTAIAAAKSYLGVPYVW GGESYGGVDCSGLTMLAWAQAGVSLPHLSRAQYGYGTHVSINSMEAGDLIFWSSNGAQSG IYHVAMYLGGGEMIEAPTFGVPVRITGVYSWGSIMPYAVRL >gi|319977578|gb|AEUH01000208.1| GENE 2 706 - 1407 818 233 aa, chain - ## HITS:1 COG:no KEGG:Cfla_2866 NR:ns ## KEGG: Cfla_2866 # Name: not_defined # Def: hypothetical protein # Organism: C.flavigena # Pathway: not_defined # 1 228 2 229 235 211 52.0 2e-53 METLHEFDERTGSFQTDSVPYEPVFALPPSLEEKVAPRSGSLRPFHGNTVAYFLEDQVRD LVGSIASSLHARFGESLSLPLPVEMAHVTLHDLHASPDRDQVRPLVEASSARASALVARA RGIGPIRTVATTVFNMMNTSVVIGVRAVDEEEHRKLLTARGLFDEVVPSGPFTPHITLAY YRPAMPVPLPPDEFRDALAQLTGRVAGVPVALAPERLHALRFDSMGDYWAVQH >gi|319977578|gb|AEUH01000208.1| GENE 3 1520 - 2014 851 164 aa, chain - ## HITS:1 COG:ML0210 KEGG:ns NR:ns ## COG: ML0210 COG0221 # Protein_GI_number: 15827013 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Mycobacterium leprae # 1 157 1 157 162 209 67.0 2e-54 MEFDVTIEIPKGNRNKYEIDHDTGRIRLDRMLFTSTRYPDDYGFIDDTLGEDGDPLDALV LLEEPTFPGCVIRCRALGMFRMRDEKGGDDKVLCVPASDQRASWRTDIDDVSEFHRLEIQ HFFEVYKDLEPGKSVEGAHWVGREEAEAEIRRSYERLAEHEAGH >gi|319977578|gb|AEUH01000208.1| GENE 4 1947 - 2084 59 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSWSISYLFLFPLGISIVTSNSIVSSVSNVPRAARGRVGIVGIAR >gi|319977578|gb|AEUH01000208.1| GENE 5 2232 - 3467 1454 411 aa, chain + ## HITS:1 COG:ML0211 KEGG:ns NR:ns ## COG: ML0211 COG2027 # Protein_GI_number: 15827014 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) # Organism: Mycobacterium leprae # 45 394 90 446 461 129 36.0 2e-29 MSARPALVAQAAAGGQGNPVSASDVAGLWGEVEQSAAAGGFTAWGLVVDGTTGEELLDAA AGTAHMPASVTKTLTALTALTHLDPSSTLATTALYSDGTVYLDGEGDLLLGEGESDAGAI NGRAGLATLAARAAEALKGQGVSSARVDWRGSLFEGEAHLGSWDAQEVGNYAGDVGAIAI DAGRTEPGANSFHQDPALRAAQVFAAALESNGVATDLGSAAPAPQGAASLASVESATMGQ QIRWMLHHSDNTVADQYCHLAAAAAGAPTTFAGSVDNLLSTLTSAGVPTDGLRLEDCSGL SSNDRLTARTLVGVLRAAMASSNAGARDLIESLPWAGLQGTLTTRFTDPPAVGNVQAKTG SLAAVASLSGVLTTQGGRTLVVAVGVEDPAERAASARGLLDSFEEGLVSLN >gi|319977578|gb|AEUH01000208.1| GENE 6 3479 - 4315 1111 278 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508550|ref|ZP_02044192.1| ## NR: gi|154508550|ref|ZP_02044192.1| hypothetical protein ACTODO_01051 [Actinomyces odontolyticus ATCC 17982] # 1 278 3 280 280 339 67.0 9e-92 MPFGPKAVATALAAVSHAGPPASPMGAAAVVARLRKSVVWSNRRLLALTGLVEASAAVAG SPTLVVDRRGLTRVVGTIMERTTGKRSPLVLASQAVILRSLMRTATGIWNVADPGRVLVA PNVLSDARRYALDQSDWCKWVSLTTGLRGVHLTHAPFLIPYIADLMHHLPARSDDLVRVV LLLDALPTAQMDALTPRDLPSIQWLRECRANIGGVALVHACQGAGVPLTGVAQIHAQTEG FARAVVHEGALEVLLSSLDALPTAAEYEDPRLWLARVR >gi|319977578|gb|AEUH01000208.1| GENE 7 4371 - 4710 341 113 aa, chain + ## HITS:1 COG:no KEGG:Arch_0250 NR:ns ## KEGG: Arch_0250 # Name: not_defined # Def: tRNA(Ile)-lysidine synthetase # Organism: A.haemolyticum # Pathway: not_defined # 20 105 27 111 361 63 46.0 3e-09 MSLAVRGCLRGLGASPGDSVVVALSGGADSLALAAAAIDSGTRMGLDVRTVTVDHGLRPD SGHEARAVAELARSLGAVPRVVGVDAAVGADGPEGNARAARMEALRREASGSP Prediction of potential genes in microbial genomes Time: Thu May 12 18:46:12 2011 Seq name: gi|319977574|gb|AEUH01000209.1| Actinomyces sp. oral taxon 178 str. F0338 contig00209, whole genome shotgun sequence Length of sequence - 2983 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 2 - 253 167 ## 2 1 Op 2 11/0.000 + CDS 346 - 900 852 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 3 1 Op 3 . + CDS 924 - 2981 1224 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 Predicted protein(s) >gi|319977574|gb|AEUH01000209.1| GENE 1 2 - 253 167 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no GARSIRVKALEGLPRAVRGRVLASFLRDAGVPGGSLAAGHVEAVDALVTAWRGQGPLSLP RVVVARSGRGERAVIEAGPLRSQ >gi|319977574|gb|AEUH01000209.1| GENE 2 346 - 900 852 184 aa, chain + ## HITS:1 COG:CAC3203 KEGG:ns NR:ns ## COG: CAC3203 COG0634 # Protein_GI_number: 15896450 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 6 182 1 177 178 195 51.0 5e-50 MDAADMGDDLKDVLVTAEDLDQRLGELARQIDEDYAGKEILLVGVLRGAVMVMADLSRKL RTPLQMDWMAVSSYGSGTKTSGVVRILKDLDQDVTGRHVLIVEDIIDSGLTLSWLRSNLV GRGAASVEIATTLRKPKAAKVDVPVKYVGFEIPDEFVVGYGLDYAEKYRNLPFVGTLSPH VYRK >gi|319977574|gb|AEUH01000209.1| GENE 3 924 - 2981 1224 686 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 6 616 2 593 636 476 44 1e-134 MNETSNKHPRVNLAWAVVPLVALLLGGWFVWQAFQPRGVDTSEGLALISGSTVERVVLNE GIQQVNLGLTEDYTHQGTGVEDPTANLGRSVYFSYSYTQTKDILDAVQAAAPAKGWNAVR PQGSALGAMAQLVVPLLILLVAFYFLMNRLSGSRMMGGFTQSRVKEFNQERPDVTFADVA GEDEAVEELEEIREFLSSPDKFHRVGARIPRGVLLYGPPGTGKTLLAKAVAGEAHAPFFS ISGSEFMELYVGVGASRVRDLFDRAKKNAPAIIFIDEIDAVGRHRGSGIGGGNDEREQTL NQMLVEMDGFDERANIILIAATNRPDILDPALLRPGRFDRQIAVEAPDLRGREAILKVHA AGKPMTDDVDLRQIAKRTPGFTGADLANVLNEAALLTARSNADLIDNRAIDEAIDRVIAG PQKRTRVMNDHDKAVTAYHEGGHALAAAALRYTDPVTKVTILPRGRALGYTMVMPTEDRF NKTRNQLLDDLVYSMGGRVAEELVFRDPSTGPANDIAQATKTARAMVTDYGMSDRIGMVK LGDADVEAFGHRGGPKEGAVVSDEMASVIDAEVRRLLDDAMREAWEILTRNRDVLDRLAE RLLEEETLDESQLAELFKDVRKQPERPVWNYQSDAAVEGAVMGRPKGVPDLPAGSGEGAA DQGEAQAAAGSAAGPAGGDGAPADPG Prediction of potential genes in microbial genomes Time: Thu May 12 18:46:18 2011 Seq name: gi|319977570|gb|AEUH01000210.1| Actinomyces sp. oral taxon 178 str. F0338 contig00210, whole genome shotgun sequence Length of sequence - 2555 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 + CDS 3 - 602 471 ## COG0302 GTP cyclohydrolase I 2 1 Op 2 15/0.000 + CDS 604 - 1407 557 ## COG0294 Dihydropteroate synthase and related enzymes 3 1 Op 3 . 
+ CDS 1404 - 2553 1498 ## COG1539 Dihydroneopterin aldolase Predicted protein(s) >gi|319977570|gb|AEUH01000210.1| GENE 1 3 - 602 471 199 aa, chain + ## HITS:1 COG:BH1646 KEGG:ns NR:ns ## COG: BH1646 COG0302 # Protein_GI_number: 15614209 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Bacillus halodurans # 14 196 5 188 188 224 59.0 8e-59 SWSAGRGADEVTFDADAVERACRDLLVAVGEDPGREGLAGTPGRMARAWRELLAGVDLDP RDYLRTQFHAGTDELVLVRDIVFHSVCEHHLLPFSGRAHVGYIPRGGVVTGLSKLARLVE GYARRPQVQERLTAQIADALNEVLRPQGVIVVIEAEHMCMSMRGVRKPGSSTVTSALRGI MNDGTTRAEMMALVLGGRR >gi|319977570|gb|AEUH01000210.1| GENE 2 604 - 1407 557 267 aa, chain + ## HITS:1 COG:ML0224 KEGG:ns NR:ns ## COG: ML0224 COG0294 # Protein_GI_number: 15827021 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Mycobacterium leprae # 7 254 8 265 284 191 47.0 1e-48 MAERTLVMGILNVTPDSFSDGGRYAHVDDALAHAHEMVDQGADIVDVGGESTRPGSTRIG PDEEWARIGDVVRELAGGGIVVSVDTLHAVTARRAAAAGATIINDVSGGVWDPGMSASVA STGCRFVVQHYRALPGMPGESFDYGQDVVGAILERLGRQIDAAIGAGIAPERLIVDPGLG FSVTNEQCVLIVESLPRLTALGYPVLIGASRKRFIKAMGGDADEQTARISSACARQGVWA VRVHDVARNARAVRSALTGPAYGGECL >gi|319977570|gb|AEUH01000210.1| GENE 3 1404 - 2553 1498 383 aa, chain + ## HITS:1 COG:Cgl2637 KEGG:ns NR:ns ## COG: Cgl2637 COG1539 # Protein_GI_number: 19553887 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroneopterin aldolase # Organism: Corynebacterium glutamicum # 8 119 5 117 130 99 46.0 2e-20 MTRQTVVIALRGLEVDAVHGVFDFERQEPQRFVADVTLWVQADGALAADDIAGTVSYAEI ADEVVSILQGQSASLLETLADRIARAAMSYDGVEGVEATVHKPDAPMPHAFDDVSVTVRL GAVDLTPLALTRADIYDAGAGSVLTGSTGTGGASEGGKHGQADGRRAALPSPVVVGPRSV VIAIGGNLGNVPVTLASVVEALDYVEGFSVSDVSPLLRTRPVLDEEQQEQPDYWNAVVLG TFEGDVSSLLEQTERIEREFGRERNEHWGARTIDIDIVQVEGTTSSDPALMLPHPRAHER AFVLAPWVLADPSAVLEGVGPVAELLEECDDREGILDAIDDWLEDPEGIIEESDEVLAAS ARDEADGPAGPEGPQRQSGARSA Prediction of potential genes in microbial genomes Time: Thu May 12 18:46:22 2011 Seq name: 
gi|319977559|gb|AEUH01000211.1| Actinomyces sp. oral taxon 178 str. F0338 contig00211, whole genome shotgun sequence Length of sequence - 9020 bp Number of predicted genes - 11, with homology - 9 Number of transcription units - 6, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 851 633 ## gi|154508556|ref|ZP_02044198.1| hypothetical protein ACTODO_01057 2 1 Op 2 . + CDS 857 - 1321 519 ## Arch_0255 hypothetical protein 3 2 Tu 1 . + CDS 2146 - 2238 138 ## 4 3 Op 1 7/0.000 - CDS 2345 - 3415 594 ## PROTEIN SUPPORTED gi|163764769|ref|ZP_02171823.1| ribosomal protein L18 5 3 Op 2 . - CDS 3453 - 4826 1474 ## COG1066 Predicted ATP-dependent serine protease 6 3 Op 3 17/0.000 - CDS 4896 - 5576 952 ## COG0569 K+ transport systems, NAD-binding component 7 3 Op 4 . - CDS 5625 - 7079 1964 ## COG0168 Trk-type K+ transport systems, membrane components + Prom 7190 - 7249 1.6 8 4 Op 1 . + CDS 7304 - 7522 217 ## AAur_3301 hypothetical protein 9 4 Op 2 . + CDS 7615 - 7713 233 ## + Term 7721 - 7757 8.1 10 5 Tu 1 . + CDS 7801 - 8448 793 ## COG0406 Fructose-2,6-bisphosphatase 11 6 Tu 1 . 
- CDS 8599 - 8955 456 ## HMPREF0573_10971 hypothetical protein Predicted protein(s) >gi|319977559|gb|AEUH01000211.1| GENE 1 3 - 851 633 282 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508556|ref|ZP_02044198.1| ## NR: gi|154508556|ref|ZP_02044198.1| hypothetical protein ACTODO_01057 [Actinomyces odontolyticus ATCC 17982] # 114 277 548 688 690 100 46.0 1e-19 AGAATAPAPVPAVPEPPAPEDEAPAAPPAPPSIPPRLPGSITRPSLASKGVGEPPAPGAE AASGEGDRRPQQGQPPQQGGQQNPRKGGAKPRRSKPQRGFTGTPITRPDGGKGAPARPAW HPVSSVPANSPSGRTPVRTKARPPKAQAAAAQQGKPSWSSLFGDVSAEENHPGQETSTQL APVPREVGREGGMSLPDWQFSLAGSEQVRVVDDRSQGPGPERGQMAGEADQRRAILEPNL PKGTPMGPIPADEATHTGILRRVVVRPTMTGAIPIVKPGRRP >gi|319977559|gb|AEUH01000211.1| GENE 2 857 - 1321 519 154 aa, chain + ## HITS:1 COG:no KEGG:Arch_0255 NR:ns ## KEGG: Arch_0255 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 2 144 13 152 160 62 32.0 5e-09 MSVLLVAATGAICGGLAFAATEIGTRVGATPLLITPAPAVLFLVITAFLLWAGVHVRRYR ANKDTWIGALGALRVAVAARASSMVGAGLTGILIGLTGSSLAHAEAAVMAQNAIAAGLSA LAGLVWTVSAVVVERWCVIKPDDEDAAGDGRCPA >gi|319977559|gb|AEUH01000211.1| GENE 3 2146 - 2238 138 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVFDVGAAPTPKPSPPPQPQSGGQSGAQSR >gi|319977559|gb|AEUH01000211.1| GENE 4 2345 - 3415 594 356 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764769|ref|ZP_02171823.1| ribosomal protein L18 [Bacillus selenitireducens MLS10] # 1 342 6 347 360 233 37 4e-61 MTSDAAILREALHAVAPGTPLREGLERIQRSHTGALIVLGYTPEVEALCSGGFDLDVPFT ASRLRELSKMDGAVVIDTTNWSIRKANVQLMPDPSIPTDESGMRHRTAQRMARQIHLPVL SLSASMRIISIYVGGEHRLVEEPEALLSRANLAVDTLDRYSQRLDEVLQTLTILEMRDAA TVRDVATVMQRMEMIRRITSEINQYLEELGSDGRLLALQVEDLVRGSASERALVMRDYAH DQGDVARVEAELTTLGSDRIVDLSAIANVLGLGVYEIADLDREVQPRGIRALSLVPHLSW NIIKEISSRWSTLSQLRSASVEELQKVEGVGPYRAKIIHENLLHQSHMARAGLVGW >gi|319977559|gb|AEUH01000211.1| GENE 5 3453 - 4826 1474 457 aa, chain - ## HITS:1 COG:BS_sms KEGG:ns NR:ns ## COG: BS_sms COG1066 # Protein_GI_number: 16077155 # Func_class: O Posttranslational modification, protein 
turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Bacillus subtilis # 1 452 1 456 458 380 42.0 1e-105 MPKDRVSYRCSECGWSSPKWVGQCRECLAWGTVEEAGAPSPMARAVAPRRRAVPIGEVDG ERASKRPTGVGELDRVLGGGIVPGAVILLAGEPGVGKSTLLLDIAAKAARSLEGGGGVLY ATGEESASQVKGRAERIGAIDRRLLIADESDLGALLGHIEAVDPALLVVDSIQTVSSSQV EGGAGGVAQVRAVAASLIRVAKDRNLPVLLVGHVTKDGGIAGPRVLEHLVDVVCQFEGDR HSRLRLLRAVKNRYGPTDEVGCFELTDSGVRGLADPSGLFLSRTGTDVPGTCATVSLEGR RPMPTEVQALVAPTGAGSPRRTTSGVDHARVAMLLAVLHARLGADCSGADVYVSTVGGAR TTEPAIDLAMAVAIVSSLKGVPPRAQMVAVGEIGLTGEVRACTGLPRRLQEAARLGFTSA LVPAQGAEELRAPSGITVYTISNLAAAIRVAFEQTRQ >gi|319977559|gb|AEUH01000211.1| GENE 6 4896 - 5576 952 226 aa, chain - ## HITS:1 COG:BH0597 KEGG:ns NR:ns ## COG: BH0597 COG0569 # Protein_GI_number: 15613160 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Bacillus halodurans # 17 189 7 180 218 130 40.0 2e-30 MASEERDANNLASGTLVIGLGRFGAAVAVTQDKLGYEILAVEKNPTTVQAFAGRFPLVEA DATSKEALEQLGAREFRSAVVGVGSSLEAAVLITANLVDLGISSIWVKATSSQHGRILRR VGAHHVVYPDLDAGKRTAHLVGGRMVDYIELETDGFAIMKLRPPTEIHGFTLEELDLRGR YGVNVLAVRRPKQRFEYADSLTRVNPEDEIIVSGDAHLLEYFANRP >gi|319977559|gb|AEUH01000211.1| GENE 7 5625 - 7079 1964 484 aa, chain - ## HITS:1 COG:DR1668 KEGG:ns NR:ns ## COG: DR1668 COG0168 # Protein_GI_number: 15806671 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Deinococcus radiodurans # 3 483 24 510 512 232 34.0 1e-60 MLPNPDEIATPPSPVVKPAPGTSHTPHPQIPTTPWMRVRVWLSRQARYAPSRLALGTFAS IIAIITGLLSLPFASASDKEAPFVDTLFTAVSAVCVTGLTTVDTATYWSVFGQAVIALGI VVGGLGVMTLASILGFAVSRHLGLTQRMLATQETKTESMGQISTLLKAVIFTSLTMQLLL AAVFLPRFLTMGAAPGTATWEAVFMAISVFNNAGFVILPNGLAPHVSDWWMLVPIILGTF VGAIGFPVIMDIAKRWRTPRKWTLHTKLTLSTYLILAFAGTLLMALIEWHNPVTLGGIDS SSKILNSLLAGFNARSSGLSAVDVGSLHSQSHFVQDILMMIGGGSASTAGGVKVTTFAIL VLAVVAEARGDRDIETYGRRIPSSAVRLAVAVSLMGLAMVAASVVLLLSLTPYSLDTILF 
ETVSAFATVGLSTGVTPDLPTAAKYVLIVLMFAGRTGSMTVAAALALRERSRVIRMPKEQ PIIG >gi|319977559|gb|AEUH01000211.1| GENE 8 7304 - 7522 217 72 aa, chain + ## HITS:1 COG:no KEGG:AAur_3301 NR:ns ## KEGG: AAur_3301 # Name: not_defined # Def: hypothetical protein # Organism: A.aurescens # Pathway: not_defined # 5 65 92 152 154 80 65.0 1e-14 MVQPSAPRFMTVTEVADIMRVSKMTVYRLIHSGEMPAIRVGKSFRVPEAAVSQMIHSGLA DRGDEQNRAIGG >gi|319977559|gb|AEUH01000211.1| GENE 9 7615 - 7713 233 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGSVIKKRRKRMAKKKHRKLLRKTRHQRRNKK >gi|319977559|gb|AEUH01000211.1| GENE 10 7801 - 8448 793 215 aa, chain + ## HITS:1 COG:Rv0525 KEGG:ns NR:ns ## COG: Rv0525 COG0406 # Protein_GI_number: 15607665 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Mycobacterium tuberculosis H37Rv # 2 202 3 201 202 179 52.0 4e-45 MDTTIVHVMRHGEVDNPDGVLYGRLPGFGLTGCGQAMAERVARTLVGRGADITHVVASPL LRAQLTAAPTAAAYGLPVESDPRLIESANVFEGLPLGSNPAVLARPAYFKYFVNPLRPSW GEPYADIASRMSAALSAALRSARGHEALVVSHQNPIETLTRFVRGKVLAHPPKSRNFALA SLTSFVFSGATLVGVCYEEPAADLVAQAKDLSYRR >gi|319977559|gb|AEUH01000211.1| GENE 11 8599 - 8955 456 118 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10971 NR:ns ## KEGG: HMPREF0573_10971 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 1 111 1 125 160 68 38.0 7e-11 MPRVLLVVAWIAVTIYAVADWSRTPEDEMPGRIPKPMWLVIIVFTTMLFAVGAIAWLVLR WVNRAEARGHRPPPGPKGPVAPDDDPEFLFRLERDIQRKRRQEQQRRREEEGPADSHD Prediction of potential genes in microbial genomes Time: Thu May 12 18:46:58 2011 Seq name: gi|319977545|gb|AEUH01000212.1| Actinomyces sp. oral taxon 178 str. F0338 contig00212, whole genome shotgun sequence Length of sequence - 13655 bp Number of predicted genes - 13, with homology - 8 Number of transcription units - 11, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 90 - 992 1092 ## COG0524 Sugar kinases, ribokinase family 2 2 Op 1 2/0.000 + CDS 1159 - 2106 1151 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase + Term 2180 - 2223 3.0 3 2 Op 2 . + CDS 2226 - 3557 1618 ## COG0477 Permeases of the major facilitator superfamily 4 3 Tu 1 . - CDS 3573 - 4469 1015 ## COG1575 1,4-dihydroxy-2-naphthoate octaprenyltransferase 5 4 Tu 1 . - CDS 4639 - 5958 1737 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 6 5 Tu 1 . + CDS 6247 - 7689 2287 ## COG1376 Uncharacterized protein conserved in bacteria + Term 7866 - 7896 0.2 + Prom 7950 - 8009 4.0 7 6 Tu 1 . + CDS 8073 - 8711 838 ## COG1373 Predicted ATPase (AAA+ superfamily) + Term 8893 - 8925 2.9 8 7 Tu 1 . + CDS 9070 - 9489 141 ## PPA1607 phage-associated protein + Term 9509 - 9561 2.1 9 8 Tu 1 . + CDS 9838 - 9951 185 ## + Term 9977 - 10011 0.8 10 9 Tu 1 . + CDS 10444 - 11409 872 ## + Term 11425 - 11475 3.4 11 10 Op 1 . - CDS 11402 - 11536 191 ## 12 10 Op 2 . - CDS 11608 - 12552 1018 ## 13 11 Tu 1 . 
+ CDS 13037 - 13570 387 ## Predicted protein(s) >gi|319977545|gb|AEUH01000212.1| GENE 1 90 - 992 1092 300 aa, chain - ## HITS:1 COG:HI0505 KEGG:ns NR:ns ## COG: HI0505 COG0524 # Protein_GI_number: 16272449 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Haemophilus influenzae # 4 292 5 292 306 144 35.0 2e-34 MGRVIVVGSINQDITVTVERFAEPGETLSGSGVAYTLGGKGANQAAAAAHSGARTLFIGR VGADPAGRGLRDELASHGVDTTWLLEDPTPSGTALITVEAPTGQNTIILDAGANGRVGAE QAREAVDLGRDDVVVLQGEIPPAANAALIPWAHRAGARVVLNLAPVYAIDPDVLASVDVL VVNESEAGLVLGRPAPASSAEAVRAAEDLRALGIPEVLVTLGAAGAAYASRTGSAHLDAV GDGPVVDTTGAGDATVGCLAAALAAGHSFEDSVGWGMRAGGAAVGAKGAAPSYAGIEPIA >gi|319977545|gb|AEUH01000212.1| GENE 2 1159 - 2106 1151 315 aa, chain + ## HITS:1 COG:Cgl1931 KEGG:ns NR:ns ## COG: Cgl1931 COG1957 # Protein_GI_number: 19553181 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Corynebacterium glutamicum # 1 311 1 312 316 313 56.0 2e-85 MAVKLILDLDTGIDDALAIAYALGSPELELIGVTCTYGNVTTELAVRNSLAVLALLGRTD IPVFAGQACASCADSFEPSPLVRRIHGDNGLGGVRIPDSPRPAEQGGAVDFIVESARRYG ADLVYVPTGSSTNIAAAFGADPSLADRLRVVMMGGALTTRGNTTPWSEANVSQDPEASDA LLRRGRVTMVGLDVTHQTLLTKAATARWRRLGTAAGTAFADMTDYYIDFEASEIGIVGCG LHDPLAVAVAADESLVGVLPINLRVDLAGPTRGRTIGSLEGLAEPDKRTRAAVRVDAERF LAGFMDRTERVLGAR >gi|319977545|gb|AEUH01000212.1| GENE 3 2226 - 3557 1618 443 aa, chain + ## HITS:1 COG:SA0132 KEGG:ns NR:ns ## COG: SA0132 COG0477 # Protein_GI_number: 15925841 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Staphylococcus aureus N315 # 22 407 15 404 450 96 22.0 9e-20 MRNPTSDDGVYATGPARVLLPLLGIYIAGTITLQGFNLVYLQIGEAIGAGDSVALITAVP GIVLGVACFLYGSLGDFVPLRRIVDAGAVLIAAGSLLDLVASSSLWGVIVARALQILGCQ AAASVYMILATSITDPRRRALYVGLMAAAFQGATAIGLVVGGWLARVNWVLLLLVPLIGL 
ACVPVMGRRVPARSRPGRVDVFGLAVFSCCALTLTVAASNRSWPLLVVLAALIAVFWAHI GRAREPFITRSFFTNARWLRAISLILLPYTLAFAIAPIAARIGQDYYGMDASGVSLMLMP AYLVALASAVASAAVVDRIGRGAAVRLSVAVIGAGAVLMAFCMDAGTWVLIASMTMVYAG YGMLFSPVYSTVLATVAPGQTGRGVAMNDLAMQGMSAVGIGMATPWIASGRVAGVLLAYA VIAALVVGIDWWHEAGEARLGAR >gi|319977545|gb|AEUH01000212.1| GENE 4 3573 - 4469 1015 298 aa, chain - ## HITS:1 COG:Cgl0443 KEGG:ns NR:ns ## COG: Cgl0443 COG1575 # Protein_GI_number: 19551693 # Func_class: H Coenzyme transport and metabolism # Function: 1,4-dihydroxy-2-naphthoate octaprenyltransferase # Organism: Corynebacterium glutamicum # 3 294 12 295 298 198 48.0 8e-51 MATVHDWVEGARVRTLPAAAAPVIVGTAAAWHLGAREPLRALLALVVALALQVGVNYAND YSDGIRGTDDDRQGPARLTGGGLARPRTVLMAALGCFAAAGAAGVALVALSGAWPLLVVG CAAVAAAWFYTGGPRPYAYTGVGLSEAMVFVFFGLVACVGTAWTQAASAPWWLWTSASAL GLLSIALLMANNIRDIPTDRATGKRTLAVRLGDPASRWVYAVCAVAPVPAMGALSAGLGL AWQSTAALVAACCAWSFLVIAPVATGAAGRALIPVLRSTGLYTLSWALLAAIALLGAR >gi|319977545|gb|AEUH01000212.1| GENE 5 4639 - 5958 1737 439 aa, chain - ## HITS:1 COG:ML2257 KEGG:ns NR:ns ## COG: ML2257 COG0318 # Protein_GI_number: 15828205 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Mycobacterium leprae # 91 422 41 359 368 161 41.0 2e-39 MATPRVLVVPGGTSAGAAALANASLHKLIELHQERQADPSHPAPRTIVVLRDPDQVPPST LRAAAVAPSEAFPADEYPDIAAHVDDPGFFDGIDLVIPTSGSSAGSPRLVGISTDALVAS AKATEAALSGPGRWILALPTHHIAGAMVLVRSAVAGTDPQIVDCTNGFDPRDLLPAVEGA TRDGAAGYLAVVPTQLSACLDAGDEVVAALRGLSAILVGGAATNHLLLERAKGMGLPVVT SYGMTETCGGCVYDGVPLPGTTVRAIDQGIHMRLAIAGDTLMTRYLTGDQPFFEECGHRW LITADIGIILASGLVEVRGRADDVIVSGGLSIAPAPIRRCIRQLDEASDAWIMPTDDTKW GQVVTALIVPREAPIDATAMASLGQRIRDHVASTLGRMQAPRRVVAVDSLPYLGFDKVDR AAAASLASSLAGTDRDWRR >gi|319977545|gb|AEUH01000212.1| GENE 6 6247 - 7689 2287 480 aa, chain + ## HITS:1 COG:sll0670 KEGG:ns NR:ns ## COG: sll0670 COG1376 # Protein_GI_number: 16331947 # Func_class: S Function unknown # Function: Uncharacterized 
protein conserved in bacteria # Organism: Synechocystis # 356 480 54 169 169 95 35.0 1e-19 MLAKLSARSKWAIGSLVAVVLVAAAACVVYAANYSSRALPHTSVAGTSVAGMTRAEVVDL VRSRADAATVTITAEGQATTASLTDAGVSVDAEATAKEAVPDSTSLMTKLGALFSSTDIA PVTTTSDDSVAELAAKVNSTLTTEMKNAQVVVSSDGTGFTVVPDQKGNGVDTADVKAAVL SAAASLSSNSADVSSKEMDPVVTTADAQRAADKATALIAQDVAIDDGIETYRATQEDKVK WVEFLTKSDGSLEDPSLDKVKVADWVNKTAQESDVAPVNGVNNVDSEGKVLTVAREGKPG LKTNNTEQIIKDLMASLSDGKDYSGTFHYDDVAPSWDTKQVAEGTENLVYKAAEGEKWVD IDLSTDTVTAYVGGKVAGGPFYMVPGAPDTPTVTGTFHVYLKYESQTMRGENADGSKYET EGVPWVTYFTGSYAMHGAPWRSSFGWSGYGGSHGCVNMPVDAAKFMYDWADMGSTVVVHY >gi|319977545|gb|AEUH01000212.1| GENE 7 8073 - 8711 838 212 aa, chain + ## HITS:1 COG:FN1382 KEGG:ns NR:ns ## COG: FN1382 COG1373 # Protein_GI_number: 19704717 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 5 206 201 400 402 120 34.0 2e-27 MVGQIIDKDIRARRKIRNRSTFDKVMTYVINNFAAPTNLTGIAEYLRNTQGVPIKRETLA NYIDILVNAKLLYRCDRFDMKSKRSLQGGEKYYLADTGIYFARNVDATMDYGPLLENAVF TYLKSKDYRVSVGSIGKLEVDFIARRADEGYSYIQVSMTVADKAVEEREYRPFSKVRDNF PQYLLTLDPLPLERDGVTHRNIVELMEADDDL >gi|319977545|gb|AEUH01000212.1| GENE 8 9070 - 9489 141 139 aa, chain + ## HITS:1 COG:no KEGG:PPA1607 NR:ns ## KEGG: PPA1607 # Name: not_defined # Def: phage-associated protein # Organism: P.acnes # Pathway: not_defined # 3 128 19 149 333 110 45.0 1e-23 MRFHGIVECLRPLMMGDRRLGCFVAALVDMGLGSPGGSALELRSESTWKSLGNGSRRLSA RLASELVSRWDVVVFGENLVGAYGEDALIDVAECVRALDPRVSKADVGEGIGRVLYEVFK RAAEEAAGRRAVGEGAHED >gi|319977545|gb|AEUH01000212.1| GENE 9 9838 - 9951 185 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLACHYRRCIVLRSAIIIRARWVLVQYSTCVPRGICI >gi|319977545|gb|AEUH01000212.1| GENE 10 10444 - 11409 872 321 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRREVRGVISGWVWRLWGVFCGRVVDQRAGMVVLTSVNLLMMALQCCFLLTEWVWWWLYL NTYTLGLFTAALYPCVFLALVSEVRLPWSPGFAVRLVAFQLPVGLAAGYLSAFLGGHYSP CGVINRAGYHDSELTIMECVARYRDLVQHAPSLFLFLALVYSAGLVARSRWDSGAEVLCA 
SGDGCARPDESLVRRLKRLRIGRRILWWTGLVLQMPSVTLWMYINDTRLPLPRFPQPEIG LNAFLIILFAAFALWIAVARLRRASSRILAANTGLRAPENAWTVLCAILVLCTTALVCVL SMAYAVSLGFILLMLEAMSFN >gi|319977545|gb|AEUH01000212.1| GENE 11 11402 - 11536 191 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGSGPSDGWYYRTADQSRGPILTSRVHIQCAVSACGTPTYQSFS >gi|319977545|gb|AEUH01000212.1| GENE 12 11608 - 12552 1018 314 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAATWAARLWGVFCERVVDPRAGVAVLTSMNLLMMVLQCCFLLTERVWWWIYLNTYTLGL FTVALYPGAFLSFASQVRLPWSREPLARFACYQALAFAFVVYGVAPLFGPASPCGYLNRI GYFPSIGAREQCFVRYADTTSSSPLLIALSALFYSSGVVIRAWRDSERGTPGYQAWQFIP SEREAILLRLWKLYARKRVVWWAGLVLQLPALTLLLDYDGNSLPLSRYPLEAMGAAAYGM TLFTGIALWVYVGRLESVLLRTFGDDVDLVEPKGVLAGLRYTMVTFITACVSFCSILSVL HFIYASMIALLIVW >gi|319977545|gb|AEUH01000212.1| GENE 13 13037 - 13570 387 177 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDRGLKFWRVFWDGVSEPRMTLAVFTSVNVVMTALHCCLLLTPLRVVFLVANFYTLGAFT LIVYPGVFLLLVLAVKPSPRISYGVKLVFSQMVASALTVIASIPLLRPCSDCCAVGALQC DLEIVSDHALFYHVFSYTSASIHSFVARPLFFLFFIPSSVYTLVLVLRVWKHSRADE Prediction of potential genes in microbial genomes Time: Thu May 12 18:47:42 2011 Seq name: gi|319977540|gb|AEUH01000213.1| Actinomyces sp. oral taxon 178 str. F0338 contig00213, whole genome shotgun sequence Length of sequence - 4090 bp Number of predicted genes - 5, with homology - 3 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 106 - 1152 933 ## - Term 1142 - 1175 1.9 2 2 Op 1 . - CDS 1243 - 2223 1178 ## COG0447 Dihydroxynaphthoic acid synthase 3 2 Op 2 . - CDS 2189 - 2407 189 ## + Prom 2231 - 2290 1.5 4 3 Op 1 1/0.000 + CDS 2331 - 3419 1509 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 5 3 Op 2 . 
+ CDS 3629 - 4088 521 ## COG1165 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase Predicted protein(s) >gi|319977540|gb|AEUH01000213.1| GENE 1 106 - 1152 933 348 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MREGAGVMWNRAVVMTASVGLVLCSAALSGCGGGTGAAEAPPASGAGQSQPVSGGAQSDP VPVPGEVVNGVAKDEATWSTPLAVYFGVDTNDLRLHAQSVVTATCMHENGYADFRENYDS TAPRPRTTAPDGVGRLFNEELAAQYGYRNAPNPQYLIEADVEAAGGGGLYENASPEFKDQ WYACLDRATAEVNEGEPPRRLPTGDGPDQATLDSIQSQLNRFHIDTTGPQLQADAAAWRQ CMAPLGIADLPDRPWEAGSMRMPDSLREKWDWRPSGKPSADEVEVAAHDAQCRRSSGWSH DLYDQTWDQSVLFYNQHRAELDPMLAEYAARSAHYRQIITQYGGTPRE >gi|319977540|gb|AEUH01000213.1| GENE 2 1243 - 2223 1178 326 aa, chain - ## HITS:1 COG:VNG1079G KEGG:ns NR:ns ## COG: VNG1079G COG0447 # Protein_GI_number: 15790175 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroxynaphthoic acid synthase # Organism: Halobacterium sp. NRC-1 # 7 326 2 304 304 381 61.0 1e-106 MSALPYVSDTFDQARWRPVEDFSFTDITYHRGISRGEELDGREAGADMPVVRVAFDRPQI RNAFRPLTVDELYAALDHARRAPDVAAVILTGNGPSPKDGGYSFCSGGDQRIRGHDGYCY EVEGADASADTGRRREQVDPARAGRLHILEVQRLIRSMPKAVIAAVPGWAAGGGHSLNVV ADLSVASVEHAAFMQTDANVGSFDAGYGSALLSRQVGDKRAREIFFLAEHYDARTAERWG VVNRAVPHAQLEETALQWARTIASKSPQAIRMLKYAFNLADDGLAGQQLFAGEATRMAYM TPEAQEGRDAFLEHRAPDWSPYPYYY >gi|319977540|gb|AEUH01000213.1| GENE 3 2189 - 2407 189 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGMRVSVAVARVFVIGPSSGVRRVVMGFPWVVGAAGCVQRCHYPAFLPVGKAAVSERVRW LHERPPLCFRHF >gi|319977540|gb|AEUH01000213.1| GENE 4 2331 - 3419 1509 362 aa, chain + ## HITS:1 COG:ML2268 KEGG:ns NR:ns ## COG: ML2268 COG4948 # Protein_GI_number: 15828213 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Mycobacterium leprae # 43 355 12 314 334 203 43.0 3e-52 MTTRLTPDEGPMTKTRATATLTRIPMKTLPPLLRKKFKAIKDLLVYSVPMRTRFRRVTSR DGILLHGKAGWGEAAPFWDYGPEESARWIDAAINEATVPIPFTRRKSVPVNVTVPVLSPE 
AAAARVEESHGCATAKVKVADPGSTLKEDCARVEAVADALTRTVKRERWIRLDANGAWDV DQAVTAIHELERAAGDVPIEYVEQPCATADELYELHRLIDVPIAADESIRRAPNPVAAAR MSGAQVAVIKIAPLGGPERALRIARDTGLRVVVSSALETSVGLAAGVRTAAALPGRALAC GLATASLLAADVTAPLEVSGGRIHVATRSPDPKLIDHSPVDGDLLSKWLARLMACAPHIL DR >gi|319977540|gb|AEUH01000213.1| GENE 5 3629 - 4088 521 153 aa, chain + ## HITS:1 COG:ML2270 KEGG:ns NR:ns ## COG: ML2270 COG1165 # Protein_GI_number: 15828215 # Func_class: H Coenzyme transport and metabolism # Function: 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase # Organism: Mycobacterium leprae # 4 150 2 151 556 114 49.0 7e-26 MTENPSILAARAVVGSLIGAGVGHVVYCPGSRDAPFAYALSGLGGLVGTTVAIDERSAGF YCVGLARAGVLAAVITTSGTAVTELHPAVAEASHARLPLVVVSADRPFELRGVGASQTTD QVGLFASHVRGTWDIPAGAEDAGRLAGLVARAA Prediction of potential genes in microbial genomes Time: Thu May 12 18:48:03 2011 Seq name: gi|319977533|gb|AEUH01000214.1| Actinomyces sp. oral taxon 178 str. F0338 contig00214, whole genome shotgun sequence Length of sequence - 6850 bp Number of predicted genes - 7, with homology - 5 Number of transcription units - 6, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 714 635 ## COG1165 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase 2 2 Tu 1 . - CDS 660 - 779 72 ## 3 3 Tu 1 . + CDS 807 - 2756 2041 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain + Term 2845 - 2901 0.5 4 4 Tu 1 . - CDS 2855 - 4042 1186 ## COG1169 Isochorismate synthase 5 5 Tu 1 . + CDS 4241 - 4948 335 ## PROTEIN SUPPORTED gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 + Term 5006 - 5045 0.1 - Term 5421 - 5476 5.0 6 6 Op 1 . - CDS 5653 - 6495 902 ## COG0191 Fructose/tagatose bisphosphate aldolase 7 6 Op 2 . 
- CDS 6506 - 6850 398 ## Predicted protein(s) >gi|319977533|gb|AEUH01000214.1| GENE 1 1 - 714 635 237 aa, chain + ## HITS:1 COG:MT0581 KEGG:ns NR:ns ## COG: MT0581 COG1165 # Protein_GI_number: 15839953 # Func_class: H Coenzyme transport and metabolism # Function: 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 5 218 326 542 554 117 40.0 3e-26 VWAPPDWLSQWEQQDRAARAAVNALVARCAAQGGPPTMFEVARAVEEASDGPLVLGASNP VRAVDLVCADLAGRRIHSNRGLAGIDGTIATAAGIARGLGAASGAPAPRTVVLLGDLAFF HDASSLALAAAEGAPLDIVVADDHGGGIFATLEHGRREHADLYDRWFGAGQSTDTAALAA AYGARYQRLDADGLRGALREGADGGGVRVLHVPISRAAGLYATAGSASALGLAAGRV >gi|319977533|gb|AEUH01000214.1| GENE 2 660 - 779 72 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSQVCTSFIPNHWGFTESCLNPHTRPAAKPSAEALPAVA >gi|319977533|gb|AEUH01000214.1| GENE 3 807 - 2756 2041 649 aa, chain + ## HITS:1 COG:TM0571 KEGG:ns NR:ns ## COG: TM0571 COG0265 # Protein_GI_number: 15643337 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Thermotoga maritima # 319 646 23 349 459 177 34.0 5e-44 MSDENSQWVPRDPGTRPEAWTDQAQQAPYVSGAGQAPTAGAAPQAPSAAPSAAETEQLPR VQEGAWLPGADETAQLPRVDGAALPLSAQETMQLPRVQETMQLPRVQETAQLPSATAGQF QSAASPAYGQPPAEAGPAYGQFQSEGFPAYGQPPAEAGPAYGQFQYEASPAYGQPPAEAG PAYGQTGFEAGPAYGQFQSEGSPAYGQPPAEAGPAYGQEAPGGDGSHDFPQPVGQMPLQQ GAPEYAPVTGAPEYAPVTGAAVVRKGPGWLALVASMLVTALVTAGGLWFVLRPDSTQDSS AASANGGTVASVASADSAPDWQAVASAVSPAVVTIQVQSSSSTGMGSGVVYDAKGDIVTN YHVIAAAVQSGGKIQVTLADGRIYDAEVVGHDRTTDLAVIRLVNPPSDLTVARFGSSSHL TVGAPVMAIGAPLGLSKTVTTGIVSALNRPVEVAVDDDSSKNNGQGDSSDPFGQQKRNQS SSDTVITNAIQVDASLNPGNSGGPLFDQTGAVVGINSSIKSVTSSDGQAGSIGLGFAIPS DLVTSVADQLIAKGTVSHAVLGVNVTTAAVSVGGDTYAGAELADVTSGGAADKAGLRKGD VITQVEGQEVSSAKQLTGYIRRYKGGDTVKLTYVRDGASHEVQVALQAK >gi|319977533|gb|AEUH01000214.1| GENE 4 2855 - 4042 1186 395 aa, chain - ## HITS:1 COG:BS_menF KEGG:ns NR:ns ## COG: BS_menF COG1169 # 
Protein_GI_number: 16080135 # Func_class: H Coenzyme transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Isochorismate synthase # Organism: Bacillus subtilis # 136 394 203 468 471 148 33.0 2e-35 MVGVGAAWRFAHRGPGAIAGFHEQWEAVASGPPAPAPQPVAFASFAFSSGVSVALVPAVA VIDCEHGRFVVHASTPGAGAEHAAGAGTAPAPGADALPAPSAADLPADPLEAAAALLERR SPVAVARLVGSEQGSMTRSQWRGQVTRMAALLRGGRAQKAVLARDMIARVRGFDERRLLE RLSALYPSTWIFGVDGLIGATPEMLASARSGRVFSRVLAGTCAPGGGPALLESDKDLREH ALAVESVASALHPLCSDLRVPQEPFLLELPNVTHLATDVSGVLDASLLDAVAALHPTAAV CGTPRHESMGLIERYEDTDRGRYAGPVGWIDTSGEGEFALALRCGQVEGAGAGAEDGSRI RLFAGAGIMPDSDPGAELVETEAKMRPLLDALGAR >gi|319977533|gb|AEUH01000214.1| GENE 5 4241 - 4948 335 235 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 [Kordia algicida OT-1] # 19 232 1 221 221 133 34 3e-31 MSEVLRADLDKTQADVASMFDHVAERYDMMDALMTGGMNHVWMAAFRKAVAPHPGERILD LAAGTGVSSAALAKGGAEVVACDLSEGMIEVGRQRHPDIEFVQGDAMDLDFDEASFDAVT ISYGLRNVPDPEGALREMARVVRPRGRLVVCEFSTPPSRAFGAVYGAYQRTVMPLLARLA STNAQAYDYLVESIRQWPDQRALGAMIARNGWSEVQYRNLTGGIVALHRAVKPLP >gi|319977533|gb|AEUH01000214.1| GENE 6 5653 - 6495 902 280 aa, chain - ## HITS:1 COG:TM0273 KEGG:ns NR:ns ## COG: TM0273 COG0191 # Protein_GI_number: 15643043 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Thermotoga maritima # 4 275 8 309 315 169 35.0 7e-42 MFADTAALARECAQQGRGLAAFNVVLLDHAEAFIDAAERCGLPLVLQLSQNAVKYHGGRL APLGKALLELARASSARVAVHLDHAEDPGLCRAAIDLGFSSIMYDGSALPDADNRASTKR IAEAAHAAGISVEAELGEVGGKNGVHDPSARTDPDEAAAFVADTGVDMLAVAVGSSHAMA TRDAVLDDARIAAIHRAVAVPLVLHGSSGVPDDGMRSAIRAGMTKINVSTHLNAVFTSDV RRMLDEDPRAVDPRKYMGPGNRAVSAEAERLMRLYTTVHD >gi|319977533|gb|AEUH01000214.1| GENE 7 6506 - 6850 398 114 aa, chain - ## HITS:0 COG:no KEGG:no NR:no VIGPGEGAREAAERARAPGGGPAAMVFASAMGLLGAQLGIVRHQVGDVPLVIGGGLSEGG PLVYDNLARGAASVTGRMPPPRIVPAALGPASQALGAAALALRSAQGPSALAHQ Prediction of potential genes in microbial genomes Time: Thu May 
12 18:48:22 2011 Seq name: gi|319977512|gb|AEUH01000215.1| Actinomyces sp. oral taxon 178 str. F0338 contig00215, whole genome shotgun sequence Length of sequence - 20169 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 6, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 679 246 ## COG1940 Transcriptional regulator/sugar kinase 2 1 Op 2 . - CDS 676 - 1548 1284 ## COG2222 Predicted phosphosugar isomerases - Prom 1599 - 1658 1.7 3 2 Tu 1 . - CDS 1764 - 3176 1618 ## COG0673 Predicted dehydrogenases and related proteins 4 3 Tu 1 . - CDS 3952 - 5112 1687 ## COG1940 Transcriptional regulator/sugar kinase 5 4 Tu 1 . - CDS 5231 - 7036 2278 ## Sked_01980 hypothetical protein 6 5 Op 1 38/0.000 + CDS 6995 - 7993 1464 ## COG1175 ABC-type sugar transport systems, permease components 7 5 Op 2 14/0.000 + CDS 8007 - 8885 1305 ## COG0395 ABC-type sugar transport system, permease component 8 5 Op 3 1/0.000 + CDS 8914 - 10269 2309 ## COG1653 ABC-type sugar transport system, periplasmic component 9 5 Op 4 . + CDS 10397 - 11887 2150 ## COG3345 Alpha-galactosidase 10 6 Op 1 . + CDS 12122 - 13402 1680 ## COG0644 Dehydrogenases (flavoproteins) 11 6 Op 2 30/0.000 + CDS 13440 - 13799 188 ## PROTEIN SUPPORTED gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A 12 6 Op 3 34/0.000 + CDS 13822 - 14376 712 ## COG0377 NADH:ubiquinone oxidoreductase 20 kD subunit and related Fe-S oxidoreductases 13 6 Op 4 22/0.000 + CDS 14373 - 15110 994 ## COG0852 NADH:ubiquinone oxidoreductase 27 kD subunit 14 6 Op 5 15/0.000 + CDS 15110 - 16459 2023 ## COG0649 NADH:ubiquinone oxidoreductase 49 kD subunit 7 15 6 Op 6 23/0.000 + CDS 16456 - 17157 943 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit 16 6 Op 7 12/0.000 + CDS 17154 - 18455 1687 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 17 6 Op 8 . 
+ CDS 18452 - 20168 2412 ## COG1034 NADH dehydrogenase/NADH:ubiquinone oxidoreductase 75 kD subunit (chain G) Predicted protein(s) >gi|319977512|gb|AEUH01000215.1| GENE 1 2 - 679 246 225 aa, chain - ## HITS:1 COG:lin0031 KEGG:ns NR:ns ## COG: lin0031 COG1940 # Protein_GI_number: 16799110 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Listeria innocua # 41 197 6 170 321 66 33.0 4e-11 MSADDRGGGAEGGAPRPAAPGRALPSGGAGHSPSPAAPVPAAIGIDIGGTTVKGALVGQD GSVLGTAGPRATPVDDVPGLVGAVVGLVRRLDAGRGVPVGVCVPGIVDEALGVGVLSANL GWRGAPLRRLLSAALGSPVALGHDVRCGALAESLWGVGEADMLYVAIGTGIASALVIGSR PCPAPAWAGEIGQIVVEDPDHPGRRAPSSRSPPPRPSPGAPPKPE >gi|319977512|gb|AEUH01000215.1| GENE 2 676 - 1548 1284 290 aa, chain - ## HITS:1 COG:PAB1348 KEGG:ns NR:ns ## COG: PAB1348 COG2222 # Protein_GI_number: 14521735 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Pyrococcus abyssi # 1 287 19 339 341 87 30.0 3e-17 MSQTLEEIASQPRLWRRALEVDASPLPRDGESLAVVGCGTSWFIAQAYTTAREREGKGPG DAFTATEFPWDRHYDRVICLSRSGTTTEVVDVLRRLAERGTPGLLVTAVGGGPASPFASA EVVLGFADEESVVQTRFATSALAYFRASLGHDIAAAAADAELALDAELPQRWVDADQIAF LGTGWTIGLANEAGLKLREASQSWTEAYPAMEYRHGPISIAQPGRLTWVFGPVPEGLGEQ VAATGAELVSSQWDPMAHLVLAQRLAVARAAARGLDPDAPRNLTRSVILP >gi|319977512|gb|AEUH01000215.1| GENE 3 1764 - 3176 1618 470 aa, chain - ## HITS:1 COG:SMc04129 KEGG:ns NR:ns ## COG: SMc04129 COG0673 # Protein_GI_number: 15963876 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Sinorhizobium meliloti # 103 433 35 388 433 104 29.0 4e-22 MSARTTSGLARAPAQSTHAPAQSAHAPAQPAHAPAQSAQALVPTLLSRILHNPSATRIID GAHRRHSPDGGTMVAELRIGIIGLGARSAIAANLARTGLGARLVGGADPDPQMRARAARA FGEHLAWVEDTDQLLAAGIDAAIVTSPDDTHEAITCALLEAGAAVYLEKPIAITIEGADR VLETAYRTGTPLYVGHNMRHMAVVRTLRDVIRQGTIGQVKAIWCRHFVGNGGDYYFKDWH ADRSRTTGLLLQKAAHDIDVMNWLSDSVPARVVAMGDLMVYGSLADRAPRPGQLMQDWFS 
FDNWPPESLTGLNPTIDVEDMSMLLMRQVSGAMVSYEQCHFTPDYWRNYTVIGTRGRAEN FGDGQGGVVRVWTHRRGYDAKGDIEIPLRGDAGGHDDADVATMAEFLRFVLNGEPTDTTP LGAREAVAAGVLATRSLRNHNTPYDVPRVRPELAAYFEDNQRRATRLRGA >gi|319977512|gb|AEUH01000215.1| GENE 4 3952 - 5112 1687 386 aa, chain - ## HITS:1 COG:BH0797 KEGG:ns NR:ns ## COG: BH0797 COG1940 # Protein_GI_number: 15613360 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Bacillus halodurans # 85 360 3 290 293 86 27.0 8e-17 MLEAFGPDRVRELNSLTVISYLRGVDSATVTQLSQATGLSRTSISSTVAHLEKLGWILSL PPANGNLAGRPAKRYRFASTAGLVLGVDIGANRVSALLTDLNGAELGYAEDPVSPRTPAA TRVERTHGVIDEALNRAEVDAQRVSVVSAGLTGPVDLEGNSPFGNTLPGWSAVNIVQDLS ERYGRRVLVENDCKLALTAEAWRGQAIGVSNGAFIMAGARVGAAFMIDGVVCRGGGGAAG EVGALDILRWAKAPAIFTRHPDLPKGLPLIMAAEWVCSRMRGGDREAARLVHEFAHDLAV GAAALALTVDPDVLILGGGISASADLWKDPFAEALAPLVLRMPELRVSQLGSRGVVEGAA WRGAQFLEAHSYSDYLLPAQSLTQHA >gi|319977512|gb|AEUH01000215.1| GENE 5 5231 - 7036 2278 601 aa, chain - ## HITS:1 COG:no KEGG:Sked_01980 NR:ns ## KEGG: Sked_01980 # Name: not_defined # Def: hypothetical protein # Organism: S.keddieii # Pathway: not_defined # 80 524 56 494 578 415 50.0 1e-114 MTNFVKTIDRIAPHCLTSLYGGPSSTVNTEGFSRMRTARKTVTAAAAGLLVAAGACLPAA LGTGSASSIASTPSDQWQPVASAPAPVANPLKGFMPFAPDDGSAPPSASDGLGYTMEWTY MPVSSVVTGYHAYDFSAFEARLNQIGQRGHQTIFRFYLDYPGRPTGVPQYLIDQGIDTSR TYTVFDNNKISFSPDYNDSRVQDMILDFVRALGEKYDGDARIGFITAGLIGFWGENHTWP MNGETSADNPKGENWMPAQDFQDRLVAAWDDAFNTTAVQYREPSAATKAHGMGYHDDSFA YSTLDNVDWHFMSHMKAEHEEDAWQHAPIGGEVYPPLQTCIFSQPLGCPGADAEKAQGRN FDMDASLDATHATWLMNHKAFSEGYTGADLERAKTANARMGYVLSATRARASVSEPASNG AGTRTVSVEAEIANSGLAPFYADWPLEFALLDPSGAVVGTQRAGGVLPTIQPGASATARA DITVPADAGDLAVTMRVVNPLPGGAPVAFANEGMGTRLEGALTLPLAVSTLQTGPAPSPS ASPDPAGTATQAPADPGAPTARPTAGTKASHLANTGSLTAPLIIVAAVAGAVGALLVRRK R >gi|319977512|gb|AEUH01000215.1| GENE 6 6995 - 7993 1464 332 aa, chain + ## HITS:1 COG:ML1768 KEGG:ns NR:ns ## COG: ML1768 COG1175 # Protein_GI_number: 15827944 # Func_class: G Carbohydrate transport and 
metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mycobacterium leprae # 36 320 30 315 328 147 34.0 4e-35 MGSNSVNGLDKVCQRCFISRVTSMEVFSMTSPAVAGERTAAPRARKRRDDRIAAAVLLAP SFAGFLVFYVYPALRGLWYSFTDFSMLKPPNFVGLDNYSAALGSTEVWNAFVVTVAYAAY NIIGQTFLGLLLAALMQRFARSTWVRSMLLLPWLVPNVAIALIWGWLLDANLGFVTHLLK GIGIEGVTFFNERAAMPIVAGINIWAYTGYTALLLYAGMLQIPGELYESAALDGAGEARM FFSITLPLLRPVLVLVFVMGLIGSFQIFDTVQIGYAGHPIPAVRVIYYYIYQQFTFLKMG YASAVAMLLVCVLGVLTAIQLRLMRAGTSDLA >gi|319977512|gb|AEUH01000215.1| GENE 7 8007 - 8885 1305 292 aa, chain + ## HITS:1 COG:MT2099 KEGG:ns NR:ns ## COG: MT2099 COG0395 # Protein_GI_number: 15841527 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mycobacterium tuberculosis CDC1551 # 24 291 30 278 280 152 33.0 8e-37 MLRRIQASRVAAYIIIALAVLVTVFPFYWMIRTALTPQADLIADSTRLWPSHPTLINFKR ILGLVSVEEAQAAGGSGATINFAIATTNSLIYSGGIAIAQTFFAACAAYAFARLRFPGKN LLFGCVIGALMVPGIFSLLPNYVLVKQLGWLDTMHGMMLPALFMAPFSVFFLRQFFLSLP REIEEAAMLDGLGPIARFFRVTIPMSTGPIMTMALITIIGMWKDFLWPLLVGRNGAQLLT VALGIFQQQSPNRAPDWTGLMAGSTLSVIPVLILLVLMGRKLVESLNFSGIK >gi|319977512|gb|AEUH01000215.1| GENE 8 8914 - 10269 2309 451 aa, chain + ## HITS:1 COG:ML1770 KEGG:ns NR:ns ## COG: ML1770 COG1653 # Protein_GI_number: 15827946 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mycobacterium leprae # 2 410 3 403 446 127 26.0 3e-29 MKPRTIASVGLAALMTASLAACNANTSSTTGGESGADGVTINYWLWDDRQAPLYQQCADD FHAANPGITVKITQTAWGQYWETLTTQLSAGNAPDVFTNHSAHLLQFIDNNQIMDLTEMA QKAGVDDSIYQPGLADLSKYDGKRYGLPKDWDTEGLLYDTAVATEAGYSKEDMAELTWNP TDGGTFEQFIAKTTVDANGKNGLDPDFDKSNVVRYGFYPEWADGAIGQNGWGNLAASNGF TYGDRNVAPTKFNYSDQSLVDTVTWIHGLIDKGYAPKYDQQSTLGTEATMNAGTAASTIQ GSFTASGYLGKDAQRAFAFAPLPKGPIGRRSAMNGLHDVVWSGTTHPDEAFKWVAYMGSE ACQVKVGESGVIFPAATKGTEASLKAREAQGQDNSAFTSVVENKETFPVPVLAHGDEVNT LIEDAIKAIADGADAKSTLEAANQKANDLLK >gi|319977512|gb|AEUH01000215.1| GENE 9 10397 - 
11887 2150 496 aa, chain + ## HITS:1 COG:TM1192 KEGG:ns NR:ns ## COG: TM1192 COG3345 # Protein_GI_number: 15643948 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Thermotoga maritima # 167 426 184 444 552 157 32.0 6e-38 MTSPAPALPNGAPLLASPGARIVKEGDGHVLIRASRVAILHSMAGGQYYRSGHNSWSPSG WSALDGPPLRIADPERRTTADDPSSDDPFRHHGSWMGAVARGEGGEALLVGALDGAHPCV RCDEDTLVGFDPGGDALWMVAWGDEQGVFDSYTAALAPRALPRRPAPRVWCSWYSYYEKV TWRDIEEQLDSLAGLGYRTVQIDDGWQAGVGDWRPNAKFGDLAECARRIRDHGMTPGLWV APFIAREGAPLLEAHPEAFVHGPDGSPVVAGYNWGGPYYALDTTHPTAQHYLRGLVRTLV GAGIGYLKLDFINAAAVPGQRFQEAWGDEAYRLGCAVIREEAPDAYLLGSGALVIPSIGV LDGIRVSCDVAPIWKNYATEDPSDAEARNAFKGSVARLWLRSLIDVDPDVVFFRHVKNLL NDQQIEWLRDVAAVAGFKSSSDPVAWLTEEEKGTARAWLARDEDIERVSRNTWRIDGREV DFGPGLARPEHCYPVS >gi|319977512|gb|AEUH01000215.1| GENE 10 12122 - 13402 1680 426 aa, chain + ## HITS:1 COG:ML2276 KEGG:ns NR:ns ## COG: ML2276 COG0644 # Protein_GI_number: 15828219 # Func_class: C Energy production and conversion # Function: Dehydrogenases (flavoproteins) # Organism: Mycobacterium leprae # 8 381 4 356 408 176 36.0 1e-43 MAANSEGDTCADVVVVGAGPAGAATAYHLATAGADVLLLDKSAFPRDKICGDGLTPAAVH ELAAMGVDTTGWARNRGLTVIGGGHTIHMEWPDQKSLPGYGLTRARADLDHALVRRAVEA GARLVEGATATGATTDASGRVTGVAVRSGRGRDARSYTVSAPLTVDAGGVAARLSTGLGL AKRANRPMGVAARAYFRSPRGDEEWMESHLELWSGEPGRSDLLPGYGWIFPMGDGVVNVG LGSVSSRAGGTDLPYKKVFEAWTSNLPAEWGLTPGNQIGPLRSAALPMCFNRAPHYAQGL ALVGDAGGMVSPFNGEGIAPAMRAGRFAASCAVQALRRTTRAGFDRAMGEYPRLLREEYG GYYQLGRAFVALIGKPRIMRACTNLGLPVPRLMKLVHKLLSDGYERHGGDIDDRLITTLT KLVPPA >gi|319977512|gb|AEUH01000215.1| GENE 11 13440 - 13799 188 119 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A [Campylobacter curvus 525.92] # 5 119 14 125 129 77 31 9e-14 MTNPYVPLLIMSAAALVLAFGGLAASAVLGPTKKSTTKADNYECGIQPTSAHLTEGRFPV RYYLVAMTFIIFDIEVVFMYPWAVSFNQLGLFGLAVMMSFLVTLAVPYAYEWRRGGLDY >gi|319977512|gb|AEUH01000215.1| GENE 12 13822 - 14376 712 184 aa, chain + ## HITS:1 COG:MT3234 KEGG:ns NR:ns ## COG: 
MT3234 COG0377 # Protein_GI_number: 15842722 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 20 kD subunit and related Fe-S oxidoreductases # Organism: Mycobacterium tuberculosis CDC1551 # 1 183 1 183 184 244 67.0 8e-65 MGLEESLPAGIALTSVEKVLGLARKHSQWPVTMGLACCAIEMMAAGTPRFDMARFGLEVF RASPRHSDLMIVSGRVSHKMAPIIRRVYDSMPEPKWVISMGACASSGGVFNNYAVVQGCD HIVPVDVYLPGCPPRPEALIHAVLVLREQIGKEPLGVHRREIAREAERAALEATPTHQMK GLLV >gi|319977512|gb|AEUH01000215.1| GENE 13 14373 - 15110 994 245 aa, chain + ## HITS:1 COG:Rv3147 KEGG:ns NR:ns ## COG: Rv3147 COG0852 # Protein_GI_number: 15610283 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 27 kD subunit # Organism: Mycobacterium tuberculosis H37Rv # 28 245 19 236 236 265 59.0 5e-71 MSDLSIPGEGAPARAEAAPSPRGFALPEPIARREGLFGAGADSSTSGFSGLVADSFLPGE AARPYGGWFDQAVDVLEERIGADGLEVADVIEKVVVDRGQLTVFIAREHIVRVASYLRDD PDLRFEMCLGTNGVHYPADTGRELHAVYPLYSITHNRMIRLEAACPDEDPRIPSIVSVYP ANDWQERETWDLLGIVFTGHPSLTRTALPDDWVGHPQRKDYPLGGVPVEFKGATTPPPDT RRSVN >gi|319977512|gb|AEUH01000215.1| GENE 14 15110 - 16459 2023 449 aa, chain + ## HITS:1 COG:Rv3148 KEGG:ns NR:ns ## COG: Rv3148 COG0649 # Protein_GI_number: 15610284 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 49 kD subunit 7 # Organism: Mycobacterium tuberculosis H37Rv # 23 449 16 440 440 526 60.0 1e-149 MSRPAFHAAPGAADLPLDDIPEVLAQGGDWDEVLAEIEKLTSERIVVNLGPVHPSTHGVL RLILELDGEKVRETRVDTGYLHTGIEKNMEYRTWTQGVAYCTRMDYVAPFFQEAAYCLGV EKLLGITDDVPERASMIRVVMMELCRIASHLVAIGSTGNEMGATTIMTIAFRGREEILRV FERVTGLRMNHEYIRPGGVIQDIGEGTTGYIRERLRGVRRDIGELQDILQENPIFKKRLC DVAVMPLSALMALGHTGPGVRAAGLPLDLRKTQPYCGYEDYEFDVPTRDQSDVYNRTMVR FDECYESMRIVWQALERLDACEGAPTMVSDPQIAWPARLSVASDGQGNSPDHVREIMGES MESLIHHFKLVTEGFHVPAGQVYQTVEHAKGILGVHLVSDGGTRPFRAHFRDPSFANLQS LAMMTEGGQLADVVVSLAAIDPVLGGVDR >gi|319977512|gb|AEUH01000215.1| GENE 15 16456 - 17157 943 233 aa, chain + ## HITS:1 COG:MT3237 KEGG:ns NR:ns ## COG: MT3237 COG1905 # 
Protein_GI_number: 15842725 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Mycobacterium tuberculosis CDC1551 # 2 226 31 252 252 232 55.0 4e-61 MSYTPDVEARLRADSAQIIARYPEGHSRSALLPMLHLVQSVDGYVSADGIDLISRILDLP RAEISAVATFYTQYKRHPTGDYLVGVCTNALCAVMGGDEIWERVSAKVGVGSDETSESGR ITLERIECNAACDYAPVVMVNWEFFDNQTPESALALIDDIEAGRDIHPTRGPAVAPAFKE NERLLAGFPDGRADEGPSAGPATLLGVGIAQENGWTEPVAVPGGAQAEGGEQA >gi|319977512|gb|AEUH01000215.1| GENE 16 17154 - 18455 1687 433 aa, chain + ## HITS:1 COG:MT3238 KEGG:ns NR:ns ## COG: MT3238 COG1894 # Protein_GI_number: 15842726 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Mycobacterium tuberculosis CDC1551 # 9 409 7 422 445 525 62.0 1e-149 MTTYRAPGPLTPVLTHGWGQERSWTLDSYRQRGGYQGLERARTMEPAAIIDAVKASGLRG RGGAGFPTGLKWSFLPPDDGGPRYLVVNADESEPGTCKDIPLIMGNPHVLIEGIAITSRA IGCDHAFVYLRGEVAHVYRRLLRAVREATESGVLGGLRITAHAGAGAYICGEETALLDSL EGRRGHPRLKPPFPAVAGLYARPTVVNNVETIAQVAGVFRNSPEWYSSMGTDKSKGHGIF SLSGHVANPGQFEAPFGITMRELIDMAGGIRQGHRLKFWTPGGSSTPIFTEDELDTPLDY ESVGAAGSMLGTRALQVFDETVSVVRVIARWSEFYQHESCGKCTPCREGTYWIKQIMLRL ERGEGLPGDVDLLDEIAHNIAGRSFCPLGDAAATPIMSGIKRFREEFEAGLTTPARERFP YGASATHARGGAR >gi|319977512|gb|AEUH01000215.1| GENE 17 18452 - 20168 2412 572 aa, chain + ## HITS:1 COG:MT3239 KEGG:ns NR:ns ## COG: MT3239 COG1034 # Protein_GI_number: 15842727 # Func_class: C Energy production and conversion # Function: NADH dehydrogenase/NADH:ubiquinone oxidoreductase 75 kD subunit (chain G) # Organism: Mycobacterium tuberculosis CDC1551 # 259 572 187 504 791 313 56.0 5e-85 MSDAPTMVDVIIDDVHVSVPKGTLVIRAAEQAGIRIPRFCDHPLLEPVAACRQCLVEVGM PDRNTGELRFMPKPQPSCAQTVAPGMVVKTQHTSGVADTAQRGVMEFLLINHPLDCPVCD KGGECPLQNQAMTEGRGKSRFADAKRTFKKPLRITSQILLDRERCILCQRCVRFGKEISG DVFIDLQGRGGGTAPTDDHYFMGEQIGGFDTTTLGFFDPAAKDNSAPDLSGPYGTDAIAG SFNEGDLGPAGLDQSGRAFASYFSGNIIQICPVGALTAASYRFRARPFDLVSTASVSEHD 
ASGSAVRQDIRRGVVVRRMAGNDPEVNEEWITDKDRFAFEWDGQDRLRTPLVREDGELVP TSWSDALDRARKGLEAAPSAGFLPGGRLTFEDAWAWSKFARTVLGSDNIDFRSREATEEE RSFLASTVAGTGLGVTYSDLERAGQVLLVALEPEDECGALFLRLRKGVRRGGARVATVAP YTSTGSQKLSATVLRSAPGTEPAVVDSIAPGGQNAALFGALSGGVVLVGERAARTPGLLT AVRALAQRSGARLAWVPRRAGDRAALEAGLLP Prediction of potential genes in microbial genomes Time: Thu May 12 18:48:52 2011 Seq name: gi|319977463|gb|AEUH01000216.1| Actinomyces sp. oral taxon 178 str. F0338 contig00216, whole genome shotgun sequence Length of sequence - 49882 bp Number of predicted genes - 54, with homology - 37 Number of transcription units - 26, operones - 14 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 31/0.000 + CDS 3 - 2180 2069 ## COG1005 NADH:ubiquinone oxidoreductase subunit 1 (chain H) 2 1 Op 2 28/0.000 + CDS 2173 - 2832 774 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 3 1 Op 3 30/0.000 + CDS 2832 - 3785 1319 ## COG0839 NADH:ubiquinone oxidoreductase subunit 6 (chain J) 4 1 Op 4 26/0.000 + CDS 3782 - 4081 528 ## COG0713 NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) 5 1 Op 5 30/0.000 + CDS 4096 - 6060 2909 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit 6 1 Op 6 22/0.000 + CDS 6088 - 7638 2226 ## COG1008 NADH:ubiquinone oxidoreductase subunit 4 (chain M) 7 1 Op 7 . + CDS 7635 - 9203 2112 ## COG1007 NADH:ubiquinone oxidoreductase subunit 2 (chain N) 8 2 Tu 1 . + CDS 9335 - 10213 1091 ## COG0142 Geranylgeranyl pyrophosphate synthase 9 3 Tu 1 . + CDS 10364 - 11875 2101 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 10 4 Tu 1 . + CDS 12022 - 12903 985 ## COG0501 Zn-dependent protease with chaperone function 11 5 Tu 1 . 
- CDS 12924 - 13514 946 ## COG1666 Uncharacterized protein conserved in bacteria - Prom 13547 - 13606 80.3 + TRNA 13525 - 13606 60.0 # Tyr GTA 0 0 + TRNA 13704 - 13776 78.4 # Thr GGT 0 0 + TRNA 13806 - 13882 88.8 # Met CAT 0 0 + Prom 13808 - 13867 80.4 12 6 Tu 1 . + CDS 13931 - 14101 270 ## PROTEIN SUPPORTED gi|227493374|ref|ZP_03923690.1| ribosomal protein L33 + Term 14117 - 14152 8.1 - Term 14002 - 14056 4.5 13 7 Tu 1 . - CDS 14170 - 15009 1146 ## COG2186 Transcriptional regulators + Prom 14834 - 14893 2.0 14 8 Op 1 1/0.200 + CDS 15073 - 16629 2135 ## COG0747 ABC-type dipeptide transport system, periplasmic component 15 8 Op 2 4/0.000 + CDS 16706 - 17623 1263 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 16 8 Op 3 5/0.000 + CDS 17681 - 18685 267 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 17 8 Op 4 . + CDS 18720 - 19406 852 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase + Term 19642 - 19680 -0.2 + Prom 19506 - 19565 2.8 18 9 Op 1 . + CDS 19694 - 20128 255 ## 19 9 Op 2 . + CDS 20189 - 21580 567 ## 20 9 Op 3 . + CDS 21577 - 22719 562 ## COG1404 Subtilisin-like serine proteases 21 10 Op 1 . + CDS 22829 - 23140 345 ## 22 10 Op 2 . + CDS 23140 - 24864 1343 ## Srot_0053 hypothetical protein + Term 24878 - 24906 -1.0 23 11 Op 1 . + CDS 25032 - 25595 345 ## 24 11 Op 2 . + CDS 25690 - 25758 67 ## 25 12 Tu 1 . + CDS 25883 - 26500 292 ## + Term 26546 - 26581 -0.7 26 13 Op 1 . + CDS 26632 - 27246 376 ## 27 13 Op 2 . + CDS 27243 - 28274 1020 ## Noca_3206 putative transmembrane protein 28 14 Tu 1 . - CDS 28469 - 29122 731 ## gi|293190832|ref|ZP_06608994.1| hypothetical protein HMPREF0970_01327 + Prom 28929 - 28988 2.5 29 15 Tu 1 . + CDS 29121 - 29255 111 ## 30 16 Op 1 . - CDS 29260 - 29850 629 ## 31 16 Op 2 . - CDS 29894 - 30664 815 ## gi|293190832|ref|ZP_06608994.1| hypothetical protein HMPREF0970_01327 32 16 Op 3 . 
- CDS 30732 - 31376 745 ## gi|293190832|ref|ZP_06608994.1| hypothetical protein HMPREF0970_01327 33 17 Tu 1 . + CDS 31380 - 31550 133 ## 34 18 Op 1 . - CDS 31523 - 32737 1325 ## 35 18 Op 2 . - CDS 32730 - 33029 352 ## 36 18 Op 3 . - CDS 33019 - 33225 135 ## 37 19 Op 1 . + CDS 33079 - 33348 91 ## 38 19 Op 2 . + CDS 33442 - 34539 1325 ## COG0812 UDP-N-acetylmuramate dehydrogenase + Term 34651 - 34687 4.6 - Term 34861 - 34915 13.1 39 20 Op 1 . - CDS 35085 - 36170 1234 ## COG1816 Adenosine deaminase 40 20 Op 2 . - CDS 36175 - 36999 994 ## COG2267 Lysophospholipase 41 20 Op 3 . - CDS 36996 - 38273 1697 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 38400 - 38459 80.4 + TRNA 38383 - 38458 89.5 # Trp CCA 0 0 + Prom 38383 - 38442 79.7 42 21 Op 1 . + CDS 38485 - 38823 470 ## HMPREF0573_11146 preprotein translocase subunit SecE + Term 38834 - 38872 6.0 43 21 Op 2 45/0.000 + CDS 38901 - 39830 1215 ## COG0250 Transcription antiterminator 44 21 Op 3 55/0.000 + CDS 40021 - 40455 601 ## PROTEIN SUPPORTED gi|170782939|ref|YP_001711273.1| 50S ribosomal protein L11 45 21 Op 4 . + CDS 40525 - 41232 950 ## PROTEIN SUPPORTED gi|227497155|ref|ZP_03927403.1| ribosomal protein L1 46 22 Tu 1 . + CDS 41486 - 41749 202 ## + Term 41862 - 41912 2.4 47 23 Op 1 35/0.000 + CDS 42120 - 43901 258 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 48 23 Op 2 . + CDS 43909 - 45654 250 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 49 23 Op 3 . + CDS 45678 - 46349 826 ## Arch_1180 transcriptional regulator, TetR family + Term 46404 - 46443 4.0 50 24 Op 1 47/0.000 + CDS 46581 - 47102 577 ## PROTEIN SUPPORTED gi|227428348|ref|ZP_03911405.1| LSU ribosomal protein L10P 51 24 Op 2 . + CDS 47187 - 47576 436 ## PROTEIN SUPPORTED gi|172039978|ref|YP_001799692.1| 50S ribosomal protein L7/L12 52 25 Tu 1 . + CDS 47719 - 48513 955 ## COG0789 Predicted transcriptional regulators 53 26 Op 1 . 
+ CDS 48641 - 48733 59 ## + TRNA 48721 - 48792 77.1 # Asn GTT 0 0 54 26 Op 2 . + CDS 48793 - 49882 900 ## Predicted protein(s) >gi|319977463|gb|AEUH01000216.1| GENE 1 3 - 2180 2069 725 aa, chain + ## HITS:1 COG:MT3240 KEGG:ns NR:ns ## COG: MT3240 COG1005 # Protein_GI_number: 15842728 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 1 (chain H) # Organism: Mycobacterium tuberculosis CDC1551 # 299 691 4 399 410 401 54.0 1e-111 LPGLAPFGRPLAGAAATGLDWEGAPSERGLDADGMLAAAARGDLGALVVGGVDVRDFADP DAVRAAIAATPFVVSLEVRSSEVTRLADVVLPVAPPIEKNGTFINWEGRLRPFGQAVSAR SLTDRDVLVRLAEEFDADLGITALSDLYAEANPLMDWEGAREAFAPVGAPPVPGVGEGQA VLAAHKPMLDAGRLQDGAPWLGRGGAHPRRPCLGRHAGRRGSGGRAAAHAHHRPRLPHAP RRRRRPARPRRVGARVLQRFHRPRVPRRARRRRDAQRQSGGCPMSALAPAAAHYTAADFS EETWWLSLIKAVFIIVFLILNVLLALWVERRGLGRMQTRPGPNVAGPLGLFQAFADAVKL LFKEDMWTRRAERFLYFLAPAIAAFAAFSVFAVIPMGPNVSLFGHSSPLQLADMPVATLY ILAIASLGLYGIVLGGWSTRSTLPLYGAVRSSAQVISYELAMGLSLVSVFLMSGTMSTSQ IVAAQGQYWWVFTLFPAFVIYCVSATGEVNRLPFDLPEAEGELVAGHMTEYSSMKFGWYY LSEYVNMLNVSAVATTMFLGGWHAPWPLSQVEFLASGWMGMVWFFLKMWFFMFLLIWTRA TLLRFRYDQFMALGWKWLMPIALAWLVMVALVRGLTQFVTVSAPVLYGSVAVVFLIALVV IWVTDPGEEPAHDPADEEYTGFADGFPVPPLPGQSPVASPRAGRAAAAAAGGTSPASPHE EGSDE >gi|319977463|gb|AEUH01000216.1| GENE 2 2173 - 2832 774 219 aa, chain + ## HITS:1 COG:MT3241 KEGG:ns NR:ns ## COG: MT3241 COG1143 # Protein_GI_number: 15842729 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Mycobacterium tuberculosis CDC1551 # 18 187 24 193 211 220 63.0 1e-57 MSKKKPADDEQMLFEHDPKGPIGRFVAPMAGYGVTLSSFFRPTVTEQYPREEARVMPRFH GRHQLNRYADGLEKCVGCELCAWACPADAIFVEAASNTPDEQYSAGERYGRVYEINYLRC IFCGMCIEACPTRALTMTNEFELAEYTREDDIYVKDDLLVPLSEGMLSTPHPMVPGKSDG DYYRGEVDGPVPEQIDWVGSRRPDDPSLPGARALVGEDG >gi|319977463|gb|AEUH01000216.1| GENE 3 2832 - 3785 1319 317 aa, chain + ## HITS:1 COG:MT3242 KEGG:ns NR:ns ## COG: MT3242 COG0839 # Protein_GI_number: 15842730 # 
Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 6 (chain J) # Organism: Mycobacterium tuberculosis CDC1551 # 12 247 14 243 262 140 41.0 3e-33 MNLLQGWPAMGSVGEAVLFACTAVFMVACALGVLFFKKAAHSAVCMVGVMLGLAVLYIAN NAPFLGVAQIVVYTGAIMMLFLFVIMLIGIGESDDYALQGRGRIIAAVLGGLGLAVCVTA AVVHSTADARTAAPVDPYSNAPITDLAVMLFERHWLSMELAGGLLITAALGAMVLTHSDR LVPKADQAQTARSKMRAFAQTGRRIGQLPAPGVYARSNAADVPAVSGETLGPVEESVPRV LRVRGLERSIGEVDREVAQALALARSDGQGDSPFGAVGPADVGRSGSWGMPGPAAPTGLN QPKPRRSDGAATQEEAK >gi|319977463|gb|AEUH01000216.1| GENE 4 3782 - 4081 528 99 aa, chain + ## HITS:1 COG:MT3243 KEGG:ns NR:ns ## COG: MT3243 COG0713 # Protein_GI_number: 15842731 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) # Organism: Mycobacterium tuberculosis CDC1551 # 1 98 1 98 99 86 58.0 1e-17 MSLAWYLVLAAVLFAIGAVVVLVRRSAVISLMGVELMLNSANLVLVAFSRIGSNIDGQIM AFFVMVVAAAEVVVGLSIIVSIYRSRATTSVDDANLLKH >gi|319977463|gb|AEUH01000216.1| GENE 5 4096 - 6060 2909 654 aa, chain + ## HITS:1 COG:MT3244 KEGG:ns NR:ns ## COG: MT3244 COG1009 # Protein_GI_number: 15842732 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit # Organism: Mycobacterium tuberculosis CDC1551 # 23 635 12 613 633 532 54.0 1e-151 MTPFLLPAAAPASATGAASLAPLMILVPLVSAGLLLLLGRAADSWGHWLATAASWASFGI GAAVAAQMLGSDPSSRRSSQTLFTWIPAGDFTVDFGLLVDPLSLTFVILVTFVGSLIHVY AIAYMAHDGARRRFFAYLNLFIAAMLVLVLADSYAGLFVGWEGVGLASYLLIGFWNHVPA NAVAAKKAFIMNRVGDIGMLIAMMAMVANFSSVSYAEVSARVGGAGQAQATVIGLFLLVA ACGKSAQFPLQAWLGDAMAGPTPVSALIHAATMVTAGVYLVVRSGAVFTAAPAAAAAVVV VGAVTLLFGAVVGSAKDDMKKVLAASTMSQIGYMMLGAGLGPIGWAFAVFHLFTHGFFKA LMFLGAGSVMHGMGDQVNMRRFGALRGAMKVTWLTFMMGWLAILGVPPLSGFWSKDKIIE AAFSIHSFGGSEAPWAPWVFGTVALVGAGVTAFYMSRLFFMTFHGKARWTTEAEGSPVRP HESGPLMTIPMVVLAVGAVVIGAALSVGDFFTTWLEPAIGPVEHGEPVVGEYALQGATLV 
LVLVGAWIAWRKYGAAQVPASVPAGNALTEAARRDLYQDTVNEALFMAPSKGLVAVATVG DSAVIDGALTGAGRACQGVGKLVGATQNGFVRAYASYILAGAVAAVALVLAFRL >gi|319977463|gb|AEUH01000216.1| GENE 6 6088 - 7638 2226 516 aa, chain + ## HITS:1 COG:Rv3157 KEGG:ns NR:ns ## COG: Rv3157 COG1008 # Protein_GI_number: 15610293 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 4 (chain M) # Organism: Mycobacterium tuberculosis H37Rv # 7 493 5 517 553 332 40.0 8e-91 MEASQFPWLTAMVALPALGAALLGLVPALRRSAGRAVALAVSAAELVLALVVAATRFDWS APASYQVTDSFTWIPQLGISWSLSVSALSLVMVLLATALVPLVLVAGWDEEESADQGAYP ALVLALESFMVVIFTAWDLAVFYFAFEAMLVPLYLMIGRYGVGGEAERHRAAMKFLLYSL FGGLVMLGGVVALWATAPAPGSRELFFRFDTLAQVSQSLPQGAQALVFATFMVAFAIKAP MVPVHTWLPDTAAVARPGTSVLLVGVLDKIGTYGMIVLCLRMLPGPSASARWAMVALAVV SIIWGGLAANGQNDIMRLVSYTSVSHFGFMVLGVFIGSETALVGAMFYMVAHGVSIAAMF WLSGWLSRRGGTQDMREYAGMQRVAPVLAGLWLTAGLASIALPGLSGFVPEYLVLMGTWR VSAAAALFAVLGVVIAAMYVLMPYQRVFTGAPAKGKEGLSDLGGRERAVMAPVVAAMLVL GIWSAPLVSALTPIANDAALGSPAPTASASPEGSSK >gi|319977463|gb|AEUH01000216.1| GENE 7 7635 - 9203 2112 522 aa, chain + ## HITS:1 COG:MT3246 KEGG:ns NR:ns ## COG: MT3246 COG1007 # Protein_GI_number: 15842734 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 2 (chain N) # Organism: Mycobacterium tuberculosis CDC1551 # 11 521 5 530 531 343 46.0 4e-94 MILLPQAAFSAPDVRWAALVPVLIVLGTAVLSVLVEAFAPVRARRPIVIGLALFATGAAA AVLALRWTTVLAAPASLGEYIEDPLTVGAQFVLAIIGFLSVLVMADRTRVGDGSFAAQPS DRPGSAAEELSEAKGYQRSEVFPLALFSLGGMMVFPAADNFVALFVALEVMSLPLYVLAA TARRRRQLSHEAALKYFVLGAFSSAFMLMGAALLFGASNSLTISALAQAIGASVAMDRLV LIGVLCVMIGLLFKVGAAPFHAWAPDVYTGAPTPVTGFMAAAVKVAAFGAVLRFYQVVAG LLSWDLLPVFLAVAAATIVVGTFVGLVQSDVKRMLAFSSIAHAGFILIGVFSLVKGSSGH VLLYVLSYGLATVGAFGVVTLVRTRDQDGAVGGEANQLSRWAGLGRTNPWVAASMLVFLL SFAGIPLTGGFIGKFVVFSDGAAGGLAWLVAVALVASAVTAFYYFRLVRLMFFTEPEGDA VVVRSEGLTGIAVAVCAVATIALGVFPGPVLSQLSKIVILLP >gi|319977463|gb|AEUH01000216.1| GENE 8 9335 - 10213 1091 292 aa, chain + ## HITS:1 COG:Cgl0466 KEGG:ns NR:ns ## COG: 
Cgl0466 COG0142 # Protein_GI_number: 19551716 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Corynebacterium glutamicum # 1 292 67 350 350 197 45.0 2e-50 MSAAGGKRLRPLLTLVCAQLGRPELALGERVITGATAVELTHLASLYHDDVMDSAPTRRG VPSAQRLWGNNRAILAGDVLFARASALVAALGPEMVTRHAVTFERLCTGQLNETFGPAPG EDPVEFYLRVLADKTGSLVSSAAAFGALLSDAGAFVGDVVAEFGERVGVAFQIADDVLDL ASSGGQSGKTPGTDLREGVDTLPVLLLRRREAEGGLDDAGRRILEGLGSDLSSDEALGDV VSMLRVHDVLDETRSLARAWAAGAIDCLEALPKGEPKTALVAFAHLMVDRLA >gi|319977463|gb|AEUH01000216.1| GENE 9 10364 - 11875 2101 503 aa, chain + ## HITS:1 COG:Cgl2757 KEGG:ns NR:ns ## COG: Cgl2757 COG0493 # Protein_GI_number: 19554007 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Corynebacterium glutamicum # 3 466 4 451 457 461 53.0 1e-129 MAPLNVAVIGAGPAGIYASDILAKSGLEVSIDLFERLPAPYGLVRYGVAPDHPRIKQIIV ALYKILQRGDIRLLGNVEVGRDITIEDLRDHYDAIIIATGADRDAPLPIPGIGLPESYGA ADFVSWYDGNPDYPRTWPLEAKEVAVIGVGNVALDISRVLAKHAPDMLRTEVPANVAKAL AENPVTDVHVFGRRGPAQVKFTPLELRELGKVPDVDVIVSEEDFDFDEGSQRALRESNQQ RQVVKTLTNYAMADPEERTASRRIHIHLFQSPVEVVADGNGHVKALRTERTQLNGDGTVS GTGVITTWPVQAVYRAVGYYSSPIRGLPFDTRAGVVPNVEGRVIDSATTKDQSAPVIPGV YATGWIKRGPVGLIGSTKSDAQQTIAHLVEDAGEGRLHAATRAVGHEAMVALLEERGIEY TTWEGWELLDAYEQALGKAYGELPGGRGLRERVKVVSRRAMTDISRGRDVDPAGADLIGQ MGEMGVPTAPERFEDYTGPGRRH >gi|319977463|gb|AEUH01000216.1| GENE 10 12022 - 12903 985 293 aa, chain + ## HITS:1 COG:ML2278 KEGG:ns NR:ns ## COG: ML2278 COG0501 # Protein_GI_number: 15828221 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Mycobacterium leprae # 1 290 1 283 287 258 51.0 7e-69 MTHRSYHNGLKTVLLLGGMWALLLAVGAAIAAGTGNRLWIFVFAAIGLVQTGFSYWNSAS VALRSMDAYQVGEAEQPVLHAIVRELAERAGQPMPTIWVAPTRTPNAFATGRDPEHAAVC CTEGILEILGERELRGVLGHELMHVYNRDILTSSVAAAMAGIVMSIAQFLMFFGGGSRRD 
GEEGGLGFLGAIAIALLAPIAAALIQFSLSRTREYDADEDGAELTHDPLALASALRKLEA ATDRVPMAPTPANQNVAAMMIANPFRAGGLANLFSTHPPMEERIARLERMAGY >gi|319977463|gb|AEUH01000216.1| GENE 11 12924 - 13514 946 196 aa, chain - ## HITS:1 COG:MT0592 KEGG:ns NR:ns ## COG: MT0592 COG1666 # Protein_GI_number: 15839964 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mycobacterium tuberculosis CDC1551 # 30 196 1 163 163 177 61.0 2e-44 MRCRGRIAPKVPATPIGTGNGRIPMKGATMADSSFDIVSTLDRQEVDNAVNQTAKEIANR YDFRGVDASIALSADSITMTANSASRVLAVLDVLQSKLIRRGLSLKVLDYAGREPKASGK LFTLTCPLKEGIPQDTAKKIAKLIRDEGPKGVKPQIQGDALRVSSKSRDDLQAVIALLKG PDGDAFDVALQFVNYR >gi|319977463|gb|AEUH01000216.1| GENE 12 13931 - 14101 270 56 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227493374|ref|ZP_03923690.1| ribosomal protein L33 [Mobiluncus curtisii ATCC 43063] # 1 56 1 56 56 108 85 6e-23 MASKSQDVRPKITLACSECKERNYITKKNRRNTPDRLELAKFCPRCRKSTRHRETR >gi|319977463|gb|AEUH01000216.1| GENE 13 14170 - 15009 1146 279 aa, chain - ## HITS:1 COG:Cgl2598 KEGG:ns NR:ns ## COG: Cgl2598 COG2186 # Protein_GI_number: 19553848 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Corynebacterium glutamicum # 38 257 21 240 240 151 40.0 1e-36 MLSDKSTQAIEAAISSIRLQNDNAPAADTRSTATMDAIKAYILQHGLRAGDRLPTEASLC ADLGVSRSSVREALRKLEALDIVTVRQGSGSYVGTMSLQPLVETLVLRSALDEINGIQSL RSIIDTRRALDLGIAPGLLAAMKGKRDPRLWDLADAMRAKARAGRTSLAEDIAFHSALLE SLHNPLMSQLVSAMWLIYQALAPQLETASEEHLLASAEAHAAILRACESGDVNAYKEAIE DHYLPLLSLIGAAPAAHGPGPSPRRRGHSRGADEQAQGE >gi|319977463|gb|AEUH01000216.1| GENE 14 15073 - 16629 2135 518 aa, chain + ## HITS:1 COG:Cgl2599 KEGG:ns NR:ns ## COG: Cgl2599 COG0747 # Protein_GI_number: 19553849 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Corynebacterium glutamicum # 10 516 27 536 539 297 37.0 4e-80 MRIRKSLASGAAVAAAALVLTACGGGQTSSTPQTSGAGPTGEINVGVAYATTDYGPVTTS 
ALALGTNWHVLEGLYAFDMVDYSVRPALAKGDPVEVSATEYEVALRDGAKFSDGNAVTAN DVVASYERAVDPTSVYKQFFTFVDSVTAKDSATVTIKLKHPFSMLKDRFTNVRVAPASMD ADALKAQPVGSGPYKYESISPTEVTAVPNDQYNGSVPATAARIRWHVLKDDSARLSAAIG GTVDIMEAVPASAVHQLRASGWAVEAVPGYNNPFLMFNTTKAPFDKPEVRQAVLRAIDRQ KLVDGPMEGQAVVATSFLPESSPAHKKPATDLGHDPQAAKKQLADAGAVGTEITLTTTDH PWIANLVPQVKADLEALGLTVKVDPLADPYTSATDVDEPGYDIVMAPGDPSVFGMDPGII MSWWNGDNIWTQKRDSWAKTAPDAFRSFQGIVEETVQLEPDDPAALAKWGQAQDLLAEQA VIYPLFHRKMLTAYKPGKVADFSPIAATGLQLLGVSAK >gi|319977463|gb|AEUH01000216.1| GENE 15 16706 - 17623 1263 305 aa, chain + ## HITS:1 COG:BMEII0862 KEGG:ns NR:ns ## COG: BMEII0862 COG0329 # Protein_GI_number: 17989207 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Brucella melitensis # 3 303 19 321 322 263 47.0 3e-70 MKSQIRGVVPPLAIPLKDGALDTASLERHINRMIDAGVHGLFVSGSSGEVAFSTDARRDQ VLAEALRVVDGRVPVLAGVIDTETERVIEHIRRAEDSGATGVVATAPFYALGGEAEVERH FRLLAENTGLELWAYDIPVCVHTKLSPDLLLRLGRDGVLAGVKDSSGDDVSFRWLSLANE AAGHPLRLLTGHEVVVDGAYMSGADGSVPGLGNVDPHGYVRQWQAYERRDWEAVRAEQDR LAELMRIVQVKGVQGFGAGIGAFKTAMMLLGVFDTNEMPRPVAALEGDNVEWVASVLRGT GLLAG >gi|319977463|gb|AEUH01000216.1| GENE 16 17681 - 18685 267 334 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 27 326 3 314 319 107 25 1e-22 MEANGVAPSARGPWELEDEGARGPAGRQLLALDIGGTKVAWGLVRVRARRLSASQRGSVP TSAWEGGPEVARRVTELARRLVADNPGVDGVGVASAGVVDPASGAIVSATGTMPGWAGTP LGAALAEATGRPVAVLNDVHAHGLGEAVLGAGRGFGTVLSFAVGTGLGGALVHHGSVFQG DHHIAGHFGHVHHHFAPDMECSCGRSGHIEAFCSGSGIVRWYNSLRGGADPQARDGRGLQ ELADGGNALAATCFERSGYALGEAVASLSNCVDPGAVVLSGSMTKSGPRWWDALRLGYAA GAMTPLAGVPLLPGALGGDAPLLGAALEFLSREQ >gi|319977463|gb|AEUH01000216.1| GENE 17 18720 - 19406 852 228 aa, chain + ## HITS:1 COG:Cgl2596 KEGG:ns NR:ns ## COG: Cgl2596 COG3010 # Protein_GI_number: 19553846 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate 
epimerase # Organism: Corynebacterium glutamicum # 5 221 10 229 232 174 50.0 2e-43 MTHPLIASLRGRLIVSVQAYPGEPMRHPETMAQIARAAEIGGAAAIRCQGLSDISAIKGR VQVPVIGLWKEGHDGVYITPSLRHARACVMAGADVVAIDATDRPRADGRDYAQTVTELKR EGALVMADCGSLEDARRAVEAGSDIVSTTLAGYTGDRGKTEGPDLELLAGIVQEFPGVPV ICEGRVHTPGDAAAAIGAGAFAVIVGTGITHPTSITGWFKAAVEGARA >gi|319977463|gb|AEUH01000216.1| GENE 18 19694 - 20128 255 144 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDGVGGVDPDLEEQARVAASWGTGDLADACSSGLEEIGVRQTAVRWGFERAALRVNSAFA RLLGAVNSDLDECGRELERMISASQQVVSSYLEDEEAIAAAFRSMADAQGQSGAQAAGSA SPQDASASAGSGGGGYGAVGSGGD >gi|319977463|gb|AEUH01000216.1| GENE 19 20189 - 21580 567 463 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSGSTKHVSGFFDRLRLVADLDYSDAEQDQAYEAALARIDACVERVKAYGKGNGFHSQLV RDAIDERVGDLCARLGAARARLVDGYEAQRRARESMTRTKEAFETNVAQTLLTPAEMAWQ VLYGAQVAQVPVLGTLAHATMDTYFAALERERNVLREATCQVMLTVFNQELDGHAKAIQK VAERDSNIAPPSTTAPVRSASSGPQGTDPSGSGYGLGGVGGAVGGGPAEGGLGAPAVGGA WSSAAGGVLPGEEWASEGFARPGLPQDAPARVAVGDLEGVGLQGRPMNVEVTPNGLVGGY VPPFADRASDPRWDPAYRIPSEVVDASRRAARGAFASGPLGAGSAMASARGLISGTGVGG VLSASVGAAAGAARAGAAGSGGQPGFVPGVPFAPMAAGEDSRGRGARDRGRADEDEEEDE EQVVDGLDWEVEECHEPVWDPAHGPGSADDGVEFEIDWEEWER >gi|319977463|gb|AEUH01000216.1| GENE 20 21577 - 22719 562 380 aa, chain + ## HITS:1 COG:BH2080 KEGG:ns NR:ns ## COG: BH2080 COG1404 # Protein_GI_number: 15614643 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Bacillus halodurans # 66 269 452 658 1052 75 31.0 2e-13 MSAWRFPPGGSGVLRVGAVFAVVAAVVCGGRVPAWGVEEIMTQAYVGVLGVDNAVARGLT GKGVRIGVIDRPADTSLPELAGADVTVKQMCDFTTSDVHRSHGSAVVSILAARGYGLARD AQIINYAIPGEGDDSSNCGSGSSVGGAINAAVADGVRIISISLAGYLNDALREAVAGAVA RGVVVVVGTGNYGSKDPVVSYASMNGTVGVGASDLGGRISAYSNYGRGLTVLAPVEDFQY HDLGSGQVVTGPGTSFAAPIVAGMIAVAMQVWPGASGAQVAQSVVGTATVGPTGQALVNF PGLIDTDPSGFADESPVMDKFPGEEPSWETVADYRDGLLGTETVFDNDPSYVYRGVQENV ARGHADRSALGTSPRYHRRE >gi|319977463|gb|AEUH01000216.1| GENE 21 22829 - 23140 345 103 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MTDVYVEAGDLERLHSGVDAVADGLAQVRVGDTAGYLPAGMVGSGSASVVMGACDTIDGL VEGVVEALRDYSSNVGETIAQFVATEDANTVAFQNIANSSGVN >gi|319977463|gb|AEUH01000216.1| GENE 22 23140 - 24864 1343 574 aa, chain + ## HITS:1 COG:no KEGG:Srot_0053 NR:ns ## KEGG: Srot_0053 # Name: not_defined # Def: hypothetical protein # Organism: S.rotundus # Pathway: not_defined # 21 532 6 530 586 130 27.0 2e-28 MVAWSDVVLWRAAHIETVQDDASAARKAAREQVQELEAHYSRIASQGLAIDAMRQALSRV QDELDHAVNAISAYMLACAEAIDGVRVVEAKVKSALEISEEIGCTIYDDGSVSIPYVDLS DAVRYGKAAVLQGKYSDLQMCIADALSVARKVDEELRRKLQELAEDRFEQNGPKEGRHSA SPGLPDDADESWSPSEVSAWWGALTDAERQACIERDPLKYGNLDGIDMASRDKANRMALY GYDTGGNASIRLGSTGLLQQAQDKFDQAQAAYDDCVGAHGLFSDEAMALKAERDRAGRDL ADLQKIDERLRKGAGEDGTPLSLLALDTSGEQVKAAVAVGDVDHAQHVANFTPGMGTNVR ESLDNYVDVADRMRANAADQAKIDKSDVAVVAWLNYDAPTDITKTWDTDVAGTEKARAGA DRLAGFLTGVRSWRDEQGGSLHLTSVSHSYGSTTAGFAMRQMGEGVVDDHIYLGSPGSAA YTVGALGVDPSHVWVSAVPEGDMVQGMGPDVTFGRNPEQLEGIQHLSGDATGAQGYDPQP SWPVANHSSYFAPPKPGERNKAFDDVCRVVVGVK >gi|319977463|gb|AEUH01000216.1| GENE 23 25032 - 25595 345 187 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEPAMARFVRALAEKEGVIGVNGRRYFNACTDRGPGKEEGWAFGTERLFISSLTDRDVEA AASQYLSVLPLYKGTRGDVQKDGSFILRSGDAANGGELSVYYFPDDRSSMRYESGCRPSD GSMGDLNEYVPPTVEDAFPDLVVYPAFDEDTKKPNSPPPPRSAPGQSGQSGQPAQSGGSD DGSGEDQ >gi|319977463|gb|AEUH01000216.1| GENE 24 25690 - 25758 67 22 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGFRMLPTGSSCCVSARAGVLQ >gi|319977463|gb|AEUH01000216.1| GENE 25 25883 - 26500 292 205 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPQAETPFAQRESIEEYLRTVEPVMAAFVRALAEKGGGVIGVDRRRYFNACTDRGPGKEE GWAFGTELLHISSLTDEDIEAAASQYLSVLPLYKGTRGDVQKDGSFVLRSGDAANGGELQ VNYFPDDRSSMHYESRCRPSDGSMGDLNEYVLPTVEEAFPDMVVYPAYDEDSGAPNTPPP PRNASGHSGQSGQPAQSGDGSGEDQ >gi|319977463|gb|AEUH01000216.1| GENE 26 26632 - 27246 376 204 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPQAETPFAQRESIEEYLRTVEPAMARFVHALAEKEGVIGVSAQRYFNACTDLGPGKEEG WAFGSGILYFSSLTDKDIEDAASQYLSVLPSYKGGRGKTQKDGSREYRSDDGANGGELQV DYFPDDRSSMHYKSRCRPSDGSMGDLNEYVLPTVEEAFPDMVVYPAFDKDTKQPNSPPPA 
RGASDQSGQPAQSGGSGDGSGEDQ >gi|319977463|gb|AEUH01000216.1| GENE 27 27243 - 28274 1020 343 aa, chain + ## HITS:1 COG:no KEGG:Noca_3206 NR:ns ## KEGG: Noca_3206 # Name: not_defined # Def: putative transmembrane protein # Organism: Nocardioides_JS614 # Pathway: not_defined # 39 343 20 323 330 186 36.0 8e-46 MRVRRTWGVVLAALLLVGCAPVSPHEGAPTAPPTTGAASASGAAGTDDGQSARQGDATAD VMRGVTLDSVDDLGAAERAIDALPFTATVRLVTDPERGPDDYQEAITALSSRARLVVQLV DSTAMAGLGVEEARSRARAFVQRYAGQVEVWEVGNEVNGAWAGTGPQEINAKVLAMAEQV RAAGEPTAITVNLWSRPDCYEEEWEDEAAYLATVPAPLAGAIDYAFLSAYETACDPVQRP SAAEVGDALEALGAAFPKARLGIGEIGAQRGEDSDSGHIISEPTRDEKIAVARRWIGMNG QLAQRFGDRYVGGWFWWYFRQDVAEAPAGESIGGELADLLGAL >gi|319977463|gb|AEUH01000216.1| GENE 28 28469 - 29122 731 217 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190832|ref|ZP_06608994.1| ## NR: gi|293190832|ref|ZP_06608994.1| hypothetical protein HMPREF0970_01327 [Actinomyces odontolyticus F0309] # 57 211 85 248 263 87 36.0 4e-16 MHKLLHYEYLRTGYVFVTAMLLPLALLAGAFSYVARRKPRAVFTNHVEIPGTGIIHITLI TVAYICYAIVAGTMVYGFITGYPVHAFLWATGPATAGYFITQDRHLGLRAALHAKQSLLL SPEALTLTASPSQDEVRIPWTDNTRVGTSDILGTRLLPDPTTDKPPYTVIFDCPVTSLTQ LDALLDHFNTHPADRPLLALPEGADLVQTLLDTTPRP >gi|319977463|gb|AEUH01000216.1| GENE 29 29121 - 29255 111 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPTSTPTPNMQAAVILASSGRIRRTLSSTSAGLFASGRRVGRFL >gi|319977463|gb|AEUH01000216.1| GENE 30 29260 - 29850 629 196 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTRYRKRPGFRIPSLTPIEADKGTLRSGPLFAGMAAAEVGRQGKGASHVVRPVMDEAVKN ALEARADFRAGRTRAQGVGTAVAGIGYATLAYDIATSDTPGQTALAQGAGIAVGAAVGAA AGAAVTTAVSEAPFAWAGAAAVGGGVVAGVAIGAAVGATVAAGVGATYANAVPLEWRERI DAFFWHLPYHLGLANW >gi|319977463|gb|AEUH01000216.1| GENE 31 29894 - 30664 815 256 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190832|ref|ZP_06608994.1| ## NR: gi|293190832|ref|ZP_06608994.1| hypothetical protein HMPREF0970_01327 [Actinomyces odontolyticus F0309] # 86 250 76 248 263 79 33.0 3e-13 MRPPEAKSPAEADRRTRRAGLVLAGFVLAMTLGVALLVGMHRVLHYEYLRTGYVFVAAMI 
IPLIPLALVFTYVARRRPRAVFTTQVEIPGTNAARFVTFFMVCAYYAIAAGTMVYGFITG YPVHAFLWAVGPASAGYLLANDRHTSLHALLMPKQSFTLSPHALTITASPLQGEVRIPWT DNARVGTTDPTGTEIHPDPSVHKLTYTVKFDCPATSFTQLDRLLDHFNTHPADRQLLAQP EGADLVQTLLDTTPRP >gi|319977463|gb|AEUH01000216.1| GENE 32 30732 - 31376 745 214 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190832|ref|ZP_06608994.1| ## NR: gi|293190832|ref|ZP_06608994.1| hypothetical protein HMPREF0970_01327 [Actinomyces odontolyticus F0309] # 64 208 95 248 263 65 30.0 2e-09 MLHYEYLRTGYVFIAAFFTPLIPMAVVLTYVPRRRPRAVFTTQVEIPGTNAVRFVTFFMV CAYYAIAAGTMVYGFITGYPVHAFLVATGPATVGYVTAEAGCLGPRTVRMPRQSLALSPD ALTITASPLQGEVRIPWTDNARVGMTDLTGTDIHPDPSIPKLTYTVKFDCPITSLTQLDA LLDHFNTHPADRQLLAQPEGADLVQTLLDTTPQP >gi|319977463|gb|AEUH01000216.1| GENE 33 31380 - 31550 133 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHTGQQRHAEHPAEDNPASDRAQKARLSIRFRRIFRLGQGRLTLPVAHYQLASPRW >gi|319977463|gb|AEUH01000216.1| GENE 34 31523 - 32737 1325 404 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MADTGGIGTSLEVQTTSIDTRIPCNPDSVRDVAAWFYEAYSKVSDAVAEMTGNLLRNPEY WMGDAANAYEEMLGKLRLSAGKLADRHYRAYEVLRAYAQQLDYHYRDMETIRHDAEMLGL EIRDDYDIMCPPPLPPEPQEPARNATHAEWWTYDQDFLIWKGKKEEVLGYEELHRRVERV WSDLEEWVKKYLRPLQVESFTLLFAKYLDQEVQGVVDRPWQLALDAVDQGFARKAQMLEL QMDKAAAEVGRQGKGASHVVRPVMDEAVKNALEARADFRAGRTRAQGVGTAVAGIGYATL AYDIATSDTPGQTALAQGAGMAAGTAVGSAVGSAVTGVAAGSGAPAAAVVGLGTATGVGV ALIAAWAVTASVGAIYEKTVPLEWRERIDETFWHLPYHLGLANW >gi|319977463|gb|AEUH01000216.1| GENE 35 32730 - 33029 352 99 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTDLRITFDLLGNMATKYGYIAEDAETCGRAAPTELDGGIAQATILDLIRSQNEDLGLF ASRARRIEDNLRALIVAQLETEENVAGAFETLKQELQHG >gi|319977463|gb|AEUH01000216.1| GENE 36 33019 - 33225 135 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MANVEDGTRVLLAHTSLDRLLDCLGEDQGWALIMADQLPRIKEDLRFDHLSVDHYVEPEA RMGGRDEH >gi|319977463|gb|AEUH01000216.1| GENE 37 33079 - 33348 91 89 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVETQVLLDAGQLVRHNERPPLVFAQAVEQAVQGRVRQQYARAVLHIRHNDLCLPRGVLP HRHLQGWRGHIRMGITRHAVHRLNLPLTD >gi|319977463|gb|AEUH01000216.1| GENE 38 
33442 - 34539 1325 365 aa, chain + ## HITS:1 COG:Cgl0393 KEGG:ns NR:ns ## COG: Cgl0393 COG0812 # Protein_GI_number: 19551643 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Corynebacterium glutamicum # 5 364 24 365 368 258 45.0 1e-68 MSTRLSSLTTLRVGGPVGSYTVADSREDLINRVRAADAGGGPLLVIGGGSNLLAADAPFN GVVVRDGRHETRVLGEGGCAGTAAEGGAPGGVLVRASAGTPWDAFVSWTLDQGLSGLEAL SGIPGTVGASPVQNVGAYGHEVAETLDHVLVWDRQEEREARLSAQDLGFGYRTSVIKRSL SADWGPTGRWVVLDAVFRLERSALSAPVLYGELARRIGARAGERAQARLVREAVLALRAG KGMVLDDADHDTWSAGSFFTNPILSADEADALPAGAPRFPAGDGRVKTSAAWLIDHAGFT KGFALPEAGDPPRASLSTKHVLALTNRGGAAASDIEALARAVRAGVRRAYGVDLVPEPVA VGIAW >gi|319977463|gb|AEUH01000216.1| GENE 39 35085 - 36170 1234 361 aa, chain - ## HITS:1 COG:AGc218 KEGG:ns NR:ns ## COG: AGc218 COG1816 # Protein_GI_number: 15887491 # Func_class: F Nucleotide transport and metabolism # Function: Adenosine deaminase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 25 350 5 322 325 166 34.0 5e-41 MDVRAVPDLTAPVPTPPGRRDLAALPKAHLHLHFTGAMRPSTLVDIAREQQVRLPPHLLY IDPMNMPADGRGWFRFQRAYDSARHLVRSEATMRRLVRETMEDEAAEGSVRVELQVDPTS YAPWVGGITPALEIVMDEARAASADTGVDVGLIVAASRIRHPLDARVLARLASQYAGDGP GQVVGFGLSNDERVGATADFAPAFRIARRAGLVGVPHGGELAGPESIREVVAALCPRRIG HGVRTAEDPGLLDRLIAEGVAFEVCPTSNVHLGVYTDFSQVPLPSLISAGATVALSADDP LLFRSRLVEQYAQARDAFGLSDRELAGLARQSIEASLAPSSSKLRWFARIEEWLAAPPAA A >gi|319977463|gb|AEUH01000216.1| GENE 40 36175 - 36999 994 274 aa, chain - ## HITS:1 COG:DR1537 KEGG:ns NR:ns ## COG: DR1537 COG2267 # Protein_GI_number: 15806547 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Deinococcus radiodurans # 17 274 28 279 282 139 39.0 7e-33 MSADARRLVRRLADTGSPRGTVLIAHGYGEHSGRYLPLQEALVGAGYDIAFYDHTGHGTS GGPRGRVDAGALIRDHLAMRRLALAGARTPDLFLFGHSMGGVVTAASTLIDPERLRGTVL SAPAMRPLPPASASLARKAAPLARLLPSLVVRPPEPAGGESPLSRDPRVQQAFDADPLCY HGGVQLLTGVTMVIQGDEVLRHAHLARTPILVMHGSADRMADLAASRDFVAEAEAANPGL DIRLRVIDGAYHELLNEPEGPGLIRDIIAWLGEH 
>gi|319977463|gb|AEUH01000216.1| GENE 41 36996 - 38273 1697 425 aa, chain - ## HITS:1 COG:aq_1969 KEGG:ns NR:ns ## COG: aq_1969 COG0436 # Protein_GI_number: 15606968 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Aquifex aeolicus # 33 424 5 391 394 327 42.0 3e-89 MRCSSPVHSAHCPLPRAASWHDGEMTPTPASRVSARFTAITPSATLAVDAKAKALKAAGR PVIGFGAGEPDFPTPDYIVEAAVRAARDPAMHRYTPAAGLPALREAIAAKTLRDSGYEVS PADVVVTNGGKQAVFQAFAALLGPGDEAILPTPYWTTYPEVVKLAGATPIEVFAGADQDY KVTVDQLEAARTERTKVLLMCSPSNPTGSVYTPAELTAIGQWALEHGIWVISDEIYEHLL YEDAQTAHIVALVPELADQAVILNGVAKTYAMTGWRVGWMMGPADVVAAATSFQSHLTSN VNNIAQCAAQAAVSGPLGAVHEMRAAFDRRRRTIIDMLRQIDGLDVPTPKGAFYAYVGVE ALLGRPIRGRTATTSSELADLILDEVEVAAVPGEAFGRSGFLRFSYALADDDLVEGIGRV QELLS >gi|319977463|gb|AEUH01000216.1| GENE 42 38485 - 38823 470 112 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0573_11146 NR:ns ## KEGG: HMPREF0573_11146 # Name: secE # Def: preprotein translocase subunit SecE # Organism: M.curtisii # Pathway: Protein export [PATH:mcu03060]; Bacterial secretion system [PATH:mcu03070] # 38 111 12 85 86 76 51.0 3e-13 MAKHSAPSAESSRKDRGKAREGKGAARKDRTKAGATRKRDEAVAVKEKKRGLFARMWLFL TQVVAEMRKVTYPTRSETWTYFVVVVVFVTAIMAFTGLLDFGFGKLSALIFG >gi|319977463|gb|AEUH01000216.1| GENE 43 38901 - 39830 1215 309 aa, chain + ## HITS:1 COG:ML1906 KEGG:ns NR:ns ## COG: ML1906 COG0250 # Protein_GI_number: 15828020 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Mycobacterium leprae # 97 307 10 228 228 179 46.0 8e-45 MSDENAPGPDAFDGPIADEGFAPAPIDIEAAQARALDESAEAEADGAAQEVVDRAIDEAA QDVADGEAAPAPEDGADGEAAPAPEDGADGGAAPASGAEGAEPDPAPEDDPVEKLRQRLY MSPGDWYVIHTYSGHERKVKANLEQRITTQNMEDSIFSVEVPDEYVMEYRGTAKKRVRRV RIPGYAIVCMDFNEESYRVVKETPAVTGFVGDQHNPVPLSIDEVVMLLTPNVLEEAAEKK KDKPAPVQEIRTAFEVGETVTVIDGPFETMSATISEIMPEAQKLKVLVTIFERETPLELG FDQVEKLEQ >gi|319977463|gb|AEUH01000216.1| GENE 44 40021 - 40455 601 144 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|170782939|ref|YP_001711273.1| 50S ribosomal 
protein L11 [Clavibacter michiganensis subsp. sepedonicus] # 1 143 1 142 143 236 81 2e-61 MAPKKKKVTGLIKLQIAAGAATPAPPVGPALGQHGVNIVEFTKAYNAATESQRGNIVPVE ITVYEDRSFTFVLKTPPAAEMIKKAAGVQKGSGTPNTAKVGSITMDQAREIGQAKMADLN ANDVEAAARIIAGTARSMGITVED >gi|319977463|gb|AEUH01000216.1| GENE 45 40525 - 41232 950 235 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227497155|ref|ZP_03927403.1| ribosomal protein L1 [Actinomyces urogenitalis DSM 15434] # 1 228 1 228 228 370 79 1e-102 MTKRSKSYRAAAEKVHADVLYTPFEAMKLAKETTVTKFDSTVEVVLRLGVDPRQADQMVR GTVSLPHGTGKTARVLVFAVGPRAQAAIDAGADEVGGDELIEKVSKGYTDFDVAVATPDL MGKVGRLGRILGPRGLMPNPKTGTVTMDVAKAVKEIKGGRIEFRVDKHANLAFVVGKASF TAEQLTENYGSVLDEVLRLKPSSSKGRYLLKGSVSTTMGPGIPLDVTKVKDLLEG >gi|319977463|gb|AEUH01000216.1| GENE 46 41486 - 41749 202 87 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGQAVCLPRCGLACPIAWLRAPVLLQVGSLCIGAGNGALGQAVCLPRCGLACPIACFRAP VRLAGVRFHSDTTFCKRYTACKVTVSL >gi|319977463|gb|AEUH01000216.1| GENE 47 42120 - 43901 258 593 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 370 575 146 351 398 103 37 1e-21 MTSDQHPAAPADPGAKYREGQAAFKRLTRPIAVPMAISRVLGVVYGVLAVVPYVILVHLG GVLLEAWDRGEGPDAARIGAILAWLIGAFCARLGIYYVALLVTHIADIRFGHYVREQIVR TMSRAPLAWFSATASGRIRKAVQDDTHEIHTVIAHAPVESTAAVASPLSLVVYAFVIDWR LGLLSIATLPVYALITWWMTKDMGPKTAEMDQHLAEVSATMVEFVTGITVVKAFGRVGDT HGRFRRAAETFSSFYVAWCQPLLTGSALATSVVSPAVLLLVNLGGGALLVGAGCVTPVQV IACSLIALVIPQAIEVLTSTAWAYQLAGAAALRLVEILDIPSLEDTGKRVPDGHDIVLDH VSFSYGDLLALDDVSLTMREGTVTALVGPSGSGKSTLATLIARFADPDSGSVALGGVDLR DIPVKELYRHVAFVLQDPQLLDIPIRDNIALGRPGATDEEIMRCAAHAGIADFIESLPAG LDTVVGADTDLSGGQAQRVSIARALLMDAPVLVLDEATAFTDPESEAEIQDALSELVKGR TVVVIAHRAAPVAGADQIVVLERGRITGEGTSAELADHPYYRALAGTRTEGRK >gi|319977463|gb|AEUH01000216.1| GENE 48 43909 - 45654 250 581 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 324 559 272 508 563 100 32 1e-20 MGSFVANLKETLSEKGRESFRWVVVLSVVVGLIRGGALIAFIPASSALVSGRTVWGLTTG GWVGVVGASAAAAFVIEYALSLKNYSVAFDFLRFLHIAIGRKVAAMPLGAFTPDTAGKTS RLVSRELMLLGEIFAHMFSPALIDVVTSLTMLVGICVWNPPLGLACLLAIPFMWAAVWAS RRALAHEARLTEPAANELAGRLVEYAQKQGALRACGRGDSFDALTAAEDAYQRRATKGLW WATAGQVVNGMAAQVVVVSVVYAIGAVAAGGSLSPLEAVVAIGVLLRFMQILVNIGALVS SFETRRPVLELAHEVLSTPELPEPAESAPVTVPGEVVVDSVDFGYEAGAGVLHDVSFRAA PGTMTAIVGPSGCGKTTIARLVCRFYDVDSGAVRVGGNDVRDYSTEDLMEQLSMVFQDVY LFDDTLIANIRVGREDATDEQVMAAARMAGAEEIADRLPLGWSTPVGEGGRALSGGERQR VSVARALLKGSPIVLFDEATSALDPENESHIVEAMEELRRSSTLIVIAHKLDTIASADQI VVLDENGRVVQIGAHQELYDAGGPYRRFWRARERAKGWRLV >gi|319977463|gb|AEUH01000216.1| GENE 49 45678 - 46349 826 223 aa, chain + ## HITS:1 COG:no KEGG:Arch_1180 NR:ns ## KEGG: Arch_1180 # Name: not_defined # Def: transcriptional regulator, TetR family # Organism: A.haemolyticum # Pathway: not_defined # 1 148 6 155 223 114 38.0 2e-24 MGRRTGPKPFFNADDVVDAAYAGGLHVMTLAGVARSLGVAPQTLYRVYPSRDAVVVACLE RAAATLSAPDPALPWSDQLRDWAESAWRVCEEFPGLDVTIHSFPYPHIAVLPVIAALRGG LADAGFPGDADLAIDMVGDIAVLAHMGVTTRASAAPWRRRAAARRVEEATGSLDAGFNMN AGWPLLRAKAAAKQWMERKVEVVIAGFAAGPLPHDNNPAPRNA >gi|319977463|gb|AEUH01000216.1| GENE 50 46581 - 47102 577 173 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227428348|ref|ZP_03911405.1| LSU ribosomal protein L10P [Xylanimonas cellulosilytica DSM 15894] # 1 173 1 173 173 226 67 2e-58 MAKSDKVAAVAELVERFRAADAVLLTEYRGLTVGQLKQLRRGLGENATYAVAKNTLARLA AKEVGLDFLAEDLKGPTAIAFVTGEPVEAAKTLRDFAKDNPSLVLKSGAMEGAQLTAEAV KKLADLESREVLLAKAAGALKAKIAQAAYAFNALPIKAVRTIDALREKQGEAA >gi|319977463|gb|AEUH01000216.1| GENE 51 47187 - 47576 436 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|172039978|ref|YP_001799692.1| 50S ribosomal protein L7/L12 [Corynebacterium urealyticum DSM 7109] # 1 129 1 131 131 172 70 3e-42 MPLMAKLSNDELIEAFKEMSLIELSDFVKLFEETFDVEAAAPVAAVAAAPAADAPAVEEK DEFDVVLESFGDKKIAVVKVVKNLTGLGLKEAKDLVESAPSTLFENAKKEDAEKAKAEIE EAGGKVTLK >gi|319977463|gb|AEUH01000216.1| GENE 52 47719 - 48513 955 264 aa, chain + ## 
HITS:1 COG:Cgl2557 KEGG:ns NR:ns ## COG: Cgl2557 COG0789 # Protein_GI_number: 19553807 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Corynebacterium glutamicum # 9 264 5 251 251 158 38.0 1e-38 MSDDNTFYTVGEVAQRFALTVRTLHHWEAQGLLAPVERSWSNYRLYSPEDCARVQRIVIY RATGMGLMDIKALLESGESGASHLRRQRESLVAQRQQTDKMIEAIDLLLEDAMNGNALTV EEIGEILGDADFAAHQARAEERYGGTDDWKEWHRRTASWREGDWQANVERVQQIESDMID AIRDGVATDSDRAAELVGAHREALSEYFPVSPAKHYLISRAYLCDEGFRSHYDSQQEGFA QWLATAIEHVARNRGVDTDNPEWK >gi|319977463|gb|AEUH01000216.1| GENE 53 48641 - 48733 59 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTRGIRGSLSPSCQSPGRSATILFCRNLAS >gi|319977463|gb|AEUH01000216.1| GENE 54 48793 - 49882 900 363 aa, chain + ## HITS:0 COG:no KEGG:no NR:no VNPGPGRGFRYTGRGGAVWVGVVLRVVRLGLFVTALVCVAVVLWAEWKDIQERRERAQRR TDSINESRFKGAPKKAVRRELDAHMAGWREPDGARPRGAHLVAPGAHRLPTRIPVGRIGR FLAGDWVRPTSPPSLLDTDPSACAEAAGTDEAGAPGLRLDVWTVHPKTVKSPRLMSFTFG TRACMVELRQFATPGYATADRSEVLGYARAFMRLRSDGADPGGPTRMPTALFGPPGAPLG PTSKYMRVIGESLTQEGPFATAVHEGTWHAIDIGFHRAESADWRLVGRLRLLDAGGLVLA CPSPDGTCAMAPVDPEDLFWALEGAIGELDSDGRAMPSSAGGAGAPAGTAGAGAPAAVRA LLE Prediction of potential genes in microbial genomes Time: Thu May 12 18:52:19 2011 Seq name: gi|319977451|gb|AEUH01000217.1| Actinomyces sp. oral taxon 178 str. F0338 contig00217, whole genome shotgun sequence Length of sequence - 10519 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 30 - 404 181 ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase 2 2 Op 1 21/0.000 + CDS 573 - 2090 195 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 3 2 Op 2 16/0.000 + CDS 2087 - 3091 1584 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 4 2 Op 3 . 
+ CDS 3183 - 4244 1718 ## COG1879 ABC-type sugar transport system, periplasmic component 5 3 Op 1 4/0.000 + CDS 4421 - 6184 2716 ## COG2407 L-fucose isomerase and related proteins 6 3 Op 2 . + CDS 6391 - 7797 1767 ## COG1070 Sugar (pentulose and hexulose) kinases 7 3 Op 3 . + CDS 7772 - 8767 1224 ## COG1609 Transcriptional regulators 8 3 Op 4 . + CDS 8779 - 9210 665 ## Rcas_2834 RbsD or FucU transport + Term 9269 - 9331 5.0 + Prom 9388 - 9447 2.3 9 4 Op 1 . + CDS 9494 - 10240 1036 ## COG1349 Transcriptional regulators of sugar metabolism 10 4 Op 2 . + CDS 10251 - 10518 356 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes Predicted protein(s) >gi|319977451|gb|AEUH01000217.1| GENE 1 30 - 404 181 124 aa, chain + ## HITS:1 COG:Cgl0149 KEGG:ns NR:ns ## COG: Cgl0149 COG3695 # Protein_GI_number: 19551399 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Corynebacterium glutamicum # 3 78 7 80 114 57 43.0 6e-09 MDDERVEAVLRLVEAIPEGRATTYGRIAAAFGTGPRVVGRIMRDWGGSVPWWRVVNVHGT FPTSVRGEGMEHWEREGMPVDAERGRLLLEACSIEEDWLVATAARILSDLRKHSDSGEGS RRGR >gi|319977451|gb|AEUH01000217.1| GENE 2 573 - 2090 195 505 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 258 476 10 222 312 79 27 8e-15 MTNLLELKGISKTFPGVKALSDVDLDLRPGEVLGLCGENGAGKSTLMKVLTGIHKSDPGG EIWLQGEQVDIQSPEHARDLGLSIIHQELNIVPDLTVAQNLYLGRHESSKWMYVDDRKLV RDARELFERLNMDIDPTARCRDLPVARLQMVEIARALSFDSKILVMDEPTAPLTTTETEA LFSLVRDFITPETGLIFITHRMPELTELTDRISVLRDGKYIGTVETAATPMSEVVKMMVG REVPADARPTTKPISDEPVLEVEHLSTAKVVHDVSFEVKKGEIFGFAGLVGAGRTEVARA LFGADPHTEGVIRIHGEEVSIKSATDAVEFGIGYLSEDRKQYGLLLDKDISFNTSLATMD KFTKAAIVNARKIRAVAQEYVKKLRTRTPSVDVDVRSLSGGNQQKVVVAKWLERDAEILI FDEPTRGIDVGAKDEIYTLLENLASQGKAIIVISSELPEVLRLANRIAVMAHGRIIGVLD NEEATQENIMELATVGQEQANGEVA >gi|319977451|gb|AEUH01000217.1| GENE 3 2087 - 3091 1584 
334 aa, chain + ## HITS:1 COG:mlr3338 KEGG:ns NR:ns ## COG: mlr3338 COG1172 # Protein_GI_number: 13472896 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Mesorhizobium loti # 43 328 37 318 322 164 36.0 2e-40 MSTADDNRGLEAPSPFTTFLKRNMQLLVVTAALVVILLFFSMAAPNFAKPTIYLDIVLQA AFTGVMALGATFVIATSGIDLSVGTGLSFVAVMAGIFLAGDKMNLPLGLGLLLTILVGAA VGLVNGFNVSILGLPPFIATLAMMMVARGLALVISQTQSITIKNAAYKKLATGELIPYVA NAALIFVVLTVVATFLMNKTLLGRYALAIGSNEEATRLSGVNVRLWKIIIYVVAGVFMAI GAILYSARSGLVQPAEGVGMELNVIAAAVIGGTSLSGGRASIPGALVGALIMETLKKGLI MMSIAQDYQYVVTGIAILLAVAVDNIRRARENAA >gi|319977451|gb|AEUH01000217.1| GENE 4 3183 - 4244 1718 353 aa, chain + ## HITS:1 COG:AGc5109 KEGG:ns NR:ns ## COG: AGc5109 COG1879 # Protein_GI_number: 15890064 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 68 352 70 357 357 111 31.0 2e-24 MRPIGRIVAVAAVGTMALGLTACSRSEDTSSTSGSSGSNAPAAKVECASDSGEEWEKSIK EGDGSQTIYLVSKGFQHRFWQAVKEGAEEAGKACGYKIQFVGPQNEKAVTEQTDQLNTAF SSKPAAIGFAALDSAAAAETLDKIKAEGIPVVAFDSGVDSDIPVTTVQTDNYAAAGEAAK HMIEILGDKQGSVGMVCHDQTSTTGKQRCAGFKDYFEKNAPSNLTLVQEQYAGEVGLAAD TAKGIIQANSDIVGMYGSNEAAASGVVQGVSESGKSDVTIVGFDSGKPQQDAIKDGSEAG AITQSPKRMGELTVKAAIAAINKGELPNVIDSGFAWYTKDNIADSSIAPNLYE >gi|319977451|gb|AEUH01000217.1| GENE 5 4421 - 6184 2716 587 aa, chain + ## HITS:1 COG:SP2158 KEGG:ns NR:ns ## COG: SP2158 COG2407 # Protein_GI_number: 15901968 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 1 587 1 588 588 853 69.0 0 MTQHPRIGIRPTIDGRRKGVRESLEDQTMGMARAVAGLFTENLRYPDGEPVEVVIADTTI GRVHEAQACAAKFKENNVGLTVTVTPCWCYGTETTDMDPAMPHAIWGFNGTERPGAVYLA AALAGHAQIGIPAFGIYGEHVQDADDSSIPEDVRTRLLDYATAGLAVAQMKGEAYLSMGS VSMGIAGSVVNPEFFGAYLGMRNEYIDMSEFTRRIDEGIYDPDEYERAYTWIRENFKQGK 
DWNPPEWQYPDKHEDWWKFVTKMTLIARDLMHGNPRLAELGFEEEAGGHGAIAAGFQGQR QWTDHFPNGDVLETILNTNFDWTGIRQPSVVATENDSLNGASMLFGHLLTNTPQIFSDVR TYWSPEAIEKATGWKPEGLAAAGLLDLRNSGSTTLDGAGRATRDGEPVIKPWYELTEEDR EATLAATTFHPASTGYFRGGGFSTHFRTSGGMPMTMCRINLVRGLGPVMQIAEGYSVELP DEVAFTIEERTNIEWPTTWFVPNLTGEGAFKSVYDVMNAWGANHGAISYGHIGGQLITLA SMLRIPVNMHNVPEERVFRPKAWSLFGTESPEGADFRACQNFGPMYR >gi|319977451|gb|AEUH01000217.1| GENE 6 6391 - 7797 1767 468 aa, chain + ## HITS:1 COG:rhaB KEGG:ns NR:ns ## COG: rhaB COG1070 # Protein_GI_number: 16131744 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 5 465 7 465 489 272 34.0 9e-73 MTTVLAVDLGSASGRVLAGTFADGRVEVADVHRFKHQALRDGARLTWDVETMWEQTVIGL RKAVAAHPDAVSVSIDTWGVDFAPLDADDELVGPVRAYRDERTSRTLGEFRSRLGDREFF DLTGLQPATINTANQLVALLVEEPELAARVASVLYLPDYFAWRLTGVKGWSRTIASTSGL AEPGAGRFSDEVFERLGIPRRWVGGISADRTVVGQCSVPGLEGLTVVRGGAHDSACAVHG LPIDEGKRAYFLSCGSWSVLGAIEDAPLMSDAAFDLGITNEARTDGGVRPLFNITGMWIL QEIQRQWEREGTPTDTDELVARARQCPPAAAYFNPDDPRFAEPGDMQRKIDEALAAQGAP LPQSMPEYVRVIIESFAHRYARAVGELTEATGRAPDQLNLVGGGARNRLLCDLTASLAGI TVVAGPIEASTFGSLLAQLEALGELDPADRPSVIAASASTRVHVPQAR >gi|319977451|gb|AEUH01000217.1| GENE 7 7772 - 8767 1224 331 aa, chain + ## HITS:1 COG:TM1200 KEGG:ns NR:ns ## COG: TM1200 COG1609 # Protein_GI_number: 15643956 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 9 328 4 332 333 148 31.0 1e-35 MSMSRKPAKKVSLADIAAMAGVSTNTVSRVVRGDPEVADKTRKRIAALIDEMGYVPNYAA RALASNRTGVLHVLLAAPMFHGHGRVLLAVLNASSQAGYHVSLSYAYSPDGRLRSDIAPF DVDGVVILGGQDPTVEMAVAVGQRVPTVLVLTSEKGLDGVSTVAIDNVRGARIATEHLLD QGLIDVVHISGPEGWSDAVMRRIGYQRACAEAGVRPRVLVPDSWDSRDGYDAMRALNRLP QGIQTANDQLALGAMRFIHEKGGTVPGDVRVVGFDDIDGADSYAPPLTTIHQPFDRLGRT AVRLVRSMIDGGRPQDITIDPELVIRASSIL >gi|319977451|gb|AEUH01000217.1| GENE 8 8779 - 9210 665 143 aa, chain + ## HITS:1 COG:no KEGG:Rcas_2834 NR:ns ## KEGG: Rcas_2834 # Name: not_defined # Def: RbsD or FucU transport # Organism: R.castenholzii # 
Pathway: not_defined # 1 141 1 138 139 136 52.0 2e-31 MLYGPMTHPQFLRALATAGHGSKILLADANYPHKTGVGPACELVSLNYAPGMLDVIQVLR VLKRTIPIESVEIMVPDPAAEPVGIPIHDEFKEELPDVEFSSLTRFDFYDAARSEDVGIV VATGEQRLYGNLLITVGVRQPDE >gi|319977451|gb|AEUH01000217.1| GENE 9 9494 - 10240 1036 248 aa, chain + ## HITS:1 COG:SPy2054 KEGG:ns NR:ns ## COG: SPy2054 COG1349 # Protein_GI_number: 15675824 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Streptococcus pyogenes M1 GAS # 1 240 1 238 248 128 33.0 1e-29 MDRAARTGYILDRLAAEGEVSVAALAADLGVSPVTVRTTLRALEVEGYLVRTHGGARPTA FRNIHLRQLDRVEQKERIARAAADMIRDDDRIMIEAGTSCALIVKFLTRKRGVQILTNSV LVFANARSNPNLNITLTGGQFRAESESLVGPVAERSINDFNARIAFLGTDGFSVDRGLTT QLIEGGQVGAVMRTRAEETWLLADSSKYGCAGFVSFMGIDEVTGIITDDEIPAQSTKELS ERTKLRIV >gi|319977451|gb|AEUH01000217.1| GENE 10 10251 - 10518 356 89 aa, chain + ## HITS:1 COG:VNG2219G KEGG:ns NR:ns ## COG: VNG2219G COG0508 # Protein_GI_number: 15791042 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Halobacterium sp. NRC-1 # 1 81 1 81 478 63 43.0 6e-11 MATIVVMPQLGNSVESCIIVEWTVAEGDAVSLDQTLCSIETDKSTMEVPSTAEGTVLKLL WDEGDEVPVKDPLIIVGAPGEDVSGLVPG Prediction of potential genes in microbial genomes Time: Thu May 12 18:52:26 2011 Seq name: gi|319977437|gb|AEUH01000218.1| Actinomyces sp. oral taxon 178 str. F0338 contig00218, whole genome shotgun sequence Length of sequence - 14347 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 6, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 30/0.000 + CDS 2 - 778 1198 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes 2 1 Op 2 . 
+ CDS 795 - 2162 671 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 3 1 Op 3 . + CDS 2197 - 4650 4004 ## COG0022 Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit 4 2 Op 1 . + CDS 4808 - 6112 1890 ## Csac_0867 hypothetical protein 5 2 Op 2 10/0.000 + CDS 6163 - 7182 1569 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 6 2 Op 3 . + CDS 7312 - 8910 2399 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases + Term 8929 - 8985 11.2 7 3 Tu 1 . + CDS 9026 - 9835 429 ## gi|154508703|ref|ZP_02044345.1| hypothetical protein ACTODO_01209 8 4 Op 1 1/0.000 - CDS 9842 - 10291 446 ## COG1846 Transcriptional regulators 9 4 Op 2 . - CDS 10338 - 11360 966 ## COG0604 NADPH:quinone reductase and related Zn-dependent oxidoreductases 10 5 Tu 1 . - CDS 11525 - 12994 666 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 - Prom 13074 - 13133 3.7 11 6 Op 1 . + CDS 13102 - 13218 70 ## 12 6 Op 2 . + CDS 13227 - 13799 608 ## COG1695 Predicted transcriptional regulators 13 6 Op 3 . 
+ CDS 13867 - 14250 253 ## AAur_4025 putative integral membrane protein + Term 14278 - 14323 4.0 Predicted protein(s) >gi|319977437|gb|AEUH01000218.1| GENE 1 2 - 778 1198 258 aa, chain + ## HITS:1 COG:SP1162 KEGG:ns NR:ns ## COG: SP1162 COG0508 # Protein_GI_number: 15901027 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Streptococcus pneumoniae TIGR4 # 33 250 122 340 347 151 40.0 1e-36 DAARAPEAAPAPASSAAATAPAADFPGASTSSPLKGVRKVVAKRMMESLTSTAQLTLNTS ANAAGILALRKKVKNADEALGLNRITLNDLVCFAVSRTLLKYPVFNAHLEDGVLTEFEQV HLGFACDTPRGLLVPVIRSAQSLGLKAFSDEAKRLAGGAIDGTLPPDYLGGGTFTVSNIG SFGIETFTPVINLPQTAILGVGAITPRPALAPDGAVGVEQRLNLSLTIDHQVIDGADGAR FLRDLVAAIENIDVTVLA >gi|319977437|gb|AEUH01000218.1| GENE 2 795 - 2162 671 455 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 6 452 4 450 458 263 34 7e-70 MSETHFDVIVLGAGPGGYLAAERLGHAGKRVALVEEQYLGGTCLNVGCIPTKTLLNGAKN YLHAKEAGQFGVDAQGVSVNWAQMQAWKDQVVKGLVAGVAATERKAGVTVINGRGRLDAP GRVTVEGTAYTSDHVIIATGSVPAMPPLPGTQDNPALVDSTGILSLPEVPARLAVIGGGV IGVEFASLYSTLGSKVTVIEMAPEILPFMDDDLAARARAAMKDVAFELGCRVESLDGGTV HYSKDGQSKSVEADVVLMAVGRRPATAGWGAEEAGLEIDRGVVVDDTMRTNLPNVWAIGD VTGRSLLAHAAYRMAEIASANILDPAAKKRGEVMRWHTVPWAVFSIPEAAGVGLTESAAK REGRPVLVAKVPALMSGRFIAENGFKAPGEAKILVDPDTHQVLGIHVLGAYAAEMIWGAQ AVLEMELTVEDLRQVVFPHPTVSEVIREAAWAVKL >gi|319977437|gb|AEUH01000218.1| GENE 3 2197 - 4650 4004 817 aa, chain + ## HITS:1 COG:SSO1526 KEGG:ns NR:ns ## COG: SSO1526 COG0022 # Protein_GI_number: 15898354 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit # Organism: Sulfolobus solfataricus # 466 787 1 319 324 226 41.0 1e-58 MTKSLIVDPSEVRAPGHVAFPDVPVNQYSFDLGTEVARHGEDGLVRMLHDMIVVRTFESM 
LDSIKKTGEWQGVAYNHKGPAHLGIGQESAYVGQSAVLTPDDFIFGSHRSHGEILAKCYS AMHQMDEGDLEAIMKGFLGGETLSYAERIDYKDTKDLTENFILFGALAEIFARKSGFNRG LGGSMHTFFLPFGSYPNNAIVGGSAPIANGAALFKRINRKPGIVISNVGDAALACGPVWE ALNFASMDQFRSLWRAEDGGNPPILFNFFNNFYGMGGQTYGETMGYEVLARVGAALNPEA MHAERVDGINPLAVAEATARKKKILEEGRGPVLMDTITYRFSGHSPSDASSYRTKEEVEL WEQVDCIKEYSDLLISNGLTTQSAIDDYTADLTGKLVKVLKLAIDDEATPRVADGYIDSV MYSNEKVEAFDDAAPEIDLADNPRVKALAKKVRTSVDANGKPVSKMRMYQFRDGLFEAML HRFKTDPTMAAWGEENRDWGGAFAVYRGLTEALPYRRLFNSPIAEASIVGAGVGYAMAGG RAVVELMYCDFLGRAGDEVFNQMAKWQSMSAGLLKMPLVLRVSVGAKYGAQHSQDWSALT AHIPGLKVYFPTTPTDAKGMLNLALSGTDPVVFFESQKLYDKGEDFEPGGVPEGYYETEE GEPAIRREGGDITIAAYGATVYKALEAADVLAEKYGMSAEVIDLRFVAPLNYDKLIASVK KTGRLLLTSDAVERGSFLNTVATNVQTLAFDALDAPVAVVGSRNGITPGPEMESFFFPQV SWIIDAIHERIVPLPGHVPTTKHATEAEIARLNRAGL >gi|319977437|gb|AEUH01000218.1| GENE 4 4808 - 6112 1890 434 aa, chain + ## HITS:1 COG:no KEGG:Csac_0867 NR:ns ## KEGG: Csac_0867 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticus # Pathway: not_defined # 35 417 49 430 436 309 40.0 2e-82 MDATNDATLSQDERTAALREAIRGEAFPEWVPESNNHIHTCYSFSPYTPTHAALLARRAG LRVVGSVDHDSIAAAPEMTAATRALGMGSVTGFEIRARFDPDGPLDGRKLNNPDSAGIAY MTVQGVPAPARAAVDAWLAPKRQARLRRTLAMADEANAVLAGLGLEPFDPRSDMVAASQY AHGGGITERHLLAAMASALIRGFGRGPALVAGLGTMGVTVPAALAGALADPGNPHLVFDL LGVLKAEYLDRVYIQPTDELATADEVVAFADSVGAIATYAYLGDVSASPTGDKKAEKFED DFLDELFDAMEAKGLRAVTYMPPRNTPAQLDRVHRLAAAHGMLEISGVDINQPRQAFNCP ELRRPEFAGLNEATWALVAHEALSSVDPALHLLGRTGRLGPDRLAQRIAQYAPLGRRIAD GEAADRVAEDATRA >gi|319977437|gb|AEUH01000218.1| GENE 5 6163 - 7182 1569 339 aa, chain + ## HITS:1 COG:PM1968 KEGG:ns NR:ns ## COG: PM1968 COG1028 # Protein_GI_number: 15603833 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Pasteurella multocida # 75 338 3 259 259 163 37.0 4e-40 
MTHAPLIRGLFLAALRRPVIVSPTQEEAGIVLDEPLADVAPDSLADVELPVVVSAGGESY QVRATASGERAITGRVALVTGGAQGFGAEIAKGLVDQGCFVYIADLNGEGAAAKAAELGA GTTHPITVNVADEDSVAAMAAEVERVTGGLDLVVSNAGIVRAGSVLEQDASAFRLSTDIN YVAFFLVTKHLGQLLARQHSTAPEWMCDIIQINSKSGLVGSNKNAAYAGSKFGGIGLVQS FALEMVEHGVKVNAICPGNFYDGPLWSDPDRGLFVQYLNSGKVPGAKTVADVKEFYEAKV PMRRGAQGIDVLRAIYYIVEQAYETGQAVPVTGGQVMLS >gi|319977437|gb|AEUH01000218.1| GENE 6 7312 - 8910 2399 532 aa, chain + ## HITS:1 COG:yggP KEGG:ns NR:ns ## COG: yggP COG1063 # Protein_GI_number: 16130832 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 306 530 76 318 329 70 28.0 7e-12 MTQVPETQYAIQITGVDQFEVNTAKPVDRVGPHQLLLKVEACGICFSDTKLLHQFDNHPR KSEVVSGIAPEALAEIPNYRPGSAPTVPGHEPVVRVIAVGPEVTHFKVGDRLLVQADWKH LRTAKSNGAFGYNFDGALQEYTVVDERCVVAPDGEEFLIHVSEAPSAAAVGLIEPWATVE GSYAWKERGHLADGGKLLVVGEGDVDALTSAHAPASVTRVSEEEALSVEGAFDDIVYFGA NAGVIEKIAQLLASRGVLNVVLGGRRIERKVSLDIGRVHYDFIRFVGTTGDDPAEGYAWI PDTGDLRVGDKIAIVGAAGPMGQMHTMRAVTSDVPGISVVGTDLSDERLAGLKAVVGPVA EKRGVPLSIINTSVTPLEYGYTHITCLVPVPALVAGAVDLAAEGAIINAFAGIPAGTFGD FDLQGIIERRIFMLGTSGSDVSDMRTVLHKIEEGIIDTSISLFAVTGMAGFGDAINSVIN RTSGGKIMVFPSLHDLGLTPLGELPEKFPQVAAAMKDGLWTKEAEEALLATR >gi|319977437|gb|AEUH01000218.1| GENE 7 9026 - 9835 429 269 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508703|ref|ZP_02044345.1| ## NR: gi|154508703|ref|ZP_02044345.1| hypothetical protein ACTODO_01209 [Actinomyces odontolyticus ATCC 17982] # 10 267 5 262 270 358 74.0 2e-97 MTVNDRASRERYVALQAWLCGQMSPRAPHATRCFEEDPQCLYDAFLSVLVGTGLGFADFD DNRTHHGLYAFDFFVGDIEAPPRGRGVDGPRASVVAGDSATDAAAPAMCHALVDTWLGET GRRIPFVASVIASVATLELVADVSVDECVFECELPPGYARSTANKPWEDGSLPIGSLDLS LVAPGPLIHVETGARYARQPLSCSGFDEALASCVQMGLTVESSMASGAGVAAALRAPHGG PEMLGWAADTVARALSACGHTGGVQIRVW >gi|319977437|gb|AEUH01000218.1| GENE 8 9842 - 10291 446 149 aa, chain - ## HITS:1 COG:BS_ycgE KEGG:ns NR:ns ## COG: BS_ycgE COG1846 # Protein_GI_number: 16077377 # 
Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 10 140 15 148 155 59 27.0 2e-09 MTAPHNHPLVRALRAFNSELAANNRCMAARVGLNDSDLAVLDVLHREGPQTPSALAQRTR IATTTMTGVLRRLSAGGWVERRTSDKDLRSFTIHISAVARLSEVFRPVDERLIDLVDELP EQDAQRIVTFLGDATRIVRESHASGDSHP >gi|319977437|gb|AEUH01000218.1| GENE 9 10338 - 11360 966 340 aa, chain - ## HITS:1 COG:mlr2819 KEGG:ns NR:ns ## COG: mlr2819 COG0604 # Protein_GI_number: 13472500 # Func_class: C Energy production and conversion; R General function prediction only # Function: NADPH:quinone reductase and related Zn-dependent oxidoreductases # Organism: Mesorhizobium loti # 20 320 8 295 308 100 33.0 4e-21 MSGSAHQQEGIVNHIIRYRSFGSPDVLEEADVPLSEPQGADVRVRVHAVGLNPVDYKTFN GDLRALEHLRQLAHPLRRTPMFPRGVGRDFSGVITAVGPTATGHSVGDPVLGTLRSAPGQ IVPQGALATEVIAPAETITAKPEGMSYQTAASLGVAAETACGAFRRLRLAADDVIVIIAA AGGVGSLALQLALAMGATVVGIAGRGNADYLRSLGAIPVVYGQDLATRIAQASPEPVTKL LDCYGKGYARLGADLRIPRSGRGTLVPSPGALARGARFTGARHALPGDLERVAQLVASGR ITIHVANEYPMTLEGVRAAYTELAAGHTRGKIVVTDDTAR >gi|319977437|gb|AEUH01000218.1| GENE 10 11525 - 12994 666 489 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 5 445 2 440 456 261 35 3e-69 MTNETIKTFLEAVSDALYSYILIAVLIGVGLYFFIRTKALPLRMFGESLRVVAEPPEKEG EVSSFRALMVSTASRVGVGNIAGVATAVGLGGAGSVFWMWVIATLGGASAFIESTLAQIY KKKGPHHSYGGPAYYIQTALKKNWLASSFAVMLILTYMGGFNLLASFNVADSFSAYSWAD EWTPWIVGAVLAVLMAASIFGGTRRLTDVTGVLVPIMAIVYLGVGLVVIALNYQNIPTMF RAIVSGAFDFRFETLAGGFAGSAMMYGIKRGLYSNEAGVGSAPNAAASASVSHPVKQGLV QMLSVFIDTMVICTLTAFVVLSSGVEGSDTLKGAPLVTSSIATVLGGFAQPFVSTALFLF AFTTLIGNFYYAEVNFRFLLRNVDFKHWMLTAFRTIASVLVFAGALLEFEVAWSLGDILM GLLALINLPVIVILGKKAVDCLKDYQAQRKAGKEPQFVASSIGLDPGELDYWQDATAPVS ADSAVPVNA >gi|319977437|gb|AEUH01000218.1| GENE 11 13102 - 13218 70 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAKSLGDNVTEQAGASVARPRTGDRGADVLVDSELIGA >gi|319977437|gb|AEUH01000218.1| GENE 12 13227 - 13799 608 190 aa, chain + ## HITS:1 COG:alr2018 KEGG:ns NR:ns ## 
COG: alr2018 COG1695 # Protein_GI_number: 17229510 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Nostoc sp. PCC 7120 # 8 140 38 171 216 102 38.0 5e-22 MCGRLSGMELKHAILGLLSIRSASGYDLARAFAGSVAHFWHADRSQIYRTLDRLSGAGAI TTEVVRQDGKPDRKVHSLTDAGRAELTDWLSSPVEEDLPKEPFLARLFFAALIGREGVER MLDERERQMNETLTTLSSLSANSDDLMGLLHTATLRNGLVHAEAEREWLRETRASLLGLA EPTEDSQERP >gi|319977437|gb|AEUH01000218.1| GENE 13 13867 - 14250 253 127 aa, chain + ## HITS:1 COG:no KEGG:AAur_4025 NR:ns ## KEGG: AAur_4025 # Name: not_defined # Def: putative integral membrane protein # Organism: A.aurescens # Pathway: not_defined # 1 110 1 111 126 79 44.0 5e-14 MTAIALLVALALLAALAVLQILAASGLPIGRFGWGGQHRVLPRRLRVASAVSVLVYAGLA ALLLSRAGVLPAGDSTAVIVLTWVVFAFFAASVALNALSRSPAERWTMAPTSLLLAAATL VIALGVG Prediction of potential genes in microbial genomes Time: Thu May 12 18:52:55 2011 Seq name: gi|319977425|gb|AEUH01000219.1| Actinomyces sp. oral taxon 178 str. F0338 contig00219, whole genome shotgun sequence Length of sequence - 12756 bp Number of predicted genes - 13, with homology - 10 Number of transcription units - 8, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 677 671 ## COG1928 Dolichyl-phosphate-mannose--protein O-mannosyl transferase + Prom 696 - 755 4.2 2 2 Op 1 25/0.000 + CDS 775 - 1947 1512 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 3 2 Op 2 42/0.000 + CDS 1949 - 2659 238 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 4 2 Op 3 10/0.000 + CDS 2656 - 3663 1087 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 5 2 Op 4 . + CDS 3669 - 4499 1095 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 6 2 Op 5 . + CDS 4504 - 5352 1157 ## COG0313 Predicted methyltransferases - Term 5881 - 5923 2.1 7 3 Tu 1 . - CDS 6023 - 6100 73 ## 8 4 Tu 1 . - CDS 6225 - 6641 265 ## 9 5 Tu 1 . 
- CDS 6824 - 7318 679 ## COG1335 Amidases related to nicotinamidase 10 6 Tu 1 . + CDS 7557 - 9398 2878 ## COG0143 Methionyl-tRNA synthetase + Term 9556 - 9596 -0.9 11 7 Tu 1 . - CDS 9359 - 9550 95 ## 12 8 Op 1 1/0.000 + CDS 9599 - 11089 2020 ## COG1621 Beta-fructosidases (levanase/invertase) + Prom 11125 - 11184 2.1 13 8 Op 2 . + CDS 11262 - 12756 2173 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific Predicted protein(s) >gi|319977425|gb|AEUH01000219.1| GENE 1 2 - 677 671 225 aa, chain - ## HITS:1 COG:ML0192 KEGG:ns NR:ns ## COG: ML0192 COG1928 # Protein_GI_number: 15827001 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Dolichyl-phosphate-mannose--protein O-mannosyl transferase # Organism: Mycobacterium leprae # 24 219 15 198 510 91 33.0 9e-19 MTTRTDRPAPGGAPARAPRWARLPASAAAGWAATLITTAIAALIRLTRLDNVPRLVFDET YYVKDAWSLIELGYEGTWPADYDASFAAGDTSGLTANASYPVHPPTGKWIIGWGMRLLGQ SDPVGWRIMGAICGVITVFLLCRLAQNLFRSPAITALAGAFVATDGIAIVMSRTAILDGF LAMFSLAAFLAVVKDQQDARARLGPRLAAWEGVGAPRRGWPGLRA >gi|319977425|gb|AEUH01000219.1| GENE 2 775 - 1947 1512 390 aa, chain + ## HITS:1 COG:AGl3262 KEGG:ns NR:ns ## COG: AGl3262 COG0803 # Protein_GI_number: 15891751 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 22 374 10 333 337 136 31.0 6e-32 MEEPVTPPLRSLAAFGALGIACASLAACAPRSGASADSLKVVATTTQICDYITQIGADAS GGLALDKTDADGTTSHLGAPAENAKKTISLTCLLAPNASAHEHEMTPAQTGALAEADIMA VNGVDLEHFLDEAVAASGFKGTMLVTSGVLTASDVDSPGASAEQEKALPYTVNRGQRAVD VAPWPFPPEDGEDAPEFRFDPHVWTSPKNAAIQVANIGAALEEAAPGAADQIRARTEAYA QRIEALDSWAAASLESVPQSARVLFTSHDAFGYFSKAYGITFIGAALSDFNEQQDATASH IDEAVRTVKESGARALFAENSNNSKSIEAIAKAAGVKAVVGEDALYGDSLGPAGSPGSTY TGSIVHNVTTIVEAWDGTLAPLPDGLGEGQ >gi|319977425|gb|AEUH01000219.1| GENE 3 1949 - 2659 238 236 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| 
Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 13 206 16 214 305 96 30 1e-19 MLIAAKDLALGYSGVPAAWDITLDVQAGEGLALIGPNGSGKTTLLRAVLGDVRILSGTLE VGASTIGYVPQNNDLDPTFPVSAREVVTMGLYGELGWGSRPGRAHRARVEEALERVGMAE RASMRFGRLSGGQRQRILVARALVARPQLILMDEPFNGLDQPNRDSLVSIVRQATADGVG VVVSTHDLALAHLTCARACLLSGRQIAFGPVDEVLTEGLLAQAYGAGNDALTGAVP >gi|319977425|gb|AEUH01000219.1| GENE 4 2656 - 3663 1087 335 aa, chain + ## HITS:1 COG:all3574 KEGG:ns NR:ns ## COG: all3574 COG1108 # Protein_GI_number: 17231066 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Nostoc sp. PCC 7120 # 33 258 16 241 281 84 28.0 3e-16 MTAAFSGALEALRDLLSGVPGLGWVTYAPYAFRPFVLVVVLALVVGPVSTIVNLRRLEFN AEAMVHSVFPGIVAAAVWWGTRMIIPGAALAAVAATCALVAASRSKRENEAATAVVLATF FSIGVIISLRKGDMSGQLEALMFGRLMEMSDERLMQSLIVCALALAALALTWKEQVFVAH DRDGARVAGVRVLAVDVVINAAIGAVVVAASAAIGVLLVIGYVVVPGVGARLAAPSATRM SATASGIALVGALSGLALMNAPTARPVSPQAALALTVIAVSAVVGGWGLLRERMRGPVGE PAAAPAGPQGDEGTVVEGAALPAPASSPAPQGSSL >gi|319977425|gb|AEUH01000219.1| GENE 5 3669 - 4499 1095 276 aa, chain + ## HITS:1 COG:MA0025 KEGG:ns NR:ns ## COG: MA0025 COG1108 # Protein_GI_number: 20088924 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 9 265 14 261 274 73 26.0 3e-13 MLAILALPVLEAVVVGALAGLVGALAVLDRRVFFAESITHGTFPGAVLGVVVAAAAGWGH AAMSVSLFAGALLMCLPLAALMYALTRIPGVSSQASAGVVLTLGFASGYFLATWFKPLPL QVSSFLAGSILTVSGADLAAASACLALALGLMGVRGPQLMRHCFDPVGLSPRTRRTNELA ILTVLLLTVVVLIPAIGTVLSIALIAAPAAALRPHAPTLRAFMVAAPVLGALIALSGLAL AVAADWSAGGCIALVAGLVTAASALRARLGAAPVGR >gi|319977425|gb|AEUH01000219.1| GENE 6 4504 - 5352 1157 282 aa, chain + ## HITS:1 COG:Rv1003 KEGG:ns NR:ns ## COG: Rv1003 COG0313 # Protein_GI_number: 15608143 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Mycobacterium tuberculosis H37Rv # 9 281 3 276 285 234 53.0 
1e-61 MTHDDARPSGTILLAATPIGDPGDASPRLVAALASADVIAAEDTRRLRSLASRLGAAPTG RVVALHDHNEQAKASGLVRSARDGATVLLVSDAGMPTVSDPGYRVVTEAIAQGVPVSAIP GPSAPLTALALSGLPSDRFAFEGFVPRKAGEAERFLGALATDPRTLIFFESPRRVHETLV RMAGAFGGGRAGAVCRELTKTHEEVMRGTLDELVERTEGGVLGEVTIVVAGHKGGGNPED HAAAVLALADEGMRLKDAAAEVAAATGLRRNDLYRAALAARE >gi|319977425|gb|AEUH01000219.1| GENE 7 6023 - 6100 73 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPVAESPHDPRRAHSRNGIGNGSSG >gi|319977425|gb|AEUH01000219.1| GENE 8 6225 - 6641 265 138 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLGRLAGALFGGGAMAAADERSAAIGALARKYEHLFPYDHHGDFIAGLCEHGESELALDI LIGNLHDWGAPLEPGEIARIIALKRNLGFPEGPWWRQLPVPFIHLEPRLRDPALVNLHAL LTALLEARRAENDWEHSG >gi|319977425|gb|AEUH01000219.1| GENE 9 6824 - 7318 679 164 aa, chain - ## HITS:1 COG:MT2103 KEGG:ns NR:ns ## COG: MT2103 COG1335 # Protein_GI_number: 15841531 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Mycobacterium tuberculosis CDC1551 # 3 164 23 183 186 142 50.0 2e-34 MEGGNAVARRVAEHVAAHRGDYALVATTQDWHIDPGAHFSDEPDFVDTWPRHGVAGTPSA MLHPALDGLAADITVKKGQYAAAYSGFEGTTEDGRALAEALRGAGVTDVDVVGLALSHCV CETALDALREGFGVRVVEGLTAPVTPELGEAAKERMRSAGVEVV >gi|319977425|gb|AEUH01000219.1| GENE 10 7557 - 9398 2878 613 aa, chain + ## HITS:1 COG:Cgl0868 KEGG:ns NR:ns ## COG: Cgl0868 COG0143 # Protein_GI_number: 19552118 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Corynebacterium glutamicum # 4 591 5 595 610 780 64.0 0 MSRILSAVAWPYANGPRHIGHVAGFGVPSDVFSRYMRMAGHDVLMVSGTDEHGTPILVAA DGEGVSARELADRNNRLIVEDLVALGLSYDLFTRTTAGNHYRVVQDMFATVRDNGYMIEQ VTRAAISPSTGRTLPDRYIEGTCPICGAEGARGDQCDNCGNQLDPTELIDPRSRINGETP EFVESTHYFLDLPALARALSDWLDERERSGTWRPNVIKFSKNFLEDIRPRAMTRDIDWGI PVPGWEDQPTKRLYVWFDAVIGYLSASIEWARRTGDPEAWRKWWNDPDALSYYFMGKDNI VFHSQIWPAELLGYNGQGDRGGRPGDLGVLNLPTEVVSSEFLTMEGKKFSSSHGIVIYVR DFLERYQADALRYFISAAGPETSDSDFTWAEFVRRTNGELVAGWGNLVNRTASMIAKKFG 
EIPAPGELEDIDRALLDAIEAGFATVGGLIRHHRQKAALSEAMRLVGEANKYVTDTAPFK LKAPEQRERLATVLWTLAQAVADLNLMLSPFLPHAANDVDRVMGGEGRIAPMPVIEEAEE LDPQVLPSDFEGRSGYPIITADYTGVPTWERHPVTAGTPVAKPTPVFVKLDEAIVDEELA RYAASVPDDVTGA >gi|319977425|gb|AEUH01000219.1| GENE 11 9359 - 9550 95 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRRDAPGQPGAGASSKDPGDATGIRPAPVRGPFEDGHACATPPGSPARGPAQAPVTSSGT EAA >gi|319977425|gb|AEUH01000219.1| GENE 12 9599 - 11089 2020 496 aa, chain + ## HITS:1 COG:CAC0425 KEGG:ns NR:ns ## COG: CAC0425 COG1621 # Protein_GI_number: 15893716 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Clostridium acetobutylicum # 27 488 30 484 490 270 36.0 4e-72 MAEDLMLSALNAERASLARQDPDHPLFHVAPPVGRLNDPNGLIVDSGTYHAFFQYTPEHP RKLVYWGHARSTDLVHWTYLPPAILPDSRQDRNGAYSGTAIGTDDGVELWYTGNYKDADT GEREATQCVVTTPDLRRFTKVVPPVIARQPEGYTAHFRDPQVWRDPDDPQGSYRMLLGAQ REDLTGTALLYRSSDLREWHCEGELSFPDAGGAFDRFGYMWECPNLVRLHDEGSGRDRDV LIFCPQGIAPEREGFENVFPCVYMVGELVGTEFRGADGDYWEVDRGFEFYAPQVFADRAR AWGRADPALLLGWAGNAGEDDQPSISTGGWVHALTAPRALSLAGGRLVQRPHLPGLPLEP TGFPGSVDGGGARVEALESSRSWRFSASLDHEPGACLELRVGGDSGLVVRVSESLLEVDR SRTRYPHGGIRRVSLEPGWAGSVEVVHDRSITEIFLGGGRLAFTLRSFLEGTGAGLVLGA SGGAVRVAGARAARAD >gi|319977425|gb|AEUH01000219.1| GENE 13 11262 - 12756 2173 498 aa, chain + ## HITS:1 COG:Cgl2590_2 KEGG:ns NR:ns ## COG: Cgl2590_2 COG1263 # Protein_GI_number: 19553840 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Corynebacterium glutamicum # 99 498 4 396 406 356 55.0 7e-98 MDHAKVATEVVEAIGGPSNISAAAHCATRLRLVIADPDKIDQQALDDNEELKGTFAAGGM FQIIVGPGDVDQVFQHMVDDHGVRQVSKDEAKEEAEQGGNLFSRFIKMIADIFVPILPAL IAGGLLMALHSVLKAEGLFAKQSVVQMFPWLEDYDSLINLVSSAAFASLPILVGFSSAKR FGGNVYLGAAMGAAMVSPSLLSAYSMAKPDEASAFWSYSNQSSVWHLFGLEVTKVGYQAM VIPTLVVTWILCLIEKRLHKVFKGTADFLLTPLITLLVTGFLAFVVVGPITREISNQLTS GIDWLYGTLGPVGGLLFGFFYSPIVVTGLHQSFPAIEIPLLPSNGGLGDFIFPVASMANV 
AQGAAALAVFFKCRNAKMKGLAGAGGVSAVFGITEPAIFGVNLRLRWPFFIGMGAAAIGS AGVALLGVRGQALGAAGFAGFVSIIPASIPAYLALEVLVFVLAFAATLAYASTRGRADMA DGAAGADSAATAPASAPA Prediction of potential genes in microbial genomes Time: Thu May 12 18:53:14 2011 Seq name: gi|319977415|gb|AEUH01000220.1| Actinomyces sp. oral taxon 178 str. F0338 contig00220, whole genome shotgun sequence Length of sequence - 10023 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 7, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 578 850 ## COG2190 Phosphotransferase system IIA components 2 2 Tu 1 . - CDS 763 - 1956 1353 ## COG1609 Transcriptional regulators 3 3 Op 1 . + CDS 2014 - 2280 247 ## Arch_1461 phosphotransferase system, phosphocarrier protein HPr 4 3 Op 2 . + CDS 2285 - 3967 2071 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) + Term 4041 - 4100 5.6 5 4 Op 1 . - CDS 4053 - 4619 551 ## gi|154509665|ref|ZP_02045307.1| hypothetical protein ACTODO_02198 6 4 Op 2 . - CDS 4639 - 6195 1580 ## BDP_0501 narrowly conserved hypothetical membrane spanning protein - Term 6345 - 6378 0.4 7 5 Tu 1 . - CDS 6379 - 6525 96 ## 8 6 Op 1 19/0.000 + CDS 6476 - 7639 1357 ## COG4585 Signal transduction histidine kinase 9 6 Op 2 . + CDS 7662 - 8312 858 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Term 8347 - 8394 1.3 10 7 Tu 1 . 
+ CDS 8455 - 9945 2165 ## COG4868 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|319977415|gb|AEUH01000220.1| GENE 1 3 - 578 850 191 aa, chain + ## HITS:1 COG:SP0577_3 KEGG:ns NR:ns ## COG: SP0577_3 COG2190 # Protein_GI_number: 15900487 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Streptococcus pneumoniae TIGR4 # 39 164 22 147 166 136 50.0 3e-32 GASAAESAAAPAPAAVAPAAAAPAVEFSDEAKADFTLTSPMAGTAVPLGQVKDESFAKGM LGPGIAVEPSSGVVVAPCDGKVTVAFPTGHAYGLKSASGVQVLIHIGMDTVKLDGKGFTP KVAKGDFVRRGDVLAEVDLDVIREAGYDTVTPVVVTNKKKFASVTPDAEGEVAPGGALLT VVPKEAAQPTA >gi|319977415|gb|AEUH01000220.1| GENE 2 763 - 1956 1353 397 aa, chain - ## HITS:1 COG:SP1725 KEGG:ns NR:ns ## COG: SP1725 COG1609 # Protein_GI_number: 15901558 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 71 397 5 321 321 242 40.0 1e-63 MVWAESPFGQWSDRRSRHDLRPTADAPALGRGTGGGAPGWETTLATEGTFPGAPADDQFG GDVARSREPTLADVAERAGVSVTTVSRVLNARGYLSQDAKDRVARAIEDLGYRPNQVARA LHGKKTNTIGLIIPTVALPFFGELAVEVENALADGGYRILICNSLGRADREREYLNLLIG NRVDGIISGAHNEGLCEYESIRMPLVTIDRDLSPTIPNVRCDNRGAGRMATEKLLEAGAR RPALLTSRSGAHNLREQGYRDALAGTGAGPLVLTVPFDTPSPLRQSRVEAALDSVAGRFD AVFATDDLAAAEVLEWARGRGVDVPSRLKVIGFDGTTAIRRALPALATIRQPLDLIARKA VELLVAQIEGSGPSPGDADGAMPAVEFPVAFLPGRTL >gi|319977415|gb|AEUH01000220.1| GENE 3 2014 - 2280 247 88 aa, chain + ## HITS:1 COG:no KEGG:Arch_1461 NR:ns ## KEGG: Arch_1461 # Name: not_defined # Def: phosphotransferase system, phosphocarrier protein HPr # Organism: A.haemolyticum # Pathway: not_defined # 1 86 1 86 87 77 67.0 1e-13 MISRTVAIASAVGLHARPASVLAEAVDETGVDVTISFNGEEADAASLLEIMTLGAKHGDV VTLSTDDDNAAEVLDSLVALLSRDLDKE >gi|319977415|gb|AEUH01000220.1| GENE 4 2285 - 3967 2071 560 aa, chain + ## HITS:1 COG:Cgl1887 KEGG:ns NR:ns ## COG: Cgl1887 COG1080 # Protein_GI_number: 19553137 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS 
system EI component in bacteria) # Organism: Corynebacterium glutamicum # 1 552 7 564 568 406 48.0 1e-113 MGEHTVLHGIGVSAGTASAPAALVQPAPGVDATEPACTDPAADGARVREALAAVSARLKA REETAPKETKAILKATAKLAADRGLAKAVDKKLKKGMGVTAAVHAAVEEYAEMLRGLGGY MAERATDLYDVRDRATSELRGLPEPGVPALDGEVILVAHDLAPAETATLDPGTVKGIITE VGGPTSHTAILAAQLGIPAIVKATGIMGVSDGEVLALDGGVGEVIIAPTPEEVSLLEERS RRRAMALAGSSGEGATYDGYRVKLLANIGTVDDAQRAAAYDLEGSGLFRTEFLFLERSDA PSLEEQTETYTNVLKAFGERRVVVRTLDAGADKPLSFADLGAEENAALGRRGLRLCQVRE DLIDTQLAALAAAHKATDADLWVMAPMVATAEEATWFADKARGVGLPKVGIMIEVPAAAV RAEQLLSIVDFASIGTNDLTQYTMAADRLDGNLAHLLDPWQPAVLQMIKAACNGGRATGK PVSVCGEAGGDPLLALVLAGLGVASLSMAPSKVNAVRAALRLHDLATCQQMGAFAVDAPS AAKGREAVLKLVSPTMLDLI >gi|319977415|gb|AEUH01000220.1| GENE 5 4053 - 4619 551 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154509665|ref|ZP_02045307.1| ## NR: gi|154509665|ref|ZP_02045307.1| hypothetical protein ACTODO_02198 [Actinomyces odontolyticus ATCC 17982] # 4 179 5 210 231 72 33.0 2e-11 MALNDDKTEQFPTTAPADPTQALPTTAPADPTQVLPTTAPVDPTQVLPTTAAGDSASPAP DAGRTRPLYQVEDDTPEHGDGAAAHQAPQQDQGGEEDESHRVWSAPAAPADPHDLPEAPR RGVRAGGLAWGVIVMLAGVFLIALALVPRLNVPILAITLMAALGVALIVSALFTGRKPRA AAPSGRRK >gi|319977415|gb|AEUH01000220.1| GENE 6 4639 - 6195 1580 518 aa, chain - ## HITS:1 COG:no KEGG:BDP_0501 NR:ns ## KEGG: BDP_0501 # Name: not_defined # Def: narrowly conserved hypothetical membrane spanning protein # Organism: B.dentium # Pathway: not_defined # 4 338 22 359 624 91 30.0 6e-17 MSWQQQPSSRFFDSVRASGWYRSQPRVVGGVCAGISARTGWDLSLVRVLVGVLAFFAPPV VAGYGLAWLFLPEAQDGGIHAEELARGRIDIAQLGGIALAAIGLSAALPFAGFFGPVGYV FNGFWFFALVSAIVAFVVISSRSTVQGGPPMSGTQPNGAPHSQPRRGGPSDGRPSAGPTG PTGAPEGPAFSNAMPHGAPQQGAGPMPWYGQRAGYGPRPGYGTRPGYGPASGPHRARTWT PAGPKERPRLVSARVTLAVTGLIALVFATVFAVIYWLAGGALDTPNAQIDDATSLKIAQT ILIGGGVCLLISGFSLVVAAVKDRSSGWLLAMSIIGVLLFVPTSVMGMLVRSDYLFHNST NVSRPWTNVPGSETSLTWEADTVTGSPVGTSTLDLTGAPVGTTKTITVTQWSWNRVNILV AEGQPVQIICRSNVGSLSTSFPDGQQDWVADLGGCADADSRTPVSAVSPGWGKPGLGGIT IEIGADADLSTLHITQTRTVADTGLRWDGSDPAQSGAN >gi|319977415|gb|AEUH01000220.1| GENE 
7 6379 - 6525 96 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTGMPRWTDKPEHTPPIHGVGMSVDEAEARRRGGRSTGWGGTHWGPGG >gi|319977415|gb|AEUH01000220.1| GENE 8 6476 - 7639 1357 387 aa, chain + ## HITS:1 COG:Cgl0594 KEGG:ns NR:ns ## COG: Cgl0594 COG4585 # Protein_GI_number: 19551844 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Corynebacterium glutamicum # 138 376 99 350 352 105 38.0 2e-22 MGGVCSGLSVHLGIPVNTLRLITVGSAVVGVGVLIYLWLWGTVPVDRPGDEERGGAGIGR RLVVVEDRAQAVARNRLAMAGACLIVLAALGLLLARTRVVEWRDVTASISIITGIALVWS QSLNLRGWKSLRFLAAVGGGLVLLASGIVLAASRETPAVILWRGGLIGAALVAGVLFAMA PLLLRANKDLSEAEAKRVRETERADIAAHLHDSVLQALTLIRASADDPARVRAIALSEER ELRSWLYTGRTESASSLAAAVEEAVGAVEARYGVPISVVTVSDTSAGAGELALVAALGEA AQNAVKHGEPPVSVYMEVRPDRIEAYVKDNGPGFDMGAVADDRHGVKDSIIGRMARAGGT ARIRRRTPGTEVELTVPRSDTLGGTPD >gi|319977415|gb|AEUH01000220.1| GENE 9 7662 - 8312 858 216 aa, chain + ## HITS:1 COG:Cgl0595 KEGG:ns NR:ns ## COG: Cgl0595 COG2197 # Protein_GI_number: 19551845 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Corynebacterium glutamicum # 3 215 2 229 230 197 51.0 1e-50 MTISLVIIDDHHMVRAGVRAELEAMGADLRILGEASDVEGAIAACHELRPDVALLDVHLP GGNGGGGAEVASACQDVEGLRFLALSVSDSAEDVVSVIRAGARGYVTKSISPSDLVEAIN RVAGGDAAFSPRLAGFVLDAFGTGEVSTTDSELDLLSAREQEVMRLIARGYTYKEVAHDL FISIKTVETHVSAVLRKLQLSNRHELTRWAMQRHIV >gi|319977415|gb|AEUH01000220.1| GENE 10 8455 - 9945 2165 496 aa, chain + ## HITS:1 COG:Cgl2942 KEGG:ns NR:ns ## COG: Cgl2942 COG4868 # Protein_GI_number: 19554192 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Corynebacterium glutamicum # 4 496 3 495 495 674 66.0 0 MHRIGFNRDTYIRLQSEHIAQRRAQFGGKLYLEFGGKLVDDMHASRVLPGFTPDNKIVML KELADEVEIIVAVNAKDFARRKVRADMGTTYDDEVLRQIDEFRERGLFVGSVVITQWTDD NKAAHEVKSRFEKLGITVYRHFPIPGYPSDVSRIVSEQGYGANEYIRTSRDLVVVTAPGP 
GSGKMATCLSQLYHDHKRGIESGYAKWETFPIWNLPLDHPVNIAYEAATADLDDVNMIDP FHLEAYGVKTVNYNRDVEIFPVLNRLFERILGSSPYKSPTDMGVNMAGMSICDDGACREA SEQEVIRRYYKALVAEKKQGGEPVQSARIGLLMGRLGLERADRPVVRPALELAESTGAPA AAIELEDGRIVLGKTSPLLGCSSAMLLNALKLLAGIDPDVKLLARESIEPIQRLKTADLG SRNPRLHTDEVLIALAVSANGDENARLALAQLGALRGCDVHESVILGSVDEGIFRALGIQ VTSEPVYATKSLYRKK Prediction of potential genes in microbial genomes Time: Thu May 12 18:53:40 2011 Seq name: gi|319977413|gb|AEUH01000221.1| Actinomyces sp. oral taxon 178 str. F0338 contig00221, whole genome shotgun sequence Length of sequence - 1577 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1577 2184 ## COG0210 Superfamily I DNA and RNA helicases Predicted protein(s) >gi|319977413|gb|AEUH01000221.1| GENE 1 2 - 1577 2184 525 aa, chain + ## HITS:1 COG:Cgl0831 KEGG:ns NR:ns ## COG: Cgl0831 COG0210 # Protein_GI_number: 19552081 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Corynebacterium glutamicum # 23 471 9 457 763 534 61.0 1e-151 AGPGARAQGTFEWTQSAYASPAGDPEELTRGLNDRQRASVEHRGTPLLIMAGAGSGKTRV LTHRIAYLLATGQARAGQILAITFTNKAAAEMRERIAALVGDEAGRMWVSTFHSACVRIL RYEHEAAGLSGSFTIYDAQDSQRLMQMVLKAQDVDVKRFTPKMVAARVSDLKNELIGPER YAEVAGKDPVSRIVAAAYAEYDKRLRDSNAVDFDDLIMRTVLLLQRNPLIAEHYHRRFRH ILVDEYQDTNHAQYVLVRELVGAGDDGVVPAELTVVGDSDQSIYAFRGATIRNIEEFERD FPGARTILLEQNYRSTQNILSAANAVISRNQGRRPKNLWTQEGDGAPITVDAADSEYDEA RFVVSEIDRVADAGADWGDIAVFYRTNAQSRALEELLVRQGIPYRVVGGTRFYERREIKD ALAYLQVVSNPDDTVAARRVINLPKRGIGAKAEEAIAAHAARHGVSFGRALAHLWIRQGR PLGEGEGVDIAALVRHGGETGAPSSADGGAGAVGPADEAGAPSAD Prediction of potential genes in microbial genomes Time: Thu May 12 18:53:41 2011 Seq name: gi|319977411|gb|AEUH01000222.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00222, whole genome shotgun sequence Length of sequence - 1156 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1060 1132 ## COG0210 Superfamily I DNA and RNA helicases Predicted protein(s) >gi|319977411|gb|AEUH01000222.1| GENE 1 2 - 1060 1132 352 aa, chain + ## HITS:1 COG:Cgl0831 KEGG:ns NR:ns ## COG: Cgl0831 COG0210 # Protein_GI_number: 19552081 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Corynebacterium glutamicum # 33 351 471 763 763 211 41.0 2e-54 GEAENGGATALAPQGDPGGGAEEAPEVVGVPPRAARAAASFWGLVEALRRAERGGASVAD ILEEALDRTGYLAQLRRSEDPQDASRVENLAELHSVARAFAQEAPGGGLADFLERVALVA DSDQVPSGGARSGQVTLMTVHTAKGLEFPVVFVTGMEDGTFPHQRSLGEARELEEERRLA YVAITRARERLYLTRAAVRSAWGAPQEMPPSRFLDDIPAELLEVRRATTSAERLRAFQGG SYGSDDGGSRDGRDPWGDEDTGRAYGSGRGGAAPVGQVRARRIQRMGVPAQASAPAQKVL ALKVGDRVKHATLGAGTVTGIEGEGQRTVARVRFGGVDKRLLVRMAPMEKIT Prediction of potential genes in microbial genomes Time: Thu May 12 18:53:52 2011 Seq name: gi|319977377|gb|AEUH01000223.1| Actinomyces sp. oral taxon 178 str. F0338 contig00223, whole genome shotgun sequence Length of sequence - 31394 bp Number of predicted genes - 38, with homology - 31 Number of transcription units - 26, operones - 12 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 39/0.000 + CDS 272 - 1450 1587 ## COG0045 Succinyl-CoA synthetase, beta subunit 2 1 Op 2 . + CDS 1478 - 2395 1378 ## COG0074 Succinyl-CoA synthetase, alpha subunit 3 2 Tu 1 . + CDS 2610 - 4274 1361 ## gi|154509657|ref|ZP_02045299.1| hypothetical protein ACTODO_02190 4 3 Op 1 . + CDS 4380 - 6464 2039 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) 5 3 Op 2 . + CDS 6555 - 6728 106 ## + Term 6833 - 6872 1.5 6 4 Tu 1 . + CDS 7417 - 8466 384 ## Bcav_1976 hypothetical protein 7 5 Tu 1 . 
- CDS 8745 - 9566 762 ## Lxx20890 TetR family transcriptional regulator + Prom 9537 - 9596 1.9 8 6 Op 1 . + CDS 9630 - 10214 952 ## COG1268 Uncharacterized conserved protein 9 6 Op 2 . + CDS 10290 - 11861 2026 ## COG0519 GMP synthase, PP-ATPase domain/subunit + Term 12106 - 12158 -0.1 - Term 11777 - 11801 -1.0 10 7 Tu 1 . - CDS 11933 - 12202 394 ## gi|293189090|ref|ZP_06607816.1| toxin-antitoxin system protein 11 8 Op 1 . + CDS 12250 - 12333 66 ## 12 8 Op 2 . + CDS 12315 - 12473 159 ## gi|93569008|gb|ABF13473.1| unknown 13 9 Tu 1 . - CDS 12801 - 13550 945 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain 14 10 Tu 1 . - CDS 13690 - 14538 1301 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 15 11 Tu 1 . + CDS 14883 - 15326 587 ## Elen_1414 hypothetical protein 16 12 Tu 1 . - CDS 15533 - 15631 60 ## + Prom 15537 - 15596 1.6 17 13 Op 1 . + CDS 15630 - 16079 611 ## BDP_0986 hypothetical protein 18 13 Op 2 . + CDS 16081 - 16740 915 ## COG2323 Predicted membrane protein + Term 16904 - 16949 2.1 19 14 Op 1 . + CDS 16967 - 18520 1479 ## COG0477 Permeases of the major facilitator superfamily 20 14 Op 2 . + CDS 18553 - 19215 486 ## SMU.2123 hypothetical protein 21 15 Op 1 . + CDS 19348 - 19893 539 ## COG4283 Uncharacterized conserved protein 22 15 Op 2 . + CDS 19978 - 20070 101 ## + Term 20149 - 20191 -0.2 23 16 Tu 1 . + CDS 20197 - 20397 104 ## + Term 20615 - 20645 2.1 24 17 Tu 1 . - CDS 20294 - 20503 169 ## 25 18 Op 1 . - CDS 20768 - 21097 344 ## Apar_1167 GCN5-related N-acetyltransferase 26 18 Op 2 . - CDS 21188 - 21703 584 ## Mthe_0610 pyridoxamine 5'-phosphate oxidase-related, FMN-binding 27 19 Tu 1 . + CDS 22114 - 22923 781 ## COG3315 O-Methyltransferase involved in polyketide biosynthesis 28 20 Tu 1 . + CDS 23048 - 23593 229 ## PROTEIN SUPPORTED gi|229236145|ref|ZP_04360568.1| acetyltransferase, ribosomal protein N-acetylase 29 21 Op 1 . 
- CDS 23634 - 23918 302 ## COG3070 Regulator of competence-specific genes 30 21 Op 2 . - CDS 23934 - 24566 957 ## COG4832 Uncharacterized conserved protein 31 22 Op 1 . - CDS 24670 - 25791 1357 ## COG1397 ADP-ribosylglycohydrolase 32 22 Op 2 . - CDS 25813 - 26628 841 ## Ccur_10210 transcriptional regulator 33 23 Op 1 45/0.000 - CDS 26834 - 27589 1014 ## COG0842 ABC-type multidrug transport system, permease component 34 23 Op 2 . - CDS 27582 - 28559 1193 ## COG1131 ABC-type multidrug transport system, ATPase component 35 24 Tu 1 . + CDS 28481 - 28687 267 ## + Term 28728 - 28755 -0.1 36 25 Tu 1 . - CDS 28735 - 30105 1924 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase 37 26 Op 1 . - CDS 30341 - 30568 253 ## COG2026 Cytotoxic translational repressor of toxin-antitoxin stability system 38 26 Op 2 . - CDS 30663 - 31346 229 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 Predicted protein(s) >gi|319977377|gb|AEUH01000223.1| GENE 1 272 - 1450 1587 392 aa, chain + ## HITS:1 COG:Cgl2512 KEGG:ns NR:ns ## COG: Cgl2512 COG0045 # Protein_GI_number: 19553762 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, beta subunit # Organism: Corynebacterium glutamicum # 1 377 5 379 402 421 64.0 1e-117 MDLYEYQARELFRKHGVPVLDFELATTPGQARDAAERLLGAGASLLVVKAQVKTGGRGKA GGVKLARTPDEAREKAEAILGLDIKGHVVERLMIASGADIAAEYYFSILLDRSNRRHLAM CSREGGMDIETLAKERPEALARVPLDPAVGIDADVARRIVDEAGFDRAAGEAIAPVLRTL WEVYRDEDATLVEVNPLVASPDGSIWAVDGKVTLDDNARFRHPAHADLVDAAAQDPREAA AKEAGLNYVRLEGQVGVIGNGAGLVMSTLDVVAMAGEDLRMRPANFLDIGGGASAAVMAK GLGIILGDEQVRSVFVNVFGGITACDEVARGILGALEELGGAASKPLVVRLDGNKVAEGR AILAAAGHPLVHLEETMDGAARRAAELAAASE >gi|319977377|gb|AEUH01000223.1| GENE 2 1478 - 2395 1378 305 aa, chain + ## HITS:1 COG:MT0979 KEGG:ns NR:ns ## COG: MT0979 COG0074 # Protein_GI_number: 15840376 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Mycobacterium tuberculosis 
CDC1551 # 1 304 4 300 303 356 68.0 4e-98 MTIFINSDSRVLVQGMTGSEGRKHTARMLSAGTAVVGGTNPRKAGTAVAFDVVGYGPAAD RVRPGVAEVPVFGTVAEARAATGANVSVVFVPPAFAKDAALEAIDAGIETVVVITEGIPV QDSVVFVDRALEKGTRLIGPNCPGIISPAQSNVGITPADITGPGRIGLVSKSGTLTYQLM HELRDIGFTTCLGIGGDPVVGTTHIDALKAFEADPDTDLVVMIGEIGGDAEERAAAWISE NMTKPVVAYIAGFTAPEGKTMGHAGAIVSGSSGTAAAKAEALEAVGVRVGRTPSQTALIA RELLA >gi|319977377|gb|AEUH01000223.1| GENE 3 2610 - 4274 1361 554 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509657|ref|ZP_02045299.1| ## NR: gi|154509657|ref|ZP_02045299.1| hypothetical protein ACTODO_02190 [Actinomyces odontolyticus ATCC 17982] # 19 467 1 453 459 294 46.0 1e-77 MSDEIHPQSAPRKTTTTYVAQRRTIRIAVPDGWARGALAGVEAAFVGWALCLVAAFGAYM SVASNVWMKDFTPKDALGVGGDLWAAVLGGTSVVGGVPYRAAPTLMGVLVVCLLHLLLRS SFSYSRASLWLSIPAFAFTSWLLTALSAPHAELASIAPGALGLPFAASAWATVASFKVRA AQFHAARWLGGGVRTGVAWAGYLAAAGSVLAVVALVAGWGRIAGIHELLGASSAVDNALI VAAQAFFAPTVAAWALAWWAGPGFLVGADAVHSPSVVGEGPIPPFPLLGAVPTSAPGAWT ALVLVAVGAACGARLVRRYPCKTLGEQMGLALTASLVFDAVCAAWMWSATMSAGAVRMSV LGPNVGWTLLALTLEVPLVALVVTACAHPSTRQRASALLSGARGGARGGARGGTGPDWIG AEEDGSDRAAAQSGAGAAGAGDGEPGAGGTGAAGADAWEAGTGGTGDGTADDSGTGADGT GTWGSDADQADWEHSAAGRGEHSGAASGAEGEGRNGQPAGASGSDDGAPPASDGGAGVRG APQDGHRGSAGPND >gi|319977377|gb|AEUH01000223.1| GENE 4 4380 - 6464 2039 694 aa, chain + ## HITS:1 COG:Cgl0838 KEGG:ns NR:ns ## COG: Cgl0838 COG0138 # Protein_GI_number: 19552088 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Corynebacterium glutamicum # 381 694 240 520 520 358 62.0 2e-98 MSARPDDFESLDARPIRRALVSVYDKTGLIDLARALADAGVEIVSTGSTASRIAGAGLAV TPVDRVTGFPEIFEGRVKTLHPSVHAGILADQRKPGHRAQLADLGIAAFDLVVCNLYPFE RTAASGAGFDECVEQIDIGGPSMVRAAAKNHPSVAVVTSPARYAEVAAAAAGQGFTLEQR RGLAAEAFAHTAAYDLAIAGWFARSLGLEEVADDLDDACERHLDASDEAFLSSLGYIVGD DAEAPVEQGAAAGAGPAAQEGAAAGAGSQEPGSEARDGAPAPMPLYVAEAFERADVLRYG ENPHQGAAVYREIGDSDQAGAHGAAFSPEGGPDGAAPASGVGPDGVVAASGLGLVGAAPA 
PQGGLDGGTGPDPRGGAVLPGIANARQLHGKAMSYNNYTDGDAALRAAYDHERPCVAIIK HANPCGIAVGGDAAEAHRKAHACDPVSAFGGVIAVNRPVSVELARQIVPVFTEVVLAPAY EEGAVEVLSAKKNLRVLQVAPPAPGGYEFKQISGGVLIQQRDGLDAPGDSPQNWTLAAGA PADAETLADLEFAWRSVRAVRSNAILLAKDGASVGVGMGQVNRVDSCRLAVERANTLGAR STGDAAADAGADVDSAGGARAEEVVPAAPERRSVGAVAASDAFFPFADGLQVLIDAGVKA VVQPGGSVRDQESIDAANAAGITMYLTGTRHFAH >gi|319977377|gb|AEUH01000223.1| GENE 5 6555 - 6728 106 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLTGRRSPQVQWPVTALAGRARGVRLALRFPAAGPFRVRRAVLERETRIGVGNQLLG >gi|319977377|gb|AEUH01000223.1| GENE 6 7417 - 8466 384 349 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1976 NR:ns ## KEGG: Bcav_1976 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 26 306 31 305 336 77 31.0 7e-13 MSVPRIEIIRTTYADRHVAGSSPRMRIRRGFHIRPPSGDTEPWQEWQAVALGRVLAVNDT YRGRAVFVGSSALVLHGIPMWVANPDVTLWPYRRRRAYALATVEGTRTVVPQARVRITTV PPVHAPVVVRGGVQAETPVDAAVRLAQGSGVIEGAVAVAMVLHRLARFDRFDLSASRGRV EAVRESMLEALEQSGNRRWRNRARRLIASADGGCDSVLEAVVVWIVRTLTDSAVVTQFEV VADGERYFADIALPELRVIIELDGRGKLGDDLASFRLAQRRWMVRQQRIEEAGWRVLRLN WHDLNDFVVLRSRVARFLCAAGGVLMSSYEDLWQIPPFESNGSHRRFYA >gi|319977377|gb|AEUH01000223.1| GENE 7 8745 - 9566 762 273 aa, chain - ## HITS:1 COG:no KEGG:Lxx20890 NR:ns ## KEGG: Lxx20890 # Name: tetR # Def: TetR family transcriptional regulator # Organism: L.xyli # Pathway: not_defined # 3 74 2 72 203 77 62.0 6e-13 MSTGSAPDRGRITREQILAAAMALLDQRGLPDLTMRKLAAELGIRPSALYWHFPDKQTLL ARLADRIVGSAPTPPAGGPGADDGSGGGGSAARSGPPNGHGGAACSGAAAHPGSPGEDQG SAFSEAAAYPGGSGEDRGASYPRTAHAVPWERAMRASADTMRSRLLAHRDGAEIVSSSIA LGLTRLPLTAMIHEPLHRAGASAETIDIAARTLGHFLLGHAFDEQQAEAARVLGVEPALP RLRLDPASAEAAFRAGVDLIVAGTAVRIAADRA >gi|319977377|gb|AEUH01000223.1| GENE 8 9630 - 10214 952 194 aa, chain + ## HITS:1 COG:mlr7548 KEGG:ns NR:ns ## COG: mlr7548 COG1268 # Protein_GI_number: 13476269 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 24 194 17 189 190 88 40.0 7e-18 
MTTATASPALRRDRVLADRFGGTLAREAVLVVAGTLVLTLMARLSVPLPFTPIPVSLGTL GALSVGALGGARRSLLSVVLYTALGVAGAPVFAEDHVGWAFASFGYVLGYVLVAMIAGAA AERGWDRKPHTMLAASLVSLFSVYATGLAWMIPFTHMGLADGVAKGLVPFIVGDLLKAVV AAGVFPVVRSLTRR >gi|319977377|gb|AEUH01000223.1| GENE 9 10290 - 11861 2026 523 aa, chain + ## HITS:1 COG:Cgl0591_2 KEGG:ns NR:ns ## COG: Cgl0591_2 COG0519 # Protein_GI_number: 19551841 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Corynebacterium glutamicum # 199 523 2 326 326 473 74.0 1e-133 MNAPAPRPVLVVDFGAQYAQLIARRVREASVYSEIVPHTMGVADMLAKDPAAIILSGGPS SVYEAGAPSVDPAVFEAGVPVLGICYGFQAMAHALGGTVGRTGTREYGHTDAAVQEGSCL FDGTPTDQVVWMSHGDAVQAAPEGFAVTASTEQTPVAAFEDRGRRLYGLQWHPEVGHSQF GQDALRNFLYEGAGIEPTWTAGSIVDEQVARIRAQVGGAQVICALSGGVDSSVAAALVHK AVGDQLTCFFIDHGLLRAGEREQVEGDYARGMGIRVITCDESQRFLDALAGVTEPETKRK IIGREFIRSFEAAQKQVVEQVGASGGEVKFLVQGTLYPDVVESGGGEGAANIKSHHNVGG LPEDMAFELVEPLRTLFKDEVRAVGRELGLPDYLVNRQPFPGPGLGIRIIGEVTRERLDI LRAADLIAREELTAAGLDQEIWQCPVVLLADVRSVGVQGDGRTYGHPIVLRPVSSEDAMT ADWTRLPYDVLARISTRITNSVPEVNRVVLDVTSKPPATIEWE >gi|319977377|gb|AEUH01000223.1| GENE 10 11933 - 12202 394 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293189090|ref|ZP_06607816.1| ## NR: gi|293189090|ref|ZP_06607816.1| toxin-antitoxin system protein [Actinomyces odontolyticus F0309] # 13 89 3 79 79 108 68.0 1e-22 MPTTARSAEIKTRTTAQIKADASKVYAHWGISLSDAINIFLTKSIEVGGLPFDMRPSVPS FDSLERYAFHPRSDPNGVTTLPEDWDEDE >gi|319977377|gb|AEUH01000223.1| GENE 11 12250 - 12333 66 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGMRLLRLGGVAVVGCEPTTATWVRFI >gi|319977377|gb|AEUH01000223.1| GENE 12 12315 - 12473 159 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|93569008|gb|ABF13473.1| ## NR: gi|93569008|gb|ABF13473.1| unknown [Arcanobacterium pyogenes] # 3 49 4 50 87 70 78.0 2e-11 MGSIHLGAIALLSESEGVFTTAQAERMGVPRDALHDAVESGRLERIMRGPIA >gi|319977377|gb|AEUH01000223.1| GENE 13 12801 - 13550 945 249 aa, chain - ## HITS:1 COG:MA0416 KEGG:ns NR:ns ## COG: MA0416 COG1917 # 
Protein_GI_number: 20089309 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Methanosarcina acetivorans str.C2A # 128 248 23 141 141 129 52.0 7e-30 MPIKQTAGRDRLGGFAPEFARLNDDVLFGEVWSREGALPLKLRSIVTVSALIGKGITDGS LRHHLEFARANGVGRSEMAEILTHIAFYAGWPNAWAAFTMAAEVYGEAGAEESGHGGLFG MGAPNDGYAQYFSGRSWLKRVSREGDYLPVFNVTFEPGYRNNWHVHHAGSGGGQVLICVD GEGWHQEEGKPAQRLRPGDVVEVPANVKHWHGATTGSWFSHLAFEFPGEDTSTEWLEPVD DEAYAALEG >gi|319977377|gb|AEUH01000223.1| GENE 14 13690 - 14538 1301 282 aa, chain - ## HITS:1 COG:TM1009 KEGG:ns NR:ns ## COG: TM1009 COG0656 # Protein_GI_number: 15643767 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Thermotoga maritima # 4 268 6 278 286 217 39.0 2e-56 MNHITLNNGVEMPLVGLGTYQLSPDEAQASVAFALDNGYELIDTANVYVNERGVGRGMRA STKAREDVFLETKLWPSFYGSDTAVDETLERLGVAYIDLMILHQPAGNIRAGYRKLEDAY RAGKLRSIGISNFNAAEVQRLVDECEVVPALIQTECHPYFPQTELKELLARHNIALQAWY PLGGRGNDSIMGEPLIGELAKAHGKSPAQVILRWHVQQGNIVIPGSKTPSHIAQNIELFD FALTDEEMARIAALDRGDTIYHRTPELLDQYLHFVPDVDGQK >gi|319977377|gb|AEUH01000223.1| GENE 15 14883 - 15326 587 147 aa, chain + ## HITS:1 COG:no KEGG:Elen_1414 NR:ns ## KEGG: Elen_1414 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 6 132 11 135 158 110 47.0 1e-23 MKKHNLIVVAGVVWMAAGANVAVLGARAATGMSGTALAAAVALIAGAVATFLAFHMIFSR LVVKNSRRIRSLEGDRHNPLRFFDARGYATMAVMMSFGIGMRAAGIFPDWFVAFFYTGLG LALALAGVSFLLHRARGEGWVFHARRG >gi|319977377|gb|AEUH01000223.1| GENE 16 15533 - 15631 60 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRRPLRFRLAWIGGSVPPPPAPVTHRRTPGIL >gi|319977377|gb|AEUH01000223.1| GENE 17 15630 - 16079 611 149 aa, chain + ## HITS:1 COG:no KEGG:BDP_0986 NR:ns ## KEGG: BDP_0986 # Name: not_defined # Def: hypothetical protein # Organism: B.dentium # Pathway: not_defined # 1 147 1 147 149 216 77.0 2e-55 MDFYTLDYIISHQSTDSARRVAAILFLLLVGLVFSALYLRDKVRTRWRDAGVGAIVLSLV 
LLGIQTEQYLQVTSQISQSQLLVHFVEGVAMDHNVPASEVLVNSTSLEDGIIVRFNAEDY LVHLNDNNNSYTLERTHIIDHNVYVNGEH >gi|319977377|gb|AEUH01000223.1| GENE 18 16081 - 16740 915 219 aa, chain + ## HITS:1 COG:L163025 KEGG:ns NR:ns ## COG: L163025 COG2323 # Protein_GI_number: 15674091 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 1 213 1 211 211 171 38.0 1e-42 MDFYTLTAIKFALGILTMIFQINILGKRDFSLNTPLNQVQNYVLGGIIGGVIYNSSITLL QFLIIILIWSLVVIATKILIDYSKIFKKLTACQPELIIRDGQVDIAHCAKVGLTAQSLSR SLRDQGISSVADVEAAVMETNGSLTIRVRGAESTHSLLPLVSDGQLVPSGLTLVGRDAAW VKEQLKAQGYSAVKQVFLAELIDGQLEIVPFSKAARPKA >gi|319977377|gb|AEUH01000223.1| GENE 19 16967 - 18520 1479 517 aa, chain + ## HITS:1 COG:mll3424 KEGG:ns NR:ns ## COG: mll3424 COG0477 # Protein_GI_number: 13472962 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Mesorhizobium loti # 37 508 36 500 524 134 27.0 6e-31 MHGKEMVRMPAGDCGSREARGGNEAKRAWRALGGLCLGTFVCFTNTTAFNIALPAISVSL GAGQVEQQWMVSAYNLVFGAFMLLAGSLGDRWGVKRTLLVGASAFAAASAAGVLATSSVT TIAARGVMGFGAGFSITMTLALITRLFESDGQVLALTMNAVAMTLGAPFGLVLGGLLVNL ADWRAIFVFDLVAFLIVLLLDILLLPGDRSQSGGTPGARAPLPLLSALLALAGLALVSAG LINAQISAASPDSWGPMVAGALVIAVFVRRDTRSSNPLAELGLLRIPSFAAASLSLLALN VAISGIMFVLPAYVETALGNNALVGALMLMPMVAAAMVGAAITRKASDRLGKRRACVASL VLIALGLAVMAASTAVAGYPLMVVGQCACGLGMAMGLPVMQGWAMERVPDRRRGGGSALV STFQQLGCLIGIGALGSLVGSSYAAACRGTAAEGFASISLAFEAADAQGASQAQAVRTAA SSAYAHAVLVAFCAAAVVLLLIAALAAFGSRSEAGEQ >gi|319977377|gb|AEUH01000223.1| GENE 20 18553 - 19215 486 220 aa, chain + ## HITS:1 COG:no KEGG:SMU.2123 NR:ns ## KEGG: SMU.2123 # Name: not_defined # Def: hypothetical protein # Organism: S.mutans # Pathway: not_defined # 1 203 1 203 207 203 49.0 3e-51 MGSRPSEQKRRYRLAQGVLVAAVLVYAAVRIGFLCFSLQMASAASLRQAVDQHSGLLMVM DELLVLSAVLGGWAFYNALESRARSGRALTGLAVASFALMLIGWVVVVLGTGRLVYPVNG 
LPVVADDGLQAAVAEIYGAWHLCDIMLGVLMISWAGSSCRPLWKCSGAAIGLAQILSTYF GQPVKPVPMLIAIALSTAWMLMRGVSSNNGLESEAVQEAE >gi|319977377|gb|AEUH01000223.1| GENE 21 19348 - 19893 539 181 aa, chain + ## HITS:1 COG:FN1248 KEGG:ns NR:ns ## COG: FN1248 COG4283 # Protein_GI_number: 19704583 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 177 25 201 204 218 56.0 4e-57 MARPTSKDELIVASARGYGKLAGFAASMTEGELAAPFDFSADKGKKEAHWSRDENLRDVL VHLHEWHRLLLEWVNANAAGEGRPFLPEPYTWRTYGDMNVVLWRKHQETPLEEARGLLEE SHNAVMALAEGFSDEELFHKGRFDWTGTTTLGSYFVSATSSHYDWALKKLRAHRRNCARR G >gi|319977377|gb|AEUH01000223.1| GENE 22 19978 - 20070 101 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKGRRRLSGHHWIIAGGGITMAAGKLFVF >gi|319977377|gb|AEUH01000223.1| GENE 23 20197 - 20397 104 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLSLGRIRGLSGRLVQNASESARNRLNPAKTVKLRRFLDHPRAGPSENSETQTHFGPRR RPGEGR >gi|319977377|gb|AEUH01000223.1| GENE 24 20294 - 20503 169 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPLRHSTLTPSCCGVPNRAVTSAFRAGSAFGSAFSLTGPPQGAVVVQNASGFRCFRRAQH GGGPKSVSV >gi|319977377|gb|AEUH01000223.1| GENE 25 20768 - 21097 344 109 aa, chain - ## HITS:1 COG:no KEGG:Apar_1167 NR:ns ## KEGG: Apar_1167 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: A.parvulum # Pathway: not_defined # 5 109 186 291 291 89 48.0 3e-17 MNAVETLARFFQAENDRDWEAYRTFLHEDVVWHLHGEEDRTIRGVEEYLRAIRDAYAGSG ARFRVEEAHEAACGIRVVVLLVDDAGKMSFEVFDFEDGLIRREHEFLLG >gi|319977377|gb|AEUH01000223.1| GENE 26 21188 - 21703 584 171 aa, chain - ## HITS:1 COG:no KEGG:Mthe_0610 NR:ns ## KEGG: Mthe_0610 # Name: not_defined # Def: pyridoxamine 5'-phosphate oxidase-related, FMN-binding # Organism: M.thermophila # Pathway: not_defined # 24 171 167 314 323 98 39.0 7e-20 MDGSAIDWEAAANRWLDQEADEARMPEGELIAEIEEFLGRHKICALATAGAGIVRNTTVE YVHAEGAFWIVSEGGLKFRALRENSNVCLAVHDDDISFDTLAGLQVTGTAEVLESFGAEY ERACRLRGIPVERLRQLPFVMNIIKVTPTRYDYVSGRLKERGFSARQHVDL >gi|319977377|gb|AEUH01000223.1| GENE 27 22114 - 22923 781 
269 aa, chain + ## HITS:1 COG:FN0984 KEGG:ns NR:ns ## COG: FN0984 COG3315 # Protein_GI_number: 19704319 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: O-Methyltransferase involved in polyketide biosynthesis # Organism: Fusobacterium nucleatum # 1 254 1 257 269 162 34.0 7e-40 MPELNELSKTLYIPLVGRIYASRHYPSMIHDPKALGLESSLPADAATMNRGQNEYTMVAS VARSVNMDRRIRQFLAVHPDAAVISVGCGLETTYWRCDNGRALWFELDLPEVINVRGELL SPGQRQVLIPGDMFDYSWIERVKEYGERPTIVVVSGVFYYFEEDRVIDFINHLAAFEAVR VVFDSTSSKGLKLSQAYVKRMKNGSTMHFSVDDPRQFVGRLTGGARLVEHVPYYRDIERR GVGLVPRAYMRVSDMFHMVSMTIIDMGRP >gi|319977377|gb|AEUH01000223.1| GENE 28 23048 - 23593 229 181 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229236145|ref|ZP_04360568.1| acetyltransferase, ribosomal protein N-acetylase [Chitinophaga pinensis DSM 2588] # 9 177 5 174 181 92 37 2e-18 MRYAETVVLKGGAELLVRNAVASDARALRETVRRTHSETDYLLSYPDEQGTDEEQEARLL EGTEGSGNEVELVAIVDGRIVGTAGVTAVGGRRKVGHRARFGISVLKEQWGMGIGRVLME ACVDCARRAGYAQLELEVVADNERAVSLYRRAGFEEYGRNPRGYRSSSTGFQELVHMRLE L >gi|319977377|gb|AEUH01000223.1| GENE 29 23634 - 23918 302 94 aa, chain - ## HITS:1 COG:SP0951 KEGG:ns NR:ns ## COG: SP0951 COG3070 # Protein_GI_number: 15900829 # Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Streptococcus pneumoniae TIGR4 # 1 72 1 74 75 68 50.0 2e-12 MASTPEYLAHVLDLLGRVEGVTHRAMMGEYLLYARGTLFGGVYDDRFLLKETPASSAALA AQCSPYPGAKPMRLVDVEDRDALADLVERVRAEL >gi|319977377|gb|AEUH01000223.1| GENE 30 23934 - 24566 957 210 aa, chain - ## HITS:1 COG:lin2189 KEGG:ns NR:ns ## COG: lin2189 COG4832 # Protein_GI_number: 16801254 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 4 207 7 207 208 195 48.0 5e-50 MAFDFKKEHKELYVPPKRPGIVTVPPMNYLAVRGHGDPNAEEGAYKRAIELLYGVAYAIK MSKKGTHRIDGYFDFVVPPLEGFWWQQDLHDVDYARKEDFDWISLIRLPGFATAQEVEWA KAEAASKKGGDFCEVEFLPYDEGLCVQCMHVGPYDDEPATVEAMHAHMEAQGYALDITDE RHHHEIYLGDARKVAPEKLRTVVRHPIKAV >gi|319977377|gb|AEUH01000223.1| GENE 31 
24670 - 25791 1357 373 aa, chain - ## HITS:1 COG:MA4144 KEGG:ns NR:ns ## COG: MA4144 COG1397 # Protein_GI_number: 20092937 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ADP-ribosylglycohydrolase # Organism: Methanosarcina acetivorans str.C2A # 5 340 9 308 314 113 30.0 7e-25 MTNIDKYRGCLIGGAAGDALGYAVEFSREEQIAARYGAGGIRDYQLDDRGLAPFSDDTQM TLYTANGLLHALTRGRLRGILGPLRDYVSGFYVEWSRTQTEPYPLADHSAWISALPELFV PRAPGATCLSACAAGAHGTPEEPINDSKGCGGIMRVAPAGLLAGPRGADGADVQRLGAQL AALTHGHELGWLPAGVFAHIVSVLAHGEADSVRGAAEQALAVLPEAHPGARRVGELQELM RLVLRLADGTAPDVEAIHQLGEGWVAEEALAIALLCALRHEDDFEGTVVSAVNHSGDSDS TGAIAGNIVGARLGLAGIPRRYTERLELADVVLALADDLFTGCPITEYCPSPDGVWEHKY LLPADYVEWERGR >gi|319977377|gb|AEUH01000223.1| GENE 32 25813 - 26628 841 271 aa, chain - ## HITS:1 COG:no KEGG:Ccur_10210 NR:ns ## KEGG: Ccur_10210 # Name: not_defined # Def: transcriptional regulator # Organism: C.curtum # Pathway: not_defined # 1 270 5 271 272 268 55.0 1e-70 MPGGAPRARTLTRKRIFSAALALVDEEGLAALSLRALGKRLGVSQTAFYRYVPDKAALLE GVSEEVWRLTFDRFLSAIEEEPETDDQSGAEGEREENGRPEDRDQPSACGEPTADTQSSP GSGPGLVGARGEDWRWYVRQYATALHDTLLQHPEAVVLLLTHPISTPEQLTLLAKVMVRL SNASFAPPIDMLAIITAVSVYTTGFAAAEVVPPVGGGPDETSDESITAAIASLPADDLSA LRGLIGQVMDGEWDFTAQFERGLDALLRGWR >gi|319977377|gb|AEUH01000223.1| GENE 33 26834 - 27589 1014 251 aa, chain - ## HITS:1 COG:CAC2930 KEGG:ns NR:ns ## COG: CAC2930 COG0842 # Protein_GI_number: 15896183 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Clostridium acetobutylicum # 90 234 223 370 388 61 26.0 2e-09 MLKKLSAVIRLNIRLQLTDPASTLILTAIPLIFIPFMSPAFKSMLVADGYADATGVEQAV PSMAVLFSFLSVQIIINSFINERMWGTWPRLAASATPRGVVLTGKALVAYLLQFVQVLAV LVLGALAFGYRPKGSVIALLLAVLAFSAVLAALGVAIALWAPSQELALSLSNIIGMLAAG IGGAFCTVSSLPDWAQRAARFSPAYWAIDAIHSVSLDGAGVAGVLPAVGVLALFFAGLAA LAMARLAVKRD >gi|319977377|gb|AEUH01000223.1| GENE 34 27582 - 28559 1193 325 aa, chain - ## HITS:1 COG:BS_yfiL KEGG:ns NR:ns ## COG: BS_yfiL COG1131 # Protein_GI_number: 
16077898 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Bacillus subtilis # 27 238 2 215 311 187 42.0 2e-47 MQSSAKLTLLTIEGADMPGSESGAPALRLRAVRKSYGDHEVLRGVDLDVARGQVLGLLGR NGAGKSTLIEIMCGLRGADSGSVSVCGADPAKERVGQHIGYAPQDLGVYPDLTVAQNLSF YGQLQGLSRKRARTRTAEVMELLGLEEQCSKRAKHLSGGQRRRLHAGMAIMHEPEVVFMD EPTVGADVEARSRILRAVRSLAEAGAAVVYTSHYLAEFEELGADIAILEGGRIVVSGTLE RVVADHSRASIALGFARNPRPMDGWRNEDRWLRREGDIPNPGALIADALASPALRGNRLE DVRIGQAGLQSAYLAIVGEEEQDNA >gi|319977377|gb|AEUH01000223.1| GENE 35 28481 - 28687 267 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLARPILIQACLLLQWLTVLTSLTTAKLTVLTHGCKPGPRGGGNTAPAPWERCGRRAGIG GPLRRPVG >gi|319977377|gb|AEUH01000223.1| GENE 36 28735 - 30105 1924 456 aa, chain - ## HITS:1 COG:CAC0326 KEGG:ns NR:ns ## COG: CAC0326 COG2256 # Protein_GI_number: 15893618 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Clostridium acetobutylicum # 1 437 1 434 443 576 62.0 1e-164 MAGQQGSLFDEAADEARPLADRMRPASLDEVVGQSHQIGPGKALRSMIEADRTPSMIFWG PPGVGKTTLARVIARRTCASFIDFSAVTSGIKEIREVMRQADAQASTGRRTIVFVDEIHR FNKAQQDAFLPFVEKGSIILIGATTENPSFEVNSALLSRCKVFVLNALAEDDITALLRRA LADPRGFGEQDVRIEDDLLRAIAVFANGDARVALSTLEMAFLNGDEEGGVTRVSAETVEQ CTSRRSLLYDKDGEEHYNIISALHKSMRNSDPDAAVYWLARMLEGGEDPLYVARRITRFA SEDVGLADTNALNVAVNAFHACHFIGMPECSVHLAEAVIYLSLAPKSNSSYTAYGRARRD ALRTQADPVPLAIRNAPTRLMKDLGYGQGYRLAHYEADKVAADMRCLPDSLAGAEYYRPT EEGNERRFKERLEWLKRLREQARAATPGASADDEEA >gi|319977377|gb|AEUH01000223.1| GENE 37 30341 - 30568 253 75 aa, chain - ## HITS:1 COG:FN0497 KEGG:ns NR:ns ## COG: FN0497 COG2026 # Protein_GI_number: 19703832 # Func_class: J Translation, ribosomal structure and biogenesis; D Cell cycle control, cell division, chromosome partitioning # Function: Cytotoxic translational repressor of toxin-antitoxin stability system # Organism: Fusobacterium nucleatum # 10 70 24 84 90 73 52.0 7e-14 
MPASAGPSRARLILSWIGKNLEGRTDPRAHGKGLTASRSGEWRYRIGSYRALCVIEDDRL IIEAFSVGHRRNTHS >gi|319977377|gb|AEUH01000223.1| GENE 38 30663 - 31346 229 227 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 204 12 221 318 92 30 2e-18 MSARGLAVRFGERTVLRGADADLREGEVCAVIGANGAGKTTLCRALCGFERRARGTVSLA GRAASRSARVRASSMVFQDVNYQLFAESVADEVVFGLPRRDARAVDVAALLVGLDLEGLE ERHPATLSGGQKQRLAVAACVAAHKQVLVFDEPTSGLDLDGMRRVARLLRELAAQGRTVL VITHDLELVACACDRALVVEGGRVGATMLVADHFDAVKRAMGASREH Prediction of potential genes in microbial genomes Time: Thu May 12 18:55:40 2011 Seq name: gi|319977365|gb|AEUH01000224.1| Actinomyces sp. oral taxon 178 str. F0338 contig00224, whole genome shotgun sequence Length of sequence - 12844 bp Number of predicted genes - 17, with homology - 10 Number of transcription units - 14, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 34/0.000 - CDS 1 - 826 530 ## COG1122 ABC-type cobalt transport system, ATPase component 2 1 Op 2 . - CDS 823 - 1515 716 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 3 2 Tu 1 . - CDS 1637 - 2359 705 ## Elen_0329 hypothetical protein + Prom 2253 - 2312 3.4 4 3 Tu 1 . + CDS 2400 - 3383 959 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 3430 - 3484 0.4 + Prom 3704 - 3763 3.8 5 4 Tu 1 . + CDS 3784 - 4749 514 ## COG3177 Uncharacterized conserved protein + Term 4762 - 4808 -0.2 6 5 Tu 1 . - CDS 5211 - 5474 308 ## 7 6 Tu 1 . - CDS 5761 - 5910 232 ## 8 7 Tu 1 . + CDS 6323 - 6442 299 ## + Term 6592 - 6655 2.0 9 8 Tu 1 . - CDS 6714 - 6788 73 ## - Prom 6994 - 7053 3.7 10 9 Tu 1 . + CDS 6790 - 7074 449 ## + Term 7236 - 7274 3.1 11 10 Op 1 . - CDS 7511 - 8362 1051 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 12 10 Op 2 . - CDS 8359 - 9372 1444 ## COG0604 NADPH:quinone reductase and related Zn-dependent oxidoreductases 13 10 Op 3 . 
- CDS 9411 - 9827 651 ## COG1959 Predicted transcriptional regulator - Prom 9853 - 9912 2.9 14 11 Tu 1 . + CDS 9763 - 9903 174 ## 15 12 Tu 1 . - CDS 9935 - 10951 1492 ## LEUM_1079 peptidase 16 13 Tu 1 . - CDS 11136 - 11981 902 ## COG0500 SAM-dependent methyltransferases 17 14 Tu 1 . - CDS 12284 - 12844 12 ## Predicted protein(s) >gi|319977365|gb|AEUH01000224.1| GENE 1 1 - 826 530 275 aa, chain - ## HITS:1 COG:SP1438 KEGG:ns NR:ns ## COG: SP1438 COG1122 # Protein_GI_number: 15901290 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, ATPase component # Organism: Streptococcus pneumoniae TIGR4 # 24 245 19 240 374 157 34.0 2e-38 MIEFDHASFTYRAQVDEGRAGAVDVSLTVRSGEVVVLCGRSGCGKSTALRLAGGLAPRFF PGTAHGRVSLDGRAVDALETWEIAQRAGSLFQNPRTQFFSADSTGEVAFALENAGWPPED IRQRVGATFEELGMSALAGRDLFRLSGGERQKVAFASLWAVRPANLLLDEPTSNLDLPAV ADMAGFVARAKEAGCAVLVAEHRLSWLTGIADRYVRMEAGRVDRVWGADEFAALSAEEVR RMGLRMRSADEARPVIRPPVRGAPNAGEPAHGVPG >gi|319977365|gb|AEUH01000224.1| GENE 2 823 - 1515 716 230 aa, chain - ## HITS:1 COG:SP1437 KEGG:ns NR:ns ## COG: SP1437 COG0619 # Protein_GI_number: 15901289 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Streptococcus pneumoniae TIGR4 # 83 227 7 147 147 67 27.0 2e-11 MHLDPRTKLLSIAAMAVAVALAPSTSCEVALMCVASAFGVVVGVRRGSAVALVLGISAVV AARFVPDLDGVALRTALTAFLVLVRKMFACGLIAYATVRTTPVSEFMGALVKARAPRSLV VPLAVALRYLPVVREDWRAITTAMRMRGVSPSPAGFLRAPMRTIDCVYAPLLLGAGRVAD ELAVASIARGIENPVRRTCYLPIAMGGADYAVLAVFGATAVGSLALRALA >gi|319977365|gb|AEUH01000224.1| GENE 3 1637 - 2359 705 240 aa, chain - ## HITS:1 COG:no KEGG:Elen_0329 NR:ns ## KEGG: Elen_0329 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 38 240 1 203 203 242 64.0 7e-63 MRIGRSSPADHASPDPTHILTKKREGDPNLIWKDGGTVKHSTDTKPTGLDTRDLVTTGVF TALYFVFTMVGGGLFATNPVLTSWMPAAAALLTGPVYLLLIARVPKHGPLIILGAIEGII 
LFVTGMYWGWSVACVVLAVLADLIAGLGGFRSRALAFCAFVVYSLAPMGSYLALWVDPAA YASYLTGKGGEQAYMDTMMATATGWMLPAMVLSTIACAVVSGLVGLRLLRRQFERAGITA >gi|319977365|gb|AEUH01000224.1| GENE 4 2400 - 3383 959 327 aa, chain + ## HITS:1 COG:SP1433 KEGG:ns NR:ns ## COG: SP1433 COG2207 # Protein_GI_number: 15901285 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Streptococcus pneumoniae TIGR4 # 41 320 44 315 330 81 23.0 2e-15 MKEAFQAVLLAGASFAPAEPPDGFGPLWNSYKSDGQGCRGWFHHLSPERAEWSISIHDFV MEGDFVMDSAPHRYLTVTWFKSICGEEFQPYRRLRPGAVWGLSIGGGPWRGVAHGSSPVQ CVSIEVTPGFAQNCLAGELAGDDGAGAREEEVERAFTALGDSGPFPEMSALLSGLWPRPR DGARSALHYEGKVLEAMGLIVERSRSAATGEAKPVAAVDHERMREVALYIDDHCSARLRL ADLAAIACMSPTKFKETFKRVNGKTLTQYVQERRMSHAEALLRHSDLTIEQVGRAVGYTC PSRFSALFKREVGVRPSDLRKALSAWP >gi|319977365|gb|AEUH01000224.1| GENE 5 3784 - 4749 514 321 aa, chain + ## HITS:1 COG:FN0971 KEGG:ns NR:ns ## COG: FN0971 COG3177 # Protein_GI_number: 19704306 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 5 315 7 328 330 268 41.0 1e-71 MVYEPPFERTNAIDALCMEIAELVGMLSPQAPLATNPTLRRELRIQTIYSSLMIEGNSLD ERAVSAVLDGQRVLGDPRDILEVQNARAAYDLIPDLDPHSVDDLLRVHRVMMDGLVADAG RFRSGNVGVFDGGVLIHAGTPATYVSEVMGDLFNWLRSTSMHPLLASCVFHFEFEFCHPF SDGNGRTGRLWHTLLLSKWRPVLAWLPVESTIRRKQAGYYAALAQSGASGSSERFVEFML EAIHESVLPFAKPTSERGVARSRALGFFLENSRATVAQLGEYLGCSKRSAERVVAQLKEE GTLVRQGSARAGVWVVRDPGL >gi|319977365|gb|AEUH01000224.1| GENE 6 5211 - 5474 308 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLWRPFNDEHSAELKNRVFPDPQGPSIAIAYLRSLSSTYAQISLSNHCSDSFSLITDSS YRFTGCCELSSIFSDHSGPIAQSAYVR >gi|319977365|gb|AEUH01000224.1| GENE 7 5761 - 5910 232 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIVRVHLPHELFANIRVERATITRIITKTLRNKHNLYPSPLNQLIYGTE >gi|319977365|gb|AEUH01000224.1| GENE 8 6323 - 6442 299 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVGFAAILKRGTQTREIRRLSLISNLSLSRRWAHCLTKT >gi|319977365|gb|AEUH01000224.1| GENE 9 6714 - 6788 73 24 aa, chain - ## HITS:0 COG:no KEGG:no 
NR:no MLNISSAEPPLETHLATFSARSLA >gi|319977365|gb|AEUH01000224.1| GENE 10 6790 - 7074 449 94 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSESLTFGVQGLLSVEPSHVASAYFNLAPEGQEELRRVVHRLETEVPIPDGAKGRLNVWL SVLKNALTENDDTDRMTRVRRGWFIQTIDRVLGP >gi|319977365|gb|AEUH01000224.1| GENE 11 7511 - 8362 1051 283 aa, chain - ## HITS:1 COG:MT3266 KEGG:ns NR:ns ## COG: MT3266 COG0596 # Protein_GI_number: 15842754 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Mycobacterium tuberculosis CDC1551 # 2 276 26 298 300 233 45.0 4e-61 MSYLTTPTSFITVDGTKIAYRELSPGASRLPLVMLVHLAANLDNWEPRLVDQLANDRHLI LLDLPGVGASGGTVPATIEEAGEQAARIVRGLGHDRVDLLGLSMGGMIAQEVVRHDPELV ERLILVGTGPRAGLGIDKVTPTTFRHMFRAALKHKDPKRYIFYTRDARGEAVANEVLGRL GSRTAQYADKAVTVPSFLRQLRAIKRWGASPADDLSHITMPTLIVNGDTDDMVPTPNSYD MHERIADSRLVIYPHAGHGSLFQHAEEFAAEVNGFLADTAANS >gi|319977365|gb|AEUH01000224.1| GENE 12 8359 - 9372 1444 337 aa, chain - ## HITS:1 COG:AGpA656 KEGG:ns NR:ns ## COG: AGpA656 COG0604 # Protein_GI_number: 16119675 # Func_class: C Energy production and conversion; R General function prediction only # Function: NADPH:quinone reductase and related Zn-dependent oxidoreductases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 331 26 355 359 224 41.0 3e-58 MKAAILTRYDKNATGLEVRDIPIPTPSADEVLVRIHTSAVNPLDNMISRGEVKMITPCTL PLVMGNEFSGVVEEVGPSVRAFEVGDRVYGRMPLSKIGAFAEYAAVDAGALAHVPDYLSL EEAATVPLTALTAMQAFELMDAQPGQTVFISGGTGSLGAMAVPVAKRLGLTVATNGNGAN EERVRALGADVFIDYKKQNYTDVLSDVDLVLDTLGESELPSEFAVLKEGGHLVSLRGMPN GRFARRMGMPWYKRMLLSFTGRKYDAMAARKHQTYDFIFVHEDGAQLERIASLFPADRPL TASIDSTFTLDQINDALAKVRAGGSQGKTIVRIGDFQ >gi|319977365|gb|AEUH01000224.1| GENE 13 9411 - 9827 651 138 aa, chain - ## HITS:1 COG:SP1636 KEGG:ns NR:ns ## COG: SP1636 COG1959 # Protein_GI_number: 15901472 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 3 138 5 143 145 89 39.0 2e-18 
MDTRFSSAIHALILISEAERPINSTAIAASVGTNASYIRKLTTRLAKAGFIASSRAAGGF TLTVPAQQISLRDIYCAVTDTDRIQLFDIHGNPNDECVVGAHIRPTLTDVFARQQEALER ELTTTTLADCIASMRSRL >gi|319977365|gb|AEUH01000224.1| GENE 14 9763 - 9903 174 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGRSASEMRISAWIAEENRVSMTFIRVCIRYRYSKGVLRNCQSMAL >gi|319977365|gb|AEUH01000224.1| GENE 15 9935 - 10951 1492 338 aa, chain - ## HITS:1 COG:no KEGG:LEUM_1079 NR:ns ## KEGG: LEUM_1079 # Name: not_defined # Def: peptidase # Organism: L.mesenteroides # Pathway: not_defined # 77 331 45 272 277 119 31.0 1e-25 MNIRILRPSIGRATLIGSALAVSVSLLSAVGLASHAFDGAPTERDPSPIAYEDDEESSLS VLTAFSATASQLHGSIDTVAYQQAYRGTTYDKEAYVYVPDTYSPDSPANIVYLTHGWQGS AEGLAEGIAPVVDHLTDSGQLSPTLVVFATYYPDRSFATADYEDDYELNRFFATTEIDTV IDRVESTYTTFAHGDTSDQSLRDSRTHRAFGGFSMGATTTWDVFSMRPQYFYGYMPMAGE SWIGRATDADLDQIAQLLTAGSERADYGAGDFRILASVGSDDPALGDMSPQLDELRLDYP ELMTPDSLQLWVDEGETHSMESVQNQVAHSLPLLLPPA >gi|319977365|gb|AEUH01000224.1| GENE 16 11136 - 11981 902 281 aa, chain - ## HITS:1 COG:PA4800 KEGG:ns NR:ns ## COG: PA4800 COG0500 # Protein_GI_number: 15599994 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Pseudomonas aeruginosa # 14 186 11 185 266 95 33.0 9e-20 MSEPTGRARWFSDNAKNWDDRAELHMAGDYCGYQRLLADPAAISAELAQDIRRFGDLTDR HVIHLQCHVGTDTIGFARLGASRVVGLDLSEASLAHARSIAERAGAGIEYVHANVYDARR AVSGDFDLVYTSIGVLCWLPDVGEWARVVASLLKPGGTFFIRDDHPMFMAIGEDVSDGLK VEQPYFQLEEPATWDDDSSYVDTPGAPRIAHTTNHQWNHSLGQTITALIDAGLVIDSVEE AARAAWCPWPRLMEQDSDGGWRLRDRPERLPLQFAITAHKS >gi|319977365|gb|AEUH01000224.1| GENE 17 12284 - 12844 12 186 aa, chain - ## HITS:0 COG:no KEGG:no NR:no DRTARRARPARIRPGGQGQGGGPGPQPQQSLRARPARRGPDQSDRPRLAGPEAPPTGRAR KGRRTRPTGPSRTPANPGSPAHRTDKEREEDPTHRAQPAPANHAKRGGDPTHNPNPRTRP PGGSIPPTRVFPSRRGIDNVYGPTGRTHGPADSSPAAADPAAGRRARPDEHGASAHRARV FPGRGV Prediction of potential genes in microbial genomes Time: Thu May 12 18:56:28 2011 Seq name: gi|319977352|gb|AEUH01000225.1| Actinomyces 
sp. oral taxon 178 str. F0338 contig00225, whole genome shotgun sequence Length of sequence - 13120 bp Number of predicted genes - 13, with homology - 7 Number of transcription units - 10, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 372 131 ## 2 1 Op 2 . - CDS 353 - 1054 761 ## gi|293190865|ref|ZP_06609027.1| hypothetical protein HMPREF0970_01360 - Term 1210 - 1250 9.1 3 2 Tu 1 . - CDS 1324 - 1827 517 ## 4 3 Tu 1 . + CDS 1703 - 1999 79 ## + Term 2005 - 2041 1.6 5 4 Tu 1 . - CDS 2298 - 2663 478 ## Acfer_0523 transcriptional regulator, LysR family 6 5 Tu 1 . + CDS 2894 - 3142 428 ## 7 6 Op 1 . - CDS 3261 - 6623 2872 ## COG0457 FOG: TPR repeat 8 6 Op 2 . - CDS 6602 - 6865 128 ## 9 7 Op 1 . + CDS 6765 - 8015 1336 ## Mflv_1868 hypothetical protein 10 7 Op 2 . + CDS 8012 - 10516 1734 ## COG0457 FOG: TPR repeat + Term 10674 - 10704 1.3 + Prom 10631 - 10690 5.7 11 8 Tu 1 . + CDS 10717 - 12030 888 ## COG1106 Predicted ATPases 12 9 Tu 1 . + CDS 12169 - 12687 139 ## 13 10 Tu 1 . 
+ CDS 12835 - 13120 238 ## Bfae_22660 protein chain release factor B Predicted protein(s) >gi|319977352|gb|AEUH01000225.1| GENE 1 3 - 372 131 123 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQMTLNKTGIRNTIACFAGAFALLITCHTAYASTPDFSVTGHTDGDDTSGIEFSATQRST VDYSGRPPSSVGGGSGSGSAAGGGAPAVPAASEAPGGGSVPGGRAAGRPMEMVCTGEREG IPG >gi|319977352|gb|AEUH01000225.1| GENE 2 353 - 1054 761 233 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190865|ref|ZP_06609027.1| ## NR: gi|293190865|ref|ZP_06609027.1| hypothetical protein HMPREF0970_01360 [Actinomyces odontolyticus F0309] # 65 231 93 265 266 80 31.0 1e-13 MRREALARDAELRAQGIDPYKGTPVDEEPRRRLGARALAVLVVLVVAAVSVGAYVVFFRG EPDYGMSHGYQVQSDGSLKRPPVTDKAPEQPAEMSAGGEAGAEATARYYLKAYSYAWNTG DTGPLQSISDENCQFCKSESSRIDEFYAHGYWAAKGYTDVLQTQAIEKLPESEYGPDAYA VQFRIDEHLAEGYTSNGFQEGRTDQTVIKLHVQWDGNAWHVLEGRAKNANDIE >gi|319977352|gb|AEUH01000225.1| GENE 3 1324 - 1827 517 167 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSGGYQIQSDGSPRWPNNAGPMPSTGPATWTFTDDGCRETAKLYFRLVAHAWNKGDASAI TNASQPGCRACKDTASRIEDHYQHGGWATGAASSDFTVTKVAQADAAVVGEGVYGVFMTY KFRTPDVYREGRLTRGHEETQSVVVRLAWSGHNWLVQEVEDQPQEES >gi|319977352|gb|AEUH01000225.1| GENE 4 1703 - 1999 79 98 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVSRQPSSVNVHVAGPVLGMGPALLGQRGDPSDWIWYPPLMAGEGEAPNAEHAETADTA LMSTASAHQARPPVREPRLPGPGALLPALTGTTSSCQV >gi|319977352|gb|AEUH01000225.1| GENE 5 2298 - 2663 478 121 aa, chain - ## HITS:1 COG:no KEGG:Acfer_0523 NR:ns ## KEGG: Acfer_0523 # Name: not_defined # Def: transcriptional regulator, LysR family # Organism: A.fermentans # Pathway: not_defined # 3 103 160 260 275 70 42.0 2e-11 MSESLFVLVPRAHGLAGRDSLTWADLDGEHFLIQAAIGFWAELVRSRMPDSEFVVQSDPN VFAQLVESSPLLSFSTDLTLPNPPMKSRRAVAVAESDASATYFLASRTAAPDQVRAIAEL F >gi|319977352|gb|AEUH01000225.1| GENE 6 2894 - 3142 428 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLIHQPYGDYYGAYRDLERALEFGKVRSIGVSNFGVFDFSLADEDRAAIAALDQDRSIVF DHRDPSLMGGFLKGLGAARRQA >gi|319977352|gb|AEUH01000225.1| GENE 7 3261 - 6623 2872 1120 aa, chain - ## HITS:1 COG:all2787 KEGG:ns 
NR:ns ## COG: all2787 COG0457 # Protein_GI_number: 17230279 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 263 1094 5 884 924 323 32.0 9e-88 MVARSDLTPLGVRLAELLVEGAAVVTAPSVGGAIALGKGVYEWWQDIAALGKPGSKAADA LGKEIASRLAAAVKGARRRYTEQGLDVELLPGAVTAVETLLEQVVADGRAVVAAVQDPDR FEDLLSERGQEFRKNVEEKAEPLFDELVRTVAREFTSLAPGSSRFSLEALKQLLNNDSEA LAILHRMEAKQEASHNDIAEIKADVKDIAKVVNTPHTPVPARPSRIRFGSRPMEALGFVT RSEQEDLFDAVFSAAAPRTVLVGMHGCGKSQLAAAVAARCVEEEWPLVAWVNAESHGSVL EGLSELGQRMGVGETDDRTPEALAQLCLRALEEAEAADRLVVFDNVEHADDLRDLVPHGE GLRVVATTTKRVDWAQAHWRPIDVGGFEREQSITMLLGRTGQSDRDAADAIADALEDLPV AVSQAAATVKRSRCSLSTYLDRLRKYSLEDSVRRLDGDDYPDAVGTALWFAFQSALEEIG KQSSRWEALASRQLGVLALLAASGVPRRWLEGTDQESSPGSALDASEALNSLVEFSVCQL SEDGAKAALHRLQSRVIRENWKNEPKDRARAEEDAVALLASVNAERVRNRENRNRRQDAI DLADQLRAISEQDYSSVLFSAPRFGDVLLSALRHAIELGAPQAAISLSGAVEQLGAVLGP DHPHTLASRNNLASAYRDVGSFSQAIALFEQNLADRLRILGPNHPDTLTSRHNLAGAYQR AGRLDDAITLYEKTLANRTHALGPDHPDTLASRNNLAGAYQSAGRLDDAITRFEEVLADS TRILGPDHPHTLACRNNLAGAYQSAGRLDDAITLYEKTLADSARVLGDDHPQTLACRNNL AGAYESAGRLDDAITLYEKTLADSARVLGDNHPQTLTSRHNLAGAYESAGRLDDAITLYE KTLADSARVLGDDHPQTLTSRHNLAGAYESAGRLGEAIPLYEQVLTDRRRVLGKNHPDTL ASRGNLAGAYESAGRLGEAIPLYEQVAADSARVLGKNHPDTLASRGNLAGAYESAGRLGE AIPLYEQVAADSARVLGDDHPQTLTSRNNLAYAYKSAGRLDDAITLYEKTLADRVRVLGD DHPQTLTSRSNLAYAYLAAGRVEELLALFDPPNDPGSAGE >gi|319977352|gb|AEUH01000225.1| GENE 8 6602 - 6865 128 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTASNVREFLRRIIREILRTSSEFAGKAESPLFIVGFKAACAHGREGRPTPTGAGSSQMP RAVRTASVRRSARDRCDRLPVWLPVPT >gi|319977352|gb|AEUH01000225.1| GENE 9 6765 - 8015 1336 416 aa, chain + ## HITS:1 COG:no KEGG:Mflv_1868 NR:ns ## KEGG: Mflv_1868 # Name: not_defined # Def: hypothetical protein # Organism: M.gilvum # Pathway: not_defined # 53 406 15 371 376 82 28.0 3e-14 MNRGDSAFPANSLDVRRISRIILRKNSRTLEAVMALDPIVPQPGNDRNPAHFVGRAKTTT QARRRLKAGANLLLTDPRRMGKTFWMRAFAAREKGFHCYSINYEGVFTVNDFLIGTAKQL IKDGKLSQKARVKLRTIFNNCDIELPGPITIKSYHRQTSPHVLLTDVLGALDEDAAGVIP 
LVMMDEVPMAVNNIADREGPSAAAEILQTLRALRQRTANVRWIITGSIGFHHVLRRAGMT QGALNDLEPLPLGPLRDDEARELARRLLLGIGRLPDDAVVEALVEVSGGIPFLLHKVAAT LDQRHRNVIRPAEVRECFEDFIDDPDEFGWFEHYLTRVGPHYRERADLAERVLRATLSEA NDWIPVGALPPDDGVDDILEDLTKDHYLERRGQSIRWRYPVLQYIWARKKAGWDRR >gi|319977352|gb|AEUH01000225.1| GENE 10 8012 - 10516 1734 834 aa, chain + ## HITS:1 COG:all2787 KEGG:ns NR:ns ## COG: all2787 COG0457 # Protein_GI_number: 17230279 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 546 806 628 888 924 135 37.0 3e-31 MSAVSISRFTPSTMPEDLLARLFVVRQPVLESLMKRVGDLGATPSPHHTLLVGPRGAGKT HLISLVYHRAKNRAGTDGGKPLRIAWLPEDPWTIVSYARLLAAILERVAPDTGVKSADEA ELDARLRSTSRKDGPVLVLMENVDQILDALGEVGQQKLRNLLQTESGVLIIGSTTRLDRS LSSHAAPFFGFFDTIRLEPFSPEEAREMLTALAREAGNAELAERLSSSGALARIHTIAHL AGGQPRLWALLGSALTVEELRDLAALLLSRFDDLTPYYQEQLARLSPLQRLIVAELAAAD RPLPVKDIAERVGSDQRSVAKAVGDLAERGWLKPVSTIFTELLDRRRTYYDLAEPLARLA FQIKESRGEPLPLVVDFLVNWFDADQLRSSDGSDYGRVALMRMEQDEVGGLARRLTSLPE SRIPSLDLLGQVEDALAAFSAGDAEPVMALRSTLRQAIELRAHDEGGIASVRLELLDDAL REVGDVPRGDTNSAWVGRAERLDAEAQSPQSRLILVRWLAASWRFDEAEAALGTISSEQA ALEGANAIAYAYVSAGRMADAIALFEEVLADSLRILGHDHIDTLVFRGNLAGTYQVVGRF DEAISLYEEAVADSVRIFGSDDLATLSVRSNLAGALLAVGRVGEAVVMFRELLADRLRVF GADHSVTIVARKSLAGVCLVAGCVDEAVALYEEVVADQLRVLGPDHPDTVASRGSLAGAC RSAGRLADAITLYKDVVADQLRVLGPDHPDTLASRGSLAGACWEAGHLDEAITLFEQVLA DQLRVLGPNHPYTVASRCYLARAYREAGRVDDATAVLDPPTDSDDVDARRIESR >gi|319977352|gb|AEUH01000225.1| GENE 11 10717 - 12030 888 437 aa, chain + ## HITS:1 COG:alr3406 KEGG:ns NR:ns ## COG: alr3406 COG1106 # Protein_GI_number: 17230898 # Func_class: R General function prediction only # Function: Predicted ATPases # Organism: Nostoc sp. 
PCC 7120 # 1 435 1 443 463 119 29.0 9e-27 MLRVLHVENWKSFHDPVDFTMIAGKESRCKDTLFRDGKARVLPTAAIYGANASGKSALLG AVEQLQELVREPRARGRRLPYDPHRLYGVGEPTVLGVEIVLDVPDATSKRDAIVYYEVSY TAERVVAESLYRLRSTDEEAVFTRDGQDVELYGDLDGNDFVQAVARTVQGNRLLLETLAN SEEDEVGTIIAGVIQWFKRLAVIRRGAPFVLMPQRIAKDDDFRQVMGAELSTADTGICDV LFTRVGREKVPVPDEVLDHIEAQLVDSAHEGFLSVGEGSVVRVRRDEAGDVSYERLVTVH KDERGAFELFLEQESDGTMRYFNLLPILYWAGQQNSRGVFLIDELEDSLHPKLTEELLRR FLSATGEDQRRQLIFTTHELHLLRSDMLRRDEIWLVEKRGHNSGLIRLTDFSGSGVRNGA DLRKIYMSGRLGGVPRI >gi|319977352|gb|AEUH01000225.1| GENE 12 12169 - 12687 139 172 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIANRYGNRIAVRFAPTGDHSSLDNLVASIGREMKEYGAQVKGAWIVCDTDKNACHRVKL EKWLAGSKRHHAVISCPCVEYWFLLHLCESPSSVAAKKAKRELGKCWQWGEYRKGGQVPS ELIEATDEAVRRAHQRRKSLGDDADAWNSPQWTDMPELIAWLDELDPGETRH >gi|319977352|gb|AEUH01000225.1| GENE 13 12835 - 13120 238 95 aa, chain + ## HITS:1 COG:no KEGG:Bfae_22660 NR:ns ## KEGG: Bfae_22660 # Name: not_defined # Def: protein chain release factor B # Organism: B.faecium # Pathway: not_defined # 1 95 1 95 146 115 75.0 7e-25 MDDLRIPPGPGCPRGLVVPAGELGERFSHASGPGGQGVNTADSRVRLSLDLATTTALDQE QRERALARLGERLVGTVLTVVAAEHRSQRRNRAAA Prediction of potential genes in microbial genomes Time: Thu May 12 18:57:30 2011 Seq name: gi|319977350|gb|AEUH01000226.1| Actinomyces sp. oral taxon 178 str. F0338 contig00226, whole genome shotgun sequence Length of sequence - 1195 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 50 - 87 0.9 1 1 Tu 1 . 
- CDS 183 - 1016 909 ## COG0500 SAM-dependent methyltransferases Predicted protein(s) >gi|319977350|gb|AEUH01000226.1| GENE 1 183 - 1016 909 277 aa, chain - ## HITS:1 COG:HI1273 KEGG:ns NR:ns ## COG: HI1273 COG0500 # Protein_GI_number: 16273188 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Haemophilus influenzae # 20 149 29 158 268 70 31.0 3e-12 MPKRSSGRGIVFDRDSREYWNKRAATFTRTATRAYEHWLMKLLALDPSDTVLDMGCATGT LAVPLAKRGHHVHACDFAEAMLAILSKRAARDRLPITAHLLAWEDDWEAAGLGTDSVDVA FASRSLVADGVRAHIGKLDSAARTKAAVTVSASPLPSYEPRLLTHLGRVAKRPHAVQDVK RALSGMGRVPFCASTTTFRPMRFASFDEARADLRRLAGPEPFTPGEQKLFDAYASEHFTC VPRPNPTGERSGHWTLDYTLPVTWTFIGWRTDGSEWE Prediction of potential genes in microbial genomes Time: Thu May 12 18:57:32 2011 Seq name: gi|319977346|gb|AEUH01000227.1| Actinomyces sp. oral taxon 178 str. F0338 contig00227, whole genome shotgun sequence Length of sequence - 4708 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 275 - 1051 1018 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity 2 1 Op 2 . + CDS 1052 - 2914 2237 ## COG3004 Na+/H+ antiporter 3 1 Op 3 1/0.000 + CDS 2914 - 3621 895 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity 4 1 Op 4 . + CDS 3794 - 4615 917 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 5 1 Op 5 . 
+ CDS 4623 - 4707 84 ## Predicted protein(s) >gi|319977346|gb|AEUH01000227.1| GENE 1 275 - 1051 1018 258 aa, chain + ## HITS:1 COG:SA2266 KEGG:ns NR:ns ## COG: SA2266 COG4221 # Protein_GI_number: 15928057 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Staphylococcus aureus N315 # 6 254 4 230 231 86 26.0 6e-17 MTRYDLKGRTAVVTGATGAIGSAVVRALVEGGAGVAVIARGRRRLERLRGRLPDDARVVV CPADVTSAYELVEARERVHRELGAPDLVVTAAGVRRAALFEDAVPADWNLMLNTNVRGTL QAVQIFAHDVLAAGEAGDRADIVTFATAPAHERQQAYSVFSSFGAALSQFSKHLRAEYGP RGVRVHHIESLYTAGSFFTHGNLGADRSTSAHHDVLPDDIEYIEPIGTQHLASEVAFMVS LPAHVNFANAVVQPTRSH >gi|319977346|gb|AEUH01000227.1| GENE 2 1052 - 2914 2237 620 aa, chain + ## HITS:1 COG:jhp1447 KEGG:ns NR:ns ## COG: jhp1447 COG3004 # Protein_GI_number: 15612512 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter # Organism: Helicobacter pylori J99 # 11 415 14 427 438 232 34.0 2e-60 MRLISTVRGALKAFAAREASGALLLLAATAVALVWANAGALGYEAFWNARLSVHLGPWGI DMPLRHWINDGLMMMFFLLVSLQVKHDFVMGGLREWRRACVPVLAAACGLVVPAVVFGVI NAGGPHAGAWGIVIPTDTAFVVGILAAFGKRLPVQLRSFLMTLAVVDDVGALAVIATAYT GTIRLLPLAGVLLAASALYAVQRARVRRASVYIVFGALLWLLFLASGVHAAIAGVVLGLL MPVFPPERSALLTAEELTQRFRRAPTALTGKNAVYGILRAVSINERHQLSTAPIVNLFVV PVFALSNAGVALSGDALAHALHSPLTWGIVAGLVAGKYVGAFGASVLASKTRIGELAPGL GLRHINGGAMLTGIGFTISLFIVDLAIEDGAAQADARIGVLAASLLAALLGTLVLALTAF LDARRAPARPRLTRPVDPRRDHIAGNPASALTLVQYGQLGCLEDGATVELLREVRDHFDN DLRLVFRHNPLGDPGAEQAAEMLEAVAAQSPDLFEPVRVEVARLCDEADLDRDVLRRAAV EMGADLARLDAQMLQRPHIGRVHDDADDAAGMGLTRAPAFFIGEELYQGEHTPEALIEAL EAARAALDPRTPARTAGEHD >gi|319977346|gb|AEUH01000227.1| GENE 3 2914 - 3621 895 235 aa, chain + ## HITS:1 COG:jhp1023 KEGG:ns NR:ns ## COG: jhp1023 COG4221 # Protein_GI_number: 15612088 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Helicobacter pylori J99 # 6 235 4 242 250 84 27.0 2e-16 
MSKRTIVIAGGAHGIGLAVARAVAAQGDEAIIVDRVPPSGTVDARFVRVDLSSARQADEA LGSLVKSEDRIDALVVAAGRAANGALGDRPAHEWSEHLSNDVLSVVVPARVLLPALIQTG RDYGVADIVVVGSIAGDTAFKNAVVYGAGSAARNSFGEQLRVELRHENVRVRSIHLGYVR TRAIEAIKPTVLESPFADTPLTPEDVSVVIMHELDQPAHVSTHDIVLVPTRQGWA >gi|319977346|gb|AEUH01000227.1| GENE 4 3794 - 4615 917 273 aa, chain + ## HITS:1 COG:L142816 KEGG:ns NR:ns ## COG: L142816 COG1028 # Protein_GI_number: 15673291 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Lactococcus lactis # 5 273 3 270 270 323 58.0 3e-88 MEPRVILITGASSGIGRAAARTLARDGHIVYGAARRVGRIEALVADGVRPVELDVTDEAA CRAAVGRVLAEQGRVDVLVNNAGYGSYGAVEDVALAEARRQFEVNVFGAAALVKAVAPGM RERRSGTIVNVSSMGGRLVVSRMGAWYHATKYALEALSDALRVELADFGISVVLIEPGAI RTQWGAIAADHLEESSKGGAYEARAARTAAVMRRLYSSPVLSAPHVVVRAMERAISAPRP RTRYLIGFGAKPVVALRALLPARAFDWIVKRTS >gi|319977346|gb|AEUH01000227.1| GENE 5 4623 - 4707 84 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDYVLDGQYARLLRNAGIDPSRLLRAAG Prediction of potential genes in microbial genomes Time: Thu May 12 18:57:38 2011 Seq name: gi|319977339|gb|AEUH01000228.1| Actinomyces sp. oral taxon 178 str. F0338 contig00228, whole genome shotgun sequence Length of sequence - 5891 bp Number of predicted genes - 7, with homology - 5 Number of transcription units - 4, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 268 - 894 601 ## COG2207 AraC-type DNA-binding domain-containing proteins 2 1 Op 2 . + CDS 962 - 1816 988 ## + Term 1946 - 1970 -0.3 3 2 Op 1 . - CDS 2182 - 3150 1290 ## SGR_6575 putative 5'-nucleotidase 4 2 Op 2 . - CDS 3166 - 3396 128 ## 5 3 Tu 1 . + CDS 3275 - 3982 940 ## COG0778 Nitroreductase 6 4 Op 1 19/0.000 - CDS 4048 - 4746 609 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 7 4 Op 2 . 
- CDS 4755 - 5891 895 ## COG4585 Signal transduction histidine kinase Predicted protein(s) >gi|319977339|gb|AEUH01000228.1| GENE 1 268 - 894 601 208 aa, chain + ## HITS:1 COG:L143624 KEGG:ns NR:ns ## COG: L143624 COG2207 # Protein_GI_number: 15673292 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Lactococcus lactis # 4 197 130 320 324 131 36.0 9e-31 MPAFQVLVETAFLLGIIRSATGEPVNALAATMVEPPRGEDLAEMEEYLGCPLRAGAVDRL VLRAADAALPFTTSHRSMWEYLEPELSRRLADLHSGASTAQRVRACLVELLPSGRCAAGD AARRLGVSRRTLQRRLGQEGTTFSKELASLRHDLALGYLSHAEMGVREVAYLLGYLETNS FLRAFRTWTGTGVGQWRAAHGSGGGAPA >gi|319977339|gb|AEUH01000228.1| GENE 2 962 - 1816 988 284 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGLDAVVPCRCFEDGLLADPPDGIGELVRTPYGELAVKAFEEEGLNDQTAASFSPGAQDA LRRFNEWAKAPCAHKWGYAWAEQIGNWAMVREFRSLVGLLGGRARYPHLDPLLPLDNEGC LAVDRMPAVRAELEDFCARVMEAQAWALVADGYDEPLFYCVDADISWWRSYYPPGGVEVE VLMDSDAVVFIEDGGTGIATRRFVQVWEDPSDQSDKRPVRIEFLDRRGTVHLPSPLIHGQ RDRVECRVEARTAPFLDDGEYWAGKRLMEGIDAALAVGQPMYWR >gi|319977339|gb|AEUH01000228.1| GENE 3 2182 - 3150 1290 322 aa, chain - ## HITS:1 COG:no KEGG:SGR_6575 NR:ns ## KEGG: SGR_6575 # Name: not_defined # Def: putative 5'-nucleotidase # Organism: S.griseus # Pathway: Purine metabolism [PATH:sgr00230]; Pyrimidine metabolism [PATH:sgr00240]; Nicotinate and nicotinamide metabolism [PATH:sgr00760]; Metabolic pathways [PATH:sgr01100]; Biosynthesis of secondary metabolites [PATH:sgr01110] # 11 321 9 308 317 346 59.0 8e-94 MPGKLVLERALVVGVASSALFDLDASDAVFRRDGEQKYREYQRDHLGDALAPGVAFPFIR RLLALNDLSGDERLVEVVVLSRNDPETGLRVMRSVEHHGLDITRAIFMQGRSPYRFMEPL RMSLFLSANEADVREAIRMGFAAGRVVGRAADDGRAADDGRAADDDGGTDLRIAFDFDGV LADDSSERIYQEGTLEAYQANESALADVPLPKGPLAAFLEKINHIQRIEDAKHDADPGGY ERRVRVAVVTARSAPAHERAINSIHQWGLRVNDAFFLGGIDKGPVLGVLQPHIFFDDQRR HVDTASRSTPSVHIPFGELNEA >gi|319977339|gb|AEUH01000228.1| GENE 4 3166 - 3396 128 76 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFDQLGAAEASARIASRTSEGTGSGLNARTEWREAASVENISAGFVMPPCSHGGGGEAAQ ARTLTATLVPKDTGPE 
>gi|319977339|gb|AEUH01000228.1| GENE 5 3275 - 3982 940 235 aa, chain + ## HITS:1 COG:CC0324 KEGG:ns NR:ns ## COG: CC0324 COG0778 # Protein_GI_number: 16124579 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Caulobacter vibrioides # 9 228 10 210 220 102 31.0 6e-22 MFSTLAASRHSVRAFKPDPVPSDVLDAILADASAAPSWSNTRPFALALATGERADRLRAA YTEEFDRTLDVQHRKPLAAAKLVLSGHAPDGDYKVWKPYPDDLRPHQVEVARLLYGAYGV ERHDHEGRDRANRRNAAAFDAPVIGFAFVHKGLMPFSALDAGIMLQTLWLSAKAHGVDSC PLGILAAWRRPVDAEFDVPRDYALITGFALGYADPEAPVNAFRAPRRPVSLIQGR >gi|319977339|gb|AEUH01000228.1| GENE 6 4048 - 4746 609 232 aa, chain - ## HITS:1 COG:all4635 KEGG:ns NR:ns ## COG: all4635 COG2197 # Protein_GI_number: 17232127 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Nostoc sp. PCC 7120 # 1 213 5 206 219 156 40.0 4e-38 MVVDDQRLTRLGIALMLRHDPDAELVAEAQDGQEAVDTLDTLSLRRGPMPDVVLMDARMP RLDGIGATRLVTARYPGIRVLVLTTYDQDDYAFGALAAGASGFLLKDATARQLCDAVHAV AHGDAVLTPRITRELIRRAAASPLFTGSSGGTEADALFSALTPRERQVASLVADGMSNAE IAGRLVIETASARRYVSRILAKTGLRDRVQIAVAWHRGEAPGGRRRPRPGDE >gi|319977339|gb|AEUH01000228.1| GENE 7 4755 - 5891 895 378 aa, chain - ## HITS:1 COG:BH1050 KEGG:ns NR:ns ## COG: BH1050 COG4585 # Protein_GI_number: 15613613 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 44 189 164 305 371 67 32.0 5e-11 GASGSRGAEALASFTTFALVCCCGMVLRARERARKTQRRSREAALRAASLEQERNLALAR SRVAAELHDSVGHGLTTIIVLSEGLTGATGDAEVDEALAGINAAARECLEQTREAVRALG RGAGAAAGPGRHTWDDVHSVIASARSTGMAVAFHETGRRPQDEGQADLAFAVAREGITNA LRHARGASVVTVSWVHRDDGWVEASVRDDGRKEAPDFGADGSAPGSNGRKGARGDDGQNE APGGKGRKGASVRDEGNGGGPPLPIAPPHSGLSMVRSRVVEAGGACGAGWSGSGWALHAR IPPGPGARGDGYGNAAEADAGTGAAAGTAAESRPDAEPGTGAGTSAGNRVGTTVRTDAEA GPGAESGMDAEAGTGAGR Prediction of potential genes in microbial genomes Time: Thu May 12 18:58:04 2011 Seq name: 
gi|319977329|gb|AEUH01000229.1| Actinomyces sp. oral taxon 178 str. F0338 contig00229, whole genome shotgun sequence Length of sequence - 14085 bp Number of predicted genes - 12, with homology - 9 Number of transcription units - 8, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 145 95 ## 2 1 Op 2 . - CDS 145 - 1584 1490 ## Blon_1416 protein of unknown function DUF214 3 1 Op 3 1/0.000 - CDS 1577 - 2311 226 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Term 2373 - 2417 0.1 4 1 Op 4 . - CDS 2463 - 3827 1483 ## COG0477 Permeases of the major facilitator superfamily 5 2 Op 1 . + CDS 3811 - 3873 146 ## 6 2 Op 2 . + CDS 3970 - 5286 2040 ## COG3949 Uncharacterized membrane protein 7 3 Tu 1 . - CDS 5314 - 5889 762 ## COG0350 Methylated DNA-protein cysteine methyltransferase 8 4 Tu 1 . + CDS 6343 - 8253 2322 ## COG0025 NhaP-type Na+/H+ and K+/H+ antiporters 9 5 Tu 1 . + CDS 8509 - 8673 123 ## - Term 9046 - 9083 5.0 10 6 Tu 1 . - CDS 9088 - 9675 772 ## GALLO_1344 putative acetyltransferase (GNAT) family 11 7 Tu 1 . + CDS 9794 - 13240 162 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 12 8 Tu 1 . 
+ CDS 13422 - 14006 656 ## COG0693 Putative intracellular protease/amidase Predicted protein(s) >gi|319977329|gb|AEUH01000229.1| GENE 1 1 - 145 95 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGGGGRSGRGRWDWVPAAVCACYGAAVLASVLVPGFPAGPAVPQGLGA >gi|319977329|gb|AEUH01000229.1| GENE 2 145 - 1584 1490 479 aa, chain - ## HITS:1 COG:no KEGG:Blon_1416 NR:ns ## KEGG: Blon_1416 # Name: not_defined # Def: protein of unknown function DUF214 # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 1 425 1 431 474 200 38.0 1e-49 MPRRALRILANDWGAWIPLVAVMSLTTALVGVCTNQFAWTHSPEFLAATRAAGLRAEEFH VVSVSVYCCVALVAAFSLTVVGSATVVRCAPDFSRWRLLGATPAQVGATALAMTAVACGA GALAGSLASVPASFPLVPWFNGLAAQGFPGGTGGFDPPFAASPGAWAASLVLSWLTCMLG SLGPSLRAARMRPVEALRSTSAGAGAASGHPVAAAAVAALALLLPVAGTWDPAVGGTGAG GALSSLMVWPGALLLVAAALWGARGVEAAIACAAAVPALAGSTMGVLAGKALRARAGRSA VATIPLVVAAGGGSLLLCMVRTFERVMRAMGVQTAFNYTDTTVLVGLVALFSLATAAAVT ALGSDDQGPGITALRALGVPRAQATRMLVWQAALLAAATGLLSLVLALAAAGIGLFLSLR LFAVPAFCPPVDLCAVFAVASFLAVVSVLRWRLAGWLGRWPALRGDTRGARSAERRRRS >gi|319977329|gb|AEUH01000229.1| GENE 3 1577 - 2311 226 244 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 19 218 12 217 245 91 30 3e-18 MGCVVSARGVWKEVARGGGPLAVLRGCSLDAGPGVTALVGRSGAGKTSLLHIMSGLDAPT RGAVVVDGQDLGALNEARRTRFMRERVGFVFQDANLVPYLTLEENAVLPLELAGRRADPV RVGRLFSRFSLAGRRGTRAEAASGGEAQRCAVIRALLSAPGVLFADEPTGALDSGNSRVV LGALRAMGEEGTTVVIVTHDPVIASSADRVAFMRDGRITRVATALSAGEVLEGMDQQCGE APGA >gi|319977329|gb|AEUH01000229.1| GENE 4 2463 - 3827 1483 454 aa, chain - ## HITS:1 COG:RSp0310 KEGG:ns NR:ns ## COG: RSp0310 COG0477 # Protein_GI_number: 17548531 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Ralstonia solanacearum # 41 444 35 439 450 124 31.0 3e-28 
MIENRTAWSALDAPSTLTSEKRHATLVQMTQDTSSAPAPVSASGRPLAPNWWWVVATIWS GQAFSVITSGASGWAIIWHVTTTEGSALKLALVMALSQLPLGLLAPLGGLAADRYNRRAV MIVSDLGAGSMSLALAVLAWFGHGDFALICVFAALRSCFQAFHFPAMSAAMPMLVPEKHL MRINTLDQAISAVSSIGAPAIGIALYSAFGLPFTLGLEFTGALLAVAGLALARIPSVEAE MPPTVMGQIRQGWAALSANRGLVVLIAALTIGMMVFAAIQAVYPLMAKQHFGADGGMVSL AEAITGSCMLAGALIMMAWGGGKRLALLMGCATLVVAPPIASIGFLPPGGFWALVALMGF ACVFLAWFHAPLMTLIQRHAGEDKAGRALGFFQMMIGLAVPVGVALGGAIAERTGVPALF TATGLVFGALGAFMCAAPSVRALDAPAPSAGRPQ >gi|319977329|gb|AEUH01000229.1| GENE 5 3811 - 3873 146 20 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRFSIKSVFHFHKNLAKHVA >gi|319977329|gb|AEUH01000229.1| GENE 6 3970 - 5286 2040 438 aa, chain + ## HITS:1 COG:Cgl1071 KEGG:ns NR:ns ## COG: Cgl1071 COG3949 # Protein_GI_number: 19552321 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Corynebacterium glutamicum # 6 368 6 368 400 240 40.0 4e-63 MPKKNLSVAMAYVGVVAGAGLASGQDLLQYFLSFGVMGVLGIAVMGGLNVVFGSLALQLG SYFRSDHHDEVFGRIAHPVVNRLIDIVLVVSSFIMGFVMLAGAGANLQQAFGLPTMWGAV VCGLLVVVTAFLDIDRITRVIGVFTPVIVVLILVLTAHTLTQSHPPVEQLDAAARSVVPA LPDVWLSALNYCALCVLGGVAMAFVLGGSVLRIDVARSAGRIGGAVLAGVVVLTGITLFL NVGTVKDVDVPMQEIARSIHPAFAFVYTLAIFALIYNTVFSLFYSVARRFSGGSEARMRL ILIGVVVAGLAASMAGFKSLVGIMYPILGYLGMALMVVVAMGWWRERHNIGREENLRRKM VRLLLRKHTPRAPYTTEHRRDVRALAEASVADASQLRSDADELAGQIVANEDDVRAFAHA NLPVDEDKVAELVEPRSA >gi|319977329|gb|AEUH01000229.1| GENE 7 5314 - 5889 762 191 aa, chain - ## HITS:1 COG:L118481 KEGG:ns NR:ns ## COG: L118481 COG0350 # Protein_GI_number: 15672513 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Lactococcus lactis # 1 180 14 167 169 104 32.0 8e-23 MTLAAHGGTLVGAWFYRQRFFGAPYGVDEAVVGFARDAEDPALRAADAWLDAYYAGSNPP LDFPLRAYGNDFQLRVWDALLRVPYGTTTSYGKLAEALTQESQRPQGAAAQSGNAGKRAR RAAAPSRPVSPRVVGWAVGRNPISLFVPCHRVLSASGGVSGYAGGVDRKEWLLALESGRV VAGLFGDVDGA >gi|319977329|gb|AEUH01000229.1| GENE 8 6343 - 8253 2322 636 aa, chain + ## HITS:1 COG:STM4269 KEGG:ns NR:ns ## COG: 
STM4269 COG0025 # Protein_GI_number: 16767519 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters # Organism: Salmonella typhimurium LT2 # 29 291 31 298 548 130 34.0 7e-30 MEPLPIVLIGMLIIVACQFAAPRVGVASPLILLGVGLAIGFLPVVAPVRIDPHIVLEMVL PPLLFSAAVRMPTMDFRRELSAVAMLAIPLVVISAAAIGFLVNWMVPQISLPWAIALGAV ISPTDAVAVSIAKRSGVAHRIITVLEGEGLFNDATALVLLAAATSAGVAASENALNPGAL ASAFLIALVVAIGVGWAVGELGVRVRARLKDPTSDTVFSFAMPFLASIPAEHLGGSGLVA AVVAGLVVSIRRVGMISANNRRFATQNWSTVTVVLESGVFLMMGLQAYGIVTDEATGTGY EGLGRAALVAVVAGGATVAVRAGFVSALLVWLDHRRKHSKRRHERDEARIGVFEERLADA CTVDEEVLQARNLSEEEWRAALNRWHRRLDRVQRRQTRRGSDLEYFANEPLGPREGAVVV WAGMRGAITLAASQTIAQDVPMRGFLLLVALLVAAGSLIIQGLTLPAVIRLVRPQMASGE VDDEERRLLAKVMGSALVDTALAQAIQEMDGKEVVQAGLSRTLVRIEHAQSGGSAPPRTA ERVIDGLRAGLADAPAGAGAGGAGPAEGSDDADEVARAFTRCQIRELALEAIQAQREALL DARDEGVFSSVALDGALARLDNEEILLVSGSHGLDH >gi|319977329|gb|AEUH01000229.1| GENE 9 8509 - 8673 123 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDELAIVHTTGPLIHTILTRPTKKTHPRRCPHHLGLRHAGNPRAYLGLSGAAAR >gi|319977329|gb|AEUH01000229.1| GENE 10 9088 - 9675 772 195 aa, chain - ## HITS:1 COG:no KEGG:GALLO_1344 NR:ns ## KEGG: GALLO_1344 # Name: not_defined # Def: putative acetyltransferase (GNAT) family # Organism: S.gallolyticus # Pathway: not_defined # 21 162 18 149 181 96 36.0 4e-19 MTSAPLAPVRVPLRGPIASQVHALYRRAFPPEERIPLPLLHASAMRRRAISFTAWVDPEL SDPSAHDAEVVAFTYSFVSKDLVYLAFLAVDDRLRSAGYGRRILEWFADEHPDLPLFLEI EPIDESAGNYAQRLRRLAFYQRNGFTVSNMLTHEAGQTFRVLHRGPGAGAISAERLEGEL NAFGAGLVRTRVTTD >gi|319977329|gb|AEUH01000229.1| GENE 11 9794 - 13240 162 1148 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 1010 1148 284 413 413 67 30 7e-11 MLELREITKAYRTANLVQVALDRVSVAFRDNEFVAVLGQSGSGKTTMLNVIGGLDRFQEG DLVIDGVSTKHYRARDWDAYRNNRIGFVFQSYNLIPHQSVLANVELALTLSGVSRGERRK RALDALDRVGLSEHVRKLPSQMSGGQMQRVAIARALINDPEILLADEPTGALDSKTSVQV MDLLREVAKDRLVIMVTHNPELAHQYATRIVELADGRIVADSEPFIPGTEEVREAKPARR 
TSMGFLTALALSFNNLMTKKGRTLMTSFAGSIGIIGIAAILALANGANAYIQRTEEDALS SYPLTIQKQGFSMESMMSSAVLGGSSTEGDPGGVKVSSRRSLTGMIQNAKTNDLKSLKVY LDNNGGGIKAHASSVEYGYGITPLIFQQDTSKGLNQVSPEPAFSPIASARGGSANPMSST SLSTDAFYQMPTDQSLYADAYDVLAGTWPTGADDLVLVLDAQGQISDVFEYTLGLKDHKD LQSLMRSYYSGQMGRGAGSPQSGAQSGAGDAPSAMYDYSQILGTAFSRVNAPDQYTYDST YRVWTDHSADTDYMKDLVSKGQRLTITGIVKPKSSSRTPLRQGIGYTADLTHRVIDEAGA SQIVKDQIASPTIDVFTGKTFKELADGQKDKASGFDMSSLFSVDEGKLGAAFQIDPSKLQ MDMSALDFSGLDFSGLDFSKLDMSGMDLSGLDPSALVPQGAQSGAQSGLLPGMDLGELTK QFPQLADIDFAGIISAALKDGAVKEGAGEYLASRASQIAQDFIAYAREQAAKAPDRDGDG IPDIDLVKLVSDYMTSADVVRQLTDAVTSDQVIDSGAFIANLTKALGDDPAIAQIAQAVS QQIADAVSAQIASQLSGLLGQGLGASLSQMMSDTMGQAMGQMMEGFAEQIGARISSTMDD FAEAMSSAVSIDPGAFADAFSMNFDEKSLAALMATMMSTSVPDYDTNLKGLGWAAIDTPT TISIYPKSFADKDEVKKILDAYSADQVNAGAPDKAITYTDLMGTLMSSVTSIIDVISWLL IAFVSISLVVSSIMIAIITYISVLERRKEIGILRSIGASKGDVSRVFNAETVIEGFLAGV MGVGVTYGLCALVNAVVSSAFDVHDIAQLSPLAALALIAVSVGLTVFAGLVPASRAARQD PVEALRSE >gi|319977329|gb|AEUH01000229.1| GENE 12 13422 - 14006 656 194 aa, chain + ## HITS:1 COG:SP0804 KEGG:ns NR:ns ## COG: SP0804 COG0693 # Protein_GI_number: 15900697 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Streptococcus pneumoniae TIGR4 # 12 191 3 175 184 122 43.0 3e-28 MPTPPQLATDKKAAVLLAAGCEEVEALAVVDALFRAGIRADLVSVSQSLEVVSSHRIRII ADALIADVELADYDLLYLPGGMPGTLHLKACPAVPVEVLRRADAGEPVAAICAAPSILAE LGVLDRRRATANPAFMEAIAQGGATAEEEPVVVDGAITTSRGAGTAFDLGLELVRQMLGD EAADAVRAGIVRSR Prediction of potential genes in microbial genomes Time: Thu May 12 18:58:30 2011 Seq name: gi|319977324|gb|AEUH01000230.1| Actinomyces sp. oral taxon 178 str. F0338 contig00230, whole genome shotgun sequence Length of sequence - 5118 bp Number of predicted genes - 7, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 80 - 991 1161 ## COG0084 Mg-dependent DNase 2 1 Op 2 . + CDS 978 - 1079 59 ## 3 1 Op 3 . 
+ CDS 1095 - 2330 1479 ## COG3583 Uncharacterized protein conserved in bacteria - Term 2306 - 2333 -0.8 4 2 Tu 1 . - CDS 2345 - 2428 59 ## 5 3 Op 1 4/0.000 + CDS 2385 - 3815 1992 ## COG3583 Uncharacterized protein conserved in bacteria 6 3 Op 2 . + CDS 3845 - 4888 1163 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 7 3 Op 3 . + CDS 4881 - 5118 258 ## gi|154509598|ref|ZP_02045240.1| hypothetical protein ACTODO_02131 Predicted protein(s) >gi|319977324|gb|AEUH01000230.1| GENE 1 80 - 991 1161 303 aa, chain + ## HITS:1 COG:SMc01193 KEGG:ns NR:ns ## COG: SMc01193 COG0084 # Protein_GI_number: 15965353 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Sinorhizobium meliloti # 20 302 2 255 259 134 34.0 2e-31 MSKRPPRRWPDPAAPLAGPVMDNHTHLPVHPGQIPSPQGERLSLDDQLERARRVGVERVV SSACEIPDFDPMLEVARAHEGVRVALALHPNEAALHAGVVEPSPDGLVPRVRVHHVPLDE ALSEVESRLGDPAVVAVGESGLDFYRTAQEGRGAQAESFRAHLELARQAGLPLQIHDRDA HRETLEVLAEAGAGVPAIVFHCYSGDREMALALEENGWYASFAGPVTYPANGGLREALLA LPRELVLVETDAPYLTPAPYRGCPNASYVISHTVRFIADLWEVDEESACQQLMANSVRVY GEW >gi|319977324|gb|AEUH01000230.1| GENE 2 978 - 1079 59 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MASGKPSRRGDRAPGLVPANEVGVVQGIGFAVL >gi|319977324|gb|AEUH01000230.1| GENE 3 1095 - 2330 1479 411 aa, chain + ## HITS:1 COG:BH0055_1 KEGG:ns NR:ns ## COG: BH0055_1 COG3583 # Protein_GI_number: 15612618 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 54 303 16 278 320 82 29.0 1e-15 MGDKPGNGAHRQRAAGRPAGTYNVWSHVRTLKHTRRTVITAASAAALAIGGGAAGLAYAS HRTVTIEVDGAAQRVSGFFSTVGDAISAGGITTGDHDLIAPAPESSVSSGDTVVVRTATE YRVSVDGAPTTAWSTASSVSGVLDAVPSAGSVAIAADRSQSRAEMPVGADTVHVAADGTT TDVAATAADGASAILEKAGVSAGPLDRVAFHRGADGVTLRVQRVTRGNVTSSTSIDYATE ERDDDTLDKGTTKTVQEGAAGSETTVAYQESVDGVVTVSAVLSTTRTEPTTRIVANGTKE AAQPAPAASAAPSSGSSGSSAPSDSGASAPSGDDASIWAAIAQCESGGNPTTNTGNGYYG MYQFSLPTWRSVGGAGLPSEASAEEQTMRARMLQQRAGWGQWGCAYKLGLV >gi|319977324|gb|AEUH01000230.1| GENE 4 2345 - 
2428 59 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METADGRDTFRCVLTVLLVALAWDYSS >gi|319977324|gb|AEUH01000230.1| GENE 5 2385 - 3815 1992 476 aa, chain + ## HITS:1 COG:Cgl0885_1 KEGG:ns NR:ns ## COG: Cgl0885_1 COG3583 # Protein_GI_number: 19552135 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Corynebacterium glutamicum # 38 302 23 289 314 69 25.0 1e-11 MSTHLKVSLPSAVSMAVHAIGGRLRAFAPSPVMVLASATAAGAIVLGTAGVSIASSHVNV MLEVDGVSRPVGVWGNNTVSDALKAAGVRVADSDLVQPGQGEALADGDTIIVRSSKPYSV AVDGKVRTVWTTAASADAILADAGALGTKVSLAADRSSSRDSLTPLVSRQRSVVLNADGT SKQIEARPGQDARATLHDAGIAVHPLDRVAVSTDEAGTLTVDVQRVTRGPATETAEIPFS ETTTTSADLFVGESEVTTQGVNGVTTWTVWQEQNGDEVLTSVPLTEHATTPPVPQVRSEG TKEATPAALIAAGIDPKATLEERTEPNGTTSVRYRAKLGSLSTPEEIAKITQESGGSSGT GGAAAAAAPNVPLVYSGEDPRSLARPLVAARGWSDSEYQCLVQLWNRESQWNPHAQNSSS GAYGIPQALPGSKMATAGADWQTNPVTQINWGLGYIAGRYGTPCSAWAHSNAVGWY >gi|319977324|gb|AEUH01000230.1| GENE 6 3845 - 4888 1163 347 aa, chain + ## HITS:1 COG:Cgl0886 KEGG:ns NR:ns ## COG: Cgl0886 COG0030 # Protein_GI_number: 19552136 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Corynebacterium glutamicum # 37 324 3 287 293 274 56.0 2e-73 MTANRPPSPQRARASSPARRGGPARRRTDPALPDGAQPSDTGLLGPVEVRAISGALGIRP TKALGQNFVHDAGTVRRIVAAAGVGEGDEVIEVGPGLGSLTLALLEAGARVRAVEIDPVL AAALPETVRARMGGAADRLHVVTADATAITGPADLGPDWPPPAKLVANLPYNVAVPVLLA MLDSFPTLTDVLVMVQAEVADRLAAGPGSRTYGVPSVKAAWYGRATRAGTIGRSVFWPVP GVDSALVRLRRSAEARGDDALRRATFEATDAAFGQRRKTLRAALKDWAGGAAASEALLAE AGIDPARRGETLTIDEFTRLGAALIRLRASGAVAERPDPRTAGSANA >gi|319977324|gb|AEUH01000230.1| GENE 7 4881 - 5118 258 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509598|ref|ZP_02045240.1| ## NR: gi|154509598|ref|ZP_02045240.1| hypothetical protein ACTODO_02131 [Actinomyces odontolyticus ATCC 17982] # 1 76 1 76 324 98 75.0 1e-19 MREVTASAPGKVNLVLRAGAPTADGYHPLATVFEALDLRETVTVRTTRTPGTTVETVAHL PGGGIDRATTELMRAVPAR Prediction of potential genes in 
microbial genomes Time: Thu May 12 18:58:44 2011 Seq name: gi|319977321|gb|AEUH01000231.1| Actinomyces sp. oral taxon 178 str. F0338 contig00231, whole genome shotgun sequence Length of sequence - 2263 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 1 - 639 868 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase 2 1 Op 2 . + CDS 644 - 2263 178 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 Predicted protein(s) >gi|319977321|gb|AEUH01000231.1| GENE 1 1 - 639 868 212 aa, chain + ## HITS:1 COG:ML0242 KEGG:ns NR:ns ## COG: ML0242 COG1947 # Protein_GI_number: 15827038 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Mycobacterium leprae # 1 207 95 299 311 128 50.0 8e-30 KRVPVAGGMAGGSADAAAALVACNELWGLGLGPDRLERIGRALGADVPACLTGGIALGTG RGDRMAPLECPGTRHHWVIALAHEGLSTADVFAEFDRAAQGAAAQGAPAVPTDEELADLT GPARLAGPRLVNDLTAAALSLRPELGGTLNAARAAGALAAIVSGSGPTVAALASSPDQAD EIAERLAAAPAVARVLTATGPAPGARVDRQEG >gi|319977321|gb|AEUH01000231.1| GENE 2 644 - 2263 178 540 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 321 504 21 214 305 73 30 2e-13 MAHLLGTQSVAAMAGSRRLLESVDVGLDDGSRVGILGPNGAGKSTLLRIVAGIQEPDGGV VTRRDGLRVAVLSQADALDPADTVAAAVHPGRAEYEWASDPRVRDIHEGLLASVPLAAAV GALSGGQRRRVALARVLAADADIVCLDEPTNHLDVEGVAWLAHHLNARFARPGASGALLA VTHDRWFLDAVCEHIWEVVPGVDPGGKRPQIPGRVEIYDGSYAAYTLARAERARQAGVAA RKRANLLTKELAWLRRGAPARTSKPKFHIAAAEALIADAPPPRDSVELVEMASARLGKDV IDLEDVSVSFTRPDSSRLDVLRGVTWRLAPGERVGVVGVNGAGKTTVLNLLRGEVEPTGG RVRRGKTVKVATLSQETRELDAVAGMRVVEAVADIAQVVVAGGKEITAQQMTERMGFTRE RAHTRVKEISGGERRRLQLMRLLMSQPNVLLLDEPTNDLDTDTLAAMEDLLDSFAGTLVV VSHDRYLLERVTDHQVALLGDGTLRALPGGVEQYLQMRRGAAAGPSGGAAGGAGEGAADG Prediction of potential genes in microbial genomes 
Time: Thu May 12 18:58:51 2011 Seq name: gi|319977302|gb|AEUH01000232.1| Actinomyces sp. oral taxon 178 str. F0338 contig00232, whole genome shotgun sequence Length of sequence - 16692 bp Number of predicted genes - 21, with homology - 15 Number of transcription units - 15, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 310 - 372 61 ## 2 2 Op 1 . + CDS 723 - 1001 433 ## Gobs_1770 protein of unknown function DUF1540 3 2 Op 2 . + CDS 1062 - 1553 111 ## 4 3 Tu 1 . + CDS 1857 - 2009 159 ## 5 4 Tu 1 . - CDS 1976 - 2083 164 ## 6 5 Tu 1 . - CDS 2515 - 3138 877 ## COG1309 Transcriptional regulator + TRNA 3495 - 3565 41.7 # Gln TTG 0 0 + Prom 3492 - 3551 76.8 7 6 Op 1 11/0.000 + CDS 3665 - 4663 1235 ## COG1207 N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) 8 6 Op 2 9/0.000 + CDS 4688 - 5659 1388 ## COG0462 Phosphoribosylpyrophosphate synthetase 9 6 Op 3 . + CDS 5816 - 6409 464 ## PROTEIN SUPPORTED gi|227492556|ref|ZP_03922872.1| 50S ribosomal protein L25 + Term 6562 - 6586 -0.3 10 7 Op 1 . + CDS 6594 - 7688 1271 ## COG0836 Mannose-1-phosphate guanylyltransferase 11 7 Op 2 . + CDS 7754 - 8944 1735 ## Sked_25760 nucleoside-binding protein 12 8 Tu 1 . - CDS 8833 - 9129 185 ## 13 9 Op 1 2/1.000 + CDS 9427 - 10575 1390 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 14 9 Op 2 . + CDS 10580 - 11332 971 ## COG1611 Predicted Rossmann fold nucleotide-binding protein + Prom 11345 - 11404 3.9 15 10 Tu 1 . + CDS 11457 - 11621 181 ## Tfu_0502 hypothetical protein - Term 11367 - 11404 -0.9 16 11 Tu 1 . - CDS 11607 - 11855 102 ## - Term 12056 - 12086 2.1 17 12 Tu 1 . - CDS 12132 - 12779 897 ## COG4122 Predicted O-methyltransferase 18 13 Tu 1 . 
- CDS 12897 - 14027 1335 ## COG0489 ATPases involved in chromosome partitioning 19 14 Op 1 3/0.000 - CDS 14161 - 14700 850 ## COG4420 Predicted membrane protein 20 14 Op 2 . - CDS 14693 - 15982 1885 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) 21 15 Tu 1 . + CDS 16009 - 16690 690 ## Jden_0702 hypothetical protein Predicted protein(s) >gi|319977302|gb|AEUH01000232.1| GENE 1 310 - 372 61 20 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLDEIDGLEEAWLEAAELLE >gi|319977302|gb|AEUH01000232.1| GENE 2 723 - 1001 433 92 aa, chain + ## HITS:1 COG:no KEGG:Gobs_1770 NR:ns ## KEGG: Gobs_1770 # Name: not_defined # Def: protein of unknown function DUF1540 # Organism: G.obscurus # Pathway: not_defined # 5 92 7 96 96 87 58.0 1e-16 MTTAMPRILDCSVTSCSYNKTKSCSAAAITVGYASTCTTFIPLSVKGGLDKPQSFVGACQ KADCVHNSALECTADSISVGAGTADCLSYEAR >gi|319977302|gb|AEUH01000232.1| GENE 3 1062 - 1553 111 163 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGCGPFPRVGARNDRALYTTVPGGGFPPPGASRMADRILWRTAIRRGQGWDPGASGLGA RGRIGGRPSTRLTTRSADERGTGLFVFGGAVVVGALNALCEGGLIHNERRSAHSGLWRDR AEAGKATAARVHWCDSARPRLALGGAVLKQAQSQHAATRRKAV >gi|319977302|gb|AEUH01000232.1| GENE 4 1857 - 2009 159 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQLTFPNATRGAREGAVRARRELFSQRCLRFHSATTLCKRYNTCEVAASL >gi|319977302|gb|AEUH01000232.1| GENE 5 1976 - 2083 164 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPYFQREVSPAGWMQWAVLQAMLSSQRDAATSQVL >gi|319977302|gb|AEUH01000232.1| GENE 6 2515 - 3138 877 207 aa, chain - ## HITS:1 COG:MT1047 KEGG:ns NR:ns ## COG: MT1047 COG1309 # Protein_GI_number: 15840447 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Mycobacterium tuberculosis CDC1551 # 5 203 3 200 202 216 54.0 2e-56 MAEKRTRMSGVARREQLVAVGRSLFAEKGFDATSVEEIAARAKVSKPVVYEHFGGKEGLY AVVVDREVQSLIASLHSSLESSDRPRLILENATLALLDYIESNTDGFRVLVRDAPTDRTA GSFSSVMGDVASRVEHVIAAQFERSDFPTSWSPLYAQMLVGLIAQVGQWWLDDRRMKKDE VAAHVVNLVWNGQRNLRPNPALRLRQA >gi|319977302|gb|AEUH01000232.1| GENE 7 3665 - 4663 1235 332 aa, chain + ## HITS:1 
COG:Cgl0919 KEGG:ns NR:ns ## COG: Cgl0919 COG1207 # Protein_GI_number: 19552169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) # Organism: Corynebacterium glutamicum # 1 327 21 349 485 256 47.0 4e-68 MKSRLPKVVHPVAGLSMIGHALRAAVGVDPDDVVAVVRHQRDVVAAEILRVLPSAIVADQ DEVPGTGRAVQCGLAALDGARGAVHGTVLVTYGDVPLLSSQTLARLVGAHEASGDAVTVL TSRVEDPTGYGRILRAPDGSVAAIVEQRDATAEQALITEINAGIYAFDADFLRGALAGLG TDNDQGEVYLTDALAAAARSGRGAGALELDDVWQTAGANDRAQLAELGAEFNRRICAGHM RAGVTIIDPASTWIGVDVVIGADTTIHPGTVLRGATNIASNCEIGPSATLIDAEVGERAV VPTAWVGGGAVAADTIVEPYSTIGTTIRAPRA >gi|319977302|gb|AEUH01000232.1| GENE 8 4688 - 5659 1388 323 aa, chain + ## HITS:1 COG:Cgl0918 KEGG:ns NR:ns ## COG: Cgl0918 COG0462 # Protein_GI_number: 19552168 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Corynebacterium glutamicum # 9 322 9 321 325 418 65.0 1e-117 MSGLVTTGEKNLVLVSGRAHLDLAQAVGEEIGCGVSPVTAYDFASGEIYVRFNESVRGAD VFVLQSIAGDINKWLMEQFIMIDAAKRASAKRITAVAPCYPYSRQDKKHQGREPISARLM ADLYKAAGADRIMSVDLHASQEQGFFDGPVDHLFAMPVLVDYVRGRVDLANAVIVSPDAG RIRVAEKWSTKLGGCPLAFVHKTRDTTRPNVAVANRVVGEVAGKECVLVDDMIDTAGTIT EAIKVLTNAGAKKVIVAATHGILSDPAVRRLSESGATEVVITDTLPVSAEKRFPSLTILS IAPLLARAILEVFEDGSVTSLFT >gi|319977302|gb|AEUH01000232.1| GENE 9 5816 - 6409 464 197 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227492556|ref|ZP_03922872.1| 50S ribosomal protein L25 [Mobiluncus curtisii ATCC 43063] # 1 196 1 197 208 183 49 8e-46 MNDTPVLIAETRTEFGKGASRRARRAAKIPAVLYGHGADPVHIDVPGHDLFLIVRGTKNA LVELKVDGKSQLALVKEVQVHPVSRNLLHADFLAVKAGEKVDVEVPVVVDGESAPGTAHT IEEFTILVKASATAIPEDLKVSIEGLEAGSAVRVADLVLPAGVEVELDPEQIIVSIQETA AAEEEEAPAEAEASEAE >gi|319977302|gb|AEUH01000232.1| GENE 10 6594 - 7688 1271 364 aa, chain + ## HITS:1 COG:DRA0032 KEGG:ns NR:ns ## COG: DRA0032 COG0836 # Protein_GI_number: 15807702 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Deinococcus radiodurans # 6 357 14 364 372 210 40.0 4e-54 MNSPGFHVIIPAGGAGTRLWPLSRRARPKFLLDVTGSGRTLLQATVDRMAASASSVTIVT GAAHREGVLEQLPEFASRDDRTLLVEPSGRDSMAAIGLAAYVVRARSGDDAVVGSFAADH VVARPDLLERAVASGIGAARQGYVVTVGIEPTGASTAYGYIQPDRSGDLPAPSGALAVRS FVEKPDGATAERYVDQGFLWNAGMFLMTAGVLAGHLARLMPDMHERLEAIAAQWGGDGFE DALALHWPHLTRIAIDHAIAEPAAAQGGVAVVPADEALGWTDLGDFDALNAVRGPQGALG DAVRIDSDGAGVFSTTGQAVALVGVPDVNVVVADDAVAVIRSGMGQSVKAVVDALGASGR DDLL >gi|319977302|gb|AEUH01000232.1| GENE 11 7754 - 8944 1735 396 aa, chain + ## HITS:1 COG:no KEGG:Sked_25760 NR:ns ## KEGG: Sked_25760 # Name: not_defined # Def: nucleoside-binding protein # Organism: S.keddieii # Pathway: not_defined # 70 389 87 361 365 111 32.0 6e-23 MKKTTTAALAVLALVGLAACSNGEGQSIADAAGIKVCAAVSGSHGGAQADPVARSLADAA ARAGASFEAAEGADSVPALAAGDCSLVVATDSTLAAAVSDAARSTGDKRFALVGSAFTDS KGNPEKPSNGTVINVQASQGSYLAGYAAAGMTTTGTVAVVGGARDLVTLSQMDAFVQGVD AYNLATGTSIQVLGWSPVMQEGAFASDADGVRGFTETFIAQGADIVMPVADAANSGAAQA IAARDNPALRLVWTGLDKSDLKGISREDAESGEDGQLPMSGQSGAQAQRVDTSSFADAFS SIQLDDEARAAIGAPVLTSVVVDYSGAFDALIASAAGGATAPMQYVSLISGGVILTGFGS YTPMLSTELKLGLSDMAGQIETGGLAVTTQYDVVGL >gi|319977302|gb|AEUH01000232.1| GENE 12 8833 - 9129 185 98 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSSLIARKRADFHHRLIESGTLAISPSGVASNADRSQQFSQRVALAVARALGAQPDRVVL PDHSPTTSYCVVTARPPVSICPAMSDSPSLSSVDSIGV >gi|319977302|gb|AEUH01000232.1| GENE 13 9427 - 10575 1390 382 aa, chain + ## HITS:1 COG:MT1240 KEGG:ns NR:ns ## COG: MT1240 COG0624 # Protein_GI_number: 15840646 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Mycobacterium tuberculosis CDC1551 # 8 379 7 354 354 311 50.0 1e-84 MNQLPVDDPVALTLALVDLPSVSGDEARAASAVEAALRAAGYGEDSRLEILRDGDAVCAR TRLGLATRIAVAGHLDTVPIADNVPGRREQRDGRDTVWGRGSVDMKGGVAAALASACRIG GMVRAGEPPTADATWIFYDHEEVASHLNGLGRVQRNHPRWLEADLAVLGEPTGAHIEGGC 
NGTLRVIARFEGEASHSARAWRGDNAIHKTAPVIARVAAFGNPVVAVDGLDFRESLSVVR ISGGIADNVVPDAASMTVNYRFAPSKSAEQALGFVRGLFEGSGARLEVDDLCDGARPGAD SPAASRLIDAARAVAARDGGEVRVRAKVGWTDVARFSAAGVPALNFGPGEPLLAHTRDEH VAADEVVRCSDTLMAMLTGKGL >gi|319977302|gb|AEUH01000232.1| GENE 14 10580 - 11332 971 250 aa, chain + ## HITS:1 COG:Cgl1086 KEGG:ns NR:ns ## COG: Cgl1086 COG1611 # Protein_GI_number: 19552336 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Corynebacterium glutamicum # 17 241 17 239 256 263 55.0 2e-70 MAHDAERTNGRRIYQRGPVILRGHQIPDETSDASLLRTGAGADWMHEDPWRVLRIQSEFV DGFGALAEVGPAISVFGSARTPSSSPDWERARSVGRMLAERGYGVITGGGPGMMEAVNRG AWEAGGTSIGLGIELPHEQSMNRWVNLGVNFRYFFARKTMFVKYSQGFIVMPGGFGTMDE LFEAITLVQTGKIESFPVVLVGHDYWDGLIDWVRSTMAGAGMVGPEDVDLLSVVDGVEEA VDLATGSLSA >gi|319977302|gb|AEUH01000232.1| GENE 15 11457 - 11621 181 54 aa, chain + ## HITS:1 COG:no KEGG:Tfu_0502 NR:ns ## KEGG: Tfu_0502 # Name: not_defined # Def: hypothetical protein # Organism: T.fusca # Pathway: not_defined # 1 54 1 54 55 70 83.0 1e-11 MAAMKPRTGDGPLEATKEGRGIVMRIPSEGGGRLVIELTPEEAGELAAALSEAV >gi|319977302|gb|AEUH01000232.1| GENE 16 11607 - 11855 102 82 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRAVSPLGAGRQVHPWRARPDAVDPPASRTHPHKQRHTLIHLARIRPPAGGGLADPDTAL PRSCQTGAVGPEHWIYIGLNRL >gi|319977302|gb|AEUH01000232.1| GENE 17 12132 - 12779 897 215 aa, chain - ## HITS:1 COG:MT1258 KEGG:ns NR:ns ## COG: MT1258 COG4122 # Protein_GI_number: 15840664 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Mycobacterium tuberculosis CDC1551 # 14 214 14 214 215 151 43.0 1e-36 MIEEGAVADKAQAWVYTEDFIPLSHALLQAHQTASELGVPRVSTGTGTALRMLAAVSGAR AVLEIGTGTGVSGLWLLDGMAAGGVLTTIDCESELLPHARRAFRGAGVPSHRTRLIAGRA LDVLPRMATGGYDMIVLDGDIAETPQYLDYAVRVLRTGGTIAIVHALWHDQVADPARRDT ETVVAREVVNFLRESDQFIPAVLPVGDGLAVAVKR >gi|319977302|gb|AEUH01000232.1| GENE 18 12897 - 14027 1335 376 aa, chain - ## HITS:1 COG:ML1080 KEGG:ns NR:ns ## COG: ML1080 COG0489 # 
Protein_GI_number: 15827530 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Mycobacterium leprae # 8 376 15 383 383 369 57.0 1e-102 MTPSIDMVMDALSQVIDPEIRRPITDINMVTPDLVRIDGAVVHVKVLLTTAGCPLRTAIS KDVTARVGALDGVEHVNVEMGVMDDEQRKALREKLNGGRPEREIPFSRPDSLTRVIAVTS GKGGVGKSSMTANLAAALAGEGLKVGVMDADIYGFSIPRMLGIGHDPQVIDGMMIPPVGA SGVKVISIGMFVPDGQAVIWRGPMLHRALQQFLADVFWGDLDVLLIDMPPGTGDVAISIA QLLPTSQILVVTTPQVAAAEVAERAGSIASQTNQKVIGVVENMSFLPQPDGSRLEIFGSG GGQSVSERLSAQLGYEVPLLAQVPLDIALREGGDRGQPVVGAQGPAADALRAIGHRLAGA ERGLAGRPLGVSPVRR >gi|319977302|gb|AEUH01000232.1| GENE 19 14161 - 14700 850 179 aa, chain - ## HITS:1 COG:Cgl1100 KEGG:ns NR:ns ## COG: Cgl1100 COG4420 # Protein_GI_number: 19552350 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Corynebacterium glutamicum # 1 152 1 148 193 144 51.0 1e-34 MADRLDNPLERRRWFSRRKGAGRGDGFGRFAEATARFMGSPKFVLYMTVFVVLWIAANLM LYRISEEAAWDPYPFILLNLAFSTQASYSAPLIMLAQNRQDDRDRVQAQQDRQRAERNLA DTEYLTREIAGLRLALQDVATRDFVRSELRDMLEELRADLSGPPEGGEGTGGAESPDQG >gi|319977302|gb|AEUH01000232.1| GENE 20 14693 - 15982 1885 429 aa, chain - ## HITS:1 COG:Cgl1101 KEGG:ns NR:ns ## COG: Cgl1101 COG2239 # Protein_GI_number: 19552351 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Corynebacterium glutamicum # 9 413 6 418 430 302 43.0 1e-81 MELGTKGRRVYIGKLAGTSVFDPLGDQVGKVHDVVVVFRLKSEANVIGVVVEVGPRKRVF LPLTRITSIESGSVITTGLLNIRSFNQRPIETLVVSELFDRVVTMNDGSGQVQILDVAMR QRRPKDWVISTLHVQRVRTSALGFSRRGETLTVDVSEVSGLLKTDSNQAATALVQYTEDM RPADLADFMHTLPLDRKMAVALQLTDARLADVLEELGDDDRVAIVSGLDAARAADVLDVM QPDDAADLVSELPEKQAQSLLALMEPEEAADVRRLMTYEESTAGSLMTTEPVILGPNATV AQMLAAVRREDIPASIATIAFITRPPQEAPTGQYLGMVHIQRALREPPQTLLGTILDRDI EFVAPESHVATVTRLLATYNLAVLPVVDEDGHLMGAVSVDDVLDALLPQDWRDFDDDVTD RIMARSIDG >gi|319977302|gb|AEUH01000232.1| GENE 21 16009 - 16690 690 227 aa, chain + ## HITS:1 COG:no KEGG:Jden_0702 
NR:ns ## KEGG: Jden_0702 # Name: not_defined # Def: hypothetical protein # Organism: J.denitrificans # Pathway: not_defined # 37 185 32 179 295 121 48.0 2e-26 MPQRPSVCPAPLDPWENGGKHTTGGQMPQVTRTHPAPTMPTGTEVASYATYLEAQQGVDF LSENGFDVSAITIVGSDLHLVERVTGRLTIARAALSGASNGGLWGALFGMLMSAGAGGAV TGAWVFGGLLGGALLGMGLSALAYTARGGRRDFVSSSQVVASRYAVLASSDIDTAYLLLQ RTPGNQSRARAARGRAEDSGPTEYGSRPDEEPRFGVRLPGARRADGA Prediction of potential genes in microbial genomes Time: Thu May 12 18:59:36 2011 Seq name: gi|319977295|gb|AEUH01000233.1| Actinomyces sp. oral taxon 178 str. F0338 contig00233, whole genome shotgun sequence Length of sequence - 6884 bp Number of predicted genes - 8, with homology - 4 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 553 - 2061 1425 ## COG0006 Xaa-Pro aminopeptidase 2 1 Op 2 . - CDS 2098 - 3099 1278 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D 3 2 Op 1 . + CDS 3196 - 4059 969 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) 4 2 Op 2 . + CDS 4056 - 4697 1015 ## COG2095 Multiple antibiotic transporter 5 3 Tu 1 . + CDS 5205 - 5960 893 ## 6 4 Tu 1 . - CDS 6152 - 6307 159 ## 7 5 Tu 1 . + CDS 6249 - 6380 214 ## 8 6 Tu 1 . 
+ CDS 6515 - 6709 114 ## + Term 6754 - 6797 8.9 Predicted protein(s) >gi|319977295|gb|AEUH01000233.1| GENE 1 553 - 2061 1425 502 aa, chain - ## HITS:1 COG:sll0136 KEGG:ns NR:ns ## COG: sll0136 COG0006 # Protein_GI_number: 16331163 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Synechocystis # 67 493 27 440 441 185 31.0 2e-46 MSEQTSQSMADRGENRSRRPSSTAFRDFIGSGWGPRPTELPARERVADFLNDRTLKAGAP FPGERLVIPAGPYKVRSNDCDYRFRAHSAFAHLSGLGGEKEPDTVLVLEPNDDGTHTPLL FFKPRTSRSSKEFYADARYGEFWVGARPSLEELSAQTGLETRHIDTLRDALAKDAGTVQL RIVRGVDTNVEAMVNEVRSQAGLPAGEEAREDDERLEERLSEIRLTKDAFELEEMVRAVE VTKAGFEDIIRILPRAVGHRRGERVIEGAFRAVAREEGNGEGYETIAASGNHANTLHWID NDGQVREGDLVLVDAGVEVDSLYTADITRTLPVNGRFTEVQARVYQAVLDACEAALARAN EPGCRFKDVHDAAMGVIATRLHEWGILPVTPEESLAPEGQQHRRWMPHGTSHHLGLDVHD CAKARDELYKGALLEPGMVFTIEPGLYFRADDLLIPEEYRGIGVRIEDDVVVNSDGSVTR ISEDIPRTVADVEAWIARVQAE >gi|319977295|gb|AEUH01000233.1| GENE 2 2098 - 3099 1278 333 aa, chain - ## HITS:1 COG:ECs2474 KEGG:ns NR:ns ## COG: ECs2474 COG0252 # Protein_GI_number: 15831728 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Escherichia coli O157:H7 # 3 329 6 335 338 218 39.0 1e-56 MRIHVTYAGGTIGMVDSPGGLRPGADLEGWLRRQLDGIEASHTISVSSLDPLIDSAEATP GDWQTIADDIRTHAPAADAFIVLHGTDTMAYTSSALSYALADLGRPVVLTGSQCPLGVVG SDASPNVTGALRAAMSGRAQGTTLFFGHLLLAGNRVTKTSSWAYNGFDSPAVPPLARTGA PWRWGGPPSPGTGWESPEPFAAHDVAVISVVPGMSGARLRAMLSPAPDAVVLRCFGVGNI PASQPGLVPALAALREAGAPLVVASQCHQAEVVLGHYEAGGSLAGLGAISAHDMTLEAVY AKTVFLLSQGLRGEEFAAWMNRSIAGELTARQD >gi|319977295|gb|AEUH01000233.1| GENE 3 3196 - 4059 969 287 aa, chain + ## HITS:1 COG:BH2283 KEGG:ns NR:ns ## COG: BH2283 COG0613 # Protein_GI_number: 15614846 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Bacillus halodurans # 11 255 8 255 290 126 35.0 4e-29 
MGYPRRMTRIDPHTHSSYSDGTDAPAQLMRAARDAGLSMVGLTDHDTFAGWDEAAAAVGE TGVALLRGVEISCSADGITVHLLGYLVDPADAALNGAFARTVEDRRTRARRMVDNLAADF PITWEQVLAFAPSDGPVGRPHIADALVAAGAFPDRDSAFVHALHPSGPYYARHWAPDPVD AVRMVRGAGGVPVLAHPRARARQRLLPEDVIERMAEAGLYGIERDHRDHDEAGRADVDRL AARLGLAITGSSDYHGTGKPNRLGENTTDPAVIDGIVSQGTLDVVHP >gi|319977295|gb|AEUH01000233.1| GENE 4 4056 - 4697 1015 213 aa, chain + ## HITS:1 COG:AF2111 KEGG:ns NR:ns ## COG: AF2111 COG2095 # Protein_GI_number: 11499694 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Archaeoglobus fulgidus # 10 203 7 204 213 98 33.0 9e-21 MTLTPFVFDITLFTTTLMTLLVIMDPIGTVPVFLALTGRLTAPKQKSAARQATSVSLGII IAFAILGGQILRFLQISVDALRLSGGVLLFLVAMELLMGSDSSTPDTGDDAVNVALVPLG TPLLAGPGAIVAVMVAVGQAGASLSGWLSVLSAVALAHVVMWASMRFSLFLSKLLGPGGI MLLTKISGLLLAAIATELVMGGVFGFIANAKAL >gi|319977295|gb|AEUH01000233.1| GENE 5 5205 - 5960 893 251 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGKTIRALAAAALSLGAVIGASGLASATGGQQEGAGTPTGAQTAASQEAPATVVAPAGNE APASGGAEASDQADAEAAQSGPGAPADEEKTYKPKTTDKAGYAEAIKDAQDDAAYYCAEG TEYYDMAKCANAKAAETELKKQDAALDAIYYCAEGTEYYDMAKCADAIERAGDAYKPDAG PQNPDEGEGPQSDGATADPAPQSGASSAPAPSGSAGAGPLAKTGAVAAVVVGAAGALAAA GAGLVALRKRA >gi|319977295|gb|AEUH01000233.1| GENE 6 6152 - 6307 159 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRKTVKLTARRQTGADGSDNAHRRLERTVTPTAPPPPPPSRTGPPPAPPSG >gi|319977295|gb|AEUH01000233.1| GENE 7 6249 - 6380 214 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSEPSAPVWRRAVSFTVFRIIPIIFICNHACGPMVVAIQLRSN >gi|319977295|gb|AEUH01000233.1| GENE 8 6515 - 6709 114 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAAKAAAQGGCPSRADGAGVGYDCDDCDTPVSVGRNAEPRLADTGSLMAVVGGLAGMPLA GASP Prediction of potential genes in microbial genomes Time: Thu May 12 19:00:09 2011 Seq name: gi|319977279|gb|AEUH01000234.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00234, whole genome shotgun sequence Length of sequence - 15537 bp Number of predicted genes - 15, with homology - 11 Number of transcription units - 12, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 100 - 165 122 ## 2 1 Op 2 . + CDS 162 - 2528 3979 ## COG0058 Glucan phosphorylase + Term 2551 - 2594 17.5 - Term 2201 - 2253 0.5 3 2 Tu 1 . - CDS 2483 - 2962 189 ## 4 3 Op 1 8/0.000 + CDS 2820 - 3209 452 ## COG1725 Predicted transcriptional regulators 5 3 Op 2 . + CDS 3209 - 3928 190 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 6 3 Op 3 . + CDS 3925 - 4734 1167 ## gi|154509561|ref|ZP_02045203.1| hypothetical protein ACTODO_02093 + Term 4794 - 4844 8.1 + Prom 4931 - 4990 1.6 7 4 Tu 1 . + CDS 5035 - 5952 1057 ## Namu_5330 putative transcriptional regulator 8 5 Tu 1 . + CDS 6202 - 7851 1758 ## COG0281 Malic enzyme - Term 7852 - 7882 -0.7 9 6 Tu 1 . - CDS 8055 - 8117 88 ## 10 7 Tu 1 . - CDS 8480 - 8953 -192 ## 11 8 Tu 1 . + CDS 9045 - 10133 1227 ## COG0666 FOG: Ankyrin repeat + Term 10352 - 10388 4.6 12 9 Tu 1 . - CDS 10875 - 12548 2223 ## COG0513 Superfamily II DNA and RNA helicases 13 10 Tu 1 . + CDS 12898 - 13527 601 ## HMPREF0573_10156 hypothetical protein 14 11 Tu 1 . - CDS 13737 - 13961 412 ## Xcel_0776 hypothetical protein 15 12 Tu 1 . 
+ CDS 14093 - 15536 1350 ## COG0210 Superfamily I DNA and RNA helicases Predicted protein(s) >gi|319977279|gb|AEUH01000234.1| GENE 1 100 - 165 122 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNVVALGCVGTARIENPKERI >gi|319977279|gb|AEUH01000234.1| GENE 2 162 - 2528 3979 788 aa, chain + ## HITS:1 COG:Cgl1277 KEGG:ns NR:ns ## COG: Cgl1277 COG0058 # Protein_GI_number: 19552527 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Corynebacterium glutamicum # 9 786 12 791 795 1056 68.0 0 MTLDLTQSLSSQVRATSGRPVASSTPMEVWQGLSAAVVAQIADRWDATSKKYAAGRQEHY FSAEFLMGRALLNNLSNLGLVDEARDALAAEGIDLSTVLEEEPDAALGNGGLGRLAACFL DSCATLDLPVTGYGILYRYGLFKQLFDNGFQTEHPDPWMEEGYPFVIRREELQRIVTYAD LTVRAIPYDMPITGYGTKNVGTLRLWKAEPLEEFDYDAFNSQRFTDAIVDRERTMDISRV LYPNDTTFEGKVLRVRQQYFFCSASLQEIIDNYVRHHGDDLTGFAEYNAVQLNDTHPVLA IPELMRLLMDDHDLGWEDAWDVVSRTFAYTNHTVLAEALETWDIHIFDRLFPRIAEIVRE IDRRFRCDMAQRGLDQGTIDYMAPVSGNTVRMAWIACYASYSINGVAALHTEIIKRETLK EWHAIWPERFNNKTNGVTPRRWLKQCNPRLSALLDEVTGSDAWVRDLTVLGDFTDADTDA VLEGLAEVKRANKVDFAAWVREREGVEIDPDAIFDVQIKRLHEYKRQLLNAFYVLDLYFR LKEGQELDAPSRVFVFGAKAAPGYTRAKAIIKLINAIADVVNSDPDIDGRIKVVFVHNYN VSPAEHIIPAADVSEQISTAGKEASGTSNMKFMMNGALTLGTLDGANVEILDAVGPDNAY IFGATEDELPELRRTYDPRWHYENVPGLRRVIDALTDGTLDDNGSGWFHDIRWSLMEGGF DPADVYYVLGDFAAYREAKDRMARDYLDQKSWNRKVWANISRSGRFSSDRTIRDYAEGVW RIGAEPIA >gi|319977279|gb|AEUH01000234.1| GENE 3 2483 - 2962 189 159 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFGLTPISIASSRTLGTFAPGSHSPESIRAWKASANWTQMGRVSSARIASPPLTHCVSTV IQSHGGRALSNPRQTGPRSARAAVRCSGGTRGARGRPDGGRIHTAVSMRCSARGAARTPG PLRGHKGAPGARMRPRGRPGAADQPQAIGSAPMRHTPSA >gi|319977279|gb|AEUH01000234.1| GENE 4 2820 - 3209 452 129 aa, chain + ## HITS:1 COG:SP1714 KEGG:ns NR:ns ## COG: SP1714 COG1725 # Protein_GI_number: 15901548 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 4 113 6 115 121 87 33.0 6e-18 MRADDTRPIWVQLADAFQARILSGEWEPGAKVPSVRELAIEMGVNPNTVQRSLASLDSRG 
LTLTERTAGRFVTTNADAINEARLGAASAAADVYIAAARDYGIDEDSAVALVAARWRASA GPGVPVGAR >gi|319977279|gb|AEUH01000234.1| GENE 5 3209 - 3928 190 239 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 11 222 3 223 305 77 28 4e-14 MAAPVQPRGGALLEIDGLTLRRRSAVVLDRVSLSLDPGHIVALMGANGSGKTTLMKTVAG VLADYEGRVAIGGSAPGARTKGIVSYLPDAPFLDPATTPERAVGMYSHFFADFDRAKAGE LVERFGLVEHQPVREMSKGMREKLHIALVMSRRARLFLLDEPLSGVDPVARRVVLEEIVR DFAPDALMLIATHLIGDVEAVADRAAFMDAGRIVFEGDADDLRAERGAFLEEIFRGELR >gi|319977279|gb|AEUH01000234.1| GENE 6 3925 - 4734 1167 269 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509561|ref|ZP_02045203.1| ## NR: gi|154509561|ref|ZP_02045203.1| hypothetical protein ACTODO_02093 [Actinomyces odontolyticus ATCC 17982] # 7 268 2 269 270 78 26.0 3e-13 MTGAVGFRRYFRYELREQVRRDAMYLIALVLVGALAHVAVSLGAASSGMRRLLYEVGMTV PLLAIIRCLFDYWHTVHRSRGYFTMTLPLRGRTILMASCLRLSIGFAAAAAVGALILAVS GLAQANPLADLFSPALPYYWTLDPVRALVLTSIEVVADLCIPVQLVCLVTIGCSSRWRYL GSKVPVIATVAWFAAYTSLGALAERLAALGLDDEGALEPNFFFNNIFVSRQIDFTQVFVP LGTNLSVIGFTAVLLWWAVRAIERHTSLL >gi|319977279|gb|AEUH01000234.1| GENE 7 5035 - 5952 1057 305 aa, chain + ## HITS:1 COG:no KEGG:Namu_5330 NR:ns ## KEGG: Namu_5330 # Name: not_defined # Def: putative transcriptional regulator # Organism: N.multipartita # Pathway: not_defined # 23 281 42 303 313 210 46.0 6e-53 MTASFVPASPDLIWMVFIQQALVLIAWVIGLVVRRVLKNRMTMSTASATLTGLAGLWGGL VLAGWIFDSGDLWKPGMIGFAAVVALLVVGAVAVVLARLNPRPGLPPISETAALGESETR EFKSSARWNVRTGKRDEAMETVIAKTVSAFLNSGGGTLLIGVDDEGRLIGLDDDYATLKS PDADRFELWMRGMWGQRLGTNAAALPILDFASAPSGGDVCRVTIPPSPRPVYLLGPKGKG APELWVRVGNSTRRLEVSDAVLYVSQRWPEAVRVSFWTRLRMYFVLRNRERPAHLPRAVE EALRG >gi|319977279|gb|AEUH01000234.1| GENE 8 6202 - 7851 1758 549 aa, chain + ## HITS:1 COG:BS_malS KEGG:ns NR:ns ## COG: BS_malS COG0281 # Protein_GI_number: 16080040 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Bacillus subtilis # 3 549 17 565 566 575 51.0 1e-163 
MPRKPQITADPLTNRGTAFTTAERARLGIMGRLPSAVETLDQQAARVYKQLGRLEDDLDK YIYLEQLHDRNEVLYYKVVIDHIAELLPVIYDPTIGEAIKKWSADYRRSRAIYLSVDRIE DVRPTFEGLGLGPDDVDLLVVSDAEQILGIGDWGVNGTDISVGKLAVYTAAAGIDPSRVI AVNLDVGTDNEELLNDPDYLGNRHGRVRGERYDALVDEYLNVASELYPRALLHFEDFGAS NARRILVDNRSKYRIFNDDMQGTGAIVISAVIAGMKANGTTFADQRLLVYGAGTAGTGMT DQIHAGMVRAGLTPEQAKDRIWLIDRAGLVTDGMEGLPDYQAAYARSASEVADWEHQGGV IGLLETVRRVHPTILIGTSTDHGAFTQDVIEALSSGCERPIVLPLSNPTERIEAMPQDII AWSKGQALVATGIPILPFDYEGTTFHIGQGNNSLLYPGLGLGTIVSGVPHVTDAMILAAA EAVAGQVASHDLGASLLPLVDHLRASSATVAVAVVRQAIADGQCDMDPGDVVEAVRRAMW QPVYEDLEA >gi|319977279|gb|AEUH01000234.1| GENE 9 8055 - 8117 88 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTRNLPQCGPAPDIELRDSL >gi|319977279|gb|AEUH01000234.1| GENE 10 8480 - 8953 -192 157 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPGTPPPKPTRPGSGPRPPAAPPPSAAPSARSSPSTPGPHREPTTSHNTHPAPPANTTNH TLKQPKRHRYPETSQSSARVIPPSLCQQPPHDVDNARSRPHPGNARPLPVPDPGPTPPDT HKEPDSRSTLRIQGRNPAVMRNATFPDSLPCARHQEG >gi|319977279|gb|AEUH01000234.1| GENE 11 9045 - 10133 1227 362 aa, chain + ## HITS:1 COG:Cgl0131 KEGG:ns NR:ns ## COG: Cgl0131 COG0666 # Protein_GI_number: 19551381 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Corynebacterium glutamicum # 2 350 4 343 355 93 26.0 6e-19 MRLSETLPEDIEEIIASGDLGAVARAVEGCQIGAYLRRSAYEPRLMHFPASQEITAFLLE RGEDIDSRDRYGQTPVHWRVMNRRVDQIPYLVSRGADINARDRDDRTPLFDAVRILEASD VERMISWGANARVVAKSRFHGKVTLTEHALGEEVFLDAPRALGVIRVLLAHRARTGKRER ESLRSMDSLRCAFITHGHSGVPDSLFDEAVAALAQLCALFGVEQREAVPAPVLGERLVFD ESASVGRQFDDLWDRLVPVSGQSASLQGEVIRIAGKVGYEVYDNGCVNWGSSFDMLLDQF LSIVTSRTGLSPDDVERARAAVDSLKAESMETQACDDLIRLAVRWARLNPVLIATDVPDV GR >gi|319977279|gb|AEUH01000234.1| GENE 12 10875 - 12548 2223 557 aa, chain - ## HITS:1 COG:ML0811 KEGG:ns NR:ns ## COG: ML0811 COG0513 # Protein_GI_number: 15827355 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Mycobacterium leprae 
# 61 524 33 499 544 456 54.0 1e-128 MTSTNTQGTDEATAPANPAPASDHYSAGVVRLGNIVPAAPEVPEATPDISERGDEDFSRK SFADFGVTDPIVDALEDQGITHPFPIQALTLGPALDRHDIIGQAKTGTGKTLGFGIPVLE DVIAPDEPGFDDLLNPNSPQALIVLPTRELSKQVASDLRAAAKYLSTRIVEIYGGVAFEP QISALKKGADVVVGTPGRLIDLLRQGHLHLSGVETVVLDEADEMLDLGFLPDVETLLSRV PSHRHTMLFSATMPGPVVALARKFMEHPTHIRAQDPDDQHQTVNTVKQVVYRVHSLNKVE VLARILQSKGRGRAVIFCRTKRTAARLGEDLAARGFAVGSLHGDLGQGAREQALRAFRNG KVDVLVATDVAARGIDVDDVTHVVNYQCPEDEKIYIHRIGRTGRAGNSGTAVTFVDWDDT PRWSLISKALGLGVPEPLETYHTSPHLFTDLDIPEGTTGRLPRAKRTRPGLAAEVLEDLG GPSRDPRGRGRSAGSSRSRSGRARRSKDGDQTIHPGKQGRSDRGARGRGRGEHSPRAAEA PAQRAPRKRIRRRKSTE >gi|319977279|gb|AEUH01000234.1| GENE 13 12898 - 13527 601 209 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0573_10156 NR:ns ## KEGG: HMPREF0573_10156 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 6 207 2 205 207 109 34.0 7e-23 MAHEYEPVSEPDADVIGLLAFSALAIMTRLAKDGDQAPSIEAHITHARMSARAFGLFEQL EVWSAHRGVDLEAAAGQYSGLYDNLDARTRPSTWSERSVKTFVTVGIIGDMLLRISQLHG LFQDREEMWPFEQEQWVREHLAPQIEEDPQLAARLSLWSRRVAGEVLGLVRATLFTHPDL SGSPEASDEIAAYITKRHGERLAGIHLKA >gi|319977279|gb|AEUH01000234.1| GENE 14 13737 - 13961 412 74 aa, chain - ## HITS:1 COG:no KEGG:Xcel_0776 NR:ns ## KEGG: Xcel_0776 # Name: not_defined # Def: hypothetical protein # Organism: X.cellulosilytica # Pathway: not_defined # 1 72 1 72 74 62 44.0 3e-09 MNIVIGVRDSARELPLEIDIDEAALVRTIEHAIASGGVVSLTDVKGDRILVPAAAIAYVQ IARKSDRRVGFAVS >gi|319977279|gb|AEUH01000234.1| GENE 15 14093 - 15536 1350 481 aa, chain + ## HITS:1 COG:MT3296_1 KEGG:ns NR:ns ## COG: MT3296_1 COG0210 # Protein_GI_number: 15842788 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Mycobacterium tuberculosis CDC1551 # 20 462 21 477 803 83 28.0 8e-16 MPVLDRAQAAVVEAARHRDVIARGAPSSGRTTVALAVLAEAVSGGRSAVLLVPDRARADH LAPRAQVLAPNAVRPVRTPASFAYQVVSTWRTQRRSPLGPVELVTGSAQDQAIARLIESV PAPWPDQIPAQMRAMPAFRAELRNLFARAGEAGMDGDALIAAGERFGQGQWVAAGHLLRE 
LLDASVTGAECPGALRVDLSRIQALAADLVGSWERDGAAAGVSAPPPVPDVVVVDDLQDC TPSTITLLGACSAAGARVVALSDSDVAVAGYRGGEPHLDLRLASLLGAGIDELDRVHGPS PRIRALVAEAASRVTQSGPSARRRAGADGDDAERPLRTHLAATPAQMGALIARELRAHRL HDAVAWSDQVVIVRSASMVDEMRRHLSRGGVPVAGGGRAFDFASQPVTRLMLDLLVAPRG GPTAEAAARAGLAKRLLASALVGADMLAVHRLLRGLAEQRPGGAGAGGRAPGATEEPTAA E Prediction of potential genes in microbial genomes Time: Thu May 12 19:00:59 2011 Seq name: gi|319977272|gb|AEUH01000235.1| Actinomyces sp. oral taxon 178 str. F0338 contig00235, whole genome shotgun sequence Length of sequence - 8507 bp Number of predicted genes - 7, with homology - 4 Number of transcription units - 3, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 + CDS 12 - 806 915 ## COG2887 RecB family exonuclease 2 1 Op 2 . + CDS 808 - 4398 3838 ## COG0210 Superfamily I DNA and RNA helicases 3 1 Op 3 . + CDS 4423 - 4548 70 ## 4 2 Tu 1 . + CDS 5309 - 5425 102 ## + Term 5518 - 5546 -0.9 5 3 Op 1 . - CDS 6127 - 6606 -364 ## 6 3 Op 2 . - CDS 6666 - 7397 909 ## COG0730 Predicted permeases 7 3 Op 3 . 
- CDS 7450 - 8454 1640 ## COG1082 Sugar phosphate isomerases/epimerases Predicted protein(s) >gi|319977272|gb|AEUH01000235.1| GENE 1 12 - 806 915 264 aa, chain + ## HITS:1 COG:Cgl0751_2 KEGG:ns NR:ns ## COG: Cgl0751_2 COG2887 # Protein_GI_number: 19552001 # Func_class: L Replication, recombination and repair # Function: RecB family exonuclease # Organism: Corynebacterium glutamicum # 26 260 1 256 260 80 31.0 4e-15 MRAAHPDRWMGAGGTPTTGAALTGEVVLSPSQFESALQCPLRWFLTTIGADGPGTAAQSL GTLVHDIAERHPNGTREQLRAALDERIGELGYDLGTWAGRRDHAHAYAVIDNLASYMAGV PGAVDVEGTVAANVEGVTIRGRMDRVEHVDGGVRVTDIKTGRYGYTAKSVPDNPQLAAYQ MALIALEQPVVGGRIALVGGDKERVFDQPALHGEALDEWRAAVRRVGEAARGPSFRATPS EAACRFCSFDRLCPARDRGRKTVD >gi|319977272|gb|AEUH01000235.1| GENE 2 808 - 4398 3838 1196 aa, chain + ## HITS:1 COG:Rv3201c_1 KEGG:ns NR:ns ## COG: Rv3201c_1 COG0210 # Protein_GI_number: 15610337 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Mycobacterium tuberculosis H37Rv # 33 799 24 801 821 316 35.0 2e-85 MSSANGRQDLSALGVQDLSALGIQALVDPGKTPTDEQVRVIEAPRRPLLVVAGAGSGKTE TMSMRVLWLVANHEDIAPSSVLGLTFTRKAAGELGDRLRQRLALLARRVPSLRERLDEDP VSLTYNSFAERIVAEHGMRIGIDPDFTMLTEAGAVDLMTQIVEGWPTDLDEDLSPSAAVG HCLHLAGEVGEHGYTVEEARDALEGFGRDLEQIGATNDTARKTLRANARRLAYLGPVEEF QRRKREGGLLDFSDQLVLATRIVREAPEVSAQVRDEFQAVLLDEFQDTSVIQMELLSLLF HDHAVTAVGDPNQAIYGWRGASASSLERFLDRFQDGPAQPGQTLTLSTAWRNDRNILRAA NRVAAPLREHSRAAKSPVLRARPGAGEGRVDVAYTQDYRSALGAVVDFVSAHRARGTREG KRPTTAVLCRRRSDFPYVDMALREAGVPTQIVGLGGLLDQPSVQDARAALVLADDVEASP WLARLLAGIDLGAADLVALGDWARHLAREEGRDPHRAVLLDAVDNPPEPGWSASGGARGR PAISGEAVRRVRTLGSRLRAVRAGAGRSVVEQVERAVSIMGILDDVVSDPLAAGGRAALD AFVDVAASYEAEVPGASLSSFLAYLDMADERENGLEAPVSEPDPQAVQIMTVHASKGLEW DGVVVFAMDDGVFPSHSKRRTVDWRDGPPTDSGWVRDASALPYPLRGDCMDLPDFDLDVE GEAKPSATFKKWLEGDYEARLGEHAEREERRLAYVAMTRARSAQLLVGSWMYRTGASPRH PSRYLMEAHAELFAGAGAGVGAVSGEGGGTSGTGVGPSRGASGMAGAVPGEDGGASGTGG SAAPGAGGPLVVPGVGSALVVPRPDEEELGRLALTEAEAAFPEGPGPSRAAVARAAAQVR 
REIASMRADADVFDLLAQMEGEPGVADTVALLEEHRVSLEAPVVEIWQDRVPATSVSELL DDPQAFAARVRRPMPAEPSESSALGTVFHAWAERQLHLASPEPAGGDPTAPGDGPLGSLL PADEAGAGGGVDASLGGGPQVVEEAALDERSRAKLEVLRANFTEFVATELAGCSPVGIEE PFSVEVGGVSVQGRIDAVFERTGGGGPRFVVIDWKSGRAVDRRTDPAKLRYFITQLRLYR RAWSQRAGVPESAVEARVAFLAGPASFTVEQLEARCGADPGASLDGLVRGALGAEE >gi|319977272|gb|AEUH01000235.1| GENE 3 4423 - 4548 70 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCVTIPRVLAPAFRWKAVGIRQTGSRFGRRWRPLCSQRKVP >gi|319977272|gb|AEUH01000235.1| GENE 4 5309 - 5425 102 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGVVREHPAPSGALRHLEEPLGEVPVAPSGSTQHHQVH >gi|319977272|gb|AEUH01000235.1| GENE 5 6127 - 6606 -364 159 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFRLGGEAVEGTADRAPRRQRVQEPQRGERVTVRVRRPARGFVRLPRSRSIEDPRVRGGG RFGGGVGAGGPESSMAGRGRASGAPTRRGGRPRTPIARLSSSSRRPSGHRVRLQRSRGRV HRAHHQRFCVGIPRFRHAPDDAQCEEHRVSNVKNAGFSL >gi|319977272|gb|AEUH01000235.1| GENE 6 6666 - 7397 909 243 aa, chain - ## HITS:1 COG:mll3907 KEGG:ns NR:ns ## COG: mll3907 COG0730 # Protein_GI_number: 13473346 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Mesorhizobium loti # 2 242 4 248 256 86 30.0 3e-17 MLGLSPTDWLAVLAAAVLTGLAKTAVPGLASIAVALCALVLPAKESTGVMLVLLLVGDVL AVWAYRRDADRALLARLVPSVVLGVLAGAAFLRWSTQDQTRRGIGVVLLALVALTVFQRR RGKRGRAPAPVRTAYGALAGFTTMVANAGGPVTSMYFLACRFSVAAFLGTTAWFFFAVNL VKLPFTVSMGLIRPEHLALDLVLAPVVVLAALAGRRLAARLPLRVFEPLVIATTVIAAVP LVL >gi|319977272|gb|AEUH01000235.1| GENE 7 7450 - 8454 1640 334 aa, chain - ## HITS:1 COG:Cgl0172 KEGG:ns NR:ns ## COG: Cgl0172 COG1082 # Protein_GI_number: 19551422 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Corynebacterium glutamicum # 3 333 1 332 337 317 47.0 2e-86 MALTYGAYTACLHDRPLADALAVLKDAGLSGAEVNAGGFIPSPHCPVDALLASKAARDDY LGVFDEAGVALTGLNASGNPTSPLPGEGLKHAEDVFKVIELASLLGVDEIVTMSGTPGTD PGAKYPTWVVNPWNGIDMEILDYQWSVVAPFFKRVDEFARDRGVRVCLELHRRNLVFTVP SFERLVERTGSTNLRVNMDPSHLFWQQMEPIEATKRLGALVGHVHAKDTKILPGAAYRGV 
LDTDFGRVPAEAEGKVPVAIGYWCNSWPADPAWRFVAFGLGHGTGYWTRFLAEIARIDPD MNVNIEHEDAEYGNVEGLRISASNLLAAAEGLER Prediction of potential genes in microbial genomes Time: Thu May 12 19:01:16 2011 Seq name: gi|319977269|gb|AEUH01000236.1| Actinomyces sp. oral taxon 178 str. F0338 contig00236, whole genome shotgun sequence Length of sequence - 1539 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 127 - 1311 1509 ## COG0673 Predicted dehydrogenases and related proteins Predicted protein(s) >gi|319977269|gb|AEUH01000236.1| GENE 1 127 - 1311 1509 394 aa, chain - ## HITS:1 COG:Cgl0171 KEGG:ns NR:ns ## COG: Cgl0171 COG0673 # Protein_GI_number: 19551421 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Corynebacterium glutamicum # 5 390 6 399 411 278 42.0 1e-74 MPTPLSVAVIGAGMAGTTHANAWRQVGTVYDLGLPPIRLATIADTYQSAAEDAAARYGYA SATTSWRSIAADPTIDIVSIVVGNALHLEIAAAMIKAGKHVLCEKPLADTLEAGREMARL EAAHPEVVTAVGFTYRRNASLAKIAELATTGNLGEVVHFDGRYWCDYGTDPTTPIAWRYK GPMGSGALGDLGSHLIDAAELICGPLVSVSGAAMTTAIKQRPAATGYVTRGTASAGEGTA MEDVENDDVATFTGRFASGAVGTFSCSRVAWGLPNSFMVDVLGTKGRAAWDLARCGEITV DDTTSPAGLGGPRRVLANPGFPYFARGSSMAFGGVGLTQIEQFTYQAHAFLQQVAGVEGL PPCATFADGYRQMLIMDAIARSAEAGGARVDLSF Prediction of potential genes in microbial genomes Time: Thu May 12 19:01:22 2011 Seq name: gi|319977258|gb|AEUH01000237.1| Actinomyces sp. oral taxon 178 str. F0338 contig00237, whole genome shotgun sequence Length of sequence - 12931 bp Number of predicted genes - 12, with homology - 9 Number of transcription units - 5, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 156 - 1160 1133 ## COG1609 Transcriptional regulators - Term 1200 - 1247 6.6 2 1 Op 2 . - CDS 1440 - 2426 1495 ## COG0673 Predicted dehydrogenases and related proteins 3 1 Op 3 . 
- CDS 2516 - 3592 1766 ## COG0176 Transaldolase 4 1 Op 4 . - CDS 3676 - 3837 135 ## - Term 3848 - 3901 1.2 5 2 Op 1 . - CDS 3993 - 4766 1129 ## COG2188 Transcriptional regulators 6 2 Op 2 . - CDS 4803 - 4901 133 ## 7 3 Op 1 3/0.000 + CDS 5014 - 6003 1279 ## COG0524 Sugar kinases, ribokinase family 8 3 Op 2 1/0.000 + CDS 6032 - 6904 1237 ## COG3718 Uncharacterized enzyme involved in inositol metabolism 9 3 Op 3 1/0.000 + CDS 6929 - 8836 2667 ## COG3962 Acetolactate synthase + Prom 8889 - 8948 2.6 10 3 Op 4 5/0.000 + CDS 8992 - 10494 2057 ## COG1012 NAD-dependent aldehyde dehydrogenases + Term 10509 - 10552 16.1 + Prom 10633 - 10692 4.0 11 4 Tu 1 . + CDS 10801 - 12417 2431 ## COG0477 Permeases of the major facilitator superfamily 12 5 Tu 1 . + CDS 12586 - 12931 323 ## Predicted protein(s) >gi|319977258|gb|AEUH01000237.1| GENE 1 156 - 1160 1133 334 aa, chain - ## HITS:1 COG:Cgl2058 KEGG:ns NR:ns ## COG: Cgl2058 COG1609 # Protein_GI_number: 19553308 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Corynebacterium glutamicum # 11 329 2 317 325 105 28.0 1e-22 MAVRTRPTTQADIARAAGVSRGLVSMALSGSGRMKENTRAHILEVARALDYHPHAAAAEL AGGRSNRLVLILPYLINPFFDALARQLRRAARARGYSLVVLVSELGSAEDIESQTIDEAV AMRPAGLIFAGTSLPSEALAGLAERVTVCTLDRELAEGSVWATRMDEARAARLVMDHLAE QGYQRLFFLGPEPDAHEAIVDERLGHCRAAAAASGIPFEVLAEPGAAGAISAVTDSCGRG EAAVVAFNDLVALEAAAAIYQLGWSMGPDLGVVGYDNTAMAARPEFALTSIDQNPARLAS LTVQQVLTPPVQSPRTRVVTPSLVVRSSSLRTRL >gi|319977258|gb|AEUH01000237.1| GENE 2 1440 - 2426 1495 328 aa, chain - ## HITS:1 COG:Cgl3002 KEGG:ns NR:ns ## COG: Cgl3002 COG0673 # Protein_GI_number: 19554252 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Corynebacterium glutamicum # 2 328 3 330 335 313 52.0 3e-85 MLNIAVIGAGRIGQVHARTIASHPGAVLALIADPFGDAAEKLAAVHGARHTTDIDSVFTD DSIDAVIIGSPTPLHIPHLLAAAKAGKAVLCEKPIALDMADVEAAREELDAVATPVMFGF NRRFDPSFAALRAAVADGRVGDLENLLIISRDPAAPPVEYVKVSGGIFRDMTIHDFDMAR 
FFLGDIIEVHAFGQNFDPGIAEAGDFDAAVVTLRNADGAVATIINNRKCAAGYDQRLEAQ GSTGTLNADNVRATTVRLSNGEATDAAEPYLDFFLQRYADAYRLELTAFIDAVAEGRTPP TSIADAIEALRLAEAATESARTGQAVTL >gi|319977258|gb|AEUH01000237.1| GENE 3 2516 - 3592 1766 358 aa, chain - ## HITS:1 COG:all4020 KEGG:ns NR:ns ## COG: all4020 COG0176 # Protein_GI_number: 17231512 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Nostoc sp. PCC 7120 # 91 276 89 305 381 89 29.0 1e-17 MPIEYTPGPLLDAARNTPTALWNDSSDPNELAQSISFGGVGATCNPTIAFTCINQRREVW LPRIAELAKEMPEASESEIGWQAVRELSLEAAKLLEPIFECENGRNGRLSMQTDPRLARS AKALADQAEEFSSLAPNIIVKIPATSVGVEAIEEATYRGVSVNVTVSFSVPQAIVTGEAI ERGLKRREAEGKDVSTMGPVVTLMVGRLDDWIKDVAKRDGLFLDPGHLEWAGIAAFKRAY QEFNKRGLRARMLSAAFRNVMHWSEFVGGDVVVSPPFKWAKLINDSGYKMQQRMDIPVRD DIMATLLSIPEFVRAYEPDGMTPLEFDTFGATAKTLRGFLQADADLDALVRDVVIPAP >gi|319977258|gb|AEUH01000237.1| GENE 4 3676 - 3837 135 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCAPRGKGPDPGDRHERAAGSAPGRGGPDGPAADSARPVGTGARPASSLCYDK >gi|319977258|gb|AEUH01000237.1| GENE 5 3993 - 4766 1129 257 aa, chain - ## HITS:1 COG:Cgl0157 KEGG:ns NR:ns ## COG: Cgl0157 COG2188 # Protein_GI_number: 19551407 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Corynebacterium glutamicum # 21 256 16 252 253 201 47.0 1e-51 MSTTSSNNKEGATPYVPQIKLDRSSPVPLYFQISEPIATLINDGTLPAGTRLEDELSMAA RLQVSRPTARQALQRLVDRGLLTRRRGVGTLVSPPHVHRPMELTSLLSDLTDAGHHVSTS ITRYEMRAATQEEAKALGVDEGEVVAHIERIRYADDEPIAILMNLLPADITPAMQELENG SLYELMRRRDVVLTSAHQVIGARTASSKEARLLNESRGAAVLTARRTTYDPSGRVVEYGD HIYRASRYSFETTLFSN >gi|319977258|gb|AEUH01000237.1| GENE 6 4803 - 4901 133 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSEPMKTIPIPRNHAFVTIGVRMTGTMRNTPP >gi|319977258|gb|AEUH01000237.1| GENE 7 5014 - 6003 1279 329 aa, chain + ## HITS:1 COG:Cgl0158 KEGG:ns NR:ns ## COG: Cgl0158 COG0524 # Protein_GI_number: 19551408 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Corynebacterium glutamicum # 9 322 9 312 
318 339 56.0 3e-93 MSMERHHVDLLAMGRSGVDVYPLQVGRHLEDVETFGKFLGGSPTNVAVAAARLGHSAGVV TGVGDDPFGRFVTREMERLGVDSSHVVTVRDFHTPVTFCELFPPDSFPLYFYREPSAPDL EITSHQIDWGAVGAARVFWFTVTGLSRDSSLGVHLEALDVRSRACSARPDGLAPPLTVVD LDYRSNLWKSPDQARSRVSAVLGHVDVAVGNLDECEMAVGRSDPEGAADALLARGARMAV VKQGPKGTLAKTADQSVFVEATPVEVLNGLGAGDAFGGALVHGLLEGWPLERTIRVASAA GALVASRLECSTAMPTTDELLAFAEGTHH >gi|319977258|gb|AEUH01000237.1| GENE 8 6032 - 6904 1237 290 aa, chain + ## HITS:1 COG:Cgl0161 KEGG:ns NR:ns ## COG: Cgl0161 COG3718 # Protein_GI_number: 19551411 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized enzyme involved in inositol metabolism # Organism: Corynebacterium glutamicum # 22 287 19 288 296 221 46.0 1e-57 MTLPDIYVPAGSSAHGRYTVDVDPGRAGWTFSGLRVLVLGPGESETVATGDSELLVLPLA GAAAVAVDGADYALEGRRGVFSGVTDYLYVGRGEEFTITSAQGGRFALPAAVAARDLPVA HYPKSQVRVDLRGAGDCSRQVNNYALAVEGVATSHLLACEVITPGGNWSSYPAHKHDTHS DDERELEEIYYFEIADGPEGSKGFGLHRTYGTPERPIDLCCEVHDGDVALVPHGYHGPCV AAPGYDMYYLNVMAGPSEDLVWLAPDDPAHHWIRATWEDQEVDPRLPMNK >gi|319977258|gb|AEUH01000237.1| GENE 9 6929 - 8836 2667 635 aa, chain + ## HITS:1 COG:Cgl0162 KEGG:ns NR:ns ## COG: Cgl0162 COG3962 # Protein_GI_number: 19551412 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase # Organism: Corynebacterium glutamicum # 9 635 4 636 637 639 53.0 0 MSNEAYKGTVRLTVAQAIIRFLSNQYSERDGAEHRLIAGAFGIFGHGNVAGIGQALLQNE IAPAEGENPMPYIMPRNEQGAVHAAAAFAKTTNRLQTWMTTASIGPGSLNMVTGAALATT NRLPVLVFPSDQFATRTPDPVLQQLEDPTSLDVSVNDAFRPVSRFFDRINRPEQLIPSLL CAMRVLTDPAETGAVTIALPQDVQAEAYDWPVEFFDKRVWHVRRPPAERAALERAVAAIR AAERPLLICGGGTIYSEASAELRAFASATGIPVADTQAGKGAINCDHPCSVGGVGSTGGD SGNHLADKADLIIGVGTRYSDFTTASKTQFKNPDVSFVNINVAAFDAAKESAEMVVADAR EALRALTEALADYRVDEAYSAEIAAERQAWAKKVERCYHLGHGPLPAQTEVFGALNELMG DEDVVINAAGSMPGDLQALWQAKTPVQYHVEYAFSCMGYEVPAALGVKLARPDSEVVSIV GDGTYQMLPMELATVAQEGVKVIYVLLQNYGFASIGSLSESRGSQRFGTKYRQRGSGSHL QDTEKITGVDIAANARSWGIDVIEAHGIAEFKEAYRAAVASDRATMIHIETDLMGPNPPG SSWWDVAVSQVSELESTQRAYEDYQRDRKPQRHYL 
>gi|319977258|gb|AEUH01000237.1| GENE 10 8992 - 10494 2057 500 aa, chain + ## HITS:1 COG:STM4421 KEGG:ns NR:ns ## COG: STM4421 COG1012 # Protein_GI_number: 16767667 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Salmonella typhimurium LT2 # 6 500 7 501 501 472 49.0 1e-133 MAIELWVDGREYQASSTRTLPVEDPATGEAVDELRLADSADLDHVVRVARAAQAEWARTP LAKRVDVMFKFRQLIIEHQDELADIIVREGGKTHGDALGEIARGRETVDFACGINAALKG EFTDQASTGVDVHTLRQPVGVVAGIAPFNFPVMVPMWMHPIALATGNAFILKVASLVPSA SLCIARLYKEAGLPDGLFNVIAGDRTLVGEILTHDGIDAISFVGSSPVAHVIQDTGTSHG KRVQALGGANNHAIVMPDADIEFAAQHISAGAFGAAGQRCMALPVIVAVGGVEEKLVPAI KARAEKIVVGPGTDSAAEMGPVITRSSLQRIEKWIADAEEAGAWVVLDGRGYRPAGEEYA DGFWLGPTILDNVDRGLQVYREEVFGPVLVVVRADTYEEAIEIVNSSEFGNGSAIFTSDG DTARHFAVDVEAGMVGVNVPIPVPVAYYSFGGWKESLLGDTHIHGPEGVRFYTRAKVVTT RWPKRGEKVGYVGMNFPTNG >gi|319977258|gb|AEUH01000237.1| GENE 11 10801 - 12417 2431 538 aa, chain + ## HITS:1 COG:BS_yfiG KEGG:ns NR:ns ## COG: BS_yfiG COG0477 # Protein_GI_number: 16077893 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 14 535 4 465 482 322 37.0 1e-87 MSTGKTIDDYTPEKVRELVASTPPSGKKRSLGAIAAVATLGSLLFGYDTGVVAGALPYMH MPGSAGGLEMTTFEEGWVGGLLCIGAAAGAFFGGRLSDRYGRRHNITLLAIVFLFGAIGC AIAPNIWVLYLARVVLGFAVGGASATVPVFLGETAPKRLRGVLVATDQMMIVFGQFLAYA MNALLARWNGGPTLHITGDAIDEHGTVLAQGGTDVAWETIRYAVNAAQITDAGNGMTWRY MLILCSLPAIALWIGIRMMPESSRWYAANLRIAEAIGALKRVRDERHDDVADEVNEMLEV QRSEAEQEKWSLVQILKIKWARKLLYIGIVLGLADQLTGINTAMYYTPKILSAAGVPMED AISLNVVSGAISFIGSAFGLWLVARFARRHVGMYQELGITISLAAMACVFGFFISPYLNE DGNIEGAPNFAPWLVLAIICLFVFIKQSGTVTWVLVSEIYPAAVRGTALGIAVATLWLAN AVVSIAFPPLMENVGGAGTYAIFAAINFLSFLFYWKVVPETKFHSLEELELKFKEDYS >gi|319977258|gb|AEUH01000237.1| GENE 12 12586 - 12931 323 115 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MQQEERGERLLRAYAWLTGAALGGVDGVVAARLLPAWNGGAPLRDCLAVALAVCVPGIAA ALTAPALASWAGRRPVNTAAAALLLVGAGTGAVPAPWAPPLGLGLPGAGHALAWV Prediction of potential genes in microbial genomes Time: Thu May 12 19:01:44 2011 Seq name: gi|319977241|gb|AEUH01000238.1| Actinomyces sp. oral taxon 178 str. F0338 contig00238, whole genome shotgun sequence Length of sequence - 20974 bp Number of predicted genes - 20, with homology - 14 Number of transcription units - 11, operones - 6 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 859 884 ## gi|227497302|ref|ZP_03927534.1| hypothetical protein HMPREF0058_1526 2 1 Op 2 . + CDS 923 - 2140 1532 ## COG1062 Zn-dependent alcohol dehydrogenases, class III 3 2 Tu 1 . - CDS 2417 - 2665 395 ## 4 3 Op 1 . - CDS 2926 - 4431 1686 ## COG2272 Carboxylesterase type B 5 3 Op 2 . - CDS 4524 - 5153 608 ## Dhaf_4628 transcriptional regulator, TetR family - Prom 5193 - 5252 3.3 6 4 Op 1 11/0.000 - CDS 5277 - 6539 1698 ## COG0438 Glycosyltransferase 7 4 Op 2 1/1.000 - CDS 6601 - 8229 1830 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 8 4 Op 3 . - CDS 8222 - 9145 1220 ## COG3475 LPS biosynthesis protein 9 5 Tu 1 . + CDS 9147 - 9239 57 ## 10 6 Op 1 . - CDS 9212 - 10078 1087 ## COG3475 LPS biosynthesis protein 11 6 Op 2 3/1.000 - CDS 10121 - 11110 1260 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 12 6 Op 3 . - CDS 11116 - 12927 2646 ## COG4750 CTP:phosphocholine cytidylyltransferase involved in choline phosphorylation for cell surface LPS epitopes - Prom 12962 - 13021 2.0 13 7 Tu 1 . + CDS 13223 - 13513 263 ## - Term 13382 - 13433 4.1 14 8 Op 1 . - CDS 13447 - 13620 339 ## 15 8 Op 2 . - CDS 13651 - 15516 1748 ## Arch_0077 protein of unknown function DUF2142, membrane 16 8 Op 3 . - CDS 15513 - 16673 1419 ## COG0438 Glycosyltransferase 17 9 Op 1 6/1.000 + CDS 16930 - 18051 1106 ## COG0438 Glycosyltransferase 18 9 Op 2 . 
+ CDS 18048 - 19061 1104 ## COG1216 Predicted glycosyltransferases 19 10 Tu 1 . - CDS 19766 - 20227 -28 ## 20 11 Tu 1 . + CDS 20624 - 20896 353 ## Predicted protein(s) >gi|319977241|gb|AEUH01000238.1| GENE 1 2 - 859 884 285 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|227497302|ref|ZP_03927534.1| ## NR: gi|227497302|ref|ZP_03927534.1| hypothetical protein HMPREF0058_1526 [Actinomyces urogenitalis DSM 15434] # 1 254 135 415 450 91 36.0 6e-17 AHELSLRGHARLVPQATALVGAGGAAGIGAACATGMSAQGPLALVGACAAALLVVSAVLP ESPLWHVIRGEEEKAFLALRRLHGPLEAAVAIDWVRLEAQMGAEQRPLRARDLAMPQVRA SMATAAVLLCAQEAPLAAAVVAYAPALVIRSGGSTATVAVFAAAVAAVSAMALAVAASTA GAALVELRAGAPAGAVVVGALVLAAQRCLTYPGCHGAIDPRVPPWLVEWQRRAVVLANVL ARFAVLLVGTLLLAAPPLVAAGAYLALQCAALVVVLARLPRELRI >gi|319977241|gb|AEUH01000238.1| GENE 2 923 - 2140 1532 405 aa, chain + ## HITS:1 COG:SSO0472 KEGG:ns NR:ns ## COG: SSO0472 COG1062 # Protein_GI_number: 15897400 # Func_class: C Energy production and conversion # Function: Zn-dependent alcohol dehydrogenases, class III # Organism: Sulfolobus solfataricus # 37 400 9 364 371 228 40.0 2e-59 MILSYPGLTQGPRAAPDKQHREVEAMTGEAIEIPATMRAAVLRDYDKGLQVETIPTPRPK SGEVLIKVSACGLCHSDLHVIGGAIAFPLPAVLGHEVAGTIVALGEGNEHNGLEVGQRVA GGFLMPCGQCRHCAAGHDELCEPFFELNRLKGVLYDGTSRLRGAEGETIHMYSMGGLAEY AVVPSTAVAPVPDEVDPVASAILGCAAMTGYGAVRRGADLRFGETVAVVAVGGVGSSIVQ IARAFGASQVIAIDVSDSKLEAVVPFGATATINSTTSDPREEVLRLTGGRGVDVAFEALG IPATWNTALDVLADGGRMVPIGLGAGRQSAQVEINRTVRRSQSILGSYGARTRQDLPAVV DLAARGIINYRDLVTRRYPLEEAAEGYQALRDRQIQGRAVVDMSL >gi|319977241|gb|AEUH01000238.1| GENE 3 2417 - 2665 395 82 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METLIKESAPTTTAAVRYGTAPTTNPDLFAFSSAMLVVFAMVAVMAVVGALVGSAVIISA AAVIGTVACFVGVYSGANLLNR >gi|319977241|gb|AEUH01000238.1| GENE 4 2926 - 4431 1686 501 aa, chain - ## HITS:1 COG:BS_pnbA KEGG:ns NR:ns ## COG: BS_pnbA COG2272 # Protein_GI_number: 16080492 # Func_class: I Lipid transport and metabolism # Function: Carboxylesterase type B # Organism: Bacillus subtilis # 27 467 26 458 489 142 30.0 1e-33 
MILETAIGPVSAQRTGSPGAPVLQILGIPYARAKRFHAPGPAPRPAPGSPINTGRGHCCP QRLMPRALNLALRHFQLRPEWQPRHDTMDEDCLRLNVWTSGIRTPKPVLVFIHGGDCGSG TLPVYNGARLAELGVVVVTITYRIGVLGHLHVVDGERISCDRALEDQRAALRWVRANIAH LGGDPGAITLMGHCGGAQYALYQALNPDNAGLFHRLILCSGQRATPVPLDRGTEERAFAD LLADNGLSGYGELEAMPLRRLLWLRVPRAGLATVMEGGFFSRDPRDALRDGAFPRVPVLI GTTADEFSMIEMPLWYRRMGIATRAQDLGARVNAVYGPHGRRIAVELSDDEAGAGIVGLQ IAMMELVVFHGAALSLMDAFSRHTRVHGYRFAGVPKVYGGRPGSYHGAEVAFFFNTLDRM RIRVPDQDRALVRALQRDWLAFVREGAIPGAPAFDPADPRITAYHQDGTVATAPFPHAGL LRDLEGTDLSDRVIGAYMRRR >gi|319977241|gb|AEUH01000238.1| GENE 5 4524 - 5153 608 209 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_4628 NR:ns ## KEGG: Dhaf_4628 # Name: not_defined # Def: transcriptional regulator, TetR family # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 5 197 8 204 206 118 34.0 1e-25 MTPQTKQRILRAAREEFLRHGYGQAGLRRIAAAAHVTTGAVYNHFGSKHGLFDAIVRGPA DLLLAAWTSGRPPQDADTDSAPPSPSAASAHSATRTADVLSLVYQHAQAYELLLCHAHGS EYADFADRLARAEEGAYRRIPGMTDSIADRLFLRTIASDGVAALRAALAHHLTEDEARQY MERIARFRLGGWAELLGPSTSGTGPAPAS >gi|319977241|gb|AEUH01000238.1| GENE 6 5277 - 6539 1698 420 aa, chain - ## HITS:1 COG:L14736 KEGG:ns NR:ns ## COG: L14736 COG0438 # Protein_GI_number: 15672195 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Lactococcus lactis # 1 416 1 375 379 239 31.0 6e-63 MERTKIGIFMEFYLPYLGGIERYIDRLSAQLRHLGYDVFIVTSLFDESLPRFEERDGMRV YRLPTKGLFKQRYPFFKRDARFKETMARVEAEAADFYIVNTRFHLTSLLGARLAKRQGRP VALIEHGTAHMSVGSKVLDFFGGIYEHGLTHFVKRNVTSYYGVSKNCNKWLRHFRIEASG VWYNAVNPDDAAQADDRFEDRWPRSADAAGAGIPSADDDAEAGAPSSATGAQGRAPIVVS YAGRLIAEKGVVALLEAFTRVRDSHPEADLVLAVAGSGPIGDQLRAEYGGSKGVEFLGTL DFPAVMSLYRRTDVFVYPSMYPEGLPTSILEAGLMGCAVIATPRGGTEEVIIDPEHGWVV DGSSSAELADALTTALTEAVEDPGRRGACAAAVQRRVREVFTWQSVAREVSAVVEEAVAR >gi|319977241|gb|AEUH01000238.1| GENE 7 6601 - 8229 1830 542 aa, chain - ## HITS:1 COG:SPy0797 KEGG:ns NR:ns ## COG: SPy0797 COG2244 # Protein_GI_number: 15674839 # Func_class: R General function prediction only # Function: Membrane protein involved in the 
export of O-antigen and teichoic acid # Organism: Streptococcus pyogenes M1 GAS # 35 440 2 414 428 171 28.0 5e-42 MSDATGARSADGSGAAGRDGPEDGGARAASADTALDSKKQLGRDYLWNTAASLMSSLAVV IMGIAITRSGPDQDSARHAYGIFTLALAIGQQYQTLGLYEVRTYHVTDVRRRFTFGTYLA TRILTCALMVAFIVGHAWTRSPESSPLLVVVCLALLRIFDAFEDVYYSEFQRCGRLDIAG RACFLRIFVTTFLWSGLYWATQSLMVATLVAFGVTVVVLAAAYGLPARGLFPLRPDWGVR AVRGLLVQCLPLFLAAFLNQYLANAPRYAVNDFLGSAPLGDFAIIYMPAVAINMLSLFVF RPLLTRIATRWAAGDWPGFAAMVRRGLASTAIAFVAVGAVTFLVGAPLLRLVYGIDVSAY RMELMVLVFGGAMNAAGVILYYALATMRRQNLVLVAYALAAVCAWALAQWLTPVLGMMGA AISYSGAMTVLAVLFVLFMVGGRRAVLSGRAAGDGDGPVSPEPEDGDEPAIAEPAAGDSP SSPGCTDQNGPSSSGPSSAPDGPTSDTIVEGGPSPAPDSPGAPPSPSPGDDGSCDPRGGP RA >gi|319977241|gb|AEUH01000238.1| GENE 8 8222 - 9145 1220 307 aa, chain - ## HITS:1 COG:SP1273 KEGG:ns NR:ns ## COG: SP1273 COG3475 # Protein_GI_number: 15901133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 19 286 16 260 267 92 25.0 8e-19 MHSYDPPALAAVQTVTKRVLAEIDRVCAVLGVRYTAYGGTAIGAVRHQGFIPWDDDGDVC MPREDYERFIREAPAVLRPEFFIASPLTDADYPISFGVIGLRGSEFVSEVAKGRSFRMPI GVDLFVLDEIADDKRVFARQSRGTWLWARLMFLRGVGNPPTGLPTPARQAASAVMLGVHF ALRALRITEQGLYRRWLRAALLGAKKNAAAAAGAAPADRPVLLGDFSTRDPMRWSATTDE LFPAREVPFEDITIRVPRDHDAVLTRGYGDYMRIPEESERVNHQPFRIVLGPHADEYGEA PDGGSDE >gi|319977241|gb|AEUH01000238.1| GENE 9 9147 - 9239 57 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSSRGKGLGRRTGAPAGAAAPGQRGGKPSA >gi|319977241|gb|AEUH01000238.1| GENE 10 9212 - 10078 1087 288 aa, chain - ## HITS:1 COG:L15884 KEGG:ns NR:ns ## COG: L15884 COG3475 # Protein_GI_number: 15672196 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Lactococcus lactis # 19 272 17 266 278 111 29.0 1e-24 MADETQDLKKVQWLLTRILEELDRVCRAIGVEYAVYGGTAIGAVRHGGFIPWDDDIDVMM TREHYERFLAEAPAVMDPRFRLDNTRTRDDFPFMFTKMVVPGTLLVPEADKDAAYRMPFF LDILPLDAVPDDGAAFKAMSRRSWWWGRLLFVRGTPRPHLVGVSGAKRALIFTATTLAHY 
ALRLVGATPRRLQARWERAVRSHEGDGGGAMADFTMRDPENWIVRHEEFFPTRDIAFEDI TVKIQNRYDDLLRRGYGDYMRIPPESERYNHEAAEIDFGPYADGFPPR >gi|319977241|gb|AEUH01000238.1| GENE 11 10121 - 11110 1260 329 aa, chain - ## HITS:1 COG:FN1236 KEGG:ns NR:ns ## COG: FN1236 COG0697 # Protein_GI_number: 19704571 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 5 304 6 310 322 125 28.0 1e-28 MTCSRGGRRYGIFSGALWGLDTVVLAIALAMAPFLDFGQSALAGAVLHDVGCALALLVYM GVRGRLADTWRALGTRSGRSVVLAALLGGPIGMSGYLIAIDNIGPGLTAIISTFYPAVGT FLAFLLLKERMRPRQILALLVALGAIVIAALSVASDPTEGGNAVLGVVAALACVVGWGSE AVILAWGMRDDLVDNETALQIRETTSGLVYLLVVAPIAGVFGFTLHAVPTPGTGVVALAA LAGTASYLFYYKAISAIGASRGMALNISYSAWAIIFALLLQKTVPSPVQIVCCVVILVGT VLAATPDWDELVPRKGRKADLAAAAAGQV >gi|319977241|gb|AEUH01000238.1| GENE 12 11116 - 12927 2646 603 aa, chain - ## HITS:1 COG:FN1668 KEGG:ns NR:ns ## COG: FN1668 COG4750 # Protein_GI_number: 19704989 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CTP:phosphocholine cytidylyltransferase involved in choline phosphorylation for cell surface LPS epitopes # Organism: Fusobacterium nucleatum # 75 348 3 274 290 268 47.0 3e-71 MSLTGVGELTASQFNVLYALLRAGAPLTQRAIREATGMSLGRVNTAIRECEAAGLISERT LTPAGERALEPYRVKSAVIMAAGLSSRFAPISYERPKGILKVRGDVLVERQIRQLREAGI TDIAVVVGYKKEYFFYLAEKFGVDIVVNEEYATRNNNGSLWCVRKRLDNTYVCSSDDYFT ENPFESHVFQSYYSANYAAGPTSEWCIKVGTGDRITGATVGGADAWVMLGHVYFDRAFSQ RFRAILEEVYSQPETAAKLWESIYLDRIKELDMVIRRYPDGVIHEFDSVDELRSFDPLFM ENVDSEVFDHITQALGVMKSEIHDFYPLKQGITNLSCHFSVNDGEYVYRHPGIGTNKIVN RDAEFAALRLARDLGIDRTFLTGDPEAGWKISRFIPGVRNLDVTNDEELRRAMEMDRNLH NSGKVLERSFDFISEGLAYERLLEDFGPIDVPGYGELREKVMRLKAFADEDGFDRAPSHN DFFPPNFLVDSDGHIDLIDWEYAGMSDIAADFGTMTVCTAEMDEARVGRALEYYFGRTPT EVEHRHFRAYEVFAGWCWYVWALVKEAEGDDVGEWLFIYYSHAVRNIDSLLAQYEAARSQ EGR >gi|319977241|gb|AEUH01000238.1| GENE 13 13223 - 13513 263 96 aa, chain + ## HITS:0 COG:no 
KEGG:no NR:no MGQVISLPQCGLTYPNEWFPTPMPQQDSEGPVVEALAAAVGQLCSGVGRTSRVLTAKTRG PQAARARAPRAMGYFHSDASAFTAIPHFASVITLAK >gi|319977241|gb|AEUH01000238.1| GENE 14 13447 - 13620 339 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVRFASDAESSKGSLVRRAGTGSSLASEVEFSKGYRHFASVITLAKCGIAVKAEASL >gi|319977241|gb|AEUH01000238.1| GENE 15 13651 - 15516 1748 621 aa, chain - ## HITS:1 COG:no KEGG:Arch_0077 NR:ns ## KEGG: Arch_0077 # Name: not_defined # Def: protein of unknown function DUF2142, membrane # Organism: A.haemolyticum # Pathway: not_defined # 82 459 17 390 508 252 40.0 4e-65 MSGGARETPRGTVAGPSDGHQGADPALTPDAEHGADPAMTTAPHNHGAGQAAAPAAADQA EAPAGDGAAAPAPGPDSRPSARRAWVLPAIIALLVAMTGISWAVSSPVGGSPDDDYHLGA IWCPPSVDSTGCRITTIDGKKAVGVPQSLEKKNVTCYAFDHNNSAACTLAFSDEAPGATL RWDDGNYPWGYYQFQHLLVGSDTARSVLAMRLVNTMIALALMGAILLLADSALRLSLGVA LVTGWVPMGLYFVTSLNPSSWALTGTLAFTAGLLGACRSSGWRRWGLDACAAAGAVLACT SRGDSAFYMLVCTVALAFAVPWSRALVREAALAVMASGAGTWIMAHTKVAGLNLAGEVED NGLSTLSIAWMNIKALPDYLKGFTGHGIGPGWNDVSYGGTVERLAGLVVMVVLIVGAWRM SWRRFLSTGAVMGAICGVPVVIGIRGHFSNVEFYQPRYMLPLFAVAVLLWITPARAEGGR APLAPSGGAHGADEAGTAHASSPSPARAAGGRASLRIRFGDRPDDWFRRVVRFGTPVAAA LVAFMHSYALYLVLERYTMGRTPHAMPFDLGMQNLNAVHEWWWPWAPIGPMTVWAVGALA FAGALACAVRGALRHSRGTRA >gi|319977241|gb|AEUH01000238.1| GENE 16 15513 - 16673 1419 386 aa, chain - ## HITS:1 COG:BMEI1404 KEGG:ns NR:ns ## COG: BMEI1404 COG0438 # Protein_GI_number: 17987687 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Brucella melitensis # 15 375 1 355 372 139 30.0 7e-33 MAPSADFSVSPSRPVRVLLDGTAIPADLGGVGRYVDDLVPELVAEGADLTMVVQARDAEH FSKRVPDARVIAVPRRFESRPARMAWEQTGLASLVRKVRPDVLHSPHYTCPQFPGAPVVI TLHDATFFSHPQAHSPLKRRFFTRAIKRAIRRADALVVPSEATRDETIKYAGGSTDQFHV AYHGVDRTVFHPVDDAERERVAASLGLSGRRYVGFLGTLEPRKNVPNLVRGWVEAVRDMD DAPALVLAGGKGWDEDIDPALARVPSALTVLRPGYLPLEDLPGFLSGCDVLAYPSIAEGF GLPVLEAMSCGAATLTTRLTSLPEVGGDAVAYCGTDPASIARELRRLLDDPGRRAALGAA AIERSGEFTWKRAAQVHIEAYKAAMR >gi|319977241|gb|AEUH01000238.1| GENE 17 16930 - 18051 1106 373 aa, chain + ## HITS:1 
COG:PA5448 KEGG:ns NR:ns ## COG: PA5448 COG0438 # Protein_GI_number: 15600641 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Pseudomonas aeruginosa # 110 372 104 373 375 134 38.0 2e-31 MSVRVALVVEQMWQPVPGGSGRYIVEVASRLGAAGVRALGIAAAHGRDEPSPASVGLTIP VVNSRLPRRLLYAAWDRAGAPGVDRLVAGPSGEADAVHATTWAIPPTSKPLAVTVHDVAF VRDPAHFTRHGNAYFRRALDKTRRRAGAIIVPSRATADDCVAAGLDAARITVIPHGLTHT PASADQVRAFQEAHGLTRPYILWVGTREPRKNLPTLLRAFARLAPASDLDLVLVGPAGWG EDATEEASLAEKIPAGRLHVLGRLGDADLAAAYAGARAFTFPSIWEGFGLPVLEAMAHGA PVVTSAGTCMEEVAGDAGLLVDPVDAGALAEALAAAAGTDHDRLAAAGRERASLFTWEES ARAHAAVYQDLVR >gi|319977241|gb|AEUH01000238.1| GENE 18 18048 - 19061 1104 337 aa, chain + ## HITS:1 COG:MTH172 KEGG:ns NR:ns ## COG: MTH172 COG1216 # Protein_GI_number: 15678200 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Methanothermobacter thermautotrophicus # 25 336 22 308 332 82 25.0 9e-16 MSARPTVRVVIVNWRNPALTLRAARSVAPQLGSGDHLVLVDNGSGDDSVAVIGGGLDSLR GAAAGARVSLVESPVNAGFGAGVAAGAGGADEDAIALLNNDAIVDDGYLDALLAPLGTTR GGAEVGATTALILLSGTWRPLADGEDRPHLVARDGARWTRLGDDEAGEGAVLVNSTGNLV DASGNGYDRDWLSPARGLDAPADVFGVCGGACAVSRRAWEAVGGIRTDLFMYYEDTDLSW RLREAGYAAVYVSGAVARHDHAASSGTGSPMFIRVNARNRLVVAAEHAPARVVVSALVRS LARAARAGFRGPGARGVREGLAGMPRALRARRRRRRG >gi|319977241|gb|AEUH01000238.1| GENE 19 19766 - 20227 -28 153 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MATWEESPGRFPPFPEFPARGAPRRASPTGTGGAPSTNAIESLNYQLRKGDQEPRPLPSS DETAVGLLWPPEPQQLRTRPRVGQGPGQTRQRTHRPAHTKATPLPTAEQAPARARTCTTK ALPAPLRPPTRRENQDPRRPKWRTSALSGAATL >gi|319977241|gb|AEUH01000238.1| GENE 20 20624 - 20896 353 90 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSSQQLRATGKWALVYALGALVLNLVCQTLGTYLLWWADLQKGSSRPLGFVVQFVVNVLQ GVSDFTVIGLCAAGAFITWRVAGTVEGGSE Prediction of potential genes in microbial genomes Time: Thu May 12 19:02:46 2011 Seq name: gi|319977236|gb|AEUH01000239.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00239, whole genome shotgun sequence Length of sequence - 4246 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 239 - 1129 1205 ## COG0451 Nucleoside-diphosphate-sugar epimerases 2 1 Op 2 . + CDS 1133 - 2992 2767 ## COG3754 Lipopolysaccharide biosynthesis protein 3 1 Op 3 . + CDS 2992 - 4246 1970 ## COG1835 Predicted acyltransferases Predicted protein(s) >gi|319977236|gb|AEUH01000239.1| GENE 1 239 - 1129 1205 296 aa, chain + ## HITS:1 COG:MK0724 KEGG:ns NR:ns ## COG: MK0724 COG0451 # Protein_GI_number: 20094161 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Methanopyrus kandleri AV19 # 5 275 2 285 309 74 29.0 2e-13 MVKTVLVTGAGGYIGRHVVRALLERGLDVSVVDPRLDGVDERARRLDINIFDGSADIYDR MGRPDAVLHLAWRDGFVHNSPAHIEDLPAHYRLIDALYKGGLERLAVMGSMHEVGYWEGA ITADTPCDPASLYGIAKNALRQTAQLLASANGASLQWLRGYYIVGDDKFGSSIFAKLLAA AEEGRKTFPFTSGKNLYDFISVQGLAHQIAAAVSQDAVTGTINICTGKPISLAERVEAYI RENGLDISLEYGAFPDRPYDSPGVWGDATRITEILEALDGPGARLDVGAAPGGQGA >gi|319977236|gb|AEUH01000239.1| GENE 2 1133 - 2992 2767 619 aa, chain + ## HITS:1 COG:mlr7559 KEGG:ns NR:ns ## COG: mlr7559 COG3754 # Protein_GI_number: 13476280 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis protein # Organism: Mesorhizobium loti # 1 617 1 641 644 473 39.0 1e-133 MRRGGIFLFFDPQGKVDDYILKCLGTLREYIDEILVVSNSPLDGTARERLEGVATTVIER ENVGFDVGGYQDGLDAFGWDRLRSFDELVLFNYTFFAPVNPWKNLFDRVDEWGDVDFWGI TEHDEVRPHPFLAKSVMPRHIQSHWIAVRNPLLTSADFREYWRSMPRISSYNDSIQWHET RFTEHFTKLGYAHKVAYPREDYPSRNPVFDNAAQLLADGCPILKRRNLFHDPLYLDRYAI VGADMLELAARSGYDTDLILTNLARTSKPRDLVTNAGLTTVVPPSAAPEAREKAASLRVV AIAHIFYADMADEIIDRLSVLPDGWRLVATTADEERKAAIEETMARRGAVGQVRVVASNR GRDISAFLVDCSDVLAGDDYDVVVKIHSKKSVQDEANAAQLFKDHLYENLLDSKDHVANI 
LAEFADHPGLGMALAPMPHMGYPTMGHAWFANRPPARELAKRIGITVPFDDHQPLAPYGS MFIARPRALRPLVEAGLTHDDFPPEGGYQDGSLAHVIERLLAYAVLSEGYYARPVMTPKW AGVCYGYLEYKLAATSSMLPAFSIDQVPYLKARIGQVPNLLGAVKTNIMLRSPGIGNALK PAYRVGRKAVAALRSLRGR >gi|319977236|gb|AEUH01000239.1| GENE 3 2992 - 4246 1970 418 aa, chain + ## HITS:1 COG:mll7336 KEGG:ns NR:ns ## COG: mll7336 COG1835 # Protein_GI_number: 13476108 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Mesorhizobium loti # 24 371 2 332 357 90 28.0 6e-18 MSRPLGKASAYRPHTPLSVAEAFDSRSNSIGFLRWLMAFMVIFSHAGPIAGFYGGEDLGV QVSREQSIGGVAVAGFFFFSGFLITRSRMGRATIWRYMWRRCLRIFPAFWAAMLFTVVVL APIAYWHTHGSISGYLHPQTESPLTYFANNMWLNLGQRNIAGMGETLPYYVLHGARDWNG SAWTLIYEFKAYILVAVLGLFGALANKKVGGAFAIVLIALNGLQWMGAGQLANVNILFRD PFMLMFLAPFAFGMLFTLYGDKIPMDSRLAYGALVFGALSYASGGWNIVGQYGFLYFLMY LAIRLPLQNWEKNGDLSYGIYIYAWPLMAFGAYFHLQDRGWWAYHLTVVIGCHILAYLSW HLIEKPAMSLKNYMPGWMDSLIERFRPSFEALVRRTVVPAYSSTRLATALAAEEAGEG Prediction of potential genes in microbial genomes Time: Thu May 12 19:02:48 2011 Seq name: gi|319977231|gb|AEUH01000240.1| Actinomyces sp. oral taxon 178 str. F0338 contig00240, whole genome shotgun sequence Length of sequence - 3896 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 88 - 1401 1786 ## Namu_4197 hypothetical protein 2 1 Op 2 26/0.000 + CDS 1468 - 2310 1440 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 3 1 Op 3 . + CDS 2310 - 3626 2190 ## COG1134 ABC-type polysaccharide/polyol phosphate transport system, ATPase component 4 1 Op 4 . 
+ CDS 3700 - 3895 96 ## Predicted protein(s) >gi|319977231|gb|AEUH01000240.1| GENE 1 88 - 1401 1786 437 aa, chain + ## HITS:1 COG:no KEGG:Namu_4197 NR:ns ## KEGG: Namu_4197 # Name: not_defined # Def: hypothetical protein # Organism: N.multipartita # Pathway: not_defined # 132 436 106 394 516 130 33.0 9e-29 MNAKASSTFKESESGLRDVVFDEDVPEQSWKHRLRFAPLALLAAASLAAATVGVWYARTH PSPGGAAGAAGSAPVDFNQVTAPTMPESCGATPKADVGPWTPGVAGTGGIASTEAAQASY TAGVGSGAPGYVEGRDGYLFLSDVFNDNFSQAVGRTRPTAGGVAAWDRYMSGIEQAAADV GATPLFLVAPATWEVYSDKLPQWSDRLTGTTSLDLMRTALWNHSWVDVRPALAAARQKDP ATPLYSRVNPHWSPYAAVVAWDAVLDCLGDVDPALAGLPRLSPTGASTGPAPNEYEALGW GGGAPDDWASPVYGADPGAVRLAQFESADQLPAADQLDEALRGAPEVPLSAPMDFLNKPA LTHNPNGRGTALVVRDSQGNNLGPAVEASFEYTIQVGHALDATAPTDVAALIAAYRPSVV LVEFTQRYLALTPSPAQ >gi|319977231|gb|AEUH01000240.1| GENE 2 1468 - 2310 1440 280 aa, chain + ## HITS:1 COG:PA5451 KEGG:ns NR:ns ## COG: PA5451 COG1682 # Protein_GI_number: 15600644 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Pseudomonas aeruginosa # 10 248 15 233 265 94 27.0 2e-19 MSDYKRVIDLTLRMAQRSISSEFKGTALGRVWSFINPLATIAVYALIFGVVFRGKADPGV NSGLTSFALWIGVGVIAWNFISSGIQRGMDSLVGNAGLLTKVYFPRQVLVYSTVLALAYD FAFELAVIGIVMLVAGGPGVLLMIPSVIAVTVLACLFVTGMGMVLAIASVYFRDISHLWK IFNQIWMYASGVVFSLSMLNDVQNELYEQGWRIGGQPLPIVTLFRLNPAELFLEAYRSCL YGFANPSWKVWLGCAGWALGVYALGVLVFKRFSARIVEEL >gi|319977231|gb|AEUH01000240.1| GENE 3 2310 - 3626 2190 438 aa, chain + ## HITS:1 COG:CAC2328 KEGG:ns NR:ns ## COG: CAC2328 COG1134 # Protein_GI_number: 15895595 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate transport system, ATPase component # Organism: Clostridium acetobutylicum # 5 309 4 336 419 261 38.0 3e-69 MTNAISISDVSKRFRIYKNRNQSLKGAFLQRSRAQFEEFWALSDVSFDIPQGKTFGLLGH NGSGKSTLLKCIARILAPDKGSIATTGRMAAMLEVGSGFHPELSGRENIYLNGAILGMSR 
REIDSKLDAIIDFSGVERFIDQPVKNYSSGMYVRLGFSVSIHVEPDILLVDEVLAVGDME FQNKCMDKFAQLKDQGRTVVVVSHGLEQMRTFCDRAAWLDHGKVVAVGTAAEVIDTYSDV AHHAVEVPGGGTRFGSGEAQITRIDILDGAGESVSLVYPGQCMRMRLHYRAKERVEGPVF GVSVDTREGTWVWGLHGVDATYVPRAIEPGEGHLDIVVPHLALNPGSYTVSAAIQNRDMT AVIDALQKAKRFDVLPGPGMESGGLIILGASFEGLTPAVPMLDIPKRGKADFDRLARLQV QQADSLAQQEDAQEKQED >gi|319977231|gb|AEUH01000240.1| GENE 4 3700 - 3895 96 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTSMRPKGLLAALSLVLAIGVLPVPAASADDAGPQSAAAQSSGASPRAEEGGGQSGAPAA DPAPT Prediction of potential genes in microbial genomes Time: Thu May 12 19:03:02 2011 Seq name: gi|319977225|gb|AEUH01000241.1| Actinomyces sp. oral taxon 178 str. F0338 contig00241, whole genome shotgun sequence Length of sequence - 6012 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1253 1750 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 1427 - 1459 -0.5 2 2 Op 1 . - CDS 1338 - 1922 614 ## HMPREF0573_10943 hypothetical protein 3 2 Op 2 . - CDS 1928 - 2971 1275 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 4 2 Op 3 . - CDS 2964 - 4772 2203 ## Arch_0077 protein of unknown function DUF2142, membrane 5 3 Tu 1 . 
+ CDS 4922 - 5929 1688 ## COG1088 dTDP-D-glucose 4,6-dehydratase Predicted protein(s) >gi|319977225|gb|AEUH01000241.1| GENE 1 3 - 1253 1750 416 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 10 151 567 706 744 130 42.0 5e-30 AAPDLTGRGAWVGEGARWWWRHSDGGYPASQWARIKGSVYSFDASGYMRTGWYREDGAWY FLTPSGAMATGWVQVEGDWYYLDPATGVMAVGDLQVGGSRYCMDASGAMRTGWVSSGDGW RFYAPSGAMATGWAQVAGDWYYLDPATGVMRTETLVLNGRTYEFAPSGAWMGYEAPAGYL QPTDRITGLGASTNTLTWGMNGIKVMIVQRRLGIWHTMKLASVDASFVTAVKNFQWRAGL SQTGVVDEQTWNAMGTGWPWTIDQYQAQPVPLTARRSERVEAMIGYAWNQTGTPYTWGGA GPADLGYDCSGLVLQSLYAGGMDPQPIDVIKHGWPDYRTSQELYNYSGFQHVPLSERQRG DLVFYTSDGSVTHVAIYLGDDMVVHTDWMGRPARMQYITVGYGWDRMTWDVVRPFP >gi|319977225|gb|AEUH01000241.1| GENE 2 1338 - 1922 614 194 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10943 NR:ns ## KEGG: HMPREF0573_10943 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 23 194 28 205 206 76 33.0 5e-13 MRASTRFCAALAVGCAALAPACSSPDAAVQSGTAASPVPQSTGQLPQAEPASGGCAAMDE VFTRAVQTTPQGQAFAALTWEGPASGASADADPQQVWQAFTAVLDSDPWKSEFEAAATDD ASRHAAGALRTYVQVNQRISSGALPEYADEQQAQDDLKAGRAPAPNPEYEQAVADAADAH TALTQCMPHWPVLF >gi|319977225|gb|AEUH01000241.1| GENE 3 1928 - 2971 1275 347 aa, chain - ## HITS:1 COG:alr3070 KEGG:ns NR:ns ## COG: alr3070 COG0463 # Protein_GI_number: 17230562 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. 
PCC 7120 # 4 222 3 210 318 83 27.0 7e-16 MSDKVTIVLRTRNRPRMLARALASIGAQTFTGYRVVVVNDAGDEQQVRAVVDAQDPGLRE RVELVTNKTSKGREAALESGLAASALEYFAVHDDDDAWHPRFLEETVAHLDAHPELGGVT TRCEIVRERVRADGECTEIEREALSTDSSGLSLVDTLVENYAPPISQLIRRRVADRIGHW DGTLSTQADWEFNLRLLAATPVGFIDGPPLAEWHHRDTEDQDLGNSIVTDAKAHAWDNLH IRDRYLRSALASQDPARPDLGQALLSAELYRRTRAELRRADGSIHSALDLVHADMVSTMA SLHDEVHALRQEVSALRAQVESHNALQEAVKRTVGMPVRAIRRIRRR >gi|319977225|gb|AEUH01000241.1| GENE 4 2964 - 4772 2203 602 aa, chain - ## HITS:1 COG:no KEGG:Arch_0077 NR:ns ## KEGG: Arch_0077 # Name: not_defined # Def: protein of unknown function DUF2142, membrane # Organism: A.haemolyticum # Pathway: not_defined # 26 508 17 490 508 308 39.0 4e-82 MSRTHPRRFCEVPMSEHPPVPRAPSRPARALLVAALAVSALIAGLGWVFASSPGSAPDDD YHLVSTWCPRPIASSGCETTTIDGEVHVMAPVTTSHAQCEAFSPDKSHACINDYSDSMMF PSYRYNDGAYPYGFYQFHHLFAGASVEASAWHMRVVNVLIALVLLGGVCALAPASMRQGL FLAITLAWIPMGAYFIASNNPSSWAITGVFSYGAGLFGALRSDGRRRWALLGIAVVGALL CFGSRGDAAFYVFVVSLGVLIAVGRRDRIVEIAGASVLSAIGVWLMAGGGQAGTIASSSA TVSLSERLAVVISNIRYLPEYFAGFAGLYSGPGWRDTPLPGACVVLGLMLLGVGVLIGGR AMTWRKAASVVVVLGAMAGIPILIATPPTFPNLGGYHSRYALPLLGVWLLLWLAIGRGQQ RFSRTQLVLFVAATGVVNASALHTTIGRYTNGLLHDGLMGWVSPANLNRKVEWWWAGMPL SPMGLWALASAAYVAAVVIALRLLAAGPAPAPTAVAAAPSTAPSPAAEAQTTGSDSPSSG TGAAEGATGADGAATAPSPANGGPEPATGADDEAAAPSPSHAAETQTTGSDAPSPTQEPT HE >gi|319977225|gb|AEUH01000241.1| GENE 5 4922 - 5929 1688 335 aa, chain + ## HITS:1 COG:SPy0936 KEGG:ns NR:ns ## COG: SPy0936 COG1088 # Protein_GI_number: 15674955 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Streptococcus pyogenes M1 GAS # 3 331 5 339 346 476 68.0 1e-134 MKIVVTGGAGFIGANFVHTLLEDHPGVDVVVLDKFTYAGNRASLTDVPADQAARLAVVEG DIADADLVDGVVAGADAVVHFAAESHNDNSLLDPSPFIQTNLVGTFTLLEAVRRHKVRFH HISTDEVYGDLELDDPAKFTPATPYNPSSPYSSSKAGSDLLVRAWVRSFGVEATISNCSN NYGPYQHIEKFIPRMITNRLRGVRPRLYGDGLNVRDWIHVRDHNTAVWEILMRGRIGETY LIGADGETNNREVVAVLNELMGYPADDFDHVTDRPGHDLRYAIDNSKLVTELGWEPRFTN FRDGLADTIAWYTDNEAWWAPLKEAVEAKYAAEGH Prediction of potential 
genes in microbial genomes Time: Thu May 12 19:03:17 2011 Seq name: gi|319977221|gb|AEUH01000242.1| Actinomyces sp. oral taxon 178 str. F0338 contig00242, whole genome shotgun sequence Length of sequence - 3659 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 964 1276 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Term 1179 - 1219 1.2 2 2 Tu 1 . - CDS 1228 - 1971 954 ## gi|154509515|ref|ZP_02045157.1| hypothetical protein ACTODO_02047 - Prom 2036 - 2095 3.8 + Prom 1909 - 1968 3.5 3 3 Op 1 . + CDS 2164 - 2460 354 ## HMPREF0573_11078 WhiB family transcriptional regulator 4 3 Op 2 . + CDS 2481 - 3657 1172 ## HMPREF0573_11079 glycosyltransferase Predicted protein(s) >gi|319977221|gb|AEUH01000242.1| GENE 1 2 - 964 1276 320 aa, chain + ## HITS:1 COG:L2599 KEGG:ns NR:ns ## COG: L2599 COG0463 # Protein_GI_number: 15672183 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Lactococcus lactis # 10 234 2 216 319 101 34.0 2e-21 SRAPEGGPPRVSVLMATYNGASHGGRWLREQVESILGQRGVDVRLVVSDDGSTDSTVDLL TRWSASDPRIRVLPRRQGEPGVAGNFLHLFTAHDPDGSFVAFSDQDDVWHLDKLSSQLEL MGSRGADVVSSNVMAFRCDRDGSVRDRHLIRKSGPQRRWDYIFEAAGPGSTYVFSPDAHR RLVGALARLDPTGVDVHDWYLYALARALGATWVIGEEPTLEYRQHDSNVQGANSGAGATR ARMEKLRSGFYRRQFILVAQACLSVGSYDARERRALESLAANLRSTSVWSRFAFALRFAR IRRQLTEGIKLAGARVLGVW >gi|319977221|gb|AEUH01000242.1| GENE 2 1228 - 1971 954 247 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154509515|ref|ZP_02045157.1| ## NR: gi|154509515|ref|ZP_02045157.1| hypothetical protein ACTODO_02047 [Actinomyces odontolyticus ATCC 17982] # 41 242 1 201 203 175 50.0 2e-42 MLLNNTISVSSVTKLASCLNARSGPALTVYSTERVELGGPVAARWITKIANYLQGEAGAP LFGDEEPRPARLHTRVEPWQQILWEATARCLGWEILDAPRPLPGDVVVAADADGLLARAA AAGAHALAQPATHLAFAWDGPLPQGVLDALQEIGAQSDTPDHPAPDLVEAALEEALAGSR 
PRPEPAGDRVALLWSARRGPAQVLDQWARGRSVVVIDPAHHEPVGAQALLAAEGMEGPIT IYMPNNP >gi|319977221|gb|AEUH01000242.1| GENE 3 2164 - 2460 354 98 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0573_11078 NR:ns ## KEGG: HMPREF0573_11078 # Name: not_defined # Def: WhiB family transcriptional regulator # Organism: M.curtisii # Pathway: not_defined # 1 86 1 85 97 125 72.0 7e-28 MWNIFGDGPIDWSDPADDELYARVDGPLAWQAHALCAQTDPEAFFPEKGGSTREAKAVCQ SCDVREECLEYALANDERFGIWGGLSERERRRLRRMAG >gi|319977221|gb|AEUH01000242.1| GENE 4 2481 - 3657 1172 392 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0573_11079 NR:ns ## KEGG: HMPREF0573_11079 # Name: not_defined # Def: glycosyltransferase # Organism: M.curtisii # Pathway: not_defined # 1 369 5 379 1110 196 35.0 1e-48 MAALVVSRGRSADCSEVLRAIAEQTVAPASVTVVDVAGRGAVPWDATALPEGVRLVRVGR AKNLGDAIRRAVDEVSGSPGAFTGSRWWWVLHDDSAPESTCLARLWRVAQSGRTIGAVGP KQLSWDGSRLLEVGVFATRSARRLERVRPDEIDQGQYDGTTDVLGVGTAGMLIDSRAWAA VGGTDPALGPFGDGLDLGRRLHLAGYRVVAAPGARLRHRRRSLAPASDPGGDGPPQDPDG RPGRQDEDASFRRRRRAQLYNWMKAVPAWQVPLLMAWLVIWSPARALGRVVTGAGHLALA EVGAWLSVVLATGQLLVGRFRASRTRTVPRSALRALETRPRSLAREPEAAPDDGPEERID PLVESSQRRYRLSATAALLGALAVASIGAALT Prediction of potential genes in microbial genomes Time: Thu May 12 19:03:39 2011 Seq name: gi|319977216|gb|AEUH01000243.1| Actinomyces sp. oral taxon 178 str. F0338 contig00243, whole genome shotgun sequence Length of sequence - 4227 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 1980 2167 ## Bcav_1207 glycosyl transferase domain-containing protein 2 1 Op 2 . + CDS 1977 - 3368 1467 ## RSal33209_3015 hypothetical protein 3 2 Tu 1 . - CDS 3475 - 3897 428 ## HMPREF0424_0592 hypothetical protein 4 3 Tu 1 . 
+ CDS 3947 - 4226 284 ## Bcav_1215 hypothetical protein Predicted protein(s) >gi|319977216|gb|AEUH01000243.1| GENE 1 1 - 1980 2167 659 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1207 NR:ns ## KEGG: Bcav_1207 # Name: not_defined # Def: glycosyl transferase domain-containing protein # Organism: B.cavernae # Pathway: not_defined # 6 658 425 1117 1121 103 30.0 3e-20 GQGFSGGAWAGLPSRWTDLWEAAWASWIPGGDGYPAPPDPLLVPLAVLSWPLSLIGATPG SLAASVLVAAAPLAVLAAWPATRLMTSSPPLRALLSLLWACLPPLVLSQFHGQLAGVAVH LAVPVLACSWARLAGADPQLVVDGADGPTPVTGIRLRGAHGRAALAAAVIAAAAPWALIA TTAAGIAVSRRRGAGGSRPGGPLLATLVPAWALLAPTLGSIAAHPAAWRALAATGGGAHA HTPAQGWQAALGLAAAPASQVEAWALALPFALLALWAVGAALGALRRGDTRPSGIAGLAL MSVAAAHVLSLLPVGQDGGAVVRAWSAPALTLACALFALALARSARPSQEPWDSAALKRT GRLALAACLVPLLAMAGGAWERSAQESAFAGASTYTMSLDERVHDVHSPLVPAVSAQAAL SARAGRILVLTGSPSDTLTAQIWRGNGRAMTEGSPLTRALALARSRAAAQRGAIGDPATA SLAGLALTLVVYPDEATAAALADHGIDTILVPLGAPGGDDIAQGLARAPGLEKVGDTDAG AVWRLRPDGAQSARVRILATGGGWANVGSTNLTTDGAVEASGPTIALAERADSGWRATLD GRSLDPVASADGWSQSFALAGAGRLTIVHRAWWTYPWWAVSGLALVVAAAGSVPLRRRS >gi|319977216|gb|AEUH01000243.1| GENE 2 1977 - 3368 1467 463 aa, chain + ## HITS:1 COG:no KEGG:RSal33209_3015 NR:ns ## KEGG: RSal33209_3015 # Name: not_defined # Def: hypothetical protein # Organism: R.salmoninarum # Pathway: not_defined # 36 301 125 420 563 75 31.0 5e-12 MRRPPAPLAALAAGAACVCLVASAQAWAPAPRAESVPTAPLDVAPSSVTLACPPGVTNPF KPDQSAPGGAWSTTSAAPLAPAPATVTESGTGAGTPIPSAFVVAGQGGGELAGLSVTGCS TPLSEQWLAAGATTSGSDVVLTLANPSATASTASIEGYGGSGRIGEGPQQVRVPAGKSVS VLLAGWFPDETNLAVRVSADAGGVAAWAQTSVMDGEIPQGASWGPSVRPSTATVIPGIEA DAGSSLRIAVPSAEAASVSVTVHDDSGSAPLPAGDVTVDGRTAIDIPLDGMTRSKEPVAL SIASDRPVVAQVTSARIGAPWTDPAHAWVSRSVISPASPVTRASLPGAKDVASLVTAQLA ADPLRATSVETPSGVDAVRTRITLLAAGADQVDASVGGQALQLSSGTPLTVDLPASGGTL TATAPVTAALVVEADTPVGTLHAAWSIGTLGITATTANVVAED >gi|319977216|gb|AEUH01000243.1| GENE 3 3475 - 3897 428 140 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0424_0592 NR:ns ## KEGG: HMPREF0424_0592 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis # Pathway: 
not_defined # 8 136 9 137 142 86 38.0 2e-16 MHPSVRRRSARDRHGRGSRSPLFFPGTPAWRTRREDFDLLVSSLVAELADRWPAVSSIEF ATEDVPPSDPAPWESHSVVLARIFPADRRRGLRDRIVVYRLPLVLRCAPAEVEGVTRRLL VERISHVLALPPDEVDDLLR >gi|319977216|gb|AEUH01000243.1| GENE 4 3947 - 4226 284 93 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1215 NR:ns ## KEGG: Bcav_1215 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 11 92 5 86 134 111 73.0 7e-24 MSYPCGVIAARHCSKPGCSRGAVATLTYDYKDSTVVLGPLATAADPNSYDLCDEHAEHLT APRGWQVVRLATNFEPAPPSGDDLLALVDAVRR Prediction of potential genes in microbial genomes Time: Thu May 12 19:04:04 2011 Seq name: gi|319977213|gb|AEUH01000244.1| Actinomyces sp. oral taxon 178 str. F0338 contig00244, whole genome shotgun sequence Length of sequence - 2130 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 115 120 ## 2 2 Tu 1 . - CDS 135 - 977 750 ## COG1714 Predicted membrane protein/domain 3 3 Tu 1 . 
+ CDS 1011 - 1985 1147 ## COG1300 Uncharacterized membrane protein Predicted protein(s) >gi|319977213|gb|AEUH01000244.1| GENE 1 2 - 115 120 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no YAARSVGAPDTSSPLDPSSPYAKRRAQFRVVAADDGE >gi|319977213|gb|AEUH01000244.1| GENE 2 135 - 977 750 280 aa, chain - ## HITS:1 COG:Rv3695 KEGG:ns NR:ns ## COG: Rv3695 COG1714 # Protein_GI_number: 15610831 # Func_class: S Function unknown # Function: Predicted membrane protein/domain # Organism: Mycobacterium tuberculosis H37Rv # 16 244 4 233 310 100 35.0 3e-21 MSASRVATIDLSAESVRTGEAVELDIIPAEPPYRFVSAFVDFAAYAATAVTVFYMLMHSW RHPTDQQQKIFSILLVASATLLVPLAVEALTRGSSLGKWAFSLRVVRDDGGPASLRHIFV RRLVGVIELTLFGLPALVSMFLTTRGKRLGDLAAGTIVVRQPTGALHPPLLMPPALAAWA STAVVLPVDRALRREALAFLRANAELAPAVRAAGGADLAERLRAYAETPVPAGAHPEQVI AAILVIERDKDWRRERTRVAATQERLGRATAGQFGIPPTA >gi|319977213|gb|AEUH01000244.1| GENE 3 1011 - 1985 1147 324 aa, chain + ## HITS:1 COG:MT3796 KEGG:ns NR:ns ## COG: MT3796 COG1300 # Protein_GI_number: 15843312 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Mycobacterium tuberculosis CDC1551 # 3 316 73 388 395 145 33.0 1e-34 MRVSHERTWNRLHDLTARRRLSGADIDELDRLYRLATSDLARIRTTDPTSELIGPLSRDL ASARGRLTGTPGVGLATVSRYFSVALPRALYGIRWAAVGVALVFSAIVALYVVHMEAHPD LYAALGSPSRLRAIAMTEFVQYYSQGTEAEFASSVWANNAWISALAVAGGITGAFPLYLL WSNALNTAVTASIVIHFGGVWHFFRYILPHGLPELTAVFLAIAAGLRIFWVMLVPGPRTR VEALGRCARQTATVVGGVTILLAVSGVLEGYVTPSELPDAVRVGLGALTVVAVWVYIFVV GRRAHMGVDDGDVEDAGYYQPVAG Prediction of potential genes in microbial genomes Time: Thu May 12 19:04:09 2011 Seq name: gi|319977209|gb|AEUH01000245.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00245, whole genome shotgun sequence Length of sequence - 2770 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 23/0.000 - CDS 375 - 1694 1178 ## COG1721 Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) 2 1 Op 2 . - CDS 1704 - 2768 1148 ## COG0714 MoxR-like ATPases Predicted protein(s) >gi|319977209|gb|AEUH01000245.1| GENE 1 375 - 1694 1178 439 aa, chain - ## HITS:1 COG:MT3795 KEGG:ns NR:ns ## COG: MT3795 COG1721 # Protein_GI_number: 15843311 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) # Organism: Mycobacterium tuberculosis CDC1551 # 1 439 36 475 475 233 41.0 5e-61 MVVSWRVSALVLLGAIPLIAAPSVGFVWCWAGGCLVLGALDALTAVSPRELRVRREVDGP IRADETSASVLTITNPTKRRLRLVVRDGWPPSLRPSPARHAVRVEAGSSVVLRTRLAPTR RGTREADHATLRVWGALGLGGRQISIDAPLSLSVLPAFRARVLLPSRLARLHELEGTTPT TLRGAGTEFDSLREYVRGDDPRDIDWRASARSKELMVRTWRPERDRHVVVVVDCGRASAA LLGAPDADADTIEIGVAPRLDSGIETALLLGALATRAGDQVHLLALDQRVCARVSGIRGG AFLGAAATAFQEVVPSLDVTDWQLAVSQVRATVRHPALVVLVTRVPPAGMDVDFLEAVRA LSDRHTVLIASATDPDQDRARTDAEDVLMRAAAALEERTDQAGVADARSAGAVVVSAPSG ALPARTADTYIELKKAGRL >gi|319977209|gb|AEUH01000245.1| GENE 2 1704 - 2768 1148 354 aa, chain - ## HITS:1 COG:MT3794 KEGG:ns NR:ns ## COG: MT3794 COG0714 # Protein_GI_number: 15843310 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Mycobacterium tuberculosis CDC1551 # 7 354 22 358 358 369 61.0 1e-102 GQALGGGAPPTAPGAPSAPSSFSAPSQEQMNGRALSPQEERTKDALMALRSEIGKAVVGQ EGAVTGLIIALVAGGHALLEGVPGVAKTLLVRTMATALDVDMARIQFTPDLMPADITGSL VWDAGSSAFEFRRGPVFTNLLLADEINRTPPKTQSALLEAMEERQVSVDGQTHALADPFM VIATQNPVEYEGTYPLPEAQLDRFLLKLELPLPGRDAEFGIIKRHRDGFNPLALADAGVR PVASPADIAAARGAAARVDVSDTVLAYLVDVCRATRESPSVRLGASPRGATALLRTTRVW 
AFLQGRGFATPDDVKAMAAPTLAHRIGLRAEADLEGLTAARVVEGVLAQVPVPR Prediction of potential genes in microbial genomes Time: Thu May 12 19:04:10 2011 Seq name: gi|319977204|gb|AEUH01000246.1| Actinomyces sp. oral taxon 178 str. F0338 contig00246, whole genome shotgun sequence Length of sequence - 3936 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 97 110 ## 2 1 Op 2 . - CDS 90 - 1256 1375 ## RSal33209_1646 hypothetical protein 3 1 Op 3 . - CDS 1253 - 1870 695 ## Gbro_4819 hypothetical protein 4 1 Op 4 . - CDS 1891 - 3084 1422 ## Jden_1829 hypothetical protein 5 2 Tu 1 . + CDS 3295 - 3927 823 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain Predicted protein(s) >gi|319977204|gb|AEUH01000246.1| GENE 1 1 - 97 110 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTNPIPPSSGSPFGPPRTGAAPAGGTQGQALG >gi|319977204|gb|AEUH01000246.1| GENE 2 90 - 1256 1375 388 aa, chain - ## HITS:1 COG:no KEGG:RSal33209_1646 NR:ns ## KEGG: RSal33209_1646 # Name: not_defined # Def: hypothetical protein # Organism: R.salmoninarum # Pathway: not_defined # 53 388 2 346 349 128 31.0 3e-28 MRARSTVVVTPTGQSLLARARALVLLVGIAVIATALLVLVPDPSHSTTPLSTTNHGEDGA QALVRVLGDHGVSVKEAAEGDIASADADTTVVVIDPEKMPSRAVAAVQAAPNVVFVGTLY STADALPPYLEGLRVMTPFSSTPVKTVNPGCDSPTAARARTLTSSIYRVTGDPEQDPAQE WTLCFTADGRGFQYAEKESDGRFRAVITDPARLANSAVLRDGNASLAIGAMGRTRSVLWY MPTGEESATSASDLAPAHLRPLFLLLVAAALVLALARGRRMGRLASEKLPVEVPASEILV GKARLMRSNRAFGHAAQALRSASARRIAQRLGVPASAPAEQLAGALERRGVDPARSRALL WGPAPASHSALTVLAEQLEALEKEIRND >gi|319977204|gb|AEUH01000246.1| GENE 3 1253 - 1870 695 205 aa, chain - ## HITS:1 COG:no KEGG:Gbro_4819 NR:ns ## KEGG: Gbro_4819 # Name: not_defined # Def: hypothetical protein # Organism: G.bronchialis # Pathway: not_defined # 1 205 11 222 227 74 30.0 2e-12 MTPDLDEAQSWLSEELSRGEYHRDLDPVSKFFARIYRSLAESFNWDGHGVPPVQLIVLVM 
VIVLVAVALIALILNPVRARRRVSASVFDEDGQSAQEVRAALKRALAEGDWDQAFVWRYR ILVLEAGALGVLADTPGLTAHEAASQAARSAPALAQELMAEADLFDTVRYGEGHAGSADV ERLGALTERALDALRRTATAGAGAR >gi|319977204|gb|AEUH01000246.1| GENE 4 1891 - 3084 1422 397 aa, chain - ## HITS:1 COG:no KEGG:Jden_1829 NR:ns ## KEGG: Jden_1829 # Name: not_defined # Def: hypothetical protein # Organism: J.denitrificans # Pathway: not_defined # 41 392 57 412 414 117 28.0 8e-25 MSNQWTAPGEGEPQSPLAQGPTGAPGPWGQAPLPPGTPVPPPGTPVPLPGAPMGGPGASY GQGPAWAAPPRPGIIALRPLTLSDLFDGSFKAVRTNPTVFFGFALAVNAILAVVNALVTW FLINSVFDAVSRTDYRSANDVSGMLGGLTTQFITSWGLTAAASFVGSTLVSGMLCVAVIE AAAGRKPTLGQTWRRLAPRFWPLVATTLLTGILLVLIAIVLALVGLVVIAALAGAISSTR GGMGAVAAIGLISLALIAVLVFTQFCFAVRFLYAPAATVIEGRSGFGAIGRSWELTQGRW MRSVGRYLLIVLLSMVAMWIVSLGVSSVLDMSVFATAVDGTVSAAVVKQTVSTAVMTVLQ SLVTPLVTSYILLMYLDERIRRENFAATLAAAAAGNQ >gi|319977204|gb|AEUH01000246.1| GENE 5 3295 - 3927 823 210 aa, chain + ## HITS:1 COG:Cgl0731 KEGG:ns NR:ns ## COG: Cgl0731 COG0745 # Protein_GI_number: 19551981 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Corynebacterium glutamicum # 1 207 17 223 226 262 63.0 4e-70 MIGIVLEGEGYTVSTCPDGAKAVAAFQEHHPDLVLLDVMLPGMDGFEVCAALRAESNVPI VMLTARSDTADVVTGLEAGADDYVPKPFKPRELVARVRARLRGREDAGEERIALADLEID VPGHAVRRGDRLIALTPLEFDLLATLARTPSKVFSREELLEQVWGYRHAADTRLVNVHVQ RLRSKIERDPEKPEIVVTVRGVGYRAGTGA Prediction of potential genes in microbial genomes Time: Thu May 12 19:04:32 2011 Seq name: gi|319977200|gb|AEUH01000247.1| Actinomyces sp. oral taxon 178 str. F0338 contig00247, whole genome shotgun sequence Length of sequence - 4902 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1678 2122 ## COG0642 Signal transduction histidine kinase 2 1 Op 2 . 
+ CDS 1675 - 3369 1801 ## HMPREF0573_10408 lipoprotein LpqB 3 1 Op 3 6/0.000 + CDS 3372 - 4082 521 ## COG1040 Predicted amidophosphoribosyltransferases 4 1 Op 4 . + CDS 4245 - 4898 569 ## PROTEIN SUPPORTED gi|227384070|ref|ZP_03867487.1| SSU ribosomal protein S30P Predicted protein(s) >gi|319977200|gb|AEUH01000247.1| GENE 1 2 - 1678 2122 558 aa, chain + ## HITS:1 COG:Rv3245c KEGG:ns NR:ns ## COG: Rv3245c COG0642 # Protein_GI_number: 15610381 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mycobacterium tuberculosis H37Rv # 25 554 23 530 567 299 37.0 1e-80 KPSRSSAALSRLGAVVRAVILATPLGGFARSCARAWRRSRLWRFISSTHPAAVIRQTLSL RLALVITATTLGLVLVFGFVVSAQMRSSVFDSRRAQILGDASLRFSSVQSVLGQSTASTV DQIQEMVRSTLSSLRASAAGAGAINVALLRSPSAPESFRINQIGDQQMEAVIDAPIRQAV RGGGAAQWQSVAIASTDGDAPGILVGMSVQVPKAGGHELYILYSLESDQRQITMVTRTLF VTALPILVLMPAGVFWVLHRMLAPVRKTAQAATDAALGDLSVRVEVVGDDEMADLGYAFN DMTASLQHTIDEYDELSRLQQRFVSDVSHELRTPLTTIRMAEDVIWHNRASLPSGAKRSA ELLHEQTERMDSMLADLLEISRYDAQSALLDPEVRDLRPLVRKVVDANRDLAERQGVRVV VDAPAARCAAEIDERRIERVLRNLVVNAIEHADGSDVVITIAQSGTEVAVRVRDGGVGMD EETAAHVFDRFYRADTARARTTGGTGLGLAIATEDTQIHGGQLEAHGVPGEGASFLLTLP KTAGDPLVTRPLELWEGA >gi|319977200|gb|AEUH01000247.1| GENE 2 1675 - 3369 1801 564 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0573_10408 NR:ns ## KEGG: HMPREF0573_10408 # Name: not_defined # Def: lipoprotein LpqB # Organism: M.curtisii # Pathway: not_defined # 2 564 8 558 558 162 28.0 5e-38 MRAVRASLAAVCAAALAGCSSFPMSSAPEPFDVSSRDNGVVQFAADGPSDGADPVSLVKD FLRACAAGTNDDYATARLFLTESSASSWKPEEEVLVHESDTENTPTIAPRDGDDDSDSAV TVTVSFTAIASVDSVGILTHSPHTTIDQEYSLVRKGGQWRIESPADVVVMSRSAFTASYQ LANLYFPAATQDALVADPRWLPSRRLAGHLLDGLAKGPRPSLAAAVVNAIAAGGPTPSRG VETTERAARVDLSGPEPASEQAAGLLVWEIRETLTQVPDISTVEVTISGQPLSAAAPPMG PSYSLDTLVGLSEAGMVYVSGATVSSVPVSETPSPTASRPTASPVSASLVAWNDGDATRI ARVGSAVATTAGSLALGPSIDRFGWVWTGGSDEDGPIAVVNEEGPPSAVAIESDPDGDVR GVRLSADGARALVVRGRTSLWTATVERESGGKPVRLSHFDEVPTGGAGLVDASWAGAGTI 
MYVTEPDGPQGEPQLVTLPLGGLPTSTPLSARATAMSAGGSPSAVVLVAQEAQGQSGAAP QALTRSGALWQPLPAGVREARYAG >gi|319977200|gb|AEUH01000247.1| GENE 3 3372 - 4082 521 236 aa, chain + ## HITS:1 COG:Rv3242c KEGG:ns NR:ns ## COG: Rv3242c COG1040 # Protein_GI_number: 15610378 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Mycobacterium tuberculosis H37Rv # 17 220 3 197 213 59 32.0 4e-09 MAAVGRLLGALARGGADIALPTSCPGCGAWDVDLCPQCRALARRECVWSVLEAPGAPGGV DLCALGAYEGALRRLVLAAKHSPRRDLGPFLAECGEGLGAALAGRFHSEGDRPAWDEIWV VPAPSSWRRRLRSRPVTAPLAAGVASALASLGACRRASAVDCVRLALGARSQSGKSGASR RAGRAGSMRALAGAPPGAGVVVVDDVVTTGATMREVIRVLGGCDAVAAVCAPGRVP >gi|319977200|gb|AEUH01000247.1| GENE 4 4245 - 4898 569 217 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227384070|ref|ZP_03867487.1| SSU ribosomal protein S30P [Jonesia denitrificans DSM 20603] # 1 217 1 215 219 223 51 2e-58 MDITVVARNAEIHPNFRQYVEEKVSKVTQFYPRAQRIDVELTHERNPRQADTAERIELTV YGKGPIIRAEAKSADRYAAVDIAAGKLYERLRRLRDRAKDHRRGYARDLTDTEVLEAPVP PQQEEAPAEEAAPKPLRSAEDLKVGEAREEQWGDSPIIVRQKVHEAPPMTVDEALDQMMM VGHPFFLFVDKETRQPCVVYHRHGWTYGVLRLNTTID Prediction of potential genes in microbial genomes Time: Thu May 12 19:04:42 2011 Seq name: gi|319977197|gb|AEUH01000248.1| Actinomyces sp. oral taxon 178 str. F0338 contig00248, whole genome shotgun sequence Length of sequence - 3027 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 223 - 3025 3792 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) Predicted protein(s) >gi|319977197|gb|AEUH01000248.1| GENE 1 223 - 3025 3792 934 aa, chain + ## HITS:1 COG:ML0779 KEGG:ns NR:ns ## COG: ML0779 COG0653 # Protein_GI_number: 15827340 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Mycobacterium leprae # 3 840 1 834 940 956 60.0 0 MSLIDKILRAGEGRTLKKLDRLASQVDALAEDFEELTDEELQAKTQEFKDRLSDGETLDD VLVEAFATVREASWRILRLRPFHVQVMGGIALHQGRIAEMKTGEGKTLVATMPAYLRSLT GEGVHVVTVNDYLAKYQSDLMSRVYSFLGVSCGCVLVGQTPAQRREMYAMDITYGTNNEF GFDYLRDNMAQVPEDLVQRGHAYVIVDEVDSILIDEARTPLIISGPADGDLNRWYVEFAR IARLLKRDEDYEVDEKKKTVGILESGIDKVEDQLGVENLYEAANTPLIGFLNNAIRAKEL FFLDKDYIVDGGEVLIVDEHTGRVLPGRRYNDGMHQAIEAKEGVEIKAENQTLATITLQN YFRLYPEGSRAGMTGTAETEAAEFASTYKISVVPIPTNKPMIREDKPDLVYPTEDGKLGA IVDDIEERHKKGQPILVGTASVEKSELLSKMLRARHIPHQVLNAKQHAREAAVVAMAGRK GAVTVATNMAGRGTDIMLGGNSEFLAQANLAAEGLDPKENPEEYREAWPKALEAAEEAVE AEREEVRELGGLYVLGSERHESRRIDNQLRGRSGRQGDPGESRFYLSMEDDLMRLFNSGM AQRIMASGAYPEDMPLENRLVSRSIASAQHQVEARNAEIRKNVLKYDDVMTGQRETIYGE RRKVLEGEDMAPQLRLFTESLVTGLVDEAIADKAVDEWDLEALWENLRAYYPPSVTLEEV EEEHGGRASLVREDLVTELVGDIHAVYADTEERLNANPLAVAQLGEEPMRALERRVVIAT VDRLWREHLYEMDYLKEGIGLRAMGQRDPLVEYKDEGAQMFQTMVERIREEAVQQVFSFA KQFERALADAEEQAGGAITSARVAPSQDKDEEASPSVEEIAAADAAASQEARAAVARARS VMGSVGREHAPTRVSYSSSQGQGQGQGQGGKGSA Prediction of potential genes in microbial genomes Time: Thu May 12 19:04:42 2011 Seq name: gi|319977193|gb|AEUH01000249.1| Actinomyces sp. oral taxon 178 str. F0338 contig00249, whole genome shotgun sequence Length of sequence - 1007 bp Number of predicted genes - 2, with homology - 0 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 48 - 112 21.1 1 1 Op 1 . - CDS 170 - 529 365 ## 2 1 Op 2 . 
- CDS 676 - 900 295 ## Predicted protein(s) >gi|319977193|gb|AEUH01000249.1| GENE 1 170 - 529 365 119 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRFTPVVVPGLRLGDPPEPGEWAASLARALIEALLGHRPRAQLRRWFLPELYSAIESLR VAAPVGPLPCRPLHWRACSPSSGVSEVAVTIAAPTRNYAVALRLEEYRGRWMATALELA >gi|319977193|gb|AEUH01000249.1| GENE 2 676 - 900 295 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAQNAGTAPNGPTSVTVRPGDTLWSITAAALRSTDDVRIAAAWPRLYEANAGAIGPDPSL LRPGQVLTIPEDLS Prediction of potential genes in microbial genomes Time: Thu May 12 19:04:56 2011 Seq name: gi|319977188|gb|AEUH01000250.1| Actinomyces sp. oral taxon 178 str. F0338 contig00250, whole genome shotgun sequence Length of sequence - 4760 bp Number of predicted genes - 9, with homology - 6 Number of transcription units - 8, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 254 222 ## - Prom 288 - 347 3.1 2 2 Tu 1 . + CDS 318 - 950 813 ## Bfae_20570 hypothetical protein 3 3 Tu 1 . - CDS 951 - 1160 339 ## Bcav_1256 DNA binding domain protein, excisionase family 4 4 Tu 1 . + CDS 1114 - 1221 89 ## 5 5 Tu 1 . - CDS 1211 - 1321 229 ## + Prom 1244 - 1303 2.0 6 6 Tu 1 . + CDS 1324 - 1929 709 ## Cfla_2320 hypothetical protein 7 7 Op 1 . + CDS 2040 - 3113 997 ## Bcav_1258 chromosome partitioning ATPase 8 7 Op 2 . + CDS 3126 - 3665 446 ## gi|154509490|ref|ZP_02045132.1| hypothetical protein ACTODO_02022 - Term 3609 - 3645 3.5 9 8 Tu 1 . 
- CDS 3681 - 4760 1074 ## COG0419 ATPase involved in DNA repair Predicted protein(s) >gi|319977188|gb|AEUH01000250.1| GENE 1 2 - 254 222 84 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTTRTPLPAPPLHDAALWLIGAPAAALLAAYAWAKAPALSADAPDQIIAVILACALGALI AWNLLWSAIAHAAAAGRAPGALRR >gi|319977188|gb|AEUH01000250.1| GENE 2 318 - 950 813 210 aa, chain + ## HITS:1 COG:no KEGG:Bfae_20570 NR:ns ## KEGG: Bfae_20570 # Name: not_defined # Def: hypothetical protein # Organism: B.faecium # Pathway: not_defined # 28 200 1 175 181 64 36.0 2e-09 MSSVFSSLIFILFHALVLRSVVWQSGRMDIELFLADLEGRFAEQRRRDNDLLVEELTDAE RTGVTLAARLLAVDGPVTLVLRGGRRLDGAVRDCTRTWVLVRGDGGDSLVPLGAVVGAWP LGRVAAGEAGVKRGAGMGHVLREFAARGVPLVVDHDAGAHRGRIVAVYADHVDAEVGEGP VGDSRDWGAGARVSLALSGLRELRVADGRW >gi|319977188|gb|AEUH01000250.1| GENE 3 951 - 1160 339 69 aa, chain - ## HITS:1 COG:no KEGG:Bcav_1256 NR:ns ## KEGG: Bcav_1256 # Name: not_defined # Def: DNA binding domain protein, excisionase family # Organism: B.cavernae # Pathway: not_defined # 4 64 3 63 69 71 60.0 8e-12 MDTRFLTLADVAETLNLTMSATRALVTSGDLPAIQVGGKKVWRVEESALEHYIQQQYALA RQRQHANAQ >gi|319977188|gb|AEUH01000250.1| GENE 4 1114 - 1221 89 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSVSATSARVRKRVSIDTLYQFVRFVAIWYESIRL >gi|319977188|gb|AEUH01000250.1| GENE 5 1211 - 1321 229 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLRPVSLSHNAPKHPGIKDYGLSMPWKTRAPIVTTI >gi|319977188|gb|AEUH01000250.1| GENE 6 1324 - 1929 709 201 aa, chain + ## HITS:1 COG:no KEGG:Cfla_2320 NR:ns ## KEGG: Cfla_2320 # Name: not_defined # Def: hypothetical protein # Organism: C.flavigena # Pathway: not_defined # 6 198 17 218 221 73 33.0 3e-12 MEIPARRPAWRDPRLIVGLALIAVSIILTTSIVSAARGGATVYRATQAILPGDVLGSHNI APTRLDVDTSVYATADALAPGATVSEEVAAGEILRVSSIADTSAATARRLVITVSDSLPA SVQAGDQLDLWSVQQSSGVQSGQGAVHLGVRATLVRVLEQTTSIAAKGTRIEILVDEASV GAVLEATAGKSSLAALPVGQR >gi|319977188|gb|AEUH01000250.1| GENE 7 2040 - 3113 997 357 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1258 NR:ns ## KEGG: Bcav_1258 # Name: not_defined # Def: chromosome 
partitioning ATPase # Organism: B.cavernae # Pathway: not_defined # 17 341 54 402 441 115 38.0 3e-24 MAEVSAAARSGVADLALLDAADPDVDRPLLAELSRVGMRAVLIASPDDEARARSLGGAGT ALAGDPEQAVRALVAALVDAPPAPGPSAEPPAPPEPPTDSSLIAVWGTSGAPGRSSVAVS VAHALAKSAPTLLIDADTANPSIAHLLGIPVDASGLSALSRQAVRAPIGPSDLRRLASPR SPGLDVLTGLTSPHRWREAAPSSLEHILEAARGAYRFVVVDVAATSLDPLPALRRPGGGR DDAALAVLAAADRVLVVARGDTVGINRLDHLAKWWEDSGLEAPLDLVVSRVSASSVGGRP AAVLMPALSSILPGRRVHLLPEESAVPEAALRGMAPAEYAPDCATAAVADALADALM >gi|319977188|gb|AEUH01000250.1| GENE 8 3126 - 3665 446 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509490|ref|ZP_02045132.1| ## NR: gi|154509490|ref|ZP_02045132.1| hypothetical protein ACTODO_02022 [Actinomyces odontolyticus ATCC 17982] # 1 151 1 150 156 115 50.0 1e-24 MRVYLPALPSELAGPTPPVRDGFAALPQAGAAKDDIEVLEDDAQSEAALASLVLARESDS PQAPARLVLALDVPDSCAVAGQVEGAPGVHLVPGVRAQWGDAAAILADGAGAAPAVRRVL AATTQEGADRALCELWDEALEWFDITERPALASALCAPVRAGTEEGAPARTGAEEGAPA >gi|319977188|gb|AEUH01000250.1| GENE 9 3681 - 4760 1074 359 aa, chain - ## HITS:1 COG:SA1181 KEGG:ns NR:ns ## COG: SA1181 COG0419 # Protein_GI_number: 15926927 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Staphylococcus aureus N315 # 65 346 722 996 1009 140 30.0 4e-33 DEERAALERARGQVAEHAPGGRTIAESIEESEALASAFSALLEAASTWSGACEQEESARA EFVAQLGRSGLPSDGDSWRGALVDEDLLDSYEARVSAHGQQLFALRQALASQEMERAGGL DAPDVEGARERARASARARVAAHQRVGSLEQCARELGSAVESLASCVDELRAARQEAGPV RRLADIAAASSPENLAATPLSAWVLVSRLEEVLRATNPRLAAISSGRYELVAVPDDGTQS RKSGLGLRIVDHDTDTERSARTLSGGETFYTSLALALGLADVVTAEAGGVELRTVFIDEG FGSLDAHTLSLVMDQLHQLRDGGRCVGVVSHVEEMASQIPDQVRVRPLPAGGSSLRVRA Prediction of potential genes in microbial genomes Time: Thu May 12 19:05:34 2011 Seq name: gi|319977183|gb|AEUH01000251.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00251, whole genome shotgun sequence Length of sequence - 4163 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 28/0.000 - CDS 1 - 652 910 ## COG0419 ATPase involved in DNA repair 2 1 Op 2 . - CDS 649 - 1866 1244 ## COG0420 DNA repair exonuclease 3 2 Tu 1 . + CDS 2204 - 2935 1008 ## COG0518 GMP synthase - Glutamine amidotransferase domain + Term 3040 - 3094 13.1 4 3 Tu 1 . - CDS 3028 - 3498 -214 ## 5 4 Tu 1 . - CDS 3674 - 4162 804 ## COG2190 Phosphotransferase system IIA components Predicted protein(s) >gi|319977183|gb|AEUH01000251.1| GENE 1 1 - 652 910 217 aa, chain - ## HITS:1 COG:L152588 KEGG:ns NR:ns ## COG: L152588 COG0419 # Protein_GI_number: 15673303 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Lactococcus lactis # 1 213 1 209 1046 122 35.0 6e-28 MRIRRLEMVGIGPFTQRQVIDFTAFDESGLFLLEGPTGSGKSTIIDALTFALYGDVARQK DASKDRLRSNRLEDGQRSEVDLVFEVRSGLYRVARTPAYTPAGRKTQRNSKATLVRVVED PSAEAGVRTVEDIASGPSKVGPEITRIVGLDKEQFLQTIVLPQGKFARFLTATSDERERI LRDIFDTRVYVAFQERLAEAAASSRAALAERERAAAG >gi|319977183|gb|AEUH01000251.1| GENE 2 649 - 1866 1244 405 aa, chain - ## HITS:1 COG:lin1687 KEGG:ns NR:ns ## COG: lin1687 COG0420 # Protein_GI_number: 16800755 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Listeria innocua # 1 403 1 372 374 169 29.0 1e-41 MLILHTSDWHLGRTLHGEDLSASADAFLDWLIGLVEERGVDAVVVSGDVYDRAVPPVDSV RRLRRALVALCARAPVVLTSGNHDGPVRLGAFAGLLDARLTVAADPLSVGTAVELGGEDQ GALVYAIPYLEPDLVRQQLSDLPGGEDGEPPPLPRSHEAVLAAALRRVGADLRRRRADGD ERPALCMAHAFITGALPSDSERDIEVGGVASVSASLFDSLGFDGGFEGHGLDYVAAGHLH RPQDVAGASVPIRYSGSPIAYSFSEAGAAKSVTLVSTGPTSVLSVEEAPVPVLREVRVLE GTMDALLSDPDPGARRSYCSVTVTDAARPAQMVPRIREAYPHALVVQHRSPADVSHRPVR GAVASRSPREVCEEFFEAVGGRALDDEERGVARSVWERMRGEERS >gi|319977183|gb|AEUH01000251.1| GENE 3 2204 - 2935 1008 243 aa, chain + ## HITS:1 
COG:Cgl1505 KEGG:ns NR:ns ## COG: Cgl1505 COG0518 # Protein_GI_number: 19552755 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase - Glutamine amidotransferase domain # Organism: Corynebacterium glutamicum # 6 239 5 233 252 113 31.0 3e-25 MTKPFLMLTSRDDGPVLASEMAELPRIAGLGDDEHVHLRMEAGHLPDLDLADYSGIIVCG SPWDANADDHDKDARQVAAERWLSRLYTRVLGESFPLLGLCYGLGTLTLHLGGAVDTHHG EEISGIVLTKTDAGRADPLLEGTPDRFHAYVGHHEAVRELAPGMSVLLAGDDTPIQMVRV GEAAWATQFHPELDLAGVLVRVEQYGGRYYPPEAAGAITEQVSSVDVSPSHRVLTNFVEA FRR >gi|319977183|gb|AEUH01000251.1| GENE 4 3028 - 3498 -214 156 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTPARVISPSRCRQPPHDVDDPLTVPTTRVHTRGAPAPGPGALRTRSTWTRSTTPAAPGG GAPSPGPGALRTRSTWTRSTTPAARRNTRTRARHAADTVDNARPHQGEHPHRGSAPGGLL RIAHHPLGVRTQGPRFKKHAPHTRAGSSPCVRPSVA >gi|319977183|gb|AEUH01000251.1| GENE 5 3674 - 4162 804 162 aa, chain - ## HITS:1 COG:lin0026_3 KEGG:ns NR:ns ## COG: lin0026_3 COG2190 # Protein_GI_number: 16799105 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Listeria innocua # 9 153 4 148 151 130 49.0 9e-31 SDEAKADFTLTSPIEGTKVPLSEVKDEAFAGGALGPGIAISPWAGAVVAPCDGKVTVAFP TGHAYGLKSASGLQVLIHIGMDTVKLDGKGFTPKVAKGDFVRRGDVLAVVDWDAIRAAGY DTITPVVVTNKKKFAAITPAEAGPVAFGDTIITATPKEQAQS Prediction of potential genes in microbial genomes Time: Thu May 12 19:05:47 2011 Seq name: gi|319977176|gb|AEUH01000252.1| Actinomyces sp. oral taxon 178 str. F0338 contig00252, whole genome shotgun sequence Length of sequence - 7721 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 1522 2204 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 1566 - 1625 1.8 2 2 Op 1 4/0.000 + CDS 1968 - 3137 1432 ## COG4671 Predicted glycosyl transferase 3 2 Op 2 25/0.000 + CDS 3134 - 4339 1796 ## COG0438 Glycosyltransferase 4 2 Op 3 3/0.000 + CDS 4339 - 5460 1269 ## COG0438 Glycosyltransferase 5 2 Op 4 . + CDS 5453 - 7258 238 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 6 2 Op 5 . + CDS 7255 - 7720 538 ## gi|293189572|ref|ZP_06608291.1| phosphotransferase enzyme family protein Predicted protein(s) >gi|319977176|gb|AEUH01000252.1| GENE 1 1 - 1522 2204 507 aa, chain - ## HITS:1 COG:Cgl1324_2 KEGG:ns NR:ns ## COG: Cgl1324_2 COG1263 # Protein_GI_number: 19552574 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Corynebacterium glutamicum # 120 497 1 362 415 332 49.0 1e-90 MATTTSTAEAILEAVGGAGNITHLTHCATRLRFELNDASVIDKDAVEAIDGVMGAVPQSG DRYQIIIGGAVQGVYNDIMNLPQMAGGGSAPSDGQSDADVKAAARAKARGKNALIDAFFE YLSDSFRPLLPVLLGASLILAALAVLEAFKVVNTHAEVIPSWLGFTNSMWRAVFYFLPAM VAYNASQKLNVDPWVGTTIILALLTPNFTGLMSTSDFPTTTCTEIFGTDRKQCVANVLGV PLQLSDYGGQVFVSLLMVPLLALLYKGLKRVIPANVQMVFVPFICFVVMIPLTAFLIGPL SIWLGNGMGSGLAWLNTNAPVVFAVFIPLLYPFLVPLGLHWPLNVLQIANIAAQGSDFIQ GPMGTWNFACFGATAGVLFLSIRDKDAEMRQTASGALAAGLFGGISEPSLYGIHLRFKRV YPLMLTGCVVGGLIVGIGGGLKIDTFVFTSLLTIPLFQPTALYAIAVAAAFATAFFMVVT FDYRTKEQRAEAKARRADQGAPASAPA >gi|319977176|gb|AEUH01000252.1| GENE 2 1968 - 3137 1432 389 aa, chain + ## HITS:1 COG:all5196 KEGG:ns NR:ns ## COG: all5196 COG4671 # Protein_GI_number: 17232688 # Func_class: R General function prediction only # Function: Predicted glycosyl transferase # Organism: Nostoc sp. 
PCC 7120 # 7 388 28 418 423 202 35.0 7e-52 MTTPLSVLLYSHDSQGLGHVRRNLALAHHIAARAGTGGLRGLVVSGLAPSPLFTLPTGFD WLALPGVEKRDGVYVPRRLPGTLSETVALRSAVLGAALEGFAPDLVIVDRHPLGIRRELE APLRALRRSHPEARIVLGLRDVLDDPGVAAREWEGLGGADALAPLIDQVWVYGDPRVHDA TSSGEIPSALASRALFTGYLAAGRADVDPHPGRLARPYVLTTAGGGSDGGPLVEAAAGAR VPGGHDHIVVAGPQLDEDRFARAQALAGPSTTVVRSCPGLAHRIREAAAVVAMGGYNTVC EILATSTPALIIPREKPRLEQAIRARALSRASAIDTARQEAATPDLLSQWLAGAVRSRTD RSHVELGGLAAAARAATRLLTRTPEGEPA >gi|319977176|gb|AEUH01000252.1| GENE 3 3134 - 4339 1796 401 aa, chain + ## HITS:1 COG:all5195 KEGG:ns NR:ns ## COG: all5195 COG0438 # Protein_GI_number: 17232687 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. PCC 7120 # 4 384 3 392 418 288 42.0 1e-77 MTRIGYVLKVYPRFSETFVVTEILAREELGQDLRIYALRPTTDTRFHPEIARVEAPVSWV PRPRKAADLWETLTASLTDEDMRRRFADLMPRIAGLAADDVAQGAALARAAREDGITHLH AHFASLAGRTAWIASSLTGIPYTVTTHAKDIHHESVDMGLLREVCAGAAQVIAISRYNQD YLDRVLEGTGARVVLQRNALELARFAYRDPEPPSGPLRVLAVGRLVEKKGFAHLINALAL ARGAGIDFDAQIIGEGELAGDLARRIEDNGLSDSCRLLGARTQDEVRSHLEAADVFVAPC VPGADGNIDGLPTVILESMAVGTPVIATSVSGIPEAVVNGRTGVLVPPASASALAQALGG IASGRAPARDLARGARALIEESYDSRRQAARLAGLETEEEN >gi|319977176|gb|AEUH01000252.1| GENE 4 4339 - 5460 1269 373 aa, chain + ## HITS:1 COG:sll1724 KEGG:ns NR:ns ## COG: sll1724 COG0438 # Protein_GI_number: 16330033 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Synechocystis # 1 357 1 366 379 250 41.0 2e-66 MRIAYVCADPGIPVHGTKGASVHIQDVVRELVRRGHDVGIHAVRLGEHPPADLAGVPTWA HPLPAHCADREKAQAEAAGAIAEHLETDAPDLVWERYSLFSTVLARLARTRGVRGVLEVN APLIDEQREHRRLDDEKGALANLREQAGAASATICVSDPVRAWVEALTGTVRAHTVPNGV DTTRITPVGEEDGRVVVTFVGTLKPWHGVEHLIEARRLATAPWSLRVIGDGPQRPRLEEA ARLAGVDVDFRGAVAPHDMPAHLEGTAIAVAPYPMPQREADHYFSPLKVYEYMAAGLPVV ASSLGQIPSALDGCGVLVPPSDPAALARAIDSLAQNPARRADLGERARRAAVERHSWSGA VGRVLELAGGGDA >gi|319977176|gb|AEUH01000252.1| GENE 5 5453 - 7258 238 601 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal 
protein L34 [Vibrio campbellii AND4] # 357 574 2 223 245 96 32 7e-20 MPKRTKGTRLGALSRTLRLVGPDVRPHKGLIVGGVTALLMDVVFRVMEPWPMKIVVDSVF EAALPGTGDAWSAAPMIVACALLLVVVVGLRAASNYLATVCFALVGSRVAQSLRARVFRH VQGLSQQFHARNRSADTVQRIVGDVNRLQEVAITAGLPLLANTATLAVMLVVMVVLDALL ALVVVVAVALFLLISRGSSQRITVASRSTRKGEGQLANTAQESLASMPVVQAYCLEDYIA GRFGGANGRAVRQGVKSLRLAARLERSTDVLVGLASAIVLVGGGMRVLAGAMGVGDLVLF TSYLRTAMKPLRDMAKYTGRISRAGASGERVADLMEVPQDIVTAPGALSPARIGGSLEFD RVVTEYDGVEVLHSLSLSVAQGERIAIIGPSGSGKSTLVSLAVRAQDPVRGRVLMGGYPL TALSLAALRSNVTLLHQDAVLFATSIRENIALGREGASYEEVVAAARAANAHDFIMEMPD GYETEVGERGGTLSGGQRQRVAIARALLRDTPIIILDEATTGLDPASAGLVLDAIDGLVR GRTALAVTHDPQVALRSTRVVWIEEGRVLLDGPPSRLLAESRRFRAWAASAAVGADGKGA P >gi|319977176|gb|AEUH01000252.1| GENE 6 7255 - 7720 538 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189572|ref|ZP_06608291.1| ## NR: gi|293189572|ref|ZP_06608291.1| phosphotransferase enzyme family protein [Actinomyces odontolyticus F0309] # 4 154 3 153 376 144 58.0 1e-33 MTPSSLALAVSTLLDPERLSGAVGEARVATRVRIKPGVSVSASLADSSGAPAGWARVLWP TAHSKAGKAAARAAGLGLPAWIRPLGSGLRIQWGGVASDPALMPHIDRARASGAVDPATW RILRHNPLRRLVVRSAGAVVRIRARADRRAAALNR Prediction of potential genes in microbial genomes Time: Thu May 12 19:05:57 2011 Seq name: gi|319977171|gb|AEUH01000253.1| Actinomyces sp. oral taxon 178 str. F0338 contig00253, whole genome shotgun sequence Length of sequence - 5378 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 662 704 ## gi|154509396|ref|ZP_02045038.1| hypothetical protein ACTODO_01927 2 1 Op 2 . + CDS 653 - 1633 1124 ## Bfae_02030 putative homoserine kinase type II (protein kinase fold) 3 1 Op 3 . + CDS 1663 - 2982 1986 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 4 1 Op 4 40/0.000 + CDS 3061 - 4503 1826 ## COG0642 Signal transduction histidine kinase 5 1 Op 5 . 
+ CDS 4500 - 5159 1088 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain Predicted protein(s) >gi|319977171|gb|AEUH01000253.1| GENE 1 3 - 662 704 219 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509396|ref|ZP_02045038.1| ## NR: gi|154509396|ref|ZP_02045038.1| hypothetical protein ACTODO_01927 [Actinomyces odontolyticus ATCC 17982] # 1 213 158 370 376 213 62.0 7e-54 AAIPVPARLDDGSDPHVCVLADAGGCDLSSPRGTPERAREWTERAGGLFARLHAARPPAP LAASLASPAPATRAALEVHARIIGALDDGLGRRIEALARAVPEPAPAPAALIHADASPDQ VLVDEAGAVLLTDFDRARMGAAALDVASYAASAGPGLAPSFLRGYEQAGGRIPDDAPMAA AVVHARALSLADPLREARPDWAARVAAALDLMEGGAPWH >gi|319977171|gb|AEUH01000253.1| GENE 2 653 - 1633 1124 326 aa, chain + ## HITS:1 COG:no KEGG:Bfae_02030 NR:ns ## KEGG: Bfae_02030 # Name: not_defined # Def: putative homoserine kinase type II (protein kinase fold) # Organism: B.faecium # Pathway: not_defined # 13 308 497 828 878 86 34.0 2e-15 MALIDQALAVEGVRRAWPCDGAIAVEVVDRYGRLRAGRVSDSGPVLSPYACDPVLGRIPV GDGTLVVHRHGRRAVIVGADRVDKHVRKGGGRIARASAAAGRAYRSWGLEAASVTSWSPT SVSFTRLPGRSLADLGDAALPGWRLLADAWRAVGDQGLPVHSGADEAGNLARWQEWARAF ATVPDDPRVSGAVLEASRRLVEPAGAPLVVSHADLHDGQLLWDGRSLGVIDLDGARMAEA ALDLTNLRAHAELARLRGALSARGLDSVIGWLDAAAARMPTSAPRLDAYLAAARLRLLFV HSFRPGSRAWLEGWKDHALSDPTSTP >gi|319977171|gb|AEUH01000253.1| GENE 3 1663 - 2982 1986 439 aa, chain + ## HITS:1 COG:MT0337 KEGG:ns NR:ns ## COG: MT0337 COG1004 # Protein_GI_number: 15839709 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Mycobacterium tuberculosis CDC1551 # 1 438 1 437 443 418 55.0 1e-116 MRMTVIGCGYLGAVHAACMAALGHDVVGIDIDHDKVQALASGRSPFHEPGLDEILAQGTA SGRLRFTAEPTAADLRGRGLHFITVGTPQAPGEGSADLSHLMCAVSMLVRLLDPDEGAVV VGKSTVPVGTAQRVEEALAPAGARLAWNPEFLREGFAVADTLRPSRLVYGLSEDPGVARA GREALDEAYRSLLDAGTPVIASGFATAELVKVAANSFLATKISFINAMAEICDATGADVT MLAQALGHDERIGRRALGAGIGFGGGCLPKDIRAFVARAEELGRGEAVAFLKEVDAINLR GRSRAVAAAEAALGGSVEGATIAVLGASFKPDTDDVRDSPALDVASRLHERGAHVRVTDP 
IALSNAAADKPHLTMVEDLHTALAGADLVVLATEWRQFVDLDPAKIGPLVRTRTIVDGRN ALDREAWSAAGWKHVGIGR >gi|319977171|gb|AEUH01000253.1| GENE 4 3061 - 4503 1826 480 aa, chain + ## HITS:1 COG:alr5189 KEGG:ns NR:ns ## COG: alr5189 COG0642 # Protein_GI_number: 17232681 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 99 471 153 528 537 224 35.0 4e-58 MSAQSAQPGVSVSIRTRIIVAMVLVAGAALTVSGVLVSVLQQRALDSRAVDNLQRSKAAL EHLIAGGTDPDTGGPLTDPSEIVRLHLARTFWGASTGEVGFVDGALYWLPSQEPRLRPED DAELMAHVAPLTTGADTVFETHTTATTTYRLVVVPVAGPASTAALVHVIDMRRYSASFRS TMLLYAVSAGGTVALVAPLAWFAVTRLLRPIGELRRATDSIGEADLATRVPVRGSDDLSA LAGAVNRMLDRVERAVVARRELLDDVSHELRTPITVVRGHMELLDPSDHADVVDTRALVI DELDRMGALVGDLLELARASDAVDPAPVDLAALTDRVLDKARALGEREWALDGAARATCW ADSARITQAWLQLAQNAVQYSQEGTAVGIGSSCDSEWARMWVRDRGAGIAPQDIDRVRRR FVRGAGSERVSGSGLGLSIVESIVRAHGGRLDIESTPGEGSTFTLVVPLRPGGADPGGKP >gi|319977171|gb|AEUH01000253.1| GENE 5 4500 - 5159 1088 219 aa, chain + ## HITS:1 COG:alr5188 KEGG:ns NR:ns ## COG: alr5188 COG0745 # Protein_GI_number: 17232680 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Nostoc sp. PCC 7120 # 1 218 1 220 222 239 57.0 3e-63 MTTILIVEDERRIASFVAKGLRAAGFVPEVYSTGAEAVEAALQGRGALMLLDVGLPDIDG FEVLERIRGQGSGMPVIMLTARTSVADRVAGLEGGADDYVPKPFSFEELLARVRLRLREE APAAGGLTLARGSLVLDLKSRRARVGERWVDLSAREFTLLETFMRNPGQVLSREQLLSAV WGIDFDGGSNVVDVYLSYLRQKIGKDRFETVRGMGYRLV Prediction of potential genes in microbial genomes Time: Thu May 12 19:06:16 2011 Seq name: gi|319977168|gb|AEUH01000254.1| Actinomyces sp. oral taxon 178 str. F0338 contig00254, whole genome shotgun sequence Length of sequence - 1683 bp Number of predicted genes - 2, with homology - 0 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 210 174 ## 2 1 Op 2 . 
- CDS 309 - 1658 1655 ## Predicted protein(s) >gi|319977168|gb|AEUH01000254.1| GENE 1 3 - 210 174 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSERMRIAWTICAIAAVGIAAFIGLRALASGSGQSMRESGLVVPSQSGGAAAQSDPGTTA PPAPPAGPE >gi|319977168|gb|AEUH01000254.1| GENE 2 309 - 1658 1655 449 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIVPLLVGAPFFVLALVLFWLPHSSNSARQQDPAMQESDIVETEAVPWALEGEVAPVKQT PAPSDSWFKGAREAWKIDAPALKRGGFEQITTYAANGSALITMQGANAWQSTIKGWDTSG AQPRELWAHTVQITDSPTIGAEEAGVWVGGTLFADGYAINGQTGQIVSLTWMRSQGAGRR VNDLVVTGDGLLVACDPGPGQCTARKSDGSVVWTSSTGLKGHTVRGTALGGGDEWIWLDG GAGAASFVNARTGQVNEEHYGSTGPCWRAGARDGWLVACQTDTRITALHADGTSAGAFDA AHWPIPASPQGGCSNASTPVWAGAPTLDEAIAYYRDSDASPTLGTLTPTDCEHIEYASPT GASTTIDLSDDPERRAFTLGDFGRQLQRQLAISADGRVLAIGNSMLVDLTSGRRMDMASA GSPDLPLLAAPGLMLAADRSGVIAVAPRE Prediction of potential genes in microbial genomes Time: Thu May 12 19:06:41 2011 Seq name: gi|319977165|gb|AEUH01000255.1| Actinomyces sp. oral taxon 178 str. F0338 contig00255, whole genome shotgun sequence Length of sequence - 2956 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 + CDS 152 - 1459 1853 ## COG0232 dGTP triphosphohydrolase 2 1 Op 2 . 
+ CDS 1515 - 2955 1675 ## COG0358 DNA primase (bacterial type) Predicted protein(s) >gi|319977165|gb|AEUH01000255.1| GENE 1 152 - 1459 1853 435 aa, chain + ## HITS:1 COG:ML0831 KEGG:ns NR:ns ## COG: ML0831 COG0232 # Protein_GI_number: 15827365 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Mycobacterium leprae # 21 417 13 416 429 324 47.0 2e-88 MADTNELSLVIPGVDGGAGVYAPSDMDRMVHEEAKSSARTDFERDRARVLHSSALRRLGE KTQVLGPISDDFVRTRLTHSLEVAQVGRELGKELGADPDVVDAACLSHDLGHPPFGHNGE RALDGLASRIGGFEGNAQTLRVVTRLEPKVIAPDSTPAGLNLTRATLDAICKYPWVKSGG PDLAKSTRKFSVYGDDAPAFAWMRQGAPAGRRCLEAQIMDLSDDIAYSVHDVEDAIATGK LAPGQLRDDARASAVIDSTLGWYGPAVSRADLEEALGALLAMEDWLGDYDAGYAARCRLK DLTSELIGRFCSATVRATRDAFGPEPLGRYRADLVVPRATRAQIQVLKGMAVHYVMSPRE SEPVYYQQRTLVADLVDALYEAGADALEPVFAQQWRAASSDDVRLRAVIDQVASLTDVSA SAWHARHCGMLSSQL >gi|319977165|gb|AEUH01000255.1| GENE 2 1515 - 2955 1675 480 aa, chain + ## HITS:1 COG:Cgl2216 KEGG:ns NR:ns ## COG: Cgl2216 COG0358 # Protein_GI_number: 19553466 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Corynebacterium glutamicum # 3 470 6 456 633 472 51.0 1e-133 MAGLIRKDDIEAVRNAARIEDVVGDHVTLKPAGIGSLKGLCPFHDERTPSFNVRPAVGMF HCFGCGESGDVISFVQKIDHLPFAEAVEALAAKTGITLHYEEGGARVRTEEPGRRGRLVD AHRIAEEFYQSQLTTPQAAKGREFLAGRGFTQAMCAHFGVGYAPRSWDALIRHLRSRGYT DQEIQTAGLASQGARGIYDRFRGRLVWPIRDITGATVGFGARRLDDDEDSPKYLNTPETP IYKKSQLLYGLDLAKKAITQGHRVVIVEGYTDVMAAHVAGVDCAVATCGTAFGPEHVKII RRLLGDSADPAAGVVLASGRAHGSEVVFTFDGDSAGRKAALRAYGEDQNFAAQTFVAVAP GGQDPCELRLSQGDQAVVDLVRSRTPLFEFAIRSVLSGVDLRSAEGRVAGLRSSARIVAH IKDRALRREYARQLAGWLGMDTRTVDQAVHAASRRGPDGAPARPGAPTMAGAGRPGPSQG Prediction of potential genes in microbial genomes Time: Thu May 12 19:06:42 2011 Seq name: gi|319977161|gb|AEUH01000256.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00256, whole genome shotgun sequence Length of sequence - 1993 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 476 633 ## Bcav_1788 DNA primase 2 1 Op 2 . + CDS 480 - 959 696 ## COG1854 LuxS protein involved in autoinducer AI2 synthesis 3 2 Tu 1 . + CDS 1158 - 1949 1186 ## SSA_1682 hypothetical protein Predicted protein(s) >gi|319977161|gb|AEUH01000256.1| GENE 1 3 - 476 633 157 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1788 NR:ns ## KEGG: Bcav_1788 # Name: dnaG # Def: DNA primase # Organism: B.cavernae # Pathway: DNA replication [PATH:bcv03030] # 2 155 514 665 667 94 44.0 1e-18 VLQHPALVIGAGFDELDGEAFTVPTHRAVHDAIRASGGLDEFTRILRSAEERFGPGEEAT AAATRRFVSQVLEAAGDYLAPAVSQLAVAPLPVADHERMRSYCRGVVAAMVRVDLTRGLG QARAALQRMGEDDPGYAEAFRELMRLEQRRQNYTERD >gi|319977161|gb|AEUH01000256.1| GENE 2 480 - 959 696 159 aa, chain + ## HITS:1 COG:SP0340 KEGG:ns NR:ns ## COG: SP0340 COG1854 # Protein_GI_number: 15900270 # Func_class: T Signal transduction mechanisms # Function: LuxS protein involved in autoinducer AI2 synthesis # Organism: Streptococcus pneumoniae TIGR4 # 4 157 7 160 160 237 68.0 5e-63 MAEVESFTLDHTAVKAPYVRLIGVQTGPAGDRISNFDVRLVQPNGNAIPTGGLHTIEHLL ASLLRDRIDGVIDCSPFGCRTGFHLLMWGAPSVEEVAAALVSSLRAIAEDVEWEDVPGTD ERSCGNYRDHSLFSAREWSRAVLDQGVSVDPFERRVLPQ >gi|319977161|gb|AEUH01000256.1| GENE 3 1158 - 1949 1186 263 aa, chain + ## HITS:1 COG:no KEGG:SSA_1682 NR:ns ## KEGG: SSA_1682 # Name: not_defined # Def: hypothetical protein # Organism: S.sanguinis # Pathway: not_defined # 19 260 5 249 251 193 45.0 6e-48 MATHAVPVPAHAAPRPRIGVNQYWIKLAMAALMVLDHLHHVPGLVPPEWADAFHVITRCV GVWFAYGAVEGVLYTSNMRKYLTRLWVAAAVMAAGSFALGQLLATRDVHMYDNNIFLTLA VGTTLLALVKRGAGAAWHTGAVVGALAGSVVAAMFLPIEGGLPVLPFMVITYALYTRVVW RDLAYLALAAAMFALAWQPYDTWQATVSMLAQNSDFMLILVIPVLHLYNGEHGPHTRFSK YFFYVFYPAHLWLLALIAYFQAA Prediction of potential genes in microbial genomes Time: Thu May 12 
19:06:57 2011 Seq name: gi|319977140|gb|AEUH01000257.1| Actinomyces sp. oral taxon 178 str. F0338 contig00257, whole genome shotgun sequence Length of sequence - 21970 bp Number of predicted genes - 20, with homology - 18 Number of transcription units - 11, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 516 489 ## gi|293189583|ref|ZP_06608302.1| hypothetical protein HMPREF0970_00617 2 2 Op 1 6/0.000 + CDS 733 - 975 205 ## COG2161 Antitoxin of toxin-antitoxin stability system 3 2 Op 2 . + CDS 978 - 1238 204 ## COG4115 Uncharacterized protein conserved in bacteria + Term 1246 - 1276 1.7 4 2 Op 3 . + CDS 1565 - 2713 1368 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase + Term 2721 - 2769 21.4 5 3 Op 1 8/0.000 + CDS 2997 - 4223 1572 ## COG4558 ABC-type hemin transport system, periplasmic component 6 3 Op 2 10/0.000 + CDS 4220 - 5290 1272 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 7 3 Op 3 . + CDS 5344 - 6129 844 ## COG4559 ABC-type hemin transport system, ATPase component 8 3 Op 4 . + CDS 6126 - 7514 1218 ## Amir_2876 hypothetical protein 9 3 Op 5 . + CDS 7519 - 9762 2075 ## Xcel_2763 hypothetical protein + Prom 9788 - 9847 2.5 10 4 Op 1 35/0.000 + CDS 9889 - 11490 256 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 11 4 Op 2 . + CDS 11487 - 13256 268 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 12 4 Op 3 . + CDS 13280 - 14497 1530 ## COG1482 Phosphomannose isomerase + Term 14588 - 14624 -1.0 13 5 Tu 1 . - CDS 14722 - 15111 125 ## - Prom 15199 - 15258 1.5 14 6 Tu 1 . + CDS 15232 - 15564 318 ## 15 7 Op 1 . - CDS 15775 - 16512 736 ## COG0518 GMP synthase - Glutamine amidotransferase domain 16 7 Op 2 . - CDS 16584 - 17651 1077 ## Lxx17710 hypothetical protein 17 8 Tu 1 . 
- CDS 17759 - 18244 463 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component - Term 18333 - 18371 -1.0 18 9 Tu 1 . - CDS 18387 - 19436 1169 ## BAD_0485 hypothetical protein 19 10 Tu 1 . - CDS 19641 - 20318 499 ## gi|154509376|ref|ZP_02045018.1| hypothetical protein ACTODO_01907 - Prom 20365 - 20424 1.8 20 11 Tu 1 . - CDS 20633 - 21826 1660 ## COG0620 Methionine synthase II (cobalamin-independent) Predicted protein(s) >gi|319977140|gb|AEUH01000257.1| GENE 1 3 - 516 489 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293189583|ref|ZP_06608302.1| ## NR: gi|293189583|ref|ZP_06608302.1| hypothetical protein HMPREF0970_00617 [Actinomyces odontolyticus F0309] # 1 171 1 164 166 190 62.0 2e-47 MTTLPDSLTAALPSRERRCPALALADGRWLAAARAGLYVFPAPSPSEAGAGSGEAADAAG PALFPWYDVARARWEAEGELFALEWVDPSRRPLTGRTEGDPEQFMRHAGEYVDRSIVLHT QCQADNGTTITAWVRRGGRGLFSVLTASGPLDAAGRRAADALEAGAREAVG >gi|319977140|gb|AEUH01000257.1| GENE 2 733 - 975 205 80 aa, chain + ## HITS:1 COG:RC0290 KEGG:ns NR:ns ## COG: RC0290 COG2161 # Protein_GI_number: 15892213 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Antitoxin of toxin-antitoxin stability system # Organism: Rickettsia conorii # 1 80 23 102 102 65 37.0 2e-11 MTTVTATAARSDLYRLIDAVNEDSSPITITGRRGNAVLIGEDDWSAIQETLYLQGVPGMA DSLKAAAREDLDDAVDDVQW >gi|319977140|gb|AEUH01000257.1| GENE 3 978 - 1238 204 86 aa, chain + ## HITS:1 COG:RC0291 KEGG:ns NR:ns ## COG: RC0291 COG4115 # Protein_GI_number: 15892214 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Rickettsia conorii # 1 85 1 85 86 99 52.0 2e-21 MWRVVFSKQAAKDAKKLSSSGLKPKAQALINLLEEDPFAAPPRYERLVGDLSGMCSRRIN IQHRLVYEVFEEERTVRVLRMWTHYG >gi|319977140|gb|AEUH01000257.1| GENE 4 1565 - 2713 1368 382 aa, chain + ## HITS:1 COG:XF1999 KEGG:ns NR:ns ## COG: XF1999 COG0115 # Protein_GI_number: 15838593 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain 
amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Xylella fastidiosa 9a5c # 29 377 5 356 362 354 51.0 2e-97 MASTQHTPTVLEAAADQALPAADELAGRFPLTPNPAPASEQEYADIMGKLSFGRKFTDHM AHMRWTRDGGWQDRGVVPYGPLALTPGTAVLHYCQSVFEGIKAYRHADGSVWTFRPGYNA ARINHSAHRLALPQIDRGDFVASLVDYVRADQKWVPGVDGASLYLRPFMFASEEFVGVHP AGVVDYYVIGSPSGPYFTGGFAGVPIWVVRGYHRAGPGGTGSAKAGGNYAASLLPQIEAE RRGFSQVCFLDTYEEKYLEELGGMNMFVVMADGSVRTPELSGVILEGCTRSAILRLLRDD GVGVCEERIALDDLVAGIRSGAVAEVFACGTAAVVTPISRLASDDFDVELPVGGLTRRIH ERLTDIQMGRAEDPYGWTYRLV >gi|319977140|gb|AEUH01000257.1| GENE 5 2997 - 4223 1572 408 aa, chain + ## HITS:1 COG:Cgl0385 KEGG:ns NR:ns ## COG: Cgl0385 COG4558 # Protein_GI_number: 19551635 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type hemin transport system, periplasmic component # Organism: Corynebacterium glutamicum # 8 369 4 346 359 192 36.0 9e-49 MRSIRFTAARACALAVCAAAAAALAGCANIGTSPAQSGSECRAVPASGEPQTIEMPETTA DQLPDPRTVVGPTTAYTRSADITPVSDDPTADQSLPATVPSADGPQQKVTDTTRILALNQ NGGLAAAVIGLGFGCNLVGRDISTGYPSTAHLPLVTRGGHELSAEAILALRPTVVITDTS IGPYDVQLQLRDSGIPVVFIPLVYEHGVGGVQAQIQAVADALGVHELGQRLASRTMREIN QVVARVAGMSPREYADRPRAVFLYVRGKSVYYWFGEGSGADSLIQSLSMVDVAAEVGFKG MSPTNAEALIKAAPDIIITMTLGIASTGGIDEVLALPGIAETPAGKNRRVIDMSDYEVMS FGPRTAEVLAALGTAVFDPEGAYQPGAPPAPIGQRLAELQSGGQAQSQ >gi|319977140|gb|AEUH01000257.1| GENE 6 4220 - 5290 1272 356 aa, chain + ## HITS:1 COG:Cgl0386 KEGG:ns NR:ns ## COG: Cgl0386 COG0609 # Protein_GI_number: 19551636 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Corynebacterium glutamicum # 62 347 66 351 358 255 50.0 1e-67 MTRLRSRRPAPSRARATAITATVLAVALVVLVVVSGVLGQLLVPPSEVLGSILHRLGIDW LPAPSTPFGDEALWNVRFPRLAMAVVAGASLAVAGAIMQGIFGNPLAEPGTVGVSSGAAV GASLSIMFSWTFLGSWTTPVLAFACGLATTAGVYVMARRAGRTEVVTLILTGVAVNAVTG AAISFIVFAASSAARDQIVFWQMGSLAGSRWAQVAVVGPLCLAGIVCAQFISRKLDLLSL GERAARHVGVDVERLRVQGMLLVALLVSAAVAFCGIIAFVGLVVPHLLRLVMGPGHRALL 
PISALGGAVLMTAADLYARTLIEFADLPIGMLTSCVGGPFFFWLLRRSRGRSGGWA >gi|319977140|gb|AEUH01000257.1| GENE 7 5344 - 6129 844 261 aa, chain + ## HITS:1 COG:Cgl0387 KEGG:ns NR:ns ## COG: Cgl0387 COG4559 # Protein_GI_number: 19551637 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type hemin transport system, ATPase component # Organism: Corynebacterium glutamicum # 14 237 17 240 271 182 50.0 6e-46 MRVRDVRAGFGEAEVLHGVSLDLVCGEVLALIGPNGAGKSTLLAVISGDLDPTGGAVQID GAPLRDWSQAERAMRRSVLLQSVDVSFPFRVREVVEMGRAPWAGTGACADDERIVRASLR LTETEEFAERPYTSLSGGERSRSAFARVLAQAAPIMLLDEPTAAMDVRHQEMVMRRAREY AATGAGVVIVVHALDVAAAYADRVALLDAGRVAALGAPSEVLTAGLLSRVYRHPIEVVAH PRTGAPLVVPVRDSSLTEASR >gi|319977140|gb|AEUH01000257.1| GENE 8 6126 - 7514 1218 462 aa, chain + ## HITS:1 COG:no KEGG:Amir_2876 NR:ns ## KEGG: Amir_2876 # Name: not_defined # Def: hypothetical protein # Organism: A.mirum # Pathway: not_defined # 10 228 9 229 365 137 44.0 1e-30 MNNPATRALRAVGASLTAAALACAALALGAAPASAAPQVTVDAPGGIAIEGETAVTVSGT GFQSIRGGFGGIYVLFGTVTSDNWQPSKGGVTGKDYSYAADDETKPAGYESLVVFPGSAT SYAATGGEISADGSWQSKLTIPGAKFTTYDRSGNASEVDCTTVQCGIITIGAHGQINANN ESFTPVTFSSSAVGAPAGAAAQQTGPSAGADEEPQSSGTSPAAPSPSAAPRAGGSGAGTA GSAGTPITIQTSSPDSEVRLSTIFLLVGVGLLGLALVVLAAGFGGFLAVKSIVLGLSPVA VERERARRQAKADRARARNEAKRRRYLAKHGLPESGDAPAWSAPVGAFGVLPDEARAAEG RDERDRPDGRDARDGDTCAADPLGQSAIQSPGVCAPGVPGRVNDAGAPAQGAGDRADSAA EAPDGAALAGPAADDTAVLETVGAAGPGDAGLNGFFSQAERR >gi|319977140|gb|AEUH01000257.1| GENE 9 7519 - 9762 2075 747 aa, chain + ## HITS:1 COG:no KEGG:Xcel_2763 NR:ns ## KEGG: Xcel_2763 # Name: not_defined # Def: hypothetical protein # Organism: X.cellulosilytica # Pathway: not_defined # 314 620 24 332 456 172 42.0 3e-41 MRASTSLKVLCAVCAAVTVGASAIALAPPAAQAAPVEVSGAVLQWGMNAETGSAGFAPGT CNFFGAGAAGDSGGGQWPNSPARDEAGEYVVSAKTGQELWRAQQGSVSILKPDANGAYAP ITWADKCTTRTGQRTDTNGAVSEAVVRIEGGTGSLDPATGSATISWKGSWTVAFYSGMTY WSANDPVLTVENGVGTLTATASGFGASMSDPDAKPTPLTPRKVTLATLKDVTVTDKGLTV VPEYRGVEVTSPANAAPQARTGADWGSWPQDFVDFQGETGQHSYWYSSGSSADARKPPSP 
LAVAVPGNGGDAPAPDPAPQPTPTTPTPTPTAPDPAPTTPAPEPTTPGPAPSGKGGGSLS DANLRWSMSDEANSGAYNGDCNFLSAGVAASTGSSRPWDSSFYSASSGNVSIVKADGAGG WTGATWQNHCLDAAGNKVSSNDLHANTGSQVVVTGGKGTVAADGATHIEWKGSWTVAFYG GMNYWSITDPVLDVDASGSGTLTATASGYGADRNDTSVWLPLPTQKVTLATLTGVDVGAA ASASGFTHTPDYLGVSVRVGSSVTKQAARTSDNAEYWGSFPQDFVTYQEKTGQSSYWYTS GGTRDAAKKAAPLTVAYTASFTTPSSTNDGIAGARGRGTTSARGSTSRASSGSGTAAQNN STAAGRRSASAGPSAPPVPVALGPDAGFAEAGGYTVNRVPAARGLRDSAGRIALGAGVMT AGSGLPLLLGWLIRRQLGLDPRAAVLR >gi|319977140|gb|AEUH01000257.1| GENE 10 9889 - 11490 256 533 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 319 512 4 192 305 103 35 1e-21 MKRAIVIALSWAGALALSLGYLAIGWGVDDALGGRAWVSWWIALVGVAASAGAAWLASAI GGREMARLEPKARHRLVAHVFALGPSERTRERAGRIVNSATDGVERVAAYKGTFLAPMIA SLTVPLIVVLFVGATIDPVSAGFLAVAIPLVPACVLGFRKAFKPVSARYRGASRALAAKE LDAIQGLGALVLMNAGRPMGERLAEAAEEVRSKVMSYLAGNQLILLVIDSVFSLGMTTGA VALALIRTGSGAMSPGGALSLVLLSSIMLDPLDRIGQFFYIGMGGMAASREIKRFTSQEP VVVEAEGASAPAVLPEPGAVELSGVDFSYDGTTPVLRGTDLRIEPGEHVALAGPSGAGKS TVSAILQGFRRPDAGTARIDGVDLAGAPLDWVRARTGVVEQTTYLFSASLRENLLVAKPG ATDAQLLSALRAAHLQDLLDRLPDGLDTMVGARGLALSGGEAQRVAIARAILKDASILIL DEPTAHVDLASEREILAALRSAAADRTALVISHRDATIAGADRRVDLVEGAIR >gi|319977140|gb|AEUH01000257.1| GENE 11 11487 - 13256 268 589 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 323 581 114 361 398 107 31 5e-23 MSRFQLSKRLLSITRRVLAPLALSIVARVVFLLLGVGLFALGGWAVLAVPTGRSAWGAGA VIAALVVMSLLKGLARYGEQFAGHFVAFHSLAMLRNYFYDRLEPQAPAGSDSLDPGDLLN RVTKDIDRIEVFFAHTLAPVCTAVVVPAITLGWMGAATSWWLALVELCFLLVSGLVVPAL GAGEARAAARDLRVARGRLAAHVTDSVQGVREVLAFGAQGRRMEQMGAHEAAIASSTAVS TGWIARRRGINQAVLGLSVLAVALVGLSMVPSGALAPDQVGLAVGIALGSFGPVLAVEDF AADLDQAFASAERVFAVTDRTPLVADPAEPVGFDAAGDIAFTGVSFTYPQVRLDADGAPG VRPEVLHDVTIRIPAGRTTAIVGASGSGKSTLAALLTRTWDPDSGSVSIGGVDIRSVGLR PLREAVASAPQRPYIFNDTVRANLLLANPGASPAQLDEALERVDLAPWAASEPDGLDTNV 
GDMGERLSGGQRQRLALARALLRDASVYVLDEATSQVDPGTEASVLAGIREATRGRTVVV IAHRVSTVADADQIVVMDSGRVVETGAYGELMARGGALAALVAREATVA >gi|319977140|gb|AEUH01000257.1| GENE 12 13280 - 14497 1530 405 aa, chain + ## HITS:1 COG:Cgl0726 KEGG:ns NR:ns ## COG: Cgl0726 COG1482 # Protein_GI_number: 19551976 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Corynebacterium glutamicum # 1 386 1 375 394 233 40.0 3e-61 MERLTGWTKADAWGSADAIPRFLGTSPSAGPVSELWFGAHPDGPTLLDDGRTLAQLIEAD PKAALGSGLLYAFGPTLPFLAKIIAPAQTLSLQVHPTKEIAREGYLREDVLGIARTDPSR VYRDMNHKPEMLLALTGFQALVGFRVPRKARELLVGLDGELAEMLRHRLKFSTLRGGLRS LATWLFDEDSPATAPRIDEFVRACCSRLAAGGSPSRRTDSMVCELGERHPGDPGIIVAFL MNPVDLRPGEAVYIPPRQIHAYQSGLGIEVMASSDNVVRAGLTRKYIDSAQLVEITEFSA LPAMRVAPEHPSATTDRYLAPAQEFELSLVHLRAGEHAARNRGVLVPGEGPRVLIGTGGR VTVRLDEARGGQSLELARGQAAFVSAAERDLWAWGEGSFAQVGAP >gi|319977140|gb|AEUH01000257.1| GENE 13 14722 - 15111 125 129 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTICMGYRHRRALLTPARVISPSRCRQPPHDVNDPLTVPTTRVHAGNPLTMPTTRAGSLG ERLRVPPAFEAAFEAPPPGPHSQRPENAHKDPDLRNTRRTQGLDPALVCGTGKGPGALNA QGTATRKKH >gi|319977140|gb|AEUH01000257.1| GENE 14 15232 - 15564 318 110 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLGFRVPFACSTVCHPESVMSRSFLARAMSLTAVAALSLGLAACGGGGGASATDLDSART EATSTINSLPSLTESDKEEFSMQLSSAADTESIDRIVADARSRSEEKAKE >gi|319977140|gb|AEUH01000257.1| GENE 15 15775 - 16512 736 245 aa, chain - ## HITS:1 COG:SSO1857 KEGG:ns NR:ns ## COG: SSO1857 COG0518 # Protein_GI_number: 15898650 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase - Glutamine amidotransferase domain # Organism: Sulfolobus solfataricus # 1 227 1 205 209 89 31.0 5e-18 MTALVIQNDPVAPLGLLSRFLDSPRVLRAWEDPGAIAPLVERVRAEGAPFDSLVVLGGTA DAHADGKWPWLPDLRALIAACAGAGVRVLGVCLGAQLAAVALGGRVDVGAPLGPECGVTP ISWTGDALSHDDPLIRCLTRARVVFEDHGDAIGEAPRGSQVLAVSEKYPQAFLAGSVLGV QFHPEITEPIARRWQESNEGTDTERVIAGYRAHEAELASTCEALAQWVARAPVRPSGGGA AAVLG >gi|319977140|gb|AEUH01000257.1| GENE 16 16584 - 17651 1077 355 aa, chain - ## HITS:1 COG:no 
KEGG:Lxx17710 NR:ns ## KEGG: Lxx17710 # Name: not_defined # Def: hypothetical protein # Organism: L.xyli # Pathway: not_defined # 43 354 2 315 315 175 41.0 2e-42 MTWGALVREVWRDIATGTARAGVLGVLFALLVCGPALVDCAQVVGIGAQAARFRASLASV RVFEAKGAIDGAACHALGRIPGVRAAAVRTAETGVRAAALPASSIPLYEATPGIGAIVRA PGAPTTGVLASEQVAAALGLRPGSALALAAGDAEVAGVYPWDEGDGRRPGFAYAVISPVP AQGVFDECWVESWPMSTAIPPLARTAVIPDGEPEVKEGAFNASLGASFDGDALLRGRLTR WAPLVSAVLGTLLGAGGVWARRLEIASALHAGLSRADALLIQTCEALAWTASGTVMALPA AAIVIAGAQSPEAPGVWAIVSLEAGTGLLAALLGVVAATCLVREKRLFAYFKGRR >gi|319977140|gb|AEUH01000257.1| GENE 17 17759 - 18244 463 161 aa, chain - ## HITS:1 COG:aq_297 KEGG:ns NR:ns ## COG: aq_297 COG1136 # Protein_GI_number: 15605829 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Aquifex aeolicus # 19 158 80 219 224 101 37.0 7e-22 MDAAREGTARTSAPARPDRPPRIAWVFQNPHGAPHRTALDHVALPLLARGVPTRAADERA RRLLEEFGLAGAADRPFSTLSGGEAQRLLMARALAGSPDVLLVDEPTAQLDQTTAHRVAD ALTALHAQGTAVVIATHDPAVRDQCDQVIDLGDERYRGGGR >gi|319977140|gb|AEUH01000257.1| GENE 18 18387 - 19436 1169 349 aa, chain - ## HITS:1 COG:no KEGG:BAD_0485 NR:ns ## KEGG: BAD_0485 # Name: not_defined # Def: hypothetical protein # Organism: B.adolescentis # Pathway: not_defined # 5 335 35 376 393 149 32.0 2e-34 MRKPQRRTLRLLGMGAFTAALAFALGTTVGVLVARPPTPEALQPGQAPTSAPVSQREFAD ERSVTLTISATGERALRSPAQGRLTSLGISAGTPVESGQVVCAIDGRPIAALATSTPLYR PIEDGARGDDIAALNAELARLGYAAPASDAADWRTRRALSQVLGVDDGAGGVPASFAPDS FLWIPAPQVTPTGVPAHLGDAVDAQTSLVELADAGAPPRLTIPQDAQPGARVILLGSQAL PVPSDGVITDPGTVSAIMDSLEYAGHVAAKGANAGDGAQLSVKWRLAEPLTVSVVPPAAV GGTAGRSCVWQGGQAVEVTVVGSQLGQTYVSSDTALDSVDLGTEGRACP >gi|319977140|gb|AEUH01000257.1| GENE 19 19641 - 20318 499 225 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154509376|ref|ZP_02045018.1| ## NR: gi|154509376|ref|ZP_02045018.1| hypothetical protein ACTODO_01907 [Actinomyces odontolyticus ATCC 17982] # 24 225 8 207 207 99 33.0 1e-19 MRAIVTTGSLAPGGRSSGPPIGRGAGAARRASGALALLALVCALCACGGQARTAVQSGPG 
WDAEFEQARAEYQGNDFVLKILEDNTITEEEMRAVEDANIRCLEDHGVVGARYENGTLAV SGTSDDPNEYETEVRECAGISGEPFVQMAYHSMKVNPTNARWADLVAPCLVRKGVVDPSF SGDDWERAGQNAAESNRPIEEVIEYTSTEDAFVLAMRECSEDPNQ >gi|319977140|gb|AEUH01000257.1| GENE 20 20633 - 21826 1660 397 aa, chain - ## HITS:1 COG:Cgl2078 KEGG:ns NR:ns ## COG: Cgl2078 COG0620 # Protein_GI_number: 19553328 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Corynebacterium glutamicum # 3 397 6 398 401 454 56.0 1e-127 MTIRTTHVGSLPRSQRLLDANRKHSAGALADADFSALLQEEVDAVVARQAGIGIDVVNDG EYGHAMLDTVDYGAWWTYSFSRFGGLSFEDVSRFDVRPPAGRDGRLSFSSFAERRDWVRF ADAYADPDSGIHIANRNPVAFPTITGALTYIGEQAVERDIAGVSHALEAAGKPLSDGFVA AISPGSAARVANAFYEDDEAVVWACADVLREEYKRITDAGLTVQIDAPDIAEGWDQMNPE PSVADYRAFCRVRIDALNHALRGIDPSLVRFHVCWGSWHGPHTTDIPFKDIVDLALAVDA NGLSFEAANARHAHEWTIWRDIKLPEGKYLVPGVVSHSTNVVEHPELVAQRIHQFADIVG DSRVVASTDCGLGGRVYPSIAWAKLESLAEGARLASR Prediction of potential genes in microbial genomes Time: Thu May 12 19:08:05 2011 Seq name: gi|319977136|gb|AEUH01000258.1| Actinomyces sp. oral taxon 178 str. F0338 contig00258, whole genome shotgun sequence Length of sequence - 3869 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 165 - 213 0.2 1 1 Tu 1 . - CDS 423 - 632 279 ## 2 2 Op 1 . + CDS 1353 - 3200 2784 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 3 2 Op 2 . 
+ CDS 3250 - 3868 691 ## COG1381 Recombinational DNA repair protein (RecF pathway) Predicted protein(s) >gi|319977136|gb|AEUH01000258.1| GENE 1 423 - 632 279 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLSSRLERGLVERMVQNASESARIRPDPAQIAKLRRILDHPRAVPPGWGETQTGFGPPP PGARRFTQL >gi|319977136|gb|AEUH01000258.1| GENE 2 1353 - 3200 2784 615 aa, chain + ## HITS:1 COG:AGc3510 KEGG:ns NR:ns ## COG: AGc3510 COG0129 # Protein_GI_number: 15889212 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 613 36 642 642 813 67.0 0 MSRPLRSATSTQGRNMAGARALWRATGMGDDDFGKPIIAIANSFTQFVPGHVHLKDMGSL VAGAVREAGGVAKEFNTIAVDDGIAMGHDGMLYSLPSREVIADSVEYMVNAHRADALVCI SNCDKITPGMLLAAMRLNIPTVFVSGGPMESGAQVDGVVEHRLDLIDAMVMAVDDSVSDL ELAQVEANACPTCGSCAGMFTANSMNCLNEAIGLALPGNGTTLATQVERKKLFVEAGARI VDMCRRYYDGDDESVLPRSIATKSAFENAMSLDVAMGGSTNTVLHILAVAHEARVDFTLD DIDAISRRVPCLCKVAPNSTTYHIEHVHRAGGIPALMGELDRAGLLNRDVHSVHSDSLDQ WLGQWDIRSGTASAEAEHRYLAAPGGVRTTRAFSTANLFESLETDAVGGCIRSVEHAYTK DGGLAVLKGNIAQDGAVIKSAGIDEELFHFVGRAFVVESQEEAVESILAKRVQPGDVVVI QYEGPKGGPGMQEMLYPTSYLKGLGLGKKCALITDGRFSGGTSGISIGHIAPEAAAGGAI GLVRTGDEIEIDVNRRLLRVNVPDEELERRRAEKGPAPWRPSKPRPRQVSDALRVYGMLA ASADKGGVRVLPDWA >gi|319977136|gb|AEUH01000258.1| GENE 3 3250 - 3868 691 206 aa, chain + ## HITS:1 COG:MT2431 KEGG:ns NR:ns ## COG: MT2431 COG1381 # Protein_GI_number: 15841874 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Mycobacterium tuberculosis CDC1551 # 2 205 1 204 265 203 53.0 2e-52 MLKSYRDEAIVLRTHKLGEADRIITLLTAEHGQVRAVAKGVRRTSSKFGARLEPFSVVDV QAHRGRSLDTITQVESLALHGDAIAADYDLYVAGTVIVEAAERLTADSDAAGHSQYLLLL GALHALAKRRHDPTLIRTSYLLRALALAGWAPSCYDCANCGAAGPHSAFSIAEGGAVCAA CRPPGAASPSPDAMALMGDLLSGDWA Prediction of potential genes in microbial genomes Time: Thu May 12 19:08:15 2011 Seq name: 
gi|319977122|gb|AEUH01000259.1| Actinomyces sp. oral taxon 178 str. F0338 contig00259, whole genome shotgun sequence Length of sequence - 12289 bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 4, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 174 238 ## gi|154509426|ref|ZP_02045068.1| hypothetical protein ACTODO_01957 2 1 Op 2 . + CDS 171 - 1013 1033 ## COG0020 Undecaprenyl pyrophosphate synthase 3 2 Op 1 . - CDS 1033 - 1530 683 ## COG0586 Uncharacterized membrane-associated protein 4 2 Op 2 5/0.000 - CDS 1607 - 1999 506 ## COG0735 Fe2+/Zn2+ uptake regulation proteins 5 2 Op 3 42/0.000 - CDS 2009 - 2926 1260 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 6 2 Op 4 25/0.000 - CDS 2923 - 3756 808 ## COG1121 ABC-type Mn/Zn transport systems, ATPase component 7 2 Op 5 . - CDS 3747 - 4739 1586 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 8 2 Op 6 . - CDS 4823 - 5815 1356 ## COG1609 Transcriptional regulators - Prom 5870 - 5929 4.9 9 3 Op 1 . + CDS 5724 - 5921 171 ## 10 3 Op 2 19/0.000 + CDS 6017 - 7240 1893 ## COG2182 Maltose-binding periplasmic proteins/domains 11 3 Op 3 20/0.000 + CDS 7372 - 8955 2665 ## COG1175 ABC-type sugar transport systems, permease components 12 3 Op 4 . + CDS 8970 - 9875 1299 ## COG3833 ABC-type maltose transport systems, permease component + Term 9947 - 10000 19.2 + Prom 10110 - 10169 1.7 13 4 Op 1 8/0.000 + CDS 10205 - 10594 509 ## COG1725 Predicted transcriptional regulators 14 4 Op 2 . + CDS 10591 - 11487 1320 ## COG1131 ABC-type multidrug transport system, ATPase component 15 4 Op 3 . 
+ CDS 11484 - 12206 1044 ## gi|293189557|ref|ZP_06608277.1| conserved hypothetical protein Predicted protein(s) >gi|319977122|gb|AEUH01000259.1| GENE 1 1 - 174 238 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509426|ref|ZP_02045068.1| ## NR: gi|154509426|ref|ZP_02045068.1| hypothetical protein ACTODO_01957 [Actinomyces odontolyticus ATCC 17982] # 1 55 189 243 245 74 67.0 2e-12 SPYAMALMGDLLSGDWAGAGRAAAYARPEAAGLVSSYTQWHLERRLRSLQVLERSRA >gi|319977122|gb|AEUH01000259.1| GENE 2 171 - 1013 1033 280 aa, chain + ## HITS:1 COG:ML0634 KEGG:ns NR:ns ## COG: ML0634 COG0020 # Protein_GI_number: 15827262 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Mycobacterium leprae # 4 273 14 296 296 273 51.0 4e-73 MSEPTAPPGPRPTPMDRAPSDPWAPLAGPGQWEAPAPVFAKGEAPAHVALVMDGNGRWAN SRGLPRVEGHRAGEYALMDTIAGAIDAGVRYLSVYTFSTENWKRSPAEVSFIMNYASDVL ERRTGQLREWGVHVRWSGRRPRLWKSVIRALEKADRATAHNTALDLVMCVNYGGRAEIAD AARAIAQEAASGRLSPRGVTERSFARHLYLPDVPDVDLMIRTSGEQRVSNYLLWQLAYAE MMFVDTPWPAFDRGELWDCLLAYAGRERRFGGAVDRVRGS >gi|319977122|gb|AEUH01000259.1| GENE 3 1033 - 1530 683 165 aa, chain - ## HITS:1 COG:L31711 KEGG:ns NR:ns ## COG: L31711 COG0586 # Protein_GI_number: 15673772 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Lactococcus lactis # 23 136 16 122 160 66 37.0 3e-11 MVTAGLAGFAPSFAREGIGLFLFLFAVVLLRAQATYWLGRLAAKGASSGAGASGWRGRVS RWFDGPVPRKGADLLDRWGMVVIPLCFLTVGVQTAVNAAAGLVRMRWGVYTLAMVPGCVL WAMLYGFGVYAVWTALTASVWTAAAVVVLACAGACALVMRKRRAR >gi|319977122|gb|AEUH01000259.1| GENE 4 1607 - 1999 506 130 aa, chain - ## HITS:1 COG:Cgl2228 KEGG:ns NR:ns ## COG: Cgl2228 COG0735 # Protein_GI_number: 19553478 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Corynebacterium glutamicum # 3 126 17 140 144 122 50.0 1e-28 MQRMTKQRRAVLDELGRVQDFRSAQQIFEDLHTRGERVGLATVYRSLQGLADDHRVDVLR STDGEALYRACDSQGHHHHLVCRRCGAAEEIAQAQIEAWVSAVAAEHGFTDVVHSLELFG 
LCRSCQRARG >gi|319977122|gb|AEUH01000259.1| GENE 5 2009 - 2926 1260 305 aa, chain - ## HITS:1 COG:BH1395 KEGG:ns NR:ns ## COG: BH1395 COG1108 # Protein_GI_number: 15613958 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Bacillus halodurans # 17 269 11 262 287 131 33.0 2e-30 MNGFWDGLAALASSPMMQRSLLAALLVGLTAPVIGTYLVHRRLAMLGDGIGHVSLTGVAL GWLVGAAANASPVDRWAVPGAIIVSLLGAVTIEAVRQSGRTSGDVALAMLFYGGIAGGVL FIGIAGGTSAQLNSYLFGSISTVRWSDITTISVLSAAILALGVGLAPALFSVTNDEEFAR STGLPVRSLSMLIAVLSALTVAVAMRVVGSLLVSALMVVPVAIAQLAATSFRSTMALSMG LGVAICVSGLTLTYFVDLSPGATIVVIAIGLYALGFALRSIVDAVRRARRIARSRATDNH PDRSA >gi|319977122|gb|AEUH01000259.1| GENE 6 2923 - 3756 808 277 aa, chain - ## HITS:1 COG:TP0164 KEGG:ns NR:ns ## COG: TP0164 COG1121 # Protein_GI_number: 15639157 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn/Zn transport systems, ATPase component # Organism: Treponema pallidum # 15 229 2 214 266 131 39.0 1e-30 MPVDASPSGAHGAGAATTAAPAVVRASNLHVAWDMDLVLHGVDFAIPAGQAVALTGANGS GKSTTLRALLGTAPITRGGVELFGASVARPSAVDWKRVGYVPQRVSSGGGVFASAIEVVR SGLLGPRRWWAAPGDSRAAMGALERVGLAHRASSPMGILSGGQQQRVLIARALVRRPDLL VMDEPMAGIDAASRARLAELVAEAKQEGTTVLVVLHELGELGPLLDRELHVAGGHISYDG PPHLNGADEGRHGNCHAPLPRPSSPGPVVDDIVGGRR >gi|319977122|gb|AEUH01000259.1| GENE 7 3747 - 4739 1586 330 aa, chain - ## HITS:1 COG:SPy0714_1 KEGG:ns NR:ns ## COG: SPy0714_1 COG0803 # Protein_GI_number: 15674772 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Streptococcus pyogenes M1 GAS # 39 323 32 315 322 165 35.0 1e-40 MNKRALLTLCAAASAVALAACSGGTGTTAASGQSDSQGLRVMASFYPLKYLTEQVGGAHV TVTSLTPDGAEPHDLDLSPAMVDSIGRADAVVYLKGFQTAVDEAVEQQSPKTAIDLADTV TLVDAGEGSNHPADEDEEGQSGHEGQSGHEEGHEHHHDMAKDPHFWLDPQRMADAASFIG EQLAAADPANANDYRANASTTADSMRQLSETLSNRTASCQSKTFVTAHTAFGYLADRAGL TQVGISGLDPDSSPSAARLQEIAEVVKSQGVTTIFTESLIDPKVAQTLADDLGIGTAVLD 
PIESQVDASKDYTAVMNENIDALAKALNCQ >gi|319977122|gb|AEUH01000259.1| GENE 8 4823 - 5815 1356 330 aa, chain - ## HITS:1 COG:BS_degA KEGG:ns NR:ns ## COG: BS_degA COG1609 # Protein_GI_number: 16078147 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 3 326 7 332 337 151 30.0 2e-36 MADIATHAGVSTATVSRVFNGVGQVSAPTRIKVLTAIDELGYDRPASAAPPSSPGIGVIV PELTNPVFSAFAHCLQVEIAHAGGIAMIRSQTPGATSEAEHISSLLAHGVDGLVFVSGRH ADHLGDVSPYLDLAGRNVPFVTVNGARPEIPAPDFSTGDALGIQAALAHLRELGHTRIAL LGGRTHAVPAQRKARAFREAMAADFGQDDPVVVETFYTFEAAAAATPDLIAQGVSAIVAG SDLQALGAISALRAHGLRAPGDVSVIGFDDSFLMPHLDPPLTTVRQPVAAIVSSAVRALF EALASREPQSHADFTFTPDLVVRSSTGRAA >gi|319977122|gb|AEUH01000259.1| GENE 9 5724 - 5921 171 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRVGAETWPTPLNTRETVAVETPAWVAMSAMLVRFFVIEALPSPSCEACNTNVTVPVLAE NVTLM >gi|319977122|gb|AEUH01000259.1| GENE 10 6017 - 7240 1893 407 aa, chain + ## HITS:1 COG:TM1204 KEGG:ns NR:ns ## COG: TM1204 COG2182 # Protein_GI_number: 15643960 # Func_class: G Carbohydrate transport and metabolism # Function: Maltose-binding periplasmic proteins/domains # Organism: Thermotoga maritima # 38 404 18 390 391 180 30.0 5e-45 MRRGIAGLGILAIAASLAACGGKSGTTTDNQSAGGAGASGTITIWADDTRYSQVEEFAKD FTASSGVNVSVVQKSESDMDTEFTTQVPTGNGPDLIVMAHDKLGSLVSNGVVAPVDLGDK SKFSDVAVKAVTYGGQTYGVPYAVESVALVRNNKLTTDTPATFADTLTSGQAAGVQYPFV VQMGEHGDPYHFYGFQTSFGAPVFKTDANGEYTSDLALGEQGGTDFANWLKEQADAKVLD TTITADIAKQAFLDGKAAYTITGPWNVAAFREAGMDVSVLPVPKAGSQDAQPFVGVQMFY QSAKSANTLLTKQFFNYLATDEGQKKMQELGGRASAMPEVADASSDQDIKAFSDIAQKGA LAPSIPAMRSVWEFWGDTEAAIVSGAQAPADAWSAMVTNIQNAITKK >gi|319977122|gb|AEUH01000259.1| GENE 11 7372 - 8955 2665 527 aa, chain + ## HITS:1 COG:VCA0944 KEGG:ns NR:ns ## COG: VCA0944 COG1175 # Protein_GI_number: 15601697 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Vibrio cholerae # 39 526 30 524 524 259 34.0 8e-69 MAEPNATTTKQGYSPSHARDVMKPGFIVKLVLMMLIDALGLYGIFTAYAVRSWGIIAIMA 
VLLAVLNWTYFSKRMVPAKYLIPGMVFLLVYQVFVMGYTAYVSFTNYGQGHNSTKEDAVS AILRNSEQRVPGAANTTAAVVEDGAGLGLVVQNPATGTFQIGNEDTPLHDVDAQISGTTI KVDGYTVLTVTDLLNRQSEVTGLSVPVSNDPNDGFYKTDDGSSAYLAKSAFSYDADADTI TDTSSGTVYTADEKAGNFKSAAGESLDPGWRVFVGTSNYANMVTAADLAGPFFKALLWSF AFAALSVLTTFALGLVLALVFADKRLKGRKIYQSLMILPYAFPAFLATLVWQGMLNSKFG FINQVFFGGAEIPWLTNGWLAKLSILGVNLWLGFPYMFLVCLGALQSLPGDVEEAAKIDG ASGLRTVWSIKLPLVLQSTVPLLIASFAFNFNNFSLIYMLTEGGPNYPGLSVPVGETDIL ISMVYKIAIESGVPNYGLASAMSIVIFIIVGVIAWLGFRQTKTLEEL >gi|319977122|gb|AEUH01000259.1| GENE 12 8970 - 9875 1299 301 aa, chain + ## HITS:1 COG:SA0209 KEGG:ns NR:ns ## COG: SA0209 COG3833 # Protein_GI_number: 15925920 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type maltose transport systems, permease component # Organism: Staphylococcus aureus N315 # 23 300 9 278 279 207 42.0 2e-53 MTASKKGATPKDNHPLKGARWWKELGWRHVVAILTIIYCLIPLLYVISVSLNPGATLTGS NQLFSSVSPENYMALGTDSKYSQYWAWILNSLIVSSVTAVGTVLMGATAAYAFSRFRFRG RRGTLTFLLLVQMFPQMLAFVALFLLLLGLQDIFPVLGLNSKLGLICVYLGGALGSNTFL LYGFFNSIPRSLDEAAMIDGATHAQTFWTIIMPLVRPVLAVVGLLSFISSLGDFVIAKVV LQQPSEFTLAVGMYMWASDSRTAPWGIFAAGAVIAAIPVVLLFQYLQKYIVSGLTSGAVK E >gi|319977122|gb|AEUH01000259.1| GENE 13 10205 - 10594 509 129 aa, chain + ## HITS:1 COG:BH3492 KEGG:ns NR:ns ## COG: BH3492 COG1725 # Protein_GI_number: 15616054 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 1 112 1 112 129 92 38.0 2e-19 MPPTFTPGIPIYVQIADNIRDQILRGALRDGDQLTSTTEYATTYRINPATAGKAFAILVD EGLVEKRRGIGMFVAEGAHASLVEEGRRTYVDETLYPAVEAGLALGLDIDTIATNVRAYD AGSHTKEDS >gi|319977122|gb|AEUH01000259.1| GENE 14 10591 - 11487 1320 298 aa, chain + ## HITS:1 COG:BH3493 KEGG:ns NR:ns ## COG: BH3493 COG1131 # Protein_GI_number: 15616055 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Bacillus halodurans # 2 287 5 276 283 110 28.0 4e-24 MITFNSIAHTYPGEAEQALSDISLGFGDSTVTALVGPNGSGKTTLMQILSGLLPPTSGTV 
SVNGTALVPDDLLGYSIMASSNRDFDNSSGAVLVDYARLRPTWDQGLFDRYTSRFGFQLG RRKSLRKLSSGQAAILTGAIALASGAPITVLDEIQAPLDVPTRYAFYEELLALAADCMEG RRPRRVFLVSSHMVSELENVAEEVIALKGGRVIAHESVDEFTSRICAIAGNAADVERFLA DHPGIPVIASRTLGSAREIVVDLRGRGVTDRELASHSLTPSPCSFQDAFAYLIQENDQ >gi|319977122|gb|AEUH01000259.1| GENE 15 11484 - 12206 1044 240 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189557|ref|ZP_06608277.1| ## NR: gi|293189557|ref|ZP_06608277.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 1 240 1 240 240 288 65.0 3e-76 MSTQTPDLVKSSPSALRLGLIDWRPAVVLYLLVLIGQFLANSLVTGIVYLSLGESNGQVY QAQTNSAFVTGVFIFVSAAVNATVGMRSASLSGASRPTLTASTYVTTVLLIFLGSIFWWV LLAAQPLLFFTGTTTYPMGVDSWLFVIVAWIACDGAGRFIGGTGRCVGSLWKRSFALMGA IPLGICAIGAPITWFVLTTVHRVGLLGPMHPAYWLLGLLPLVVCAAGWPLTASGHIRRMA Prediction of potential genes in microbial genomes Time: Thu May 12 19:08:37 2011 Seq name: gi|319977116|gb|AEUH01000260.1| Actinomyces sp. oral taxon 178 str. F0338 contig00260, whole genome shotgun sequence Length of sequence - 4803 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 153 - 1532 2249 ## COG0423 Glycyl-tRNA synthetase (class II) 2 1 Op 2 . + CDS 1534 - 2859 436 ## PROTEIN SUPPORTED gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase 3 1 Op 3 . + CDS 2886 - 3704 811 ## COG0710 3-dehydroquinate dehydratase + Term 3865 - 3922 0.1 4 2 Tu 1 . - CDS 3767 - 4165 501 ## CLH_2436 hypothetical protein 5 3 Tu 1 . 
- CDS 4363 - 4488 179 ## Predicted protein(s) >gi|319977116|gb|AEUH01000260.1| GENE 1 153 - 1532 2249 459 aa, chain + ## HITS:1 COG:Cgl2226 KEGG:ns NR:ns ## COG: Cgl2226 COG0423 # Protein_GI_number: 19553476 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Corynebacterium glutamicum # 3 459 5 461 461 703 72.0 0 MASRLDNVISLAKRRGFVFPCGEIYGGTRSAWDYGPLGVELKENIKRQWWDHMVRRRADV VGLDSSVILPRETWVASGHVKAFTDPLIECLNCHKRAREDQLVEELAAKKGVEESQVSLA DVPCPNCGVRGQWTEPRAFSGLLKTFLGPVDDEAGLHYLRPETAQGIFINFANVMTAARK KPPFGIGQIGKSFRNEITPGNFIFRTREFEQMELEFFCEPGTDEEWHQYWIDYRKAWYVG LGIDPENLREYEHPKEKLSHYSKRTVDLEYRFGFAGGEWGELEGIANRTDYDLTVHSQAS GAKLDYFDPASKERWTPYVIEPSAGLTRSLMAFLIEAYHEDEAPNAKGGVDKRVVLKLDP RLAPVKAAVLPLSRKPELTGPASELADGLRGLWNVDYDDAGAVGRRYRRQDEIGTPFCIT YDFDSVEDHAVTVRERDSMEQVRIPLDQVKPYLVERLGC >gi|319977116|gb|AEUH01000260.1| GENE 2 1534 - 2859 436 441 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase [Haemophilus influenzae R2866] # 13 383 28 335 353 172 31 5e-43 MGGTPRAHGPAPLRIGPLRLWTPVVLAPMAGVTDAPFRRLCRFFGEAGLPQELRPLDPAR PGTGGDEAVAGTDGREGIANLARAGAGRRAVPRVDAPAGLYVTEMVTSRALVEEGARTLD MVRPDPVERVRSIQLYGVDPAVMGAAARLLVERGLADHIDLNFGCPVPKVTRKGGGAALP WKRDLFVDLVGSVVRACQKAGERAGRQIPVTVKIRIGIDSAHETATDAALAAQRLGASAL TLHARTQAQHYAGRAHWDEIARLKEALTIPVLGNGDVFEARDAMAMMERTGCDGVCVGRG AQGRPWLFRDIVAAYHGAPVPPGPALAEVVAVIERHAAWVVADQGDEARAMREMRKHVGW YLRGFAVGGAQRHALSMVSTLQELHDLLAGLDPDQPFPEAARGPRGRAGGERPPHLPDGW LDHPYLTDSERGRLHLAEIGY >gi|319977116|gb|AEUH01000260.1| GENE 3 2886 - 3704 811 272 aa, chain + ## HITS:1 COG:lin0494 KEGG:ns NR:ns ## COG: lin0494 COG0710 # Protein_GI_number: 16799569 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase # Organism: Listeria innocua # 31 267 17 248 252 122 34.0 6e-28 MSTGGARSRARGTVDIGRPPRGRVELGLRTRVIAPLTSGSRGGLLDEAAAVGDEPVDLVE WRADSMVRALAGSSFAAAPDLVGELTATAGALLGASALPLIATVRTSSEGGAAHLDDSEY 
CALVASLAGMADAVDVEIGREGAPDLVREAQRAGARVIASHHDFGGTPGDEELAGVLAAM NHAGADVLKIACAPRSGGDVARVLTAGAWAREAYDRPVIAIAMGRLGGPTRFAGAALGGA ATFASVGGASAPGQYTAREARTVLDIIEGAPQ >gi|319977116|gb|AEUH01000260.1| GENE 4 3767 - 4165 501 132 aa, chain - ## HITS:1 COG:no KEGG:CLH_2436 NR:ns ## KEGG: CLH_2436 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 1 128 1 129 131 127 44.0 1e-28 MPIATVSATFNAPVERVWGLVTGMDPSWRSDLSSIEVVDGLRFIEETTEGYRTSFTTTAR VENERWEFDLVNRHIEGHWTGVFKQEGTGTAVVFTEDVRGRRLFVRPFVRGYLRHQQATY IEDLGRALAAQE >gi|319977116|gb|AEUH01000260.1| GENE 5 4363 - 4488 179 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFHRIPELFSTFLRRHRWRRQQRNDLQFCLDRPIFGALTAR Prediction of potential genes in microbial genomes Time: Thu May 12 19:08:45 2011 Seq name: gi|319977108|gb|AEUH01000261.1| Actinomyces sp. oral taxon 178 str. F0338 contig00261, whole genome shotgun sequence Length of sequence - 6248 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 5, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 253 304 ## gi|293189256|ref|ZP_06607979.1| conserved hypothetical protein + Prom 260 - 319 4.0 2 2 Tu 1 . + CDS 339 - 1250 1156 ## COG0583 Transcriptional regulator + Term 1310 - 1348 9.0 3 3 Op 1 4/0.000 - CDS 1259 - 2056 886 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III 4 3 Op 2 . - CDS 2053 - 2940 914 ## COG0796 Glutamate racemase 5 3 Op 3 . - CDS 2892 - 3503 767 ## gi|293189260|ref|ZP_06607983.1| conserved hypothetical protein 6 3 Op 4 . - CDS 3514 - 3837 414 ## COG2127 Uncharacterized conserved protein 7 4 Tu 1 . + CDS 3836 - 5263 1735 ## COG1488 Nicotinic acid phosphoribosyltransferase 8 5 Tu 1 . 
+ CDS 5447 - 5707 102 ## Predicted protein(s) >gi|319977108|gb|AEUH01000261.1| GENE 1 1 - 253 304 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293189256|ref|ZP_06607979.1| ## NR: gi|293189256|ref|ZP_06607979.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 1 83 1 83 110 127 85.0 3e-28 MTIPSFAITLLGWLGAAASIAAYYLVSSKRFAPDSLRYHSLNVTSCVLLATACASTGAWP SFLTNTIFIAVGAHMIWRVRGRLK >gi|319977108|gb|AEUH01000261.1| GENE 2 339 - 1250 1156 303 aa, chain + ## HITS:1 COG:mll2048 KEGG:ns NR:ns ## COG: mll2048 COG0583 # Protein_GI_number: 13471925 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Mesorhizobium loti # 3 250 1 252 329 77 31.0 4e-14 MDIDPRRLPFLMAVGREGGIVAAADILMVSPSAVSQQIQKLEDEVGLKLVERTTAGAVLT PAGRIVADSGERISTEIAETLKALRPLTGQVTGTATIGAFLTVIRSTLIPALATLDGNLP GVDLRIEETDEQPGMARLRTGAFDLLVIERDEEPGIAPRGYTDIPFIDEPWVLVTPDSAP AIGSIADLGELTWLRTTPGSTGDHVMRRITKALPATHWAPHTYTTYDVARALVRARVGST VLPAMALSGANFEGMRVTPLPTLGTRQILLRRKNHGWDEAGAAARVAAHLLEWTRHHDST LTE >gi|319977108|gb|AEUH01000261.1| GENE 3 1259 - 2056 886 265 aa, chain - ## HITS:1 COG:ML1173 KEGG:ns NR:ns ## COG: ML1173 COG1234 # Protein_GI_number: 15827592 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Mycobacterium leprae # 1 265 3 254 260 149 38.0 6e-36 MRLTVVGATGSMSGPGSPASCYLVQAEGQCPVSGLLRTFSVLLDLGPGSFGALWRHMDPC GIDALFLSHCHADHMGDIISLQVHRKWGPARDLAPLLLLGPDGTLGRVRQIEGAVGLEDY SGEFAFARVDSATVVRVGPLLLRAFPGQHSIESYGVRVEGPSEADPARTASLFYTGDTDE CPSIVKGARGVDLLLSEVGFTAADQVRGIHMDGLRAGRVAAEAGAGALVATHIQPWTSHE LVRAELARTWDGPLSFARGGAVYEL >gi|319977108|gb|AEUH01000261.1| GENE 4 2053 - 2940 914 295 aa, chain - ## HITS:1 COG:Cgl2459 KEGG:ns NR:ns ## COG: Cgl2459 COG0796 # Protein_GI_number: 19553709 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Corynebacterium glutamicum # 23 286 21 280 284 278 54.0 1e-74 
MAGFTPDGHARAPRGLRVTYMDNAPIGIFDSGLGGLTVARAVIDKLPDEEVVYLGDTRHT PYGPRPVAQVRTYTLACLDQLASMGVKALIIACNTATAAALADARERYWVDAGIPVIEVI TPAARQAVTATRNGRIGVIGTRATIQSEAYSHVLAAVPGLSLTQRECPRFVEFVEAGITT GAELEAVATQYLAPLKEAGVDTLVLGCTHYPLLTGVIGRVMGEEVALVTSSEATANLTYN ELVDRDLLHEPRGGASPRHHFYSTGGSGDFGRLARRFLGPEVSSVRTMDVEGGAL >gi|319977108|gb|AEUH01000261.1| GENE 5 2892 - 3503 767 203 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293189260|ref|ZP_06607983.1| ## NR: gi|293189260|ref|ZP_06607983.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 6 197 7 186 190 155 53.0 1e-36 MGPFARDGAYYSARIDRRSLLLLRQLVSEILVVLDDPSDAEEMSIMVSASTGPEVKRDAP VERSLEFLLPPMSADAQTAQSLRALTEDIVRSDKSRRLRAFWRILDGAAPVDARERPWAA DSDEVDLRVRAADAWQCLGALNDIRLGLAGELGMETASDAEAVEALAMSEPTGEHTQSVA IIYMLVTWWQDSLLTAMQEHHED >gi|319977108|gb|AEUH01000261.1| GENE 6 3514 - 3837 414 107 aa, chain - ## HITS:1 COG:Cgl2465 KEGG:ns NR:ns ## COG: Cgl2465 COG2127 # Protein_GI_number: 19553715 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 10 100 33 123 124 95 52.0 2e-20 MATTAPRTGARPRARAEAWARSYEIPVWQAVVWDDPVNLMGYVTAVLQRHFGYSRARAHE LMMRVHTTGRAVVNSGPRERIEADVVALHTYGLRATMEPAPAGAHAR >gi|319977108|gb|AEUH01000261.1| GENE 7 3836 - 5263 1735 475 aa, chain + ## HITS:1 COG:Cgl2467 KEGG:ns NR:ns ## COG: Cgl2467 COG1488 # Protein_GI_number: 19553717 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Corynebacterium glutamicum # 26 471 5 439 446 380 52.0 1e-105 MSQAYCPQRVLNTVFTPLVTSRTRYPRHMAASRSSSFLTDMYELTMLDAALHDGTAQRRC VFEVFGRRLPATRRFGVVAGTGRILEALERFTFDSEQIEWMRDNGIVSERALDYLADFRF SGTIWGYAEGECYFPGSPLLTVQGTFAECTLLETLLLSILNHDCAVASAASRMTIAAHGR PCMDMGARRAHERAAVSAARAAVIGGFQGTSDLEAAKRYGIRAIGTAAHAFTLLHDDERA AFDSQVANLGPGTTLLVDTYDVAQGVVNAVEAARAAGGELGAVRLDSGDLVAQAFKVRGQ LDAMGATSTRITVTSDLDEYAIAALGAAPVDFYGVGTRLVTGSGVPTAALVYKVVAREDS GGGMVPVAKKSESKSTVGGMKVAGRVIGEEGYAAEELLLVGVPLDEAADALQARGARPLQ 
IPLVVDGGIDPRMWGSEALARAQERHIRSRNELPYQAWRLSEGETAIPTRYEKAG >gi|319977108|gb|AEUH01000261.1| GENE 8 5447 - 5707 102 86 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVSYPNSSLRSRAASARPTPMGGFLPRSVSLYPECAVEAPTLADHRARAHLAAVEQRGPL SQRCLHLYSDTRLCKRYNACKVTVSL Prediction of potential genes in microbial genomes Time: Thu May 12 19:09:05 2011 Seq name: gi|319977103|gb|AEUH01000262.1| Actinomyces sp. oral taxon 178 str. F0338 contig00262, whole genome shotgun sequence Length of sequence - 5224 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1027 1136 ## COG0381 UDP-N-acetylglucosamine 2-epimerase 2 1 Op 2 . - CDS 1018 - 2163 1237 ## ELI_13320 glycosyl transferase, group 1 family protein 3 1 Op 3 . - CDS 2200 - 3561 2028 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 3593 - 3652 2.5 4 2 Op 1 . + CDS 3779 - 5080 1695 ## COG0677 UDP-N-acetyl-D-mannosaminuronate dehydrogenase 5 2 Op 2 . 
+ CDS 5137 - 5224 79 ## Predicted protein(s) >gi|319977103|gb|AEUH01000262.1| GENE 1 1 - 1027 1136 342 aa, chain - ## HITS:1 COG:PA3148 KEGG:ns NR:ns ## COG: PA3148 COG0381 # Protein_GI_number: 15598344 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Pseudomonas aeruginosa # 1 342 1 342 354 266 43.0 3e-71 MRVMSVVGARPQFVKLAPIDWALREAGIGHVIVHTGQHYDPLLSDVFFSDLRISPPDEHL GVGSGTHGVQTGRMLAAMDPVLARHRPDWALVYGDTNSTLAAALAAVKLHIPTAHLEAGL RSFNRSMPEEVNRVLTDHAADLLLAPTAAAAAHLEAEGLGGRAVVVGDVMADVLARVRDE VAGQGPSAVERSLGLTAGSYSLATIHRPSNTDDPARLDSLLTSLGRVGHPVVLLAHPRLV AACTRAGIAIARGGLVPYPPLPYPQLVSAMMRARGVITDSGGLQKEAFLLRVPATTVRAE TEWVETVELGWNVLVGTGDDLVEAASRPRPADTDAAPYGDGR >gi|319977103|gb|AEUH01000262.1| GENE 2 1018 - 2163 1237 381 aa, chain - ## HITS:1 COG:no KEGG:ELI_13320 NR:ns ## KEGG: ELI_13320 # Name: not_defined # Def: glycosyl transferase, group 1 family protein # Organism: E.litoralis # Pathway: not_defined # 1 363 1 373 379 136 31.0 1e-30 MTAREGLNLLVYPSPLESAGRLVKLALSLQGSGMFAATEVVGIALPGAPREEGLAPGALL RRVPGASLRSPMGALRILVAWQWRVYRRYRRARVTAVAAQNLFMLPMCHRLARRTGAVLA YNCHELETETIASRGLRQRIQRAIERRYIRRADVVSVVNDSIAQWYRDAYPGLRVEVVTN APLPSTGRVDLRARLGIPGSALLYVHAGFLAPGRNIEAILTAFERVPGAHVVFLGDGALG PRVRLAAERCPNIHWLPPVPPDQVVAHMRGADAALCLIEYSCLSHRLSTPNKMMEGFAAG IPVLCSDLPEARRLLGPRAGTWILADPRTGLEPALRRISRADIEAFEAPRIPSWDEGADR LVAAYGRAINGRRSRGGTPCA >gi|319977103|gb|AEUH01000262.1| GENE 3 2200 - 3561 2028 453 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 314 453 485 621 621 122 42.0 2e-27 MHTHIKRPARALGAALALAIAVPGVALAASAPEDGAQNDGPSGYRLVWHDEFDGDRLDPS NWDYQLGGWGDNTQQVYRRENVSVHGGALHLTGKYSPGTPSWNGNSQTTSYQDFTSGFVH SKNLRSFTYGYFVARVKLPKSRSSWSAFWLSPQNGAYGGWPRSGEIDVFESKGSLPGFVQ SNAHWFSEASANPRRQQRPNYSTRDTTVNTQDEFHTYALQWEPTKLTFFLDGVKTHEING 
PFTGMEHHGPGAPFDGPFYLRLNHAIGGKFLGDTPYQDARTARSDYEGNGTDMEVDYVRV FQRDGDTPPPSGRWMSDSRGWWWRQPDGSYPASTSMTIKGRTYRFDSSGYMRTGWVLDKG YWYYHDGSGAQLTGWISTGGYWYHLGSGGAMEIGWVRVGDTWYYLGASGAMTTGWQNIEG SWYYMDSSGAMLTGTHWINGAQRTFNSSGVWIG >gi|319977103|gb|AEUH01000262.1| GENE 4 3779 - 5080 1695 433 aa, chain + ## HITS:1 COG:PAB0776 KEGG:ns NR:ns ## COG: PAB0776 COG0677 # Protein_GI_number: 14521369 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetyl-D-mannosaminuronate dehydrogenase # Organism: Pyrococcus abyssi # 3 425 23 448 448 239 36.0 8e-63 MRIAVVALGKIGLPLAVHYAGKGHSVVGVDTDPAVVAAVNAGAEPFPGEAHLGARLREAV GGGRLRATTRYGEAVPGADAVVVVVPLFVDDATWEPDFALMDAATRSLAEHLDPGTLVSY ETTLPVGALRGRFKPMIERISGLREGDGFHLVFSPERVLTGRVFADLRRYPKLVGGLDEA GARAGVAFYEAVLDFDERDDLPRPNGVWDMGSAEAAEMAKLAETTYRDVNIGLANQFAVY ADRAGFDIGRVIEACNSQPYSHIHRPGIAVGGHCIPVYPRLYLSTDPGATVVRAAREANA AMPAYAVSRAAEVLGGLEGMRAAVLGASYRGGVKETAFSGVFPTVDALREAGAEVLVHDP LYTDEELAGFGWTPFHLGEGADVVIVQADHPEYARLEPADAPGARLLVDGRGATDPALWR GVPRITIGGGGPR >gi|319977103|gb|AEUH01000262.1| GENE 5 5137 - 5224 79 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVATRTFAPEPTAAALRLGAFAAALAERG Prediction of potential genes in microbial genomes Time: Thu May 12 19:09:17 2011 Seq name: gi|319977095|gb|AEUH01000263.1| Actinomyces sp. oral taxon 178 str. F0338 contig00263, whole genome shotgun sequence Length of sequence - 7550 bp Number of predicted genes - 11, with homology - 6 Number of transcription units - 5, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 52 - 576 559 ## Sked_08730 glycosyltransferase 2 1 Op 2 . + CDS 590 - 700 63 ## 3 1 Op 3 . + CDS 670 - 2697 3110 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 2743 - 2788 -0.5 4 2 Op 1 . + CDS 3151 - 3477 160 ## 5 2 Op 2 . + CDS 3509 - 3655 205 ## - Term 3515 - 3552 -0.3 6 3 Tu 1 . - CDS 3747 - 5165 2144 ## COG1091 dTDP-4-dehydrorhamnose reductase 7 4 Op 1 . 
+ CDS 5122 - 5265 99 ## 8 4 Op 2 1/0.000 + CDS 5363 - 6235 1426 ## COG1209 dTDP-glucose pyrophosphorylase 9 4 Op 3 . + CDS 6244 - 7014 909 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 10 4 Op 4 . + CDS 7011 - 7391 574 ## HMPREF0573_11069 hypothetical protein - Term 7371 - 7411 3.0 11 5 Tu 1 . - CDS 7442 - 7543 121 ## Predicted protein(s) >gi|319977095|gb|AEUH01000263.1| GENE 1 52 - 576 559 174 aa, chain + ## HITS:1 COG:no KEGG:Sked_08730 NR:ns ## KEGG: Sked_08730 # Name: not_defined # Def: glycosyltransferase # Organism: S.keddieii # Pathway: not_defined # 4 167 218 387 407 106 46.0 4e-22 MAPWLAPGVFVDALRLAAPRLPGARLVFLGQGSGWEALRERARGVEGVSFHAPVGADEAH AWLAAATASLASLRPGGYSYAYPTKILSSLAAGTPVVYAGEGEAARDVASADLGVAVGLD AAAVAGAMAAVAERAAEGGWDRLRLWRWVRQHRSALSSSRRAVLAALGADGPSG >gi|319977095|gb|AEUH01000263.1| GENE 2 590 - 700 63 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYPVSCLHAPHENLLSQNAHTQHEGLICAPCGNPSR >gi|319977095|gb|AEUH01000263.1| GENE 3 670 - 2697 3110 675 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 259 482 526 744 744 202 46.0 1e-51 MRPLWKPLAIGLALAVGAMVAPLAPQAGAADGITIALDPGHGGSDPGASANGLVEKDLTL AVGRALKAELETYQGVRVHMTREDDSRPSENISQDLSARVASSVAAGASALVSLHFNSGT AGSNGAEVWYPNASSYNYSTHTQGAALATAIQNQLTSLGLTDRGIKTRDNPYYDYPDGST GDYYAISRHARLSNLTGIIVEHAFLTSASDAARLREPGFVQSLAMADAAGIAQALGLSKG TWHNDGGRWRFGADGKYLTGWFSVGGVWYWGDSEGYTVTGWQVINWRWYYFDASTAMQTG WRQIDGAWYYLGSSGAMVFGWAKDGGAWYHLGASGRMDTGWILDGWSWYYLDPASGAMRT GWVEDGGTWFHLGSSGSMTTGWLSEGGSWYYFDPASGAMRTGWAAIGGSWYYLDPASGAM RTGWVEDGGTWFHLGSSGSMTTGWLSEGGSWYYLDPGSGAMATGTVVVDGRESVFAASGE WLGYASGGTTASGMHPVMAAPTAQRAAVIASMAAAFDDSGSSYPASLAAGGAPTINDFAA IAYDEAVAEGVSPELVFTQAMKETGWLGFGGDVSVDQFNFAGIGAVGGGAGGASFPDVRT GLRAQVQHLRAYADAGATASSLANPLVDPRFSYVSKGSAPYVEYLGVQENPNGTGWATAR NYGYDIIAMMKSYFG >gi|319977095|gb|AEUH01000263.1| GENE 4 
3151 - 3477 160 108 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPPPGHAHKDPDLRNTHRTQGRDQALVCGAVFPNPLPCVRRGGARGTRPTGPGKRVRNAT RRRRCQSASWDRPLRGPWIRAIRALPGGPARIRAPDGPATGGALVALP >gi|319977095|gb|AEUH01000263.1| GENE 5 3509 - 3655 205 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRIAKNPELAAAAPGGSAGSGAAVTLAAGGALVAVLTGVFFVRAWRVE >gi|319977095|gb|AEUH01000263.1| GENE 6 3747 - 5165 2144 472 aa, chain - ## HITS:1 COG:Cgl0332_2 KEGG:ns NR:ns ## COG: Cgl0332_2 COG1091 # Protein_GI_number: 19551582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Corynebacterium glutamicum # 192 472 1 270 271 252 53.0 1e-66 MVPTAKPLGIETTPIPGFLRIDLTVHGDNRGWFKENWQREKMVGLGLPDFRPVQNNISFN DEVGVTRGIHAEPWDKFVSVATGRVFGAWVDLREGPSFGTVYTTVIDPGTAVFVPKGVGN AYQTLEPATAYTYLVNAHWSPDARYTFLNLADETVAVPWPIPLERAVLSDKDRAHPRLAD VTPFPAPSGPRVLVTGANGQLGRELMRQLPAAGFEATGVDLPEVSIADADQMEAFDWSSF DVVVNAAAWTDVDGAETPDGRRASWLANATGPANLARACASHGLTLVHISSEYVFDGSAE VHPEDEAPSPLGVYGQSKAGGDAAVLAAPKHYLVRTAWVVGDGKNFIRTMASLACSGVRP QVVDDQVGRLTFTTDLAAGIIHLLSTRAPHGTYNLTGEGPVMSWADVATRVYELLGHDAS EVTRVSTEQYYADKGGAPRPLSSVLDLAKIEATGFTPGDSCARIDAYVASLA >gi|319977095|gb|AEUH01000263.1| GENE 7 5122 - 5265 99 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGVVSIPRGLAVGTMRVPLPSTALRTPQILSDGRARARHWAAGGDVR >gi|319977095|gb|AEUH01000263.1| GENE 8 5363 - 6235 1426 290 aa, chain + ## HITS:1 COG:MT0348 KEGG:ns NR:ns ## COG: MT0348 COG1209 # Protein_GI_number: 15839720 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Mycobacterium tuberculosis CDC1551 # 1 288 1 288 288 435 73.0 1e-122 MKGIILAGGSGTRLNPITLGTSKQLVPVYDKPMIYYPLSTLMLAGIQDVLVITTPHDAPS FHRLLGDGSQLGVNITYTVQHEPNGLAQAFVLGADHIGSDPAALVLGDNIFYGPGMGTQL RRHVDPDGGAVFAYHVSNPRAYGVVEFDADFTALSIEEKPAKPKSNYAVPGLYFYDNDVV AIARDLEPSARGEYEITDVNRAYLRAGKLKVEVLPRGTAWLDTGTFDSLADATAFVRTVE ARQGMKIGAPEEVAWRMGFIDDEGLRRRAEPLVKSGYGQYLLDLLVRDAE >gi|319977095|gb|AEUH01000263.1| GENE 9 6244 - 7014 909 256 aa, chain 
+ ## HITS:1 COG:SPy0794 KEGG:ns NR:ns ## COG: SPy0794 COG0463 # Protein_GI_number: 15674837 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pyogenes M1 GAS # 6 229 4 224 231 132 32.0 6e-31 MPDRLLIIIPAWNEEEVLGVVLDDVLEHKPADADVLVVSDGSTDATAAIARSRGVRVLDL PLNLGVGGAMRAGFQYAVRTGYTHACQLDADGQHDPKDIQALIDRAHEEGADVVIGARFS GKGDYSVRGPRKWAMGMLSAILSRVCRTPLKDTTSGFKLYGPRALALFIHNYPAEYLGDT IEALVIAARSDLTVRQVGVEMRPRAGGEPSHNPVKSAKFLVRAMVALVVALSRPAPGKKK GAPAPTTGAQGVKEGA >gi|319977095|gb|AEUH01000263.1| GENE 10 7011 - 7391 574 126 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0573_11069 NR:ns ## KEGG: HMPREF0573_11069 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 1 115 2 116 144 132 54.0 6e-30 MSPTYILGLVFAVIILVTVFLKMRNSGMKERYSLWWFVIAFFTALFSLFPPVLKWMARSL GVVVPLNLGFFLAGVVLLLLALRYSVDLSRADEDKRRLTEEAAILRAQVEDLDARVTELE AAKKDR >gi|319977095|gb|AEUH01000263.1| GENE 11 7442 - 7543 121 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPSRLARRLQRGEADPHAATERLTLGPDGRPRR Prediction of potential genes in microbial genomes Time: Thu May 12 19:09:47 2011 Seq name: gi|319977090|gb|AEUH01000264.1| Actinomyces sp. oral taxon 178 str. F0338 contig00264, whole genome shotgun sequence Length of sequence - 3580 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1340 1447 ## Bcav_2900 aminoglycoside phosphotransferase + Prom 1113 - 1172 2.4 2 2 Op 1 . + CDS 1209 - 2426 1320 ## COG4962 Flp pilus assembly protein, ATPase CpaF 3 2 Op 2 . + CDS 2501 - 3283 971 ## Arth_2672 type II secretion system protein 4 2 Op 3 . 
+ CDS 3294 - 3579 288 ## Predicted protein(s) >gi|319977090|gb|AEUH01000264.1| GENE 1 2 - 1340 1447 446 aa, chain - ## HITS:1 COG:no KEGG:Bcav_2900 NR:ns ## KEGG: Bcav_2900 # Name: not_defined # Def: aminoglycoside phosphotransferase # Organism: B.cavernae # Pathway: not_defined # 75 424 4 351 386 183 43.0 1e-44 MGERLDRLVDSPRDPPRLARRIDRARGQALAHPRRVLFDQGIVHRHLHVTCISDPFTEKL TEFSNKDQDPRPTVDNSSAVPPYTRQARYLGTVTMTPLKLAALATVAVPGLQVTGLRADS YSDEVRSVANIVDASGNRWTVTCSNETIAGPAAEAEAAILQRLATTHDVNRIPFDVPRIK GTTRTRDGNRVYVHQDLGGRPLEDSDLADDPLLPASLGRALAALHNLPERVYTDVSVPSH SAIECRAGLLSMLDEAPARAAVPVNLRIRWQGALDDLSLWRFPPAPIHGDLQGNAVYVSR GSVVGIAGFTSACVGDPAVDIAWVQAVASDAFLERFREAYSHEREATDLHLFTRAQLMSE LALVRWLLHGAHTDDHSVVEEARAMLADLSSDLGELRLVESRPHSSADDLLSDTPIASAT ARRAPGREEQAGQGQRRAGREERAGQ >gi|319977090|gb|AEUH01000264.1| GENE 2 1209 - 2426 1320 405 aa, chain + ## HITS:1 COG:RSc0652 KEGG:ns NR:ns ## COG: RSc0652 COG4962 # Protein_GI_number: 17545371 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Ralstonia solanacearum # 20 356 53 391 453 223 37.0 5e-58 MDDALVEEHAAWVRERLAARAINPAREAGRVARTVDEAVEALAHPLSPQERDRLVRALMA EICGLGPIQDLLDDPEVEEVWINSASRVFAARAGVSELTTLILREEQVRDLVERMLAHSG RRLDLSHPFVDATLDTGERLHVVIPPVTSQSWSVNIRKHVQRARTLADLVDAGMVPAPVA AFLRAAVAVGLSVVVSGGTQAGKTTLLRALAGELPRSRRVVTCEEVFELSLANWDHVAMQ TRPPSVEGRGEIALRDLVRESLRMRPECLVVGEVRGAEALDLLVALNAGVPGMATVHANS AREALDKLAVLPLLAGENVTSGFVVPTLAASVDLVCHVHRAPDGLRHLSEVLAVPGRVEG TRIETAVLWEYDGARLERGSGALDMHERFAAAGHSLSDLLSWGRP >gi|319977090|gb|AEUH01000264.1| GENE 3 2501 - 3283 971 260 aa, chain + ## HITS:1 COG:no KEGG:Arth_2672 NR:ns ## KEGG: Arth_2672 # Name: not_defined # Def: type II secretion system protein # Organism: Arthrobacter_FB24 # Pathway: not_defined # 9 260 33 284 284 197 48.0 4e-49 MRLAAPAWARRWGDMVRRADLPGMDGPRLVLVCLACALLALVASFVVTETWAVGLAMGVV AAPLPAAWVSSRSHRRAREAREAWPEVVDSLVSGVRAGAALPTLLADLGSGGPPALRGAF 
EAFGQDYRAHGRFDVALDRLKDRLADPVADRIVEALRLARSVGGADLARLLTDLAVLLRE DARVRGELEARQSWTVGAARLGVAAPWAVLLLISQGARSARVWNTPQGMAVLALGGACCV AAYALMRVLSRLSTDPRTMR >gi|319977090|gb|AEUH01000264.1| GENE 4 3294 - 3579 288 95 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLFVLSSHLAPVAAGLLLGAGLVLVAVCARDQFPDFSARVAQGSPAPAPRGPTWSGLARL WGARLLDSLGSTTRSVERRLELAGSPTDAVGFRAA Prediction of potential genes in microbial genomes Time: Thu May 12 19:10:04 2011 Seq name: gi|319977084|gb|AEUH01000265.1| Actinomyces sp. oral taxon 178 str. F0338 contig00265, whole genome shotgun sequence Length of sequence - 3867 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 538 765 ## COG2064 Flp pilus assembly protein TadC 2 1 Op 2 . + CDS 565 - 756 364 ## gi|154507635|ref|ZP_02043277.1| hypothetical protein ACTODO_00116 3 1 Op 3 . + CDS 728 - 1147 435 ## Arth_2669 TadE family protein 4 1 Op 4 . + CDS 1134 - 1559 418 ## Bfae_20370 hypothetical protein 5 1 Op 5 . + CDS 1552 - 2118 520 ## gi|293189310|ref|ZP_06608033.1| conserved hypothetical protein 6 2 Tu 1 . 
+ CDS 2340 - 3701 2313 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase Predicted protein(s) >gi|319977084|gb|AEUH01000265.1| GENE 1 2 - 538 765 178 aa, chain + ## HITS:1 COG:PAB1459 KEGG:ns NR:ns ## COG: PAB1459 COG2064 # Protein_GI_number: 14521587 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein TadC # Organism: Pyrococcus abyssi # 26 136 90 201 308 61 37.0 1e-09 VAGIVVVALLGCVVGMSAWDRLLTARADSRQRRVEAAVPDASELLALAVGAGESVPAALD RVASLSHSDLGDELARAVADIRLGAPSVRALTDLAARNDSPALSRLCQTLSTAIGRGSPL AAVLHDQARDIREASRQRLMEEGGKREIAMLFPVVFLILPVTVLFALYPGLMALDITP >gi|319977084|gb|AEUH01000265.1| GENE 2 565 - 756 364 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154507635|ref|ZP_02043277.1| ## NR: gi|154507635|ref|ZP_02043277.1| hypothetical protein ACTODO_00116 [Actinomyces odontolyticus ATCC 17982] # 20 63 13 56 56 77 97.0 2e-13 MRTLLNRVRAALRRSEDDRERGDVPGWVLITLMTAALVVLIWGVASDRLVEIFNRAMDSI AGI >gi|319977084|gb|AEUH01000265.1| GENE 3 728 - 1147 435 139 aa, chain + ## HITS:1 COG:no KEGG:Arth_2669 NR:ns ## KEGG: Arth_2669 # Name: not_defined # Def: TadE family protein # Organism: Arthrobacter_FB24 # Pathway: not_defined # 15 132 3 121 122 67 40.0 1e-10 MRWIRSRASEGAGGEEGSEVASTVLVQTLVVLVVLALVQLAFALHVRSLTVSAASEGARR GGLLGGDAAEAVARTNEVLATIAGGARDSEVTTSYEEIGGRSVLVVTVRAPLPLLLGMGP RWMSVRGTSLVEEGTDDGG >gi|319977084|gb|AEUH01000265.1| GENE 4 1134 - 1559 418 141 aa, chain + ## HITS:1 COG:no KEGG:Bfae_20370 NR:ns ## KEGG: Bfae_20370 # Name: not_defined # Def: hypothetical protein # Organism: B.faecium # Pathway: not_defined # 5 135 14 147 155 71 39.0 9e-12 MTADRGRADGERGDAVVEFIGFFAVLVVPVVYIIVAASWVQAAVFATDAGAREAVRIIAS HPDDGEERAQAQVGLAFADFGVAGQPVVEASCEGCASPGGQASVTVSTVVPLPLVPRWLG APGIPVQSSAVSPVQEVDADG >gi|319977084|gb|AEUH01000265.1| GENE 5 1552 - 2118 520 188 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293189310|ref|ZP_06608033.1| ## NR: gi|293189310|ref|ZP_06608033.1| conserved hypothetical protein [Actinomyces 
odontolyticus F0309] # 15 185 5 140 147 77 39.0 4e-13 MGRGSGRAVPGRDPEEGRVGLLTCAACAFVCVFLLVSIAVTGVAVQDRRLLACADRVAAT AAGVIDGASVYAGGSGAGAGQGAGNGADEGADADPGAAADADGGGADADGVPASLPDATA AAERALAQMGGTTCAVGEGVRIESASLADGDLRVSVRARATVAFLPSFLATAAAPVLVRS SSARVHEG >gi|319977084|gb|AEUH01000265.1| GENE 6 2340 - 3701 2313 453 aa, chain + ## HITS:1 COG:Cgl2027 KEGG:ns NR:ns ## COG: Cgl2027 COG0334 # Protein_GI_number: 19553277 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Corynebacterium glutamicum # 17 453 11 447 447 585 65.0 1e-167 MSIFDPSRLSPSLQTVFQAVLDRDPAQAEFHQAVYEVLLTIAPLVERHPEYADLSVIDRV VEPERQLLFRVPWTDDAGRVRVNRGFRVEYNSALGPYKGGLRFHPSVNLSIIKFLGFEQI FKNALTGLPMGGGKGGSDFDPKGKSDAEVMRFCQSFMTELSRHIGPDTDVPAGDIGVGGR EIGYMFGQYKRLKNRFDAGVLTGKGLTWGGSYVRTEATGYGLVYFAAEMLAAKGSGFDGK RVVVSGSGNVAIYATEKAQQLGATVIAVSDSSGYVLDEAGIDVPLLKDVKEVRRGRVSDY AAERPSATFVPSGAIWDVPCDVALPCATQNELPLSGAQSLIKNGVQLVAEGANMPTTPEA IEALQAAGVTYAPGKASNAGGVATSGLEMQQNSGRTQWPFAETDAKLHQIMVDIHANTVA TAEEYGVPGDYVAGANLAGFRKVADAMVSLGVI Prediction of potential genes in microbial genomes Time: Thu May 12 19:10:41 2011 Seq name: gi|319977054|gb|AEUH01000266.1| Actinomyces sp. oral taxon 178 str. F0338 contig00266, whole genome shotgun sequence Length of sequence - 31073 bp Number of predicted genes - 31, with homology - 25 Number of transcription units - 17, operones - 9 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 + CDS 2 - 760 854 ## COG0642 Signal transduction histidine kinase 2 1 Op 2 . + CDS 760 - 1458 907 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 3 2 Op 1 21/0.000 + CDS 1582 - 2634 486 ## PROTEIN SUPPORTED gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 4 2 Op 2 17/0.000 + CDS 2748 - 4475 2580 ## COG1178 ABC-type Fe3+ transport system, permease component 5 2 Op 3 . 
+ CDS 4499 - 5677 1670 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components + Term 5705 - 5763 3.7 6 3 Tu 1 . - CDS 5509 - 5823 97 ## 7 4 Op 1 4/0.000 + CDS 5953 - 7071 1795 ## COG1186 Protein chain release factor B 8 4 Op 2 28/0.000 + CDS 7162 - 7851 284 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 9 4 Op 3 2/0.000 + CDS 7871 - 8785 1452 ## COG2177 Cell division protein 10 4 Op 4 . + CDS 8789 - 10060 1637 ## COG0739 Membrane proteins related to metalloendopeptidases 11 5 Tu 1 . - CDS 10067 - 10234 122 ## 12 6 Op 1 . + CDS 10149 - 10676 926 ## COG0691 tmRNA-binding protein 13 6 Op 2 1/0.250 + CDS 10695 - 13481 3049 ## COG1511 Predicted membrane protein 14 6 Op 3 . + CDS 13478 - 15685 2542 ## COG1511 Predicted membrane protein 15 7 Op 1 . + CDS 16691 - 17308 -138 ## 16 7 Op 2 . + CDS 17341 - 17871 357 ## MAB_0271 hypothetical protein 17 7 Op 3 . + CDS 17858 - 18820 581 ## KRH_07350 hypothetical protein + Term 18933 - 18973 13.6 18 8 Op 1 . - CDS 19270 - 19698 412 ## 19 8 Op 2 . - CDS 19691 - 19888 132 ## 20 9 Tu 1 . + CDS 20094 - 20411 435 ## COG0393 Uncharacterized conserved protein - Term 20486 - 20515 1.1 21 10 Tu 1 . - CDS 20637 - 22589 2709 ## COG2233 Xanthine/uracil permeases 22 11 Op 1 1/0.250 + CDS 22854 - 23936 826 ## COG1609 Transcriptional regulators 23 11 Op 2 . + CDS 23976 - 24689 877 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases - Term 24659 - 24699 2.5 24 12 Tu 1 . - CDS 24807 - 25043 164 ## 25 13 Tu 1 . + CDS 25042 - 25323 521 ## COG0776 Bacterial nucleoid DNA-binding protein - Term 25248 - 25292 -0.9 26 14 Op 1 . - CDS 25417 - 27162 2266 ## COG1061 DNA or RNA helicases of superfamily II 27 14 Op 2 . - CDS 27159 - 27485 319 ## HMPREF0573_10096 hypothetical protein 28 15 Tu 1 . 
- CDS 27698 - 28477 1280 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase - Prom 28530 - 28589 2.3 - Term 28560 - 28600 3.7 29 16 Op 1 8/0.000 - CDS 28693 - 29310 397 ## PROTEIN SUPPORTED gi|157155704|ref|YP_001464307.1| putative deoxyribonucleotide triphosphate pyrophosphatase 30 16 Op 2 . - CDS 29307 - 30074 1049 ## COG0689 RNase PH 31 17 Tu 1 . + CDS 30111 - 31061 948 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding Predicted protein(s) >gi|319977054|gb|AEUH01000266.1| GENE 1 2 - 760 854 252 aa, chain + ## HITS:1 COG:MA1149_2 KEGG:ns NR:ns ## COG: MA1149_2 COG0642 # Protein_GI_number: 20090015 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 34 242 30 245 279 111 32.0 1e-24 AVRQWRCAGRARAELAALRAVHEERVRRPGVLFHELRTPLTVVRGAAELLLDGSAGPLNP VQHRFAKTIAENSNQVIEMADDLLAEIRMESELFTLRPRRVEIRGLVRDNVAQMRRFHSA NIRLDNHGAPIHVRVDPRLMGQAIANVVNNAARHAGEGVSILVAVADSEDDVTISVSDNG AGMSAEERERLFVPFATGGSVRPGTGLGMMITQKIVELHGGRVLVDTIATRGTTFYLTLP RRQWPAPARGAA >gi|319977054|gb|AEUH01000266.1| GENE 2 760 - 1458 907 232 aa, chain + ## HITS:1 COG:BH3157 KEGG:ns NR:ns ## COG: BH3157 COG0745 # Protein_GI_number: 15615719 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 9 226 4 232 239 154 37.0 2e-37 MGAQGEAPRALVVDDEPQILMIMRFALETAGFEVVTAGNGAKAWELFKSGSFDLVVLDLM IPVVSGLVIAERIRAVSDVPIMMITALSEEGDRVRGLETGADDYVTKPFSPREFTLRALA LVRRWRGGGSAVVRNGALEVDAAAHRVFLSGRRIGLPATEERFLGVLAARVGEAVTYREL LNLVWSTHETSGGKDMIKTTAYRVRQALGPQGAQYVRSVRGVGYMMPVIAPG >gi|319977054|gb|AEUH01000266.1| GENE 3 1582 - 2634 486 350 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 [Haemophilus parasuis 29755] # 32 327 25 323 346 191 34 4e-48 
MTNRRLAPLAGVAAVGMLALSACVVTDEQIQSAAGSETITIACGAMEDLCQKWTETFTAS TGITATYVRLSSGEAVARLDSAKDNPEFDVWHGGPVDGYGAAVEKGLIEAYDSPSAAEIP AKYKDPGHNWTGVYVGVLGFCSNKAVLDNLGVEVPDSWDDLLDPALKGQISTAHPSTSGT AFTTLWTQVVLRGGEDGALDYMKQMHNNVLQYTKSGTAPGQIAGRGEVGVGLVFSHDCVK YRDEGMKDLVVSFPKEGTGYEIGGVALVANSKHSAAAKKYIDWAISPEAQNIGQTVGSNQ VLTNPKAEANDKMVKLDEVSLIDYDFAAASAAKPALTARFDEEIAAQPRE >gi|319977054|gb|AEUH01000266.1| GENE 4 2748 - 4475 2580 575 aa, chain + ## HITS:1 COG:VCA0686 KEGG:ns NR:ns ## COG: VCA0686 COG1178 # Protein_GI_number: 15601443 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Vibrio cholerae # 62 568 189 691 700 293 37.0 6e-79 MPPQAPAPARPARAARTRVRQDPVTVAIVGVIALVLVLLVLAPLATVFSRAFSKEGTEVL SSILSSTTNRTIIVNTIVLGCVVGLAGTLVGFFMAYVQARVDMPGKKLLHLICLVPIVSP PFAVATSSITLFGRNGIISSQIFGQRWDIYGLSGLTLVLTLSFFPVAYMNMLGMLRNLDP AMEEAAASLGASPWRVFRTVTLPMLIPGFAASFLLLFVEAIADLANPLVIGGDFTVLASR AYIAINGEFNTAAGSAYSLVLLVPALLVFLLQRYWAGRSSAVTVTGKPTGRVRMVRANAA RIPLGAVTAVLAGFVVVVYATVIIGAFVNILGVDNTFTLRNFQYVLSGIGNDAMIDTTIL ALIATPIAGLLGMVVAWLVVVRLRASAGVMDFLGMLGLSVPGTVLGIGYLITYNKPVIIG HVMLLPALAGGSAVFGGALAIILVYVARSMPSGQRSGIASLQQIDKSIDEASTSLGASGL QTFVKVTLPLIRPAFIAGLTYAFARSMTTLSPIVFITTPKTKIMTSQILAEVDAGRFGNA FAFCTILIVIVMAVIGLTNLLVRDKSVAAQSRAGL >gi|319977054|gb|AEUH01000266.1| GENE 5 4499 - 5677 1670 392 aa, chain + ## HITS:1 COG:PA3607 KEGG:ns NR:ns ## COG: PA3607 COG3842 # Protein_GI_number: 15598803 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Pseudomonas aeruginosa # 25 392 6 363 363 293 45.0 4e-79 MSTTQSSLESAAGDGAPADEAQGSLSLASLTKTFGQGDSAVTAVDHVDLDILPGEFITLL GPSGCGKTTTLRMIAGFEDATSGEVLLDGENMVSVPPNKRPMSMVFQSYALFPHLSVREN VAYGLKLRKMPDAQVDEAVDVALAAMNLTAMAGRAPSQLSGGQQQRVALARAMVVRPKVL LFDEPLSNLDAKLRVKMRVEIRRMQKRLGISSVYVTHDQSEAMAMSDRIVVMNAGRIEQV DTPAEIYLHPASVFVADFVGRANFLPAQVLDAEGDRVRVRALGGELEVRAQADALAAARS 
GGDVVLLVRPESLRLSPLDEAPQALTGGLGQIVTSIFYGETVEYEIETESGSLVCVVSDP REDEILAEGQVVRIGIEAEKAWLLESGETASA >gi|319977054|gb|AEUH01000266.1| GENE 6 5509 - 5823 97 104 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCAAPCCLIPSLVCGTGRERALNEGPTPGRSRADAPQDGPVKRTPVRGRQALAVSPDSSS HAFSASIPMRTTCPSARISSSRGSDTTQTREPDSVSISYSTVSP >gi|319977054|gb|AEUH01000266.1| GENE 7 5953 - 7071 1795 372 aa, chain + ## HITS:1 COG:Rv3105c KEGG:ns NR:ns ## COG: Rv3105c COG1186 # Protein_GI_number: 15610242 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Mycobacterium tuberculosis H37Rv # 8 364 15 372 378 409 63.0 1e-114 MAIEFTEEIQSLRLTMDNVLAVVRPERLRAQIAELNEKAAAPDLWDDPARAQEVTSALSH RQSELDRVVRMGERIDDLEAMVEMAGEDPDEAEEILDEAAADLDTLKKDVSDLEIRTLLD GEYDERNAVITIRSGAGGVDAADFAEILLRMYLRWAERHGYATKVMDTSYAEEAGLKSAT FEVQAPYAYGTLSVEAGTHRLVRISPFDNQGRRQTSFAAVEVIPLIESTDHIEIPESELK VDVFRSSGPGGQSVNTTDSAVRMTHIPTGIVVSMQDEKSQIQNRAAALRVLQSRLLVLRH EEEQAKKKELAGDVKASWGDQMRSYVLQPYQMVKDLRTEFESGNPQAVFDGDIDGFINAG IRWRKNGQSGDR >gi|319977054|gb|AEUH01000266.1| GENE 8 7162 - 7851 284 229 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 221 1 222 245 114 30 1e-24 MIRFDDVTMVYTPGAQPALDHVSLEVEREEFVFLVGKSGSGKSTFLQLVMREMKATSGKV WVLGKDVSRLSTWAVPKLRRQIGTVFQDFRLLPSKTVYENVALAMQVIGKPKHAIESAVP DALELVGLSGKEQRLPHELSGGEEQRVAIARAMVNRPEVLLADEPTGNLDPETSLGIMRL LDRINRTGTTVVMATHDADIVNQMRKRVIELKQGEVVRDQSRGVYGAAR >gi|319977054|gb|AEUH01000266.1| GENE 9 7871 - 8785 1452 304 aa, chain + ## HITS:1 COG:Cgl0780 KEGG:ns NR:ns ## COG: Cgl0780 COG2177 # Protein_GI_number: 19552030 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Corynebacterium glutamicum # 1 303 1 299 300 127 27.0 2e-29 MKLRFIFSEVGKGLSRNRAMSVAVILVTYVSLLFVGIAGLSQMQVSKMRTDWYDKIEVSV YMCAIDDRSENCNGAEATEEQIAAVRARLDSQEMGQYVQQYYEESKEEAYENFVKLNGDS 
NLGQWTTPDMLQVSFRIKLVNPEQYSVIKEEFSGTPGVSEVRDQREIVEPLFKVLGAART AALGLGGIMVVAAVLLISTTIRLSAMSREQETQIMRYVGASNLFIQAPFMIEGALAALVG AGLAIGTLFAGVHFIVQGWLAPSFKWTNFIGMAHVGIMSPVLIAAAVLLAVIASAFSLAK YTRA >gi|319977054|gb|AEUH01000266.1| GENE 10 8789 - 10060 1637 423 aa, chain + ## HITS:1 COG:VC0630 KEGG:ns NR:ns ## COG: VC0630 COG0739 # Protein_GI_number: 15640650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Vibrio cholerae # 294 419 271 392 433 90 41.0 7e-18 MGADRRRSRARFARACGALAVALALGAVVAPTTVADDDRDAAEAAKEAAEASMNELRAQL EGIDANLAQVFVDLQSLNGQIPEAQTKADEAGTHYDTAVREHAVLEGQLRAAKSEKASID SEITQAEQDQSQASAAIGELVRRKYREGGASAVTLALTAEGSASIEQRASAADAALRAET QTVNAALDVQSSQRTQQARQEAITNRIGSLEEQARQAEEDAQAAKQEADSTLAELNTLRA DAQAKQTEWDSRKGEVEASLAQAEADYQARSAELSQIDEANQASGATYTSSSGFRSPLDI PIVVTSPFGMRYHPVLGIMKGHSGTDMAADCGTVIRAVASGYVNAVSADVSAGNYVDVNH GMVGGNSVITEYLHMQAQYVSPGQYVNAGDALGEVGSTGYATGCHLHFGVRENGSYVEPM DYL >gi|319977054|gb|AEUH01000266.1| GENE 11 10067 - 10234 122 55 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRATVFFASEAALRRSPSVAFGFFHSLGTTAPLRRRIPFGRKGRPPQRKHYEGQS >gi|319977054|gb|AEUH01000266.1| GENE 12 10149 - 10676 926 175 aa, chain + ## HITS:1 COG:Cgl0781 KEGG:ns NR:ns ## COG: Cgl0781 COG0691 # Protein_GI_number: 19552031 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Corynebacterium glutamicum # 13 171 2 160 164 154 49.0 7e-38 MPKEWKKPKATEGERRKAASDAKKTVARNKRARHDYLIEDTWEAGLSLMGTEVKALRMGR ASLVDSWVEVSGGEAWLYGANIPLYAQGSWTNHAPTRKRKLLLHRAEIDRMASKASAKGY TIVPLELYFMGGRAKVEIALAKGKQEWDKRQALREKQDQREAERAMRRYVKQARR >gi|319977054|gb|AEUH01000266.1| GENE 13 10695 - 13481 3049 928 aa, chain + ## HITS:1 COG:CAC2582 KEGG:ns NR:ns ## COG: CAC2582 COG1511 # Protein_GI_number: 15895842 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 5 681 8 686 722 243 25.0 1e-63 
MRGPVRILRRDLLRLVSVPAAWIVIIGLAFVPALYAWFNIVGFWDPYNQTSRIRVAVANE DEGWTREPIGFVNVGAMLQERLADNDQLGWHFVSADQARAEVERGDSYAAFVIPADFSKS LTGVVDGTFTRPDIAYYVNEKNNAVAPKITRAGATALDNQINSAFVATVARTLSERVSEA GASLASTLETGQSELVGAMGRAASGLGSARSSLEGLDPQIEGAKASVSTARTALSGIDGA ASSAEASLGTADQLVSDARTSLSSFSSGLSSSLDGLASQASQAAAGAASAGGALDGGLQG AAGAVGGLLGEASSINTANGQLLADLQGLPIASDPAVSSVLGSLSAQNTAVGSAITDLTA LNSSISGTSTAVATALDSMSSAASAVSGAVSSGRAALDDQVPAISSALDQMAASSAQLRA ALATARSLRAQVSGLLDQMDSLLDGASAASSQSAANLGAIESDLSSASTDIQSIASSNSL TALAASLGVSPEAIAAFVASPTSIQTEAVFPVAAYGSAMAPLFTNLALWVGAFSLVILFK LEVDEEGTGPLTSAQKYLGRWMLLAVFAVAQALVVSAGDLVIGVQTASRTAFVGTAVLVS LAFLSIIYMLATCLQHIGKGLCVIIVVVQIPGAAGLYPIEMMPSFFRALHPALPFTYGIN ALRETVGGFYGNHYVLDIGVLGAHTLIAFAIGLALRPHLVNLNAMMTRELSQSGLFVAEA TRVPSSRYRLTQIISVLADHEGYQRGVARRIERFERRYPAIKRGGLAAGIVVPAVLAVLS VTNAAEVPIVLGAWIVWLLLVILFLVGVEYLREALDRQRLLSTMGESDVRSLLRRRAAGA RARLMAGAAAVTASLASVFDTGRSEAGEEEPPHGADAERAGEDAGADGAGGIDGGSGSGG PSSGADSCSCGGADEGAGADSAGARGGE >gi|319977054|gb|AEUH01000266.1| GENE 14 13478 - 15685 2542 735 aa, chain + ## HITS:1 COG:MTH1858 KEGG:ns NR:ns ## COG: MTH1858 COG1511 # Protein_GI_number: 15679846 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanothermobacter thermautotrophicus # 1 709 1 604 631 234 27.0 5e-61 MRTIWAVYRADLRRARRSVISLVVVLGLTAIPALFTWFNVAASWDPFSNTKNLTIAIANT DQGYKSDLVPLTVNIGDQVVAALRANEDFDWAIESEEGAVEKTRSGQYYAAVVIPADFSK DMMTFFSSPDARSAPLTYYINEKKNGLAPKIAGQGAEHLSAQVNQVFAQTLAEIALDTAS SIAGAMEDPSNTRALGALDTRIQTVASRLAAAANSAEAYAALVDASLTLLDSTSALVSNA SGAGAAAQSGASGLASGGTGLAGAVQTATASVVGALSASQSSLSALSESIEGVYSQADGA SASAVASLRSQAGVFEGQAAQYASIKSTLAGLTGSPVPSEQLDRLQSAVDRLNGLAEGLR SAATALEGKNTSVQTNHATVAQLIADAQSAVSTVRGDYDNTLKPQLDSLASSLEAGAGSL ENVRAALASAASGVNDGTGGARSGLTRLRDAFKGAAESLREAEGKLTSMHDSLAEALTSG DIARVRAIIGSDPKALAAALAAPVGIETNRVFPVDNFGSQMAPLYSVLALWVGSVLMVIA MRSDVTDDNAADGLNEDDPLRRYFASHRVPLASGFIGRYLVFGTIALLQATLVFGGDIVF LKVQHTHPWLFMCVGWLTAVVFSFMLYTLVATFGNAGKAIGVLLLVLQISGAGGAFPLVL LPPFFSAVSPFLPATHAITALRAAIAGYSGAEYADAMWMLAAFIPVCAFGGLALRPLLVS 
KNRALVGELESTKLI >gi|319977054|gb|AEUH01000266.1| GENE 15 16691 - 17308 -138 205 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSVRQEGAHGAEEEPSIRDEGEEPGDDADQRSGPKGNGPGRTSGHEPDPADVPEDTNPST GLSHSGLGLDGQVFTGGDDSSVQALIAQSFVSRSGPLPDPEDLAGFERALPGCAERIVRM AEKAQDAAIEDGHLSTRAEASALKWTAVSLSLVPLLMMGVAGFLAFVGKSTGAAWVAVAA AAVGGLPRVVEALKARPGPSRRDGS >gi|319977054|gb|AEUH01000266.1| GENE 16 17341 - 17871 357 176 aa, chain + ## HITS:1 COG:no KEGG:MAB_0271 NR:ns ## KEGG: MAB_0271 # Name: not_defined # Def: hypothetical protein # Organism: M.abscessus # Pathway: not_defined # 1 171 1 189 190 62 29.0 5e-09 MIPPLNDDGFLPPGRWGAAFNEVRDRFASGPTPRRAEVWAEFEQALALLRSTVHVSRVWL GGSFLTSKAEPGDVDAVFLIRADQLERAMRDDYGSRVVGLAASGHMFKKVTGLRVDSFIV PWALEDTRDVTVPEYQQRRGYWDDWWERRRRAPGEPLESEAHQRRGYVEVVIDGNR >gi|319977054|gb|AEUH01000266.1| GENE 17 17858 - 18820 581 320 aa, chain + ## HITS:1 COG:no KEGG:KRH_07350 NR:ns ## KEGG: KRH_07350 # Name: not_defined # Def: hypothetical protein # Organism: K.rhizophila # Pathway: not_defined # 8 319 6 315 317 139 32.0 1e-31 MATADGLDELRERILRDAPAPQWRTPDDLARLSVEAFVSQRATGRVLRSSGELRFTGAGV RGHEASLPAVVRVMGALQRLVDATGAAVEGVRTGAGKLPEGIRRRTRLQLAADPQPGSLR LVFSPESGVAEELGDTVPLAIDADDERLLADRAFEDLMALLAVGAEDAPVADRFADDVRE HGPRVASALKAYADALVHGGFTTDLVWKEPELPTRRVTTTSADAARISSIIDGRDLDEES ASVEGLLIGISRERRVALVVDTDGEGGAVQILRGGLQDADIDSLHTGSRVRVEVVEKPRV SAAGETSIIRRASRVEVLSG >gi|319977054|gb|AEUH01000266.1| GENE 18 19270 - 19698 412 142 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTEPSTPEHEADPSPDGGAVPSPDVRLPDPDHADEPAIAAEKHDDAGLDAPVQADLDVLN SGISPKWIIEVANWHPFLALFAAYAYSRYPGARSRAEKMNRAATEYKFFYAAVKVFDWMF LLTMLLAPAVLVGLALWKLIIK >gi|319977054|gb|AEUH01000266.1| GENE 19 19691 - 19888 132 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPGLEGKAVGRLRVMAREGAKNTLDQYWDGVREPRDGERPPIDMGDRFGARPQATAHRMG VLGID >gi|319977054|gb|AEUH01000266.1| GENE 20 20094 - 20411 435 105 aa, chain + ## HITS:1 COG:lin0240 KEGG:ns NR:ns ## COG: lin0240 COG0393 # Protein_GI_number: 16799317 # Func_class: S Function unknown # Function: 
Uncharacterized conserved protein # Organism: Listeria innocua # 1 105 1 105 110 105 55.0 2e-23 MLVITTPTLEGHPIQQYLGMVSGETIAGVNLFKDFGAGLTNMFGGRASSYENELQEAART AVGEMCGRAGQMGANAVVGVHVDYFTAGADNGMLAAIATGTAVII >gi|319977054|gb|AEUH01000266.1| GENE 21 20637 - 22589 2709 650 aa, chain - ## HITS:1 COG:Cgl2307_1 KEGG:ns NR:ns ## COG: Cgl2307_1 COG2233 # Protein_GI_number: 19553557 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Corynebacterium glutamicum # 4 449 12 453 453 478 59.0 1e-134 MAHTPRAASPTTTHPVDRVPPTGKLLVLAVQHVLAFYAGAVVVPLVIASGLGLDNRTLVH LINADLFTCGIASIIQSAGIGKRIGVRLPLIQGVTFTAVSPLIAIGAAATPAGADPNTGL ATMYGSIIAVGLIVFFAAPYFAKLLRFFPPVVTGTLLTVMGTTLIAVSAGDVVGWASTAD DASKAGAVLEGLGFALGTIAIIVIVQRVFTGFASTLSVLIGLVVMTGVAFALGRADFSEV GGASWLGVTTPFYFGLPKFSASAVFSMLIVMAVTAVETTGDVFATGEVVGKRITPAHIAN ALRADGLSTFLGGVLNSFPYTCFAQNVGLVRLTRVKSRWVVTAAGALMIVLGVLPKAGAV VAAIPQPVIGGASLAMFASVAVVGIQTLSKADMRDNRNAVIVSTSVGLAMLVTLQPSIAE AMPPWLRILFGSGVTIGSLTAVALNVVFFHIGRPTSAAVALVGGRSVTLDEVGAMGRDEF VSVFSRLHEAHTWPAERAWEHRPFASVEELRRAFEDEVLAATPDEQEALIGAYTDIVDLL LTDCGDERAGLDTASMALGEFDDHEAQELRALGAAYREKFSRPLVMCVDNLADRAQLLAS GWRRVESSPAREARFALGEVIDIADARFTSLVADANPLRAAWSEGFEQLD >gi|319977054|gb|AEUH01000266.1| GENE 22 22854 - 23936 826 360 aa, chain + ## HITS:1 COG:ECs0398 KEGG:ns NR:ns ## COG: ECs0398 COG1609 # Protein_GI_number: 15829652 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 12 350 6 343 360 160 33.0 4e-39 MGTARGGRQPAMMDVARLAGVSHQTVSRVVNDTGHVAEETRSRVLAAIEQLGYRRNSVAR ALVTRRSSIIGVITTTSAHYGSASLLVSLEAAAREAGFFTGVTALSDYSAASLAGAIDRF LGLAAEAVVVAAPVAGLAEAMGAMGAPVPVVAVSAVQCASPRLRSVRADQVGGARSAVRY LVERGHRDIVHIAGPGDWYEARAREAGWRAEMEERHLVPREPLGSSWECAGGYEAGRRLA AQGLPDAVFAANDAIALGLLRALDEAGARVPDDVSVVGFDDEPAAAFYGPGLTTVRQDFA ELGRCAVRAACGAITGEGAPGAVVVPTRLVERRSVADRRGQGTAGQRGSAGGVDKRAPWR >gi|319977054|gb|AEUH01000266.1| GENE 23 23976 - 24689 877 237 aa, chain + ## HITS:1 COG:TM0283 KEGG:ns NR:ns ## COG: TM0283 
COG0235 # Protein_GI_number: 15643052 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Thermotoga maritima # 7 236 29 253 254 183 40.0 3e-46 MDHMEPDTWDAAFSALVERTRAAVCALHAELPRWGLVVWTAGNVSQRVPGPAADGSADLL VIKPSGVSYDDLTPESMVVCDLDGRLVLGEGAPSSDTAAHAYVYSHMPRVGGVVHTHSTY ATAWAARGEEIPCVLTMMGDEFGGPVPVGPFALIGDDSIGRGIVEALSDSRSPAVLMRNH GPFTIGRDARAAVKAAVMVEEVARTVHIARQLGEPVPIDPDLVDSLYARYQNVYGQH >gi|319977054|gb|AEUH01000266.1| GENE 24 24807 - 25043 164 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEEPHSFSFLVTTFVVTRKTLRNLHGICRLYGGISRKSPRFPAIYEKKRSILRRRTLNRA GSAGERHAETGHHWDIGR >gi|319977054|gb|AEUH01000266.1| GENE 25 25042 - 25323 521 93 aa, chain + ## HITS:1 COG:HI0430 KEGG:ns NR:ns ## COG: HI0430 COG0776 # Protein_GI_number: 16272378 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Haemophilus influenzae # 3 91 47 135 136 68 47.0 2e-12 MSVNRTELVAQIAERANLTKVQADAALGAFQDVLVESLSKGEPVKVTGLLSVERVERAAR TGRNPRTGEEIKIPAGFGVKLSAGSTLKKAVAK >gi|319977054|gb|AEUH01000266.1| GENE 26 25417 - 27162 2266 581 aa, chain - ## HITS:1 COG:MT2985 KEGG:ns NR:ns ## COG: MT2985 COG1061 # Protein_GI_number: 15842460 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Mycobacterium tuberculosis CDC1551 # 11 575 38 596 602 585 57.0 1e-167 MSPALTRRAPNGRLRAWQSRALDAYTSRSPRDFLAVATPGAGKTTFALTLATELIAQGAV DKLTVVCPTEHLKGQWAGNAARFGIHIDPDFTNSQGVAGSHFDGVAVTYAQVGSNPAVHA ARTRAHRTLVVLDEIHHAGDALSWGDGVREAFTDATRRLALTGTPFRSDTSPIPFVTYAE DTEGIRRSRADYTYGYGDALRDGVVRPVLFMSYSGQMAWRTNAGDEVSARLGEPLTKDMM KQAWRTALDPKGEWIPAVLRAADQRLDEVRRAVPDAGGLVIASNQDQARAYARHLRLVTG RAPVLVLSDDADASDRISAFSDSTDKWMVAVRMVSEGVDVPRLAVGVYATNASTPLFFAQ AIGRFVRARRRGETATVFIPSVVPLLELAGGMEVERDHALDRPERSGPSEDDMWNPEDAL VAKANQSEKASSDLLNQFEVLGADAEFDGVLFDGSSWGAGAEVGSEEEQEYLGIPGLLDA 
DQVATLLRARQADQIASQKRSRDAARQRADNAGVPAHKRRAAKRKELQHLVSTWARRSGD THAVIHSQLRQRCGGPEVAQASTEQIEARIALLREWFVGRR >gi|319977054|gb|AEUH01000266.1| GENE 27 27159 - 27485 319 108 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10096 NR:ns ## KEGG: HMPREF0573_10096 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 9 103 4 101 102 102 52.0 4e-21 MRDALDEPTGPGGPSTQSSTGLLERTEVEQEKSPGDGDRFAHFVRRDKADSSMVTGQPVV ALCGKVWIPTRDASKYPVCPTCKELRDSMGKNGRNWPFSDGGKPGSGE >gi|319977054|gb|AEUH01000266.1| GENE 28 27698 - 28477 1280 259 aa, chain - ## HITS:1 COG:BS_ybfT KEGG:ns NR:ns ## COG: BS_ybfT COG0363 # Protein_GI_number: 16077305 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Bacillus subtilis # 1 247 1 248 249 214 46.0 1e-55 MRIGIFNDEDQIASQAADRIIEVYRAKPDFVLGLATGSSPLGLYAELVRRHQAGEISFKR VRSYNLDEYVGLPRDHYEGYANFIRRNLVGLVDMPEGAAHGPDGWCDDLEAGARAYDEAI KADGGIDIQVLGIGSDGHIGFNEPGGSLVSRTHVGVLTEQTRRDNARFFDGDIDAVPTHC VTQGLGTIMDSRAHVFIATGEGKAEAVRAMVEGGVTQRWPASVLQHHPDVTVLLDEAAAS KLELADFYKEVWAKERAGR >gi|319977054|gb|AEUH01000266.1| GENE 29 28693 - 29310 397 205 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157155704|ref|YP_001464307.1| putative deoxyribonucleotide triphosphate pyrophosphatase [Escherichia coli E24377A] # 5 199 3 194 197 157 46 8e-38 MRAPRLVFATGNAHKISELEAILAPAWEGFDSPMIARMSDFDVEAPVEDGASFEENALIK ARHLAALTGLGALADDSGLTVDVMGGAPGIFSARWCGRHGDDAANLDLLLAQLADVPDAL RSAAFVSAAVLVLPDGREFVERGEVRGRLLRERRGGGGFGYDPVFVPDGHALTTAQMSAE QKNAISHRGRAFRALAPAVIDYLRS >gi|319977054|gb|AEUH01000266.1| GENE 30 29307 - 30074 1049 255 aa, chain - ## HITS:1 COG:Cgl2451 KEGG:ns NR:ns ## COG: Cgl2451 COG0689 # Protein_GI_number: 19553701 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase PH # Organism: Corynebacterium glutamicum # 2 243 17 256 258 263 60.0 3e-70 MSTPATRADGRAPDQLRPVSITRGWSGTGEGSVLIEFGATRVLCVASFTTGVPRWKRDSG 
EGWVTAEYAMLPRATDQRSPRESVKGRVGGRTQEISRLIGRSLRGVVDTRALGENTVTLD CDVLRADGGTRTASVTGAYIALADAVAWAKRNGAVSPGAKVLTDSVSAISVGIVDGAPVL DLPYAEDVRAHTDMNVVQTGDGRFIEVQGTAEHAPFSRAELSALLDLAALGNARLADAQR AALSGAPGERVEVAP >gi|319977054|gb|AEUH01000266.1| GENE 31 30111 - 31061 948 316 aa, chain + ## HITS:1 COG:CC0266 KEGG:ns NR:ns ## COG: CC0266 COG2816 # Protein_GI_number: 16124521 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Caulobacter vibrioides # 55 305 54 309 313 125 32.0 1e-28 MALDLPLVRGRVDPCVGLRDSLAPDRILSGAPLPGGFGAATAIGVAPSGHVMVDGAEGGS GRVRLARMAPGRASALVRGGARASFIGAVGGRAVVAVEVENGGDGAGGRWRHLRSVGHLM GGDDAALALSGVALAAWHRDYRYCGRCAGALVPECGGWAARCGSCGRLEYPRQDPAVIVR VDDHRGRTLLAHNAAWEPGRVSVIAGFVEAGESPDRAVAREVGEEVAIGIGEPRYVGTQP WPFPRSQMMGYVARTLEESPAPAPDGAEIEWAGFYSRQELADAVGSGRLLAPGRSSIAYA MLRQWYGGELPLPGRA Prediction of potential genes in microbial genomes Time: Thu May 12 19:11:29 2011 Seq name: gi|319977050|gb|AEUH01000267.1| Actinomyces sp. oral taxon 178 str. F0338 contig00267, whole genome shotgun sequence Length of sequence - 3693 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 764 956 ## COG0210 Superfamily I DNA and RNA helicases 2 2 Tu 1 . - CDS 691 - 1677 888 ## gi|293189250|ref|ZP_06607973.1| conserved hypothetical protein - Prom 1703 - 1762 1.9 3 3 Tu 1 . 
- CDS 1906 - 3453 1782 ## COG5282 Uncharacterized conserved protein Predicted protein(s) >gi|319977050|gb|AEUH01000267.1| GENE 1 3 - 764 956 253 aa, chain + ## HITS:1 COG:MT3291 KEGG:ns NR:ns ## COG: MT3291 COG0210 # Protein_GI_number: 15842783 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Mycobacterium tuberculosis CDC1551 # 3 245 433 698 700 154 42.0 2e-37 DALGGVGWSEEAPVGGSARERWADMAAIVAWADDTNALDLAGFLSELDERAQYQVEPDKT GVEVATIHTAKGLEWDAVFVAGVAEGLLPISYASTPAAREEERRLLYVALTRARDVLEVS WARMRATGGRGKRHRSRLLDGVWPAGRSVGKPKRQAAKESARERRERFEAEAGEEALDLF ARLKAWRLGVAQAASVPAFAVLTDQTLRDIAVTRPKTVAQLRVIKGIGDVKVERFAAPVL ALVRGEEVAPPEP >gi|319977050|gb|AEUH01000267.1| GENE 2 691 - 1677 888 328 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293189250|ref|ZP_06607973.1| ## NR: gi|293189250|ref|ZP_06607973.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 1 295 1 287 292 184 42.0 5e-45 MKLRKGTAVLWHGQNAVQIGSDHYHHTVIEGLDDAERAWLLRASLPEGGGKRSRARPAGP PPRRALIESALRRTRFLDEGPTPPLFVGAVGLTPATVVALGALADGFALECSIEDSGLID EDFQRFFPSTALGTLRAPSARRLLAHRIPLARLHCQAQPHIAIVSGDRMIDPARTEALTA VDTPHLLVTRGETGYEVGPLVLPGATPCHHCVESARIADDPFRLAHLRHGCGQPLGALTA SAHLAAGLLIARLVRALCTGAFDSADLPAIHVVGPDGGTAVRERARPDPSCACGIGTARP LGQSPTAPGEQPPPPEPGPAPGRRSARP >gi|319977050|gb|AEUH01000267.1| GENE 3 1906 - 3453 1782 515 aa, chain - ## HITS:1 COG:MT3287 KEGG:ns NR:ns ## COG: MT3287 COG5282 # Protein_GI_number: 15842777 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis CDC1551 # 64 425 81 433 472 245 41.0 1e-64 MSDESDGTPFDDFLRSLLGDEAGEEAARAMRAQGFDPSSMPAEYSDPAHLEQVLNQFRFI MSTTSGPVNWGLVEGAAKQQAFTAGDPRPSAAEAAAAKQAMTVADLWLDTVTDFSAGPAE RQAWSRAQWIDATLPMWKRICEPVAANVSRALSAALEDQMGEGGDLPAGVAAMLGQTQEM MPKLSAMMFASQIGRALSALAQEAIGSYDVGVPLAPAHTTALLPHNIAVFSEGLDIDFTE VRQFMAVREAAHRRLFASVPWLSGDLLRAVEAYSAEIAIDPAAIARAAGSVDPSDPESIE RALSGGVFAISVTEAQHRALTRLETLLALIEGWVEVVTAAATAPYLPHSDQLREMMRRRR 
ASGSPAEEVLGRLIGLKMRPRRARGAASIFSLVAADGTAAARDALWSHPDMVPTMNELDS PDTFLTIRRAALEQDADIDAALNSLLDGTMGWAEGLSPQDDPEAETLREAGFAPEGADSD DGAPQEPGSPGEGGEEPGAPEGGGGPDGTPGSPGQ Prediction of potential genes in microbial genomes Time: Thu May 12 19:11:46 2011 Seq name: gi|319977047|gb|AEUH01000268.1| Actinomyces sp. oral taxon 178 str. F0338 contig00268, whole genome shotgun sequence Length of sequence - 3182 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1176 1387 ## COG3480 Predicted secreted protein containing a PDZ domain 2 2 Op 1 . - CDS 920 - 1600 171 ## PROTEIN SUPPORTED gi|121596721|ref|YP_990474.1| 60S ribosomal protein L19 3 2 Op 2 . - CDS 1218 - 3023 2193 ## COG0433 Predicted ATPase Predicted protein(s) >gi|319977047|gb|AEUH01000268.1| GENE 1 1 - 1176 1387 391 aa, chain + ## HITS:1 COG:ML0643 KEGG:ns NR:ns ## COG: ML0643 COG3480 # Protein_GI_number: 15827268 # Func_class: T Signal transduction mechanisms # Function: Predicted secreted protein containing a PDZ domain # Organism: Mycobacterium leprae # 34 388 22 340 340 163 33.0 6e-40 GVSGTRPVGRRREVPLWARLAASGTALALMVAGLSWVPVPYVVESPGPTFNVLGSDSGTP MISISGTDPASGADVAVDAPQSSSYAAKAPSGDGGPGQLRMVTVAESGGPGRRLNLLQLV GATFDSRSRILDYDSVYPPTATQEQVDSANQAQMQSSQSTSQVAALEYLGWRVPASVAIE GGVEGTDAAGKVAKGDALTAITTADGAVHPVDSASAPFAIMSRVPVGDTVVLSVQRDGTA LEIPVTTVAGEEGSEGSKLGVYLSIDAQLPLDISFNLENVGGPSAGMMFSLGIIDRLTPG DMTGGAVIAGTGTMNYDGRVGAIGGIQQKMWGAKEDGAQWFFAPAANCDEVVGNVPDGLR VVRVATLGEAADAVHAIAQGDGASLPICTAR >gi|319977047|gb|AEUH01000268.1| GENE 2 920 - 1600 171 226 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|121596721|ref|YP_990474.1| 60S ribosomal protein L19 [Burkholderia mallei SAVP1] # 1 206 91 302 304 70 33 2e-12 PRRGSPRAPPRRRPPARRRKRERLRRRRRRRRASQRRRRPRGCGPPPRRRRRGRPARRRG PWRRRRPPAARRPSASRGRPSAPGRRRRRPDSARPSGARGRSSRSWAPFCAPRARRSRAP SSARASGRGGPCGPETPTAPPASAPCRSAGRRRRPGRWRGRRRRPRPVWRPARRAARLGR 
SRRPRRSSRRGRRTIGRRPPSRPTSSAGSPRWRPRARRSSWSRSRR >gi|319977047|gb|AEUH01000268.1| GENE 3 1218 - 3023 2193 601 aa, chain - ## HITS:1 COG:MT2585 KEGG:ns NR:ns ## COG: MT2585 COG0433 # Protein_GI_number: 15842038 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Mycobacterium tuberculosis CDC1551 # 31 497 6 469 533 426 54.0 1e-119 MENAAPDPGAGAAAAPGSASTGAGAGANAPAAPAAGSFAAMIQGAYSWDAPTITIGTLID GGARVPGTTAKMPTAMFNRHLLVAGATGTGKTRTLQLLAEGLSAAGSSVLLCDVKGDLTG LAEAGAGSDKLLSRTAANGQEWAPAAFPVELLSLGGADSQFPGVPVRAQVSDFGPILLAR ALALNTTQEQALQLIFAWADSQGLELVDLPDLRSVISFLTSEDGKDELAGIGGVSKATAG VVLRALTALESQGGGQFFGAPGFDTADLVRSDTAGRGVVSLLGVGDISSRPALVSAVIMF LLADLFSSLPEVGDVERPKLVFFFDEAHLLFADATKEFERQVVQTVRLIRSKGVGVVFVT QTPKDIPSDVLAQLGSRIQHGLRASTPEDFKKLKATVQTFPKTGLELDEVLTTLGTGEAV VTVLDPKGNPTPVTPVGIWAPASVMGPASDGTVARVNRSSSLMGRYADSVNPESAEERLA SRAAEAQAAREAEEARAAAEKEEEKARVAAQKEAERVRAAAAKEAERAAREAQRAVEKEE AARRKEAERLQREAERARQKEEAARQRAAERRAREVESVLGSVLRTAGQEITRSIFGTRK R Prediction of potential genes in microbial genomes Time: Thu May 12 19:11:47 2011 Seq name: gi|319977042|gb|AEUH01000269.1| Actinomyces sp. oral taxon 178 str. F0338 contig00269, whole genome shotgun sequence Length of sequence - 2742 bp Number of predicted genes - 5, with homology - 2 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 143 136 ## 2 2 Op 1 . - CDS 490 - 1521 863 ## Franean1_2557 hypothetical protein 3 2 Op 2 . - CDS 1536 - 1622 112 ## + Prom 1480 - 1539 3.3 4 3 Op 1 . + CDS 1588 - 2451 699 ## gi|295101620|emb|CBK99165.1| Site-specific recombinases, DNA invertase Pin homologs 5 3 Op 2 . 
+ CDS 2460 - 2742 149 ## Predicted protein(s) >gi|319977042|gb|AEUH01000269.1| GENE 1 2 - 143 136 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSESIADLKAQLAAAEAAAKEAEAQAARARAEALRARLEAAGAAEGS >gi|319977042|gb|AEUH01000269.1| GENE 2 490 - 1521 863 343 aa, chain - ## HITS:1 COG:no KEGG:Franean1_2557 NR:ns ## KEGG: Franean1_2557 # Name: not_defined # Def: hypothetical protein # Organism: Frankia_EAN1pec # Pathway: not_defined # 8 294 10 297 332 85 27.0 4e-15 MELKREHLEWLSIVRYQSDLAENQAKEPEPLNAVSISTIHDAVESMLSLIAEVHRVTTRS KDFANLFDTVSGHMKTQVGDISGHRSAMIALNNARVGFKHHGNQSNKQTIDRHIANGLNF LADAAEQGLNTPFAEVSLLGFVRDPKVREYIHRADDCSSGAVEDRMPAFEYLRLGFETLV QGYQQSKSYYPGRSLITTKPSFLPSVFDIRDHGGKVGEKAFEWLENLDRWVRVLVLGIDV QEYTFFLANTPGVLMTLGGRAHFSWRPGPDLSDDVFRRCRQFVVESAIRLGRRDFAYDAW NARQSLPEDQRNTGISRVMAPDENGLLTPHEDRGDATSAEETR >gi|319977042|gb|AEUH01000269.1| GENE 3 1536 - 1622 112 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPVFKVPEGGVIGCNLGSWRSRCLAVCS >gi|319977042|gb|AEUH01000269.1| GENE 4 1588 - 2451 699 287 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295101620|emb|CBK99165.1| ## NR: gi|295101620|emb|CBK99165.1| Site-specific recombinases, DNA invertase Pin homologs [Faecalibacterium prausnitzii L2-6] # 1 207 284 483 493 91 31.0 4e-17 MTPPSGTLKTGKTYRYYNCLNQRKKKCSAKTVRKDEIELRVEEVVASFLADPEMLASLAV DLADHYKQTHGRGDKIFKALEARRTDVEIKLANFVKVISPGFFNASTAEAMNALEAQKQE LDAAIQAEHVKATLYEDKASIGAFYKRFAEATIDTPETRDQLFEYFVDKVFIGHDQIVIA SYYHDSARPIEFEDLEEALTIGHRAGKCKPTLESESSTLPPQVETGGIEPPRPRRFRPRG QRRARHFSTSRGPGRGPSTGGGQHPGAVHRRRARVVLRRARRRTVAA >gi|319977042|gb|AEUH01000269.1| GENE 5 2460 - 2742 149 94 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHEWLARRRLRALTRAVYTSAVEEDPVDADDPGRKARLAPSGASACAVIVTIIVVSLCAL WRLGPSGAPAPGSPAQTSGAAQSAASDEDGRSNA Prediction of potential genes in microbial genomes Time: Thu May 12 19:12:16 2011 Seq name: gi|319977040|gb|AEUH01000270.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00270, whole genome shotgun sequence Length of sequence - 1196 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1190 1057 ## COG0658 Predicted membrane metal-binding protein Predicted protein(s) >gi|319977040|gb|AEUH01000270.1| GENE 1 3 - 1190 1057 395 aa, chain + ## HITS:1 COG:Cgl2296 KEGG:ns NR:ns ## COG: Cgl2296 COG0658 # Protein_GI_number: 19553546 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Corynebacterium glutamicum # 28 365 98 432 554 75 28.0 1e-13 VTARVRLVQDPAPARSRHAETTARARLLTVGGAPSGAYALVRGSGIADAARGDELDVVGT IDPSSYSEAPNAGVLRVTRASAPHRPGGVWAWARSVRAHLVRTCSAMSPQARALVPGMAI GDDRALPAGLEDAMRTTSLTHLTAVSGSHIVIVLAVVSLAVPGRRIPRALSTGAVLALIL LLVGPEPSVVRSVATAAVSALALITARPGQACAALCAVVTATVIIDPWASRSYGFALSVL AALAVIGPAAAAVRWSARRLRGDTALGRALRRLVGAVAVPAACQVFVMPVLLTLDPSVPT WAVLANALAEPAVAPATVAALSAALLGPYWQAGAAWCAWASSWATGWIAWVAEACADLPG ARLEVPVPMVVGAYLVGAGAYAAVRARALRRGARR Prediction of potential genes in microbial genomes Time: Thu May 12 19:12:17 2011 Seq name: gi|319977037|gb|AEUH01000271.1| Actinomyces sp. oral taxon 178 str. F0338 contig00271, whole genome shotgun sequence Length of sequence - 1200 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 189 - 1166 1250 ## COG1466 DNA polymerase III, delta subunit Predicted protein(s) >gi|319977037|gb|AEUH01000271.1| GENE 1 189 - 1166 1250 325 aa, chain + ## HITS:1 COG:Cgl2295 KEGG:ns NR:ns ## COG: Cgl2295 COG1466 # Protein_GI_number: 19553545 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Corynebacterium glutamicum # 10 325 8 331 331 136 33.0 5e-32 MAQRPRRTSAPTEISLAPVVLIKGTEGLLIDRALDRLRALAHQAAPDVERTDLVAATYRP GQLDLIASPSLFGESRMVIIRDLETMSDALAADLVAYAASPAPDVWMFLVHPGGGARGKK VVEAVARAKWPVIPAEPLKNDRDKLALVQADVRAAGRQMDPAAMRALVDALGSDPRAMAG ALAQLLSDVPGRVTVEDVHRYQAGRVEASGYDVADAAVAGEAAKALTLTRHAFVTGVAPQ LLVAALAMKFRAMAKASIGGNAAALKMAPWQIDRARRDLRGWSDRSLAGAFEAIATADEE TKGASRDPQRAVEKAVITICRLRGR Prediction of potential genes in microbial genomes Time: Thu May 12 19:12:19 2011 Seq name: gi|319977033|gb|AEUH01000272.1| Actinomyces sp. oral taxon 178 str. F0338 contig00272, whole genome shotgun sequence Length of sequence - 5688 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 + CDS 102 - 1850 196 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 2 1 Op 2 . + CDS 1852 - 3771 174 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 3 2 Tu 1 . 
- CDS 3776 - 5680 2308 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits Predicted protein(s) >gi|319977033|gb|AEUH01000272.1| GENE 1 102 - 1850 196 582 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 326 562 116 354 398 80 27 4e-15 METIRILMGRLREYGRASVLTPLFIVGEVALECLLPLIMATLVDRLGGEELGPVLRTGLA LVVMAMASLACGVLAARFSATAAAGLAKNLRQDLFFKVQYFSFADIDRFSTSSLVTRMTT DVTNVQNAFGMMIRIAVRVPLMIVFSVVMAWRINSSMALIFLGMVPVLAIILAVIVLVAF PIFRRIFKKYDALNNSVQENVAAIRVVKSFVTEDHERERFGAASQDLRADFTKAEKILAL NNPVMMFFIFAAIMMVDFIGARMIVASGGTELTTGQLSALITYGIQILSHMLGLAFIFVM TTMAAESANRIAEVLTHEPSLSSPPGGETEVVDGAVEFEGVSFKYSATAEENALSDISLR IESGSTLGIVGGTGSAKTTLIQLVSRLYDATSGTVRVGGRDVREYDLESLRDAVSVVLQK NVLFSGTIKENLRWGDPAATDEELVRACELAQADEFIRAFPDGYDTMIERGGTNVSGGQK QRLCIARALLKRPRILILDDSTSAVDTRTDALIRRAFATEIPDTTTIIIAQRLSSVRDAD QIIVMDDGRIKERGTHDQLMAAGGEYREIYESQNQTTESGVA >gi|319977033|gb|AEUH01000272.1| GENE 2 1852 - 3771 174 639 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 405 625 135 355 398 71 27 1e-12 MARTDPNAEKLEGIERPFRRLTAYLFGLYRVRLVIVAVCIVVASVASSVGSIFLQQIVDT VITPGMAQGIDAVWEPLVRLVLTMGAVFATGVLASFLYTRIMAVVTQGALKRMRDDMFDR METLPLRFFDTHPHGAIMSAYTNDTDAIRQLIGQAIPTLIQSALTIAVVAVTMLSYSVWL TALVVVVAGVMVALTRRLGGASARYMVRQQASLAAEEGFVEEMMDGQKVVQVFNHEEAAK RDFLEYNEKLFADSEKANRYGNVLLPALGNVGNLLYVLVALVGGVMVLRSVPNIGFAGMG VITVGVVVSFLGMVRFLSQTIGQVAMQVPMIALGVAGASRVFALIDQEGEADDGYVRLVR ARVGQDGSIEPTRERTHLWAWEHPHQAEGTVTYTPLRGELVMEGVDFSYDGEEQVLTGIS LWAKPGQKIAFVGATGAGKTTITNLINRFYDIDDGKIRYDGINVNKIHKPDLRRSLGVVL QDVRLFTGTVMDNIRYGRLDATDEECVAAAKLANADSFIRRLPDGYDTVLKGNGSSLSQG QAQLLSIARAAVADPPAMILDEATSSIDTRTEALVQEGMDNLMEGRTVFVIAHRLSTVRN ADAIMVLDHGRIIERGTHDQLLEQRGVYYQLYTGAFELE >gi|319977033|gb|AEUH01000272.1| GENE 3 3776 - 5680 2308 634 aa, chain - ## HITS:1 COG:MA3490 KEGG:ns NR:ns ## COG: MA3490 COG1112 # Protein_GI_number: 20092301 # 
Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Methanosarcina acetivorans str.C2A # 51 633 822 1401 1939 272 33.0 2e-72 MARARECLGRIAAQPVLRPLPIDRLVALSGALAGMGRADRCADLRAHSALERALEPLARL GLVDAAAQVRERQIGPFELATALEVGLARATIDQRLRHGPLRAFTRAVHEHSADAYADDL ASLRALLPQGLIARALASHRERLEAQGTRLDDLKRELARRVRARGIRDLVGEYGDLITDI TPCVLVSPDSVARFFPAERQDFDVVVFDEASQITVASAVGAMGRGRSAIVCGDSNQMPPT SFAKLSRETEDADGFPDEESILTECVSAHVPRAWLSWHYRSQDEALIAFSNRQYYGGRLS SFPSPQAPASAPGRGHGVSMRLVEGRFIRSVPRGGERRTLRTNPEEAAAIVDEVRARFAA EASPSIGVVTFNLPQRDLVETSLRDLDDPRITSSLDADDGLFVKNLENVQGDERDTVLFS VAFAHDESGRVPLNFGPLNLPGGERRLNVAVTRARREVVVFASFEPDELEAERSQSRGVR DLKEYLRLARLGPDEYARHSPRLPEPDDHREQIAARLREAGAAVSTDVGLSDFRIDLVLA PRHDPGAPSLAVLLDGKGWAARGTVYDRDVLPRSVLSGLMGWPGVERVWLPDWVADPQGV VDRLMGSLEEASAGAAAEARTRLTAREAPGGQAG Prediction of potential genes in microbial genomes Time: Thu May 12 19:12:22 2011 Seq name: gi|319977028|gb|AEUH01000273.1| Actinomyces sp. oral taxon 178 str. F0338 contig00273, whole genome shotgun sequence Length of sequence - 6416 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 3457 3934 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits 2 1 Op 2 . - CDS 3502 - 3687 91 ## 3 2 Tu 1 . - CDS 3805 - 5064 1661 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 4 3 Tu 1 . - CDS 5366 - 5632 323 ## PROTEIN SUPPORTED gi|227874747|ref|ZP_03992900.1| ribosomal protein S20 - Prom 5761 - 5820 2.3 - Term 5707 - 5752 4.9 5 4 Tu 1 . 
- CDS 5918 - 6415 548 ## Xcel_2206 hypothetical protein Predicted protein(s) >gi|319977028|gb|AEUH01000273.1| GENE 1 1 - 3457 3934 1152 aa, chain - ## HITS:1 COG:MA3490 KEGG:ns NR:ns ## COG: MA3490 COG1112 # Protein_GI_number: 20092301 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Methanosarcina acetivorans str.C2A # 539 855 126 444 1939 128 33.0 6e-29 MADTADSAPGATPESAEPTAPTGPTAPGEAGAGEGRRAPDRSSAHEDVTAPVGGPLADGA ANPLDGIDIGVEHPVFLSYAHAVAGARPELVVHVGRAPAEAGAPPLRLSLEFSIECDGVV LAVPAPLVDVEADDGPVSLSTPIALERTELLGMAEQRPTVLVVTMRAGDWCREERMSGPE VLAARQWIVGQDQAWAAMTLSTFVQPQHPRLSDLEREAATLLDQWTGSASLDGYQGGPER VDAAVDALCHAFAARSIAYAPPQGNWSERGHRLRDAEDVLQGRLASRLDAALVLASALEH LRVEPLVVVCGATAVLGYWKSERHSAENVAMPGWQLVNQIDRGLIGLVDTGSLVESPPPP LREVAARARRILRADGSGARFVVSVRQARLEGARPQPVRSVGQDGAVVEIAAPVVERSNT ITLENLPEQGGARPSLPAAPPRVEGWKAALLDLSARNRLIHCPDSAVASHRLVELAVPEA VMGAVEDLIASGSAITLEGAGERAAACRRTGRAFHEELDPPLLAPGIADHRRVQVDVSND AYFSALRALRSQARTLVQETGTNNLYLALGSLVWTTRGTTVRSPLVFVPVELSTRSARSP YTLRIDPAGAPTPNYSLVERLRLDLGLVVPGLREPAADEAGIDVPALFDAVRAALAEAGL PFHVEATTYLGIFNFGNFRLWKDLEESWELLARNPLVHHLIHSPDAAFADPRADVSPPPA EEVLPTLPIPADASQVGVAARALAGHTLVVEGPPGTGKSQTIANLVMRLLAEGRKVMFVA EKQAALDVVYRRIAAQGFGPLVLNLHDRKQKPALVRRRILDASELEPHPNSVRIESVRHR VEAGREALAAYARALHGTTASGMSAYEARASLVAYGDARPALAITGPQLAALDSAGAAEL RDEAVPRLRDCLLGLDGDRGRASAFLGAPVDDADIDAVADAIGELHEVARADDDGLLVAL AALDEGEAEVVASVLGATAYPLPALSRLADPAWRAAARSLRDRLAAPPDAPALDYFQPDA LEADLAPVRDAVSDATHALLRRRTKTGRAFGPLAQYRITSTPLPRSPRAALDLIDDLVVC AAQDRELRELWRRVLPEAAGLHVGTWSAARGEDRSAALREVEFLIGLEGAFDGPPGTALS RACGIALAAAAG >gi|319977028|gb|AEUH01000273.1| GENE 2 3502 - 3687 91 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCAAPCCLIPSLVCGIGGERTTKRDDERAQPRVRTAPSRRTRGQERAGRCPGADAAARAG A >gi|319977028|gb|AEUH01000273.1| GENE 3 3805 - 5064 1661 419 aa, chain - ## HITS:1 COG:BS_ybaC KEGG:ns NR:ns ## COG: BS_ybaC COG0596 # Protein_GI_number: 16077182 # Func_class: R General function prediction only # 
Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Bacillus subtilis # 109 413 8 312 318 136 28.0 7e-32 MAHMKPALVIWCAAAIGIAVYYLVLRQHVLRIQDTRAEVCVAVALTAVVLGGLVVALGIR VVTALARRSAAAAVTSLKVSLLTGALLALMVALSQSTAFTPAINGPDPIAELRPITVNGR AEWLSLRGQDRSKPVLLFLSGGPGGSQLVTARHCFADLERDYVVVTWEQPGAAKANGAIS ADDITLDTYLSDGAAVTGLLREEFGQDRIYLMGESWGSALGLMMARDHPEHYRAFIGTGQ MVDFLETDRIDYQTALDDARANGNQKLVGQLEEQGPPPYESGTALKTAAFLSPLNGLSAR SGQLHSAVFSTMDGVYGVEYGLLDKVRFFWGLLRVLDAFYPDLYPIDLRQSAAEQQIPIH IFHGRYDYNAPASLAEDYYARLSAPSKSLVYFEHSAHNPWQTENGLFNDEVRRVFAEAD >gi|319977028|gb|AEUH01000273.1| GENE 4 5366 - 5632 323 88 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227874747|ref|ZP_03992900.1| ribosomal protein S20 [Mobiluncus mulieris ATCC 35243] # 1 87 1 87 87 129 77 8e-30 MANIKSQIKRIRTNEKRRLRNQAVKSELKTLVRRTREAVEAGDQEKAVAALRVASRKLDV AVSKGVIHKNQAANRKSKLARRVASLNK >gi|319977028|gb|AEUH01000273.1| GENE 5 5918 - 6415 548 165 aa, chain - ## HITS:1 COG:no KEGG:Xcel_2206 NR:ns ## KEGG: Xcel_2206 # Name: not_defined # Def: hypothetical protein # Organism: X.cellulosilytica # Pathway: not_defined # 43 160 55 172 175 134 59.0 1e-30 GPGDGPSGSHGTGSGAHDASGQPRPSTSIREASVADALAHASYTPVMDGDADPGEVVWTW VPYQEDASVGKDRPVVVIGADGPGVYVVQLTSKDHGLDAEQEARNGRFWLDIGSGAWDPK GRPSQVRLDRALWVLSTDVRREGAVLPRATWQRVVDAIDRHHRAH Prediction of potential genes in microbial genomes Time: Thu May 12 19:12:30 2011 Seq name: gi|319977026|gb|AEUH01000274.1| Actinomyces sp. oral taxon 178 str. F0338 contig00274, whole genome shotgun sequence Length of sequence - 1073 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 148 - 1053 1230 ## pnf2450 hypothetical protein Predicted protein(s) >gi|319977026|gb|AEUH01000274.1| GENE 1 148 - 1053 1230 301 aa, chain + ## HITS:1 COG:no KEGG:pnf2450 NR:ns ## KEGG: pnf2450 # Name: not_defined # Def: hypothetical protein # Organism: N.farcinica # Pathway: not_defined # 70 266 524 720 882 75 29.0 2e-12 MTTPVSFQDTIEGYERAVAACGAIMHNAGLPMGVRLAAYKTRIDWCAEYGDYTECKRLAS AGTALGAAELGADDPAVLVLRNSEAYWMCVLGFSDAASDRFPALIADIERVLGRDSELAF AARNNSAMPHKSAGRWDRAARVYRGVLADMEATASPDDVLLLTARDNLAEALGADGRFEE STRLYEANIDVMAGMWASGDWRILRVRDAIARNWWMAGEHGRAIGLWMVLAKDARTHMGE DDPFTAECRLTLAGAQYAMEEWEEALRWAETAASALPAEWGPDERASVEAFVEDCRRAAR R Prediction of potential genes in microbial genomes Time: Thu May 12 19:12:35 2011 Seq name: gi|319977024|gb|AEUH01000275.1| Actinomyces sp. oral taxon 178 str. F0338 contig00275, whole genome shotgun sequence Length of sequence - 2053 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 193 - 2049 2942 ## COG0481 Membrane GTPase LepA Predicted protein(s) >gi|319977024|gb|AEUH01000275.1| GENE 1 193 - 2049 2942 618 aa, chain + ## HITS:1 COG:MT2476 KEGG:ns NR:ns ## COG: MT2476 COG0481 # Protein_GI_number: 15841921 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Mycobacterium tuberculosis CDC1551 # 1 617 36 650 653 919 74.0 0 MPIPTPALEALIRPARTDASRIRNFCIIAHIDHGKSTLADRMLQLTGVVAQREMRDQYLD RMDIERERGITIKSQAVRMPWAVGDQPYALNMIDTPGHVDFTYEVSRSLAACEGAVLLID AAQGIEAQTLANLYLALENDLAIIPVLNKIDLPGAQPEKYAHEVAQLIGVDQDEVLLVSG KTGEGVEALLDRIVEVVPAPVGDPGAPTRAMIFDSVYDTYRGVVTYVRVVDGSLGTRQRV RMMSTGATHELLEIGVISPEPTPCEGIGAGEVGYLITGVKDVRQSRVGDTVTSAVKGAEQ ALDGYRDPNPMVFSGIYPVDGSDFPDLRDALERLQLNDAALTFEPESSAALGFGFRCGFL GLLHLEIIRERLEREFGLDLIATAPNVVYTVVTEDGTEVRVDNPSEFPEGKISEVHEPVV TATILTPTEFTGTVMELCQDRRGTMKGMDYLSEERVELHYRLPLAEIVFDFFDQLKSRTR GYASLDYQESGSQVADLVKVDILLNGDRVDAFSAIVHRDGAYAYGQRMTKRLKELIPRQQ FEIPVQAAVGARVIARETIKALRKDMLAKCYGGDITRKRKLLEKQKEGKKRMKSIGRVDV PQEAFIAALTSDVPTGKK Prediction of potential genes in microbial genomes Time: Thu May 12 19:12:36 2011 Seq name: gi|319977022|gb|AEUH01000276.1| Actinomyces sp. oral taxon 178 str. F0338 contig00276, whole genome shotgun sequence Length of sequence - 1222 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 1220 1411 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases Predicted protein(s) >gi|319977022|gb|AEUH01000276.1| GENE 1 2 - 1220 1411 406 aa, chain + ## HITS:1 COG:Rv2388c KEGG:ns NR:ns ## COG: Rv2388c COG0635 # Protein_GI_number: 15609525 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Mycobacterium tuberculosis H37Rv # 25 400 3 350 375 266 46.0 6e-71 PWPADGAPWPADGALDPRLVEGDAGRPLSLYVHVPFCRVRCGYCDFNTYTAQFGAGADLD TYAGSVLAEAALATRVLDGAGFGPRPASTVFFGGGTPTMLAPGELARALDGLRERIGIAP GAEVTLEANPDTVSARGLAGLADAGFTRVSFGMQSAVPRVLAALDRTHAPERVPEAVAWA REAGLDVSVDLIYGCPGETLDDWRASLLAATRMRPDHISAYALVVEEGTRMGAQVARGEL PAPDPDDEAAKYEEADAVLSAAGYRWYEISNFALVGPDEAGALDRGALAPTALARASRHN LAYWRDWDWWGLGPGAHSHIGALRWWNVKHPAAYASRLARGLSPAHSGEALDAPTRDLER VMLAVRTGGGVALADTPAGSGAVSGGRGRVEGLVADGLIDAARARA Prediction of potential genes in microbial genomes Time: Thu May 12 19:12:37 2011 Seq name: gi|319977017|gb|AEUH01000277.1| Actinomyces sp. oral taxon 178 str. F0338 contig00277, whole genome shotgun sequence Length of sequence - 1914 bp Number of predicted genes - 4, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 144 - 257 123 ## 2 1 Op 2 . + CDS 268 - 954 804 ## Cthe_1455 hypothetical protein 3 1 Op 3 . + CDS 1032 - 1682 883 ## Sked_24060 hypothetical protein 4 1 Op 4 . 
+ CDS 1675 - 1912 212 ## Predicted protein(s) >gi|319977017|gb|AEUH01000277.1| GENE 1 144 - 257 123 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCAVLCCLIPSLVCGTGGKRTPMGDEGRALSDPPRAA >gi|319977017|gb|AEUH01000277.1| GENE 2 268 - 954 804 228 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1455 NR:ns ## KEGG: Cthe_1455 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 18 227 15 223 224 131 30.0 2e-29 MSATDDVLRSLEGARTDKLAAFNARLTPTVDAERFLGVPMAQMRRAAQALRRAGRSDEFY SSAPHAYVELDIVHMLCLNTEADPDQWRRKMEEFLPLVDNWMVADSVKPACMGAHPGAVA AAARQWTGSEHAYTVRVGVCVLMGALRTSAYAADHLDWVAGIDWDDYYVQMVCAWYFATA FDAHREDAVPYVAEPGPLPDPVRRRALRKILESRRTTPEERAWVRALP >gi|319977017|gb|AEUH01000277.1| GENE 3 1032 - 1682 883 216 aa, chain + ## HITS:1 COG:no KEGG:Sked_24060 NR:ns ## KEGG: Sked_24060 # Name: not_defined # Def: hypothetical protein # Organism: S.keddieii # Pathway: not_defined # 1 207 1 205 213 137 45.0 2e-31 MAPQCALIADIVGSRALSDRPSAQRALEAVLARAAQGLDLLQEPVATIGDEFQLATRTLA DALVYTARVHLLVPDGLSLRFGIGEGQILVLDALGADEHGHYLQDGSAWWAARAAIDRAH ARQDGASPFQRTWYASDPEGAPADAVVNALLTLRDHAVWRLSARQRRIAGALAMGEPQVA IARAEKISQQAVSDFARGAGAGVLQSTALIQEANRA >gi|319977017|gb|AEUH01000277.1| GENE 4 1675 - 1912 212 79 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHETLILSIAAAEASALVTTRRLWRATAFAVLIAVVVVRAVLALGSSAPACALPALVGAA WFVWAEADARAAGVVVRWA Prediction of potential genes in microbial genomes Time: Thu May 12 19:13:01 2011 Seq name: gi|319976990|gb|AEUH01000278.1| Actinomyces sp. oral taxon 178 str. F0338 contig00278, whole genome shotgun sequence Length of sequence - 24355 bp Number of predicted genes - 26, with homology - 16 Number of transcription units - 15, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 349 418 ## Mlut_11210 hypothetical protein 2 1 Op 2 . 
+ CDS 359 - 1303 1104 ## Sked_13390 hypothetical protein 3 2 Op 1 3/0.000 + CDS 1599 - 2633 1346 ## COG1420 Transcriptional regulator of heat shock gene 4 2 Op 2 4/0.000 + CDS 2683 - 3807 1363 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 5 2 Op 3 . + CDS 3834 - 4595 746 ## COG1385 Uncharacterized protein conserved in bacteria 6 3 Op 1 . - CDS 4617 - 5387 1011 ## 7 3 Op 2 . - CDS 5387 - 5926 779 ## 8 3 Op 3 . - CDS 5942 - 6220 398 ## 9 3 Op 4 . - CDS 6210 - 6638 643 ## 10 3 Op 5 . - CDS 6625 - 7374 735 ## 11 3 Op 6 . - CDS 7426 - 8577 1262 ## 12 3 Op 7 . - CDS 8662 - 9468 419 ## 13 4 Tu 1 . - CDS 9717 - 10142 453 ## - Prom 10178 - 10237 4.5 14 5 Tu 1 . - CDS 10366 - 11268 1123 ## COG2017 Galactose mutarotase and related enzymes - Prom 11317 - 11376 1.7 15 6 Tu 1 . + CDS 11423 - 12754 2017 ## COG3579 Aminopeptidase C 16 7 Tu 1 . + CDS 13127 - 14152 1474 ## COG1816 Adenosine deaminase 17 8 Tu 1 . + CDS 14489 - 15187 543 ## COG2071 Predicted glutamine amidotransferases + Term 15242 - 15303 1.0 - Term 15461 - 15499 2.9 18 9 Tu 1 . - CDS 15689 - 16759 740 ## gi|154507871|ref|ZP_02043513.1| hypothetical protein ACTODO_00353 19 10 Op 1 11/0.000 - CDS 17391 - 18272 1261 ## COG1180 Pyruvate-formate lyase-activating enzyme 20 10 Op 2 . - CDS 18705 - 18956 464 ## COG1882 Pyruvate-formate lyase - Term 19045 - 19085 2.3 21 11 Tu 1 . - CDS 19093 - 21195 3409 ## COG1882 Pyruvate-formate lyase - Prom 21418 - 21477 1.5 22 12 Tu 1 . + CDS 21244 - 21399 78 ## + Term 21430 - 21459 1.2 23 13 Op 1 13/0.000 + CDS 21504 - 22814 1818 ## COG0460 Homoserine dehydrogenase 24 13 Op 2 . + CDS 22816 - 23763 1034 ## COG0083 Homoserine kinase + Term 23817 - 23874 8.3 25 14 Tu 1 . - CDS 23873 - 23935 62 ## 26 15 Tu 1 . 
+ CDS 24027 - 24354 81 ## gi|154509279|ref|ZP_02044921.1| hypothetical protein ACTODO_01804 Predicted protein(s) >gi|319976990|gb|AEUH01000278.1| GENE 1 2 - 349 418 115 aa, chain + ## HITS:1 COG:no KEGG:Mlut_11210 NR:ns ## KEGG: Mlut_11210 # Name: not_defined # Def: hypothetical protein # Organism: M.luteus # Pathway: not_defined # 16 97 160 242 254 63 55.0 3e-09 AGVPGTGAAALEAAGGAGAPAPRRGLSGGRWIGPLERLLIVLMAASGPQAAIAAIIAAKG VIRFPEVSKDDTGEKAEEFLIGSLVSWGLAAAAAVFITSIAQGAWAVPAEPGSGM >gi|319976990|gb|AEUH01000278.1| GENE 2 359 - 1303 1104 314 aa, chain + ## HITS:1 COG:no KEGG:Sked_13390 NR:ns ## KEGG: Sked_13390 # Name: not_defined # Def: hypothetical protein # Organism: S.keddieii # Pathway: not_defined # 7 310 4 284 286 293 54.0 6e-78 MAFTPIDPYGADVLARDPHRDAPGAPNPRSRQIHVEIGMVLEDVGSGWVGAVTRVEKSGG VHLVELEDRRGKRRSFPLGGGFWLEGQPIIALAPLPRTSPGPSRADGAAAEAGFRAGLTT PGGRRVTNSGSIAAPRSAPRVAKASRIWVEGKHDAELVQHVWGDDLAEAGIVVQLLDGAD NLEEVLEEFGPTGTARAGILLDHMVAGSKETRIARAVQERWGESVLVLGHPFVDIWQAVK PGRVGLEAWPQIPRGTDIKHGTLEYLGWPHATRADIAAGWRRILSTVRNYRDLEPALLGR VEELIDFVTVPWAH >gi|319976990|gb|AEUH01000278.1| GENE 3 1599 - 2633 1346 344 aa, chain + ## HITS:1 COG:ML0624 KEGG:ns NR:ns ## COG: ML0624 COG1420 # Protein_GI_number: 15827254 # Func_class: K Transcription # Function: Transcriptional regulator of heat shock gene # Organism: Mycobacterium leprae # 5 343 3 341 343 261 48.0 2e-69 MSRSSADDRRLDVLRAIVTQYVATREPVGSKAIAQSHDLGVSSATIRNDMAVLEDAGLIY QPHTSAGRVPTDRGYRVFVDRLATLKPMTAPERRAVESFLSDSADIDDVVERTVRLLAQF TRQVALVQYPATSVRTLRHLEVVDLAPSRLLVVVITDQGEVVERSLSLAAPVSGDDVASA RLLLRAHMGAATSRDVDVGRDEAVAAAPPELGALVGVLADVVAELLAPEESARIVISGVS NLARGAVDFHDITPVLDALEEQMVLMRLFAQAESDGGVHVTIGAENPHEALSEASVVTGT YRAGETAGEGVAHLGVVGPTRMDYARTMSSVRAVAAYLSRFLAT >gi|319976990|gb|AEUH01000278.1| GENE 4 2683 - 3807 1363 374 aa, chain + ## HITS:1 COG:ML0625 KEGG:ns NR:ns ## COG: ML0625 COG0484 # Protein_GI_number: 15827255 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular 
chaperone with C-terminal Zn finger domain # Organism: Mycobacterium leprae # 2 371 3 375 378 294 45.0 2e-79 MRDYYEVLGVQRDASPEEIKKAYRKLARQLHPDYAGPDSEEAFKELSVAYETLSDPDKRK MYDIGGPDALRGGGGADFGSAFAEAFSSMFGGGFSSAFGGASTPQSRVRAGRDRQVHVDI TLEEAAFGAKKEVEYSTYAVCDVCGGSMCEPGTSPAQCGSCHGSGFVTRVQNSILGRMQM QSPCPTCQGYGDVISSPCHNCAGQGRVPTQRTITVNIPAGASEGMQVRVSGESEVGVGGG PRGDLYLAIHEKKHPVFERRGDDLHTWITIPMTTAALGTEFELDTLDGVKGVTIKAGTQP GDDIVLQDLGVGRLQRAGRGALHVHVDVEIPKRLDAKSRRLLEELADARGEVRVEPHRQH TSFFDKLRDTFGAS >gi|319976990|gb|AEUH01000278.1| GENE 5 3834 - 4595 746 253 aa, chain + ## HITS:1 COG:Cgl2237 KEGG:ns NR:ns ## COG: Cgl2237 COG1385 # Protein_GI_number: 19553487 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Corynebacterium glutamicum # 11 245 15 234 241 127 38.0 2e-29 MSADPAGLEAGARARLGGDEGRHAAQVRRIGAGEWVDVVDGRGRRLVCEVVGASKTGLDL VVASAVDEAGPAVRTVLVQALAKGGHDEQAVDVCTQIGVGAVVPWASERAVVRWSGPRAL KGRAKWQGVARAAAKQARRAFVPEVGEPLDTRGLLGWVGRLVEAGGAVLVCHEEAGAGIG SVLDGLRADSPGGRLSGRVAVVVGPEGGISPGESEALVGAGAALVGVGPTVLRSSAAGAV ALALVLAAAGQYR >gi|319976990|gb|AEUH01000278.1| GENE 6 4617 - 5387 1011 256 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLRLHTETPNPYFSMSGEKELLAITAYAAGGAAHDEAVTAPLRQCGVLTAAGLAPPADSI TAGLAGARRRVYMMTMSATDGRQCDAYIAPGAVTMVDYGTDEYTMCGLDHCELPIALLAY TALKPRPVEREDLGPLRAPIEFLDYLDAGKPHMIARVLQRTGMEYARERGGGLLGARHST LSRAMVEGTWAITVGFLYEMVDGQWQQADSVLTLASPHTLYEIVRDQDGRRPFVRFDPIT GRLVWMRLGQWLAPVG >gi|319976990|gb|AEUH01000278.1| GENE 7 5387 - 5926 779 179 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDDTRLTVEHTGRTTAVTASDRWIRLPRPDDPDLTPPMRLLADQLPPGVLDDPRLDLVLV RAEGTEEWHDTFTTAVEPFGAEVGAQQWQHQVTSLSLRTIPGYQIVDMTPWATDEQAGLV TTGTYILDTISLTLTQWTWVVEDADGGGKGITVTAICPTHQLAASSADTIPMFESTRVL >gi|319976990|gb|AEUH01000278.1| GENE 8 5942 - 6220 398 92 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDIEIDTELVRHIGGESQRCSDSLNAVSVADGSDGQLGGAYSSLIDLLQTAVGEVAKATA QAAESASHAADLHDTVEEQISDSLSAILPRLD >gi|319976990|gb|AEUH01000278.1| GENE 9 6210 - 6638 643 142 aa, chain - ## 
HITS:0 COG:no KEGG:no NR:no MGTLDDYSQQARDDVRHAQSQAQEVAQYSQRLKRIRQTATDPTRSVAVTVDASGALLGIE TPLAPPAVLAAIVQTYQAARKRANFLLIDEARKSFGDDSRIVRRLEDDLETVDPSRDRRP FQADRYRAERERRRAERESYGY >gi|319976990|gb|AEUH01000278.1| GENE 10 6625 - 7374 735 249 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSSSMFYATICFLMGVITFFFWWHEMFSDGPVGEFSRAFDSDRGKNFLALTIPAIGTFLI SGSVLAFSSEVASPFVSQSRHPGILYAVLSSLLGAVGILSLLVFVASFIPFSLPEWMYPE YHAAKREEQRRQEAAERGESEDRDPFLGDDGVYASQLGNVPIDIPEAVGLPSTGDPPPLQ GPPETEEPALYDSDTADAFDITATTTALPRTGTPTSHGARPARASRARRGHGGAATTPTG TREDPNGDA >gi|319976990|gb|AEUH01000278.1| GENE 11 7426 - 8577 1262 383 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MADGIRIRTLLSPGVPRGLRAEMFKHQQDLDAVAQDLQTARDMVEAGDLQGATADRLFDA MTDNTDNAKTLVEAVKALVEALGALADALTSTGLWFDNLADAAVAEHLEVTEDEIVAPAD PTNPAYDAQVEAYSQFCEESYEIYEKERKAYEDFESACQGIAGSLAWVAGQVLPNGSWGD ISKWHGLIDNTTSVIMWGAVHKARGRFAPRWPVGATNAMGQSLKGKFRNVQQMGWFRRHW EMTKHENWVSGFKQGNKPAATIGRAARRFGNLMGRIGIVATLADGVLSAKDQWERDAINP QIGTGERVARATTMGAATAGVGWTGAWVGAKAGAFIGTFGGPIGIAAGTVIGGIIGGIVG SAAGGWLGQFFNNNVVSKALHDG >gi|319976990|gb|AEUH01000278.1| GENE 12 8662 - 9468 419 268 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKHLLLITSGATLGSSALYFWYCEMFSDNPVGEFGRAFDSNRGKNFTALSQPAGGISFVL GALALIGPMIDPDIFSSRHGHSPFWSYWTAVFVLPSMLFMFIALIGIIPFTLPEWMYPEY HAAKREERRRREAAERGESEDRDPFLDDNGVYASQLDNVPIDIPEASGLPSTAAPRSFHA APSAGSPPPPHGAAEAGAPALHDPDAATGSHAGAADAFEITAVIASLPRTGAPTSHGARP ARAHRGPQEDRTTAVKQNTMEQTDRIKE >gi|319976990|gb|AEUH01000278.1| GENE 13 9717 - 10142 453 141 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPHPGIVMAVALLFGSIVLFLWHQEMFSDGHVGRFSRQSNAERPKNITALVAPAIGCMSL AVGFCALSAFVRSYSDPAVEEPSGFWLYWDTFWGALILISLIVAAVGFIPLPLPKWMYPP YHEAKREEQRRRETDERTGGR >gi|319976990|gb|AEUH01000278.1| GENE 14 10366 - 11268 1123 300 aa, chain - ## HITS:1 COG:ECs4802 KEGG:ns NR:ns ## COG: ECs4802 COG2017 # Protein_GI_number: 15834056 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Escherichia coli O157:H7 # 1 297 3 297 300 274 44.0 1e-73 
MTTHTLSGTELTLSSGDYTARVVTVGAGLAGLDWRGHRIVLPHGADQVPPAYMGKVLVPW PNRIAGGAYTWQGRRYSVPVNEPELGTALHGLAAWTDWRVVHADSDSATLGLFVAPTYGY PWPLQAWATYAVDAARGLSVTVSTKNTGQAAAPYGTGVHPYLTVDGHPADSYELTVPASS ALTTDASLIPTGRVAVDEAGVDFREPALIGGRSIDHAFTDIRAGGAWTASILHRGTGIGA AITADTPWVQVYSGEQVSRRGVAIEPMTCPPNAFNSHTDLAVLEPGQTHSMTVGIHGFAL >gi|319976990|gb|AEUH01000278.1| GENE 15 11423 - 12754 2017 443 aa, chain + ## HITS:1 COG:lin2432 KEGG:ns NR:ns ## COG: lin2432 COG3579 # Protein_GI_number: 16801494 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase C # Organism: Listeria innocua # 18 443 20 440 441 378 44.0 1e-104 MPHQISEELIERLRASGDPARTVARNAVTVAGIEAASTDRDRVVATPTVVSDRIDDWKVT SQKKSGRCWLFSSLNLIRSAARERLGLKDFEFSQNYVFFWDKFERANWFLTDVIATAATE DLDGRLLQFLLADVLSDGGQWDMAVSLYVKHGLVPKQAMPETESSSNSHRMNTLLQVLLR RTALELRELVAQGASDEAVAAAKEEALAQVWRILVISLGEPPASFEWEWRDDKGGFHRDG ALTPREFFARYVEADLTQYVCLVDDPRREHPKGRALTVDHLGNVVGGRPIRYINAEMGTI KRLAADSLRAGRPVWFGADCSQQSDRKSGLFVEGLFDFSSLFGVDLAMTKEQRVNTGESA MNHAMLFTGVDVDEAGAPRRWRVENSWGEEPGDKGFFTMDDAWFSEYVFEVVVPVDSLPE ELVGALTEEPMHLPAWDPMGTLA >gi|319976990|gb|AEUH01000278.1| GENE 16 13127 - 14152 1474 341 aa, chain + ## HITS:1 COG:SPBC1683.02 KEGG:ns NR:ns ## COG: SPBC1683.02 COG1816 # Protein_GI_number: 19111850 # Func_class: F Nucleotide transport and metabolism # Function: Adenosine deaminase # Organism: Schizosaccharomyces pombe # 3 337 2 337 339 313 47.0 4e-85 MTTDVTRSFTLSLPKAELHLHLEGTLEPALKLRLAQKHGVDIGQSTIDEVRASYRFNDLA SFLAVYYPAMEVLRDEDDFHDLAAAYLARARQDGVVRAECFFDPQAHTGRGIPIETVIRG YHRAIEEAREQGMSADLILCFLRDLSAESAMETLRAALPHKDMFIGVGLDSDERGNPPTK FAEAFALARDNGLRVTMHCDIDQVGSIDHIRSALLELGAERLDHGTNIVEDPALVAYARD HGIGLTTCPLSNSFVTEEMKGKEIVELLEAGVKVSVNSDDPAYFGGYIADNYFALAERFG LDRAALARIARNSIEISWAPGQDKARLLAGLDAFEAGEGLV >gi|319976990|gb|AEUH01000278.1| GENE 17 14489 - 15187 543 232 aa, chain + ## HITS:1 COG:ML1573 KEGG:ns NR:ns ## COG: ML1573 COG2071 # Protein_GI_number: 15827825 # Func_class: R General function prediction only # Function: Predicted glutamine 
amidotransferases # Organism: Mycobacterium leprae # 27 212 56 234 249 104 40.0 1e-22 MRLWDRLLSAALPTWNVVREYAQEDGPEGARLRARGADAIVVMGGEDVHPALYGRAQGYE GEGRHWYRADLGQLALIGHARATGTPLLGICRGMQLINVAFGGTLEQHIEGAEGTHTNPS ILTDHRFVRHGVQVAPGTELHRALGPFLTGASTVVSSAHHQRVEAAGEGLAVCATAEDGT VEAVEHRDAPIVGVQWHPEDPAADIDQLRALLAHLGARRAPRLERASTTLAA >gi|319976990|gb|AEUH01000278.1| GENE 18 15689 - 16759 740 356 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154507871|ref|ZP_02043513.1| ## NR: gi|154507871|ref|ZP_02043513.1| hypothetical protein ACTODO_00353 [Actinomyces odontolyticus ATCC 17982] # 22 335 69 381 398 177 34.0 1e-42 MSQFHFALVRSTAQNCSDFLRDPSFVRLRRGFFLHPLNVEEQPPVYDVWREVAYHRLQAV RLSSASPPVFVGESALLAHGIPLWESNPNVAIHTAGRVQRYPFPQVRIAGLTIPGCAVRS FTKPPTNQTTSISGLECEDVIDALIRVIASEERLPAFVAACMGIRHLSRFDARSPDASRR REREVKRQILARMRSSPYRRSHILIEQIVDNADAGCESVLEAALLWVVLSVCPYDVRTQF VIDTPDGQFRADWAIIELGLVGEADGAAKLGSTISAFRTAQRAWMARQRALEQEGWTIRR YQWADFEDIPALREQLARAVNPYDLPIPPSRARLWRPLGRCDSQERRFHIRSADGY >gi|319976990|gb|AEUH01000278.1| GENE 19 17391 - 18272 1261 293 aa, chain - ## HITS:1 COG:VC1869 KEGG:ns NR:ns ## COG: VC1869 COG1180 # Protein_GI_number: 15641871 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Vibrio cholerae # 50 281 3 233 246 203 42.0 4e-52 MSTTALPLPTFREQDFQAPQARTVGQGVEGLEELTDLQRSERLKRMREGTLGSIHSWELV TAVDGPGTRMTVFLNGCPLRCLYCHNPDTFLMKDGAPISDEELLGRIARYRRIFRATNGG ITLSGGEVLMQPAFAKRIIHGAKQMGVHTCIDTSGYLGANCDDQMLDELDLVLLDVKSGN EETYKAATGRELAPTIAFGDRIAQRGGPTRVWIRFVLVPGLTDDEENIRQVGQIISRWKN VIDRVEVLPFHQLGKDKWHSLGLEYKLENTKAPSPEATEKVRDYFRSLGFEVH >gi|319976990|gb|AEUH01000278.1| GENE 20 18705 - 18956 464 83 aa, chain - ## HITS:1 COG:SA0218 KEGG:ns NR:ns ## COG: SA0218 COG1882 # Protein_GI_number: 15925929 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Staphylococcus aureus N315 # 27 82 693 748 749 93 76.0 1e-19 MATKTYEERLADMKAAREERGDQDGLYHANINVLEEATLKDAMEHPEKYPNLTVRVSGYA 
VNFVKLTREQQLDVLSRTFHHSA >gi|319976990|gb|AEUH01000278.1| GENE 21 19093 - 21195 3409 700 aa, chain - ## HITS:1 COG:FN0262 KEGG:ns NR:ns ## COG: FN0262 COG1882 # Protein_GI_number: 19703607 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Fusobacterium nucleatum # 7 686 3 682 743 813 58.0 0 MTTDENAWEGFVEGTWSQEIDVRDFIQRNYTPYEGDASFLAGPTEKTLRVWNTLEEKYLS KERKVRILDVDTDTPADIDAFKPGYICEDDDVIVGLQTDSPLKRAMMPNGGWRMVETAIH EAGKEYNPEVKKIFTRYRKTHNDAVFDIYTPRIRAARSSHIITGLPDAYGRGRIIGDYRR VALYGVDYLIEQKQKDKDRYVDQPFSEHWARYREEHSEQIKALKKLKKLAADYGYDISGP AKNAHEAVQWTYFGYLASVKSQDGAAMSIGRLSGFFDCYFERDLKNGVLDEAGAQEIIDA LVIKLRITRFLRTVAYDQIFSGDPYWATWSDAGFGDDGRTLVTKTSFRLLQTLVNLGPAP EPNITIFWDENLPKGYKEFCAFISITTSSIQYESDPQIRQHWGDDAAIACCVSPMRVGKQ MQFFGARVNAAKALLYAINGGRDEMSGKQVMEGHAPVQGDGPLDFDEIWAKYEDMLDWVV GTYVEALNIIHYCHDRYAYEAVEMALHDAEIVRTMGCGIAGLSIVADSLAAIKYAKVYPV RDETGLVVDYRTEGEFPTYGNDDDRADDIAATVVHTIMDKIRAIPMYRDAIPTQSVLTIT SNVVYGKATGAFPSGHKKGTPFAPGANPENGIDTHGMVASMLSVGKLDYNDALDGISLTN TITPQGLGRTQQEQITNLVGILDAGFVPDDSCAYDGTKGY >gi|319976990|gb|AEUH01000278.1| GENE 22 21244 - 21399 78 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWSAAPGCTGTWCLGPLSYGTLGAIKRYQPAMRALCLTPSRAAVGLVESWD >gi|319976990|gb|AEUH01000278.1| GENE 23 21504 - 22814 1818 436 aa, chain + ## HITS:1 COG:Cgl1157 KEGG:ns NR:ns ## COG: Cgl1157 COG0460 # Protein_GI_number: 19552407 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Corynebacterium glutamicum # 4 436 15 445 445 395 55.0 1e-109 MTRPLPELTVGILGAGTVGSQVIRILESSSADFAQRSGAGLSIGAVLVRDPAAPRDVEID PALLTTDPAAAIDGMDLVVELIGGIEPARTLVLRALRQGTSVVTGNKALLAAHGPELYEA AAASGADLYYEAAVAGAVPVVYGLRESLAGDRITKVMGIVNGTTNFILDAMDSQGLGYAD ALAEAQRLGYAEADPTADVEGLDAAAKCSILASLAFHTRVGIDDVEVEGITRVTSEDMAQ ARREGRVIKLLAIAERRTGDDGREGVSMRVHPAQIPADHPLASVDGAFNAIVVEGEAAGR LMFYGQGAGGAPTASAVLSDVVAAAHHRAYGGHAPRESVYAHLPVLDPGSARTRYHVRMR VEDRIGVLADVAGVFADHGASIQAVSQHDDDSESGACSLIVTTHLAREADLRAVVRALGH CAAVREVVSAIRVEGD 
>gi|319976990|gb|AEUH01000278.1| GENE 24 22816 - 23763 1034 315 aa, chain + ## HITS:1 COG:Cgl1158 KEGG:ns NR:ns ## COG: Cgl1158 COG0083 # Protein_GI_number: 19552408 # Func_class: E Amino acid transport and metabolism # Function: Homoserine kinase # Organism: Corynebacterium glutamicum # 8 303 11 300 309 179 43.0 4e-45 MRVHEDHVAVAVPATTANLGPGFDSMGMALDLWDEVEVHATTRATSVVVEGEGADDVEKG DDNLIVRALRLALDRVGAPQVGIRMHCTNRIPHSRGLGSSASAIVAGVVAARALIGDPTV LTPDEVLDIGTEMEGHPDNVAPAVLGGATVAWMGLEGTVKRARAVRLDPPDIIRPVAFIP DFELKTSAARAALPSSVPHADACFNVSRAALLTALLSGASADAGEGPLHGLLMDATEDRL HQDARRPAMEPSLALVDWLRGAGMAAVVSGAGPTVLSLEDVDPQIREDARKAGWRVLALP VAATGARITQGRLAE >gi|319976990|gb|AEUH01000278.1| GENE 25 23873 - 23935 62 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCVDARGKTECGGPHRVYAG >gi|319976990|gb|AEUH01000278.1| GENE 26 24027 - 24354 81 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509279|ref|ZP_02044921.1| ## NR: gi|154509279|ref|ZP_02044921.1| hypothetical protein ACTODO_01804 [Actinomyces odontolyticus ATCC 17982] # 1 89 40 131 616 67 54.0 3e-10 MSNETSAEATTQSLKAMKLPELKALAQSRGLKGISQLRKPQLVELLSNDARPALEPAAPG AAASAQDPVPGAEPASAPAQDAAGAAPRPPGGGERRGGGTGRRGARLVA Prediction of potential genes in microbial genomes Time: Thu May 12 19:15:06 2011 Seq name: gi|319976957|gb|AEUH01000279.1| Actinomyces sp. oral taxon 178 str. F0338 contig00279, whole genome shotgun sequence Length of sequence - 32030 bp Number of predicted genes - 32, with homology - 29 Number of transcription units - 9, operones - 6 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1445 1946 ## COG1158 Transcription termination factor 2 2 Op 1 . 
+ CDS 1680 - 3344 1946 ## Achl_2089 hypothetical protein + Term 3414 - 3451 6.3 + Prom 3377 - 3436 1.5 3 2 Op 2 10/0.000 + CDS 3569 - 3790 313 ## PROTEIN SUPPORTED gi|229819781|ref|YP_002881307.1| ribosomal protein L31 + Term 3811 - 3866 23.4 4 3 Op 1 32/0.000 + CDS 3876 - 4976 1445 ## COG0216 Protein chain release factor A 5 3 Op 2 10/0.000 + CDS 4973 - 5854 295 ## PROTEIN SUPPORTED gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase 6 3 Op 3 4/0.000 + CDS 5860 - 6537 666 ## COG0009 Putative translation factor (SUA5) 7 3 Op 4 . + CDS 6534 - 7649 1754 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 8 3 Op 5 . + CDS 7646 - 7918 298 ## gi|154509269|ref|ZP_02044911.1| hypothetical protein ACTODO_01794 9 3 Op 6 . + CDS 7994 - 8896 1169 ## COG0356 F0F1-type ATP synthase, subunit a 10 3 Op 7 . + CDS 8949 - 9158 457 ## Elen_1040 H+transporting two-sector ATPase C subunit 11 3 Op 8 38/0.000 + CDS 9164 - 9724 648 ## COG0711 F0F1-type ATP synthase, subunit b 12 3 Op 9 41/0.000 + CDS 9721 - 10539 1027 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 13 3 Op 10 42/0.000 + CDS 10584 - 12233 2367 ## COG0056 F0F1-type ATP synthase, alpha subunit 14 3 Op 11 42/0.000 + CDS 12237 - 13166 1302 ## COG0224 F0F1-type ATP synthase, gamma subunit 15 3 Op 12 42/0.000 + CDS 13192 - 14640 2085 ## COG0055 F0F1-type ATP synthase, beta subunit 16 3 Op 13 . + CDS 14640 - 14915 376 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) 17 3 Op 14 . + CDS 14941 - 15342 471 ## Jden_1772 hypothetical protein + Term 15388 - 15437 1.3 18 4 Op 1 . + CDS 15447 - 16136 962 ## COG1637 Predicted nuclease of the RecB family 19 4 Op 2 . + CDS 16136 - 17248 912 ## PFREUD_20890 hypothetical protein 20 4 Op 3 . + CDS 17289 - 18479 1427 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 21 4 Op 4 . 
+ CDS 18530 - 18772 92 ## Arch_0371 hypothetical protein 22 5 Op 1 . + CDS 18968 - 20299 1549 ## BDP_1938 hypothetical protein 23 5 Op 2 . + CDS 20314 - 21306 1167 ## Bcav_1332 hypothetical protein + Term 21328 - 21365 9.2 24 6 Tu 1 . + CDS 21442 - 22332 1075 ## COG3118 Thioredoxin domain-containing protein + Prom 22417 - 22476 1.6 25 7 Tu 1 . + CDS 22625 - 22798 211 ## 26 8 Op 1 . - CDS 22864 - 25059 3203 ## COG0296 1,4-alpha-glucan branching enzyme 27 8 Op 2 . - CDS 25069 - 25239 109 ## 28 8 Op 3 . - CDS 25236 - 27281 2767 ## COG0366 Glycosidases - Term 27292 - 27339 -1.0 29 8 Op 4 . - CDS 27362 - 28462 1579 ## COG0180 Tryptophanyl-tRNA synthetase 30 8 Op 5 . - CDS 28506 - 30797 3406 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases - Prom 30903 - 30962 1.8 31 9 Op 1 . + CDS 30988 - 31836 932 ## COG2086 Electron transfer flavoprotein, beta subunit 32 9 Op 2 . + CDS 31850 - 32030 240 ## Predicted protein(s) >gi|319976957|gb|AEUH01000279.1| GENE 1 3 - 1445 1946 480 aa, chain + ## HITS:1 COG:Rv1297_2 KEGG:ns NR:ns ## COG: Rv1297_2 COG1158 # Protein_GI_number: 15608437 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Mycobacterium tuberculosis H37Rv # 103 477 58 434 440 495 69.0 1e-139 SSRPRRSRRRRAVTQGAVEPARVTIDLPPTTKEAGADAGSSAVEAILDIELPEGDNTAET QRERREQPRRRKLERSDRISNRERRERGDRQDRGQRIGREGGDDQLAPIAGILDVQENHA FVRTSGYLPGSNDVYVTLGNVRRWGLRAGDAVAGAVRLPREGERQRQKYNALVRVDSVNG MAPEEAKSRREFGKLTPLYPSEQLRMETSAKAYTPRVIDLVAPVGKGQRGLIVSPPKAGK TMIIQQIAKAIELNNPEVHLMVVLVDERPEEVTDMRSIVKGEVIASTFDRPASDHTTVAE LAIERAKRLVELGQDVVVLLDSLTRLSRAYNLAAPASGRILSGGVDASALYPPKKFFGAA RNLKEGGSLTIIASALVETGSKMDEVIFEEFKGTGNMELRLSRQLADRRIFPAIDVNASG TRREEMLFKPEELRIMWKLRRVLGTLDQQSGLELVLDKLKETQSNAEFLMLVQKTTPSVE >gi|319976957|gb|AEUH01000279.1| GENE 2 1680 - 3344 1946 554 aa, chain + ## HITS:1 COG:no KEGG:Achl_2089 NR:ns ## KEGG: Achl_2089 # Name: not_defined # Def: hypothetical protein # Organism: A.chlorophenolicus # Pathway: not_defined # 60 405 
41 386 497 107 28.0 1e-21 MIDYVGQSPGQSKDPAKEAKNHWRVVRAFLRLLVAAALLAALAWISVGLAGNRGTGHAAP TGVEVYDEVGVLNQDLLSQETNAMEFARPSRVVYLTLAEVPTGNFNEAVLAFARSNPQLG LIDGANPDKWASGTLVFAVAPKQREAAIYLGEDRDPGESVGKDTIDAMRAEFKGKNWDAG MLAGARAIAPHAGAFHVSGSALFFSVIGALIGLGEIWVFVSNPRKIARAKRDFELHMGNV DRDRATTEAALVSIPPGSQYGAIVEQRFAHYKDEIARLSDSGPDVDVRGALRFSVPARND AKALASAAEEMDLSDDAIVSASDFFSMRGDWEGAWANEIGPVMEDLNSLEELVDTVTDEI KTAEARASRDGIMRWIRTQTDAIMAMRGQLRDGTTTPDEALAALDAISEQTRSCAVDVIN ASLDADTSEYAWKRRLRWQKAQTQPPEDDMAEPDGAYKLNGEFIDYDPSRTVRPNFSVPG VADLELAGFGQAEGIGSVPVFVPLSMMNGRYYDEHSWTPPSSSSSSSSSSTSVGSGSYGS SGGSFSGSGYSGSF >gi|319976957|gb|AEUH01000279.1| GENE 3 3569 - 3790 313 73 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229819781|ref|YP_002881307.1| ribosomal protein L31 [Beutenbergia cavernae DSM 12333] # 1 68 1 68 72 125 83 4e-28 MKKGIHPEYVDTVVTCTCGNSFHTRSTIASGQMRVDVCSACHPFYTGKQKILDTGGRVAR FEARYGKRVRPKK >gi|319976957|gb|AEUH01000279.1| GENE 4 3876 - 4976 1445 366 aa, chain + ## HITS:1 COG:MT1338 KEGG:ns NR:ns ## COG: MT1338 COG0216 # Protein_GI_number: 15840749 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Mycobacterium tuberculosis CDC1551 # 1 362 1 357 357 372 61.0 1e-103 MNQPHDDLASLEPLLAEYAQVEQDMASPATLADPARLRRVGRRYAELGRVKAAADRLAAA RDDLGAAREIAAEDPSFAQEVPALEEEAAAAHEALMRILAPRDPQDALNAILEIKAGEGG EESALFASDLARMYARYAEAHGWSVTEMSATRTELGGFKDVTLAIRAKGSPAPDEGVWAH LKYEGGVHRVQRVPVTESAGRIHTSAVGVFAMPEIEDDEDVVIDPNDLRIDVYRSSGPGG QSVNTTDSAVRVTHLPSGIVVSMQNEKSQLQNKEAALRVLASRLAAEARAKREAEASAQR LSQVRTVDRSERIRTYNFPENRIADHRTGFKAHNLDQVLGGALDPVIASLREADEAERLA QAASAP >gi|319976957|gb|AEUH01000279.1| GENE 5 4973 - 5854 295 293 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase [Acidobacterium capsulatum ATCC 51196] # 9 288 20 290 294 118 33 5e-26 MTVDGTAPRALVDWAARELSAAGVDSPAADARALVEWACGADSLWSAPPLLAPEEVERLR GAVSARARRVPLQHITGRMHFRSLALRAAPGVFVVRPETECVAQAAIDRARTAVGRRGSA 
TVVDLCSGSGAIAIAVATEVPGCRVWAVEADEAAARLAASNIGALAPGRVELLLADATDG AALAHLDGSADVVVSNPPYVPADEMPDQPEALADPAGALYGGSPDGTAVPRLVARRALGL LAPGGALVMEHSPSQSAAMRRAAESLGYCCASTHRDLAGRDRYLVAERPGVTQ >gi|319976957|gb|AEUH01000279.1| GENE 6 5860 - 6537 666 225 aa, chain + ## HITS:1 COG:MT1340 KEGG:ns NR:ns ## COG: MT1340 COG0009 # Protein_GI_number: 15840751 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Mycobacterium tuberculosis CDC1551 # 25 219 23 217 217 154 48.0 2e-37 MNTTPTIVACAPRWAAPDLDKARLAAASGALLVIPTDTVYGIGADAANPGAVARLLAVKG RGRQMPPPVLVADADAVDRLCEDVPEAARRLARAHWPGPLTLVLRARDGLGWDLGDTGGT IALRVPDHPGALALLRATGPMAVTSANTTGRPPATTIDQAVAYFGGAVDLYADAGPTASS TPSTVVAFAGGAARVLRHGALGVDDIAALAPLEAEGRAGEGGNAR >gi|319976957|gb|AEUH01000279.1| GENE 7 6534 - 7649 1754 371 aa, chain + ## HITS:1 COG:Rv1302 KEGG:ns NR:ns ## COG: Rv1302 COG0472 # Protein_GI_number: 15608442 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Mycobacterium tuberculosis H37Rv # 6 353 35 388 404 179 36.0 8e-45 MKVYLLLMCVAIALTLLVTPLVRFVCLKWGIVPQLRSRDLQATPIPRLGGVAMTIALIIT MLIASRIPYMAPLFTTAIPWSVMAAAAGMCALGVIDDIIELDWMAKLAGQILITGGMALG GVQLVTFPIFGLTIGSSRLSLFATVFIVVGIINAVNFIDGLDGLAAGMIAIGAGAFFVYS YAITRLMGAASYATAASLIVIALVGVCTGFLWFNFHPSSIMMGGGAETMGLVLAAGGIIV TGQIDPSLLGKQQILVGLLPILLPLAVIFMPVLDLVVTSIRRMRKGKSPFMADRSHFHDR LMVAGHSHRRVVAILWMWTAIVCLPAVGLLTFDWWRVLIATGGAVGVGVVVTMREFPGVN RIRARAAQEGR >gi|319976957|gb|AEUH01000279.1| GENE 8 7646 - 7918 298 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509269|ref|ZP_02044911.1| ## NR: gi|154509269|ref|ZP_02044911.1| hypothetical protein ACTODO_01794 [Actinomyces odontolyticus ATCC 17982] # 1 84 1 84 97 75 58.0 1e-12 MSHVLALVVVVLVVVAQWFFTRRAMVDRAHFAAWVGGGYVAKILLLSLGLYLPRALGVDV RAAAIGAVVAAIVASAGEAVVMARAGRSGD >gi|319976957|gb|AEUH01000279.1| GENE 9 7994 - 8896 1169 300 aa, chain + ## HITS:1 
COG:Cgl1179 KEGG:ns NR:ns ## COG: Cgl1179 COG0356 # Protein_GI_number: 19552429 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Corynebacterium glutamicum # 49 297 4 265 270 153 35.0 4e-37 MMFTPRRRPLSAESAPKPPATRKPRWYWACLTLLVAVIAATAVPAFSNEPHQPSIDDFFE PAIFGEGTVFEFNRLTLARVIMGLLVCLVLVLVARRASLVPSRGQMAVEAIAGYIRSNVA LEMLGNKNGRKFAGFIGFLFFGVLAMNIAGVLPGIDIAASSVVAVPLVFAVISYATFIGA GIAKQGAGGFFRSQLFPPGLPWPIYFLVTPIEFLSTFVVRPVTLTLRLLCNMISGHLLLG MTYFGTAALLHQLTALSAAGAFTGAAMFVMTAFEVFVAFLQAYIFAILSTVYIKLSIEHH >gi|319976957|gb|AEUH01000279.1| GENE 10 8949 - 9158 457 69 aa, chain + ## HITS:1 COG:no KEGG:Elen_1040 NR:ns ## KEGG: Elen_1040 # Name: not_defined # Def: H+transporting two-sector ATPase C subunit # Organism: E.lenta # Pathway: Oxidative phosphorylation [PATH:ele00190]; Metabolic pathways [PATH:ele01100] # 1 68 5 72 72 63 61.0 2e-09 MGTAAFAYLGYGLATLGPALGIGLLVGKTQDATARQPEVAGRLFTNMIIGAGMVEALGLI GFVLPLIVR >gi|319976957|gb|AEUH01000279.1| GENE 11 9164 - 9724 648 186 aa, chain + ## HITS:1 COG:Cgl1181 KEGG:ns NR:ns ## COG: Cgl1181 COG0711 # Protein_GI_number: 19552431 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Corynebacterium glutamicum # 11 175 17 181 188 98 38.0 7e-21 MFHVLAAAQEKGGLAVVLPPPYEIFWSAVVLLVVLLLVGRFALPRIYRVMDERAAKIEEG LGAAEKAKADQAAAARERDAILRDANAEAHEIRERANEEAKSIVAAGRAEAQDEANRILE VAQRQILAERQAAQISLRAEVGLLASELAERIIGEQLRDTALTSRVVDRFLDELEEDTNA ARDGAR >gi|319976957|gb|AEUH01000279.1| GENE 12 9721 - 10539 1027 272 aa, chain + ## HITS:1 COG:Cgl1182 KEGG:ns NR:ns ## COG: Cgl1182 COG0712 # Protein_GI_number: 19552432 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Corynebacterium glutamicum # 21 271 19 271 271 88 28.0 1e-17 MTTVRTIESVPYSRVLADALAVPGTNTMRVAEDFFGLADIVKEDQKLARALTDPSRSASD KRALVRTAFGPHVTAATSSVVAAMASDHWARPEDVDAALEVLGILAVLNDALRRGALDDV 
REELFQVRYFLQNTRDLRVRLSDLRLGDEHERGDLASAVFASRVSPWTMRLIRRAVGRSH HGRLLHNLRRYAAWAATMQDRLFVTVATAHPMSGAQVERLRSILSKRFDAEIDLAISIDP DVIGGFVLRAGSTAVDATVRTRLADARARIAV >gi|319976957|gb|AEUH01000279.1| GENE 13 10584 - 12233 2367 549 aa, chain + ## HITS:1 COG:MT1348 KEGG:ns NR:ns ## COG: MT1348 COG0056 # Protein_GI_number: 15840759 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Mycobacterium tuberculosis CDC1551 # 1 529 1 527 549 687 66.0 0 MADLSISPEEIRGALDSFMESYRPAQVATEEVGHVIETADGIAHIEGLPGTMANELLRFD DGTLGLAMNLDQREIGAVILGDFGGIEEGQAVHRTGEVLSVPVGEGYLGRTVDPLGRPID GLGEITAVEGRRALELQAPGVMMRKSVHEPLATGLKAIDSMIPVGRGQRQLIIGDRKTGK TAIALDTILNQRDNWKSGDPAKQVRCIYVAIGQKGSTIASVRSTLEDAGAMEYTTIVASP ASDPAGYKYMAPYTGSAIGQHWMYQGKHVLIVFDDLSKQAEAYRAVSLLLRRPPGREAYP GDVFYLHSRLLERCAKLSDELGGGSMTGLPIIETKANDVSAYIPTNVISITDGQIFLQSD LFNANQRPAVDVGISVSRVGGDAQPKAMKKVAGTLKLTLAQYRSMAAFAMFASDLDAATR RQLTRGERLMELLKQPQSTPYPLEEQVASIWTGTHGYLDDLEVTDVLAFERALLDHLRAN SDVLGEIAATGDLSAQTEDALKEAVEGFHSQWIIAGRGAAELDEGEVEAERVREEITRAR PSGAAEAKA >gi|319976957|gb|AEUH01000279.1| GENE 14 12237 - 13166 1302 309 aa, chain + ## HITS:1 COG:MT1349 KEGG:ns NR:ns ## COG: MT1349 COG0224 # Protein_GI_number: 15840760 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Mycobacterium tuberculosis CDC1551 # 1 306 1 301 305 206 41.0 5e-53 MGGQQRIYKQRIASTMTLAKVFRAMEMIAASRIGAARRAATEAGPYEKALTQAVAAVAIH TDLDHPLTRERADTTRVAVLVVASDRGMAGAYAATILRESEKLIADLESGGYEPVIYTFG RRAAAYFRFRSVPIERAWEGESDAPTIETTREVAAELLGRFLDKDPASGVGSVHLVYTRF NNVMSQVPEVRQMLPLSVVDSPHADQEREAERGFHDDPAASFPEYEFIPSAEAVLDVLLP LYVESRIHNVLLQSAASELASRQRAMHTATDNAEELITKYTRLANSARQAEITQEITEIV GGADALNAG >gi|319976957|gb|AEUH01000279.1| GENE 15 13192 - 14640 2085 482 aa, chain + ## HITS:1 COG:Cgl1185 KEGG:ns NR:ns ## COG: Cgl1185 COG0055 # Protein_GI_number: 19552435 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: 
Corynebacterium glutamicum # 10 478 17 480 483 658 72.0 0 MTQETPIAEGRVARVVGPVVDVEFPPDRIPPLYNALLVDVDLSGQGEDEGSFTMTLEVAQ HLGGNLVRAIALKPTDGLVRGAKVTDTGSAITVPVGDVTKGHVFNVTGDVLNLKEGEALE IKERWPIHRQPPAFDQLEARTKMFETGIKVIDLLTPYVQGGKIGLFGGAGVGKTVLIQEM IQRVAQDHGGVSVFAGVGERTREGNDLIKEMGEAGVFDKTALVFGQMDEPPGTRLRIALT GLTMAEYFRDVQHQDVLLFIDNIFRFTQAGSEVSTLLGRMPSAVGYQPNLADEMGQLQER ITSAGGHSITSLQAIYVPADDYTDPAPATTFAHLDATTELSREIAAKGIYPAVDPLASTS RILDPALVGREHYDVATHVKAILQKNKELQDIIAILGVDELSEDDKVAVARARRIEQFLS QNMYMAEKFTGVPGSTVPLSETIEAFKRIAEGRYDEVPEQAFYNCGGIDDLERNWHELQK DA >gi|319976957|gb|AEUH01000279.1| GENE 16 14640 - 14915 376 91 aa, chain + ## HITS:1 COG:RSp0810 KEGG:ns NR:ns ## COG: RSp0810 COG0355 # Protein_GI_number: 17549031 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Ralstonia solanacearum # 7 85 4 83 139 61 46.0 3e-10 MASANHLQVEVVSHEGRLWHGNALAVRVPGIDGSLGILPGRQPLLAQLAVGRVHVDVADG QMVFEVNGGFASVDSDFVTVVADHASVVPQD >gi|319976957|gb|AEUH01000279.1| GENE 17 14941 - 15342 471 133 aa, chain + ## HITS:1 COG:no KEGG:Jden_1772 NR:ns ## KEGG: Jden_1772 # Name: not_defined # Def: hypothetical protein # Organism: J.denitrificans # Pathway: not_defined # 9 131 6 139 146 75 36.0 8e-13 MRPFHVVVWALVVLLLCGACLWAYLNVRARRLSRRVGAFQCWSRPDMHSGWTAGIGVYGV ETLSWYRLMGLRPSPVYSLPRRGLEVSAPIARSADGGVVEVRLAHGDHRYEVAVQRETYN GLVSWVESGPPHR >gi|319976957|gb|AEUH01000279.1| GENE 18 15447 - 16136 962 229 aa, chain + ## HITS:1 COG:ML1155 KEGG:ns NR:ns ## COG: ML1155 COG1637 # Protein_GI_number: 15827583 # Func_class: L Replication, recombination and repair # Function: Predicted nuclease of the RecB family # Organism: Mycobacterium leprae # 1 229 1 220 220 261 60.0 9e-70 MIATCTVDYSGRLSAHLDAAKRVIMVKGDGSVLIHSDGGSYKPLNWMTAPCAVVSEAPGP ADEAEGVTEVWSVQAAKSDDRLVIRVFDVVSDITEDLGVDPGLVKDGVETHLQALLAEQV PQVLGEGWELVRREYPTPVGPVDLMVRDAEGRHVAVEVKRRGGIDGVEQLTRYLSLLGRD SLLEGLTGVFAAQEISKQARTLAEDRGIRCVVLDYDAMRGFDDPESRLF >gi|319976957|gb|AEUH01000279.1| 
GENE 19 16136 - 17248 912 370 aa, chain + ## HITS:1 COG:no KEGG:PFREUD_20890 NR:ns ## KEGG: PFREUD_20890 # Name: not_defined # Def: hypothetical protein # Organism: P.freudenreichii # Pathway: not_defined # 10 355 14 395 410 201 39.0 4e-50 MAAPDAPREVSAARIVSQGLAAPLPDPVAAVEALLAIQGQQPSAIPWAIGVRAEGAARSD IEEAFERVELVRSWPMRGTVHVTSARDHHWLRACLAHRYSTWLAQTSRNGLAQSVVDRAG QLAVQEIARSGPRTRAQLVEVWQDAGIAVGGPGSVGNDAPESPVNRRHLMVRLHIDGVLA QGPVRGNEHLVVDASSLPAAGDIAKGGAAHAEALALLAERYARGHGPVSAQDLARWASLP VSEARRALAAAVEASRGGAVPIVERSRGFEREDLPDLVAGLRRRADGVFALPSFDELHVG YRDRSCLTDEAGERAICPSRNGMFRPIVVRRGRVVAVRLPSGGLEFLDGASGTARSQARA AVERRMRWLA >gi|319976957|gb|AEUH01000279.1| GENE 20 17289 - 18479 1427 396 aa, chain + ## HITS:1 COG:Cgl2593 KEGG:ns NR:ns ## COG: Cgl2593 COG1820 # Protein_GI_number: 19553843 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Corynebacterium glutamicum # 6 363 18 357 388 197 38.0 3e-50 MSDSVLRGRVVLADSVIDDGVIEVRGGAIARVAPSSQYGPDLPDPTDSTFLPGLVDVHCH GGGGESFPNAETPEEALVAVMEHRRHGTTSLVASCVTASADVLRARAATLAGLAGRDEIA GIHFEGPFVSEARCGAQDPAYIVDPDPVLTRELLDICGGHALSMTLAPEKPRAYGEGSVA EALIDGGAIPSWGHTDSNAACARAALEYSRRRFRQARTPRGPRATITHLFNGMRPLHHRD TGPIAEFLSDAARGGAVVEMICDGIHLDPSIVRDVYELVGRDHAVLVTDAMAAAGMPDGR YTLGPQDVVVRDGVARLAQGESIAGGTAHLLDCVRVAVTAAGIPLVDAVYMASEQGARIL GDSSAGALAEGKRADIVEVDAELNVRCVRRRGTVVE >gi|319976957|gb|AEUH01000279.1| GENE 21 18530 - 18772 92 80 aa, chain + ## HITS:1 COG:no KEGG:Arch_0371 NR:ns ## KEGG: Arch_0371 # Name: not_defined # Def: hypothetical protein # Organism: A.haemolyticum # Pathway: not_defined # 3 78 18 93 95 66 46.0 2e-10 MGALGSVARTERGPDGADYQVRYLREAAKEYTCPGCLRPVRVGSAHVVAWPEETRFGLPQ GVGARRHWHTECWRRRLRPT >gi|319976957|gb|AEUH01000279.1| GENE 22 18968 - 20299 1549 443 aa, chain + ## HITS:1 COG:no KEGG:BDP_1938 NR:ns ## KEGG: BDP_1938 # Name: not_defined # Def: hypothetical protein # Organism: B.dentium # Pathway: not_defined # 12 299 36 329 377 76 28.0 2e-12 
MRFKRYPAASCVGVVALVCVLVGLLALTVFRPPARSESSVQSSTDLVMTRDGVLPLVKDD VTVTAASASGADVTLIFGTTQDVKGWIGDAAYTEVVGLEANREALKAQVHDAIGAGATQA QSGSGASNLAEQLAGSDMWLARSSGAGSASLDLVDVPQARSVLAVSTAGAGDVTLTLSWV NDQSNTPAIIAFLGALIFGLVALVLFLTRWQLLRHRAERARRIDERRRADGLETSSIDSH EVAKKAAALREEETARVEAARASAAEPDSASSSDIDAVVEALAEETGGVLDSDADAGYDD PAEWYDAGEAQEEPQQWSPPQDEWPSADDDSSQGGRHGVGEGVIDEDPPQTEPSDTGVID LSAIRPGVTLPSRRALREAREKGEQVLVVDGQEYDTGLIPAVSDDEEAQGGAPTGAADDD ASATGGWTSIMSGWLSDGGKGGK >gi|319976957|gb|AEUH01000279.1| GENE 23 20314 - 21306 1167 330 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1332 NR:ns ## KEGG: Bcav_1332 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 24 322 32 332 334 112 32.0 3e-23 MGALAVCAALAGAVALGACSRELVAESGANSQSAANLDIDRIASVLSATNDAMTKADSEL DTAPLSGRVSQGALEARGAQYALAKASGNAIIPLDFTTSSRIAITNSESWPRAIFHITES DSSSSALPVLEVFVQDSARSEYQLTSWARLLGGTQFVLPPVGTGSAYVTGDQSGFVSSPE MALTQYITMINSGTQDTSVFTADEFTKKYIDDISSLNSSVSAAGTVTAQASATDYPVSGL VLQDGSALVSANFTYTLTYQRTVAGSTMTLGGQTASLSADGATVEGTATVKYIGTVVMRI PSGTAGGQIQVVGGERLIQSVTLDSSSKPD >gi|319976957|gb|AEUH01000279.1| GENE 24 21442 - 22332 1075 296 aa, chain + ## HITS:1 COG:MT1366 KEGG:ns NR:ns ## COG: MT1366 COG3118 # Protein_GI_number: 15840778 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin domain-containing protein # Organism: Mycobacterium tuberculosis CDC1551 # 20 295 14 303 304 81 31.0 2e-15 MNEDKDTQQETEVRAADSRGAVDLAAAPARTAQPDGPGDGLRIPLVTETDEAGFQDVMST SQAVPVVLVLWSQRSLESKPTMAALEEIAREKAGAFQLVEVEIEAAPQIAQAFQVQAVPS VIALVGARPLPLFQGSAAKEQIVPVIDQVLEAAAQMGVTGRVAVSAEQTQEPTPPEHETP LAAEAEGRLADAVAAWERVIELNPRDEAAKAHLSRVRLARRSAEADGAGDPASRADALFA AGDQAGAFQVLLDLIGAASDAEEREALRQRLVDLFRVAGATPEVKKARTLLSMLLM >gi|319976957|gb|AEUH01000279.1| GENE 25 22625 - 22798 211 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWKWLLMIGVFVAVAVWYGAFSSHADLTKKIDREHDPDMRRELLGLQQEIDRGRAGF >gi|319976957|gb|AEUH01000279.1| GENE 26 22864 - 25059 3203 731 aa, chain - ## HITS:1 COG:Rv1326c KEGG:ns NR:ns ## 
COG: Rv1326c COG0296 # Protein_GI_number: 15608466 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Mycobacterium tuberculosis H37Rv # 18 726 22 730 731 837 60.0 0 MTALPNPIPVDDDTLDAVAAGAFYSPHSVLGAHVGDGGITIRAVAHLADAVEVVTPQGSF PAVHERGGVWVAVVPGDQVPDYRLRVTYGQDTNTVDDPYHYLPTLGEMDTYLISEGRHEN LWKVLGAHVRHFDGDMGGSDGVAFAVWAPNARAVRVVGDFNYWDGTASSMRSLGASGIWE LFIPGVGVGARYKFEIQGPGGNWFQKADPLARATEIPPATASVVTDYFHEWGDDEWLANR ARHSVHNGPMSVYEVHVGSWRQGLGYRDLADELVPYVKEMGFTHVEFMPVAEHPFGGSWG YQVTSYFAPTSRYGTPDDFRYLIDALHRAGIGVILDWVPAHFPKDDWALARFDGTPLYED PDPLRGEHPDWGTYVFNFGRREVRNFLVANALYWLEDFHIDGLRVDAVASMLYLDYSRKD GQWRPNQYGGRENLEAIGFIQEANATAYRRHPGIVMIAEESTAWPGVTAPTSGGGLGYGM KWNMGWMNDTLRYMEEDPVNRRWHHGELTFSLVYAFSENYVLPLSHDEVVHGKGSLISKM PGDKWQRLAGLRSLYAYQWSHPGKQLLFMGQEIGQEQEWNESYSLDWWLTDQEGHAGVQA LVARLNAIYTSNPALWDDDYTGFEWIDASDGDHNLISYLRKGTGKDGSRQVLVCVSNFAG NPHEGYRVGLPFGGEWAEILNTDAEEFGGSGVVNVGTQIAEDVPWNGRPASVELRVPPLG AVWLAPVPQKQ >gi|319976957|gb|AEUH01000279.1| GENE 27 25069 - 25239 109 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRRAGALDAPAPPYSFARAGPSRATPAPRGAAPVTRNGDGLGPAPIHWHTRANLEE >gi|319976957|gb|AEUH01000279.1| GENE 28 25236 - 27281 2767 681 aa, chain - ## HITS:1 COG:Cgl1197 KEGG:ns NR:ns ## COG: Cgl1197 COG0366 # Protein_GI_number: 19552447 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Corynebacterium glutamicum # 8 675 4 660 675 543 45.0 1e-154 MSDELLPRIPAIEVFPVVEDGTLPAKATQGEPFPIRATVFREGHDAYAAEAVLVDPDGRE HSRTRMVDIMPGLDRYEAWVCADRPGDWTFRIDTWSDPYATWRHDATVKIDARVDVELML EEGARLMERAAAGEALANPGQSRPSPADVEVLVDAAARLRDANRSPQQRLSAGVSSRVRV IFRDSPLRDLVGSTRSYPLGVARPLALAGAWYEIFPRSEGAFQRPDGSWVSGTLRTAAED LPRISSMGFDVVYLTPVHPIGTTHRKGKNNALEAGPGDPGSPYGIGSAEGGHDAVHPDLG GFEAFDAFVSRARSLGMEVALDLALQCSPDHPWVAEHPEWFTTRADGTIAYAENPPKKYQ DIYPLNFDNDPEGIYNAIRAVIEVWISHGVTIFRVDNPHTKPVRFWQRILDDVHRAHPGT LFLAEAFTRPAMMRTLGAVGFDQSYTYFAWRTYKHEIEEYLEEVSTQTAHLMRPTFWPTT HDILTPQMWDGGTAIFAIRAMLAALGSPTWGVYSGYEFVENTPRGSFEEPNDNEKYEFRP 
RRWADAEPIGISKLITLLNAARSKHPALRQLHQIRIHPTSDDRLIAFSRQIPGRFTDSGQ TDTVICVISLDPHNGVDGSVYLDLAELGLETSGGHFTVVDELDGRTYLWGADNWVSLSPV TRLGHVLSVEQAHTSPWSHHE >gi|319976957|gb|AEUH01000279.1| GENE 29 27362 - 28462 1579 366 aa, chain - ## HITS:1 COG:DR1093 KEGG:ns NR:ns ## COG: DR1093 COG0180 # Protein_GI_number: 15806113 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Deinococcus radiodurans # 40 364 25 348 351 217 45.0 4e-56 MTRSEDALANSTSDASLARSKARSAEIDLAIDQDPSRFRVLTGDRPTGNLHLGHYFGTLR NRALLQDRGVDTWVLIADYQVITDRDGVGPIRQRVLGLIADYLAAGIDPERSTIFNHSAV PALNQLMLPFLSLVTEAELHRNPTVKAELEATEGRAMTGLMLTYPVHQAADILFCKANIV PVGQDQLPHLEQARLIAQRFDKRYGRVDPDQPVFRRPEALLSEVPLLLGTDGQKMSKSRG NTIELGMSADETARILKKAKTDAERLITFDPVGRPEVSNLLMLASLSSGEAPEAIAERIG DAGAGALKALVTESLNEMLAPLRARRADLLANEDHLLAVLAAGNERANAQADETLAQVRG AMRMDY >gi|319976957|gb|AEUH01000279.1| GENE 30 28506 - 30797 3406 763 aa, chain - ## HITS:1 COG:DR0264 KEGG:ns NR:ns ## COG: DR0264 COG1523 # Protein_GI_number: 15805296 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Deinococcus radiodurans # 14 682 8 637 720 568 45.0 1e-161 MSELLVFDTSPTRVRVGEPNPIPTRLGVHPSAGGLDVAVVARNATGVDLCVFDEAGHETQ FALLGPTTGVWHGHIPGLGAGTRYGLRVHGPWDPDGGMFHNSRKLLLDPYGRGVEGAVDL GASVYAHCVDEDLYPDEYPMRRSPLDSAPHMPRNVVVTPSFPIAAKPRIPWDRTILYEVH VKGFTKNMPGVPPELRGTYAGLAHPSCVSYLLDLGVTSIELLPVHAKCDEPFLTERGLTN YWGYSTLSFFAPEPSYATAAARMRGAQAVVDEFRGMVSILHEAGIEVILDVVYNHTCEGG DSGPSLSWRGLDSDLYYRHTSSRPTQIIDVTGTGNTLNMDNPKTIQMVLDSLRYWASDMG VDGFRFDLAATLGRFATGFSPMHPLLVAMAADEVIGSAKLIAEPWDVGPGGWQTGNFPVP FSEWNDHYRGALRNFWLADVRAQASGARVSGPNDLATRLAGSRDVFDHGHGSLRGPRASV NFITAHDGFTLADLTAYDRKHNMANLEGNRDGSNDNRSWNHGVEGTLAASATHPNALTPF TILQDTTIAGLRQRSQRNLLTTLFVSAGTPMLLGGDEFGRTQYGNNNAYCQDDPISWVDW NLAATQQEGIDLVSWLIALRKAHPVLRPDCFATGRPYGADTIPDVSWYTAWGEAMPDSAW SDPAHRTFQMLRSGVPWGDRDALVIINANLQDVTVTLPRGHDLDWLVVMNTVWDSPDDGG 
IGPRRDPRDLGLDVEHIAPRSKIPMQALSMMVLLSDVAAHVGR >gi|319976957|gb|AEUH01000279.1| GENE 31 30988 - 31836 932 282 aa, chain + ## HITS:1 COG:ML1712 KEGG:ns NR:ns ## COG: ML1712 COG2086 # Protein_GI_number: 15827916 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Mycobacterium leprae # 18 278 4 261 266 137 37.0 3e-32 MLIWEADGVCGTLGAMRIVVCVKHVPDAASQRRFEDGRLVRGEDDVLNELDENAIEAAVR LAEDERDAGRAAEVVALTMGPEDAEDAIMRALQMGADRGVLVSDDLLEGSDVVTTASVLA AAIARIADEGGAVDLVVTGMASLDAMTSMLPAALAAKAGMPLLGLARSMDVEGRRVEISR TVDGYDERLGAPLPAVVSVTDQINEPRYPAFAAMRAARKKPIDQWGIDDLVQTPGGEAPT LRRALTSVRRSEEITRSGAGTIIQDSGDAGRALAAYILEVVK >gi|319976957|gb|AEUH01000279.1| GENE 32 31850 - 32030 240 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSAPVLVLADHAAGQARLTPAAGQLLTLARSLTSGGVDALVLTDAIDMEALGALGADRVL Prediction of potential genes in microbial genomes Time: Thu May 12 19:15:58 2011 Seq name: gi|319976949|gb|AEUH01000280.1| Actinomyces sp. oral taxon 178 str. F0338 contig00280, whole genome shotgun sequence Length of sequence - 8825 bp Number of predicted genes - 10, with homology - 7 Number of transcription units - 7, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 + CDS 3 - 425 400 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 2 1 Op 2 . + CDS 425 - 1549 1554 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain + TRNA 1778 - 1861 73.9 # Leu TAG 0 0 + Prom 1780 - 1839 80.3 3 2 Op 1 . + CDS 1872 - 2069 158 ## 4 2 Op 2 . + CDS 2514 - 2984 512 ## COG2020 Putative protein-S-isoprenylcysteine methyltransferase 5 3 Tu 1 . + CDS 3135 - 3608 589 ## COG1225 Peroxiredoxin + Term 3780 - 3813 -1.0 6 4 Tu 1 . + CDS 3992 - 4111 75 ## 7 5 Op 1 . + CDS 4257 - 5045 346 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 8 5 Op 2 . 
+ CDS 5078 - 6721 2204 ## Bcav_2750 putative integral membrane transport protein 9 6 Tu 1 . - CDS 6996 - 7652 791 ## - TRNA 7807 - 7882 81.0 # Lys CTT 0 0 - TRNA 7911 - 7986 81.0 # Lys CTT 0 0 10 7 Tu 1 . + CDS 8117 - 8797 714 ## COG1321 Mn-dependent transcriptional regulator Predicted protein(s) >gi|319976949|gb|AEUH01000280.1| GENE 1 3 - 425 400 140 aa, chain + ## HITS:1 COG:MA3264 KEGG:ns NR:ns ## COG: MA3264 COG1104 # Protein_GI_number: 20092080 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 3 119 312 430 448 69 32.0 2e-12 TAARRRAFASRAAALRQRLLAALPPGCAPTVAPADAVPSIIHLSIPTSRPEAVLMAMDMA GVVVSAGSACHAGVARPSRVVMAMGRDEAGALGVLRVSLGADTTEDDIDALIAALPAAVR AGQALEGAPRARNTRNQEAH >gi|319976949|gb|AEUH01000280.1| GENE 2 425 - 1549 1554 374 aa, chain + ## HITS:1 COG:Cgl1211 KEGG:ns NR:ns ## COG: Cgl1211 COG0482 # Protein_GI_number: 19552461 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Corynebacterium glutamicum # 1 362 1 360 365 307 50.0 2e-83 MRVLAALSGGVDSAVAAAKAVEAGHDVVGVHMALSSQPAQCRIGSRGCCSVEDAGDAARA AETIGIPFYVWDLAEEFEQTVITDFIAQYRAGRTPNPCVRCNEFVKFRELADRARALGFD AVCTGHYAAIVEGEQGPELHRGADAAKDQSYVLAVMGPDELRRVVLPLGGAPSKAWVRAE AERLGLGVSDKPDSYDICFIPDGDTQGFLRAHLGSAEGEIVTPDGRVLGRHGGYWNYTVG QRKGLGIGAPAPDGRPRYVLETRPETNQVVVGASELLTVDRIECSDAVWLAPDDPGDAAS GGLYAQFRAHGRPLPVASLVRRAPGGAGARFVVGLAEGARGVARGQSLVVYRGDRVLGEG TIDATSRAGAALAS >gi|319976949|gb|AEUH01000280.1| GENE 3 1872 - 2069 158 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQTLTLGNSLCGRLGRGGRQPCGEAEVRGGKGPFSQWCLRARRPVANRTRATWRPCFASA GPKCV >gi|319976949|gb|AEUH01000280.1| GENE 4 2514 - 2984 512 156 aa, chain + ## HITS:1 COG:mlr1316 KEGG:ns NR:ns ## COG: mlr1316 COG2020 # Protein_GI_number: 13471367 # Func_class: O Posttranslational modification, protein 
turnover, chaperones # Function: Putative protein-S-isoprenylcysteine methyltransferase # Organism: Mesorhizobium loti # 41 156 49 158 158 58 35.0 6e-09 MLVQMERTPVYNHIPDLCLLGLGAAAVASAWLWPWAVVLPGWLRLVGGAFVVAAALVIGA VLGALRRAATSTNPVDEPTALLASGPFALSRNPLYLAYVLAVLGCALASGSWAALLCPLV CFSVLNWLIIPIEERALRRAFGDEYERYRRSVRRWL >gi|319976949|gb|AEUH01000280.1| GENE 5 3135 - 3608 589 157 aa, chain + ## HITS:1 COG:MT2597 KEGG:ns NR:ns ## COG: MT2597 COG1225 # Protein_GI_number: 15842053 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Mycobacterium tuberculosis CDC1551 # 2 156 5 157 157 179 59.0 3e-45 MTRLEVGQRAPGFTLETTKGALSLADALERSGKGVVVYFYPKASTPGCTKEACDFRDSLE GLAGAGYSVIGVSPDPLSALERFAEAQSLAFPLASDPSRDVLEAWGAWGEKKNYGRTITG VIRSTVVVAKDGTVALAQYNVKATGHVARLRKALGVD >gi|319976949|gb|AEUH01000280.1| GENE 6 3992 - 4111 75 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCAALCCLIPSLVCDMGWGRPTRAAVPVRSVRGDKRYGH >gi|319976949|gb|AEUH01000280.1| GENE 7 4257 - 5045 346 262 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 7 226 2 222 312 137 33 2e-32 MSQPMNPNGAAPAAPALAIRGLVKVYGNLIAVADLSLDIPVGSFYGIVGPNGAGKTTTLT MATGLLRPDRGTAFVYGVDVWGDTAKARSLLGVMPDGMRLLDKLSGPDFLIHVGMLHGVD RATASDRSNQLLQALDLTEAGKKLIADYSAGMTKKIALAAAMIHSPKLLVLDEPFEAVDP VSAANIRQILMDYTKRGGTVILSSHVMATVQQLCTHVAVINHGRVLAAGTTAEVAQGGDL DARFAQLVGGVHNEEGLTWLGN >gi|319976949|gb|AEUH01000280.1| GENE 8 5078 - 6721 2204 547 aa, chain + ## HITS:1 COG:no KEGG:Bcav_2750 NR:ns ## KEGG: Bcav_2750 # Name: not_defined # Def: putative integral membrane transport protein # Organism: B.cavernae # Pathway: not_defined # 45 544 60 539 542 121 27.0 8e-26 MVKQTLVLVLSIIGILYFGGVGAFVYIGLTMSAQSAFAPSMSFYLTLIGPAVFIGWILLP VLFGALDNTLAPDRLSPYVGPTRRLGVGLVAAGGVGFGGALSTMVLLMPAWFNAWRGVPL HALGSLAAALTTLALSLIWGRAVATWFAVRINSAGRRDLVAIIGTMAFFIIVTPMGVWIN YLARNFSEATMNWSTDVALWSPFGAPFGFVESLAAARYAEAGARLVIVVATAVAGWALWM 
RVLQPVMSGSAQPIAPEAQRAIEEGRHLVDESKAVAERVQRVNESRGAGLAETGPYLALG MGPRSATLAARTLHYWFRDPRIMMNLPLMLIFPLMAYLFKTQLPPEAGGFNATLFLYLVP ITTGLVVGALMQYDSTGAWIVISSGMKGREERLGRLAGSLPVLIIQVASYLVYDAVARSD AYSFVFHQVMGFVLFSGAAATTLSFGAWWVYPVQPPGASPLSTKGTGSFLTTMLIQLGSF LGTAVVQAPSFVLIVLGEMGTVPVAASMAFSLVWSVVVLAAGVVIGGKVWDKHAVDALTT IRSWPGH >gi|319976949|gb|AEUH01000280.1| GENE 9 6996 - 7652 791 218 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDDPFTPDTEALYGPEAGSDNASATADGGAEPGAGTGGRVRAYDPQAIADPAARRRAWER LYGFVEYLNSTWGALRNSRGVGDYYIRAGWWTNPIIVAHLAALQGEYAEAWLLDADRGAG ATLMLRAVEHTESVLVRVTGKKGTGFGWSQDAMATKGMRWDRSTPPAAYGSNVAPSAEAR RAEYFRRFLDETNAGEAPTALAFFEGLCEVEDEARADS >gi|319976949|gb|AEUH01000280.1| GENE 10 8117 - 8797 714 226 aa, chain + ## HITS:1 COG:MT2858 KEGG:ns NR:ns ## COG: MT2858 COG1321 # Protein_GI_number: 15842326 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Mycobacterium tuberculosis CDC1551 # 9 220 11 226 228 179 50.0 4e-45 MGEARTGHSPVVEDILKIIWSAQERGEDGVAVNEVAARMGVVPSTASENVARLTEQGLID HKPYRKAHLTAEGRRVAIGIVRRHRLLETYLHEALGFDWDEVHEEAEALEHAVSDRLLER LDRVLGHPSRDPHGDPIPRPDGSVADEAGVLLPLVGEGAGCVVARVSDRDPQALREFDAA GLVPGAALTVVRKEGSAVVVALGGAPAVRLGAEQAGAITVRLAGAD Prediction of potential genes in microbial genomes Time: Thu May 12 19:16:30 2011 Seq name: gi|319976939|gb|AEUH01000281.1| Actinomyces sp. oral taxon 178 str. F0338 contig00281, whole genome shotgun sequence Length of sequence - 8935 bp Number of predicted genes - 10, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 5 - 451 660 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 2 2 Op 1 . + CDS 363 - 578 166 ## 3 2 Op 2 . + CDS 575 - 3508 4553 ## COG0495 Leucyl-tRNA synthetase 4 2 Op 3 . + CDS 3515 - 4552 1335 ## COG0266 Formamidopyrimidine-DNA glycosylase 5 3 Op 1 . 
- CDS 4607 - 5098 634 ## HMPREF0573_10185 hypothetical protein 6 3 Op 2 6/0.000 - CDS 5184 - 5825 874 ## COG1045 Serine acetyltransferase 7 3 Op 3 2/0.000 - CDS 5908 - 6840 806 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 8 3 Op 4 . - CDS 6884 - 7537 769 ## COG1739 Uncharacterized conserved protein 9 4 Tu 1 . + CDS 7708 - 7878 233 ## PROTEIN SUPPORTED gi|227495349|ref|ZP_03925665.1| 50S ribosomal protein L32 10 5 Tu 1 . - CDS 8040 - 8498 715 ## + TRNA 8658 - 8733 64.2 # Arg TCT 0 0 + TRNA 8803 - 8878 67.0 # Arg TCT 0 0 Predicted protein(s) >gi|319976939|gb|AEUH01000281.1| GENE 1 5 - 451 660 148 aa, chain - ## HITS:1 COG:MT0784 KEGG:ns NR:ns ## COG: MT0784 COG0537 # Protein_GI_number: 15840174 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Mycobacterium tuberculosis CDC1551 # 3 129 2 129 133 108 47.0 2e-24 MATLFESIIAGDVPGRFVWADDQCVAFATIEPTSPGHVLVVPRSPYPKWTDAPAPVAAHL ASVAQAIGAAQEDAFGVPRAGMAIAGFDVPHLHLHVIPLRDHTDILLSKGAPASPAELDE AVAALRAALVARGNGANVPADPSSAALG >gi|319976939|gb|AEUH01000281.1| GENE 2 363 - 578 166 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVAKATHWSSAHTKRPGTSPAMIDSNSVAMRLLSPTRARAAQCSHPPARRVVVYPWRRLT HDRTDDRGNRP >gi|319976939|gb|AEUH01000281.1| GENE 3 575 - 3508 4553 977 aa, chain + ## HITS:1 COG:Cgl2961 KEGG:ns NR:ns ## COG: Cgl2961 COG0495 # Protein_GI_number: 19554211 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Corynebacterium glutamicum # 10 973 12 950 952 1060 56.0 0 MTETAPAEPAQRYTAALAGEIEQKWQDRWDALGTFHADNPVGDLAGPLAGRDPYFILDMF PYPSGKGLHVGHPLGYIATDTLARFRRMNGDNVLYTMGYDAFGLPAEQYAVQTGQHPAVT TDQNIANMRRQLRRMGLSHDPRRSLATTDIDYVHWTQWIFLQIFNSWFDPDATAPDGSKG AARPIAELEEGLAEGRVATRDGRPWAELSKKDRAAEVDSHRLAYVSEAPVNWCPGLGTVL ANEEVTAEGRSERGNYPVFKRNLRQWMMRITAYGKRLEDDLDTIDWPEKVRAMQRNWIGR 
SEGATVRFDVPGALGAGSSPSVLEVYTTRPDTLFGATFMVVAPEHPILGGTAGGDADDEA ALTLPQAWPEGTKDAWTGGAATPRQAVAAYRAQASAKSEAERVDEERTKTGVFTGLFGIN PVNGQPVPVFVADYVLWGYGTGAIMAVPAHDDRDWAFARAYDLDVVRTIGPADDPYGPDL DEGAYTGDGVAVDSANDGVDLNGMSKDEAKAAMTAWLAVEGHGRVTTTYRLRDWLFSRQR YWGEPFPIVWDEDGVAHALPEEMLPVELPEVSDYSPRTFDADDASSEPEAPLGRAQEWVR VTLDLGDGPKRYRRETNTMPQWAGSCWYEMRYTDPANSECFADRRNLDYWMGPREGKASG GTDMYVGGVEHAVLHLLYARFWQKVLFDLGHVPDPEPYHRLFNQGYVQAYAFKDARGQYV PADEVEGDEDTGFTWRGEAVAREYGKMGKSLKNIVTPDDMYEAYGADVFRVYEMSMGPLD LSRPWETRAVVGSQRFLQRLWRNAVDEGTGAVTVTEEPADVATRRLVARTIAEVTTEYEN LRVNTAISRLIVLNNHLTSLDAVPREAIEPLVLMLSPVAPHICEELWSRLGHGESLAREP FPVVSDESLLVEEAVTAVVQVNGKVKARIQVPPSISEEDLRAAALAEAPIVKALGGSDPL KVIVRAPKLVNVVAPKN >gi|319976939|gb|AEUH01000281.1| GENE 4 3515 - 4552 1335 345 aa, chain + ## HITS:1 COG:Cgl2944 KEGG:ns NR:ns ## COG: Cgl2944 COG0266 # Protein_GI_number: 19554194 # Func_class: L Replication, recombination and repair # Function: Formamidopyrimidine-DNA glycosylase # Organism: Corynebacterium glutamicum # 1 345 1 269 269 185 34.0 1e-46 MPEGDAIRRLAGTLDELFVGGTVSASSPQGRFASSAARLDGWVVQRVRVHGKHMFIGFVP PVPGRSYEAGVALLEGAAAGSGEPVLGEGEPWPDQWVHIHLGLYGWWRFNGDETVVDEGH GVAHRIPNVPKGQWNGHSETRWGDGFGEARAGEWEPPEPVGAVRLRLFNDHAVADLVGPN RCELISDQERAAAEARLGPDPLDAGARADAEAAERFARVAHSKRRAIGEIVMDQSIIAGV GNIYRADALFLAGISPHRKGANVSLKRLRGLWVLICDLMNRGLAAGRLDTMDPDEAPDPP IEGDEEASRWYVYHRTGRPCLRCGTPIREALMQNRRLFWCPGCQK >gi|319976939|gb|AEUH01000281.1| GENE 5 4607 - 5098 634 163 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_10185 NR:ns ## KEGG: HMPREF0573_10185 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 15 159 18 163 166 74 34.0 1e-12 MEAHTLHAPLDAVPEPPRPEGIEIATLTPYDVPTLAILSLEAYGEPLTAEAIMESSEELR LTFEGAFGATTEDSFVGAWDGGTLVGAILVVRESPWDDPPDGPFVVDLMVAPEYRRRGIA TALVAEAARRCAEWGFDSLSLRIDSRHTGAKELYSVLGFDQTD >gi|319976939|gb|AEUH01000281.1| GENE 6 5184 - 5825 874 213 aa, chain - ## HITS:1 COG:PA3816 KEGG:ns NR:ns ## COG: PA3816 COG1045 # Protein_GI_number: 15599011 # Func_class: E Amino 
acid transport and metabolism # Function: Serine acetyltransferase # Organism: Pseudomonas aeruginosa # 25 207 7 189 258 201 50.0 9e-52 MPFAARVARAIARAVPTKALLLFEEDLMTARRRDPAATSALEIALTYPGVHALWTHRVSH ALWNRGARTPARVLSSVARAFTGVDIHPEARLGRRVFIDHATGVVIGQTAEVGNDVVIFH GVTLGGVAMTPGKRHPTVGDHVMIGAGAKVLGPITIGNGVKVGANAVVVKDVPCGTVAIG VPARLLPKPERDTRDRDLIVDPNYFFDEPALYI >gi|319976939|gb|AEUH01000281.1| GENE 7 5908 - 6840 806 310 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 306 3 305 308 315 56 1e-85 MAHIADNVAELIGGTPLVRIHRVAGEGADVVAKLEAFNPASSVKDRIARSIIDAAESSGA LEPGGTIIEATSGNTGIALSMVGAARGYKVVIVMPSSMSVERRAIVRAFGAELVLTDPKG GVSAAVAEAERIRDERPGSIIASQFTNPANPAVHEATTGPEIWEATGGQVDVFVAGVGTG GTISGVAKYLKGKNPGVRVIAVQPAESPLLTGGAPAPHGIQGLMPNFVPGTYDAGAVDEV VSVESAKALEFARRAAAEEGLLVGISSGAALAGTAAVAARPEFAGKRIVTLLPDTGERYL STALFEGLED >gi|319976939|gb|AEUH01000281.1| GENE 8 6884 - 7537 769 217 aa, chain - ## HITS:1 COG:Cgl2074 KEGG:ns NR:ns ## COG: Cgl2074 COG1739 # Protein_GI_number: 19553324 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 17 193 19 195 217 159 45.0 3e-39 MPSPIATVPARTLATSEIEVKRSRFITTLARTDTARSARGVIDAVKAQHPQARHNCSAYL ISPEGAAPLQHSSDDGEPAGTAGTPMLEAIRASGTWNVTAVVTRYFGGVLLGAGGLVRAY SSSVSEALAALPRVLLVPRDILEVELDPADAGRIEAELRGAGASIVDAQWAGRVRLAVGV DPALRESFDAVLARASRGVAKFRHVDTRRVEVDAAGG >gi|319976939|gb|AEUH01000281.1| GENE 9 7708 - 7878 233 56 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227495349|ref|ZP_03925665.1| 50S ribosomal protein L32 [Actinomyces coleocanis DSM 15436] # 1 55 1 55 56 94 83 3e-19 MAVPKRKMSRANTRTRRSAWKADLTELNTIRVSGREVRVPRRLAKAYKSGLLHIED >gi|319976939|gb|AEUH01000281.1| GENE 10 8040 - 8498 715 152 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTAEIKAAALPIRYGAEVTMPAGMYFLGDPCYAIDEDGVWDEWLAEAEEAGTPAANGAMV ADVDGVPMLGFQTAYGDGGYLSSDGEHLFGVDSGMIGLVPIKLVHRYFRRPVADLNGLGT FIASADPIVCTRMGGTLTFGGTTIETAPDEGA Prediction of potential genes in 
microbial genomes Time: Thu May 12 19:16:52 2011 Seq name: gi|319976929|gb|AEUH01000282.1| Actinomyces sp. oral taxon 178 str. F0338 contig00282, whole genome shotgun sequence Length of sequence - 13080 bp Number of predicted genes - 10, with homology - 8 Number of transcription units - 7, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 9/0.000 + CDS 295 - 1254 1211 ## COG0685 5,10-methylenetetrahydrofolate reductase 2 1 Op 2 . + CDS 1256 - 3571 3088 ## COG0620 Methionine synthase II (cobalamin-independent) + Term 3575 - 3612 2.0 3 2 Tu 1 . + CDS 3708 - 3956 75 ## 4 3 Tu 1 . + CDS 4157 - 4591 708 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 5 4 Tu 1 . - CDS 4863 - 6077 1451 ## COG1171 Threonine dehydratase - Prom 6229 - 6288 1.5 6 5 Op 1 . + CDS 6250 - 6873 762 ## COG1949 Oligoribonuclease (3'->5' exoribonuclease) 7 5 Op 2 . + CDS 6932 - 7246 75 ## + Term 7316 - 7362 12.5 8 6 Tu 1 . + CDS 7726 - 8604 341 ## Achl_2173 single-strand binding protein 9 7 Op 1 3/0.000 - CDS 8671 - 9108 312 ## COG0736 Phosphopantetheinyl transferase (holo-ACP synthase) 10 7 Op 2 . 
- CDS 9116 - 13078 5068 ## COG4982 3-oxoacyl-[acyl-carrier protein] reductase Predicted protein(s) >gi|319976929|gb|AEUH01000282.1| GENE 1 295 - 1254 1211 319 aa, chain + ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 21 307 14 285 296 202 36.0 6e-52 MPSSDDIYACPCGHDVEPTLISFEVMPPRRPELAAPFWETAAELLRVRPDFMSVTYGAGG KDRDNAREVVHRLVRDTPIQPIAHLTCVGNSTADVVATVYDYLDCGVRTFLALRGDPPAD QPDWRPAPDGVRSATELIHLIRTVERRRCDAHPGEKLRGAFKPLTIAVAAFPAGNPAAGT TPDQEVERLLIKQAAGASFAITQLFWDADVYTSFLERARRAGVTLPVVPGIMPATDPGRL RRVGALTGVEAPQRLLDALASCDDEAERAEFGAAFGARLVRGVIDAGAPGVHLYTFNKSR PVLDILDLMGITSGTKENH >gi|319976929|gb|AEUH01000282.1| GENE 2 1256 - 3571 3088 771 aa, chain + ## HITS:1 COG:MT1165 KEGG:ns NR:ns ## COG: MT1165 COG0620 # Protein_GI_number: 15840571 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Mycobacterium tuberculosis CDC1551 # 7 765 9 758 759 728 56.0 0 MAPSTTPFPSATILAYPRVGRGRELKRALEAHWAGRTTAEELAAAHEGLRRANLARLVEL GLGADDASLADAPSYYDHVLDATVLLGAIPPRFAGRSGLGLYFALARGDAGAAPQEMTKW FDTNYHYLVPEIGPDTPISFADDTVVRRYAQAADWGYVTRPVLVGPVTYLALAKAGTAGY DPLDRLDDVVAAYSRALAALADAGAPWVQFDEPALDSDNLPRTRAALTGLAARAYEALAG AAHRPRILVTTPYGDASEAIPALARTGVEALHVDTGRGSLPAGADLSDVVLVVGAIDGRN IWRADLAGARDTLERARALGAKAVTVATTNSLQHVPHDTALEQWDDERLNADLHSWLAFA DQKVEEVVTLARGLDDGWDSVSASIDRASAALAQRATAPGVVRPEVRARTAALGEADRRR EDYPQREAAQRERFGLPGLPTTTIGSFPQTAEIRRARAANARGELSDADYKARMREEIAS VIALQERLGLDVLVHGEAERNDMVQYFAELLDGFAATRNGWVQSYGSRCTRPSILWGDVA RPAPMTVEWSTYAQSLTDKPVKGMLTGPVTIIAWSFPRNDLPLGEVADQIGLALRDEVAD LEAAGIGAIQVDEPALRELLPLDAARHRDYLDWSVGAFRVSTSGVRADTQIHTHLCYSEF GQIIDAIAALDADVTSIEAARSKMEVLPELAEHGFDHGIGPGVWDIHSPRVPGVDELVGL IEAAAESVPLRRLWVNPDCGLKTRGYAETEESLANLVEATRRVRARHSAQD >gi|319976929|gb|AEUH01000282.1| GENE 3 3708 - 3956 75 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MRNPVGSASGAAMALSGNSGNGRGLPFCAGGRDFALDDAIGQVRTPLLRRGGRDFAAARP SATPGSPSAQGPTAIALSQCLA >gi|319976929|gb|AEUH01000282.1| GENE 4 4157 - 4591 708 144 aa, chain + ## HITS:1 COG:PA5360 KEGG:ns NR:ns ## COG: PA5360 COG0745 # Protein_GI_number: 15600553 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Pseudomonas aeruginosa # 48 128 147 227 229 73 46.0 9e-14 MALVNTPAPSAPNADADRPTSFEDFLAVLNGLHLSSGRIDINIDRASVTIDGRDAGLSSK EYELLAYLAAQADRAVSREELFATVWHGAGLDLQSRTVDAHVRRLRKKLSVAPDLISTVR GAGYRFNSAPTVRVRLTRAHALAA >gi|319976929|gb|AEUH01000282.1| GENE 5 4863 - 6077 1451 404 aa, chain - ## HITS:1 COG:BS_ilvA KEGG:ns NR:ns ## COG: BS_ilvA COG1171 # Protein_GI_number: 16079236 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Bacillus subtilis # 3 404 30 422 422 370 48.0 1e-102 MVTKLERSERLSDQTGTDVLLKREDQQKCRSFKVRGAFARMSLLTAEERERGVVCASAGN HAQGVAYSCAHLKVPGTIYLPSNTPLQKRRRIATIGGDWVEQVIVDGTFDAANSLAQEAA AQTGRVYVHPYDDPMTIAGQGTIGVELVEQMPSDTAAVLVPVGGGGLIAGIATWLRAVRP EVAVIGVEPAGAASMHAALAAGEPMTLAKVDAFVDGTAVGRTGDVPFEAARRFVDRVLVV PEGAVCTEMLELYQSDGVIAEPAGALASTAACVYLSDPSDRKAIVGRSDGAIVCVVSGGN NDLTRYAEVMERSLRHEGLRHYFLVTFPQQPGMLRMFLEDVLGSGDDIVHFEYTKKNNRE LGPALVGIDLADPADITGLRRRMEASPLHVEELPLDSEITRLVI >gi|319976929|gb|AEUH01000282.1| GENE 6 6250 - 6873 762 207 aa, chain + ## HITS:1 COG:MT2586 KEGG:ns NR:ns ## COG: MT2586 COG1949 # Protein_GI_number: 15842039 # Func_class: A RNA processing and modification # Function: Oligoribonuclease (3'->5' exoribonuclease) # Organism: Mycobacterium tuberculosis CDC1551 # 4 184 1 182 215 174 51.0 1e-43 MTTLKDPLVWIDCEMTGLDVGADALIEVAVVITGADLTVVDPGIDVLIAPPAAALEGMSD FVRDMHTKSGLLDDLKDGVTMEEATERVLSYIKRFVPEAGKALLAGNSVGTDKMFLEANM PAVIDHLHYRLIDVSSIKELAKRWYRSAFAEAPEKHGGHRALADILESIQELEYYRRVLF PSAPVSRADARRIARDVVALGIAESAG >gi|319976929|gb|AEUH01000282.1| GENE 7 6932 - 7246 75 104 aa, chain + ## HITS:0 
COG:no KEGG:no NR:no MISRREGLCFRGGTVGGGRCGIGSRSAPAANSTEGSRAGRVAAGPGSRAVARVCLGEVRL GKWADRGCAGSFAREAEYSKGYRHFTSVITLAKCGTAVKAVASL >gi|319976929|gb|AEUH01000282.1| GENE 8 7726 - 8604 341 292 aa, chain + ## HITS:1 COG:no KEGG:Achl_2173 NR:ns ## KEGG: Achl_2173 # Name: not_defined # Def: single-strand binding protein # Organism: A.chlorophenolicus # Pathway: DNA replication [PATH:ach03030]; Mismatch repair [PATH:ach03430]; Homologous recombination [PATH:ach03440] # 5 146 4 150 184 82 38.0 2e-14 MQDQMMCIRGRVGSDIRMHSTSGGKLIAKFRLAVPRWRYERSQDDGPGSYVEDEPTWCTV QVWDSFAHNVGYSIQKGQPVIVMGRPLANAWVGKDGELRSEVVVAASAVGHDLSKGCASF FRSPHRPPAVPPQGGAERQDRPPASDGEGQRGAPAAPGAADHLGASAERGAAPVRTGAPR PGGSGVADGGADPRRSEGAPDHQTPGSAPNPRNGVLRGSSNGDRRARPGAGAPRLVYPGP EDVVTGSSATDRSDPGGGLEPARKPAGPVGPVEDGSAVTADAAGRAVAVGAR >gi|319976929|gb|AEUH01000282.1| GENE 9 8671 - 9108 312 145 aa, chain - ## HITS:1 COG:Cgl2440 KEGG:ns NR:ns ## COG: Cgl2440 COG0736 # Protein_GI_number: 19553690 # Func_class: I Lipid transport and metabolism # Function: Phosphopantetheinyl transferase (holo-ACP synthase) # Organism: Corynebacterium glutamicum # 5 144 4 132 135 97 42.0 6e-21 MARGLGVDLVHLPAFAEQVDQPGSRFLQGILTPREARHVRARAAATCPADPSSALARHAG GLWAAKEAFVKAWSAALYASPPPVDPLDVDWREIEVVFDAWNRPGLRLHGRIEREYAASF DGPRPGLLVSISHDGDYAIAEVLIG >gi|319976929|gb|AEUH01000282.1| GENE 10 9116 - 13078 5068 1320 aa, chain - ## HITS:1 COG:Cgl2444_5 KEGG:ns NR:ns ## COG: Cgl2444_5 COG4982 # Protein_GI_number: 19553694 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier protein] reductase # Organism: Corynebacterium glutamicum # 8 822 45 843 843 708 52.0 0 PAAGGASGADVPDEPFTASSAIRALLALSAKIRLDEVGDSDTTESLTNGVSSRRNQLLMD MSAELGLNSIDGAGEADVRTLCATVDKAAKGYKPFGAVLSEAVKERTRKLFGAAGAKASR IGERVTGTWQLGGGWASHVTIEIVLGTREGASTRDGDLATLPVDAPSNAAGVDALIDAAV QAVASRLGVSVTLPSSGGASGGGVVDSAALDAFAETVTGDSGVLAKVARTILSDLGLDAP AQVEDEDEDASRVLDAVAAELGSGWADLVEPRFEAQRAVLLDDRWASAREDVARFGAEGV LSETASFVGAGRAVAEQAAWWASRLEEEDDARAARMRAIAEEALSDASALPHASDVAVVT 
GMTPASIGGAVVAGLLAGGASVVATSSSVSPARLDFAKRLYREHAAAGARLWLVPANLAS YRDVDALAEWIGTEQTKSVGADVVVTKKALVPTLFYPFAAPRVSGTLADAGPASENQFRL LLWSVERSIAALARIGSDTHADHRLHVVLPGSPNRGIFGGDGAYAEAKASFDAIATRWAS EPVWGQRVSIAHPRIGWVRGTGLMGGNDPMVAAVEAAGVRTWSTQEMAEQLLGLSSADVR ARAAEAPVDADLTGGLDSSIDLRALRAQAAAQAAPADDAEPAATVSALPSPTQTSVTHVD PAQWGATTASLDDTIVIVGHGEVGPWGSARTRVQAELGIHTDGTVELTPAGVLELAWMTG LLTWSDTPVGGWYDKDDQLVPESEIFERYRDEVVARSGVRFFHDDGPLHDGHTPEAAVVF LDRDITFTVDDEAEARSYADEDPQFSVASQTAFGEWQVTRKAGAQVRVPRRATLNRRIGG LFPTDFDPARWGVPASMLDSIDPIAVWNLVSAVDAFVSAGFSPAELLRFIHPADLACTQG TGFGGMRSMHKLFVDRFLGEEYPQDILQETLPNVVAAHTMQSYIGGYGSMINPVGACATA AVSIEEGVDKIAAHKADFVVAGAIDDIQVESIVGFGSMNATANSQELLDRGISPRFVSRA NDRRRGGFVEAQGGGTVLLTRASVAVEMGLPVLAVVAHAQTFADGAHTSIPAPGLGALAV ARGGASSPLARSLAALGVGVDDVAFVSKHDTSTNANDPNESDLHTRVGRALGRSAGNPLL VVSQKTLTGHAKGGACVFQVGGIIDAFRSGVIPANIALDCVDDKMEQYSPLVWLRSPLDL SARGPVRAAFATSLGFGHVSGFLALVHPGAFEAVVEREAGPEALARWRSASDARLRAGSR HFEMGMLGHAPLFEPVDARRLPDDPALGAAGDPHEVEAQMLLDPRARLGADGYYHLPDQQ Prediction of potential genes in microbial genomes Time: Thu May 12 19:17:10 2011 Seq name: gi|319976927|gb|AEUH01000283.1| Actinomyces sp. oral taxon 178 str. F0338 contig00283, whole genome shotgun sequence Length of sequence - 4479 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 4478 6065 ## COG4981 Enoyl reductase domain of yeast-type FAS1 Predicted protein(s) >gi|319976927|gb|AEUH01000283.1| GENE 1 2 - 4478 6065 1492 aa, chain - ## HITS:1 COG:Cgl0813_2 KEGG:ns NR:ns ## COG: Cgl0813_2 COG4981 # Protein_GI_number: 19552063 # Func_class: I Lipid transport and metabolism # Function: Enoyl reductase domain of yeast-type FAS1 # Organism: Corynebacterium glutamicum # 96 823 1 717 718 704 53.0 0 AHQVGGRPLEPVCEFLPVHVPFHSPLLSSALDQVRQWASQCGIDAALTESLASAVLVGTV DWPAQVGAGVEAGAKWLLDLGPGATTVRMTRNLVDGTGVGVLPLGTQADRDRAATPGWEP AAPADWSALRPGLITLPDGRTVVDTRFSRLTGRSPVLLAGMTPTTVDPGIVAAAANAGFW AEMAGGGQVTEEVYNENLAGLRAQLRPGHTAEFNSMFLDRYLWNLQFGQARIVSRSRASG APLDGVVISAGIPEHDEALALIEQLRADGFPYVVFKPGTVDQIRKVIAIARDAAPTKIIV QVEDGHSGGHHSWEDLSDLLLATYAQLRAQDNIVLAVGGGIGTPERAADFLTGDWSKRHG RPPMPVDGVLVGTAAMTTKEAQTTRAVKELLVATPGVGADDDSRGWVGEGASRGGMTSGL SHLRADMHEVANAAAAAARIIAEIGSDGAQVRARKDEIVEILSHTAKPYFGDLEEMTYAQ WVRRFADLSFPWVDPTWQIRFHDLLQRVEARLAPVDHGEVPTLFPTADDAANAHEAVDRL LAAYPNAETTFVTPIDSAWFPALCRSYPKPMPFVPVLDDDLIRWWGQDCLWQAQDERYDA DQVRIIPGPVSVAGIDRVDEPVAELLGRFEGAAASRLASRGAEPAAAASRLGNGRAAASE DEWLRHVPFISWTGHLMTNPAAVLDEETVEIRRSGAGVDFVIHLDTAWDSDPRGAEKHAV RELVFPLTLTGEDGAVPVIDEARLPEHMYAMLAATAGVGSTSVAGDRVEALPVMVPSERS EFGEAHYSFTLAPALGFDHAEATGAALPASYELAAWAPDALLGPAWPAIYAALGSAVHDD YPVIEGLLNAVHLDHSMTLVAPPERLVADGVRSIDVVSHVAAVDESSSGRIVTVALELTS GGEPIGTTQERFAIRGRATGNRAPSEAAPFGGASVEVVDTPRSVLRRVTVSAPDDMTPFA IVSGDYNPIHTSYAAAKVAGMDAPLVHGMWLSATAQHAAQAAVAGDAGTHLAGWTYSMYG TVDLNDQVEITVERVGRVVGGGLSLEVTCRIDKQVVSRASACTFAPTVAYVYPGQGIQSP GMGLDERAKSKAVDEVWRRADAHTRSAMGFSILAIVRDNPTEIVARGVTYRHPEGVLNLT QFTQVALATLAIGQTARLREEGVLVPGSAFAGHSLGEYDALAAYAEVFPLETVLDLVFQR GSTMHSLVPRDERGRSDYRMGALRPNQFGVDDAHVVDYVASVAGASGEFLQIVNFNLAGQ QYAVAGTVAGLRALEEDANRRAREAGGKRPFMYVPGIDVPFHSTVLRNGVADFRQKVDER IPDEVDPGKLVGRYIPNLVARPFELTREFAQSILDVVPSEEVRRLLKEPGAWEAAAAEPG RLTRTLLIELLCWQFASPVRWIETQRVLLSSGEAAPGVQGLAVDEVVEVGLGAAPTLANL AARTLGQPEFSASRTEVFNVQRDEARVLRTDVAVVEDEEDEEDEEVEAPPTA Prediction of potential genes in microbial genomes 
Time: Thu May 12 19:17:12 2011 Seq name: gi|319976923|gb|AEUH01000284.1| Actinomyces sp. oral taxon 178 str. F0338 contig00284, whole genome shotgun sequence Length of sequence - 2599 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 705 926 ## COG0331 (acyl-carrier-protein) S-malonyltransferase 2 1 Op 2 . - CDS 752 - 2308 2413 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) 3 1 Op 3 . - CDS 2305 - 2598 295 ## COG1038 Pyruvate carboxylase Predicted protein(s) >gi|319976923|gb|AEUH01000284.1| GENE 1 3 - 705 926 234 aa, chain - ## HITS:1 COG:Rv2524c_1 KEGG:ns NR:ns ## COG: Rv2524c_1 COG0331 # Protein_GI_number: 15609661 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Mycobacterium tuberculosis H37Rv # 8 225 29 246 359 95 33.0 7e-20 MPSFLSSLSSTPYALVFAGQATPWREGLDEVAHDPEIAALLGRVLAASDDLLSPVRRELA TQSVASLPFSLPAAAGEPAVARRVNGPDEAALSVPGIVLSQLGALMDLSRAGVDFASHPP VAFEGHSQGVLGVEVARAWIDGDEARAATVFALARLIGAAAARQTRRLRAAHADGATYMV SVRGVSDALLASLISQSTTTQYPLSVALRNDTDAHVVSGAPADLAALVAAAERA >gi|319976923|gb|AEUH01000284.1| GENE 2 752 - 2308 2413 518 aa, chain - ## HITS:1 COG:TM0716 KEGG:ns NR:ns ## COG: TM0716 COG4799 # Protein_GI_number: 15643479 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Thermotoga maritima # 28 518 26 515 515 486 49.0 1e-137 MSTQNAQGRDAFIDAQDRVVAEAEANAVSRQHAKNKMTARERLAMFFDDGVWHEIGQFIG GSVRSGQIGSAVAAGYGRVQGRMVAAYAQDFSVRGGTLGKVEGDKIIDLIELGIRMRIPI VGIQDSGGARIQEGVVALAQYGRIFKKTCEASGLVPQISIILGPCAGGAVYQPALTDFII MTRENSHMFVTGPDVVAATTGEKVTLEQLGGAAIHNFQSGVAHHMADTEQEAIDYVRSLL DYLPSSCGAEPPRYEYAPNAEDEAAAAAVADLVPTSGRVPYDVRDVVRALVDHGEYVEIQ ELFAPNVTIGFACVDGRSIGVVANQPMNDAGTLDVDASEKAARFVRFCDAFGLPVVTLVD 
VPGYRPGTEQEQAGIIRRGAKVIVAYANATVPMVTVILRKAYGGAYIVMGSKSIGADLAY AWPDAQIAVMGAEGAASIMHRRELAAAKEAGDFDEVKARLVAAYEAETVNPDVSVRSGQL DAIIRPSDTRQVIIDSLHLLASKDQSAPHVKRHDNGPL >gi|319976923|gb|AEUH01000284.1| GENE 3 2305 - 2598 295 97 aa, chain - ## HITS:1 COG:SPBC17G9.11c KEGG:ns NR:ns ## COG: SPBC17G9.11c COG1038 # Protein_GI_number: 19112692 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Schizosaccharomyces pombe # 5 90 1097 1183 1185 68 41.0 3e-12 LRSHRRAAAQSAPAAPAQGNPGDIVSPMQAIVVSLAVSNGDRVEEGQLVAVLEAMKMEKP LLAPRSGVVTGLGVSQGDTVGAGALICHIEAAGQEDQ Prediction of potential genes in microbial genomes Time: Thu May 12 19:17:14 2011 Seq name: gi|319976920|gb|AEUH01000285.1| Actinomyces sp. oral taxon 178 str. F0338 contig00285, whole genome shotgun sequence Length of sequence - 4395 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1452 1956 ## COG4770 Acetyl/propionyl-CoA carboxylase, alpha subunit 2 1 Op 2 . - CDS 1480 - 2340 1211 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 3 2 Tu 1 . 
+ CDS 2687 - 4348 2849 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains Predicted protein(s) >gi|319976920|gb|AEUH01000285.1| GENE 1 3 - 1452 1956 483 aa, chain - ## HITS:1 COG:Cgl0680 KEGG:ns NR:ns ## COG: Cgl0680 COG4770 # Protein_GI_number: 19551930 # Func_class: I Lipid transport and metabolism # Function: Acetyl/propionyl-CoA carboxylase, alpha subunit # Organism: Corynebacterium glutamicum # 2 483 6 493 591 443 49.0 1e-124 MRTINRILVANRGEIALRVIRTAKDMGISTVAVYADQDMTAPHTREADLALALDGDDAAS TYLSADKLLQIAVESGADAIHPGYGFLSENPEFARAIENTGIAWIGPAPGVIEALGDKIS ARRVAQACGVDPVPGLSEPVTARAEVEAFIAEHGYPVVLKRADGGGGRGITIVRSGADLD LFFARHTDPADLGAHFVERFVEVARHVETQSARDSHGNFHVISTRDCSVQRRNQKLVEEA PAPFLPDGAHERLVEWSRALFEHVGYVGVGTCEFLLEPDGSIWFLEVNPRLQVEHPVSEE VTGIDLVREQIRVAQGLTLTPPPPPRGHSFEFRVTSEDPAADLTPSAGRLDEVHWPLGPG IRLELGIEAGDAVQTAFDSMIAKIIVTGSDREHALARARRALAEFSVHGVPTPVPVYRDI INDPDFCGVGGFTVSTRWFETVFAPQHDYSHLSGPAEAEGTRDLPRQTYVIELDGRRMTL TLP >gi|319976920|gb|AEUH01000285.1| GENE 2 1480 - 2340 1211 286 aa, chain - ## HITS:1 COG:lin2018_2 KEGG:ns NR:ns ## COG: lin2018_2 COG0340 # Protein_GI_number: 16801084 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Listeria innocua # 19 276 12 247 253 83 27.0 4e-16 MDGLQRIPLDPRVSAPVYHIASTPSTNSLAVELIADQATPTPHMTTVIATEQTAGRGRLT RTWTSAPGKGLTATTVLSLPTSLALASSGWLIHACALAVRDAVDARLAPLGRTTHTKWPN DVCVDGTRKICGILGEVAPASSPFVNTFVIGYGVNVSMADDERPTPEATALSLEGDQEAA ASPRGTALALLADILVGLDRRITGLINADGDPIASGLAHEATTRCATIGSRIGVADPTDP TGAPQLEGTATALSPIGTLVVLLDDGTTTDVSAGDVTPIANGKHRA >gi|319976920|gb|AEUH01000285.1| GENE 3 2687 - 4348 2849 553 aa, chain + ## HITS:1 COG:MT2552 KEGG:ns NR:ns ## COG: MT2552 COG0488 # Protein_GI_number: 15842003 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Mycobacterium tuberculosis CDC1551 # 1 552 8 557 558 837 77.0 0 
MIKARKAIGDKVILDDVTMAFYPGAKIGMVGPNGAGKSSILKIMAGLDEPSNGEARLTPG YSVGILMQEPVLDEDKTVIENVRLGAAGIFGKLARFNEISEQMADPDADFDALMEEMGKL QTEIDAAGAWDIDSQLDQAMDALRCPPPDQPVSVLSGGERRRVALCKLLIEAPDLLLLDE PTNHLDAESVLWLEKHLASYPGAVIAVTHDRYFLDHVAGWIAEVDRGHLYPYEGNYSTYL ETKEKRLAVQGQKDAKLAKRLKEELEWVRSNAKGRQAKSKARLARYEEMAAEAERTRKLD FEEIQIPPGPRLGSVVIEAKDLSKGFGDRSLISGLSFSLPRNGIVGVIGPNGVGKTTLFK TIVGLEPLDGGELTIGETVKISYVDQSRAGIDPDKTLWQVVSDGLDFIQVGNVEMPSRAY VSAFGFKGPDQQKPAGVLSGGERNRLNLALTLKQGGNLLLLDEPTNDLDVETLGSLENAL LAFPGCAVVVTHDRWFLDRVATHILAWEGTDEEPDKWYWFEGNFEAYEKNKVARLGEAAA KPHRVTHRRLTRD Prediction of potential genes in microbial genomes Time: Thu May 12 19:17:23 2011 Seq name: gi|319976892|gb|AEUH01000286.1| Actinomyces sp. oral taxon 178 str. F0338 contig00286, whole genome shotgun sequence Length of sequence - 26777 bp Number of predicted genes - 26, with homology - 20 Number of transcription units - 16, operones - 8 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 281 - 565 331 ## gi|154509211|ref|ZP_02044853.1| hypothetical protein ACTODO_01732 2 1 Op 2 . + CDS 610 - 2760 2666 ## COG1640 4-alpha-glucanotransferase + Term 2798 - 2827 -0.9 3 2 Tu 1 . - CDS 2720 - 2932 64 ## - Term 3336 - 3389 -1.0 4 3 Op 1 3/0.000 - CDS 3608 - 4066 379 ## COG2190 Phosphotransferase system IIA components 5 3 Op 2 . - CDS 4063 - 4293 293 ## COG1264 Phosphotransferase system IIB components - Prom 4342 - 4401 1.7 6 4 Tu 1 . + CDS 4510 - 6939 2915 ## COG1472 Beta-glucosidase-related glycosidases 7 5 Op 1 . + CDS 7058 - 7681 1027 ## Jden_2535 hypothetical protein 8 5 Op 2 . + CDS 7721 - 8731 1518 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 9 6 Op 1 9/0.000 - CDS 8768 - 10123 1661 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 10 6 Op 2 . - CDS 10125 - 10838 1030 ## COG3279 Response regulator of the LytR/AlgR family 11 7 Op 1 . + CDS 11044 - 12183 1876 ## COG0562 UDP-galactopyranose mutase 12 7 Op 2 . 
+ CDS 12221 - 13153 270 ## PROTEIN SUPPORTED gi|163794676|ref|ZP_02188646.1| 50S ribosomal protein L13 + Term 13272 - 13315 3.6 13 8 Op 1 . + CDS 13394 - 15133 2045 ## COG0733 Na+-dependent transporters of the SNF family 14 8 Op 2 . + CDS 15130 - 15294 198 ## 15 8 Op 3 . + CDS 15326 - 16078 1181 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 16175 - 16227 1.2 16 9 Op 1 . + CDS 16274 - 16585 375 ## 17 9 Op 2 . + CDS 16585 - 18324 1226 ## SACE_1016 hypothetical protein 18 10 Tu 1 . + CDS 18504 - 19040 288 ## 19 11 Tu 1 . + CDS 19294 - 19827 299 ## + Term 19915 - 19943 -1.0 20 12 Tu 1 . + CDS 20009 - 20617 315 ## 21 13 Tu 1 . + CDS 20749 - 21357 310 ## gi|154508127|ref|ZP_02043769.1| hypothetical protein ACTODO_00621 22 14 Op 1 . - CDS 21387 - 22286 1132 ## COG0566 rRNA methylases 23 14 Op 2 . - CDS 22294 - 23358 1332 ## COG0167 Dihydroorotate dehydrogenase 24 14 Op 3 . - CDS 23376 - 24080 716 ## HMPREF0573_11811 hypothetical protein 25 15 Tu 1 . - CDS 24573 - 25205 797 ## Bcav_1880 putative integral membrane protein 26 16 Tu 1 . 
+ CDS 25265 - 26647 1804 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases Predicted protein(s) >gi|319976892|gb|AEUH01000286.1| GENE 1 281 - 565 331 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509211|ref|ZP_02044853.1| ## NR: gi|154509211|ref|ZP_02044853.1| hypothetical protein ACTODO_01732 [Actinomyces odontolyticus ATCC 17982] # 1 94 1 94 94 77 47.0 2e-13 METAQELIEALGGGENIVGIEPCLMRIRVEVRTQRAVVENGLRLPEILAVVRSGAYVQLV AGLKTEEIAADMKSLVDNVPHGTRTGPVGESVSA >gi|319976892|gb|AEUH01000286.1| GENE 2 610 - 2760 2666 716 aa, chain + ## HITS:1 COG:Cgl2245 KEGG:ns NR:ns ## COG: Cgl2245 COG1640 # Protein_GI_number: 19553495 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Corynebacterium glutamicum # 13 702 6 694 706 546 44.0 1e-155 MDTNAPVSSDLGLLRYVADRNGVATGYWDWYGNWIDVSAESLLEVLRALGVPVSPESTVG DLREAARLVDEREWRQTLPPTIVARQGGGYEFPVHVPDGAWVAVQWILEDGRNGHCEQLD RYVPPRTIDDRLVGRATFDVPHWLPLGWHRLVATVEGGLVASSTLIIVPSSLDAPALRGG RRAWGVGAQMYSTRSHASWGIGDAVDLADLAAITADKGADFLLINPVHAPQPISPQENSP YLPVSRRWLNPIYIRPEAIAEYAALPQESRDAIEQLRAQTVDFAVDEDLIDRDRVWEAKR KALEIVFAAARPYRRQSDFDHFIERGGADLANYALWCALVEREETVELPEALSRSSAPAV EMERLEVADRVEFWQWCQWIVSQQLGEAQACARRVGMSMGIMADLAVGVHSRGSERWSHP TLFADGISVGSPPDMYSQQGQNWSQPPWSPKALAESGYVQLRDMLRAALANAGAIRIDHI LGLFRLWWIPEGRCASEGTYVYYDHEAMVGIVLLEAQRAGAVVIGEDLGVVEPWVRDYLR ERGVLGTSVVWFEKDGGGWPLRPCDYRERALAAVNIHDLPPTQGYIRGIQTTLRSEFGLL VDDVETVRAQDRLELEQVEARLREYGLIDSAQPTERELVEALYAYVAQTPSKLVVASLVD AVGDLRPQNMPGTDADLYPNWCVPLCNSEGDEVSIEELPTNERLTALFSLLRGSID >gi|319976892|gb|AEUH01000286.1| GENE 3 2720 - 2932 64 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCAAPCCLIPSLVCDTRGERMLKPGGSGPVGEAADPRRNRRTKPARTPSETHPGGGGVSR WNRAGARRAQ >gi|319976892|gb|AEUH01000286.1| GENE 4 3608 - 4066 379 152 aa, chain - ## HITS:1 COG:BH0296_3 KEGG:ns NR:ns ## COG: BH0296_3 COG2190 # Protein_GI_number: 15612859 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system 
IIA components # Organism: Bacillus halodurans # 2 124 12 133 161 95 39.0 4e-20 MSPAVRSPLPGEAAPLDSVDDPVFAGGVVGPGLALRPPTGERLDVLAPVTGTIVKTTPHA FVVRADDGAAVLVHLGIDTVALRGQGFEAVARQGERVDAGQVVCRWDPSGALDSGLDVVS PVVVLEPVGAAIEPALPVPSAIEAGDLLFTIV >gi|319976892|gb|AEUH01000286.1| GENE 5 4063 - 4293 293 76 aa, chain - ## HITS:1 COG:BU356_2 KEGG:ns NR:ns ## COG: BU356_2 COG1264 # Protein_GI_number: 15616961 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIB components # Organism: Buchnera sp. APS # 4 73 3 72 78 61 50.0 3e-10 MSDAQSIIDALGGWGNIKDLDACITRIRLDVVDADVIDERALRDAGAFDVIIVGDAVQVV VGPDSEEIVDAMNALR >gi|319976892|gb|AEUH01000286.1| GENE 6 4510 - 6939 2915 809 aa, chain + ## HITS:1 COG:TM0025 KEGG:ns NR:ns ## COG: TM0025 COG1472 # Protein_GI_number: 15642800 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 30 661 42 715 721 384 35.0 1e-106 MTTNPGRAWAAADLSLIQAAALLSGSSEWDSRPVPAASVPSFVMSDGPHGVRRQLGDGDH LGIAASEKATCFPTAATVANSWDPALAERMGEALGREARALGVDVLLGPGLNIKRSPLCG RNFEYYSEDPLLAGRMAAGMVRGIQSQGVAACPKHFAVNSQELRRMASDSVVDERTMREI YLTAFEIVVREASPRALMSSYNRVNGTYAHENAHLLTEILRSEWGFDGMVVSDWGGSNSA VDAARAGGSLEMPGPGLYGARQIVAAARAGELDEAAVRARAQEVLDMAAAAAALGEADPV DLDAHHELAVRIAQDSVVLLRNQGRTLPLAEGTGVGVVGDMARTPRYQGAGSSQVNPTRL ESALDVLSDGAAGLAVEGFAQGYERQGGPDDQLIAEAVELASRCGTVLVCAGLDELAESE GLDRQHMCLPDAQNALIGALAATGARVVVVLSGGASVEMPWADGVDAVVHTYLGGQGGAR AAVNVLTGRANPSGRLAETYARSYEDHPTSQWYPATGPLSLYREGPYVGYRYFEAASVPV AFPFGFGLSYSDFEYSGLSVDRQGARFTITNTSDVDGAEVAQLYARAPGGVFGPARQLRG FAKVLVPAGESREVAIPFDDYTFRHWEAGRGRWEVEAGRWTIEVGPHSQSVALRAGLDVE GVPPSPIDPALGHYLDAEVKAATDEEFAALLGRPVPVAREAELLEPDDPLSAMSRARSRL ARFAARRLEAARERADAKGMPDLNILFVLNMPFRALAKMTRGAVSADMVDAVVLAVNGHV LRGAVRLAKGYFANRRADKRTQRELDAPR >gi|319976892|gb|AEUH01000286.1| GENE 7 7058 - 7681 1027 207 aa, chain + ## HITS:1 COG:no KEGG:Jden_2535 NR:ns ## KEGG: Jden_2535 # Name: not_defined # Def: hypothetical protein # Organism: 
J.denitrificans # Pathway: not_defined # 1 190 1 190 213 213 59.0 5e-54 MNALRKWWKNFEDKHQTLAQFIVFFILSNGITVLQLVLMPAFKAVFAGTSLVGTAFQFLP VGVSNGQTIYLFDYPAGAIQSGGGGGLAYFLAVEITLLIAQVINFFAQRSITFKSNSSIL RAALWYAVAYVVITVVAAALQVLYKDPIYAWSISAMGPGGETFADVVTMFINAAVSFWVF FPIFKVIFRDDPAPDQGGTADEGAGAQ >gi|319976892|gb|AEUH01000286.1| GENE 8 7721 - 8731 1518 336 aa, chain + ## HITS:1 COG:CAC1488 KEGG:ns NR:ns ## COG: CAC1488 COG0463 # Protein_GI_number: 15894767 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 5 336 3 338 338 278 40.0 1e-74 MPAPLLTVIVPSYNSQDYLDRAMTSLVGYGDEVEVIIVDDGSKDATGGIADDWASRYGSV RVIHQENKGHGGAVNAGVAAAAGTHVRVVDSDDWLDRKALAAVLDVLREERAAGRELDLL VTNYVYEKQGKAHKAVIRYRNVLPRGRVFGWGQVRRCRYDQYLMMHALTLRTEVVRASGL SLPEHTFYVDYLYSHVPLPHVRTIRYLDVDLYRYFIGRDDQSVNEKVMISRLDQLARVNR LMAQAVPPRDSVDERLWRYMVHYLRINAVVCSVMAQLSGTPGHLALKDAIWLDMEMVNPQ ATAAIVRDPLAALVRHGSPRVIRAGYAVVRAVLGFN >gi|319976892|gb|AEUH01000286.1| GENE 9 8768 - 10123 1661 451 aa, chain - ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 247 442 236 431 433 65 24.0 3e-10 MLESLPDIPRLFTALAEWGAALVYVTAVARASSHDPSPARPVPRWRLVLAAAGGLPAFAI TQELLGRAPLSLWVPGMVLAFALVWGTIRIGTTMDWRWVTHTSARAFVVAELAASLAWQV VVYFHAAALIWRPEALGGFTAIAAACLGTVHYCERRVLRQGALPALRRGDLVAALVIGVA VFALSNLSFVSTQTPFSGRAGLEVFYIRTLVDLGGYAILFAQFERIQQSAMERELASIQA SLEAQHHQYLSARADMEQVARAHHDLKHQVAAIRAELDPGRAAASFEELESQIARIGQQY HSGNPVLDVVLTTKGRVCGAEGINFTAVADGSLLAGMSSMDIATLFGNALDNAIEASRRV PDPAKRLIELALFRRGEMVVIRVDNWFDGRLSTDAGGRLTTIKADGVRHGWGVKSIQWTA RKYGGQAATRAEDHWFTLTVLLPSASSLQER >gi|319976892|gb|AEUH01000286.1| GENE 10 10125 - 10838 1030 237 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T 
Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 235 1 232 234 103 24.0 2e-22 MVRIAVVEDEAASRALLSDYIRRYGDEHGLRFDVTHFEDGAALIGDYKPVYDIILMDIQM EHVDGMTAARAVREVDQEAILVFITSAPQFAINGYQVGALSYLLKPLPWFAFSQELDRCL EALAKRQGASVLLQSGAAAHRVPVADIVYVESIKHRLTVHTTDGAAISIVSTLKAMEARL EGMDFFRSNSCYLVSLRHVRGVADQECLMTNGDRLRVSRPRKKAFMSALAAYVGGIR >gi|319976892|gb|AEUH01000286.1| GENE 11 11044 - 12183 1876 379 aa, chain + ## HITS:1 COG:ML0092 KEGG:ns NR:ns ## COG: ML0092 COG0562 # Protein_GI_number: 15826927 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-galactopyranose mutase # Organism: Mycobacterium leprae # 2 376 6 382 413 469 61.0 1e-132 MDLLVVGSGFFGLTVAREAAERFGMDVTVIERRDHIGGNAHSSIDPATGVEVHRYGTHLF HTSNERVWDYANRFTAFNSYQHRVYANHGGVVYPLPINLGTINQFFGSAMGPQEARQLIA SQAGEAGGEPSNLEEKAISLIGRPLYEAFIKGYTAKQWQTDPTDLDPSIITRLPVRFTYE NRYFADTHEGLPVGGYGAWFENMVDHRRITVRTGTDFFDLSQPYSKAAVGQVPVVYTGPI DRYFDYEAGELSWRTLDFETEVVGVPDYQGCSVMNYSDESVPFTRIHEFAHLHPERARPG AENTVIMREYSRFATRGDEPYYPVASPADREVLDAYRQLAAGEDRVLFGGRLGSYKYLDM HMAIASALNRADEVVAQWR >gi|319976892|gb|AEUH01000286.1| GENE 12 12221 - 13153 270 310 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163794676|ref|ZP_02188646.1| 50S ribosomal protein L13 [alpha proteobacterium BAL199] # 5 307 4 309 312 108 29 4e-23 MVASIIWSICAIGAVIGIGWLARRFDRVGQEAATSLAQVTYWVASPAMVFHAEAGTDVSS VFGRPLLVAAASGAGAALFFVAVARAWPRLRGPDLALGAMAASLNNGAYIGIPVAVYVLG DASAVLPVIVFQVGFFTPMFFVLAEVSAAGRTPSPCAVARLVVTNPIVVAAVAGLVFSAS GAAMPVLVDVTASMLGQAAPPMILLAFGISLRGGGARVRGGAGAVLASATKLVVQPLVAL SVGMALGLRGQALMSVTVMAALPTAQNAYIAATRAGGGVEIAQGTVLITTFASLPVTIGV AALFHALSAA >gi|319976892|gb|AEUH01000286.1| GENE 13 13394 - 15133 2045 579 aa, chain + ## HITS:1 COG:PM0718 KEGG:ns NR:ns ## COG: PM0718 COG0733 # Protein_GI_number: 15602583 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Pasteurella multocida # 8 498 5 493 502 444 48.0 1e-124 
METTARRTTKRESWSGQTGFLLAAIGSAIGLGNIWRFPGTAYANGGGAFLVPYLVALVLV GIPMLWLDYAMGHKLRGSPPWALRRLAAGGEVLGWFQTFVCFVILVYYGAVIAWSAQYMV YSVNQSWGEDPTSFFTDTFLRVVPGDQFSWEPVAAVAVPLVLIWVLVLAIIGRGIARGVE AANKVFLPLLVVLFLIMVVRALFLPGAVEGLNAFFSPHWESLANPQVWLAAFAQIFFSLS VGFGIMMTYASYLKRRSNLTGTALVAGFANSSFEILAGIGVFSAIGFMAHQQGVGVGDVQ GVSGPILSFVTFPKIISMMPGGAIFGVLFFASLTLAGVTSLLSLLQVVSGGLQDKFGLSP ARAALVFGVPATVVSVALFGTRSGLNTLDIVDNFINNIGVVSSAILVAVLAAVAPPRLRG LRRHLNSVSSVKVPRPWEPLVGVVVPLVLLVMMGITAVTLVQEGYGSYAPGLVAVFGWGS VALAVVGAVAFTLLPWRHRLAAPSIRAIIDADEAGAAVQDGSGEPRDPGDRGGGQPDGNA APGGSADNAAPGGSDDNAAPGGCDGPGDDGETNGEGARA >gi|319976892|gb|AEUH01000286.1| GENE 14 15130 - 15294 198 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSAIAVLIMVVTILLVWGGLVASVVLLQVLPVPPDDPAESSDRHDGAPGDPATD >gi|319976892|gb|AEUH01000286.1| GENE 15 15326 - 16078 1181 250 aa, chain + ## HITS:1 COG:L165449 KEGG:ns NR:ns ## COG: L165449 COG0561 # Protein_GI_number: 15672552 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Lactococcus lactis # 2 245 3 248 252 160 35.0 2e-39 MKLVAFDLDDTLAPSKSPLPPLMDQRLQSLLDHVPVCIISGGQIGQFRAQVLSQLHASDG QLSRLHLMPTCGTRYYRYEDGQWAMKYAHDLDPQVRDRAIASLERRAKELGLWEEDTWGP IIEDRGSQITYSALGQEAPLDAKRAWDPDGAKKALLRDAVAPDVPELDVRGGGSTSIDIT TRGIDKAYGMGKLVEATGIAPQDMLFIGDRLDAEGNDFPVKAAGYPTRAVADWHECAEAI AQIVGELGAR >gi|319976892|gb|AEUH01000286.1| GENE 16 16274 - 16585 375 103 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTDLYLVAGDLETLYERVGYVVEDLGPVVVEEVGSYIHGGMPGGQSADLGVQAADAIDKR VGGVVSGLSDFCVNLADMIAQFAATEDANTVAFQNIAHSSGVN >gi|319976892|gb|AEUH01000286.1| GENE 17 16585 - 18324 1226 579 aa, chain + ## HITS:1 COG:no KEGG:SACE_1016 NR:ns ## KEGG: SACE_1016 # Name: not_defined # Def: hypothetical protein # Organism: S.erythraea # Pathway: not_defined # 1 557 1 537 566 137 27.0 2e-30 MVAWSDVLNWRADYIEAARQQAEDALRTARAQVAELEDHQRHIQSQGDSISAMRGALGSQ QSCLDHGVNYLAEYMLACAEAVDGVRRVKAKVESALEISEEIGYPIHDDGSVDSFSRKYN HVPGRPLAHIQTEEDTVAEAKHTELKDCVADALRIADEVDAEFTKRLRPLAMGTFARAEG 
RASISPGLPNDADESWSPSEVSAWWAALTDEERQACIKRDPLKYGNLDGIDMASRDKANR LALYGREDADGNHVPGTGLLDQAQAKFDEAKSRYESKLHNGMRLGEANAEVEAEFYRAQA DLKDLQKIDEQLRTKQASDGTPLSLLALDTSGEQVKAGVAVGDVDHAQHVANFTPGMGTN VRDSLANYVDVADRMRDNAEEVTQGVKKKDVAVVAWLNYDAPTDVTKTFDPSVAGTEKAH AGADRLAGFLTGVRSWRDEQGGSLHLTSVSHSYGSTTAGLAMLQMGEGVVDEHIYLASPG SGAHSVGALGVDPSHVWVSAVPEGDSLVQGRGPDWSDFARDPQELEGIQHLSGDATGAQG YVPNPWSPIANHSSYFAPPKPGERNEVFDDVCRVVVGVK >gi|319976892|gb|AEUH01000286.1| GENE 18 18504 - 19040 288 178 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGFARALAEKGGGTIGVSPLRLVRYCWDWGPGQERGWSFRSETLYVVSVTDADIDEIAA QELSGLPYKGTRGTVQKDGSFVLRSGDATNGGQLQVNYFPDGRSSLHYESGCRPSDGSMG DLNEYVLPSTEEVFSDLVVYPAFDENTGDPNPPPSTDTGQSGQSDQSGGSGDEQGEDQ >gi|319976892|gb|AEUH01000286.1| GENE 19 19294 - 19827 299 177 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGFARALAEKGGGSIGFQPPRLVRYCWDWGPGEERGWSFRSEILYVVSVTDADIDEIAA QELSGLPYKGTQGEPSEDGSFTLRSGDAANGGQLQVNYFPDDRTSFHYESGCRPSDGSMG DLNEYVLPSTEEVFSDLVVYPAFDKDTKQPNPPPPAWGDSGQSGQSGGSGDEQGEDQ >gi|319976892|gb|AEUH01000286.1| GENE 20 20009 - 20617 315 202 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPQAETLFAQRESIEEYLRSEEPVLAGFARALAEKGGGTIGVSAIRLVRYCWDWGPGEER GWSFRSETLYVVSVTDADIDEIAAQELSGLPYKGTQSPMHGDGSLILRSGDSANGGEMEI FYFPGRRSSLHYESGCRPSDGSMGDLNEYVLPSTEEVFSDLVVYPAFDKDTKQPNPPPST DTGQSGQFVQSGGSGDEQGEDQ >gi|319976892|gb|AEUH01000286.1| GENE 21 20749 - 21357 310 202 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154508127|ref|ZP_02043769.1| ## NR: gi|154508127|ref|ZP_02043769.1| hypothetical protein ACTODO_00621 [Actinomyces odontolyticus ATCC 17982] # 7 199 38 225 226 63 27.0 6e-09 MPQAETPFAERVSIEEYLRSEEPVLAGFARALAEKGGGSIGFQPARQLQICSDRGRGEEY GWRFRSETLYVVSVTDADIDEIAAQELSGLPYKGTQSPMHGDGSLILRSGDSANGGEMEI FYFPGRRSSLHYESGCRPSDGSMGDLNEYVLPSTEEVFPGLVVYPAFDEDTGDPNPPPST DTGQPGQSDQSGGSGDESGEDQ >gi|319976892|gb|AEUH01000286.1| GENE 22 21387 - 22286 1132 299 aa, chain - ## HITS:1 COG:Rv0881 KEGG:ns NR:ns ## COG: Rv0881 COG0566 # Protein_GI_number: 15608021 # Func_class: J Translation, ribosomal structure and 
biogenesis # Function: rRNA methylases # Organism: Mycobacterium tuberculosis H37Rv # 13 281 23 275 288 150 38.0 2e-36 MLIPLTGPDLSADPRLDDYTRLKDVKLRARLEPERGLYMAESTSVITRALDAGHSPRSFL MAQRHLPGLAGAIESATGSADGGPVPIFTAPEEVLEQLTGFHLHRGALAAMHRPALPRAD ELLATARGGAPARRVVVLEDLVDHTNVGAVFRSAAALGVDAVLVTPSCADPLYRRSVRVS MGTVFQVPWTRLEAWPDMGALHGAGFTVAALALSDEAVPLDDFAALPVLAAPGAKLAMVM GTEGDGLGRRTIAASDYTVRIPMEHGVDSLNVAAASAVVFWATRGVGARTSGPAAGVCR >gi|319976892|gb|AEUH01000286.1| GENE 23 22294 - 23358 1332 354 aa, chain - ## HITS:1 COG:STM1058 KEGG:ns NR:ns ## COG: STM1058 COG0167 # Protein_GI_number: 16764417 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Salmonella typhimurium LT2 # 18 341 14 326 336 229 40.0 7e-60 MIGSAYRWAFQHFISKSDPERAHHLGLGAIALAGAFAPTRGLMRATVGHLDPRTAPTRVV DGAEQPLMIGPRALASRMGLAAGMDKNAEAVLGLAALGFAFVEVGTITPRPQPGNDAPRL WRLVDQRGLRNRMGFNNDGADAAAVRLRALRSTKAGRSAVVGANIGKNKTTPEEDAPWDY AYCARVLAPWVDFIVINVSSPNTPGLRDLQSVDRLRPIVEAARDGSRQACPDRDIPLFVK IAPDLSDEEIVGVCGLSRSMGLAGVVATNTTVDHDLGQGGVSGAPLRERALDVIRLVAAN LDDSQLLIGTGGVSSPADARRMLGAGADLVEAFTAFIYEGPTWPGALARALQRP >gi|319976892|gb|AEUH01000286.1| GENE 24 23376 - 24080 716 234 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_11811 NR:ns ## KEGG: HMPREF0573_11811 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 4 120 44 176 219 73 35.0 8e-12 MPTVIANPVPGLYGHSGWLDETGIRTLLRPLAASGSECSDYVRVQGIGSAVAERLLALLP AEQLEDRQNNAPALGDLLSAAVAHPGVRLCGYMIAPPRWDERLTIDAVLVPASDLGSSSD LLPGSGPLDADRRSGPAALGTGSPIEGVGAPDGAGAPGAISAPNGSDALATASAGDVQSG IPDGARPTGMPPREEHWLRIRDFLGLAANTVAPDELSCACAGPGGALAWWAWWD >gi|319976892|gb|AEUH01000286.1| GENE 25 24573 - 25205 797 210 aa, chain - ## HITS:1 COG:no KEGG:Bcav_1880 NR:ns ## KEGG: Bcav_1880 # Name: not_defined # Def: putative integral membrane protein # Organism: B.cavernae # Pathway: not_defined # 1 205 1 204 205 134 38.0 2e-30 MFSRKKHADDAHAPAEEPAVPTASGKKGRPTPTRKQAQARNNRPLVPADRKEAKRRASQA 
RNARFRAEQQALITGDEQHLPARDKGRVRRFVRDWVDARWSFGEFVMPLILLSLVLVMVL SAFADRMSLQVASGLLIGSTILMYGSFVVVILESTIVWRRIRKRLAERYPNDEIPRGTWY YCFSRMLFLRRWRSPRPQVARGEFPKTKKK >gi|319976892|gb|AEUH01000286.1| GENE 26 25265 - 26647 1804 460 aa, chain + ## HITS:1 COG:MT2598 KEGG:ns NR:ns ## COG: MT2598 COG0624 # Protein_GI_number: 15842054 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Mycobacterium tuberculosis CDC1551 # 7 437 15 440 470 342 46.0 1e-93 MNDTHASGTTAATDALRARLDEAFPALLDELKALVAIPSISSDPDHRADVDASAEHLRQR FEALGMSARVLSEDAPDGTAGMPAVVARGPRVEGAPTVLLYAHHDVQPTGDVSRWSLSPF EAEVRGDRIYGRGASDDGAGVVVHLGCMALLGERPPVNVVVYVEGEEEIGSPSFTAFLDA HRDELAADVIVVADSSNWKAGEPAVTSTLRGNAIATVDVTVADHAVHSGAFGGPLLDSVV VSSMLISSLFDADGSVAVEGLGGCDDADVVWDEADFRAAAGVVDGVRLAGTGDLAARVWT KPSITVIGFDARSVRDASNTISPHTRFRLSLRTVPGADPDEALDKLAAHLRAHAPFGARV EVQKNDGGNGFQADMDSPVASLLHECLTEAWGTPSVNIGVGGSIPFIADFQRIFPGAQVV VTGVEDPLTNAHSEDESQSIPDLKNAILAEALLLTRLPRL Prediction of potential genes in microbial genomes Time: Thu May 12 19:18:39 2011 Seq name: gi|319976888|gb|AEUH01000287.1| Actinomyces sp. oral taxon 178 str. F0338 contig00287, whole genome shotgun sequence Length of sequence - 2376 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 68 - 157 63 ## 2 1 Op 2 . + CDS 154 - 546 580 ## COG0784 FOG: CheY-like receiver 3 1 Op 3 . + CDS 543 - 1598 1278 ## COG0547 Anthranilate phosphoribosyltransferase 4 2 Tu 1 . 
- CDS 1744 - 2376 630 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases Predicted protein(s) >gi|319976888|gb|AEUH01000287.1| GENE 1 68 - 157 63 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWEAPLVARMGPHNTRLTAAPAHHEGVTK >gi|319976888|gb|AEUH01000287.1| GENE 2 154 - 546 580 130 aa, chain + ## HITS:1 COG:MT3230 KEGG:ns NR:ns ## COG: MT3230 COG0784 # Protein_GI_number: 15842719 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Mycobacterium tuberculosis CDC1551 # 2 128 4 130 133 98 42.0 3e-21 MSGENVNILVFSDDITRRRAVTEGVGLRASADTPRITWTEAATAFGVRDAFDQGDFALLV LDAEAKKEGGMSIAQDLMETREDVPPVVLLTARPQDDWLASWAGACATVPDPLDPLVLQE TIAEALRGAK >gi|319976888|gb|AEUH01000287.1| GENE 3 543 - 1598 1278 351 aa, chain + ## HITS:1 COG:ML0883 KEGG:ns NR:ns ## COG: ML0883 COG0547 # Protein_GI_number: 15827406 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Mycobacterium leprae # 7 350 20 363 366 211 43.0 2e-54 MSGAPQWSDVIARLNGGASLSYEDAYWVMDQVMTGELGDARLASFLTSMAIKGATVDEIH GLADAMQDHALAVDLPSRSLDVVGTGGDGFGTVNISTMSAIALAAAGIPIVKHGNRASTS KSGSADVLEAMGVSLDVDAARQREIFDRIGIAFLFANKVHPSMRFAAPVRRALGFPTAFN VLGPLTNPARVQACAIGAAAEASARLMAGVYASRGLSAIVFRGRDTGLDELSTVGVNQAW IVSGGAVSEVEFDASEMFDMEPARIEDLRGGNARQNAEAAREVLSGGGERAVRDAVCLNA AAGVLAWEGLGEAVDADSYAPRLGEAVERARRVLDSGDGAALVEDWAALSA >gi|319976888|gb|AEUH01000287.1| GENE 4 1744 - 2376 630 210 aa, chain - ## HITS:1 COG:MT2247_1 KEGG:ns NR:ns ## COG: MT2247_1 COG0847 # Protein_GI_number: 15841682 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Mycobacterium tuberculosis CDC1551 # 14 183 30 202 232 122 47.0 5e-28 GALPPALEGLGTPLERADWLVVDVETTGLGRACEITEIGALRMRGTQVVDEFEELVRPSA PIPPFIEKMTGITTDMVAGADPVQAVYPRFAAWAGAGAPGRVLVAHNADFDMGFLARAAR ACSMPWRTPPSVDTLSLARLALPRPVVANHRLGTVARHFRTATAPGHRALGDARATGEVL CGLAGLAAEAGFGHVEDLVAMCRWYAARRP Prediction of 
potential genes in microbial genomes Time: Thu May 12 19:18:45 2011 Seq name: gi|319976882|gb|AEUH01000288.1| Actinomyces sp. oral taxon 178 str. F0338 contig00288, whole genome shotgun sequence Length of sequence - 8866 bp Number of predicted genes - 6, with homology - 4 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 122 77 ## 2 2 Op 1 . + CDS 182 - 1513 1444 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 3 2 Op 2 . + CDS 1519 - 1596 71 ## + Term 1679 - 1711 0.3 4 3 Op 1 . + CDS 2330 - 3544 1876 ## COG0205 6-phosphofructokinase 5 3 Op 2 . + CDS 3555 - 4979 2134 ## COG3200 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase + Term 5039 - 5084 1.2 6 4 Tu 1 . + CDS 5233 - 8589 2129 ## COG0457 FOG: TPR repeat Predicted protein(s) >gi|319976882|gb|AEUH01000288.1| GENE 1 2 - 122 77 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTNGGRMDSLTERRSPSDPGPGGRAELARALARRLEGALP >gi|319976882|gb|AEUH01000288.1| GENE 2 182 - 1513 1444 443 aa, chain + ## HITS:1 COG:ML0892 KEGG:ns NR:ns ## COG: ML0892 COG0204 # Protein_GI_number: 15827414 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Mycobacterium leprae # 12 228 12 228 244 234 53.0 3e-61 MAFYQALKATGGPILKAAYRPWIRGKENIPAEGPAILASNHNAVWDSVFLPMMLDREVVF MGKADYFTGTGFKGWVTKEFMRAVGTIPVDRSGGRASEAALNAGLKRLREGELFGIYPEG TRSPDGRLYRGKTGVARLALLSGAPVIPVAMIGTHAAQPIGQRIPSRTNIGMVIGEPLDF SRYKGLHKDRYVLRAITDEIMYNLMLLSGQEYVDLYAADVKAKMAEEGEFDGPVPSNGRA APGGRIAPEVEVPGPPEEDADADSADGQDAPGAGEESEAPAPAEDAAPVEEAAPAAEAKA DSESPSPASDAAPVEEAAPVEEEAPAEEVPPVKETASAAEARAASESPAPASDAAPVEKA APAAEAKADSESPAPDTAPVEEAAPTPEAKADSESPAPASDAEPVEKTVPAAAAKANSES TASDTGGKAGGAKRKKSTKRKKR >gi|319976882|gb|AEUH01000288.1| GENE 3 1519 - 1596 71 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIGVLFQGGVARFANRTDRPEPGQG >gi|319976882|gb|AEUH01000288.1| GENE 4 2330 - 3544 1876 404 aa, chain + ## HITS:1 COG:AGc3836 
KEGG:ns NR:ns ## COG: AGc3836 COG0205 # Protein_GI_number: 15889397 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 398 112 509 514 493 59.0 1e-139 MSVRRVALLTAGGFAPCLSSAVGGLIERYTEIDPTVEIIAYQNGYHGLLTGNYVVVDEDA RAHAAVLHRFGGSPIGNSRVKLTNKKNLVERGLVAEDENPLEKAAEQLRKDGVDVLHTIG GDDTNTTAADLAAYLEEHNYHLTVVGLPKTIDNDVVPIRQSLGAWTAAEEVSEYSQNVIG EHRSNPRMLIIHEIMGRNCGWLTAAGSRCYHEWLKTQEWVPSIGLSKERWDIHAIFLPEM KIDLDAEAKRLKAIMDEQGNVNIFLSEGAGVPEIIAEIEAAGGEVQRDPFGHVKLDTINP GQWFAKQFAEKIDAEKVMVQKSGYFSRSSRANADDLRLIKSMTDLAVECAFKGESGVIGH DEEDGDRLKAIPFPRIAGGKPFDITQKWFTDLMAEIGQDAAPAK >gi|319976882|gb|AEUH01000288.1| GENE 5 3555 - 4979 2134 474 aa, chain + ## HITS:1 COG:PA2843 KEGG:ns NR:ns ## COG: PA2843 COG3200 # Protein_GI_number: 15598039 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Pseudomonas aeruginosa # 27 473 5 445 448 472 53.0 1e-133 MIPSLTPGRGGQPERKSGYRRVPESLWSPSSGEPAPWDTWRRLEALQQPDYPDGARVGSV TASLRLQPPLVFAGEVDDLRTWMGAASRGEAFVLMGGDCAETFAEATADHLRLKIQTLLQ MAVVLTYGASLPVIKVGRIAGQYAKPRSSDVETRDGVTLPSYRGDAVNSHEFTPEARVPD PARLLDLYHHAASTLNLIRAFTKGGYADLRQVHHWNRGFMTNPAYARYESLAEDIHRAIK FMGAAGVDFDSLRDVDLYSSHEALLLEYESAMTRIDSRTGEPYNTSAHFLWAGERTRDPG GAHIELLSRVRNPVGVKLGPTTTPAEMIGLIDRLNPDGEDGRLTFITRMGADRIREALPP LLEAARADGRPVTWMTDPMHGNTITASTGHKTRRFETVMDEVRGFFEAHDQAGTVPGGLH VELTGDDVTEVIGGSEHIDEESLRKRYETLVDPRLNHQQSLEMAFQVAQYLAKG >gi|319976882|gb|AEUH01000288.1| GENE 6 5233 - 8589 2129 1118 aa, chain + ## HITS:1 COG:all2787 KEGG:ns NR:ns ## COG: all2787 COG0457 # Protein_GI_number: 17230279 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. 
PCC 7120 # 638 1106 419 887 924 362 45.0 2e-99 MARLRGQSNDKDHAVAKRIEKAVAGRVRGMYQKCRDRGVDTERLDGVATEVVRLVDRVAA DDSLILEAVRFPDGFAALLRDLGGPIRAELFERSEQYFDELVDAVAGEYVRLAPWSPKFQ IEALKYVMGALDRIEEGVDEIRKDVGKGLENDRVTHSKLDELLAGGARSGRLLFGGRPDV AFGFRERVEQGRLRGLVVDERRERAVLVGMAGCGKSQLASSLAQWCGAAGWGLVAWLNAS SAETVRSDLVELARQLGVDMGDNPTQDQAVRRCLNHLQSAKASDRLVVFDNVERVEDLLG VVPCGVGLRVVATTRSGRGWRAQGWERVDVGVFPRAESIAYLLDAVGSRDVEAADAVAGR LGDLPLAVAQAAATASDEGWTLRRYLERLEKYSSEQVIRPVPGDSYTDDVSWALLLAVDS ALGRIEGGLRGVARCQVGALAVLAESGVPTRWIDPRAEDNKAAEGAGADERNEDTGAGGA GEDADAREAAAEKAHLALTALLNASVVQQSADGGVTMLHRLQGQVLRENWNQGEWAGAFD AAADLLDRVNVDSLRREDADGRRREARDLIDQLWAVAAQGYSRPLFECERVVADLPHVLF HARDLGLPNEALALRGAVGVVEEVLGDDHSGTLASRHGLAGVYWAAGRLGEAIPLYEQVL ADQVQVLGADHPHTLTSRNNLAYAYQAVGRLGEAIPLYEQVAADQVQVLGADHPQTLTSR NNLAGAYYSVGRLGEAIPLFEEVLADRTRVLGADHPDTLTSQNNLAYAYQAVGRLGEAIP LFEEVLADRTRVLGADHPDTLASRNNLAGAYESAGRLGEAIPLYQEVLADSARVLGDDHP QTLASRNNLAYAYRAAGRLGEAIPLYEQVLADRTRVLGPDHPDTLASRNNLAAAYESAGR LGEAIPLYEQVLTDQVQVLGADHPHTLTSQNNLAYAYQAVGRLGEAIPLYEQVLADRTRV LGPDHPDTLTSQNNLAYAYQAVGRLGEAIPLFEEVLADQVQVLGADHPQTLTSQNNLAVA YYSAGRLGEAIPLFEEVLADQVQVLGADHPQTLTSRNNLASVYYAAGRLGEAIPLFEKVL ADSARVLGPDHPDTLTSRRNLEAARRARESGSASARAP Prediction of potential genes in microbial genomes Time: Thu May 12 19:18:54 2011 Seq name: gi|319976878|gb|AEUH01000289.1| Actinomyces sp. oral taxon 178 str. F0338 contig00289, whole genome shotgun sequence Length of sequence - 1584 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 + CDS 3 - 719 555 ## COG0457 FOG: TPR repeat 2 1 Op 2 . + CDS 716 - 1447 593 ## COG0457 FOG: TPR repeat Predicted protein(s) >gi|319976878|gb|AEUH01000289.1| GENE 1 3 - 719 555 238 aa, chain + ## HITS:1 COG:all2787 KEGG:ns NR:ns ## COG: all2787 COG0457 # Protein_GI_number: 17230279 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. 
PCC 7120 # 3 225 667 889 924 183 45.0 2e-46 TLASRNNLAYAYQAAGRLGEAIPLFEEVVADCVRVLGADHPHALASRNGLAGAYESAGRL GEAIPLYEGVLADTRRVLGPDHPDTLTSRNNLAGAYESAGRLDEAITLYEGVLADTRRVL GDDHPDTLTSRNNLAYAYYSAGRLGEAIPLYEGVVADRRRVLGDDHPDTLTSRNNLAAAY QAAGRHHEAITLFKDTLEVCEEVLSPGHPLTTRVRKSLETLEREMDQPPAPSPESSDE >gi|319976878|gb|AEUH01000289.1| GENE 2 716 - 1447 593 243 aa, chain + ## HITS:1 COG:all2787 KEGG:ns NR:ns ## COG: all2787 COG0457 # Protein_GI_number: 17230279 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 5 229 664 888 924 147 39.0 2e-35 MNDADPLAERNARAHQYLQEGRDDLAIPLFEEVLAERVRVLGAGHPDTIASRGNLAGAYE SAWCLGKAISQYKKVLAHQLRVLGADHPRTLVSRNNLAYAYQAAGRLGKAISQYKKVLAH QLRVLGAGHPDTIASRNNLASAYYAAGRLGEAIPLFEKVLADSARVLGPDHPDTLASRNN LAGAYQAAGRHHEAITLFKDTLEVCEEVLSPGHPLTTRVRKSLETLERETNQPPAPSPES SDE Prediction of potential genes in microbial genomes Time: Thu May 12 19:18:56 2011 Seq name: gi|319976870|gb|AEUH01000290.1| Actinomyces sp. oral taxon 178 str. F0338 contig00290, whole genome shotgun sequence Length of sequence - 6937 bp Number of predicted genes - 7, with homology - 4 Number of transcription units - 7, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 90 94 ## 2 2 Tu 1 . - CDS 244 - 2220 2502 ## COG0515 Serine/threonine protein kinase 3 3 Tu 1 . + CDS 2351 - 2707 572 ## Lxx15380 hypothetical protein + Term 2846 - 2901 14.9 4 4 Tu 1 . - CDS 3200 - 4294 1188 ## COG0142 Geranylgeranyl pyrophosphate synthase - Prom 4398 - 4457 3.5 5 5 Tu 1 . + CDS 4472 - 6103 2276 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) 6 6 Tu 1 . + CDS 6256 - 6573 322 ## 7 7 Tu 1 . 
+ CDS 6713 - 6935 199 ## Predicted protein(s) >gi|319976870|gb|AEUH01000290.1| GENE 1 1 - 90 94 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no STLASRRNLEAARRARESASASAPGVAEE >gi|319976870|gb|AEUH01000290.1| GENE 2 244 - 2220 2502 658 aa, chain - ## HITS:1 COG:MT2232 KEGG:ns NR:ns ## COG: MT2232 COG0515 # Protein_GI_number: 15841666 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Mycobacterium tuberculosis CDC1551 # 22 286 5 269 399 239 51.0 1e-62 MAPTADNSHRIGGMSDVHQPQGPADPMIGLLVDGRYQVTARLARGGMATVYVARDERLDR LVALKVMHPHLAESEDFVARFRREARAAARIHHPGVVAVYDQGVVSGQGFLVMELVEGTN LRALLSAQGAFTIEQTLRYVSDILQALHAAHRVGVVHRDIKPENILVPPEGPVRVADFGL ARAASEASQSSTGNMLGTVAYIAPEIARGGGADARSDIYSVGIMAYEMLTGDVPWAGQSP IQIATHHVSDDVPLPSASQPWIPREVDDLVAALTARDPANRPADAVHALDLARRAAESLP ADLASRRADVAGREQTGSALTAMKTAVVSAPALTAVLPASSQTTVRVADAPAKPEPPAVV RHGRRTAAVIGAIAVLLVVGLIGGGWWWTEYGPGSYIAMPTTDGRDLGSVRADLEGAGLG VVVEEEFSDTVAAGLVVRSDPASGTRVHKRGDVHVFVSKGVDMRVVPAVTGKTKDEATEA LTAAGLQPGTITYDYSEDVEEGRVITQGSEPGKSLVHDTTVDLSVSKGKEPIPVPDLVGQ KGDTAKQRLEELGLVAKPTEAYSDTVAAGDVISQQPAPSTVLHRKDAVDYVVSKGPEKVP VPDVQGKQRDEARRILEEAGFKVEENNILGGFFGTVRSTDPAAGTVLKRESTVTLNIV >gi|319976870|gb|AEUH01000290.1| GENE 3 2351 - 2707 572 118 aa, chain + ## HITS:1 COG:no KEGG:Lxx15380 NR:ns ## KEGG: Lxx15380 # Name: not_defined # Def: hypothetical protein # Organism: L.xyli # Pathway: not_defined # 19 117 31 124 124 70 47.0 1e-11 MTTARVPSVIPATEACRLLGIEERRLKQLIRDHVLSVAEDESGARGVPAEMIVKGQNGWV PLPDLQGTLTLLSDDGFTADEAVGWLYAVQDELGERPIDALVAGRHRRVNRIASALAL >gi|319976870|gb|AEUH01000290.1| GENE 4 3200 - 4294 1188 364 aa, chain - ## HITS:1 COG:Cgl2124 KEGG:ns NR:ns ## COG: Cgl2124 COG0142 # Protein_GI_number: 19553374 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Corynebacterium glutamicum # 4 362 18 364 366 167 38.0 2e-41 
MSDLRDRMIPLRAAVDAATEAAIGAMTRDLAPTQRGAPLASLARSCSEGGKRFRGILAHV GHCLARGGPLDEAPVAPLSAALELYQASALAHDDIIDHARTRRGRPTPHVSLAALHHDRA WRGDPDRFGEAGAILVGDLLFSWAEAAMADQADRLAPRDAARLWARYSRMHAEVALGQFL DVAAEQAPLDPGDPGAMDTEAAMEVVVRKSARYSIVHPAALGAICGGADDALITAIESIL TPWGMAFQLRDDHLGVFGDPELTGKPSGDDIREGKRTVLLALAWQRATDSERAVLCRALG RADAPGALIDEARQVIAARGGDAHEELITSLVDQGRAHLREAPVSEQGRADLVGLCGIIT RRAA >gi|319976870|gb|AEUH01000290.1| GENE 5 4472 - 6103 2276 543 aa, chain + ## HITS:1 COG:ML1022 KEGG:ns NR:ns ## COG: ML1022 COG0568 # Protein_GI_number: 15827492 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Mycobacterium leprae # 34 543 58 574 574 448 57.0 1e-125 MSASRPSESESPEVTADKASQAAATGRAGARKPAAKKAPAKKRTAKSAAKSDEAGAGKPD EAKAPAKKAPAKKAAAKADGAKKTAKKASAKADGPKKAVKKSDGAGAGKPDEAKAPAKKA SAKKKARKSTAKADAAKSAARAGAADEREAVDEEELEEGVVDDSDEGFDPEAELEDAIDT DDEEDEEEEPEKDAGADPVKATDGAALRERTQAGIHVKGGFVVSDSDETDEPVQQVTVAG ATADPVKDYLKQIGKVSLLNAAQEVDLARRIEAGLYAEYKLKNRADEMTSKERRELHFLA QDGQQAKNHLLEANLRLVVSLAKRYTGRGMQFLDLIQEGNLGLIRAVEKFDYTKGYKFST YATWWIRQAITRAMADQARTIRIPVHMVEVINKLARVQRQMLQDLGREPTPEELAKELDM TPEKVVEVQKYGREPISLHTPLGEDGDSEFGDLIEDSEAIVPADAVSFTLLQEQLHHVLD TLSEREAGVVSMRFGLGDGQPKTLDEIGKVYGVTRERIRQIESKTMSKLRHPSRSQVLRD YLD >gi|319976870|gb|AEUH01000290.1| GENE 6 6256 - 6573 322 105 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVVTVCIVIVVAVGVFLVLALSRRSADQWQRRLRDRKVELHLEREDAPSEAGGSPRMTSL DALLDSASTPSNAYFSADRLPGIAKLEAASERRGQRSGHSSDQDQ >gi|319976870|gb|AEUH01000290.1| GENE 7 6713 - 6935 199 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MESFLEFMRSPMAMVVGALVIVLLCVLIRFLVKRNRPTGKGSDVQAPPTAPGPDEAPGAP GGSAVEPGGPQAQA Prediction of potential genes in microbial genomes Time: Thu May 12 19:19:16 2011 Seq name: gi|319976862|gb|AEUH01000291.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00291, whole genome shotgun sequence Length of sequence - 6725 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 2, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 40 - 957 613 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 2 1 Op 2 1/0.000 + CDS 1033 - 2643 2277 ## COG0541 Signal recognition particle GTPase 3 1 Op 3 . + CDS 2710 - 3813 1335 ## COG1228 Imidazolonepropionase and related amidohydrolases 4 2 Op 1 19/0.000 + CDS 4020 - 4511 416 ## PROTEIN SUPPORTED gi|227875958|ref|ZP_03994081.1| 30S ribosomal protein S16 5 2 Op 2 12/0.000 + CDS 4514 - 4753 260 ## COG1837 Predicted RNA-binding protein (contains KH domain) 6 2 Op 3 30/0.000 + CDS 4836 - 5348 227 ## PROTEIN SUPPORTED gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 7 2 Op 4 . + CDS 5351 - 6673 1759 ## COG0336 tRNA-(guanine-N1)-methyltransferase Predicted protein(s) >gi|319976862|gb|AEUH01000291.1| GENE 1 40 - 957 613 305 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 32 302 52 319 336 240 46 2e-63 MQRLRARLAASGALGRAIVDILSRGELKAADWEEIEESLLIADLGLEATDQLMEALKRRV AVEKVGDTAGIRRILREELLALVDPAMDRSLRLDKPEADGRALPAAVLMVGVNGTGKTTT VGKLARVLVADGRTVLLGAADTFRAAAADQLATWGQRVGVDVVRSEREGADPASVAFEAV KEGASRGVDVVVVDTAGRLQNKSTLMDELGKIKRVMDRQAPVGEVLLVLDATTGQNGLQQ ALVFAEAVGVTGIVLTKLDGSAKGGIVVSVQRQLGVPVKFVGLGEGADDLAPFDPEGFVD AIVSA >gi|319976862|gb|AEUH01000291.1| GENE 2 1033 - 2643 2277 536 aa, chain + ## HITS:1 COG:Cgl2009 KEGG:ns NR:ns ## COG: Cgl2009 COG0541 # Protein_GI_number: 19553259 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Corynebacterium glutamicum # 1 471 1 475 547 481 57.0 1e-135 MFGNLQDRLTESFRRLRSRGVLTESDVDQAVSEIRRALIDADVALPVVRQFTTRVREKAY GAARSKALNPGQQVVSIVHSELVEILGGQTRELVFAERGPTVFMLAGLQGAGKTTLAGKL 
GKWLREEGKTVMLVASDLQRPNAVQQLQVVGERAGVKVWAPEPGNGVGDPVEVARSGVEF ARQSGINVVIVDTAGRLGVDQEMMDQAIAIRDAVSPHEIMFVLDAMVGQDAVSTSTAFRD GVGFTGVVLSKLDGDARGGAALSVRGVTGAPILFASTGEGLDDFERFHADRMAGRILDMG DVLTLIEQAEKRMDAEEAEKVAAKAMAGELTLADFLNQLQQIKKLGSMKKMLGMIPGAAQ LRDQIDNFDEREVSRVEAIVRSMTPAERDDVGLLNGSRRARIAKGSGTTVSEVNGLVKRF EAARDMMAQMGQMGGGGVAGMGALPGRGGKAKQKSNARAAKAKRAKVKKKARSGNPAKRR QQELEALLPQSQRKRPDAPGSSFGAPAPAPEPEPARRPSMDDLPDDVKRMLGQLGG >gi|319976862|gb|AEUH01000291.1| GENE 3 2710 - 3813 1335 367 aa, chain + ## HITS:1 COG:Rv2915c KEGG:ns NR:ns ## COG: Rv2915c COG1228 # Protein_GI_number: 15610052 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Mycobacterium tuberculosis H37Rv # 17 367 24 369 370 222 41.0 1e-57 MIFDGWMLIAASPGQRPEWTRGSWAIEDGRVARLGRPGRADVRGYAVPGMVDCHCHIGIT DGGGSVGEAAQRQQAAATLASGVTLARDCGCPVDNSWLARQGSGPSLDFIHCGRHVARPM RYIRGLAVEVEDERDLPRVLADQAARSDGWVKLVGDWIDRSKGAESDLDPLWSLPVLRDA VAAAHEAGARVAVHAFSHSAIDDLIEAGVDDIEHGSGIDADQADEIARRSVLVAPTLAQV ALFDAFADQAGAKYPVYAATMRAMHEERREHFAMLADSGVRLVSGVDSGGYQKHGSIVNE LALWQEAGMDPARVVDTATWGTRAALGRPALTEGGAADLVVLADDPRQDVSALARPLHVF AGGHHVG >gi|319976862|gb|AEUH01000291.1| GENE 4 4020 - 4511 416 163 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227875958|ref|ZP_03994081.1| 30S ribosomal protein S16 [Mobiluncus mulieris ATCC 35243] # 21 163 1 143 159 164 60 1e-40 MFPPGRAEAHTNLKVGETTQVAVRIRLKRMGKVHAPAYRVVVVDSRKKRDGRVIEEIGVY DAQPDPSVIRIDSERAQYWLGVGAQPSDQVRNLLVLTGDIAKFNGKADAVSRVRVSEADK AAQAAQAVEDAEAEAEKSKAAASAKKAEEEAAAAKASEEAEEA >gi|319976862|gb|AEUH01000291.1| GENE 5 4514 - 4753 260 79 aa, chain + ## HITS:1 COG:MT2976 KEGG:ns NR:ns ## COG: MT2976 COG1837 # Protein_GI_number: 15842451 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein (contains KH domain) # Organism: Mycobacterium tuberculosis CDC1551 # 1 78 4 80 80 73 60.0 7e-14 MLADALEHLVSGIVDHPEDVQVTPRQLRRGQLLEVRVNPEDLGRVIGRGGRTARALRTVM GALSTRGSVRVDVVDTDRE 
>gi|319976862|gb|AEUH01000291.1| GENE 6 4836 - 5348 227 170 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 [alpha proteobacterium BAL199] # 8 165 10 172 179 92 36 1e-18 MLMTAAIIGPAHGLKGEVVLDVRSDDPAVLAPGASLVLSGSGSGVVVESTRVHKGRVLAR LGGVGTREEAEALRGAALLVAPHDEEDAWYPHQLKGLRAEDPQGRPLGSVSGLRPGGAQD LLLVGTERGEVMVPFVRQLVPTVDVEGGRVVIDPPPGLFDDGADDSGKGA >gi|319976862|gb|AEUH01000291.1| GENE 7 5351 - 6673 1759 440 aa, chain + ## HITS:1 COG:MT2974 KEGG:ns NR:ns ## COG: MT2974 COG0336 # Protein_GI_number: 15842449 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Mycobacterium tuberculosis CDC1551 # 1 227 1 226 230 239 58.0 8e-63 MRFDLVSIFPQYFDSLSLSLLGRADRSGLVSVRAHDLRDWAHGAHRSVDDAPYGGGAGMV MLPGVWCAALDEILGASCAPVLAVPTPSGTPLTQRLVGELARADHIVVACGRYEGIDQRV ADHYRERGLRVLEYSIGDYVLNGGEAAAAVLVEAVARLVDGFMGNPDSLAEESHGDGGLL EYPAYTKPRSFRGLDVPDVLLSGDHGAIRRWRRDQSLLRTAERRPDLIDALDPAGLDAQD REALASCGVVFAPGPTRVRYLPGEQSDAAELAAFAARTFPMAAPPEIPAEDIARFIREEL DEASFRDHMAGPGTRVLIARAQGGEDGSPILGYTLTVLGHPDQMPEGNGAVPLDGGAAYL SKCYTDAVAHGSGLAGALLEKAVADARAHGATQIVLATHIRNTRAQRFYKRHGFKKSGRR TFRVGSAAPTDDVMVRPADR Prediction of potential genes in microbial genomes Time: Thu May 12 19:19:19 2011 Seq name: gi|319976853|gb|AEUH01000292.1| Actinomyces sp. oral taxon 178 str. F0338 contig00292, whole genome shotgun sequence Length of sequence - 4914 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 + CDS 248 - 604 471 ## PROTEIN SUPPORTED gi|227495321|ref|ZP_03925637.1| 50S ribosomal protein L19 + Term 663 - 713 17.2 2 1 Op 2 4/0.000 + CDS 747 - 1490 987 ## COG0681 Signal peptidase I 3 1 Op 3 . + CDS 1487 - 2164 677 ## COG0164 Ribonuclease HII 4 1 Op 4 . 
+ CDS 2331 - 2636 468 ## Bcav_2533 hypothetical protein + Term 2676 - 2717 12.4 5 2 Op 1 2/0.000 + CDS 2758 - 3183 507 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 6 2 Op 2 . + CDS 3180 - 4733 1454 ## COG0606 Predicted ATPase with chaperone activity 7 2 Op 3 . + CDS 4726 - 4912 197 ## Predicted protein(s) >gi|319976853|gb|AEUH01000292.1| GENE 1 248 - 604 471 118 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227495321|ref|ZP_03925637.1| 50S ribosomal protein L19 [Actinomyces coleocanis DSM 15436] # 1 113 1 113 115 186 80 4e-47 MQKLDAVDAASLRDDIPPFRAGDTVKVHVKVVEGNRSRIQVFQGVVIARRGHGVSSTFTV RKMSFGVGVERTFPVHAPTIDRIEVVTKGDVRRAKLYYLRKRHGKAAKIKEHRSGKSE >gi|319976853|gb|AEUH01000292.1| GENE 2 747 - 1490 987 247 aa, chain + ## HITS:1 COG:Cgl1987 KEGG:ns NR:ns ## COG: Cgl1987 COG0681 # Protein_GI_number: 19553237 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Corynebacterium glutamicum # 20 229 21 251 262 121 34.0 1e-27 MSEATEPPSAPRHLPSLKDRDRAEAEARRNPVVWLREIAVILVIALIISSLLRAFVVQVF WIPSPSMRGTLVENDRIAVSRIAALTGNIKRGDVVVFDDTLGWLGSGGDSSGSVLRSIGE FTGFVPAGGEQTLVKRVIGIGGDRVKCCSTDGKVMINGVEISETYIAEGQAASTIPFDVT VPEGHLWVMGDNRGNSADSRYHMGEGQSPFVPQKSVVGTVWAIIWPVSRWTTDIGHREVF AVVPDPQ >gi|319976853|gb|AEUH01000292.1| GENE 3 1487 - 2164 677 225 aa, chain + ## HITS:1 COG:YPO1058 KEGG:ns NR:ns ## COG: YPO1058 COG0164 # Protein_GI_number: 16121358 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Yersinia pestis # 11 217 3 188 198 118 39.0 1e-26 MSAPSASALFEQSLADQFGVVAGVDEVGRGSLAGPVAVGVALTSPGAADPPAGLVDSKAL TAKRREALAPAVRQWALDCAVGWAGPDEIDQWGIVAALRLAGLRALDELAGRGRVPGAVV LDGSHNWLAMPEDLLGGLGGPDYPPPCPAPIRARVKADASCAVVAAASVIAKVERDRLMA ELDDPGYEWSRNKGYASAAHIAALAELGPSAHHRRSWKLPARASR >gi|319976853|gb|AEUH01000292.1| GENE 4 2331 - 2636 468 101 aa, chain + ## HITS:1 COG:no KEGG:Bcav_2533 NR:ns ## KEGG: Bcav_2533 # Name: not_defined # Def: hypothetical 
protein # Organism: B.cavernae # Pathway: not_defined # 2 99 3 100 100 134 67.0 1e-30 MSEEIEGYENSLELELFKEYRDVIGLFKYVVETERRFYLCNQVDVQARPVGSDVFFELAL TDAWVWDIYRSSRFVRSVRVITYKDINVEELSKPELSIPSS >gi|319976853|gb|AEUH01000292.1| GENE 5 2758 - 3183 507 141 aa, chain + ## HITS:1 COG:Rv2898c KEGG:ns NR:ns ## COG: Rv2898c COG0792 # Protein_GI_number: 15610035 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Mycobacterium tuberculosis H37Rv # 7 117 11 121 128 61 40.0 6e-10 MSARTNRLGARGEDTAAAFLEESGATILTRNWRDGRRGEIDIVAGVPRGPRPGVAVVEVR TRVGNQRGSALESVDSRKIARLRALAGAWRRANPGQPGELRIDVVAITIDPGRARELGER LASCADLRQAGARVEWIRGVA >gi|319976853|gb|AEUH01000292.1| GENE 6 3180 - 4733 1454 517 aa, chain + ## HITS:1 COG:Rv2897c KEGG:ns NR:ns ## COG: Rv2897c COG0606 # Protein_GI_number: 15610034 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Mycobacterium tuberculosis H37Rv # 7 511 3 498 503 336 43.0 6e-92 MSAATRLARTHSVSLVGLEGRLVRVETQVGRGLVQFVIVGLPDASVRESKDRVRSALESC GLEVLDSRVTVNLSPAGLHKSGSGFDLAIAASVLLAAGRVGPTSFDGAVIIGELALDGSL QPVRGVLPAVLAARGQGITRAIVPAANAAEASLVGGVETAAFAHLAELVRWAGGNAQTPV SIGGALPDAPEPEPATSGPHVDMADMRGQHDAVEALTVAGAGGHHVFLLGEPGSGKTMLA SRMHTILPDLDDDTALVATSLHSLAGTLRDGEALVRRPPILTPHHSATMAALVGGGHARI TPGAASLAHGGVLFLDEAAQFQPSVLDALREPLENGEVHIHRAGLHARLPARFQLLLAAN PCPCGGGRQGRACTCSAQARMRYLARLSGPLLDRIDITVRVDTPTRADLARGPSPSSARL RERVAEARSRCARRLAGTPWTLNAQVPGRWIRASSGIDPRMVADLDRAVEAGVFSMRGAD RVLRLMWTTADLNGAARPGEVERGLAIWLRKGDNPND >gi|319976853|gb|AEUH01000292.1| GENE 7 4726 - 4912 197 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIDDTHYSAAWWTTVVEPGDADVHALRAALGDEEARRWARAPAPSPLPPALAGHDWEAAW GR Prediction of potential genes in microbial genomes Time: Thu May 12 19:19:31 2011 Seq name: gi|319976840|gb|AEUH01000293.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00293, whole genome shotgun sequence Length of sequence - 10737 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 7, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.500 + CDS 1 - 957 1067 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake 2 1 Op 2 . + CDS 967 - 1866 1075 ## COG0582 Integrase + Term 1925 - 1963 -1.0 3 2 Tu 1 . - CDS 1829 - 2311 383 ## FRAAL5309 putative HTH-type transcriptional regulator 4 3 Op 1 . + CDS 2498 - 3457 763 ## COG1131 ABC-type multidrug transport system, ATPase component 5 3 Op 2 . + CDS 3454 - 4584 1373 ## Anae109_0895 ABC-2 type transporter 6 3 Op 3 . + CDS 4584 - 5690 1054 ## COG0842 ABC-type multidrug transport system, permease component 7 4 Tu 1 . - CDS 5783 - 6322 365 ## COG0739 Membrane proteins related to metalloendopeptidases 8 5 Op 1 38/0.000 + CDS 6660 - 7490 1139 ## PROTEIN SUPPORTED gi|29829169|ref|NP_823803.1| 30S ribosomal protein S2 9 5 Op 2 . + CDS 7516 - 8343 475 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts + Term 8358 - 8402 9.1 - Term 8220 - 8249 1.2 10 6 Tu 1 . - CDS 8394 - 8525 121 ## 11 7 Op 1 33/0.000 + CDS 8464 - 9180 986 ## COG0528 Uridylate kinase 12 7 Op 2 7/0.000 + CDS 9211 - 9768 818 ## COG0233 Ribosome recycling factor 13 7 Op 3 . 
+ CDS 9768 - 10676 1250 ## COG0575 CDP-diglyceride synthetase Predicted protein(s) >gi|319976840|gb|AEUH01000293.1| GENE 1 1 - 957 1067 318 aa, chain + ## HITS:1 COG:Rv2896c KEGG:ns NR:ns ## COG: Rv2896c COG0758 # Protein_GI_number: 15610033 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Mycobacterium tuberculosis H37Rv # 7 310 64 370 389 151 42.0 1e-36 ADALDAELERADRIGAAFLTPSDPQWPEGMADLGPRAPVGLWARGRLDPRGSATRASISI VGARACTREGQNEAANMACALTEKGYRTVSGGAYGIDIAVHRGALAAPGPTTVVFAGGVG APYPRAHAEDFRAVMDAGGAVVSEAPPSWRPAKWRFLARNRLIAAWSGSTVVVEAGMRSG ALSTARAAMELGRNVGAVPGSVRAPMSAGALALIRNGATLVRDAADVEEMHAPIGSCAPE PLFGAPVEEDRGADALAPAQRRVWEALPARSRARLGAVVTASGLSERDVLVALAGLELAG MVSSDTTGWKRMPAVARR >gi|319976840|gb|AEUH01000293.1| GENE 2 967 - 1866 1075 299 aa, chain + ## HITS:1 COG:MT2962 KEGG:ns NR:ns ## COG: MT2962 COG0582 # Protein_GI_number: 15842437 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mycobacterium tuberculosis CDC1551 # 4 299 20 315 315 219 51.0 5e-57 MTTSVLDQWGAHLRHSRGASAHTVRAYTADLAAFLAFTGTDPADPGAASALTLRAARAWL ADSVARGAARSTVSRHVAALRNFSAWAHREGLAPTDAAAALASARADQRLPRVVDQDEAA ALLECARSRASADDPVSVRDWAILELIYATGIRVSEACSLTTSSIDPAALTVRVLGKGDK ERTVPFGVPARDALDQWTVRARPSLAVGTDALFVGAKGGPIDPRVVRAMIHRMCARAGVR DIAPHGLRHSAATHLLQGGADLRAVQEILGHSSLATTQRYTHVDAGRLSDVYRRAHPRA >gi|319976840|gb|AEUH01000293.1| GENE 3 1829 - 2311 383 160 aa, chain - ## HITS:1 COG:no KEGG:FRAAL5309 NR:ns ## KEGG: FRAAL5309 # Name: not_defined # Def: putative HTH-type transcriptional regulator # Organism: F.alni # Pathway: not_defined # 1 144 36 178 200 85 37.0 7e-16 MVSYYFGSKEGLFKAIAELSLTPAEVLDVISDRVPRDQLGRALLEAALTTWDRPDYREGL AQLIGDGLASPAAQRIFREYLQTEMVARLAAVIGGGNASKHAAALASIIAGIFFTRCILK VEPIASMSRADVIRHHTPAVNAILRPPLPTRAGAPCGTRR >gi|319976840|gb|AEUH01000293.1| GENE 4 2498 - 3457 763 319 aa, chain + ## HITS:1 
COG:MA4206 KEGG:ns NR:ns ## COG: MA4206 COG1131 # Protein_GI_number: 20092997 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Methanosarcina acetivorans str.C2A # 23 316 18 313 333 219 38.0 7e-57 MTNIAVSFQKLRMEFRRPGRPPVVALDSLDLEVVRGQILGLLGPNGSGKTTSVNLLCGLL RPTSGTVVCEGIDVRSDVTGVRSRLGVVPQETALYDDLTADENLSFHARLYGVPRAKRKA RIDELLDLVGLSGRRGDRVGTYSGGMQRRLALARALLTRPTVVVLDEPTLGVDVQSRAAL WERVRQIADEGGTVLLTTNYMEEAQALADQLAVLDHGELVVTGTPEELRGALGQTFIDVR SSHPPSETGLRAVPGVREVTWRDGIASIAADGGPATVRGILDWFADHDADAVLAGVRRPD LNDVFLALTGKALRDGGAS >gi|319976840|gb|AEUH01000293.1| GENE 5 3454 - 4584 1373 376 aa, chain + ## HITS:1 COG:no KEGG:Anae109_0895 NR:ns ## KEGG: Anae109_0895 # Name: not_defined # Def: ABC-2 type transporter # Organism: Anaeromyxobacter_Fw109-5 # Pathway: ABC transporters [PATH:afw02010] # 1 362 5 369 378 77 26.0 8e-13 MSRLEPVWAIAVKDWTRFIRQPFLLVISVVIPLVFIFFYSLVIPVSNNNPVTVADSDGGP AAARLVETLRAVHSEEGPYYDVITTDPSAALSAFGEQRALAVIVIPDGFSEAAEAGNAAV ELRLNNINSDYSKNLRLRLDHAVRALNEELAGPVMTVEETRWLPHDPTMLGYISTSLLLF GCLYAAMVNTGLQVAAEWNGRTVKNLLLAPAGRGALVAGKILAGLGQSLVSLCLVLAVLV VGFGFKPTGDLLGMVGIVVATLLMGSGIGAAFGVASKKSLATASALIALAIAFFLVSGNE ESMRGLAWGQPITALWQLSRALPTTYAFMSARSILLTGDTSDLAVNLTIVLASTAVIFAA AAGLLRRALSHLNGGQ >gi|319976840|gb|AEUH01000293.1| GENE 6 4584 - 5690 1054 368 aa, chain + ## HITS:1 COG:STM0816 KEGG:ns NR:ns ## COG: STM0816 COG0842 # Protein_GI_number: 16764179 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Salmonella typhimurium LT2 # 5 368 9 375 376 63 22.0 6e-10 MHLDRINAIIAKNMRQWSRDEQALAGPMLIPLVLMFLCGVLFGFGGDEWNIALVNRGTGP YATAFETVVRGLRGNISPYFRVVTTDPVEADRLVTEGRLHMALTIPDDFDQVIEAGGVPT LETHVYNINTDMMKNARLRLDRAIQDYAAAHAEGLAPITVTQTTTRPDDVWRRTFIANSA VILAIMVGAALNAAIMVAREWERRTTKEIRLAPRPLADLTTGTLLAGIITGAINTVVTLM VAVTVFGVRIPLGRLPVLLSVAGLVSVACAGIGVAVGAWLRDYRTIQPLLMVTFAGSFFA SGGFSSVATLPRPVQVFDRFWPPAAVFENLQAWTWMADPPSPAPMVLASGVAAVLGVGIG AWALQRRL 
>gi|319976840|gb|AEUH01000293.1| GENE 7 5783 - 6322 365 179 aa, chain - ## HITS:1 COG:Cgl1980 KEGG:ns NR:ns ## COG: Cgl1980 COG0739 # Protein_GI_number: 19553230 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Corynebacterium glutamicum # 58 177 43 162 164 68 40.0 5e-12 MRAGAPRGETGSMTPRTRLRRFLRAAAAAALLAQPLPARAAPGGEDQRWMWPTGAPSPVL SSFAPPAHDWLPGRRGVDLDVPGGTPIVAAADGVVAFAGPVAAQPVISVEHERAGSPVWA TYVPAQAEVEAGQRVVRGQVIGRVPPGADHLHWGARAGRRAYTDPIRLTLGPVVLKPWE >gi|319976840|gb|AEUH01000293.1| GENE 8 6660 - 7490 1139 276 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29829169|ref|NP_823803.1| 30S ribosomal protein S2 [Streptomyces avermitilis MA-4680] # 1 275 1 275 308 443 80 1e-124 MAVVTMRQLLESGVHFGHQTRRWNPKMKRFILTERNGIYIIDLQQTIADIDIAFDFVKQT VAHGGTILFVGTKKQAQEAVAEQAQRVGMPYVNHRWLGGMLTNFSTVSRRLQRLKELEQI DFEDVAASGHTKKELLMMSREKDKLARTLGGIRDMAKVPSAVWIVDPKKEHLAVSEARKL NIPIVAILDTNADPDEVDYRIPGNDDAIRAVALLTRVVADAVAEGLIARSDARKGKGEET AAEPLAEWERELLEGEVKADEAAAAQAAEESAPKQD >gi|319976840|gb|AEUH01000293.1| GENE 9 7516 - 8343 475 275 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 1 274 1 261 283 187 39 3e-47 MAKFTAADVKALREQTGAGMMDVKKALTEADGDAEKALEIIRLKGLKSLSKREGRQALAG LLAATTDGTVGVMVEVNSETDFVAKNQKFIDFANEVLAAAVASKASDTDALLAAPMGEGT VKDRLDAFAAIIGEKLQIGRVVRVEGDNVDLYLHQTNPDLPPQVGVFVVTDAAGKAVAHD IAMHVAAYMPSYLDRDHVPADVLDKERATLEKITLEEGKPEHIVPKIVQGRLEAFFKDNC LVDQAFARDPSKSVGKVLKEAGASVSDYVRVHVGA >gi|319976840|gb|AEUH01000293.1| GENE 10 8394 - 8525 121 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAPPPNASPDTFNRTRRGSDMAVLLTIQRGAPASGALNRDNTT >gi|319976840|gb|AEUH01000293.1| GENE 11 8464 - 9180 986 238 aa, chain + ## HITS:1 COG:ML1591 KEGG:ns NR:ns ## COG: ML1591 COG0528 # Protein_GI_number: 15827835 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Mycobacterium leprae # 2 236 46 279 279 278 
62.0 5e-75 MSEPRRVLLKVSGEAFGGGAIGLDTAVVRDIAEQVAVAIRQGVHVAVVVGGGNFFRGAQL AKAGLERSRADYMGMLGTVMNALALQDFIEQAGVPARVQSAITMTQVAEPYIPLKAIRHL EKGRAVVFGAGAGLPYFSTDTVSAQRALEIHCDELLVAKNGVDGVYDDDPKTNPGARRFD TITYSEALTRDLKVIDATALALCRDNGLKTRIFGMGEEGNVTRALMGEQIGTLVTTQD >gi|319976840|gb|AEUH01000293.1| GENE 12 9211 - 9768 818 185 aa, chain + ## HITS:1 COG:ML1590 KEGG:ns NR:ns ## COG: ML1590 COG0233 # Protein_GI_number: 15827834 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Mycobacterium leprae # 1 185 1 185 185 179 53.0 3e-45 MIDDILLEAEDKMDKAVDAASHEFASIRTGRATSSMFEQLSVEYYGAPTPMQQLASFQIP EARTVIISPFDRTATQEILRTLRESDLGVNPTDDGNVIRVVLPALTEERRKEYVKQAKSK AEEGRVSVRGVRRKAKDQLDRLKKDGEASEDDVDRAEKALDAATRAHTDQIDKMLAAKES ELLTI >gi|319976840|gb|AEUH01000293.1| GENE 13 9768 - 10676 1250 302 aa, chain + ## HITS:1 COG:MT2948 KEGG:ns NR:ns ## COG: MT2948 COG0575 # Protein_GI_number: 15842422 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Mycobacterium tuberculosis CDC1551 # 14 291 57 347 349 166 37.0 4e-41 MEDQSQSFLSAVFRPGPTRERPPLSATGRAGRNLPAAVTTGAVLVAVIAVALFVYPPLFI GVIAVACLAGVWELAGAFARMGVSVAVVPLYVGAIGMVTCAWLLGAEGMLFAMYLTVFVC AAWRLLDARTESRMSDIASSAFVAVYVPFLAAFVVLMVSAWPNPWVFLSYEAMVIASDTG GWAAGIAIGRHPMAPRLSPKKSWEGFAGSCLAAMGVGCLGLHLLGAQWWWGLVAGVLVAF VGTMGDLTESLIKREAGIKDMSGLLPGHGGVLDRVDAVLMTAPVVYFVYALALPGVSWEG SH Prediction of potential genes in microbial genomes Time: Thu May 12 19:19:46 2011 Seq name: gi|319976832|gb|AEUH01000294.1| Actinomyces sp. oral taxon 178 str. F0338 contig00294, whole genome shotgun sequence Length of sequence - 6136 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 1, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 9 - 1202 1111 ## COG0820 Predicted Fe-S-cluster redox enzyme 2 1 Op 2 . 
+ CDS 1199 - 1750 659 ## Bcav_2501 hypothetical protein 3 1 Op 3 17/0.000 + CDS 1789 - 3000 1373 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase 4 1 Op 4 6/0.000 + CDS 3050 - 4288 1435 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 5 1 Op 5 . + CDS 4395 - 5570 1446 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis 6 1 Op 6 . + CDS 5580 - 6086 665 ## gi|154509147|ref|ZP_02044789.1| hypothetical protein ACTODO_01668 Predicted protein(s) >gi|319976832|gb|AEUH01000294.1| GENE 1 9 - 1202 1111 397 aa, chain + ## HITS:1 COG:Cgl1973 KEGG:ns NR:ns ## COG: Cgl1973 COG0820 # Protein_GI_number: 19553223 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Corynebacterium glutamicum # 16 383 2 365 366 409 57.0 1e-114 MRPTDQPPEGAAHAGARPVLSFTAKRRGKPPSHLADLDAAGRKAVLAGMGLPPFRADQLS RHYFERFEADPADMSDIPASMRQRVRDSLLPPLVSGVVSLRADAGRTVKDLWRLYDGAQV ESVLMRYPQRTTLCVSSQAGCGMACPFCATGQMGLTRNLSTAEIVDQVRLAQAACRDGAL AGGPTRLTNVVFMGMGEPLANYKTVVGALHRLVDPVPEGFGMSARNVTVSTVGLVPAIRR LAGEGLPVTLAVSLHAPDDDLRDDLIPVNSRWKVGELLDAARHYFLVTGRRVSIEYALIK DMNDQVWRAQLLADELNRRGKGWAHVNPIPLNPTPGSIWTASTRRSQDAFVATLRDNGVV TSIRDTRGSDIDGACGQLATSAAKRTAGDGSSRKETR >gi|319976832|gb|AEUH01000294.1| GENE 2 1199 - 1750 659 183 aa, chain + ## HITS:1 COG:no KEGG:Bcav_2501 NR:ns ## KEGG: Bcav_2501 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 1 183 1 183 183 187 60.0 2e-46 MTDAFPTVGFFRQGYNPEAVDAFFEDARRAYEGGVPAEQFSAEQVRQATFTLKRRGYDID VVDGAMNRLEAAFVQRDRAEHVAAMGEAAWYDRVADRATTLYPRLQRPRGERFAHPESGR GYRMDEVDDLLDRLAAYFDDGEPLSADDVRQATFRPARGKKAYAEGPVDAFLGRAVDILL AVD >gi|319976832|gb|AEUH01000294.1| GENE 3 1789 - 3000 1373 403 aa, chain + ## HITS:1 COG:Rv2870c KEGG:ns NR:ns ## COG: Rv2870c COG0743 # Protein_GI_number: 15610007 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Mycobacterium tuberculosis H37Rv # 3 
398 38 406 436 316 55.0 5e-86 MPVVVLGSTGSIGTQALDVIDANPGVFEVVGLGAGGADPGALAAQAASHGVPLVAVARED RVAAVRGALAAAGPGLAPTVLGGADAASDLVREACAKWGEGVVVLNGITGSVGLRPTLAA LESGATLALANKESLVAGGALVRRALRRPGQVVPVDSEHSAIAQALASGVHEKGLTSRRL SGASEVADLVLTASGGPFRGLTRAELARVTPEQALNHPTWDMGPVVTINSATLMNKGLEL IEAHLLFDVAPDHIITTVHPQSIVHSMVTWQDGATTLQASPPDMRLPIALGLAWPRRLRG VEEPLRWDAASQWTFEPVDDEAFPAVGLARAAVSASATHPAVMNAANEVLVSAFRAGAVG LLGITDAVERVLGEHEGVEDPSLEDVEAVERWARERARELSAL >gi|319976832|gb|AEUH01000294.1| GENE 4 3050 - 4288 1435 412 aa, chain + ## HITS:1 COG:Cgl1968 KEGG:ns NR:ns ## COG: Cgl1968 COG0750 # Protein_GI_number: 19553218 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Corynebacterium glutamicum # 2 411 3 402 404 164 30.0 3e-40 MGYVVGIVVVVVGILASVALHEVGHMVPAKKFGVLVPDYAVGFGPALWKKKIGETTYALR AVLLGGYVKILGMYPPAREGARTLNRKGRPTLAEEARQASAEDLPEGQEARAFYNLSAPK KIVVMVSGPLMNLLICVVLSAITMIGIGAPRASTTLAAVSQTVAGASGESAGPAHEAGVR AGDVVESWNGRPIASWSEFHEAIAASPAGEPQQLGVKRGQEHLTFEVTPVEGQQGRVVGV TAGFEYVSASPADVVAADWQMFTSTASVVVRLPQAVWNVGRSLFTDDAREATSVVSVVGV GRMAGEVTGDPSSLGLRDTRQVVAVLLSLLASLNMALFVFNLIPLPPLDGGHIVGACYEW ARGALARARGKADPGPADTARMVPLTWAVGGVLVAMSVILIAADIIKPVSLA >gi|319976832|gb|AEUH01000294.1| GENE 5 4395 - 5570 1446 391 aa, chain + ## HITS:1 COG:MT2936 KEGG:ns NR:ns ## COG: MT2936 COG0821 # Protein_GI_number: 15842410 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Mycobacterium tuberculosis CDC1551 # 4 375 3 376 387 503 71.0 1e-142 MSAINLGMPDVHATPFPRKNTRRIMVGDVPVGGGAPVSVQSMTTTKTHDIGATLQQIAEL TAAGCDIVRVACPTDKDAAALPIIAKQSRIPVIADIHFKPKYVFQAIEAGCGAVRVNPGN IRKFDDQVKDICKAASDHGVSLRIGVNAGSLDPRLLKKYGRATPEALVESAVWEASLFEE NDFHDFKISVKHHDVLTMVEAYQQLSERGDWPLHLGVTEAGPAFQGTIKSCAAFGVLLAQ GIGDTIRVSLSAPPVEEVKVGTKLLEFMGLRDKTLEIVSCPSCGRAQVDVWTLAENVTEG LKELTVPLRVAVMGCVVNGPGEAREADLGVASGNGKGQIFIQGKVVETVPEDQIVETLIR RANLLAEEMGLEPGSGAVEVAPVGQRSTSTP 
>gi|319976832|gb|AEUH01000294.1| GENE 6 5580 - 6086 665 168 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509147|ref|ZP_02044789.1| ## NR: gi|154509147|ref|ZP_02044789.1| hypothetical protein ACTODO_01668 [Actinomyces odontolyticus ATCC 17982] # 1 159 1 159 167 136 49.0 6e-31 MRIRSAVVVAALAAAAGVAATGFPRRLGLTASQAAVSLPGDLLVPGADVVADRGIEVPAS AALVWEVLDAAFEPDGSVDVVAREAGEYLLFRSGGPDSPEGEGGESSCVIALLPVSGGRT LVQIRERHKAGEGGKARLWARIVAQSWATACLLREIRSASMALAERRA Prediction of potential genes in microbial genomes Time: Thu May 12 19:20:04 2011 Seq name: gi|319976816|gb|AEUH01000295.1| Actinomyces sp. oral taxon 178 str. F0338 contig00295, whole genome shotgun sequence Length of sequence - 16663 bp Number of predicted genes - 16, with homology - 14 Number of transcription units - 10, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 167 - 1951 2348 ## COG0442 Prolyl-tRNA synthetase 2 1 Op 2 . + CDS 1951 - 2442 778 ## COG0242 N-formylmethionyl-tRNA deformylase 3 2 Op 1 . + CDS 2640 - 3143 544 ## Bcav_1800 hypothetical protein 4 2 Op 2 . + CDS 3146 - 3499 267 ## gi|154509142|ref|ZP_02044784.1| hypothetical protein ACTODO_01663 5 3 Tu 1 . + CDS 3677 - 5089 2001 ## COG1027 Aspartate ammonia-lyase 6 4 Tu 1 . - CDS 5294 - 5815 156 ## plu2709 hypothetical protein 7 5 Tu 1 . + CDS 5706 - 5921 109 ## 8 6 Tu 1 . - CDS 6040 - 7713 1092 ## COG1002 Type II restriction enzyme, methylase subunits 9 7 Tu 1 . - CDS 7917 - 10661 4120 ## COG2609 Pyruvate dehydrogenase complex, dehydrogenase (E1) component + Prom 10911 - 10970 1.5 10 8 Op 1 . + CDS 11029 - 11421 381 ## HMPREF0573_11425 hypothetical protein + Term 11516 - 11568 -0.6 11 8 Op 2 . + CDS 11614 - 12480 623 ## + Term 12580 - 12616 10.1 + TRNA 12501 - 12574 81.3 # Val TAC 0 0 + Prom 12503 - 12562 77.6 12 9 Tu 1 . + CDS 12765 - 13418 446 ## PROTEIN SUPPORTED gi|148988990|ref|ZP_01820390.1| hypothetical protein CGSSp6BS73_02415 + Term 13469 - 13511 5.4 13 10 Op 1 . 
- CDS 13723 - 14205 417 ## COG3727 DNA G:T-mismatch repair endonuclease 14 10 Op 2 13/0.000 - CDS 14250 - 15239 961 ## COG0457 FOG: TPR repeat 15 10 Op 3 13/0.000 - CDS 15236 - 16345 1288 ## COG0457 FOG: TPR repeat 16 10 Op 4 . - CDS 16342 - 16662 316 ## COG0457 FOG: TPR repeat Predicted protein(s) >gi|319976816|gb|AEUH01000295.1| GENE 1 167 - 1951 2348 594 aa, chain + ## HITS:1 COG:Cgl1948 KEGG:ns NR:ns ## COG: Cgl1948 COG0442 # Protein_GI_number: 19553198 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Corynebacterium glutamicum # 1 589 1 585 588 666 61.0 0 MLRNYSTLFLRTLREDPADAEVNSHKLLVRAGYIRRAAPGIYTWLPLGLRTLRKIEGIVR EEMDRMGAQEVHFPGLIPAEPYKATNRWEDYGPTLFKLADRKGGDYLLAPTHEEMFTLLV KDMYSSYKDLPTTLYQIQTKYRDEARPRAGLIRGREFVMKDAYSFDIDEAGLEASYQAER DTYERIFTRLGLEYVIVSAMSGAMGGSRSEEFLHPSPIGEDTFVRSPGGYAANVEAVTTP APPDADASGTPAPRIVPTPDCPTIESLVALLNSEQPREDRPWEAADALKNIVVVLTHPDG SRELLVVGIPGDRDVDMKRLEASVAPAEVAMATDADFEGHPELVRGYIGPQVLGPNAPEG AESVRYLVDPRVVAGTAWVTGANQDVHHVLNLVMGRDFTADGTIEAAGIREGDEAPDGSG PLHVERGIEIGHIFQLGSKYAEALGLSVLDENGKSRVVTMGSYGIGVTRVMAALAEANCD DKGLSWPAGIAPFDVHVLATGKDNAVFEAAASLAAALDAAGLDVVFDDRRKVSAGVKFAD FELVGVPLGVVVGRGLKTGNVEVRVRGTGESFEVPVDAAAARVVELHGQLMGEN >gi|319976816|gb|AEUH01000295.1| GENE 2 1951 - 2442 778 163 aa, chain + ## HITS:1 COG:Cgl1563 KEGG:ns NR:ns ## COG: Cgl1563 COG0242 # Protein_GI_number: 19552813 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Corynebacterium glutamicum # 1 159 1 162 169 128 44.0 4e-30 MAIRPIRIIGDPVLRTVCDPVTEVTDSVRTLVEDLLEGVDMEGRAGLAANQIGVGLRAFS YNIDGQIGYVLNPTIVELSEDEYQDGDEGCLSIPELWYPTKRAWYARCEGTDLDGRPVVL EGEELMARCIQHEVDHLNGHLYIDRLERKVRKKALRDIRDAGM >gi|319976816|gb|AEUH01000295.1| GENE 3 2640 - 3143 544 167 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1800 NR:ns ## KEGG: Bcav_1800 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 1 167 1 166 166 208 66.0 7e-53 
MRGTYTRGVLFVHSTPPALCPHIEWALGTALGQEVHLQWSDQGAAPGMVRCEFAWVGPIG SGARLASALRGWEHLRYEVTEEPTASADGGRWCHTPSLGVFHSQMDTAGNVVVSEDRVRA ALESASTLDELREALELALGQAWDDELEPLRHAGAGAPVRWLNNRVG >gi|319976816|gb|AEUH01000295.1| GENE 4 3146 - 3499 267 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509142|ref|ZP_02044784.1| ## NR: gi|154509142|ref|ZP_02044784.1| hypothetical protein ACTODO_01663 [Actinomyces odontolyticus ATCC 17982] # 1 110 1 115 123 76 48.0 6e-13 MGTIADLFGDQLRATVGADDGADGAAGEPAPASSPGSKADLVEEAAIEAVLEESGLEPGS ARAELTLRGDLDLDDLGLCAVVARFERAAGARCPDAEIGQWRTLGDLLSAARALGGG >gi|319976816|gb|AEUH01000295.1| GENE 5 3677 - 5089 2001 470 aa, chain + ## HITS:1 COG:YPO0348 KEGG:ns NR:ns ## COG: YPO0348 COG1027 # Protein_GI_number: 16120683 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Yersinia pestis # 3 464 6 467 478 583 62.0 1e-166 MTRTEEDLLGTREVPEDAYWGIHTLRATENYRISGRTINEVPELIRAFAEVKKACAMANM QLKVMKPVKAEAIIAACDEIIENGKCLDQFPVDQFQGGAGTSVNMNANEVIANLALEILG EEKGNYSVIHPNDHVNRSQSTNDAYPTAFRIALWRKVNRLRKALDELADAFEEKGGQFRR ILTMGRTQLQDAVPVSLGAEMTAFAHTLREEDERLDNNAELLLETNLGATAIGTGLNTPE GYQESVVRHLRRITGARVVASPDLLEATSDTGAYVSMHATIKRSAAKLSKICNDLRLLAS GPRAGLNEINLPRMAAGSSIMPAKVNPVIPEVVNQVCFKVIGNDHVVTMASEAGQLQLNV MEPVIAQAIFESINLLVNACDTLRLRCVEGITANEAVCRGYVENSIGIVTYFNDVIGHHL GDVVGKRAADEGKTVREVIHDMQLLAQDEVERILSPENLANPKYAGATED >gi|319976816|gb|AEUH01000295.1| GENE 6 5294 - 5815 156 173 aa, chain - ## HITS:1 COG:no KEGG:plu2709 NR:ns ## KEGG: plu2709 # Name: not_defined # Def: hypothetical protein # Organism: P.luminescens # Pathway: not_defined # 1 162 78 238 246 155 46.0 5e-37 MTLPGFFRPSKSWDVTVVHRGALLAVIEFKSQAGTSMGNNFNNRSEEAVGSAYDLRCAYD EGLLGQIDPPFIGWFMFVEENSASTRPHRDGTQYLFEIDHVMRRGSYIDRYVELCQRLTA TGLYTSCALVSAPASSITSGDFTVHSDDTSPQQFLDALEAHLTRAVRGEPGQR >gi|319976816|gb|AEUH01000295.1| GENE 7 5706 - 5921 109 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDVPAWLLNSITASRAPRWTTVTSQLFEGRKKPGSVTARRSTNTESLGSPARSATSWIRW TNPSMFLPAVT >gi|319976816|gb|AEUH01000295.1| 
GENE 8 6040 - 7713 1092 557 aa, chain - ## HITS:1 COG:jhp1409 KEGG:ns NR:ns ## COG: jhp1409 COG1002 # Protein_GI_number: 15612474 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Helicobacter pylori J99 # 131 511 809 1191 1252 105 24.0 2e-22 MTTDSTSLPNPNPPGSLGGEKQRGAVFTAPAVVDFMLDLIGYRQDAPLASRRILEPSFGG GVFLLRIAERLLSSHRSHGGKSAESLEPCVRAVEMDHSTFTATRRCLRETIQEHGFTPAE ADLLLDSWLIEGDFLTVPLDGAFDYVVGNPPYIRQEALDPAALALYRRRYRTMIGRADIY VPFFERSLSLLSPSGTLSFICSDSWVKNAYGRELRALIAHEYGMDAYFDMYMLPAFEKSV GAYTSITRIRRSPVRRTLVAQIESVEDAYLQDVARAADTPERSDPRLAPLPTPTGSSPWL LSPDPAQTIIRRLEDTFPTLEDAGCRIGIGVATGADKVFIGRLDALDVEDDRKLPLATGK DIVNGALAWDGRGIINPWADDGSLVDLTQYPRLGAYLRRFWHPLAKRHTAKRDPQRLWYR TIDRIVPSLTTTPKLLIPDIRDNADSIAYDAGTVYPHHGLYHITSTSWDLRALRAILRSG LAALVVRAYSTKLGGGYFRFQAQNLRRIHLPAWDSLETRTRRELSRVGRLGGVADNRLLA EALGITESESMTIKRAA >gi|319976816|gb|AEUH01000295.1| GENE 9 7917 - 10661 4120 914 aa, chain - ## HITS:1 COG:Cgl2195 KEGG:ns NR:ns ## COG: Cgl2195 COG2609 # Protein_GI_number: 19553445 # Func_class: C Energy production and conversion # Function: Pyruvate dehydrogenase complex, dehydrogenase (E1) component # Organism: Corynebacterium glutamicum # 10 900 20 907 922 1147 62.0 0 MSQLHESHPVVNGLIPQVPDNDPQETREWIESLSGLINEKGGPRARYILLHMLDEARRNG VQLPQEYTTPYVNTIPVDQEPYFPGDEAMERQYRRWIRWNAAVQVTRAQRPGVKVGGHIS SYASVSTLYEVGLNHFFRGKDHPGGGDHVFFQGHASPGPYARAFLEGRLSQEEMDGFRQQ ASTRRGLPSYPHPRQLPGFWEYPTVSLGLGPAEGIYQAWFDRYLHLGGLKDTSQQHTWVF LGDGEMDEPESRGMLQLAAQQRLDNLTFVINCNLQRLDGPVRGNGKIIQELEAFFKGAGW NVIKVIWGRGWDQLLAADKDDALVHLMNETLDGDYQTFKANDGAYVREHFFGRDPRTKEM VRNWTDEQIWELKRGGHDYRKVYAAYKAAMEHTGQPTVILAHTIKGYALGSHFAGRNSTH QMKKLTVEDAKQLRDRLQIPITDEELERDPYQPPYYMPPADHPALQYMKERREVLGGWVP ERRDDRQPKLPPLPARPFEALSKGSGKLEVASTMALVRLIKDLMKDKSVGRYFVPIIPDE ARTFGLDAIFPSAKIFNTTGQSYTPVDADMMLSYRESEHGRILHTGITEAGSAAAFQVVG TAYATHGLPMVPIYIFYSMFGFQRTGDQFWAAGDQLTKGFVIGATAGRTTLAGEGLQHMD GHSHVLAATNHAFVNYDPAYAYEIRHIMADGLQRMYGDAGGRDPNVMYYITVYNEPIHQP 
AEPAGLDVEGVVKGIYKLDGHTGSGGPKAQLLASGVGVPWAREARRLLAEDWGVDTAVWS VTSWSELRRDGLEADEHNFLHPEAEPRTPFVAERLAGAEGPFVASSDFDKMVPDQIRQWV PGDYHVLGADGFGFSDTRRAARRWYHIDAESMVVRTLAALASRGQVAPSAVRAAIDKYDL FNCAVPGSDHAGED >gi|319976816|gb|AEUH01000295.1| GENE 10 11029 - 11421 381 130 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0573_11425 NR:ns ## KEGG: HMPREF0573_11425 # Name: not_defined # Def: hypothetical protein # Organism: M.curtisii # Pathway: not_defined # 1 130 22 151 151 131 54.0 7e-30 MNLGFTRGQIIQEFYVDDDVDQKLREAVEASTAEQLVDVDYGDVVDGAIVWWRADDAEEE DLADVLVDALSNLDDGGLIWVLIPKPGRTGSLPVADVEDAATIAGLHSTTAASVGSQWAG IRLTARPRQR >gi|319976816|gb|AEUH01000295.1| GENE 11 11614 - 12480 623 288 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLTRRKHDFDCRITPLGGPTLLIEIDDLCVLVDPVLGDSPECVPPVCATGQELASAFGAE GDHLAEALEDGIDCVLVTDPDHLDQAGLAFALRAPTLMAGGDGQLPGAPRRAFEENAEAV VSDTVLIWAVTPSGATEEAWISGFWINAEACAIHIAGDRTTAGMVREYATAMGPLWAAVV RSGAVRDIWEEAPGRHRASDGGGDNGAVEAEPALTAAGLAEAACALGVPVIVFVYPDAGD RSARVGAAARRAFEQAGIGHALVELRPGEPVEVGRGEGQVDPTRWPST >gi|319976816|gb|AEUH01000295.1| GENE 12 12765 - 13418 446 217 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988990|ref|ZP_01820390.1| hypothetical protein CGSSp6BS73_02415 [Streptococcus pneumoniae SP6-BS73] # 26 202 16 192 192 176 46 1e-43 MSTTDQTTGERGAKPLLFDQINDAEENAAMRAKGWPPVYTASAESRIVLVGQAPGRIAQR TRKPWNDPSGRLLRQWLGVTDQQFYDPALFALMPMDFYFPGKGTHGDQPPRKDFAPTWHP RLLALMPRVRLTILVGAYAQRYYLGAAAGRTLTETVANAADAPEGFFPLVHPSPLNIAWR KRNPWFEQETVPLLREAVAEALEGAAEPGALEGAGHE >gi|319976816|gb|AEUH01000295.1| GENE 13 13723 - 14205 417 160 aa, chain - ## HITS:1 COG:SMc03764 KEGG:ns NR:ns ## COG: SMc03764 COG3727 # Protein_GI_number: 15966902 # Func_class: L Replication, recombination and repair # Function: DNA G:T-mismatch repair endonuclease # Organism: Sinorhizobium meliloti # 1 128 17 143 145 78 36.0 4e-15 MRSNTRRDTAPELAIRRLLHARGYRYRVDFAPWPNKRRRADIVFTRLKVAVFVDGCFWHA CPEHSRVPLTNREYWEAKLKRNARRDLDTTSMCQAEGWTVVRIWEHVPADEAVAMIVEAL VAASGAEGPTGAGAESEGSAIEPLGADEQPDSAEAEPAGG 
>gi|319976816|gb|AEUH01000295.1| GENE 14 14250 - 15239 961 329 aa, chain - ## HITS:1 COG:all3838 KEGG:ns NR:ns ## COG: all3838 COG0457 # Protein_GI_number: 17231330 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 3 323 269 589 710 125 30.0 8e-29 MNNSDPIAERNALARQYQAQGDLGRAIALFEANLHECADLRGADSADTLAARNNLARAYQ DAGRLDDAIPLLERNLGDSTRLLGADHPSVLVAQANLAGAYQDAGRLDDAVAVSEASLAV HLRVLGADDPGTLTARNNLARAYQDAGRLDEATALFEQNLVGLARVLGPDDPMTLTARDN LAAAHQQAGRPADAVPLLRENLDARIRTSGPAHPDALTARGLLGLACLDAGQVDEAIRVL EENDCAADSVLGADDPIALSTRNSLAGAYLRAGRPADAIPLLEKGLDAVLRSSGPDHPYA PVARANLAAAYRAAGRNADAEALNPSEEG >gi|319976816|gb|AEUH01000295.1| GENE 15 15236 - 16345 1288 369 aa, chain - ## HITS:1 COG:all2787 KEGG:ns NR:ns ## COG: all2787 COG0457 # Protein_GI_number: 17230279 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 7 358 540 891 924 187 36.0 3e-47 MNAADPAAQRNARAHRYLEESRGDLAIPLFEEELAHRREALGPDHPDALTAQSNLATAYQ AAGALDEAIALYGGALALSRGALGQAHPDTLATQSNLATAYLDAGRLDEATALLEDTLAK RRAALGPTDPDTVASQGLLATAYLDAGRLGEATALLEDTLAQHRALLGPEHPDTLTSQSI LAGAYLEAGRLDEATALHERILAQRRETLGPDHPSTLTSQGGLAAAHVKAGAFDRAIPLL EDALTRLRAVLGAEHPDTLASQANLATAYRKAGRLDEATALLEDALAQHRALLGHAHPGT LTVQTNLANAYRKAGRLGEAIPLYEAALKTSEDALGPDHPLTANIRRDLESARRAASPAP SSSPESSDE >gi|319976816|gb|AEUH01000295.1| GENE 16 16342 - 16662 316 106 aa, chain - ## HITS:1 COG:all2787 KEGG:ns NR:ns ## COG: all2787 COG0457 # Protein_GI_number: 17230279 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 1 95 797 891 924 78 42.0 3e-15 NLAYAYQAAGRLDEAIPLYEQVLADRVRVLGDDHPDTLTSRNNLAHAYQAAGRHHEAINL FKDTLKVCEDTLSPGHPLTTTVRENLETLEREMNPAPASFPESSEE Prediction of potential genes in microbial genomes Time: Thu May 12 19:20:38 2011 Seq name: gi|319976813|gb|AEUH01000296.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00296, whole genome shotgun sequence Length of sequence - 5015 bp Number of predicted genes - 5, with homology - 2 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 2483 1514 ## COG0457 FOG: TPR repeat 2 2 Tu 1 . - CDS 2600 - 2686 74 ## - Prom 2738 - 2797 2.9 3 3 Op 1 . - CDS 3116 - 4486 2151 ## COG1252 NADH dehydrogenase, FAD-containing subunit 4 3 Op 2 . - CDS 4520 - 4831 216 ## 5 3 Op 3 . - CDS 4925 - 5011 102 ## Predicted protein(s) >gi|319976813|gb|AEUH01000296.1| GENE 1 2 - 2483 1514 827 aa, chain - ## HITS:1 COG:all2787 KEGG:ns NR:ns ## COG: all2787 COG0457 # Protein_GI_number: 17230279 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 556 826 524 794 924 228 49.0 5e-59 MRRAKRQITHIIKRAKPEDYAALAEARTPEAIEQVLRRLPRKWGRGIREADRQWFDELVG AVAREYTILAPWSLAFEPRALDAILSELETIHKVATDDGRKLDKLLAGSVRPGRVLFGGR PDVAFGFRERAEQGRLRGLVVDGAQERAVLVGMAGCGKSQLASSLAQECEAAGWALVAWL NASSNDSIRSDLVELARRLGVDTSDNPTQDQVVRRCLDHLQSAEASDRLVVFDNVERIED VAGVVPRGVGLRVVATTRNREVLSAWEPVQVGVFPRSDSISFVQSITDSKDDEAADALAE RLGDLPLALAQAAETARIERWALAEYLDQLDAYASDRVIRRILGGSYAGDVSTVLWMAVE LAIDRLDEEERVVPRRQVGALAVLAESGVPSRWIAPQVRKAQADGNAEKANEDTDADGSS EDADAPNAAAENARRALTALLNASVVQQSADGGVTMLHRLQGQVLRENWNREEWAEAFDA AADLLDRVDIDSLPREDTDGRRREARDLIDQLRAIAAQGYSRPLYECERVVADLPHVLFH ATDLGLPNEALALRGALSVVEEVLGPDHPDTLASRNNLAGAYESAGRLGEAITLFEEVLA DRVRVLGDDHPDTLISRNNLAYAYRAAGRLDEAIPLYEEVLADRVRVLGDDHPQTLTSRN NLAAAYRAAGRLNEAIPLYEQVLADRVRVLGDDHPDTLTSRNNLAAAYQTAGRLNEAIPL YEQVAADCARLLGGDHPDSLASRNNLAGAYESAGRLGEAIPLYEEVLADRVRVLGDDHPQ TLTSRNNLAGAYESAGRLNEAIPLYEQVLADRVRVLGDDHPHTLTSR >gi|319976813|gb|AEUH01000296.1| GENE 2 2600 - 2686 74 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLSWGRRAGCSPVLRKRSLMAVVRFGV >gi|319976813|gb|AEUH01000296.1| GENE 3 3116 - 4486 2151 456 aa, chain - ## HITS:1 COG:MT1902 KEGG:ns NR:ns ## COG: MT1902 COG1252 # Protein_GI_number: 15841322 # Func_class: C 
Energy production and conversion # Function: NADH dehydrogenase, FAD-containing subunit # Organism: Mycobacterium tuberculosis CDC1551 # 3 434 14 446 463 505 60.0 1e-143 MSRHRIVIIGSGFAGLTAARRLKRADADITILARTSHHLFQPLLYQVATGILSEGDIAPT TREILKRQKNATVMQALVDGIDVEARVVKWRNHNDRFETPYDTLIVAAGAGQSYFGNDHF AVFAPGMKTIDDALELRARIFGAFEQAEQATDPAVVERLLTFVVVGAGPTGVEMAGQIRE LASKTLKGEFRNIDPTKARVVLVDGASAPLPPFGEKLGAKARKSLEKLGVELRMNAFVTG VDADGVTLKQKDGTAETIASQCKVWAAGVQASELGKLLADATGAEVDRSGRVVVGKDLTL PGHPEIFVLGDMMSVPGVPGVAQGAIQSARFAADTVKARLRGRAPKRTEFSYFDKGSMAT IARFKAVVKMGRLHLTGFTAWAAWCFLHLLYIVGFKSQVGTLVHWFFSFVSGARSQRTTT NQQMVGRLAMGQLGVGASGKLVAGEDVVEEIIEAQE >gi|319976813|gb|AEUH01000296.1| GENE 4 4520 - 4831 216 103 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLGFAAFWWHRSGVFQNASQFDGFGQISGIPARLRRILEHPPAAPPDRRETQTHFGTPG IAEPCLLGGSYHLGRQALRYIGGTGPPRPALRESSQFRDAWFQ >gi|319976813|gb|AEUH01000296.1| GENE 5 4925 - 5011 102 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRPVDALPLTATGKPSPAAARELLLSGE Prediction of potential genes in microbial genomes Time: Thu May 12 19:20:55 2011 Seq name: gi|319976804|gb|AEUH01000297.1| Actinomyces sp. oral taxon 178 str. F0338 contig00297, whole genome shotgun sequence Length of sequence - 10142 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 7, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1380 1145 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II - Term 1442 - 1504 -0.9 2 2 Op 1 13/0.000 - CDS 1548 - 2600 1427 ## COG0320 Lipoate synthase 3 2 Op 2 . - CDS 2597 - 3436 837 ## COG0321 Lipoate-protein ligase B 4 3 Op 1 . + CDS 3342 - 4202 876 ## COG0327 Uncharacterized conserved protein 5 3 Op 2 . + CDS 4199 - 4942 990 ## Bcav_1829 hypothetical protein 6 4 Tu 1 . - CDS 4958 - 5770 669 ## COG3022 Uncharacterized protein conserved in bacteria - Term 5835 - 5884 7.1 7 5 Tu 1 . 
- CDS 5971 - 7677 1206 ## Tery_0977 TPR repeat-containing protein - Term 8245 - 8286 -0.9 8 6 Tu 1 . - CDS 8317 - 9099 983 ## COG1940 Transcriptional regulator/sugar kinase - Prom 9129 - 9188 2.0 9 7 Tu 1 . - CDS 9192 - 10079 1280 ## COG0024 Methionine aminopeptidase Predicted protein(s) >gi|319976804|gb|AEUH01000297.1| GENE 1 3 - 1380 1145 459 aa, chain - ## HITS:1 COG:CC0966 KEGG:ns NR:ns ## COG: CC0966 COG0318 # Protein_GI_number: 16125218 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Caulobacter vibrioides # 44 435 48 473 530 145 30.0 1e-34 MAPDYSPARALLRAAEQHPSRASLVDGATDRVWSVAESADAVARLAGALRGRGLGASSRI AVVAQNSPWHFLIHVAASWIHAATVPMSPRAPAPRLREMLEAAAVDLVVVDEAGAAALAG GAGVPVVAVSDIEEWSRLAPAPAGPPLPCAQEEAAVVFTSGSTGLPRPVRLSHAALWWAS ACFRDGFEYSPGSHVVGVCAPLSHIGGFNGTTLDTFSHGGTLVVVGPSFDPVRTLECVQR HRIAMMFVVPTMARALLEANESVGADLSSWVRPLVGGDALTPALAERMRRAGLAPIHVWG MTETGGAGAMAAPDSRAPAGSIGRPFPYVDLRIVGAHGAAAGPGALGTIEVRGPGVVTGQ EWLSTGDLGFVDADGWVHLVGRAHRMINTGGELVAPPRVEAALMELEEVREALVVGVPDE TWGEVVGAVLVPSPGADAAFLSPASLAAALGGALAPWER >gi|319976804|gb|AEUH01000297.1| GENE 2 1548 - 2600 1427 350 aa, chain - ## HITS:1 COG:Cgl2160 KEGG:ns NR:ns ## COG: Cgl2160 COG0320 # Protein_GI_number: 19553410 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate synthase # Organism: Corynebacterium glutamicum # 7 335 5 335 348 432 63.0 1e-121 MSTLPDPEGRKLLRVEVRNSQSPIEAKPEWIRTTARAGENYQDMRSLSHAKGLHTVCAEA GCPNIYECWQDREATFLLGGALCTRRCDFCDIATGRPGEYDRDEPRRIAESVAELELRYV TITGVARDDLPDGAAWLYAQTCRLIHEKCPGTGVELLVDDFRGRQEAIDMVIDAAPQVFA HNLETVPRIFKKIRPAFDYDRSLEMIGRAHDGGMVTKSNLILGMGERREEISAAMRDLHE AGCDLLTLTQYLRPSPLHHPIDRWVHPEEFLEMAAEAEQIGFAGVMAGPLVRSSYRAGLL WAKGMRARGFAIPQNLAHIADSGSTLQEAGTVLARLRERTARRTRTAAPA >gi|319976804|gb|AEUH01000297.1| GENE 3 2597 - 3436 837 279 aa, chain - ## HITS:1 COG:ML0859 KEGG:ns NR:ns ## COG: ML0859 COG0321 # Protein_GI_number: 15827384 # 
Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase B # Organism: Mycobacterium leprae # 48 263 6 221 235 143 42.0 4e-34 MSHGSAVEAGYQASMSPMMSPTVRLAVFAASIRTGYRRRAQRTRGGATIITAVRIVNLLG RGLVDYREVDALQRSLHARVRQGGEDAVVVSQFTPTWTAGRHTKPQDIPDSDVPVIRVDR AGSATWHGPGQLVVYPVVRLREPVDLVAWIRSVERGVIDTVRAQWHLPARRVEGRAGVWI TEEGRRDRKLCAIGLKVARGATLHGLALNVAIDPQKAFTGIIPCGLVDADVASLSWEGVH TTVEAACAALVPALIDAIAPQLARPVSDVSTTTDPRTLQ >gi|319976804|gb|AEUH01000297.1| GENE 4 3342 - 4202 876 286 aa, chain + ## HITS:1 COG:Cgl2188_1 KEGG:ns NR:ns ## COG: Cgl2188_1 COG0327 # Protein_GI_number: 19553438 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 7 128 2 123 123 123 54.0 4e-28 MDAANTASLTVGDIMGLMEAWYPASTAEPWDKVGLICGDPGQRASSVLLALDPVDAVAQQ AREGGFDMVITHHPLLLRGASFLPVTDPKGSVISHLIRSDIALFNAHTNADVAHLGVAEA LAALVGMSHWEPLEPTGTDSQGRPIGLGRVGDVEPQSLGEFADRVAAALPASPAGLLVGG DEERPVRRAAVLGGSGDSLLQRAREAGADVYLTADLRHHPAQEHLEGGAPALLCGSHWAT ESPWLPVLAGRLAAAAAAKDVELRVEVSTIVTEPWSSHRATLGGTA >gi|319976804|gb|AEUH01000297.1| GENE 5 4199 - 4942 990 247 aa, chain + ## HITS:1 COG:no KEGG:Bcav_1829 NR:ns ## KEGG: Bcav_1829 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 3 240 4 241 246 94 35.0 5e-18 MKAPHTDQLELLTLLGYDQRESVLRHKRESHPAHAVVREFAGRAQDLQRAAVKQSAVISD AGREVARIEAEIARVRERRQRQQGRIDANQVPLRDISAMQHEIAQMDRRLGELEDDQLAA EERVESARSAQRAMEEESAAIMADVEEHKAQFLADTAATDDELRQVIRARRELVARLDAD LVEEYEDAKRRNGVLAVIEVRDGVGVGVGADLSPLELDRIMRTPADEVYRTEDTQQIVVR TTASTPR >gi|319976804|gb|AEUH01000297.1| GENE 6 4958 - 5770 669 270 aa, chain - ## HITS:1 COG:Cgl1949 KEGG:ns NR:ns ## COG: Cgl1949 COG3022 # Protein_GI_number: 19553199 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Corynebacterium glutamicum # 1 234 1 232 245 117 37.0 2e-26 MLIWLPPSEGKSAPESGPRLDVEALSIPSLAAARRTVIEAVEALGDGEEAARALGVGARA AAQLGANTRLRTSPCAPASRVFTGVLYDAVAASGADPWERSEGVTVFSALFGALSPTDPI 
PDHRLAMGVSLPGLGPMARWWAPRLAAALEPLAKERIVLDCRSGPYRAACRAPWAHTWEL RVERQSATGRQVVSHDAKRWRGAVAGSLMAAGALESEGEDECMAALTGAAMEIELTDARG GAHRVVGVECSEPRRTAAGGSRREVTLVVS >gi|319976804|gb|AEUH01000297.1| GENE 7 5971 - 7677 1206 568 aa, chain - ## HITS:1 COG:no KEGG:Tery_0977 NR:ns ## KEGG: Tery_0977 # Name: not_defined # Def: TPR repeat-containing protein # Organism: T.erythraeum # Pathway: not_defined # 54 364 299 617 1215 79 26.0 4e-13 MLMCVFNLGLCLRQADRLDAAIGAFARALALAEDQRTPFDTERERKTVQCKTHDELGKVL ADKQSFTESLHHFTTALDIAKTMRDSLELQGIIRRHMANVLEDKGDTDQALAVCGQALED LRTAGSEPYEIHQTTVDLATVHIVRDEHATAVDLLHEAINGFESLPGTESAIAHARASLA NALRFLGRFAEAERQYALADTVFRRSGRQSSIAYNLQNSALNYAEQGKWEAADRTYSEAL SKFEELGVSDYETGRTLMNHSETIRLAGNPERAEEVCRQAIAALNRAEGAGGLVGMALTV LGCCLKDQQRLPEAEAALMESVRLIEQNEGSAYSLAYAEMDLGVVIADQCRFDEAAPHFK RAREEFLRCGMHYEANLVNREEADALQRESERAHDAEKKDALLVEALSLAVRAAVDADER RFQFLASDARMRWIQRVAEPSRELALSLAADAQDAGLVSELVASWRTSGVVSPSSGLERA RGGDPGRPPHHPAQATVPITGVMATDAENGVGGVGSVEGLGDPSGGVRGSANAAGRSIGP NLVLPHGSRSTLSAYAPDLALRRKVRYR >gi|319976804|gb|AEUH01000297.1| GENE 8 8317 - 9099 983 260 aa, chain - ## HITS:1 COG:ML1023 KEGG:ns NR:ns ## COG: ML1023 COG1940 # Protein_GI_number: 15827493 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Mycobacterium leprae # 5 251 70 315 324 227 50.0 2e-59 MHTASQGGVLAHIACGIDIGGSGVKAALVDLDTGEYIGDQIRIPTPAPATPDAVARVCRE LVDRLGVGADVPIGVTFPAPVFDGVIPFMANLDQSWVGTDVNALMSEHLGRPVVPLNDAD AAGIAEVAYGAAKDAKGVVVFTTQGTGIGSAIIVDGRLMTNTELGHLELDGHDAEKRASS GQKTLQGLDWAQWAERLQRYYSHVEMLLSPDLFVVGGGVSENADKFMPLLKLRTPMIPAK LLNTAGIVGAAYYAANHQSR >gi|319976804|gb|AEUH01000297.1| GENE 9 9192 - 10079 1280 295 aa, chain - ## HITS:1 COG:MT2929 KEGG:ns NR:ns ## COG: MT2929 COG0024 # Protein_GI_number: 15842403 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Mycobacterium tuberculosis CDC1551 # 16 293 7 283 285 344 61.0 1e-94 
MTEASDLARRAPLGTLAPGRVSPARHVPERIERPEYMFHSGPERVTASDVKDAATIRRIR EAGRIAAGALEAVGAAVAPGVTTDELDRVGHDFLVAHGAYPSCLGYMGFPKSLCTSINEV ICHGIPDDRPLQEGDIVNVDITAYKDGVHGDTCAMFEVGAVDEESHLLVERTRNAMMRGI KAVGPGRQINVIGRVIEAYAKRFHYGVVRDYTGHGVGEAFHSGLVVPHYDAAPQYDTVME PGMVFTIEPMLTLGGVEWEQWDDGWTVVTADRSRTAQFEHTIVVTEDGAQILTLP Prediction of potential genes in microbial genomes Time: Thu May 12 19:21:07 2011 Seq name: gi|319976801|gb|AEUH01000298.1| Actinomyces sp. oral taxon 178 str. F0338 contig00298, whole genome shotgun sequence Length of sequence - 1353 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 19 - 861 1014 ## COG0413 Ketopantoate hydroxymethyltransferase 2 2 Tu 1 . + CDS 990 - 1353 288 ## Sros_6082 phenylalanine--tRNA ligase (EC:6.1.1.20) Predicted protein(s) >gi|319976801|gb|AEUH01000298.1| GENE 1 19 - 861 1014 280 aa, chain - ## HITS:1 COG:Cgl0115 KEGG:ns NR:ns ## COG: Cgl0115 COG0413 # Protein_GI_number: 19551365 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate hydroxymethyltransferase # Organism: Corynebacterium glutamicum # 15 274 4 265 269 235 51.0 9e-62 MSRITPPDAHSPAPVSKGRVRLPHIAAAKRSGTRLTMLTAYDALTAPLLEAAGVDMLLVG DSLGNVMLGHSSTLPVTVGDMERATASVARSTSRALIVADLPFGSYEDGPAHAFASAARL MKAGAHGVKLEGGAPRAHIIRSLVDAGIPVCAHLGYTPQSENALGGPRMQGRGEAGEAMR ADALAVQEAGAFALVLEMVPSTVAARITAELGIPTIGIGAGPDTDGQVLVWSDMAGMSNW TPSFVRRFGELGSALTAAAGDYVAAVRDGSFPGEDNFKEQ >gi|319976801|gb|AEUH01000298.1| GENE 2 990 - 1353 288 121 aa, chain + ## HITS:1 COG:no KEGG:Sros_6082 NR:ns ## KEGG: Sros_6082 # Name: not_defined # Def: phenylalanine--tRNA ligase (EC:6.1.1.20) # Organism: S.roseum # Pathway: Aminoacyl-tRNA biosynthesis [PATH:sro00970] # 41 121 8 88 365 65 45.0 4e-10 MRRADSADQEAAGPCPGAVRAGRTSATPPTRKQDMAGNAPDLDPLDADAVDAVRVAGLAS IAAAQSLDELKAARSALLGERAPLVTANRAIGGLEPSQRAAAGKNLGRARGALTQALQER H Prediction of potential genes in microbial genomes Time: 
Thu May 12 19:21:10 2011 Seq name: gi|319976798|gb|AEUH01000299.1| Actinomyces sp. oral taxon 178 str. F0338 contig00299, whole genome shotgun sequence Length of sequence - 2128 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 + CDS 3 - 851 1241 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit 2 1 Op 2 . + CDS 853 - 2128 1630 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit Predicted protein(s) >gi|319976798|gb|AEUH01000299.1| GENE 1 3 - 851 1241 282 aa, chain + ## HITS:1 COG:Cgl1356 KEGG:ns NR:ns ## COG: Cgl1356 COG0016 # Protein_GI_number: 19552606 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Corynebacterium glutamicum # 11 277 85 343 345 381 66.0 1e-106 TQALQERHEALAAAHEARALVEEAVDVTLPTRRERLGARHPLETLMEEISDFFVAMGWEI AEGPEIEHEWFNFDSLNFDIDHPARQMQDTLYIDGTSAGGREPSDGHLVMRTHTSPVQSR AMLARGVPLYVACPGKVFRSDALDATHTPVFHQVEGLAVDKGLTMAHLKGVLDHFAKAMF GPEARARLRPSYFPFTEPSAEMDLWFPQKKGGPGWIEWGGCGMVNPNVLRASGIDPRVYS GFAFGMGLERTLMLRHGISDMHDIVEGDVRFSQQFGLTGRGN >gi|319976798|gb|AEUH01000299.1| GENE 2 853 - 2128 1630 425 aa, chain + ## HITS:1 COG:STM1338_2 KEGG:ns NR:ns ## COG: STM1338_2 COG0072 # Protein_GI_number: 16764689 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Salmonella typhimurium LT2 # 168 425 1 242 651 195 41.0 1e-49 MPMVPLSWLGDHVDVAEGTTAAQLAAALVRVGLEEEQIHPARVRGPLVVGRVLTRVEETA SNGKTVNYCRVDVGEHNDAPGTGKEPSELPSRGIICGAHNFDAGDLVVVSLPGAVLPGDF RIAARKTYGHVSDGMICSARELGLGEDHDGIIVLDRCYPDRRIPAPGTDIVGWLGLGEEI LEINVTPDRGYCFSMRGVAREYSHSTGAAFRDPGLPGVIAAEPPSATPDGFPVVFADDAP VRGQKGVDRFVARVVRGVDPAAPTPRWMVERLEAAGMRSLSLAVDITNYVMLDLGQPMHA YDLGDLAAPIVVRRARDGESLVTLDSEERALDPEDLLITDSPDGEGSRVIGIAGVMGGQY SEVGPATTDVLLEAAHFDPVSVARSARRHRLHSEASKRFERGADPRLPAVAAQRAAELLA RYGGG Prediction of 
potential genes in microbial genomes Time: Thu May 12 19:21:13 2011 Seq name: gi|319976790|gb|AEUH01000300.1| Actinomyces sp. oral taxon 178 str. F0338 contig00300, whole genome shotgun sequence Length of sequence - 6047 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 510 593 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit + Term 671 - 709 -0.3 2 2 Op 1 . + CDS 716 - 1204 587 ## COG4635 Flavodoxin 3 2 Op 2 4/0.000 + CDS 1263 - 1778 754 ## COG1438 Arginine repressor 4 2 Op 3 . + CDS 1857 - 3080 1882 ## COG0137 Argininosuccinate synthase + Term 3134 - 3184 15.2 5 3 Tu 1 . - CDS 3074 - 3256 176 ## 6 4 Op 1 . + CDS 3186 - 4556 1781 ## CMM_1025 membrane protein, putative polysaccharide polymerase 7 4 Op 2 . + CDS 4601 - 6037 1891 ## COG0165 Argininosuccinate lyase Predicted protein(s) >gi|319976790|gb|AEUH01000300.1| GENE 1 1 - 510 593 169 aa, chain + ## HITS:1 COG:CAC2356_2 KEGG:ns NR:ns ## COG: CAC2356_2 COG0072 # Protein_GI_number: 15895623 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Clostridium acetobutylicum # 4 168 494 654 654 100 33.0 1e-21 APASRPSQVAPFHPGRAARVFVRAGRDLVEVALAGELSPSTCRAFGLPPRSCAFEADLDA LIGQMGATAVQVKGVSTFPLAKEDIALVVPSDIPVARVEQVVRQGAGALAESVTLFDVYE GGQVEKGHRSLAFALRLRGDHTLTAKEAEDVRRAVVAKARKALGARLRA >gi|319976790|gb|AEUH01000300.1| GENE 2 716 - 1204 587 162 aa, chain + ## HITS:1 COG:MA2078 KEGG:ns NR:ns ## COG: MA2078 COG4635 # Protein_GI_number: 20090924 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism # Function: Flavodoxin # Organism: Methanosarcina acetivorans str.C2A # 3 160 11 177 183 78 30.0 5e-15 MHILVTAASKHGATDEVADAIARRLTEAGFEVDRIAPGDVDSVEEYDAVVVGSAVYILQW MPAAHDFMERFARELAAKPVWAFSVGMNGVPKHAPQDPTRIGPLLTHVNAKDFKSFGGRY KPSLLSLRERSVARLAGVVEGDFRDWAAIDEWTDGIIASLGR 
>gi|319976790|gb|AEUH01000300.1| GENE 3 1263 - 1778 754 171 aa, chain + ## HITS:1 COG:MT1695 KEGG:ns NR:ns ## COG: MT1695 COG1438 # Protein_GI_number: 15841113 # Func_class: K Transcription # Function: Arginine repressor # Organism: Mycobacterium tuberculosis CDC1551 # 3 149 10 156 201 92 43.0 3e-19 MSGGSSIPQTKTARHEAIRALLAAEAIGSQEGLRLRLSQMGIDATQTTLSRDLMDMRATK IRDARGALVYTVPDVDGGPTHEAEAAHARLARWCQNLLVTSVAVGNQLVLRTPVGAANLL GSALDAVRLDGVAGTIAGDDTILVICRAPDEARAVERRLLAYAEPGAAPEG >gi|319976790|gb|AEUH01000300.1| GENE 4 1857 - 3080 1882 407 aa, chain + ## HITS:1 COG:MT1696 KEGG:ns NR:ns ## COG: MT1696 COG0137 # Protein_GI_number: 15841114 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 6 400 3 398 398 559 68.0 1e-159 MSTSKDRVVLAYSGGLDTSVAIGWIGEQTGREVVAVAVDVGQGGEDLEVIRQRALDCGAV EAYVADAREEFASDYCMPALKANGLYEGRYPLVSALSRPVIVKHLVKAAREFGATTVAHG CTGKGNDQVRFEVSITSMAPDMDCISPVRDLALTRDVAIDYAERHHLPIETTKHNPFSID QNVWGRAIETGFLEDLWNAPTKDVYVYTDDPTYPPLPDEVTLTFKEGIPVAIDGRAVTPL EAVQELNRRAGAQGVGRIDMVEDRLVGIKSREIYEAPGAVALIEAHQALEAITLERLQRR YKRHIEQVWAELVYEAQWYSPLKKSLDAFIDDTQKYVTGDVRMVLHGGRATVNGRRSEAS LYDFNLATYETGDTFDQSSSRGFIDIYGLQSKLAAARDVRSGVDMGY >gi|319976790|gb|AEUH01000300.1| GENE 5 3074 - 3256 176 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGRGLLEVEIHGRPGNAIVVVVLIDLTINDAGGPRGDPPAPLRRCYRFVTFVVIQNTADQ >gi|319976790|gb|AEUH01000300.1| GENE 6 3186 - 4556 1781 456 aa, chain + ## HITS:1 COG:no KEGG:CMM_1025 NR:ns ## KEGG: CMM_1025 # Name: wzy4 # Def: membrane protein, putative polysaccharide polymerase # Organism: C.michiganensis # Pathway: not_defined # 23 436 36 456 515 155 32.0 5e-36 MSTTTTMALPGRPWISTSRSPRPTGDSRALLYATAALFFLGFAGKGVRAIFGEIGSLIPM GAALAVFVVVFRRSGARLTLRRFPTTISLFVAWCACTTAWSLYPFDTVKFSVLQAGVTLV SISIAVALPLPLLVDALILAFQWIIASSYLLEAAVALFHEGPLAPPVLWGKGPLPASSYW IDGRLFEGGPIQGFPGNRNPLAFIALLMGVCLVLRWMQTRSRTLSTVLWGLAGLAVFPLT GSATVILSALGCAIAIGVLVICRRVEPSKRLSLVGVFAGMAGVGALGVFFARHWVADFFG 
RSPDMSGRSVIWHAVMRLVRERPTGGWGWTIGWATYMEPFKSLVVRGDGTSTNQSHNVYV EAALLTGVIGLLIISFAVVWTTYRLIKVAVAHIDDNLLDVIPAVLMIAMVIQSFTESRML SEGNWMLFVVISTWVKVRGEVPFVWPSAAREHLEYA >gi|319976790|gb|AEUH01000300.1| GENE 7 4601 - 6037 1891 478 aa, chain + ## HITS:1 COG:Cgl1368 KEGG:ns NR:ns ## COG: Cgl1368 COG0165 # Protein_GI_number: 19552618 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate lyase # Organism: Corynebacterium glutamicum # 8 475 10 477 477 537 61.0 1e-152 MTANDHVSLWGGRFSGAPADALSALSVSTHFDWRLAGADIAGSHAHADALHAVGLLSDDE ARRMHEALRRLDDDVRSGAFKAGPDDEDVHTALERGLIERVGPALGGKLRAGRSRNDQIA TLIRIYLRDEARHLAGRLLDIADALVAQAKAAGGAVMPGRTHLQHAQPVLVAHHLMAHVW PLIRDVERMVDWDRRARVSPYGSGALAGNTLGLDPRAVAAALGFDDSAANSIDGTASRDV VAEFSFILAMAGVDISRLSEEIIIWNTKEFGYVTLDDSYSTGSSIMPQKKNPDVAELARG KAGRLIGDLAGLLTVLKGIPLAYDRDLQEDKEPVFDQIDTLDVLGPAVAGMIATMTIHYP RLAELAPQGFSLATDIAEWLVKRGVPFREAHEISGACVRAAEARGVELADLTDEELAAAS AHLTPGVRGVLTVEGSVAARAGRGGTAPARVAEQIDEAGAALALLRDWADAPVQRPGR Prediction of potential genes in microbial genomes Time: Thu May 12 19:21:25 2011 Seq name: gi|319976787|gb|AEUH01000301.1| Actinomyces sp. oral taxon 178 str. F0338 contig00301, whole genome shotgun sequence Length of sequence - 2102 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1221 1096 ## COG1928 Dolichyl-phosphate-mannose--protein O-mannosyl transferase 2 2 Tu 1 . 
- CDS 1389 - 1913 736 ## COG0431 Predicted flavoprotein - Prom 1944 - 2003 5.4 Predicted protein(s) >gi|319976787|gb|AEUH01000301.1| GENE 1 1 - 1221 1096 406 aa, chain + ## HITS:1 COG:ML0192 KEGG:ns NR:ns ## COG: ML0192 COG1928 # Protein_GI_number: 15827001 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Dolichyl-phosphate-mannose--protein O-mannosyl transferase # Organism: Mycobacterium leprae # 4 245 194 445 510 111 30.0 3e-24 WRIGPASGPRPWLLAAGVLAGLASSVKWSGVYVLAVLGLYVALREWTTRWRAGHPSPLFG ALLADVWWAFALMVPPAVLTYVASWFGWFTHPRAHGHGVTGGTGFLGALDDLWAYHVEMW NFHTTLTAEHTYQSNPFTWLFQVRATSFAWINDSTISGCHTGNCARDIVALGNPFLWWGG VAALLVLLWATARHRNWRTGLVACGYLALYAPWLMYAKRTIFTFYTVAFVPFVALAVAWA VGLIVGQAQWRGLPSALASADEAGGADRRGTSTDEAGAAQHRPFAPAFPASDLEGRAPAG ALDTASGPEGTGGALEAATAATAADSAPFFPAPTGNGEGAATGPQHLGGATLSDAPTADA LLVRFIAAGAITAAVLACAWYFLPLWTGQVLDYEFWRDHMWLSSWI >gi|319976787|gb|AEUH01000301.1| GENE 2 1389 - 1913 736 174 aa, chain - ## HITS:1 COG:Cgl0985 KEGG:ns NR:ns ## COG: Cgl0985 COG0431 # Protein_GI_number: 19552235 # Func_class: R General function prediction only # Function: Predicted flavoprotein # Organism: Corynebacterium glutamicum # 1 172 1 172 188 92 33.0 4e-19 MTNVAVVLGSTRPGRVGEGVAQWVVSEANKVEGVNASVVDLADFELPFFAEPMPPSMEAP KQPEAVRFAQVLGAADAVVFVTPEYNQSIPGVLKNAIDYLPPAAMEGKRIGLVGYSWRSA ASALAHLRTVLTMFGTTVEPQLGLNLGTDFRDGALVPTPEIEEGLRAVVEALKA Prediction of potential genes in microbial genomes Time: Thu May 12 19:21:26 2011 Seq name: gi|319976784|gb|AEUH01000302.1| Actinomyces sp. oral taxon 178 str. F0338 contig00302, whole genome shotgun sequence Length of sequence - 1963 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 102 - 749 800 ## smi_0862 acetyltransferase 2 2 Tu 1 . 
+ CDS 982 - 1963 831 ## Predicted protein(s) >gi|319976784|gb|AEUH01000302.1| GENE 1 102 - 749 800 215 aa, chain - ## HITS:1 COG:no KEGG:smi_0862 NR:ns ## KEGG: smi_0862 # Name: not_defined # Def: acetyltransferase # Organism: S.mitis_B6 # Pathway: not_defined # 6 205 7 206 210 106 28.0 5e-22 MTLRCRAARHDDINAMALIAAHGFEEYPLHGMMRPFMKPGASYFDFLVDLNTTMVRAYLR WRNALVVEEDGEAVAIALLNRVPVGLAGYIANGGLSMLRSAPLASLLRFFAMTEEADRGA KENADYDWYLEILAVAPGKRGHGIGSWVMRDVLPAYVRRRGGHRIAFITCTEDNVRFYRN SGCRVVNEDQVSLGGRTCPVWAFEKVAFEKVAHTD >gi|319976784|gb|AEUH01000302.1| GENE 2 982 - 1963 831 327 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHPPRLTALTRLVGAAVLAACLGAAHAVPALAADKAPGPAHVDEIFMLTKDGQLINRVSI TDPLGRIDYKRCQAMESPLASSSRADQTETRGTNSGSTTICDLTMSYNDANTQTEVKAGV NGTWALNTETFATAENMQQLFSVPMDYRSATVMVEGTVDEDSSSSQARLGTDETDSGRVF TVARWDTAPGQLVAAGSLTAGGVALADDPSRAIASPASAVPAYPGGHSFYTAASTAPTPS ASSSTAPSLARRGINWGRLVVAGIVAVAAAGFGVARRRNKPVAEPKGFPPPSQWPAAPTG YGRPEPAAAPGDPSTAPGSPAPHGAPG Prediction of potential genes in microbial genomes Time: Thu May 12 19:21:47 2011 Seq name: gi|319976782|gb|AEUH01000303.1| Actinomyces sp. oral taxon 178 str. F0338 contig00303, whole genome shotgun sequence Length of sequence - 1952 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 1950 2260 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins Predicted protein(s) >gi|319976782|gb|AEUH01000303.1| GENE 1 3 - 1950 2260 649 aa, chain + ## HITS:1 COG:ML0977 KEGG:ns NR:ns ## COG: ML0977 COG1674 # Protein_GI_number: 15827463 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Mycobacterium leprae # 114 610 383 883 886 594 64.0 1e-169 RPGDEPLLDRYDGDESFRMPHEVSPARAAGDGGAATGVDGATTVLGEDGAPAVRPRTTPQ GAPTEILTQVRPTVAAPGPDTVPAHVPEPDTVPDPGLGAPPMTEAPAGGFQADLDESIAY TLPGDDLLVSGPPHKTRSAVNDQVVRALAQVFADFNIDARVTGFSRGPTVTRYEVVLGAG VKVDKLTNLSKNIAYAVASADVRILAPIPGKSAIGIEIPNSDRENVALGDVLRSGAARRN QHPLVVGVGKDVEGGYVVTNLAKTPHLLVAGQTGSGKSSFVNSMITSIMMRATPQQVRMI LVDPKRVELTIYEGIPHLISPIITDPKKAAEALEWVVKEMDARYNDLSDYGFKHIDDFNK AVSLGQIQAKPGLERVLHPYPYLLVVVDELADLMMVAPRDVEASIQRITQLARAAGIHLV LATQRPSVDVVTGLIKANIPSRLAFATSSLTDSRTILDQPGAEKLIGQGDALYLPAGASK PMRVQGAWVSESEIHQIVSHVKGQMEAHYRDDVVPEQTAAKVAEDIGDDLDDLLQAAELV VSTQLGSTSMLQRKLRVGFARAGRLMDLLESRDIVGPSEGSKARQVLVPPERLPEVLAML RGEQASLDGAGAAAPAAPAGSGPAAGSSAPVPAQGASAPAPAAAGRLRG Prediction of potential genes in microbial genomes Time: Thu May 12 19:21:48 2011 Seq name: gi|319976778|gb|AEUH01000304.1| Actinomyces sp. oral taxon 178 str. F0338 contig00304, whole genome shotgun sequence Length of sequence - 1943 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 68 - 1096 1386 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 2 1 Op 2 . - CDS 1113 - 1466 618 ## COG0316 Uncharacterized conserved protein 3 1 Op 3 . 
- CDS 1501 - 1941 462 ## gi|154509189|ref|ZP_02044831.1| hypothetical protein ACTODO_01710 Predicted protein(s) >gi|319976778|gb|AEUH01000304.1| GENE 1 68 - 1096 1386 342 aa, chain - ## HITS:1 COG:VCA0661 KEGG:ns NR:ns ## COG: VCA0661 COG0545 # Protein_GI_number: 15601419 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Vibrio cholerae # 84 179 58 154 157 85 46.0 1e-16 MTSTLRKIGAVACAALTAITLAACSPASSNSSQSGVLECTTDDSAEQAQVDRSGAGYFPK VTGTETPVIDPATGSEPTDKVLVKTLTKGDGPEVCPGAQVKVNYVGALWDGTKFDSSYDK GKPVSFSLSGVIKGWGYALAHAHVGDRLELVIPSSLGYGESGGQQIPPNSTLVFVVDILE QQGVSRQDLADEATLTGAEATGEELPAGITVTGGPGVEPTLTIDESQPMPTEQQVYTVYK GTGEALAATDTVLMKGVAGGWGVQGRTESSWNSEPLQQPVAQTPFAGYTIGSRIVVVSPI PAQAGQSGQSGTEAQAIVRVFDIVGKMEYLDSSGKPTTPPRG >gi|319976778|gb|AEUH01000304.1| GENE 2 1113 - 1466 618 117 aa, chain - ## HITS:1 COG:ML0871 KEGG:ns NR:ns ## COG: ML0871 COG0316 # Protein_GI_number: 15827394 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium leprae # 8 117 9 118 118 162 73.0 2e-40 MSDTATQAPTHEVILTDVAATKVKSLLAQEGRDDLRLRIAVQPGGCSGLIYQLYFDDRVL DGDAIRDFDGVEVIVDRMSVPYLAGATIDFADTIERQGFTIDNPNAQNTCACGESFH >gi|319976778|gb|AEUH01000304.1| GENE 3 1501 - 1941 462 146 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154509189|ref|ZP_02044831.1| ## NR: gi|154509189|ref|ZP_02044831.1| hypothetical protein ACTODO_01710 [Actinomyces odontolyticus ATCC 17982] # 6 130 197 321 322 85 57.0 8e-16 APGPSAGAGPASARGSGSGGGVAAVIAAIGGRIVDTGPLLVALGPLRRHLEEADLLVACE PDLSFPALAESCLDAITAAAAPWAIPVVALAVRSTLSAHERAQWGLHGVFATDGALGPGA AGARIARTWLPAADGPDSAAGRKRVG Prediction of potential genes in microbial genomes Time: Thu May 12 19:21:58 2011 Seq name: gi|319976774|gb|AEUH01000305.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00305, whole genome shotgun sequence Length of sequence - 1891 bp Number of predicted genes - 3, with homology - 1 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 87 65 ## 2 2 Op 1 . - CDS 51 - 1184 1272 ## COG3347 Uncharacterized conserved protein 3 2 Op 2 . - CDS 1282 - 1716 728 ## Predicted protein(s) >gi|319976774|gb|AEUH01000305.1| GENE 1 1 - 87 65 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no SAGAARWGTPARREASGQGSPALATCLR >gi|319976774|gb|AEUH01000305.1| GENE 2 51 - 1184 1272 377 aa, chain - ## HITS:1 COG:mlr3601 KEGG:ns NR:ns ## COG: mlr3601 COG3347 # Protein_GI_number: 13473109 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 21 375 15 346 347 115 30.0 2e-25 MKRFILNAEAPMTTPTADFIALCNRIGADPEFTRAGGGNSSAKTGTTLRIKPSGVPLATL REEDLVPLDIPTLLDALHSPAPEGEDPVRAAAARAQLGHRDRRPSVEILFHALIPDPLVL HLHPLTANAVTCNARGAALTEEIFGDQAIWVDYTDPGIPLALEIERQREAFQARHRKRPP RITMLGNHGVIVSGATVEEIDERVHFLTASIRAAIERAAADMEEQCLRVASHFKGATGAA AVAIAADEATRSDSERDAGPVSHGPLIPDQIVYAGSLPLLLDRNDTEEVVGAKTALYRAQ HGRLPVVAVIPAAAVIARGDSQGGADNALAVFLDALRVAREAGLLGEVRVMDEEERRFIE HWEAESYRKQVASAGLP >gi|319976774|gb|AEUH01000305.1| GENE 3 1282 - 1716 728 144 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWQAVAAGFAFSYAVTLGAFFQWQVRAGDDEFFTRVYTPFRTALGLKWRYRLFMPFGPAL TAAASLALNHSAHAVLPQALAVVVFFLYYFLAHVPTGFAQAEEDLMSGKGLTERQRRIYL RCNIPLHILMGSLYAATAVALVLT Prediction of potential genes in microbial genomes Time: Thu May 12 19:22:09 2011 Seq name: gi|319976771|gb|AEUH01000306.1| Actinomyces sp. oral taxon 178 str. F0338 contig00306, whole genome shotgun sequence Length of sequence - 1622 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 108 - 383 423 ## gi|293192901|ref|ZP_06609745.1| conserved hypothetical protein 2 1 Op 2 . 
- CDS 414 - 1511 1675 ## COG1252 NADH dehydrogenase, FAD-containing subunit Predicted protein(s) >gi|319976771|gb|AEUH01000306.1| GENE 1 108 - 383 423 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293192901|ref|ZP_06609745.1| ## NR: gi|293192901|ref|ZP_06609745.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 1 62 4 65 98 66 82.0 4e-10 MTTDPRRALTRLLNAFENHFDIARDGGEVDEDALVAAEEALRDAFFTYDDQLFTRYGVEL PFDLLDDEDDYDDEFDDDDDLEDDDFVDVDD >gi|319976771|gb|AEUH01000306.1| GENE 2 414 - 1511 1675 365 aa, chain - ## HITS:1 COG:all2964 KEGG:ns NR:ns ## COG: all2964 COG1252 # Protein_GI_number: 17230456 # Func_class: C Energy production and conversion # Function: NADH dehydrogenase, FAD-containing subunit # Organism: Nostoc sp. PCC 7120 # 3 306 6 348 442 79 28.0 8e-15 MARVAVIGGGYGGVTVAKGLDPLADVVLIEQKDQFVHHAAALRAAVDSVWEQSIFMPYTN LLSHGEVVKGTVSKVEGTTVHVFGREPIEADYVVLATGSSYPFPAKYSSYKSGVAKARLE QLHENLGGARSAMVVGGGTVGIELTGELANAFPDLEITIVEKGDEILSTPGYSPGLRAEI SEQLAQLGVRVITGSELAYLPPQNVGDLAHFMVETKNGDAIEGDIWFQCYGARPVTGFLS GTAFEPLLHPNGTIAVEPTLRVKGHDNVYAVGDITDVRESKRADAARQQARVVIANISAQ LEGEDPDTTYEPTKEWVILPLGPTMGASQLLDSDGAVRILGAEQTAEIKGTDLMVSVIRS QLNLP Prediction of potential genes in microbial genomes Time: Thu May 12 19:22:17 2011 Seq name: gi|319976768|gb|AEUH01000307.1| Actinomyces sp. oral taxon 178 str. F0338 contig00307, whole genome shotgun sequence Length of sequence - 1534 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 + CDS 25 - 783 937 ## COG2025 Electron transfer flavoprotein, alpha subunit 2 1 Op 2 . 
+ CDS 780 - 1533 620 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes Predicted protein(s) >gi|319976768|gb|AEUH01000307.1| GENE 1 25 - 783 937 252 aa, chain + ## HITS:1 COG:FN1533 KEGG:ns NR:ns ## COG: FN1533 COG2025 # Protein_GI_number: 19704865 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Fusobacterium nucleatum # 2 250 74 323 323 138 33.0 1e-32 MAAADAVVAALEAGSYGLVLLPSDYRGREIAGAVAATTDAGVVSGASSVSYDGGVLEIAK TALAGSWSMRIVVEGQTPVVGVASGAVDEARAASPTTPAVESLEVVLSPRAQAVAVLAST PEDTGGVSLADASTVVVGGRGVDGDFTMVKELADALGGAVGATRVACDEGWAPRGEQIGQ TGLSVSPNLYVGLGVSGAVHHTVGMQSSAHIVAVCDDPDAPIFELADFGVVGDVAEVVPQ ALDEIRKARAAE >gi|319976768|gb|AEUH01000307.1| GENE 2 780 - 1533 620 251 aa, chain + ## HITS:1 COG:MT3109 KEGG:ns NR:ns ## COG: MT3109 COG1104 # Protein_GI_number: 15842588 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 4 251 3 248 393 149 42.0 5e-36 MSHYLDHAATSPLDPAALEAWVAAQRELGAVPGNPAALHSGGRRARRMLDDARERVAHQL GADIHEVLFTSGATESDALGVMAAARGARASDPARTRILVSSVEHDAVANQRAVADREGF AWEELPVAPTGTTPIDEAALAALAPSVAVASMCLVCSETGAVQPVAALARAMRAGGARTH TDAAQAVPVVEVRFDELGVDLMSVAGHKTGAPVGVGCLVARRGIPALTDRPGGGHERGLR SGTPDVAGALA Prediction of potential genes in microbial genomes Time: Thu May 12 19:22:17 2011 Seq name: gi|319976765|gb|AEUH01000308.1| Actinomyces sp. oral taxon 178 str. F0338 contig00308, whole genome shotgun sequence Length of sequence - 1529 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 43 - 84 12.5 1 1 Tu 1 . 
- CDS 98 - 1360 669 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 Predicted protein(s) >gi|319976765|gb|AEUH01000308.1| GENE 1 98 - 1360 669 420 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 2 419 7 416 418 262 35 1e-70 MTDIIDELQWRGLLAQHTDLAALREHLASGPVTFYCGYDPTAPSLHHGHLVQLIVMRHLQ LAGHKPLALVGGATGQIGDPRQSGERQLQPTEVVQGWAQKLRDQISRFLDFEGPAAARMV NNLDWTQQMSAIDLLRTIGKYFRVGTMLNKDIVARRIASDEGISYTEFSYQVLQANDFLE LFRRYGCTLETGGNDQWGNMVGGVDLIHKVDGADAHVMTTPIITKADGTKFGKSEGGAIW LDPQMMSPYAFYQFWLQVSDEDVVRFLKIFTFLDRARIEELEAEVAARPHQRAAQKALAA AVTSMVHGDQELARVQAATKALWGGGDIKELDEASLRAATADLPRASVPLGEASVADALV AVGFERGKSAARRTVASGGVSVNNVRVSDADAPLTGSDALPGSLALLRKGRKNLAVVELA Prediction of potential genes in microbial genomes Time: Thu May 12 19:22:18 2011 Seq name: gi|319976763|gb|AEUH01000309.1| Actinomyces sp. oral taxon 178 str. F0338 contig00309, whole genome shotgun sequence Length of sequence - 1481 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 185 - 1481 1467 ## COG0210 Superfamily I DNA and RNA helicases Predicted protein(s) >gi|319976763|gb|AEUH01000309.1| GENE 1 185 - 1481 1467 432 aa, chain + ## HITS:1 COG:Cgl0755 KEGG:ns NR:ns ## COG: Cgl0755 COG0210 # Protein_GI_number: 19552005 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Corynebacterium glutamicum # 9 402 4 397 678 301 46.0 2e-81 MSLSPNELLDALDPDQRAVATQVAGPLAVLAGAGTGKTRAITYRIAYGAAVGAFDPSNVL AVTFTKRAAYEMRHRLAALGVPRAQARTFHSAALRQLRHFWPTVVGGPMPDVVPHKASLV SSACARLGITVDRTTVRDLAAEVEWAKVSMVGPDRYEAHLRRTGRQAPADLTGAEAARLL DAYEDAKSERGVVDFEDVLVYMCGILQERPDIARLVRGQYRNFVVDEFQDVNLLQSRLLD LWLGGRHDVCVVGDVAQTIYSFTGASPEYLTGFARKHPGARVLELTRDYRSTPAIVEAAN RVLAGAAQREGTVRLVSQRQGGVPVAYRTYDDDAAEAEGIAQRITELMAGGTPAHSIAVL LRTNGQSQVFEEALGARGIPVAMTNSTPFFRREDVRRALSALRTAATAQGGDGAGAGGVG AAVRDALGGVGW Prediction of potential genes in microbial genomes Time: Thu May 12 19:22:20 2011 Seq name: gi|319976759|gb|AEUH01000310.1| Actinomyces sp. oral taxon 178 str. F0338 contig00310, whole genome shotgun sequence Length of sequence - 1470 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 1 - 694 727 ## COG1214 Inactive homolog of metal-dependent proteases, putative molecular chaperone 2 1 Op 2 . - CDS 687 - 1265 679 ## COG0802 Predicted ATPase or kinase 3 1 Op 3 . 
- CDS 1266 - 1469 222 ## Predicted protein(s) >gi|319976759|gb|AEUH01000310.1| GENE 1 1 - 694 727 231 aa, chain - ## HITS:1 COG:Cgl0576 KEGG:ns NR:ns ## COG: Cgl0576 COG1214 # Protein_GI_number: 19551826 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Inactive homolog of metal-dependent proteases, putative molecular chaperone # Organism: Corynebacterium glutamicum # 1 228 1 212 225 88 34.0 9e-18 MLELALDTLNATSVALVRDGEAIARAADGSARRHAESLTPLVRRVLQDAGLGPDAAGAGL DRVLVGTGPAPFTGLRAGLVSARVIGEAVGAPVLGVASLDVVARQGLDLLPPDMTVFAVS DARRRELYWGRYEADGPDDVRLVGRLEVGAARSLLGAMREADGLIVPAGPLPAHSAQPLA DAGQGPVIDLDPAVMSRMVAARLARGQEERLGAQPLYLRRPDIQGRAPARL >gi|319976759|gb|AEUH01000310.1| GENE 2 687 - 1265 679 192 aa, chain - ## HITS:1 COG:Cgl0573 KEGG:ns NR:ns ## COG: Cgl0573 COG0802 # Protein_GI_number: 19551823 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Corynebacterium glutamicum # 12 130 16 142 165 95 51.0 7e-20 MAAETTIATGSAEQTRALGAALGAVLAAGDLVMLSGGLGAGKTTLAQGIGEGMGVLGRVA SPTFIIARVHPSGRGGPDLVHADAYRIRDLEDLETLDLDSSLDEAVTVVEWGEGKTEALS DSRLEVEVRRARGGTAPALDGAIDLAAVDDGRRDIVMRPFGQRWDGVLEAVVAAFRRAAP SGDGDAVGSTRA >gi|319976759|gb|AEUH01000310.1| GENE 3 1266 - 1469 222 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no GAAGTPGSQSGAPPARVGDGAVLFGDPATGAPTADQWAAAAGTINYEVVTRLGDHIPRIH IRAGRSQ Prediction of potential genes in microbial genomes Time: Thu May 12 19:22:25 2011 Seq name: gi|319976756|gb|AEUH01000311.1| Actinomyces sp. oral taxon 178 str. F0338 contig00311, whole genome shotgun sequence Length of sequence - 1443 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 74 - 128 20.8 1 1 Op 1 . - CDS 149 - 868 1192 ## COG1842 Phage shock protein A (IM30), suppresses sigma54-dependent transcription 2 1 Op 2 . 
- CDS 1064 - 1441 262 ## Predicted protein(s) >gi|319976756|gb|AEUH01000311.1| GENE 1 149 - 868 1192 239 aa, chain - ## HITS:1 COG:all2342 KEGG:ns NR:ns ## COG: all2342 COG1842 # Protein_GI_number: 17229834 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Phage shock protein A (IM30), suppresses sigma54-dependent transcription # Organism: Nostoc sp. PCC 7120 # 7 235 3 214 258 80 32.0 2e-15 MAEKQSILGRITQLAKANINALLDRAEDPQKMLDQLIRDYTNSIADAESAVAQTIGNLRL AEKDHAEDVAAAEDWGRKAQAASTKADQLRAAGDTAGADKWDNLAKVAIGKQIQFEGEAK ESAPMIASQTEVVEKLKAGLNQMKEKLSDLKVRRDQLVARQKSAQAQAQVTDAISSINVL DPTSELSRFEDKVRRQEAMAQGKTELAASSLDAQFAELETDASQIEVEARLAALKNKDA >gi|319976756|gb|AEUH01000311.1| GENE 2 1064 - 1441 262 125 aa, chain - ## HITS:0 COG:no KEGG:no NR:no LSKARAAAEPTEAASHAATAAALARQVLAAPVTPSAPAFGAGTPPTFNTGTPQSRGNGSF TGSTLGDFLLWSTIFGSHDHGGWGGHRHDDRDSSWGGGFGGFGGFGGGSSGGDDSPEMGW GGSSF Prediction of potential genes in microbial genomes Time: Thu May 12 19:22:36 2011 Seq name: gi|319976753|gb|AEUH01000312.1| Actinomyces sp. oral taxon 178 str. F0338 contig00312, whole genome shotgun sequence Length of sequence - 1286 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 275 260 ## 2 1 Op 2 . 
+ CDS 272 - 1286 1075 ## PPA0131 putative glycosyl transferase (EC:2.4.1.-) Predicted protein(s) >gi|319976753|gb|AEUH01000312.1| GENE 1 3 - 275 260 90 aa, chain + ## HITS:0 COG:no KEGG:no NR:no AQAAGCRLVATSTWAGEHAISEGTGLLVPVDDPGALVRAMRRAADPTAFAPADRIRERAR ARYGEGAFVRRWRRIYASLADGGRGRGGRG >gi|319976753|gb|AEUH01000312.1| GENE 2 272 - 1286 1075 338 aa, chain + ## HITS:1 COG:no KEGG:PPA0131 NR:ns ## KEGG: PPA0131 # Name: not_defined # Def: putative glycosyl transferase (EC:2.4.1.-) # Organism: P.acnes # Pathway: Fructose and mannose metabolism [PATH:pac00051] # 5 326 14 339 375 231 41.0 2e-59 MSAMVFYSPTVVKGGAGSGSGVRPYRMREAFGRLGYEVIDVSGRHRERHGAMARARERVD AGGVEFVYAEMASTPVALTEPITTRVDPRADFAFLRHCRSRGVPVGAFYRDVYWRFPGYL RQAGAAYGLWARAHYAWELRRLDAAVDVLFLPTMRMGEYVPRVDPAKFRELPPGAPVGAA SEGEAADLFYVGALGDYYDLSECAAAVADTPGATLTMCVPPDQWRAHRAAYEPFLGGGVS VVHGRGEDLDPYYARAGAGVLAMRPVEYRAFAAPVKLFEYIGRSLPVLASQGTYVGDFVA RTGTGWTLPYDREAIAGLLARLVADPGAAAGARTAVRR Prediction of potential genes in microbial genomes Time: Thu May 12 19:22:47 2011 Seq name: gi|319976749|gb|AEUH01000313.1| Actinomyces sp. oral taxon 178 str. F0338 contig00313, whole genome shotgun sequence Length of sequence - 1243 bp Number of predicted genes - 2, with homology - 0 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 595 - 831 113 ## 2 2 Tu 1 . + CDS 948 - 1242 108 ## Predicted protein(s) >gi|319976749|gb|AEUH01000313.1| GENE 1 595 - 831 113 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTALIMSQTAATHEAILDPQSSHHESPNPSSPAPSMAFIALTAAVSRSPIAFRTTCRWAR TRPRAAEASASAVVAARS >gi|319976749|gb|AEUH01000313.1| GENE 2 948 - 1242 108 98 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVSGEESWIGALLSVAFAALGRLGLAGMRGTMAGVRSSFANLCALGGKGHEGLVGGIKA LGGLRGMAAGYGHNLWMSIKNGWKYLKNFFSKRSPVRF Prediction of potential genes in microbial genomes Time: Thu May 12 19:22:59 2011 Seq name: gi|319976746|gb|AEUH01000314.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00314, whole genome shotgun sequence Length of sequence - 1235 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 23/0.000 + CDS 37 - 354 258 ## COG2963 Transposase and inactivated derivatives 2 1 Op 2 . + CDS 504 - 1233 533 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|319976746|gb|AEUH01000314.1| GENE 1 37 - 354 258 105 aa, chain + ## HITS:1 COG:MT0414 KEGG:ns NR:ns ## COG: MT0414 COG2963 # Protein_GI_number: 15839787 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mycobacterium tuberculosis CDC1551 # 3 105 5 108 108 84 52.0 7e-17 MPAPRKYPDELKNRAIRLTLDALADPDRSRGCFRRVGDELGVNPETLRGWVRQARIDAGD RPGTTTEGARRLRELEKEVRELRRANAILRSASAFFAAELDRPSR >gi|319976746|gb|AEUH01000314.1| GENE 2 504 - 1233 533 243 aa, chain + ## HITS:1 COG:Rv0796 KEGG:ns NR:ns ## COG: Rv0796 COG2801 # Protein_GI_number: 15607936 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mycobacterium tuberculosis H37Rv # 18 240 76 295 312 202 49.0 6e-52 MDERILALRDEPFNATLGSRKTWRLLNAQDGRPPVARCTVERRMRALGLSGAAPGRKPRA TGPAEGDRAPGDLLRRDFTATGPNRRWVVDFTHVPTRSGFCYTAFVMDLYARRIVGWATS ARMDTDNAGSALEHAIWTRKERGGADLEGLVHHSDHGSQYLSIAYTGRLVDEGIEASAGA VGSSYDNAAAEALNKSYKRELVWRDGPWKGRADLETATARWVDWYNRTRPHLTNDDDLPP TTV Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:00 2011 Seq name: gi|319976745|gb|AEUH01000315.1| Actinomyces sp. oral taxon 178 str. F0338 contig00315, whole genome shotgun sequence Length of sequence - 1219 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1218 742 ## COG1196 Chromosome segregation ATPases Predicted protein(s) >gi|319976745|gb|AEUH01000315.1| GENE 1 3 - 1218 742 405 aa, chain - ## HITS:1 COG:ML1629 KEGG:ns NR:ns ## COG: ML1629 COG1196 # Protein_GI_number: 15827858 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Mycobacterium leprae # 2 402 473 829 1203 66 29.0 9e-11 REREARAERARWESRRDTLAQSLQPADETADLLGGDGVLGLAAESVRAEPGFEDAVAALL SPYTDAVVVADAEAALEQLERAKRRGGGMIRMVVAGAGVDPGAGGTAGPTAGAEPGAGSA GTDPGSGAAEDSEAGADAGAVGAHADADAARAAGMGELPEGVRPAWGVVSLGERACALAA LLEGAVVAADPREALAALRVRGVRVAVTRGGDVLGRGTVTGSGARASSTLALRASHEEAV RRAEEAAASCEGVARELEAANSALDAAVREANTALRELRAQDAQRAKAEGEYARAASAAK AAEAEARRAQDSARRADDQVEWAFRALEEARERAAAAESVEEPESVEDAQADAAEAVRRA KEAREAETRARLELRALEERSRQVQARARSLRQAAAREREARAAR Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:00 2011 Seq name: gi|319976742|gb|AEUH01000316.1| Actinomyces sp. oral taxon 178 str. F0338 contig00316, whole genome shotgun sequence Length of sequence - 1180 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 52 - 963 1512 ## COG1082 Sugar phosphate isomerases/epimerases - Prom 1025 - 1084 3.4 Predicted protein(s) >gi|319976742|gb|AEUH01000316.1| GENE 1 52 - 963 1512 303 aa, chain - ## HITS:1 COG:BH2317 KEGG:ns NR:ns ## COG: BH2317 COG1082 # Protein_GI_number: 15614880 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Bacillus halodurans # 12 301 6 294 303 326 54.0 3e-89 MTTTPIHDPHGITWGMHPIAWRNDDIPEVGEWNTIDVMFDDLAATGYAGTEVAGWYPPKE EVKEKADAHGLAIVAQWFSSFIVRDGIDAVVPPFRDNCEYLQYLGATRIVVSEQTGSVQG ARDICIFDNKPVLTDEQWPVLAEGLNRLGAIAHEYGLELVYHHHLGTVVQTKDETLKMLE LTDPAVVSLLFDTGHAYVGDGDVMGLLRGAIDRIKHVHFKDVRPDKMAESRQAKRSFLDS FLAGMFTVPGDGAIDFTEPYDFLVSHGYDQWILVEAEQDPKIAPPLEYAKTARAYIESTL LPR Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:01 2011 Seq name: gi|319976740|gb|AEUH01000317.1| Actinomyces sp. oral taxon 178 str. F0338 contig00317, whole genome shotgun sequence Length of sequence - 1129 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 72 - 1079 1346 ## COG0464 ATPases of the AAA+ class Predicted protein(s) >gi|319976740|gb|AEUH01000317.1| GENE 1 72 - 1079 1346 335 aa, chain - ## HITS:1 COG:DR0647 KEGG:ns NR:ns ## COG: DR0647 COG0464 # Protein_GI_number: 15805674 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Deinococcus radiodurans # 20 335 138 460 460 281 47.0 9e-76 MDGRRPGGFDWDSAEEQVGGPAPAFVQGREAELNEGADPGGEDMWDVEASTLTLADVGGM QAVKDRLNMAFLAPLRNPEMRRLYGKSLKGGLMLYGPPGCGKTYIARALAGEMGASFVSI TLTDILDQFIGNSEANLHSLFETARAHAPVVLFLDEIDAIGQKRSQASSSGWRGVTNQLL TEMDGIDSGNEGVFILAATNVPWDVDPALRRPGRFDRSVAVLPPDEPARQSILRHHLGSR PVEGIDLAHLARQTNGFTGADLAHLADSAVEYAMMDSMRTGTVRMVTMKDFKRALRQVRP SAGPWFSTARNIVAYGNRDGQYDDLAAYMRANKLL Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:02 2011 Seq name: gi|319976738|gb|AEUH01000318.1| Actinomyces sp. oral taxon 178 str. F0338 contig00318, whole genome shotgun sequence Length of sequence - 1069 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 5 - 1069 1076 ## COG4962 Flp pilus assembly protein, ATPase CpaF Predicted protein(s) >gi|319976738|gb|AEUH01000318.1| GENE 1 5 - 1069 1076 354 aa, chain - ## HITS:1 COG:Cgl0301 KEGG:ns NR:ns ## COG: Cgl0301 COG4962 # Protein_GI_number: 19551551 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Corynebacterium glutamicum # 5 290 40 329 377 242 50.0 1e-63 AGAGSDEIARCLAGLRATSGGLGPALAPLVADPRVTDVLVNGTQVWVDRGEGLVRAATDV GGADDVRRLAIRMAAACGKRLDDASPIVDGTLEGGVRLHAVLAPVSASGTLISLRAARGR NLSVDALARCGTLSPRVASLLRALVRVRANVLISGQTGSGKTTLLAAVLALVPPDERIVC IEETTELRPDHPHCVNLAERRPNVEGAGGVTLSELVRAAMRMRPDRLVLGECRGGEVRDV LTALNTGHDGGWATVHANGVRDVPARLLALGSLAGMGESAVAAQTVAAFDAFVHLRRRSG AAPGSPGRWVSEVGVPVRSGSGLRADLALAVDQGGGAEEGPAWPLLAQRCRVPS Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:03 2011 Seq name: gi|319976735|gb|AEUH01000319.1| Actinomyces sp. oral taxon 178 str. F0338 contig00319, whole genome shotgun sequence Length of sequence - 1015 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 283 389 ## COG4618 ABC-type protease/lipase transport system, ATPase and permease components 2 1 Op 2 . 
+ CDS 280 - 1015 1009 ## COG1132 ABC-type multidrug transport system, ATPase and permease components Predicted protein(s) >gi|319976735|gb|AEUH01000319.1| GENE 1 2 - 283 389 93 aa, chain + ## HITS:1 COG:SMb21316 KEGG:ns NR:ns ## COG: SMb21316 COG4618 # Protein_GI_number: 16264640 # Func_class: R General function prediction only # Function: ABC-type protease/lipase transport system, ATPase and permease components # Organism: Sinorhizobium meliloti # 1 84 482 565 589 75 51.0 2e-14 GDGGAGLSVGQRQRLALTRALAGDARLVVLDEPTAHLDAVSEEVVVRAITALRDEGRTVV VIAHRAAVTAVADHVIEVRSQPISATADQEAPA >gi|319976735|gb|AEUH01000319.1| GENE 2 280 - 1015 1009 245 aa, chain + ## HITS:1 COG:Cgl1122 KEGG:ns NR:ns ## COG: Cgl1122 COG1132 # Protein_GI_number: 19552372 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Corynebacterium glutamicum # 30 224 15 208 518 111 38.0 1e-24 MSPFLTAPERRALARCLRMLEVPRGRLALSLLLGSAALASSIALGATAAWLIARASQQPP VLYLTVAATSVRLFGVSRAVLRYLQRLASHRVALGGMDALRRNLYDALAASRSDHLASLR RGDLMARTGADVDEVGNLLVRTVLPVGVSAIVGVGTAVAIALVSPAAGLVLAVCLLVSGV AAPALTARSVRVAEEDSASARIDLSASALTLMDGATELRVNGRVAPVRQALEDAEDRLAA AAARS Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:04 2011 Seq name: gi|319976733|gb|AEUH01000320.1| Actinomyces sp. oral taxon 178 str. F0338 contig00320, whole genome shotgun sequence Length of sequence - 1002 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 99 - 800 975 ## COG2860 Predicted membrane protein Predicted protein(s) >gi|319976733|gb|AEUH01000320.1| GENE 1 99 - 800 975 233 aa, chain + ## HITS:1 COG:Cgl0134 KEGG:ns NR:ns ## COG: Cgl0134 COG2860 # Protein_GI_number: 19551384 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Corynebacterium glutamicum # 4 233 5 233 239 172 41.0 6e-43 MHVLDALNASLPEVFRAIDLMGVLLNGILGGKVARERNFDAVGFAILAIMTALAGGMIRD VLLGAEAGPPVALTDPYYLGVALVGAAVAMMWKMDSRPWRFLLVVADGMVLGCWAATGAI KTLDNGFGVTPAILLGIITAVGGGMVRDISAGLVPRVFGGNNLYATPAFASAGATVVFWH LGQPTIGMGCSILIGLAFTGLAHWRHWQLPQTGEWTLTLTYSQLKAMTRRRVR Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:05 2011 Seq name: gi|319976731|gb|AEUH01000321.1| Actinomyces sp. oral taxon 178 str. F0338 contig00321, whole genome shotgun sequence Length of sequence - 920 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 919 911 ## COG0210 Superfamily I DNA and RNA helicases Predicted protein(s) >gi|319976731|gb|AEUH01000321.1| GENE 1 1 - 919 911 306 aa, chain - ## HITS:1 COG:Cgl0751_1 KEGG:ns NR:ns ## COG: Cgl0751_1 COG0210 # Protein_GI_number: 19552001 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Corynebacterium glutamicum # 71 263 484 671 756 73 29.0 5e-13 AGAGEPAPPRPCGPPDTADLVADPALWGPPLREEADRKGRRAAAAAELAGALGRAGDVWE AAGTGAERRPQEALWAMWSAAGVADQWRSWAMADDAESGWYDDQLDAVVALMRVADVWEQ RNPGGAAGRFAEELLGGSVPIDTISRVGQRPEGVEVLTPAQAVGRHWEVVAVVGLQDGAW PNMRLRDRILRADLLADVGAGRTTTDPEGNEALIDSTRAARKSVLDEEYRLLVAALSRAT RFIHAGAVRNEHQAPSAFFDLVATHAGTPRTGGVVPLDEVPAPLSLSGHIAALRQDAARA DGSERA Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:05 2011 Seq name: gi|319976729|gb|AEUH01000322.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00322, whole genome shotgun sequence Length of sequence - 880 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 880 496 ## COG0457 FOG: TPR repeat Predicted protein(s) >gi|319976729|gb|AEUH01000322.1| GENE 1 1 - 880 496 293 aa, chain - ## HITS:1 COG:all2787 KEGG:ns NR:ns ## COG: all2787 COG0457 # Protein_GI_number: 17230279 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 6 293 502 789 924 251 51.0 9e-67 ALASRNNLAGAYWSAGRVGEAIPLYEEVLADRVRVLGPDHPDTLASRNNLAVAYESVGRL GEAIPLFEEVAADQVRVLGPDHPDTLFSRYNLAGAYRSAGRVGEAIPLYEEVLADRVRVL GPDHPDTLASRNNLARTYESAGRLGEAIPLYEQTLADSLRVLGADHPGALTSRNNLACAY QAVGRVDEAVALHEQILADRLRVLGPDHPDTLSSRNNLAGAYESAGRLGEAIPLYEQVLA DRVRVLGADHPQTLTSRNNLAGAYQAAGRLGEAIPLYEQVLADRVRVLGDDHP Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:06 2011 Seq name: gi|319976727|gb|AEUH01000323.1| Actinomyces sp. oral taxon 178 str. F0338 contig00323, whole genome shotgun sequence Length of sequence - 877 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 828 1037 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase Predicted protein(s) >gi|319976727|gb|AEUH01000323.1| GENE 1 1 - 828 1037 275 aa, chain + ## HITS:1 COG:STM3109 KEGG:ns NR:ns ## COG: STM3109 COG0220 # Protein_GI_number: 16766410 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Salmonella typhimurium LT2 # 24 272 12 238 239 139 36.0 4e-33 VPGTAARSLPPAPGASPRRRPAQEGGVFMARTKSFTRRSRELPPNLRRTWEAVAPRYVIE PRRGVGRTTVAEDFALDPAEVFGRCAPLTIEVGSGTGEQLVAAAAAHPDRDYLALEVWVP GIAKLLSKAAGAGVENIRVLEADAAQALPHLLDDATAREVWTFFPDPWRKARHRKRRLVS DSFALEVARLLEDGGVWRMATDWDDYAWQMRDVVEACPLLDNPHAGERPDPADPRPDHGG FAPRYEGRIVTHFETRGLDAGRRAHDIVGARLPRA Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:07 2011 Seq name: gi|319976725|gb|AEUH01000324.1| Actinomyces sp. oral taxon 178 str. F0338 contig00324, whole genome shotgun sequence Length of sequence - 848 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 846 341 ## Predicted protein(s) >gi|319976725|gb|AEUH01000324.1| GENE 1 3 - 846 341 281 aa, chain + ## HITS:0 COG:no KEGG:no NR:no GEGRAPSGPSEPAPSPEPAKEPGSAHSSEPGAPTAPPDSGDEGPAPAPAPGAPPDSAPAP EAEPVPSPEPEPDAGAPGAADAAFPPPAPADTRSTQDTIISDRPLVPRSSAQPGKGRGAR KGAGAAQAPGAVKGAGSAEPEDAPPAPDSSADVEPSQPPRTATEPPAPPAEPDSSEPPAE PPVAPGAAKGAGAAEPEDAPPAPDSSADVEPSQPPRTATEPPAPPAERDSTEPATEPPSA PGGADEAPAPAAQPPAPRGADEAPSPHGRARHGIPLPEEPA Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:44 2011 Seq name: gi|319976723|gb|AEUH01000325.1| Actinomyces sp. oral taxon 178 str. 
F0338 contig00325, whole genome shotgun sequence Length of sequence - 845 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 106 - 844 650 ## COG1122 ABC-type cobalt transport system, ATPase component Predicted protein(s) >gi|319976723|gb|AEUH01000325.1| GENE 1 106 - 844 650 246 aa, chain + ## HITS:1 COG:BS_ykoD KEGG:ns NR:ns ## COG: BS_ykoD COG1122 # Protein_GI_number: 16078387 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, ATPase component # Organism: Bacillus subtilis # 47 206 313 481 490 119 42.0 5e-27 MEDMTSPGGVAYWRKTDGQWERTGALDAQDPVLALSGMRVPGRCPRVSARVGAGELVGVI GVNGAGKSSLLSALAGLGGFEADEALIGGRPLRRGRHIAGYVFQNPEHQFVSSTVSKELA VGGAPPARVEELLEQFHLAGHRGAHPLTLSGGQARRLSVATMAGAPHALVVLDEPTYGQD WANTQELMSFIDALRERGRCVLMATHDLELARRHCTAIIALPDPEQGADEVPAVPDRGAG AGPSGQ Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:45 2011 Seq name: gi|319976721|gb|AEUH01000326.1| Actinomyces sp. oral taxon 178 str. F0338 contig00326, whole genome shotgun sequence Length of sequence - 763 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 763 253 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 Predicted protein(s) >gi|319976721|gb|AEUH01000326.1| GENE 1 2 - 763 253 254 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 5 234 277 504 563 102 31 1e-22 GPAAPVLSARGLAIGYEAGAPVREGIDLDIARGASTCLVGPNGVGKSTLALTLAGLLPAL GGTIGVTPSHGAPDRKGDDPHKWSSRDMLGRISMVFQEPEYQFVARTVRGELEVGPRAAG ASGPGLDALVDEHLDALGLSSLAGANPMTLSGGEKRRLSVATALISAPDLLILDEPTFGQ DRGTWLGLVRLLRGARERGTTLVSITHDPAFVAAMGDAVIDLSDLGRPPAQNPEGGAGDA GPVRGGGADEGERG Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:45 2011 Seq name: gi|319976718|gb|AEUH01000327.1| Actinomyces sp. oral taxon 178 str. F0338 contig00327, whole genome shotgun sequence Length of sequence - 759 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 303 263 ## 2 1 Op 2 . + CDS 300 - 759 298 ## COG3412 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|319976718|gb|AEUH01000327.1| GENE 1 1 - 303 263 100 aa, chain + ## HITS:0 COG:no KEGG:no NR:no AEAGAGPAAVLEAAAAAARGGAEATEPMRATKGRASYLGERSIGHLDPGAVSSALILAEA AAAARAAEDNGDAEPAVEPLDDVEPDAEGHRDAEPEGEQP >gi|319976718|gb|AEUH01000327.1| GENE 2 300 - 759 298 153 aa, chain + ## HITS:1 COG:lin2845 KEGG:ns NR:ns ## COG: lin2845 COG3412 # Protein_GI_number: 16801905 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 5 122 6 115 124 67 37.0 7e-12 MSRVGIVIASHSDLLARGVAELAGQMAPGVAIGAAGGLEDGGLGTSYDRIEAALEAVLAA VDGPGSGAVVLTDLGSATMTAESVVEMSEAPERIRLVDTALVEGAVAAAVRAEVGDSLDD VARAAASVRFGAQDQGAPDHAGDGGAGGCGCAG Prediction of potential genes in microbial genomes Time: Thu May 12 19:23:52 2011 Seq name: gi|319976717|gb|AEUH01000328.1| Actinomyces sp. oral taxon 178 str. F0338 contig00328, whole genome shotgun sequence Length of sequence - 741 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 739 710 ## Predicted protein(s) >gi|319976717|gb|AEUH01000328.1| GENE 1 1 - 739 710 246 aa, chain + ## HITS:0 COG:no KEGG:no NR:no SAPAHEGPGGPAASALAALDELATPPTTTAPRRGGPLARLLRRARPTRPAHPWQHWFYTS GAWIGPHDWKNHRGPLTHDQRRLAHIAAALALTLAASATASALLWHWTNPPLPPPGPATT TGGYPTGPDSNPTPYPAPAPTPPAPPAGNDHNTPNGAQATAYHALALADYTWNTGDTTPL HDLSAPECQWCAQTTTDTTTTYTTGGWAANAWHTNTNPTTTTTTPNTTTTNTYTTTLTTT QRTPDI Prediction of potential genes in microbial genomes Time: Thu May 12 19:24:09 2011 Seq name: gi|319976714|gb|AEUH01000329.1| Actinomyces sp. oral taxon 178 str. F0338 contig00329, whole genome shotgun sequence Length of sequence - 724 bp Number of predicted genes - 2, with homology - 0 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 148 - 188 5.7 1 1 Tu 1 . - CDS 273 - 353 68 ## 2 2 Tu 1 . - CDS 589 - 723 90 ## Predicted protein(s) >gi|319976714|gb|AEUH01000329.1| GENE 1 273 - 353 68 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDTEDVDVLIGTPYNPQPTTKPPTAA >gi|319976714|gb|AEUH01000329.1| GENE 2 589 - 723 90 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no FQMNQITPSPSPGNTMKSPSQPDNPHTDHTRHPTPVTLLVASRW Prediction of potential genes in microbial genomes Time: Thu May 12 19:24:17 2011 Seq name: gi|319976712|gb|AEUH01000330.1| Actinomyces sp. oral taxon 178 str. F0338 contig00330, whole genome shotgun sequence Length of sequence - 719 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 23 - 719 638 ## gi|154509489|ref|ZP_02045131.1| hypothetical protein ACTODO_02021
Predicted protein(s)
>gi|319976712|gb|AEUH01000330.1| GENE 1 23 - 719 638 232 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509489|ref|ZP_02045131.1|
## NR: gi|154509489|ref|ZP_02045131.1| hypothetical protein ACTODO_02021 [Actinomyces odontolyticus ATCC 17982] # 3 230 429 656 1024 132 46.0 1e-29
MAALKADEASAAERLRACEEAARVLPSQIERAQAGLEAMRADAAAVPAARAELERLDERL
EASRRADVLRASLTGLSEALREAVRGAKVADAAARDAHDLWLSATAGALAAGLADGSPCP
VCGSESHPSPAPLSEDGITREQVRGLDEARQRADGELADAKSAHSDAVREISRLNAIAGD
HTGAIEELRGAAASRLRALEGAARRIPGVEEAIGQERARLGELEGRRADAAA

Prediction of potential genes in microbial genomes
Time: Thu May 12 19:24:29 2011
Seq name: gi|319976710|gb|AEUH01000331.1| Actinomyces sp. oral taxon 178 str. F0338 contig00331, whole genome shotgun sequence
Length of sequence - 713 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . + CDS 2 - 713 862 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit
Predicted protein(s)
>gi|319976710|gb|AEUH01000331.1| GENE 1 2 - 713 862 237 aa, chain + ## HITS:1 COG:Cgl1357_2 KEGG:ns NR:ns
## COG: Cgl1357_2 COG0072 # Protein_GI_number: 19552607 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Corynebacterium glutamicum # 2 237 237 472 673 169 42.0 4e-42
ELLARYGGGAIDSAVTDVDEAPAPAPIAFPVGEAERLTGVAHTVERVVELLETVGCSVDG
PADGVMTVVPPSWRSDLVDAAGLVEEIARLDGYDNIPVVMPPAPAGRGLTPAQRARRSAA
ATLAEAGMVEVKSYPFVSDSFDRQGMEADDPRRAALRLRNPMADDAPLLRTSLLDTLLDV
AGRNVARSNEDVAVYELGMVARPEGTVPAPLPSAERRPDEATIAALHAGTPAQPWHV

Prediction of potential genes in microbial genomes
Time: Thu May 12 19:24:29 2011
Seq name: gi|319976708|gb|AEUH01000332.1| Actinomyces sp. oral taxon 178 str. F0338 contig00332, whole genome shotgun sequence
Length of sequence - 696 bp
Number of predicted genes - 1, with homology - 0
Number of transcription units - 1, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 75 - 587 434 ##
Predicted protein(s)
>gi|319976708|gb|AEUH01000332.1| GENE 1 75 - 587 434 170 aa, chain - ## HITS:0 COG:no KEGG:no NR:no
MDASSGAVGPSAGGSGPSPFDPSGFDSSGFDTAIARVEEEAARARELGERAGRFVDAAGQ
VRGVGADERGRVRVVVDCTGMVVEARVAGSRELGRALVRAYAAARADAGRALADAAGEAY
GPQSAAARGLAGRYGSGRGRQAHEWRQDLAERGGIGPDGAVRPAADWMRG

Prediction of potential genes in microbial genomes
Time: Thu May 12 19:24:39 2011
Seq name: gi|319976706|gb|AEUH01000333.1| Actinomyces sp. oral taxon 178 str. F0338 contig00333, whole genome shotgun sequence
Length of sequence - 645 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 19 - 645 621 ## gi|293190866|ref|ZP_06609028.1| conserved hypothetical protein
Predicted protein(s)
>gi|319976706|gb|AEUH01000333.1| GENE 1 19 - 645 621 208 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293190866|ref|ZP_06609028.1|
## NR: gi|293190866|ref|ZP_06609028.1| conserved hypothetical protein [Actinomyces odontolyticus F0309] # 44 208 21 198 198 117 43.0 5e-25
AARPGDAGAASSHCQYVPGTAPTTPPADEGEPADEGGGGEGEAPPSTETIVRTALARVPV
SGAGLSWQPRKKSYTNVGVPTIVYAASPTQSHATSLFGREVSITLTASQYSYDFGDGTPP
LVTARAGEPWRRGNKEARLTHHYEEVTRGGERRVITLTTTWDATTTNPFTGETLTLPSII
TTTEQSTPFPVSHLRIDLTDTADEQDGH

Prediction of potential genes in microbial genomes
Time: Thu May 12 19:24:50 2011
Seq name: gi|319976704|gb|AEUH01000334.1| Actinomyces sp. oral taxon 178 str. F0338 contig00334, whole genome shotgun sequence
Length of sequence - 591 bp
Number of predicted genes - 2, with homology - 1
Number of transcription units - 2, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 34 - 129 76 ##
2 2 Tu 1 . + CDS 92 - 526 537 ## gi|154509146|ref|ZP_02044788.1| hypothetical protein ACTODO_01667
Predicted protein(s)
>gi|319976704|gb|AEUH01000334.1| GENE 1 34 - 129 76 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no
MAVRTSADAEWTIPFPFGRGGIPVGAAAVPY
>gi|319976704|gb|AEUH01000334.1| GENE 2 92 - 526 537 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154509146|ref|ZP_02044788.1|
## NR: gi|154509146|ref|ZP_02044788.1| hypothetical protein ACTODO_01667 [Actinomyces odontolyticus ATCC 17982] # 4 135 11 141 146 88 46.0 2e-16
MVHSASALVLTAIGCAIGGFWLWMWSGAGALARRAGVLRLSSAQDSPACSVQRVVWPQLP
LLAALWPATAALASREAAGWDASVQCAVVFALLGAMALVAVVCLYFGALPEWAYPGWMAR
RYYRVHPERAVAELGPAQAASLAA

Prediction of potential genes in microbial genomes
Time: Thu May 12 19:25:01 2011
Seq name: gi|319976702|gb|AEUH01000335.1| Actinomyces sp. oral taxon 178 str. F0338 contig00335, whole genome shotgun sequence
Length of sequence - 584 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . + CDS 3 - 582 714 ## COG1200 RecG-like helicase
Predicted protein(s)
>gi|319976702|gb|AEUH01000335.1| GENE 1 3 - 582 714 193 aa, chain + ## HITS:1 COG:MT3051 KEGG:ns NR:ns
## COG: MT3051 COG1200 # Protein_GI_number: 15842526 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Mycobacterium tuberculosis CDC1551 # 18 189 25 216 737 66 29.0 3e-11
PLALRLPARTARALAGAGVRTAGDLLGITPRRYYHWGALTPLHSLREGEDATILAQVASA
RIIANRSGAGVRMEVELTDGARSITATFFAKNQYKLEPHARLLTPGASYLFAGRVGAYRG
RLQLAHPSFEGVDGEDAERAAQRPIPIYPATGGLTSWAVSRAIGVVLDGIDDADVPDPLP
DSVRAAHRLPTRA

Prediction of potential genes in microbial genomes
Time: Thu May 12 19:25:02 2011
Seq name: gi|319976700|gb|AEUH01000336.1| Actinomyces sp. oral taxon 178 str. F0338 contig00336, whole genome shotgun sequence
Length of sequence - 582 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 3 - 582 572 ## COG0587 DNA polymerase III, alpha subunit
Predicted protein(s)
>gi|319976700|gb|AEUH01000336.1| GENE 1 3 - 582 572 193 aa, chain - ## HITS:1 COG:MT3480 KEGG:ns NR:ns
## COG: MT3480 COG0587 # Protein_GI_number: 15842966 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Mycobacterium tuberculosis CDC1551 # 1 193 325 517 1098 239 65.0 2e-63
GADIAREAAFDLRLVAPRLPRTRVPGGHTPDSWLARLAHEGARERYGDRGESPGAWRTID
HELEVIASLGFAGYFLIVKEIVDFCSARGILCQGRGSAANSAVCYCLGITAVDAVRHQLL
FERFLSSARSGPPDIDIDIESGRREEVIQHVYDAYGRHRAAQVANVITYRPRSAIRDAAR
ALGHSQGQAAAWS

Prediction of potential genes in microbial genomes
Time: Thu May 12 19:25:03 2011
Seq name: gi|319976698|gb|AEUH01000337.1| Actinomyces sp. oral taxon 178 str. F0338 contig00337, whole genome shotgun sequence
Length of sequence - 582 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 2 - 581 434 ## gi|225022740|ref|ZP_03711932.1| hypothetical protein CORMATOL_02785
Predicted protein(s)
>gi|319976698|gb|AEUH01000337.1| GENE 1 2 - 581 434 193 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225022740|ref|ZP_03711932.1|
## NR: gi|225022740|ref|ZP_03711932.1| hypothetical protein CORMATOL_02785 [Corynebacterium matruchotii ATCC 33806] # 43 192 169 328 533 64 32.0 2e-09
LSGAQRATADAVLGHLGEILQAGASRMGSLHRSMVAKDLHALRLHGFVTADRALTAFLSS
LGAAPAPRAAAFASAALNLHLLTRPGAPADPGLLGRARQGYREVGGLSLTPLYAEPVLTA
SGFAGAQAVFADSSGATWSVARVRPGDASSIPTAYAAEPVWRELSAPIRQLSRHRLLVAR
ASARDDGRLSAGA

Prediction of potential genes in microbial genomes
Time: Thu May 12 19:25:12 2011
Seq name: gi|319976696|gb|AEUH01000338.1| Actinomyces sp. oral taxon 178 str. F0338 contig00338, whole genome shotgun sequence
Length of sequence - 577 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 119 - 577 493 ## COG1555 DNA uptake protein and related DNA-binding proteins
Predicted protein(s)
>gi|319976696|gb|AEUH01000338.1| GENE 1 119 - 577 493 152 aa, chain - ## HITS:1 COG:BH1333 KEGG:ns NR:ns
## COG: BH1333 COG1555 # Protein_GI_number: 15613896 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Bacillus halodurans # 9 146 63 203 210 94 45.0 6e-20
GARSGRGSQAAGTTRAVVYVTGRVASPGVLTMPAGSRVGEAIEAAGGPVEGADLESLNLA
RVIADGEHIVVPAQGAAPAPAGAGAPEGAKCVNLNAASEQELQELDGVGPAMASRIAQYR
AAHGTITSVDELDDVPGIGPALLEKIRSGACP

Prediction of potential genes in microbial genomes
Time: Thu May 12 19:25:13 2011
Seq name: gi|319976695|gb|AEUH01000339.1| Actinomyces sp. oral taxon 178 str. F0338 contig00339, whole genome shotgun sequence
Length of sequence - 548 bp
Number of predicted genes - 1, with homology - 0
Number of transcription units - 1, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
- Term 273 - 317 -0.9
1 1 Tu 1 . - CDS 452 - 547 115 ##
Predicted protein(s)
>gi|319976695|gb|AEUH01000339.1| GENE 1 452 - 547 115 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no
GAARRAPRALLPAPLVAVRAVIEGTAVPERQ

Prediction of potential genes in microbial genomes
Time: Thu May 12 19:25:17 2011
Seq name: gi|319976693|gb|AEUH01000340.1| Actinomyces sp. oral taxon 178 str. F0338 contig00340, whole genome shotgun sequence
Length of sequence - 537 bp
Number of predicted genes - 1, with homology - 0
Number of transcription units - 1, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . + CDS 41 - 536 166 ##
Predicted protein(s)
>gi|319976693|gb|AEUH01000340.1| GENE 1 41 - 536 166 165 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
MRVGECVFELRDYPSIDDGAFVRWLAARRGRAGFAALYGAEVDGAYLDSDRYGGLGERVR
AGFAALAERIDGIVVVLDEGAGEYTDPVSGDVYTFAQAGPPAEGGRAALALSWFGIVGTR
EADPLLGWIDVCRQRWPALSPSVYRDEVRRPLTDGLIDALSRPGN

Prediction of potential genes in microbial genomes
Time: Thu May 12 19:25:27 2011
Seq name: gi|319976692|gb|AEUH01000341.1| Actinomyces sp. oral taxon 178 str. F0338 contig00341, whole genome shotgun sequence
Length of sequence - 519 bp
Number of predicted genes - 2, with homology - 1
Number of transcription units - 2, operones - 0 average op.length - 0.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 1 - 418 487 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member
- Prom 438 - 497 2.2
+ Prom 159 - 218 6.4
2 2 Tu 1 . + CDS 357 - 473 215 ##
Predicted protein(s)
>gi|319976692|gb|AEUH01000341.1| GENE 1 1 - 418 487 139 aa, chain - ## HITS:1 COG:YHR031c KEGG:ns NR:ns
## COG: YHR031c COG0507 # Protein_GI_number: 6321820 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Saccharomyces cerevisiae # 1 139 407 545 723 123 48.0 9e-29
MVLKQVFRQKGDNEFIDMLNNVRVGNLNYETIEAFQKLDRQINYTDGIEPTQLYPTLKEV
LMANQAKLNSLPGKVYTFQAKDPENPFLVSMLDNNLMVDKVLHLKTGAQVMCVKNFTDEL
VNGTLGTVLFMATRKLYMK
>gi|319976692|gb|AEUH01000341.1| GENE 2 357 - 473 215 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
MFNISINSLSPFCLKTCFNTTVFSITLFQLLAKKQNFG