Definition: Halothermothrix orenii H 168 chromosome, complete genome.
Accession: NC_011899
Length: 2,578,146

The map label for this gene is msmR [H]

Identifier: 220933161

GI number: 220933161

Start: 2548452

End: 2549282

Strand: Direct

Name: msmR [H]

Synonym: Hore_23290

Alternate gene names: 220933161

Gene position: 2548452-2549282 (Clockwise)

Preceding gene: 220933150

Following gene: 220933162

Centisome position: 98.85
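
The centisome value is consistent with the gene start expressed as a percentage of the chromosome length. The record does not state the formula, so the following is only a minimal sketch under that assumed definition; `centisome` is a hypothetical helper name.

```python
def centisome(start: int, genome_length: int) -> float:
    """Gene start as a percentage of the genome length (assumed definition)."""
    return 100.0 * start / genome_length

# Values taken from this record.
START = 2548452
GENOME_LENGTH = 2578146

print(f"{centisome(START, GENOME_LENGTH):.2f}")  # 98.85
```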

GC content: 35.74
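
GC content is presumably the percentage of G and C bases in the 831-base gene sequence; the record does not say how it was computed. A minimal sketch under that assumption follows, with `gc_content` as a hypothetical helper name. The count of 297 G/C bases is inferred from the reported percentage, not counted from the sequence.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

# First 12 bases of the gene sequence in this record: 6 of 12 are G or C.
print(gc_content("GTGGCTTACAAC"))  # 50.0

# 297 G/C bases out of 831 would reproduce the reported figure.
print(f"{100.0 * 297 / 831:.2f}")  # 35.74
```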

Gene sequence:

>831_bases
GTGGCTTACAACCATAAAGATATTGATGTTAACAGGCTTAATCTCACCGATATTGATATTTACCAATGTGGTCAAGAAAA
ATGCAAACCGGGTCATTCTTATGGGCCTGCCGTCAGGGATCATTATTTAATTCACTTCATATATAATGGAAGGGGTATTT
TCCAGGTAGGAGAGAATACTTACCATTTAGAGGCAGGACAGGGCTTTTTAATCTGCCCTGATATAGTAACATACTATCAG
GCTGACAGGCACAACCCCTGGGAATATGCCTGGATTGGTTTTCATGGGCTTAAAGCTAAAGACTATTTAAACCGGGCCAA
CCTCAGCCTGGCTAATCCGGTATTCTCTGATACTGATGGTAGTCCCCTCAGGTTTATATTTGAGGAAATGACCGCAGCCC
GTAAATTAAAGAGATCACGGGAAATTAAACTGATCGGGCTTATTTATGTTTTTTTATCACACCTGATTGAACTAAATGTA
TCTGGCTCTACTCCCGATAATAATTCAAAGGAAAATTACATAAAAAAAGCTATAGAATATATTGAGAAAAATTATTCCCG
ACATATTAAGGTAATTGATATAGCAAATCATGTTGGTCTCGACAGGAGTTATCTATGGTCTATTTTTAATGAGTTTTTAA
ATACTTCTCCCCAGCAATACCTTATTAATTACAGAATTAATAAGGCTTGTGAGCTAATGAAAAACAGAAATTTAAATTTA
AGTATTGGTGATATATCACGATCTGTTGGTTATAAAGACCCTCTAACTTTTTCTAAAACCTTTAAAAAAACAAAGGGGAT
ATCCCCTCTGCATTATCAAAAACAATCATAA

Upstream 100 bases:

>100_bases
TAGATAATTATTAACTTTATAAAGACTCCCTTGCCTCGATAATCAACTTCTACATATGGTATAATGTTAGATAATAGTAA
AAGAAAAAAGGGGGGTCTTG

Downstream 100 bases:

>100_bases
AACCATAGTCATATAGACCAAGTGTCACATTATTACATATTGAAACAACATTAAACCATTTAAGTTTAATCAAAGTTATT
ATATACTACAGTTGCAATAA

Product: AraC family transcriptional regulator

Products: NA

Alternate protein names: NA

Number of amino acids: Translated: 276; Mature: 275
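
The residue counts follow from the gene coordinates: 831 bases give 277 codons, one of which is the stop codon, and the mature form listed in this record lacks the initiator Met. The gene starts with GTG, which is read as Met when it serves as a bacterial start codon. The sketch below illustrates this on the first and last codons of the gene sequence; the helper `translate` and the compact codon table are illustrative, not the method used to build the record.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cds: str) -> str:
    """Translate a coding sequence; the first codon is read as Met, '*' marks a stop."""
    usable = len(cds) - len(cds) % 3
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))
    return "M" + protein[1:] if protein else protein

GENE_START = "GTGGCTTACAACCATAAAGAT"  # first 21 bases of the gene sequence
GENE_END = "CAATCATAA"                # last 9 bases of the gene sequence

print(translate(GENE_START))  # MAYNHKD
print("".join(CODON_TABLE[GENE_END[i:i + 3]] for i in (0, 3, 6)))  # QS*

gene_length = 2549282 - 2548452 + 1  # 831 bases
translated = gene_length // 3 - 1    # 277 codons minus the stop codon
mature = translated - 1              # minus the initiator Met
print(gene_length, translated, mature)  # 831 276 275
```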

Protein sequence:

>276_residues
MAYNHKDIDVNRLNLTDIDIYQCGQEKCKPGHSYGPAVRDHYLIHFIYNGRGIFQVGENTYHLEAGQGFLICPDIVTYYQ
ADRHNPWEYAWIGFHGLKAKDYLNRANLSLANPVFSDTDGSPLRFIFEEMTAARKLKRSREIKLIGLIYVFLSHLIELNV
SGSTPDNNSKENYIKKAIEYIEKNYSRHIKVIDIANHVGLDRSYLWSIFNEFLNTSPQQYLINYRINKACELMKNRNLNL
SIGDISRSVGYKDPLTFSKTFKKTKGISPLHYQKQS

Sequences:

>Translated_276_residues
MAYNHKDIDVNRLNLTDIDIYQCGQEKCKPGHSYGPAVRDHYLIHFIYNGRGIFQVGENTYHLEAGQGFLICPDIVTYYQ
ADRHNPWEYAWIGFHGLKAKDYLNRANLSLANPVFSDTDGSPLRFIFEEMTAARKLKRSREIKLIGLIYVFLSHLIELNV
SGSTPDNNSKENYIKKAIEYIEKNYSRHIKVIDIANHVGLDRSYLWSIFNEFLNTSPQQYLINYRINKACELMKNRNLNL
SIGDISRSVGYKDPLTFSKTFKKTKGISPLHYQKQS
>Mature_275_residues
AYNHKDIDVNRLNLTDIDIYQCGQEKCKPGHSYGPAVRDHYLIHFIYNGRGIFQVGENTYHLEAGQGFLICPDIVTYYQA
DRHNPWEYAWIGFHGLKAKDYLNRANLSLANPVFSDTDGSPLRFIFEEMTAARKLKRSREIKLIGLIYVFLSHLIELNVS
GSTPDNNSKENYIKKAIEYIEKNYSRHIKVIDIANHVGLDRSYLWSIFNEFLNTSPQQYLINYRINKACELMKNRNLNLS
IGDISRSVGYKDPLTFSKTFKKTKGISPLHYQKQS

Specific function: Regulatory protein for the msm operon, which is involved in multiple sugar metabolism. Activates transcription of the msmEFGK, aga, dexB and gftA genes. [H]

COG id: NA

COG function: NA

Gene ontology:

Cell location: Cytoplasm [C]

Metabolic importance: Non_Essential [C]

Operon status: Not Known

Operon components: None

Similarity: Contains 1 HTH araC/xylS-type DNA-binding domain [H]

Homologues:

Organism=Escherichia coli, GI1790559, Length=186, Percent_Identity=26.8817204301075, Blast_Score=64, Evalue=1e-11

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR009057
- InterPro:   IPR012287
- InterPro:   IPR003313
- InterPro:   IPR018062
- InterPro:   IPR020449
- InterPro:   IPR018060 [H]

Pfam domain/function: PF02311 AraC_binding; PF00165 HTH_AraC [H]

EC number: NA

Molecular weight: Translated: 31998; Mature: 31867

Theoretical pI: Translated: 9.23; Mature: 9.23

Prosite motif: PS00041 HTH_ARAC_FAMILY_1; PS01124 HTH_ARAC_FAMILY_2

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

1.4 %Cys     (Translated Protein)
1.1 %Met     (Translated Protein)
2.5 %Cys+Met (Translated Protein)
1.5 %Cys     (Mature Protein)
0.7 %Met     (Mature Protein)
2.2 %Cys+Met (Mature Protein)
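
These percentages can be reproduced by counting C and M residues in the two protein sequences listed in this record. The sketch below assumes the values are simple residue counts divided by sequence length and rounded to one decimal; `percent` is a hypothetical helper name.

```python
# Translated protein sequence as given in this record (276 residues).
TRANSLATED = (
    "MAYNHKDIDVNRLNLTDIDIYQCGQEKCKPGHSYGPAVRDHYLIHFIYNGRGIFQVGENTYHLEAGQGFLICPDIVTYYQ"
    "ADRHNPWEYAWIGFHGLKAKDYLNRANLSLANPVFSDTDGSPLRFIFEEMTAARKLKRSREIKLIGLIYVFLSHLIELNV"
    "SGSTPDNNSKENYIKKAIEYIEKNYSRHIKVIDIANHVGLDRSYLWSIFNEFLNTSPQQYLINYRINKACELMKNRNLNL"
    "SIGDISRSVGYKDPLTFSKTFKKTKGISPLHYQKQS"
)
# Mature form: initiator Met removed, matching the Mature_275_residues entry.
MATURE = TRANSLATED[1:]

def percent(seq: str, residues: str) -> float:
    """Percentage of the sequence made up of the given one-letter residues."""
    return 100.0 * sum(seq.count(r) for r in residues) / len(seq)

for label, seq in (("Translated", TRANSLATED), ("Mature", MATURE)):
    print(
        f"{label}: {percent(seq, 'C'):.1f} %Cys, "
        f"{percent(seq, 'M'):.1f} %Met, "
        f"{percent(seq, 'CM'):.1f} %Cys+Met"
    )
# Translated: 1.4 %Cys, 1.1 %Met, 2.5 %Cys+Met
# Mature: 1.5 %Cys, 0.7 %Met, 2.2 %Cys+Met
```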

Secondary structure:

>Translated Secondary Structure
MAYNHKDIDVNRLNLTDIDIYQCGQEKCKPGHSYGPAVRDHYLIHFIYNGRGIFQVGENT
CCCCCCCCCEEEEECCCCHHHHCCHHHCCCCCCCCCCCCCCEEEEEEECCCEEEEECCCE
YHLEAGQGFLICPDIVTYYQADRHNPWEYAWIGFHGLKAKDYLNRANLSLANPVFSDTDG
EEEECCCCEEECCHHHHHHHCCCCCCCEEEEEEEECCCHHHHHHHCCCEECCCCCCCCCC
SPLRFIFEEMTAARKLKRSREIKLIGLIYVFLSHLIELNVSGSTPDNNSKENYIKKAIEY
CHHHHHHHHHHHHHHHHHHCCEEHHHHHHHHHHHHHEEECCCCCCCCCCHHHHHHHHHHH
IEKNYSRHIKVIDIANHVGLDRSYLWSIFNEFLNTSPQQYLINYRINKACELMKNRNLNL
HHHHHHCEEEEEEEHHHCCCCHHHHHHHHHHHHCCCCHHEEEEEHHHHHHHHHHCCCCEE
SIGDISRSVGYKDPLTFSKTFKKTKGISPLHYQKQS
EEHHHHHHCCCCCCCCHHHHHHHHCCCCCCCCCCCC
>Mature Secondary Structure 
AYNHKDIDVNRLNLTDIDIYQCGQEKCKPGHSYGPAVRDHYLIHFIYNGRGIFQVGENT
CCCCCCCCEEEEECCCCHHHHCCHHHCCCCCCCCCCCCCCEEEEEEECCCEEEEECCCE
YHLEAGQGFLICPDIVTYYQADRHNPWEYAWIGFHGLKAKDYLNRANLSLANPVFSDTDG
EEEECCCCEEECCHHHHHHHCCCCCCCEEEEEEEECCCHHHHHHHCCCEECCCCCCCCCC
SPLRFIFEEMTAARKLKRSREIKLIGLIYVFLSHLIELNVSGSTPDNNSKENYIKKAIEY
CHHHHHHHHHHHHHHHHHHCCEEHHHHHHHHHHHHHEEECCCCCCCCCCHHHHHHHHHHH
IEKNYSRHIKVIDIANHVGLDRSYLWSIFNEFLNTSPQQYLINYRINKACELMKNRNLNL
HHHHHHCEEEEEEEHHHCCCCHHHHHHHHHHHHCCCCHHEEEEEHHHHHHHHHHCCCCEE
SIGDISRSVGYKDPLTFSKTFKKTKGISPLHYQKQS
EEHHHHHHCCCCCCCCHHHHHHHHCCCCCCCCCCCC

PDB accession: NA

Resolution: NA

Structure class: Alpha Beta

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 10.0

TargetDB status: NA

Availability: NA

References: 1537846; 12397186 [H]