Definition: Escherichia coli HS, complete genome.
Accession: NC_009800
Length: 4,643,538 bp

The map label for this gene is nagA1 [H]

Identifier: 157162620

GI number: 157162620

Start: 3331417

End: 3332550

Strand: Direct

Name: nagA1 [H]

Synonym: EcHS_A3327

Alternate gene names: 157162620

Gene position: 3331417-3332550 (Clockwise)

Preceding gene: 157162619

Following gene: 157162621

Centisome position: 71.74
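
The gene length and centisome position can be reproduced from the coordinates above. The following is a minimal sketch, assuming that the centisome position is the start coordinate expressed as a percentage of the genome length. That definition is an assumption, although it agrees with the value listed in this record. The function names are illustrative.

```python
GENOME_LENGTH = 4_643_538          # NC_009800, from the Length field
START, END = 3_331_417, 3_332_550  # from the Start and End fields


def gene_length(start: int, end: int) -> int:
    """Length in bases of a gene given 1-based, inclusive coordinates."""
    return end - start + 1


def centisome(start: int, genome_length: int) -> float:
    """Start coordinate as a percentage of genome length (assumed definition)."""
    return 100.0 * start / genome_length


if __name__ == "__main__":
    print(gene_length(START, END))                    # 1134
    print(round(centisome(START, GENOME_LENGTH), 2))  # 71.74
```

A length of 1134 bases corresponds to 378 codons, that is, 377 residues plus one stop codon, which agrees with the translated protein length reported further down.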

GC content: 57.58

Gene sequence:

>1134_bases
ATGACACACGTTCTGCGCGCCAGAAGGCTGCTGACTGAAGAGGGATGGCTCGATGACCATCAGTTGCGTATTGCTGACGG
TGTCATCGCAGCAATCGAACCGATTCCAGTGGGCGTGACTGAACGCGATGCGGAACTGCTCTGCCCCGCTTACATCGACA
CCCATGTACACGGTGGTGCGGGCGTTGATGTAATGGATGACGCGCCGGATGTGCTCGACAAGCTGGCAATGCACAAGGCA
CGCGAAGGTGTCGGCAGTTGGTTACCGACTACCGTAACCGCGCCGCTTAATACCATTCATGCGGCGCTGAAACGTATTGC
GCAACGTTGCCAACGCGGCGGACCTGGTGCGCAAGTGCTGGGGAGTTATCTCGAAGGACCGTACTTCACGCCGCAGAATA
AAGGCGCGCATCCGCCGGAGTTGTTTCGCGAGCTTGAAATTGCCGAGCTGGATCAATTGATTGCCGTTTCTCAGCACACC
TTACGCGTGGTAGCGCTGGCACCGGAAAAAGAGGGGGCATTGCAGGCCATCCGCCATCTTAAACAGCAAAATGTACGAGT
GATGCTGGGGCATAGCGCGGCGACCTGGCAACAAACTCGCGCCGCGTTTGATGCTGGTGCCGACGGCCTGGTGCATTGCT
ATAACGGGATGACAGGTTTACATCACCGCGAACCGGGAATGGTTGGCGCGGGATTAACGGACAAGCGCGCCTGGCTGGAA
CTGATAGCCGATGGTCATCATGTGCATCCGGCGGCGATGTCGCTGTGTTGTTGCTGTGCAAAAGAGAGAATCGTGCTGAT
CACCGACGCGATGCAGGCAGCCGGGATGCCGGATGGTCGCTATACGTTATGTGGCGAAGAAGTGCAGATGCACGGTGGCG
TTGTCCGTACAGCGTCCGGTGGGCTGGCGGGCAGTACGCTGTCTGTTGATGCGGCAGTGCGCAACATGGTCGAGTTGACG
GGCGTAACGCCTGCGGAAGCCATTCATATGGCATCGCTGCATCCGGCGCGAATGCTGGGTGTTGATGGTGTTCTGGGATC
GCTTAAACCGGGCAAACGCGCCAGCATCGTTGCGCTGGATAGCGGGCTGCATGTGCAACAAATCTGGATTCAGAGTCAAT
TAGCTTCGTTTTGA
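
The GC content field can be checked against the sequence above. This is a small sketch, assuming GC content means the percentage of G and C bases over the full gene sequence. The helper name `gc_content` is hypothetical, and the caller is expected to pass the sequence lines only, without the `>1134_bases` header.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence.

    Whitespace and line breaks are ignored, so wrapped sequence
    lines can be pasted in directly.
    """
    seq = "".join(seq.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


if __name__ == "__main__":
    # Toy input; replace with the gene sequence lines from this record.
    print(gc_content("ATGC"))  # 50.0
```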

Upstream 100 bases:

>100_bases
AATTCCGTGTACAGGCACTGGAGTGTGGACATCGTGGGCTGACCAGTCTGGTGGACGAGTTAGGCCGCTGCCATGAAGAA
TGTCCGGTCGAGGAAGGGAT

Downstream 100 bases:

>100_bases
TAGTTTGCTCCTTTATTGGGCCTTCACTTCCCCCGTAAGGCCTTTCTTTTTCTTTCGTTTTGATCTGTGCAGCGGTGTCG
GATGCGACGCTAACGCGTCT

Product: N-acetylglucosamine-6-phosphate deacetylase

Products: D-Galactosamine 6-Phosphate; Acetate. [C]

Alternate protein names: NA

Number of amino acids: Translated: 377; Mature: 376

Protein sequence:

>377_residues
MTHVLRARRLLTEEGWLDDHQLRIADGVIAAIEPIPVGVTERDAELLCPAYIDTHVHGGAGVDVMDDAPDVLDKLAMHKA
REGVGSWLPTTVTAPLNTIHAALKRIAQRCQRGGPGAQVLGSYLEGPYFTPQNKGAHPPELFRELEIAELDQLIAVSQHT
LRVVALAPEKEGALQAIRHLKQQNVRVMLGHSAATWQQTRAAFDAGADGLVHCYNGMTGLHHREPGMVGAGLTDKRAWLE
LIADGHHVHPAAMSLCCCCAKERIVLITDAMQAAGMPDGRYTLCGEEVQMHGGVVRTASGGLAGSTLSVDAAVRNMVELT
GVTPAEAIHMASLHPARMLGVDGVLGSLKPGKRASIVALDSGLHVQQIWIQSQLASF

Sequences:

>Translated_377_residues
MTHVLRARRLLTEEGWLDDHQLRIADGVIAAIEPIPVGVTERDAELLCPAYIDTHVHGGAGVDVMDDAPDVLDKLAMHKA
REGVGSWLPTTVTAPLNTIHAALKRIAQRCQRGGPGAQVLGSYLEGPYFTPQNKGAHPPELFRELEIAELDQLIAVSQHT
LRVVALAPEKEGALQAIRHLKQQNVRVMLGHSAATWQQTRAAFDAGADGLVHCYNGMTGLHHREPGMVGAGLTDKRAWLE
LIADGHHVHPAAMSLCCCCAKERIVLITDAMQAAGMPDGRYTLCGEEVQMHGGVVRTASGGLAGSTLSVDAAVRNMVELT
GVTPAEAIHMASLHPARMLGVDGVLGSLKPGKRASIVALDSGLHVQQIWIQSQLASF
>Mature_376_residues
THVLRARRLLTEEGWLDDHQLRIADGVIAAIEPIPVGVTERDAELLCPAYIDTHVHGGAGVDVMDDAPDVLDKLAMHKAR
EGVGSWLPTTVTAPLNTIHAALKRIAQRCQRGGPGAQVLGSYLEGPYFTPQNKGAHPPELFRELEIAELDQLIAVSQHTL
RVVALAPEKEGALQAIRHLKQQNVRVMLGHSAATWQQTRAAFDAGADGLVHCYNGMTGLHHREPGMVGAGLTDKRAWLEL
IADGHHVHPAAMSLCCCCAKERIVLITDAMQAAGMPDGRYTLCGEEVQMHGGVVRTASGGLAGSTLSVDAAVRNMVELTG
VTPAEAIHMASLHPARMLGVDGVLGSLKPGKRASIVALDSGLHVQQIWIQSQLASF
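
The translated and mature sequences above can be derived from the gene sequence. The following is a minimal sketch, assuming the standard genetic code and assuming that "mature" means the translated protein with its initiator methionine removed, which is how the two sequences in this record differ. The names `translate` and `mature` are illustrative and do not come from the source database.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order with the third base varying fastest.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]  # raises KeyError on non-ACGT input
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def mature(protein: str) -> str:
    """Drop the initiator methionine (assumed meaning of 'mature' here)."""
    return protein[1:] if protein.startswith("M") else protein


if __name__ == "__main__":
    # First five codons of the gene sequence, followed by a stop codon.
    print(translate("ATGACACACGTTCTGTGA"))  # MTHVL
    print(mature("MTHVL"))                  # THVL
```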

Specific function: N-acetylgalactosamine utilization. [C]

COG id: COG1820

COG function: function code G; N-acetylglucosamine-6-phosphate deacetylase

Gene ontology:

Cell location: Cytoplasm [C]

Metabolic importance: Unknown [C]

Operon status: Not Known

Operon components: None

Similarity: Belongs to the nagA family [H]

Homologues:

Organism=Homo sapiens, GI21361513, Length=415, Percent_Identity=33.2530120481928, Blast_Score=192, Evalue=4e-49
Organism=Homo sapiens, GI224922791, Length=412, Percent_Identity=33.252427184466, Blast_Score=187, Evalue=1e-47
Organism=Escherichia coli, GI1786892, Length=380, Percent_Identity=32.1052631578947, Blast_Score=178, Evalue=4e-46
Organism=Caenorhabditis elegans, GI17553768, Length=383, Percent_Identity=29.7650130548303, Blast_Score=153, Evalue=1e-37
Organism=Drosophila melanogaster, GI19920392, Length=387, Percent_Identity=30.749354005168, Blast_Score=178, Evalue=5e-45
Organism=Drosophila melanogaster, GI281361140, Length=400, Percent_Identity=30.5, Blast_Score=174, Evalue=1e-43

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro: IPR006680 [H]

Pfam domain/function: PF01979 Amidohydro_1 [H]

EC number: 3.5.1.- [C]

Molecular weight: Translated: 40312; Mature: 40181

Theoretical pI: Translated: 6.67; Mature: 6.67

Prosite motif: NA

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

2.1% Cys (Translated Protein)
3.4% Met (Translated Protein)
5.6% Cys+Met (Translated Protein)
2.1% Cys (Mature Protein)
3.2% Met (Mature Protein)
5.3% Cys+Met (Mature Protein)
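
The Cys/Met percentages can be recomputed from the protein sequences. This is a hedged sketch, assuming each figure is the count of the named residues divided by the sequence length and rounded to one decimal place. The helper `residue_percent` is a hypothetical name.

```python
def residue_percent(protein: str, residues: str) -> float:
    """Percentage of positions in `protein` occupied by any of `residues`.

    `residues` is a string of one-letter codes, e.g. "C" or "CM".
    """
    protein = "".join(protein.split()).upper()
    if not protein:
        raise ValueError("empty sequence")
    count = sum(protein.count(r) for r in residues.upper())
    return 100.0 * count / len(protein)


if __name__ == "__main__":
    # Toy input; replace with the translated or mature sequence lines.
    toy = "MCAA"
    print(round(residue_percent(toy, "C"), 1))   # 25.0
    print(round(residue_percent(toy, "CM"), 1))  # 50.0
```

Passing the translated sequence and then the mature sequence gives the two sets of figures. Only the Met and Cys+Met values differ between them, because the mature form lacks the initiator methionine.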

Secondary structure:

>Translated Secondary Structure
MTHVLRARRLLTEEGWLDDHQLRIADGVIAAIEPIPVGVTERDAELLCPAYIDTHVHGGA
CCHHHHHHHHHHHCCCCCCCCEEECCCEEEEECCCCCCCCCCCHHHCCHHHHHHCCCCCC
GVDVMDDAPDVLDKLAMHKAREGVGSWLPTTVTAPLNTIHAALKRIAQRCQRGGPGAQVL
CCCCCCCCHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCHHHHH
GSYLEGPYFTPQNKGAHPPELFRELEIAELDQLIAVSQHTLRVVALAPEKEGALQAIRHL
HHHHCCCCCCCCCCCCCCHHHHHHCCHHHHHHHHHHHHCEEEEEEECCCCCHHHHHHHHH
KQQNVRVMLGHSAATWQQTRAAFDAGADGLVHCYNGMTGLHHREPGMVGAGLTDKRAWLE
HHCCCEEEECCCHHHHHHHHHHHHCCCCCCEEHHCCCCCCCCCCCCCEECCCCHHHHHHH
LIADGHHVHPAAMSLCCCCAKERIVLITDAMQAAGMPDGRYTLCGEEVQMHGGVVRTASG
HHHCCCCCCHHHHHHHHHHCCCCEEEEECCHHHCCCCCCCEEEECCHHHHCCCEEEECCC
GLAGSTLSVDAAVRNMVELTGVTPAEAIHMASLHPARMLGVDGVLGSLKPGKRASIVALD
CCCCCEEEHHHHHHHHHHHCCCCHHHHHHHHCCCHHHHHCCCHHHCCCCCCCCEEEEEEC
SGLHVQQIWIQSQLASF
CCCCHHHHHHHHHHHCC
>Mature Secondary Structure 
THVLRARRLLTEEGWLDDHQLRIADGVIAAIEPIPVGVTERDAELLCPAYIDTHVHGGA
CHHHHHHHHHHHCCCCCCCCEEECCCEEEEECCCCCCCCCCCHHHCCHHHHHHCCCCCC
GVDVMDDAPDVLDKLAMHKAREGVGSWLPTTVTAPLNTIHAALKRIAQRCQRGGPGAQVL
CCCCCCCCHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCHHHHH
GSYLEGPYFTPQNKGAHPPELFRELEIAELDQLIAVSQHTLRVVALAPEKEGALQAIRHL
HHHHCCCCCCCCCCCCCCHHHHHHCCHHHHHHHHHHHHCEEEEEEECCCCCHHHHHHHHH
KQQNVRVMLGHSAATWQQTRAAFDAGADGLVHCYNGMTGLHHREPGMVGAGLTDKRAWLE
HHCCCEEEECCCHHHHHHHHHHHHCCCCCCEEHHCCCCCCCCCCCCCEECCCCHHHHHHH
LIADGHHVHPAAMSLCCCCAKERIVLITDAMQAAGMPDGRYTLCGEEVQMHGGVVRTASG
HHHCCCCCCHHHHHHHHHHCCCCEEEEECCHHHCCCCCCCEEEECCHHHHCCCEEEECCC
GLAGSTLSVDAAVRNMVELTGVTPAEAIHMASLHPARMLGVDGVLGSLKPGKRASIVALD
CCCCCEEEHHHHHHHHHHHCCCCHHHHHHHHCCCHHHHHCCCHHHCCCCCCCCEEEEEEC
SGLHVQQIWIQSQLASF
CCCCHHHHHHHHHHHCC

PDB accession: NA

Resolution: NA

Structure class: Unstructured

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: N-Acetyl-D-Galactosamine 6-Phosphate; H2O [C]

Specific reaction: N-Acetyl-D-Galactosamine 6-Phosphate + H2O = D-Galactosamine 6-Phosphate + Acetate. [C]

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 9278503; 8932697 [H]