Definition Escherichia coli HS, complete genome.
Accession NC_009800
Length 4,643,538

The map label for this gene is usg

Identifier: 157161807

GI number: 157161807

Start: 2477207

End: 2478220

Strand: Reverse

Name: usg

Synonym: EcHS_A2470

Alternate gene names: 157161807

Gene position: 2478220-2477207 (Counterclockwise)
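
The Start and End fields are inclusive coordinates on the forward strand, so the gene length follows directly from them. A minimal sketch of that arithmetic is given below; the function name `gene_length` is illustrative and not part of the record.

```python
def gene_length(start: int, end: int) -> int:
    """Length in bases of a feature given inclusive 1-based coordinates."""
    return end - start + 1


# Coordinates taken from the Start and End fields of this record.
start, end = 2477207, 2478220

length = gene_length(start, end)
codons = length // 3

print(length)        # number of bases in the gene
print(length % 3)    # 0 if the gene is a whole number of codons
print(codons - 1)    # codons excluding the terminal stop codon
```

For this record the length is 1014 bases, that is 338 codons, which matches the 337 translated residues plus one stop codon.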

Preceding gene: 157161808

Following gene: 157161806

Centisome position: 53.37
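
A centisome position expresses a coordinate as a percentage of the genome length. The sketch below assumes the value is computed from the first coordinate of the Gene position field (2478220, the start of the reverse-strand gene) and the Length field; the record does not state which coordinate is used, and the function name `centisome` is illustrative.

```python
def centisome(position: int, genome_length: int) -> float:
    """Position as a percentage of the total genome length."""
    return 100.0 * position / genome_length


# Assumption: the gene's first coordinate on the reverse strand is used.
value = centisome(2478220, 4643538)
print(round(value, 2))
```

Under that assumption the result rounds to the 53.37 listed above; using the Start field (2477207) instead gives about 53.35.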

GC content: 54.93
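
GC content is the percentage of G and C bases in the gene sequence. A minimal sketch of the calculation follows; the function name `gc_percent` is illustrative, and the short test string is not taken from the record.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# Small illustrative input: 2 of 4 bases are G or C.
print(gc_percent("ATGC"))
```

For a 1014-base gene, the listed 54.93 would correspond to 557 G or C bases (557 / 1014), assuming the value was rounded to two decimals.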

Gene sequence:

>1014_bases
ATGTCTGAAGGCTGGAACATTGCCGTCCTGGGCGCAACTGGCGCTGTGGGCGAAGCCCTGCTTGAAACGCTGGCTGAACG
TCAGTTCCCGGTTGGGGAAATTTATGCACTGGCACGTAACGAAAGCGCAGGCGAACAACTGCGCTTTGGTGGTAAGACAA
TCACCGTGCAGGATGCCGCTGAATTCGACTGGACGCAGGCGCAGCTGGCATTTTTTGTCGCAGGCAAAGAAGCTACCGCT
GCCTGGGTTGAAGAAGCGACCAACTCAGGTTGCCTGGTGATCGACAGCAGTGGATTGTTTGCTCTCGAACCCGACGTACC
GCTGGTGGTGCCGGAAGTAAACCCGTTTGTACTGACAGATTACCGGAACCGGAATGTCATCGCCGTACCAGACAGTCTGA
CCAGCCAGCTGCTGGCGGCACTGAAACCGTTAATCGATCAGGGCGGTTTATCACGTATCAGCGTTACCAGCCTGATTTCA
GCCTCCGCCCAGGGCAAAAAAGCGGTCGATGCGTTAGCGGGGCAGAGTGCGAAATTGCTCAACGGCATTCCGATTGACGA
AGAAGATTTCTTCGGGCGTCAGCTGGCGTTCAACATGCTGCCGTTACTGCCGGATAGCGAAGGTAGCGTGCGTGAAGAAC
GTCGTATCGTTGACGAAGTACGCAAAATCCTGCAGGACGAAGGGCTGATGATTTCGGCTAGCGTCGTCCAGGCACCGGTA
TTCTACGGTCATGCCCAGATGGTCAACTTTGAAGCTCTGCGTCCACTGGCAGCAGAAGAAGCGCGTGATGCGTTTGTTCA
AGGCGAAGATATTGTGCTCTCTGAAGAGAACGAATTCCCAACTCAGGTAGGTGATGCTTCGGGTACGCCGCATCTTTCTG
TTGGCTGCGTGCGTAATGACTACGGTATGCCGGAGCAAGTCCAGTTCTGGTCGGTGGCCGATAACGTTCGCTTTGGCGGC
GCGCTGATGGCAGTAAAAATCGCCGAGAAACTGGTGCAGGAGTATCTGTACTAA

Upstream 100 bases:

>100_bases
TGGGTTTTAACGCCGTTCATCATCCGGCACGTTAATCTCTTCTTCATGCTCTCTGCTGTAACATTGGCAGGGAGCTTTGC
TATTTCTGGAGTAAACCACC

Downstream 100 bases:

>100_bases
TGTCCGACCAGCAACAACCGCCAGTTTATAAAATTGCGCTGGGCATTGAGTACGACGGCAGTAAGTATTACGGCTGGCAA
CGGCAGAACGAAGTTCGCAG

Product: putative semialdehyde dehydrogenase

Products: NA

Alternate protein names: NA

Number of amino acids: Translated: 337; Mature: 336

Protein sequence:

>337_residues
MSEGWNIAVLGATGAVGEALLETLAERQFPVGEIYALARNESAGEQLRFGGKTITVQDAAEFDWTQAQLAFFVAGKEATA
AWVEEATNSGCLVIDSSGLFALEPDVPLVVPEVNPFVLTDYRNRNVIAVPDSLTSQLLAALKPLIDQGGLSRISVTSLIS
ASAQGKKAVDALAGQSAKLLNGIPIDEEDFFGRQLAFNMLPLLPDSEGSVREERRIVDEVRKILQDEGLMISASVVQAPV
FYGHAQMVNFEALRPLAAEEARDAFVQGEDIVLSEENEFPTQVGDASGTPHLSVGCVRNDYGMPEQVQFWSVADNVRFGG
ALMAVKIAEKLVQEYLY

Sequences:

>Translated_337_residues
MSEGWNIAVLGATGAVGEALLETLAERQFPVGEIYALARNESAGEQLRFGGKTITVQDAAEFDWTQAQLAFFVAGKEATA
AWVEEATNSGCLVIDSSGLFALEPDVPLVVPEVNPFVLTDYRNRNVIAVPDSLTSQLLAALKPLIDQGGLSRISVTSLIS
ASAQGKKAVDALAGQSAKLLNGIPIDEEDFFGRQLAFNMLPLLPDSEGSVREERRIVDEVRKILQDEGLMISASVVQAPV
FYGHAQMVNFEALRPLAAEEARDAFVQGEDIVLSEENEFPTQVGDASGTPHLSVGCVRNDYGMPEQVQFWSVADNVRFGG
ALMAVKIAEKLVQEYLY
>Mature_336_residues
SEGWNIAVLGATGAVGEALLETLAERQFPVGEIYALARNESAGEQLRFGGKTITVQDAAEFDWTQAQLAFFVAGKEATAA
WVEEATNSGCLVIDSSGLFALEPDVPLVVPEVNPFVLTDYRNRNVIAVPDSLTSQLLAALKPLIDQGGLSRISVTSLISA
SAQGKKAVDALAGQSAKLLNGIPIDEEDFFGRQLAFNMLPLLPDSEGSVREERRIVDEVRKILQDEGLMISASVVQAPVF
YGHAQMVNFEALRPLAAEEARDAFVQGEDIVLSEENEFPTQVGDASGTPHLSVGCVRNDYGMPEQVQFWSVADNVRFGGA
LMAVKIAEKLVQEYLY
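
The translated and mature sequences differ only by the initiator methionine. The sketch below shows one way to derive both from a coding sequence, assuming the standard codon assignments and that the mature form is obtained simply by dropping the first residue; the names `translate` and `mature_form` are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def mature_form(protein: str) -> str:
    """Drop the initiator methionine, if present (an assumption here)."""
    return protein[1:] if protein.startswith("M") else protein


# First six codons of the gene sequence above, followed by a stop codon.
fragment = "ATGTCTGAAGGCTGGAACTAA"
translated = translate(fragment)
print(translated)
print(mature_form(translated))
```

Applied to the full 1014-base gene sequence, the same two steps would give the 337-residue and 336-residue entries listed above.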

Specific function: Unknown

COG id: COG0136

COG function: function code E; Aspartate-semialdehyde dehydrogenase

Gene ontology:

Cell location: Cytoplasm [C]

Metabolic importance: Essential [C]

Operon status: Not Known

Operon components: None

Similarity: Belongs to the aspartate-semialdehyde dehydrogenase family

Homologues:

Organism=Escherichia coli, GI1788658, Length=337, Percent_Identity=100, Blast_Score=680, Evalue=0.0

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): USG_ECOLI (P08390)

Other databases:

- EMBL:   X02743
- EMBL:   U00096
- EMBL:   AP009048
- EMBL:   M15541
- EMBL:   M15542
- EMBL:   M29962
- PIR:   A23792
- RefSeq:   AP_002919.1
- RefSeq:   NP_416822.1
- ProteinModelPortal:   P08390
- SMR:   P08390
- DIP:   DIP-11095N
- IntAct:   P08390
- MINT:   MINT-1222362
- STRING:   P08390
- PRIDE:   P08390
- EnsemblBacteria:   EBESCT00000003753
- EnsemblBacteria:   EBESCT00000003754
- EnsemblBacteria:   EBESCT00000015858
- GeneID:   946797
- GenomeReviews:   AP009048_GR
- GenomeReviews:   U00096_GR
- KEGG:   ecj:JW2316
- KEGG:   eco:b2319
- EchoBASE:   EB1052
- EcoGene:   EG11059
- eggNOG:   COG0136
- GeneTree:   EBGT00050000011013
- HOGENOM:   HBG518238
- OMA:   LELWLCG
- ProtClustDB:   PRK08040
- BioCyc:   EcoCyc:EG11059-MONOMER
- Genevestigator:   P08390
- GO:   GO:0005737
- InterPro:   IPR012080
- InterPro:   IPR016040
- InterPro:   IPR000534
- InterPro:   IPR012280
- Gene3D:   G3DSA:3.40.50.720
- PIRSF:   PIRSF000148
- SMART:   SM00859

Pfam domain/function: PF01118 Semialdhyde_dh; PF02774 Semialdhyde_dhC

EC number: NA

Molecular weight: Translated: 36364; Mature: 36233

Theoretical pI: Translated: 4.11; Mature: 4.11

Prosite motif: NA

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

0.6 %Cys     (Translated Protein)
1.8 %Met     (Translated Protein)
2.4 %Cys+Met (Translated Protein)
0.6 %Cys     (Mature Protein)
1.5 %Met     (Mature Protein)
2.1 %Cys+Met (Mature Protein)
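
The percentages above are residue counts divided by sequence length. The sketch below recomputes them from the protein sequence in this record, assuming one-decimal rounding; the names `PROTEIN` and `residue_percent` are illustrative.

```python
# Translated protein sequence copied from the record above (337 residues).
PROTEIN = (
    "MSEGWNIAVLGATGAVGEALLETLAERQFPVGEIYALARNESAGEQLRFGGKTITVQDAAEFDWTQAQLAFFVAGKEATA"
    "AWVEEATNSGCLVIDSSGLFALEPDVPLVVPEVNPFVLTDYRNRNVIAVPDSLTSQLLAALKPLIDQGGLSRISVTSLIS"
    "ASAQGKKAVDALAGQSAKLLNGIPIDEEDFFGRQLAFNMLPLLPDSEGSVREERRIVDEVRKILQDEGLMISASVVQAPV"
    "FYGHAQMVNFEALRPLAAEEARDAFVQGEDIVLSEENEFPTQVGDASGTPHLSVGCVRNDYGMPEQVQFWSVADNVRFGG"
    "ALMAVKIAEKLVQEYLY"
)


def residue_percent(protein: str, residues: str) -> float:
    """Percentage of positions occupied by any of the given residues."""
    count = sum(protein.count(r) for r in residues)
    return round(100.0 * count / len(protein), 1)


mature = PROTEIN[1:]  # mature form: initiator methionine removed

for label, seq in (("Translated", PROTEIN), ("Mature", mature)):
    print(label,
          residue_percent(seq, "C"),
          residue_percent(seq, "M"),
          residue_percent(seq, "CM"))
```

The sequence contains 2 cysteines and 6 methionines, so removing the initiator methionine lowers only the Met and Cys+Met figures for the mature protein.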

Secondary structure:

>Translated Secondary Structure
MSEGWNIAVLGATGAVGEALLETLAERQFPVGEIYALARNESAGEQLRFGGKTITVQDAA
CCCCCEEEEEECCCHHHHHHHHHHHHCCCCHHHEEEEECCCCCCCCEECCCEEEEEECCC
EFDWTQAQLAFFVAGKEATAAWVEEATNSGCLVIDSSGLFALEPDVPLVVPEVNPFVLTD
CCCCHHEEEEEEEECCCHHHHHHHHCCCCCEEEEECCCEEEECCCCCEEECCCCCEEEEE
YRNRNVIAVPDSLTSQLLAALKPLIDQGGLSRISVTSLISASAQGKKAVDALAGQSAKLL
CCCCCEEECCHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHCCCCHHHHHHHCCCCHHHH
NGIPIDEEDFFGRQLAFNMLPLLPDSEGSVREERRIVDEVRKILQDEGLMISASVVQAPV
CCCCCCCHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHHHCCCEEEEHHHHHHHH
FYGHAQMVNFEALRPLAAEEARDAFVQGEDIVLSEENEFPTQVGDASGTPHLSVGCVRND
EECCHHEECHHHHCCHHHHHHHHHHCCCCEEEEECCCCCCCCCCCCCCCCCEEEEEEECC
YGMPEQVQFWSVADNVRFGGALMAVKIAEKLVQEYLY
CCCCCCEEEEEECCCCCCCHHHHHHHHHHHHHHHHCC
>Mature Secondary Structure 
SEGWNIAVLGATGAVGEALLETLAERQFPVGEIYALARNESAGEQLRFGGKTITVQDAA
CCCCEEEEEECCCHHHHHHHHHHHHCCCCHHHEEEEECCCCCCCCEECCCEEEEEECCC
EFDWTQAQLAFFVAGKEATAAWVEEATNSGCLVIDSSGLFALEPDVPLVVPEVNPFVLTD
CCCCHHEEEEEEEECCCHHHHHHHHCCCCCEEEEECCCEEEECCCCCEEECCCCCEEEEE
YRNRNVIAVPDSLTSQLLAALKPLIDQGGLSRISVTSLISASAQGKKAVDALAGQSAKLL
CCCCCEEECCHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHCCCCHHHHHHHCCCCHHHH
NGIPIDEEDFFGRQLAFNMLPLLPDSEGSVREERRIVDEVRKILQDEGLMISASVVQAPV
CCCCCCCHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHHHCCCEEEEHHHHHHHH
FYGHAQMVNFEALRPLAAEEARDAFVQGEDIVLSEENEFPTQVGDASGTPHLSVGCVRND
EECCHHEECHHHHCCHHHHHHHHHHCCCCEEEEECCCCCCCCCCCCCCCCCEEEEEEECC
YGMPEQVQFWSVADNVRFGGALMAVKIAEKLVQEYLY
CCCCCCEEEEEECCCCCCCHHHHHHHHHHHHHHHHCC

PDB accession: NA

Resolution: NA

Structure class: Alpha Beta

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 10.0

TargetDB status: NA

Availability: NA

References: 2991861; 9205837; 9278503; 3029016; 2681152