Definition Escherichia coli HS, complete genome.
Accession NC_009800
Length 4,643,538

The map label for this gene is gspD1 [H]

Identifier: 157162799

GI number: 157162799

Start: 3500708

End: 3502672

Strand: Direct

Name: gspD1 [H]

Synonym: EcHS_A3519

Alternate gene names: 157162799

Gene position: 3500708-3502672 (Clockwise)

Preceding gene: 157162798

Following gene: 157162800

Centisome position: 75.39
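
The Start, End, and Centisome position fields above are related by simple arithmetic. The following is a minimal sketch of that relationship, not the database's own code: the function names are illustrative, and taking the centisome from the start coordinate is an assumption that happens to reproduce the value listed in this record.

```python
GENOME_LENGTH = 4_643_538  # from the Length field of this record


def gene_length(start: int, end: int) -> int:
    """Length in bases of a gene given 1-based, inclusive coordinates."""
    return end - start + 1


def centisome(start: int, genome_length: int) -> float:
    """Gene start expressed as a percentage of the genome length."""
    return 100.0 * start / genome_length


if __name__ == "__main__":
    print(gene_length(3500708, 3502672))                 # 1965
    print(round(centisome(3500708, GENOME_LENGTH), 2))   # 75.39
```

The computed length of 1965 agrees with the ">1965_bases" header of the gene sequence.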

GC content: 50.33

Gene sequence:

>1965_bases
ATGGACTGCGTCATGAAAGGACTCAATAAAATCACCTGCTGCTTGCTGGCAGCACTACTCATGCCTTGTGCAGGACACGC
TGAGAACGAACAATACGGCGCGAACTTCAATAACGCCGATATCCGCCAGTTCGTGGAAATAGTGGGTCAGCATCTTGGCA
AAACGATCCTGATCGACCCTTCGGTACAGGGAACCATTTCCGTACGCAGTAATGATACGTTTAGCCAACAGGAGTACTAC
CAGTTCTTTTTAAGTATTCTTGATCTTTACGGTTATTCCGTGATCACGCTGGACAATGGTTTTCTGAGAGTGGTTCGCTC
AGCTAATGTAAAAACATCGCCAGGGATGATTGCTGACAGTTCTCGTCCAGGCGTAGGTGATGAGTTGGTCACCCGAATCG
TACCGCTTGAGAACGTTCCTGCTCGTGACCTGGCCCCCCTGCTCCGCCAGATGATGGATGCGGGTAGCGTCGGTAATGTT
GTGCATTATGAACCCTCCAACGTTCTTATTCTGACCGGTCGTGCCTCCACCATTAATAAACTGATTGAAGTCATAAAGCG
CGTTGATGTCATCGGCACAGAGAAGCAGCAAATTATTCATCTGGAATATGCGTCAGCGGAAGATCTCGCCGAGATTCTTA
ATCAATTAATCAGCGAAAGCCACGGTAAAAGCCAGATGCCAGCCCTCCTCTCCGCGAGGATTGTGGCGGATAAGCGAACC
AACTCTCTTATCATCAGTGGACCGGAAAAAGCACGCCAGCGCATCACTTCATTACTGAAAAGCCTTGATGTCGAAGAGAG
CGAGGAAGGAAATACCCGGGTTTATTACCTGAAATATGCTAAAGCCACGAATCTGGTGGAAGTGCTAACCGGTGTTTCCG
AAAAGCTGAAAGATGAAAAAGGGAATGCGCGTAAGCCCTCCTCTTCTGGCGCGATGGATAACGTCGCCATTACCGCCGAT
GAACAGACTAACTCTCTGGTCATTACCGCTGACCAGTCCGTCCAGGAAAAACTCGCCACGGTAATTGCGCGTCTGGACAT
TCGCCGTGCACAGGTGCTGGTTGAGGCAATCATCGTTGAAGTTCAGGATGGAAATGGACTAAACCTCGGCGTGCAATGGG
CGAATAAAAACGTTGGCGCACAGCAATTTACCAATACCGGATTACCGATTTTTAACGCTGCGCAAGGTGTGGCTGATTAT
AAAAAGAATGGTGGGATCACCAGCGCGAATCCTGCCTGGGATATGTTTAGCGCCTACAATGGCATGGCCGCAGGCTTCTT
CAATGGCGACTGGGGAGTACTGCTTACCGCGCTGGCCAGTAACAATAAAAATGACATCCTCGCCACCCCAAGCATCGTAA
CGCTGGATAATAAACTCGCGTCCTTCAACGTGGGGCAGGATGTGCCGGTGCTATCCGGGTCACAGACCACTTCAGGGGAT
AACGTCTTTAATACCGTCGAACGCAAAACGGTGGGGACAAAACTCAAAGTTACTCCGCAGGTCAATGAAGGCGACGCGGT
GTTGCTCGAAATAGAGCAGGAAGTCTCCAGCGTTGACTCTTCCTCTAACTCGACGCTCGGCCCGACGTTTAATACCCGTA
CTATTCAAAACGCCGTGCTGGTCAAAACCGGTGAAACGGTGGTCCTGGGCGGATTGCTGGATGATTTTTCTAAAGAGCAA
GTGTCAAAGGTTCCTCTGCTTGGCGATATTCCTTTAGTGGGGCAACTCTTCCGCTATACCTCCACCGAGCGCGCTAAACG
CAACCTGATGGTATTTATCCGTCCGACGATTATCCGTGACGATGATGTTTATCGCTCACTGTCAAAAGAGAAATACACCC
GTTACCTTCAGGAGCAACAACAGCGGATCGACGGGAAATCAAAAGCGCTGGTTGGCTCGGAAGATTTGCCGGTGCTGGAT
GAAAACACGTTCAACAGTCACGCCCCTGCGCCATCGTCACGGTGA
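
The GC content field can in principle be recomputed from the gene sequence above. This is a hedged sketch with a hypothetical helper name (gc_content); it assumes the sequence contains only A, C, G, and T and that line breaks come from the 80-column wrapping.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = "".join(seq.split()).upper()  # remove line breaks and whitespace
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


if __name__ == "__main__":
    # First ten bases of the gene sequence: 6 of 10 are G or C.
    print(gc_content("ATGGACTGCG"))  # 60.0
```

Passing the full 1965-base sequence to the same function should give a value comparable to the one reported in the GC content field.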

Upstream 100 bases:

>100_bases
TTAAGTTTACTGCTAACTCAACAAAGTGCTCAGTTTACAATTCGTCGCAACGGCGTACCCCGCTTGATAAATGTTTCCGT
CGGGGAACTTACAGGAATGA

Downstream 100 bases:

>100_bases
GGCATTCATATGAGAATTCACTCACCGTACCCCGCCAGTTGGGCGCTGGCACAACGAATTGGTTATCTCTATTCAGAGGG
CGAGATTATTTATCTCGCCG

Product: general secretion pathway protein D

Products: NA

Alternate protein names: NA

Number of amino acids: Translated: 654; Mature: 654
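
The residue count follows from the gene length: 1965 bases form 655 codons, and the final codon (TGA) is a stop, leaving 654 residues. The sketch below illustrates this with the standard genetic code; the names CODON_TABLE and translate are illustrative, and the code assumes an in-frame sequence containing only A, C, G, and T.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]  # KeyError on non-ACGT symbols
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGACTGCGTCATGAAAGGA"))  # MDCVMKG (start of the protein)
    print(translate("TCACGGTGA"))              # SR (last two residues, then stop)
    print(1965 // 3 - 1)                       # 654
```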

Protein sequence:

>654_residues
MDCVMKGLNKITCCLLAALLMPCAGHAENEQYGANFNNADIRQFVEIVGQHLGKTILIDPSVQGTISVRSNDTFSQQEYY
QFFLSILDLYGYSVITLDNGFLRVVRSANVKTSPGMIADSSRPGVGDELVTRIVPLENVPARDLAPLLRQMMDAGSVGNV
VHYEPSNVLILTGRASTINKLIEVIKRVDVIGTEKQQIIHLEYASAEDLAEILNQLISESHGKSQMPALLSARIVADKRT
NSLIISGPEKARQRITSLLKSLDVEESEEGNTRVYYLKYAKATNLVEVLTGVSEKLKDEKGNARKPSSSGAMDNVAITAD
EQTNSLVITADQSVQEKLATVIARLDIRRAQVLVEAIIVEVQDGNGLNLGVQWANKNVGAQQFTNTGLPIFNAAQGVADY
KKNGGITSANPAWDMFSAYNGMAAGFFNGDWGVLLTALASNNKNDILATPSIVTLDNKLASFNVGQDVPVLSGSQTTSGD
NVFNTVERKTVGTKLKVTPQVNEGDAVLLEIEQEVSSVDSSSNSTLGPTFNTRTIQNAVLVKTGETVVLGGLLDDFSKEQ
VSKVPLLGDIPLVGQLFRYTSTERAKRNLMVFIRPTIIRDDDVYRSLSKEKYTRYLQEQQQRIDGKSKALVGSEDLPVLD
ENTFNSHAPAPSSR

Sequences:

>Translated_654_residues
MDCVMKGLNKITCCLLAALLMPCAGHAENEQYGANFNNADIRQFVEIVGQHLGKTILIDPSVQGTISVRSNDTFSQQEYY
QFFLSILDLYGYSVITLDNGFLRVVRSANVKTSPGMIADSSRPGVGDELVTRIVPLENVPARDLAPLLRQMMDAGSVGNV
VHYEPSNVLILTGRASTINKLIEVIKRVDVIGTEKQQIIHLEYASAEDLAEILNQLISESHGKSQMPALLSARIVADKRT
NSLIISGPEKARQRITSLLKSLDVEESEEGNTRVYYLKYAKATNLVEVLTGVSEKLKDEKGNARKPSSSGAMDNVAITAD
EQTNSLVITADQSVQEKLATVIARLDIRRAQVLVEAIIVEVQDGNGLNLGVQWANKNVGAQQFTNTGLPIFNAAQGVADY
KKNGGITSANPAWDMFSAYNGMAAGFFNGDWGVLLTALASNNKNDILATPSIVTLDNKLASFNVGQDVPVLSGSQTTSGD
NVFNTVERKTVGTKLKVTPQVNEGDAVLLEIEQEVSSVDSSSNSTLGPTFNTRTIQNAVLVKTGETVVLGGLLDDFSKEQ
VSKVPLLGDIPLVGQLFRYTSTERAKRNLMVFIRPTIIRDDDVYRSLSKEKYTRYLQEQQQRIDGKSKALVGSEDLPVLD
ENTFNSHAPAPSSR
>Mature_654_residues
MDCVMKGLNKITCCLLAALLMPCAGHAENEQYGANFNNADIRQFVEIVGQHLGKTILIDPSVQGTISVRSNDTFSQQEYY
QFFLSILDLYGYSVITLDNGFLRVVRSANVKTSPGMIADSSRPGVGDELVTRIVPLENVPARDLAPLLRQMMDAGSVGNV
VHYEPSNVLILTGRASTINKLIEVIKRVDVIGTEKQQIIHLEYASAEDLAEILNQLISESHGKSQMPALLSARIVADKRT
NSLIISGPEKARQRITSLLKSLDVEESEEGNTRVYYLKYAKATNLVEVLTGVSEKLKDEKGNARKPSSSGAMDNVAITAD
EQTNSLVITADQSVQEKLATVIARLDIRRAQVLVEAIIVEVQDGNGLNLGVQWANKNVGAQQFTNTGLPIFNAAQGVADY
KKNGGITSANPAWDMFSAYNGMAAGFFNGDWGVLLTALASNNKNDILATPSIVTLDNKLASFNVGQDVPVLSGSQTTSGD
NVFNTVERKTVGTKLKVTPQVNEGDAVLLEIEQEVSSVDSSSNSTLGPTFNTRTIQNAVLVKTGETVVLGGLLDDFSKEQ
VSKVPLLGDIPLVGQLFRYTSTERAKRNLMVFIRPTIIRDDDVYRSLSKEKYTRYLQEQQQRIDGKSKALVGSEDLPVLD
ENTFNSHAPAPSSR

Specific function: Involved in a general secretion pathway (GSP) for the export of proteins [H]

COG id: COG1450

COG function: function code NU; Type II secretory pathway, component PulD

Gene ontology:

Cell location: Cell outer membrane (Probable) [H]

Metabolic importance: Unknown [C]

Operon status: Not Known

Operon components: None

Similarity: Belongs to the GSP D family [H]

Homologues:

Organism=Escherichia coli, GI87082242, Length=650, Percent_Identity=99.5384615384615, Blast_Score=1321, Evalue=0.0,
Organism=Escherichia coli, GI1789793, Length=334, Percent_Identity=29.0419161676647, Blast_Score=102, Evalue=6e-23,
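
Each homologue entry above is a comma-separated list of key=value fields, with the GI number given as a bare "GI..." token. A minimal parsing sketch follows; parse_homologue is a hypothetical helper, and it assumes that no field value itself contains a comma.

```python
def parse_homologue(line: str) -> dict:
    """Split one homologue entry into a dictionary of string fields."""
    record = {}
    for field in line.split(","):
        field = field.strip()
        if not field:
            continue  # skip the empty field left by the trailing comma
        if "=" in field:
            key, value = field.split("=", 1)
            record[key.strip()] = value.strip()
        elif field.startswith("GI"):
            record["GI"] = field[2:]  # bare token such as "GI87082242"
    return record


if __name__ == "__main__":
    entry = ("Organism=Escherichia coli, GI87082242, Length=650, "
             "Percent_Identity=99.5384615384615, Blast_Score=1321, Evalue=0.0,")
    hit = parse_homologue(entry)
    print(hit["Organism"], hit["GI"], hit["Length"])
    print(round(float(hit["Percent_Identity"]), 2))
```

Values are kept as strings so that fields such as Evalue=6e-23 can be converted by the caller as needed.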

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR001775
- InterPro:   IPR005644
- InterPro:   IPR004846
- InterPro:   IPR013356
- InterPro:   IPR004845 [H]

Pfam domain/function: PF00263 Secretin; PF03958 Secretin_N [H]

EC number: NA

Molecular weight: Translated: 71161; Mature: 71161

Theoretical pI: Translated: 5.13; Mature: 5.13

Prosite motif: PS00875 T2SP_D

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

0.6 %Cys     (Translated Protein)
1.7 %Met     (Translated Protein)
2.3 %Cys+Met (Translated Protein)
0.6 %Cys     (Mature Protein)
1.7 %Met     (Mature Protein)
2.3 %Cys+Met (Mature Protein)
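
The Cys/Met percentages above are residue counts divided by the protein length. A small sketch of that calculation is given here; residue_percent is an illustrative name, and rounding to one decimal place is an assumption based on the precision shown in this record.

```python
def residue_percent(protein: str, residues: str) -> float:
    """Percentage of the protein made up of the given one-letter residues."""
    protein = "".join(protein.split()).upper()
    if not protein:
        raise ValueError("empty sequence")
    hits = sum(protein.count(r) for r in set(residues.upper()))
    return round(100.0 * hits / len(protein), 1)


if __name__ == "__main__":
    fragment = "MDCVMKGLNK"  # first ten residues of the protein sequence
    print(residue_percent(fragment, "C"))   # 10.0
    print(residue_percent(fragment, "M"))   # 20.0
    print(residue_percent(fragment, "CM"))  # 30.0
```

Because the translated and mature sequences in this record are identical, both give the same percentages.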

Secondary structure:

>Translated Secondary Structure
MDCVMKGLNKITCCLLAALLMPCAGHAENEQYGANFNNADIRQFVEIVGQHLGKTILIDP
CCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCEEEECC
SVQGTISVRSNDTFSQQEYYQFFLSILDLYGYSVITLDNGFLRVVRSANVKTSPGMIADS
CCCEEEEECCCCCCCHHHHHHHHHHHHHHHCCEEEEECCHHHHHHHHCCCCCCCCEEECC
SRPGVGDELVTRIVPLENVPARDLAPLLRQMMDAGSVGNVVHYEPSNVLILTGRASTINK
CCCCCCHHHHHHHCCCCCCCHHHHHHHHHHHHCCCCCCCEEEECCCCEEEEECCHHHHHH
LIEVIKRVDVIGTEKQQIIHLEYASAEDLAEILNQLISESHGKSQMPALLSARIVADKRT
HHHHHHHHHHCCCCCCEEEEEEECCHHHHHHHHHHHHHCCCCCCCCCHHHHHHHEECCCC
NSLIISGPEKARQRITSLLKSLDVEESEEGNTRVYYLKYAKATNLVEVLTGVSEKLKDEK
CEEEEECHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEECCHHHHHHHHHHHHHHHHCCC
GNARKPSSSGAMDNVAITADEQTNSLVITADQSVQEKLATVIARLDIRRAQVLVEAIIVE
CCCCCCCCCCCCCCEEEEECCCCCEEEEEECHHHHHHHHHHHHHHHHHHHHHHHHHHEEE
VQDGNGLNLGVQWANKNVGAQQFTNTGLPIFNAAQGVADYKKNGGITSANPAWDMFSAYN
EECCCCCEEEEEECCCCCCHHHHCCCCCCHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHC
GMAAGFFNGDWGVLLTALASNNKNDILATPSIVTLDNKLASFNVGQDVPVLSGSQTTSGD
CCEEEEECCCHHHHHHHHHCCCCCCEEECCCEEEECCCHHCCCCCCCCCCCCCCCCCCCC
NVFNTVERKTVGTKLKVTPQVNEGDAVLLEIEQEVSSVDSSSNSTLGPTFNTRTIQNAVL
HHHHHHHHHHCCCEEEEECCCCCCCEEEEEEHHHHHHCCCCCCCCCCCCCCCCCCCCEEE
VKTGETVVLGGLLDDFSKEQVSKVPLLGDIPLVGQLFRYTSTERAKRNLMVFIRPTIIRD
EECCCEEEECHHHHHHHHHHHHCCCCCCCCHHHHHHHHHCCHHHHCCCEEEEEECEEECC
DDVYRSLSKEKYTRYLQEQQQRIDGKSKALVGSEDLPVLDENTFNSHAPAPSSR
HHHHHHHHHHHHHHHHHHHHHHCCCCCCEEECCCCCCCCCCCCCCCCCCCCCCC
>Mature Secondary Structure
MDCVMKGLNKITCCLLAALLMPCAGHAENEQYGANFNNADIRQFVEIVGQHLGKTILIDP
CCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCEEEECC
SVQGTISVRSNDTFSQQEYYQFFLSILDLYGYSVITLDNGFLRVVRSANVKTSPGMIADS
CCCEEEEECCCCCCCHHHHHHHHHHHHHHHCCEEEEECCHHHHHHHHCCCCCCCCEEECC
SRPGVGDELVTRIVPLENVPARDLAPLLRQMMDAGSVGNVVHYEPSNVLILTGRASTINK
CCCCCCHHHHHHHCCCCCCCHHHHHHHHHHHHCCCCCCCEEEECCCCEEEEECCHHHHHH
LIEVIKRVDVIGTEKQQIIHLEYASAEDLAEILNQLISESHGKSQMPALLSARIVADKRT
HHHHHHHHHHCCCCCCEEEEEEECCHHHHHHHHHHHHHCCCCCCCCCHHHHHHHEECCCC
NSLIISGPEKARQRITSLLKSLDVEESEEGNTRVYYLKYAKATNLVEVLTGVSEKLKDEK
CEEEEECHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEECCHHHHHHHHHHHHHHHHCCC
GNARKPSSSGAMDNVAITADEQTNSLVITADQSVQEKLATVIARLDIRRAQVLVEAIIVE
CCCCCCCCCCCCCCEEEEECCCCCEEEEEECHHHHHHHHHHHHHHHHHHHHHHHHHHEEE
VQDGNGLNLGVQWANKNVGAQQFTNTGLPIFNAAQGVADYKKNGGITSANPAWDMFSAYN
EECCCCCEEEEEECCCCCCHHHHCCCCCCHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHC
GMAAGFFNGDWGVLLTALASNNKNDILATPSIVTLDNKLASFNVGQDVPVLSGSQTTSGD
CCEEEEECCCHHHHHHHHHCCCCCCEEECCCEEEECCCHHCCCCCCCCCCCCCCCCCCCC
NVFNTVERKTVGTKLKVTPQVNEGDAVLLEIEQEVSSVDSSSNSTLGPTFNTRTIQNAVL
HHHHHHHHHHCCCEEEEECCCCCCCEEEEEEHHHHHHCCCCCCCCCCCCCCCCCCCCEEE
VKTGETVVLGGLLDDFSKEQVSKVPLLGDIPLVGQLFRYTSTERAKRNLMVFIRPTIIRD
EECCCEEEECHHHHHHHHHHHHCCCCCCCCHHHHHHHHHCCHHHHCCCEEEEEECEEECC
DDVYRSLSKEKYTRYLQEQQQRIDGKSKALVGSEDLPVLDENTFNSHAPAPSSR
HHHHHHHHHHHHHHHHHHHHHHCCCCCCEEECCCCCCCCCCCCCCCCCCCCCCC
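
The secondary structure blocks above pair each 60-residue sequence line with a state line of equal length. Assuming the conventional three-state reading (H = helix, E = strand, C = coil), which the record itself does not define, the composition of a state string can be summarised as sketched below; structure_composition is a hypothetical helper.

```python
from collections import Counter


def structure_composition(states: str) -> dict:
    """Percentage of each secondary-structure state in a prediction string."""
    states = "".join(states.split()).upper()
    if not states:
        raise ValueError("empty state string")
    counts = Counter(states)
    return {s: round(100.0 * n / len(states), 1) for s, n in counts.items()}


if __name__ == "__main__":
    # Toy state string: 2 coil, 4 helix, 2 strand positions.
    print(structure_composition("CCHHHHEE"))
```

Concatenating the state lines of one block and passing them to the function gives the overall helix, strand, and coil fractions, which can be compared against the Structure class field.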

PDB accession: NA

Resolution: NA

Structure class: Alpha Beta

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 6.0

TargetDB status: NA

Availability: NA

References: 9278503 [H]