Definition Rhodopseudomonas palustris HaA2, complete genome.
Accession NC_007778
Length 5,331,656

The map label for this gene is clpA [C]

Identifier: 86749516

GI number: 86749516

Start: 2749042

End: 2751447

Strand: Direct

Name: clpA [C]

Synonym: RPB_2396

Alternate gene names: 86749516

Gene position: 2749042-2751447 (Clockwise)

Preceding gene: 86749515

Following gene: 86749519

Centisome position: 51.56
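
The gene length and centisome position can be recomputed from the coordinates listed above. This is a minimal sketch, assuming 1-based inclusive coordinates and assuming the centisome is the start coordinate expressed as a percentage of the genome length; the function names are illustrative and not part of the source database.

```python
def gene_length(start: int, end: int) -> int:
    """Length in bases of a feature with 1-based inclusive coordinates."""
    return end - start + 1


def centisome(start: int, genome_length: int) -> float:
    """Start coordinate as a percentage of the genome length (assumed definition)."""
    return 100.0 * start / genome_length


# Values taken from this record.
print(gene_length(2749042, 2751447))          # 2406
print(round(centisome(2749042, 5331656), 2))  # 51.56
```

Both results agree with the 2406-base gene sequence and the centisome position reported in this record.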

GC content: 64.01

Gene sequence:

>2406_bases
ATGCCGACATTTTCGCAAAGCCTTGAGCAATCCCTCCACCGCGCGCTCGCGATCGCCAACGAGCGACATCATCAATACGC
CACGCTCGAGCATCTCCTGCTCTCTCTGGTCGACGATTCGGATGCGGCCGCGGTGATGCGCGCCTGCAGCGTCGATCTCG
ACAAGCTGCGCGCGAGCCTGGTGAACTACCTCGAAACCGAGTTCGAGAATCTCGTCACCGACGGTTCCGAAGACGCCAAG
CCGACCGCCGGTTTCCAGCGCGTGATCCAGCGCGCGGTGATCCATGTGCAGTCGTCGGGGCGGGAAGAGGTGACCGGCGC
CAATGTGCTGATCGCGATCTTCGCCGAACGCGAGAGCCACGCCGCGTACTTCCTGCAGGAGCAGGACATGACGCGCTACG
ACGCGGTCAACTACATCAGCCACGGCATCGCCAAGCGCCCCGGCGTGTCCGAGGCGCGGCCGGTGCGCGGCGTCGACGAA
GAGACCGAGACCAAGAGCGGCGAAGATTCCAAGAAGAAGGGCGACGCGCTCGAGACCTATTGCGTCAATCTCAACAAGAA
GGCGCGCGACGGTAAGATCGATCCGGTGATCGGGCGCAACGCCGAAATCAACCGCGCCATCCAGGTGCTGTGCCGCCGGC
AGAAGAACAACCCGCTGTTCGTCGGCGAAGCCGGCGTCGGCAAGACCGCGATCGCCGAAGGCCTCGCCAAGCGCATCGTC
GACTCGGAAGTGCCGGAAGTGCTCGCCGCAGCCACCGTGTTCTCGCTCGACATGGGCACGCTGCTCGCCGGCACGCGCTA
TCGCGGCGACTTCGAGGAGCGTCTGAAGCAGGTGCTCAAGGAGCTCGAAGCGCATCCGAACGCGATCCTGTTCATTGACG
AGATCCACACCGTGATCGGCGCCGGCGCGACCTCCGGTGGCGCGATGGATGCCTCGAATCTACTCAAGCCTGCCTTGGCT
TCAGGCACGATCCGCTGCATGGGCTCGACGACTTACAAAGAATACCGGCAGCACTTCGAGAAGGACCGCGCGCTGGTGCG
GCGTTTCCAGAAGATCGACGTCAACGAGCCGACGGTCGAAGACGCGATCGCGATCCTCAAGGGCCTCAAGCCGTATTTCG
AGGACTATCACAAGCTCAAATACACCAACGAGGCGATCGAATCGGCCGTCGAACTGTCGTCGCGCTACATCCATGACCGG
AAGTTGCCGGACAAGGCCATCGACGTGATCGACGAATCCGGCGCGGCGCAGATGCTGGTGGCGGAGAACAAGCGCAAGAA
GACCATCGGCATCAAGGAAATCGAGGCCACGGTCGCGACCATGGCGCGGATCCCGCCGAAGAGCGTGTCGAAGGATGATG
CAGAGGTGCTGAAGCACCTCGAAACCACCCTGAAGCGCGTGGTGTTCGGCCAGGACAAGGCGATCGAGTCGCTGTCCGCC
TCCATCAAGCTCGCCCGTGCGGGCCTGCGCGAACCGGAGAAGCCGATCGGCTGCTACCTGTTTTCGGGCCCGACCGGCGT
CGGCAAGACCGAGGTTGCCAAGCAACTGGCCTCGACGCTCGGCGTCGAGCTGCTGCGCTTCGACATGTCGGAATACATGG
AGCGCCACACCGTCTCGCGCCTGATCGGCGCGCCGCCCGGCTATGTCGGCTTCGATCAGGGCGGCCTGCTGACCGACGGC
GTCGACCAGCATCCGCATTGCGTCGTGCTGCTCGACGAAATCGAGAAGGCGCATCCCGATCTGTACAACGTGCTGCTGCA
GATCATGGATCACGGCCGGCTGACCGACCACAACGGCAAGCAGGTCAATTTCCGCAACGTCATCCTGATCATGACGACCA
ATGCCGGCGCCGCCGATCTGGCGCGGCAGGCATTCGGCTTCACCCGCAACAAGCGGGAAGGCGACGACCACGAGGCGATC
AACCGGCAGTTCGCGCCGGAGTTCCGCAACCGTCTCGATGCGATCGTGTCGTTCGCGCATCTCAATGCCGATGTCATCGG
CATGGTGGTCGAGAAGTTCGTGCTGCAGCTCGAGGCGCAACTCGCCGATCGCGACGTCACCATCGAGCTGACCGACGAGG
CCAAGGCCTGGCTGGTGCAGCACGGCTATGACGAGCAGATGGGCGCGCGGCCGATGTCGCGGGTGATCCAGGAGCACATC
AAGAAGCCGTTGGCCGACGAGGTGCTGTTCGGCCAGCTCAAGGGCGGTGGCCACGTCAAGGTGGTGCTGGTCAAGGACGA
GGCGGTCGCCGGCGTCGAGCTGGAGAAGATCGCCTTCGAATTCCTCGACGGCCCGGTGACGCCGAAGCCCGAAAATCTGC
CCAACGCGCGCAAGCGCGGCGCGGCCCGCAAGCCCAAGCCGGGCGGCCCGAAGGGATCGGCGAAGGATCCCTTGGTCAAG
GCCTGA
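
The GC content reported for this gene can be checked against the sequence above. This is a minimal sketch, assuming GC content is the percentage of G and C bases over all bases in the sequence; `gc_content` is an illustrative name, and the short input below is a toy sequence rather than the gene itself.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases; whitespace and line breaks are ignored."""
    seq = "".join(seq.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Toy input: 6 of the 8 bases are G or C.
print(gc_content("ATGC\nGGCC"))  # 75.0
```

Pasting the full 2406-base sequence into the call should give a value comparable to the GC content field, subject to how the database rounds.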

Upstream 100 bases:

>100_bases
CGGGGCTTTCGTCGGCGGGTTGCGACACTATATGTCGAAAGGTTGTTGTTGCGCGACGTTGCAGCGGCGATCATGATGAG
CGGGGGCCACAGAGGACGAG

Downstream 100 bases:

>100_bases
GCGGACTACGCTTGATCAATTGAACGAGGCCGGCGAGCGATCGCCGGCCTTTTTCGTTTGGGGCGAGTCGTCATCCTGAG
GTGCGAGCACAGCGAGCCTC
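
The upstream and downstream windows can be sliced out of the genome sequence by coordinate. This is a minimal sketch, assuming a gene on the direct strand, 1-based inclusive coordinates, and a genome held as a plain string; it does not wrap around the origin of a circular chromosome, and `flanks` is an illustrative name.

```python
from typing import Tuple


def flanks(genome: str, start: int, end: int, n: int = 100) -> Tuple[str, str]:
    """Return the n bases before `start` and the n bases after `end`."""
    upstream = genome[max(0, start - 1 - n):start - 1]
    downstream = genome[end:end + n]
    return upstream, downstream


# Toy genome: the feature occupies positions 4..9 (CCCGGG).
print(flanks("AAACCCGGGTTT", 4, 9, n=3))  # ('AAA', 'TTT')
```

For a gene on the complementary strand, the two windows would be swapped and reverse-complemented.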

Product: AAA_5 ATPase

Products: NA

Alternate protein names: NA

Number of amino acids: Translated: 801; Mature: 800
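
The two residue counts follow from the length of the open reading frame. This is a minimal sketch, assuming the gene length includes one stop codon and that the mature form differs from the translated form only by removal of the initiator methionine; `residue_counts` is an illustrative name.

```python
from typing import Tuple


def residue_counts(orf_bases: int) -> Tuple[int, int]:
    """Return (translated, mature) residue counts for an ORF length in bases."""
    if orf_bases % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    translated = orf_bases // 3 - 1  # the stop codon encodes no residue
    mature = translated - 1          # assumes the initiator Met is removed
    return translated, mature


print(residue_counts(2406))  # (801, 800)
```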

Protein sequence:

>801_residues
MPTFSQSLEQSLHRALAIANERHHQYATLEHLLLSLVDDSDAAAVMRACSVDLDKLRASLVNYLETEFENLVTDGSEDAK
PTAGFQRVIQRAVIHVQSSGREEVTGANVLIAIFAERESHAAYFLQEQDMTRYDAVNYISHGIAKRPGVSEARPVRGVDE
ETETKSGEDSKKKGDALETYCVNLNKKARDGKIDPVIGRNAEINRAIQVLCRRQKNNPLFVGEAGVGKTAIAEGLAKRIV
DSEVPEVLAAATVFSLDMGTLLAGTRYRGDFEERLKQVLKELEAHPNAILFIDEIHTVIGAGATSGGAMDASNLLKPALA
SGTIRCMGSTTYKEYRQHFEKDRALVRRFQKIDVNEPTVEDAIAILKGLKPYFEDYHKLKYTNEAIESAVELSSRYIHDR
KLPDKAIDVIDESGAAQMLVAENKRKKTIGIKEIEATVATMARIPPKSVSKDDAEVLKHLETTLKRVVFGQDKAIESLSA
SIKLARAGLREPEKPIGCYLFSGPTGVGKTEVAKQLASTLGVELLRFDMSEYMERHTVSRLIGAPPGYVGFDQGGLLTDG
VDQHPHCVVLLDEIEKAHPDLYNVLLQIMDHGRLTDHNGKQVNFRNVILIMTTNAGAADLARQAFGFTRNKREGDDHEAI
NRQFAPEFRNRLDAIVSFAHLNADVIGMVVEKFVLQLEAQLADRDVTIELTDEAKAWLVQHGYDEQMGARPMSRVIQEHI
KKPLADEVLFGQLKGGGHVKVVLVKDEAVAGVELEKIAFEFLDGPVTPKPENLPNARKRGAARKPKPGGPKGSAKDPLVK
A
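
The protein sequence above is the conceptual translation of the gene sequence. This is a minimal sketch of that translation, assuming the standard codon assignments (the bacterial table shares them and differs in permitted start codons); `translate` and `CODON_TABLE` are illustrative names, and unknown codons are rendered as X.

```python
from itertools import product

BASES = "TCAG"
# Amino acids in the conventional TCAG codon order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the gene sequence in this record.
print(translate("ATGCCGACATTTTCGCAAAGCCTTGAGCAA"))  # MPTFSQSLEQ
```

The output matches the first ten residues of the protein sequence, and the final six bases of the gene (GCCTGA) translate to the C-terminal alanine followed by a stop.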

Sequences:

>Translated_801_residues
MPTFSQSLEQSLHRALAIANERHHQYATLEHLLLSLVDDSDAAAVMRACSVDLDKLRASLVNYLETEFENLVTDGSEDAK
PTAGFQRVIQRAVIHVQSSGREEVTGANVLIAIFAERESHAAYFLQEQDMTRYDAVNYISHGIAKRPGVSEARPVRGVDE
ETETKSGEDSKKKGDALETYCVNLNKKARDGKIDPVIGRNAEINRAIQVLCRRQKNNPLFVGEAGVGKTAIAEGLAKRIV
DSEVPEVLAAATVFSLDMGTLLAGTRYRGDFEERLKQVLKELEAHPNAILFIDEIHTVIGAGATSGGAMDASNLLKPALA
SGTIRCMGSTTYKEYRQHFEKDRALVRRFQKIDVNEPTVEDAIAILKGLKPYFEDYHKLKYTNEAIESAVELSSRYIHDR
KLPDKAIDVIDESGAAQMLVAENKRKKTIGIKEIEATVATMARIPPKSVSKDDAEVLKHLETTLKRVVFGQDKAIESLSA
SIKLARAGLREPEKPIGCYLFSGPTGVGKTEVAKQLASTLGVELLRFDMSEYMERHTVSRLIGAPPGYVGFDQGGLLTDG
VDQHPHCVVLLDEIEKAHPDLYNVLLQIMDHGRLTDHNGKQVNFRNVILIMTTNAGAADLARQAFGFTRNKREGDDHEAI
NRQFAPEFRNRLDAIVSFAHLNADVIGMVVEKFVLQLEAQLADRDVTIELTDEAKAWLVQHGYDEQMGARPMSRVIQEHI
KKPLADEVLFGQLKGGGHVKVVLVKDEAVAGVELEKIAFEFLDGPVTPKPENLPNARKRGAARKPKPGGPKGSAKDPLVK
A
>Mature_800_residues
PTFSQSLEQSLHRALAIANERHHQYATLEHLLLSLVDDSDAAAVMRACSVDLDKLRASLVNYLETEFENLVTDGSEDAKP
TAGFQRVIQRAVIHVQSSGREEVTGANVLIAIFAERESHAAYFLQEQDMTRYDAVNYISHGIAKRPGVSEARPVRGVDEE
TETKSGEDSKKKGDALETYCVNLNKKARDGKIDPVIGRNAEINRAIQVLCRRQKNNPLFVGEAGVGKTAIAEGLAKRIVD
SEVPEVLAAATVFSLDMGTLLAGTRYRGDFEERLKQVLKELEAHPNAILFIDEIHTVIGAGATSGGAMDASNLLKPALAS
GTIRCMGSTTYKEYRQHFEKDRALVRRFQKIDVNEPTVEDAIAILKGLKPYFEDYHKLKYTNEAIESAVELSSRYIHDRK
LPDKAIDVIDESGAAQMLVAENKRKKTIGIKEIEATVATMARIPPKSVSKDDAEVLKHLETTLKRVVFGQDKAIESLSAS
IKLARAGLREPEKPIGCYLFSGPTGVGKTEVAKQLASTLGVELLRFDMSEYMERHTVSRLIGAPPGYVGFDQGGLLTDGV
DQHPHCVVLLDEIEKAHPDLYNVLLQIMDHGRLTDHNGKQVNFRNVILIMTTNAGAADLARQAFGFTRNKREGDDHEAIN
RQFAPEFRNRLDAIVSFAHLNADVIGMVVEKFVLQLEAQLADRDVTIELTDEAKAWLVQHGYDEQMGARPMSRVIQEHIK
KPLADEVLFGQLKGGGHVKVVLVKDEAVAGVELEKIAFEFLDGPVTPKPENLPNARKRGAARKPKPGGPKGSAKDPLVKA

Specific function: ATP-dependent specificity component of the ClpP protease. It directs the protease to specific substrates. The primary function of the ClpA-ClpP complex appears to be the degradation of unfolded or abnormal proteins. [C]

COG id: COG0542

COG function: function code O; ATPases with chaperone activity, ATP-binding subunit

Gene ontology:

Cell location: Cytoplasm [C]

Metabolic importance: Non_Essential [C]

Operon status: Not Known

Operon components: None

Similarity: Belongs to the clpA/clpB family [H]

Homologues:

Organism=Homo sapiens, GI13540606, Length=354, Percent_Identity=31.638418079096, Blast_Score=164, Evalue=2e-40
Organism=Escherichia coli, GI1787109, Length=745, Percent_Identity=58.1208053691275, Blast_Score=896, Evalue=0.0
Organism=Escherichia coli, GI1788943, Length=258, Percent_Identity=57.3643410852713, Blast_Score=295, Evalue=9e-81
Organism=Saccharomyces cerevisiae, GI6320464, Length=328, Percent_Identity=44.5121951219512, Blast_Score=276, Evalue=1e-74
Organism=Saccharomyces cerevisiae, GI6323002, Length=306, Percent_Identity=43.7908496732026, Blast_Score=251, Evalue=3e-67
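
The homologue entries above are comma-separated key=value fields, which makes them easy to load for sorting or filtering. This is a minimal sketch, assuming organism names contain no commas and that the bare `GI…` token is the only field without a key; `parse_homologue` is an illustrative name.

```python
def parse_homologue(line: str) -> dict:
    """Parse one homologue line into a dict with typed numeric fields."""
    record = {}
    for field in line.strip().rstrip(",").split(","):
        field = field.strip()
        if "=" in field:
            key, value = field.split("=", 1)
            record[key] = value
        elif field.startswith("GI"):
            record["GI"] = field[2:]  # keep the identifier as a string
    for key in ("Length", "Blast_Score"):
        record[key] = int(record[key])
    for key in ("Percent_Identity", "Evalue"):
        record[key] = float(record[key])
    return record


hit = parse_homologue(
    "Organism=Escherichia coli, GI1787109, Length=745, "
    "Percent_Identity=58.1208053691275, Blast_Score=896, Evalue=0.0,"
)
print(hit["Organism"], hit["GI"], round(hit["Percent_Identity"], 2))
```

A trailing comma on the line is tolerated, so the function works on the entries with or without it.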

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR003593
- InterPro:   IPR013093
- InterPro:   IPR003959
- InterPro:   IPR018368
- InterPro:   IPR001270
- InterPro:   IPR019489
- InterPro:   IPR004176
- InterPro:   IPR013461
- InterPro:   IPR023150 [H]

Pfam domain/function: PF00004 AAA; PF07724 AAA_2; PF02861 Clp_N; PF10431 ClpB_D2-small [H]

EC number: NA

Molecular weight: Translated: 88227; Mature: 88095
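
A molecular weight of this kind is typically the sum of residue masses plus one water. This is a minimal sketch, assuming average (not monoisotopic) residue masses; the mass table the database actually used is not stated, so results may differ slightly from the reported values. `molecular_weight` is an illustrative name.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water for the free N- and C-termini


def molecular_weight(protein: str) -> float:
    """Average molecular weight of an unmodified linear peptide."""
    protein = "".join(protein.split()).upper()
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in protein) + WATER


# Removing an N-terminal Met lowers the weight by one Met residue mass.
print(round(molecular_weight("MA") - molecular_weight("A"), 4))
```

The 132 Da difference between the reported translated and mature weights is close to the mass of one methionine residue, consistent with removal of the initiator Met.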

Theoretical pI: Translated: 6.59; Mature: 6.59

Prosite motif: PS00870 CLPAB_1; PS00871 CLPAB_2

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

0.7 %Cys     (Translated Protein)
1.9 %Met     (Translated Protein)
2.6 %Cys+Met (Translated Protein)
0.8 %Cys     (Mature Protein)
1.8 %Met     (Mature Protein)
2.5 %Cys+Met (Mature Protein)
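
The percentages above are simple residue frequencies. This is a minimal sketch, assuming each value is the count of the named residues divided by the sequence length; `percent_residues` is an illustrative name and the input shown is a toy peptide.

```python
def percent_residues(protein: str, residues: str) -> float:
    """Percentage of positions occupied by any one-letter code in `residues`."""
    protein = "".join(protein.split()).upper()
    if not protein:
        raise ValueError("empty sequence")
    count = sum(protein.count(r) for r in residues.upper())
    return 100.0 * count / len(protein)


# Toy peptide: one Cys and one Met among four residues.
print(percent_residues("MCAA", "C"))   # 25.0
print(percent_residues("MCAA", "CM"))  # 50.0
```

Because the mature sequence lacks the initiator methionine, its Met percentage is slightly lower and its Cys percentage slightly higher than for the translated sequence.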

Secondary structure:

>Translated Secondary Structure
MPTFSQSLEQSLHRALAIANERHHQYATLEHLLLSLVDDSDAAAVMRACSVDLDKLRASL
CCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHCCHHHHHHHH
VNYLETEFENLVTDGSEDAKPTAGFQRVIQRAVIHVQSSGREEVTGANVLIAIFAERESH
HHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEECCCCC
AAYFLQEQDMTRYDAVNYISHGIAKRPGVSEARPVRGVDEETETKSGEDSKKKGDALETY
CEEEEECCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCHHHCCCCCCHHHCCCHHHHH
CVNLNKKARDGKIDPVIGRNAEINRAIQVLCRRQKNNPLFVGEAGVGKTAIAEGLAKRIV
HCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHCCCCCEEEECCCCCHHHHHHHHHHHHH
DSEVPEVLAAATVFSLDMGTLLAGTRYRGDFEERLKQVLKELEAHPNAILFIDEIHTVIG
HHHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCCCCEEEEHHHHHHHHH
AGATSGGAMDASNLLKPALASGTIRCMGSTTYKEYRQHFEKDRALVRRFQKIDVNEPTVE
CCCCCCCCCCHHHHHHHHHHCCCEEECCCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHH
DAIAILKGLKPYFEDYHKLKYTNEAIESAVELSSRYIHDRKLPDKAIDVIDESGAAQMLV
HHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHCCCCCCEEHH
AENKRKKTIGIKEIEATVATMARIPPKSVSKDDAEVLKHLETTLKRVVFGQDKAIESLSA
HCCCCHHCCCHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHCCCHHHHHHHHH
SIKLARAGLREPEKPIGCYLFSGPTGVGKTEVAKQLASTLGVELLRFDMSEYMERHTVSR
HHHHHHHCCCCCCCCCEEEEECCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
LIGAPPGYVGFDQGGLLTDGVDQHPHCVVLLDEIEKAHPDLYNVLLQIMDHGRLTDHNGK
HHCCCCCCCCCCCCCCCCCCCCCCCCEEEEHHHHHHHCCHHHHHHHHHHHCCCCCCCCCC
QVNFRNVILIMTTNAGAADLARQAFGFTRNKREGDDHEAINRQFAPEFRNRLDAIVSFAH
EEEECEEEEEEECCCCHHHHHHHHHCCCCCCCCCCHHHHHHHHCCHHHHHHHHHHHHHHH
LNADVIGMVVEKFVLQLEAQLADRDVTIELTDEAKAWLVQHGYDEQMGARPMSRVIQEHI
CCHHHHHHHHHHHHHHHHHHHCCCCEEEEECCCHHHHHHHCCCCHHHCCCHHHHHHHHHH
KKPLADEVLFGQLKGGGHVKVVLVKDEAVAGVELEKIAFEFLDGPVTPKPENLPNARKRG
HCCHHHHHHHHHCCCCCEEEEEEEECCCCCCCHHHHHHHHHHCCCCCCCCCCCCCHHHCC
AARKPKPGGPKGSAKDPLVKA
CCCCCCCCCCCCCCCCCCCCC
>Mature Secondary Structure 
PTFSQSLEQSLHRALAIANERHHQYATLEHLLLSLVDDSDAAAVMRACSVDLDKLRASL
CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHCCHHHHHHHH
VNYLETEFENLVTDGSEDAKPTAGFQRVIQRAVIHVQSSGREEVTGANVLIAIFAERESH
HHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHCCCCCCCCCCCEEEEEEECCCCC
AAYFLQEQDMTRYDAVNYISHGIAKRPGVSEARPVRGVDEETETKSGEDSKKKGDALETY
CEEEEECCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCHHHCCCCCCHHHCCCHHHHH
CVNLNKKARDGKIDPVIGRNAEINRAIQVLCRRQKNNPLFVGEAGVGKTAIAEGLAKRIV
HCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHCCCCCEEEECCCCCHHHHHHHHHHHHH
DSEVPEVLAAATVFSLDMGTLLAGTRYRGDFEERLKQVLKELEAHPNAILFIDEIHTVIG
HHHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCCCCEEEEHHHHHHHHH
AGATSGGAMDASNLLKPALASGTIRCMGSTTYKEYRQHFEKDRALVRRFQKIDVNEPTVE
CCCCCCCCCCHHHHHHHHHHCCCEEECCCHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHH
DAIAILKGLKPYFEDYHKLKYTNEAIESAVELSSRYIHDRKLPDKAIDVIDESGAAQMLV
HHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHCCCCCCEEHH
AENKRKKTIGIKEIEATVATMARIPPKSVSKDDAEVLKHLETTLKRVVFGQDKAIESLSA
HCCCCHHCCCHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHCCCHHHHHHHHH
SIKLARAGLREPEKPIGCYLFSGPTGVGKTEVAKQLASTLGVELLRFDMSEYMERHTVSR
HHHHHHHCCCCCCCCCEEEEECCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
LIGAPPGYVGFDQGGLLTDGVDQHPHCVVLLDEIEKAHPDLYNVLLQIMDHGRLTDHNGK
HHCCCCCCCCCCCCCCCCCCCCCCCCEEEEHHHHHHHCCHHHHHHHHHHHCCCCCCCCCC
QVNFRNVILIMTTNAGAADLARQAFGFTRNKREGDDHEAINRQFAPEFRNRLDAIVSFAH
EEEECEEEEEEECCCCHHHHHHHHHCCCCCCCCCCHHHHHHHHCCHHHHHHHHHHHHHHH
LNADVIGMVVEKFVLQLEAQLADRDVTIELTDEAKAWLVQHGYDEQMGARPMSRVIQEHI
CCHHHHHHHHHHHHHHHHHHHCCCCEEEEECCCHHHHHHHCCCCHHHCCCHHHHHHHHHH
KKPLADEVLFGQLKGGGHVKVVLVKDEAVAGVELEKIAFEFLDGPVTPKPENLPNARKRG
HCCHHHHHHHHHCCCCCEEEEEEEECCCCCCCHHHHHHHHHHCCCCCCCCCCCCCHHHCC
AARKPKPGGPKGSAKDPLVKA
CCCCCCCCCCCCCCCCCCCCC
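
The state strings above can be summarised as the share of each predicted state. This is a minimal sketch, assuming the common convention that H is helix, E is strand and C is coil (the record does not define the letters); `ss_fractions` is an illustrative name and the input is a toy string.

```python
from collections import Counter


def ss_fractions(states: str) -> dict:
    """Percentage of H, E and C in a secondary-structure state string."""
    states = "".join(states.split()).upper()
    if not states:
        raise ValueError("empty state string")
    counts = Counter(states)
    return {s: 100.0 * counts.get(s, 0) / len(states) for s in "HEC"}


print(ss_fractions("HHEC"))  # {'H': 50.0, 'E': 25.0, 'C': 25.0}
```

To apply this to the record, concatenate only the state lines (every second line of a block), not the residue lines interleaved with them.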

PDB accession: NA

Resolution: NA

Structure class: Unstructured

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: Hydrolase; Acting on peptide bonds (Peptidases) [C]

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 6209404 [H]