| Definition | Rhodopseudomonas palustris HaA2, complete genome. |
|---|---|
| Accession | NC_007778 |
| Length | 5,331,656 |
The map label for this gene is dcp [H]
Identifier: 86751643
GI number: 86751643
Start: 5139349
End: 5141433
Strand: Reverse
Name: dcp [H]
Synonym: RPB_4545
Alternate gene names: 86751643
Gene position: 5141433-5139349 (Counterclockwise)
Preceding gene: 86751651
Following gene: 86751642
Centisome position: 96.43
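The coordinate fields above can be cross-checked with a little arithmetic. The following is a minimal sketch, not the database's documented method: it assumes coordinates are 1-based and inclusive, and that the centisome position is the gene's 5' end (the End coordinate for this reverse-strand gene) expressed as a percentage of the genome length. Under those assumptions it reproduces the 2085-base gene length and the 96.43 centisome value; the function names are hypothetical.

```python
# Values copied from the record above.
GENOME_LENGTH = 5_331_656      # Length
START, END = 5_139_349, 5_141_433  # Start / End (reverse strand)


def gene_length(start: int, end: int) -> int:
    """Length in bases, assuming 1-based inclusive coordinates."""
    return end - start + 1


def centisome(position: int, genome_length: int) -> float:
    """Position as a percentage of the genome length, rounded to 2 decimals.

    Assumption: the record measures from the gene's 5' end, which for a
    reverse-strand gene is the larger (End) coordinate.
    """
    return round(100 * position / genome_length, 2)


length = gene_length(START, END)
codons = length // 3        # includes the stop codon
residues = codons - 1       # translated protein length without the stop

print(length, codons, residues)
print(centisome(END, GENOME_LENGTH))
```

The residue count derived this way (one less than the number of codons) matches the 694 translated residues reported further down in the record.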
GC content: 68.3
Gene sequence:
>2085_bases ATGTCTGAAAGCTCCGGACCGATTGCCGCGCCCACGGGCAATCCGCTGTTGCAGGCCTGGACCACGCCGTTCGAAACCCC GCCCTTCACCGAGATCGTGCCCGAGCATTTCCTGCCGGCGTTCGAGCGGGCGTTCACCGACCATGCCGCCGAGATCGCCG CGATCGCCAACGATCCGACCGAGCCGGACTTCGCCAACACCATCACGGCGCTGGAGCGCTCCGGCAAGCTGCTCAACCGG GTCGCCGCGGTGTTCTACGACCTGGTCTCGGCGCACTCCAATCCGGCGCTGCTGGAGATCGACAAGGACGTGTCGCTGCG GATGGCGCGACACTGGAATCCGATCATGATGAACGCCGTGCTGTTCGGCCGCATCGCGGCGCTGCACGACAAGCGCGCCG AGCTGAAGCTGACCTCGGAAGAGCGTCGCCTGCTGGAGCGCACCTACACCCGCTTCCACCGCTCCGGCGCCGGCCTCGAC GAGGCCGCGAAGGCGCGGATGGCCGAGATCAACGAGCGGCTGGCGCAGCTCGGCACCAACTTCAGCCACCATCTGCTCGG CGACGAGCAGGACTGGTTCATGGAGATCGGCGAGGGCGATACCGAGGGGCTGCCGGACAGCTTCGTCGCCGCCGCGCGCG CTGCGGCGGACGAGCGTGGCCTGCCCGGCAAGGCGGTGGTGACGCTGTCGCGCTCCTCGGTCGAGCCGTTCCTGAAGATG TCCGGCCGCCGCGATCTGCGCGAGAAGGTCTACCGCGCCTTCATCGCCCGCGGCGACAACGGCAACGCCAACGACAACAA CGCGCTGATCGGCGAGATCCTCGGCCTGCGCGAGGAGAGCGCCAAGCTGCTCGGCTATCCGACCTTCGCGGCCTACCGGC TGGAGGATTCGATGGCCAAGACGCCGGAAGCGGTGCGCGGCCTGCTGGAGCGGGTGTGGAAGCCGGCGCGCGCCCGCGCG ATGGCCGACCGCGACGCGCTGCAGGAGCTGGTCACGGAGGAGGGCGGCAATTTCGAGCTGGCGCCGTGGGACTGGCGCTT CTACGCCGAGAAGCTGCGCCAGCGCCGCGCCAATTTCGACGACGCCGCGATCAAGCCGTATCTGTCGCTCGACAACATGA TCGTCGCGGCCTTCGACACCGCGACCCGGCTGTTCGGCGTCACGTTCGCCGAGCGCAAGGACGTGCCGGTGTGGCACCCG GACGTCCGGGTCTGGGAAGTGAAGGATGCCGACGGCAGCCATCGCGGGCTGTTCTACGGCGATTACTATGCCCGGCCGTC GAAGCGCTCCGGCGCCTGGATGACGTCGCTGCGCGACCAGCAGAAGCTCGACGGCGCGGTGGCGCCGCTGATCATCAATG TCTGCAATTTTTCGAAGGGCGCCGACGGCGAACCGTCGCTGCTGTCGCCCGACGACGCCCGCACGCTGTTCCACGAATTC GGCCACGGCCTGCACGGCATGATGTCGGACGTGACCTATCCGTCGCTGTCCGGCACCAGCGTGTTCACCGATTTCGTCGA ACTGCCCTCGCAGCTCTACGAGCACTGGCAGGAGCGGCCCGAGGTGCTGCGGCGTTTCGCCCGGCACTACCAGACCGGCG AGCCGCTGCCCGACGATCTGCTGCAGCGCTTCATCGCCGCGCGCAAATTCAACCAGGGCTTCGCCACGGTGGAATTCGTG TCCTCGGCGCTGCTCGACCTCGAATTCCACACCCAGCCGGCCGCGAGCGTCGGCGATATCCGCGATTTCGAGCGCCGCGA GCTCGACAAGATCGGCATGCCGGACGAGATCGCGCTGCGCCACCGGCCGACCCAGTTCGGCCACATCTTCTCCGGCGATC ACTACGCCTCGGGCTATTACAGCTACATGTGGTCCGAAGTGATGGACGCCGACGCCTTCGGCGCCTTCGAGGAGGCCGGC 
GACATCTTCGACCCCAAGGTGGCGAAGCGGCTGCGCGACGACATCTACGCCTCGGGCGGCTCACGCGATCCGGAGGAGGC CTATATCGCCTTCCGCGGCCGTGCGCCGGAGCCCGACGCGCTGCTGCGCCGGCGCGGCCTGCTCGAAACCCCGGAGGCGG CGTAG
Upstream 100 bases:
>100_bases CGGGGTGAATAATCCGTGACCGGGCGGCGGTCGGGGCGCAAAAAAGCCCCACCCGGTGCTATATGTTCGCAATTCGCGAC GCCCAACGCAGGAATTCTAG
Downstream 100 bases:
>100_bases CGCCATGCTCGGCTTGCTTCGCCTAGCCGTCATTCGAACCGCGCTCGCGGCCGCGCTGACGCTGTCGGCAGGCGTCGCCG CGGCCCACGCCCATCCGCAC
Product: peptidyl-dipeptidase Dcp
Products: NA
Alternate protein names: Dipeptidyl carboxypeptidase [H]
Number of amino acids: Translated: 694; Mature: 693
Protein sequence:
>694_residues MSESSGPIAAPTGNPLLQAWTTPFETPPFTEIVPEHFLPAFERAFTDHAAEIAAIANDPTEPDFANTITALERSGKLLNR VAAVFYDLVSAHSNPALLEIDKDVSLRMARHWNPIMMNAVLFGRIAALHDKRAELKLTSEERRLLERTYTRFHRSGAGLD EAAKARMAEINERLAQLGTNFSHHLLGDEQDWFMEIGEGDTEGLPDSFVAAARAAADERGLPGKAVVTLSRSSVEPFLKM SGRRDLREKVYRAFIARGDNGNANDNNALIGEILGLREESAKLLGYPTFAAYRLEDSMAKTPEAVRGLLERVWKPARARA MADRDALQELVTEEGGNFELAPWDWRFYAEKLRQRRANFDDAAIKPYLSLDNMIVAAFDTATRLFGVTFAERKDVPVWHP DVRVWEVKDADGSHRGLFYGDYYARPSKRSGAWMTSLRDQQKLDGAVAPLIINVCNFSKGADGEPSLLSPDDARTLFHEF GHGLHGMMSDVTYPSLSGTSVFTDFVELPSQLYEHWQERPEVLRRFARHYQTGEPLPDDLLQRFIAARKFNQGFATVEFV SSALLDLEFHTQPAASVGDIRDFERRELDKIGMPDEIALRHRPTQFGHIFSGDHYASGYYSYMWSEVMDADAFGAFEEAG DIFDPKVAKRLRDDIYASGGSRDPEEAYIAFRGRAPEPDALLRRRGLLETPEAA
Sequences:
>Translated_694_residues MSESSGPIAAPTGNPLLQAWTTPFETPPFTEIVPEHFLPAFERAFTDHAAEIAAIANDPTEPDFANTITALERSGKLLNR VAAVFYDLVSAHSNPALLEIDKDVSLRMARHWNPIMMNAVLFGRIAALHDKRAELKLTSEERRLLERTYTRFHRSGAGLD EAAKARMAEINERLAQLGTNFSHHLLGDEQDWFMEIGEGDTEGLPDSFVAAARAAADERGLPGKAVVTLSRSSVEPFLKM SGRRDLREKVYRAFIARGDNGNANDNNALIGEILGLREESAKLLGYPTFAAYRLEDSMAKTPEAVRGLLERVWKPARARA MADRDALQELVTEEGGNFELAPWDWRFYAEKLRQRRANFDDAAIKPYLSLDNMIVAAFDTATRLFGVTFAERKDVPVWHP DVRVWEVKDADGSHRGLFYGDYYARPSKRSGAWMTSLRDQQKLDGAVAPLIINVCNFSKGADGEPSLLSPDDARTLFHEF GHGLHGMMSDVTYPSLSGTSVFTDFVELPSQLYEHWQERPEVLRRFARHYQTGEPLPDDLLQRFIAARKFNQGFATVEFV SSALLDLEFHTQPAASVGDIRDFERRELDKIGMPDEIALRHRPTQFGHIFSGDHYASGYYSYMWSEVMDADAFGAFEEAG DIFDPKVAKRLRDDIYASGGSRDPEEAYIAFRGRAPEPDALLRRRGLLETPEAA >Mature_693_residues SESSGPIAAPTGNPLLQAWTTPFETPPFTEIVPEHFLPAFERAFTDHAAEIAAIANDPTEPDFANTITALERSGKLLNRV AAVFYDLVSAHSNPALLEIDKDVSLRMARHWNPIMMNAVLFGRIAALHDKRAELKLTSEERRLLERTYTRFHRSGAGLDE AAKARMAEINERLAQLGTNFSHHLLGDEQDWFMEIGEGDTEGLPDSFVAAARAAADERGLPGKAVVTLSRSSVEPFLKMS GRRDLREKVYRAFIARGDNGNANDNNALIGEILGLREESAKLLGYPTFAAYRLEDSMAKTPEAVRGLLERVWKPARARAM ADRDALQELVTEEGGNFELAPWDWRFYAEKLRQRRANFDDAAIKPYLSLDNMIVAAFDTATRLFGVTFAERKDVPVWHPD VRVWEVKDADGSHRGLFYGDYYARPSKRSGAWMTSLRDQQKLDGAVAPLIINVCNFSKGADGEPSLLSPDDARTLFHEFG HGLHGMMSDVTYPSLSGTSVFTDFVELPSQLYEHWQERPEVLRRFARHYQTGEPLPDDLLQRFIAARKFNQGFATVEFVS SALLDLEFHTQPAASVGDIRDFERRELDKIGMPDEIALRHRPTQFGHIFSGDHYASGYYSYMWSEVMDADAFGAFEEAGD IFDPKVAKRLRDDIYASGGSRDPEEAYIAFRGRAPEPDALLRRRGLLETPEAA
Specific function: Removes dipeptides from the C-termini of N-blocked tripeptides, tetrapeptides and larger peptides [H]
COG id: COG0339
COG function: function code E; Zn-dependent oligopeptidases
Gene ontology:
Cell location: Cytoplasm [H]
Metabolic importance: Non_Essential [C]
Operon status: Not Known
Operon components: None
Similarity: Belongs to the peptidase M3 family [H]
Homologues:

| Organism | GI number | Length | Percent identity | BLAST score | E-value |
|---|---|---|---|---|---|
| Homo sapiens | 4507491 | 611 | 31.0965630114566 | 255 | 1e-67 |
| Homo sapiens | 14149738 | 594 | 30.8080808080808 | 254 | 2e-67 |
| Homo sapiens | 156105687 | 657 | 27.0928462709285 | 184 | 2e-46 |
| Escherichia coli | 1787819 | 681 | 40.8223201174743 | 501 | 1e-143 |
| Escherichia coli | 1789913 | 683 | 34.8462664714495 | 383 | 1e-107 |
| Caenorhabditis elegans | 32565901 | 566 | 22.6148409893993 | 97 | 4e-20 |
| Caenorhabditis elegans | 71999758 | 470 | 21.4893617021277 | 76 | 5e-14 |
| Saccharomyces cerevisiae | 6319793 | 624 | 29.4871794871795 | 229 | 1e-60 |
| Saccharomyces cerevisiae | 6322715 | 688 | 24.7093023255814 | 152 | 2e-37 |
| Drosophila melanogaster | 21356111 | 738 | 24.7967479674797 | 187 | 2e-47 |
| Drosophila melanogaster | 20129717 | 554 | 26.173285198556 | 157 | 2e-38 |

Paralogues:
None
Copy number: NA
Swissprot (AC and ID): NA
Other databases:
- InterPro: IPR001567 [H]
Pfam domain/function: PF01432 Peptidase_M3 [H]
EC number: 3.4.15.5 [H]
Molecular weight: Translated: 77855; Mature: 77724
Theoretical pI: Translated: 5.13; Mature: 5.13
Prosite motif: PS00142 ZINC_PROTEASE
Important sites: NA
Signals:
None
Transmembrane regions:
None
Cys/Met content:
0.1 %Cys (Translated Protein)
2.3 %Met (Translated Protein)
2.4 %Cys+Met (Translated Protein)
0.1 %Cys (Mature Protein)
2.2 %Met (Mature Protein)
2.3 %Cys+Met (Mature Protein)
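Percentages of this kind can be recomputed from a protein sequence by simple residue counting. The sketch below is illustrative only: the function name is hypothetical, rounding to one decimal place is an assumption about how the record was produced, and the example uses a short toy sequence rather than the protein above.

```python
def cys_met_percent(protein: str) -> dict:
    """Return %Cys, %Met and %Cys+Met for a one-letter protein sequence.

    Assumption: values are percentages of total residues, rounded to
    one decimal place (the record does not state its rounding rule).
    """
    seq = protein.replace(" ", "").upper()  # tolerate spaced FASTA-style chunks
    n = len(seq)
    cys = 100 * seq.count("C") / n
    met = 100 * seq.count("M") / n
    return {
        "Cys": round(cys, 1),
        "Met": round(met, 1),
        "Cys+Met": round(cys + met, 1),
    }


# Toy 10-residue sequence with one Cys and one Met (not taken from the record).
toy = cys_met_percent("MCAAAAAAAA")
print(toy)
```

To check the record itself, pass the translated or mature sequence listed under "Sequences" to the same function.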
Secondary structure:
>Translated Secondary Structure MSESSGPIAAPTGNPLLQAWTTPFETPPFTEIVPEHFLPAFERAFTDHAAEIAAIANDPT CCCCCCCEECCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCC EPDFANTITALERSGKLLNRVAAVFYDLVSAHSNPALLEIDKDVSLRMARHWNPIMMNAV CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCEEEECCCHHHHHHHCCCHHHHHHH LFGRIAALHDKRAELKLTSEERRLLERTYTRFHRSGAGLDEAAKARMAEINERLAQLGTN HHHHHHHHHCCCCCEEECHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHCCC FSHHLLGDEQDWFMEIGEGDTEGLPDSFVAAARAAADERGLPGKAVVTLSRSSVEPFLKM CHHHHCCCCHHHHEECCCCCCCCCCHHHHHHHHHHHHCCCCCCCEEEEEECCCCCHHHHH SGRRDLREKVYRAFIARGDNGNANDNNALIGEILGLREESAKLLGYPTFAAYRLEDSMAK CCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHCCHHHHHHHHCCCCHHHHHHHHHHHC TPEAVRGLLERVWKPARARAMADRDALQELVTEEGGNFELAPWDWRFYAEKLRQRRANFD CHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHCCCCCEECCCCHHHHHHHHHHHHHCCCC DAAIKPYLSLDNMIVAAFDTATRLFGVTFAERKDVPVWHPDVRVWEVKDADGSHRGLFYG CHHHCCHHCCCCEEEEHHHHHHHHHHHHHHCCCCCCCCCCCCEEEEEECCCCCCCCCEEE DYYARPSKRSGAWMTSLRDQQKLDGAVAPLIINVCNFSKGADGEPSLLSPDDARTLFHEF CCCCCCCCCCCHHHHHHHHHHHHCCHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHH GHGLHGMMSDVTYPSLSGTSVFTDFVELPSQLYEHWQERPEVLRRFARHYQTGEPLPDDL CCCHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHHHH LQRFIAARKFNQGFATVEFVSSALLDLEFHTQPAASVGDIRDFERRELDKIGMPDEIALR HHHHHHHHHHCCCHHHHHHHHHHHHHCEECCCCCCCHHHHHHHHHHHHHHCCCCCHHHHC HRPTQFGHIFSGDHYASGYYSYMWSEVMDADAFGAFEEAGDIFDPKVAKRLRDDIYASGG CCCCHHCCCCCCCCCCCHHHHHHHHHHHHHHHCCCHHHCCCCCCHHHHHHHHHHHHCCCC SRDPEEAYIAFRGRAPEPDALLRRRGLLETPEAA CCCCCCCEEEEECCCCCHHHHHHHCCCCCCCCCC >Mature Secondary Structure SESSGPIAAPTGNPLLQAWTTPFETPPFTEIVPEHFLPAFERAFTDHAAEIAAIANDPT CCCCCCEECCCCCHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCC EPDFANTITALERSGKLLNRVAAVFYDLVSAHSNPALLEIDKDVSLRMARHWNPIMMNAV CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCEEEECCCHHHHHHHCCCHHHHHHH LFGRIAALHDKRAELKLTSEERRLLERTYTRFHRSGAGLDEAAKARMAEINERLAQLGTN HHHHHHHHHCCCCCEEECHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHCCC FSHHLLGDEQDWFMEIGEGDTEGLPDSFVAAARAAADERGLPGKAVVTLSRSSVEPFLKM CHHHHCCCCHHHHEECCCCCCCCCCHHHHHHHHHHHHCCCCCCCEEEEEECCCCCHHHHH 
SGRRDLREKVYRAFIARGDNGNANDNNALIGEILGLREESAKLLGYPTFAAYRLEDSMAK CCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHCCHHHHHHHHCCCCHHHHHHHHHHHC TPEAVRGLLERVWKPARARAMADRDALQELVTEEGGNFELAPWDWRFYAEKLRQRRANFD CHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHCCCCCEECCCCHHHHHHHHHHHHHCCCC DAAIKPYLSLDNMIVAAFDTATRLFGVTFAERKDVPVWHPDVRVWEVKDADGSHRGLFYG CHHHCCHHCCCCEEEEHHHHHHHHHHHHHHCCCCCCCCCCCCEEEEEECCCCCCCCCEEE DYYARPSKRSGAWMTSLRDQQKLDGAVAPLIINVCNFSKGADGEPSLLSPDDARTLFHEF CCCCCCCCCCCHHHHHHHHHHHHCCHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHH GHGLHGMMSDVTYPSLSGTSVFTDFVELPSQLYEHWQERPEVLRRFARHYQTGEPLPDDL CCCHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHHHH LQRFIAARKFNQGFATVEFVSSALLDLEFHTQPAASVGDIRDFERRELDKIGMPDEIALR HHHHHHHHHHCCCHHHHHHHHHHHHHCEECCCCCCCHHHHHHHHHHHHHHCCCCCHHHHC HRPTQFGHIFSGDHYASGYYSYMWSEVMDADAFGAFEEAGDIFDPKVAKRLRDDIYASGG CCCCHHCCCCCCCCCCCHHHHHHHHHHHHHHHCCCHHHCCCCCCHHHHHHHHHHHHCCCC SRDPEEAYIAFRGRAPEPDALLRRRGLLETPEAA CCCCCCCEEEEECCCCCHHHHHHHCCCCCCCCCC
PDB accession: NA
Resolution: NA
Structure class: Unstructured
Cofactors: NA
Metal ions: NA
Kcat value (1/min): NA
Specific activity: NA
Km value (mM): NA
Substrates: NA
Specific reaction: NA
General reaction: NA
Inhibitor: NA
Structure determination priority: 9.0
TargetDB status: NA
Availability: NA
References: 8226676; 9097039; 9278503 [H]