Definition Mesorhizobium loti MAFF303099 chromosome, complete genome.
Accession NC_002678
Length 7,036,071

The map label for this gene is dcp [H]

Identifier: 13473513

GI number: 13473513

Start: 3307611

End: 3309665

Strand: Direct

Name: dcp [H]

Synonym: mlr4139

Alternate gene names: 13473513

Gene position: 3307611-3309665 (Clockwise)

Preceding gene: 13473512

Following gene: 13473514

Centisome position: 47.01
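
The coordinate fields above are related by simple arithmetic. The sketch below assumes 1-based inclusive coordinates and assumes the centisome position is the start coordinate expressed as a percentage of the genome length; the function names are illustrative and not part of the record.

```python
def gene_length(start: int, end: int) -> int:
    """Length in bases of a feature with 1-based inclusive coordinates."""
    return end - start + 1


def centisome(start: int, genome_length: int) -> float:
    """Start coordinate as a percentage of the genome length (assumed definition)."""
    return round(100.0 * start / genome_length, 2)


# Values taken from the Start, End and Length fields of this record.
length = gene_length(3307611, 3309665)
position = centisome(3307611, 7036071)
print(length, position)
```

Under these assumptions the result agrees with the 2055-base gene sequence and the listed centisome position of 47.01.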

GC content: 63.89
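
A GC content figure like the one above can be recomputed from the gene sequence. This is a minimal sketch, assuming the value is the percentage of G and C bases over the full sequence; `gc_percent` is a hypothetical helper name.

```python
def gc_percent(dna: str) -> float:
    """Percentage of G and C bases, ignoring whitespace and letter case."""
    seq = "".join(dna.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return round(100.0 * gc / len(seq), 2)


# Small illustrative inputs; pass the full gene sequence to check the record.
print(gc_percent("ATGC"))
print(gc_percent("gg cc\nat"))
```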

Gene sequence:

>2055_bases
ATGCCCTCCACGAAAACCGTCGACATCGCCGCCCATCCGCTGACCGCGTGGCAAGGGCCGCTCGGCCTGCCCGACTTTGC
CCATATTGGCGACAGCGATTTTTCGCCGGTCTTCGACGCGGCGCTAACGGCGCATGAGGCGGAGATCGATGCGATTGCCG
GCAACAAGGACGCGCCGACGATCGAGAACACGCTTGCCGCGCTGGAACTCGGCGGCGAGGCGCTCGATCATGTCTCGTCG
ATCTTCTGGTGCCGCGCCGGCGCTTACACCAACGAGACCATCCAGGCGCTGGAGCGCGACATCTCGCCGAAGATGTCCAG
GCATTTTTCGGCGATATCGATGAACGAGAAGCTGTTTGCCCGCATCGACGACCTCTACCAGCGTCGCGAGAGCCTTGACC
TCGACGCCGAGACCCTGCGGGTGCTGGAGAAGACCTGGAAGGGTTTCGTCCGCTCCGGCGCCAAGCTCGATGCCGAGGGC
AAGAAGCGGCTGGCCAGGATCAATGAAGAGCTGTCTTCGCTCGGTACGGCTTTCGGCCAGAACGTGCTGGCCGATGAGCG
CGACTGGGCGCTGTTCCTCGACGAGGCCGACCTTGCCGGCCTGCCGGATTTCCTGAAAAGCTCGATGGCAGAGGCGGCCG
AGATGCGCGGCCAGAAGGGCCGCTACGCCGTCACTTTGTCGCGCTCGATCTACGAGCCGTTCACGACCTTCTCGGAACGC
CGCGACCTGCGCGAGATCGCCTTCAAAGCTTTCACCATGCGCGGCCAGAATGGCGGCGCCAGCGACAACACCGAAGTCGT
GCGCGACATGCTGAAACTGCGCGCCGAAAAGGCCGAGCTGCTCGGCTACGCCTCCTTCGCCGCGCTGAAGCTCGACGACA
CCATGGCCAAGACGCCGAAGGCGGTGCACGACCTGCTCGATCCGGTCTGGGAAAAGGCGCTGGAAAAGGCCGCCGCCGAC
CAGAAGGAATTGGAGCGCCTGGCGGCGCAAGCCGGCAGCAACGAGAAATTCGCCGCCTGGGACTGGCGCTTCTACCAGGA
GAAGCTGCGCGCGGAAAAATTCGCCTTCGATGAGGCGGAGCTGAAACCCTATCTGCAGCTCGACCGGGTGATCGACGCCT
GCTTCGACGTCGCGACCAAACTGTTCGGCATCACCTTCGAGGAGAAAAAGGGCATCGTCGCCTGGCACCCGGACGCGCGC
GTCTTCGTGGTGAAAAATGCCGATGGCAGCGAGCGCGCACTGTTCCTGGCCGATTATTTCGCGCGGCCCTCGAAGCGTTC
CGGCGCCTGGATGAGCGCGCTGAAGTCCGGCTACAAACTTGGCAACGGCTCGAAGCCGGTGATCTACAACATCATGAACT
TCGCCAAGCCGCCGGCAGGCGAACCGGCATTGCTGTCGGTCGACGAGGCGAAGACTCTGTTCCACGAATTCGGCCACGCG
CTGCACGGCATGCTGACCGACGTCACCTGGCCGTCGGTTGCGGGCACCTCGGTCAGCCGCGATTTTGTCGAACTGCCCTC
GCAGCTCTACGAGCACTGGCTGACGGTGCCGGCGGTGCTGGAGAAGCACGCGCTGCACGTCAAGACCGGCAAGCCGATGC
CGAAGGCGCTGCTCGACAAGATGCTGGCCACGCGCACCTTCGGCGCCGGCTTCGCCACCGTCGAGTTTACGTCTTCCGCC
TTGATCGACATGGCCTATCACGCTCGGCCAGATGCACCGGCCGAACCGCTTCGTTTCGAAGCCGAAACGTTGGAAAAACT
CGACATGCCCGACACCATCGCGATGCGCCACCGCACCCCGCATTTCGGCCATGTCTTCTCGGGCGACGGCTATTCGGCCG
GCTACTATTCCTACATGTGGTCGGAAGTGCTGGACGCCGACGCCTTCGCCGCCTTCGAGGAGACCGGCGATCCCTTCAAC
CCGGCACTGGCCGAACGGCTGCGCAAGAACATCTACGCCGCCGGCGGCTCGAAGGACCCCGAAGAGCTCTACACGGCCTT
TCGCGGCAAGATGCCGTCGCCGGAGGCGATGATGGTGAAGCGCGGATTGGTGTAG

Upstream 100 bases:

>100_bases
AGCCTATTGAGGCGGGTCGGCCAACTGCCATATTGCGGGCACGCGGACCGCGCCCTACATCGGTGACACCCTCCTCCTTG
ACGCCAGCGAAAGACCCTTC

Downstream 100 bases:

>100_bases
CGCCTCTTCCTTCTCCCCTTGCGAGAGAAGCTGGATCGGTGCGCAGCGCCGAGACGGATGAGCGACTATCTGAGGTGTCA
AGTTGCATTTGCGAATTCGT

Product: peptidyl-dipeptidase

Products: NA

Alternate protein names: Dipeptidyl carboxypeptidase [H]

Number of amino acids: Translated: 684; Mature: 683
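
The two residue counts follow from the gene length. The sketch below assumes that the final codon is a stop codon and that the mature form differs from the translated form only by removal of the initiator methionine, which is consistent with the sequences in this record; `protein_lengths` is an illustrative name.

```python
from typing import Tuple


def protein_lengths(cds_bases: int) -> Tuple[int, int]:
    """Return (translated, mature) residue counts for a CDS length in bases."""
    if cds_bases % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_bases // 3
    translated = codons - 1   # the last codon is assumed to be a stop codon
    mature = translated - 1   # assumes the initiator Met is removed
    return translated, mature


print(protein_lengths(2055))
```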

Protein sequence:

>684_residues
MPSTKTVDIAAHPLTAWQGPLGLPDFAHIGDSDFSPVFDAALTAHEAEIDAIAGNKDAPTIENTLAALELGGEALDHVSS
IFWCRAGAYTNETIQALERDISPKMSRHFSAISMNEKLFARIDDLYQRRESLDLDAETLRVLEKTWKGFVRSGAKLDAEG
KKRLARINEELSSLGTAFGQNVLADERDWALFLDEADLAGLPDFLKSSMAEAAEMRGQKGRYAVTLSRSIYEPFTTFSER
RDLREIAFKAFTMRGQNGGASDNTEVVRDMLKLRAEKAELLGYASFAALKLDDTMAKTPKAVHDLLDPVWEKALEKAAAD
QKELERLAAQAGSNEKFAAWDWRFYQEKLRAEKFAFDEAELKPYLQLDRVIDACFDVATKLFGITFEEKKGIVAWHPDAR
VFVVKNADGSERALFLADYFARPSKRSGAWMSALKSGYKLGNGSKPVIYNIMNFAKPPAGEPALLSVDEAKTLFHEFGHA
LHGMLTDVTWPSVAGTSVSRDFVELPSQLYEHWLTVPAVLEKHALHVKTGKPMPKALLDKMLATRTFGAGFATVEFTSSA
LIDMAYHARPDAPAEPLRFEAETLEKLDMPDTIAMRHRTPHFGHVFSGDGYSAGYYSYMWSEVLDADAFAAFEETGDPFN
PALAERLRKNIYAAGGSKDPEELYTAFRGKMPSPEAMMVKRGLV
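
The protein sequence can be related back to the gene sequence by codon translation. This is a minimal sketch using the standard genetic code in its usual compact TCAG ordering; it is an illustration, not necessarily the tool that produced this record.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        a, b, c = (BASES.index(x) for x in seq[i:i + 3])
        aa = AMINO_ACIDS[16 * a + 4 * b + c]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First six and last four codons of the gene sequence in this record.
print(translate("ATGCCCTCCACGAAAACC"))
print(translate("GGATTGGTGTAG"))
```

The two fragments correspond to the start (MPSTKT) and end (GLV) of the protein sequence listed above.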

Sequences:

>Translated_684_residues
MPSTKTVDIAAHPLTAWQGPLGLPDFAHIGDSDFSPVFDAALTAHEAEIDAIAGNKDAPTIENTLAALELGGEALDHVSS
IFWCRAGAYTNETIQALERDISPKMSRHFSAISMNEKLFARIDDLYQRRESLDLDAETLRVLEKTWKGFVRSGAKLDAEG
KKRLARINEELSSLGTAFGQNVLADERDWALFLDEADLAGLPDFLKSSMAEAAEMRGQKGRYAVTLSRSIYEPFTTFSER
RDLREIAFKAFTMRGQNGGASDNTEVVRDMLKLRAEKAELLGYASFAALKLDDTMAKTPKAVHDLLDPVWEKALEKAAAD
QKELERLAAQAGSNEKFAAWDWRFYQEKLRAEKFAFDEAELKPYLQLDRVIDACFDVATKLFGITFEEKKGIVAWHPDAR
VFVVKNADGSERALFLADYFARPSKRSGAWMSALKSGYKLGNGSKPVIYNIMNFAKPPAGEPALLSVDEAKTLFHEFGHA
LHGMLTDVTWPSVAGTSVSRDFVELPSQLYEHWLTVPAVLEKHALHVKTGKPMPKALLDKMLATRTFGAGFATVEFTSSA
LIDMAYHARPDAPAEPLRFEAETLEKLDMPDTIAMRHRTPHFGHVFSGDGYSAGYYSYMWSEVLDADAFAAFEETGDPFN
PALAERLRKNIYAAGGSKDPEELYTAFRGKMPSPEAMMVKRGLV
>Mature_683_residues
PSTKTVDIAAHPLTAWQGPLGLPDFAHIGDSDFSPVFDAALTAHEAEIDAIAGNKDAPTIENTLAALELGGEALDHVSSI
FWCRAGAYTNETIQALERDISPKMSRHFSAISMNEKLFARIDDLYQRRESLDLDAETLRVLEKTWKGFVRSGAKLDAEGK
KRLARINEELSSLGTAFGQNVLADERDWALFLDEADLAGLPDFLKSSMAEAAEMRGQKGRYAVTLSRSIYEPFTTFSERR
DLREIAFKAFTMRGQNGGASDNTEVVRDMLKLRAEKAELLGYASFAALKLDDTMAKTPKAVHDLLDPVWEKALEKAAADQ
KELERLAAQAGSNEKFAAWDWRFYQEKLRAEKFAFDEAELKPYLQLDRVIDACFDVATKLFGITFEEKKGIVAWHPDARV
FVVKNADGSERALFLADYFARPSKRSGAWMSALKSGYKLGNGSKPVIYNIMNFAKPPAGEPALLSVDEAKTLFHEFGHAL
HGMLTDVTWPSVAGTSVSRDFVELPSQLYEHWLTVPAVLEKHALHVKTGKPMPKALLDKMLATRTFGAGFATVEFTSSAL
IDMAYHARPDAPAEPLRFEAETLEKLDMPDTIAMRHRTPHFGHVFSGDGYSAGYYSYMWSEVLDADAFAAFEETGDPFNP
ALAERLRKNIYAAGGSKDPEELYTAFRGKMPSPEAMMVKRGLV

Specific function: Removes dipeptides from the C-termini of N-blocked tripeptides, tetrapeptides and larger peptides [H]

COG id: COG0339

COG function: function code E; Zn-dependent oligopeptidases

Gene ontology:

Cell location: Cytoplasm [H]

Metabolic importance: Non_Essential [C]

Operon status: Not Known

Operon components: None

Similarity: Belongs to the peptidase M3 family [H]

Homologues:

Organism=Homo sapiens, GI4507491, Length=592, Percent_Identity=31.0810810810811, Blast_Score=254, Evalue=1e-67,
Organism=Homo sapiens, GI14149738, Length=609, Percent_Identity=28.4072249589491, Blast_Score=226, Evalue=6e-59,
Organism=Homo sapiens, GI156105687, Length=588, Percent_Identity=27.5510204081633, Blast_Score=177, Evalue=2e-44,
Organism=Escherichia coli, GI1787819, Length=670, Percent_Identity=37.910447761194, Blast_Score=433, Evalue=1e-122,
Organism=Escherichia coli, GI1789913, Length=676, Percent_Identity=34.0236686390533, Blast_Score=355, Evalue=8e-99,
Organism=Caenorhabditis elegans, GI32565901, Length=611, Percent_Identity=22.7495908346972, Blast_Score=96, Evalue=6e-20,
Organism=Saccharomyces cerevisiae, GI6319793, Length=646, Percent_Identity=26.9349845201238, Blast_Score=203, Evalue=6e-53,
Organism=Saccharomyces cerevisiae, GI6322715, Length=579, Percent_Identity=23.1433506044905, Blast_Score=109, Evalue=2e-24,
Organism=Drosophila melanogaster, GI21356111, Length=498, Percent_Identity=26.9076305220884, Blast_Score=159, Evalue=5e-39,
Organism=Drosophila melanogaster, GI20129717, Length=541, Percent_Identity=25.1386321626617, Blast_Score=148, Evalue=1e-35,
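
Each homologue line is a comma-separated list of `key=value` fields, plus a GI field written without an equals sign. A minimal parsing sketch, assuming that layout holds for every line; `parse_homologue` is a hypothetical helper.

```python
from typing import Any, Dict


def parse_homologue(line: str) -> Dict[str, Any]:
    """Parse one homologue line into a dictionary with typed numeric fields."""
    record: Dict[str, Any] = {}
    for part in line.strip().rstrip(",").split(","):
        part = part.strip()
        if "=" in part:
            key, value = part.split("=", 1)
            record[key] = value
        elif part.startswith("GI"):
            record["GI"] = part[2:]  # the GI field has no '=' separator
    record["Length"] = int(record["Length"])
    for key in ("Percent_Identity", "Blast_Score", "Evalue"):
        record[key] = float(record[key])
    return record


hit = parse_homologue(
    "Organism=Escherichia coli, GI1787819, Length=670, "
    "Percent_Identity=37.910447761194, Blast_Score=433, Evalue=1e-122,"
)
print(hit["Organism"], hit["GI"], round(hit["Percent_Identity"], 2))
```

Parsed records can then be sorted by `Blast_Score` or filtered by `Evalue` to rank the hits.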

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR001567 [H]

Pfam domain/function: PF01432 Peptidase_M3 [H]

EC number: 3.4.15.5 [H]

Molecular weight: Translated: 75903; Mature: 75771

Theoretical pI: Translated: 5.38; Mature: 5.38

Prosite motif: PS00142 ZINC_PROTEASE
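
Zinc metalloproteases typically carry an HExxH zinc-binding motif. The sketch below searches for that simplified motif only; it is an assumption-laden stand-in and not the full PS00142 PROSITE pattern, which is more restrictive. The fragment is copied from the protein sequence in this record.

```python
import re
from typing import List, Tuple

# Simplified motif: His, Glu, any two residues, His. Lookahead allows overlaps.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based position, matched residues) for each HExxH occurrence."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(protein)]


fragment = "DEAKTLFHEFGHALHGMLTD"
print(find_hexxh(fragment))
```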

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

0.3 %Cys     (Translated Protein)
2.9 %Met     (Translated Protein)
3.2 %Cys+Met (Translated Protein)
0.3 %Cys     (Mature Protein)
2.8 %Met     (Mature Protein)
3.1 %Cys+Met (Mature Protein)
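
Percentages like these can be recomputed from a sequence by counting residues. This is a minimal sketch, assuming the figures are counts divided by sequence length and rounded to one decimal place; `residue_percent` is an illustrative name. For example, two cysteines in 684 residues would round to the 0.3 % shown.

```python
def residue_percent(protein: str, residues: str) -> float:
    """Percentage of positions occupied by any of the given one-letter residues."""
    if not protein:
        raise ValueError("empty sequence")
    count = sum(protein.count(r) for r in residues)
    return round(100.0 * count / len(protein), 1)


# Small illustrative input; pass the full protein sequence to check the record.
example = "MCAAAAAAAA"
print(residue_percent(example, "C"), residue_percent(example, "CM"))
```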

Secondary structure:

>Translated Secondary Structure
MPSTKTVDIAAHPLTAWQGPLGLPDFAHIGDSDFSPVFDAALTAHEAEIDAIAGNKDAPT
CCCCCEEEEECCCCHHCCCCCCCCCHHHCCCCCCCHHHHHHHHHHHHHHHHCCCCCCCCH
IENTLAALELGGEALDHVSSIFWCRAGAYTNETIQALERDISPKMSRHFSAISMNEKLFA
HHHHHHHHHHCHHHHHHHHHHHHHHCCCCCHHHHHHHHHHCCHHHHHHHHHHCCCHHHHH
RIDDLYQRRESLDLDAETLRVLEKTWKGFVRSGAKLDAEGKKRLARINEELSSLGTAFGQ
HHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHCC
NVLADERDWALFLDEADLAGLPDFLKSSMAEAAEMRGQKGRYAVTLSRSIYEPFTTFSER
HHCCCCCCCEEEEECCCCCCCHHHHHHHHHHHHHHCCCCCCEEEEECHHHHHHHHHHHHH
RDLREIAFKAFTMRGQNGGASDNTEVVRDMLKLRAEKAELLGYASFAALKLDDTMAKTPK
HHHHHHHHHHHHEECCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHEEEECHHHHHCHH
AVHDLLDPVWEKALEKAAADQKELERLAAQAGSNEKFAAWDWRFYQEKLRAEKFAFDEAE
HHHHHHHHHHHHHHHHHCCCHHHHHHHHHHCCCCCCEEEHHHHHHHHHHHHHHHCCCHHH
LKPYLQLDRVIDACFDVATKLFGITFEEKKGIVAWHPDARVFVVKNADGSERALFLADYF
CCHHHHHHHHHHHHHHHHHHHHCCEECCCCCEEEECCCCEEEEEECCCCCCCEEEHHHHH
ARPSKRSGAWMSALKSGYKLGNGSKPVIYNIMNFAKPPAGEPALLSVDEAKTLFHEFGHA
CCCCCCCCHHHHHHHCCCCCCCCCCCHHHHHHHHCCCCCCCCCEEEHHHHHHHHHHHHHH
LHGMLTDVTWPSVAGTSVSRDFVELPSQLYEHWLTVPAVLEKHALHVKTGKPMPKALLDK
HHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHCEEECCCCCCHHHHHHH
MLATRTFGAGFATVEFTSSALIDMAYHARPDAPAEPLRFEAETLEKLDMPDTIAMRHRTP
HHHHHHCCCCEEEEEEHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHCCCCCHHHHHCCCC
HFGHVFSGDGYSAGYYSYMWSEVLDADAFAAFEETGDPFNPALAERLRKNIYAAGGSKDP
CCCEEECCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHEECCCCCCH
EELYTAFRGKMPSPEAMMVKRGLV
HHHHHHHHCCCCCCHHHHHHCCCC
>Mature Secondary Structure
PSTKTVDIAAHPLTAWQGPLGLPDFAHIGDSDFSPVFDAALTAHEAEIDAIAGNKDAPT
CCCCEEEEECCCCHHCCCCCCCCCHHHCCCCCCCHHHHHHHHHHHHHHHHCCCCCCCCH
IENTLAALELGGEALDHVSSIFWCRAGAYTNETIQALERDISPKMSRHFSAISMNEKLFA
HHHHHHHHHHCHHHHHHHHHHHHHHCCCCCHHHHHHHHHHCCHHHHHHHHHHCCCHHHHH
RIDDLYQRRESLDLDAETLRVLEKTWKGFVRSGAKLDAEGKKRLARINEELSSLGTAFGQ
HHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHCC
NVLADERDWALFLDEADLAGLPDFLKSSMAEAAEMRGQKGRYAVTLSRSIYEPFTTFSER
HHCCCCCCCEEEEECCCCCCCHHHHHHHHHHHHHHCCCCCCEEEEECHHHHHHHHHHHHH
RDLREIAFKAFTMRGQNGGASDNTEVVRDMLKLRAEKAELLGYASFAALKLDDTMAKTPK
HHHHHHHHHHHHEECCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHEEEECHHHHHCHH
AVHDLLDPVWEKALEKAAADQKELERLAAQAGSNEKFAAWDWRFYQEKLRAEKFAFDEAE
HHHHHHHHHHHHHHHHHCCCHHHHHHHHHHCCCCCCEEEHHHHHHHHHHHHHHHCCCHHH
LKPYLQLDRVIDACFDVATKLFGITFEEKKGIVAWHPDARVFVVKNADGSERALFLADYF
CCHHHHHHHHHHHHHHHHHHHHCCEECCCCCEEEECCCCEEEEEECCCCCCCEEEHHHHH
ARPSKRSGAWMSALKSGYKLGNGSKPVIYNIMNFAKPPAGEPALLSVDEAKTLFHEFGHA
CCCCCCCCHHHHHHHCCCCCCCCCCCHHHHHHHHCCCCCCCCCEEEHHHHHHHHHHHHHH
LHGMLTDVTWPSVAGTSVSRDFVELPSQLYEHWLTVPAVLEKHALHVKTGKPMPKALLDK
HHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHCEEECCCCCCHHHHHHH
MLATRTFGAGFATVEFTSSALIDMAYHARPDAPAEPLRFEAETLEKLDMPDTIAMRHRTP
HHHHHHCCCCEEEEEEHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHCCCCCHHHHHCCCC
HFGHVFSGDGYSAGYYSYMWSEVLDADAFAAFEETGDPFNPALAERLRKNIYAAGGSKDP
CCCEEECCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHEECCCCCCH
EELYTAFRGKMPSPEAMMVKRGLV
HHHHHHHHCCCCCCHHHHHHCCCC
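
The predicted state lines can be summarised as percentages per state. The sketch below assumes the conventional three-state alphabet (H for helix, E for strand, C for coil), which the record itself does not spell out; `ss_composition` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict


def ss_composition(states: str) -> Dict[str, float]:
    """Percentage of H, E and C characters in a secondary-structure string."""
    if not states:
        raise ValueError("empty state string")
    counts = Counter(states)
    return {s: round(100.0 * counts.get(s, 0) / len(states), 1) for s in "HEC"}


# State string matching the final 24-residue line of the prediction above.
last_line = "H" * 8 + "C" * 6 + "H" * 6 + "C" * 4
print(ss_composition(last_line))
```

Concatenating all state lines before calling the function gives the composition for the whole protein.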

PDB accession: NA

Resolution: NA

Structure class: Unstructured

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 1537804; 11677609 [H]