Definition Agrobacterium tumefaciens str. C58 chromosome circular, complete sequence.
Accession NC_003062
Length 2,841,580

The map label for this gene is exo I [H]

Identifier: 15889853

GI number: 15889853

Start: 2574394

End: 2576313

Strand: Reverse

Name: exo I [H]

Synonym: Atu2596

Alternate gene names: 15889853

Gene position: 2576313-2574394 (Counterclockwise)

Preceding gene: 159185285

Following gene: 15889852

Centisome position: 90.66
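
The coordinate fields above are related by simple arithmetic. The Python sketch below recomputes the gene length and a centisome value from Start, End, and the chromosome Length. The formula (coordinate divided by chromosome length, times 100) and the use of the higher coordinate are assumptions; the record does not state how the database derives the value. Function and constant names are illustrative only.

```python
# Values copied from the record above.
CHROMOSOME_LENGTH = 2_841_580
START = 2_574_394
END = 2_576_313


def gene_length(start: int, end: int) -> int:
    """Length in bases of an inclusive, 1-based coordinate range."""
    return abs(end - start) + 1


def centisome(coordinate: int, chromosome_length: int) -> float:
    """Position as a percentage of chromosome length (assumed definition)."""
    return 100.0 * coordinate / chromosome_length


length = gene_length(START, END)
# Assumption: the higher coordinate is used, since the gene is on the reverse strand.
position = centisome(max(START, END), CHROMOSOME_LENGTH)
print(length, round(position, 2))
```

Under these assumptions the sketch gives 1920 bases and 90.66, matching the gene sequence length and the Centisome position listed in this record.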

GC content: 60.73

Gene sequence:

>1920_bases
ATGGCGCTCGCACAGTATCGTCTCGAAAATTCATGGCATCCGCTCGCTGAGTCGGCATCCGGCGGTTTCACCTTCACGCT
GGTCAATCTATCCCACGAACCACTGAAGGATTTTCGCATCGTTTATACGTCGCTCACGCGCACAGTGGACAAGCCGGTGT
GCGGCAACGCCATCTATCTGCGTCGCAACGCCAATTTCCATGAATTTGCACCCCCATCCGGTTTCGTGCTGGAGCCGGGC
AAATCCTGGCGCTTCACCGTCGATGGCCTGTTGAGACCTGCCCGGCATCGCACGGATGGTGCAAAATCCGCCTATGTCAG
CCTTGCCGATGGCACACATCGCGCCGTGGATGTGGGTGACCTGATGCTGGATGGGCGCCACAGCGAGCCGGCGCCGGTGC
TGCTTCCCGAAGGCAAGCTCGATCTGCCCTTCGCCATCCAGCCCTGGCCAGCTGAAATCGATGCCGAACCGGGTGAAGGC
TTTCCCGTAGCACTTTTCCCGATGGAGGACGCCAGGGCGGAGGAAGCGCTTGCGGTGGAAACCGTGCTGTCGCTGTTCCG
CAGGCTCTTTGCCGTCGGGCATGTGCCTTTCAGCCTTGCGCCGGTGCATGAGGGAAAACCGCTGCGTTTCAAGCAGCATA
GCGGGCTCGAAGCGGAAGGTTACCGCATCGCTTTTTCCAACGAAGCCGTCACCGTGGAATATTCCGCCGCTGCCGGCCTG
CAATATGGCCTGACGGTTCTGGCGCAGCTGCTGCATGGCGCGCGCATTGATCCGAAGTTCCGTTTCCCGGCCTCCGGCAC
CATCAGCGATGCGCCGCGTTACAGCTGGCGCGGCTGCCATCTGGATGTCTCGCGGCAATTTTATCCGACCGATGATGTGC
TGCGGCTGATCGATATTCTCGCATGGCTGCGCATGAACCGCTTCCATTGGCACCTGACGGATGACGAGGCGTGGCGTCTC
GAAATCAAGGCTTATCCCTTGTTGACCACCGTCGGCGCGACGCGCGGTCCGGATGGGCCGCTTCTGCCGCAGCTCGGCAA
TGGCGCGGAACCGGTTTCCGGCTATTACACGCAGGATAATGTGCGCCTGGTTGTGGCGCATGCGGCGGCACTGAATGTCG
AAATCGTGCCGGAGGTGGATATTCCCGGTCACAGCACCGCGGCTCTTGTTGCCTATCCGGAGCTGACGGACGGGCAGGAA
GCGCCGGACAGCTATCGCTCGGTGCAGGGATATCCCAATAACGCGCTCAACCCGGCCATCGAGCCGACCTATGAGTTCCT
CGGCAAGATTTTCGACGAGATGGTGGAGCTGTTCCCATCGCGGCTCATCCATATCGGCGGCGACGAGGTTGCGGATGGCT
CGTGGCTCGCTTCGCCACTGGCCAAGGCGCTGATGGACAAGGAGGGGCTGGACGGCACCTTCGGCATCCAGTCTTATTTC
ATGAAACGCATTCAGGGGATGCTGCATGAGCGCGGTCGCCAGCTTGCCGGCTGGGACGAGGTTTCCCATGGCGGCGGCGT
CGATCCCGCTGGAACATTGCTGATGGCATGGCAGAAGCCAGAAGTCGGGCTTGAGCTTGCAAGGCAGGGTTATGACGTGG
TGATGACGCCGGGTCAGGCCTATTATCTCGACATGGTGCAGGACGAAGCCTGGCAGGAGCCCGGCGCAAGCTGGGCCGGC
ACGGTTCCGCCGTCGCATACCTATGCTTATGAGGCGGTCGGCGATTTCCCTGACGAATTGAAGGAGCGGATGAAGGGTGT
TCAGGCCTGCATCTGGTCCGAACATTTTCTCAACCGTGCCTATTTCAACCATCTGGTTTTCCCAAGGCTGCCGGCCATTG
CCGAGGCGGCATGGACGCCAAAGGCGCAGAAGGACTGGCTGCGGTTTTCCGCCATCGTTCCCTTGAACCCGGTTTATTGA
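
The GC content field listed above (60.73) is a percentage of G and C bases. A minimal Python sketch of that calculation follows; the function name is hypothetical, and whether the database counts ambiguous bases or rounds differently is not stated in this record. Running it on the full 1920-base sequence should give a value comparable to the listed one, but that has not been verified here.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the A, C, G, T bases of a sequence."""
    # Keep only unambiguous bases so that line breaks and spaces are ignored.
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# First line of the gene sequence above, used only as a small demonstration.
first_line = (
    "ATGGCGCTCGCACAGTATCGTCTCGAAAATTCATGGCATCCGCTCGCTGAGTCGGCATCCGGCGGTTTCACCTTCACGCT"
)
print(round(gc_content(first_line), 2))
```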

Upstream 100 bases:

>100_bases
TGACGGCAGGCCAAGCGCCACTCGCCCGGACATTTTCGCAATCCGAACACTACTGACCGAAACCCTCTGACGGGTTCAAT
GCGCAATAAAGGGGAGCATC

Downstream 100 bases:

>100_bases
GGGGAAGGCGAGATGCGTATTGCCGTTGGTGGAATCCATACCGAATGCAGCACCTATTCCCCTGTTCTGATGGGGCAGGA
GGATTTTCGCGTGTTTCGGG

Product: beta-N-acetylhexosaminidase

Products: NA

Alternate protein names: Beta-N-acetylhexosaminidase; N-acetyl-beta-glucosaminidase [H]

Number of amino acids: Translated: 639; Mature: 638
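
The two amino acid counts follow from the gene length. The sketch below assumes that the final codon of the 1920-base sequence is a stop codon and that the mature form differs from the translated form only by loss of the initiator methionine, which is consistent with the Translated and Mature sequences shown in this record. The function name is illustrative.

```python
def expected_protein_lengths(cds_bases: int) -> tuple[int, int]:
    """Return (translated, mature) residue counts for a coding sequence."""
    if cds_bases % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    translated = cds_bases // 3 - 1  # the last codon is assumed to be a stop codon
    mature = translated - 1          # assumed removal of the initiator methionine
    return translated, mature


print(expected_protein_lengths(1920))  # (639, 638)
```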

Protein sequence:

>639_residues
MALAQYRLENSWHPLAESASGGFTFTLVNLSHEPLKDFRIVYTSLTRTVDKPVCGNAIYLRRNANFHEFAPPSGFVLEPG
KSWRFTVDGLLRPARHRTDGAKSAYVSLADGTHRAVDVGDLMLDGRHSEPAPVLLPEGKLDLPFAIQPWPAEIDAEPGEG
FPVALFPMEDARAEEALAVETVLSLFRRLFAVGHVPFSLAPVHEGKPLRFKQHSGLEAEGYRIAFSNEAVTVEYSAAAGL
QYGLTVLAQLLHGARIDPKFRFPASGTISDAPRYSWRGCHLDVSRQFYPTDDVLRLIDILAWLRMNRFHWHLTDDEAWRL
EIKAYPLLTTVGATRGPDGPLLPQLGNGAEPVSGYYTQDNVRLVVAHAAALNVEIVPEVDIPGHSTAALVAYPELTDGQE
APDSYRSVQGYPNNALNPAIEPTYEFLGKIFDEMVELFPSRLIHIGGDEVADGSWLASPLAKALMDKEGLDGTFGIQSYF
MKRIQGMLHERGRQLAGWDEVSHGGGVDPAGTLLMAWQKPEVGLELARQGYDVVMTPGQAYYLDMVQDEAWQEPGASWAG
TVPPSHTYAYEAVGDFPDELKERMKGVQACIWSEHFLNRAYFNHLVFPRLPAIAEAAWTPKAQKDWLRFSAIVPLNPVY

Sequences:

>Translated_639_residues
MALAQYRLENSWHPLAESASGGFTFTLVNLSHEPLKDFRIVYTSLTRTVDKPVCGNAIYLRRNANFHEFAPPSGFVLEPG
KSWRFTVDGLLRPARHRTDGAKSAYVSLADGTHRAVDVGDLMLDGRHSEPAPVLLPEGKLDLPFAIQPWPAEIDAEPGEG
FPVALFPMEDARAEEALAVETVLSLFRRLFAVGHVPFSLAPVHEGKPLRFKQHSGLEAEGYRIAFSNEAVTVEYSAAAGL
QYGLTVLAQLLHGARIDPKFRFPASGTISDAPRYSWRGCHLDVSRQFYPTDDVLRLIDILAWLRMNRFHWHLTDDEAWRL
EIKAYPLLTTVGATRGPDGPLLPQLGNGAEPVSGYYTQDNVRLVVAHAAALNVEIVPEVDIPGHSTAALVAYPELTDGQE
APDSYRSVQGYPNNALNPAIEPTYEFLGKIFDEMVELFPSRLIHIGGDEVADGSWLASPLAKALMDKEGLDGTFGIQSYF
MKRIQGMLHERGRQLAGWDEVSHGGGVDPAGTLLMAWQKPEVGLELARQGYDVVMTPGQAYYLDMVQDEAWQEPGASWAG
TVPPSHTYAYEAVGDFPDELKERMKGVQACIWSEHFLNRAYFNHLVFPRLPAIAEAAWTPKAQKDWLRFSAIVPLNPVY
>Mature_638_residues
ALAQYRLENSWHPLAESASGGFTFTLVNLSHEPLKDFRIVYTSLTRTVDKPVCGNAIYLRRNANFHEFAPPSGFVLEPGK
SWRFTVDGLLRPARHRTDGAKSAYVSLADGTHRAVDVGDLMLDGRHSEPAPVLLPEGKLDLPFAIQPWPAEIDAEPGEGF
PVALFPMEDARAEEALAVETVLSLFRRLFAVGHVPFSLAPVHEGKPLRFKQHSGLEAEGYRIAFSNEAVTVEYSAAAGLQ
YGLTVLAQLLHGARIDPKFRFPASGTISDAPRYSWRGCHLDVSRQFYPTDDVLRLIDILAWLRMNRFHWHLTDDEAWRLE
IKAYPLLTTVGATRGPDGPLLPQLGNGAEPVSGYYTQDNVRLVVAHAAALNVEIVPEVDIPGHSTAALVAYPELTDGQEA
PDSYRSVQGYPNNALNPAIEPTYEFLGKIFDEMVELFPSRLIHIGGDEVADGSWLASPLAKALMDKEGLDGTFGIQSYFM
KRIQGMLHERGRQLAGWDEVSHGGGVDPAGTLLMAWQKPEVGLELARQGYDVVMTPGQAYYLDMVQDEAWQEPGASWAGT
VPPSHTYAYEAVGDFPDELKERMKGVQACIWSEHFLNRAYFNHLVFPRLPAIAEAAWTPKAQKDWLRFSAIVPLNPVY

Specific function: Rapidly hydrolyzes p-nitrophenyl-N-acetyl-beta-D-glucosaminide (PNP-beta-GlcNAc) and 4-methylumbelliferyl-beta-GlcNAc, and is slightly active on p-nitrophenyl-beta-GalNAc. Hydrolyzes aryl-N-acetyl-beta-D-glucosaminide (aryl-beta-GlcNAc), aryl-beta-GalNAc a

COG id: COG3525

COG function: function code G; N-acetyl-beta-hexosaminidase

Gene ontology:

Cell location: Periplasm [H]

Metabolic importance: NA

Operon status: Not Known

Operon components: None

Similarity: Belongs to the glycosyl hydrolase 20 family [H]

Homologues:

Organism=Homo sapiens, GI4504373, Length=414, Percent_Identity=25.6038647342995, Blast_Score=152, Evalue=1e-36
Organism=Homo sapiens, GI189181666, Length=422, Percent_Identity=25.8293838862559, Blast_Score=135, Evalue=1e-31
Organism=Caenorhabditis elegans, GI17569815, Length=388, Percent_Identity=25, Blast_Score=122, Evalue=7e-28
Organism=Drosophila melanogaster, GI24657468, Length=407, Percent_Identity=25.7985257985258, Blast_Score=137, Evalue=2e-32
Organism=Drosophila melanogaster, GI17647501, Length=407, Percent_Identity=25.7985257985258, Blast_Score=137, Evalue=2e-32
Organism=Drosophila melanogaster, GI281365639, Length=407, Percent_Identity=25.7985257985258, Blast_Score=137, Evalue=3e-32
Organism=Drosophila melanogaster, GI24657474, Length=407, Percent_Identity=25.7985257985258, Blast_Score=137, Evalue=3e-32
Organism=Drosophila melanogaster, GI45551090, Length=424, Percent_Identity=26.4150943396226, Blast_Score=124, Evalue=3e-28
Organism=Drosophila melanogaster, GI24653074, Length=424, Percent_Identity=26.4150943396226, Blast_Score=124, Evalue=3e-28
Organism=Drosophila melanogaster, GI17933586, Length=440, Percent_Identity=23.8636363636364, Blast_Score=102, Evalue=6e-22
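
Each homologue entry above is a comma-separated list of `key=value` fields plus a bare GI identifier. The following Python sketch shows one way such a line might be parsed into a dictionary; the function name and the choice of numeric types are assumptions for illustration, not part of the database's own tooling.

```python
def parse_homologue(line: str) -> dict:
    """Parse one homologue line into a dictionary of typed fields."""
    record = {}
    # A trailing comma, if present, is dropped before splitting.
    for field in line.strip().rstrip(",").split(","):
        field = field.strip()
        if "=" in field:
            key, value = field.split("=", 1)
            record[key.strip()] = value.strip()
        elif field.startswith("GI"):
            # The GI number is written without a key, e.g. "GI4504373".
            record["GI"] = field[2:]
    casts = (("Length", int), ("Percent_Identity", float),
             ("Blast_Score", int), ("Evalue", float))
    for key, cast in casts:
        if key in record:
            record[key] = cast(record[key])
    return record


example = ("Organism=Homo sapiens, GI4504373, Length=414, "
           "Percent_Identity=25.6038647342995, Blast_Score=152, Evalue=1e-36,")
hit = parse_homologue(example)
print(hit["Organism"], hit["GI"], round(hit["Percent_Identity"], 1))
```

Parsed this way, the entries can be sorted or filtered by `Evalue` or `Blast_Score` without further string handling.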

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR015882
- InterPro:   IPR001540
- InterPro:   IPR015883
- InterPro:   IPR017853
- InterPro:   IPR013781 [H]

Pfam domain/function: PF00728 Glyco_hydro_20; PF02838 Glyco_hydro_20b [H]

EC number: 3.2.1.52 [H]

Molecular weight: Translated: 70794; Mature: 70663

Theoretical pI: Translated: 5.20; Mature: 5.20

Prosite motif: NA

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

0.5 %Cys     (Translated Protein)
1.9 %Met     (Translated Protein)
2.3 %Cys+Met (Translated Protein)
0.5 %Cys     (Mature Protein)
1.7 %Met     (Mature Protein)
2.2 %Cys+Met (Mature Protein)

Secondary structure:

>Translated Secondary Structure
MALAQYRLENSWHPLAESASGGFTFTLVNLSHEPLKDFRIVYTSLTRTVDKPVCGNAIYL
CCCCHHHHCCCCCCHHHCCCCCEEEEEEECCCCCHHHHHHHHHHHHHHHCCCCCCCEEEE
RRNANFHEFAPPSGFVLEPGKSWRFTVDGLLRPARHRTDGAKSAYVSLADGTHRAVDVGD
EECCCCCCCCCCCCEEECCCCCEEEEHHHHHCCHHHCCCCCCEEEEEECCCCCCEEECCC
LMLDGRHSEPAPVLLPEGKLDLPFAIQPWPAEIDAEPGEGFPVALFPMEDARAEEALAVE
EEECCCCCCCCCEEECCCCCCCCEEECCCCCCCCCCCCCCCCEEEEECCCCCHHHHHHHH
TVLSLFRRLFAVGHVPFSLAPVHEGKPLRFKQHSGLEAEGYRIAFSNEAVTVEYSAAAGL
HHHHHHHHHHHHCCCCEEEEECCCCCCEEECCCCCCCCCCEEEEECCCEEEEEECCCCCH
QYGLTVLAQLLHGARIDPKFRFPASGTISDAPRYSWRGCHLDVSRQFYPTDDVLRLIDIL
HHHHHHHHHHHHCCCCCCCEECCCCCCCCCCCCCCCCCEEEECCCCCCCHHHHHHHHHHH
AWLRMNRFHWHLTDDEAWRLEIKAYPLLTTVGATRGPDGPLLPQLGNGAEPVSGYYTQDN
HHHHCCCEEEEECCCCEEEEEEEEEEEEEECCCCCCCCCCCCCCCCCCCCCCCCEEECCC
VRLVVAHAAALNVEIVPEVDIPGHSTAALVAYPELTDGQEAPDSYRSVQGYPNNALNPAI
CEEEEEEEEEEEEEEEECCCCCCCCCEEEEECCCCCCCCCCCHHHHHCCCCCCCCCCCCC
EPTYEFLGKIFDEMVELFPSRLIHIGGDEVADGSWLASPLAKALMDKEGLDGTFGIQSYF
CHHHHHHHHHHHHHHHHHHHHEEECCCCCCCCCCHHHHHHHHHHHCCCCCCCCHHHHHHH
MKRIQGMLHERGRQLAGWDEVSHGGGVDPAGTLLMAWQKPEVGLELARQGYDVVMTPGQA
HHHHHHHHHHCCCCCCCCHHCCCCCCCCCCCEEEEEECCCCHHHHHHHCCCCEEECCCCE
YYLDMVQDEAWQEPGASWAGTVPPSHTYAYEAVGDFPDELKERMKGVQACIWSEHFLNRA
EEEEHHHHHHHCCCCCCCCCCCCCCCCEEEHHHCCCHHHHHHHHHHHHHHHHHHHHHHHH
YFNHLVFPRLPAIAEAAWTPKAQKDWLRFSAIVPLNPVY
HHHHHHCCCCCHHHHHHCCCCHHHHHEEEEEECCCCCCC
>Mature Secondary Structure 
ALAQYRLENSWHPLAESASGGFTFTLVNLSHEPLKDFRIVYTSLTRTVDKPVCGNAIYL
CCCHHHHCCCCCCHHHCCCCCEEEEEEECCCCCHHHHHHHHHHHHHHHCCCCCCCEEEE
RRNANFHEFAPPSGFVLEPGKSWRFTVDGLLRPARHRTDGAKSAYVSLADGTHRAVDVGD
EECCCCCCCCCCCCEEECCCCCEEEEHHHHHCCHHHCCCCCCEEEEEECCCCCCEEECCC
LMLDGRHSEPAPVLLPEGKLDLPFAIQPWPAEIDAEPGEGFPVALFPMEDARAEEALAVE
EEECCCCCCCCCEEECCCCCCCCEEECCCCCCCCCCCCCCCCEEEEECCCCCHHHHHHHH
TVLSLFRRLFAVGHVPFSLAPVHEGKPLRFKQHSGLEAEGYRIAFSNEAVTVEYSAAAGL
HHHHHHHHHHHHCCCCEEEEECCCCCCEEECCCCCCCCCCEEEEECCCEEEEEECCCCCH
QYGLTVLAQLLHGARIDPKFRFPASGTISDAPRYSWRGCHLDVSRQFYPTDDVLRLIDIL
HHHHHHHHHHHHCCCCCCCEECCCCCCCCCCCCCCCCCEEEECCCCCCCHHHHHHHHHHH
AWLRMNRFHWHLTDDEAWRLEIKAYPLLTTVGATRGPDGPLLPQLGNGAEPVSGYYTQDN
HHHHCCCEEEEECCCCEEEEEEEEEEEEEECCCCCCCCCCCCCCCCCCCCCCCCEEECCC
VRLVVAHAAALNVEIVPEVDIPGHSTAALVAYPELTDGQEAPDSYRSVQGYPNNALNPAI
CEEEEEEEEEEEEEEEECCCCCCCCCEEEEECCCCCCCCCCCHHHHHCCCCCCCCCCCCC
EPTYEFLGKIFDEMVELFPSRLIHIGGDEVADGSWLASPLAKALMDKEGLDGTFGIQSYF
CHHHHHHHHHHHHHHHHHHHHEEECCCCCCCCCCHHHHHHHHHHHCCCCCCCCHHHHHHH
MKRIQGMLHERGRQLAGWDEVSHGGGVDPAGTLLMAWQKPEVGLELARQGYDVVMTPGQA
HHHHHHHHHHCCCCCCCCHHCCCCCCCCCCCEEEEEECCCCHHHHHHHCCCCEEECCCCE
YYLDMVQDEAWQEPGASWAGTVPPSHTYAYEAVGDFPDELKERMKGVQACIWSEHFLNRA
EEEEHHHHHHHCCCCCCCCCCCCCCCCEEEHHHCCCHHHHHHHHHHHHHHHHHHHHHHHH
YFNHLVFPRLPAIAEAAWTPKAQKDWLRFSAIVPLNPVY
HHHHHHCCCCCHHHHHHCCCCHHHHHEEEEEECCCCCCC

PDB accession: NA

Resolution: NA

Structure class: Alpha Beta

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 8969205 [H]