Definition Akkermansia muciniphila ATCC BAA-835, complete genome.
Accession NC_010655
Length 2,664,102

The map label for this gene is hex [H]

Identifier: 187736302

GI number: 187736302

Start: 2202069

End: 2204255

Strand: Direct

Name: hex [H]

Synonym: Amuc_1815

Alternate gene names: 187736302

Gene position: 2202069-2204255 (Clockwise)

Preceding gene: 187736301

Following gene: 187736304

Centisome position: 82.66
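The coordinate fields above are related by simple arithmetic. The sketch below assumes that the centisome position is the start coordinate expressed as a percentage of the genome length, and that start and end are inclusive 1-based coordinates. The record does not state either definition, but both are consistent with the values it lists. The function names are illustrative only.

```python
# Values copied from the record above.
GENOME_LENGTH = 2_664_102
START, END = 2_202_069, 2_204_255


def gene_length(start: int, end: int) -> int:
    """Length in bases, assuming inclusive 1-based coordinates."""
    return end - start + 1


def centisome(start: int, genome_length: int) -> float:
    """Start coordinate as a percentage of the genome length (assumed definition)."""
    return 100.0 * start / genome_length


print(gene_length(START, END))                       # 2187
print(round(centisome(START, GENOME_LENGTH), 2))     # 82.66
```

Both results agree with the 2187_bases header of the gene sequence and the centisome value given in the record.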

GC content: 55.37
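GC content is conventionally the percentage of G and C bases in a sequence. A minimal sketch of that calculation follows, assuming the value in this record was derived from the gene sequence in this way (the record does not say how it was computed). The function name is hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases; whitespace and line breaks are ignored."""
    seq = "".join(seq.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# Small illustrative input, not the gene sequence from this record.
print(gc_content("ATGC"))
```

To apply it to this gene, pass the sequence lines from the Gene sequence block without the ">2187_bases" header line.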

Gene sequence:

>2187_bases
ATGAACCGCATGAAGTTCCTTTTATTATCGTTTGCATGGGTGTGCATGGCTTGCGCCGGAGCATGGGGGCAGGATACGGC
CCCGTCTTTCCCGGCTAACGGGGCCAATTACAGGCTGTTTCCGGCGGACCGGCCTCCGCTGGTTCCCAAACCCCAGCAGC
TGCGCTGGGACGACAGGGCCATTCCCGTGCAGTCCGTACGCATTTTGGCTCCGTCTCCGTCCAGGACTTCCTATCCGGAA
CAGATGAAGTTCATTGTTTCCGAATTGAAATCTTTTCTGGCGGAGCACTGCGTGAAGGTGGCTCCGGACGGGACGTTTGC
CGTTAAATTCGTCAAGGGGGATGTGAAAGCCGGCACGGAAAATTCCAAGCTGAAGGAGGAGGCTTATTCCCTCCGAGTAA
CTTCCGGCGGCGCACTCATTACGGCGATGGATACCAGAGGATTCTATTACGGCATGAAAACGCTGGAGCAGCTTCTTTTG
CGCCGCGGCGGGACGACGACCATTGCCGCCTGCGATATCGTGGACTGGCCGGATTTTGAAATCCGCGGATTCATGAACGA
TGTGGGACGCAATTACATGCCGCTGCCTCTGATTGCACGGGAGCTGGATTCCATGGCGCAGCTCAAGCTGAATGTTTACC
ACTTCCATTTTACGGAGAACCCCGGCTGGCGGCTGGAATCCAAAATTTATCCGGAGCTGAACGCCCCCGAAAATTATACG
CGCATGCCTGGCAAGTTTTACACGCAGAAGGAGTTTAAGCAGCTGGTGGAGTACTGCCGCCTGCGCAATATCCTGCTGAT
TCCGGAGATGGATATGCCGGGGCACAGCCAGATGTTCCGCAAGGCGCTCAACGTGAAGATGAGCGATGAAAAAGCCACCA
AAGCCCTGGTGGCCCTGATCAAGGAGTTGTGTTCCCTGGTTCCCAAGGAGAAAATGCCCATCATCCACATTGGCACGGAC
GAGGTGCGCGGCAAGGATGAGCAGGTGAACAATGAGATTCTTAAGGAGTACATCCATGCAGTGGAGTTCTGCGGCCGCAT
TCCCATGAGATGGCAGCCCGGCCTGACGCCGAAGGGCTATAACGGCTCCATCCAGCAGTTATGGTCCGGCCGCCAGAACC
GTGGCGCATGGCCTACCGACGGAGCGAAGTATGTTGATTCCCTGGAGACTTACCTGAACCACCTTGATCCGTTTGAAACG
GCCATGACCATGTATTTCCGCCGGGCATGCCCGTTTCGGAATGCGGAAGGACTGGGCATGATGCTGTGTTCTTTCCCGGA
CCTGGAAATTACGGATCCGCGCAACCAAGTTCTTCAGACGCCCGTTTACGCCGGCATGGCGTTCGTTTCCGAACCTTTGT
GGAATAATCCCCATGAGAAGGTGCTGGGAGACCCCAACCAGGACGAATATATGAAGTATTTTTCCAATCTGCCCGTGCAG
GGGGATCCTCTGCTGAAGGGGTTTGCGGATTACGAGAACCGCGTGCTCGCCATCCGGGACCGTTTTTTCGTGGATAAGGA
GTTTAATTACGTACGGCAGGCGAATATTCCCTGGAAATTGCTGGGGCCTATTCCCAACGGCGGTAAGACGGAAAAGGAAT
TCGCTCCGGAGGAGGACAACAAGGCAGGGAAGATGAGGGATTCCTACGAGATTGACGGCGTCACCTATGAGTGGTCCGGA
GACGATTACACGGGGGCCACCATCATTTTCAAGCATTACTGCGATTTTCCGACGCTGTTCAATGGCGCAAAGATGGGAGC
TTATCCCCACAAAAACCACACTTATTACGCGCAGACCTGGATTTATTCCCCCAAGGCGCAGACGGTGCCTTTCTGGATCA
GCGGACATACCTGGGCCACGTCCGATTGGCGCAACGGTCCGGCGAGCGTTCCCGGCAAGTGGTTTCATGCGGATCCCAAA
TTTTTTGTGAACGGCCGGGAGATTGCCCCCCCGCAATGGAAAAAGCCGCGTAACAGCGGCGTGATGGTGGATGAAAACTA
CCATTTCCGGGAGCCTTCCATGGTTCCTCTTAAGAAGGGTTGGAACCGCGTGCTGGTAAAGAGCCCCAGCAACAATTCCG
CGCGTCGGTGGATGTTCACATTCGTTCCGGTGCTGGTGAACCCCAAGACGCCCGGCTGCAATGTGAAGGAGTATCCCGGC
CTCAAATTTTCCACACGTCCGGAATAG

Upstream 100 bases:

>100_bases
CACAGGTTCTGGTCTTTTTGCAAAAAATGCTGCCGGAACCCCGTTTTCCGGCAAGTTGGACACGATAGGTTTGACAGGCA
GCTGTTATTGCGGGCGTATT

Downstream 100 bases:

>100_bases
AAAATCCCGTGATGAGCATCCGGAGCTGGAATGAGCCGGTTGCAGTATCCGCCATCATCATTAAAAAAGAAGCCGCGCTT
CTGGCAAAAGCGCGGCTTCT

Product: Beta-N-acetylhexosaminidase

Products: NA

Alternate protein names: Beta-N-acetylhexosaminidase; Chitobiase; N-acetyl-beta-glucosaminidase [H]

Number of amino acids: Translated: 728; Mature: 728

Protein sequence:

>728_residues
MNRMKFLLLSFAWVCMACAGAWGQDTAPSFPANGANYRLFPADRPPLVPKPQQLRWDDRAIPVQSVRILAPSPSRTSYPE
QMKFIVSELKSFLAEHCVKVAPDGTFAVKFVKGDVKAGTENSKLKEEAYSLRVTSGGALITAMDTRGFYYGMKTLEQLLL
RRGGTTTIAACDIVDWPDFEIRGFMNDVGRNYMPLPLIARELDSMAQLKLNVYHFHFTENPGWRLESKIYPELNAPENYT
RMPGKFYTQKEFKQLVEYCRLRNILLIPEMDMPGHSQMFRKALNVKMSDEKATKALVALIKELCSLVPKEKMPIIHIGTD
EVRGKDEQVNNEILKEYIHAVEFCGRIPMRWQPGLTPKGYNGSIQQLWSGRQNRGAWPTDGAKYVDSLETYLNHLDPFET
AMTMYFRRACPFRNAEGLGMMLCSFPDLEITDPRNQVLQTPVYAGMAFVSEPLWNNPHEKVLGDPNQDEYMKYFSNLPVQ
GDPLLKGFADYENRVLAIRDRFFVDKEFNYVRQANIPWKLLGPIPNGGKTEKEFAPEEDNKAGKMRDSYEIDGVTYEWSG
DDYTGATIIFKHYCDFPTLFNGAKMGAYPHKNHTYYAQTWIYSPKAQTVPFWISGHTWATSDWRNGPASVPGKWFHADPK
FFVNGREIAPPQWKKPRNSGVMVDENYHFREPSMVPLKKGWNRVLVKSPSNNSARRWMFTFVPVLVNPKTPGCNVKEYPG
LKFSTRPE
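The protein sequence above can be related to the gene sequence by codon-wise translation. The sketch below is one way to do this, assuming the standard codon assignments (which the bacterial genetic code shares for sense and stop codons). It is not the pipeline used to produce this record, and the names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Amino acids in the conventional TCAG codon order; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First seven codons of the gene sequence in this record.
print(translate("ATGAACCGCATGAAGTTCCTT"))  # MNRMKFL
```

The lengths are consistent as well: 2187 bases give 729 codons, and removing the terminal stop codon leaves the 728 residues reported.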

Sequences:

>Translated_728_residues
MNRMKFLLLSFAWVCMACAGAWGQDTAPSFPANGANYRLFPADRPPLVPKPQQLRWDDRAIPVQSVRILAPSPSRTSYPE
QMKFIVSELKSFLAEHCVKVAPDGTFAVKFVKGDVKAGTENSKLKEEAYSLRVTSGGALITAMDTRGFYYGMKTLEQLLL
RRGGTTTIAACDIVDWPDFEIRGFMNDVGRNYMPLPLIARELDSMAQLKLNVYHFHFTENPGWRLESKIYPELNAPENYT
RMPGKFYTQKEFKQLVEYCRLRNILLIPEMDMPGHSQMFRKALNVKMSDEKATKALVALIKELCSLVPKEKMPIIHIGTD
EVRGKDEQVNNEILKEYIHAVEFCGRIPMRWQPGLTPKGYNGSIQQLWSGRQNRGAWPTDGAKYVDSLETYLNHLDPFET
AMTMYFRRACPFRNAEGLGMMLCSFPDLEITDPRNQVLQTPVYAGMAFVSEPLWNNPHEKVLGDPNQDEYMKYFSNLPVQ
GDPLLKGFADYENRVLAIRDRFFVDKEFNYVRQANIPWKLLGPIPNGGKTEKEFAPEEDNKAGKMRDSYEIDGVTYEWSG
DDYTGATIIFKHYCDFPTLFNGAKMGAYPHKNHTYYAQTWIYSPKAQTVPFWISGHTWATSDWRNGPASVPGKWFHADPK
FFVNGREIAPPQWKKPRNSGVMVDENYHFREPSMVPLKKGWNRVLVKSPSNNSARRWMFTFVPVLVNPKTPGCNVKEYPG
LKFSTRPE
>Mature_728_residues
MNRMKFLLLSFAWVCMACAGAWGQDTAPSFPANGANYRLFPADRPPLVPKPQQLRWDDRAIPVQSVRILAPSPSRTSYPE
QMKFIVSELKSFLAEHCVKVAPDGTFAVKFVKGDVKAGTENSKLKEEAYSLRVTSGGALITAMDTRGFYYGMKTLEQLLL
RRGGTTTIAACDIVDWPDFEIRGFMNDVGRNYMPLPLIARELDSMAQLKLNVYHFHFTENPGWRLESKIYPELNAPENYT
RMPGKFYTQKEFKQLVEYCRLRNILLIPEMDMPGHSQMFRKALNVKMSDEKATKALVALIKELCSLVPKEKMPIIHIGTD
EVRGKDEQVNNEILKEYIHAVEFCGRIPMRWQPGLTPKGYNGSIQQLWSGRQNRGAWPTDGAKYVDSLETYLNHLDPFET
AMTMYFRRACPFRNAEGLGMMLCSFPDLEITDPRNQVLQTPVYAGMAFVSEPLWNNPHEKVLGDPNQDEYMKYFSNLPVQ
GDPLLKGFADYENRVLAIRDRFFVDKEFNYVRQANIPWKLLGPIPNGGKTEKEFAPEEDNKAGKMRDSYEIDGVTYEWSG
DDYTGATIIFKHYCDFPTLFNGAKMGAYPHKNHTYYAQTWIYSPKAQTVPFWISGHTWATSDWRNGPASVPGKWFHADPK
FFVNGREIAPPQWKKPRNSGVMVDENYHFREPSMVPLKKGWNRVLVKSPSNNSARRWMFTFVPVLVNPKTPGCNVKEYPG
LKFSTRPE

Specific function: Hydrolysis of terminal, non-reducing N-acetyl-beta-D-glucosamine residues in chitobiose and higher analogs, and in glycoproteins [H]

COG id: COG3525

COG function: function code G; N-acetyl-beta-hexosaminidase

Gene ontology:

Cell location: Cytoplasmic

Metabolic importance: NA

Operon status: Not Known

Operon components: None

Similarity: Belongs to the glycosyl hydrolase 20 family [H]

Homologues:

Organism=Homo sapiens, GI4504373, Length=227, Percent_Identity=29.9559471365639, Blast_Score=106, Evalue=1e-22
Organism=Homo sapiens, GI189181666, Length=151, Percent_Identity=36.4238410596026, Blast_Score=100, Evalue=3e-21
Organism=Caenorhabditis elegans, GI17569815, Length=216, Percent_Identity=28.7037037037037, Blast_Score=99, Evalue=6e-21
Organism=Drosophila melanogaster, GI24657468, Length=158, Percent_Identity=36.0759493670886, Blast_Score=93, Evalue=5e-19
Organism=Drosophila melanogaster, GI17647501, Length=158, Percent_Identity=36.0759493670886, Blast_Score=93, Evalue=5e-19
Organism=Drosophila melanogaster, GI281365639, Length=158, Percent_Identity=36.0759493670886, Blast_Score=93, Evalue=5e-19
Organism=Drosophila melanogaster, GI24657474, Length=158, Percent_Identity=36.0759493670886, Blast_Score=93, Evalue=5e-19
Organism=Drosophila melanogaster, GI17933586, Length=219, Percent_Identity=29.2237442922374, Blast_Score=82, Evalue=9e-16
Organism=Drosophila melanogaster, GI24653074, Length=193, Percent_Identity=29.0155440414508, Blast_Score=79, Evalue=2e-14
Organism=Drosophila melanogaster, GI45551090, Length=193, Percent_Identity=29.0155440414508, Blast_Score=78, Evalue=2e-14
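The homologue lines above follow a comma-separated key=value layout, with the GI number given as a bare "GI..." token. A minimal parsing sketch follows, assuming that layout holds for every line. The function name and the "GI" dictionary key are choices made for this example, not part of the record.

```python
def parse_hit(line: str) -> dict:
    """Parse one homologue line into a dict with typed numeric fields."""
    hit = {}
    for field in line.split(","):
        field = field.strip()
        if not field:
            continue  # tolerate a trailing comma
        key, sep, value = field.partition("=")
        if sep:
            hit[key] = value
        elif field.startswith("GI"):
            hit["GI"] = field[2:]
    for key in ("Length", "Blast_Score"):
        hit[key] = int(hit[key])
    for key in ("Percent_Identity", "Evalue"):
        hit[key] = float(hit[key])
    return hit


example = ("Organism=Homo sapiens, GI4504373, Length=227, "
           "Percent_Identity=29.9559471365639, Blast_Score=106, Evalue=1e-22,")
hit = parse_hit(example)
print(hit["Organism"], hit["GI"], round(hit["Percent_Identity"], 1))
```

Parsed this way, hits can be sorted or filtered by Evalue or Blast_Score without further string handling.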

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR015882
- InterPro:   IPR008965
- InterPro:   IPR004866
- InterPro:   IPR012291
- InterPro:   IPR013812
- InterPro:   IPR001540
- InterPro:   IPR004867
- InterPro:   IPR015883
- InterPro:   IPR017853
- InterPro:   IPR013781
- InterPro:   IPR014756 [H]

Pfam domain/function: PF03173 CHB_HEX; PF03174 CHB_HEX_C; PF00728 Glyco_hydro_20; PF02838 Glyco_hydro_20b [H]

EC number: 3.2.1.52 [H]

Molecular weight: Translated: 83362; Mature: 83362

Theoretical pI: Translated: 8.90; Mature: 8.90

Prosite motif: PS00217 SUGAR_TRANSPORT_2

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

1.5 %Cys     (Translated Protein)
3.7 %Met     (Translated Protein)
5.2 %Cys+Met (Translated Protein)
1.5 %Cys     (Mature Protein)
3.7 %Met     (Mature Protein)
5.2 %Cys+Met (Mature Protein)
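The percentages above are residue counts divided by the protein length. A small sketch of that calculation follows, assuming the one-letter codes C (cysteine) and M (methionine) and the protein sequence as listed in this record. The function name is hypothetical.

```python
def residue_percent(protein: str, residues: str) -> float:
    """Percentage of positions occupied by any of the given one-letter residues."""
    protein = "".join(protein.split()).upper()
    if not protein:
        raise ValueError("empty sequence")
    count = sum(protein.count(r) for r in residues.upper())
    return 100.0 * count / len(protein)


# Small illustrative input, not the protein from this record.
toy = "MCAA"
print(residue_percent(toy, "C"), residue_percent(toy, "M"), residue_percent(toy, "CM"))
```

For the 728-residue protein, the same call with "C", "M" and "CM" would give the three figures reported for the translated and mature forms.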

Secondary structure:

>Translated Secondary Structure
MNRMKFLLLSFAWVCMACAGAWGQDTAPSFPANGANYRLFPADRPPLVPKPQQLRWDDRA
CCHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEECCCCCCCCCCCCCCCCCCCC
IPVQSVRILAPSPSRTSYPEQMKFIVSELKSFLAEHCVKVAPDGTFAVKFVKGDVKAGTE
CCCEEEEEECCCCCCCCCHHHHHHHHHHHHHHHHHHHHEECCCCCEEEEEEECCCCCCCC
NSKLKEEAYSLRVTSGGALITAMDTRGFYYGMKTLEQLLLRRGGTTTIAACDIVDWPDFE
CHHHHHHHEEEEEECCCEEEEEEECCCHHHHHHHHHHHHHHCCCCCEEEEEECCCCCCHH
IRGFMNDVGRNYMPLPLIARELDSMAQLKLNVYHFHFTENPGWRLESKIYPELNAPENYT
HHHHHHHHCCCCCCHHHHHHHHHHHHHEEEEEEEEEEECCCCCEECCEECCCCCCCCCCC
RMPGKFYTQKEFKQLVEYCRLRNILLIPEMDMPGHSQMFRKALNVKMSDEKATKALVALI
CCCCCCCCHHHHHHHHHHHHHCCEEEEECCCCCCHHHHHHHHHCCEECHHHHHHHHHHHH
KELCSLVPKEKMPIIHIGTDEVRGKDEQVNNEILKEYIHAVEFCGRIPMRWQPGLTPKGY
HHHHHHCCCCCCCEEEECCHHHCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCC
NGSIQQLWSGRQNRGAWPTDGAKYVDSLETYLNHLDPFETAMTMYFRRACPFRNAEGLGM
CCCHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCHHHHHHHHHHHHHCCCCCCCCCEE
MLCSFPDLEITDPRNQVLQTPVYAGMAFVSEPLWNNPHEKVLGDPNQDEYMKYFSNLPVQ
EEECCCCCEECCCHHHHHHCHHHHHHHHHHCCCCCCCCHHHCCCCCHHHHHHHHHCCCCC
GDPLLKGFADYENRVLAIRDRFFVDKEFNYVRQANIPWKLLGPIPNGGKTEKEFAPEEDN
CCHHHHHHHHHCCCEEEEECCEEECCCCCHHHCCCCCEEEECCCCCCCCCCHHCCCCCCC
KAGKMRDSYEIDGVTYEWSGDDYTGATIIFKHYCDFPTLFNGAKMGAYPHKNHTYYAQTW
CCCCCCCCEEECCEEEEECCCCCCCCEEEEEHHCCCCHHHCCCCCCCCCCCCCEEEEEEE
IYSPKAQTVPFWISGHTWATSDWRNGPASVPGKWFHADPKFFVNGREIAPPQWKKPRNSG
EECCCCCCCEEEECCCEEECCCCCCCCCCCCCCEEECCCCEEECCCCCCCCCCCCCCCCC
VMVDENYHFREPSMVPLKKGWNRVLVKSPSNNSARRWMFTFVPVLVNPKTPGCNVKEYPG
EEECCCCCCCCCCCCHHHCCCCEEEEECCCCCCHHEEEEEEEEEEECCCCCCCCCCCCCC
LKFSTRPE
CCCCCCCC
>Mature Secondary Structure
MNRMKFLLLSFAWVCMACAGAWGQDTAPSFPANGANYRLFPADRPPLVPKPQQLRWDDRA
CCHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCEEEECCCCCCCCCCCCCCCCCCCC
IPVQSVRILAPSPSRTSYPEQMKFIVSELKSFLAEHCVKVAPDGTFAVKFVKGDVKAGTE
CCCEEEEEECCCCCCCCCHHHHHHHHHHHHHHHHHHHHEECCCCCEEEEEEECCCCCCCC
NSKLKEEAYSLRVTSGGALITAMDTRGFYYGMKTLEQLLLRRGGTTTIAACDIVDWPDFE
CHHHHHHHEEEEEECCCEEEEEEECCCHHHHHHHHHHHHHHCCCCCEEEEEECCCCCCHH
IRGFMNDVGRNYMPLPLIARELDSMAQLKLNVYHFHFTENPGWRLESKIYPELNAPENYT
HHHHHHHHCCCCCCHHHHHHHHHHHHHEEEEEEEEEEECCCCCEECCEECCCCCCCCCCC
RMPGKFYTQKEFKQLVEYCRLRNILLIPEMDMPGHSQMFRKALNVKMSDEKATKALVALI
CCCCCCCCHHHHHHHHHHHHHCCEEEEECCCCCCHHHHHHHHHCCEECHHHHHHHHHHHH
KELCSLVPKEKMPIIHIGTDEVRGKDEQVNNEILKEYIHAVEFCGRIPMRWQPGLTPKGY
HHHHHHCCCCCCCEEEECCHHHCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCC
NGSIQQLWSGRQNRGAWPTDGAKYVDSLETYLNHLDPFETAMTMYFRRACPFRNAEGLGM
CCCHHHHHCCCCCCCCCCCCHHHHHHHHHHHHHHCCHHHHHHHHHHHHHCCCCCCCCCEE
MLCSFPDLEITDPRNQVLQTPVYAGMAFVSEPLWNNPHEKVLGDPNQDEYMKYFSNLPVQ
EEECCCCCEECCCHHHHHHCHHHHHHHHHHCCCCCCCCHHHCCCCCHHHHHHHHHCCCCC
GDPLLKGFADYENRVLAIRDRFFVDKEFNYVRQANIPWKLLGPIPNGGKTEKEFAPEEDN
CCHHHHHHHHHCCCEEEEECCEEECCCCCHHHCCCCCEEEECCCCCCCCCCHHCCCCCCC
KAGKMRDSYEIDGVTYEWSGDDYTGATIIFKHYCDFPTLFNGAKMGAYPHKNHTYYAQTW
CCCCCCCCEEECCEEEEECCCCCCCCEEEEEHHCCCCHHHCCCCCCCCCCCCCEEEEEEE
IYSPKAQTVPFWISGHTWATSDWRNGPASVPGKWFHADPKFFVNGREIAPPQWKKPRNSG
EECCCCCCCEEEECCCEEECCCCCCCCCCCCCCEEECCCCEEECCCCCCCCCCCCCCCCC
VMVDENYHFREPSMVPLKKGWNRVLVKSPSNNSARRWMFTFVPVLVNPKTPGCNVKEYPG
EEECCCCCCCCCCCCHHHCCCCEEEEECCCCCCHHEEEEEEEEEEECCCCCCCCCCCCCC
LKFSTRPE
CCCCCCCC
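The secondary structure blocks above alternate a line of residues with a line of per-residue states. The sketch below separates the two and summarises the state composition. It assumes the common three-state convention (H = helix, E = strand, C = coil), which the record itself does not define, and the function names are illustrative.

```python
from collections import Counter


def split_interleaved(lines: list[str]) -> tuple[str, str]:
    """Join alternating residue/state lines into two strings of equal length."""
    residues = "".join(lines[0::2])
    states = "".join(lines[1::2])
    if len(residues) != len(states):
        raise ValueError("residue and state lines do not line up")
    return residues, states


def state_percent(states: str) -> dict:
    """Percentage of positions in each of the assumed H, E and C states."""
    counts = Counter(states)
    return {s: 100.0 * counts.get(s, 0) / len(states) for s in "HEC"}


# Last block of the record (8 residues) plus a toy line for illustration.
residues, states = split_interleaved(["LKFSTRPE", "CCCCCCCC", "MKAL", "HHEC"])
print(residues, state_percent(states))
```

Feeding it all lines under one header (without the ">" line) would give the overall helix, strand and coil fractions behind the "Alpha Beta" structure class, if that class was derived from this prediction.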

PDB accession: NA

Resolution: NA

Structure class: Alpha Beta

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 8341694 [H]