Definition Akkermansia muciniphila ATCC BAA-835, complete genome.
Accession NC_010655
Length 2,664,102

The map label for this gene is yicI [C]

Identifier: 187736353

GI number: 187736353

Start: 2273148

End: 2275544

Strand: Direct

Name: yicI [C]

Synonym: Amuc_1870

Alternate gene names: 187736353

Gene position: 2273148-2275544 (Clockwise)

Preceding gene: 187736350

Following gene: 187736355

Centisome position: 85.33
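
The positional fields above are related by simple arithmetic. The sketch below assumes inclusive 1-based coordinates, and assumes that the centisome position is the start coordinate expressed as a percentage of the genome length. The record does not state that definition, but it reproduces the listed value. The function names are illustrative only.

```python
GENOME_LENGTH = 2_664_102  # from the Length field
GENE_START = 2_273_148     # from the Start field
GENE_END = 2_275_544       # from the End field


def gene_length(start: int, end: int) -> int:
    """Length of a feature given inclusive 1-based coordinates."""
    return end - start + 1


def centisome(start: int, genome_length: int) -> float:
    """Start position as a percentage of genome length (assumed definition)."""
    return 100.0 * start / genome_length


print(gene_length(GENE_START, GENE_END))               # 2397
print(round(centisome(GENE_START, GENOME_LENGTH), 2))  # 85.33
```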

GC content: 56.32
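
GC content is conventionally the percentage of G and C bases in the sequence. A minimal sketch of that calculation follows. It assumes an unambiguous A/C/G/T sequence and divides by the full sequence length; whether the database treats ambiguous bases differently is not stated in the record.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# Toy input; pass the full gene sequence to check the listed figure.
print(gc_content("ATGC"))  # 50.0
```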

Gene sequence:

>2397_bases
ATGTGCTCCGTTTCCAACATCCCTGAAAGACGAAAAACACCTCCTCTCCGGGGGAGTCTCGTCAAATGGGAGTACACGAA
TACGGTTTCCAAAATAACTCTCAGCGTGTGTCTTGCTGATGCCGGTATTGTCCGCGTAACGTATTTCCCCGGAGCGGTGC
CGGAAGACGAGCCGAGTTATGCCGTAAGCCCCGGATATTCCGCGCCGGGAGCGGAAATCCGCGAGTATGATGAGGACGGG
TTTCATGTCGTTGAGACGTCCCTGCTCCGCATCCGCATTCGCACGGAGGAGCAAAAGGTGGATTTTTACGATGTTGCTAC
GGATGAGCCCCTGCTGACGGATGAGGGCGGATTCGGCCGGGAGAGCAAGGACTGGACTGGAGACGCCCGGGTATGGATAC
GGAAAAATCTTCAGGAAACGGAGCATTTTTTCGGGCTTGGGGACAAGCCGTGTGCCTTGAACCTGAGGGGCAAGTATTTC
TCCATGTGGGGAGCGGACCATTATGATTTCCATGAGGAGTCGGATCCCCTTTACAAAAGTATTCCCTTTTTCCTCAGTTT
GCGGGAAAGGAAGGCGTACGGCCTGTTGTTTGACAACACGTGCCGTTCGTATTTCGACTTCGGAGCTACGGATGAAAAGG
TTCTTTCCTTCGGGTCCTTCGGGGGCCTGATGAATTATTATTTCATTTACGACAACACTCCTCTGGACATCATCTCCGCT
TATACGCGCCTTACCGGTACGCCTGAATTGCCGCCTTTGTGGGCGCTGGGGTACCACCAGTCCAAGTGGAGCTATTATCC
GGACAAGGCCGTATACAATCTGGTGGAGCGTTTCCGGGGCCTGGGCATTCCGTGCGATGCCGTGCATCTGGACCATCATT
ACATGGAACGGAAGGAGGGATTTACCTGGGACAAACAGAATTTTCCCGACGCGGAAGGAATGGTCCGCGCCCTGGAGAAG
GACGGTGTGAAGACGGTTCTGATTGTCAATCCCGGCGTGAAGGTCAATTCCGTCAATCCGGTCTGGAAAGAGGGGATGGA
ACGCAATTATTTCTGCCGCCGGTCGGAAGGCAATTTATTGTCGGAAGAAGTATGGCCGGGACTTTGCAATTTTCCGGATT
TTACGGCTCCCGCCGTACGCGGCTGGTGGGCGGATCTGTTCAGCAGGGATATTGGGAAAATTGGCGTGCGCGGCCTCTGG
AACGACATGAACGAACCTGTCGTTTTTCCGGACCGCACTTTCCCGATGGATACGCGGCATGAATATGACGGCATGCCCTG
TTCCCATGAAAAGGCCCATAATATTTACGGGCAGTGCATGGCGGAAGCCTCCTGGCTGGGCATGAAGCGTCACGCTCCGG
ACCGGCGCCCCTTCCTGCTGTCCCGATCCGGTTTTGCCGGATTGCAGCGCTTTGCCGCCACATGGACAGGGGACAACCGG
TCCAGCTGGGAACATTTGAAGCTGGCCAATTTCCAGTGCCAGCGGCTTGCGGCTTCCGGCATTTCCTTTGCGGGAGCGGA
TGCTGGGGGATTCATGGGGCATCCCACGCCGGAATTGTTTTGCCGCTGGATGCAAATGGCTTCTTTCCACGGTTTTTTCC
GCAATCATTCCTCCGGGGAGTTCGGCGGTCAGGAGCCATGGGTTTTTGGGCAGGAAGTCACCTCTTACGTGAAAGCCGCC
ATAGAGGGCCGCTACCGCCTGCTTCCCTATATCTACACGCAGTTCCGCCGGTACGCGGAGACGGGCATGCCCGTGCTGCG
CTCCCTGGCCCTGCAATGTTTTACCAACAAGGACACCTACTGGCGGGGGGCGGAGTATTTCTTTGGAGACCATCTGTACG
TCATTCCCATTCATGAGCCGCAGGAAGGCGGGAGGTTCCTCTACATTCCGGAGGGCGTCTGGTATTCCTACCACACGGAC
AGCCTGATGGAGGACACAGGAAAGGATGTATGGGTGAAGTGCCCCCTTTCTTTCCTGCCCGTGTATGTCCGCGGAGGGGC
GGTAATTCCCCATTGGCCGGTACAGCAATACGTGGGAGAACTGCCTCGGCCGCCGCTGACGCTGGATGTATGGTGGGCGC
CGGAAGGGGAGGTGGCTTCCCATTTGTATGAAGATGCAGGGGACGGATATGCATACCGGAACGGAGAATGCGCCGTGCAT
GGGTTCCTTTACCGCGGCGGCACCAATTCCCTGGAGCTGGATTGGAACTGCGAAGGGGATCCCTGTGCGTTCCATGAGTC
CGCAGAGGTGGTTTTGCACGGTTTGCCTGCGGGTATTTCCGTCAGTGCATGCATGGACGGCGTGCCGTGCTGCGGCGTGA
TGAGGGAAGGAAGGGTGTGGAAAATACCGGTGAAGGATAAATTTGACACGCTTTCCGTCTGCTGGCCGGAGGAATAA

Upstream 100 bases:

>100_bases
AATGAAAAGAAAAGTGCTGCTTTAGTTCAATCTTGCGTTCGTCATGAAATTGATTTAGTCTGGACTGAACGCTTTTGATC
CTCTCTTCCGTCCGGCTTCA

Downstream 100 bases:

>100_bases
TGTTCTTCCATTAAGGGATGAGGCGTCCGGAGGAACTGTCATGGAAGGTGGAAACTGCGCCTGACCGTATCCCTCTTCAG
TTTCAGTTCCCGCCATTCAT

Product: Alpha-glucosidase

Products: NA

Alternate protein names: Alpha-glucosidase II [H]

Number of amino acids: Translated: 798; Mature: 798
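
The residue count follows from the gene length: 2397 bases form 799 codons, and the final codon (TAA) is a stop, leaving 798 residues. The sketch below translates DNA with the standard genetic code and stops at the first stop codon. It is an illustration, not the pipeline used to build this record, and the names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate in frame 1 and stop at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGTGCTCCGTTTCCAACATCCCTGAAAGA"))  # first 10 codons: MCSVSNIPER
print(translate("TGGCCGGAGGAATAA"))                 # last 5 codons: WPEE
print(2397 // 3 - 1)                                # 798
```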

Protein sequence:

>798_residues
MCSVSNIPERRKTPPLRGSLVKWEYTNTVSKITLSVCLADAGIVRVTYFPGAVPEDEPSYAVSPGYSAPGAEIREYDEDG
FHVVETSLLRIRIRTEEQKVDFYDVATDEPLLTDEGGFGRESKDWTGDARVWIRKNLQETEHFFGLGDKPCALNLRGKYF
SMWGADHYDFHEESDPLYKSIPFFLSLRERKAYGLLFDNTCRSYFDFGATDEKVLSFGSFGGLMNYYFIYDNTPLDIISA
YTRLTGTPELPPLWALGYHQSKWSYYPDKAVYNLVERFRGLGIPCDAVHLDHHYMERKEGFTWDKQNFPDAEGMVRALEK
DGVKTVLIVNPGVKVNSVNPVWKEGMERNYFCRRSEGNLLSEEVWPGLCNFPDFTAPAVRGWWADLFSRDIGKIGVRGLW
NDMNEPVVFPDRTFPMDTRHEYDGMPCSHEKAHNIYGQCMAEASWLGMKRHAPDRRPFLLSRSGFAGLQRFAATWTGDNR
SSWEHLKLANFQCQRLAASGISFAGADAGGFMGHPTPELFCRWMQMASFHGFFRNHSSGEFGGQEPWVFGQEVTSYVKAA
IEGRYRLLPYIYTQFRRYAETGMPVLRSLALQCFTNKDTYWRGAEYFFGDHLYVIPIHEPQEGGRFLYIPEGVWYSYHTD
SLMEDTGKDVWVKCPLSFLPVYVRGGAVIPHWPVQQYVGELPRPPLTLDVWWAPEGEVASHLYEDAGDGYAYRNGECAVH
GFLYRGGTNSLELDWNCEGDPCAFHESAEVVLHGLPAGISVSACMDGVPCCGVMREGRVWKIPVKDKFDTLSVCWPEE

Sequences:

>Translated_798_residues
MCSVSNIPERRKTPPLRGSLVKWEYTNTVSKITLSVCLADAGIVRVTYFPGAVPEDEPSYAVSPGYSAPGAEIREYDEDG
FHVVETSLLRIRIRTEEQKVDFYDVATDEPLLTDEGGFGRESKDWTGDARVWIRKNLQETEHFFGLGDKPCALNLRGKYF
SMWGADHYDFHEESDPLYKSIPFFLSLRERKAYGLLFDNTCRSYFDFGATDEKVLSFGSFGGLMNYYFIYDNTPLDIISA
YTRLTGTPELPPLWALGYHQSKWSYYPDKAVYNLVERFRGLGIPCDAVHLDHHYMERKEGFTWDKQNFPDAEGMVRALEK
DGVKTVLIVNPGVKVNSVNPVWKEGMERNYFCRRSEGNLLSEEVWPGLCNFPDFTAPAVRGWWADLFSRDIGKIGVRGLW
NDMNEPVVFPDRTFPMDTRHEYDGMPCSHEKAHNIYGQCMAEASWLGMKRHAPDRRPFLLSRSGFAGLQRFAATWTGDNR
SSWEHLKLANFQCQRLAASGISFAGADAGGFMGHPTPELFCRWMQMASFHGFFRNHSSGEFGGQEPWVFGQEVTSYVKAA
IEGRYRLLPYIYTQFRRYAETGMPVLRSLALQCFTNKDTYWRGAEYFFGDHLYVIPIHEPQEGGRFLYIPEGVWYSYHTD
SLMEDTGKDVWVKCPLSFLPVYVRGGAVIPHWPVQQYVGELPRPPLTLDVWWAPEGEVASHLYEDAGDGYAYRNGECAVH
GFLYRGGTNSLELDWNCEGDPCAFHESAEVVLHGLPAGISVSACMDGVPCCGVMREGRVWKIPVKDKFDTLSVCWPEE
>Mature_798_residues
MCSVSNIPERRKTPPLRGSLVKWEYTNTVSKITLSVCLADAGIVRVTYFPGAVPEDEPSYAVSPGYSAPGAEIREYDEDG
FHVVETSLLRIRIRTEEQKVDFYDVATDEPLLTDEGGFGRESKDWTGDARVWIRKNLQETEHFFGLGDKPCALNLRGKYF
SMWGADHYDFHEESDPLYKSIPFFLSLRERKAYGLLFDNTCRSYFDFGATDEKVLSFGSFGGLMNYYFIYDNTPLDIISA
YTRLTGTPELPPLWALGYHQSKWSYYPDKAVYNLVERFRGLGIPCDAVHLDHHYMERKEGFTWDKQNFPDAEGMVRALEK
DGVKTVLIVNPGVKVNSVNPVWKEGMERNYFCRRSEGNLLSEEVWPGLCNFPDFTAPAVRGWWADLFSRDIGKIGVRGLW
NDMNEPVVFPDRTFPMDTRHEYDGMPCSHEKAHNIYGQCMAEASWLGMKRHAPDRRPFLLSRSGFAGLQRFAATWTGDNR
SSWEHLKLANFQCQRLAASGISFAGADAGGFMGHPTPELFCRWMQMASFHGFFRNHSSGEFGGQEPWVFGQEVTSYVKAA
IEGRYRLLPYIYTQFRRYAETGMPVLRSLALQCFTNKDTYWRGAEYFFGDHLYVIPIHEPQEGGRFLYIPEGVWYSYHTD
SLMEDTGKDVWVKCPLSFLPVYVRGGAVIPHWPVQQYVGELPRPPLTLDVWWAPEGEVASHLYEDAGDGYAYRNGECAVH
GFLYRGGTNSLELDWNCEGDPCAFHESAEVVLHGLPAGISVSACMDGVPCCGVMREGRVWKIPVKDKFDTLSVCWPEE

Specific function: Unknown

COG id: COG1501

COG function: function code G; Alpha-glucosidases, family 31 of glycosyl hydrolases

Gene ontology:

Cell location: Cytoplasm [C]

Metabolic importance: Unknown [C]


Operon status: Not Known

Operon components: None

Similarity: Belongs to the glycosyl hydrolase 31 family [H]

Homologues:

Organism=Homo sapiens, GI88900491, Length=638, Percent_Identity=33.0721003134796, Blast_Score=330, Evalue=2e-90
Organism=Homo sapiens, GI38202257, Length=638, Percent_Identity=33.0721003134796, Blast_Score=330, Evalue=2e-90
Organism=Homo sapiens, GI66346737, Length=629, Percent_Identity=32.7503974562798, Blast_Score=326, Evalue=6e-89
Organism=Homo sapiens, GI119393895, Length=642, Percent_Identity=28.816199376947, Blast_Score=259, Evalue=5e-69
Organism=Homo sapiens, GI119393893, Length=642, Percent_Identity=28.816199376947, Blast_Score=259, Evalue=5e-69
Organism=Homo sapiens, GI119393891, Length=642, Percent_Identity=28.816199376947, Blast_Score=259, Evalue=5e-69
Organism=Homo sapiens, GI221316699, Length=565, Percent_Identity=29.2035398230088, Blast_Score=251, Evalue=3e-66
Organism=Homo sapiens, GI157364974, Length=614, Percent_Identity=28.0130293159609, Blast_Score=244, Evalue=3e-64
Organism=Homo sapiens, GI310115361, Length=526, Percent_Identity=25.0950570342205, Blast_Score=185, Evalue=2e-46
Organism=Escherichia coli, GI2367256, Length=544, Percent_Identity=28.4926470588235, Blast_Score=204, Evalue=2e-53
Organism=Escherichia coli, GI2367323, Length=577, Percent_Identity=22.5303292894281, Blast_Score=119, Evalue=7e-28
Organism=Caenorhabditis elegans, GI71991189, Length=769, Percent_Identity=30.4291287386216, Blast_Score=332, Evalue=4e-91
Organism=Caenorhabditis elegans, GI17560798, Length=671, Percent_Identity=31.4456035767511, Blast_Score=329, Evalue=4e-90
Organism=Caenorhabditis elegans, GI17560800, Length=671, Percent_Identity=31.4456035767511, Blast_Score=329, Evalue=4e-90
Organism=Caenorhabditis elegans, GI32563849, Length=608, Percent_Identity=26.6447368421053, Blast_Score=184, Evalue=1e-46
Organism=Caenorhabditis elegans, GI71985706, Length=604, Percent_Identity=25.1655629139073, Blast_Score=176, Evalue=6e-44
Organism=Saccharomyces cerevisiae, GI6319706, Length=570, Percent_Identity=29.4736842105263, Blast_Score=261, Evalue=3e-70
Organism=Drosophila melanogaster, GI24643749, Length=590, Percent_Identity=32.2033898305085, Blast_Score=311, Evalue=2e-84
Organism=Drosophila melanogaster, GI24643753, Length=590, Percent_Identity=32.2033898305085, Blast_Score=311, Evalue=2e-84
Organism=Drosophila melanogaster, GI24643751, Length=590, Percent_Identity=32.2033898305085, Blast_Score=311, Evalue=2e-84
Organism=Drosophila melanogaster, GI24643746, Length=590, Percent_Identity=32.2033898305085, Blast_Score=311, Evalue=2e-84
Organism=Drosophila melanogaster, GI21357605, Length=590, Percent_Identity=32.2033898305085, Blast_Score=311, Evalue=2e-84
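
Each homologue line is a comma-separated list of key=value fields, with the GI number given as a bare token prefixed by "GI". The following sketch parses one such line into a typed record; the function name and the output keys are illustrative, and the parser tolerates an optional trailing comma.

```python
def parse_homologue(line: str) -> dict:
    """Parse one 'Organism=..., GI..., Length=...' line into a dict."""
    record = {}
    for field in line.strip().rstrip(",").split(","):
        field = field.strip()
        if not field:
            continue
        if "=" in field:
            key, value = field.split("=", 1)
            record[key] = value
        elif field.startswith("GI"):
            record["GI"] = field[2:]  # the GI number has no '=' separator
    # Convert the numeric fields to numbers.
    record["Length"] = int(record["Length"])
    record["Percent_Identity"] = float(record["Percent_Identity"])
    record["Blast_Score"] = int(record["Blast_Score"])
    record["Evalue"] = float(record["Evalue"])
    return record


hit = parse_homologue(
    "Organism=Escherichia coli, GI2367256, Length=544, "
    "Percent_Identity=28.4926470588235, Blast_Score=204, Evalue=2e-53,"
)
print(hit["Organism"], hit["GI"], hit["Blast_Score"], hit["Evalue"])
```

Parsed records can then be sorted by Blast_Score or filtered by Evalue with ordinary list operations.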

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR013785
- InterPro:   IPR011013
- InterPro:   IPR000322
- InterPro:   IPR017853 [H]

Pfam domain/function: PF01055 Glyco_hydro_31 [H]

EC number: 3.2.1.20 [H]

Molecular weight: Translated: 90914; Mature: 90914
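
A molecular weight of this kind is usually estimated by summing average residue masses and adding one water molecule for the free termini. The sketch below uses commonly tabulated average residue masses (in Da). The record does not state how its figure was derived, so this is an assumed method and may differ from the database's by rounding or mass table.

```python
# Average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def molecular_weight(protein: str) -> float:
    """Average molecular weight of an unmodified linear peptide, in Da."""
    return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER_MASS


# Toy input; pass the full 798-residue sequence to compare with the record.
print(round(molecular_weight("MCSVSNIPER"), 1))
```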

Theoretical pI: Translated: 5.44; Mature: 5.44

Prosite motif: PS00129 GLYCOSYL_HYDROL_F31_1

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

2.5 %Cys     (Translated Protein)
2.3 %Met     (Translated Protein)
4.8 %Cys+Met (Translated Protein)
2.5 %Cys     (Mature Protein)
2.3 %Met     (Mature Protein)
4.8 %Cys+Met (Mature Protein)
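
These percentages are residue counts divided by the protein length. For a 798-residue protein, 2.5 % corresponds to roughly 20 cysteines and 2.3 % to roughly 18 methionines. A minimal sketch of the calculation, with an illustrative function name:

```python
def residue_percent(protein: str, residues: str) -> float:
    """Percentage of positions occupied by any of the given residues."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    hits = sum(protein.count(r) for r in residues.upper())
    return 100.0 * hits / len(protein)


# Toy input: the first ten residues of the protein.
fragment = "MCSVSNIPER"
print(residue_percent(fragment, "C"))   # 10.0
print(residue_percent(fragment, "M"))   # 10.0
print(residue_percent(fragment, "CM"))  # 20.0
```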

Secondary structure:

>Translated Secondary Structure
MCSVSNIPERRKTPPLRGSLVKWEYTNTVSKITLSVCLADAGIVRVTYFPGAVPEDEPSY
CCCCCCCCHHHCCCCCCCCEEEEEECCHHHHHHHHHHCCCCCEEEEEECCCCCCCCCCCE
AVSPGYSAPGAEIREYDEDGFHVVETSLLRIRIRTEEQKVDFYDVATDEPLLTDEGGFGR
EECCCCCCCCCHHHHCCCCCCEEEEEEEEEEEEECCCCCCEEEECCCCCCEEECCCCCCC
ESKDWTGDARVWIRKNLQETEHFFGLGDKPCALNLRGKYFSMWGADHYDFHEESDPLYKS
CCCCCCCCEEEEEHHCHHHHHHHHCCCCCCEEEEECCCEEEECCCCCCCCCCCCCCHHHH
IPFFLSLRERKAYGLLFDNTCRSYFDFGATDEKVLSFGSFGGLMNYYFIYDNTPLDIISA
CCHHEEECCCCEEEEEEHHHHHHHHCCCCCHHHHHHHHCCCCCEEEEEEECCCCHHHHHH
YTRLTGTPELPPLWALGYHQSKWSYYPDKAVYNLVERFRGLGIPCDAVHLDHHYMERKEG
HHHCCCCCCCCCEEEECCCCCCCCCCCHHHHHHHHHHHHCCCCCCCEEECCHHHHHHHCC
FTWDKQNFPDAEGMVRALEKDGVKTVLIVNPGVKVNSVNPVWKEGMERNYFCRRSEGNLL
CCCCCCCCCCHHHHHHHHHHCCCEEEEEECCCEEECCCCHHHHHCCCCCCEEECCCCCCC
SEEVWPGLCNFPDFTAPAVRGWWADLFSRDIGKIGVRGLWNDMNEPVVFPDRTFPMDTRH
HHHHCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCHHHHCCCCCCEECCCCCCCCCCCC
EYDGMPCSHEKAHNIYGQCMAEASWLGMKRHAPDRRPFLLSRSGFAGLQRFAATWTGDNR
CCCCCCCCHHHHHHHHHHHHHHHHHHCHHHCCCCCCCEEEECCCHHHHHHHHHEECCCCC
SSWEHLKLANFQCQRLAASGISFAGADAGGFMGHPTPELFCRWMQMASFHGFFRNHSSGE
CCCCCEEECCHHHHHHHHCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCCC
FGGQEPWVFGQEVTSYVKAAIEGRYRLLPYIYTQFRRYAETGMPVLRSLALQCFTNKDTY
CCCCCCEEEHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHCCCCCC
WRGAEYFFGDHLYVIPIHEPQEGGRFLYIPEGVWYSYHTDSLMEDTGKDVWVKCPLSFLP
CCCCCEEECCEEEEEEECCCCCCCEEEEECCCCEEEECCCHHHHHCCCCEEEECCHHHCC
VYVRGGAVIPHWPVQQYVGELPRPPLTLDVWWAPEGEVASHLYEDAGDGYAYRNGECAVH
EEEECCEECCCCCHHHHHHHCCCCCEEEEEEECCCCHHHHHHHHHCCCCCEEECCCEEEE
GFLYRGGTNSLELDWNCEGDPCAFHESAEVVLHGLPAGISVSACMDGVPCCGVMREGRVW
EEEEECCCCCEEEEECCCCCCCCCCCCCEEEEEECCCCCCHHHHHCCCCCCCCCCCCCEE
KIPVKDKFDTLSVCWPEE
EECCCCCCCEEEEECCCC
>Mature Secondary Structure
MCSVSNIPERRKTPPLRGSLVKWEYTNTVSKITLSVCLADAGIVRVTYFPGAVPEDEPSY
CCCCCCCCHHHCCCCCCCCEEEEEECCHHHHHHHHHHCCCCCEEEEEECCCCCCCCCCCE
AVSPGYSAPGAEIREYDEDGFHVVETSLLRIRIRTEEQKVDFYDVATDEPLLTDEGGFGR
EECCCCCCCCCHHHHCCCCCCEEEEEEEEEEEEECCCCCCEEEECCCCCCEEECCCCCCC
ESKDWTGDARVWIRKNLQETEHFFGLGDKPCALNLRGKYFSMWGADHYDFHEESDPLYKS
CCCCCCCCEEEEEHHCHHHHHHHHCCCCCCEEEEECCCEEEECCCCCCCCCCCCCCHHHH
IPFFLSLRERKAYGLLFDNTCRSYFDFGATDEKVLSFGSFGGLMNYYFIYDNTPLDIISA
CCHHEEECCCCEEEEEEHHHHHHHHCCCCCHHHHHHHHCCCCCEEEEEEECCCCHHHHHH
YTRLTGTPELPPLWALGYHQSKWSYYPDKAVYNLVERFRGLGIPCDAVHLDHHYMERKEG
HHHCCCCCCCCCEEEECCCCCCCCCCCHHHHHHHHHHHHCCCCCCCEEECCHHHHHHHCC
FTWDKQNFPDAEGMVRALEKDGVKTVLIVNPGVKVNSVNPVWKEGMERNYFCRRSEGNLL
CCCCCCCCCCHHHHHHHHHHCCCEEEEEECCCEEECCCCHHHHHCCCCCCEEECCCCCCC
SEEVWPGLCNFPDFTAPAVRGWWADLFSRDIGKIGVRGLWNDMNEPVVFPDRTFPMDTRH
HHHHCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCHHHHCCCCCCEECCCCCCCCCCCC
EYDGMPCSHEKAHNIYGQCMAEASWLGMKRHAPDRRPFLLSRSGFAGLQRFAATWTGDNR
CCCCCCCCHHHHHHHHHHHHHHHHHHCHHHCCCCCCCEEEECCCHHHHHHHHHEECCCCC
SSWEHLKLANFQCQRLAASGISFAGADAGGFMGHPTPELFCRWMQMASFHGFFRNHSSGE
CCCCCEEECCHHHHHHHHCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCCC
FGGQEPWVFGQEVTSYVKAAIEGRYRLLPYIYTQFRRYAETGMPVLRSLALQCFTNKDTY
CCCCCCEEEHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHCCCCCC
WRGAEYFFGDHLYVIPIHEPQEGGRFLYIPEGVWYSYHTDSLMEDTGKDVWVKCPLSFLP
CCCCCEEECCEEEEEEECCCCCCCEEEEECCCCEEEECCCHHHHHCCCCEEEECCHHHCC
VYVRGGAVIPHWPVQQYVGELPRPPLTLDVWWAPEGEVASHLYEDAGDGYAYRNGECAVH
EEEECCEECCCCCHHHHHHHCCCCCEEEEEEECCCCHHHHHHHHHCCCCCEEECCCEEEE
GFLYRGGTNSLELDWNCEGDPCAFHESAEVVLHGLPAGISVSACMDGVPCCGVMREGRVW
EEEEECCCCCEEEEECCCCCCCCCCCCCEEEEEECCCCCCHHHHHCCCCCCCCCCCCCEE
KIPVKDKFDTLSVCWPEE
EECCCCCCCEEEEECCCC
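
The secondary structure blocks above interleave sequence lines with prediction lines of equal length, where H, E and C presumably denote helix, strand and coil (the usual convention; the record does not define the letters). The sketch below pairs the lines and tallies the states. The function name is hypothetical, and the example uses only the final line pair of the block.

```python
from collections import Counter


def structure_counts(lines: list[str]) -> Counter:
    """Tally H/E/C states from alternating sequence and prediction lines."""
    if len(lines) % 2 != 0:
        raise ValueError("expected alternating sequence/prediction lines")
    counts = Counter({"H": 0, "E": 0, "C": 0})
    for seq, pred in zip(lines[0::2], lines[1::2]):
        if len(seq) != len(pred):
            raise ValueError("sequence and prediction lengths differ")
        counts.update(pred)
    return counts


block = [
    "KIPVKDKFDTLSVCWPEE",
    "EECCCCCCCEEEEECCCC",
]
counts = structure_counts(block)
total = sum(counts.values())
for state in "HEC":
    print(state, counts[state], round(100.0 * counts[state] / total, 1))
```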

PDB accession: NA

Resolution: NA

Structure class: Alpha Beta

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 10945254 [H]