Definition Akkermansia muciniphila ATCC BAA-835, complete genome.
Accession NC_010655
Length 2,664,102

The map label for this gene is betC [H]

Identifier: 187735569

GI number: 187735569

Start: 1282440

End: 1284098

Strand: Reverse

Name: betC [H]

Synonym: Amuc_1074

Alternate gene names: 187735569

Gene position: 1284098-1282440 (Counterclockwise)
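
Because the gene lies on the reverse strand, its coding sequence is the reverse complement of the reference-strand interval 1282440..1284098. The sketch below illustrates that convention, assuming 1-based inclusive coordinates. The function names and the genome argument are hypothetical and not part of this record.

```python
# Sketch: derive a reverse-strand gene from reference-strand coordinates.
# Assumption: coordinates are 1-based and inclusive; all names are illustrative.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gene_length(start: int, end: int) -> int:
    """Length in bases of an inclusive interval."""
    return end - start + 1


def extract_gene(genome: str, start: int, end: int, strand: str) -> str:
    """Slice the interval and flip it when the gene is on the reverse strand."""
    segment = genome[start - 1:end]
    return reverse_complement(segment) if strand == "Reverse" else segment


print(gene_length(1282440, 1284098))  # 1659
```

The computed length of 1659 agrees with the size given in the gene sequence header.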

Preceding gene: 187735571

Following gene: 187735565

Centisome position: 48.2
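
The centisome value appears to be the gene position expressed as a percentage of the genome length. The sketch below assumes that the 5' end of the gene (1284098 on the reverse strand) is the reference point, which reproduces the reported figure. The function name is illustrative.

```python
# Sketch: centisome position as a percentage of the genome length.
# Assumption: the gene's 5' end is used as the reference coordinate.

def centisome(position: int, genome_length: int) -> float:
    """Position as a percentage of the genome, rounded to one decimal."""
    return round(100.0 * position / genome_length, 1)


print(centisome(1284098, 2664102))  # 48.2
```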

GC content: 55.64

Gene sequence:

>1659_bases
ATGAGACCTTTGAAAACCATCATCGCCGGAACTCTGGCCCTGCTGGCGGCAGCTCCCCTCTCAGCTCAAACCAAGGCTGA
GGAAAATAAAAAACCGAATATCCTCTTTATCATTACGGACGACCACGCCTACCAGACGCTGGGCACCGGCAATAATGATT
CCCCCGTGGCCCTGCCCAATTTCAACAAACTGGGACGCCAAGGCATGGTTTTTGACCGCAGCTACTGCGCCAACTCCCTG
TGCGGCCCCTCCCGCGCCTGCATCCTGACCGGCAGGCATTCCCACATGAACGGTTTTGTCTTCAACGGACAAAGACCGCT
GGACGGCTCCCAGCCCACTTACCCGAAAATGCTGCAGAAGGCCGGATACCAGACGGGCCTTTTCGGCAAATGGCATCTGG
AATCGGACCCCACCGGGTTCGACACGTGGGAAATCTTCCCCGGCCAGGGCAGCTACTACAATCCGGACTTTATCAGCCTC
AAGCCGGACGGCAAACGCCAGACAAAGCGTTTTCCCGGATATGCCACGGACGTGGTCACGGACAAATCCATCCAGTGGCT
GGGAAACCGGGACAAGAACAAACCTTTCCTGCTCGTTGTGGGCCACAAGGCTCCCCACCGCGCCTGGTGCCCTGCTCTGC
GCCACCTGGGCAAGGTGGACACTTCCAGCATGACGCCGCCCGCCAACTTCCATGACGACTATGCCAACCGTCCGGAATTC
CTGAAGAAAAACCAGCAGACAGTCGCCAATCACATGGCGATTTATTCCGACCTCAAAGTGCTTAAGGACCAGGTTCCGGA
AGAAATGCGCAAAAGCATCGTTTCCCCCGGTTACGGCTGGGACCTGGGCGAGTTGAACCGCATGACTCCGGAAGAAAAGA
AAACCTGGACGGACTATTACGCCAAGCGCACCAAATCCCTGGTGGACGGCATGAAATCCGGAAAACTGAAGGACCCGAAA
GCGTTTGCGGAATGGAAGTGGCATGCCTACATGGAGGATTATCTGGGATGCCTTCTGTCCGTGGACGACAGCATCGGCCG
CCTTATGGAATATCTGGACAAAGAGGGGATTGCGAAAGACACGCTGGTCATCTACTGCGGAGACCAGGGGTTCTACATGG
GAGAACACGGCATGTACGACAAGCGCTGGATTTTTGAAGAATCCCTCCGCATGCCCCTCATCATGAGATGGCCCGGCAAA
ATTCCCGCGGGCATCCGCAACAACACCATGGTGCAGAATATCGACTACGCTCCCACCATCGTTTCCGCGGCAGGGGCGGA
CACCCCGGAAAACATGAATACCTTCCAGGGCGTATCCCTGCTTCCCACCGCTTTCACGGGCAAAACTCCCGACAACTGGA
GGGATGCCATTTACTACTGTTTTTACGAAAATCCCGGCGAACACAACGCCCCGCGCCACGACGGCATCCGGACGGACCGC
TACACGCTTTCCTACATCTGGACCAGCGACGAATGGATGCTCTTTGACATGAAAAAGGATCCCATGCAAATGAAAAACGT
CATTGACGATCCTGCCTACAAGACTACGGTGGAACAGCTCAAGAAGCGTTACCACGAACTGCGCAAAACCTATAAAGTTC
CGGAAAACAGCCCCGGAGGCAAAGGAACGCCTATCCCCAAATTCGACGCTTCCTGGTAA
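
The GC content reported for this gene can be rechecked from the sequence above as the percentage of G and C bases. This is a minimal sketch under the assumption that the value is rounded to two decimals; the function name is illustrative.

```python
# Sketch: GC content of a DNA sequence as a percentage.
# Whitespace and line breaks (as in the FASTA-style listing) are ignored.

def gc_content(seq: str) -> float:
    """Percentage of G and C bases, rounded to two decimals."""
    seq = "".join(seq.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return round(100.0 * gc / len(seq), 2)


print(gc_content("ATGC"))  # 50.0
```

Passing the full 1659-base sequence as a single string should give a value comparable to the 55.64 listed in the record.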

Upstream 100 bases:

>100_bases
CAGAGGAAAGCATGGGGCAAGAATACCATAACGCAGAATCCCGGCAAAAGGAATCCTATGACATTGACACGGCGCCGTCA
TCCTATATCCTCCAGGCAAT

Downstream 100 bases:

>100_bases
AAGCCACACAGGAAACTACCTTTCCAAGCGCCGGTTTTTCCGGCGCTTTTTTATTTTCGTCCATAGCAGGCCGGAGAGAT
CCGTCACGGAGGCCGAAACA

Product: sulfatase

Products: NA

Alternate protein names: NA

Number of amino acids: Translated: 552; Mature: 552

Protein sequence:

>552_residues
MRPLKTIIAGTLALLAAAPLSAQTKAEENKKPNILFIITDDHAYQTLGTGNNDSPVALPNFNKLGRQGMVFDRSYCANSL
CGPSRACILTGRHSHMNGFVFNGQRPLDGSQPTYPKMLQKAGYQTGLFGKWHLESDPTGFDTWEIFPGQGSYYNPDFISL
KPDGKRQTKRFPGYATDVVTDKSIQWLGNRDKNKPFLLVVGHKAPHRAWCPALRHLGKVDTSSMTPPANFHDDYANRPEF
LKKNQQTVANHMAIYSDLKVLKDQVPEEMRKSIVSPGYGWDLGELNRMTPEEKKTWTDYYAKRTKSLVDGMKSGKLKDPK
AFAEWKWHAYMEDYLGCLLSVDDSIGRLMEYLDKEGIAKDTLVIYCGDQGFYMGEHGMYDKRWIFEESLRMPLIMRWPGK
IPAGIRNNTMVQNIDYAPTIVSAAGADTPENMNTFQGVSLLPTAFTGKTPDNWRDAIYYCFYENPGEHNAPRHDGIRTDR
YTLSYIWTSDEWMLFDMKKDPMQMKNVIDDPAYKTTVEQLKKRYHELRKTYKVPENSPGGKGTPIPKFDASW
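
The protein sequence follows from the gene sequence by codon-wise translation: 1659 bases make 553 codons, the last of which is the TAA stop, leaving 552 residues. The sketch below assumes the standard codon assignments and that the gene sequence is already given in coding orientation (it starts with ATG). The names are illustrative.

```python
# Sketch: translate a coding sequence with the standard codon assignments.
from itertools import product

_BASES = "TCAG"
_AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Codons are generated in TCAG order to match the amino acid string above.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna: str) -> str:
    """Translate codon by codon and stop at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First ten codons of the gene sequence listed in this record.
print(translate("ATGAGACCTTTGAAAACCATCATCGCCGGA"))  # MRPLKTIIAG
```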

Sequences:

>Translated_552_residues
MRPLKTIIAGTLALLAAAPLSAQTKAEENKKPNILFIITDDHAYQTLGTGNNDSPVALPNFNKLGRQGMVFDRSYCANSL
CGPSRACILTGRHSHMNGFVFNGQRPLDGSQPTYPKMLQKAGYQTGLFGKWHLESDPTGFDTWEIFPGQGSYYNPDFISL
KPDGKRQTKRFPGYATDVVTDKSIQWLGNRDKNKPFLLVVGHKAPHRAWCPALRHLGKVDTSSMTPPANFHDDYANRPEF
LKKNQQTVANHMAIYSDLKVLKDQVPEEMRKSIVSPGYGWDLGELNRMTPEEKKTWTDYYAKRTKSLVDGMKSGKLKDPK
AFAEWKWHAYMEDYLGCLLSVDDSIGRLMEYLDKEGIAKDTLVIYCGDQGFYMGEHGMYDKRWIFEESLRMPLIMRWPGK
IPAGIRNNTMVQNIDYAPTIVSAAGADTPENMNTFQGVSLLPTAFTGKTPDNWRDAIYYCFYENPGEHNAPRHDGIRTDR
YTLSYIWTSDEWMLFDMKKDPMQMKNVIDDPAYKTTVEQLKKRYHELRKTYKVPENSPGGKGTPIPKFDASW
>Mature_552_residues
MRPLKTIIAGTLALLAAAPLSAQTKAEENKKPNILFIITDDHAYQTLGTGNNDSPVALPNFNKLGRQGMVFDRSYCANSL
CGPSRACILTGRHSHMNGFVFNGQRPLDGSQPTYPKMLQKAGYQTGLFGKWHLESDPTGFDTWEIFPGQGSYYNPDFISL
KPDGKRQTKRFPGYATDVVTDKSIQWLGNRDKNKPFLLVVGHKAPHRAWCPALRHLGKVDTSSMTPPANFHDDYANRPEF
LKKNQQTVANHMAIYSDLKVLKDQVPEEMRKSIVSPGYGWDLGELNRMTPEEKKTWTDYYAKRTKSLVDGMKSGKLKDPK
AFAEWKWHAYMEDYLGCLLSVDDSIGRLMEYLDKEGIAKDTLVIYCGDQGFYMGEHGMYDKRWIFEESLRMPLIMRWPGK
IPAGIRNNTMVQNIDYAPTIVSAAGADTPENMNTFQGVSLLPTAFTGKTPDNWRDAIYYCFYENPGEHNAPRHDGIRTDR
YTLSYIWTSDEWMLFDMKKDPMQMKNVIDDPAYKTTVEQLKKRYHELRKTYKVPENSPGGKGTPIPKFDASW

Specific function: Converts choline-O-sulfate into choline [H]

COG id: COG3119

COG function: function code P; Arylsulfatase A and related enzymes

Gene ontology:

Cell location: Cytoplasm [C]

Metabolic importance: Unknown [C]

Operon status: Not Known

Operon components: None

Similarity: Belongs to the sulfatase family [H]

Homologues:

Organism=Homo sapiens, GI39930577, Length=506, Percent_Identity=22.7272727272727, Blast_Score=97, Evalue=3e-20,
Organism=Homo sapiens, GI31742482, Length=144, Percent_Identity=34.7222222222222, Blast_Score=83, Evalue=7e-16,
Organism=Homo sapiens, GI4503899, Length=242, Percent_Identity=27.6859504132231, Blast_Score=82, Evalue=1e-15,
Organism=Homo sapiens, GI4504061, Length=207, Percent_Identity=28.5024154589372, Blast_Score=81, Evalue=3e-15,
Organism=Homo sapiens, GI71852586, Length=122, Percent_Identity=39.344262295082, Blast_Score=79, Evalue=2e-14,
Organism=Homo sapiens, GI71852584, Length=122, Percent_Identity=39.344262295082, Blast_Score=78, Evalue=2e-14,
Organism=Homo sapiens, GI58743319, Length=117, Percent_Identity=36.7521367521368, Blast_Score=77, Evalue=4e-14,
Organism=Homo sapiens, GI157266309, Length=140, Percent_Identity=35, Blast_Score=74, Evalue=5e-13,
Organism=Homo sapiens, GI45430057, Length=141, Percent_Identity=36.8794326241135, Blast_Score=74, Evalue=5e-13,
Organism=Homo sapiens, GI59797060, Length=252, Percent_Identity=27.3809523809524, Blast_Score=73, Evalue=6e-13,
Organism=Homo sapiens, GI240255478, Length=216, Percent_Identity=25.9259259259259, Blast_Score=73, Evalue=7e-13,
Organism=Homo sapiens, GI240255483, Length=242, Percent_Identity=25.2066115702479, Blast_Score=73, Evalue=7e-13,
Organism=Homo sapiens, GI29789100, Length=242, Percent_Identity=25.2066115702479, Blast_Score=73, Evalue=7e-13,
Organism=Homo sapiens, GI38569407, Length=209, Percent_Identity=27.2727272727273, Blast_Score=71, Evalue=3e-12,
Organism=Homo sapiens, GI189571643, Length=130, Percent_Identity=33.8461538461538, Blast_Score=71, Evalue=3e-12,
Organism=Homo sapiens, GI189571641, Length=130, Percent_Identity=33.8461538461538, Blast_Score=71, Evalue=3e-12,
Organism=Homo sapiens, GI189571638, Length=130, Percent_Identity=33.8461538461538, Blast_Score=71, Evalue=3e-12,
Organism=Homo sapiens, GI189571636, Length=130, Percent_Identity=33.8461538461538, Blast_Score=71, Evalue=3e-12,
Organism=Homo sapiens, GI38569405, Length=209, Percent_Identity=27.2727272727273, Blast_Score=71, Evalue=3e-12,
Organism=Homo sapiens, GI109389362, Length=201, Percent_Identity=27.8606965174129, Blast_Score=70, Evalue=4e-12,
Organism=Escherichia coli, GI1790112, Length=297, Percent_Identity=24.5791245791246, Blast_Score=76, Evalue=6e-15,
Organism=Escherichia coli, GI1790233, Length=300, Percent_Identity=27, Blast_Score=72, Evalue=7e-14,
Organism=Caenorhabditis elegans, GI17568795, Length=238, Percent_Identity=26.890756302521, Blast_Score=76, Evalue=5e-14,
Organism=Caenorhabditis elegans, GI17559078, Length=112, Percent_Identity=35.7142857142857, Blast_Score=67, Evalue=3e-11,
Organism=Caenorhabditis elegans, GI115533416, Length=212, Percent_Identity=30.188679245283, Blast_Score=66, Evalue=4e-11,
Organism=Drosophila melanogaster, GI19922168, Length=440, Percent_Identity=23.8636363636364, Blast_Score=106, Evalue=4e-23,
Organism=Drosophila melanogaster, GI24653364, Length=440, Percent_Identity=23.8636363636364, Blast_Score=105, Evalue=7e-23,
Organism=Drosophila melanogaster, GI24666175, Length=218, Percent_Identity=30.2752293577982, Blast_Score=84, Evalue=2e-16,
Organism=Drosophila melanogaster, GI281366397, Length=204, Percent_Identity=29.9019607843137, Blast_Score=77, Evalue=2e-14,
Organism=Drosophila melanogaster, GI281366395, Length=204, Percent_Identity=29.9019607843137, Blast_Score=77, Evalue=2e-14,
Organism=Drosophila melanogaster, GI24666163, Length=204, Percent_Identity=29.9019607843137, Blast_Score=77, Evalue=2e-14,
Organism=Drosophila melanogaster, GI281363223, Length=200, Percent_Identity=30, Blast_Score=77, Evalue=3e-14,
Organism=Drosophila melanogaster, GI24666109, Length=217, Percent_Identity=30.8755760368664, Blast_Score=76, Evalue=6e-14,
Organism=Drosophila melanogaster, GI24647401, Length=130, Percent_Identity=33.8461538461538, Blast_Score=73, Evalue=4e-13,
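
Each homologue line above is a comma-separated list of key=value fields, with the GI number given as a bare GI-prefixed token and a trailing comma at the end. A minimal parsing sketch, assuming exactly this layout; the function name and the returned dictionary keys are illustrative.

```python
# Sketch: parse one homologue line from this record into a dictionary.
# Assumption: fields are separated by commas and the GI token has no "=".

def parse_homologue(line: str) -> dict:
    record = {}
    for field in line.strip().rstrip(",").split(","):
        field = field.strip()
        if not field:
            continue
        if "=" in field:
            key, value = field.split("=", 1)
            record[key] = value
        elif field.startswith("GI"):
            record["GI"] = field[2:]
    # Convert the numeric fields when they are present.
    for key in ("Length", "Blast_Score"):
        if key in record:
            record[key] = int(record[key])
    for key in ("Percent_Identity", "Evalue"):
        if key in record:
            record[key] = float(record[key])
    return record


hit = parse_homologue(
    "Organism=Escherichia coli, GI1790112, Length=297, "
    "Percent_Identity=24.5791245791246, Blast_Score=76, Evalue=6e-15,"
)
print(hit["Organism"], hit["GI"], hit["Length"])  # Escherichia coli 1790112 297
```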

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR017849
- InterPro:   IPR017850
- InterPro:   IPR017785
- InterPro:   IPR000917 [H]

Pfam domain/function: PF00884 Sulfatase [H]

EC number: 3.1.6.6 [H]

Molecular weight: Translated: 62820; Mature: 62820
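
A molecular weight of this kind is commonly estimated by summing average residue masses and adding one water molecule for the free termini. The sketch below uses a commonly tabulated set of average masses; this is an assumption, since the record does not state its method, so the result may differ slightly from the listed value.

```python
# Sketch: approximate protein molecular weight in daltons.
# Assumption: average residue masses (amino acid minus water) plus one water.

RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def molecular_weight(protein: str) -> float:
    """Sum of residue masses plus one water for the N- and C-termini."""
    protein = "".join(protein.split()).upper()
    if not protein:
        raise ValueError("empty sequence")
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER


print(round(molecular_weight("GG"), 3))  # 132.119
```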

Theoretical pI: Translated: 8.81; Mature: 8.81

Prosite motif: PS00523 SULFATASE_1

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

1.3 %Cys     (Translated Protein)
3.8 %Met     (Translated Protein)
5.1 %Cys+Met (Translated Protein)
1.3 %Cys     (Mature Protein)
3.8 %Met     (Mature Protein)
5.1 %Cys+Met (Mature Protein)
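
The Cys/Met percentages are residue counts divided by the protein length. A minimal sketch, assuming rounding to one decimal; the function name is illustrative.

```python
# Sketch: percentage of selected residues in a protein sequence.

def residue_percent(protein: str, residues: str) -> float:
    """Percentage of the given one-letter residues, rounded to one decimal."""
    protein = "".join(protein.split()).upper()
    if not protein:
        raise ValueError("empty sequence")
    count = sum(protein.count(r) for r in residues.upper())
    return round(100.0 * count / len(protein), 1)


print(residue_percent("MCAA", "C"))   # 25.0
print(residue_percent("MCAA", "CM"))  # 50.0
```

For a 552-residue protein, 7 cysteines and 21 methionines would give the listed 1.3 %, 3.8 % and 5.1 %.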

Secondary structure:

>Translated Secondary Structure
MRPLKTIIAGTLALLAAAPLSAQTKAEENKKPNILFIITDDHAYQTLGTGNNDSPVALPN
CCHHHHHHHHHHHHHHHCCCCCHHHHHCCCCCCEEEEEECCCCEEECCCCCCCCCEECCC
FNKLGRQGMVFDRSYCANSLCGPSRACILTGRHSHMNGFVFNGQRPLDGSQPTYPKMLQK
HHHHCCCCCEEEHHHHHHCCCCCCCEEEEECCCCCCCCEEECCCCCCCCCCCCHHHHHHH
AGYQTGLFGKWHLESDPTGFDTWEIFPGQGSYYNPDFISLKPDGKRQTKRFPGYATDVVT
CCCCCCCCEEEECCCCCCCCCCEEEECCCCCCCCCCEEEECCCCCHHHHCCCCCHHHHCC
DKSIQWLGNRDKNKPFLLVVGHKAPHRAWCPALRHLGKVDTSSMTPPANFHDDYANRPEF
CCCHHCCCCCCCCCCEEEEECCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHH
LKKNQQTVANHMAIYSDLKVLKDQVPEEMRKSIVSPGYGWDLGELNRMTPEEKKTWTDYY
HHHHHHHHHHHHHHHHHHHHHHHHCHHHHHHHHCCCCCCCCHHHHCCCCCHHHHHHHHHH
AKRTKSLVDGMKSGKLKDPKAFAEWKWHAYMEDYLGCLLSVDDSIGRLMEYLDKEGIAKD
HHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHCCCCCC
TLVIYCGDQGFYMGEHGMYDKRWIFEESLRMPLIMRWPGKIPAGIRNNTMVQNIDYAPTI
EEEEEECCCCEECCCCCCCCCHHHHHHHCCCCEEEECCCCCCCCCCCCCEEECCCCCCHH
VSAAGADTPENMNTFQGVSLLPTAFTGKTPDNWRDAIYYCFYENPGEHNAPRHDGIRTDR
HECCCCCCCCCCHHHCCCEECCCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCCC
YTLSYIWTSDEWMLFDMKKDPMQMKNVIDDPAYKTTVEQLKKRYHELRKTYKVPENSPGG
EEEEEEEECCCEEEEECCCCHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCCC
KGTPIPKFDASW
CCCCCCCCCCCC
>Mature Secondary Structure
MRPLKTIIAGTLALLAAAPLSAQTKAEENKKPNILFIITDDHAYQTLGTGNNDSPVALPN
CCHHHHHHHHHHHHHHHCCCCCHHHHHCCCCCCEEEEEECCCCEEECCCCCCCCCEECCC
FNKLGRQGMVFDRSYCANSLCGPSRACILTGRHSHMNGFVFNGQRPLDGSQPTYPKMLQK
HHHHCCCCCEEEHHHHHHCCCCCCCEEEEECCCCCCCCEEECCCCCCCCCCCCHHHHHHH
AGYQTGLFGKWHLESDPTGFDTWEIFPGQGSYYNPDFISLKPDGKRQTKRFPGYATDVVT
CCCCCCCCEEEECCCCCCCCCCEEEECCCCCCCCCCEEEECCCCCHHHHCCCCCHHHHCC
DKSIQWLGNRDKNKPFLLVVGHKAPHRAWCPALRHLGKVDTSSMTPPANFHDDYANRPEF
CCCHHCCCCCCCCCCEEEEECCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHH
LKKNQQTVANHMAIYSDLKVLKDQVPEEMRKSIVSPGYGWDLGELNRMTPEEKKTWTDYY
HHHHHHHHHHHHHHHHHHHHHHHHCHHHHHHHHCCCCCCCCHHHHCCCCCHHHHHHHHHH
AKRTKSLVDGMKSGKLKDPKAFAEWKWHAYMEDYLGCLLSVDDSIGRLMEYLDKEGIAKD
HHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHCCCCCC
TLVIYCGDQGFYMGEHGMYDKRWIFEESLRMPLIMRWPGKIPAGIRNNTMVQNIDYAPTI
EEEEEECCCCEECCCCCCCCCHHHHHHHCCCCEEEECCCCCCCCCCCCCEEECCCCCCHH
VSAAGADTPENMNTFQGVSLLPTAFTGKTPDNWRDAIYYCFYENPGEHNAPRHDGIRTDR
HECCCCCCCCCCHHHCCCEECCCCCCCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCCCC
YTLSYIWTSDEWMLFDMKKDPMQMKNVIDDPAYKTTVEQLKKRYHELRKTYKVPENSPGG
EEEEEEEECCCEEEEECCCCHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCCC
KGTPIPKFDASW
CCCCCCCCCCCC
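
The secondary structure strings pair each residue with one state letter. Assuming the usual three-state convention (H for helix, E for strand, C for coil), which the record does not spell out, the composition can be summarised as follows. The function name is illustrative.

```python
# Sketch: three-state composition of a predicted secondary structure string.
# Assumption: H = helix, E = strand, C = coil.
from collections import Counter


def state_composition(states: str) -> dict:
    """Percentage of each state letter, rounded to one decimal."""
    states = "".join(states.split()).upper()
    if not states:
        raise ValueError("empty prediction")
    counts = Counter(states)
    return {s: round(100.0 * n / len(states), 1) for s, n in sorted(counts.items())}


print(state_composition("HHEC"))  # {'C': 25.0, 'E': 25.0, 'H': 50.0}
```

Only the state lines should be concatenated and passed in, not the interleaved residue lines.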

PDB accession: NA

Resolution: NA

Structure class: Unstructured

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 9141699; 11481430; 9736747 [H]