Definition | Akkermansia muciniphila ATCC BAA-835, complete genome. |
---|---|
Accession | NC_010655 |
Length | 2,664,102 |
The map label for this gene is betC [H]
Identifier: 187735528
GI number: 187735528
Start: 1226293
End: 1228038
Strand: Reverse
Name: betC [H]
Synonym: Amuc_1033
Alternate gene names: 187735528
Gene position: 1228038-1226293 (Counterclockwise)
Preceding gene: 187735529
Following gene: 187735527
Centisome position: 46.1
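The coordinate fields above can be cross-checked with a short calculation. This is a minimal sketch: it assumes the centisome position is the gene's 5' coordinate expressed as a percentage of the genome length (the record does not define it), and the variable names are illustrative.

```python
# Values copied from the record above.
GENOME_LENGTH = 2_664_102
START, END = 1_226_293, 1_228_038
STRAND = "Reverse"

# Coordinates are inclusive, so the gene spans END - START + 1 bases.
gene_length = END - START + 1

# Assumption: for a reverse-strand gene the 5' end is the higher coordinate.
five_prime = END if STRAND == "Reverse" else START
centisome = 100 * five_prime / GENOME_LENGTH

print(gene_length)          # 1746
print(round(centisome, 1))  # 46.1
```

Under that assumption both results agree with the reported gene length (1746 bases) and centisome position (46.1).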
GC content: 55.96
Gene sequence:
>1746_bases ATGAACCGTCATGCCGCCACCGCCCTGATGCTTGCGGCCTGTTCCCTGTCCGCTTCCGCGGACCAGCCTCAAAAGCAGAC GCCGGACCAGCGCCCCAACATCGTGGTCATTGTCACCGACGACCATTCCTACCAGACCCTGGGCACCTGTGAAAAGGATT CTCCCATGCCTTATCCGAACTTCCGCAAACTGGCGGACGAAGGCATGGTCTTTGACCGGAGCTACTGCGCCAACTCCCTG TGCGGACCTTCGCGGGCCTGTATTTACACTGGCCGCCATTCCCACATGAACGGGTACCTCTTCAACGAACATGCGGCTCC CTTTGACGGTTCCCAGCCCACTTTCCCGAAAATGCTCCAAAAGGCCGGTTACCAGACGGCTATTGTCGGCAAGTGGCACC TGGAAGCCATTCCGCCGGGCGCCAAGGGAGATACGTCCAAATATGAATCCGACCCCACCGGATTCGATTACTGGGAAATT TTCCCCGGCCAGGGCAACTATTTCAATCCGGATTTCATCACTCCCGGCAAGGACGGCAAACGCGTGGTGAAAACGGAGCC CGGCTATGCCACGGAACTGGTTACGCAAAAAAGCCTCAAATGGCTGGACCAGAGGGACAAGAACAAACCCTTCATGCTCG TCGTGGGCCACAAGGCACCCCACCGTTGCTGGTGCCCCTCCATTCAGAATCTGGGCCGCGCCAAACAGTATGCGGACGCC ATTGACCCGCCCGCCAATCTGGAAGACGATTTTGCAGACCGCCCGGAATTCCTGAAAATGACGGAACAAACCCTGCTCAA CCATTTCAACGTATGGTCTGACGAACACCTGATCAAGGAGGTGGTCCCCGAAGACATCCAGAAAATGCTTTCCTGCCCGG AATCCAAGACCCTGCATACTCAGTATGACTGGGAAATGCCGGAATGGGTGCGCATGGACCCGCAGCAGAAGGAAGCCTGG TACAACTACCACAAGGCCCGTACCGTACAGCTTGTCAAAGATATTAAAAACGGGAAAATCAAAACGCAGCGCGACATTTT GCTGCGCCGCTGGCGCCATTATATGGAAGACTATCTGGGCACCGTTCTTTCCGTGGACGAAAGCATCGGCCAGATCATGG ACTATCTGAAACAAAACGGTCTGGACAGGAATACGCTGGTGCTCTACTGCGGAGACCAGGGATTCTACATGGGCGAACAC GGCCTGTACGACAAACGCTGGATTTTCGAAGAATCCTTCCGCATGCCCCTCATCATGAGATGGCCGGGCCACATCAGGCC GGGCGTGCGCTCCTCCGCCATGGTGCAGGAACTGGATTATGCTCCCACTTTCTGCGACGTGGCCGGGGTAAATACCAAGG AAAATATGAATACCTTCCAGGGCCGCAGCCTCACTCCCCTGTTCAAGACCGGAGAACATCAGGATTTCAAAAACCGTTCT CTTTACTACGCCTTTTACGAAAATCCGGGTGAACACAACGCTCCGCGCCATGATGGCCTGCGCACGGACCGCTACACGCT GTCCTATATCTGGACCAGCGACGAATGGATGCTCTTTGACAACCAGAAGGACCCGGCCCAAATGCACAACGTCATCAACA AACCGGAATATGCGGAAACCGTGAAAGAACTCAAAGCCCTGTACGGCAAGCTCCGCAAAGACTACCAGGTACCGGAAGGC TTCCCCGGGGCCACCGGCAAACTGGCCGTCAAGCCGCAGTGGGACTGCGCTCCCTCCAGAGATTGA
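GC content is the percentage of G and C bases in a sequence. The sketch below shows one way to compute it; `gc_content` is a hypothetical helper, and it is demonstrated only on the first 30 bases of the gene sequence above, not on the full 1746 bases.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    # Drop the spaces and line breaks that are used only for display.
    seq = "".join(seq.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(base in "GC" for base in seq)
    return 100 * gc / len(seq)

# First 30 bases of the gene sequence above (demonstration only).
fragment = "ATGAACCGTCATGCCGCCACCGCCCTGATG"
print(round(gc_content(fragment), 2))
```

Passing the complete gene sequence to the same function should give a value comparable to the GC content field, provided the database uses the same definition.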
Upstream 100 bases:
>100_bases CGGGGATGTGGTCTTTTCCGGACCGGCAGGAATGCCCTTGCGGACAAATTCTGCCGCATCCCCCTCCAACCATTCGCCTG TTTAATTCATGATTTCAACA
Downstream 100 bases:
>100_bases CGCCGCCATTCACCGTTCCTCCCGAAACACGGTTGTCCGGAACACGCCATTGACGACGGAGGAACAGGGTTTAAATCCTC ACACAAACAACCATTACCAT
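Because the gene lies on the reverse strand, the displayed sequences are presumably the reverse complement of the forward genomic strand, with "upstream" at higher coordinates than the gene. The sketch below illustrates that convention; it is an assumption about how the record was built, and `reverse_complement` and `flanking_ranges` are hypothetical helpers.

```python
# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def flanking_ranges(start: int, end: int, strand: str, width: int = 100):
    """Return (upstream, downstream) as inclusive forward-strand ranges."""
    left = (start - width, start - 1)
    right = (end + 1, end + width)
    # On the reverse strand, upstream of the gene lies at higher coordinates.
    return (right, left) if strand == "Reverse" else (left, right)

upstream, downstream = flanking_ranges(1_226_293, 1_228_038, "Reverse")
print(upstream)    # (1228039, 1228138)
print(downstream)  # (1226193, 1226292)
print(reverse_complement("ATGAACCGT"))  # ACGGTTCAT
```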
Product: sulfatase
Products: NA
Alternate protein names: NA
Number of amino acids: Translated: 581; Mature: 581
Protein sequence:
>581_residues MNRHAATALMLAACSLSASADQPQKQTPDQRPNIVVIVTDDHSYQTLGTCEKDSPMPYPNFRKLADEGMVFDRSYCANSL CGPSRACIYTGRHSHMNGYLFNEHAAPFDGSQPTFPKMLQKAGYQTAIVGKWHLEAIPPGAKGDTSKYESDPTGFDYWEI FPGQGNYFNPDFITPGKDGKRVVKTEPGYATELVTQKSLKWLDQRDKNKPFMLVVGHKAPHRCWCPSIQNLGRAKQYADA IDPPANLEDDFADRPEFLKMTEQTLLNHFNVWSDEHLIKEVVPEDIQKMLSCPESKTLHTQYDWEMPEWVRMDPQQKEAW YNYHKARTVQLVKDIKNGKIKTQRDILLRRWRHYMEDYLGTVLSVDESIGQIMDYLKQNGLDRNTLVLYCGDQGFYMGEH GLYDKRWIFEESFRMPLIMRWPGHIRPGVRSSAMVQELDYAPTFCDVAGVNTKENMNTFQGRSLTPLFKTGEHQDFKNRS LYYAFYENPGEHNAPRHDGLRTDRYTLSYIWTSDEWMLFDNQKDPAQMHNVINKPEYAETVKELKALYGKLRKDYQVPEG FPGATGKLAVKPQWDCAPSRD
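The protein length follows from the gene length: 1746 bases give 582 codons, and the final codon (TGA) is a stop, leaving 581 residues. The sketch below translates two in-frame fragments of the gene sequence, assuming the standard codon assignments; `translate` is a hypothetical helper, not the tool used to build this record.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}

def translate(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First 30 and last 12 bases of the gene sequence in this record.
print(translate("ATGAACCGTCATGCCGCCACCGCCCTGATG"))  # MNRHAATALM
print(translate("TCCAGAGATTGA"))                    # SRD
print(1746 // 3 - 1)                                # 581
```

Both fragments match the start (MNRHAATALM) and end (SRD) of the protein sequence above.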
Sequences:
>Translated_581_residues MNRHAATALMLAACSLSASADQPQKQTPDQRPNIVVIVTDDHSYQTLGTCEKDSPMPYPNFRKLADEGMVFDRSYCANSL CGPSRACIYTGRHSHMNGYLFNEHAAPFDGSQPTFPKMLQKAGYQTAIVGKWHLEAIPPGAKGDTSKYESDPTGFDYWEI FPGQGNYFNPDFITPGKDGKRVVKTEPGYATELVTQKSLKWLDQRDKNKPFMLVVGHKAPHRCWCPSIQNLGRAKQYADA IDPPANLEDDFADRPEFLKMTEQTLLNHFNVWSDEHLIKEVVPEDIQKMLSCPESKTLHTQYDWEMPEWVRMDPQQKEAW YNYHKARTVQLVKDIKNGKIKTQRDILLRRWRHYMEDYLGTVLSVDESIGQIMDYLKQNGLDRNTLVLYCGDQGFYMGEH GLYDKRWIFEESFRMPLIMRWPGHIRPGVRSSAMVQELDYAPTFCDVAGVNTKENMNTFQGRSLTPLFKTGEHQDFKNRS LYYAFYENPGEHNAPRHDGLRTDRYTLSYIWTSDEWMLFDNQKDPAQMHNVINKPEYAETVKELKALYGKLRKDYQVPEG FPGATGKLAVKPQWDCAPSRD >Mature_581_residues MNRHAATALMLAACSLSASADQPQKQTPDQRPNIVVIVTDDHSYQTLGTCEKDSPMPYPNFRKLADEGMVFDRSYCANSL CGPSRACIYTGRHSHMNGYLFNEHAAPFDGSQPTFPKMLQKAGYQTAIVGKWHLEAIPPGAKGDTSKYESDPTGFDYWEI FPGQGNYFNPDFITPGKDGKRVVKTEPGYATELVTQKSLKWLDQRDKNKPFMLVVGHKAPHRCWCPSIQNLGRAKQYADA IDPPANLEDDFADRPEFLKMTEQTLLNHFNVWSDEHLIKEVVPEDIQKMLSCPESKTLHTQYDWEMPEWVRMDPQQKEAW YNYHKARTVQLVKDIKNGKIKTQRDILLRRWRHYMEDYLGTVLSVDESIGQIMDYLKQNGLDRNTLVLYCGDQGFYMGEH GLYDKRWIFEESFRMPLIMRWPGHIRPGVRSSAMVQELDYAPTFCDVAGVNTKENMNTFQGRSLTPLFKTGEHQDFKNRS LYYAFYENPGEHNAPRHDGLRTDRYTLSYIWTSDEWMLFDNQKDPAQMHNVINKPEYAETVKELKALYGKLRKDYQVPEG FPGATGKLAVKPQWDCAPSRD
Specific function: Converts choline-O-sulfate into choline [H]
COG id: COG3119
COG function: function code P; Arylsulfatase A and related enzymes
Gene ontology:
Cell location: Cytoplasmic
Metabolic importance: Non Essential [C]
Operon status: Not Known
Operon components: None
Similarity: Belongs to the sulfatase family [H]
Homologues:
Organism=Homo sapiens, GI4503899, Length=263, Percent_Identity=30.7984790874525, Blast_Score=90, Evalue=6e-18
Organism=Homo sapiens, GI4504061, Length=210, Percent_Identity=31.4285714285714, Blast_Score=89, Evalue=9e-18
Organism=Homo sapiens, GI45430057, Length=143, Percent_Identity=37.0629370629371, Blast_Score=81, Evalue=3e-15
Organism=Homo sapiens, GI71852586, Length=141, Percent_Identity=34.7517730496454, Blast_Score=80, Evalue=4e-15
Organism=Homo sapiens, GI71852584, Length=141, Percent_Identity=34.7517730496454, Blast_Score=79, Evalue=9e-15
Organism=Homo sapiens, GI58743319, Length=119, Percent_Identity=36.9747899159664, Blast_Score=75, Evalue=2e-13
Organism=Homo sapiens, GI53831991, Length=161, Percent_Identity=30.4347826086957, Blast_Score=74, Evalue=6e-13
Organism=Homo sapiens, GI157266309, Length=136, Percent_Identity=31.6176470588235, Blast_Score=73, Evalue=7e-13
Organism=Homo sapiens, GI31742482, Length=164, Percent_Identity=30.4878048780488, Blast_Score=71, Evalue=3e-12
Organism=Homo sapiens, GI39930577, Length=265, Percent_Identity=27.1698113207547, Blast_Score=70, Evalue=4e-12
Organism=Homo sapiens, GI109389362, Length=213, Percent_Identity=26.7605633802817, Blast_Score=67, Evalue=4e-11
Organism=Homo sapiens, GI59797060, Length=206, Percent_Identity=30.0970873786408, Blast_Score=66, Evalue=9e-11
Organism=Escherichia coli, GI87081924, Length=258, Percent_Identity=27.906976744186, Blast_Score=79, Evalue=6e-16
Organism=Escherichia coli, GI1790112, Length=247, Percent_Identity=26.3157894736842, Blast_Score=78, Evalue=2e-15
Organism=Escherichia coli, GI1790233, Length=291, Percent_Identity=25.085910652921, Blast_Score=64, Evalue=2e-11
Organism=Caenorhabditis elegans, GI115533416, Length=223, Percent_Identity=31.390134529148, Blast_Score=80, Evalue=4e-15
Organism=Caenorhabditis elegans, GI17568795, Length=203, Percent_Identity=27.5862068965517, Blast_Score=72, Evalue=9e-13
Organism=Caenorhabditis elegans, GI17559078, Length=116, Percent_Identity=37.9310344827586, Blast_Score=71, Evalue=2e-12
Organism=Caenorhabditis elegans, GI115533418, Length=197, Percent_Identity=31.4720812182741, Blast_Score=68, Evalue=1e-11
Organism=Drosophila melanogaster, GI24666109, Length=208, Percent_Identity=29.3269230769231, Blast_Score=79, Evalue=9e-15
Organism=Drosophila melanogaster, GI24666175, Length=203, Percent_Identity=28.5714285714286, Blast_Score=79, Evalue=1e-14
Organism=Drosophila melanogaster, GI281363223, Length=220, Percent_Identity=27.2727272727273, Blast_Score=73, Evalue=5e-13
Organism=Drosophila melanogaster, GI24647401, Length=115, Percent_Identity=36.5217391304348, Blast_Score=73, Evalue=5e-13
Organism=Drosophila melanogaster, GI281366397, Length=235, Percent_Identity=24.2553191489362, Blast_Score=70, Evalue=6e-12
Organism=Drosophila melanogaster, GI281366395, Length=235, Percent_Identity=24.2553191489362, Blast_Score=70, Evalue=6e-12
Organism=Drosophila melanogaster, GI24666163, Length=235, Percent_Identity=24.2553191489362, Blast_Score=70, Evalue=6e-12
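Each homologue entry carries the same six fields (organism, GI, alignment length, percent identity, BLAST score, E-value), so the list can be parsed mechanically. The following is a minimal sketch using a regular expression; `HIT` and `parse_hits` are hypothetical names, and the sample holds two entries copied from the list above.

```python
import re

# One homologue entry: six comma-separated fields in a fixed order.
HIT = re.compile(
    r"Organism=(?P<organism>[^,]+),\s*GI(?P<gi>\d+),\s*Length=(?P<length>\d+),"
    r"\s*Percent_Identity=(?P<identity>[\d.]+),\s*Blast_Score=(?P<score>\d+),"
    r"\s*Evalue=(?P<evalue>[\d.eE+-]+)"
)

def parse_hits(text: str) -> list:
    """Parse homologue entries into a list of typed dictionaries."""
    hits = []
    for m in HIT.finditer(text):
        hits.append({
            "organism": m["organism"].strip(),
            "gi": m["gi"],
            "length": int(m["length"]),
            "identity": float(m["identity"]),
            "score": int(m["score"]),
            "evalue": float(m["evalue"]),
        })
    return hits

sample = (
    "Organism=Homo sapiens, GI4503899, Length=263, "
    "Percent_Identity=30.7984790874525, Blast_Score=90, Evalue=6e-18, "
    "Organism=Escherichia coli, GI87081924, Length=258, "
    "Percent_Identity=27.906976744186, Blast_Score=79, Evalue=6e-16,"
)
hits = parse_hits(sample)
best = max(hits, key=lambda h: h["score"])
print(best["organism"], best["gi"], round(best["identity"], 1))
```

The pattern tolerates both comma-separated and line-separated entries because whitespace between fields is matched with `\s*`.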
Paralogues:
None
Copy number: NA
Swissprot (AC and ID): NA
Other databases:
- InterPro: IPR017849
- InterPro: IPR017850
- InterPro: IPR017785
- InterPro: IPR000917 [H]
Pfam domain/function: PF00884 Sulfatase [H]
EC number: 3.1.6.6 [H]
Molecular weight: Translated: 66999; Mature: 66999
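A molecular weight of this kind is typically estimated by summing average residue masses and adding one water molecule for the free termini. The sketch below uses commonly tabulated average residue masses; the mass table used by this database is not stated, so the result for the full protein may differ slightly from the reported value. `molecular_weight` is a hypothetical helper.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the N- and C-terminal groups

def molecular_weight(protein: str) -> float:
    """Estimate the average molecular weight of an unmodified protein."""
    protein = "".join(protein.split()).upper()
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER

# Sanity check on diglycine, whose molecular weight is about 132.12 Da.
print(round(molecular_weight("GG"), 2))  # 132.12
```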
Theoretical pI: Translated: 6.82; Mature: 6.82
Prosite motif: PS00149 SULFATASE_2
Important sites: NA
Signals:
None
Transmembrane regions:
None
Cys/Met content:
1.9 %Cys (Translated Protein)
3.4 %Met (Translated Protein)
5.3 %Cys+Met (Translated Protein)
1.9 %Cys (Mature Protein)
3.4 %Met (Mature Protein)
5.3 %Cys+Met (Mature Protein)
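The Cys/Met percentages are residue counts divided by protein length. A minimal sketch of that calculation follows, assuming values are rounded to one decimal place as in this record; `cys_met_content` is a hypothetical helper, demonstrated on the first 20 residues of the protein sequence only.

```python
def cys_met_content(protein: str) -> dict:
    """Return Cys, Met and combined content as percentages of protein length."""
    protein = "".join(protein.split()).upper()
    if not protein:
        raise ValueError("empty sequence")
    n = len(protein)
    cys = 100 * protein.count("C") / n
    met = 100 * protein.count("M") / n
    return {
        "Cys": round(cys, 1),
        "Met": round(met, 1),
        "Cys+Met": round(cys + met, 1),
    }

# First 20 residues of the protein sequence in this record (demonstration only).
fragment = "MNRHAATALMLAACSLSASA"
print(cys_met_content(fragment))  # {'Cys': 5.0, 'Met': 10.0, 'Cys+Met': 15.0}
```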
Secondary structure:
>Translated Secondary Structure MNRHAATALMLAACSLSASADQPQKQTPDQRPNIVVIVTDDHSYQTLGTCEKDSPMPYPN CCCHHHHHHHHHHHHCCCCCCCCHHCCCCCCCCEEEEEECCCCCCCCCCCCCCCCCCCCH FRKLADEGMVFDRSYCANSLCGPSRACIYTGRHSHMNGYLFNEHAAPFDGSQPTFPKMLQ HHHHHHCCCEEEHHHHHHCCCCCCCEEEEECCCCCCCCEEECCCCCCCCCCCCCHHHHHH KAGYQTAIVGKWHLEAIPPGAKGDTSKYESDPTGFDYWEIFPGQGNYFNPDFITPGKDGK HCCCCEEEEEEEEEEECCCCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCC RVVKTEPGYATELVTQKSLKWLDQRDKNKPFMLVVGHKAPHRCWCPSIQNLGRAKQYADA EEEECCCCHHHHHHHHHHHHHHHHHCCCCCEEEEECCCCCCCCCCCCHHHHHHHHHHHHH IDPPANLEDDFADRPEFLKMTEQTLLNHFNVWSDEHLIKEVVPEDIQKMLSCPESKTLHT CCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHCCCCCCCEEE QYDWEMPEWVRMDPQQKEAWYNYHKARTVQLVKDIKNGKIKTQRDILLRRWRHYMEDYLG EECCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHH TVLSVDESIGQIMDYLKQNGLDRNTLVLYCGDQGFYMGEHGLYDKRWIFEESFRMPLIMR HHHHHHHHHHHHHHHHHHCCCCCCEEEEEECCCCCCCCCCCCCCCHHHHHHCCCCCEEEE WPGHIRPGVRSSAMVQELDYAPTFCDVAGVNTKENMNTFQGRSLTPLFKTGEHQDFKNRS CCCCCCCCCCHHHHHHHHCCCCCCHHHCCCCCHHCCCCCCCCCCCHHHHCCCCCCCCCCC LYYAFYENPGEHNAPRHDGLRTDRYTLSYIWTSDEWMLFDNQKDPAQMHNVINKPEYAET EEEEEECCCCCCCCCCCCCCCCCCEEEEEEEECCCEEEECCCCCHHHHHHHHCCCHHHHH VKELKALYGKLRKDYQVPEGFPGATGKLAVKPQWDCAPSRD HHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCC >Mature Secondary Structure MNRHAATALMLAACSLSASADQPQKQTPDQRPNIVVIVTDDHSYQTLGTCEKDSPMPYPN CCCHHHHHHHHHHHHCCCCCCCCHHCCCCCCCCEEEEEECCCCCCCCCCCCCCCCCCCCH FRKLADEGMVFDRSYCANSLCGPSRACIYTGRHSHMNGYLFNEHAAPFDGSQPTFPKMLQ HHHHHHCCCEEEHHHHHHCCCCCCCEEEEECCCCCCCCEEECCCCCCCCCCCCCHHHHHH KAGYQTAIVGKWHLEAIPPGAKGDTSKYESDPTGFDYWEIFPGQGNYFNPDFITPGKDGK HCCCCEEEEEEEEEEECCCCCCCCCCCCCCCCCCCCEEEEECCCCCCCCCCCCCCCCCCC RVVKTEPGYATELVTQKSLKWLDQRDKNKPFMLVVGHKAPHRCWCPSIQNLGRAKQYADA EEEECCCCHHHHHHHHHHHHHHHHHCCCCCEEEEECCCCCCCCCCCCHHHHHHHHHHHHH IDPPANLEDDFADRPEFLKMTEQTLLNHFNVWSDEHLIKEVVPEDIQKMLSCPESKTLHT CCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHCCCCCCCEEE QYDWEMPEWVRMDPQQKEAWYNYHKARTVQLVKDIKNGKIKTQRDILLRRWRHYMEDYLG EECCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHH 
TVLSVDESIGQIMDYLKQNGLDRNTLVLYCGDQGFYMGEHGLYDKRWIFEESFRMPLIMR HHHHHHHHHHHHHHHHHHCCCCCCEEEEEECCCCCCCCCCCCCCCHHHHHHCCCCCEEEE WPGHIRPGVRSSAMVQELDYAPTFCDVAGVNTKENMNTFQGRSLTPLFKTGEHQDFKNRS CCCCCCCCCCHHHHHHHHCCCCCCHHHCCCCCHHCCCCCCCCCCCHHHHCCCCCCCCCCC LYYAFYENPGEHNAPRHDGLRTDRYTLSYIWTSDEWMLFDNQKDPAQMHNVINKPEYAET EEEEEECCCCCCCCCCCCCCCCCCEEEEEEEECCCEEEECCCCCHHHHHHHHCCCHHHHH VKELKALYGKLRKDYQVPEGFPGATGKLAVKPQWDCAPSRD HHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCCCCCC
PDB accession: NA
Resolution: NA
Structure class: Unstructured
Cofactors: NA
Metal ions: NA
Kcat value (1/min): NA
Specific activity: NA
Km value (mM): NA
Substrates: NA
Specific reaction: NA
General reaction: NA
Inhibitor: NA
Structure determination priority: 9.0
TargetDB status: NA
Availability: NA
References: 9141699; 11481430; 9736747 [H]