Definition | Burkholderia mallei NCTC 10247 chromosome II, complete genome. |
---|---|
Accession | NC_009079 |
Length | 2,352,693 |
The map label for this gene is betC [H]
Identifier: 126446117
GI number: 126446117
Start: 1889171
End: 1890739
Strand: Reverse
Name: betC [H]
Synonym: BMA10247_A1966
Alternate gene names: 126446117
Gene position: 1890739-1889171 (Counterclockwise)
Preceding gene: 126447271
Following gene: 126447171
Centisome position: 80.36
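The gene length and centisome position can be reproduced from the coordinates above. This is a minimal sketch, assuming the centisome position is the gene's 5' coordinate (1890739 on the reverse strand) expressed as a percentage of the replicon length. The function names are illustrative and not part of the database.

```python
def gene_length(start: int, end: int) -> int:
    """Length in bases of a feature given 1-based, inclusive coordinates."""
    return end - start + 1


def centisome(position: int, replicon_length: int) -> float:
    """Position as a percentage of the replicon length (assumed definition)."""
    return 100.0 * position / replicon_length


# Coordinates taken from this record.
length = gene_length(1889171, 1890739)
position_pct = round(centisome(1890739, 2352693), 2)
print(length, position_pct)
```

Under this assumption the results agree with the 1569 bases and the centisome position of 80.36 listed in this record.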
GC content: 68.64
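GC content is conventionally the percentage of G and C bases in the sequence. The sketch below shows one way to compute it from a FASTA-style string such as the gene sequence in this record. It assumes only unambiguous A/C/G/T bases and ignores whitespace; the database may treat ambiguity codes differently.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence.

    Whitespace (spaces, line breaks) is removed before counting.
    """
    cleaned = "".join(seq.split()).upper()
    if not cleaned:
        raise ValueError("empty sequence")
    gc = sum(1 for base in cleaned if base in "GC")
    return 100.0 * gc / len(cleaned)


# Toy input, not the gene sequence from this record.
print(gc_content("ATGC"))
```

To apply it to this gene, pass the sequence text without its `>1569_bases` header.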
Gene sequence:
>1569_bases ATGAGCGCCCAAGCGATGCCCGATACCGCCGAACCCACCGATATCCAGCCGAACATTCTCGTCCTGATGGCCGACCAGCT CACGCCCTTCGCGTTGCGCGCGTACGGCCATCGCGCGACGCGTACGCCGACGATCGACCGGCTCGCCGCCGAGGGCGTCG TCTTCGACGCCGCTTATTGCGCGAGCCCGCTCTGCGCGCCGTCGCGCTTCGCGCTGATGGCGGGCAAGCTGCCGTCGGCG CTCGGCGCTTACGATAACGCCGCCGAATTGCCGGCGCAAACGCTGACGTTCGCGCACTACCTGCGCGCGGCCGGTTACCG GACGATGCTGTCGGGCAAGATGCACTTCTGCGGGCCCGATCAGTTGCACGGCTTCGAGGAACGGCTCACGACCGACATCT ATCCGGCCGATTTCGGCTGGGTGCCGGACTGGACGCGTCCCGCCGAGCGGCCGAGCTGGTATCACAACATGAGCTCGGTG CTCGACGCCGGGCCTTGCGTGCGGACCAACCAGCTCGATTTCGACGACGATGCGACGTTCGCCGCGCGCCAGAAGATCTT CGACGTCGCGCGCGAGCGCGCGGCCGGCCGGGACGCGCGGCCGTTCTGCATGGTCGTGTCGCTCACGCATCCGCACGATC CGTATGCGATCACGCGCGAATACTGGGATCTGTACCGGGACGAGGACATCGATCTGCCCGCCGTGCGGATGGATTTCGAC GCGAGCGACCCGCATTCGCGGCGGCTGCGCGCCGTATGCGAGGTCGATCGCACGCCGCCGGAGGACCTGCAGATCCGGCG CGCGCGGCGCGCGTACTACGGCGCGACGTCCTATGTCGACGCGCAGTTCGGCGCGCTGCTCGCGACGCTCGAGCAATGCG GGCTCGCCGACGACACGATCGTGATCGTCACCGCCGATCACGGCGACATGCTCGGCGAGCGCGGCCTCTGGTACAAGATG ACGTTCTTCGAAGGCGCATGCCGCGTGCCGCTCATCGTGCACGCGCCGCGCCGGTTTCCGGCCGCGCGCGTGCCGGCGGC CGTGTCGCACGTCGATCTGCTGCCGACGCTCGTCGAGCTCGCGACGGGCGAGCGCCGCGCCGACTGGCCCGACGCCGTCG ACGGCCGCAGCCTCGTTCCCCATCTGCGCGGCGAAGGCGGCCATGACGAGGCGTTCGGCGAATATCTGGCCGAAGGCGCG ATCGCGCCGATCGTGATGATGCGCCGCGGCAGCCACAAGTACATCCATTCGCCCGCGGATCCGGATCAGCTCTTCGATCT GAGGAATGATCCGCGCGAGCTCGACAATCTCGCGAACACGCCCGCCGCGGCAAAGCACGTCGCCGCGTTTCGCATGGAGC GCGTCGCGCGCTGGGATCTCGATGCGCTGCATCAGCAGGTGCTCGCGAGCCAGCGCAGGCGGCGCTTCCATTTCGAGGCG ACGACCCAGGGGCGAATCCGGTCGTGGGACTGGCAGCCGTTCCAGGATGCGAGCCAGCGTTACATGCGCAATCACCTCGA ACTCGACGCGCTCGAGGCAGCCGCGCGTTTTCCTCGTCCGCACGCATGA
Upstream 100 bases:
>100_bases AAGCATTCGATGCCTTCACGGCCGCATTACGGCCGGCAATAGTGAGATCCGGCCGGCGGCGGCTGCCTGTCGCGCCGCCG TTCCGATCCGACGATTCCAT
Downstream 100 bases:
>100_bases CGCCCGGCGTCACGCATTTTCAAATGGAGCAAGCGATGAAACGATACGAATCCATTGCGCGGCGGCTCGCGCGCCGCGCG GCAGCCGCATCGCCGGCGTT
Product: choline sulfatase
Products: NA
Alternate protein names: NA
Number of amino acids: Translated: 522; Mature: 521
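The residue counts follow from the gene length. The sketch below assumes that the final codon is a stop codon (the gene sequence ends in TGA) and that the mature form differs only by removal of the initiator methionine. Both helper names are hypothetical.

```python
def translated_length(n_bases: int) -> int:
    """Residues encoded by an ORF whose last codon is a stop codon."""
    if n_bases % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return n_bases // 3 - 1  # drop the stop codon


def mature_length(n_translated: int) -> int:
    """Residues left after removing the initiator Met (assumed processing)."""
    return n_translated - 1


translated = translated_length(1569)
mature = mature_length(translated)
print(translated, mature)
```

With 1569 bases this gives 523 codons, hence 522 translated and 521 mature residues, matching the values above.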
Protein sequence:
>522_residues MSAQAMPDTAEPTDIQPNILVLMADQLTPFALRAYGHRATRTPTIDRLAAEGVVFDAAYCASPLCAPSRFALMAGKLPSA LGAYDNAAELPAQTLTFAHYLRAAGYRTMLSGKMHFCGPDQLHGFEERLTTDIYPADFGWVPDWTRPAERPSWYHNMSSV LDAGPCVRTNQLDFDDDATFAARQKIFDVARERAAGRDARPFCMVVSLTHPHDPYAITREYWDLYRDEDIDLPAVRMDFD ASDPHSRRLRAVCEVDRTPPEDLQIRRARRAYYGATSYVDAQFGALLATLEQCGLADDTIVIVTADHGDMLGERGLWYKM TFFEGACRVPLIVHAPRRFPAARVPAAVSHVDLLPTLVELATGERRADWPDAVDGRSLVPHLRGEGGHDEAFGEYLAEGA IAPIVMMRRGSHKYIHSPADPDQLFDLRNDPRELDNLANTPAAAKHVAAFRMERVARWDLDALHQQVLASQRRRRFHFEA TTQGRIRSWDWQPFQDASQRYMRNHLELDALEAAARFPRPHA
Sequences:
>Translated_522_residues MSAQAMPDTAEPTDIQPNILVLMADQLTPFALRAYGHRATRTPTIDRLAAEGVVFDAAYCASPLCAPSRFALMAGKLPSA LGAYDNAAELPAQTLTFAHYLRAAGYRTMLSGKMHFCGPDQLHGFEERLTTDIYPADFGWVPDWTRPAERPSWYHNMSSV LDAGPCVRTNQLDFDDDATFAARQKIFDVARERAAGRDARPFCMVVSLTHPHDPYAITREYWDLYRDEDIDLPAVRMDFD ASDPHSRRLRAVCEVDRTPPEDLQIRRARRAYYGATSYVDAQFGALLATLEQCGLADDTIVIVTADHGDMLGERGLWYKM TFFEGACRVPLIVHAPRRFPAARVPAAVSHVDLLPTLVELATGERRADWPDAVDGRSLVPHLRGEGGHDEAFGEYLAEGA IAPIVMMRRGSHKYIHSPADPDQLFDLRNDPRELDNLANTPAAAKHVAAFRMERVARWDLDALHQQVLASQRRRRFHFEA TTQGRIRSWDWQPFQDASQRYMRNHLELDALEAAARFPRPHA
>Mature_521_residues SAQAMPDTAEPTDIQPNILVLMADQLTPFALRAYGHRATRTPTIDRLAAEGVVFDAAYCASPLCAPSRFALMAGKLPSAL GAYDNAAELPAQTLTFAHYLRAAGYRTMLSGKMHFCGPDQLHGFEERLTTDIYPADFGWVPDWTRPAERPSWYHNMSSVL DAGPCVRTNQLDFDDDATFAARQKIFDVARERAAGRDARPFCMVVSLTHPHDPYAITREYWDLYRDEDIDLPAVRMDFDA SDPHSRRLRAVCEVDRTPPEDLQIRRARRAYYGATSYVDAQFGALLATLEQCGLADDTIVIVTADHGDMLGERGLWYKMT FFEGACRVPLIVHAPRRFPAARVPAAVSHVDLLPTLVELATGERRADWPDAVDGRSLVPHLRGEGGHDEAFGEYLAEGAI APIVMMRRGSHKYIHSPADPDQLFDLRNDPRELDNLANTPAAAKHVAAFRMERVARWDLDALHQQVLASQRRRRFHFEAT TQGRIRSWDWQPFQDASQRYMRNHLELDALEAAARFPRPHA
Specific function: Converts choline-O-sulfate into choline [H]
COG id: COG3119
COG function: function code P; Arylsulfatase A and related enzymes
Gene ontology:
Cell location: Cytoplasm [C]
Metabolic importance: Unknown [C]
Operon status: Not Known
Operon components: None
Similarity: Belongs to the sulfatase family [H]
Homologues:
Organism=Homo sapiens, GI4557659, Length=397, Percent_Identity=29.2191435768262, Blast_Score=132, Evalue=1e-30
Organism=Homo sapiens, GI39930577, Length=543, Percent_Identity=23.3885819521179, Blast_Score=127, Evalue=2e-29
Organism=Homo sapiens, GI5360208, Length=317, Percent_Identity=27.7602523659306, Blast_Score=103, Evalue=3e-22
Organism=Homo sapiens, GI71852584, Length=440, Percent_Identity=29.0909090909091, Blast_Score=100, Evalue=3e-21
Organism=Homo sapiens, GI31742482, Length=434, Percent_Identity=26.9585253456221, Blast_Score=100, Evalue=4e-21
Organism=Homo sapiens, GI58743319, Length=434, Percent_Identity=27.1889400921659, Blast_Score=90, Evalue=4e-18
Organism=Homo sapiens, GI262118210, Length=316, Percent_Identity=28.7974683544304, Blast_Score=89, Evalue=8e-18
Organism=Homo sapiens, GI4504061, Length=405, Percent_Identity=23.9506172839506, Blast_Score=83, Evalue=7e-16
Organism=Homo sapiens, GI4503899, Length=400, Percent_Identity=25, Blast_Score=81, Evalue=2e-15
Organism=Homo sapiens, GI53831991, Length=132, Percent_Identity=37.8787878787879, Blast_Score=79, Evalue=8e-15
Organism=Homo sapiens, GI146229331, Length=111, Percent_Identity=40.5405405405405, Blast_Score=76, Evalue=1e-13
Organism=Homo sapiens, GI146229329, Length=111, Percent_Identity=40.5405405405405, Blast_Score=76, Evalue=1e-13
Organism=Homo sapiens, GI6005990, Length=111, Percent_Identity=40.5405405405405, Blast_Score=76, Evalue=1e-13
Organism=Homo sapiens, GI146229324, Length=111, Percent_Identity=40.5405405405405, Blast_Score=76, Evalue=1e-13
Organism=Homo sapiens, GI157266309, Length=112, Percent_Identity=39.2857142857143, Blast_Score=75, Evalue=1e-13
Organism=Homo sapiens, GI71852586, Length=122, Percent_Identity=39.344262295082, Blast_Score=74, Evalue=3e-13
Organism=Escherichia coli, GI1790112, Length=435, Percent_Identity=29.8850574712644, Blast_Score=153, Evalue=3e-38
Organism=Caenorhabditis elegans, GI17559078, Length=130, Percent_Identity=36.9230769230769, Blast_Score=77, Evalue=2e-14
Organism=Drosophila melanogaster, GI24656835, Length=430, Percent_Identity=29.5348837209302, Blast_Score=127, Evalue=2e-29
Organism=Drosophila melanogaster, GI281366397, Length=118, Percent_Identity=38.135593220339, Blast_Score=74, Evalue=3e-13
Organism=Drosophila melanogaster, GI281366395, Length=118, Percent_Identity=38.135593220339, Blast_Score=74, Evalue=3e-13
Organism=Drosophila melanogaster, GI24666163, Length=118, Percent_Identity=38.135593220339, Blast_Score=74, Evalue=3e-13
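Each homologue entry above uses the same field pattern (Organism, GI, Length, Percent_Identity, Blast_Score, Evalue), so the list can be parsed mechanically. The following is a sketch of one possible parser. The `Hit` type, the regular expression, and `parse_hits` are illustrative names, and the pattern is inferred from the entries shown here rather than from any documented format.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    organism: str
    gi: str
    length: int
    percent_identity: float
    blast_score: int
    evalue: float


# Field order inferred from the entries in this record (an assumption).
HIT_RE = re.compile(
    r"Organism=(?P<organism>[^,]+),\s*"
    r"GI(?P<gi>\d+),\s*"
    r"Length=(?P<length>\d+),\s*"
    r"Percent_Identity=(?P<pid>[\d.]+),\s*"
    r"Blast_Score=(?P<score>\d+),\s*"
    r"Evalue=(?P<evalue>[\d.]+(?:[eE][+-]?\d+)?)"
)


def parse_hits(text: str) -> List[Hit]:
    """Extract every homologue entry found in the text, in order."""
    return [
        Hit(
            organism=m.group("organism").strip(),
            gi=m.group("gi"),
            length=int(m.group("length")),
            percent_identity=float(m.group("pid")),
            blast_score=int(m.group("score")),
            evalue=float(m.group("evalue")),
        )
        for m in HIT_RE.finditer(text)
    ]


# Two entries copied from the list above.
sample = (
    "Organism=Escherichia coli, GI1790112, Length=435, "
    "Percent_Identity=29.8850574712644, Blast_Score=153, Evalue=3e-38, "
    "Organism=Homo sapiens, GI4503899, Length=400, "
    "Percent_Identity=25, Blast_Score=81, Evalue=2e-15,"
)
hits = parse_hits(sample)
best = max(hits, key=lambda h: h.blast_score)
print(best.organism, best.blast_score)
```

Because the pattern tolerates whitespace between fields, it works whether the entries are on one line or split across several.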
Paralogues:
None
Copy number: NA
Swissprot (AC and ID): NA
Other databases:
- InterPro: IPR017849
- InterPro: IPR017850
- InterPro: IPR017785
- InterPro: IPR000917 [H]
Pfam domain/function: PF00884 Sulfatase [H]
EC number: 3.1.6.6 [H]
Molecular weight: Translated: 58721; Mature: 58590
Theoretical pI: Translated: 6.26; Mature: 6.26
Prosite motif: PS00523 SULFATASE_1 ; PS00149 SULFATASE_2
Important sites: NA
Signals:
None
Transmembrane regions:
None
Cys/Met content:
1.5 %Cys (Translated Protein)
2.9 %Met (Translated Protein)
4.4 %Cys+Met (Translated Protein)
1.5 %Cys (Mature Protein)
2.7 %Met (Mature Protein)
4.2 %Cys+Met (Mature Protein)
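These percentages are residue counts divided by sequence length. The sketch below uses a stand-in string with an assumed composition of 8 Cys and 15 Met in 522 residues (counts that are consistent with the translated percentages listed) instead of repeating the full sequence. `residue_percent` is a hypothetical helper.

```python
def residue_percent(protein: str, residues: str) -> float:
    """Percentage of positions in `protein` occupied by any of `residues`."""
    cleaned = "".join(protein.split()).upper()
    if not cleaned:
        raise ValueError("empty sequence")
    count = sum(1 for aa in cleaned if aa in residues)
    return round(100.0 * count / len(cleaned), 1)


# Stand-in with the assumed composition, not the real sequence order.
translated = "M" * 15 + "C" * 8 + "A" * 499
mature = translated[1:]  # assumed: initiator Met removed

for label, seq in (("Translated", translated), ("Mature", mature)):
    print(
        label,
        residue_percent(seq, "C"),
        residue_percent(seq, "M"),
        residue_percent(seq, "CM"),
    )
```

Passing the actual protein sequence from this record in place of the stand-in string performs the same calculation on the real data.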
Secondary structure:
>Translated Secondary Structure
MSAQAMPDTAEPTDIQPNILVLMADQLTPFALRAYGHRATRTPTIDRLAAEGVVFDAAYC
CCCCCCCCCCCCCCCCCCEEEEEECCCCHHHHHHHCCCCCCCCCHHHHHHCCEEEEHHHH
ASPLCAPSRFALMAGKLPSALGAYDNAAELPAQTLTFAHYLRAAGYRTMLSGKMHFCGPD
CCCCCCCHHHHHHHCCCCHHHCCCCCHHHCCHHHHHHHHHHHHHCHHHHHCCCCCCCCCH
QLHGFEERLTTDIYPADFGWVPDWTRPAERPSWYHNMSSVLDAGPCVRTNQLDFDDDATF
HHCCHHHHHHCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCCCHH
AARQKIFDVARERAAGRDARPFCMVVSLTHPHDPYAITREYWDLYRDEDIDLPAVRMDFD
HHHHHHHHHHHHHHCCCCCCCEEEEEEECCCCCCCHHHHHHHHHHCCCCCCCCEEEECCC
ASDPHSRRLRAVCEVDRTPPEDLQIRRARRAYYGATSYVDAQFGALLATLEQCGLADDTI
CCCCHHHHHHHHHHCCCCCCHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHCCCCCCEE
VIVTADHGDMLGERGLWYKMTFFEGACRVPLIVHAPRRFPAARVPAAVSHVDLLPTLVEL
EEEECCCCCHHCCCCCEEEEEEECCCCCCCEEEECCCCCCCHHCCHHHHHHHHHHHHHHH
ATGERRADWPDAVDGRSLVPHLRGEGGHDEAFGEYLAEGAIAPIVMMRRGSHKYIHSPAD
HCCCCCCCCCCCCCCCCCCCHHCCCCCCHHHHHHHHHCCCHHHHHHHHCCCCCCCCCCCC
PDQLFDLRNDPRELDNLANTPAAAKHVAAFRMERVARWDLDALHQQVLASQRRRRFHFEA
HHHHHHCCCCHHHHHHHCCCCHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHEEEEEE
TTQGRIRSWDWQPFQDASQRYMRNHLELDALEAAARFPRPHA
CCCCCCCCCCCCCHHHHHHHHHHHHCCHHHHHHHHCCCCCCC
>Mature Secondary Structure
SAQAMPDTAEPTDIQPNILVLMADQLTPFALRAYGHRATRTPTIDRLAAEGVVFDAAYC
CCCCCCCCCCCCCCCCCEEEEEECCCCHHHHHHHCCCCCCCCCHHHHHHCCEEEEHHHH
ASPLCAPSRFALMAGKLPSALGAYDNAAELPAQTLTFAHYLRAAGYRTMLSGKMHFCGPD
CCCCCCCHHHHHHHCCCCHHHCCCCCHHHCCHHHHHHHHHHHHHCHHHHHCCCCCCCCCH
QLHGFEERLTTDIYPADFGWVPDWTRPAERPSWYHNMSSVLDAGPCVRTNQLDFDDDATF
HHCCHHHHHHCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCCCHH
AARQKIFDVARERAAGRDARPFCMVVSLTHPHDPYAITREYWDLYRDEDIDLPAVRMDFD
HHHHHHHHHHHHHHCCCCCCCEEEEEEECCCCCCCHHHHHHHHHHCCCCCCCCEEEECCC
ASDPHSRRLRAVCEVDRTPPEDLQIRRARRAYYGATSYVDAQFGALLATLEQCGLADDTI
CCCCHHHHHHHHHHCCCCCCHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHCCCCCCEE
VIVTADHGDMLGERGLWYKMTFFEGACRVPLIVHAPRRFPAARVPAAVSHVDLLPTLVEL
EEEECCCCCHHCCCCCEEEEEEECCCCCCCEEEECCCCCCCHHCCHHHHHHHHHHHHHHH
ATGERRADWPDAVDGRSLVPHLRGEGGHDEAFGEYLAEGAIAPIVMMRRGSHKYIHSPAD
HCCCCCCCCCCCCCCCCCCCHHCCCCCCHHHHHHHHHCCCHHHHHHHHCCCCCCCCCCCC
PDQLFDLRNDPRELDNLANTPAAAKHVAAFRMERVARWDLDALHQQVLASQRRRRFHFEA
HHHHHHCCCCHHHHHHHCCCCHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHEEEEEE
TTQGRIRSWDWQPFQDASQRYMRNHLELDALEAAARFPRPHA
CCCCCCCCCCCCCHHHHHHHHHHHHCCHHHHHHHHCCCCCCC
PDB accession: NA
Resolution: NA
Structure class: Unstructured
Cofactors: NA
Metal ions: NA
Kcat value (1/min): NA
Specific activity: NA
Km value (mM): NA
Substrates: NA
Specific reaction: NA
General reaction: NA
Inhibitor: NA
Structure determination priority: 9.0
TargetDB status: NA
Availability: NA
References: 9141699; 11481430; 9736747 [H]