Definition: Clostridium botulinum A2 str. Kyoto chromosome, complete genome.
Accession: NC_012563
Length: 4,155,278

The map label for this gene is endo I [H]

Identifier: 226948010

GI number: 226948010

Start: 931730

End: 933583

Strand: Direct

Name: endo I [H]

Synonym: CLM_0871

Alternate gene names: 226948010

Gene position: 931730-933583 (Clockwise)

Preceding gene: 226948004

Following gene: 226948011

Centisome position: 22.42

GC content: 32.52
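
The positional fields above appear to be related by simple arithmetic. The following is a minimal Python sketch, assuming 1-based inclusive coordinates and assuming that the centisome position is the start coordinate expressed as a percentage of the chromosome length (the record does not state how these values were derived). The function names are illustrative and are not part of the source database.

```python
def gene_length(start: int, end: int) -> int:
    """Length in bases of a feature with 1-based inclusive coordinates."""
    return end - start + 1


def centisome(start: int, genome_length: int) -> float:
    """Start coordinate as a percentage of the chromosome length (assumed definition)."""
    return round(100.0 * start / genome_length, 2)


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return round(100.0 * gc / len(seq), 2)


print(gene_length(931730, 933583))   # 1854
print(centisome(931730, 4155278))    # 22.42
```

Under these assumptions the sketch reproduces the 1854-base gene length and the centisome position of 22.42 listed in this record. Passing the gene sequence below to `gc_percent` as a single string would be the corresponding check for the GC content field.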

Gene sequence:

>1854_bases
ATGAAAAACACTAAATTAATGAAATTAGTAGCTGGCTTTATGGCTGCTACATTTTCACTGTCGATTTTAACAACAAAAGT
AGAAGCAAAAAATATTGAAAATAAAAATACTAAAAGTGAAGTTAAACAAACTGCTAATAGTAAAGATTTAGAAAGAAAGT
TAATTGGATATTTCCCAGAGTGGGCCTATAATAGTGAAGCTCAAGGATATTTTAAGGTTACTGATTTGCAATGGGATTCT
TTAACTCATATTCAATATTCCTTTGCAATGGTGGATCAAGCTACAAATAAAATTAAATTAGGGGACAAACATGCTGCATT
GGAAGAGGAATTTAAAAATTATAATTTATCTTATAAAGGGAAGAAGGTAGAGTTAGATCCTAATCTCCCGTATAAAGGTC
ACTTTAATTTATTACAAACTATGAAAAAACAATATCCCGATGTTAATTTGCTTATATCAGTAGGTGGATGGGCTGGAAGC
AGAGGATTTTATACAATGCTTGATACGGATGAAGGAATAAATACCTTTGCAGATTCTTGCGTTGATTTTATAAGAAAGTA
TAACTTTGATGGTGTTGATATTGACTTTGAATATCCATCTTCAACAAGTCAGTCAGGAAATCCAGCGGATTTTGACTTAT
CTGAACCTAGAAGAGCTAAATTAAATGAGAGATATAATATACTCATGAAAACATTAAGAGAAAAAATAGATGCAGCTTCT
AAAAAGGATAATAAAGATTATATATTATCTGCTGCAGTTACAGCATCACCTTGGGTACTTGGTGGAGTCAAAGATAACAG
CTATGCTAAGTATCTAGATTTCTTAAGTGTTATGTCCTATGACTATCATGGAGGATGGAATGAATATGTTGAAAATTTAG
CAGGTATTTATCCAGATCCAAAAGATAGGGAAACTGCCAGTCAAATCATGCCAACACTTTGTATGGATTGGGCTTATAGA
TACTACAGAGGAGTTTTACCACCTGAAAAAATATTAATGGGCATACCATACTATACTAGAGGATGGGAAAATGTACAAGG
TGGACAAAATGGACTTCATGGATCAAGTAGAACACCTGCATCTGGTAAGTACAATATTTGGGGAGACGATTTAGATAATG
ATGGAAAGCTAGAACCCGCAGGCGCTAACCCATTGTGGCATGTATTAAACCTTATGGAAAAAGATAAGAATTTAAAAGTA
TATTGGGATGATGTAGAAAAGGTGCCATATGTTTGGCAAAATCAAGAGAGAGTTTTCTTATCCTTTGAAAATGAAAGATC
TATTGATGAAAGACTTAATTATATTAAAAATAAAAATCTAGGTGGAGCATTAATTTGGGTTATGAACGGTGATTATGGAT
TAAATCCTAACTATGAAGAAGGTTCAAGTGATATAAATAAAGGGAAATATACCTTTGGAAATACTTTAACTAAAAGATTA
AGTAATGGCTTGACTAATATGGGAGCATGCAGTAAAACACCAGAAGATTCTAATAATTCCCTAGAGCCTATAAATGTTTC
AGTGGACTTCGGTGGTAGCTATGATCATCCAAATTACACTTATGATATAAAGGTGAAGAATTATACAGGTAAAGAAATTA
AGGGAGGATGGGAAGTATCCTTTGATTTACCAAGATCAGCATTATATAAGTCTTCCTGGGGAGGCAACTATACTTTAAAA
GACAATGGAGATTTTACTACAGTTACAATAAAATCTGGTAACTGGCAAAATATTAATGCTGGTGCTACCGTTAACCTTCA
AGGAATGATAGGGTTATGTTTTTCAGATGTTAGAAATATTAAATTTAATGGTATGAAACCTGTAGGAGATCAAAAAATTA
TTCAACAAAAATAG

Upstream 100 bases:

>100_bases
AATTACTAAAACAATATAACCAAAATTTATATTTGAAACCAAATGGCAAGATCATCATAGCATAGTATTATAATATAAAA
ATGAAATAGGAGGATATACA

Downstream 100 bases:

>100_bases
ATAGTAAAAAATAAAATTATAAAGAAAAGTAAAATAAAACAGTGTCTATGTATTTTATAATGCTAAGCACTGTTTTATTT
AATTAAAAGGACGGAATCAT

Product: glycosyl hydrolase, family 18

Products: NA

Alternate protein names: NA

Number of amino acids: Translated: 617; Mature: 617
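
The amino acid count follows from the gene length: 1854 bases are 618 codons, and the last codon is a stop, leaving 617 residues. The sketch below illustrates the translation step in Python. It assumes the standard genetic code amino acid assignments; the record does not name the translation table that was used. The helper name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases in TCAG order).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)


# First 33 bases of the gene sequence in this record.
print(translate("ATGAAAAACACTAAATTAATGAAATTAGTAGCT"))  # MKNTKLMKLVA
# Last 15 bases, ending in the TAG stop codon.
print(translate("ATTCAACAAAAATAG"))                    # IQQK
```

Both fragments agree with the start (MKNTKLMKLVA) and end (IQQK) of the protein sequence listed below.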

Protein sequence:

>617_residues
MKNTKLMKLVAGFMAATFSLSILTTKVEAKNIENKNTKSEVKQTANSKDLERKLIGYFPEWAYNSEAQGYFKVTDLQWDS
LTHIQYSFAMVDQATNKIKLGDKHAALEEEFKNYNLSYKGKKVELDPNLPYKGHFNLLQTMKKQYPDVNLLISVGGWAGS
RGFYTMLDTDEGINTFADSCVDFIRKYNFDGVDIDFEYPSSTSQSGNPADFDLSEPRRAKLNERYNILMKTLREKIDAAS
KKDNKDYILSAAVTASPWVLGGVKDNSYAKYLDFLSVMSYDYHGGWNEYVENLAGIYPDPKDRETASQIMPTLCMDWAYR
YYRGVLPPEKILMGIPYYTRGWENVQGGQNGLHGSSRTPASGKYNIWGDDLDNDGKLEPAGANPLWHVLNLMEKDKNLKV
YWDDVEKVPYVWQNQERVFLSFENERSIDERLNYIKNKNLGGALIWVMNGDYGLNPNYEEGSSDINKGKYTFGNTLTKRL
SNGLTNMGACSKTPEDSNNSLEPINVSVDFGGSYDHPNYTYDIKVKNYTGKEIKGGWEVSFDLPRSALYKSSWGGNYTLK
DNGDFTTVTIKSGNWQNINAGATVNLQGMIGLCFSDVRNIKFNGMKPVGDQKIIQQK

Sequences:

>Translated_617_residues
MKNTKLMKLVAGFMAATFSLSILTTKVEAKNIENKNTKSEVKQTANSKDLERKLIGYFPEWAYNSEAQGYFKVTDLQWDS
LTHIQYSFAMVDQATNKIKLGDKHAALEEEFKNYNLSYKGKKVELDPNLPYKGHFNLLQTMKKQYPDVNLLISVGGWAGS
RGFYTMLDTDEGINTFADSCVDFIRKYNFDGVDIDFEYPSSTSQSGNPADFDLSEPRRAKLNERYNILMKTLREKIDAAS
KKDNKDYILSAAVTASPWVLGGVKDNSYAKYLDFLSVMSYDYHGGWNEYVENLAGIYPDPKDRETASQIMPTLCMDWAYR
YYRGVLPPEKILMGIPYYTRGWENVQGGQNGLHGSSRTPASGKYNIWGDDLDNDGKLEPAGANPLWHVLNLMEKDKNLKV
YWDDVEKVPYVWQNQERVFLSFENERSIDERLNYIKNKNLGGALIWVMNGDYGLNPNYEEGSSDINKGKYTFGNTLTKRL
SNGLTNMGACSKTPEDSNNSLEPINVSVDFGGSYDHPNYTYDIKVKNYTGKEIKGGWEVSFDLPRSALYKSSWGGNYTLK
DNGDFTTVTIKSGNWQNINAGATVNLQGMIGLCFSDVRNIKFNGMKPVGDQKIIQQK
>Mature_617_residues
MKNTKLMKLVAGFMAATFSLSILTTKVEAKNIENKNTKSEVKQTANSKDLERKLIGYFPEWAYNSEAQGYFKVTDLQWDS
LTHIQYSFAMVDQATNKIKLGDKHAALEEEFKNYNLSYKGKKVELDPNLPYKGHFNLLQTMKKQYPDVNLLISVGGWAGS
RGFYTMLDTDEGINTFADSCVDFIRKYNFDGVDIDFEYPSSTSQSGNPADFDLSEPRRAKLNERYNILMKTLREKIDAAS
KKDNKDYILSAAVTASPWVLGGVKDNSYAKYLDFLSVMSYDYHGGWNEYVENLAGIYPDPKDRETASQIMPTLCMDWAYR
YYRGVLPPEKILMGIPYYTRGWENVQGGQNGLHGSSRTPASGKYNIWGDDLDNDGKLEPAGANPLWHVLNLMEKDKNLKV
YWDDVEKVPYVWQNQERVFLSFENERSIDERLNYIKNKNLGGALIWVMNGDYGLNPNYEEGSSDINKGKYTFGNTLTKRL
SNGLTNMGACSKTPEDSNNSLEPINVSVDFGGSYDHPNYTYDIKVKNYTGKEIKGGWEVSFDLPRSALYKSSWGGNYTLK
DNGDFTTVTIKSGNWQNINAGATVNLQGMIGLCFSDVRNIKFNGMKPVGDQKIIQQK

Specific function: Hydrolyzes chitin oligosaccharides; (GlcNAc)4 to (GlcNAc)2 and (GlcNAc)5,6 to (GlcNAc)2 and (GlcNAc)3. Inactive towards chitin, glucosamine oligosaccharides, glycoproteins and glycopeptides containing (GlcNAc)2 [H]

COG id: COG3325

COG function: function code G; Chitinase

Gene ontology:

Cell location: Periplasm (Probable) [H]

Metabolic importance: NA

Operon status: Not Known

Operon components: None

Similarity: Belongs to the glycosyl hydrolase 18 family [H]

Homologues:

Organism=Homo sapiens, GI4502809, Length=438, Percent_Identity=26.4840182648402, Blast_Score=134, Evalue=3e-31,
Organism=Homo sapiens, GI58386720, Length=409, Percent_Identity=26.161369193154, Blast_Score=125, Evalue=1e-28,
Organism=Homo sapiens, GI133893286, Length=461, Percent_Identity=25.5965292841649, Blast_Score=125, Evalue=1e-28,
Organism=Homo sapiens, GI68533255, Length=415, Percent_Identity=25.3012048192771, Blast_Score=117, Evalue=5e-26,
Organism=Homo sapiens, GI68533253, Length=415, Percent_Identity=25.3012048192771, Blast_Score=116, Evalue=5e-26,
Organism=Homo sapiens, GI42542398, Length=347, Percent_Identity=26.5129682997118, Blast_Score=112, Evalue=1e-24,
Organism=Homo sapiens, GI68533260, Length=333, Percent_Identity=26.7267267267267, Blast_Score=105, Evalue=1e-22,
Organism=Homo sapiens, GI144226251, Length=407, Percent_Identity=23.0958230958231, Blast_Score=105, Evalue=1e-22,
Organism=Caenorhabditis elegans, GI71995504, Length=412, Percent_Identity=27.9126213592233, Blast_Score=141, Evalue=1e-33,
Organism=Caenorhabditis elegans, GI17559362, Length=405, Percent_Identity=28.8888888888889, Blast_Score=139, Evalue=4e-33,
Organism=Caenorhabditis elegans, GI17563680, Length=403, Percent_Identity=27.2952853598015, Blast_Score=138, Evalue=1e-32,
Organism=Caenorhabditis elegans, GI17551250, Length=484, Percent_Identity=24.3801652892562, Blast_Score=122, Evalue=5e-28,
Organism=Caenorhabditis elegans, GI17531695, Length=442, Percent_Identity=23.9819004524887, Blast_Score=109, Evalue=4e-24,
Organism=Caenorhabditis elegans, GI17535565, Length=333, Percent_Identity=28.2282282282282, Blast_Score=109, Evalue=5e-24,
Organism=Caenorhabditis elegans, GI71993650, Length=428, Percent_Identity=26.1682242990654, Blast_Score=103, Evalue=3e-22,
Organism=Caenorhabditis elegans, GI17535567, Length=314, Percent_Identity=27.0700636942675, Blast_Score=100, Evalue=3e-21,
Organism=Caenorhabditis elegans, GI17535569, Length=467, Percent_Identity=26.338329764454, Blast_Score=99, Evalue=7e-21,
Organism=Caenorhabditis elegans, GI17536195, Length=317, Percent_Identity=27.1293375394322, Blast_Score=97, Evalue=3e-20,
Organism=Caenorhabditis elegans, GI32564165, Length=441, Percent_Identity=23.3560090702948, Blast_Score=95, Evalue=1e-19,
Organism=Caenorhabditis elegans, GI17539198, Length=328, Percent_Identity=24.6951219512195, Blast_Score=92, Evalue=1e-18,
Organism=Caenorhabditis elegans, GI17531683, Length=312, Percent_Identity=24.0384615384615, Blast_Score=91, Evalue=2e-18,
Organism=Saccharomyces cerevisiae, GI6320579, Length=395, Percent_Identity=23.5443037974684, Blast_Score=84, Evalue=6e-17,
Organism=Drosophila melanogaster, GI221330815, Length=471, Percent_Identity=28.2377919320594, Blast_Score=144, Evalue=1e-34,
Organism=Drosophila melanogaster, GI24655584, Length=410, Percent_Identity=26.3414634146341, Blast_Score=130, Evalue=3e-30,
Organism=Drosophila melanogaster, GI45550474, Length=405, Percent_Identity=26.4197530864197, Blast_Score=126, Evalue=4e-29,
Organism=Drosophila melanogaster, GI21358195, Length=408, Percent_Identity=25.2450980392157, Blast_Score=124, Evalue=2e-28,
Organism=Drosophila melanogaster, GI116007452, Length=413, Percent_Identity=23.9709443099274, Blast_Score=105, Evalue=7e-23,
Organism=Drosophila melanogaster, GI22024049, Length=412, Percent_Identity=24.0291262135922, Blast_Score=96, Evalue=1e-19,
Organism=Drosophila melanogaster, GI161077700, Length=412, Percent_Identity=21.8446601941748, Blast_Score=91, Evalue=2e-18,
Organism=Drosophila melanogaster, GI221329807, Length=412, Percent_Identity=21.8446601941748, Blast_Score=91, Evalue=2e-18,
Organism=Drosophila melanogaster, GI24640248, Length=414, Percent_Identity=25.1207729468599, Blast_Score=90, Evalue=5e-18,
Organism=Drosophila melanogaster, GI28573679, Length=410, Percent_Identity=23.4146341463415, Blast_Score=72, Evalue=8e-13,

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR003610
- InterPro:   IPR009470
- InterPro:   IPR011583
- InterPro:   IPR001223
- InterPro:   IPR017853
- InterPro:   IPR013781 [H]

Pfam domain/function: PF02839 CBM_5_12; PF06483 ChiC; PF00704 Glyco_hydro_18 [H]

EC number: 3.2.1.14 [H]

Molecular weight: Translated: 69901; Mature: 69901

Theoretical pI: Translated: 6.43; Mature: 6.43

Prosite motif: PS01095 CHITINASE_18

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

0.6 %Cys     (Translated Protein)
2.6 %Met     (Translated Protein)
3.2 %Cys+Met (Translated Protein)
0.6 %Cys     (Mature Protein)
2.6 %Met     (Mature Protein)
3.2 %Cys+Met (Mature Protein)
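
The percentages above can be checked by counting residues in the protein sequence. A minimal Python sketch, assuming each value is the residue count divided by the sequence length and rounded to one decimal place (the record does not state the rounding rule); the function names are illustrative.

```python
def percent(count: int, total: int) -> float:
    """Count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


def residue_percent(seq: str, residues: str) -> float:
    """Percentage of a protein sequence made up of any of the given one-letter residues."""
    seq = seq.upper()
    count = sum(seq.count(r) for r in residues)
    return percent(count, len(seq))


# The 617-residue sequence in this record contains 4 Cys (C) and 16 Met (M).
print(percent(4, 617))       # 0.6
print(percent(16, 617))      # 2.6
print(percent(4 + 16, 617))  # 3.2
```

Calling `residue_percent(sequence, "CM")` on the full protein sequence, joined into one string, performs the same check directly from the sequence.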

Secondary structure:

>Translated Secondary Structure
MKNTKLMKLVAGFMAATFSLSILTTKVEAKNIENKNTKSEVKQTANSKDLERKLIGYFPE
CCCHHHHHHHHHHHHHHHEEEEEEEEEHHHCCCCCCHHHHHHHHCCHHHHHHHHHHCCCC
WAYNSEAQGYFKVTDLQWDSLTHIQYSFAMVDQATNKIKLGDKHAALEEEFKNYNLSYKG
CCCCCCCCCEEEEEECCCCCCCCEEEHHHHHHHCCCEEEECCHHHHHHHHHHCCCCEECC
KKVELDPNLPYKGHFNLLQTMKKQYPDVNLLISVGGWAGSRGFYTMLDTDEGINTFADSC
CEEEECCCCCCCCCHHHHHHHHHHCCCEEEEEEECCCCCCCCEEEEEECCCCHHHHHHHH
VDFIRKYNFDGVDIDFEYPSSTSQSGNPADFDLSEPRRAKLNERYNILMKTLREKIDAAS
HHHHHHCCCCCEEEEEECCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHCC
KKDNKDYILSAAVTASPWVLGGVKDNSYAKYLDFLSVMSYDYHGGWNEYVENLAGIYPDP
CCCCCCEEEEEEEECCCEEEECCCCCHHHHHHHHHHHHHCCCCCCHHHHHHHHHCCCCCC
KDRETASQIMPTLCMDWAYRYYRGVLPPEKILMGIPYYTRGWENVQGGQNGLHGSSRTPA
CCHHHHHHHHHHHHHHHHHHHHHCCCCHHHHEECCCCHHCCCCCCCCCCCCCCCCCCCCC
SGKYNIWGDDLDNDGKLEPAGANPLWHVLNLMEKDKNLKVYWDDVEKVPYVWQNQERVFL
CCEEEECCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCEEEECCCCCCCCEECCCCEEEE
SFENERSIDERLNYIKNKNLGGALIWVMNGDYGLNPNYEEGSSDINKGKYTFGNTLTKRL
EECCCCCHHHHHHHHHCCCCCEEEEEEECCCCCCCCCCCCCCCCCCCCCEEHHHHHHHHH
SNGLTNMGACSKTPEDSNNSLEPINVSVDFGGSYDHPNYTYDIKVKNYTGKEIKGGWEVS
HCCCCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCEEEEEEEECCCCCCCCCCEEEE
FDLPRSALYKSSWGGNYTLKDNGDFTTVTIKSGNWQNINAGATVNLQGMIGLCFSDVRNI
ECCCHHHHHHCCCCCCEEEECCCCEEEEEEECCCCCCCCCCCEEEECHHHHHHHHHHCCC
KFNGMKPVGDQKIIQQK
EECCCCCCCCCHHHCCC
>Mature Secondary Structure
MKNTKLMKLVAGFMAATFSLSILTTKVEAKNIENKNTKSEVKQTANSKDLERKLIGYFPE
CCCHHHHHHHHHHHHHHHEEEEEEEEEHHHCCCCCCHHHHHHHHCCHHHHHHHHHHCCCC
WAYNSEAQGYFKVTDLQWDSLTHIQYSFAMVDQATNKIKLGDKHAALEEEFKNYNLSYKG
CCCCCCCCCEEEEEECCCCCCCCEEEHHHHHHHCCCEEEECCHHHHHHHHHHCCCCEECC
KKVELDPNLPYKGHFNLLQTMKKQYPDVNLLISVGGWAGSRGFYTMLDTDEGINTFADSC
CEEEECCCCCCCCCHHHHHHHHHHCCCEEEEEEECCCCCCCCEEEEEECCCCHHHHHHHH
VDFIRKYNFDGVDIDFEYPSSTSQSGNPADFDLSEPRRAKLNERYNILMKTLREKIDAAS
HHHHHHCCCCCEEEEEECCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHCC
KKDNKDYILSAAVTASPWVLGGVKDNSYAKYLDFLSVMSYDYHGGWNEYVENLAGIYPDP
CCCCCCEEEEEEEECCCEEEECCCCCHHHHHHHHHHHHHCCCCCCHHHHHHHHHCCCCCC
KDRETASQIMPTLCMDWAYRYYRGVLPPEKILMGIPYYTRGWENVQGGQNGLHGSSRTPA
CCHHHHHHHHHHHHHHHHHHHHHCCCCHHHHEECCCCHHCCCCCCCCCCCCCCCCCCCCC
SGKYNIWGDDLDNDGKLEPAGANPLWHVLNLMEKDKNLKVYWDDVEKVPYVWQNQERVFL
CCEEEECCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCEEEECCCCCCCCEECCCCEEEE
SFENERSIDERLNYIKNKNLGGALIWVMNGDYGLNPNYEEGSSDINKGKYTFGNTLTKRL
EECCCCCHHHHHHHHHCCCCCEEEEEEECCCCCCCCCCCCCCCCCCCCCEEHHHHHHHHH
SNGLTNMGACSKTPEDSNNSLEPINVSVDFGGSYDHPNYTYDIKVKNYTGKEIKGGWEVS
HCCCCCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCEEEEEEEECCCCCCCCCCEEEE
FDLPRSALYKSSWGGNYTLKDNGDFTTVTIKSGNWQNINAGATVNLQGMIGLCFSDVRNI
ECCCHHHHHHCCCCCCEEEECCCCEEEEEEECCCCCCCCCCCEEEECHHHHHHHHHHCCC
KFNGMKPVGDQKIIQQK
EECCCCCCCCCHHHCCC

PDB accession: NA

Resolution: NA

Structure class: Alpha Beta

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 8969204 [H]