Definition Chloroflexus sp. Y-400-fl chromosome, complete genome.
Accession NC_012032
Length 5,268,950

The map label for this gene is bga [H]

Identifier: 222527387

GI number: 222527387

Start: 5175374

End: 5177641

Strand: Direct

Name: bga [H]

Synonym: Chy400_4177

Alternate gene names: 222527387

Gene position: 5175374-5177641 (Clockwise)

Preceding gene: 222527386

Following gene: 222527388

Centisome position: 98.22
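
The gene length and centisome position can be checked against the coordinates above. The following is a minimal sketch, assuming 1-based inclusive coordinates and assuming the centisome position is the start coordinate expressed as a percentage of the chromosome length. The function names are illustrative and are not part of the record.

```python
def gene_length(start: int, end: int) -> int:
    """Length in bases of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def centisome(start: int, genome_length: int) -> float:
    """Start coordinate as a percentage of the chromosome length (assumed definition)."""
    return 100.0 * start / genome_length


GENOME_LENGTH = 5_268_950            # Length field of this record
START, END = 5_175_374, 5_177_641    # Start and End fields of this record

print(gene_length(START, END))                     # 2268
print(round(centisome(START, GENOME_LENGTH), 2))   # 98.22
```

Both values agree with the 2268-base gene sequence and the centisome position reported in this record.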

GC content: 55.38

Gene sequence:

>2268_bases
ATGCCTGCATTCACCGCTCACCATCAACGCTTCTGGCTCGATGATCGACCACTGTTCATTCAGGCAGGTGAATTTCACTA
CTTCCGCACACCGGCTACCGATTGGGAGCAGCGCCTGGACTTGCTCGTGCGTGCCGGGTTCAATGCCGTTGCCTGCTATA
TTCCGTGGCTCTGGCATCAACCGCAACCCGATCAGATCGATCTGACCGGAACGACTCATCCGTTACGCAATCTGGCCGGC
TTTCTCGATCTGGCAGCGCAGAAGGGACTCTACGTGATTGCGCGTCCTGGCCCATACATTATGGCCGAGACCATCAATGA
AGGCATTCCGCCGTGGGTCTTTGATCATCATCCAGAGATTGCTCTCATCAATCAACACGGCACCGGCGAGAACGTCGCCA
GTTTTATGCACCCGTCCTTCCTGAGCTGCGTTGCCGGCTGGTATCAGGCCGTCTTCGCAATCCTTTCACCACGTCAGATT
ACCCGTGGTGGCCCGATCCTCATGGTCCAACTCGATAACGAGATGGGAATGCTGCATTGGGTGCGGAATAGCTTCGACCT
CAACCCAATCACGATGCAATACTTCGCCAACTGGCTCCAGGCGACGTATGGTGAGGATATGAGGGTTGATCCCCAGGTCC
TGGCTACACGTCTGCGCCATGCCGAAGGGGCAGCGGGTGCACAACTGGTGGCCGATTATCGGCGCTTTTTCCGCAGTTAT
CTGCGTGACTATGCGCAGTGGCTCACTGCCGAAGCACGCCGACACGGCCTTGACACGCCTGCTGTGTTCAACATCCACGG
TTTCGCTAACGGTGGCAAGACCTTTCCCATCGGTCTTTCACAACTTGCCGGTGTTTTACGCTTGCCTGATGTCATCAGTG
CCACCGATGTCTACCCCGGTCAGATTGGCGAAGGCACCTTCCACCAGCTATTGCTGGTTAACGCAATGACCTGTGCCATC
CAAAACCCTGATCAACCACTCTTTTCCATTGAGTTTCAGGCGGGTGGCAATCTTGACTTCAGCAATATGTCAAGTTCTTT
CTACGACCTGCATACCCGTTTGTGCCTTTCCAACGGTATGCGGGCCATCAACCATTACCTCTTCTTTGATGGCGAGAACG
ACCCGATCCTCAGTCCGGTCAAACGGCACGATTGGGGTCATCCGGTTCGTAAAGATGGCACCCTGCGCAGCCACTACCAC
CGATATCCGCGTTTATCCCAGACCATAGCCAGCTACGGTGAGGCGCTCATCCTGGCTCGTCCTGAACCGGTAACAACCAT
CGGCTTTCGCCTTGATGATTTCATGACGGAAGTGAATCTCGCATGCACGCAAGAAGAGACAGCGATCATCACCCATCAAC
GCGAGGTCATCCTGTTTGATCTGATCGCTCGTGGTCTCGCCCTGACCCAACGTGATTTTCAGGCGCTCGATCTGCAACAT
ACCGCTCCCGATCCCCACCAGGTGCCGCTACTGTGGGTGATGCTTGACTGTAATAGCGATCCTGCCACCGAACAGACCCT
GATCAAGTACCTTGAGGCTGGCGGCAACGTGGTCTTGATCGGGCGCTTACGGGCCGATGGGGTATTGGCAAAAACCCTGA
ACGTTCAACCCCACACCGATCCACCATTCACATCACGCCGGGTACACATCTTTACCACCCCCGATATTCCGGCCAGCTTT
GTACAAACCTATCAGGGTGATCTGGAAACCGTCTTTGCCACCGCGGACGGTGCGGCGGTCGGTTTCCGGAAACGAGTCGG
GCACGGCGAACTGATTATGCTCGGCGCCTCATTCCCCATCACCAACCTCGATGATCTGCACGTGTTTCAGCAACTTGCCG
CATGGGTTAACTGTCCAGCCCCATTTACCCTTTCGACCTGGGCTGATGTACGGCTGAGTCGTGGCCCTGATGGCGATTTT
CTGTTCGTCAACCACTACGGCGACGATCCGCTGGAAACGCACATTCACTATCAAGGTCAGCCACTTTGTGATGGTCATCC
CATTCGCCTGCCGGCCCGTAGTGGTGCCATTCTTCCGCTCAACTGGCAGGTACGCCCAGGGATCACCGTGCGCTATACGA
CGGTGGAAGTTCGCTCGATCACAGAAACTGCCGATGGCCTGACCTTCACCTTCGCCCAACCCGAAGGCTATGCCTGCTTT
GCTGTACGTATCAATCAGGGAAGTACAGAAGATAGGAGTGAGTCGATACAACACGTTCACTTTACCAATGGCACCGCATC
TGTGCAATGGCCCATCAGAACGACATAA
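
The GC content field can in principle be recomputed from a gene sequence like the one above. This is a sketch only, assuming GC content is the percentage of G and C bases over all bases; `gc_content` is a hypothetical helper, and the demonstration uses a toy sequence rather than the gene itself.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence (case-insensitive)."""
    seq = "".join(seq.split()).upper()  # drop line breaks from wrapped FASTA text
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Toy input, not the gene above: 4 of 8 bases are G or C.
print(gc_content("ATGC\nGGAT"))  # 50.0
```

Passing the 2268-base sequence (header line removed) to the same function should give a value comparable to the reported 55.38, provided the record uses the same definition.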

Upstream 100 bases:

>100_bases
CTGGGATATTTACACCAACAGCCCCATGATCCAACGCGCCCTGGCAGTGCTGGGGTTCACCCCGCAGCCATCAGCCCGCT
AAGACAGCAAAGGAAGTGTT

Downstream 100 bases:

>100_bases
GGATAGTGCAGGCAGCTCTACGTGCTCCGCATAGTCAGGGCTATCGTCTGTTTGTGGCAATACAACCGCAAGATACAACC
TCATCAGGAGTACATCAACA

Product: glycoside hydrolase family protein

Products: NA

Alternate protein names: Lactase [H]

Number of amino acids: Translated: 755; Mature: 754
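
The two residue counts follow from the gene length. A minimal sketch, assuming the coding sequence ends in a single stop codon and that the mature form differs from the translated form only by removal of the initiator Met, as the sequences listed in this record suggest; `residue_counts` is an illustrative name.

```python
def residue_counts(cds_length: int) -> tuple[int, int]:
    """Return (translated, mature) residue counts for a CDS ending in a stop codon.

    Assumes the mature protein is the translated protein minus the initiator Met.
    """
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    translated = cds_length // 3 - 1  # the final codon is the stop codon
    mature = translated - 1           # initiator Met removed (assumption)
    return translated, mature


print(residue_counts(2268))  # (755, 754)
```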

Protein sequence:

>755_residues
MPAFTAHHQRFWLDDRPLFIQAGEFHYFRTPATDWEQRLDLLVRAGFNAVACYIPWLWHQPQPDQIDLTGTTHPLRNLAG
FLDLAAQKGLYVIARPGPYIMAETINEGIPPWVFDHHPEIALINQHGTGENVASFMHPSFLSCVAGWYQAVFAILSPRQI
TRGGPILMVQLDNEMGMLHWVRNSFDLNPITMQYFANWLQATYGEDMRVDPQVLATRLRHAEGAAGAQLVADYRRFFRSY
LRDYAQWLTAEARRHGLDTPAVFNIHGFANGGKTFPIGLSQLAGVLRLPDVISATDVYPGQIGEGTFHQLLLVNAMTCAI
QNPDQPLFSIEFQAGGNLDFSNMSSSFYDLHTRLCLSNGMRAINHYLFFDGENDPILSPVKRHDWGHPVRKDGTLRSHYH
RYPRLSQTIASYGEALILARPEPVTTIGFRLDDFMTEVNLACTQEETAIITHQREVILFDLIARGLALTQRDFQALDLQH
TAPDPHQVPLLWVMLDCNSDPATEQTLIKYLEAGGNVVLIGRLRADGVLAKTLNVQPHTDPPFTSRRVHIFTTPDIPASF
VQTYQGDLETVFATADGAAVGFRKRVGHGELIMLGASFPITNLDDLHVFQQLAAWVNCPAPFTLSTWADVRLSRGPDGDF
LFVNHYGDDPLETHIHYQGQPLCDGHPIRLPARSGAILPLNWQVRPGITVRYTTVEVRSITETADGLTFTFAQPEGYACF
AVRINQGSTEDRSESIQHVHFTNGTASVQWPIRTT

Sequences:

>Translated_755_residues
MPAFTAHHQRFWLDDRPLFIQAGEFHYFRTPATDWEQRLDLLVRAGFNAVACYIPWLWHQPQPDQIDLTGTTHPLRNLAG
FLDLAAQKGLYVIARPGPYIMAETINEGIPPWVFDHHPEIALINQHGTGENVASFMHPSFLSCVAGWYQAVFAILSPRQI
TRGGPILMVQLDNEMGMLHWVRNSFDLNPITMQYFANWLQATYGEDMRVDPQVLATRLRHAEGAAGAQLVADYRRFFRSY
LRDYAQWLTAEARRHGLDTPAVFNIHGFANGGKTFPIGLSQLAGVLRLPDVISATDVYPGQIGEGTFHQLLLVNAMTCAI
QNPDQPLFSIEFQAGGNLDFSNMSSSFYDLHTRLCLSNGMRAINHYLFFDGENDPILSPVKRHDWGHPVRKDGTLRSHYH
RYPRLSQTIASYGEALILARPEPVTTIGFRLDDFMTEVNLACTQEETAIITHQREVILFDLIARGLALTQRDFQALDLQH
TAPDPHQVPLLWVMLDCNSDPATEQTLIKYLEAGGNVVLIGRLRADGVLAKTLNVQPHTDPPFTSRRVHIFTTPDIPASF
VQTYQGDLETVFATADGAAVGFRKRVGHGELIMLGASFPITNLDDLHVFQQLAAWVNCPAPFTLSTWADVRLSRGPDGDF
LFVNHYGDDPLETHIHYQGQPLCDGHPIRLPARSGAILPLNWQVRPGITVRYTTVEVRSITETADGLTFTFAQPEGYACF
AVRINQGSTEDRSESIQHVHFTNGTASVQWPIRTT
>Mature_754_residues
PAFTAHHQRFWLDDRPLFIQAGEFHYFRTPATDWEQRLDLLVRAGFNAVACYIPWLWHQPQPDQIDLTGTTHPLRNLAGF
LDLAAQKGLYVIARPGPYIMAETINEGIPPWVFDHHPEIALINQHGTGENVASFMHPSFLSCVAGWYQAVFAILSPRQIT
RGGPILMVQLDNEMGMLHWVRNSFDLNPITMQYFANWLQATYGEDMRVDPQVLATRLRHAEGAAGAQLVADYRRFFRSYL
RDYAQWLTAEARRHGLDTPAVFNIHGFANGGKTFPIGLSQLAGVLRLPDVISATDVYPGQIGEGTFHQLLLVNAMTCAIQ
NPDQPLFSIEFQAGGNLDFSNMSSSFYDLHTRLCLSNGMRAINHYLFFDGENDPILSPVKRHDWGHPVRKDGTLRSHYHR
YPRLSQTIASYGEALILARPEPVTTIGFRLDDFMTEVNLACTQEETAIITHQREVILFDLIARGLALTQRDFQALDLQHT
APDPHQVPLLWVMLDCNSDPATEQTLIKYLEAGGNVVLIGRLRADGVLAKTLNVQPHTDPPFTSRRVHIFTTPDIPASFV
QTYQGDLETVFATADGAAVGFRKRVGHGELIMLGASFPITNLDDLHVFQQLAAWVNCPAPFTLSTWADVRLSRGPDGDFL
FVNHYGDDPLETHIHYQGQPLCDGHPIRLPARSGAILPLNWQVRPGITVRYTTVEVRSITETADGLTFTFAQPEGYACFA
VRINQGSTEDRSESIQHVHFTNGTASVQWPIRTT

Specific function: Preferentially hydrolyzes beta(1->3) galactosyl linkages over beta(1->4) linkages [H]

COG id: COG1874

COG function: function code G; Beta-galactosidase

Gene ontology:

Cell location: Cytoplasmic

Metabolic importance: NA

Operon status: Not Known

Operon components: None

Similarity: Belongs to the glycosyl hydrolase 35 family [H]

Homologues:

Organism=Homo sapiens, GI40255043, Length=200, Percent_Identity=34.5, Blast_Score=113, Evalue=6e-25
Organism=Homo sapiens, GI31543093, Length=198, Percent_Identity=31.3, Blast_Score=104, Evalue=4e-22
Organism=Homo sapiens, GI164519026, Length=185, Percent_Identity=31.9, Blast_Score=98, Evalue=3e-20
Organism=Homo sapiens, GI119372312, Length=166, Percent_Identity=33.7, Blast_Score=96, Evalue=1e-19
Organism=Homo sapiens, GI119372308, Length=166, Percent_Identity=33.7, Blast_Score=96, Evalue=1e-19
Organism=Caenorhabditis elegans, GI72000600, Length=179, Percent_Identity=33.5, Blast_Score=96, Evalue=6e-20
Organism=Caenorhabditis elegans, GI17568491, Length=212, Percent_Identity=32.5, Blast_Score=94, Evalue=3e-19
Organism=Drosophila melanogaster, GI24582088, Length=192, Percent_Identity=32.8, Blast_Score=115, Evalue=1e-25
Organism=Drosophila melanogaster, GI24646169, Length=194, Percent_Identity=33.5, Blast_Score=103, Evalue=5e-22
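
The homologue lines above use a comma-separated Key=Value layout, with the GI number given as a bare token. The sketch below shows one way such a line could be parsed. `parse_homologue` is a hypothetical helper, and the field names are taken from the lines themselves.

```python
def parse_homologue(line: str) -> dict:
    """Parse one 'Key=Value, GI<number>, Key=Value' homologue line into a dict."""
    record = {}
    # A trailing comma, if present, is tolerated and ignored.
    for field in line.strip().rstrip(",").split(","):
        field = field.strip()
        if "=" in field:
            key, value = field.split("=", 1)
            record[key] = value
        elif field.startswith("GI"):
            record["GI"] = field[2:]  # bare token such as GI40255043
    # Convert the numeric fields to numbers.
    record["Length"] = int(record["Length"])
    record["Percent_Identity"] = float(record["Percent_Identity"])
    record["Blast_Score"] = int(record["Blast_Score"])
    record["Evalue"] = float(record["Evalue"])
    return record


line = ("Organism=Homo sapiens, GI40255043, Length=200, "
        "Percent_Identity=34.5, Blast_Score=113, Evalue=6e-25,")
hit = parse_homologue(line)
print(hit["Organism"], hit["GI"], hit["Blast_Score"])  # Homo sapiens 40255043 113
```

Parsed records can then be sorted or filtered, for example by `Evalue`, without further string handling.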

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR008979
- InterPro:   IPR019801
- InterPro:   IPR017853
- InterPro:   IPR013781
- InterPro:   IPR001944 [H]

Pfam domain/function: PF01301 Glyco_hydro_35 [H]

EC number: 3.2.1.23 [H]

Molecular weight: Translated: 84718; Mature: 84587
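
The gap between the two molecular weights is consistent with loss of a single Met residue. The check below is a sketch under that assumption; the Met residue mass is the standard average value (free methionine minus one water), not a figure taken from this record.

```python
MET_RESIDUE_MASS = 131.19  # Da, average mass of a Met residue (149.21 - 18.02)

translated_mw = 84718  # Translated value from this record
mature_mw = 84587      # Mature value from this record

difference = translated_mw - mature_mw
print(difference)                                 # 131
print(abs(difference - MET_RESIDUE_MASS) < 1.0)   # True
```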

Theoretical pI: Translated: 6.28; Mature: 6.28

Prosite motif: NA

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

1.2 %Cys     (Translated Protein)
1.9 %Met     (Translated Protein)
3.0 %Cys+Met (Translated Protein)
1.2 %Cys     (Mature Protein)
1.7 %Met     (Mature Protein)
2.9 %Cys+Met (Mature Protein)
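
The percentages above can be derived from residue counts. The sketch below assumes each value is the residue count divided by the sequence length, rounded to one decimal place, with the combined figure rounded from the raw counts rather than from the rounded parts. That would explain why 1.2 and 1.9 are reported with a total of 3.0. `cys_met_content` is an illustrative name, and the demonstration uses a toy peptide.

```python
def cys_met_content(protein: str) -> tuple[float, float, float]:
    """Return (%Cys, %Met, %Cys+Met), each rounded to one decimal place."""
    protein = "".join(protein.split()).upper()  # drop line breaks from wrapped text
    if not protein:
        raise ValueError("empty sequence")
    n = len(protein)
    cys = protein.count("C")
    met = protein.count("M")
    # The combined value is rounded from the raw counts, not from the rounded parts.
    return (round(100 * cys / n, 1),
            round(100 * met / n, 1),
            round(100 * (cys + met) / n, 1))


# Toy input, not the protein above: 1 Cys and 2 Met in 10 residues.
print(cys_met_content("MACDEFGHIM"))  # (10.0, 20.0, 30.0)
```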

Secondary structure:

>Translated Secondary Structure
MPAFTAHHQRFWLDDRPLFIQAGEFHYFRTPATDWEQRLDLLVRAGFNAVACYIPWLWHQ
CCCCCCCCCEEEECCCCEEEEECCEEEEECCCCCHHHHHHHHHHCCCCEEEEEEHHHCCC
PQPDQIDLTGTTHPLRNLAGFLDLAAQKGLYVIARPGPYIMAETINEGIPPWVFDHHPEI
CCCCEEEECCCCHHHHHHHHHHHHHHCCCEEEEECCCCEEEEEHHHCCCCCCEECCCCCE
ALINQHGTGENVASFMHPSFLSCVAGWYQAVFAILSPRQITRGGPILMVQLDNEMGMLHW
EEEECCCCCHHHHHHHCHHHHHHHHHHHHHHHHHHCCCEECCCCCEEEEEECCCCCCHHH
VRNSFDLNPITMQYFANWLQATYGEDMRVDPQVLATRLRHAEGAAGAQLVADYRRFFRSY
HHCCCCCCCEEHHHHHHHHHHHCCCCCCCCHHHHHHHHHHCCCCCHHHHHHHHHHHHHHH
LRDYAQWLTAEARRHGLDTPAVFNIHGFANGGKTFPIGLSQLAGVLRLPDVISATDVYPG
HHHHHHHHHHHHHHCCCCCCEEEEEEEECCCCCCCCCCHHHHHHHHHCCCHHCCCCCCCC
QIGEGTFHQLLLVNAMTCAIQNPDQPLFSIEFQAGGNLDFSNMSSSFYDLHTRLCLSNGM
CCCCCHHHHHHHHHHHEEEECCCCCCEEEEEECCCCCCCHHHCCCHHHHHHHHHHHHCCH
RAINHYLFFDGENDPILSPVKRHDWGHPVRKDGTLRSHYHRYPRLSQTIASYGEALILAR
HEEEEEEEEECCCCCCCCCHHHCCCCCCCCCCCCHHHHHHCCCHHHHHHHHCCCEEEEEC
PEPVTTIGFRLDDFMTEVNLACTQEETAIITHQREVILFDLIARGLALTQRDFQALDLQH
CCCCEEEEEEHHHHHHHCCEEECCCCEEEEEECCHHHHHHHHHHHHHHHHCCHHHCCCCC
TAPDPHQVPLLWVMLDCNSDPATEQTLIKYLEAGGNVVLIGRLRADGVLAKTLNVQPHTD
CCCCCCCCEEEEEEEECCCCCCHHHHHHHHHHCCCCEEEEEEECCCCEEEEEEECCCCCC
PPFTSRRVHIFTTPDIPASFVQTYQGDLETVFATADGAAVGFRKRVGHGELIMLGASFPI
CCCCCCEEEEEECCCCCHHHHHHHCCCCEEEEEECCCCHHHHHHHCCCCCEEEEECCCCC
TNLDDLHVFQQLAAWVNCPAPFTLSTWADVRLSRGPDGDFLFVNHYGDDPLETHIHYQGQ
CCCHHHHHHHHHHHHHCCCCCCEECCCEEEEECCCCCCCEEEEECCCCCCHHEEEEECCC
PLCDGHPIRLPARSGAILPLNWQVRPGITVRYTTVEVRSITETADGLTFTFAQPEGYACF
CCCCCCCEEEECCCCCEEEEEEEECCCEEEEEEEEEEEEHHHCCCCCEEEEECCCCEEEE
AVRINQGSTEDRSESIQHVHFTNGTASVQWPIRTT
EEEECCCCCCCHHCCCEEEEEECCCEEEEECEECC
>Mature Secondary Structure
PAFTAHHQRFWLDDRPLFIQAGEFHYFRTPATDWEQRLDLLVRAGFNAVACYIPWLWHQ
CCCCCCCCEEEECCCCEEEEECCEEEEECCCCCHHHHHHHHHHCCCCEEEEEEHHHCCC
PQPDQIDLTGTTHPLRNLAGFLDLAAQKGLYVIARPGPYIMAETINEGIPPWVFDHHPEI
CCCCEEEECCCCHHHHHHHHHHHHHHCCCEEEEECCCCEEEEEHHHCCCCCCEECCCCCE
ALINQHGTGENVASFMHPSFLSCVAGWYQAVFAILSPRQITRGGPILMVQLDNEMGMLHW
EEEECCCCCHHHHHHHCHHHHHHHHHHHHHHHHHHCCCEECCCCCEEEEEECCCCCCHHH
VRNSFDLNPITMQYFANWLQATYGEDMRVDPQVLATRLRHAEGAAGAQLVADYRRFFRSY
HHCCCCCCCEEHHHHHHHHHHHCCCCCCCCHHHHHHHHHHCCCCCHHHHHHHHHHHHHHH
LRDYAQWLTAEARRHGLDTPAVFNIHGFANGGKTFPIGLSQLAGVLRLPDVISATDVYPG
HHHHHHHHHHHHHHCCCCCCEEEEEEEECCCCCCCCCCHHHHHHHHHCCCHHCCCCCCCC
QIGEGTFHQLLLVNAMTCAIQNPDQPLFSIEFQAGGNLDFSNMSSSFYDLHTRLCLSNGM
CCCCCHHHHHHHHHHHEEEECCCCCCEEEEEECCCCCCCHHHCCCHHHHHHHHHHHHCCH
RAINHYLFFDGENDPILSPVKRHDWGHPVRKDGTLRSHYHRYPRLSQTIASYGEALILAR
HEEEEEEEEECCCCCCCCCHHHCCCCCCCCCCCCHHHHHHCCCHHHHHHHHCCCEEEEEC
PEPVTTIGFRLDDFMTEVNLACTQEETAIITHQREVILFDLIARGLALTQRDFQALDLQH
CCCCEEEEEEHHHHHHHCCEEECCCCEEEEEECCHHHHHHHHHHHHHHHHCCHHHCCCCC
TAPDPHQVPLLWVMLDCNSDPATEQTLIKYLEAGGNVVLIGRLRADGVLAKTLNVQPHTD
CCCCCCCCEEEEEEEECCCCCCHHHHHHHHHHCCCCEEEEEEECCCCEEEEEEECCCCCC
PPFTSRRVHIFTTPDIPASFVQTYQGDLETVFATADGAAVGFRKRVGHGELIMLGASFPI
CCCCCCEEEEEECCCCCHHHHHHHCCCCEEEEEECCCCHHHHHHHCCCCCEEEEECCCCC
TNLDDLHVFQQLAAWVNCPAPFTLSTWADVRLSRGPDGDFLFVNHYGDDPLETHIHYQGQ
CCCHHHHHHHHHHHHHCCCCCCEECCCEEEEECCCCCCCEEEEECCCCCCHHEEEEECCC
PLCDGHPIRLPARSGAILPLNWQVRPGITVRYTTVEVRSITETADGLTFTFAQPEGYACF
CCCCCCCEEEECCCCCEEEEEEEECCCEEEEEEEEEEEEHHHCCCCCEEEEECCCCEEEE
AVRINQGSTEDRSESIQHVHFTNGTASVQWPIRTT
EEEECCCCCCCHHCCCEEEEEECCCEEEEECEECC
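
The prediction strings above can be summarised as a per-state composition. This sketch assumes the common three-state notation (H for helix, E for strand, C for coil), which the record itself does not define. `ss_composition` is a hypothetical helper, shown on a toy string.

```python
from collections import Counter


def ss_composition(states: str) -> dict[str, float]:
    """Percentage of H, E and C characters in a secondary-structure string."""
    states = "".join(states.split()).upper()  # drop line breaks between segments
    if not states:
        raise ValueError("empty prediction string")
    counts = Counter(states)
    total = len(states)
    return {s: round(100 * counts.get(s, 0) / total, 1) for s in "HEC"}


# Toy input, not the prediction above: 4 H, 2 E and 4 C in 10 positions.
print(ss_composition("CCHHHHEECC"))  # {'H': 40.0, 'E': 20.0, 'C': 40.0}
```

To apply it to this record, concatenate only the prediction lines (every second line of each block) and leave out the amino acid lines.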

PDB accession: NA

Resolution: NA

Structure class: Alpha Beta

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 8563148 [H]