Definition: Bacteroides vulgatus ATCC 8482 chromosome, complete genome.
Accession: NC_009614
Length: 5,163,189 bp

The map label for this gene is yhgF [C]

Identifier: 150005421

GI number: 150005421

Start: 3672600

End: 3674729

Strand: Reverse

Name: yhgF [C]

Synonym: BVU_2904

Alternate gene names: 150005421

Gene position: 3674729-3672600 (Counterclockwise)
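
The coordinates above fix the gene and protein lengths reported elsewhere in this record. A minimal sketch, assuming 1-based inclusive coordinates and a single terminal stop codon; the function names are illustrative and not part of the database:

```python
def gene_length(start: int, end: int) -> int:
    """Length in bases of a feature given 1-based inclusive coordinates."""
    return abs(end - start) + 1


def protein_length(n_bases: int) -> int:
    """Residue count of a coding sequence, assuming one terminal stop codon."""
    if n_bases % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return n_bases // 3 - 1


n = gene_length(3672600, 3674729)
print(n, protein_length(n))  # prints "2130 709"
```

Both values agree with the ">2130_bases" header and the 709 amino acids listed for this gene.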

Preceding gene: 150005424

Following gene: 150005420

Centisome position: 71.17
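
The centisome value is consistent with the gene's first base (3674729, since the gene lies on the reverse strand) expressed as a percentage of the 5,163,189 bp chromosome. A short sketch under that assumption; the function name is hypothetical:

```python
def centisome(position: int, genome_length: int) -> float:
    """Position as a percentage of total chromosome length, to 2 decimals."""
    return round(100.0 * position / genome_length, 2)


print(centisome(3674729, 5163189))  # prints 71.17
```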

GC content: 50.52

Gene sequence:

>2130_bases
ATGGAACTATTCAATAAAATGATTGCCGCAGCCCTGAAGGTTTCCGTACATCAAGTAGACAATACATTATCTCTACTGGG
CGGGGGAGCTACCATCCCTTTCATAAGCCGATACCGGAAAGAGGCAACCGGAGGGCTGGACGAAGTACAAATAGGCGAAA
TTAAAGACAGAAATGACAAGTTGTGTGAACTGGCCAAGCGAAAAGAAACCATTCTTTCTACCATAGAAGAACAGGGCAAA
CTGACAGAAGAATTGCGCAAACGCATTGAACAGAGTTGGGATGCCACCGAAGTGGAAGACATATATCTGCCCTATAAGCC
GAAACGTAAGACCCGCGCCGAAGCGGCACGGCAAAAAGGACTGGAACCGCTTGCCACCTTACTGTTATTGCAACGCGAAA
ACCATCTGGACAGCCGTCTCCCCGCTTTTGTGAAAGGAGATGTGAAAGACGAAGAGGATGCGCTGAAAGGCGCGCGCGAC
ATCATAGCCGAACAAGTGAGCGAGGACGAGCGCGCCCGCAACCAACTGCGCAACCAATTTTCCCGACAAGCTGTCATCAC
CTCCAAAGTGGTGAAAGGCAAAGAGGAAGAAGCTGCCAAATACCGCGATTATTTCGATTTCAGCGAACCTTTGAAACGCT
GTTCCTCCCACCGTCTGCTGGCTATCCGCCGGGGAGAGTCGGAAGGCCTGTTGAAAGTAAGTATAAGTCCTGATGACGAA
GAGTGCGCCGGACGGCTGGAACAAATGTATGTCCGTGGCAACAATGAATGCAGCCGCCAGGTAGGTGAGGCAGTGCGTGA
TGCGTACAAGCGGCTTCTGAAGCCTTCCATCGAAACCGAATTTTCAGCCTTGAGCAAGGAGAAAGCCGACGAGGAGGCAA
TCCGCGTCTTCGCCGGAAACCTGAGACAGCTTCTGCTGGCTCCTCCTTTGGGACAGAAACGGGTCATGGGAATTGATCCC
GGCTACCGTACCGGATGTAAAATAGTCTGTCTGGACGCACAAGGCTCTTTGCTGCACAATGAAACCATTTATCCGCATCC
GCCCAAGAATGAATACTCACAGGCAGCCCGCAGCATCGTGAAATTGGTAGAACAATATCAGATAGAGGCAATCGCCATAG
GCAATGGCACCGCCAGCCGTGAGACAGAACAATTCATCACTTCACAACGCTATGACCGTGAATTGCAAGTATTTGTAGTA
AGCGAGGACGGAGCATCCATTTACTCGGCCTCGAAAACCGCCCGAGACGAGTTCCCTGAATACGACGTGACCGTGCGTGG
AGCCGTATCCATAGGCCGCCGGTTGATGGACCCGCTGGCCGAACTGGTGAAAATAGATGCCAAATCTATCGGAGTGGGAC
AATACCAGCATGATGTGGACCAGACTTTATTGAAGAAATCATTGGACCAGACAGTGGAAAGTTGTGTGAACCTGGTAGGT
GTGAACCTCAATACCGCCAGCCGCCATCTGCTGACTTATATATCGGGACTAGGTCCTGCCTTGGCTCAAAACATTGTGGA
TTACCGCACCGAGAACGGGCCTTTCAGTTCACGCAAGGAGTTATTGAAAGTACCCCGGATGGGAGCGAAAGCGTTTGAGC
AATGCGCAGGGTTCCTGCGTATCCCACAAGCGAAGAATCCGCTGGACAACTCGGCCGTGCATCCCGAAAGCTATCCCATA
GTGGAGCAAATAGCCAAAGACCTGAACTGCACCGTAGACGAACTGATAAAAAGCAAAGAACTGCGCAGCCGGATTGATAT
CAAGAAGTATGTCACCCCGACAGTGGGGCTTCCCACCCTGACAGACATCATGCAGGAATTGGACAAGCCGGGACGCGACC
CAAGGCAACAAATCCAAGTTTTTGAATTTGACAAGAATGTAAAGACCATAGAGGATTTGACTGAAGGAATGGAACTGCCC
GGCATTGTGAACAATATTACCAACTTCGGTTGCTTTGTAGATATAGGAATCAAAGAGAAGGGTCTGGTGCACGTATCACA
ACTGGCCGACAAATTTGTGAGCGACCCCACCACCGTAGTCAGCATCCATCAGCATGTACGAGTGAAAGTGATGAGCATAG
ACCTAGAGCGCAAGCGCATCCAGCTCACTATGAAAGGATTGAATCAATAA
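
The GC content field reported for this gene can in principle be recomputed from the sequence block above. The following is a sketch only: it assumes GC content means the percentage of G and C bases over all bases, and it has not been checked here against the reported 50.52. The function name is illustrative.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G+C bases in a sequence; FASTA header lines are skipped."""
    lines = [ln for ln in seq.splitlines() if not ln.startswith(">")]
    bases = "".join("".join(lines).split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc = bases.count("G") + bases.count("C")
    return round(100.0 * gc / len(bases), 2)


# Toy input: header line plus two sequence lines (6 of 8 bases are G or C).
print(gc_percent(">example\nATGC\nGGCC"))  # prints 75.0
```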

Upstream 100 bases:

>100_bases
ATACCGAGCGAACGGATAAGATACACTGTTTTCAACCTCAAATTATACGGAAAAACCCATACCTTTTTTATACCTTTGCC
CCCAAAAGAAAACAACAGAC

Downstream 100 bases:

>100_bases
TTGCAACAAGGTAAAAAAGAATATGGAAGAAGATATAGAAACCTGCGGCTGGTTCGTTTTTTATAAGGATCAGCTGCTGA
TAGAGAAAAAGAACGGAATG

Product: putative RNA-binding protein

Products: NA

Alternate protein names: NA

Number of amino acids: Translated: 709; Mature: 709

Protein sequence:

>709_residues
MELFNKMIAAALKVSVHQVDNTLSLLGGGATIPFISRYRKEATGGLDEVQIGEIKDRNDKLCELAKRKETILSTIEEQGK
LTEELRKRIEQSWDATEVEDIYLPYKPKRKTRAEAARQKGLEPLATLLLLQRENHLDSRLPAFVKGDVKDEEDALKGARD
IIAEQVSEDERARNQLRNQFSRQAVITSKVVKGKEEEAAKYRDYFDFSEPLKRCSSHRLLAIRRGESEGLLKVSISPDDE
ECAGRLEQMYVRGNNECSRQVGEAVRDAYKRLLKPSIETEFSALSKEKADEEAIRVFAGNLRQLLLAPPLGQKRVMGIDP
GYRTGCKIVCLDAQGSLLHNETIYPHPPKNEYSQAARSIVKLVEQYQIEAIAIGNGTASRETEQFITSQRYDRELQVFVV
SEDGASIYSASKTARDEFPEYDVTVRGAVSIGRRLMDPLAELVKIDAKSIGVGQYQHDVDQTLLKKSLDQTVESCVNLVG
VNLNTASRHLLTYISGLGPALAQNIVDYRTENGPFSSRKELLKVPRMGAKAFEQCAGFLRIPQAKNPLDNSAVHPESYPI
VEQIAKDLNCTVDELIKSKELRSRIDIKKYVTPTVGLPTLTDIMQELDKPGRDPRQQIQVFEFDKNVKTIEDLTEGMELP
GIVNNITNFGCFVDIGIKEKGLVHVSQLADKFVSDPTTVVSIHQHVRVKVMSIDLERKRIQLTMKGLNQ
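
The protein sequence above corresponds to a direct translation of the gene sequence. A minimal sketch, assuming the standard codon assignments (which the bacterial translation table shares) and translation that stops at the first stop codon; `translate` is an illustrative helper, not a tool used by the database:

```python
BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (standard genetic code).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]  # raises KeyError on non-ACGT codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGAACTATTCAATAAAATG"))  # first 21 bases; prints MELFNKM
print(translate("ATGAAAGGATTGAATCAATAA"))  # last 21 bases; prints MKGLNQ
```

The two fragments reproduce the first and last residues of the 709-residue protein.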

Sequences:

>Translated_709_residues
MELFNKMIAAALKVSVHQVDNTLSLLGGGATIPFISRYRKEATGGLDEVQIGEIKDRNDKLCELAKRKETILSTIEEQGK
LTEELRKRIEQSWDATEVEDIYLPYKPKRKTRAEAARQKGLEPLATLLLLQRENHLDSRLPAFVKGDVKDEEDALKGARD
IIAEQVSEDERARNQLRNQFSRQAVITSKVVKGKEEEAAKYRDYFDFSEPLKRCSSHRLLAIRRGESEGLLKVSISPDDE
ECAGRLEQMYVRGNNECSRQVGEAVRDAYKRLLKPSIETEFSALSKEKADEEAIRVFAGNLRQLLLAPPLGQKRVMGIDP
GYRTGCKIVCLDAQGSLLHNETIYPHPPKNEYSQAARSIVKLVEQYQIEAIAIGNGTASRETEQFITSQRYDRELQVFVV
SEDGASIYSASKTARDEFPEYDVTVRGAVSIGRRLMDPLAELVKIDAKSIGVGQYQHDVDQTLLKKSLDQTVESCVNLVG
VNLNTASRHLLTYISGLGPALAQNIVDYRTENGPFSSRKELLKVPRMGAKAFEQCAGFLRIPQAKNPLDNSAVHPESYPI
VEQIAKDLNCTVDELIKSKELRSRIDIKKYVTPTVGLPTLTDIMQELDKPGRDPRQQIQVFEFDKNVKTIEDLTEGMELP
GIVNNITNFGCFVDIGIKEKGLVHVSQLADKFVSDPTTVVSIHQHVRVKVMSIDLERKRIQLTMKGLNQ
>Mature_709_residues
MELFNKMIAAALKVSVHQVDNTLSLLGGGATIPFISRYRKEATGGLDEVQIGEIKDRNDKLCELAKRKETILSTIEEQGK
LTEELRKRIEQSWDATEVEDIYLPYKPKRKTRAEAARQKGLEPLATLLLLQRENHLDSRLPAFVKGDVKDEEDALKGARD
IIAEQVSEDERARNQLRNQFSRQAVITSKVVKGKEEEAAKYRDYFDFSEPLKRCSSHRLLAIRRGESEGLLKVSISPDDE
ECAGRLEQMYVRGNNECSRQVGEAVRDAYKRLLKPSIETEFSALSKEKADEEAIRVFAGNLRQLLLAPPLGQKRVMGIDP
GYRTGCKIVCLDAQGSLLHNETIYPHPPKNEYSQAARSIVKLVEQYQIEAIAIGNGTASRETEQFITSQRYDRELQVFVV
SEDGASIYSASKTARDEFPEYDVTVRGAVSIGRRLMDPLAELVKIDAKSIGVGQYQHDVDQTLLKKSLDQTVESCVNLVG
VNLNTASRHLLTYISGLGPALAQNIVDYRTENGPFSSRKELLKVPRMGAKAFEQCAGFLRIPQAKNPLDNSAVHPESYPI
VEQIAKDLNCTVDELIKSKELRSRIDIKKYVTPTVGLPTLTDIMQELDKPGRDPRQQIQVFEFDKNVKTIEDLTEGMELP
GIVNNITNFGCFVDIGIKEKGLVHVSQLADKFVSDPTTVVSIHQHVRVKVMSIDLERKRIQLTMKGLNQ

Specific function: Unknown

COG id: COG2183

COG function: function code K; Transcriptional accessory protein

Gene ontology:

Cell location: Cytoplasm [C]

Metabolic importance: Non_Essential [C]

Operon status: Not Known

Operon components: None

Similarity: Contains 1 S1 motif domain [H]

Homologues:

Organism=Homo sapiens, GI221136781, Length=771, Percent_Identity=34.3709468223087, Blast_Score=378, Evalue=1e-104
Organism=Homo sapiens, GI27597090, Length=539, Percent_Identity=24.4897959183673, Blast_Score=102, Evalue=1e-21
Organism=Escherichia coli, GI87082262, Length=718, Percent_Identity=48.8857938718663, Blast_Score=648, Evalue=0.0
Organism=Escherichia coli, GI1787140, Length=80, Percent_Identity=42.5, Blast_Score=74, Evalue=4e-14
Organism=Caenorhabditis elegans, GI17511129, Length=720, Percent_Identity=30.1388888888889, Blast_Score=236, Evalue=4e-62
Organism=Caenorhabditis elegans, GI17552892, Length=600, Percent_Identity=22, Blast_Score=93, Evalue=5e-19
Organism=Drosophila melanogaster, GI62484314, Length=768, Percent_Identity=31.25, Blast_Score=369, Evalue=1e-102
Organism=Drosophila melanogaster, GI24640080, Length=505, Percent_Identity=21.3861386138614, Blast_Score=82, Evalue=9e-16
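
Each homologue line is a comma-separated list of key=value fields plus a bare GI identifier. A sketch of one way to parse such a line into typed values; `parse_homologue` and the output key "GI" are hypothetical names, and the parser tolerates an optional trailing comma:

```python
def parse_homologue(line: str) -> dict:
    """Parse one homologue line into a dict with numeric fields converted."""
    converters = {
        "Length": int,
        "Percent_Identity": float,
        "Blast_Score": int,
        "Evalue": float,
    }
    record = {}
    for part in line.strip().rstrip(",").split(","):
        part = part.strip()
        if "=" in part:
            key, value = part.split("=", 1)
            record[key] = converters.get(key, str)(value)
        elif part.startswith("GI"):
            record["GI"] = part[2:]  # bare identifier such as GI87082262
    return record


hit = parse_homologue(
    "Organism=Escherichia coli, GI87082262, Length=718, "
    "Percent_Identity=48.8857938718663, Blast_Score=648, Evalue=0.0,"
)
print(hit["Organism"], hit["GI"], hit["Blast_Score"])
```

Sorting the parsed records by Blast_Score would place the Escherichia coli hit (score 648) first among the hits listed here.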

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR003583
- InterPro:   IPR012340
- InterPro:   IPR016027
- InterPro:   IPR003029
- InterPro:   IPR005227
- InterPro:   IPR006641
- InterPro:   IPR022967
- InterPro:   IPR018974
- InterPro:   IPR023097 [H]

Pfam domain/function: PF00575 S1; PF09371 Tex_N [H]

EC number: NA

Molecular weight: Translated: 79607; Mature: 79607

Theoretical pI: Translated: 6.90; Mature: 6.90

Prosite motif: PS50126 S1

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

1.4 %Cys     (Translated Protein)
1.4 %Met     (Translated Protein)
2.8 %Cys+Met (Translated Protein)
1.4 %Cys     (Mature Protein)
1.4 %Met     (Mature Protein)
2.8 %Cys+Met (Mature Protein)
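
The percentages above can be read as residue counts divided by protein length. A minimal sketch under that assumption, shown on a toy sequence rather than the full protein; the function name is illustrative:

```python
def residue_percent(protein: str, residues: str) -> float:
    """Percentage of the protein made up of the given one-letter residues."""
    protein = "".join(protein.split()).upper()
    if not protein:
        raise ValueError("empty protein sequence")
    count = sum(protein.count(r) for r in residues.upper())
    return round(100.0 * count / len(protein), 1)


toy = "MACM"  # toy sequence, not the protein in this record
print(residue_percent(toy, "C"), residue_percent(toy, "M"), residue_percent(toy, "CM"))
# prints "25.0 50.0 75.0"
```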

Secondary structure:

>Translated Secondary Structure
MELFNKMIAAALKVSVHQVDNTLSLLGGGATIPFISRYRKEATGGLDEVQIGEIKDRNDK
CHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCCCCEEECCCCCCCHH
LCELAKRKETILSTIEEQGKLTEELRKRIEQSWDATEVEDIYLPYKPKRKTRAEAARQKG
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHEECCCCCCHHHHHHHHHHCC
LEPLATLLLLQRENHLDSRLPAFVKGDVKDEEDALKGARDIIAEQVSEDERARNQLRNQF
CHHHHHHHHHHHHHHHHHHCCHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
SRQAVITSKVVKGKEEEAAKYRDYFDFSEPLKRCSSHRLLAIRRGESEGLLKVSISPDDE
HHHHHHHHHHHCCCHHHHHHHHHHCCHHHHHHHHCCCCEEEEEECCCCCEEEEEECCCCH
ECAGRLEQMYVRGNNECSRQVGEAVRDAYKRLLKPSIETEFSALSKEKADEEAIRVFAGN
HHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHH
LRQLLLAPPLGQKRVMGIDPGYRTGCKIVCLDAQGSLLHNETIYPHPPKNEYSQAARSIV
HHHHHHCCCCCCCEEEECCCCCCCCCEEEEECCCCCEEECCCCCCCCCCCHHHHHHHHHH
KLVEQYQIEAIAIGNGTASRETEQFITSQRYDRELQVFVVSEDGASIYSASKTARDEFPE
HHHHHHCEEEEEECCCCCCHHHHHHHHHHHCCCEEEEEEEECCCCCHHHHHHHHHHCCCC
YDVTVRGAVSIGRRLMDPLAELVKIDAKSIGVGQYQHDVDQTLLKKSLDQTVESCVNLVG
CCEEEHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHC
VNLNTASRHLLTYISGLGPALAQNIVDYRTENGPFSSRKELLKVPRMGAKAFEQCAGFLR
CCCCHHHHHHHHHHHHCCHHHHHHHHHHCCCCCCCHHHHHHHHCCCCCHHHHHHHHHHHC
IPQAKNPLDNSAVHPESYPIVEQIAKDLNCTVDELIKSKELRSRIDIKKYVTPTVGLPTL
CCCCCCCCCCCCCCCCCCHHHHHHHHHCCCCHHHHHHHHHHHHHHCHHHHCCCCCCCHHH
TDIMQELDKPGRDPRQQIQVFEFDKNVKTIEDLTEGMELPGIVNNITNFGCFVDIGIKEK
HHHHHHHCCCCCCHHHHCCHHCCCCCHHHHHHHHCCCCCCHHHHHHCCCCEEEEECCCCC
GLVHVSQLADKFVSDPTTVVSIHQHVRVKVMSIDLERKRIQLTMKGLNQ
CCCHHHHHHHHHHCCCHHHHHHHHHHEEEEEEEEHHHHHHHHHHHCCCC
>Mature Secondary Structure
MELFNKMIAAALKVSVHQVDNTLSLLGGGATIPFISRYRKEATGGLDEVQIGEIKDRNDK
CHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCCCCEEECCCCCCCHH
LCELAKRKETILSTIEEQGKLTEELRKRIEQSWDATEVEDIYLPYKPKRKTRAEAARQKG
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHEECCCCCCHHHHHHHHHHCC
LEPLATLLLLQRENHLDSRLPAFVKGDVKDEEDALKGARDIIAEQVSEDERARNQLRNQF
CHHHHHHHHHHHHHHHHHHCCHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
SRQAVITSKVVKGKEEEAAKYRDYFDFSEPLKRCSSHRLLAIRRGESEGLLKVSISPDDE
HHHHHHHHHHHCCCHHHHHHHHHHCCHHHHHHHHCCCCEEEEEECCCCCEEEEEECCCCH
ECAGRLEQMYVRGNNECSRQVGEAVRDAYKRLLKPSIETEFSALSKEKADEEAIRVFAGN
HHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHH
LRQLLLAPPLGQKRVMGIDPGYRTGCKIVCLDAQGSLLHNETIYPHPPKNEYSQAARSIV
HHHHHHCCCCCCCEEEECCCCCCCCCEEEEECCCCCEEECCCCCCCCCCCHHHHHHHHHH
KLVEQYQIEAIAIGNGTASRETEQFITSQRYDRELQVFVVSEDGASIYSASKTARDEFPE
HHHHHHCEEEEEECCCCCCHHHHHHHHHHHCCCEEEEEEEECCCCCHHHHHHHHHHCCCC
YDVTVRGAVSIGRRLMDPLAELVKIDAKSIGVGQYQHDVDQTLLKKSLDQTVESCVNLVG
CCEEEHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHC
VNLNTASRHLLTYISGLGPALAQNIVDYRTENGPFSSRKELLKVPRMGAKAFEQCAGFLR
CCCCHHHHHHHHHHHHCCHHHHHHHHHHCCCCCCCHHHHHHHHCCCCCHHHHHHHHHHHC
IPQAKNPLDNSAVHPESYPIVEQIAKDLNCTVDELIKSKELRSRIDIKKYVTPTVGLPTL
CCCCCCCCCCCCCCCCCCHHHHHHHHHCCCCHHHHHHHHHHHHHHCHHHHCCCCCCCHHH
TDIMQELDKPGRDPRQQIQVFEFDKNVKTIEDLTEGMELPGIVNNITNFGCFVDIGIKEK
HHHHHHHCCCCCCHHHHCCHHCCCCCHHHHHHHHCCCCCCHHHHHHCCCCEEEEECCCCC
GLVHVSQLADKFVSDPTTVVSIHQHVRVKVMSIDLERKRIQLTMKGLNQ
CCCHHHHHHHHHHCCCHHHHHHHHHHEEEEEEEEHHHHHHHHHHHCCCC
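
The prediction strings above pair each residue with one state. A sketch for summarizing such a string, assuming the usual three-state convention (H = helix, E = strand, C = coil), which this record does not state explicitly; the function name is hypothetical:

```python
from collections import Counter


def state_percentages(prediction: str) -> dict:
    """Percentage of residues assigned to each secondary-structure state."""
    states = "".join(prediction.split()).upper()
    if not states:
        raise ValueError("empty prediction string")
    counts = Counter(states)
    return {s: round(100.0 * n / len(states), 1) for s, n in sorted(counts.items())}


print(state_percentages("HHHHCCEE"))  # prints {'C': 25.0, 'E': 25.0, 'H': 50.0}
```

To summarize this record, concatenate only the state lines (every second line of the block) before calling the function.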

PDB accession: NA

Resolution: NA

Structure class: Unstructured

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 7542800 [H]