Definition Geobacter sulfurreducens PCA chromosome, complete genome.
Accession NC_002939
Length 3,814,139

The map label for this gene is yhgF [C]

Identifier: 39995345

GI number: 39995345

Start: 242523

End: 244808

Strand: Reverse

Name: yhgF [C]

Synonym: GSU0235

Alternate gene names: 39995345

Gene position: 244808-242523 (Counterclockwise)

Preceding gene: 39995346

Following gene: 39995342

Centisome position: 6.42
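
The coordinate fields above can be cross-checked with a little arithmetic. The sketch below assumes 1-based inclusive coordinates, and assumes that the centisome position is the 5' end of the gene (the End coordinate, since this gene is on the reverse strand) expressed as a percentage of the chromosome length. That convention is inferred from the numbers in this record rather than documented by it, and the function names are illustrative.

```python
def gene_length(start: int, end: int) -> int:
    """Length in bases of a feature given 1-based inclusive coordinates."""
    return abs(end - start) + 1


def centisome(position: int, genome_length: int) -> float:
    """Position on the chromosome as a percentage of its total length."""
    return 100.0 * position / genome_length


# Values copied from this record.
START, END, GENOME_LENGTH = 242523, 244808, 3814139

print(gene_length(START, END))                   # 2286
print(round(centisome(END, GENOME_LENGTH), 2))   # 6.42
```

Using the Start coordinate instead gives 6.36, which is why the End coordinate is assumed here.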

GC content: 68.59

Gene sequence:

>2286_bases
ATGGCACTGAGCGACCAGACCCTGCAGGATATTCTCCGCTACCTGACCGAAGAAACCGGCCTCGCCCCCTTCCAGGTGGC
GAACACGGTGGAACTTCTGCGCGAAGGGGGCACGGTCCCCTTCATCGCCCGCTACCGGAAGGAGCGGACCGGCGAACTGG
ACGAGGTCGGCATCCGCGGGATCGAAGAGCGCCTCGCCTATTTCACCGAGTTGGAGGAGCGCAAGCTCACGGTCCTCAAA
TCCATCGAGGAACAGGGGAAGCTCACCCCCGAGCTGGCGGACCGCATCCGGACATCGCGCCAGAAGACCGAGGTTGAGGA
CCTCTACCTCCCCTACAAGCCCAAGCGGCGCACCAAGGCGACCATCGCCCGGGAGCGGGGGCTGGAGCCCCTGGCCGACC
TCATGGCGGCCCAGGAGCTGACCGCGGGCACGCCCGAAGAGGCTACCCTCCCCTTCGTGGACCCCGCCGGCGAGGTGCCC
GACGCGGCCGCCGCCCTGGAGGGAGCCGGCCACATCCTGGCCGAGCGGCTGGCGGACGACGCCGATGCCCGGGCCTTTGT
CCGCCGCCTCACGGCCGACCAGGGGATCTTCTGTTCACGGGTGGCGGCCGACCGGAAAGGGGCGGTCACCAAGTTCGAGA
TGTACTACGACCACCGGGAGCCCCTGAAATCGGTACCCTCCCACCGGATGCTCGCCATGCGCCGGGGCGAGAAGGAAGAG
GTGCTCGTCCTGACCATTGAGGCGCCGACGGAGGAGATCCTCGCCGGCCTCCGGGCCCGGATCGCGGGGCGCGCCAGCAT
CTTCCGCCAACTTCTGGAAGGGGCGGCAGAGGACGCCTACAAACGCCTCATCGCCCCCTCCATCGAAGTTGAGCTGCGGC
TGGAAGCCAAGCAGCGGGCCGACGAGGCAGCCATCACCGTCTTTGCCCAGAACCTCCGCAACCTCCTCCTGGCGCCGCCG
GCCGGCGGCAAGCGGGTCCTCGGTGTGGACCCGGGGCTGCGGACCGGCTCCAAGCTGGCTGCGGTAGACGGCACTGGCCG
GTTCCTGGAGCATGTGACCATCTACCCCCACGCCGGCGGCGAAGGGAAGGTGGCGGCCGCGAAGCGCGAATTCCTCCGGC
TGGTGGAGTGCCACGGCATCGAGATGGTGGCTGTGGGGAACGGCACCGCGGGACGGGAGATGGAGCAGTTCGCCAAGGAA
ACCCTGGCCGAAGGAGGCAAGAGGATTCCTGTGGTCATGGTGAGCGAGGCCGGGGCCAGCGTCTACTCCGCCTCGGAGAT
CGCCCGGGAGGAGTTCCCGGACCTGGACCTGACCGTGCGGGGGGCCATCTCCATCGCCCGGCGCCTCCAGGATCCCCTGG
CCGAACTGGTGAAGATCGACCCCAAGAGCATCGGCGTGGGCCAGTACCAGCACGACGTGGACCAGCGCGCCCTGAAGAAG
TCCCTGGACGCGGTGGTGGAGTCGTGCGTCAACTACGTGGGAGTCGACCTGAACACCGCCTCCTGGGCGCTCCTCTCCTA
CGTCTCCGGCGTGGGGCCGTCCTTGGCCCGTGCCATTGTCCGCCATCGGGACGAGCACGGTCCCTTCCCCGCCCGCCGGG
CACTCATGAAGGTCCCCCGGTTCGGCGACAAGGCCTTCGAGCAGGCAGCCGGCTTTCTCCGCATCAGGGGCGGCGAACAC
CCCCTGGACGCCACCGCCGTCCACCCGGAGCGCTATCCGGTGGTGGAGGCCATGGCAACGGATCTGGGCGTGTCCCTCAG
CCAGCTGGCCGCCGACCCGGCCCTGCCGGCGCGGATCGACCTGAAGCGCTACGTGACCGACGCGGTGGGGCTTCCCACCC
TGCGTGACATCATGGAAGAGTTGAAAAAACCGGGCCGGGACCCGCGGGAGGAATTCCGCACCGCCGCCTTCCGGGACGAC
GTGACCGAGATCGGCGATCTGAAGGAGGGAATGGTGCTCCAGGGGGTGGTGACCAACGTGACCGCGTTCGGGGCCTTCGT
GGACGTGGGGGTCCACCAGGACGGCCTGGTCCATGTGAGCCACCTCTCGGTCAAGTTCGTGAAGGACCCGGCCGAGGCGG
TGAGGGTGGGGCAGGTGGTGCAGGTGAAGGTCTTGTCCGTGGACGTGCAGCGCAAGCGCATCTCCCTCTCCATTCGGGAG
GCAGCGCCCGGCGGTGCCCCGCGGAGCACGGAAAAGCCGGCGCGAAAGGAAGAGCCGCGGAAAACCGGCGGCGGACTTGA
CCTGGGAGGGCTGGAACGGGCCGGATTCCGGGTGAAAAAGCGGTAA
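
The gene sequence above is shown in coding orientation (it starts with ATG and ends with TAA), so it can be translated directly without reverse-complementing. The following is a minimal sketch of how the GC content and the protein sequence could be recomputed from it. It assumes the standard codon assignments and a sequence containing only A, C, G and T; the helper names are illustrative and not part of the record.

```python
BASES = "TCAG"
# Standard codon assignments, ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove line breaks and spaces, and upper-case the sequence."""
    return "".join(seq.split()).upper()


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = clean(seq)
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq: str) -> str:
    """Translate codon by codon, stopping at the first stop codon.

    Raises KeyError on ambiguous bases such as N.
    """
    seq = clean(seq)
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First seven and last four codons of the gene sequence above.
print(translate("ATGGCACTGAGCGACCAGACC"))  # MALSDQT
print(translate("AAAAAGCGGTAA"))           # KKR
```

Passing the full 2286-base sequence to `gc_content` and `translate` should reproduce the GC content and protein sequence reported in this record, provided the record was computed the same way.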

Upstream 100 bases:

>100_bases
GCCGCATGGACCGGCACTGAAGCCTTCCCCGCTTTTCAAACGCCGCCCCCTGTGGTAGGATGCCCGCCTTTCCCAACTCC
CCCGGAACAGAGGACTTTCC

Downstream 100 bases:

>100_bases
CGGTTAGGCCTGTTTCCCCTCTTTTTTCAGGTGGATGAACCAGTAGGCAATAGCCACGAAGATCACCCCGCCGATGACAT
TGCCGATGGTGACCGGCACC

Product: S1 RNA-binding protein

Products: NA

Alternate protein names: NA

Number of amino acids: Translated: 761; Mature: 760
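
Both residue counts follow from the gene length: 2286 bases make 762 codons, the last of which is the stop codon. The mature count being one lower is consistent with removal of the initiator methionine, although the record does not say so explicitly, so treat that as an assumption. A small sketch of the arithmetic, with an illustrative function name:

```python
def translated_length(cds_bases: int) -> int:
    """Residues encoded by a coding sequence that includes its stop codon."""
    if cds_bases % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return cds_bases // 3 - 1  # subtract the stop codon


translated = translated_length(2286)
mature = translated - 1  # assumes only the N-terminal Met is removed

print(translated)  # 761
print(mature)      # 760
```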

Protein sequence:

>761_residues
MALSDQTLQDILRYLTEETGLAPFQVANTVELLREGGTVPFIARYRKERTGELDEVGIRGIEERLAYFTELEERKLTVLK
SIEEQGKLTPELADRIRTSRQKTEVEDLYLPYKPKRRTKATIARERGLEPLADLMAAQELTAGTPEEATLPFVDPAGEVP
DAAAALEGAGHILAERLADDADARAFVRRLTADQGIFCSRVAADRKGAVTKFEMYYDHREPLKSVPSHRMLAMRRGEKEE
VLVLTIEAPTEEILAGLRARIAGRASIFRQLLEGAAEDAYKRLIAPSIEVELRLEAKQRADEAAITVFAQNLRNLLLAPP
AGGKRVLGVDPGLRTGSKLAAVDGTGRFLEHVTIYPHAGGEGKVAAAKREFLRLVECHGIEMVAVGNGTAGREMEQFAKE
TLAEGGKRIPVVMVSEAGASVYSASEIAREEFPDLDLTVRGAISIARRLQDPLAELVKIDPKSIGVGQYQHDVDQRALKK
SLDAVVESCVNYVGVDLNTASWALLSYVSGVGPSLARAIVRHRDEHGPFPARRALMKVPRFGDKAFEQAAGFLRIRGGEH
PLDATAVHPERYPVVEAMATDLGVSLSQLAADPALPARIDLKRYVTDAVGLPTLRDIMEELKKPGRDPREEFRTAAFRDD
VTEIGDLKEGMVLQGVVTNVTAFGAFVDVGVHQDGLVHVSHLSVKFVKDPAEAVRVGQVVQVKVLSVDVQRKRISLSIRE
AAPGGAPRSTEKPARKEEPRKTGGGLDLGGLERAGFRVKKR

Sequences:

>Translated_761_residues
MALSDQTLQDILRYLTEETGLAPFQVANTVELLREGGTVPFIARYRKERTGELDEVGIRGIEERLAYFTELEERKLTVLK
SIEEQGKLTPELADRIRTSRQKTEVEDLYLPYKPKRRTKATIARERGLEPLADLMAAQELTAGTPEEATLPFVDPAGEVP
DAAAALEGAGHILAERLADDADARAFVRRLTADQGIFCSRVAADRKGAVTKFEMYYDHREPLKSVPSHRMLAMRRGEKEE
VLVLTIEAPTEEILAGLRARIAGRASIFRQLLEGAAEDAYKRLIAPSIEVELRLEAKQRADEAAITVFAQNLRNLLLAPP
AGGKRVLGVDPGLRTGSKLAAVDGTGRFLEHVTIYPHAGGEGKVAAAKREFLRLVECHGIEMVAVGNGTAGREMEQFAKE
TLAEGGKRIPVVMVSEAGASVYSASEIAREEFPDLDLTVRGAISIARRLQDPLAELVKIDPKSIGVGQYQHDVDQRALKK
SLDAVVESCVNYVGVDLNTASWALLSYVSGVGPSLARAIVRHRDEHGPFPARRALMKVPRFGDKAFEQAAGFLRIRGGEH
PLDATAVHPERYPVVEAMATDLGVSLSQLAADPALPARIDLKRYVTDAVGLPTLRDIMEELKKPGRDPREEFRTAAFRDD
VTEIGDLKEGMVLQGVVTNVTAFGAFVDVGVHQDGLVHVSHLSVKFVKDPAEAVRVGQVVQVKVLSVDVQRKRISLSIRE
AAPGGAPRSTEKPARKEEPRKTGGGLDLGGLERAGFRVKKR
>Mature_760_residues
ALSDQTLQDILRYLTEETGLAPFQVANTVELLREGGTVPFIARYRKERTGELDEVGIRGIEERLAYFTELEERKLTVLKS
IEEQGKLTPELADRIRTSRQKTEVEDLYLPYKPKRRTKATIARERGLEPLADLMAAQELTAGTPEEATLPFVDPAGEVPD
AAAALEGAGHILAERLADDADARAFVRRLTADQGIFCSRVAADRKGAVTKFEMYYDHREPLKSVPSHRMLAMRRGEKEEV
LVLTIEAPTEEILAGLRARIAGRASIFRQLLEGAAEDAYKRLIAPSIEVELRLEAKQRADEAAITVFAQNLRNLLLAPPA
GGKRVLGVDPGLRTGSKLAAVDGTGRFLEHVTIYPHAGGEGKVAAAKREFLRLVECHGIEMVAVGNGTAGREMEQFAKET
LAEGGKRIPVVMVSEAGASVYSASEIAREEFPDLDLTVRGAISIARRLQDPLAELVKIDPKSIGVGQYQHDVDQRALKKS
LDAVVESCVNYVGVDLNTASWALLSYVSGVGPSLARAIVRHRDEHGPFPARRALMKVPRFGDKAFEQAAGFLRIRGGEHP
LDATAVHPERYPVVEAMATDLGVSLSQLAADPALPARIDLKRYVTDAVGLPTLRDIMEELKKPGRDPREEFRTAAFRDDV
TEIGDLKEGMVLQGVVTNVTAFGAFVDVGVHQDGLVHVSHLSVKFVKDPAEAVRVGQVVQVKVLSVDVQRKRISLSIREA
APGGAPRSTEKPARKEEPRKTGGGLDLGGLERAGFRVKKR

Specific function: Unknown

COG id: COG2183

COG function: function code K; Transcriptional accessory protein

Gene ontology:

Cell location: Cytoplasm [C]

Metabolic importance: Non_Essential [C]

Operon status: Not Known

Operon components: None

Similarity: Contains 1 S1 motif domain [H]

Homologues:

Organism=Homo sapiens, GI221136781, Length=790, Percent_Identity=32.6582278481013, Blast_Score=383, Evalue=1e-106,
Organism=Homo sapiens, GI27597090, Length=518, Percent_Identity=24.9034749034749, Blast_Score=79, Evalue=2e-14,
Organism=Escherichia coli, GI87082262, Length=730, Percent_Identity=51.9178082191781, Blast_Score=676, Evalue=0.0,
Organism=Escherichia coli, GI1787140, Length=90, Percent_Identity=44.4444444444444, Blast_Score=75, Evalue=1e-14,
Organism=Caenorhabditis elegans, GI17511129, Length=713, Percent_Identity=28.8920056100982, Blast_Score=220, Evalue=2e-57,
Organism=Caenorhabditis elegans, GI17552892, Length=575, Percent_Identity=22.7826086956522, Blast_Score=78, Evalue=2e-14,
Organism=Saccharomyces cerevisiae, GI6321552, Length=202, Percent_Identity=27.7227722772277, Blast_Score=73, Evalue=2e-13,
Organism=Drosophila melanogaster, GI62484314, Length=768, Percent_Identity=32.5520833333333, Blast_Score=400, Evalue=1e-111,
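
Each homologue line above is a comma-separated list of `key=value` fields, with a bare GI identifier and a trailing comma. The following is a minimal sketch of how such a line could be parsed for further filtering. The function name and the choice to store the identifier under a `"GI"` key are illustrative assumptions.

```python
def parse_homologue(line: str) -> dict:
    """Parse one 'Organism=..., GI..., Length=..., ...' line into a dict."""
    record = {}
    for field in line.strip().rstrip(",").split(","):
        field = field.strip()
        if not field:
            continue
        key, sep, value = field.partition("=")
        if sep:
            record[key] = value
        else:
            # The GI identifier has no key, e.g. "GI87082262".
            record["GI"] = field[2:] if field.startswith("GI") else field

    # Convert the numeric fields; the others stay as strings.
    casts = (("Length", int), ("Percent_Identity", float),
             ("Blast_Score", int), ("Evalue", float))
    for key, cast in casts:
        if key in record:
            record[key] = cast(record[key])
    return record


line = ("Organism=Escherichia coli, GI87082262, Length=730, "
        "Percent_Identity=51.9178082191781, Blast_Score=676, Evalue=0.0,")
hit = parse_homologue(line)
print(hit["Organism"], hit["Length"], round(hit["Percent_Identity"], 1))
# Escherichia coli 730 51.9
```

Parsed this way, the hits can be sorted by `Blast_Score` or filtered by `Evalue` without string handling.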

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR003583
- InterPro:   IPR012340
- InterPro:   IPR016027
- InterPro:   IPR003029
- InterPro:   IPR005227
- InterPro:   IPR006641
- InterPro:   IPR022967
- InterPro:   IPR018974
- InterPro:   IPR023097 [H]

Pfam domain/function: PF00575 S1; PF09371 Tex_N [H]

EC number: NA

Molecular weight: Translated: 83155; Mature: 83024
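
The translated and mature molecular weights differ by 131, which is consistent with the loss of a single methionine residue (average residue mass about 131.19 Da). The sketch below checks this; it assumes the record reports average masses in daltons, which the record itself does not state.

```python
MET_RESIDUE_MASS = 131.19  # average mass of a methionine residue, in Da

# Values copied from this record (units assumed to be Da).
TRANSLATED_MW = 83155
MATURE_MW = 83024


def matches_met_removal(translated: float, mature: float,
                        tolerance: float = 1.0) -> bool:
    """True if the mass difference is within tolerance of one Met residue."""
    return abs((translated - mature) - MET_RESIDUE_MASS) <= tolerance


print(TRANSLATED_MW - MATURE_MW)                      # 131
print(matches_met_removal(TRANSLATED_MW, MATURE_MW))  # True
```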

Theoretical pI: Translated: 6.90; Mature: 6.90

Prosite motif: PS50126 S1

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

0.4 %Cys     (Translated Protein)
1.6 %Met     (Translated Protein)
2.0 %Cys+Met (Translated Protein)
0.4 %Cys     (Mature Protein)
1.4 %Met     (Mature Protein)
1.8 %Cys+Met (Mature Protein)
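
The percentages above are residue counts divided by sequence length. A minimal sketch of that calculation follows; the function name is illustrative, and the example uses only the first ten residues of the translated protein rather than the full sequence.

```python
def percent_content(sequence: str, residues: str) -> float:
    """Percentage of positions in `sequence` occupied by any of `residues`."""
    sequence = "".join(sequence.split()).upper()
    hits = sum(sequence.count(residue) for residue in residues)
    return 100.0 * hits / len(sequence)


fragment = "MALSDQTLQD"  # first ten residues of the translated protein

print(percent_content(fragment, "M"))   # 10.0
print(percent_content(fragment, "C"))   # 0.0
print(percent_content(fragment, "CM"))  # 10.0
```

Applied to the full translated and mature sequences and rounded to one decimal place, the same function should reproduce the values listed above, assuming the record was computed this way.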

Secondary structure:

>Translated Secondary Structure
MALSDQTLQDILRYLTEETGLAPFQVANTVELLREGGTVPFIARYRKERTGELDEVGIRG
CCCCHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCCCHHHHHHHHHHCCCHHHHHHHH
IEERLAYFTELEERKLTVLKSIEEQGKLTPELADRIRTSRQKTEVEDLYLPYKPKRRTKA
HHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHH
TIARERGLEPLADLMAAQELTAGTPEEATLPFVDPAGEVPDAAAALEGAGHILAERLADD
HHHHHCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCC
ADARAFVRRLTADQGIFCSRVAADRKGAVTKFEMYYDHREPLKSVPSHRMLAMRRGEKEE
HHHHHHHHHHHCCCCCHHHHHHCCCCCCCEEEHHHHCHHHHHHHCCCHHHHHHHCCCCCC
VLVLTIEAPTEEILAGLRARIAGRASIFRQLLEGAAEDAYKRLIAPSIEVELRLEAKQRA
EEEEEEECCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEECHHHCC
DEAAITVFAQNLRNLLLAPPAGGKRVLGVDPGLRTGSKLAAVDGTGRFLEHVTIYPHAGG
HHHHHHHHHHHHHHHEECCCCCCCEEEECCCCCCCCCCEEEECCCCHHHHEEEEEECCCC
EGKVAAAKREFLRLVECHGIEMVAVGNGTAGREMEQFAKETLAEGGKRIPVVMVSEAGAS
CCCCHHHHHHHHHHHHHCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCCCEEEEECCCCC
VYSASEIAREEFPDLDLTVRGAISIARRLQDPLAELVKIDPKSIGVGQYQHDVDQRALKK
HHHHHHHHHHHCCCCCEEHHHHHHHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHH
SLDAVVESCVNYVGVDLNTASWALLSYVSGVGPSLARAIVRHRDEHGPFPARRALMKVPR
HHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCHHHHHHHHHHCCCCCCCHHHHHHHHCCC
FGDKAFEQAAGFLRIRGGEHPLDATAVHPERYPVVEAMATDLGVSLSQLAADPALPARID
CCHHHHHHHCCEEEECCCCCCCCCCCCCCCCCHHHHHHHHHHCCCHHHHHCCCCCCCHHH
LKRYVTDAVGLPTLRDIMEELKKPGRDPREEFRTAAFRDDVTEIGDLKEGMVLQGVVTNV
HHHHHHHHCCCHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHH
TAFGAFVDVGVHQDGLVHVSHLSVKFVKDPAEAVRVGQVVQVKVLSVDVQRKRISLSIRE
HHHHHHHCCCCCCCCCEEEEEEEEEEECCHHHHHHHCCEEEEEEEEEEHHHHHEEEEEEC
AAPGGAPRSTEKPARKEEPRKTGGGLDLGGLERAGFRVKKR
CCCCCCCCCCCCCCHHCCCCCCCCCCCCCCCCCCCCEECCC
>Mature Secondary Structure 
ALSDQTLQDILRYLTEETGLAPFQVANTVELLREGGTVPFIARYRKERTGELDEVGIRG
CCCHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCCCHHHHHHHHHHCCCHHHHHHHH
IEERLAYFTELEERKLTVLKSIEEQGKLTPELADRIRTSRQKTEVEDLYLPYKPKRRTKA
HHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHH
TIARERGLEPLADLMAAQELTAGTPEEATLPFVDPAGEVPDAAAALEGAGHILAERLADD
HHHHHCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCC
ADARAFVRRLTADQGIFCSRVAADRKGAVTKFEMYYDHREPLKSVPSHRMLAMRRGEKEE
HHHHHHHHHHHCCCCCHHHHHHCCCCCCCEEEHHHHCHHHHHHHCCCHHHHHHHCCCCCC
VLVLTIEAPTEEILAGLRARIAGRASIFRQLLEGAAEDAYKRLIAPSIEVELRLEAKQRA
EEEEEEECCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEECHHHCC
DEAAITVFAQNLRNLLLAPPAGGKRVLGVDPGLRTGSKLAAVDGTGRFLEHVTIYPHAGG
HHHHHHHHHHHHHHHEECCCCCCCEEEECCCCCCCCCCEEEECCCCHHHHEEEEEECCCC
EGKVAAAKREFLRLVECHGIEMVAVGNGTAGREMEQFAKETLAEGGKRIPVVMVSEAGAS
CCCCHHHHHHHHHHHHHCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCCCEEEEECCCCC
VYSASEIAREEFPDLDLTVRGAISIARRLQDPLAELVKIDPKSIGVGQYQHDVDQRALKK
HHHHHHHHHHHCCCCCEEHHHHHHHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHH
SLDAVVESCVNYVGVDLNTASWALLSYVSGVGPSLARAIVRHRDEHGPFPARRALMKVPR
HHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCHHHHHHHHHHCCCCCCCHHHHHHHHCCC
FGDKAFEQAAGFLRIRGGEHPLDATAVHPERYPVVEAMATDLGVSLSQLAADPALPARID
CCHHHHHHHCCEEEECCCCCCCCCCCCCCCCCHHHHHHHHHHCCCHHHHHCCCCCCCHHH
LKRYVTDAVGLPTLRDIMEELKKPGRDPREEFRTAAFRDDVTEIGDLKEGMVLQGVVTNV
HHHHHHHHCCCHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHH
TAFGAFVDVGVHQDGLVHVSHLSVKFVKDPAEAVRVGQVVQVKVLSVDVQRKRISLSIRE
HHHHHHHCCCCCCCCCEEEEEEEEEEECCHHHHHHHCCEEEEEEEEEEHHHHHEEEEEEC
AAPGGAPRSTEKPARKEEPRKTGGGLDLGGLERAGFRVKKR
CCCCCCCCCCCCCCHHCCCCCCCCCCCCCCCCCCCCEECCC
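
The prediction above pairs each line of residues with a line of per-residue states. The sketch below summarises such a state string as percentages. It assumes the common three-state convention (H = helix, E = strand, C = coil), which the record does not define, and the function name is illustrative.

```python
from collections import Counter


def structure_fractions(states: str) -> dict:
    """Percentage of H, E and C states in a secondary-structure string."""
    states = "".join(states.split()).upper()
    counts = Counter(states)
    return {state: 100.0 * counts[state] / len(states) for state in "HEC"}


# Toy input; pass the concatenated state lines from above for the real values.
print(structure_fractions("HHEC"))
# {'H': 50.0, 'E': 25.0, 'C': 25.0}
```

To summarise this protein, concatenate only the state lines (every second line of the block), not the residue lines.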

PDB accession: NA

Resolution: NA

Structure class: Unstructured

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 7542800 [H]