| Definition | Geobacter sulfurreducens PCA chromosome, complete genome. |
|---|---|
| Accession | NC_002939 |
| Length | 3,814,139 |
The map label for this gene is yhgF [C]
Identifier: 39995345
GI number: 39995345
Start: 242523
End: 244808
Strand: Reverse
Name: yhgF [C]
Synonym: GSU0235
Alternate gene names: 39995345
Gene position: 244808-242523 (Counterclockwise)
Preceding gene: 39995346
Following gene: 39995342
Centisome position: 6.42
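The coordinate fields above are related by simple arithmetic. The following is a minimal sketch, assuming 1-based inclusive coordinates and assuming that the centisome position is the gene's first base on its own strand (the End coordinate here, since the gene lies on the reverse strand) expressed as a percentage of the chromosome length. The function names are illustrative and not part of the database.

```python
# Values copied from the record; the helper names are hypothetical.
GENOME_LENGTH = 3_814_139          # chromosome length (bp)
START, END = 242_523, 244_808      # gene coordinates (assumed 1-based, inclusive)


def gene_length(start: int, end: int) -> int:
    """Length in bases of a feature with inclusive coordinates."""
    return end - start + 1


def centisome(position: int, genome_length: int) -> float:
    """Position as a percentage of the chromosome length, to 2 decimals."""
    return round(100 * position / genome_length, 2)


length_bp = gene_length(START, END)
# One codon is the stop codon, so it does not contribute a residue.
residues = length_bp // 3 - 1
# Assumption: for a reverse-strand gene the first base is the End coordinate.
position_pct = centisome(END, GENOME_LENGTH)

print(length_bp, residues, position_pct)
```

Under these assumptions the results agree with the 2286-base gene sequence, the 761-residue translated protein, and the centisome value listed in the record.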
GC content: 68.59
Gene sequence:
>2286_bases ATGGCACTGAGCGACCAGACCCTGCAGGATATTCTCCGCTACCTGACCGAAGAAACCGGCCTCGCCCCCTTCCAGGTGGC GAACACGGTGGAACTTCTGCGCGAAGGGGGCACGGTCCCCTTCATCGCCCGCTACCGGAAGGAGCGGACCGGCGAACTGG ACGAGGTCGGCATCCGCGGGATCGAAGAGCGCCTCGCCTATTTCACCGAGTTGGAGGAGCGCAAGCTCACGGTCCTCAAA TCCATCGAGGAACAGGGGAAGCTCACCCCCGAGCTGGCGGACCGCATCCGGACATCGCGCCAGAAGACCGAGGTTGAGGA CCTCTACCTCCCCTACAAGCCCAAGCGGCGCACCAAGGCGACCATCGCCCGGGAGCGGGGGCTGGAGCCCCTGGCCGACC TCATGGCGGCCCAGGAGCTGACCGCGGGCACGCCCGAAGAGGCTACCCTCCCCTTCGTGGACCCCGCCGGCGAGGTGCCC GACGCGGCCGCCGCCCTGGAGGGAGCCGGCCACATCCTGGCCGAGCGGCTGGCGGACGACGCCGATGCCCGGGCCTTTGT CCGCCGCCTCACGGCCGACCAGGGGATCTTCTGTTCACGGGTGGCGGCCGACCGGAAAGGGGCGGTCACCAAGTTCGAGA TGTACTACGACCACCGGGAGCCCCTGAAATCGGTACCCTCCCACCGGATGCTCGCCATGCGCCGGGGCGAGAAGGAAGAG GTGCTCGTCCTGACCATTGAGGCGCCGACGGAGGAGATCCTCGCCGGCCTCCGGGCCCGGATCGCGGGGCGCGCCAGCAT CTTCCGCCAACTTCTGGAAGGGGCGGCAGAGGACGCCTACAAACGCCTCATCGCCCCCTCCATCGAAGTTGAGCTGCGGC TGGAAGCCAAGCAGCGGGCCGACGAGGCAGCCATCACCGTCTTTGCCCAGAACCTCCGCAACCTCCTCCTGGCGCCGCCG GCCGGCGGCAAGCGGGTCCTCGGTGTGGACCCGGGGCTGCGGACCGGCTCCAAGCTGGCTGCGGTAGACGGCACTGGCCG GTTCCTGGAGCATGTGACCATCTACCCCCACGCCGGCGGCGAAGGGAAGGTGGCGGCCGCGAAGCGCGAATTCCTCCGGC TGGTGGAGTGCCACGGCATCGAGATGGTGGCTGTGGGGAACGGCACCGCGGGACGGGAGATGGAGCAGTTCGCCAAGGAA ACCCTGGCCGAAGGAGGCAAGAGGATTCCTGTGGTCATGGTGAGCGAGGCCGGGGCCAGCGTCTACTCCGCCTCGGAGAT CGCCCGGGAGGAGTTCCCGGACCTGGACCTGACCGTGCGGGGGGCCATCTCCATCGCCCGGCGCCTCCAGGATCCCCTGG CCGAACTGGTGAAGATCGACCCCAAGAGCATCGGCGTGGGCCAGTACCAGCACGACGTGGACCAGCGCGCCCTGAAGAAG TCCCTGGACGCGGTGGTGGAGTCGTGCGTCAACTACGTGGGAGTCGACCTGAACACCGCCTCCTGGGCGCTCCTCTCCTA CGTCTCCGGCGTGGGGCCGTCCTTGGCCCGTGCCATTGTCCGCCATCGGGACGAGCACGGTCCCTTCCCCGCCCGCCGGG CACTCATGAAGGTCCCCCGGTTCGGCGACAAGGCCTTCGAGCAGGCAGCCGGCTTTCTCCGCATCAGGGGCGGCGAACAC CCCCTGGACGCCACCGCCGTCCACCCGGAGCGCTATCCGGTGGTGGAGGCCATGGCAACGGATCTGGGCGTGTCCCTCAG CCAGCTGGCCGCCGACCCGGCCCTGCCGGCGCGGATCGACCTGAAGCGCTACGTGACCGACGCGGTGGGGCTTCCCACCC TGCGTGACATCATGGAAGAGTTGAAAAAACCGGGCCGGGACCCGCGGGAGGAATTCCGCACCGCCGCCTTCCGGGACGAC 
GTGACCGAGATCGGCGATCTGAAGGAGGGAATGGTGCTCCAGGGGGTGGTGACCAACGTGACCGCGTTCGGGGCCTTCGT GGACGTGGGGGTCCACCAGGACGGCCTGGTCCATGTGAGCCACCTCTCGGTCAAGTTCGTGAAGGACCCGGCCGAGGCGG TGAGGGTGGGGCAGGTGGTGCAGGTGAAGGTCTTGTCCGTGGACGTGCAGCGCAAGCGCATCTCCCTCTCCATTCGGGAG GCAGCGCCCGGCGGTGCCCCGCGGAGCACGGAAAAGCCGGCGCGAAAGGAAGAGCCGCGGAAAACCGGCGGCGGACTTGA CCTGGGAGGGCTGGAACGGGCCGGATTCCGGGTGAAAAAGCGGTAA
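The GC content field is the percentage of G and C bases in the gene sequence. A minimal sketch of that calculation follows, assuming whitespace in the pasted sequence should be ignored; `gc_percent` is a hypothetical helper, and the short input is a toy example rather than the gene itself.

```python
def gc_percent(dna: str) -> float:
    """Percentage of G/C bases in a DNA string, ignoring whitespace."""
    seq = "".join(dna.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return round(100 * gc / len(seq), 2)


# Toy input: the first 20 bases of the gene sequence listed above.
example = gc_percent("ATGGCACTGA GCGACCAGAC")
print(example)
```

Pasting the full 2286-base sequence into `gc_percent` would be the way to check the reported figure.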
Upstream 100 bases:
>100_bases GCCGCATGGACCGGCACTGAAGCCTTCCCCGCTTTTCAAACGCCGCCCCCTGTGGTAGGATGCCCGCCTTTCCCAACTCC CCCGGAACAGAGGACTTTCC
Downstream 100 bases:
>100_bases CGGTTAGGCCTGTTTCCCCTCTTTTTTCAGGTGGATGAACCAGTAGGCAATAGCCACGAAGATCACCCCGCCGATGACAT TGCCGATGGTGACCGGCACC
Product: S1 RNA-binding protein
Products: NA
Alternate protein names: NA
Number of amino acids: Translated: 761; Mature: 760
Protein sequence:
>761_residues MALSDQTLQDILRYLTEETGLAPFQVANTVELLREGGTVPFIARYRKERTGELDEVGIRGIEERLAYFTELEERKLTVLK SIEEQGKLTPELADRIRTSRQKTEVEDLYLPYKPKRRTKATIARERGLEPLADLMAAQELTAGTPEEATLPFVDPAGEVP DAAAALEGAGHILAERLADDADARAFVRRLTADQGIFCSRVAADRKGAVTKFEMYYDHREPLKSVPSHRMLAMRRGEKEE VLVLTIEAPTEEILAGLRARIAGRASIFRQLLEGAAEDAYKRLIAPSIEVELRLEAKQRADEAAITVFAQNLRNLLLAPP AGGKRVLGVDPGLRTGSKLAAVDGTGRFLEHVTIYPHAGGEGKVAAAKREFLRLVECHGIEMVAVGNGTAGREMEQFAKE TLAEGGKRIPVVMVSEAGASVYSASEIAREEFPDLDLTVRGAISIARRLQDPLAELVKIDPKSIGVGQYQHDVDQRALKK SLDAVVESCVNYVGVDLNTASWALLSYVSGVGPSLARAIVRHRDEHGPFPARRALMKVPRFGDKAFEQAAGFLRIRGGEH PLDATAVHPERYPVVEAMATDLGVSLSQLAADPALPARIDLKRYVTDAVGLPTLRDIMEELKKPGRDPREEFRTAAFRDD VTEIGDLKEGMVLQGVVTNVTAFGAFVDVGVHQDGLVHVSHLSVKFVKDPAEAVRVGQVVQVKVLSVDVQRKRISLSIRE AAPGGAPRSTEKPARKEEPRKTGGGLDLGGLERAGFRVKKR
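The protein sequence is the translation of the gene sequence, which is already listed 5' to 3' on the coding strand (it begins with ATG). The sketch below assumes the standard codon assignments, which the bacterial translation table shares; `translate` is a hypothetical helper, and only the first and last few codons of the gene are used as input.

```python
# Standard codon table built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string up to the first stop codon."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":      # stop codon ends the protein
            break
        protein.append(residue)
    return "".join(protein)


n_terminus = translate("ATGGCACTGAGCGACCAGACCCTGCAGGAT")  # first 10 codons
c_terminus = translate("GTGAAAAAGCGGTAA")                 # last 5 codons
print(n_terminus, c_terminus)
```

The two fragments correspond to the start (`MALSDQTLQD`) and end (`VKKR`) of the protein sequence above.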
Sequences:
>Translated_761_residues MALSDQTLQDILRYLTEETGLAPFQVANTVELLREGGTVPFIARYRKERTGELDEVGIRGIEERLAYFTELEERKLTVLK SIEEQGKLTPELADRIRTSRQKTEVEDLYLPYKPKRRTKATIARERGLEPLADLMAAQELTAGTPEEATLPFVDPAGEVP DAAAALEGAGHILAERLADDADARAFVRRLTADQGIFCSRVAADRKGAVTKFEMYYDHREPLKSVPSHRMLAMRRGEKEE VLVLTIEAPTEEILAGLRARIAGRASIFRQLLEGAAEDAYKRLIAPSIEVELRLEAKQRADEAAITVFAQNLRNLLLAPP AGGKRVLGVDPGLRTGSKLAAVDGTGRFLEHVTIYPHAGGEGKVAAAKREFLRLVECHGIEMVAVGNGTAGREMEQFAKE TLAEGGKRIPVVMVSEAGASVYSASEIAREEFPDLDLTVRGAISIARRLQDPLAELVKIDPKSIGVGQYQHDVDQRALKK SLDAVVESCVNYVGVDLNTASWALLSYVSGVGPSLARAIVRHRDEHGPFPARRALMKVPRFGDKAFEQAAGFLRIRGGEH PLDATAVHPERYPVVEAMATDLGVSLSQLAADPALPARIDLKRYVTDAVGLPTLRDIMEELKKPGRDPREEFRTAAFRDD VTEIGDLKEGMVLQGVVTNVTAFGAFVDVGVHQDGLVHVSHLSVKFVKDPAEAVRVGQVVQVKVLSVDVQRKRISLSIRE AAPGGAPRSTEKPARKEEPRKTGGGLDLGGLERAGFRVKKR >Mature_760_residues ALSDQTLQDILRYLTEETGLAPFQVANTVELLREGGTVPFIARYRKERTGELDEVGIRGIEERLAYFTELEERKLTVLKS IEEQGKLTPELADRIRTSRQKTEVEDLYLPYKPKRRTKATIARERGLEPLADLMAAQELTAGTPEEATLPFVDPAGEVPD AAAALEGAGHILAERLADDADARAFVRRLTADQGIFCSRVAADRKGAVTKFEMYYDHREPLKSVPSHRMLAMRRGEKEEV LVLTIEAPTEEILAGLRARIAGRASIFRQLLEGAAEDAYKRLIAPSIEVELRLEAKQRADEAAITVFAQNLRNLLLAPPA GGKRVLGVDPGLRTGSKLAAVDGTGRFLEHVTIYPHAGGEGKVAAAKREFLRLVECHGIEMVAVGNGTAGREMEQFAKET LAEGGKRIPVVMVSEAGASVYSASEIAREEFPDLDLTVRGAISIARRLQDPLAELVKIDPKSIGVGQYQHDVDQRALKKS LDAVVESCVNYVGVDLNTASWALLSYVSGVGPSLARAIVRHRDEHGPFPARRALMKVPRFGDKAFEQAAGFLRIRGGEHP LDATAVHPERYPVVEAMATDLGVSLSQLAADPALPARIDLKRYVTDAVGLPTLRDIMEELKKPGRDPREEFRTAAFRDDV TEIGDLKEGMVLQGVVTNVTAFGAFVDVGVHQDGLVHVSHLSVKFVKDPAEAVRVGQVVQVKVLSVDVQRKRISLSIREA APGGAPRSTEKPARKEEPRKTGGGLDLGGLERAGFRVKKR
Specific function: Unknown
COG id: COG2183
COG function: function code K; Transcriptional accessory protein
Gene ontology:
Cell location: Cytoplasm [C]
Metabolic importance: Non_Essential [C]
Operon status: Not Known
Operon components: None
Similarity: Contains 1 S1 motif domain [H]
Homologues:
Organism=Homo sapiens, GI221136781, Length=790, Percent_Identity=32.6582278481013, Blast_Score=383, Evalue=1e-106
Organism=Homo sapiens, GI27597090, Length=518, Percent_Identity=24.9034749034749, Blast_Score=79, Evalue=2e-14
Organism=Escherichia coli, GI87082262, Length=730, Percent_Identity=51.9178082191781, Blast_Score=676, Evalue=0.0
Organism=Escherichia coli, GI1787140, Length=90, Percent_Identity=44.4444444444444, Blast_Score=75, Evalue=1e-14
Organism=Caenorhabditis elegans, GI17511129, Length=713, Percent_Identity=28.8920056100982, Blast_Score=220, Evalue=2e-57
Organism=Caenorhabditis elegans, GI17552892, Length=575, Percent_Identity=22.7826086956522, Blast_Score=78, Evalue=2e-14
Organism=Saccharomyces cerevisiae, GI6321552, Length=202, Percent_Identity=27.7227722772277, Blast_Score=73, Evalue=2e-13
Organism=Drosophila melanogaster, GI62484314, Length=768, Percent_Identity=32.5520833333333, Blast_Score=400, Evalue=1e-111
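Each homologue entry is a run of `key=value` fields, so the list can be read into records and ranked. This is a minimal sketch, assuming the field order shown in the entries above; the `Hit` type and `parse_hits` function are hypothetical names, and the sample text contains just two of the entries.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    organism: str
    gi: str
    length: int
    percent_identity: float
    blast_score: int
    evalue: float


# Assumes the fixed field order used in the Homologues entries.
HIT_RE = re.compile(
    r"Organism=(?P<organism>[^,]+),\s*GI(?P<gi>\d+),\s*Length=(?P<length>\d+),"
    r"\s*Percent_Identity=(?P<pid>[\d.]+),\s*Blast_Score=(?P<score>\d+),"
    r"\s*Evalue=(?P<evalue>[^,\s]+)"
)


def parse_hits(text: str) -> List[Hit]:
    """Extract every homologue entry found in the text."""
    return [
        Hit(m["organism"].strip(), m["gi"], int(m["length"]),
            float(m["pid"]), int(m["score"]), float(m["evalue"]))
        for m in HIT_RE.finditer(text)
    ]


SAMPLE = (
    "Organism=Homo sapiens, GI221136781, Length=790, "
    "Percent_Identity=32.6582278481013, Blast_Score=383, Evalue=1e-106, "
    "Organism=Escherichia coli, GI87082262, Length=730, "
    "Percent_Identity=51.9178082191781, Blast_Score=676, Evalue=0.0,"
)

hits = parse_hits(SAMPLE)
best = max(hits, key=lambda h: h.blast_score)
print(best.organism, best.blast_score)
```

Because the pattern is applied with `finditer`, it works whether the entries are separated by commas or by line breaks.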
Paralogues:
None
Copy number: NA
Swissprot (AC and ID): NA
Other databases:
- InterPro: IPR003583
- InterPro: IPR012340
- InterPro: IPR016027
- InterPro: IPR003029
- InterPro: IPR005227
- InterPro: IPR006641
- InterPro: IPR022967
- InterPro: IPR018974
- InterPro: IPR023097 [H]
Pfam domain/function: PF00575 S1; PF09371 Tex_N [H]
EC number: NA
Molecular weight: Translated: 83155; Mature: 83024
Theoretical pI: Translated: 6.90; Mature: 6.90
Prosite motif: PS50126 S1
Important sites: NA
Signals:
None
Transmembrane regions:
None
Cys/Met content:
0.4 %Cys (Translated Protein)
1.6 %Met (Translated Protein)
2.0 %Cys+Met (Translated Protein)
0.4 %Cys (Mature Protein)
1.4 %Met (Mature Protein)
1.8 %Cys+Met (Mature Protein)
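The translated and mature figures differ because the mature sequence in this record is the translated sequence without its initial Met, which lowers the Met count by one. A minimal sketch of the percentage calculation follows, assuming values are rounded to one decimal place; `residue_percent` is a hypothetical helper and the peptide is a toy example, not the protein above.

```python
def residue_percent(protein: str, residues: str) -> float:
    """Percentage of the given one-letter residues in a protein string."""
    seq = "".join(protein.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    count = sum(seq.count(r) for r in residues.upper())
    return round(100 * count / len(seq), 1)


translated = "MACMK"          # toy peptide
mature = translated[1:]       # initiator Met removed, as in the record

met_translated = residue_percent(translated, "M")
met_mature = residue_percent(mature, "M")
cys_met_mature = residue_percent(mature, "CM")
print(met_translated, met_mature, cys_met_mature)
```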
Secondary structure:
>Translated Secondary Structure MALSDQTLQDILRYLTEETGLAPFQVANTVELLREGGTVPFIARYRKERTGELDEVGIRG CCCCHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCCCHHHHHHHHHHCCCHHHHHHHH IEERLAYFTELEERKLTVLKSIEEQGKLTPELADRIRTSRQKTEVEDLYLPYKPKRRTKA HHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHH TIARERGLEPLADLMAAQELTAGTPEEATLPFVDPAGEVPDAAAALEGAGHILAERLADD HHHHHCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCC ADARAFVRRLTADQGIFCSRVAADRKGAVTKFEMYYDHREPLKSVPSHRMLAMRRGEKEE HHHHHHHHHHHCCCCCHHHHHHCCCCCCCEEEHHHHCHHHHHHHCCCHHHHHHHCCCCCC VLVLTIEAPTEEILAGLRARIAGRASIFRQLLEGAAEDAYKRLIAPSIEVELRLEAKQRA EEEEEEECCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEECHHHCC DEAAITVFAQNLRNLLLAPPAGGKRVLGVDPGLRTGSKLAAVDGTGRFLEHVTIYPHAGG HHHHHHHHHHHHHHHEECCCCCCCEEEECCCCCCCCCCEEEECCCCHHHHEEEEEECCCC EGKVAAAKREFLRLVECHGIEMVAVGNGTAGREMEQFAKETLAEGGKRIPVVMVSEAGAS CCCCHHHHHHHHHHHHHCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCCCEEEEECCCCC VYSASEIAREEFPDLDLTVRGAISIARRLQDPLAELVKIDPKSIGVGQYQHDVDQRALKK HHHHHHHHHHHCCCCCEEHHHHHHHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHH SLDAVVESCVNYVGVDLNTASWALLSYVSGVGPSLARAIVRHRDEHGPFPARRALMKVPR HHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCHHHHHHHHHHCCCCCCCHHHHHHHHCCC FGDKAFEQAAGFLRIRGGEHPLDATAVHPERYPVVEAMATDLGVSLSQLAADPALPARID CCHHHHHHHCCEEEECCCCCCCCCCCCCCCCCHHHHHHHHHHCCCHHHHHCCCCCCCHHH LKRYVTDAVGLPTLRDIMEELKKPGRDPREEFRTAAFRDDVTEIGDLKEGMVLQGVVTNV HHHHHHHHCCCHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHH TAFGAFVDVGVHQDGLVHVSHLSVKFVKDPAEAVRVGQVVQVKVLSVDVQRKRISLSIRE HHHHHHHCCCCCCCCCEEEEEEEEEEECCHHHHHHHCCEEEEEEEEEEHHHHHEEEEEEC AAPGGAPRSTEKPARKEEPRKTGGGLDLGGLERAGFRVKKR CCCCCCCCCCCCCCHHCCCCCCCCCCCCCCCCCCCCEECCC >Mature Secondary Structure ALSDQTLQDILRYLTEETGLAPFQVANTVELLREGGTVPFIARYRKERTGELDEVGIRG CCCHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCCCHHHHHHHHHHCCCHHHHHHHH IEERLAYFTELEERKLTVLKSIEEQGKLTPELADRIRTSRQKTEVEDLYLPYKPKRRTKA HHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHH TIARERGLEPLADLMAAQELTAGTPEEATLPFVDPAGEVPDAAAALEGAGHILAERLADD HHHHHCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCC 
ADARAFVRRLTADQGIFCSRVAADRKGAVTKFEMYYDHREPLKSVPSHRMLAMRRGEKEE HHHHHHHHHHHCCCCCHHHHHHCCCCCCCEEEHHHHCHHHHHHHCCCHHHHHHHCCCCCC VLVLTIEAPTEEILAGLRARIAGRASIFRQLLEGAAEDAYKRLIAPSIEVELRLEAKQRA EEEEEEECCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEECHHHCC DEAAITVFAQNLRNLLLAPPAGGKRVLGVDPGLRTGSKLAAVDGTGRFLEHVTIYPHAGG HHHHHHHHHHHHHHHEECCCCCCCEEEECCCCCCCCCCEEEECCCCHHHHEEEEEECCCC EGKVAAAKREFLRLVECHGIEMVAVGNGTAGREMEQFAKETLAEGGKRIPVVMVSEAGAS CCCCHHHHHHHHHHHHHCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCCCEEEEECCCCC VYSASEIAREEFPDLDLTVRGAISIARRLQDPLAELVKIDPKSIGVGQYQHDVDQRALKK HHHHHHHHHHHCCCCCEEHHHHHHHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHH SLDAVVESCVNYVGVDLNTASWALLSYVSGVGPSLARAIVRHRDEHGPFPARRALMKVPR HHHHHHHHHHHHHCCCCCHHHHHHHHHHHCCCHHHHHHHHHHCCCCCCCHHHHHHHHCCC FGDKAFEQAAGFLRIRGGEHPLDATAVHPERYPVVEAMATDLGVSLSQLAADPALPARID CCHHHHHHHCCEEEECCCCCCCCCCCCCCCCCHHHHHHHHHHCCCHHHHHCCCCCCCHHH LKRYVTDAVGLPTLRDIMEELKKPGRDPREEFRTAAFRDDVTEIGDLKEGMVLQGVVTNV HHHHHHHHCCCHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHH TAFGAFVDVGVHQDGLVHVSHLSVKFVKDPAEAVRVGQVVQVKVLSVDVQRKRISLSIRE HHHHHHHCCCCCCCCCEEEEEEEEEEECCHHHHHHHCCEEEEEEEEEEHHHHHEEEEEEC AAPGGAPRSTEKPARKEEPRKTGGGLDLGGLERAGFRVKKR CCCCCCCCCCCCCCHHCCCCCCCCCCCCCCCCCCCCEECCC
PDB accession: NA
Resolution: NA
Structure class: Unstructured
Cofactors: NA
Metal ions: NA
Kcat value (1/min): NA
Specific activity: NA
Km value (mM): NA
Substrates: NA
Specific reaction: NA
General reaction: NA
Inhibitor: NA
Structure determination priority: 9.0
TargetDB status: NA
Availability: NA
References: 7542800 [H]