Definition: Nitrosospira multiformis ATCC 25196 chromosome, complete genome.
Accession: NC_007614
Length: 3,184,243 bp

The map label for this gene is yhgF [H]

Identifier: 82702712

GI number: 82702712

Start: 1817278

End: 1819650

Strand: Reverse

Name: yhgF [H]

Synonym: Nmul_A1585

Alternate gene names: 82702712

Gene position: 1819650-1817278 (Counterclockwise)

Preceding gene: 82702713

Following gene: 82702708

Centisome position: 57.15
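
The coordinate fields above are related by simple arithmetic. The following is a minimal sketch, assuming 1-based inclusive coordinates and assuming that the centisome position is the gene's 5' coordinate (the End value for a reverse-strand gene) expressed as a percentage of the genome length. The function names are illustrative and not part of the record.

```python
GENOME_LENGTH = 3_184_243          # Length field of NC_007614
START, END = 1_817_278, 1_819_650  # Start and End fields of this gene


def gene_length(start: int, end: int) -> int:
    """Length in bases, assuming 1-based inclusive coordinates."""
    return end - start + 1


def centisome(position: int, genome_length: int) -> float:
    """Position as a percentage of the genome length (assumed definition)."""
    return 100.0 * position / genome_length


length = gene_length(START, END)
# Reverse-strand gene: the 5' end of the gene is the End coordinate.
centi = centisome(END, GENOME_LENGTH)
print(length, round(centi, 2))
```

Under these assumptions the result agrees with the 2373-base gene sequence and the listed centisome position of 57.15.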

GC content: 59.25

Gene sequence:

>2373_bases
ATGCTGCCATCCATTGAACAACGTCTTTCCCTTGAACTCGGCGCGAAGCCTGCACAGGTTAACGCGGCCATTGCCTTGCT
CGATGAAGGTGCCACCGTGCCCTTTATTGCACGTTACCGCAAGGAGGTGACTGGCGGGCTGGACGATGCGCAGTTGCGCC
TGCTGGAAGAACGGTTGCGTTATCTGCGTGAACTGGAAGACAGGCGCGCCGCGATTATCGCCTCGATAGAAGAGCAGGGG
AAAATGACGCCCGCGCTGCTTGCTTCCATCCTGCAGGCCGAGGATAAGACACGGCTGGAAGATCTGTATCTCCCCTTCAA
GAAAAAGCGGCGCACCAAGGCGCAGATCGCGCTCGAAGCGGGGCTGGAGCCACTGGCAGATGCGCTGCTTGCCGATCCGA
CGCTACAACCCGAGGAAGAGGCCATCAAGTACTTGAAGCCGCCCTTCGCTACCGAGCAGGGGGATAATCCCGGGGTACCG
GATGTGAAAGCTGCGCTCGAGGGAGCACGCCAGATACTGATGGAGCGTTTCGCGGAGGATGCCGAGTTGCTTCAGTGGCT
GCGCGAGTACCTGCTGGACCATGGGGTGGTGGAGTCGAAAGTCGCGAGCGACAAGAATGGCGGTAAGGGTAAGGAAGAGG
AGGGCGCCAAATATTCCGATTATTTCGATTACTCCGAACCGCTCAGCGCTATTCCTTCGCACCGGGCGCTGGCGCTTTTT
CGGGGGCGACGTGAAGAAATTTTACGCGTTGCCCTGCGTCTGGATTCGGAGGCGGAGAAACCGAAGTGGGATGCACCGCA
TAACCCGTGCGAGGCGCGCATTGCTGTCCGGTTCGGTATTGCGGACAAAGGGCGGCCTGCCGATGCGTGGCTGATGGACA
CGGTGCGCTGGACCTGGCGGGTGAAGAGTTTTCCGCATCTGGAACTCGATCTTATGGGTTCGTTGCGCACACGCGCCGAG
ACCGAAGCGATCCAGGTCTTTGCGCGCAACCTGAAAGCCCTGCTCATGGCCGCTCCCGCCGGGCCTCGCGTGACAATAGG
TCTAGACCCCGGCTTGCGCACAGGGGTGAAAGTCGCAGTAGTGGATGCGACAGGGCGGGTCATGGAAACGGCCACCATTT
ATCCACATCAACCAAGGAATGATTGGGAGGGGTCCCTTCATGTTCTCGGCACGCTTGCGGAAAAATTTCGGGTATCGCTG
ATAGCCATAGGCAATGGCACCGCTTCGCGCGAGACCGACAAGCTGGCAAAAGACCTGATCAAGCGCCGGCCCGATCTCAA
GCTCACCTCTATCGTGGTTTCGGAAGCGGGGGCTTCGGTTTATTCCGCCTCCGATCTGGCCTCCAGAGAGTTCCCCGATA
TGGATGTGTCGCTGAGAGGAGCGGTTTCCATTGCGCGGCGCCTGCAGGACCCTCTGGCGGAGCTGGTCAAAGTCGATCCG
AAATCGATTGGCGTAGGCCAATACCAGCATGATGTCGGGCAAACCCAGCTCGCGCGCTCGCTCGATGCCGTGGTCGAAGA
CTGCGTCAATGCGGTAGGCGTGGACGTCAATACGGCCTCCGCGCCGCTGCTCGAACGCGTTTCGGGGCTTAACCCGGCTG
TCGCACAAAGCATCGTTGTCTATCGCGAGGAAAACGGGATGTTTGCCTCGCGTGAAGCTCTGCACCAGGTGCCGCGCCTG
GGTGAGAAAACCTTCGAGCAGGCGGCAGGCTTCTTGCGGGTGATGCATGGCGAGAACCCGCTTGATGCGTCGGCAGTGCA
TCCCGAGTCGTATCCCGTCGTGCAAAGAATCCTTTCCGACTTGAAGCAGGAAATCAGGTCGATCATCGGCAATAACAAAT
TATTGAAGTCGCTCAATCCGGCGAGGTATGCGGATGATCGATTCGGCGTGCCGACCGTCACCGACATCGTGAAGGAGCTG
GAAAAGCCGGGCCGTGATCCCCGGCCGGAATTCATCACTGCCGCATTCAAGGAAGGCGTGAACGAGATTTCCGATCTGCA
GCCGGGTATGCTGCTGGAAGGCGTGGTAACCAACGTGGCTGCCTTCGGCGCGTTCGTCGATATCGGGGTGCATCAGGATG
GGCTGGTGCACATCTCCGCGCTCGCCGACAAATTCGTCAAAGACCCGCACACGGTCGTCAAGGTGGGGCAGGTGGTGAAG
GTCAAAGTGCTGGAAGTCGATGAAAAGCGTAAGCGCATTGCCCTTACGATGAGGTTGGCAGATGCGCCAGCACCACAGAC
ACAGGAGGCGCGAGGGGCTGGCAAGCGTGAGCAGCCAAGGAATAGGAAGGACCGCTCAGCCAAACCCCAGCAGAAACAGG
ATTCCAGGGCTGATACGGCGATGGCAGCAGCGTTTGCGAGATTGAAGGGTTGA
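
A GC content value such as the one listed above can be recomputed from the gene sequence. This is a minimal sketch, assuming GC content means the percentage of G and C bases over the full sequence; `gc_content` is an illustrative helper, and whether the record used exactly this definition is an assumption.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    # Drop line breaks and other whitespace from FASTA-style text.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# First four codons of the gene sequence, used only as a small sample.
sample = "ATGCTGCCATCC"
print(round(gc_content(sample), 2))
```

Passing the full 2373-base sequence (with or without line breaks) gives the gene-level value to compare against the GC content field.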

Upstream 100 bases:

>100_bases
CCAGCCAGAATACGAACTGAACCTGATCCTTTCCCCCTTCCGTCCATCTGTTTTACCACTTCCCCTTTTTCCACTGCAAC
TGAATAGGTGTCCGTCAATC

Downstream 100 bases:

>100_bases
TTGGAATTTAGCTTTACGATCTAGCTGTATCAATAGGGCAGACTAAGTCTGACCCTGTTGATTTCATTACCCCTTCCTTG
CGGCTATTTCGGTTCGGATA

Product: RNA binding S1

Products: NA

Alternate protein names: NA

Number of amino acids: Translated: 790; Mature: 790
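
The residue count follows from the gene length. This is a minimal sketch, assuming the standard stop codons and that the final codon of the coding sequence is a stop that is not translated; `expected_residues` is an illustrative name.

```python
GENE_LENGTH_BASES = 2373              # from the ">2373_bases" header
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard stop codons (assumed code)


def expected_residues(n_bases: int) -> int:
    """Number of amino acids encoded by an ORF that ends in one stop codon."""
    if n_bases % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return n_bases // 3 - 1  # subtract the terminal stop codon


last_codon = "TGA"  # final three bases of the gene sequence in this record
print(expected_residues(GENE_LENGTH_BASES), last_codon in STOP_CODONS)
```

2373 bases are 791 codons, which leaves 790 residues once the terminal TGA is excluded.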

Protein sequence:

>790_residues
MLPSIEQRLSLELGAKPAQVNAAIALLDEGATVPFIARYRKEVTGGLDDAQLRLLEERLRYLRELEDRRAAIIASIEEQG
KMTPALLASILQAEDKTRLEDLYLPFKKKRRTKAQIALEAGLEPLADALLADPTLQPEEEAIKYLKPPFATEQGDNPGVP
DVKAALEGARQILMERFAEDAELLQWLREYLLDHGVVESKVASDKNGGKGKEEEGAKYSDYFDYSEPLSAIPSHRALALF
RGRREEILRVALRLDSEAEKPKWDAPHNPCEARIAVRFGIADKGRPADAWLMDTVRWTWRVKSFPHLELDLMGSLRTRAE
TEAIQVFARNLKALLMAAPAGPRVTIGLDPGLRTGVKVAVVDATGRVMETATIYPHQPRNDWEGSLHVLGTLAEKFRVSL
IAIGNGTASRETDKLAKDLIKRRPDLKLTSIVVSEAGASVYSASDLASREFPDMDVSLRGAVSIARRLQDPLAELVKVDP
KSIGVGQYQHDVGQTQLARSLDAVVEDCVNAVGVDVNTASAPLLERVSGLNPAVAQSIVVYREENGMFASREALHQVPRL
GEKTFEQAAGFLRVMHGENPLDASAVHPESYPVVQRILSDLKQEIRSIIGNNKLLKSLNPARYADDRFGVPTVTDIVKEL
EKPGRDPRPEFITAAFKEGVNEISDLQPGMLLEGVVTNVAAFGAFVDIGVHQDGLVHISALADKFVKDPHTVVKVGQVVK
VKVLEVDEKRKRIALTMRLADAPAPQTQEARGAGKREQPRNRKDRSAKPQQKQDSRADTAMAAAFARLKG

Specific function: Unknown

COG id: COG2183

COG function: function code K; Transcriptional accessory protein

Gene ontology:

Cell location: Cytoplasm [C]

Metabolic importance: Non_Essential [C]

Operon status: Not Known

Operon components: None

Similarity: Contains 1 S1 motif domain [H]

Homologues:

Organism=Homo sapiens, GI221136781, Length=810, Percent_Identity=33.21, Blast_Score=385, Evalue=1e-106
Organism=Homo sapiens, GI27597090, Length=410, Percent_Identity=25.85, Blast_Score=92, Evalue=3e-18
Organism=Escherichia coli, GI87082262, Length=764, Percent_Identity=62.04, Blast_Score=921, Evalue=0.0
Organism=Escherichia coli, GI1787140, Length=76, Percent_Identity=44.74, Blast_Score=70, Evalue=7e-13
Organism=Caenorhabditis elegans, GI17511129, Length=727, Percent_Identity=29.44, Blast_Score=240, Evalue=2e-63
Organism=Caenorhabditis elegans, GI17552892, Length=293, Percent_Identity=26.96, Blast_Score=74, Evalue=4e-13
Organism=Drosophila melanogaster, GI62484314, Length=794, Percent_Identity=29.47, Blast_Score=333, Evalue=4e-91
Organism=Drosophila melanogaster, GI24640080, Length=737, Percent_Identity=19.67, Blast_Score=86, Evalue=1e-16
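
Each homologue entry is a comma-separated list of `key=value` fields plus a bare GI identifier, so it can be turned into a typed record. This is a minimal sketch; `parse_homologue` and the `GI` key are illustrative names, not part of any published parser for this record format.

```python
def parse_homologue(line: str) -> dict:
    """Parse one homologue entry into a dict with typed numeric fields."""
    record = {}
    # Tolerate an optional trailing comma at the end of the entry.
    for field in line.strip().rstrip(",").split(","):
        field = field.strip()
        if not field:
            continue
        if "=" in field:
            key, value = field.split("=", 1)
            record[key.strip()] = value.strip()
        elif field.startswith("GI"):
            record["GI"] = field[2:]  # bare identifier such as GI87082262
    casts = {"Length": int, "Percent_Identity": float,
             "Blast_Score": int, "Evalue": float}
    for key, cast in casts.items():
        if key in record:
            record[key] = cast(record[key])
    return record


# Escherichia coli entry, with percent identity shown to two decimals.
entry = ("Organism=Escherichia coli, GI87082262, Length=764, "
         "Percent_Identity=62.04, Blast_Score=921, Evalue=0.0")
hit = parse_homologue(entry)
print(hit["Organism"], hit["GI"], hit["Blast_Score"])
```

With records in this form, the entries can be sorted by `Blast_Score` or filtered by `Evalue` without string handling.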

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR012340
- InterPro:   IPR016027
- InterPro:   IPR003029
- InterPro:   IPR005227
- InterPro:   IPR006641
- InterPro:   IPR022967
- InterPro:   IPR018974
- InterPro:   IPR023097 [H]

Pfam domain/function: PF00575 S1; PF09371 Tex_N [H]

EC number: NA

Molecular weight: Translated: 86900; Mature: 86900
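
The listed molecular weight equals 790 residues multiplied by 110 Da, a common rule-of-thumb average residue mass. The sketch below shows that estimate; that the record was actually computed this way is an assumption, and a composition-based calculation would give a somewhat different value.

```python
AVERAGE_RESIDUE_MASS_DA = 110  # rule-of-thumb average mass per residue


def approx_molecular_weight(n_residues: int,
                            per_residue: int = AVERAGE_RESIDUE_MASS_DA) -> int:
    """Rough protein mass in daltons from the residue count alone."""
    return n_residues * per_residue


print(approx_molecular_weight(790))
```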

Theoretical pI: Translated: 6.87; Mature: 6.87

Prosite motif: PS50126 S1

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

0.3 %Cys     (Translated Protein)
1.6 %Met     (Translated Protein)
1.9 %Cys+Met (Translated Protein)
0.3 %Cys     (Mature Protein)
1.6 %Met     (Mature Protein)
1.9 %Cys+Met (Mature Protein)

Secondary structure:

>Translated Secondary Structure
MLPSIEQRLSLELGAKPAQVNAAIALLDEGATVPFIARYRKEVTGGLDDAQLRLLEERLR
CCCCHHHHHHHHCCCCCCHHCEEEEEECCCCCCHHHHHHHHHHHCCCCHHHHHHHHHHHH
YLRELEDRRAAIIASIEEQGKMTPALLASILQAEDKTRLEDLYLPFKKKRRTKAQIALEA
HHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHH
GLEPLADALLADPTLQPEEEAIKYLKPPFATEQGDNPGVPDVKAALEGARQILMERFAED
CCHHHHHHHHCCCCCCCHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCH
AELLQWLREYLLDHGVVESKVASDKNGGKGKEEEGAKYSDYFDYSEPLSAIPSHRALALF
HHHHHHHHHHHHHCCCHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCHHHHCCCHHHHHHH
RGRREEILRVALRLDSEAEKPKWDAPHNPCEARIAVRFGIADKGRPADAWLMDTVRWTWR
HCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHEEE
VKSFPHLELDLMGSLRTRAETEAIQVFARNLKALLMAAPAGPRVTIGLDPGLRTGVKVAV
ECCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCEEEEECCCCCCCCCEEEE
VDATGRVMETATIYPHQPRNDWEGSLHVLGTLAEKFRVSLIAIGNGTASRETDKLAKDLI
EECCCCEEEEEEECCCCCCCCCCCHHHHHHHHHHHHEEEEEEECCCCCCHHHHHHHHHHH
KRRPDLKLTSIVVSEAGASVYSASDLASREFPDMDVSLRGAVSIARRLQDPLAELVKVDP
HCCCCCCHHHHHHHHCCCCHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCC
KSIGVGQYQHDVGQTQLARSLDAVVEDCVNAVGVDVNTASAPLLERVSGLNPAVAQSIVV
CCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHCCCCHHHHHHEEE
YREENGMFASREALHQVPRLGEKTFEQAAGFLRVMHGENPLDASAVHPESYPVVQRILSD
EECCCCCCHHHHHHHHHHCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCHHHHHHHHH
LKQEIRSIIGNNKLLKSLNPARYADDRFGVPTVTDIVKELEKPGRDPRPEFITAAFKEGV
HHHHHHHHHCCCHHHHHCCCCCCCCCCCCCCHHHHHHHHHHCCCCCCCHHHHHHHHHHHH
NEISDLQPGMLLEGVVTNVAAFGAFVDIGVHQDGLVHISALADKFVKDPHTVVKVGQVVK
HHHHHCCCCHHHHHHHHHHHHHHHHHEECCCCCCCCHHHHHHHHHHCCCHHHHHHCCEEE
VKVLEVDEKRKRIALTMRLADAPAPQTQEARGAGKREQPRNRKDRSAKPQQKQDSRADTA
EEEEECCHHHHHEEEEEEECCCCCCCCHHHHCCCCCCCCCCCCCCCCCCCHHHHHHHHHH
MAAAFARLKG
HHHHHHHHCC
>Mature Secondary Structure
MLPSIEQRLSLELGAKPAQVNAAIALLDEGATVPFIARYRKEVTGGLDDAQLRLLEERLR
CCCCHHHHHHHHCCCCCCHHCEEEEEECCCCCCHHHHHHHHHHHCCCCHHHHHHHHHHHH
YLRELEDRRAAIIASIEEQGKMTPALLASILQAEDKTRLEDLYLPFKKKRRTKAQIALEA
HHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHH
GLEPLADALLADPTLQPEEEAIKYLKPPFATEQGDNPGVPDVKAALEGARQILMERFAED
CCHHHHHHHHCCCCCCCHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHCH
AELLQWLREYLLDHGVVESKVASDKNGGKGKEEEGAKYSDYFDYSEPLSAIPSHRALALF
HHHHHHHHHHHHHCCCHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCHHHHCCCHHHHHHH
RGRREEILRVALRLDSEAEKPKWDAPHNPCEARIAVRFGIADKGRPADAWLMDTVRWTWR
HCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCEEEEEEECCCCCCCCCHHHHHHHHHHEEE
VKSFPHLELDLMGSLRTRAETEAIQVFARNLKALLMAAPAGPRVTIGLDPGLRTGVKVAV
ECCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCEEEEECCCCCCCCCEEEE
VDATGRVMETATIYPHQPRNDWEGSLHVLGTLAEKFRVSLIAIGNGTASRETDKLAKDLI
EECCCCEEEEEEECCCCCCCCCCCHHHHHHHHHHHHEEEEEEECCCCCCHHHHHHHHHHH
KRRPDLKLTSIVVSEAGASVYSASDLASREFPDMDVSLRGAVSIARRLQDPLAELVKVDP
HCCCCCCHHHHHHHHCCCCHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCC
KSIGVGQYQHDVGQTQLARSLDAVVEDCVNAVGVDVNTASAPLLERVSGLNPAVAQSIVV
CCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHCCCCHHHHHHEEE
YREENGMFASREALHQVPRLGEKTFEQAAGFLRVMHGENPLDASAVHPESYPVVQRILSD
EECCCCCCHHHHHHHHHHCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCHHHHHHHHH
LKQEIRSIIGNNKLLKSLNPARYADDRFGVPTVTDIVKELEKPGRDPRPEFITAAFKEGV
HHHHHHHHHCCCHHHHHCCCCCCCCCCCCCCHHHHHHHHHHCCCCCCCHHHHHHHHHHHH
NEISDLQPGMLLEGVVTNVAAFGAFVDIGVHQDGLVHISALADKFVKDPHTVVKVGQVVK
HHHHHCCCCHHHHHHHHHHHHHHHHHEECCCCCCCCHHHHHHHHHHCCCHHHHHHCCEEE
VKVLEVDEKRKRIALTMRLADAPAPQTQEARGAGKREQPRNRKDRSAKPQQKQDSRADTA
EEEEECCHHHHHEEEEEEECCCCCCCCHHHHCCCCCCCCCCCCCCCCCCCHHHHHHHHHH
MAAAFARLKG
HHHHHHHHCC
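
The prediction strings above can be summarised as per-state percentages. This is a minimal sketch, assuming the conventional three-state alphabet (H for helix, E for strand, C for coil), which the record itself does not define; `structure_composition` is an illustrative helper.

```python
from collections import Counter


def structure_composition(states: str) -> dict:
    """Percentage of each state (H, E, C) in a prediction string."""
    counts = Counter(states)
    total = sum(counts[s] for s in "HEC")  # ignore any other characters
    if total == 0:
        raise ValueError("no H/E/C states found")
    return {s: 100.0 * counts[s] / total for s in "HEC"}


# Final prediction line of the record (residues 781-790).
last_line = "HHHHHHHHCC"
print(structure_composition(last_line))
```

Concatenating all prediction lines before calling the function gives the composition for the whole protein.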

PDB accession: NA

Resolution: NA

Structure class: Unstructured

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 9278503; 10493123 [H]