Definition Hyperthermus butylicus DSM 5456 chromosome, complete genome.
Accession NC_008818
Length 1,667,163

The map label for this gene is topA [H]

Identifier: 124027789

GI number: 124027789

Start: 921260

End: 923308

Strand: Reverse

Name: topA [H]

Synonym: Hbut_0916

Alternate gene names: 124027789

Gene position: 923308-921260 (Counterclockwise)

Preceding gene: 124027790

Following gene: 124027784

Centisome position: 55.38
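
The coordinate fields above are related by simple arithmetic. The sketch below assumes 1-based inclusive coordinates. It also assumes the centisome position is the first base of the gene (the End coordinate for a reverse-strand gene) expressed as a percentage of the chromosome length. That convention is inferred from the numbers in this record, not stated by it, and the function names are hypothetical.

```python
GENOME_LENGTH = 1_667_163        # "Length" field of NC_008818
START, END = 921_260, 923_308    # "Start" and "End" fields
STRAND = "Reverse"               # "Strand" field


def gene_length(start: int, end: int) -> int:
    """Length in bases, assuming 1-based inclusive coordinates."""
    return end - start + 1


def centisome(start: int, end: int, strand: str, genome_length: int) -> float:
    """Position of the gene's first base as a percentage of the chromosome.

    Assumption: for a reverse-strand gene the first base is the End coordinate.
    """
    first_base = end if strand == "Reverse" else start
    return round(100.0 * first_base / genome_length, 2)


print(gene_length(START, END))                          # 2049
print(centisome(START, END, STRAND, GENOME_LENGTH))     # 55.38
```

Both results agree with the ">2049_bases" header and the "Centisome position" field of this record.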

GC content: 49.63
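
GC content is presumably the percentage of G and C bases in the gene sequence listed below; the record does not say which sequence it was computed from. A minimal sketch of that calculation, with a hypothetical function name:

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence, to 2 decimals."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return round(100.0 * gc / len(seq), 2)


# Small illustrative input, not taken from this record.
print(gc_content("ATGC"))  # 50.0
```

To apply it to this gene, join the 80-column lines of the ">2049_bases" entry into one string before calling the function.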

Gene sequence:

>2049_bases
ATGGTAAGAGCTGGGTATTGTAGTGCCGGGTTTGGCTATACACTTGTGATTGCTGAGAAGCCCAAAGCTGCTAGAAAGAT
TGCTGAAGCCCTCTCCGATAAACCCATAGCTTGTAAGCTTGGGGGCATACCATACTGGATTGTTACATGGATGGGGACAC
GTTACGTTATAGTGCCTGCTGCTGGACACCTTTTTGGGCTAACTACGGATAAGCATGGATTCCCCGTGTTTGAGTATTAT
TGGGCACCCCTATGGGCAGTTGATAGTTCATCGGCACATACTCGCAGGTTTCTCGAGGCTATTAAGAAGCTTGCCAGGCA
TGCTGTTAGATTTGTAAATGCATGCGACTATGATATTGAGGGTAGTGTAATAGGCTACAATATTCTCCGTGCACTAGGTG
TTGAGAAGAGAGCGCTTAGAGCTAAATTTAGCGCTTTGACTAGGCAGGATGTTCGTAGGGCATTTTCCAGGCTTGAGCGG
CTTGACTGGGACATGATTAATGCCGGGCTTGCAAGACATGAGCTGGATTGGTTGTGGGGCATCAATATTAGCCGTGCACT
CATGGAGTCGTTGCGGAGCGTAACTGGACGCTCGAAGGTTTTGAGTGCTGGAAGGGTTCAGTCGCCAACGCTTGTTGAGG
CTGTTTCCAGGACTATTAAGCGGAACCTGTTTGTTCCACTGCCATACTTTACTGTTACTGCTACAGTAGAGCTTGGAGGT
AAAGCTAGAAGACTACAAGTTGCGAGTTTTGAGCTACGAAGTGAGGCTCAACGCGTGGCGCGTCAGCTAAGGGCTACTGG
CTACCTTGTTGTGAGGGAGTATAGCGAGTATGTTGAGCGCATACCTCCACCATATCCCTTCAACCTTGGTGATTTGCAGG
TAGAGGCTTCGAGAATCCTAGGGTTTAGCCCCTACTATACTCAGAAGCTTGCTGAGGAACTGTACCTGGAGGGGCTTATC
AGCTATCCGCGAACCAATAGCCAGAAGATACCACCGACAATAGATATCAGTGCTATTGTACAAGCACTTGTTCGACAGAC
ACGCTATCGGGAGCTAGTTGAATACCTGCTAAGAGCGACGAGAGGTGTTCTACGTGTCAATAACGGGCCTAAGGAGGATC
CTGCGCATCCAGCTATTCACCCTACTGGCGAGCCGCCACATTCTGGTCTGAGTAAGAATCACCTGCGTCTCTACGATCTT
ATCGTGAGGAGGTTTCTGGCCAGCATGGCTCCACCCGCCGTCCTTGTTAGGGCTCGCCTCATACTTTCTGCCGAGGGGTT
GGGTTCTGCCTCAATTACTGGTGTGAGGATTGTTTCGCCTGGCTGGATGCAGATATATTACTGGGCTAAGCCCTCGGAGG
AGTATATACCGCGGCTATATCGTGGACAACGCATACAAGTGAAGAGTGTATCCGTACAGACATCATATACACAGCCTCCA
GAACTGCATACTAAGACGAGTCTTGTAAAGTGGATGGAGTCTAAGGGCATAGGCACCGAGGCTACCAGGGCACGGATTGT
CGAACTGCTGTTCGAGAGAGGCTACTTACGCTCTGAGGGAGGCAGAGTTCAAGCTACTGAGCTAGGCTTAACTGTTGCCA
ATATACTGCAAACCTACTTCCCCGATATAACTAGTGTTGAGCTGACAAGGAGGTTCGAAGAGCTACTCGAAGCTATACGC
ATGGGAAAGGTCTCGAAGGATGCTGTTATTAGTGAGGCTAGACGGTTTCTCGCTGGCATACTCTCAGAGTTCAAGGTCAA
ATCGATGCACGCTGTTGGTATGGAGCTAGCATACTCGCTTGGGCTTCTAAAGCCTCCTAGACAATGCCCTATATGTGGTA
GAAGGGCCGAGGGAGAATACTGTAGCTATCACGAGGCAGCTATAAGGAAGATTGTAGAGGCTTACCACGAGTGGCGGCGG
AGAACTGGCATCACATGCAGAGAGTATCTTGAGAAGCTGGCAGCAATGCGTTCTACCGGTAAGTGGGTAAGAGAGGCTGC
AAAGTACCTACTGGCTAGGAATGTTTGCCCCAAAGGTCTTACAGCTTAG

Upstream 100 bases:

>100_bases
TAGCTAACCTTAAGCTATATGCCTTCCGTTAGCTCTCCTAGCTGTTTTAAAGGAACAACATATTCTATTGCTTAATCCTA
GCTAGCTTAGTGGGGTATTG

Downstream 100 bases:

>100_bases
CTGCTAGCTTCTCAAGTAAAAGCTTAGAAGCCCCATGTACGGTATCCTTACTGGTACTTTGGCCTTGCTCTCGCCTAGGC
CTACGTGCACTGTTATAACT

Product: DNA topoisomerase I

Products: NA

Alternate protein names: DNA topoisomerase I; Omega-protein; Relaxing enzyme; Swivelase; Untwisting enzyme [H]

Number of amino acids: Translated: 682; Mature: 682

Protein sequence:

>682_residues
MVRAGYCSAGFGYTLVIAEKPKAARKIAEALSDKPIACKLGGIPYWIVTWMGTRYVIVPAAGHLFGLTTDKHGFPVFEYY
WAPLWAVDSSSAHTRRFLEAIKKLARHAVRFVNACDYDIEGSVIGYNILRALGVEKRALRAKFSALTRQDVRRAFSRLER
LDWDMINAGLARHELDWLWGINISRALMESLRSVTGRSKVLSAGRVQSPTLVEAVSRTIKRNLFVPLPYFTVTATVELGG
KARRLQVASFELRSEAQRVARQLRATGYLVVREYSEYVERIPPPYPFNLGDLQVEASRILGFSPYYTQKLAEELYLEGLI
SYPRTNSQKIPPTIDISAIVQALVRQTRYRELVEYLLRATRGVLRVNNGPKEDPAHPAIHPTGEPPHSGLSKNHLRLYDL
IVRRFLASMAPPAVLVRARLILSAEGLGSASITGVRIVSPGWMQIYYWAKPSEEYIPRLYRGQRIQVKSVSVQTSYTQPP
ELHTKTSLVKWMESKGIGTEATRARIVELLFERGYLRSEGGRVQATELGLTVANILQTYFPDITSVELTRRFEELLEAIR
MGKVSKDAVISEARRFLAGILSEFKVKSMHAVGMELAYSLGLLKPPRQCPICGRRAEGEYCSYHEAAIRKIVEAYHEWRR
RTGITCREYLEKLAAMRSTGKWVREAAKYLLARNVCPKGLTA
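
The 2049-base gene sequence corresponds to 683 codons: 682 residues plus one stop codon. The sketch below shows how the protein sequence can be derived from the gene sequence, assuming the standard codon assignments apply to this organism. The names `CODON_TABLE` and `translate` are hypothetical and not part of this record.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence 5'->3', stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)


# First five codons and last three codons of the gene sequence above.
print(translate("ATGGTAAGAGCTGGG"))  # MVRAG
print(translate("ACAGCTTAG"))        # TA
```

The two fragments reproduce the first five residues (MVRAG) and the last two residues (TA) of the protein sequence.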

Sequences:

>Translated_682_residues
MVRAGYCSAGFGYTLVIAEKPKAARKIAEALSDKPIACKLGGIPYWIVTWMGTRYVIVPAAGHLFGLTTDKHGFPVFEYY
WAPLWAVDSSSAHTRRFLEAIKKLARHAVRFVNACDYDIEGSVIGYNILRALGVEKRALRAKFSALTRQDVRRAFSRLER
LDWDMINAGLARHELDWLWGINISRALMESLRSVTGRSKVLSAGRVQSPTLVEAVSRTIKRNLFVPLPYFTVTATVELGG
KARRLQVASFELRSEAQRVARQLRATGYLVVREYSEYVERIPPPYPFNLGDLQVEASRILGFSPYYTQKLAEELYLEGLI
SYPRTNSQKIPPTIDISAIVQALVRQTRYRELVEYLLRATRGVLRVNNGPKEDPAHPAIHPTGEPPHSGLSKNHLRLYDL
IVRRFLASMAPPAVLVRARLILSAEGLGSASITGVRIVSPGWMQIYYWAKPSEEYIPRLYRGQRIQVKSVSVQTSYTQPP
ELHTKTSLVKWMESKGIGTEATRARIVELLFERGYLRSEGGRVQATELGLTVANILQTYFPDITSVELTRRFEELLEAIR
MGKVSKDAVISEARRFLAGILSEFKVKSMHAVGMELAYSLGLLKPPRQCPICGRRAEGEYCSYHEAAIRKIVEAYHEWRR
RTGITCREYLEKLAAMRSTGKWVREAAKYLLARNVCPKGLTA
>Mature_682_residues
MVRAGYCSAGFGYTLVIAEKPKAARKIAEALSDKPIACKLGGIPYWIVTWMGTRYVIVPAAGHLFGLTTDKHGFPVFEYY
WAPLWAVDSSSAHTRRFLEAIKKLARHAVRFVNACDYDIEGSVIGYNILRALGVEKRALRAKFSALTRQDVRRAFSRLER
LDWDMINAGLARHELDWLWGINISRALMESLRSVTGRSKVLSAGRVQSPTLVEAVSRTIKRNLFVPLPYFTVTATVELGG
KARRLQVASFELRSEAQRVARQLRATGYLVVREYSEYVERIPPPYPFNLGDLQVEASRILGFSPYYTQKLAEELYLEGLI
SYPRTNSQKIPPTIDISAIVQALVRQTRYRELVEYLLRATRGVLRVNNGPKEDPAHPAIHPTGEPPHSGLSKNHLRLYDL
IVRRFLASMAPPAVLVRARLILSAEGLGSASITGVRIVSPGWMQIYYWAKPSEEYIPRLYRGQRIQVKSVSVQTSYTQPP
ELHTKTSLVKWMESKGIGTEATRARIVELLFERGYLRSEGGRVQATELGLTVANILQTYFPDITSVELTRRFEELLEAIR
MGKVSKDAVISEARRFLAGILSEFKVKSMHAVGMELAYSLGLLKPPRQCPICGRRAEGEYCSYHEAAIRKIVEAYHEWRR
RTGITCREYLEKLAAMRSTGKWVREAAKYLLARNVCPKGLTA

Specific function: The reaction catalyzed by topoisomerases leads to the conversion of one topological isomer of DNA to another [H]

COG id: COG0550

COG function: function code L; Topoisomerase IA

Gene ontology:

Cell location: Cytoplasm [C]

Metabolic importance: Essential [C]

Operon status: Not Known

Operon components: None

Similarity: Belongs to the prokaryotic type I/III topoisomerase family [H]

Homologues:

Organism=Homo sapiens, GI4507635, Length=645, Percent_Identity=26.6666666666667, Blast_Score=160, Evalue=3e-39,
Organism=Homo sapiens, GI10835218, Length=653, Percent_Identity=26.3399693721286, Blast_Score=150, Evalue=5e-36,
Organism=Escherichia coli, GI1787529, Length=653, Percent_Identity=25.2679938744257, Blast_Score=145, Evalue=6e-36,
Organism=Escherichia coli, GI1788061, Length=603, Percent_Identity=27.6948590381426, Blast_Score=145, Evalue=9e-36,
Organism=Caenorhabditis elegans, GI32563869, Length=590, Percent_Identity=26.9491525423729, Blast_Score=159, Evalue=5e-39,
Organism=Caenorhabditis elegans, GI17555378, Length=585, Percent_Identity=25.2991452991453, Blast_Score=146, Evalue=4e-35,
Organism=Saccharomyces cerevisiae, GI6323263, Length=636, Percent_Identity=26.2578616352201, Blast_Score=150, Evalue=6e-37,
Organism=Drosophila melanogaster, GI24640096, Length=612, Percent_Identity=24.8366013071895, Blast_Score=150, Evalue=4e-36,
Organism=Drosophila melanogaster, GI24585251, Length=534, Percent_Identity=28.2771535580524, Blast_Score=147, Evalue=2e-35,
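
Each homologue line is a comma-separated list of key=value pairs plus a bare GI token. A minimal parsing sketch follows, assuming organism names contain no commas (true for the lines above); the function name and the choice of numeric types are assumptions.

```python
def parse_homologue(line: str) -> dict:
    """Parse one 'Organism=..., GI..., Length=..., ...' line into a dict."""
    record = {}
    for token in line.strip().rstrip(",").split(","):
        token = token.strip()
        if "=" in token:
            key, value = token.split("=", 1)
            record[key] = value
        elif token.startswith("GI"):
            record["GI"] = token[2:]  # bare GI number without the prefix
    # Convert numeric fields to numbers for sorting and filtering.
    record["Length"] = int(record["Length"])
    record["Blast_Score"] = int(record["Blast_Score"])
    record["Percent_Identity"] = float(record["Percent_Identity"])
    record["Evalue"] = float(record["Evalue"])
    return record


line = ("Organism=Homo sapiens, GI4507635, Length=645, "
        "Percent_Identity=26.6666666666667, Blast_Score=160, Evalue=3e-39,")
hit = parse_homologue(line)
print(hit["Organism"], hit["GI"], hit["Blast_Score"])  # Homo sapiens 4507635 160
```

Parsed records can then be sorted by `Evalue` or `Blast_Score` to rank the hits.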

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR003601
- InterPro:   IPR013497
- InterPro:   IPR013824
- InterPro:   IPR013826
- InterPro:   IPR000380
- InterPro:   IPR003602
- InterPro:   IPR005739
- InterPro:   IPR006171 [H]

Pfam domain/function: PF01131 Topoisom_bac; PF01751 Toprim [H]

EC number: 5.99.1.2 [H]

Molecular weight: Translated: 77010; Mature: 77010

Theoretical pI: Translated: 10.26; Mature: 10.26

Prosite motif: PS00396 TOPOISOMERASE_I_PROK

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

1.2 %Cys     (Translated Protein)
1.6 %Met     (Translated Protein)
2.8 %Cys+Met (Translated Protein)
1.2 %Cys     (Mature Protein)
1.6 %Met     (Mature Protein)
2.8 %Cys+Met (Mature Protein)
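
The Cys/Met figures are residue counts expressed as a percentage of protein length. A minimal sketch of that calculation, assuming one-letter amino acid codes and rounding to one decimal as in the table above; the function name is hypothetical.

```python
def cys_met_percent(protein: str) -> dict:
    """Percentage of Cys (C) and Met (M) residues in a protein sequence."""
    protein = protein.upper()
    n = len(protein)
    if n == 0:
        raise ValueError("empty sequence")
    n_cys = protein.count("C")
    n_met = protein.count("M")
    return {
        "Cys": round(100.0 * n_cys / n, 1),
        "Met": round(100.0 * n_met / n, 1),
        "Cys+Met": round(100.0 * (n_cys + n_met) / n, 1),
    }


# Small illustrative input, not taken from this record.
print(cys_met_percent("CMAAAAAAAA"))  # {'Cys': 10.0, 'Met': 10.0, 'Cys+Met': 20.0}
```

The translated and mature proteins are identical in this record, so both give the same percentages.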

Secondary structure:

>Translated Secondary Structure
MVRAGYCSAGFGYTLVIAEKPKAARKIAEALSDKPIACKLGGIPYWIVTWMGTRYVIVPA
CCCCCCCCCCCCEEEEEECCCHHHHHHHHHHCCCCCEEEECCCCEEEEEECCCEEEEEEC
AGHLFGLTTDKHGFPVFEYYWAPLWAVDSSSAHTRRFLEAIKKLARHAVRFVNACDYDIE
CCCEEEECCCCCCCCHHHHHHCEEEEECCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCC
GSVIGYNILRALGVEKRALRAKFSALTRQDVRRAFSRLERLDWDMINAGLARHELDWLWG
CCCHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHHCC
INISRALMESLRSVTGRSKVLSAGRVQSPTLVEAVSRTIKRNLFVPLPYFTVTATVELGG
CCHHHHHHHHHHHHHCHHHHHHCCCCCCCHHHHHHHHHHHHCCEECCCCEEEEEEEEECC
KARRLQVASFELRSEAQRVARQLRATGYLVVREYSEYVERIPPPYPFNLGDLQVEASRIL
CCCEEEHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHCCCCCCCCCCCCEEEHHHHC
GFSPYYTQKLAEELYLEGLISYPRTNSQKIPPTIDISAIVQALVRQTRYRELVEYLLRAT
CCCHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHC
RGVLRVNNGPKEDPAHPAIHPTGEPPHSGLSKNHLRLYDLIVRRFLASMAPPAVLVRARL
CCEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCHHHHHHHHH
ILSAEGLGSASITGVRIVSPGWMQIYYWAKPSEEYIPRLYRGQRIQVKSVSVQTSYTQPP
HHCCCCCCCCCCCEEEEECCCCEEEEEEECCCHHHHHHHHCCCEEEEEEEEEEECCCCCC
ELHTKTSLVKWMESKGIGTEATRARIVELLFERGYLRSEGGRVQATELGLTVANILQTYF
CCHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHCCCCCCCCCEEEHHHHHHHHHHHHHHHC
PDITSVELTRRFEELLEAIRMGKVSKDAVISEARRFLAGILSEFKVKSMHAVGMELAYSL
CCCCHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
GLLKPPRQCPICGRRAEGEYCSYHEAAIRKIVEAYHEWRRRTGITCREYLEKLAAMRSTG
CCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHH
KWVREAAKYLLARNVCPKGLTA
HHHHHHHHHHHHHCCCCCCCCC
>Mature Secondary Structure
MVRAGYCSAGFGYTLVIAEKPKAARKIAEALSDKPIACKLGGIPYWIVTWMGTRYVIVPA
CCCCCCCCCCCCEEEEEECCCHHHHHHHHHHCCCCCEEEECCCCEEEEEECCCEEEEEEC
AGHLFGLTTDKHGFPVFEYYWAPLWAVDSSSAHTRRFLEAIKKLARHAVRFVNACDYDIE
CCCEEEECCCCCCCCHHHHHHCEEEEECCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCC
GSVIGYNILRALGVEKRALRAKFSALTRQDVRRAFSRLERLDWDMINAGLARHELDWLWG
CCCHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHHCC
INISRALMESLRSVTGRSKVLSAGRVQSPTLVEAVSRTIKRNLFVPLPYFTVTATVELGG
CCHHHHHHHHHHHHHCHHHHHHCCCCCCCHHHHHHHHHHHHCCEECCCCEEEEEEEEECC
KARRLQVASFELRSEAQRVARQLRATGYLVVREYSEYVERIPPPYPFNLGDLQVEASRIL
CCCEEEHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHCCCCCCCCCCCCEEEHHHHC
GFSPYYTQKLAEELYLEGLISYPRTNSQKIPPTIDISAIVQALVRQTRYRELVEYLLRAT
CCCHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHC
RGVLRVNNGPKEDPAHPAIHPTGEPPHSGLSKNHLRLYDLIVRRFLASMAPPAVLVRARL
CCEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHCCHHHHHHHHH
ILSAEGLGSASITGVRIVSPGWMQIYYWAKPSEEYIPRLYRGQRIQVKSVSVQTSYTQPP
HHCCCCCCCCCCCEEEEECCCCEEEEEEECCCHHHHHHHHCCCEEEEEEEEEEECCCCCC
ELHTKTSLVKWMESKGIGTEATRARIVELLFERGYLRSEGGRVQATELGLTVANILQTYF
CCHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHCCCCCCCCCEEEHHHHHHHHHHHHHHHC
PDITSVELTRRFEELLEAIRMGKVSKDAVISEARRFLAGILSEFKVKSMHAVGMELAYSL
CCCCHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
GLLKPPRQCPICGRRAEGEYCSYHEAAIRKIVEAYHEWRRRTGITCREYLEKLAAMRSTG
CCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHH
KWVREAAKYLLARNVCPKGLTA
HHHHHHHHHHHHHCCCCCCCCC
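
In the secondary structure listing, each sequence line is followed by a line of per-residue state letters. The sketch below summarizes such a state string, assuming the common convention H = helix, E = strand, C = coil; the record itself does not define the letters, and the function name is hypothetical.

```python
from collections import Counter


def ss_composition(states: str) -> dict:
    """Percentage of residues in each predicted state (H, E, C)."""
    if not states:
        raise ValueError("empty state string")
    counts = Counter(states.upper())
    total = len(states)
    return {s: round(100.0 * counts.get(s, 0) / total, 1) for s in "HEC"}


# Small illustrative input, not taken from this record.
print(ss_composition("HHEC"))  # {'H': 50.0, 'E': 25.0, 'C': 25.0}
```

To summarize the full prediction, concatenate every second line of the listing (the state lines) into one string first.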

PDB accession: NA

Resolution: NA

Structure class: Unstructured

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 10382966 [H]