Definition Mycobacterium bovis BCG str. Pasteur 1173P2, complete genome.
Accession NC_008769
Length 4,374,522

The map label for this gene is sufB [C]

Identifier: 121637391

GI number: 121637391

Start: 1672061

End: 1674601

Strand: Direct

Name: sufB [C]

Synonym: BCG_1522

Alternate gene names: 121637391

Gene position: 1672061-1674601 (Clockwise)

Preceding gene: 121637390

Following gene: 121637392

Centisome position: 38.22
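
The centisome value can be checked from the coordinates in this record. The sketch below assumes the centisome is the gene start expressed as a percentage of the genome length; that assumption reproduces the listed 38.22, and the same coordinates give the 2541-base gene length. The function names are illustrative only.

```python
def centisome(start: int, genome_length: int) -> float:
    """Gene start as a percentage of the total genome length."""
    return 100.0 * start / genome_length


def gene_length(start: int, end: int) -> int:
    """Length of a gene given 1-based, inclusive coordinates."""
    return end - start + 1


# Values copied from this record.
GENOME_LENGTH = 4_374_522
START, END = 1_672_061, 1_674_601

print(round(centisome(START, GENOME_LENGTH), 2))  # 38.22
print(gene_length(START, END))                    # 2541
```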

GC content: 61.71

Gene sequence:

>2541_bases
ATGACACTCACCCCAGAGGCCAGCAAGAGCGTTGCCCAGCCCCCGACCCAGGCTCCCCTGACCCAGGAAGAGGCGATCGC
GTCGCTGGGCCGGTACGGCTACGGCTGGGCGGACTCCGACGTCGCGGGTGCCAACGCGCAGCGCGGGCTTTCCGAGGCGG
TGGTCCGCGACATCTCCGCGAAGAAGAACGAGCCCGATTGGATGCTGCAGTCGCGGCTGAAGGCGCTGCGCATTTTCGAC
CGCAAGCCCATTCCGAAGTGGGGCTCCAACCTCGATGGCATCGATTTCGACAACATCAAGTACTTCGTGCGCTCCACCGA
GAAGCAGGCCGCGAGCTGGGATGATTTGCCAGAGGACATCCGCAACACCTACGACCGGTTGGGAATCCCGGAGGCCGAGA
AGCAGAGATTAGTAGCTGGAGTAGCCGCACAATACGAAAGTGAAGTTGTATATCACCAGATCAGAGAGGATCTGGAGGCT
CAAGGAGTCATATTTTTAGACACTGATACTGGTTTGCGAGAACACCCGGATATTTTCAAGGAATATTTCGGTACAGTAAT
CCCTGCCGGCGATAATAAGTTTTCTGCATTGAATACTGCAGTTTGGAGTGGTGGGTCCTTTATTTACGTCCCGCCCGGTG
TTCACGTCGACATTCCGCTGCAGGCCTACTTCCGAATCAACACCGAGAACATGGGCCAGTTCGAGCGGACGCTGATCATC
GCCGATGAGGGCTCTTACGTGCACTACGTAGAGGGCTGCCTGCCCGCCGGCGAGCTCATCACGACCGCCGACGGCGATTT
GCGGCCCATCGAGTCGATTCGCGTCGGTGACTTCGTCACCGGCCACGACGGGCGGCCACACCGCGTCACCGCTGTACAGG
TGCGTGACCTCGATGGCGAGCTGTTCACCTTCACACCGATGTCGCCTGCCAACGCATTCTCTGTCACCGCCGAGCACCCC
CTTCTCGCTATTCCCCGCGACGAGGTGCGTGTTATGCGGAAGGAACGCAATGGGTGGAAGGCTGAAGTCAACAGCACCAA
GCTGCGTAGCGCCGAGCCGCGATGGATCGCGGCGAAGGATGTGGCCGAGGGTGACTTCCTGATCTACCCCAAGCCGAAGC
CGATCCCCCACAGGACGGTTTTGCCGCTCGAGTTTGCGCGCCTGGCGGGCTACTACCTGGCGGAGGGTCACGCGTGTCTC
ACCAATGGCTGTGAGTCGCTGATCTTCTCGTTCCACAGCGATGAGTTCGAGTACGTCGAGGATGTGCGCCAAGCGTGCAA
GTCGCTGTACGAGAAGTCGGGATCGGTATTGATCGAGGAGCACAAGCATTCGGCGCGCGTCACCGTGTACACGAAGGCGG
GCTATGCGGCGATGCGCGACAACGTCGGCATTGGATCGTCGAATAAGAAGCTGTCGGATCTGTTGATGCGTCAAGACGAG
ACGTTCTTGCGTGAGCTGGTCGACGCCTATGTGAATGGAGACGGCAACGTCACGCGCCGTAACGGGGCGGTGTGGAAGCG
GGTACATACGACATCGCGCCTCTGGGCGTTCCAGTTGCAGTCCATCCTGGCGCGTCTGGGTCACTACGCCACTGTTGAAC
TGCGCCGACCGGGCGGCCCTGGTGTGATCATGGGCCGCAACGTCGTTCGCAAGGACATCTACCAGGTGCAGTGGACCGAG
GGCGGCCGCGGACCGAAGCAGGCCCGCGACTGCGGCGACTACTTTGCGGTGCCAATCAAGAAGCGAGCGGTCCGCGAAGC
ACATGAGCCCGTCTACAACCTCGATGTCGAGAATCCGGACAGCTACCTCGCCTACGGGTTCGCCGTGCACAACTGCACCG
CACCGATCTACAAATCGGATTCATTGCACTCAGCGGTGGTCGAGATCATCGTGAAACCCCATGCGCGCGTGCGTTACACC
ACCATCCAGAACTGGTCGAACAACGTCTACAACCTGGTCACCAAGCGGGCCCGCGCCGAAGCCGGGGCCACCATGGAGTG
GATCGACGGCAACATCGGGTCCAAGGTGACCATGAAGTACCCGGCGGTCTGGATGACCGGCGAGCACGCCAAGGGCGAAG
TGCTCTCGGTGGCGTTCGCCGGCGAAGACCAGCACCAGGACACCGGCGCCAAGATGCTGCACCTGGCGCCGAACACGTCG
AGCAACATCGTGTCCAAGTCGGTGGCCCGCGGCGGCGGCCGCACCTCCTACCGTGGCCTGGTGCAGGTCAACAAGGGGGC
GCATGGGTCGCGGTCCAGCGTGAAATGCGATGCGCTGCTGGTGGATACGGTCAGCCGCAGCGACACCTACCCCTACGTCG
ACATCCGCGAGGACGACGTCACCATGGGCCACGAGGCCACCGTGTCCAAGGTCAGCGAGAACCAGCTGTTCTACCTGATG
AGCCGCGGGCTGACCGAGGACGAGGCGATGGCGATGGTGGTGCGCGGCTTCGTCGAGCCGATCGCCAAGGAGCTGCCGAT
GGAGTACGCGCTGGAGCTCAACCGGCTGATCGAGCTGCAGATGGAGGGCGCGGTCGGATGA
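
A GC content figure such as the 61.71 listed above is normally the percentage of G and C bases in the sequence. The following is a minimal sketch of that calculation for FASTA-style text like the block above; the function name is hypothetical, and the exact counting rules used by the database (for example, handling of ambiguous bases) are not stated in this record.

```python
def gc_percent(fasta_text: str) -> float:
    """Percentage of G and C bases in FASTA-style text.

    Header lines (starting with '>') and whitespace are ignored.
    """
    seq = "".join(
        line.strip()
        for line in fasta_text.splitlines()
        if not line.startswith(">")
    ).upper()
    if not seq:
        raise ValueError("no sequence data found")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# Toy input, not the gene above: 6 of 8 bases are G or C.
print(gc_percent(">toy\nATGC\nGGCC\n"))  # 75.0
```

Pasting the full 2541-base block into `gc_percent` should give a value comparable to the GC content field.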

Upstream 100 bases:

>100_bases
CGACCATCGTCAACGGAGACTGCGCCTGCACCACCCACGTACCCCTGTCGCCGGCGCCCAGCCCGCGCCCACCCGCCACC
AGCACCGAAGGAGTGTCCCG

Downstream 100 bases:

>100_bases
CGGCTCCGGGACTGACAGCAGCCGTCGAGGGGATCGCACACAACAAGGGCGAGCTGTTCGCCTCCTTTGACGTGGACGCG
TTCGAGGTTCCGCACGGCCG

Product: hypothetical protein

Products: NA

Alternate protein names: Mtu pps1 intein

Number of amino acids: Translated: 846; Mature: 845
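
The two residue counts follow from the gene length. The sketch below assumes the 2541-base sequence is one open reading frame ending in a stop codon (the sequence above ends in TGA) and that the mature form differs only by removal of the initiator methionine. The function name is illustrative.

```python
def residue_counts(n_bases: int) -> tuple[int, int]:
    """Return (translated, mature) residue counts for a coding sequence.

    Assumes the sequence includes the stop codon and that the mature
    protein lacks only the N-terminal methionine.
    """
    if n_bases % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = n_bases // 3
    translated = codons - 1  # the stop codon encodes no residue
    mature = translated - 1  # initiator methionine removed
    return translated, mature


print(residue_counts(2541))  # (846, 845)
```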

Protein sequence:

>846_residues
MTLTPEASKSVAQPPTQAPLTQEEAIASLGRYGYGWADSDVAGANAQRGLSEAVVRDISAKKNEPDWMLQSRLKALRIFD
RKPIPKWGSNLDGIDFDNIKYFVRSTEKQAASWDDLPEDIRNTYDRLGIPEAEKQRLVAGVAAQYESEVVYHQIREDLEA
QGVIFLDTDTGLREHPDIFKEYFGTVIPAGDNKFSALNTAVWSGGSFIYVPPGVHVDIPLQAYFRINTENMGQFERTLII
ADEGSYVHYVEGCLPAGELITTADGDLRPIESIRVGDFVTGHDGRPHRVTAVQVRDLDGELFTFTPMSPANAFSVTAEHP
LLAIPRDEVRVMRKERNGWKAEVNSTKLRSAEPRWIAAKDVAEGDFLIYPKPKPIPHRTVLPLEFARLAGYYLAEGHACL
TNGCESLIFSFHSDEFEYVEDVRQACKSLYEKSGSVLIEEHKHSARVTVYTKAGYAAMRDNVGIGSSNKKLSDLLMRQDE
TFLRELVDAYVNGDGNVTRRNGAVWKRVHTTSRLWAFQLQSILARLGHYATVELRRPGGPGVIMGRNVVRKDIYQVQWTE
GGRGPKQARDCGDYFAVPIKKRAVREAHEPVYNLDVENPDSYLAYGFAVHNCTAPIYKSDSLHSAVVEIIVKPHARVRYT
TIQNWSNNVYNLVTKRARAEAGATMEWIDGNIGSKVTMKYPAVWMTGEHAKGEVLSVAFAGEDQHQDTGAKMLHLAPNTS
SNIVSKSVARGGGRTSYRGLVQVNKGAHGSRSSVKCDALLVDTVSRSDTYPYVDIREDDVTMGHEATVSKVSENQLFYLM
SRGLTEDEAMAMVVRGFVEPIAKELPMEYALELNRLIELQMEGAVG

Sequences:

>Mature_845_residues
TLTPEASKSVAQPPTQAPLTQEEAIASLGRYGYGWADSDVAGANAQRGLSEAVVRDISAKKNEPDWMLQSRLKALRIFDR
KPIPKWGSNLDGIDFDNIKYFVRSTEKQAASWDDLPEDIRNTYDRLGIPEAEKQRLVAGVAAQYESEVVYHQIREDLEAQ
GVIFLDTDTGLREHPDIFKEYFGTVIPAGDNKFSALNTAVWSGGSFIYVPPGVHVDIPLQAYFRINTENMGQFERTLIIA
DEGSYVHYVEGCLPAGELITTADGDLRPIESIRVGDFVTGHDGRPHRVTAVQVRDLDGELFTFTPMSPANAFSVTAEHPL
LAIPRDEVRVMRKERNGWKAEVNSTKLRSAEPRWIAAKDVAEGDFLIYPKPKPIPHRTVLPLEFARLAGYYLAEGHACLT
NGCESLIFSFHSDEFEYVEDVRQACKSLYEKSGSVLIEEHKHSARVTVYTKAGYAAMRDNVGIGSSNKKLSDLLMRQDET
FLRELVDAYVNGDGNVTRRNGAVWKRVHTTSRLWAFQLQSILARLGHYATVELRRPGGPGVIMGRNVVRKDIYQVQWTEG
GRGPKQARDCGDYFAVPIKKRAVREAHEPVYNLDVENPDSYLAYGFAVHNCTAPIYKSDSLHSAVVEIIVKPHARVRYTT
IQNWSNNVYNLVTKRARAEAGATMEWIDGNIGSKVTMKYPAVWMTGEHAKGEVLSVAFAGEDQHQDTGAKMLHLAPNTSS
NIVSKSVARGGGRTSYRGLVQVNKGAHGSRSSVKCDALLVDTVSRSDTYPYVDIREDDVTMGHEATVSKVSENQLFYLMS
RGLTEDEAMAMVVRGFVEPIAKELPMEYALELNRLIELQMEGAVG

Specific function: Unknown

COG id: COG1372

COG function: function code L; Intein/homing endonuclease

Gene ontology:

Cell location: Cytoplasm [C]

Metabolic importance: Unknown [C]

Operon status: Not Known

Operon components: None

Similarity: Contains 1 DOD-type homing endonuclease domain

Homologues:

Organism=Escherichia coli, GI87081955, Length=258, Percent_Identity=40.3100775193798, Blast_Score=221, Evalue=2e-58
Organism=Escherichia coli, GI1787971, Length=154, Percent_Identity=26.6233766233766, Blast_Score=74, Evalue=4e-14
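
The homologue lines are comma-separated fields, most of them in key=value form, with the GI number given as a bare GI-prefixed token. A minimal parsing sketch follows, assuming that layout holds for every line; the function name and the choice of output types are illustrative.

```python
def parse_homologue(line: str) -> dict:
    """Parse one homologue line into a dictionary of typed fields."""
    record: dict = {}
    for field in line.split(","):
        field = field.strip()
        if not field:  # tolerate a trailing comma
            continue
        if "=" in field:
            key, value = field.split("=", 1)
            record[key] = value
        elif field.startswith("GI"):
            record["GI"] = field[2:]
    # Convert numeric fields when present.
    for key in ("Length", "Blast_Score"):
        if key in record:
            record[key] = int(record[key])
    for key in ("Percent_Identity", "Evalue"):
        if key in record:
            record[key] = float(record[key])
    return record


SAMPLE = ("Organism=Escherichia coli, GI87081955, Length=258, "
          "Percent_Identity=40.3100775193798, Blast_Score=221, Evalue=2e-58")
hit = parse_homologue(SAMPLE)
print(hit["Organism"], hit["GI"], hit["Length"], hit["Evalue"])
```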

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): Y1461_MYCTU (P67125)

Other databases:

- EMBL:   BX842576
- EMBL:   AE000516
- PIR:   H70871
- RefSeq:   NP_215977.1
- RefSeq:   NP_335958.1
- ProteinModelPortal:   P67125
- SMR:   P67125
- REBASE:   4231
- EnsemblBacteria:   EBMYCT00000000183
- EnsemblBacteria:   EBMYCT00000072074
- GeneID:   886609
- GeneID:   924461
- GenomeReviews:   AE000516_GR
- GenomeReviews:   AL123456_GR
- KEGG:   mtc:MT1508
- KEGG:   mtu:Rv1461
- TIGR:   MT1508
- TubercuList:   Rv1461
- GeneTree:   EBGT00050000017319
- HOGENOM:   HBG130984
- ProtClustDB:   CLSK791176
- GO:   GO:0005829
- GO:   GO:0040007
- InterPro:   IPR003586
- InterPro:   IPR003587
- InterPro:   IPR007868
- InterPro:   IPR006142
- InterPro:   IPR004042
- InterPro:   IPR006141
- InterPro:   IPR010231
- InterPro:   IPR000825
- PRINTS:   PR00379
- SMART:   SM00305
- SMART:   SM00306
- TIGRFAMs:   TIGR01443
- TIGRFAMs:   TIGR01980

Pfam domain/function: PF05203 Hom_end_hint; PF01458 UPF0051

EC number: NA

Molecular weight: Translated: 94172; Mature: 94041
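
The gap between the two molecular weights (94172 - 94041 = 131) matches the average mass of a single methionine residue, which is consistent with the mature form lacking only the N-terminal methionine. The sketch below shows that arithmetic; the residue mass is a standard rounded value, and the function name is illustrative.

```python
# Average mass of one methionine residue within a peptide chain, in daltons
# (free methionine, about 149.21 Da, minus one water, about 18.02 Da).
MET_RESIDUE_MASS = 131.19


def mature_weight(translated_weight: float) -> float:
    """Estimate the mass after removing the N-terminal methionine."""
    return translated_weight - MET_RESIDUE_MASS


print(round(mature_weight(94172)))  # 94041
```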

Theoretical pI: Translated: 6.35; Mature: 6.35

Prosite motif: PS50818 INTEIN_C_TER; PS50819 INTEIN_ENDONUCLEASE; PS50817 INTEIN_N_TER

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

0.8 %Cys     (Translated Protein)
2.1 %Met     (Translated Protein)
3.0 %Cys+Met (Translated Protein)
0.8 %Cys     (Mature Protein)
2.0 %Met     (Mature Protein)
2.8 %Cys+Met (Mature Protein)
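
These percentages are residue counts divided by sequence length. The listed values appear consistent with 7 Cys and 18 Met in the 846-residue translated protein, and 7 Cys and 17 Met in the 845-residue mature protein. A minimal sketch of the calculation, with an illustrative function name and a toy sequence rather than the protein above:

```python
def residue_percent(sequence: str, residues: str) -> float:
    """Percentage of the sequence made up of the given one-letter residues,
    rounded to one decimal place."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    count = sum(seq.count(r) for r in residues.upper())
    return round(100.0 * count / len(seq), 1)


toy = "MCAAAAAAAA"  # toy sequence: 1 Met, 1 Cys, 8 Ala
print(residue_percent(toy, "C"))   # 10.0
print(residue_percent(toy, "M"))   # 10.0
print(residue_percent(toy, "CM"))  # 20.0
```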

Secondary structure:

>Translated Secondary Structure
MTLTPEASKSVAQPPTQAPLTQEEAIASLGRYGYGWADSDVAGANAQRGLSEAVVRDISA
CCCCCCCCHHHCCCCCCCCCCHHHHHHHHHCCCCCCCCCCCCCCCHHCCHHHHHHHHHHC
KKNEPDWMLQSRLKALRIFDRKPIPKWGSNLDGIDFDNIKYFVRSTEKQAASWDDLPEDI
CCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCHHHHHHHHHCCHHHHCCCCCCHHHH
RNTYDRLGIPEAEKQRLVAGVAAQYESEVVYHQIREDLEAQGVIFLDTDTGLREHPDIFK
HHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEECCCCCCCCHHHHH
EYFGTVIPAGDNKFSALNTAVWSGGSFIYVPPGVHVDIPLQAYFRINTENMGQFERTLII
HHHCCEECCCCCCCHHHHHHEECCCCEEEECCCCEEEECEEEEEEECCCCCCCEEEEEEE
ADEGSYVHYVEGCLPAGELITTADGDLRPIESIRVGDFVTGHDGRPHRVTAVQVRDLDGE
ECCCCEEEEECCCCCCCCEEEECCCCCCCCCCEEECCEEECCCCCCCEEEEEEEEECCCC
LFTFTPMSPANAFSVTAEHPLLAIPRDEVRVMRKERNGWKAEVNSTKLRSAEPRWIAAKD
EEEECCCCCCCCEEEECCCCEEECCHHHHHHHHHHCCCCEEECCCCCCCCCCCCEEEECC
VAEGDFLIYPKPKPIPHRTVLPLEFARLAGYYLAEGHACLTNGCESLIFSFHSDEFEYVE
CCCCCEEEECCCCCCCCCEECHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHCCCHHHHHH
DVRQACKSLYEKSGSVLIEEHKHSARVTVYTKAGYAAMRDNVGIGSSNKKLSDLLMRQDE
HHHHHHHHHHHCCCCEEEEECCCCCEEEEEEECCCHHHHCCCCCCCCCHHHHHHHHHCCH
TFLRELVDAYVNGDGNVTRRNGAVWKRVHTTSRLWAFQLQSILARLGHYATVELRRPGGP
HHHHHHHHHHCCCCCCEEECCCCEEEEHHHHHHHHHHHHHHHHHHHCCEEEEEEECCCCC
GVIMGRNVVRKDIYQVQWTEGGRGPKQARDCGDYFAVPIKKRAVREAHEPVYNLDVENPD
EEEECCHHHHHCCEEEEECCCCCCHHHHHHCCCEEECCHHHHHHHHHCCCEEEECCCCCC
SYLAYGFAVHNCTAPIYKSDSLHSAVVEIIVKPHARVRYTTIQNWSNNVYNLVTKRARAE
CEEEEEEEEECCCCCCCCCCCHHHHHHHHHCCCCCEEEEEEECCCCCHHHHHHHHHHHHH
AGATMEWIDGNIGSKVTMKYPAVWMTGEHAKGEVLSVAFAGEDQHQDTGAKMLHLAPNTS
CCCEEEEECCCCCCEEEEECCEEEEECCCCCCCEEEEEECCCCCCCCCCCEEEEECCCCC
SNIVSKSVARGGGRTSYRGLVQVNKGAHGSRSSVKCDALLVDTVSRSDTYPYVDIREDDV
CHHHHHHHHCCCCCCCCCEEEEECCCCCCCCCCEEEEEEEEECCCCCCCCCEEEECCCCC
TMGHEATVSKVSENQLFYLMSRGLTEDEAMAMVVRGFVEPIAKELPMEYALELNRLIELQ
CCCCHHHHHHCCCCCEEEEECCCCCHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHEEEE
MEGAVG
ECCCCC
>Mature Secondary Structure
TLTPEASKSVAQPPTQAPLTQEEAIASLGRYGYGWADSDVAGANAQRGLSEAVVRDISA
CCCCCCCHHHCCCCCCCCCCHHHHHHHHHCCCCCCCCCCCCCCCHHCCHHHHHHHHHHC
KKNEPDWMLQSRLKALRIFDRKPIPKWGSNLDGIDFDNIKYFVRSTEKQAASWDDLPEDI
CCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCHHHHHHHHHCCHHHHCCCCCCHHHH
RNTYDRLGIPEAEKQRLVAGVAAQYESEVVYHQIREDLEAQGVIFLDTDTGLREHPDIFK
HHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEECCCCCCCCHHHHH
EYFGTVIPAGDNKFSALNTAVWSGGSFIYVPPGVHVDIPLQAYFRINTENMGQFERTLII
HHHCCEECCCCCCCHHHHHHEECCCCEEEECCCCEEEECEEEEEEECCCCCCCEEEEEEE
ADEGSYVHYVEGCLPAGELITTADGDLRPIESIRVGDFVTGHDGRPHRVTAVQVRDLDGE
ECCCCEEEEECCCCCCCCEEEECCCCCCCCCCEEECCEEECCCCCCCEEEEEEEEECCCC
LFTFTPMSPANAFSVTAEHPLLAIPRDEVRVMRKERNGWKAEVNSTKLRSAEPRWIAAKD
EEEECCCCCCCCEEEECCCCEEECCHHHHHHHHHHCCCCEEECCCCCCCCCCCCEEEECC
VAEGDFLIYPKPKPIPHRTVLPLEFARLAGYYLAEGHACLTNGCESLIFSFHSDEFEYVE
CCCCCEEEECCCCCCCCCEECHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHCCCHHHHHH
DVRQACKSLYEKSGSVLIEEHKHSARVTVYTKAGYAAMRDNVGIGSSNKKLSDLLMRQDE
HHHHHHHHHHHCCCCEEEEECCCCCEEEEEEECCCHHHHCCCCCCCCCHHHHHHHHHCCH
TFLRELVDAYVNGDGNVTRRNGAVWKRVHTTSRLWAFQLQSILARLGHYATVELRRPGGP
HHHHHHHHHHCCCCCCEEECCCCEEEEHHHHHHHHHHHHHHHHHHHCCEEEEEEECCCCC
GVIMGRNVVRKDIYQVQWTEGGRGPKQARDCGDYFAVPIKKRAVREAHEPVYNLDVENPD
EEEECCHHHHHCCEEEEECCCCCCHHHHHHCCCEEECCHHHHHHHHHCCCEEEECCCCCC
SYLAYGFAVHNCTAPIYKSDSLHSAVVEIIVKPHARVRYTTIQNWSNNVYNLVTKRARAE
CEEEEEEEEECCCCCCCCCCCHHHHHHHHHCCCCCEEEEEEECCCCCHHHHHHHHHHHHH
AGATMEWIDGNIGSKVTMKYPAVWMTGEHAKGEVLSVAFAGEDQHQDTGAKMLHLAPNTS
CCCEEEEECCCCCCEEEEECCEEEEECCCCCCCEEEEEECCCCCCCCCCCEEEEECCCCC
SNIVSKSVARGGGRTSYRGLVQVNKGAHGSRSSVKCDALLVDTVSRSDTYPYVDIREDDV
CHHHHHHHHCCCCCCCCCEEEEECCCCCCCCCCEEEEEEEEECCCCCCCCCEEEECCCCC
TMGHEATVSKVSENQLFYLMSRGLTEDEAMAMVVRGFVEPIAKELPMEYALELNRLIELQ
CCCCHHHHHHCCCCCEEEEECCCCCHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHEEEE
MEGAVG
ECCCCC

PDB accession: NA

Resolution: NA

Structure class: Alpha Beta

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 9634230; 12218036