Definition Mycobacterium avium subsp. paratuberculosis K-10, complete genome.
Accession NC_002944
Length 4,829,781

The map label for this gene is 41407004

Identifier: 41407004

GI number: 41407004

Start: 935735

End: 937720

Strand: Direct

Name: 41407004

Synonym: MAP0906

Alternate gene names: NA

Gene position: 935735-937720 (Clockwise)

Preceding gene: 41407003

Following gene: 41407005

Centisome position: 19.37
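The centisome value appears to be the gene's start coordinate expressed as a percentage of the genome length. The record does not state the formula, so this is an assumption, and the function name below is illustrative rather than part of any database tool. A minimal sketch using the Start and Length fields of this record:

```python
def centisome(start: int, genome_length: int) -> float:
    """Return a start coordinate as a percentage of the genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return 100.0 * start / genome_length


# Values copied from this record: Start 935735, genome Length 4,829,781.
position = centisome(935735, 4829781)
print(f"{position:.2f}")
```

Under that assumption the result rounds to the 19.37 reported in the field above.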

GC content: 69.28
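GC content is conventionally the percentage of G and C bases in the sequence. The sketch below shows one way to recompute it from the FASTA-style gene sequence in this record. The helper name `gc_percent` is hypothetical, and the short input is only the first 12 bases of the gene, used for illustration:

```python
def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in a DNA string (case-insensitive)."""
    # Remove line breaks and spaces so wrapped FASTA text can be pasted in.
    seq = "".join(dna.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First 12 bases of the gene sequence, for illustration only.
# Paste the full 1986-base sequence to compare against the reported value.
print(gc_percent("ATGGCTAACAGG"))
```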

Gene sequence:

>1986_bases
ATGGCTAACAGGGGCCACTCGTCGCGATACTCGGCGTACACCGGCGGACCCGATCCGCTGGCGCCGCCGGTGGATCTGCG
GGAGGCGCTCGAACAGATCGGCCAGGACGTGATGGCCGGCACCTCGCCGCGGCGTGCCCTCTCCGAGCTGCTGCGGCGCG
GCACCAAGAACATGCCCGGCGCCGACCGGCTGGCCGCCGAGGCCAACCGGCGGCGACGGGAATTGTTGCGCCGCAACAAT
TTAGACGGCACCCTGCAGGAGATCAAGAAGCTGCTGGACGAGGCCGTGCTGGCCGAACGCAAGGAGCTGGCCCGCGCGCT
CGACGATGACGCCCGGTTCGGTGAACTGCAGCTCGATGCGCTGCCGGCGTCGCCGGCCAAGGCCGTGCAGGAGCTTTCCG
AATACAACTGGCGCAGCAGCGAGGCGCGTGAAAAGTACGACCAGATCAAGGATCTGCTCGGGCGCGAAATCCTCGACCAA
CGCTTCGCGGGCATGAAGCAGGCGCTGCAGGGCGCCACCGACGAGGACCGTCAGCGGGTCAGCGACATGCTGAACGACCT
CAACGACCTGCTGGACAAGCACGCCAAAGGCCAAGACTCCCAGCAGGACTTCGACGACTTCATGGCCAAGCACGGCGAGT
TCTTCCCGGAGAGCCCGCGCAATGTCGAGGAGCTGCTGGATTCGTTGGCCAAGCGCGCCGCCGCCGCGCAGCGGTTCCGC
AACAGCCTCAGCGCCGAGCAGCGTGCCGAGCTGGATGCCTTGGCGCAGCAGGCTTTCGGCTCGCCCGACCTGATGCAGGC
GCTGAACCGGCTCGACGCGCATCTGCAGGCGGCGCGTCCCGGGGAGGACTGGGGCGGGTCCGAGCAGTTCTCCGGCGACA
ACCCGTTCGGCATGGGCGAGGGCACCCAGGCGCTGGCCGACATCGCCGAGCTGGAGCAGCTGGCCGAGCAGCTGTCGCAA
AGCTACCCGGGCGCCACCATGGATGACGTCGATCTTGACGCGCTGGCACGCCAACTCGGCGACCAGGCGGCCGTCGACGC
CCGCACGCTCGCCGAGCTGGAACGGGCCCTGGTCAATCAGGGCTTTTTGGACCGCGGGTCCGACGGCCAGTGGCGGCTAT
CGCCGAAGGCCATGCGCCGCCTGGGTGAGACGGCGTTACGCGATGTGGCACAACAGCTTTCCGGCCGCCGCGGTGAACGT
GACCACCGGCGCGCGGGGGCGGCCGGCGAACTCACCGGCGCCACCCGGCCCTGGCAGTTCGGTGACACCGAGCCGTGGAA
CATCTCCCGCACCCTGACCAACGCCGTGCTGCGCCAAGCCGGGACGGCCACTCTGGACGGACCGGACGGGCGGTTGAAGA
TCACCGTCGACGACGTTGAGGTCTCCGAAACCGAGACGCGCACGCAGGCCGCGGTCGCGCTGTTGGTGGACACGTCGTTC
TCGATGGTGATGGAGAACCGCTGGCTGCCGATGAAGCAGACGGCGCTGGCGCTCAACCATCTGGTGTGCACCCGGTTCCG
TTCGGATGCCTTGCAGATCATTGCTTTTGGCCGCTACGCCCGGACGGTGACCGCCGCCGAGCTGACCGGGCTGGAGGGTG
TGTACGAGCAGGGCACCAATCTGCATCACGCGCTGGCGCTGGCGGGCCGGCATCTGCGCCGCCACCCCAACGCGCTGCCC
GTGGTGCTGGTGGTCACCGACGGTGAACCGACCGCGCACCTGGAGGACTTCGAGGGCGCGGGCACGACCGTGTTCTTCGA
CTACCCCCCGCACCCGCGGACCATCGCGCACACCGTGCGCGGCTTCGACGAGATGGCACGGCTGGGCGCGCAGATCACCA
TCTTCCGGCTGGGCAGCGACCCCGGCCTGGCTCGGTTCATCGACCAGGTGGCCCGGCGCGTCGAGGGGCGCGTGGTGGTG
CCCGACCTCGACGGATTGGGAGCGGCCGTGGTGGGCGACTACCTGCGGTCCCGGCGGCGTCGATAG

Upstream 100 bases:

>100_bases
CGCTCGGAGGGCGAACGCGCCGCGGCGCTCGAACTTGCGCTGGAGGCACTGTATTTGGCCAAGCGAATCGACAAGGTGTC
CGGGGAGGGCCAGACCGTCT

Downstream 100 bases:

>100_bases
CGACACCGGTCGCGCGACTGAGTTTGACTGTGAAGGAGGAAAACCCGGGTGAGCAAGGTTCCGACGATCGAACTCAACGA
CGGTGCGCGCATCCCGCAGC
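The upstream and downstream windows can be located from the Start and End fields. The sketch below assumes 1-based inclusive coordinates, which is consistent with End - Start + 1 = 1986 matching the gene length, and a gene on the direct strand. It ignores wraparound on a circular genome and does not clip at the genome end. The function name is hypothetical:

```python
def flanking_windows(start: int, end: int, flank: int = 100):
    """Return (upstream, downstream) coordinate pairs for a direct-strand gene.

    Coordinates are 1-based and inclusive. The upstream window is clipped
    at position 1; wraparound on a circular genome is not handled.
    """
    upstream = (max(1, start - flank), start - 1)
    downstream = (end + 1, end + flank)
    return upstream, downstream


# Start and End copied from this record.
up, down = flanking_windows(935735, 937720)
print(up, down)  # (935635, 935734) (937721, 937820)
```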

Product: hypothetical protein

Products: NA

Alternate protein names: NA

Number of amino acids: Translated: 661; Mature: 660
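The two residue counts follow from the gene coordinates. The 1986-base gene ends in the stop codon TAG, and the Mature sequence listed in this record is the Translated sequence without its leading M. The arithmetic below is a sketch under those observations. Treating "Mature" as removal of the initiator Met only is an assumption about how the database derives it:

```python
start, end = 935735, 937720           # Start and End fields of this record
gene_length = end - start + 1         # 1-based inclusive coordinates
codons, remainder = divmod(gene_length, 3)

translated = codons - 1               # final codon (TAG) is a stop, not a residue
mature = translated - 1               # assumes only the initiator Met is removed

print(gene_length, remainder, translated, mature)  # 1986 0 661 660
```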

Protein sequence:

>661_residues
MANRGHSSRYSAYTGGPDPLAPPVDLREALEQIGQDVMAGTSPRRALSELLRRGTKNMPGADRLAAEANRRRRELLRRNN
LDGTLQEIKKLLDEAVLAERKELARALDDDARFGELQLDALPASPAKAVQELSEYNWRSSEAREKYDQIKDLLGREILDQ
RFAGMKQALQGATDEDRQRVSDMLNDLNDLLDKHAKGQDSQQDFDDFMAKHGEFFPESPRNVEELLDSLAKRAAAAQRFR
NSLSAEQRAELDALAQQAFGSPDLMQALNRLDAHLQAARPGEDWGGSEQFSGDNPFGMGEGTQALADIAELEQLAEQLSQ
SYPGATMDDVDLDALARQLGDQAAVDARTLAELERALVNQGFLDRGSDGQWRLSPKAMRRLGETALRDVAQQLSGRRGER
DHRRAGAAGELTGATRPWQFGDTEPWNISRTLTNAVLRQAGTATLDGPDGRLKITVDDVEVSETETRTQAAVALLVDTSF
SMVMENRWLPMKQTALALNHLVCTRFRSDALQIIAFGRYARTVTAAELTGLEGVYEQGTNLHHALALAGRHLRRHPNALP
VVLVVTDGEPTAHLEDFEGAGTTVFFDYPPHPRTIAHTVRGFDEMARLGAQITIFRLGSDPGLARFIDQVARRVEGRVVV
PDLDGLGAAVVGDYLRSRRRR

Sequences:

>Translated_661_residues
MANRGHSSRYSAYTGGPDPLAPPVDLREALEQIGQDVMAGTSPRRALSELLRRGTKNMPGADRLAAEANRRRRELLRRNN
LDGTLQEIKKLLDEAVLAERKELARALDDDARFGELQLDALPASPAKAVQELSEYNWRSSEAREKYDQIKDLLGREILDQ
RFAGMKQALQGATDEDRQRVSDMLNDLNDLLDKHAKGQDSQQDFDDFMAKHGEFFPESPRNVEELLDSLAKRAAAAQRFR
NSLSAEQRAELDALAQQAFGSPDLMQALNRLDAHLQAARPGEDWGGSEQFSGDNPFGMGEGTQALADIAELEQLAEQLSQ
SYPGATMDDVDLDALARQLGDQAAVDARTLAELERALVNQGFLDRGSDGQWRLSPKAMRRLGETALRDVAQQLSGRRGER
DHRRAGAAGELTGATRPWQFGDTEPWNISRTLTNAVLRQAGTATLDGPDGRLKITVDDVEVSETETRTQAAVALLVDTSF
SMVMENRWLPMKQTALALNHLVCTRFRSDALQIIAFGRYARTVTAAELTGLEGVYEQGTNLHHALALAGRHLRRHPNALP
VVLVVTDGEPTAHLEDFEGAGTTVFFDYPPHPRTIAHTVRGFDEMARLGAQITIFRLGSDPGLARFIDQVARRVEGRVVV
PDLDGLGAAVVGDYLRSRRRR
>Mature_660_residues
ANRGHSSRYSAYTGGPDPLAPPVDLREALEQIGQDVMAGTSPRRALSELLRRGTKNMPGADRLAAEANRRRRELLRRNNL
DGTLQEIKKLLDEAVLAERKELARALDDDARFGELQLDALPASPAKAVQELSEYNWRSSEAREKYDQIKDLLGREILDQR
FAGMKQALQGATDEDRQRVSDMLNDLNDLLDKHAKGQDSQQDFDDFMAKHGEFFPESPRNVEELLDSLAKRAAAAQRFRN
SLSAEQRAELDALAQQAFGSPDLMQALNRLDAHLQAARPGEDWGGSEQFSGDNPFGMGEGTQALADIAELEQLAEQLSQS
YPGATMDDVDLDALARQLGDQAAVDARTLAELERALVNQGFLDRGSDGQWRLSPKAMRRLGETALRDVAQQLSGRRGERD
HRRAGAAGELTGATRPWQFGDTEPWNISRTLTNAVLRQAGTATLDGPDGRLKITVDDVEVSETETRTQAAVALLVDTSFS
MVMENRWLPMKQTALALNHLVCTRFRSDALQIIAFGRYARTVTAAELTGLEGVYEQGTNLHHALALAGRHLRRHPNALPV
VLVVTDGEPTAHLEDFEGAGTTVFFDYPPHPRTIAHTVRGFDEMARLGAQITIFRLGSDPGLARFIDQVARRVEGRVVVP
DLDGLGAAVVGDYLRSRRRR

Specific function: Unknown

COG id: COG4867

COG function: function code R; Uncharacterized protein with a von Willebrand factor type A (vWA) domain

Gene ontology:

Cell location: Cytoplasmic

Metabolic importance: NA

Operon status: Not Known

Operon components: None

Similarity: NA

Homologues:

None

Paralogues:

None

Copy number: NA

Swissprot (AC and ID): NA

Other databases:

- InterPro:   IPR002035 [H]

Pfam domain/function: NA

EC number: NA

Molecular weight: Translated: 73001; Mature: 72869

Theoretical pI: Translated: 5.23; Mature: 5.23

Prosite motif: PS00639 THIOL_PROTEASE_HIS

Important sites: NA

Signals:

None

Transmembrane regions:

None

Cys/Met content:

0.2 %Cys     (Translated Protein)
2.1 %Met     (Translated Protein)
2.3 %Cys+Met (Translated Protein)
0.2 %Cys     (Mature Protein)
2.0 %Met     (Mature Protein)
2.1 %Cys+Met (Mature Protein)
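These percentages appear to be simple residue counts divided by the sequence length. A minimal sketch of that calculation follows. The helper `residue_percent` is hypothetical, and the input is only the first 80 residues of the translated protein, so its output will not match the full-length values above:

```python
def residue_percent(protein: str, residues: str) -> float:
    """Percentage of positions in `protein` occupied by any letter in `residues`."""
    # Remove line breaks and spaces so wrapped FASTA text can be pasted in.
    seq = "".join(protein.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    count = sum(1 for aa in seq if aa in residues)
    return 100.0 * count / len(seq)


# First 80 residues of the translated protein, for illustration only.
fragment = (
    "MANRGHSSRYSAYTGGPDPLAPPVDLREALEQIGQDVMAGTSPRRALSELLRRGTKNMPG"
    "ADRLAAEANRRRRELLRRNN"
)
for label, letters in (("Cys", "C"), ("Met", "M"), ("Cys+Met", "CM")):
    print(f"{residue_percent(fragment, letters):.1f} %{label}")
```

If each value is rounded separately from the raw counts, the Cys+Met line need not equal the sum of the two rounded lines, which would explain the Mature Protein values above (0.2 and 2.0 listed alongside 2.1).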

Secondary structure:

>Translated Secondary Structure
MANRGHSSRYSAYTGGPDPLAPPVDLREALEQIGQDVMAGTSPRRALSELLRRGTKNMPG
CCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCHHHHHHHHHHCCCCCCCC
ADRLAAEANRRRRELLRRNNLDGTLQEIKKLLDEAVLAERKELARALDDDARFGELQLDA
HHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEC
LPASPAKAVQELSEYNWRSSEAREKYDQIKDLLGREILDQRFAGMKQALQGATDEDRQRV
CCCCHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHH
SDMLNDLNDLLDKHAKGQDSQQDFDDFMAKHGEFFPESPRNVEELLDSLAKRAAAAQRFR
HHHHHHHHHHHHHHCCCCCCHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHHHHHHHHHH
NSLSAEQRAELDALAQQAFGSPDLMQALNRLDAHLQAARPGEDWGGSEQFSGDNPFGMGE
HHHCHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCC
GTQALADIAELEQLAEQLSQSYPGATMDDVDLDALARQLGDQAAVDARTLAELERALVNQ
HHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHCCHHHHHHHHHHHHHHHHHHC
GFLDRGSDGQWRLSPKAMRRLGETALRDVAQQLSGRRGERDHRRAGAAGELTGATRPWQF
CCCCCCCCCCEEECHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHCCCCCCCCCCCCCCCC
GDTEPWNISRTLTNAVLRQAGTATLDGPDGRLKITVDDVEVSETETRTQAAVALLVDTSF
CCCCCCCHHHHHHHHHHHHCCCCCCCCCCCEEEEEEECCCCCHHHHHHHHHHHHHHHCHH
SMVMENRWLPMKQTALALNHLVCTRFRSDALQIIAFGRYARTVTAAELTGLEGVYEQGTN
HHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCC
LHHALALAGRHLRRHPNALPVVLVVTDGEPTAHLEDFEGAGTTVFFDYPPHPRTIAHTVR
HHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCCHHCCCCCCCEEEECCCCCCHHHHHHHH
GFDEMARLGAQITIFRLGSDPGLARFIDQVARRVEGRVVVPDLDGLGAAVVGDYLRSRRR
HHHHHHHCCCEEEEEEECCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHHHHHHCC
R
C
>Mature Secondary Structure
ANRGHSSRYSAYTGGPDPLAPPVDLREALEQIGQDVMAGTSPRRALSELLRRGTKNMPG
CCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCCCHHHHHHHHHHCCCCCCCC
ADRLAAEANRRRRELLRRNNLDGTLQEIKKLLDEAVLAERKELARALDDDARFGELQLDA
HHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEEEC
LPASPAKAVQELSEYNWRSSEAREKYDQIKDLLGREILDQRFAGMKQALQGATDEDRQRV
CCCCHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHH
SDMLNDLNDLLDKHAKGQDSQQDFDDFMAKHGEFFPESPRNVEELLDSLAKRAAAAQRFR
HHHHHHHHHHHHHHCCCCCCHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHHHHHHHHHH
NSLSAEQRAELDALAQQAFGSPDLMQALNRLDAHLQAARPGEDWGGSEQFSGDNPFGMGE
HHHCHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCC
GTQALADIAELEQLAEQLSQSYPGATMDDVDLDALARQLGDQAAVDARTLAELERALVNQ
HHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHCCHHHHHHHHHHHHHHHHHHC
GFLDRGSDGQWRLSPKAMRRLGETALRDVAQQLSGRRGERDHRRAGAAGELTGATRPWQF
CCCCCCCCCCEEECHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHCCCCCCCCCCCCCCCC
GDTEPWNISRTLTNAVLRQAGTATLDGPDGRLKITVDDVEVSETETRTQAAVALLVDTSF
CCCCCCCHHHHHHHHHHHHCCCCCCCCCCCEEEEEEECCCCCHHHHHHHHHHHHHHHCHH
SMVMENRWLPMKQTALALNHLVCTRFRSDALQIIAFGRYARTVTAAELTGLEGVYEQGTN
HHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCC
LHHALALAGRHLRRHPNALPVVLVVTDGEPTAHLEDFEGAGTTVFFDYPPHPRTIAHTVR
HHHHHHHHHHHHHHCCCCCEEEEEEECCCCCCCHHCCCCCCCEEEECCCCCCHHHHHHHH
GFDEMARLGAQITIFRLGSDPGLARFIDQVARRVEGRVVVPDLDGLGAAVVGDYLRSRRR
HHHHHHHCCCEEEEEEECCCCCHHHHHHHHHHHHCCEEEECCCCCCHHHHHHHHHHHHCC
R
C

PDB accession: NA

Resolution: NA

Structure class: Unstructured

Cofactors: NA

Metal ions: NA

Kcat value (1/min): NA

Specific activity: NA

Km value (mM): NA

Substrates: NA

Specific reaction: NA

General reaction: NA

Inhibitor: NA

Structure determination priority: 9.0

TargetDB status: NA

Availability: NA

References: 9634230; 12218036 [H]