LOCUS       NC_015657               5448 bp    DNA     circular BCT 16-JUN-2011
DEFINITION  Frankia symbiont of Datisca glomerata plasmid pFSYMDG02, complete
            sequence.
ACCESSION   NC_015657
VERSION     NC_015657.1  GI:336176136
DBLINK      Project: 46257
KEYWORDS    .
SOURCE      Frankia symbiont of Datisca glomerata
  ORGANISM  Frankia symbiont of Datisca glomerata
            Bacteria; Actinobacteria; Actinobacteridae; Actinomycetales;
            Frankineae; Frankiaceae; Frankia.
REFERENCE   1  (bases 1 to 5448)
  AUTHORS   Lucas,S., Han,J., Lapidus,A., Cheng,J.-F., Goodwin,L., Pitluck,S.,
            Peters,L., Mikhailova,N., Chertkov,O., Teshima,H., Han,C.,
            Tapia,R., Land,M., Hauser,L., Kyrpides,N., Ivanova,N., Pagani,I.,
            Berry,A., Pawlowski,K., Persson,T., Vanden Heuvel,B., Benson,D.
            and Woyke,T.
  CONSRTM   US DOE Joint Genome Institute
  TITLE     Complete sequence of plasmid 2 of Frankia symbiont of Datisca
            glomerata
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 5448)
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (15-JUN-2011) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
REFERENCE   3  (bases 1 to 5448)
  AUTHORS   Lucas,S., Han,J., Lapidus,A., Cheng,J.-F., Goodwin,L., Pitluck,S.,
            Peters,L., Mikhailova,N., Chertkov,O., Teshima,H., Han,C.,
            Tapia,R., Land,M., Hauser,L., Kyrpides,N., Ivanova,N., Pagani,I.,
            Berry,A., Pawlowski,K., Persson,T., Vanden Heuvel,B., Benson,D.
            and Woyke,T.
  CONSRTM   US DOE Joint Genome Institute
  TITLE     Direct Submission
  JOURNAL   Submitted (19-MAY-2011) US DOE Joint Genome Institute, 2800
            Mitchell Drive B310, Walnut Creek, CA 94598-1698, USA
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence is identical to CP002803.
            URL -- http://www.jgi.doe.gov
            JGI Project ID: 4085684
            Source DNA and organism available from David Benson
            (david.benson@uconn.edu)
            Contacts: David Benson (david.benson@uconn.edu)
                      Tanja Woyke (microbe@cuba.jgi-psf.org)
            Annotation done by JGI-ORNL and JGI-PGF
            Finishing done by JGI-LANL
            The JGI and collaborators endorse the principles for the
            distribution and use of large scale sequencing data adopted by the
            larger genome sequencing community and urge users of this data to
            follow them. It is our intention to publish the work of this
            project in a timely fashion and we welcome collaborative
            interaction on the project and analysis.
            (http://www.genome.gov/page.cfm?pageID=10506376).
            ##MIGS-Data-START##
            investigation_type :: bacteria_archaea
            project_name :: Frankia symbiont of Datisca glomerata
            collection_date :: Missing
            lat_lon :: Missing
            depth :: Missing
            alt_elev :: Missing
            country :: Missing
            environment :: Host, Plant symbiont, Soil
            num_replicons :: 3
            ref_biomaterial :: Missing
            biotic_relationship :: Symbiotic
            trophic_level :: Chemoorganotroph
            rel_to_oxygen :: Aerobe
            isol_growth_condt :: Missing
            sequencing_meth :: WGS
            assembly :: Newbler v. 2.3 (pre-release)
            finishing_strategy :: Finished
            GOLD Stamp ID :: Gi04474
            Funding Program :: DOE-CSP 2007
            Isolation Site :: Datisca glomerata
            Host Name :: Datisca glomerata
            Cell Shape :: Filament-shaped
            Motility :: Nonmotile
            Sporulation :: Sporulating
            Temperature Range :: Mesophile
            Gram Staining :: Gram+
            Symbiotic Relationship :: Mutualistic
            Diseases :: None
            Phenotypes :: Non-Pathogen
            ##MIGS-Data-END##
            ##Genome-Assembly-Data-START##
            Finishing Goal :: Finished
            Current Finishing Status :: Finished
            Assembly Method :: Newbler v. 2.3
            Genome Coverage :: 30x
            Sequencing Technology :: 454/Illumina
            ##Genome-Assembly-Data-END##
            COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     source          1..5448
                     /organism="Frankia symbiont of Datisca glomerata"
                     /mol_type="genomic DNA"
                     /strain="4085684"
                     /host="Datisca glomerata"
                     /db_xref="taxon:656024"
                     /plasmid="pFSYMDG02"
     gene            211..3903
                     /locus_tag="FsymDg_4560"
                     /db_xref="GeneID:10785245"
     CDS             211..3903
                     /locus_tag="FsymDg_4560"
                     /EC_number="3.1.26.4"
                     /inference="protein motif:PRIAM:3.1.26.4"
                     /note="KEGG: vpo:Kpol_1037p34 Tkp1 protein; PFAM:
                     Integrase, catalytic core"
                     /codon_start=1
                     /transl_table=11
                     /product="Ribonuclease H"
                     /protein_id="YP_004585717.1"
                     /db_xref="GI:336176137"
                     /db_xref="InterPro:IPR001584"
                     /db_xref="GeneID:10785245"
                     /translation="MASNDIISTNVPSKVPTTDTEEYPVQHGSAAKTPIVPSDNQASP
                     EESGSIDVEHKANADDEASGSHEGHHENFSKDSTKTGTPQDTLHKEPVDGSPVPNYAQ
                     QSWYYHTQPPPQFYPSPYANYGPGPYTPPWANMNMPIPGANTNPDKTGGHYQTTGPSG
                     DSSQYNPAYYFPYQLEPKPGLQSQTTTECTFGQLHNPRKTFTLSKVNSPHYYDSWVRE
                     MCQAMLEQNLGHLIPTEDNKTPETPEPAEKRYIEEIHIACVPEEKYPKWLKSSYDEGE
                     ALIDCFKEGIRRIKAEEDPEAIVLALAGLTLDYRESIPNFAKRLRKIHQRTVNAKFPM
                     AEKIVISRALKALPERYEKVDINFTKSNDQSFNNFINILLTTEPRMKSSNQYDNQPVS
                     VIYDKKLIHNFTSQSNQILVDVQQNEVAVKGEGNLELKFKGKRISLPAIYAPSTHTNI
                     ISVEDLTKSQAYLDLRRNCLLSKNGKTIAPTHKFAGLRWLSRKNAIELPNQTQSVYAI
                     TPRSVRSAPDKFSLTSIHNMFGHMNINYIRESFRKGLIQGVKEDDVDWTGVSSFQCQY
                     CMEGKAKRNNHYVNARKDYTKEYLPFEYLHTDVFGPVRVQRTRTTPRYFIAFIDEVTK
                     YIWTFPLLHKTAEEVAPTFKEVVMLIYTQFNTRVKTIQMDKGSEYLNTKVQKFLRERG
                     IVSRETTVADSKANGAIERQHYTLLNDCRTFLRQANLRPRLWYHAVVYSTVMRNSLLN
                     RSIGTSPRNRAGMSGLSFRDILPFGQPVIVHLPNPKSKLQARGVLGYALHPSTRSYGY
                     IIVVGKKKKIPIDTRNYRVLNYPPGATISEDEVQYMIDRMENNDAESQGDIESNFEPN
                     YTDMEQPIHHTADYFPNTTASNIETDQYNNDSFGLHYGGDSVPPESSSSEDELFPTDE
                     SENDSDSSDQSFNDDGDPMSPPYSGGEEQLVPTAAPIRRVPPMEPPSPVEDSPPPLSG
                     TDLDDLFGESNINNYIPEDTDLLALNHESVPEPDTAVPATTNIEQDNVLPLETNENSN
                     PPNDSSDQSGDESGEQSGEESGEESVDNRLKSIPIFNGNKHKDARIAEADLDSLYGGG
                     DNTENNNGPTLEEVFRSIEEDPFMLTQKRPRSRARYRESNQDSCDSGGDYESSSDSDG
                     SSDESPQKGRKIQRVNYVNAVQKPVSVIPLNMSLNYSQAISRNRNEEEKDAFQKAYQK
                     EIAQLTKMNTWNEELIDASTLPKKRF"
     misc_feature    1981..2340
                     /locus_tag="FsymDg_4560"
                     /note="Integrase core domain; Region: rve; cl01316"
                     /db_xref="CDD:194099"
     gene            3909..5333
                     /locus_tag="FsymDg_4561"
                     /db_xref="GeneID:10785244"
     CDS             3909..5333
                     /locus_tag="FsymDg_4561"
                     /EC_number="3.1.26.4"
                     /inference="protein motif:PRIAM:3.1.26.4"
                     /note="KEGG: sce:YDR210C-D Retrotransposon TYA Gag and TYB
                     Pol genes; transcribed/translated as one unit; polyprotein
                     is processed to make a nucleocapsid-like protein (Gag),
                     reverse transcriptase (RT), protease (PR), and integrase
                     (IN); similar to retroviral genes; PFAM: Reverse
                     transcriptase, RNA-dependent DNA polymerase"
                     /codon_start=1
                     /transl_table=11
                     /product="Ribonuclease H"
                     /protein_id="YP_004585718.1"
                     /db_xref="GI:336176138"
                     /db_xref="InterPro:IPR013103"
                     /db_xref="GeneID:10785244"
                     /translation="MFIFTTKRDNSKKCRLVARGDQQAADTYDTELKANTVDNLALMT
                     VLALTLDYNLTAFQLDISSAYLYADLKEELYIRAPPHMNAKNKVLRLNKSLYGLKQSG
                     ANWYELIRSFLIKKCDLIEDRMWKCVFRDKEPLKLIICLFVDDMLVVGNDVKYIKKFI
                     SKLSKRFDTKIVNDGSHRPEDGVNEYDILGIELEYKKKEYMKFGMQKSLEDKLPQLGI
                     PLLPNAKIRKVPGVPGDYIFSGKELKLNEREYKSKVKHLQRIVGLASYVGHKFRFDIL
                     YYVNILAQHQLYPSAKVLDRAAQLCQYLWDTRDKKLVWHYSGPKENNVTAVSDAAFAG
                     NQDFKSQSGTLYLRNNKPIAAKSRKIKLTCISSTEAEIYAISESLPILRGLEHLVNKL
                     QDIKATVKVKTDSQPSMAIINGTDDSACLKKHIGSRAMRIRDECDDLGLTLEYIPTKE
                     NNADVLTKPLSVKLFKLLTEDWIQ"
     misc_feature    3915..4532
                     /locus_tag="FsymDg_4561"
                     /note="Reverse transcriptase (RNA-dependent DNA
                     polymerase); Region: RVT_2; pfam07727"
                     /db_xref="CDD:149018"
     misc_feature    4884..5303
                     /locus_tag="FsymDg_4561"
                     /note="Ty1/Copia family of RNase HI in long-term repeat
                     retroelements; Region: RNase_HI_RT_Ty1; cd09272"
                     /db_xref="CDD:187696"
     misc_feature    order(4893..4904,5004..5012,5019..5021,5118..5120,
                     5199..5201,5280..5282)
                     /locus_tag="FsymDg_4561"
                     /note="RNA/DNA hybrid binding site [nucleotide binding];
                     other site"
                     /db_xref="CDD:187696"
     misc_feature    order(4893..4895,5019..5021,5118..5120,5256..5258)
                     /locus_tag="FsymDg_4561"
                     /note="active site"
                     /db_xref="CDD:187696"
ORIGIN
        1 gctaaagcat ttgataattg atttgattcc aaaaagtata taagtagaag gtttcttcat
       61 ccaattccca ctttcgaatt gacaacttac ttactatcca attataagtc aagtcagaag
      121 aaaccaagga gaacaacagt aaatacttac ttatctagtg aatccaacaa ttcaattacc
      181 aattaattaa tccaacagct tcaagtcaaa atggcatcca acgacattat ttctactaac
      241 gtcccttcta aggtccctac taccgatact gaggaatatc cagtccaaca tggtagcgcc
      301 gctaagacac ctattgtccc ctcagacaat caggcaagtc cagaggaatc aggtagcatc
      361 gatgtcgagc ataaggccaa tgccgacgac gaagcgtccg gatcccacga aggtcatcat
      421 gaaaattttt caaaggattc aacaaaaacc ggaactccac aggacaccct acacaaagag
      481 cctgtggacg gttctcccgt tccaaattac gcccaacaaa gctggtacta ccatacacag
      541 ccaccgccac agttttatcc atctccttac gctaattatg ggcctggtcc ttacactcct
      601 ccgtgggcca acatgaatat gcccatccca ggagctaaca cgaacccaga taaaactgga
      661 ggacattatc aaacaacggg tccatctggg gacagctccc aatacaaccc tgcttactac
      721 ttcccgtacc aattggaacc aaaaccaggg ttgcaaagcc aaacaacgac cgaatgtact
      781 ttcggccaat tgcacaatcc cagaaaaaca tttactctgt caaaggtaaa ctcacctcac
      841 tattacgaca gctgggtaag agaaatgtgc caggcaatgc tcgagcaaaa cttgggccac
      901 ctcattccaa cagaggataa caaaacccca gaaaccccag aacctgcaga gaagaggtac
      961 atagaagaaa tccatatagc atgtgtgcct gaagaaaaat acccaaagtg gttaaagtcg
     1021 agttacgacg aaggtgaagc actaattgac tgcttcaagg aaggaattag aagaataaag
     1081 gcagaagaag atcctgaggc aatcgtgcta gccctagctg gactaacact agactacaga
     1141 gagtctattc ccaacttcgc caagagatta aggaaaatcc accagagaac ggtaaacgca
     1201 aaattcccaa tggcagaaaa gatcgtcatt tccagagcac taaaggctct accagaacga
     1261 tacgagaaag ttgatatcaa cttcaccaaa tcaaacgatc agagtttcaa taacttcata
     1321 aacatcctcc taacaactga gccaagaatg aaaagctcaa atcaatatga taatcaacca
     1381 gtctctgtga tatatgacaa aaagttaatt cacaatttta catctcaatc taatcaaatt
     1441 cttgtcgacg ttcaacaaaa cgaagtcgcc gttaaaggtg agggtaactt agaactcaag
     1501 tttaagggca aacgaatttc cctacccgcc atatatgcac catcaacaca caccaatatc
     1561 atcagtgtag aagatctaac aaaatcacag gcgtatctag atctaagaag gaactgcctg
     1621 ctatccaaga acgggaaaac catcgctccg acgcacaagt tcgcgggcct aaggtggtta
     1681 tctcgcaaaa atgctataga actaccaaac cagactcaat cagtctacgc aatcacaccc
     1741 agatcggtaa gatctgcgcc agacaagttc tctctgacaa gcatccacaa catgtttgga
     1801 catatgaata tcaattacat ccgagaatca ttcagaaaag gcttaattca aggtgtgaaa
     1861 gaagacgatg tagactggac aggagtaagc tcattccaat gccaatactg tatggaagga
     1921 aaagccaaac gtaataacca ttacgtgaac gctagaaaag attatacgaa ggaatatctt
     1981 cctttcgaat acttgcatac cgacgttttt ggaccagtaa gagtacaaag aacccgtact
     2041 actccaaggt acttcattgc attcatagac gaggtcacaa aatacatatg gaccttcccg
     2101 ttactacata aaacagcaga agaagtagcc ccgacattca aggaagtcgt catgctgatt
     2161 tatacacagt tcaacacgag agtgaaaacc atccaaatgg acaaaggatc ggaatacctg
     2221 aacactaagg tacagaaatt cctaagggaa agaggaattg tttcgagaga aacgaccgtt
     2281 gctgattcaa aagcaaatgg agccatagaa agacagcact atacactctt gaatgactgt
     2341 agaacgttct tgcgacaagc taacctacgc cctagattgt ggtatcatgc cgtcgtatac
     2401 tctacagtaa tgagaaattc actcctaaat agaagtatag gaacgtcccc gagaaacagg
     2461 gcggggatgt cgggactgtc attcagagac attcttccct tcggacaacc cgtgattgtc
     2521 cacttaccca acccaaaatc gaaactacaa gctcgaggag tcctaggcta tgctctccat
     2581 ccatccacta gatcatacgg ctacatcata gtcgtaggaa agaagaagaa aatacctatc
     2641 gacactcgaa actaccgagt cctgaactac ccaccaggtg caacaatctc agaagatgag
     2701 gtacagtaca tgatcgaccg tatggaaaat aacgacgcag aatctcaagg cgatatagag
     2761 tcgaactttg aaccaaatta tacggatatg gagcaaccca ttcaccacac agcggactat
     2821 ttcccgaaca caaccgcctc aaacatcgag acagaccaat acaataatga tagcttcggt
     2881 ctgcattacg ggggtgattc agtaccaccc gagtcgtcca gtagcgagga cgaactgttc
     2941 cccacagacg aatcagaaaa cgattcagac tcatcggacc aatcattcaa tgatgatgga
     3001 gaccccatgt cccctccata ttccgggggt gaggaacagt tagtaccgac agcagccccg
     3061 attagacgcg ttccaccaat ggaaccacca tctcctgtcg aggactctcc cccaccgtta
     3121 tctgggacag accttgacga cttatttgga gaatctaaca taaataatta catcccagaa
     3181 gatacagatc tactggcact aaaccatgaa tctgttccgg aacccgatac cgcggtacca
     3241 gcaacaacga acattgaaca agataatgtt ctaccattag aaactaacga aaactctaac
     3301 cctccaaatg atagctcgga ccagtcaggc gacgagtcag gcgaacagtc aggcgaagag
     3361 tccggcgagg agtcagtcga caacagactc aagtccatac ctattttcaa tggaaacaag
     3421 cacaaagatg caagaattgc tgaagcagat ttggattctt tgtacggggg tggagataat
     3481 acagaaaata acaatggacc aactttagaa gaggtgttca gatcgatcga agaagatcca
     3541 ttcatgctaa cacagaaaag accgagatca cgtgcccgat atcgagaatc taaccaagac
     3601 agttgcgatt ccgggggtga ctacgagtca agttcagact cagacgggtc atctgacgag
     3661 tcacctcaga agggaagaaa gatacaacgg gtaaactacg taaacgcggt tcagaaacct
     3721 gtgagtgtga tcccattgaa tatgtccttg aactattccc aggcaatatc tcggaacaga
     3781 aacgaagaag agaaggatgc tttccagaag gcataccaga aggaaatagc acagctaacc
     3841 aaaatgaaca catggaacga agaattaata gatgcctcta ctcttcctaa gaaaagattc
     3901 taaactcaat gttcattttc accactaaaa gagataactc caagaagtgc agactagtcg
     3961 ctagaggtga ccaacaagcc gcagacacgt atgacactga attaaaggca aatacagtgg
     4021 acaacctagc tctcatgaca gtcttagcac tgacactaga ctacaacctg accgcattcc
     4081 aacttgacat ctcatcagct tacctctacg ctgaccttaa ggaagaattg tacattagag
     4141 caccaccaca catgaatgcc aagaacaagg tactaagact aaataaatca ctctatgggc
     4201 taaaacagag tggagcaaat tggtacgaac taatcagatc gttcctaatt aagaaatgtg
     4261 acctgattga agatagaatg tggaagtgcg tttttagaga caaagaaccg ctgaaactta
     4321 ttatatgtct ttttgtcgat gacatgctgg ttgtaggaaa cgacgtcaag tatatcaaga
     4381 aattcatatc aaagctatct aagagatttg atacaaagat tgtaaatgat ggttcacaca
     4441 ggccagaaga tggagtaaac gagtatgaca ttttgggcat agaattagag tataagaaaa
     4501 aagaatacat gaagttcgga atgcagaagt ctctagagga caagctgcca caactgggaa
     4561 tacccctact cccaaacgct aaaatcagga aggttccggg ggtgcctgga gactacatct
     4621 tctcaggaaa ggaactgaaa ttaaatgaaa gagaatataa aagcaaagtt aaacacctgc
     4681 aaagaattgt aggactagcg tcctacgtag gacataagtt ccggtttgac atcttgtact
     4741 acgtgaacat cttagcacaa catcaactgt atcccagcgc caaggtccta gacagggctg
     4801 cacaattatg ccaatacttg tgggatacaa gagataagaa actagtttgg cattattctg
     4861 gtcccaagga aaacaacgtt accgctgtat cagatgcagc atttgcaggg aaccaagatt
     4921 ttaaatcaca atcaggaact ctttacctga gaaacaacaa gcccatagca gcaaaatcaa
     4981 gaaaaatcaa gttaacttgt atctcgtcca cagaagccga gatatacgca atcagtgaaa
     5041 gcctgccaat actacgtggg ttagaacacc tagtaaacaa gttacaagac ataaaagcaa
     5101 cagtaaaggt taaaacagac agtcaaccat caatggcaat aataaacggc acggatgact
     5161 cagcatgcct caagaaacac attggtagta gggcaatgag gataagagat gagtgcgatg
     5221 atctcggact tacactcgaa tatatcccca caaaagaaaa caatgctgac gttttaacca
     5281 aacccctatc cgtgaagcta ttcaaacttc tcacagagga ctggatacaa tagctttctc
     5341 ctagtagggg gtgtgttgat cctggacgtc ctacgacgtt gtctagggcg ccagatatta
     5401 taccaactat aaaagctaag gctaaagcct agatatacta ggatcaag
//