Difference between revisions of "Release to BRC"

Latest revision as of 09:43, 14 November 2006

How to Create and Release GFF3 files to the BRCs

We regularly release our data to BRC-central via GFF3 files. This page describes the steps to release the data.

Creating the files

Choose a machine that is up to date, and create an empty directory. For this example, we'll use the directory NMPDR.

Run the command

nmpdr2gff NMPDR

This looks through all genomes for the NMPDR flag, and for each genome where the flag is found, a GFF3 file is created using the seed2gff command. If files are not created for some genomes that should have them, then the NMPDR flag has not been set for those genomes. If you would like to create a GFF3 file for a single organism, you can run seed2gff with just that organism.

The creation takes about 30-40 seconds per genome, so you can expect it to run for some time.
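
To get a feel for how long a full run might take, the per-genome figure above can be turned into a rough estimate. The sketch below is only illustrative: the function name estimate_runtime and the genome count of 90 are assumptions made for the example, not values taken from the SEED.

```shell
# Rough runtime estimate for nmpdr2gff, based on the 30-40 seconds
# per genome quoted above. The genome count passed in is hypothetical.
estimate_runtime() {
    genomes=$1
    low=$(( genomes * 30 / 60 ))    # lower bound, in whole minutes
    high=$(( genomes * 40 / 60 ))   # upper bound, in whole minutes
    echo "${low}-${high} minutes"
}

estimate_runtime 90
```

With 90 flagged genomes this prints 45-60 minutes, so it is worth starting the run in a session that will not be interrupted.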

Once complete, you will have a directory structure that looks something like this (only the first two genomes are shown for each genus):

  • NMPDR
    • Campylobacter
      • Campylobacter.coli.RM2228.gff3
      • Campylobacter.jejuni.subsp.jejuni.84-25.gff3
      • ...
    • Listeria
      • Listeria.innocua.Clip11262.gff3
      • Listeria.monocytogenes.EGD-e.gff3
      • ...
    • Staphylococcus
      • Staphylococcus.aureus.RF122.gff3
      • Staphylococcus.aureus.subsp.aureus.MRSA252.gff3
      • ...
    • Streptococcus
      • Streptococcus.pneumoniae.R6.gff3
      • Streptococcus.pyogenes.MGAS10270.gff3
      • ...
    • Vibrio
      • Vibrio.cholerae.MO10.gff3
      • Vibrio.cholerae.O395.gff3
      • ...
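
A quick way to check the result against the layout above is to count the GFF3 files in each genus directory. This is a minimal sketch, assuming the output directory is called NMPDR as in this example; count_gff3 is a helper name made up for the illustration and is not part of the SEED tools.

```shell
# Print each genus subdirectory of the output directory together with
# the number of .gff3 files it contains (tab separated).
count_gff3() {
    outdir=$1
    for genus in "$outdir"/*/; do
        # Skip the unexpanded pattern if there are no subdirectories.
        [ -d "$genus" ] || continue
        n=$(find "$genus" -maxdepth 1 -type f -name '*.gff3' | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$(basename "$genus")" "$n"
    done
}

count_gff3 NMPDR
```

A genus with a count of 0, or a genus that is missing altogether, suggests that the NMPDR flag should be checked for the genomes concerned.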

Uploading the files

Once the creation of the GFF3 files is complete, you need to use the brc-central validator to validate the data and upload it to the site. The one tricky thing about this is that the validator requires the GO::Parser Perl module. This should be part of the standard install everywhere now, but you may run into problems if it is missing. If so, please contact Bob for help.
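
Before running the validator, you can check whether the GO::Parser module is visible to the perl on your machine. The sketch below uses the standard trick of asking perl to load a module and do nothing else; check_perl_module is a helper name made up for this example.

```shell
# Report whether a Perl module can be loaded by the perl on the PATH.
check_perl_module() {
    if perl -M"$1" -e 1 2>/dev/null; then
        echo "$1 is installed"
    else
        echo "$1 is missing"
    fi
}

check_perl_module GO::Parser
```

Note that the check only covers the first perl found on your PATH, so it is informative only if the validator is run with that same perl.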

Use this command to validate and upload our data:

gff3_validator.pl -b NMPDR -d /path/to/directory/NMPDR -p CDS

Once this has completed, you should ftp to [1] and check that the files are correct. If there are problems with the validator or the upload, you should email Todd Creasy at TIGR for help.