cutgextract
The EMBOSS distribution comes loaded with a set of codon usage tables. These tables are calculated from the files in ftp://ftp.ebi.ac.uk/pub/databases/codonusage/README, with a few additions whose exact derivation cannot easily be determined. Many people would prefer to create their own from the public CUTG data.
Run cutgextract on the CUTG database, available from ftp://ftp.ebi.ac.uk/pub/databases/cutg. Download all the required *.codon files from CUTG and, if they are compressed, uncompress them before running cutgextract on them.
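
The uncompression step could be scripted along the following lines. This is only a minimal sketch: it assumes the files are gzip-compressed and named with a `.codon.gz` suffix, which the text above does not state, and the function name `decompress_codon_files` is made up for illustration.

```python
import gzip
import shutil
from pathlib import Path


def decompress_codon_files(directory):
    """Uncompress every *.codon.gz file found directly in `directory`.

    Assumption: the CUTG files are gzip-compressed and end in ".codon.gz".
    Returns the list of uncompressed files that were written.
    """
    written = []
    for archive in sorted(Path(directory).glob("*.codon.gz")):
        target = archive.with_suffix("")  # drop the trailing ".gz"
        with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(target)
    return written
```

The compressed originals are left in place, so the script can be re-run safely before pointing cutgextract at the directory.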
Downloading the CUTG database and running cutgextract to create the codon usage table files would normally be done only once, when the EMBOSS package is installed, or again when a new version of the CUTG database is released.
Note that CUTG has a drawback: it provides one table per organism and does not distinguish between different gene populations.
cutgextract parses the codon usage data out of these *.codon files and writes one file per species into the EMBOSS data/CODONS directory. The file names are derived from the species names in the CUTG files, so they will be long (and therefore descriptive).
% cutgextract
Extract data from CUTG
CUTG directory [.]: ../../data
Standard (Mandatory) qualifiers:

Qualifier | Type | Description |
---|---|---|
[-directory] | dirlist | CUTG directory |

Additional (Optional) qualifiers: (none)

Advanced (Unprompted) qualifiers:

Qualifier | Type | Description |
---|---|---|
-wildspec | string | Type of codon file |

Associated qualifiers: (none)

General qualifiers:

Qualifier | Type | Description |
---|---|---|
-auto | boolean | Turn off prompts |
-stdout | boolean | Write standard output |
-filter | boolean | Read standard input, write standard output |
-options | boolean | Prompt for standard and additional values |
-debug | boolean | Write debug output to program.dbg |
-verbose | boolean | Report some/full command line options |
-help | boolean | Report command line options. More information on associated and general qualifiers can be found with -help -verbose |
-warning | boolean | Report warnings |
-error | boolean | Report errors |
-fatal | boolean | Report fatal errors |
-die | boolean | Report deaths |
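
For unattended installs, the qualifiers listed above can be assembled into a non-interactive command line. The sketch below uses only -directory, -wildspec and -auto from that list. The helper names `cutgextract_command` and `run_cutgextract` are hypothetical, and it assumes cutgextract is on the PATH.

```python
import subprocess


def cutgextract_command(directory, wildspec=None):
    """Build an argument list for a prompt-free cutgextract run."""
    # -auto turns off prompts, so the directory must be given explicitly.
    cmd = ["cutgextract", "-directory", str(directory), "-auto"]
    if wildspec is not None:
        # -wildspec is the advanced qualifier described as "Type of codon file".
        cmd += ["-wildspec", wildspec]
    return cmd


def run_cutgextract(directory, wildspec=None):
    """Run cutgextract and raise CalledProcessError if it exits with an error."""
    return subprocess.run(cutgextract_command(directory, wildspec), check=True)
```

For instance, `cutgextract_command("../../data")` yields the same run as the interactive session shown earlier, with the prompt answered in advance.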
EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by the EMBOSS environment variable EMBOSS_DATA.
To see the available EMBOSS data files, run:
% embossdata -showall
To fetch one of the data files (for example 'Exxx.dat') into your current directory for you to inspect or modify, run:
% embossdata -fetch -file Exxx.dat
Users can provide their own data files in their own directories. Project-specific files can be put in the current directory or, for tidier directory listings, in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory or, again, in a subdirectory called ".embossdata".
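The directories are searched in the following order: the current directory (.), .embossdata under the current directory, the home directory (~/), and ~/.embossdata.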
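
The lookup over user directories can be pictured with a short sketch. It assumes the order is: current directory, its ".embossdata" subdirectory, the home directory, then the home ".embossdata" subdirectory. The function `find_data_file` is a hypothetical illustration, not part of EMBOSS, and it does not model the fallback to the standard EMBOSS data directory.

```python
from pathlib import Path


def find_data_file(name, cwd=None, home=None):
    """Return the first user-supplied copy of `name`, or None if there is none.

    Assumed search order: cwd, cwd/.embossdata, home, home/.embossdata.
    """
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    candidates = [cwd, cwd / ".embossdata", home, home / ".embossdata"]
    for directory in candidates:
        path = directory / name
        if path.is_file():
            return path
    return None
```

Under this assumption, a project-specific copy in the current directory takes precedence over a copy kept for all runs in the home directory.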
See also:

Program name | Description |
---|---|
aaindexextract | Extract data from AAINDEX |
printsextract | Extract data from PRINTS |
prosextract | Builds the PROSITE motif database for patmatmotifs to search |
rebaseextract | Extract data from REBASE |
tfextract | Extract data from TRANSFAC |