
Using multiple sequences

EMBOSS programs can also work with several sequences at once. A quick search using SRS will tell you that the SwissProt sequence corresponding to the tembl sequence we've been looking at has the identifier OPSD_XENLA. To retrieve information about all the OPSD sequences in SwissProt, we can use the wildcard character (*):

unix % infoseq
Displays some simple information about sequences
Input sequence(s): tsw:opsd_*
# USA             Name        Accession  Type  Length  Description
sw-id:OPSD_ABYKO  OPSD_ABYKO  O42294     P     289     RHODOPSIN (FRAGMENT).
sw-id:OPSD_ALLMI  OPSD_ALLMI  P52202     P     352     RHODOPSIN.
sw-id:OPSD_AMBTI  OPSD_AMBTI  Q90245     P     354     RHODOPSIN.
sw-id:OPSD_ANGAN  OPSD_ANGAN  Q90214     P     352     RHODOPSIN, DEEP-SEA
sw-id:OPSD_ANOCA  OPSD_ANOCA  P41591     P     352     RHODOPSIN.
sw-id:OPSD_APIME  OPSD_APIME  Q17053     P     377     RHODOPSIN.
sw-id:OPSD_ASTFA  OPSD_ASTFA  P41590     P     352     RHODOPSIN.
sw-id:OPSD_BATMU  OPSD_BATMU  O42300     P     289     RHODOPSIN (FRAGMENT).
sw-id:OPSD_BATNI  OPSD_BATNI  O42301     P     289     RHODOPSIN (FRAGMENT).
sw-id:OPSD_BOVIN  OPSD_BOVIN  P02699     P     348     RHODOPSIN.


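We can also use the wildcard character on the command line, but here we must enclose the specification in quotation marks so that the shell passes the asterisk on to the program instead of trying to expand it as a filename pattern: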

unix % infoseq "tsw:opsd_*"
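The effect of the quotation marks can be seen without EMBOSS installed. The following is a minimal sketch that uses echo as a stand-in for infoseq; the temporary directory and the file name tsw:opsd_local are hypothetical and exist only to give the shell something to match.

```shell
# Show what a program receives with and without quotes around a wildcard.
# echo stands in for infoseq; the matching file is a made-up example.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch 'tsw:opsd_local'          # a local file that happens to match the pattern

unquoted=$(echo tsw:opsd_*)     # the shell expands the pattern before echo runs
quoted=$(echo "tsw:opsd_*")     # the pattern reaches echo unchanged

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Tidy up the temporary directory.
cd / || exit 1
rm -rf "$demo_dir"
```

With the quotes, the program sees the pattern itself and can match it against database entries. Without them, the result depends on the shell and on the files in the current directory: Bourne-style shells substitute any matching file names, and csh-style shells report an error when nothing matches.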

You can use seqret to retrieve multiple sequences into a file; for example:

unix % seqret "tsw:opsd_a*" -outseq opsd_a.seqs

retrieves all the sequences whose identifiers start with "opsd_a" into a file called opsd_a.seqs. If we wanted to have each sequence in a separate file, we could type:

unix % seqret "tsw:opsd_a*" -ossingle

Filenames are generated based on the identifiers of the sequences.
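To make the difference between one combined file and one file per sequence concrete, here is a rough sketch using standard Unix tools rather than EMBOSS itself. It assumes the combined file is in FASTA format, and it assumes a naming rule of lower-cased identifier plus ".fasta"; the exact names seqret chooses are not stated here and may differ. The header lines reuse entries from the infoseq listing above, but the residues are dummy placeholders, not real sequence data.

```shell
# Split a multi-sequence FASTA file into one file per identifier.
# Assumptions: FASTA input; output names are <lower-cased id>.fasta (illustrative only).
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1

# Placeholder input: real identifiers, dummy residues.
cat > opsd_a.seqs <<'EOF'
>OPSD_ABYKO O42294 RHODOPSIN (FRAGMENT).
AAAAAAAAAA
>OPSD_ALLMI P52202 RHODOPSIN.
CCCCCCCCCC
CCCCC
EOF

# Each FASTA entry starts with a ">" line, so counting them counts sequences.
n_seqs=$(grep -c '^>' opsd_a.seqs)
echo "sequences in opsd_a.seqs: $n_seqs"

# On every header line, switch to a new output file named after the identifier.
awk '
  /^>/ { if (out) close(out); id = substr($1, 2); out = tolower(id) ".fasta" }
  out  { print > out }
' opsd_a.seqs

ls
```

Counting the ">" lines is also a quick way to check how many sequences a wildcard actually retrieved into a combined file.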


EMBnet
2005-01-22