EMBOSS programs can also deal with multiple sequences. A quick search using SRS will tell you that the SwissProt sequence corresponding to the tembl sequence we've been looking at has the identifier OPSD_XENLA. To retrieve information about all the OPSD sequences in SwissProt, we can use the wildcard character (*) when the program prompts for input:
unix % infoseq
Displays some simple information about sequences
Input sequence(s): tsw:opsd_*
# USA | Name | Accession | Type | Length | Description |
sw-id:OPSD_ABYKO | OPSD_ABYKO | O42294 | P | 289 | RHODOPSIN (FRAGMENT). |
sw-id:OPSD_ALLMI | OPSD_ALLMI | P52202 | P | 352 | RHODOPSIN. |
sw-id:OPSD_AMBTI | OPSD_AMBTI | Q90245 | P | 354 | RHODOPSIN. |
sw-id:OPSD_ANGAN | OPSD_ANGAN | Q90214 | P | 352 | RHODOPSIN, DEEP-SEA |
sw-id:OPSD_ANOCA | OPSD_ANOCA | P41591 | P | 352 | RHODOPSIN. |
sw-id:OPSD_APIME | OPSD_APIME | Q17053 | P | 377 | RHODOPSIN. |
sw-id:OPSD_ASTFA | OPSD_ASTFA | P41590 | P | 352 | RHODOPSIN. |
sw-id:OPSD_BATMU | OPSD_BATMU | O42300 | P | 289 | RHODOPSIN (FRAGMENT). |
sw-id:OPSD_BATNI | OPSD_BATNI | O42301 | P | 289 | RHODOPSIN (FRAGMENT). |
sw-id:OPSD_BOVIN | OPSD_BOVIN | P02699 | P | 348 | RHODOPSIN. |
We can also use the wildcard character on the command line, but here we must enclose the specification in quotation marks so that the shell does not try to interpret the wildcard itself:
unix % infoseq "tsw:opsd_*"
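To see why the quotation marks matter, the following sketch shows what a Bourne-style shell (sh or bash) does with an unquoted wildcard when a matching file happens to exist in the current directory. It is not part of EMBOSS, and the file name tsw:opsd_local is hypothetical, invented purely for the demonstration. C-style shells such as csh and tcsh usually stop with a "No match" error instead when nothing matches the pattern.

```shell
# Work in a scratch directory so the demonstration touches nothing else.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical file whose name happens to match the wildcard pattern.
touch "tsw:opsd_local"

# Unquoted: the shell replaces the pattern with the matching file name
# before the program ever sees it.
unquoted=$(echo tsw:opsd_*)

# Quoted: the pattern is passed through unchanged.
quoted=$(echo "tsw:opsd_*")

echo "unquoted argument: $unquoted"
echo "quoted argument:   $quoted"
```

With the quotation marks, the pattern reaches the program exactly as typed, which is what infoseq needs in order to do its own wildcard matching against the database.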
You can use seqret to retrieve multiple sequences into a file; for example:
unix % seqret "tsw:opsd_a*" -outseq opsd_a.seqs
retrieves all the sequences whose identifiers start with "opsd_a" into a file called opsd_a.seqs. If we wanted to have each sequence in a separate file, we could type:
unix % seqret "tsw:opsd_a*" -ossingle
Filenames are generated based on the identifiers of the sequences.
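As a rough illustration of the difference between one combined file and one file per sequence, the sketch below counts the records in a small FASTA file and then splits it by identifier using standard Unix tools. It does not call EMBOSS and is not how seqret is implemented. The file sample.seqs, its placeholder residues, and the naming scheme (lower-case identifier plus a .fasta extension) are assumptions made for the example; check the names that seqret actually produces on your system.

```shell
# Work in a scratch directory so no existing files are overwritten.
work_dir=$(mktemp -d)
cd "$work_dir"

# Hypothetical two-record FASTA file standing in for opsd_a.seqs.
# The residues are placeholders, not real SwissProt data.
printf '>OPSD_ABYKO\nXXXXXXXX\n>OPSD_ALLMI\nXXXXXXXX\n' > sample.seqs

# Each FASTA record starts with a '>' header line, so counting those
# lines gives the number of sequences in the file.
count=$(grep -c '^>' sample.seqs)
echo "sequences in sample.seqs: $count"

# Split into one file per sequence, naming each file after the
# identifier on the header line (assumed scheme: lower case + .fasta).
awk '
/^>/ {
    if (out != "") close(out)
    out = tolower(substr($1, 2)) ".fasta"
}
{ print > out }
' sample.seqs

ls
```

Running the same grep command on opsd_a.seqs is a quick way to confirm how many sequences the wildcard retrieved.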