Next: Optimizing gap-penalties in Clustal,
Up: Viewing and editing alignments
Previous: Selecting alignment regions with
ClustalX is heuristic in its search for the best alignment, and even
the most exact dynamic programming algorithm relies on guesswork
when it comes to gap-opening and gap-extension parameters. So don't be
surprised if, when you look at an alignment, you find small parts that
could clearly be improved by eye. The danger is that in
second-guessing the machine we can fool ourselves and read homology
into something that is not homologous. So you should
resist the temptation to hand-edit your alignments too much.
Nonetheless, here we'll see an example that cries out for
hand-editing. You can't do this in ClustalX but you can in Seaview.
- Discover the difference between sequence and alignment
coordinates.
Look at the Seaview window. At the top left and right corners you will
find two integers. These give the leftmost and rightmost visible
columns in alignment coordinates. Alignment coordinates count columns
from left to right in the alignment. Sequence coordinates count only
the residues of a single sequence, skipping its gaps. Because
sequences contain gaps and begin in different places, different
sequences will generally be at different sequence coordinates at a
given alignment coordinate. Seaview can show you both.
Prove to yourself that sequence coordinates will always be less than
or equal to alignment coordinates for any position in any alignment.
Put your cursor on the ``M'' at the first residue of OPS2_PATYE.
In my copy of this alignment, made from the 6 sequences with emma
under default settings, the following cursor-location monitor
appears above the alignment:
Seq:6 Pos:39|1 [OPS2_PATYE]
Your results may vary. This output means that the cursor is at the
sixth sequence, at the 39th position in alignment coordinates and the
first position of this sequence in sequence coordinates. Move the
cursor through the alignment with the arrow-keys, and explore what
this does to the location-monitor.
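The relationship between the two coordinate systems can be sketched in
a few lines of Python. This is only an illustration, not how Seaview
works internally: the function name seq_coord, the gap characters, and
the toy row are all assumptions made for the example.

```python
GAP_CHARS = "-."  # assumed gap symbols; real files may use others


def seq_coord(aligned_row, aln_pos):
    """Count the residues in alignment columns 1..aln_pos (1-based).

    If column aln_pos holds a residue, this is that residue's sequence
    coordinate. If it holds a gap, this is the coordinate of the nearest
    residue to the left (0 if there is none); what Seaview displays in
    that case is not modelled here.
    """
    if not 1 <= aln_pos <= len(aligned_row):
        raise ValueError("alignment position out of range")
    return sum(1 for ch in aligned_row[:aln_pos] if ch not in GAP_CHARS)


# Hypothetical row: 38 leading gaps, so the first residue sits in column 39.
row = "-" * 38 + "MSTE--LK"

print(seq_coord(row, 39))  # 1

# The sequence coordinate can never exceed the alignment coordinate,
# because each column contributes at most one residue to the count.
print(all(seq_coord(row, p) <= p for p in range(1, len(row) + 1)))  # True
```

With the toy row, column 39 maps to sequence position 1, the same
pairing as in the ``Pos:39|1'' monitor line above.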
- Inspect the alignment. Scroll over to the right with the
mouse (drag the scrollbar at the bottom) until the window ends at
alignment coordinate 270 or slightly above. You should see a likely
deletion of size 6 in OPS2_PATYE. Look at the other sequences in the
columns spanned by this indel, and then look at the part of
OPS2_PATYE immediately to the left of the gap. Do you see a
similarity? The motif ``EK..R[ED]Q'' is pretty well conserved in the
other sequences, and it is also there in the gapped sequence
OPS2_PATYE, just to the left of the gap. The placement of this gap
is probably bogus. If we line up the residues ``EKVCKD'' with the
motif in the other sequences, the matches will be quite high. There
would then be a single-residue deletion where all the other
sequences have a ``Q.'' Let us say this is the most likely alignment
for this sequence, in which case there should be two gaps instead of
one. Now we'll see how to use Seaview to fix this misaligned
sequence.
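The motif notation ``EK..R[ED]Q'' happens to be valid regular-expression
syntax, so the check you just did by eye can be roughly imitated in
Python. This is a sketch under assumptions: find_motif is a
hypothetical helper and both rows are made-up toy data, not the real
opsin sequences. Only the fragment ``EKVCKD'' comes from the text.

```python
import re

# The motif from the text, read as a regular expression:
# E, K, any two residues, R, then E or D, then Q.
MOTIF = re.compile(r"EK..R[ED]Q")


def find_motif(aligned_row):
    """Return the 1-based sequence coordinate where the motif starts,
    or None if the ungapped sequence does not contain it."""
    ungapped = aligned_row.replace("-", "")
    match = MOTIF.search(ungapped)
    return match.start() + 1 if match else None


# Toy rows (hypothetical).
conserved = "AAEKLGREQ-SLL"    # carries the full motif
divergent = "AAEKVCKD------SLL"  # carries only the EKVCKD fragment

print(find_motif(conserved))  # 3
print(find_motif(divergent))  # None
```

The strict pattern does not match ``EKVCKD'', which has a K where the
motif has R and lacks the final Q. Spotting a near-match like this is
exactly the kind of judgement that prompts hand-editing.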
- Enable sequence editing. Click Props -> Allow
seq. edition. The alignment is now no longer protected against editing.
- Edit the alignment. Place the cursor on the ``E'' at
sequence coordinate 209 in OPS2_PATYE. Press <SPACE> repeatedly to
move the motif over until it lines up with the other sequences. Then
put the cursor on the ``S'' at position 215 (seq. coord.), and slide
the rest of the sequence back to where it was.
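The edit amounts to two string operations on one row of the alignment:
insert gaps in front of the motif, then remove the same number of gaps
in front of the downstream residue. The sketch below illustrates this
on a made-up row. The helper names, the row, and the shift of 5 columns
are assumptions for the example, not values read from the real
alignment.

```python
GAP = "-"


def insert_gaps(row, col, n):
    """Insert n gaps in front of 1-based column col, shifting the rest right."""
    i = col - 1
    return row[:i] + GAP * n + row[i:]


def delete_gaps(row, col, n):
    """Delete the n gaps immediately left of 1-based column col.

    Refuses to delete anything that is not a gap, so residues are never lost.
    """
    i = col - 1
    if n < 1 or i - n < 0 or set(row[i - n:i]) != {GAP}:
        raise ValueError("only gaps may be deleted")
    return row[:i - n] + row[i:]


# Toy row (hypothetical): motif fragment, a 6-column gap, then the rest.
before = "AAEKVCKD------SLL"

# Step 1: push the motif 5 columns to the right.
shifted = insert_gaps(before, 3, 5)
# Step 2: pull the residues downstream of the gap back by 5 columns.
after = delete_gaps(shifted, shifted.index("S") + 1, 5)

print(before)  # AAEKVCKD------SLL
print(after)   # AA-----EKVCKD-SLL

# Editing gaps must never change the underlying sequence or the row length.
assert before.replace(GAP, "") == after.replace(GAP, "")
assert len(before) == len(after)
```

The single gap of six columns has become two gaps, of five columns and
one column, while the residues themselves are untouched.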
- Save the site set as a separate alignment. Now you have
made a nice trimmed alignment for phylogenetic analysis. You can
save this site-set as an alignment ``slice'' that includes only the
parts that you have selected.
Click File -> save current sites. A window pops up asking
for a format and a file name. Enter the name trimmed.phylip and
choose ``Phylip'' in the pop-up menu. Phylip is a common format for
phylogenetic analysis, and you will use this trimmed alignment later.
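Conceptually, saving a site set keeps only the selected column ranges
from every row and writes the result out. The sketch below shows the
idea with a minimal writer for the simple sequential Phylip layout (a
header line with the number of sequences and columns, then each name
padded to 10 characters followed by its row). The function names, the
toy alignment, and the column ranges are assumptions, and the file that
Seaview itself writes may be laid out differently, for example
interleaved or line-wrapped.

```python
def slice_sites(alignment, ranges):
    """Keep only the given 1-based, inclusive column ranges from every row."""
    return {
        name: "".join(row[start - 1:end] for start, end in ranges)
        for name, row in alignment.items()
    }


def to_phylip(alignment):
    """Format an alignment as simple sequential Phylip text."""
    lengths = {len(row) for row in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all rows must have the same length")
    nchar = lengths.pop()
    lines = [f" {len(alignment)} {nchar}"]
    for name, row in alignment.items():
        # Names are truncated or padded to exactly 10 characters.
        lines.append(f"{name[:10]:<10}{row}")
    return "\n".join(lines) + "\n"


# Toy alignment and site set (hypothetical).
toy = {"seqA": "MK-TLLAV", "seqB": "MKQTL-AV"}
trimmed = slice_sites(toy, [(1, 3), (7, 8)])

print(to_phylip(trimmed), end="")
```

To write the result to disk you would pass the returned text to an
ordinary file write, for example under the name trimmed.phylip.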
- Save your work. Seaview uses its own alignment format
called ``mase'' to store things like the site-sets you have
created together with the alignment itself (you can create more than
one site-set in an alignment, each with a different meaning). To save
all this work:
Click Sites -> Hide set and File -> Save as. Give the
name ops2.mase, choose format ``Mase'' and save.
We'll also play around with this alignment a little in ClustalX,
so save your work in clustal format too. Give it the name ``ops2.aln''.
Now you can quit.
That concludes this section on Seaview. Seaview includes many other
useful functions, such as drawing dotplots between pairs of sequences.
David Ardell
2005-01-27