
Editing your alignment with Seaview and saving your work

ClustalX is heuristic in how it searches for the best alignment, and even the most exact dynamic programming algorithm relies on guesswork when it comes to gap-opening and gap-extension parameters. So don't be surprised if, when you look at an alignment, you find small parts that could clearly be improved by eye. The danger is that we can fool ourselves when second-guessing the machine, and mistakenly read homology into something that is not homologous. So you should resist the temptation to hand-edit your alignments too much.

Nonetheless, here we'll see an example that cries out for hand-editing. You can't do this in ClustalX but you can in Seaview.

  1. Discover the difference between sequence and alignment coordinates.

    Look at the Seaview window. At the top left and right corners you will find two integers. These give the leftmost and rightmost visible columns in alignment coordinates. Alignment coordinates count columns from left to right in the alignment. But because sequences contain gaps and begin in different places, different sequences will be at different sequence coordinates at a given alignment coordinate. Seaview can show you both.

    Prove to yourself that sequence coordinates will always be less than or equal to alignment coordinates for any position in any alignment.

    Put your cursor on the ``M'' at the first residue of OPS2_PATYE. In my copy of this alignment, made from the 6 sequences with emma under default settings, the following cursor-location monitor appears above the alignment:

    Seq:6 Pos:39|1 [OPS2_PATYE]

    Your results may vary. This output means that the cursor is on the sixth sequence, at the 39th position in alignment coordinates and at the first position of this sequence in sequence coordinates. Move the cursor through the alignment with the arrow keys, and watch how the location monitor changes.

  2. Edit the alignment.

    Scroll over to the right with the mouse (drag the scrollbar at the bottom) until the window ends at alignment coordinate 270 or slightly above. You should see a likely deletion of size 6 in OPS2_PATYE. Look at what the other sequences have in the columns spanned by this indel, and then look at the OPS2_PATYE residues to the immediate left of the gap. Do you see a similarity? The motif ``EK..R[ED]Q'' is pretty well conserved in the other sequences, and a good match to it is also there in OPS2_PATYE, just to the left of the gap. So the placement of this gap is probably bogus.

    If we line up the sequence ``EKVCKD'' with the motif in the other sequences, the matches will be quite high. There would then be a single-residue deletion where all other sequences have a ``Q''. Let us say this is the most likely alignment for this sequence, in which case there should be two gaps instead of one. Now we'll see how to use Seaview to fix this misaligned sequence.

    1. Enable sequence editing. Click Props -> Allow seq. edition. The alignment is now unprotected and can be edited.

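    2. Edit the alignment. Place the cursor on the ``E'' at sequence coordinate 209 in OPS2_PATYE. Press <SPACE> to insert gaps, which pushes the motif over to the right until it lines up with the motif in the other sequences. Then put the cursor over the ``S'' at position 215 (seq. coord.), and slide the rest of the sequence back to where it was by removing the surplus gaps in front of it.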



  3. Save the site set as a separate alignment. Now you have made a nice trimmed alignment for phylogenetic analysis. You can now save this site-set as an alignment ``slice'' that includes only the parts that you have selected.

    Click File -> save current sites. A window pops up asking for a format and a name. Enter the name trimmed.phylip and choose ``Phylip'' in the pop-up menu. Phylip is a common format for phylogenetic analysis, and you will use this trimmed alignment later.

  4. Save your work. Seaview uses its own alignment format called ``mase'', which can do things like store the site-sets you have created together with the alignment in one file (you can create more than one site-set in an alignment to mean different things). To save all this work:

    Click Sites -> Hide set and File -> Save as. Give the name ops2.mase, choose format ``Mase'' and save.

    We'll also play around with this alignment a little in ClustalX, so save your work in clustal format too. Give it the name ``ops2.aln''. Now you can quit.
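
The two coordinate systems from step 1 and the hand edit from step 2 can be mimicked in a few lines of Python. This is only a sketch to make the ideas concrete, not a description of how Seaview works internally: the function names are made up for illustration, and the toy row keeps the ``EKVCKD'' motif and the six-column gap from the text but invents the flanking residues.

```python
GAP = "-"

def seq_coord(row, aln_pos):
    """Sequence coordinate (1-based) of the residue in alignment column
    aln_pos (1-based), or None if that column holds a gap in this row."""
    if row[aln_pos - 1] == GAP:
        return None
    # Count the residues (non-gap characters) up to and including the column.
    return len(row[:aln_pos].replace(GAP, ""))

def aln_coord(row, seq_pos):
    """Alignment column (1-based) that holds residue number seq_pos."""
    seen = 0
    for col, char in enumerate(row, start=1):
        if char != GAP:
            seen += 1
            if seen == seq_pos:
                return col
    raise ValueError("row has fewer than seq_pos residues")

def insert_gaps(row, aln_pos, n):
    """Insert n gaps in front of column aln_pos, pushing the rest right."""
    i = aln_pos - 1
    return row[:i] + GAP * n + row[i:]

def delete_gaps(row, aln_pos, n):
    """Remove the n gaps just in front of column aln_pos, pulling the
    rest of the row left. Refuses to delete anything but gaps."""
    i = aln_pos - 1
    if n > i or row[i - n:i] != GAP * n:
        raise ValueError("only gap characters may be deleted")
    return row[:i - n] + row[i:]

# Toy row (hypothetical): motif, then a six-column gap, then more sequence.
row = "MTEKVCKD------SLA"

# A sequence coordinate never exceeds its alignment coordinate, because
# gaps add columns without adding residues.
never_exceeds = all(
    seq_coord(row, col) is None or seq_coord(row, col) <= col
    for col in range(1, len(row) + 1)
)

# The hand edit: push the motif right by five columns (the amount that
# fits this toy example), then pull the rest of the row back by five.
s_seq = seq_coord(row, 15)                     # the 'S' after the gap
shifted = insert_gaps(row, aln_coord(row, 3), 5)
edited = delete_gaps(shifted, aln_coord(shifted, s_seq), 5)

print(row)
print(edited)
```

The edited row has the same length and the same residues as the original; only the gaps have moved, leaving two gaps instead of one. Note that the sequence coordinates of the residues never change during the edit, which is why the steps above can locate the ``E'' and the ``S'' by their sequence coordinates.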



That concludes this section with Seaview. Seaview includes many other useful functions, such as drawing dotplots between pairs of sequences.


David Ardell 2005-01-27