One of the most popular programs for performing multiple sequence alignments is clustalw [1]. EMBOSS has an interface to clustal called emma. clustal (and thus emma) creates a multiple sequence alignment from a group of related sequences using progressive pairwise alignments. It can also produce a dendrogram showing the clustering relationships used to create the alignment. The dendrogram shows the order in which sequences and clusters of sequences were aligned pairwise to generate the final alignment. It is not an evolutionary tree, although the length of the branches is related to the relative distance between the sequences.

clustal produces global alignments. Each pairwise step is solved optimally, but because the overall strategy is progressive, the final multiple alignment is a heuristic result and is not guaranteed to be the best possible one. The alignment procedure begins with the pairwise alignment of the two most similar sequences, producing a cluster of two aligned sequences. This cluster can then be aligned to the next most related sequence or cluster of aligned sequences. Two clusters of sequences can be aligned by a simple extension of the pairwise alignment of two individual sequences. The final alignment is built up by a series of progressive pairwise alignments that include increasingly dissimilar sequences and clusters, until all sequences have been included in the final pairwise alignment.

When gaps are inserted into a sequence to produce an alignment, they are inserted at the same position in all the sequences of the cluster. Each pairwise alignment uses the method of Needleman and Wunsch, extended for use with clusters of aligned sequences.
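To make the idea of aligning two clusters more concrete, here is a minimal Python sketch of a Needleman and Wunsch style alignment extended to clusters. It is an illustration only and is not clustal's actual code: the function names (column_score, align_clusters) and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for the example, whereas clustal uses substitution matrices, sequence weighting and position-specific gap penalties. Each cluster is a list of already aligned, equal-length strings, and a gap opened in a cluster is written into every sequence of that cluster at the same position.

```python
def column_score(col_a, col_b, match=1, mismatch=-1, gap=-2):
    """Average score over all residue pairs taken from two alignment columns.

    The scoring values are illustrative assumptions, not clustal's defaults.
    """
    total = 0
    for a in col_a:
        for b in col_b:
            if a == "-" and b == "-":
                continue            # gap against gap contributes nothing
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total / (len(col_a) * len(col_b))


def align_clusters(cluster_a, cluster_b, gap=-2):
    """Globally align two clusters (lists of equal-length aligned strings).

    Returns the two clusters with gap columns added, so a gap appears at the
    same position in every sequence of the cluster that receives it.
    """
    cols_a = list(zip(*cluster_a))   # columns of cluster A
    cols_b = list(zip(*cluster_b))   # columns of cluster B
    n, m = len(cols_a), len(cols_b)

    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], move[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        score[0][j], move[0][j] = j * gap, "left"

    # Fill the dynamic programming matrix, remembering the best move per cell.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (score[i - 1][j - 1] + column_score(cols_a[i - 1], cols_b[j - 1]), "diag"),
                (score[i - 1][j] + gap, "up"),      # gap column in cluster B
                (score[i][j - 1] + gap, "left"),    # gap column in cluster A
            ]
            score[i][j], move[i][j] = max(candidates, key=lambda c: c[0])

    # Trace back from the bottom-right corner, building columns in reverse.
    out_a = [[] for _ in cluster_a]
    out_b = [[] for _ in cluster_b]
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        col_a = cols_a[i - 1] if step in ("diag", "up") else "-" * len(cluster_a)
        col_b = cols_b[j - 1] if step in ("diag", "left") else "-" * len(cluster_b)
        for row, ch in zip(out_a, col_a):
            row.append(ch)
        for row, ch in zip(out_b, col_b):
            row.append(ch)
        if step in ("diag", "up"):
            i -= 1
        if step in ("diag", "left"):
            j -= 1

    new_a = ["".join(reversed(row)) for row in out_a]
    new_b = ["".join(reversed(row)) for row in out_b]
    return new_a, new_b


if __name__ == "__main__":
    # Align a single sequence to a cluster of two already aligned sequences.
    a, b = align_clusters(["AGT"], ["ACGT", "ACCT"])
    print("\n".join(a + b))
```

A progressive aligner would call a step like this repeatedly, following the order given by the dendrogram: first on the two most similar sequences (each treated as a cluster of one), then on the resulting cluster and the next most related sequence or cluster, and so on until one cluster remains.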
pscan has told us that our sequence belongs to the rhodopsin family. This is a very large family of sequences; for example, you can see the Pfam entry for rhodopsin by doing a keyword search at
http://www.sanger.ac.uk/Software/Pfam
We will now retrieve some further members of the family from SwissProt and produce a multiple alignment; we'll then use this multiple alignment to produce a profile of this group of sequences and use that to align them all to our original sequence.
First, let's retrieve the sequences using seqret: