TBI 02-03-012 Stochastic Pairwise Alignments


Stochastic Pairwise Alignments


U. Mückstein, I.L. Hofacker, P.F. Stadler

Motivation: The level of sequence conservation between related nucleic acids or proteins often varies considerably along the sequence. Both regions with high variability (mutational hot-spots) and regions of almost perfect sequence identity may occur in the same pair of molecules. The reliability of an alignment therefore strongly depends on the level of local sequence similarity.

Results: The probability P_ij of a match between position i in the first sequence and position j in the second sequence is computed using the partition function over all canonical pairwise alignments. A probabilistic backtracking procedure can then be used to generate ensembles of suboptimal alignments with correct statistical weights.
A comparison between structure-based alignments and large samples of stochastic alignments shows that the ensemble contains correct alignments with significant probabilities even when the optimal alignment deviates significantly from the structural alignment. Ensembles of suboptimal alignments obtained by stochastic backtracking, or the match probability matrices themselves, are therefore promising starting points for improved iterative multiple alignment procedures. In particular, it should be possible to overcome the problem of fixing an incorrect pairwise alignment in an early iteration.
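
The two steps described above, match probabilities from a partition function and stochastic backtracking, can be illustrated with a minimal Python sketch. It is not the probA implementation: it assumes a linear gap penalty, a toy match/mismatch score, and a temperature T = 1, and it does not enforce the restriction to canonical alignments, so alignments that differ only in the order of adjacent insertions and deletions are counted separately. All function names and parameter values are hypothetical.

```python
import math
import random

# Illustrative scoring parameters (assumptions, not taken from the paper).
MATCH, MISMATCH, GAP, T = 1.0, -1.0, -2.0, 1.0
GAP_W = math.exp(GAP / T)  # Boltzmann weight of a single gap column


def weight(x, y):
    """Boltzmann weight of aligning residue x with residue y."""
    return math.exp((MATCH if x == y else MISMATCH) / T)


def partition(a, b):
    """Z[i][j] = sum of weights of all alignments of a[:i] with b[:j]."""
    n, m = len(a), len(b)
    Z = [[0.0] * (m + 1) for _ in range(n + 1)]
    Z[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if i > 0 and j > 0:
                total += Z[i - 1][j - 1] * weight(a[i - 1], b[j - 1])
            if i > 0:
                total += Z[i - 1][j] * GAP_W  # a[i-1] aligned to a gap
            if j > 0:
                total += Z[i][j - 1] * GAP_W  # b[j-1] aligned to a gap
            Z[i][j] = total
    return Z


def match_probabilities(a, b):
    """P[i-1][j-1] = probability that a_i is matched with b_j (1-based i, j)."""
    n, m = len(a), len(b)
    F = partition(a, b)              # prefix partition functions
    R = partition(a[::-1], b[::-1])  # suffix partition functions via reversal
    return [[F[i - 1][j - 1] * weight(a[i - 1], b[j - 1]) * R[n - i][m - j] / F[n][m]
             for j in range(1, m + 1)]
            for i in range(1, n + 1)]


def sample_alignment(a, b, Z, rng):
    """Draw one alignment with probability proportional to its weight.

    Returns the matched position pairs (i, j), 1-based, in increasing order.
    """
    i, j = len(a), len(b)
    pairs = []
    while i > 0 or j > 0:
        choices = []
        if i > 0 and j > 0:
            choices.append(("M", Z[i - 1][j - 1] * weight(a[i - 1], b[j - 1])))
        if i > 0:
            choices.append(("D", Z[i - 1][j] * GAP_W))
        if j > 0:
            choices.append(("I", Z[i][j - 1] * GAP_W))
        r = rng.random() * Z[i][j]
        for move, w in choices:
            r -= w
            if r < 0:
                break  # on rounding error the last listed move is kept
        if move == "M":
            pairs.append((i, j))
            i, j = i - 1, j - 1
        elif move == "D":
            i -= 1
        else:
            j -= 1
    return pairs[::-1]
```

Under these assumptions, the fraction of sampled alignments that contain a given pair (i, j) should approach the corresponding entry of the match probability matrix as the sample grows. For long sequences the plain floating-point sums would need rescaling or log-space arithmetic to avoid overflow and underflow.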

Availability: The software described in this contribution is available for download at http://www.tbi.univie.ac.at/~ulim/probA.

Keywords: Alignments, Partition Function, Stochastic Backtracking
